Predictive Analytics in Modeling Policyholder Behavior

Gordon Klein, FSA, Ph.D.
2016 Equity‐Based Insurance Guarantees Conference
Session 2B – Predictive Analytics in Modeling Policyholder Behavior
1:30‐3:00pm, 14 November 2016
Transamerica Life Insurance Company
What is Predictive Analytics?
• Using data from the past to predict probabilities of future events.
• Fancy name for Statistics.
• Related buzzwords: Big Data, Machine Learning.
• My definition: Using statistics, with enough predictive variables to link past behavior to future behavior, so that you get a reasonably good fit of outcomes to predictions such as Pr[lapse | predictive variables].
How is Predictive Analytics Related to Policyholder Behavior Assumption Setting?
• For some assumptions, you can get lots of data.
• Of course you only have data on the past, and you want to predict behavior in the future.
• But the future may not behave like the past. That’s where predictive variables come in.
• Include explanatory variables in your analysis of the past that will be predictive in the future.
Examples Where Explanatory Variables Become Predictive
• In the early 1970s, lapse and loan rates seemed unimportant. But when interest rates soared in the late 1970s, past data could not predict the resulting high rates of lapse and loans.
• Bad assumption: Loan rate = flat rate.
• Good assumption: Loan rate = flat rate + convex function of max(0, current interest rate - guaranteed loan rate).
• See next slide.
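A minimal sketch of the "good assumption" above, in Python. The slide only says "convex function", so the power form and the names flat_rate, scale, and power (and their default values) are illustrative assumptions, not calibrated parameters.

```python
def loan_rate(current_rate, guaranteed_loan_rate,
              flat_rate=0.02, scale=5.0, power=2.0):
    """Loan rate = flat rate + convex function of max(0, spread).

    flat_rate, scale, and power are hypothetical values chosen only
    for illustration. With power > 1 the add-on is convex in the spread.
    """
    # Spread by which current interest exceeds the guaranteed loan rate.
    spread = max(0.0, current_rate - guaranteed_loan_rate)
    # Cap at 1.0 so the result remains a valid rate.
    return min(1.0, flat_rate + scale * spread ** power)


if __name__ == "__main__":
    for current in (0.05, 0.08, 0.12, 0.16):
        print(current, loan_rate(current, guaranteed_loan_rate=0.08))
```

When current interest is at or below the guaranteed loan rate, the function returns the flat rate, which matches the "bad assumption". The difference only appears once rates rise, which is the situation past data did not contain.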
Years Remaining in Surrender Charge Period and Calendar Quarter
(Tim’s Slide #9)
[Figure panels: -3, 0, and 3 years remaining in the surrender charge period]
How do We Handle that Spike in 2008?
• We could say “It’s a different regime. The world changed.” If we do this, and don’t build a model for regime‐switching, then we are giving up.
• We could look for explanatory variables that were higher or lower then. Examples:
  • Moneyness of guarantees. How would this look?
  • Macroeconomic variables, like unemployment. Could we incorporate these in our models going forward?
Recommended Statistical/PA/Assumption Methodology
• Use methods that are asymptotically the best.
• Don’t use methods that constrain you to linearity.
• Iteratively incorporate variables that explain discrepancies between actual and expected in the data.
• Use judgment in selecting explanatory variables. This can help avoid spurious variables.
• Use judgment where there isn’t much data.
Determining if a Variable is Predictive
• With any type of statistical modeling, naively including more variables will give the impression of better fit.
• A good test to detect “overfitting” is the Likelihood Ratio Test (LRT).
• Exam C (Klugman, Panjer, and Willmot) gives good coverage of this issue.
• Example: The LRT could be used to determine if your company’s parameter estimates are significantly different from those of the industry.
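As a rough illustration of the company-versus-industry example above, the following Python sketch tests whether a company's observed lapse rate differs significantly from a fixed industry rate. The binomial model, the function names, and the use of a one-degree-of-freedom chi-square approximation are assumptions made for this sketch; a real assumption study would typically compare richer nested models.

```python
import math


def binomial_loglik(events, exposures, rate):
    """Binomial log-likelihood, omitting the constant combinatorial term
    (it cancels in the likelihood ratio). Requires 0 < rate < 1."""
    return events * math.log(rate) + (exposures - events) * math.log(1.0 - rate)


def likelihood_ratio_test(events, exposures, industry_rate):
    """Null model: company rate equals industry_rate (no free parameter).
    Alternative: company rate is its own MLE (one free parameter).
    Assumes 0 < events < exposures so the logs are defined.
    Returns (test statistic, approximate p-value)."""
    company_rate = events / exposures  # MLE under the alternative
    stat = 2.0 * (binomial_loglik(events, exposures, company_rate)
                  - binomial_loglik(events, exposures, industry_rate))
    stat = max(stat, 0.0)  # guard against tiny negative rounding error
    # Chi-square survival function with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value


if __name__ == "__main__":
    # Hypothetical counts, for illustration only.
    print(likelihood_ratio_test(events=60, exposures=1000, industry_rate=0.05))
```

A small p-value suggests the extra parameter (a company-specific rate) is justified by the data; a large one suggests the simpler industry assumption fits about as well. The chi-square distribution is an asymptotic approximation, so the result is less reliable when counts are small.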
Example of Predictive Analytics for Mortality Assumption on VA
• Easiest approach: q = death count / exposure.
• Graph actual‐to‐expected (A/E) by attained age. Notice that it is increasing. Fit a function to this.
• Evaluate A/E by gender. Notice that Male is higher than 100%, Female is lower than 100%. Adjust.
• Evaluate by Rider Type. A/E for No rider is higher than for GLWB. (See next page, Tim’s Slide #24.)
• Evaluate by Calendar Year. With enough data, you will see improvement, as well as anti‐selection.
• Evaluate over shorter interval. You will find seasonality.
• Evaluate by Size. (See Tim’s Slide #25, two slides ahead.) This is not a uniform effect, so you can’t just multiply by a factor related to size.
• Note: You may not want to use all of these variables, depending on the purpose of your assumption. And note that a lot of these predictive variables have been used for a long time!
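The "evaluate A/E by ..." steps above can be sketched as one grouping routine that is rerun for each candidate variable. This is a minimal Python sketch: the record layout (deaths, exposure, expected_q) and the sample numbers are made up for illustration and do not come from the study behind these slides.

```python
from collections import defaultdict


def actual_to_expected(records, key):
    """Return the A/E ratio for each value of the grouping variable `key`.

    Each record is assumed to hold: deaths (actual count), exposure
    (life-years), and expected_q (expected mortality rate from the
    current assumption). These field names are hypothetical.
    """
    actual = defaultdict(float)
    expected = defaultdict(float)
    for rec in records:
        segment = rec[key]
        actual[segment] += rec["deaths"]
        expected[segment] += rec["exposure"] * rec["expected_q"]
    # Skip segments with no expected deaths to avoid division by zero.
    return {seg: actual[seg] / expected[seg]
            for seg in actual if expected[seg] > 0}


# Made-up sample data, for illustration only.
SAMPLE_RECORDS = [
    {"gender": "M", "rider": "None", "deaths": 12, "exposure": 1000.0, "expected_q": 0.010},
    {"gender": "F", "rider": "None", "deaths": 9, "exposure": 1000.0, "expected_q": 0.010},
    {"gender": "M", "rider": "GLWB", "deaths": 10, "exposure": 1000.0, "expected_q": 0.010},
    {"gender": "F", "rider": "GLWB", "deaths": 7, "exposure": 1000.0, "expected_q": 0.010},
]

if __name__ == "__main__":
    for variable in ("gender", "rider"):
        print(variable, actual_to_expected(SAMPLE_RECORDS, variable))
```

After adjusting the expected basis for one variable, the same routine can be rerun on the next variable, which mirrors the iterative approach of explaining remaining actual-to-expected discrepancies one variable at a time.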
Mortality by Duration and Guarantee Type
(Tim’s Slide #24)
[Figure: % of Ruark Table (y-axis) by Duration, 1 to 10 (x-axis); series: GLWB, GMIB, None]
Mortality by Guarantee Type and Size
[Figure: % of Table (y-axis) by Size (x-axis: <$50k, $50‐100k, $100‐250k, $250‐500k, $500k‐1mil, $1mil+); series: None, GLWB&GMIB]
Potential Sources of Predictive Variables
• Data from administrative system. Age, duration, policy values, etc.
• Other data available internally. Customer survey results, application data, etc.
• External data. Credit score, zip code, magazine subscriptions, number and age of children, etc.
• Value of these different sources will depend on the intended use of the assumption.
Conclusion
• Predictive Analytics may bring more statistical rigor to something that actuaries have been doing for a long time—identifying variables that help to predict probabilities of future events.
• This allows companies to better quantify the risks that they are taking on.
• Companies should be in a position to make better decisions as they build more predictive assumptions.