Abstract
This chapter introduces machine learning and artificial intelligence techniques for analyzing long term care risks. The focus is on incidence rates, but the methods can be extended to termination and lapse rates. Four algorithms are described: generalized linear models, the Lasso, neural networks and extreme gradient boosting. We give practical advice on how to clean and prepare data and how to train and validate models, and we show how to test and compare the accuracy of the different algorithms. The chapter provides the tools for developing LTC risk rates from either industry data or an insurer’s proprietary database.
Notes
- 1.
Murphy [1].
- 2.
“Long Term Care Intercompany Experience Study” [2].
- 3.
The full set of data adjustments to date is found in the clean_incidence function on the GitHub site.
- 4.
“Caveats for Use of Long Term Care Experience Basic Tables”, 2015, Society of Actuaries.
- 5.
These claims count fields are called Claims_NH, Claims_ALF, and Claims_HHC, respectively, in the database.
- 6.
The selection of ActiveExposure as the measure, instead of TotalExposure, is described in Aalen [3], Chap. 5.
- 7.
- 8.
This same loss function can be derived solely from the assumption of a piecewise constant hazard function; it does not require the \( y_{i} \) to have a Poisson distribution. See Aalen [3], Chap. 5.
- 9.
Each of the field names is described in the SOA Report [2].
- 10.
Akaike [6].
- 11.
We use the term “LAMBDA” to designate the constant, for consistency with the R package glmnet, which is used to fit the Lasso. This LAMBDA is not the Poisson mean parameter \( \lambda \) and should not be confused with it.
- 12.
Tibshirani [7].
- 13.
“Deviance” is a measure of the accuracy of the model (lower is better); see the glmnet package vignette for more details.
- 14.
Unfortunately, at the time of writing, setting up Keras on a GPU is not for the faint of heart. Only Nvidia GPUs are currently supported. Help on installing the various components can be found on the RStudio, Tensorflow and Nvidia websites. An alternative is to use a cloud server from a provider such as Google or Amazon, where the server comes pre-installed with R and Keras.
- 15.
Chen [9].
- 16.
A lengthy discussion of tuning parameters can be found at http://xgboost.readthedocs.io/en/latest///parameter.html.
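Notes 8, 11 and 13 refer to the Poisson loss function, the Lasso constant LAMBDA and the deviance. As a rough illustration only, and not the chapter's own code (which uses R and glmnet), the following Python sketch evaluates a Poisson negative log-likelihood with an exposure offset and adds an L1 penalty scaled by LAMBDA. The function names, the toy data and the choice to leave the intercept unpenalized are assumptions made for this example.

```python
import math

def poisson_nll(claims, exposure, features, beta):
    """Poisson negative log-likelihood, omitting the log(y!) term,
    which does not depend on beta."""
    total = 0.0
    for y, e, x in zip(claims, exposure, features):
        eta = sum(b * xj for b, xj in zip(beta, x))  # linear predictor
        mu = e * math.exp(eta)  # expected claims = exposure * incidence rate
        total += mu - y * math.log(mu)
    return total

def lasso_objective(claims, exposure, features, beta, LAMBDA):
    """Average negative log-likelihood plus an L1 penalty.
    Assumption: beta[0] is an intercept and is not penalized."""
    penalty = LAMBDA * sum(abs(b) for b in beta[1:])
    return poisson_nll(claims, exposure, features, beta) / len(claims) + penalty

# Hypothetical toy data: three cells with claim counts, exposure in
# life-years, and a feature vector [intercept, one covariate].
claims = [0, 1, 2]
exposure = [10.0, 20.0, 25.0]
features = [[1.0, 0.0], [1.0, 1.0], [1.0, 2.0]]
beta = [-3.0, 0.5]

unpenalized = lasso_objective(claims, exposure, features, beta, LAMBDA=0.0)
penalized = lasso_objective(claims, exposure, features, beta, LAMBDA=0.1)
print(unpenalized, penalized)
```

Here LAMBDA only scales the penalty on the coefficients; the Poisson mean of each cell is the separate quantity `mu`, which is the distinction Note 11 draws.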
References
Murphy, K.P.: Machine Learning: A Probabilistic Perspective (Adaptive Computation and Machine Learning series). Massachusetts Institute of Technology (2012)
Long Term Care Intercompany Experience Study—Aggregate Database 2000–2011 Report, January 2015, Society of Actuaries
Aalen, O.O., Borgan, O., Gjessing, H.: Survival and Event History Analysis: A Process Point of View. Springer (2008)
Holford, T.: The analysis of rates and of survivorship using log-linear models. Biometrics 36 (1980)
Rodriguez, G.: Generalized Linear Models. Princeton University, http://data.princeton.edu/wws509/notes/c7s4.html
Akaike, H.: A new look at the statistical model identification. IEEE Trans. Autom. Control. (1974)
Tibshirani, R.: Regression shrinkage and selection via the lasso. J. R. Stat. Soc. Ser. B (Methodological) 58(1), 267–288 (1996)
Friedman, J.H.: Greedy function approximation: a gradient boosting machine. Ann. Stat. 29(5), 1189–1232 (Oct., 2001)
Chen, T., Guestrin, C.: XGBoost: A scalable tree boosting system. https://arxiv.org/pdf/1603.02754.pdf (2016)
McCullagh, P., Nelder, J.A.: Generalized Linear Models. Chapter 13, Chapman & Hall (1989)
Copyright information
© 2019 Springer Nature Switzerland AG
About this chapter
Cite this chapter
Zail, H. (2019). Predictive Analytics in Long Term Care. In: Dupourqué, E., Planchet, F., Sator, N. (eds) Actuarial Aspects of Long Term Care. Springer Actuarial. Springer, Cham. https://doi.org/10.1007/978-3-030-05660-5_13
DOI: https://doi.org/10.1007/978-3-030-05660-5_13
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-05659-9
Online ISBN: 978-3-030-05660-5
eBook Packages: Mathematics and Statistics (R0)