Predictive Analytics in Long Term Care

Actuarial Aspects of Long Term Care

Part of the book series: Springer Actuarial ((SPACT))

Abstract

This chapter introduces machine learning and artificial intelligence techniques for analyzing long term care (LTC) risks. The focus is on incidence rates, but the methods can be extended to termination and lapse rates. Four algorithms are described: generalized linear models, the Lasso, neural networks and extreme gradient boosting. We provide practical advice on how to clean and prepare data and how to train and validate models, and we show how to test and compare the accuracy of the different algorithms. The chapter provides the tools for developing LTC risk rates from either industry data or an insurer’s proprietary database.

Notes

  1. Murphy [1].

  2. “Long Term Care Intercompany Experience Study” [2].

  3. The full set of data adjustments to date is found in the clean_incidence function on the GitHub site.

  4. “Caveats for Use of Long Term Care Experience Basic Tables”, 2015, Society of Actuaries.

  5. These claim count fields are called Claims_NH, Claims_ALF, and Claims_HHC, respectively, in the database.

  6. The selection of ActiveExposure as the measure, instead of TotalExposure, is described in Aalen [3], Chap. 5.

  7. See (i) Holford [4] and (ii) Rodriguez [5].

  8. This same loss function can be derived solely from a piecewise constant hazard function assumption and does not require the \( y_{i} \) to have a Poisson distribution. See Aalen [3], Chap. 5.

  9. Each of the field names is described in the SOA Report [2].

  10. Akaike [6].

  11. We use the term “LAMBDA” to designate the constant in order to be consistent with the R package glmnet, which will be used to execute the Lasso. This LAMBDA is not the same as, and should not be confused with, the Poisson mean parameter \( \lambda \).

  12. Tibshirani [7].

  13. “Deviance” is a measure of the accuracy of the model (lower is better); see the glmnet package vignette for more details.

  14. Unfortunately, at the time of writing, setting up Keras on a GPU is not for the faint of heart. Only Nvidia GPUs are currently supported. Help on installing the various components can be found on the RStudio, TensorFlow and Nvidia websites. An alternative is to use a cloud server through Google or Amazon, where the server comes pre-installed with R and Keras.

  15. Chen [9].

  16. A lengthy discussion of tuning parameters can be found at http://xgboost.readthedocs.io/en/latest///parameter.html.
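
Notes 6, 8, 11 and 13 refer to an exposure-based Poisson loss, a Lasso penalty constant LAMBDA and the deviance. As a rough illustration only, and not the chapter's own code (which uses R and glmnet), the following Python sketch shows one way these quantities could be computed. The function names, the example cells and the choice to leave the intercept unpenalized are assumptions made for this sketch; the claim counts and exposures are invented numbers, not values from the SOA database.

```python
import math


def expected_claims(exposure, log_rate):
    """Expected claim counts: exposure times the incidence rate exp(log_rate)."""
    return [e * math.exp(eta) for e, eta in zip(exposure, log_rate)]


def poisson_deviance(claims, exposure, log_rate):
    """Poisson deviance of observed claim counts against the fitted means."""
    dev = 0.0
    for y, mu in zip(claims, expected_claims(exposure, log_rate)):
        # The y * log(y / mu) term is taken as 0 when y == 0.
        log_term = y * math.log(y / mu) if y > 0 else 0.0
        dev += 2.0 * (log_term - (y - mu))
    return dev


def penalized_loss(claims, exposure, features, intercept, coefs, LAMBDA):
    """Mean Poisson negative log-likelihood (constants dropped) plus an L1 penalty."""
    log_rate = [intercept + sum(b * x for b, x in zip(coefs, row))
                for row in features]
    nll = sum(mu - y * math.log(mu)
              for y, mu in zip(claims, expected_claims(exposure, log_rate)))
    # Assumption for this sketch: the intercept is not penalized.
    return nll / len(claims) + LAMBDA * sum(abs(b) for b in coefs)


# Hypothetical cells: claim counts (e.g. a field like Claims_NH) and
# active exposure, with one made-up binary covariate per cell.
claims = [3.0, 0.0, 7.0]
exposure = [1200.0, 800.0, 1500.0]
features = [[0.0], [0.0], [1.0]]
intercept, coefs = math.log(0.002), [0.8]

log_rate = [intercept + coefs[0] * row[0] for row in features]
print("deviance:", poisson_deviance(claims, exposure, log_rate))
print("loss, LAMBDA=0.0:", penalized_loss(claims, exposure, features, intercept, coefs, 0.0))
print("loss, LAMBDA=0.1:", penalized_loss(claims, exposure, features, intercept, coefs, 0.1))
```

In this sketch the exposure enters multiplicatively, which is equivalent to using log(exposure) as an offset in a log-link model. Raising LAMBDA increases the loss for any fixed non-zero coefficients, which is the mechanism by which the Lasso shrinks coefficients toward zero.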

References

  1. Murphy, K.P.: Machine Learning: A Probabilistic Perspective (Adaptive Computation and Machine Learning series). Massachusetts Institute of Technology (2012)

  2. Long Term Care Intercompany Experience Study—Aggregate Database 2000–2011 Report, January 2015, Society of Actuaries

  3. Aalen, O.O., Borgan, O., Gjessing, H.: Survival and Event History Analysis: A Process Point of View. Springer (2008)

  4. Holford, T.: The analysis of rates and of survivorship using log-linear models. Biometrics 36 (1980)

  5. Rodriguez, G.: Generalized Linear Models. Princeton University, http://data.princeton.edu/wws509/notes/c7s4.html

  6. Akaike, H.: A new look at the statistical model identification. IEEE Trans. Autom. Control. (1974)

  7. Tibshirani, R.: Regression shrinkage and selection via the lasso. J. R. Stat. Society. Ser. B (Methodological) 58(1), 267–288 (1996)

  8. Friedman, J.H.: Greedy function approximation: a gradient boosting machine. Ann. Stat. 29(5), 1189–1232 (2001)

  9. Chen, T., Guestrin, C.: XGBoost: A Scalable Tree Boosting System. https://arxiv.org/pdf/1603.02754.pdf (2016)

  10. McCullagh, P., Nelder, J.A.: Generalized Linear Models. Chapter 13, Chapman & Hall (1989)

Author information

Corresponding author

Correspondence to Howard Zail.

Copyright information

© 2019 Springer Nature Switzerland AG

About this chapter

Cite this chapter

Zail, H. (2019). Predictive Analytics in Long Term Care. In: Dupourqué, E., Planchet, F., Sator, N. (eds) Actuarial Aspects of Long Term Care. Springer Actuarial. Springer, Cham. https://doi.org/10.1007/978-3-030-05660-5_13
