Predictive Analytics in Health Care: Methods and Approaches to Identify the Risk of Readmission

  • Isabella Eigner
  • Andreas Hamper
Part of the Healthcare Delivery in the Information Age book series (Healthcare Delivery Inform. Age)


The increasing focus on evidence-based healthcare services, together with rising health expenditures for inpatient treatment, is forcing hospitals to introduce new approaches that allow these services to be delivered more efficiently. Readmission rates are increasingly used as a measure of healthcare quality: to assess the quality of care, benchmark hospital performance, and determine funding rates or even issue penalties. It is therefore key to identify patients at high risk of readmission. This can be done with predictive risk models, which estimate the risk of readmission to the hospital for individual patients using various data mining techniques and algorithms. Based on these models, and with the increasing amount of data collected in hospitals, clinicians and hospital management can be supported in their daily decision-making to reduce readmission rates. Ultimately, implementing such prediction models can help avoid unnecessary costs and improve the quality of healthcare services. This work aims to identify and analyse state-of-the-art risk prediction models in healthcare with regard to their specific application areas, applied algorithms and resulting accuracy, in order to determine the suitability of different methods in different healthcare contexts.


Risk prediction, Predictive analytics, Healthcare, Data mining, Readmissions



Copyright information

© Springer International Publishing AG, part of Springer Nature 2018

Authors and Affiliations

  1. Institute of Information Systems, University Erlangen-Nuremberg, Nuremberg, Germany
