A machine learning model for predicting ICU readmissions and key risk factors: analysis from a longitudinal health records

  • Original Paper
Health and Technology

Abstract

Readmission to Intensive Care Units (ICUs) carries high costs and places heavy demands on resources and management, which has made it a focus of clinical research. Previous research has identified several common risk factors and proposed a variety of frameworks to predict ICU readmissions, although some studies reported that many risk factors were too specific and/or had limited focus. This study investigates whether the relevance of ICU readmission risk factors has changed over time. We used the MIMIC-III database, with 42,307 ICU stays of 31,749 patients from a US hospital, covering medical services provided from 2001 to 2012. The dataset was first split into two chronological subsets (2001–2008 and 2008–2012), and then split again into training (70%) and test (30%) datasets. The training datasets were rebalanced through undersampling. To identify whether the most relevant risk factors change over time, 13 variables (12 features and one class) were selected and a three-step machine learning approach was executed: (i) Numerical Analysis, to identify overall quantitative changes; (ii) Feature Correlation Value Analysis, to rank the most important risk factors in each subset and compare them to identify any significant changes; and (iii) Classifier Performance Analysis, to identify changes in the predictive capability of the risk factors, based on three machine learning algorithms: Multilayer Perceptron, Random Forest and Support Vector Machine. Regarding readmission rates, some changes were observed for patients using private insurance (variability of +3.0%) and for patients first admitted to the ICU through the Medical Intensive Care Unit (−3.1%). Regarding the feature analysis, the two most relevant variables were the same in both datasets, with similar correlation values.
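The data preparation described here (a chronological split, a 70/30 train/test split, and undersampling of the training data only) can be sketched as follows. This is a minimal illustration on synthetic records, not the authors' pipeline: the field names `year` and `readmitted`, the cutoff year, the roughly 10% readmission rate, and the 1:1 target class ratio are all assumptions made for the example.

```python
import random
from collections import Counter


def chronological_split(stays, cutoff_year):
    """Partition stays into an early and a late period by admission year."""
    early = [s for s in stays if s["year"] < cutoff_year]
    late = [s for s in stays if s["year"] >= cutoff_year]
    return early, late


def train_test_split(rows, train_fraction, rng):
    """Shuffle a copy of the rows and cut it into train and test parts."""
    shuffled = rows[:]
    rng.shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


def undersample(rows, rng, label="readmitted"):
    """Randomly drop rows so every class has as many rows as the smallest one."""
    by_class = {}
    for row in rows:
        by_class.setdefault(row[label], []).append(row)
    smallest = min(len(group) for group in by_class.values())
    balanced = []
    for group in by_class.values():
        balanced.extend(rng.sample(group, smallest))
    rng.shuffle(balanced)
    return balanced


rng = random.Random(0)

# Synthetic stand-in for ICU stays (assumed ~10% readmission rate).
stays = [
    {"year": rng.randint(2001, 2012), "readmitted": rng.random() < 0.10}
    for _ in range(2000)
]

# Assumed cutoff: stays before 2008 form the first period.
early, late = chronological_split(stays, cutoff_year=2008)
train, test = train_test_split(early, 0.70, rng)

# Only the training part is rebalanced; the test part keeps its natural mix.
balanced_train = undersample(train, rng)
class_counts = Counter(row["readmitted"] for row in balanced_train)
```

Leaving the test set at its natural class distribution is the usual design choice, since the reported accuracy and AUC are then measured on data that resembles what a deployed model would see.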
When the machine learning algorithms were applied to the test datasets, the model produced similar results for both periods, achieving a best accuracy of 86.4% and an Area Under the ROC Curve (AUC) of 0.642. The difference in AUC values between the first and second periods was at most 0.05 (better in the first period), and the difference in accuracy was at most 4% (better in the second period). Overall, the results indicate that the most relevant risk factors were stable over the years, with some minor changes. Further research is required to incorporate other readmission risk factors, such as social determinants and mental health and well-being.
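Accuracy and AUC measure different things, which is why a high accuracy can sit alongside a modest AUC when readmissions are the minority class. The sketch below illustrates this on a hypothetical toy test set (the labels, scores, and 0.5 threshold are invented for the example and are not taken from the study): accuracy counts correct decisions at one threshold, while AUC is the probability that a randomly chosen readmitted stay receives a higher score than a randomly chosen non-readmitted one.

```python
def accuracy(labels, predictions):
    """Fraction of stays whose predicted class matches the true class."""
    correct = sum(1 for y, p in zip(labels, predictions) if y == p)
    return correct / len(labels)


def roc_auc(labels, scores):
    """AUC via pairwise comparison of positives and negatives (ties count half)."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical test set: 8 non-readmitted stays (0) and 2 readmitted stays (1).
labels = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]
# Hypothetical model scores that do not separate the two classes.
scores = [0.1, 0.4, 0.2, 0.3, 0.4, 0.1, 0.3, 0.2, 0.3, 0.2]

# No score reaches the 0.5 threshold, so every stay is predicted as class 0.
predictions = [1 if s >= 0.5 else 0 for s in scores]

acc = accuracy(labels, predictions)  # 0.8: matches the share of the majority class
auc = roc_auc(labels, scores)        # 0.5: ranking is no better than chance
```

In this toy case the classifier never flags a readmission, yet it is right 80% of the time, while its AUC shows no ranking ability at all. Reading the two metrics together guards against that misinterpretation.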

Notes

  1. LACE is the acronym for Length of Stay (L), Acute Admission (A), Comorbidity (C) and Visits to Emergency (E)

  2. HOSPITAL is the acronym for Hemoglobin at discharge (H), discharge from an oncology service (O), sodium level at discharge (S), procedure during the index admission (P), index type of admission (IT), number of admissions during the last 12 months (A), and length of stay (L)

  3. PARR-30 is the acronym for “Patients at Risk of Re-admission within 30 days” [10].

  4. PREADM is the acronym for “Preadmission Readmission Detection Model”

  5. LOS, Number of Services, service MED and service SURG

References

  1. Bosco JA, et al. Cost burden of 30-day readmissions following Medicare total hip and knee arthroplasty. J Arthroplast. 2014;29(5):903–5.

  2. Paratz J, Thomas P, Adsett J. Re-admission to intensive care: identification of risk factors. Physiother Res Int. 2005;10(3):154–63.

  3. Braet A, et al. Risk factors for unplanned hospital re-admissions: a secondary data analysis of hospital discharge summaries. J Eval Clin Pract. 2015;21(4):560–6.

  4. Elliott M, Worrall-Carter L, Page K. Intensive care readmission: a contemporary review of the literature. Intensive and Critical Care Nursing. 2014;30(3):121–37.

  5. Jiang S, et al. An integrated machine learning framework for hospital readmission prediction. Knowl-Based Syst. 2018;146:73–90.

  6. Wong EG, et al. Association of severity of illness and intensive care unit readmission: A systematic review. Heart & Lung: The Journal of Acute and Critical Care. 2016;45(1):3–9.e2.

  7. van Walraven C, et al. Derivation and validation of an index to predict early death or unplanned readmission after discharge from hospital to the community. Can Med Assoc J. 2010;182(6):551–7.

  8. Donzé J, et al. Potentially avoidable 30-day hospital readmissions in medical patients: derivation and validation of a prediction model. JAMA Intern Med. 2013;173(8):632–8.

  9. Lee EW. Selecting the best prediction model for readmission. J Prev Med Public Health. 2012;45(4):259.

  10. Billings J, et al. Development of a predictive model to identify inpatients at risk of re-admission within 30 days of discharge (PARR-30). BMJ Open. 2012;2(4):e001667.

  11. Timmers T, et al. Patients’ characteristics associated with readmission to a surgical intensive care unit. Am J Crit Care. 2012;21(6):e120–8.

  12. Charlson ME, et al. A new method of classifying prognostic comorbidity in longitudinal studies: development and validation. J Chronic Dis. 1987;40(5):373–83.

  13. van Walraven C, Wong J, Forster AJ. LACE+ index: extension of a validated index to predict early death or urgent readmission after hospital discharge using administrative data. Open Medicine. 2012;6(3):e80.

  14. Shadmi E, et al. Predicting 30-day readmissions with preadmission electronic health record data. Med Care. 2015;53(3):283–9.

  15. Rothman MJ, Rothman SI, Beals J IV. Development and validation of a continuous measure of patient condition using the electronic medical record. J Biomed Inform. 2013;46(5):837–48.

  16. Robinson R, Hudali T. The HOSPITAL score and LACE index as predictors of 30 day readmission in a retrospective study at a university-affiliated community hospital. PeerJ. 2017;5:e3137.

  17. Maali Y, et al. Predicting 7-day, 30-day and 60-day all-cause unplanned readmission: a case study of a Sydney hospital. BMC Medical Informatics and Decision Making. 2018;18(1):1.

  18. Johnson AE, et al. MIMIC-III, a freely accessible critical care database. Scientific Data. 2016;3:160035.

  19. Fialho AS, et al. Data mining using clinical physiology at discharge to predict ICU readmissions. Expert Syst Appl. 2012;39(18):13158–65.

  20. Hall M, et al. The WEKA data mining software: an update. ACM SIGKDD Explorations Newsletter. 2009;11(1):10–8.

  21. Van Hulse J, Khoshgoftaar TM, Napolitano A. Experimental perspectives on learning from imbalanced data. In: Proceedings of the 24th international conference on Machine learning. ACM; 2007.

  22. Al-Shahib A, Breitling R, Gilbert D. Feature selection and the class imbalance problem in predicting protein function from sequence. Appl Bioinforma. 2005;4(3):195–203.

  23. Chawla NV, et al. SMOTE: Synthetic minority over-sampling technique. J Artif Intell Res. 2002;16:321–57.

  24. Goswami S, Chakrabarti A. Feature selection: A practitioner view. International Journal of Information Technology and Computer Science (IJITCS). 2014;6(11):66.

  25. Singh B, Kushwaha N, Vyas OP. A feature subset selection technique for high dimensional data using symmetric uncertainty. Journal of Data Analysis and Information Processing. 2014;2(04):95.

  26. Yu L, Liu H. Feature selection for high-dimensional data: A fast correlation-based filter solution. In: Proceedings of the 20th international conference on machine learning (ICML-03). 2003.

  27. Saito T, Rehmsmeier M. The precision-recall plot is more informative than the ROC plot when evaluating binary classifiers on imbalanced datasets. PLoS One. 2015;10(3):e0118432.

Author information

Corresponding author

Correspondence to Mirza Mansoor Baig.

Ethics declarations

Conflict of interest

The authors declare no conflict of interest.

Ethical approval

Ethical approval was obtained from the appropriate committees before the research study was conducted.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Junqueira, A.R.B., Mirza, F. & Baig, M.M. A machine learning model for predicting ICU readmissions and key risk factors: analysis from a longitudinal health records. Health Technol. 9, 297–309 (2019). https://doi.org/10.1007/s12553-019-00329-0

  • DOI: https://doi.org/10.1007/s12553-019-00329-0
