Abstract
Temporal electronic health record (EHR) data are often preferred for clinical prediction tasks because they offer a more complete representation of a patient’s pathophysiology than static data. A challenge when working with temporal EHR data is problem formulation, which includes defining the time windows of interest and the prediction task. Our objective was to conduct a systematic review assessing how concepts relevant to temporal clinical prediction tasks are defined and reported. We searched the PubMed® and IEEE Xplore® databases for studies published from January 1, 2010 onward that applied machine learning models to EHR data for patient outcome prediction. Publications applying time-series methods were selected for further review. We identified 92 studies and summarized them by clinical context and by the definition and reporting of the prediction problem. For the time windows of interest, 12 studies did not discuss window lengths, 57 used a single set of window lengths, and 23 evaluated the relationship between window length and model performance. We also found that 72 studies reported the prediction task appropriately. However, evaluation of prediction problem formulation for temporal EHR data was complicated by heterogeneity in how these concepts were assessed and reported. Even among studies modeling similar clinical outcomes, there were variations in the terminology used to describe the prediction problem, the rationale for window lengths, and the determination of the outcome of interest. As temporal modeling using EHR data expands, minimal reporting standards should include time-series-specific concerns to promote rigor and reproducibility in future studies and to facilitate model implementation in clinical settings.
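To make the problem formulation discussed in the abstract concrete, the following is a minimal sketch of how time windows and a prediction task might be specified for a single patient. It is an illustration under stated assumptions, not a method taken from the reviewed studies: the terms "observation window", "gap", and "prediction window", the `WindowSpec` class, the `build_example` function, and all data values are hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List, Optional, Tuple


@dataclass
class WindowSpec:
    """Hypothetical window lengths; the names are assumptions, not from the review."""
    observation: timedelta  # how far back from the index time features are drawn
    gap: timedelta          # lead time between the index time and the prediction window
    prediction: timedelta   # span in which the outcome must occur to count as positive


def build_example(
    events: List[Tuple[datetime, float]],
    outcome_time: Optional[datetime],
    index_time: datetime,
    spec: WindowSpec,
) -> Tuple[List[float], int]:
    """Return (features, label) for one patient at one index time."""
    obs_start = index_time - spec.observation
    # Keep only measurements recorded in [obs_start, index_time).
    features = [value for t, value in events if obs_start <= t < index_time]
    pred_start = index_time + spec.gap
    pred_end = pred_start + spec.prediction
    # The label is positive only if the outcome falls in [pred_start, pred_end).
    label = int(outcome_time is not None and pred_start <= outcome_time < pred_end)
    return features, label


# Made-up patient: one measurement every 6 hours, outcome 40 hours after admission.
admit = datetime(2020, 1, 1)
events = [(admit + timedelta(hours=h), 80.0 + h) for h in range(0, 48, 6)]
outcome = admit + timedelta(hours=40)
index = admit + timedelta(hours=24)

long_spec = WindowSpec(timedelta(hours=12), timedelta(hours=4), timedelta(hours=24))
short_spec = WindowSpec(timedelta(hours=12), timedelta(hours=4), timedelta(hours=6))

features_long, label_long = build_example(events, outcome, index, long_spec)
features_short, label_short = build_example(events, outcome, index, short_spec)
```

With the same patient and index time, the 24-hour prediction window labels the case positive while the 6-hour window labels it negative, which is one reason unreported window lengths make results hard to compare or reproduce.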
Data availability
Not applicable
Funding
This work was supported in part by the National Science Foundation under grant #1838745 and the National Institute of General Medical Sciences of the National Institutes of Health under grant GM132008.
Author information
Contributions
SP and VS conceptualized the idea, developed the search criteria, assessed records for eligibility, and edited and approved the final manuscript. SP screened articles and drafted the manuscript.
Ethics declarations
Ethical Approval
Not applicable
Competing Interests
The authors declare no competing interests.
About this article
Cite this article
Pungitore, S., Subbian, V. Assessment of Prediction Tasks and Time Window Selection in Temporal Modeling of Electronic Health Record Data: a Systematic Review. J Healthc Inform Res 7, 313–331 (2023). https://doi.org/10.1007/s41666-023-00143-4
DOI: https://doi.org/10.1007/s41666-023-00143-4