Introduction

Cardiovascular diseases (CVDs), which cause over 18.9 million deaths globally each year, are the leading cause of death, responsible for approximately 31% of all deaths worldwide [1, 2]. Heart failure (HF) accounts for a large portion of this CVD morbidity and mortality, as well as an equally large portion of the related healthcare expense [2]. One in five people will develop HF in their lifetime, and about 50% of these HF patients will die within 5 years [3]. In the management of this expanding HF patient population, accurate prediction of HF outcomes is critical to effective prevention and treatment, as well as to reducing the burdensome expenditure of related healthcare dollars. The importance of accurate outcome prediction is accentuated by the impact of HF readmissions: the approximately 20% of patients who are readmitted within 30 days of HF discharge cost Medicare approximately 17 billion dollars [4].

The expansive implementation of the electronic health record (EHR) has placed the demographic characteristics, genetic profiles, medical treatments, diagnostic notes, laboratory results, and imaging data of individual patients into an electronic format that facilitates access and use in “big data” research investigations. The feasibility of truly personalized and precision medicine depends upon effective management of the vast quantities of EHR data now available, which are critical to the development of predictive models. The sheer volume and heterogeneous nature of EHR data have raised new challenges regarding data integration and analysis. Machine learning models that computationally integrate and analyze EHR data are therefore critical to developing personalized and precision prevention and treatment, with improved HF patient outcomes.

Machine learning models, such as random forests [5], decision trees [6], logistic regression [7], and support vector machines [8], have been successfully and widely used in many prediction and classification tasks. Moreover, deep learning models, such as deep belief networks [9], deep convolutional networks [10], and long short-term memory [11] models, as well as more complex architectures, have for the most part demonstrated stronger prediction and classification capability than traditional machine learning models. In HF-related studies, machine learning and deep learning models have been developed using variables derived from the complex and diverse EHR data of HF patients. Critical to this rapidly developing area is a functional knowledge of how machine learning and deep learning models are applied to the EHR, wearable sensor, genetic, and proteomic data associated with HF diagnosis, hospitalization, readmission, and mortality prediction. Our goal is to summarize state-of-the-art machine learning approaches to HF risk prediction.

Several challenges must be overcome before machine learning models can be applied on a personalized and precision basis to the diagnostic and predictive management (diagnosis, prevention, risk stratification, and treatment) of HF patients. For example, algorithms must be developed to allow full integration of the widely diverse data available in the EHR, ranging from textual medical reports and a wide variety of imaging data formats to developing fields such as personalized genetic profiles. Furthermore, applying machine learning to prediction in rare disease populations will require better techniques for managing the effects of imbalanced datasets, as well as the identification of stable, clinically applicable, and informative risk factors that make the models interpretable and actionable.

Methods

We conducted a comprehensive review of the literature published between January 2015 and August 2020 by searching the PubMed database for relevant papers. The keywords searched included “machine learning heart failure” and “deep learning heart failure.” The search for “machine learning” and “heart failure” returned 353 papers, and the search for “deep learning” and “heart failure” returned 69 papers. After removing the articles common to both searches, 374 unique papers remained, of which 335 relevant papers were published from 2015 onward. We reviewed and selected a subset of the most applicable publications (Table 1).

Table 1 Summary of selected articles related to heart failure readmission and mortality prediction

Statistics of Study Articles

Several trends are apparent in the applications of machine learning and deep learning in HF subpopulations (Fig. 1). Figure 1a shows publishing trends by plotting the number of publications related to machine learning and deep learning in HF from 2015 to 2020. Steady growth in the number of publications can be observed after 2015, and especially after 2018, suggesting progressive clinical recognition of the value of machine learning and deep learning algorithms applied to HF. Figure 1b shows the top 20 journals in which the 335 collected papers were published. These journals accounted for 35.2% of the papers published in the past 5 years. Most of them are influential journals in the research fields of HF (e.g., European Journal of Heart Failure, JACC Heart Failure, and Circulation Heart Failure), health and medical informatics (e.g., JMIR Medical Informatics, IEEE Journal Biomedical Informatics, and BMC Medical Informatics and Decision Making), and medical imaging and biotechnology (e.g., Medical Image Analysis, Computers in Biology and Medicine).

Fig. 1

Summary statistics of the 335 collected papers on machine learning and deep learning in patients with heart failure. a Publication trends from 2015 to 2020. b The top 20 journals in which the papers were published. c Word cloud of the titles of the 335 papers

Moreover, we collected the titles of the 335 published papers and generated a word cloud to capture the most studied topics in the application of machine learning and deep learning algorithms in HF patients. Figure 1c shows the word cloud of all paper titles, generated after using the Natural Language Toolkit (NLTK) [32] to lemmatize each word. The word cloud highlights the specific techniques used in studying these HF patients (machine learning, deep learning, neural network, artificial intelligence) and the medical outcomes studied (readmission, mortality, and detection). Other prominent words included “patient,” “prediction,” and “risk model,” suggesting that a primary focus was HF patient risk stratification.
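For readers who wish to reproduce this kind of summary, the following is a minimal sketch of the title word-cloud step. It assumes the 335 titles are stored one per line in a local text file (titles.txt, a placeholder name) and uses NLTK for lemmatization together with the wordcloud and matplotlib packages; it is an illustration of the general procedure, not the exact script used for Fig. 1c.

```python
# Minimal sketch: lemmatize paper titles with NLTK and render a word cloud.
# Assumes a local file "titles.txt" with one paper title per line.
import nltk
from nltk.stem import WordNetLemmatizer
from nltk.corpus import stopwords
from wordcloud import WordCloud
import matplotlib.pyplot as plt

# One-time downloads of the NLTK resources used below
nltk.download("punkt")
nltk.download("wordnet")
nltk.download("stopwords")

lemmatizer = WordNetLemmatizer()
stop_words = set(stopwords.words("english"))

with open("titles.txt") as f:
    text = f.read().lower()

# Tokenize, drop stop words and non-alphabetic tokens, and lemmatize each word
tokens = [
    lemmatizer.lemmatize(tok)
    for tok in nltk.word_tokenize(text)
    if tok.isalpha() and tok not in stop_words
]

# Build and display the word cloud from the lemmatized tokens
cloud = WordCloud(width=800, height=400, background_color="white")
cloud.generate(" ".join(tokens))
plt.imshow(cloud, interpolation="bilinear")
plt.axis("off")
plt.show()
```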

Machine Learning and Deep Learning for Heart Failure Risk Prediction

HF outcome prediction is critical to the accurate application of the many available therapeutic options, ranging from pharmacologic therapy to highly invasive mechanical ventricular assistance and cardiac transplantation. Recent HF outcome prediction investigations have focused on EHR, echocardiographic, proteomic, and wearable sensor data. In one investigation using quantitative features derived from echocardiography images, domain-expert-selected features and data-driven selected features were evaluated in machine learning models, including decision tree models; data-driven feature selection achieved much better prediction accuracy than expert-driven feature selection [12]. In another study, the timing and amplitude of left ventricular (LV) images were analyzed to obtain myocardial motion and deformation information across the cardiac cycle during rest and stress; the results suggested that LV imaging can provide informative features for predicting HF with preserved ejection fraction (HFpEF) [15].
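To make the distinction concrete, the sketch below contrasts expert-driven and data-driven feature selection on a tabular feature set using scikit-learn. The feature matrix X, outcome vector y, the expert_columns list, and the value of k are illustrative placeholders rather than the actual echocardiographic features of [12].

```python
# Minimal sketch: expert-driven vs. data-driven feature selection with a
# decision tree. X is a hypothetical pandas DataFrame of quantitative
# echo-derived measurements; y is a binary outcome label.
import pandas as pd
from sklearn.feature_selection import SelectKBest, mutual_info_classif
from sklearn.tree import DecisionTreeClassifier
from sklearn.pipeline import make_pipeline
from sklearn.model_selection import cross_val_score

# Hypothetical expert-chosen measurements (domain knowledge)
expert_columns = ["ejection_fraction", "e_over_e_prime", "la_volume_index"]

def compare_feature_strategies(X: pd.DataFrame, y):
    tree = DecisionTreeClassifier(max_depth=4, random_state=0)

    # Expert-driven: restrict the model to the predefined columns
    auc_expert = cross_val_score(tree, X[expert_columns], y,
                                 cv=5, scoring="roc_auc").mean()

    # Data-driven: let mutual information pick the k most informative columns
    # (selection sits inside the pipeline, so it is refit in each CV fold)
    pipe = make_pipeline(SelectKBest(mutual_info_classif, k=10), tree)
    auc_data = cross_val_score(pipe, X, y, cv=5, scoring="roc_auc").mean()
    return auc_expert, auc_data
```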

In addition to imaging data, EHR data are informative for HF risk prediction. In one investigation, structured and unstructured EHR data were used to evaluate four approaches to HF hospitalization prediction [14•]; the results indicated that the unstructured notes were important and could improve prediction accuracy. In another study, eight machine learning approaches, including generalized linear model nets (GLMN), random forests, support vector machines, logistic regression, and neural networks, were evaluated for HF hospitalization prediction using patient demographic, medical, and clinical data [17]; the GLMN achieved the best performance. In an investigation of HF related to cancer therapeutics, EHR data were used to predict the risk of HF occurrence in cancer patients [18]. The results indicated that machine learning models can not only predict HF risk but also identify potentially contributing clinical factors.
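The kind of multi-model benchmark described above can be sketched in a few lines. In the example below, the feature matrix X and label vector y are placeholders, the hyperparameters are illustrative rather than tuned, and elastic-net-penalized logistic regression is used as a stand-in for the GLMN family; this is not the protocol of [17].

```python
# Minimal sketch: cross-validated comparison of several classifier families
# on the same EHR-derived feature table (X: numeric features, y: binary
# hospitalization label). All settings are illustrative.
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.svm import SVC
from sklearn.neural_network import MLPClassifier
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

models = {
    # Elastic-net-penalized logistic regression as a stand-in for "GLM nets"
    "glm_net": LogisticRegression(penalty="elasticnet", solver="saga",
                                  l1_ratio=0.5, C=1.0, max_iter=5000),
    "random_forest": RandomForestClassifier(n_estimators=500, random_state=0),
    "svm": SVC(kernel="rbf"),
    "logistic_regression": LogisticRegression(max_iter=5000),
    "neural_network": MLPClassifier(hidden_layer_sizes=(64, 32), max_iter=1000),
}

def benchmark(X, y):
    # Report the mean cross-validated AUC of each model on standardized inputs
    return {
        name: cross_val_score(make_pipeline(StandardScaler(), model),
                              X, y, cv=5, scoring="roc_auc").mean()
        for name, model in models.items()
    }
```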

To further improve prediction accuracy using EHR data, novel embedding approaches [13] have been developed to convert EHR data into clinically meaningful numeric vectors/features; applying machine learning models to these numeric features can improve prediction accuracy. Moreover, wearable equipment and sensors are being developed to acquire real-time data from HF patients and to monitor potential risks remotely [19••]. For example, wearable devices that remotely capture patient electrocardiography (ECG) and seismocardiogram signals [19••] can be used to predict HF risk and thereby potentially reduce hospitalizations and mortality. In addition to EHR and imaging data, novel biomarkers derived from the urinary proteomics data of HF patients [16] have also been investigated; such biomarkers may allow machine learning models that combine them with EHR and imaging data to predict HF risk more accurately.
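One common embedding strategy, sketched below, treats each patient's ordered sequence of clinical codes as a "sentence" and learns code vectors with word2vec, which are then averaged into a patient-level feature vector. This is only an illustration of the general idea, not the specific method of [13]; the code strings shown are hypothetical.

```python
# Minimal sketch: learn medical-code embeddings from per-patient code
# sequences with gensim word2vec, then average them into patient vectors.
import numpy as np
from gensim.models import Word2Vec

# Hypothetical per-patient code sequences (e.g., ICD/ATC codes as strings)
patient_sequences = [
    ["I50.9", "N18.3", "C09AA05", "I10"],
    ["I50.1", "E11.9", "C03CA01"],
    # ... one list per patient
]

# Learn 64-dimensional embeddings from code co-occurrence within patients
model = Word2Vec(sentences=patient_sequences, vector_size=64,
                 window=5, min_count=1, sg=1, epochs=20)

def patient_vector(codes):
    """Average the embeddings of a patient's codes into one feature vector."""
    vectors = [model.wv[c] for c in codes if c in model.wv]
    return np.mean(vectors, axis=0) if vectors else np.zeros(model.wv.vector_size)

# Patient-level feature matrix, ready for any downstream classifier
X = np.vstack([patient_vector(seq) for seq in patient_sequences])
```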

Machine Learning and Deep Learning for Prediction of Hospital Readmissions

The hospital readmission rate is a significant challenge in the management of HF patients. It is widely accepted that HF outcomes and healthcare expense can be improved if HF patients at high risk of readmission can be accurately identified and targeted with management algorithms. Several machine learning approaches, most involving the use of EHR data, have been employed to identify HF patients at high risk for readmission. In one study, a naïve Bayes model was used to predict HF readmission using data from patients' primary encounters [3]; specifically, the features most strongly associated with HF readmission were selected for each subset of the patient cohort and combined as the input to the predictive naïve Bayes model. In another study, a tree-based model adapted from the random forest model was proposed to predict HF readmission using demographic, socioeconomic, utilization, service-based, comorbidity, and severity features.
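The "select the top associated features, then fit a naïve Bayes classifier" pattern can be expressed compactly, as in the sketch below. The feature matrix X (assumed to be non-negative indicators drawn from the primary encounter), the readmission label y, and the choice of k are placeholders, and the selection test used here (chi-squared) is one reasonable option rather than the one used in [3].

```python
# Minimal sketch: top-k feature selection followed by a naive Bayes
# readmission classifier, evaluated with cross-validated AUC.
from sklearn.feature_selection import SelectKBest, chi2
from sklearn.naive_bayes import BernoulliNB
from sklearn.pipeline import make_pipeline
from sklearn.model_selection import cross_val_score

readmission_model = make_pipeline(
    SelectKBest(chi2, k=30),  # keep the 30 features most associated with readmission
    BernoulliNB(),            # naive Bayes over the selected binary features
)

def evaluate(X, y):
    # Cross-validated discrimination of the selected-feature naive Bayes model
    return cross_val_score(readmission_model, X, y, cv=5, scoring="roc_auc").mean()
```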

In addition to traditional machine learning approaches, deep learning models have also been proposed for HF readmission prediction. In one study, a multilayer perceptron (MLP) model was developed to predict HF readmission from EHR data [20], including demographics, admission characteristics, medical history, emergency department visits, medication history, and out-of-hospital healthcare services. The MLP model was designed to tolerate the class imbalance that characterizes readmission data, in which readmitted patients form a small minority. Another study proposed the deep unified network (DUN) model, which integrates the output of each hidden layer to capture potentially informative features, to predict HF readmissions [21•]; the DUN model outperformed logistic regression, gradient boosting, and general deep neural networks. To better handle the longitudinal, temporal nature of EHR data, the long short-term memory model has also been successfully employed for HF outcome prediction [23••]. In addition to EHR data, telemonitoring data have also been used to identify HF patients at high readmission risk [22]. The combination of EHR data with tele-HF data and wearable sensor data has considerable potential for predicting HF readmission.
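The sketch below illustrates, in Keras, the two ideas raised in this paragraph: a recurrent model over longitudinal visit data and simple class weighting for the rare readmission outcome. It assumes each patient is represented as a fixed-length, zero-padded sequence of visit feature vectors; the architecture, dimensions, and weighting scheme are illustrative and are not those of the cited studies.

```python
# Minimal sketch: an LSTM over padded visit sequences with class weighting
# for the minority (readmitted) class. Input shape is assumed to be
# (n_patients, n_visits, n_features); all dimensions are illustrative.
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

n_visits, n_features = 10, 50  # illustrative dimensions

model = keras.Sequential([
    layers.Input(shape=(n_visits, n_features)),
    layers.Masking(mask_value=0.0),         # ignore zero-padded visits
    layers.LSTM(64),                        # summarize the visit sequence
    layers.Dense(32, activation="relu"),
    layers.Dense(1, activation="sigmoid"),  # probability of 30-day readmission
])
model.compile(optimizer="adam", loss="binary_crossentropy",
              metrics=[keras.metrics.AUC(name="auc")])

def fit(X_train, y_train):
    # Up-weight the minority (readmitted) class in proportion to its rarity
    pos_rate = float(np.mean(y_train))
    class_weight = {0: 1.0, 1: (1.0 - pos_rate) / max(pos_rate, 1e-6)}
    model.fit(X_train, y_train, epochs=20, batch_size=64,
              validation_split=0.2, class_weight=class_weight)
```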

Machine Learning and Deep Learning for Mortality Prediction

Accurate mortality prediction is critical to effective therapeutic decision-making in HF patients. This prediction is challenging because of the lack of stable marker factors, noise in the data, and the prevalence of imbalanced data sets. In one study, several machine learning approaches, including random forests, logistic regression, AdaBoost, decision trees, and support vector machines, were used for HF mortality prediction using EHR data [29]. A similar set of machine learning models was evaluated for mortality prediction in HF patients using a comprehensive set of data, including all baseline demographic, clinical, laboratory, electrocardiographic, and symptom data [27•]; the random forest model achieved the best performance among these models.

In other recent studies, deep learning models have been used to improve mortality prediction in HF patients. In one study, a novel deep learning model, the Feature Rearrangement based Convolution Net (FReaConvNet), was used to predict in-hospital, 30-day, and 12-month mortality by mining the most important features; such feature importance analysis is valuable for improving prediction accuracy on imbalanced data sets. Using EHR and laboratory data, a machine learning analysis identified two particularly important features, serum creatinine and ejection fraction [31]; using only these two features yielded better prediction results than using the full set of available EHR features [31]. In another study, computer vision models that use convolutional networks to compute heart motion trajectories accurately predicted patient survival from 4D cardiac imaging (3D MRI images plus time) [30••].
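The two-feature comparison reported in [31] can be reproduced in spirit with a short script, sketched below. The file name and column names are assumptions about how such a tabular dataset might be organized, and the random forest settings are illustrative rather than those of the original analysis.

```python
# Minimal sketch: compare a two-feature mortality model (serum creatinine and
# ejection fraction) against a model using all available features.
# File and column names are assumed, not taken from the cited study.
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

df = pd.read_csv("heart_failure_records.csv")           # assumed local file
y = df["death_event"]                                    # assumed outcome column
X_all = df.drop(columns=["death_event"])
X_two = df[["serum_creatinine", "ejection_fraction"]]    # assumed column names

clf = RandomForestClassifier(n_estimators=500, random_state=0)

auc_two = cross_val_score(clf, X_two, y, cv=5, scoring="roc_auc").mean()
auc_all = cross_val_score(clf, X_all, y, cv=5, scoring="roc_auc").mean()
print(f"Two-feature AUC: {auc_two:.3f} | All-feature AUC: {auc_all:.3f}")
```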

Conclusions

HF is associated with high morbidity and mortality, as well as excessive associated healthcare cost. Using EHR data, including demographic characteristics, medical treatment history, medical diagnosis notes, laboratory results, imaging data, and genetic and proteomic profiles, machine learning and deep learning models have been employed to predict HF readmission and mortality. These models are essential to personalized and precision prevention, treatment, and management of HF patients. A range of machine learning and deep learning models have been evaluated for these prediction tasks with considerable success using large, variable-rich data sets derived from the EHR. Nonetheless, challenges remain, and novel machine learning models are still needed to integrate diverse and heterogeneous data in the quest to identify high-risk HF patients more accurately. The diverse and heterogeneous attributes of clinical EHR data include variable data formats (longitudinal versus fixed data; structured versus unstructured data; text versus complex imaging data), the measurement of different aspects of the disease, data noise, and the predominance of imbalanced HF versus control study samples. Novel machine learning models for systematic data representation, integration, and prediction have the potential to revolutionize prediction accuracy.