Key Summary Points

Machine learning is a branch of artificial intelligence that can accomplish a range of tasks through supervised, unsupervised, and semi-supervised learning. Deep learning has tremendous potential and is gaining prominence in the field of cardiology.

Machine learning facilitates automation, risk stratification, prediction, quantification, and precision phenotyping. It can be integrated with radiomics.

There is a strong potential for false discovery and biases. The principal investigator and medical team must play an active role during algorithm training and development.

Introduction

With high-dimensional data emanating from a variety of platforms which include miniaturized devices and third-party apps, commercial industries have been significantly transformed in this current digital era [1, 2]. Artificial intelligence (AI) has proven to be a valuable tool in navigating these new frontiers of information technology [3]. The presence of AI is palpable in day-to-day interaction with devices from voice recognition software such as Siri or Alexa to self-driving automobiles [4]. Similarly, these waves of technological changes have trickled down into the arena of healthcare [5]. For example, certain US Food and Drug Administration (FDA)-approved AI algorithms can read chest radiographs. AI will tremendously expand the boundaries of medicine and will ultimately elevate the quality of patient care.

Machine learning (ML), a subset of AI, can extract and decipher various patterns present in large data reserves [6]. ML and AI have made substantial leaps and bounds in the field of cardiology where there is high-dimensional data. AI can connect information from a multitude of sources and automate calculations or various processes [7]. These changes have been seen in cardiovascular imaging, electrophysiology, heart failure, and interventional cardiology. In this review article, we will discuss the impact of AI and ML in various facets of clinical cardiology. This article is based on previously conducted studies and does not contain any new studies with human participants or animals performed by any of the authors.

Growth and Types of Machine Learning

In recent times, the size and complexity of data arising from various modalities have increased substantially [8]. The sheer size can exceed the capacity of current statistical software. ML algorithms, by contrast, yield more precise findings as the data become larger [9]. This massive data influx has propelled the growth of ML. In addition, ML algorithms are more dynamic and data-driven [10].

ML is a broad term that refers to a wide variety of algorithms. ML can be classified into supervised learning, unsupervised learning, semi-supervised learning, reinforcement learning, and deep learning [11]. Supervised learning utilizes labeled parameters or domains within a dataset to orchestrate actions [12]. Unsupervised learning can be considered more agnostic, operating independently to discover patterns within databases [13]. Within unsupervised learning, clustering approaches are proving to be valuable. They can detect new subtypes or variants within complex heterogeneous entities such as heart failure or aortic stenosis. Identifying these variants can allow more specific treatment options to be directed towards particular subtypes of these conditions [11]. Semi-supervised learning has attributes of both supervised and unsupervised learning [14]. Reinforcement learning is utilized less frequently than the other approaches. This ML framework utilizes certain reward criteria to execute appropriate actions [14].
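
To make the clustering idea concrete, the following is a minimal sketch of k-means, one common unsupervised clustering method; it is not the method used in any of the studies cited here. The patient values, the two features, and the function names are hypothetical and chosen only for illustration. Real phenotyping work would use many more variables, feature scaling, and validation in external cohorts.

```python
def sq_dist(p, q):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(points, k, max_iters=100):
    """Plain k-means with a deterministic farthest-point initialisation."""
    # Start from the first point, then repeatedly add the point that is
    # farthest from all centroids chosen so far.
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(
            max(points, key=lambda p: min(sq_dist(p, c) for c in centroids))
        )
    labels = []
    for _ in range(max_iters):
        # Assignment step: each point joins its nearest centroid.
        labels = [
            min(range(k), key=lambda c: sq_dist(p, centroids[c])) for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        updated = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                updated.append(tuple(sum(col) / len(members) for col in zip(*members)))
            else:
                updated.append(centroids[c])
        if updated == centroids:  # converged: no centroid moved
            break
        centroids = updated
    return labels, centroids


# Hypothetical patients described by (ejection fraction %, E/e' ratio).
patients = [(60, 8), (62, 9), (58, 7), (61, 8), (55, 16), (57, 18), (54, 17), (56, 15)]
labels, centroids = kmeans(patients, k=2)
print(labels)
print(centroids)
```

No labels are supplied to the algorithm; the two groups emerge from the data alone, which is the sense in which unsupervised learning is described above as agnostic.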

Among all ML frameworks, deep learning is more advanced and adept in processing information [15]. Other algorithms require extensive experience with training sets before executing any given action. Deep learning differs tremendously in this aspect as it needs less training time and performs extremely well with large datasets [16]. To simplify, the structure of deep learning is like that of a neuron. It is arranged in a series of layers [16]. Once information is extracted, it is processed between ascending and descending layers. Owing to the increased computational capacity of computer processing units (CPU), deep learning has proven to be extremely effective in image classification, speech recognition, and genomics [17]. Recurrent neural networks and functional neural networks are frequently used in cardiology [7].

Application of ML in Cardiovascular Imaging

The field of cardiovascular imaging has seen significant adoption of ML for the analysis of images in diagnosis and prognostication (Fig. 1). Cardiovascular imaging broadly encompasses various imaging modalities such as echocardiography, computed tomography (CT), nuclear cardiology, and magnetic resonance imaging (MRI). As technology continues to evolve and grow, new parameters are being added to existing modalities [9]. AI and ML can automate calculations and connect the information in a meaningful manner [9].

Fig. 1 ML improves diagnosis and prognosis

Numerous studies have assessed the impact of ML in the field of echocardiography (Table 1). Samad et al. utilized ML to predict all-cause mortality in 171,510 patients by using multiple echocardiographic parameters and information from electronic medical records [18]. The mean area under the curve (AUC) was used to evaluate models and scoring systems over multiple cross-validations. The ML model showed superior accuracy (all AUC > 0.82, p < 0.001) compared with logistic regression models and common clinical risk scores (AUC = 0.69 to 0.79) over 10 survival durations from 6 to 60 months. A unique aspect of the study was its use of a broad initial hypothesis rather than a focused one, which can help address gaps in knowledge. The original inquiry can then be revised or can lead to brand new questions for investigation. Pandey et al. explored a deep learning model which integrated multidimensional echocardiographic data to distinguish novel subtypes of patients with heart failure with preserved ejection fraction (HFpEF) [19]. The authors identified high- and low-risk phenotypes, and the performance of the model was assessed in two external cohorts. In addition, relationships between these phenotypes and adverse clinical outcomes were further assessed in the TOPCAT clinical trial data. The deep learning model demonstrated a greater area under the receiver operating characteristic (ROC) curve than the 2016 American Society of Echocardiography (ASE) guidelines for predicting elevated left ventricular filling pressures in patients with HFpEF (0.88 vs 0.67; p = 0.01). In the TOPCAT cohort, the high-risk phenotype showed higher rates of hospitalization/cardiac death (HR = 1.92, p = 0.01) and higher event-free survival with spironolactone therapy (HR = 0.65, p = 0.01). Current guidelines for classifying HFpEF are vague, but ML models can further stratify these patients and identify treatment approaches with a higher likelihood of a positive response in certain subtypes.
Sengupta et al. utilized a supervised ML model to augment echocardiographic stratification of aortic stenosis (AS) severity. They identified high-severity and low-severity AS phenotypes, which were compared to markers of disease severity in CT and cardiac magnetic resonance (CMR) imaging and to major clinical outcomes such as aortic valve replacement (AVR) and mortality [20]. Close to 70% of the 1964 patients were classified as having non-severe or discordant AS, but the ML model showed 1117 (57%) patients having high-severity AS and 847 (43%) patients having low-severity AS. High-severity groups had a higher incidence of elevated calcium scores and left ventricular fibrosis. In relation to current classification approaches, ML-derived classification had enhanced discrimination (integrated discrimination improvement 0.17, CI 0.02–0.12) and reclassification (net reclassification improvement 0.17, CI 0.11–0.23) for AVR outcomes at 5 years [20]. Current recommendations for evaluating AS are hindered by diagnostic ambiguity as many complications in AS arise secondary to valvular obstruction and ventricular decompensation [21]. ML-derived frameworks can help better stratify AS severity and possibly improve the timing of interventional approaches for these patients.
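
Because AUC is the main yardstick in the studies summarized in this review, a short sketch of how it can be computed may be helpful. The version below uses the rank interpretation of the AUC: the probability that a randomly chosen patient with the event receives a higher predicted risk than a randomly chosen patient without it, with ties counted as one half. The outcomes and risk values are hypothetical, and the function name is an assumption made for illustration.

```python
def roc_auc(outcomes, scores):
    """AUC as the fraction of (event, non-event) pairs ranked correctly."""
    pos = [s for y, s in zip(outcomes, scores) if y == 1]
    neg = [s for y, s in zip(outcomes, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one event and one non-event")
    # A correctly ordered pair scores 1, a tie scores 0.5, otherwise 0.
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical data: 1 = patient had the event, 0 = patient did not.
died = [0, 0, 1, 0, 1, 1, 0, 1]
ml_risk = [0.10, 0.30, 0.80, 0.20, 0.70, 0.40, 0.50, 0.90]       # made-up ML output
clinical_risk = [0.20, 0.50, 0.60, 0.40, 0.30, 0.45, 0.10, 0.70]  # made-up risk score

auc_ml = roc_auc(died, ml_risk)
auc_clinical = roc_auc(died, clinical_risk)
print(auc_ml, auc_clinical)
```

An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation, which is why the comparisons above are reported as one AUC against another rather than as a single accuracy figure.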

Table 1 Machine learning in echocardiography

The use of CT is ubiquitous in current cardiology practice for diagnosing coronary artery disease (CAD) and procedure planning for structural heart interventions [22, 23]. It plays a seminal role in the evaluation of patients with chest pain. Several studies have applied ML algorithms in CT (Table 2). Hwang et al. applied topological data analysis (TDA) to characterize different patient groups based on coronary plaque features from computed tomography angiography (CTA) and clinical parameters, and to evaluate their relationship with various outcomes [24]. Group A had the lowest amount of coronary plaque, group B had a moderate amount of coronary plaque enriched with fibrofatty or necrotic core, and group C had the largest amount of plaque with dense calcium components. Among the three groups established by TDA, group B had a higher incidence of acute coronary syndrome (0.3% vs 2.6% vs 0.6%; p < 0.001) while groups B and C had elevated rates of revascularization (3.1% vs 15.5% vs 17.8%; p < 0.001). TDA can efficiently characterize patients on the basis of plaque properties or dynamics and assess their relationship with different outcomes. Advances in computing power have made it possible to analyze tomographic images (e.g., CT or CMR) in significant detail, and image features can be extracted and turned into numbers. Each image can have hundreds of features that can be analyzed or quantified. The overall process is collectively known as radiomics [25]. Radiomic features can be broadly classified into intensity, structure, and texture or gradient [26]. Kay et al. utilized an ML framework in conjunction with radiomics in CT with coronary calcium score (CAC) to identify phenotypic properties of high-risk left ventricular hypertrophy (LVH) in 1982 patients. Interestingly, these ML models were quite accurate in LVH detection [27]. Kay et al. showed the possibility of an ML radiomic pipeline capable of identifying patients with high-risk LVH solely on the basis of CT with CAC, without requiring other diagnostic imaging. This approach also has the additional benefit of reduced radiation exposure for patients. Al’Aref et al. demonstrated that an ML model incorporating clinical factors and calcium scores from CTA can effectively predict CAD in the CONFIRM registry [28]. ML combined with CAC showed a higher AUC (0.881) than ML alone (0.773), coronary calcium (0.886), and the updated Diamond–Forrester score (0.682) in 35,281 patients with CTA. Many of the current risk score paradigms, such as the updated Diamond–Forrester score, systematize risk assessment based on clinical factors. The updated Diamond–Forrester score is a pre-test probability score that can help select patients for additional testing [29]. Previous studies have shown these systems have low to moderate performance when assessing chest pain in the general population [30, 31]. By integrating ML with CAC, we can significantly improve risk stratification and help improve medical management.

Table 2 Machine learning in computed tomography

Single-photon emission computed tomography (SPECT) myocardial perfusion imaging (MPI) is the diagnostic imaging pillar of nuclear cardiology and plays a seminal role in the evaluation of myocardial ischemia [6]. ML algorithms can help augment the diagnostic performance of SPECT and improve the quality of clinical care; this has been observed in several studies (Table 3). Otaki et al. assessed a deep learning model for the prediction of obstructive CAD from SPECT MPI in 3578 patients [32]. The deep learning model was compared to automated total perfusion deficit (TPD) and expert reader diagnosis. For obstructive CAD detection, the deep learning model had a higher AUC (AUC 0.83; 95% CI 0.82–0.85) than TPD (AUC 0.78; 95% CI 0.77–0.80) or expert reader diagnosis (AUC 0.71; 95% CI 0.69–0.72; P < 0.0001 for both). This study shows the potential and superiority of deep learning: it can be easily integrated with standard clinical software for CAD detection following MPI. Betancur et al. explored the role of supervised learning in assessing the predictive value of SPECT MPI integrated with patient clinical variables for major adverse cardiovascular event (MACE) prediction in 2619 patients [33]. The cohort underwent exercise or pharmacological SPECT MPI and was monitored for MACE; 239 patients had MACE at 3-year follow-up. Interestingly, ML combined with patient information had higher MACE prediction than ML with imaging alone (AUC = 0.81 vs 0.78, p < 0.01). The ML model had superior MACE predictive capacity when compared to expert readers, TPD, and automated ischemic perfusion deficit (AUC 0.81 vs 0.65 vs 0.73 vs 0.71, p < 0.01 for all) [33]. Hu et al. utilized ML for per-vessel prediction of early coronary revascularization 90 days following SPECT MPI in 1980 patients. The ML model had a greater AUC for predicting early coronary revascularization than regional or ischemic TPD (0.79 vs 0.71 vs 0.72, P < 0.001) [34]. Similarly, the ML algorithm outperformed TPD or expert cardiologist interpretation for the prediction of early coronary revascularization.

Table 3 Machine learning in nuclear cardiology

Cardiovascular magnetic resonance (CMR) imaging has quickly ascended as a pivotal diagnostic modality in cardiology that enables tissue characterization and facilitates medical management [35]. ML algorithms can propel and augment the capabilities of CMR to enhance cardiovascular risk stratification and diagnosis [35]. CMR is frequently used in the diagnosis of hypertrophic cardiomyopathy (HCM), but safety issues regarding gadolinium have been reported. Mancio et al. presented an ML model incorporating radiomic features that was capable of detecting HCM in 1099 patients without gadolinium administration. The ML-derived radiomic algorithm successfully detected HCM in patients without fibrosis (AUC = 0.83; 95% CI 0.77–0.89; sensitivity of 91%) [36] (Table 4). By integrating radiomics with ML, we can reduce safety concerns associated with gadolinium in patients. Ruijsink et al. examined the role of deep learning for automated left ventricular assessment; the results were compared with manual analysis in 100 patients undergoing CMR [37]. The algorithm's results closely correlated with manual analysis of left and right ventricular volumes (all r > 0.95) and strain (circumferential r = 0.89, longitudinal r > 0.89). The study showcases the potential of deep learning for automated ventricular assessment, which could minimize physician oversight and divert more time towards patient care in the future.

Table 4 Machine learning in cardiac magnetic resonance imaging

Application of ML in Electrophysiology

ML algorithms demonstrate tremendous promise in opening new frontiers in electrophysiology [38]. ML can transform electrophysiology by expanding our understanding of new phenotypes in multiple conditions and improving the risk stratification [38]. Several studies have shown increased clinical utility and the potential of ML in electrophysiology.

The electrocardiogram (ECG) is the most fundamental test in the field of cardiology. ML frameworks can extract information from a variety of rhythms to augment ECG interpretation. Mjahad et al. compared multiple ML algorithms in ECG to detect ventricular tachycardia (VT) and ventricular fibrillation (VF) [39] (Table 5). The deep learning algorithm had more than 98% accuracy for identifying VT and VF, and it performed better than other ML algorithms. Many patients with asymptomatic left ventricular dysfunction remain unrecognized in the general population. Attia et al. explored the role of deep learning in identifying left ventricular dysfunction from the ECG [40]. The algorithm was trained on ECG and echocardiogram data from 44,959 patients and was then tested on a separate population of 52,870 patients. It was extremely effective in detecting left ventricular dysfunction (AUC = 0.93, sensitivity = 86.3%, specificity = 85.7%) by solely examining the ECG. This could serve as a potential screening tool for detecting left ventricular dysfunction. Atrial fibrillation (AF) is frequently asymptomatic and can lead to various complications like stroke or heart failure. Attia et al. developed a deep learning model capable of identifying the electrocardiographic signature of AF during normal sinus rhythm by using ECG strips [41]. The model was trained using ECGs from 180,922 patients with normal sinus rhythm and AF. After training, the deep learning model was very successful in identifying AF in a separate cohort consisting of 36,280 patients (AUC = 0.87, specificity = 0.79, overall accuracy = 0.79). ML-derived analysis of normal sinus rhythm enables rapid identification of AF at the point of care, and this can facilitate earlier management for appropriate patients.
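
Unlike the AUC, sensitivity, specificity, and accuracy depend on the threshold at which a model's output is converted into a positive or negative call. The sketch below shows how the three quantities follow from the resulting confusion matrix. The scores, labels, and the 0.5 threshold are hypothetical and are not taken from the cited studies.

```python
def diagnostic_metrics(truth, predicted):
    """Sensitivity, specificity, and accuracy from binary labels (1 = disease)."""
    tp = sum(1 for t, p in zip(truth, predicted) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(truth, predicted) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(truth, predicted) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(truth, predicted) if t == 0 and p == 1)
    return {
        "sensitivity": tp / (tp + fn),   # diseased patients correctly flagged
        "specificity": tn / (tn + fp),   # healthy patients correctly cleared
        "accuracy": (tp + tn) / len(truth),
    }


# Hypothetical reference standard (e.g., echocardiography) and model scores.
truth = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.6, 0.3, 0.2, 0.1, 0.4, 0.3, 0.7, 0.2]

threshold = 0.5  # assumed cut-off; moving it trades sensitivity for specificity
predicted = [1 if s >= threshold else 0 for s in scores]

metrics = diagnostic_metrics(truth, predicted)
print(metrics)
```

For a screening tool, a lower threshold would typically be preferred so that fewer affected patients are missed, at the cost of more false positives that need confirmatory testing.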

Table 5 Machine learning in electrophysiology

Cardiac implantable electronic devices contain large reserves of information that ML algorithms can tap to predict response to cardiac resynchronization and sudden cardiac death, and this can help determine which patients will benefit from implantable cardioverter-defibrillator (ICD) implantation. Cardiac resynchronization therapy (CRT) is frequently restricted to candidates meeting certain criteria, and some studies have shown that ML can expand the application of CRT. Kalscheur et al. applied multiple ML algorithms to the COMPANION trial data to create a model predicting outcomes after CRT, which included all-cause mortality or heart failure hospitalization [42]. The random forest model, a supervised ML algorithm, was found to be superior for predicting CRT outcomes when compared to the other algorithms (AUC = 0.74). Similarly, Feeny et al. explored the potential of multiple ML algorithms for predicting response to CRT and survival free from a composite end point of death, heart transplant, or placement of a left ventricular assist device. In 455 patients, the ML algorithm showed better response prediction than guidelines (AUC = 0.70, p = 0.012) and better event-free survival (p < 0.001) [43]. Although CRT can reduce mortality and morbidity in patients, individual outcomes can vary substantially. The ML algorithm can be applied before the device is implanted to help predict outcomes, and this can improve decision-making for each patient. Patients can experience multiple occurrences of ventricular arrhythmias during an electrical storm, and this is associated with significant mortality and morbidity [44]. Shakibfar et al. constructed an ML model which could predict electrical storms on the basis of ICD monitoring summaries [45]. In 19,935 patients, the ML model was quite efficacious (AUC = 0.80) for anticipating the probability of electrical storm [45]. As models or systems have not been developed for predicting electrical storms, ML can open new possibilities for predicting their occurrence.

Application of ML in Heart Failure

Heart failure (HF) is a common clinical entity seen worldwide which is frequently linked to hospitalizations, poor quality of life, and shortened life expectancy [46]. Though much is known, our understanding can be further strengthened by ML algorithms [46]. Several studies have assessed the impact of ML algorithms on HF (Table 6).

Table 6 Machine learning in heart failure

Angraal et al. explored the potential of multiple ML models for predicting mortality and hospitalization in patients with HFpEF with the TOPCAT trial data [47]. The supervised ML model was superior to other ML models and had a mean C-statistic of 0.72 for predicting mortality (Brier score 0.17) and 0.76 for HF hospitalization (Brier score 0.19). Wang et al. evaluated the role of multiple ML models for predicting hospitalization and readmission in patients with HF with reduced ejection fraction (HFrEF) [48]. The deep learning model outperformed other ML models for predicting 30- and 90-day readmission rates for patients with HFrEF (AUC = 0.977 and 0.972). Lancaster et al. constructed a clustering ML framework to evaluate left ventricular function in high-risk phenotypes for patients with HF. MACE, mortality, and hospitalizations were compared between cluster and conventional classifications. The clustering ML framework identified diastolic dysfunction in 559 of 866 patients, and the results coincided with conventional classification (κ = 0.41, p < 0.001) [49]. Sanchez-Martinez et al. assessed left ventricular function at rest and exercise to evaluate differences between HFpEF and healthy patients [50]. The ML algorithm was able to aggregate patients according to similarity, which facilitated the evaluation of velocity patterns. The ML-proposed groups correlated well with current clinical criteria (κ = 72.6%; 95% CI 58.1–87.0) [50]. ML algorithms may provide additional insight into underlying pathology or mechanism in patients with HFpEF.
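
Two of the statistics quoted above are less familiar than the AUC. The Brier score is the mean squared difference between predicted probabilities and observed outcomes (lower is better, 0 is perfect), and Cohen's κ measures agreement between two classifications after removing the agreement expected by chance. The following is a minimal sketch of both under hypothetical data; the function names and values are illustrative assumptions, not the computations performed in the cited studies.

```python
def brier_score(outcomes, probabilities):
    """Mean squared error between predicted probabilities and 0/1 outcomes."""
    return sum((p - y) ** 2 for y, p in zip(outcomes, probabilities)) / len(outcomes)


def cohen_kappa(rater_a, rater_b):
    """Chance-corrected agreement between two sets of categorical labels."""
    n = len(rater_a)
    categories = set(rater_a) | set(rater_b)
    observed = sum(1 for a, b in zip(rater_a, rater_b) if a == b) / n
    # Agreement expected if both raters assigned labels independently.
    expected = sum(
        (rater_a.count(c) / n) * (rater_b.count(c) / n) for c in categories
    )
    if expected == 1:  # both raters used a single identical category
        return 1.0
    return (observed - expected) / (1 - expected)


# Hypothetical predicted mortality risks and observed outcomes (1 = died).
observed_outcomes = [1, 0, 0, 1]
predicted_risks = [0.9, 0.2, 0.1, 0.6]
brier = brier_score(observed_outcomes, predicted_risks)

# Hypothetical phenotype labels from an ML clustering and from clinical criteria.
ml_groups = ["high", "high", "low", "low", "high", "low", "low", "high"]
clinical_groups = ["high", "high", "low", "low", "low", "low", "high", "high"]
kappa = cohen_kappa(ml_groups, clinical_groups)

print(brier, kappa)
```

In this toy example the two classifications agree on 6 of 8 patients (75%), but half of that agreement would be expected by chance, so κ is 0.5 rather than 0.75.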

Application of ML in Interventional Cardiology

Though AI and ML have made significant advances in other subspecialties of cardiology, they are still in the nascent stages in interventional cardiology (IC) [51]. AI has the capability of pushing IC into new frontiers with concurrent growth in robotic and new technology fields [51]. Few studies have assessed the capabilities of AI in percutaneous coronary intervention (PCI) and transcatheter aortic valve replacement (TAVR) (Table 7).

Table 7 Machine learning in interventional cardiology

Physiological measurements acquired with a coronary pressure wire enable myocardial ischemia detection [52]. A coronary pressure wire can be utilized to generate an instantaneous wave-free ratio (iFR) pressure wire pullback trace, which measures pressure loss across the length of a coronary artery. The iFR can be used to predict outcomes following PCI [53]. Cook et al. explored the interpretation of iFR pressure wire data during coronary revascularization by AI and by an expert interventional team [54]. The AI algorithm had a higher agreement with the heart team response than the expert team for hemodynamic appropriateness (89.4% vs 89.3%, p < 0.01 for non-inferiority). Azzalini et al. applied an ML algorithm in 2648 patients to assess which contrast agent contributed to acute kidney injury (AKI) after PCI; they did not find any type to be significantly linked to AKI [55]. Abdul Ghffar et al. developed a semi-supervised ML model in 344 patients with TAVR to isolate phenotype groups and assess their relationships with clinical outcomes [56]. The ML algorithm isolated five phenotype groups that had significant differences in comorbidities and clinical outcomes. Group 5 was associated with higher rates of in-hospital cardiovascular mortality (OR = 9, p = 0.001) and 30-day cardiovascular mortality (OR = 18, p = 0.02). Interestingly, for 30-day mortality, the phenotype grouping had better mortality prediction than the Society of Thoracic Surgeons scoring (AUC 0.96 vs AUC 0.8, p = 0.02). Hernandez-Suarez et al. assessed multiple ML models for predicting in-hospital mortality in 10,833 patients with TAVR from the National Inpatient Sample [57]. Many of the ML models were capable of predicting in-hospital mortality (AUC > 0.80) following TAVR.

Evolving Beliefs and Perceptions on ML

Traditional research moves in a sequential pathway from a hypothesis to a conclusion. This linear structure is deeply embedded in our research dogma and is fundamental to most analyses. However, ML algorithms can cause a paradigm shift in this ideology by giving paramount importance to data-driven analytics [10]. Initial evaluation of data should be performed by the ML algorithm, and objectives of the projects can be modified or altered on the basis of a machine-provided insight. This non-linear approach may appear unconventional but is reflective of daily life [58]. Life never moves from point A to B but moves dynamically or haphazardly. Greater integration of ML could be instrumental to clinical trials [23]. Many randomized clinical trials fail to reach completion because of poorly defined objectives, inadequate planning, poor execution, and inadequately powered endpoints [23]. Preliminary analysis by these algorithms can offer valuable information to clinical investigators, and clinical trials can be managed more effectively [17]. ML can help in the effective utilization of resources and manpower to help in the successful completion of clinical trials.

Numerous studies have shown significant progress of ML in cardiovascular imaging and electrophysiology [1, 2]. Though this indicates a positive direction for ML in most aspects of cardiology, IC is lagging behind [1]. AI and ML can offer significant opportunities in the catheterization laboratory by providing insight into intravascular imaging and procedural guidance during PCI or TAVR [51]. They can streamline several processes in the catheterization lab. As robotic technology becomes more advanced and increasingly integrated into IC, it will allow procedures to be more precise and efficient [51]. However, robotic technology may not fully comprehend coronary anatomy or perceive the intention of the operator. ML can bridge this gulf between man and machine, and it can help expedite the growth of robotic technology in IC [59].

Although AI and ML may offer significant opportunities in various subspecialties of cardiology, many people in the medical community may have a foreboding feeling regarding this technology [5]. A common misconception is that ML may replace practitioners and medical staff [5]. It is quite the contrary. In reality, it will substantially help reduce the workload. Multiple new parameters are constantly being added to various diagnostic modalities in cardiovascular imaging and electrophysiology [9]. This influx of information can lead to cognitive overload and be counterproductive. ML algorithms can automate calculations, image classification, and quantification in a user-friendly manner [2]. In interventional cardiology, many young operators may lack experience in the early years. Information arising from national and international registries can be utilized to help guide decision-making [59]. AI can be integrated with virtual planning to create digital twins, and various interventional treatment procedures can be tested on the twin before the actual intervention [1]. Instead of disrupting workflow, ML can be a trustworthy companion which can streamline several processes [23]. Furthermore, it can allow physicians more time for meaningful interactions with their patients.

Limitations

Although the fruits of AI can be enticing, AI is still far from perfect [7]. Several issues need to be addressed before successful implementation can be possible [60]. One of the key issues is the absence of standardization and interoperability across institutions [6]. Each center has its own protocols for labeling and storing information. Numerous imaging systems reside in each institution, encompassing picture archiving and communication systems (PACS) or digital imaging and communications in medicine (DICOM) [6]. Information from the 12-lead ECG, Holter, and telemetry should be collected in a uniform and accessible manner within an institution. Cardiologists and researchers should have access to this information without difficulty. These data repositories should be shared easily among institutions or made publicly available [7]. This will greatly expedite the growth of ML algorithms.

The potential for false discovery is a common issue with ML algorithms [12, 15]. The optimal performance of ML algorithms requires exposure to large databases to help train the algorithm [12]. A few biases may unintentionally occur within a model. This is related to the “Black box” concept of ML algorithms [7, 35]. Moral conscience is not prebuilt within these algorithms. Before initiating a project, the principal investigator and engineer must review the purpose of the project and design the model accordingly [4]. They must be involved at every step of the way. This will enhance the performance of the algorithm and yield promising results. As these algorithms continue to grow, medical curriculums must incorporate the fundamentals of ML in medical and residency training to help train future investigators [1].

Future Possibilities of Machine Learning

With smartphones and mobile apps becoming part of our daily lifestyle, the “smart clinic” is no longer a distant concept [11]. These clinics would employ a variety of miniaturized devices, which can include pocket ultrasound, mobile ECG readers, and a variety of smartphone applications [11]. These devices would be linked to ML algorithms which can analyze information immediately and provide precision medicine at the point of care [3]. Smart clinics could serve an important role in underserved areas where access to medical care is a challenge.

Conclusion

AI and ML will have a phenomenal impact on cardiology, opening up a range of possibilities and opportunities. As with any innovation, we may encounter various hurdles and difficulties. Without a doubt, ML will be strongly linked to the future of cardiology in years to come.