Introduction

Recently, there has been growing interest in possible applications of data mining and artificial intelligence (AI) in medicine. The field of radiomics includes a collection of techniques used to automatically extract large amounts of quantitative features from medical images through the analysis of pixel grey level distribution, thus possibly leading to new insights in pathophysiological mechanisms underlying different medical conditions [1]. Texture analysis (TA) is one of the main areas of radiomics, evaluating grey level value patterns in images that are not detectable by qualitative assessment by a human reader. Therefore, it plays an important role in analyzing features of different tissues or organs in radiology, contributing to the potential development of new biomarkers [2]. For example, texture features may have histopathologic correlates that may help in the evaluation of patient prognosis [3].
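To make the idea of "grey level value patterns" more concrete, the following is a minimal sketch of one widely used texture descriptor, the grey-level co-occurrence matrix (GLCM), with two features derived from it. The review does not state which texture features the cited studies used; the toy 4 × 4 patches, the horizontal one-pixel offset, and all function names below are assumptions chosen purely for illustration.

```python
from collections import Counter


def glcm(image, dx=1, dy=0):
    """Normalized grey-level co-occurrence matrix for a single pixel offset.

    Returns a dict mapping (grey_level_a, grey_level_b) to the probability
    of observing that pair at the given offset.
    """
    counts = Counter()
    rows, cols = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                counts[(image[r][c], image[r2][c2])] += 1
    total = sum(counts.values())
    return {pair: n / total for pair, n in counts.items()}


def contrast(p):
    """Weights each co-occurring pair by its squared grey-level difference."""
    return sum(prob * (i - j) ** 2 for (i, j), prob in p.items())


def homogeneity(p):
    """Close to 1 when neighbouring pixels tend to share similar grey levels."""
    return sum(prob / (1 + abs(i - j)) for (i, j), prob in p.items())


# Hypothetical 4 x 4 patches with four grey levels (0 to 3).
smooth_patch = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [2, 2, 3, 3],
    [2, 2, 3, 3],
]
checker_patch = [
    [0, 3, 0, 3],
    [3, 0, 3, 0],
    [0, 3, 0, 3],
    [3, 0, 3, 0],
]

smooth_contrast = contrast(glcm(smooth_patch))
checker_contrast = contrast(glcm(checker_patch))
smooth_homogeneity = homogeneity(glcm(smooth_patch))
checker_homogeneity = homogeneity(glcm(checker_patch))
```

In this toy case the checkerboard patch yields a much higher contrast and a lower homogeneity than the blocky patch. Numeric descriptors of this kind are what a radiomics pipeline would pass on to a model.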

AI is frequently used to develop classification or regression models from radiomics data [4]. In particular, machine learning (ML) is the subfield of AI which enables predictive modeling through automated recognition of patterns in the data space (Fig. 1) [5]. ML is based on the use of different algorithm types, which can be broadly classified by their training mechanism into supervised, unsupervised, and reinforcement learning [6]. Supervised learning requires labeled data to guide the training process, whereas unsupervised learning does not, as the software automatically searches for structures in the data instead. The unsupervised learning process usually results in data clustering, which needs subsequent analysis to correlate its findings with outcomes of interest. Finally, in reinforcement learning, positive and negative reinforcement loops progressively improve the prediction ability of the algorithm, leading to growth in accuracy through “experience.” ML applications may also combine these types of learning, even though supervised learning is the most common approach in medical imaging. Among the different subtypes of ML algorithms, neural networks (NNs) are frequently used in radiology, due to their intrinsic ability to analyze images. This type of model is loosely inspired by the human brain, as it is based on a network of nodes, also called “neurons.” Every node stores a numeric value, and each connection between two nodes carries a weight, which represents the strength of that connection. This architecture results in a multilayer network of nodes, each layer progressively working on a higher degree of abstraction, with the final layer encoding the desired output. Deep learning (DL) is a type of NN which contains multiple hidden layers, detecting complex, non-linear relationships between image features [7]. Thus, DL allows high-level abstractions of the data present in medical images [8].
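The description of nodes, weights, and layers can be illustrated with a minimal forward pass through a tiny fully connected network. This is only a sketch under stated assumptions: the weights are hypothetical fixed numbers rather than trained values, the three inputs stand in for three radiomic features, and the choice of ReLU and sigmoid activations is illustrative and not taken from any of the cited studies.

```python
import math


def relu(x):
    return max(0.0, x)


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def dense(inputs, weights, biases, activation):
    """One fully connected layer.

    Each output node holds activation(sum(weight * input) + bias), where
    each weight is the strength of one connection into that node.
    """
    return [
        activation(sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(weights, biases)
    ]


# Hypothetical, untrained weights: 3 inputs -> 2 hidden nodes -> 1 output.
W_HIDDEN = [[0.5, -0.2, 0.1], [-0.3, 0.8, 0.4]]
B_HIDDEN = [0.0, 0.1]
W_OUTPUT = [[1.0, -1.0]]
B_OUTPUT = [0.0]


def forward(features):
    """Propagate one feature vector through the hidden and output layers."""
    hidden = dense(features, W_HIDDEN, B_HIDDEN, relu)
    return dense(hidden, W_OUTPUT, B_OUTPUT, sigmoid)[0]


# The output lies between 0 and 1 and can be read as a class probability.
probability = forward([1.0, 2.0, 3.0])
```

Training would adjust the weights so that the output matches labeled examples. A deep network follows the same pattern with many more hidden layers.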

Fig. 1. Example of machine learning and deep learning-based image processing pipelines in radiomics

In the past few years, ML has proved to be potentially useful in multiple subspecialties of healthcare, and several of these tools are now approved for clinical practice [9,10,11]. Radiology is one of the most promising fields for the application of radiomics and ML, as these may be used for automatic detection and characterization of lesions or segmentation of medical images [12, 13]. In particular, a growing number of scientific works have shown ML to be a powerful tool in imaging of cardiovascular diseases [14]. For instance, it may improve image acquisition and reconstruction time [15]. ML tools have also shown promising results in automated segmentation of anatomical structures and classification of diseases [16, 17]. Finally, ML may provide new understanding of known diseases through its ability to uncover hidden patterns in the data, thus improving their future management [18].

This review aims to provide an overview of promising applications of radiomics and ML in the imaging of cardiovascular disease, sorted by imaging modality. Specifically, we will focus on cardiac imaging, given its essential role in diagnosing numerous cardiac diseases and the consequently growing number of AI tools in this field [19].

Echocardiography

Echocardiography is a widely used imaging modality in cardiology, particularly for the assessment and measurement of heart chambers and in the study of valvular disease [20]. It may greatly benefit from ML tools as these could be used to obtain automated and accurate measurements, reducing the inter- and intra-rater variability that is typical of ultrasound examinations. For example, ML-based software could automatically calculate clinically relevant echocardiography parameters, such as left ventricular ejection fraction. Ash et al. developed a ML model, trained on more than 50,000 echocardiographic exams, which automatically calculates left ventricular ejection fraction with high consistency (mean absolute deviation = 2.9%) as well as high sensitivity and specificity (0.90 and 0.92, respectively) [21]. These solutions may improve the imaging workflow, as well as increase the accuracy of measurements, particularly for less-experienced operators [22]. Similarly, AI may be used to automatically calculate global longitudinal strain and left atrial volume (LAV) [23]. As shown by Mor-Avi and colleagues, who evaluated 92 patients, the values of LAV obtained from echocardiography present high correlation with those derived from cardiac magnetic resonance (CMR), in particular when using the real-time 3D technique (r = 0.93 vs. r = 0.74 for maximal LAV; r = 0.88 vs. r = 0.82 for minimal LAV) [24].

AI may also enable automatic detection of wall motion anomalies from echocardiography, as shown by Huang and colleagues. Their group developed an accurate convolutional NN model using a training dataset of 10,638 echocardiography exams performed in two tertiary care hospitals [25]. It achieved an area under the receiver operating characteristic curve (AUC) of 0.891, sensitivity of 0.818, and specificity of 0.816 [25]. The potential value of ML is also emerging in the setting of aortic valve stenosis management, again through automated measurements and image analysis [26].
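Since AUC, sensitivity, and specificity are the performance metrics reported throughout this review, a minimal sketch of how they are computed from model outputs may be useful. The labels and scores below are hypothetical toy values, not data from any cited study, and the 0.5 decision threshold is an arbitrary assumption. The AUC is computed here through its rank interpretation: the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative one.

```python
def sensitivity_specificity(labels, scores, threshold):
    """Sensitivity and specificity at a fixed decision threshold.

    labels: 1 for disease present, 0 for absent.
    scores: model outputs, higher meaning more likely diseased.
    """
    tp = sum(1 for y, s in zip(labels, scores) if y == 1 and s >= threshold)
    fn = sum(1 for y, s in zip(labels, scores) if y == 1 and s < threshold)
    tn = sum(1 for y, s in zip(labels, scores) if y == 0 and s < threshold)
    fp = sum(1 for y, s in zip(labels, scores) if y == 0 and s >= threshold)
    return tp / (tp + fn), tn / (tn + fp)


def auc(labels, scores):
    """Area under the ROC curve via pairwise comparison of scores.

    Counts the fraction of (positive, negative) pairs in which the positive
    case scores higher; ties count as half.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in positives
        for n in negatives
    )
    return wins / (len(positives) * len(negatives))


# Hypothetical toy data: four diseased and four healthy cases.
toy_labels = [1, 1, 1, 1, 0, 0, 0, 0]
toy_scores = [0.9, 0.8, 0.6, 0.3, 0.7, 0.4, 0.2, 0.1]

toy_sensitivity, toy_specificity = sensitivity_specificity(
    toy_labels, toy_scores, threshold=0.5
)
toy_auc = auc(toy_labels, toy_scores)
```

Sensitivity and specificity depend on the chosen threshold, whereas the AUC summarizes performance across all thresholds. This is why a model can combine a reasonable AUC with high sensitivity and low specificity at a given operating point.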

Another avenue for the implementation of radiomics and ML in echocardiography is represented by the characterization of myocardial tissue anomalies. This type of analysis may be challenging as changes are often subtle. Kagiyama et al. used both supervised and unsupervised learning approaches to develop a ML tool. In this case, a training dataset of 534 echocardiography scans was used, with corresponding CMR images serving as the reference standard. The resulting model predicted the presence of myocardial fibrosis with an AUC of 0.84, sensitivity of 86.4%, and specificity of 83.3% [27].

Finally, ML may identify functional phenotypes from whole–cardiac cycle echocardiography. In particular, Loncaric et al. used unsupervised learning trained on a dataset of 189 patients with known hypertension and 97 healthy controls and found that their software could automatically identify patterns in velocity and deformation which correlate with specific structural and functional remodeling [28]. Similarly, AI has been used to analyze diastolic parameters correlating with specific phenotypes, thus leading to a more personalized patient management [29].

Coronary Computed Tomography Angiography

Coronary computed tomography angiography (CCTA) has become one of the most important diagnostic exams in cardiology in multiple settings. Indeed, it plays a pivotal role in the diagnosis of chronic coronary syndrome, as it is recommended as the initial test for diagnosing coronary artery disease, especially when this condition cannot be excluded by clinical exams alone [30]. Radiomics has proved useful in identifying vulnerable coronary atherosclerotic plaques. For instance, it was used to extract features from CCTAs performed on 624 individuals of the Framingham Heart Study cohort with an Agatston score higher than 0. These patients were clinically followed for more than 9 years, and ML accurately identified subjects at risk of major cardiovascular events among them [31]. Furthermore, Kolossváry et al. developed a tool which detects the napkin-ring sign, an imaging finding of atherosclerotic plaques which correlates with major adverse cardiac events [32]. They enrolled 2674 patients who underwent CCTA due to stable chest pain. Twenty patients with napkin-ring sign were identified within this cohort and matched with 30 healthy controls. More than 4000 radiomics features were extracted from each exam, and the model had an excellent discriminatory power, with a reported AUC > 0.80. On the other hand, Hamersvelt and colleagues used DL to identify patients with significant coronary artery stenosis among those classified as having an intermediate degree of stenosis (corresponding to 25–69% vessel caliber reduction) [33]. This approach showed good potential, with an AUC of 0.76 and a sensitivity of 92.6%; however, the specificity was only 31.1% [33]. Radiomics may also aid in detecting the presence of coronary inflammation, which has been associated with higher risk of major cardiovascular events [34].

CCTA is known to have high negative predictive value to exclude acute coronary syndrome, in particular in patients with low-to-intermediate pre-test probability [35]. Some ML tools have been developed to improve CCTA’s performance in the setting of acute coronary syndrome. For instance, Hinzpeter et al. created a ML model based on TA data using CCTAs of 20 patients with acute myocardial infarction and 20 healthy controls. This proved to be accurate in distinguishing healthy individuals from those with acute myocardial infarction (AUC of 0.90), even if on a small sample of cases overall [36]. Hu and colleagues used radiomics to predict major adverse cardiovascular events from CCTA features [37]. They collected a total of 105 lesions from 88 CCTAs in the training set, and 31 CCTAs were used as the validation set. A total of 1409 radiomics features were extracted and the final model demonstrated an AUC of 0.762 for the training set and 0.671 for the validation one. These results are promising, although this tool also requires further validation prior to consideration for its introduction in clinical practice.

Recently, imaging of pericoronary adipose tissue on routine CCTA has been shown to be a good way to measure coronary inflammation [38]. Building on this, Lin et al. created a model integrating CCTA and clinical features which employs radiomic data of pericoronary adipose tissue to accurately (AUC = 0.87) distinguish patients with myocardial infarction from those with stable or absent coronary artery disease (CAD) [39]. Interestingly, Mannil et al. found that AI could also be helpful in the setting of non-contrast enhanced low radiation CCTA. They investigated the use of different models (NN, decision tree, naïve Bayes, random forest, sequential minimal optimization), based on TA radiomic data. These proved to be effective in detecting myocardial infarction from non-contrast enhanced low radiation CCTA, with the best (naïve Bayes) achieving a sensitivity of 83% and a specificity of 84% [40].

Radiomics can also be used to identify features useful to predict higher cardiovascular risk. For instance, Oikonomou and colleagues used ML to find features of perivascular adipose tissue associated with major cardiovascular events in three experiments [41]: the first analysis compared adipose tissue biopsies obtained from patients undergoing cardiac surgery with CT images; the second used random forest to distinguish patients who suffered from major cardiovascular events from healthy controls; and the third focused on patients with acute myocardial infarction. Radiomics can also detect features of perivascular adipose tissue (apart from inflammation) associated with CAD [42]. Furthermore, ML may accurately identify patients who require coronary intervention. Liu et al. enrolled 296 patients with symptomatic CAD and stenosis (> 50%) to create a training dataset in order to develop a DL tool which could automatically calculate fractional flow reserve [43]. It proved to be accurate, thus possibly reducing the need for invasive coronary intervention. The automated computation of fractional flow reserve with ML may also be useful in the emergency setting for patients suffering from acute chest pain [44].

AI may also be useful in the differential diagnosis process in particular settings. For example, radiomics can accurately differentiate artifact caused by left atrial appendage from thrombi, as shown by Ebrahimian and colleagues. They developed a highly accurate tool (AUC = 0.85) which only requires early-phase contrast-enhanced CT images to work [45]. Similarly, a ML model may be used in the setting of suspected prosthetic valve obstruction to differentiate pannus from thrombi or vegetation [46].

CCTA may also be useful in evaluating the myocardium when CMR is not available. For example, Qin and colleagues used radiomics to detect myocardial fibrosis in hypertrophic cardiomyopathy using CMR as reference [47]. They enrolled 161 patients and used logistic regression to create a classification model which proved to have high diagnostic power (AUC = 0.81 in the training set and 0.78 in the testing cohort). Esposito et al. used TA to detect extra-cellular matrix changes in the myocardium of patients with ventricular tachycardia, analyzing late iodine enhancement images and identifying different phenotypes of remodeling [48]. Similarly, the analysis of late iodine enhancement with ML may also be useful in distinguishing cardiac sarcoidosis from non-ischemic cardiomyopathies [49].

Radiomics and ML could also identify patients with high risk of major cardiovascular events among those with left ventricular hypertrophy using non-contrast cardiac computed tomography, with high accuracy (AUC > 0.70) [50].

Cardiac Magnetic Resonance Imaging

Cardiac magnetic resonance (CMR) is an essential modality in cardiovascular imaging as it allows evaluation of both function and structure of the heart, and it is crucial in the diagnosis and management of many diseases. An increasing number of AI tools have been developed for CMR, aimed at reducing acquisition and reading time as well as improving reproducibility. As previously mentioned, they may also help in automated classification of lesion phenotypes. For instance, Cetin and colleagues extracted radiomic features from CMRs to build models for the classification and diagnosis of cardiovascular diseases [51]. The same research group also used different types of ML algorithms (support vector machine, random forest, and logistic regression) to identify specific CMR features in patients with cardiovascular risk factors (in particular hypertension, diabetes, high cholesterol, and current or previous smoking) [52]. This approach identified cardiac tissue textures specific to each risk group (AUC > 0.6).

Regarding myocardial infarction, Chen et al. used TA to extract features from native and post-contrast T1 mapping images to evaluate the extracellular volume fraction mapping and detect irreversible changes after myocardial infarction [53]. In their study, an AUC of 0.91 was achieved, and thus their pipeline may be helpful in predicting left ventricular adverse remodeling. In this setting, TA may also be used to extract features from late-gadolinium enhancement (LGE) images which correlate with a higher risk of developing arrhythmias. This in turn may lead to improved selection of patients who would benefit from an implantable cardioverter defibrillator [54]. Radiomics may also be employed to differentiate non-viable, viable, and remote infarcted myocardial segments by analyzing LGE patterns [55]. It may also extract important additional information from unenhanced images, which may be relevant when it is not possible to employ contrast agents (e.g., in case of renal impairment, a common condition in CAD patients). Quanmei et al. used radiomics features from unenhanced T1 mapping and T1 values to diagnose myocardial injury in ST-segment elevation myocardial infarction with high accuracy (AUC = 0.88 in the training set and 0.86 in the test set) [56]. Similarly, Zhang and colleagues developed a DL tool to automatically detect and delineate chronic myocardial infarction from unenhanced CMRs, which showed an AUC of 0.94 [57]. Eftestøl et al. investigated TA’s ability to identify, with high specificity (84%), patients who would require an implantable cardioverter defibrillator among those with myocardial infarction [58]. ML may also be useful in the differential diagnosis between myocarditis with acute clinical presentation and acute myocardial infarction. In this setting, Baessler et al. performed TA of T1 and T2 map sequences from 39 CMRs, achieving an AUC of 0.88, a sensitivity of 89%, and a specificity of 92% [59].

In clinical practice, another role for CMR is represented by the diagnosis and management of cardiomyopathies, and ML may also help in this domain. For example, TA may help to discriminate between hypertensive heart disease and hypertrophic cardiomyopathy. Neisius et al. used it to analyze global native T1 mapping images from 232 subjects and their solution achieved an overall accuracy of 0.86 [60]. Alis et al. used both TA and ML to identify patients with tachyarrhythmia from a population of subjects affected by hypertrophic cardiomyopathy [61]. They enrolled 64 patients and tested different types of ML algorithms (support vector machines, naive Bayes, k-nearest-neighbors, and random forest) to analyze LGE patterns, achieving a sensitivity of 95.2%, specificity of 92.0%, and accuracy of 95%. The analysis of LGE patterns may also predict the risk of developing adverse events in the setting of hypertrophic cardiomyopathy with systolic dysfunction [62]. Furthermore, TA may allow for the extraction of features correlating with tachyarrhythmia in patients with hypertrophic cardiomyopathy from non-contrast T1 images [63].

Interestingly, radiomics may help in associating specific genetic mutations with imaging phenotypes. Wang et al. developed an image analysis pipeline to classify hypertrophic cardiomyopathy patients related to MYH7 or MYBPC3 mutations using exclusively T1 native maps, resulting in an AUC higher than 0.90 [64]. Similarly, TA of T1 maps may aid in differentiating patients with dilated cardiomyopathy from healthy controls, as shown by Shao et al. This group implemented a support vector machine model with an accuracy higher than 0.85 [65]. In dilated cardiomyopathy, DL and ML may be helpful in identifying specific phenotypes and predicting prognosis [66].

Finally, DL may make it possible to reduce or even avoid the use of gadolinium in CMR, as proposed by Bustamante et al. using cardiovascular 4D flow MRI. This software may be useful, for example, in congenital heart disease patients as pediatric subjects are likely to require long-term follow-up [67, 68]. Additionally, the analysis of native T1 maps may help identify patients with a low likelihood of LGE, thus avoiding contrast administration in selected cases [69].

Nuclear Cardiology

Single-photon emission computed tomography (SPECT) is an important imaging modality in assessing significant CAD and risk of major cardiovascular events [70]. As with other modalities, ML may be useful as it may obtain automated segmentations of SPECT images [71]. Furthermore, ML and DL may also be used to classify SPECT images and identify patients with CAD. Apostolopoulos et al. used different subtypes of ML (NNs and random forest) to analyze a large dataset composed of 566 patients who underwent gated SPECT with 99mTc-tetrofosmin. They showed that these tools could perform diagnosis with an accuracy of 79.15% [72]. Deep convolutional NNs could also predict risk of CAD and obstructive disease from 99mTc-tetrofosmin SPECT with high accuracy (AUC = 0.80) [73].

ML may be employed to reduce scan time and radiation dose by avoiding the acquisition of one or more phases of SPECT studies. For instance, Eisenberg et al. used ML to potentially avoid the acquisition of the rest phase in SPECT, as they developed an algorithm exclusively based on the stress myocardial phase in conjunction with multiple clinical features. They reported an accurate prediction of obstructive and high-risk CAD, with an AUC of 0.84 [74]. Hu and colleagues developed a ML tool which can predict per-vessel coronary revascularization within 90 days after stress/rest 99mTc-Sestamibi/Tetrofosmin (AUC = 0.79), even outperforming the interpretation of expert nuclear cardiologists [75].

ML may also improve automatic detection of myocardial perfusion abnormalities. In particular, a deep convolutional NN improved the detection rate of myocardial perfusion abnormalities from stress/rest SPECT performed with 99mTc-Tetrofosmin or 99mTc-Sestamibi with an AUC of 0.872 [76]. ML may also be used in analyzing PET myocardial perfusion data to predict the risk of adverse cardiovascular events [77].

Image Quality Improvement

Another application of ML, especially DL, is represented by improvements in image acquisition. Specifically, DL models may be trained to reduce image noise, artifacts, radiation dose, and inter- and intra-observer variation of measurements [78]. For instance, ML has been used to improve echocardiography acquisition, facilitating access to this imaging modality in the emergency setting. Narang et al. developed DL software that helps non-expert users acquire exams of acceptable quality. They evaluated 240 exams from two academic hospitals, obtaining diagnostic echocardiography scans in 92.5–98.8% of patients [79].

Regarding cardiac CT, the main aim of ML is to obtain good quality images while reducing radiation dose. This may be achieved by creating synthetic contrast-enhanced images from non-enhanced acquisitions, thus also avoiding contrast injection [80]. Another possibility is to use low dose protocols, which unfortunately usually lead to an increase in image noise [81]. For example, Wolterink et al. used a convolutional NN to automatically convert low dose CT images into higher quality images, comparable to routine-dose CT, enabling accurate coronary calcium scoring [82].

ML may also improve CMR image quality. In this setting, it can be used to reduce motion artifacts, which may strongly deteriorate the diagnostic quality of images. In particular, Küstner et al. used DL to retrospectively obtain high quality images from low quality ones, where motion artifacts were present. It is interesting to note that, while they obtained high quality images, some anatomical structures were erased or altered; therefore, this type of image processing requires further evaluation prior to introduction in clinical practice [83]. Furthermore, ML may be used to improve image reconstruction, thus improving quality and reducing scan time [84]. For instance, Hauptmann used ML to reduce acquisition time in patients with congenital heart disease, achieving good image quality and also obtaining automated measurements of heart chambers which were comparable with those of expert radiologists [85].

Discussion

As shown in our review, radiomics and AI have numerous potential applications in the field of cardiovascular imaging. These range from improved image acquisition and higher inter-reader reproducibility to better diagnostic accuracy and more personalized patient management. In the future, they may also enable automated and accurate prognosis prediction, while implementations in the short-to-medium term could reduce artifacts, radiation dose, and scan time. Even though a constantly growing number of studies are performed using these tools, few of the tools are actually approved for clinical practice [86]. As a matter of fact, there are still some issues to overcome, of which physicians and patients, as end users, should be aware.

First of all, methodological quality of radiomics and ML studies is frequently low. This has been demonstrated by multiple systematic reviews performed in other fields of medical imaging [12, 87,88,89]. Unfortunately, this finding has also been recently confirmed in the setting of cardiovascular imaging, in particular regarding CT and CMR research [90]. Out of 53 papers reviewed, the median quality score was only 19.4% (interquartile range = 11.1–33.3%), which is not satisfactory. On the other hand, median quality showed a positive trend over the years, even though it peaked at approximately 25%. This systematic review and quality assessment highlights the need for higher standards that should be expected for this area of research by journals, reviewers, and readers.

Specific limitations that lower the quality of studies in this area are also tied to inconsistencies in study design or presentation. For example, image acquisition protocols and preprocessing steps are frequently not described in detail in the papers. The narrow scope of most radiomics research also reduces its potential value, as exams are usually performed in a single institution, which prevents a proper assessment of model reproducibility and generalization on new data [91]. Another common issue is that almost all ML studies are retrospective in nature, which increases the risk of reporting bias [92]. These points lead to another concern, overfitting. This may be due to excessive tailoring of the ML model to the training population, poor quality of data, or ineffective preprocessing, and it results in a low ability to generalize [93]. In other words, the results obtained in one institution may not be replicable at another site, hindering the clinical applicability of the process. To overcome overfitting, the ideal solution is to train ML models on large multi-institutional datasets with appropriate data processing [94]. Finally, as highlighted by audits of public imaging datasets, it is also crucial to evaluate the quality of medical images used for the training process [95, 96]. Low quality input data can only result in low quality models.
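One way to probe the generalization concern described above, when data from several institutions are available, is to hold out every exam from one site at a time and test only on that unseen site. The sketch below shows such a leave-one-site-out split. It is an illustrative assumption rather than a procedure prescribed by the cited studies, and the record fields ("exam", "site") and site names are hypothetical.

```python
def leave_one_site_out(records):
    """Yield (held_out_site, train_records, test_records) for each site.

    All exams from the held-out site go to the test set, so the model is
    always evaluated on an institution it never saw during training.
    """
    sites = sorted({record["site"] for record in records})
    for site in sites:
        train = [r for r in records if r["site"] != site]
        test = [r for r in records if r["site"] == site]
        yield site, train, test


# Hypothetical exam records from three institutions.
exam_records = [
    {"exam": "e1", "site": "A"},
    {"exam": "e2", "site": "A"},
    {"exam": "e3", "site": "B"},
    {"exam": "e4", "site": "B"},
    {"exam": "e5", "site": "C"},
]

splits = list(leave_one_site_out(exam_records))
```

A large drop in performance between the training sites and the held-out site would suggest that the model has adapted to site-specific characteristics, such as scanner or protocol, rather than to the disease itself.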

In ML, model interpretability and explainability still represent an open issue, especially for highly complex algorithms such as DL. Intuitively, it is desirable for the decision process of a predictive model to be clearly presented, facilitating its adoption by physicians. This would also allow for greater involvement of the end user in evaluating the correctness of the model’s output and timely identification of biases or other issues [93]. On the other hand, current technology does not make this type of information available, and it is not realistic to expect it to do so in the next few years. Some domain experts have already proposed that our attention should not be focused on “understanding” DL models, but rather on requiring strong validation alone [97]. In any case, a consensus should be reached on the actual requirements of radiomics and ML software prior to their approval for clinical use. Unfortunately, many products are becoming commercially available while the supporting evidence is still unsatisfactory [9].

Conclusions

The number of radiomics and ML-based tools will probably continue to increase in the future. Even in the light of current issues limiting their effective implementation in clinical practice, they still present the potential to positively impact cardiovascular imaging and improve patient outcome. Physicians must become well-versed in the basics of radiomics and familiar with good data science practices to be confident end users and retain a leadership role in this emerging domain of medical imaging.