Background

Diabetes is a common chronic metabolic disease, and its incidence is second only to cardiovascular disease and malignant tumors. As diabetes progresses, it can cause damage to the vascular endothelium and promote the formation of atherosclerosis, which leads to cardiovascular disease. Coronary heart disease (CHD) is a serious complication of diabetes mellitus with high disability and fatality rates [1,2,3]. Coronary angiography (CAG) examination is the gold standard for the diagnosis of CHD. Due to its invasive nature and certain complications, CAG is generally used as a routine screening method before the treatment of coronary artery disease. Coronary computer tomography angiography (CCTA), as a noninvasive imaging mode, has been widely used in the evaluation and diagnosis of patients with suspected CHD [4, 5]. CCTA can not only reliably evaluate luminal stenosis and its functional significance but also accurately evaluate the morphology and composition of plaques and identify vulnerable plaques, which has extremely important significance for guiding the clinical management of patients with CHD. Pericoronary fat based on CCTA is closely related to coronary plaque and stenosis. The pericoronary fat attenuation index is a novel noninvasive imaging biomarker that reflects the spatial changes in pericoronary adipose tissue (PCAT) attenuation and pericoronary inflammation in CCTA images [6, 7].

Radiomics is a high-throughput extraction of a large number of image features from radiographic images and a deeper mining, prediction and analysis of common images [8, 9]. Radiomics was originally mainly applied in oncology research. In recent years, relevant studies have revealed that it has certain potential application value in the diagnosis of CHD, prediction of adverse cardiovascular events, monitoring of disease progression and evaluation of prognosis [10,11,12]. At present, there have been related studies on clinical data and medical image analysis of diabetes mellitus and coronary heart disease [13,14,15]. However, few studies have focused on the association between PCAT around coronary plaques and the possibility of CHD events in type 2 diabetes mellitus (T2DM) patients. Therefore, the purpose of this study was to investigate the ability of vascular and pericoronary fat radiomics results based on CCTA to predict the occurrence of CHD in diabetic patients.

Methods

In this section, we mainly describe the clinical dataset used in the experiment and all the details of our proposed method.

Dataset

Clinical data of patients were collected from the clinical data center and image center of the First Affiliated Hospital of Nanjing Medical University. The population of this study was first enrolled with hospitalized patients with diabetes who underwent cardiac computed tomography angiography (CCTA), and 647 hospitalized patients were retrieved from January 1st 2018 to May 31st 2022. Then, patients diagnosed with CHD before the onset of diabetes, patients who did not undergo CCTA prior to coronary stenting, patients with cancer or malignancy or type 1 diabetes, and patients with incomplete biochemical examination information, clinical history, and medication were excluded. A total of 258 + 221 patients were enrolled according to the exclusion criteria; 258 patients with both CHD and diabetes and 221 patients with diabetes but no CHD as controls were finally enrolled in this study. The flow diagram of this study is illustrated in Fig. 1. The study was approved by the Ethics Committee of the First Affiliated Hospital of Nanjing Medical University (2019-SR-153). As this study is a retrospective study and the contents of the study do not involve personal privacy, the Ethics Committee of the First Affiliated Hospital of Nanjing Medical University waived the requirement for written informed consent.

Fig. 1
figure 1

The patient enrollment flow diagram

B. Statistical methods

The statistical analysis was performed using R software version 4.1.2 (R Foundation for Statistical Computing, Vienna, Austria). Normality was checked by the Kolmogorov‒Smirnov test. Normally distributed variables are described as the mean±standard deviation, and nonnormally distributed variables are described as the median ± (25th-75th percentiles). The χ2 test was used to compare categorical variables. Student’s t test was used to compare continuous variables. Basic characteristics with p < 0.05 were considered statistically significant and included in the following modeling. Radiomics features with a p value < 0.001 were included for model establishment.

C. Image acquisition and segmentation

The CCTA images of all patients in this study were obtained from Siemens and GE computerized tomography instruments of the First Affiliated Hospital of Nanjing Medical University. The original data were maintained in DICOM data format with 512*512 dimensions, the patients’ personal information was desensitized, and the detailed configuration information of DICOM message headers was retained for the subsequent extraction and analysis of data feature values. Deep learning algorithms have made long-term progress in the segmentation of various tissues and organs of medical images [16,17,18,19]. However, the existing methods have a large proportion of tissue structures and clear edges, while the coronary vessels to be segmented in this study have a small proportion, long and thin vessels with fuzzy edges, and the labeling of coronary vessels needs to be marked layer by layer, which is laborious and difficult to obtain a lot of labeling information. Therefore, we adopted the method of coronary artery segmentation proposed our research group yaolei et al. [20]. We used the easily obtained central axis as weak supervision information, proposed an examinee-examiner network, and used customized prior conditions to guide and constrain segmentation to deal with weak supervision with fewer intracranial labels to achieve accurate segmentation of coronary artery lumens and obtain the ROI of the coronary vessel area, to achieve accurate segmentation results. Based on the training model of the training dataset, the dataset used in this study was transferred. The automatic extraction of the coronary artery was realized.

D. Feature extraction

PCAT was defined as all voxels with CT attenuation between − 190 and − 30 Hounsfield units (HUs) located within a radial distance from the outer coronary wall equal to the diameter of the vessel [12, 21]. In this study, the feature extraction of images was carried out through the open-source software package PyRadiomics [22,23,24]. PyRadiomics is implemented by the Python algorithm, which is a popular open-source language for scientific computing. Python version 3.6.4 is adopted in this study. PyRadiomics is a standardized and open platform with a large number of standardized feature extraction algorithms, which is often used for the extraction and processing of imaging features of medical image data. In this study, seven standardized characteristic classes, including first-order statistics, shape features, gray-level cooccurrence matrix (GLCM), gray-level run length matrix (GLRLM), gray-level size zone matrix (GLSZM), gray-level dependence matrix (GLDM), and neighboring gray tone difference matrix (NGTDM), were mainly calculated. A total of 1,037 radiomic parameters were calculated for each PCAT segmentation. Figure 1 shows the process of CCTA image acquisition, segmentation, radiomics feature extraction of the pericoronary adipose region, feature screening and model construction for the studied patients.

E. Model construction

Support vector machine (SVM)

SVM is a supervised machine learning algorithm based on statistical learning theory using the concept of structural risk minimization. It has been applied to many medical diagnoses and disease classifications. The computational time complexity of the SVM algorithm varies with different kernels, ranging from O (n2) to O (n3), where n represents the number of training sessions.

Decision tree (DT)

In the DT algorithm, instances (data points) are classified by sorting them based on feature values. In this study, the Gini index algorithm was used to identify the corresponding threshold to split the input data into subbranches. The computation time complexity of the DT algorithm is O(n*log(n)*f), where n represents the number of training capital and f represents the number of features.

Random forests (RF)

RT is an ensemble-type classification method that tends to perform better than traditional decision tree classification methods. The computational time complexity of the RF algorithm is O(n*log(n)fk), where n represents the number of training samples, f represents the number of features, and k represents the number of trees. We used 500 trees to predict two target classes, the occurrence or not occurrence of CHD in diabetic patients.

We developed SVM, DT and RF models to identify risk factors for CHD occurrence in type 2 diabetes patients. We used the package e1071 in the open-source software R 4.1.2 to perform SVM models, and the DT and RF were implanted using the packages rpart and randomForest, respectively. For the preprocessing of input data before modeling, since Stochastic resonance theory explains the use of noise constructively to enhance the input signal [25,26,27], we did not perform filtering operation. Instead, the input data is directly normalized and then modeled.

F. Performance evaluation

Parameters such as sensitivity, specificity, positive predictive value (PPV), negative predictive value (NPV), false positive rate (FPR), false negative rate (FNR), F score and accuracy (AAC) were calculated. The receiver operating characteristic (ROC) curve area was calculated to observe the relationship between PCAT around coronary plaque and internal radiomics information based on CCTA and the possibility of CHD events in diabetic patients.

Figure 2 shows the full flowchart of the current study. First, the collected CCTA images are resampled and segmented with method in section C. Then, the proposed regions are used as templates for feature extraction using radiomics with method in section D. For the diameter range around the extracted blood vessels and the CT attenuation value is the range area describing the tissue, the PCAT was extracted using the PyRadiomics package. After that, feature selection is performed using statistical methods. The models are constructed using SVM, DT, and RF methods, and the performance of the models is evaluated.

Fig. 2
figure 2

The flowchart of the current study

Results

Clinical characteristics

Our patients’ mean age was 66.85 ± 11.42 years, and 65.76% of them were male. The clinical characteristics and morphological parameters of the patients in the two cohorts are illustrated in Table 1. Patients with both diabetes and CHD (study group) and patients with diabetes but no CHD (comparison group) were analyzed by common factors such as sex, age, hypertension, blood pressure, blood biochemical function test, height and weight. We observed that patients in the study group were older than those in the comparison group (71.35±10.18 vs. 61.43±10.52, P < 0.001). Patients with both diabetes and CHD had a lower diastolic pressure (74.36±11.50, 77.90±11.03, p < 0.001) and lower HDLC (1.04±0.265, 1.12±0.398, p = 0.012). In contrast, gender, smoking status, alcohol consumption, family history of coronary disease, and systolic pressure were not significantly different between these two cohorts.

Table 1 The general characteristics of the study population

Radiomic feature selection

Of all 1,037 calculated radiomic parameters, 18.3% (190 of 1037) showed a significant difference (p < 0.001) between the 2 groups. Specifically, 28.28% (56 of 198) of first-order parameters, 78.57 (11 of 14) of shape parameters, 7.95% (21 of 264) of GLCM parameters, 18.18 (32 of 176) of GLRLM parameters, 21.02 (37 of 176) of GLSZM parameters, 17.53 (27 of 154) of GLDM parameters, and 10.91 (6 of 55) of NGTDM parameters were significant (Fig. 3).

Fig. 3
figure 3

Negative logarithm of p values for comparison between diabetes with CHD and diabetes without CHD are plotted on the y axis, each of 1037 radiomic parameters are showed on the x axis. The broken line indicates the p value of 0.001, and 190 parameters above the line were considered statistically significant

Identification of diabetes patients with CHD using ML

Three different kinds of machine learning models, including support vector machine (SVM), decision tree (DT), and random forests (RF), were used to investigate the association of coronary heart disease occurrence in diabetic patients. We built a dataset with significantly different clinical features, including age, hypertension, family history of diabetes, diastolic pressure, height, HDLC, LDLC, TC and TG, from Table 1 as C-data. Different from this dataset, we set CCTA-based imaging biomarkers: significant radiomic features in the coronary vessel section as V-data. Then, we set the PCAT radiomics phenotype with a significant difference from the V-data as P-data. We used three combinations of clinical information, radiomics information of the vascular region, and radiomics information of perivascular fat to observe the different prediction effects of different datasets.

Table 2 shows the recognition effects of different datasets using different models. We can see that the datasets combining clinical information, vascular internal radiomics information and pericoronary vascular fat radiomics information have the highest ACC values under different machine learning models. The methods using support vector machines all performed best in specificity, sensitivity and accuracy in this dataset. Therefore, it also verified our view that the peri-coronary adipose tissue omics data reflect the perivascular inflammatory characteristics, which is helpful to improve CHD patients’ discrimination from diabetic patients.

Furthermore, the study tries to analyze the elements that have a large influence on the results during the modeling process. The paper takes the analysis process of random forest as an example. The top important variables in 3 kinds of datasets are shown in.

Table 2 The classifying results from different datasets combinations using three models

Figure 4A and C, and the importance for the classification of CHD from diabetic patients was ranked. As seen from the ranking, once the radiomics dataset is integrated with the clinical dataset, the radiomics features are more relevant, especially in terms of shape features and first-order features, which also reflects the high impact of CCTA-based image features on disease diagnosis.

Fig. 4
figure 4

Ranking of variables in 3 kinds of data sets in random forest model. (1) clinical data alone; (2) clinical data and coronary vessel radiomics feature; (3) clinical data, coronary vessel radiomics feature and perivascular fat radiomics features

Discussion

In this study, the clinical data and pericoronary radiomics data of CCTA were fused to predict the occurrence of coronary heart disease in diabetic patients. This provides information for the early detection of coronary heart disease in patients with diabetes and allows for timely intervention and treatment. Coronary artery inflammation is an early manifestation of cardiovascular disease. Blood vessels in an inflammatory state release inflammatory factors that diffuse to pericoronary fat, resulting in local inhibition of fat formation and promotion of local fat decomposition through a series of processes. Antonopoulos et al. [6] suggested a “bidirectional” interaction between coronary artery wall inflammation and pericoronary adipose tissue (PCAT). The relationship between PCAT inflammation and the CT attenuation index was confirmed through biopsies. When there is inflammation in the blood vessels, the inflammatory lesions affect the CT value of PCAT on CCTA images through paracrine action. T2DM is a metabolic disorder in which abnormal insulin activity leads to disturbances in fat metabolism within the body. T2DM is strongly correlated with CHD, and one crucial mechanism is the chronic hyperglycemia-induced vascular inflammation that contributes to the development of atherosclerosis. Inflammation plays a pivotal role as a regulatory factor in metabolism and diabetes-related CHD. Research suggests that T2DM patients are twice as likely to develop CHD compared to nondiabetic individuals [28,29,30]. T2DM patients are prone to the formation of high-risk plaques when compared to those without T2DM, despite the presence of proactive treatment approaches and improved diagnostics. Consequently, the risk of cardiovascular complications remains significantly elevated in T2DM patients. Therefore, the early identification and prevention of CHD lesions hold significant importance for individuals with T2DM.

To the naked eye, the perivascular adipose tissue appears uniformly black, making it difficult to distinguish. However, through omics processing, colors can be assigned to the fat, revealing noticeable contrasts. Radiomics substantially enhances the quantity of quantitative data that can be extracted from CT images. It involves extracting thousands of imaging features that are imperceptible to the human eye from a specific region of interest. These features are then used to generate a vast dataset, which is subsequently analyzed to reveal imaging patterns linked to clinical traits or outcomes. Currently, PCAT has been demonstrated to be capable of predicting plaque progression in coronary arteries and distinguishing types of CHD [21, 31,32,33]. Radiomics can extract a multitude of image features from a given region of interest, far beyond parameters discernible by the human eye [18]. It has emerged as a novel imaging biomarker for atherosclerosis in coronary arteries. This study further analyzed the utilization of PCAT-based radiomic features as detection markers for coronary heart disease in diabetic patients. Through a comparison of three modeling approaches, the predictive efficacy of PCAT was validated. When combined with clinical and vascular information, its effectiveness was further substantiated. AI-assisted measurement of PCAT parameters is rapid, convenient, noninvasive, and easily scalable in clinical settings. This approach can provide early warnings for coronary heart disease in high-risk diabetic populations, ultimately improving their subsequent quality of life.

This study has the following limitations. First, it is a retrospective study with a single center and a small sample size. Second, although CAG is commonly used as the gold standard for diagnosing coronary heart disease (CHD) and as a common treatment method in clinical practice, a large number of diabetic patients do not undergo CAG before the occurrence of obvious CHD symptoms. In this study, the presence of CHD was determined based on comprehensive patient medical records, examinations, and diagnoses. In future studies, it is necessary to address the limitations of a single center and small sample size by expanding the study population. Additionally, enriching clinical information and ensuring accurate patient selection should be further emphasized.

Conclusion

In this study, the clinical data and pericoronary radiomics data of CCTA were fused to predict the occurrence of coronary heart disease in patients with diabetes mellitus. The high-dimensional image features calculated by radiomics were used to extract the parameters that could not be identified by the naked eye. Three machine learning models were selected for classification and comparison. The well-performed classification results confirmed that the radiomics of pericoronary adipose tissue can reflect the perivascular inflammatory features and help to identify CAD patients among diabetic patients.