Introduction

Coronary artery bypass grafting (CABG) is an effective and remarkably successful method of myocardial revascularization and is commonly employed for patients with high-grade or complex coronary artery stenosis. However, the high incidence of postoperative complications may significantly affect the overall quality of surgical care as well as patients’ prognosis and mortality [1]. The reported incidence of acute kidney injury (AKI) after CABG surgery ranges from approximately 30% to 50% [2, 3], with mild to moderate AKI occurring most frequently. In 2–4% of cases, AKI is severe enough to require continuous renal replacement therapy (CRRT) in the intensive care unit (ICU) after CABG surgery [4, 5]. Despite advances in the quality of intensive care and in renal replacement therapy techniques, the short-term mortality of patients receiving CRRT remains high, ranging from 40% to over 70% [6,7,8]. Therefore, identifying risk factors for postoperative CRRT in CABG patients is critical for reducing the risk of death.

The application of machine learning models based on artificial intelligence (AI) algorithms has gradually gained momentum in clinical practice owing to their demonstrated superior predictive performance compared with traditional analytical models [9]. This study was conducted to develop machine learning models for predicting the need for CRRT and identifying its risk factors after CABG surgery in ICU patients, with the aim of safeguarding postoperative renal function and improving clinical outcomes.

Methods

Data source and patient population

A retrospective review was conducted on 190 adult patients with coronary heart disease who underwent isolated CABG surgery with cardiopulmonary bypass support at the Department of Cardiovascular Surgery of The First Affiliated Hospital of Nanjing Medical University from January 2013 to June 2020. The exclusion criteria were as follows: (1) age < 18 years; (2) receipt of CRRT in the ICU prior to CABG surgery; and (3) more than 30% of information missing. The endpoint of this study was the requirement for CRRT after CABG surgery. Patients’ data were randomly partitioned into a training set (90%) for model development and a validation set (10%) for model validation. This study complies with the Declaration of Helsinki (revised in 2013) and was approved and supervised by the Ethics Review Committee of The First Affiliated Hospital of Nanjing Medical University (2019-SR-313.A1). The requirement for informed consent was waived by the Ethics Review Committee of the hospital.
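The random 90%/10% partition described above can be illustrated with a minimal sketch. It assumes Python's standard library only; the function name `split_patients`, the seed value, and the use of integer patient identifiers are hypothetical choices for illustration, since the software and seed used for the actual split are not reported.

```python
import random


def split_patients(patient_ids, validation_fraction=0.10, seed=2020):
    """Randomly partition patient IDs into training and validation lists."""
    ids = list(patient_ids)
    rng = random.Random(seed)  # fixed seed (assumed) so the split is reproducible
    rng.shuffle(ids)
    n_validation = round(len(ids) * validation_fraction)
    # The first n_validation shuffled IDs form the validation set.
    return ids[n_validation:], ids[:n_validation]


# 190 patients, identified here simply as 0..189 (hypothetical identifiers).
train_ids, validation_ids = split_patients(range(190))
print(len(train_ids), len(validation_ids))  # 171 19
```

With 190 patients, such a split leaves 171 patients for model development and 19 for validation, which is worth keeping in mind when interpreting the width of the confidence intervals in the validation set.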

Criteria of CRRT

The criteria for initiation of CRRT were as follows: (1) more than 6 h of continuous anuria; (2) urine volume less than 200 mL for over 10 h; (3) serum potassium concentration > 6.5 mmol/L (hyperkalemia); (4) severe metabolic acidosis (pH < 7.20 despite normal or low partial pressure of carbon dioxide in arterial blood); (5) serum creatinine (sCr) ≥ 300 µmol/L; (6) volume overload (especially pulmonary edema unresponsive to diuretics); (7) clinical complications of uremia (e.g., encephalopathy, pericarditis, and neuropathy).

Study variables and data collection

Baseline characteristics and co-morbidities of patients, such as age, gender, body mass index (BMI), smoking history, drinking history, acute myocardial infarction (AMI) history, hypertension, diabetes, chronic renal disease, atrioventricular block, atrial fibrillation (AF), and New York Heart Association (NYHA) classification, were documented. Electrocardiogram and echocardiogram records were collected. Laboratory parameters, such as serum lipids, blood urea nitrogen (BUN), serum creatinine (sCr), myocardial injury biomarkers, albumin (ALB), and blood glucose levels, were measured upon admission. The timing of intra-aortic balloon pump (IABP) implantation, surgical duration, cardiopulmonary bypass time, length of hospital stay (LOS), length of ICU stay, statin use, and in-hospital mortality were recorded. One standard preprocessing step was applied to the collected variables: missing values were handled by filling (imputation).
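As an illustration of the missing-data handling, the following sketch combines the 30% exclusion rule with a simple fill step. The filling method is not specified in this study, so median filling is an assumption made here for illustration; the function names and the toy records are hypothetical and are not study data.

```python
from statistics import median


def drop_sparse_records(records, max_missing_fraction=0.30):
    """Keep records whose share of missing (None) values does not exceed the limit."""
    kept = []
    for record in records:
        missing = sum(value is None for value in record.values())
        if missing / len(record) <= max_missing_fraction:
            kept.append(record)
    return kept


def fill_missing_with_median(records, numeric_fields):
    """Replace None in each numeric field with the median of the observed values."""
    filled = [dict(record) for record in records]  # copy, leave the input untouched
    for field in numeric_fields:
        observed = [r[field] for r in filled if r[field] is not None]
        fill_value = median(observed)  # assumed fill rule, not the authors' stated one
        for r in filled:
            if r[field] is None:
                r[field] = fill_value
    return filled


# Hypothetical toy records (not study data).
records = [
    {"age": 66, "sCr": 80.0, "ALB": 38.0, "BUN": 6.1},
    {"age": 72, "sCr": None, "ALB": 35.0, "BUN": 7.4},   # 25% missing: kept
    {"age": 60, "sCr": 120.0, "ALB": 41.0, "BUN": 5.2},
    {"age": None, "sCr": None, "ALB": 36.0, "BUN": 8.0},  # 50% missing: excluded
]
kept = drop_sparse_records(records)
clean = fill_missing_with_median(kept, ["age", "sCr", "ALB", "BUN"])
print(len(kept), clean[1]["sCr"])  # 3 100.0
```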

Statistical analysis and development of machine learning models

Statistical analyses were conducted using SPSS software (version 23.0). Continuous variables are presented as mean ± standard deviation or as median (interquartile range), while categorical variables are presented as number (percentage). Student’s t-test or the Mann–Whitney test was used to compare continuous variables between the two groups, and the Chi-square test was used to compare categorical variables between the two groups. All p values were two-sided, and p < 0.05 was considered statistically significant.

The Boruta method was used to select critical features, and machine learning algorithms were then employed to construct models on the training set using 10-fold cross-validation (CV), which helped to limit overfitting and to determine the optimal hyperparameters. This study included seven machine learning models: AdaBoost, LightGBM, Gaussian Naïve Bayes (GNB), Complement Naïve Bayes (CNB), multi-layer perceptron neural network (MLP), k-nearest neighbors (KNN), and support vector machine (SVM). Performance of the models was evaluated based on relevant indicators, including the area under the receiver operating characteristic curve (AUC), accuracy (ACC), sensitivity, and specificity. The model with the highest AUC was considered to have the best predictive capacity and was selected as the final prediction model. A calibration plot was generated to evaluate the agreement between predicted and actual clinical outcomes.
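The fold construction and the evaluation indicators named above can be sketched in a few lines of plain Python. This is an illustrative sketch, not the software used in this study: the function names, the seed, the 0.5 probability threshold, and the toy labels and scores are all assumptions. The AUC is computed as the proportion of positive/negative pairs in which the positive case receives the higher score, with ties counted as one half.

```python
import random


def k_fold_indices(n_samples, k=10, seed=0):
    """Shuffle sample indices and deal them into k disjoint folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def roc_auc(y_true, y_score):
    """AUC as the share of (positive, negative) pairs ranked correctly."""
    positives = [s for t, s in zip(y_true, y_score) if t == 1]
    negatives = [s for t, s in zip(y_true, y_score) if t == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in positives
        for n in negatives
    )
    return wins / (len(positives) * len(negatives))


def threshold_metrics(y_true, y_score, threshold=0.5):
    """Accuracy, sensitivity, and specificity at an assumed probability cut-off."""
    y_pred = [1 if s >= threshold else 0 for s in y_score]
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    tn = sum(t == 0 and p == 0 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    return {
        "accuracy": (tp + tn) / len(y_true),
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
    }


# Ten folds over a training set of 171 patients (90% of 190).
folds = k_fold_indices(171)

# Hypothetical labels (1 = CRRT) and predicted probabilities, not study data.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_score = [0.9, 0.8, 0.4, 0.3, 0.6, 0.2, 0.1, 0.7]
auc = roc_auc(y_true, y_score)
metrics = threshold_metrics(y_true, y_score)
print(auc)      # 0.9375
print(metrics)  # {'accuracy': 0.75, 'sensitivity': 0.75, 'specificity': 0.75}
```

In each CV round, one fold would serve as the held-out part and the remaining nine as the fitting part, so that every training patient is held out exactly once.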

Furthermore, the SHapley Additive exPlanations (SHAP) method was applied to enhance the interpretability of the final model. The SHAP summary plot was used to illustrate the influence of model features. Then, the SHAP dependence plot was used to analyze the importance of individual features affecting model output. The SHAP force plot was utilized to visually represent the impact of key features on the final model in individual patients. Figure 1 illustrates the flowchart for the study.
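To make the idea behind SHAP concrete, the sketch below computes exact Shapley values for a small model by enumerating all feature coalitions. It is a simplified illustration rather than the SHAP implementation used in this study: features absent from a coalition are replaced by a single baseline value, whereas SHAP typically averages over background data. The logistic risk function, its coefficients, the baseline, and the ALB value are hypothetical.

```python
from itertools import combinations
from math import exp, factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley values of predict(x) relative to predict(baseline)."""
    n = len(x)

    def value(subset):
        # Features in the coalition take the patient's value, the rest the baseline.
        mixed = [x[i] if i in subset else baseline[i] for i in range(n)]
        return predict(mixed)

    phi = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        contribution = 0.0
        for size in range(n):
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                members = set(subset)
                contribution += weight * (value(members | {i}) - value(members))
        phi.append(contribution)
    return phi


def toy_risk(features):
    """Hypothetical logistic risk model; coefficients are illustrative only."""
    age, scr, alb = features
    z = 0.05 * (age - 65) + 0.02 * (scr - 90) - 0.10 * (alb - 38)
    return 1.0 / (1.0 + exp(-z))


patient = [79, 125.6, 33.0]   # age, sCr (umol/L), ALB (g/L); ALB value is hypothetical
baseline = [66, 85.0, 38.0]   # hypothetical reference patient
phi = shapley_values(toy_risk, patient, baseline)
for name, contribution in zip(["age", "sCr", "ALB"], phi):
    print(f"{name}: {contribution:+.3f}")
```

The contributions sum to the difference between the patient's predicted risk and the baseline risk, which is the additivity property that the force plot visualizes for an individual patient.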

Fig. 1

Flowchart of this study. AUC, area under the receiver operating characteristic curve; SHAP, SHapley Additive exPlanations

Results

Baseline characteristics of patients

The baseline characteristics of the patients are shown in Table 1. The study population included 153 (80.526%) males, and the median age was 66.0 (60.0, 72.0) years. A total of 72 patients were included in the CRRT group, which had a higher median age (69.0 vs. 64.0 years, p = 0.002), higher mortality (37.500% vs. 7.627%, p < 0.001), and a longer ICU stay (14.0 vs. 11.0 d, p < 0.001) than the non-CRRT group. Gender, BMI, hypertension, chronic renal disease, AF, NYHA classification, ALB, blood glucose, cardiac troponin T (cTnT), sCr, BUN, creatine kinase isoenzyme (CK-MB), and hospital costs also differed significantly between the two groups (p ≤ 0.05). By contrast, there were no significant differences between the two groups in surgical duration (238 min in the non-CRRT group vs. 230 min in the CRRT group, p = 0.737) or cardiopulmonary bypass time (89 min in the non-CRRT group vs. 94 min in the CRRT group, p = 0.881).
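The mortality comparison can be checked with a short worked example of the Chi-square test. The death counts below are reconstructed from the reported percentages (37.500% of 72 and 7.627% of 118, where 118 = 190 − 72), which is an assumption; the sketch uses the Pearson statistic without continuity correction, and whether a correction was applied in the original SPSS analysis is not reported.

```python
import math


def pearson_chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    statistic = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # With 1 degree of freedom, the upper tail equals erfc(sqrt(statistic / 2)).
    p_value = math.erfc(math.sqrt(statistic / 2))
    return statistic, p_value


# Counts reconstructed from the reported percentages (an assumption):
# 27 of 72 CRRT patients died (37.500%); 9 of 118 non-CRRT patients died (7.627%).
crrt_deaths, crrt_survivors = 27, 72 - 27
non_crrt_deaths, non_crrt_survivors = 9, 118 - 9

statistic, p_value = pearson_chi_square_2x2(
    crrt_deaths, crrt_survivors, non_crrt_deaths, non_crrt_survivors
)
print(f"chi-square = {statistic:.2f}, p = {p_value:.2e}")
```

Under these assumptions the statistic is about 26 with p well below 0.001, in line with the reported significance of the mortality difference.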

Table 1 Baseline characteristics of patients in the CRRT group and the non-CRRT group

Developed machine learning models and their prediction performance

The Boruta method was employed to identify the key variables associated with CRRT in CABG patients. Ultimately, 17 out of 39 clinical parameters remained significantly associated with CRRT, and these results are presented in Fig. 2. When assessing machine learning models for predicting CRRT, the GNB model showed the highest AUC values in both the training set (0.856, 95% CI: 0.805–0.954, Fig. 3a) and validation set (0.817, 95% CI: 0.630–0.958, Fig. 3b). Furthermore, the GNB model exhibited the highest ACC and specificity in the two data sets (Fig. 4a, b). The calibration plot was generated to evaluate the difference between predicted outcomes and actual clinical outcomes. When predicting CRRT risk, the GNB model displayed excellent calibration performance (Fig. 5). Therefore, the GNB model was recognized as the final predictive model.
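For readers less familiar with the GNB model, the following minimal sketch shows how a Gaussian Naïve Bayes classifier turns class-wise means and variances into a predicted probability. It is written from scratch for illustration and is not the implementation used in this study; the class name, the small variance floor, and the toy data (age and sCr only) are assumptions.

```python
import math


class GaussianNaiveBayes:
    """Minimal Gaussian Naive Bayes: one normal distribution per feature and class."""

    def fit(self, X, y):
        self.classes_ = sorted(set(y))
        self.priors_, self.means_, self.variances_ = {}, {}, {}
        n_features = len(X[0])
        for c in self.classes_:
            rows = [x for x, label in zip(X, y) if label == c]
            self.priors_[c] = len(rows) / len(X)
            means = [sum(r[j] for r in rows) / len(rows) for j in range(n_features)]
            self.means_[c] = means
            # Small floor (assumed) keeps the variance strictly positive.
            self.variances_[c] = [
                sum((r[j] - means[j]) ** 2 for r in rows) / len(rows) + 1e-9
                for j in range(n_features)
            ]
        return self

    def _log_joint(self, x, c):
        total = math.log(self.priors_[c])
        for value, mean, var in zip(x, self.means_[c], self.variances_[c]):
            total += -0.5 * math.log(2 * math.pi * var) - (value - mean) ** 2 / (2 * var)
        return total

    def predict_proba(self, x):
        log_joint = {c: self._log_joint(x, c) for c in self.classes_}
        peak = max(log_joint.values())  # subtract the maximum for numerical stability
        weights = {c: math.exp(v - peak) for c, v in log_joint.items()}
        total = sum(weights.values())
        return {c: w / total for c, w in weights.items()}


# Hypothetical toy data: [age in years, sCr in umol/L]; label 1 = CRRT.
X = [[60, 70], [62, 75], [64, 80], [58, 72], [75, 120], [78, 130], [80, 125], [72, 110]]
y = [0, 0, 0, 0, 1, 1, 1, 1]
model = GaussianNaiveBayes().fit(X, y)
probabilities = model.predict_proba([79, 125])
print(f"Predicted CRRT probability: {probabilities[1]:.3f}")
```

Because the model assumes conditional independence between features, it has few parameters to estimate, which may be one reason it behaves well on a cohort of this size.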

Fig. 2

Feature selection based on the Boruta algorithm. The horizontal axis is the name of variable, and the vertical axis is the Z-value of each variable. The box plot exhibits the Z-value of each variable during model calculation. The green boxes represent the first 15 important variables, the yellow represents tentative attributes, and the red represents unimportant variables

Fig. 3

Comparison of receiver operating characteristic (ROC) curves for the machine learning models. a ROC curves of the training models. b ROC curves of the validation models. AUC, area under the ROC curve; GNB, Gaussian Naïve Bayes; CNB, Complement Naïve Bayes; MLP, multi-layer perceptron neural network; SVM, support vector machine; KNN, k-nearest neighbors

Fig. 4

Comparisons of parameters assessing machine learning-based model performance. a Parameters of training models. b Parameters of validation models. ACC, accuracy; AUC, area under the receiver operating characteristic curve (ROC); CNB, Complement Naïve Bayes; GNB, Gaussian Naïve Bayes; MLP, multi-layer perceptron neural network; SVM, support vector machine; KNN, k-nearest neighbors

Fig. 5

Calibration plots of the models. GNB, Gaussian Naïve Bayes; CNB, Complement Naïve Bayes; MLP, multi-layer perceptron neural network; SVM, support vector machine; KNN, k-nearest neighbors
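The quantities behind a calibration plot can be sketched as follows: predictions are grouped into probability bins, and the mean predicted probability in each bin is compared with the observed event rate. The number of bins, the function name, and the toy labels and probabilities are assumptions for illustration and are not taken from this study.

```python
def calibration_table(y_true, y_prob, n_bins=5):
    """Return (mean predicted probability, observed event rate, count) per non-empty bin."""
    bins = [[] for _ in range(n_bins)]
    for outcome, probability in zip(y_true, y_prob):
        # Equal-width bins on [0, 1]; a probability of exactly 1.0 goes in the last bin.
        index = min(int(probability * n_bins), n_bins - 1)
        bins[index].append((outcome, probability))
    rows = []
    for members in bins:
        if members:
            mean_predicted = sum(p for _, p in members) / len(members)
            observed_rate = sum(t for t, _ in members) / len(members)
            rows.append((mean_predicted, observed_rate, len(members)))
    return rows


# Hypothetical outcomes (1 = CRRT) and predicted probabilities, not study data.
y_true = [0, 0, 0, 1, 0, 1, 1, 1, 1, 1]
y_prob = [0.05, 0.15, 0.25, 0.35, 0.45, 0.55, 0.65, 0.75, 0.85, 0.95]
rows = calibration_table(y_true, y_prob)
for mean_predicted, observed_rate, count in rows:
    print(f"predicted {mean_predicted:.2f}  observed {observed_rate:.2f}  n = {count}")
```

Plotting the observed rate against the mean predicted probability gives the calibration curve; points close to the diagonal indicate good agreement between predicted and actual outcomes.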

Model explanation and application

We next calculated feature importance using SHAP values for the GNB model, which had the greatest discriminatory capacity in the validation cohort. Figure 6a ranks the 17 clinical features by mean absolute SHAP value. Figure 6b provides an overview of the direction (positive or negative) of each feature’s impact on the GNB model. To further explore the contribution of the features in an individual patient and the clinical application of the GNB model, we randomly selected one patient from the validation cohort for visual interpretation (Fig. 7). The developed model predicted the probability of CRRT in this patient to be 86.2%. The top five contributors to this prediction were a fasting blood glucose of 20.76 mmol/L, an sCr of 125.6 μmol/L, atrial fibrillation, an age of 79 years, and NYHA class III.

Fig. 6

The SHAP summary plot for the clinical features contributing to the GNB model. a SHAP feature importance measured as the mean absolute Shapley values. This matrix plot depicts the importance of each covariate in the development of the final predictive model. b The attributes of the features in the model. The position on the y-axis is determined by the feature and on the x-axis by the Shapley value. SHAP, SHapley Additive exPlanations; GNB, Gaussian Naïve Bayes

Fig. 7

The SHAP force plot for explaining individual prediction results in the validation cohort. SHAP, SHapley Additive exPlanations

Discussion

Critically ill patients undergoing cardiac surgery frequently present with a complex clinical scenario, posing challenges for clinicians in predicting outcomes. Machine learning algorithms have proven to be valuable tools in overcoming this difficulty and are often employed to develop models for clinical studies [10]. Machine learning has emerged as a powerful tool for distinguishing and predicting prognoses in patients undergoing CABG surgery. However, previous studies primarily focused on identifying risk factors associated with postoperative complications, mortality, and prolonged LOS [11,12,13]. Given the significant impact of postoperative CRRT on hospital mortality, it is imperative to develop a machine learning-based risk prediction model specifically for CRRT after cardiac surgery. Consequently, constructing an effective predictive model using preoperative clinical data is important for giving clinicians valuable guidance on appropriate interventions and the rational allocation of medical resources.

This study involved the development and validation of seven machine learning models based on 17 clinical variables collected in the first 24 h of hospital admission. Among these models, the GNB model demonstrated superior predictive ability for CRRT, as shown by its highest AUC and better calibration in this study. To improve the interpretability of the GNB model, SHAP was applied. Feature importance analysis revealed that cTnT, CK-MB, ALB, low-density lipoprotein cholesterol (LDL-C), NYHA classification, sCr, and age were the top seven features of the GNB model, with a significant impact on predicting postoperative CRRT. In addition, the SHAP force analysis enables clinicians to understand why the model assigns a high risk to a specific patient. Collectively, these findings improve users’ understanding of the decision-making process underlying the predictive model.

In this study, we identified crucial features of postoperative CRRT. The biomarkers cTnT and CK-MB, which indicate myocardial injury, could to a certain extent reflect cardiac deterioration. Preoperative elevations in cardiac biomarkers may increase the risk of postoperative complications, such as AKI [14, 15]. This could be attributed to the close relationship between cardiac and renal function, referred to as cardio-renal syndrome [15]. In our study, cTnT and CK-MB carried the highest weight in the GNB predictive model, which suggests that they were the most critical predictors of postoperative CRRT following CABG surgery. NYHA classification is another heart-related indicator for assessing impaired cardiac function. Several studies have explored the role of NYHA classification in predicting postoperative AKI or the need for renal replacement therapy, with NYHA III/IV being an independent risk factor [16,17,18]. Consistently, our study showed that patients requiring CRRT had a significantly higher proportion of NYHA III/IV than those without CRRT (69.445% vs. 39.830%). NYHA classification was thus identified as an important risk factor for CRRT.

The crucial role of serum ALB in maintaining intravascular volume, partially by preserving vascular integrity [19], is widely acknowledged. Consequently, reduced levels of serum ALB may result in tissue edema and decrease the circulating volume through extravasation. A previous study demonstrated an association between low preoperative ALB and short-term or long-term prognosis in patients undergoing cardiac surgery [20]. This suggests that reduced ALB levels can serve as a prognostic indicator following cardiac surgery. In our study, ALB levels were significantly lower in the CRRT group than in the non-CRRT group, although the absolute difference was small. It is worth considering that, at below-normal levels, even a mild reduction in ALB may affect the need for postoperative CRRT. Several studies have explored the relationship between preoperative renal function and postoperative CRRT, with increased sCr levels identified as a risk factor for CRRT [21,22,23]. Our study corroborates this finding, suggesting that patients with impaired renal function tolerate surgery less well, which increases their need for CRRT. Age is a non-modifiable risk factor, with older patients being less tolerant of trauma, stress, and cardiac surgery. Some clinical studies have identified age as a risk factor for CRRT after surgery [21, 22]. Consistent with these findings, our study confirmed age as a predictor of postoperative CRRT after CABG surgery. Furthermore, our study also found a strong association between LDL-C and postoperative CRRT, although LDL-C levels differed little between the two groups. Given that all patients had coronary artery disease, some of them may have been treated with lipid-lowering medications, such as statins. Indeed, our results showed no difference in the rate of statin use between the two groups (69.444% vs. 77.966%). Accordingly, we hypothesize that serum LDL-C levels may be affected by different statin doses and individual responsiveness. Despite the small difference in LDL-C levels, this indicator deserves close attention in CABG patients.

Some previous studies have incorporated preoperative, intraoperative, and postoperative indicators of patients undergoing CABG to develop predictive models for clinical outcomes. It is important to acknowledge that including all variables in the model can enhance risk identification. However, modeling with patients’ preoperative parameters (within 24 h after admission) offers advantages for the early identification of risk variables and for clinical guidance. Although the machine learning models demonstrated favorable predictive performance in this study, its limitations should be acknowledged. First, the retrospective nature of the study could have introduced some bias into the results. Second, external data were not used for model validation, potentially limiting the generalizability of the models. Additionally, the use of angiotensin receptor blockers (ARBs) is reported to be an important risk factor for the development of AKI. However, data on ARBs were lacking in our study, which may have affected model development and risk identification to some extent. We aim to address these issues in future studies.

Conclusion

Machine learning algorithms were used to develop a predictive model for CRRT after CABG surgery in ICU patients. The GNB model exhibited excellent predictive performance and identified risk variables associated with CRRT. This study provides theoretical guidance for surgeons and supports the optimization of perioperative management for patients.