1 Introduction

The critical care landscape is continuously evolving, with intensive care unit (ICU) scores being pivotal in stratifying mortality risks among heterogeneous patient populations. Although tools like the Charlson Comorbidity Index (CCI) [1], Simplified Acute Physiology Score II (SAPS II) [2], and Sequential Organ Failure Assessment (SOFA) [3] are routinely used, they are not without limitations. These scores, while informative, often fail to capture the multifaceted nature of an ICU patient’s condition due to their focus on isolated variables.

From a clinical point of view, knowing which group of patients carries a higher mortality risk may be more useful than estimating an individual patient’s risk. In this context, machine learning (ML), using unsupervised and supervised algorithms, is promising for stratifying patients into homogeneous groups and for accurately assessing mortality based on combinations of scores [4].

Cluster analysis (CA), a family of multivariate algorithms, is an unsupervised ML approach that divides data into groups or clusters based on similar characteristics without a priori hypotheses [5]. CA quantifies the similarity between individuals within a population based on multiple specified variables. In the ICU, CA can be applied to identify subgroups of patients with similar demographic, clinical, and prognostic characteristics, providing a detailed overview and a more in-depth assessment of the cluster-related mortality risk [6, 7]. However, the choice of variables is the main source of variability and of poor repeatability of the results, which limits generalizability.

A second important aspect for the clinician is to know whether a patient, upon entry into the ICU, belongs to a specific cluster or not. The classification and regression tree (CART) algorithm, a supervised ML algorithm, is a predictive modeling technique that creates a binary decision model predicting the value of a target variable by learning simple decision rules inferred from the data features [8].
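To make the idea of "simple decision rules" concrete, the following minimal Python sketch shows how a CART-style tree could choose one binary split on a single numeric feature by minimizing the weighted Gini impurity. The function names and the toy data are illustrative assumptions and do not reproduce the study's implementation.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels (0 for a pure node)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def best_split(values, labels):
    """Return (threshold, weighted_gini) of the best rule 'value <= threshold'.

    Candidate thresholds are midpoints between consecutive distinct values.
    If no split improves on the parent node, the threshold is None.
    """
    pairs = sorted(zip(values, labels))
    distinct = sorted({v for v, _ in pairs})
    best = (None, gini(labels))  # baseline: impurity of the unsplit node
    n = len(pairs)
    for low, high in zip(distinct, distinct[1:]):
        threshold = (low + high) / 2
        left = [lab for v, lab in pairs if v <= threshold]
        right = [lab for v, lab in pairs if v > threshold]
        weighted = (len(left) * gini(left) + len(right) * gini(right)) / n
        if weighted < best[1]:
            best = (threshold, weighted)
    return best


# Hypothetical toy data: a severity score and a made-up cluster label.
scores = [2, 4, 5, 7, 9, 11, 12, 14]
clusters = ["alpha"] * 4 + ["beta"] * 4
threshold, impurity = best_split(scores, clusters)
```

A full tree repeats this search recursively on each child node and over all candidate features; pruning then removes splits that add little predictive value.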

In this monocentric retrospective study, we used ML to perform CA based on two alternative combinations of scores: CCI plus SAPS II and CCI plus SOFA. We compared mortality prediction performance for clustered and non-clustered models. Then, we sought to develop a simple decision algorithm to predict the patient’s membership in a cluster using CART.

2 Materials and Methods

2.1 Patients’ Enrollment

This retrospective study involved all patients admitted to a regional hub hospital and requiring ICU admission from 1st January 2011 to 16th September 2022. During the COVID-19 pandemic, we considered only patients with negative SARS-CoV-2 nasopharyngeal swabs.

Our ICU is part of the ProSafe project network, supported by the Italian Group for the Evaluation of Interventions in Intensive Care Medicine (GiViTI). The project’s purposes are the systematic collection of data and the continuous evaluation and performance comparison of departments.

After collecting written informed consent from conscious patients or from relatives of unconscious patients, demographic data, clinical characteristics on admission and in the first 24 h, the main complications that arose, therapies and procedures performed during the ICU stay, and the outcome at discharge (including from the hospital) were registered on the national clinical information ProSafe system.

Given the retrospective design of this study, local ethical committee approval was not required. Patient privacy has been rigorously protected in accordance with current national legislation (General Data Protection Regulation of the European Union n° 2016/679 and the Italian Legislative Decrees n° 196/2003 and 101/2018). The study was conducted following the International Conference on Harmonisation Good Clinical Practice guidelines and the 2008 Declaration of Helsinki provisions.

The present study was conducted according to the Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD) statement [9].

2.2 Data Collection

According to the ProSafe project network, the following data were collected on ICU admission and in the first 24 h:

  • Demographic: age, gender, body mass index (BMI), comorbidities, patient provenance (same or another hospital), ward before ICU admission (emergency department, surgical or medical ward, other ICU), the reason for ICU admission, and time from in-hospital presentation to ICU admission;

  • Function/organ impairments at ICU admission, defined as respiratory, brain, renal, cardiovascular, coagulation, liver, and metabolic failure;

  • Sepsis or septic shock on admission;

  • Trauma;

  • Surgical and non-surgical procedures before ICU admission;

  • Development of in-ICU infection;

  • Percutaneous tracheostomy and need for vasopressor/inotropes administration in ICU;

  • ICU length of stay (LOS) and outcome (death or alive).

Patients’ clinical conditions were stratified according to the CCI, the SAPS II, the SOFA, and the Glasgow Coma Scale (GCS) [10]. Supplementary File 1 provides details about data collection.

2.3 Records Selection

Only local records were retrieved and anonymously stored in an electronic spreadsheet (Microsoft Excel).

In the identification phase, we applied the following exclusion criteria:

  • Age < 18 years;

  • Patients with previous LOS > 15 days before ICU admission;

  • Transfer from other hospitals, ICU, sub-intensive and long-term care units;

  • Admission to ICU for palliative care and post-operative monitoring;

  • BMI > 60 kg/m²;

  • Suffering from AIDS.

Then, in the screening phase, we evaluated missing data. We assumed that data were missing completely at random, and observations with missing values were deleted.
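The two-phase selection described above (exclusion criteria, then deletion of incomplete observations) can be sketched as follows. This is only an illustration: the field names, the list of fields required to be complete, and the toy records are assumptions, since the layout of the study dataset is not reported.

```python
# Hypothetical field names; the actual ProSafe export layout is not given.
REQUIRED_FIELDS = ("age", "bmi", "cci", "saps2", "sofa")


def exceeds(value, limit):
    """True only when the value is known and strictly above the limit."""
    return value is not None and value > limit


def is_excluded(record):
    """Identification phase: apply the exclusion criteria to one record."""
    age = record.get("age")
    return (
        (age is not None and age < 18)
        or exceeds(record.get("prior_los_days"), 15)
        or exceeds(record.get("bmi"), 60)
        or bool(record.get("transferred"))
        or bool(record.get("palliative_or_postop"))
        or bool(record.get("aids"))
    )


def is_complete(record):
    """Screening phase: a record is kept only if no required field is missing."""
    return all(record.get(field) is not None for field in REQUIRED_FIELDS)


def select_records(records):
    """Exclusion first, then complete-case (listwise) deletion."""
    screened = [r for r in records if not is_excluded(r)]
    return [r for r in screened if is_complete(r)]


# Made-up records for illustration only.
toy_records = [
    {"id": 1, "age": 70, "bmi": 25.1, "cci": 4, "saps2": 67, "sofa": 10},
    {"id": 2, "age": 16, "bmi": 21.0, "cci": 0, "saps2": 30, "sofa": 4},
    {"id": 3, "age": 55, "bmi": None, "cci": 2, "saps2": 45, "sofa": 7},
    {"id": 4, "age": 62, "bmi": 28.0, "cci": 5, "saps2": 80, "sofa": 12,
     "transferred": True},
]
selected = select_records(toy_records)
```

Complete-case deletion of this kind is unbiased only if the missing-completely-at-random assumption holds.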

2.4 Statistics

Supplementary File 1 reports details about statistics. A preliminary explorative examination of the records fulfilling our criteria was performed. Descriptive statistics were computed. Categorical data were reported as absolute numbers and percentages. Continuous data were reported as mean ± standard deviation or median and first-third quartiles [q1–q3], according to distribution (tested with the Shapiro–Wilk test).

For clustering, we used two alternative combinations of scores: CCI plus SAPS II and CCI plus SOFA. We chose these variables for two reasons: first, they are scores widely used in ICUs; second, as scoring systems, they allow the patient’s clinical condition to be defined objectively. After normalization of the continuous variables [11], the optimal number of clusters (k) was estimated with the package NbClust (version 3.0.1) [12]. For partitioning, we used the Partitioning Around Medoids (PAM) algorithm [13].
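As an illustration of the clustering step, the sketch below rescales two scores to the [0, 1] range and partitions the observations around k medoids. It is a simplified swap search written in Python under stated assumptions: min-max scaling is assumed as the normalization, the search starts from the first k points rather than from PAM's BUILD phase, k is fixed by hand instead of being estimated with NbClust, and the data are invented.

```python
import itertools
import math


def min_max(values):
    """Rescale a list of numbers to the [0, 1] range."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def total_cost(points, medoids):
    """Sum of Euclidean distances from every point to its nearest medoid."""
    return sum(min(math.dist(p, points[m]) for m in medoids) for p in points)


def k_medoids(points, k):
    """Start from the first k points, then repeatedly apply the single
    medoid/non-medoid swap that lowers the total cost the most."""
    medoids = list(range(k))
    cost = total_cost(points, medoids)
    improved = True
    while improved:
        improved = False
        best_medoids, best_cost = medoids, cost
        for i, candidate in itertools.product(range(k), range(len(points))):
            if candidate in medoids:
                continue
            trial = medoids[:i] + [candidate] + medoids[i + 1:]
            trial_cost = total_cost(points, trial)
            if trial_cost < best_cost:
                best_medoids, best_cost = trial, trial_cost
        if best_cost < cost:
            medoids, cost, improved = best_medoids, best_cost, True
    labels = [
        min(range(k), key=lambda j: math.dist(p, points[medoids[j]]))
        for p in points
    ]
    return medoids, labels


# Invented CCI and SOFA values for eight hypothetical patients.
cci = [1, 2, 2, 3, 8, 9, 9, 10]
sofa = [3, 4, 5, 4, 12, 13, 14, 15]
points = list(zip(min_max(cci), min_max(sofa)))
medoids, labels = k_medoids(points, k=2)
```

Unlike a centroid, each medoid is an actual observation, which is why medoid coordinates can be reported directly on the normalized scale.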

Intercluster comparisons were performed with appropriate tests, chosen according to the number of clusters (k). The effect size between clusters and variables was analyzed by computing the Pearson (r) or Spearman (ρ) coefficient for continuous variables or Cramer’s V coefficient for categorical variables [14]. Coefficients were interpreted as reported in Supplementary File 1. Survival analysis was performed with the Kaplan–Meier method. Median survival time and 28-day survival probabilities with 95% confidence interval (CI95%) were computed. The differences in survival rates among clusters were tested with the Peto-Peto test [15].
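For readers less familiar with the Kaplan–Meier method, a minimal Python sketch of the product-limit estimator and of the median survival time is given below. The follow-up times are invented, and neither the confidence intervals nor the Peto-Peto test are reproduced here.

```python
def kaplan_meier(times, events):
    """Return [(time, survival probability)] at each distinct event time.

    times  : follow-up time of each patient (e.g. days)
    events : 1 if death was observed at that time, 0 if censored
    """
    survival = 1.0
    curve = []
    for t in sorted({t for t, e in zip(times, events) if e == 1}):
        at_risk = sum(1 for u in times if u >= t)
        deaths = sum(1 for u, e in zip(times, events) if u == t and e == 1)
        survival *= 1 - deaths / at_risk
        curve.append((t, survival))
    return curve


def median_survival(curve):
    """First time at which the estimated survival drops to 0.5 or below."""
    for t, s in curve:
        if s <= 0.5:
            return t
    return None  # median not reached during follow-up


# Invented follow-up data for six hypothetical patients.
toy_times = [1, 2, 2, 3, 5, 8]
toy_events = [1, 1, 0, 1, 1, 0]
curve = kaplan_meier(toy_times, toy_events)
median_days = median_survival(curve)
```

Censored patients contribute to the number at risk until their last follow-up time but never count as deaths, which is what distinguishes this estimate from a crude survival proportion.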

We performed multivariate logistic regression analysis to assess the relationship between cluster membership and ICU mortality. We selected the following variables: cluster, gender, BMI, time from in-hospital presentation to ICU admission, ward, clinical condition on ICU admission, and surgical and radiological procedures before ICU admission. Odds ratios (ORs) with CI95% were computed. Clustered models were compared with the non-clustered model (model 1). For nested models, the difference in deviance was tested with the likelihood ratio test (LRT). If the difference in deviance was not statistically significant, the simplest model was selected based on the parsimony principle. To evaluate the balance between complexity and data fit for non-nested models, we used the Akaike Information Criterion (AIC) and the Bayesian Information Criterion (BIC). The Receiver Operating Characteristic (ROC) curve was plotted to evaluate each model’s predictive performance. The Area Under the Curve (AUC) and its CI95% were computed with the DeLong method [16]. Sensitivity (Se) and specificity (Sp) with CI95% were also computed. The differences between AUCs were tested.
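The AUC can be read as the probability that a randomly chosen non-survivor receives a higher predicted risk than a randomly chosen survivor. The Python sketch below computes it by direct pairwise comparison and derives Se and Sp at a given cutoff. The predicted risks, outcomes, and cutoff are invented, and the DeLong confidence intervals are not reproduced.

```python
def auc(scores, outcomes):
    """AUC by pairwise comparison of positives and negatives.

    A pair counts 1 when the positive case has the higher score and
    0.5 when the two scores are tied.
    """
    positives = [s for s, y in zip(scores, outcomes) if y == 1]
    negatives = [s for s, y in zip(scores, outcomes) if y == 0]
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


def sensitivity_specificity(scores, outcomes, cutoff):
    """Se and Sp when a score >= cutoff is classified as positive."""
    tp = sum(1 for s, y in zip(scores, outcomes) if y == 1 and s >= cutoff)
    fn = sum(1 for s, y in zip(scores, outcomes) if y == 1 and s < cutoff)
    tn = sum(1 for s, y in zip(scores, outcomes) if y == 0 and s < cutoff)
    fp = sum(1 for s, y in zip(scores, outcomes) if y == 0 and s >= cutoff)
    return tp / (tp + fn), tn / (tn + fp)


# Invented predicted risks and outcomes (1 = died, 0 = survived).
predicted = [0.9, 0.8, 0.7, 0.6, 0.55, 0.4, 0.3, 0.2]
died = [1, 1, 0, 1, 1, 0, 0, 0]
toy_auc = auc(predicted, died)
toy_se, toy_sp = sensitivity_specificity(predicted, died, cutoff=0.5)
```

The pairwise definition is quadratic in the sample size, which is acceptable for a sketch; rank-based formulas give the same value more efficiently.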

With the best predictive model, we developed a simple decision algorithm to predict the patient’s cluster membership with the CART algorithm [17] (Supplementary File 1). Since our goal was to provide an easy-to-use algorithm, we limited the tree to 3 split nodes in the pruning phase. We believe that a tree with an excessive number of nodes, besides being prone to overfitting, would be too complex to interpret. The effectiveness of the decision model was evaluated with the confusion matrix. Accuracy with CI95%, unweighted Cohen’s Kappa, Se, Sp, and positive (PPV) and negative (NPV) predictive values were computed.
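The metrics derived from the confusion matrix can be computed as in the following Python sketch. The counts are invented for illustration and are not the study's results; confidence intervals are omitted.

```python
def confusion_metrics(tp, fp, fn, tn):
    """Accuracy, unweighted Cohen's kappa, Se, Sp, PPV, and NPV
    from the four cells of a 2 x 2 confusion matrix."""
    n = tp + fp + fn + tn
    accuracy = (tp + tn) / n
    # Agreement expected by chance, from the row and column totals.
    expected = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / n ** 2
    kappa = (accuracy - expected) / (1 - expected)
    return {
        "accuracy": accuracy,
        "kappa": kappa,
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
    }


# Invented counts: 100 hypothetical patients, positive class = one cluster.
metrics = confusion_metrics(tp=45, fp=5, fn=10, tn=40)
```

Kappa corrects accuracy for the agreement expected by chance alone, so it is more informative than accuracy when the two classes have different sizes.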

RStudio (Posit Software© 2009–2023, version 2023.03.1) was used for the analysis. For all hypothesis tests, two-tailed p-values < 0.05 were considered significant. For multiple comparisons, the p-value was adjusted using the Bonferroni correction. Data are reported in tables and plots.
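The Bonferroni adjustment multiplies each raw p-value by the number of comparisons and caps the result at 1. A minimal Python sketch, with invented p-values, is:

```python
def bonferroni(p_values):
    """Bonferroni-adjusted p-values: p * m, capped at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


# Invented raw p-values for three hypothetical pairwise comparisons.
raw = [0.01, 0.04, 0.5]
adjusted = bonferroni(raw)
significant = [p < 0.05 for p in adjusted]
```

In this toy example only the first comparison remains significant after adjustment, although two raw p-values were below 0.05.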

3 Results

3.1 Population Characteristics

From 1st January 2011 to 16th September 2022, 4723 patients were admitted to the ICU. After applying the exclusion criteria and removing records with missing data (Fig. 1), the preliminary descriptive analysis was performed on 2605 observations (Table 1).

Fig. 1

Flow chart for records selection. From the 1st of January 2011 to the 16th of September 2022, 4723 observations were retrieved. Once the exclusion criteria were applied, 2877 observations were screened for missing data. Missing data were considered as missing completely at random, and observations were removed. For the analysis, 2605 records were included

Table 1 The table shows the main population characteristics (2605 observations)

Briefly, 1543 patients (59.2%) were male; the median age was 70.0 years [57.0–80.0] and the median BMI was 25.1 kg/m² [22.1–27.8]. The median time from in-hospital presentation to ICU admission was 0.0 days [0.0–2.0]; 1078 patients (41.4%) came from the emergency department, 915 (35.1%) from surgical wards, and 612 (23.5%) from medical wards. The median CCI, SAPS II, SOFA, and GCS were 4.0 [2.0–7.0], 67.0 [51.0–85.0], 10.0 [7.0–13.0], and 5.0 [5.0–9.0], respectively.

The three most common comorbidities were hypertension (1456 patients, 55.9%), moderate/severe COPD (746 patients, 28.6%), and congestive heart failure (727 patients, 27.9%).

The three most common vital/organ function impairments at ICU admission were lung (2382 patients, 91.4%), brain (1582 patients, 60.7%), and cardiocirculatory (1518 patients, 58.2%). Ninety-nine patients (3.8%) showed clinical features suggesting sepsis, while 165 patients (6.3%) had septic shock. Patients with trauma were 437 (16.8%).

Before ICU admission, 995 patients (38.2%) underwent surgical procedures, while 220 patients (8.4%) underwent radiological procedures.

Sepsis and septic shock during ICU LOS occurred in 334 (12.8%) and 214 (8.2%) patients, respectively. Percutaneous tracheostomy was performed in 463 patients (17.8%), and 1780 patients (68.3%) required vasopressor/inotrope drugs.

The median ICU LOS was 4.0 days [1.0–11.0], and 1556 patients (59.7%) died during ICU stay.

3.2 Clustering and Intercluster Analysis

NbClust indicated that the best number of clusters was two for both CCI + SAPS II (proposed by eight indexes) and CCI + SOFA (proposed by seven indexes). Figure 2 shows the cluster plots. Clusters were labeled as “A” (1081 observations) and “B” (1524 observations) for CCI + SAPS II, and “α” (1276 observations) and “β” (1329 observations) for CCI + SOFA.

Fig. 2

Cluster plots for CCI + SAPS II (A) and CCI + SOFA (B). A Clusters were labeled as “A” (1081 observations) and “B” (1524 observations). The medoid coordinates [XCCI, YSAPS II] for the two clusters were [0.13, 0.36] and [0.40, 0.63]. B Clusters were labeled as “α” (1276 observations) and “β” (1329 observations). The medoid coordinates [XCCI, YSOFA] for the two clusters were [0.20, 0.32] and [0.40, 0.59]

Supplementary Table 1 shows the main intercluster differences with the computed statistics.

No significant differences were found for BMI, peptic ulcer, connective tissue disease, and lung and liver failure on admission for CA based on CCI + SAPS II, and for gender, BMI, dementia, peptic ulcer, connective tissue disease, and lung and liver failure on admission for CA based on CCI + SOFA.

According to the ρ and V coefficients (Table 2), in CA based on CCI + SAPS II, gender, liver disease, dementia, hematological malignancy, drug/alcohol addiction, and brain and coagulation failure on admission showed a very weak relationship. In contrast, CCI and SAPS II showed a strong relationship. Age, SOFA, congestive heart failure, chronic kidney disease, and ICU administration of vasopressor/inotrope showed a moderate relationship. All other variables showed a weak relationship.

Table 2 The table shows the effect size between variables and different clustering

In CA based on CCI + SOFA, neurodegenerative disease, hematological malignancy, drug/alcohol addiction, and coagulation failure on admission showed a very weak relationship. SAPS II, SOFA, and ICU administration of vasopressor/inotrope showed a strong relationship, while age, CCI, congestive heart failure, and chronic kidney disease showed a moderate relationship. All other variables showed a weak relationship.

3.3 Survival Analysis

For CCI + SAPS II clustering, the median ICU LOS was 6.0 days [2.0–15.0] for cluster A and 2.0 days [0.0–8.0] for cluster B, a statistically significant difference (p-value < 0.0001). The survival rates were 62.2% (673 patients) and 24.7% (376 patients) for clusters A and B, respectively, a statistically significant difference (p-value < 0.0001). Kaplan–Meier analysis (Fig. 3A) showed that median survival times were 18.0 days (CI95% 16.0–23.0 days) for cluster A and 4.0 days (CI95% 4.0–5.0 days, p-value < 0.0001) for cluster B.

Fig. 3

The figure shows the 28-day survival curves for CCI + SAPS II (A) and CCI + SOFA (B) clustering with the risk tables. A The predicted 28-day survival probabilities for clusters A and B were 43.7% (CI95% 39.7–48.1%) and 11.1% (CI95% 9.2–13.4%). Peto-Peto χ2 = 448 on 1 degree of freedom, p-value < 0.0001. B The predicted 28-day survival probabilities for clusters α and β were 42.2% (CI95% 38.5–46.3%) and 8.6% (CI95% 6.9–10.8%). Peto-Peto χ2 = 589 on 1 degree of freedom, p-value < 0.0001

For CCI + SOFA clustering, the median ICU LOS was 6.0 days [2.0–15.0] for cluster α and 2.0 days [0.0–7.0] for cluster β, a statistically significant difference (p-value < 0.0001). The survival rates were 61.1% (780 patients) and 20.2% (269 patients) for clusters α and β, respectively, a statistically significant difference (p-value < 0.0001). Kaplan–Meier analysis (Fig. 3B) showed that median survival times were 17.0 days (CI95% 16.0–20.0 days) for cluster α and 3.0 days (CI95% 2.0–4.0 days, p-value < 0.0001) for cluster β.

3.4 Multivariate Logistic Regression and Predictive Performance

Table 3 shows the multivariate logistic regression results. For model 1, time from in-hospital presentation to ICU admission, previous admission to a medical ward, brain, cardiocirculatory, and metabolic failure, and septic shock were risk factors for mortality. Conversely, lung failure, sepsis, and trauma were associated with a reduction in mortality risk.

Table 3 The table reports the results of multivariate regression analysis for the three models: model 1 (without clustering), model CCI + SAPS II clustering, and model CCI + SOFA clustering

For clustered models, belonging to cluster B or β represented an independent risk factor for the increase in mortality, with an OR of 4.65 (CI95% 3.87–5.60) and 5.44 (CI95% 4.50–6.59), respectively.

In the CCI + SAPS II clustered model, previous admission in the medical ward, brain and cardiocirculatory failure, and septic shock were factors related to an increase in mortality, while kidney failure was associated with a reduction in mortality. In the CCI + SOFA clustered model, brain and cardiocirculatory failure, and septic shock were factors related to an increase in mortality, while kidney failure was associated with a reduction in mortality.

Deviance differences between model 1 (deviance 3315.2) and models based on CCI + SAPS II (deviance 3032.6) and CCI + SOFA (deviance 2983.2) clustering were statistically significant (p-value < 0.0001). According to the AIC and BIC values, the best non-nested model was the CCI + SOFA clustered model (AIC = 3021.21, BIC = 3132.65).
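The reported figures can be checked with a short calculation. The Python sketch below is a worked example under explicit assumptions: each clustered model is taken to differ from model 1 by a single binary cluster term (1 degree of freedom), the deviance is treated as −2 times the log-likelihood, and the number of estimated parameters (19) is inferred from the gap between the reported AIC and deviance rather than stated in the text.

```python
import math


def lrt_p_value(deviance_reduced, deviance_full):
    """Likelihood ratio test for nested models differing by ONE parameter.

    For 1 degree of freedom, P(chi-square > x) = erfc(sqrt(x / 2)).
    """
    statistic = deviance_reduced - deviance_full
    return statistic, math.erfc(math.sqrt(statistic / 2))


def aic(deviance, n_params):
    """AIC, assuming deviance = -2 * log-likelihood."""
    return deviance + 2 * n_params


def bic(deviance, n_params, n_obs):
    """BIC, assuming deviance = -2 * log-likelihood."""
    return deviance + n_params * math.log(n_obs)


# Deviances and sample size as reported in the text.
stat_saps, p_saps = lrt_p_value(3315.2, 3032.6)
stat_sofa, p_sofa = lrt_p_value(3315.2, 2983.2)

# ASSUMPTION: 19 estimated parameters, inferred from AIC - deviance = 38.01.
K_ASSUMED = 19
aic_sofa = aic(2983.2, K_ASSUMED)
bic_sofa = bic(2983.2, K_ASSUMED, 2605)
```

Under these assumptions the recomputed AIC and BIC agree with the reported values (3021.21 and 3132.65) to within rounding of the deviance, and both deviance reductions correspond to p-values far below 0.0001.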

Figure 4 shows the ROC curves for the three models. Se and Sp were 66.06% (CI95% 59.20–79.89%) and 58.03% (CI95% 42.99–64.27%) for model 1, 63.39% (CI95% 59.58–68.06%) and 77.44% (CI95% 72.75–80.33%) for the CCI + SAPS II clustered model, and 73.02% (CI95% 67.02–79.93%) and 70.76% (CI95% 66.71–75.51%) for the CCI + SOFA clustered model. The AUCs were 0.6528 (CI95% 0.6317–0.6738, poor accuracy) for model 1, 0.7398 (CI95% 0.7201–0.7594, moderate accuracy) for the CCI + SAPS II clustered model, and 0.7511 (CI95% 0.7320–0.7702, moderate accuracy) for the CCI + SOFA clustered model. According to the ROC curve analysis, both clustered models showed a higher AUC than model 1 (adjusted p-value < 0.0001). However, no difference was noted between the AUCs of the two clustered models (adjusted p-value 0.2213).

Fig. 4

The figure shows the ROC curves for the considered models. The AUCs (CI95%) were: 0.6528 (CI95% 0.6317–0.6738, poor accuracy) for model 1, 0.7398 (CI95% 0.7201–0.7594, moderate accuracy) for CCI + SAPS II clustered model, and 0.7511 (CI95% 0.7320–0.7702, moderate accuracy) for CCI + SOFA clustered model. The adjusted p-values were: < 0.001 for Model 1 vs. CCI + SAPS II, < 0.001 for Model 1 vs. CCI + SOFA, and 0.2213 for CCI + SAPS II vs. CCI + SOFA

3.5 Prediction and Decision Tree

Supplementary Fig. 1 shows the decision tree obtained with the CART algorithm (cp = 0.001, node split = 15). As described in the methods, the tree was pruned to 3 split nodes (complexity parameter = 0.034). Figure 5 shows the pruned decision tree. Considering the β cluster as the positive case, the accuracy of the pruned decision tree was 94.8% (CI95% 93.9–95.6%), with Kappa 0.896, indicating that the model’s predictions are highly reliable. The Se and Sp were 95.3% (CI95% 94.0–96.3%) and 94.3% (CI95% 93.0–95.5%), respectively, with high PPV (94.6%, CI95% 93.3–95.6%) and NPV (95.0%, CI95% 93.7–96.0%).

Fig. 5

The figure shows the pruned decision tree (3 split nodes) for cluster membership. Each node represents a decision point with branches to the outcomes “Alpha” and “Beta.” The leaves represent the final classification with the absolute numbers of each cluster within that leaf. The splitting criterion is reported under each leaf. The color of the leaf indicates the most represented cluster: green for “Alpha” and orange for “Beta.”

4 Discussion

To the best of our knowledge, the CCI, SAPS II, and SOFA scores have so far been evaluated individually as prognostic tools in the population requiring ICU admission, and studies have been conducted to establish which has the best predictive value for the onset of complications and for mortality [18, 19].

However, single scores were validated in defined clinical settings according to different outcomes and observational periods. The CCI (0–33 scale points) was initially developed to predict 1-year mortality in a mixed population of internal medicine patients using comorbidities derived from chart review and consists of 19 selected conditions [1]. The SAPS II (0–163 scale points) is a severity score and has become one of the most widely used scoring systems in European ICUs. It is based on the worst values measured on 17 variables during the initial 24 h of ICU stay, with a higher score indicating a more severe illness [2]. SOFA (0–24 scale points) focuses on six organ systems (respiratory, cardiovascular, hepatic, coagulation, renal, and neurological). Each system is scored from 0 to 4 based on the degree of dysfunction or failure, with higher scores indicating more severe organ impairment. This tool is widely used in the ICU to predict patient outcomes, guide therapeutic decisions, and evaluate the effectiveness of interventions, especially in the context of sepsis [3].
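The structure of the SOFA total (six organ systems, each scored 0 to 4, summed to 0–24) can be expressed as a short Python sketch. The dictionary keys and the example sub-scores are hypothetical, and the clinical rules that assign each sub-score are not reproduced.

```python
SOFA_SYSTEMS = (
    "respiratory", "cardiovascular", "hepatic",
    "coagulation", "renal", "neurological",
)


def sofa_total(subscores):
    """Sum the six organ sub-scores (each 0-4) into a 0-24 total."""
    total = 0
    for system in SOFA_SYSTEMS:
        value = subscores[system]
        if value not in range(5):
            raise ValueError(f"{system} sub-score must be an integer from 0 to 4")
        total += value
    return total


# Hypothetical sub-scores for one patient.
example = {
    "respiratory": 3, "cardiovascular": 4, "hepatic": 0,
    "coagulation": 1, "renal": 2, "neurological": 2,
}
example_total = sofa_total(example)
```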

ML based on CA and CART provides useful clinical information, overcoming the limits of a classic approach based on a single score. The study’s primary finding was that the combinations CCI + SAPS II and CCI + SOFA allowed the identification of patient clusters characterized by different mortality risks. In detail, belonging to cluster B/β was an independent risk factor for mortality and was associated with shorter survival times than cluster A/α. Moreover, clustered models improved mortality prediction, as suggested by an AUC of moderate accuracy, compared with the non-clustered model (poor accuracy).

CA can be a tool to identify patients with particular care needs, allowing the redesign of ICU organization based on care processes rather than admission diagnosis [20]. This last aspect is particularly interesting, since CA, by identifying individuals with specific needs, could ensure a better allocation of economic and human resources, reducing waste and improving outcomes. Beyond the organizational aspects, identifying homogeneous clusters allows clinicians to adopt the right therapy for the right patients at the right time.

CA has been proposed to personalize treatment in several clinical settings, such as acute respiratory distress syndrome [21], sepsis [22], and out-of-hospital cardiac arrest [23]. However, while most studies focus on categories of patients, our study focuses on a general population, affected by different acute pathologies.

A second aspect that emerges is that clusters do not exist per se but are the result of complex statistical processing based on the choice of specific variables, justified only by a statistically significant association with a clinical outcome. Identifying a cluster does not necessarily coincide with identifying a patient “prototype”, as demonstrated by studies showing that cluster membership is very unstable across different cohorts [24, 25]. In our analysis, this aspect emerges from the different allocation of the observations to clusters depending on the variables considered for CA and from the different degrees of association, as evidenced by the magnitude of the ρ and V coefficients between the clusters and the considered variables.

We found a strong association between cluster membership and the use of vasopressors/inotropes in the ICU when clustering was performed with the combination of CCI + SOFA. This result is due to the different items used in SAPS II and SOFA to calculate the final score. While the SAPS II only records the worst systolic pressure value, the SOFA also reports the administration of vasopressor/inotropic drugs. This aspect may also explain why the clustered model based on CCI + SOFA, rather than the one based on CCI + SAPS II, showed the best balance between model complexity and data fit.

Recent literature emphasized the importance of considering vasopressor dosage in the early phase of septic shock and how it correlates with mortality outcomes. Roberts et al. [26] evaluated the association between vasopressor dosing intensity and 30-day in-hospital mortality in patients with septic shock. It was observed that increasing vasopressor dosing intensity during the first 24 h after septic shock was associated with increased mortality. Vallabhajosyula et al. [27] reported that vasopressor burden during early critical illness appears to be a marker for unfavorable outcomes. The elaborated score, incorporating the cumulative vasoactive medication usage within the first 24 h, performed better than SOFA for 28-day mortality prediction.

Although further studies are necessary to evaluate the generalizability of this evidence in different settings, according to our results, a score that evaluates the administration of vasopressor/inotropic drugs among the items may be preferred in CA to predict mortality.

The identification of clusters characterized by a high mortality risk represents important but incomplete information. From a clinical point of view, it would be much more interesting to identify whether a patient with specific features belongs to a cluster at the time of ICU admission. The application of the CART algorithm in this study highlights its potential in the nuanced field of ICU patient data analysis, where rapid and accurate decision-making is crucial. Our findings demonstrate that CART effectively identifies critical thresholds of CCI and SOFA that are simple to compute and evaluate, offering a clear, interpretable model for predicting cluster membership with high accuracy. These aspects should facilitate its application in daily clinical practice, providing targeted interventions for high-risk ICU patients.

The CART algorithm has shown significant potential in the context of ICU. Alghatani et al. [28], in a study on predicting ICU LOS and mortality using patient vital signs, developed decision tree-based methods like CART, demonstrating the importance of advanced data analysis in forecasting critical ICU outcomes. This reflects a broader trend in healthcare, where ML and CART are increasingly employed for clinical decision support in ICUs [4]. Liu et al. [29] demonstrated the application of CART in predicting successful extubation in ICU patients, underscoring the algorithm’s ability to support critical clinical decisions based on patient data. Ramos et al. [30] developed an algorithm to aid in ICU admission triage decisions and showcased the practicality and reliability of such methods in managing ICU admissions and patient prioritization effectively. These examples highlight the growing relevance of the CART algorithm in ICU settings, where its ability to process complex data and provide interpretable results is invaluable for patient care and resource management.

4.1 Limitations

Our study has several limitations. First, its retrospective, single-center design may affect the generalizability of our findings. In particular, the choice to exclude patients with a hospital LOS longer than 15 days before ICU admission may have excluded patients with a higher risk of death. Moreover, the decision-making algorithm needs further validation. To ensure the robustness and applicability of the algorithm in diverse clinical settings, it must undergo rigorous validation using external datasets. This step is crucial to ascertain the algorithm’s efficacy and reliability across different ICU environments and patient populations. Hence, while promising, our results should be interpreted cautiously, as they may not accurately reflect the realities of different ICU settings or patient groups.

Second, despite the large sample size, the data selection process and handling of missing data could also have impacted the results, as missing observations were excluded under the missing completely at random assumption.

Third, SAPS II and SOFA scores were recorded at admission. This reliance potentially restricts our understanding of the patient’s progression in the ICU, as these values do not account for changes in evolving patient conditions. The dynamic nature of patients’ health in the ICU setting suggests that reliance on admission scores alone may not comprehensively represent their clinical trajectory, thus impacting our findings’ predictive accuracy and applicability. Furthermore, the use of different scores could lead to different conclusions. Studies are needed to clarify the best combination of scores for clustering.

Fourth, while the use of CART for predicting cluster membership was insightful, the potential overfitting of the model could pose challenges for practical clinical application, especially when referred to other populations.

4.2 Strengths

Our study’s strengths include the comprehensive utilization of ML based on CA and CART algorithms. Our approach offers a novel perspective on patient categorization and outcome prediction, enhancing the understanding of ICU patient profiles.

A notable strength of our study is the development of a simple, yet effective decision-making algorithm designed for easy implementation in clinical settings. This simplicity is a considerable advantage, as it allows for seamless integration into the existing workflows of ICU environments. The algorithm’s user-friendly nature potentially enables a wide range of healthcare professionals to utilize it effectively, enhancing patient care and decision-making efficiency in critical care settings.

These aspects underline the study’s potential to influence future research and clinical practices in intensive care settings.

5 Conclusions

Clustered models significantly improve mortality prediction compared to the non-clustered model and may be more useful in clinical practice for the early identification of high-risk patients. Although the two clustered models showed no difference in mortality predictive performance, clustering based on CCI + SOFA showed the best balance between model complexity and data fit and should be preferred to the CCI + SAPS II model. Developing a straightforward, user-friendly decision-making algorithm is a key contribution, offering a practical tool for enhancing ICU performance.

While acknowledging the limitations inherent in our monocentric, retrospective approach, the study opens new avenues for future research and underscores the vital role of ML in improving patient outcomes in critical care. Further multicentric studies are needed to ensure the findings are robust and applicable to a broader patient population.