Key Summary Points

Why carry out this study?

Sepsis is a heterogeneous clinical syndrome characterized by a dysregulated host response to infection

Identifying the sepsis subphenotypes could lead to a better understanding of the pathophysiology and discovery of new treatment targets. However, there is a lack of models to identify the subphenotypes in such patients

The aim of this study was to propose a machine-learning method to identify the sepsis subphenotype, using only routinely available clinical data collected within the first 24 h of ICU admission

What was learned from the study?

Machine learning-based algorithms for subphenotype identification in sepsis are possible

Two sepsis subphenotypes could be identified rapidly based on routinely available clinical data

Subphenotype B was independently associated with increased in-hospital mortality among patients with sepsis

Digital Features

This article is published with digital features, including a graphical abstract, to facilitate understanding of the article. To view digital features for this article go to https://doi.org/10.6084/m9.figshare.20418600.

Introduction

Sepsis is a common and frequently fatal clinical condition characterized by a dysregulated host response to infection [1]. Recent epidemiologic studies estimate the worldwide incidence of sepsis at 48.9 million cases per year, with mortality rates of 20–25% [2, 3]. Although the Surviving Sepsis Campaign Guidelines for Management of Sepsis and Septic Shock have been updated five times in the 2 decades since they were first introduced in 2004, the mortality rate among patients with sepsis remains unacceptably high [4]. A major potential barrier to progress is the heterogeneity of sepsis.

Not all sepsis is the same. However, a one-size-fits-all approach is still used in clinical practice, which ignores the heterogeneity across patients with sepsis. Recently, several studies have accurately identified subphenotypes among sepsis cases; these subphenotypes differ in demographics, laboratory values and clinical outcomes [5,6,7,8]. Notably, these methods for subphenotype identification rely largely on the measurement of specific protein biomarkers, such as vascular adhesion protein 1 (VAP1), matrix metalloproteinase 8 (MMP8) and proteinase 3 (PRTN3) [7]. However, these biomarkers are not widely available as routine clinical tests, and the high cost of biomarker detection limits the rapid identification of sepsis subphenotypes in clinical practice. Thus, there is a need to derive sepsis subphenotypes from routinely available clinical data early after intensive care unit (ICU) admission.

Machine-learning models that use clinical data can be applied to identify disease subphenotypes. Among them, K-means cluster analysis is a clustering method that has already gained wide acceptance in medicine [9, 10]. K-means cluster-based methods have been applied for subphenotype identification in pediatric acute respiratory distress syndrome (ARDS) [11] and sepsis-associated acute kidney injury [12]. However, to the best of our knowledge, there is a lack of models that identify clinical subphenotypes in patients with sepsis using routinely available clinical data.

Accordingly, the objective of this study was to propose a K-means cluster method to identify the sepsis subphenotype, using only routinely available clinical data collected within the first 24 h of ICU admission.

Methods

Ethical Approval

The data for this study were obtained from the Medical Information Mart for Intensive Care (MIMIC-IV) database. The establishment of the MIMIC-IV database was approved by the Institutional Review Boards of Beth Israel Deaconess Medical Center (Boston, MA) and the Massachusetts Institute of Technology (Cambridge, MA). Requirement for individual patient consent was waived because this project did not impact clinical care, and all protected health information was deidentified [13]. This study was performed according to the Declaration of Helsinki in 2013 [14] and reported in accordance with the Strengthening the Reporting of Observational studies in Epidemiology (STROBE) statement [15].

Data Source

As a recent update to MIMIC-III, the current MIMIC-IV (version 1.0) is a large, freely available database comprising a variety of clinical data associated with 76,540 distinct admissions of patients who stayed in the ICU of the Beth Israel Deaconess Medical Center between 2008 and 2019 [13]. After completing a recognized course in protecting human research participants and signing a data use agreement, one author (Chang Hu) was approved to access the database (certification no. 47460147).

Study Population and Outcome

In this study, sepsis was defined as a confirmed or suspected infection combined with a Sequential Organ Failure Assessment (SOFA) score ≥ 2 in the first 24 h after ICU admission, in accordance with the Third International Consensus Definitions for Sepsis (Sepsis-3) published in 2016 [1]. The timeline of the sepsis definition is presented in Supplementary Fig. S1. We enrolled all adult (> 18 years) patients with sepsis. Exclusion criteria were multiple ICU admissions (only the data from each patient’s first ICU admission were used in this study) and ICU length of stay (LOS) < 24 h. The primary outcome was in-hospital mortality; the secondary outcomes were ICU mortality, ICU LOS and hospital LOS.
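The inclusion and exclusion rules above can be sketched as a small filter. This is a minimal illustration, not the authors' extraction code: the helper `select_cohort`, the record keys (`subject_id`, `intime`, `age`, `suspected_infection`, `sofa_24h`, `los_days`) and the toy records are all hypothetical, and the sketch assumes that "multiple ICU admissions" means later stays are dropped while the first stay is kept.

```python
from datetime import datetime

def select_cohort(stays):
    """Apply the stated inclusion/exclusion rules to ICU stay records.

    Each record is a dict with hypothetical keys:
    subject_id, intime, age, suspected_infection, sofa_24h, los_days.
    """
    # Keep only each patient's first ICU admission (earliest intime).
    first_stay = {}
    for stay in sorted(stays, key=lambda s: s["intime"]):
        first_stay.setdefault(stay["subject_id"], stay)

    cohort = []
    for stay in first_stay.values():
        is_adult = stay["age"] > 18                                  # adult (> 18 years)
        is_septic = stay["suspected_infection"] and stay["sofa_24h"] >= 2
        long_enough = stay["los_days"] >= 1.0                        # exclude ICU LOS < 24 h
        if is_adult and is_septic and long_enough:
            cohort.append(stay)
    return cohort

# Toy records, made up for illustration only.
toy_stays = [
    {"subject_id": 1, "intime": datetime(2015, 3, 1), "age": 70,
     "suspected_infection": True, "sofa_24h": 5, "los_days": 3.2},
    {"subject_id": 1, "intime": datetime(2016, 7, 9), "age": 71,
     "suspected_infection": True, "sofa_24h": 4, "los_days": 2.0},   # repeat admission
    {"subject_id": 2, "intime": datetime(2012, 5, 4), "age": 55,
     "suspected_infection": True, "sofa_24h": 6, "los_days": 0.5},   # ICU LOS < 24 h
    {"subject_id": 3, "intime": datetime(2018, 1, 2), "age": 40,
     "suspected_infection": True, "sofa_24h": 1, "los_days": 4.0},   # SOFA < 2
]
cohort = select_cohort(toy_stays)
```

In this toy example only the first stay of patient 1 survives the filter; the other records are removed by the repeat-admission, length-of-stay and SOFA rules, respectively.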

Data Extraction and Preprocessing

From the MIMIC-IV database, we extracted a set of clinical variables. As we have previously described [16], these were routinely available parameters, including demographic variables (e.g., age, gender, ethnicity, body weight, height, admission time period and admission type), medical history (e.g., hypertension, diabetes, congestive heart failure, cerebrovascular disease, chronic pulmonary disease, liver disease, renal disease, tumor and acquired immune deficiency syndrome), vital signs (e.g., heart rate, systolic blood pressure, diastolic blood pressure, mean arterial pressure, respiratory rate, body temperature and SpO2), laboratory findings (e.g., blood glucose, lactate, pH, pCO2, pO2, base excess, white blood cell count, anion gap, bicarbonate, blood urea nitrogen, serum calcium, serum chloride, serum creatinine, serum sodium, serum potassium, serum fibrinogen, international normalized ratio, prothrombin time, partial thromboplastin time, alanine aminotransferase, alkaline phosphatase, aspartate aminotransferase, total bilirubin, amylase, creatine phosphokinase, creatine kinase MB, lactate dehydrogenase, PaO2/FiO2 ratio, hematocrit, hemoglobin, platelets and albumin), medical treatments (e.g., mechanical ventilation, the time to first dose of antibiotic agents and vasopressor use), urine output and Glasgow Coma Scale score. For each variable, we extracted the most abnormal value recorded within the first 24 h of ICU admission. Missing data were assumed to be missing at random, and variables with > 30% missing values were excluded from the analysis. We then used multiple imputation by chained equations (MICE) to handle the remaining missing data (mice package in R).
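The 30% missingness rule can be illustrated with a short sketch. The function names, the `None` convention for missing values and the toy table are assumptions made for this example; the imputation step itself (performed in R in the study) is not reproduced here.

```python
def missing_fraction(values):
    """Fraction of entries that are missing (None) for one variable."""
    return sum(v is None for v in values) / len(values)

def drop_sparse_variables(table, max_missing=0.30):
    """Keep variables whose missing fraction does not exceed max_missing.

    `table` maps a variable name to its list of per-patient values,
    with None marking a missing measurement.
    """
    return {name: vals for name, vals in table.items()
            if missing_fraction(vals) <= max_missing}

# Toy data, made up for illustration only.
toy_table = {
    "lab_a": [1.2, None, 2.5, 3.1, 0.9, None, 1.8, 2.2, 4.0, 1.1],   # 20% missing
    "lab_b": [None, None, 80, None, None, 65, None, 90, None, 70],   # 60% missing
}
kept = drop_sparse_variables(toy_table)
```

Here `lab_a` is retained for later imputation, whereas `lab_b` exceeds the threshold and is excluded.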

Variable Selection

For feature selection, we followed the procedure of Soussi et al. [6] and Zhang et al. [8]. In brief, we first removed variables with a missing fraction > 30%. Then, we preselected variables based on the published literature and their potential association with sepsis onset and progression. The final selection of the 11 variables included in the K-means clustering algorithm was made by consensus between two critical care medicine experts, YL and ZP (Supplementary Table S1). The PaO2/FiO2 ratio represented respiratory function; serum creatinine represented renal function; platelet count and hemoglobin represented the hematologic system; lactate, heart rate, systolic blood pressure and body temperature represented the circulatory system; white blood cell count represented inflammatory status; sodium represented electrolyte status; and glucose represented metabolic function.

Subphenotype Classification

In the present study, we employed a K-means clustering algorithm to determine clusters. We first assessed the correlation between candidate variables. Then, all continuous variables were transformed into z-scores (mean 0, standard deviation 1) before clustering. Two to eight clusters were compared in this unsupervised approach. We determined the optimal number of clusters based on the total within-cluster sum of squares (WSS), the Silhouette score (range − 1 to 1, with values closer to 1 being better), the Davies-Bouldin score (range 0 upward, with values closer to 0 being better) and the Calinski-Harabasz score (range 0 upward, with higher values being better). Finally, we used principal component analysis (PCA) to visualize the clustering results.
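The workflow described above (standardize, cluster for several values of K, compare a cluster-quality index) can be sketched as follows. This is a plain Lloyd's-algorithm illustration using only the Python standard library (Python 3.8+ for `math.dist`), not the software the study used; only the Silhouette score is computed, the two-variable toy data are made up, and all function names are ours.

```python
import math
import random
from statistics import mean, pstdev

def zscore_columns(rows):
    """Standardize each column to mean 0 and standard deviation 1."""
    cols = list(zip(*rows))
    mus = [mean(c) for c in cols]
    sds = [pstdev(c) for c in cols]
    return [[(x - m) / s for x, m, s in zip(row, mus, sds)] for row in rows]

def kmeans(points, k, n_iter=50, seed=0):
    """Plain Lloyd's algorithm; returns one cluster label per point."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)
    labels = [0] * len(points)
    for _ in range(n_iter):
        # Assignment step: nearest center by Euclidean distance.
        labels = [min(range(k), key=lambda j: math.dist(p, centers[j]))
                  for p in points]
        # Update step: move each center to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centers[j] = [mean(c) for c in zip(*members)]
    return labels

def silhouette(points, labels):
    """Mean silhouette coefficient (-1 to 1, closer to 1 is better)."""
    scores = []
    for i, p in enumerate(points):
        dists = {}
        for j, (q, lab) in enumerate(zip(points, labels)):
            if j != i:
                dists.setdefault(lab, []).append(math.dist(p, q))
        own = dists.get(labels[i])
        if not own:                      # singleton cluster: score taken as 0
            scores.append(0.0)
            continue
        a = mean(own)                    # mean distance within own cluster
        b = min(mean(d) for lab, d in dists.items() if lab != labels[i])
        scores.append((b - a) / max(a, b))
    return mean(scores)

# Toy data (made up): two hypothetical variables per patient.
toy = [[36.9, 1.0], [37.0, 1.2], [37.1, 0.9], [36.8, 1.3], [37.0, 1.1],
       [35.2, 6.0], [35.3, 6.4], [35.1, 6.2], [35.0, 6.5], [35.2, 6.3]]
z = zscore_columns(toy)
labels2 = kmeans(z, 2)
scores = {k: silhouette(z, kmeans(z, k)) for k in range(2, 5)}
best_k = max(scores, key=scores.get)
```

On these well-separated toy points the two-cluster solution has the highest Silhouette score, mirroring how the number of clusters is chosen by comparing indices across candidate values of K.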

Statistical Analysis

Categorical variables were expressed as number and percentage and tested for baseline comparability with the chi-square test or Fisher’s exact test, as appropriate. Continuous variables were expressed as mean and standard deviation or median and interquartile range (IQR) and compared between the two groups using the Student’s t-test or the Mann-Whitney test, as appropriate.

Multivariable logistic regression models were used to estimate the association between sepsis subphenotypes and in-hospital mortality. Unadjusted and adjusted odds ratios (ORs) and 95% confidence intervals (CIs) were calculated. In the unadjusted model, we tested the direct effect of sepsis subphenotype on in-hospital mortality. Model 1 was adjusted for demographic variables (age, gender, body weight, height, race, admission time period and admission type). Model 2 was adjusted for the covariates included in model 1 as well as comorbidities (Charlson comorbidity index). Model 3 was adjusted for the covariates included in model 2 as well as neurologic function (Glasgow Coma Scale [GCS] score). Model 4 was adjusted for the variables in model 3 along with the severity of illness score (SOFA). Model 5 was adjusted for the variables in model 4 along with medical treatments (antibiotic therapy on day 1, mechanical ventilation on day 1 and vasopressor use on day 1).
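For reference, the nested models share the standard logistic form, written here in our own notation rather than the authors': Y_i is in-hospital death for patient i, B_i indicates membership of subphenotype B, and x_i holds the covariates of the model in question (empty for the unadjusted model). The Wald-type interval shown is an assumption, because the text does not state how the CIs were computed.

```latex
\operatorname{logit}\,\Pr(Y_i = 1)
  = \beta_0 + \beta_1 B_i + \boldsymbol{\gamma}^{\top}\mathbf{x}_i,
\qquad
\mathrm{OR} = e^{\hat{\beta}_1},
\qquad
95\%\ \mathrm{CI} = \exp\!\left(\hat{\beta}_1 \pm 1.96\,\widehat{\mathrm{SE}}(\hat{\beta}_1)\right)
```

Moving from model 1 to model 5 only enlarges x_i; the reported OR is always the exponentiated coefficient of the subphenotype indicator.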

A two-sided P value < 0.05 was considered statistically significant. All analyses were carried out using SPSS statistical software version 24.0 (IBM), R statistical software version 3.6.1 (R Foundation) and Python software version 3.6 (Fig. 1).

Fig. 1

Schematic illustration of the study design. MIMIC-IV Medical Information Mart for Intensive Care IV, WBC white blood cell, SBP systolic blood pressure

Results

Characteristics of Cohorts

In the current analysis, 12,292 patients with sepsis were screened in the MIMIC-IV database. After all exclusions (2657 patients with multiple ICU admissions and 818 patients whose ICU LOS was < 24 h), 8817 participants were finally enrolled in this study (Supplementary Fig. S2). Of the 8817 cases, the median age was 66.8 years (IQR 55.9–77.1 years), and 38.1% (3361/8817) were female. The most frequently documented race and ethnicity category was White (5887/8817, 66.8%), followed by Black/African American (467/8817, 5.3%), Hispanic/Latinx (241/8817, 2.7%) and Asian (205/8817, 2.3%). The three most frequently reported comorbidities were hypertension (4256/8817, 48.3%), diabetes (2528/8817, 28.7%) and congestive heart failure (2365/8817, 26.8%). The demographics at baseline are provided in Table 1.

Table 1 Demographics at baseline

Among the 8817 patients, 71.5% (6307/8817) received their first antibiotic dose in the first 24 h after ICU admission, 59.5% (5242/8817) received mechanical ventilation on day 1, and 39.8% (3510/8817) received their first vasopressor dose in the first 24 h after ICU admission. The median SOFA score was 5 (IQR: 5–8) across the whole dataset, indicating considerable severity of illness. The overall all-cause in-hospital mortality rate was 12.6% (1107/8817) (Table 2).

Table 2 Clinical characteristics of the cohort

Derivation of Sepsis Subphenotypes

Supplementary Table S1 presents the 11 clinical variables covering the functions of several organs. Significant correlations were noted between the candidate variables (Fig. 2). The highest correlations were observed between glucose and lactate (r = 0.3) and between heart rate and body temperature (r = 0.3). Supplementary Table S2 summarizes the K-means clustering results for K ranging from 2 to 7 in this cohort. Clustering with K = 2 had a higher Silhouette score (0.24) and a higher Calinski-Harabasz score (879) than the other candidate values of K. The within-cluster variance graph is presented in Supplementary Fig. S3. Therefore, a two-class model provided the optimal fit in this study. For simplicity, we henceforth refer to the two classes as subphenotypes A (N = 7094) and B (N = 1723). For easier exploration and visualization of the two subphenotypes, we also created two-dimensional plots using principal component analysis (PCA) showing the data before and after clustering (Fig. 3).
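The pairwise correlations in Fig. 2 are Spearman rank coefficients, which are Pearson correlations computed on ranks. A minimal standard-library sketch follows; the function names and the toy glucose and lactate values are made up for illustration and are not study data.

```python
from statistics import mean

def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        shared = (pos + end) / 2 + 1     # mean of 1-based positions pos+1..end+1
        for idx in order[pos:end + 1]:
            ranks[idx] = shared
        pos = end + 1
    return ranks

def spearman(x, y):
    """Spearman's rho: Pearson correlation computed on the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5

# Toy values, made up for illustration only.
toy_glucose = [90, 110, 150, 200, 130]
toy_lactate = [1.0, 1.5, 1.2, 4.0, 2.2]
rho = spearman(toy_glucose, toy_lactate)
```

Because the coefficient depends only on rank order, it is insensitive to the skewed distributions that are typical of laboratory values such as lactate.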

Fig. 2

Correlation matrix of the variables measured in this cohort. Coefficients are derived using the Spearman’s rank correlation coefficient. WBC white blood cell, SBP systolic blood pressure

Fig. 3

Principal component analysis (PCA) to visualise the clustering results. A Before grouping clusters; B after grouping clusters. PCA principal component analysis

Characteristics of Each Subphenotype

Figure 4 shows the selected variables for the two subphenotypes. Compared with subphenotype A, subphenotype B was characterized by considerably higher levels of lactate, glucose, creatinine, white blood cell count and sodium; higher heart rate; and lower body temperature, platelet count, systolic blood pressure, hemoglobin and PaO2/FiO2 ratio. Details of the comparison of these variables between the two subphenotypes are presented in Fig. 5.

Fig. 4

Selected variables by subphenotype in sepsis. Differences in standardized values of each variable by subphenotype. All continuous variables were transformed into z-scores (mean 0, standard deviation 1). WBC white blood cell, SBP systolic blood pressure

Fig. 5

Comparison of the selected variables between the two subphenotypes in sepsis

Baseline characteristics of study participants according to subphenotype are also shown in Tables 1 and 2. There were no significant differences in age, gender and height between subphenotype A and subphenotype B. Subjects in subphenotype A vs subphenotype B were more likely to be White (68.3% vs. 60.3%). In addition, participants in subphenotype A vs subphenotype B had lower Charlson Comorbidity Index (5 [3–7] vs. 6 [4–8], P < 0.001) and lower need for antibiotic therapy (70.9% vs. 74.2%, P < 0.001), mechanical ventilation (57.0% vs. 69.6%, P < 0.001) and vasopressor use (34.9% vs. 60.0%, P < 0.001).

Clinical Outcomes of Each Subphenotype

Results for all outcomes are shown in Table 2. In-hospital and ICU mortality were significantly higher in subphenotype B than in subphenotype A (29.4% vs. 8.5%, P < 0.001; 25.4% vs. 6.0%, P < 0.001, respectively). Furthermore, the lengths of hospital stay and ICU stay were significantly longer in patients with subphenotype B than in those with subphenotype A (10.6 [5.5–18.9] vs. 7.9 [5.1–13.1], P < 0.001; 4.6 [2.4–9.7] vs. 2.8 [1.5–5.3], P < 0.001, respectively).

Results of the univariable and multivariate logistic regression analysis for the primary outcome are presented in Table 3. In a univariable analysis, subjects in subphenotype B were associated with increased in-hospital mortality (OR 4.492; 95% CI 3.932–5.132; P < 0.001). After adjusting for multiple potential confounders using several multivariate logistic regression models, we found that subphenotype B was independently associated with increased risk of in-hospital mortality compared with subphenotype A (adjusted OR 2.214; 95% CI 1.780–2.754, P < 0.001).
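As a rough consistency check, the unadjusted OR can be approximated from the rounded in-hospital mortality rates reported above (29.4% vs. 8.5%). This is only arithmetic on published percentages; the adjusted OR cannot be reproduced this way because it requires patient-level covariates.

```python
def odds(p):
    """Convert a probability to odds."""
    return p / (1 - p)

# In-hospital mortality as reported in the text (rounded percentages).
mortality_b = 0.294   # subphenotype B
mortality_a = 0.085   # subphenotype A

crude_or = odds(mortality_b) / odds(mortality_a)
print(f"crude OR from rounded rates: {crude_or:.2f}")
```

The result is about 4.48, in line with the reported unadjusted OR of 4.492; the small gap is what one would expect from rounding the percentages.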

Table 3 Univariate and multivariate analysis of sepsis subphenotypes associated with in-hospital mortality for all included patients

Discussion

In this study, we demonstrated that K-means cluster analysis, using only routinely available clinical data as inputs, could identify sepsis subphenotypes. We captured 11 representative and easily accessible variables related to different organ systems and observed important differences between the two identified subphenotypes. Additionally, patients in subphenotype B had significantly higher mortality than those in subphenotype A, even after adjusting for potential confounders. Taken together, this approach might be a valuable tool for prognostic stratification of sepsis in clinical practice.

Data-driven cluster analysis is widely used for disease classification [17]. Among several cluster analysis methods, the K-means algorithm is the most popular machine-learning clustering algorithm [18]. To date, several researchers have successfully applied K-means cluster analysis to identify asthma phenotypes [17], Parkinson's disease subtypes [19] and complex regional pain syndrome phenotypes [20]. Thus, the K-means cluster analysis used in this study seemed to be an appropriate choice, since it maximizes separation between clusters and offers the greatest scope for identifying distinct groups of patients [16].

Identification of distinct subphenotypes in sepsis is a key component of personalized medicine, which may help in better risk stratification and treatment decisions. Notably, translating research into clinical practice remains one of the biggest challenges in subphenotype identification [10]. For example, HBP (heparin-binding protein), Ela (neutrophil elastase 2), PRTN3 and MMP8 have been reported to be key factors in phenotype identification for septic acute kidney injury [7]. However, these variables are not widely available as routine clinical tests, which has hindered their translation into clinically useful applications. In the current study, we used 11 routinely available clinical features to derive the clinical phenotypes of sepsis in the first 24 h after ICU admission. The values of these parameters reflect the state of different target organs. These findings may allow personalized physiologic medicine to be practiced at the bedside for critically ill patients with sepsis.

Understanding the pathophysiologic mechanisms underlying the subphenotypes could help in identifying them in patients with sepsis. Of the two subphenotypes identified in sepsis, subphenotype B was associated with higher levels of lactate, glucose and creatinine and lower levels of hemoglobin and PaO2/FiO2 ratio. Lactate is a marker of abnormal microcirculation [21], reflective of tissue hypoperfusion and cellular hypoxia [22], and has been reported to be a sensitive indicator of prognosis in sepsis [23]. Disordered glucose metabolism is common in critically ill patients, especially those with sepsis [24]. A mild elevation of glucose is acceptable because it allows the host to survive during severe stress. However, an excessively high level of glucose may cause immunosuppression and oxidative stress, which are associated with worse outcomes [25, 26]. Serum creatinine is the most widely used measure of renal function in clinical practice. The significantly elevated creatinine in subphenotype B indicated that subjects in this group had a higher proportion of renal dysfunction than patients in subphenotype A [27]. This result is in accord with a recent study in which serum creatinine was a predictor of mortality in sepsis [6]. Additionally, individuals in subphenotype B were found to have a lower PaO2/FiO2 ratio than those in subphenotype A. A previous study also demonstrated that the PaO2/FiO2 ratio is an important indicator of respiratory function and is widely used to differentiate groups of patients at high risk for adverse clinical outcomes [28]. Taken together, these differences help explain why patients in subphenotype B had worse clinical outcomes than those in subphenotype A.

In the current study, we used a classification approach based on routinely available clinical data to yield insights into sepsis subphenotypes. Embedding this classification model into the electronic health record (EHR) system would allow for rapid bedside screening. Moreover, automating the capture and processing of an abundant data stream in the ICU would contribute to evaluating prognostic or therapeutic differences among septic patients. However, this advanced method for sepsis classification would require additional external validation before clinical implementation.

This study has several limitations. First, because the etiology of sepsis is complex, known rules and a limited set of important variables might not be enough to discover all subphenotypes. Additionally, the MIMIC-IV database covers the period from 2008 to 2019 without the exact year of patient admission, and changes in the management of sepsis may have occurred in the interim, which may have introduced bias. We attempted to partly mitigate this effect by applying a unified definition of sepsis (Sepsis-3) and dividing patients into four groups by admission time period (2008–2010, 2011–2013, 2014–2016 and 2017–2019) for inclusion in the model. The results held true even after adjusting for admission time period. Second, this study focused only on baseline data, using the most abnormal value in the first 24 h of ICU admission, which precluded dynamic classification of sepsis subphenotypes; the natural progression of sepsis over time might lead to changes in subphenotype. Nonetheless, our static classification model could provide potential suggestions for care providers and prevent alert fatigue, which has been blamed for high override rates in dynamic systems. Third, the inclusion of additional indicators, such as type of infection, infection site, microbiology data, cigarette smoking data and alcohol consumption data, may provide additional insights. However, these were not available in this clinical database. Fourth, the proportions of missing lactate and PaO2/FiO2 values were 22.18% and 27.39%, respectively. Although we used a multiple imputation approach to handle these missing data in the clustering model, this may still produce biased estimates of the relative risk. Fifth, sepsis was defined as a confirmed or suspected infection combined with a SOFA score ≥ 2. However, for chronic conditions we could obtain only ICD-9 (or ICD-10) codes, not exact values, from the electronic database. Therefore, we could not calculate the baseline SOFA score, which may have led to an overestimation of sepsis, and the results should be interpreted with caution. Sixth, the clinical subphenotypes in this study were derived from a single-center retrospective database in the USA. Thus, it remains unknown whether these subphenotypes exist in a more diverse population of critically ill patients with sepsis. Additional studies are needed to further verify and validate the two distinct subphenotypes of sepsis.

Conclusion

Two sepsis subphenotypes with different clinical outcomes can be rapidly identified using the K-means clustering analysis based on routinely available clinical data. This finding may help clinicians to rapidly and easily identify the subphenotype of sepsis at the bedside.