Background

Bloodstream infections (BSI) are among the most serious infections causing sepsis or septic shock and are highly prevalent in hospitalized patients requiring intensive care, which can lead to prolonged hospital stays and high healthcare costs [1, 2]. Currently, the main therapeutic option for BSI patients is antimicrobial therapy, combined with optimal management of the consequences of infection (such as shock, organ dysfunction or metastatic suppurative complications) and surgical treatment (including debridement, abscess drainage, or removal of intravascular devices) when necessary. To achieve optimal clinical outcomes, timely and critical assessment of BSI patients is necessary to ensure prompt, effective, and targeted treatment [3]. However, the current standard of care mostly depends on blood culture-based diagnosis, which is often extremely slow [4]. Therefore, antimicrobial therapy is still empiric, targeting the most likely etiologic pathogens. Moreover, in recent years, there has been a rapid increase in the occurrence of antimicrobial-resistant pathogens in BSI, limiting treatment options and affecting prognostic outcomes [3].

In the intensive care unit (ICU), BSI may be hospital-acquired, community-acquired or healthcare-associated. These categories differ in epidemiology, risk factors, microbiology, sources, systemic responses and prognostic outcomes [5], which increases disease complexity and heterogeneity. To address this heterogeneity, clinicians have tried to cluster critically ill patients into different sub-phenotypes based on clinically objective parameters [6,7,8,9]. To a certain degree, this improved recognition allows further understanding of disease classification and pathophysiology, potentially leading to precision treatments that reduce morbidity and mortality rates among critically ill patients. Thus, to identify patients at a high risk of BSI and to inform targeted/personalized management, there is an urgent need for better characterization of BSI phenotypes.

In this study, we hypothesized that applying a clustering approach to a database of BSI patients can help better characterize different BSI phenotypes, which may be of significance in constructing an easy-to-use nomogram for screening high-risk patients. To determine whether the developed model accurately predicts poor outcomes for BSI patients in ICU, we externally validated this model using an independent cohort.

Methods

Study design and participating cohorts

This retrospective observational study was conducted on two primary cohorts. For the development cohort, we collected data from patients who presented to the ICU at the First Affiliated Hospital of Xiamen University between January 2016 and December 2021. For external validation, we used data from an independent cohort at the First Hospital of Shanxi Medical University, which was retrospectively enrolled over a similar period.

The study was carried out according to the principles of the Declaration of Helsinki and was approved by the Medical Ethics Committees of the First Affiliated Hospital of Xiamen University (approval number: ky2021044) and the First Hospital of Shanxi Medical University (approval number: 2021-K121). Since the study was retrospectively conducted and no interventions were applied, the Ethics Committees of the First Affiliated Hospital of Xiamen University and the First Hospital of Shanxi Medical University approved the waiver of informed consent.

Patients were eligible for inclusion if they were aged ≥ 18 years and had a clinically positive blood culture for a bacterium or fungus obtained during their stay in the ICU [5]. The exclusion criterion was incomplete core data, especially a lack of information on the specific treatment received before the diagnosis of BSI or on prognostic outcomes.

Data collection

Research coordinators and board-certified ICU physicians collected demographic and clinical data from the patients using a case report form. They reviewed the electronic medical records and verified the final data. The following information was collected: demographic characteristics (age, gender, BMI, etc.), comorbidities, conditions before BSI (mechanical ventilation, deep vein catheterization, antibiotic use, vasoconstrictor use, etc.), ICU complications (multiple organ dysfunction syndrome (MODS), acute respiratory distress syndrome (ARDS), septic shock, acute kidney injury (AKI) and disseminated intravascular coagulation (DIC)), outcomes (hospital stays, ICU stays and ICU mortality), primary site of infection, vital signs at baseline and results from laboratory examinations. Vital signs at baseline and laboratory indicators, including inflammatory indicators and organ function damage indicators, were collected at the time of blood sample collection. The baseline Sequential Organ Failure Assessment (SOFA) score and Pitt bacteremia score (Pitt score) were also calculated at the same time point [10, 11].

The main outcome was ICU mortality. The secondary outcomes were days of ICU stay, days of hospital stay, and ICU-associated complications such as MODS or septic shock.

Statistical analysis

Categorical variables are presented as numbers (percentages), while continuous variables are presented as means ± SD or medians (IQR), depending on whether they were normally distributed. Statistical analyses were performed using R version 3.5.3 for Windows (http://www.r-project.org/). Categorical variables were compared by Chi-square or Fisher’s exact tests. Normally distributed variables were compared by the Student’s t test. The Mann-Whitney U test, a non-parametric test, was performed to compare variables that were not normally distributed. Based on all baseline variables, the partitioning-based k-means algorithm was used to discover groups of BSI phenotypes with different prognostic outcomes. Kaplan-Meier curves were constructed and compared using log-rank tests to validate the results of k-means. Univariate Cox regression analyses were used to estimate the risk associated with each cluster.
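The clustering step can be illustrated with a minimal sketch. The study's analyses were run in R; the Python below is only an illustration of the k-means idea (standardize the variables, assign each patient to the nearest centroid, recompute centroids, repeat). The function names, the two toy variables (a SOFA-like score and days in hospital before BSI), the toy values, and the deterministic farthest-first initialization are all assumptions made for this example and are not taken from the study.

```python
from statistics import mean, pstdev

def standardize(rows):
    """Z-score each column so that no variable dominates the distance."""
    cols = list(zip(*rows))
    mus = [mean(c) for c in cols]
    sds = [pstdev(c) or 1.0 for c in cols]  # guard against constant columns
    return [tuple((x - m) / s for x, m, s in zip(r, mus, sds)) for r in rows]

def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def init_centroids(points, k):
    """Deterministic farthest-first start (an assumption of this sketch)."""
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(
            max(points, key=lambda p: min(sq_dist(p, c) for c in centroids))
        )
    return centroids

def assign(points, centroids):
    """Label each point with the index of its nearest centroid."""
    return [
        min(range(len(centroids)), key=lambda c: sq_dist(p, centroids[c]))
        for p in points
    ]

def kmeans(points, k, max_iter=100):
    """Lloyd's algorithm: alternate assignment and centroid update."""
    centroids = init_centroids(points, k)
    for _ in range(max_iter):
        labels = assign(points, centroids)
        updated = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            # Keep the old centroid if a cluster happens to be empty.
            updated.append(
                tuple(mean(col) for col in zip(*members)) if members else centroids[c]
            )
        if updated == centroids:  # converged: centroids no longer move
            break
        centroids = updated
    return assign(points, centroids), centroids

# Hypothetical toy patients: (SOFA-like score, days in hospital before BSI).
toy_patients = [(4, 2), (5, 1), (6, 3), (5, 2), (11, 14), (12, 10), (10, 12), (13, 15)]
toy_labels, toy_centroids = kmeans(standardize(toy_patients), k=2)
print(toy_labels)
```

In practice, categorical baseline variables would need to be encoded numerically before clustering, and the number of clusters would be chosen with a separate criterion; neither step is shown here.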

In this study, a random forest model was developed to identify the predictors of the clusters. The selected predictors were subjected to multivariate logistic regression analysis, and a nomogram was developed. For easy clinical application, a bloodstream infections clustering (BSIC) score was derived from the nomogram. The discriminative ability of the nomogram was measured by the area under the receiver operating characteristic curve (AUC). Sensitivity, specificity, positive predictive value (PPV) and negative predictive value (NPV) were calculated. The results were validated in the validation cohort of 310 adult BSI patients. We considered p values of less than 0.05 to be significant, and all tests were two-tailed.
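As a brief illustration of what the AUC measures, the sketch below computes it as the probability that a randomly chosen cluster 2 patient has a higher score than a randomly chosen cluster 1 patient, with ties counted as one half (the rank-based, Mann-Whitney form of the AUC). The function name and the toy scores are hypothetical and are not study data.

```python
def auc(scores, labels):
    """Rank-based AUC: P(score of a positive > score of a negative), ties = 0.5.

    labels use 1 for the positive class (here, cluster 2) and 0 otherwise.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical scores for eight patients (not study data).
toy_scores = [1, 3, 5, 6, 4, 7, 8, 9]
toy_cluster2 = [0, 0, 0, 0, 1, 1, 1, 1]
print(auc(toy_scores, toy_cluster2))  # 14 of 16 pairs are ordered correctly: 0.875
```

An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation of the two clusters by the score.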

Results

Patient characteristics

In the discovery cohort, 383 patients were initially recruited, of whom 23 patients without complete data were excluded, leaving 360 patients who were eligible for analysis. For the validation cohort, 313 patients were initially enrolled, and 3 patients were excluded due to missing prognostic information. A final total of 310 patients were included in the validation cohort. Table 1 and S-Table 1 show the baseline characteristics (demographic characteristics, pre-existing conditions, primary sites of infection, vital signs at baseline, laboratory examination outcomes and specific treatments before BSI) as well as prognostic outcomes (ICU complications and outcomes) of patients in the discovery and validation cohorts. In the discovery cohort, the median age of the patients was 64 years, with 37.2% of the patients being female. The baseline SOFA score and ICU mortality rate for this cohort were 8 and 27.2%, respectively. In the validation cohort, the median age and proportion of female patients were 62 years and 48.7%, respectively, while the baseline SOFA score and ICU mortality rate were 7 and 25.5%, respectively.

Table 1 Baseline characteristics and prognosis of patients in the discovery and validation cohorts

Characteristics of clusters in the discovery cohort

Based on baseline variables, k-means analysis revealed two distinct clusters with differing prognostic outcomes. The clusters were well separated from one another, as shown in S-Fig. 1. Patients in the two clusters had distinct baseline characteristics and prognostic outcomes (Table 2 and S-Table 2).

Table 2 Patient characteristics and prognostic differences between the clusters of the discovery cohort

Patients in cluster 1 (n = 211) were more likely to have been transferred directly from the emergency department to the ICU, and had spent less time in the hospital before BSI. These patients had milder organ dysfunction (lower SOFA scores) and received fewer invasive treatments. The most common primary site of infection was the abdomen, followed by the urinary system, with fewer patients exhibiting pulmonary infections. Regarding prognosis, patients in cluster 1 were less likely to suffer from MODS, had lower incidences of ARDS and septic shock, and had a lower ICU mortality rate (17.1%). Patients in cluster 2 (n = 149) had significantly higher SOFA scores, more ICU complications (MODS, ARDS, and septic shock), as shown in Table 2 and Fig. 1A, and poorer prognostic outcomes (longer hospital and ICU stays, and higher ICU mortality (41.6%)).

Fig. 1

The organ dysfunction of patients in the discovery (A) and validation cohorts (B). Patients in cluster 2 had significantly more ICU complications (MODS, ARDS, septic shock, AKI and DIC). Abbreviation: MODS, multiple organ dysfunction syndrome; ARDS, acute respiratory distress syndrome; AKI, acute kidney injury; DIC, disseminated intravascular coagulation

Fig. 2

The Kaplan-Meier survival curves of patients in the discovery (A) and validation cohorts (B).

The assumption of the proportional hazards was confirmed in Cox regression analysis. The Kaplan-Meier curve revealed a higher risk of death for cluster 2 patients, compared to cluster 1 (hazard ratio: 2.31 [95% CI, 1.53 to 3.51], p < 0.001, Fig. 2A). A higher proportion of patients in cluster 2 had been subjected to mechanical ventilation (117/149, 78.5%), deep vein catheterization (133/149, 89.3%), antibiotics (145/149, 97.3%) and vasoconstrictor agents (105/149, 70.5%) before the diagnosis of bloodstream infections.
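The reported hazard ratio and confidence interval can be cross-checked with a short calculation. The sketch below assumes a Wald-type interval that is symmetric on the log scale, recovers the standard error of log(HR) from the interval width, and derives an approximate two-sided p value. This is a plausibility check under that assumption, not the method used in the study; the function name is hypothetical.

```python
import math

def wald_from_hr(hr, lower, upper, z_crit=1.959964):
    """Approximate SE of log(HR), z statistic and two-sided p from a 95% CI.

    Assumes the interval is log(HR) +/- z_crit * SE (a Wald interval).
    """
    se = (math.log(upper) - math.log(lower)) / (2 * z_crit)
    z = math.log(hr) / se
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail probability
    return se, z, p

# Discovery cohort values as reported: HR 2.31 (95% CI, 1.53 to 3.51).
se, z, p = wald_from_hr(2.31, 1.53, 3.51)
print(round(se, 3), round(z, 2), p < 0.001)
```

Under this approximation the implied p value is below 0.001, in line with the reported result. A p value from a log-rank test or from the fitted Cox model can differ somewhat from this Wald approximation.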

Predicting the identified clusters using baseline variables

Using the random forest, four baseline variables (vasoconstrictor use before BSI, MV before BSI, DVC before BSI, and antibiotic use before BSI; Figure S2) were identified as predictors of the identified clusters in the discovery cohort. We then created a nomogram that integrated all four significant independent predictors. For easy clinical application, we used this four-variable nomogram to develop a bloodstream infections clustering (BSIC) score (Fig. 3A). Figure 3B shows adequate calibration of the score, as the proportion of patients attributed to cluster 2 increased with the score. The nomogram and BSIC score showed good discrimination, with an AUC of 0.96 (95% CI, 0.94 to 0.98 and 0.74 to 0.98, respectively; Fig. 3C). The optimal cut-off value of the score was 5. The accuracy, sensitivity and specificity at this cut-off value were 91%, 86% and 95%, respectively, with a PPV of 92% and an NPV of 90%.
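The relationship between these performance measures can be made concrete with a small sketch. The cluster sizes (149 patients in cluster 2, 211 in cluster 1) come from the discovery cohort, but the study does not report the underlying confusion matrix. The true/false positive and negative counts below are an assumption: one set of integers that is consistent with the reported percentages after rounding. The function name is hypothetical.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Standard 2x2 diagnostic measures; 'positive' means assigned to cluster 2."""
    return {
        "accuracy": (tp + tn) / (tp + fp + fn + tn),
        "sensitivity": tp / (tp + fn),   # true positive rate
        "specificity": tn / (tn + fp),   # true negative rate
        "ppv": tp / (tp + fp),           # positive predictive value
        "npv": tn / (tn + fn),           # negative predictive value
    }

# Assumed counts (not reported in the study): tp + fn = 149, tn + fp = 211.
metrics = diagnostic_metrics(tp=128, fp=11, fn=21, tn=200)
rounded = {name: round(100 * value) for name, value in metrics.items()}
print(rounded)
```

With these assumed counts the rounded values are 91% accuracy, 86% sensitivity, 95% specificity, 92% PPV and 90% NPV, which matches the figures reported at the cut-off of 5.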

Fig. 3

Prediction of the identified clusters in patients with BSI. (A) Nomogram to predict the identified clusters. Points are assigned for each of the 4 baseline variables by drawing a line upward from the corresponding value to the “Points” line. The sum of these four points, plotted on the “Total points” line, corresponds to the probability of belonging to cluster 2. The bloodstream infections clustering (BSIC) score, derived from the nomogram, allows the user to partition patients into 2 clusters by calculating the total points. (B) Calibration plot of the BSIC score. The proportion of patients attributed to cluster 2 increased with the score. (C) Receiver operating characteristic (ROC) curves for estimating discrimination. Abbreviations: DVC, deep vein catheterization; MV, mechanical ventilation; BSI, bloodstream infection; AUC, area under the curve

Validation of the BSIC score

The four baseline variables (vasoconstrictor use before BSI, MV before BSI, DVC before BSI, and antibiotic use before BSI) were used to predict cluster labels of the 310 BSI patients in the validation cohort with BSIC scores. Of the 310 patients, 124 were assigned to cluster 1 and 186 were assigned to cluster 2. Patients’ baseline characteristics and prognostic differences between the predicted clusters of the validation cohort are shown in S-Table 3. Consistent with findings from the discovery cohort, cluster 2 patients had higher SOFA scores, more ICU complications (MODS, ARDS, septic shock, AKI, DIC), and poorer prognostic outcomes (longer hospital stays and ICU stays, higher ICU mortality), compared with patients in cluster 1. These results are also shown in Figs. 1B and 2B. The Kaplan-Meier analysis revealed a higher risk of death for cluster 2 patients, compared to cluster 1 (hazard ratio: 2.23 [95% CI, 1.34 to 3.71], p = 0.001).

Table 3 Prognostic differences between the predicted clusters of the validation cohort

Moreover, the species of pathogens in the discovery and validation cohorts are shown in Figure S3. Escherichia coli, Klebsiella pneumoniae and Staphylococcus were the three most common pathogens in the discovery cohort. In contrast, the most common pathogens in the validation cohort were Staphylococcus, Candida and Klebsiella pneumoniae.

Discussion

In recent years, several scoring systems have been developed for stratifying the risk of patients with sepsis [12, 13], but not for patients with BSI. Therefore, we identified independent parameters from available data during ICU stay and constructed a novel score for predicting the prognostic outcomes of BSI, which may promote patient stratification and inform personalized interventions. The clustering approach combining baseline variables allowed us to characterize two distinct BSI phenotypes, the clinical profiles of which correspond to “good prognosis” patients (cluster 1) and “poor prognosis” patients (cluster 2). The established nomogram incorporated four factors: vasoconstrictor use before BSI, MV before BSI, DVC before BSI, and antibiotic use before BSI. The novel prediction instrument showed good discrimination as well as calibration, and was also successfully externally validated.

Initiation of vasopressors over the course of critical illness is usually due to profound and durable hypotension, which is independently associated with increased mortality [14]. Hence, vasoconstrictor use before BSI reveals illness severity. Nosocomial infections are an important determinant of the outcomes of ICU patients [15]. About 70% of nosocomial BSI in the ICU are secondary to other primary infections, and among them, catheter-related infections and respiratory tract infections are the leading sources of secondary episodes [5, 15,16,17]. Bloodstream infections are associated with prolonged mechanical ventilation and deep vein catheter indwelling [18, 19]. Our results are consistent with those of previous studies that showed that MV use before BSI as well as DVC interventions before BSI are two important predictors of poor prognostic outcomes for BSI patients. Additionally, inappropriate applications of antibiotics induce bacterial resistance [20], and antibiotic resistance in pathogens is a challenge that is associated with high morbidity and mortality rates [21]. Therefore, in our study, the antibiotics used before BSI were independent risk factors for its development.

The BSIC score has several strengths. A notable strength is its ease of use. The parameters obtained from the patient status in the early stages of the ICU stay are well-defined and easily obtainable. Our model was constructed using baseline variables and does not require detailed laboratory examination results. Another advantage of the BSIC score is that it was subjected to an independent external validation process and showed good discrimination, thereby minimizing interpretation variability, improving its generalizability, and lending credibility to its usefulness in different BSI cohorts. In addition, it can help identify individuals at a high risk of BSI, for whom treatment with broad-spectrum antibiotics and effective early-stage rapid microbiological identification should be considered. Therefore, this score can be used as a screening tool to improve clinical care decisions for patients at a high risk of BSI.

This study has several limitations. First, the BSIC score was developed based on data retrospectively obtained from cohorts at two centres, and only patients with positive blood cultures were included in this study. Second, other valuable predictors may not have been included in our analysis. The presented score may be improved as additional predictive variables are incorporated. Third, it was not determined whether interventions that are based on the BSIC score can improve the outcomes of BSI patients. Finally, the score only applies to adult patients in the ICU. Its purpose is to predict the prognosis of BSI patients during their ICU stay. Thus, further studies should be performed to determine whether this score can be extended to all BSI patients.

Conclusions

In conclusion, using a clustering approach in a cohort of BSI patients, we identified two distinct BSI phenotypes that will help physicians to identify high-risk patients. Four independent predictors (vasoconstrictor use before BSI, MV before BSI, DVC before BSI, and antibiotic use before BSI) were identified. These predictors are readily available during the early ICU stay and are easy to obtain. They can be used to construct an easy-to-use score for predicting the prognosis of BSI patients in the ICU. The significance of this characterization in patient management and prognosis should be evaluated.