Background

Sepsis is defined as life-threatening organ dysfunction caused by a dysregulated host response to infection [1]. More than 19 million people are estimated to suffer from sepsis each year, and it has become a major threat to human life [2]. Acute respiratory distress syndrome (ARDS) is regarded as the earliest and most common complication of sepsis, leading to excessive and uncontrolled inflammatory reactions and an increased mortality rate in sepsis patients, especially critically ill patients [3, 4]. Previous studies have shown that the risk of death in sepsis patients complicated with ARDS is as high as 20–50% [5, 6]. Therefore, it is essential to pay attention to the risk of ARDS in sepsis patients.

Several studies have indicated that biomarkers and sociodemographic and clinical characteristics are related to the risk of ARDS in patients with sepsis [4, 6–8]. Wang Q et al. found that microRNA 103 (MIR103) and microRNA 107 (MIR107) were predictive biomarkers for ARDS risk in sepsis patients [6]. A retrospective cohort study found that oral glucocorticoid use before admission was associated with a lower incidence of early ARDS among ICU sepsis patients [7]. Nam and colleagues also reported that pneumonia, coagulation score and central nervous system score were associated with the risk of ARDS in Korean patients with sepsis, and these were also considered risk factors for 28-day mortality [8]. In general, the risk of developing ARDS in patients with sepsis may be influenced by multiple factors, and the development of predictive models is of great importance for risk assessment [9]. ARDS risk prediction models have been proposed for different populations [10, 11]. The lung injury prediction score (LIPS) has been used to identify patients at high risk of ARDS among non-emergency department hospitalized patients [10], as well as patients at high risk for acute lung injury early in the course of their illness and before intensive care unit (ICU) admission [12]. In addition, Lin F et al. constructed a model combining the ratio of partial pressure of oxygen to fraction of inspired oxygen (PaO2:FiO2), platelet count, lactate dehydrogenase, creatinine, and procalcitonin levels to predict the ARDS risk among patients with severe acute pancreatitis [11]. Nevertheless, to the best of our knowledge, few studies have established a predictive model that combines multiple predictors to predict the risk of ARDS in sepsis patients.

Herein, the purpose of this study was to develop and validate a model for predicting ARDS risk in patients with sepsis, based on the Medical Information Mart for Intensive Care (MIMIC) IV database.

Methods

Source of data

We conducted a retrospective cohort study based on the MIMIC-IV database, a single-center, freely accessible database that contains comprehensive, high-quality data on 53,130 ICU patients at the Beth Israel Deaconess Medical Center (BIDMC) between 2008 and 2019 [13]. This study used de-identified data and was approved by the Massachusetts Institute of Technology and the Institutional Review Board of BIDMC [14]. Informed consent was obtained from all participants. All methods were carried out in accordance with relevant guidelines and regulations (Declaration of Helsinki).

Selection of participants

Sepsis was defined as a suspected infection combined with an acute increase in SOFA score ≥ 2 according to the Sepsis-3 criteria [1]. All participant information was derived from the MIMIC-IV database. Participants were included in the study if they met the definition of sepsis, were older than 18 years, and did not develop ARDS on admission or within 2 days of admission. The exclusion criteria were as follows: (1) patients who stayed in the ICU for less than 24 h; (2) patients who had abnormal data records (height ≤ 50 cm or weight ≤ 1 kg). If a patient was admitted repeatedly between 2008 and 2019, only the record of the first ICU admission was used. After applying the inclusion and exclusion criteria, a total of 16,523 patients with sepsis were included in this study (Fig. 1).
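The selection rules above can be sketched as a simple filter. This is a minimal illustration, not the authors' extraction code: the `Stay` record and all of its field names are hypothetical, and applying the first-admission rule together with the other criteria in a single step is an assumption.

```python
from dataclasses import dataclass


@dataclass
class Stay:
    """Hypothetical summary of one ICU stay; field names are illustrative only."""
    is_first_icu_admission: bool
    has_sepsis: bool             # Sepsis-3: suspected infection + SOFA increase >= 2
    age_years: float
    ards_within_2_days: bool     # ARDS on admission or within 2 days of admission
    icu_hours: float
    height_cm: float
    weight_kg: float


def is_eligible(stay: Stay) -> bool:
    """Apply the inclusion and exclusion criteria described in the text."""
    included = (
        stay.has_sepsis
        and stay.age_years > 18
        and not stay.ards_within_2_days
    )
    excluded = (
        stay.icu_hours < 24          # ICU stay shorter than 24 h
        or stay.height_cm <= 50      # abnormal height record
        or stay.weight_kg <= 1       # abnormal weight record
    )
    return stay.is_first_icu_admission and included and not excluded


example = Stay(True, True, 64, False, 72, 170, 80)
print(is_eligible(example))
```

Keeping inclusion and exclusion as separate expressions mirrors the way the criteria are listed, which makes each rule easy to audit against the flowchart.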

Fig. 1 The flowchart for population selection

Data collection

We extracted the following variables from the MIMIC-IV database: demographic data [age, gender, ethnicity, marital status, insurance status, admission type, body mass index (BMI, kg/m2) and comorbidities]; vital signs and laboratory data within 48 h after ICU admission [respiratory rate (times/min), systolic blood pressure (SBP, mmHg), diastolic blood pressure (DBP, mmHg), heart rate (times/min), temperature (℃), urine output (mL), partial pressure of carbon dioxide (PCO2, mmHg), FiO2, mmHg, bicarbonate (HCO3), hemoglobin (g/dL), neutrophil (NEUT), lymphocyte (LYM), platelet (PLT, K/L), white blood cell (WBC, K/L), albumin (ALB), alanine aminotransferase (ALT, U/L), aspartate aminotransferase (AST, U/L), creatinine (mg/dL), blood urea nitrogen (BUN, mg/dL), glucose (mg/dL), C-reactive protein (CRP, mg/L), total cholesterol (TC, mg/dL), triglycerides (TG, mg/dL), low density lipoprotein cholesterol (LDL-C, mg/dL), high density lipoprotein cholesterol (HDL-C, mg/dL)]; severity scores [Sequential Organ Failure Assessment (SOFA) score, Simplified Acute Physiology Score (SAPS II)]; medications (heparin, aspirin, antibiotics and vasopressors); and treatments [continuous renal replacement therapy (CRRT), mechanical ventilation (MV), red blood cell (RBC) transfusion, PLT transfusion, frozen plasma]. If a patient received a laboratory test more than once during hospitalization, only the initial result was included in this study. ARDS in patients in the MIMIC-IV database was diagnosed according to the Berlin criteria [15]. The Berlin criteria include: acute onset, PaO2/FiO2 ≤ 300 mmHg, positive end-expiratory pressure (PEEP) ≥ 5 cm H2O on the first day of ICU admission, bilateral infiltrates on chest radiograph, and absence of heart failure [16].
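The Berlin criteria listed above amount to a conjunction of five conditions, which can be sketched as a single predicate. The function and parameter names are hypothetical, and the sketch assumes FiO2 is supplied as a fraction (0.21 to 1.0) so that PaO2/FiO2 is in mmHg; it is not the authors' labelling procedure.

```python
def meets_berlin_criteria(
    acute_onset: bool,
    pao2_mmhg: float,
    fio2_fraction: float,
    peep_cmh2o: float,
    bilateral_infiltrates: bool,
    heart_failure: bool,
) -> bool:
    """Return True when all five criteria named in the text are satisfied."""
    # Assumption: FiO2 is a fraction of inspired oxygen, not a percentage.
    if not 0.21 <= fio2_fraction <= 1.0:
        raise ValueError("fio2_fraction must be between 0.21 and 1.0")
    pf_ratio = pao2_mmhg / fio2_fraction
    return (
        acute_onset
        and pf_ratio <= 300          # PaO2/FiO2 <= 300 mmHg
        and peep_cmh2o >= 5          # PEEP >= 5 cm H2O
        and bilateral_infiltrates    # on chest radiograph
        and not heart_failure
    )


# PaO2 of 80 mmHg on FiO2 0.5 gives a ratio of 160 mmHg.
print(meets_berlin_criteria(True, 80, 0.5, 8, True, False))
```

Rejecting out-of-range FiO2 values guards against mixing percentages and fractions, a common source of error when the ratio is derived from charted data.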

Outcomes and follow-up

In this retrospective cohort study, the outcome was defined as the occurrence of ARDS in ICU patients with sepsis. Follow-up started on the date of the patient’s admission, and the median follow-up time was 8.47 (5.20, 16.20) days.

Development and validation of prediction model

All eligible sepsis patients (n = 16,523) were randomly divided into a training set (n = 11,566) and a testing set (n = 4957) in a ratio of 7:3. The prediction model was developed in the training set and validated in the testing set. In the training set, univariate logistic regression analysis was used to screen factors with P < 0.05; these factors, together with factors reported in the literature to be associated with the risk of ARDS in septic patients, were entered into a multivariate model with stepwise regression to select possible predictors. These predictors were used to construct the model for predicting the ARDS risk of sepsis patients. The area under the curve (AUC) of the receiver operating characteristic (ROC) curve was used to compare the predictive performance of the constructed model with that of the SOFA and SAPS II scoring systems. Calibration curves were used to assess the performance of the prediction model in the training and testing sets.
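The 7:3 split and the AUC used for comparison can be illustrated with a short standard-library sketch. The analysis itself was run in SAS; the function names, the seed and the shuffling approach here are assumptions made for illustration, and the AUC is computed in its pairwise (Mann–Whitney) form rather than by tracing the ROC curve.

```python
import random


def split_indices(n, train_fraction=0.7, seed=0):
    """Randomly split range(n) into training and testing index lists."""
    indices = list(range(n))
    random.Random(seed).shuffle(indices)   # seed is an arbitrary assumption
    cut = round(train_fraction * n)
    return indices[:cut], indices[cut:]


def auc(scores, labels):
    """AUC as the probability that a random case outranks a random non-case.

    Ties count as half a win. The pairwise loop is O(n_pos * n_neg), so it is
    suited to illustration rather than to large cohorts.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one case and one non-case")
    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(pos) * len(neg))


train, test = split_indices(16523)
print(len(train), len(test))
```

With n = 16,523 and a training fraction of 0.7, the cut falls at 11,566, leaving 4957 for testing, which matches the set sizes reported in the text.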

Statistical analysis

Normally distributed measurement data were described as mean ± standard deviation (Mean ± SD), and non-normally distributed measurement data as median and quartiles [M (Q1, Q3)]. Differences between groups were compared using the t-test and the Mann–Whitney U test, respectively. Categorical data were presented as the number of cases and the constituent ratio [N (%)], and groups were compared using the χ2 test.

We compared the training set with the testing set. In the training set (n = 11,566), patients with sepsis were divided into an ARDS group (n = 2422) and a non-ARDS group (n = 9144) according to whether ARDS occurred, and we also compared these two groups. Lastly, we evaluated the predictive performance of the developed model using ROC and calibration curves. The relative risk (RR) and 95% confidence interval (CI) were calculated. In addition, variables with more than 20% missing values (ALT, ALB, TG, LDL-C, HDL-C, AST, NEUT, SaO2, TC and CRP) were deleted, and multiple imputation was used to fill in variables with less than 20% missing values. All analyses were conducted using SAS 9.4 software (SAS Institute Inc., Cary, NC, USA). P < 0.05 was considered statistically significant.
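The missing-data rule described above can be sketched as a threshold on the fraction of missing entries per variable. The helper names and the toy values are invented for illustration, a missing entry is represented as `None`, and keeping a variable with exactly 20% missing values is an assumption, since the text specifies only "more than" and "less than" 20%. The imputation step itself is not shown.

```python
def missing_fraction(values):
    """Fraction of entries that are missing (represented here as None)."""
    if not values:
        raise ValueError("empty column")
    return sum(v is None for v in values) / len(values)


def partition_by_missingness(columns, threshold=0.20):
    """Return (kept, dropped) variable names; drop when missing > threshold."""
    kept, dropped = [], []
    for name, values in columns.items():
        if missing_fraction(values) > threshold:
            dropped.append(name)
        else:
            kept.append(name)   # candidates for imputation
    return kept, dropped


# Toy data, invented for illustration only (not taken from MIMIC-IV).
toy = {
    "BUN": [12, 30, None, 18, 25, 40, 22, 15, 19, 28],                 # 10% missing
    "CRP": [None, None, 5.0, None, 12.0, None, 8.0, 3.0, None, 7.0],   # 50% missing
}
print(partition_by_missingness(toy))
```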

Results

Baseline characteristics

The incidence of ARDS was 20.66% in the total population. No differences in baseline information were noted between the training set (n = 11,566) and the testing set (n = 4957) (P > 0.05; Additional file 1: Table S1), suggesting that the two sets were balanced and comparable. The characteristics of the 11,566 patients with sepsis in the training set are displayed in Table 1; of these, 2422 (20.94%) developed ARDS. Sepsis patients who developed ARDS had significantly higher heart rate, respiratory rate, BUN level, PCO2 level and urine output than those who did not. Additionally, compared with sepsis patients without ARDS, those with ARDS were more likely to have chronic pulmonary disease and liver disease and to receive vasopressin, red blood cell transfusion and CRRT (P < 0.05).

Table 1 Baseline characteristics of 11,566 sepsis patients in the training set

Construction of the prediction model

The multivariate logistic regression analysis in the training set identified BMI, respiratory rate, urine output, PCO2, BUN, vasopressin, CRRT, ventilation status, chronic pulmonary disease, malignant cancer, liver disease, septic shock and pancreatitis as possible predictors (Table 2). A prediction model containing these thirteen factors was established to predict the ARDS risk in sepsis patients. To visualize the prediction model, we plotted a nomogram (Fig. 2). For instance, for a patient with sepsis who had septic shock (No), malignant cancer (No), BMI ≥ 30 kg/m2, BUN = 12 mg/dL, pancreatitis (No), chronic pulmonary disease (No), vasopressin (Yes), liver disease (Yes), PCO2 = 53 mmHg, respiratory rate = 32 times/min, CRRT (No), ventilation status = non-invasive ventilation, and urine output = 1120 mL, the total score was 151 points, corresponding to a predicted ARDS risk of 0.481, which was consistent with the actual outcome of this patient (Fig. 3). Additionally, we developed an online prediction nomogram for easy clinical application: https://xuchi777.shinyapps.io/DynNomapp/
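A nomogram of this kind is a graphical form of a logistic regression: each predictor contributes to a linear predictor, which the logistic function maps to a risk between 0 and 1. The sketch below shows that mapping only. The function names are hypothetical and the coefficients in the toy example are invented placeholders, not the fitted values from Table 2; the only number taken from the text is the predicted risk of 0.481.

```python
import math


def logistic(z):
    """Map a linear predictor to a probability."""
    return 1.0 / (1.0 + math.exp(-z))


def logit(p):
    """Inverse of the logistic function: probability to linear predictor."""
    return math.log(p / (1.0 - p))


def linear_predictor(intercept, coefficients, features):
    """Intercept plus the sum of coefficient * feature value."""
    return intercept + sum(
        coefficients[name] * features[name] for name in coefficients
    )


# The reported risk of 0.481 corresponds to a linear predictor just below zero.
print(round(logit(0.481), 3))

# Toy example: placeholder coefficients, NOT the fitted model.
toy_coefficients = {"vasopressin": 0.8, "liver_disease": 0.6}
toy_patient = {"vasopressin": 1, "liver_disease": 1}
z = linear_predictor(-2.0, toy_coefficients, toy_patient)
print(logistic(z))
```

Nomogram points are a rescaling of the same linear predictor, so a total score and the predicted risk carry the same information.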

Table 2 The prognostic factors associated with the risk of ARDS in patients with sepsis

Fig. 2 The nomogram for predicting the ARDS risk in ICU patients with sepsis

Fig. 3 An example of the application of the nomogram

Validation of the prediction model

To assess the predictive ability of the developed prediction model, ROC curves and calibration curves were used. As presented in Table 3, the accuracy, sensitivity, specificity, positive predictive value (PPV) and negative predictive value (NPV) of the prediction model in the training set were 0.732 (95% CI 0.724–0.740), 0.762 (95% CI 0.745–0.779), 0.724 (95% CI 0.714–0.733), 0.422 (95% CI 0.407–0.437) and 0.920 (95% CI 0.914–0.926), respectively. Similarly, in the testing set, the model had an accuracy of 0.705 (95% CI 0.692–0.718), a sensitivity of 0.798 (95% CI 0.773–0.823), a specificity of 0.682 (95% CI 0.668–0.697), a PPV of 0.385 (95% CI 0.364–0.407) and an NPV of 0.931 (95% CI 0.922–0.940). Moreover, the AUC of the constructed prediction model was 0.811 (95% CI 0.802–0.820) in the training set (Fig. 4a) and 0.812 (95% CI 0.798–0.826) in the testing set (Fig. 4b). We also compared the constructed prediction model with the SOFA and SAPS II scoring systems for predicting the ARDS risk in sepsis patients (Table 3). The AUCs of the SOFA score and SAPS II score in the testing set were 0.539 (95% CI 0.518–0.559) and 0.609 (95% CI 0.589–0.629), respectively (Fig. 4c and d), both significantly lower than that of the constructed prediction model (P < 0.001). These results implied that the constructed prediction model had favorable discriminatory ability for predicting ARDS risk in patients with sepsis. In addition, the calibration curves showed good concordance between the predicted and observed risk of ARDS in both the training and testing sets (Fig. 5a and b).
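The reported metrics are linked by simple identities: given sensitivity, specificity and the proportion of patients with ARDS, the accuracy, PPV and NPV follow directly. The sketch below applies those identities to the training-set figures as a consistency check. The function name is hypothetical, and the calculation is a reader-side illustration rather than the authors' SAS procedure.

```python
def metrics_from_rates(sensitivity, specificity, prevalence):
    """Derive accuracy, PPV and NPV from sensitivity, specificity and prevalence."""
    tp = sensitivity * prevalence                # true positives (as proportions)
    fn = (1 - sensitivity) * prevalence          # false negatives
    tn = specificity * (1 - prevalence)          # true negatives
    fp = (1 - specificity) * (1 - prevalence)    # false positives
    return {
        "accuracy": tp + tn,
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
    }


# Training-set values from the text: 2422 ARDS cases among 11,566 patients,
# sensitivity 0.762 and specificity 0.724.
training = metrics_from_rates(0.762, 0.724, 2422 / 11566)
print({name: round(value, 3) for name, value in training.items()})
```

The derived values agree with the reported training-set accuracy (0.732), PPV (0.422) and NPV (0.920) to three decimal places. The modest PPV alongside a high NPV is what these identities produce when roughly one patient in five develops ARDS.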

Table 3 The predictive performance of the prediction model, SOFA and SAPS II

Fig. 4 ROC curves of (a) the established model in the training set; (b) the established model in the testing set; (c) SOFA in the testing set; (d) SAPS II in the testing set

Fig. 5 Calibration curves of (a) the training set and (b) the testing set

Discussion

In this retrospective cohort study, a model for predicting the ARDS risk in sepsis patients admitted to the ICU was developed. On internal validation, the model showed good predictive ability and discrimination.

ARDS is considered a serious, acute inflammatory lung injury that can increase the severity of illness and lead to worse outcomes in patients with sepsis [17]. Zhao J et al. pointed out that sepsis-associated ARDS has higher disease severity and worse clinical outcomes than non-sepsis-associated ARDS [18]. Therefore, early identification of patients with sepsis who are at high risk of developing ARDS is very important. Previous studies have developed and validated models for predicting ARDS risk in patients with traumatic brain injury (TBI) [19], non-emergency department hospitalized patients [10], patients undergoing cardiac surgery [20], and patients with coronavirus disease (COVID-19) [21]. However, these prediction models did not focus on patients with sepsis. In this study, we developed a model based on several clinical indicators to predict the development of ARDS in sepsis patients admitted to the ICU. The model contains thirteen predictors: BMI, respiratory rate, urine output, PCO2, BUN, vasopressin, CRRT, ventilation status, chronic pulmonary disease, malignant cancer, liver disease, septic shock and pancreatitis. Liver disease was identified as a predictor of developing ARDS in this study, which is consistent with prior studies [22, 23]. One study reported that liver disease was an important predictor of in-hospital mortality in patients with sepsis and lung infection [22]. In general, the liver can prevent sepsis from aggravating tissue and organ damage by removing bacteria and regulating the metabolism of inflammatory factors. However, when the liver is injured, the inflammatory response of the lung to septic bacterial infection might increase, leading to an increased risk of ARDS [22, 23]. In the study of Li X et al., among sepsis patients who developed ARDS, respiratory rate was significantly higher in the non-survival group than in the survival group, which also indicated that respiratory rate was associated with the prognosis of these patients [24].

The nomogram has proven to be an effective tool for predicting an individual’s probability of a clinical event, and it is consistent with the requirements of an integrated model [25]. Moreover, the nomogram is simple, intuitive and convenient for clinicians to use in the prognostic prediction of disease [26]. In the present study, we plotted a nomogram to visualize the developed prediction model. Additionally, the ROC curves indicated that the established model had better predictive ability than the SOFA score and SAPS II score. Notably, we have also developed an online prediction system, which may be more convenient for clinical application (https://xuchi777.shinyapps.io/DynNomapp/). The developed predictive model may be a potential tool to guide clinicians in predicting the risk of ARDS in septic patients in the ICU, which may help them take early interventions to prevent ARDS progression and improve clinical outcomes.

The present study had some strengths. Firstly, the relatively large sample size makes the results more convincing. Secondly, we developed an intuitive and easy-to-use model based on clinical indicators to predict the ARDS risk of ICU patients with sepsis. In addition, internal validation showed that the prediction model had good discrimination and accuracy in predicting the risk of ARDS in sepsis patients. Nevertheless, this study also has some limitations. Firstly, because all patients came from the MIMIC-IV database and only septic patients in the ICU were considered, we were unable to confirm whether the developed prediction model is applicable to patients with sepsis who were not admitted to the ICU. More prospective studies are needed to validate this result. Secondly, this was a retrospective cohort study, and some variables with more than 20% missing values (ALT, ALB, TG, LDL-C, HDL-C, AST, NEUT, SaO2, TC and CRP) were deleted, which may affect the results. Thirdly, MIMIC-IV is a single-center database, so the results of this study should be interpreted prudently when applied to other populations. Lastly, external validation is still required in the future.

Conclusion

In conclusion, we developed a prediction model incorporating thirteen clinical features to predict the ARDS risk in ICU patients with sepsis. On internal validation, the model showed good predictive ability and discrimination. Nevertheless, further prospective studies are warranted to validate the effectiveness and applicability of this prediction model.