Background

Ventilator-associated lung injury (VALI) commonly occurs in patients hospitalized in the intensive care units (ICUs). A growing body of studies have shown that VALI is not limited to patients with acute respiratory distress syndrome (ARDS) but might occur in patients undergoing mechanical ventilation (MV) without lung injury; improper placement of the ventilator can also cause VALI [1, 2]. The population undergoing surgeries under anesthesia requiring short-term ventilation is a well-documented group. Each year, approximately 230 million patients worldwide are administered MV for surgical anesthesia. Most of these patients may have no basic lung disease before surgery, but after a short period of MV, the incidence of postoperative pulmonary complications has been shown to be 5% to 40%, and it has been shown to occur in patients undergoing chest or abdominal surgery in particular; these patients are more likely to develop postoperative edema, consolidation, and even ARDS [3]. As it is the most serious postoperative pulmonary complication, ARDS is the leading cause of postoperative respiratory failure [2, 3]. It is worth noting that in patients without lung injury who receive MV for more than 48 h, the incidence of ARDS within 5 days is nearly 20% [4]. Therefore, identifying distinct clinical phenotypes and adopting different ventilation strategies for different patients accordingly may potentially benefit both critical care clinical research and practices.

Previous clinical studies of MV in critically ill patients have analyzed MV as a homogeneous disease or only discussed how to perform pulmonary protective ventilation in patients with ARDS [5,6,7]. In addition, the current methods of selection, assessment, and intervention for MV in the ICU rarely consider distinct clinical phenotypes in patients receiving MV. However, distinct clinical phenotypes of MV are actually associated with different disease severities and prognoses and may be caused by a variety of underlying mechanisms. It is unclear whether various phenotypes of MV should be distinguished when clinicians administer MV treatments. There is no publicly available evidence showing whether clinicians should only focus on small tidal volume lung protection or a certain target, such as the plateau pressure, tidal volume (VT), or positive end-expiratory pressure (PEEP), when they treat critically ill patients. To resolve the practical problems encountered in the clinic and overcome the bottleneck of traditional studies, we used a database and machine learning based method to explore the significance of MV phenotypes. Perhaps there are some potential phenotypes relevant to disease severity and patient outcomes that may contribute to MV therapy. The model can predict which phenotype the patient is and guide the next step of treatment.

Methods

Patient data collection

Using the administrative database of Peking Union Medical College Hospital and the data from our previously published study in Critical Care Medicine, we conducted a retrospective study of patients undergoing MV [8]. From May 2013 to December 2016, we identified MV patients admitted to the ICU of Peking Union Medical College Hospital. Patients younger than 18 years of age or admitted to the ICU were excluded from the 24-h period. The Institute of Institutional Research and Ethics of Peking Union Medical College Hospital approved this study involving human subjects. We retrieved data on a total of 5013 unique adult individuals from Peking Union Medical College Hospital who received MV during their ICU stay. We randomly divided this batch of data into two groups according to a ratio of 4:1. A total of 4009 patients were enrolled as training cohort, and 1004 patients were collected as validation cohort.

Ventilator mode selection

Lung-protective strategies for MV were performed on all of the patients who were admitted to the ICU. When the patients were under adequate sedation and analgesia but without spontaneous breathing, volume-controlled or pressure-controlled ventilation was used. Once the patient had spontaneous breathing, controlled ventilation was immediately converted to pressure support ventilation.

Statistical methods

We selected 22 candidate variables based on their relevance to MV. These mainly included demographic characteristics (age and temperature (T)), scores (acute physiology and chronic health evaluation (APACHE) II scores and sequential organ failure assessment (SOFA) scores), respiratory variables (respiratory rate (RR), oxygen concentration in the inhalation gas (FiO2), pulse oxygen saturation (SpO2%), mean airway pressure (Pmean), peak airway pressure (Ppeak), positive end-expiratory pressure (PEEP), tidal volume (VT), partial pressure of oxygen (pO2) and partial pressure of carbon dioxide (pCO2)), circulatory and perfusion variables (heart rate (HR), mean arterial pressure (MAP), central venous pressure (CVP), static arterial carbon dioxide partial pressure (Pv-aCO2), lactate, P(v-a)CO2/C(a-v)O2 ratio, perfusion index (PI) and hemoglobin (Hb)) and a persistence variable (fluid balance). For each variable, we computed the mean value during the first 24 h of hospital presentation. To derive the phenotypes of MV, we first evaluate the distribution, missingness, and correlation of the above candidate variables. Due to the scarcity of the data samples, the missing variables were not simply eliminated. Instead, multiple imputation with chained equations was used to account for missing data. We also excluded highly correlated variables using rank-order statistics in the sensitivity analysis with threshold value of 0.75. Because the data type of the candidate variables was numerical, we used latent profile analysis (Gaussian mixture model) to recover hidden groups based on the means of the continuous variables observed. It always involved multiple constraint conditions: “model_1” involved equal variance and zero covariance, and “model_2” involved varying variance and zero covariance. An algorithm, the “expectation maximization” (EM) algorithm, was used to find the means and standard deviations of these distributions through two steps. First, we started with some initial starting guesses of the means and standard deviations. Based on these guesses, we assigned a posterior probability of being in a certain group to each patient. Second, these posterior probabilities were then used to update our guesses of the within-class parameters, which, in turn, were used to update the posterior probability, and so on until nothing seemed to change much anymore. Then, “model_1” was adopted in this manuscript. In the latent class analysis, the optimal phenotype number was confirmed using a combination of Bayesian information criteria, adequate size, high median probabilities of group membership with each phenotype, maximum entropy, and clinical features of hidden groups.

To visualize the optimal phenotype results and patterns in clinical variables, the data were analyzed with three types of plots: (1) ranked plots of the variables by the mean and standardized difference among the phenotypes, (2) t-distributed stochastic neighbor embedding (t-SNE) plots (which show multidimensional data in three dimensions), and (3) alluvial plots (which show the proportional distribution of phenotype members across specific variables).

Since the SOFA score is used to track a person's disease severity during their stay in the ICU and determine the extent of his or her organ function or rate of failure, we also performed an analysis to determine the relationship between the new phenotypic classification and the SOFA score, which included determining (1) whether the traditional SOFA score reflects the severity of the disease; (2) whether the phenotypes overlap with the SOFA score area on the chord graph; and (3) the deaths associated with each phenotype and the corresponding quartile of the SOFA score (especially in the area with overlapping phenotypes). The mortality rate of different phenotypes was significantly different across regions, which was evaluated a posteriori by a clustering phenotypic method.

External database verification

To further verify the phenotyping results yielded from Peking Union Medical College Hospital cohort, we then used another open database for external verification. The selected database is Medical Information Mart for Intensive Care III (MIMIC-III) database, which is a large, freely available database comprising deidentified health-related data associated with over 40,000 patients who stayed in critical care units of the Beth Israel Deaconess Medical Center between 2001 and 2012. According to the same inclusion criteria, data from 9457 patients were enrolled in our external verification study.

Results

Five clinical MV phenotypes

Summary information on our patient population is shown in Table 1. Multiple imputation with chained equations was employed to fill null values for each variable. After data preprocessing, we used latent profile analysis to derive groups. It was found that the five-class model was the optimal fit, forming the probability density distributions (1),

$$p = \mathop \sum \limits_{l = 1}^{5} p_{l} Normal\left( {\mu_{l} ,\sigma } \right)$$
(1)

where \(Normal\left({\mu }_{l},\sigma \right)\) is a Gaussian distribution with a mean value \({\mu }_{l}\) and variance \(\sigma\). The resulting probability density weight, mean value and variance are shown in Eq. 2.

(2)
Table 1 Summary statistics of selected features in each group

Consequently, the probability of an observation xi categorized as the nth phenotype was

$${\text{ p}}\left( {{\text{z}} = {\text{n|}}x_{i} } \right) = {\raise0.7ex\hbox{${{\text{p}}\left( {{\text{z}} = {\text{n}}} \right)Normal\left( {\mu_{n} ,\sigma } \right)}$} \!\mathord{\left/ {\vphantom {{{\text{p}}\left( {{\text{z}} = {\text{n}}} \right)Normal\left( {\mu_{n} ,\sigma } \right)} {\mathop \sum \nolimits_{{{\text{l}} = 1}}^{5} {\text{p}}\left( {{\text{z}} = {\text{l}}} \right)Normal\left( {\mu_{l} ,\sigma } \right)}}}\right.\kern-\nulldelimiterspace} \!\lower0.7ex\hbox{${\mathop \sum \nolimits_{{{\text{l}} = 1}}^{5} {\text{p}}\left( {{\text{z}} = {\text{l}}} \right)Normal\left( {\mu_{l} ,\sigma } \right)}$}}$$
(3)

With an observation xi of [33.0, 37.0, 26.57, 16.0, 118, 63.0, 98.0, 20.0, 34.0, 29.0, 498.0, 141.0, 51.0, 23.0, 92.0, 9.0, 13.0, 2.0, 5.0, 1.0, 97.0, − 2165.0] in the variable space as an example, we can calculate the five-class probabilities [0.034, 0.0085, 0.957, 0.00, 0.00052] with Eqs. (2) and (3). Taking the phenotype of the maximum probability implied that observation xi was categorized as phenotype V.

As shown in Fig. 1a and Table 2, the variables were scaled for each phenotype. Broad differences were observed in the distributions of the scaled variables across phenotypes. Of the 22 variables measured, 18 were significantly different across phenotypes with P < 0.05. Several analyses were conducted to ensure that the phenotypes were not simply recapitulations of more traditional clinical groups. To that end, we first assessed the associations between the score variables and measures of illness severity, such as the APACHE II and SOFA scores. The mean APACHE II/SOFA scores for phenotypes II and IV were 23.40/11.45 and 21.31/9.79, respectively, which were significantly higher than those for phenotypes I, III and V (13.96/5.90, 16.64/6.78 and 11.31/5.01, respectively). For RR, FiO2, Pmean and Ppeak in terms of respiration, the means for phenotype IV were approximately 97.8, 48.9, 11.8, and 218, respectively; these values exceeded those of the other for phenotypes by 498%, 23%, 30.9% and 19%, respectively. Furthermore, for the HR in terms of circulation and perfusion, the differences across phenotypes IV and I, II, III, and V were also large. The mean for phenotype IV was 18.9, which was only 23.3%, 18.9%, 21.3% and 19.2% of that of the other four profiles. Phenotype III had a mean PEEP of 17.42, while the PEEP of phenotypes I, II, IV and IV was 5.74, 8.14, 10.21, and 5.9, respectively; the value for phenotype III exceeded those of the other phenotypes by 203%, 114%, 70.6%, and 195%, respectively. Phenotype V corresponded to an average age of 40.17 years, which was much lower than that of the other four phenotypes, which were 66.58, 58.54, 59.37 and 55.2 years.

Fig. 1
figure 1

Variable distribution after normalization (a) and sample distribution in a 3-dimensional space {× 1, × 2, probability} (b) of the five profiles. Colors represent different profiles. Variable differences are closely related to profile divisions

Table 2 Characteristics of five mechanical ventilation phenotypes

The means and standard deviations of the five phenotypes are shown in Table 2. To visualize the differences among all the phenotypes, t-SNE plots were used to reduce the dimensionality. Namely, nonlinear dimensionality reduction was performed in a 22-dimensional variable space to form 2-dimensional features ranging from [− 80, 70] to [− 80, 80]. Moreover, combined with the maximum probability from Eq. 3, a 3-dimensional space {x1, x2, probability} was thus formed. Therefore, as shown in Fig. 1b, we observed five distinct clusters that were formed in this 3-dimensional space, which are denoted in blue, pink, cyan, yellow, and gray and characterize the five respiratory phenotypes. This result indicated that there was strong separation in the likelihood of membership for patients assigned to a given phenotype compared with those assigned to other phenotypes. The number of patients assigned to phenotypes I, II, III, IV and V was 2174, 480, 241, 368, and 746, respectively, and the corresponding proportion of the total number of patients was 54.2%, 12.0%, 6.01%, 9.18% and 18.6%, respectively.

Severity assessments and prognosis for the 28-day survival curve

Since the APACHE II and SOFA values of phenotypes II and IV were significantly higher than those of the other phenotypes and the respiratory characteristics (RR, FiO2, Pmean, and Ppeak) and circulatory characteristics (HR) of phenotype IV were significantly different from those of the other four phenotypes, we used the SOFA score as an example by dividing it into four score intervals: [0, 4], (4, 6], (6, 9] and (9, 22]. We calculated the total number of patients and the mortality of each profile when the mortality rate of the four score intervals was more than 2%. As shown in Fig. 2, considering phenotype IV, which has the highest mortality rate, the number of patients in the SOFA score segment (9, 22] was 198, the number of deaths was 88, and the mortality rate was higher than 44%. The mortality of phenotype V in each score segment was less than 2%, so it was not included in Fig. 2.

Fig. 2
figure 2

Relationships among the SOFA score segment, profile and number of deaths. In the top half of the gap in the figure, different colors and the outer circle numbers represent the deaths of different profiles. In the bottom half, different colors and the outer circle number intervals represent the deaths of different SOFA score segments. The details are shown in the table below

Furthermore, the chord graph shown in Fig. 2 can effectively reflect the interactions among the SOFA score segment, section and the number of deaths. The upper and lower arcs of the graph correspond to the number of deaths in the section and SOFA score segment and interact with the two in the sphere with different colored surfaces. The number of deaths for phenotypes I, II, III, IV and IV was15, 4, 107, and 120, respectively, and the mortality rate was 6.1%, 1.6%, 43.4% and 48.8%, respectively. Cross-sections I, III, II and IV correspond to the upper halves of the arc denoted by the pink curved surface in a counterclockwise manner, as shown in Fig. 3. The proportions of individuals with phenotypes I, III and V among all cases were 54.2%, 6% and 18.6%, respectively, and the mortality rates for these three phenotypes were 1.23%, 1.40% and 1.59%, respectively, which were all smaller than 2.00%. The proportions of patients with phenotypes II and IV accounted for 12.0% and 9.2% of all patients, respectively, with mortality rates of 27.4% and 18.1%, respectively.

Fig. 3
figure 3

Proportion of patients of each profile and the corresponding 28 natural survival curves

Model stability verification

The validation cohort included 1004 patients, and the summary statistics are depicted in Table 3. Predictions of the phenotype distributions were generated for the validation cohort with the above model. This result was similar to that in the above training cohort. The number of patients assigned to phenotypes I, II, III, IV and V were 542, 124, 52, 81, and 205, respectively, and the corresponding proportions of the total number of patients were 54.0%, 12.4%, 5.18%, 8.07% and 20.4%, respectively (Table 4). Compared with the profile in Fig. 1a, that in Fig. 4 shows consistent differences in the biomarker patterns. To check the phenotype population stability, a metric (population stability index, PSI) was used to identify differences shifts in the population with the following formula:

$${\text{PSI }} = \sum {\left( {actual\% - expected\% } \right) \times {\text{ln}}\left( {\frac{actual\% }{{expected\% }}} \right)}$$
(4)
Table 3 Summary statistics of selected features in each group in the validation cohort
Table 4 Characteristics of five mechanical ventilation phenotypes in the validation cohort
Fig. 4
figure 4

Characteristic distribution for each phenotype (a) and 28-day accumulative mortality curve (b) in the validation cohort

Additionally, we used the PSI to measure 28-day mortality by phenotype in two datasets. The number of deaths at 28 days were 2 (1.11%), 23 (20.2%), 6 (3.85%), 25 (28.4%) and 3 (1.46%). Then, we calculated the PSI for the phenotype populations and mortality across the two datasets. The final values were 0.0045 and 0.007, which indicated an insignificant difference in both the population size and mortality rate between the validation cohort and study cohort.

External verification results

We used MIMIC III cohort for further verification. To minimize the impact of database differences, we finally chose 15 variables and divided them into four groups: demographics (age and temperature (T)) and scoring features (SOFA score and SAPSII score), respiratory features (Pmean, Ppeak, VT, SpO2%, RR, pO2 and PEEP), circulation and perfusion features (Hb, HR and MAP) and residual features (fluid balance).

The clustering results may not be completely consistent with the data of Peking Union Medical College Hospital; however, the same five profiles were generated as for the Peking Union Medical College Hospital cohort, as displayed in Fig. 5. The SpO2 of Profile 5 was significantly lower than those of the other profiles, the PEEP values of Profiles 3 and 4 were higher than those of Profiles 1, 2 and 5, and the Pmean of Profile 4 was obviously different from those of the other profiles. While given the differences between databases in ethnic composition, recording methods and data types, the verification results demonstrated that our phenotyping method has generalizability and the model is stable to divide patients undergoing MV into five different subtypes.

Fig. 5
figure 5

Characteristic distribution for each phenotype (a) and 28-day accumulative mortality curve (b) in the MIMIC III validation cohort

Discussion

In this retrospective ICU MV patient study, we used machine learning methods to obtain five ventilator-related phenotypes on the day of ICU admission based on the patient's general data, ventilator parameters and partially monitored circulating perfusion indicators, as well as the in-and-out balance (residual characteristics) within 24 h of the day. For these five types of respiratory phenotypes, we used a chord diagram to suggest that the interaction among the SOFA score, profile, and death toll showed that Profiles IV and II had the highest mortality. Survival curve analysis revealed that the 28-day survival rate of Profile IV and Profile II was significantly lower than the those of the other three profiles. We constructed five ventilator-related phenotypes with mathematical expressions using a potential profile approach and used validation data sets to predict and validate the existence of the five phenotypes. Additionally, the experiments on the MIMIC III cohort confirmed these phenotypes. The implementation of this study facilitates the identification and treatment of these mechanically ventilated patients in clinical practice, improving patient survival and improving prognosis.

Inappropriate ventilation strategies during MV result in lung injury, and the mechanisms include high airway pressures or tidal volumes resulting in lung pressure/volume injury. Excessively low expiratory lung volumes or atelectasis lead to repeated opening and collapsing of the terminal lung unit. In addition, in patients undergoing MV, even without anatomical changes in the lung tissue, the effects of various forces can induce the release of pro-inflammatory cytokines and the recruitment of white blood cells, and the initiation of local inflammatory processes is called biological injury [9]. The first three injuries are considered to be mechanical injuries caused by mechanical factors, while the remaining injuries are caused by mechanical damage caused by secondary damage involving inflammatory cells and inflammatory mediators. Therefore, information on how to adjust and control MV is clinically important. Previous studies have shown that low tidal volumes, high PEEPs and proper control of the plateau pressure are important for resolving the current MV problem [10, 11]. Even simply limiting the VT and plateau pressure is not completely safe. On the other hand, contradictory situations in which various protective strategies are contradictory may also occur in clinical practice. For example, increasing the PEEP will cause a corresponding increase in the plateau pressure when volume control mode, and often, patients with more severe lesions and poorer lung compliance will reach the an acceptable maximum limit of the plateau pressure (often known as 40 cmH2O) at a PEEP lower than the required level. Therefore, we urgently need to use new methods to examine traditional clinical problems.

Through analysis, we found that MV patients could be divided into five subtypes. For example, in MV patients of Peking Union Medical College Hospital, Profile IV showed a high mean airway pressure, a high peak airway pressure, a fast respiratory rate and a slow heart rate. Profile III showed a high pure PEEP. Based on the survival analysis, we can clearly see that the mortality rate was the highest for Profile IV compared with the other profiles. Namely, the prognosis of patients who show improved oxygenation by increasing the airway pressure is worse than that of patients who show improved oxygenation by increasing the PEEP without increasing the airway pressure. The occurrence of VILI depends on the interaction of the ventilator and the lungs, i.e., the pressure, volume, flow rate and frequency of the ventilator applied to the lungs and the reactivity of the lung parenchyma. The pressure generated in the lung tissue when MV is used to expand the lung is called stress, which is reflected as transpulmonary pressure. The driving pressure may better reflect the actual ventilation state in terms of the MV phenotype. Namely, with the same tidal volume, the larger the amount of functional residual gas there is, the smaller the strain generated, the smaller the probability of the occurrence of VILI, and the better the prognosis of the patient. In 2015, Amato et al. proposed that for patients with ARDS undergoing MV, the VT, PEEP, and driving pressure (DP) are independent factors [12]. Among these factors, the DP level had the strongest correlation with the survival rate of ARDS patients. The study also suggested that the PEEP and small VTs only play roles in lung protection when the DP is reduced. Neto et al. [13] found that DP is the only factor strongly correlated with postoperative pulmonary complications by performing a high-quality meta-analysis of MV risk [14]. In another study of ARDS, it was found that a DP higher than 14 cmH2O significantly affects the prognosis of patients [13].Our study using big data also demonstrates the role of DP in determining the phenotype of patients undergoing MV. In other words, it is only used in patients with good recruitibility to increase the PEEP, which is beneficial, and in patients with poor recruitibility, increasing the PEEP causes excessive expansion, which is harmful [11, 15].

Phenotype V represents a younger but higher-scoring patient, and this phenotype may become an independent phenotype, probably due to lung compliance. The parameters of MV did not appear to have an effect on these aspects of the patients, and the prognosis of the patients were good. Phenotype II is different from phenotype III, as the patients classified as phenotype II have higher scores, but they are not considerably different in terms of age. These patients had poor organ function which may not be only reflected from the lung-related variables, so these patients may form an independent group as phenotype II. The last group is phenotype I, which represents the majority of clinical patients and the most homogeneous group. We further analyzed the departments in which the patients were admitted and found that phenotypes I, III, and V were from surgery, and phenotypes II and IV were from internal medicine.

This study also has some limitations. First, although it was a big data study, there was a training set combined with a verification set to validate five phenotypes. We used the data from the MIMIC III cohort to demonstrate our findings. However, this model still needs to be clinically verified by a prospective study. Second, the study included general information, ventilator parameters and circulation and perfusion indicators, as well as the balance of input and output (residual characteristics) within 24 h of the day, which take into account some of the indicators commonly considered in clinical use of MV. It is not known whether it is necessary to include all the indicators that can be obtained and recorded in the clinic or to combine some indicators. Third, many of the parameters included in this study are related to clinical decision making. For example, PEEP, VT and other parameters are more decisions of the clinician than the patient's own respiratory mechanics. Therefore, phenotypic analysis in different institutions may conclude different results. The current conclusion of this study requires further multicenter validation if possible.

Conclusions

In this retrospective analysis, respiratory and circulatory data were obtained from patients undergoing MV from two databases. Five ventilator-related phenotypes were obtained, and from these five phenotypes, Profiles II and IV were both related to mortality, and Profile IV was more significant in higher mortality caused by mechanical ventilation parameters. These five new identified phenotypes of critically ill patients undergoing MV may be helpful in the future identification and interpretation of clinically high-risk patients.