Introduction

Arterial blood gas (ABG)-based measures of hypoxemia such as PaO2/FiO2 (PF) ratio have been used in ICU severity of illness scores [1–5], and as diagnostic criteria for acute respiratory distress syndrome (ARDS) [6, 7]. The Berlin definition (Berlin) has established new diagnostic criteria for ARDS based exclusively on adult data and the consensus of adult practitioners [8], but has limited validation in children [9]. Berlin created three ARDS severity classes using PF with a minimum PEEP of 5 cmH2O. It requires a risk factor within 7 days of hypoxemia and bilateral infiltrates, and allows ARDS to co-exist with heart failure. The draft definition considered ancillary measures for severe ARDS (three quadrants of consolidation on chest x-ray, dead space, PEEP ≥10 cmH2O, and compliance) that were omitted from the final definition because they did not discriminate mortality.

When applying Berlin to pediatric ARDS patients, it is important to consider pediatric PEEP management, radiology, and measures of dead space or compliance. These may alter the ability of Berlin to risk-stratify children with ARDS [10]. Perhaps most importantly, improvements in pulse oximetry have shifted practice patterns in pediatric intensive care to reduced use of arterial catheters; hence fewer children have a PF ratio to diagnose ARDS [11, 12]. Moreover, recent evidence that arterial catheters are an under-recognized source of infection [13] may continue to shift practice patterns away from routine use. Several studies have demonstrated that SpO2-based parameters such as SpO2/FiO2 ratio can be substituted for PF ratio when the SpO2 is ≤97 % [14–17].

There is good evidence that PF or oxygenation index (OI) [(mean airway pressure × FiO2/PaO2) × 100] discriminates mortality risk in children with ARDS [18–21], but few studies have examined whether SpO2-based metrics (SF or oxygen saturation index (OSI) [(mean airway pressure × FiO2/SpO2) × 100]) discriminate mortality in children [22, 23].
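As a minimal sketch of the four metrics defined above, the following Python functions compute PF, SF, OI, and OSI from single paired measurements. The function names, and the convention that FiO2 is passed as a fraction (0.21 to 1.0) and SpO2 as a percentage, are assumptions made for illustration; they are not taken from the study's analysis.

```python
def pf_ratio(pao2_mmhg: float, fio2: float) -> float:
    """PaO2/FiO2 ratio. FiO2 is a fraction, e.g. 0.5 for 50 % oxygen."""
    return pao2_mmhg / fio2


def sf_ratio(spo2_pct: float, fio2: float) -> float:
    """SpO2/FiO2 ratio. SpO2 is a percentage, e.g. 95 for 95 %."""
    return spo2_pct / fio2


def oxygenation_index(map_cmh2o: float, fio2: float, pao2_mmhg: float) -> float:
    """OI = (mean airway pressure x FiO2 / PaO2) x 100."""
    return map_cmh2o * fio2 / pao2_mmhg * 100


def oxygen_saturation_index(map_cmh2o: float, fio2: float, spo2_pct: float) -> float:
    """OSI = (mean airway pressure x FiO2 / SpO2) x 100."""
    return map_cmh2o * fio2 / spo2_pct * 100
```

Note that lower PF and SF values indicate worse hypoxemia, whereas higher OI and OSI values indicate worse hypoxemia.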

The objectives of this study are (1) to evaluate the performance of the Berlin definition for pediatric ARDS; (2) to determine how substituting SF for PF in the Berlin definition would alter ARDS incidence and mortality; (3) to examine the potential for selection bias introduced when requiring an ABG by examining differences between children with and without a PaO2 available on the first day of mechanical ventilation; and (4) to compare the ability of SpO2 vs. PaO2-based metrics of hypoxemia to discriminate mortality in a heterogeneous cohort of mechanically ventilated children (not just those with ARDS).

Materials and methods

We reviewed the electronic health records (EHR) of children admitted to the Children’s Hospital Los Angeles (CHLA) Pediatric Intensive Care Unit (PICU) from March 2009 to April 2013. Patients were eligible if intubated (endotracheal or tracheostomy) and mechanically ventilated (MV) within 7 days of PICU admission. Patients with cyanotic congenital heart disease were excluded. IRB approval was obtained at CHLA (CCI-12-00140), in accordance with the 1964 Declaration of Helsinki and its later amendments.

Variable selection

We used a copy of the EHR (Cerner Corp®, Kansas City, MO), and a diagnostic and demographic database (Microsoft Access, Redmond, WA) maintained by physicians providing clinical care. Demographics, diagnoses, outcomes, ventilator settings, SpO2, ABGs, doses and times of vasoactive-inotropic medications, and 12-h admission pediatric risk of mortality (PRISM III) scores were extracted. Masimo pulse oximeters™ were used for all patients. There was no protocol regarding ABGs or the decision to place an arterial catheter. There has been a recent trend to rely on non-invasive parameters or capillary or venous blood gases, and only place arterial catheters for patients who need hemodynamic monitoring or have severe hypoxemia. Nurses and respiratory therapists primarily changed FiO2, with a guideline to wean FiO2 when >0.40 if SpO2 was >97 %.

Composite variable definitions

Data to calculate PF, SF, OI, and OSI were extracted for the first day of MV. SF and PF were calculated using the closest charted FiO2 ≤1 h from the SpO2 ≤97 % or PaO2. SpO2 values >97 % were excluded. OI and OSI were calculated using the PF and SF values, incorporating the closest charted mean airway pressure (MAP) ≤6 h prior to SF or PF as previously described [18, 22]. Ventilator settings were charted every 6 h, or with changes. SpO2 was charted hourly, or with desaturation. We analyzed the first PF, SF, OI, and OSI after intubation, as well as average values over the first day of MV. The average values for SF and OSI were calculated using hourly average SpO2 values ≤97 %, which prevented over-weighting times of instability when there were several SpO2 values in each hour (e.g., if SpO2 values charted between 12:00 and 13:00 were 97, 93, and 89 %, the hourly average for 12:00 is 93 %). Hourly average values ≤97 % were then averaged to derive the day 1 value. An hourly inotrope score was calculated for the first 3 days of ICU admission, using hourly average values of continuous dopamine, dobutamine, epinephrine, norepinephrine, vasopressin, and milrinone [24, 25].
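The hourly averaging step described above can be sketched in Python as follows. This is an illustration under stated assumptions rather than the study's actual code: the input layout (tuples of charting hour and SpO2), the function names, and the simplification that a single FiO2 value is paired with each hour (`fio2_by_hour`) are all hypothetical.

```python
from collections import defaultdict
from statistics import mean

SPO2_CEILING = 97.0  # SpO2 values above 97 % are excluded, as in the text


def hourly_spo2_averages(readings):
    """Average usable SpO2 values within each charting hour.

    readings: iterable of (hour, spo2_pct) tuples (hypothetical layout).
    Returns {hour: mean SpO2} using only values <= 97 %.
    """
    by_hour = defaultdict(list)
    for hour, spo2 in readings:
        if spo2 <= SPO2_CEILING:
            by_hour[hour].append(spo2)
    return {hour: mean(values) for hour, values in by_hour.items()}


def day1_average_sf(readings, fio2_by_hour):
    """Average the hourly SF values to get one day 1 value.

    fio2_by_hour: {hour: FiO2 fraction}; assumes one paired FiO2 per hour.
    Returns None when no hour has a usable SpO2 and a paired FiO2.
    """
    hourly = hourly_spo2_averages(readings)
    sf_values = [spo2 / fio2_by_hour[hour]
                 for hour, spo2 in hourly.items() if hour in fio2_by_hour]
    return mean(sf_values) if sf_values else None


# The example from the text: 97, 93, and 89 % charted in the 12:00 hour
# average to 93 %; the 99 % value at 13:00 is excluded.
example = [(12, 97), (12, 93), (12, 89), (13, 99)]
print(hourly_spo2_averages(example))
```

Averaging within each hour first means that an unstable hour with many charted values contributes one data point to the day 1 average, the same as a stable hour with a single value.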

Berlin definition ARDS criteria

Diagnostic risk factors for ARDS included pneumonia, sepsis, aspiration, pancreatitis, near drowning, trauma, and rheumatologic disease. Chest radiographs during the first 3 days of MV were reviewed for patients with a risk factor and either SF (≤264) or PF (≤300). Two independent reviewers examined x-rays and radiology reports for presence of bilateral infiltrates and cardiomegaly. Disagreements regarding bilateral infiltrates were adjudicated by a third reviewer. Patients with cardiomegaly had a review of their medical record to determine if hypoxemia was predominantly from heart failure. Ventilator settings were reviewed to ensure PEEP ≥5 cmH2O.
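The screening logic in this section can be summarized as a short Python sketch. The `Day1Record` fields and the `meets_berlin` function are hypothetical names introduced for illustration; in the study, radiographs and heart failure were judged by manual review, so the boolean fields here stand in for those reviewer decisions, and the 7-day timing requirement is not modeled.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Day1Record:
    has_risk_factor: bool            # e.g., pneumonia, sepsis, aspiration
    bilateral_infiltrates: bool      # adjudicated chest radiograph finding
    hypoxemia_mainly_cardiac: bool   # reviewer judged heart failure predominant
    peep_cmh2o: float
    pf: Optional[float] = None       # None when no PaO2 was available
    sf: Optional[float] = None       # None when SpO2 was always >97 %


def meets_berlin(rec: Day1Record, use_sf: bool = False) -> bool:
    """Apply the hypoxemia cutoff (PF <= 300 or SF <= 264) plus the other criteria."""
    ratio, cutoff = (rec.sf, 264) if use_sf else (rec.pf, 300)
    if ratio is None:
        return False  # the metric is unavailable, so the patient cannot be classified
    return (rec.has_risk_factor
            and rec.bilateral_infiltrates
            and not rec.hypoxemia_mainly_cardiac
            and rec.peep_cmh2o >= 5
            and ratio <= cutoff)
```

A patient with no PaO2 on record returns `False` under the PF criteria regardless of severity, which is the mechanism behind the selection bias examined later in the paper.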

Outcome measures and analysis

The main objective was to evaluate the performance of Berlin to stratify ICU mortality risk in pediatric ARDS, using PF and SF criteria. An additional objective was to examine differences between patients with and without an ABG, so descriptive statistics were stratified by the presence of an ABG on the first day of MV. Mortality and inotrope/vasopressor use were examined as a function of categories of SF, stratified by the presence of an ABG on the first day of MV. We used four categories of SF (>264, >221 to ≤264, >150 to ≤221, ≤150), corresponding to Berlin PF values (>300, >200 to ≤300, >100 to ≤200, ≤100) [16]. A final objective was to compare the abilities of PF, SF, OI, and OSI to discriminate mortality in MV children. Statistical analysis was performed using Statistica v.5.5 (StatSoft, Tulsa, OK) and Stata v.10 (StataCorp, College Station, TX). Continuous variables are described with median and interquartile range and were compared using the Wilcoxon rank-sum test. Dichotomous variables are described as number and percent and were compared using the Yates-corrected chi-squared test. A multivariate logistic regression model was created to evaluate the independent association of SF ratio with mortality, controlling for potential confounding factors. Receiver operating characteristic (ROC) plots were created to examine discrimination ability using the area under the curve (AUC) [26] with 95 % confidence intervals. AUCs were compared using chi-squared tests.
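To make the category boundaries and the AUC concrete, here is a minimal Python sketch. It is not the Statistica or Stata analysis used in the study: the names are hypothetical, the "none" label for values above the top cutoff is an illustrative choice, and the AUC is computed with the simple pairwise (rank-based) definition without confidence intervals or between-curve comparisons.

```python
SF_CUTOFFS = (264, 221, 150)   # SF boundaries from the text
PF_CUTOFFS = (300, 200, 100)   # corresponding Berlin PF boundaries
LABELS = ("none", "mild", "moderate", "severe")


def hypoxemia_category(ratio: float, cutoffs=SF_CUTOFFS) -> str:
    """Map a PF or SF value to a severity label; each cutoff is an inclusive upper bound."""
    crossed = sum(ratio <= cutoff for cutoff in cutoffs)
    return LABELS[crossed]


def auc(scores_died, scores_survived) -> float:
    """Probability that a randomly chosen non-survivor has a higher score
    than a randomly chosen survivor, counting ties as one half.

    Higher scores must mean worse disease, so pass OI or OSI directly
    and pass PF or SF negated.
    """
    wins = 0.0
    for died in scores_died:
        for survived in scores_survived:
            if died > survived:
                wins += 1.0
            elif died == survived:
                wins += 0.5
    return wins / (len(scores_died) * len(scores_survived))
```

For example, `hypoxemia_category(264)` falls in the mild group and `hypoxemia_category(150)` in the severe group, matching the inclusive upper bounds above. An AUC of 0.5 indicates no discrimination and 1.0 indicates perfect separation of survivors from non-survivors.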

Results

There were 4,803 ICU admissions during the study period with an overall mortality of 4.6 %. The study cohort consisted of 1,833 MV children without cyanotic congenital heart disease, with 9.9 % mortality.

Primary analysis: Berlin definition of ARDS

Stratification of mortality for patients who met Berlin ARDS criteria is summarized in Table 1. Overall, 129 children (7 % of MV children) met Berlin ARDS criteria, with 33 % mortality. There was similar mortality for children with mild (PF 200–300) or moderate (PF 100–200) ARDS using either initial PF after intubation, or day 1 average PF values, but higher mortality for children with severe ARDS (PF ≤100) (Table 1). The mild, moderate, and severe designation for ARDS severity using PF ratio had fair discrimination ability for mortality, which was similar for initial values after intubation [AUC 0.631 (95 % CI 0.54, 0.72)] and day 1 average values [AUC 0.665 (95 % CI 0.57, 0.76), p = 0.2] (Electronic Supplementary Material Fig. 1).

Table 1 Berlin definition of ARDS using PF or SF criteria

When SF criteria were substituted for PF in the Berlin definition, 312 children (17 % of MV children) met ARDS criteria, with 22 % mortality. There was similar mortality between those with mild (SF 221–264) or moderate (SF 150–221) ARDS, but higher mortality for those with severe ARDS (SF ≤150) when the initial SF after intubation was used (Table 1). However, there appeared to be clearer stepwise increases in mortality from mild to moderate to severe ARDS using day 1 average values for SF (Table 1). The mild, moderate, and severe designation for ARDS severity using SF ratio also had fair discrimination ability for mortality using initial values [AUC 0.632 (95 % CI 0.56, 0.69)], although it was significantly higher with day 1 average values [AUC 0.682 (95 % CI 0.62, 0.74), p = 0.002] (Electronic Supplementary Material Fig. 2).

Secondary analysis: selection bias with ABGs

To examine the selection bias if an ABG were required for risk stratification, we examined all 1,833 mechanically ventilated patients, stratified by presence of an ABG within 24 h of MV (Table 2). Those with ABGs had 16.1 % mortality as compared with 6.1 % mortality for those without ABGs (p < 0.0001). Patients with an ABG were older, ventilated longer, more likely to have cancer, and less likely to have chronic respiratory disease, a tracheostomy, or a home ventilator (all p < 0.02). Patients with ABGs on the first day of MV were treated with vasoactive-inotropic infusions sooner and had higher inotrope scores than those without ABGs (all p < 0.0002). Patients with ABGs had higher initial PRISM III severity of illness (p < 0.0001), with 23.8 % of patients with ABGs having predicted PRISM III risk of mortality ≥10 % as compared with 6.7 % for those without ABGs (p < 0.0001). There was a correlation between PRISM III and number of ABGs on day 1 of MV (r = 0.38, p < 0.0001).

Table 2 Demographics and outcomes of all children, stratified by the presence or absence of an arterial blood gas (ABG) on the first day of mechanical ventilation

Table 3 characterizes outcomes and interventions as a function of hypoxemia severity (measured by SF ratio). A total of 1,201 patients had SF available (those without SF ratio available had SpO2 >97 % precluding calculation). Hypoxemia severity correlated with the use of ABGs on day 1 of MV. However, 31 % of patients with severe hypoxemia (SF ≤150) did not have an ABG on the first day of MV, with 35 % mortality. There are stepwise increases in mortality as a function of SF, regardless of the presence of an ABG on the first day of MV (Table 3). Moreover, there is a stepwise increase in frequency of vasoactive-inotropic medication use as hypoxemia worsens (Table 3). Among patients with severe hypoxemia, 67 % of patients with an ABG on the first day of MV received vasoactive-inotropic medications within the first 24 h of PICU admission, whereas only 35 % of patients without an ABG received them (Table 3). SF was associated with mortality after controlling for presence of an ABG on day 1 of ventilation, vasoactive-inotropic medications within the first 24 h of PICU admission, age, and diagnoses of chronic respiratory disease or cancer (p < 0.001).

Table 3 ICU Mortality and use of vasoactive agents within the first 24 h of ICU admission stratified by hypoxemia severity (using SF) and by the presence of an ABG within the first 24 h of mechanical ventilation

Secondary analysis: all MV patients

Four hypoxemia metrics (PF, SF, OI, and OSI) were calculated for all mechanically ventilated children (irrespective of ARDS diagnosis) when the necessary parameters were available. Availability of PF and SF, distribution, and mortality as a function of PF and SF severity ranges are summarized in Fig. 1. Allowing SF to be used when PF was not available doubled the number of patients available for risk stratification. For patients with only SF available, there is a clear step up in mortality for those with severe hypoxemia, with minimal difference between those with mild or moderate hypoxemia. For those with both PF and SF available, there are stepwise increases in mortality for SF ratio, with a similar proportion of patients in each severity group. For PF ratio, close to half of these children had day 1 average PF ratio >300, with similar mortality for those with PF ratios of 100 to 200 and 200 to 300. Nearly all patients who had PF but not SF available (i.e., SpO2 >97 %) had PF ratios >300 (Fig. 1).

Fig. 1

Breakdown of population proportions and ICU mortality stratified by availability of PF ratio and SF ratio. The total population of mechanically ventilated children is 1,833, subgrouped along the x axis on the basis of availability of each metric (SF only, PF and SF, PF only, neither PF nor SF). The area of each box represents the proportion of the population which would fall into that severity group on the basis of the given metric. Proportions are based on day 1 average SF and PF. The percentage is mortality within each box. A total of 727 children had SF available, but no PF. While many of these patients had mild or minimal hypoxemia, those with severe hypoxemia (SF ≤150) had 38 % mortality. A total of 474 patients had both PF and SF available. There are stepwise increases in mortality for SF ratio in this group, with a similar proportion of patients in each severity group. For PF ratio, close to half of these children had PF ratio >300, with similar mortality for those with PF ratios of 100–200 and 200–300 (note width of these boxes split in half for visualization). For those with PF ratio only (patients in whom SpO2 >97 % precluded SF calculation), 92 % of children had PF ratios >300. The remaining PF subgroups in this section had very few patients, likely affecting the mortality numbers (only five deaths total for PF ≤300). A total of 411 patients had neither metric available (SF not available because SpO2 was above 97 %).

When used as a continuous variable (rather than mild, moderate, or severe classifications), all four metrics (PF, SF, OI, and OSI) had “good” to “excellent” discrimination of ICU mortality, for both the initial value after intubation, and day 1 average values (Table 4) [27]. There was a median of two ABGs available for PaO2 metrics on day 1 of MV (range 1–17; IQR 1–4). For SpO2 metrics, there was a median of three hourly averaged SpO2 values ≤97 % available on day 1 of MV (range 1–22; IQR 1–7). In general, day 1 average values for all four metrics had higher discrimination of mortality than initial values after intubation (Table 4). SpO2 metrics increased the number of patients and had higher AUC than PaO2 metrics, particularly for the day 1 average values (Table 4). For direct comparison, we restricted the analysis to 444 children with all four metrics available (30 of the 474 from the analysis of PF and SF did not have MAP within 6 h for OI and OSI) and found similar discrimination ability for the initial values after intubation (Table 4). However, day 1 average SF had higher discrimination for mortality than PF (p = 0.0003). OSI and OI each tended to be better than PF at discriminating mortality, although not statistically significant (p = 0.06 and p = 0.07, respectively). All other comparisons were not different (p > 0.1). A total of 584 patients had only SpO2 metrics available, and mortality discrimination was similar (day 1, SF AUC 0.769; day 1, OSI AUC 0.752) to the group with both SpO2 and PaO2 available.

Table 4 Areas under the curve of the ROC plot and 95 % confidence interval for ICU mortality for all patients, and subgrouped by the 444 patients with all four metrics available on the first day of mechanical ventilation, to enable direct comparison

Discussion

We have demonstrated that the Berlin criteria identified 7 % of all MV children as having ARDS, with 33 % mortality. In contrast, using SF in the Berlin definition identified 17 % of MV children as having ARDS, with 22 % mortality. SpO2 metrics of hypoxemia (SF and OSI) could double the number of MV children available for risk stratification, and PaO2 metrics are available only for about half of children with moderate hypoxemia. Although 70 % of those with severe hypoxemia have an ABG, the remaining 30 % of patients with severe hypoxemia but without an ABG still have 35 % mortality. ABGs are likely a surrogate for cardiovascular dysfunction or severity of illness, as their use is associated with vasoactive-inotropic medications, higher PRISM scores, and higher mortality in pediatric ICUs. Finally, when looking at the performance of hypoxemia metrics of disease severity, the first PF, OI, SF, and OSI after intubation discriminate mortality similarly, whereas day 1 average values for SF may outperform PF.

Limiting diagnosis and risk stratification for ARDS to only those children who have PF or OI available captures a combination of hypoxemia and cardiovascular risk, rather than pulmonary organ dysfunction. However, before simply accepting an SF-modified Berlin definition of ARDS for use in children, we should consider creating a pediatric-specific definition of ARDS involving important pediatric practice patterns, co-morbidities, and other markers of disease severity [28].

A major advantage of SpO2 metrics is the avoidance of selection bias regarding the decision to obtain an ABG or place an arterial catheter. In our PICU, it appears that children with ABGs are older, have different co-morbidities, have higher initial severity of illness, and are more likely to receive vasoactive-inotropic infusions. Although mortality rates are higher for those with ABGs, SF and OSI discriminate mortality, regardless of the presence of an ABG. This is seen by stepwise increases in mortality as SF worsens, and similar AUCs for the ROC plots between the groups with and without ABGs. Moreover, SF remains associated with mortality after controlling for ABG use, inotrope-vasopressors, age, and diagnoses.

Interestingly, we found that the average SF on the first day of MV discriminates mortality better than PF. There were similar trends for OSI and OI, although not statistically significant (p = 0.06 and 0.07, respectively). There are likely several reasons for this higher discrimination. First, because SpO2 is available routinely, one can calculate SF and OSI on more patients, with a larger spectrum of disease severity. Second, SpO2 may be a better marker of intrapulmonary shunt [29]. Third, SF and OSI have more data points to average than PF or OI, which may better capture the composite risk for the patient during day 1 of ventilation. This may be the reason why there was more equal distribution of patients and clearer stepwise increases in mortality when using day 1 average SF as compared with day 1 average PF for the 474 patients who had both metrics available (Fig. 1). We did not require simultaneous SF and PF, which may explain why patients do not always fall in the same severity buckets (e.g., severe hypoxemia via SF but moderate via PF). However, there was only a median of three hourly SpO2 values ≤97 % to average on the first day of MV, likely because providers do not wean FiO2 to keep SpO2 ≤97 %, as needed for SF and OSI [14, 16, 30]. Finally, when comparing initial values of SF or PF to day 1 average values for ARDS patients, there is a tendency for patients to move from severe to moderate ARDS, which may imply the day 1 value better reflects interventions such as lung recruitment. Day 1 mild, moderate, and severe ARDS designation also appears to better discriminate mortality than the initial value after intubation, particularly when using SF criteria.

One can debate whether a day 1 average value for any of these parameters should be used, as it may reflect quality of ICU interventions, not disease severity [7]. However, such combination metrics may be important for trials that do not enroll patients immediately after intubation, or when considering higher risk therapies for which duration of lung disease severity may be relevant. Future studies could be designed to require a minimum of two SF or OSI values, separated by at least an hour. Regardless, initial values of SF or OSI after intubation appear to discriminate mortality as well as the initial values for PF or OI [31–33].

It is logical to use SpO2 metrics for epidemiologic studies and severity of illness scores when a PaO2 is unavailable to increase the number of eligible patients and represent the spectrum of disease severity [34–37]. However, SpO2 metrics may also be important to avoid selection bias of sicker patients with more cardiovascular dysfunction in interventional trials [38]. This may be particularly important when the intervention has potential for cardiopulmonary interactions (e.g., PEEP or high frequency oscillatory ventilation).

There are several limitations to our study, mostly related to the retrospective, single center design. First, we cannot have precise alignment of SpO2 or PaO2 with FiO2 and MAP for calculation of PF, SF, OI, and OSI. By requiring SpO2 to be within 1 h of FiO2 we hoped to minimize this uncertainty (and still demonstrated a positive association). We anticipate that this relationship may be stronger with precise alignment of FiO2 and SpO2 (or PaO2). Second, we did not have an active protocol to wean FiO2 to maintain SpO2 ≤97 %. For this reason we had to eliminate many SpO2 values from the analysis. Third, we were limited to data that were charted, which may not reflect minute-to-minute changes from bedside monitors. However, the EHR is the most frequent data source for severity of illness scores and eligibility for clinical trials, so our data likely represent real-life application. Fourth, we attempted standardized assessment with consensus opinion of bilateral infiltrates on CXR, but there is inherently high variability in this interpretation [39]. Fifth, SpO2 may be less reliable in circumstances of hemodynamic instability and poor perfusion, necessitating PaO2 measurements. Finally, these are data from a single institution, and are subject to local bias and practice patterns, particularly with regard to decisions to place arterial catheters. These findings need to be confirmed prospectively with multi-institutional data.

Conclusions

Children with severe ARDS using Berlin PF criteria have high mortality (45–53 %), with minimal difference in mortality between the mild and moderate ARDS groups. Substituting equivalent SF values for Berlin PF criteria creates slightly better separation in mortality risk between mild and moderate ARDS, although the severe group still has twice the mortality of the moderate group. SpO2 metrics (SF and OSI) discriminate mortality as well as PF and OI, increase the number of children eligible for the diagnosis of ARDS, and avoid limiting the diagnosis to those with hypoxemia and cardiovascular dysfunction. Although our study was performed in children, these findings may generalize to adults, and warrant investigation.