Introduction

Oxygen is essential for human survival and plays a crucial role in a wide range of physiological processes [1]. Oxygen therapy is among the most common interventions in critical illnesses [2]. However, excessive oxygenation can enhance the production of reactive oxygen species, which can lead to oxidative damage to cellular components, including DNA, lipids, and proteins, ultimately resulting in cell death and inflammation that further damages lung tissue [3,4,5]. Although previous studies in volunteers and experimental models have investigated the detrimental effects of hyperoxia [6, 7], overuse of oxygen remains prevalent in intensive care units (ICUs), particularly among patients with hypoxemic respiratory failure [8,9,10].

Therefore, conservative oxygen therapy (COT) has been proposed in recent years to prevent excessive oxygen exposure in patients. In 2016, a single-center, open-label randomized controlled trial (RCT) indicated that a conservative protocol resulted in lower ICU mortality [11]. Two years later, the IOTA systematic review and meta-analysis suggested that a liberal oxygen therapy (LOT) strategy above a peripheral oxygen saturation (SpO2) range of 94–96% is associated with increased mortality; this result supported the conservative administration of oxygen therapy in acutely ill patients [12]. However, such findings were not supported by subsequent studies, which yielded conflicting results despite similar designs [13,14,15,16,17,18,19,20,21,22,23,24]. Despite recommendations for oxygenation targets [25, 26], the clinical efficacy of oxygenation strategies in critically ill patients remains uncertain.

As new trial data have been published recently [27, 28], we present an updated review of this topic. In addition, due to the heterogeneity of patient characteristics in relation to different types of ICU, baseline oxygenation, and actual target oxygenation, it is difficult to interpret the results of the previous systematic reviews using pairwise meta-analysis [12,13,14,15,16,17,18,19,20,21,22,23,24]. It is essential to evaluate the target oxygenation and distinguish subpopulations who are likely to benefit from different oxygenation strategies. Accordingly, we focus on data from mixed ICUs vs. medical ICUs, different baseline P/F ratios (P/F), and actual target arterial partial pressure of oxygen (PaO2), and conduct a systematic review and meta-analysis of the RCTs. Specifically, we compared COT versus LOT in those subpopulations to explore the optimal oxygenation targets in ICU patients.

Materials and methods

We designed the study and wrote this report according to the Preferred Reporting Items for Systematic Reviews and Meta-Analysis Protocols checklist (PRISMA-P) [29] (the checklist is presented in Additional file 1: S6) and the principles of the Cochrane Handbook, including the Methodological Expectations of Cochrane Intervention Reviews standards [30, 31]. We used GRADEpro GDT to assess the certainty of the results [32, 33]. Trial sequential analysis (TSA) [34, 35] was performed using TSA version 0.9.5.10 to further investigate the effects of COT and LOT; this involved defining the required information size, using the O’Brien-Fleming boundaries to adjust the thresholds for statistical significance each time a trial was included, and introducing a threshold for futility. The protocol of this systematic review was pre-registered in the International Prospective Register of Systematic Reviews (PROSPERO: CRD42023434202).
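To illustrate the notion of a required information size, the sketch below applies the conventional sample-size formula for comparing two proportions. It is not the calculation performed by the TSA software, which can additionally adjust for between-trial heterogeneity. The function name and the control-group event proportion of 0.375 are assumptions chosen for illustration; the 20% relative risk reduction, 5% type I error, and 10% type II error are the values stated under Statistical analysis.

```python
from math import ceil
from statistics import NormalDist


def unadjusted_information_size(control_rate, rrr, alpha=0.05, beta=0.10):
    """Total number of patients (both groups combined) needed to detect a
    relative risk reduction `rrr`, without any heterogeneity adjustment."""
    std_normal = NormalDist()
    z_alpha = std_normal.inv_cdf(1 - alpha / 2)  # two-sided type I error
    z_beta = std_normal.inv_cdf(1 - beta)        # power = 1 - beta

    experimental_rate = control_rate * (1 - rrr)
    delta = control_rate - experimental_rate         # absolute risk difference
    p_avg = (control_rate + experimental_rate) / 2   # average event proportion
    variance = p_avg * (1 - p_avg)

    return ceil(4 * (z_alpha + z_beta) ** 2 * variance / delta ** 2)


# Hypothetical control-group mortality of 37.5% (an assumption for illustration).
n_required = unadjusted_information_size(control_rate=0.375, rrr=0.20)
print(n_required)
```

Because this figure ignores heterogeneity, it should be expected to be smaller than a heterogeneity-adjusted information size reported by TSA.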

Search strategy

Two authors (XYL and WT) independently searched PubMed, Embase, Web of Science, Scopus, Cochrane Central Register of Controlled Trials, ClinicalTrials.gov, MedRxiv, and BioRxiv for records published before April 2024, focusing on adult ICU patients subjected to two different oxygenation strategies. The details of the search strategy are summarized in the additional files to this report (Additional file 1: Search strategy). We searched for RCTs published in English for which full texts were available. All of the references listed in the included studies were reviewed, and the relevant studies were manually searched.

Inclusion and exclusion criteria

Eligible studies were those that met the following criteria: (1) the enrolled adult patients were admitted to an ICU; (2) patients were randomly assigned to a COT group or a LOT group; (3) oxygenation targets in the two groups were defined by PaO2, arterial oxygen saturation (SaO2), or SpO2 rather than a constant fraction of inspired oxygen (FiO2); we did not set a priori oxygenation thresholds for the two groups, to ensure inclusion of all relevant trials; and (4) mortality was included as a primary or secondary outcome. Exclusion criteria were as follows: (1) the patients did not meet screening criteria; (2) the study included only patients at risk for ischemia or hypoxic encephalopathy (including traumatic brain injury, stroke, myocardial infarction, and cardiac arrest) or patients who underwent surgery (including trauma and coronary artery bypass surgery); and (3) the publication was not in English; was a conference report, commentary, or review; and/or represented a redundant publication from a single study.

Outcomes and definition

The primary outcomes of interest were 30-day (including 28-day) mortality, 90-day mortality, and mortality at the longest follow-up in each study. We performed sensitivity analyses according to the two different ICU types (mixed/medical). A mixed ICU was defined as a unit that included both medical and surgical patients, whereas a medical ICU was one that included only medical patients. We also performed subgroup analyses according to baseline P/F at enrollment (mild to moderate hypoxemia, P/F ≥ 150 mmHg; moderate to severe hypoxemia, P/F < 150 mmHg) and the actual PaO2 (PaO2 in the COT group ≥ 80 mmHg or < 80 mmHg). The secondary outcomes were ICU length of stay, hospital length of stay, days free from mechanical ventilation support (MVF), vasopressor-free time (VFT), and adverse events.

Data extraction and quality assessment

Two investigators (XYL and WT) independently extracted and recorded the desired information from the included studies based on the Cochrane recommendations [28], consisting of the first author, year of publication, setting, country, sample size, intervention protocols, demographic and illness characteristics of patients, and study outcomes. Authors were contacted in cases of missing data or if the reporting format was not suitable for the meta-analysis (e.g., data of surgical and medical patients were reported together). Data were extracted using software (GetDataW) when presented only in a figure in the trial, or were taken in part from secondary analyses of other studies. Continuous data were extracted as sample size and mean (standard deviation, SD) or median (interquartile range, IQR) as provided in the studies, with conversion of medians to estimated mean (SD). Any discrepancies that arose were resolved by the involvement of a third author (BD or HJH).
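The conversion of medians to estimated mean (SD) can be sketched as follows. The specific conversion method is not stated in this report, so the quartile-based approximation of Wan et al. (2014) is used here purely as an assumed example; the function name and the input values are hypothetical.

```python
from statistics import NormalDist


def mean_sd_from_median_iqr(q1, median, q3, n):
    """Approximate mean and SD from a median and interquartile range.

    Assumed method (not confirmed by the source): quartile-based
    estimators, which presume roughly normal underlying data.
    """
    if not q1 <= median <= q3:
        raise ValueError("expected q1 <= median <= q3")
    if n < 2:
        raise ValueError("sample size must be at least 2")

    mean = (q1 + median + q3) / 3
    # Expected distance between the quartiles in SD units for a sample of size n;
    # tends to about 1.35 as n grows.
    z = NormalDist().inv_cdf((0.75 * n - 0.125) / (n + 0.25))
    sd = (q3 - q1) / (2 * z)
    return mean, sd


# Hypothetical ICU length of stay: median 6 days (IQR 4 to 11) in 200 patients.
est_mean, est_sd = mean_sd_from_median_iqr(4, 6, 11, 200)
print(round(est_mean, 2), round(est_sd, 2))
```

Such approximations are less reliable for strongly skewed variables, which is common for length-of-stay data.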

To evaluate the quality of the eligible RCTs, we used the risk of bias tool recommended by the Cochrane Collaboration [29]. The potential sources of bias were rated according to the following items: sequence generation, allocation concealment, blinding, incomplete outcome data, selective outcome reporting, and other sources of bias. In addition, the quality of evidence was assessed according to the GRADE (Grading of Recommendations Assessment, Development and Evaluation) guidelines on the basis of study limitations, imprecision, inconsistency, indirectness, and publication bias for the targeted outcomes [33]. Publication bias was evaluated by visually inspecting funnel plots.

Statistical analysis

Meta-analyses were performed and forest plots generated using the Cochrane systematic review software Review Manager (RevMan; version 5.4.1; The Nordic Cochrane Center, The Cochrane Collaboration, Copenhagen, Denmark). Dichotomous outcomes are expressed as risk ratios (RRs) with 95% confidence intervals (CIs), while continuous outcomes are expressed as weighted mean differences (MDs) with 95% CIs; mean (SD) values were estimated from medians and IQRs for further comparison. Heterogeneity was tested using I2 statistics. A fixed-effect model was applied when I2 < 50%, indicating insignificant heterogeneity, whereas a random-effects model was chosen for cases of significant heterogeneity (I2 > 50%). P < 0.05 was considered statistically significant. A relative risk reduction (RRR) of 20%, a type I error level of 5%, and a type II error level of 10% were used in TSA.
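The model-selection rule described above can be sketched in a few lines. This is a simplified illustration and not the RevMan implementation: it uses inverse-variance weights on the log risk ratio and a DerSimonian-Laird estimate of between-trial variance, whereas RevMan also offers Mantel-Haenszel weighting for dichotomous data. The function name and the trial counts are hypothetical, trials with zero events are assumed absent, and an I2 of exactly 50% (which the rule above leaves unspecified) is treated here as fixed-effect.

```python
from math import exp, log, sqrt
from statistics import NormalDist


def pool_risk_ratios(trials, i2_threshold=50.0):
    """Pool (events_cot, n_cot, events_lot, n_lot) tuples into one RR."""
    log_rr, var = [], []
    for e1, n1, e2, n2 in trials:
        log_rr.append(log((e1 / n1) / (e2 / n2)))
        var.append(1 / e1 - 1 / n1 + 1 / e2 - 1 / n2)  # variance of log RR

    # Fixed-effect (inverse-variance) estimate and Cochran's Q
    w = [1 / v for v in var]
    fixed = sum(wi * y for wi, y in zip(w, log_rr)) / sum(w)
    q = sum(wi * (y - fixed) ** 2 for wi, y in zip(w, log_rr))
    df = len(trials) - 1
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0

    model, weights = "fixed", w
    if i2 > i2_threshold:
        # DerSimonian-Laird between-trial variance
        c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
        tau2 = max(0.0, (q - df) / c)
        weights = [1 / (v + tau2) for v in var]
        model = "random"

    pooled = sum(wi * y for wi, y in zip(weights, log_rr)) / sum(weights)
    se = sqrt(1 / sum(weights))
    z = NormalDist().inv_cdf(0.975)  # 95% confidence interval
    return {
        "model": model,
        "i2": i2,
        "rr": exp(pooled),
        "ci": (exp(pooled - z * se), exp(pooled + z * se)),
    }


# Hypothetical trial counts, not taken from the included studies.
homogeneous = pool_risk_ratios([(30, 100, 40, 100)] * 3)
heterogeneous = pool_risk_ratios([(10, 100, 40, 100), (40, 100, 10, 100)])
print(homogeneous["model"], heterogeneous["model"])
```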

Results

Study selection

Following the search strategy, 21,367 records were imported for screening, and 11,712 records were screened by titles and abstracts after removal of duplicates or records excluded for other reasons. Of these, 10,942 studies were excluded for either not being RCTs or not including adult patients. Furthermore, 68 reports could not be retrieved for evaluation. The remaining 702 studies underwent full-text assessment, of which 689 were excluded for the following reasons: inaccurate intervention, undesirable patient population, or presenting no outcomes of interest. Thus, 13 eligible RCTs [11, 27, 28, 36,37,38,39,40,41,42,43,44,45] were ultimately included in the meta-analysis, as depicted in Fig. 1. Two post-hoc subgroup analyses [46, 47] of the HOT-ICU trial [39] were also included for subgroup analyses.
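The counts reported above can be checked with simple arithmetic. The short sketch below is only a consistency check on the figures in this paragraph; the function and key names are hypothetical.

```python
def screening_flow(imported, screened, excluded_at_screening,
                   not_retrieved, excluded_full_text):
    """Derive the intermediate counts of a study flow diagram."""
    removed_before_screening = imported - screened
    sought_for_retrieval = screened - excluded_at_screening
    assessed_full_text = sought_for_retrieval - not_retrieved
    included = assessed_full_text - excluded_full_text
    return {
        "removed_before_screening": removed_before_screening,
        "sought_for_retrieval": sought_for_retrieval,
        "assessed_full_text": assessed_full_text,
        "included": included,
    }


flow = screening_flow(
    imported=21_367,
    screened=11_712,
    excluded_at_screening=10_942,
    not_retrieved=68,
    excluded_full_text=689,
)
print(flow)
```

The derived values agree with the 702 full-text assessments and 13 included RCTs stated in the text.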

Fig. 1 Study flow diagram

Study description and quality assessment

The main characteristics of the 13 RCTs and 2 post-hoc subgroup analyses are summarized in Table 1, and further demographic details are shown in Additional file 1: Tables S1 and S2. All RCTs were performed in ICUs; 9 were conducted in mixed ICU settings [11, 27, 36,37,38,39, 41,42,43] and 6 were conducted in medical ICU settings [28, 40, 44,45,46,47]. The mean baseline P/F varied among the included studies, being ≥ 150 mmHg in 8 studies [11, 27, 36, 38, 40, 42,43,44] and < 150 mmHg in 6 studies [28, 39, 41, 45,46,47]. The mean actual PaO2 levels in both the COT and LOT groups also varied among the included studies. In the COT group, it ranged from 61 to 87 mmHg, being ≥ 80 mmHg in 5 studies [11, 37, 38, 42, 43] and < 80 mmHg in 10 studies [27, 28, 36, 39,40,41, 44,45,46,47]. In the LOT group, it ranged from 76 to 115 mmHg in 15 studies [11, 27, 28, 36,37,38,39,40,41,42,43,44,45,46,47]. Regarding the difference in mortality, the rate was 5% lower in the COT group than in the LOT group in 3 studies [11, 37, 42], within 5% between the two groups in 8 studies [27, 28, 36, 38, 39, 43, 44, 46], and 5% higher in the COT group than in the LOT group in 4 studies [40, 41, 45, 47].

Table 1 Characteristics of the studies included

The results of the quality assessment of the included studies are shown in Additional file 1: Fig. S1. No selection bias was found in the 13 studies, but a high risk of performance bias was found owing to their unblinded designs. Symmetrical funnel plots of mortality rate showed no significant publication bias (Additional file 1: Fig. S2).

Outcomes

All 13 RCTs [11, 27, 28, 36,37,38,39,40,41,42,43,44,45] (10,632 patients) reported mortality outcomes: 4 reported 30-day mortality [11, 37, 44, 45]; 5 reported both 30- and 90-day [27, 38, 40,41,42] (and 180-day [38]) mortality; and 4 reported 90-day mortality [28, 36, 39, 43]. The mortality at the longest follow-up in the COT and LOT groups was 37.04% (1927 of 5202 patients) and 37.51% (1655 of 4412 patients), respectively, with no significant difference between the two groups (Fig. 2A). The results of the TSA are shown in Fig. 2B. The cumulative Z curve crossed the futility boundary but did not cross the conventional boundary, and 85.89% (9,614 of 11,194 patients) of the required information size was accrued. The results indicated that, when compared with LOT, COT did not reduce the relative risk of the longest follow-up mortality by 20% among ICU patients. The certainty of the evidence was very low (GRADE, Additional file 1: Fig. S3). Also, no significant differences were observed in the analysis of 30-day mortality [11, 27, 37, 38, 40,41,42, 44, 45] or 90-day mortality [27, 28, 36, 38,39,40,41,42,43] between the two groups (Additional file 1: Fig. S4, S5).
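The percentages quoted above follow directly from the reported counts, as the sketch below shows. The crude risk ratio it also computes simply collapses all trials into a single two-by-two table; it is an illustrative assumption and not the pooled estimate shown in Fig. 2A, which weights each trial separately.

```python
from math import exp, log, sqrt

# Counts as reported for mortality at the longest follow-up
cot_deaths, cot_n = 1927, 5202
lot_deaths, lot_n = 1655, 4412
required_information_size = 11_194

cot_rate = cot_deaths / cot_n
lot_rate = lot_deaths / lot_n
accrued_fraction = (cot_n + lot_n) / required_information_size

# Crude (unpooled) risk ratio with a normal-approximation 95% CI.
# Illustration only: trial-level stratification is ignored.
crude_rr = cot_rate / lot_rate
se_log_rr = sqrt(1 / cot_deaths - 1 / cot_n + 1 / lot_deaths - 1 / lot_n)
crude_ci = (
    exp(log(crude_rr) - 1.96 * se_log_rr),
    exp(log(crude_rr) + 1.96 * se_log_rr),
)

print(f"COT {cot_rate:.2%}, LOT {lot_rate:.2%}, accrued {accrued_fraction:.2%}")
print(f"crude RR {crude_rr:.3f} ({crude_ci[0]:.3f} to {crude_ci[1]:.3f})")
```

A crude interval spanning 1 is in line with the absence of a significant difference reported above, although only the trial-weighted analysis supports formal inference.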

Fig. 2 Mortality at the longest follow-up and TSA of the included studies. A Mortality at the longest follow-up; B TSA of the included studies

In subgroup analyses, there were no significant differences in mortality at any analyzed time point (30-day, 90-day, longest follow-up) between the two groups in terms of ICU admission type, baseline P/F, or actual PaO2 (Fig. 3; Additional file 1: Fig. S6, S7). Further details of the TSA results are shown in Additional file 1: Fig. S8–10.

Fig. 3 Subgroup analyses of mortality

No differences were found in terms of ICU length of stay [11, 27, 36, 40, 42, 43, 45], hospital length of stay [11, 27, 36, 40], or VFT days [36, 38, 42, 44] between the COT and LOT groups, but MVF days [11, 27, 28, 36, 38, 42,43,44] were significantly longer in the LOT group than in the COT group (Additional file 1: Fig. S11–13). The certainty of the evidence was low to very low (Additional file 1: Fig. S3).

Adverse events were reported in 7 RCTs [27, 28, 38, 39, 42,43,44], including organ failure, shock, infection, ICU-acquired weakness, seizure, and delirium (Additional file 1: Table S3). The incidence of adverse events was significantly lower in the COT group than in the LOT group. The results of TSA indicated that the benefit of COT could be considered conclusive with the available evidence (Additional file 1: Fig. S14). However, the certainty of the evidence was low (Additional file 1: Fig. S3). In the subgroup analyses, the incidence of adverse events in the COT group was significantly lower among patients enrolled in mixed ICU settings and those with baseline P/F ≥ 150 mmHg. However, no differences were found between the two oxygenation strategies for patients enrolled in medical ICU settings and those with baseline P/F < 150 mmHg (Fig. 4, Additional file 1: Fig. S13).

Fig. 4 Overall and subgroup analyses of adverse events

Discussion

In this systematic review and meta-analysis, we found that COT did not reduce the mortality rate relative to LOT in ICU patients; this was also true across the different ICU types, baseline P/F, and actual PaO2 in the subgroup analyses. Some studies showed a slight trend of higher mortality in the COT group with actual PaO2 < 80 mmHg and in medical ICU settings. COT did not affect ICU length of stay, hospital length of stay, or VFT; it affected only MVF days. The incidence of adverse events was significantly lower in the COT group (among patients enrolled in mixed ICU settings and with baseline P/F ratio ≥ 150 mmHg) than in the LOT group, but no differences were found between the two oxygenation strategies for patients enrolled in medical ICU settings and with baseline P/F < 150 mmHg.

Our review has several strengths. First, we included only studies involving ICU patients, and new high-quality RCTs were included [27, 28]. Second, we performed TSA with adjusted CIs to control for the risk of random errors due to multiple outcomes, sparse data, and repetitive testing on accumulated data, and to evaluate the benefits and harms of COT versus LOT in critically ill patients. We also contacted relevant trial authors if additional information was required. Third, to explore the robustness of our results and the influence of hypoxemia severity, ICU population, and actual oxygenation on our primary outcome, we performed subgroup analyses according to ICU population, baseline oxygenation, and actual PaO2.

In this review, COT did not reduce the mortality of ICU patients. In recent meta-analyses summarized in Additional file 1: Table S5 [12,13,14,15,16,17,18,19,20,21,22,23,24, 48], only two earlier reviews found that COT may result in a lower mortality rate. The IOTA meta-analysis [12] was conducted in both ICUs and ordinary care settings, considering all types of diseases, and it suggested that a LOT strategy above an SpO2 range of 94–96% was associated with increased mortality, but subgroup analyses revealed no significant differences in critical care patients. Another meta-analysis that included ICU patients from only four RCTs found that COT resulted in a significantly lower mortality rate [48]. However, such findings have not been supported by any subsequent meta-analyses published after 2019 [13, 24]. Our systematic review, which includes more RCTs, further confirms these results. However, because blinding of participants and/or personnel is not possible, the certainty of the evidence was low. Still, the TSA indicated that over 80% of the required information size was accrued, and the available evidence was sufficient to assess the benefit or harm between the two groups.

No differences in mortality were found between the two oxygenation strategies for patients enrolled in medical and mixed ICU settings. Compared to the previous meta-analyses [13, 24], we included more studies in the subgroup analysis. However, the TSA indicated that in most subgroups, the samples did not reach the required information size. We noticed that there was a difference in mortality of up to 5% between the two groups in some studies (Table 1). Interestingly, studies showing a trend of higher mortality in the COT group were mostly conducted in medical ICU settings [40, 41, 45, 47], and the main diagnoses of the included patients were medical diseases, mainly respiratory diseases (Additional file 1: Table S1). In contrast, studies showing a trend of lower mortality in the COT group were all conducted in mixed ICU settings [11, 37, 42]. We summarized the comorbidities of patients in medical and mixed ICUs and found that the incidence of cardiovascular, respiratory, and digestive diseases was significantly higher in medical ICU patients than in mixed ICU patients (Additional file 1: Table S4). It is reasonable to assume that the patients in medical ICUs may have had more severe gas-exchange impairments and refractory hypoxemia, requiring more oxygen. It may also be worth noting that the COT strategy avoids hyperoxemia but exposes patients to a higher risk of hypoxia, especially in these patients with more comorbidities [26]. Clinical trials comparing different oxygenation groups for these specific patient groups are needed; if possible, such studies should also incorporate stratification of important baseline risk factors (e.g., comorbidities).

In most included RCTs, SpO2 has been the primary parameter defining the target oxygenation range, but discrepancies sometimes exist between the targeted goals and the actual levels. In fact, PaO2 is superior for defining oxygenation target levels precisely and minimizing overlap between two groups [49]. Zhao et al. [17] performed a systematic review according to oxygenation goals and found that different oxygenation goals do not lead to different mortality in mechanically ventilated critically ill patients. Further, we performed the subgroup analysis based on the actual PaO2, and no differences in mortality were found between COT and LOT. However, we noticed a trend (Table 1) in which the COT group with lower actual PaO2 (< 80 mmHg) may have a higher mortality rate than the LOT group in some studies [40, 41, 45, 47], whereas the COT group with higher PaO2 (≥ 80 mmHg) may have lower mortality in some studies [11, 37, 42], and the actual PaO2 was essentially equal to that of the LOT group in some other studies (Table 1). As the normal range for PaO2 at sea level in healthy individuals is 80 to 100 mmHg [50], the COT strategy may not represent permissive hypoxia, which has not been well studied in adults. The observed degree of difference in mortality may have clinical significance, and thus more careful oxygen titration with “permissive hypoxia” should be considered in these patients until more robust evidence is available.

COT was not associated with any advantages in terms of ICU length of stay, hospital length of stay, or VFT compared with LOT; only MVF days differed. We believe that many factors contributed to these results, such as the primary diseases of patients admitted to different ICUs, the severity of baseline disease, and the treatment effects.

We also found that the incidence of adverse events was significantly higher with LOT than with COT among patients enrolled in mixed ICU settings and those with a baseline P/F ≥ 150 mmHg. This is consistent with previous conclusions that higher oxygenation targets are associated with more adverse events [10]. However, no differences were found between the two oxygenation strategies for patients enrolled in medical ICU settings and those with baseline P/F < 150 mmHg. This may also be due to the fact that these patients have more severe gas-exchange impairments and refractory hypoxemia, where the adverse events are offset by the benefits of corrected hypoxia from oxygen therapy. However, due to the different definitions of adverse events and the inadequate blinding, more robust data are needed for a more compelling conclusion. We should also pay close attention to the microcirculation, long-term neurological function, and other complications; new technological approaches, such as biomarkers, can also be considered in future research.

There may be some implications for practice and future research. Given the presence of confounding factors in many existing RCTs, it remains of paramount importance to continue to conduct clinical trials, ideally comparing groups with a clinically relevant contrast between specific patient groups, such as according to the type and severity of disease of patients in the respiratory ICU; machine learning methods using data from these trials could also be utilized to build models for individual patients [51]. Meanwhile, the oxygenation strategies in all trials were grouped by PaO2, SpO2, or FiO2, all of which required manual adjustment during titration. New techniques, such as automated oxygen titration, may better identify the suitable oxygenation target for a specific population in future research [52]. Finally, adverse events are important signals for clinical practice guidelines; it may be necessary to take adverse events as primary or secondary outcomes, as more robust data are needed for a compelling conclusion.

Some limitations of our review should be mentioned. First, the definitions of COT and LOT were not fully concordant among the studies assessed, and the actual oxygenation levels also differed widely. Second, clinical heterogeneity among studies is a common concern. Third, inadequate blinding is often associated with performance bias. Finally, TSA indicated that the information size was insufficient for most outcomes, especially in most subgroups.

Conclusion

In conclusion, this systematic review and meta-analysis found that COT did not reduce all-cause mortality at 30 days, 90 days, or the longest follow-up in ICU patients. There was a trend, without statistical significance, toward increased mortality among patients in the COT group with lower PaO2 in medical ICU settings; further studies are needed to confirm our findings. COT was associated with a lower incidence of adverse events among patients enrolled in mixed ICU settings and with baseline P/F ≥ 150 mmHg; however, no differences were found between the two oxygenation strategies for patients enrolled in medical ICU settings and with baseline P/F < 150 mmHg.