Introduction

Acute respiratory distress syndrome (ARDS) represents a serious problem in critically ill patients and is associated with an in-hospital mortality of between 33 and 52 % [1–3]. Although mechanical ventilation provides essential life support, it can worsen lung injury by regional alveolar overdistention, cyclic alveolar collapse with shearing (atelectrauma), and failure of collapsed alveolar units to re-open [4–6].

Atelectrauma plays a major role in ARDS [7] and may contribute to mortality [8]. In this context, atelectrauma may be prevented by re-expanding collapsed lung tissue with alveolar recruitment maneuvers (ARMs), which transiently raise transpulmonary pressure to levels higher than those achieved during tidal ventilation [9], and by preventing further collapse with positive end-expiratory pressure (PEEP). However, the application of ARMs in clinical practice remains controversial because such interventions increase intrathoracic pressure, reduce venous return, and may increase the risk of barotrauma.

The results of previous randomized controlled trials (RCTs) and systematic reviews on the effects of ARMs on survival, length of mechanical ventilation, and/or length of hospital stay of ARDS patients have been inconclusive [10–13]. However, other RCTs have been completed since these reviews were published, requiring a new appraisal of current evidence on the use of ARMs for ARDS. Therefore, we have conducted a systematic review and meta-analysis of the relevant RCTs to assess the effect of ARMs on mortality and other clinical outcomes in patients with ARDS. Part of the data included in this review has been presented as a poster at an international scientific meeting [14].

Methods

The recommendations of the Cochrane handbook for systematic reviews of interventions and of the PRISMA Statement—preferred reporting items for systematic reviews and meta-analyses—were followed during the design, implementation, and reporting of this study [15, 16]. The study protocol was approved by the Research Ethics Committee of São Paulo University on 29 August 2013 [Electronic Supplementary Material (ESM)].

Data sources and searches

We searched the following electronic databases (from inception to 1 July 2014) for relevant articles: MEDLINE, EMBASE, LILACS, Cumulative Index to Nursing and Allied Health Literature (CINAHL), Cochrane Central Register of Controlled Trials (CENTRAL), Scopus, and Web of Science. We placed no language restrictions and used controlled vocabulary whenever possible (MeSH terms for MEDLINE and CENTRAL; EMTREE for EMBASE). Keywords and their synonyms were used to optimize the search, and standard filters were applied to identify RCTs. We adapted our MEDLINE search strategy for use with other electronic databases (see terms used in ESM Table 1). We also hand-searched the reference lists of the included studies to identify other relevant trials. Finally, we attempted to identify unpublished or ongoing trials by contacting experts in the field and by searching clinical trial registries [ClinicalTrials.gov and International Standard Randomised Controlled Trial Number (ISRCTN) Register].

Study selection

We included RCTs that assessed the clinical effects of ARMs compared to no recruitment maneuvers in adult patients with ARDS. The ARM could have been applied as an isolated intervention or as part of a ventilation package. However, we did not include trials that applied different tidal volumes between groups because low tidal volume has a proven beneficial effect on mortality [17, 18]. Observational studies and trials that enrolled exclusively patients with barotrauma were excluded. Randomized cross-over trials were also excluded because carry-over effects may have biased the estimated effect of ARMs on clinical outcomes.

We defined an ARM as any technique that transiently increased the alveolar pressure above that of regular tidal ventilation, including—but not limited to—maneuvers involving sustained inflation, stepwise increase of PEEP, increase in tidal volume or controlled pressure, and extended sigh maneuvers.

Four teams of two reviewers (EAS teamed with KNS, LL, AMB, and DB, respectively) independently screened all retrieved citations by reviewing titles and abstracts. If at least one of the authors considered a citation potentially eligible for inclusion in our systematic review, the full text was obtained. Then, the four teams of two reviewers independently evaluated full-text manuscripts for eligibility using a standardized form (ESM). Duplicate publications or sub-studies of included trials were listed under the primary reference, as they may have provided additional relevant information that was not available in the original publication. Any disagreement within each team was resolved by third-party adjudication.

Data extraction and risk of bias assessment

The four teams of two reviewers independently extracted clinical data and assessed the risk of bias. Any disagreement was resolved by consensus or third-party adjudication. If additional information was required, we contacted the original authors by e-mail.

The following data were extracted from the included studies: study location, enrollment period, sample size, inclusion and exclusion criteria, baseline characteristics of the included patients, details of the experimental intervention (type and frequency of ARM, method used to adjust PEEP after ARM, and maintenance ventilation), details of control intervention, length of follow-up, and clinical outcomes.

The risk of bias of the included studies on our primary outcome was assessed by domain-based evaluation [15]. The domains assessed in this review were random sequence generation, allocation concealment, incomplete outcome data, selective outcome reporting, and early stopping for benefit. We did not assess the domains related to blinding of personnel, patients, or outcome assessors for the following reasons: (1) due to the nature of the intervention, it is not feasible to blind investigators and healthcare personnel to group allocation; (2) we assumed that participants were unaware of group allocation because they were critically ill or generally sedated, and consent for participation in the study was given by the next of kin; (3) lack of blinding of outcome assessors would not introduce a differential detection bias because the primary outcome assessed was mortality. For the remaining domains, we indicated “low risk of bias,” “high risk of bias,” or “unclear.” We classified trials as having a “lower risk of bias” if they were at low risk of bias in all domains assessed.

Outcomes

The primary outcome was in-hospital, all-cause mortality. If the authors did not report in-hospital mortality, we considered the relevant data at the maximum follow-up period reported. Secondary outcomes were barotrauma (pneumothorax, pneumomediastinum, subcutaneous emphysema, or pneumatocele), the need for rescue therapies (prone position, nitric oxide, high-frequency oscillatory ventilation, or extra-corporeal membrane oxygenation), the duration of mechanical ventilation (expressed as days free of mechanical ventilation from randomization to day 28 and mean or median number of days of mechanical ventilation), length of stay in the hospital and intensive care unit (ICU), and other adverse events.

Data synthesis and analysis

We presented the risk ratios (RR) and their respective 95 % confidence intervals (CI) for the binary outcomes of each trial. Meta-analysis was performed using the Mantel–Haenszel random effects model. Values for continuous outcomes were given as the mean ± standard deviation or as the median with the interquartile range. Three pre-specified subgroup analyses were conducted for the primary outcome: (1) trials with higher risk of bias versus trials with lower risk of bias; (2) adjusted PEEP levels after ARM in the experimental group versus similar PEEP levels in both groups; (3) ARM achieving a peak pressure of ≤40 cmH2O versus ARM achieving peak pressure of >40 cmH2O. We assessed statistical heterogeneity across trials or subgroups using Cochran’s chi-squared test [15], and the Higgins inconsistency statistic (I²) was used to quantify the percentage of the variability in the effect estimates that was due to heterogeneity rather than chance. We considered I² ≤ 25 % to indicate low heterogeneity and I² ≥ 75 % to indicate high heterogeneity [19]. We assessed the probability of publication bias by funnel plot and considered plot asymmetry to be suggestive of reporting bias. Plot asymmetry was tested using Egger’s test [20, 21]. All analyses were performed using Review Manager version 5.2 (Cochrane IMS, Oxford, UK) and Stata version 11.0 (StataCorp, College Station, TX).
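As a rough illustration of the pooling and heterogeneity statistics described above, the following Python sketch combines per-trial risk ratios under a DerSimonian–Laird random-effects model with inverse-variance weights. This is a simplification and not the Mantel–Haenszel implementation in Review Manager; the function name `pool_risk_ratios` and the example counts are hypothetical and are not data from the included trials.

```python
import math

def pool_risk_ratios(trials):
    """Pool risk ratios from 2x2 counts (hypothetical helper, illustration only).

    trials: list of (events_arm, n_arm, events_ctl, n_ctl); event counts must be > 0.
    Returns the pooled RR, its 95 % CI, Cochran's Q, and I^2 in percent.
    """
    log_rr = [math.log((a / n1) / (c / n2)) for a, n1, c, n2 in trials]
    var = [1 / a - 1 / n1 + 1 / c - 1 / n2 for a, n1, c, n2 in trials]

    # Fixed-effect (inverse-variance) estimate, needed to compute Cochran's Q.
    w = [1 / v for v in var]
    fixed = sum(wi * yi for wi, yi in zip(w, log_rr)) / sum(w)
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, log_rr))
    df = len(trials) - 1

    # DerSimonian-Laird between-trial variance, truncated at zero.
    scale = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - df) / scale) if scale > 0 else 0.0

    # Random-effects weights and pooled log risk ratio.
    w_re = [1 / (v + tau2) for v in var]
    pooled = sum(wi * yi for wi, yi in zip(w_re, log_rr)) / sum(w_re)
    se = math.sqrt(1 / sum(w_re))

    # I^2: share of variability attributed to heterogeneity rather than chance.
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    return {
        "rr": math.exp(pooled),
        "ci": (math.exp(pooled - 1.96 * se), math.exp(pooled + 1.96 * se)),
        "q": q,
        "i2": i2,
    }

# Hypothetical counts, for illustration only.
example = pool_risk_ratios([(30, 100, 40, 100), (15, 50, 20, 50), (12, 40, 14, 40)])
print(example)
```

When the trials agree closely, Q falls below its degrees of freedom, the between-trial variance is truncated to zero, and the random-effects result coincides with the fixed-effect one, which corresponds to the I² = 0 % situation.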

As the event size needed for a very precise meta-analysis is at least as large as that for a single optimally powered RCT, we calculated the optimal event size requirement for our meta-analysis considering a mortality rate of 36 % in the control group [3], a relative risk reduction of 20 %, 90 % power, and a type I error of 5 %. We chose a relative risk reduction of 20 % to calculate the optimal event size in order to have adequate power to detect even a small but clinically important effect; furthermore, this risk reduction value is the typical effect size observed in intensive care studies [17, 22]. Thus, the observation of at least 575 events would be needed. We performed a formal trial sequential analysis (TSA; TSA software version 0.9 Beta; Copenhagen Trial Unit, Copenhagen, Denmark) [23], using the optimal event size to construct sequential monitoring boundaries for our meta-analysis, analogous to interim monitoring in an RCT. We established boundaries limiting the global type I error to 5 %. As a sensitivity assessment, we also conducted TSA considering a stricter type I error of 1 %. This more conservative approach may be appropriate for a meta-analysis of small trials [24].
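The optimal event size can be approximated with a conventional two-proportion sample size formula. The sketch below is one plausible reconstruction, not necessarily the calculation used by the authors or by the TSA software: it assumes two equal-sized groups, a two-sided test, and the pooled-variance form of the formula. The function name `optimal_event_size` is hypothetical. With the inputs stated above it lands close to the 575 events reported.

```python
import math
from statistics import NormalDist

def optimal_event_size(p_control, rrr, power=0.90, alpha=0.05):
    """Approximate per-group sample size and total expected events.

    Assumes equal allocation and a two-sided test of two proportions
    (pooled-variance formula); an illustrative assumption, see the text.
    """
    p_treat = p_control * (1 - rrr)          # event rate under the assumed effect
    p_bar = (p_control + p_treat) / 2        # average event rate across groups
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)

    numerator = (
        z_alpha * math.sqrt(2 * p_bar * (1 - p_bar))
        + z_beta * math.sqrt(p_control * (1 - p_control) + p_treat * (1 - p_treat))
    ) ** 2
    n_per_group = math.ceil(numerator / (p_control - p_treat) ** 2)

    # Expected deaths across both groups at the assumed event rates.
    events = math.ceil(n_per_group * (p_control + p_treat))
    return n_per_group, events

n, events = optimal_event_size(p_control=0.36, rrr=0.20, power=0.90, alpha=0.05)
print(n, events)
```

Calling the same function with `alpha=0.01` gives a larger requirement, which illustrates why the stricter sensitivity analysis demands more events than the primary one.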

Quality of meta-analysis evidence

The quality of evidence generated by this meta-analysis was classified as high, moderate, low, or very low in accordance with the Grading of Recommendations Assessment, Development and Evaluation (GRADE) system [25]. The quality of evidence indicates our confidence that the evidence generated by the meta-analysis is definitive. According to the GRADE system, a precise result (associated with low random error or a small P value) is a necessary but not the only criterion for summarized evidence to be of high quality. The level of evidence of the meta-analysis was initially set as high and was downgraded when any of the following were present: high risk of bias, imprecision of estimate of effect, indirectness of evidence, inconsistency, or evidence of reporting bias. Evidence classified as “at high risk of bias” means that most of the included trials showed a high risk of bias in at least one of the domains assessed. “Imprecision” of estimates of effect indicates that there was an unacceptable random error in the estimate of effect generated by the meta-analysis. “Indirectness” of evidence occurred when the population, intervention, comparator, or outcome of the research question differed from those of the relevant studies; that is, there is “indirectness” if the review question was not directly addressed by the available evidence. “Inconsistency” indicates that the results of the individual trials differed from each other. Finally, “reporting bias” occurred when investigators failed to report studies (typically those that showed no effect) or outcomes (typically those that may be harmful or for which no effect was observed). We used GRADEpro version 3.6 (GRADE Working Group).

Results

Our search strategy identified 9,451 citations, of which 3,279 were duplicates (Fig. 1). After screening titles and abstracts, we selected the full-text version of 56 relevant citations for an in-depth analysis, of which 47 were subsequently excluded for the reasons listed in Fig. 1 (details are given in ESM Table 2). Before excluding the five trials that did not report clinical outcomes, we contacted the authors to obtain missing data but received no response. In addition, we identified one unpublished trial through contact with experts and obtained data relevant to our primary outcome at a conference attended by the authors [26]. As a result, ten trials involving a total of 1,594 participants were included in our systematic review and meta-analysis [11, 26–34].

Fig. 1 Study search and selection processes. RCT Randomized clinical trial, ARM alveolar recruitment maneuver

Characteristics of included trials

The included trials were published between 2003 and 2011. Two trials were published in Chinese [29, 30]. The sample size ranged from 17 to 985 patients. The characteristics of the trials are listed in Table 1.

Table 1 Characteristics of included trials

All included trials defined ARDS according to the American–European consensus conference (AECC) criteria [35]. Although two trials enrolled patients with a PaO2/FiO2 (partial pressure of oxygen in arterial blood/fraction of inspired oxygen) of ≤250 mmHg (that is, patients with acute lung injury according to the AECC definition), most of the patients in these trials had a PaO2/FiO2 of ≤200 mmHg at inclusion (85 % of patients in the trial conducted by Meade et al. [11] and 91 % of those in the trial conducted by Liu et al. [34]).

The ARMs differed in type, duration, and intensity across the studies. Four trials assessed the effect of sustained inflation [11, 28, 30, 32], two studies performed ARM by stepwise increase of PEEP with constant driving pressure [26, 33], two studies used stepwise increase of PEEP with stepwise decrease of tidal volume [27, 31], and one trial performed a one-step increase of PEEP while maintaining the driving pressure at 15 cmH2O [34]. The type of ARM was not clearly stated in one trial [29]. Four trials adjusted PEEP levels after the ARM in the experimental group [11, 26, 31, 33], and six trials used the same PEEP levels in both groups [27–30, 32, 34]. One trial evaluated the effect of ARM on patients already receiving nitric oxide therapy [28]. In one trial, patients in both groups were subjected to the same ventilator settings and co-interventions, allowing the assessment of the effects of ARMs alone [32]. Lim et al. also used the same ventilator settings; however, more patients in the experimental group were managed in the prone position [27].

Risk of bias

Three trials were considered to be at a lower risk of bias because they showed a low risk of bias in all of the domains assessed [11, 26, 33]. In four trials, the generation of the randomization list and the allocation concealment were adequate [11, 26, 32, 33]; in one trial, the allocation list was not concealed [30]; and in the remaining five trials, insufficient information on the randomization method or allocation concealment was provided. Outcome data on mortality were complete in seven trials [26–30, 33, 34]. In one trial, two of the 985 patients were not included in the analysis [11]. We classified two trials as being at a high risk of attrition bias because of post-randomization exclusions [31, 32]. In the trial by Huh et al., one patient from the experimental group and three in the control group were excluded due to low blood pressure [31]. In the study by Xi et al., 12 patients in the experimental group and three in the control group were excluded because they did not adhere to the study protocol [32]. None of the trials were stopped early for benefit (ESM Table 3).

Primary outcome

Ten trials [11, 26–34] were included in our meta-analysis assessing the effect of ARM on the mortality of patients with ARDS (Fig. 2). In-hospital mortality was 36 % in the ARM group and 42 % in the control group (RR 0.84; 95 % CI 0.74–0.95; I² = 0 %). The total number of deaths was 612, which is greater than the optimal event size (575 events); that is, the TSA indicated an overall type I error of <5 % for the meta-analysis result. Conversely, when a more conservative type I error of 1 % was considered, the number of events was insufficient, and the cumulative meta-analysis did not cross the efficacy-monitoring boundary (Fig. 3).

Fig. 2 Forest plots showing the effects of ARMs on clinical outcomes of patients with acute respiratory distress syndrome (ARDS). CI Confidence interval, M–H Mantel–Haenszel test. See Table 1 for more detail on the study or subgroup

Fig. 3 Trial sequential analysis assessing the effect of ARMs on in-hospital mortality. Cumulative meta-analysis with 612 in-hospital deaths (blue line) crossed the efficacy monitoring boundary for the primary outcome (purple line), i.e., the overall type I error is <5 %. Considering a global type I error of 1 %, the cumulative meta-analysis did not cross the efficacy monitoring boundary and the optimal event size of 814 (green line) was not reached. Optimal event size Event size needed for a very precise meta-analysis (which is at least as large as that for a single optimally powered randomized controlled trial)

The funnel plot was visually asymmetric (ESM Fig. 1). Although the test for asymmetry was not statistically significant (P = 0.055), this pattern suggests that publication bias or bias associated with smaller trials may be an issue.

Secondary outcomes

The effects of ARMs on secondary outcomes are presented in Fig. 2. Recruitment maneuvers were not associated with an increased risk of barotrauma (RR 1.11; 95 % CI 0.78–1.57; I² = 0 %). Length of mechanical ventilation and length of stay in ICU and hospital were reported using different metrics among the trials; therefore, we did not conduct a meta-analysis using data on these parameters. Most of the trials showed no between-group differences in the length of mechanical ventilation and length of time in the ICU or hospital (Table 2). There was also no difference in the rates of severe hypoxemia requiring rescue therapies between groups (RR 0.76; 95 % CI 0.41–1.40; I² = 56 %). The most commonly observed adverse effects after ARM were transient hypotension and desaturation (ESM Table 4).

Table 2 Duration of mechanical ventilation, length of stay in intensive care unit and hospital

Subgroup analyses

Pooled analysis of the trials with a lower risk of bias showed a smaller effect on mortality (RR 0.90; 95 % CI 0.78–1.04; I² = 0 %) than the pooled analysis of trials with higher risk of bias (RR 0.72; 95 % CI 0.58–0.89; I² = 0 %); the P value for subgroup differences was 0.09 (Fig. 4). The effect of ARM on mortality in the subgroup of trials that adjusted PEEP after ARM in the experimental group was less pronounced (RR 0.89; 95 % CI 0.78–1.03; I² = 0 %) than that in the subgroup with similar PEEP levels (RR 0.69; 95 % CI 0.54–0.88; I² = 0 %); the P value for subgroup differences was 0.07 (ESM Fig. 2). The effect of ARM on mortality was similar in the subgroup of studies in which the ARM reached a peak pressure of ≤40 cmH2O (RR 0.83; 95 % CI 0.71–0.97; I² = 6 %) and in the subgroup with a peak pressure of >40 cmH2O (RR 0.85; 95 % CI 0.65–1.12; I² = 0 %); the P value for subgroup differences was 0.84 (ESM Fig. 3).
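A test for subgroup differences between two subgroups can be approximated from the published risk ratios and confidence intervals alone. The sketch below is an assumption-laden reconstruction rather than the Review Manager computation: it recovers each standard error on the log scale from the 95 % CI and compares the two log risk ratios with a z-test, which is equivalent to a chi-squared test with one degree of freedom. The function name `subgroup_difference_p` is hypothetical, and because the inputs are rounded, the result only approximates the reported P values.

```python
import math
from statistics import NormalDist

def subgroup_difference_p(rr_a, ci_a, rr_b, ci_b):
    """Two-sided P value for the difference between two pooled risk ratios.

    Standard errors are back-calculated from the 95 % CIs on the log scale,
    so rounding in the published values carries through to the result.
    """
    z975 = NormalDist().inv_cdf(0.975)

    def log_se(ci):
        lower, upper = ci
        return (math.log(upper) - math.log(lower)) / (2 * z975)

    diff = math.log(rr_a) - math.log(rr_b)
    se_diff = math.sqrt(log_se(ci_a) ** 2 + log_se(ci_b) ** 2)
    z = diff / se_diff
    return 2 * (1 - NormalDist().cdf(abs(z)))

# Lower versus higher risk of bias, using the rounded values reported above.
p_bias = subgroup_difference_p(0.90, (0.78, 1.04), 0.72, (0.58, 0.89))
print(round(p_bias, 2))
```

Applied to the risk-of-bias subgroups, this approximation gives a P value near the reported 0.09, consistent with a suggestive but not statistically significant difference between subgroups.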

Fig. 4 Forest plots showing the effects of ARMs on in-hospital mortality for the subgroup of trials with a lower risk of bias versus trials with a higher risk of bias. See Table 1 for more detail on the study or subgroup

Quality of the meta-analysis evidence

We classified the quality of evidence generated by the meta-analysis for the primary outcome as low. Our reasons for downgrading the quality of evidence were the risk of bias of the included studies and the indirectness of evidence. We considered that the current evidence was indirect because most of the trials assessed ARM as part of a ventilatory package, with differences in other variables beyond ARM, and not as an isolated intervention. Furthermore, publication bias and the imprecision of estimate of effect could not be completely ruled out.

Discussion

Our systematic review suggests that ARMs are associated with lower mortality in patients with ARDS. Our meta-analysis revealed that the absolute difference in mortality was approximately 6 %, suggesting that for every 17 patients with moderate to severe ARDS treated with ARM, one in-hospital death was prevented. Although fewer patients who received ARMs received rescue therapies for refractory hypoxemia, this difference was not statistically significant. Furthermore, ARMs were not associated with an increase in the risk of barotrauma. Most trials showed no differences between the groups in terms of the length of mechanical ventilation and the length of stay in the ICU or hospital, and other adverse events seemed to be transient and self-limited. Additionally, most patients (90 %) included in the primary studies had moderate to severe ARDS according to the Berlin definition, which is a PaO2/FiO2 of ≤200 mmHg [36]. In this context, we believe that the result of our primary analysis is applicable to patients with moderate–severe ARDS.
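The figure of 17 patients follows from the absolute risk reduction: the number needed to treat is the reciprocal of the difference in mortality between groups, rounded up. A minimal sketch, using the pooled mortality proportions reported in the Results (42 % in the control group, 36 % in the ARM group) and a hypothetical helper name:

```python
import math

def number_needed_to_treat(risk_control, risk_treated):
    """Number needed to treat = 1 / absolute risk reduction, rounded up."""
    arr = risk_control - risk_treated
    if arr <= 0:
        raise ValueError("no absolute risk reduction; NNT is undefined")
    return math.ceil(1 / arr)

# Pooled in-hospital mortality: 42 % (control) versus 36 % (ARM).
nnt = number_needed_to_treat(0.42, 0.36)
print(nnt)
```

Because the result depends on the baseline risk, the same relative effect would correspond to a different number needed to treat in a population with lower or higher control-group mortality.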

The reduction of atelectrauma is the mechanism that possibly mediates the beneficial effect of ARMs on mortality. In support of this notion, experimental and clinical evidence suggests that it is important to adjust PEEP after ARM in order to keep the alveoli open [37–39]. Conversely, performing ARM and returning to the previous PEEP causes transient alveolar distention with no prolonged benefit [12, 40]. However, in our subgroup analysis, the effect did not differ significantly between the subgroup that adjusted PEEP after ARM and the subgroup that did not. It is important to note that the trials without PEEP adjustment in the ARM group are also the smaller ones, which are at a higher risk of bias and more likely to report beneficial effects.

It is possible that the effect of ARMs would be more pronounced if all of the studies had performed maneuvers achieving a pressure of >40 cmH2O. Previous case series suggest that 54–71 % of patients with ARDS who received an ARM require >40 cmH2O to achieve full recruitment [38, 39]. We performed a subgroup analysis to explore the heterogeneity of studies according to ARM methodology and its potential effect on mortality. Our results showed that the effect of ARMs was similar in both subgroups. Nevertheless, it is possible that other differences between studies could explain the similar effect. For example, five of the six studies using ARMs at >40 cmH2O were also at higher risk of bias.

Two previous systematic reviews evaluated ARMs for patients with ARDS [12, 13]. The first review included 40 randomized and non-randomized studies [12], of which four were RCTs with distinct design features and outcomes: one applied ARMs to both groups [41], one was a cross-over trial [9], a third did not report effects on clinical outcomes [40], and the fourth was a preliminary report of the LOV study [42]. This review focused mostly on short-term effects, such as oxygenation and adverse events, and did not perform a meta-analysis due to heterogeneity in populations, interventions, and study outcomes between trials. The second review, published by Hodgson et al. [13], assessed the effects of ARMs on clinical outcomes and found no benefit on survival. This review included seven trials, but only two of these [10, 11] were considered for the meta-analyses assessing the effect on mortality. In our review, we identified nine additional trials [26–34]. Conversely, we decided not to include the trial that also used lower tidal volumes in the experimental group [10] because this intervention has a proven beneficial effect on mortality [17].

Our review has a number of strengths. First, our search strategy was comprehensive, including seven electronic databases, clinical trial registries, contact with experts, and hand-searches of the reference lists of included and other relevant studies. Second, we conducted eligibility assessment and data extraction in duplicate. Third, we evaluated the reliability and conclusiveness of the available evidence through formal TSA. Finally, we evaluated the quality of evidence using the GRADE system.

However, our review has a number of limitations that merit consideration. First, although our review was based on a pre-specified protocol, the protocol was not published in advance. Second, because we had only aggregate trial data available instead of individual patient data, we were unable to explore the effects of ARMs in some important subgroups. For example, it is possible that the ARMs are only beneficial in patients with moderate and severe ARDS. However, two trials included the full spectrum of ARDS and did not report outcomes according to severity [11, 34]. Nevertheless, more than 90 % of patients considered in this review had moderate or severe ARDS. Third, information on some outcomes was not available from all studies. Fourth, in some trials, ARMs were applied inconsistently and sometimes linked to mechanical ventilator disconnections, and we did not explore the effect of ARM repetitions. Additionally, the ventilator settings used to perform ARM and the method used to set PEEP varied across trials. Finally, other differences in management between trials were substantial; for example, in one trial, all patients received nitric oxide [28], one allowed for ventilation in the prone position (with a higher percentage of patients in the ARM group receiving prone ventilation) [27], and three trials used continuous doses of paralytics for all patients [27, 28, 31].

We classified the quality of evidence generated by this meta-analysis as low. The main reasons for downgrading the quality of the evidence were the high risk of bias observed in most of the trials and the indirectness of evidence. We considered that the evidence was indirect because our research question was to determine the effect of ARM on mortality. Conversely, the evidence we gathered involved the assessment of ARM as part of a mechanical ventilation package with adjustments in several variables other than ARM. Furthermore, despite our comprehensive search, we cannot completely rule out publication bias because the funnel plot was visually asymmetric, although the formal test for funnel plot asymmetry was non-significant [21]. Moreover, although the cumulative meta-analysis achieved optimal event size with a global type I error rate of 5 %, it did not meet the optimal event size or efficacy-monitoring boundaries with a more conservative global type I error of 1 %. Thus, there is still some chance that future research may contradict current evidence [23, 24].

In conclusion, our meta-analysis assessing the effects of ARMs on patients with ARDS suggests a benefit on survival without an increased risk of major adverse events. However, the quality of the current evidence is low and insufficient to allow definitive and reliable conclusions. Thus, further research is likely to impact our confidence in the estimate of the effect and may change the estimate. Ongoing trials [43, 44] may better determine whether ARMs should be routinely applied for improving clinical outcomes of patients with ARDS.