Introduction

In the intensive care unit (ICU), up to 25 % of admissions are related to sepsis and an additional 12 % of patients develop sepsis during their stay [1]. Sepsis and, if deteriorating, septic shock have a high morbidity and mortality. Depending on the definition of sepsis, mortality varies from 27 to 54 % [1]. To decrease this high rate of morbidity and mortality, several interventions have been suggested. Bundled interventions, like those of the Surviving Sepsis Campaign [2], aim to improve outcome of patients by, among others, early antibiotics, glucocorticosteroids (steroids) and supportive care.

Based on supposed pathophysiological mechanisms, two rationales for steroids in sepsis have been put forward. The first rationale is that high dose steroids may suppress the excess in inflammatory response in sepsis. In the 1970s and the early 1980s high dose steroids (30 mg/kg methylprednisolone or equivalent dose) were used in sepsis [3]. In the late 1980s the use declined on the basis of negative results of randomised clinical trials [3]; however, in some centres, high dose steroids are still used in clinical practice today. The second rationale, introduced in the 1990s, is that low dose steroids may recover a relative adrenal insufficiency [4]. Many trials have been conducted, but the pathophysiological basis of the second rationale is still questioned [5]. Possibly both high and low dose steroids can have beneficial or harmful effects in sepsis.

Despite the lack of evidence for the underlying mechanism, the use of low doses of 200–300 mg hydrocortisone is recommended in patients with septic shock not responding to fluid and vasopressor therapy [2]. This recommendation is likely based on results of one systematic review, which found a statistically significant 16 % relative risk reduction (RRR) of mortality (relative risk 0.84; unadjusted 95 % confidence interval 0.72–0.97) in favour of prolonged low dose steroids [6]. The beneficial effect found might be a subgroup effect in more severely ill patients; a spurious finding due to a type I random error from repetitive testing, since the information size required to show a 16 % RRR was far from being reached; or an overestimation of the treatment effect due to bias and suboptimal trial methodology [7–12].

A sound methodology in a systematic review is as important as in any type of study to avoid critical errors in the analyses and conclusions [13]. We therefore decided to conduct a new systematic review evaluating the effects of steroids for sepsis in patients with systemic inflammatory response syndrome (SIRS), sepsis, severe sepsis or septic shock as previous meta-analyses fall short on several aspects of rigorous methodology.

Objective

The objective was to perform a systematic review according to a published protocol following guidelines from PRISMA [14] and The Cochrane Handbook for Systematic Reviews of Interventions [15]. We also planned to execute meta-analyses and trial sequential analyses (TSA) of randomised clinical trials that compared the benefits and harms of high and/or low dose steroids for patients with SIRS, sepsis, severe sepsis or septic shock. Our primary outcomes were mortality at longest follow-up and serious adverse events.

Available evidence was to be evaluated in the perspective of the three dimensions of possible risks of errors: systematic errors (bias), design errors (also leading to systematic errors due to outcomes, comparators, etc.) and random errors (‘the play of chance’) [16].

Methods

The systematic review was conducted following the recommendations of The Cochrane Handbook for Systematic Reviews of Interventions [15] and reported according to the PRISMA statement (www.prisma-statement.org). The protocol was published on PROSPERO (http://www.crd.york.ac.uk/PROSPERO, ID: CRD42013005617).

In addition to an overall analysis including all doses, two separate analyses were conducted of high and low dose steroids. The cut-off value between high and low dose steroids was chosen arbitrarily. High doses were defined as daily doses of more than 500 mg hydrocortisone; low doses were defined as daily doses of 500 mg or less. Other steroids were converted into the equivalent hydrocortisone dose [17]. When doses were expressed in milligrams per kilogram body weight, doses were calculated assuming a body weight of 75 kg.
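The dose classification described above can be sketched as follows. This is a minimal illustration, not the procedure the review actually ran: the 75 kg body weight and the 500 mg cut-off come from the text, but the equivalence factors are commonly quoted textbook values and are an assumption here (the review took its factors from reference [17]). The function and variable names are hypothetical.

```python
# Hydrocortisone-equivalence factors (mg hydrocortisone per mg of drug).
# ASSUMPTION: commonly quoted textbook equivalences, not necessarily the
# values from the table cited as reference [17].
HYDROCORTISONE_EQUIVALENT = {
    "hydrocortisone": 1.0,
    "prednisolone": 4.0,          # 5 mg ~ 20 mg hydrocortisone
    "methylprednisolone": 5.0,    # 4 mg ~ 20 mg hydrocortisone
    "dexamethasone": 20 / 0.75,   # 0.75 mg ~ 20 mg hydrocortisone
}

ASSUMED_BODY_WEIGHT_KG = 75.0  # stated in the Methods
HIGH_DOSE_CUTOFF_MG = 500.0    # stated in the Methods (chosen arbitrarily)


def daily_hydrocortisone_equivalent(drug, dose_mg, per_kg=False):
    """Return the daily dose in mg hydrocortisone equivalent.

    If per_kg is True, dose_mg is interpreted as mg per kg body weight
    and scaled by the assumed body weight of 75 kg.
    """
    total_mg = dose_mg * (ASSUMED_BODY_WEIGHT_KG if per_kg else 1.0)
    return total_mg * HYDROCORTISONE_EQUIVALENT[drug]


def dose_category(drug, dose_mg, per_kg=False):
    """Classify a daily dose as 'high' (>500 mg) or 'low' (<=500 mg)."""
    equivalent = daily_hydrocortisone_equivalent(drug, dose_mg, per_kg)
    return "high" if equivalent > HIGH_DOSE_CUTOFF_MG else "low"
```

For example, `dose_category("methylprednisolone", 30, per_kg=True)` classifies the historical 30 mg/kg methylprednisolone regimen as high dose under these assumed factors.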

Eligibility criteria

We included randomised clinical trials of adult patients (age >18 years) with SIRS, sepsis, severe sepsis or septic shock, or any combinations thereof (Table S1). Trials with patients with SIRS were excluded when SIRS criteria were not explicitly described in their methods section. Trials were also excluded when evaluating steroids for the prevention of the occurrence of SIRS. No limitations were made regarding underlying cause of illness. There were no restrictions on duration of treatment (days) or whether administration occurred continuously or intermittently. All types of steroids were included.

Trials were included independently of the type of control intervention: placebo, no intervention, or any other control intervention. Co-interventions were allowed provided that similar administration occurred in the intervention groups. Trials were included irrespective of chosen outcomes. No limitations were made on the basis of language or publication status. The following study types were excluded: quasi-randomised studies, observational studies, cross-over studies, and studies comparing different doses or different types of steroids in both trial intervention groups.

Search strategy

We searched the Cochrane Central Register of Controlled Trials (CENTRAL) in The Cochrane Library, PubMed/Medline, Embase, Web of Science and CINAHL. We also hand-searched the reference lists of included trials and systematic reviews for further trials. Ongoing trials were sought through trial registries (www.clinicaltrials.gov, www.controlled-trials.com, www.centerwatch.com). No time restrictions were applied. The electronic literature search strategies are listed in Table S2.

Study selection and data extraction

Two authors independently reviewed all identified titles and abstracts and excluded clearly irrelevant hits. The remaining hits were evaluated in full text. Disagreements were resolved through discussion and the hits excluded on the basis of full text were all listed.

Characteristics of patients and trials and data for analyses were extracted by two authors independently from the included reports. A summary of the recorded patient data is presented in Table 1. A thorough recording of patient data is listed in Table S5.

Table 1 Characteristics of included trials

We contacted corresponding authors for unreported data.

Outcomes

Primary outcomes were mortality at longest follow-up and serious adverse events. Serious adverse events were a composite outcome, summarizing all serious events excluding mortality, necessitating an intervention, operation or prolonged hospital stay.

Secondary outcomes were persistent dependence on haemodialysis and duration of mechanical ventilation. Additionally, time-specific analyses of mortality at 30 and 90 days were conducted as secondary outcomes according to availability of data. Data of trials reporting 28-day mortality were included in 30-day mortality analyses. All outcomes were classified from the patients’ perspective following the GRADE Working Group (Table S3) [18]. Although used in previous trials and meta-analyses, shock reversal was not considered, as it is a surrogate outcome that is not important from the patients’ perspective [19].

Risk of bias assessment

We assessed the risk of bias according to The Cochrane Handbook for Systematic Reviews of Interventions [15], including all eight domains: random sequence generation, allocation concealment, blinding of participants and personnel, blinding of outcome assessment, incomplete outcome data, selective reporting, and other bias (academic or funding bias). If one or more of the domains were judged as having high or unclear risk of bias, the trial was classified as having a high risk of bias. Only two trials had low risk of bias (in all domains) with a pooled information size of 591 patients (nearly all data came from one trial [20]). Therefore, we formulated a group of trials with lower risk of bias. These trials had at least low risk of bias in sequence generation, allocation concealment, blinding of participants and personnel, and blinding of outcome assessment. This does not, however, exclude risk of bias from other domains.
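The classification rule described above can be written down compactly. The sketch below is illustrative only: the domain keys and the function name are hypothetical, and it returns three labels for convenience, whereas in the review the "lower risk of bias" trials are a subgroup of those formally classified as high risk.

```python
# Domains that must all be judged 'low' for a trial to count as
# "lower risk of bias" (taken from the text above). Key names are hypothetical.
KEY_DOMAINS = (
    "sequence_generation",
    "allocation_concealment",
    "blinding_participants_personnel",
    "blinding_outcome_assessment",
)


def classify_trial(judgements):
    """Classify a trial from its per-domain judgements.

    judgements maps a domain name to 'low', 'unclear' or 'high'.
    Returns 'low' if every assessed domain is low, 'lower' if at least
    the four key domains are low, and 'high' otherwise.
    """
    if all(value == "low" for value in judgements.values()):
        return "low"
    if all(judgements.get(domain) == "low" for domain in KEY_DOMAINS):
        return "lower"
    return "high"
```

A trial with adequate randomisation and blinding but unclear selective reporting would be labelled `"lower"`, which matches the intent of the prespecified subgroup while leaving room for bias from the remaining domains.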

Statistical analysis

Review Manager 5.1.6 was used for statistical analyses. We used the TSA program version 0.9 beta (www.ctu.dk/tsa; [21]) to control random errors and assess imprecision. For each included trial we calculated the relative risk (RR) with 95 % confidence intervals (CI) for dichotomous outcomes. We reported risk differences when their statistical significance differed from that of the relative risk. We calculated the numbers needed to treat or numbers needed to harm with 95 % CI when the RR was statistically significant.
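As an illustration of the per-trial effect measure, the following sketch computes a relative risk with a 95 % CI on the log scale and a crude number needed to treat. It is a generic textbook calculation, not a reproduction of Review Manager's internals; the function names are hypothetical, and trials with zero events in an arm would need a continuity correction that is not handled here.

```python
import math


def relative_risk(events_trt, n_trt, events_ctl, n_ctl, z=1.959964):
    """Return (RR, lower, upper) using the standard log-RR normal approximation.

    Assumes at least one event in each arm; zero-event trials are not handled.
    """
    rr = (events_trt / n_trt) / (events_ctl / n_ctl)
    se_log_rr = math.sqrt(
        1 / events_trt - 1 / n_trt + 1 / events_ctl - 1 / n_ctl
    )
    lower = math.exp(math.log(rr) - z * se_log_rr)
    upper = math.exp(math.log(rr) + z * se_log_rr)
    return rr, lower, upper


def number_needed_to_treat(events_trt, n_trt, events_ctl, n_ctl):
    """Return 1 / absolute risk reduction (negative values indicate harm)."""
    arr = events_ctl / n_ctl - events_trt / n_trt
    return math.inf if arr == 0 else 1 / arr
```

With hypothetical counts of 30/100 deaths under treatment and 40/100 under control, the interval returned by `relative_risk(30, 100, 40, 100)` spans 1, so no number needed to treat would be reported under the rule stated above.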

Heterogeneity among trials was explored by the Chi-squared test with significance set at a P value of 0.10, and quantified with the inconsistency factor (I²) statistic. We reported the results from the random-effects model anticipating abundant clinical heterogeneity (in populations, interventions and settings). We reported the results from a fixed-effect model if one or two trials dominated the available evidence [22].
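To make the heterogeneity statistics concrete, the sketch below pools log relative risks with inverse-variance weights, computes Cochran's Q and I², and derives a random-effects estimate. It assumes the DerSimonian-Laird estimator of between-trial variance, which is a common default but is not named in the text; the function name is hypothetical, and the P value of Q is omitted because it requires a chi-squared distribution function.

```python
import math


def pool_log_rr(log_rrs, variances):
    """Fixed-effect and DerSimonian-Laird random-effects pooling of log RRs.

    log_rrs and variances are per-trial log relative risks and their
    variances. Returns a dict with Q, I2, tau2 and both pooled estimates.
    """
    weights = [1 / v for v in variances]
    sum_w = sum(weights)
    fixed = sum(w * y for w, y in zip(weights, log_rrs)) / sum_w

    # Cochran's Q: weighted squared deviations from the fixed-effect estimate.
    q = sum(w * (y - fixed) ** 2 for w, y in zip(weights, log_rrs))
    df = len(log_rrs) - 1
    i2 = max(0.0, (q - df) / q) if q > 0 else 0.0

    # DerSimonian-Laird between-trial variance (assumed estimator).
    c = sum_w - sum(w ** 2 for w in weights) / sum_w
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0

    re_weights = [1 / (v + tau2) for v in variances]
    random = sum(w * y for w, y in zip(re_weights, log_rrs)) / sum(re_weights)
    se_random = math.sqrt(1 / sum(re_weights))
    return {
        "q": q, "df": df, "i2": i2, "tau2": tau2,
        "fixed": fixed, "random": random, "se_random": se_random,
    }
```

Exponentiating `random` and `random ± 1.96 * se_random` gives the pooled RR and its conventional 95 % CI; when `tau2` is zero the two models coincide.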

The following subgroup analyses were planned: (1) the stratification of bias risk of trials (lower risk of bias compared to high risk of bias); (2) the duration of steroid treatment [long-term (≥4 days) compared to short-term (<4 days) use]; (3) patients with SIRS or sepsis compared to patients with severe sepsis or septic shock; (4) stratification based on the aetiology of sepsis; (5) trials using dexamethasone compared with trials using any other steroid possessing mineralocorticoid properties (unlike dexamethasone).

Trial sequential analysis

We conducted trial sequential analysis (TSA). Conventional meta-analysis runs the risk of random errors due to sparse data and repetitive testing [15, 18]. TSA adjusts the confidence intervals, if data are sparse or repeatedly analysed as a result of multiple updates, to allow firm conclusions. TSA is similar to interim analysis in a single trial where monitoring boundaries are used to decide whether the trial should be terminated early or whether the confidence interval and the adjacent P value are sufficiently narrow or small respectively to show the anticipated effect [23]. In the same manner, trial sequential monitoring boundaries can be applied to meta-analyses [9–12, 24].
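The idea behind monitoring boundaries can be illustrated with an alpha-spending function, which limits how much of the overall type I error may be "spent" at a given fraction of the required information size. The sketch below assumes the Lan-DeMets O'Brien-Fleming-type spending function; the text does not state which function was used, so this choice and the function name are assumptions. Deriving the actual boundaries additionally requires numerical integration over the sequence of analyses, which is not shown.

```python
from statistics import NormalDist

_STD_NORMAL = NormalDist()


def obrien_fleming_alpha_spent(information_fraction, alpha=0.05):
    """Cumulative two-sided alpha spent at a given information fraction.

    Lan-DeMets O'Brien-Fleming-type spending function (assumed here):
    alpha(t) = 2 * (1 - Phi(z_{alpha/2} / sqrt(t))), with 0 < t <= 1.
    """
    if not 0 < information_fraction <= 1:
        raise ValueError("information_fraction must be in (0, 1]")
    z_half_alpha = _STD_NORMAL.inv_cdf(1 - alpha / 2)
    return 2 * (1 - _STD_NORMAL.cdf(z_half_alpha / information_fraction ** 0.5))
```

At an illustrative information fraction of 0.2 the function returns only a tiny share of the 5 % budget, which is why a meta-analysis with sparse data needs a far more extreme result than P < 0.05 to cross the boundary; at a fraction of 1.0 the full 5 % is available.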

TSA depends on the quantification of the required information size. We calculated a diversity-adjusted (D²) required information size, since the heterogeneity adjustment with I² underestimates the required information size [25]. TSA was conducted with the intention to maintain an overall 5 % risk of a type I error and a power of 90 %. For the calculation of the required information size, we anticipated an intervention effect of a 10 % RRR using the control event proportion calculated from the actual meta-analyses. We provided the TSA-adjusted CI for sparse data and repetitive testing, which we described as the trial sequential analysis adjusted CI. We also performed sensitivity analyses with a power of 80 % and assuming a 20 % RRR.
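A rough sketch of the required information size calculation is given below. It assumes the usual two-group sample size formula for a dichotomous outcome, inflated by 1/(1 − D²) to adjust for diversity; the function name and the example diversity value are hypothetical, and the TSA program may differ in detail.

```python
from statistics import NormalDist


def required_information_size(control_event_proportion, rrr,
                              alpha=0.05, power=0.90, diversity=0.0):
    """Diversity-adjusted required information size (total patients).

    control_event_proportion: event proportion in the control group.
    rrr: anticipated relative risk reduction (e.g. 0.10 for 10 %).
    diversity: D-squared, between 0 and 1 (exclusive of 1).
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    p_ctl = control_event_proportion
    p_trt = p_ctl * (1 - rrr)
    p_bar = (p_ctl + p_trt) / 2   # average event proportion
    delta = p_ctl - p_trt         # anticipated absolute risk difference
    unadjusted = 4 * (z_alpha + z_beta) ** 2 * p_bar * (1 - p_bar) / delta ** 2
    return unadjusted / (1 - diversity)
```

For instance, `required_information_size(0.41, 0.10, diversity=0.5)` doubles the unadjusted size obtained with `diversity=0.0`, and the sensitivity setting `required_information_size(0.41, 0.20, power=0.80)` yields a much smaller requirement, which is why futility can be reached for a 20 % RRR while a 10 % RRR remains unresolved.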

Grade assessment of outcomes

Data on the outcomes of all trials were assessed for the risk of bias measured by the level of evidence, the risk of random error measured by standard error, and the design error measured by grading the outcomes according to GRADE [16, 26]. Data were presented in a three-dimensional Manhattan error matrix that facilitates the overview of available evidence at a glance and may identify possible lacunae [16]. A GRADE assessment of all outcomes considering risk of bias, inconsistency, imprecision and risk of publication bias was conducted as well [22].

Results

The search retrieved 5366 hits (Fig. S1). A total of 48 articles were included describing 35 distinct randomised clinical trials. One ongoing trial was identified [27]. Two papers were translated from Chinese by a native Chinese medical doctor. The excluded trials and reasons for exclusion are listed (Table S4).

Characteristics of trials

Thirty-five randomised trials were included. Fourteen trials evaluated high dose (>500 mg hydrocortisone or equivalent) steroids and 21 trials used low dose (≤500 mg hydrocortisone or equivalent) steroids (Table 1 and Table S5). Fifteen trials included patients with septic shock and two trials included patients with SIRS. Duration of steroid treatment varied between one single dose and 8 weeks. The daily dose of steroids varied between 30 mg and 600 mg/kg (total 45 g) hydrocortisone (or equivalent) (Fig. 1). In the high dose trials the daily doses varied between 10 mg/kg (total 750 mg) and 600 mg/kg (total 45 g) hydrocortisone (or equivalent). In the low dose trials daily doses varied between 30 mg and 440 mg hydrocortisone (or equivalent).

Fig. 1

Randomised clinical trials with mortality data showing for each trial the hydrocortisone dose (on the first day) (white bars) and the time interval from sepsis/septic shock onset until randomisation/start treatment (black bars); a high dose steroids (>500 mg hydrocortisone or equivalent) and b low dose (≤500 mg hydrocortisone or equivalent). When other steroids were used, the equivalent hydrocortisone dose was calculated using the table in the Oxford Handbook of Critical Care [17]. When doses were expressed in milligrams per kilogram body weight, daily doses were calculated assuming a body weight of 75 kg. The trials by Hoffman [37], Klastersky [29], Scarborough [30], Schumer [43], Snijders [40], Rinaldi [58], Ruolan [59] and Wan [31] did not provide information on the time interval. The trials by Schumer [43] and Sprung [44] included two intervention groups using different doses of steroids: D dexamethasone, MP methylprednisolone

There were insufficient data to evaluate the outcomes persistent dependence on haemodialysis and duration of mechanical ventilation.

Bias risk assessment

Bias risk of trials was assessed according to The Cochrane Handbook for Systematic Reviews of Interventions (Fig. S2) [15]. Only two trials scored low risk of bias in all domains. Therefore, we used the prespecified group of trials with lower risk of bias in four domains. A total of 16 studies were assessed to have lower risk of bias (Fig. S2).

Effects of interventions

All pooled intervention effects with their 95 % CI of all trials along with subgroup effects and all TSA are listed in Table 2.

Table 2 Conventional and trial sequential analysis (TSA)-adjusted relative risks (RR) with 95 % confidence intervals for the primary and secondary outcome of mortality and serious adverse events

Comparison 1: steroids versus placebo or no intervention

Thirty-five trials randomised 4682 patients and evaluated any dose of steroids in patients with sepsis. Two trials had low risk of bias. We considered 16 trials to have lower risk of bias.

All-cause mortality within longest follow-up

Thirty-one trials (including 4290 patients) provided mortality data. Mortality within longest follow-up was 37.6 % in the steroids group and 41.0 % in the control group. Substantial heterogeneity was found (I² = 54 %). There was no statistically significant difference (random-effects model RR 0.89, 95 % CI 0.79–1.01; TSA-adjusted CI 0.74–1.08; Figs. 2, S3). TSA on the two low risk of bias trials was not possible because of insufficient data; the conventional model showed no statistically significant effect (Table 2). Subgroup analysis based on the prespecified lower risk of bias trials revealed no differential effect. TSA of all trials (RRR 10 %; power 90 %) showed that the accrued and the required information size were far apart (Fig. S3) and that more than 17,000 additional patients may need to be randomised before firm conclusions can be drawn regarding the effect on mortality. However, the futility boundary was reached in our sensitivity trial sequential analysis (RRR 20 %; power 80 %) (Fig. S3), refuting a 20 % RRR at 80 % power.

Fig. 2

Forest plot of mortality at longest follow-up of all trials evaluating steroids for sepsis with subgroups according to risk of bias (random-effects model)

Analyses stratified by risk of bias (Fig. 2), treatment duration (long- compared to short-course steroids; Fig. S4), severity of illness (SIRS and sepsis compared to severe sepsis and septic shock; Fig. S5) and type of steroids (excluding trials using dexamethasone) all showed no statistically significant effects (Table 2) and were in line with the other analyses suggesting that many more patients need to be randomised before firm conclusions may be drawn.

Serious adverse events excluding mortality

No statistically significant difference was found in serious adverse events including all trials that evaluated steroids for sepsis (random-effects model RR 1.02, 95 % CI 0.92–1.15; TSA-adjusted CI 0.7–1.48; Figs. 3, S6). TSA showed that nearly 50,000 additional patients may need to be randomised before firm conclusions can be drawn on the effect on serious adverse events (Fig. S6). The types of serious adverse events are listed in Table S6. The incidence of serious adverse events did not vary according to the degree of sepsis (Table 2).

Fig. 3

Forest plot of serious adverse events of all trials evaluating steroids for sepsis with subgroups according to risk of bias (random-effects model)

Other outcomes

Time-specific analyses of mortality were conducted for 30-day (16 trials) and 90-day (two trials) follow-up. We found no statistically significant treatment effect when evaluating 30-day mortality (random-effects model RR 0.96, 95 % CI 0.85–1.08; TSA RR 0.98, TSA-adjusted CI 0.83–1.17; point estimates were different as a result of different handling of zero event trials) (Fig. S7). Data for 90-day mortality were too sparse to perform TSA-adjusted analysis; in a conventional analysis no statistically significant effect was found (random-effects model RR 0.36, 95 % CI 0.04–2.90) (Fig. S8).

GRADE assessment considering risk of bias, inconsistency, imprecision and risk of publication bias showed very low quality of evidence (Table S7).

Comparison 2: high dose steroids versus placebo or no intervention

Fourteen trials (2624 patients) evaluated high doses (>500 mg hydrocortisone or equivalent) of steroids for sepsis. Only one trial had low risk of bias in all domains [20]. Six trials were considered to have lower risk of bias (lower risk of bias in four domains). One trial did not report mortality (Table 1) [28].

All-cause mortality within longest follow-up

Six trials with lower risk of bias and seven trials with high risk of bias evaluated mortality in 2537 patients at different lengths of follow-up. No statistically significant beneficial effect from steroid treatment was found (random-effects model RR 0.87, 95 % CI 0.70–1.07; TSA-adjusted CI 0.38–1.99; Fig. S9).

Analyses stratified by risk of bias (Fig. S9), severity of illness (SIRS and sepsis compared to severe sepsis and septic shock; Fig. S10) and type of steroids (excluding trials using dexamethasone) all showed no statistically significant effects (Table 2).

Serious adverse events excluding mortality

No statistically significant difference was found in the overall proportion of serious adverse events (random-effects model RR 1.03, 95 % CI 0.90–1.17; TSA-adjusted RR 1.02, CI 0.70–1.48). No statistically significant differences were found in subgroups according to bias risk and disease severity (Fig. S11, Table 2).

Other outcomes

Three trials with high risk of bias [29–31] reported mortality at 30 days; no 90-day follow-up data were reported (Table 2). There was no statistically significant effect.

Error matrix plots were constructed for overview of all available evidence for high dose steroids at a glance (Fig. S17). GRADE assessment considering risk of bias, inconsistency, imprecision and risk of publication bias showed very low quality of evidence for all outcomes (Table S8).

Comparison 3: low dose steroids versus placebo or no intervention

Twenty-one trials randomised 2058 patients for low dose (≤500 mg hydrocortisone or equivalent) steroids for sepsis. There were large differences between the trials in the time interval between sepsis onset and initiation of steroids treatment: 2–72 h (Fig. 1). Only one trial applied short-course low dose steroids [32].

All-cause mortality within longest follow-up

Only one trial had low risk of bias in all domains [33]. Ten trials with lower risk of bias (1315 patients) and eight trials with high risk of bias (438 patients) evaluated mortality at different lengths of follow-up. The overall pooled estimate showed no statistically significant difference (random-effects model RR 0.90, 95 % CI 0.77–1.05; TSA-adjusted CI 0.49–1.67) (Figs. 4, S12).

Fig. 4

Forest plot of mortality at longest follow-up of low dose steroids (≤500 mg hydrocortisone or equivalent) use according to risk of bias subgroups (random-effects model)

Analyses stratified by risk of bias (Figs. 4, S12), treatment duration (long- compared to short-course steroids; Fig. S13), severity of illness (SIRS and sepsis compared to severe sepsis and septic shock; Fig. S14) and type of steroids (excluding trials using dexamethasone) all showed no statistically significant effects (Table 2). Sensitivity analyses (TSA, RRR 20 %; power 80 %) based on lower risk of bias trials and separately all trials evaluating severe sepsis and septic shock showed futility for treatment with low dose steroids (Figs. S12, S14).

Serious adverse events excluding mortality

Nine trials (with 866 patients) evaluated serious adverse events. One trial had low risk of bias [33]. Seven trials had lower risk of bias. Data were too sparse to perform a TSA. A conventional analysis found no statistically significant intervention effect (random-effects model RR 0.96, 95 % CI 0.73–1.27; Fig. S15). The incidence of serious adverse events did not vary according to the degree of sepsis (Table 2).

Mortality at 30 days

Thirteen trials (1479 patients) evaluated mortality at 30-day follow-up. One trial had low risk of bias [33]. Ten trials had lower risk of bias. No statistically significant benefit was found from steroid treatment (random-effects model RR 0.91, 95 % CI 0.77–1.07; TSA RR 0.94, TSA-adjusted CI 0.55–1.62; point estimates were different as a result of different handling of zero event trials; Fig. S16). Subgroup analysis based on bias risk showed no differential effect. TSA estimated that many more randomised patients are needed before firm conclusions can be drawn. A sensitivity analysis (TSA, RRR 20 %; power 80 %) showed futility for low dose steroid treatment (Fig. S16).

Mortality at 90 days

The results of 90-day mortality in low dose steroids are equal to the overall 90-day mortality, since only trials that evaluated low dose steroids provided data on 90-day follow-up.

Other outcomes

Error matrix plots were constructed for overview of all available evidence for low dose steroids at a glance (Fig. S18). Data suggest that there are considerable risks of systematic errors (bias) and random errors (large standard errors on average). GRADE assessment considering risk of bias, inconsistency, imprecision and risk of publication bias showed very low quality of evidence for all outcomes (Table S9).

Discussion

We did not find evidence for a beneficial effect of an intervention with steroids in patients with SIRS, sepsis, severe sepsis or septic shock in this systematic review with meta-analyses and TSA, including 35 randomised trials with 4682 patients. High (>500 mg hydrocortisone or equivalent) and low dose (≤500 mg hydrocortisone or equivalent) steroids were evaluated in separate comparisons and no evidence of a beneficial effect was found. Moreover, TSA suggested that more than 17,000 additional patients need to be randomised before firm conclusions can be drawn on the presence or absence of an intervention effect of a 10 % RRR. Sufficient evidence has accrued to refute a 20 % RRR with a power of 80 %.

Our conclusion contrasts with previous publications suggesting beneficial effects associated with use of a long course of low dose steroids [6]. Differences might be explained by another search strategy and different analyses with improved accounting for risks of systematic, design and random errors. There is accumulating evidence that random error plays an important role in premature conclusions of spurious significant findings [11]. In simulation studies up to 30 % of premature declarations of significant effects are in fact overestimations of intervention effects, once sufficient evidence has been reached [34].

Substantial clinical heterogeneity existed between the included trials, even after separation into high and low doses of steroids. Within both high and low dose steroid groups the administered daily dose of steroids differed importantly (Fig. 1). Moreover, there were substantial differences regarding the time interval between sepsis onset and administration of the first steroid dose. By analogy, in antibiotic therapy timing appears to be an essential feature for any beneficial effect. Therefore, lack of both optimised dosing and optimised timing might obscure potential beneficial effects of steroids. As dexamethasone has no mineralocorticoid activity, analyses were repeated excluding trials using dexamethasone and results appeared to be similar (Table 2).

Retrospectively, our study protocol could have been more specific on exclusion criteria for trials evaluating patients with SIRS and sepsis. We excluded trials that evaluated steroids for prevention of diseases and for autoimmune diseases. We also excluded trials that evaluated steroids for treatment of localized oedema. Furthermore, we only included trials that evaluated patients with SIRS if the criteria for SIRS were explicitly stated in the methods section of the report; however, SIRS criteria may not predict patient important outcomes and may be too inclusive. Although we reached consensus through discussions, the inclusion of some trials can still be discussed [20, 30, 31, 33, 35–40]. To test the robustness of our conclusions, we therefore conducted several sensitivity analyses by excluding these trials and they all resulted in similar findings (Table 2).

Trial sequential analyses, an error matrix to evaluate risks of errors and GRADE assessment add to the strength of the conclusions. However, our systematic review mirrors the lack of quality and quantity of the included randomised clinical trials. The included trials fall short on bias protection, included numbers of patients and chosen outcomes. Therefore, the evidence of steroids for sepsis (both high and low dose) is characterized by high risks of both systematic errors and random errors. We used wide inclusion criteria to evaluate the effect of steroids in the broad spectrum of disease. Although none of the analyses showed a statistically significant effect, we cannot exclude a small beneficial intervention effect of steroids in a specific subgroup.

Conclusion

In this systematic review with meta-analyses and TSA we did not find any statistically significant beneficial effect that could support propagating the use of steroids for sepsis. Further, our TSA suggests that many thousands of randomised patients are needed in order to change this perspective. Therefore, steroids should no longer be recommended for patients with sepsis outside ongoing [27] or future well-designed randomised clinical trials with low risks of both systematic error and random error.