Drotrecogin alfa activated (DAA) was approved for the treatment of patients with severe sepsis in 2001 on the basis of a large randomized, double-blind clinical trial, PROWESS [1]. A second randomized clinical trial, PROWESS-SHOCK [2], was recently completed, but it did not reproduce the survival benefit observed in the original trial. The results of PROWESS-SHOCK led to the withdrawal of the drug from the market in 2011. Both trials were multicenter, randomized, and double-blinded, and both involved the same drug manufacturer. Which trial should we believe? What should be done while one-third of our patients with severe sepsis are still dying despite the best standard of care? Our paper aims to explore the reasons for this discrepancy and to offer new solutions.

A total of 3,370 patients with severe sepsis were enrolled across the two trials. We analyzed both the clinical heterogeneity (differences related to the trials' clinical aspects) and the statistical heterogeneity (differences related to the trials' statistical aspects) between them. For the clinical analysis, baseline characteristics, infection etiologies and sites, and co-interventions were compared by chi-square testing; for the statistical analysis, we used random-effects modeling and the I² statistic. All results are shown in Tables 1 and 2. Our clinical findings demonstrate that infection sites, etiology, co-interventions, and geographic enrollment all differed significantly between the two trials. In particular, the rates of appropriate antibiotic use, low-dose steroid use, and heparin use all differed significantly. Based on 28-day mortality, we also found highly significant statistical heterogeneity: up to 90% of the mortality differences between the trials were not due to chance. This heterogeneity remained consistently high even when the analysis was stratified by shock status, number of organ failures, or APACHE II (Acute Physiology and Chronic Health Evaluation II) score.
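As an illustration of the chi-square comparison described above, the following minimal sketch tests whether the proportion of patients with a given characteristic differs between two trials. The function name and the counts are hypothetical and are not taken from PROWESS or PROWESS-SHOCK; the sketch assumes a 2x2 table and a Pearson chi-square test without continuity correction.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square test (no continuity correction) for a 2x2 table.

    Rows are trials, columns are 'characteristic present' / 'absent':
        trial A: a, b
        trial B: c, d
    Returns (statistic, p_value) with 1 degree of freedom.
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    if 0 in (row1, row2, col1, col2):
        raise ValueError("every row and column total must be positive")
    stat = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # Survival function of a chi-square variable with 1 df:
    # P(X > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value


# Hypothetical counts for illustration only (not trial data):
# 50 of 100 patients in trial A and 30 of 100 in trial B have the characteristic.
stat, p_value = chi_square_2x2(50, 50, 30, 70)
print(f"chi-square = {stat:.2f}, p = {p_value:.4f}")
```

A small p-value here would indicate that the two trial populations differ in that characteristic by more than chance alone would explain.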

Table 1 Clinical heterogeneity: PROWESS versus PROWESS-SHOCK - trial characteristics
Table 2 Statistical heterogeneity: PROWESS versus PROWESS-SHOCK - 28-day mortality

A recent study by Levy and colleagues [3] showed that the absolute mortality of severe sepsis differs between the US and EU; hence, the very different rates of geographic enrollment in PROWESS and PROWESS-SHOCK may explain, in part, the mortality differences. Infection site is known to be a major determinant of survival in patients with severe sepsis, so the significant differences we found in infection sites provide further evidence of clinical heterogeneity. The diversity of microbiological etiologies between the studies also points to clinical differences; similarly, the rate of appropriate antibiotic use was not comparable. In addition, co-interventions (for example, heparin and low-dose steroids) differed significantly between the trials.

The statistical heterogeneity analysis demonstrates that the vast majority (80 to 90%) of the detected heterogeneity in survival outcomes between these trials could not be explained by chance. This is quite remarkable because it indicates that the reasons for this large heterogeneity derive from differences related to the trials themselves: in this case, patient population, baseline infection, and co-interventions. Moreover, even after we stratified the survival outcome analysis by disease severity, the heterogeneity remained elevated. We conjecture that the different clinical characteristics and co-interventions were most likely the cause of this persistent statistical heterogeneity.
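To make the interpretation of I² concrete, the sketch below computes Cochran's Q, I², and a DerSimonian-Laird between-trial variance from trial-level log risk ratios. It is a generic illustration under stated assumptions, not the analysis behind Table 2: the function names are ours, and the event counts are hypothetical rather than trial data.

```python
import math


def log_risk_ratio(events_t, n_t, events_c, n_c):
    """Log risk ratio (treated vs control) and its approximate variance."""
    estimate = math.log((events_t / n_t) / (events_c / n_c))
    variance = 1 / events_t - 1 / n_t + 1 / events_c - 1 / n_c
    return estimate, variance


def heterogeneity(effects):
    """effects: list of (estimate, variance) pairs, one per trial.

    Returns (Q, I2, tau2): Cochran's Q, the I-squared proportion (0 to 1),
    and the DerSimonian-Laird estimate of between-trial variance.
    """
    weights = [1 / var for _, var in effects]
    total_w = sum(weights)
    pooled = sum(w * est for w, (est, _) in zip(weights, effects)) / total_w
    q = sum(w * (est - pooled) ** 2 for w, (est, _) in zip(weights, effects))
    df = len(effects) - 1
    # I2: share of total variability beyond what sampling error would produce.
    i2 = max(0.0, (q - df) / q) if q > 0 else 0.0
    c = total_w - sum(w ** 2 for w in weights) / total_w
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0
    return q, i2, tau2


# Hypothetical 28-day mortality counts for illustration only (not trial data).
trial_a = log_risk_ratio(100, 400, 130, 400)
trial_b = log_risk_ratio(120, 400, 110, 400)
q, i2, tau2 = heterogeneity([trial_a, trial_b])
print(f"Q = {q:.2f}, I2 = {100 * i2:.0f}%, tau2 = {tau2:.4f}")
```

An I² close to 100% means that nearly all of the observed variation in treatment effect between trials exceeds what sampling error alone would be expected to produce.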

Another complicating factor is that the PROWESS-SHOCK trial was substantially underpowered, with a 42% probability of a false-negative result. Although a frequent question since the completion of PROWESS-SHOCK has been 'which trial should we believe?', we propose that this is not the million-dollar question; the one that demands an answer is 'will we ever be able to replicate the design of the PROWESS trial?' If we aim for a control mortality of 35%, the answer is 'no' because recent phase III trials [2, 4] have shown that the mortality associated with severe sepsis now ranges from 24 to 28%. If we slightly modify the question to 'can we perform another phase III trial on DAA with adequate statistical power?', the answer is 'yes' on two counts: 1) a large sample size (N = 2,500 to 3,000) would satisfy the frequentist (classical) statistical approach; and 2) a smaller sample size (N = 500 to 1,000) would satisfy the adaptive Bayesian statistical approach, as we explained in a previous manuscript [5]. What about financial support? The financial and logistic challenges would be enormous for the frequentist approach, but they would be considerably more manageable for the Bayesian approach. Would it be ethical to perform a third trial? Yes: a study we published recently [6] demonstrated that, in real-life application outside phase III trials, DAA significantly reduced in-hospital mortality by 18% (95% confidence interval 13 to 22%) in patients with severe sepsis (N = 41,401 patients). How would this trial be designed?
First, an individual-patient data meta-analysis combining all randomized trials at the patient level would provide the most accurate and statistically powerful way to reduce the current scientific uncertainty; second, the concomitant use of both frequentist and Bayesian methodologies [7] would maximize the opportunity to gather the most valuable scientific information on the efficacy of DAA; and third, the findings from this new analysis would provide the necessary tools to optimize the design of the next randomized trial. Thus, it is our responsibility not to stop our scientific investigation here, especially considering that the 3,370 patients who gave their consent to participate in these clinical trials were assured that their information would be fully utilized to foster progress in medical science and to benefit future patients afflicted by severe sepsis.
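The dependence of a frequentist sample size on control-arm mortality, discussed above, can be sketched with the standard normal-approximation formula for comparing two proportions. This is only an illustration: the function names are ours, the control mortality of 26% is chosen from within the 24 to 28% range cited above, and the treated-arm mortality of 22% is a hypothetical effect size. The output is not intended to reproduce the sample sizes quoted in the text, which rest on design assumptions not stated here.

```python
import math
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal distribution (mean 0, sd 1)


def n_per_arm(p_control, p_treated, alpha=0.05, power=0.80):
    """Approximate patients per arm for a two-sided test of two proportions."""
    if p_control == p_treated:
        raise ValueError("the two mortality rates must differ")
    z_alpha = _STD_NORMAL.inv_cdf(1 - alpha / 2)
    z_beta = _STD_NORMAL.inv_cdf(power)
    variance = p_control * (1 - p_control) + p_treated * (1 - p_treated)
    n = (z_alpha + z_beta) ** 2 * variance / (p_control - p_treated) ** 2
    return math.ceil(n)


def approx_power(p_control, p_treated, n, alpha=0.05):
    """Approximate power of the same test with n patients per arm."""
    z_alpha = _STD_NORMAL.inv_cdf(1 - alpha / 2)
    variance = p_control * (1 - p_control) + p_treated * (1 - p_treated)
    se = math.sqrt(variance / n)
    return _STD_NORMAL.cdf(abs(p_control - p_treated) / se - z_alpha)


# Hypothetical scenario: 26% control mortality, 22% treated mortality.
n_example = n_per_arm(0.26, 0.22)
print(f"patients per arm: {n_example}")
print(f"power at that size: {approx_power(0.26, 0.22, n_example):.2f}")
```

Because the required sample size grows with the inverse square of the absolute mortality difference, a lower control mortality with the same relative treatment effect calls for a substantially larger trial, which is the practical obstacle to replicating the original PROWESS design.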

Conclusion

The PROWESS and PROWESS-SHOCK trials are not comparable, given both their clinical and their statistical heterogeneity. Hence, the true effect of DAA in patients with severe sepsis remains to be defined. Unless the totality of the available evidence is thoroughly evaluated through an individual-patient data meta-analysis, and an adaptive Bayesian clinical trial is performed, we will continue treating our patients with the appalling sensation that we are not improving their survival because of our own inability to advance the quality of clinical research in the sepsis field.