Introduction

Should I change practice on the basis of this study [1]? Which primary endpoint should I choose to compute the sample size of my trial [2, 3]? Answering these questions requires a critical appraisal of study endpoints by both researchers and clinicians. In this report, we analyze the use of disease-oriented endpoints (such as organ dysfunction scores) in intensive care (IC) trials, and we briefly review the pitfalls of extrapolating disease-oriented endpoints to real patient benefit.

Disease-oriented endpoints in intensive care trials

The goal of IC research is to improve patients’ health in ways that matter. Study endpoints that are important to patients, such as quality of life or survival, are referred to as “patient-oriented” endpoints because their relevance is self-evident and unambiguous [1, 2]. However, randomized trials often use “disease-oriented” endpoints, which correlate with patient-oriented benefit, but do not matter directly and unequivocally to patients [4].

Reasons to choose disease-oriented primary endpoints are that such endpoints generally require fewer patients and may be more sensitive indicators of treatment effects than survival. Examples of disease-oriented endpoints in IC research are organ failure scores, length of stay, time to shock reversal, or ventilator-free days. Although these endpoints have face validity, they leave room for ambiguity with respect to real patient benefit. A therapy may decrease organ failure without improving survival, suggesting that the new therapy worked, but did not help the patient survive.
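The sample-size argument can be made concrete with a minimal sketch based on the standard normal-approximation formulas for a two-arm trial. All numbers here are hypothetical assumptions chosen for illustration and are not taken from any trial in this report: a reduction in mortality from 30% to 25%, and a 1-point difference in an organ failure score with a standard deviation of 4 points. The function names are likewise illustrative.

```python
import math
from statistics import NormalDist


def _z_sum(alpha, power):
    """Sum of the standard normal quantiles for two-sided alpha and for power."""
    norm = NormalDist()
    return norm.inv_cdf(1 - alpha / 2) + norm.inv_cdf(power)


def n_per_group_binary(p_control, p_treated, alpha=0.05, power=0.80):
    """Patients per arm to detect a difference between two proportions."""
    variance = p_control * (1 - p_control) + p_treated * (1 - p_treated)
    z = _z_sum(alpha, power)
    return math.ceil(z ** 2 * variance / (p_control - p_treated) ** 2)


def n_per_group_continuous(delta, sd, alpha=0.05, power=0.80):
    """Patients per arm to detect a mean difference `delta` given a common SD."""
    z = _z_sum(alpha, power)
    return math.ceil(2 * (z * sd / delta) ** 2)


# Hypothetical scenario (assumed values, for illustration only).
n_mortality = n_per_group_binary(p_control=0.30, p_treated=0.25)
n_score = n_per_group_continuous(delta=1.0, sd=4.0)

print(f"Mortality endpoint: {n_mortality} patients per arm")
print(f"Organ failure score endpoint: {n_score} patients per arm")
```

Under these assumed effect sizes, the continuous disease-oriented endpoint needs several times fewer patients than the mortality endpoint. The ratio depends entirely on the effect sizes and variability one is willing to assume, which is part of why the choice of primary endpoint deserves scrutiny.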

We analyzed the primary endpoints of IC-related randomized trials published in five critical care journals and three high-impact medical journals over the past 15 years. The methods, baseline characteristics, and additional results are available as electronic supplementary material.

The use of disease-oriented primary endpoints significantly increased over time in trials with more than 200 patients and in trials published in high-impact journals (Fig. 1). For trials with more than 500 patients, disease-oriented endpoints were more prevalent than patient-oriented endpoints for the first time in 2016 (data not shown). Trials with disease-oriented primary endpoints more often reported positive results (i.e., p values less than 0.05) than trials with patient-oriented endpoints. Consequently, the question “Should we rely on trials with disease- rather than patient-oriented endpoints?” is critical.

Fig. 1

Primary endpoints of ICU-related randomized controlled trials (RCTs) in five intensive care journals and three high-impact journals. (a) Among trials with a sample size greater than 200, there is no trend in patient-oriented endpoints, but the prevalence of disease-oriented endpoints is progressively increasing. (b) Among trials published in high-impact general medical journals, there is no trend in patient-oriented endpoints, but the prevalence of disease-oriented endpoints is progressively increasing. (c) The boxplot (IQR, range) of reported p values by endpoint category shows that trials with a disease-oriented endpoint report significantly lower median p values and more often report “positive” (p < 0.05) results.

From disease-oriented endpoints to patient-oriented benefit

Several guidelines on the validation of disease-oriented endpoints have been published [1, 5]. Yet some recommendations are difficult to translate to IC research, which is characterized by complex syndromes (such as sepsis) rather than focused diseases (such as ischemic stroke). Consequently, disease-oriented endpoints are often broad and multifaceted (such as organ failure scores) rather than proximal surrogates for a specific clinical outcome (such as LDL-cholesterol as a surrogate for cardiovascular risk). As a result, no surrogate endpoints approved by the Food and Drug Administration are currently used in IC research.

In general, a relevant endpoint must satisfy three criteria [5]: there must be biological plausibility that improvement in the disease-oriented endpoint will cause improvement in true patient benefit; there must be a well-established observational association; and there must be evidence from intervention studies that the disease-oriented endpoint adequately captures the treatment effects on patient-oriented outcomes.

The capture criterion is especially important in IC research because the causal pathways between disease-oriented endpoints and real patient-oriented benefit are seldom clear and linear [6]. For example, the relationship between ventricular ectopy and sudden cardiac death after myocardial infarction is both plausible and statistically significant, but therapies that reduce ventricular ectopy after myocardial infarction (a disease-oriented endpoint) may paradoxically increase mortality [7]. For the same reason, a composite endpoint with disease- and patient-oriented components may be difficult to interpret when the treatment has different effects on the individual components.

Three complementary approaches improve our understanding of the relationship between disease- and patient-oriented endpoints in IC research.

Firstly, a critical appraisal of the literature may reveal vulnerabilities in the hypothesized causal chain between a disease-oriented endpoint and patient-oriented outcomes. For example, even though oxygenation and survival are both plausibly linked and statistically associated in acute respiratory distress syndrome (ARDS), there is no evidence that oxygenation impairment is the most important driver of mortality [8]. It is therefore not surprising that, in ARDS trials, treatments that improve oxygenation (a frequently used endpoint) do not necessarily improve survival [9].

Secondly, a meta-analysis of the association between a disease-oriented endpoint and patient-oriented outcomes may elucidate the responsiveness and the reliability of the disease-oriented endpoint [10, 11]. We have recently shown in a meta-analysis of 87 RCTs that, on average, treatments that improved the Sequential Organ Failure Assessment (SOFA) score on a fixed day after randomization did not improve mortality. In contrast, treatments that improved delta-SOFA score (the trajectory from baseline) did improve mortality [12]. This shows that careful calibration is needed to make reliable inferences about patient-oriented benefit.

Thirdly, several statistical techniques can be used to analyze how well a disease-oriented endpoint captures the treatment effects on a patient-oriented outcome within a specific trial [13, 14, 15]. One such measure is the “proportion explained” (PE), which refers to the proportion of the treatment effect on a patient-oriented outcome explained by the treatment effect on the (primary) disease-oriented endpoint [14, 15]. A low PE may indicate that the disease-oriented endpoint fails to capture treatment effects on the patient-oriented endpoint (inappropriate disease-oriented endpoint) or that the patient-oriented endpoint is largely explained by treatment allocation alone (irrelevant disease-oriented endpoint). Within-trial validation requires data that are generally available only to investigators. Therefore, an “endpoint validity statement” in a trial report (e.g., “the treatment effect on ventilator-free days explained 76% of the treatment effect on mortality”) could improve the relevance of trials with disease-oriented primary endpoints.
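As an illustration of the PE, the sketch below uses one common formulation, PE = 1 − β_adj / β, where β is the treatment effect on the patient-oriented outcome and β_adj is the same effect after adjustment for the disease-oriented endpoint. This is a simplified sketch under stated assumptions and not necessarily the estimator used in the cited references: it fits linear models to a continuous outcome, whereas a real analysis of mortality would typically use logistic or survival models. The data are simulated, and the coefficients (-2.0, 0.5, -0.25) and function names are hypothetical.

```python
import random


def cov(x, y):
    """Sample covariance of two equal-length sequences."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    return sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / (len(x) - 1)


def proportion_explained(z, s, y):
    """PE = 1 - beta_adj / beta for treatment z, surrogate s, and outcome y."""
    # Unadjusted treatment effect: slope of the least-squares fit y ~ z.
    beta = cov(z, y) / cov(z, z)
    # Adjusted treatment effect: coefficient of z in the fit y ~ z + s,
    # written in closed form from the two-predictor normal equations.
    denom = cov(z, z) * cov(s, s) - cov(z, s) ** 2
    beta_adj = (cov(z, y) * cov(s, s) - cov(z, s) * cov(s, y)) / denom
    return 1 - beta_adj / beta


# Simulated trial (all values are assumptions for illustration).
rng = random.Random(0)
n = 20000
z = [float(rng.random() < 0.5) for _ in range(n)]    # 1 = treated, 0 = control
s = [-2.0 * zi + rng.gauss(0, 1) for zi in z]        # treatment lowers the score
y = [0.5 * si - 0.25 * zi + rng.gauss(0, 1)          # outcome depends on the score
     for si, zi in zip(s, z)]                        # plus a direct treatment effect

pe = proportion_explained(z, s, y)
print(f"Estimated proportion explained: {pe:.2f}")
```

In this simulation the total treatment effect is -2.0 × 0.5 − 0.25 = -1.25, of which -1.0 passes through the disease-oriented endpoint, so the true PE is 0.8 and the estimate should land close to that value. Note that PE estimates can fall outside the 0 to 1 range and tend to have wide confidence intervals when the overall treatment effect is small, so a reported PE is best accompanied by a measure of uncertainty.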

In conclusion, disease-oriented primary endpoints have become progressively more prevalent in large and high-impact IC trials. However, ample evidence exists to make clinicians cautious before embracing any new therapy based on studies showing improvements in disease-oriented endpoints that have not been validated against patient-oriented outcomes. To evaluate the relevance of disease-oriented endpoints, we suggest critically appraising the literature, studying the responsiveness between disease- and patient-oriented endpoints, and reporting the PE in clinical trials. These strategies will help clinicians be better informed about both the efficacy and the effectiveness of a treatment.