Of the organ failures that complicate sepsis, acute kidney injury (AKI) portends a particularly grave prognosis [1]. Discovery of pharmacological interventions that improve outcomes for critically ill patients with both sepsis and AKI is therefore a priority. However, achieving such a goal remains challenging. In this context, the REVIVAL trial investigators are to be commended on their multi-centre evaluation of ilofotase alfa (human recombinant alkaline phosphatase, hrAP) as a potential novel treatment for patients with sepsis-associated AKI (SA-AKI) [2]. The biological rationale behind hrAP is that of a broad detoxifying role through dephosphorylation and amelioration of systemic inflammation in sepsis, particularly in the kidney; this rationale was well supported by preclinical data [3, 4]. Previously, the phase 2 STOP-AKI study, performed in critically ill patients with sepsis but without evidence of AKI, showed neutral effects for hrAP on the primary efficacy end point of improved short-term kidney function, defined as the area under the curve for creatinine clearance over the first 7 days following enrolment [5]. However, in STOP-AKI, an improvement in kidney function was observed over a 28-day window, along with fewer major adverse kidney events (MAKE) at 60 and 90 days, a tertiary end point, driven primarily by differences in mortality between the groups. These differences in MAKE, in particular mortality, albeit described in post hoc exploratory analyses, are the key findings that prompted the REVIVAL trial [6].

The REVIVAL trial randomised 650 patients (46.4% of a planned recruitment of 1400 patients) within 24 h of the diagnosis of sepsis with AKI to either 3 days of intravenous hrAP or placebo. The trial was terminated prematurely for futility based on the low probability of detecting a difference in the primary efficacy end point of 28-day all-cause mortality. As has often been observed in phase 3 clinical trials, the observed 28-day mortality rates in the hrAP and control arms (27.9% and 27.9%) were both well below the expected 35% event rate in the control group. This negatively impacted the overall power of the planned study, which also incorporated a somewhat unrealistic proposed effect size of an 8% absolute mortality difference. Unlike STOP-AKI, the REVIVAL study primarily examined mortality and recruited patients at a later point in their course of critical illness, after evidence of AKI. Given this considerable difference in timing, one can argue that the REVIVAL cohort might have been less amenable to the biologic activity and risk modification of an early-acting anti-inflammatory intervention, whilst at the same time any potential benefit from avoiding AKI was lost.
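The effect of a lower-than-expected event rate on power can be illustrated with a simple calculation. The sketch below is not the REVIVAL statistical design, which is not detailed here. It assumes a fixed two-arm design with equal allocation (700 per arm from the planned 1400), a two-sided alpha of 0.05 and a normal-approximation test of two proportions, all of which are assumptions made for illustration. It further assumes that the relative effect implied by the planned figures (35% falling to 27%) is held constant when the control rate is the observed 27.9%.

```python
from math import sqrt
from statistics import NormalDist


def power_two_proportions(p_control, p_treat, n_per_arm, alpha=0.05):
    """Approximate power of a two-sided z-test comparing two proportions.

    Illustrative only: fixed design, equal allocation, normal approximation.
    """
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha / 2)
    p_bar = (p_control + p_treat) / 2
    # Standard error under the null (pooled) and under the alternative
    se_null = sqrt(2 * p_bar * (1 - p_bar) / n_per_arm)
    se_alt = sqrt(
        p_control * (1 - p_control) / n_per_arm
        + p_treat * (1 - p_treat) / n_per_arm
    )
    delta = abs(p_control - p_treat)
    return z.cdf((delta - z_crit * se_null) / se_alt)


N_PER_ARM = 700  # assumption: 1400 planned patients split equally

# Planned scenario from the text: 35% control mortality, 8% absolute reduction
power_planned = power_two_proportions(0.35, 0.27, N_PER_ARM)

# Assumed scenario: observed 27.9% control rate, same relative reduction (27/35)
p_treat_same_relative = 0.279 * (0.27 / 0.35)
power_lower_rate = power_two_proportions(0.279, p_treat_same_relative, N_PER_ARM)

print(f"Power, planned event rate:  {power_planned:.2f}")
print(f"Power, observed event rate: {power_lower_rate:.2f}")
```

Under these assumptions the second scenario is less well powered than the first. The result depends on the effect measure chosen: if the 8% absolute difference were held fixed instead of the relative reduction, the lower event rate alone would not reduce power in this approximation.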

As REVIVAL did not replicate the enrolment criteria of STOP-AKI, it remains uncertain if the observed survival benefit in STOP-AKI was real or a chance finding (Type 1 error). Such uncertainty is likely to persist as, realistically, detection of survival benefit from a single intervention in heterogeneous populations of critically ill patients with sepsis and AKI is challenging. Indeed, temporal improvements in intensive care unit (ICU) outcomes over the last decades likely largely reflect the incremental gains of process improvement and harm avoidance. Consequently, novel therapies may become established through demonstration of measurable effect on short- and longer-term organ function, and to that end the REVIVAL investigators have focused on improved longer-term kidney outcomes as a secondary end point, establishing a common signal with longer-term kidney benefit in STOP-AKI. This end point is presented as the occurrence of MAKE90, a composite of death, receipt of renal replacement therapy (RRT) or a specified decline in kidney function (as a surrogate for longer-term risk of kidney disease and its sequelae) at 90 days after intervention. Since it was first described over 10 years ago, the use of MAKE has increased, likely because of improved trial efficiency, and it has achieved acceptance by regulators [7]. However, like many composites, MAKE is not without challenges (Fig. 1). Although commonly assessed at 90 days after injury or exposure (MAKE90), time points from 7 days after AKI diagnosis up to 1 year have also been considered, reflecting a lack of standardisation. Moreover, aside from the challenge of non-equality between components of the composite (such as death relative to mild decline in kidney function), there is a lack of consensus in the definition of the components. For example, receipt of RRT may include any exposure to new RRT within the study window, as originally proposed, or continued RRT dependence at the end of the study window.
While more sensitive, any exposure to RRT as an end point is subject to variation in clinical practice in RRT initiation and may include a wide spectrum of severity (duration of RRT). Conversely, RRT dependence, while perhaps a more patient-centred and consistent end point for a trial focused on long-term kidney outcomes, fails to capture the potential harm and burden of any exposure to RRT during critical illness, even if relatively short. Similarly, the kidney dysfunction end point has been variably described as persistent elevation of serum creatinine above baseline or a percentage decrease in estimated glomerular filtration rate (eGFR) from premorbid baseline, with various thresholds employed. Importantly, this end point requires knowledge, or reliable imputation, of baseline kidney function and a follow-up assessment of kidney function. Finally, while a 25% decline in eGFR at 90 days has been the most frequently used MAKE-GFR criterion, there is a paucity of epidemiological evidence supporting an association between this threshold, or any other fixed or continuous measure of decline (e.g. slope change in GFR), and important clinical events and outcomes.

Fig. 1 Major adverse kidney events (MAKE) composite end point: formulation and considerations. MAKE can be defined over a range of timescales. Currently, all MAKE components have equal weighting; however, new criteria might incorporate some form of weighting or hierarchy.

In the REVIVAL trial, the application of MAKE reflects this lack of standardisation, with two separate definitions used: MAKE-A and MAKE-B. MAKE-B is the definition that was pre-specified in the trial registration, and included death, RRT dependence at 90 days or a 25% decrease in eGFR at both day 28 and day 90. The second end point (MAKE-A) was first presented in the published protocol and statistical analysis plan and was based on death, receipt of any new RRT up until day 28 or RRT dependence at day 90, an eGFR drop of 25% at day 90 (only), or re-hospitalisation up to day 90 [6]. The justification for use of this second end point at a late stage in the trial is unclear, although these changes would logically increase the likelihood of clinical events qualifying as a MAKE. Interestingly, the subsequently proposed MAKE-A outcome did demonstrate a significant difference in outcome at 28 days, suggesting benefit of hrAP, whereas the findings were neutral for hrAP compared with placebo with the originally planned MAKE-B definition.
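To make the difference between the two definitions concrete, they can be written as simple rules applied to one patient's follow-up data. This is a minimal sketch based only on the components as described above; the field names, the `PatientOutcome` structure, the 90-day window for death and the inclusive 25% threshold are assumptions for illustration and are not taken from the trial protocol. Missing eGFR values are not handled.

```python
from dataclasses import dataclass

# Assumption: a fall of 25% or more from baseline eGFR counts as an event
EGFR_DECLINE_THRESHOLD = 0.25


@dataclass
class PatientOutcome:
    """Hypothetical per-patient follow-up record (field names are illustrative)."""
    died_by_day90: bool
    new_rrt_by_day28: bool
    rrt_dependent_day90: bool
    egfr_decline_day28: float  # fractional fall from baseline, 0.30 = 30%
    egfr_decline_day90: float
    rehospitalised_by_day90: bool


def make_b(p: PatientOutcome) -> bool:
    """MAKE-B as described: death, RRT dependence at day 90,
    or a 25% eGFR decrease at both day 28 and day 90."""
    egfr_event = (
        p.egfr_decline_day28 >= EGFR_DECLINE_THRESHOLD
        and p.egfr_decline_day90 >= EGFR_DECLINE_THRESHOLD
    )
    return p.died_by_day90 or p.rrt_dependent_day90 or egfr_event


def make_a(p: PatientOutcome) -> bool:
    """MAKE-A as described: death, any new RRT to day 28 or RRT dependence
    at day 90, a 25% eGFR decrease at day 90 only, or re-hospitalisation."""
    egfr_event = p.egfr_decline_day90 >= EGFR_DECLINE_THRESHOLD
    return (
        p.died_by_day90
        or p.new_rrt_by_day28
        or p.rrt_dependent_day90
        or egfr_event
        or p.rehospitalised_by_day90
    )


# A survivor who had a short RRT course and recovered kidney function
short_rrt = PatientOutcome(False, True, False, 0.10, 0.05, False)
# A survivor whose eGFR decline met the threshold at day 90 but not at day 28
late_decline = PatientOutcome(False, False, False, 0.10, 0.30, False)

for label, patient in [("short RRT", short_rrt), ("late decline", late_decline)]:
    print(label, "MAKE-A:", make_a(patient), "MAKE-B:", make_b(patient))
```

Both hypothetical patients count as events under MAKE-A but not under MAKE-B, which shows how the broader definition can only add events relative to the narrower one.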

Where does this leave us? The use of multiple versions of MAKE in the REVIVAL trial presents challenges for interpretation. Firstly, if any RRT is included, it would be vital to ensure, and to describe, the relative standardisation of the indications for starting RRT and of its duration. Secondly, while the inclusion of rehospitalisation can be commended as an attempt to represent greater burden of healthcare need and poor patient experience, its direct relation to a kidney-specific outcome is unestablished. The authors claim that a similar measure was included in the MAKE definition used in the STOP-AKI study. However, the supplementary appendix of that study suggests that only re-hospitalisations for AKI were counted, which is a very different metric. Finally, the inclusion of a non-standard dual time point eGFR criterion in MAKE-B significantly reduced the number of events (see supplementary Table 4), suggesting confounding from reduction of serum creatinine generation at 28 days, while acute illness was likely ongoing. Overall, the data imply that including any RRT as a criterion and using only the 90-day eGFR as components of MAKE drove the difference between the MAKE-A and MAKE-B outcomes. Furthermore, the significance of the observed difference in any exposure to RRT in REVIVAL can be questioned, given that there was no difference in days alive and free of RRT up to day 28 (see supplementary Table 5). Given these concerns, we strongly suggest that future trials on AKI establish a single unambiguous definition of MAKE before commencement of recruitment.

What then should we conclude from this trial? Firstly, the choice to focus on mortality in a relatively unselected SA-AKI population, although laudable, may have been too ambitious without provisions for predictive enrichment of the trial with patients with a greater likelihood of having a favourable biologic response to hrAP. Secondly, a focus on dysfunction of the target organ as an end point, and/or some form of targeting to patients with a pathophysiology modifiable by the intervention, might improve the probability of detecting benefit, if one exists. Finally, consensus on the components and analytic strategy for the end point of MAKE for AKI treatment studies is needed. Future AKI studies should select components for MAKE end points that are important to patients, that are accepted by regulatory agencies, and that balance trial efficiency with feasibility, safety and efficacy. At first glance, it may be easy to dismiss the findings from the REVIVAL trial as another study showing no survival benefit in sepsis. However, as outlined, there are potential reasons for “failure” of the study, including the optimistic primary end point. Better targeting of the intervention, together with the use of hierarchical composite end points, may have presented us with a different story. In the meantime, we wait to see if the role of hrAP can be revisited, or indeed ‘revived’, in future studies.