Key Summary Points

Why carry out this study?

Evaluating the effect of pulmonary arterial hypertension (PAH) therapies on mortality is of the utmost relevance to patients and clinicians.

However, in randomized clinical trials, standard intention-to-treat (ITT) analyses may inaccurately estimate the true survival benefit of a therapy if patients switch treatment during the study; for example, if control patients switch to receive experimental treatment following worsening of disease symptoms.

In the large, randomized SERAPHIN study, a non-significant decrease in the risk of all-cause mortality in an ITT analysis with macitentan 10 mg versus placebo up to end of study was reported in patients with PAH. This ITT analysis was performed in the context of many placebo patients switching to receive active therapy which may have confounded the treatment effect.

What was learned from the study?

In additional analyses that adjusted for treatment switching in SERAPHIN, macitentan 10 mg had a greater effect on overall survival in patients with PAH than originally reported.

These additional analyses highlight that, when treatment switching occurs, results from randomized clinical trials should not be solely interpreted using ITT analyses.

Introduction

Treatment switching is a frequent occurrence in randomized controlled trials (RCTs) and arises when patients discontinue their randomized treatment and switch to an alternative treatment [1]. It is especially common in trials in which control group patients are allowed to switch to the experimental or alternative treatment after symptoms of disease progression. While withholding an effective treatment from patients in the placebo group would be unethical, a consequence of treatment switching can be that standard statistical methods, such as intention-to-treat (ITT) analyses, provide inaccurate estimates of the treatment effect on endpoints evaluated up to study closure, such as overall survival [2]. Accurate estimates of overall survival are important, not only for patients and clinicians, but also for regulators and health technology assessments.

Macitentan is a dual endothelin receptor antagonist approved for the treatment of pulmonary arterial hypertension (PAH), a rare, progressive and ultimately fatal disease [3]. The landmark event-driven SERAPHIN study, the largest clinical study in PAH at the time (N = 742), evaluated the safety and efficacy of macitentan and was the first study to demonstrate that a PAH therapy could provide long-term clinically meaningful benefits [4]. In this study, patients with PAH were treated with macitentan or placebo for a median of 2 years. Macitentan 10 mg significantly reduced the risk of the primary composite endpoint of morbidity and mortality by 45% versus placebo (p < 0.0001) up to end of treatment (EOT).

SERAPHIN also reported a 23% decrease in the risk of all-cause mortality in an ITT analysis with macitentan 10 mg versus placebo up to study closure [4]. This was not significant; however, the study was not powered to detect a survival benefit as recruiting a sufficient number of patients with a rare disease such as PAH is not feasible. This secondary endpoint was further confounded by patients being allowed to switch therapy following disease progression. Patients could, for example, switch from placebo to macitentan 10 mg (or another approved PAH-specific drug) following a confirmed morbidity event, and patients in the macitentan group could also stop receiving macitentan at any point in the study. The ITT survival estimate is thus not an estimate of the direct effect of macitentan 10 mg on survival compared with placebo, but rather an estimate arising from a mixture of the different treatment patterns observed during the study.

The question of interest is: what would the overall survival treatment effect have been in SERAPHIN had treatment switching not occurred? This has previously been explored using a model-based approach in which real-world observational data were used to predict what the survival in the SERAPHIN population would have been had patients not received macitentan [5]. By comparing this predicted survival with the observed survival in SERAPHIN with macitentan 10 mg, that analysis indicated that a 35% decrease in mortality [5] was more representative of the true treatment effect of macitentan 10 mg than the reported 23% [4], supporting the hypothesis that treatment switching had substantially confounded the analysis of time to all-cause death up to study closure.

Given these results, it was of further interest to see if this question could be similarly answered by applying statistical methods directly to the SERAPHIN study data, rather than through the use of an external control arm. The observational-based inverse probability of censoring weights (IPCW) and the randomized-based rank-preserving structural failure time (RPSFT) model are two validated treatment switching adjustment methods [6, 7, 24]. These methods apply counterfactual arguments to reconstruct data in the absence of switching, with the aim of reducing bias and allowing the treatment effect to be estimated more accurately. Such methods are widely accepted by regulators and health technology assessment bodies [8–23] and are increasingly applied in cancer and a number of other therapeutic areas that apply time-to-event endpoints [14, 16, 18]. As yet, their application in PAH, where the use of long-term morbidity/mortality outcomes has replaced short-term outcomes only relatively recently, has not been published.

Here, we further explore mortality in the SERAPHIN study by applying the IPCW and RPSFT methodologies directly to the study to statistically correct for switching in an attempt to produce adjusted treatment effect estimates on overall survival, up to study closure.

Methods

The SERAPHIN Study

SERAPHIN was a global, multicenter, double-blind, randomized, placebo-controlled, event-driven, phase 3 study (ClinicalTrials.gov identification number: NCT00660179) [4]. The study included 742 patients with PAH who were randomized (1:1:1) to receive placebo (n = 250), macitentan 3 mg (n = 250), or macitentan 10 mg (n = 242) once daily (Fig. 1). Concomitant treatment with a stable dose of phosphodiesterase type 5 inhibitors, oral/inhaled prostanoids, calcium channel blockers, or l-arginine was allowed. Only the approved macitentan 10 mg dose arm was included for the purpose of this analysis; patients randomized to macitentan 3 mg were not included.

Fig. 1

Schematic of the SERAPHIN study design. Time to all-cause death was evaluated up to study closure (15 March 2012), which occurred after a pre-defined number of primary endpoint events (285). Up to end of treatment (EOT), patients received their randomized double-blind treatment. Following EOT, patients could receive open-label (OL) macitentan 10 mg or any available alternative pulmonary arterial hypertension (PAH) therapy between EOT and study closure. Asterisk: 250 patients were also randomized to macitentan 3 mg, but only the approved macitentan 10 mg dose was included for the purpose of this analysis

The primary time-to-event endpoint was a composite of morbidity or mortality, whichever occurred first, up to EOT + 7 days. Events were defined as worsening of PAH, initiation of intravenous or subcutaneous prostanoid therapy, or the need for lung transplantation or atrial septostomy, or death from any cause. Worsening of PAH was defined by the occurrence of all three of the following: a decrease from baseline of ≥ 15% in 6-min walk distance (6MWD); worsening of PAH symptoms (a change from baseline to a higher World Health Organization functional class [FC] [or no change in patients who were in FC IV at baseline], and/or the appearance or worsening of signs of right heart failure that did not respond to oral diuretic therapy); and the need for additional treatment for PAH. All primary endpoint events were adjudicated by a blinded independent clinical-event committee (CEC).

Double-blind treatment continued until patients experienced a primary endpoint event or until 285 events had accrued. Patients were followed until withdrawal from the study or until study closure (March 2012). Patients who prematurely discontinued double-blind treatment and provided written informed consent were followed up to study closure. Those who experienced a nonfatal primary endpoint event and terminated double-blind treatment were eligible to receive another PAH therapy, including open-label macitentan 10 mg. In addition, all patients who completed the study as scheduled were eligible to continue in the SERAPHIN open-label study and receive open-label macitentan 10 mg once daily (ClinicalTrials.gov identifier: NCT00667823). Vital status follow-up at study closure was performed for all patients who had not prematurely discontinued from the study (i.e., died, withdrawn consent, or had been declared lost to follow-up). Patients with missing vital status at study closure were censored at date of last contact.

SERAPHIN was conducted in accordance with the amended Declaration of Helsinki and with the protocols reviewed by local institutional review boards with written informed consent obtained from all patients. This analysis is based on data from the SERAPHIN study and does not contain any new studies with human participants performed by any of the authors.

Treatment Switching

The study cut-off date used was the date of study closure, defined as 15 March 2012. Treatment switching was defined as patients randomized to placebo who switched to open-label macitentan 10 mg and patients randomized to macitentan 10 mg who discontinued macitentan 10 mg up to study closure.

For the placebo group, the time of switch was defined as the time of starting open-label macitentan 10 mg. For the macitentan 10 mg group, a 7-day grace period was applied to account for possible residual effects of treatment and thus the time of switch (i.e. end date of macitentan 10 mg) was defined as the last date the patient received macitentan 10 mg + 7 days. Treatment duration is reported without the 7-day grace period.
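The switch-time definition above can be sketched in a few lines of Python. This is an illustration only, not the study's analysis code: the function name, the arm labels, and the convention that a missing date means "no switch" are assumptions introduced here.

```python
from datetime import date, timedelta

# 7-day grace period applied to the macitentan 10 mg group (from the text).
GRACE_PERIOD = timedelta(days=7)

def switch_date(arm, open_label_start=None, last_macitentan_dose=None):
    """Return the date of treatment switch, or None if no switch occurred.

    arm: "placebo" or "macitentan" (hypothetical labels).
    open_label_start: date open-label macitentan 10 mg was started, if any.
    last_macitentan_dose: date of last macitentan 10 mg dose for patients
        who discontinued, if any.
    """
    if arm == "placebo":
        # Switch = start of open-label macitentan 10 mg.
        return open_label_start
    if arm == "macitentan":
        # Switch = last dose + 7 days, to allow for residual treatment effect.
        if last_macitentan_dose is None:
            return None
        return last_macitentan_dose + GRACE_PERIOD
    raise ValueError(f"unknown arm: {arm!r}")

print(switch_date("macitentan", last_macitentan_dose=date(2011, 1, 1)))
```

Treatment duration, as noted above, would be computed from the last dose itself rather than from the value returned here.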

The overall cumulative exposure time of macitentan 10 mg up to study closure was assessed in patients randomized to placebo and macitentan 10 mg. For patients who discontinued randomized macitentan 10 mg, the non-exposure time to macitentan was also assessed, from EOT to study closure.

Statistical Analyses

For the ITT analysis, overall mortality was defined as the time from date of randomization to date of death due to any cause up to study closure. The IPCW and RPSFT methodologies have been described in full previously [6, 7, 24].

The IPCW method artificially censors patients at the time of treatment switch and estimates weights for the follow-up information of the remaining patients according to baseline and time-varying demographic and disease-related characteristics [24, 25]. This adjusts for any potential bias created by the censoring due to switching and allows the survival function to be estimated in the absence of switching. The IPCW method is reliant on the 'no unmeasured confounders' assumption, which assumes that the probability of switching at a given time depends only on the covariates included in the model [24, 25].
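The artificial censoring step can be illustrated with a minimal Python sketch. It assumes time is measured in months and that follow-up is expanded into one row per monthly interval before weights are estimated; the function names and the row layout are hypothetical and are not taken from the study.

```python
import math

def artificially_censor(follow_up, died, switch_time=None):
    """Censor a patient at the time of treatment switch.

    follow_up: time to death or last contact.
    died: True if death was observed at follow_up.
    switch_time: time of treatment switch, or None if the patient never switched.
    Returns (time, event) as used in the weighted analysis.
    """
    if switch_time is not None and switch_time < follow_up:
        # Follow-up after the switch is discarded; a later death is not counted.
        return switch_time, False
    return follow_up, died

def to_interval_rows(time, event, interval=1.0):
    """Expand one patient's follow-up into one row per interval (default 1 month)."""
    n = max(1, math.ceil(time / interval))
    # The event flag is set only in the final interval.
    return [{"interval": k, "event": event and k == n} for k in range(1, n + 1)]

time, event = artificially_censor(follow_up=30.0, died=True, switch_time=12.5)
rows = to_interval_rows(time, event)
print(time, event, len(rows))
```

The weights described next compensate for the information lost by this censoring.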

A number of baseline and time-dependent covariates were used in this multivariate IPCW model to generate time-varying stabilized weights for each patient (Table 1). Stabilized weights were calculated by fitting logistic regression models predicting the probability of not switching treatment and remaining uncensored. Censoring weights were modeled to account for potential informative censoring. The time-dependent intercepts in the logistic models were estimated using a smooth cubic spline function of time. The analysis was performed setting the length of the intervals to 1 month. The weights were then used in a weighted Cox proportional hazards regression model to obtain an adjusted estimate of the treatment effect. A 'robust' standard error was used for the estimation of the confidence interval (CI) because the stabilized weights, being estimated from the same dataset, introduce within-patient correlation. To minimize the influence of extreme stabilized weights, analyses were performed in which weights were truncated at 1% and 99%. Patients with final weights more extreme than the threshold had their weights set to the threshold level. Analyses were repeated without truncation.
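As a rough sketch of the weighting and truncation steps, the following Python code assumes that the fitted logistic models have already produced, for each monthly interval, a patient's probability of remaining unswitched and uncensored given baseline covariates only (numerator) and given baseline plus time-varying covariates (denominator). The function names and the use of the standard library's inclusive percentile definition are assumptions for illustration; the percentile rule used in the study is not stated.

```python
from itertools import accumulate
from operator import mul
from statistics import quantiles

def stabilized_weights(p_numerator, p_denominator):
    """Cumulative stabilized weights for one patient, one value per interval.

    p_numerator[k]:   P(not switched in interval k | baseline covariates)
    p_denominator[k]: P(not switched in interval k | baseline + time-varying covariates)
    """
    ratios = [n / d for n, d in zip(p_numerator, p_denominator)]
    # The weight at interval k is the product of the ratios up to and including k.
    return list(accumulate(ratios, mul))

def truncate_weights(weights, lower_pct=1, upper_pct=99):
    """Set weights beyond the chosen percentiles to the percentile value."""
    cuts = quantiles(weights, n=100, method="inclusive")  # 99 cut points
    lo, hi = cuts[lower_pct - 1], cuts[upper_pct - 1]
    return [min(max(w, lo), hi) for w in weights]

print(stabilized_weights([0.9, 0.9], [0.9, 0.45]))
```

Weights with a mean close to 1 suggest a well-behaved model, which is why the distribution of the stabilized weights is examined in the Results.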

Table 1 Covariates used in the inverse probability of censoring weighted model

The RPSFT method estimates each patient's counterfactual survival time, i.e., the survival time that would have been observed had treatment switching not occurred, through a treatment-effect (acceleration) parameter ψ [6]. Re-censoring was performed in an attempt to make censoring non-informative on the counterfactual time scale. A Cox proportional hazards model was fitted to the re-censored adjusted survival times for the macitentan 10 mg and placebo groups to estimate the adjusted HR, i.e., the treatment effect that would have been observed had switching not occurred. The symmetrical test-based 95% CIs were obtained by inflating the standard error of the log-hazard ratio to preserve the ITT p value [26].

The RPSFT method relies on: (1) the common treatment assumption, i.e., the treatment effect is equal across all patients, relative to the duration of time the treatment was taken for, and (2) the randomization assumption, i.e., in the absence of treatment, survival times are independent of the randomized group. It is assumed that no other factor other than macitentan induces a difference in survival between the treatment groups and that the censoring mechanism is non-informative, i.e., at the time of treatment switch, the survival prognosis of patients who switch treatment is the same as patients remaining on randomized treatment [6].
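The counterfactual time scale used by the RPSFT model can be written as U = T_off + exp(ψ) × T_on, where T_on and T_off are the times spent on and off the experimental treatment. The sketch below illustrates this standard formulation in Python; the function name and the example values (including ψ = −0.5) are hypothetical and are not study data.

```python
import math

def counterfactual_time(time_off, time_on, psi):
    """Counterfactual (never-treated) survival time U = T_off + exp(psi) * T_on.

    With psi < 0, time spent on treatment is shrunk, i.e., the patient would
    have survived for a shorter time without treatment.
    """
    return time_off + math.exp(psi) * time_on

# Hypothetical placebo patient: 60 weeks on placebo, then 40 weeks on
# open-label macitentan 10 mg before death.
observed = 60.0 + 40.0
untreated = counterfactual_time(time_off=60.0, time_on=40.0, psi=-0.5)
print(f"observed {observed:.1f} weeks, counterfactual {untreated:.1f} weeks")
```

Under the randomization assumption, ψ is chosen so that these counterfactual times are balanced between the randomized groups; under the common treatment assumption, the same ψ applies whether treatment starts at randomization or at the time of switch.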

Results

Treatment Switching

In SERAPHIN, at study closure, 183 of 250 patients (73.2%) randomized to placebo had switched to open-label macitentan 10 mg, of whom 86 (34.4%) switched to macitentan 10 mg following a CEC-confirmed morbidity event and 97 (38.8%) switched without an event (Fig. 2). Patients who switched following a CEC-confirmed event did so after a mean (median) of 54.2 (45.1) weeks, and they initiated macitentan 10 mg treatment as early as 2.9 weeks after having been randomized to placebo (range 2.9–160.6 weeks). Among these patients, the mean (median) duration of macitentan 10 mg was 76.0 (71.7) weeks. Patients who switched without an event did so after a mean (median) of 135.9 (131.0) weeks (range 111.0–184.9 weeks). The mean (median) duration of macitentan among these patients was 5.0 (5.3) weeks. Overall, among placebo patients who switched, exposure time to macitentan 10 mg represented 28.2% of total study treatment exposure (cumulative exposure 134.6 patient-years); for the total placebo group (including those who did not switch), the exposure time to macitentan 10 mg represented 24.8% of total exposure to study treatment.

Fig. 2

Exposure to macitentan 10 mg during SERAPHIN up to study closure. Heights of the boxes are proportional to the patient number. Data within the figure are mean weeks; data in the table are median (range) treatment duration (weeks) and cumulative exposure (years). DB Double-blind, pts patients, wk week, yrs years. Asterisk indicates from EOT to study closure. Treatment duration does not include the 7-day grace period and is calculated from randomization to treatment end

In addition to patients in the placebo group (placebo patients) switching to macitentan, the analyses also consider the time off macitentan in patients randomized to macitentan 10 mg. In the macitentan 10 mg group, at the time of study closure, 60 patients were not receiving open-label macitentan and the mean (median) time off macitentan for these patients, i.e., the time between EOT and study closure, was 44.3 (12.2) weeks (range 0.0–162.9 weeks) (Fig. 2).

All-Cause Mortality

There were 36 deaths (14.9%) from any cause in the macitentan 10 mg group and 45 (18.0%) in the placebo group, up to study closure (15 March 2012). This includes two additional deaths (1 per treatment arm) that were captured after data cleaning, and which were not reported by Pulido et al. [4]. Including these two additional deaths, the ITT analysis (unadjusted for treatment switching) of overall mortality up to study closure showed a 20% reduction in the risk of mortality in macitentan 10 mg versus placebo treated patients; this was not statistically significant (hazard ratio [HR] 0.80; 95% CI 0.51, 1.24).

Vital status could not be recorded for seven (2.9%) and 11 (4.4%) patients in the macitentan 10 mg and placebo groups, respectively, due to patients’ withdrawal of consent, loss to follow-up, and administrative reasons. These patients were censored at the date of last contact and assumed to be alive at that time.

Inverse Probability of Censoring Weighted Method

The results of the two adjustment methods used in this analysis are presented in Table 2. The truncated IPCW multivariate estimate of the HR of all-cause death was 0.42 (95% CI 0.22, 0.81; p = 0.009), i.e., continuous treatment with macitentan 10 mg was associated with a 58% reduction in the risk of mortality versus placebo (i.e., patients never on macitentan).
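The percentage reductions quoted alongside the hazard ratios follow from the usual conversion (1 − HR) × 100. A minimal Python illustration, using the HRs reported in the text (the function name is an assumption):

```python
def risk_reduction_pct(hazard_ratio):
    """Relative reduction in the hazard, in percent, implied by a hazard ratio."""
    return (1.0 - hazard_ratio) * 100.0

for label, hr in [("ITT", 0.80), ("IPCW, truncated", 0.42)]:
    print(f"{label}: HR {hr:.2f} -> {risk_reduction_pct(hr):.0f}% reduction")
```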

Table 2 Estimates of overall survival treatment effect up to study closure

Eight patients in the macitentan 10 mg group and three in the placebo arm prematurely discontinued macitentan 10 mg, switched from placebo to macitentan 10 mg, or were censored during the first month of follow-up. These patients were included in the modeling of censoring weights, but final stabilized weights were not defined for these patients as weights in this model are set as missing starting from the moment of treatment switch or censoring.

An exploration of the distribution of the truncated stabilized weights showed that for the macitentan 10 mg group, the mean was close to 1 (0.995), the standard deviation was small (0.082), and only 15 individual stabilized weights were truncated (0.23%) (Table 3). No weights were truncated before 30 months. In the placebo group, the mean of the stabilized weights deviated from 1 even after truncation (0.894), the standard deviation was 0.186, and 223 weights were truncated (4.17%) (Table 3); weights were truncated as early as 2 months.

Table 3 Descriptive statistics of the inverse probability of censoring weighted final stabilized weights before and after truncation

The truncated HR was similar to the HR before truncation (untruncated HR: 0.46; 95% CI 0.23, 0.94; p = 0.033) (Table 2). Final stabilized weights in the macitentan group were also similar before and after truncation. In the placebo group, the mean was brought closer to 1 following truncation, but still deviated from 1. In neither group were extreme weights observed for patients who died (Electronic Supplementary Material [ESM] Table S1).

Rank-Preserving Structural Failure Time Method

The RPSFT model produced a ψ point estimate of −0.46 (95% CI −1.26, 0.17). This translates into a relative survival benefit of 1.58 (95% CI 0.84, 3.53) for macitentan 10 mg; that is, had a patient continually received macitentan 10 mg, their survival time would have been 1.58-fold longer than had they never received macitentan. The estimation method resulted in additional censoring, presented in ESM Table S2.
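The relative survival benefit is the exponentiated, sign-reversed ψ estimate. The short Python check below assumes the sign convention implied by the reported numbers, in which a negative ψ corresponds to longer survival on treatment; the function name is introduced here for illustration.

```python
import math

def survival_time_ratio(psi):
    """Factor by which survival time is extended under continuous treatment: exp(-psi)."""
    return math.exp(-psi)

psi_hat, psi_lower, psi_upper = -0.46, -1.26, 0.17

point = survival_time_ratio(psi_hat)
# The CI limits swap because exp(-psi) is a decreasing function of psi.
ci = (survival_time_ratio(psi_upper), survival_time_ratio(psi_lower))
print(f"{point:.2f} ({ci[0]:.2f}, {ci[1]:.2f})")  # 1.58 (0.84, 3.53)
```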

The RPSFT-adjusted estimated HR for overall survival for randomization to macitentan 10 mg (i.e., patients always received macitentan 10 mg) versus placebo (i.e., patients never received macitentan) was 0.33 (95% CI 0.04, 2.83), i.e., a 67% reduction in the risk of mortality with macitentan 10 mg versus never receiving macitentan. The 95% CI was wide, and its upper limit exceeded 1.00; thus, the adjusted analysis, although indicating a trend toward a survival benefit, was not statistically significant.

Discussion

The SERAPHIN study was the first study designed to evaluate the long-term effect of PAH therapy on disease progression up to EOT in patients with PAH. The results demonstrated that macitentan 10 mg significantly reduced the risk of the composite primary endpoint of morbidity/mortality up to EOT by 45% compared with placebo [4]. Macitentan 10 mg also improved, albeit non-significantly, overall survival up to study closure. However, the secondary endpoint of time to all-cause death was evaluated up to study closure, by which time many patients were no longer receiving the treatment to which they had been randomized because of the substantial treatment switching that occurred between EOT and study closure. As a result, as demonstrated in these additional analyses, the ITT approach appears to be heavily biased. When analyses were adjusted for the confounding effects of switching (either placebo patients switching to macitentan 10 mg or macitentan 10 mg patients prematurely discontinuing macitentan), the results indicated a substantially greater treatment effect on overall mortality with macitentan 10 mg than was originally reported.

The IPCW and RPSFT methods, two established and widely used modeling techniques, resulted in point estimates for the HR of overall survival ranging from 0.33 to 0.46, indicating a markedly greater treatment effect than the unadjusted ITT HR of 0.80. Results were statistically significant for the IPCW method. For the RPSFT method, the CI was wide and the result was not significant. By design, the RPSFT model retains the p value from the ITT analysis and, therefore, when the point estimate of the HR is reduced, the CI widens [27].
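The widening of the CI can be seen in a small Python sketch of a test-based interval. It assumes a Wald-type approximation in which the ITT test statistic is recovered from the published (rounded) ITT HR and CI, and the standard error of the adjusted log-HR is then chosen so that the same statistic, and hence the same p value, is retained. The function name is hypothetical, and because the inputs are rounded and the authors' exact ITT p value is not used, the output is not expected to reproduce the published interval exactly.

```python
import math
from statistics import NormalDist

Z975 = NormalDist().inv_cdf(0.975)  # two-sided 95% critical value

def p_preserving_ci(hr_adjusted, hr_itt, itt_lower, itt_upper):
    """Symmetric (log-scale) 95% CI for an adjusted HR that keeps the ITT p value."""
    # Standard error of the ITT log-HR, recovered from its 95% CI.
    se_itt = (math.log(itt_upper) - math.log(itt_lower)) / (2 * Z975)
    z_itt = math.log(hr_itt) / se_itt
    # Inflate the standard error so that log(HR_adjusted) / SE equals z_itt.
    se_adjusted = abs(math.log(hr_adjusted) / z_itt)
    half_width = Z975 * se_adjusted
    log_hr = math.log(hr_adjusted)
    return math.exp(log_hr - half_width), math.exp(log_hr + half_width)

lower, upper = p_preserving_ci(0.33, hr_itt=0.80, itt_lower=0.51, itt_upper=1.24)
print(f"adjusted HR 0.33, approximate 95% CI {lower:.2f} to {upper:.2f}")
```

Because the test statistic is fixed, moving the point estimate further from 1 necessarily stretches the interval on the log scale, and an ITT interval that includes 1 leads to an adjusted interval that also includes 1.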

These point-estimate reductions in the HR for overall survival are perhaps not surprising given that over one third of all patients randomized to placebo had switched to macitentan 10 mg by study closure following disease progression. Within the placebo group, exposure to macitentan 10 mg represented 24.8% of total exposure to study treatment. Moreover, this is not the first time an underestimation of the observed overall survival with macitentan 10 mg in SERAPHIN has been reported. The results from the current additional analyses support the findings from Torbicki et al. who used a prediction model-based approach to further explore survival with macitentan 10 mg, using real-world observational data in conjunction with SERAPHIN [5]. A survival prediction model was built based on baseline characteristics of a subgroup of patients from the US REVEAL registry who would have met the SERAPHIN eligibility criteria. The model was applied to all 742 patients in SERAPHIN to predict their survival had they received real-world treatment, i.e., no macitentan. This predicted survival was then compared with the observed survival in SERAPHIN for patients treated with macitentan 10 mg. The analysis resulted in a HR of 0.65 (95% CI 0.46, 0.90; p = 0.010) for overall survival, indicating that over 3 years, the observed risk of mortality with macitentan 10 mg was 35% lower than the predicted mortality for all SERAPHIN patients had they never received macitentan 10 mg [5]. This and the current analyses are complementary approaches, with the former using real-world evidence data and the current analyses using clinical trial data and adjusting for confounding due to treatment switches. Although the results should be regarded as exploratory, the findings are, reassuringly, consistent, and further demonstrate that treatment switching can have substantial implications when interpreting overall survival results in PAH studies.

A number of approaches can be used to adjust for the confounding impact of treatment switching. Simple adjustment methods, such as excluding patients who switch or censoring patients at time of switch, are prone to severe selection bias as switching is usually related to prognosis [2, 9, 28] and, in general, are not considered appropriate [29, 30]. More complex switch adjustment methods, such as the IPCW [24] and RPSFT [6] models, are generally preferable and overcome some of the key fundamental limitations associated with the more naïve analyses and, providing the assumptions hold, these models produce unbiased adjustments [2]. Given that there is no single optimal statistical method for adjusting for treatment switching in clinical trials [28], it is recommended that more than one model be used when doing such analyses [2]. For this reason, we performed both IPCW and RPSFT.

As with all analyses using treatment switching adjustment methods, limitations related to the model assumptions should be considered. The IPCW relies upon the no unmeasured confounders assumption, which requires that all variables that influence the probability of a switch and that are prognostic are captured [28]. The SERAPHIN study does lend itself to an IPCW analysis given the large amount of baseline and time-dependent data that were collected on a number of prognostic variables. We do acknowledge that while background PAH therapy was included as a baseline covariate, any changes to the background therapy during the study were not included in the model. In addition, patients were considered to be on 'active treatment' only if they were treated with macitentan. The results do not take into consideration the possible effect of alternative rescue therapy following a primary endpoint event, and we acknowledge that this could affect the assumptions for the IPCW as well as the RPSFT, which assumes that only macitentan induces a difference in survival [6]. IPCW results can also be prone to error with small sample sizes and if almost all patients switch (> 90% [2]) and/or very few events are observed in patients who do not switch [2, 28, 31]. Given the size of the SERAPHIN study and the proportion of patients switching, we considered the IPCW an appropriate method to use.

The RPSFT method includes the complete data set of patients in the study and compares the counterfactual survival times for both treatment groups as if no switching had occurred. The RPSFT method estimates the adjusted treatment effect based purely on the randomization of the trial: it assumes that in the absence of macitentan in both arms, the survival times would have been similar [6]. In large trials such as SERAPHIN, this is a reasonable assumption [28]. The RPSFT method also relies heavily upon the common treatment effect assumption [2], namely, it assumes that the treatment effect is the same regardless of whether the experimental therapy is given from randomization or from time of treatment switch. This assumption may not be plausible given that many patients switched from placebo to macitentan 10 mg after disease progression. These patients could potentially have a reduced capacity to benefit from macitentan 10 mg compared to those who received macitentan 10 mg at randomization. This assumption is impossible to test.

The statistical methodologies presented here provide an additional means of exploring the overall survival benefits when the more traditional ITT approach is confounded by substantial treatment switching. The application of these statistical methods is an evolving area of research, and here we present their first application in PAH. In rare diseases such as PAH, where it is not feasible to recruit a sufficient number of patients to adequately power a study to evaluate mortality alone and where patients receive active rescue therapy in response to disease progression, we may see these methods in the future being more commonly used in long-term studies. However, they can only be robustly applied if the correct data are collected appropriately from a well-designed trial. Analyses should be prospectively planned, with pre-definition of variables to be used as covariates and ensuring the protocol and case report forms are designed to collect the variables at the appropriate time together with agreeing the strategy with health authorities. Together, these should improve the robustness and validity of future analyses.

Conclusion

In conclusion, after adjusting for the treatment switching that occurred in the SERAPHIN study, the results of the current analyses, consistent with previous findings [5], suggest that the estimated survival benefit of macitentan 10 mg in PAH is far greater than the 20% estimated using the standard ITT analysis in SERAPHIN. The IPCW estimated a 58% reduction in risk of mortality versus placebo, and the RPSFT estimated a 67% reduction. Although an ITT analysis provides an unbiased estimate of the treatment effect according to randomization, these adjusted analyses suggest that, in the presence of treatment switching, such methods can provide more reliable estimates of the true treatment effect of therapy on mortality than the standard ITT analysis. Such information is of utmost relevance to patients, caregivers, and clinicians in developing expectations about the disease course.