Key Points

Olaparib is the first and only PARPi approved for mCRPC based on positive phase III evidence.

Olaparib demonstrated a significant improvement in overall survival compared to control in PROfound.

However, almost 70% of patients randomized to the control arm switched to olaparib following disease progression.

This study explores validated methods to adjust for control patients switching to olaparib using the final data from PROfound.

All methods explored demonstrate that the observed overall survival results are likely to be underestimated.

1 Introduction

In recent years, the emergence of new treatments for metastatic castration-resistant prostate cancer (mCRPC) has led to improved survival outcomes [1]. This includes novel molecularly targeted agents, such as olaparib, an oral poly(ADP-ribose) polymerase inhibitor (PARPi), which has been investigated in the PROfound trial (NCT02987543) [2, 3].

PROfound is a prospective, randomized, open-label, phase III trial that included patients with mCRPC who had one or more of 15 homologous recombination repair (HRR) gene alterations based on prospective tissue testing and who had experienced disease progression while receiving a new hormonal agent. Patients with BRCA1/2 germline mutations [4] have known sensitivity to targeted therapy with PARPi [5, 6].

Patients were assigned to cohorts depending on their gene alterations: patients with at least one alteration in BRCA1, BRCA2, or ATM were assigned to Cohort A (n = 245); patients with alterations in any of 12 other prespecified genes were assigned to the exploratory Cohort B (n = 142). Patients were randomized (2:1) to olaparib (300 mg twice daily) or investigator’s choice of new hormonal agent (enzalutamide 160 mg once daily or abiraterone 1,000 mg once daily plus prednisone 5 mg twice daily, hereafter called ‘control’). Overall survival (OS) was a key secondary endpoint for Cohort A and was assessed in a hierarchical fashion. Patients randomized to control could switch to olaparib upon confirmed radiological disease progression, provided they met all eligibility criteria.

Olaparib is the first and only PARPi approved for mCRPC based on positive phase III evidence [2, 3]. Olaparib was approved by the US Food and Drug Administration (FDA) for patients with mCRPC and somatic and/or germline alterations in any of 14 HRR genes (excluding the PPP2R2A gene) who had experienced disease progression while receiving enzalutamide or abiraterone [7]. Olaparib has also been approved by the European Medicines Agency (EMA) for patients with mCRPC and BRCAm gene alterations who had experienced disease progression while receiving a new hormonal agent [8].

Results from PROfound reported an OS benefit for patients treated with olaparib compared with control across Cohort A, A+B, and BRCAm populations [2, 3]. For the Cohort A, A+B, and BRCAm populations, OS hazard ratios (HRs) for olaparib compared with control were 0.69 (95% confidence interval [CI] 0.50–0.97), 0.79 (0.61–1.03) [3], and 0.63 (0.42–0.95) [8], respectively; the OS results in Cohort A were statistically significant. The OS endpoint for other populations was not included in statistical testing. There was no observed benefit in Cohort B. However, these results may underestimate the effect of olaparib as around 66–69% of patients in the control arm switched to olaparib, depending on the population considered.

Treatment switching (‘switching’) in oncology trials is common [9]. For the PROfound trial it was agreed with the steering committee and regulatory bodies (FDA and EMA) that switching from control to olaparib should be allowed upon radiographic progression (blinded independent central review [BICR]-confirmed up to data cutoff 1 [DCO1] and investigator-assessed beyond DCO1) if appropriate for the patient.

For Health Technology Assessment and payer purposes, it is important to understand the incremental benefit of the new treatment compared with standard of care without results being confounded by switching. Various statistical methods exist to adjust OS outcomes for switching, ranging from ‘naïve’ methods, such as censoring switchers or excluding them from the analysis, to more ‘complex’ methods, such as Rank Preserving Structural Failure Time Models (RPSFTM), Inverse Probability of Censoring Weights (IPCW), and Two-Stage Estimation (TSE). These methods are acknowledged and accepted by Health Technology Assessment agencies and regulatory bodies [10,11,12]. Identifying the appropriate adjustment method for PROfound is key to estimating the magnitude of the OS benefit for patients treated with olaparib had switching from control to olaparib not occurred.

Hussain et al. have partially explored adjusting for switching in PROfound as part of a prespecified sensitivity analysis using the RPSFTM for Cohort A, Cohort B, and Cohort A+B [3]. It has also been explored using the interim OS data [13].

The objective of this study is to evaluate the impact of switching on OS using the final data, comparing published adjustment methods and identifying the most appropriate method. We consider Cohort A and two further populations approved for olaparib treatment by the EMA and US FDA: BRCAm and Cohort A+B (excluding the PPP2R2A gene).

2 Methods

2.1 Patients

Detailed methodology and eligibility criteria for PROfound have been published elsewhere [2].

In this study, OS results adjusted for switching from the control arm to olaparib were primarily explored in Cohort A. Two secondary analyses were also included. The first was in patients with a BRCAm, the population included in the EMA label [8]; this is the largest gene subgroup in the study (n = 160) and was a predefined subgroup analysis in PROfound. The second was in the FDA-approved population, that is, those in Cohort A + B (minus PPP2R2A) [7] (n = 372). Methodology and selected results pertaining to the Cohort A + B (minus PPP2R2A) population are included in the Online Supplementary Material (OSM).

Following disease progression, patients randomized to the control arm were eligible to receive olaparib, provided that no subsequent anti-cancer therapies had been received following discontinuation and that toxicities from prior therapy had resolved to no greater than Common Terminology Criteria for Adverse Events grade 1 at the time of starting olaparib. Up to the prespecified interim analysis for OS (DCO1, 4 June 2019), patients were eligible to receive subsequent olaparib after BICR-assessed radiographic progression. BICR assessments of progression were stopped at DCO1 following positive primary PFS results [3]. From DCO1 to the time of the final analysis (DCO2, 20 March 2020), patients were eligible to receive subsequent olaparib after investigator-assessed radiographic progression; 67% and 69% of patients randomized to the control arm received subsequent olaparib in Cohort A and the BRCAm subgroup, respectively.

2.2 Endpoint and Statistical Analysis

The endpoint for this analysis is OS. Detailed results for the primary endpoint (rPFS) and other secondary endpoints have been previously reported [2, 3]. OS (defined as time from randomization to death from any cause) was an alpha controlled, key secondary endpoint in Cohort A of PROfound. Kaplan-Meier (KM) curves and Cox HRs are presented for each method and population.
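For readers less familiar with the KM estimator, the following is a minimal sketch of the product-limit calculation. The function name and the toy follow-up times are hypothetical illustrations and are not PROfound data; a real analysis would use an established survival package.

```python
def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct time with at least one death.

    times  : follow-up times (e.g. months)
    events : True if death was observed, False if censored
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Group all patients sharing the same follow-up time.
        while i < len(data) and data[i][0] == t:
            if data[i][1]:
                deaths += 1
            removed += 1
            i += 1
        if deaths:
            # Product-limit step: multiply by the conditional survival at t.
            survival *= 1 - deaths / n_at_risk
            curve.append((t, survival))
        # Deaths and censored patients both leave the risk set after t.
        n_at_risk -= removed
    return curve


# Hypothetical toy data (months, death observed?)
toy_times = [2, 4, 4, 6, 8]
toy_events = [True, True, False, True, False]
toy_curve = kaplan_meier(toy_times, toy_events)
```

Censored patients never trigger a drop in the curve, but they do shrink the risk set for later time points, which is why censoring switchers can change the estimated control-arm curve.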

Five switching methods were investigated: ‘naïve methods’ (excluding switchers, censoring switchers) and ‘complex methods’ (RPSFTM, IPCW, TSE). Methods are summarized in Table 1 and more detail is provided in the OSM. A priori, there is often no clear best method for adjustment, as the choice depends upon study design, conduct and patient characteristics. The National Institute for Health and Care Excellence (NICE) guidelines, referenced by the EMA, recommend that all methods and associated assumptions are explored to select the most appropriate on a study-by-study basis [10, 11, 14].

Table 1 Summary of methodology

Naïve methods do not adjust for patient characteristics. These methods are prone to selection bias if switching is associated with patient characteristics, which breaks the randomization balance. In the PROfound trial, the decision to switch is made by the investigator and patient, and likely based on patient characteristics and preferences.
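The two naïve approaches can be sketched as simple data manipulations on the control arm. This is an illustrative sketch only: the `Patient` record, its field names, and the toy values are hypothetical and are not taken from PROfound.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Patient:
    os_months: float                      # observed OS follow-up
    died: bool                            # True if death was observed
    switch_month: Optional[float] = None  # time of switch to olaparib, None if no switch


def exclude_switchers(control: List[Patient]) -> List[Patient]:
    """Naive method 1: drop every control patient who switched."""
    return [p for p in control if p.switch_month is None]


def censor_switchers(control: List[Patient]) -> List[Patient]:
    """Naive method 2: truncate follow-up at the switch and mark as censored."""
    adjusted = []
    for p in control:
        if p.switch_month is None:
            adjusted.append(p)
        else:
            adjusted.append(Patient(p.switch_month, False, p.switch_month))
    return adjusted


# Hypothetical toy control arm
toy_control = [
    Patient(6.0, True),
    Patient(14.0, True, switch_month=5.0),
    Patient(20.0, False, switch_month=8.0),
    Patient(9.0, False),
]
excluded = exclude_switchers(toy_control)
censored = censor_switchers(toy_control)
```

Neither function looks at any patient characteristic, which is exactly why both are exposed to selection bias when switching is related to prognosis.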

The RPSFTM approach estimates the causal effect of treatment using a counterfactual framework, where counterfactual survival times are those that would have been observed if switching from control to the experimental arm had not occurred [15]. The counterfactual survival time for a control patient is the time spent on control plus the time spent on olaparib multiplied by an ‘acceleration factor’, which was estimated using different models (see Table 1). The acceleration factor selected is the one for which the counterfactual survival times, that is, those expected had olaparib not been received, are equal between the two randomized groups. The acceleration factor is applied to the survival times for the switchers in the control arm before treatment effect estimates are recalculated for olaparib compared with adjusted control.
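The counterfactual time calculation described above can be sketched as follows. The acceleration factor of 0.6 and the patient times are hypothetical values chosen for illustration; the estimation of the factor itself (the search for the value that balances the two arms) is not shown.

```python
def counterfactual_time(time_on_control: float,
                        time_on_olaparib: float,
                        acceleration_factor: float) -> float:
    """Counterfactual survival time for a control-arm patient.

    Time on control is kept as observed; time on olaparib is rescaled
    by the acceleration factor (a factor below 1 shrinks it).
    """
    return time_on_control + acceleration_factor * time_on_olaparib


# Hypothetical acceleration factor, NOT an estimate from PROfound.
ACCELERATION_FACTOR = 0.6

# A switcher: 5 months on control, then 9 months on olaparib (observed OS 14 months).
switcher_time = counterfactual_time(5.0, 9.0, ACCELERATION_FACTOR)

# A non-switcher: no time on olaparib, so the observed time is unchanged.
non_switcher_time = counterfactual_time(9.0, 0.0, ACCELERATION_FACTOR)
```

With a factor below 1, only switchers have their survival times shortened, which lowers the adjusted control-arm survival relative to the observed control arm.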

The RPSFTM relies on the clinical and biological plausibility of the randomization assumption and the common treatment effect assumption (see Table 1). To test the randomization assumption, plots of the counterfactual times in the reference and comparator arms were compared; to test the common treatment effect assumption, a threshold analysis on the treatment effect received by switchers was applied as a sensitivity analysis [16].

The TSE method also uses a counterfactual framework. TSE should only be used to adjust for switching that occurs after a specific disease-related time point, called the ‘secondary baseline’. Disease progression is commonly used, as trials, such as PROfound, often stipulate that patients are only permitted to switch treatments once disease progression has occurred. Disease progression would be an unsuitable secondary baseline if patients deviated from the trial protocol, or if the spread of switch times after disease progression is wide and confounding exists between the secondary baseline and point of switch.

The IPCW method extends the censoring at point-of-switch analysis, by applying weights to patients according to their probability of switching treatments. It aims to remove selection bias introduced by censoring switchers by reweighting non-switchers according to an estimated probability, based on covariables, that they would have switched. The IPCW analysis relies on the ‘no unmeasured confounders’ assumption (see Table 1) and can be prone to error when there is a small sample size with very large (or small) switching proportions.
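The reweighting idea can be illustrated with a deliberately simplified sketch. It assumes a single baseline stratum per patient and a time-fixed probability of remaining unswitched, whereas a full IPCW analysis models switching over time using baseline and time-varying covariables. The function name, strata, and toy records are hypothetical.

```python
from collections import defaultdict


def ipcw_weights(records):
    """records: list of (stratum, switched) tuples for control-arm patients.

    Returns one weight per record. Non-switchers are up-weighted by the
    inverse of the estimated probability of not switching in their stratum;
    switchers are censored at the switch and get weight 0 here.
    """
    totals = defaultdict(int)
    stayers = defaultdict(int)
    for stratum, switched in records:
        totals[stratum] += 1
        if not switched:
            stayers[stratum] += 1

    weights = []
    for stratum, switched in records:
        if switched:
            weights.append(0.0)
        else:
            # 1 / P(not switching | stratum) = total / stayers
            weights.append(totals[stratum] / stayers[stratum])
    return weights


# Hypothetical toy control arm: (prognostic stratum, switched?)
toy_records = [
    ("good", True), ("good", True), ("good", False),
    ("poor", True), ("poor", False),
]
toy_weights = ipcw_weights(toy_records)
```

In this toy example the single remaining non-switcher in the "good" stratum stands in for three patients. This shows why results become unstable when few patients remain unswitched: a handful of heavily weighted individuals drive the control-arm estimate.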

Detailed methodology of all methods is given in the OSM.

3 Results

3.1 Patient Population

Baseline characteristics for the olaparib and control arms have been previously reported for Cohort A, Cohort A+B and the BRCAm subgroup [2, 17]. Baseline characteristics for the control arm for switchers and non-switchers, for Cohort A and the BRCAm subgroup, are included in Table 2. Characteristics for Cohort A+B (minus PPP2R2A) population are included in the OSM.

Table 2 Cohort A and BRCAm baseline characteristics for the control arm

3.2 Outcomes

3.2.1 Cohort A

The OS HR for olaparib compared with control without adjustment for switching in the control arm is 0.69 (95% CI 0.50–0.97) [3]. All HR results for analyses adjusted for switching are provided in Fig. 1.

Fig. 1

Overall survival hazard ratios (HRs) for all methods, Cohort A. The dashed line represents the HR unadjusted for switching for Cohort A (n = 245); the dotted line represents HR = 1.0; data are from DCO2 (20 March 2020); all HRs are adjusted for trial stratification factors as in the intention-to-treat analysis [2, 3]; see the Online Supplementary Material for the results for Cohort A + B minus PPP2R2A. DCO data cutoff, IPCW Inverse Probability of Censoring Weights, RPSFTM Rank Preserving Structural Failure Time Model

3.2.1.1 Rank Preserving Structural Failure Time Models (RPSFTMs)

The HR for olaparib compared with control ranged between 0.42 (95% CI 0.18–0.9) and 0.52 (95% CI 0.31–1.00) depending on the model selected (i.e., log rank, Cox and Weibull; with or without recensoring). Median survival estimates were between 11.73 and 12.63 months for the adjusted control arm, compared with 14.70 months for the unadjusted control. The acceleration factors were broadly similar between models (see OSM). Differences between the estimates including or excluding recensoring may be explained by the shape of the KM curves presented in Fig. 2A, B: when recensoring is not included, the treatment and comparator curves meet, or almost meet, and plateau; whereas for the recensored curves, the treatment and comparator arms do not meet, because the patients were recensored before this point, which changes the shape of the curves.

Fig. 2

Kaplan–Meier curves for Cohort A and BRCAm, adjusted for treatment switching using RPSFTM. A Cohort A, RPSFTM using Cox model without recensoring; B Cohort A, RPSFTM using Cox model with recensoring; C BRCAm, RPSFTM using Cox model without recensoring; D BRCAm, RPSFTM using Cox model with recensoring. bid twice daily, BRCAm BRCA1 and/or BRCA2 mutation, RPSFTM Rank Preserving Structural Failure Time Model

Results of the RPSFTM were robust to deviations from the common treatment effect assumption. The randomization assumption was considered to hold (see OSM).
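The recensoring step referred to in this subsection can be sketched as follows. The sketch assumes the commonly used formulation in which each control patient's administrative censoring time is rescaled by the acceleration factor, and any counterfactual time beyond that rescaled limit is censored there. The function name and all numerical values are hypothetical; the exact implementation used in this analysis is described in the OSM.

```python
def recensor(counterfactual_time: float,
             event: bool,
             admin_censor_time: float,
             acceleration_factor: float):
    """Apply recensoring to one control-arm patient.

    admin_censor_time : time from randomization to data cutoff
    Returns (time, event) after recensoring.
    """
    # Earliest possible counterfactual censoring time for this patient.
    recensor_time = min(admin_censor_time,
                        admin_censor_time * acceleration_factor)
    if counterfactual_time > recensor_time:
        # Information beyond the recensoring time is discarded.
        return recensor_time, False
    return counterfactual_time, event


# Hypothetical values: 24 months of potential follow-up, factor 0.5.
late_death = recensor(15.0, True, 24.0, 0.5)   # death after the limit
early_death = recensor(8.0, True, 24.0, 0.5)   # death before the limit
```

In the toy example a death at a counterfactual time of 15 months is converted to a censored observation at 12 months, which illustrates how recensoring removes longer-term information and can change the shape of the adjusted curve.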

3.2.1.2 Two-Stage Estimation (TSE)

TSE can be applied when switching occurs immediately after a disease-related time point (secondary baseline) [18]. BICR-confirmed radiological progression was considered; however, it was not found to be an adequate predictor of imminent switching, with only 38% of control arm switchers switching to olaparib within 30 days of radiological progression. The reasons for delayed switching to olaparib after progression were not collected and were likely driven by multiple factors. Identifying a secondary baseline was further complicated because switching from control to olaparib was allowed after BICR-assessed progression up to DCO1 but after investigator-assessed progression from DCO1 to DCO2. Other events were considered, but a reliable secondary baseline could not be identified, and therefore the analysis was not performed.
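The check on whether progression is a usable secondary baseline amounts to summarizing the gap between progression and switch among switchers. A minimal sketch, using hypothetical gaps in days rather than PROfound data:

```python
def share_switching_within(gap_days, window_days=30):
    """Proportion of switchers whose switch occurred within
    `window_days` of radiological progression.

    gap_days : days from progression to switch, one value per switcher
    """
    if not gap_days:
        raise ValueError("at least one switcher is required")
    within = sum(1 for gap in gap_days if 0 <= gap <= window_days)
    return within / len(gap_days)


# Hypothetical progression-to-switch gaps for eight switchers.
toy_gaps = [3, 12, 28, 45, 60, 90, 120, 200]
toy_share = share_switching_within(toy_gaps)
```

A low proportion, as in this toy example, indicates a wide spread of switch times after progression, which is the situation in which confounding can arise between the secondary baseline and the point of switch.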

3.2.1.3 Inverse Probability of Censoring Weights (IPCW)

The central HR for olaparib compared to control was estimated to be 0.57 (95% CI 0.33–0.97). The median OS for adjusted control was 14.01 months and the KM curves are presented in Supplementary Fig. 4 (OSM).

In the sensitivity analysis, which calculated weights using all available variables, the HR for olaparib compared with control was 0.59 (95% CI 0.35–1.01) (see OSM).

Only 27 (33%) patients in the Cohort A control arm did not switch to olaparib, and so there may be some bias present in the results.

3.2.1.4 Naïve Methods

For the naïve approaches, the HR for OS for olaparib compared with control was 0.39 (95% CI 0.24–0.65) when switchers were excluded and 0.69 (95% CI 0.42–1.14) when switchers were censored. For KM curves see the OSM.

3.2.2 BRCAm Subgroup

All results are provided in Fig. 3. Findings for this subgroup are consistent with those for Cohort A.

Fig. 3

Overall survival hazard ratios (HRs) for all methods, BRCAm. The dashed line represents the HR unadjusted for switching for BRCAm (n = 160); the dotted line represents HR = 1.0; data are from DCO2 (20 March 2020); all HRs are adjusted for trial stratification factors as in the intention-to-treat analysis [2, 3]; see the Online Supplementary Material for the results for Cohort A + B minus PPP2R2A. BRCAm BRCA1 and/or BRCA2 mutation, DCO data cutoff, IPCW Inverse Probability of Censoring Weights, RPSFTM Rank Preserving Structural Failure Time Model

3.2.2.1 RPSFTM

For the RPSFTM, the HR for olaparib compared with control ranged between 0.27 (95% CI 0.19–0.75) and 0.40 (95% CI 0.27–0.90) depending on the model selected. Median survival was between 9.15 and 10.16 months for the adjusted control arm, compared with 14.45 months for the unadjusted control. KM curves are presented in Fig. 2C, D.

3.2.2.2 IPCW and TSE

For the IPCW method, the HR for olaparib compared with control was 0.40 (95% CI 0.21–0.76), and the median OS for adjusted control was 14.01 months. TSE was deemed inappropriate for the same reasons as for Cohort A.

3.2.2.3 Naïve Methods

For the naïve approaches, the HRs for OS for olaparib compared with control were 0.22 (95% CI 0.11–0.41) excluding switchers and 0.50 (95% CI 0.27–0.95) when censoring switchers.

Selected results pertaining to the Cohort A+B (minus PPP2R2A) population are included in the OSM.

4 Discussion

Overall survival is considered an integral endpoint in cancer studies [19, 20]. In PROfound, a high proportion of patients in the control arm switched to olaparib treatment after progression (67% in Cohort A after a median follow-up of 21.04 months in the control arm), potentially confounding the observed survival outcomes. All adjustment methods evaluated demonstrate that the survival benefit for olaparib is likely to be greater than that observed in PROfound if switching from control to olaparib is adjusted for. This was consistent across all populations assessed. The most considerable improvement in OS is observed in the BRCAm subgroup, which is also the subgroup that derives the greatest survival benefit in the unadjusted analysis compared with other HRR-mutated subgroups [17].

The RPSFTM was considered the most appropriate approach for the PROfound trial. Sensitivity analyses showed the RPSFTM results were robust to deviation from the common treatment effect assumption, and the randomization assumption was considered to hold. When switching in Cohort A was adjusted for using the RPSFTM, the HR was reduced from 0.69 to between 0.42 and 0.52, with consistency in the direction of the effect across results using different modelling distributions. This aligns with the pre-specified analysis published by Hussain et al. [3]. The application of recensoring had the most impact on the results. Recensoring may, however, produce biased results as there are particular features of the treatment effect that appear at longer follow-up times, and it effectively disregards this information [21]. Conversely, not recensoring may also produce bias if extrapolated curves in the long run are not considered clinically plausible.

The other methods considered in this study were found to introduce more bias than the RPSFTM. For the naïve methods, the lack of adjustment for patient characteristics made the analysis prone to selection bias. The TSE method was the only method that could be discounted a priori due to the problems encountered with identifying a secondary baseline. The time lag between progression (secondary baseline) and switch day for some patients meant that by the time switched patients started to die there were very few non-switching patients alive. The IPCW method can be biased when there is a large proportion of switchers [14]. For example, a simulation study showed that high levels of bias can be produced when switching proportions exceed 85% [22]. In this analysis, only 27 patients in the control arm of Cohort A did not switch to olaparib, and these patients form the basis of the IPCW control survival estimates. This small sample size increases the uncertainty associated with the result. Furthermore, compelling predictors of switching could not be identified.

Future analysis could consider more sophisticated versions of the complex methods. For example, Latimer et al. (2020) have produced an ‘improved’ TSE that directly addresses the issue of the time-dependent confounding between disease progression date and switch date in PROfound [23].

Future research could consider the implications of adjusting for switching with small sample sizes. For PROfound, a key limitation applicable to all methods is the sample size of the control arm. This is particularly problematic for the IPCW and TSE, which reduce the sample size in the analysis, leading to greater uncertainty in the results. This issue causes further problems for subgroup analysis. The BRCAm patients represent a subgroup of more homogeneous patients who have been identified as deriving a clinically meaningful benefit from olaparib treatment [8]. As this is a distinct subgroup, standard methodology dictates that they be selected at the start of the analysis, as has been presented here. However, this reduces the sample size to 160 patients (58 in control). An alternative method involves calculating counterfactual survival times for all patients and then performing the survival analysis on the BRCAm patients. This approach borrows information from the total population, which may lead to more certain estimates. However, the weights and counterfactual times would then be influenced by a patient population that is more heterogeneous than the population of interest, and judgement is required as to whether the assumption of exchangeability is valid. Further work exploring hierarchical approaches is warranted.

5 Conclusion

These results demonstrate that the observed survival benefit of olaparib versus control in the intention-to-treat population of PROfound is likely to be underestimated. The RPSFTM is a plausible method to adjust for this switching in the PROfound study, but further development and validation of robust and flexible methods to estimate the magnitude of impact of switching are needed. In PROfound, the observed OS and switching analyses demonstrate a survival benefit for olaparib in mCRPC compared with investigator’s choice of new hormonal agent.