Key Summary Points

Why carry out this study?

Apalutamide and darolutamide are androgen receptor inhibitors (ARi) approved for the treatment of patients with non-metastatic castration-resistant prostate cancer (nmCRPC)

In phase 3 studies including men with nmCRPC, apalutamide + androgen deprivation therapy (ADT) in the SPARTAN study and darolutamide + ADT in the ARAMIS study each significantly improved metastasis-free survival (MFS) and overall survival (OS) compared with placebo + ADT

Patient characteristics differed between SPARTAN and ARAMIS trials

We need to understand comparative efficacy in the absence of head-to-head comparison of these agents in prospective clinical trials

This study used matching-adjusted indirect comparison (MAIC) to compare the efficacy and tolerability of apalutamide + ADT versus darolutamide + ADT in the treatment of men with nmCRPC

What was learned from the study?

Results from this study suggest that apalutamide + ADT is more effective than darolutamide + ADT for MFS, progression-free survival (PFS), and prostate-specific antigen (PSA) progression, with a similar OS benefit and tolerability profile, in men with nmCRPC

Introduction

In men with prostate cancer and advanced disease, surgical castration or medical castration [androgen deprivation therapy (ADT)] is a part of most treatment regimens [1]. However, nearly all patients develop resistance to ADT over time [2], as detected by prostate-specific antigen (PSA) relapse and more rapid doubling times for PSA [3]. Progression of PSA without radiographic evidence of distant metastasis on standard imaging during ADT is categorized as non-metastatic castration-resistant prostate cancer (nmCRPC) [1]. Three next-generation androgen receptor inhibitors (ARi) (apalutamide, enzalutamide, and darolutamide) have been approved for the treatment of patients with high-risk nmCRPC when used either in combination with ADT [e.g., with a gonadotropin-releasing hormone (GnRH) analog] or after surgical castration and are recommended in clinical guidelines [4, 5]. Moreover, these ARis were evaluated in nmCRPC patients with a PSA doubling time (PSADT) of ≤ 10 months, i.e., in patients with an elevated risk of metastases.

In men with high-risk nmCRPC, statistically significant improvements in metastasis-free survival (MFS), which was the primary endpoint of all registration trials, have been demonstrated with each of the next-generation ARi + ADT compared with placebo + ADT [6,7,8]. For apalutamide, the randomized, placebo-controlled, phase 3 SPARTAN study reported a hazard ratio (HR) for MFS of 0.28 [95% confidence interval (CI) 0.23, 0.35; p < 0.001], representing a 72% reduction in the risk of metastasis or death, for apalutamide + ADT versus placebo + ADT [6]. For darolutamide, the randomized, placebo-controlled, phase 3 ARAMIS study reported an HR of 0.41 (95% CI 0.34, 0.50; p < 0.001), representing a 59% reduction in the risk of metastasis or death, for darolutamide + ADT versus placebo + ADT [7]. In both studies, secondary endpoints such as overall survival (OS) were analyzed according to protocol-driven rules. At the first interim analyses, OS results in each study showed a consistent trend but did not reach statistical significance because the OS data were immature. The second interim analysis (IA2) of the SPARTAN study showed a 25% reduction in the risk of death for apalutamide + ADT versus placebo + ADT (HR 0.75, 95% CI 0.59, 0.96; p = 0.0197) [9]. More recently, final OS results have been presented and showed significant improvements in OS for both apalutamide and darolutamide despite crossover. A 22% reduction in the risk of death was observed with apalutamide + ADT versus placebo + ADT (HR 0.78, 95% CI 0.64, 0.96; p = 0.016) [10], and a 31% reduction in the risk of death was observed with darolutamide + ADT versus placebo + ADT (HR 0.69, 95% CI 0.53, 0.88; p = 0.003) [11].

To date, no head-to-head, prospective, interventional studies have been conducted to compare the next-generation ARi in men with nmCRPC. The objective of this analysis was to compare apalutamide + ADT with darolutamide + ADT on efficacy and tolerability outcomes when population differences were minimized through anchored, matching-adjusted indirect comparison (MAIC) methods evaluated within a Bayesian framework.

Methods

The methodological approach of anchored MAIC [12, 13] used for this analysis was similar to that in previous publications [14, 15], and was applied to a comparison of efficacy and tolerability from the SPARTAN and ARAMIS studies.

Data Sources

Patient-level data were available from the SPARTAN study (NCT01946204), a randomized, double-blind, placebo-controlled, international, phase 3 study conducted at 332 sites in 26 countries [6]. A total of 1207 men, age ≥ 18 years, with nmCRPC who were receiving ADT were randomized in a 2:1 ratio to receive apalutamide 240 mg per day (n = 806, apalutamide + ADT) or placebo (n = 401, placebo + ADT).

Aggregate data for darolutamide + ADT versus placebo + ADT were obtained from the published results of the ARAMIS study (NCT02200614), a randomized, double-blind, placebo-controlled, international, phase 3 study conducted at 409 centers in 36 countries [7]. A total of 1509 adult men with nmCRPC who were undergoing ADT were randomized in a 2:1 ratio to receive darolutamide 600 mg (n = 955, darolutamide + ADT) or placebo (n = 554, placebo + ADT) twice daily.

The definitions and assessment methods for the endpoints in the SPARTAN and ARAMIS studies were reviewed to determine the comparability of endpoints between the two studies. Study designs, including eligibility criteria, were similar between SPARTAN and ARAMIS (Table 1), with the exception that patients in ARAMIS were required to have a baseline PSA level of at least 2 ng/mL, while no limit for baseline PSA was specified for patients in SPARTAN. To account for this, patients in the SPARTAN study who had a baseline PSA level of < 2 ng/mL were excluded from our analysis.

Table 1 Study design comparison: SPARTAN and ARAMIS

All appropriate ethics approvals were granted within the SPARTAN study. Data from the ARAMIS study were obtained from publicly available sources. Review boards at participating institutions approved the SPARTAN and ARAMIS studies, and they were conducted in accordance with the current International Conference on Harmonization guidelines for Good Clinical Practice and the principles of the Declaration of Helsinki. Informed consent was obtained from each participant prior to enrollment in each of the studies.

Study Endpoints

The primary endpoint (MFS) and its definition were similar in SPARTAN and ARAMIS: the time from randomization to the first detection of distant metastasis on imaging (as assessed by means of blinded independent central review) or death from any cause, whichever occurred first. Imaging for detection of metastases was performed using the same techniques (computed tomography or magnetic resonance imaging of the pelvis, abdomen, and chest) and with the same time schedule (every 16 weeks). Secondary endpoints in each study included PSA progression, progression-free survival (PFS), and OS. All time-to-event endpoints in both studies were analyzed using a Cox proportional hazards model to estimate a hazard ratio (HR) and 95% confidence interval (CI). Tolerability outcomes in each study included the overall incidence of any adverse event and of any serious adverse event. The clinical visit periods varied between the studies: every month for SPARTAN and every three months for ARAMIS. The first interim analyses from both SPARTAN and ARAMIS were used to compare apalutamide and darolutamide on the efficacy (i.e., MFS, PSA progression, and PFS) and tolerability endpoints. For OS, both the first and second interim analyses of the SPARTAN study were compared with the final analysis of ARAMIS to allow a more mature comparison. These data cuts were better aligned than the final analysis of SPARTAN in terms of the reported follow-up time and the amount of exposure of control-arm patients to active treatment due to crossover.

Statistical Analyses

An anchored MAIC analysis was performed using individual patient data from SPARTAN and published aggregate baseline data from ARAMIS to match SPARTAN patient characteristics to those in ARAMIS via inverse probability weighting. This step aimed to estimate the relative efficacy of apalutamide + ADT versus placebo + ADT had SPARTAN enrolled a patient population similar to that of the ARAMIS study, and to indirectly compare efficacy and tolerability endpoints between patients initiated on either apalutamide or darolutamide. Analyses were conducted using SAS 9.4, R 3.5.0, and WinBUGS 1.4.3.

All clinically relevant baseline characteristics reported in ARAMIS that could potentially affect relative treatment effects were considered in the matching process. The baseline characteristics adjusted for were age, baseline PSA, PSADT, Eastern Cooperative Oncology Group (ECOG) performance status (PS), use of bone-targeting agents, time from initial diagnosis, testosterone level, prior hormonal therapy, and region. Patients from SPARTAN missing any of the matched-on characteristics were excluded from the sample. Patients enrolled in SPARTAN were assigned weights such that each patient’s weight was equal to their estimated odds of enrollment in ARAMIS versus SPARTAN, so that the weighted mean or median baseline characteristics in SPARTAN closely matched those reported in ARAMIS. Weights were obtained from a logistic regression model for the propensity of enrollment in ARAMIS versus SPARTAN, estimated using the method of moments [13].
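The weighting step can be illustrated with a minimal sketch. It assumes the method-of-moments formulation cited above [13], in which each weight has the form exp(β · (x − target)) and β is chosen so that the weighted means in the patient-level study equal the aggregate values of the comparator study. The patient rows, the two covariates (age and an ECOG PS 1 indicator), the target values, and the function names are all invented for illustration; this is not the code used in the analysis.

```python
import math

def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = s / m[r][r]
    return x

def maic_weights(ipd_rows, target_means, iters=50, tol=1e-9):
    """Method-of-moments weights: minimise sum_i exp(beta . (x_i - target)).

    At the minimum the gradient is zero, which is exactly the condition that
    the weighted covariate means equal the target (aggregate) means.
    """
    d = len(target_means)
    centered = [[x - t for x, t in zip(row, target_means)] for row in ipd_rows]
    beta = [0.0] * d
    for _ in range(iters):
        w = [math.exp(sum(b * z for b, z in zip(beta, row))) for row in centered]
        grad = [sum(wi * row[j] for wi, row in zip(w, centered)) for j in range(d)]
        if max(abs(g) for g in grad) < tol:
            break
        hess = [[sum(wi * row[j] * row[k] for wi, row in zip(w, centered))
                 for k in range(d)] for j in range(d)]
        step = solve_linear(hess, grad)          # Newton step
        beta = [b - s for b, s in zip(beta, step)]
    return [math.exp(sum(b * z for b, z in zip(beta, row))) for row in centered]

# Hypothetical patient-level rows: (age in years, ECOG PS 1 indicator)
rows = [(70, 0), (74, 1), (68, 0), (80, 1), (77, 0), (65, 0), (72, 1), (83, 0)]
target = (74.0, 0.31)  # hypothetical aggregate means to be matched

weights = maic_weights(rows, target)
total = sum(weights)
weighted_means = [sum(w * row[j] for w, row in zip(weights, rows)) / total
                  for j in range(2)]
print(weighted_means)
```

Matching a median or a categorical variable, as described above, would require recoding it as one or more indicator columns before weighting; that step is omitted here.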

Step 1: Recalculation of Hazard Ratios and Odds Ratios from SPARTAN

In the first step, the SPARTAN study outcomes comparing apalutamide + ADT and placebo + ADT were re-analyzed in the MAIC-weighted SPARTAN population. A weighted Cox proportional hazards regression analysis with a robust variance estimator was performed to estimate the HR for each time-to-event efficacy endpoint of interest, and a weighted logistic regression was used to estimate the odds ratio (OR) for each tolerability endpoint.
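For the tolerability endpoints, the weighted OR can be sketched under one simplifying assumption: that the logistic model contains only an intercept and a treatment indicator (the text does not state whether other covariates were included). In that case the weighted maximum-likelihood OR equals the cross-product ratio of the weighted 2 × 2 table. The records and the function name below are hypothetical, and the robust variance estimation is not shown.

```python
def weighted_odds_ratio(records):
    """Weighted OR for event (1) vs no event (0), treated (1) vs control (0).

    records: iterable of (treated, event, weight) tuples.
    """
    cell = {(t, e): 0.0 for t in (0, 1) for e in (0, 1)}
    for treated, event, weight in records:
        cell[(treated, event)] += weight
    odds_treated = cell[(1, 1)] / cell[(1, 0)]
    odds_control = cell[(0, 1)] / cell[(0, 0)]
    return odds_treated / odds_control

# Hypothetical weighted patients: (treated, had adverse event, MAIC weight)
records = [
    (1, 1, 1.2), (1, 1, 0.8), (1, 0, 1.0), (1, 0, 0.5),
    (0, 1, 0.9), (0, 1, 0.6), (0, 0, 1.1), (0, 0, 0.7),
]
odds_ratio = weighted_odds_ratio(records)
print(round(odds_ratio, 3))
```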

Step 2: Bayesian Network Meta-analysis

In the second step, the updated HRs and ORs in SPARTAN estimated in Step 1 were compared with the reported aggregate data from ARAMIS to estimate the HRs (for the efficacy endpoints MFS, PSA progression, PFS, and OS) and ORs (for tolerability: any adverse event and serious adverse events) for apalutamide + ADT versus darolutamide + ADT, using a Bayesian framework [13, 16], with placebo + ADT as the common comparator across both studies.

Due to the limited number of studies in the networks, only fixed-effects models are presented, and random-effects models were not considered due to a lack of information to estimate between-study variability. These analyses were conducted according to the methods described in the National Institute for Health and Care Excellence (NICE) Decision Support Unit Technical Support Documents [17, 18]. Non-informative prior probability distributions were chosen based on the NICE recommendations.
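The arithmetic behind an anchored comparison through a common comparator can be sketched as follows. This is a normal approximation on the log-HR scale, assumed here as a stand-in for the fixed-effects Bayesian model with non-informative priors; it is not the WinBUGS model used in the analysis. The inputs are the published, unweighted MFS results for SPARTAN and ARAMIS quoted in the Introduction, so the output is an unadjusted comparison and not the MAIC estimate. Function names are hypothetical.

```python
import math
from statistics import NormalDist

Z975 = NormalDist().inv_cdf(0.975)  # about 1.96

def log_hr_and_se(hr, lower, upper):
    """Recover log(HR) and its standard error from a reported 95% interval."""
    return math.log(hr), (math.log(upper) - math.log(lower)) / (2 * Z975)

def indirect_hr(a_vs_c, b_vs_c):
    """HR of A versus B when both were compared with the same control C."""
    log_a, se_a = log_hr_and_se(*a_vs_c)
    log_b, se_b = log_hr_and_se(*b_vs_c)
    diff = log_a - log_b                      # log HR(A vs B)
    se = math.sqrt(se_a ** 2 + se_b ** 2)     # variances add
    return (math.exp(diff),
            math.exp(diff - Z975 * se),
            math.exp(diff + Z975 * se))

spartan_mfs = (0.28, 0.23, 0.35)  # apalutamide + ADT vs placebo + ADT [6]
aramis_mfs = (0.41, 0.34, 0.50)   # darolutamide + ADT vs placebo + ADT [7]

hr, lo, hi = indirect_hr(spartan_mfs, aramis_mfs)
print(round(hr, 2), round(lo, 2), round(hi, 2))
```

With these inputs the sketch gives roughly 0.68 (0.51, 0.91). In the analysis itself, the SPARTAN input is replaced by the reweighted estimate from Step 1.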

Results

Baseline Characteristics Before Matching

Table 2 shows that populations from both studies were broadly comparable on most clinically relevant baseline measures. However, patients in SPARTAN had a lower median PSA at baseline (7.8 vs. 9.2 ng/mL), a longer median time from initial diagnosis (95 vs. 86 months), and a higher median testosterone (0.8 vs. 0.6 nmol/L) versus patients in ARAMIS (Table 2). Additionally, SPARTAN included fewer patients with ECOG PS of 1 (23% vs. 31%), more patients with bone-targeted agent treatment (10% vs. 4%), and more patients from the US (35% vs. 12%) versus ARAMIS.

Table 2 Baseline characteristics before and after matching

Inclusion/exclusion criteria from ARAMIS were applied to patients in SPARTAN prior to inclusion in the MAIC. Fifty-seven patients in SPARTAN were excluded from the MAIC because they had a baseline PSA level of < 2 ng/mL, resulting in 1150 patients who were matched to ARAMIS data. The MAIC algorithm then assigned weights to patients from SPARTAN in such a way that their summary statistics matched the aggregate baseline data from ARAMIS on the measures with notable differences described above, as well as age, PSADT (≤ 6 months), and any receipt of previous hormonal therapy.

Effect of Reweighting on Treatment Estimates in SPARTAN

The reweighted SPARTAN population, with an effective sample size (neff) [19] of 455, exhibited baseline measures comparable to the aggregate data from ARAMIS (Table 2). After reweighting the SPARTAN dataset, the adjusted treatment effect of apalutamide + ADT versus placebo + ADT was similar to the results from the original, unweighted analysis of SPARTAN for MFS, PSA progression, and PFS (Table 3). Reweighting the SPARTAN dataset only had a minor effect on the treatment effect for OS in the first interim analysis [HR: 0.75 (reweighted) vs. 0.70 (original)] as well as on the second interim analysis [HR: 0.71 (reweighted) vs. 0.75 (original)].
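The effective sample size can be sketched with the formula commonly used for weighted samples, (Σw)² / Σw². That this is the formula behind reference [19] is an assumption, and the weights below are invented. For scale, the reported neff of 455 is about 40% of the 1150 SPARTAN patients who were matched.

```python
def effective_sample_size(weights):
    """Approximate number of equally weighted patients the sample is worth."""
    total = sum(weights)
    return total * total / sum(w * w for w in weights)

equal = effective_sample_size([1.0, 1.0, 1.0, 1.0])    # no loss: equals n
uneven = effective_sample_size([3.0, 1.0, 0.5, 0.5])   # a few patients dominate
print(equal, round(uneven, 2))
```

The more the weights vary, the further neff falls below the number of patients, which widens the uncertainty around the reweighted estimates.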

Table 3 Efficacy of apalutamide + ADT or darolutamide + ADT in each study and in the matching-adjusted indirect comparison across studies

MAIC-Based Efficacy of Apalutamide versus Darolutamide

Metastasis-Free Survival

The MAIC-based HR for apalutamide versus darolutamide for the primary endpoint of MFS was 0.70 (95% CrI 0.51, 0.98), with a Bayesian probability (P) for apalutamide + ADT to be more effective than darolutamide + ADT of 98.3% (Table 3). Figure 1 shows the posterior distribution of the HR for MFS for apalutamide + ADT versus darolutamide + ADT, in which P (HR < 1) is visually represented as the area under the posterior distribution to the left of HR = 1.

Fig. 1

Posterior distribution of hazard ratio of a metastasis-free survival, b progression-free survival, c prostate-specific antigen progression, and d overall survival at the first interim analyses among patients of the SPARTAN study versus ARAMIS study. A hazard ratio (HR) or odds ratio (OR) below 1 favors APA + ADT and a value above 1 favors DARO + ADT. Solid curves show the matching-adjusted indirect comparison after reweighting of patient-level data in SPARTAN, using aggregate-level data from ARAMIS. Dashed curves show original SPARTAN data versus ARAMIS data. ADT androgen deprivation therapy; CrI credible interval; MFS metastasis-free survival; OS overall survival; PFS progression-free survival; PSA prostate-specific antigen

Secondary Efficacy Endpoints

Results for the secondary efficacy endpoints of PSA progression and PFS were consistent with MFS results (Table 3; Fig. 1). The HR for PSA progression was 0.46 (95% CrI 0.33, 0.64), with P (apalutamide + ADT > darolutamide + ADT) ~ 100%. The HR for PFS was 0.79 (95% CrI 0.59, 1.08), with P (apalutamide + ADT > darolutamide + ADT) = 93.2%. At the first interim analyses, apalutamide + ADT was similar to darolutamide + ADT for OS [HR 1.05; 95% CrI 0.58, 1.93; P (HR < 1) = 43.5%], and this was confirmed with longer-term follow-up [HR 1.02; 95% CrI 0.69, 1.52; P (HR < 1) = 45.8%].
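The link between a reported HR, its 95% CrI, and P (HR < 1) can be checked with a short sketch. It assumes that the posterior of log(HR) is approximately normal, which is an approximation and not the MCMC output of the analysis; the function name is hypothetical. The inputs are the MAIC results reported above.

```python
import math
from statistics import NormalDist

def prob_hr_below_one(hr, lower, upper):
    """Approximate P(HR < 1) from a point estimate and 95% interval."""
    z975 = NormalDist().inv_cdf(0.975)
    sd = (math.log(upper) - math.log(lower)) / (2 * z975)
    # P(log HR < 0) under a normal posterior centred on log(hr)
    return NormalDist(mu=math.log(hr), sigma=sd).cdf(0.0)

reported = {
    "MFS": (0.70, 0.51, 0.98),
    "PSA progression": (0.46, 0.33, 0.64),
    "PFS": (0.79, 0.59, 1.08),
    "OS (first interim)": (1.05, 0.58, 1.93),
}
approx = {name: prob_hr_below_one(*values) for name, values in reported.items()}
for name, p in approx.items():
    print(f"{name}: {100 * p:.1f}%")
```

Under this approximation the probabilities land within about one percentage point of the reported values, which shows how an interval that narrowly excludes 1 (MFS) and one that includes 1 (PFS) can both correspond to a high probability of benefit.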

Tolerability

Apalutamide + ADT and darolutamide + ADT were similar in terms of tolerability. More specifically, the overall rate of any adverse event was nearly identical between treatments (OR 1.02; 95% CrI 0.50, 2.04) (Table 4). There was a 64.5% probability that apalutamide + ADT was better tolerated than darolutamide + ADT when comparing the overall rate of any serious adverse event (OR 0.91; 95% CrI 0.53, 1.53). Figure 2 shows the posterior distribution of the OR for tolerability of apalutamide + ADT versus darolutamide + ADT, in which P (OR < 1) is visually represented as the area under the posterior distribution to the left of OR = 1.

Table 4 Tolerability of apalutamide + ADT or darolutamide + ADT in each study and in the matching-adjusted indirect comparison across studies
Fig. 2

Posterior distribution of odds ratio of adverse events and serious adverse events among patients of the SPARTAN study versus ARAMIS study. An odds ratio (OR) below 1 favors APA + ADT and a value above 1 favors DARO + ADT. Solid curves show the matching-adjusted indirect comparison after reweighting of patient-level data in SPARTAN, using aggregate-level data from ARAMIS. Dashed curves show original SPARTAN data versus ARAMIS data. AE adverse event; CrI credible interval; SAE serious adverse events

Discussion

This anchored MAIC analysis of data from the SPARTAN and ARAMIS studies in nmCRPC patients suggests that apalutamide + ADT is more effective than darolutamide + ADT in achieving MFS (the primary endpoint in both studies) with high probability (98.3%). Using secondary efficacy endpoints in SPARTAN and ARAMIS, this analysis also shows that apalutamide + ADT has a high probability of being more effective than darolutamide + ADT in delaying PSA progression and prolonging PFS. The two treatments were similar regarding OS, both at the first interim analysis and at longer term follow-up which included cross-over of patients in both studies. Importantly, the two treatments were also found to be similar regarding tolerability, both for any adverse event and for any serious adverse event.

Several network meta-analyses using Bayesian analysis [20,21,22,23,24,25,26] have indirectly compared the treatment effects and tolerability of ARi in patients with nmCRPC, using aggregate data from SPARTAN, ARAMIS, and PROSPER. However, these analyses did not adjust for potential differences in baseline characteristics across the multiple studies. A strength of our analysis is the use of MAIC methods to provide a more accurate comparison of efficacy and safety results across SPARTAN and ARAMIS, while adjusting for potential biases due to differences in patient effect modifiers (patient characteristics that affect the relative treatment effect) between the two studies. This methodology minimizes differences in the baseline characteristics of the populations when assessing differences in efficacy and safety results across studies [13]. The patient populations in SPARTAN and ARAMIS differed in a few ways that could potentially influence the relative efficacy in both studies. The MAIC analyses therefore adjusted for these differences by reweighting the SPARTAN population to reflect a patient population similar to that of the ARAMIS study. A greater proportion of patients in SPARTAN were from the US and used bone-targeted agents, and median time from initial diagnosis and median testosterone level at baseline were higher in SPARTAN than in ARAMIS. Patients in SPARTAN had a lower median PSA at baseline than patients in ARAMIS and were less likely to have ECOG PS of 1. The MAIC approach adjusted for the potential impact of these differences.

Another strength of this analysis is the Bayesian statistical approach. Other authors have used the Bucher technique [27] to compare treatments in this setting [28, 29], which generated results in a frequentist statistical framework. This technique is known to lack statistical power [30] because the standard error of the indirect comparison estimate is based on the simple addition of the variances from the original studies, which increases uncertainty. This conventional approach dichotomizes the results as either significant or not, which is not well suited for decision-making as it does not indicate the probability of the hypothesis being true or false. A Bayesian analysis provides the probability that one treatment will be more effective than another, which is more relevant for decision-making for patients, providers, and payors [31]. Based on all of the available evidence, this analysis used the Bayesian statistical approach to examine, after matching the patient populations and outcomes, the probability that one treatment is more effective (and/or better tolerated) than another in men with nmCRPC. In the current study, the Bayesian approach allows us to state that, taking into account all available evidence from randomized trials, there is a high probability that apalutamide + ADT is more effective than darolutamide + ADT with respect to MFS and PSA progression (both over 98%) and PFS (over 93%). In addition, this analysis also suggests that apalutamide + ADT and darolutamide + ADT are similar in terms of tolerability. It is important to consider that this analysis could not account for differences in the prespecified timing of adverse event capture and treatment exposure-adjusted comparisons between SPARTAN (every 4 weeks throughout the study) and ARAMIS (after the first month, every 16 weeks). Less frequent monitoring in ARAMIS might have led to less overall reporting of adverse events for darolutamide [32], which would make the tolerability comparison conservative for apalutamide and is a possible source of bias. In addition, although the final analysis results for OS have been reported for the SPARTAN study [10], the present analysis compared OS results from the SPARTAN second interim analysis with the final analysis of the ARAMIS study to limit bias by using data cuts with closer treatment exposure durations between the two studies.

While anchored MAIC analyses allow for comparisons of experimental interventions without head-to-head studies, some limitations of MAIC exist. First, adjustment is by definition limited to those characteristics reported in the primary publication for ARAMIS [7]. Additionally, the MAIC analysis could not adjust for differences in study design between SPARTAN and ARAMIS relating to the timing of safety data capture, as mentioned before. Moreover, overall tolerability has been compared whereas specific adverse events could not be compared due to limitations in reporting from ARAMIS.

Due to the absence of head-to-head studies, current guidelines do not recommend one treatment over the other for patients with nmCRPC [1]. The results of this study may help inform treatment decisions in the management of nmCRPC patients.

Conclusion

The results of the current study, based on an MAIC approach to generate indirect comparative evidence between apalutamide and darolutamide, suggest that nmCRPC patients who are treated with apalutamide + ADT have favorable MFS, PSA progression, and PFS, compared with those treated with darolutamide + ADT, while OS and safety profiles based on adverse events and serious adverse events for both treatments are similar.