Key Summary Points

Why carry out this study?

The goal of therapy in multiple myeloma is to prevent or delay relapse to extend life expectancy for as long as possible with acceptable toxicity and maintenance of quality of life. Daratumumab, pomalidomide, and dexamethasone (D-Pd) has demonstrated improvement in progression-free survival (PFS) versus pomalidomide and dexamethasone (Pd) alone in patients with relapsed/refractory multiple myeloma (RRMM) exposed to both proteasome inhibitor (PI) and immunomodulatory drug (IMiD).

The objective of this indirect treatment comparison (ITC) was to use combined data from the APOLLO, EQUULEUS, and CASTOR trials to compare improvement in PFS with D-Pd versus daratumumab, bortezomib, and dexamethasone (D-Vd), and bortezomib and dexamethasone (Vd) in patients with RRMM who had prior PI and IMiD exposure.

What was learned from the study?

While some imbalances in patient baseline characteristics remained between treatment cohorts, this ITC with different statistical methods demonstrates a PFS benefit for D-Pd compared with D-Vd and Vd.

The results of this study provide support in favor of using D-Pd versus other treatments that are part of standard of care such as D-Vd and Vd in a population of patients with difficult-to-treat RRMM and particularly in those patients who have been exposed to both a PI and an IMiD.

Introduction

Multiple myeloma (MM) is a malignant plasma cell disorder diagnosed in approximately 160,000 people annually worldwide [1]. In patients with MM, proliferation of malignant clonal plasma cells leads to subsequent replacement of normal bone marrow hematopoietic precursors and overproduction of monoclonal protein [2]. Characteristic hallmarks of the disease include osteolytic lesions, anemia, increased susceptibility to infections, hypercalcemia, renal insufficiency or failure, and peripheral neuropathy [3].

Most patients with MM will relapse and become refractory to first line(s) of therapy, and with each subsequent line of therapy the duration of remission becomes shorter [4, 5]. Goals for the treatment of patients with relapsed/refractory MM (RRMM) are to induce durable and deep responses, and to prevent or delay relapse for as long as possible with acceptable toxicity and no loss in quality of life [4, 6].

Treatment of MM has advanced significantly over the past decade with the approval of novel agents including proteasome inhibitors (PI) such as bortezomib, carfilzomib, and ixazomib; immunomodulatory drugs (IMiD) such as lenalidomide and pomalidomide; monoclonal antibodies, namely daratumumab, isatuximab, and elotuzumab [7]; and other newer treatments in development including bispecific antibodies [8] and chimeric antigen receptor T cell (CAR-T cell) therapies [9].

Daratumumab is an anti-CD38 monoclonal antibody with multiple mechanisms of action including complement-dependent cytotoxicity, antibody-dependent cell-mediated cytotoxicity, induction of apoptosis by Fc gamma receptor-mediated crosslinking of tumor-bound monoclonal antibodies, and antibody-dependent phagocytosis [10]. Several daratumumab combination therapies are recommended for the treatment of patients with RRMM, most recently daratumumab, pomalidomide, and dexamethasone (D-Pd) [7]. Comparative effectiveness studies of D-Pd versus other standard of care (SOC) regimens are of great interest to inform clinical decision-making. Because head-to-head data for D-Pd are available only against pomalidomide and dexamethasone (Pd), indirect treatment comparisons (ITCs) can provide important information about the relative efficacy of D-Pd [11].

The objective of this study was to leverage patient-level data to indirectly compare the efficacy of D-Pd versus daratumumab, bortezomib, and dexamethasone [D-Vd] and bortezomib and dexamethasone [Vd] using statistical adjustment for imbalances in patient characteristics at baseline. Both D-Vd and Vd are recommended by the National Comprehensive Cancer Network treatment guidelines in patients with RRMM [7] and have been utilized for several years in this patient population.

Methods

Data Sources and Eligibility Criteria

The present study used data from two phase 3 clinical trials (APOLLO, ClinicalTrials.gov identifier NCT03180736 and CASTOR, ClinicalTrials.gov identifier NCT02136134) and one phase 1b clinical trial (EQUULEUS, ClinicalTrials.gov identifier NCT01998971) that evaluated daratumumab combination treatments in patients with RRMM [12,13,14].

Data for the D-Pd cohort were taken from the APOLLO and EQUULEUS studies [12, 14]. APOLLO is a randomized open-label trial in patients with RRMM who were previously treated with at least one prior line of therapy including lenalidomide and a PI, and who were randomized to D-Pd (n = 151) or Pd (n = 153) [12]. Patients in the D-Pd group received daratumumab (1800 mg) subcutaneously (SC) or daratumumab (16 mg/kg) intravenously (IV) weekly in cycles 1 and 2, every 2 weeks in cycles 3–6, and every 4 weeks thereafter in combination with orally administered pomalidomide 4 mg daily on days 1–21 of each cycle and orally administered dexamethasone 40 mg weekly. Patients in the Pd group received pomalidomide 4 mg (starting dose) orally once daily on days 1–21 of each cycle and orally administered dexamethasone 40 mg weekly. EQUULEUS is a nonrandomized, open-label trial in patients with RRMM who were previously treated with at least two prior lines of therapy; 103 patients were treated with D-Pd [14]. Patients received daratumumab 16 mg/kg IV weekly in cycles 1 and 2 and then every 2 weeks in cycles 3–6 and every 4 weeks thereafter, orally administered pomalidomide 4 mg daily on days 1–21 of each cycle, and orally administered dexamethasone 40 mg weekly.

Data for the D-Vd and Vd cohorts were taken from the CASTOR study, a randomized open-label trial conducted in patients with RRMM who were previously treated with at least one prior line of therapy. Patients were randomly assigned to D-Vd (n = 251) or Vd (n = 247) [13]. Patients in the D-Vd group received daratumumab 16 mg/kg IV weekly in cycles 1 and 2, every 2 weeks in cycles 3 to 6, and every 4 weeks thereafter, bortezomib 1.3 mg/m2 SC on days 1, 4, 8, and 11 of cycles 1–8, and dexamethasone 20 mg (oral or IV) on days 1, 2, 4, 5, 8, 9, 11, and 12 of each cycle. Patients in the Vd group received bortezomib 1.3 mg/m2 SC on days 1, 4, 8, and 11 of cycles 1–8 and dexamethasone 20 mg (oral or IV) on days 1, 2, 4, 5, 8, 9, 11, and 12 of each cycle.

No institutional review board approval was required for this post hoc analysis of the APOLLO, CASTOR, and EQUULEUS trials. Those trials were conducted in accordance with the ethical standards of the local institutional research committees and with the principles of the Declaration of Helsinki. Informed consent was obtained from all participants in each trial.

Statistical Analyses

The matching and weighting were performed separately for the D-Pd versus Vd and D-Pd versus D-Vd comparisons. Weighting and matching methods were used to adjust for imbalances in patient characteristics at baseline between the study arms to mitigate bias due to potential confounding. Cardinality matching (CM), stabilized inverse probability of treatment weighting (sIPTW), and propensity score matching (PSM) were initially considered. Both PSM and sIPTW rely on estimation of propensity scores for each patient [15]. The propensity score measures how probable it is that a patient is exposed to the treatment (rather than control) based on their baseline characteristics. PSM attempts to mitigate bias by pairing treatment and control patients who have similar propensity scores to produce a matched dataset of treatment and control patients with balanced baseline characteristics. sIPTW attempts to mitigate bias by reweighting patients in each treatment group to produce a weighted pseudo-population in which patient characteristics are balanced between treatment and control groups [16]. Each patient’s weight is based on their propensity score. The CM approach does not rely on propensity scores or directly pair treatment and control patients. Instead, CM uses integer programming techniques to find the largest subset of treatment and control patients which meets a pre-specified balancing criterion [17, 18]. As a result of limited overlap between the APOLLO, EQUULEUS, and CASTOR trial populations and the ability of CM to find the largest sample satisfying specified balancing criteria, it was anticipated that CM would potentially outperform PSM at both improving balance and preserving effective sample size (ESS) and provide better interpretability than sIPTW as the unit of observation is individual patients rather than weighted patients [19]. 
ITC feasibility assessments were conducted with CM, sIPTW, and PSM using combined data from APOLLO (D-Pd) + EQUULEUS (D-Pd) + CASTOR (D-Vd/Vd). A standardized mean difference (SMD) > 0.1 was used as the criterion for imbalance [20] in baseline characteristics between D-Pd versus D-Vd and D-Pd versus Vd. The PSM analysis utilized nearest-neighbor matching with replacement and caliper = 0.2. The CM procedure was implemented with a pre-specified maximum SMD criterion (≤ 0.1) for matching covariates. Results of the feasibility assessment post-harmonization (details below) revealed that the ESS for the PSM methodology was too low (D-Vd, ESS = 9 and Vd, ESS = 13) to support a meaningful analysis of PFS. CM was selected as the base case on the basis of its capability to retain sample size (Tables 1 and 2), with sIPTW as a sensitivity analysis [19].
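To make the SMD > 0.1 imbalance criterion concrete, the following is a minimal sketch for a single binary covariate. It assumes the common pooled-variance definition of the SMD for proportions; the function names and the example proportions are hypothetical and are not taken from the trial data or from the authors' code.

```python
import math


def smd_binary(p_treated, p_control):
    """Standardized mean difference for a binary covariate.

    Assumes the pooled-variance form: difference in proportions divided by
    the square root of the average of the two binomial variances.
    """
    pooled_var = (p_treated * (1 - p_treated) + p_control * (1 - p_control)) / 2
    return (p_treated - p_control) / math.sqrt(pooled_var)


def is_imbalanced(smd, threshold=0.1):
    """Flag a covariate whose absolute SMD exceeds the threshold."""
    return abs(smd) > threshold


# Hypothetical proportions of patients with a given characteristic in each arm.
smd_before = smd_binary(0.60, 0.45)   # clearly different arms
smd_after = smd_binary(0.50, 0.48)    # nearly identical arms

print(round(smd_before, 3), is_imbalanced(smd_before))
print(round(smd_after, 3), is_imbalanced(smd_after))
```

The same check can be applied to weighted proportions after sIPTW or to the matched subsets after CM.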

Table 1 Patient baseline characteristics for D-Pd and D-Vd cohorts before and after adjustment
Table 2 Patient baseline characteristics for D-Pd and Vd cohorts before and after adjustment

The primary outcome was progression-free survival (PFS). Efficacy results were reported for the CM and sIPTW analyses only.

CM analysis used a Cox proportional hazards model for treatment effect in the matched sample and analyses were performed separately for the D-Pd versus Vd and D-Pd versus D-Vd comparisons. The sIPTW analysis used a weighted Cox proportional hazards model and weights were computed separately for the D-Pd versus D-Vd and D-Pd versus Vd comparisons.

The P values for hazard ratios (HRs) were based on a Wald test. Robust standard errors were used for sIPTW.
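As an illustration of how a Wald test relates an HR to its confidence interval, the sketch below back-calculates the standard error of the log HR from a reported 95% CI and derives a two-sided P value. It uses the sIPTW-adjusted D-Pd versus D-Vd estimate reported in the Results (HR 0.66, 95% CI 0.45–0.96). The function name is hypothetical, and the calculation assumes a CI that is symmetric on the log scale; because the inputs are rounded, the result is only approximate and is not the authors' computation.

```python
import math


def wald_p_from_ci(hr, ci_low, ci_high, z_crit=1.959964):
    """Approximate Wald z statistic and two-sided P value from an HR and its 95% CI.

    Assumes the CI was built as exp(log(HR) +/- z_crit * SE).
    """
    log_hr = math.log(hr)
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z_crit)
    z = log_hr / se
    # Two-sided tail probability of a standard normal variable.
    p = math.erfc(abs(z) / math.sqrt(2))
    return z, p


z_stat, p_value = wald_p_from_ci(0.66, 0.45, 0.96)
print(round(z_stat, 2), round(p_value, 3))
```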

Harmonized Eligibility Criteria

To ensure consistency across the trials, the following harmonized inclusion/exclusion criteria were applied prior to weighting/matching. All patients received at least one prior line of anti-MM therapy, including a PI and IMiD (although not necessarily lenalidomide), and patients who had received prior pomalidomide were excluded.

The following criteria were not applied because of feasibility/sample size considerations: inclusion of only patients with prior lenalidomide (rather than any IMiD); inclusion of only patients with two or more lines of prior therapy; exclusion of patients who were refractory to a PI; exclusion of patients with only one prior line who were non-refractory to lenalidomide; and laboratory screening criteria (inclusion of only patients with hemoglobin level ≥ 7.5 g/dL [≥ 4.65 mmol/L], creatinine clearance ≥ 30 mL/min, and serum calcium corrected for albumin ≤ 14.0 mg/dL [≤ 3.5 mmol/L] or free ionized calcium ≤ 6.5 mg/dL [≤ 1.6 mmol/L]).

Covariates

Population differences across the three studies were addressed by using a covariate balancing strategy selected on the basis of the ability to achieve balance across arms and subject to data availability limitations. CM uses integer programming to maximize the size of the matched treatment and control groups subject to specified balance requirements rather than individually pairing treatment and control subjects [21]. In the sIPTW analyses, logistic regression was used to estimate propensity scores. The preferred covariates were selected on the basis of clinical input. Lower-priority covariates were included subject to their impact on ESS [22] and balance.
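The idea behind CM can be illustrated with a toy example. The sketch below is a brute-force search, not an integer programming solver and not the authors' implementation: it assumes a single binary covariate, equal-sized matched groups, and a balance constraint on the absolute difference in means. The function name, data, and tolerance are hypothetical. It is practical only for very small samples.

```python
from itertools import combinations


def cardinality_match(treated, control, tol):
    """Find the largest equal-sized subsets whose covariate means differ by <= tol.

    Brute force over subsets, starting from the largest possible size.
    Returns index lists into `treated` and `control`.
    """
    max_size = min(len(treated), len(control))
    for m in range(max_size, 0, -1):
        for t_idx in combinations(range(len(treated)), m):
            t_mean = sum(treated[i] for i in t_idx) / m
            for c_idx in combinations(range(len(control)), m):
                c_mean = sum(control[j] for j in c_idx) / m
                if abs(t_mean - c_mean) <= tol:
                    return list(t_idx), list(c_idx)
    return [], []


# Hypothetical binary covariate (1 = characteristic present) in each arm.
treated_cov = [1, 1, 1, 0, 0, 1]
control_cov = [0, 0, 1, 0]

t_sel, c_sel = cardinality_match(treated_cov, control_cov, tol=0.1)
print(len(t_sel), t_sel, c_sel)
```

Note that no individual treated patient is paired with a specific control patient; only the group-level balance constraint is enforced, which is what distinguishes this approach from PSM.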

In the analyses the following covariates were considered for adjustment based on clinician input: age (< 65, 65 to < 75, ≥ 75 years), sex, refractory to lenalidomide status (yes/no/not received), refractory to IMiD/PI status (IMiD only, PI only, both, neither), number of prior lines of therapy (1, 2–3, or ≥ 4), Eastern Cooperative Oncology Group performance status (ECOG PS; 0, 1, or 2), years since diagnosis, prior autologous stem cell transplant (ASCT; yes/no), and cytogenetic risk (high/standard/missing).

International Staging System (ISS) stage (I, II, III, or missing) and MM type (immunoglobulin G/nonimmunoglobulin G/missing) were not included in the CM analysis because of missingness in the EQUULEUS dataset. For the sIPTW analysis, prior ASCT, sex, and number of prior lines were excluded from the adjustment where doing so improved post-adjustment balance.

In the sIPTW analysis, weights were computed separately for the D-Pd versus D-Vd and D-Pd versus Vd comparisons using combined data from APOLLO (D-Pd) + EQUULEUS (D-Pd) + CASTOR (D-Vd/Vd). The estimand was the average treatment effect (ATE). CM was also performed for the D-Pd versus Vd and D-Pd versus D-Vd comparisons using combined data from the three studies. CM can be formulated as a linear integer programming problem, which allows flexible covariate balance constraints to be imposed on the entire sample.
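For readers unfamiliar with stabilized weights for the ATE, a minimal sketch follows. It assumes the usual definition (marginal probability of the received treatment divided by the propensity-score-based probability of the received treatment) and takes the propensity scores as given rather than fitting the logistic regression. The function name and all values are hypothetical.

```python
def stabilized_weights(treated, propensity):
    """Stabilized inverse probability of treatment weights targeting the ATE.

    treated:    list of 0/1 treatment indicators.
    propensity: estimated probability of treatment for each patient.
    """
    p_treat = sum(treated) / len(treated)  # marginal probability of treatment
    weights = []
    for t, e in zip(treated, propensity):
        if t == 1:
            weights.append(p_treat / e)
        else:
            weights.append((1 - p_treat) / (1 - e))
    return weights


# Hypothetical treatment indicators and propensity scores.
treated_flags = [1, 1, 1, 0, 0]
scores = [0.8, 0.6, 0.5, 0.4, 0.2]

sw = stabilized_weights(treated_flags, scores)
print([round(w, 3) for w in sw])
```

Stabilization keeps the weights centered near 1, which tends to reduce the influence of patients with extreme propensity scores compared with unstabilized weights.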

Missingness in cytogenetic risk was treated as a distinct covariate value in the CM and sIPTW analyses, which assumed that the drivers of missingness were similar across trials. One patient treated with D-Pd with missing ECOG PS was removed from the dataset in the sIPTW and PSM analyses.

To explore the potential impact of not including ISS stage and MM type in the CM analysis as a result of missingness in EQUULEUS, a sensitivity analysis was performed without data from the EQUULEUS trial. In this analysis, ISS stage and MM type were additionally adjusted for because both were reported in CASTOR and APOLLO.

Results

Patients and Baseline Characteristics

After harmonized eligibility criteria were applied, 253, 104, and 122 patients from the D-Pd, D-Vd, and Vd cohorts, respectively, were included for comparison. This harmonization had the greatest impact on the CASTOR population and resulted in the removal of 270 patients (only 240 out of 497 CASTOR patients had received both a prior PI and IMiD). The ESS for each method and comparison is shown in Tables 1 and 2; the ESS was reduced following matching and weighting.
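The reduction in ESS after weighting can be illustrated with a short sketch. It assumes the widely used Kish approximation, (sum of weights)^2 divided by the sum of squared weights; the source does not state which ESS formula was applied, so this choice, the function name, and the example weights are assumptions.

```python
def effective_sample_size(weights):
    """Kish approximation: (sum of weights)^2 / (sum of squared weights)."""
    total = sum(weights)
    return total * total / sum(w * w for w in weights)


# Hypothetical weights: equal weights keep the full sample size,
# while one dominant weight shrinks the ESS sharply.
ess_equal = effective_sample_size([1.0] * 10)
ess_uneven = effective_sample_size([0.2, 0.2, 0.2, 0.2, 4.0])

print(round(ess_equal, 2), round(ess_uneven, 2))
```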

Pre- and post-adjustment baseline characteristics for D-Pd versus D-Vd and D-Pd versus Vd cohorts are shown in Tables 1 and 2, respectively. A naive comparison of patient baseline characteristics before adjustment identified some differences between cohorts. In the sIPTW analysis, the distribution of computed weights for the D-Pd versus D-Vd and D-Pd versus Vd comparisons showed very few extreme weights (99th percentile of weights: 4.3 for D-Pd versus D-Vd and 6.0 for D-Pd versus Vd; trim level applied at 5).

Some differences in baseline characteristics remained for the D-Pd versus D-Vd and D-Pd versus Vd cohorts after CM and sIPTW adjustment. As a result of missingness in the EQUULEUS study, ISS stage and MM type were not adjusted for, and the extent of imbalance in these characteristics is unknown. The CM analysis was associated with fewer imbalances than the sIPTW analysis; these were in refractory to PI only status in the D-Pd versus D-Vd comparison and in refractory to both PI and IMiD status in the D-Pd versus Vd comparison. In the sIPTW analysis, remaining imbalances were in ECOG PS, cytogenetic risk, number of prior lines of therapy, refractory to lenalidomide status, and refractory to PI and/or IMiD status.

Primary Outcome: Progression-Free Survival

After application of harmonization criteria, the PFS Kaplan–Meier (KM) curves for the D-Vd and Vd cohorts fell compared to the D-Pd curves (Figs. 1 and 2, respectively). The PFS HRs for D-Pd versus D-Vd before and after applying harmonization criteria were 1.26 (95% confidence interval [CI] 0.99–1.60) and 0.83 (95% CI 0.63–1.10, P = 0.20) (Fig. 1) and for D-Pd versus Vd were 0.49 (95% CI 0.39–0.61) and 0.42 (95% CI 0.32–0.55, P < 0.01), respectively (Fig. 2).
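For reference, the KM estimator underlying such curves can be sketched in a few lines. The data below are hypothetical follow-up times (in months) and event indicators, not trial data, and the function name is an assumption. Events at a tied time are counted before censored observations at that time, which is the usual convention.

```python
def kaplan_meier(times, events):
    """Return [(time, survival probability)] at each distinct event time.

    times:  follow-up time for each patient.
    events: 1 if progression or death was observed, 0 if censored.
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    surv = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        n_events = 0
        n_leaving = 0
        # Group all patients whose follow-up ends at time t.
        while i < len(data) and data[i][0] == t:
            n_events += data[i][1]
            n_leaving += 1
            i += 1
        if n_events > 0:
            surv *= 1 - n_events / n_at_risk
            curve.append((t, surv))
        n_at_risk -= n_leaving
    return curve


# Hypothetical cohort of six patients.
km_curve = kaplan_meier([2, 3, 3, 5, 8, 8], [1, 1, 0, 1, 0, 1])
for t, s in km_curve:
    print(t, round(s, 3))
```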

Fig. 1

Progression-free survival for D-Pd versus D-Vd a prior to harmonized exclusion criteria and b after harmonized exclusion criteria. CI confidence interval, D-Pd daratumumab, pomalidomide, and dexamethasone, D-Vd daratumumab, bortezomib, and dexamethasone, HR hazard ratio

Fig. 2

Progression-free survival for D-Pd versus Vd a prior to harmonized exclusion criteria and b after harmonized exclusion criteria. CI confidence interval, D-Pd daratumumab, pomalidomide, and dexamethasone, HR hazard ratio, Vd bortezomib and dexamethasone

Cardinality Matching

The CM-adjusted PFS was significantly improved with D-Pd versus D-Vd and versus Vd. There was a statistically significant reduction in risk of disease progression of 45% for D-Pd versus D-Vd (HR, 0.55 [95% CI 0.36–0.82], P < 0.01; Fig. 3a). A significant reduction in risk of disease progression of 72% in CM-adjusted PFS was also observed with D-Pd versus Vd (HR, 0.28 [95% CI 0.19–0.41], P < 0.01; Fig. 3b).

Fig. 3

Progression-free survival (cardinality matching). a D-Pd versus D-Vd and b D-Pd versus Vd. CI confidence interval, D-Pd daratumumab, pomalidomide, and dexamethasone, D-Vd daratumumab, bortezomib, and dexamethasone, HR hazard ratio, Vd bortezomib and dexamethasone

Stabilized Inverse Probability of Treatment Weighting

A lowering of the PFS KM curves for the D-Vd and Vd cohorts compared to the D-Pd cohort was observed after sIPTW adjustment. The sIPTW-adjusted PFS was significantly improved with D-Pd versus D-Vd (HR, 0.66 [95% CI 0.45–0.96], P = 0.03) with a reduction in risk of disease progression of 34% with D-Pd compared with D-Vd (Fig. 4a). The sIPTW-adjusted PFS was also significantly improved with D-Pd versus Vd with a reduction in risk of disease progression of 67% for D-Pd compared with Vd (HR, 0.33 [95% CI 0.25–0.43], P < 0.01; Fig. 4b).

Fig. 4

Progression-free survival (sIPTW adjusted). a D-Pd versus D-Vd and b D-Pd versus Vd. CI confidence interval, D-Pd daratumumab, pomalidomide, and dexamethasone, D-Vd daratumumab, bortezomib, and dexamethasone, HR hazard ratio, sIPTW stabilized inverse probability of treatment weighting, Vd bortezomib and dexamethasone

Figure 5 summarizes the adjusted PFS HRs for D-Pd versus D-Vd and D-Pd versus Vd using both sIPTW and CM analyses. Adjusted PFS HRs favored D-Pd over D-Vd and Vd and were statistically significant (P < 0.05) for both analysis methods.
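The percentage reductions in risk quoted alongside each HR follow from the relation reduction = (1 − HR) × 100. The short sketch below applies this to the four adjusted HRs reported above; the dictionary layout and function name are illustrative only.

```python
# Adjusted PFS hazard ratios as reported in the text (method, comparison).
reported_hrs = {
    ("CM", "D-Pd vs D-Vd"): 0.55,
    ("CM", "D-Pd vs Vd"): 0.28,
    ("sIPTW", "D-Pd vs D-Vd"): 0.66,
    ("sIPTW", "D-Pd vs Vd"): 0.33,
}


def risk_reduction_pct(hr):
    """Percentage reduction in the hazard implied by a hazard ratio."""
    return round((1 - hr) * 100)


reductions = {key: risk_reduction_pct(hr) for key, hr in reported_hrs.items()}
for (method, comparison), pct in reductions.items():
    print(method, comparison, f"{pct}%")
```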

Fig. 5

Progression-free survival. Values less than 1 favor D-Pd. CI confidence interval, CM cardinality matching, D-Pd daratumumab, pomalidomide, and dexamethasone, D-Vd daratumumab, bortezomib, and dexamethasone, ESS effective sample size, HR hazard ratio, sIPTW stabilized inverse probability of treatment weighting, Vd bortezomib and dexamethasone

Sensitivity Analysis

A sensitivity analysis was performed without data from the EQUULEUS study. The baseline characteristics were well balanced, including for ISS stage and MM type which could not be adjusted for in the presence of the EQUULEUS data (Supplemental Tables 1 and 2). PFS results in the absence of the EQUULEUS data (Supplemental Fig. 1) were similar to the CM-adjusted data including EQUULEUS. CM-adjusted PFS was improved with D-Pd versus D-Vd (HR, 0.58 [95% CI 0.38–0.87]; a reduction in risk of disease progression of 42%) and D-Pd versus Vd (HR, 0.24 [95% CI 0.16–0.35]; a reduction in risk of disease progression of 76%).

Discussion

First-line treatment in MM frequently fails after a period of initial response [3, 23]. Thus, there is a continuing need for second- and later-line treatment options and for insights on the comparative efficacy of these therapies that will help to guide clinicians when selecting the optimum treatment sequence for their patients.

Several network meta-analyses (NMA) have evaluated daratumumab-based regimens in patients with RRMM. Dimopoulos et al. [24] conducted an NMA evaluating the comparative effectiveness of daratumumab plus SOC versus other relevant options. The analysis extracted data from a systematic literature review and from the CASTOR and POLLUX trials. The results demonstrated that the daratumumab-containing regimens daratumumab, lenalidomide, and dexamethasone (D-Rd) and D-Vd were more effective in improving PFS when compared with other evaluated regimens. In a recent NMA by Kiss et al. [25], better clinical outcomes were demonstrated in patients with RRMM treated with daratumumab-containing versus control regimens with regard to minimal residual disease negativity, stringent complete response, death, and disease progression. Additionally, in an NMA by Luo et al. [26], which synthesized results from 24 randomized controlled trials, it was demonstrated that D-Rd showed better efficacy than the other regimens with regard to nonresponse rate, time to progression, and PFS. D-Rd also ranked first with respect to overall efficacy. To date, limited comparisons between D-Pd and other RRMM regimens have been published [27].

In the current analysis, patient-level data were used to compare improvement in PFS with D-Pd versus D-Vd and Vd in patients with RRMM and prior PI and IMiD exposure. Results were generally consistent regardless of which adjustment technique was used. All adjusted PFS HRs favored D-Pd over D-Vd and Vd and were statistically significant for both D-Pd versus D-Vd and D-Pd versus Vd using CM and sIPTW. In general, CM results were similar to post-sIPTW results, although direct comparison is complicated by the different estimands that were used (nonspecific estimand for CM and ATE for sIPTW).

Outcomes may have been driven by the fact that patients in the APOLLO study were more difficult to treat than those in the CASTOR study. This was due to more patients in the APOLLO study being refractory to PI and IMiD and more patients receiving later lines of therapy. The analysis attempted to adjust for differences in eligibility criteria and patient baseline characteristics.

Limitations

In the absence of direct head-to-head comparisons, ITCs can provide important information to help optimize treatment for patients with RRMM. However, ITCs of aggregated clinical trial data can be subject to selection and confounding bias, thus limiting the conclusions that can be drawn from the comparison. The results of the ITC analyses must be interpreted with caution as even after the ITC, patient selection differences remained between the patient populations from the three studies.

Differences in eligibility criteria with respect to prior therapy and refractory status to prior therapies also presented a challenge. EQUULEUS eligibility required at least two prior lines of therapy, whereas eligibility in the APOLLO and CASTOR trials required at least one prior line. However, CM-adjusted results from the sensitivity analysis in the absence of EQUULEUS data showed no differences from the analyses in the presence of the EQUULEUS data.

In addition, while the small ESS for D-Vd and Vd precluded a meaningful comparison of PFS using PSM analysis, satisfactory results were achieved using CM analysis.

Covariate imbalances remained after sIPTW adjustment and may have resulted in residual confounding; therefore, the adjusted HRs for the sIPTW analysis should be interpreted with caution. The use of CM can potentially overcome limitations in matching methods that can fail to achieve covariate balance or result in small sample sizes. Because computing power was historically limited, CM has only recently come into wide use. Unlike a traditional PSM approach, CM maximizes the size of matched samples that meet pre-specified criteria for covariate balance. This method can make use of more observations than other approaches when there is limited overlap in covariate distributions.

Although the CM method yielded better balance than sIPTW and PSM, the inability to adjust for differences in ISS stage or MM type between arms, because these data were missing in EQUULEUS, was a limitation. However, the sensitivity analysis performed without the EQUULEUS trial did adjust for ISS stage and MM type and demonstrated similar results to the analyses in the presence of the data from EQUULEUS.

Another limitation is that no imputation of the “missing” category of cytogenetic risk profile was conducted. By matching with the “missing” category, we assumed that both cohorts included the same proportion of high- or standard-risk patients among those with missing values.

This study was also limited by the lack of mature overall survival (OS) data for inclusion in the ITC analyses, which evaluated PFS outcomes only. Although OS is considered the gold standard outcome measure in MM studies [28], OS data for D-Pd in the APOLLO trial at a median follow-up of 16.9 months remain immature. However, PFS has been recognized as a surrogate endpoint used to evaluate primary efficacy in clinical trials for hematological malignancies, including MM [29, 30]. In addition, the proportional hazards assumption was violated for the CM and sIPTW D-Pd versus Vd comparisons, so the corresponding HRs should be interpreted as rough averages over the follow-up period.

Furthermore, differences in dosing schedule between the three different regimens should also be noted: D-Pd was delivered as a continuous therapy, Vd as a fixed duration therapy, and the D-Vd regimen was partly discontinued after induction.

Despite these limitations, ITCs offer a valuable alternative to head-to-head trials that can be used to leverage results from clinical trials that share a common comparator arm. Comparisons using individual patient-level data for at least one of the treatment arms of interest produce stronger results than those using aggregate data. For ITC results to be considered valid, populations must be sufficiently similar; as shown in the ITC analyses, different techniques can be applied to balance baseline characteristics so that cohorts are more similar.

Conclusions

The results of this ITC demonstrated a PFS benefit for D-Pd compared with D-Vd and Vd in patients with RRMM with previous exposure to a PI and an IMiD. Results were generally consistent irrespective of the type of adjustment used. Despite the inability to adjust for ISS stage and MM type and the residual imbalances between treatment cohorts, these findings provide support in favor of using D-Pd in a population of patients with difficult-to-treat RRMM who have been exposed to both a PI and an IMiD.