Background

Treatment approaches for newly diagnosed multiple myeloma (NDMM) are chosen based on the patient’s fitness; those considered fit usually receive induction, high-dose chemotherapy and autologous stem cell transplant (ASCT) as standard of care [1]. In patients ineligible for ASCT, treatment with bortezomib, melphalan and prednisone (VMP) or lenalidomide plus low-dose dexamethasone is recommended [1]. Older patients and those who are transplant ineligible have significantly shorter relative survival than younger, fitter, transplant-eligible patients [2, 3]. In addition to age, factors such as frailty, performance status and comorbidities are important determinants of ASCT eligibility and treatment selection in the frontline setting [4,5,6].

MM can profoundly impact patients’ daily lives, imposing both physical (i.e. fatigue, mobility, pain and physical activity) and emotional (i.e. distress, anxiety, depression and effects on relationships) burdens [7, 8]. Maintaining health-related quality of life (HRQoL) during treatment is an important goal in MM, with a particular focus on understanding the long-term impact of disease and treatment on patients [9]. However, reports on HRQoL specifically in the ASCT-ineligible population are limited.

In May 2018, the US Food and Drug Administration approved daratumumab, a human anti-CD38 monoclonal antibody, for use in combination with VMP (D-VMP) in patients with NDMM who are ineligible for ASCT. This approval was based on the results of the multicenter, open-label, phase III ALCYONE trial (NCT02195479), which demonstrated significantly higher response rates, higher rates of minimal residual disease negativity, and lower risk of disease progression or death in patients who received D-VMP compared with those receiving VMP alone. Prespecified subgroup analyses showed the superiority of D-VMP over VMP in patients 75 years of age or older (29.9% of patients in the study) and those with poor prognosis [10]. Rates of grade 3/4 hematologic events, including neutropenia, thrombocytopenia and anemia, were higher in the D-VMP group than in the VMP group [10].

Here, we present analyses from the ALCYONE clinical trial evaluating the treatment effect of D-VMP on patient-reported outcomes (PROs).

Methods

Study design and patients

Details of the multicenter, randomized, open-label, active-controlled, parallel-group ALCYONE trial have been previously published [10]. Treatment cycles lasted 6 weeks for cycles 1–9 and 4 weeks from cycle 10 onward. Eligible patients were randomized 1:1 to D-VMP or VMP. VMP comprised subcutaneous bortezomib 1.3 mg/m2 (twice weekly in cycle 1; 4 doses per cycle in cycles 2–9), melphalan 9 mg/m2 (days 1–4) and prednisone 60 mg/m2 (days 1–4). D-VMP comprised VMP plus intravenous daratumumab 16 mg/kg, given once weekly in cycle 1, every 3 weeks in cycles 2–9 and every 4 weeks thereafter until disease progression or unacceptable toxicity.

The study was conducted at 162 sites in 25 countries. Each study site’s local independent ethics committee or institutional review board approved the study protocol. This study was conducted in accordance with the ethical principles that have their origin in the Declaration of Helsinki and the International Conference on Harmonisation Good Clinical Practice guidelines and adhered to CONSORT guidelines. All patients provided written informed consent.

PROs

PROs (a secondary objective of the ALCYONE trial) were assessed by the European Organisation for Research and Treatment of Cancer Quality of Life Questionnaire Core 30-item (EORTC QLQ-C30) [11] and the EuroQol 5-dimensional descriptive system (EQ-5D-5L) [12]. The EORTC QLQ-C30 v3 is a validated, cancer-specific instrument that contains 30 items resulting in five functional scales (physical, role, emotional, cognitive and social functioning), one Global Health Status (GHS) scale, three symptom scales (fatigue, nausea and vomiting, and pain) and six single items (dyspnea, insomnia, appetite loss, constipation, diarrhea and financial difficulties) [11]. Higher scores represent greater GHS, better functioning and worse symptoms, respectively. The EQ-5D-5L, a generic measure of health status, assesses five domains including mobility, self-care, usual activities, pain/discomfort and anxiety/depression plus a visual analog scale (VAS) rating of “health today” [12].
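The direction of the EORTC QLQ-C30 scale scores can be illustrated with a minimal sketch. It assumes the standard linear transformation from the published EORTC scoring manual (items on functional and symptom scales scored 1–4, the two GHS items scored 1–7), which this report does not restate. The function name and arguments are hypothetical, and the handling of missing items is omitted.

```python
from statistics import mean


def qlq_c30_scale_score(item_responses, kind):
    """Transform raw item responses into a 0-100 scale score.

    kind is "functional", "symptom" or "ghs". Assumes the standard
    EORTC linear transformation; missing items are not handled here.
    """
    if kind not in ("functional", "symptom", "ghs"):
        raise ValueError("kind must be 'functional', 'symptom' or 'ghs'")
    # Items are scored 1-4 (range 3), except the two GHS items, scored 1-7 (range 6).
    item_range = 6 if kind == "ghs" else 3
    raw = mean(item_responses)
    if kind == "functional":
        # Reversed so that a higher score means better functioning.
        return (1 - (raw - 1) / item_range) * 100
    # Symptom scales: higher = worse symptoms. GHS: higher = better global health.
    return (raw - 1) / item_range * 100
```

Under these assumptions, a patient answering "not at all" (1) to every item on a functional scale scores 100 (best functioning), whereas answering "very much" (4) to every item on a symptom scale also scores 100 (worst symptoms).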

PRO responses were collected using an electronic tablet device prior to any other study-related activities, at baseline (before randomization), every 3 months during the treatment phase and then every 6 months until disease progression. All patients were educated on the use of the electronic tablet. Interim results are presented for the first 36 months of treatment.

Statistical methods

The primary analysis population was the intent-to-treat (ITT) population (all randomized patients); the PRO data set was the ITT population of patients with a baseline and ≥ 1 postbaseline PRO assessment. No imputation of missing data or adjustments for multiplicity were made. A sensitivity analysis was conducted using a pattern-mixture model.

PRO data were summarized using descriptive statistics, including number, mean, standard deviation, median, and minimum and maximum value by treatment group. Compliance was calculated at baseline and for each postbaseline PRO assessment visit as a percentage, with the number of PRO assessments received as the numerator and the number of PRO assessments expected at that time point (a clinical prediction of how many patients will be on treatment) as the denominator.
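The compliance calculation described above reduces to a simple ratio per visit. The sketch below uses a hypothetical function name and illustrative counts that are not taken from the study data.

```python
def compliance_rate(received, expected):
    """Return PRO compliance at one visit as a percentage.

    received: number of PRO assessments received at the visit
    expected: number of PRO assessments expected at the visit
    """
    if expected <= 0:
        raise ValueError("expected must be a positive count")
    return 100.0 * received / expected


# Illustrative counts only: 317 questionnaires received out of 350 expected.
baseline_compliance = round(compliance_rate(317, 350), 1)
```

Because the denominator is the number of assessments expected at each time point rather than the number of patients randomized, compliance can remain high even as patients discontinue treatment.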

We assessed treatment differences using a repeated-measures, mixed-effects model with a missing-at-random data assumption. The model included the baseline PRO score, treatment group, time, treatment by time interaction and the stratification factors as fixed effects and subject as a random effect. A 2-sided 5% significance level was used to descriptively compare values for the exploratory PRO endpoints, which are derived from scale scores.
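One way to write such a model is sketched below. The notation is illustrative rather than taken from the statistical analysis plan: it assumes change from baseline as the outcome and a random subject intercept, and it leaves the covariance structure unspecified.

```latex
\Delta Y_{ij} = \beta_0 + \beta_1 Y_{i0} + \beta_2 T_i + \gamma_j + \delta_j T_i
              + \boldsymbol{\theta}^{\top}\mathbf{S}_i + b_i + \varepsilon_{ij},
\qquad b_i \sim N(0, \sigma_b^2), \quad \varepsilon_{ij} \sim N(0, \sigma^2)
```

Here \Delta Y_{ij} is the change from baseline for patient i at visit j, Y_{i0} is the baseline PRO score, T_i is the treatment indicator (1 for D-VMP, 0 for VMP), \gamma_j is the visit effect, \delta_j is the treatment-by-time interaction, \mathbf{S}_i holds the stratification factors and b_i is the random subject effect. With this parameterization, the between-group difference in LS mean change at visit j is \beta_2 + \delta_j.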

The proportion of patients achieving minimally important differences (MIDs) in each PRO instrument scale score, which indicate clinically meaningful changes, was summarized with odds ratios and 95% confidence intervals (CIs). Although there is no universal MID [13], there are multiple published MID thresholds ranging from 5 to 10 [14,15,16,17]. Here, MID thresholds to explore individual patient-level change were defined a priori as ≥ 10 points for the EORTC QLQ-C30 scale scores [18] and ≥ 7 points for EQ-5D-5L VAS [19].
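A minimal sketch of the patient-level MID classification follows, assuming that an improvement is a rise in score on GHS, functional scales and the VAS, and a fall in score on symptom scales. The function names and the `higher_is_better` flag are hypothetical.

```python
# Thresholds as stated in the text: 10 points (EORTC QLQ-C30), 7 points (EQ-5D-5L VAS).
MID_THRESHOLDS = {"qlq_c30": 10.0, "eq5d_vas": 7.0}


def is_meaningful_improvement(baseline, follow_up, instrument, higher_is_better=True):
    """Return True if the change from baseline meets the MID in the improving direction."""
    change = follow_up - baseline
    if not higher_is_better:
        # Symptom scales: a lower score is an improvement, so flip the sign.
        change = -change
    return change >= MID_THRESHOLDS[instrument]


def responder_proportion(score_pairs, instrument, higher_is_better=True):
    """Proportion of (baseline, follow_up) pairs that meet the MID."""
    flags = [
        is_meaningful_improvement(b, f, instrument, higher_is_better)
        for b, f in score_pairs
    ]
    return sum(flags) / len(flags)
```

For example, a pain score falling from 66.7 to 50.0 would count as a meaningful improvement when `higher_is_better=False`, whereas a VAS score rising from 60 to 66 would not meet the 7-point threshold.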

We conducted exploratory subgroup analyses to determine if there were differences in EORTC QLQ-C30 GHS, functional and symptom scale scores by age and Eastern Cooperative Oncology Group (ECOG) performance status. In a further exploratory analysis, time to worsening of EORTC QLQ-C30 scores was assessed using survival curves and hazard ratios (HRs) by depth of clinical response and minimal residual disease (MRD) status.

Results

Patients

Baseline patient demographics and population characteristics were similar between groups (D-VMP: n = 350; VMP: n = 356) (Table 1). Mean age was 71 years, there were approximately the same number of male and female patients, and approximately half of patients had a baseline ECOG performance status of 1. Baseline EORTC QLQ-C30 scores were similar between treatment arms for all functional and symptom scales (Table 1).

Table 1 Baseline characteristics, EORTC QLQ-C30 scores and EQ-5D-5L scores (ITT population)

Compliance rates with PRO measures were high and similar across treatment groups. At baseline, 90.6 and 90.3% of patients subsequently assigned to the D-VMP group and 91.9 and 91.3% of patients randomized to the VMP group completed the EORTC QLQ-C30 and the EQ-5D-5L, respectively (supplementary Fig. 1). Compliance rates remained high (> 76%) throughout the study. The number of PRO assessments received was higher in the D-VMP group than in the VMP group, which is consistent with the greater numbers of patients staying on treatment with D-VMP compared with VMP. The PRO data sets for the D-VMP group were also larger owing to the longer treatment duration of these patients.

Treatment effect on EORTC QLQ-C30 scores

ITT population

Using a mixed-effects model with repeated measures, the least squares (LS) mean change in GHS score from baseline was 7.3 in the D-VMP group and 3.9 in the VMP group at 3 months (difference 3.4, p = 0.0240). Between-group differences were not significant at other assessment time points, but point estimates more often favored the D-VMP group than the VMP group. The LS mean change from baseline was clinically meaningful (i.e. ≥10 points) at months 9, 12, 18 and 30 for both treatment groups, as well as at month 24 in the VMP group and month 36 in the D-VMP group (Fig. 1a).

Fig. 1 LS mean change from baseline in EORTC QLQ-C30. a – GHS. b – Physical functioning. c – Pain. d – Fatigue up to 36 months (ITT population)

LS mean changes from baseline were not significantly different between treatment groups for the functional scales of the EORTC QLQ-C30. Point estimates favored the D-VMP group at all assessment time points for physical functioning (Fig. 1b) and most time points for role functioning, cognitive functioning and social functioning (supplementary Fig. 2a, c, d). The direction of the point estimates for between-group differences in change in emotional functioning fluctuated depending on the assessment time point, favoring D-VMP at months 6, 12, 30 and 36 and favoring VMP at months 3, 9, 18 and 24 (supplementary Fig. 2b). LS mean change from baseline in physical functioning scores was more often clinically meaningful in the D-VMP group than in the VMP group (6 vs 2 assessment time points). Clinically meaningful changes in role functioning scores were observed in the D-VMP group at all time points after month 3 and in the VMP group at all time points between month 3 and month 30. LS mean changes in emotional functioning scores were clinically meaningful in both treatment groups at all assessment time points after month 3. Scores for cognitive functioning declined from baseline in both groups, but the LS mean change from baseline was not clinically meaningful in either group at any assessment time point.

There were no significant between-group differences in LS mean change from baseline in scores for pain (Fig. 1c), fatigue (Fig. 1d) or nausea and vomiting (supplementary Fig. 2e). Point estimates generally favored D-VMP for these symptoms. Improvements in pain scores were clinically meaningful at all assessment time points for both treatment groups, but were not meaningful for either group at any time point for fatigue or nausea and vomiting. Although overall usage of concomitant medications was similar between groups (97.7 and 96.9% of patients in the D-VMP and VMP groups, respectively), greater proportions of patients in the D-VMP group than in the VMP group used analgesics (70.8% vs 57.9%) and anti-inflammatory agents (25.7% vs 14.4%).

The proportions of patients with a clinically meaningful change (i.e. ≥10 points) in EORTC QLQ-C30 scores at month 36 are shown in Fig. 2. Differences between treatment groups were not statistically significant, but the proportions were numerically greater in the D-VMP group for all scales. The greatest proportion of patients experienced clinically meaningful changes in pain scores, with 75% of patients in the D-VMP group and 71% of patients in the VMP group reporting a change of ≥10 points.

Fig. 2 Percentage of patients reporting clinically meaningful improvements in EORTC QLQ-C30 functional and symptom scales at 36 months. Clinically meaningful improvement defined as a ≥ 10-point improvement from baseline score

Subgroup analyses

In a subgroup analysis of EORTC QLQ-C30 GHS, physical functioning, pain and fatigue scores by age and ECOG performance status, similar patterns of increasing improvement in HRQoL were observed in all subgroups, including patients 75 years of age or older and those with poorer overall baseline functional status (ECOG performance status ≥2), with no significant difference between the D-VMP and VMP groups after 3 months (Table 2). Improvements were observed in all subgroups, but were generally greater in the younger (< 75 years) vs older (≥75 years) patients and in those with ECOG performance status of 2 vs those with ECOG performance status of 0 or 1.

Table 2 Mean change from baseline in EORTC QLQ-C30 GHS, physical functioning, pain and fatigue and EQ-5D-5L VAS scores up to 36 months by age and ECOG performance status at baseline

Time to worsening of GHS, function and symptoms was generally longer with greater depth of clinical response (Table 3). Time to worsening was significantly longer for patients with a complete response and significantly shorter for patients with stable disease, both compared with patients with very good partial response/partial response on the EORTC QLQ-C30 GHS (hazard ratio = 0.72 and 1.75, respectively). Similarly, patients who reached MRD-negative status had significantly improved outcomes compared with MRD-positive patients on EORTC QLQ-C30 GHS and pain (hazard ratio = 0.70 and 0.60, respectively; p < 0.05).

Table 3 HRs for comparison of time to worsening of EORTC QLQ-C30 scale scores by depth of response and MRD status for pooled treatment arms

Treatment effect on EQ-5D-5L scores

ITT population

At month 3, the LS mean change from baseline in EQ-5D-5L VAS score was 7 in the D-VMP group and 3.8 in the VMP group (difference 3.1, p = 0.0160). Between-group differences were not significant at other assessment time points. Point estimates favored the D-VMP group at months 6, 9, 18 and 36 and favored the VMP group at months 12, 24 and 30 (Fig. 3a). The LS mean change was clinically meaningful in both groups at month 18, in the VMP group at month 24 and in the D-VMP group at month 36. The proportion of patients with a clinically meaningful improvement in VAS score was significantly greater in the D-VMP group at month 3 (54.8% vs 41.3%, odds ratio 1.72, p = 0.0025); between-group differences were not significant at other time points (Fig. 3b). The proportion of patients with clinically meaningful improvement in VAS score was numerically greater in the D-VMP group at all assessment time points except month 12.
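As a plausibility check, the month 3 odds ratio can be approximated from the two reported responder proportions. The sketch below computes a crude, unadjusted odds ratio; the reported estimate may come from an adjusted analysis, so agreement is expected only approximately. The function name is hypothetical.

```python
def odds_ratio(p_treatment, p_control):
    """Unadjusted odds ratio from two responder proportions (each strictly between 0 and 1)."""
    odds_treatment = p_treatment / (1 - p_treatment)
    odds_control = p_control / (1 - p_control)
    return odds_treatment / odds_control


# Month 3 responder proportions reported in the text: 54.8% (D-VMP) vs 41.3% (VMP).
month3_or = odds_ratio(0.548, 0.413)
```

The crude value is close to the reported odds ratio of 1.72.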

Fig. 3 EQ-5D-5L VAS. a – LS mean change from baseline up to 36 months. b – Percentage of patients reporting clinically meaningful improvements up to 36 months (ITT population)

Subgroup analyses

Similar to the subgroup analysis for EORTC QLQ-C30, increasing improvement over time was observed for EQ-5D-5L VAS scores in subgroups by age and ECOG performance status, with no significant difference between treatment groups after 3 months (Table 2). Improvements were greater in younger patients and those with an ECOG performance status of 2.

Discussion

MM is incurable and was responsible for 1.1% of all cancer deaths worldwide in 2018 [20], and patients with MM experience high levels of pain, fatigue and mood disturbances [21]. MM treatments are often associated with demanding administration and monitoring schedules, as well as adverse events. As a result, the burden of MM on patients’ HRQoL is substantial, and PROs should be an important consideration for evaluating new treatment strategies in these patients. This is particularly relevant for the subpopulation of patients with NDMM who are ineligible for transplant, as this group is typically older and often has comorbidities, including impaired renal and hepatic function, that may limit therapeutic options and/or increase susceptibility to adverse effects. Currently, data on HRQoL in this patient subpopulation are limited.

The results presented here provide clear evidence of the HRQoL benefits of D-VMP and VMP in patients with NDMM who are not eligible for transplant. These findings are consistent with a systematic review by Nielsen et al. [22], which reported clinically relevant improvement in HRQoL following treatment in this population. Our study is the first to examine HRQoL of patients treated with D-VMP, and the robustness of the results was supported by a sensitivity analysis using a pattern-mixture model.

Clinically meaningful improvements in GHS, function and symptoms were maintained in this patient population to at least 36 months, which corresponds to the median overall survival of the control group in the Myeloma Trialists’ Collaborative Group meta-analysis of 24 randomized MM trials [23], and is likely among the longest durations of follow-up reported in the first-line treatment of transplant-ineligible patients with MM. Baseline health status and burden of disease measured using the EORTC QLQ-C30 GHS, functional scales and symptom scales were worse for patients with NDMM compared with a population-based random sample of adults without cancer in Germany [24]. Nevertheless, post-treatment scores improved to a level approaching those of a noncancer population [25]. Improvements in cognitive functioning were lower than those reported for the other functional scales, but this is likely attributable to a ceiling effect, as the mean baseline scores for the cognitive functioning scale were the highest of the functional scales, leaving little additional room for improvement, especially in patients on active treatment. The improvements in pain and fatigue observed with both D-VMP and VMP may be particularly noteworthy. Prior studies have demonstrated that patients with NDMM tend to have more pain and fatigue than those with later-stage disease [26], and a study by Jordan et al. demonstrated that pain and fatigue are the strongest predictors of HRQoL [27]. Treatments that impact these symptoms may therefore have the largest impact on patients’ HRQoL.

Although improvements in the D-VMP group were significantly greater than those in the VMP group on some scores at some time points, between-group differences were largely nonsignificant. This observation needs to be considered in the context of the significant increase in clinical benefits observed with the D-VMP regimen [10]. One possible explanation for the lack of incremental benefit for D-VMP over VMP on HRQoL outcomes may be that this was an on-treatment analysis, in which PRO results are reported for patients remaining on treatment and do not reflect the impact of disease progression resulting in discontinuation of study treatment. A greater proportion of patients in the VMP group than in the D-VMP group discontinued treatment owing to disease progression (13.3% vs 6.6%) [10]. A second explanation may be the substantial positive impact of bortezomib on HRQoL. The magnitude of symptom improvement observed in the present study is noticeably larger than has been observed in some other studies involving patients with NDMM who were transplant ineligible. For example, the magnitude of the mean changes in GHS, physical functioning, pain and fatigue observed in the D-VMP and VMP groups in the present trial is larger than that observed in the phase III FIRST study of lenalidomide plus low-dose dexamethasone vs melphalan/prednisone/thalidomide [28]. Although cross-trial comparisons need to be interpreted with caution, especially as patient inclusion criteria may differ, these observations suggest that owing to the large improvement in HRQoL with VMP alone and high baseline scores, there was little additional room for improvement in HRQoL upon further addition of daratumumab (i.e. ceiling effect). Especially when it comes to depth of remission, bortezomib-based regimens have consistently yielded greater proportions of patients in complete response than immunomodulatory drug–based combinations. In ALCYONE, response assessment was complemented by measurement of MRD rather than by pure International Myeloma Working Group uniform response criteria. In fact, patients who reached MRD-negative status had significantly improved outcomes compared with MRD-positive patients in terms of EORTC QLQ-C30 GHS and pain scores.

Although patients were not randomized by subgroup, and subgroup analyses should be interpreted with caution, results of these analyses were generally supportive of the findings in the overall population. Subgroup analyses also demonstrated symptom improvement with both D-VMP and VMP irrespective of age and functional status. Notably, improvements were observed in patients 75 years of age or older and those with poor overall function, indicating that the addition of daratumumab did not negatively affect HRQoL, even in frail and elderly patients who may have limited treatment options. The improvement in HRQoL in older patients is noteworthy, as elderly patients tend to have greater health impairment, including comorbidities, and so may have a lower likelihood of achieving treatment benefit; and transplant-ineligible patients tend to be older than those who are eligible for transplant [26]. A further subgroup analysis found that improvements in HRQoL were greater for patients achieving the greatest clinical response. This latter observation is consistent with the results of previous studies that have demonstrated an association between improved HRQoL outcomes and depth of clinical response in patients with MM [29, 30].

Other studies have also examined the impact of daratumumab as part of first-line treatment on HRQoL in patients with transplant-eligible and -ineligible MM. In the CASSIOPEIA study, daratumumab in combination with bortezomib, thalidomide and dexamethasone was associated with significantly greater reductions in pain, less deterioration of cognitive functioning and greater improvements in emotional functioning vs bortezomib, thalidomide and dexamethasone alone in patients with transplant-eligible NDMM [31]. In the MAIA study, the combination of daratumumab with lenalidomide and dexamethasone was associated with faster and sustained improvement in HRQoL measures compared with lenalidomide and dexamethasone alone in patients with transplant-ineligible NDMM [32]. In both the CASSIOPEIA and MAIA studies, improvements in HRQoL were consistent with observed clinical benefit. Our results, the first in a study that includes an alkylating agent, add to these existing data and demonstrate that D-VMP improves HRQoL, with meaningful improvements in both functional and symptom scales in patients with transplant-ineligible NDMM.

As noted above, the improvements in PROs reported here complement the significant clinical benefits observed with D-VMP vs VMP, including a lower risk of disease progression and higher percentages of patients with MRD negativity. PROs provide the patient perspective on treatment, and the combined use of clinical endpoints and PROs best reflects the full spectrum of patients’ disease as well as the overall effectiveness of treatment. However, whereas the clinical assessments showed significant improvements with D-VMP vs VMP [10], including significant improvements in overall survival [33], differences in HRQoL between groups were modest and largely nonsignificant. In addition to the two explanations provided above, this disparity could be due to the use of generic PRO instruments in the ALCYONE trial. MM-specific PRO measures with greater sensitivity to changes in HRQoL, symptoms and impacts for comparing two treatments with multiple drugs may have been able to tease out the treatment differences with greater specificity, although the EORTC QLQ-C30 and EQ-5D-5L are validated tools that are widely used to assess HRQoL in patients with cancer.

One of the limitations of the present study is the open-label design, which may lead to biased treatment effects on PROs. As noted above, another limitation is that only on-treatment results are presented, as patients were censored from the analysis when they discontinued treatment, so HRQoL outcomes do not reflect disease progression (and more patients in the VMP group progressed and discontinued treatment). Furthermore, no reasons were documented for missing data, and some consequences of treatment may not have been identified. This study is also limited by the lack of control for the use of pain medication. Nonsteroidal anti-inflammatory drugs are not recommended in patients with MM because of renal toxicity [34], yet 25.7% of patients in the D-VMP group and 14.4% in the VMP group were treated with these agents, and the proportion of patients treated with analgesic, low-dose corticosteroid and anti-inflammatory medications was greater in the D-VMP group than in the VMP group. It is not possible to determine to what extent these medications may have contributed to the decreases in pain observed in the study, although the impact of systemic corticosteroids is likely minimal given the high cumulative dose of prednisone patients received as part of their study treatment. Furthermore, the difference in the proportion of patients treated with these agents while on study treatment may be explained, at least in part, by the fact that more patients remained on treatment in the D-VMP group.

In conclusion, patients with NDMM who were transplant ineligible demonstrated early and continuous improvements in HRQoL, including improvements in function and symptoms, following treatment with D-VMP or VMP. Functional status and well-being were maintained in patients who remained in the study in both the D-VMP and VMP treatment groups; these findings support the clinical efficacy benefits already reported [10]. This analysis highlights the importance of measuring HRQoL and PROs to confirm the benefits of cancer therapy on the day-to-day aspects of patients’ lives, as well as their clinical prognoses.