Introduction

Multiple myeloma (MM) is an incurable plasma cell neoplasm associated with significant morbidity and mortality. It is considered a disease of older adults, with a median age at diagnosis of 69 years [1]. Despite survival gains for patients with MM over the past 2 decades, including advances in available therapeutic agents, outcomes of older adults still lag behind [2]. Older adults represent a heterogeneous group with wide variations in functional status and overall disease-related outcomes [3]. Incorporating frailty assessments can help improve the current understanding of the heterogeneity of aging in various disease states, including MM. Frailty is defined as a state of vulnerability to adverse health outcomes when exposed to an external stressor [4, 5]. Although frailty is age-related, advanced chronological age does not equate to frailty, creating heterogeneity in the aging process.

Several tools have been developed to assess frailty [6], yet operationalizing frailty in clinical practice remains challenging. Of the existing frailty measures used in geriatrics, the most well-known are the Fried frailty phenotype [7] and the deficits accumulation model [8]. Studies have since sought to simplify frailty measures and apply them to select populations with cancer. Among patients with MM, two common tools are the International Myeloma Working Group (IMWG) frailty score [9] [incorporates chronological age, Charlson co-morbidity index, activities of daily living (ADLs), and instrumental activities of daily living (IADLs)] and the simplified frailty score (modified IMWG frailty score) by Facon et al. [10] (incorporates age, ECOG performance status and Charlson co-morbidity index). In subsequent studies incorporating these scores, 33%–50% of older adults with transplant-ineligible MM are classified as frail [11]. Compared to fit individuals, patients classified as frail have worse progression-free and overall survival and higher rates of infection, treatment toxicity, and chemotherapy discontinuation [9, 12].

Given the importance of frailty in understanding outcomes in MM, clinical trials have recently started incorporating frailty assessment into their data collection. Some studies incorporating frailty measures have used fitness-based approaches to assign therapies or conducted post-hoc subgroup analyses [13,14,15]. However, overall uptake of these frailty measures across clinical trials has been variable, leading to a gap in knowledge regarding the proportion of enrolled trial participants considered frail and uncertainty in frailty-related treatment effects and outcomes. Understanding the definition and, subsequently, the prevalence of frailty and its impact on outcomes represents an important step in devising future targeted therapies to optimize outcomes in this high-risk MM subgroup.

To our knowledge, no prior systematic review has been conducted to assess the impact of frailty on treatment outcomes in therapeutic MM trials. Therefore, the objectives of this systematic review were to 1) examine the prevalence of frailty in therapeutic MM trials and 2) evaluate outcomes among frail older adults in MM clinical trials.

Methods

There was no external funding for this review. We registered this systematic review on PROSPERO (#CRD42022324068) and report the results according to the Preferred Reporting Items for Systematic Reviews and Meta-analyses (PRISMA) guidelines [16].

Search strategy

We created and conducted the search strategy with input from all the authors and a medical librarian (E.U. from E.M. Uleryk Consulting). We searched the following databases from inception to April 5, 2022: MEDLINE and Embase (OvidSP), Scopus (Elsevier), Web of Science (Clarivate), and Cochrane Library (Wiley). We used a combination of controlled vocabulary (MeSH [Medical Subject Headings] and Emtree terms) and keywords with various synonyms for the following concepts: “multiple myeloma” AND (“frailty” OR “geriatric assessment”). We limited the search strategy to English language studies. The full search strategy for each database is available in Supplementary Table S1.

We also manually searched (1) bibliographies of any included trials or relevant review articles, (2) ongoing clinical trials (ClinicalTrials.gov), and (3) conference abstracts (from 2015-2021 for the American Society of Hematology and from 2015-2022 for the American Society of Clinical Oncology and the European Hematology Association). We imported citations from all databases into an EndNote X9 database. After removing duplicate articles, two independent reviewers (H.M. and A.M.) screened the remaining citations. We included the most recent analysis if studies had multiple interim analyses or abstracts. The same two team members (H.M. and A.M.) reviewed the full text of any citation deemed potentially relevant to confirm eligibility. Disagreements were resolved by a third reviewer (T.W.).

Selection criteria

We used the following eligibility criteria for included studies: (1) evaluated a therapeutic drug agent for patients with newly diagnosed (NDMM) or relapsed/refractory (R/R) MM; (2) was a clinical trial (phase I to IV; we excluded real-world observational cohort and registry database studies); and (3) reported on a measure of frailty (intermediate fit or frail) as an inclusion criterion for trial entry, a baseline characteristic, or a post-hoc analysis. We defined frailty measures as any screening or comprehensive geriatric assessment tools that included ≥2 aging-associated domain assessments. These domain assessments could include a combination of age, co-morbidities, and functional/performance status [17]. We excluded any study that classified frailty based on a single factor (i.e., studies categorizing patients as frail solely on the basis of age, Eastern Cooperative Oncology Group performance status (ECOG PS), Karnofsky performance status (KPS), or comorbidities). We excluded studies that did not indicate how frailty was defined, as it was not possible to ascertain whether ≥2 aging-associated domains were included.
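The frailty-measure criterion can be written as a simple rule. The following is a minimal sketch under stated assumptions, not the screening procedure used in this review: the function name `qualifies_as_frailty_measure` and the domain labels are hypothetical, and the set of domains is illustrative rather than exhaustive.

```python
from typing import Iterable, Optional

# Illustrative domain labels (assumption). The review names age,
# co-morbidities, and functional/performance status as examples of
# aging-associated domains; cognition appears among the other geriatric
# domains used by some included studies.
AGING_DOMAINS = {"age", "comorbidity", "functional_status", "cognition"}


def qualifies_as_frailty_measure(domains: Optional[Iterable[str]]) -> bool:
    """Return True if a tool assesses >= 2 distinct aging-associated domains.

    `None` means the study did not state how frailty was defined; such
    studies are excluded because the domain count cannot be verified.
    """
    if domains is None:
        return False
    assessed = {d.lower() for d in domains} & AGING_DOMAINS
    return len(assessed) >= 2


# Age plus comorbidity plus functional status: qualifies.
print(qualifies_as_frailty_measure(["age", "comorbidity", "functional_status"]))
# Performance status alone is a single factor: does not qualify.
print(qualifies_as_frailty_measure(["functional_status"]))
# Frailty definition not reported: excluded.
print(qualifies_as_frailty_measure(None))
```

Treating ECOG PS, KPS, ADLs, and IADLs as one "functional_status" domain is a design choice of this sketch; it prevents two performance scales from being counted as two separate domains.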

Data extraction

We extracted study data, including first author, years of trial enrollment, trial phase (I, II, or III/IV), study methodology (randomized controlled trial [RCT] vs. non-RCT), disease phase (NDMM vs. R/R), sample size, trial location, and therapeutic agents in both the experimental and control arms. We recorded the tool used for frailty assessment. A study was recorded as using the IMWG frailty score if age, Charlson co-morbidity index, ADLs and IADLs were assessed or if the study self-reported as using the IMWG frailty index [9]. A study was recorded as using the simplified frailty score (also known as the modified IMWG frailty index) if ECOG PS was used instead of ADLs and IADLs [10]. We also recorded the frailty categorization (two or three subgroups), frailty prevalence, patient characteristics (median age) and outcome data (efficacy and toxicity data).
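The rule for recording which frailty tool a study used can be sketched as follows. This is an illustration of the rule as described above, not the extraction form used by the reviewers; the function name `classify_frailty_tool` and the component labels are hypothetical.

```python
from typing import Optional, Set

# Component labels are assumptions made for this sketch.
IMWG_COMPONENTS = {"age", "cci", "adl", "iadl"}
SIMPLIFIED_COMPONENTS = {"age", "cci", "ecog_ps"}


def classify_frailty_tool(components: Set[str],
                          self_reported: Optional[str] = None) -> str:
    """Label a study's frailty tool from the components it assessed."""
    # IMWG: all four components assessed, or the study says it used IMWG.
    if self_reported == "IMWG" or IMWG_COMPONENTS <= components:
        return "IMWG frailty score"
    # Simplified: ECOG PS used instead of ADLs and IADLs.
    if SIMPLIFIED_COMPONENTS <= components and not ({"adl", "iadl"} & components):
        return "simplified frailty score"
    return "other"


print(classify_frailty_tool({"age", "cci", "adl", "iadl"}))
print(classify_frailty_tool({"age", "cci", "ecog_ps"}))
print(classify_frailty_tool({"age", "cci"}, self_reported="IMWG"))
```

Studies falling into "other" in this sketch would correspond to those using different instruments, which the review records separately.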

Definition of outcomes

Progression-free survival (PFS) was defined in all studies as the time from randomization to disease progression or death, whichever came first. Additional outcomes recorded included overall survival (OS), overall response rates (ORR) [18], and ≥ grade 3 treatment-emergent adverse events (TEAEs). We also included quality of life or patient-reported outcomes stratified by frailty, if available. For the outcomes of PFS and OS, we extracted hazard ratios and 95% confidence intervals whenever available.
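For readers less familiar with time-to-event endpoints, the PFS definition can be illustrated with a short sketch. The function name `pfs_days` is hypothetical, and censoring at the last follow-up date when neither event has occurred is a common convention assumed here; it is not stated in the review.

```python
from datetime import date
from typing import Optional, Tuple


def pfs_days(randomization: date,
             progression: Optional[date],
             death: Optional[date],
             last_follow_up: date) -> Tuple[int, bool]:
    """Return (time in days, event_observed) for one patient.

    The event time is the earlier of progression or death. If neither
    occurred, the patient is censored at last follow-up (assumption).
    """
    events = [d for d in (progression, death) if d is not None]
    if events:
        return (min(events) - randomization).days, True
    return (last_follow_up - randomization).days, False


# Progression precedes death, so progression defines the PFS event.
print(pfs_days(date(2020, 1, 1), date(2020, 7, 1), date(2021, 1, 1), date(2021, 6, 1)))
# No progression or death recorded: censored at last follow-up.
print(pfs_days(date(2020, 1, 1), None, None, date(2020, 1, 31)))
```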

Results

Of 3193 studies, we included 257 in the full-text review (Fig. 1). After full-text review, 43 clinical trials met the eligibility criteria for inclusion in this review. Common reasons for exclusion during the full-text review included ineligible study design, such as observational cohort or database registry studies (62/257, 24.1%), and not assessing or reporting on frailty (36/257, 14.0%).

Fig. 1: Flow diagram of study selection.

The process of selecting the included studies is indicated.

Study characteristics

Summary characteristics of the 43 included studies are presented in Table 1. These comprised 24 RCTs and 19 non-randomized trials. A total of 26/43 (60.5%) and 17/43 (39.5%) of the studies were in the NDMM and R/R settings, respectively. Most studies were multicenter (38/43, 88.4%), with a plurality conducted in Europe (20/43, 46.5%). The number of studies evaluating or reporting on frailty has increased in recent years, with 18/43 (41.9%) in the last two years (Fig. 2).

Table 1 Summary characteristics of the included studies evaluating frailty in MM therapeutic clinical trials.
Fig. 2: Number of studies evaluating or reporting frailty assessments each year.

Further study characteristics for the 24 RCTs (16 in NDMM and 8 in R/R) are shown in Table 2. The median age of the patients ranged from 73 to 77 years in the NDMM trials and from 64 to 70 years in the R/R setting. Among the included RCTs, planned sample sizes ranged from N = 112 (MUK eight [19]) to N = 1852 (Myeloma XI [20]). Nineteen non-randomized studies (10 in NDMM and 9 in R/R) were included (Table 3). The median age of patients in these studies ranged from 62 to 82 years. These studies ranged from small single-center studies with n < 20 (3/19, 15.8%) [21,22,23] to a larger phase II study with 238 participants (HOVON 123 [24]).

Table 2 Randomized controlled trials of therapeutic agents in multiple myeloma incorporating or reporting on frailty.
Table 3 Non-randomized trials of therapeutic agents in multiple myeloma incorporating or reporting on frailty.

Frailty measurement tools

The most commonly used tool for frailty assessment was the IMWG frailty score (18/43, 41.9%). Among the RCTs, the IMWG frailty score was used or is currently being used in a total of 6 NDMM studies (Larocca et al. [25], EMN10 [26], UK FiTNEss [27], MM4 [28], EMN01 [29], IFM 2017_03 [30]). Among the non-RCTs, the IMWG frailty score was utilized in 12 studies (8 NDMM and 4 R/R). The simplified frailty score was the next most commonly utilized score (17/43, 39.5%). Among the RCTs, it was used in 6 NDMM studies (MAIA [31], ALCYONE [32], HOVON 126 [33], FIRST [10], HOVON-87 [34], IFM 2017_03 [30]) and in all 8 RCTs in the R/R setting. Among the non-RCTs, it is currently being utilized in 3 studies in the R/R setting (IFM2021_03 [35], IFM 2018_02 [36] and KMMWP-164 [37]). The Revised Myeloma Comorbidity Index was used in two studies [38, 39]. Other studies incorporated non-MM-specific geriatric assessment tools, including the VES-13 [23, 40], CARG geriatric assessment [21], Alliance geriatric assessment tool [41], or other geriatric domains (two or more components of geriatric assessment, including comorbidities, cognition, and functional/physical assessments) [22, 24, 38, 42, 43].

Reason for frailty assessment in the trial

Frailty assessment was conducted as a subgroup analysis (planned or post-hoc) in 31/43 (72.1%), for study entry criteria in 8/43 (18.6%), or for drug dosing in 5/43 (11.6%) of the included studies. Subgroup analyses were reported both in the NDMM setting (the large phase III trials MAIA [31] and ALCYONE [32]) and in the R/R setting (MUK eight [19], BOSTON [44], ICARIA [45], ASPIRE, ENDEAVOR, ARROW [46] and, more recently, CANDOR [47] and OPTIMISMM [48]). Two RCTs evaluated frailty specifically for study entry (Larocca et al. enrolled intermediate fit patients only [25], and the ongoing IFM 2017-03 study [30] is an RCT specifically designed for frail patients). The UKMRA FiTNEss study [27], an ongoing phase III RCT, was the only study that incorporated frailty-guided treatment delivery into its primary trial design. With regard to longitudinal changes, one prior study (VBDD-VERRUM) [38] evaluated how frailty changed over time in R/R MM. The HOVON 123 [24] and HOVON 143 [49] studies, along with the UKMRA FiTNEss study [27], will further evaluate the dynamic nature of the IMWG frailty index in the longitudinal setting.

Frailty categorization and prevalence

Frailty categorization varied across the different studies, with dichotomous, ordinal or continuous reporting being used. Frailty was divided into three levels (frail, intermediate fit, fit) in 22/43 (51.2%) of the studies and dichotomized (frail, non-frail) in 8/43 (18.6%). Continuous categorization of frailty was present in one study which used the Cancer and Aging Research Group Geriatric Assessment [21].
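The proportions reported throughout the Results are simple counts over the number of included studies. As a minimal sketch (the helper name `proportion` is hypothetical and this is not the authors' analysis code), the categorization figures above can be reproduced as:

```python
def proportion(count: int, total: int) -> str:
    """Format count/total with a percentage rounded to one decimal place."""
    if total <= 0:
        raise ValueError("total must be positive")
    return f"{count}/{total} ({100 * count / total:.1f}%)"


# Three-level and dichotomized categorization among the 43 included studies.
print(proportion(22, 43))  # 22/43 (51.2%)
print(proportion(8, 43))   # 8/43 (18.6%)
```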

Given the varied categorization of frailty into either two or three subgroups, frailty prevalence varied greatly across studies. Among RCTs in the NDMM setting, frailty prevalence ranged from 25.1% [50] to 54.0% [51]. In the R/R setting, frailty prevalence among RCTs was as high as 73.6%, in the MUK eight trial [19]. Among the non-RCTs, frailty prevalence ranged from 17.2% [52] to 66.0% [40].

Impact of frailty on disease efficacy outcomes

Disease-specific outcomes including PFS, OS, and ORR were reported in the majority of completed studies. Among RCTs in the NDMM setting, several therapies were found to be beneficial in the frail subgroup, including the upfront incorporation of anti-CD38. The ALCYONE and MAIA trials published post-hoc analyses using a simplified frailty score and demonstrated improved PFS with the addition of anti-CD38 among frail older adults, consistent with the overall trial results [31, 32]. In ALCYONE, the PFS benefit of daratumumab-bortezomib-melphalan-prednisone (D-VMP) versus bortezomib-melphalan-prednisone (VMP) in the frail population had a HR of 0.51 (95% CI, 0.39–0.68), compared to a HR of 0.36 (95% CI, 0.28–0.47) in the total non-frail group (fit and intermediate fit) [32]. Similarly, in the MAIA trial, the PFS benefit of daratumumab-lenalidomide-dexamethasone (D-Rd) versus lenalidomide-dexamethasone (Rd) in the frail population had a HR of 0.62 (95% CI, 0.45–0.85), compared to a HR of 0.48 (95% CI, 0.34–0.68) in the total non-frail population [31]. Overall, the magnitude of benefit with the addition of anti-CD38 was lower among frail older adults than in the fit population in both trials, and the addition of anti-CD38 did not overcome the negative impact of frailty.

Among RCTs in the R/R setting, the point estimates for efficacy outcomes in the frail subgroup often favored the interventional arm over the control arm, as in the overall trial population; however, the magnitude of benefit was attenuated and, in some cases, not statistically significant. In the ICARIA trial, for example, there was a PFS benefit with isatuximab-pomalidomide-dexamethasone (IsaPd) compared to pomalidomide-dexamethasone (Pd) in the fit/intermediate group, with a HR of 0.49 (95% CI 0.33–0.73); however, this was less pronounced in the frail subgroup, with a HR of only 0.81 (95% CI 0.45–1.48) [45]. This was consistent across a number of trials, including CANDOR [47], BOSTON [44], ASPIRE and ARROW [46], which all demonstrated improved outcomes with the intervention among fit/intermediate fit patients but a less pronounced benefit in the frail subgroup. Conversely, in the MUK eight study, which had a high proportion of patients classified as frail, the overall trial results were negative (no difference in the primary outcome of progression-free survival with ixa-cyclo-dex compared to cyclo-dex), largely driven by the impact of frailty on treatment delivery and overall regimen tolerability [19].
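The phrase "not statistically significant" here corresponds to a 95% confidence interval that includes a hazard ratio of 1. The sketch below illustrates this with the ICARIA values quoted above; it is not an analysis performed in this review. The function names are hypothetical, and the standard-error formula assumes the interval is symmetric on the log scale, which is the usual approximation for hazard ratios.

```python
import math


def log_hr_standard_error(lower: float, upper: float, z: float = 1.959964) -> float:
    """Approximate SE of log(HR) recovered from a 95% CI (log-scale symmetry assumed)."""
    return (math.log(upper) - math.log(lower)) / (2 * z)


def ci_excludes_null(lower: float, upper: float, null: float = 1.0) -> bool:
    """True if the confidence interval does not contain the null value HR = 1."""
    return not (lower <= null <= upper)


# ICARIA PFS hazard ratios (HR, lower, upper) as reported in the text [45].
subgroups = {
    "fit/intermediate": (0.49, 0.33, 0.73),
    "frail": (0.81, 0.45, 1.48),
}

for name, (hr, lower, upper) in subgroups.items():
    se = log_hr_standard_error(lower, upper)
    print(f"{name}: HR {hr}, SE(log HR) ~ {se:.2f}, "
          f"CI excludes 1: {ci_excludes_null(lower, upper)}")
```

The wider interval in the frail subgroup reflects a larger standard error, which is consistent with the smaller number of frail patients typically enrolled; a point estimate below 1 with an interval spanning 1 therefore does not rule out a benefit.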

Among the non-RCTs, several studies specifically examining frail patients have been conducted or are ongoing. These include HOVON 143 [49] in NDMM, which reported an overall PFS of 13.8 months among frail patients treated with ixazomib-daratumumab-dexamethasone followed by ixazomib-daratumumab maintenance for 2 years. Additional studies in NDMM include the ongoing MMY2035 study [53], which incorporates frailty-adjusted dosing for lenalidomide, with results expected in 2024. In the R/R setting, several studies are ongoing, including the IFM_2021_03 [35] and the IFM 2018_02 [36].

Impact of frailty on toxicity outcomes

Toxicity outcomes specifically for the frail subgroups were reported in the majority of completed studies. Among the RCTs, toxicity was reported for contemporary trials in the NDMM setting, including ALCYONE and MAIA. In the MAIA trial, higher rates of ≥ grade 3 TEAEs were observed with the addition of anti-CD38 in the frail subgroup, consistent with the overall trial population (94.6% D-Rd vs. 89.2% Rd) [31]. In the frail subgroup in MAIA, there were higher rates of ≥ grade 3 neutropenia (57.7% D-Rd vs. 33.1% Rd) and infection (41.7% D-Rd vs. 27.7% Rd). Similarly, in the frail subgroup of the ALCYONE trial, higher rates of ≥ grade 3 neutropenia (41.3% D-VMP vs. 34.4% VMP) and infection (30.0% D-VMP vs. 17.9% VMP) were observed with the addition of the anti-CD38 antibody [32]. In the R/R setting, increased toxicity was seen with therapeutic interventions compared to the control group among the frail subgroup, including in BOSTON [44], ICARIA [45], ASPIRE, ENDEAVOR, ARROW [46], CANDOR [47] and OPTIMISMM [48]. Furthermore, agent-specific toxicities, such as ≥ grade 3 cardiac failure with carfilzomib, were more frequent in frail than in fit patients across treatment groups (KRd: fit 4% vs. frail 10%; Kd 56 mg/m²: fit 4% vs. frail 9%) [46].

Among the non-randomized trials, toxicity data and treatment discontinuation rates were available in only a subset of trials. HOVON 143, a phase II single-arm study conducted among patients classified as frail, reported high rates of non-hematological toxicity (74% of patients) [49]. This study also reported differences in outcomes between patients classified as frail based upon age alone and those who were frail based upon additional geriatric impairments as defined by the IMWG frailty score (median PFS 21.6 months for patients who were frail based on age > 80 years alone versus 10.1 months in patients who were frail based on age > 80 years and additional geriatric impairments). HOVON 123, a phase II single-arm study, demonstrated higher treatment discontinuation rates among frail patients compared to intermediate fit patients [24]. Furthermore, HOVON 123 was the only study available that reported quality of life and patient-reported outcomes by frailty status, showing inferior quality of life among the frail patient group [54].

Discussion

This analysis represents the first comprehensive systematic review evaluating the prevalence of frailty and the outcomes associated with frailty in MM therapeutic clinical trials. Frailty prevalence varied greatly across trials, ranging from 17.2% to 73.6% of the cohort, reflecting differences both in the populations and in the measures of frailty used.

Although it is encouraging that frailty is increasingly being incorporated in MM clinical trials, the wide variation in both the definition and categorization of frailty leads to heterogeneity in its reported prevalence and limits evaluation of its potential impact on outcomes, as patients may be categorized differently by different frailty systems.

As the therapeutic landscape of MM evolves, there is an increasing need to understand frailty as a means of identifying patients who may be at risk of not achieving the maximum benefit while also being at the highest risk of treatment toxicity. However, there remains a range of approaches to operationalizing the clinically intuitive concept of frailty, making it challenging to compare both baseline populations and results across trials. Although the IMWG frailty score is often thought of as the standard approach to defining frailty in MM [13], an increasing number of clinical trials in our review used the simplified frailty score. Although the simplified frailty score (comprising age, comorbidities and performance status alone) has facilitated retrospective post-hoc frailty analyses of previously conducted trials, it is important to note that this abbreviated score is both defined and often categorized differently from the IMWG frailty score. Furthermore, the simplified frailty score may not adequately encompass the heterogeneity in aging and may be more prone to subjective bias compared with prospective evaluation of frailty using more comprehensive tools, such as the IMWG frailty score, which incorporate functional status (activities of daily living and instrumental activities of daily living) [55]. For example, intermediate fit patients (IMWG frailty score) in the study by Larocca et al. had a median PFS of 18.3 months with lenalidomide-dexamethasone [25], whereas patients defined as fit/intermediate fit (simplified frailty score) in the MAIA trial had a median PFS of 41.7 months [31]. Cross-trial comparisons are inherently difficult; our study highlights that differing definitions of the same underlying concept of frailty make them more challenging still.

Beyond the different ways of defining frailty, variation in categorizing a patient’s fitness status into either three (fit, intermediate fit, frail) or two (fit or frail) levels may further limit the ability to compare outcomes across studies. This was illustrated by Stege et al., who applied different weights for co-morbidities and different cut-offs for frailty [56]. In this analysis of the HOVON 123 study, revised frailty indices using different cut-offs classified 45% fewer patients as frail, further improving the discriminative power of these scores. Even within sub-categories, there is heterogeneity in outcomes, as shown in the HOVON 143 study [49], where outcomes differed depending on which variable led to the frailty categorization (age and/or geriatric impairments). As frailty becomes increasingly incorporated into studies, clinicians will need to carefully evaluate the frailty measure, the categorization, and the cut-off value used to define frailty across different studies.

The most common method for evaluating frailty in MM therapeutic trials was subgroup analysis. While many of the studies showed consistent improvements in outcomes with study interventions in frail subgroups, the magnitude of benefit was often less than that seen in fit patients. Furthermore, some of the studies, especially in the R/R setting, showed either no benefit or substantially less benefit of the intervention, with overall higher rates of toxicity. Given the often small and variable subset of frail older adults enrolled in these trials, it is not possible to exclude a potential benefit in this high-risk subgroup. Larger studies with pre-specified, adequately powered subgroups are needed to understand the potential benefit as well as the toxicity of newer agents, including bispecific antibodies and chimeric antigen receptor therapy, in older adults with MM. Furthermore, clinical trials specifically focused on enrolling frail patients and optimizing therapeutic regimens for them are needed to further improve outcomes in this area of unmet clinical need.

This review also highlights other key areas in incorporating frailty in MM therapeutic trials. While studies are increasingly reporting on frailty subgroups, incorporating frailty in the primary study design for treatment delivery was uncommon, with only one study, the UKMRA FiTNEss study, doing so. To use frailty assessment to direct treatment delivery, rather than just to describe the population, integration of frailty into primary study design will be pivotal across different phases of treatment, including NDMM and R/R disease. Another critical area is the need for studies to incorporate longitudinal frailty assessments. Longitudinal approaches to frailty may allow lower treatment intensity for frail patients at the outset, with treatment intensity then modified as frailty status changes. Unfortunately, existing frailty tools, including the IMWG frailty score, consist largely of static variables, such as categorical chronological age and pre-existing comorbidities, and may be less suited to detecting changes in frailty over time. We do not yet know whether frailty is modifiable or whether tracking longitudinal changes in frailty will further optimize treatment delivery; however, further development and validation of frailty assessment tools that are sensitive, specific, and responsive to changes in frailty over time will be essential for future evaluation.

A strength of this analysis is that it is the first comprehensive review highlighting the impact of frailty in therapeutic MM trials. We included randomized controlled trials and single-arm therapeutic MM trials. We used a comprehensive search and screening procedure with careful data abstraction. This review also has limitations. We only included studies registered as clinical trials of therapeutic drugs and therefore cannot report on the prevalence of frailty in the real world, which may be substantially higher given that frail older adults are often excluded from clinical trials [57]. We also did not specifically include, either in our search strategy or in our evaluation of outcomes, other biomarkers of frailty and/or sarcopenia, which are known to impact clinical outcomes [58]. We also did not examine the relationship of frailty with non-drug interventional studies, such as those of physical activity, which are important components of overall MM management [59]. Lastly, given the heterogeneity in results and reporting, we could not conduct a pooled analysis examining specific interventions and their efficacy or toxicity in frail compared to fit patients.

In conclusion, this systematic review summarizes how frailty is incorporated into therapeutic MM trials and highlights potential areas for future research. Although frailty assessments are being increasingly incorporated into trial designs, there remains wide heterogeneity in the definition, categorization and cut-off for frailty among the different trials, which may limit our ability to evaluate any associated outcomes. Future strategies aimed at standardizing frailty assessments, along with incorporation of frailty measures in the primary clinical trial design, will be critical in operationalizing frailty and using fitness-based approaches to tailor the care of frail older adults with MM.