Key Summary Points

Utility values of patients with generalized myasthenia gravis (gMG) presenting with a variable combination of symptoms and severity are unknown.

Using data from the ADAPT trial comparing efgartigimod add-on treatment to conventional therapy, this study aimed to determine the association between Myasthenia Gravis–Activities of Daily Living (MG-ADL) symptom scores and utilities based on the EQ-5D-5L, a measure of health-related quality of life (HRQoL).

Improvements in gMG symptoms were significantly associated with higher utility values, with the exception of eyelid droop. Individual MG-ADL items contributed differently to utility values, with the largest impact from improvements in brushing teeth/combing hair, rising from a chair, chewing, and breathing.

Patients treated with efgartigimod plus conventional therapy had higher utility values relative to patients treated with placebo plus conventional therapy having the same total MG-ADL symptom scores.

Introduction

Generalized myasthenia gravis (gMG) is a neurological condition affecting patients’ muscle strength which often results in problems with fatigue, vision, swallowing, chewing, limb weakness, and breathing [1, 2]. Prevalence estimates indicate that MG affects as many as 700,000 people worldwide and 103,000 people in the European Union (EU) [3, 4]. Approximately 85% of patients with MG will progress to gMG [5]. For patients with gMG, many aspects of life are substantially impacted, including well-being, social function, psychological health, and physical health [6, 7].

Consequently, patients with gMG report lower health-related quality of life (HRQoL) than in the general population [8, 9]. Several disease-specific and generic instruments have been validated in patients with gMG that can be used to assess patients’ symptom severity, functional status, and HRQoL. Developed in 1999, the Myasthenia Gravis Activities of Daily Living (MG-ADL) scale is a disease-specific patient-reported outcome measure assessing gMG symptoms and functional status [2, 10]. A recent literature review of 48 publications and 35 clinical trials indicated that the MG-ADL is a valid and reliable measure, increasingly used in observational studies and as a primary endpoint in clinical trials [2]. Furthermore, it is also frequently used in clinical practice [2].

The EQ-5D-5L is a generic HRQoL instrument developed by the EuroQol Group that contains five questions plus a visual analogue scale [11, 12]. This instrument can be used across a variety of disease states, including gMG [13, 14]. It is the most widely used HRQoL measure in health technology appraisals and pharmacoeconomic modeling and is endorsed by the National Institute for Health and Care Excellence (NICE) in the United Kingdom (UK) [15, 16]. The responses to the five EQ-5D-5L questions can be summarized in utility values using country-specific value sets [12]. Utilities indicate the severity of a health state on a scale where 1 represents full health, 0 is an anchor representing death, and negative values (< 0) represent health states considered to be worse than death [12]. With these utilities, quality-adjusted life years (QALYs) can be calculated, which can, in turn, be used to inform economic evaluations of healthcare interventions [12]. Utility measures have become increasingly important when considering the value and benefit of therapeutic interventions [17].
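As a worked illustration of how utilities feed into QALYs, the sketch below integrates a utility trajectory over time with the trapezoidal rule. The function name and the example utility values are hypothetical and are not taken from ADAPT; the area-under-the-curve convention is assumed here for illustration only.

```python
def qalys(times_in_years, utilities):
    """Approximate QALYs as the area under the utility curve (trapezoidal rule).

    Hypothetical helper for illustration; not part of the study's analysis.
    """
    if len(times_in_years) != len(utilities):
        raise ValueError("times and utilities must have the same length")
    if len(times_in_years) < 2:
        raise ValueError("at least two time points are required")

    total = 0.0
    for i in range(1, len(times_in_years)):
        dt = times_in_years[i] - times_in_years[i - 1]
        if dt < 0:
            raise ValueError("time points must be in ascending order")
        # Average utility over the interval, weighted by its duration in years
        total += dt * (utilities[i] + utilities[i - 1]) / 2
    return total


# Invented trajectory: utility 0.60 at baseline, 0.80 at 6 months, 0.80 at 1 year
example_qalys = qalys([0.0, 0.5, 1.0], [0.60, 0.80, 0.80])
print(round(example_qalys, 4))
```

One year lived at a constant utility of 1 would yield exactly 1 QALY under this convention, which is why small utility differences sustained over time matter in economic evaluations.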

To date, there is limited information available on the association between functional status (as measured with the MG-ADL) and utility values in patients with gMG [2, 18–20]. The ADAPT phase 3 study is one of the few clinical trials of patients with gMG that assessed and reported both symptom and functional scores (MG-ADL) and utility measures (EQ-5D-5L) [21]. The purpose of the ADAPT trial was to evaluate treatment efficacy, safety, tolerability, and impact on normal daily activities and HRQoL in patients treated with efgartigimod + conventional therapy (EFG + CT) vs. placebo + CT (PBO + CT) [21]. Data from this trial may be used to understand how gMG symptoms impact patient quality of life. This would aid in translating findings on treatment efficacy from gMG clinical trials to economic assessments. In addition, determining the association between MG-ADL and EQ-5D-5L would allow for utility values to be used in cost-effectiveness analyses in which the MG-ADL is used to describe health states [22].

The aim of this study was therefore to determine the association between MG-ADL and EQ-5D-5L utility scores using data from the phase 3 ADAPT study. Further, this study assessed if the improvement in utility as captured by the EQ-5D-5L is attributable to improvement in MG-ADL scores, or if there may be an additional impact from treatment or structural changes over time.

Methods

Study Dataset

This analysis was based on the data from the phase 3 ADAPT clinical trial [21]. Patients included in the trial were at least 18 years of age and had gMG with or without anti-acetylcholine receptor antibodies (AChR-Ab+ or AChR-Ab−); they were eligible if their disease was categorized as MGFA class II to IV and they had an MG-ADL score of at least 5 (with > 50% of the MG-ADL score due to nonocular symptoms) [21]. Patients were randomly assigned to EFG + CT or matching PBO + CT, administered as four weekly infusions per 8-week treatment cycle [21]. In the ADAPT study, the time between each treatment cycle was individualized according to clinical evaluation [21]. Additional details on inclusion and exclusion criteria and study design for the ADAPT trial were previously reported [21]. Symptom and functional scores (MG-ADL) and utility measures (EQ-5D-5L) were simultaneously collected on a biweekly basis up to 26 weeks as part of the ADAPT study design.

Compliance with Ethics Guidelines

Independent ethics committees and institutional review boards provided written approval for the ADAPT study protocol and all amendments [21]. The trial was conducted in accordance with the principles of the Declaration of Helsinki [21]. Patients consented both to participation in the original ADAPT study and to publication of their data.

Outcome Measures

MG-ADL

The MG-ADL includes eight items (talking, chewing, swallowing, breathing, brushing teeth/combing hair, rising from a chair, double vision, and eyelid droop) across four domains (bulbar, respiratory, limb weakness, and ocular) [1, 10]. Each item in the scale is weighted equally, with a score ranging from 0 to 3 [2]. Total scores range from 0 to 24, with higher scores indicating more severe functional impact due to gMG [2, 23]. The measure has high reliability and has shown construct validity [23].
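The scoring rule described above can be sketched in a few lines. The item keys, function names, and the example patient are hypothetical. Treating double vision and eyelid droop as the ocular items is an assumption based on the domain names, and the eligibility helper covers only the MG-ADL entry criterion quoted from ADAPT (total of at least 5, with more than 50% from nonocular items), not the full inclusion criteria.

```python
MG_ADL_ITEMS = (
    "talking", "chewing", "swallowing", "breathing",
    "brushing_teeth_combing_hair", "rising_from_chair",
    "double_vision", "eyelid_droop",
)
# Assumption: these two items make up the ocular domain
OCULAR_ITEMS = ("double_vision", "eyelid_droop")


def mg_adl_total(item_scores):
    """Sum the eight equally weighted items (each 0-3) into a 0-24 total."""
    missing = [item for item in MG_ADL_ITEMS if item not in item_scores]
    if missing:
        raise ValueError(f"missing items: {missing}")
    total = 0
    for item in MG_ADL_ITEMS:
        score = item_scores[item]
        if score not in (0, 1, 2, 3):
            raise ValueError(f"{item} must be scored 0-3, got {score!r}")
        total += score
    return total


def meets_adapt_mg_adl_criterion(item_scores):
    """Check total >= 5 with more than half of the total from nonocular items."""
    total = mg_adl_total(item_scores)
    nonocular = total - sum(item_scores[item] for item in OCULAR_ITEMS)
    return total >= 5 and nonocular > 0.5 * total


# Invented patient, for illustration only
example_patient = {
    "talking": 1, "chewing": 2, "swallowing": 1, "breathing": 0,
    "brushing_teeth_combing_hair": 1, "rising_from_chair": 1,
    "double_vision": 1, "eyelid_droop": 1,
}
example_total = mg_adl_total(example_patient)
example_eligible = meets_adapt_mg_adl_criterion(example_patient)
print(example_total, example_eligible)
```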

EQ-5D-5L

EQ-5D-5L dimensions include mobility, self-care, usual activities, pain/discomfort, and anxiety/depression, with each dimension having five levels that range from no problems to extreme problems, scored 1 to 5 [12]. Utilities are calculated by applying national value sets as weights to the responses of the five dimensions [12, 24]. The UK EQ-5D-5L interim (“crosswalk”) value set [24,25,26] was used to derive utility scores from EQ-5D-5L data collected in ADAPT [21].

Statistical Analysis

Descriptive statistics were reported for MG-ADL items and total score and for the EQ-5D-5L dimensions and utility values at baseline and at follow-up (all available time points pooled together, from week 1 to week 26). First, a normal identity link (ID) regression estimated the association between utility and the eight items of the MG-ADL. The regression had an identity link between the dependent variable (the utility complement = 1 − utility) and the independent variables (the eight items of the MG-ADL), and errors were assumed to be normally distributed. Bubble plots gave a visual display of the relationship between utility values and MG-ADL total scores for each treatment arm. A generalized estimating equations (GEE) model was then estimated to predict utility based on the patient’s total MG-ADL score and treatment received. This regression also had the utility complement (= 1 − utility value) as the dependent variable, with time (in days), time squared, the MG-ADL score, MG-ADL score squared, treatment, and interaction terms between treatment and time and between treatment and MG-ADL score as independent variables. Repeated measurements from the same patient were assumed to be correlated, and different variance–covariance matrices were fitted to the data (unstructured, compound symmetry, auto-regressive), selecting the best-fitting matrix with restricted maximum likelihood. Independent variables were assessed with a type 3 test and by comparing (nested) models with the corrected Akaike information criterion (AICC) and Bayesian information criterion (BIC). Non-significant variables (e.g., time as an independent variable, time with a squared term, and time as an interaction with treatment) were removed stepwise to obtain a parsimonious model.
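For readers who prefer a formal statement, the two models described above can be written as follows. The notation is introduced here for illustration and is not the authors' own: $U_{it}$ is the utility of patient $i$ at time $t$, $x_{k,it}$ is the score on MG-ADL item $k$, $M_{it}$ is the MG-ADL total score, and $T_i$ is a treatment indicator (1 for EFG + CT, 0 for PBO + CT).

```latex
% Normal identity-link regression on the eight MG-ADL items
1 - U_{it} = \beta_0 + \sum_{k=1}^{8} \beta_k \, x_{k,it} + \varepsilon_{it},
\qquad \varepsilon_{it} \sim \mathcal{N}(0, \sigma^2)

% Full GEE model before removal of non-significant terms
\mathrm{E}\left[ 1 - U_{it} \right]
  = \gamma_0 + \gamma_1 t + \gamma_2 t^2
  + \gamma_3 M_{it} + \gamma_4 M_{it}^2
  + \gamma_5 T_i + \gamma_6 \, T_i \, t + \gamma_7 \, T_i \, M_{it}
```

Because the dependent variable is the utility complement, a positive coefficient on an MG-ADL term corresponds to a utility loss as symptoms worsen.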

We examined the follow-up measurements of the five EQ-5D domains by conducting a global test using a GEE model with identity link, normal distribution, and a compound symmetry variance–covariance matrix. In each of these five models, the EQ-5D domain score was the dependent variable, and treatment and MG-ADL total score were the independent variables.

Both regression models can be used as mapping models to predict utility values from clinical outcomes: the normal ID model predicts utilities and changes in utility based on the MG-ADL items scores, whereas the GEE model predicts utility values based on the total MG-ADL score and other explanatory variables such as time and treatment. In both cases the clinical endpoint is mapped onto utility values.

Results

Patient Characteristics

Baseline characteristics of the patients included in the study from the ADAPT trial are presented in Table 1. Additional baseline characteristics are reported by Howard et al. [21]. Most patients were female, with an average age of 45.9 ± 14.4 years for EFG + CT and 48.2 ± 15.0 years for PBO + CT. The majority of patients were white (> 80%). Average time since diagnosis was 9–10 years. The majority of patients (78%) were AChR-Ab+, but utility at baseline did not differ between antibody-positive and antibody-negative patients. Antibody status was not a significant factor in the relationship between MG-ADL and utility, and thus patients were not evaluated separately on the basis of antibody status.

Table 1 Baseline characteristics of patients in the ADAPT phase 3 trial

Patients with gMG (N = 167, with n = 84 EFG + CT and n = 83 PBO + CT) contributed a total of 3064 simultaneous measurements of MG-ADL and EQ-5D-5L, of which 167 were at baseline and 2897 were at follow-up (all time points combined, from week 1 to week 26).

Mean total MG-ADL and EQ-5D-5L scores were relatively similar between the EFG + CT group and the PBO + CT group at baseline (Table 1) and improved more in the EFG + CT group than in the PBO + CT group at follow-up.

Changes in MG-ADL and EQ-5D-5L Between Baseline and Follow-up

Scores for the five dimensions of the EQ-5D-5L did not differ significantly between treatment arms at baseline. Combining all available follow-up measures, greater improvements from baseline were seen in EFG + CT-treated than in PBO + CT-treated patients on three dimensions of the EQ-5D-5L. Between baseline and follow-up, the proportion of patients reporting no or slight problems improved more for EFG + CT patients than for PBO + CT patients for mobility (EFG + CT, from 56.1% to 75.3%; PBO + CT, from 50.6% to 58.8%), self-care (EFG + CT, from 69.6% to 87.8%; PBO + CT, from 68.6% to 63.9%), and usual activities (EFG + CT, from 40.2% to 72.6%; PBO + CT, from 37.4% to 52.7%) (Table 2).

Table 2 Proportion of patients reporting no or slight problems on EQ-5D items at baseline and follow-up

The five GEE models on the follow-up measurements of the five EQ-5D domains showed that treatment had a significant impact over and above any improvements in MG-ADL in explaining the domain scores for mobility (p = 0.0065), self-care (p < 0.001), usual activities (p = 0.0062), and anxiety/depression (p = 0.0179), but not for pain/discomfort (p = 0.06), although the direction of the effect was consistent with the other domains.

Similarly, compared to patients receiving PBO + CT, patients receiving EFG + CT demonstrated greater improvements from baseline to follow-up across all MG-ADL individual items except double vision, with an increase in the proportion of patients reporting normal function at follow-up compared with baseline. The greatest benefit was observed for chewing (EFG + CT, 14.6% to 56.5%; PBO + CT, 13.3% to 33.7%), brushing teeth/combing hair (EFG + CT, 15.9% to 49.8%; PBO + CT, 15.7% to 29.3%), and eyelid droop (EFG + CT, 20.7% to 41.2%; PBO + CT, 30.1% to 35.7%) (see Table 3 and Figs. S6–S13 in the electronic supplementary material).

Table 3 Proportion of patients reporting normal function on MG-ADL items at baseline and follow-up

Association Between MG-ADL Items and EQ-5D-5L Utility

Regression analysis (Table 4) demonstrated that all individual MG-ADL items contributed statistically significantly but differently to utility values (except for eyelid droop). The largest impact on utility values was associated with improvements in brushing teeth/combing hair (+ 0.042 per point improvement in score), rising from a chair (+ 0.036 per point improvement), chewing (+ 0.0260 per point improvement), and breathing (+ 0.0256 per point improvement).
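To show how the item-level regression can serve as a mapping tool, the sketch below multiplies per-point utility gains by the number of points improved on each item. Only the four coefficients quoted in the paragraph above are included; coefficients for the remaining items are in Table 4 and are deliberately not filled in here. The function and dictionary names are hypothetical, and the calculation assumes, as the regression does, that each one-point change within an item has the same effect.

```python
# Utility gain per one-point improvement, as quoted in the text for four items.
# Coefficients for the other four MG-ADL items are not reproduced here.
REPORTED_GAIN_PER_POINT = {
    "brushing_teeth_combing_hair": 0.042,
    "rising_from_chair": 0.036,
    "chewing": 0.0260,
    "breathing": 0.0256,
}


def utility_gain(point_improvements, gain_per_point=REPORTED_GAIN_PER_POINT):
    """Predicted utility change for a set of item-level point improvements.

    point_improvements maps an item name to the number of points improved
    (negative values represent worsening).
    """
    total = 0.0
    for item, points in point_improvements.items():
        if item not in gain_per_point:
            raise KeyError(f"no coefficient available for item: {item}")
        total += gain_per_point[item] * points
    return total


# Invented scenario: chewing improves by 2 points, rising from a chair by 1
example_gain = utility_gain({"chewing": 2, "rising_from_chair": 1})
print(round(example_gain, 4))
```

Two patients with the same change in total score can therefore have different predicted utility changes, depending on which items improved.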

Table 4 Regression of EQ-5D-5L utility score on MG-ADL items

Association Between MG-ADL Total Score and Utility

Figures 1 and 2 illustrate the distribution of utility values per MG-ADL score, separately for each treatment arm in ADAPT. Bubble size is proportional to the number of observations. There is a downward trend showing diminishing utility values with higher symptom severity. For each MG-ADL score, patients receiving EFG + CT have higher utility values, with a distribution skewed towards the higher end of the utility scale, compared to patients receiving PBO + CT.

Fig. 1 Bubble plot of the distribution of utility values per MG-ADL score for EFG + CT-treated patients. CT conventional therapy, EFG efgartigimod

Fig. 2 Bubble plot of the distribution of utility values per MG-ADL score for PBO + CT-treated patients. CT conventional therapy, PBO placebo

The association between utility values, MG-ADL total score, and treatment was evaluated with a GEE model, which included MG-ADL score and treatment as independent variables. Results showed a significant improvement in utility values per unit improvement in MG-ADL score (Fig. 3). Each unit improvement in MG-ADL (lower total score) was associated with a utility increase of 0.0233 (p < 0.001). In addition, EFG + CT-treated patients experienced an additional improvement of 0.0598 (p = 0.0079) in utility for the same MG-ADL score (Fig. 3). Time was not found to be a significant predictor, and no interaction effects were found with treatment received.
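The two GEE coefficients reported above are enough to compare predicted utilities between two health states, because the model intercept (not quoted in the text) cancels in a difference. The following sketch assumes the final model is linear in the MG-ADL total score with an additive treatment effect; the function and constant names are hypothetical.

```python
UTILITY_PER_MG_ADL_POINT = 0.0233  # utility gained per one-point lower MG-ADL total
EFG_ADDITIONAL_UTILITY = 0.0598    # extra utility with EFG + CT at the same score


def utility_difference(mg_adl_a, efg_a, mg_adl_b, efg_b):
    """Predicted utility of state A minus state B.

    mg_adl_*: MG-ADL total score (0-24); efg_*: True if treated with EFG + CT.
    Assumes a linear score effect and an additive treatment effect.
    """
    for score in (mg_adl_a, mg_adl_b):
        if not 0 <= score <= 24:
            raise ValueError("MG-ADL total score must be between 0 and 24")
    # A lower score in A than in B means fewer symptoms, hence higher utility
    score_effect = UTILITY_PER_MG_ADL_POINT * (mg_adl_b - mg_adl_a)
    treatment_effect = EFG_ADDITIONAL_UTILITY * (int(efg_a) - int(efg_b))
    return score_effect + treatment_effect


# Same MG-ADL score, different treatment: only the direct effect remains
same_score_diff = utility_difference(7, True, 7, False)
# Invented comparison: score 5 on EFG + CT versus score 9 on PBO + CT
combined_diff = utility_difference(5, True, 9, False)
print(round(same_score_diff, 4), round(combined_diff, 4))
```

The second comparison separates into an indirect component through the MG-ADL score (4 × 0.0233) and a direct treatment component (0.0598).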

Fig. 3 Association between MG-ADL total score and EQ-5D-5L utility values by treatment. Regression results on utility from the GEE model are represented by the dashed lines. CT conventional therapy, EFG efgartigimod, GEE generalized estimating equations, PBO placebo

Discussion

gMG is a heterogeneous disease in which patients present a variable combination of symptoms of ocular, bulbar, limb strength, and respiratory nature. This variability is a hurdle when estimating the full impact of therapies on clinical outcomes and eventually on HRQoL. Usually, the therapy’s effect on HRQoL is characterized by translating improvements in clinical dysfunction and symptoms (as measured with the MG-ADL) on utility values. However, there might be an additional direct impact on HRQoL through dimensions not included in the clinical outcome measure. As gMG is a complex and multidimensional disease, it is important to assess the entire impact of the disease and its therapies on all facets of HRQoL.

Results from this analysis show that clinical symptoms measured with the MG-ADL improved with treatment, and improved significantly more with EFG + CT than with PBO + CT. Further, a statistically significant relationship was established between individual MG-ADL items and health utility values, with symptom worsening contributing negatively to utility and some items having more impact than others. The highest utility decrements were observed with worsened limb strength (brushing teeth/combing hair and rising from a chair), chewing, and breathing. In addition, a statistically significant negative relationship was demonstrated between the MG-ADL total score and utility values, with an increase in utility of 0.0233 with each point improvement in the total score.

These observations demonstrated that patients treated with EFG + CT had higher utility improvements than patients treated with PBO + CT. This was due to larger improvements in clinical symptoms (indirect effect on utility through MG-ADL). In addition, for the same MG-ADL score (same disease severity), patients receiving EFG + CT had higher utility because of additional benefits to HRQoL not captured by the MG-ADL but captured by the EQ-5D-5L (direct effect on utilities). Indeed, more patients receiving EFG + CT had no or mild problems with self-care, usual activities, pain/discomfort, and anxiety/depression. These benefits are not included in the MG-ADL outcome but also impact HRQoL.

The models presented in this study predict EQ-5D-5L utility scores based on MG-ADL total score (GEE model) or based on individual MG-ADL items (normal ID regression). This represents a useful method in situations where MG-ADL data from existing trials need to be converted to EQ-5D-5L utility, for example, for use in economic models. The regressions could be used as mapping tools between clinical and HRQoL outcomes. Whilst the model is good at predicting the mean values for each MG-ADL total score, there is in reality a lot of variation around that mean. This is because the disease affects four bodily systems, so patients with the same MG-ADL total score may have very different clinical profiles, with different systems affected. When data allow, direct derivation of the utility score from trial observations is preferred over mapping.

The differential impact of the various MG-ADL items on utility values, as well as the difference in utility values for the same MG-ADL total score between treatments, highlights the need for a more granular mapping model to fully capture the benefit of a treatment on utility.

Other analyses have utilized different instruments to varying degrees of success to elicit health utility index scores in patients with gMG. Findings from Barnett et al. showed a considerable decrease in EQ-5D-5L utility values with higher Myasthenia Gravis Foundation of America (MGFA) class, which indicates worsening HRQoL with increased disease severity [14]. Lower utility values were statistically significantly associated with higher impairment, an observation that is consistent with the findings from this study [14].

In the EU and UK, several health authorities will soon assess new treatments for gMG (including efgartigimod and ravulizumab) in terms of their value for money, with MG-ADL as the primary clinical trial endpoint [21, 27]. Analyses including the one presented in the current study are needed to capture the clinical and economic facets of treatment comprehensively and accurately for two major reasons. First, information on utilities in patients with gMG is limited. Second, there is increasing pressure to demonstrate the impact of treatment on patients’ lives, beyond hospitalizations and other healthcare resource utilization, that to date have been the primary focus of economic analyses for patients with gMG. Specifically, establishing the relationship between MG-ADL and utility values is critical.

The findings from this study address this data gap, but are not without limitations. First, follow-up was limited to 26 weeks, and values could change over longer periods. Second, the first regression and the GEE model relied on assumptions that may not necessarily be true. Specifically, the first regression assumed that each additional change from one level to the next within each MG-ADL item has the same impact on utility, but a different impact between items. The GEE model predicting EQ-5D-5L utility from MG-ADL total score assumed that a unit of improvement in any item has the same utility impact. In addition, results may not be generalizable outside of a clinical trial setting. Further, it is difficult to establish reference values in this patient population. Finally, the EQ-5D-5L may capture dimensions not included in the primary clinical outcome, MG-ADL, which are commonly associated with MG, such as usual activities and anxiety. Despite this, the EQ-5D-5L allows for comparisons across disease states and informs QALY calculations needed for treatment assessments [12].

Conclusion

This analysis demonstrated that improvements in gMG symptoms were significantly associated with higher utility values. MG-ADL scores alone were not sufficient to capture the utility gained from efgartigimod therapy.