Introduction

Incorporating the patient perspective into clinical studies is an important aspect of drug development [1, 2]. Poor health-related quality of life (HRQoL) in chronic obstructive pulmonary disease (COPD) has been linked to the impact of symptoms such as dyspnea, as well as to physical impairment, decreased mental health, and increased hospital readmission [3,4,5,6]. Furthermore, some reports suggest that improving HRQoL reduces economic and social burdens on both individuals and society [7, 8]. Patient-reported outcome (PRO) measures are therefore important when assessing the overall impact of COPD. Including reliable tools to assess symptoms and HRQoL in clinical development can provide information that complements other efficacy measures, giving a more comprehensive view of the response to treatment and guiding clinical practice [9].

Comparative patient-reported data for COPD therapies are limited [10, 12], with some initial reports suggesting an improvement in HRQoL with open triple therapy compared with dual therapy in patients with symptomatic COPD [10,11,12]. FULFIL (NCT02345161; GSK study CTT116853) is the first study to compare once-daily single inhaler triple therapy with fluticasone furoate/umeclidinium/vilanterol (FF/UMEC/VI) 100 µg/62.5 µg/25 µg using the ELLIPTA® inhaler with twice-daily dual therapy with budesonide/formoterol (BUD/FOR) 400 µg/12 µg using the Turbuhaler® in patients with symptomatic COPD at risk of exacerbations [13]. Previously published findings showed significant improvements in the co-primary endpoints of lung function and HRQoL as measured by the COPD-specific St George’s Respiratory Questionnaire (SGRQ), and in exacerbation rates at week 24 with FF/UMEC/VI compared with BUD/FOR [13]. The safety profile of FF/UMEC/VI was consistent with the known safety profiles of the individual dual- and monotherapy components [13]. FULFIL was also designed to prospectively assess symptoms and physical activity, using daily and periodic PRO assessment in patients receiving FF/UMEC/VI or BUD/FOR. The COPD Assessment Test (CAT) was included as an additional measure of health status directly relevant to clinical practice [13].

Methods

FULFIL was a phase III, randomized, double-blind, double-dummy, parallel-group, multicenter study (Fig. S1 in the online data supplement) [13]. Co-primary outcomes evaluated the effects of fluticasone furoate/umeclidinium/vilanterol (FF/UMEC/VI) on trough forced expiratory volume in 1 s (FEV1) and St George’s Respiratory Questionnaire (SGRQ) [14] total score compared with budesonide/formoterol (BUD/FOR) after 24 weeks of treatment. Secondary and other endpoints that were part of the primary reporting from FULFIL included respiratory symptoms severity (Evaluating Respiratory Symptoms in COPD™ [E-RS: COPD] and subscales [15, 16], Transition Dyspnea Index [TDI] focal score [17,18,19], and Global Rating of Change in COPD Severity), rescue medication use, and inhaler preference [13]. The CAT was included as an additional HRQoL endpoint so as to provide information on health status using data that can be incorporated easily into clinical practice, provide additional data on the relationship to SGRQ total score, and relate to the Global Initiative for Chronic Obstructive Lung Disease assessment of COPD severity [20,21,22].

Patients

FULFIL enrolled male and female patients aged ≥ 40 years with symptomatic COPD at risk of exacerbation, defined as:

  • CAT score ≥ 10,

  • And either

    • forced expiratory volume in 1 s < 50% (no exacerbation history requirement)

    • or forced expiratory volume in 1 s 50–80% with a history of exacerbations in the previous year (≥ 2 moderate or ≥ 1 exacerbation requiring hospitalization).
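The symptom and risk criteria listed above can be read as a simple decision rule. The following Python sketch is illustrative only: the function name, the argument names, and the treatment of the 50% and 80% boundaries are assumptions made for this example, and the sketch does not cover the maintenance-therapy requirement or the exclusion criteria.

```python
def meets_symptom_and_risk_criteria(age_years, cat_score, fev1_pct,
                                    moderate_exacerbations,
                                    hospitalized_exacerbations):
    """Return True if the age, CAT, and FEV1/exacerbation criteria are met.

    fev1_pct is FEV1 expressed as a percentage, as in the criteria above.
    Exacerbation counts refer to the previous year.
    """
    # Age and symptom thresholds apply to every patient.
    if age_years < 40 or cat_score < 10:
        return False

    # FEV1 < 50%: no exacerbation history is required.
    if fev1_pct < 50:
        return True

    # FEV1 50-80%: an exacerbation history is required.
    # Assumption: both ends of the stated range are treated as inclusive.
    if 50 <= fev1_pct <= 80:
        return moderate_exacerbations >= 2 or hospitalized_exacerbations >= 1

    # FEV1 above the stated range does not qualify.
    return False
```

For example, a hypothetical 65-year-old patient with a CAT score of 15 and an FEV1 of 60% would qualify with two moderate exacerbations in the previous year, but not with one.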

Patients were randomized to receive 24 weeks of once-daily FF/UMEC/VI (100 μg/62.5 μg/25 μg) using a single ELLIPTA® inhaler or twice-daily BUD/FOR (400 μg/12 μg) using the Turbuhaler® (intent-to-treat [ITT] population) [13]. A subset of patients remained on blinded study treatment for up to 52 weeks [extension (EXT) population].

Patients were required to be receiving daily maintenance therapy for chronic obstructive pulmonary disease (COPD) for ≥ 3 months, with medications at screening that continued unchanged during the 2-week run-in period. Patients were excluded if they had a current diagnosis of asthma, COPD caused by α1-antitrypsin deficiency, other respiratory disorders, lung resection within 12 months of screening, or any other clinically significant diseases. Patients who had pneumonia or a severe COPD exacerbation that had not resolved within 14 days of screening, a respiratory tract infection that had not resolved within 7 days of screening, or an abnormal chest X-ray were also excluded.

The study protocol was reviewed and approved by the relevant investigational center ethics committee or institutional review board, in accordance with the International Council on Harmonisation. The study was conducted in accordance with the Declaration of Helsinki and all participants provided written informed consent.

Patient-Reported Outcome Tools

Patients used an electronic diary each evening to complete the E-RS: COPD [derived from EXAcerbations of Chronic pulmonary disease Tool (EXACT)], record rescue medication use, and answer questions on limitation in daily activities. The COPD-specific SGRQ and CAT were used to assess HRQoL at study visits; this article reports SGRQ domain scores of symptoms, activity, and impacts (total score results have been reported previously) [13]. The TDI focal score (self-administered computerized version) was used to assess dyspnea. Global Ratings of COPD Severity and activity limitation were assessed at randomization, and Global Rating of Change from Baseline in Severity and activity limitation at all visits post-randomization. The double-dummy study design enabled patients to express preference between the inhalers; responses were obtained using the COPD Inhaler Preference Questionnaire at week 24 (or at treatment discontinuation, if applicable) [23]. Patients were asked which inhaler they preferred, based on number of steps, time taken, and ease of use: the response options were ELLIPTA, Turbuhaler or no preference. The form and timing of these assessments are summarized in Table 1.

Table 1 Assessment tools used in patient-reported data collection

Statistical Analyses

Prespecified analyses were performed in the ITT population for data up to week 24, and in the EXT population for data up to week 52. Mean scores for the E-RS: COPD and the three subscales over 4-week intervals, change from baseline in CAT scores, TDI focal scores, and mean number of occasions of rescue medication use per day over the 4-week intervals were all analyzed using a mixed model repeated measures analysis including treatment group, smoking status at screening, geographical region, visit, baseline value, and baseline-by-visit and treatment group-by-visit interactions. The proportions of responders on SGRQ, CAT, E-RS: COPD and TDI (response defined as a change from baseline equal to or greater than the minimal clinically important difference [MCID; Table 1]) and non-responders were analyzed using a generalized linear mixed model with a logit link function. Number needed to treat to benefit was calculated (post hoc) for E-RS: COPD, TDI, CAT and SGRQ total scores.
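The responder definition and the number needed to treat to benefit described above can be sketched as follows. This is a minimal illustration rather than the study's analysis code: the function names and the example change scores are hypothetical, and the calculation uses raw proportions, not the model-based estimates.

```python
def is_responder(change_from_baseline, mcid, lower_is_better=True):
    """Flag a response: an improvement from baseline of at least the MCID.

    For scores where lower values are better, an improvement is a
    reduction; for scores where higher values are better, an increase.
    """
    if lower_is_better:
        improvement = -change_from_baseline
    else:
        improvement = change_from_baseline
    return improvement >= mcid


def responder_proportion(changes, mcid, lower_is_better=True):
    """Proportion of patients whose change from baseline meets the MCID."""
    flags = [is_responder(c, mcid, lower_is_better) for c in changes]
    return sum(flags) / len(flags)


def nntb(p_treatment, p_comparator):
    """Number needed to treat to benefit: 1 / absolute difference in response rates."""
    difference = p_treatment - p_comparator
    if difference <= 0:
        raise ValueError("NNTB is defined only when the treatment response rate is higher")
    return 1.0 / difference


# Hypothetical changes in a score where a reduction of >= 2 units is a response.
example_changes = [-3, -1, -2, 0]
print(responder_proportion(example_changes, mcid=2))  # 2 of 4 patients respond
```

Applied to rounded responder percentages, for example 61% versus 51%, `nntb(0.61, 0.51)` gives approximately 10. Published values are presumably derived from unrounded data, so rounded percentages will not always reproduce them exactly.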

The proportion of days that a patient reported a score of 0 (less activity than usual), 1 (the same activity as usual), or 2 (more activity than usual) on the daily activity question was determined for the weeks 1–24 period for the ITT population and weeks 1–52 for the EXT population. The proportion of days with a score of 2 (able to perform more activities than usual) on the daily activity question was analyzed using an analysis of covariance model and included the following covariates: treatment group, screening smoking status, geographical region, and baseline value. Patient Global Ratings of Change in Activity Limitation and Change in COPD Severity were analyzed separately at each visit, using logistic regression including treatment, screening smoking status, and geographical region as covariates. The proportion of patients who reported a preference for either inhaler or no preference was presented. For patients who expressed a preference for either ELLIPTA® or Turbuhaler®, P values were calculated (post hoc) to test whether the proportion of patients who preferred ELLIPTA® differed significantly from 50%.
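The text above does not name the test used to compare the preference proportion with 50%. One common choice, shown here purely as an assumption, is a two-sided exact binomial test among patients who expressed a preference. The function name and the counts in the example are hypothetical.

```python
from math import comb


def exact_binomial_p_vs_half(k, n):
    """Two-sided exact binomial P value for H0: true proportion = 0.5.

    k is the number of patients preferring one inhaler and n is the number
    who expressed any preference. Because the null distribution is symmetric
    at 0.5, the two-sided P value is twice the upper tail at max(k, n - k),
    capped at 1.
    """
    if n <= 0 or not 0 <= k <= n:
        raise ValueError("require n > 0 and 0 <= k <= n")
    extreme = max(k, n - k)
    upper_tail = sum(comb(n, i) for i in range(extreme, n + 1)) / 2 ** n
    return min(1.0, 2 * upper_tail)


# Hypothetical example: 8 of 10 patients with a preference chose one inhaler.
print(exact_binomial_p_vs_half(8, 10))
```

Patients who reported no preference are excluded from `n`, matching the description above that the test was restricted to those who preferred one inhaler or the other.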

Results

The ITT population comprised 1810 patients (FF/UMEC/VI, n = 911; BUD/FOR, n = 899). The EXT population included the first 430 of these patients who consented to receive treatment for 52 weeks (FF/UMEC/VI, n = 210; BUD/FOR, n = 220) (Fig. 1). The full population characteristics are shown in Table S1 in the online data supplement, as reported in the primary publication [13]. Baseline demographics and PRO scores were similar between treatment arms in both the ITT and EXT populations, and between the ITT and EXT populations (Table 2) [13]. COPD maintenance medication taken at screening was comparable between treatment arms and the ITT and EXT populations (Table S2).

Fig. 1

CONSORT flow chart for the FULFIL study. BUD/FOR budesonide/formoterol, FF/UMEC/VI fluticasone furoate/umeclidinium/vilanterol, ITT intent-to-treat

Table 2 Patient characteristics at baseline (ITT and EXT populations)

Symptom Assessments

Over 24 weeks (ITT), FF/UMEC/VI produced greater reductions from baseline in mean E-RS: COPD total score and all subscale scores compared with BUD/FOR; treatment differences were statistically significant for each 4-week interval, and the reduction in total score in the FF/UMEC/VI group exceeded the response threshold by week 8. Breathlessness and cough and sputum subscales in this group exceeded the threshold by week 12 (Fig. 2). The between-treatment odds ratios (ORs) of response versus non-response over 24 weeks (ITT) were statistically significant in favor of FF/UMEC/VI (OR range over 24 weeks, 1.59–1.76; all P < 0.001). At weeks 21–24 (ITT) for FF/UMEC/VI versus BUD/FOR, 53 and 42% of patients responded, respectively, and the number needed to treat to benefit (NNTB) was 9 (95% CI 7–17). Similar findings were observed for the ORs of response versus non-response at each 4-week interval over 52 weeks in the EXT population [Table S23 in the online data supplement; proportion of responders: 42 and 32% for FF/UMEC/VI and BUD/FOR, respectively; NNTB of 10 (95% CI 5–77)].
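To show how responder proportions relate to an odds ratio, the sketch below computes an unadjusted OR directly from two response rates. It is not the generalized linear mixed model used in the study, and the function names are assumptions for this example.

```python
def odds(p):
    """Convert a proportion (strictly between 0 and 1) to odds."""
    if not 0 < p < 1:
        raise ValueError("proportion must be strictly between 0 and 1")
    return p / (1 - p)


def unadjusted_odds_ratio(p_treatment, p_comparator):
    """Ratio of the odds of response on treatment to the odds on comparator."""
    return odds(p_treatment) / odds(p_comparator)


# Rounded responder proportions reported for weeks 21-24 (ITT).
print(round(unadjusted_odds_ratio(0.53, 0.42), 2))
```

With the rounded proportions of 53% and 42%, this gives roughly 1.56, close to but not the same as the model-based ORs, which come from the mixed model and unrounded data.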

Fig. 2

Mean change from baseline in 4-weekly E-RS: COPD scores (ITT): total score (a); breathlessness (b); cough and sputum (c); chest symptoms (d). Dark dotted lines represent baseline and light dotted lines represent response threshold. BUD/FOR budesonide/formoterol, CI confidence interval, E-RS: COPD Evaluating Respiratory Symptoms in COPD, FF/UMEC/VI fluticasone furoate/umeclidinium/vilanterol, ITT intent-to-treat, LS least squares

FF/UMEC/VI also demonstrated statistically significant improvements in dyspnea, measured by mean TDI focal score, compared with BUD/FOR at weeks 4 and 24 in the ITT population (Table 3). Between-treatment ORs of response versus non-response at weeks 4 and 24 (ITT) were statistically significant in favor of FF/UMEC/VI [week 4 OR (95% CI): 1.52 (1.25–1.86), P < 0.001; week 24 OR (95% CI): 1.61 (1.33–1.95), P < 0.001]. At week 24 (ITT), 61 and 51% of patients responded on FF/UMEC/VI and BUD/FOR, respectively, and the NNTB was 10 (95% CI 7–18). In the EXT population at week 52, there was no evidence of benefit: the OR was 1.35 (P = 0.132), and 53 and 46% of patients responded, respectively. The mean TDI focal score showed numeric improvement with FF/UMEC/VI compared with BUD/FOR at weeks 4, 24, and 52 (EXT) but only achieved statistical significance at week 4 (P = 0.01). However, the ratio between treatments of odds of response versus non-response for the E-RS: COPD Breathlessness Score for each 4-weekly period in the EXT population ranged from 1.60 to 2.32 (all P < 0.05). The observed improvements in symptom scores were independent of baseline COPD medication.

Table 3 Analysis of TDI focal scores and proportion of TDI responders in the ITT and EXT populations

A statistically significant reduction in mean number of occasions of rescue medication use per day was demonstrated with FF/UMEC/VI versus BUD/FOR over weeks 1–24 (ITT) and weeks 1–52 (EXT) (Table 4). Similar findings were observed at each 4-week time period (Table 4).

Table 4 Treatment difference in rescue medication use for FF/UMEC/VI vs BUD/FOR, in the ITT and EXT populations

HRQoL Assessments

In the ITT population at weeks 4 and 24, clinically meaningful improvements from baseline in CAT score (reduction ≥ 2) were observed with FF/UMEC/VI (mean change from baseline: week 4, − 1.7; week 24, − 2.7), but not with BUD/FOR (mean change from baseline: week 4, − 1.4; week 24, − 1.7). The treatment differences of − 0.7 and − 0.9 units, respectively, were statistically significant (Fig. 3). In the smaller EXT population, improvements from baseline were also observed at weeks 4, 24, and 52 with both treatments, numerically in favor of FF/UMEC/VI, but the treatment difference was only statistically significant at week 24 (P = 0.035). The OR of response versus non-response at week 24 (ITT) was statistically significant in favor of FF/UMEC/VI versus BUD/FOR (OR 1.44, P < 0.001). At week 24 (ITT), 53 and 45% of patients receiving FF/UMEC/VI or BUD/FOR responded, respectively, and the NNTB was 11 (95% CI 8–29). At week 52 (EXT), the OR was 1.50 (P = 0.048) and the proportions of responders were 44 versus 35%, respectively [NNTB, 11 (95% CI 5–459)]. Similar findings were observed in SGRQ analyses. At week 24 (ITT), the OR of response versus non-response was statistically significant in favor of FF/UMEC/VI versus BUD/FOR (OR 1.41, P < 0.001). At week 24 (ITT), 50 and 41% of patients receiving FF/UMEC/VI or BUD/FOR responded, respectively, and the NNTB was 12 (95% CI 8–27). At week 52 (EXT), the OR was 1.50 (P = 0.046) and the proportions of responders were 44 versus 33%, respectively [NNTB, 10 (95% CI 5–97)].

Fig. 3

Mean change from baseline in CAT score in the ITT population. BUD/FOR budesonide/formoterol, CAT COPD Assessment Test, CI confidence interval, FF/UMEC/VI fluticasone furoate/umeclidinium/vilanterol, ITT intent-to-treat, LS least squares, SE standard error

Improvements from baseline in SGRQ domain scores were observed in both treatment groups in the ITT population in line with changes in the total score [13], with significantly greater improvements observed with FF/UMEC/VI across all domain scores compared with BUD/FOR at weeks 4 and 24 (Fig. 4). Improvements from baseline in SGRQ domain scores were also observed in both treatment groups at weeks 4, 24, and 52 in the EXT population numerically in favor of FF/UMEC/VI at all time points, except for Symptoms and Impacts domains at week 4 (Table S3 in the online data supplement).

Fig. 4

Mean change from baseline in SGRQ domain scores in the ITT population. Treatment differences (95% CIs) shown. *P < 0.05; **P < 0.01; ***P < 0.001. BUD/FOR budesonide/formoterol, CI confidence interval, FF/UMEC/VI fluticasone furoate/umeclidinium/vilanterol, ITT intent-to-treat, LS least squares, SGRQ St George’s Respiratory Questionnaire

The FF/UMEC/VI group had significantly greater odds of being in a better versus a worse response category for patient-rated Global Rating of Change in COPD Severity from baseline compared with BUD/FOR at week 24 (ITT) [ordered OR 1.63; 95% confidence interval (CI) 1.37–1.95] and at week 52 (EXT) (ordered OR 2.08; 95% CI 1.42–3.06). All HRQoL findings were observed independent of baseline COPD medication. The relationship between SGRQ and CAT scores was explored post hoc using individual patient data and demonstrated a positive correlation between the changes from baseline in SGRQ total score and in CAT score at week 24 in the ITT population (Fig. 5).

Fig. 5

Change from baseline in CAT score versus change from baseline in SGRQ total score at week 24 in the ITT population. Line represents line of best fit. Fitted regression line using the SAS procedure SGPLOT with a REG statement. BUD/FOR budesonide/formoterol, CAT COPD Assessment Test, FF/UMEC/VI fluticasone furoate/umeclidinium/vilanterol, ITT intent-to-treat, SGRQ St George’s Respiratory Questionnaire
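The line of best fit in Fig. 5 was produced in SAS. As a rough Python equivalent, the sketch below fits an ordinary least-squares line and reports the Pearson correlation. It assumes change in CAT score on the vertical axis and change in SGRQ total score on the horizontal axis. The function name and the data points are hypothetical and are not study data.

```python
from math import sqrt


def fit_line(x, y):
    """Ordinary least-squares fit of y on x.

    Returns (slope, intercept, pearson_r).
    """
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")

    mean_x = sum(x) / n
    mean_y = sum(y) / n

    # Sums of squares and cross-products about the means.
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    if sxx == 0 or syy == 0:
        raise ValueError("x and y must each vary")

    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    pearson_r = sxy / sqrt(sxx * syy)
    return slope, intercept, pearson_r


# Hypothetical changes from baseline (not study data).
sgrq_change = [-12.0, -6.5, -3.0, 0.5, 4.0]
cat_change = [-5.0, -3.0, -1.0, 0.0, 2.0]
slope, intercept, r = fit_line(sgrq_change, cat_change)
print(f"slope={slope:.2f}, intercept={intercept:.2f}, r={r:.2f}")
```

A positive slope and a positive correlation coefficient would correspond to the relationship described above, in which larger improvements in SGRQ total score accompany larger improvements in CAT score.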

Activity Limitation and Inhaler Preference

At baseline, most patients reported some level of COPD-related limitation in activity, with similar proportions in each arm reporting slightly limited or limited activity and only a few patients reporting the extremes of no limitation or very limited activity.

At weeks 24 (ITT population) and 52 (EXT population), compared with baseline, more patients treated with FF/UMEC/VI reported improved activity limitation (‘not limited’ and ‘slightly limited’ categories) on the Global Rating of Activity Limitation (13 and 15% increases) compared with BUD/FOR (5% increases at each week).

Patients receiving FF/UMEC/VI had statistically significantly greater odds of being in a better response category versus a worse category for Change in Global Activity Limitation than those treated with BUD/FOR at week 24 (ITT) (ordered OR 1.58; 95% CI 1.33–1.89) and week 52 (EXT) (ordered OR 2.15; 95% CI 1.46–3.16).

Responses to the daily activity question were not easily interpreted as only 5% (approximately) of patients in each group reported an improvement in the number of days they were able to perform more activities than usual. At baseline in both the ITT and EXT populations, the majority of patients in each treatment arm (55–58%) reported a score of 1 (no effect on activities) and approximately 40% reported a score of 0 (fewer activities than normal).

At the final study visit (week 24, ITT population), of those patients who expressed a preference, more preferred the ELLIPTA® inhaler to the Turbuhaler® (Table 5).

Table 5 Inhaler preferences at week 24 (n = 1810; ITT population)

Discussion

The planned inclusion of reliable and responsive PRO tools and the prespecified analyses of these data in the FULFIL study allow the patient perception of real-life disease impact and severity to be more fully described. In the study, rapid and sustained reductions in daily symptoms assessed by the E-RS: COPD and subscales alongside improvement in HRQoL were demonstrated for both treatment groups. Although it was expected that a reduction in symptoms, particularly dyspnea, would also be reflected in improved activity levels, this was not clearly demonstrated in the daily activity question as very few patients reported the ability to perform more than usual activities. However, when responding to the Global Rating of Change in Activity question, patients in both groups were more likely to report improved activity than reduced activity at week 24. Measurement and monitoring of activity in COPD patients is an emerging area for study; patient motivation and habit are likely to affect the magnitude of reported change [24].

In clinical studies, PRO measures provide insights into patient perception of symptom severity and changes alongside the use of physiological measures and can demonstrate the overall impact of treatment. The use of daily diary measures of symptoms such as E-RS: COPD captures time of data entry and minimizes patient recall issues, which can be seen with periodic assessment. It is also possible to determine the timing of symptom improvement by examining symptom patterns and stability in individual patients. Early changes to HRQoL measures (SGRQ and CAT) at week 4 are mostly driven by the reduction in symptoms, and this is reflected in the E-RS: COPD and TDI focal scores, which also show early improvement. The particular improvements in dyspnea measures for FF/UMEC/VI over BUD/FOR are expected from the addition of a second bronchodilator. The SGRQ symptoms domain and the E-RS: COPD subscales of cough and sputum and chest symptoms improve more slowly, as do the SGRQ domains of activity and impacts, which are less immediately or directly affected by bronchodilation.

In FULFIL, both treatments produced improvements in HRQoL at weeks 4 and 24, seen in improvements in both CAT scores reported here and SGRQ total scores (previously reported: FF/UMEC/VI, − 4.2 and − 6.6; BUD/FOR, − 2.5 and − 4.3, respectively) [13]. Mean changes from baseline in the FF/UMEC/VI group were statistically superior to those in the BUD/FOR group for both CAT and SGRQ total scores. However, whilst the SGRQ total score is an accepted measure of HRQoL in clinical research, the COPD-specific SGRQ is a difficult tool to use and interpret in daily clinical practice. In development, CAT demonstrated similar measurement properties to the SGRQ [25, 26], and in FULFIL we examined this relationship in a post hoc analysis of the ITT population. CAT is a measure that can be easily incorporated into routine clinical practice with limited additional burden on patients and healthcare professionals. This makes the results of FULFIL more immediately useful to clinicians. Furthermore, because CAT is easy to administer and interpret, it is more likely to be performed in routine care.

When comparing changes in PRO scores between treatment groups, it is important to consider whether the change from baseline within the groups is clinically meaningful, in order to provide context for the mean differences between groups. In a study comparing two active treatments, such as FULFIL, both groups in the study might be expected to improve from baseline; therefore, a responder analysis such as that reported here provides the most meaningful comparison, since achieving a between-groups difference greater than the minimal clinically important difference is not to be expected.

The prespecified analyses of PROs from FULFIL presented here build on the primary findings [13] and further demonstrate that FF/UMEC/VI improved patient perceptions of both symptom severity and health status, compared with BUD/FOR. In addition, these results show that a larger proportion of patients preferred using the ELLIPTA® inhaler compared with the Turbuhaler®. The findings were also observed in the EXT population over 52 weeks, demonstrating that these effects were maintained over the long term.

Conclusions

These results show that once-daily triple therapy with FF/UMEC/VI improves patient symptoms and HRQoL compared with twice-daily BUD/FOR. The inclusion of patient perspectives provides the clinician with additional information to assist choice and monitoring of therapies. The findings presented here further illustrate the benefits of once-daily triple therapy in patients with symptomatic COPD who are at risk of exacerbation.