Background

Chronic obstructive pulmonary disease (COPD) is characterized by persistent airway obstruction related to chronic inflammatory responses in the lungs, with symptoms including disabling dyspnea, fatigue, and persistent cough with excessive sputum. Exacerbations are characterized by a sustained acute worsening of respiratory symptoms beyond daily fluctuations, which leads to changes in medication use. Due to the disease symptoms, COPD patients often have a reduced capacity for physical activity, and this may worsen potential systemic manifestations of the disease, such as cardiovascular and psychiatric comorbidities. The global prevalence of COPD is estimated to be 9.2 % [1], with country-level estimates ranging from 3.9 % in the Netherlands [2] to 20.9 % in the US [3]. Therefore, COPD presents a major clinical and humanistic burden [4], despite the availability and use of standard treatments, which aim to relieve symptoms and slow disease progression [5].

This heavy disease toll inevitably focuses interest on how patients are treated and the extent to which medications produce meaningful benefits. Assessment of such value in clinical trials has traditionally relied on measures of lung function (such as forced expiratory volume in one second [FEV1]), symptom control, health status, and rates of exacerbation over a period of up to one year. Exacerbations are a particularly important marker, not least because they are a key driver of health resource use (HRU), such as emergency department visits, antibiotic use and hospitalization. For example, a single exacerbation can cost upwards of $7,000, depending on its severity and whether the patient is hospitalized [6]. Unsurprisingly, payers tend to focus on this outcome in their formulary considerations, with the expectation that decreased exacerbation rates will likely result in lower costs for their plan.

The clinical and economic importance of exacerbations in COPD invites questions about their inter-relationship with other well-established measures of treatment effect. These include, for example, persistent and/or uncontrolled disease symptoms and health status as measured by the St. George’s Respiratory Questionnaire (SGRQ), which captures symptoms, impact on patient well-being, and activities of daily living. Additionally, clinically relevant improvements in lung function measures such as FEV1 are often required by regulators for certain drug approval processes. Of note, previous studies have looked at the link between FEV1 and SGRQ score [7, 8], but the relationship of these measures to longer-term outcomes, such as exacerbations and HRU, is not well known or widely accepted, and this may account for why they have received comparatively less consideration from clinicians and payers.

Against this background, the current study aimed to investigate the relationship between changes in FEV1 and SGRQ score and economically significant outcomes of exacerbations and HRU, by conducting a systematic literature review (SLR) and regression analysis of relevant studies of pharmacological interventions for COPD. The results of this analysis will help the interpretation of clinical trial results and provide insights into whether or how the effects of COPD treatment seen in such studies relate to long-term clinical benefits.

Methods

Literature review

Search strategy

We systematically reviewed literature indexed in MEDLINE (via PubMed), Embase, and the Cochrane Central Register of Controlled Trials (CENTRAL) and published from January 1, 2002 through October 1, 2014. The search algorithms used keywords for COPD paired with terms for the endpoints of interest: SGRQ, FEV1, exacerbations, and HRU. Searches were limited to clinical trials in humans published in English.

Study selection

Following the literature search, all titles and abstracts identified from MEDLINE, Embase, and CENTRAL were manually reviewed against the inclusion and exclusion criteria using PICOS (Patient, Interventions, Comparisons, Outcomes, Study Design)-related elements. Studies were required to report on at least 20 adult COPD patients, to evaluate pharmacologic treatments labeled for or intended for use as treatment of COPD with any comparator treatment, to report mean change in either FEV1 or SGRQ score and either COPD exacerbations or any HRU endpoint, and to be a randomized controlled trial (RCT). A single investigator screened all abstracts identified through the searches, according to the specified inclusion and exclusion criteria. The full-text articles of studies that passed abstract screening were retrieved for further review. Full-text screening was conducted by a single investigator using the same inclusion and exclusion criteria that had been applied at the abstract level. All excluded studies were confirmed by a second, senior investigator, and any discrepancies between the two investigators were resolved by involvement of a third investigator.
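The inclusion criteria above can be read as a simple conjunction of conditions. The following is a minimal sketch of that logic only; the `StudyRecord` class and all of its field names are hypothetical and are not taken from the review protocol.

```python
from dataclasses import dataclass


@dataclass
class StudyRecord:
    # Hypothetical fields mirroring the PICOS-style criteria described in the text.
    n_adult_copd_patients: int
    pharmacologic_copd_treatment: bool
    reports_fev1_or_sgrq_change: bool
    reports_exacerbation_or_hru: bool
    is_rct: bool


def is_eligible(study: StudyRecord) -> bool:
    """Return True only if every stated inclusion criterion is met."""
    return (
        study.n_adult_copd_patients >= 20
        and study.pharmacologic_copd_treatment
        and study.reports_fev1_or_sgrq_change
        and study.reports_exacerbation_or_hru
        and study.is_rct
    )


# Hypothetical records, for illustration only.
large_rct = StudyRecord(120, True, True, True, True)
small_pilot = StudyRecord(15, True, True, True, True)
```

Here `is_eligible(large_rct)` is true, while `small_pilot` fails the minimum sample size of 20 patients.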

Data extraction process

The results of all accepted studies identified as part of the SLR were extracted by a single investigator trained in the critical assessment of evidence, with validation performed by a senior investigator. Trial quality and risk of bias were assessed during extraction for each included study using the Jadad quality score assessment.

Statistical analysis

The analyses relating measures of FEV1 and SGRQ total score to exacerbations and HRU followed the meta-analysis methods outlined by Johnson et al. [9]. Each trial supplied one or more pairs of data points on the treatment effects of interest. These predictor/outcome pairs from each of the studies were analyzed using sample-size weighted regression analyses, which estimated a regression slope relating the two treatment effects, as well as a confidence interval and a test of statistical significance. In general, the predictor was a relative treatment effect for change in SGRQ or trough FEV1, and the outcome was a log-relative-risk or log-rate for exacerbations. Pre-bronchodilator FEV1 was considered equivalent to trough FEV1 for analysis, while post-bronchodilator measures and unspecified FEV1 measures were not included. Primary analyses were designed to avoid the use of an intercept in the regressions, but fit was superior with an intercept included.
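As an illustration of what a sample-size weighted regression with an intercept computes, the sketch below fits a weighted least-squares line in plain Python. It is not the authors' implementation: the function name, the data values, and the use of total patients per comparison as the weight are all assumptions, and the confidence interval and significance test mentioned above are omitted.

```python
def weighted_linear_fit(x, y, w):
    """Weighted least-squares fit of y = intercept + slope * x.

    x: predictor (e.g. between-arm difference in trough FEV1 change)
    y: outcome (e.g. log relative risk of exacerbation)
    w: non-negative weights (assumed here to be comparison sample sizes)
    """
    total_w = sum(w)
    x_bar = sum(wi * xi for wi, xi in zip(w, x)) / total_w
    y_bar = sum(wi * yi for wi, yi in zip(w, y)) / total_w
    sxy = sum(wi * (xi - x_bar) * (yi - y_bar) for wi, xi, yi in zip(w, x, y))
    sxx = sum(wi * (xi - x_bar) ** 2 for wi, xi in zip(w, x))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope


# Hypothetical predictor/outcome pairs, one per treatment comparison.
fev1_diff = [0.02, 0.05, 0.10, 0.13]
log_rr = [0.05, -0.10, -0.20, -0.35]
n_patients = [400, 900, 3900, 1200]

intercept, slope = weighted_linear_fit(fev1_diff, log_rr, n_patients)
```

A negative `slope` in this setup means that comparisons with a larger FEV1 advantage tend to show a lower relative risk of exacerbation.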

For the analyses of patients experiencing at least one exacerbation, studies were included if they reported on exacerbations of all severities. For analyses of patients experiencing at least one moderate-to-severe exacerbation, studies were included if they reported on exacerbations that required antibiotics, oral corticosteroids (OCS), and/or hospitalization. Data on time to first exacerbation or the number of patients with at least one exacerbation were combined for analysis. COPD exacerbations reported as an adverse event were not included in analysis. All studies reporting data at timepoints ≥24 weeks were eligible for inclusion in the analyses. Separate analyses were conducted for all timepoints ≥24 weeks and ≥48 weeks.
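The sketch below illustrates two of the steps described above under simplifying assumptions: converting patient counts into a log relative risk so that it can sit on the same log scale as a log hazard ratio, and selecting comparisons by follow-up duration for the ≥24-week and ≥48-week analyses. The function names, dictionary keys, and all numbers are hypothetical.

```python
import math


def log_relative_risk(events_trt, n_trt, events_ctl, n_ctl):
    """Log of the ratio of the proportions of patients with >=1 exacerbation."""
    risk_trt = events_trt / n_trt
    risk_ctl = events_ctl / n_ctl
    return math.log(risk_trt / risk_ctl)


def select_by_duration(comparisons, min_weeks):
    """Keep comparisons whose reported time point is at least min_weeks."""
    return [c for c in comparisons if c["weeks"] >= min_weeks]


# Hypothetical comparisons: two report a hazard ratio, one reports patient counts.
comparisons = [
    {"label": "hypothetical A", "weeks": 24, "log_effect": math.log(0.90)},
    {"label": "hypothetical B", "weeks": 48, "log_effect": math.log(0.75)},
    {"label": "hypothetical C", "weeks": 52,
     "log_effect": log_relative_risk(120, 400, 150, 400)},
]

all_timepoints = select_by_duration(comparisons, 24)  # >=24-week analysis
long_term = select_by_duration(comparisons, 48)       # >=48-week analysis
```

In this toy data set all three comparisons enter the ≥24-week analysis, and only B and C enter the ≥48-week analysis.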

Results

Literature review

The literature review identified 67 trials reporting endpoints of interest at timepoints ≥24 weeks that were eligible for inclusion in the regression analysis. Fig. 1 outlines the overall search hits and study attrition during screening and analysis.

Fig. 1 Study Attrition in the Systematic Literature Review

Regression analysis

In the figures representing the analyses, each point represents a study comparison for two effects. For instance, the point in the middle of Fig. 2 is from Bateman et al. [10] and represents their findings in the comparison of tiotropium 5 µg (via the Respimat® inhaler) vs. placebo. In this example, the difference between the two treatments in trough FEV1 change was 0.10, and the hazard ratio (HR) for any exacerbation risk was 0.693 (for a log-HR of -0.37). Each study with two arms (one treatment comparison, e.g. treatment A vs. treatment B) and with sufficient data contributed one data point to the analysis; studies with three arms (two treatment comparisons, e.g. A vs. B and A vs. C) contributed two data points.
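The arithmetic in this example can be checked with a short sketch. The helper names are hypothetical, and the rule that a study contributes one comparison per non-reference arm is an assumption generalized from the two-arm and three-arm cases described above.

```python
import math


def log_hazard_ratio(hazard_ratio):
    """Natural log of a hazard ratio, the scale used for the outcome axis."""
    return math.log(hazard_ratio)


def n_data_points(n_arms):
    """Comparisons against a shared reference arm (assumed: n_arms - 1)."""
    return n_arms - 1


bateman_log_hr = log_hazard_ratio(0.693)  # rounds to -0.37, as quoted in the text
```

With this rule, a two-arm study yields `n_data_points(2) == 1` and a three-arm study yields `n_data_points(3) == 2`.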

Fig. 2 Relationship between Mean Change in Trough FEV1 and Relative Risk for Any Exacerbation

Any given slope can be interpreted by determining what difference between treatments in log-exacerbation risk one would expect given the difference in trough FEV1 change. The predicted log-relative-risk of exacerbation in studies like Bateman 2010 is:

$$ \ln \left(\mathrm{RR}_{\mathrm{AnyExacerbation}}\right) = \mathrm{Intercept} + \mathrm{Slope} \times \left(\mathrm{Difference}\ \mathrm{in}\ \mathrm{trough}\ {\mathrm{FEV}}_1\ \mathrm{change}\right). $$

Or

$$ \ln \left(\mathrm{RR}_{\mathrm{AnyExacerbation}}\right) = 0.14 - 3.56 \times 0.10 = -0.22. $$

As exp (-0.22) = 0.80, we can predict that the relative risk of exacerbation in studies like Bateman 2010 will be 20 % lower for active treatment than for control. As noted above and in the plot, in Bateman 2010 the relative risk of any exacerbation was actually slightly lower than this value (0.693).
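The same prediction can be reproduced in a few lines. This is a sketch using the intercept (0.14) and slope (-3.56) quoted in the text; the function and variable names are hypothetical. Without intermediate rounding the log value is -0.216 rather than -0.22, so the predicted relative risk comes out marginally above 0.80.

```python
import math

INTERCEPT = 0.14   # intercept used in the worked example in the text
SLOPE = -3.56      # slope for trough FEV1 vs. any exacerbation


def predicted_relative_risk(fev1_diff, intercept=INTERCEPT, slope=SLOPE):
    """Predicted relative risk of any exacerbation for a given
    between-arm difference in trough FEV1 change."""
    return math.exp(intercept + slope * fev1_diff)


predicted = predicted_relative_risk(0.10)  # FEV1 difference from the example
observed = 0.693                           # hazard ratio reported for the same trial
```

Comparing `predicted` with `observed` shows the point sitting below the fitted line, consistent with the observation that the trial's actual relative risk was lower than predicted.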

Relationships with exacerbations at ≥48 weeks

Forced Expiratory Volume in One Second (trough FEV1)

Mean Change in Trough FEV1 and COPD Patients’ Risk for Any Exacerbation

The relationship between relative treatment effects on change in FEV1 and any exacerbation was of moderate strength and was statistically significant (slope: -3.56, p = 0.0001; Fig. 2) when defining the exacerbation outcome as time to first exacerbation or the number of patients with at least one exacerbation. No relationship was found (slope: 0.078, p = 0.9199) between treatment effects on FEV1 and annualized exacerbation rate. Figure 2 plots the relationship between the mean difference in trough FEV1 and relative risk for any exacerbation and Table 1 shows the raw trial data contributing to this analysis.

Table 1 Study Data for Trials Reporting Mean Change in Trough FEV1 and Patients Experiencing Any Exacerbation

Mean Change in Trough FEV1 and COPD Patients’ Risk for Moderate-to-Severe Exacerbations

The relationship between relative treatment effects on change in FEV1 and moderate-to-severe exacerbations was of moderate strength and was statistically significant (slope: -1.46, p = 0.045; Fig. 3) when defining the exacerbation outcome either as time to first exacerbation, the number of patients with at least one exacerbation, or as annualized exacerbation rates. Figure 3 shows the relationship between the mean difference in trough FEV1 and the relative risk for a moderate-to-severe exacerbation. Table 2 shows the raw trial data contributing to this analysis.

Fig. 3 Relationship between Mean Change in Trough FEV1 and Risk for a Moderate-to-Severe Exacerbation

Table 2 Study Data for Trials Reporting Mean change in FEV1 and Patients Experiencing Moderate-to-Severe COPD Exacerbation

St. George’s respiratory questionnaire

Mean Change in SGRQ Total Score and COPD Patients’ Risk for Any Exacerbations

The relationship between relative treatment effects for change in SGRQ score and any exacerbation was of moderate strength and was statistically significant (slope: 0.112, p = 0.0002; Fig. 4) when defining the exacerbation outcome as time to first exacerbation or the number of patients with at least one exacerbation. The relationship was weaker and not statistically significant (slope: 0.014, p = 0.2825) when examining annualized exacerbation rates. Figure 4 shows the relationship between the mean difference in SGRQ score and relative risk for any exacerbation and Table 3 shows the raw trial data contributing to this analysis.

Fig. 4 Relationship between Mean Change in SGRQ Total Score and Risk for Any Exacerbation

Table 3 Study Data for Trials Reporting Mean change in SGRQ Total Score and Patients Experiencing Any COPD Exacerbation

Mean Change in SGRQ Total Score and COPD Patients’ Risk for Moderate-to-severe Exacerbations

The relationship between relative treatment effects for change in SGRQ score and a moderate-to-severe exacerbation was of moderate strength and was statistically significant when defining the exacerbation outcome as either the number of patients with at least one exacerbation (slope: 0.046, p = 0.0279, Fig. 5) or as an annualized exacerbation rate (slope: 0.056, p = 0.0024, figure not shown). Figure 5 shows the relationship between the mean difference in SGRQ score and the relative risk for a moderate-to-severe exacerbation and Table 4 shows the raw trial data contributing to this analysis.

Fig. 5 Relationship between Mean Change in SGRQ Total Score and Risk for a Moderate-to-severe Exacerbation

Table 4 Study Data for Trials Reporting Mean change in SGRQ Total Score and Patients Experiencing Moderate-to-severe COPD Exacerbation

Relationship between FEV1 and SGRQ and Hospitalized COPD Exacerbations

There were insufficient data to analyze association with all-cause hospitalizations, and the annualized and patient-level data were combined for the analysis of hospitalizations due to exacerbations. Additionally, relative effects for the number of patients with an exacerbation were combined with annualized exacerbation rates to facilitate analyses.

FEV1 and SGRQ

For both SGRQ score and FEV1, the plots indicate a somewhat weaker relationship with exacerbations resulting in hospitalization (compared to the findings for exacerbations overall). Results were not statistically significant for either relationship (FEV1 slope: -1.49, p = 0.174 [Fig. 6]; SGRQ slope: 0.0518, p = 0.126 [Fig. 7]).

Fig. 6 Relationship between Mean Change in FEV1 and Risk for Hospitalization

Fig. 7 Relationship between Mean Change in SGRQ and Risk for Hospitalization

Impact of including all timepoints ≥24 weeks

Expanding the data set from outcomes reported at ≥48 weeks to include outcomes reported at ≥24 weeks showed similar directionality but weaker results compared with the long-term analysis for both SGRQ score and FEV1 (data not shown).

Discussion

Our systematic literature review and regression analysis demonstrated that beneficial mean change in either FEV1 or SGRQ total score was associated with a lower risk for exacerbations. Specifically, it showed that in randomized trials of COPD drug treatments lasting ≥48 weeks, there was generally a relationship between relative efficacy in improving FEV1 and SGRQ total score and relative efficacy for lowering exacerbation risk. The majority of analyses showed the same trend towards a relationship between positive changes in FEV1 and SGRQ score and exacerbation risk, even though results did not always reach statistical significance. Of note, there was no relationship shown between mean change in FEV1 and annualized exacerbation rate, despite this relationship being moderate and statistically significant when the risk of experiencing at least one exacerbation in patients was analyzed. The mean change in SGRQ total score was not significantly related to the rate of exacerbations across all severities but had a moderate, statistically significant relationship with the rate of moderate-to-severe exacerbations. The relationship between FEV1 and SGRQ score and hospitalizations was less clear, and further research is needed in this area.

To our knowledge, the literature review and regression analysis we conducted is the first such study to evaluate the inter-relationship that health status and lung function have with exacerbation risk. It provides a more rigorous examination of the relationship between laboratory values and exacerbations than has been done in the past because, unlike earlier studies, it correlates relative treatment effects instead of absolute ones, thus lowering the possibility of ecological bias. However, as this analysis used only aggregated patient data from published trials, we cannot assume that any statistical association observed between arm-level variables translates to patient-level associations. Therefore, our findings cannot be used to predict any outcome at the patient level. Additionally, our analysis may be limited by the available data for the surrogate measures, given that the trials reported FEV1 in several different ways. Since our analysis was limited to trough or pre-bronchodilator FEV1 data, analysis using other measures of FEV1 could yield different results. Similarly, regarding exacerbation severity, we categorized exacerbations based on the definitions reported by study authors using a standardized approach as defined in our Methods section. However, in some cases definitions were not reported, so we relied on author-defined groupings of any or moderate-to-severe exacerbations.

Our research may have important implications for regulatory assessment of drugs intended to help reduce the risk of exacerbations in COPD and, in particular, the evidence considered in such deliberations. Currently, to gain marketing approval for this indication, such treatments have to be tested in long-term, parallel trials, which represent a logistical and economic burden on the sponsoring organization. Because of this, few trials of COPD drugs are powered to identify a significant reduction in the risk of exacerbations. It is for this reason that, to date, very few drugs have been approved for reducing exacerbations on the basis of prospective 1–2 year parallel trials, usually in patients with a history of acute exacerbations in the prior year. Our study suggests that changes in FEV1 and SGRQ might serve as reliable surrogate markers of patients’ likelihood of experiencing an exacerbation. If so, these measures could allow future trials to be shorter and more manageable while still offering key insights into treatments’ longer-term efficacy. Since exacerbations can be costly to health plans, payers should consider the effect of medications on these surrogate markers, even when long-term RCTs cannot be carried out. Also, confirmation of our results would broaden the application of data already available from published shorter-term studies. This is especially important since the trials used to inform regulatory approval were powered on each specific drug’s expected effect on the acute exacerbation rate and all but one [11] were small and had very selective entry criteria. This contrasts with the trials contributing data for our review and analysis, since these were broader and more inclusive (e.g. with regard to disease duration and reversibility, comorbidities, interventions, and concomitant therapies) and collectively more representative of the general COPD population seen in everyday clinical practice. Therefore, these collated data sources potentially allow more generalizable conclusions to be drawn regarding whether or how standard short-term endpoints assessed in trials relate to effects on exacerbations.

Conclusions

In conclusion, this study demonstrates a significant association between improvements in FEV1 and SGRQ total score and lower risk for COPD exacerbations. We believe that the results of our study offer providers and payers a more informed picture of the inter-relationship between exacerbations and both FEV1 and SGRQ score, which will aid clinical and formulary decisions while stimulating research questions for future prospective studies.