Key Points for Decision Makers

Pulmonary rehabilitation is recommended for chronic obstructive pulmonary disease patients; most courses are around 2 months in length.

The effectiveness and cost effectiveness of longer term maintenance programmes are unknown.

A 2-h maintenance session at 3, 6 and 9 months after the initial pulmonary rehabilitation programme has a 72.9 % (72.5 %) probability of yielding an incremental cost-effectiveness ratio below £20,000 (£30,000) per QALY gained.

1 Introduction

Chronic obstructive pulmonary disease (COPD), including emphysema and chronic bronchitis, is a major healthcare problem with considerable human and economic costs. It is estimated that there are 3 million people in the UK living with COPD, and in 2005 the annual cost of National Health Service (NHS) treatment was estimated at over £800 million, with a total societal cost (including lost productivity from morbidity and premature mortality) of £2.7 billion [1].

Therapeutic interventions comprise pharmaceutical treatments (e.g. bronchodilators and corticosteroids) and non-pharmaceutical treatment (including pulmonary rehabilitation). Pulmonary rehabilitation (PR) is recommended under current guidelines [2] as a multicomponent programme. A typical programme lasts 8 weeks comprising physical exercise training and disease education with or without nutritional, psychological and behavioural interventions. It is recommended for all patients who consider themselves functionally disabled by COPD [usually Medical Research Council (MRC) dyspnoea scale grade 3 or above] [2, 3].

Although there is convincing evidence that PR offers clinically relevant benefit for patients in the short to medium term (6 months–1 year), all studies have shown that the initial benefits diminish over time [4]. Guidelines highlight the importance of continued exercise following PR [5] and intensive maintenance sessions have shown medium-term benefits [6]; however, the utility of low-intensity maintenance programmes is unclear and the cost effectiveness is not known.

Because resources are finite, it is essential to estimate not only the outcomes associated with interventions but also their costs, so that healthcare resources can be allocated to maximum effect within the available budget. We therefore present the results of cost-effectiveness and cost-utility analyses, conducted alongside a randomised controlled trial (RCT) of a 12-month maintenance schedule versus treatment as usual, following completion of a standard 2-month PR course for patients with COPD. The analysis is conducted from the perspective of the UK NHS over a time horizon of 1 year and reported in a 2012/13 price year. The health outcomes considered are changes in Chronic Respiratory Questionnaire (CRQ) and EuroQol EQ-5D-3L survey responses. Results are presented in terms of incremental cost-effectiveness ratios (ICERs), with decision uncertainty illustrated using cost-effectiveness acceptability curves (CEACs).

2 Methods

2.1 Source Study

Full details of the trial are published elsewhere [7]. In summary, patients with COPD plus a smoking history of more than 20 pack years and a forced expiratory volume in 1 s of <70 % predicted were eligible for entry into this randomised controlled parallel investigator-blind study. Participants were included if they completed at least 60 % of the initial PR programme and as long as they had not had a respiratory infection within 4 weeks of randomisation or other co-morbidities considered severe enough to affect the study outcome, serious pulmonary disease other than COPD, or history of myocardial infarction within 6 months of baseline.

A total of 237 patients received a standard PR course for 2 months. Of these, 148 completed at least 60 % of the programme and were subsequently randomised via a computer-generated random sequence into two groups: 73 receiving a 2-h maintenance session at 3, 6 and 9 months after randomisation, comprising an hour of education and an hour of structured exercise in addition to standard care. The remaining 75 (control group) received standard care alone. Standard care comprised encouragement to continue exercises at the conclusion of the initial 2-month course, and to attend a local support group for people with lung conditions. Groups were broadly comparable at baseline (Table 1), albeit with some differences; for example, patients in the control group on average had a higher EQ-5D utility (0.7 vs 0.6) [7].

Table 1 Baseline characteristics

This study was carried out in primary and secondary care settings in Norfolk, England. Full ethical approval was granted for this study by Cambridgeshire 1 Research Ethics Committee (09/H0304/40) and the study was registered on the clinicaltrials.gov database—identifier NCT00925171.

2.2 Outcome Measures

The primary outcome of the RCT was change in CRQ score [8] over the 12 months following randomisation. The secondary outcome was overall health-related quality of life measured with the EQ-5D-3L [9]. Cost was measured from the perspectives of both the public sector (defined as the sum of NHS and social services costs) and society (defined as the sum of NHS and social services costs, out-of-pocket expenses and lost productivity). Full details of all outcomes are reported elsewhere [7]. A 12-month time horizon was considered a reasonable period over which to observe any differences in primary outcome.

Patients completed the CRQ [8], EQ-5D-3L [9, 10] and resource use questionnaires at baseline (point of randomisation), and at 3, 6, 9 and 12 months. The resource use questionnaire is available as electronic supplementary material. EQ-5D scores were translated to utility scores using health state valuations relevant to the UK population [11]. Integrating utility over time generates an estimate of the quality-adjusted life-years (QALYs) gained over the 1-year time horizon of the study. Change in CRQ was calculated as the difference between baseline (randomisation) and 12 months.
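The integration of utility over time described above can be sketched as an area-under-the-curve calculation. The trapezoidal rule (linear interpolation between assessments) is an assumption here, as the text does not state which integration method was used, and the utility values below are hypothetical rather than trial data.

```python
def qalys_from_utilities(months, utilities):
    """Area under the utility curve, in QALYs, using the trapezoidal rule.

    months: assessment times in months, in ascending order.
    utilities: utility score at each assessment time.
    """
    if len(months) != len(utilities) or len(months) < 2:
        raise ValueError("need matching lists with at least two time points")
    total = 0.0
    for i in range(1, len(months)):
        interval_years = (months[i] - months[i - 1]) / 12.0
        mean_utility = (utilities[i - 1] + utilities[i]) / 2.0
        total += interval_years * mean_utility
    return total


# Hypothetical patient assessed at baseline, 3, 6, 9 and 12 months.
assessment_months = [0, 3, 6, 9, 12]
example_utilities = [0.60, 0.65, 0.70, 0.65, 0.60]
example_qalys = qalys_from_utilities(assessment_months, example_utilities)
print(round(example_qalys, 4))
```

A patient in full health (utility 1.0) throughout would accrue exactly 1 QALY over the 1-year horizon, which is a convenient sanity check on the function.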

2.3 Resource Use and Cost

Resource use was divided into four categories: NHS, social services, out-of-pocket expenditure and lost productivity (Table 2). All contacts were recorded, whether or not they were considered related to a patient’s COPD. Quantities of NHS and social services resources (except prescribed medications) consumed were multiplied by unit costs extracted from standard UK sources [12–16] (Table 3) and summed to generate total cost per patient. Prescribed medications and doses were reported at initiation of PR, randomisation and the final visit, 12 months following randomisation. Quantities were multiplied by unit costs from the British National Formulary [17]. Lost productivity was measured in terms of wages foregone for either the person with COPD or his/her carer(s). The UK national median hourly wage rate [15] was multiplied by the length of a working day (8 h), and then multiplied by the number of days reported as taken off work by the individual.
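The costing approach described above amounts to multiplying quantities by unit costs and summing, with lost productivity valued as wage rate × 8 h × days off work. A minimal sketch follows; the item names, unit costs and wage rate are hypothetical placeholders, not the values in Table 3 or reference [15].

```python
WORKING_DAY_HOURS = 8  # length of a working day, as stated in the text


def total_resource_cost(quantities, unit_costs):
    """Sum of quantity x unit cost over all resource items for one patient."""
    return sum(qty * unit_costs[item] for item, qty in quantities.items())


def lost_productivity(days_off, hourly_wage):
    """Wages foregone: hourly wage x hours per working day x days off work."""
    return days_off * WORKING_DAY_HOURS * hourly_wage


# Hypothetical unit costs (GBP) and one patient's reported resource use.
example_unit_costs = {"gp_visit": 40.0, "outpatient_visit": 100.0}
example_quantities = {"gp_visit": 3, "outpatient_visit": 1}

example_cost = total_resource_cost(example_quantities, example_unit_costs)
example_lost = lost_productivity(days_off=2, hourly_wage=11.5)
print(example_cost, example_lost)
```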

Table 2 Resource use categorisation
Table 3 Unit costs

The price year of the study was 2012/13. Costs were not discounted as the time horizon of the study was 1 year.

2.4 Analysis

Results are reported as point estimate cost and outcome (QALYs and change in CRQ) per patient in each group, increments and ICERs from the perspective of the NHS and social services (i.e. public sector). Due to poor completion of questions relating to lost productivity, cost-effectiveness analysis from the perspective of society was omitted, although we present data on out-of-pocket costs, lost productivity and overall societal costs within the results tables. 95 % confidence intervals around increments, and the CEACs, were calculated using a non-parametric bootstrap (resampling with replacement, 1000 iterations). We also report an analysis using least-squares regression, adjusting for baseline covariates (cost and quality of life during the PR period), with missing data handled by multiple imputation [18]. Imputation used the ‘ICE’ command in STATA to impute costs, utility and CRQ score at each time point, with baseline demographics as covariates and ten replicated datasets (STATA do file available on request from the corresponding author). Analysis was performed blind to treatment allocation.
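The bootstrap step can be illustrated with a short sketch. It resamples patients with replacement within each arm, takes percentile confidence intervals, and reads the CEAC as the share of replicates with positive net monetary benefit at each threshold. The percentile method and the net-monetary-benefit formulation are assumptions (the text does not specify them), the data are synthetic, and the sketch omits the regression adjustment and multiple imputation.

```python
import random


def mean(values):
    return sum(values) / len(values)


def bootstrap_increments(treat, control, n_boot=1000, seed=0):
    """treat/control: lists of (cost, qaly) tuples, one per patient.
    Returns a list of (delta_cost, delta_qaly) bootstrap replicates."""
    rng = random.Random(seed)
    replicates = []
    for _ in range(n_boot):
        # Resample each arm with replacement, keeping cost-QALY pairs intact.
        t = [rng.choice(treat) for _ in treat]
        c = [rng.choice(control) for _ in control]
        d_cost = mean([p[0] for p in t]) - mean([p[0] for p in c])
        d_qaly = mean([p[1] for p in t]) - mean([p[1] for p in c])
        replicates.append((d_cost, d_qaly))
    return replicates


def percentile_ci(values, level=0.95):
    """Percentile confidence interval from bootstrap replicates."""
    ordered = sorted(values)
    lower = ordered[int((1 - level) / 2 * (len(ordered) - 1))]
    upper = ordered[int((1 + level) / 2 * (len(ordered) - 1))]
    return lower, upper


def prob_cost_effective(replicates, threshold):
    """Share of replicates with positive net monetary benefit."""
    positive = sum(1 for dc, dq in replicates if threshold * dq - dc > 0)
    return positive / len(replicates)


# Synthetic (hypothetical) per-patient (cost, QALY) pairs; not trial data.
_rng = random.Random(42)
maintenance = [(_rng.gauss(2000, 800), _rng.gauss(0.62, 0.2)) for _ in range(73)]
control = [(_rng.gauss(2200, 800), _rng.gauss(0.60, 0.2)) for _ in range(75)]

reps = bootstrap_increments(maintenance, control)
cost_ci = percentile_ci([dc for dc, _ in reps])
ceac = {wtp: prob_cost_effective(reps, wtp) for wtp in (0, 10000, 20000, 30000)}
print(cost_ci, ceac)
```

Resampling whole (cost, QALY) pairs rather than costs and QALYs separately preserves the within-patient correlation between the two, which is what gives the scatterplot on the cost-effectiveness plane its shape.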

3 Results

3.1 Data

Overall, patients were well matched at baseline across the two groups, and no significant differences between any outcome measures considered were detected at baseline [7]. Of 148 patients randomised, data completeness was 71 %. However, due to the pattern of missingness, complete data were available for only 40 patients for cost-effectiveness analysis and 41 for cost-utility analysis. This was due to a small number of data items missing in a large number of observations: 30 % of observations had no missing data and 70 % had three or fewer data items missing. Of the 148 patients randomised, 86 % provided responses and recorded resource use at the point of randomisation and at 12 months. The use of multiple imputation was therefore judged of value in this analysis.

3.2 Intervention Cost

Each PR maintenance session required two nurses (£52 per h [13]), one physiotherapist (£34 per h [13]) and one occupational therapist (£34 per h [13]) for 2 h. Assuming a group size of 24 (the size of groups in the trial), the mean cost per participant was therefore estimated at £14.33 per session, or £43 for all three over the 12-month study duration.
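The arithmetic behind these figures can be reproduced directly from the staffing and group-size values given above; the only assumption in this sketch is that staff time is the sole component of the session cost.

```python
# Hourly rates (GBP) and staff numbers per session, as stated in the text.
staff_hourly_rates = {"nurse": 52.0, "physiotherapist": 34.0, "occupational_therapist": 34.0}
staff_counts = {"nurse": 2, "physiotherapist": 1, "occupational_therapist": 1}

SESSION_HOURS = 2   # duration of each maintenance session
GROUP_SIZE = 24     # participants per group in the trial
SESSIONS = 3        # sessions at 3, 6 and 9 months

# Staff cost of running one session for the whole group.
session_staff_cost = SESSION_HOURS * sum(
    staff_hourly_rates[role] * staff_counts[role] for role in staff_counts
)
cost_per_participant_session = session_staff_cost / GROUP_SIZE
cost_per_participant_total = cost_per_participant_session * SESSIONS

print(session_staff_cost)                      # 344.0
print(round(cost_per_participant_session, 2))  # 14.33
print(round(cost_per_participant_total, 2))    # 43.0
```

Because the cost per participant scales inversely with group size, smaller groups in practice would raise the per-patient intervention cost proportionally.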

3.3 Other NHS Resource Use and Costs

There was very little difference in NHS resource use between the groups (Table 4). Complete case analysis (CCA) summaries of costs are presented in Table 5. Adjusting for baseline covariates and missing data, the intervention was observed to lead to a mean reduction in NHS costs of £204.04 per patient over the 12-month time horizon (Table 5 final column, Table 6). This difference was not statistically significant (95 % CI −£1522 to £1114).

Table 4 Resource use quantities
Table 5 Summary costs
Table 6 Cost effectiveness (£/CRQ) and cost utility (£/QALY) analyses from the public sector (NHS and social services) perspective

3.4 Outcomes

The 12-month maintenance programme was observed to lead to a mean incremental change in CRQ of −0.007 (i.e. a slight deterioration relative to control) and a gain of 0.015 QALYs (adjusted and imputed results, Table 6). These differences were not statistically significant (95 % CI around change in CRQ, −0.461 to 0.447 and for QALYs, −0.050 to 0.079).

3.5 Cost Effectiveness

Point estimate results suggest the PR maintenance scheme on average leads to lower costs but also poorer CRQ outcomes (Table 6). Therefore, the cost saving for each point of CRQ forgone is £29,295. On average, we observed a positive incremental QALY gain, meaning the intervention was dominant (less costly and more effective). Considering decision uncertainty, a scatterplot of cost–QALY pairs (Fig. 1) suggests that PR maintenance is approximately equally likely to be cost saving or cost incurring (there is a more or less equal spread of points north and south of the x-axis), but with a slightly higher probability of generating a positive incremental QALY gain (more of the points are to the east of the y-axis). These observations are reflected in the confidence intervals around incremental cost and outcomes (Table 6, adjusted analyses) and the CEAC (Fig. 2), suggesting a 72.88 % probability of cost effectiveness at a threshold of £20,000 per QALY, declining slightly to 72.52 % at a £30,000 threshold.
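How a pair of increments maps to "dominant", "dominated" or a ratio can be sketched as follows. The function name and labels are illustrative choices. The inputs are the rounded adjusted increments reported in the text, so the CRQ ratio comes out at about £29,149 rather than the reported £29,295; the difference presumably reflects rounding of the published increments.

```python
def interpret_increment(delta_cost, delta_effect):
    """Return (label, ratio) for a point on the cost-effectiveness plane.

    ratio is None when one option dominates or there is no effect difference.
    """
    if delta_effect > 0 and delta_cost <= 0:
        return "dominant", None      # less costly, more effective
    if delta_effect < 0 and delta_cost >= 0:
        return "dominated", None     # more costly, less effective
    if delta_effect == 0:
        return "no difference in effect", None
    ratio = delta_cost / delta_effect
    if delta_effect > 0:
        return "more costly, more effective", ratio
    # South-west quadrant: the ratio is the saving per unit of effect forgone.
    return "less costly, less effective", ratio


# Rounded adjusted increments reported in the text (maintenance vs control).
qaly_result = interpret_increment(delta_cost=-204.04, delta_effect=0.015)
crq_result = interpret_increment(delta_cost=-204.04, delta_effect=-0.007)
print(qaly_result)
print(crq_result[0], round(crq_result[1]))
```

In the south-west quadrant the decision rule is reversed: the intervention is favoured only if the saving per unit of effect forgone exceeds the threshold, which is why the ratio alone is not enough without knowing the quadrant.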

Fig. 1 Scatterplot of cost–QALY pairs. QALYs quality-adjusted life-years, £NHS cost to the National Health Service

Fig. 2 Cost-effectiveness acceptability curve. QALY quality-adjusted life-year

4 Discussion

4.1 Interpretation of Results

A strict decision-theoretic interpretation of the results (where decisions are based on point estimates alone) would be that, on average, our results suggest PR maintenance is less expensive, but yields worse outcomes on one measurement scale (CRQ) but better outcomes on another (QALYs). One approach to resolve this contradiction is to define a preferred analysis; as QALYs are the more generic health outcome measure (and also allow broad comparison across disease areas), this may be the more useful analysis. Based on our findings, one would conclude that maintenance of PR ‘dominates’ control as it is both less expensive and more effective. An alternative interpretation based strictly on the rules of statistical inference would argue that there is no statistically significant difference in either cost or outcomes (conventionally defined as a 95 % confidence interval that excludes zero), and therefore it is not possible to conclude that one course of action is more cost effective than another.

However, a compromise between these two interpretations would observe that, whilst neither incremental costs nor outcomes were statistically significant at a 95 % confidence level, incremental QALY gain is more likely than not to be positive, whilst there is equal probability of a cost increase or decrease (columns adjusted analyses, incremental costs and outcomes, Table 6). Formally quantifying this into uncertainty in cost effectiveness, there is a 72.52 % probability that PR maintenance is cost effective at a willingness-to-pay threshold of £30,000 per QALY (final column, Table 6). Whether this is sufficient certainty to recommend adoption of maintenance therapy depends on the attitude to risk of the decision maker (assuming a £30,000 threshold is considered the upper limit of cost effectiveness).

The finding of no significant difference in outcome at 12 months was disappointing. Whilst we observed a significant improvement in CRQ scores amongst completers over the initial 2-month programme [7], this improvement was not maintained in either randomisation group. A possible explanation for this is that the maintenance regimen was either not of sufficient intensity or was not commenced early enough to effect a significant or clinically relevant change. In addition, the adherence to our planned maintenance regimen was poor (52 % of individuals completing all planned sessions), although the clinical efficacy in this group of individuals (the per protocol analysis) was similar to that in the group as a whole (the intention-to-treat analysis) [7]. It is possible that a more intensive regimen may be more effective, but at increased cost. However, estimating this is beyond the scope of this study. It should be noted that only 62 % of eligible patients enrolled to the initial PR course were able to complete at least 60 % of it (and thus meet the entry criteria to this study). We have no reason to suspect this may be any different from the completion rate observed in practice, thus our results should be generalisable. However, this does raise questions as to whether it may be more efficient to encourage completion of existing courses, rather than exploring maintenance regimens. Again, exploration of this is beyond the scope of this study.

4.2 Comparison with Other Studies

At the time of writing, we were unable to identify any previous studies of the cost effectiveness of group-based approaches to maintenance of PR on conclusion of the initial scheme [3]. The most similar study was an evaluation of a low intensity (once weekly) PR course over a 26-week time horizon in Germany. This study suggested evidence of a clinically beneficial effect over this time period, at a staff cost of no more than €625 (price year unknown) [19], although other healthcare costs (e.g. other health service contacts) were not measured. Although not entirely comparable as CRQ was not used as an outcome measure, another study from the Netherlands investigated cost effectiveness and cost utility of a community-based 20-month management scheme following 4 months of intensive PR [20]. During the maintenance period of 20 months, patients made monthly visits to a physiotherapist, four visits to a nutritionist, various visits to a respiratory nurse, and up to six physical training sessions. The ICER was estimated to be €32,425 (price year 2007) per QALY, and the probability of cost effectiveness at willingness to pay of €20,000 per QALY was estimated at 33 % (67 % at €50,000). From this, the authors concluded that the intervention was moderately cost effective.

There are, however, a number of studies of the cost effectiveness of variants of the initial PR programme in patients with COPD.

In 2001, Griffiths et al. [21] estimated that at 1 year after a 6-week outpatient PR programme in Wales, UK, incremental costs per patient (NHS costs plus patient out-of-pocket expenditure on travel) in the intervention group were lower than in the control (standard care without PR), but outcomes were also superior, with a high probability of cost effectiveness irrespective of the threshold. This study provided the major economic evidence supporting NICE’s 2010 (and current) recommendation in favour of provision of PR ‘for all who need it’ [2].

In contrast, a more recent study by Gillespie et al. [22] reported the cost effectiveness in an Irish setting of an 8-week structured PR programme for COPD patients similar to that given to all patients in our trial, with a follow-up at 22 weeks. Costs included those associated with delivering the intervention, other primary and secondary healthcare and social services contacts and prescribed medications, as well as private costs to patients (time and travel expenses). The authors estimated that the intervention yielded an incremental cost of €472,000 (£369,000) (price year 2009) per QALY gained, considerably above any ‘reasonable’ threshold.

Other studies have compared alternative means to deliver PR. An RCT-based study of a 6-week programme of hospital versus community-based PR, with and without telephone follow-up in patients with COPD, evaluated costs and outcomes at 18 months [23]. The authors found a 50 % probability of cost effectiveness of hospital versus community PR at a £30,000 per QALY threshold. Telephone follow-up appeared to improve outcomes at reasonable cost in the community-based group but not the hospital group.

4.3 Strengths and Weaknesses

This analysis is based on a well conducted RCT with a relatively long (12-month) time horizon, adhering to recognised standards for the conduct and reporting of economic evaluations [24].

A potential weakness of the study was the capture of data from questionnaires rather than source databases such as GP, secondary care or social care records. Collecting data in this way allows the data to be assembled in a consistent form, reducing the analytic burden somewhat, but risks introducing recall bias. Evidence suggests that patient recall for salient events such as secondary care is relatively reliable, whereas recall for primary care contacts is less so [25], although still within acceptable bounds when respondents are only asked to recall frequency of contact and not duration [26] (as was the case in this study). In our study, secondary care was a bigger cost driver than primary care, thus limiting the impact of any recall bias. There is also no reason to believe that recall bias may be greater in one arm than in the other.

Of note is the seeming contradiction between the CRQ and QALY results, and furthermore the difference in the results between the ‘raw’ CCA, and the imputed analysis, adjusted for baseline characteristics. Given the wide confidence intervals, this is very likely a chance finding. However, it is also plausible that the EQ-5D instrument is insensitive in this population, or conversely that the EQ-5D picks up broader improvements in quality of life not captured by the CRQ. It should be noted that the CCA will be subject to greater sampling uncertainty due to exclusion of observations for which there are incomplete responses. We employed a very strict definition of complete case, such that patients with any cost or outcome data missing at any time point were excluded. As a result, this analysis excludes a large quantity of relevant data. It should also be noted that QALYs are particularly sensitive to baseline imbalances in utility [27]. A priori, therefore, the adjusted, imputed analysis represents a preferable interpretation of the data, and as explained above, QALYs could be considered a preferred outcome measure over CRQ. However, for ease of computation, we used a simple ordinary least squares (OLS) model to adjust for baseline characteristics. The difference in the results between the CCA and imputed analyses suggests caution should be exercised in interpretation.

A major limitation of this study was the poor quality and quantity of data relating to lost productivity; these data were missing in most cases. A reason for this could be respondent fatigue: these questions appeared at the end of a lengthy battery of measures. Future methodological work should focus on improving phrasing of the required questions whilst minimising burden on respondents.

The time horizon of the study was 1 year. The ideal time horizon for an economic evaluation is one long enough to observe any changes in incremental cost or outcomes. We were limited to a within-trial study in this case. If there is expected to be a difference in incremental cost or outcomes beyond this time horizon, then decision modelling may be of value to project observed costs and outcomes to the future.

5 Conclusion

Whilst, on average, our results suggest PR maintenance is cost effective from the perspective of the NHS, this is based on there being a very small increase in QALYs gained but no difference in NHS costs. The wide confidence intervals around incremental costs and outcomes are reflected in our finding that there is a 73 % probability that the ICER is below £30,000 per QALY gained.

Future research should focus on exploring whether more intense maintenance regimens are able to offer a benefit to patients at reasonable cost.