Key Points for Decision Makers

Recent economic evaluations indicate that new pharmacologic treatments for chronic obstructive pulmonary disease (COPD) have favourable cost effectiveness; however, quality-adjusted life-year (QALY) gains were small, and fewer than half of the studies included a COPD-specific outcome.

Exacerbation and mortality rates were the main drivers of cost effectiveness.

According to the Quality of Health Economic Studies (QHES), the quality of the studies was generally sufficient, but most studies poorly reflected cost effectiveness in real life.

1 Introduction

Chronic obstructive pulmonary disease (COPD) is a progressive lung disease characterized by reduced airflow and increased chronic inflammatory response in the airways due to noxious particles and gases [1]. COPD is mostly diagnosed in people aged ≥40 years. In recent years, its prevalence has become more equally distributed between men and women because of a more equal distribution of smoking and of exposure to outdoor and indoor air pollution [1]. Symptoms of COPD include breathlessness, excessive sputum production, and chronic cough [1]. Exacerbations and comorbidities contribute to the impact of COPD on patients’ quality of life [1]. Therefore, the management of exacerbations and comorbidities is key in the treatment of COPD to prevent further progression [1].

COPD is diagnosed on the basis of symptoms and airflow obstruction, defined as a ratio of forced expiratory volume in 1 second (FEV1) to forced vital capacity (FVC) of <70 %. FEV1 is measured using spirometry and is expressed as a percentage of the expected value. Historically, the severity of COPD was defined solely by lung function variables such as the FEV1. Severity grades included Global Initiative for Chronic Obstructive Lung Disease (GOLD) 1 (FEV1 %predicted >80), GOLD 2 (FEV1 %predicted 50–80), GOLD 3 (FEV1 %predicted 30–50) and GOLD 4 (FEV1 %predicted <30). Since 2011, the severity of COPD has been defined based on a combined assessment of lung function, symptoms and future risk of exacerbations and is classified as GOLD A, B, C and D [2]. GOLD A is the least severe stage, and GOLD D is the most severe stage of COPD with the worst lung function, highest exacerbation risk and most symptoms. Recent studies have shown that the new GOLD classification was more strongly related to clinical outcomes, quality of life and costs than the old GOLD classification [3, 4].
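As an illustration only, the historical spirometric grading described above can be written as a short function. This is a minimal sketch: the function name is hypothetical, and the assignment of values that fall exactly on a cut-off (80, 50 or 30) is an assumption made for the sketch, because the ranges given in the text do not say to which grade a boundary value belongs.

```python
def gold_spirometric_grade(fev1_fvc_ratio: float, fev1_pct_predicted: float) -> str:
    """Return the historical GOLD spirometric grade for one patient.

    fev1_fvc_ratio     -- FEV1 divided by FVC, as a fraction (e.g. 0.65)
    fev1_pct_predicted -- FEV1 as a percentage of the expected value

    Assumption: a value exactly on a cut-off (80, 50, 30) is assigned to
    the milder of the two adjacent grades.
    """
    # Airflow obstruction requires FEV1/FVC < 70 %.
    if fev1_fvc_ratio >= 0.70:
        return "no airflow obstruction"
    if fev1_pct_predicted >= 80:
        return "GOLD 1"
    if fev1_pct_predicted >= 50:
        return "GOLD 2"
    if fev1_pct_predicted >= 30:
        return "GOLD 3"
    return "GOLD 4"
```

The combined GOLD A to D assessment also depends on symptoms and exacerbation risk and is therefore not covered by this sketch.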

Various forms of pharmacological and non-pharmacological treatments are available to decrease symptoms, prevent exacerbations and increase the quality of life of patients with COPD. Non-pharmacological treatments include smoking cessation, exercise, nutrition and pulmonary rehabilitation. The cornerstone pharmacological maintenance treatment consists of the group of bronchodilators: long-acting beta2 agonists (LABA) and long-acting muscarinic antagonists (LAMA). Other pharmacological maintenance treatments include methylxanthines, inhaled corticosteroids (ICS), systemic corticosteroids and phosphodiesterase (PDE)-4 inhibitors. Short-acting beta-agonists and short-acting muscarinic antagonists are primarily used for rapid symptom relief. The GOLD guidelines recommend use of a short-acting bronchodilator for GOLD A, a LABA or LAMA for GOLD B, an ICS + LABA or LAMA for GOLD C and ICS + LABA and/or LAMA for GOLD D. In daily clinical practice, the use of a combination of multiple COPD drugs appears to be increasing [5].

Several novel pharmacotherapies have recently entered the market, such as new long-acting bronchodilators for once-daily dosing (indacaterol, olodaterol), new LAMAs (glycopyrronium, aclidinium), fixed-dose LAMA/LABA combinations (tiotropium/olodaterol, aclidinium/formoterol, umeclidinium/vilanterol, glycopyrronium/indacaterol) and a new fixed-dose combination of LABA/ICS (vilanterol/fluticasone furoate). Many of these novel therapies focus on improving dosing convenience. There is also evidence of synergistic effects, but these are not fully additive [6]. When the law of diminishing returns applies, it may become increasingly difficult to demonstrate that the combination therapies are cost effective, especially since some of the commonly used drugs have gone, or will soon go, out of patent and thus become relatively cheap.

Cost-effectiveness analyses (CEAs) help to provide insight into the balance between incremental costs and incremental effects of a new treatment compared with current standards of care. In 2012, Rutten-van Mölken and Goossens [7] systematically reviewed the cost effectiveness of pharmacological maintenance treatment for COPD. They highlighted that “it is important that future studies improve consistency of study methodology and choice of comparators in order to enable meaningful comparison of study results and that it is necessary that more and longer trial-based cost-effectiveness studies are conducted”. The recommendation regarding the application of consistent methodology was in line with an earlier review from 2008 [8]. Given the recent market entry of several new COPD treatments, an update of the previous reviews is required.

The aim of this paper is to systematically identify the recent literature regarding the cost effectiveness of pharmacological maintenance treatments for COPD, review the quality of the studies and report on their strengths and limitations. We also describe current methodological trends, summarise the main drivers of favourable cost effectiveness of COPD treatment and specifically relate our findings to the conclusions from the previous review [7].

2 Methods

2.1 Search Strategy

The search strategy used to perform the literature search for economic evaluations of COPD treatment was based on the strategy of the previous review from 2012 [7]. In short, we performed a systematic literature search in Embase, PubMed, the UK NHS Economic Evaluation Database (NHS-EED) and EURONHEED (European Network of Health Economics Evaluation Databases). We included all relevant papers published between 1 November 2011 (end date of the previous search) and 31 December 2015.

The search strategies in the individual databases were as follows.

  • Embase: ‘chronic obstructive pulmonary disease’/exp AND ‘cost effectiveness’/exp AND [article]/lim AND ([dutch]/lim OR [english]/lim OR [german]/lim) AND [humans]/lim.

  • PubMed: ((Chronic[All Fields] AND (“lung”[MeSH Terms] OR “lung”[All Fields] OR “pulmonary”[All Fields]) AND obstructive[All Fields] AND (“disease”[MeSH Terms] OR “disease”[All Fields])) AND (“cost-benefit analysis”[MeSH Terms] OR (“cost-benefit”[All Fields] AND “analysis”[All Fields]) OR “cost-benefit analysis”[All Fields] OR (“cost”[All Fields] AND “effectiveness”[All Fields]) OR “cost effectiveness”[All Fields])) AND (“2011/11/01”[PDAT]: “3000”[PDAT]).

  • NHS-EED: ‘chronic obstructive pulmonary disease’ and ‘pharm*’ and ‘economic evaluation’.

  • EURONHEED: ‘treatment’ as the type of intervention, ‘respiratory tract diseases’ as disease and ‘drug’ as keyword.

The titles and abstracts were screened by SvdS and checked by JvB. Based on titles and abstracts, we assessed whether the studies met the following inclusion criteria:

  • Full text available;

  • In English, Dutch or German;

  • An identifiable group of COPD patients;

  • Only original research, no review papers;

  • Full economic evaluations, including costs and effects;

  • Only maintenance treatment drugs; no drugs used for acute exacerbations, no alpha-1 antitrypsin replacement therapy, no vaccination strategies and no non-pharmacological treatments.

Papers that seemed to meet these criteria based on title and abstract were assessed in more detail by two independent reviewers (SvdS and JvB). Discrepancies were resolved by consensus. The reference lists of these papers were also assessed to identify more papers that might meet the criteria above. The systematic review followed the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines [9].

2.2 Data Extraction

The data extracted from the papers included the following main study characteristics, reported by the class of drug assessed (LAMA, LABA, PDE-4 inhibitors, LABA/ICS and LABA/LAMA): first author, year, country, funding, drug therapy described and the comparator(s), difference in costs, difference in outcomes (quality-adjusted life-years [QALYs] gained, life-years [LYs] gained, exacerbation risk or pneumonia risk), incremental cost-effectiveness ratio (ICER), and authors’ conclusions. Other data extracted concerned the study design, time horizon, sensitivity analyses and perspective. The study perspective was described as either societal (including all relevant actual costs, inside and outside the healthcare sector) or healthcare payer (including only healthcare costs). The latter could use either actual costs of resources used or tariffs paid. Data extraction was performed by one author (SvdS) and checked by another author (JvB).
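For readers less familiar with the extracted quantities, the relationship between the difference in costs, the difference in outcomes and the ICER can be sketched as follows. This is an illustrative sketch rather than the method of any included study: the function name, the labels it returns and the example figures are hypothetical, and it follows the common convention that a therapy which is no more costly and no less effective than its comparator is called dominant, so that no ratio is reported.

```python
from typing import Optional, Tuple


def icer(delta_cost: float, delta_effect: float) -> Tuple[str, Optional[float]]:
    """Classify a comparison and, where meaningful, return the ICER.

    delta_cost   -- cost of new therapy minus cost of comparator
    delta_effect -- effect of new therapy minus effect of comparator
                    (e.g. QALYs gained or exacerbations avoided)
    """
    if delta_cost == 0 and delta_effect == 0:
        return ("no difference", None)
    # Cheaper (or equal cost) and at least as effective: dominant.
    if delta_cost <= 0 and delta_effect >= 0:
        return ("dominant", None)
    # More costly (or equal cost) and no more effective: dominated.
    if delta_cost >= 0 and delta_effect <= 0:
        return ("dominated", None)
    # Remaining cases trade costs against effects; delta_effect is non-zero here.
    return ("trade-off", delta_cost / delta_effect)


# Hypothetical figures: 5000 currency units extra for 0.25 QALYs gained.
label, ratio = icer(5000.0, 0.25)
```

In the hypothetical example, the ratio is 20,000 currency units per QALY gained, which would then be compared with a willingness-to-pay threshold.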

2.3 Evidence Summary

Following the narrative description of the studies per drug class, we provide a summary of the evidence. This summary is based on both the evidence from the studies in this review and the studies included in the previous review [7]. Evidence was graded as ‘strong’ (five or more studies with consistent results), ‘moderate’ (three to four studies with consistent results), ‘limited’ (fewer than three studies with consistent results) or ‘inconclusive’ (contrasting results regardless of the number of studies).
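The grading rule above can be read as a simple decision rule. The sketch below is one possible encoding; the function name and the boolean `consistent` flag are hypothetical simplifications, since judging whether results are consistent requires the narrative assessment described in the text.

```python
def grade_evidence(n_studies: int, consistent: bool) -> str:
    """Map a number of studies and a consistency judgement to an evidence level.

    n_studies  -- number of studies (current plus previous review) on the question
    consistent -- True if the studies report consistent results (a judgement call)
    """
    if n_studies < 1:
        raise ValueError("at least one study is needed to grade evidence")
    # Contrasting results are inconclusive regardless of the number of studies.
    if not consistent:
        return "inconclusive"
    if n_studies >= 5:
        return "strong"
    if n_studies >= 3:
        return "moderate"
    return "limited"
```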

2.4 Quality Assessment

We considered the following checklists for systematic assessment of the quality of the papers: the Phillips checklist, the Quality of Health Economic Studies (QHES) checklist and the Consolidated Health Economic Evaluation Reporting Standards (CHEERS) checklist [10–12]. The Phillips checklist was excluded as it is primarily useful for the quality assessment of modelling studies. Although the majority of articles included were modelling studies, we preferred consistency across all articles, including non-modelling studies. The CHEERS checklist was excluded as it did not provide an average quality score. We eventually chose the QHES checklist because it provides a quantitative score. The quality assessment was performed by two independent reviewers (SvdS and other randomly chosen authors). If the results differed between reviewers, consensus was reached through discussion. Four QHES-based quality levels have been established in previous assessments: category 1 (0–25.0 points), category 2 (25.1–50.0 points), category 3 (50.1–75.0 points) and category 4 (75.1–100 points) [13].
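The four quality levels amount to a banding of the QHES total score. A minimal sketch of that banding is given below; the function name is hypothetical, and the cut-offs are those listed in the text.

```python
def qhes_category(score: float) -> int:
    """Return the QHES quality category (1 = lowest, 4 = highest) for a total score."""
    if not 0 <= score <= 100:
        raise ValueError("QHES total scores range from 0 to 100")
    if score <= 25.0:
        return 1
    if score <= 50.0:
        return 2
    if score <= 75.0:
        return 3
    return 4
```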

2.5 Critical Assessment of Methods and Outcomes

As the QHES does not cover all topics and is not specifically designed for COPD cost-effectiveness studies, the following additional issues regarding methods and outcomes were further explored and discussed in detail: (1) study design, (2) time horizon, (3) variation in modelling approach (including cycle length and model states), (4) variation in outcomes, (5) variation in costs, (6) variation in analytical approach, (7) transferability issues and (8) other issues. Most of these issues were identified in the previous review [7] and are revisited to assess the current state of the art.

3 Results

3.1 Search Results

The literature search resulted in 210 hits. After reviewing titles and abstracts, 39 papers were included for full-text review. Subsequently, 18 papers complied with the inclusion criteria. Figure 1 is a flow diagram showing the inclusion and exclusion of papers at various stages of the process.

Fig. 1 Flow diagram of the search performed. COPD chronic obstructive pulmonary disease

3.2 Main Study Characteristics

Sections 3.2.1–3.2.5 detail the main study characteristics and brief descriptions of the economic evaluations of LAMA, LABA, PDE-4 inhibitors, LABA/ICS and LABA/LAMA therapies. Tables 1, 2, 3, 4 and 5 provide overviews of the study characteristics, by drug class.

Table 1 Main study characteristics of long-acting muscarinic antagonist cost-effectiveness assessments
Table 2 Main study characteristics of long-acting beta2 agonist (LABA) cost-effectiveness assessments
Table 3 Main study characteristics of phosphodiesterase-4 (PDE-4) inhibitor cost-effectiveness assessments
Table 4 Main study characteristics of long-acting beta2 agonist/inhaled corticosteroid cost-effectiveness assessments
Table 5 Main study characteristics of long-acting beta2 agonist/long-acting muscarinic antagonist cost-effectiveness assessments

3.2.1 Long-Acting Muscarinic Antagonist (LAMA) Monotherapy

LAMAs are a mainstay therapy for patients with GOLD B, C or D COPD [1]. In the previous review, 11 studies assessed the cost effectiveness of tiotropium, the sole LAMA available at that time, compared with usual care (placebo, ipratropium or salmeterol); most studies indicated favourable cost effectiveness [7]. In the current review, six new articles reported the cost effectiveness of LAMAs (Table 1). Two of the studies were conducted in Sweden; the others were conducted in the USA, Italy, the UK and Belgium, and Germany. The LAMAs assessed were tiotropium, glycopyrronium and aclidinium. All studies were funded by a pharmaceutical company, combined clinical trial efficacy data with modelling and included QALYs as an effectiveness outcome. Four studies compared tiotropium with glycopyrronium, with usual care (including tiotropium as an addition to usual care) or with salmeterol (Table 1). Three of the four studies used the UPLIFT clinical trial data, often combined with data from other trials or observational data sources [14–16]. In these three UPLIFT-based studies, tiotropium was deemed cost effective compared with usual care (i.e. all respiratory medication except anticholinergic drugs), with ICERs ranging between €8000 and €24,000 per QALY over 4-year [16] and lifetime time horizons [14, 15]. All indicated high probabilities (60–90 %) of being cost effective at the willingness-to-pay (WTP) thresholds currently used in the respective countries. Note that in two of these three studies, QALY differences were <0.10 per patient [15, 16]. The remaining study reported higher QALY gains (0.42) [14]; however, the authors assumed an additional positive effect of tiotropium on cardiovascular outcomes (myocardial infarction, congestive heart failure) and used mapping to convert St George’s Respiratory Questionnaire (SGRQ) scores into utilities. One study compared tiotropium versus salmeterol based on the head-to-head POET-COPD clinical trial [17]. In this economic evaluation, 1-year trial-based cost effectiveness as well as model-based cost-effectiveness estimates (over 1-year and 5-year time horizons) were reported from both the societal and the payer’s perspective. The 1-year model-based ICERs fell in the same range as those from the studies that compared tiotropium versus usual care (which may have included salmeterol), while the 5-year ICER was slightly lower when calculated from the payer’s perspective. At a WTP of €20,000 per QALY (payer’s perspective), the probability of tiotropium being cost effective was 62.5 %. The trial-based economic endpoint was COPD specific: €2000 (payer’s perspective) or €2600 (societal perspective) per exacerbation avoided. Note that this is lower than the €4200 per exacerbation avoided reported by Zaniolo et al. [14], whereas the costs per QALY were slightly higher.
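The probabilities of being cost effective quoted above come from probabilistic sensitivity analyses. One common way to obtain such a probability, sketched below, is to count the share of simulated iterations in which the incremental net monetary benefit (WTP × incremental QALYs − incremental costs) is positive. The included studies may have used other implementations; the function name and all input values here are hypothetical and serve only to illustrate the calculation.

```python
from typing import Sequence


def probability_cost_effective(
    delta_costs: Sequence[float],
    delta_qalys: Sequence[float],
    wtp: float,
) -> float:
    """Share of PSA iterations with a positive incremental net monetary benefit.

    delta_costs -- incremental cost per iteration (new therapy minus comparator)
    delta_qalys -- incremental QALYs per iteration
    wtp         -- willingness-to-pay threshold per QALY
    """
    if len(delta_costs) != len(delta_qalys) or len(delta_costs) == 0:
        raise ValueError("need the same non-zero number of cost and QALY draws")
    favourable = sum(
        1 for dc, dq in zip(delta_costs, delta_qalys) if wtp * dq - dc > 0
    )
    return favourable / len(delta_costs)


# Four hypothetical iterations evaluated at a threshold of 20,000 per QALY.
example = probability_cost_effective(
    [1000.0, 3000.0, -500.0, 2000.0],
    [0.10, 0.05, 0.02, -0.01],
    20000.0,
)
```

In this hypothetical example, two of the four iterations have a positive net monetary benefit, giving a probability of 0.5. Repeating the calculation over a range of thresholds yields a cost-effectiveness acceptability curve.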

The two remaining studies in this category assessed glycopyrronium and aclidinium, and both used tiotropium as the comparator. A Swedish study, based on the 1-year head-to-head GLOW-2 trial that included patients with moderate to very severe COPD, concluded that glycopyrronium was cost saving with a 99 % probability of dominance. Their conclusions were based on very small QALY gains (0.005) combined with cost savings over a 3-year time horizon from a societal perspective [18]. This contrasts with another Swedish study discussed in the previous paragraph, which found the opposite QALY gain (i.e. more QALYs for tiotropium) when comparing tiotropium and glycopyrronium [15]. Drug costs and severe exacerbation costs were in the same range, but the latter study took a payer’s perspective and based its efficacy measures on the SPARK trial, which included patients with severe or very severe COPD with at least one exacerbation in the previous year. Another study evaluated the cost effectiveness of aclidinium as an alternative to tiotropium in the USA over a 5-year time horizon [19]. The authors concluded that aclidinium was potentially cost effective (probability of dominance: 84 %), but QALY gains were marginal (0.0044) and total costs did not differ significantly. Given the lack of any long-term head-to-head trials comparing aclidinium versus tiotropium, the authors used a network meta-analysis (NMA) based on a set of different data sources and assumptions. As the authors did not clearly describe the limitations of this NMA, the conclusions are very tentative.

Considering the large number of past and current studies with mostly consistent results, there is strong evidence that tiotropium monotherapy is cost effective compared with usual (non-LAMA) care. However, evidence regarding the relative cost effectiveness of tiotropium, glycopyrronium and aclidinium versus each other is inconclusive.

3.2.2 Long-Acting Beta2 Agonist (LABA) Monotherapy

LABAs are a mainstay therapy for patients with GOLD B, C or D COPD [1]. In the previous review, 13 studies assessed the cost effectiveness of LABAs versus usual care (ipratropium) or placebo: 11 studies reported on salmeterol, one on formoterol, one on LABAs in general, and most studies indicated favourable cost effectiveness [7]. The present review identified two new manufacturer-sponsored studies by the same first author, which compared the cost effectiveness of once-daily indacaterol versus once-daily tiotropium and twice-daily salmeterol in the UK and in Germany [20, 21]. Both studies included patients with moderate to very severe COPD and followed them for 3 years using the same Markov model. The model results were compared with other (unspecified) studies and national mortality statistics and were reviewed by an external health economist. Efficacy measures were taken from the 26-week INHANCE and INLIGHT-2 trials [20, 21]. In the UK and Germany, indacaterol 150 µg was found to dominate both tiotropium and salmeterol. In the UK, the QALYs gained for indacaterol versus tiotropium were relatively small, ranging from 0.008 (dose 150 µg) to 0.011 (dose 300 µg); the number of QALYs gained versus salmeterol was similar, as were the German results. The model was sensitive to changes in mortality estimates, the severity of the disease of the patient population and the time horizon. For the UK, the authors concluded that both doses of indacaterol (150 and 300 µg) were dominant. The probabilistic sensitivity analysis (PSA) showed that 72 % (vs. salmeterol) and 89 % (vs. tiotropium) of the iterations indicated dominance for indacaterol. In Germany, only the 150-µg dose was dominant (with dominance probabilities of 78 % vs. salmeterol and 90 % vs. tiotropium), and the more expensive dose (indacaterol 300 µg) was deemed cost effective (€28,301 per QALY). It seems that the difference between countries arises from differences in drug costs. In Germany, the 300-µg dose was 1.5 times more expensive than the 150-µg dose, whereas in the UK, both doses were equally priced. The costs of the comparator drugs also varied.

Based on results from over ten studies included in the previous review, evidence is strong that LABAs (particularly salmeterol) are cost effective compared with ipratropium, which was usual care at that time. Given that the cost effectiveness of indacaterol has only been assessed in two studies, the evidence that indacaterol is cost effective over tiotropium or salmeterol (currently considered usual care) is limited.

3.2.3 Phosphodiesterase (PDE)-4 Inhibitors

Roflumilast is indicated as an add-on to bronchodilators for patients with GOLD C or D COPD associated with chronic bronchitis [1]. The previous review included one study on roflumilast, which was only deemed cost effective in a subgroup with very severe COPD [7]. Four new studies reported the cost effectiveness of PDE-4 inhibitors (roflumilast) when added to LABA, LAMA, LABA/ICS or a combination of these therapies (LABA/ICS + LAMA) compared with the therapy without a PDE-4 inhibitor (Table 3). The studies were conducted in Switzerland (one study), Germany (one study) and the UK (two studies). They were all manufacturer funded and used a combination of clinical trial data and lifetime modelling. Effect measures were all based on a relative ratio of exacerbation rates (RRR) of around 0.80 for adding roflumilast (Supplementary Appendix Table A3).

Two studies [22, 23] assessed the addition of roflumilast to LABA monotherapy using a similar Markov model and basing their main efficacy measure on a pooled analysis of the 52-week M2-124 and M2-125 trials [24]. These trials included patients with severe to very severe COPD, bronchitis symptoms and a history of exacerbations. In the UK, Samyshkin et al. [23] showed that the addition of roflumilast to LABA monotherapy resulted in a gain of 0.16 QALYs and an ICER of around £19,500 per QALY. The PSA indicated a probability of 82 % of being cost effective at a WTP of £30,000 per QALY. In Germany, 0.234 QALYs were gained at slightly higher costs, resulting in a similar ICER of €19,457 per QALY gained. The PSA showed that over 80 % of the iterations were below a WTP of €30,000 per QALY. Nowak et al. [22] also reported the prevention of 2.4 exacerbations over a lifetime (€1852 per exacerbation avoided). Both studies concluded that adding roflumilast to long-acting bronchodilators can be cost effective in patients with severe to very severe COPD with chronic bronchitis and a history of exacerbations.

The other two studies [25, 26] were based on efficacy measures obtained from a published mixed-treatment comparison of trials ranging from 24 weeks to 4 years. When roflumilast was added to LAMA, LABA/ICS or LAMA + LABA/ICS in Switzerland, all additions were cost effective, with ICERs around €10,000 per QALY (Table 3) and probabilities of 79, 96 and 96 %, respectively, of being cost effective at a €60,000 per QALY threshold. Hertel et al. [26] assessed the addition of roflumilast to LABA/ICS in the UK and reported ICERs of around €19,000 per QALY gained (€16,000 per QALY gained for ICS-intolerant patients), also with a >80 % probability of being cost effective at a threshold of £30,000 per QALY. The higher UK ICER may be partly explained by differences in healthcare systems, but is more likely to be because of higher relative drug costs for the addition of roflumilast in the UK (1.52 times usual care in the UK vs. 1.28 times in Switzerland). The addition of roflumilast to current maintenance treatment was considered cost effective in both cases for patients with severe to very severe COPD who continue to experience exacerbations despite treatment with bronchodilators.

Based on these four studies and the previous review [7], evidence is strong that add-on therapy with roflumilast is cost effective when used in a specific subgroup: patients with severe to very severe COPD, frequent exacerbations and bronchitis symptoms whose disease is not controlled with bronchodilators alone.

3.2.4 LABA/Inhaled Corticosteroid (ICS) Combination Therapy

LABA/ICS are indicated for patients with COPD with an FEV1 % predicted <50 and frequent exacerbations (i.e. GOLD C and D). This combination has been shown to be more effective than LABA or ICS alone, improving health status and reducing the number of exacerbations; however, it is also associated with an increase in pneumonia risk [1]. The previous review included 12 studies on the cost effectiveness of LABA/ICS; however, populations, comparators and ICERs varied widely [7]. The four new studies reporting on LABA/ICS combination therapies included papers on salmeterol/fluticasone, formoterol/budesonide and formoterol/fluticasone (Table 4). The studies, including two that were not industry funded, used very different methods, contrasting with the frequently used Markov models in other drug categories. The first non-industry-funded study was a CEA performed alongside a 6-month prospective observational study in a tertiary care hospital in South India [27]. It included a small cohort (n = 90) of patients with severe and very severe COPD irrespective of their history of exacerbations. Salmeterol/fluticasone, formoterol/budesonide and formoterol/fluticasone were compared in this cohort, but details on the methods and results were limited, and differences in exacerbations and costs seemed non-significant. The authors concluded that all LABA/ICS had favourable therapeutic performance, but salmeterol/fluticasone and formoterol/budesonide were deemed most effective based on lung function improvement. The second non-industry-funded study was a mathematical modelling exercise that assessed six different hypothetical treatment scenarios, including LABA/ICS for patients with GOLD 3/4 disease, for the African and Asian sub-regions [28]. It was concluded that this therapy was cost effective compared with no treatment in both the African sub-region and the Asian sub-region. Notably, this was the only study that used disability-adjusted LYs (DALYs) as an outcome measure.

The other two studies both evaluated formoterol/budesonide and were sponsored by the manufacturer. The first study assessed the costs and outcomes compared with salmeterol/fluticasone using Swedish real-world effectiveness data from the observational PATHOS study combined with Italian cost data [29]. It showed cost savings mainly driven by lower drug costs and fewer COPD- and pneumonia-related hospitalizations. This was one of the few studies that lacked a PSA. The other economic evaluation was performed alongside the 12-week CLIMB trial and combined its international resource use data with Scandinavian cost data to estimate the cost effectiveness of adding budesonide/formoterol to tiotropium in patients with moderate to very severe COPD and a history of exacerbations [30]. Several scenarios were assessed, but all ICERs were <€350 per exacerbation avoided and, in some cases, the addition was even dominant.

The LABA/ICS evaluations included in the previous review showed contrasting results with a wide variation in ICERs [7]. The four new studies in this review had considerable limitations: a small cohort and an inappropriate population [27], no usual care control group [28], a combination of data from two different countries [29] and limited follow-up [30]. Therefore, the evidence regarding the cost effectiveness of LABA/ICS in patients with COPD is inconclusive. Targeting LABA/ICS to the correct COPD population (i.e. GOLD C and D) may result in more favourable cost effectiveness.

3.2.5 LABA/LAMA Combination Therapy

LABA/LAMA combination therapy is indicated for patients with COPD not controlled with a single long-acting bronchodilator alone (i.e. GOLD B, C or D) [1]. In the previous review, one study assessed the cost effectiveness of the combination of tiotropium and salmeterol in two separate inhalers versus tiotropium alone and found that the combination was dominated by tiotropium alone [7]. Two studies assessed ‘triple therapy’, consisting of a combination of separate LAMA and LABA/ICS, but reported mixed cost-effectiveness results [7]. This review included two new studies [31, 32], both funded by the manufacturer, that reported the cost effectiveness of LABA/LAMA fixed-dose combination (FDC) therapy in Sweden and the UK (Table 5). In Sweden, the once-daily LABA/LAMA combination indacaterol/glycopyrronium (FDC in one inhaler) was compared with indacaterol + glycopyrronium (two separate inhalers) in a cost-minimization analysis [31]. Cost effectiveness compared with salmeterol/fluticasone was also assessed. Both analyses were performed in a population with moderate to severe COPD and a low exacerbation risk. Note that salmeterol/fluticasone is not indicated for patients with COPD with low exacerbation risk and thus cannot be considered a suitable comparator. Clinical efficacy data from four different trials were used (Supplementary Appendix Table A3). Patients were followed over a lifetime horizon using a validated patient-level simulation model that used age, sex, height, smoking status and starting FEV1 level as input. Compared with its free combination, indacaterol/glycopyrronium FDC resulted in cost savings ranging from Swedish krona (SEK) −768 (1-year time horizon) to SEK −8703 (lifetime). The cost difference was mainly driven by higher drug costs for the separate inhaler therapy. Compared with salmeterol/fluticasone, FDC showed an incremental gain in QALYs of 0.001 (1-year time horizon) to 0.200 (lifetime) at lower costs (SEK −43,033). Therefore, indacaterol/glycopyrronium FDC was deemed dominant, and it remained dominant in all iterations of the PSA. In clinical terms, indacaterol/glycopyrronium resulted in the avoidance of 1.07 exacerbations and a reduction of 0.31 pneumonia events over a lifetime horizon.

The second study compared umeclidinium/vilanterol combination therapy versus tiotropium monotherapy over a lifetime horizon in the UK [32]. The COPD population assessed was supposed to be symptomatic (modified Medical Research Council [mMRC] scale score ≥2). This study used a linked-equations disease model based on the ECLIPSE study [33] and included the input parameters age, sex, body mass index, cardiovascular and other comorbidities, exacerbation history, smoking status, health status (SGRQ), lung function, dyspnoea and 6-minute walking test result. Utility was derived from the SGRQ score using a mapping model. Treatment effects were based on a meta-analysis of FEV1 data at 24 weeks from three clinical trials and resulted in an increase of 0.18 QALYs, 0.36 LYs and an ICER of £2088 per QALY, assuming a price equal to that of tiotropium. The probability of being cost effective at a threshold of £20,000 per QALY was over 90 %.

As only two studies assessed the cost effectiveness of two different LABA/LAMA combinations, and the studies in the previous review showed mixed results [7], the evidence that LABA/LAMAs are cost effective over tiotropium or other comparators is considered inconclusive.

3.2.6 Main Drivers of Cost Effectiveness of Chronic Obstructive Pulmonary Disease (COPD) Treatment

In 12 of the 18 articles, the main drivers of cost effectiveness were presented in a tornado diagram or table. In most cases, the relative risk of exacerbations and the mortality rate were the main drivers of cost effectiveness. The choice of the time horizon also had a strong impact on the model estimates. Notably, for some therapies (such as roflumilast and ICS), a sufficiently high baseline exacerbation rate in the target population seems to be a prerequisite for favourable cost effectiveness. Other factors that had considerable influence on cost-effectiveness estimates were hospitalization rates, whether all-cause or due to exacerbations or pneumonia.

3.3 Quality Assessment Results

We assessed the quality of the studies using the QHES checklist (Fig. 2). Total scores per study are provided in Tables 1, 2, 3, 4 and 5 and detailed scores in Supplementary Appendix Table A1. Based on QHES total score alone, 14 of the 18 studies scored in category 4 (highest category), two in category 3, one in category 2 and one in category 1 (lowest category). Performance in terms of the discussion of potential bias (item 14) was relatively low in all studies.

Fig. 2 Percentage of maximum Quality of Health Economic Studies (QHES) score per question across the total of studies. Q1: Was the study objective presented in a clear, specific, and measurable manner?; Q2: Were the perspective of the analysis (societal, third-party payer, etc.) and reasons for its selection stated?; Q3: Were variable estimates used in the analysis from the best available source (i.e., randomized control trial as best, expert opinion as worst)?; Q4: If estimates came from a subgroup analysis, were the groups prespecified at the beginning of the study?; Q5: Was uncertainty handled by (1) statistical analysis to address random events, (2) sensitivity analysis to cover a range of assumptions?; Q6: Was incremental analysis performed between alternatives for resources and costs?; Q7: Was the methodology for data abstraction (including the value of health states and other benefits) stated?; Q8: Did the analytic horizon allow time for all relevant and important outcomes? Were benefits and costs that went beyond 1 year discounted (3–5 %) and justification given for the discount rate?; Q9: Was the measurement of costs appropriate and the methodology for the estimation of quantities and unit costs clearly described?; Q10: Were the primary outcome measure(s) for the economic evaluation clearly stated and did they include the major short-term, long-term and negative outcomes included? Was justification given for the measures/scales used?; Q11: Were the health outcomes measures/scales valid and reliable? If previously tested valid and reliable measures were not available, was justification given for the measures/scales used?; Q12: Were the economic model (including structure), study methods and analysis, and the components of the numerator and denominator displayed in a clear, transparent manner?; Q13: Were the choice of economic model, main assumptions, and limitations of the study stated and justified?; Q14: Did the author(s) explicitly discuss direction and magnitude of potential biases?; Q15: Were the conclusions/recommendations of the study justified and based on the study results?; Q16: Was there a statement disclosing the source of funding for the study?

3.3.1 Choice of Economic Model (Objective, Perspective, Structure and Time Horizon)

Of the 18 cost-effectiveness studies included in this review, 12 were cost-utility analyses (CUAs), five were CEAs and one used both CEA and cost-minimization analyses (Tables 1, 2, 3, 4, 5). All studies stated their objective in a clear, specific and measurable manner (QHES item 1), and 17 stated their perspective (QHES item 2). Reasons were not always explicitly stated, but the perspectives did not differ from their country’s recommendations or at least also included the recommended perspective [34]; 15 took a healthcare payer perspective (Supplementary Appendix Table A2). Four of these also applied a societal perspective [17, 18, 27, 30], two only used the societal perspective [15, 31] and another did not explicitly mention the perspective, but it seemed to have been undertaken from a healthcare perspective [28]. A total of 15 studies were modelling studies; the remaining three were performed alongside a clinical trial or used observational data, and all but two studies used efficacy data for their main effect estimate (QHES item 3). Follow-up of the efficacy trials varied between 12 weeks and 4 years (Supplementary Appendix Table A3). The time horizons (QHES item 8) differed considerably between the studies (Tables 1, 2, 3, 4, 5). For analyses conducted alongside clinical trials, the time horizons were 3 and 6 months [27, 30]. Time horizons in the modelling studies ranged from 1 year to lifetime, and cycle length varied between 1 month and 1 year (Supplementary Appendix Table A2). In general, the structure and health states of the model studies sufficiently reflected the natural development and progression of COPD (Supplementary Appendix Table A2). The CEAs in this study were conducted in 15 countries/regions. Not all countries recommended a specific time horizon for CEAs; in those that did, recommendations were generic, i.e. that all relevant health effects and cost consequences should be covered [34]. 
Considering these recommendations, the time horizons of the included studies in our review were mostly appropriate; however, 3 and 6 months may be considered too short as these timeframes do not account for the seasonality of COPD [35].

3.3.2 COPD Model Inputs: Costs and Effectiveness Measures

A total of 15 studies stated the methodology for data abstraction (QHES item 7). The measurement of costs (QHES item 9) and outcomes (QHES item 10) was clearly described in 16 and 14 studies, respectively (Supplementary Appendix Table A1). In most studies, the number of exacerbations or the exacerbation risk/rate was the main cost driver. Most articles differentiated between severe, moderate and/or mild exacerbations, and the costs of treatment for severe versus moderate/mild exacerbations varied widely, ranging between 4.14 times [15] and 21.46 times higher [31]. Only two articles differentiated between the costs of exacerbation within different disease severity stages of COPD [14, 15]. Multiple outcomes were used to describe the effectiveness of the studied therapy (Tables 1, 2, 3, 4, 5); 17 studies reported an ICER, and the most frequently included effectiveness outcome was the number of QALYs gained (14 of 18). The majority of articles used the EuroQoL 5-Dimension (EQ-5D) questionnaire to calculate the utility used to estimate QALY gains, which were <0.30 in all cases in which QALYs were directly estimated. QALY gains ranged from 0.0044 for aclidinium (vs. tiotropium) over a 5-year time horizon [19] to 0.289 for roflumilast (as add on to LABA/ICS) over a lifetime horizon [25]. Two studies used a mapping model for the translation of the SGRQ score into QALYs. The QALY gains in those studies were 0.18 and 0.42 over a lifetime horizon [14, 32]. The slightly higher QALY gains may be due to the mapping; however, these two studies also included the impact on (non-respiratory) comorbidities that may have resulted in higher QALY gains. Nine studies also reported the number of LYs gained as an outcome. Other frequently used outcomes were the number of exacerbations (7 of 18), improvement in FEV1 (3 of 18) or change in exacerbation risk (3 of 18). One study used DALYs to describe effectiveness [28], and two studies included the improvement in pneumonia risk [29].
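The ICER reported by 17 of the studies is the incremental cost divided by the incremental effect. The sketch below shows this calculation; the function name and the input figures are illustrative assumptions, not values taken from any of the included studies.

```python
def icer(cost_new, cost_old, effect_new, effect_old):
    """Incremental cost-effectiveness ratio: extra cost per extra unit of effect (e.g. per QALY)."""
    delta_cost = cost_new - cost_old
    delta_effect = effect_new - effect_old
    if delta_effect == 0:
        raise ValueError("ICER is undefined when the incremental effect is zero")
    return delta_cost / delta_effect


# Hypothetical inputs: with a small QALY gain in the denominator,
# minor changes in the effect estimate shift the ratio considerably.
print(icer(cost_new=12_000, cost_old=11_000, effect_new=6.05, effect_old=6.00))
print(icer(cost_new=12_000, cost_old=11_000, effect_new=6.02, effect_old=6.00))
```

A negative ratio should be read together with the signs of the two increments, because a dominant and a dominated treatment both yield negative values.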

3.3.3 Model Uncertainty, Validation and Limitations

All but one study included some sort of sensitivity analysis (Supplementary Appendix Table A2): two performed univariate sensitivity analyses only, one performed PSA only, and 11 included both univariate analyses and PSA. Six studies also performed one or multiple scenario analyses (QHES item 5). The potential for bias was poorly reported in the vast majority of studies (QHES item 14). Most modelling studies mentioned that the core Markov assumption that a cohort’s future progression is dependent only on their current state of health was a limitation (QHES item 13). Whilst this assumption may not hold true when estimating the prognosis of an individual patient, it may be more realistic when considering the effect of disease progression on a cohort of patients [16]. The probability of transition between states was often assumed to be constant over time, and a Markov model does not take into account previous health states. As such, “the approach taken did not account for the existing correlation between the number of exacerbations in the previous year and the current year or change in lung function when therapy is withdrawn” [18]. Notably, one study used a linked-equations model [32] and one used a patient-simulation model [31]. The advantage of these models is that they are not memoryless (a common feature of Markov models) and can take a multifactorial and more individualized approach (beyond lung function) to model disease progression.
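The Markov assumption discussed above can be made concrete with a minimal cohort sketch. The states, transition probabilities, utility weights and the 3 % discount rate are hypothetical values chosen for illustration only; they do not come from any of the included models.

```python
# Hypothetical 4-state cohort: moderate, severe, very severe COPD, dead.
STATES = ["moderate", "severe", "very_severe", "dead"]

# Annual transition probabilities (each row sums to 1); illustrative values only.
P = [
    [0.85, 0.10, 0.02, 0.03],
    [0.00, 0.82, 0.12, 0.06],
    [0.00, 0.00, 0.85, 0.15],
    [0.00, 0.00, 0.00, 1.00],
]
UTILITY = [0.80, 0.70, 0.55, 0.0]  # assumed utility weight per state


def run_cohort(start, transition, utility, cycles, discount_rate=0.03):
    """Return (final state distribution, discounted QALYs per patient)."""
    dist = list(start)
    qalys = 0.0
    for cycle in range(cycles):
        # Credit one 1-year cycle of utility for the current distribution.
        qalys += sum(d * u for d, u in zip(dist, utility)) / (1 + discount_rate) ** cycle
        # Memoryless step: the next distribution depends only on the current one,
        # and the same transition matrix is applied in every cycle.
        dist = [
            sum(dist[i] * transition[i][j] for i in range(len(dist)))
            for j in range(len(dist))
        ]
    return dist, qalys


final_dist, total_qalys = run_cohort([1.0, 0.0, 0.0, 0.0], P, UTILITY, cycles=5)
print(dict(zip(STATES, final_dist)), total_qalys)
```

Because the state distribution carries no record of earlier cycles, exacerbation history cannot influence later transitions in this structure unless it is encoded as additional states; linked-equations and patient-level simulation models avoid that restriction.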

A limitation of most of the included studies is that they did not describe the inclusion of comorbidities (Supplementary Appendix Table A3), with only five of the 18 studies clearly stating which comorbidities were present [17, 20, 21, 28, 32].

Another limitation reported in multiple studies was that clinical, healthcare utilization and productivity data were obtained from patients in countries other than the country of interest to the CEA [29]. Lastly, many analyses were based on data from randomized controlled trials (RCT) and not on real-world evidence [17, 20, 30].

3.3.4 Stated Conclusions and Disclosures

Most studies reported that their intervention was cost effective; however, in our view, this was not always justified by the results (QHES item 15). Some studies did not use an appropriate population [27], correct effectiveness data [29] or correct comparators [28, 31]. All but two studies [27, 28] were funded by the pharmaceutical company that produced the maintenance treatment studied, and this was clearly stated (QHES item 16). However, the influence of the funder was not always clear, although the choice of comparator in the economic evaluation was obviously largely driven by the choice of comparator in the clinical studies. The same applied to the choice of patient population. Another issue regarding the funding is that results of industry-funded studies that indicate unfavourable cost effectiveness may not be published, resulting in potential publication bias.

4 Discussion

4.1 Main Findings

This review identified that 18 new pharmacoeconomic analyses of pharmacologic COPD maintenance treatments have been published in recent years (2011–2015). Most papers studied the cost effectiveness of LAMA monotherapy (n = 6), followed by PDE-4 inhibitors (n = 4) and LABA/ICS combination therapy (n = 4). Two papers studied both LABA monotherapy and LABA/LAMA combination therapy. Most studies were cost-utility analyses, and a minority (39 %) included a more COPD-specific outcome such as cost per exacerbation avoided. All studies found the studied therapy to be cost effective or cost saving, either because of lower treatment costs and the same effect or because of better effectiveness. However, QALY gains were small (<0.5 QALYs), and several methodological shortcomings were identified that hampered firm conclusions regarding the evidence of cost effectiveness of some of the new treatments.

Medication tended to be more cost effective in more severely ill populations, that is, those with a high exacerbation and hospitalisation risk. Model study results were also sensitive to assumptions on mortality. Indeed, clinical trial evidence showed a reduction in exacerbations. However, no empirical evidence in clinical studies confirmed that COPD treatments could reduce mortality [1]. Mortality reductions, as well as changes in health status, were mainly the indirect result of long-term extrapolation of a relatively minor improvement in lung function in the first 3–6 months. This suggests that studies have been overly optimistic.

Notably, all but two studies were funded by the manufacturer of the drug that was assessed. The quality of the studies according to the QHES was generally sufficient, except for the reporting of potential bias, which scored consistently low across all studies. However, we identified several other key methodological issues not included in the QHES. These included the wide variety of time horizons (3 months to lifetime), outcomes included (general vs. COPD specific), the sometimes random combination of different data sources, inappropriate choice of population or comparator, calculation of QALYs (either direct via the EQ-5D or using a mapping model) and, arguably most importantly, the issue of clinical efficacy versus real-life effectiveness.

4.2 Detailed Discussion of Methods and Outcomes

4.2.1 Study Design

The vast majority of studies combined clinical trial efficacy data with long-term follow-up using a Markov model. The latest approaches, applied in the LABA/LAMA evaluations, were a linked-equation model [32] and a patient-level simulation model [31]. These models have the potential to overcome the common Markov limitation of memorylessness and take a multifactorial approach in the modelling of disease progression. Yet, only the linked-equation model seemed to take into account most relevant co-variables [32], whereas the patient-level model was still mainly driven by lung function and did not take into account exacerbation history and other patient-level characteristics [31]. The authors of the previous review noticed that the ICER in model studies was often higher than in trial-based studies [7]. We cannot update this conclusion because of the small proportion of trial-based studies in this review. Although another recommendation was to perform subgroup analyses, this was not always done and was one of the issues that was poorly defined upfront according to our QHES assessment. The choice of comparator was often in line with one of the recommended choices of treatment as per GOLD guidelines, with some notable exceptions [27, 31]. Treatment recommendations in the GOLD guidelines are based on disease severity stages and include multiple treatment combinations. Therefore, the design of many studies does not always match the decisions that need to be made in real life. In everyday clinical practice, the question is often whether or not to switch medication, improve adherence, change the dose or add another medication in patients who remain symptomatic despite the first-line treatment. However, thus far, these strategies remain largely understudied, except for one study on the cost effectiveness of adherence enhancement [36]. Some of these types of questions have been addressed in the field of asthma recently [37].

4.2.2 Time Horizon

Studies used a range of different time horizons, varying between 3 months and lifetime. This may be because many pharmacoeconomic guidelines do not clearly define a recommended timeframe for each disease. The lifetime horizon has been questioned as it cannot take into account future treatment and price changes [7]. In general, for more effective treatments, the longer the time horizon, the higher the potential absolute reduction in exacerbations and mortality and the higher the costs and number of QALYs gained. The previous review argued that a time horizon of 4–5 years may be considered suitable, in line with the planning cycle of policy makers [7]. Two studies had time horizons of <1 year [27, 30]. A 3-month time horizon will likely not include all relevant health benefits and costs, and a time horizon shorter than 1 year could lead to an over- or underestimation of results, at least in studies performed alongside a clinical trial or studies using observational data from routine clinical or claims databases. For example, in winter, exacerbation rates and therefore potential gains are higher [35], but these results cannot be linearly extrapolated to the full year. When clinical trial efficacy data are used, the trial should be of sufficient length. Given the chronic nature, relatively slow progression and seasonal variability of COPD, the time horizon of CEAs should be at least 1 year to allow fair comparisons and to be of value to health insurance companies. Generally, time horizons should be chosen in such a way that all expected costs and effects are captured.
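The point that results from a short observation window cannot be linearly extrapolated to a full year can be illustrated numerically. The monthly exacerbation rates below are hypothetical and only mimic a winter peak; `annualise` is an illustrative helper, not a method used in the included studies.

```python
# Hypothetical monthly exacerbation rates per patient (Jan..Dec), higher in winter.
MONTHLY_RATE = [0.14, 0.13, 0.10, 0.07, 0.05, 0.04, 0.04, 0.04, 0.06, 0.08, 0.11, 0.14]


def annualise(observed_months):
    """Linearly extrapolate the mean monthly rate of the observed window to 12 months."""
    window = [MONTHLY_RATE[m] for m in observed_months]
    return 12 * sum(window) / len(window)


true_annual = sum(MONTHLY_RATE)
winter_window = annualise([11, 0, 1])  # December to February
summer_window = annualise([5, 6, 7])   # June to August

print(f"full year:                {true_annual:.2f}")
print(f"extrapolated from winter: {winter_window:.2f}")
print(f"extrapolated from summer: {summer_window:.2f}")
```

Under these assumed rates, a 3-month window overestimates the annual exacerbation rate when it falls in winter and underestimates it when it falls in summer, which carries through to the exacerbations avoided and the costs attached to them.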

4.2.3 Variations in Modelling Approach

Of the 18 studies included, 15 were modelling studies. The previous review called for the use of similarly structured Markov models, the inclusion of extra-pulmonary manifestations (such as comorbidities and exercise) and a shift towards individual patient-simulation models [7]. Indeed, two of the most recent studies moved away from the Markov model and used a linked-equations model [32] or a patient-simulation model [31]. These models may suffer less from the common Markov limitation but are still based on populations and efficacy data from clinical trials that may not be fully representative of real-life medication use. The development of additional models using longitudinal real-life observational effectiveness data is still necessary. The impact of comorbidity should thereby at least be considered; only a few studies incorporated comorbidity. Another issue that must be taken into account in comparing COPD CEAs is that the difference in outcomes can be caused by the difference in the model and the purpose of the model. Hoogendoorn et al. [38] compared seven different COPD cost-effectiveness models in which the input parameters were standardized over a 5-year or lifetime time horizon. The way in which mortality was modelled was the most determining factor in the difference in outcomes, as well as the different outcome definitions in models. After standardization of input parameters, the studied outcomes still differed: percentage of patients per severity stage, mortality, QALYs, costs and ICER [38]. Therefore, model validation may be considered important in COPD models. Some of the studies included in this review did indeed validate their models, usually by comparing the model results with results of other studies [15], the original RCT [16], real-life data or national statistics [20, 21]. Comparing results with those from the original trial may ensure internal validity, but it does not guarantee the results can be extrapolated to real-life populations. 
In some studies, models were reviewed by an external source/health economist to improve validation. No structured validation tools, such as AdViSHE (Assessment of the Validation Status of Health-Economic decision models) [39], were used. Model validation may help to describe whether or not the model realistically describes the population and disease for which it is intended.

4.2.4 Variation in Outcomes

As the previous review also found, the outcomes used varied [7]. QALYs were mostly included, but including QALYs as the sole outcome can be questionable. QALY gains were small in most studies, which could mean either that the effects of the treatments are indeed small or that the instrument used to measure utility is not sensitive enough. The finding regarding the small QALY gains also confirms a finding of the previous review, which highlighted the relative insensitivity to change of utility measures compared with disease-specific health-related quality of life (HR-QoL) measures in COPD [7]. Another explanation lies in the timing of HR-QoL measurement, which usually takes place at fixed points in time in trials (e.g. after 3 months, 6 months, 1 year). Thus, utility decrement because of an exacerbation that occurs between those fixed timepoints is not always captured [7]. In general, using a mapping model to translate SGRQ scores into QALYs resulted in slightly higher QALY gains than direct utility estimates obtained via the EQ-5D, but this was done in only two studies, which also included the impact of comorbidities [14, 32]. The EQ-5D is the questionnaire most frequently used to estimate utility across all patient populations. The SGRQ is specifically designed for patients with respiratory diseases and includes all important factors that influence quality of life in patients with COPD. A recent study mapped the COPD-specific Clinical COPD Questionnaire (CCQ) [40] to the EQ-5D and found only moderate correlation [41]. The explanation for this discrepancy is that the EQ-5D lacks a dyspnoea domain (one of the most prominent symptoms of COPD) and includes a pain domain that the CCQ does not as it is a much less prominent symptom of COPD. Therefore, given the lack of overlap of important domains, mapping the CCQ to the EQ-5D cannot be recommended. Note that, until now, only the EQ-5D-3L has been available, whereas the EQ-5D-5L has recently been developed. Nolan et al. 
[42] studied the responsiveness of the EQ-5D-5L in patients with COPD and found it was more valid and sensitive to HR-QoL changes than was the EQ-5D-3L, which was used in the studies included in this review. The development of a respiratory ‘bolt-on’ to the EQ-5D may be another noteworthy future option [43]. In our view, for comparison between COPD and other diseases, a generic outcome (such as the QALY) remains essential; however, we recommend that future research also includes a COPD-specific measure (such as the CCQ) to fully capture all relevant COPD-specific outcomes. More studies are needed to establish clinically relevant cut-offs and the WTP for these COPD-specific outcomes (e.g. cost per point improvement in CCQ score or costs per additional patient with a clinically relevant improvement in CCQ).
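The effect of measuring HR-QoL only at fixed timepoints can be shown with a small area-under-the-curve calculation. The utility values, the timing of the exacerbation and the visit schedule below are hypothetical, and the trapezoidal rule is used here as one common way to turn utility measurements into QALYs.

```python
def qalys_trapezoid(times_years, utilities):
    """Area under the utility curve by the trapezoidal rule."""
    total = 0.0
    for i in range(1, len(times_years)):
        width = times_years[i] - times_years[i - 1]
        total += width * (utilities[i] + utilities[i - 1]) / 2
    return total


# Hypothetical patient: stable utility of 0.75, with an exacerbation that
# lowers utility to 0.45 at months 4 and 5, i.e. between two scheduled visits.
visit_times = [0.0, 0.25, 0.5, 1.0]           # baseline, 3, 6 and 12 months
visit_utility = [0.75, 0.75, 0.75, 0.75]      # the dip is never observed

monthly_times = [m / 12 for m in range(13)]
monthly_utility = [0.75] * 13
monthly_utility[4] = 0.45
monthly_utility[5] = 0.45

qalys_visits = qalys_trapezoid(visit_times, visit_utility)
qalys_monthly = qalys_trapezoid(monthly_times, monthly_utility)
print(qalys_visits, qalys_monthly)
```

With these assumed values the fixed-visit schedule misses the decrement entirely, so a treatment that prevents this exacerbation would show no QALY gain in the trial data even though the monthly profile differs.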

The number of exacerbations or change in exacerbation rate was a frequently used outcome. Together with quality of life, exacerbations are one of the most important outcomes as, according to the GOLD guidelines, improvement of quality of life and reduction in exacerbations are the two major treatment aims of COPD [1]. GOLD 2015 states that “Exacerbations and comorbidities contribute to the overall severity in individual patients” and “It is increasingly recognized that many patients with COPD have comorbidities that have a major impact on quality of life and survival” [44]. Indeed, whereas exacerbations were frequently included, comorbidities were rarely included. We recommend that comorbidities be included in future research—particularly when treatment effects are expected to go beyond COPD only or when comorbidities affect treatment effectiveness—because they have a considerable effect on quality of life, economic impact [45] and overall survival. Furthermore, two papers included pneumonia risk as an outcome. Hwang et al. [46] studied pneumonia as a risk factor for exacerbations and found that patients who had pneumonia in the year before the year of analysis had an 18 times greater chance of having an exacerbation in the next year. This outcome should especially be considered for drugs (such as ICS) that may increase the risk of pneumonia [47]. The inclusion of other short- and long-term side effects [48, 49] of pharmacological treatments in COPD cost-effectiveness models is an area that can be improved.

4.2.5 Variation in Costs

In line with the previous review, most studies focused on direct healthcare costs and only included COPD-related costs [7]. Exacerbations were a primary cost driver, but their costs varied between studies. Differences in exacerbation costs between countries are expected because of differences in treatment patterns and healthcare systems. However, there was also a small difference in exacerbation costs between studies performed in the same country (e.g. Germany) that was not due to a difference in calendar year, offering some potential for improvement [17, 20, 22]. However, we should acknowledge that a one-size-fits-all approach is not always possible because of the large variation in treatment costs among healthcare institutions and individual patients with COPD.

4.2.6 Variation in Analytical Approach

All but one study included some kind of sensitivity analysis, a trend that was observed in the previous review [7]. The majority performed both univariate analysis (1-way) and PSA. However, one study only performed a PSA [32] and two studies only performed 1-way analyses [22, 29]. We recommend both types of analyses be performed as they provide different information (i.e. what are the cost drivers and what is the probability of making a correct overall decision).
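The different information provided by the two types of analysis can be sketched as follows. All distributions, means, standard deviations and the WTP threshold are hypothetical, and the inputs are drawn independently for simplicity, which ignores any correlation between costs and effects.

```python
import random


def net_monetary_benefit(delta_cost, delta_qaly, wtp):
    """A positive value means the new treatment is cost effective at threshold `wtp`."""
    return wtp * delta_qaly - delta_cost


def one_way(delta_cost, qaly_low, qaly_high, wtp):
    """Univariate analysis: vary one input across its range, hold the rest fixed."""
    return (
        net_monetary_benefit(delta_cost, qaly_low, wtp),
        net_monetary_benefit(delta_cost, qaly_high, wtp),
    )


def psa(n_draws, wtp, seed=1):
    """Probabilistic analysis: draw all uncertain inputs jointly, count favourable draws."""
    rng = random.Random(seed)
    favourable = 0
    for _ in range(n_draws):
        delta_cost = rng.gauss(1_000, 400)  # assumed mean and SD of incremental cost
        delta_qaly = rng.gauss(0.05, 0.03)  # assumed mean and SD of incremental QALYs
        if net_monetary_benefit(delta_cost, delta_qaly, wtp) > 0:
            favourable += 1
    return favourable / n_draws


# One-way: shows how far the result moves when a single input changes (cost drivers).
print(one_way(delta_cost=1_000, qaly_low=0.01, qaly_high=0.09, wtp=30_000))
# PSA: estimates the probability that the treatment is cost effective at the threshold.
print(psa(n_draws=10_000, wtp=30_000))
```

The one-way result identifies which input can flip the conclusion, whereas the PSA result summarises overall decision uncertainty across all inputs at once.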

Most articles took a payer perspective, two chose a societal perspective and some included both. All articles followed their local guidelines for the choice of perspective. Although we do not aim to undermine local guidelines, we argue that the societal perspective is often of added value over the healthcare or payer perspective as it encompasses all costs of an intervention, regardless of who bears them. All perspectives have their particular relevance. The healthcare perspective is limited to costs within the healthcare sector, whereas the payer perspective focuses on the money actually paid and may or may not reflect the actual costs of the resources used. It has been argued that differences between the healthcare and societal perspectives are marginal in COPD because the average age of patients in many studies is higher than the retirement age, so the impact of productivity losses would be negligible [15]. Nevertheless, the societal perspective incorporates more than just productivity losses. It may also include informal care, and travel and time costs to patients, although these are unfortunately often neglected. Obviously, if younger, working-age patients with COPD are studied, productivity losses may have a large impact on study results [50].

4.2.7 Transferability Issues

All but three studies were performed in a European country, with the majority in the UK, Sweden or Germany. Three studies were conducted outside Europe: the first, by the World Health Organization, reported on the cost effectiveness of COPD medication in sub-Saharan Africa and south-east Asia [28]; the second was performed in the USA [19]; and the third in India (Tables 1, 2, 3, 4, 5) [27]. Most studies used cost and effectiveness data from the same country, with the notable exception of Roggeri et al. [29], who applied Italian cost data to Swedish effectiveness data.

4.2.8 Other Issues: Efficacy versus Effectiveness

An important issue not covered by the QHES is whether the cost effectiveness is representative of the real-life population that will eventually use the particular drug. Most papers focused on moderate to very severe COPD (GOLD 2–4) or even only on severe and very severe COPD (GOLD 3–4). Indeed, in GOLD 1 COPD, only short-acting bronchodilators are recommended. The new GOLD A–D classification was seldom used to describe the patient population. In the new GOLD guidelines, clinical manifestations such as the exacerbation history or the extent of chronic bronchitis play a more prominent role in the selection of treatment. Very few studies incorporated this new focus [27, 31]. However, note that the new GOLD guidelines are not particularly modeller friendly, especially compared with previous guidelines that only considered FEV1, which enabled progression predictions.

Another important issue identified was efficacy versus effectiveness. Most studies included in this review used clinical trial efficacy data in their models. However, clinical trial populations do not fully represent real-life populations [51]. The majority of large clinical COPD trials only include patients with significant smoking history (>10 pack years), an FEV1 <70 % and no atopy or asthma comorbidity [51]. However, in real life, about 22 % of patients with COPD are never smokers and 15 % of patients also have asthma [52]. As a result, less than half of the real-life primary care COPD population (42 %) [53] would be eligible to participate in the UPLIFT trial that was frequently used as basis for CEAs of tiotropium [54, 55]. Moreover, efficacy as seen in controlled settings differs from effectiveness during use in daily practice [56]. For example, in contrast to the well-trained patients in clinical trials, real-life patients often have poor inhalation technique and adherence, which is associated with worse health outcomes [57, 58]. This might be of interest with regard to the recent developments towards more convenient dosing regimens and innovative inhalers. In clinical trials, inhaler efficacy has been shown to be comparable [59], resulting in small differences in incremental cost effectiveness based on trial data. However, differences may be revealed in real-life studies [60].

Lastly, we should acknowledge that some drugs are prescribed outside their main indication (off-label prescribing). The extent of this phenomenon and cost effectiveness in those groups is unknown.

In general, most studies used data from large pharmaceutical industry-sponsored RCTs as the basis for their economic evaluations. This might be a suitable approach for a first indication of cost effectiveness, as no long-term real-world data are available at that stage. After ≥1 year of experience in real life, follow-up studies using data from routine clinical or claims databases may be of added value to assess the cost effectiveness during use in daily practice. However, these studies have their own limitations and challenges, such as dealing with bias, a large level of missing data and potentially incorrect diagnoses.

4.3 General Recommendations

The main conclusions and recommendations of the previous review [7] were as follows: perform longer trial-based studies, assess combinations of medication, assess treatment strategies, include a greater diversity of COPD populations, reach consensus on a common structure of COPD models, and use standardized COPD outcomes (such as exacerbations) and their minimal clinically important differences and WTP, in addition to the cost per QALY. The majority of these recommendations are still valid, but we identified some new issues, such as the incorporation of real-life evidence and the need for post-marketing CEAs and the need for COPD-specific methodology and outcomes. In the last decade, we have seen a shift from economic evaluation alongside RCTs to longer-term Markov modelling to individualized modelling. The next step is to validate these models using longitudinal real-world routine practice data. Table 6 presents an overview of our main recommendations from this review.

Table 6 Key recommendations for future chronic obstructive pulmonary disease cost-effectiveness analyses and gaps in research

4.4 Strengths and Limitations

This systematic review has several strengths, including adherence to PRISMA guidelines, a search of four major databases and consistency and interaction with a previous review that included all previous COPD CEAs. However, the study also has some limitations, the first being that we only included articles in Dutch, English and German. Although this included the vast majority of the articles published, and non-English articles are usually of limited added value [61], it may be considered a possible source of bias. We used the QHES score to systematically assess the quality of the studies included. However, as this checklist and its score are open for interpretation and discussion, results should be interpreted with caution. The QHES checklist asks whether a certain part is performed; however, this does not directly reflect the quality of how it is done. This is also a common limitation of other quality-assessment tools, such as CHEERS [12]. Therefore, we also performed additional detailed assessments, focusing on COPD-specific issues and following the recommendations of the previous review. This study should therefore be interpreted as ‘topping up’ the previous work. This means that treating the two reviews as a continuum technically involves a disjoint with the search criteria used in 2011.

5 Conclusions

The majority of CEAs conducted between 2011 and 2015 indicated the cost effectiveness of pharmacologic maintenance treatments for COPD was favourable. However, the number of QALYs gained was generally small. According to the QHES, the quality of the studies was generally sufficient, but studies poorly reflected cost effectiveness in real life. Therefore, in addition to modelling approaches to assess initial cost effectiveness, further studies using data from daily clinical practice seem valuable to assess real-world cost effectiveness.