Background

Long-acting bronchodilators such as long-acting muscarinic antagonists (LAMAs) and long-acting beta-agonists (LABAs) are the main drugs used to treat stable chronic obstructive pulmonary disease (COPD). They improve symptoms and exercise performance by reducing small airway limitation, hyperinflation [1, 2], and exacerbation risk [3,4,5]. Recently, LAMA/LABA combination therapy has been introduced as a more potent treatment than LAMA or LABA alone. Studies and meta-analyses have shown that such combination therapy has a greater effect than monotherapy on lung function, symptoms, quality of life, and acute exacerbations [6,7,8,9,10,11,12,13,14]. In addition, LAMA/LABA combination therapy is associated with less frequent moderate to severe exacerbations than inhaled corticosteroid (ICS)/LABA combination therapy [8, 9, 15], although a large trial contradicted these results [16]. Currently, LAMA/LABA combination therapy is the primary recommendation for patients with stable COPD whose symptoms or exacerbations cannot be controlled using a single long-acting bronchodilator [17].

A variety of LAMA/LABA combinations are available on the market, but it is not clear which is the best choice. All LAMA/LABA therapies are expected to have similar efficacy [10, 18], even though only a few head-to-head trials of relatively short duration have been carried out [19,20,21,22,23]. Instead, several efforts have been made to indirectly compare the efficacy of the LAMA/LABA combinations [24, 25], although no previous studies have compared exacerbation risk or mortality risk, which are the two most important outcomes in patients with stable COPD. Furthermore, no studies have taken into consideration the outcomes of recently conducted large trials [15, 16, 26]. The present systematic review (SR) and Bayesian network meta-analysis (NMA) included all available long-term trials and aimed to compare the efficacy and safety of different LAMA/LABA combinations, focusing on the risk of acute exacerbation and mortality.

Methods

Protocol and registration

We followed the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) extension statement for SRs incorporating NMAs of health care interventions [27], as well as the BayesWatch guidelines for reporting estimated results using Bayesian methods [28]. We registered our study protocol in the International Prospective Register of Systematic Reviews (PROSPERO, CRD42019126753).

Eligibility criteria

We included clinical studies that met the following eligibility criteria: (1) adult patients with stable COPD; (2) treatment with inhalable LAMA/LABA combinations including dual monotherapies and fixed dose combinations; (3) report of acute exacerbations or mortality; (4) parallel, randomized controlled trial (RCT) study design, judged using the criteria of the Design Algorithm for Medical Literature on Intervention [29]; (5) treatment duration of 48 weeks or more; (6) human subjects; (7) publication in English.

Primary and secondary outcomes

Our primary outcome was a comparison of the total exacerbation rate and all-cause mortality rate among LAMA/LABA combinations. Secondarily, we evaluated moderate to severe exacerbation rate, COPD-related mortality, cardiovascular disease-related mortality, major adverse cardiovascular events (MACE), and pneumonia.

Information sources and search

We searched MEDLINE, EMBASE, and the Cochrane Central Register of Controlled Trials, following a pre-established study protocol and search strategy (search date: July 1, 2019). In addition, we referred to the US National Library of Medicine, the EU Clinical Trial Register, the AstraZeneca Clinical Trials website, the Boehringer Ingelheim clinical study results website, the GlaxoSmithKline Study Register, and the Novartis clinical trial results website. We also contacted authors and representatives of pharmaceutical companies, including GlaxoSmithKline, Boehringer Ingelheim, AstraZeneca, Novartis, and Kolon, to obtain additional data. We conducted manual searches using the study identifiers or references of previous SRs. When designing the search strategy, we referred to the Peer Review of Electronic Search Strategies (PRESS) checklist [30]. The search terms were “COPD” AND inhaled drugs (“LAMA” AND “LABA”) AND randomized controlled design, which included controlled vocabulary and free text. The LAMAs included aclidinium, glycopyrrolate, tiotropium with a dry powder inhaler or soft mist inhaler, and umeclidinium. The LABAs included formoterol, indacaterol, olodaterol, salmeterol, and vilanterol. A detailed version of the search strategy can be found in both Additional file 1 and PROSPERO.

Study selection

We screened and reviewed studies according to the PRISMA flow diagram [31]. Duplicated studies were removed based on the title, abstract, and name of the authors. Independent reviewers (H.W.L./J.M.P.) conducted calibration exercises by title and abstract to improve inter-observer reliability, with a sample of 200 randomly selected studies (agreement = 97%, Cohen’s kappa = 0.81). The two reviewers individually screened the abstracts and titles of all potentially eligible studies and performed a full-text review to assess whether the screened studies met the pre-established eligibility criteria. Any conflicts or disagreements regarding study selection were resolved by referring to the original articles and discussing them with a third reviewer (C.H.L.).
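The agreement statistics reported for the calibration exercise can be illustrated with a minimal Python sketch of Cohen's kappa. The 2 × 2 counts below are hypothetical and were chosen only so that raw agreement on 200 records equals 97%; they are not the reviewers' actual screening data, and the function name is an assumption made for illustration.

```python
def cohens_kappa(table):
    """Cohen's kappa for a square agreement table.

    table[i][j] = number of records that reviewer 1 placed in category i
    and reviewer 2 placed in category j.
    """
    n = sum(sum(row) for row in table)
    k = len(table)
    # Observed agreement: proportion of records on the diagonal.
    p_observed = sum(table[i][i] for i in range(k)) / n
    # Chance agreement computed from the marginal totals.
    row_totals = [sum(row) for row in table]
    col_totals = [sum(table[i][j] for i in range(k)) for j in range(k)]
    p_expected = sum(row_totals[i] * col_totals[i] for i in range(k)) / (n * n)
    return (p_observed - p_expected) / (1 - p_expected)


# Hypothetical counts (NOT the study data).
# Rows: reviewer 1 (include, exclude); columns: reviewer 2 (include, exclude).
example_table = [[14, 3], [3, 180]]
raw_agreement = (14 + 180) / 200
kappa = cohens_kappa(example_table)
print(f"agreement = {raw_agreement:.2f}, kappa = {kappa:.2f}")
```

Because most screened records are excluded by both reviewers, chance agreement is high, which is why kappa is noticeably lower than raw agreement.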

Data collection and data items

We coordinated the data collection methods and pre-piloted formats to assess study quality and synthesize the study outcomes. Independent reviewers (H.W.L./J.M.P.) extracted the following data items: (1) basic study information (e.g., year of study, study duration, device used for treatment, study outcomes, and number of patients included in intention-to-treat analysis); (2) baseline characteristics of the study population (e.g., age, sex, body mass index (BMI), smoking status, and ethnicity); (3) clinical information of the study population (e.g., time since COPD diagnosis, severity of COPD, mean post-bronchodilator forced expiratory volume in the first second (FEV1), history of total exacerbations in the past year, patients with a history of ≥ 2 total exacerbations or ≥ 1 severe exacerbations in the past year, modified Medical Research Council dyspnea scale score, and COPD Assessment Test score); (4) study outcomes (e.g., number of patients experiencing any COPD exacerbation or number experiencing moderate to severe exacerbation, number of all-cause deaths and cause of death, number of patients with MACEs, and number of patients with pneumonia until the last follow-up). If the absolute number of patients was not available, we recovered the raw data by digitizing the Kaplan–Meier curve of the time to first acute exacerbation [32]. The severity of COPD exacerbation was either assessed using the Exacerbations of Chronic Pulmonary Disease Tool [33] or estimated in terms of healthcare resource use [34]. Any disagreement regarding the data extraction process was resolved by discussion.

Network geometry

Two different network geometries were used in the present NMA. In network (A), the NMA was conducted under the assumption that efficacy and safety differ significantly between individual drugs or their combinations within the same drug class. Network (A) represented individual drugs or their combinations as nodes, and a direct comparison of two different treatments in an RCT as a link between nodes. In network (B), the NMA was conducted under the assumption that efficacy and safety do not differ between individual drugs or their combinations within the same drug class, other than LAMA/LABA. Network (B) grouped all inhaled treatments other than LAMA/LABA by drug class (ICS/LAMA/LABA, ICS/LABA, LAMA, and LABA) and represented each class as a single node. Network (B) was applied in the NMA of total exacerbation and all-cause mortality, which were the major outcomes of the present study. The number of direct treatment comparisons was expressed as the thickness of the edges between the nodes.
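The difference between the two geometries can be sketched in Python by counting edges with and without collapsing comparator arms into drug-class nodes. The trial list, the class mapping, and the function names below are hypothetical and serve only as an illustration; they do not reproduce the 16 included RCTs.

```python
from collections import Counter
from itertools import combinations


def edge_counts(trials, collapse=None):
    """Count direct comparisons (edges) between nodes.

    trials: list of tuples of arm labels, one tuple per RCT.
    collapse: optional dict mapping an arm label to a drug-class node.
    """
    collapse = collapse or {}
    counts = Counter()
    for arms in trials:
        # Arms that map to the same node form no edge with each other.
        nodes = sorted({collapse.get(arm, arm) for arm in arms})
        for pair in combinations(nodes, 2):
            counts[pair] += 1
    return counts


def common_comparators(counts, a, b):
    """Nodes directly compared with both a and b (bridges for indirect comparison)."""
    def neighbours(node):
        return {y if x == node else x for x, y in counts if node in (x, y)}
    return neighbours(a) & neighbours(b)


# Toy trial list (hypothetical, for illustration only).
TOY_TRIALS = [
    ("UME/VIL", "TIO"),
    ("GLY/IND", "GLY"),
    ("GLY/IND", "FLU/SAL"),
    ("UME/VIL", "FLU/VIL"),
]
# Network (B)-style grouping: only non-LAMA/LABA arms are collapsed.
TO_CLASS = {"TIO": "LAMA", "GLY": "LAMA", "FLU/SAL": "ICS/LABA", "FLU/VIL": "ICS/LABA"}

network_a = edge_counts(TOY_TRIALS)
network_b = edge_counts(TOY_TRIALS, TO_CLASS)
print(common_comparators(network_a, "GLY/IND", "UME/VIL"))
print(common_comparators(network_b, "GLY/IND", "UME/VIL"))
```

In this toy example the two LAMA/LABA nodes share no comparator when every drug is its own node, but share the LAMA and ICS/LABA class nodes once comparators are grouped, which is the trade-off between the two assumptions.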

Risk of bias within and across individual studies

Two reviewers (H.W.L./J.M.P.) independently appraised the risk of bias in each of the included studies in terms of the seven domains defined in the Cochrane Risk-of-Bias tool [35]. Any controversy regarding this risk of bias assessment was discussed with the other author (C.H.L.).

Data synthesis and analysis

The present Bayesian NMA was conducted using a random effects model with a heterogeneous variance structure [36, 37] because we found more than two LAMA/LABA therapy regimens and assumed that the variance of the odds ratio of each individual treatment (LAMA/LABA) compared with the baseline treatment (tiotropium) differed across treatments. The prior distributions of the Bayesian model parameters were assumed to be non-informative and to have normal or uniform distributions [38]. We estimated the relative probability of being the best treatment based on the surface under the cumulative ranking curve (SUCRA) [38]. The median value of the posterior odds ratio (OR), with 95% credible intervals (CrIs), and the posterior probability of the OR exceeding 1 (P[OR > 1]) were estimated to identify the relationship between each inhaled drug and clinical outcomes. A result was considered statistically significant when P(OR > 1) was less than 0.025 or more than 0.975. An OR greater than 1 in a pairwise comparison indicated that the comparator group (upper side of the league table) was more beneficial than the treatment group (left side of the league table). Additionally, pairwise meta-analyses were conducted using the random effects model for each direct comparison, and the results were presented as ORs with 95% confidence intervals (CIs). Sensitivity analysis was conducted according to network geometry (A and B) and cause of death (COPD-related and cardiovascular disease-related mortality).
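The posterior summaries described above can be sketched in Python as follows. This is a minimal illustration that assumes posterior draws of the log OR versus a common reference are already available for each treatment; the draws are simulated, the treatment labels are hypothetical, and the function names are not taken from the WinBUGS analysis.

```python
import math
import random
import statistics


def sucra(draws):
    """SUCRA per treatment from posterior draws.

    draws: dict mapping treatment name -> list of posterior log ORs versus a
    common reference (lower = fewer events = better). Equal-length lists.
    """
    names = list(draws)
    n_treat = len(names)
    n_iter = len(draws[names[0]])
    rank_sum = {name: 0 for name in names}
    for i in range(n_iter):
        # Rank 1 goes to the treatment with the lowest log OR in this draw.
        ordered = sorted(names, key=lambda name: draws[name][i])
        for rank, name in enumerate(ordered, start=1):
            rank_sum[name] += rank
    # SUCRA = (a - mean rank) / (a - 1), with a = number of treatments.
    return {name: (n_treat - rank_sum[name] / n_iter) / (n_treat - 1)
            for name in names}


def or_summary(log_or_draws):
    """Posterior median OR with the 2.5th and 97.5th percentiles (95% CrI)."""
    ors = [math.exp(d) for d in log_or_draws]
    cuts = statistics.quantiles(ors, n=40)  # cut points every 2.5%
    return statistics.median(ors), cuts[0], cuts[-1]


def prob_or_above_one(log_or_draws):
    """Posterior probability that the OR exceeds 1."""
    return sum(d > 0 for d in log_or_draws) / len(log_or_draws)


def is_significant(p):
    """Decision rule stated in the text: P(OR > 1) < 0.025 or > 0.975."""
    return p < 0.025 or p > 0.975


# Simulated draws for hypothetical treatments (illustration only).
rng = random.Random(0)
draws = {
    "reference": [0.0] * 5000,
    "treatment_x": [rng.gauss(-0.30, 0.10) for _ in range(5000)],
    "treatment_y": [rng.gauss(-0.05, 0.10) for _ in range(5000)],
}
scores = sucra(draws)
median_x, lower_x, upper_x = or_summary(draws["treatment_x"])
p_x = prob_or_above_one(draws["treatment_x"])
p_y = prob_or_above_one(draws["treatment_y"])
```

SUCRA here uses the mean-rank form, which equals the average of the cumulative ranking probabilities over the first a − 1 ranks; a value near 1 means a treatment is almost always ranked best.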

The parameters were estimated using the Markov Chain Monte Carlo (MCMC) algorithm in WinBUGS version 1.4.6 (Imperial College and Medical Research Council, UK). Convergence of the MCMC algorithm was checked using trace plots, autocorrelation plots, and Gelman–Rubin statistics. We discarded the first 20,000 iterations to eliminate the initial value effect and selected 10,000 samples from the MCMC algorithm in two chains after applying the appropriate thinning rate to satisfy the autocorrelation assumption.
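The burn-in, thinning, and Gelman–Rubin check can be illustrated with a short Python sketch. The chains below are toy autocorrelated sequences rather than WinBUGS output; the thinning rate of 10 and the even split of the 10,000 retained samples across the two chains are assumptions, since neither is specified above.

```python
import random
import statistics


def keep_after_burn_in(chain, burn_in, thin):
    """Drop the first `burn_in` iterations, then keep every `thin`-th draw."""
    return chain[burn_in::thin]


def gelman_rubin(chains):
    """Potential scale reduction factor (R-hat) for equal-length chains."""
    n = len(chains[0])
    chain_means = [statistics.fmean(c) for c in chains]
    # W: mean within-chain variance; B: n times the variance of chain means.
    within = statistics.fmean([statistics.variance(c) for c in chains])
    between = n * statistics.variance(chain_means)
    pooled = (n - 1) / n * within + between / n
    return (pooled / within) ** 0.5


def simulate_chain(start, n_iter, rng):
    """Toy AR(1) sequence standing in for one MCMC chain."""
    x = start
    out = []
    for _ in range(n_iter):
        x = 0.9 * x + rng.gauss(0.0, 1.0)
        out.append(x)
    return out


BURN_IN = 20_000         # from the text
THIN = 10                # assumed; the thinning rate is not reported
KEPT_PER_CHAIN = 5_000   # assumed split of the 10,000 retained samples

rng = random.Random(1)
raw = [simulate_chain(start, BURN_IN + THIN * KEPT_PER_CHAIN, rng)
       for start in (10.0, -10.0)]  # deliberately dispersed starting values
kept = [keep_after_burn_in(chain, BURN_IN, THIN) for chain in raw]
r_hat = gelman_rubin(kept)
print(f"retained draws = {sum(len(c) for c in kept)}, R-hat = {r_hat:.3f}")
```

An R-hat close to 1 indicates that the chains have mixed; values well above 1 would suggest extending the burn-in or the run length.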

We reviewed the baseline characteristics of the eligible trials and the demographic characteristics of patients to monitor the homogeneity and similarity assumptions. Publication bias was investigated using funnel plots and Egger’s test in the direct comparisons that included ≥ 3 RCTs. The consistency assumption, which states that the direct estimates are consistent with the indirect estimates, is another main assumption in an NMA, and this was assessed using the node-splitting method [39]. Heterogeneity was assessed based on the posterior median of the standard deviation (SD) between the studies. An SD close to 0 indicates small heterogeneity, while an SD > 1 indicates substantial heterogeneity [40, 41].
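The regression underlying Egger's test can be sketched in Python as below. This is a simplified illustration: it returns only the intercept and its standard error, the input values are made up, and the function name is an assumption. The formal test compares intercept / SE with a t distribution on k − 2 degrees of freedom, a step omitted here.

```python
import math


def egger_intercept(effects, standard_errors):
    """Intercept of the Egger regression and its standard error.

    Regresses the standardised effect (effect / SE) on precision (1 / SE)
    by ordinary least squares. An intercept far from zero suggests
    funnel-plot asymmetry.
    """
    k = len(effects)
    if k < 3:
        # Mirrors the rule in the text: only comparisons with >= 3 RCTs.
        raise ValueError("need at least 3 studies")
    y = [e / s for e, s in zip(effects, standard_errors)]
    x = [1 / s for s in standard_errors]
    x_bar = sum(x) / k
    y_bar = sum(y) / k
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / s_xx
    intercept = y_bar - slope * x_bar
    residual_ss = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    residual_var = residual_ss / (k - 2)
    se_intercept = math.sqrt(residual_var * (1 / k + x_bar ** 2 / s_xx))
    return intercept, se_intercept


# Made-up log ORs (not study data). In the second set, the effect drifts
# with the standard error, mimicking small-study effects.
ses = [0.10, 0.20, 0.30, 0.40, 0.50]
symmetric = [-0.2 for _ in ses]
asymmetric = [-0.2 + 1.5 * s for s in ses]
intercept_sym, _ = egger_intercept(symmetric, ses)
intercept_asym, _ = egger_intercept(asymmetric, ses)
```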

Certainty of evidence

The certainty of evidence was rated using the GRADE (Grading of Recommendations, Assessment, Development and Evaluations) approach [42].

Role of the funding source

The funding source had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

Results

Study selection and network geometry

Of the 5718 articles retrieved, 3696 remained after the removal of duplicates, and 127 were found to be potentially relevant after screening by title and abstract (Additional file 2). After a full-text review, we found 16 articles that met the eligibility criteria of the present SR. The excluded articles are listed in Additional file 3. Network geometries (A) and (B), addressing total exacerbations and all-cause mortality, respectively, are graphically expressed in Fig. 1.

Fig. 1

Network geometry of the direct comparisons in the 16 eligible RCTs. Network (a) represents individual drugs or their combinations as nodes, and a direct comparison of two different treatments in an RCT as a line between nodes. Network (b) groups all inhaled treatments other than individual LAMA/LABAs by drug class (ICS/LAMA/LABA, ICS/LABA, LAMA, and LABA) and represents each class as a node. The number of direct comparisons is shown in the middle of each line between nodes. ACL aclidinium, BEC beclomethasone, FOR formoterol, FLU fluticasone, GLY glycopyrrolate, IND indacaterol, OLO olodaterol, PBO placebo, SAL salmeterol, TIO tiotropium, UME umeclidinium, VIL vilanterol

Study characteristics

The baseline characteristics of the eligible studies are summarized in Table 1. Among the 39,065 patients included in 16 RCTs published between 2007 and 2018, six LAMA/LABA combinations were identified; 6079 patients were in the tiotropium/olodaterol arm, 4334 were in the glycopyrrolate/indacaterol arm, 2296 were in the umeclidinium/vilanterol arm, 1060 were in the aclidinium/formoterol arm, 1035 were in the glycopyrrolate/formoterol arm, and 148 were in the tiotropium/salmeterol arm. The patients’ mean age was 64.8 years; 68.5% of them were men and 39.3% were current smokers. The dominant ethnicity was white or Caucasian. The patients’ mean post-bronchodilator FEV1 percentage was 45.6%, and no patients with mild COPD were found; specifically, eight RCTs enrolled patients with GOLD grade II–III, seven enrolled patients with GOLD grade II–IV, and two enrolled patients with GOLD grade III–IV. The treatment duration was 52 weeks in 15 RCTs and 64 weeks in one RCT. The patients used a dry powder inhaler in 12 RCTs, a soft mist inhaler in four RCTs, and a metered-dose inhaler in one RCT.

Table 1 Baseline characteristics of the included 16 studies

Risk of bias within studies and across studies

The risk of bias was assessed and considered acceptable for our NMA (Additional file 4). No substantial risk of bias was detected in the random sequence generation or allocation concealment applied in the included studies. Blinding of participants and personnel was well conducted in most of the included RCTs. Our primary and secondary outcomes were unlikely to be influenced by incomplete outcome data because the reasons for withdrawal or loss to follow-up were balanced and because outcome assessment was conducted in the intention-to-treat populations. Bias was rarely found from selective reporting of outcomes or any other sources. In analyses exploring the potential for risk of bias across studies, publication bias and selective reporting were not found (Additional file 5).

Total exacerbations

We analyzed 39,065 patients in 16 RCTs to compare efficacy among individual LAMA/LABA combinations in terms of reduction in total exacerbation. In this regard, umeclidinium/vilanterol was ranked first according to SUCRA, followed by glycopyrrolate/formoterol. Compared with tiotropium monotherapy, umeclidinium/vilanterol led to fewer exacerbations (Fig. 2). In addition, in network (A), umeclidinium/vilanterol was significantly superior in terms of total exacerbation risk to tiotropium/olodaterol, aclidinium/formoterol, and glycopyrrolate/indacaterol with a moderate level of evidence, and to tiotropium/salmeterol with a low level of evidence (Table 2, Additional file 6). In network (B), umeclidinium/vilanterol no longer showed significant benefits in this regard (Additional file 7).

Fig. 2

Forest plots of the risk of total exacerbations, moderate to severe exacerbations, and all-cause mortality. The risks are expressed as estimated odds ratios with 95% credible intervals compared to tiotropium. CrI credible interval

Table 2 Results of Bayesian network meta-analyses for exacerbation, mortality, and adverse events among LAMA and LABA combinations compared to tiotropium monotherapy in the network (A)

Moderate to severe exacerbations

We analyzed 27,489 patients in seven RCTs to compare efficacy among individual LAMA/LABA combinations in terms of reducing moderate to severe exacerbations. Tiotropium/salmeterol was not analyzed because limited data were available. No significant differences were found among the LAMA/LABA combinations in reducing moderate to severe exacerbations (Table 2).

All-cause mortality

We analyzed 39,065 patients in 16 RCTs to compare efficacy among individual LAMA/LABA combinations in terms of reducing all-cause mortality. No LAMA/LABA combination was found to be superior to any others in terms of reducing all-cause mortality (network (A), Table 2; network (B), Additional file 7). In the sensitivity analyses of COPD-related and cardiovascular disease-related mortality, no significant results were found.

Adverse events

We analyzed 20,051 patients in nine RCTs to compare the risk of MACE among individual LAMA/LABA combinations. Umeclidinium/vilanterol and tiotropium/salmeterol were not analyzed because limited data were available. We found no significant differences in the risk of MACE among LAMA/LABA combinations (Table 2). In addition, 39,065 patients in 16 RCTs were evaluated to compare the risk of pneumonia among different LAMA/LABA combinations. There was no significant difference in the risk of pneumonia among LAMA/LABA combinations (Table 2).

Consistency assumption

The posterior effect size estimated by comparison in the present NMA was consistent with the results of the direct comparison approach (Additional file 8). In the inconsistency evaluation, most of the results satisfied the consistency assumption.

Discussion

Our SR compared the efficacy and safety of LAMA/LABA combinations using Bayesian NMA. Under the assumption that pharmacologic actions differ between individual drugs or their combinations within the same drug class (network (A)), umeclidinium/vilanterol was the most effective treatment in terms of reducing total exacerbation events among the LAMA/LABAs, except glycopyrrolate/formoterol, with a low or moderate level of evidence. However, no significant differences were observed under the assumption that pharmacologic actions do not differ within the same drug class other than LAMA/LABA (network (B)). Until now, it has not been clarified whether there are any differences in pharmacologic effects between individual drugs or their combinations within the same drug class other than LAMA/LABA in the treatment of COPD patients [43]. Furthermore, all-cause mortality, moderate-to-severe exacerbation, and the rate of adverse events did not differ among the LAMA/LABAs in our NMA. Therefore, our NMA suggests that there is no strong evidence of different benefits among LAMA/LABAs in reducing the risk of exacerbation.

In previous NMAs, umeclidinium/vilanterol and glycopyrrolate/indacaterol showed better efficacy than aclidinium/formoterol for improving lung function [24, 25], while tiotropium/olodaterol showed better efficacy than umeclidinium/vilanterol for reducing symptoms [25]. In another NMA, umeclidinium/vilanterol showed better lung function compared to tiotropium/olodaterol, aclidinium/formoterol and tiotropium/formoterol, although that study should be interpreted with caution because it was commercially sponsored and its methodology was not described [44]. Another NMA compared the exacerbation and mortality risk among various inhaled drugs, but the only actual comparison between LAMA/LABA regimens was between glycopyrrolate/indacaterol and tiotropium/salmeterol, which yielded non-significant results [43]. In a recently published NMA, glycopyrronium/formoterol, glycopyrronium/indacaterol, aclidinium/formoterol, and umeclidinium/vilanterol showed similar efficacy in reducing exacerbation during 24 weeks of treatment, but lung function improved more in the glycopyrronium/formoterol group. However, our NMA, which included only RCTs with a treatment duration ≥ 48 weeks, showed that the risk of acute exacerbation was significantly lower with umeclidinium/vilanterol than with tiotropium/olodaterol, aclidinium/formoterol, glycopyrrolate/indacaterol, and tiotropium/salmeterol.

There are differences in the onset of action, duration of effect, and specificity at the receptor or effector among LABAs [45] and LAMAs [18]. A previous NMA reported that indacaterol was more effective than other LABAs at improving trough FEV1 and symptoms [46], and two NMAs showed differences in treatment outcomes between LAMAs [25, 47]. Considering these results, it seems likely that different LAMA/LABAs have different clinical efficacy. In addition, combined LAMAs and LABAs may have synergistic actions [48, 49] that differ according to the combination used.

In the present study, when individual inhaled drugs or combination therapies were compared (network (A)), total exacerbation was reduced more in patients who used umeclidinium/vilanterol than in those who used other LAMA/LABAs. This result is consistent with previous studies. In a short-term head-to-head RCT, umeclidinium/vilanterol showed better efficacy than tiotropium/olodaterol in improving trough FEV1 at week 8 [20]. However, the superiority of umeclidinium/vilanterol in reducing total exacerbation among different LAMA/LABAs disappeared in the analysis comparing individual LAMA/LABAs with drug classes (network (B)). Network (B) allows more nodes to be included in NMAs, but it assumes that all drugs within each of the other drug classes have equal efficacy and safety. In fact, glycopyrrolate/indacaterol was more effective than ICS/LABA in the FLAME trial, and ICS/LABA was more effective than umeclidinium/vilanterol in the IMPACT trial. The ICS/LABA combinations were different in those two studies, so the NMA using network (A) in the present study did not use the data from those studies when comparing glycopyrrolate/indacaterol with umeclidinium/vilanterol. However, the NMA using network (B) used both studies. Considering that the geometry of network (B) may be more desirable than that of network (A) in terms of reduced imprecision, we should be cautious before declaring the superiority of umeclidinium/vilanterol based on the results of the present NMA.

The present SR with NMA had several strengths. First, to our knowledge, this study was a novel attempt to estimate the comparative efficacy and safety of various LAMA/LABA combinations. No previous NMAs primarily evaluated acute exacerbation, mortality, and adverse events because these outcomes are rare and would yield low statistical power. We believe that exacerbation and mortality are more direct clinical outcomes in patients with stable COPD, although lung function decline and respiratory symptoms are also important clinical outcomes. The present study used Bayesian methods to perform an appropriate analysis of rare events and estimated the value of SUCRA as a numeric presentation of the overall ranking associated with each treatment. It may not be a coincidence that the same agent (umeclidinium/vilanterol) ranked first in SUCRA in terms of reduction in both total exacerbation and all-cause mortality. This result may constitute good evidence for generating new hypotheses. Second, the present study used pooled data from RCTs with a study duration ≥ 48 weeks. Previous NMAs have mainly focused on lung function decline and respiratory symptoms in RCTs within 24 weeks [24, 25]. In a pairwise meta-analysis conducted by Calzetta et al., treatment duration was an important factor affecting the efficacy of inhaled treatment to reduce respiratory symptoms [12]. Considering that COPD patients require lifelong treatment, an effect size estimated from longer-term clinical outcomes would be more reliable when considering which inhaled treatment to prescribe. In our pilot study, pooling the different study periods together showed significant statistical discrepancies between the outcome estimated by indirect comparison and the outcome of the direct comparison. In fact, international industry guidance states that a treatment duration of at least 1 year is needed to modify exacerbations [50]. Through rigorous statistical methods, we found that the estimated results were more reliable when analyzed in RCTs conducted for ≥ 48 weeks. Third, our NMA used the two networks (A and B) under different assumptions that were mutually complementary. Currently, it has not been clearly demonstrated whether there are any differences in pharmacologic effects between individual drugs or their combinations within the same drug class other than LAMA/LABA in the treatment of COPD patients [43]. Network (A) has the same individual inhaled drug (e.g., tiotropium) mediating the indirect comparison between different LAMA/LABAs, so it is more advantageous in terms of comparability or transitivity. However, in network (B), imprecision can be reduced in the NMA for all-cause mortality because more studies can be included. Therefore, our study was able to take advantage of the two assumptions and draw a balanced conclusion from the results of the two NMAs.

Our study has limitations. First, our meta-analysis included RCTs of patients with different characteristics, which may have introduced bias. For example, the IMPACT trial showed better efficacy of ICS/LABA than LAMA/LABA, while the FLAME trial reported better efficacy of LAMA/LABA than ICS/LABA. There are concerns that the contradictory results may depend on whether asthmatics were excluded [51]; the IMPACT trial did not exclude patients with a history of asthma, which could favour ICS-containing treatment (ICS/LABA) over bronchodilators alone (LAMA/LABA). However, it seems doubtful that more asthma patients were actually enrolled in the IMPACT trial than in the FLAME trial, given that only 18% of participants in the IMPACT trial showed positive bronchodilator reversibility (bronchodilator response > 12%/200 mL), while 45% of those in the FLAME trial did. In addition, in our study, 6 RCTs enrolled only patients with a history of more than one previous exacerbation, but other studies included patients with heterogeneous exacerbation histories. Since previous exacerbation history was reported as the most important risk factor for exacerbation in the ECLIPSE study [52], comparisons between studies with different exacerbation histories may affect the results. Second, inconsistent results in the analyses of total exacerbation and moderate to severe exacerbation implied between-study heterogeneity in defining and classifying acute exacerbations. It was difficult to identify whether the method of defining and classifying exacerbations was identical in each RCT. Third, mortality and safety outcomes were evaluated by including only studies conducted for 48 weeks or more, but the number of events was small, resulting in extremely wide CIs. Therefore, no definite conclusion can be drawn from the data obtained on mortality and adverse events.

Conclusions

Our NMA including all available RCTs showed that there is no strong evidence of different benefits among LAMA/LABAs in patients with stable COPD who have been followed up for 48 weeks or more. Physicians may choose any LAMA/LABA according to availability or the preference of individual patients in the treatment of stable COPD.