Key Points

Marginal structural models (MSMs) are informative, account for poor treatment adherence or switching, and allow inclusion of more patients than simply censoring patients in an ‘as protocol’ (AP) analysis. MSMs provide insights into time-varying factors occurring after initiation of disease controller therapy that may affect treatment choices or outcomes, and are not apparent in intent-to-treat (ITT) or AP analyses.

Retrospective comparative effectiveness studies using ITT or AP analysis methods often fail to include treatment adherence or switching in their analyses, leading to biased effect estimates. Other time-varying factors such as acute exacerbations of chronic disease can affect treatment decisions and outcomes, and thus also introduce biases. In this analysis of retrospective data from two regional health systems, we demonstrate that failure to account for treatment adherence can make the outcomes of patients who use controller therapies concurrently appear to be significantly worse than those of patients who use these treatments independently.

Based on this effectiveness study, we find that MSMs may be a useful and informative complementary analysis to include in studies of treatment effectiveness in chronic disease where time-varying confounding is present and switching between treatments is common.

1 Introduction

In observational comparative effectiveness studies of chronic disease treatments, treatment effectiveness is assessed over a specified time period that starts with an ‘index date’ when treatment is initiated. Regression analyses are used to assess differences in outcomes between treatments, and traditionally the covariates considered in the regression analyses are baseline factors: sociodemographic characteristics and clinical factors assessed at the index date and a period prior to the index date. Many studies use an intent-to-treat (ITT) approach in which all outcomes post-index are attributed to the index treatment in an attempt to mimic randomized clinical trials (RCTs) [1–4]. In an ITT analysis, the focus is on effectiveness of that initial treatment decision, irrespective of whether subjects persist with treatment post-index. For the most part, persistence to treatment is not problematic in RCT analyses. If it is, an ‘as protocol’ (AP) sensitivity analysis, in which subjects who discontinued their assigned treatment are censored, is normally conducted. These traditional analysis methods may also use propensity matching to select comparison groups similar in measured baseline factors, although propensity matching may result in a decreased sample size. Additionally, propensity analyses often provide similar results to traditional regression analyses when using the same measured covariates [5, 6]. When treatment switching is prevalent, differences in treatment effectiveness in ITT analyses will tend toward the null; poor outcomes may result from poor persistence to treatment, and effectiveness will be attenuated. When time-varying confounding is also present, traditional methods of estimating effectiveness may not adequately control for bias.

Time-varying confounders are factors that relate not only to past, current, and future treatment choices, but also to outcomes of interest [7]. A well-known example is the time-varying factor CD4 lymphocyte count for zidovudine (AZT) treatment for individuals with human immunodeficiency virus (HIV). AZT has an impact on the CD4 lymphocyte count, but the CD4 lymphocyte count is also a factor in the decision to initiate AZT treatment as well as being significantly associated with mortality [7]. Traditional methods of analysis that consider only baseline characteristics (at the index date or from a prior period) have been found not to provide an adequate assessment of treatment effectiveness in the presence of time-varying confounders [7].

A marginal structural model (MSM) is a method of handling time-varying confounding [7–11]. MSMs describe causal effects (structural models) and produce population-average effect estimates (marginal estimates). MSMs are weighted, repeated measures analyses in which treatment is modeled as a time-varying covariate post-index [12]. By accommodating time-varying treatment choices and events, MSMs may provide better assessments of effectiveness where substantial switching or discontinuation of treatments occurs [13]. Weights balance confounding characteristics across treatment groups and incorporate informative censoring (e.g., loss to follow-up, treatment discontinuation), creating a balanced ‘pseudo-population’ similar to that achieved through an RCT randomization process. Using inverse-probability treatment weights, similar to propensity score weights, is common for observational MSM studies [5, 14–17].

Poor persistence to therapy is often a problem among patients with chronic disease [18–21]. Switching or discontinuation of treatment can be widespread, creating challenges in determining unbiased estimates of treatment effectiveness in retrospective studies, particularly when associated with time-varying confounders. Treatment for chronic obstructive pulmonary disease (COPD) is a case in point. An objective of COPD treatment is to reduce the occurrence and severity of exacerbations, periods of acute worsening of chronic respiratory symptoms (i.e., shortness of breath, wheezing, or cough) that can be life threatening and result in permanent loss of lung function [22]. However, one indication for long-acting bronchodilator (LABD) and inhaled corticosteroid (ICS) treatments is a history of exacerbations. Exacerbations tend to become more frequent and more severe as COPD progresses [23], and, as in other chronic conditions, when patients are not experiencing symptoms and adverse events, adherence often becomes poor [18, 19, 24, 25]. For patients with COPD, exacerbation experience and prior treatment are confounding factors for treatment and outcome events.

MSMs could aid in addressing challenges inherent in observational chronic disease studies, and more specifically for COPD, in comparing long-acting treatments. Of late, there has been interest in COPD in comparisons between an ICS/long-acting beta-agonist (ICS/LABA) combination therapy or long-acting muscarinic antagonist (LAMA) monotherapy, and triple therapy (use of both concurrently). There have been one RCT [26] and two observational studies [3, 4] that compared the effectiveness of triple therapy with use of either therapy alone in reducing exacerbations over at least a 1-year period. The RCT reviewed outcomes up to 1 year post-index [26]. Findings suggested fewer exacerbations occurred among the triple therapy group, but sample sizes were modest (approximately 150 in each group) and substantial numbers discontinued therapy [26]. The two observational studies found triple therapy use beneficial. One compared triple therapy with LAMA [3], the other with ICS/LABA [4]. For the LAMA comparison, analyses were ITT, and therapy discontinuation for treatment groups was not reported. In the ICS/LABA comparison, mean follow-up time was 4.65 years, but no information was provided on the degree to which patients discontinued or switched medications post-index [4]. Two sensitivity analyses were conducted, a propensity-matched analysis and an analysis that considered the triple therapy component tiotropium as a time-dependent covariate within a Cox-regression analysis, both of which affirmed a reduced risk for triple therapy [4]. Only cardiovascular disease and diabetes mellitus were included as baseline factors, and details were not provided on the time-dependent analysis [4]. Neither study considered moderate exacerbations and symptoms of unstable disease, such as use of relief medication, as time-varying confounders for treatment and experience of severe exacerbation.

The purpose of this study was to examine the benefits of using the MSM approach to conduct an as-treated analysis that adjusted for time-varying confounding as compared with the commonly used ITT or AP methods in a retrospective observational study of COPD treatments. Specifically, we used claims data from two large Southwestern United States health systems to examine the effectiveness of COPD treatments on severe exacerbations among patients using either ICS/LABA or LAMA alone in comparison with patients using triple therapy, and compared the results of the MSM approach with those from ITT and AP study analyses.

2 Methods

This retrospective observational study compared patients receiving triple therapy with those receiving either ICS/LABA or LAMA therapy. Data consisted of administrative claims from two managed care plans for July 1, 2004 through September 30, 2012. Subjects were followed up to 24 months after treatment initiation (index date). Institutional Review Board approval came from The University of New Mexico Health Science Center, Human Research Protections Office.

Subjects were age ≥40 years, initiating ICS/LABA and/or LAMA therapies, with at least one COPD hospitalization or two COPD outpatient encounters (emergency department or clinic visit) pre-index (see electronic supplementary material [ESM]).

In tracking ICS/LABA and LAMA medication use post-index, an estimated days’ supply was calculated that allowed for use at 50 % of the optimal rate (e.g., a 30-day supply could cover up to 60 days).
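The allowance described above can be illustrated with a minimal sketch. It assumes each fill may be stretched to twice its days' supply and that a subject is considered to have discontinued once a new fill does not arrive before that stretched coverage runs out. The function name, the tuple layout for fills, and the `stretch` parameter are hypothetical and are not taken from the study's own programs.

```python
from datetime import date, timedelta

def discontinuation_date(fills, stretch=2.0):
    """Return the date on which stretched coverage lapses, or None if no fills.

    fills: list of (fill_date, days_supply) tuples for one subject.
    stretch: multiplier on days' supply (2.0 tolerates use at 50 % of optimal).
    """
    if not fills:
        return None
    fills = sorted(fills)
    first_date, first_supply = fills[0]
    covered_until = first_date + timedelta(days=int(first_supply * stretch))
    for fill_date, days_supply in fills[1:]:
        if fill_date > covered_until:
            break  # the gap exceeded the allowance, so later fills are ignored
        new_end = fill_date + timedelta(days=int(days_supply * stretch))
        covered_until = max(covered_until, new_end)
    return covered_until
```

For example, a 30-day fill on 1 January 2010 followed by another on 25 February 2010 keeps the subject persistent until 26 April 2010 under this rule; a third fill on 1 June 2010 would arrive after coverage had lapsed.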

Severe exacerbations were defined as hospitalizations due to COPD or a respiratory-related diagnosis. Moderate exacerbations were defined as a need for systemic corticosteroids and/or antibiotics, and lasted up to 10 days (see ESM). In addition to baseline exacerbations, post-index moderate exacerbations were identified that occurred prior to a severe exacerbation or, in the case where no severe exacerbation occurred, occurred during the post-index period. Hospitalizations during the post-index period for which COPD was recorded as a secondary diagnosis were also captured.

Use of other COPD treatments in the baseline period was summarized. Post-index use of short-acting beta-agonist (SABA) bronchodilators as a potential time-varying confounder was included. SABAs are used to relieve acute symptoms and have been associated with more severe disease and/or poorly controlled disease [27].

Comorbidities were assessed using two commonly used classification systems: Centers for Medicare and Medicaid Services (CMS) Chronic Condition Data Warehouse (CCW) morbidity definitions [28], and morbidities summarized by Elixhauser and colleagues [29, 30]. Additional baseline diagnoses, symptoms, and procedures were also identified (see Tables 1, 2). Patients with medium or high complexity of COPD were identified using criteria developed by Mapel et al. [31].

Table 1 Baseline demographics and comorbidities, treatment at index
Table 2 Baseline COPD characteristics and related utilization, treatment at index

2.1 Statistical Analyses

Analyses compared risk for a severe COPD exacerbation. In the ITT and AP analyses, Cox proportional hazard models were used to estimate a hazard ratio (HR) and 95 % confidence interval (CI) for severe exacerbation risk (measured as days without a severe exacerbation). Survival analyses started at day 30 since subjects could not have a severe exacerbation event within 30 days post-index. In the ITT approach, outcomes were attributed to subjects’ index treatment and subjects were censored at (1) 24 months, or (2) loss to follow-up, whichever was earlier. In the AP analysis, individuals were additionally censored if they discontinued their index treatment. The Kaplan–Meier estimator was used to estimate unadjusted survival functions within the ITT and AP analyses.
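For readers unfamiliar with the Kaplan–Meier estimator mentioned above, a minimal sketch follows. It is a generic product-limit calculation, not the study's SAS code; the function name and input layout are hypothetical, and times are assumed to be measured from the start of the survival analysis.

```python
def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct time with at least one event.

    times: follow-up time for each subject.
    events: 1 if a severe exacerbation occurred at that time, 0 if censored.
    """
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(set(times)):
        # Events and total departures (events plus censorings) at time t.
        d = sum(1 for ti, ei in zip(times, events) if ti == t and ei == 1)
        leaving = sum(1 for ti in times if ti == t)
        if d > 0:
            survival *= 1.0 - d / at_risk
            curve.append((t, survival))
        at_risk -= leaving
    return curve
```

Subjects censored at a given time are counted as at risk for events at that same time, which is the usual convention.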

In the MSM analysis, the post-index period was partitioned into short intervals, similar to a clinical trial. Claims data were evaluated at the beginning of each time period, starting with the index date, enabling incorporation of time-varying factors post-index into analyses. The hazard ratio for severe exacerbation risk was approximated using a pooled logistic model that included baseline covariates and used generalized estimating equations (GEE) to adjust for the repeated, and therefore correlated, observations for each study subject across the time periods [32, 33]. The 24 months post-index was divided into seven periods ending at 12, 18, 26, 38, 52, 88, and 104 weeks.
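A pooled logistic model of this kind is fitted to person-period data, with one record per subject per interval entered. The sketch below shows one way such records could be built from the cut points listed above. The function and field names are hypothetical, and the rule that follow-up beyond week 104 is administratively censored is an assumption of this illustration.

```python
PERIOD_END_WEEKS = [12, 18, 26, 38, 52, 88, 104]  # cut points as listed in the text

def person_periods(followup_weeks, had_event):
    """Expand one subject's follow-up into one row per period entered.

    followup_weeks: weeks from index to event or censoring.
    had_event: True if follow-up ended with a severe exacerbation.
    The event flag is 1 only in the period in which follow-up ended with an event.
    """
    rows = []
    start = 0
    for period, end in enumerate(PERIOD_END_WEEKS, start=1):
        if followup_weeks <= start:
            break  # subject left follow-up before this period began
        last = followup_weeks <= end
        rows.append({
            "period": period,
            "start_week": start,
            "end_week": end,
            "event": int(last and had_event),
        })
        if last:
            break
        start = end
    return rows
```

A subject with an event at week 30 would contribute four rows, with the event recorded in period 4 (weeks 26 to 38).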

In the MSM, observations were weighted to adjust for confounding due to treatment and censoring. Specific details about how the weights were calculated can be found in the ESM, and are briefly summarized here. For each observation, a treatment weight and a censoring weight were estimated that considered time-varying covariates [9, 34, 35]. The treatment and censoring weights were multiplied to arrive at the overall observation weight. The treatment weight was the inverse of the probability of receiving the treatment (IPTW) actually received at the beginning of each period, starting with the index date. Treatment probability models after the index date included baseline information and post-index information from the prior period (time-varying covariates). The censoring weight was estimated by looking forward and was the inverse of the probability of remaining uncensored (IPCW) for each time period. Individuals were allowed to switch treatment groups and were censored at (1) 24 months, (2) loss to follow-up, or (3) discontinuation of use of any of the index treatments (subjects could switch but had to use one of the index treatments). Censoring probability models included baseline information and post-index information for the current period. Since the treatment and censoring weights were the inverse of a probability estimate, very small probability values resulted in extremely large weights. Therefore, the IPTW and IPCW values were stabilized by a probability estimated from baseline covariates [33, 36]. Since the IPTW estimate at index only included baseline covariates, it was stabilized using the unadjusted probability for treatment at index [35].
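The weight construction can be sketched as follows. This is a minimal illustration that assumes the conventional cumulative-product form of stabilized weights, in which each period's contribution is multiplied into a running product; the exact specification used in this study is given in the ESM and may differ. The probabilities are taken as inputs (in practice they would come from fitted logistic models), and all names are hypothetical.

```python
def stabilized_weights(num_treat, den_treat, num_uncens, den_uncens):
    """Cumulative stabilized weights for one subject across periods.

    Each argument is a list with one probability per period:
      num_treat, den_treat   : probability of the treatment actually received,
                               from the baseline-only (stabilizing) model and
                               from the baseline plus time-varying model.
      num_uncens, den_uncens : probability of remaining uncensored, from the
                               same two kinds of model.
    """
    weights = []
    running = 1.0
    for nt, dt, nc, dc in zip(num_treat, den_treat, num_uncens, den_uncens):
        sw_treat = nt / dt  # stabilized treatment weight for this period
        sw_cens = nc / dc   # stabilized censoring weight for this period
        running *= sw_treat * sw_cens
        weights.append(running)
    return weights
```

When the time-varying covariates add no information, the numerator and denominator probabilities coincide and every weight equals 1, which is why stabilized weights are expected to average close to 1.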

All main analyses were pre-specified. SAS (Version 9.2) was used for statistical analyses. Analyses were two-tailed with a p value of <0.05 to determine statistical significance. Significant univariate differences between groups were determined using a Chi-square test for frequencies and Student’s t test for continuous variables. Adjusted odds ratios (OR) and 95 % CIs were estimated from logistic regression results.

3 Results

A total of 5475 subjects met inclusion criteria (see ESM). Of these, 9 % (n = 484) were using triple therapy at index and 91 % (n = 4991) either LAMA (35 %) or ICS/LABA (56 %). Tables 1 and 2 present baseline characteristics present in more than 5 % of either treatment group.

3.1 Treatment Changes

All subjects were managed care health plan members for at least 6 months post-index. In tracking medication use, less than optimum medication use was allowed (twice the days’ supply), but despite this, by 6 months post-index only 46 % (n = 2547) of subjects had persistently used study medications, with 88 % (n = 2238) of persistent users still using their same index medication and 12 % (n = 309) having switched.

Among those using triple therapy at index, 47 % (n = 226) were also using triple therapy at 6 months, 18 % (n = 86) had stepped down to only ICS/LABA or LAMA, and 35 % (n = 172) had discontinued use of both ICS/LABA and LAMA. In contrast, among those using ICS/LABA or LAMA at index, 41 % (n = 2067) were still using either therapy at 6 months (1.0 % [n = 51] had switched treatments within the ICS/LABA or LAMA group), 3.4 % (n = 169) were using triple therapy, and 55 % (n = 2755) had discontinued use of both. Table 3 summarizes subjects discontinuing use of any index medications in each time period, subjects disenrolling from the health plan, and among those not discontinuing or disenrolling, those having a severe exacerbation in the time period. Overall, discontinuation affected a smaller percentage of those using triple therapy, and for the MSM model, this was true for each of the seven time periods.

Table 3 Study subjects by follow-up time period

3.2 Weighted Analyses for Treatment and Remaining Uncensored

In MSMs, a key point is that the overall mean of the stabilized weights for each time period should approximate 1.0, with smaller weight value ranges preferred [35, 37]. Extreme values or means other than 1.0 may be indicative of misspecified weight models or of subjects extremely unlikely to have received one of the study medications [35]. Mean weight values were approximately 1.0 for all periods in this analysis and ranges did not contain extreme values (see ESM).
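A check of this kind can be sketched in a few lines. The tolerance on the mean and the cutoff for an 'extreme' weight below are hypothetical illustration values, not thresholds used in this study, and the function name is likewise an assumption.

```python
from statistics import mean

def weight_summary(weights_by_period, mean_tol=0.1, max_allowed=10.0):
    """Summarize stabilized weights per period and flag possible problems.

    weights_by_period: dict mapping period number to a list of weights.
    mean_tol: hypothetical tolerance for the mean's distance from 1.0.
    max_allowed: hypothetical cutoff above which a weight is called extreme.
    """
    summary = {}
    for period, w in sorted(weights_by_period.items()):
        m = mean(w)
        summary[period] = {
            "mean": m,
            "min": min(w),
            "max": max(w),
            # Flag a mean far from 1.0 or any weight above the cutoff.
            "flag": abs(m - 1.0) > mean_tol or max(w) > max_allowed,
        }
    return summary
```

A flagged period would prompt a review of the corresponding treatment or censoring model rather than an automatic correction.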

Given that the MSM observation weights include baseline covariates in both the estimation of the inverse probabilities for treatment and censoring and the stabilizing probabilities, the time-varying covariates are the influential components of the stabilized weights. Table 4 provides OR estimates for time-varying covariates included in the MSM regression models for periods 2–7 of the treatment weight models and for all periods of the censoring weight models. Baseline covariates were not incorporated in the treatment stabilizing probability for period 1, so baseline covariates relating to exacerbations and SABA use are also presented. For these, at index, only having a severe exacerbation in the 2 weeks prior to the index date was significantly associated with triple therapy use at index. In the follow-up period, triple therapy at index was the strongest factor related to later use of triple therapy (p < 0.0001), but prior period SABA use and moderate exacerbation experience were also found to be significantly associated with use of triple therapy through the follow-up period (p = 0.01 and p = 0.006, respectively). As the follow-up periods progressed, the odds of using triple therapy increased (p = 0.0001). Triple therapy at index was also strongly related to remaining uncensored in the first 12-week period (p < 0.0001), but less so in the subsequent periods (p = 0.09). And as the follow-up periods progressed, the odds of remaining uncensored decreased (p < 0.0001). Current SABA use (p < 0.0001), moderate exacerbations (p = 0.01), and hospitalizations with secondary COPD-related diagnoses (p = 0.01) were all significantly associated with remaining uncensored in the first 12-week period, but moderate exacerbation experience was not in later periods. Hospitalizations that were not severe exacerbations were positively associated with remaining uncensored in the first 12 weeks, but then had a negative association in later periods.

Table 4 Odds ratios for triple therapy treatment and remaining on one of the study treatments, influential covariates, marginal structural model analysis

3.3 Severe Exacerbation Risk

Figure 1 shows the unadjusted survival functions by index treatment for the ITT and AP analyses. In the ITT analysis, 855 (15.6 % of 5475 subjects) had a severe exacerbation event in the first 2 years, and in the AP analysis, 411 did (7.5 % of 5475 subjects). After adjusting for baseline characteristics, the estimated HR for triple therapy at index for severe exacerbation risk in the ITT analysis was 1.24 (95 % CI 1.00–1.53). In the AP analysis it was 1.00 (95 % CI 0.73–1.36) (Fig. 2). Baseline characteristics significantly associated with greater severe exacerbation risk in both analyses were similar (data not shown); among them, pneumonia (p < 0.05), higher COPD complexity (p ≤ 0.02), any SAMA or SABA/SAMA use (both p < 0.02), and oxygen use (p ≤ 0.01). Greater than 40 % baseline SABA use was significant for ITT only (ITT, p < 0.001; AP, p = 0.07).

Fig. 1 Unadjusted survival functions by index treatment, intent to treat (a) and as protocol (b). ICS inhaled corticosteroid, LABA long-acting beta-agonist, LAMA long-acting muscarinic antagonist

Fig. 2 Adjusted risk estimates for a severe exacerbation after treatment initiation. (Asterisk) risk estimates are the point estimate with 95 % confidence interval

In the MSM analyses, the HR for patients using triple therapy was still elevated, but there was not a significantly higher risk. Adjusting for all baseline covariates included in the ITT and AP models, the estimated HR for severe exacerbation was 1.11 (95 % CI 0.68–1.81). Baseline characteristics significantly associated with greater severe exacerbation risk were similar to those for the ITT and AP analyses; and, as with the AP analysis, >40 % baseline SABA use did not reach significance (p = 0.07).

4 Discussion

We designed this study to examine the potential benefits of an MSM analysis approach in contrast with traditional models when comparing the effectiveness of COPD treatments using retrospective claims data. In this analysis of study methods, we compared use of long-acting therapies and their effectiveness in reducing severe exacerbations, currently a topic of great interest in COPD management. The ITT approach suggested that persons initiating triple therapy had significantly higher risk of subsequent severe COPD exacerbations. However, the AP and the MSM analyses found that by accounting for adherence to therapy during the follow-up period, there was no significant increased risk. The MSM analysis had some advantages over the AP method, including a larger number of patients in the study and the ability to examine time-varying clinical factors occurring after initiation of COPD therapy that may affect treatment choices or outcomes. Patients with chronic disease are frequently not adherent to their prescribed controller therapies. This analysis demonstrates that it is important to consider treatment changes and clinical factors that affect treatment and outcomes when conducting comparative effectiveness studies.

We found that the MSM approach provides useful information on time-varying confounders in effectiveness studies. We adjusted for time-varying confounders (i.e., moderate exacerbation events, use of rescue medications (SABA), and inpatient stays with a secondary COPD-related diagnosis) by incorporating their effects through the stabilized weights in the MSM analysis. We used separate models for the first 12 weeks post-index and the time periods following that up until 2 years post-index. Two of the time-varying confounding factors also appeared to be time-modified in the censoring models [8]. That is, the effect of moderate exacerbations and hospitalizations with a secondary diagnosis of COPD changed between the first 12 weeks and later time periods. This is an aspect that bears further investigation in future research studies of COPD treatments and should be considered in other chronic disease treatment studies.

In our study, the MSM risk estimate was midway between the AP estimate of 1.00 and the ITT estimate of 1.24. The review by Suarez and colleagues [38] found that in 40 % of exposure–outcome associations (measured as OR or coefficient of linear regression), the MSM estimate differed by at least 20 % from the conventional estimate. It has been suggested that the full power of the MSM approach is best appreciated when there are numerous time-varying covariates [39], when there is evidence of strong confounding by time-dependent covariates, and past treatment has a sizable effect on the covariates [40]. In our study there was substantial discontinuation of treatment. While there was some switching from triple therapy use to ICS/LABA or LAMA use, there was minimal switching from ICS/LABA or LAMA to triple therapy. An MSM approach may offer greater explanatory benefit when there is more movement between treatment groups than was present in this study. Although it would increase the complexity of the analysis and require a larger sample size than ours, a multinomial treatment probability model that would allow for more than two treatment choices and that included no long-acting treatment as an option may provide additional information [41]. Discontinuation of treatment was treated as censoring in our model, but reasons for discontinuation that may not be apparent in the claims data include patient perceived lack of need for medication or ineffectiveness of medication, adverse events from medications or COPD not captured in claims data, and adverse events from comorbid conditions not included in the study design [18, 25, 42].

4.1 Comparison with Previous Studies

From our literature review, MSMs had not been utilized in COPD treatment effectiveness studies, and a minority of published MSM studies concern chronic disease treatments [38, 43]. It has been speculated that advantages for COPD patients may be gained from long-acting triple therapy since ICS/LABA combination therapies are known to have anti-inflammatory properties, and the LAMA therapy, tiotropium, had been shown to reduce exacerbations in the absence of any anti-inflammatory activity [44].

One RCT has been conducted comparing triple therapy to LAMA (tiotropium) use, finding a reduced but not significant OR for triple therapy (0.85, 95 % CI 0.52–1.38) using an ITT analysis during a 52-week study period [26]. Similar to our study, discontinuation of therapy was differential between treatment arms: 47 % for tiotropium, and 26 % for triple therapy subjects.

Effectiveness estimates from our study were contrary to two retrospective observational studies that found better outcomes to be associated with triple therapy use [3, 4]. In those studies, no information was provided on the degree to which patients discontinued or switched medications post-index. The second retrospective analysis used a US population and was somewhat similar in sample size to our study (852 triple therapy, 2481 LAMA) [3]. Analyses were ITT, and patients treated with LAMA who later switched to triple therapy were excluded from the study. Study subjects had at least one COPD-related exacerbation event and at least one claim for a SAMA medication in the 12-month pre-index period, and post-index had at least two LAMA claims. Despite these additional requirements, the population was similar to ours in that triple therapy patients were more likely to be younger and to have had a baseline period inpatient stay for COPD. However, baseline indicators of more unstable disease (rescue inhaler use) were all lower among triple therapy patients compared with the non-triple (LAMA) group [3]—associations that were reversed for our study. For this study, and for ours, the treatment group with the higher prevalence of baseline adverse indicators also had the higher risk for post-index severe exacerbations, despite adjustment for these characteristics, suggesting that residual and unmeasured confounding may have been present in both studies.

We hypothesized that the MSM model would highlight that the experience of moderate exacerbations and related events (other hospitalizations and SABA use) were associated with post-index treatment decisions. Our results from the treatment and censoring weight calculations for the MSM analysis support that hypothesis. In the first 12 weeks post-index, subjects who had higher probability of remaining uncensored (of continuing to use one of the study COPD LABD treatments) were those who were having moderate exacerbations, using SABA medication, and/or having hospitalizations for which COPD was a secondary diagnosis. Our analyses serve to highlight that issues related to poor treatment adherence and exacerbation experience have substantial impact in observational COPD studies.

4.2 Limitations of the Study

Limitations of this study include the potential for unmeasured confounders. Additional information not available in our dataset may have improved treatment probability models, including provider specialty, spirometry measurements to assess disease severity, patient-reported symptoms, economic status of patients, and information about patient prescription copayments.

Subjects were censored in our study when there was no evidence of persistence to any of the study medications given an allowable gap period. There is the possibility that the outpatient pharmacy database was incomplete and that patients were filling prescriptions outside of the managed care system. However, we have no evidence to suggest that if this occurred it would be differential between the treatment groups being compared. Finally, there may be misclassification due to the exacerbation measures used for severe and moderate exacerbations. Because this was a retrospective analysis and we did not utilize medical chart information, we cannot verify whether events were in fact events indicative of worsening symptoms related to COPD. However, definitions and criteria for these events were the same as have been used for other COPD treatment effectiveness studies, allowing comparison across studies.

5 Conclusions

Few MSM studies have been conducted for chronic disease treatment effectiveness and none for COPD [38, 43]. This study provided a comparison of three different retrospective observational study design analysis approaches. ITT analyses demonstrated outcomes for the initial treatment groups as assigned. Estimates from MSMs are meant to reflect the effect of full adherence, that is, of patients as treated [33]; however, discontinuation of all therapies was prevalent in this study sample. The ITT and AP analyses showed dramatic risk estimate differences, demonstrating the potential for the existence of underlying differences between the two groups as treated. The MSM analysis helped to emphasize how salient events associated with unstable disease (e.g., moderate exacerbations and SABA use) are associated with use of triple therapy and treatment persistence.

It has been stated that “exacerbations are heterogeneous events occurring in a heterogeneous disease” [45]. Understanding how patient characteristics cluster and how clusters relate to both treatment propensity and outcomes may help to improve disease management for patients with COPD. The MSM approach could be a useful tool for identifying the relationship between those clusters and treatment and outcomes.

The full advantages of the MSM approach may not have been illustrated in this study due to minimal switching from the non-triple to the triple therapy group. Nonetheless, this study highlighted the importance of understanding post-index events such as treatment switching and discontinuation that may confound assessment of chronic disease treatment effectiveness. Based on this COPD effectiveness study, we find that MSMs may be a useful and informative complementary analysis to include in studies of treatment effectiveness in chronic disease where time-varying confounding is present, and switching between treatments is common.