Introduction

Randomised, controlled trials (RCTs) are considered the gold standard for evaluating clinical interventions because of their ability to minimise bias [1]. Blinding is an important technique to reduce bias in clinical trials [2, 3]. However, it is not always possible to conduct such trials; therefore, alternative study designs are often used.

In this review, we discuss the design of RCTs of inhaled therapies in patients with chronic obstructive pulmonary disease (COPD), with particular regard to the use of blinding and open-label techniques. Data from studies of two once-daily bronchodilators used in COPD, indacaterol and tiotropium, will be used to evaluate the potential for bias in open-label studies.

Blinding in clinical trials

Administering treatment in a blinded manner is complicated when comparing medications delivered via inhalers of different shapes and sizes [4]. In such cases, it may be necessary to employ more complex methods, such as double-dummy designs. Further difficulties may arise due to the specific properties of some products (e.g. a particular taste or sensation), which may make their identity apparent during administration. The presence of logos on branded products may also present problems if it is not possible to obtain matching placebos with identical branding or to obtain the active drug in an unbranded form. If these difficulties cannot be overcome, it may be necessary to use an open-label design [5]. While double-blind trials are considered the gold standard for evaluating new products, open-label studies may have some advantages, such as simpler design, lower cost and closer reflection of everyday clinical practice [2, 6, 7].

In the design of open-label studies, great care is needed to minimise possible sources of bias, which can arise in several ways [3]. For example, patients who know they are receiving an active drug, as opposed to knowing they are on placebo, may report more favourable outcomes because they expect a benefit. The patient’s previous experience of a drug may also affect their reporting of subjective efficacy endpoints or adverse effects. Observers may also be biased by awareness of treatment allocation, which might affect reporting of treatment responses or adverse events. Knowledge of treatment assignment could also affect decisions about remaining on treatment or receiving concomitant medications or other therapy, as well as decisions about inclusion of results in an analysis.

Minimising bias in open-label studies

Strategies to minimise the likelihood of bias in open-label studies include randomising patients after collection of baseline data [6] and use of crossover study designs [2]. In a crossover design trial, each patient is randomised to a sequence that includes each treatment; hence, each patient can act as their own control for treatment comparisons [2]. However, crossover designs can be limited by the potential for carry-over effects from the previous treatment and administration procedure [2]. The design of crossover studies therefore needs to incorporate appropriate wash-out and familiarisation periods. Other means of minimising bias in open-label studies include basing efficacy outcomes on objective variables [6]. Spirometry is considered the most objective, standardised and reproducible measure of airflow limitation [8], and is therefore unlikely to be subject to bias. Additionally, many clinical trials use a centralised spirometry organisation to further reduce the possibility of variability in the observation. Centralised spirometry may include provision of the same spirometry equipment to all study sites, consistent training of staff who make spirometry assessments and calibration of equipment daily.
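The crossover allocation described above can be sketched in a few lines. This is a minimal illustration only, assuming a simple two-treatment (AB/BA) crossover with permuted-block randomisation; the function name `allocate_crossover`, the treatment labels and the seed are hypothetical and are not taken from any of the studies discussed.

```python
import random


def allocate_crossover(patient_ids, treatments=("A", "B"), seed=0):
    """Assign each patient a treatment sequence (AB or BA) in balanced blocks.

    Hypothetical helper for illustration; real trials use validated
    randomisation systems and also schedule wash-out periods.
    """
    # The two possible sequences of a two-treatment crossover.
    sequences = [tuple(treatments), tuple(reversed(treatments))]
    rng = random.Random(seed)  # seeded so the allocation is reproducible
    allocation = {}
    block_size = len(sequences)
    # Permuted blocks: within each block, every sequence is used once,
    # in a random order, which keeps the sequences balanced overall.
    for start in range(0, len(patient_ids), block_size):
        block = sequences[:]
        rng.shuffle(block)
        for pid, seq in zip(patient_ids[start:start + block_size], block):
            allocation[pid] = seq
    return allocation


if __name__ == "__main__":
    for pid, seq in allocate_crossover(list(range(1, 9)), seed=42).items():
        print(pid, " -> ".join(seq))
```

Because every patient receives both treatments, each within-patient difference serves as its own control; balancing the AB and BA sequences helps to separate treatment effects from period effects, although it does not remove carry-over.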

Designing clinical trials of inhaled medications in COPD

Tiotropium is a commonly used once-daily inhaled bronchodilator for the maintenance treatment of patients with COPD and is a key comparator in evaluation of new bronchodilators. However, the use of tiotropium as a comparator presents several difficulties. Firstly, tiotropium is a hygroscopic powder and cannot be removed from the commercial capsules for repackaging into unmarked capsules. Secondly, the commercially available capsules are marked with a logo that can be seen through a window in the inhaler, making it impossible to completely mask the nature of the treatment, and for legal reasons the logo/text cannot be copied onto placebo capsules. The manufacture of tiotropium is also difficult, particularly as the drug is very unstable in the ambient air.

To overcome the problems associated with blinding, a clinical trial of a treatment administered via a dry powder inhaler (DPI) requires considerable resources and possibly assistance from the manufacturer of the comparator DPI to create blinded commercial and placebo products. Ultimately, an alternative clinical trial design may need to be used.

Experience from the indacaterol clinical trial programme

Three Phase III clinical studies have compared indacaterol with tiotropium. As blinded tiotropium was not available, alternative blinding methods were used (Table 1). The INHANCE study was performed in two stages in an adaptive seamless design [9]. In stage 1, patients were randomised to receive double-blind indacaterol 75, 150, 300, or 600 μg once daily, double-blind formoterol 12 μg twice daily, double-blind placebo, or open-label tiotropium 18 μg once daily for 2 weeks. In stage 2, patients continued treatment with two selected doses of indacaterol (150 or 300 μg, based on 2-week efficacy and safety data), tiotropium, or placebo for 26 weeks, with additional patients recruited and randomised. The INTIME study was a randomised, double-blind, crossover study in which patients received three of the following four treatments, each once-daily for 14 days and each followed by a 14-day washout: double-blind indacaterol 150 or 300 μg, double-blind placebo or third-party blind tiotropium [10]. The blinding of tiotropium treatment was maintained by using a third-party blinding procedure where the study drug was prepared and provided to the patient by persons who were independent of the other clinical trial processes (described in more detail in Table 1). In the 12-week INTENSITY study, a blinded, double-dummy design was used [11]. Following a 2-week run-in, patients were randomised to treatment with indacaterol 150 μg or tiotropium 18 μg, each once-daily for 12 weeks. Patients receiving indacaterol also took a placebo via the inhaler used for tiotropium, and patients receiving tiotropium took a placebo via the inhaler used for indacaterol. Blinding was achieved by specifying that study medications were dispensed by a third party not involved in other aspects of the study. In all three studies, indacaterol was administered via the Breezhaler® DPI device and tiotropium via the Handihaler® DPI device.

Table 1 Description of indacaterol studies using a tiotropium comparator arm*

Patient populations were similar across the three studies, which enrolled patients with moderate-to-severe COPD (forced expiratory volume in 1 second [FEV1] <80% and ≥30% predicted; FEV1/forced vital capacity [FVC] <0.7) [12–14], and a smoking history of ≥10 pack-years (INTIME and INTENSITY [10, 11]) or ≥20 pack-years (INHANCE [9]). The primary endpoint was trough FEV1 (an objective endpoint) in all three studies, with the primary comparisons being indacaterol versus placebo at Week 12 in INHANCE [9] and at Week 2 in INTIME [10], and indacaterol versus tiotropium at Week 12 in INTENSITY [11].

Effect of study blinding on objective measures: lung function

The results for 24-hour post-dose (trough) FEV1 for active treatment versus placebo in the INHANCE and INTIME studies are shown in Table 2. The treatment differences for tiotropium versus placebo were similar across both studies, even though they differed in blinding of the tiotropium arm [9, 10]. Treatment differences for indacaterol versus placebo were also consistent across both studies.

Table 2 Key results from indacaterol studies with a tiotropium comparator arm and double-blind studies with tiotropium as the primary treatment of interest

Table 2 also shows FEV1 data from previous studies in which tiotropium was the primary treatment of interest. It should be noted that there were differences in entry criteria and in definitions of trough FEV1 between studies.

In the INHANCE study, the difference in trough FEV1 between open-label tiotropium and placebo was 140 mL at both Weeks 12 and 26 [9]. These values are similar to those recorded during the previous double-blind studies of tiotropium at 12 weeks (60–148 mL), 26 weeks (100–148 mL) and over the longer term (100–150 mL; Table 2 and Figure 1).
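The comparison in the preceding paragraph amounts to a simple range check, sketched below. The numbers are those quoted in the text (in mL); the dictionary layout and the helper name `within_range` are illustrative assumptions, and the check says nothing about statistical significance or about differences in entry criteria between studies.

```python
# Ranges of tiotropium-placebo differences in trough FEV1 (mL) reported
# in the text for previous double-blind studies.
DOUBLE_BLIND_RANGES_ML = {
    "12 weeks": (60, 148),
    "26 weeks": (100, 148),
    "long term": (100, 150),
}

# Open-label tiotropium versus placebo in INHANCE (Weeks 12 and 26), in mL.
INHANCE_DIFFERENCE_ML = 140


def within_range(value, bounds):
    """Return True if value lies inside the inclusive (low, high) bounds."""
    low, high = bounds
    return low <= value <= high


results = {
    period: within_range(INHANCE_DIFFERENCE_ML, bounds)
    for period, bounds in DOUBLE_BLIND_RANGES_ML.items()
}

if __name__ == "__main__":
    for period, inside in results.items():
        print(f"{period}: 140 mL within double-blind range? {inside}")
```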

Figure 1

Differences in trough forced expiratory volume in 1 second (FEV1) between tiotropium and placebo in INHANCE [9] and studies in which tiotropium was the primary treatment of investigation. Values are means and standard errors. *Includes assessments made at 11, 12 and 13 weeks; Includes assessments made at 25 weeks; Estimated from a figure (no standard error available); #No standard error available.

These findings suggest that there was no bias against tiotropium for FEV1 in the open-label INHANCE study. The 140 mL difference between tiotropium and placebo in the indacaterol studies was greater than the differences seen at 12 and 26 weeks in the majority of the double-blind studies of tiotropium, raising the possibility that the bias, if any, was favouring the performance of tiotropium (Figure 1). However, the variation in FEV1 even between the previous double-blind studies may partly reflect differences in enrolment criteria, such as COPD severity or patient characteristics.

In the INTENSITY study, which did not include a placebo arm, trough FEV1 at Week 12 was similar with indacaterol (144 mL) and tiotropium (143 mL), and non-inferiority was demonstrated [11]. Mean changes from baseline in trough FEV1 were also similar between indacaterol (130 mL) and tiotropium (120 mL) in that study.

Effect of study blinding on subjective measures: health-related quality of life

As well as FEV1, many regulatory authorities recommend that clinical trials in COPD include an endpoint that reflects clinical benefit of treatment using a validated measure, such as the St. George’s Respiratory Questionnaire (SGRQ) [26, 27].

The SGRQ is used to assess health status in patients with chronic respiratory disease and was included as a measure in the INHANCE and INTENSITY studies. The questionnaire, which is completed by the patient, comprises 50 questions in three domains: symptoms, activity (limitations) and impacts (of disease). Responses are scored to give a total between 0 (best) and 100 (worst). The minimal clinically important difference (MCID) for SGRQ score that is generally accepted as indicating an improvement over placebo or from baseline is −4 units [28].
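As a minimal sketch of how a responder analysis based on this MCID can be computed, the example below counts patients whose SGRQ total score fell by at least 4 units from baseline and also reports the mean change. The patient scores are invented for illustration and the function name `sgrq_responder_rate` is hypothetical; this is not the analysis method of any of the cited studies.

```python
SGRQ_MCID = -4.0  # units; a lower score is better, so improvement is negative


def sgrq_responder_rate(baseline, follow_up, mcid=SGRQ_MCID):
    """Return (percentage of patients achieving the MCID, mean change).

    baseline and follow_up are equal-length lists of SGRQ total scores
    (0 = best, 100 = worst) for the same patients.
    """
    if len(baseline) != len(follow_up) or not baseline:
        raise ValueError("need two non-empty lists of equal length")
    changes = [after - before for before, after in zip(baseline, follow_up)]
    # A patient is a responder if the score fell by at least 4 units.
    responders = sum(1 for change in changes if change <= mcid)
    percentage = 100.0 * responders / len(changes)
    mean_change = sum(changes) / len(changes)
    return percentage, mean_change


if __name__ == "__main__":
    # Invented scores for four hypothetical patients.
    baseline = [50.0, 40.0, 60.0, 45.0]
    week_12 = [44.0, 39.0, 56.0, 47.0]
    pct, mean_change = sgrq_responder_rate(baseline, week_12)
    print(f"{pct:.1f}% achieved the MCID; mean change {mean_change:+.2f} units")
```

In this invented example half of the patients achieve the MCID although the mean change (−2.25 units) does not reach −4 units, which shows that responder proportions and mean improvements describe different aspects of the same data.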

After 12 weeks of treatment, the MCID for change in SGRQ from baseline was achieved by 44.9% of tiotropium-treated patients (with a mean improvement of −1.1 units) in the INHANCE study (open-label tiotropium) and 42.5% of patients (with a mean improvement of −3.0 units) in the INTENSITY study (blinded tiotropium) [11, 29] (Table 2). These values are slightly lower than those recorded at 12 weeks in studies where tiotropium was the primary treatment of interest (Figure 2).

Figure 2

Percentages of tiotropium-treated patients achieving the minimum clinically important difference (MCID) for St. George’s Respiratory Questionnaire (SGRQ) score in INHANCE [29], INTENSITY [11] and other studies in which tiotropium was the primary treatment of investigation.

At 26 weeks, 47.3% of tiotropium-treated patients in INHANCE achieved the MCID for change in SGRQ (with a mean improvement of −1.0 units) compared with a range of 49–60% in published tiotropium studies. Overall, the percentages of patients treated with open-label tiotropium in INHANCE who achieved the MCID for change in SGRQ at 12 and 26 weeks were within the range of values recorded across all of the double-blind tiotropium studies over durations of 12 weeks to 4 years (Table 2 and Figure 2). Again, it should be noted that, even in double-blind studies of tiotropium, there was variability in the proportion of patients who achieved the MCID for SGRQ (and in the mean improvement scores: −2.7 to −6.5).

These findings indicate that minimal bias was introduced with the open-label design of the tiotropium comparator arm in the INHANCE study.

Dyspnoea – transition dyspnoea index (TDI)

Dyspnoea, or breathlessness, is the major limiting symptom for COPD patients and is often measured using the TDI [30], a tool recommended by regulatory authorities for inclusion in clinical trials of treatments for COPD [26].

The TDI is a multidimensional instrument that measures change from the baseline dyspnoea index (BDI) over time [31]. The BDI considers three components (functional impairment, magnitude of task and magnitude of effort), each rated from 0 (severe dyspnoea) to 4 (no dyspnoea). The TDI measures changes from baseline in each domain of the BDI on a scale of +3 (major improvement) to −3 (major deterioration) and has been shown to be valid, reliable and responsive [31, 32]. The MCID for the TDI is an improvement from the BDI score of ≥1 unit [33].
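The scoring just described can be sketched as follows. The sketch assumes that the overall TDI score is the sum of the three domain ratings (so that it ranges from −9 to +9), which is not stated explicitly in the text above; the function names are hypothetical.

```python
TDI_MCID = 1  # improvement of at least 1 unit relative to the BDI


def tdi_total(functional_impairment, magnitude_of_task, magnitude_of_effort):
    """Sum the three TDI domain ratings, each from -3 (major deterioration)
    to +3 (major improvement). Summation is an assumption of this sketch."""
    components = (functional_impairment, magnitude_of_task, magnitude_of_effort)
    for rating in components:
        if not -3 <= rating <= 3:
            raise ValueError("each TDI domain rating must lie between -3 and +3")
    return sum(components)


def achieves_tdi_mcid(total, mcid=TDI_MCID):
    """Return True if the total TDI score meets or exceeds the MCID."""
    return total >= mcid


if __name__ == "__main__":
    total = tdi_total(1, 0, 1)  # ratings for one hypothetical patient
    print(total, achieves_tdi_mcid(total))
```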

The proportion of patients achieving an improvement in TDI equal to or greater than the MCID was assessed in the INHANCE and INTENSITY trials of indacaterol (Table 2). In the INHANCE study (open-label tiotropium), the MCID for TDI was achieved by 55.0% of tiotropium-treated patients at 12 weeks (with a mean improvement of 0.75 points) and 57.3% at 26 weeks (with a mean improvement of 0.87 points) [9], while in the INTENSITY study (blinded tiotropium), 50.1% of tiotropium-treated patients achieved the MCID at 12 weeks (with a mean improvement of 1.43 points) [11]. These percentages are slightly higher than those recorded in previous studies of tiotropium [21, 24] (in which mean improvements were 0.8–1.28 points), again indicating that the open-label design of INHANCE did not result in a bias against tiotropium (Table 2 and Figure 3).

Figure 3

Percentages of tiotropium-treated patients achieving the minimum clinically important difference (MCID) for transition dyspnoea index (TDI) in INHANCE [9], INTENSITY [11] and other studies in which tiotropium was the primary treatment of investigation. *Includes assessments made at 13 weeks; Includes assessments made at 25 weeks; Precise value is not given (values stated as range of 42–47% across all timepoints).

Adverse events

In the INHANCE study, the overall incidence of adverse events and the most commonly occurring adverse events were similar across the treatment groups (indacaterol 150 or 300 μg, tiotropium, or placebo) [9]. Respiratory tract infections and respiratory events were the most common types of adverse events, serious events, and events leading to withdrawal of study treatment. In the INTIME study, the overall incidence of adverse events was similar across all treatments (indacaterol 150 or 300 μg, tiotropium or placebo), and these were predominantly mild or moderate in severity [10]. The most frequent adverse events were cough, COPD exacerbation, and nasopharyngitis. In the INTENSITY study, adverse events were reported for similar proportions of patients in the two treatment groups (indacaterol and tiotropium), with the most common events generally reflecting the typical disease characteristics of COPD [11]. These results are consistent with the adverse events observed in previous double-blind trials of tiotropium compared with placebo [17, 18, 22, 24, 25].

Additional evidence on the effect of study blinding on treatment effect

It is worth noting that the third-party blinding procedure used in the INTIME study required administration of the study drug to the patient by a blinded individual; the resulting enforced high compliance may bias results relative to what might be seen in a more traditional clinical trial. In addition, the staffing demands of third-party blinding may limit such studies to short durations.

Previous reports on the effects of blinding in RCTs have been conflicting. Several studies examining the effects of differing blinding techniques in RCTs from a range of therapy areas have found that open-label studies tend to exaggerate the benefits of treatment when they are included in meta-analyses [34–37]. For example, Schulz et al. found that in open-label RCTs, odds ratios were exaggerated by 17% [34]. Other studies of RCT blinding have found that an open-label trial design is not associated with treatment bias [38–40]. Moreover, investigations have shown that the adequacy of randomisation has a greater influence on treatment bias than blinding [39].

Observational studies are considered inherently biased as they are both open-label and non-randomised [41, 42]. However, a comparison of treatment bias in observational studies versus RCTs found that in 17 out of 19 analyses, the estimates of treatment effects from observational studies were similar to those from RCTs; in only two of the 19 analyses did the combined magnitude of the treatment effect in the observational studies lie outside the 95% confidence interval for the combined magnitude in the RCTs [43]. Data from epidemiological studies, while very different in design and quality from an RCT, indicate that objective outcomes are less affected by bias associated with an open-label study design than subjective outcomes [44]. Thus, even under conditions where bias is expected, an open-label study design need not introduce bias, particularly if assessments are based on objective measurements.

Conclusions

Double-blind RCTs remain the gold standard for evaluating interventions in chronic diseases such as COPD. However, many drugs in COPD, particularly bronchodilators, have an acute effect on both subjective and objective outcomes; therefore, full blinding cannot be guaranteed. If an open-label design is chosen, the potential for bias must be taken into account and the selection of outcome measures is of pivotal importance. Measures such as airway function are considered objective and less prone to bias, and are therefore recommended as primary endpoints. Despite possible limitations associated with functional endpoints such as FEV1 in COPD, these variables have proven reliable, are subject to little or no placebo effect, and have produced consistent results that are largely independent of study population, treatment duration and differences in study design and blinding.

This low likelihood of bias with objective measures of airway function is also supported by the fact that spirometry has rigorous standards for performance in clinical trials; any incomplete effort that can introduce bias is detected by the technician/central agency and not included. Thus, selection of an objective outcome measure such as FEV1 in an open-label study in COPD is unlikely to introduce relevant bias if technical requirements are met and established guidelines are followed [45].

While this holds true for FEV1 and, potentially, other objective endpoints in COPD, the inclusion of more subjective, patient-reported outcomes may pose a larger challenge. However, data from studies comparing indacaterol with tiotropium demonstrate that the effects of tiotropium on subjective measures (SGRQ; TDI) vary only slightly across blinded and non-blinded studies, indicating that minimal bias was introduced by using open-label tiotropium. It is likely that the appropriate use of randomisation in the comparative studies was an important factor in minimising potential bias. Nevertheless, the extent to which instruments such as symptom indices or health status questionnaires are affected by blinding issues in the absence of a placebo control is as yet poorly characterised or unknown. Minimum clinically important differences for these instruments have been established using well-controlled, blinded trials against a placebo comparator, which hampers the clinical interpretation of data generated in open-label studies. As more open-label data on these outcomes become available in the future, this may help to clarify the clinical significance of treatment-associated changes observed in the open-label studies discussed above.

In conclusion, when reporting a clinical trial, it is important to be transparent about who exactly was blinded and how. If an open-label design is necessary, additional efforts should be made to minimise the risk of bias. If these recommendations are followed, and the data are considered in the full knowledge of any potential sources of bias, results with tiotropium suggest that data from open-label studies can provide valuable and credible evidence of the effects of a therapy.