Background

Clinical trials of antimuscarinic drugs in overactive bladder (OAB) have noted marked responses in patients treated with placebo. While this led some to question the usefulness of treatment interventions [1, 2], more recent reviews have confirmed the clinical benefit of antimuscarinics in OAB [3, 4]. However, both of these meta-analyses also emphasized the modest difference in outcome measures between active and placebo arms. Nabi et al [3] calculated that 41% of subjects allocated to placebo arms reported symptomatic improvement or cure, compared with approximately 56% of patients allocated to active treatment. Chapple et al [4] also noted considerable variation in placebo rates between trials.

Despite these observations, the placebo response in drug trials for OAB has not been well characterized. One paper reported on study-level data from registration trials for drugs treating lower urinary tract symptoms [5], and another reported on patient-level data from four pooled studies in stress urinary incontinence [6]. Factors associated with a high placebo response included patient and disease characteristics (e.g. less severe disease), amount of prior and concomitant treatment (e.g. pelvic floor training), and types of endpoints (subjective rather than objective). Nonspecific factors associated with trial participation, such as increased awareness of voiding habits and interactions with clinical trial staff, were also considered relevant.

The purpose of this analysis was to characterize the placebo response in antimuscarinic drug trials for OAB, based on changes in three commonly used efficacy endpoints: 1) number of micturitions per day, 2) number of incontinence episodes per day and 3) mean voided volume per micturition. Several statistical methods were used, including a meta-analysis to obtain a more precise estimate of the placebo effect based on pooled results from various studies.

Methods

Data Sources and Study Selection

The search strategy for the selection of randomized clinical trials for the meta-analysis of placebo response is summarized in Figure 1. We selected all placebo-controlled trials included in a recent comprehensive meta-analysis of antimuscarinic treatments for OAB [7]. We also identified trials published in the 12 months up to July 2008 that were not included in that meta-analysis [7], through a systematic review of the literature using all major literature databases for publications reporting randomized, double-blinded, placebo-controlled trials in individuals diagnosed with OAB. Data collection, selection, extraction and recording were in accordance with Cochrane Reviews Guidelines [8]. Searches included combinations of the terms overactive bladder, urge urinary incontinence, randomized, double-blind, placebo-controlled, and names of common antimuscarinic medications. In addition, searches included the abstracts from several major conferences. Seven drugs were included in this review (darifenacin, fesoterodine, oxybutynin, propiverine, solifenacin, tolterodine, and trospium).

Figure 1

Search Strategy for Selection of Randomized Clinical Trials for Meta-Analysis.

For inclusion in this analysis, publications had to meet the following eligibility criteria: 1) the study was a double-blind randomized placebo-controlled trial of an antimuscarinic medication in patients with OAB; 2) the total number of patients assigned to placebo was reported; and 3) the study reported one or more of the following endpoints: number of incontinence episodes per day, number of micturitions per day, and/or volume voided per micturition. Additional information was requested through direct contact with study authors for abstracts that did not report some of the above details, or for studies where endpoint data were presented as medians rather than means. Because of different assumptions about data distribution, only studies reporting mean data were included.

Study Procedure (Data Extraction)

After the publications were obtained, two of the authors (SL and PG) independently determined the eligibility of each publication by applying the above criteria. If data were reported in more than one publication, only the primary publication was included in this analysis. Study characteristics (year of publication, patient numbers, ages, duration of treatment, diary details), baseline, endpoint, and change-from-baseline values for the above endpoints, associated estimates of variability (if reported), and statistical power were extracted by one author and entered into Microsoft Excel. Values from studies that reported incontinence episodes per week were converted to episodes per day by dividing by seven.
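The weekly-to-daily conversion described above is a simple linear rescaling. A minimal sketch follows; the function name and the example value are hypothetical and are not taken from any included study. Under the same rescaling a standard deviation would also be divided by seven, although the text does not state how variability estimates were handled.

```python
DAYS_PER_WEEK = 7


def weekly_to_daily(episodes_per_week: float) -> float:
    """Convert a mean number of episodes per week to a mean per day."""
    return episodes_per_week / DAYS_PER_WEEK


# Hypothetical study reporting a mean of 21 incontinence episodes per week.
print(weekly_to_daily(21.0))
```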

Statistical Methods

Summary statistics were calculated for all variables. The relationship between baseline and change in endpoints, and study size versus year of publication, was assessed using linear regression. The probability of success (achieving statistical significance) as a function of the size of the placebo arm was estimated using logistic regression. The relationship between year of publication and baseline and change in endpoints was assessed using Spearman rank correlation. A meta-analysis was conducted to pool the results of change from baseline data for each of the 3 endpoints using Comprehensive Meta-analysis software version 2.0. For each endpoint, change from baseline was summarized as non-standardized (weighted) means using inverse variance weighting. Point estimates and 95% confidence intervals (CIs) were computed with both random effects (DerSimonian and Laird method) and fixed effects models. The null hypothesis of homogeneity of response across studies was tested with the Cochran Q statistic. If the null hypothesis was rejected, the point estimates and 95% CIs from the random effects model were presented; otherwise, those from the fixed effects model were presented. An ANOVA model was used to assess the possibility that the magnitude of placebo change might influence the probability of study success, with magnitude of change in the placebo arms as the dependent variable and study, success (statistically significant separation between active and placebo arms) and power of the study (80% or 90%) as independent variables. Statistical testing was carried out at the 5% level of significance (two-sided tests).
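As a rough illustration of the pooling step, the following Python sketch computes inverse-variance weighted fixed effects and DerSimonian and Laird random effects estimates, together with the Cochran Q statistic. It is not the Comprehensive Meta-analysis implementation used in this analysis: the function name `pool_mean_changes` and the input values are hypothetical, the intervals assume a normal approximation (1.96 standard errors), and the p-value for Q is omitted because it requires a chi-square distribution function.

```python
import math


def pool_mean_changes(means, ses):
    """Pool study-level mean changes given their standard errors.

    Returns fixed and random effects results as (estimate, ci_low, ci_high),
    plus Cochran's Q, its degrees of freedom, and the between-study
    variance tau2 (DerSimonian and Laird moment estimator).
    """
    weights = [1.0 / se ** 2 for se in ses]            # inverse-variance weights
    sum_w = sum(weights)
    fixed = sum(w * m for w, m in zip(weights, means)) / sum_w

    # Cochran's Q: weighted squared deviations from the fixed effects estimate
    q = sum(w * (m - fixed) ** 2 for w, m in zip(weights, means))
    df = len(means) - 1

    # DerSimonian and Laird estimate of between-study variance, floored at zero
    c = sum_w - sum(w ** 2 for w in weights) / sum_w
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0

    re_weights = [1.0 / (se ** 2 + tau2) for se in ses]
    sum_re = sum(re_weights)
    random_est = sum(w * m for w, m in zip(re_weights, means)) / sum_re

    def with_ci(estimate, total_weight):
        half_width = 1.96 * math.sqrt(1.0 / total_weight)
        return (estimate, estimate - half_width, estimate + half_width)

    return {
        "fixed": with_ci(fixed, sum_w),
        "random": with_ci(random_est, sum_re),
        "Q": q,
        "df": df,
        "tau2": tau2,
    }


# Hypothetical placebo-arm changes (episodes/day) and standard errors.
result = pool_mean_changes([-1.0, -1.5, -0.8], [0.10, 0.20, 0.15])
print(result)
```

When the studies disagree by more than their standard errors would predict, tau2 becomes positive, the random effects weights become more equal across studies, and the random effects interval widens relative to the fixed effects interval.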

Results

Trial characteristics

Thirty-four publications that included a placebo arm were identified in the Chapple meta-analysis [7]. The systematic literature review over the most recent 12 months identified 2 additional studies [9, 10]. In total, 36 studies met the acceptance criteria, and these are listed in Table 1 [9–44]. The most commonly published OAB trials were for tolterodine (n = 15), oxybutynin (n = 8), propiverine (n = 5) and solifenacin (n = 5) (note: several studies included more than one active arm). The mean age of patients enrolled in the placebo arms was 58.9 years. All studies included adult subjects but four studies specifically targeted elderly subjects [12, 19, 21, 30]. Median study duration was 12 weeks (range 2–12 weeks). The mean number of patients in the placebo arms of the trials was 164 (range 13–508).

Table 1 Results for placebo treatments in the studies included in the meta-analysis.

Size of placebo arms tended to increase in more recent studies (r = 0.52; Figure 2A). There were positive associations between the probability of studies reporting statistically successful outcomes and the size of the placebo arm for all endpoints (Figure 2B). This was statistically significant for incontinence episodes (p = 0.03), but not for micturitions/day (p = 0.17) or mean voided volume (p = 0.58). It was not possible to assess the influence of some variables on placebo response, either because they were too homogeneous (e.g. study duration was 12 weeks in 22/36 studies), or because they were not consistently reported (e.g. diary duration was reported in only half of the studies).

Figure 2

Relationship of the size of the placebo arm with the year of publication (Panel A) and the probability of successful study outcome (Panel B) for three commonly used endpoints.

Incontinence episodes/day

Baseline mean (SD) incontinence episodes/day in subjects randomized to placebo were 3.16 (1.00). At study endpoint, mean (SD) incontinence episodes/day were reduced by 1.16 (0.46). The change in incontinence episodes/day was highly associated with baseline values (r = 0.69; Figure 3A). Although there was no relationship between baseline incontinence episodes and year of publication (r = -0.03), there was a modest positive relationship between change in incontinence episodes and year of publication (r = 0.39, p = 0.10). There was a negative correlation between study size and change in incontinence episodes/day (r = -0.25).

Figure 3

Incontinence episodes/day. Panel A: Relationship between baseline and change scores. Panel B: Funnel plot from meta-analysis.

Point estimates (95% CI) from the meta-analysis of change from baseline data were -1.09 (-1.17, -1.02) using a fixed effects model, and -1.15 (-1.34, -0.96) using a random effects model. The forest plot is shown in Figure 4, upper panel. Both results were highly statistically significant (p < 0.0001). Considerable heterogeneity in the data set was suggested by the high Q-value (85.2, df = 16, p < 0.0001), indicating that a random effects model was a more appropriate analytical approach. The high degree of data heterogeneity was also evident on the funnel plot (Figure 3B). Analysis of the relationship between the magnitude of placebo response and successful study outcome and power showed no statistically significant association (p = 0.80 and 0.97, respectively).
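To put the Q-value on a more interpretable scale, the I² statistic (the proportion of total variation attributable to between-study heterogeneity) can be derived from Q and its degrees of freedom as (Q - df)/Q. I² is not reported in this analysis; the sketch below is only an illustrative calculation from the Q and df values given above, and the function name is hypothetical.

```python
def i_squared(q: float, df: int) -> float:
    """Return I^2 as a proportion in [0, 1), computed as (Q - df) / Q."""
    if q <= 0:
        return 0.0
    return max(0.0, (q - df) / q)


# Q and df reported above for incontinence episodes/day.
print(round(100 * i_squared(85.2, 16), 1))  # about 81 (percent)
```

The same calculation can be applied to the Q-values reported for the other two endpoints.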

Figure 4

Forest plots from the meta-analysis of commonly used endpoints in OAB trials.

Mean micturitions per day

Baseline mean (SD) micturitions/day in subjects randomized to placebo were 11.8 (0.9). At study endpoint, mean (SD) micturitions/day were reduced by 1.4 (0.7). The change in mean micturitions/day was highly associated with baseline values (r = 0.62; Figure 5A). There was a modest positive relationship between baseline micturitions/day and year of publication (r = 0.34, p = 0.09); however, there was no relationship between change in micturitions/day and year of publication (r = 0.02). There was a modest negative correlation between study size and change in micturitions/day (r = -0.21).

Figure 5

Mean micturitions/day. Panel A: Relationship between baseline and change scores. Panel B: Funnel plot from meta-analysis.

Point estimates (95% CI) from the meta-analysis of change from baseline data were -1.29 (-1.38, -1.12) using a fixed effects model, and -1.27 (-1.51, -1.03) using a random effects model. The forest plot is shown in Figure 4, center panel. Both results were highly statistically significant (p < 0.0001). Considerable heterogeneity in the data set was suggested by the high Q-value (107.0, df = 17, p < 0.0001), indicating that a random effects model was a more appropriate analytical approach. The high degree of data heterogeneity was also evident on the funnel plot (Figure 5B). Analysis of the relationship between the magnitude of placebo response and successful study outcome and power showed no statistically significant association (p = 0.93 and 0.21, respectively).

Mean voided volume

Baseline mean (SD) voided volume per micturition was 163.1 (42.9) mL. After placebo treatment, mean voided volume was increased by 12.5 (5.9) mL. There was no relationship between baseline and change in mean voided volume (r = 0.06; Figure 6A). There was a modest negative relationship between baseline mean voided volume and year of publication (r = -0.23, p = 0.32); however, there was no relationship between change in mean voided volume and year of publication (r = 0.06). There was no relationship between study size and change in mean voided volume (r = -0.07).

Figure 6

Mean voided volume/day. Panel A: Relationship between baseline and change scores. Panel B: Funnel plot from meta-analysis.

Point estimates (95% CI) from the meta-analysis of change from baseline data were 18.6 (18.3, 18.9) using a fixed effects model, and 12.4 (9.3, 15.5) using a random effects model. The forest plot is shown in Figure 4, lower panel. Both results were highly statistically significant (p < 0.0001). Considerable heterogeneity in the data set was suggested by the high Q-value (91.0, df = 14, p < 0.0001), indicating that a random effects model was a more appropriate analytical approach. The high degree of data heterogeneity was also evident on the funnel plot (Figure 6B). Analysis of the relationship between the magnitude of placebo response and successful study outcome and power showed no statistically significant association (p = 0.26 and 0.50, respectively).

Interrelationship of endpoints

Changes in all 3 endpoints were correlated with one another and showed moderate levels of association. Change in incontinence episodes was positively associated with change in micturitions (r = 0.49). As would be expected based on unchanged daily urine output, the change in mean voided volume was negatively associated with change in incontinence episodes (r = -0.38) and change in micturitions (r = -0.61).
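The expected direction of the last two associations follows from simple arithmetic: if total daily urine output is unchanged, fewer micturitions imply a larger volume per void. A minimal sketch of that reasoning, using hypothetical round numbers rather than data from the included studies (the function name is also hypothetical):

```python
def implied_voided_volume(baseline_micturitions: float,
                          baseline_volume_ml: float,
                          change_in_micturitions: float) -> float:
    """Mean volume per void after a change in micturition frequency,
    assuming total daily urine output stays constant."""
    daily_output_ml = baseline_micturitions * baseline_volume_ml
    return daily_output_ml / (baseline_micturitions + change_in_micturitions)


# Hypothetical patient: 12 micturitions/day at 160 mL each (1920 mL/day).
# Two fewer micturitions per day implies 1920 / 10 = 192 mL per void.
print(implied_voided_volume(12, 160, -2))
```

A decrease in micturitions therefore pairs with an increase in voided volume, which is the negative association reported above.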

Discussion

The main findings of this analysis are that for three commonly published endpoints in OAB studies, changes in the placebo arms were substantial and statistically significant. A high degree of heterogeneity was noted for all endpoints. There were significant associations between baseline and change scores for some but not all of the endpoints. More recent studies tended to be larger than earlier studies, and there were positive associations between probability of achieving statistically significant results and size of the placebo arm. The magnitude of changes in placebo arms did not appear to influence the likelihood of a study showing a statistically significant difference between active treatment and placebo.

This analysis confirms earlier observations that the placebo response in OAB trials is substantial [3, 4]. In a meta-analysis of placebo responses across different disorders, drug trials in urogenital disorders had the highest placebo response [45]. This may be an underlying consequence of urological disorders in general, rather than related to type of intervention, as high placebo response rates have also been reported in trials of non-pharmacological management of incontinence. For example, in a trial of pelvic floor muscle training for stress urinary incontinence, a 64% response rate was reported for the sham (placebo) intervention arm [46]. Urological disorders are generally kept private by patients, and a majority of people with urinary incontinence (UI) do not seek help despite poor quality of life (i.e., perceived lack of self-control, limited daily activities for fear of an "accident") [47]. As a result, in routine clinical practice, patients may have little basic knowledge about these urological conditions. However, when patients participate in a clinical research trial for a urologic problem, the gain in knowledge and awareness of their disorder is much greater than it would be for a condition (e.g., cardiovascular or metabolic) that is more openly discussed.

A meta-analysis approach was used in order to obtain a more precise estimate of the placebo effect based on pooled results from various studies. Both fixed and random effects models were tested, to provide an indication of the variability of the results. Because the meta-analysis revealed a high degree of heterogeneity for all three endpoints, a random effects approach was used in this analysis. Egger et al [48] have identified a number of potential causes of heterogeneity. Factors that might contribute to heterogeneity in OAB studies include the nature of the population studied (i.e. the presence of mixed types of incontinence), size of the studies (ranging from 13 to 508 subjects/arm), use of both subjective and objective endpoints, and/or changes in study methodology and types of patients recruited into OAB trials performed over almost 2 decades. We confirm earlier observations that subjective endpoints may be an important contributor to heterogeneity [5, 6]. In OAB trials, a positive correlation of the placebo response with baseline severity was seen for the changes in the endpoints of micturitions and incontinence episodes but was not seen for the mean voided volume.

We were not able to explore the role of other potentially important variables on placebo responses. Because this analysis was performed on study-level data, it is not possible to assess the effects of patient-level characteristics (e.g. age, gender), or other aspects of design (visit frequency or study location) on responses. It is unclear how important study duration is to placebo responses. Because almost all antimuscarinic OAB studies are 12 weeks long, it is not possible to assess the effects of longer or shorter duration on placebo responses. Finally, the type of diary (paper or electronic) and the duration of recall were not reported by a majority of studies. In particular, the length of time over which patients have to recall subjective endpoint data may be important, with longer durations being associated with greater potential for error.

This analysis confirms that the placebo response in antimuscarinic drug trials for OAB is substantial, and demonstrates high levels of heterogeneity. A substantial placebo response has also been noted in treatment trials for other urological disorders, with both drug and non-drug interventions, and may reflect nonspecific effects related to use of a diary, behavioral training, etc., and/or to the use of subjective endpoints. In essence, therefore, the placebo effect seen in these trials is ascribable to all non-drug aspects of the trial, in addition to treatment with placebo.

Two approaches have been attempted to manage the placebo response: (1) to enroll more severely affected patients in more recent trials (as indicated by the positive association between year of publication and baseline symptom severity); and (2) to enroll larger numbers of subjects. Enrolling more severely affected patients appears to be counterproductive, however, as any increase in changes in active treated arms may be offset by larger responses in the placebo arms. Furthermore, our analysis has shown that the probability of study success was unrelated to the magnitude of placebo response in any of the endpoints studied. Using increased sample sizes may have been more effective in ensuring successful study outcome, as demonstrated by the positive associations between these 2 variables (Figure 2B). However, this association appears to be greatest for the subjective endpoints, and more modest for the objective endpoint.

Conclusion

This analysis confirms earlier observations of substantial heterogeneity in placebo responses in antimuscarinic drug trials for OAB. More recent clinical trials have tried to address this by recruiting greater numbers of subjects and/or more severely affected patients; however, only the former approach is associated with increased probability of successful study outcome. Alternative approaches to managing the large and heterogeneous placebo response in future OAB drug trials might be to develop and validate more objective endpoints, to characterize more drug-responsive subpopulations of patients, and/or to explore different trial designs that can reduce population heterogeneity (e.g. relapse prevention).