Background

Fibromyalgia is surrounded by controversy regarding its aetiology and its status as a valid disease entity. Genetic and neurobiological evidence now exists to support differences between fibromyalgia patients and controls [1]. Candidate biomarkers identifying susceptible individuals or indicating disease activity are emerging, [2] along with a better understanding of outcomes in clinical trials [3].

Fibromyalgia is characterised by widespread pain for longer than three months with pain on palpation at 11 or more of 18 specified tender points [4]. Sleep disturbance, depression, and fatigue often complicate the clinical picture [5]. Fibromyalgia is common, occurring in 1-2% of the population, more often in women than men, [6-8] and often with profound impact on activities of daily living and productivity [9, 10].

It is increasingly recognised that medicines typically provide a good response in half or fewer of patients treated [11, 12]. This is true in acute pain, [13] neuropathic pain, [14-16] migraine, [17] and osteoarthritis [18, 19].

Here we present an analysis of the efficacy of pregabalin in fibromyalgia using individual patient data from four randomised, double blind, placebo controlled trials (RCTs). With this analysis we aimed to identify which outcomes were appropriate for a responder analysis based on the Initiative on Methods, Measurement, and Pain Assessment in Clinical Trials (IMMPACT) consensus statement on interpreting changes in chronic pain clinical trial outcomes [20]. This suggested that for pain, a minimally important improvement was a 10-20% decrease in pain intensity, a moderately important improvement a decrease of 30% or more, and a substantial improvement a decrease of 50% or more. It also suggested that responses in Patient Global Impression of Change of minimally improved, much improved, and very much improved would also constitute minimally important, moderately important, and substantial improvements.

IMMPACT defined response in dimensions other than pain, including physical and emotional functioning, as well as global rating of improvements. In theory, any measurement on any scale could be used for a responder analysis, with a wide range of possibilities of what constitutes a responder. The use of change from baseline, with several different levels of response, should allow an assessment of the utility of both the scale, and the level of response. Utility can be assessed by the occurrence of statistically or clinically significant differences between active therapy and placebo for a particular scale, especially if there appears to be a dose response. The absence of a significant difference between an effective therapy and placebo at all levels of response would be an indication that that particular scale lacks utility for measuring response in a particular circumstance.

The particular circumstance of fibromyalgia is interesting because many different measurements are made using different scales, allowing different scales and levels of response to be examined.

Methods

Pfizer Inc provided Excel files containing individual patient data from four multi-centre clinical phase 2/3 or phase 3 RCTs of pregabalin (Lyrica) in the treatment of fibromyalgia that were conducted in the USA and other countries and were completed by July 2008 (trials 105, [21] 1056, [22] 1077, [23] 1100 [24]). Pfizer Inc also provided PDF files of the corresponding company clinical trial reports. A trial of enriched enrolment randomised withdrawal design ("FREEDOM trial", 1059 [25]) was not included in our analysis because it was fundamentally different [26].

Trial patients were at least 18 years old. Women were not pregnant or lactating, and were either postmenopausal, surgically sterilised, or using contraception. Important exclusion criteria were: severe pain due to other conditions, rheumatic diseases other than fibromyalgia, active infections, untreated endocrine disorders, severe depression, active malignancy, being immunocompromised, other severe acute or chronic medical or psychiatric conditions, or laboratory abnormalities. Trial patients had to fulfil American College of Rheumatology (ACR) criteria for fibromyalgia and have pain scores of ≥ 40 mm on the 100 mm visual analogue scale (VAS) after stopping any relevant pain or sleep medication. Patients were randomised to receive pregabalin (150 mg, 300 mg, 450 mg, or 600 mg per day), or placebo, predominantly with a 2-week dose escalation phase followed by fixed dosing for up to 14 weeks of total trial duration.

We calculated the proportion of patients achieving reductions in pain scores of any improvement (≥ 0%), ≥ 15%, ≥ 30%, ≥ 50%, and ≥ 70% compared to baseline pain scores between weeks 1-12. Sleep improvement was calculated in an analogous manner from weekly averages of sleep quality scores. Improvements in end of trial outcomes (Hospital Anxiety and Depression Scale [HADS], Fibromyalgia Impact Questionnaire [FIQ], Short Form 36 [SF-36] domains, Multidimensional Assessment of Fatigue [MAF] global index, Patient Global Impression of Change [PGIC], Medical Outcomes Study [MOS] Sleep Disturbance, and MOS 9-item Sleep Problem Index), were calculated by comparing data at the trial endpoints with baseline data and calculating the percentage improvement with the individual baseline score set as 100%. We chose levels of improvement for non-pain outcomes also at the above-mentioned cut-points in order to allow ready comparison with pain as an outcome, although it has to be kept in mind that those cut-points do not necessarily have the same clinical relevance for non-pain outcomes as they do for pain (where they have been validated).

The following two rules were applied to the data set to handle missing data.

  • For patients who did not drop out, only actual measured values were used for calculations. Last observation carried forward was not used except where no other data were available (for end of trial outcomes in trial 105 and for HADS outcomes from all trials).

  • From discontinuation day forward patients were assigned 0% improvement.

A responder is then defined as any patient who achieves at least the predefined level of change. For example, a patient with exactly 50% pain relief and a patient with 57% pain relief would both be counted as responders at the 50% level.
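The responder definition and the discontinuation rule can be illustrated with a minimal Python sketch. It is not the spreadsheet used for the analysis; the function names are hypothetical, lower scores are assumed to be better (as for pain intensity), and the lowest level is treated as strictly greater than 0% so that patients assigned 0% after discontinuation are not counted as having 'any improvement'. That last point is an assumption of this sketch, not something stated in the text.

```python
from typing import Dict, Optional, Sequence

# Response levels (%) taken from the text; level 0 stands for 'any improvement'.
THRESHOLDS = (0, 15, 30, 50, 70)


def percent_improvement(baseline: float, follow_up: Optional[float],
                        discontinued: bool) -> float:
    """Percentage improvement with the individual baseline score set as 100%.

    Patients who have discontinued are assigned 0% improvement. Treating a
    missing follow-up value or a non-positive baseline as 0% is an assumption
    of this sketch.
    """
    if discontinued or follow_up is None or baseline <= 0:
        return 0.0
    return 100.0 * (baseline - follow_up) / baseline


def is_responder(improvement: float, threshold: float) -> bool:
    """True if the improvement reaches the given response level."""
    if threshold == 0:
        # Assumption: 'any improvement' means strictly more than 0%.
        return improvement > 0
    return improvement >= threshold


def responder_levels(improvement: float,
                     thresholds: Sequence[float] = THRESHOLDS) -> Dict[float, bool]:
    """Responder status at every response level for one patient."""
    return {t: is_responder(improvement, t) for t in thresholds}


# A patient whose pain score falls from 70 mm to 35 mm has exactly 50% relief.
print(responder_levels(percent_improvement(70, 35, discontinued=False)))
```

Under these assumptions a patient with exactly 50% relief is a responder at the 0%, 15%, 30%, and 50% levels but not at the 70% level, and a patient who has discontinued is a non-responder at every level.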

Trial quality was assessed using the Oxford Quality Scale [27]. Validity was scored using the Oxford Pain Validity Scale [28]. The minimum requirement for inclusion in this responder analysis was that trials had to be both randomised and double blind.

Calculations of responder rates and numbers needed to treat (NNT) were performed independently of Pfizer using a spreadsheet consultancy (Spreadsheet Factory -- http://www.spreadsheet-factory.com) run by one of the authors (Jocelyn Paine). Response data were pooled and used in an intention-to-treat analysis including all randomised patients who received at least one dose of trial drug. We calculated the number and percentage of responders for each level of response (≥ 0%, ≥ 15%, ≥ 30%, ≥ 50%, and ≥ 70% improvement compared to baseline pain scores), pregabalin dose (300 mg, 450 mg, or 600 mg per day), and time point (per week of trial or at end of trial, as detailed in the figures and tables). NNTs were calculated with 95% confidence intervals by the method of Cook and Sackett, [29] using the pooled number of observations. NNTs were not calculated when statistical significance was not achieved; in this circumstance the NNT (100 divided by the absolute risk difference in percentage points) can approach infinity as the difference approaches zero, with one of the confidence limits being negative. Only data from trials that included a particular pregabalin dose were used for calculations for that dose; only the placebo data from the specific trials which included that specific dose were used in each dosing comparison. The intention was to analyse data only where there were at least 200 patients in at least two trials [30].
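As an illustration of the NNT calculation, the following Python sketch takes the NNT as the reciprocal of the absolute risk difference and obtains its confidence limits by inverting the limits of that difference. The normal-approximation (Wald) interval for the risk difference is an assumption of this sketch, the function name is hypothetical, and the counts in the usage line are invented for illustration; none of this reproduces the spreadsheet used for the analysis.

```python
import math
from typing import Optional, Tuple


def nnt_with_ci(resp_active: int, n_active: int,
                resp_placebo: int, n_placebo: int,
                z: float = 1.96) -> Optional[Tuple[float, float, float]]:
    """Return (NNT, lower limit, upper limit), or None if not significant.

    The risk difference is active minus placebo, as proportions. When its
    interval includes zero the NNT is not reported, mirroring the rule that
    NNTs were calculated only when statistical significance was achieved.
    """
    p_active = resp_active / n_active
    p_placebo = resp_placebo / n_placebo
    diff = p_active - p_placebo
    # Standard error of a difference between two independent proportions.
    se = math.sqrt(p_active * (1 - p_active) / n_active
                   + p_placebo * (1 - p_placebo) / n_placebo)
    diff_low, diff_high = diff - z * se, diff + z * se
    if diff_low <= 0:
        return None
    # The largest difference gives the smallest (best) NNT, and vice versa.
    return 1 / diff, 1 / diff_high, 1 / diff_low


# Invented counts: 40 of 100 responders on active drug, 20 of 100 on placebo.
print(nnt_with_ci(40, 100, 20, 100))
```

With these invented counts the risk difference is 20 percentage points, so the NNT is 5. A comparison of 21 of 100 with 20 of 100 would return None because the interval for the difference includes zero.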

For responder analysis to be useful we hypothesised that it should produce stepped reductions in the percentage of patients responding with increasing level of response, a significant difference between pregabalin and placebo in the number of responders at a particular level, and a trend towards lower (better) NNTs at higher doses of pregabalin. This expectation rests on pregabalin having been shown in randomised trials and meta-analysis to be effective in fibromyalgia, with higher doses being more effective and with more adverse events [31]. Any scale without these features would be unlikely to have any utility for a responder analysis in fibromyalgia.
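Two of the three features in this hypothesis, stepped reductions and the dose trend in NNTs, can be expressed as simple checks. The Python sketch below is illustrative only: the function names are hypothetical, the significance test is omitted, and requiring a strict fall at every step is an assumption about what counts as 'stepped'.

```python
from typing import Mapping


def has_stepped_reduction(rate_by_level: Mapping[float, float]) -> bool:
    """True if the responder rate falls at every step up in response level."""
    rates = [rate_by_level[level] for level in sorted(rate_by_level)]
    return all(lower > higher for lower, higher in zip(rates, rates[1:]))


def nnt_improves_with_dose(nnt_by_dose: Mapping[int, float]) -> bool:
    """True if the NNT does not rise as the daily dose (mg) increases."""
    nnts = [nnt_by_dose[dose] for dose in sorted(nnt_by_dose)]
    return all(low_dose >= high_dose
               for low_dose, high_dose in zip(nnts, nnts[1:]))


# Pain responder rates (%) at 12 weeks with pregabalin 450 mg daily,
# as reported in the Results for the 30%, 50%, and 70% levels.
pain_450mg_week12 = {30: 38.0, 50: 21.0, 70: 8.5}
print(has_stepped_reduction(pain_450mg_week12))
```

A scale whose responder rates barely change between the lowest and highest levels would still pass the strict check if each step fell slightly, so in practice the size of the differential matters as well as its direction.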

Results

Patient and trial characteristics

In the four trials 2757 patients aged between 18 and 82 years were treated with pregabalin or placebo. More than 90% were women. One trial lasted 8 weeks (trial 105); the others lasted 13 or 14 weeks. All trials were of high quality and validity, scoring 5/5 on the Oxford Quality Scale and 16/16 on the Oxford Pain Validity Scale. Pregabalin doses of 300 mg (685 patients) and 450 mg (687 patients) were used in all four trials, 600 mg (564 patients) was used in three, and 150 mg (132 patients) in one; placebo was given to 689 patients. We used doses of 300 mg, 450 mg, and 600 mg in our pooled analysis.

Weekly pain response rates

Data for weekly pain response with pregabalin 450 mg daily are shown in Figure 1. Additional file 1 compares the weekly pain response with pregabalin 300-600 mg daily and placebo. Numerical data for six and 12 weeks are presented in Table 1. Over time the number of patients reporting 'any improvement' fell and the number reporting the higher response levels of at least 50% or at least 70% improvement increased, demonstrating that change in recorded pain intensity was a sensitive indicator for a responder analysis. This was apparent for placebo and all pregabalin doses, especially over the first six weeks. At 6 weeks the proportion with at least 50% pain relief, a substantial improvement, reached a steady state. After 12 weeks 38% of those treated with 450 mg pregabalin had a moderate response or better, 21% a substantial response, and 8.5% an extensive response.

Table 1 Pain and sleep responses at different response levels and doses of pregabalin
Figure 1

Weekly pain response levels compared to baseline. For patients treated with pregabalin 450 mg daily.

The corresponding NNTs (Table 1, Additional file 2) generally increased over time for all response levels. At 12 weeks, 11 people need to be treated with pregabalin 450 mg daily rather than with placebo for one of them to achieve a moderate benefit of at least 30% pain relief.

Weekly sleep response rates

Figure 2 and Additional files 3 and 4 illustrate the percentages of patients achieving the indicated response levels for sleep improvement over time and the corresponding NNTs. The results for sleep response were similar to pain relief, demonstrating that change in sleep was a sensitive indicator for a responder analysis. After 12 weeks with 450 mg pregabalin daily 40% had ≥ 30% improvement, 26% had ≥ 50% improvement, and 10% had ≥ 70% improvement (Table 1).

Figure 2

Weekly sleep response levels compared to baseline. For patients treated with pregabalin 450 mg daily.

The corresponding NNTs (Table 1, Additional file 4) generally increased over time for all response levels. At 12 weeks, 7 people need to be treated with pregabalin 450 mg daily rather than with placebo for one of them to achieve a moderate benefit of at least 30% reduction in sleep interference.

Patient Global Impression of Change

Figure 3 shows the proportion of patients achieving a PGIC rating of very much improved, at least much improved, or at least some improvement at end of study. For the higher hurdles of improvement (much and very much improved), pregabalin was more effective than placebo and a dose response was apparent, although 600 mg daily produced slightly lower levels of improvement than 450 mg. Using 'any improvement' as a measure of efficacy, no consistent and convincing benefit of pregabalin over placebo was apparent. This demonstrates that Patient Global Impression of Change was a sensitive indicator for a responder analysis. NNTs and actual values are shown in Table 2; best sensitivity was shown with 450 mg and the cumulative outcome of much and very much improved.

Table 2 PGIC responses at end of study
Figure 3

Patient Global Impression of Change. The proportion of patients achieving a rating of at least some improvement, at least much improved, or very much improved.

Other outcomes

Additional file 5 shows responder analyses for a number of other outcomes, including the MAF global fatigue index, FIQ, and HADS depression and anxiety scores, as well as individual domains of the general health status measure SF-36.

Most of these demonstrated sensitivity, in that the proportion of responders fell with increasing levels of response, though this was less marked with some of the individual domains of SF-36, particularly physical and emotional role limitations, social functioning, bodily pain, and vitality. For these the differential between lowest and highest levels of response was not large. Sensitivity to detect an effect of pregabalin treatment, defined as a statistically significant difference from placebo that allowed an NNT to be calculated, was apparent for MOS Sleep Disturbance, MOS Sleep Problems Index, and SF-36 general health perception, bodily pain, and vitality.

Discussion

Analyses presented here involved 2757 patients with ACR-defined fibromyalgia investigated in high quality randomised double blind trials lasting eight to 14 weeks. This represents the largest body of evidence available in fibromyalgia, more than double the number of patients investigated in three trials of duloxetine, [16] and four times that with amitriptyline [32]. Moreover, analyses involved a large number of different measures at five different levels of efficacy.

The principal findings were that simple outcomes like pain, sleep, and PGIC were amenable to responder analysis. They demonstrated stepped reductions in value with increasing level of response, showed a significant difference between pregabalin and placebo, and showed a trend towards lower (better) NNTs at higher doses of pregabalin. With our approach (responder analysis based on percentage change from baseline) this was not generally the case with less simple outcomes, including fatigue, Fibromyalgia Impact Questionnaire scores, anxiety, depression, and most domains of SF-36, apart perhaps from vitality. Therefore, responder analysis as performed here is probably not suitable for most of the outcome measures identified in fibromyalgia clinical trials [3].

A minority of patients experience substantial or moderate benefit, though always significantly more than with placebo, whichever IMMPACT definition of benefit is used. Similar levels of response have been seen for duloxetine, amitriptyline, and tramadol/paracetamol in fibromyalgia, [16, 32, 33] and in osteoarthritis [19].

Weekly analyses for changes in pain intensity and sleep interference demonstrated that maximum benefits for moderate (≥ 30%), substantial (≥ 50%), or extensive (≥ 70%) response occurred at four to six weeks, and thereafter remained reasonably constant. By contrast, response rates for any benefit (≥ 0%) and minimal benefit (≥ 15%) dropped over 12 weeks. Those with a useful response for pain and sleep tend to continue with the treatment; those not achieving moderate or substantial improvement after 4-6 weeks are unlikely to do so later and may be better served by alternative therapies. Pregabalin seemed equally effective at treating pain and sleep disturbance in fibromyalgia, though it is not clear if these improvements occurred in the same patients.

NNTs for reduction in pain intensity and sleep interference calculated at different levels of response at weekly intervals increased with time for all three doses of pregabalin. An increase in NNTs over time has been seen before in arthritis [19]. It may represent either increasing discontinuation rates over time, perhaps because of adverse events with active therapy, or patients who had previously achieved a response at a given level now experiencing a decrease in their magnitude of improvement to below the level in question, or some combination of these. Discontinuations can be different between therapies, with more adverse event discontinuations with active therapy, and more lack of efficacy discontinuations with placebo, and these may have different timescales [34].

Changing NNTs over time are an important finding with implications for efficacy comparisons between drugs. Drugs tested in shorter duration trials (six weeks or less) are likely to appear more effective than the same drug in longer duration trials (eight weeks or more). Four of 10 randomised trials of amitriptyline in fibromyalgia were of six weeks or less, [32] though those of duloxetine were of 12 weeks duration, [16] as was that of a tramadol/paracetamol combination [33].

For the PGIC rating at the end of the trial, higher levels of improvement showed pregabalin to be progressively less effective, at least when NNTs were considered. This illustrates the problem with using 'any improvement' as an outcome, as has been the case in many neuropathic pain studies in the past. Use of 'any improvement' as an outcome overestimated efficacy compared with more substantial levels of improvement.

Table 2 shows that PGIC response rates for 'improvement' decreased at 600 mg pregabalin compared with the 450 mg dose. Perhaps 450 mg is the optimal treatment dose for fibromyalgia (as PGIC takes therapeutic efficacy and adverse events into account). However, it has to be kept in mind that the dose of 450 mg pregabalin was used in all four trials (687 patients) while 600 mg pregabalin was used in only three of them (564 patients). Some inter-trial variability may therefore potentially also play a role.

The strengths of our analysis are that we analysed a large number of individual patient data in a clinically important chronic pain condition, using validated instruments for measuring clinically important trial outcomes, based on large, modern, rigorous, and methodologically sound trials. Our approach is limited in that we have analysed individual patient data for the drug treatment of fibromyalgia for only one agent (pregabalin). More individual patient analyses with other treatments for fibromyalgia are needed to confirm that our findings are generalisable. Finally, any work on fibromyalgia as it is presently defined is limited because 'fibromyalgia' is probably a heterogeneous group of clinical entities with multifaceted patterns of pain. These patterns are driven by complex neural pathways and mechanisms that are not clearly correlated with different pain patterns, are likely to differ between individuals, and are further complicated by co-morbid conditions and increased age. Chronic pain is associated with functional, structural, and chemical changes in the brain, including loss of gray matter [35]. Individual variability in physiological response to analgesic drugs may be genetic, as for NSAIDs, [36] opioids, [37] and more generally, [38] and indeed varies in extent between different conditions, as with pregabalin in peripheral neuropathic pain, central neuropathic pain, and fibromyalgia [39]. Ongoing genetic, neurobiological, and biomarker work in fibromyalgia [1, 2] may one day help to classify patients more appropriately and allow targeted treatment.

Conclusions

Quite large differences in response levels between individuals with fibromyalgia are to be expected, and were found in this analysis, where responder rates with pregabalin were higher than with placebo. Responder analysis in fibromyalgia looks promising. However, responder analysis in the form that we have undertaken in this paper (using percentage change from baseline) is appropriate only for certain outcomes (such as pain and sleep) and not for others; it is informative where it works but not universally applicable. The full potential and limitations of responder analysis will be realised only when more data can be analysed and compared.