A systematic review of topical NSAIDs, conducted by this research group in 1996, reported that they were effective in relieving pain in acute conditions like sprains and strains [1]. Number-needed-to-treat (NNT), the number of patients that need to be treated for one to benefit from a particular drug, who would not have benefited from placebo, was used to estimate efficacy, and for all topical NSAIDs pooled together the NNT at one week was 3.9 (95% confidence interval (CI) 3.4 to 4.4). There were differences between individual topical NSAIDs, with indomethacin being no different from placebo, while ketoprofen (NNT 2.6), felbinac (NNT 3.0), ibuprofen (NNT 3.5) and piroxicam (NNT 4.2) were all significantly better than placebo.

There are three reasons why an updated review of topical NSAIDs in acute pain is needed. First, we have a better appreciation of factors that can introduce bias [2–4], and would not now accept trials that were not double blind, or were very small. Second, topical salicylate and benzydamine are no longer classed as topical NSAIDs [5]. Third, there are now more trials. We believed that updating the review would provide more accurate efficacy estimates for topical NSAIDs, with a prior intent to determine efficacy for individual drugs.



Relevant studies were sought regardless of publication language, type, date or status. Studies included in the previous review were examined for inclusion in this updated version, according to our inclusion criteria. The Cochrane Library, MEDLINE and PreMedline, EMBASE and PubMed were used to find relevant studies published since the last review, for the years 1996 to April 2003. Reference lists of retrieved articles were also searched. The search strategy included "application: topical" together with "cream", "gel", etc., together with generic names of NSAIDs, and proprietary preparations of topical treatment in which the principal active ingredient was an NSAID [6, 7] (see Additional file 1: search strategy). Twenty pharmaceutical companies in the UK, 66 in Europe, and two in North America, known to manufacture topical NSAIDs, were sent letters asking if they could supply papers.


We identified reports of randomised, double-blind, active or placebo-controlled trials in which treatments were administered to adult patients with acute pain resulting from any strains, sprains or sports injuries. Excluded conditions were oral, ocular or buccal diseases. Application of treatment had to be at least once daily. At least ten patients had to be randomised to a treatment group. Outcomes closest to seven days were extracted.

Quality and validity assessment

Trial quality was assessed using a validated three-item scale with a maximum quality score of five [8]. Included studies had to score at least two points, one for randomisation and one for blinding. A sixteen-point scale was used to assess trial validity [9].

Data abstraction

Quality and validity assessments were made independently by at least two reviewers. Extracted outcomes were verified by one other reviewer. Disputes were settled by discussion between all reviewers.


We defined our own outcome of clinical success, representing approximately a 50% reduction in pain [1]. This was either the number of patients with a "good" or "excellent" global assessment of treatment, or "none" or "slight" pain on rest or movement (or comparable wording) measured on a categorical scale. A hierarchy of outcomes was used to extract efficacy information, shown below in order of preference:

1) number of patients with a 50% or more reduction in pain

2) patient reported global assessment of treatment

3) pain on movement

4) pain on rest or spontaneous pain

5) physician or investigator global assessment of treatment

In addition, the number of patients showing undefined "improvement" was also accepted.
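The hierarchy above can be sketched as a simple lookup that picks the most preferred outcome a trial reports. This is an illustrative sketch only: the outcome labels and the function name are hypothetical, and placing undefined "improvement" last in the order is an assumption, since the text does not state its rank.

```python
from typing import Optional

# Preference order taken from the numbered list above.
# The string labels are hypothetical names chosen for this sketch.
OUTCOME_HIERARCHY = [
    "pain_reduction_50pct",   # 1) 50% or more reduction in pain
    "patient_global",         # 2) patient reported global assessment
    "pain_on_movement",       # 3) pain on movement
    "pain_at_rest",           # 4) pain on rest or spontaneous pain
    "physician_global",       # 5) physician or investigator global assessment
    "undefined_improvement",  # assumption: accepted only as a last resort
]

def choose_outcome(reported: dict) -> Optional[str]:
    """Return the most preferred outcome present in `reported`, or None."""
    for name in OUTCOME_HIERARCHY:
        if name in reported:
            return name
    return None

# Hypothetical trial reporting two outcomes as (responders, patients) pairs.
trial = {"physician_global": (30, 50), "pain_on_movement": (28, 50)}
selected = choose_outcome(trial)
print(selected)
```

In this example the function selects pain on movement, because it ranks above the physician global assessment.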

Secondary outcomes were extracted from included papers that reported them. These were the number of patients (i) reporting one or more local adverse events (itching, stinging, rash), (ii) reporting one or more systemic adverse events, and (iii) withdrawing from trials due to adverse events.

Quantitative data synthesis

The number of patients randomised into each treatment group (intention to treat) was used in the efficacy analysis. Information was pooled for the number of patients in each trial achieving at least 50% pain relief, or similar measure, for both topical NSAID and control. These were used to calculate NNT with a 95% confidence interval (CI) [10]. Relative benefit and relative risk estimates with 95% CIs were calculated using the fixed effects model [11]. A statistically significant benefit of topical NSAID over control was assumed when the lower limit of the 95% CI of the relative benefit was greater than one. A statistically significant benefit of control over active treatment was assumed when the upper limit of the 95% CI was less than one. Number-needed-to-harm (NNH) and relative risk were calculated for these outcomes in the same way as for NNTs and relative benefit. Homogeneity of trials was assessed visually [12–14]. All calculations were performed using Microsoft Excel X for the Macintosh and RevMan 4.2. In sensitivity analyses the z test was used [15]. QUOROM guidelines were followed [16].
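The calculation described above can be illustrated with a minimal sketch. It assumes the NNT is the reciprocal of the pooled absolute risk reduction, with its confidence limits taken as the reciprocals of the limits of that difference, and it computes relative benefit from the collapsed 2×2 table with a log-scale interval. These are common textbook forms and are assumptions here; the exact methods of references [10] and [11], and the trial weighting applied by RevMan, may differ. The trial counts are hypothetical.

```python
import math

def pooled_estimates(trials, z=1.96):
    """Pool counts across trials and estimate NNT and relative benefit.

    trials: list of (responders_active, n_active, responders_control, n_control).
    Counts are simply summed, which is an assumption of this sketch.
    """
    a = sum(t[0] for t in trials)
    n1 = sum(t[1] for t in trials)
    c = sum(t[2] for t in trials)
    n2 = sum(t[3] for t in trials)
    p1, p2 = a / n1, c / n2

    # Absolute risk reduction and its normal-approximation interval.
    arr = p1 - p2
    se_arr = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    arr_lo, arr_hi = arr - z * se_arr, arr + z * se_arr

    # NNT limits are reciprocals of the ARR limits; the upper limit is
    # unbounded when the ARR interval includes zero.
    nnt = 1 / arr
    nnt_lo = 1 / arr_hi
    nnt_hi = 1 / arr_lo if arr_lo > 0 else math.inf

    # Relative benefit with a confidence interval on the log scale.
    rb = p1 / p2
    se_log_rb = math.sqrt(1 / a - 1 / n1 + 1 / c - 1 / n2)
    rb_lo = math.exp(math.log(rb) - z * se_log_rb)
    rb_hi = math.exp(math.log(rb) + z * se_log_rb)

    return {"nnt": nnt, "nnt_ci": (nnt_lo, nnt_hi),
            "rb": rb, "rb_ci": (rb_lo, rb_hi)}

# Hypothetical trials, not data from this review.
example_trials = [(65, 100, 39, 100), (260, 400, 156, 400), (325, 500, 195, 500)]
result = pooled_estimates(example_trials)
print(result)
```

With these hypothetical counts the lower limit of the relative benefit interval is above one, which is the criterion for a significant benefit of topical NSAID over control stated above.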

Sensitivity analysis

Our prior intention was to perform sensitivity analyses on pooled outcomes using the z test in terms of quality score (2 versus 3 or more), validity score (8 or less versus 9 or more), size (less than 40 patients per group versus 40 or more, the median in a previous meta-analysis [1]), outcome type (higher versus lower preference outcomes), and particular NSAID used. At least three studies had to be available in any of these different contexts before information was pooled.
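The text cites the z test [15] without giving a formula. One plausible form, shown here purely as an assumption, compares the absolute risk reductions of two subgroups, divides their difference by the combined standard error, and converts z to a two-sided p value using the normal distribution. The subgroup counts and function names are hypothetical.

```python
import math

def arr_and_se(responders_active, n_active, responders_control, n_control):
    """Absolute risk reduction and its standard error for one pooled subgroup."""
    p1 = responders_active / n_active
    p2 = responders_control / n_control
    se = math.sqrt(p1 * (1 - p1) / n_active + p2 * (1 - p2) / n_control)
    return p1 - p2, se

def p_from_z(z):
    """Two-sided p value for a standard normal z statistic."""
    return math.erfc(abs(z) / math.sqrt(2))

def z_test(arr_a, se_a, arr_b, se_b):
    """z statistic and two-sided p value for the difference between two ARRs."""
    z = (arr_a - arr_b) / math.sqrt(se_a ** 2 + se_b ** 2)
    return z, p_from_z(z)

# Hypothetical pooled subgroups: smaller trials versus larger trials.
small = arr_and_se(140, 200, 70, 200)
large = arr_and_se(500, 800, 320, 800)
z, p = z_test(*small, *large)
print(round(z, 2), round(p, 3))
```

The z-to-p conversion agrees with the values reported in the results, where z = 3.3 corresponds to p = 0.001.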


Study characteristics

Ten out of the 20 UK companies, and two out of 66 European companies that we contacted replied to our request for studies. However, only three companies supplied us with useful material; either published studies or bibliographies. One company supplied material that was unpublished at the time of writing [17].

We identified 89 potential papers from our searches. Fifty-three were excluded (see Additional file 2: studies excluded from the review, Additional file 3: QUOROM flow diagram). Of the original 89, 64 papers were in the previous review, of which we excluded 30; four placebo and 15 active controlled trials used salicylates, four placebo controlled trials used benzydamine, two were single blind, and one each had inappropriate randomisation, did not state dose or duration of treatment, had no useable data, used mixed pain conditions including some chronic conditions, or had an inappropriate add-on design. There were 25 potential papers not in the previous review. Of these, 23 were excluded because they were not randomised (9), had no useable data (4), were not double blind (3), were experimental (2), or for other reasons (5).

Thirty-six trials met the selection criteria; 34 from the previous review and two new ones. Twenty-four trials [17–40] had only placebo controls, eight [41–48] only active controls, and four [49–52] had both placebo and active controls. Of the 12 active controlled trials, nine compared one topical NSAID with a different topical NSAID, and three [44, 48, 49] used oral NSAID controls. Details of all included studies with outcomes and quality and validity scores are in Additional files 4 (Outcome details of placebo-controlled trials) and 5 (Outcome details of active-controlled trials). Information about patients was limited, though age ranges were given. Patients had mainly sports injuries, soft tissue injuries, or sprains and strains.

Quality scores were high, with 24/28 placebo controlled and 11/12 active controlled trials scoring 3 or more points out of a maximum of 5. Validity scores were also high, with 25/28 placebo controlled and 10/12 active controlled trials scoring 9 or more out of a maximum of 16 (see Additional files 4 and 5).

Placebo controlled trials


Twenty-six trials with information from 2,853 patients were analysed for efficacy. In 19 of the 26 trials topical NSAID was significantly better than placebo, with a lower confidence interval of the relative benefit above 1 in our analysis. Topical NSAIDs as a class were significantly better than placebo, with relative benefit 1.6 (95% CI 1.4 to 1.7) and NNT 3.8 (3.4 to 4.4) (Table 1). Mean response rate with placebo was 39% and varied from 8% to 75% in individual trials. Mean response rate with topical NSAID was 65%, varying from 41% to 100% in individual trials (Figure 1).
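As a rough consistency check, the NNT can be approximated from the mean response rates quoted above. The agreement is only approximate, because the reported NNT is derived from pooled patient counts rather than from rounded mean rates:

```latex
\[
\mathrm{NNT} \approx \frac{1}{0.65 - 0.39} = \frac{1}{0.26} \approx 3.8
\]
```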

Table 1 Summary data and sensitivity analyses for placebo controlled trials
Figure 1

Randomised double-blind studies of topical NSAID compared to topical placebo for one-week outcome of successful treatment. Inset scale shows size of individual trials.

Sensitivity analyses could not be done for higher versus lower quality or validity scores, because there were few studies of lower quality (2/5 on the quality scale) or lower validity (8/16 or less on the validity scale). Analysis limited to only higher quality or higher validity trials, or trials of both higher quality and higher validity, produced no difference in the efficacy measure (Table 1). The median group size for topical NSAID was 41. There was a significantly better (lower) NNT in trials with fewer than 40 patients in each treatment arm than in those with 40 patients or more (z = 3.3, p = 0.001). Outcomes of undefined improvement and physician rated global outcomes gave the same NNT as our preferred outcomes of patient rated global or pain on movement/spontaneous pain.

Efficacy estimates were also made for five individual drugs studied in at least three trials (Table 1). The five topical NSAIDs were all significantly better than placebo, but in the case of indomethacin just so. Ketoprofen had the lowest (best) NNT of 2.6 (2.2 to 3.3). The result for ketoprofen was significantly better than for ibuprofen (z = 2.2, p = 0.03), felbinac (z = 2.1, p = 0.03), piroxicam (z = 3.0, p = 0.003) and indomethacin (z = 4.5, p < 0.00006).


The numbers of patients experiencing one or more local adverse events (4%), one or more systemic adverse events (2.5%), or withdrawing due to an adverse event (0.8%) did not differ significantly between topical NSAIDs and placebo (Table 1). For systemic adverse events, 56 of the 70 events recorded occurred in a single trial [27], and the rate of systemic adverse events excluding this trial was below 1%. Systemic adverse events and adverse event withdrawals did not differ between topical and oral NSAID.

Active controlled trials

Three trials, with 433 patients, compared a topical NSAID with an oral NSAID (indomethacin 75 mg daily in one trial and ibuprofen 1,200 mg daily in two). One trial [44] compared different topical and oral NSAIDs (felbinac foam with oral ibuprofen), while the other two compared the same topical and oral NSAID, ibuprofen in one [48] and indomethacin in the other [49]. Overall rates of treatment success were similar for topical NSAID (57%) and oral NSAID (62%), with no statistically significant difference (relative benefit 0.9; 0.8 to 1.1).

The other nine trials compared one topical preparation with another (see Additional file 5). Only for topical piroxicam 0.5% compared with topical indomethacin 1% were there at least three trials (with 716 patients). Piroxicam was significantly more effective than indomethacin, with improvement in 52% on piroxicam and 39% on indomethacin. The relative benefit was 1.3 (1.1 to 1.5) and the NNT for piroxicam compared with indomethacin was 8 (5 to 20). Local adverse events were less common with piroxicam (2%) than with indomethacin (10%). The relative risk for an adverse event with piroxicam was 0.2 (0.1 to 0.5) compared with indomethacin, and the number needed to prevent one local adverse event was 14 (9 to 26).


The original review [1], and this updated one, concluded that topical analgesics were effective in acute conditions. Despite removing trials of lower quality, and topical agents that are not now regarded as topical NSAIDs, the NNT for all topical NSAIDs compared with placebo for the outcome equivalent to at least half pain relief at seven days was 3.8 (3.4 to 4.4). The previous review gave an NNT of 3.9 (3.4 to 4.4). Three trials comparing topical with oral NSAID found no difference in efficacy.

What are the limitations of this review that might question this demonstration of efficacy? The included trials spanned several decades, and retrospective examination finds fault with them in several respects. Trials were often small. Small size can lead to the influence of chance effects on treatment and placebo event rates [4]. Different preparations were used, with different application schedules, concentrations of active agent, and formulations. Outcomes in the trials were not consistent, and a hierarchy of outcomes had to be constructed. Some clinical heterogeneity was therefore inevitable, even when patients in the trials were similar, with similar conditions, when trial designs included both randomisation and double-blinding, and when the duration of trials was appropriate.

We addressed these limitations with pre-planned sensitivity analyses. Using only studies with higher quality and validity scores, or studies with higher rather than lower preference outcomes made no difference (preferred outcomes were patient rated global or pain, lower preference undefined improvement and physician rated global outcomes). Trial size had an important effect, with smaller trials having a lower (better) NNT. The evidence was that topical NSAIDs were effective whatever strategy was used for sensitivity analysis, improving the robustness of the overall result.

Different NSAIDs had different efficacy, with ketoprofen being significantly better than all others in an indirect comparison, while indomethacin was barely distinguished from placebo. The only direct comparison of topical preparations where there was an adequate amount of information to pool (three trials and 716 patients) was for topical piroxicam compared with topical indomethacin. Topical piroxicam was significantly more effective than topical indomethacin, supporting the indirect comparison.

A possible criticism might be that there has been selective publication of trials showing topical NSAIDs to be effective, and suppression of trials where there was no difference between topical NSAID and placebo. Funnel plots do not reliably detect publication bias [13, 14], so we did not use them or make any adjustment for possible publication bias [53]. We did approach every company in the world that we could identify as being involved with topical NSAID manufacture or sale for any additional unpublished trials. No more unpublished material was identified than in the original review [1]. When unpublished material is found, it often does not change the relevance of a result [54–56].

It is important to emphasise that both active and placebo treatments were rubbed on, making any effect of rubbing equal in both groups. It's not just the rubbing! The average placebo response in the included trials was 39%, compared with the average response of 65% with topical NSAID. The response with placebo is consistent with that found in painful conditions using a variety of conditions and endpoints [57].

While there may be reservations about the quality and amount of information available for topical NSAIDs in acute conditions, the comparison with NSAIDs in other conditions is favourable. For instance, in dysmenorrhoea, a Cochrane review of NSAIDs [58] included 4,066 women, but the trials themselves were small, with an average of about 50 per trial. These 63 randomised double-blind trials investigated 21 different NSAIDs, each at different doses, in studies of varying design, varying outcomes, and varying duration. Two Cochrane reviews of NSAIDs for osteoarthritis in hip and knee [59, 60] had only 3,000 patients in 29 trials.

One implication of short duration studies is that they will not capture important long-term safety information. This may be important for ongoing applications of gels, creams or sprays. There is, however, information that indicates that topical NSAIDs do not cause the gastrointestinal harm found with oral NSAIDs [61], nor are they associated with increased renal failure [62].

Clearly there is a body of evidence to support the efficacy of topical NSAIDs in acute painful conditions. The evidence of efficacy remains despite removing smaller studies lacking double blinding that are open to bias, and substituting newer, larger trials of high quality.

Figure 2

Analysis of trials of topical NSAIDs in acute painful conditions. This forest plot was created using RevMan 4.2. Details of the statistical tests used can be found in the Cochrane Handbook.


Topical NSAIDs were effective and safe in treating acute painful conditions for one week.