Introduction

Lateral epicondylitis, or tennis elbow, is the most common cause of elbow and forearm pain in adults, with an annual incidence of 1% to 3% in the general population [2, 39]. Lateral epicondylitis is thought to be related to overuse of the extensor carpi radialis brevis muscle, producing pain in the lateral elbow and forearm region. Although the role of inflammation in the pathophysiology of this condition is questionable, lateral epicondylitis is postulated to involve degenerative changes in the epicondylar enthesis of the extensor carpi radialis brevis and perhaps also the supporting collateral ligamentous complex and joint capsule [5]. Uncertainty regarding the pathologic basis of lateral epicondylitis underlies, in part, the lack of consensus on optimal management. The natural history of lateral epicondylitis typically includes resolution in 6 to 24 months, and symptoms remit in approximately 80% of patients within 1 year [3, 9, 17, 39]. Numerous management options are used for this condition, including observation only (no treatment), NSAIDs, injections (corticosteroid, platelet-rich plasma, autologous blood, botulinum toxin, sodium hyaluronate, glycosaminoglycan polysulfate), physiotherapy, bracing, shock wave therapy, laser therapy, and ultrasound therapy.

The rationale for our meta-analysis is that none of these myriad therapies has proven superior to the others [5]. It also is not known whether nonsurgical treatment of this condition provides any intermediate- to long-term advantage over observation only. Numerous treatments for lateral epicondylitis are used in clinical practice without consensus, some of which clearly improve the short-term outcome relative to observation only. However, these treatments are often associated with substantial cost, potential morbidity, and the possibility of worsened long-term outcome with certain treatments such as corticosteroid injections. Our meta-analysis was designed to specifically address longitudinal outcomes at 6 months or greater, in light of the natural history of lateral epicondylitis and the largely short-term benefit of current therapies.

The objective of this meta-analysis was to determine whether clinical outcomes differ among patients with lateral epicondylitis who are treated versus untreated according to evidence from randomized-controlled trials (RCTs) comparing no treatment (observation only or placebo) with some type of nonsurgical treatment. We hypothesized that, at intermediate- to long-term followup of 6 months or greater, patients managed with no treatment (observation) and those receiving various nonsurgical treatments would have similar results as measured by (1) overall improvement, (2) need for escape treatment, (3) outcome scores, and (4) grip strength.

Materials and Methods

Eligibility Criteria

The meta-analysis was performed according to the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines [26]. The inclusion criteria were limited to RCTs that compared any form of nonsurgical treatment with either observation only or placebo at intermediate- to long-term followup of at least 6 months. Observation only or administration of placebo, including saline injection, were considered acceptable forms of nontreatment; dry needling, anesthetic injections, NSAIDs, splints, braces, and bandages were not considered acceptable. Controlled trials with a crossover design were excluded unless they contained patient subgroups that continued with their initial treatment assignment for the entire followup of 6 months or greater. Studies that did not report the followup interval or that reported only limited qualitative findings were excluded. No restrictions were imposed on publication date.

Literature Search

PubMed and the Cochrane Central Register of Controlled Trials (CENTRAL) were queried to identify relevant English-language studies. The search term included the (Therapy/Broad) filter and the following key words: tennis elbow, lateral epicondylitis, lateral epicondylosis, lateral epicondylopathy, and lateral epicondylalgia. The search was performed in December 2013 and repeated the following month. The resulting study titles and abstracts were reviewed according to the eligibility criteria. Full manuscripts were procured and reviewed for eligible studies, and their citations were manually screened to identify additional studies that might have been missed. A PRISMA trial flow shows the study selection algorithm (Fig. 1).

Fig. 1 The Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) trial flow shows the inclusion process for the randomized-controlled trials in the meta-analysis.

Study Selection

The initial search of PubMed and CENTRAL identified 804 English-language articles, whose titles and abstracts were subsequently screened to determine their eligibility. Citation lists of selected studies were manually cross-referenced to ensure that no additional studies were missed. Twenty-two studies containing a total of 2280 enrolled patients at intermediate- to long-term followup met the inclusion criteria, comparing nonsurgical treatment (n = 1295) with observation only or placebo (n = 985) [1, 4, 6, 7, 10–15, 20, 23–25, 27, 28, 31, 33, 35–37, 40].

Data Abstraction

Data from eligible studies were extracted for study and patient characteristics, pain relief, overall improvement, requirement for escape treatment, outcome scores, pain-free function, clinician-related global assessments, health-related quality-of-life surveys, maximum and pain-free grip strength, pressure-pain threshold, performance on physical examination maneuvers, and radiologic findings. If outcomes were reported using only graphic plots but were omitted from the body of the text, plot-digitizing software (Plot Digitizer Version 2.6.4, Joseph Huwaldt and Scott Steinhorst, http://plotdigitizer.sourceforge.net) was used to quantify these data.

Data Items

Overall improvement was defined as patient-rated “complete recovery” or “much improvement” on a six-point Likert global assessment scale [4], a 50% or greater reduction in baseline pain status, a three-point reduction in the baseline VAS score, a final pain score of 3 of 10 or less, or a final Roles-Maudsley score [30] of 1 or 2. The analysis of pain relief used scores from the 10- or 100-point VAS, self-reported pain status, and four-point Roles-Maudsley rating scale [30]. Data for pain relief were pooled and analyzed after stratification into two categories: (1) pain at rest or daily activity and (2) pain during strain or resisted wrist extension. Pooled analysis of the presence of pain on resisted wrist extension was performed. The requirement for escape treatments was evaluated, including all cointerventions, analgesics or NSAIDs, outside consultation, and surgery. Three outcome scores were analyzed: the Patient-rated Tennis Elbow Evaluation (PRTEE) [32] score; the DASH [18] score; and the Pain-Free Function Index (PFFI) score, using either the eight- or 10-item version of this questionnaire [21, 38]. When it was necessary to compute a total outcome score from reported components of the score, such as the pain and function components of the PRTEE score, the individual means and SDs were combined. Health-related quality of life was assessed via the EuroQoL (EQ)-5D score [8]. Overall function was assessed by pooling scores from the PRTEE, DASH, Upper Extremity Function Scale [29], PFFI, and study-specific function questionnaires, with inversion of signs when applicable so that lower values represented improvement; if more than one of these outcome measures was reported in the same study, only one was included in the analysis, in the aforementioned order of priority. Finally, maximum and pain-free grip strength were analyzed.

Data Synthesis and Statistical Analysis

Pooled analysis was performed to compare several clinical outcome measures between groups, depending on the availability of data. A random-effects model was selected to account for statistical heterogeneity across the included trials using Review Manager (version 5.2.3; The Nordic Cochrane Centre, The Cochrane Collaboration, Copenhagen, Denmark). Q tests were performed to measure statistical heterogeneity across the included trials, with the I² value conveying the degree of heterogeneity, and I² values of 75% or greater representing considerable heterogeneity [16]. If the standard deviation for a given outcome was not reported in a study, it was calculated from other provided statistics, including the 95% or 99% CI, standard error, interquartile range, or p value. Continuous data were analyzed through the inverse-variance statistical method and computation of the standardized mean difference (SMD) or mean difference (MD) and 95% CI. Dichotomous data were analyzed through the Mantel-Haenszel statistical method and computation of the risk ratio (RR) and 95% CI. Pooled analysis was performed for a given outcome when data were reported by at least two studies. It was possible to extract and pool multiple group comparisons from studies that compared more than one treatment with no treatment. When multiple studies reported an outcome using the same scale and unit of measurement, use of the MD method allowed aggregation of change-from-baseline and final data in the same analysis in accordance with the Cochrane Handbook for Systematic Reviews of Interventions [16], under the assumption that between-group differences in both measurements closely approximate each other in RCTs. Otherwise, change-from-baseline and final data were pooled in separate analyses using the SMD method, which allowed comparison of related data that were reported using disparate scales or units of measurement.
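
To make the pooling step concrete, the following Python sketch shows inverse-variance random-effects pooling together with Cochran's Q and I². It assumes the DerSimonian-Laird estimator of between-study variance, which is the approach commonly associated with Review Manager's random-effects option. The function name `pool_random_effects` and the example inputs are hypothetical and are not data from the included trials.

```python
import math


def pool_random_effects(effects, variances):
    """Pool per-study effects (e.g., MD, SMD, or log RR) with a random-effects model.

    Assumption: DerSimonian-Laird estimate of the between-study variance (tau^2).
    Returns the pooled effect, its 95% CI, Q, I^2 (percent), tau^2, z, and p.
    """
    k = len(effects)
    if k < 2 or k != len(variances):
        raise ValueError("need at least two studies with matching variances")

    # Fixed-effect (inverse-variance) weights and pooled estimate.
    w = [1.0 / v for v in variances]
    sum_w = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sum_w

    # Cochran's Q and the I^2 statistic derived from it.
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    df = k - 1
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0

    # DerSimonian-Laird between-study variance, truncated at zero.
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    tau2 = max(0.0, (q - df) / c)

    # Random-effects weights add tau^2 to each within-study variance.
    w_re = [1.0 / (v + tau2) for v in variances]
    pooled = sum(wi * yi for wi, yi in zip(w_re, effects)) / sum(w_re)
    se = math.sqrt(1.0 / sum(w_re))

    z = pooled / se
    p = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided p value from the normal distribution
    return {
        "pooled": pooled,
        "ci95": (pooled - 1.959964 * se, pooled + 1.959964 * se),
        "q": q,
        "i2": i2,
        "tau2": tau2,
        "z": z,
        "p": p,
    }


if __name__ == "__main__":
    # Hypothetical SMDs and variances for three studies (illustration only).
    result = pool_random_effects([0.10, -0.35, 0.42], [0.04, 0.02, 0.05])
    print(result)
```

Under this reading, an I² of 75% or greater in the returned dictionary would correspond to what the text calls considerable heterogeneity.
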
When multiple intermediate- to long-term followup datasets were reported in a study, the longest followup was preferentially used in the primary analysis. Effect sizes were presented in relation to the treatment group; for instance, an RR greater than 1 indicated a greater risk in the treatment group. The z statistic and p value were used to determine the statistical significance of the pooled comparison. Forest plots were provided. Computation of weighted means was performed to analyze demographic characteristics (age, proportion of males, proportion of dominant or right elbows, and duration of symptoms) for each group. The methodologic quality of each included RCT was assessed using the 22-point Consolidated Standards Of Reporting Trials (CONSORT) checklist [34]. Studies were scored and classified as excellent (18–22), good (13–17), fair (8–12), or poor (≤ 7). A sensitivity analysis [22] was performed using only trials of excellent or good methodologic quality according to the CONSORT score. A funnel plot, which is a visual representation of statistical precision plotted against the treatment effect, was constructed to assess the potential influence of publication bias on the results.
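
The quality classification and the sensitivity-analysis filter described above can be sketched in Python as follows. The cutoffs are those stated in the text; the function names and the trial records are hypothetical placeholders, not the included studies.

```python
from typing import Dict, List


def classify_consort(score: int) -> str:
    """Map a 22-point CONSORT checklist score to the quality class used in the text."""
    if not 0 <= score <= 22:
        raise ValueError("CONSORT score must be between 0 and 22")
    if score >= 18:
        return "excellent"
    if score >= 13:
        return "good"
    if score >= 8:
        return "fair"
    return "poor"


def sensitivity_subset(trials: List[Dict]) -> List[Dict]:
    """Keep only trials of excellent or good quality for the sensitivity analysis."""
    return [t for t in trials if classify_consort(t["consort"]) in ("excellent", "good")]


if __name__ == "__main__":
    # Hypothetical trial records (illustration only).
    trials = [
        {"id": "trial_a", "consort": 20},
        {"id": "trial_b", "consort": 14},
        {"id": "trial_c", "consort": 9},
    ]
    print([t["id"] for t in sensitivity_subset(trials)])
```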

Study Characteristics

All included studies were RCTs published from 1990 to 2013 (Table 1). Nineteen studies were placebo-controlled trials, while the remaining three compared one or more treatments with observation only. The size of the nonsurgical treatment and nontreatment groups ranged from 18 to 165 and five to 166, respectively. The followup ranged from 6 months to 5 years, and 10 studies reported data for two intermediate- to long-term followup periods. The methodologic quality was excellent in 10 studies, good in four, and fair in eight, yielding a mean CONSORT score of 15.5 (range, 8–22). A funnel plot (Fig. 2) of the analysis of overall improvement appeared essentially symmetric in relation to the pooled estimate from the meta-analysis, indicating minimal publication bias.

Table 1 Study design and patient characteristics of included studies
Fig. 2 A funnel plot of the analysis of overall improvement shows relative symmetry in relation to the pooled estimate from the meta-analysis, indicating minimal publication bias.

Patient Characteristics

The frequency-weighted mean ages of the nonsurgical treatment and nontreatment groups were 47.0 ± 3.84 and 47.3 ± 3.93 years, respectively. The frequency-weighted proportion of males and proportion of dominant or right elbows, respectively, were 55.2% ± 7.30% and 73.6% ± 8.80% in the treatment group and 54.1% ± 7.14% and 74.5% ± 10.2% in the nontreatment group. The frequency-weighted duration of symptoms was 12.0 ± 8.42 months in the treatment group and 13.9 ± 8.04 months in the nontreatment group.
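
The text does not spell out how the frequency-weighted values were computed. One plausible reading, sketched below in Python as an assumption, weights each study-level mean by its group size and reports the weighted SD of those study-level means around the weighted mean. The function name and the inputs are hypothetical.

```python
import math
from typing import Sequence, Tuple


def weighted_mean_sd(values: Sequence[float], weights: Sequence[float]) -> Tuple[float, float]:
    """Return the weighted mean and weighted (population) SD of study-level values.

    Assumption: weights are group sizes; the SD describes the spread of study
    means, not the spread of individual patients.
    """
    if len(values) != len(weights) or not values:
        raise ValueError("values and weights must be non-empty and of equal length")
    total = float(sum(weights))
    if total <= 0:
        raise ValueError("weights must sum to a positive number")
    mean = sum(w * v for v, w in zip(values, weights)) / total
    variance = sum(w * (v - mean) ** 2 for v, w in zip(values, weights)) / total
    return mean, math.sqrt(variance)


if __name__ == "__main__":
    # Hypothetical study-level mean ages and group sizes (illustration only).
    mean_age, sd_age = weighted_mean_sd([45.0, 48.5, 47.2], [30, 120, 60])
    print(round(mean_age, 1), round(sd_age, 2))
```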

Results

Overall Improvement

For overall improvement, neither nontreatment nor nonsurgical treatment was favored (RR = 1.05, [0.96–1.15]; p = 0.32; I² = 51%) (Fig. 3) (Table 2). For pain relief at rest or during activity using change-from-baseline data, there was no difference between groups (SMD = −0.15, [−0.59 to 0.29]; p = 0.50; I² = 90%) (Fig. 4A). Similarly, for pain relief at rest or during activity using final data, there was no difference between groups (SMD = −0.27, [−0.97 to 0.42]; p = 0.44; I² = 97%) (Fig. 4B). For pain relief during strain or resisted wrist extension, there was no difference between groups (SMD = −0.67, [−1.87 to 0.53]; p = 0.28; I² = 98%) (Fig. 4C). Pain on resisted wrist extension occurred at a similar rate in the two groups (RR = 1.07, [0.77–1.49]; p = 0.69; I² = 0%) (Fig. 4D).

Fig. 3 The forest plot shows the risk ratio of overall improvement. M-H = Mantel-Haenszel; df = degrees of freedom.

Table 2 Summary of results of pooled analyses
Fig. 4A–D The forest plots show the standardized mean difference in pain scores (A) at rest or during daily activity, using change-from-baseline data; (B) at rest or during daily activity, using final data; and (C) during strain or resisted wrist extension, using final data. (D) This forest plot shows the risk ratio of pain on resisted wrist extension. IV = inverse-variance; M-H = Mantel-Haenszel; df = degrees of freedom.

Requirement for Escape Interventions

The nonsurgical treatment group showed no difference in the need for escape treatment of any kind (RR = 1.50, [0.84–2.70]; p = 0.17; I² = 86%) (Fig. 5A). The treatment group was no more likely to require analgesics or NSAIDs (RR = 1.24, [0.88–1.74]; p = 0.21; I² = 37%) (Fig. 5B). The treatment group was more likely to require outside consultation (RR = 2.24, [1.21–4.15]; p = 0.01; I² = 61%) (Fig. 5C). Both groups were equally likely to require surgery (RR = 1.16, [0.73–1.84]; p = 0.53; I² = 0%) (Fig. 5D).
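
As a rough consistency check on how a z statistic and p value relate to a reported risk ratio and its 95% CI, the following Python sketch back-calculates both on the log scale. It assumes the CI is symmetric on the log scale and based on the normal distribution; the function name is hypothetical. The only inputs used in the example are the RR and CI reported above for outside consultation.

```python
import math


def z_and_p_from_rr(rr: float, lower: float, upper: float):
    """Back-calculate z and a two-sided p value from an RR and its 95% CI.

    Assumption: the CI was built as exp(log(RR) +/- 1.96 * SE) on the log scale.
    """
    if min(rr, lower, upper) <= 0 or lower >= upper:
        raise ValueError("RR and CI limits must be positive, with lower < upper")
    se = (math.log(upper) - math.log(lower)) / (2 * 1.959964)
    z = math.log(rr) / se
    p = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided p value from the normal distribution
    return z, p


if __name__ == "__main__":
    # RR and 95% CI reported in the text for outside consultation.
    z, p = z_and_p_from_rr(2.24, 1.21, 4.15)
    print(round(z, 2), round(p, 3))
```

For these inputs the sketch gives a z of roughly 2.6 and a two-sided p of roughly 0.01, in line with the reported value. Small discrepancies for other outcomes would be expected because the published estimates are rounded.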

Fig. 5A–D The forest plots show the risk ratio of need for (A) escape treatments of any kind; (B) analgesics or NSAIDs; (C) outside consultation; and (D) surgery. M-H = Mantel-Haenszel; df = degrees of freedom.

Outcome Scores

The nontreatment group had better PRTEE scores (MD = 1.47, [0.68–2.26]; p < 0.001; I² = 23%) using aggregated change-from-baseline and final data (Fig. 6A). Neither group exhibited superior DASH scores (MD = −2.69, [−15.80 to 10.42]; p = 0.69; I² = 93%) using aggregated change-from-baseline and final data (Fig. 6B). There was no difference in PFFI scores (SMD = 0.25, [−0.32 to 0.81]; p = 0.39; I² = 91%) using change-from-baseline data (Fig. 6C). EQ-5D scores were similar for the two groups (SMD = 0.08, [−0.52 to 0.67]; p = 0.80; I² = 89%) (Fig. 6D). The summary analysis of overall function showed no difference between groups using change-from-baseline data (SMD = 0.11, [−0.14 to 0.36]; p = 0.37; I² = 56%) (Fig. 7A) and final data (SMD = −0.16, [−0.79 to 0.47]; p = 0.61; I² = 97%) (Fig. 7B).

Fig. 6A–D The forest plots show the mean difference in the (A) Patient-rated Tennis Elbow Evaluation score and (B) DASH score; and the standardized mean difference in the (C) Pain-Free Function Index score and (D) EuroQoL-5D score. IV = inverse-variance; df = degrees of freedom.

Fig. 7A–B The forest plots show the standardized mean difference in overall function using (A) change-from-baseline data and (B) final data. IV = inverse-variance; df = degrees of freedom.

Grip Strength

There was no difference in maximum grip strength between groups using change-from-baseline data (SMD = 0.12, [−0.11 to 0.35]; p = 0.31; I² = 0%) (Fig. 8A) or final data (SMD = 4.37, [−0.65 to 9.38]; p = 0.09; I² = 100%) (Fig. 8B). Pain-free grip strength was similar for the two groups using change-from-baseline data (SMD = −0.20, [−0.84 to 0.43]; p = 0.53; I² = 86%) (Fig. 8C) and final data (SMD = −0.03, [−0.61 to 0.54]; p = 0.91; I² = 84%) (Fig. 8D).

Fig. 8A–D The forest plots show the standardized mean difference in maximum grip strength using (A) change-from-baseline data and (B) final data; and pain-free grip strength using (C) change-from-baseline data and (D) final data. IV = inverse-variance; df = degrees of freedom.

Sensitivity Analysis

The sensitivity analysis, using only trials of excellent or good methodologic quality according to the CONSORT score, confirmed the results of all analyses that might have been influenced by inclusion of low-quality trials. No differences were found in overall improvement (p = 0.52), final pain with rest or daily activity (p = 0.30), pain during strain or resisted wrist extension (p = 0.28), requirement for surgery (p = 0.33), DASH score (p = 0.65), overall function (p = 0.82), or final maximum grip strength (p = 0.49). Sensitivity analysis was not necessary for the remaining analyses, which were based exclusively on trials of excellent or good methodologic quality.

Discussion

Lateral epicondylitis is a common tendinopathy that can cause significant pain, disability, and productivity loss. The impetus for this study was not only the lack of consensus surrounding the management of lateral epicondylitis, but also the significant healthcare-related costs and morbidity risk of many currently used treatments for this condition. Although multiple treatments for lateral epicondylitis are known to improve patient outcomes in the short term, to our knowledge, no meta-analysis to date has specifically compared intermediate- to long-term outcomes for nonsurgical treatment versus no treatment. As numerous management strategies are used for lateral epicondylitis, our meta-analysis was conducted to determine whether nonsurgical treatment of this condition, compared with observation only or placebo, improves subjective and objective clinical outcomes at intermediate- to long-term followup.

This study has notable limitations. Improvements in the nontreatment group may be partially attributable to a placebo effect, activity modification, counseling, and/or not-per-protocol treatments in addition to the natural history of the condition. The potential effects of saline injection and single-pass needling, although presumed to be of minimal biological consequence, are also a consideration. In a similar vein, patients who receive treatment may experience early improvements in pain and function that prompt premature return to their previous activity level, potentially aggravating the condition and obfuscating any beneficial treatment effect. The aggregation of multiple nonsurgical treatments in the same analysis allows the possibility that less effective treatments counterbalance those that are more effective, although this approach made our meta-analysis feasible and increased its statistical power. Our assessment of statistical heterogeneity indicated considerable heterogeneity in 11 of the 20 analyses conducted, as defined by an I² value of 75% or greater, with clinical heterogeneity being a probable source. This is a limitation innate to our study design. The chief aim of our meta-analysis was to test the overarching hypothesis that, in the intermediate to long term, an observation-only approach provides comparable outcomes to various available treatments, none of which is preferentially accepted in clinical practice or is clearly considered the standard of care over other treatment options. Our study was not designed to specifically focus on individual treatments that have, in many cases, been studied in only a small number of RCTs with at least 6 months of followup. The comparative effectiveness of individual treatments was addressed in a recently published systematic review and meta-analysis comparing injection therapies for lateral epicondylitis [19]. Krogh et al. reported benefits over placebo with autologous blood, platelet-rich plasma, prolotherapy, and hyaluronic acid, but not corticosteroid, botulinum toxin, polidocanol, or glycosaminoglycan, although the number of RCTs available for inclusion was modest. Although our meta-analysis suggests that therapeutic interventions do not enhance long-term outcomes, some patients and clinicians may be unwilling to wait several months to achieve pain resolution and functional improvement, particularly when a timely return to physically demanding work or sport is desired. An observation-only approach has its own risks, including short-term disability and pain, and economic cost in lost productivity. The appropriateness of nonsurgical treatment also may depend on the severity and duration of symptoms. The acceleration of symptom improvement must be weighed against treatment-related expenses, morbidity, and the possibility that certain treatments, such as corticosteroid injections, may worsen the long-term outcome.

Owing to a lack of evidence comparing their efficacy with observation only or placebo at intermediate- to long-term followup, analysis of certain therapeutic modalities, such as botulinum toxin injection and surgery, was not possible. Furthermore, studies that investigated newer, promising therapies, such as platelet-rich plasma injection, were underrepresented in the literature relative to older therapies such as corticosteroid injection and shock wave therapy. Some pooled analyses were based on data from a small number of studies, increasing the likelihood of bias. It also is unclear how multimodal approaches compare with observation only, as most studies investigated treatments administered alone. Finally, certain clinical outcome measures were not amenable to pooled analysis owing to limited or nonuniform reporting, or inability to assess variance about the mean, including patient satisfaction, SF-12 and SF-36 scores, Illness Perception Questionnaire scores, pressure-pain threshold, physical examination tests such as lateral epicondyle tenderness and wrist extensor peak force, and radiologic findings.

Nonsurgical treatment and nontreatment produced similar results for overall improvement, escape treatment, outcome scores, and grip strength, except for an approximately halved need for outside consultation and a statistically but not clinically significant advantage in PRTEE scores in the nontreatment group. These findings likely reflect the self-resolving natural history of this condition in the long term and the predominantly short-term treatment effect of many currently available nonsurgical interventions. Considering the heterogeneity of the interventions aggregated together in our study, we caution that certain nonsurgical treatments may be more effective than others and warrant further exploration in future RCTs.

The current meta-analysis of intermediate to long-term outcomes from RCTs identified no benefit to the nonsurgical treatment of lateral epicondylitis. Therefore, the findings of this meta-analysis validate observation only and reassurance as a practical and cost-effective management strategy for patients able to tolerate their short-term symptoms. Clinicians should counsel patients regarding the merits of watchful waiting, while judiciously weighing interventions for this condition, given their lack of clear long-term benefit, associated costs, and potential for adverse effects.