Hyaluronic acid injection therapy for osteoarthritis of the knee: concordant efficacy and conflicting serious adverse events in two systematic reviews

O’Hanlon, Claire E.; Newberry, Sydne J.; Booth, Marika; Grant, Sean; Motala, Aneesa; Maglione, Margaret A.; FitzGerald, John D.; Shekelle, Paul G.

doi:10.1186/s13643-016-0363-9

Research
Open access
Published: 04 November 2016

Volume 5, article number 186, (2016)

Systematic Reviews


Claire E. O’Hanlon^1,2,
Sydne J. Newberry¹,
Marika Booth¹,
Sean Grant¹,
Aneesa Motala¹,
Margaret A. Maglione¹,
John D. FitzGerald^1,3 &
…
Paul G. Shekelle^1,4


Abstract

Background

The prevalence of knee osteoarthritis (OA)/degenerative joint disease (DJD) is increasing in the USA. Systematic reviews of treatment efficacy and adverse events (AEs) of hyaluronic acid (HA) injections report conflicting evidence about the balance of benefits and harms. We review evidence on efficacy and AEs of intraarticular viscosupplementation with HA in older individuals with knee osteoarthritis and account for differences between our conclusions and those of another systematic review.

Methods

We searched PubMed and eight other databases and gray literature sources from 1990 to December 12, 2014. Double-blind placebo-controlled randomized controlled trials (RCTs) reporting functional outcomes or quality-of-life; RCTs and observational studies on delay/avoidance of arthroplasty; RCTs, case reports, and large cohort studies and case series assessing safety; and systematic reviews reporting on knee pain were considered for inclusion.

A standardized, pre-defined protocol was applied by two independent reviewers to screen titles and abstracts, review full text, and extract details on study design, interventions, outcomes, and quality. We compared our results with those of a prior systematic review and found them to be discrepant; our analysis of why this discrepancy occurred is the focus of this manuscript.

Results

Eighteen RCTs reported functional outcomes; pooled analysis of ten placebo-controlled, blinded trials showed a standardized mean difference of −0.23 (95 % confidence interval (CI) −0.45 to −0.01) favoring HA at 6 months. Studies reported few serious adverse events (SAEs) and no significant differences in non-serious adverse events (NSAEs) (relative risk (RR) [95 % CI] 1.03 [0.93–1.15]) or SAEs (RR [95 % CI] 1.39 [0.78–2.47]). A recent prior systematic review reported similar functional outcomes, but significant SAE risk. Differences in SAE inclusion and synthesis accounted for the disparate conclusions.

Conclusions

Trials show a small but significant effect of HA on function on which recent systematic reviews agree, but lack of AE synthesis standardization leads to opposite conclusions about the balance of benefits and harms. A limitation of the re-analysis of the prior systematic review is that it required imputation of missing data.


Background

Prevalence of osteoarthritis of the knee is increasing rapidly in the USA due to shifting population demographics: primary risk factors include aging, obesity, prior injury, repetitive use [1], and female gender [2]. The Centers for Disease Control estimate that prevalence of symptomatic knee osteoarthritis may reach 50 % by age 85 [3]. The increase in obesity has translated into not only increasing knee osteoarthritis incidence but also younger age of onset; as a result, by the time individuals reach Medicare eligibility, the length of time they have had the condition has grown, their cases are more advanced [4], and the likelihood of needing surgery has increased.

Traditional treatment options for knee osteoarthritis include both pharmaceutical (analgesics and anti-inflammatory agents) and lifestyle options (physical therapy, exercise, weight loss), as well as surgery (partial or total arthroplasty) for advanced cases. More recent therapies include intraarticular viscosupplementation, which involves local injections of joint lubricant hyaluronic acid (HA) [5].

Recommendations for using HA for knee osteoarthritis have been mixed. In the 2012 update to their 2000 guidelines for the treatment of osteoarthritis of the knee, hip, and hand, the American College of Rheumatology conditionally recommended HA injections for patients who had an inadequate response to initial therapy [5]. The 2013 American Academy of Orthopedic Surgeons guidelines for the treatment of knee osteoarthritis recommend against the use of HA to treat patients with symptomatic conditions [6].

Systematic reviews have an important role in establishing evidence-based clinical guidelines. Much work has been done on improving the methods of systematic reviews for medical treatments, and this work has largely standardized the synthesis of benefits. Clinical and outcomes researchers have created standardized scales and tools to elicit and quantify mean statistical differences. However, clinical guidelines consider the balance of benefits and harms, and the elicitation, appraisal, and reporting of harms are far less standardized than for benefits. This lack of standardization can sometimes sway clinical recommendations either toward or away from potential treatments.

We report here a comparison of the results of two systematic reviews assessing the efficacy and adverse events of HA in patients with knee osteoarthritis. Our starting point is our systematic review and meta-analysis on the use of HA in patients 65 and older, commissioned by the Centers for Medicare and Medicaid Services. In the course of comparing our results to a previous systematic review on the same topic, we found that differences in how adverse events (AEs) are synthesized produced different estimates of the risk of harms. These in turn led to completely different conclusions regarding the balance of benefits and harms of HA, even though the two reviews reported similar results on effectiveness for functional outcomes. In this paper, we briefly describe the methods and results of our commissioned review (full details are available in our Evidence Report [7]), and then focus on the methods and results of a comparison between the AE results from the two reviews.

Methods

This systematic review was conducted under contract for the Agency for Healthcare Research and Quality (AHRQ) through its Evidence-based Practice Center (EPC) Program. As the Centers for Medicare and Medicaid Services (CMS) was the partner for this review and the vast majority of Medicare beneficiaries are over 65, the key questions focused on the functional efficacy and safety of intraarticular HA injections for knee osteoarthritis in persons aged 65 years and older. Although the study was not originally registered with PROSPERO, it followed a pre-defined, standardized protocol approved by CMS that was posted for public comment. The full report [7] is available at http://www.ncbi.nlm.nih.gov/books/NBK343555/. The PRISMA checklist for this manuscript is available as an additional file.

As part of the interpretation of our findings, we compared our results with those of prior systematic reviews. Discrepancies in the analysis of AEs in RCTs in our review and the analysis of RCTs in Rutjes and colleagues’ review [8] caused us to perform a more detailed analysis of serious adverse events (SAEs) reported in the trials included in both reviews. Our investigation into the causes of such discrepant results is the focus of this manuscript.

Search strategy and inclusion criteria

The full description of our search strategy is included in our Evidence Report [7]. Briefly, we searched PubMed, EMBASE, Web of Science, Scopus, the Cochrane database, www.clinicaltrials.gov, the Canadian Agency for Drugs and Technologies in Health database, the Food and Drug Administration Premarket Approval database, the New York Academy of Medicine Grey Literature Report, and unpublished documents provided by manufacturers from January 1, 1990, to December 12, 2014. Search strings included a term for the treatment (hyaluronic acid, hyaluronate, hyaluronan, hylan, viscosupplementation, or similar), a term for the disease state (osteoarthritis, arthritis, gonarthrosis, degenerative joint disease), and a term for the site (knee). See Additional file 1: Table S1 for the full search strategy. Non-English language studies and conference abstracts were excluded, although non-USA studies were included if the product evaluated was analogous to a product available in the USA.
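The three-part structure of the search strings (treatment, disease state, site) can be illustrated with a minimal sketch. The term lists below are taken from the description above; the generic Boolean syntax and the helper names `or_group` and `build_query` are assumptions for illustration only and do not reproduce the database-specific strategies given in Additional file 1: Table S1.

```python
# Illustrative only: generic Boolean syntax, not the syntax of any one database.
TREATMENT = ["hyaluronic acid", "hyaluronate", "hyaluronan", "hylan",
             "viscosupplementation"]
DISEASE = ["osteoarthritis", "arthritis", "gonarthrosis",
           "degenerative joint disease"]
SITE = ["knee"]


def or_group(terms):
    """Join synonyms with OR, quoting multi-word phrases."""
    quoted = [f'"{t}"' if " " in t else t for t in terms]
    return "(" + " OR ".join(quoted) + ")"


def build_query(*groups):
    """Require one term from every concept group by joining groups with AND."""
    return " AND ".join(or_group(g) for g in groups)


query = build_query(TREATMENT, DISEASE, SITE)
print(query)
```

Keeping each concept in its own list makes it easy to add a synonym to one group without touching the others.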

We included randomized controlled trials (RCTs) for functional and quality-of-life efficacy outcomes. We included RCTs and cohort studies for total knee replacement (TKR) efficacy outcomes. Recent comprehensive systematic reviews that reported pain outcomes were also included. RCTs, cohort studies, case series, and case studies were included for AE outcomes, although only RCTs contributed to the pooled estimates.

Screening and data abstraction

Titles and abstracts were independently screened by two reviewers. Data were dually and independently abstracted with disagreements resolved by group discussion. Abstracted data included both study-level data (population demographics, health status, and intervention protocols) and efficacy outcomes of interest. We also abstracted information on AEs.

Quality assessments

Study quality was assessed using questions adapted from the Cochrane Risk of Bias Assessment Tool [9] and EPC Methods Handbook [10]. The quality of RCTs included in the AE assessment was evaluated using the McHarms tool [11].

Efficacy and adverse event analyses

Efficacy analyses were conducted with Stata statistical software, version 12.0 (Stata Corp., College Station, TX). Pooling of adverse events was conducted with StatXact PROCS, version 10 (Cytel, Cambridge, MA).

Efficacy analysis

We conducted meta-analysis of the efficacy outcomes of interest in cases where there were three or more sufficiently homogeneous studies and estimated a pooled random-effects estimate of the overall effect size [12]. We compared these effect sizes to recent estimates of minimum clinically important difference (MCID) for knee osteoarthritis.
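As a rough illustration of random-effects pooling, the sketch below implements the DerSimonian-Laird estimator, a common choice for this kind of analysis. It is an assumption that this matches the estimator used in the review, which ran its analyses in Stata. The function name `pool_random_effects` and the example effect sizes are hypothetical and are not data from the included trials.

```python
import math


def pool_random_effects(effects, variances):
    """DerSimonian-Laird random-effects pooled estimate with a 95% CI.

    effects:   per-study standardized mean differences
    variances: per-study sampling variances of those effects
    Returns (pooled, ci_lower, ci_upper, tau2).
    """
    k = len(effects)
    if k < 3:
        # Mirrors the rule in the text: pool only with three or more studies.
        raise ValueError("need at least three studies to pool")
    w = [1.0 / v for v in variances]                  # inverse-variance weights
    sw = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sw
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))  # Cochran's Q
    c = sw - sum(wi ** 2 for wi in w) / sw
    tau2 = max(0.0, (q - (k - 1)) / c) if c > 0 else 0.0  # between-study variance
    w_star = [1.0 / (v + tau2) for v in variances]    # random-effects weights
    pooled = sum(wi * yi for wi, yi in zip(w_star, effects)) / sum(w_star)
    se = math.sqrt(1.0 / sum(w_star))
    return pooled, pooled - 1.96 * se, pooled + 1.96 * se, tau2


# Hypothetical inputs for illustration only (not trial data).
example_effects = [-0.10, -0.40, -0.25, -0.05]
example_variances = [0.04, 0.05, 0.03, 0.06]
print(pool_random_effects(example_effects, example_variances))
```

When the studies agree closely, the between-study variance `tau2` is truncated at zero and the result coincides with the fixed-effect inverse-variance estimate.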

Adverse event analysis

We classified each reported adverse event on two dimensions: severity (either serious [“SAE”] or not serious [“NSAE”]) and locality (local to the injected joint, local but not to the injected joint, or non-local [“other”]). Classifications were determined by board-certified clinicians on the research team, a rheumatologist (J.D.F.) and an internist (P.G.S.). Adverse events were pooled by severity and locality. Pooling of adverse events was conducted using exact methods; events with zeros in one group were included in the analysis while events with zeros in both groups were excluded [13].
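The classification and the zero-event rule can be sketched as a small data-preparation step. This sketch reads the rule as applying to per-study event counts within each category, which is an interpretive assumption. The record type `ArmCounts`, the locality labels, and the example rows are hypothetical. The exact pooling itself, done in StatXact in the review, is not reproduced here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ArmCounts:
    study: str
    severity: str        # "SAE" or "NSAE"
    locality: str        # "injected_joint", "other_local", or "non_local"
    events_ha: int       # events in the HA arm
    events_control: int  # events in the control arm


def eligible_for_pooling(rows):
    """Keep rows with events in at least one arm; drop double-zero rows."""
    return [r for r in rows if r.events_ha > 0 or r.events_control > 0]


def group_by_category(rows):
    """Group rows by (severity, locality) so each category is pooled separately."""
    groups = {}
    for r in rows:
        groups.setdefault((r.severity, r.locality), []).append(r)
    return groups


# Hypothetical rows for illustration only (not data from the review).
rows = [
    ArmCounts("Study A", "SAE", "non_local", 2, 0),        # single zero: kept
    ArmCounts("Study B", "SAE", "non_local", 0, 0),        # double zero: dropped
    ArmCounts("Study C", "NSAE", "injected_joint", 5, 4),
]
pooling_sets = group_by_category(eligible_for_pooling(rows))
print({key: [r.study for r in val] for key, val in pooling_sets.items()})
```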

Sensitivity analysis

As part of the interpretation of our findings, we compared our results with those of prior systematic reviews, including a review by Rutjes and colleagues [8], which was the most recent prior review of high quality (quality score assessed by AMSTAR [14]: 9 out of 11) available at the time. While both reviews had concordant efficacy results, our study and their study resulted in different conclusions on the risk of SAEs. We hypothesized that such differences in conclusions could arise from three sources: differences in included studies; differences in AEs included in the studies; and differences in how AEs were classified and synthesized. We investigated each potential source.

Differences in included serious adverse event studies

We retrieved all studies included in the systematic review by Rutjes and colleagues [8] and compared their inclusion criteria and included studies with those of our review. We then replicated their pooled SAE analysis as the basis for a sensitivity analysis. Because three of the studies included in their meta-analysis were considered proprietary and specific data were withheld from publication, we used the known sample sizes for these three studies and trial and error to find SAE counts that reproduced their pooled result. We used this replication to determine how sensitive their conclusions were to inclusion and exclusion of individual studies.
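The trial-and-error step can be pictured as a brute-force search: given the known sample sizes, try candidate event counts until the pooled estimate matches the published one. The sketch below uses a Mantel-Haenszel risk ratio as a stand-in pooling method, which is an assumption and may differ from the method used in the prior review. The function names and all numbers are hypothetical.

```python
from itertools import product


def mh_risk_ratio(studies):
    """Mantel-Haenszel pooled risk ratio.

    studies: list of (events_trt, n_trt, events_ctl, n_ctl) tuples.
    """
    num = sum(a * n0 / (n1 + n0) for a, n1, c, n0 in studies)
    den = sum(c * n1 / (n1 + n0) for a, n1, c, n0 in studies)
    return num / den


def search_missing_counts(known, unknown_sizes, target_rr, max_events=5, tol=0.005):
    """Enumerate event counts for studies with withheld data.

    known:         fully reported studies, as for mh_risk_ratio
    unknown_sizes: list of (n_trt, n_ctl) for studies with withheld counts
    Returns every combination (trt1, ctl1, trt2, ctl2, ...) whose pooled
    risk ratio lies within tol of target_rr.
    """
    matches = []
    ranges = [range(max_events + 1)] * (2 * len(unknown_sizes))
    for combo in product(*ranges):
        trial = list(known)
        for i, (n1, n0) in enumerate(unknown_sizes):
            trial.append((combo[2 * i], n1, combo[2 * i + 1], n0))
        if not any(c > 0 for _, _, c, _ in trial):
            continue  # risk ratio undefined without any control-arm events
        if abs(mh_risk_ratio(trial) - target_rr) <= tol:
            matches.append(combo)
    return matches


# Hypothetical example: one reported study and two with withheld counts.
known = [(3, 100, 2, 100)]
unknown_sizes = [(50, 50), (80, 80)]
candidates = search_missing_counts(known, unknown_sizes, target_rr=2.0, max_events=3)
print(len(candidates), candidates[:5])
```

Several combinations can reproduce the same pooled value, so counts imputed this way are not necessarily unique.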

Differences in reported adverse events

We compiled all NSAEs and SAEs reported in our review and in the review by Rutjes and colleagues [8]. As in our own review, we classified AEs reported by Rutjes and colleagues as serious or non-serious, and then further by whether the AE was local to the knee joint, local to somewhere other than the knee joint, or a non-local AE. Rutjes and colleagues reported results for one NSAE (flare) and then used the original study authors’ assessment of AEs as serious or non-serious for their determination of SAEs. We then compared the types of NSAEs and SAEs reported in our review to the NSAEs and SAEs in the review by Rutjes and colleagues [8].

Differences in synthesis of adverse events

We compared how Rutjes and colleagues [8] synthesized the evidence on AEs to our methods for synthesizing AEs. We then conducted sensitivity analyses to assess the degree to which modifications to these classifications influence the pooled results.

Strength of evidence

We assessed the strength of evidence for each outcome using criteria from the Effective Health Care Program [15], which are similar to those used by the Grades of Recommendation Assessment, Development and Evaluation (GRADE) Working Group [16] and include assessments of the study limitations, directness, consistency, precision, and likelihood of reporting bias of the evidence.

Role of the funding source

The original Evidence Report was funded by AHRQ [7]. No additional funding was obtained for the AE sensitivity analysis work. The results and conclusions are those of the authors, who are solely responsible for deciding to submit this manuscript for publication.

Results

Literature flow and efficacy results

Of the 2528 articles screened, 512 were selected for review of the full text, and 63 articles met inclusion criteria for our analyses (Fig. 1). Study-level data can be found in the evidence tables (Additional file 2: Table S2).

Functional efficacy analysis

The full details of the efficacy analyses are included in our Evidence Report [7]. In brief, 18 randomized trials reported on the effects of HA compared to sham-injected placebo control, another HA, or some other active treatment on function, as measured by the Western Ontario-McMaster Universities Arthritis Index (WOMAC [17]), the Lequesne Index [18], the Knee Injury and Osteoarthritis Outcomes Score (KOOS [19]), or Activities of Daily Living, among patients whose average age was 65 or older. Details of included studies and their risk of bias assessment are included in Additional files 2 and 3. Pooled analysis of ten sham-injection, placebo-controlled, assessor-blinded trials showed a standardized mean difference of −0.23 (95 % CI −0.45 to −0.01) (Fig. 2), which significantly favored HA at 6 months follow-up [20–29]. Although our review found that functional outcomes were improved by intraarticular HA injection, the durability of this effect could not be assessed beyond 6 months. We judged the strength of evidence for the function outcome as low because the trials tended to be small, they had moderate risk of bias (often failing to report adequate methods for recruitment or concealment of allocation) (Additional file 3: Table S3), function was usually not a primary outcome, and results were inconsistent.

Our functional effect size of −0.23 (95 % CI −0.45 to −0.01) is similar to previously reported effect sizes. Rutjes and colleagues report an effect size of −0.33 (95 % CI −0.43 to −0.22) [8], while Bannuru and colleagues report −0.30 (95 % CI −0.40 to −0.20) [30]. Our effect size for function did not exceed the minimum clinically important difference (MCID) of −0.37 applied in the review by Rutjes and colleagues [8] but did exceed the minimum clinically important improvement (MCII) of −0.12 derived by Tubach and colleagues [31], as well as the MCII of −0.20 used by Bannuru and colleagues [30].
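The threshold comparison above is simple arithmetic once the sign convention is fixed: negative values favor HA, so an effect exceeds a threshold when it is at least as negative. The short sketch below restates it using only the figures quoted in this section; the variable and function names are hypothetical.

```python
OUR_SMD = -0.23
OUR_CI = (-0.45, -0.01)

# Thresholds quoted in the text, in standardized mean difference units.
THRESHOLDS = {
    "MCID applied by Rutjes et al.": -0.37,
    "MCII derived by Tubach et al.": -0.12,
    "MCII used by Bannuru et al.": -0.20,
}


def exceeds(effect, threshold):
    """Negative values favor HA, so 'exceeds' means at least as negative."""
    return effect <= threshold


def within_ci(threshold, ci):
    """True when the threshold lies inside the confidence interval."""
    lower, upper = ci
    return lower <= threshold <= upper


for name, t in THRESHOLDS.items():
    print(f"{name}: exceeded={exceeds(OUR_SMD, t)}, inside CI={within_ci(t, OUR_CI)}")
```

All three thresholds fall inside the 95 % CI, so the interval is compatible with effects on either side of each one.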

Other efficacy analyses

Quality-of-life outcomes assessed in three RCTs (one placebo-controlled [26] and two head-to-head trials [32, 33]) found no statistically significant differences between groups. Three RCTs [22, 29, 34] and 13 observational studies (reported in 16 articles [35–50]) reported on TKR, but evidence on delay or avoidance of TKR was insufficient to draw conclusions. Two large, good quality systematic reviews with meta-analyses for pain outcomes showed a significant and clinically important effect among adults of all ages [8, 51].