Introduction

Osteoarthritis (OA) is a chronic degenerative joint disease characterized by the progressive breakdown of joint cartilage, leading to joint pain, stiffness, and functional impairment. It is the most common form of arthritis, affecting millions of people worldwide [1]. Although there is a worldwide trend toward a significantly increased incidence of OA among younger patients, it predominantly affects the elderly [2]. OA can affect any joint in the body but is most commonly found in the knee [3]. The pathogenesis of OA is complex and not yet fully understood. However, current research suggests that the development and progression of OA are influenced by a combination of genetic, environmental, and mechanical factors [4, 5]. These factors can lead to the loss of chondrocytes and the extracellular matrix through mechanical, inflammatory, and metabolic mechanisms, ultimately causing OA [5,6,7,8].

Diagnosis of knee OA typically involves a combination of patient history, physical examinations and imaging. Among auxiliary examinations, X-ray is most commonly used. The progression of OA can be evaluated using the Kellgren-Lawrence grading system, a widely used radiographic classification system, based on a scale of 0–4, with higher grades indicating more severe joint damage [5, 9, 10].

The management of knee OA aims to reduce pain, improve function, and prevent further joint damage [5]. This can involve a combination of non-pharmacologic interventions such as exercise, weight loss, and physical therapy, as well as pharmacologic interventions such as nonsteroidal anti-inflammatory drugs (NSAIDs) and analgesics [11, 12]. In advanced cases, surgical interventions may be necessary, with total knee arthroplasty (TKA) and unicompartmental knee arthroplasty (UKA) being the most common. However, both techniques are invasive, and should be considered last-resort treatments for knee OA [13, 14].

Arthroscopy is a minimally invasive surgical technique that can remove loose bodies or inflamed synovial tissue contributing to pain and inflammation in the knee joint. It can also be used to repair or remove damaged cartilage or bone [15,16,17]. However, its effectiveness compared with non-surgical treatments is debated. Kirkley conducted a randomized controlled trial (RCT) and found that arthroscopy was no more effective than placebo surgery in improving pain or function in patients with knee OA [18]. Other RCTs, however, have supported the therapeutic effectiveness of arthroscopy in treating degenerative knee OA [19, 20]. Even systematic reviews have reached different conclusions on the topic [21, 22]. The most recent systematic review and meta-analysis concluded that arthroscopic surgery provided little or no clinically important benefit in pain, function and knee-specific quality of life compared with a placebo procedure [22]. However, that review included participants with degenerative knee disease, encompassing knee OA and meniscal tears, which means studies recruiting patients without OA were also included. Strictly speaking, its conclusion applies only to degenerative knee disease in general, leaving the effectiveness of arthroscopy for OA unresolved. Therefore, the present study aims to systematically evaluate the effectiveness of arthroscopy for knee OA compared with any conservative treatment.

Materials and methods

The work was conducted following the instructions of the Cochrane Handbook for Systematic Reviews and reported in accordance with the Preferred Reporting Items for Systematic Review and Meta-Analyses (PRISMA) statement [23, 24]. The review protocol was prospectively registered on PROSPERO (ID: CRD42022379604).

Inclusion and exclusion criteria

As the study focused on the effectiveness of arthroscopy for knee OA, the inclusion criteria were as follows: (1) Participants were diagnosed with knee OA; (2) Studies were RCTs or quasi-RCTs comparing arthroscopic surgery with any non-surgical treatment or placebo; (3) Outcome measurements focused on treatment effectiveness and included at least one of the two key outcomes, pain and joint function.

The exclusion criteria were as follows: (1) Studies that included patients with Kellgren-Lawrence grade 0; (2) Studies involving patients with other serious knee diseases, such as cruciate ligament injury; (3) Studies conducted on animals or cadaveric specimens; (4) Conference abstracts or protocols of unfinished studies.

Search strategy

Articles published in PubMed, Embase and the Cochrane Library in English were searched without any time restriction. The search terms included debridement, lavage, knee, osteoarthritis, arthroscopy, RCT and arthroscopic. The literature search was last performed on 1st July 2024. The detailed search strategy is provided as supplemental material 1.

The search results were imported into EndNote software by one of the authors, and duplicates were removed after manual checking. Two authors independently scanned the titles and abstracts to exclude articles that clearly did not meet our eligibility criteria. The remaining studies were then read in full text. In cases where the two authors had differing views, they first discussed the discrepancies. If they still could not reach an agreement, a senior author made the final decision.

Data extraction

Data extraction from the included studies was conducted independently by two authors. The extracted data included author, year, the number, gender and age of patients, interventions, outcomes and the longest follow-up time. Since OA is a chronic disease, data from the longest follow-up time point were extracted for each outcome measure.

Risk of bias assessment

The risk of bias in the included studies was evaluated using the RoB 2 tool recommended by Cochrane. This evaluation was conducted independently by two authors, who then cross-checked each other’s assessments. Any discrepancies were first discussed between the two authors, and if they still could not reach an agreement, a senior author made the final decision.

Outcomes and statistical analysis

The preset outcomes were pain relief, functional improvement, time to receive arthroplasty and patient-reported satisfaction. Among them, the Visual Analogue Scale (VAS) for pain and the Western Ontario and McMaster Universities Arthritis Index (WOMAC) were considered primary outcome measurements. Additionally, assessments such as the Knee Injury and Osteoarthritis Outcome Score (KOOS), the International Knee Documentation Committee (IKDC) score, the Lysholm score, and the Short Form-36 (SF-36) were used to evaluate functional recovery.

Meta-analyses were undertaken using RevMan V.5.4 to calculate the pooled effect if an outcome measure was used by at least two included studies. For continuous variables, the mean difference (MD) and standard deviation (SD) of the studies were pooled using the inverse variance method. Since the full scores varied across studies, the full score for VAS was set to 10 and the full score for WOMAC was set to 100. If data transformation or normalization was done across different studies, the standardized mean difference (SMD) was used instead of the MD. For dichotomous data, risk ratios (RR) were calculated. Statistical heterogeneity was assessed using the I² statistic, with values of 0–50% representing low heterogeneity and 50–100% representing high heterogeneity. When I² was < 50%, a fixed effects model was used; otherwise, a random effects model was used. Sensitivity analysis was performed by omitting each study in turn to assess its influence on the pooled results when more than three studies were synthesized. Reporting bias was evaluated using the Egger test if an outcome included more than 10 studies. For all data synthesis, confidence intervals (CIs) were set at 95%, and p < 0.05 was considered statistically significant.
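The pooling rules described above can be illustrated with a minimal Python sketch. It is not RevMan's code and was not used in this review. The function names, the linear rescaling of scores to a common full score, the DerSimonian-Laird estimator for the between-study variance, and the normal approximation for the CI and p value are all assumptions made for illustration.

```python
import math

def rescale(mean, sd, old_max, new_max):
    """Linearly rescale a score and its SD to a common full score (assumed approach)."""
    factor = new_max / old_max
    return mean * factor, sd * factor

def mean_difference(m1, sd1, n1, m2, sd2, n2):
    """Mean difference between two arms and its standard error."""
    md = m1 - m2
    se = math.sqrt(sd1 ** 2 / n1 + sd2 ** 2 / n2)
    return md, se

def pool(effects, ses, threshold=50.0):
    """Inverse-variance pooling: fixed effect if I^2 < threshold, else random effects."""
    w = [1 / se ** 2 for se in ses]
    fixed = sum(wi * e for wi, e in zip(w, effects)) / sum(w)
    # Cochran's Q and the I^2 statistic (in percent)
    q = sum(wi * (e - fixed) ** 2 for wi, e in zip(w, effects))
    df = len(effects) - 1
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    if i2 < threshold:
        est, se, model = fixed, math.sqrt(1 / sum(w)), "fixed"
    else:
        # DerSimonian-Laird between-study variance (assumed estimator)
        c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
        tau2 = max(0.0, (q - df) / c)
        w_re = [1 / (s ** 2 + tau2) for s in ses]
        est = sum(wi * e for wi, e in zip(w_re, effects)) / sum(w_re)
        se, model = math.sqrt(1 / sum(w_re)), "random"
    ci = (est - 1.96 * se, est + 1.96 * se)          # 95% CI, normal approximation
    p = math.erfc(abs(est / se) / math.sqrt(2))      # two-sided p value
    return {"estimate": est, "se": se, "ci": ci, "p": p, "i2": i2, "model": model}

def leave_one_out(effects, ses):
    """Sensitivity analysis: re-pool after omitting each study in turn."""
    return [pool(effects[:i] + effects[i + 1:], ses[:i] + ses[i + 1:])
            for i in range(len(effects))]
```

With hypothetical per-study mean differences and standard errors, `pool(effects, ses)` returns the pooled estimate together with the model that the I² rule selected, and `leave_one_out(effects, ses)` returns one pooled result per omitted study.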

Results

Searching and selection results

The initial search identified 1281 records. After removing duplicates, we screened 879 records. We retrieved 37 studies for full-text screening, of which 10 met our inclusion criteria and were included. Some studies were excluded because their study cohorts included patients without OA or patients with Kellgren-Lawrence grade 0 [25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48]. Additionally, some studies were excluded because they were health economics studies, imaging studies or lacked predefined outcome assessments [33, 49, 50]. One study that performed arthroscopy in both the treatment and control groups was also excluded [51]. The results of the search are shown in Fig. 1.
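The counts reported above can be cross-checked with a few lines of Python. This is only an arithmetic sketch: the variable names are illustrative, and it assumes that duplicate removal was the only step between identification and screening and that the cited reference numbers correspond to the full-text exclusions.

```python
# Counts as reported in the text
identified = 1281
screened = 879
full_text = 37
included = 10

duplicates_removed = identified - screened       # 402 (assumes only duplicates were removed)
excluded_at_screening = screened - full_text     # 842
excluded_at_full_text = full_text - included     # 27

# Reference numbers cited for the full-text exclusions; [33] is cited for two reasons
excluded_refs = set(range(25, 49)) | {33, 49, 50} | {51}

print(duplicates_removed, excluded_at_screening, excluded_at_full_text)
print(len(excluded_refs) == excluded_at_full_text)
```

Under these assumptions, the number of distinct cited exclusions matches the 27 studies removed at the full-text stage.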

Fig. 1 Flow chart of literature search

Characteristics of included studies

All ten included studies were parallel-group RCTs conducted between 1993 and 2023 [18,19,20, 52,53,54,55,56,57,58]. Moseley conducted two trials, the first being a pilot study for the second. Since the two studies did not share data, both were included. These studies were conducted in Veterans Affairs medical centers, resulting in a male predominance among their subjects [54, 55]. In contrast, female subjects outnumbered males in all other studies.

The arthroscopic surgery in these studies primarily included debridement, meniscectomy, synovectomy, lavage, microfracture, and removal of loose bodies. The control methods included physical therapy, platelet-rich plasma (PRP) or Hyalgan injections, NSAIDs and sham surgery. The follow-up time ranged from 6 to 36 months, except for Zhang’s study, which did not provide an exact follow-up duration [19].

Outcome measurements varied among the studies, and some preset assessments in our study protocol, such as IKDC, Lysholm scores and time to receive arthroplasty, were not used in any of the studies, and thus will not be reported below. Subgroup analysis and publication bias will also not be shown, as the conditions for these analyses were not met according to our protocol. The characteristics of the included studies are presented in Table 1.

Table 1 Characteristics of included studies

Risk of bias

Only one study was assessed as having a low risk of bias [18]. Most studies were evaluated as having a high risk of bias in the measurement of outcomes because the differences between intervention methods made blinding of assessors difficult. However, two studies achieved a low risk in this domain by covering the arthroscopy scars during assessment. In contrast, Merchan’s study used homogeneous baseline data derived from data after the removal of late deaths, which is a statistically incorrect method [20]. According to the RoB 2 guidance, this study was evaluated as having a high risk in the corresponding domain. All studies reported nearly all data in every cluster, so all were evaluated as low risk in the missing data domain. The detailed risk of bias assessment is shown in Fig. 2.

Fig. 2 Summary of risk of bias of included studies

Pain relief

All studies reported the effectiveness of arthroscopy on pain relief, though the scales they used differed. However, Moseley [55] and Saeed [57] did not perform statistical analyses, and Merchan [20] and Zhang [19] did not evaluate pain relief on a separate scale. Among these studies, only Singh found significantly better pain relief in the PRP injection group than in the arthroscopy group at the 9-month follow-up, while the other studies did not show any significant advantage of arthroscopy over non-surgical treatments in terms of pain relief. The synthesis of the available data did not support the superiority of either arthroscopy or conservative treatment (p = 0.63), with high statistical heterogeneity (I² = 81%). The forest plot is shown in Fig. 3. Additionally, the pooled result for pain scores remained stable in the sensitivity analysis.

Fig. 3 Forest plot of pooled analysis of pain scores

Functional recovery

Among the included studies, Saeed’s was the only one that did not evaluate functional recovery [57]. Moseley did not perform statistical analyses in his pilot study [55]. Merchan compared the effectiveness of arthroscopic surgery and conservative treatment using a knee rating score based mainly on range of motion and ambulation, concluding that arthroscopic treatment was more effective [20]. Zhang reached a similar result in favor of arthroscopy using HSS as the assessment tool [19]. The conventional methods in both studies were mainly medical treatments. However, Singh’s study suggested that PRP injection had an edge over arthroscopic surgery according to the WOMAC scores, particularly in patients with Kellgren-Lawrence grade 2 OA [56]. In the other studies, no significant difference between arthroscopy and conventional therapy was found in terms of functional recovery, regardless of the scales used.

Meta-analyses were performed based on the WOMAC (p = 0.38) and SF-36 (p = 0.74) assessments, with WOMAC preset as a primary outcome. No statistical significance was found in either synthesis, indicating similar therapeutic effectiveness of arthroscopy and conventional therapy in terms of functional recovery. The forest plots are shown in Fig. 4.

Fig. 4 Forest plots of pooled analyses of (A) WOMAC scores and (B) SF-36

Patients’ satisfaction

Only 2 studies reported patients’ satisfaction as categorical variables. Moseley did not draw any conclusions, while Zhang reported a significantly higher satisfaction rate in the arthroscopy group compared to medical therapy [19, 55]. However, the pooled data showed no significant difference between the treatment methods (p = 0.07). The forest plot is shown in Fig. 5.

Fig. 5 Pooled analysis of patient-reported satisfaction

Discussion

This systematic review and meta-analysis searched 3 databases and identified 10 RCTs comparing the therapeutic effectiveness of arthroscopic surgery and conventional management for OA. Although only one study ensured double-blinding and was evaluated as low risk of bias in the overall assessment, most studies used appropriate randomization methods. The results of data synthesis demonstrated that patients with OA did not benefit more from arthroscopy than from conventional management in terms of pain relief and functional improvement. Additionally, patient-reported satisfaction with arthroscopy did not show an advantage over non-surgical therapy.

A previous meta-analysis indicated that arthroscopic knee surgery provided little or no clinically important benefit in pain or function in the short or longer term, and probably provided no clinically important benefit in knee-specific or generic quality of life, nor did it improve treatment success. That study included trials comparing arthroscopic surgery with placebo surgery or non-surgical treatment in people with degenerative knee disease, with most participants having OA or meniscus injury [22]. Unlike that study, our review included only patients with OA, though concomitant meniscus injury was acceptable. To ensure that an OA diagnosis was a required condition, we excluded studies that included participants with Kellgren-Lawrence grade 0. Despite this more specific population, our study yielded similar results, concluding that arthroscopy should not be preferred over conservative treatment.

A recent systematic review reported that arthroscopic debridement is effective in mild to moderate knee OA at short-term follow-up, which is inconsistent with our results. The discrepancy may be due to different inclusion and exclusion criteria. That study included all types of arthroscopic joint debridement for treating OA, while we only included RCTs. As a result, our findings rest on a higher level of evidence, whereas that study offers a more comprehensive picture. In addition, the previous study did not conduct a quantitative meta-analysis, reducing the persuasiveness of its results [59].

Another study published in 2019 concluded that low-quality evidence from a few small trials indicated no benefit of arthroscopic surgery over other non-surgical treatments [60]. Although that study included participants similar to those in our review, with both studies excluding patients with Kellgren-Lawrence grade 0, the selection results differed. Kats’s study [25] was excluded from our review because its cohort was found to include patients with Kellgren-Lawrence grade 0, whereas the previous study mistakenly included it. In addition, the previous study examined more types of surgery, such as high tibial osteotomy, while our review focused solely on arthroscopy. Despite these differences, neither study supported an advantage of arthroscopic surgery over conservative treatments, with our study drawing this conclusion based on more evidence.

Karpinski’s review did not reach a clear conclusion, since quantitative analysis was not performed, and the authors cautioned that the results should be interpreted carefully because the use of other therapeutic variables such as painkillers or NSAIDs was not controlled in some RCTs [61]. That review also included participants with only meniscus injury. Furthermore, systematic reviews and meta-analyses published more than five years ago are now outdated given the publication of new RCTs in recent years [16, 21, 62].

This study has several limitations. First, the quantitative analyses were all based on limited data, and the small sample might reduce the accuracy of the results. The authors had expected numerous RCTs and data on this topic, but the search results were unexpectedly sparse. We cross-checked the search results against the included studies of some previously published systematic reviews and confirmed the accuracy of our search, as no records were missed. Second, different articles used different assessments to evaluate the same outcomes, potentially leading to bias. The authors carefully reviewed the methods of each study to ensure homogeneity of the pooled data. Third, the protocol for this study was revised once in PROSPERO. However, the revision occurred before the literature search began, so the prospective design and the scientific nature of the study were not compromised. Fourth, the present study included only RCTs with homogeneous populations, limiting the generalizability of the results.

Despite these limitations, the study has several strengths. First, as discussed, this study is unique, particularly in its population focus, and demonstrates that arthroscopy does not provide better treatment effectiveness than non-surgical treatment in patients with OA. Second, our conclusion is based entirely on RCTs, and the methodology used in this review is systematic, making this level 1 evidence. Third, quantitative analyses were performed whenever possible, providing credible results. Fourth, the study was registered on a public registry, and all work was conducted according to the preset protocol, enhancing the reliability of the final results.

Conclusion

The evidence does not support the effectiveness of arthroscopic knee surgery compared to conservative treatments in knee OA.