Introduction

Knee osteoarthritis (KOA) is a degenerative disease of the joint cartilage, with subchondral bone lesions and synovial inflammation as the main manifestations [1]. The clinical symptoms include pain, joint stiffness, and functional impairment. The prevalence of symptomatic KOA in China is 8.1%, with a higher proportion in women than men and significant geographical differences [2]. In Traditional Chinese Medicine (TCM), the pathological basis of KOA is a deficiency of the liver and kidney and invasion of wind, cold, and dampness. The knee joint becomes enlarged, flexion and extension become difficult, and mobility is restricted. The main pathological changes include articular cartilage damage, subchondral bone hardening or cystic changes, osteophyte formation at joint margins, apparent synovial lesions, joint capsule contracture, ligament loosening or contracture, and muscle atrophy and weakness [3]. This disease mainly occurs in middle-aged and elderly patients and belongs to the categories of “paralysis,” “bone paralysis,” “tendon paralysis,” “bone impotence,” and “tendon impotence” in TCM. The clinical manifestations include morning stiffness, unstable walking, pain, and functional impairment.

Pharmacotherapy, physical therapy, rehabilitation therapy, acupuncture, massage, surgery (including total knee replacement), and other methods are all viable options for managing KOA. Nevertheless, long-term use of medication carries a risk of multiple adverse effects, including hypertension, renal toxicity, gastrointestinal impairment, congestive heart failure, and cardiovascular events [4]. Physical therapy, among other limitations, is not appropriate for end-stage patients who require surgical intervention. Early-stage KOA is treated nonoperatively, and surgery is not needed [5]. It is therefore critical to identify a viable nonsurgical intervention that can effectively mitigate symptoms in patients diagnosed with KOA. A systematic review of therapeutic exercise for KOA suggests that patients may experience significant improvements in physiological function and overall quality of life, as well as reduced joint pain [6].

The holistic concept of TCM emphasizes the unity of body and spirit: only when the two are in harmony can an organism maintain vitality and vigor, and their separation indicates the end of life. Traditional Chinese exercise (TCE) is guided by the holistic concept of TCM, the theory of five elements and yin-yang, and the view of meridians and zang-fu organs [1, 7]. Drawing on ancient Chinese philosophy, it has gradually formed a unique system that combines movement and stillness, dredges meridians, regulates qi and blood, focuses on strengthening, nourishing, and controlling the body, and enhances the body to prevent diseases. Studies have shown Taijiquan effectively treats KOA [8], improves mental health, increases life satisfaction, and promotes [9, 10]. In addition to physical exercise, TCE pays more attention to psychological and spiritual adjustment, as well as the intervention of emotions and spirit, appropriate work and rest, and other factors that influence disease development to enhance the body’s defense ability, smooth the flow of qi and blood, harmonize zang-fu organs, and improve organ function, thus playing a role in disease prevention and treatment [11]. The commonly used TCE methods include Taiji, Yijinjing, Baduanjin, Wuqinxi, and Qigong.

The number of systematic reviews (SRs) and meta-analyses (MAs) on TCE interventions for KOA has been increasing in recent years. However, the quality of the literature and evidence needs to be determined. This study used AMSTAR-2 and PRISMA 2020 to assess the methodological and reporting quality and GRADE to evaluate the evidence quality to objectively reflect the current status of the evidence-based evaluation of TCE. The aims were to systematically and critically assess SRs/MAs of TCE interventions for KOA and provide a reference for future evaluation studies of TCE and the development of evidence-based guidelines.

Materials and methods

Design and registration

All analyses were based on previously published data. Therefore, no ethical approval or patient consent was required. The methodology of the overview of systematic reviews (SRs) followed the Cochrane Handbook [12] and the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guideline. This study was also conducted and reported under the guidance of the checklist of Preferred Reporting Items for the Overview of Systematic Reviews (PRIO-harms) [13].

Eligibility criteria

Inclusion criteria

(1) Types of studies

Eligible studies were systematic reviews and meta-analyses (SRs/MAs) of TCE for knee osteoarthritis based on randomized controlled trials (RCTs), published in any language. The RCT is the gold standard for evaluating treatments, and random assignment is essential in clinical research: under randomization, each individual has an equal chance of being assigned to the experimental or control group.

(2) Types of participants

Participants were patients who met the diagnostic standards of the American College of Rheumatology, the Chinese Medical Association Orthopedic Branch, or the domestic industry norm for KOA in Western or Chinese medicine, regardless of gender, age, or location.

(3) Types of interventions

The treatment group received traditional Chinese exercise methods, such as Taiji, Baduanjin, Wuqinxi, Yijinjing, or Qigong; the control group received conventional exercise, conventional care, health education, blank control, or additional therapies that were not part of the treatment group's intervention.

(4) Types of outcomes

Each review had to report at least one outcome indicator, such as pain, stiffness, physiological function score, quality of life, or safety.

Exclusion criteria

(1) Duplicate publications; (2) reviews, animal studies, case reports, conference papers, abstracts, books, comments; (3) study participants that included patients with hip osteoarthritis; (4) co-interventions of other complementary and alternative therapies in addition to TCE methods (e.g., massage, acupuncture, herbal therapy, moxibustion, transcutaneous electrical nerve stimulation, cupping, gua sha, bath therapy); (5) protocols for SRs/MAs; (6) network meta-analyses; (7) insufficient information for data extraction; (8) full text not available.

Search strategy

The following eight databases were searched from their inception to January 3, 2023: China National Knowledge Infrastructure (CNKI), Wanfang, Chinese Scientific Journal Database (VIP), China Biology Medical Literature Database (CBM), PubMed, Embase, Web of Science, and Cochrane Library, without restrictions on publication date or language. Reference lists of included studies were also reviewed to identify any relevant papers missed in the search. Additionally, trial registries and relevant grey literature were manually searched, and experts in related fields were consulted. Two reviewers conducted the literature search independently. The search terms included Taiji, Tai Chi, Tai Ji, Taichi, T'ai Ji, T'ai Chi, Taichiquan, Taijiquan, T'ai Ji Quan, T'ai Chi Chuan, Baduanjin, Eight-Section Brocade, Yijinjing, Classic of Changing Tendon, Yi-Gin-Ching, Wuqinxi, Five Animals Exercise, Qigong, Traditional Chinese Exercise, Traditional Chinese Medicine Exercise, Remedial Exercise, Therapeutic Exercise; Osteoarthrit*, Knee Osteoarthritis, Gonarthritis, KOA, Osteoarthritis Knee, Degenerative Arthritis; Meta-Analysis, Meta-Analyses, Data Pooling, Systematic Review. The detailed search strategy for the Web of Science database is shown in Additional file 1: Appendix 1.

Study selection and data extraction

Data source and eligibility

EndNote X9 software was utilized to screen the literature. Duplicates were removed using the software and manual examination. Titles and abstracts were read carefully, and those not meeting the inclusion criteria were excluded. The complete manuscripts of the remaining studies were downloaded and reviewed thoroughly, and studies not matching the inclusion criteria in interventions, outcomes, or participants were eliminated. Studies with insufficient information were also excluded. The final studies that were included were determined after discussion and analysis.

Data extraction

Two reviewers extracted data independently using a predesigned form according to the inclusion and exclusion criteria and cross-checked with each other. Any disagreements were resolved through discussion with a third reviewer making the final decision. The critical information of included studies was summarized in a table, including first author, year of publication, number of studies (articles), sample size (participants), interventions in treatment and control groups, risk of bias assessment tool, outcome indicators, and main findings.

Methodological quality assessment


Two independent reviewers assessed the methodological quality of included SRs/MAs using the AMSTAR-2 tool [14]. Any disagreements were resolved through discussion with a third reviewer making the final decision. Each item was judged as “yes,” “partial yes,” or “no.” “Yes” means the study fully addressed and substantiated the issue raised in the item, while “partial yes” indicates only part of the issue was addressed. “No” means the study did not substantiate or incorrectly examine the issue raised in the item due to insufficient information or absence of data.

Based on the criticality of each item and the evaluation results, the methodological quality of each SR/MA was categorized as high, moderate, low, or critically low, defined as follows. High quality: no or only one non-critical weakness. Moderate quality: more than one non-critical weakness. Low quality: one critical flaw, with or without non-critical weaknesses. Critically low quality: more than one critical flaw, with or without non-critical weaknesses.
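The rating rule above can be sketched as a small function. This is only an illustrative sketch: the function name and its two integer inputs (counts of critical flaws and non-critical weaknesses per review) are assumptions made for the example and are not part of the AMSTAR-2 tool or of this study's workflow.

```python
def amstar2_overall_rating(critical_flaws: int, noncritical_weaknesses: int) -> str:
    """Map flaw counts to the four confidence categories described in the text.

    Hypothetical helper: it assumes each review has already been reduced to
    two counts by the reviewers.
    """
    if critical_flaws < 0 or noncritical_weaknesses < 0:
        raise ValueError("counts must be non-negative")
    if critical_flaws > 1:
        # More than one critical flaw, regardless of non-critical weaknesses
        return "critically low"
    if critical_flaws == 1:
        # Exactly one critical flaw, regardless of non-critical weaknesses
        return "low"
    if noncritical_weaknesses > 1:
        # No critical flaw, more than one non-critical weakness
        return "moderate"
    # No critical flaw and at most one non-critical weakness
    return "high"


if __name__ == "__main__":
    for flaws, weaknesses in [(0, 1), (0, 3), (1, 0), (4, 2)]:
        print(flaws, weaknesses, amstar2_overall_rating(flaws, weaknesses))
```

Under this rule, a review with several critical flaws is rated “critically low” no matter how well it performs on the non-critical items.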

Assessment of reporting quality

Two independent reviewers assessed the reporting quality of SRs/MAs using the PRISMA 2020 checklist [15, 16]. Any disagreements were resolved through discussion with a third reviewer making the final decision. Each item was judged as “fully reported,” “partially reported,” or “not reported” based on compliance with the reporting requirements [17].

Assessment of the risk of bias 

Two independent reviewers assessed the risk of bias for SRs/MAs using the Risk of Bias in Systematic Reviews (ROBIS) tool [18]. The ROBIS tool evaluates the level of bias present in a systematic review. It covers three phases: (1) assessing relevance (optional according to the situation); (2) identifying concerns with the review process (study eligibility criteria, identification and selection of studies, data collection and study appraisal, synthesis, and findings); (3) judging the risk of bias. The results were rated as “high risk,” “low risk,” or “unclear risk.”

Assessment of evidence quality

Two independent reviewers assessed the quality of clinical evidence for SRs/MAs using the GRADE approach [19,20,21]. Any disagreements were resolved through discussion with a third reviewer making the final decision. The main reasons for downgrading evidence quality include limitations in study design, inconsistency of results, indirectness of evidence, imprecision, and publication bias. Evidence with no downgrade was considered high quality; with one downgrade, moderate; with two downgrades, low; and with three or more downgrades, very low.
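The downgrading rule described above can be illustrated with a short sketch. The function name, the domain labels, and the convention of passing a mapping from domain to number of levels downgraded are assumptions introduced for this example; they are not taken from the GRADE software or from the reviewers' actual working files.

```python
from typing import Mapping

# Quality levels ordered from zero downgrades to three or more downgrades
GRADE_LEVELS = ("high", "moderate", "low", "very low")

# The five downgrading domains named in the text (labels are illustrative)
DOMAINS = (
    "limitations",
    "inconsistency",
    "indirectness",
    "imprecision",
    "publication bias",
)


def grade_quality(downgrades: Mapping[str, int]) -> str:
    """Return the evidence quality for one outcome.

    `downgrades` maps a domain name to the number of levels downgraded
    for that domain; domains that are not listed count as zero.
    """
    unknown = set(downgrades) - set(DOMAINS)
    if unknown:
        raise ValueError(f"unknown domains: {sorted(unknown)}")
    if any(levels < 0 for levels in downgrades.values()):
        raise ValueError("downgrade levels must be non-negative")
    total = sum(downgrades.values())
    # Three or more downgrades all map to the lowest level
    return GRADE_LEVELS[min(total, 3)]


if __name__ == "__main__":
    print(grade_quality({"limitations": 1, "imprecision": 1}))
```

For instance, an outcome downgraded once for limitations and once for imprecision would be rated “low” under this rule.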

Data synthesis

A narrative synthesis approach was utilized. Outcome data were presented as per the original SRs/MAs, and no additional re-analysis of data was conducted. Data were extracted and plotted using WPS 2022 to generate tables. A descriptive analysis was performed to present the literature quality, evidence quality, and main findings of the included studies.

Results

Literature search and selection

The initial search yielded 650 records. After removing 216 duplicates using EndNote X9, 391 records remained. Screening titles and abstracts resulted in the exclusion of 241 articles. After reading the full text of the remaining 50 articles, 23 were further excluded, including 3 that were unavailable in full text, 1 abstract from an academic conference, 5 containing interventions other than TCM methods, 8 featuring participants without knee osteoarthritis, 3 protocols for SRs/MAs, and 2 with insufficient information. Another 3 articles were excluded as they only conducted qualitative analysis. Reviewing all included studies generated a summary of clinical efficacy evaluation methodologies and criteria systems for TCM interventions in knee osteoarthritis. Finally, 18 studies were included. A comprehensive review framework was established after collecting, organizing, analyzing, and synthesizing relevant literature. The screening process is outlined in Fig. 1.

Fig. 1 Flow chart of literature screening

Characteristics of included reviews

A total of 18 SRs/MAs [22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39] were included, among which 10 [22,23,24,25,26,27,28,29,30,31] were in English and 8 [32,33,34,35,36,37,38,39] were in Chinese. Additionally, there were 17 journal articles [22,23,24,25,26,27,28,29,30,31,32, 34,35,36,37,38,39] and 1 dissertation [33]. The publication years spanned from 2013 to 2022. Fifteen studies [22, 24,25,26,27,28,29,30, 32, 33, 35,36,37,38,39] focused on Taiji, 6 [27, 30,31,32,33,34] involved Baduanjin, 2 [32, 33] included Yijinjing, and 3 [23, 32, 33] examined Wuqinxi. Fifteen studies [23, 25,26,27, 29,30,31,32,33,34,35,36,37,38,39] utilized the Cochrane risk of bias tool, 2 [24, 28] used the Jadad scale, and 1 [22] employed AMSTAR 2. The primary characteristics of the included reviews are presented in Table 1.

Table 1 Basic characteristics of included reviews

Methodological and reporting quality assessment

Methodological quality assessment

The AMSTAR-2 evaluation outcomes are shown in Table 2. The methodological quality of all 18 included SRs was rated as “critically low.” Based on analysis of the 7 critical items, the main flaws were the following: (1) item 2: only 16.67% (3/18) of studies provided registration information, while the remainder lacked registration and protocol-related content; (2) item 4: although 100% (18/18) of studies searched ≥ 2 databases, only 11.11% (2/18) performed additional searches such as checking reference lists or grey literature; (3) item 7: none of the 18 studies provided a list of excluded studies; (4) item 9: only 11.11% (2/18) of studies assessed risk of bias from randomization, blinding, and selective outcome reporting; (5) item 11: 55.56% (10/18) of studies did not use appropriate methods for conducting meta-analysis; (6) item 13: 27.78% (5/18) of studies did not examine the potential impact of risk of bias in included studies on the effect estimate; (7) item 15: 44.44% (8/18) of studies did not assess publication bias using funnel plots or statistical tests such as Egger’s test. Based on the above assessment, all the included reviews were rated as “critically low” in methodological quality.

Table 2 The evaluation results of methodological quality based on AMSTAR-2

Further analysis revealed that the average number of fully addressed items was 7.5 for English literature versus 6 for Chinese literature; the average number of non-addressed items was 7.3 in English literature compared to 8.125 in Chinese literature. This indicates differences between English and Chinese literature regarding the number of “yes” and “no” items.

Reporting quality of the included SRs/MAs

PRISMA 2020 was used to evaluate the reporting quality of the included studies. The results are presented in Table 3. The key reporting flaws were as follows: 77.78% (14/18) of the studies identified themselves as a systematic review; 83.33% (15/18) partially reported the Abstracts checklist; 94.44% (17/18) did not report using supplementary search techniques; 77.78% (14/18) provided incomplete search strategies, with only 11.11% (2/18) giving search strategies for all databases in the appendix; 100% (18/18) did not report funding sources; 100% (18/18) did not explain preprocessing (e.g., handling of missing summary statistics or data conversions) before data merging; 33.33% (6/18) did not discuss strategies to examine heterogeneity (e.g., subgroup analysis, meta-regression); 50% (9/18) did not mention methods to assess the stability of results (e.g., sensitivity analysis); 50% (9/18) did not evaluate inclusion of studies with publication bias; 83.33% (15/18) did not use the GRADE system to rate the quality of evidence; 38.89% (7/18) did not provide results of risk of bias assessment; 33.33% (6/18) did not describe the characteristics of any composite outcome and the potential for bias across studies; 16.67% (3/18) did not provide full statistically-based results; 50% (9/18) did not present findings of investigation into potential sources of study heterogeneity; 50% (9/18) lacked sensitivity analysis results; 44.44% (8/18) did not provide risk assessment of bias due to missing data (reporting bias); 83.33% (15/18) did not provide any supporting documentation for their evidence grading; 16.67% (3/18) did not analyze findings using additional data; 11.11% (2/18) did not discuss the limitations of the studies included in the systematic review; 16.67% (3/18) did not discuss limitations of the review process; 83.33% (15/18) did not mention registration or stated they were not registered; 83.33% (15/18) did not provide access to a protocol or stated there was none; 55.56% (10/18) did not describe the funding source or the funder’s role; in 55.56% (10/18), the authors conducting the systematic reviews did not declare any conflicts of interest; and in 66.67% (12/18), data came from sources not publicly available (e.g., data extraction form templates).

Table 3 The evaluation results of reporting quality based on PRISMA 2020

Further statistical analysis indicated disparities between “fully compliant” and “not compliant” items in English and Chinese literature. The percentage of “fully compliant” items was 54.76% for English literature versus 50.45% for Chinese literature, with means of 23 and 21.125 items, respectively. Regarding “non-compliant” items, the percentage was 35.24% for English literature and 38.51% for Chinese literature; the mean number was 14.8 for English versus 16.125 for Chinese literature. The difference in partially compliant items was minimal. Compared to Chinese literature, English literature had a higher rate of full compliance and a lower rate of non-compliance. Table 4 shows the reported status and percentage for English literature, while Table 5 shows the same for Chinese literature.
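As a worked check of how the language-specific means relate to the pooled figure of 399/756 “fully conforming” items cited in the Discussion, the sketch below recombines the two group means. It assumes 42 assessed items per review (inferred from 756 item judgments across 18 reviews) and uses the group sizes of 10 English and 8 Chinese reviews; the variable and function names are hypothetical.

```python
from fractions import Fraction

# Assumption: 756 item judgments / 18 reviews = 42 items assessed per review
ITEMS_PER_REVIEW = 42

# (number of reviews, mean number of fully compliant items per review)
groups = {
    "English": (10, Fraction("23")),
    "Chinese": (8, Fraction("21.125")),
}


def pooled_full_compliance(groups, items_per_review):
    """Combine per-group means into a pooled count and percentage."""
    n_reviews = sum(n for n, _ in groups.values())
    # Each group contributes (number of reviews) x (mean items per review)
    full_items = sum(n * mean for n, mean in groups.values())
    total_items = n_reviews * items_per_review
    percentage = float(full_items / total_items * 100)
    return full_items, total_items, percentage


full_items, total_items, percentage = pooled_full_compliance(groups, ITEMS_PER_REVIEW)
print(f"{full_items}/{total_items} fully compliant ({percentage:.2f}%)")
```

Exact fractions are used so that the mean of 21.125 is not subject to floating-point rounding before it is multiplied by the group size.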

Table 4 The reporting situation and proportion in English literature
Table 5 The reporting situation and proportion in Chinese literature

The risk of bias of the included SRs/MAs

The risk of bias was assessed using the ROBIS tool, which consists of four risk of bias domains (phase 2), three summary signalling questions (phase 3), and a final risk of bias judgment. Of the 18 included studies, 11 (11/18, 61.11%) were assessed to be at high risk. For “Inclusion Criteria (Domain 1),” 1 (1/18, 5.56%) study was at high risk. For “Retrieval and Screening (Domain 2),” 10 (10/18, 55.56%) studies were at high risk. For “Data Extraction (Domain 3),” 8 (8/18, 44.44%) studies failed to provide sufficient information for judgment and were designated as unclear risk, and 6 (8/18, 44.44%) studies were at high risk. For “Data Processing (Domain 4),” 16 (16/18, 88.89%) studies were at high risk. In phase 3, 12 (12/18, 66.67%) studies had none of the risks identified in phase 2 explained and addressed; thus, Q1 was “No.” Two (2/18, 11.11%) studies did not reasonably consider the relevance of the included studies to the research question of the systematic review; thus, Q2 was “No.” All 18 (18/18, 100.00%) studies avoided overemphasizing statistically different results; thus, Q3 was “Yes.” For details, see Table 6. To summarize, the main reasons for the high risk of bias were (i) failure to search the trial registry, no risk of bias detection, and possible reporting bias; (ii) failure to deal with inter-study heterogeneity; (iii) the stability of the results was unknown; and (iv) failure to elucidate the above limitations in the discussion section. Overall, the risk of bias of the nine SRs included was high, which may affect the results, and it is necessary to standardize the study methods to reduce the risk of bias.

Table 6 Risk of bias for the included SRs/MAs

Evidence quality of the included SRs/MAs

A total of 93 pieces of evidence were extracted, of which 46 (49.46%, 46/93) were very low quality, 34 (36.56%, 34/93) were low quality, 13 (13.98%, 13/93) were moderate quality, and none were high quality. All of the evidence (100%, 93/93) was downgraded due to limitations; 54.84% (51/93) was downgraded due to substantial heterogeneity (I² > 50%); 68.82% (64/93) was downgraded due to imprecision; and 13.98% (13/93) was downgraded due to publication bias. Seventy-six pieces of evidence (81.72%, 76/93) showed TCE was more effective than control, while 17 (18.28%, 17/93) revealed no statistically significant difference between the two groups. Table 7 outlines the GRADE downgrading process.
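The proportions in this paragraph follow directly from the counts. The sketch below, with hypothetical variable and function names, recomputes the distribution of evidence quality and of effect direction across the 93 pieces of evidence.

```python
def shares(counts):
    """Return each count as a percentage of the total, rounded to two decimals."""
    total = sum(counts.values())
    return {label: round(100 * n / total, 2) for label, n in counts.items()}


# Counts as reported in the text
grade_counts = {"very low": 46, "low": 34, "moderate": 13, "high": 0}
effect_counts = {"TCE more effective": 76, "no significant difference": 17}

grade_shares = shares(grade_counts)
effect_shares = shares(effect_counts)

print(sum(grade_counts.values()), grade_shares)
print(sum(effect_counts.values()), effect_shares)
```

Both tallies sum to 93, so every extracted piece of evidence is counted exactly once in each breakdown.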

Table 7 The assessment of GRADE

Safety of the included SRs/MAs

None of the SRs/MAs quantified the adverse effects of TCE on knee osteoarthritis. However, six articles [26, 31, 32, 34, 38, 39] indicated the safety of TCE for knee osteoarthritis. Therefore, the safety profile of TCE for knee osteoarthritis appears favorable.

Discussion

Mechanism of TCE in KOA treatment

In recent years, more attention has been paid to knee osteoarthritis due to population aging. Guidelines for TCM treatment of knee osteoarthritis [1, 2] recommend TCE for knee osteoarthritis. The 2019 American College of Rheumatology (ACR) guidelines [40] strongly recommend Taiji for knee osteoarthritis, indicating the worldwide use of Taiji exercises. As a form of physical exercise, Taiji has proven benefits for chronic joint conditions, especially in older adults. Yijinjing originates from an ancient Chinese health-cultivating approach, and research shows it significantly improves knee flexion and relieves joint pain [41]. Yijinjing integrates traditional Chinese theory with fitness walking, which can enhance the coordination and balance between the internal and external environment of the body, build muscle strength, improve muscle flexibility and endurance, and reduce ligament strains. Baduanjin [42] and Wuqinxi [43] are also performed with precise movement postures to promote blood circulation in joint areas, smooth the passage of qi and blood through meridians, soothe pain, and increase lower limb flexibility and suppleness, thus reducing pain and dysfunction in knee osteoarthritis patients. This study aims to examine the efficacy and influence of TCE on rehabilitating knee osteoarthritis patients of different ages, providing a scientific basis for developing tailored therapies. Patients may exercise less if isolated, especially after the onset of coronavirus disease 2019 (COVID-19). As a physical and mental exercise, TCE has distinct benefits and can be practiced at home during the COVID-19 pandemic. The findings indicate TCE improves functional impairment, pain, psychological status, quality of life, and other conditions in knee osteoarthritis patients. However, because the included studies were of poor quality, the evidence needs strengthening, and it cannot be concluded that TCE is superior to control or other treatments.

Summary of main results

This is the first review of SRs/MAs on the effectiveness and safety of TCE for knee osteoarthritis. The published SRs and MAs were assessed using AMSTAR 2, PRISMA 2020, and GRADE. Over 70% of all 11 [18,19,20,21,22, 25,26,27,28, 30, 33] SRs/MAs were adequately reported according to the PRISMA 2020 checklist. However, the evidence quality of the graded outcomes was generally low. Systematic, high-quality reviews can produce less biased, more scientific evidence for clinical practice and health decisions [44].

All SRs/MAs examined with AMSTAR 2 had at least one critical flaw, and the methodological quality of the 18 included publications was deemed “critically low.” The primary methodological quality issues were:

  • The lack of a pre-defined review protocol compromised the rigor of the systematic review.

  • No list of excluded studies was provided, which hindered the assessment of clinical heterogeneity.

  • Lack of supplementary searches of reference lists, trial registries, and grey literature.

  • There is no assessment of the overall effect of risk of bias in RCTs.

  • There is no investigation of potential sources of heterogeneity to help interpret meta-analysis results.

Among the 18 included papers, the items with the highest rates of “non-conformity” in PRISMA 2020 were item 13b (100%), item 24c (100%), item 13c (83.33%), item 15 (83.33%), item 22 (83.33%), item 24a (83.33%), and item 24b (83.33%). Therefore, it can be determined that the primary reporting flaws are the following:

  • Lack of description of preprocessing before data merging (missing data, transformation).

  • Lack of definition and explanation of discrepancies from registration information.

  • Lack of description of the method used to present findings in graphs or tables.

  • Lack of description of the method used to assess the quality of evidence for each outcome (e.g., using GRADE).

  • Lack of presentation of thresholds.

Differences in journal type, publication dates, and authorship levels can also lead to disparities in reporting standards among works by the same researcher. Regarding defining data extraction methods, interpreting findings in light of other evidence, stating funding sources, and other reporting aspects, the English literature was much more prescriptive than the Chinese literature. All included English journals explicitly referred to PRISMA reporting standards, while none of the Chinese journals did; this may significantly influence the disparate reporting quality between English and Chinese literature.

The top 3 “fully or partially conforming” items out of 18 were: item 3 (94.44%), item 10b (94.44%), item 11 (94.44%), item 16a (94.44%), item 23d (94.44%), item 2 (88.89%), item 5 (88.89%), item 7 (88.89%), item 10a (88.89%), item 16b (88.89%), item 17 (88.89%), and item 23b (88.89%). These items have been part of the PRISMA statement [45] and are becoming increasingly refined as reporting standards evolve.

The 18 studies included in this analysis were published between 2013 and 2021; the more recent the publication date, the better the quality of reporting. The evaluation results revealed the inadequate reporting quality of the included studies: only 52.78% (399/756) of items were “fully conforming,” which is below 75% [46], indicating that SRs/MAs of TCE for knee osteoarthritis lack standardization and have room for improvement.

Recommendations for future research

Due to the low quality of earlier studies, there is debate about whether TCE is more effective for knee osteoarthritis than controls or alternative therapies. Therefore, it is crucial to improve the methodological quality of randomized clinical studies, particularly blinding and allocation concealment, and to conduct high-quality, large-sample, multicenter controlled clinical trials. Future evidence-based reviews will also require continued focus on SR/MA methods and reporting quality to produce high-quality research and provide a sound basis for clinical decision-making.

Limitations

This study has several limitations: (1) the interventions of included studies, including Taiji, Baduanjin, Wuqinxi, Yijinjing, and Qigong, are complex and heterogeneous, and their effect values cannot be quantitatively combined for analysis; (2) due to database restrictions and subsequent bias, data from included studies may be missing; (3) the low quality of evidence in the original literature and methodological flaws of researchers conducting the systematic reviews may compromise the accuracy of re-evaluation; (4) subjective disputes between researchers over the evaluation process may influence the outcomes and conclusions of the assessment.

Conclusion

TCE therefore appears to be beneficial and safe for knee osteoarthritis. However, clinicians should apply these findings cautiously in practice due to the relatively low methodological and evidentiary quality of the included SRs/MAs.