Trapeziectomy versus joint replacement for first carpometacarpal (CMC 1) joint osteoarthritis: a systematic review and meta-analysis

Purpose: This systematic review and meta-analysis directly compares joint replacement (JR) and trapeziectomy (TRAP) techniques to provide an update as to which surgical intervention is superior for first carpometacarpal (CMC 1) joint osteoarthritis.
Methods: In August 2020, MEDLINE, Embase and Web of Science were searched for eligible studies that compared these two techniques for the treatment of CMC 1 joint osteoarthritis (PROSPERO registration ID: CRD42020189728). Primary outcomes included the Disabilities of the Arm, Shoulder and Hand (DASH), QuickDASH (QDASH) and pain visual analogue scale (VAS) scores. Secondary outcomes, such as total complication, dislocation and revision surgery rates, were also measured.
Results: From 1909 studies identified, 14 studies (1005 patients) were eligible. Our meta-analysis found that postoperative QDASH scores were lower for patients in the JR group (five studies, p = 0.0004). Similarly, significantly better postoperative key pinch strength in favour of JR was noted (three studies, p = 0.001). However, pain (VAS) scores were similar between the two groups (five studies, p = 0.21). Moreover, JR techniques had significantly greater odds of overall complications (12 studies; OR 2.12; 95% CI 1.13–3.96, p = 0.02) and significantly greater odds of revision surgery (9 studies; OR 5.14; 95% CI 2.06–12.81, p = 0.0004).
Conclusion: Overall, based on very low- to moderate-quality evidence, JR treatments may result in better function with less disability with comparable pain (VAS) scores; however, JR has greater odds of complications and greater odds of requiring revision surgery. More robust RCTs that compare JR and TRAP with standardised outcome measures and long-term follow-up would add to the overall quality of evidence.


Introduction
Osteoarthritis of the first carpometacarpal (CMC 1) joint is an extremely common disease that has an age-adjusted prevalence of 7% for men and 15% for women [1]. CMC joint osteoarthritis can cause pain, deformity, limited range of motion, joint instability and weakness, all of which can lead to functional disability, most notably in postmenopausal women and the elderly population [2]. The Eaton-Littler classification system has traditionally been used to radiographically stage CMC osteoarthritis from I to IV based on a true lateral radiograph of the joint [3]. Although the disease is graded in this manner, treatment is largely guided by the patient's pain, functional limitations and desired outcomes.
At present, there are an array of non-surgical and surgical interventions available, of which the latter is reserved as a last resort. The overall goal of treatment, in either case, is to relieve pain, improve thumb motion and provide joint stability [4]. Non-surgical treatments include activity modification, oral pain relief medication, splints, physiotherapy and corticosteroid injections [5]. Surgical interventions are indicated when symptoms have not stabilised or been controlled despite conservative therapy; these include extension osteotomy, CMC arthroscopy with debridement, trapeziectomy alone (TRAP), trapeziectomy with ligament reconstruction and tendon interposition (LRTI), trapeziectomy with tightrope suspensionplasty, arthrodesis and joint replacement (JR) [2,6].
One of the challenges of managing CMC 1 joint osteoarthritis is the lack of guidance on which surgical intervention is more appropriate for a given clinical scenario [6]. Moreover, due to the lack of consensus over which treatment is superior, the treatment for CMC 1 joint osteoarthritis has often been guided by surgeon preference [7]. A survey of hand surgeons in the USA found that 95% of surgeons perform only one type of surgical procedure for this condition, of which 93% utilise LRTI [8]. Similarly, LRTI was the first-choice procedure for the majority of hand surgeons in Europe except in Belgium and France, where JR was the most common choice of treatment [9].
A previous systematic review by Wajon et al. in 2015 found no evidence that any single technique is superior in terms of pain and physical function; however, it was noted that the studies included were "not of high enough quality to provide conclusive evidence that the compared techniques provided equivalent outcomes" [10]. A more up-to-date review by Lee et al. in 2021 compared JR exclusively with LRTI and reported a superior clinical outcome for JR [11].
This present review aims to provide an update on the current literature by exclusively investigating comparative studies to provide guidance on which technique is superior between different types of TRAP and JR procedures in terms of both functional and adverse outcomes.

Search strategy
The protocol for this review was prospectively published on PROSPERO (registration ID CRD42020189728). The search strategy has been provided ("Appendix A"). MEDLINE, Embase and Web of Science were systematically searched for eligible studies on 8 August 2020. All articles were searched and selected according to the Preferred Reporting Items for Systematic Reviews and Meta-analyses (PRISMA) criteria [12]. References from all eligible articles were screened, relevant orthopaedic guidelines were read, and experts in the field of orthopaedics were consulted.
Articles identified from the database searches were screened by title and then by abstract by three authors (SR, RC and SR). Thereafter, the full manuscript of the final articles was assessed against eligibility criteria by two independent authors. Any dispute was discussed by all authors and settled by a consensus. Data from eligible articles were inputted into a pre-defined, piloted spreadsheet that was reviewed by an additional author (SS).

Eligible studies
All original research studies that compared functional outcomes and/or complications between trapeziectomy and joint replacement for the treatment of osteoarthritis of the first carpometacarpal joint were eligible for inclusion. Additionally, studies of any language were included, provided that an English translation was available at the time of search. Only studies involving living human participants after the year 2000 were included to reflect modern practice. Studies involving any other type of degenerative joint disease or arthritis that affected the first carpometacarpal joint were excluded. All cadaveric, biomechanical or non-human studies were also excluded.

Eligible participants
Eligible participants were male or female adult patients, over the age of 18, with primary osteoarthritis undergoing treatment with either trapeziectomy or joint replacement for curative intent in the primary setting, i.e. excluding those who require revision surgery.

Eligible interventions and comparators
The eligible intervention was joint replacement of the carpometacarpal joint, regardless of the material used to replace the carpometacarpal joint, to treat osteoarthritis of the first carpometacarpal joint.
The eligible comparator was trapeziectomy to treat osteoarthritis of the first carpometacarpal joint. This included simple trapeziectomy, trapeziectomy with tendon interposition (TI), trapeziectomy with ligament reconstruction (LR), trapeziectomy with tendon interposition and ligament reconstruction (LRTI) and resection-suspension arthroplasty (RSA).

Outcome measures
The primary outcomes were functional outcomes, which included the Disabilities of the Arm, Shoulder and Hand (DASH) score, the QuickDASH (QDASH) score, pain rating via the Visual Analogue Scale (VAS), tip pinch strength, key pinch strength, grip strength and Kapandji score. The DASH score is derived from self-reported responses to a 30-item questionnaire that was developed to measure a patient's degree of upper-limb impairment and disability [13]. Alternatively, there is a shortened 11-item questionnaire known as the QDASH score, which is also commonly used [14]. The VAS score is a single-item continuous scale that serves as a measure of pain intensity [15]. Finally, key pinch strength, grip strength and Kapandji scores are also commonly used to measure hand strength and mobility [16]. Secondary outcome measures comprised adverse outcomes, such as revision surgery rate, failure rate, dislocation rate, loosening rate and total complication rate.
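To make the 0 to 100 scale of the DASH family concrete, the following is a minimal sketch of the commonly published scoring rule, in which the mean item response (1 to 5) is shifted and rescaled. The function name, the `max_missing` parameter and the example responses are illustrative assumptions and are not taken from the included studies; the allowance of up to three unanswered items for the 30-item DASH reflects the scoring instructions as generally described.

```python
from typing import Optional, Sequence


def dash_family_score(responses: Sequence[Optional[int]],
                      max_missing: int) -> Optional[float]:
    """Return a 0-100 disability score, or None if it cannot be scored.

    Each answered item is an integer from 1 (no difficulty) to 5 (unable);
    unanswered items are passed as None. `max_missing` is the number of
    unanswered items tolerated (an assumption supplied by the caller).
    """
    answered = [r for r in responses if r is not None]
    if any(r < 1 or r > 5 for r in answered):
        raise ValueError("item responses must be between 1 and 5")
    missing = len(responses) - len(answered)
    if not answered or missing > max_missing:
        return None
    mean_response = sum(answered) / len(answered)
    # Shift the 1-5 mean to 0-4, then rescale to 0-100.
    return (mean_response - 1) * 25


# Hypothetical patient answering "moderate difficulty" (3) on all 30 DASH items.
print(dash_family_score([3] * 30, max_missing=3))
```

Higher scores indicate greater disability, which is why lower postoperative scores are read as a better functional result. The same formula applies to the shortened questionnaire, with its own item count and missing-item allowance.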

Assessment of risk of bias
The risk of bias assessment was carried out based on the type of study. The ROBINS-I tool was used for non-randomised comparative studies, and the Cochrane Risk of Bias 2.0 tool was used for the one randomised controlled trial (RCT) included in this review [17,18]. The quality of our effect estimates was assessed using the GRADE rating system [19].

Data analysis
The intervention and comparator were compared via a narrative synthesis. All quantitative data for functional outcomes and complications that were available in the form of means, medians and ranges have been presented in separate tables and figures. Continuous variables were measured by the mean or median with standard deviation or interquartile range; categorical variables were measured by percentages.
A quantitative meta-analysis was also carried out to compare functional outcomes and complications between the intervention and comparator using the Review Manager (RevMan) software. The final follow-up times were pooled when conducting the meta-analysis. A random-effects model was used because a single common (fixed) effect across studies was not assumed. When applicable, mean differences and odds ratios were calculated with confidence intervals provided. Studies that contained data with disparate or incomparable outcomes were not included in the meta-analysis; instead, these were discussed in the narrative synthesis. In particular, studies that did not report standard deviations were excluded from the meta-analysis for QDASH, pain (VAS) and key pinch strength. Finally, a discussion of possible explanations and an overall summation have been presented in the discussion and conclusion sections, respectively.
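As an illustration of what random-effects pooling of odds ratios involves, the sketch below uses the DerSimonian-Laird inverse-variance estimator, which is one common choice. It is an assumption that this resembles the analysis performed, since the exact estimator and settings used in RevMan are not stated here. The function names and the event counts are hypothetical and are not data from the included studies.

```python
import math
from typing import Sequence, Tuple


def log_odds_ratio(events_a: float, n_a: float,
                   events_b: float, n_b: float) -> Tuple[float, float]:
    """Log odds ratio (group A vs group B) and its variance for one study."""
    a, b = events_a, n_a - events_a
    c, d = events_b, n_b - events_b
    if 0 in (a, b, c, d):
        # Continuity correction so that empty cells do not break the logarithm.
        a, b, c, d = a + 0.5, b + 0.5, c + 0.5, d + 0.5
    return math.log((a * d) / (b * c)), 1 / a + 1 / b + 1 / c + 1 / d


def pool_random_effects(effects: Sequence[float],
                        variances: Sequence[float]) -> Tuple[float, float, float]:
    """DerSimonian-Laird pooling; returns (pooled effect, standard error, tau^2)."""
    w = [1 / v for v in variances]
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sum(w)
    # Cochran's Q measures how far studies scatter around the fixed-effect mean.
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    df = len(effects) - 1
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0
    # Between-study variance tau^2 is added to each study's own variance.
    w_star = [1 / (v + tau2) for v in variances]
    pooled = sum(wi * yi for wi, yi in zip(w_star, effects)) / sum(w_star)
    return pooled, math.sqrt(1 / sum(w_star)), tau2


# Hypothetical studies: (complications JR, n JR, complications TRAP, n TRAP).
hypothetical_studies = [(12, 50, 6, 50), (9, 40, 5, 40), (15, 60, 4, 60)]
effects, variances = zip(*(log_odds_ratio(*s) for s in hypothetical_studies))
pooled, se, tau2 = pool_random_effects(effects, variances)
low, high = pooled - 1.96 * se, pooled + 1.96 * se
print(f"pooled OR {math.exp(pooled):.2f} "
      f"(95% CI {math.exp(low):.2f} to {math.exp(high):.2f}), tau^2 {tau2:.3f}")
```

When the studies agree closely, the estimated between-study variance falls to zero and the result coincides with a fixed-effect inverse-variance analysis; otherwise the confidence interval widens to reflect heterogeneity.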

Study selection
In total, 1909 studies were identified through database searching. After removal of duplicates and abstract screening, 27 articles were assessed for eligibility against the inclusion criteria. From these 27 studies, 13 were excluded, resulting in 14 eligible studies [20-33]. In accordance with the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines, a flow diagram for the results of the study selection procedure is shown in Fig. 1. The PRISMA checklist has been included as "Appendix B".

Study characteristics
Studies comparing joint replacement and trapeziectomy were assessed in this systematic review (SR). Study characteristics are shown in Table 1. The types of JR in the included studies were Ivory, Elektra, ARPE, De la Caffiniere, Roseland and MAIA, of which two were cemented, eight were uncemented, and four were unspecified. These were compared with different types of trapeziectomy, including LRTI, resection arthroplasty (RA) or RSA, tendon interposition (TI), simple trapeziectomy and trapeziectomy with or without ligamentoplasty (TRAP ± ligamentoplasty).
Only one study [22] in this SR was a randomised controlled trial; five [21,26,29,30,32] were prospective cohort studies and eight [20, 23-25, 27, 28, 31, 33] were retrospective cohort studies. The recruitment period ranged from 1995 to 2016, and all studies were published after the year 2000. This resulted in a total of 1005 patients (mean age 59.2 years), of whom 521 had a joint replacement (mean follow-up 45.5 months) and 484 had a type of trapeziectomy procedure (mean follow-up 48.2 months).

Summary of findings
This systematic review investigated functional and adverse outcomes between JR and TRAP procedures. It was found that treatment with JR led to significantly better QDASH scores and key pinch strength, but with comparable pain (VAS) scores.

Previous systematic reviews
At the time of writing, this present review is the largest systematic review with a meta-analysis that directly compares functional and adverse outcomes between joint replacement and trapeziectomy. In 2015, a review was published by Wajon et al. [10] that compared functional outcomes between an array of surgical techniques used to treat CMC 1 osteoarthritis. However, this review included only 670 participants and only one JR technique, the Swanson implant. Wajon et al. also published a review in 2017 that was later retracted [34].
Moreover, another review that was published by Huang et al. in 2015 [35] compared 19 different types of joint replacements and found that "no single implant can be recommended" and that "many implants should only be used with great caution if at all". A more recent systematic review was published by Remy et al. [36] in 2020 that also compared different types of joint replacements. This review noted favourable short-term outcomes relating to pain and improved function that is stable over time with a limited positive effect on joint strength and high rates of failure.
Liu et al. [37] compared simple trapeziectomy with LRTI and found that the latter technique led to superior tip and grip strength at one-year follow-up but did not find a difference between the techniques with regard to pain, key pinch and DASH. A meta-analysis conducted in 2021 by Lee et al. [11] reported that JR has a superior clinical outcome compared to LRTI with better DASH scores as well as improved pinch power along with comparable pain and complications.
This present review adds to the literature by providing a direct comparison of JR with other TRAP techniques, such as simple TRAP and RA, and by highlighting the importance of counselling patients regarding the greater risk of complications and greater odds of requiring revision surgery when undergoing JR procedures.

Functional outcomes
In this review, studies reported postoperative functional outcomes, ranging from subjective measures such as DASH, QDASH and pain (VAS) to objective measures, including tip pinch, key pinch, grip strength and Kapandji scores. No single study reported all of these outcomes, and there was marked heterogeneity in the number of functional outcomes reported per study, ranging from as few as one outcome [25,31] to as many as six outcomes [28] (Table 3). This is likely due to the lack of standardised reporting outcome measures for studies on CMC 1 osteoarthritis. This is supported by a recent review that found 33 unique outcomes and 25 unique outcome measures reported across 97 studies on this topic [38]. This, along with our findings, highlights the need for a core outcome set (COS), which would include standardised outcomes that need to be reported as a minimum in all studies on CMC 1 joint osteoarthritis. This would add to the quality of evidence that would contribute to higher-quality reviews and clinical guidelines on the management of CMC 1 osteoarthritis in the future. Moreover, it was not possible to carry out a meta-analysis for DASH, tip pinch strength, grip strength and Kapandji scores; however, if future studies standardised outcomes, future reviews would be able to perform a meta-analysis and report on functional outcomes holistically.
Of the functional outcomes that underwent meta-analysis, better functional outcomes, namely QDASH and key pinch, were associated with JR. This is similar to the review by Remy et al. [36], which found that JR is associated with a rapid gain of postoperative function.
Additionally, both Huang et al. [35] and Remy et al. [36] noted good pain relief, but neither review compared JR with TRAP. Only the JR versus RA sub-group, which comprised one study [20], found significantly lower pain (VAS) scores in favour of JR. However, this present review is the first to highlight comparable overall pain (VAS) scores between JR and TRAP techniques.
It should be noted that the studies in this review had an overall mean follow-up time of 45.5 months and 48.2 months for JR and TRAP procedures, respectively, with only two studies [24,26] having mean follow-up periods of greater than 10 years. Hence, studies with longer follow-up are required to understand the long-term functional outcomes of both procedures.

Adverse outcomes
Despite better functional outcomes associated with JR, there is greater inherent risk of complications that can occur in JR compared to TRAP as noted in this review, which is in keeping with the literature [35,36]. Loosening and dislocation, in particular, can be attributed to errors in the positioning of implants and the shape or bone quality of the trapezium [36]. We recommend that patients are carefully counselled regarding the risk of complications and revision surgeries when undergoing treatment with JR.
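For readers who wish to check how a pooled odds ratio, its confidence interval and its p-value relate to one another, the sketch below back-calculates an approximate standard error and two-sided p-value from the pooled estimates quoted in the abstract (OR 2.12, 95% CI 1.13 to 3.96 for complications; OR 5.14, 95% CI 2.06 to 12.81 for revision surgery). It assumes the interval is symmetric on the log scale and based on a normal approximation, which is an assumption about the analysis rather than a documented detail; the function name is illustrative.

```python
import math
from typing import Tuple


def p_from_ratio_ci(estimate: float, lower: float, upper: float,
                    z_crit: float = 1.959964) -> Tuple[float, float, float]:
    """Approximate (standard error, z, two-sided p) for a ratio and its 95% CI."""
    # On the log scale the CI half-width equals z_crit standard errors.
    se = (math.log(upper) - math.log(lower)) / (2 * z_crit)
    z = math.log(estimate) / se
    # Two-sided p-value under a standard normal: 2 * (1 - Phi(|z|)).
    p = math.erfc(abs(z) / math.sqrt(2))
    return se, z, p


for label, (or_, lo, hi) in {
    "overall complications": (2.12, 1.13, 3.96),
    "revision surgery": (5.14, 2.06, 12.81),
}.items():
    se, z, p = p_from_ratio_ci(or_, lo, hi)
    print(f"{label}: SE(log OR) {se:.3f}, z {z:.2f}, p {p:.4f}")
```

Under these assumptions the back-calculated values should land close to the reported p = 0.02 and p = 0.0004. The width of the revision-surgery interval also shows how imprecise that estimate is, despite its statistical significance.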
In terms of direct comparison between JR and TRAP, although there was an overall greater number of complications and revision surgeries for JR, only the JR versus RA sub-group showed a statistically significant difference, as seen in Figs. 5 and 6. This indicates that RA is associated with fewer adverse outcomes than JR, making it a potentially safer option in terms of adverse outcomes.
In addition to providing superior functional outcomes, JR techniques also need to provide complication rates comparable to those of TRAP procedures to justify their use in treating CMC 1 osteoarthritis [35]. Some of the JRs that have shown some promise in this review include Ivory, Elektra and ARPE.
The Ivory JR demonstrated a variation in the odds of complications and revision surgery [20,21,27] as seen in Fig. 5, with high rates of complications reported. This could be explained by the fact that the Ivory JR modifies the movement of tendons, which can result in De Quervain syndrome [20]. However, since the Ivory JR has demonstrated promising functional outcomes, such as favourable pain [21] and DASH scores [27] as seen in Figs. 2 and 3, we speculate this could provide good long-term functional outcomes but with a varying rate of complications as studies have shown.
Additionally, although the Elektra JR showed a significant improvement in functional parameters in one study [30], two other studies [22,26] that investigated the Elektra JR did not report a significant improvement in functional outcomes. This, along with a high failure and revision rate [26] as well as the variation in the odds of complication (Fig. 5), led us to conclude that the Elektra JR is unlikely to be a suitable alternative to TRAP according to this review.
Finally, we found that the ARPE JR was similar to LRTI in terms of complications and revision surgery (Figs. 5,6) and even noted significantly better key pinch strength [23] and DASH scores [31] when compared to LRTI and TRAP, respectively, indicating that the ARPE JR could be a safe alternative to TRAP. However, more robust comparative studies involving this technique are required.
It is also worth noting that there is a range of different prostheses that were not included in this review as only studies that directly compared JR with TRAP techniques were included.

Strengths and limitations
The strengths of this review include prospective registration of the study protocol, an up-to-date search of the literature and a meta-analysis to compare functional and adverse outcomes when feasible. However, there are notable limitations. The majority of the included studies are non-randomised with either moderate (nine studies) or serious (two studies) risk of bias. The one RCT included also has "some concerns" based on its risk of bias assessment. The GRADE rating of the studies that were included in the meta-analysis included two "very low" ratings, two "low" ratings and one "moderate" rating, partly due to the large number of observational studies, which are susceptible to selection bias.
Another obvious limitation is the comparison of only two techniques, thus excluding alternative treatments such as arthrodesis and spacers. Additionally, some of the studies included in this review utilised older models of JRs, such as Elektra and De la Caffiniere, which are not reflective of the prostheses used currently. For example, the Ivory, Elektra and ARPE prostheses have shown good promise in this review, and therefore, the possibility of improved outcomes with newer prostheses should be considered.
Moreover, no subgroup analysis of the JR arm of this review was carried out because of the numerous types of JR included and the insufficient number of studies for each type, which precluded a meaningful subgroup analysis. Finally, the meta-analyses are limited by the lack of robust RCTs that compare these two techniques, and thus, it is not currently possible to reach a definitive conclusion on which technique is superior overall.

Conclusion
Overall, based on very low- to moderate-quality evidence, there is potential for improved personalised care when choosing between TRAP and JR procedures based on the patient's desired outcomes. We advise that patients need to be counselled on the benefits and risks of both procedures, with JR treatments resulting in better function with lower QDASH scores (very low quality of evidence), improved key pinch strength (low quality of evidence) and comparable pain (VAS) scores (very low quality of evidence).
If opting for JR, patients need to be aware of the greater risk of complications (low quality of evidence) and the greater odds of requiring revision surgery (moderate quality of evidence) when compared to TRAP techniques. Ultimately, the choice of treatment should be made in conjunction with patients who are well-informed about the benefits and risks of both procedures.
Additionally, we believe that more robust studies that compare JR and TRAP with standardised outcome measures and long-term follow-up are required in order to strengthen the quality of evidence available.

Databases and criteria
MEDLINE, Embase and Web of Science will be searched for eligible studies in August 2020. We will limit the search to studies from the year 2000 onwards to reflect modern practice on this topic. All article search and selection will be carried out based on the Preferred Reporting Items for Systematic Reviews and Meta-analyses (PRISMA) criteria. We will also be consulting experts and reviewing references from eligible articles on relevant orthopaedic guidelines.

Selection process
8 Specify the methods used to decide whether a study met the inclusion criteria of the review, including how many reviewers screened each record and each report retrieved, whether they worked independently, and if applicable, details of automation tools used in the process. Reported on: 2-3

Data collection process
9 Specify the methods used to collect data from reports, including how many reviewers collected data from each report, whether they worked independently, any processes for obtaining or confirming data from study investigators, and if applicable, details of automation tools used in the process. Reported on: 2-3

Data items
10a List and define all outcomes for which data were sought. Specify whether all results that were compatible with each outcome domain in each study were sought (e.g. for all measures, time points, analyses), and if not, the methods used to decide which results to collect. Reported on: 3
10b List and define all other variables for which data were sought (e.g. participant and intervention characteristics, funding sources). Describe any assumptions made about any missing or unclear information. Reported on: 2-3

Study risk of bias assessment
11 Specify the methods used to assess risk of bias in the included studies, including details of the tool(s) used, how many reviewers assessed each study and whether they worked independently, and if applicable, details of automation tools used in the process. Reported on: 3

Effect measures
12 Specify for each outcome the effect measure(s) (e.g. risk ratio, mean difference) used in the synthesis or presentation of results. Reported on: 3

Synthesis methods
13a Describe the processes used to decide which studies were eligible for each synthesis (e.g. tabulating the study intervention characteristics and comparing against the planned groups for each synthesis (item #5)). Reported on: 3-4
13b Describe any methods required to prepare the data for presentation or synthesis, such as handling of missing summary statistics, or data conversions. Reported on: 3-4
13c Describe any methods used to tabulate or visually display results of individual studies and syntheses. Reported on: 3
13d Describe any methods used to synthesize results and provide a rationale for the choice(s). If meta-analysis was performed, describe the model(s), method(s) to identify the presence and extent of statistical heterogeneity, and software package(s) used. Reported on: 4
13e Describe any methods used to explore possible causes of heterogeneity among study results (e.g. subgroup analysis, meta-regression). Reported on: 4
13f Describe any sensitivity analyses conducted to assess robustness of the synthesized results. Reported on: 4

Reporting bias assessment
14 Describe any methods used to assess risk of bias due to missing results in a synthesis (arising from reporting biases). Reported on: 3

Certainty assessment
15 Describe any methods used to assess certainty (or confidence) in the body of evidence for an outcome. Reported on: 4

Study selection
16a Describe the results of the search and selection process, from the number of records identified in the search to the number of studies included in the review, ideally using a flow diagram. Reported on: 4
16b Cite studies that might appear to meet the inclusion criteria, but which were excluded, and explain why they were excluded. Reported on: 4

Study characteristics
17 Cite each included study and present its characteristics. Reported on: 4-5

Risk of bias in studies
18 Present assessments of risk of bias for each included study. Reported on: 8

Results of individual studies
19 For all outcomes, present, for each study: (a) summary statistics for each group (where appropriate) and (b) an effect estimate and its precision (e.g. confidence/credible interval), ideally using structured tables or plots. Reported on: 4-8

Results of syntheses
20a For each synthesis, briefly summarise the characteristics and risk of bias among contributing studies. Reported on: 8
20b Present results of all statistical syntheses conducted. If meta-analysis was done, present for each the summary estimate and its precision (e.g. confidence/credible interval) and measures of statistical heterogeneity. If comparing groups, describe the direction of the effect. Reported on: 5-8
20c Present results of all investigations of possible causes of heterogeneity among study results. Reported on: 5-8
20d Present results of all sensitivity analyses conducted to assess the robustness of the synthesized results. Reported on: 4-8

Reporting biases
21 Present assessments of risk of bias due to missing results (arising from reporting biases) for each synthesis assessed. Reported on: 8

Certainty of evidence
22 Present assessments of certainty (or confidence) in the body of evidence for each outcome assessed. Reported on: 5-8

Discussion
23a Provide a general interpretation of the results in the context of other evidence. Reported on: 8-11
23b Discuss any limitations of the evidence included in the review. Reported on: 11
23c Discuss any limitations of the review processes used. Reported on: 11
23d Discuss implications of the results for practice, policy, and future research. Reported on: 8-12

OTHER INFORMATION
Registration and protocol
24a Provide registration information for the review, including register name and registration number, or state that the review was not registered. Reported on: 2
24b Indicate where the review protocol can be accessed, or state that a protocol was not prepared. Reported on: 2
24c Describe and explain any amendments to information provided at registration or in the protocol.

Availability of data, code and other materials
27 Report which of the following are publicly available and where they can be found: template data collection forms; data extracted from included studies; data used for all analyses; analytic code; any other materials used in the review.

For more information, visit: http://www.prisma-statement.org/