Introduction

Osteoarthritis of the first carpometacarpal (CMC 1) joint is an extremely common disease that has an age-adjusted prevalence of 7% for men and 15% for women [1]. CMC joint osteoarthritis can cause pain, deformity, limited range of motion, joint instability and weakness, all of which can lead to functional disability, most notably in postmenopausal women and the elderly population [2]. The Eaton-Littler classification system has traditionally been used to radiographically stage CMC osteoarthritis from I to IV based on a true lateral radiograph of the joint [3]. Although the disease is graded in this manner, treatment is largely guided by the patient’s pain, functional limitations and desired outcomes.

At present, there is an array of non-surgical and surgical interventions available, of which the latter is reserved as a last resort. The overall goal of treatment, in either case, is to relieve pain, improve thumb motion and provide joint stability [4]. Non-surgical treatments include activity modification, oral pain relief medication, splints, physiotherapy and corticosteroid injections [5]. Surgical interventions are indicated when symptoms have not stabilised or been controlled despite conservative therapy; these include extension osteotomy, CMC arthroscopy with debridement, trapeziectomy alone (TRAP), trapeziectomy with ligament reconstruction and tendon interposition (LRTI), trapeziectomy with tightrope suspensionplasty, arthrodesis and joint replacement (JR) [2, 6].

One of the challenges of managing CMC 1 joint osteoarthritis is the lack of guidance on which surgical intervention is more appropriate for a given clinical scenario [6]. Moreover, due to the lack of consensus over which treatment is superior, the treatment for CMC 1 joint osteoarthritis has often been guided by surgeon preference [7]. A survey of hand surgeons in the USA found that 95% of surgeons perform only one type of surgical procedure for this condition, of which 93% utilise LRTI [8]. Similarly, LRTI was the first-choice procedure for the majority of hand surgeons in Europe except in Belgium and France, where JR was the most common choice of treatment [9].

A previous systematic review by Wajon et al. in 2015 found that there is no evidence that any single technique is superior in terms of pain and physical function; however, it was noted that the studies included were “not of high enough quality to provide conclusive evidence that the compared techniques provided equivalent outcomes” [10]. A more up-to-date review by Lee et al. in 2021 compared JR exclusively with LRTI and reported a superior clinical outcome for JR [11].

The present review aims to update the literature by examining only comparative studies, in order to provide guidance on which technique is superior between different types of TRAP and JR procedures in terms of both functional and adverse outcomes.

Methods

Search strategy

The protocol for this review was prospectively registered on PROSPERO (registration ID CRD42020189728). The search strategy is provided in “Appendix A”. MEDLINE, Embase and Web of Science were systematically searched for eligible studies on 8 August 2020. Articles were searched and selected in accordance with the Preferred Reporting Items for Systematic Reviews and Meta-analyses (PRISMA) criteria [12]. References from all eligible articles were screened, relevant orthopaedic guidelines were read, and experts in the field of orthopaedics were consulted.

Articles identified from the database searches were screened by title and then by abstract by three authors (SR, RC and SR). Thereafter, the full manuscript of the final articles was assessed against eligibility criteria by two independent authors. Any dispute was discussed by all authors and settled by a consensus. Data from eligible articles were inputted into a pre-defined, piloted spreadsheet that was reviewed by an additional author (SS).

Eligible studies

All original research studies that compared functional outcomes and/or complications between trapeziectomy and joint replacement for the treatment of osteoarthritis of the first carpometacarpal joint were eligible for inclusion. Studies in any language were included, provided that an English translation was available at the time of search. Only studies of living human participants published after the year 2000 were included to reflect modern practice. Studies involving any other type of degenerative joint disease or arthritis that affected the first carpometacarpal joint were excluded. All cadaveric, biomechanical or non-human studies were also excluded.

Eligible participants

Eligible participants were male or female adult patients, over the age of 18, with primary osteoarthritis undergoing treatment with either trapeziectomy or joint replacement for curative intent in the primary setting, i.e. excluding those who require revision surgery.

Eligible interventions and comparators

The eligible intervention was joint replacement of the carpometacarpal joint, regardless of the material used to replace the carpometacarpal joint, to treat osteoarthritis of the first carpometacarpal joint.

The eligible comparator was trapeziectomy to treat osteoarthritis of the first carpometacarpal joint. This included simple trapeziectomy, trapeziectomy with tendon interposition (TI), trapeziectomy with ligament reconstruction (LR), trapeziectomy with tendon interposition and ligament reconstruction (LRTI) and resection-suspension arthroplasty (RSA).

Outcome measures

The primary outcomes were functional outcomes, which included the Disabilities of the Arm, Shoulder and Hand (DASH) score, the QuickDASH (QDASH) score, pain rating via the Visual Analogue Scale (VAS), tip pinch strength, key pinch strength, grip strength and Kapandji score. The DASH score is derived from self-reported responses to a 30-item questionnaire that was developed to measure a patient’s degree of upper-limb impairment and disability [13]. Alternatively, there is a shortened 11-item questionnaire known as the QDASH score, which is also commonly used [14]. The VAS score is a single-item continuous scale that serves as a measure of pain intensity [15]. Finally, key pinch strength, grip strength and the Kapandji score are commonly used measures of hand strength and mobility [16]. Secondary outcome measures comprised adverse outcomes, such as revision surgery rate, failure rate, dislocation rate, loosening rate and total complication rate.

Assessment of risk of bias

The risk of bias assessment was carried out based on the type of study. The ROBINS-I tool was used for non-randomised comparative studies, and the Cochrane Risk of Bias 2.0 tool was used for the one randomised controlled trial (RCT) included in this review [17, 18]. The quality of our effect estimates was assessed using the GRADE rating system [19].

Data analysis

The intervention and comparator were compared via a narrative synthesis. All quantitative data for functional outcomes and complications that were available in the form of means, medians and ranges have been presented in separate tables and figures. Continuous variables were measured by the mean or median with standard deviation or interquartile range; categorical variables were measured by percentages.

A quantitative meta-analysis was also carried out to compare functional outcomes and complications between the intervention and comparator using the Review Manager (RevMan) software. The final follow-up times were pooled when conducting the meta-analysis. A random effects model was used as no fixed effects were assumed. When applicable, mean differences and odds ratios were calculated with 95% confidence intervals. Studies that contained data with disparate or incomparable outcomes were not included in the meta-analysis; instead, these were discussed in the narrative synthesis. In particular, studies that did not report standard deviations were excluded from the meta-analysis for QDASH, pain (VAS) and key pinch strength. Finally, possible explanations for the findings and an overall summary are presented in the discussion and conclusion sections, respectively.
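To illustrate what random-effects pooling involves, the following is a minimal sketch of inverse-variance pooling with a between-study variance term. It assumes the DerSimonian-Laird estimator, which is a common choice for this kind of analysis; the source does not state which estimator was applied. The function name `pool_random_effects` and the input values are hypothetical and are not data from the included studies.

```python
import math

def pool_random_effects(effects, std_errors):
    """Pool per-study effects using DerSimonian-Laird random-effects weights.

    Assumption: this estimator is used for illustration only; it is not
    confirmed as the method behind the reported results.
    """
    # Fixed-effect (inverse-variance) weights are needed first to estimate tau^2.
    w = [1.0 / se ** 2 for se in std_errors]
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sum(w)
    # Cochran's Q: weighted squared deviations from the fixed-effect estimate.
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    df = len(effects) - 1
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    # Between-study variance, truncated at zero.
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0
    # Random-effects weights add tau^2 to each study's own variance.
    w_re = [1.0 / (se ** 2 + tau2) for se in std_errors]
    pooled = sum(wi * yi for wi, yi in zip(w_re, effects)) / sum(w_re)
    se_pooled = math.sqrt(1.0 / sum(w_re))
    ci = (pooled - 1.96 * se_pooled, pooled + 1.96 * se_pooled)
    return pooled, ci, tau2

# Hypothetical mean differences (JR minus TRAP) and their standard errors.
effects = [-6.0, -3.5, -5.0]
std_errors = [2.0, 1.5, 2.5]
pooled, (ci_low, ci_high), tau2 = pool_random_effects(effects, std_errors)
print(f"pooled MD = {pooled:.2f}, 95% CI {ci_low:.2f} to {ci_high:.2f}, tau^2 = {tau2:.2f}")
```

When the studies agree closely, the estimated between-study variance falls to zero and the result coincides with a fixed-effect analysis; when they disagree, the weights become more equal and the confidence interval widens.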

Results

Study selection

In total, 1909 studies were identified through database searching. After removal of duplicates and abstract screening, 27 articles were assessed for eligibility by the inclusion criteria. From these 27 studies, 13 were excluded, resulting in 14 eligible studies [20,21,22,23,24,25,26,27,28,29,30,31,32,33]. In accordance with the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines, a flow diagram for the results of the study selection procedure is shown in Fig. 1. The PRISMA checklist has been included as “Appendix B”.

Fig. 1 PRISMA flow chart of studies identified, screened and included

Study characteristics

Studies comparing joint replacement and trapeziectomy were assessed in this systematic review (SR). Study characteristics are shown in Table 1. The types of JR in the included studies were Ivory, Elektra, ARPE, De la Caffiniere, Roseland and MAIA, of which two were cemented, eight were uncemented, and four were unspecified. These were compared with different types of trapeziectomy, including LRTI, resection arthroplasty (RA) or RSA, tendon interposition (TI), simple trapeziectomy and trapeziectomy with or without ligamentoplasty (TRAP ± ligamentoplasty).

Table 1 Baseline characteristics of all included studies

All five studies [21,22,23,24,25] that compared JR with LRTI alone used the Burton-Pellegrini technique. Four studies adopted RA techniques, of which three [20, 26, 28] used RSA and one [27] used Lundsborg’s RA. The studies utilising TI adopted flexor carpi radialis (FCR) TI [29] and abductor pollicis longus (APL) TI [30]. Of the three remaining studies, one [31] used simple trapeziectomy, one [32] used LRTI as per the Burton-Pellegrini technique or trapeziectomy, and one [33] used trapeziectomy with or without Sigfuson-Lundborg ligamentoplasty.

Only one study [22] in this SR was a randomised controlled trial; five [21, 26, 29, 30, 32] were prospective cohort studies and eight [20, 23,24,25, 27, 28, 31, 33] were retrospective cohort studies. The recruitment period ranged from 1995 to 2016, and all studies were published after the year 2000. This resulted in a total of 1,005 patients (mean age 59.2 years), of whom 521 had a joint replacement (mean follow-up 45.5 months) and 484 had a type of trapeziectomy procedure (mean follow-up 48.2 months).

Functional outcomes

DASH

Five studies [20, 26,27,28, 32] reported postoperative DASH outcomes (Table 2). Only one study [27] that compared the uncemented Ivory JR with Lundsborg’s RA reported a statistically significant difference (p < 0.05).

Table 2 Functional outcomes of included studies

QDASH

Six studies [21,22,23,24, 31, 33] reported postoperative QDASH scores. Four studies [21,22,23,24] that compared JR with LRTI were eligible for meta-analysis, which detected a significant mean difference between JR and TRAP in favour of JR (mean difference −4.86; 95% CI −7.57 to −2.15, p = 0.0004) (Fig. 2).

Fig. 2 Meta-analysis of QDASH scores
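The relationship between a pooled mean difference, its 95% confidence interval and its p-value can be checked with a short calculation. The sketch below assumes a normal (z-test) approximation, in which the standard error is recovered from the width of the interval; the helper name `p_from_ci` is hypothetical. The inputs are the QDASH values reported above.

```python
import math

def p_from_ci(estimate, lower, upper, z_crit=1.959964):
    """Recover the z statistic and two-sided p-value from a 95% CI.

    Assumption: the interval was built as estimate +/- z_crit * SE
    under a normal approximation.
    """
    se = (upper - lower) / (2 * z_crit)
    z = estimate / se
    # Two-sided tail probability of the standard normal distribution.
    p = math.erfc(abs(z) / math.sqrt(2))
    return z, p

# Reported QDASH result: mean difference -4.86, 95% CI -7.57 to -2.15.
z, p = p_from_ci(-4.86, -7.57, -2.15)
print(f"z = {z:.2f}, p = {p:.4f}")
```

Under these assumptions, the recovered p-value rounds to 0.0004, which agrees with the value reported for the QDASH meta-analysis.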

Pain (VAS)

Eleven studies [20, 21, 23, 24, 26,27,28,29,30, 32, 33] reported postoperative pain (VAS) (Table 2). Five studies [20, 21, 23, 24, 32] were included in the meta-analysis, which revealed a non-significant difference between JR and TRAP procedures (mean difference −0.49; 95% CI −1.27 to 0.28, p = 0.21) (Fig. 3). One subgroup showed lower pain scores in favour of the Ivory JR compared to RSA [20] (mean difference −2.00; 95% CI −3.95 to −0.05, p = 0.04) (Fig. 3).

Fig. 3 Meta-analysis of pain (VAS) scores

Tip pinch strength

Four studies [22, 28,29,30] reported postoperative tip pinch scores, of which only one [30] reported significantly better tip pinch strength in the uncemented Elektra JR group compared to the APL TI group (p < 0.05) (Table 2).

Key pinch strength

Eight studies [21,22,23, 25, 28,29,30, 33] reported key pinch strength. Three [21, 23, 25] of these studies were eligible for meta-analysis, all of which compared JR with LRTI. The meta-analysis showed significantly better postoperative key pinch strength in favour of JR (mean difference 0.95; 95% CI 0.36 to 1.53, p = 0.001) (Fig. 4).

Fig. 4 Meta-analysis of key pinch (kg)

Grip strength

Of the six studies [21, 22, 28,29,30, 33] comparing postoperative grip strength, only one [30] that compared the uncemented Elektra JR with APL TI showed a significantly better grip strength for the JR group (p < 0.05) (Table 2).

Kapandji score

Of the five studies [21,22,23, 28, 29] that reported Kapandji scores, two [21, 23] reported a non-significant difference in scores between the uncemented Ivory JR and LRTI (p = 0.929) and between the uncemented ARPE JR and LRTI (p = 0.32) (Table 2); the other three studies [22, 28, 29] did not report p-values.

Adverse outcomes

Failure

Three studies [26, 30, 32] reported on failure (Table 3). One study [26] reported a failure rate as high as 72% for the uncemented Elektra JR and 0% for the RSA group. Another study [30] reported a failure rate of 2.8% for the uncemented Elektra JR and 0% for the APL TI. One study [32] that compared cemented De La Caffiniere JR with LRTI and TRAP reported failure rates of 0%, 0% and 4.5%, respectively.

Table 3 Adverse outcomes (revision surgery, failure, dislocation, loosening and total complication rate)

Dislocation

Dislocation rate, which is an outcome that is only applicable to JR, was reported in eight studies [20,21,22,23, 26, 29, 31, 33] (Table 3). Dislocation rates of 2.4–13.8% were reported for the Ivory JR [20, 21], 3.4–15% for the uncemented Elektra JR [22, 26], 9.6–9.7% for the uncemented ARPE JR [23, 31] and 11.1% for ball-and-socket arthroplasty [33] (Table 3). Only one study [29] reported a dislocation rate of 0% for the MAIA JR.

Loosening

As with dislocation, loosening is only applicable to JR procedures. Seven studies [21, 22, 26, 27, 29,30,31] reported loosening rates, with the highest rate of 58.6% reported for the uncemented Elektra JR [26] (Table 3). Two other studies reported on the Elektra JR, noting loosening rates of 0–10% [22, 30]. Two studies [21, 27] reported loosening rates of 1.2–2.6% for the uncemented Ivory JR. One study [29] reported a loosening rate of 4.3% for the MAIA JR, and another study [31] reported a loosening rate of 0% for the uncemented ARPE JR.

Total complication rate

Total complication rates were available for 12 studies [20,21,22,23, 26,27,28,29,30,31,32,33], and the meta-analysis revealed that, overall, JR was associated with a significantly greater complication rate when compared with TRAP (OR 2.12; 95% CI 1.13 to 3.96, p = 0.02) (Fig. 5). However, sub-group analysis found that only the JR versus RA group [20, 26,27,28] had significantly greater odds of complications (OR 3.95; 95% CI 1.29 to 12.09, p = 0.02) (Fig. 5).

Fig. 5 Meta-analysis of total complication rates
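For readers less familiar with odds ratios, the sketch below shows how an odds ratio and its 95% confidence interval can be derived from the event counts of a single two-arm study. It assumes the standard log-odds (Woolf) interval and a 0.5 continuity correction when any cell is zero, which is a common convention but is not confirmed as the approach used here. The function name `odds_ratio` and the counts are hypothetical.

```python
import math

def odds_ratio(events_a, total_a, events_b, total_b, z_crit=1.959964):
    """Odds ratio of arm A versus arm B with a log-scale 95% CI.

    Assumption: Woolf's method, with 0.5 added to every cell if any
    cell is zero (illustrative convention only).
    """
    a, b = events_a, total_a - events_a   # arm A: events, non-events
    c, d = events_b, total_b - events_b   # arm B: events, non-events
    if 0 in (a, b, c, d):
        a, b, c, d = (x + 0.5 for x in (a, b, c, d))
    or_value = (a * d) / (b * c)
    se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(or_value) - z_crit * se_log)
    upper = math.exp(math.log(or_value) + z_crit * se_log)
    return or_value, lower, upper

# Hypothetical counts: 12 of 50 JR patients and 5 of 50 TRAP patients
# with at least one complication.
or_value, lower, upper = odds_ratio(12, 50, 5, 50)
print(f"OR = {or_value:.2f}, 95% CI {lower:.2f} to {upper:.2f}")
```

With JR as the first arm, an odds ratio above 1 indicates higher odds of the event in the JR group. Because the interval is constructed on the log scale, the point estimate is the geometric mean of the two limits; the reported overall result is consistent with this, since the square root of 1.13 × 3.96 is approximately 2.12.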

Revision surgery rate

Revision surgery rates were reported in nine studies [20,21,22,23, 26, 27, 31,32,33] (Table 3), all of which were eligible for meta-analysis (Fig. 6). Overall, the meta-analysis found that TRAP procedures had significantly lower revision surgery rates compared with JR (OR 5.14; 95% CI 2.06 to 12.81, p = 0.0004) (Fig. 6). The only sub-group with a significant difference in odds of revision surgery was the JR versus RA group [20, 26, 27] (OR 14.87; 95% CI 2.69 to 82.10, p = 0.002) (Fig. 6).

Fig. 6 Meta-analysis of revision surgery rates

Quality assessment

The one RCT [22] in this review was assessed via the Cochrane Risk of Bias Tool 2.0 and was found to have some concerns regarding bias. The 13 non-randomised comparative studies [20, 21, 23,24,25,26,27,28,29,30,31,32,33] were assessed for bias via the ROBINS-I tool; two studies [24, 33] were found to have serious risk of bias, and the remaining 11 studies [20, 21, 23, 25,26,27,28,29,30,31,32] were deemed to have moderate risk of bias (Table 4).

Table 4 Risk of bias for non-randomised and randomised comparative studies using the ROBINS-I tool and the RoB 2.0 tool, respectively

GRADE analysis of the studies included in the meta-analyses revealed a very low rating for QDASH and pain (VAS) scores, a low rating for key pinch strength and total complication rate, and a moderate rating for revision surgery rate (Table 5).

Table 5 Quality of evidence of each outcome as assessed by the GRADE system

Discussion

Summary of findings

This systematic review investigated functional and adverse outcomes between JR and TRAP procedures. It was found that treatment with JR led to significantly better QDASH scores and key pinch strength, but with comparable pain (VAS) scores. However, JR was associated with greater odds of complications and requirement of revision surgery.

Previous systematic reviews

At the time of writing, this present review is the largest systematic review with a meta-analysis that directly compares functional and adverse outcomes between joint replacement and trapeziectomy.

In 2015, a review was published by Wajon et al. [10] that compared functional outcomes between an array of surgical techniques used to treat CMC 1 OA. However, this review included only 670 participants and only one JR technique, the Swanson implant. Wajon et al. also published a review in 2017 that was later retracted [34].

Moreover, another review that was published by Huang et al. in 2015 [35] compared 19 different types of joint replacements and found that “no single implant can be recommended” and that “many implants should only be used with great caution if at all”. A more recent systematic review was published by Remy et al. [36] in 2020 that also compared different types of joint replacements. This review noted favourable short-term outcomes relating to pain and improved function that is stable over time with a limited positive effect on joint strength and high rates of failure.

Liu et al. [37] compared simple trapeziectomy with LRTI and found that the latter technique led to superior tip and grip strength at one-year follow-up but did not find a difference between the techniques with regard to pain, key pinch and DASH. A meta-analysis conducted in 2021 by Lee et al. [11] reported that JR has a superior clinical outcome compared to LRTI with better DASH scores as well as improved pinch power along with comparable pain and complications.

This present review adds to the literature by providing a direct comparison of JR with other TRAP techniques, such as simple TRAP and RA, and by highlighting the importance of counselling patients regarding the greater risk of complications and greater odds of requiring revision surgery when undergoing JR procedures.

Functional outcomes

In this review, studies reported postoperative functional outcomes, ranging from subjective measures such as DASH, QDASH and pain (VAS) to objective measures, including tip pinch, key pinch, grip strength and Kapandji scores. No single study reported all of these outcomes, and there was marked heterogeneity in the number of functional outcomes reported per study, ranging from as few as one outcome [25, 31] to as many as six outcomes [28] (Table 2).

This is likely due to the lack of standardised reporting outcome measures for studies on CMC 1 osteoarthritis. This is supported by a recent review that found 33 unique outcomes and 25 unique outcome measures reported across 97 studies on this topic [38]. This, along with our findings, highlights the need for a core outcome set (COS), which would include standardised outcomes that need to be reported as a minimum in all studies on CMC 1 joint osteoarthritis. This would add to the quality of evidence that would contribute to higher-quality reviews and clinical guidelines on the management of CMC 1 osteoarthritis in the future.

Moreover, it was not possible to carry out a meta-analysis for DASH, tip pinch strength, grip strength and Kapandji scores; however, if future studies standardised their outcomes, future reviews would be able to perform a meta-analysis and report on functional outcomes holistically.

Of the functional outcomes that underwent meta-analysis, better functional outcomes, namely QDASH and key pinch, were associated with JR. This is similar to the review by Remy et al. [36], which found that JR is associated with a rapid gain of postoperative function.

Additionally, both Huang et al. [35] and Remy et al. [36] noted good pain relief, but neither review compared JR with TRAP. Only the JR versus RA sub-group, which comprised one study [20], found significantly lower pain (VAS) scores in favour of JR. However, this present review is the first to highlight comparable overall pain (VAS) scores between JR and TRAP techniques.

It should be noted that the studies in this review had an overall mean follow-up time of 45.5 months and 48.2 months for JR and TRAP procedures, respectively, with only two studies [24, 26] having mean follow-up periods of greater than 10 years. Hence, studies with longer follow-up are required to understand the long-term functional outcomes of both procedures.

Adverse outcomes

Despite the better functional outcomes associated with JR, this review found a greater risk of complications with JR than with TRAP, which is in keeping with the literature [35, 36]. Loosening and dislocation, in particular, can be attributed to errors in the positioning of implants and the shape or bone quality of the trapezium [36]. We recommend that patients are carefully counselled regarding the risk of complications and revision surgeries when undergoing treatment with JR.

In terms of direct comparison between JR and TRAP, although there was an overall greater number of complications and revision surgeries for JR, only the JR versus RA sub-group showed a statistically significant difference, as seen in Figs. 5 and 6. This indicates that RA is associated with fewer adverse outcomes than JR, making it a potentially safer option in terms of adverse outcomes.

In addition to providing superior functional outcomes, JR techniques also need to provide comparable complication rates to TRAP procedures to justify their use in treating CMC 1 osteoarthritis [35]. Some of the JRs that have shown some promise in this review include Ivory, Elektra and ARPE.

The Ivory JR demonstrated a variation in the odds of complications and revision surgery [20, 21, 27] as seen in Fig. 5, with high rates of complications reported. This could be explained by the fact that the Ivory JR modifies the movement of tendons, which can result in De Quervain syndrome [20]. However, since the Ivory JR has demonstrated promising functional outcomes, such as favourable pain [21] and DASH scores [27] as seen in Figs. 2 and 3, we speculate this could provide good long-term functional outcomes but with a varying rate of complications as studies have shown.

Additionally, although the Elektra JR showed a significant improvement in functional parameters in one study [30], two other studies [22, 26] that investigated the Elektra JR did not report a significant improvement in functional outcomes. This, along with a high failure and revision rate [26] as well as the variation in the odds of complication (Fig. 5), led us to conclude that the Elektra JR is unlikely to be a suitable alternative to TRAP according to this review.

Finally, we found that the ARPE JR was similar to LRTI in terms of complications and revision surgery (Figs. 5, 6) and even noted significantly better key pinch strength [23] and QDASH scores [31] when compared to LRTI and TRAP, respectively, indicating that the ARPE JR could be a safe alternative to TRAP. However, more robust comparative studies involving this technique are required.

It is also worth noting that a range of prostheses were not covered by this review, as only studies that directly compared JR with TRAP techniques were included.

Strengths and limitations

The strengths of this review include prospective registration of the study protocol, an up-to-date search of the literature and a meta-analysis to compare functional and adverse outcomes when feasible. However, there are notable limitations. The majority of the included studies are non-randomised with either moderate (11 studies) or serious (two studies) risk of bias. The one RCT included also has “some concerns” based on its risk of bias assessment. The GRADE ratings of the outcomes included in the meta-analyses comprised two “very low” ratings, two “low” ratings and one “moderate” rating, partly due to the large number of observational studies, which are susceptible to selection bias.

Another obvious limitation is the comparison of only two techniques, thus excluding alternative treatments such as arthrodesis and spacers. Additionally, some of the studies included in this review utilised older models of JRs, such as Elektra and De la Caffiniere, which are not reflective of the prostheses used currently. For example, the Ivory, Elektra and ARPE prostheses have shown good promise in this review, and therefore, the possibility of improved outcomes with newer prostheses should be considered.

Moreover, no subgroup analysis of the JR arm of this review was carried out, because the numerous types of JR included, together with the small number of studies of each type, precluded a meaningful subgroup analysis. Finally, the meta-analyses are limited by the lack of robust RCTs that compare these two techniques, and thus, it is not currently possible to reach a definitive conclusion on which technique is superior overall.

Conclusion

Overall, based on very low- to moderate-quality evidence, there is potential for improved personalised care when choosing between TRAP and JR procedures based on the patient’s desired outcomes. We advise that patients need to be counselled on the benefits and risks of both procedures, with JR treatments resulting in better function with lower QDASH scores (very low quality of evidence), improved key pinch strength (low quality of evidence) and comparable pain (VAS) scores (very low quality of evidence).

If opting for JR, patients need to be aware of the greater risk of complications (low quality of evidence) and the greater odds of requiring revision surgery (moderate quality of evidence) when compared to TRAP techniques. Ultimately, the choice of treatment should be made in conjunction with patients who are well-informed about the benefits and risks of both procedures.

Additionally, we believe that more robust studies that compare JR and TRAP with standardised outcome measures and long-term follow-up are required in order to strengthen the quality of evidence available.