Tenodesis yields better functional results than tenotomy in long head of the biceps tendon operations—a systematic review and meta-analysis

Background: Pathology of the long head of the biceps tendon (LHBT) is a common disorder affecting muscle function and causing considerable pain for the patient. The literature on the two surgical treatment methods (tenotomy and tenodesis) is controversial; therefore, our aim was to compare the results of these interventions.
Methods: We performed a meta-analysis using the following strategy: (P) patients with LHBT pathology, (I) tenodesis, (C) tenotomy, (O) elbow flexion and forearm supination strength, pain assessed on the ten-point Visual Analog Scale (VAS), bicipital cramping pain, Constant, ASES, and SST score, Popeye deformity, and operative time. We included only randomized clinical trials. We searched five databases. During statistical analysis, odds ratios (OR) and weighted mean differences (WMD) were calculated for dichotomous and continuous outcomes, respectively, using the Bayesian method with random effect model.
Results: We included 11 studies in the systematic review, nine of which were eligible for the meta-analysis, containing data on 572 patients (279 in the tenodesis and 293 in the tenotomy group). Our analysis showed that tenodesis is more beneficial considering 12-month elbow flexion strength (WMD: 3.67 kg; p = 0.006), 12-month forearm supination strength (WMD: 0.36 kg; p = 0.012), and 24-month Popeye deformity (OR: 0.19; p < 0.001), whereas tenotomy was associated with decreased 3-month pain scores on VAS (WMD: 0.99; p < 0.001). We did not find a significant difference in the other outcomes.
Conclusion: Tenodesis yields better results in terms of biceps function and is non-inferior regarding long-term pain, while tenotomy is associated with earlier pain relief.
Supplementary Information: The online version contains supplementary material available at 10.1007/s00264-022-05338-9.


Introduction
The biceps brachii muscle has a proven function in forearm supination and elbow flexion [1]. The separate role of the long head of the biceps tendon (LHBT) is still debated. Cadaver studies [2][3][4][5][6] suggest that the LHBT plays an essential role in the stability of the glenohumeral joint, while the results of in vivo studies are controversial [7][8][9].
Besides conservative therapy, surgery plays an important role in the treatment of LHBT pathology. The most commonly used methods are tenotomy and tenodesis; however, there is more than one surgical approach within each. Tenotomy is the more straightforward method, in which the tendon is released from the supraglenoid tubercle [16]. This can be performed with or without creating a funnel-shaped proximal stump [17] or releasing the LHBT with a portion of the superior labrum [18]. Tenodesis can be performed arthroscopically or through an open approach, and the tendon may be fixed to soft tissue or bone. The site can be suprapectoral or subpectoral [19]; the fixation may involve suturing to tendons, interference screws, bone tunnels, keyholes, suture anchors, or suture buttons [10,20,21].
Due to the controversial results of clinical trials and limitations of previous meta-analyses, we aimed to provide the most comprehensive analysis to date comparing tenodesis to tenotomy in managing LHBT pathologies.

Methods
We used the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) statement [42] to report our research.

Protocol
We registered our research protocol on PROSPERO in advance under the registration number CRD42021244613. There were no protocol deviations.

Search strategy, inclusion, and exclusion criteria
To formulate our clinical question, we used the PICOTS framework. P (population) was patients who had undergone LHBT operations, I (intervention) was tenotomy, C (comparison) was tenodesis, and O (outcomes) were the following: pain on the ten-point Visual Analog Scale (VAS), bicipital cramping pain events, bicipital groove pain events, Constant score (range: 0-100), American Shoulder and Elbow Surgeons (ASES) score (range: 0-100), Simple Shoulder Test (SST) score (range: 0-12), operative time in minutes, elbow flexion strength, forearm supination strength, and Popeye deformity events. Regarding T (timing), we statistically analysed every outcome when at least three studies reported it at the same time point. If an outcome did not qualify for quantitative synthesis, we included it only in the systematic review section. The S (study type) was randomized controlled trials (RCTs).
On 28 November 2020, we conducted a systematic search using the databases of MEDLINE (via PubMed), Embase, Cochrane Central Register of Controlled Trials (CENTRAL), Web of Science, and Scopus, using the following search key: "bicep* AND teno*". We used the "all fields" option (or the equivalent of it) in the first four databases, while in Scopus we used the "Article title, Abstract, Keywords" search field. We applied no filters in any of the databases.
Our inclusion criteria were the following: RCTs, comparing tenotomy and tenodesis and reporting on the outcomes of interest.
Our exclusion criteria were the following: review, meta-analysis, cohort study, case report, surgical technique description, studies comparing different submodalities (for example, different tenodesis techniques), distal biceps tear, biomechanical study, cadaver study, and animal study.

Selection and data extraction
We used EndNote X9 (Clarivate Analytics, Philadelphia, PA, USA) for the selection process. After removing the duplicates, two independent review authors (M.V., S.L.) performed the selection, first by title, then abstract, and finally by full text. Following every step of the selection, Cohen's kappa was calculated to assess the agreement between the two investigators with the following parameters: 0.00-0.20 no agreement, 0.21-0.39 minimal agreement, 0.40-0.59 weak agreement, 0.60-0.79 moderate agreement, 0.80-0.90 strong agreement, and above 0.90 almost perfect agreement [43]. We screened the references of the eligible records for possible additional articles to include in the meta-analysis. The same two review authors conducted data extraction using a pre-specified Excel sheet (Office 2016, Microsoft, Redmond, WA, USA). We gathered data from the articles about the first author, year of publication, country, study design, demographic data, indication for surgery, surgical methods, and the outcomes that we presented. If the strength measurement results were reported in newtons (N), we converted them to kilograms (kg) using an online calculator (calculator-converter.com). If the studies did not report the Strength Index (SI) but did report the strength measurement results of both sides, we calculated the SI from them.
Two independent review authors (M.V., L.S.) resolved the disagreements by consensus regarding both the selection and the data extraction process.
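The agreement statistic and the unit conversion described in this section can be illustrated with a minimal Python sketch. It is not the authors' workflow: the function names are hypothetical, the band boundaries simply follow the ranges quoted above, and the newton-to-kilogram factor assumes standard gravity (9.80665 m/s²), since the factor used by the online calculator is not stated.

```python
from collections import Counter

# Assumed conversion factor (standard gravity); not stated in the article.
STANDARD_GRAVITY = 9.80665


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters' decisions on the same records."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must judge the same non-empty set of records")
    n = len(rater_a)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    # Chance agreement: product of the raters' marginal counts per category.
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / n**2
    if expected == 1.0:
        return 1.0  # both raters used one identical category throughout
    return (observed - expected) / (1.0 - expected)


def agreement_label(kappa):
    """Map kappa to the agreement bands quoted in the text."""
    if kappa <= 0.20:
        return "no agreement"
    if kappa < 0.40:
        return "minimal agreement"
    if kappa < 0.60:
        return "weak agreement"
    if kappa < 0.80:
        return "moderate agreement"
    if kappa <= 0.90:
        return "strong agreement"
    return "almost perfect agreement"


def newtons_to_kg(force_n):
    """Convert a force in newtons to kilogram-force under standard gravity."""
    return force_n / STANDARD_GRAVITY


# Hypothetical screening decisions, for illustration only.
reviewer_1 = ["include", "include", "exclude", "exclude", "exclude", "include"]
reviewer_2 = ["include", "exclude", "exclude", "exclude", "exclude", "include"]
kappa = cohens_kappa(reviewer_1, reviewer_2)
print(round(kappa, 2), agreement_label(kappa))
```

Kappa corrects the raw proportion of matching decisions for the agreement expected by chance, which is why it is preferred over simple percent agreement when most records are excluded by both reviewers.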

Statistical analysis
For dichotomous outcomes, odds ratios (ORs) with their 95% confidence intervals (CIs) were calculated from the original raw data of the articles. We applied a continuity correction [44] to the number of reported bicipital cramping pain events (final data outcome), as we observed zero events in some studies. For continuous outcomes, weighted mean differences (WMDs) with 95% CIs were calculated from the original raw data of the articles, except in some cases where standard deviations (SDs) and means were calculated from the minimum, median, maximum, and sample size according to Wan's method [45]. The random-effects model by DerSimonian and Laird [46] was applied in all cases, with an estimate of heterogeneity. Following the Cochrane Handbook, I² values between 30 and 50% were considered moderate heterogeneity, values between 50 and 75% substantial heterogeneity, and values higher than 75% considerable heterogeneity. We used forest plots to display the results graphically. When it was statistically possible, we performed a trial sequential analysis (TSA) [47] to confirm the statistical reliability of the data with the calculation of the required information size by adjusting the significance level for sparse data.
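The calculations named above can be sketched in Python as follows. This is an illustration under stated assumptions, not the Stata routines used in the study: the function names are hypothetical, the continuity correction is assumed to add 0.5 to every cell of a table containing a zero (the exact variant from [44] is not specified in the text), and the range-based estimate uses the formulas of Wan et al. for the case where only the minimum, median, maximum, and sample size are known.

```python
import math
from statistics import NormalDist


def log_odds_ratio(events_1, total_1, events_2, total_2, correction=0.5):
    """Log OR and its variance; assumed correction adds 0.5 to all cells if any is zero."""
    a, b = events_1, total_1 - events_1
    c, d = events_2, total_2 - events_2
    if 0 in (a, b, c, d):
        a, b, c, d = (x + correction for x in (a, b, c, d))
    return math.log((a * d) / (b * c)), 1 / a + 1 / b + 1 / c + 1 / d


def mean_difference(mean_1, sd_1, n_1, mean_2, sd_2, n_2):
    """Raw mean difference between two groups and its variance."""
    return mean_1 - mean_2, sd_1**2 / n_1 + sd_2**2 / n_2


def mean_sd_from_range(minimum, median, maximum, n):
    """Estimate mean and SD from min, median, max and n (Wan et al. formulas)."""
    if n < 2:
        raise ValueError("sample size must be at least 2")
    mean = (minimum + 2 * median + maximum) / 4
    z = NormalDist().inv_cdf((n - 0.375) / (n + 0.25))
    return mean, (maximum - minimum) / (2 * z)


def dersimonian_laird(effects, variances, level=0.95):
    """Random-effects pooling with the DerSimonian-Laird tau^2 estimator."""
    w = [1 / v for v in variances]
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sum(w)
    # Cochran's Q measures the spread of study effects around the fixed estimate.
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    df = len(effects) - 1
    c = sum(w) - sum(wi**2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0
    w_star = [1 / (v + tau2) for v in variances]
    pooled = sum(wi * yi for wi, yi in zip(w_star, effects)) / sum(w_star)
    se = math.sqrt(1 / sum(w_star))
    z = NormalDist().inv_cdf(0.5 + level / 2)
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    return {"pooled": pooled, "ci_low": pooled - z * se,
            "ci_high": pooled + z * se, "tau2": tau2, "i2": i2}


# Hypothetical per-study mean differences and variances, for illustration only.
print(dersimonian_laird([0.8, 1.1, 1.0], [0.09, 0.12, 0.10]))
```

For odds ratios, pooling is done on the log scale, so the pooled estimate and its confidence limits would be exponentiated before reporting.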
We statistically analysed and compared every outcome when at least three studies reported them at the same time point. To provide a clear picture of the available data, we present the individual results of all included studies, comparing the two surgical methods in the systematic review section.
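The eligibility rule for quantitative synthesis (an outcome is pooled only when at least three studies report it at the same time point) can be expressed as a short Python sketch. The record layout, the function name, and the example entries are hypothetical and serve only to illustrate the rule.

```python
from collections import defaultdict


def poolable_outcomes(records, min_studies=3):
    """Return {(outcome, months): [studies]} for pairs reported by enough studies."""
    reporting = defaultdict(set)
    for study, outcome, months in records:
        reporting[(outcome, months)].add(study)
    return {
        key: sorted(studies)
        for key, studies in reporting.items()
        if len(studies) >= min_studies
    }


# Hypothetical extraction records: (study label, outcome, follow-up in months).
records = [
    ("Study A", "VAS pain", 3),
    ("Study B", "VAS pain", 3),
    ("Study C", "VAS pain", 3),
    ("Study A", "Constant score", 6),
    ("Study B", "Constant score", 12),
]
print(poolable_outcomes(records))
# {('VAS pain', 3): ['Study A', 'Study B', 'Study C']}
```

Outcome and time point pairs that fall below the threshold are not discarded; in the approach described above they are reported narratively in the systematic review section.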
All data management and statistical analysis were performed with Stata (version 16.0, StataCorp) and TSA (trial sequential analysis tool from Copenhagen Trial Unit, Centre for Clinical Intervention Research, Denmark).

Risk of bias assessment and quality of evidence
We performed the risk of bias assessment for every examined outcome according to the Cochrane recommendation using the RoB 2: A revised Cochrane risk of bias tool for randomized trials [48].
To assess the certainty of the evidence, we used the Grading of Recommendations Assessment, Development, and Evaluation (GRADE) system [49] and classified our results into four levels: high, moderate, low, and very low certainty of evidence.
Two independent review authors (M.V. and L.S.) performed the risk of bias and certainty of evidence assessments. The disagreements were resolved by consensus.

Characteristics of the studies included
The basic characteristics of the included studies are summarized in Table 1. All the included studies were RCTs, and ten of them compared tenotomy to tenodesis [50][51][52][53][54][55][57][58][59][60]. We included nine studies and 572 participants in the meta-analysis, 293 in the tenotomy group and 279 in the tenodesis group. Two studies [59, 60] did not report outcomes at a time point matching the other studies; therefore, we were only able to include these in the systematic review section.
Tenotomy was performed arthroscopically in all studies. Tenodesis was also performed arthroscopically, except in the case of 31.5% of patients (17 out of 54) in the study of MacDonald et al. [54], where surgeons used an open subpectoral approach.
The follow-up times were different in the studies, mostly between 12 and 24 months, with some variation. The evaluation times of several outcomes were also different.
We were able to analyse the Constant score in three studies at the six-month follow-up [51,53,55] (Fig. 3). Neither result showed a statistically significant difference between the two groups. The study of Lee et al. [60] also reported the six-month and 12-month Constant scores, but it was not possible to analyse these outcomes due to a lack of data.

Post-operative pain
Three studies reported three-month pain scores on the ten-point VAS [50,54,58] (WMD, 0.99; 95% CI, 0.51-1.48; p < 0.001; I² = 0.0%; high grade of evidence) (Fig. 3). The difference was significant in favour of tenotomy, indicating earlier pain relief with tenotomy than with tenodesis. Four studies reported the six-month pain scores on VAS [51,54,55,58] (Fig. 6) (different studies reported pain at different time points), and we found no significant difference at these time points. The study of Lee et al. [60] also reported the three-month, six-month, and 12-month level of pain, but it was not possible to analyse these outcomes due to a lack of data.

TSA (trial sequential analysis)
The results of our TSA are depicted in Supplementary Figs. 9-16. Due to a lack of data, TSA was not possible for the following outcomes: six-month Constant scores, six-month VAS pain scores, 24-month VAS pain scores, and bicipital cramping pain events at six months post-operatively.
The summary of calculated odds ratios and weighted mean differences for the outcomes that were not eligible for the meta-analysis is shown in Table 2.

Risk of bias assessment and quality of evidence
A summary of the risk of bias assessment is shown in Supplementary Figs. 17-38. The Popeye deformity was the only outcome that all studies reported. In this analysis, we found four studies with high risk of bias [55, 58-60], six studies carried "some concerns" [50-53, 56, 57], while one study had a low risk of bias [54]. Lower grades were mostly due to the unclear randomization process, the lack of blinding, and the missing trial protocols.
The results of the GRADE analysis are shown for every outcome in the results section. A detailed description of the quality of evidence is found in Supplementary Table 1.
Biceps brachii has an essential role in elbow flexion strength. For this reason, we chose this as one of the primary outcome parameters. Even though our analysis did not show a significant difference at the six-month follow-up, at 12 months, the elbow flexion strength was significantly better in the tenodesis group. To our knowledge, this result is a novelty compared to the results of previous meta-analyses that examined this particular outcome [34, 37-40]. Nevertheless, our TSA indicates that further RCTs are needed in the case of the six-month results. Even though the required sample size was reached for the 12-month results, potential spurious significance was present; thus, this should be considered inconclusive according to the TSA result. If we consider the results of the individual studies included in the systematic review, we are left with mixed results, but due to the differences in time points, we could not perform more statistical comparisons.
Forest plot legend (fragment): the size of the grey squares reflects the weight of a particular study. The blue diamond reflects the overall or summary effect. The outer edges of the diamonds represent the CIs.
Another major role of the biceps brachii is forearm supination. Our results showed a statistically significant difference between the 12-month supination strength results in favour of tenodesis, contradicting the literature so far [34, 37-40]. According to our trial sequential analysis, further clinical trials are needed to reach a more certain result. Examining the final data from the individual studies, we discovered a tendency in favour of tenodesis.
The Constant score is a widely accepted scoring system used to evaluate post-operative function after shoulder operations. However, it is not specific to biceps function but was designed to assess the overall functional state of the shoulder [62]. Although we found no significant difference between the Constant scores (6 and 12 months post-operatively), if we add the systematic review results, there is a trend suggesting that tenodesis might lead to better post-operative scores than tenotomy. This result is in accordance with the previous meta-analyses, which either found a statistically significant difference that did not reach the minimal clinically important difference (MCID) [63] [34][35][36][38][39][40][41] or did not find any significant difference when comparing the two methods [37].
From the patient's perspective, post-operative pain might be the strongest quality measure after surgery. We could analyse the degree of pain as indicated on the VAS at three, six, 12, and 24 months after surgery. The difference was significant only at the three-month follow-up in favour of tenotomy. The TSA for this outcome showed that no further studies are needed to confirm the result. Thus, we can conclude that patients experience less pain three months after tenotomy than those who underwent tenodesis. Despite this, we found no significant differences between the two methods in the long term. Out of the meta-analyses that examined pain on VAS [34, 38-40], only Ahmed et al. [34] evaluated more time points (6, 12, 24 months), but they did not find significant differences between tenotomy and tenodesis. The systematic review results did not suggest any strong tendency toward the preference of tenotomy or tenodesis.
According to some previous articles, one of the drawbacks of tenotomy is that it leads to a higher incidence of cramping pain events [35,37]. The results of our analysis at the six-month follow-up do not support this assumption and are in accord with those analyses which found no difference between tenotomy and tenodesis [34, 36, 38-41]. The results remained the same after we evaluated the data of the systematic review.
In a recent study on 1723 patients, tenotomy was associated with a higher incidence of Popeye deformity than tenodesis [23]. Our results confirmed this finding: we also found a significant difference between the two groups in favour of tenodesis, in accordance with earlier meta-analyses [34][35][36][37][38][39][40][41]. The TSA showed that no further clinical trials are needed to confirm this result.
Surgical times can vary greatly for various reasons, including concomitant procedures such as rotator cuff repair and the surgical team's experience. According to a recent systematic review and meta-analysis, shorter operative time is one of the advantages of tenotomy [35]. Surprisingly, even though all of the included RCTs that examined this outcome [54,57,58] found that tenodesis requires more time to perform, the result of our analysis showed no statistically significant difference between tenotomy and tenodesis in this regard. Considering the results established in the literature and the conflicting result of our TSA, no conclusion can be drawn on this topic at present.

Strengths and limitations
This meta-analysis of nine studies has considerable strengths. Unlike previous analyses, a strict methodology was applied, with outcomes assessed only at the same time points. Since we only included randomized controlled trials, this analysis portrays the highest level of achievable evidence on this topic. Trial sequential analyses were performed to assess whether further clinical trials are needed. The TSA was conclusive for three-month pain levels on the VAS and for Popeye deformity at the 24-month follow-up.
Our meta-analysis had some limitations, including the small sample size that influenced some of the TSA results. In addition, the indication for treatment differed among the included trials, and there was heterogeneity among the studies regarding intervention submodalities and rehabilitation protocols. In some cases, standard deviations (SDs) and means were calculated from the minimum, median, maximum, and sample size. TSA was not conclusive for the following outcomes: six-month elbow flexion strength in kg, 12-month elbow flexion strength in kg, 12-month forearm supination strength in kg, 12-month Constant score, 12-month pain levels on the Visual Analog Scale, and operative time in minutes.
We suggest conducting further randomized controlled trials focusing on elbow flexion strength, forearm supination strength, pain, and operative time, as these were deemed inconclusive based on our TSA. When designing an RCT, exact time points for the assessment of outcomes are required. Biceps function-specific outcomes such as flexion and supination strength should be the focus of further RCTs. The use of the LHB score [61] might be beneficial in studies focusing on LHBT treatment methods, since it is specific to the biceps, unlike the score systems most studies use (Constant, ASES, SST, UCLA (University of California at Los Angeles), etc.). Creating and reporting subgroups would also be beneficial (e.g., a group with concomitant rotator cuff surgery and a group without it, or a comparison of different tenotomy methods with the potential for autotenodesis).

Conclusions
Based on our results, tenodesis should be preferred over tenotomy due to a less frequent occurrence of Popeye deformity, better postoperative biceps function, and the non-inferior nature of tenodesis regarding long-term pain.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.