Background

Femoral neck fractures (FNFs) place a heavy burden on patients because of their high morbidity, disability rate, economic cost, and mortality, and their incidence is growing rapidly as the elderly population increases [23]. Arthroplasty is commonly recommended for displaced femoral neck fractures (67% of all FNFs) in the elderly (age > 65 years) and can be categorized as total hip arthroplasty (THA) or hemiarthroplasty (HA) [34]. Whether THA or HA is more suitable for FNFs remains controversial [21]. The pros and cons of both treatments have been widely reported in previous studies and systematic reviews, but no consensus has been reached [6, 11, 13, 15, 17, 24, 26, 33, 49, 50, 51, 52]. The ongoing debate requires highly reliable answers. However, previous meta-analyses and reviews have several limitations. First, they did not fully report the details of surgical approach, prosthesis choice, surgeon experience, and the type of femoral and acetabular fixation, all of which may confound the conclusions. Second, the strict inclusion criteria of some studies may have limited the data available for analysis. Third, subgroup analysis was limited, and long-term results were not considered. The latest meta-analysis included trials reported between 2006 and 2017 and may be outdated [33]. High-quality randomized controlled trials (RCTs) have been published recently and have not yet been included in a pooled analysis; in addition, we carefully selected Chinese articles with sufficient follow-up duration and reported outcomes for our analysis [3, 8, 18, 25, 27, 31, 38, 44].

We conducted an updated meta-analysis only including RCTs to provide the most reliable evidence.

Methods

The review followed Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines (www.prisma-statement.org).

Searches and selection criteria

We searched the English-language databases PubMed, Medline, Embase, The Cochrane Library, and Web of Science and the Chinese databases CNKI, VIP, WAN FANG, and CBM (all from inception to July 2020) without language or date restriction, and we manually retrieved additional articles identified in other reviews. The search strategy is provided in the Supplementary files. Inclusion criteria were RCTs comparing THA with HA for FNFs that reported at least one of the predetermined outcomes. To make our conclusions generalizable, we set no restrictions on follow-up time, patients’ age, study size, or pre-surgery status.

Outcome measures

We included the following outcomes:

  a) Hospital and surgery outcomes: hospital stay, surgery duration, blood loss;

  b) Clinical outcomes: Harris Hip Scores (HHS) within 6 months and up to 13 years;

  c) Patients’ quality of life: EQ-5D scores within 6 months and at 1 to 2 years;

  d) Common complications: pulmonary embolism, deep vein thrombosis, pneumonia, urinary tract infection, pressure ulcer, wound disease, surgical-site infection, and cardiovascular disease;

  e) Prosthesis-related complications: revision, fracture, dislocation, loosening or subsidence, heterotopic ossification, and acetabular erosion;

  f) Mortality: mortality in hospital, within 6 months, at 1 to 2 years, and up to 13 years;

  g) Cost.

Data extraction and study quality assessment

Two reviewers (T-XM, WD) independently screened the titles and abstracts for eligibility. We developed a data extraction form and collected data from the included articles after full-text reading and cross-checking. Any discrepancies were resolved by a third reviewer (C-JL). Study quality was assessed with the Cochrane Collaboration’s Risk of Bias tool. Missing data such as standard deviations were calculated with formulas from the Cochrane Handbook for Systematic Reviews of Interventions or estimated from the articles’ figures.
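
As an illustration of the kind of conversion mentioned above, the following minimal Python sketch recovers a group's standard deviation from a reported standard error or 95% confidence interval of the mean, using the large-sample relations described in the Cochrane Handbook. The function names and the numbers in the example are hypothetical and are not taken from the included trials; for small groups the Handbook recommends replacing 1.96 with the corresponding t value, which this sketch does not do.

```python
import math


def sd_from_se(se: float, n: int) -> float:
    """Standard deviation of a group from the standard error of its mean."""
    return se * math.sqrt(n)


def sd_from_ci(lower: float, upper: float, n: int, z: float = 1.96) -> float:
    """Standard deviation from a confidence interval for a group mean.

    Assumes a large-sample (normal) interval; z = 1.96 corresponds to 95%.
    """
    se = (upper - lower) / (2 * z)  # CI width is 2 * z * SE
    return sd_from_se(se, n)


# Hypothetical group: n = 100, 95% CI of the mean 78.04 to 81.96
print(round(sd_from_ci(78.04, 81.96, 100), 2))
```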

Statistical methods

The review used forest plots to present the pooled results. For continuous and binary variables, weighted mean differences (WMD) and risk ratios (RR), respectively, were reported with 95% confidence intervals (CI). Survivorship was analyzed with Kaplan–Meier survival curves. Heterogeneity was assessed with both the Q (chi-squared) test and the I2 statistic; a P value < 0.1 or I2 > 50% indicated statistical heterogeneity. Galbraith plots and sensitivity analysis were used to identify possible sources of heterogeneity, and subgroup analyses were performed where necessary to explore it. A random-effects model was used for all analyses. Sensitivity analysis by sequential omission of individual studies was used to test the robustness of the pooled data. For publication bias, the symmetry of funnel plots was evaluated visually, and Egger’s test was also applied. Statistical analyses were performed with Review Manager (Version 5.0.2) and STATA (Version 13.0). All P values were two-sided.
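
The pooling and heterogeneity statistics described above can be sketched in a few lines of Python. This is a minimal illustration that assumes the DerSimonian-Laird estimator for the between-study variance; the text does not state which random-effects estimator Review Manager or STATA applied, and the function name and input values are hypothetical.

```python
import math


def pool_random_effects(effects, variances):
    """DerSimonian-Laird random-effects pooling (assumed estimator).

    effects   : per-study estimates (mean differences or log risk ratios)
    variances : within-study variances of those estimates
    Returns the pooled estimate, its 95% CI, Cochran's Q, I2 (%) and tau2.
    """
    k = len(effects)
    w = [1.0 / v for v in variances]  # inverse-variance (fixed-effect) weights
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sum(w)
    # Cochran's Q: weighted squared deviations from the fixed-effect estimate
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    df = k - 1
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0  # between-study variance
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    w_re = [1.0 / (v + tau2) for v in variances]  # random-effects weights
    pooled = sum(wi * yi for wi, yi in zip(w_re, effects)) / sum(w_re)
    se = math.sqrt(1.0 / sum(w_re))
    return {
        "pooled": pooled,
        "ci": (pooled - 1.96 * se, pooled + 1.96 * se),
        "q": q,
        "i2": i2,
        "tau2": tau2,
    }


# Two hypothetical studies with mean differences 1.0 and 3.0, variance 1.0 each
print(pool_random_effects([1.0, 3.0], [1.0, 1.0]))
```

Risk ratios would be pooled on the log scale and back-transformed with the exponential function; I2 above 50% corresponds to the heterogeneity threshold used in this review.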

Results

Search results

Our search yielded 2325 reports, of which 1356 were excluded after removal of duplicates. Of the remaining reports, 48 were retained after screening of titles and abstracts. After full-text screening, 23 were excluded; the details are shown in the flow chart (Fig. 1). For clinical outcomes, we included 25 reports based on 19 trials and extracted non-overlapping data at different follow-up stages [1, 2, 3, 4, 7, 8, 10, 12, 16, 18, 22, 25, 27, 29, 30, 31, 36, 37, 38, 40, 42, 43, 44, 46, 47].

Fig. 1 PRISMA flowchart of the selection process

Methodological quality assessment

Two reviewers independently assessed the risk of bias of the included studies with the Cochrane Collaboration’s Risk of Bias tool; the results are shown in Fig. 2. Selection, attrition, and reporting bias were judged to be at low risk, whereas detection and performance bias were at moderate risk. Overall, the methodological quality of the included studies was judged to be very good.

Fig. 2 Risk of bias of included studies according to the Cochrane Collaboration’s Risk of Bias tool

Risk of publication bias

A funnel plot was drawn for the outcome reported by the most studies (dislocation) to detect publication bias. The symmetrical distribution and Egger’s test (P = 0.708) indicated a low risk of publication bias (Fig. 3).
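
For readers unfamiliar with Egger’s test, the sketch below shows the regression behind it: each study's standardized effect (effect divided by its standard error) is regressed on its precision (the reciprocal of the standard error), and an intercept far from zero suggests funnel-plot asymmetry. The function name and the four input studies are hypothetical; this is not the STATA routine used in the review, and the P value (from a t distribution with k - 2 degrees of freedom) is not computed here.

```python
import math


def egger_intercept(effects, std_errors):
    """Egger's regression of standardized effect on precision.

    Returns (intercept, se_intercept, t_statistic). Needs at least 3 studies.
    """
    x = [1.0 / s for s in std_errors]                 # precision
    y = [e / s for e, s in zip(effects, std_errors)]  # standardized effect
    k = len(x)
    mx, my = sum(x) / k, sum(y) / k
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    # Residual variance of the ordinary least-squares fit
    rss = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    s2 = rss / (k - 2)
    se_int = math.sqrt(s2 * (1.0 / k + mx ** 2 / sxx))
    t_stat = intercept / se_int if se_int > 0 else float("nan")
    return intercept, se_int, t_stat


# Four hypothetical studies (log risk ratios and their standard errors)
print(egger_intercept([2.1, 1.45, 1.3, 1.16], [1.0, 0.5, 0.25, 0.2]))
```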

Fig. 3 Funnel plot based on dislocation rate

Study characteristics

We finally included 25 reports of RCTs involving 3223 patients (THA 1568, HA 1655). Five of them [1, 16, 29, 40] were follow-up reports of previous trials. Table 1 summarizes the trials’ details.

Table 1 Characteristics of the included studies

Outcomes of interest

The overall results are presented in Table 2.

Table 2 The results of meta-analysis

Hospital and surgery outcomes

Compared with HA, THA had a longer surgery time (WMD = 20.044, P < 0.0001), more blood loss (WMD = 69.106, P < 0.0001), and a longer hospital stay (WMD = 2.360, P = 0.031). Fifteen studies reported surgery time (THA 1292, HA 1341), and nine studies reported hospital stay (THA 418, HA 443) with high heterogeneity (I2 = 96%). A Galbraith plot identified two studies as the main source of the heterogeneity [25, 27]; after excluding them, the results were stable with no heterogeneity (I2 = 0%). For blood loss, nine studies were included (THA 1063, HA 1038), and the results were stable after removing studies from developing countries (Fig. 4).

Fig. 4 Forest plot of meta-analysis: Hospital and surgery outcomes

Clinical outcomes

THA had HHS similar to HA within 6 months (WMD = 1.641, P = 0.124) and after 9 years (WMD = 5.848, P = 0.273) but higher scores at 1 year (WMD = 3.593, P = 0.002), 2 years (WMD = 3.691, P = 0.020), and 3 to 5 years (WMD = 6.027, P = 0.035) (Fig. 5). Three studies reported pain as an HHS subscore, and another three reported pain as a binary variable; neither analysis showed a difference between groups at any follow-up point. For patients’ quality of life, pooled data revealed no significant difference in EQ-5D scores up to 1 year after surgery, but the results favored THA at 2 years (WMD = 0.107, P < 0.0001) (Fig. 6).

Fig. 5 Forest plot of meta-analysis: Harris Hip Score

Fig. 6 Forest plot of meta-analysis: EQ-5D

Patients’ quality of life

EQ-5D scores within 6 months (WMD = 0.031, P = 0.324) and at 1 year after surgery (WMD = 0.033, P = 0.351) were similar between groups but favored THA at 2 years (WMD = 0.107, P < 0.0001).

Common complications

The pooled data showed no statistically significant difference between groups in the rates of pulmonary embolism, deep vein thrombosis, pneumonia, pressure injury, wound disease, surgical-site infection, and cardiovascular disease.

Prosthesis-related complications

A total of 13 studies suggested that the revision rate was similar in both groups, with moderate heterogeneity (I2 = 47.2%). The Galbraith plot identified one study [40] as the main source, and the results were stable after removing it (I2 = 30.3%). That study reported 13-year follow-up results, which explains the heterogeneity. Sixteen studies showed that THA had a significantly higher dislocation rate than HA (RR = 1.897, P = 0.002). Compared with THA, HA had a higher rate of acetabular erosion (RR = 0.030, 95% CI 0.004 to 0.219, P = 0.001) (Fig. 7). For fracture, loosening or subsidence, and heterotopic ossification, no statistically significant difference was found between groups.
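
Binary outcomes such as dislocation are summarized as risk ratios, as stated in the Statistical methods. The following minimal Python sketch shows how a single trial's risk ratio and its 95% confidence interval can be obtained from event counts, using the standard large-sample formula on the log scale. The function name and the counts are hypothetical and do not come from any included trial; trials with zero events in one arm would need a continuity correction, which is not handled here.

```python
import math


def risk_ratio(events_a, total_a, events_b, total_b):
    """Risk ratio of group A versus group B with a 95% CI.

    The interval is built on the log scale and back-transformed.
    Event counts must be non-zero in both groups.
    """
    rr = (events_a / total_a) / (events_b / total_b)
    # Standard error of log(RR)
    se_log = math.sqrt(1 / events_a - 1 / total_a + 1 / events_b - 1 / total_b)
    lower = math.exp(math.log(rr) - 1.96 * se_log)
    upper = math.exp(math.log(rr) + 1.96 * se_log)
    return rr, lower, upper


# Hypothetical trial: 20/100 events in one arm versus 10/100 in the other
print(risk_ratio(20, 100, 10, 100))
```

A risk ratio above 1 indicates more events in the first group; the interval excludes 1 only when the difference is statistically significant at the 5% level.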

Fig. 7 Forest plot of meta-analysis: Prosthesis-related complications

Mortality

Kaplan–Meier analysis showed similar survivorship between groups (HR 1.029; 95% CI 0.905 to 1.169; P = 0.665; Fig. 8). Subgroup analysis at 2 years of follow-up revealed lower mortality in the HA group (RR = 1.224, P = 0.008).

Fig. 8 Survival curve

Discussion

Hospital and surgery outcomes

For surgery time, almost all previous pooled results are consistent with ours [24, 28, 33, 48, 49]. We consider the main reason to be that HA requires fewer operative steps, because it omits cup preparation and implantation. For hospital stay, we found that patients who underwent THA stayed longer in hospital. Delayed discharge is usually caused by post-surgery complications; since we found no difference in common complications, we consider that earlier ambulation in patients who underwent HA may explain the difference. We also found reduced blood loss in the HA group; fewer surgical steps and less tissue damage may explain this result.

All three indicators favor the HA group, and the results are unlikely to change even with more evidence. However, they may have limited clinical value for decision-making.

Clinical outcomes

Many studies have shown better HHS outcomes in the THA group but did not provide long-term results or subgroup analysis because of the limited number of trials [24, 28, 33, 49]. We formed subgroups based on follow-up period and found that the THA group had a higher total HHS in the medium term (1–5 years) but no difference in the short term (< 6 months) or long term (> 9 years).

For pain scores, we detected no difference between the two groups, and the PCU-THA used in one trial was the main source of heterogeneity [6]. Liu et al. [28] and Wang [20] found that patients in the THA group experienced significantly less pain, but they included only a limited number of trials in the pooled results.

Patients’ quality of life

For EQ-5D scores, our conclusion agrees with other studies that THA gives better overall quality of life [24, 28, 48, 49]. Our subgroup analysis showed that the difference became evident 2 years after surgery.

Common complications

We found no difference in common complications, and we believe further studies are unlikely to change this result. Our results differ from those of Liu et al. [28]. Their study limited patients’ age to over 75 years, and we believe the complications may largely be attributable to the patients’ own condition rather than implant type.

Prosthesis-related complications

The results show that the revision rate was similar, with moderate heterogeneity (I2 = 47%). After sensitivity analysis, the study by Ravikumar and Marsh [41] was considered the source because it reported 13-year follow-up results (24% in HA; 6.75% in THA). Among meta-analyses that included only RCTs, Metcalfe et al. [32], Liu et al. [28], and Migliorini et al. [33] support our results, although Migliorini et al. found a higher revision rate in THA within 5 years and in HA after 5 years. Lewis et al. [24] found that THA was superior to HA, but the non-RCTs in their study may lower the grade of evidence.

However, data from registries contrast with the results of randomized trials because RCTs always select their enrolled patients. According to national registry studies, dislocation, infection, and periprosthetic fracture are the main reasons for revision [35, 45]. An anterolateral approach, cemented stem, bipolar head, and 36-mm cup can reduce revisions and should be considered by surgeons to achieve the best outcomes for patients [14, 32, 35, 45]. Dislocation is a constant concern for clinicians because it is a main reason for revision. We found that THA has a higher dislocation rate than HA. The type of head (bipolar vs. unipolar), type of cup (dual-mobility vs. single cup), patient age, pre-injury ambulation status, and surgical approach may influence the dislocation rate. Our conclusion is in line with other reviews and registry reports [19, 24, 28, 33]. Acetabular erosion is a theoretical indication for revision of a painful HA. The pooled data show a higher acetabular erosion rate in the HA group, and we found no disagreement among other authors. Osteoarthritis is also an important factor in the choice of therapy.

Surgeons are usually cautious about THA because of the elevated risk of dislocation, with the associated risk of subsequent revision and, ultimately, death. However, we found that the revision rate was similar between the two groups. A possible reason is that THA has a higher dislocation rate while HA has a higher acetabular erosion rate, which evens out the revision rate between the two groups. The long-term results favor THA, and surgeons could choose appropriate implants and approaches to reduce dislocation rates.

Mortality

We found that the overall mortality rate was similar between groups, and comparable results were reported by other meta-analyses [24, 33, 48, 49]. However, we found that THA had a slightly higher mortality rate 2 years after surgery, a difference that our subgroup analysis was able to detect. We hypothesize that early revisions caused by dislocation lead to more deaths in the THA group and that this is later offset by acetabular erosion in the HA group. This result should be interpreted cautiously until more studies are available.

Cost

Three studies mentioned the cost of both techniques. Burgers et al. [5] found that the main costs were rehabilitation and nursing home care in the first year after surgery. Keating et al. [22] found that the difference in cost between groups was not significant but highlighted the high cost of readmissions in patients who underwent HA. Ravi et al. [39] found that THA, relative to HA, reduced health care costs associated with the index admission in the first year after surgery. Dangelmajer et al. [9] found that patient age and medical care payer status were both associated with the odds of receiving THA, and patients with private insurance had higher odds of receiving THA. Efforts to reduce costs after hip fracture surgery should focus on shortening the rehabilitation phase and improving its efficiency. The economic evidence suggests that THA deserves greater consideration because it can cut the cost of readmission and rehabilitation.

Limitations

Some limitations should be noted. First, missing information (implant types, operative approach, etc.), uncontrolled confounders (medical resources, surgeon experience, etc.), and other factors might affect the credibility of the pooled data, even though we selected the most reliable type of trial. Second, we did not set strict inclusion criteria because eligibility had already been addressed in the design of the RCTs, and the low heterogeneity of the results supports this choice. Third, although our results suggest a difference between short-term and long-term results in functional outcomes and patients’ quality of life, long-term reports are still limited.

Therefore, future research should consider multicenter, large population-based designs and should report more long-term follow-up results.

Conclusion

Based on the results, we suggest that HA be recommended for patients with cognitive impairment, comorbidities, reduced performance status, and low functional demands, and that THA be recommended for patients who are active and healthy, have a long life expectancy and young biological age, and have higher demands for function and quality of life.