Introduction

Total hip arthroplasty (THA) is one of the most commonly performed operations in the United States, with over 280,000 procedures reported annually [1, 28, 46]. The benefits of THA in terms of reduced pain and improved function and quality of life (QoL) for patients with debilitating hip disease have been well documented in the literature [17]. Furthermore, THA is a highly cost-effective intervention when compared with nonoperative management in patients with advanced osteoarthritis (OA) of the hip [13, 20]. However, concerns regarding high rates of THA failure among young, active patients and a desire to preserve bone for future revision operations led to the development of hip resurfacing arthroplasty (HRA), which was first introduced in the United States in the 1970s. HRA differs from THA in that the femoral head is resurfaced rather than resected, thereby preserving femoral bone stock, which could theoretically decrease the morbidity and improve patient outcomes associated with future revision operations. However, early clinical experience with HRA was unfavorable, as high failure rates (13% to 34% within an average of 18 months to 3 years) were reported due primarily to aseptic loosening [6, 19, 23]. Thus, the procedure fell out of favor among orthopaedic surgeons in the late 1980s [6, 26].

With the introduction of large-diameter metal-on-metal (MoM) bearings, which are associated with lower wear rates and less deformation than conventional metal-on-polyethylene bearings, HRA has been reintroduced in the United States amid both controversy and enthusiasm. Proponents of MoM HRA point to the potential benefits in terms of femoral bone preservation and therefore less morbidity and better functional outcomes associated with future revision surgeries [2, 39, 45, 47, 50]. Opponents argue the increased risks of early failure due to femoral neck fracture and increased costs associated with MoM HRA implants overshadow the yet-to-be-proven long-term benefits. Furthermore, since the value of MoM HRA in terms of improved patient outcomes and ease of future revision surgery has not been conclusively demonstrated, many health plans have developed payment policies limiting the use of MoM HRA to specific patient populations.

Decision analysis offers a useful approach to compare MoM HRA to THA. The expected lifetime costs and cumulative gains in quality of life associated with each procedure are estimated from known information regarding the costs, probabilities of clinical outcomes (including complications and revision surgeries), and quality of life associated with each treatment strategy, and the two sets of estimates are then compared. This approach is consistent with the emerging field of comparative effectiveness research, which has been defined as the conduct and synthesis of research comparing the benefits and harms of different interventions and strategies to prevent, diagnose, treat, and monitor health conditions in the “real world” setting [18].

The aims of this study were (1) to evaluate the comparative clinical effectiveness, costs, and cost-effectiveness of MoM HRA compared with THA by patient age and gender for the treatment of patients with advanced OA of the hip; (2) to identify which clinical and demographic factors and costs have the greatest influence on the incremental lifetime improvement in quality of life, costs, and cost-effectiveness of MoM HRA versus THA; and (3) to quantify the uncertainty in the estimates of the comparative clinical and cost-effectiveness of MoM HRA versus THA.

Methods

We used a Markov decision model to evaluate the clinical and economic consequences of MoM HRA compared to THA. The population studied was men and women aged 50 years or older undergoing MoM HRA or THA for advanced OA of the hip. A 30-year time horizon was used to evaluate the incremental clinical effectiveness (in terms of quality-adjusted life-years [QALYs] gained) and cost-effectiveness (cost per QALY gained) of MoM HRA and THA. The incremental cost-effectiveness of MoM HRA versus THA was examined from a healthcare system perspective (focusing on health care costs and patient quality of life) using hospital and professional reimbursement to estimate costs and QALYs to estimate effectiveness.

The decision tree (Fig. 1A–B) begins with the decision to choose either MoM HRA or THA for patients with advanced OA of the hip. Each alternative is represented as a Markov model with mutually exclusive states. Patients transition between states (or remain in a state) over time. The model used intervals (Markov cycles) of 1-year duration. While in each state during each yearly interval, patients experience a quality of life (QoL) and incur direct medical costs; in addition, transitions associated with revision surgery (conversion from HRA to THA, major total revision THA [revision of both the acetabular and femoral components], major partial revision THA [revision of the acetabular component only], or minor revision THA [exchange of the modular acetabular liner and femoral head only]) are associated with a short-term transitional decrement in QoL (or disutility) and an increase in direct medical costs associated with revision surgery. The probability of transition between states depends on the patients’ age, gender, and type of procedure (MoM HRA or THA). For MoM HRA, the health states are year of initial MoM HRA, post-HRA, post-conversion from HRA to THA, post-major total revision THA, post-major partial revision THA, post-minor revision THA, death due to any HRA or THA surgery, and death due to other causes. Thus, the MoM HRA cohort may experience an initial failure (requiring conversion from HRA to THA) or a subsequent failure requiring revision after THA. For the THA cohort, the health states are year of initial THA, post-THA, post-major total revision THA, post-major partial revision THA, post-minor revision THA, post-second major total revision THA, post-second major partial revision THA, post-second minor revision THA, death due to any THA surgery, and death due to other causes. Decision analysis software (TreeAge Pro 2008, Williamstown, MA) was used to create the Markov decision model.
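The cohort mechanics described above can be illustrated with a minimal sketch. It is not the TreeAge model used in this study: the states are collapsed to three, the one-time cost and disutility of revision surgery are omitted, and every probability, utility, and cost below is a hypothetical placeholder rather than a value from Table 1.

```python
def run_markov_cohort(transition, utility, cost, start, cycles):
    """Trace a cohort through a Markov model with one-year cycles.

    transition[s][t] is the annual probability of moving from state s to
    state t. Returns (QALYs, cost, final state occupancy) per patient,
    all undiscounted.
    """
    states = list(transition)
    for s in states:
        if abs(sum(transition[s].values()) - 1.0) > 1e-9:
            raise ValueError(f"probabilities leaving {s!r} must sum to 1")
    # The whole cohort starts in a single state.
    occupancy = {s: (1.0 if s == start else 0.0) for s in states}
    qalys = 0.0
    total_cost = 0.0
    for _ in range(cycles):
        # Accrue one year of quality of life and cost for the current mix of states.
        qalys += sum(occupancy[s] * utility[s] for s in states)
        total_cost += sum(occupancy[s] * cost[s] for s in states)
        # Redistribute the cohort for the next cycle.
        nxt = dict.fromkeys(states, 0.0)
        for s in states:
            for t, p in transition[s].items():
                nxt[t] += occupancy[s] * p
        occupancy = nxt
    return qalys, total_cost, occupancy


# Hypothetical three-state example; all numbers are illustrative only.
TRANSITION = {
    "post_procedure": {"post_procedure": 0.97, "post_revision": 0.02, "dead": 0.01},
    "post_revision": {"post_revision": 0.99, "dead": 0.01},
    "dead": {"dead": 1.0},
}
UTILITY = {"post_procedure": 0.90, "post_revision": 0.80, "dead": 0.0}
COST = {"post_procedure": 100.0, "post_revision": 150.0, "dead": 0.0}

qalys, total_cost, final = run_markov_cohort(
    TRANSITION, UTILITY, COST, start="post_procedure", cycles=30
)
```

In the full model the transition probabilities would additionally vary by cycle, age, gender, and procedure; the fixed matrix here is a simplification for readability.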

Fig. 1A–B

(A) A Markov decision tree compares the clinical outcomes for MoM HRA and THA patients. MoM HRA and primary THA are represented as Markov nodes (“M”). The branches are the Markov states. Conversion from HRA to THA is analogous to first major revision in the primary THA alternative. The [+] indicates there are subsequent events in each state. (B) The detailed outcomes in the post-conversion from HRA to THA branch are shown.

Information on implant survivorship was sought from large national or multicenter registries with implant survival data of sufficient duration to estimate annual age-, gender-, and procedure-specific probabilities of implant survival and implant failure. Additionally, because the probability of implant failure varies by year of followup, we sought data of sufficient duration (5 or more years after initial surgery). The Australian Orthopaedic Association (AOA) National Joint Replacement Registry Hip and Knee Arthroplasty Annual Report [4] provides cumulative percent revision, stratified by gender and decade of age, for 5 to 7 years of followup for 9956 patients who received HRA for primary diagnosis of OA (excluding infection) and 109,972 patients who received primary conventional THA for a primary diagnosis of OA. The report also summarizes the type of revision (major total revision, major partial revision, and minor revision) and probability of subsequent revision for 2616 revision THA procedures. Annual probability of revision of MoM HRA and THA was estimated from the summary gender- and age-stratified data in the AOA National Joint Replacement Registry 2008 report by fitting a general failure time model (Weibull distribution), which allowed for time-varying hazard [22] of failure for each gender and age stratum for 5 years of followup (the longest followup interval for which data were available for all strata). As indicated in the AOA registry report, the probability of failure in each stratum was highest in the first year of followup and declined thereafter, resulting in cumulative revision curves that increased with time but at lower rates after the initial year. After year 5, the annual probability of failure was assumed to remain constant. The analyses of MoM HRA and THA are summarized separately for six gender and age strata to correspond with the available data on failure rates in the AOA registry report.
For both genders, the analysis used a patient age of 50 years representing the age younger than 55 years stratum, a patient age of 60 years representing the ages 55 to 64 years stratum, and a patient age of 70 years representing the age 65 to 74 years stratum. The probabilities of perioperative mortality for MoM HRA, THA, and all revision THAs were derived from the literature [15, 27, 30–32, 37, 41, 44, 51, 52]. Annual gender- and age-specific all-cause mortality rates were based on United States life tables [3].
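One way to turn cumulative percent revision into annual failure probabilities is sketched below. The fitting method (least squares on the linearized Weibull survival function) is an assumption made for illustration, since the fitting procedure is not specified here, and the cumulative revision figures are hypothetical rather than registry values. The conditional probability of failure in year t is taken as 1 − S(t)/S(t − 1) and is held constant after year 5, as described above.

```python
import math


def weibull_survival(t, scale, shape):
    """Weibull survival function S(t) = exp(-(t / scale) ** shape)."""
    return math.exp(-((t / scale) ** shape))


def fit_weibull(years, cumulative_revision):
    """Fit scale and shape by least squares on the linearized form
    ln(-ln S(t)) = shape * ln(t) - shape * ln(scale).

    cumulative_revision holds proportions strictly between 0 and 1.
    """
    xs = [math.log(t) for t in years]
    ys = [math.log(-math.log(1.0 - f)) for f in cumulative_revision]
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    shape = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sum(
        (x - mean_x) ** 2 for x in xs
    )
    intercept = mean_y - shape * mean_x  # equals -shape * ln(scale)
    scale = math.exp(-intercept / shape)
    return scale, shape


def annual_failure_probability(year, scale, shape, plateau_year=5):
    """Probability of failure during `year` (1-based) given survival to its start.

    Years beyond plateau_year reuse the plateau-year probability.
    """
    y = min(year, plateau_year)
    return 1.0 - weibull_survival(y, scale, shape) / weibull_survival(y - 1, scale, shape)


# Hypothetical cumulative proportions revised at years 1 to 5.
YEARS = [1, 2, 3, 4, 5]
CUMULATIVE = [0.015, 0.022, 0.028, 0.033, 0.037]
scale, shape = fit_weibull(YEARS, CUMULATIVE)
annual = [annual_failure_probability(y, scale, shape) for y in range(1, 11)]
```

A fitted shape parameter below 1 corresponds to a hazard that is highest in the first year and declines thereafter, which is the pattern reported for each stratum.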

The effectiveness of each surgical procedure was based on the quality-adjusted life-years associated with each procedure. This measure assigns a QoL weight to each year of followup. The QoL values range from 0 (death) to 1 (perfect health) and reflect the average QoL associated with that health state. The QoL weights for patients with advanced OA of the hip and patients with successful primary THA were obtained from the literature [13, 20, 28, 29, 35]. The QoL values for patients with successful MoM HRA, conversion from HRA to THA, and revision THA were derived from literature comparing each of these health states to patients with primary THA [5, 25, 43]. Perioperative morbidity and recovery were captured by applying a lower QoL for a defined period of time after each surgical intervention (longer for revision than primary procedures). QoL weights that are measured by methods that reflect patient preferences for a health state are described as utilities in the health economics literature.

Costs incorporated into the model included both hospital and professional fees for primary THA, revision THA, MoM HRA, and conversion from HRA to THA. Hospital costs were based on average Medicare payments for diagnosis-related groups 544 (primary lower extremity arthroplasty procedures) and 545 (revision lower extremity arthroplasty procedures) for fiscal year 2008. Similar to previously published cost-effectiveness analyses [40, 48], Medicare reimbursement was chosen (even though the patient population being studied included men and women older than 50 years) because it more closely reflects the actual costs [24] associated with HRA and THA procedures than private payer reimbursement, which is based on a negotiated rate that often exceeds the true costs of the procedure. Revision THA procedure costs were further delineated by procedure complexity, such as isolated femoral component revision, acetabular component revision, both component revision, or femoral head and liner exchange only, based on previously published data [9]. Device costs for primary THA, revision THA, and HRA were obtained from published sources [38]. Costs associated with ambulatory visits and radiographs were also included in the analysis, based on average professional fees for evaluation and management services and both professional and technical fees for hip radiographs [12].

The clinical course for patients who receive MoM HRA was compared to the clinical course for patients who receive THA by comparing the cumulative discounted total quality-adjusted life-years (QALYs) and cumulative costs of MoM HRA with the cumulative discounted total QALYs and cumulative costs of THA. The measures used in the comparison were the incremental QALYs (a measure of effectiveness), the incremental costs, and the incremental cost-effectiveness ratio (ICER), which is the ratio of the incremental costs to the incremental effectiveness. In accordance with the recommendations of the Panel on Cost-Effectiveness [21], we discounted all costs and utilities and report reference case estimates for a discount rate of 3%. The base case estimates for the probabilities, utilities, and costs were derived from the literature (Table 1).
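The arithmetic behind these measures is sketched below under stated assumptions: the first year is left undiscounted (a convention chosen here for illustration, not one stated in the text), the discount rate is passed in as a parameter, and all numbers in the example are hypothetical.

```python
def discounted_total(yearly_values, rate):
    """Present value of a stream of yearly values.

    Assumes the first year (index 0) is undiscounted.
    """
    return sum(v / (1.0 + rate) ** t for t, v in enumerate(yearly_values))


def icer(cost_new, qaly_new, cost_ref, qaly_ref):
    """Incremental cost-effectiveness ratio: incremental cost per QALY gained."""
    d_cost = cost_new - cost_ref
    d_qaly = qaly_new - qaly_ref
    if d_qaly == 0:
        raise ValueError("ICER is undefined when incremental QALYs are zero")
    return d_cost / d_qaly


# Hypothetical example: two years at utility 1.0, discounted at 25% to keep
# the arithmetic easy to follow (1.0 + 1.0 / 1.25 = 1.8 QALYs).
two_year_qalys = discounted_total([1.0, 1.0], rate=0.25)

# Hypothetical lifetime totals: the new strategy costs $2,000 more and yields
# 0.2 more QALYs, for an ICER of about $10,000 per QALY gained.
example_icer = icer(cost_new=12_000.0, qaly_new=10.2, cost_ref=10_000.0, qaly_ref=10.0)
```

A negative ICER needs separate interpretation, because it can arise either from a strategy that is cheaper and more effective or from one that is costlier and less effective.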

Table 1 Variables used in cost-effectiveness analysis and ranges for sensitivity analyses

One-way sensitivity analyses were performed for each of the independent variables (Table 1). In these analyses, each variable was varied from 50% to 200% of the point estimate (Table 1), per decision analysis modeling convention, and the impact of each variable on the ICER was calculated. One-way sensitivity analyses for selected variables (discount rate, difference in utility [QoL] after conversion from HRA to THA compared to primary THA, and incremental cost of HRA compared to THA) were calculated for each gender and age stratum. One-way sensitivity analyses were used to identify thresholds for selected independent variables where MoM HRA would be cost-saving compared to THA and thresholds where MoM HRA would be considered cost-effective based on an ICER of $50,000 per QALY. Two-way sensitivity analyses were performed to identify ranges for the incremental cost of HRA compared to THA and difference in utility (QoL) after conversion from HRA to THA compared to primary THA where MoM HRA or THA was optimal based on net monetary benefits [49], using a willingness to pay threshold of $50,000 per QALY gained.
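The net monetary benefit criterion used in the two-way analysis can be sketched as follows. The incremental net monetary benefit is the willingness to pay multiplied by the incremental QALYs, minus the incremental cost; MoM HRA is preferred when this quantity is positive. The function `toy_increment` is a hypothetical linear stand-in for the full Markov model, its coefficients are invented for illustration, and assigning ties to THA is an arbitrary choice made here.

```python
def incremental_nmb(d_qaly, d_cost, wtp=50_000.0):
    """Incremental net monetary benefit of MoM HRA relative to THA."""
    return wtp * d_qaly - d_cost


def optimal_strategy(d_qaly, d_cost, wtp=50_000.0):
    """Return the preferred strategy; ties go to THA (an arbitrary choice)."""
    return "MoM HRA" if incremental_nmb(d_qaly, d_cost, wtp) > 0 else "THA"


def toy_increment(extra_implant_cost, qol_loss_after_conversion):
    """Hypothetical stand-in for the Markov model: returns (d_cost, d_qaly)."""
    d_cost = extra_implant_cost + 500.0
    d_qaly = 0.10 - 0.5 * qol_loss_after_conversion
    return d_cost, d_qaly


def two_way_grid(costs, qol_losses, model=toy_increment, wtp=50_000.0):
    """Preferred strategy for every combination of two input variables.

    Rows follow qol_losses and columns follow costs.
    """
    grid = []
    for q in qol_losses:
        row = []
        for c in costs:
            d_cost, d_qaly = model(c, q)
            row.append(optimal_strategy(d_qaly, d_cost, wtp))
        grid.append(row)
    return grid


grid = two_way_grid(costs=[1_000.0, 6_000.0], qol_losses=[0.0, 0.2])
```

When the incremental QALYs are positive, a positive incremental net monetary benefit is equivalent to an ICER below the willingness to pay threshold; the net benefit form has the advantage of remaining well defined when the incremental QALYs are zero or negative.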

A probabilistic sensitivity analysis (Monte-Carlo sensitivity analysis) was performed to evaluate the combined impact of the individual independent variables jointly on the incremental costs, incremental QALYs gained, and the ICERs. In this analysis, each variable was represented as a probability distribution (Table 2) and a random sample for each variable was drawn from its probability distribution and entered into the model. The incremental costs, incremental QALYs gained, and the ICERs and their 95% confidence intervals were calculated from a Monte Carlo simulation using 10,000 samples for each gender and age stratum. An acceptability curve showing the proportion of samples for each gender and age stratum that were below a given willingness to pay threshold was calculated and graphed for a willingness to pay range of $0 to $100,000.
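A simplified sketch of how an acceptability curve is built from simulation output follows. Two simplifications are assumed: the incremental cost and incremental QALYs are drawn directly from hypothetical normal distributions instead of resampling every input in Table 2 and rerunning the model, and a sample is counted as acceptable when its incremental net monetary benefit is non-negative, which is a common way to handle samples with zero or negative incremental QALYs and is not necessarily the exact rule applied in this study.

```python
import random


def draw_samples(n, seed=0):
    """Draw n hypothetical (incremental cost, incremental QALY) pairs.

    The means and standard deviations are placeholders for illustration.
    """
    rng = random.Random(seed)
    return [(rng.gauss(2_000.0, 500.0), rng.gauss(0.05, 0.04)) for _ in range(n)]


def acceptability_curve(samples, thresholds):
    """Proportion of samples that are acceptable at each willingness to pay.

    A sample is acceptable when wtp * d_qaly - d_cost >= 0.
    """
    curve = []
    for wtp in thresholds:
        accepted = sum(1 for d_cost, d_qaly in samples if wtp * d_qaly - d_cost >= 0)
        curve.append(accepted / len(samples))
    return curve


# 10,000 samples, evaluated from $0 to $100,000 per QALY in $10,000 steps.
samples = draw_samples(10_000, seed=1)
thresholds = list(range(0, 100_001, 10_000))
curve = acceptability_curve(samples, thresholds)
```

Each point on the resulting curve is read as the proportion of simulated samples in which MoM HRA would be considered cost-effective at that willingness to pay.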

Table 2 Variables and distributions for probabilistic sensitivity analysis

Results

Over a 30-year followup period, MoM HRA patients would experience modestly higher lifetime gains in quality-adjusted life-years and have moderately higher health care costs compared to patients who have primary THA, depending on their age and gender. The cost-effectiveness of MoM HRA compared to THA varies markedly by age and gender, with preferable (lower) incremental cost-effectiveness ratios in men compared to women and in younger patients compared to older patients: the lowest ICER was $28,614 for men aged 55–64 and the highest was $2,483,435 for women aged 65–74, an 87-fold difference (Table 3). The ICER was less than the $50,000 per QALY threshold for three of the age and gender strata studied: men less than age 55 ($48,882/QALY), men ages 55–64 ($28,614/QALY), and women less than age 55 ($47,468/QALY) (Table 3).

Table 3 Cost-effectiveness of HRA compared to primary THA by gender and age strata

The variables that had the most influence on the model results were the annual probability of MoM HRA and THA failure, the cost of MoM HRA and THA, operative mortality of MoM HRA and THA, and the QoL after conversion from HRA to THA (Fig. 2). The one-way sensitivity analysis of the ICER to the difference in QoL after conversion from HRA to THA compared to primary THA indicated that the ICER for MoM HRA was very sensitive to the differences in QoL after conversion from HRA to THA for both men and women less than age 55, but not for men age 55–64 (Fig. 3). MoM HRA would be cost-saving over the 30-year time horizon if the incremental cost of the HRA implants compared to the primary THA implants was less than $313 for men younger than 55 years, less than $711 for men aged 55 to 64 years, and less than $175 for men aged 65 to 74 years (Fig. 4). The two-way sensitivity analysis indicated that the impact of the incremental cost of MoM HRA and the difference in QoL on the cost-effectiveness of MoM HRA varied depending on age and gender. In general, over a wide range of values for the QoL reduction after HRA conversion and the incremental cost of HRA conversion, MoM HRA was more favorable compared to THA for men than for women and for younger patients (age less than 55) compared to older patients (age 65 or older) (Fig. 5A–D).

Fig. 2

One-way sensitivity analyses of ICER to probabilities of clinical outcomes, costs, and QoL are shown. The width of each bar indicates the range of the ICER as each independent variable changes over its range. The upper value for the ICER is over $7,627,147 at the upper value of the annual probability of HRA failure (0.0225). The graph shows that the factors that have the greatest impact on the model results are the probability of HRA failure, cost of HRA and primary THA, probability of primary THA failure, probability of operative death from HRA and primary THA, and quality of life after conversion of HRA to THA.

Fig. 3

A graph shows a one-way sensitivity analysis to difference in QoL after conversion from HRA to THA compared to primary THA by gender and age strata. The ICER increased rapidly with small differences in the quality of life after conversion of HRA to THA compared to primary THA for men less than age 55 and women less than age 55. Men age 55 to 64 had a more favorable (lower) ICER with a much smaller change in ICER as the difference in quality of life after conversion from HRA to THA increased.

Fig. 4

The graph shows a one-way sensitivity analysis to incremental cost of HRA compared to THA by gender and age strata. For both men and women, there is a linear relationship of the ICER to the incremental costs of MoM HRA implants. MoM HRA would be cost saving (ICER intercept = 0) if the incremental cost of MoM HRA were less than $313 for men less than age 55 years, less than $711 for men age 55 to 64 years, and less than $175 for men aged 65 to 74 years. For women in each age stratum, the costs of the MoM HRA treatment strategy are higher than the costs of the THA at every value of incremental cost of the MoM HRA implant compared to THA and there is no cost-saving threshold. In women less than age 55, the ICER of MoM HRA is less sensitive to the incremental cost of the HRA implants compared to THA, due to the higher probability of HRA failure in women than in men.

Fig. 5A–D

These graphs show two-way sensitivity analyses of incremental cost of HRA compared to primary THA and difference in QoL after conversion from HRA to THA compared to primary THA for (A) men younger than 55 years, (B) men aged 65 to 74 years, (C) women younger than 55 years, and (D) women aged 65 to 74 years. The graph area shows the combination of the incremental cost of HRA and difference between QoL after conversion from HRA to THA and primary THA where MoM HRA (black) or primary THA (white) is optimal based on net monetary benefits analysis with a willingness to pay threshold of $50,000 per QALY. In general, over a wide range of values for the QoL reduction after conversion from HRA to THA and the incremental cost of HRA conversion, MoM HRA was more favorable compared to THA for men than for women (Fig. 5A versus 5C and Fig. 5B versus 5D) and for younger patients (age less than 55) compared to older patients (age 65 or older) (Fig. 5A versus 5B, and Fig. 5C versus 5D).

The probabilistic sensitivity analysis demonstrated wide variation in the ICERs due to the overall simultaneous variation in the many underlying factors that may influence the clinical effectiveness and costs of MoM HRA and THA (Table 3). The acceptability curves can be interpreted as the probability (or confidence) that the ICER is less than a certain willingness to pay threshold. The probabilities that the ICERs are less than or equal to $100,000 per QALY were 75% or less for all strata (Fig. 6), indicating that variation in costs of HRA, failure rates of HRA and THA, and quality of life difference after conversion of HRA to THA have a large impact on the cost and clinical effectiveness of MoM HRA compared to THA. However, it should be noted that the impact is similar for each age and gender stratum and unlikely to change the age- and gender-specific ranking of the incremental cost and clinical effectiveness of MoM HRA compared to THA.

Fig. 6

An acceptability curve from the probabilistic sensitivity analysis shows the probability that ICER is below a particular willingness to pay threshold based on the simulation using 10,000 samples for each gender and age stratum. The probability (confidence) that the ICER was less than or equal to $100,000 per QALY gained was only 63% for men less than age 55, 75% for men ages 55–64, and 68% for women less than age 55. The probabilities were lower for the remaining three strata. The uncertainty illustrated by these acceptability curves indicates that variation in costs of HRA, failure rates of HRA and THA, and quality of life difference after conversion of HRA to THA have a large impact on the comparative clinical and cost-effectiveness of MoM HRA.

Discussion

Despite the widely reported success of THA using conventional implants [7, 14, 16, 33, 42], new techniques and technologies are constantly being introduced into the marketplace, with the goal of improving clinical outcomes and reducing failure and reoperation rates. When evaluating any new technique or technology for use in clinical practice, it is important to consider the potential clinical benefits, risks, and economic costs associated with its use, preferably in comparison to the gold standard. MoM HRA offers potential advantages over conventional THA in terms of femoral bone preservation and ease of future revision surgery, especially in younger, more active patients who are more likely to require revision surgery. However, the benefits of MoM HRA compared to primary THA for patients with advanced OA of the hip have not been conclusively demonstrated in clinical trials or long-term observational cohort studies. We used decision analysis to compare the expected gains in quality of life, increase in costs, and cost-effectiveness of MoM HRA by age and gender, to identify key factors that influence the cost and clinical effectiveness of MoM HRA compared to THA, and to quantify the uncertainty in these estimates. Our decision analysis used data on 5–7 year outcomes of MoM HRA and THA by age and gender from the AOA national registry based on 109,972 THA patients and 9,956 MoM HRA patients. National joint registry outcomes are more likely to represent clinical practice in the community and are less subject to the selection bias and referral bias that might influence outcomes in studies from single centers or academic institutions.

While our study provides novel information regarding the comparative effectiveness of MoM HRA and THA, the limitations of the methodology should be considered when interpreting the results. As is true with any decision analysis model, the validity and generalizability of the results are limited by the availability and accuracy of the data used in the analysis. For instance, although long-term implant survival is available for THA, only midterm survival is available for MoM HRA. Furthermore, there are no direct estimates of QoL following successful HRA or conversion of HRA to THA, so these values were derived from comparisons of QoL and function in patients with HRA and THA. While this introduces uncertainty into the model, sensitivity analysis was used to test the robustness of the model results and the conclusions. One of the advantages of decision analysis modeling is the ability to use sensitivity analysis to determine threshold values for critical input variables (e.g., age, risk of complications, cost) which influence the comparative effectiveness of each treatment option. Moreover, by including a wide range of values for the model variables in our probabilistic sensitivity analysis, our study provides a more realistic estimate of the true uncertainty of the comparative effectiveness of MoM HRA compared to THA.

Our results indicate MoM HRA could be both clinically advantageous and cost-effective in appropriately selected men under the age of 65 years and women under the age of 55 years, when considering the initial and subsequent risks, costs, and benefits accrued over a 30-year period. McKenzie et al. [35] previously evaluated the cost-effectiveness of MoM HRA compared to “watchful waiting” and THA in two groups of patients who were likely to outlive the lifespan of their prosthesis: patients younger than 65 years and those older than 65 years who participated in activities predicted to shorten the lifespan of their prosthesis. Data were obtained from an extensive literature search, and costs were obtained from the British National Health Service price index. The investigators found that THA dominated MoM HRA throughout the 20-year followup period of the Markov model, due to the higher cost of MoM HRA and also the higher revision rate resulting in lower quality-adjusted life-years. MoM HRA became more cost-effective as the revision rate of THA increased or the revision rate of MoM HRA procedures decreased. An annual revision rate for MoM HRA of 1.52% was used based on a 1996 study by McMinn et al. [36]. This was compared to a revision rate of 1.36% for THA for active, young patients and 1.14% for older, less active patients. Our study uses more recent data with lower age- and gender-specific failure rates for both MoM HRA and THA obtained from a large, national joint replacement registry and explores a larger number of potential factors that may influence the comparative effectiveness of MoM HRA compared to THA.

Our probabilistic sensitivity analysis quantifies the simultaneous impact of uncertainty in 35 independent variables (15 probabilities of clinical outcomes, 12 quality of life utilities, and eight cost estimates) on the incremental effectiveness, incremental costs, and ICER of MoM HRA compared to THA. The acceptability curves provide an upper bound for the confidence that the ICER is less than $100,000 per QALY gained. Thus, we can only be 63% confident that the ICER is less than $100,000 per QALY for men less than age 55, 75% confident for men age 55–64, and 68% confident for women less than age 55. The limited information about the underlying parameters that could influence the comparative effectiveness of MoM HRA versus THA results in the wide variation in our estimate of the ICER and emphasizes the need to include measurements of quality of life and resource use in future studies of the clinical outcomes of HRA and THA.

New surgical techniques and technologies are constantly being introduced into orthopaedic practice in the United States, many of which offer the promise of better clinical outcomes, often at a higher cost. In an era of limited healthcare resources, it is imperative to consider the comparative clinical and cost-effectiveness of new interventions and technologies vis-à-vis the gold standard technique. This is especially important in the field of hip reconstructive surgery, where the gold standard treatment (THA) has been associated with excellent patient outcomes and long-term durability. Given the higher costs associated with MoM HRA implants and the uncertainty that exists with respect to the downstream clinical risks and benefits associated with this new technology, the results of our study offer clinicians, patients, and policy makers the opportunity to consider the incremental risks, benefits, and costs that influence the comparative effectiveness of MoM HRA and THA.