Background

International arthroplasty registries routinely collect outcomes data on prosthesis failure and revision joint replacement, and numerous registries additionally administer patient-reported outcome measures (PROMs) to provide a comprehensive picture of surgical outcomes [1, 2]. The collection of patient-reported outcomes frequently includes the assessment of pain, function, and quality of life using validated instruments [3]. It is well recognised that PROMs can be used to support clinical care [4, 5]; for example, PROMs can be used to monitor improvements in health outcomes and to communicate patient progress. They may also be valuable for flagging suboptimal patient outcomes after joint replacement, enabling limited resources for post-operative follow-up to be directed to patients with the greatest clinical need [6, 7]. This is particularly pertinent as rates of joint replacement continue to grow internationally [8,9,10], stretching health system capabilities.

Using national registry data, we have previously shown that worse joint-specific and generic PROMs scores (derived from either single-item or multi-dimensional instruments) at six months after primary total knee replacement were strongly associated with a heightened risk of early revision surgery within two years [11]. Patients who did not achieve thresholds for clinically important improvement in pain, function, or quality of life were most likely to undergo early revision [11], providing practical screening guidance for surgeons [12]. Whether different types of PROMs instruments can similarly identify patients at greater risk of early revision hip replacement is not well understood. Several studies have demonstrated associations between poor PROMs scores and the risk of revision hip replacement, but these have largely focused on hip-specific instruments [6, 13,14,15,16] or revision outcomes beyond two years after the primary procedure [6, 15,16,17]. This study aimed to determine whether hip-specific and generic PROMs scores are associated with early revision hip replacement (defined as revision surgery performed six to 24 months after the primary procedure).

Methods

Study design

This study is an analysis of national registry data and is reported according to the REporting of studies Conducted using Observational Routinely-collected Data (RECORD) checklist [18].

Data sources

The Australian Orthopaedic Association National Joint Replacement Registry (AOANJRR) is a national clinical quality registry that collects data on all joint replacements performed in Australia, with well-established data validation procedures [19]. It has captured over 1.85 million joint replacement procedures, with full national coverage since 2003 [19]. The AOANJRR routinely collects data on primary and revision hip replacement (date, side, type of procedure, diagnosis), age, gender, body mass index (BMI) and American Society of Anesthesiologists (ASA) grade. Additionally, pre- and post-operative PROMs data collection was undertaken by the Arthroplasty Clinical Outcomes Registry National (ACORN) from 2013 to 2018 and has been undertaken by the AOANJRR since 2018. Pre-operative PROMs data were collected within three months prior to surgery and 6-month post-operative data were collected between 5 and 8 months after surgery, to maximise completion rates. ACORN collected PROMs data from patients undergoing primary hip replacement at nine hospitals [20]. The AOANJRR collects PROMs data from patients undergoing primary hip replacement, using methods reported previously [21]. The data used for this study were collected from all 218 hospitals participating in the AOANJRR PROMs program at the time of data analysis (over 300 hospitals contribute data to the AOANJRR but not all hospitals participate in the PROMs program). Person-level linkage of PROMs data to AOANJRR revision surgery data was undertaken through matching patient name, date of birth, operated joint and operated side data. This linkage occurs regularly as part of usual AOANJRR processes. Statisticians at the AOANJRR had full access to all data used for this study.
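As a rough illustration of deterministic person-level linkage on the four matching variables named above, the following Python sketch joins PROMs records to revision records using a normalised key. The field names, the normalisation rules and the sample records are hypothetical; the registry's actual linkage procedure is not described here beyond the variables that are matched.

```python
from datetime import date

def link_key(record):
    """Build a normalised matching key (field names are hypothetical)."""
    # Fold case and collapse repeated whitespace so trivial differences still match
    name = " ".join(record["name"].lower().split())
    return (name, record["dob"], record["joint"].lower(), record["side"].lower())

def link_proms_to_revisions(proms_records, revision_records):
    """Attach the revision date (or None) to each PROMs record with a matching key."""
    revisions = {link_key(r): r["revision_date"] for r in revision_records}
    return [
        {**p, "revision_date": revisions.get(link_key(p))}
        for p in proms_records
    ]

# Made-up records, for illustration only
proms = [
    {"name": "Jane  Citizen", "dob": date(1950, 3, 1), "joint": "Hip",
     "side": "Left", "ohs_6m": 21},
    {"name": "John Smith", "dob": date(1948, 7, 9), "joint": "hip",
     "side": "right", "ohs_6m": 44},
]
revisions = [
    {"name": "jane citizen", "dob": date(1950, 3, 1), "joint": "hip",
     "side": "left", "revision_date": date(2021, 5, 17)},
]
linked = link_proms_to_revisions(proms, revisions)
```

In this sketch, a PROMs record with no matching revision record simply carries `revision_date` of `None`, which corresponds to a non-revised procedure.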

Patient-reported outcome measures

Hip-specific and generic PROMs instruments were administered to patients pre- and post-operatively. The instruments administered by the AOANJRR and ACORN at each time point, and completion rates for each instrument, are summarised in the Additional file (Table A1). A hip pain visual analogue scale (VAS) ranging from 0 (no pain) to 10 (worst pain imaginable) was used to assess pain over the previous seven days. A low back pain VAS (0 (no pain) to 10 (worst pain imaginable)) was also administered. The 12-item Oxford Hip Score was used to assess hip-related pain and function (0 (worst) to 48 (best)) [22]. The 12-item HOOS-12 score was administered as an optional measure, given limited evidence of its measurement performance [23, 24]. It provides hip-related pain, function and quality of life domain scores and a summary score (each 0 (worst) to 100 (best)). The EQ-5D-5L instrument was used to evaluate quality of life [25]. An EQ-5D-5L utility score can be generated using country-specific preference weights; utility scores commonly range from less than 0 (indicating quality of life worse than death) to 1.00 (full quality of life). The EQ VAS was used to capture self-reported health (0 (worst health) to 100 (best health)). Three expectation items were also administered pre-operatively for expected hip pain (0 (no pain) to 10 (worst pain)), health (0 (worst health) to 100 (best health)), and mobility (5-point scale from ‘no problems’ to ‘severe problems’) in six months’ time. A perceived change question (How are the problems now with your hip on which you had surgery, compared to before you had your operation?) and a satisfaction question (How satisfied are you with the results of your hip replacement?) were also administered post-operatively, with five response options ranging from ‘much better’ to ‘much worse’ and ‘very dissatisfied’ to ‘very satisfied’, respectively.

Study cohort

Between January 2013 and December 2022, PROMs data for 34,473 primary total hip replacement (THR) procedures were available from ACORN and the AOANJRR and linked to AOANJRR data on revision hip replacement (Fig. 1). We considered patients who provided post-operative PROMs data for at least one instrument and either received revision hip replacement (of any type and for any diagnosis) within six to 24 months after the primary procedure or did not receive revision hip replacement but were alive at 24 months or the end of the follow-up period (27 April 2023). Consistent with the methods used previously [11], we excluded those who did not provide post-operative PROMs data, had not yet reached 6 months post-operatively, had died within 24 months without receiving revision hip replacement, or had undergone revision prior to completing post-operative PROMs. The latter group was excluded as they did not reach the post-operative PROMs follow-up point. As shown in Fig. 1, we excluded data for 13,237 primary THR procedures, leaving data from 21,236 procedures for analysis.

Fig. 1 Study cohort

Data analysis

Pre- and post-operative scores for the Oxford Hip Score and HOOS-12 were computed according to published algorithms [23, 26], and EQ-5D-5L utility scores were calculated using Australian preference weights [27]. Demographic and clinical data were analysed descriptively. Differences in PROMs scores (pre-operative, post-operative and change scores) between patients who received revision hip replacement and those who did not were evaluated using independent t-tests or chi-square tests, as appropriate. A confidence interval calculator [28] was used to estimate the likelihood of revision for patients who were ‘dissatisfied’ or ‘very dissatisfied’ at six months versus those who were ‘satisfied’ or ‘very satisfied’, and for patients who perceived they were ‘a little worse’ or ‘much worse’ at six months versus those who were ‘a little better’ or ‘much better’. Poisson regression models with robust error variance were used to calculate the relative risk (RR) of revision hip replacement for a one-unit increase in post-operative PROMs score. The models accounted for varying follow-up times [29]. We have used this statistical approach previously for revision knee replacement outcomes [11]. Poisson regression models with robust error variance were also used to evaluate whether clinically important improvement (defined using published anchor-based minimal important change estimates for each PROM instrument: 2 points for hip pain [30, 31], 12.4 points for Oxford Hip Score [32], 19.2 points for HOOS-12 pain [33], 15.7 points for HOOS-12 function [33], 17.2 points for HOOS-12 quality of life [33], 17.9 points for HOOS-12 summary [33], 0.41 utility units for EQ-5D-5L [34], and 9.34 points for EQ-VAS [34]) was associated with early revision. Patients who met the minimal important change threshold were the reference group (relative risk of 1.00). For all PROMs scores, both unadjusted models and models adjusted for age, gender, and pre-operative PROM score were fitted. Statistical analysis was performed using SAS software version 9.4 (SAS Institute Inc., Cary, North Carolina), with a significance threshold of 0.05.
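The classification of patients against the minimal important change thresholds listed above can be sketched in Python as follows. This is an illustrative sketch only (the study analysis was performed in SAS): the dictionary keys and function name are hypothetical, and treating a change exactly equal to the threshold as having met it is an assumption, as the handling of boundary values is not stated.

```python
# Minimal important change (MIC) thresholds quoted in the Data analysis section.
# Each entry is (threshold, direction): direction is +1 where higher scores are
# better and -1 where lower scores are better (hip pain VAS).
# The instrument keys are hypothetical labels chosen for this sketch.
MIC = {
    "hip_pain_vas": (2.0, -1),
    "oxford_hip_score": (12.4, +1),
    "hoos12_pain": (19.2, +1),
    "hoos12_function": (15.7, +1),
    "hoos12_qol": (17.2, +1),
    "hoos12_summary": (17.9, +1),
    "eq5d5l_utility": (0.41, +1),
    "eq_vas": (9.34, +1),
}

def achieved_mic(instrument, pre_score, post_score):
    """Return True if the pre-to-post change meets the MIC threshold.

    Assumption: a change exactly equal to the threshold counts as achieved.
    """
    threshold, direction = MIC[instrument]
    improvement = direction * (post_score - pre_score)
    return improvement >= threshold

# Illustrative scores only, not study data
ohs_improved = achieved_mic("oxford_hip_score", pre_score=20, post_score=35)
pain_improved = achieved_mic("hip_pain_vas", pre_score=7, post_score=6)
```

Here a 15-point gain in Oxford Hip Score meets the 12.4-point threshold, whereas a one-point reduction in hip pain VAS does not meet the two-point threshold.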

Results

Patients receiving revision surgery

Within the cohort, 88 primary THR procedures were revised within six to 24 months (Fig. 1). The median (IQR) time from primary total hip replacement to revision was 367 (259–556) days and the median (IQR) time from post-operative PROMs completion to revision was 191 (68–383) days. The most common reason for revision hip replacement was loosening (n = 24, 27%), followed by prosthesis dislocation (n = 19, 22%), infection (n = 18, 21%), fracture (n = 11, 13%), and pain (n = 5, 6%). Revision of the acetabular component was most common (n = 29, 33%), followed by revision of the femoral component (n = 25, 28%) and head/insert revision (n = 22, 25%).

Pre-operative characteristics and patient-reported outcome measure scores

The demographic and clinical characteristics of patients who underwent revision surgery and those who did not are presented in Table 1. Both groups were similar with respect to average age, proportion of females, average BMI, and ASA grade. Osteoarthritis was the most common primary diagnosis for both groups. Patients in the revised group had more back pain and worse HOOS-12 scores before surgery, compared with the non-revised group (Table 1). However, the between-group differences were small (< 1 point difference in low back pain VAS; 5.8–7.5 point difference in HOOS-12 subscale or summary scores) and unlikely to be clinically important with respect to thresholds for minimal important change. All other pre-operative PROMs scores were comparable between groups.

Table 1 Comparison of pre-operative status for primary total hip replacement patients

Associations between post-operative patient-reported outcomes and early revision

Table 2 presents the post-operative PROMs scores for patients who received revision surgery and those who did not. Patients who underwent revision demonstrated significantly greater hip pain, greater low back pain, poorer hip-related function and hip-related quality of life, and poorer health and quality of life scores at six months. Effect sizes for the between-group differences ranged from − 1.51 to 0.95. Patients who had early revision demonstrated significantly smaller post-operative improvements in all PROMs scores than those who did not receive revision, with the exception of low back pain for which both groups reported little improvement (Table 2). Apart from low back pain, the magnitude of mean improvement in PROMs scores for the early revision group ranged from 52% to 75% of the mean improvement reported by the non-revised group.

Table 2 Comparison of patient-reported outcome scores after primary total hip replacement

Between-group differences in patient-perceived change were also evident at six months (Fig. 2). Of those revised, 73% perceived their hip was ‘a little better’ or ‘much better’ (versus 97% of the non-revised group) and 23% described their hip as ‘a little worse’ or ‘much worse’ (versus 1% of the non-revised group). Patients who perceived their hip was worse at six months were significantly more likely to undergo early revision than those who perceived their hip was improved (unadjusted RR 19.62, 95%CI 11.33 to 33.98) (Table A3, Additional file). There were also clear differences in post-operative satisfaction. Sixty per cent of patients who received revision were ‘satisfied’ or ‘very satisfied’ with the results of their primary hip replacement (compared to 92% in the non-revised group) and 28% reported they were ‘dissatisfied’ or ‘very dissatisfied’ at this timepoint (Fig. 3). Patients who were dissatisfied at six months were, on average, ten times more likely to undergo early revision (unadjusted RR 10.18, 95%CI 6.01 to 17.25), compared to those who were satisfied (Table A2, Additional file).
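For readers unfamiliar with how an unadjusted relative risk and its confidence interval are obtained from a two-by-two table, a minimal Python sketch follows. It uses the standard log-transformation method, which is an assumption, as the method implemented by the confidence interval calculator [28] is not stated here. The counts are hypothetical and are not the study data.

```python
import math

def relative_risk(events_exposed, n_exposed, events_unexposed, n_unexposed, z=1.96):
    """Unadjusted relative risk with a Wald confidence interval on the log scale."""
    risk_exposed = events_exposed / n_exposed
    risk_unexposed = events_unexposed / n_unexposed
    rr = risk_exposed / risk_unexposed
    # Standard error of log(RR) for two independent binomial proportions
    se_log_rr = math.sqrt(
        1 / events_exposed - 1 / n_exposed
        + 1 / events_unexposed - 1 / n_unexposed
    )
    lower = math.exp(math.log(rr) - z * se_log_rr)
    upper = math.exp(math.log(rr) + z * se_log_rr)
    return rr, lower, upper

# Hypothetical counts: 20 revisions among 500 dissatisfied patients versus
# 40 revisions among 10,000 satisfied patients
rr, lower, upper = relative_risk(20, 500, 40, 10000)
```

The interval is asymmetric around the point estimate because it is constructed on the log scale, which is also why the intervals reported above are wider on the upper side.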

Fig. 2 Perceived joint change at six months after primary total hip replacement. Dark blue bars represent the non-revised group and light blue bars represent the revised group. p < 0.01 for chi-square test

Fig. 3 Self-reported satisfaction at six months after primary total hip replacement. Dark blue bars represent the non-revised group and light blue bars represent the revised group. p < 0.01 for chi-square test

The final regression models included only age and gender as covariates, as the inclusion of variables for which a pre-operative between-group difference was identified (at p < 0.05) did not change the results. Each of the post-operative PROMs scores was independently associated with revision hip replacement, with little change in relative risk estimates after adjustment for age and gender (Table 3). As an example, each one-unit increase in hip pain VAS score at six months was associated with a 31% increase in the risk of early revision in the adjusted model (adjusted RR 1.31, 95%CI 1.23 to 1.39). As higher scores represent improvement for the Oxford Hip Score, HOOS-12, EQ-5D-5L and EQ VAS instruments, a one-unit increase in these scores was associated with a significantly reduced risk of early revision (Table 3). For example, a one-unit improvement in the Oxford Hip Score was associated with a 10% reduction in the risk of revision after adjusting for age and gender (adjusted RR 0.90, 95%CI 0.89 to 0.92).
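Because these relative risks are expressed per one-unit change, it can help to see what they imply for larger differences in score. The Python sketch below assumes the log-linear relationship implied by the model form holds across the range considered. The multi-unit figures are arithmetic extrapolations of the adjusted point estimates, not results reported by the study, and they ignore the uncertainty in those estimates. The function name is hypothetical.

```python
def rr_for_change(rr_per_unit, units):
    """Relative risk implied by a change of `units` points.

    Assumes a log-linear model, so per-unit relative risks multiply.
    """
    return rr_per_unit ** units

# Adjusted RR 0.90 per one-point improvement in Oxford Hip Score
rr_ohs_10 = rr_for_change(0.90, 10)   # about 0.35 for a 10-point higher score

# Adjusted RR 1.31 per one-point increase in hip pain VAS
rr_pain_3 = rr_for_change(1.31, 3)    # about 2.25 for a 3-point higher score
```

Under these assumptions, a 10-point higher Oxford Hip Score at six months would correspond to roughly one-third the risk of early revision, and a three-point higher hip pain VAS score to a little over twice the risk.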

Table 3 Associations between post-operative scores and revision hip replacement

Associations between clinically important improvement and early revision

After adjusting for age and gender, patients who did not achieve a clinically important improvement in hip pain had a significantly higher risk of early revision, compared with those who achieved clinically important improvement (adjusted RR 3.95, 95%CI 2.30 to 6.77). A similar pattern was observed for the Oxford Hip Score, HOOS-12 pain, HOOS-12 quality of life, HOOS-12 summary, and EQ-5D-5L scores, as shown in Table 4.

Table 4 Clinically important improvement in patient-reported outcomes and early revision

Discussion

Using national registry data, this study provides new evidence that poor hip-specific and generic PROMs scores at six months after primary THR, and smaller post-operative gains in PROMs scores, are associated with a heightened risk of revision surgery within two years. Notably, patients who did not meet thresholds for clinically important improvement in hip pain, hip-related function, hip-related quality of life, or overall quality of life demonstrated a two- to five-fold greater likelihood of early revision. Augmenting our earlier findings in knee replacement [11], these data further emphasise the value of systematically collecting PROMs data before and after joint replacement surgery to flag suboptimal patient outcomes and support clinical care processes.

While early revision was an infrequent outcome in this study (impacting 0.4% of the study cohort), it still represents a considerable burden to patients and the health system at $AUD28,000-$61,000 per revision, depending on procedure complexity [35]. As such, the timely identification of patients most likely to progress to revision surgery is important. Burgeoning rates of elective joint replacement in many countries [8,9,10] necessitate approaches to post-operative patient follow-up that are less resource-intensive and amenable to large scale-up. The routine use of PROMs instruments to assess patient-centred outcomes (including via remote delivery methods, as used by the AOANJRR) is one such approach and could aid in streamlining clinical follow-up so that limited resources are better targeted to ‘high risk’ patients [6]. We have previously demonstrated that knee-specific and generic PROMs scores at six months after primary total knee replacement can identify patients at greater risk of early revision surgery [11]. In this prior work, patients who did not achieve clinically important improvement were up to eight times more likely (depending on the specific PROM instrument) to undergo revision knee replacement within two years [11]. Our present analysis confirms that six-month hip-specific and generic PROMs scores are similarly informative with respect to detecting likely progression to early revision hip replacement.

This study advances existing knowledge around poor hip-specific PROMs scores and the risk of subsequent revision surgery. Two studies from the New Zealand Joint Registry have reported that worse six-month Oxford Hip Scores were associated with a greater likelihood of revision within two years [13, 14]. Although not adjusted for potential confounders, the analysis undertaken by Rothwell et al. reported a similar association to that observed in the present study; each one-unit decrease in Oxford Hip Score was associated with a 9.7% increase in the risk of revision within two years [13]. One recent study from the AOANJRR reported a weak association between a surgeon’s 2-year cumulative percent revision rate and post-operative Oxford Hip Scores for patients who did not undergo revision, but did not examine revision outcomes at the patient level [36]. In the United States, three studies have examined longer-term PROMs collection and shown that two-year and five-year post-operative hip pain, Mayo Hip Score, and Harris Hip Score (and changes in these scores up to five years after THR) were associated with an increased risk of subsequent THR [6, 15, 16]. In Sweden, hip pain, EQ-5D utility, EQ VAS and satisfaction VAS scores at one year post-operatively were also found to be associated with longer-term revision, up to eight years after THR [17]. In the present study, we applied anchor-based thresholds for improvement, as this is the preferred psychometric approach for determining minimal important change [37]. We are not aware of any other studies that have used similar methods for examining relationships between the magnitude of post-operative improvement and early revision outcomes. Two previous studies used arbitrary cut-off scores to classify improvement in hip-specific PROMs scores at two years. The first study found that patients with either no improvement or worsening in their Mayo Hip Score had a nearly four-fold increase in the likelihood of subsequent revision, compared to patients who reported an improvement of at least 50 points (on a 0–80 scale) [16]. The second study found that patients with either no improvement or worsening in their Harris Hip Score had an 18-fold increase in the risk of subsequent revision, compared to patients who reported improvement of 51–75 points (on a 0–100 scale) [15].

For patients who progressed to early revision in our study, the median time between post-operative PROMs completion and revision surgery was approximately six months (191 days). This interval offers time for clinical assessment and, potentially, early intervention that could mitigate the need for revision. In our cohort, loosening was the most frequent indication for early revision surgery. Detecting and managing this complication early (given radiographs are not commonly obtained until 12 months post-operatively) may enable patients to avoid a protracted period of pain and impaired function. However, contemporary joint replacement pathways provide little opportunity for clinical review of patients in the first year after joint replacement, and only virtual review clinics in some settings [38, 39]. The collection of six-month PROMs data can provide an early ‘safety net’ for patients whose pain, function and quality of life have not improved as expected or for patients who are dissatisfied with their surgical outcome. Embedding pre- and post-operative PROMs collection within clinical pathways could enable direct contact or expedited review to be initiated where patients report poor post-operative scores or do not meet thresholds for expected improvement (for example, less than two-point improvement in hip pain VAS [30, 31] or less than 12-point improvement in Oxford Hip Score [32]). This approach is already being used in other clinical specialties, such as oncology care [40]. While we acknowledge that administering multiple PROMs instruments is not feasible in all settings, the single-item measures (joint pain VAS, satisfaction, and perceived change) used in our hip and knee replacement studies are simple, no-cost, license-free tools that are relatively easy to collect in clinical and registry contexts. Each item was capable of detecting patients at higher risk of early revision hip or knee surgery.

This study had several key strengths, including the use of perioperative PROMs data from a large primary THR cohort that was linked to national data on revision surgery. While earlier studies have focused on hip-specific measures [6, 13,14,15,16] or only post-operative PROMs scores [13, 14, 17], we examined a suite of commonly-used hip-specific and generic PROMs instruments and analysed pre-operative, post-operative and change scores with respect to early revision outcomes. We also recognise the study limitations. National arthroplasty registries such as the AOANJRR typically collect a limited set of demographic and clinical data; while the generalisability of the cohort is not known, the age, gender and primary diagnosis characteristics are broadly similar to those reported internationally [41, 42]. The sample size for analysis varied by PROMs instrument, given differences in the AOANJRR and ACORN PROMs programs and some missing data despite direct patient follow-up. We could only include a small number of covariates in the regression models given the number of revision events and we did not adjust for primary diagnosis given the predominance of osteoarthritis. We note the consistent findings across all PROMs instruments with respect to associations with early revision and also the stability of the relative risk estimates in our adjusted models. Together, this suggests that including other variables in the models would likely have little impact. As the AOANJRR PROMs cohort grows over time, opportunities for further multivariate analysis and stratified analysis (for example, based on revision indication) will become increasingly feasible.

Conclusions

This study demonstrates that both hip-specific and generic PROMs scores offer an opportunity to identify, in a timely manner, patients who are at greater risk of early hip revision. The routine capture of six-month PROMs data provides an efficient mechanism for post-operative patient screening, which can be used to trigger clinical review and implement greater surveillance. Our data indicate that either single-item or multi-item PROM instruments can provide an early signal for a suboptimal surgical outcome.