Introduction

Total hip arthroplasty (THA) is one of the most common orthopaedic surgical procedures and generally is associated with high success rates. A good patient outcome after orthopaedic surgery is usually dependent, among other things, on an appropriate indication for surgery [34, 37]. However, there is currently no consensus on objective indications for THA; most experts consider the severity of pain and disability to be important [33], but there is little agreement regarding the actual severity of symptoms that indicates the need for surgery [11].

From the patient’s perspective, pain, and the hope of its resolution, is one of the main driving forces for seeking care [19]. Pain is always subjective, and it has long been known that differences exist between the genders in their response to painful stimuli. Women typically have a lower pain threshold (that is, a greater sensitivity to experimentally induced pain) than men [14]. This may be mediated by sociocultural factors (including age, family history, and gender roles), psychological factors (such as anxiety, depression, cognition, behavior), or biological factors (like genetics, sex hormones, endogenous pain inhibition), all of which can also interact with each other in complex ways [27].

Gender may be an important factor to consider when examining indications for surgery, because differences in pain sensitivity may influence accepted “thresholds for surgery” in relation to pain and disability. Similarly, gender may have an influence on the interpretation of patient-rated outcomes when judging the success of surgery. Many [18, 30, 32] but not all [31] studies have shown that women have higher pain levels and worse function when they present for surgery for the treatment of degenerative musculoskeletal conditions. However, it is unclear whether this necessarily translates into a worse overall outcome after surgery or a lower level of satisfaction with treatment. Moreover, the role of baseline factors such as age, mental health status, comorbidity, body mass index (BMI), and patient expectations in explaining gender differences in patient-reported outcome measure (PROM) scores is not well known. Because these factors are themselves typically correlated with PROM scores [12, 26, 28, 39] and may differ between men and women preoperatively, they represent potential confounding effects in any analysis of gender differences in PROM scores.

We therefore asked the following questions: (1) Are there significant differences in PROM scores between men and women preoperatively and 12 months after THA? (2) Do baseline differences in age, mental health status, comorbidity, BMI, and sociodemographic characteristics explain any significant differences in PROM scores?

Patients and Methods

This was a retrospective analysis of prospectively collected data from patients undergoing THA in a tertiary care center (Specialized Orthopaedic Hospital) in Switzerland between July 2008 and December 2009. Inclusion criteria were: German-speaking, end-stage osteoarthritis requiring primary THA, and surgery planned to be carried out by one of the three most experienced hip surgeons of the clinic. The only exclusion criterion was the presence of neurological disorders. A total of 300 patients (150 men, 150 women), equivalent to 84% of all who were eligible, completed the questionnaires preoperatively and were thus included in the study. Two hundred sixty-one (86%) of those completing the baseline questionnaires (129 women, 64 ± 11 years; 132 men, 66 ± 10 years; Table 1) also completed the followup questionnaire 12 months postoperatively. The main reasons for loss to followup reported by the patients contacted by phone were a lack of time and the questionnaire battery being too long. Fewer than 3% were no longer interested or had pain for other reasons and hence did not consider it appropriate to complete the questionnaires. Two men and two women underwent revision during the study period (within 12 months). Four men underwent revision from 14 months to 5 years after the first operation.

Table 1 Baseline data of the patients

The patients had clinical and radiographic evidence of end-stage hip osteoarthritis and underwent THA using either a posterolateral or direct anterior approach. Postoperative precautions to prevent dislocation were given for the posterolateral approach, but otherwise there was no difference in the postoperative care. Typically, the patients had undergone prior conservative treatment comprising physiotherapy, oral nonsteroidal antiinflammatory drugs, and therapeutic hip injections. One to 3 weeks before admission, the patients were asked to complete a preoperative questionnaire booklet that included the instruments used in the present study (see subsequently). These were completed by the patient at home and brought into the hospital on the day of admission. Twelve months postoperatively, the questionnaire booklet was sent out by mail to those who had returned a preoperative questionnaire and had not undergone any surgery on the spine or lower extremities in the preceding 4 months with the request to complete it and return it using the stamped, addressed envelope enclosed. At its inception, the study did not require ethics committee approval in our country, because it concerned the reuse of routinely collected data with the written consent of the patient.

At baseline, patients completed crossculturally adapted and validated German versions of the following questionnaires: (1) Oxford Hip Score (OHS) [7, 29], which consists of 12 questions on hip pain and function in the past 4 weeks. Each item uses a 5-point response scale with values from 0 to 4. An overall score is created by summing the responses to each of the 12 questions. The total score can range from 0 to 48, where 0 is the worst possible score indicating severe hip symptoms and 48 is the best score suggesting excellent joint function. (2) WOMAC, a 24-item self-administered, disease-specific instrument for assessing pain, stiffness, and physical function in patients with osteoarthritis (OA) [3, 40]. We used the version with the 0 to 10 numeric scale. Five items address pain with a total score for this subscale ranging from 0 to 50. Two items address stiffness (subscale score range, 0–20 points) and 17 items assess physical function (subscale score range, 0–170). The scores for each subscale were converted to a 0 to 100 scale with the highest scores indicating the best health status. (3) SF-12, a 12-item self-administered measure of general quality of life [15, 42]. Scores are transformed into two weighted summary scores for physical function (physical component summary [PCS]) and mental health (mental component summary [MCS]). The standardized scores for the SF-12 range from 0 to 100 with higher scores indicating a better health state. We used the SF-12 MCS for comparing the mental health between genders.
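The scoring rules described above can be sketched in a few lines of Python. This is an illustrative sketch only, not the scoring software used in the study; the function names are hypothetical, and the reversal of the WOMAC raw sum assumes that raw item ratings run from 0 (no symptoms) to 10 (extreme symptoms), which the text does not state explicitly.

```python
def ohs_total(items):
    """Sum the 12 Oxford Hip Score items, each scored 0 (worst) to 4 (best)."""
    if len(items) != 12 or any(not 0 <= x <= 4 for x in items):
        raise ValueError("OHS needs 12 item scores between 0 and 4")
    return sum(items)  # 0 = severe symptoms, 48 = excellent joint function


def womac_subscale_0_100(items, n_items):
    """Rescale a WOMAC subscale (items rated 0-10) to 0-100, where 100 is best.

    Assumption: each raw item runs from 0 (no symptoms) to 10 (extreme),
    so the raw sum is reversed while rescaling.
    """
    if len(items) != n_items or any(not 0 <= x <= 10 for x in items):
        raise ValueError("expected %d item scores between 0 and 10" % n_items)
    raw_max = 10 * n_items  # 50 for pain, 20 for stiffness, 170 for function
    return 100.0 * (raw_max - sum(items)) / raw_max
```

For example, `womac_subscale_0_100([5, 5], 2)` rescales a stiffness raw sum of 10 out of 20 to the midpoint of the converted scale.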

At 12 months’ followup, the patients again completed the OHS, WOMAC, and SF-12 and also rated the acceptability of their current symptoms and change in general health by answering the following questions: (1) “If you had to spend the rest of your life with the symptoms you have now, how would you feel about it?” Response options: very satisfied, somewhat satisfied, neither satisfied nor dissatisfied, somewhat dissatisfied, very dissatisfied. (2) “Compared with 1 year ago, how would you rate your health in general now? Would you say it is: (a) much better now than 1 year ago, (b) somewhat better now than 1 year ago, (c) about the same as 1 year ago, (d) somewhat worse now than 1 year ago, (e) much worse now than 1 year ago.”

Question 1 is the Symptom-Specific Well-Being (SSWB) item and has been used by the Patient Outcome Research Teams in studies on coronary heart disease, prostate disease, and cataract disease. The SSWB was also used in the Lumbar Cluster of the American Academy of Orthopaedic Surgeons and is included in the Core Outcome Measures Index validated for use in patients with femoroacetabular impingement and in patients undergoing THA [9, 20, 21]. The health change question is question 2 of the SF-36 [43].

Descriptive data included proportions (%) for categorical data and means and SDs for the instrument scores. The significance of gender differences was analyzed using independent t-tests. Differences were also reported as mean differences and the corresponding 95% confidence intervals. Multiple regression analyses were carried out with the chosen PROMs (OHS and WOMAC) as the dependent variables and with all the variables identified as possible “confounders” (age, BMI, comorbidity, and mental health [SF-12 MCS scores] and education, civil status, and employment status [transformed as dummy variables]) being force-entered as independent variables together with gender. The assumptions of all statistical tests were verified before analysis. The aforementioned analyses were carried out using SPSS (Version 17; SPSS Inc, Chicago, IL, USA) and Medcalc (MedCalc Statistical Software, Mariakerke, Belgium). Probability values < 0.05 were considered to be statistically significant.
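As a rough illustration of how a between-group mean difference and its 95% confidence interval can be computed, the following Python sketch uses the pooled-variance standard error of an independent-samples t-test. It is not the SPSS or MedCalc procedure used in the study. The function name is hypothetical, and the critical value and p value are taken from the normal distribution rather than the t distribution, an approximation that is assumed to be acceptable only for groups of the size analyzed here (more than 100 per group).

```python
from math import sqrt
from statistics import NormalDist, mean, variance


def mean_difference_ci(group_a, group_b, level=0.95):
    """Return (difference, (lower, upper), test statistic, two-sided p value).

    Uses the pooled-variance standard error of the independent t-test.
    Assumption: the normal distribution stands in for the t distribution,
    which is only reasonable for large groups.
    """
    n_a, n_b = len(group_a), len(group_b)
    diff = mean(group_a) - mean(group_b)
    pooled_var = (
        (n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)
    ) / (n_a + n_b - 2)
    se = sqrt(pooled_var * (1 / n_a + 1 / n_b))
    z = NormalDist().inv_cdf(0.5 + level / 2)  # about 1.96 for a 95% interval
    stat = diff / se
    p_value = 2 * (1 - NormalDist().cdf(abs(stat)))
    return diff, (diff - z * se, diff + z * se), stat, p_value
```

A difference would be called statistically significant under the criterion above when the returned p value is below 0.05, which corresponds to a 95% interval that excludes zero.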

Results

Differences Between Men and Women in PROM Scores Before and After Surgery

Preoperatively, women showed worse scores than men for all of the PROMs (Table 2). Twelve months postoperatively, the absolute scores for all PROMs were not significantly different between the men and women (Table 3). The PROM improvements (preoperatively to 12 months postoperatively) were not different between men and women for the OHS, whereas women showed greater improvement in WOMAC pain (6.5; 95% confidence interval [CI], 1.1–12.0), stiffness (7.9; 1.8–14.1), and total score (5.9; 0.7–11.0) (Fig. 1). However, when adjusting for baseline scores of the corresponding instrument, there were no significant differences between men and women for any of their change scores (Table 4).

Table 2 Comparison of the unadjusted PROM scores between women and men preoperatively
Table 3 Comparison of the unadjusted PROM scores between women and men 12 months postoperatively
Fig. 1 The figure shows the change scores for the PROMs from preoperatively to 12 months postoperatively for women and men.

Table 4 PROM change scores from preoperatively to 12 months postoperatively, adjusted by baseline values
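The effect of adjusting change scores for baseline values can be illustrated with a small regression sketch in Python. The numbers below are synthetic and invented purely for illustration; they are not the study data. In this toy example the improvement depends only on the baseline score, so the group with lower baseline scores improves more in the unadjusted comparison, whereas the gender coefficient vanishes once the baseline score enters the model. The helper `ols` is a hypothetical minimal least-squares solver, not the SPSS routine used in the study.

```python
def ols(X, y):
    """Least-squares coefficients via the normal equations (X'X) b = X'y."""
    k = len(X[0])
    # Augmented matrix [X'X | X'y]
    A = [
        [sum(r[i] * r[j] for r in X) for j in range(k)]
        + [sum(r[i] * yi for r, yi in zip(X, y))]
        for i in range(k)
    ]
    for col in range(k):  # Gauss-Jordan elimination with partial pivoting
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(k):
            if r != col:
                f = A[r][col] / A[col][col]
                A[r] = [a - f * b for a, b in zip(A[r], A[col])]
    return [A[i][k] / A[i][i] for i in range(k)]


# Synthetic baseline scores (0-100, higher is better); women start lower.
men_baseline = [40, 45, 50, 55, 60]
women_baseline = [30, 35, 40, 45, 50]


def change(baseline):
    # Assumed rule for this toy example: improvement depends on baseline only.
    return 80 - 0.8 * baseline


men_change = [change(b) for b in men_baseline]
women_change = [change(b) for b in women_baseline]

# Unadjusted comparison: women appear to improve more.
unadjusted = sum(women_change) / 5 - sum(men_change) / 5

# Adjusted model: change ~ intercept + female + baseline.
X = [[1, 0, b] for b in men_baseline] + [[1, 1, b] for b in women_baseline]
intercept, female_coef, baseline_coef = ols(X, men_change + women_change)
```

Here `unadjusted` is positive, while `female_coef` is zero up to rounding error, which mirrors the pattern of a gender difference in change scores that disappears after baseline adjustment.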

No significant differences were found in the proportion of patients in each response category for the SSWB (p = 0.806) and health changes (p = 0.395) questions (Figs. 2A and 2B, respectively).

Fig. 2A−B The figure shows the global treatment outcome ratings of the women and men 12 months postoperatively. (A) If you had to spend the rest of your life with the symptoms you have now, how would you feel about it? (B) Compared to 1 year ago, how would you rate your health in general now?

Differences in Baseline Factors in Relation to Differences in PROMs Between Men and Women

At baseline, the women had lower BMI (−1.9; 95% CI, −1.0 to −2.8) and lower SF-12 MCS scores (−3.1; −0.8 to −5.4) than the men (Table 1).

Significant differences in preoperative PROM scores between men and women were present for all PROMs even when controlling for age, BMI, comorbidity, SF-12 MCS scores, and sociodemographic characteristics (Table 5).

Table 5 Results of the multiple regression analysis showing the influence of gender on each of the PROMs at baseline while controlling for the covariates comorbidity, age, BMI, SF-12 MCS, and sociodemographic differences

There was no significant difference between men and women for the condition-specific PROM scores (OHS and WOMAC) 12 months postoperatively when controlling for baseline values of the corresponding PROM, age, BMI, comorbidity, SF-12 MCS, and sociodemographic characteristics (Table 6).

Table 6 Results of the multiple regression analysis showing the influence of gender on the 12-month postoperative scores for each of the PROMs while controlling for baseline scores of the given PROM and the covariates comorbidity, age, BMI, SF-12 MCS, and sociodemographic differences

Discussion

Severe pain and disability are generally considered to be the primary indications for THA [11]. Some previous studies have reported differences in PROMs in men and women undergoing THA, and hence, sex and gender may be important factors to consider when examining indications for surgery and interpreting outcomes after surgery [18, 30, 32]. We found that women undergoing THA had worse preoperative PROM scores than did men, and these were not explained by differences in BMI, age, comorbidity, or mental health scores. Women showed a greater absolute improvement in PROM scores for many instruments, but this was predominantly the consequence of their worse preoperative scores. Controlling for all baseline variables, there were no gender-related differences in outcomes 1 year after THA.

The study had some limitations. First, we did not obtain any objective measures of the severity of hip arthritis to see whether lower baseline PROM scores were matched by more severe radiographic findings in women. Some previous studies suggest that the grade of radiographic OA is unrelated to self-rated pain and function (WOMAC scores), at least for moderate to severe OA [31], whereas others show a relationship between radiographic and clinical variables [5, 10]. Second, this study was conducted in Switzerland; differences between countries in their healthcare systems or social structure might govern whether gender differences in baseline PROMs are observed. Furthermore, other psychosociocultural factors that we did not measure such as anxiety, catastrophizing, coping strategies, and the like may have influenced the significant differences seen in baseline PROM scores. A final limitation, and one that besets other studies of this type, is the inability to interpret the gender differences in relation to “minimal clinically important differences” (MCID). To our knowledge, no cross-sectional studies have evaluated what constitutes a clinically important score difference between groups. The MCID values reported in the literature refer to changes over time (mostly for improvement) within a group, usually in relation to some external criterion of change, ie, they reflect “change” rather than “differences.” In addition, the MCID depends on the characteristics of the population and the treatment used and should be used only for general guidance in interpreting individual, not group change [24]. Depending on the method used for calculation, the MCID values reported in the literature range from 6 to 29 for WOMAC and from 1 to 11 for OHS [2, 68, 13, 16, 36, 41, 44, 45]; if these values are used to interpret the clinical relevance of our findings (despite the previously mentioned caveats), the CIs of our baseline gender differences would include clinically important differences, whereas the differences at 12 months are highly unlikely to be clinically relevant.

Our findings confirmed previous reports of greater pain and disability in women than men presenting for THA [18, 30, 32]. This finding extends to other medical and orthopaedic conditions requiring surgery and is often considered to indicate that women present for surgery at a later stage in their disease or (for various reasons) choose to wait longer than men before deciding to undergo surgery. Reasons may include a more risk-averse approach to treatment in women (persevering with conservative measures rather than undergoing the risks of surgery), barriers to access (being deterred from undergoing surgery by family, clinicians, or self), personal preferences, or familial pressures and caregiving responsibilities [22, 23]. Indeed, previous reports suggest a threefold greater underuse of arthroplasty for severe arthritis in women than men [17]. However, we cannot rule out the alternative explanations that women are simply more sensitive to the pain associated with a given severity of disease or that there are sociocultural gender differences in the reporting of pain. Social learning has a strong influence on the pain response, and it is generally perceived that men are less willing than women to report pain [38]. Some authors contend that, because PROM scores do not differ between men and women after surgery, overreporting and/or enhanced pain sensitivity in women is an unlikely explanation for the differences seen [23]; however, THA is such a successful intervention, leaving most patients with negligible pain 12 months postoperatively, that any residual differences between men and women would likely not be perceptible.

In the present study we evaluated whether the differences seen between men and women were explained by differences in various baseline factors at presentation. It is known that distressed patients have worse scores than nondistressed subjects for most PROMs [26] and that women tend to display worse preoperative mental health scores than men [25]. Hence, mental health might theoretically mediate the observed gender difference. However, this could not be substantiated in the present study; although mental health was a unique predictor of baseline PROM scores, and women presented with worse mental health than men, the differences in PROM scores remained, even after controlling for mental health scores. BMI and the number of comorbidities had a significant influence on baseline PROMs, but sex or gender remained a significant independent (unique) predictor in the model alongside these variables.

Women undergoing THA had worse preoperative PROM scores than did men, and these were not explained by differences in BMI, age, comorbidity, or mental health scores. Whether this is a reflection of sex or gender differences in pain reporting, of sensitivity to the same physical insult (such as the severity of OA), or of women simply waiting longer and developing more extreme symptoms before presenting for surgery cannot be ascertained from the present study.

However, these differences need to be considered when evaluating indications and thresholds for surgery, because many guidelines are based on the degree of pain and function on presentation [1, 35]. Further studies to elucidate the nature of the differences therefore are required; if they reflect barriers to access, these should be addressed and eliminated. Previous studies have suggested that physicians identify women’s complaints as psychosomatic more frequently than they do men’s [4]. Surgeons may be more reluctant to operate on women than men because they perceive that women are likely to have worse outcomes; however, given that we found no significant differences in patient-reported outcomes at 12 months postoperatively, these suspicions would appear to be unfounded. Women and men can be expected to benefit to a similar extent from THA.