Background

Osteoarthritis causes pain and functional incapacity. In developed societies with a high life expectancy, the prevalence of knee osteoarthritis (KOA) and hip osteoarthritis (HOA) is high, 23.9 and 10.9%, respectively [1]. Osteoarthritis entails a social and economic burden in terms of health-related quality of life (HRQoL) [2, 3] and cost of the disease [2, 4]. Estimated expenses for KOA or HOA are equal to 0.5% of Spain’s GDP [5].

Health problems with such an impact on society, and the technologies available to address them, deserve to be the focus of health technology assessment and economic evaluation, specifically to inform evidence-based decision making by health authorities. In economic evaluation it is common to conduct a cost-utility analysis (CUA), that is, a comparison of at least two alternatives in terms of costs and outcomes where the outcome measure is expressed as quality-adjusted life years (QALYs). The adjustment of quantity of life for quality of life is achieved by applying weightings that try to reflect the desirability of different health states to individuals or society; these weightings are called utilities or health state utilities [6, 7]. For example, 1 year lived in perfect health (utility = 1) implies 1 QALY, but 1 year lived in a less than perfect state (utility = 0.5, for example) implies less than one QALY (0.5 QALYs in this case). The main advantage of QALYs as a measure is that they can be used to compare different technologies and even different diseases [6]. Moreover, cost-effectiveness thresholds have been estimated and/or set in different countries as a limit for deciding which technologies are reimbursed or included in health care systems based on their cost-effectiveness [8]. For example, in Spain the most recent estimated threshold is 25,000 €/QALY, so any new technology with an incremental cost-effectiveness ratio over this threshold should not be adopted according to this study [9].
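As a minimal illustration of the arithmetic described above, the following Python sketch computes QALYs as utility multiplied by time and compares an incremental cost-effectiveness ratio with the 25,000 €/QALY threshold cited for Spain. The function names and the cost and QALY figures in the hypothetical comparison are assumptions made for illustration and are not taken from the study.

```python
def qalys(periods):
    """Sum utility x duration (in years) over a list of (utility, years) pairs."""
    return sum(utility * years for utility, years in periods)


def icer(cost_new, cost_old, qaly_new, qaly_old):
    """Incremental cost-effectiveness ratio in EUR per QALY gained."""
    delta_qaly = qaly_new - qaly_old
    if delta_qaly == 0:
        raise ValueError("ICER is undefined when the QALY difference is zero")
    return (cost_new - cost_old) / delta_qaly


THRESHOLD_EUR_PER_QALY = 25_000  # threshold cited in the text for Spain [9]

# Examples from the text: one year at utility 1 and one year at utility 0.5.
full_health = qalys([(1.0, 1)])
reduced_health = qalys([(0.5, 1)])

# Hypothetical comparison (illustrative numbers only, not study data).
example_icer = icer(cost_new=12_000, cost_old=4_000, qaly_new=6.2, qaly_old=5.8)
below_threshold = example_icer <= THRESHOLD_EUR_PER_QALY
```

The sketch only covers the case in which the new technology is both more costly and more effective; other combinations of incremental cost and effect need to be interpreted separately.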

The EQ-5D [10], a generic HRQoL questionnaire, is the tool most commonly used in Spain to measure utilities [11]. In its most recent version [12], the EQ-5D-5L questionnaire consists of two sections: a visual analogue scale to evaluate HRQoL from 0 to 100 and a questionnaire comprising 5 questions or dimensions (mobility, self-care, usual activities, pain/discomfort and anxiety/depression) with 5 response levels (from no problems to extreme problems). Combinations of these five questions yield 3125 (5^5) health states, each associated with a weighted health score called the utility index. This index varies from 1 (perfect health) to negative values (0 being the value equivalent to death) because valuation studies have found that the general population considers some states worse than death. The social value set obtained from the general population in Spain for the EQ-5D-5L has recently been published [13]. A previous article has analyzed the psychometric properties of the EQ-5D-5L in osteoarthritis patients in Spain [14]: reliability (Cronbach’s alpha = 0.86), validity (Spearman’s correlation coefficient of EQ-5D-5L index and WOMAC pain and function subscales: −0.688 and −0.782, respectively) and responsiveness (floor and ceiling effects < 3%).
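To make the structure of the descriptive system concrete, the sketch below enumerates the possible EQ-5D-5L profiles and decodes a five-digit profile into its dimension levels. It is an illustration only: the names are hypothetical, the labels for levels 2 to 4 follow the wording used later in this article (slight, moderate, severe), and the utility index itself is not computed because that requires the coefficients of the published Spanish value set [13], which are not reproduced here.

```python
from itertools import product

DIMENSIONS = ("mobility", "self-care", "usual activities",
              "pain/discomfort", "anxiety/depression")
LEVELS = {1: "no problems", 2: "slight problems", 3: "moderate problems",
          4: "severe problems", 5: "extreme problems"}


def decode_state(state):
    """Map a five-digit EQ-5D-5L profile such as '32331' to level labels."""
    if len(state) != 5 or any(ch not in "12345" for ch in state):
        raise ValueError(f"not a valid EQ-5D-5L profile: {state!r}")
    return {dim: LEVELS[int(ch)] for dim, ch in zip(DIMENSIONS, state)}


# Enumerate every possible profile: 5 levels on each of 5 dimensions.
all_states = ["".join(map(str, combo))
              for combo in product(range(1, 6), repeat=5)]
n_states = len(all_states)  # 5 ** 5
```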

In a systematic review of utilities obtained from the Spanish population [11] it was found that 94% of articles used the EQ-5D questionnaire and that health state utilities for a significant number of diseases are unknown. The highest number of utilities was collated in the hospital and specialized care settings whilst only 18% of utilities were collected in primary care [11]. This same review identified three primary papers reporting utility values obtained from the Spanish population with KOA or HOA [15,16,17]. The values varied from 0.2 in patients with total knee replacement (TKR) before surgery to 0.64 at 6 months from surgery [15].

The economic evaluation of technologies applied to osteoarthritis, be they surgical [18, 19], pharmacologic [20] or other interventions [21], is of interest for researchers, professionals and health policy decision makers [22]. In a systematic review of CUA performed in Spain [23] three studies that evaluated technologies in KOA or HOA patients were identified [24,25,26]. None of these studies used health state utilities obtained from the Spanish population, but from studies conducted in other countries.

The lack of transferability of economic evaluations between countries makes it necessary to perform studies in the local population [27]. Moreover, utilities used in economic evaluations would ideally come from studies performed in the local population or in countries with a similar context, although very often values from foreign populations found in a literature review are the only available source [7, 28]. The ultimate purpose of this study is to determine utility values for a broad variety of health states in the Spanish population with KOA or HOA, with different levels of severity and treated in different health care settings, so that they can be used in future economic evaluations in Spain. Consequently, two aims were pursued: 1) obtaining health state utilities in patients with knee or hip osteoarthritis using an observational study performed in Spain, and 2) comparing these values with others obtained in international studies with the purpose of discussing the transferability of utilities in KOA and HOA subjects.

Methods

To achieve these two aims, first, data collected in a primary study with observational and prospective design were analyzed. Second, a review of health state utilities included in published economic evaluations was performed.

Primary study

Participants

An opportunistic and consecutive sample was recruited between January and December 2015 by doctors in traumatology, rheumatology and primary care consultations from three different areas of Spain (Vizcaya, Madrid and Tenerife). To be included, patients had to be adults (> 18 years), diagnosed with KOA or HOA according to the criteria of the American College of Rheumatology [29] regardless of the severity of the disease, and had to agree to participate after being informed. Patients who did not understand Spanish, could not read or had diagnosed cognitive impairment were excluded.

Sample size was estimated to achieve other objectives pursued in the project that required more power (mapping between EQ-5D-5L and clinical questionnaires) than the descriptive analysis presented here; more information elsewhere [14, 30, 31]. It was estimated that 360 KOA or HOA patients were necessary. Assuming a loss rate of 10% for incomplete data based on prior experiences and a 75% response rate, recruitment of 712 subjects was set as an aim. In the end we were able to recruit more than needed, 758 patients with complete baseline data. Subjects were included in the study after giving their informed consent. The study received approval by the Ethics Committees for Clinical Research from the three geographical areas.

Variables

Data were collected at the time of recruitment and at 6 months by both the clinician and the patient. The following variables were included: sociodemographic variables (age, sex, region of residency, education, marital status and work situation) and clinical variables (comorbidity measured using the Charlson Index [32], weight, height, body mass index (BMI), joints affected by osteoarthritis, time since diagnosis, treatments received (pharmacologic, surgery, rehabilitation and physiotherapy) and healthcare setting where the patient was recruited). Self-perceived measures were: Oxford Knee Score (OKS) [33] or Oxford Hip Score (OHS) [34]; Western Ontario and McMaster Universities Osteoarthritis Index (WOMAC) [35]; and the EQ-5D-5L questionnaire [12] applying the value set published in Spain [13].

The OKS and OHS instruments measure the severity of symptoms in patients with KOA and HOA, respectively [33, 34]. They are comprised of 12 questions and the score varies from 0 to 48 where 0 is the worst and 48 the best such that patients can be classified into 4 groups: satisfactory joint function (40–48), mild to moderate arthritis (30–39), moderate to severe arthritis (20–29), severe arthritis (0–19) [36]. These questionnaires were recently validated in the Spanish population [30, 31].
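The four severity groups can be expressed as a simple lookup. The following sketch applies the cut-off points given above; the function name is hypothetical.

```python
def classify_oxford_score(score):
    """Return the severity group for an Oxford Knee or Hip Score (0 to 48)."""
    if not 0 <= score <= 48:
        raise ValueError("Oxford scores range from 0 to 48")
    if score >= 40:
        return "satisfactory joint function"
    if score >= 30:
        return "mild to moderate arthritis"
    if score >= 20:
        return "moderate to severe arthritis"
    return "severe arthritis"
```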

The WOMAC is a multidimensional scale comprised of 24 items that measures dimensions: pain (5 items), stiffness (2 items) and physical function (17 items) in osteoarthritis patients [35]. This study used the Likert version with 5 answer levels for each item, which represent different degrees of intensity (none, mild, moderate, severe or extreme) graded from 0 to 4. This score is added and standardized from 0 to 100 such that the higher the score the worse the patient’s condition. This questionnaire was validated in Spain for KOA and HOA patients [37].
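A minimal sketch of the scoring described above follows. It assumes that standardization is a linear rescaling of the summed item scores to the 0 to 100 range (sum divided by the maximum possible sum, times 100); the text does not spell out the formula, so this choice and all names in the code are assumptions.

```python
WOMAC_ITEMS = {"pain": 5, "stiffness": 2, "physical function": 17}


def standardize(item_scores):
    """Rescale a list of 0-4 item scores to 0-100 (higher means worse)."""
    if any(s not in (0, 1, 2, 3, 4) for s in item_scores):
        raise ValueError("each item must be scored 0, 1, 2, 3 or 4")
    return 100 * sum(item_scores) / (4 * len(item_scores))


def womac_scores(pain, stiffness, function):
    """Return standardized subscale scores and the total over all 24 items."""
    answers = {"pain": pain, "stiffness": stiffness,
               "physical function": function}
    for name, expected in WOMAC_ITEMS.items():
        if len(answers[name]) != expected:
            raise ValueError(f"{name} subscale needs {expected} items")
    scores = {name: standardize(items) for name, items in answers.items()}
    # Assumed linear rescaling of the 24-item sum (maximum 96) to 0-100.
    scores["total"] = standardize(pain + stiffness + function)
    return scores
```

For instance, a patient answering "moderate" (2) to every item would obtain 50 on each subscale and on the total under this assumed rescaling.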

Statistical analysis

Sample characteristics are reported as mean and standard deviation (SD) for continuous variables and as frequencies and percentages for categorical variables. The proportion of patients by dimension and level of response in the EQ-5D-5L questionnaire is presented. Subgroups were created using clinical baseline and self-reported information for the variables mentioned above (WOMAC, OKS, OHS, number of comorbidities, BMI, etc.). Although some of the subgroups, such as age group, are not necessarily associated with a health condition, they are used to estimate the corresponding utilities and we refer to them interchangeably as health states or subgroups. For patients who underwent a surgical procedure during the 6-month follow-up, health states were defined and utilities estimated using a follow-up questionnaire that included the same questions used at baseline. In both cases the mean and SD of health state utilities are reported. Results were compared with normative values obtained from the general population interviewed in the Spanish National Health Survey (2011–2012) [38]. We hypothesized that different (worse) health states mean different (lower) utilities. To identify statistically significant differences between subgroups, ANOVA and Student’s t tests with multiple testing correction were performed (Bonferroni test when population variances were assumed equal after Levene’s test, or Tamhane’s T2 otherwise). P < 0.05 was considered statistically significant; P < 0.01 for multiple comparisons. The program IBM SPSS Statistics 24.0.0.1 (Armonk, NY; IBM Corp) was used to perform the statistical analyses.
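The descriptive step (mean and SD of the utility index per subgroup) can be sketched as follows. This is not the SPSS procedure used in the study; it is an illustrative Python equivalent using only the standard library, and the records shown are hypothetical values, not study data. The hypothesis tests (ANOVA, Levene, Bonferroni, Tamhane’s T2) are not reproduced.

```python
from collections import defaultdict
from statistics import mean, stdev


def utilities_by_subgroup(records):
    """Group (subgroup, utility) pairs and return n, mean and sample SD."""
    groups = defaultdict(list)
    for subgroup, utility in records:
        groups[subgroup].append(utility)
    summary = {}
    for subgroup, values in groups.items():
        # The sample SD needs at least two observations.
        sd = stdev(values) if len(values) > 1 else float("nan")
        summary[subgroup] = {"n": len(values), "mean": mean(values), "sd": sd}
    return summary


# Hypothetical records (illustrative only).
records = [("severe", 0.20), ("severe", 0.30), ("severe", 0.25),
           ("satisfactory", 0.90), ("satisfactory", 1.00)]
summary = utilities_by_subgroup(records)
```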

Literature review

A systematic search was performed in July 2017 in the NHS EED database (Centre for Reviews and Dissemination, University of York). The search strategy is included as Additional file 1: Table S1. Based on a review of titles and abstracts, we selected economic evaluations based on models that evaluated technologies in KOA or HOA patients and that included QALYs among the outcome measures. The CEA Registry (Cost-Effectiveness Analysis Registry) database [39] was used to extract the health states and health state utilities included in these economic evaluations. The CEA Registry is a comprehensive database of > 7000 cost-utility analyses from literature published in English around the world in which utility values, health states and costs per QALY are collected [39]. In addition, the selected papers were reviewed and information on the origin and kind of instrument used to obtain utilities was extracted.

Health states equivalent to those identified in the literature review were defined using our data, and utilities from our sample of Spanish patients were obtained according to the methodology reported above. The utilities obtained with our sample were then compared with the utilities reported in the international literature. According to McClure et al., the minimally relevant difference in utilities for the EQ-5D-5L for Spain was estimated at 0.045 [40], and we assessed the differences identified against this value.
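The comparison against the minimally relevant difference amounts to a simple threshold check, sketched below. The function name is hypothetical, and treating a difference as relevant when its absolute value strictly exceeds 0.045 is an assumption about how the threshold is applied.

```python
MINIMALLY_RELEVANT_DIFFERENCE = 0.045  # value cited in the text [40]


def compare_utilities(literature_value, sample_value,
                      threshold=MINIMALLY_RELEVANT_DIFFERENCE):
    """Return the signed difference and whether it exceeds the threshold."""
    difference = literature_value - sample_value
    return {"difference": difference, "relevant": abs(difference) > threshold}
```

With illustrative values, `compare_utilities(0.70, 0.50)` would be flagged as a relevant difference, whereas `compare_utilities(0.600, 0.595)` would not.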

Results

Statistical differences were found between the utility index for the sample with a diagnosis of KOA or HOA (mean = 0.533; N = 750) and the normative value obtained for the Spanish general population (mean = 0.896; N = 20,560) (P < 0.0001). Differences were also found for men, women, and each age group except for the population older than 85 years (P = 0.630) (Additional file 1: Table S2).

Results of the primary study

Knee osteoarthritis: A total of 397 subjects were recruited with KOA (Table 1). Average age was 71.42 years (SD: 9.06) (range: 35–94) and 70% were women. A total of 58% of the sample was recruited in primary care health centers and the remainder in hospital; 42% had osteoarthritis in both knees. Average scores of the self-administered questionnaires WOMAC, OKS and EQ-5D-5L index were, respectively, 49.63 (SD: 20.32), 21.96 (SD: 9.96) and 0.544 (SD: 0.271).

Table 1 Characteristics of the samples with knee and hip osteoarthritis

The most common state of health (32331) corresponds to a patient with moderate problems in the dimensions mobility, daily activities and pain/discomfort, slight problems in self-care and not anxious or depressed. The second most common state of health (11121) corresponds to a patient with slight pain or discomfort and no problems in the other dimensions (Additional file 1: Table S3). By dimensions the moderate level was the most common in mobility (45.8%), usual activities (36.3%) and pain/discomfort (38.5%). Self-care level 1 (no problems) and level 3 (moderate) were reported equally (32%). For anxiety/depression 43.3% of the sample reported not having any problems. Levels 4 and 5 (severe and extreme) were reported less commonly although 30.2% of patients reported having severe pain or discomfort and 24.4% reported severe problems in mobility (Table 2). Among the 18 patients that reported negative utility values, the most common health state was 44444 (4 patients), that is, with severe problems in the five dimensions.

Table 2 Percentage of patients with knee osteoarthritis reporting levels within EQ-5D-5L dimensions

Table 3 shows the health state utilities for different population subgroups. No statistically significant differences were detected between groups according to age, sex, region of Spain, joint with osteoarthritis (right or left), time since diagnosis (years), with or without prosthesis in the other joint and number of comorbidities. Differences identified between BMI categories (P = 0.029; ANOVA) disappeared when multiple comparison tests were conducted. Regarding clinical questionnaires (WOMAC, OKS), differences were found between all subgroups (the fewer the clinical problems, the higher the utilities) except for the two subgroups with the least joint stiffness according to WOMAC (P = 0.389; Tamhane’s T2). There was no difference between receiving and not receiving any pharmacologic treatment (P = 0.321), although patients taking opioid pain medication had significantly lower utilities than the whole sample (P = 0.005). During the 6-month follow-up, 65 people out of 92 waiting for a prosthesis had had a TKR. There was an increase in the utility index after the TKR (P < 0.0001).

Table 3 Utility weights (EQ-5D-5L Index) per subgroup in patients with knee osteoarthritis

Hip osteoarthritis: A total of 361 subjects were recruited with HOA (Table 1). Average age was 67.88 years (SD: 11.67) (range: 36–89) and 53% were women. A total of 57% of the sample was recruited in primary care health centers and the remainder in hospital. A total of 28% had osteoarthritis in both hips. Average scores of the self-administered questionnaires WOMAC, OHS and EQ-5D-5L index were, respectively, 50.69 (SD: 22.25), 22.84 (SD: 22.25) and 0.520 (SD: 0.304).

The most common health state (11111) corresponds to a patient with perfect health. The second most common health state (33331) corresponds to a patient with moderate problems in all dimensions except anxiety/depression, where there are no problems (Additional file 1: Table S3). By dimensions, the moderate level was the most common for mobility (39.3%), self-care (31.3%), usual activities (37.1%) and pain/discomfort (34.9%). For anxiety/depression 43.2% of the sample reported not having any problems. Levels 4 and 5 (severe and extreme) were reported less commonly although 30.2% of patients reported having severe pain or discomfort and 23.3% reported severe problems in mobility (Table 4). Among the 26 patients that reported negative utility values, the most common health state was 44444 (6 patients).

Table 4 Percentage of patients with hip osteoarthritis reporting levels within EQ-5D-5L dimensions

Table 5 shows the health state utilities for different population subgroups. No statistically significant differences were detected between groups according to age, sex, BMI, joint with osteoarthritis (right or left), time since diagnosis (years), or with or without prosthesis in the other joint. Differences identified depending on region of residence and number of comorbidities (P < 0.0001; ANOVA) disappeared when multiple comparison tests were conducted. Regarding clinical questionnaires (level of pain, stiffness and physical function according to WOMAC, and OHS), differences were found between all subgroups (P < 0.0001), that is, the fewer the joint problems, the higher the utilities. For example, utilities varied according to the degree of severity using the OHS from 0.269 (severe) to 0.903 (satisfactory function). There was no difference between receiving and not receiving any pharmacologic treatment (P = 0.321). During the 6-month follow-up, 65 out of 97 people waiting for a prosthesis had the THR. There was an increase in the utility index after the THR (P < 0.0001).

Table 5 Utility weights (EQ-5D-5L Index) per subgroup in patients with hip osteoarthritis

Results of the literature review and comparison with the primary study in Spanish population

Using the bibliographical search 116 references were identified of which 79 did not fulfill the inclusion criteria (Additional file 1: Figure S1). Of the 37 economic evaluations that did fulfill the inclusion criteria, 16 were not included in the CEA Registry or they were included but did not report utilities but rather QALYs and 6 included health states for which our database could not obtain utilities. Consequently, 15 papers were selected to extract health states and utilities, 10 in subjects with KOA [41,42,43,44,45,46,47,48,49,50], 4 in subjects with HOA [51,52,53,54], 1 in both [55]. These economic evaluations were performed in the USA (9), Canada (3), Germany (1), the United Kingdom (1), and Taiwan (1). Of these 15 papers, the CEA Registry included a total of 149 health states with their corresponding utilities. Using our Spanish database, it is not possible to estimate utilities for 61 out of the 149 health states. The main reason was that the health states were directly related to the consequences of joint replacement (complications, long-term outcomes) or with other technologies. Of the 88 states, a significant number of repeated health states-utilities pairs were identified because they were reported by the same authors using the same source [44, 45]. Therefore, a total of 51 pairs of health states-utilities (health states and their associated utilities) were selected: 45 pairs of health states-utilities for knee (Additional file 1: Table S4) and six pairs of health states-utilities for the hip (Additional file 1: Table S5). Several of these health states could be considered equivalent given their definition although there is heterogeneity in the definitions and/or values of utilities found in the literature. 
For example, three authors used in their models of KOA patients three apparently similar health states but with clearly different utility values: Osteoarthritis with conventional treatment (utility = 0.85) [50]; End stage knee osteoarthritis with treatment bridge (utility = 0.7) [46]; Conservative treatment (baseline) for knee osteoarthritis (utility = 0.55) [42]. These values may differ for various reasons, such as population characteristics or the method used to estimate the utilities. In this specific case the first author used standard gamble, the second various bibliographic sources and the third author converted the EQ-5D VAS general population scores into standard gamble utilities. None of the studies explicitly used the EQ-5D index to estimate utilities.

Using our database, we can create and estimate utilities for 39 states approximately equivalent to those identified in the literature, 36 for knee (Additional file 1: Table S4) and 3 for hip (Additional file 1: Table S5). For 9 states it was not possible to estimate any utility because of lack of sample. A qualitative comparison between the values obtained from both sources shows that, in most cases (80%), the values obtained with our sample are lower than those identified in the literature. Among KOA patients, the largest difference between utilities obtained in the literature and utilities obtained in our sample occurred in health states for which a small sample size was obtained in our observational study. For those states with a sample size of at least 30 subjects, the largest difference (0.45 points) occurred in the state ‘0–1 comorbidity, age 65+, obese with osteoarthritis related pain’; the smallest difference (0.005), and the only one lower than 0.045 [40], occurred in the state ‘TKR during the last 6 months (WOMAC<60)’. Among patients with HOA, the largest difference (0.269 points) occurred in ‘severe HOA patients waiting for THR (0<OHS<19)’ [51] and the smallest difference (0.07) occurred in ‘THR during the last 6 months’, and only when we compare with one of the studies found in the literature [52].

Discussion

This study aimed to obtain utilities by means of the EQ-5D-5L questionnaire for different health states in KOA or HOA patients using an observational study performed in Spain and subsequently to compare these values with others obtained in international studies. Firstly, in our observational Spanish study, the average utilities of the samples with KOA and HOA were 0.544 and 0.520, respectively. The analysis of samples by subgroups revealed statistically significant differences according to level of pain, stiffness and physical function according to WOMAC and Oxford scales such that the worse the health condition the worse the utility. The validity of the instruments WOMAC [37] and OHS and OKS [30, 31] in Spain and their use in clinical practice support the use of these instruments to determine health states in economic evaluations. Consequently, the health state utility values obtained should be borne in mind for future economic studies.

To our knowledge, this is the first article whose aim was the characterization of utilities by health states in KOA and HOA in the Spanish population by means of the most recent version of the EQ-5D questionnaire, the EQ-5D-5L [11]. Previously the EQ-5D-5L was used in the National Health Survey of Spain 2011/2012 [56]. In this survey it was observed that utility worsens as age increases and is lower for women than for men, just as in our study. However, the utility values are far higher than ours because the populations are very different. The National Survey questioned non-institutionalized persons (interviewed at home) who stated they had a diagnosis of osteoarthritis without specifying the joint involved.

Conversely, a review of utilities obtained from the Spanish population identified 6 prior studies focused mainly on joint replacement and in which the EQ-5D-5L questionnaire was not used [11]. Three of them used the EQ-5D-3L [15,16,17]. One consisted of a prospective study in which the utility was evaluated before surgery (0.2 for knee and 0.47 for hip) and at 6 months (0.64 and 0.55 respectively, n = 40 for each estimate) [15]. In our study, as expected, there was a difference in the utility index before and after the prosthesis in both knee and hip, although those published values are substantially different from the ones obtained in our database, probably because of the characteristics of the sample and differences in estimation between the 3 and 5 level versions of the EQ-5D [57]. The psychometric qualities of the 5-level questionnaire support its use over the 3-level questionnaire as the former is a more precise measure than the latter [57, 58]. Therefore, the average utility values obtained using our sample with the EQ-5D-5L are more precise for the characterization of health states in a future CUA study than the values available up to now, and they also provide the information (dispersion measures) needed to analyze the robustness of the model by means of a probabilistic sensitivity analysis.
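As an illustration of how a reported mean and dispersion measure might feed a probabilistic sensitivity analysis, the sketch below draws utility values from a beta distribution fitted by the method of moments. The choice of a beta distribution, the use of the standard error of the mean (SD divided by the square root of the sample size) and all names are assumptions made for this example; the study does not prescribe a distribution, and a beta distribution cannot represent the negative utilities that some patients reported.

```python
import random


def beta_parameters(mean, standard_error):
    """Method-of-moments alpha and beta for a utility bounded in (0, 1)."""
    if not 0 < mean < 1:
        raise ValueError("mean must lie strictly between 0 and 1")
    variance = standard_error ** 2
    if variance >= mean * (1 - mean):
        raise ValueError("standard error too large for a beta distribution")
    common = mean * (1 - mean) / variance - 1
    return mean * common, (1 - mean) * common


def draw_utilities(mean, sd, n, draws, seed=0):
    """Draw utility values for a probabilistic sensitivity analysis."""
    rng = random.Random(seed)
    # Uncertainty around the mean is described by its standard error.
    alpha, beta = beta_parameters(mean, sd / n ** 0.5)
    return [rng.betavariate(alpha, beta) for _ in range(draws)]


# KOA sample reported in the text: mean 0.544, SD 0.271, 397 subjects.
samples = draw_utilities(mean=0.544, sd=0.271, n=397, draws=1000)
```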

Secondly, a literature review was carried out to compare international utilities with our utilities for similar health states. The first impression obtained from the literature review is the low number of publications on HOA patients in comparison with the literature available for KOA patients. The second impression is that, in general, utilities obtained in our observational study are lower than those collated in the literature. However, statistical comparisons between the two were not conducted given that none of the studies identified in the review used the EQ-5D index, so a qualitative comparison was conducted using the minimally relevant value according to McClure et al. [40]. It was possible to obtain and compare health states and utilities from our Spanish observational study for a significant number of the health states and utilities used in international economic evaluations. Using Spanish utilities it would be possible to replicate all the models identified in the literature review except for two of them [45, 49]. One of them defined very specific conditions, not all of which have utilities in our data [45]. The other evaluated, among others, the condition “Treatment of infection”, for which we only have one patient in our sample [49]. The qualitative analysis of the health states enables us to see the variety of forms in which an apparently identical health state or condition can be defined [7]. These definitions, together with the different methods used to elicit utilities and differences in population and context [7], can account for the differences between the values found in the literature and the values estimated by ourselves. In any case, differences greater than the minimally relevant value according to McClure [40] were identified in almost 100% of cases. This shows the importance of appropriately choosing the values of model parameters to avoid the economic evaluation being artificially affected [7].

Limitations of this study include those arising from the study design. First, broad criteria for recruiting subjects with osteoarthritis lead, as expected, to a heterogeneous sample which is necessarily insufficient in size when we wish to characterize multiple population subgroups [59]. Consequently, the sample was heterogeneous in terms of treatments received and the sample size was too small to associate utilities with specific technologies such as prostheses or medicines after surgery [59]. Second, the criteria used to define health states, including cut-off points, are at times arbitrary and other criteria may be necessary to define conditions in a future economic model [7]. Third, none of the studies included in the review used the EQ-5D-5L questionnaire, which is the one used in our observational study, so the differences observed should be interpreted with caution as they arise in part from different methods and instruments [7]. Fourth, the literature review was approached pragmatically and for illustrative purposes, so it is by no means an exhaustive review.

Finally, some analyses of interest could not be performed. From a theoretical point of view, it would be desirable for economic evaluations performed in Spain to use utility values obtained by means of appropriate methods in a Spanish population. However, this is not always possible, and we must resort to reviewing literature reporting values from foreign populations [7]. This is the case for all the Spanish economic evaluations we have identified [24,25,26]. All of them used utilities from foreign studies in which the EQ-5D was not used but rather other methods or instruments such as standard gamble or the 15D questionnaire. These three studies evaluated the cost-effectiveness of prophylaxis medicines for the prevention of venous thromboembolism after TKR or THR [24,25,26]. Some health states in the studies were symptomatic deep-vein thrombosis or pulmonary embolism [24,25,26]. Because our study is observational and has a heterogeneous sample with small sample sizes for some subgroups [59], utilities could not be obtained for all these post-surgery conditions, so it was not possible to analyze the effect that using Spanish utilities would have had.

The selection of utility values from foreign studies to be used in an economic evaluation model should be performed carefully, ensuring that the value chosen for the health condition is as realistic as possible but also that the values used for all conditions considered in the economic model are coherent [7, 28, 59]. The main threat to this coherence is using data from different sources [7]. Therefore, observational or experimental studies that help to ascertain and characterize local populations are necessary [59]. Ideally, it should be possible to incorporate patient-reported outcome measures such as the EQ-5D-5L into electronic clinical records [60] in order to routinely collate and use data that will be very useful for the economic evaluation of healthcare technologies [59]. Most authors have been interested in the cost-utility of total joint replacement [61] but these are not the only treatments, especially in early stages of osteoarthritis [20, 21]. Utility data estimated in our study may be of interest for researchers seeking to evaluate the cost-utility of these other alternatives in Spain.

Conclusions

To sum up, the following conclusions can be drawn from the primary study and literature review. First, the worse the health condition in terms of level of pain, stiffness and physical function according to WOMAC and Oxford scales, the lower the utilities obtained by means of the EQ-5D-5L questionnaire in Spain for both KOA and HOA. Secondly, this observational study offers utility values that to date were not available for a significant variety of health states. Thirdly, most utilities estimated with the Spanish sample are lower than those collated in the international studies identified. Finally, although an observational study such as this has a broad sample, it may not be sufficient to offer utilities for all possible health states related to existing technologies; nevertheless, it enables utilities to be obtained directly from the local population, which is necessary to help characterize these populations so that the utility values used in future CUAs performed in Spain are as close as possible to the country’s cultural reality.