Key Points for Decision Makers

Economic evaluations have expanded to advanced models such as Markov and Osteoarthritis Policy models that incorporate considerations for longer time ranges, health utility, a wider range of adverse events including cardiovascular events, and additional meaningful outcomes such as ICERs expressed as cost per QALY.

Differences in study design and between health systems of different countries hampered meaningful comparison of results across studies.

The key drivers of cost effectiveness included medical resources, productivity, relative risks, and selected comparators.

1 Introduction

Osteoarthritis (OA) is a highly prevalent musculoskeletal disorder that is associated with a significant health and economic burden. Worldwide, OA affected more than 303 million people in 2017 [1]. It has become an increasing global health concern due to the aging population and the frequent occurrence of multiple co-morbidities, such as cardiovascular disease and diabetes mellitus, in OA patients. The pain and disability imposed by OA create significant negative impacts on the patient’s quality of life, and are important clinical considerations in chronic disease management [2]. OA can affect any joint, but most commonly the knee, hands and hip [3, 4]. OA is comparable to diabetes in disability burden, with both responsible for the largest increases in years lived with disability (YLD) at the global population level compared with the other top 20 causes of disability in 1990–2005 and 2005–2015 [5, 6]. OA accounted for 3.9% of YLD in 2015, and by 2020 it is expected to be the fourth leading cause of YLD globally [7, 8]. In addition to imposing a huge disability burden, the direct and indirect costs of OA are continually increasing, which bring a series of socioeconomic consequences: increased expenditure, reduced productivity, over-utilization of healthcare resources, and an overall decline in quality of life for both patients and caretakers [2, 9]. In addition, due to their age and likely presence of co-morbidities, OA patients have higher risks of experiencing complications than the general population [10, 11]. Pharmacological therapy is associated with a range of adverse events (AEs) in OA patients, leading to an increase in direct costs and adding to the already significant economic burden on patients and healthcare systems [12].

Common pharmacological therapies for OA include non-steroidal anti-inflammatory drugs (NSAIDs), opioid analgesics, symptomatic slow-acting drugs for osteoarthritis (SYSADOAs), and intra-articular (IA) injections of substances such as corticosteroids and hyaluronic acid [13] (Fig. 1). According to the treatment modalities recommended for OA by the Osteoarthritis Research Society International (OARSI) [14, 15], it is important to consider the risk of complications when choosing pharmaceuticals for the management of OA patients. For instance, topical NSAIDs were strongly recommended for individuals with knee OA (Level 1A recommendation: ≥ 75% in favor and > 50% strong recommendation). For individuals with gastrointestinal co-morbidities, COX-2 inhibitors had a Level 1B recommendation and NSAIDs with proton pump inhibitors had a Level 2 recommendation, while for individuals with cardiovascular co-morbidities or frailty, oral NSAIDs were not recommended. Clinical decision making for the pharmacological management of OA should be specific to individual patient conditions. To enable this, physicians should be well informed of the treatment options available, including their relative risks, accessibility, and cost effectiveness.

Fig. 1 Classification and types of pharmaceuticals used for osteoarthritis treatment

Health economic evaluations have become increasingly important to support the setting of priorities in healthcare, and to help decision makers allocate healthcare resources efficiently [16]. This is a critical but under-reported aspect in OA management, particularly given the heavy economic burden of OA disability that is likely to worsen due to ongoing population aging, and limited healthcare resources for long-term OA treatment especially in rural or marginalized communities. Most results of economic evaluations are presented using the incremental cost-effectiveness ratio (ICER) [17]. The ICER relates the difference in cost between a medicine and its comparator to the difference in outcomes, and is needed for resource allocation policy making. If the new treatment is less expensive and more clinically effective than the standard treatment, it is said to be dominant. However, if the new treatment is more expensive but also more clinically effective, the new treatment is said to be cost effective if the ICER is less than the willingness to pay (WTP) for each individual country. ICERs can be presented as the cost per quality-adjusted life-year (QALY) gained, where one QALY equates to one year in perfect health.
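The decision rule described above can be sketched in a few lines of Python. This is an illustrative sketch only: the function name `classify_icer`, the verdict labels, and the numbers in the example are hypothetical and are not taken from any of the included studies. The branch for a treatment that is cheaper but not more effective is left unjudged, because the text above does not define a rule for it.

```python
from typing import Optional, Tuple


def classify_icer(d_cost: float, d_qaly: float,
                  wtp: float) -> Tuple[Optional[float], str]:
    """Classify a new treatment against its comparator.

    d_cost: cost of new treatment minus cost of comparator
    d_qaly: QALYs of new treatment minus QALYs of comparator
    wtp:    willingness-to-pay threshold per QALY (country specific)
    Returns (ICER or None, verdict).
    """
    if d_qaly > 0 and d_cost < 0:
        # Less expensive and more effective: no ratio is needed.
        return None, "dominant"
    if d_qaly <= 0 and d_cost >= 0:
        # No health gain and no saving.
        return None, "not cost effective"
    if d_qaly > 0:
        # More effective at equal or higher cost: compare ICER with WTP.
        icer = d_cost / d_qaly
        verdict = "cost effective" if icer < wtp else "not cost effective"
        return icer, verdict
    # Remaining case: cheaper but not more effective (left unjudged here).
    return None, "cheaper but not more effective"


# Hypothetical example: 5000 extra cost for 0.25 extra QALYs, WTP of 50,000.
print(classify_icer(5000.0, 0.25, 50000.0))
```

Because the WTP threshold is passed in as a parameter, the same cost and QALY differences can lead to different verdicts in different countries.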

An overview of existing studies analyzing the cost effectiveness of pharmacological interventions for OA would be useful for identifying the gaps in the current evidence, guiding researchers in designing and conducting high-quality economic evaluations, and helping administrators make decisions based on high-quality evidence. In the absence of a current review on this topic, and in light of previous reviews published in related areas [18,19,20,21,22,23,24,25], the purpose of this study is to systematically review economic evaluations for the pharmacological management of OA.

2 Methods

2.1 Design

A structured systematic literature search was performed in November 2021, using a review protocol based on established standards (Centre for Reviews and Dissemination guidelines) and integrated with prior methods [26, 27]. This review protocol aimed to limit bias and maximize the objectivity of the systematic review (Appendix 1, see electronic supplementary material [ESM]).

2.2 Search Strategy

Published literature from inception to November 2021 reporting the cost effectiveness of pharmacological management of OA was identified by searching the following databases: PubMed, EMBASE, Cochrane Library, the Health Technology Assessment (HTA) database, and the National Health Service Economic Evaluation Database (NHS EED; this database ceased to be updated after March 2015). ‘Osteoarthritis’ and ‘economic evaluation’ were used as MeSH/Emtree search terms, together with free-text terms. Regular alerts were established to update the search until 4 November 2021. In addition, the reference lists of relevant systematic reviews and meta-analyses were scanned to identify potentially eligible studies. The detailed search strategies are presented as supplementary materials (Appendix 2, see ESM). All searches were supplemented by reviewing the bibliography of publications included for full-text review to identify any additional eligible studies.

2.3 Study Selection

Search results were downloaded from each of the databases and uploaded into EndNote X9 for document management. First, duplicates were identified and removed. Second, two reviewers (KNF, ZJF) independently applied inclusion and exclusion criteria (Table 1) to screen titles and abstracts of the remaining articles. Third, the full texts of eligible articles were screened in-depth by two independent reviewers (KNF and ZJF). Any studies resulting in disagreement between the reviewers were presented to a third reviewer (LY) for review and consensus. Subsequently, full-text articles were used for data extraction into an Excel spreadsheet and reviewed by the first author (JYS). Finally, reference lists and citations of eligible articles were checked manually for any additional relevant studies.

Table 1 Eligibility criteria

2.4 Data Extraction

A standardized data-extraction form was developed to collect data from eligible studies. Study characteristics regarding publication (author, year of publication), study design (country/region, perspective, model type, outcome measure, time horizon, comparators, cost type, discount rates, year of valuation), and study results (costs, effectiveness, base-case ICERs, and sensitivity analysis [SA]) were extracted by two reviewers (ZJF, KNF) and checked by a third reviewer (JYS). Afterwards, for comparability reasons, all extracted costs and ICERs were converted into 2021 US dollars using the yearly inflation rates of the countries involved (http://www.rateinflation.com) and the exchange rate published by the Bank of America (https://www.bankofamerica.com/foreign-exchange/exchange-rates.go).
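The conversion step can be illustrated with a minimal Python sketch. It assumes that a cost is first inflated to 2021 prices in its local currency and then converted at a single 2021 exchange rate; the order of these two operations is an assumption, since the text above does not specify it. The function name `to_2021_usd`, the inflation rates, and the exchange rate are hypothetical placeholders, not values used in this review.

```python
from typing import Dict


def to_2021_usd(amount: float, year: int,
                annual_inflation: Dict[int, float],
                usd_per_local_2021: float,
                target_year: int = 2021) -> float:
    """Inflate a local-currency amount to target-year prices, then convert to USD.

    annual_inflation maps a calendar year to that year's inflation rate as a
    fraction (0.02 means 2%), i.e. price growth from the previous year.
    """
    factor = 1.0
    for y in range(year + 1, target_year + 1):
        factor *= 1.0 + annual_inflation[y]  # compound year by year
    return amount * factor * usd_per_local_2021


# Hypothetical example: 1000 local currency units valued in 2019.
rates = {2020: 0.02, 2021: 0.03}       # placeholder inflation rates
print(round(to_2021_usd(1000.0, 2019, rates, usd_per_local_2021=1.25), 2))
```

Applying the same function to both the incremental cost and the ICER of a study keeps the two figures on a common price basis.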

2.5 Assessment for Risk of Bias

Eligible studies were critically appraised by two independent reviewers (JYS, ZJF) at the study level for methodological quality using the standardized critical appraisal instrument for economic evaluations in the Joanna Briggs Institute (JBI) System for the Unified Management, Assessment and Review of Information [28,29,30]. All studies, regardless of their methodological quality, underwent data extraction and synthesis. There was no disagreement among the reviewers during the methodological quality assessment. We determined the level of methodological quality as follows: poor quality = <40% of the items presented; moderate quality = 41%–80% of the items presented; good quality = >80% of the items presented.

2.6 Assessment for Quality of Reporting

We graded the included studies by using the Quality of Health Economic Studies (QHES) instrument that assesses studies for the appropriateness of their methods, the validity and transparency of their results, and the comprehensiveness of how they are reported [31]. The QHES is a 16-item scale that uses a dichotomous ‘yes’ or ‘no’ response for each item. A ‘yes’ is worth a specific number of points for each item (reflecting its relative importance), and a ‘no’ is worth zero. For each study, the points are summed to get a total score that can range from 0 = ‘extremely poor’ quality to 100 = ‘excellent’ quality. The QHES has demonstrated good overall construct validity [31, 32]. Based on the total score threshold recommended by Ofman et al. [24], the included studies were classified as either ‘high’ (≥ 75 points) or ‘low’ (< 75 points) quality. Because instruments for assessing the quality of cost-effectiveness analyses have, in general, been found to have poor inter-rater reliability [33], we established a protocol for using the QHES specifically for this review on the basis of Pinto et al. [20] (Appendix 3, see ESM). Two authors (JYS and ZJF) independently assessed the studies by using these guidelines, with final scoring based on consensus; if a consensus could not be reached, a third author (KNF) mediated.
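The scoring and grading rule described above can be sketched as follows. The function names are hypothetical, and the equal item weights in the example are placeholders chosen only so that 16 items sum to 100; they are not the published QHES item weights, which differ by item.

```python
from typing import Sequence


def qhes_score(weights: Sequence[float], answers: Sequence[bool]) -> float:
    """Sum the points of every item answered 'yes'; a 'no' scores zero."""
    if len(weights) != len(answers):
        raise ValueError("one answer is required per item")
    return sum(w for w, yes in zip(weights, answers) if yes)


def qhes_grade(score: float, cutoff: float = 75.0) -> str:
    """Classify a total score as 'high' (>= cutoff) or 'low' (< cutoff)."""
    return "high" if score >= cutoff else "low"


# Placeholder weights: 16 equal items summing to 100 (NOT the real QHES weights).
placeholder_weights = [6.25] * 16
answers = [True] * 13 + [False] * 3    # hypothetical appraisal of one study
total = qhes_score(placeholder_weights, answers)
print(total, qhes_grade(total))
```

In practice, the consensus answers of the two assessors would be passed in as `answers`, together with the item weights defined by the instrument.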

2.7 Data Synthesis

Data extracted from the included studies were analyzed and summarized narratively and in tables.

3 Results

3.1 Study Selection and Assessment

The study selection process is presented in Fig. 2. The literature search resulted in 6106 potential articles, of which 43 cost-effectiveness analyses (CEAs) on the pharmacological management of OA were included for analysis. The included studies were conducted in 18 countries on four continents, with one study containing data from multiple continents. The categories and types of pharmaceuticals used for OA treatment are presented in Fig. 1. The overall methodological quality of the included studies was moderate (Appendix 4, see ESM). The quality of most of the included studies assessed by QHES was high (mean QHES score 84). Six studies [34,35,36,37,38,39] fell below the 75-point threshold separating ‘low-quality’ studies from ‘high-quality’ ones (Appendix 5, see ESM).

Fig. 2 PRISMA diagram showing the study selection process

3.2 Characteristics of Included Studies

Characteristics of the included studies are reported in Table 2. The time periods of publication were 2000–2004 (n = 6) [40,41,42,43,44,45], 2005–2009 (n = 10) [34, 35, 37, 46,47,48,49,50,51,52], 2010–2014 (n = 10) [53,54,55,56,57,58,59,60,61,62], and 2015 onwards (n = 17) [36, 38, 39, 63,64,65,66,67,68,69,70,71,72,73,74,75,76]. The studies were conducted in Europe (n = 15), North America (n = 19) [36, 37, 39, 42, 45, 46, 48, 51, 54,55,56, 59, 60, 67,68,69,70,71,72, 75], South America (n = 1) [65], and Asia (n = 7) [35, 44, 62, 66, 73, 74, 76], and there was also a multi-continental study (n = 1) [34]. Study designs included model simulations of OA (n = 33) [37,38,39,40,41,42,43,44, 46,47,48,49,50, 53,54,55,56,57,58,59,60, 65,66,67,68,69,70,71,72,73,74,75,76], randomized clinical trials (n = 8) [34,35,36, 45, 51, 52, 61, 64], and observational studies (n = 2) [62, 63].

Table 2 Characteristics of CEA studies in the pharmacological management of OA (43 studies)

Different analysis perspectives were used to evaluate treatment costs for OA pharmaceuticals. The payer perspective was typically adopted, including third-party payer, private payer, National Health Service (NHS), and Healthcare System (HCS). The majority of articles adopted NHS (n = 8) [38, 43, 49, 50, 53, 57, 58, 74], third-party payer (n = 3) [55, 56, 65], or HCS perspectives (n = 7) [38, 48, 58, 70]. A number of CEAs offered societal perspectives (n = 8) [35, 37, 41, 44, 46, 54, 60, 69]. However, just one such CEA included direct costs and productivity loss [44]. Four offered both NHS and societal perspectives [45, 51, 52, 64] and others adopted various perspectives, while one article did not report a perspective [36].

The treatment selected as the intervention varied across the studies included in this review. Most studies used NSAIDs and/or coxibs as interventions, which were often combined with a proton pump inhibitor (PPI). A total of 13 studies used IA injection as the intervention [36, 39, 45, 59, 68, 71, 72], while three studies used opioids only [34, 49, 61]. Economic evaluation typically compares an intervention with current best practice or usual care, which may vary by clinical setting. The results from the seven articles that defined the comparator as appropriate care [45], usual care (UC) [64, 75], current care [49], standard care [67], and conventional care [59, 68] might not be transferable because the details of these treatments were unclear.

Numerous sets of cost-effectiveness outcomes were evaluated in the included studies. In addition to cost per minimal perceptible clinical improvement (MPCI) [42], cost per patient improved [45, 51], and cost per life-year gained [40, 41], a variety of gastrointestinal (GI)-related outcomes were used, such as cost per GI event avoided [42], and cost per perforation, ulcer or bleed avoided [43]. The use of these variable outcome measures was due to the sources of GI adverse event data included in these studies. The remaining studies reported cost-utility analyses with QALY as the outcome measure. To estimate health utilities for a QALY calculation, scores from the Western Ontario and McMaster Universities Osteoarthritis Index (WOMAC) [77] or another instrument (Sleep Problems Index [SPI]) were often mapped onto a utility-based instrument (e.g. the EQ-5D or Health Utility Index [HUI]). Four studies [34, 49, 61, 68] translated WOMAC into the HUI and three [50, 53, 66] into the EQ-5D, while five studies [52, 56, 62,63,64] used the EQ-5D directly and three [45, 51, 59] used the HUI3.

3.3 Data Analysis

3.3.1 Cost-Effectiveness Estimates of Drugs in Asia

Table 3 identifies seven articles evaluating economic outcomes in Asia [35, 44, 62, 66, 73, 74, 76]. Three evaluations were conducted in the China region [35, 44, 76], two of which were in Taiwan [35, 44], and one evaluation was conducted in each of the following countries: Japan [74], Saudi Arabia [66], and United Arab Emirates [73]. Most of these economic comparisons were made between coxibs (celecoxib or imrecoxib) and NSAIDs with or without gastroprotection. Differences in study design and between health systems in each country hampered meaningful comparison of results across studies. The authors of these studies concluded that coxibs (celecoxib or imrecoxib) were cost effective in these geographical regions based on the local standards. In these studies, the incremental effectiveness between the treatment and control groups varied between 0.0023 and 1.49 QALYs, and the ICERs varied between US$44.40 and US$58,447.97 per QALY gained. In addition, the two articles from Taiwan [35, 44] reported that IA injection was performed. One of these concluded that hyaluronic acid therapy might not be an economically attractive option since Taiwan has fewer health resources than other places, such as Canada and the US [44]. Of the seven studies included, one reported a threshold range, three reported a single threshold, and the remaining three studies did not report a threshold.

Table 3 Cost‑effectiveness estimates in Asia (7 studies)

3.4 Cost-Effectiveness Estimates of Drugs in Europe

Table 4 presents a total of 15 studies conducted in nine European countries, with one study involving two countries [61]. The UK and Sweden were the only countries in which more than one study was conducted. NSAIDs were the most common comparator to celecoxib. Most studies concluded that celecoxib was cost effective compared with other active treatment options based on local standards, and in some countries it dominated comparators (was more effective and less costly) [37, 40, 41]. The incremental cost and incremental effectiveness between the treatment and control groups varied between US$0.00755 and US$450.98, and 0.002 and 0.038 QALYs, respectively. The ICERs ranged from US$6461.63 to US$38,686.79 per QALY gained. Other articles reported IA injections, opioids, and SYSADOAs, and the intervention group showed cost effectiveness compared with the comparators. For IA injections, the incremental cost and incremental effectiveness between the treatment and control groups varied between US$10.85 and US$1647.84, and 0.042 and 0.35116 QALYs, respectively. The ICERs ranged from US$258.36 to US$10,702.23 per QALY gained. Three articles reported a threshold interval and nine reported a single threshold, while the remaining one did not report a threshold.

Table 4 Cost‑effectiveness estimates in Europe (15 studies)

3.5 Cost-Effectiveness Estimates of Drugs in the Americas

Table 5 lists 20 articles assessing the economic evaluations in the Americas, performed in three countries (US, Canada, and Colombia). Similar to other continents, celecoxib was considered cost effective when compared with NSAIDs. The ICERs varied between US$875.91 and US$307,013.56 per QALY gained. The ICER estimates also varied with the subject’s pain and age [75]. However, a study comparing celecoxib with over-the-counter (OTC) naproxen showed that celecoxib was not cost effective because of its exorbitant annual price of US$880 [70]. In addition, one study comparing opioid-based strategies showed that celecoxib was not cost effective because it diminishes the effectiveness of total knee arthroplasty (TKA) [69].

Table 5 Cost‑effectiveness estimates in the Americas (20 studies)

3.6 Cost-Effectiveness Estimates of Drugs Across Continents

Table 6 lists one article assessing the economic evaluations across continents, performed in five countries (France, Belgium, Austria, Switzerland, and the USA). The ICER estimates varied with the time period.

Table 6 Cost‑effectiveness estimates across continents (1 study)

3.7 Subgroup Analyses

Subgroup analyses were conducted by type of study (trial-based or model-based), and time period (≤1 year, 1–5 years, lifetime). The results are presented as tables in the supplementary materials (Appendix 6, see ESM).

4 Discussion

In this study, we have addressed gaps (differences in study design and between health systems in each country) in the current evidence by separating the analysis of the cost effectiveness of OA drugs into different continents, and providing up-to-date analyses that would be useful for healthcare providers and payers, as well as researchers for conducting high-quality economic evaluations in the future.

OA is a chronic condition characterized by a long course of disease progression, often associated with severe impacts on the patient’s quality of life and risk of mortality from other co-morbidities [78, 79]. The OA disease burden is growing faster than any other health condition globally [5, 80]. Improving patient quality of life and joint function are the primary goals of OA management strategies, for which the choice of appropriate healthcare interventions is critical in the light of rising costs for the OA patient population. Health economic evaluations provide a critical piece of the puzzle for informing clinical decision making related to OA interventions.

In this study, we performed a systematic review of the literature on cost-effectiveness analysis of OA pharmacological interventions, and provided insights into the changes seen in the methodology of these economic evaluations over the past two decades. Over the past years, such evaluations are no longer limited to decision trees and short time ranges that considered only gastrointestinal events, but have expanded to advanced models such as Markov [38, 48, 50, 53,54,55,56, 58, 60, 66, 73, 74, 76] and Osteoarthritis Policy models [67, 69, 70, 75] that incorporate considerations for longer time ranges, health utility, a wider range of adverse events including cardiovascular events, and more meaningful outcomes such as the cost per QALY ICERs. Depending on the continents, type of drug, the control group, and duration of follow-up, cost-effectiveness analyses for OA pharmacological interventions reported different ranges of ICER estimates.
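To make the modelling approach concrete, the following is a minimal sketch of a Markov cohort model with annual cycles, written in Python. It is not a reconstruction of any model used in the included studies: the three health states, the transition probabilities, the costs, the utilities, and the 3% discount rate are all hypothetical, and no half-cycle correction is applied.

```python
from typing import List, Tuple

# Hypothetical health states, in the order used by every list below.
STATES = ["no_event", "adverse_event", "dead"]


def run_cohort(transitions: List[List[float]],
               state_cost: List[float],
               state_utility: List[float],
               cycles: int,
               discount_rate: float = 0.03) -> Tuple[float, float]:
    """Return discounted (cost, QALYs) per patient over annual cycles."""
    for row in transitions:
        if abs(sum(row) - 1.0) > 1e-9:
            raise ValueError("each row of transition probabilities must sum to 1")
    n = len(transitions)
    dist = [1.0] + [0.0] * (n - 1)      # whole cohort starts in the first state
    total_cost = total_qaly = 0.0
    for t in range(cycles):
        weight = 1.0 / (1.0 + discount_rate) ** t
        total_cost += weight * sum(p * c for p, c in zip(dist, state_cost))
        total_qaly += weight * sum(p * u for p, u in zip(dist, state_utility))
        # Advance the cohort by one cycle.
        dist = [sum(dist[i] * transitions[i][j] for i in range(n))
                for j in range(n)]
    return total_cost, total_qaly


# Hypothetical inputs for two strategies (illustration only).
drug_a = [[0.90, 0.08, 0.02], [0.00, 0.95, 0.05], [0.00, 0.00, 1.00]]
drug_b = [[0.94, 0.04, 0.02], [0.00, 0.95, 0.05], [0.00, 0.00, 1.00]]
utility = [0.75, 0.55, 0.0]
cost_a, qaly_a = run_cohort(drug_a, [300.0, 2500.0, 0.0], utility, cycles=20)
cost_b, qaly_b = run_cohort(drug_b, [900.0, 2500.0, 0.0], utility, cycles=20)
print(f"ICER: {(cost_b - cost_a) / (qaly_b - qaly_a):.0f} per QALY")
```

Lengthening `cycles` or adding states (for example, separate gastrointestinal and cardiovascular events) is how such a model captures longer time ranges and a wider range of adverse events than a short decision tree.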

Despite a significant growth in pharmacoeconomic evaluation studies of OA in recent years, as well as some innovations in trial and model design, the comparability among various studies remains poor due to a lack of standardized research methods and designs. For instance, our analysis indicated that most studies found celecoxib to be economically attractive compared with NSAIDs with or without gastric protective agents. However, due to significant heterogeneity in the methodology and design of the included studies, it is not possible to provide a confident recommendation. Several sources of study heterogeneity should be considered when interpreting the results of our review. First, different perspectives of cost analysis were adopted among the included studies, such as a payer or societal perspective, introducing inconsistencies into the types of resources that should and should not be compared to evaluate the cost effectiveness of pharmacological interventions. This is complicated by the fact that the effects of OA are multi-dimensional, involving not only individual disability and reduced quality of life, but also major impacts on overall societal productivity. This level of complexity is rarely being addressed in the current design of CEAs. Second, the included studies used a range of different comparators, such as comparing against baseline (standard care or no intervention) or comparing against other pharmacological interventions, which reduces the ability to make accurate comparisons among studies [81, 82]. Finally, it is important to note that for a chronic condition such as OA, clinical trials or models spanning only a few months (as seen in a major portion of the included studies) are unlikely to provide evidence that is representative of the entire course of disease. These identified challenges have been discussed in more detail elsewhere [83, 84].

Several key drivers of cost effectiveness were identified in our systematic review, which might have contributed to the variations seen in cost and effectiveness measures across continents. These include medical resources, productivity, relative risks, and selected comparators. Variation in medical resources could lead to different cost effectiveness of OA drugs in different geographical regions. Some economically under-developed areas may have fewer health resources than economically developed areas, and so some measures that achieve certain benefits but cost more may not be supported [44, 70]. Two trial-based studies reported that decreased productivity was the parameter with the greatest influence on the cost effectiveness of OA drugs [51, 64]. The time lost by patients had a relatively strong effect on the estimated incremental net monetary benefit from a societal perspective.
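The incremental net monetary benefit (INMB) mentioned above is conventionally defined as the WTP threshold multiplied by the incremental QALYs, minus the incremental cost; a positive value indicates that the intervention is cost effective at that threshold. The short Python sketch below uses hypothetical numbers to show how including productivity costs under a societal perspective can change the sign of the result. The function name and all values are illustrative assumptions.

```python
def incremental_net_monetary_benefit(d_cost: float, d_qaly: float,
                                     wtp: float) -> float:
    """INMB = WTP * incremental QALYs - incremental cost."""
    return wtp * d_qaly - d_cost


# Hypothetical figures for one intervention versus its comparator.
d_qaly = 0.02                   # incremental QALYs
d_healthcare_cost = 1200.0      # incremental direct medical cost
d_productivity_cost = -500.0    # productivity loss avoided (a saving)
wtp = 50000.0                   # willingness to pay per QALY

payer_view = incremental_net_monetary_benefit(d_healthcare_cost, d_qaly, wtp)
societal_view = incremental_net_monetary_benefit(
    d_healthcare_cost + d_productivity_cost, d_qaly, wtp)
print(payer_view, societal_view)
```

In this made-up case the intervention has a negative INMB from the payer perspective but a positive one once avoided productivity loss is counted, which illustrates why the chosen perspective matters.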

It has been noted that the main driver of the cost effectiveness of OA drugs is the relative risk of adverse events associated with the drugs, which affects the results of the model. The relative risk of the drug drives the absolute risk of the population, which in turn drives the projected cost of side effects and the risk and cost of post-event switch in therapy [37], and also reduces quality of life, resulting in an increase in ICER [41, 53, 54, 57]. Some studies have shown that cost effectiveness varies among the selected comparators. The adverse event rate and PPI utilization rate may vary when different comparators are used, which can have a large impact on the results of the model. Therefore, the importance of selecting an appropriate comparator is becoming increasingly apparent, and the interpretation of cost-effectiveness analysis between active comparators requires some caution [60].

The threshold ICER plays a central role in the methodology and application of CEAs, since the intervention ICERs are compared against the threshold to determine whether new interventions offer good value for money. There are several factors requiring consideration here. (i) There are fundamental differences in the threshold values for cost per QALY between different countries or healthcare systems. (ii) Some studies have specified a threshold range rather than a single value [51, 61]. (iii) In some studies, the threshold value against which the intervention ICERs should be compared is unknown. These factors may support the current view that a single ICER threshold should not be applied in CEAs involving a diverse range of technologies and conditions [85]. Moreover, defining an ICER threshold value might be more appropriate in a national health service system, where healthcare budgets are well-defined and more fixed than in a social security system, where the maximum level of total co-payments of the entire population is undefined [85,86,87,88,89]. To ensure efficient healthcare resource allocation, such issues surrounding the definition of ICER thresholds need to be thoroughly considered together with the study population involved.
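The three reporting situations listed above (a single threshold, a threshold range, and no reported threshold) lead to different kinds of conclusions, which the following Python sketch makes explicit. The function name and the labels are hypothetical, and treating an ICER that falls inside a threshold range as "uncertain" is an assumption made for illustration rather than a rule taken from the included studies.

```python
from typing import Optional, Tuple, Union

# A threshold may be missing, a single value, or a (low, high) range.
Threshold = Union[None, float, Tuple[float, float]]


def judge_against_threshold(icer: float, threshold: Threshold) -> str:
    """Compare an ICER with whatever threshold information a study reports."""
    if threshold is None:
        # No threshold reported: cost effectiveness cannot be judged.
        return "undetermined"
    if isinstance(threshold, tuple):
        low, high = threshold
        if icer < low:
            return "cost effective"
        if icer > high:
            return "not cost effective"
        return "uncertain"          # assumption: inside the range is unresolved
    return "cost effective" if icer < threshold else "not cost effective"


# Hypothetical ICER of 30,000 per QALY judged under three reporting styles.
for th in (None, 50000.0, (20000.0, 40000.0)):
    print(th, "->", judge_against_threshold(30000.0, th))
```

The same ICER therefore yields three different verdicts depending only on how the threshold is reported, which is one reason conclusions are hard to compare across studies.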

Our review provides a different perspective on CEA evaluations on the pharmacological management of OA. Although a number of previous reviews have been published on different aspects of this topic [21, 24, 90,91,92,93,94], they have typically described a limited range of therapies and were mostly published more than 5 years ago. Our study provides up-to-date, comprehensive information on a more complete range of OA pharmacological interventions, including oral drugs and IA injections. In addition, in light of limited reviews reporting standardized inflation rates, our study presented pooled economic results that were adjusted by ‘purchasing power parity’ (PPP) and time period, and normalized across different countries. It is interesting to note that 87% of the studies included in our review were conducted in Europe and North America, which may suggest an increasing OA burden in these regions, but is also reflective of strong HTA institutions and the use of economics in decision making and market value.

The interpretation of findings presented in our review is subject to a few limitations. First, high-quality studies that were not published in English but otherwise satisfy the inclusion criteria were not considered in our analysis, which may have made our results more relevant for English-speaking countries. Second, our investigation was limited to pharmacoeconomic analyses that presented ICERs or found an intervention to be dominant. We are aware of cost-minimization studies in which certain treatments have been found cost-saving, and which were analogous to the included CEAs, but were not considered in our analysis since ICERs were not calculated [95,96,97,98,99,100]. Finally, since the included studies span a period of approximately two decades, some of the pharmaceutical prices have dropped significantly since these studies were published, particularly if they were in the 2000s. The findings of these studies therefore may not accurately depict the current market value of the same pharmaceuticals.

5 Conclusion

The findings of this systematic review suggest substantial uncertainty regarding the ICER estimates for OA pharmacological therapies, due to the heterogeneity of the included studies. Nevertheless, the results of most studies indicated cost effectiveness of the intervention based on specific ICER thresholds. There are fundamental differences between studies in the threshold values for cost per QALY, which are attributable to differences in the methods used to determine the threshold. Further efforts are needed to increase the standardization and quality of applied methods, and future studies should report the threshold that was used to determine cost effectiveness.