Rheumatoid arthritis (RA) is a chronic, progressive autoimmune condition characterized by joint damage, stiffness and swelling [1] that affects up to 1.5 million adults in the US [2]. The chronicity of RA and the need for long-term treatment create significant lifetime humanistic burden to patients, with up to 43.5% of patients with RA facing activity limitations due to their condition [3]; epidemiologic and economic burdens of RA add up to $39 billion in annual costs to society [4,5,6,7].

While appropriate management of RA is critical for modifying the course of joint damage and reducing patient morbidity, and subsequently, economic and humanistic burden, the focus of RA treatment has moved away from managing symptoms to a strategy that targets the underlying inflammation to prevent disease progression [8]. Beginning with early disease onset, treatment with one or more conventional synthetic disease-modifying antirheumatic drugs (csDMARDs; e.g., methotrexate, sulfasalazine, leflunomide, hydroxychloroquine) in addition to glucocorticoids forms the standard of care in RA [9, 10]. For patients with an inadequate response or intolerance to csDMARDs (csDMARD-IR), the addition of a targeted DMARD to a csDMARD is a global recommendation [9, 10]. Biologic DMARDs (bDMARDs) comprise tumor necrosis factor-α inhibitors (TNFi), T cell costimulatory inhibitors, anti-B cell agents and anti-interleukin-6 receptor (anti-IL-6R) monoclonal antibodies. Targeted synthetic DMARDs (tsDMARDs) of the Janus kinase inhibitor class are also available for the management of RA.

Sarilumab (Kevzara®) is a human monoclonal antibody directed against IL-6Rα. The efficacy and safety of sarilumab have been evaluated in both monotherapy and combination therapy for the treatment of moderate-to-severely active RA in patients who have had an inadequate response or intolerance to one or more DMARDs [11,12,13,14].

In developing optimal treatment pathways for RA with the availability of sarilumab, evidence of both the clinical effectiveness and economic consequences of this treatment against relevant comparators is considered by payers and clinicians. The present study evaluated the cost-effectiveness of sarilumab subcutaneous (SC) 200 mg in combination with methotrexate, in accordance with its approved indication in the treatment of adult patients with moderate-to-severely active RA who have had an inadequate response to methotrexate. This study was conducted from a US healthcare payer perspective and aimed to provide decision makers information on the cost-effectiveness of sarilumab in combination with methotrexate. Sarilumab in combination with methotrexate was compared against csDMARD active treatment or approved bDMARDs including adalimumab SC, certolizumab SC, etanercept SC, golimumab SC, tocilizumab SC and the tsDMARD, tofacitinib, all administered in combination with methotrexate.


The cost-effectiveness analysis was conducted using a lifetime Markov state transition model with individual patient simulation (IPS). The model was developed in Microsoft Excel® (version 2013). Health outcomes were measured in quality-adjusted life-years (QALYs), since RA affects both mortality and morbidity. The target population was patients with an inadequate response to methotrexate (i.e., csDMARD-IR population) and therefore eligible for first-line combination treatment with a targeted DMARD, including sarilumab.

Individual profiles of the patients simulated in the model were based upon profiles of patients enrolled in MOBILITY, a pivotal phase 3 trial of sarilumab SC in combination with methotrexate [11]. In MOBILITY, adult patients fulfilling the 1987 American College of Rheumatology (ACR) classification criteria for RA [1, 15] with moderate-to-severe RA were included; other inclusion and exclusion criteria are reported elsewhere [11]. Baseline demographic information of the individual patient simulation cohort is presented in Table 1. In total, 1197 eligible patients were randomized (1:1:1) to sarilumab SC 150 mg or 200 mg or placebo SC every 2 weeks (q2w) added to methotrexate. Patient ages ranged from 19 to 75 years (mean 50.6 ± 11.6); 81.6% were female and 86.1% were White/Caucasian. Duration of RA ranged from 0.3 to 44.7 years (mean 9.0 ± 7.8), and baseline Health Assessment Questionnaire Disability Index (HAQ-DI) scores ranged from 0 to 3.0 (mean 1.6 ± 0.6) [11].

Table 1 Demographics and disease characteristics of patients in the MOBILITY trial

This article does not contain any studies with human participants or animals performed by any of the authors.

Model Structure

For the estimation of expected health outcomes and costs, each patient was moved between health states in a stochastic manner [16]. IPS was deemed to be the suitable approach for the present model because it captures the heterogeneity of the RA patient population [17] and allows the tracking of patient characteristics (e.g., age) and clinical outcomes (i.e., HAQ-DI progression) of individual patients over the lifetime horizon of the model [18, 19] (Fig. 1). For each patient in the model, a duplicate was assigned for each comparator, ensuring that the comparisons were not influenced by factors other than the outcomes of the different treatment sequences.

Fig. 1 Model flow. csDMARD conventional synthetic disease-modifying antirheumatic drug; csDMARD-IR inadequate response or intolerance to csDMARDs/methotrexate; HAQ-DI Health Assessment Questionnaire Disability Index; QALYs quality-adjusted life-years. Comparators: sarilumab SC 200 mg or placebo SC q2w + methotrexate; adalimumab 40 mg SC q2w + methotrexate; certolizumab 200 mg SC q2w + methotrexate; etanercept 25 mg SC q1w + methotrexate; golimumab 50 mg SC q4w + methotrexate; tocilizumab 162 mg SC q1w or q2w + methotrexate; tofacitinib 5 mg twice daily oral + methotrexate; csDMARD active treatment

A decision tree modeled the initial efficacy assessment period, which was based on the MOBILITY randomized controlled trial data, with patients assigned to one of three classifications at the end of the 6-month cycle:

  1. Responder: adequate response (ACR20 responders, inclusive of patients with ACR50 and ACR70 responses) and continuation of initial treatment until discontinuation or death.

  2. Non-responder: inadequate response and movement to the subsequent treatment line (bDMARD or csDMARD palliative treatment).

  3. Death.


Following the initial 6-month cycle of the decision tree, based on treatment discontinuation data from real-world evidence, all surviving patients could transition to one of the following states in the subsequent 6-month intervals of the Markov model:

  1. Remain on initial treatment.

  2. Move to the subsequent bDMARD treatment: the commonly used abatacept intravenous (IV) + methotrexate, followed by rituximab IV + methotrexate.

  3. Move to final, palliative treatment with csDMARDs.

  4. Death.


Treatment Comparators

The treatment comparators in the model included bDMARDs and the tsDMARD, tofacitinib. In addition, while the population was patients with inadequate response or intolerance to csDMARDs, csDMARDs were also included to compare results with previously published US cost-effectiveness analyses. All comparators are licensed in the US for the treatment of RA and reimbursed through commercial health plan pharmacy budgets. Only SC formulations of bDMARDs were considered for comparison; IV formulations were not considered because they are typically reimbursed via the medical benefits of commercial health plans.

The treatment sequence beginning with sarilumab 200 mg SC q2w + methotrexate was compared with treatment sequences beginning with:

  1. adalimumab 40 mg SC q2w + methotrexate,

  2. certolizumab 200 mg SC q2w + methotrexate,

  3. etanercept 25 mg SC every week (q1w) + methotrexate,

  4. golimumab 50 mg SC every 4 weeks (q4w) + methotrexate,

  5. tocilizumab 162 mg SC q1w or q2w + methotrexate,

  6. tofacitinib 5 mg twice daily (bid) oral + methotrexate,

  7. csDMARD active treatment.

Efficacy and costs of the tocilizumab SC treatment regimen assumed in the model were based on a 65% q1w and 35% q2w weighted average of the two available dosing regimens. This was consistent with clinical guidance on usage and with results of claims database analyses.
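As an illustration of the weighting described above, the following minimal sketch blends the two tocilizumab dosing regimens into an average number of injections per year. The 65%/35% shares come from the text; the 52-week year and the function name `weighted_average` are assumptions made for the example and are not taken from the model itself.

```python
def weighted_average(values, weights):
    """Return the weighted mean of values; weights are expected to sum to 1."""
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    return sum(v * w for v, w in zip(values, weights))


# Shares reported in the text: 65% weekly (q1w), 35% every 2 weeks (q2w).
Q1W_SHARE, Q2W_SHARE = 0.65, 0.35

# Assumption for illustration: a 52-week year, i.e. 52 (q1w) or 26 (q2w) injections.
doses_per_year = weighted_average([52, 26], [Q1W_SHARE, Q2W_SHARE])
print(round(doses_per_year, 1))  # 42.9
```

The same helper could be applied to any per-regimen input (for example, an annual drug cost per regimen) to obtain the blended value used for the mixed tocilizumab comparator.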

Model Inputs

Treatment Response

In the base case, the minimal 6-month treatment response was based on ACR20 criteria; this parameter was informed by results of a network meta-analysis (NMA) of csDMARDs and all bDMARDs and the tsDMARD, tofacitinib, licensed for the treatment of RA [20] (Table 2). Given the lack of evidence on sequential efficacy, outcomes were independent of the line in which the treatment was administered; this approach was consistent with previously published cost-effectiveness models in RA [21]. As per the NMA, minimum response rates for the bDMARDs and tofacitinib were similar (Table 2). Response for the csDMARD active treatment sequence was 27.4%. Patients on the last-line csDMARD (i.e., palliative care) were assumed not to achieve any treatment response.

Table 2 Treatment response rates based on network meta-analysis on ACR 20/50/70 criteria

For each patient, ACR response was mapped to a relative change in HAQ-DI score, which measures physical function in patients with RA [22], based on a mapping algorithm estimated from MOBILITY Part B trial data. Changes in HAQ-DI from baseline to week 24 were then predicted for each patient, within each of the ACR response categories (e.g., ACR70, ACR50, ACR20 and ACR20 non-responders). HAQ-DI scores were assumed to remain constant where a patient remained on a given bDMARD or tsDMARD treatment [21]. When patients discontinued bDMARD or tsDMARD treatment, their HAQ-DI scores were set to the baseline HAQ-DI [21] and then followed the trend in scores for the line of treatment into which the patient transitioned. For patients on csDMARD active treatment, HAQ-DI scores were assumed to increase annually by 0.012, based on a calculation of average annual progression rates obtained via a literature review by Malottki et al. in 2011 [23]. HAQ-DI scores for patients on csDMARD palliative treatment were assumed to increase annually by 0.045 [24]. A recent study indicated a potential non-linear increase of HAQ-DI scores for active csDMARDs [21]; however, since this comparison is included for comparison with previous studies using linear progression, this scenario was not incorporated.
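The HAQ-DI progression rules in this paragraph can be sketched as follows. The annual rates (0 while on a bDMARD or tsDMARD, 0.012 on csDMARD active treatment, 0.045 on csDMARD palliative treatment) and the 0 to 3.0 scale bounds are taken from the text; the state labels, the function names and the choice to apply half of the annual rate in each 6-month cycle are assumptions made for illustration only.

```python
HAQ_MIN, HAQ_MAX = 0.0, 3.0  # bounds of the HAQ-DI scale

# Annual HAQ-DI increase by treatment state (rates from the text; labels are hypothetical).
ANNUAL_PROGRESSION = {
    "targeted": 0.0,              # bDMARD/tsDMARD: constant while on treatment
    "csdmard_active": 0.012,
    "csdmard_palliative": 0.045,
}
CYCLE_YEARS = 0.5  # 6-month model cycle


def next_haq(current, state):
    """Advance HAQ-DI by one cycle, keeping the score within the scale bounds."""
    step = ANNUAL_PROGRESSION[state] * CYCLE_YEARS  # assumption: linear within a year
    return min(HAQ_MAX, max(HAQ_MIN, current + step))


def trajectory(start, state, n_cycles):
    """Return the HAQ-DI score at the start and after each of n_cycles cycles."""
    scores = [start]
    for _ in range(n_cycles):
        scores.append(next_haq(scores[-1], state))
    return scores


# One year (two cycles) on palliative csDMARDs from a baseline of 1.6.
print([round(s, 4) for s in trajectory(1.6, "csdmard_palliative", 2)])
```

The reset to baseline HAQ-DI on discontinuation of a targeted DMARD is not shown; in a fuller sketch it would be handled when the patient changes treatment line.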

Treatment Duration

After the initial 6 months, transitions between health states were based on long-term treatment discontinuation rather than ACR response. Inputs for durations of long-term treatment were based on a de novo analysis of the Canadian RHUMADATA registry (Supplementary Fig. 1).

Parametric models were fitted to the time to treatment discontinuation data by drug class (TNFi, IL-6 inhibitors and other modes of action) and for the combined drug classes. Gompertz, generalized gamma and log-normal models had the best fit after consideration of the Akaike information criterion and Bayesian information criterion. Based on visual comparison of the observed and predicted curves and probability plots, Gompertz distributions were selected as the best fits. This analysis showed significant differences in treatment discontinuation rates between TNFi therapies and IL-6 inhibitors, with patients on IL-6 inhibitors persisting on treatment for a longer period of time. However, as a conservative approach, the same discontinuation rates were assumed for all treatment classes in the base case. The use of separately fitted discontinuation curves to the three treatment classes (TNFi, IL-6 inhibitors and other mode of action) was applied in the scenario analyses.
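To show how a fitted Gompertz curve can drive 6-month transitions, the sketch below converts a Gompertz time-to-discontinuation function into a per-cycle discontinuation probability. The parameterization S(t) = exp(−(rate/shape) × (exp(shape × t) − 1)) is one common convention and is an assumption here, as are the parameter values; the fitted RHUMADATA coefficients are not reported in the text.

```python
import math


def gompertz_survival(t, shape, rate):
    """Probability of still being on treatment at time t (years)."""
    if shape == 0:
        return math.exp(-rate * t)  # reduces to an exponential distribution
    return math.exp(-(rate / shape) * math.expm1(shape * t))


def cycle_discontinuation_prob(t, shape, rate, cycle_years=0.5):
    """Probability of discontinuing during [t, t + cycle_years], given on treatment at t."""
    s_now = gompertz_survival(t, shape, rate)
    s_next = gompertz_survival(t + cycle_years, shape, rate)
    return 1.0 - s_next / s_now


# Hypothetical parameters: a negative shape gives a hazard that falls over time.
SHAPE, RATE = -0.10, 0.20
for year in (0, 2, 5):
    print(year, round(cycle_discontinuation_prob(year, SHAPE, RATE), 4))
```

In an individual patient simulation, each per-cycle probability would be compared against a random draw to decide whether the patient stays on the current line or moves to the next one.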

Utilities and Mortality

QALYs were based on patient life expectancy and utility weights applied to the life-years accrued. Utilities were calculated at each model cycle from HAQ-DI score and gender, using an equation fitted on data from several trials for adalimumab [25, 26]: utility = 0.76 − 0.28 × HAQ-DI + 0.05 × female. Additional utility equations were tested in the sensitivity analyses (Supplementary Table 1). The estimation of mortality was based on US life tables adjusted to RA according to changes in HAQ-DI [27], with general population mortality rates for males and females in the US obtained from the National Vital Statistics Reports from 2015 [28]. Adverse events were not separately evaluated in the model as their effects were included in the utility equations.
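The utility equation can be applied cycle by cycle to accumulate QALYs, as in the minimal sketch below. The coefficients are those given in the text and the 3% annual discount rate is the one the analysis applies; the function names, the discounting at the start of each 6-month cycle and the omission of mortality are simplifying assumptions for illustration.

```python
def utility(haq_di, female):
    """Utility from HAQ-DI and gender, using the equation given in the text."""
    return 0.76 - 0.28 * haq_di + 0.05 * (1 if female else 0)


def discounted_qalys(haq_by_cycle, female, cycle_years=0.5, annual_discount=0.03):
    """Sum discounted QALYs over cycles for a patient who survives every cycle."""
    total = 0.0
    for i, haq in enumerate(haq_by_cycle):
        t = i * cycle_years  # assumption: discount from the start of each cycle
        total += utility(haq, female) * cycle_years / (1 + annual_discount) ** t
    return total


print(round(utility(1.6, female=True), 3))  # 0.362
print(round(discounted_qalys([1.2, 1.2, 1.2, 1.2], female=True), 3))
```

In the full model, each cycle's contribution would additionally be weighted by the probability that the patient is alive in that cycle.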


The base case analysis considered direct healthcare costs only, based on 2017 US$, stemming from DMARD utilization, disease management, routine care and drug administration (e.g., outpatient and nurse visits). A scenario analysis from a societal perspective employed an equation that estimated workdays lost per HAQ-DI level and attendant reductions in productivity [29].

bDMARD utilization was based on treatment schedules specified in the prescribing information for the comparators and sarilumab clinical trials (Table 3) and assumed to be administered in combination with methotrexate, based on the dose administered in the MOBILITY trial [11]. While efficacy outcomes have implicitly included the treatment adherence seen in the trials (for the initial 6 months) and in clinical practice (after an initial 6 months), for costs, the rate of treatment adherence was assumed to be 100% for all treatments leading to potential overestimation of drug costs. The 2018 wholesale acquisition cost of each DMARD was applied to the dosing and treatment schedules to derive drug costs for treatment induction and treatment maintenance thereafter (Table 3). Costs of csDMARDs in the active treatment and palliative lines were based on the weighted average cost of different csDMARDs based on the proportions of patients using the respective csDMARDs [30]. Wastage (i.e., no vial or pack sharing) of bDMARDs was considered in the base case, and no wastage (i.e., sharing of vials and packs) was assumed in the scenario analyses. Wastage implies that if after administering the correct dose to a patient there is surplus, this surplus is discarded. Therefore, in the model, the full cost of the vial was applied to the computation of treatment cost under the assumption of wastage.

Table 3 Drug costs

Disease management or routine care costs included the use of non-DMARD medications, outpatient visits and hospitalization by Medicare patients [31], with costs adjusted for age, disease duration, comorbidities, HAQ-DI score at baseline, current HAQ-DI score, gender, type of DMARD received, number of previous DMARDs, years of education and ethnicity. These costs were inflated to 2017 US$ using the medical component of the consumer price index.


Deterministic Analyses

Incremental cost-effectiveness ratios (ICERs) for sarilumab were calculated in terms of cost per life-year (LY) gained and cost per QALY gained, with a standard annual discount rate of 3% applied for both cost and health outcomes [32]. Efficiency frontiers were computed and plotted, illustrating treatment sequences that were not dominated (i.e., not costlier and less effective) by any of the other treatment sequences in consideration across the full set of comparators.
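A minimal sketch of how an ICER and an efficiency frontier can be derived from total costs and QALYs is given below. The costs and QALYs in the example are hypothetical and are not results from this study; the function names are likewise illustrative. Strongly dominated options (not cheaper and not more effective than another option) are removed first, followed by extendedly dominated options (those whose ICER exceeds that of the next more effective option).

```python
def icer(cost_a, qaly_a, cost_b, qaly_b):
    """Incremental cost per QALY gained of option A versus option B."""
    return (cost_a - cost_b) / (qaly_a - qaly_b)


def efficiency_frontier(options):
    """options maps name -> (cost, qalys); returns frontier names ordered by cost."""
    ranked = sorted(options.items(), key=lambda kv: (kv[1][0], -kv[1][1]))
    frontier = []
    for name, (_cost, qalys) in ranked:
        # Strong dominance: a cheaper (or equally costly) option is at least as effective.
        if frontier and qalys <= options[frontier[-1]][1]:
            continue
        frontier.append(name)
    changed = True
    while changed:  # Extended dominance: ICERs must increase along the frontier.
        changed = False
        for i in range(1, len(frontier) - 1):
            prev_c, prev_q = options[frontier[i - 1]]
            mid_c, mid_q = options[frontier[i]]
            next_c, next_q = options[frontier[i + 1]]
            if icer(mid_c, mid_q, prev_c, prev_q) > icer(next_c, next_q, mid_c, mid_q):
                del frontier[i]
                changed = True
                break
    return frontier


# Hypothetical lifetime costs (US$) and QALYs for five treatment sequences.
example = {
    "A": (100_000, 3.0),
    "B": (300_000, 4.0),  # strongly dominated by C
    "C": (250_000, 4.5),
    "D": (400_000, 4.6),
    "E": (200_000, 3.2),  # extendedly dominated
}
print(efficiency_frontier(example))  # ['A', 'C', 'D']
```

Costs and QALYs passed to these functions are assumed to be already discounted.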

To account for uncertainty in model parameters and structural assumptions, inputs and assumptions of the model structure were tested via scenario analyses and one-way deterministic sensitivity analyses (OWSA), in addition to probabilistic sensitivity analyses (PSA). Scenario analyses evaluated the robustness of the base case results by varying key model assumptions around the time horizon, minimum response criteria, treatment discontinuation, mortality rate and other model inputs (Supplementary Table 1).

OWSAs were performed on the base case by varying inputs with parameter uncertainty to their extreme values [± standard error or 95% confidence interval (CI), or if neither was available ± 20% of the base case]. Given the difficulty in interpreting a tornado diagram where ICERs result in all four quadrants of the cost-effectiveness plane and particularly in the context of small QALY differences and small cost differences, incremental net benefit (INB) was estimated as: INB = incremental QALYs × threshold − incremental costs. A commonly accepted cost-effectiveness threshold of $50,000/QALY was specified [33, 34].
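The reason for reporting INB rather than ICERs in the tornado diagram can be illustrated with a short sketch. The formula and the $50,000/QALY threshold are taken from the text; the incremental values in the example are hypothetical. Two results in opposite quadrants of the cost-effectiveness plane can share the same ICER, whereas their INBs have opposite signs.

```python
def incremental_net_benefit(inc_qalys, inc_costs, threshold=50_000):
    """INB = incremental QALYs x threshold - incremental costs (in US$)."""
    return inc_qalys * threshold - inc_costs


# Hypothetical results: more effective and costlier vs. less effective and cheaper.
north_east = (0.1, 2_000)
south_west = (-0.1, -2_000)

for inc_qalys, inc_costs in (north_east, south_west):
    ratio = inc_costs / inc_qalys  # identical ICER in both cases
    inb = incremental_net_benefit(inc_qalys, inc_costs)
    print(round(ratio), round(inb))
```

A positive INB indicates that the intervention is cost-effective at the chosen threshold, regardless of the quadrant in which the result falls.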

Probabilistic Sensitivity Analyses

To follow best practice guidelines [19], PSAs were conducted to assess the parameter uncertainty of the model by simultaneously varying all uncertain parameters according to their assumed distribution [35]; this was based on second-order Monte Carlo simulation of 300 iterations on 500 patients, where results had stabilized. Cost-effectiveness acceptability curves (CEACs) were generated to depict the proportion of cost-effective simulations, or the probability of cost-effectiveness, over a range of willingness-to-pay thresholds. Scatter plots illustrated incremental health outcome gains against incremental costs on the cost-effectiveness plane.
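The construction of a CEAC from PSA output can be sketched as follows: for each willingness-to-pay threshold, the curve reports the share of simulations in which the incremental net benefit is positive. The 300 iterations match the number stated in the text; the simulated incremental costs and QALYs are drawn from hypothetical normal distributions purely for illustration and do not represent the model's outputs.

```python
import random


def ceac(inc_costs, inc_qalys, thresholds):
    """Map each threshold to the proportion of simulations with positive net benefit."""
    n = len(inc_costs)
    curve = {}
    for wtp in thresholds:
        n_cost_effective = sum(
            1 for dc, dq in zip(inc_costs, inc_qalys) if dq * wtp - dc > 0
        )
        curve[wtp] = n_cost_effective / n
    return curve


rng = random.Random(0)  # fixed seed so the sketch is reproducible
N_ITERATIONS = 300      # number of PSA iterations stated in the text

# Hypothetical distributions of incremental costs (US$) and incremental QALYs.
sim_costs = [rng.gauss(5_000, 20_000) for _ in range(N_ITERATIONS)]
sim_qalys = [rng.gauss(0.05, 0.10) for _ in range(N_ITERATIONS)]

curve = ceac(sim_costs, sim_qalys, thresholds=range(0, 200_001, 50_000))
for wtp, prob in curve.items():
    print(wtp, round(prob, 3))
```

Plotting the resulting probabilities against the thresholds gives the acceptability curve; with several comparators, the same calculation would be repeated using net benefit per strategy and the share of simulations in which each strategy ranks highest.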


Assuming all treatment classes were ascribed equivalent time to treatment discontinuation after 6-month response, average time on first-line treatment based on response ranged from 3.5 years (tofacitinib) to 5.0 years (etanercept) and was lowest at 2.3 years for the csDMARD active treatment sequence. Sarilumab patients experienced a longer time on first-line therapy (4.3 years) versus all comparators except etanercept (5.0 years) because of a relatively higher response rate (Table 4).

Table 4 Base-case cost-effectiveness analysis results

QALYs ranged from 3.43 for the active csDMARD treatment sequence to 5.79 for sarilumab and 5.94 for etanercept treatment sequences. Lifetime costs ranged from $115,019 for the active csDMARD to $524,832 for etanercept treatment sequences. By achieving lower costs and higher effectiveness in first-line treatment, sarilumab dominated adalimumab, certolizumab, golimumab and tofacitinib treatment sequences. Sarilumab was associated with ICERs of $84,079 and $134,286 compared with tocilizumab and csDMARD active treatment sequences, respectively. The substantially higher costs and minimal gain in effectiveness of etanercept versus sarilumab would result in an ICER > $500,000 (Table 4). csDMARD active treatment, sarilumab and etanercept were the three drug sequences that were not dominated or extendedly dominated, and therefore were positioned on the cost-effectiveness frontier (Table 4, Fig. 2a).

Fig. 2 Deterministic and probabilistic sensitivity analyses: a cost-effectiveness efficiency frontier with and without csDMARD active treatment; b tornado diagram of incremental net benefit vs. TCZ; c cost-effectiveness plane for sarilumab 200 mg SC q2w + methotrexate vs. tocilizumab 162 mg SC q1w/q2w mix + methotrexate; d cost-effectiveness acceptability curve. ADA adalimumab; CTZ certolizumab; ETA etanercept; GOL golimumab; INB incremental net benefit; MTX methotrexate; SAR sarilumab; SC subcutaneous; TCZ tocilizumab; TOF tofacitinib

From a societal perspective, the sarilumab sequence would result in fewer workdays lost because of better treatment response and longer treatment duration versus most of the comparator sequences (differences ranging from − 10 to − 21 days versus the other bDMARDs and tofacitinib; − 370 days versus csDMARD active treatment); however, versus etanercept, sarilumab was associated with 25 additional workdays lost (Table 4). Therefore, the ICERs remained stable in the scenario analyses that considered societal costs (i.e., workdays lost) (Table 5). Across other scenarios, results were sensitive to the model time horizon and treatment discontinuation (Table 5). OWSA of sarilumab versus tocilizumab revealed that results were most sensitive to sarilumab dose and dosing intensity as well as tocilizumab dose (Fig. 2b).

Table 5 Scenario analyses

Simulations from PSA for sarilumab versus tocilizumab were slightly to the right-hand side of the origin on the cost-effectiveness plane, indicating comparable costs and slightly better QALYs for sarilumab (Fig. 2c). CEAC indicated that, irrespective of threshold, the probabilities of sarilumab and tocilizumab SC being cost-effective were close to equivalent (Fig. 2d).


Findings from this cost-effectiveness analysis indicate that a treatment sequence initiated with sarilumab can be cost-effective compared with sequences beginning with any other bDMARD, the tsDMARD tofacitinib or csDMARD active treatment for adults with moderate-to-severely active RA who have had an inadequate response to methotrexate. The sarilumab + methotrexate treatment sequence was on the efficiency frontier. Compared with tocilizumab, it resulted in an ICER within acceptable ranges according to the World Health Organization's suggested threshold of up to three times the per capita gross domestic product [36], thresholds of $129,000 to $150,000 reported across a range of diseases and populations [33, 34], and an updated threshold of up to $150,000 suggested for the US [37].

Versus the sequence that begins with csDMARD active treatment, higher response rates and longer time on first-line treatment for sarilumab (63.6% on ACR20; 4.3 years, respectively) combined with methotrexate resulted in lower HAQ-DI scores. As such, patients in the sarilumab sequence attained reduced mortality and higher levels of health-related quality-of-life. Sarilumab, compared with adalimumab, certolizumab, golimumab and tofacitinib, each combined with methotrexate, was the dominant strategy (i.e., more effective and less costly). This outcome was due to increased QALYs for sarilumab derived from higher response rates coupled with lower drug and administration costs, directly and indirectly from the shorter time subsequently spent on some of the more costly treatment lines.

The various scenario analyses indicated that the model was robust and that the economic efficiency of sarilumab was upheld from the societal perspective in terms of reduced workdays lost associated with treatment relative to most comparators. Only the scenarios with short time frames and discontinuation rates based on drug class were of consequence. Reasonably long model horizons are necessary for differences in duration of treatment to fully accrue; therefore, shorter horizons did not allow the benefits associated with increased time on sarilumab treatment to fully emerge, reducing the QALYs and the economic efficiency of sarilumab. However, the scenario that applied discontinuation rates ascribed to the IL-6 class compared with the relatively lower duration of treatment for the TNFi drug class enhanced the treatment duration for sarilumab and provided increased benefits and reduced costs. It is notable that a very conservative assumption was applied to the model in the base case, whereby an equivalent rate of treatment discontinuation was applied for all treatment classes, despite the potential increased benefit on this parameter specifically for IL-6 inhibitors according to the RHUMADATA analysis.

In terms of cost-effectiveness, sarilumab was dominant versus certolizumab, golimumab, tofacitinib and adalimumab. While etanercept was more effective, it was also more expensive, producing an ICER of $609,545 versus sarilumab. These findings are in distinct contrast to the recent evidence appraisal of RA biologics published in the report Targeted Immune Modulators for Rheumatoid Arthritis: Effectiveness and Value issued by the Institute for Clinical and Economic Review [18]. The discrepancy in results may be explained by the following two factors. First, the economic models in RA are highly sensitive to the manner in which the ACR response is converted to a utility score, the main driver of lifetime QALYs. The Institute for Clinical and Economic Review model employed a published utility function [31], but deviated from the original equation by putting all coefficients into exponential form, an approach that diverges from other published equations [38, 39]. This reduced the utility benefit associated with an HAQ-DI decrease resulting from, for example, ACR20 response from approximately 0.29 in the original equation to approximately 0.09 in the revised exponential equation, resulting in a substantially lower QALY value compared with the present analyses.

Second, the Institute for Clinical and Economic Review model assumed a relatively slow HAQ-DI progression among patients treated with csDMARD palliative treatment of 0.0269 per year; furthermore, this increase was only applied in the first 15 years, at which point RA was assumed to no longer be a progressive disease. Patients in the csDMARD active treatment sequence were ascribed an HAQ-DI score of 1.7 and could only progress to a maximum score of 2.1 (1.7 + 15 × 0.0269). Given that the HAQ-DI score in turn informs both costs and QALYs, a lower than expected lifetime HAQ-DI for csDMARD active treatment artificially enhanced its economic value compared with sequences beginning with targeted DMARDs. This approach also diverges from other published RA models [38,39,40]. In contrast, the present study assumed a constant HAQ-DI progression at a rate of 0.045 per year with no lifetime limits as applied in previous health technology appraisals [40]. As a result, some control patients included in the IPS reached a maximum HAQ-DI score of 3.0, which may be more reflective of the expected progressive disease burden of RA.
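The difference between the two progression assumptions can be made concrete with a small sketch. The rates (0.0269 per year for 15 years versus 0.045 per year without a time limit), the starting score of 1.7 and the scale maximum of 3.0 are taken from the text; applying the 1.7 starting score to both assumptions and the 30-year horizon are choices made here only to allow a like-for-like illustration.

```python
def haq_after(years, baseline, annual_rate, max_years=None, ceiling=3.0):
    """HAQ-DI after linear progression, optionally stopping after max_years."""
    effective_years = years if max_years is None else min(years, max_years)
    return min(ceiling, baseline + annual_rate * effective_years)


BASELINE = 1.7
HORIZON_YEARS = 30  # hypothetical horizon for the comparison

time_limited = haq_after(HORIZON_YEARS, BASELINE, 0.0269, max_years=15)
unlimited = haq_after(HORIZON_YEARS, BASELINE, 0.045)

print(round(time_limited, 4))  # 2.1035
print(round(unlimited, 4))     # 3.0
```

Under these inputs, the time-limited assumption leaves a patient roughly 0.9 HAQ-DI points below the unlimited assumption at 30 years, a gap that carries through to both costs and utilities.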

As for all cost-effectiveness models, the use of multiple sources and assumptions leads to important limitations. Similar to other long-term CEAs in RA [40], the limitations of the present study stem from the uncertainty inherent to the limited, and often conflicting, evidence regarding short-term treatment response rates and long-term benefits of bDMARDs. For example, the 6-month response rates are based on an NMA, which can be affected by a heterogeneous trial population and placebo arm response rates that have proven to be heterogeneous across trials. In the long term, because of the lack of long-term disease activity data after the initial 6 months, long-term treatment discontinuation from real-world evidence (RHUMADATA registry) was used as a proxy for determining the efficacy of treatment in reducing disease activity. Data from the Canadian RHUMADATA registry were used in lieu of US data. In addition, the present model assumed a linear progression for csDMARDs, while separate studies have assumed non-linear progression [21]. Because of linear progression, worse HAQ-DI scores were modeled in the long term, therefore resulting in relatively worse QALYs and higher costs for csDMARDs compared with a non-linear approach. This is an important limitation of the comparison with csDMARDs; however, since the population in this study is csDMARD-IR, this comparison is only conducted to compare results with prior publications [41], most of which use linear progression. If non-linear progression were used here, the results for this comparison would most likely change.

Consistent with previously published cost-effectiveness models in RA [21], the initial 6-month treatment response was based on ACR 20/50/70 response criteria and assumed to be independent from the line of treatment. This was a conservative assumption ascribed to the model to address the limited evidence on efficacy per line of treatment; however, some evidence supports that TNFi have lower efficacy when used after an initial TNFi [42, 43]. While the ACR 20/50/70 response criteria have limited clinical utility in RA, they are only included to determine if a treatment is active or not in the initial 6 months, in line with other RA cost-effectiveness studies. Additionally, extensive comparative data were only available for ACR response from the trials included.

The strengths of the current CEA include its conduct according to best practice guidelines [44]. Results of the CEA remained robust in various sensitivity analyses.


In patients with moderate-to-severe RA who are naïve to treatment with targeted DMARDs, sarilumab 200 mg SC q2w + methotrexate can be considered a cost-effective treatment option, as it results in lower costs and greater health benefit than multiple licensed bDMARD and tsDMARD treatments (adalimumab, certolizumab, golimumab and tofacitinib) and falls below commonly accepted cost-effectiveness thresholds compared with tocilizumab + methotrexate and csDMARD active treatment.