Introduction

Sarilumab is a human monoclonal antibody that blocks both the soluble and membrane forms of the interleukin-6 receptor. It is a biologic disease-modifying antirheumatic drug (bDMARD) for use as monotherapy or in combination with conventional synthetic (cs) DMARDs for the treatment of adults with moderate-to-severe rheumatoid arthritis (RA) who are intolerant of, or inadequate responders to, csDMARDs (csDMARD-IR) [1,2,3,4]. Evaluation of the comparative effectiveness and safety of sarilumab against other DMARDs is needed to guide evidence-based medicine [5]; however, there are limited head-to-head data comparing sarilumab with other relevant RA treatments. In the absence of direct evidence, a network meta-analysis (NMA) facilitates evaluation of sarilumab against other treatments using a combination of direct and indirect trial data. This NMA was conducted to evaluate the comparative efficacy and safety of subcutaneous (SC) sarilumab 200 mg monotherapy, administered every 2 weeks (q2w), versus other globally approved monotherapies for RA. Comparator treatments included bDMARDs, targeted synthetic DMARDs (tsDMARDs) and csDMARDs at their recommended doses for the treatment of RA as monotherapy in csDMARD-IR patients. Comparative evidence of sarilumab in combination with csDMARDs is published elsewhere [6].

Methods

A systematic literature review (SLR) and NMA were conducted following methods in line with Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines [7] and recommended in the current National Institute for Health and Care Excellence (NICE) specification for manufacturer and sponsor submission of evidence [8], as well as the 2016 NICE technology appraisal of adalimumab, etanercept, infliximab, certolizumab pegol, golimumab, tocilizumab, and abatacept for RA [9]. Due to the nature of the study it was not registered with clinicaltrials.gov or a similar body. This article is based on previously conducted studies and does not contain any studies with human participants or animals performed by any of the authors.

Study Selection

Searches for the SLR were conducted in the MEDLINE, EMBASE, and Cochrane databases (all without any time limit), plus conference proceedings since 2013 for evidence published until December 6, 2016. Studies were selected according to pre-defined population/intervention/comparator/outcome/study design criteria (Table 1) [7, 8, 10, 11]. All titles, abstracts and articles were then screened independently by two researchers, with study selection following published best practice guidelines for indirect treatment comparisons [8, 10, 11].

Table 1 Population/intervention/comparator/outcome/study design and search criteria for the SLR

Data on study design, patient characteristics, efficacy, safety and patient-reported outcomes at the time points 12 (± 4), 24 (± 4) and 52 (± 8) weeks for all studies (except open-label extensions) were extracted independently by two reviewers in a pre-defined data extraction process. Evidence for the NMA was filtered for drugs licensed for RA at doses approved in Europe, the USA and Canada. All trials comparing one intervention of interest with at least one other intervention of interest or methotrexate or ≥ 1 csDMARD(s) were considered in the evidence base.

Small studies have been shown to distort meta-analyses [12]; therefore, studies with fewer than 30 patients per arm were excluded. Studies that did not report any outcomes of interest were also excluded.
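The two exclusion rules above can be sketched as a simple filter. This is a minimal illustration, not the authors' screening procedure: the `Study` record, its field names and the three example trials are hypothetical, and the sketch assumes that a study is excluded when any one arm falls below the threshold.

```python
from dataclasses import dataclass
from typing import List

MIN_PATIENTS_PER_ARM = 30  # threshold stated in the text


@dataclass
class Study:
    name: str              # hypothetical label, not a real trial
    arm_sizes: List[int]   # randomized patients in each arm
    reports_outcome: bool  # True if any outcome of interest is reported


def is_eligible(study: Study) -> bool:
    """Keep a study only if every arm meets the size threshold (assumed
    reading of the rule) and at least one outcome of interest is reported."""
    large_enough = all(n >= MIN_PATIENTS_PER_ARM for n in study.arm_sizes)
    return large_enough and study.reports_outcome


# Hypothetical evidence base, for illustration only.
studies = [
    Study("Trial A", [120, 118], True),
    Study("Trial B", [25, 27], True),     # arms too small
    Study("Trial C", [150, 149], False),  # no outcome of interest
]
eligible = [s.name for s in studies if is_eligible(s)]
print(eligible)  # ['Trial A']
```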

Treatment Categorisation

Different licensed dosages and different routes of administration [e.g., intravenous (IV) versus SC delivery] of the same treatment were pooled in many cases, on the basis of evidence of equivalence (Supplementary Table S1). These decisions were explored by examining forest plots of the odds ratio (OR) for American College of Rheumatology (ACR) 20% response criteria (ACR20) at 24 weeks in individual studies by group of interventions. If the confidence intervals overlapped (e.g., as for infliximab studies), the doses were pooled. The validity of the decisions was also confirmed via clinician input.
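The overlap check described above can be illustrated with a short sketch. It assumes a conventional Wald (Woolf) confidence interval on the log OR scale, which the text does not specify, and the responder counts are hypothetical rather than taken from any included trial.

```python
import math
from typing import Tuple


def odds_ratio_ci(events_t: int, n_t: int, events_c: int, n_c: int,
                  z: float = 1.96) -> Tuple[float, float, float]:
    """Odds ratio with a Wald 95% confidence interval from 2x2 counts."""
    a, b = events_t, n_t - events_t  # responders / non-responders, treatment
    c, d = events_c, n_c - events_c  # responders / non-responders, control
    if min(a, b, c, d) == 0:
        raise ValueError("zero cell: the odds ratio is undefined")
    log_or = math.log((a * d) / (b * c))
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    return math.exp(log_or), math.exp(log_or - z * se), math.exp(log_or + z * se)


def intervals_overlap(ci_1: Tuple[float, float], ci_2: Tuple[float, float]) -> bool:
    """True when two (lower, upper) intervals share at least one point."""
    return ci_1[0] <= ci_2[1] and ci_2[0] <= ci_1[1]


# Hypothetical ACR20 responder counts for two dose regimens of one treatment.
or_1, lo_1, hi_1 = odds_ratio_ci(60, 100, 30, 100)
or_2, lo_2, hi_2 = odds_ratio_ci(55, 100, 35, 100)
pool_doses = intervals_overlap((lo_1, hi_1), (lo_2, hi_2))
print(f"OR 1 = {or_1:.2f}, OR 2 = {or_2:.2f}, pool doses: {pool_doses}")
```

Overlapping intervals are only a screening heuristic; as the text notes, the pooling decisions were also checked against clinician input.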

Outcomes Examined

Key efficacy endpoints were extracted and analyzed including ACR20, ACR 50% response criteria (ACR50), ACR 70% response criteria (ACR70), and the Health Assessment Questionnaire Disability Index (HAQ-DI) change from baseline. The European League Against Rheumatism (EULAR) Disease Activity Score 28-joint count (DAS28) remission (defined as DAS28 erythrocyte sedimentation rate or C-reactive protein < 2.6) was also extracted; however, this endpoint was not analyzed given that the EULAR networks were small and a high level of variability was observed in response rates between the different studies. Safety endpoints included the proportion of patients with any serious infection (SI) and the proportion of patients with any serious adverse event (SAE). All efficacy and safety outcomes were examined at 24 weeks as this was the assessment period with the most data available for analysis.

Network Meta-Analysis

Feasibility Assessment

Prior to the conduct of the NMA, a feasibility assessment was conducted to assess whether the evidence base was sufficient to draw feasible networks for all outcomes of interest. The exchangeability assumption is critical and requires that selected trials measure the same underlying relative treatment effects. Deviations from this assumption can be evaluated through two metrics: heterogeneity (i.e., evaluation of comparability in characteristics and results across included studies) and consistency (i.e., evaluation of consistency between direct and indirect evidence). Potential effect modifiers were evaluated by examining the link between patient characteristics at baseline and ACR20, given the expected variation in patient characteristics across RA studies [11, 13, 14], which can limit the validity of indirect comparisons; only weight was identified as an effect modifier.

Variability of response in the placebo arms is an issue which can limit indirect comparisons, and the heterogeneity of RA studies has been previously noted [15] where the treatment effect expressed as log ORs has a negative relationship with the baseline risk [9, 16]. While the common comparator across most monotherapy trials was an active comparator (adalimumab), for the few placebo-controlled studies, variability was therefore considered in the selection of models.

Bayesian NMA

The efficacy and safety of the treatments included in the analysis were evaluated using a Bayesian NMA approach [10, 14, 17], comprising a likelihood distribution, a model with parameters and prior distributions for these parameters. A linear model with normal likelihood distribution was used for continuous outcomes, and a binomial likelihood with a log link was used for the dichotomous outcomes [15, 18]. Consistent with NICE guidelines, flat (non-informative) prior distributions were assumed for nearly all outcomes so that the priors would not influence the results [15]. Prior distributions of the baseline treatments and relative treatment effects were normal, with a mean of 0 and a variance of 10,000; an informative prior was used for the between-study variance, in line with the NICE recommendation for cases of limited data [15]. Random- and fixed-effects models were evaluated to allow for heterogeneity of treatment effects between studies, with the choice of base case informed by Deviance Information Criterion (DIC) values and mean total residual deviance (compared against the number of fitted data points), as well as consistency with directly reported trial results [17]. Posterior densities for unknown parameters were estimated using Markov chain Monte Carlo simulations.
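The DIC-based model choice can be sketched as follows. The sketch uses the standard definition of DIC (posterior mean deviance plus the effective number of parameters pD). The deviance values, the function names and the 3-point threshold for a meaningful DIC difference are illustrative assumptions, not values or rules reported in this study.

```python
from typing import Tuple


def dic(mean_deviance: float, deviance_at_posterior_mean: float) -> Tuple[float, float]:
    """Return (DIC, pD), where pD = Dbar - D(theta_bar) and DIC = Dbar + pD."""
    p_d = mean_deviance - deviance_at_posterior_mean
    return mean_deviance + p_d, p_d


def prefer_fixed_effects(dic_fixed: float, dic_random: float,
                         min_difference: float = 3.0) -> bool:
    """Keep the simpler fixed-effects model unless the random-effects DIC is
    lower by at least min_difference (an assumed, adjustable threshold)."""
    return dic_random > dic_fixed - min_difference


# Hypothetical posterior summaries for the two candidate models.
dic_fe, pd_fe = dic(mean_deviance=120.0, deviance_at_posterior_mean=105.0)
dic_re, pd_re = dic(mean_deviance=118.5, deviance_at_posterior_mean=101.0)

print(f"fixed effects:  DIC = {dic_fe:.1f}, pD = {pd_fe:.1f}")
print(f"random effects: DIC = {dic_re:.1f}, pD = {pd_re:.1f}")
print("base case: fixed effects" if prefer_fixed_effects(dic_fe, dic_re)
      else "base case: random effects")
```

In practice the DIC comparison would be read alongside the mean total residual deviance and the agreement with directly reported trial results, as described above.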

All results for OR-NMA and risk difference (RD)-NMA were based on 100,000 iterations on three chains, with a burn-in of 20,000 iterations. Convergence was assessed by visual inspection of trace plots. The accuracy of the posterior estimates was assessed using the Monte Carlo error for each parameter (Monte Carlo error < 1% of the posterior standard deviation). All models were implemented using WinBUGS. Results of the NMA are presented in terms of ‘point estimates’ (median of posterior) for the relative treatment effects, along with the 95% credible intervals.
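The posterior summaries and the Monte Carlo error criterion can be illustrated with a short sketch. It uses a batch-means estimator, one common way to estimate Monte Carlo error, and independent normal draws as a stand-in for real MCMC output; the batch count, seed and sample size are arbitrary assumptions, and real chains would be autocorrelated.

```python
import math
import random
import statistics
from typing import List, Tuple


def batch_means_mc_error(samples: List[float], n_batches: int = 50) -> float:
    """Estimate the Monte Carlo standard error of the posterior mean by batch means."""
    batch_size = len(samples) // n_batches
    means = [
        statistics.fmean(samples[i * batch_size:(i + 1) * batch_size])
        for i in range(n_batches)
    ]
    return statistics.stdev(means) / math.sqrt(n_batches)


def summarize(samples: List[float]) -> Tuple[float, float, float]:
    """Posterior median with an equal-tailed 95% credible interval."""
    cuts = statistics.quantiles(samples, n=40)  # cut points at 2.5%, 5%, ..., 97.5%
    return statistics.median(samples), cuts[0], cuts[-1]


random.seed(1)
draws = [random.gauss(0.0, 1.0) for _ in range(40_000)]  # stand-in for MCMC draws

mc_error = batch_means_mc_error(draws)
ratio = mc_error / statistics.stdev(draws)
median, lower, upper = summarize(draws)

print(f"MC error / posterior SD = {ratio:.4f}; below 1%: {ratio < 0.01}")
print(f"median = {median:.3f}, 95% CrI = ({lower:.3f}, {upper:.3f})")
```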

Scenario Analyses

Two scenario analyses were conducted. The first excluded studies conducted in exclusively Asian populations (i.e., the SATORI, CHANGE and Etanercept 309 studies) to test the potential modifying effect of patient body weight (with Asian ethnicity serving as a proxy for populations with relatively lower body weight than other populations). For the second analysis, tumour necrosis factor (TNF)-α inhibitors were pooled together as a class. For the latter scenario analysis, ACR outcomes were compared with the base case, which evaluated the TNF-α inhibitors individually. This scenario was evaluated to inform cost-effectiveness evaluations of sarilumab.

Results

SLR Search and Selection

Following the SLR search, a total of 31 citations that met the screening criteria were retrieved (Fig. 1); these reported the results of 19 randomized clinical trials (RCTs). Five RCTs were excluded because the comparator was out of scope. Of the 14 RCTs included in the NMA feasibility assessment, 5 were excluded because they reported no outcomes of interest. A total of 9 RCTs were therefore included in the NMA (Fig. 1), including the MONARCH trial of sarilumab versus adalimumab monotherapy (NCT02332590) [2].

Fig. 1

Systematic review and network meta-analyses study selection flow chart. NMA Network meta-analysis

Network Meta-Analysis Evidence Base

Evaluation of the evidence base indicated that the ACR networks, and in particular the ACR20 network (Fig. 2), were the most robust, with most of the included interventions informed by multiple trials.

Fig. 2

Evidence base networks for American College of Rheumatology 20% response rate outcomes at 24 weeks. csDMARD conventional synthetic disease-modifying antirheumatic drug

Key features of patient demographics and baseline data from the selected studies are provided in Supplementary Table S2. Among the 9 trials included, 4 were phase III trials and 1 each was a phase II, phase II/III and phase IV trial; the trial phase was not reported for the remaining 2 trials. Study durations varied from 24 weeks up to 52 weeks; however, 1 dose-ranging study of tofacitinib compared with adalimumab was designed for cross-over to tofacitinib at week 12 for patients in the adalimumab arm [19]. Therefore, this study was included only for evaluating outcomes of tofacitinib versus placebo and included in the relevant networks. In 3 studies, patients had to have been on stable methotrexate for at least 12 weeks prior to entering the study; in another 3 studies this criterion was not required, and in the remaining studies no such information was reported.

Sample sizes varied from fewer than 50 patients to more than 150 patients per randomized group. Rescue medication was permitted in 5 of the trials and not reported in the remainder of the trials.

Base Case Model Choice

The feasibility assessment indicated that there was considerably less variability where the common comparator was an active comparator (adalimumab), with some degree of variability where placebo was the control arm. Therefore, a fixed-effects model using a conventional OR model was selected as the base-case model for ACR20/50/70 outcomes (ACR20: DIC = 141.98; ACR50: DIC = 135.62; ACR70: DIC = 99.93). For the continuous efficacy outcome, HAQ-DI, a standard change from baseline NMA was conducted. In addition, the RD-NMA was applied for DAS28 < 2.6 and safety outcomes due to convergence issues in the OR model and/or due to the relative rarity of the event.
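One way to see why a risk-difference scale can be easier to work with for rare events is that the OR is undefined whenever an arm has zero events, whereas the RD is not. The sketch below is a simple arm-level illustration of that point, not the Bayesian RD-NMA itself, and the event counts are hypothetical.

```python
import math
from typing import Optional


def risk_difference(events_t: int, n_t: int, events_c: int, n_c: int) -> float:
    """Difference in event proportions (treatment minus control)."""
    return events_t / n_t - events_c / n_c


def log_odds_ratio(events_t: int, n_t: int, events_c: int, n_c: int) -> Optional[float]:
    """Log odds ratio, or None when a zero cell makes it undefined."""
    cells = (events_t, n_t - events_t, events_c, n_c - events_c)
    if min(cells) == 0:
        return None
    a, b, c, d = cells
    return math.log((a * d) / (b * c))


# Hypothetical serious-infection counts: 2/180 on treatment, 0/185 on control.
rd = risk_difference(2, 180, 0, 185)
log_or = log_odds_ratio(2, 180, 0, 185)
print(f"risk difference = {rd:.4f}")
print(f"log odds ratio  = {log_or}")  # None: no events in the control arm
```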

Base Case Network Meta-Analysis Results

In the base case, the NMA indicated superior efficacy for sarilumab 200 mg versus adalimumab monotherapy on all efficacy outcomes and versus tofacitinib monotherapy on ACR20 (Table 2). Compared with csDMARDs, sarilumab 200 mg had superior efficacy on ACR 20/50/70 criteria and DAS28 < 2.6 but had similar efficacy on HAQ-DI. Efficacy of sarilumab 200 mg was similar to that of certolizumab, etanercept and tocilizumab 8 mg/kg monotherapy across all efficacy outcomes, and to that of tofacitinib monotherapy on all efficacy outcomes other than ACR20. All safety outcomes appeared similar for sarilumab 200 mg versus all comparator monotherapies.

Table 2 Summary results for sarilumab SC 200 mg q2w monotherapy versus other monotherapies in the csDMARD-IR population: median estimates of relative treatment effects (95% credible intervals; base case) for week 24 efficacy and safety (SI, SAE)

Scenario Analyses

After excluding the SATORI, CHANGE and Etanercept 309 studies, sarilumab 200 mg monotherapy was found to be statistically superior to placebo and adalimumab, and comparable to certolizumab, etanercept, tocilizumab 8 mg/kg IV and tofacitinib monotherapies. For the scenario in which TNF-α inhibitor treatments were pooled together, sarilumab 200 mg monotherapy was found to be statistically superior to placebo, csDMARDs, TNF-α inhibitors and tofacitinib. Compared with tocilizumab 8 mg/kg IV, sarilumab efficacy was comparable.

Discussion

Active comparator-controlled, randomized trials evaluating the comparative efficacy and safety of bDMARDs and tsDMARDs for the treatment of RA are few and limited to adalimumab as active comparator, including the MONARCH study of sarilumab compared with adalimumab [2, 20,21,22,23,24]. The objective of this study was to evaluate the comparative efficacy and safety of SC sarilumab 200 mg monotherapy, administered q2w versus other globally approved monotherapies for RA.

Consistent with the head-to-head trial of sarilumab versus adalimumab in MONARCH, the present NMA indicated that sarilumab 200 mg monotherapy had superior efficacy versus adalimumab monotherapy on all outcomes. It was also found that sarilumab had superior efficacy versus tofacitinib monotherapy on ACR20 and versus csDMARDs on ACR 20/50/70 criteria and DAS28 < 2.6. On other outcomes for these monotherapies and against all the other tsDMARD monotherapies, similar efficacy was observed. Rates of SI/SAE for sarilumab monotherapy were similar to those of all bDMARD, tsDMARD and csDMARD monotherapies.

These results are consistent with previously published NMAs of the comparator bDMARD, tsDMARD and csDMARD monotherapies in RA, without sarilumab included in the networks [25,26,27]. In a 2015 meta-analysis by Buckley et al., it was found that patients receiving a TNF inhibitor or tofacitinib had greater ACR 20/50/70 responses than those receiving placebo [27]. While tocilizumab resulted in greater responses than TNF inhibitors, the differences were not statistically significant.

A more recent 2017 NMA of 28 studies comparing biologic monotherapy versus methotrexate, placebo or other biologic monotherapy found that all agents, except anakinra and infliximab, were superior to placebo [26]. Furthermore, etanercept and rituximab were superior to anakinra, and tocilizumab was superior to adalimumab, anakinra, certolizumab, and golimumab. No differences were found between etanercept, tocilizumab and rituximab. A Cochrane review, updated in 2016, compared the efficacy and safety of biologic or tofacitinib monotherapy in patients who had failed traditional DMARDs [25]. The analysis of 46 RCTs found that, overall, biologic therapy was superior to placebo and methotrexate/other DMARDs. TNF inhibitors and non-TNF inhibitors showed similar efficacy. However, anakinra and tofacitinib showed results for HAQ similar to those of placebo.

Strengths and Limitations

One limitation of the study was the variability in placebo response observed across the network; however, given the large number of active-controlled trials for the monotherapy csDMARD-IR population, it was feasible to address this situation by conducting OR-NMA [15, 28].

In addition, some of the comparative studies of etanercept, certolizumab pegol and tofacitinib required 2 intermediate comparators rather than 1, thus creating a greater degree of uncertainty.
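The added uncertainty from a longer chain of comparators can be illustrated with a simple frequentist analogue, an adjusted indirect comparison in which independent log ORs are summed along the path and their variances add. This is a stand-in for intuition only, not the Bayesian NMA used here, and the estimates and standard errors are hypothetical.

```python
import math
from typing import List, Tuple


def chain_indirect(links: List[Tuple[float, float]]) -> Tuple[float, float]:
    """Combine consecutive comparisons, each given as (log OR, SE), along a path
    of independent trials: the log ORs sum and the variances add."""
    log_or = sum(estimate for estimate, _ in links)
    se = math.sqrt(sum(std_err ** 2 for _, std_err in links))
    return log_or, se


# Hypothetical links, each with the same precision (SE = 0.2 on the log OR scale).
one_intermediate = [(0.40, 0.2), (0.10, 0.2)]               # A vs B, B vs C
two_intermediates = [(0.40, 0.2), (0.10, 0.2), (0.05, 0.2)]  # A vs B, B vs C, C vs D

_, se_one = chain_indirect(one_intermediate)
_, se_two = chain_indirect(two_intermediates)
print(f"SE with 1 intermediate comparator:  {se_one:.3f}")
print(f"SE with 2 intermediate comparators: {se_two:.3f}")
```

With equally precise links, each extra step in the path widens the interval around the indirect estimate, which is the source of the greater uncertainty noted above.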

The strengths of this NMA include the range of efficacy and safety outcomes which have been considered, providing a comprehensive view of the comparative efficacy and safety of sarilumab to inform clinical decision-making and the conduct of health technology assessments. The most robust networks, ACR20/50, used only one common comparator on all comparisons with sarilumab on these endpoints. The scenario analyses confirmed the results against the base case analysis, where comparisons were feasible.

The present indirect comparison was conducted following best practice guidelines and demonstrated that sarilumab SC 200 mg monotherapy has superior efficacy compared with adalimumab, as well as csDMARDs alone, and comparable or better efficacy and similar safety compared with other bDMARDs and tsDMARDs in the csDMARD-IR patient populations. Compared with tocilizumab 8 mg/kg IV, sarilumab 200 mg had similar efficacy and safety.

Conclusion

In csDMARD-IR patients, sarilumab 200 mg monotherapy has superior efficacy and similar safety versus csDMARDs, superior efficacy and similar safety versus adalimumab, and similar efficacy and safety versus other bDMARDs and tsDMARDs.