Background

The understanding of the multifunctional role of interleukin-6 (IL-6) in biologic activities has expanded in the last decade [1, 2]. Dysregulation of IL-6 has been implicated in the onset or development of several diseases, particularly inflammatory disorders such as rheumatoid arthritis (RA) [3, 4], whereby elevated levels of IL-6 in serum, synovial fluid, and various tissues have correlated with increased RA disease activity [5, 6].

The contribution of IL-6 to joint inflammation and bone erosion in RA is well established [7]; however, IL-6 has also been associated with non-articular manifestations of RA, including anemia [8], type 2 diabetes mellitus [9], and increased cardiovascular risk [10]. IL-6 levels are also associated with several RA-related patient-reported outcomes (PROs), including fatigue and pain [11,12,13]. Studies of anti-IL-6R agents, such as tocilizumab [14,15,16,17,18,19,20,21] and sarilumab [22,23,24], in the treatment of moderate-to-severe RA have revealed the benefits of IL-6 inhibition, not only in the reduction of disease activity, but also in the improvement of pain and mood disorders associated with RA. The value of these clinical and PRO data notwithstanding, a formal association between IL-6 levels and overall health-related quality of life (HRQoL) in RA patients has not been investigated to date. Given that there are two approved therapeutics for RA that specifically block IL-6 signaling, a better understanding of the association between IL-6 levels and HRQoL, fatigue, and morning stiffness is warranted to assess IL-6 as a potential biomarker to guide RA clinical decision-making.

Sarilumab is a fully human monoclonal antibody directed against both soluble and membrane-bound IL-6 receptor α (anti-IL-6Rα); this biologic disease-modifying antirheumatic drug (bDMARD) is approved for treatment of adult patients with moderate-to-severely active RA with inadequate responses or intolerance to one or more DMARDs [25, 26]. Sarilumab can be used in combination with methotrexate or as monotherapy when treatment with methotrexate is not appropriate. The MONARCH phase III, randomized controlled trial (RCT) of sarilumab (NCT02332590), compared the efficacy and safety of subcutaneous (SC) sarilumab 200 mg monotherapy every 2 weeks (q2w) versus adalimumab 40 mg SC monotherapy q2w in patients with RA not receiving methotrexate due to intolerance or inadequate responses. Adalimumab, a tumor necrosis factor α inhibitor (TNFi) bDMARD, is approved for the treatment of active RA and can also be used in combination or as monotherapy.

The MONARCH RCT demonstrated greater reductions in disease activity and symptoms of RA [24], with greater improvements in PROs including HRQoL [27] with sarilumab versus adalimumab. Safety profiles of both therapies were consistent with previously reported data in both therapeutic classes [28,29,30,31].

The objective of these post hoc analyses was to evaluate whether baseline levels of IL-6 are associated with improvements in PROs including HRQoL with sarilumab versus adalimumab.

Methods

Biomarker assessments

Serum levels of IL-6 were measured using a validated enzyme-linked immunosorbent assay in 300 of 369 randomized patients in the intent-to-treat population who provided consent with at least one serum sample drawn at baseline (i.e., the biomarker population). Patients were categorized into tertiles of baseline IL-6 levels across both treatment groups, classified as low, medium, and high, based on ranges of 1.6–7.1 pg/mL, 7.2–39.5 pg/mL, and 39.6–692.3 pg/mL, respectively.
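The tertile assignment described above can be sketched in a few lines of Python. This is only an illustration: the cut-offs are the ranges reported in the text, while the function name `il6_tertile` and the example values are hypothetical, and the sketch assumes values are reported to one decimal place so that nothing falls between adjacent ranges.

```python
# Upper bounds (pg/mL) of the low and medium tertiles, as reported in the text.
LOW_MAX = 7.1
MEDIUM_MAX = 39.5


def il6_tertile(il6_pg_ml: float) -> str:
    """Return the baseline IL-6 tertile label for one serum value (pg/mL)."""
    if il6_pg_ml <= LOW_MAX:
        return "low"
    if il6_pg_ml <= MEDIUM_MAX:
        return "medium"
    return "high"


# Hypothetical baseline values, for illustration only (not study data).
example_values = [3.2, 7.1, 7.2, 39.5, 39.6, 250.0]
labels = [il6_tertile(v) for v in example_values]
print(labels)
```

In a new cohort the cut-offs would be recomputed from that cohort's own baseline distribution rather than reused from these ranges.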

HRQoL endpoints

Three PRO questionnaires were administered at baseline, week (W)24, and W52: Short Form 36 (SF-36), Functional Assessment of Chronic Illness Therapy (FACIT)-fatigue, and duration of morning stiffness visual analog scale (AM-stiffness VAS). SF-36 scores evaluated included the physical and mental component summaries (PCS, MCS) and the following domains: physical functioning (PF), role-physical (RP), bodily pain (BP), general health (GH), vitality (VT), social functioning (SF), role-emotional (RE), and mental health (MH). Minimum clinically important differences (MCID) for these endpoints were 5.0 for SF-36 domains [32], 2.5 for PCS and MCS [33], 4.0 for FACIT-fatigue [34, 35], and 10.0 mm for AM-stiffness [21].
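As a minimal sketch of how a within-patient improvement ≥ MCID could be flagged, the Python below uses the thresholds stated above for PCS, MCS, FACIT-fatigue, and AM-stiffness. The function name `is_responder` and the example scores are hypothetical. The direction of improvement is stated explicitly as an assumption of the sketch: higher scores are better for SF-36 and FACIT-fatigue, and lower values are better for AM-stiffness VAS.

```python
# (MCID threshold, direction of improvement); +1 means higher is better,
# -1 means lower is better. Thresholds are those stated in the text.
MCID = {
    "SF-36 PCS": (2.5, +1),
    "SF-36 MCS": (2.5, +1),
    "FACIT-fatigue": (4.0, +1),
    "AM-stiffness VAS": (10.0, -1),
}


def is_responder(endpoint: str, baseline: float, follow_up: float) -> bool:
    """True if the change from baseline is an improvement of at least the MCID."""
    threshold, direction = MCID[endpoint]
    improvement = direction * (follow_up - baseline)
    return improvement >= threshold


# Hypothetical patient scores, for illustration only.
print(is_responder("SF-36 PCS", 30.0, 32.5))
print(is_responder("AM-stiffness VAS", 60.0, 51.0))
```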

Statistical analyses

The Kruskal-Wallis test first evaluated if patients with high baseline IL-6 levels reported worse baseline PRO scores versus those with medium or low IL-6 levels.
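For readers unfamiliar with the test, the following standard-library Python sketch computes the Kruskal-Wallis H statistic with tie correction for three groups. It is not the SAS procedure used in the study; the function name `kruskal_wallis` and the example scores are hypothetical. The p value uses the chi-square approximation, and the closed form `exp(-H / 2)` holds only for 2 degrees of freedom, that is, three groups.

```python
import math
from collections import Counter


def kruskal_wallis(groups):
    """Return (H, p) for three independent samples, with tie correction."""
    if len(groups) != 3:
        raise ValueError("this sketch handles exactly three groups")
    pooled = [x for g in groups for x in g]
    n_total = len(pooled)

    # Assign each distinct value the average of the ranks it occupies.
    counts = Counter(pooled)
    rank_of = {}
    next_rank = 1
    for value in sorted(counts):
        t = counts[value]
        rank_of[value] = next_rank + (t - 1) / 2
        next_rank += t

    # H from the squared rank sums of each group.
    rank_term = sum(sum(rank_of[x] for x in g) ** 2 / len(g) for g in groups)
    h = 12 / (n_total * (n_total + 1)) * rank_term - 3 * (n_total + 1)

    # Tie correction; it equals 1 when there are no ties.
    ties = sum(t**3 - t for t in counts.values())
    h /= 1 - ties / (n_total**3 - n_total)

    # Chi-square survival function with 2 degrees of freedom.
    p = math.exp(-h / 2)
    return h, p


# Hypothetical baseline scores by IL-6 tertile (low, medium, high).
h_stat, p_value = kruskal_wallis([[48, 52, 55], [44, 47, 50], [35, 39, 41]])
print(round(h_stat, 3), round(p_value, 4))
```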

The ability of IL-6 levels to predict improvements in HRQoL associated with sarilumab versus adalimumab was then tested using a linear fixed effect model of change from baseline (CFB) in PRO/HRQoL scores, with IL-6 tertile, treatment, region as stratification factor, and baseline IL-6 tertile-by-treatment interactions as fixed effects. The IL-6 tertile at baseline-by-treatment interaction term was calculated using low IL-6 tertile as a reference, i.e., it specifically evaluated whether there was a greater change in PRO/HRQoL scores in patients treated with sarilumab versus adalimumab in high or medium IL-6 tertile groups, respectively, compared with the low IL-6 tertile group. Pairwise comparisons of HRQoL scores between sarilumab versus adalimumab were performed separately for each IL-6 tertile, and least squares mean (LSM) CFB and corresponding 95% confidence intervals (CI) derived.
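One way to write this model explicitly is sketched below. The notation is assumed here for illustration and is not taken from the original analysis plan: $T_i$ is 1 for sarilumab and 0 for adalimumab, $M_i$ and $H_i$ indicate the medium and high baseline IL-6 tertiles (low is the reference), $\gamma_{r(i)}$ is the effect of the patient's region, and $\varepsilon_i$ is the residual error.

```latex
\begin{aligned}
\mathrm{CFB}_i ={}& \beta_0 + \beta_T\,T_i + \beta_M\,M_i + \beta_H\,H_i
  + \beta_{TM}\,T_i M_i + \beta_{TH}\,T_i H_i + \gamma_{r(i)} + \varepsilon_i,\\[4pt]
\Delta_{\text{low}} ={}& \beta_T,\qquad
\Delta_{\text{medium}} = \beta_T + \beta_{TM},\qquad
\Delta_{\text{high}} = \beta_T + \beta_{TH}.
\end{aligned}
```

Under this parameterization, $\Delta$ is the sarilumab-minus-adalimumab difference in adjusted mean CFB within a tertile, and the interaction coefficients $\beta_{TM}$ and $\beta_{TH}$ measure how much that difference in the medium or high tertile departs from the difference in the low tertile.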

Patient-level responses (W24 and W52) in HRQoL between sarilumab versus adalimumab were evaluated via logistic regression of within-patient improvements ≥ MCID, with treatment, region as stratification factor, IL-6 tertile at baseline, and IL-6 tertile at baseline-by-treatment interactions, specified as fixed effects. The Mantel-Haenszel estimate (stratified by the region) of odds ratio (OR) between sarilumab and adalimumab and 95% CIs were also derived in each IL-6 tertile.
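The Mantel-Haenszel pooled odds ratio can be illustrated with a short Python sketch. The function name `mantel_haenszel_or` and the counts are hypothetical, not study data. Each stratum, here standing in for a region, contributes one 2×2 table of responders and non-responders by treatment. The confidence interval is omitted from this sketch.

```python
def mantel_haenszel_or(strata):
    """Pooled odds ratio across 2x2 tables.

    Each stratum is a tuple (a, b, c, d):
      a = responders on sarilumab,   b = non-responders on sarilumab,
      c = responders on adalimumab,  d = non-responders on adalimumab.
    """
    numerator = 0.0
    denominator = 0.0
    for a, b, c, d in strata:
        n = a + b + c + d
        if n == 0:
            continue  # an empty stratum contributes nothing
        numerator += a * d / n
        denominator += b * c / n
    if denominator == 0:
        raise ZeroDivisionError("odds ratio undefined for these tables")
    return numerator / denominator


# Hypothetical counts for two regions within one IL-6 tertile.
example_strata = [(10, 5, 4, 11), (8, 7, 5, 10)]
print(round(mantel_haenszel_or(example_strata), 2))
```

With a single stratum the estimate reduces to the familiar cross-product ratio, ad/bc.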

As all predictive analyses were conducted post hoc, all p values should be considered to be nominal.

Finally, the incidences of treatment-emergent adverse events (AEs) in each IL-6 tertile were analyzed descriptively.

Analyses were performed using SAS version 9.2 or higher (SAS Institute Inc., Cary, NC).

Results

Analysis population

The biomarker population included 300 patients (Table 1), with 152 and 148 patients in the adalimumab and sarilumab groups, respectively. Demographics and baseline clinical characteristics between treatment arms were similar to the overall study population [24]. Mean ages (standard deviation [SD]) of patients in the adalimumab and sarilumab arms were 50.4 (± 12.5) years and 53.3 (± 12.0) years, respectively, and 78.6% and 83.7% were female. The proportions of patients in the adalimumab and sarilumab arms, respectively, were 35.5% and 31.1% in the high IL-6 tertile, 34.9% and 31.8% in the medium tertile, and 29.6% and 37.2% in the low tertile.

Table 1 Demographics and baseline disease characteristics of the biomarker population by treatment arm

Baseline disease characteristics and HRQoL scores

Patients with high baseline IL-6 levels reported worse baseline scores on SF-36 MCS and the SF, RE, RP, and BP domains, as well as AM-stiffness, compared with medium or low IL-6 tertile groups (Table 2).

Table 2 Baseline disease characteristics and HRQoL of the biomarker population, by IL-6 tertile

C-reactive protein (CRP) and erythrocyte sedimentation rate (ESR) were lower in the low IL-6 tertile; there were also fewer patients in this tertile with positive rheumatoid factor and positive anti-citrullinated peptide antibody.

Predictivity of IL-6 tertile

Nominal interaction p values comparing differences in HRQoL improvements in high versus low IL-6 tertiles at W24 were < 0.05 for SF-36 PCS and the PF domain, as well as for AM-stiffness. In patients with high baseline IL-6 levels, compared with patients in the low tertile, sarilumab treatment had a larger effect on HRQoL than adalimumab, which had stable and similar effects across IL-6 tertiles. LSM differences for sarilumab versus adalimumab in the high and low IL-6 tertiles, respectively, were 5.57, 95% CI [2.85, 8.28], versus 0.87 [− 1.91, 3.66] in SF-36 PCS (Fig. 1a); 16.59 [8.15, 25.03] versus 3.19 [− 4.74, 11.12] in PF domain (data not graphed); and − 19.93 [− 30.30, − 9.56] versus 1.21 [− 8.17, 10.60] for AM-stiffness (Fig. 1b). For SF-36 MCS, interaction p values were ≥ 0.05, suggesting no difference in treatment effect in the high or medium IL-6 tertile compared with the low IL-6 tertile.
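The high-versus-low interaction can be read as a difference of differences. The Python sketch below takes the LSM differences reported above for SF-36 PCS and AM-stiffness and subtracts the low-tertile difference from the high-tertile difference. This matches the model's interaction estimate only on the assumption that the within-tertile differences come from the same fitted model. The names `lsm_diff` and `interaction_contrast` are hypothetical.

```python
# LSM differences (sarilumab minus adalimumab) at W24, as reported in the text.
lsm_diff = {
    "SF-36 PCS": {"high": 5.57, "low": 0.87},
    "AM-stiffness VAS": {"high": -19.93, "low": 1.21},
}


def interaction_contrast(endpoint: str) -> float:
    """High-tertile treatment difference minus low-tertile treatment difference."""
    d = lsm_diff[endpoint]
    return round(d["high"] - d["low"], 2)


for name in lsm_diff:
    print(name, interaction_contrast(name))
```

A positive contrast for PCS and a negative contrast for AM-stiffness both point the same way, because improvement means a higher PCS score but a lower AM-stiffness value.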

Fig. 1

LSM change (95% CI) from baseline to week 24 on HRQoL endpoints by IL-6 tertile and overall population for SF-36 PCS scores (a), AM-stiffness scores (b), and FACIT-fatigue scores (c). Adalimumab: low tertile, n = 45; medium tertile, n = 53; high tertile, n = 54. Sarilumab: low tertile, n = 55; medium tertile, n = 47; high tertile, n = 46. AM-stiffness duration of morning stiffness visual analog scale, CFB change from baseline, CI confidence interval, FACIT-fatigue Functional Assessment of Chronic Illness Therapy-fatigue, HRQoL health-related quality of life, IL-6 interleukin-6, LSM least squares mean, LSM∆ LSM difference between sarilumab and adalimumab, SF-36 Short Form 36, PCS physical component summary, VAS visual analog scale. Low (1.6–7.1 pg/mL), medium (7.2–39.5 pg/mL), high (39.6–692.3 pg/mL). *Nominal interaction p value versus low IL-6 tertile < 0.05

Regarding other SF-36 domains, all nominal interaction p values were ≥ 0.05. However, there were between-group differences (nominal p < 0.05) favoring sarilumab over adalimumab within the high IL-6 tertile in the RP, BP, VT, and SF domains, but not within the low or medium IL-6 tertiles (Fig. 2). Similarly, there was a difference (nominal p < 0.05) with sarilumab versus adalimumab within the high IL-6 tertile in FACIT-fatigue (4.86 [1.06, 8.65]), but not within the low or medium tertiles (Fig. 1c).

Fig. 2

Mean SF-36 domain scores for adalimumab and sarilumab (combined baseline and week 24) by IL-6 tertile§. The nominal p value for the IL-6 tertile-by-treatment interaction using the low tertile as reference was ≥ 0.05 for all SF-36 domains except PF. Baseline combined scores are presented; change from baseline for each group cannot be inferred from the figure alone. Each 10-point interval represents twice the MCID for the SF-36 domain scores. p value of the between-group difference in LSM change from baseline < 0.05 within each IL-6 tertile. §Low (1.6–7.1 pg/mL), medium (7.2–39.5 pg/mL), high (39.6–692.3 pg/mL). BP bodily pain, FACIT Functional Assessment of Chronic Illness Therapy, GH general health, IL-6 interleukin-6, LSM least squares mean, MCID minimal clinically important differences, MH mental health, PF physical functioning, RE role-emotional, RP role-physical, SF social functioning, SF-36 Short Form 36, VAS visual analog scale, VT vitality

An IL-6 tertile at baseline-by-treatment interaction was also observed for patients reporting improvements ≥ MCID in PCS scores (nominal p < 0.01) in the high versus low IL-6 comparison, but not for other HRQoL endpoints (MCS, FACIT-fatigue, or AM-stiffness VAS). The OR [95% CI] in the high tertile was 6.31 [2.37, 16.81] versus 0.97 [0.43, 2.16] in the low tertile (Fig. 3), indicating that in the high tertile the odds of reporting a clinically meaningful improvement in PCS scores were approximately six times higher with sarilumab than with adalimumab, whereas in the low tertile there was no difference in response between treatments.

Fig. 3

Forest plot of odds ratios from patients reporting improvements ≥ MCID by baseline IL-6 tertile for SF-36 PCS score (a), SF-36 MCS score (b), and AM-stiffness (c) for sarilumab 200 mg q2w versus adalimumab 40 mg q2w. *Nominal p < 0.01 for interaction test for patients reporting improvements ≥ MCID (using low IL-6 tertile as the reference group). CI confidence interval, MCID minimal clinically important differences, AM-stiffness duration of morning stiffness, OR odds ratio, PCS physical component summary, SF-36 Short Form 36, VAS visual analog scale

Safety

Descriptive analysis of AE rates indicated a similar safety profile between IL-6 tertiles [36].

Discussion

In these analyses, at baseline, RA patients with high levels of IL-6 reported worse PRO/HRQoL scores than those with medium or low levels. The treatment effect of sarilumab versus adalimumab was larger (nominal p < 0.05) in patients with high versus low IL-6 levels in SF-36 PCS and PF domain scores and in AM-stiffness VAS; the effect of sarilumab was greater in patients with elevated IL-6 values, whereas the effect of adalimumab was stable across all tertiles. Analyses of responses between IL-6 tertiles indicated that patients with high IL-6 levels were more likely to report clinically meaningful improvements in PCS scores with sarilumab versus adalimumab. While our findings suggest that IL-6 levels may be associated with those scores where larger HRQoL improvements were reported, more work is needed to better understand these impacts of disease. For example, it would be pertinent to determine an optimal cut-off for IL-6 concentration, using receiver operating characteristic (ROC) analysis for improvement in patient-level responses, or machine learning methods such as Classification and Regression Trees (CART).
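As an illustration of the kind of cut-off search mentioned above, the Python sketch below scans candidate IL-6 thresholds and keeps the one that maximizes Youden's J (sensitivity + specificity − 1). Youden's J is only one common criterion for choosing a point on a ROC curve and is an assumption of this sketch, not a method used in the present analyses. The function name `youden_cutoff` and the data are hypothetical.

```python
def youden_cutoff(il6_values, responded):
    """Return (cutoff, J) maximizing Youden's J = sensitivity + specificity - 1.

    A patient is predicted to respond when baseline IL-6 >= cutoff.
    """
    positives = sum(1 for r in responded if r)
    negatives = len(responded) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("need both responders and non-responders")

    best_cutoff, best_j = None, -1.0
    for cutoff in sorted(set(il6_values)):
        pairs = list(zip(il6_values, responded))
        true_pos = sum(1 for v, r in pairs if r and v >= cutoff)
        false_pos = sum(1 for v, r in pairs if not r and v >= cutoff)
        sensitivity = true_pos / positives
        specificity = 1 - false_pos / negatives
        j = sensitivity + specificity - 1
        if j > best_j:  # on ties, the lowest cutoff is kept
            best_cutoff, best_j = cutoff, j
    return best_cutoff, best_j


# Hypothetical baseline IL-6 values (pg/mL) and MCID responder flags.
il6 = [2, 5, 8, 12, 40, 60, 90, 150]
resp = [False, False, True, False, True, True, True, True]
print(youden_cutoff(il6, resp))
```

Any cut-off found this way would need validation in an independent cohort before clinical use, since a threshold tuned on one data set tends to look better than it performs elsewhere.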

Although IL-6 testing is currently not standard practice, the utility of high IL-6 levels as predictive biomarkers to tailor individual therapy has been proposed to address a key goal: to determine patient-specific profiles as a means to help predict responsiveness to specific treatments [37]. To date, several biomarkers have already been associated with RA diagnosis and prognosis, such as CRP, ESR, autoantibodies such as anti-citrullinated peptide antibody, and TNF levels [38,39,40], although these markers have not been reliable predictors of clinical responses to bDMARDs [41].

A separate post hoc analysis of clinical endpoints evaluated in the MONARCH RCT [36] demonstrated that patients with high serum IL-6 levels prior to treatment with sarilumab or adalimumab had increased baseline disease activity, joint damage, and pain, and lower patient global assessment and HRQoL scores, and were less likely to benefit from TNFi therapies. Furthermore, patients with high versus normal IL-6 levels reported lower responses to placebo plus methotrexate or adalimumab compared with sarilumab treatment [36].

In addition to evaluating clinical markers, quantifying the burden of RA from the patient perspective is vital to comprehensively understand the disease and its treatment [42, 43]. Findings from this present study support that baseline IL-6 levels may differentially predict treatment improvements in PRO/HRQoL.

Our findings must be examined in light of some limitations. First, the number of patients in each IL-6 tertile was modest; hence, prospective validation in larger cohorts is warranted to confirm the findings. Furthermore, while we have observed that baseline IL-6 levels predict greater improvements in PROs/HRQoL, it will be important to also assess the indirect effects of improvement of disease activity or other clinical endpoints [36] on PROs and to compare them in terms of magnitude and effect size in the different IL-6 tertiles.

Assays that measure known diagnostic biomarkers are commonly used in clinical practice (e.g., 70% of decisions made by physicians are based on results provided by biomarkers [44]). However, implementation of novel biomarkers into clinical practice proves to be a long and challenging process, which includes convincing physicians of their practicality and feasibility of use [5, 45]. Given the complexity and heterogeneous nature of RA, it is unlikely that a single cytokine level will provide sufficient discrimination to predict treatment effect. Many reliable assays are now available, predominantly multiplex formats. At present, the limitation of relying on a biomarker in RA is reflected in the disease-related complexity of immunologic networks and elucidation of the respective role and redundant effects one cytokine may have on another [5].

Conclusion

The beneficial effects of sarilumab versus adalimumab on HRQoL were greater in patients with high IL-6 levels at baseline, indicating that among adult patients with moderate-to-severely active RA who have had an inadequate response or intolerance to one or more DMARDs, high IL-6 levels may predict greater improvements in PROs/HRQoL than low IL-6 levels. These findings support previous analyses, which have shown that across various endpoints, patients with elevated baseline IL-6 levels responded better to sarilumab than to methotrexate or adalimumab, compared with patients without elevated levels [36].