Background

Rheumatoid arthritis (RA) is a chronic, inflammatory, and destructive joint disease that is associated with substantial clinical burden [1]. Pain, fatigue, and morning (AM) stiffness are common symptoms associated with RA [2,3,4,5,6] and have an important negative impact on health-related quality of life (HRQOL) [7,8,9] and ability to work [10,11,12]. A recent survey identified pain, fatigue, and independence as the most important domains of RA disease activity that need to improve in patient-perceived remission [13]. Core patient-reported outcomes (PROs) including global assessment of disease, pain, physical function, fatigue, HRQOL, and work stability provide valuable insights into patients’ perspectives on their health status and impact of disease—improvements in PROs are considered important when evaluating the benefits of treatments [14,15,16,17,18]. Capturing the patient experience with these outcomes provides important information that can be used by clinicians to guide treatment decisions [19].

Janus kinase (JAK) inhibitors are a class of orally administered targeted synthetic disease-modifying antirheumatic drugs (tsDMARDs) that have recently received regulatory approval and are under evaluation in randomised controlled trials (RCTs) for the treatment of RA [20,21,22,23]. Upadacitinib, a selective JAK1 inhibitor, has demonstrated efficacy and a favourable benefit-to-risk profile in active RA among patients with inadequate responses to conventional synthetic disease-modifying antirheumatic drugs (csDMARD-IR) in phase 2 and 3 RCTs (NCT02066389; NCT02675426) [24, 25]. To assess the comprehensive benefits of upadacitinib, it is important to understand its impact on patient-centric outcomes. To this end, we examined the effect of two doses (15 mg or 30 mg daily) of upadacitinib versus placebo on PROs in SELECT-NEXT, an RCT assessing the efficacy and safety of upadacitinib in moderately to severely active csDMARD-IR RA patients.

Methods

Study design and participants

Full details of the study design of SELECT-NEXT (ClinicalTrials.gov, NCT02675426) were reported previously [24]. Patients were randomly assigned (2:2:1:1) to receive either upadacitinib 15 mg or 30 mg or placebo daily for 12 weeks while continuing background csDMARD therapy. After the initial 12-week placebo-controlled period, patients taking placebo received 15 mg or 30 mg of upadacitinib daily, according to the prespecified randomisation assignment. Patients, investigators, and the funder were masked to the treatment allocations. This report is based on post hoc analyses of data collected during the placebo-controlled period of SELECT-NEXT. Study participants were ≥ 18 years of age, had active RA for ≥ 3 months, had received csDMARDs for ≥ 3 months with stable doses for ≥ 4 weeks before study entry, and had inadequate responses to ≥ 1 of the following csDMARDs: methotrexate (MTX), sulfasalazine, or leflunomide. The protocol allowed for enrolment of ≤ 20% of patients with intolerance to at most one biologic DMARD (bDMARD); bDMARD-IR patients were excluded. The protocol was approved by the independent ethics committees or institutional review boards at all study sites. All participants provided written informed consent before enrolment. The RCT was conducted in accordance with the ethical principles that have their origin in the current Declaration of Helsinki and consistent with International Conference on Harmonisation Good Clinical Practice and Good Epidemiology Practices, along with all applicable local regulatory requirements. All patient data were de-identified and complied with patient confidentiality requirements.

Patient-reported outcomes

PROs were secondary outcome measures in the SELECT-NEXT trial. Patient Global Assessment of Disease Activity (PtGA) and the patient’s assessment of pain were measured using visual analogue scales (VAS) of 0 to 100 mm, with higher scores indicating greater disease activity and worse pain. A reduction of ≥ 10 mm is the minimum clinically important difference (MCID) for both PtGA and pain scores. Physical function was assessed by the Health Assessment Questionnaire-Disability Index (HAQ-DI) [26, 27], with higher scores indicating worse physical function and greater disability; a reduction of ≥ 0.22 units is the MCID [28, 29], and a score ≤ 0.25 is the normative value [30]. Fatigue was assessed by the Functional Assessment of Chronic Illness Therapy-Fatigue (FACIT-F) scale; scores range from 0 to 52, with higher scores indicating less fatigue [31]; an increase of ≥ 4.0 points is defined as the MCID [28], and a score of 43.6 is normative [32]. HRQOL was evaluated using the Medical Outcomes Short Form 36 Health Survey (SF-36), which assesses eight domains (Physical Functioning [PF], Role-Physical [RP], Bodily Pain [BP], General Health [GH], Vitality [VT], Social Functioning [SF], Role-Emotional [RE], and Mental Health [MH]), scored from 0 to 100 and aggregated into the physical component summary (PCS) and mental component summary (MCS) measures [33, 34], with normative values of 50 and standard deviations of 10. The SF-36 domain scores were compared with age- and gender-matched norms. Higher SF-36 scores indicate better health; the MCID is an increase of ≥ 2.5 points for SF-36 PCS and MCS and ≥ 5.0 points for individual SF-36 domains [28, 29]. The EuroQol 5-Dimension 5-Level Questionnaire (EQ-5D-5L) was also used to assess HRQOL. EQ-5D-5L has two components: a 0 to 100-mm VAS, where 0 represents the worst imaginable health state and 100 represents the best imaginable health state, and an index score, which has a maximum score of 1 representing the best health state [35, 36].
AM stiffness severity was reported on a numeric scale of 0 to 10, with higher scores indicating greater severity. Duration of AM joint stiffness was reported by the patient as the length of time, in minutes, that AM joint stiffness lasted on the day before each study visit. Because no values for MCID are reported in the literature, the proposed MCID for AM stiffness severity was defined as a reduction of ≥ 1 point, and the minimal important difference (MID) for AM stiffness duration was proxied at half the standard deviation of the mean baseline values. The Work Instability Scale for RA (RA-WIS) identifies patients at risk for disability-associated work instability, defined as a mismatch between an individual’s functional capabilities and job demands because of RA [37]. RA-WIS scores range from 0 to 23, with higher scores indicating a greater risk of work disability; scores < 10 are considered low risk, and MCID is a reduction of ≥ 5 points [38].
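The thresholds above lend themselves to a small lookup. The following is a minimal sketch, not the study's analysis code: the function names, the dictionary layout, and the use of the sample standard deviation for the half-SD proxy are assumptions made for illustration.

```python
from statistics import stdev

# MCID thresholds as stated in the text. "decrease" means that improvement is a
# reduction in score; "increase" means that improvement is a rise in score.
MCID = {
    "PtGA": ("decrease", 10.0),
    "pain": ("decrease", 10.0),
    "HAQ-DI": ("decrease", 0.22),
    "FACIT-F": ("increase", 4.0),
    "SF-36 PCS": ("increase", 2.5),
    "SF-36 MCS": ("increase", 2.5),
    "AM stiffness severity": ("decrease", 1.0),
    "RA-WIS": ("decrease", 5.0),
}


def meets_mcid(measure, baseline, follow_up):
    """Return True if the change from baseline is an improvement >= MCID."""
    direction, threshold = MCID[measure]
    change = follow_up - baseline
    improvement = -change if direction == "decrease" else change
    return improvement >= threshold


def half_sd_mid(baseline_values):
    """Proxy MID as half the SD of baseline values (sample SD is an assumption)."""
    return 0.5 * stdev(baseline_values)
```

For example, a pain score falling from 62 to 50 mm counts as a response, whereas a FACIT-F score rising from 28 to 31 does not.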

Statistical analyses

Changes from baseline at weeks 4 and 12, 95% confidence intervals, and nominal P values were analysed using a mixed-effect repeated measures model with unstructured variance-covariance matrix including treatment, visit, treatment-by-visit interaction, and prior bDMARD use as fixed factors and baseline value as a covariate. The assumptions of linear regression were checked and met for all outcomes included in the study except for AM stiffness duration and EQ-5D-5L. Linear regression models were implemented for the analysis of AM stiffness duration and EQ-5D-5L outcomes for consistency; given the large sample size, estimates are unlikely to be biased. The results were expressed as least squares mean (LSM) changes. The baseline values and LSM changes for SF-36 domains were transformed based on the mean and standard deviation of the 1998 general US population. Analyses were performed in the full analysis set of all randomly assigned patients who received at least one dose of study drug.
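The exact form of the SF-36 transformation is not given here. A common approach, and the one assumed in this sketch, is norm-based scoring: a 0 to 100 domain score x is standardised against the general-population mean and standard deviation and re-expressed on a scale with mean 50 and SD 10. The symbols below are illustrative and are not taken from the study report.

```latex
T = 50 + 10\,\frac{x - \mu_{\text{norm}}}{\sigma_{\text{norm}}},
\qquad
\Delta T = 10\,\frac{\Delta x}{\sigma_{\text{norm}}}
```

Under this assumption, a change score is rescaled by the factor 10/σ only, because the population mean cancels when two scores are subtracted.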

The percentages of patients reporting improvements in PRO scores from baseline to week 12 ≥ MCID or scores ≥ normative values (age- and gender-matched for SF-36 only) at week 12 were compared between active treatment groups and placebo. Non-responder imputation was used when PRO data were missing. Comparisons between active treatment groups and placebo were made using chi-square tests. For each PRO, the incremental numbers needed to treat (NNTs) to achieve clinically meaningful improvements from baseline (≥ MCID or MID) were calculated as the reciprocal of the response rate differences between the active treatment groups and placebo. Times to response from baseline to week 12 were assessed for pain, HAQ-DI, and AM stiffness using Kaplan-Meier analysis. Median times to response were calculated for each dose group; comparisons between the groups used log-rank tests. P < 0.05 was considered significant.
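The responder calculations described above can be sketched as follows. This is an illustrative outline under stated assumptions, not the study's code: the function names are hypothetical, the chi-square test is taken to be the Pearson test on a 2 × 2 table without continuity correction, and the counts in the usage note are invented.

```python
import math


def responder_rate(responses):
    """Response rate with non-responder imputation.

    `responses` holds True, False, or None (None = missing PRO data);
    a missing value is counted as a non-response.
    """
    responders = sum(1 for r in responses if r is True)
    return responders / len(responses)


def incremental_nnt(rate_active, rate_placebo):
    """NNT as the reciprocal of the response rate difference versus placebo."""
    diff = rate_active - rate_placebo
    if diff <= 0:
        return math.inf  # no benefit over placebo, so no finite NNT
    return 1.0 / diff


def chi_square_2x2(resp_a, n_a, resp_b, n_b):
    """Pearson chi-square statistic and P value (1 df, no continuity correction)."""
    observed = [[resp_a, n_a - resp_a], [resp_b, n_b - resp_b]]
    row_totals = [n_a, n_b]
    col_totals = [resp_a + resp_b, (n_a - resp_a) + (n_b - resp_b)]
    total = n_a + n_b
    stat = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed[i][j] - expected) ** 2 / expected
    # For 1 df, the upper tail probability is erfc(sqrt(stat / 2)).
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value
```

With invented counts of 60 of 100 responders on active treatment and 35 of 100 on placebo, the rate difference is 0.25 and the incremental NNT is 4. NNTs are conventionally reported as whole patients, rounded up.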

Results

Study population

A total of 661 patients with active RA were randomised and treated (221 received upadacitinib 15 mg; 219 received upadacitinib 30 mg; 221 received placebo); of these, 618 (93%) completed the placebo-controlled 12-week period (14 patients in the placebo group, 11 patients in the upadacitinib 15-mg group, and 18 patients in the upadacitinib 30-mg group discontinued). Baseline characteristics were balanced across the 3 groups (Table 1). At baseline, 61% of patients had received MTX only, 21% a combination of MTX and another csDMARD, and 19% only a csDMARD other than MTX. Thirteen percent of patients had prior bDMARD exposure; these patients were either intolerant or had < 3 months of exposure to bDMARDs. Patients with an inadequate response to bDMARDs were excluded from entry. Across the groups, Disease Activity Score 28 using C-reactive protein (DAS28[CRP]) ranged from 5.6 to 5.7 and Clinical Disease Activity Index (CDAI) ranged from 37.8 to 38.6, indicating high baseline disease activity in this population.

Table 1 Patient demographics and baseline characteristics

Baseline mean PtGA scores ranged from 60.3 to 63.1, mean pain scores from 61.5 to 64.1, mean HAQ-DI scores from 1.4 to 1.5, and FACIT-F from 27.5 to 28.3 across the treatment groups (Table 2). Baseline HRQOL scores (as measured by SF-36 and EQ-5D-5L) were low. SF-36 PCS was approximately 2.0 standard deviations (SD) below the normative value of 50, indicating substantial impairment at baseline (Fig. 1). SF-36 MCS was approximately 0.5 SD below the normative value. SF-36 domain scores were low, so that baseline SF-6D utility scores, based on mean scores across all 8 domains [39, 40], were 0.57 in all 3 groups compared with 0.763 in the age/gender-matched normative population. The largest decrements from age and gender norms in both upadacitinib and placebo populations were in physical function (PF, − 33.3 to − 34.7), role physical (RP, − 32.7 to − 34.8), and bodily pain (BP, − 30.9 to − 32.4) domains. Baseline AM stiffness duration ranged from 129 to 152 min and severity from 6.1 to 6.2 (Table 2).

Table 2 Baseline PRO scores
Fig. 1

Baseline and post-treatment scores at week 12 across all Short Form 36 Health Survey domains. Baseline (BL) and SF-36 domain scores are relative to age- and gender-adjusted norms (A/G norms) for the general US population. a PBO. b UPA 15 mg. c UPA 30 mg. d Combined. In the combined spydergrams, most of the UPA 30-mg results are covered up by the UPA 15-mg results. BL values and SF-36 domain scores were re-scored from 0 to 100. No further transformations were applied for this analysis. BP, Bodily Pain; GH, General Health; MH, Mental Health; PBO, placebo; PF, Physical Functioning; RE, Role-Emotional; RP, Role-Physical; SF, Social Functioning; UPA, upadacitinib; VT, Vitality; Wk, week

Change from baseline

Statistically significant (P < 0.001) LSM changes from baseline to week 12 were reported with both upadacitinib 15 mg and 30 mg compared with placebo for all PROs except SF-36 MH with 15 mg, and SF-36 MCS, RE, and MH with 30 mg (Table 3). Duration of AM stiffness was reduced by 63% to 67% from baseline after initiating upadacitinib.

Table 3 LSM change (95% CI) from baseline to weeks 4 and 12

Statistically significant LSM changes (95% CI) from baseline were reported as early as week 1 for PtGA (upadacitinib 15 mg, − 10.92 [− 13.87, − 7.97]; 30 mg, − 13.74 [− 16.67, − 10.80] versus placebo, − 3.17 [− 6.07, − 0.27], both P < 0.001), pain (upadacitinib 15 mg, − 11.38 [− 14.22, − 8.54]; 30 mg, − 13.80 [− 16.63, − 10.98] versus placebo, − 4.62 [− 7.41, − 1.82], both P < 0.001), HAQ-DI (upadacitinib 15 mg, − 0.25 [− 0.31, − 0.19]; 30 mg, − 0.24 [− 0.30, − 0.18] versus placebo, − 0.14 [− 0.19, − 0.08], both P < 0.001), AM stiffness severity (upadacitinib 15 mg, − 1.15 [− 1.42, − 0.88]; 30 mg, − 1.25 [− 1.52, − 0.98] versus placebo, − 0.40 [− 0.67, − 0.13], both P < 0.001), and AM stiffness duration (upadacitinib 30 mg, − 32.21 [− 46.49, − 17.94] versus placebo, − 9.75 [− 23.96, 4.47], P = 0.013).

Responder analysis

At week 12, significantly (P < 0.05) more upadacitinib-treated (15 mg and 30 mg) patients reported improvements ≥ MCID in PtGA, pain, HAQ-DI, FACIT-F, duration and severity of AM stiffness, RA-WIS, SF-36 PCS, and SF-36 MCS (15 mg only) and seven of eight SF-36 domains with 15 mg and four of eight SF-36 domains with 30 mg (Fig. 2a, b). Across most PROs, NNTs ranged from four to eight patients; NNTs ≤ 10 are considered clinically meaningful [41].

Fig. 2

Percentage of patients reporting improvements ≥ MCID at week 12. a Patient’s Global Assessment of Disease Activity (PtGA), pain, Health Assessment Questionnaire-Disability Index (HAQ-DI), Functional Assessment of Chronic Illness Therapy-Fatigue (FACIT-F), morning (AM) joint stiffness duration, AM stiffness severity, Work Instability Scale for RA (RA-WIS), and Short Form 36 Health Survey (SF-36). b SF-36 individual domains. Baseline values and SF-36 domains were re-scored from 0 to 100. ***P < 0.001, **P < 0.01, *P < 0.05 for upadacitinib versus placebo. BP, Bodily Pain; GH, General Health; MCID, minimum clinically important difference; MCS, mental component summary; MH, Mental Health; NNT, number needed to treat; PBO, placebo; PCS, physical component summary; PF, Physical Functioning; RE, Role-Emotional; RP, Role-Physical; SF, Social Functioning; UPA, upadacitinib; VAS, visual analogue scale; VT, Vitality

Patients treated with either dose of upadacitinib had a median time to response of 1 week for pain compared with 4 weeks for placebo; time to response for HAQ-DI and AM stiffness severity was also shorter for upadacitinib-treated patients (1 week for both upadacitinib doses versus 2 weeks for placebo).
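Median times to response of this kind come from a Kaplan-Meier estimate. The sketch below is a simplified illustration with invented data, not the study's analysis: it treats the first visit at which a patient meets the response criterion as the event time, treats patients who never respond as censored at their last visit, and takes the median as the earliest time at which the estimated proportion still without a response falls to 0.5 or below. The function name and data layout are hypothetical.

```python
def km_median_time(times, events):
    """Kaplan-Meier median time to response.

    times:  week of first response, or of last follow-up if censored
    events: True if a response was observed, False if censored
    Returns the earliest time at which the estimated proportion of patients
    still without a response is <= 0.5, or None if that is never reached.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    no_response = 1.0  # KM estimate of the proportion not yet responding
    i = 0
    while i < len(data):
        t = data[i][0]
        n_events = 0
        n_at_time = 0
        # Group all patients sharing the same time point.
        while i < len(data) and data[i][0] == t:
            n_at_time += 1
            if data[i][1]:
                n_events += 1
            i += 1
        if n_events:
            no_response *= 1.0 - n_events / at_risk
            if no_response <= 0.5:
                return t
        at_risk -= n_at_time
    return None
```

For an invented group in which 5 of 8 patients respond at week 1, the function returns 1. If no patient responds by the end of follow-up, it returns None, meaning the median was not reached.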

Few patients in any group reported PRO scores ≥ normative values at baseline. The number of patients ranged from 2 (1%) for SF-36 PCS in the upadacitinib 30-mg group to 89 (41%) for SF-36 MCS in the upadacitinib 30-mg group (Fig. 3a). At week 12, the percentage of patients reporting scores ≥ normative values ranged from 18% (SF-36 PCS) to 57% (RA-WIS) with upadacitinib 15 mg and 15% (SF-36 PCS) to 50% (SF-36 MCS) with upadacitinib 30 mg, compared with 8% (SF-36 PCS) to 46% (SF-36 MCS) with placebo (Fig. 3a). Differences between the active and placebo treatment groups were statistically significant (P < 0.05) in HAQ-DI, FACIT-F, and SF-36 PCS with both upadacitinib doses. Across SF-36 domains, the percentages of patients reporting scores ≥ normative values at 12 weeks ranged from 18% (RP) to 40% (RE) with upadacitinib 15 mg and 20% (RP) to 40% (VT, RE, and MH) with upadacitinib 30 mg, compared with 8% (RP) to 34% (RE and MH) with placebo; differences versus placebo were statistically significant (P < 0.05) in PF, RP, BP, and VT for both upadacitinib doses and in GH with 30 mg (Fig. 3b).

Fig. 3

Patients reporting scores ≥ normative values at baseline and week 12. a Health Assessment Questionnaire-Disability Index (HAQ-DI), Functional Assessment of Chronic Illness Therapy-Fatigue (FACIT-F), Work Instability Scale for RA (RA-WIS), and Short Form 36 Health Survey (SF-36) PCS and MCS. b SF-36 individual domains. Baseline values and SF-36 domains were re-scored from 0 to 100. ***P < 0.001, **P < 0.01, *P < 0.05 for upadacitinib versus placebo. BL, baseline; BP, Bodily Pain; GH, General Health; MCS, mental component summary; MH, Mental Health; PBO, placebo; PCS, physical component summary; PF, Physical Functioning; RE, Role-Emotional; RP, Role-Physical; SF, Social Functioning; UPA, upadacitinib; VT, Vitality; Wk, week

Discussion

Upadacitinib treatment resulted in significant and clinically meaningful improvements in patient-reported disease activity, pain, physical function, fatigue, HRQOL, AM stiffness, and work instability in csDMARD-IR patients with RA. Improvements in PtGA, pain, HAQ-DI, and AM stiffness were reported as early as week 1. Patients not only reported improved PtGA, pain, HAQ-DI, AM stiffness, and FACIT-F scores, but also reported improvements in the SF-36 domain scores that support these outcomes (PF, RP, BP, GH, and VT). There appears to be little difference in the treatment responses between the upadacitinib 15-mg and 30-mg doses, consistent with the reported primary efficacy results [24]. Most PROs assessed resulted in NNTs ≤ 10, which are generally considered favourable [41] and demonstrate the value of upadacitinib treatment for csDMARD-IR patients with RA.

Assessing the effect of upadacitinib on pain, physical function, fatigue, and AM stiffness is important because these outcomes directly impact HRQOL by reducing patients’ ability to perform daily activities and providing barriers to maintaining employment [42,43,44]. The Work Productivity and Activity Impairment Questionnaire (WPAI) is often used to assess work productivity [45]; however, this measure mainly assesses the time missed from work and impairment while working. Assessing work instability may be a more meaningful measure as it provides a means of screening for possible work disability and an opportunity for individuals to engage in early job retention interventions. In our study, we examined the effect of upadacitinib on work instability using RA-WIS, which identifies patients at risk of work absence or job transitions because of RA [37]. Job transitions are ways in which patients adapt to remain employed and include reducing work hours, taking a short leave of absence, or changing jobs or occupations [10]. Upadacitinib treatment markedly reduced the proportion of patients at risk of work instability. Fatigue is difficult to treat [46, 47], and there is strong evidence for an association between fatigue and other outcomes important from the patient perspective, such as pain, physical function, and depression [6, 43, 48]. This post hoc analysis demonstrated clinically meaningful improvements in fatigue as well as pain and physical function with upadacitinib treatment in csDMARD-IR patients with RA.

This RCT has several strengths of note. Several validated PROs reflecting different aspects of the patient experience were assessed in this study. The analyses were comprehensive: they examined not only the changes from baseline but also the proportions of patients reporting improvements ≥ MCID/MID criteria and scores ≥ population norms, as well as the time to response for important patient-centric outcomes, such as pain and physical function. The use of MCID or MID criteria to measure response provides a context of how clinically meaningful these improvements are from a patient’s perspective. In addition, assessing the proportion of patients with improvements that reach normative values is a more stringent assessment criterion than MCID, and our results show that, for several PROs, significantly more csDMARD-IR patients with RA reported this level of improvement with upadacitinib treatment than with placebo.

This RCT also has limitations to be considered when interpreting the results. PROs were collected at fixed visits, so responses were unavailable at other time points; however, differences in the outcomes occurred early and were maintained at week 12. The generalisability of these results to patients with milder disease may be limited because patients had moderately to severely active disease at enrolment. The method used to impute missing data (non-responder imputation) assumes that missing PRO scores are non-responses, which is a stringent assumption and may underestimate the true rate of response.

Conclusions

Upadacitinib 15 mg and 30 mg daily resulted in rapid and clinically meaningful improvements in the outcomes important to patients including disease activity (per PtGA), pain, physical function, fatigue, HRQOL, AM stiffness, and work instability among csDMARD-IR patients with RA.