Introduction

Patients with rheumatoid arthritis (RA) frequently experience pain, fatigue, and impaired physical functioning that may impact their health-related quality of life (HRQOL) and ability to work and participate in daily activities [1,2,3,4]. Relief of pain is an important treatment outcome for patients and a primary reason for seeking medical care [2]. The restrictions on patients’ daily work and social activities due to symptom burden have a significant impact on their financial and social well-being [2, 4, 5]. Given this significant burden on HRQOL, it is recommended that treatment decisions be shared between the patient and physician [6]. Guidelines recommend conventional synthetic disease-modifying antirheumatic drugs (csDMARDs), such as methotrexate (MTX), as a first-line treatment strategy and biologic DMARDs (bDMARDs), such as abatacept (ABA) or tumor necrosis factor (TNF) inhibitors, or Janus kinase (JAK) inhibitors as second-line therapy options [6]. Unfortunately, up to 43% of patients do not respond to first-line csDMARD therapy and as many as two-thirds who receive bDMARDs have an inadequate response (bDMARD-IR) after 1 year of therapy [7, 8]. Thus, this population is difficult to treat adequately and, as such, exhibits marked increases in healthcare resource utilization; bDMARD-IR patients experienced up to 7-fold increases in hospital length of stay, admissions, and emergency department visits as compared with patients who responded to bDMARD therapy [7].

Discordance between healthcare provider and patient perceptions of disease exists [9, 10], especially in patients continuing to experience pain despite inflammation being controlled [11] or in those continuing to experience fatigue despite achieving remission [12]. To fully understand the disease burden and benefits of treatment from the perspective of patients with RA, it is important to include patient-reported outcomes (PROs) as part of clinical trials and evaluation of treatment efficacy. This is especially true for patients with inadequate response to csDMARDs (csDMARD-IR) and bDMARD-IR. Treatment with upadacitinib (UPA), an oral JAK inhibitor, has resulted in clinically meaningful improvements in PROs, including in key components of pain, fatigue, and physical functioning [13,14,15]. Improvements in PROs have been observed with UPA as monotherapy [16] and in combination with MTX [13, 15]. Improvements were equivalent to or greater than those with the TNF inhibitor adalimumab [14]. In patients with inadequate response or intolerance to MTX, UPA treatment significantly improved patient-reported pain and physical functioning [13, 14]. ABA is commonly prescribed as a second-line bDMARD for the treatment of RA, and studies have shown that ABA treatment improves PROs, yet comparisons of ABA to JAK inhibitors are limited [17,18,19,20,21,22]. Further research is needed to guide treatment decisions, particularly from the patients’ perspective. This post hoc analysis evaluated the impact and benefits of treatment with oral UPA versus intravenous (IV) ABA on PROs at weeks 12 and 24 in a head-to-head comparison in patients with active RA and bDMARD-IR in SELECT-CHOICE [23].

Materials and methods

Study design and participants

Full details of the study design of SELECT-CHOICE (NCT03086343) were previously reported [23]. This study was a phase 3, double-blind, randomized clinical trial in patients with active bDMARD-IR RA currently receiving background csDMARD therapy. Patients ≥18 years of age with moderately to severely active RA for ≥3 months on a stable background of csDMARD therapy (≥3 months prior to study entry) were randomized to receive either oral UPA (15 mg once daily) with IV placebo or IV ABA with oral placebo. This study was not placebo-controlled: although each patient received an oral or IV placebo, patients knew that they were receiving an active drug but were unaware of which drug they were receiving. Both IV placebo and ABA treatments were administered on day 1 and at weeks 2, 4, 8, 12, 16, and 20 at doses of 500, 750, or 1000 mg, depending on patient weight (<60 kg, 60–100 kg, or >100 kg, respectively). Concomitant use of ≤2 of the following csDMARDs was permitted: MTX, sulfasalazine, hydroxychloroquine, chloroquine, or leflunomide; combination of MTX and leflunomide was not permitted. Eligible patients had no previous exposure to ABA. Data on the primary and ranked secondary outcomes of this study have been published previously [23]. The protocol was approved by an independent ethics committee or institutional review board at all study sites. All participants provided written, informed consent prior to enrollment. The registered clinical trial was conducted in accordance with the ethical principles that have their origin in the current Declaration of Helsinki and was consistent with the International Conference on Harmonization Good Clinical Practice and Good Epidemiology Practices, and all applicable local regulatory requirements. All patient data were de-identified and complied with patient confidentiality requirements.
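The weight-tiered IV dosing rule described above can be written as a small lookup. The following is only an illustrative sketch in Python: the function name `abatacept_iv_dose_mg` and the constant `INFUSION_WEEKS` are hypothetical, and the handling of the tier boundaries (60 kg and 100 kg both falling in the 750 mg tier) follows the ranges as stated in the text. It is not the trial's dosing software.

```python
def abatacept_iv_dose_mg(weight_kg: float) -> int:
    """Return the weight-tiered IV dose in mg, using the tiers stated in the text.

    Tiers: <60 kg -> 500 mg; 60-100 kg -> 750 mg; >100 kg -> 1000 mg.
    """
    if weight_kg <= 0:
        raise ValueError("weight_kg must be positive")
    if weight_kg < 60:
        return 500
    if weight_kg <= 100:
        return 750
    return 1000


# Infusion visits from the text: day 1, then weeks 2, 4, 8, 12, 16, and 20.
# Representing day 1 as week 0 is an assumption made for this sketch.
INFUSION_WEEKS = [0, 2, 4, 8, 12, 16, 20]

if __name__ == "__main__":
    for weight in (55.0, 60.0, 100.0, 101.5):
        print(weight, "kg ->", abatacept_iv_dose_mg(weight), "mg")
```

Encoding the boundaries explicitly makes it easy to check that a patient weighing exactly 60 kg or 100 kg is assigned to the middle tier.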

Patient-reported outcomes

Several PROs were collected to assess the impact of UPA on the patients’ symptoms and HRQOL. The Patient Global Assessment of Disease Activity (PtGA) by visual analog scale (VAS) assessed overall disease severity (range 0–100 mm), with higher scores indicating greater disease activity [24,25,26]. Pain was measured with the Patient’s Assessment of Pain by VAS (range 0–100 mm), wherein higher scores denoted greater pain [24, 25]. The Health Assessment Questionnaire Disability Index (HAQ-DI) assessed physical functioning, and higher scores (range 0–3) indicated greater physical impairment [24, 25, 27]. The Functional Assessment of Chronic Illness Therapy-Fatigue (FACIT-F) evaluated fatigue on a scale of 0–52, with higher scores indicating less fatigue [25, 28]. The EQ-5D 5-level (EQ-5D-5L) assessed perceptions of overall health, and higher index scores indicated better health [29]. Morning stiffness was reported as duration in minutes, and stiffness severity on a scale of 0–10, with higher values indicating longer lasting or worse morning stiffness [24, 30]. The Work Productivity and Activity Impairment (WPAI) assessment was also used and consists of 4 domains: absenteeism, presenteeism, overall work impairment, and activity impairment. WPAI domain scores were expressed as impairment percentages (scale 0–100%), with higher values indicating greater impairment [31]. The 36-Item Short Form Health Survey (SF-36), which consists of 8 domains (physical functioning, role physical, bodily pain, general health, vitality, social functioning, role emotional, and mental health), and the composite Physical (PCS) and Mental (MCS) Component Summary scores were also assessed; higher scores on SF-36 (range 0–100) indicated better health and functioning [24, 25, 32, 33].

Table 1 shows the scoring ranges, minimal clinically important difference (MCID) values, and normative values, where available, for each PRO. Pain, PtGA, HAQ-DI, and morning stiffness were assessed at all time points (day 1 and weeks 2, 4, 8, 12, 16, 20, and 24). WPAI was assessed at day 1 and weeks 4, 8, 12, and 24; FACIT-F was assessed at day 1 and weeks 4, 8, 12, 16, and 24; and the SF-36 and EQ-5D-5L were assessed at day 1 and weeks 4, 12, and 24.

Table 1 Patient-reported outcomes measurements and meaningful values

Statistical analysis of data

Least squares mean (LSM) changes from baseline to weeks 12 and 24 were assessed based on an analysis of covariance model and comparisons between treatment arms used chi-square tests with significance at the 5% level. The proportion of patients reporting improvements ≥ MCID from baseline through weeks 12 and 24, and those achieving normative values were calculated for UPA and ABA treatment. For each PRO, response rates were only calculated for patients who had non-missing baseline PRO scores and missing values were imputed as non-responses. The incremental numbers needed to treat (NNTs) to demonstrate MCID were calculated as the reciprocal of the response rate differences between UPA and ABA for each PRO at weeks 12 and 24. WPAI work impairment was only calculated for patients who were employed. Time to response, defined as improvement ≥ MCID, was assessed for pain (VAS), HAQ-DI, and duration and severity of morning stiffness and was assessed by Kaplan-Meier analysis and compared using the log-rank test. The PRO endpoints presented in this manuscript were not ranked secondary endpoints and, thus, were not controlled for multiplicity. As such, nominal P values are provided throughout.
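The responder analysis described above (non-responder imputation followed by an incremental NNT) can be sketched as follows. This is a minimal illustration under stated assumptions, not the study's analysis code: the function names and the patient-level data are hypothetical, and the HAQ-DI MCID of 0.22 is taken from the Fig. 1 legend. Patients without a baseline score are excluded, and a missing follow-up score counts as non-response.

```python
import math
from typing import List, Optional, Tuple

Score = Optional[float]  # None marks a missing assessment


def is_responder(baseline: float, followup: Score, mcid: float,
                 lower_is_better: bool = True) -> bool:
    """True if the improvement from baseline is at least the MCID."""
    if followup is None:
        return False  # non-responder imputation for missing follow-up
    change = followup - baseline
    improvement = -change if lower_is_better else change
    return improvement >= mcid


def response_rate(pairs: List[Tuple[Score, Score]], mcid: float,
                  lower_is_better: bool = True) -> float:
    """Responder proportion among patients with a non-missing baseline."""
    evaluable = [(b, f) for b, f in pairs if b is not None]
    if not evaluable:
        raise ValueError("no patients with a non-missing baseline score")
    n_resp = sum(is_responder(b, f, mcid, lower_is_better) for b, f in evaluable)
    return n_resp / len(evaluable)


def incremental_nnt(rate_treatment: float, rate_comparator: float) -> float:
    """Reciprocal of the response-rate difference (infinite if no difference)."""
    diff = rate_treatment - rate_comparator
    return math.inf if diff == 0 else 1.0 / diff


# Hypothetical (baseline, week-12) HAQ-DI scores; lower scores are better.
arm_a = [(1.5, 1.0), (2.0, 1.9), (1.25, None), (None, 0.5), (1.75, 1.25)]
arm_b = [(1.5, 1.4), (2.0, 1.5), (1.0, 1.0), (1.25, None)]

rate_a = response_rate(arm_a, mcid=0.22)
rate_b = response_rate(arm_b, mcid=0.22)
nnt = incremental_nnt(rate_a, rate_b)

if __name__ == "__main__":
    print(rate_a, rate_b, nnt)
```

A negative value from `incremental_nnt` would indicate that the comparator arm had the higher response rate. In practice NNTs are usually rounded up to the next whole patient; whether that rounding was applied in this study is not stated here.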

Results

Patient demographics

The study enrolled and randomized 612 patients (UPA, n=303; ABA, n=309) with a mean age of 56 years and an average RA duration of 12 years (Table 2). Baseline characteristics among the treatment arms were comparable, with nearly one-third of patients enrolled having received 2 or more prior bDMARDs (32%), over 60% having received at least one TNF inhibitor, and over 80% of patients receiving MTX, with or without another csDMARD, at baseline. Over half of the patients enrolled were on an oral steroid at baseline. Baseline PROs were similar between the two treatment groups (Table 3) and reflect the impact of RA on the HRQOL in patients with a long disease duration.

Table 2 Patient demographics and RA treatment history
Table 3 LSM at baseline and change at weeks 12 and 24

LSM changes from baseline

At week 12, UPA treatment resulted in statistically significant improvements in PtGA, pain, HAQ-DI, severity of morning stiffness, EQ-5D-5L, WPAI activity impairment and presenteeism domains, three SF-36 domains (physical functioning, bodily pain, and general health), and the SF-36 PCS score (P<0.05, Table 3) as compared to improvements with ABA. At week 24, changes from baseline were maintained in UPA-treated patients and a significant difference persisted between UPA- and ABA-treated patients in HAQ-DI, severity of morning stiffness, WPAI activity impairment domain, and the SF-36 PCS and bodily pain domain scores; changes from baseline were similar between groups for the remaining PROs.

Proportion of patients reporting improvements ≥ MCID in PROs at weeks 12 and 24

Compared with ABA at week 12, significantly more UPA-treated patients reported improvements ≥ MCID in HAQ-DI. Similar proportions of patients reported clinically meaningful improvements in the ability to perform work and daily activities as demonstrated by WPAI scores. Likewise, similar proportions of patients reported improvements ≥ MCID across other PROs (Fig. 1A). Among the SF-36 domain scores, a significantly greater proportion of patients receiving UPA, as compared with ABA, reported improvements ≥ MCID on physical functioning, role physical, bodily pain, and general health (Fig. 2A). Likewise, clinically meaningful improvements in SF-36 PCS scores were reported in significantly more UPA-treated patients (Fig. 1A). Improvements in the other 4 domains were similar between groups. At week 24, more UPA-treated patients reported improvements ≥ MCID in most PROs compared with ABA-treated patients; however, these differences were not statistically significant (Figs. 1B and 2B).

Fig. 1

Proportion of patients reporting improvements ≥ MCIDa in PROs at weeks 12 (A) and 24 (B). aMCID was defined as reduction of ≥10 mm for PtGA and pain, ≥1 for severity of AM stiffness, reduction of ≥0.22 units for HAQ-DI, increase of ≥4 points for FACIT-F, proxied at one-half standard deviation for duration of AM stiffness, increase of ≥0.05 points for EQ-5D-5L, reduction of 7% in score for WPAI, and increase of ≥2.5 points for SF-36 PCS and MCS. bABA IV at day 1 and weeks 2, 4, 8, 12, 16, and 20 (<60 kg: 500 mg; 60–100 kg: 750 mg; >100 kg: 1000 mg). cNNTs are for UPA vs ABA. ABA abatacept, AM morning, EQ-5D-5L (index score) EQ-5D 5-Level, FACIT-F Functional Assessment of Chronic Illness Therapy-Fatigue, HAQ-DI Health Assessment Questionnaire Disability Index, IV intravenous, MCID minimal clinically important difference, MCS Mental Component Summary, NNT number needed to treat, PCS Physical Component Summary, PRO patient-reported outcome, PtGA Patient Global Assessment of Disease Activity, SF-36 36-Item Short Form Health Survey, UPA upadacitinib, VAS visual analog scale, WPAI Work Productivity and Activity Impairment. *P<0.05 for UPA vs ABA. P values represent statistical significance between treatment groups

Fig. 2

Proportion of patients reporting improvements ≥ MCIDa in SF-36 at weeks 12 (A) and 24 (B). aMCID was defined as increase ≥5.0 for all SF-36 domains. bABA IV at day 1 and weeks 2, 4, 8, 12, 16, and 20 (<60 kg: 500 mg; 60–100 kg: 750 mg; >100 kg: 1000 mg). cNNTs are for UPA vs ABA. ABA abatacept, BP bodily pain, GH general health, IV intravenous, MCID minimal clinically important difference, MH mental health, PF physical functioning, RE role emotional, RP role physical, SF social functioning, SF-36 36-Item Short Form Health Survey, UPA upadacitinib, VT vitality. *P<0.05 for UPA vs ABA. P values represent statistical significance between treatment groups

Proportion of patients achieving normative values in PROs at baseline and weeks 12 and 24

At baseline, few patients reported having normative PRO scores (i.e., values consistent with those reported by patients without disease; Figs. S1 and S2). Not all PROs assessed in this study have known normative values; thus, achievement of normative values is only reported for a subset of PROs. The percentage of patients reporting normative values at baseline, for both UPA and ABA groups, ranged from 1% (SF-36 PCS) to 29% (SF-36 MCS). By week 12, the percentages of UPA- vs ABA-treated patients achieving normative values were significantly greater in PtGA (37% vs 23%), HAQ-DI (18% vs 10%), and EQ-5D-5L (22% vs 13%, Fig. S1). Likewise, a significantly greater proportion of patients receiving UPA reported normative values for SF-36 PCS (17% vs 8%, Fig. S1) and the physical functioning (21% vs 11%), bodily pain (33% vs 23%), and general health (24% vs 17%) domains (Fig. S2). At week 24, significantly more UPA- vs ABA-treated patients achieved normative PRO scores in PtGA (44% vs 34%), HAQ-DI (23% vs 16%), and SF-36 PCS (21% vs 12%) and the role physical (22% vs 16%) and bodily pain (38% vs 29%) domains (P<0.05; Figs. S1 and S2). While more UPA-treated patients achieved normative values in the remaining PROs at week 24 compared to ABA-treated patients, the differences between groups were not statistically significant.

Time to treatment response

The time to response (≥MCID), as measured by HAQ-DI, was significantly shorter for UPA- vs ABA-treated patients (medians: 2 weeks vs 4 weeks, P<0.01 [data not shown]). The median time to response was not statistically significantly different for UPA- versus ABA-treated patients in pain (2 weeks vs 4 weeks). There was no difference in the median time to response for morning stiffness severity or morning stiffness duration.
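For readers unfamiliar with how a median time to response is read from a Kaplan-Meier curve, the following sketch illustrates the idea. It is a hand-rolled product-limit estimator with hypothetical visit data, not the study's analysis code; the function names are assumptions, and the log-rank comparison between arms is not implemented. Here the "event" is a first response (improvement ≥ MCID), and patients who have not responded by their last visit are treated as censored.

```python
from typing import List, Optional, Sequence, Tuple


def kaplan_meier(times: Sequence[float],
                 events: Sequence[bool]) -> List[Tuple[float, float]]:
    """Product-limit estimate of the probability of NOT yet having responded.

    Returns (time, probability) pairs at each distinct time with >= 1 event.
    """
    if len(times) != len(events):
        raise ValueError("times and events must have the same length")
    data = sorted(zip(times, events))
    at_risk = len(data)
    prob_no_response = 1.0
    curve: List[Tuple[float, float]] = []
    i = 0
    while i < len(data):
        t = data[i][0]
        n_events = 0
        n_at_this_time = 0
        # Group all patients who share the same visit time.
        while i < len(data) and data[i][0] == t:
            n_events += int(data[i][1])
            n_at_this_time += 1
            i += 1
        if n_events:
            prob_no_response *= 1 - n_events / at_risk
            curve.append((t, prob_no_response))
        at_risk -= n_at_this_time  # responders and censored both leave the risk set
    return curve


def median_time(curve: List[Tuple[float, float]]) -> Optional[float]:
    """Earliest time at which the estimate is <= 0.5; None if never reached."""
    for t, p in curve:
        if p <= 0.5:
            return t
    return None


# Hypothetical data: week of first response, or last visit if censored (False).
arm_a_weeks = [2, 2, 2, 2, 2, 2, 4, 8, 12, 24]
arm_a_resp = [True, True, True, True, True, True, True, False, True, False]
arm_b_weeks = [2, 2, 4, 4, 4, 4, 8, 12, 24, 24]
arm_b_resp = [True, True, True, True, True, True, True, False, False, False]

median_a = median_time(kaplan_meier(arm_a_weeks, arm_a_resp))
median_b = median_time(kaplan_meier(arm_b_weeks, arm_b_resp))

if __name__ == "__main__":
    print("median weeks to response:", median_a, "vs", median_b)
```

Because PROs were collected only at scheduled visits, response times in such an analysis can take only the visit weeks as values, which is why medians such as 2 and 4 weeks arise.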

Discussion

JAK inhibitors, including UPA, are a newer class of treatment for RA in comparison to biologics such as ABA, which have been commonly accepted therapies for patients with RA over the past 20 years. Understanding the efficacy of newer therapies, especially as they compare to more established ones, is important because nearly half of patients do not adequately respond to first-line csDMARDs and up to 66% do not adequately respond to second-line biologics [7, 8]. Thus, these patients represent a population that may be more difficult to treat. The SELECT-CHOICE study is a phase 3, direct head-to-head comparison of efficacy, safety, and PROs between UPA and ABA in a bDMARD-IR population [23]. Primary efficacy data demonstrated that UPA was superior to ABA in the change from baseline in DAS28-CRP components and achievement of remission after 12 weeks of treatment; after 24 weeks of treatment, the change from baseline in DAS28-CRP components remained numerically greater in UPA- vs ABA-treated patients, but the difference was not statistically significant [23]. As a supplement to the clinical efficacy results, analysis of other secondary endpoints showed that both UPA and ABA improved PROs; however, UPA treatment resulted in significantly greater and clinically meaningful improvements in several PROs at 12 weeks when compared with ABA. Early differences between the treatments were seen in key domains of physical functioning, pain, and general health, with improvements in HAQ-DI observed 2 weeks earlier in UPA- vs ABA-treated patients. In this study, patients treated with UPA achieved significantly greater improvements from baseline in PtGA, pain, HAQ-DI, severity of morning stiffness, and EQ-5D-5L as compared with ABA at week 12.
Likewise, SF-36 PCS, 3 SF-36 domains (physical functioning, bodily pain, and general health), and 2 WPAI domains (presenteeism and activity impairment) showed significant improvement with UPA vs ABA at week 12; similar improvements were noted for UPA and ABA in other PROs. At week 24, the differences in change from baseline in HAQ-DI, severity of morning stiffness, WPAI activity impairment domain, and SF-36 PCS and bodily pain domain scores between UPA- and ABA-treated patients remained statistically significant; changes from baseline were similar between groups for the remaining PROs. This study also demonstrated that more patients receiving UPA achieved normative values in PtGA, HAQ-DI, SF-36 PCS, and the SF-36 bodily pain domain at both weeks 12 and 24 as compared with ABA-treated patients; significantly more UPA-treated patients also achieved normative values in the SF-36 role physical domain at week 24. Despite no statistically significant differences in the proportion of patients achieving MCID in these PROs at week 24, these data suggest, based on normative value achievement, that improvements in these PROs may be more substantial with UPA than with ABA treatment. The improvements in PROs reported with UPA in this study are similar to those seen with UPA previously in csDMARD-IR and bDMARD-IR patient populations [13,14,15, 23]. Data from SELECT-COMPARE, which compared UPA, placebo, and adalimumab treatment with a background of MTX at 12 weeks, also demonstrated significant improvements with UPA in PtGA, pain, HAQ-DI, morning stiffness severity, FACIT-F, SF-36 PCS, and 6/8 SF-36 domain scores as compared with adalimumab and placebo [14]. Importantly, SELECT-COMPARE enrolled patients who had inadequate response or intolerance to MTX, whereas this study enrolled bDMARD-IR patients, who represent a difficult-to-treat population with a greater unmet medical need.

Assessment of PROs in chronic disease is key to understanding patient perspectives and should be included when analyzing study drug efficacy. PROs are useful tools to measure the impact of chronic illness on daily living and work abilities because these also impact healthcare resource utilization and overall economic burden of disease. Likewise, PROs can influence treatment decisions and provide a more customized approach to disease management, especially when treatments are comparable [34]. When selecting treatments, time to response, route of administration, and number of doses taken per day are also important factors to consider as they may greatly affect patients’ perception of efficacy and overall treatment adherence [35, 36]. Patients with RA frequently experience pain, fatigue, and impaired physical functioning, and these may have negative impacts on their HRQOL [1,2,3,4]. Fatigue and pain are also associated with reductions in mental well-being and the ability of patients to perform daily activities and maintain employment [4, 37, 38]. In the current study, improvements in physical functioning (HAQ-DI) and severity of morning stiffness were observed as early as 2 weeks after treatment initiation with UPA. After 12 weeks of treatment, greater proportions of UPA- vs ABA-treated patients reported clinically meaningful improvements in physical functioning and in 4 of 8 SF-36 domain scores and the PCS score. The proportion of UPA- vs ABA-treated patients achieving normative values after 12 weeks of treatment was significantly greater in PtGA, physical functioning, general HRQOL by EQ-5D-5L, and SF-36 PCS and 3 of 8 domain scores (i.e., physical functioning, bodily pain, and general health).
The proportion of UPA- vs ABA-treated patients achieving normative values at week 24 was significantly greater for PtGA, physical functioning, and SF-36 PCS and the bodily pain and role physical domain scores. Similar percentages of patients (over half) treated with UPA or ABA achieved clinically meaningful reductions in work and activity impairment at week 12. At week 24, over 68% of UPA- or ABA-treated patients had clinically meaningful improvement in activity impairment; 62% of UPA-treated patients had clinically meaningful improvement in work impairment vs only 49% of ABA-treated patients. Likewise, similar percentages of patients treated with UPA or ABA also achieved clinically meaningful reductions in the key symptom of fatigue. Importantly, patients reported shorter median response times to improvements in physical functioning with UPA treatment compared with ABA. Together, these results suggest that UPA may lead to meaningful early improvements in key PROs that are important to patients, including fatigue, pain, physical functioning, and ability to perform work and daily activities.

There are both strengths and limitations to this study. Strengths of the study include the utilization of several validated PROs that reflect the different aspects of the patient experience. To our knowledge, this is the first clinical study comparing a JAK inhibitor to ABA in a bDMARD-IR population. This study fills that gap by providing important data on patient-perceived efficacy in this population. The use of MCIDs and normative values allows the data to be clinically meaningful and interpretable for patients and physicians. The blinded and randomized study design allows for unbiased reporting from each patient and mitigates biases due to differences between treatment groups. Limitations of the study include the collection of PROs at fixed visits, sometimes weeks apart, with no day-to-day data available. Prolonged recall of such dynamic symptoms may introduce recall bias that could affect patient perceptions of efficacy [39]. Although patients did receive either IV or oral placebo in combination with active therapy, this trial was not placebo-controlled since patients were aware that they were receiving an active treatment. This may affect patients’ perception of drug efficacy. The PROs presented here were not ranked secondary endpoints and were not controlled for multiplicity; thus, all significance values are nominal. Imputation of missing data as non-response may lead to an underestimation of the true response rate for each PRO. The time frame of this analysis was relatively short (24 weeks); thus, additional studies are needed to determine whether the patient-reported improvements observed are maintained long-term.

Conclusions

Treatment with UPA or ABA resulted in rapid and clinically meaningful improvements in PROs among bDMARD-IR patients with moderately to severely active RA. Overall, greater improvements from baseline in PROs with UPA vs ABA treatment, especially in the key domains of physical functioning, pain, and general health, were observed after 12 and 24 weeks of treatment. Although the proportions of patients achieving improvements ≥ MCID were similar between UPA- and ABA-treated patients at week 24, the numerically greater improvements from baseline in PROs, coupled with higher percentages of patients achieving normative values, suggest that improvement in those PROs may be more substantial with UPA vs ABA treatment. Moreover, these data suggest that patients receiving UPA had faster therapeutic response times, as seen by earlier meaningful improvement in PROs, compared with ABA-treated patients.