Background

Psoriatic arthritis (PsA) is a chronic inflammatory autoimmune disease with a range of clinical manifestations affecting skin and musculoskeletal systems [1]. Health-related quality of life (HRQoL) can vary greatly according to a patient’s specific symptoms; hence, assessing treatment effects using patient-reported outcomes (PROs) is particularly important in PsA [2,3,4,5,6]. Several PRO instruments have been validated in PsA, including the Health Assessment Questionnaire-Disability Index (HAQ-DI) [4, 7] and Short Form-36 (SF-36) [5, 6].

Abatacept, a selective T-cell co-stimulation modulator [8], has a distinct mechanism of action upstream of currently available agents, and is approved for treatment of rheumatoid arthritis and juvenile idiopathic arthritis, and recently for active PsA in adults [9]. In the phase 3 Active pSoriaTic aRthritis rAndomizEd triAl (ASTRAEA, NCT01860976), subcutaneous (SC) abatacept 125 mg weekly significantly increased the proportion of patients achieving ≥ 20% improvement in the American College of Rheumatology criteria (ACR20) compared with placebo at week 24 (primary endpoint: 39.4% vs 22.3%; P < 0.001) and was well tolerated in patients with active PsA [10]. A numerically higher proportion of patients with HAQ-DI responses (reductions from baseline ≥ 0.35) was evident with abatacept versus placebo (P > 0.05). Abatacept treatment also reduced progression of structural damage with an overall beneficial effect on musculoskeletal symptoms. However, because the HAQ-DI response did not reach significance in the hierarchical testing procedure, significance could not be attributed to the endpoints ranked below it [10].

The effect of factors associated with poor prognosis and treatment resistance, such as elevated C-reactive protein (CRP) levels and prior exposure to tumour necrosis factor inhibitors (TNFi) [11], was also evaluated in ASTRAEA. Higher ACR20 responses were observed with abatacept versus placebo in both TNFi-naïve and TNFi-exposed subpopulations at week 24, with the largest treatment differences seen in TNFi-naïve patients [10]. Moreover, patients with baseline CRP > upper limit of normal (ULN) had the highest ACR20 responses at week 24 with abatacept versus placebo [10].

The goal of the analyses reported here was to examine the impact of abatacept versus placebo treatment on PROs in ASTRAEA for the overall population and in subgroups by baseline CRP levels and previous TNFi exposure.

Methods

Study design and treatment

The design, eligibility criteria, and main efficacy and safety endpoints of this phase 3, randomised, double-blind, placebo-controlled, multicentre trial have been reported in detail previously [10]. Patients were randomised (1:1) to receive SC abatacept 125 mg weekly or placebo for 24 weeks, after which all patients were transitioned to receive open-label SC abatacept weekly for 28 weeks (total study period of 52 weeks). Patients without ≥ 20% improvement in tender and swollen joint counts at week 16 were switched to open-label abatacept for 28 weeks (early escape [EE], total study period of 44 weeks). Key eligibility criteria included age ≥ 18 years, PsA per the Classification Criteria for PsA (CASPAR) [12], active arthritis (defined as ≥ 3 tender and ≥ 3 swollen joints), active plaque psoriasis with ≥ 1 qualifying target lesion ≥ 2 cm in diameter and inadequate response or intolerance to ≥ 1 non-biologic disease-modifying antirheumatic drug (DMARD). Both TNFi-naïve and TNFi-exposed patients were included.

Patient-reported outcomes

HAQ-DI [4, 7], SF-36 physical component summary (PCS), mental component summary (MCS) and individual domain scores [5, 6], Functional Assessment of Chronic Illness Therapy-Fatigue scale (FACIT-F) [3] and Dermatology Life Quality Index (DLQI) [2] scores were assessed at weeks 16 and 24 in the overall population (prespecified) and in patient subpopulations (post hoc) by baseline CRP (> or ≤ ULN, defined as 3 mg/L) and prior TNFi use. The hierarchical order of the secondary and exploratory PRO endpoints [10] was predefined as: proportions of patients reporting HAQ-DI responses ≥ minimal clinically important differences (MCIDs) and mean changes from baseline in SF-36 PCS and MCS scores (summary and domain scores).
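The gatekeeping logic behind a predefined hierarchy of this kind can be illustrated with a minimal sketch. It assumes a simple fixed-sequence rule at a 5% level; the function name and the P values in the example are hypothetical and are not taken from the trial.

```python
from typing import List, Tuple


def fixed_sequence_test(
    endpoints: List[Tuple[str, float]], alpha: float = 0.05
) -> List[Tuple[str, float, str]]:
    """Walk the endpoints in their predefined order.

    Each endpoint is tested at `alpha` only while every endpoint ranked
    above it was significant. After the first failure, all lower-ranked
    endpoints are labelled "nominal only".
    """
    results = []
    gate_open = True
    for name, p_value in endpoints:
        if not gate_open:
            results.append((name, p_value, "nominal only"))
        elif p_value < alpha:
            results.append((name, p_value, "significant"))
        else:
            results.append((name, p_value, "not significant"))
            gate_open = False  # closes the gate for every later endpoint
    return results


# Hypothetical P values, for illustration only.
example = [
    ("HAQ-DI response", 0.10),
    ("SF-36 PCS change", 0.01),
    ("SF-36 MCS change", 0.20),
]
for name, p_value, label in fixed_sequence_test(example):
    print(f"{name}: P = {p_value} ({label})")
```

Under this rule, a small P value for a lower-ranked endpoint is reported but cannot be declared significant once a higher-ranked endpoint has failed.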

Here, in the overall population, the proportions of patients reporting improvements from baseline ≥ MCID and scores ≥ normative values in HAQ-DI, SF-36 (summary and domain) and FACIT-F were analysed post hoc at week 16, prior to confounding due to EE to open-label abatacept treatment. The MCID is a value established for each instrument and is defined as the smallest change in score perceived by a patient to be clinically important [13]; normative values were defined based on an age/gender-matched population. Defined MCIDs were: HAQ-DI ≥ − 0.35 [14], SF-36 PCS ≥ 2.5 [13, 15,16,17], SF-36 MCS ≥ 2.5 [13, 15, 17], SF-36 domains ≥ 5.0 [13, 15, 17], and FACIT-F ≥ − 4.0 [3]. Normative values were: HAQ-DI < 0.5 [7, 18, 19], SF-36 PCS ≥ 50 [17, 20], SF-36 MCS ≥ 50 [20] and FACIT-F ≥ 40.1 [21].
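The responder definitions above can be expressed as a short sketch. The threshold magnitudes are those listed in the text. The direction of improvement is an assumption of this sketch (a decrease for HAQ-DI, an increase for SF-36 and FACIT-F), and the function and dictionary names are hypothetical.

```python
# MCID magnitudes as listed in the text.
MCID = {
    "HAQ-DI": 0.35,
    "SF-36 PCS": 2.5,
    "SF-36 MCS": 2.5,
    "SF-36 domain": 5.0,
    "FACIT-F": 4.0,
}

# Assumption: lower HAQ-DI scores mean less disability, so improvement
# is a decrease; for the other instruments improvement is an increase.
LOWER_IS_BETTER = {"HAQ-DI"}


def improvement(instrument: str, baseline: float, follow_up: float) -> float:
    """Change from baseline, signed so that a positive value is an improvement."""
    change = follow_up - baseline
    return -change if instrument in LOWER_IS_BETTER else change


def meets_mcid(instrument: str, baseline: float, follow_up: float) -> bool:
    """True if the improvement from baseline is at least the MCID."""
    return improvement(instrument, baseline, follow_up) >= MCID[instrument]


def meets_normative(instrument: str, score: float) -> bool:
    """True if a score reaches the normative value listed in the text."""
    if instrument == "HAQ-DI":
        return score < 0.5
    if instrument in ("SF-36 PCS", "SF-36 MCS"):
        return score >= 50.0
    if instrument == "FACIT-F":
        return score >= 40.1
    raise ValueError(f"no normative value listed for {instrument}")


# Hypothetical patient: HAQ-DI falls from 1.5 to 1.0, SF-36 PCS rises from 35 to 38.
print(meets_mcid("HAQ-DI", 1.5, 1.0))       # improvement of 0.5
print(meets_mcid("SF-36 PCS", 35.0, 38.0))  # improvement of 3.0
print(meets_normative("SF-36 PCS", 38.0))   # still below 50
```

Keeping the sign convention in one place (`improvement`) avoids mixing "negative is better" and "positive is better" thresholds when proportions of responders are tabulated across instruments.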

Statistical analyses

All efficacy analyses included all randomised patients who received at least one dose of study medication (intent-to-treat population). Week 16, prior to EE, was the last time point at which all patients were analysed. For week 24 analyses, EE patient data were set to missing. As previously reported, the effect of abatacept on the first key secondary endpoint in the statistical hierarchy (HAQ-DI responses) did not reach significance; therefore, only nominal P values were generated for subsequent outcomes, which were ranked lower in the hierarchy. The significance of the treatment effect cannot be definitively attributed for these outcomes as they were not adjusted for multiplicity (however, 95% confidence intervals [CIs] were not overlapping) [10]. Nonetheless, these lower-ranking outcomes still provide a measure of clinical meaningfulness. Adjusted mean changes from baseline in PROs including SF-36 domain scores were evaluated, and corresponding adjusted mean differences (95% CI) between the abatacept and placebo groups were calculated using a longitudinal repeated measures model. This model included the fixed categorical effects of treatment, day, prior TNFi use, methotrexate (MTX) use, body surface area (BSA), day-by-treatment interaction, prior TNFi-use-by-day interaction, MTX-use-by-day interaction, BSA-by-day interaction and the continuous fixed covariate of baseline score and baseline score-by-day interaction. The estimate of difference (95% CI) between abatacept and placebo groups for MCID and normative values was calculated using a two-sided Cochran–Mantel–Haenszel chi-square test adjusted for stratification criteria.
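For readers less familiar with stratified analyses of responder rates, the following is a minimal sketch of a Cochran–Mantel–Haenszel chi-square test for a set of 2 × 2 tables, together with a Mantel–Haenszel-weighted difference in proportions. It is not the trial's analysis code: the weighting scheme, the absence of a continuity correction and the example counts are assumptions of this sketch, and no CI is computed because the text does not specify the method used.

```python
import math
from typing import List, Tuple

# One 2x2 table per stratum:
# (responders_active, nonresponders_active, responders_control, nonresponders_control)
Stratum = Tuple[int, int, int, int]


def cmh_chi_square(strata: List[Stratum]) -> Tuple[float, float]:
    """Return the CMH statistic (1 df, no continuity correction) and its two-sided P value."""
    observed_minus_expected = 0.0
    variance = 0.0
    for a, b, c, d in strata:
        n = a + b + c + d
        if n < 2:
            continue  # a stratum this small carries no information
        n_active, n_control = a + b, c + d
        n_resp, n_nonresp = a + c, b + d
        observed_minus_expected += a - n_active * n_resp / n
        variance += n_active * n_control * n_resp * n_nonresp / (n * n * (n - 1))
    if variance == 0.0:
        raise ValueError("no informative strata")
    statistic = observed_minus_expected ** 2 / variance
    # Survival function of a chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(statistic / 2.0))
    return statistic, p_value


def mh_risk_difference(strata: List[Stratum]) -> float:
    """Mantel-Haenszel-weighted difference in response proportions (active minus control)."""
    weighted_diff = 0.0
    weight_sum = 0.0
    for a, b, c, d in strata:
        n_active, n_control = a + b, c + d
        if n_active == 0 or n_control == 0:
            continue  # the difference is undefined in this stratum
        weight = n_active * n_control / (n_active + n_control)
        weighted_diff += weight * (a / n_active - c / n_control)
        weight_sum += weight
    if weight_sum == 0.0:
        raise ValueError("no informative strata")
    return weighted_diff / weight_sum


# Hypothetical counts for two strata, for illustration only.
tables = [(30, 20, 22, 28), (18, 32, 12, 38)]
statistic, p_value = cmh_chi_square(tables)
print(f"CMH chi-square = {statistic:.2f}, P = {p_value:.4f}")
print(f"Weighted difference in proportions = {mh_risk_difference(tables):.3f}")
```

Pooling within strata in this way means that the comparison between treatment arms is made among patients who share the same stratification factors, rather than across the whole population at once.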

Patient consent and ethics approval

All patients or their legal representatives gave written, informed consent prior to study entry. The study was conducted in accordance with the Declaration of Helsinki, International Conference on Harmonisation Guidelines for Good Clinical Practice and local regulations. Schulman Associates Institutional Review Board or Independent Ethics Committees approved the protocol, consent form and any other written information provided to patients or their legal representatives.

Results

Patients

Of 424 patients randomised, 213 received abatacept and 211 placebo; 76 (35.7%) and 89 (42.2%), respectively, met criteria for EE [10]. Baseline demographic and disease characteristics were similar between treatment groups and were reported in detail previously [10].

Overall population analysis

Changes from baseline at weeks 16 and 24

In the total population, greater improvements from baseline in most PROs were reported with abatacept versus placebo at both week 16, which comprised all patients, and week 24, which included only patients showing a response to either treatment (response defined as 20% improvement in tender and swollen joint counts; Figs. 1, 2 and 3a). Statistically significant (95% CI of difference vs placebo not crossing 0) improvements from baseline with abatacept versus placebo were reported in HAQ-DI scores in the week 24 responder group (Fig. 1a), in SF-36 PCS (Fig. 2), SF-36 physical functioning (PF), bodily pain (BP) and vitality (VT) domains (adjusted mean difference [95% CI], respectively: 4.44 [0.39 to 8.49], 5.36 [1.40 to 9.33] and 4.07 [0.67 to 7.47]), and in DLQI (Fig. 1c) scores at weeks 16 and 24.

Fig. 1

HAQ-DI (a), FACIT-F (b), DLQI (c) change from baseline (weeks 16 and 24, overall population). *Statistically significant difference. Dotted lines represent MCID (HAQ-DI: ≥ − 0.35; FACIT-F: ≥ − 4.0). CI confidence interval, DLQI Dermatology Life Quality Index, FACIT-F Functional Assessment of Chronic Illness Therapy-Fatigue scale, HAQ-DI Health Assessment Questionnaire-Disability Index, MCID minimal clinically important difference, NA not applicable, SE standard error

Fig. 2

SF-36 PCS and MCS change from baseline (weeks 16 and 24, overall population). *Statistically significant difference. Dotted line represents MCID (≥ 2.5). CI confidence interval, MCID minimal clinically important difference, MCS mental component summary, PCS physical component summary, SE standard error, SF-36 Short Form-36

Fig. 3

Abatacept/placebo SF-36 domain scores (baseline, weeks 16, 24) versus normative population (a, overall; b, CRP > ULN). Normative values for SF-36 individual domains were defined based on matching the age/gender distribution of this protocol population to US 1999 norms in patients without chronic disease or arthritis [20, 34]: PF and RP 81.9, BP 69.7, GH 70.4, VT 59.3, SF 84.4, RE 87.8, MH 75.6. A/G age/gender, BP bodily pain, CRP C-reactive protein, GH general health, MH mental health, PF physical function, RE role–emotional, RP role–physical, SF social function, SF-36 Short Form-36, ULN upper limit of normal, VT vitality

Changes from baseline in SF-36 MCS scores were numerically greater with abatacept versus placebo at week 16, although the difference was not statistically significant (adjusted mean change from baseline [standard error (SE)]: 2.42 [0.70] vs 1.15 [0.73]; adjusted mean difference [95% CI]: 1.28 [− 0.58 to 3.13]; P > 0.05), and were not meaningfully different in the responder-only analysis at week 24 (adjusted mean change from baseline [SE]: 2.56 [0.83] vs 2.62 [0.92]; adjusted mean difference [95% CI]: − 0.06 [− 2.32 to 2.20]). All SF-36 domains showed trends towards greater improvements from baseline with abatacept than placebo at week 16; improvements from baseline in all domains increased in both abatacept and placebo groups among responders at week 24 (Fig. 3a; see Additional file 1: Table S1).
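The criterion used in this section (a 95% CI of the difference versus placebo that does not cross 0) can be checked directly against the reported intervals. The sketch below applies it to two of the adjusted mean differences quoted in the text; the helper names are hypothetical, and the midpoint check assumes that the reported intervals are symmetric about the point estimate.

```python
def ci_excludes_zero(lower: float, upper: float) -> bool:
    """True if the interval lies entirely above or entirely below 0."""
    return lower > 0.0 or upper < 0.0


def midpoint(lower: float, upper: float) -> float:
    """Centre of the interval; for a symmetric CI this is the point estimate."""
    return (lower + upper) / 2.0


# Adjusted mean differences (95% CI) as reported in the text.
reported = {
    "SF-36 PF": (4.44, 0.39, 8.49),
    "SF-36 MCS, week 16": (1.28, -0.58, 3.13),
}

for name, (estimate, lower, upper) in reported.items():
    label = "excludes 0" if ci_excludes_zero(lower, upper) else "crosses 0"
    consistent = abs(midpoint(lower, upper) - estimate) < 0.01
    print(f"{name}: {estimate} [{lower} to {upper}] {label}; midpoint consistent: {consistent}")
```

The midpoint comparison is only a transcription check on the reported numbers; it says nothing about whether a difference is clinically important.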

Minimal clinically important differences and normative values

At week 16, the proportions of patients reporting improvements ≥ MCID were statistically significantly greater with abatacept versus placebo in SF-36 PCS and MCS scores, and in the PF, BP and role–emotional (RE) domains (Fig. 4a; see Additional file 2: Table S2). The proportions of patients reporting scores ≥ normative values at week 16 were significantly greater (estimate of difference [95% CI]) with abatacept versus placebo in FACIT-F (10.4 [0.4 to 20.3]) and SF-36 RE domain (10.3 [3.4 to 17.1]) scores.

Fig. 4

Rates of PRO improvements ≥ MCID (a) or ≥ normative values (b) at week 16 (overall population). *Statistically significant difference. MCID values: HAQ-DI ≥ − 0.35, SF-36 PCS ≥ 2.5, SF-36 MCS ≥ 2.5, FACIT-F ≥ − 4.0 and SF-36 domains ≥ 5.0. Normative values: HAQ-DI < 0.5, SF-36 PCS ≥ 50, SF-36 MCS ≥ 50 and FACIT-F ≥ 40.1. CI confidence interval, FACIT-F Functional Assessment of Chronic Illness Therapy-Fatigue scale, HAQ-DI Health Assessment Questionnaire-Disability Index, MCID minimal clinically important difference, MCS mental component summary, PCS physical component summary, PRO patient-reported outcome, SF-36 Short Form-36

A numerically greater proportion of patients reported improvements ≥ MCID with abatacept versus placebo at week 16 in HAQ-DI scores, but the difference did not reach statistical significance (Fig. 4a). At week 16, the proportions of patients reporting improvements ≥ MCID in FACIT-F scores (Fig. 4a) and in SF-36 role–physical (RP), general health (GH), VT, social function (SF) and mental health (MH) domain scores (see Additional file 2: Table S2) were numerically higher with abatacept versus placebo. In the abatacept treatment group, mean changes from baseline exceeded MCID in six of eight SF-36 domains, the exceptions being GH and MH, whereas in the placebo group, mean changes exceeded MCID in the RP and BP domains only.

The proportions of patients who reported scores ≥ normative values at week 16 were numerically, but not statistically significantly, higher with abatacept than placebo in HAQ-DI and SF-36 PCS and MCS scores (Fig. 4b) and in the SF-36 domains other than RE (see Additional file 3: Table S3) (P > 0.05).

Subpopulation analyses

Changes from baseline at weeks 16 and 24

Across all PROs, improvements from baseline to week 16 were numerically, but not statistically significantly, greater in patients with baseline CRP > ULN than in those with CRP ≤ ULN in both abatacept and placebo groups (P > 0.05; Table 1). In the CRP > ULN subpopulation, improvements with abatacept versus placebo were significantly greater in HAQ-DI, SF-36 PCS, MCS, FACIT-F and DLQI scores (Table 1). Across all SF-36 domains, with the exception of MH, statistically significantly greater improvements were reported at week 16 with abatacept versus placebo in patients with baseline CRP > ULN (Fig. 3b and Fig. 5). Statistically significant improvements (adjusted mean difference [95% CI]) in DLQI (− 2.32 [− 3.80 to − 0.83]; Table 1) and SF-36 PF (8.57 [2.15 to 14.99]) and BP (6.62 [0.15 to 13.09]) (Fig. 5b) domain scores were reported in the baseline CRP > ULN subpopulation with abatacept versus placebo at week 24; however, these data should be interpreted with caution due to low patient numbers. In the CRP ≤ ULN subpopulation, no significant improvements with abatacept versus placebo were evident at week 16 (Table 1).

Table 1 Adjusted mean change from baseline in PROs at weeks 16 (all patients) and 24 (non-EE responder analysis) in patients treated with abatacept or placebo and stratified by baseline CRP level or prior TNFi use

Fig. 5

SF-36 domain score changes from baseline for CRP > ULN population: weeks 16 (a) and 24 (b). *Statistically significant difference. BP bodily pain, CI confidence interval, CRP C-reactive protein, GH general health, MH mental health, PF physical function, RE role–emotional, RP role–physical, SE standard error, SF social function, SF-36 Short Form-36, VT vitality

A statistically significant benefit for abatacept versus placebo was reported by TNFi-naïve patients at week 16 in SF-36 MCS scores (Table 1). Among abatacept-treated patients, improvements in SF-36 PCS, MCS and FACIT-F scores at week 16 were numerically, but not statistically significantly, greater in those who were TNFi-naïve at baseline than in those who were TNFi-exposed (P > 0.05; Table 1). In the TNFi-naïve abatacept-treated subpopulation, adjusted mean changes from baseline at week 16 exceeded MCID in SF-36 PCS, MCS and FACIT-F scores (Table 1), and in seven of eight SF-36 domains, with the exception of MH (data not shown). In TNFi-exposed abatacept-treated patients, improvements exceeded MCID in SF-36 PCS scores (Table 1).

Discussion

These analyses demonstrated that abatacept treatment generally improved PROs in patients with active PsA in the phase 3 ASTRAEA trial, particularly in those who were TNFi-naïve and/or with elevated CRP at baseline. In the overall population at week 16, prior to EE, abatacept administration was associated with improved PROs compared with placebo; significant improvements with abatacept versus placebo were reported in SF-36 PCS, PF, BP and VT domain scores as well as DLQI, reflecting those areas of HRQoL most impacted by PsA. At week 24 in the non-EE responder analysis, a potential benefit of abatacept treatment was evident compared with placebo, with significantly greater improvements reported in physical function (by HAQ-DI) and dermatological manifestations (by DLQI). The proportion of patients with clinically meaningful HAQ-DI responses (reductions from baseline score ≥ 0.35) at week 24 was numerically higher with abatacept versus placebo: 31.0% versus 23.7%; however, as this did not reach statistical significance, it was not possible to definitively attribute significance to lower-ranking secondary endpoints in the hierarchical testing (nominal P values only were generated; 95% CIs were not overlapping) [10]. Notably, significant improvements in DLQI, the only PRO investigated here that directly measures the skin domain in PsA, were reported by those patients with a background of an overall modest skin response (by Psoriatic Area and Severity Index) in ASTRAEA at week 24 [10]. Nevertheless, as the week 24 analysis included only non-EE responders, the placebo arm comprised patients who reported responses to placebo. Therefore, it may be expected that differences between treatment groups would be less obvious at week 24 than week 16. In addition, the number of patients analysed at this time point was lower than at week 16.

Comparison of the proportions of patients reporting improvements ≥ MCID is considered a clinically meaningful estimate of therapy effects [22]. Overall, the proportion of abatacept-treated patients reporting improvements ≥ MCID in PROs exceeded the proportion of placebo-treated patients: at week 16, 41.8–58.2% of abatacept-treated patients across different PROs reported clinically meaningful improvements in HAQ-DI, SF-36 PCS and MCS, individual SF-36 domains and FACIT-F scores compared with 33.6–47.9% of those treated with placebo.

In addition to the overall population analysis, PROs were analysed in subpopulations of patients by baseline CRP, as elevated CRP is an identified poor prognostic factor [11]. There was a non-statistically significant trend towards improved PROs in patients with elevated baseline CRP regardless of treatment arm at week 16. However, among patients with elevated CRP, those receiving abatacept reported greater improvements compared with placebo. Similarly, in the main ASTRAEA study, the highest ACR20 responses with abatacept versus placebo were seen in patients with CRP > ULN at baseline [10], suggesting that these patients may be particularly responsive to abatacept. Our results suggest that baseline CRP should be taken into consideration when evaluating the clinical efficacy of different treatments. PROs were also analysed in subpopulations by previous exposure to TNFi treatment. At week 16, improvements were greater with abatacept than with placebo in both TNFi-naïve and TNFi-exposed subpopulations. However, in abatacept-treated patients, reported improvements in PROs were generally larger in the TNFi-naïve versus TNFi-exposed subpopulations. Indeed, greater efficacy would be expected in TNFi-naïve than in TNFi-exposed patients [23]. These findings are in line with clinical outcomes observed with abatacept in this trial, which were generally better in patients with elevated CRP at baseline and TNFi-naïve patients [10]. The PRO data reported here support previous results that abatacept may be particularly effective in certain subpopulations of patients.

The effects of other DMARDs, including TNFi agents, on PROs in patients with active PsA have been investigated previously, with most studies assessing effects over 24 or 48 weeks. Statistically and clinically meaningful improvements in SF-36 PCS and MCS and all individual domain scores from baseline to week 24 have been reported with etanercept [24]; clinically meaningful improvements in PROs including HAQ-DI and SF-36 PCS scores have also been reported over 48 weeks of treatment [25]. Similarly, adalimumab has been shown to improve HRQoL, based on SF-36 PCS, HAQ-DI, FACIT-F and DLQI scores, after 48 weeks of treatment [26]. The effects of the newer interleukin inhibitors on PROs have also been studied. Ustekinumab, an anti-interleukin-12 and -23 agent, improved physical function (by SF-36 PCS and HAQ-DI) and dermatological manifestations (by DLQI) at week 24 [27, 28]. Beneficial effects on PROs have also been reported after 24 weeks with the anti-interleukin-17A agent secukinumab [29, 30]. Furthermore, apremilast, a phosphodiesterase 4 inhibitor, was shown to significantly improve HAQ-DI scores by week 16 compared with placebo in a 24-week trial in which < 10% of patients had previously failed a biologic therapy [31]. In the current study, improvements in PROs achieved with abatacept appeared less marked than those reported with other biologic DMARDs (bDMARDs) in earlier studies; however, differences in trial design and patient populations preclude comparisons of efficacy between abatacept and other bDMARDs based on currently available evidence.

The therapeutic options for PsA have greatly increased over the past 10 years and, as more new treatments are introduced, assessing responses to therapy including PROs will become increasingly important, aiding treatment choices. A recent literature review provided an evidence-based overview of 44 instruments, organised by core PsA outcome domain, to ascertain the applicability of the many available PROs and the best instrument for each domain [32]. However, further research is warranted to develop and validate specific PRO measures that better capture the impact of all PsA symptoms [33]. In the meantime, using a combination of instruments and/or the best available instrument per domain, as in this trial, provides a more complete picture [33].

A number of study limitations should be considered. First, subpopulation comparisons and ascertainment of scores ≥ MCID and ≥ normative values were post hoc in nature. Second, owing to the particular trial design, a high proportion of patients were subject to EE at week 16; as such, week 24 analyses included a limited number of patients who were still receiving blinded treatment in either arm of the trial. Third, only nominal P values were provided for endpoints that ranked lower in the statistical hierarchy than the first secondary endpoint, which did not reach statistical significance at the 5% level. For other endpoints, only 95% CIs of differences between abatacept and placebo arms were generated, without associated P values. Fourth, due to the low patient numbers, the reported data for subpopulations were difficult to interpret, particularly at week 24. Fifth, certain PROs may improve more slowly than others and thus the week 16 time point may not have allowed maximal effects of abatacept treatment to be observed. Finally, although some statistically significant improvements were noted with abatacept, these may not necessarily be clinically important.

Conclusions

In conclusion, abatacept treatment improved PROs at week 16 in patients with active PsA, with evidence of some effects sustained at week 24. Furthermore, PRO improvements were greater in TNFi-naïve patients and those with elevated CRP. These results demonstrate that clinical improvements in PsA signs and symptoms previously reported with abatacept treatment [10] also result in clinically meaningful improvements in PROs.