In patients with rheumatoid arthritis (RA), the American College of Rheumatology (ACR) 20 response rate was determined to best discriminate active from placebo (PBO) treatment [1]. The determination of ACR20 as a better discriminator than ACR50 and ACR70 was largely based on evaluation of responses from PBO- or active-controlled trials of conventional synthetic disease-modifying antirheumatic drugs (csDMARDs), including methotrexate (MTX), cyclosporine, and gold injections, in patients with established RA [2]. In the past two decades, much progress has been made in the treatment of RA, with the addition of biological DMARDs (bDMARDs) and alternative csDMARDs to our armamentarium. It has also not been demonstrated that the ACR20 is the best discriminator between treatments in head-to-head studies comparing two active treatments. An alternative, the ACRHybrid measure, was proposed 10 years ago and combines the ACR20, 50, and 70 and a continuous measure of change in the 7 ACR core set measures [3], but it has not been adopted as a primary outcome measure for large-scale RA clinical trials [4]. One objective of this analysis was to determine the ACR response rate that has the most discriminatory ability between different treatment strategies in patients with RA in clinical trials.

When assessing depth of response with respect to disease activity, continuous scales, such as the Simplified Disease Activity Index (SDAI), Clinical Disease Activity Index (CDAI), or 28-joint Disease Activity Score based on erythrocyte sedimentation rate (DAS28(ESR)) or C-reactive protein (DAS28(CRP)), are commonly employed. The DAS28(CRP) is used particularly often in clinical trials, which typically include central laboratory testing. Another objective of this analysis was to evaluate which criteria based on improvements in these disease activity indices best discriminate between different treatment strategies in clinical trials.

Patients and methods

Patients and studies

This post hoc analysis included data from patients in four double-blind, randomized, PBO- or active-controlled trials of originator adalimumab (ADA) in RA. The individual studies were performed in accordance with the International Conference on Harmonisation Guidelines for Good Clinical Practice and the Declaration of Helsinki.

The individual study protocols were reviewed and approved by an institutional review board or ethics committee at each study center. All patients provided written informed consent.

The OPTIMA and PREMIER studies compared active treatments with initial therapy of ADA+MTX versus MTX in patients with early RA, whereas the ARMADA and DE019 studies compared treatment with ADA versus PBO, both added to background MTX, in patients with established RA. The methods of these trials have been previously published [5,6,7,8]. Briefly, in the phase 3 PREMIER trial (NCT00195663), patients with early RA who were naïve to tumor necrosis factor inhibitors (TNFi) and MTX were blindly randomized to receive ADA 40 mg every other week (eow), MTX weekly, or ADA 40 mg eow plus MTX weekly for 2 years [5]. In the phase 4 OPTIMA trial (NCT00420927), patients with early RA who were naïve to MTX and TNFi were blindly randomized to receive ADA 40 mg eow plus MTX weekly or PBO plus MTX for 26 weeks, followed by a 52-week treatment continuation, adjustment, or withdrawal period [6].

In the phase 3 DE019 trial (NCT00195702), TNFi-naïve patients with established RA and active disease despite treatment with MTX were randomized to receive ADA 20 mg eow, ADA 40 mg eow, or PBO in addition to background MTX for 1 year [7]. In ARMADA (conducted before the requirement of clinical trial registration), patients with established RA with an inadequate response to ≤ 4 prior csDMARD(s) were randomized to receive ADA at 20, 40, or 80 mg eow or PBO, each with concomitant MTX for 24 weeks [8].

Identification of the most discriminatory ACR response, SDAI, CDAI, and DAS28(CRP) improvement

Because adalimumab is an established therapy for RA, we defined the criterion for “most discriminatory” as the ability to detect a statistically significant difference between adalimumab and the appropriate control. The response that corresponds to the lowest P value is therefore identified as the most discriminatory, with the consideration that the lowest P value is equivalent to the largest standardized effect size in a completed trial with fixed sample size.
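The stated equivalence between the lowest P value and the largest standardized effect size can be made concrete with a short derivation. This is a sketch under an assumption not stated in the text: that responder proportions at cutoff c are compared with a two-sided pooled two-proportion z-test. The symbols (\hat p_A, \hat p_P, \bar p, n_A, n_P, d_c) are introduced here for illustration only.

```latex
% Assumed test: pooled two-proportion z-test at cutoff c
z_c = \frac{\hat p_A(c) - \hat p_P(c)}
           {\sqrt{\bar p(c)\,\bigl(1-\bar p(c)\bigr)\left(\frac{1}{n_A}+\frac{1}{n_P}\right)}},
\qquad
P(c) = 2\bigl[1 - \Phi(|z_c|)\bigr]

% Standardized effect size at cutoff c
d_c = \frac{\hat p_A(c) - \hat p_P(c)}{\sqrt{\bar p(c)\,\bigl(1-\bar p(c)\bigr)}},
\qquad
z_c = \frac{d_c}{\sqrt{\frac{1}{n_A}+\frac{1}{n_P}}}

% With n_A and n_P fixed, P(c) is strictly decreasing in |d_c|, so
\arg\min_c P(c) = \arg\max_c |d_c|
```

Under this assumption the denominator depends on the pooled response rate \bar p(c), so the cutoff with the lowest P value need not coincide with the cutoff that gives the greatest raw difference in response rates.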

For each treatment, the ACR response in increments of 5% (0–100%) was calculated at 12 and 24/26 weeks. The ACR response that corresponded to the lowest P value for the difference between ADA+MTX and PBO+MTX in early RA, and between ADA and PBO (each added to background MTX) in established RA, was identified as having the most discriminatory ability. In addition, the ACR response with the greatest treatment difference was investigated.
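The cutoff scan described above can be sketched in a few lines. This is an illustrative sketch only, not the authors' analysis code: the statistical test is not named in the text, so a pooled two-proportion z-test is assumed, and the function names (`acr_n`, `two_sided_p`, `scan_cutoffs`, `most_discriminatory`) and all input values are hypothetical.

```python
import math


def acr_n(tjc_pct, sjc_pct, other_pcts):
    """Largest N for which a patient meets ACR-N.

    Inputs are percent improvements from baseline. ACR-N requires at least
    N% improvement in tender and swollen joint counts and in at least 3 of
    the 5 remaining core set measures, i.e. in the third-best of those 5.
    """
    if len(other_pcts) != 5:
        raise ValueError("expected 5 remaining core set measures")
    third_best = sorted(other_pcts, reverse=True)[2]
    return min(tjc_pct, sjc_pct, third_best)


def two_sided_p(resp_a, n_a, resp_b, n_b):
    """Two-sided P value from a pooled two-proportion z-test (assumed test)."""
    pooled = (resp_a + resp_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if se == 0:
        return 1.0  # all or no patients responded in both arms
    z = (resp_a / n_a - resp_b / n_b) / se
    return math.erfc(abs(z) / math.sqrt(2))


def scan_cutoffs(active, control, step=5):
    """Return (cutoff, rate difference, P value) for cutoffs 0, 5, ..., 100."""
    rows = []
    for cutoff in range(0, 101, step):
        resp_a = sum(x >= cutoff for x in active)
        resp_c = sum(x >= cutoff for x in control)
        diff = resp_a / len(active) - resp_c / len(control)
        p = two_sided_p(resp_a, len(active), resp_c, len(control))
        rows.append((cutoff, diff, p))
    return rows


def most_discriminatory(rows):
    """Cutoff with the lowest P value and cutoff with the greatest difference."""
    by_p = min(rows, key=lambda r: r[2])[0]
    by_diff = max(rows, key=lambda r: r[1])[0]
    return by_p, by_diff


# Hypothetical per-patient ACR-N values for two small arms
active_arm = [72, 55, 40, 65, 20, 80, 35, 60, 15, 50]
control_arm = [30, 10, 45, 25, 5, 55, 20, 15, 0, 35]
best_by_p, best_by_diff = most_discriminatory(scan_cutoffs(active_arm, control_arm))
```

The same scan applies unchanged to percent improvements in any continuous index, because it only needs one improvement value per patient.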

Following similar reasoning, the percent improvement in SDAI, CDAI, and DAS28(CRP) scores (in 5% increments) with the most discriminatory ability between active treatments, or between active treatment and PBO, was identified. The percent improvement that corresponded to the lowest P value for the between-treatment difference was identified as having the most discriminatory ability. The percent improvement with the greatest treatment difference was also investigated. Subgroup analyses of CDAI response based on baseline CDAI and baseline DAS28(CRP) (≤ median vs > median) were also conducted to assess the impact of baseline disease activity on the most discriminatory cutoff.
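As a minimal illustration of what a percent-improvement cutoff means for a single patient, the sketch below computes CDAI and SDAI from their standard component sums and checks a patient against a cutoff. The function names and the patient values are hypothetical, and the component units (0–10 cm global assessments, CRP in mg/dL) are the conventional ones, assumed here because the text does not restate them.

```python
def cdai(tjc28, sjc28, pt_global_cm, ph_global_cm):
    """CDAI: tender and swollen 28-joint counts plus patient and
    physician global assessments (each on a 0-10 cm scale)."""
    return tjc28 + sjc28 + pt_global_cm + ph_global_cm


def sdai(tjc28, sjc28, pt_global_cm, ph_global_cm, crp_mg_dl):
    """SDAI: CDAI plus C-reactive protein in mg/dL."""
    return cdai(tjc28, sjc28, pt_global_cm, ph_global_cm) + crp_mg_dl


def percent_improvement(baseline, follow_up):
    """Percent improvement from baseline (positive means lower activity)."""
    if baseline <= 0:
        raise ValueError("baseline score must be positive")
    return 100.0 * (baseline - follow_up) / baseline


def meets_cutoff(baseline, follow_up, cutoff_pct):
    """True if the patient improved by at least cutoff_pct percent."""
    return percent_improvement(baseline, follow_up) >= cutoff_pct


# Hypothetical patient: baseline and follow-up component values
baseline_cdai = cdai(14, 10, 7, 6)   # 37
follow_up_cdai = cdai(4, 3, 2, 2)    # 11
responder_70 = meets_cutoff(baseline_cdai, follow_up_cdai, 70)
responder_85 = meets_cutoff(baseline_cdai, follow_up_cdai, 85)
```

In this hypothetical case the patient improves by about 70% and so counts as a responder at the CDAI 70% cutoff but not at CDAI 85%.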

Results

From the early RA studies, 268 (PREMIER) and 515 (OPTIMA) patients receiving newly initiated ADA+MTX and 257 (PREMIER) and 517 (OPTIMA) patients receiving newly initiated PBO+MTX were included in this analysis. From the established RA studies with continued background MTX, 207 (DE019) and 67 (ARMADA) patients receiving ADA+MTX and 200 (DE019) and 62 (ARMADA) patients receiving PBO+MTX were included in this analysis. Baseline characteristics were generally similar between the two treatment groups in both early RA and established RA patients, although differences were observed between studies (Table 1).

Table 1 Baseline characteristics

Identification of the most discriminatory ACR response

In patients with early RA from PREMIER who were initiating MTX, the lowest P value for the difference in ACR responses between the PBO+MTX arm and the ADA+MTX arm and the greatest treatment difference were achieved at ACR60 (Fig. 1a) at week 26. In patients with early RA from OPTIMA who were initiating MTX, the most discrimination at week 26 in terms of the lowest P value was achieved at ACR80, while similar absolute treatment differences were observed across ACR35–80 (Fig. 1b and Additional file 1: Table S1). The optimal ACR responses for both trials were lower at week 12 (ACR30 for PREMIER and ACR55–60 for OPTIMA; see Additional file 1: Figure S1). Conversely, for patients with established RA from DE019 and ARMADA who were continuing background MTX, lower ACR responses for both the lowest P value and the greatest treatment difference were more discriminatory when comparing PBO+MTX versus ADA+MTX: ACR35 and ACR30 for DE019 and ARMADA, respectively, at week 24 (Fig. 1c, d and Additional file 1: Table S1). At week 12, ACR10 was most sensitive to differences between treatments (see Additional file 1: Figure S1). When considering the standard ACR cutoff points, ACR50/70 were more discriminatory for early RA and ACR20 for established RA at week 24/26 in this analysis.

Fig. 1

ACR responses in patients with early RA from a PREMIER and b OPTIMA, or established RA from c DE019 and d ARMADA at week 24/26. P value for difference between response rates for patients treated with ADA+MTX and PBO+MTX. ADA, adalimumab; ACR, American College of Rheumatology; MTX, methotrexate; PBO, placebo; RA, rheumatoid arthritis

Identification of the most discriminatory CDAI and SDAI improvements

In early RA patients from PREMIER and OPTIMA, the most discrimination in terms of both the lowest P value and the greatest difference between the PBO+MTX arm and the ADA+MTX arm at week 26 was achieved with CDAI 80% and CDAI 70%, respectively (Fig. 2a, b and Additional file 1: Table S1). For both trials, the CDAI improvement with the most discrimination at week 12 was consistent with that for week 26 (CDAI 65–80% for PREMIER and CDAI 70–80% for OPTIMA; see Additional file 1: Figure S2). In patients with established RA from the DE019 trial, CDAI 55% had the most discriminatory ability in terms of both the lowest P value and the greatest treatment difference at week 24, when comparing PBO+MTX versus ADA+MTX (Fig. 2c and Additional file 1: Table S1). At week 12, however, the most discrimination was observed at CDAI 40% (see Additional file 1: Figure S2). In established RA patients from the ARMADA trial, the lowest P value comparing ADA+MTX versus PBO+MTX at week 24 was observed at CDAI 45%, while the greatest difference was observed at CDAI 60% (Fig. 2d and Additional file 1: Table S1). At week 12, the greatest difference and lowest P value between treatment arms were observed at CDAI 50%. When considering CDAI improvement cutoffs identified previously [9], CDAI 70% or 85% was more discriminatory for early RA and CDAI 50% for established RA at week 24/26.

Fig. 2

Percent change from baseline in CDAI in patients with early RA from a PREMIER and b OPTIMA, or established RA from c DE019 and d ARMADA at week 24/26. P value for difference between response rates for patients treated with ADA+MTX and PBO+MTX. ADA, adalimumab; CDAI, Clinical Disease Activity Index; MTX, methotrexate; PBO, placebo; RA, rheumatoid arthritis

In the CDAI subgroup analyses based on baseline CDAI and DAS28(CRP) subgroups (≤ median vs > median), the results were generally consistent with the overall population and between the subgroups (Additional file 1: Table S2).

Similar to the observations for CDAI, 75% improvement in SDAI had the most discriminatory ability for both the lowest P value and the greatest treatment difference between ADA+MTX and PBO+MTX treatment for patients in early RA trials at week 26 (Fig. 3a, b and Additional file 1: Table S1); at week 12, the highest discriminatory ability was between SDAI 70–80% for both trials. In established RA patients in DE019, SDAI 55% (Fig. 3c) had the most discriminatory ability for both the lowest P value and the greatest treatment difference at week 24. At week 24 in ARMADA, the lowest P value was observed at SDAI 40–45%, while the greatest difference was observed at SDAI 60% (Fig. 3d and Additional file 1: Table S1). At week 12, the lowest P value and the greatest difference between treatment arms were observed at SDAI 45–50% in both trials (see Additional file 1: Figure S3). Consistent with CDAI, when considering SDAI improvement cutoffs identified previously [9], SDAI 70% or 85% was more discriminatory for early RA and SDAI 50% for established RA at week 24/26.

Fig. 3

Percent change from baseline in SDAI in patients with early RA from a PREMIER and b OPTIMA, or established RA from c DE019 and d ARMADA at week 24/26. P value for difference between response rates for patients treated with ADA+MTX and PBO+MTX. ADA, adalimumab; MTX, methotrexate; PBO, placebo; RA, rheumatoid arthritis; SDAI, Simplified Disease Activity Index

Identification of the most discriminatory DAS28(CRP) improvements

For both trials in early RA patients, an improvement in DAS28(CRP) of 45% had the most discriminatory ability for both the lowest P value and the greatest difference between ADA+MTX and PBO+MTX at week 26 (Fig. 4a, b and Additional file 1: Table S1). Consistently, the most discriminatory percent improvement at week 12 was also between DAS28(CRP) 40–45% for both trials. At week 24, in patients with established RA, the most discrimination for both the lowest P value and the greatest treatment difference was observed at DAS28(CRP) 35% in DE019 and DAS28(CRP) 50% in ARMADA (Fig. 4c, d and Additional file 1: Table S1). For both trials at week 12, the most discriminatory percent improvements tended to be lower, between DAS28(CRP) 5–25% (see Additional file 1: Figure S4).

Fig. 4

Percent change from baseline in DAS28(CRP) in patients with early RA from a PREMIER and b OPTIMA, or established RA from c DE019 and d ARMADA at week 24/26. P value for difference between response rates for patients treated with ADA+MTX and PBO+MTX. ADA, adalimumab; DAS28(CRP), 28-joint Disease Activity Score based on C-reactive protein; MTX, methotrexate; PBO, placebo; RA, rheumatoid arthritis

Discussion

We performed a comprehensive analysis to identify the ACR response and percent improvement from baseline in commonly used disease activity measures that would provide the most discrimination between treatment arms in clinical trials. In patients with established RA with inadequate response to MTX, when comparing an active agent versus PBO (plus background MTX), lower ACR responses and smaller percent improvements from baseline in disease activity measures (CDAI, SDAI, and DAS28(CRP)) had better discriminatory ability. Conversely, in patients with early RA, when comparing two active agents, higher ACR responses and greater percent improvements from baseline in disease activity measures had better discriminatory ability. This suggests that the best separation is by depth of response, because two active medications are likely to perform similarly in terms of ACR20 response, whereas the true difference in efficacy should be observed with higher ACR responses. Our results are consistent with an earlier report, which demonstrated that greater SDAI improvements and higher ACR responses were more discriminatory in an early RA population, while smaller SDAI improvements and lower ACR responses showed better discrimination in an established RA population [10]. We observed greater consistency in the discriminatory performance of CDAI and SDAI improvements as compared with that of ACR scores and DAS28(CRP) across trials and time points with respect to a similar disease population (Additional file 1: Table S1).

Our findings confirm the original conclusions of Felson et al., who demonstrated that the ACR20 response is superior in differentiating an active treatment from a PBO [2]. The ACR20 response is the “gold standard” in RA clinical trials to differentiate active from PBO response or to compare different treatments [1]. In several recent trials, a high ACR20 response was observed with PBO as an add-on to background MTX in patients with active RA, which may be due to differences in patient populations studied (e.g., geographic regions), differences in pre-study therapy, enhanced clinical management compared with standard of care, or differences in therapies studied, rather than a consequence of using the measure itself [9, 11,12,13].

Our analysis included both early and established RA populations treated with ADA and MTX or PBO and MTX. The patients in the early RA trials were MTX-naïve and were initiating MTX, so the comparison was between two active treatments. In contrast, the established RA trials included patients with active RA despite MTX treatment who were initiating ADA or PBO on background MTX. This most likely contributed to the different findings between these two populations in the most discriminatory response cutoffs and improvements. The small difference between OPTIMA and PREMIER may relate to the longer mean disease duration in PREMIER.

In general, greater improvements became more discriminatory after 6 months (24/26 weeks) of treatment compared with 12 weeks of treatment, indicating that the depth of treatment response improved over time. As expected, the improvements in CDAI and SDAI with the optimal discriminatory ability between treatments were very similar. The most discriminatory improvements in DAS28(CRP) for early RA tended to be much lower (45%) than the CDAI and SDAI improvements (70–80%). This is consistent with the weighting and transformation associated with the DAS28(CRP) formula, which results in a lack of linearity with increasing activity. Importantly, the optimally discriminatory improvements in SDAI and CDAI tended to be more consistent between trials in the same type of patients than the optimal ACR scores (e.g., SDAI 75% improvement for both OPTIMA and PREMIER at week 26 vs ACR60 and ACR80, respectively). In addition, at early time points (12 weeks), the optimally discriminatory ACR improvement identified for established RA (ACR10) may not be as clinically meaningful as SDAI and CDAI improvements, which tended to be higher.
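The effect of the DAS28(CRP) weighting and transformation can be illustrated with a single hypothetical patient. The sketch below uses the commonly cited DAS28(CRP) formula (square-root-transformed joint counts, log-transformed CRP in mg/L, patient global health on a 0–100 mm scale, plus a constant), which is not reproduced in the text above and is assumed here. The patient values are invented for illustration and are not trial data.

```python
import math


def das28_crp(tjc28, sjc28, crp_mg_l, pt_global_mm):
    """Commonly cited DAS28(CRP) formula (assumed, not taken from the text)."""
    return (0.56 * math.sqrt(tjc28)
            + 0.28 * math.sqrt(sjc28)
            + 0.36 * math.log(crp_mg_l + 1)
            + 0.014 * pt_global_mm
            + 0.96)


def cdai(tjc28, sjc28, pt_global_cm, ph_global_cm):
    """CDAI: untransformed sum of joint counts and global assessments."""
    return tjc28 + sjc28 + pt_global_cm + ph_global_cm


def percent_improvement(baseline, follow_up):
    """Percent improvement from baseline."""
    return 100.0 * (baseline - follow_up) / baseline


# Hypothetical patient: same clinical change scored on both indices
das_base = das28_crp(tjc28=16, sjc28=9, crp_mg_l=20, pt_global_mm=70)
das_follow = das28_crp(tjc28=4, sjc28=1, crp_mg_l=5, pt_global_mm=20)
cdai_base = cdai(16, 9, 7.0, 6.0)
cdai_follow = cdai(4, 1, 2.0, 1.5)

das_pct = percent_improvement(das_base, das_follow)
cdai_pct = percent_improvement(cdai_base, cdai_follow)
```

For this invented patient the same clinical change corresponds to roughly a 46% improvement in DAS28(CRP) but roughly a 78% improvement in CDAI. The square-root and logarithmic terms compress large changes, and the additive constant of 0.96 means the score cannot fall to zero, so percent improvements in DAS28(CRP) are bounded well below 100%.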

Limitations of this analysis include the lack of data to compare two active treatments in a head-to-head study of bDMARDs or csDMARDs in patients with established RA. However, this was done in the recent ORAL Strategy trial, which was a head-to-head comparison of tofacitinib monotherapy, tofacitinib+MTX, and ADA+MTX in patients with active RA despite MTX therapy and used ACR50 as the primary endpoint [14]. Additionally, adalimumab has been compared with other bDMARDs in head-to-head trials with similar results [15,16,17], suggesting that the data are generalizable. Moreover, the results are further indirectly supported by the fact that agents that directly interfere with interleukin-6 pathways, and thus reduce CRP or ESR irrespective of clinical improvement, exaggerate DAS28 responses because of the high weight of CRP/ESR in the DAS28 formula [18, 19].

Conclusions

In conclusion, our post hoc analysis suggests that different optimal ACR responses or improvements in disease activity measures may have to be used in trials in early and established RA patients, or when comparing a drug with an active or a PBO comparator. Moreover, although the ACR scores and DAS28(CRP) are commonly used, they did not perform as consistently for discriminatory purposes as more recently developed measures, such as SDAI and CDAI. These measures may therefore be considered for future trials. Finding a consensus for the use of these different response criteria may be a task for the future.