Key Summary Points

Why carry out this study?

The introduction of new treatments and a treat-to-target strategy for rheumatoid arthritis (RA) in the last 20 years should have resulted in changes to the characteristics of patients with RA participating in clinical trials of the newest therapies.

This is important as patient characteristics may influence patients’ response to drug treatment.

To determine whether characteristics of patients with RA have changed over time, baseline characteristics were compared between 22 clinical trials published in 1999–2009 and 18 published in 2010–2017.

What was learned from the study?

No significant difference between the two timeframes and no obvious trend over time were observed for any baseline characteristic of patients with RA, including physician and patient assessments of disease activity, and Health Assessment Questionnaire-Disability Index and pain scores.

The baseline characteristics of patients with RA participating in clinical trials do not appear to have changed in the last decade; further research is needed to determine the impact of baseline patient characteristics on patients’ response to RA treatments.

Introduction

Rheumatoid arthritis (RA) is a chronic, autoimmune, inflammatory arthritis associated with pain, disability and an increased risk of mortality [1,2,3]. Globally, the prevalence of the disease is in the range 0.5–1.1% [4, 5]. Conventional synthetic disease-modifying antirheumatic drugs (csDMARDs), such as methotrexate, leflunomide and sulfasalazine, provided the standard of care for RA for many years. However, the choice of treatments for the disease has expanded over the last 20 years with the introduction of biologic DMARDs (bDMARDs), such as abatacept, adalimumab, certolizumab pegol, etanercept, golimumab, infliximab, rituximab, sarilumab and tocilizumab, and, more recently, targeted synthetic (ts)DMARDs, such as baricitinib and tofacitinib.

Guidelines for the management of RA from the European League Against Rheumatism (EULAR) and the American College of Rheumatology (ACR) recommend early use of DMARDs, with the goal of remission or low disease activity [6, 7]. This treat-to-target approach, first introduced in 2010 [8], has improved the care and outcomes of patients with RA [9, 10]. However, several studies suggest that many patients with RA are not reaping the benefits of this treat-to-target approach or of newer DMARDs because these drugs are underused. A review of data from the US National Ambulatory Medical Care Survey over a 12-year period (1996–2007) showed that only 47% of visits to a physician for RA were associated with a DMARD prescription [11], while a later retrospective review of data over a 3-year period (2005–2008) showed that only 63% of 93,143 Medicare enrollees with RA in the USA received a DMARD. DMARD prescribing varied with demographic factors, socioeconomic status and geographic location. Cited reasons for the low proportion of DMARD prescriptions included low income, for-profit health plans, patients refusing treatment and increased comorbidities in the elderly resulting in contraindications to available drugs [12].

A systematic review of 127 studies of patients with RA highlighted significant differences in certain patient characteristics, including age, disease duration, number of DMARDs previously used and Disease Activity Score for 28-joint count (DAS28), between patients enrolled in randomised controlled clinical trials (RCTs) and those enrolled in registries [13]. In addition, baseline DAS28 and Health Assessment Questionnaire-Disability Index (HAQ-DI) scores for patients with RA prescribed bDMARDs (etanercept, rituximab or tocilizumab) decreased over time for RCTs and observational studies published between 2004 and 2014 [13], probably owing to the introduction of the treat-to-target strategy during this period [14]. This trend was consistent with the results of an observational study conducted in the Netherlands, in which symptom duration and inflammatory activity at presentation were found to have decreased over a 23-year period (1993–2015) among patients with RA attending a local rheumatology department. Paradoxically, however, patient-reported outcomes (pain, fatigue, disease activity and global health) worsened over this time period, possibly as a result of increased societal and patient expectations [10].

Given the change in treatment strategy and the introduction of new DMARDs with different mechanisms of action in recent years (e.g. the Janus kinase inhibitors baricitinib and tofacitinib, and the interleukin-6 inhibitor sarilumab), the characteristics of patients participating in RA clinical trials might be expected to have changed over time. This is important as patient characteristics may influence the effects of drug treatment [13]. According to data from the systematic literature reviews (SLRs) conducted by Kilcher et al. [13], the characteristics of patients with RA enrolled in RCTs who show an inadequate response to csDMARDs might be expected to change over time, and this may influence the rate of response to new treatments as well as making it difficult to compare drugs evaluated during different time periods. To date, however, no such analysis has been published. To address this important issue and inform the design of future RCTs evaluating new RA treatments, we conducted a study to determine whether the baseline characteristics of patients with RA with an inadequate response to csDMARDs participating in RCTs between 1999 and 2017 have changed over time.

Methods

Objective

This study aimed to compare the characteristics of patients with an inadequate response to csDMARDs participating in RA RCTs between two different, predefined timeframes: an earlier timeframe (1999–2009), when bDMARDs were first introduced, and a later timeframe (2010–2017), after the introduction of treatments with different mechanisms of action and adoption of the treat-to-target strategy.

Systematic Literature Review

This secondary analysis was based on the results of a previously conducted SLR [15]. The SLR aimed to identify evidence for the efficacy and safety of treatments for moderately to severely active RA in adults. Searches of Medline, Medline in Process, Embase, Biosciences Information Service and the Cochrane Library were performed to identify RCTs published between 1 January 1999 and 11 December 2017 using search terms related to RA, associated interventions and RCTs (Table S1 in the supplementary material). The original searches were performed on 17 June 2015 and updated searches were performed on 10 August 2016 and 11 December 2017. There were no language limits on the database searches. Conference abstracts (2013–2017), grey literature and the bibliographies of key articles were also reviewed. Data were extracted from relevant full-text publications and quality-checked by an independent reviewer. The quality of each study was also assessed using National Institute for Health and Care Excellence (NICE) guidelines [16]. The SLR was conducted according to Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines [17].

Current Analysis

The analysis reported here focused on studies from the SLR involving patients with an inadequate response to csDMARDs, including methotrexate, hydroxychloroquine, leflunomide and sulfasalazine. As such, the main analysis study pool only included studies that did not allow prior bDMARD use. The following baseline patient characteristics of interest extracted from the studies were analysed: age, gender, disease duration, rheumatoid factor status, DAS28 based on erythrocyte sedimentation rate (DAS28-ESR) or C-reactive protein (DAS28-CRP), HAQ-DI score, swollen joint count (SJC), tender joint count (TJC), Physician’s Global Assessment of Disease Activity (PGA), Patient’s Global Assessment of Disease Activity (PtGA), pain visual analogue scale (VAS) score, previous use of non-steroidal anti-inflammatory drugs (NSAIDs) and number of previously used csDMARDs.

Additional baseline characteristics that were extracted from the studies but could not be analysed because of lack of or limited data availability included weight, comorbidities, smoking status, Clinical Disease Activity Index (CDAI) and Simplified Disease Activity Index (SDAI) scores, Routine Assessment of Patient Index Data-3 (RAPID-3) score, erosion score, joint space narrowing score, van der Heijde modified total Sharp score (mTSS) and EuroQoL-5 Dimensions (EQ-5D) questionnaire score.

The analysis reported in this article is based on previously conducted studies and does not involve any studies with human participants or animals performed by any of the authors.

Statistical Analysis

If not already presented as such, baseline patient characteristics from different treatment arms within a given study were combined into an overall study result. Where necessary, medians and ranges from each study were converted to means and standard deviations using the methods of Wan et al. [18]. For binary data, the variance estimate (v) for proportions (p), v = p(1 − p), was used to derive the standard error (i.e. √(v/n) for a study of n patients). The results from the various studies were compared between the two study publication timeframes (1999–2009 vs 2010–2017) using random-effects meta-analyses. Between-study variance was estimated using restricted maximum likelihood and was presented as an I² value, the proportion of the observed variance that reflects real differences between studies rather than random error (higher percentages indicating greater heterogeneity), and as a Tau² value, the between-study variance itself (higher values indicating greater variation). The level of significance was taken as p ≤ 0.05 (unadjusted for multiple testing). Missing data were not imputed for this analysis. Forest plots were generated for all analyses, with study results ordered by year of publication to provide a visual display of potential changes over time. Analyses were conducted using SAS version 9.4, R studio version 3.4, and metafor package version 2.0 [19].
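
The data-preparation and pooling steps described above can be illustrated with a minimal Python sketch. It is not the authors' code (the analysis was run in SAS and R with metafor), and several choices are assumptions made for illustration only: all function names are hypothetical; the median-to-mean conversion shown is the minimum/median/maximum approximation from Wan et al. and other scenarios in that paper are not covered; the between-study variance uses the closed-form DerSimonian-Laird estimator in place of restricted maximum likelihood, because it needs no iterative fitting; and the timeframe comparison is shown as a simple z-test on two pooled means, which may differ from the model actually fitted.

```python
import math
from statistics import NormalDist


def combine_arms(arms):
    """Pool per-arm (n, mean, sd) tuples into one study-level (n, mean, sd)."""
    n_total = sum(n for n, _, _ in arms)
    mean = sum(n * m for n, m, _ in arms) / n_total
    # Total sum of squares = within-arm part + between-arm part.
    ss = sum((n - 1) * sd ** 2 + n * (m - mean) ** 2 for n, m, sd in arms)
    return n_total, mean, math.sqrt(ss / (n_total - 1))


def median_range_to_mean_sd(minimum, median, maximum, n):
    """Approximate mean and SD from min, median, max and sample size."""
    mean = (minimum + 2 * median + maximum) / 4
    z = NormalDist().inv_cdf((n - 0.375) / (n + 0.25))
    return mean, (maximum - minimum) / (2 * z)


def proportion_se(p, n):
    """Standard error of a proportion from v = p(1 - p) and sample size n."""
    v = p * (1 - p)
    return math.sqrt(v / n)


def random_effects(means, ses):
    """DerSimonian-Laird pooling: returns (pooled mean, SE, tau^2, I^2 in %)."""
    w = [1 / se ** 2 for se in ses]
    fixed = sum(wi * yi for wi, yi in zip(w, means)) / sum(w)
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, means))
    df = len(means) - 1
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - df) / c)
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    w_star = [1 / (se ** 2 + tau2) for se in ses]
    pooled = sum(wi * yi for wi, yi in zip(w_star, means)) / sum(w_star)
    return pooled, math.sqrt(1 / sum(w_star)), tau2, i2


def compare_timeframes(mean_a, se_a, mean_b, se_b):
    """Two-sided p value for the difference between two pooled means."""
    z = (mean_a - mean_b) / math.sqrt(se_a ** 2 + se_b ** 2)
    return 2 * (1 - NormalDist().cdf(abs(z)))
```

In such a sketch, each study would contribute its pooled baseline mean with standard error sd/√n, `random_effects` would be run once per timeframe, and the two pooled results would be passed to `compare_timeframes`.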

Sensitivity Analyses

Two sensitivity analyses were performed to assess the effect of changing the patient population on results. One sensitivity analysis excluded Asia–Pacific studies, since patients from these countries (mainly Japanese) were likely to have been treated with a low dose of methotrexate (< 7.5 mg/week), which could potentially impact the extent of methotrexate failure among trial populations. In addition to studies included in the main analysis, the second sensitivity analysis included studies that allowed prior bDMARD use in up to 20% of patients, since patients receiving bDMARDs were more likely to be further advanced in the RA treatment algorithm. The value of 20% was selected as many studies allowing some prior use of bDMARDs cited this value as the cut-off.
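
As an illustration of how the three study pools relate to one another, the selection rules above can be written as simple filters. This is a hedged sketch only: the `Study` record, its field names and the example entries are hypothetical and are not taken from the SLR dataset.

```python
from dataclasses import dataclass


@dataclass
class Study:
    name: str                # hypothetical study label
    year: int                # year of publication
    asia_pacific: bool       # True if conducted in Asia-Pacific countries
    prior_bdmard_pct: float  # % of patients allowed prior bDMARD use


def timeframe(study):
    """Assign a study to one of the two predefined publication timeframes."""
    if 1999 <= study.year <= 2009:
        return "earlier"
    if 2010 <= study.year <= 2017:
        return "later"
    raise ValueError(f"{study.name}: publication year outside 1999-2017")


def main_pool(studies):
    """Main analysis: only studies that did not allow prior bDMARD use."""
    return [s for s in studies if s.prior_bdmard_pct == 0]


def without_asia_pacific(studies):
    """Sensitivity analysis 1: main pool minus Asia-Pacific studies."""
    return [s for s in main_pool(studies) if not s.asia_pacific]


def with_limited_bdmard(studies, cutoff=20.0):
    """Sensitivity analysis 2: main pool plus studies up to the cut-off."""
    return [s for s in studies if s.prior_bdmard_pct <= cutoff]


# Hypothetical example records, for illustration only.
studies = [
    Study("A", 2003, False, 0),
    Study("B", 2012, True, 0),
    Study("C", 2015, False, 15),
    Study("D", 2016, False, 40),
]
```

With these example records, the main pool contains studies A and B, the first sensitivity pool contains only A, and the second contains A, B and C; study D exceeds the 20% cut-off and is excluded throughout.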

Results

Study Numbers

A total of 147 primary studies including patients with an inadequate response to csDMARDs were identified in the SLR, of which 94 were excluded (Fig. 1). Of the remaining 53 studies, 13 allowed prior use of bDMARDs in up to 20% of patients and were consequently excluded from the main analysis. Thus, the main analysis included a total of 40 studies. Of these, 22 were published in the earlier timeframe and 18 in the later timeframe. The 13 studies allowing prior use of bDMARDs in up to 20% of patients were combined with the 40 studies from the main analysis into a sensitivity analysis; details of these 53 studies are provided in Table S2 in the supplementary material [20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51,52,53,54,55,56,57,58,59,60,61,62,63,64,65,66,67,68,69,70,71,72]. Treatments evaluated in the studies included csDMARDs (hydroxychloroquine, methotrexate, sulfasalazine), bDMARDs (abatacept, adalimumab, certolizumab pegol, etanercept, golimumab, infliximab, rituximab, sarilumab, tocilizumab), and tsDMARDs (baricitinib, tofacitinib). Not all identified studies were included in each specific characteristic analysis as some did not report relevant information or reported data in a manner that did not allow data to be converted (e.g. to means and standard deviations).

Fig. 1

PRISMA diagram for study identification. bDMARD, biologic DMARD; csDMARD, conventional synthetic DMARD; DMARD, disease-modifying antirheumatic drug; MTX, methotrexate

Main Analysis

There was no statistically significant difference between the two timeframes and no obvious trend over time for age, gender, disease duration (mean difference between timeframes approximately 1 year), rheumatoid factor status, PGA, PtGA and pain VAS scores (Figs. 2 and S1). However, there was high variability (heterogeneity) between studies. Similarly, there was no statistically significant difference between timeframes and no obvious trend over time for DAS28-ESR, DAS28-CRP and HAQ-DI scores; in these cases, heterogeneity was low (Fig. 3). SJC (score range 0–66) and TJC (score range 0–68) were not significantly different between timeframes and there was no obvious trend over time; heterogeneity was again high. Different inclusion criteria were observed between studies for the minimum number of swollen or tender joints. We therefore conducted further analyses focusing on the minimum number of swollen or tender joints used as the inclusion criterion in each study, and on the most common inclusion criteria for minimum number of swollen and tender joints (≥ 6 for SJC, ≥ 6 and ≥ 8 for TJC). However, this did not materially change the results (Figs. 4 and S2).

Fig. 2

Forest plots showing means and 95% confidence intervals (CI; box and whisker plots) for the different studies according to year of publication for (a) Physician’s Global Assessment of Disease Activity score, (b) Patient’s Global Assessment of Disease Activity score and (c) Pain Visual Analogue Scale score. If studies reported pain on a scale of 0–10, values were multiplied by 10. I² and Tau² values indicate a high degree of heterogeneity between studies, while p values indicate no significant difference in mean values between studies published from 1999 to 2009 (studies above dashed line) and those published from 2010 to 2017 (studies below dashed line). Diamond shapes indicate 95% confidence intervals around the means summarised by timeframe and overall, respectively

Fig. 3

Forest plots showing means and 95% confidence intervals (CI; box and whisker plots) for the different studies according to year of publication for (a) Disease Activity Score for 28-joint count (DAS28) based on erythrocyte sedimentation rate, (b) DAS28 based on C-reactive protein and (c) Health Assessment Questionnaire-Disability Index score. Diamond shapes indicate 95% confidence intervals around the means summarised by timeframe and overall, respectively. I² and Tau² values indicate a low degree of heterogeneity between studies, while p values indicate no significant difference in mean scores between studies published from 1999 to 2009 (studies above dashed line) and those published from 2010 to 2017 (studies below dashed line)

Fig. 4

Forest plots showing means and 95% confidence intervals (CI; box and whisker plots) for the different studies according to year of publication for studies providing data on (a) swollen joint count (score range 0–66) and (b) tender joint count (score range 0–68). Diamond shapes indicate 95% confidence intervals around the means summarised by timeframe and overall, respectively. I² and Tau² values indicate a high degree of heterogeneity between studies, while p values indicate no significant difference in mean scores between studies published from 1999 to 2009 (studies above dashed line) and those published from 2010 to 2017 (studies below dashed line)

Information on the number of csDMARDs previously used was not reported in a uniform manner, with very few studies (n = 3 in 1999–2009; n = 5 in 2010–2017) providing means and standard deviations. For the studies that did, a statistically significant difference between timeframes was observed with low to moderate heterogeneity (I² 33%, Tau² 0.06, p < 0.001; Fig. S3). The remaining studies either did not report this information or reported categories, which could not be converted into means and standard deviations. For prior NSAID exposure, most studies (24 out of 28; 86%) reported NSAID use in 100% of patients; therefore, no meta-analysis was run.

Sensitivity Analyses

The sensitivity analysis excluding Asia–Pacific studies included 28 studies: 16 from the earlier timeframe and 12 from the later timeframe. The results of this analysis were consistent with those of the main analysis (Fig. S4).

The sensitivity analysis allowing prior use of bDMARDs in up to 20% of patients included 53 studies: 26 from the earlier timeframe and 27 from the later timeframe. Again, the results of this analysis were consistent with those of the main analysis (Fig. S5).

Discussion

With the introduction of the treat-to-target strategy in RA and the availability of new treatments, the inclusion criteria for RA clinical trials might be expected to have changed in the past decade. However, the results of our analysis suggest that this is not the case: the characteristics of patients participating in recent clinical trials do not appear to have changed compared with those of patients participating in RCTs 10–20 years ago.

Patient characteristics are an important consideration in clinical trials as they may influence the effects of treatment. For example, older age is associated with decreased response rates in patients treated with etanercept or tocilizumab [73, 74], while male sex, being rheumatoid factor-positive, having a low HAQ-DI score and being a non-smoker predict a better response to various bDMARDs [73, 75, 76]. In addition, a study comparing the baseline characteristics of patients with RA between RCTs and observational studies showed that patients participating in RCTs had better prognostic factors than those participating in observational studies, which could result in overestimation of the treatment effect [13].

In this analysis, the lack of a change in DAS28 and HAQ-DI scores at baseline between the two timeframes was surprising given that real-world data from registries or observational studies suggest that baseline disease activity among patients with RA has decreased over time [77,78,79]. In addition, the aforementioned study by Kilcher et al. [13] comparing the baseline characteristics of patients with RA between RCTs and observational studies showed that baseline DAS28, HAQ-DI, ESR and CRP significantly decreased in patients participating in RCTs over the time period 1999–2015. An SLR of patients with RA receiving anti-tumour necrosis factor treatment in clinical trials over a 16-year period (1993–2008) showed a similar decrease in baseline CRP over time among patients previously treated with methotrexate but not among those with no experience of this drug [80]. However, the current analysis suggests that baseline disease activity among patients with RA participating in RCTs has not decreased over time, possibly because patients participating in RCTs tend to have more severe disease at baseline than those in routine clinical practice [13, 81]. The lack of change in disease activity in the current analysis may also be due to barriers in adopting the treat-to-target strategy in clinical practice compared with clinical trials, such as a lack of physician understanding of this treatment strategy, the feeling that the disease activity score may be falsely high due to symptoms or inflammation unrelated to RA, physician resistance to algorithm-based treatment, or a lack of time at clinic visits [82,83,84]. The lack of change in baseline disease activity over time likely reflects a lack of change in the strict inclusion criteria for patients participating in an RCT. In view of this, different methodology to that used in our analysis and measurement of different clinical/laboratory parameters might be necessary to detect any change over time in patient characteristics; this could be a subject for future research.

This analysis suggests that the mean number of previously used csDMARDs in patients participating in RCTs has decreased over time, possibly reflecting adoption of the treat-to-target approach. However, this result was based on only a few studies and was not observed in the sensitivity analysis that included studies allowing prior bDMARD use in up to 20% of patients. This suggests that it was either a ‘chance finding’ based on a small number of studies or that the inclusion of patients with prior bDMARD use corresponded to the inclusion of patients with more severe disease and hence a higher number of previously used csDMARDs, which would have diluted the effect over time. Very few studies reported the mean number of previously used csDMARDs: most reported previous csDMARD use as the percentage of patients using a certain number (e.g. 1, ≥ 1, ≥ 2, etc.), which could not be used in the current analysis. Thus, no definitive conclusions about previous csDMARD use can be drawn.

RCTs are very different from real-world clinical practice in that patients who are eligible for clinical trials generally have more severe disease [81]. Results of the current analysis suggest that patients currently being enrolled in RA RCTs are not receiving the correct treatment before commencement of the study. Although treat-to-target is the recommended approach for the management of RA [6, 7], RCTs are still being performed in which none of the participating patients have been treated accordingly. This raises the question of whether RCTs are as informative as they were 15 years ago. In future, it would be interesting to design RCTs that include patients who have been treated according to treat-to-target recommendations.

To our knowledge, this is the first study to investigate changes over time in the characteristics of patients enrolled in RCTs of RA treatments. The main strength of this analysis is that it was based on a comprehensive SLR. However, it should be noted that transformation of median into mean values can introduce bias if the data summarised by a median value are not normally distributed [13]. As it was not the aim of this analysis to evaluate outcomes, the risk of bias was not assessed. Finally, although a highly sensitive search strategy with no language or geographic limits was used, it cannot be guaranteed that all relevant studies have been included.

Conclusion

The results of this analysis suggest that the characteristics of patients included in current RA clinical trials do not differ from those of patients included in trials for testing the first bDMARDs 20 years ago, despite current recommendations for a treat-to-target strategy. Further research is needed to determine the impact of patient characteristics on patients’ response to RA treatments in clinical trials. It also appears that patients currently being enrolled in RCTs of RA treatments are not being treated according to a treat-to-target strategy before the start of the study. Future RCTs of RA treatments should include patients who have been treated using this strategy if RCTs are to remain informative and impactful.