Background

Remission has become a key treatment goal in rheumatoid arthritis (RA). Achieving remission with drug treatment is recommended in many clinical management guidelines [1,2,3,4,5,6]. It is also a central feature of the “treat-to-target” initiative [7, 8]. Patients who achieve remission have less disability and better quality of life than those with persisting inflammatory disease [9]. In early RA remission is particularly important due to the ‘window of opportunity’ during which early intensive treatment can halt or substantially reduce subsequent disease progression [10].

There are several definitions of remission in RA. The 2010 European League Against Rheumatism (EULAR) and American College of Rheumatology (ACR) criteria provided a framework for considering these different definitions [11]. A variety of composite measures are used to determine the presence of remission. These include the Disease Activity Score (DAS) and the Disease Activity Score for 28 joints (DAS28), the Simple Disease Activity Score (SDAI) and the Clinical Disease Activity Score (CDAI) [12,13,14]. DAS28 remission criteria have been used most frequently in trials of intensive treatments in RA, though there has been debate whether it is ideal [15].

Several systematic reviews have reported on treatment remissions in RA [16,17,18,19,20], patients likely to achieve remission [21, 22] and the strength of the rationale for treatment to target approaches in RA [23, 24]. The balance of evidence from these reviews is that intensive treatment increases remission. However, several uncertainties need to be resolved. Firstly, the relative merits of intensive treatment in early RA compared to established disease need to be considered. Secondly, it is important to know whether treatment with one type of therapy, such as biologics like tumour necrosis factor (TNF) inhibitors, will lead to more remissions than treatment with combinations of conventional disease modifying anti-rheumatic drugs (DMARDs) Finally it is important to know if one or other treatment strategy is preferable in early or established disease.

We have systematically reviewed RA clinical trials that report remissions. We evaluated both trials that compare an intensive treatment strategy with standard care and also head-to-head trials of different intensive treatment strategies. We analysed trials in early and established RA separately, taking the division between these groups as usually being 12 months since diagnosis.

Methods

Inclusion and exclusion criteria

The inclusion criteria were: randomized controlled trials or open label non-randomised comparative studies with at least one intensive treatment arm and one control arm; adult patients with RA; studies of at least 6 months duration; studies enrolling at least 50 patients; studies reporting remissions; studies using treatments in their licensed indication for RA. The intensive treatment arms used drugs considered more intensive than DMARD monotherapy. These included combination DMARDs (which could involve using short-term regular doses of steroids to control synovitis), TNF inhibitors, non-TNF biologics (tocilizumab, abatacept and rituximab), and Janus Kinase (JAK) inhibitors. We also noted whether studies used a treat-to-target approach with intensive treatments. Studies either compared one intensive treatment strategy against standard care or two different intensive treatment strategies (such as combination DMARDs and TNF inhibitors with DMARDs). Foreign language papers and published conference abstracts were excluded. Trials comparing similar types of treatment, such as two intensive DMARD regimens, were also excluded. The search identified publications from 1st January 2000 to 30th April 2017.

Search strategy

A systematic literature search was carried out using EMBASE, OVID Medline as well as hand searching the systematic reviews relevant to this topic found in the Cochrane library database. The key word search terms used were ‘arthritis, rheumatoid’ (MeSH), ‘clinical trial’ [Publication Type] (MeSH), randomised controlled trial [Publication Type] (MeSH), open label (free text) and ‘remission’ (free text). These were searched separately and in combination. The EMBASE search terms included ‘arthritis, rheumatoid’ (MeSH) all subheadings and FOCUS function, clinical trial (MeSH) Explode function.

Data collection

Two researchers (CH, DLS) independently assessed studies for eligibility and extracted data. This included year of publication, disease duration, number of treatment groups, study design, control and intensive treatment regimens, study size, remissions and study end-points. The numbers of patients achieving disease remission at the trial end-point was defined by Disease Activity Scores (DAS) < 1.6, DAS28 < 2.6 or equivalent criteria. The trials were classified as early (generally with disease durations < 1 year) or established (generally with disease durations > 1 year) reflecting the trial investigators assessments. When there were differences between assessors, they reviewed the reports together and came to a joint conclusion.

Assessing Bias

A quality assessment was completed for each paper using the Cochrane Collaboration tool for assessing risk of bias [25]. The types of bias assessed were: random sequence generation, selection bias, performance bias, detection bias, attrition bias, reporting bias and other bias (such as pharmaceutical funding). The risk was defined as low or high. We also used funnel plots to assess publication bias and associated issues [26].

Statistical analysis

Results were analysed using Review Manager 5.3 (Cochrane Collaboration, Oxford, UK). The random effects model based on DerSimonian and Laird’s method [27] was used to estimate the pooled effect sizes; this gives more equal weighting to studies of different precision in comparison with a simple inverse variance weighted approach, so accommodating between study heterogeneity. For all meta-analyses, we performed Cochrane’s chi-squared test to assess between study heterogeneity and quantified I2 statistics [25]. P-values < 0.05 were considered significant.

Some of the randomised controlled trials had more than two treatment arms: when there were two control groups the results were combined; when there were two or more intensive treatment groups only those reporting licensed dosage regimens were included.

Results

Study selection

We identified 928 publications: 440 were duplicated studies and 414 were excluded after reviewing abstracts. Seventy four full text papers were reviewed in detail; 21 were excluded and 53 selected for inclusion (Fig. 1). These papers comprised 48 superiority trials, in which an intensive treatment strategy was compared with a less intensive strategy, and 6 head-to-head trials comparing combination DMARDs with biologic treatments. The BeST paper is included in both of these groups.

Fig. 1
figure 1

PRISMA Diagram Outlining Search Strategy

Characteristics of included studies

Twenty two superiority trials evaluated patients reported as having early RA. Their maximum disease durations ranged from 3 months to 3 years. Mean or median disease durations, reported in 20 of these trials, ranged from 1 to 11 months (mean 6 months). Four of these trials studied patients with very early disease, less than 6 months from diagnosis. One trial had two different intensive treatment arms (combination DMARDs and biologics) which were both included. Six trials had two or three intensive treatment arms: in three trials biologic monotherapy treatment arms were omitted; in another three trials only licensed combination regimens were included.

Twenty six superiority trials evaluated patients with established RA. Six of these trials specified maximum disease durations (from 5 to 20 years). Mean or median disease durations, reported in all of these trials, ranged from 1 to 12 years (mean 8 years). One trial had two control groups (methotrexate or sulfasalazine monotherapy) and these were combined. Sixteen trials had two or more intensive treatment arms: three had two different licensed intensive treatments (biologics and JAK inhibitors) which were both included; in one trial the biologic monotherapy treatment arm was omitted; in a further 12 trials only licensed combination regimens were included.

Overall 19,752 RA patients were studied: 7300 in early RA and 12,452 with established RA (Table 1). There were 46 conventional RCTs, one was open label and one quasi-experimental. Twenty four trials had 2-arms, 17 had 3-arms and 7 had over three arms. The trials often reported outcomes at several different time-points, but their primary outcomes were reported at 6 months in 21 trials, at 12 months in 19 trials and at longer intervals in 8 trials (2 at 18 months, 5 at 24 months and 1 at 36 months).

Table 1 Details of Studies with Control Groups

DAS28 remissions (DAS28 < 2.6) were reported in 38/48 superiority trials and 4/6 head-to-head trials. DAS remissions (DAS < 1.6) were reported in 5/48 superiority trials and 2/6 head-to-head trials. Five superiority trials reported other remissions (using SDAI in 3 and unique study-specific criteria in 2). In addition, 12 superiority trials reported some or all of the new EULAR/ACR remission criteria.

Treat-to-target strategies were included within 8/48 superiority trials and 3/6 head-to-head trials, though there were substantial differences in how these strategies were delivered.

Remission in superiority trials

Overall in the 48 trials 3013/11,259 patients achieved remission with intensive treatment compared with 1211/8493 patients receiving non-intensive therapy (Table 2). Analysis of the 53 comparisons in these trials using the random effects relative risk model showed there was a highly significant benefit for intensive treatment (RR 2.23; 95% CI 1.90, 2.61). There was marked heterogeneity between studies; I2 was 84%.

Table 2 Effectiveness In Superiority Trials Assessed By Random Risk Ratio and Heterogeneity

In the 38 trials (40 comparisons) reporting DAS28 remissions the random risk ratio was 2.26 (95% CI 1.89, 2.71); in the 10 trials (12 comparisons) reporting other remission criteria the random risk ratio was 2.13 (95% CI 1.53, 2.98). The random risk ratios showed significant effects with trials of 6 months, 12 months and longer durations. Although the random ratio was somewhat higher in trials of 6 months duration, 17/21 trials (20/24 comparisons) were in established RA and in these the random risk ratio was 4.82 (95% CI 2.85, 8.13); in the 4 trials (4 comparisons) lasting 6 months in early RA the random risk ratio was 1.94 (95% CI 1.21, 3.11). In the 8 trials (9 comparisons) involving TTT strategies as part of intensive treatment the random risk ratio was 1.62 (95% CI 1.30, 2.03).

In the 22 trials in early RA with intensive treatments trials with 1756/3993 patients achieved remission with intensive treatment compared with 903/3307 patients receiving monotherapy. One trial evaluated two intensive treatment regimens and there were consequently 23 comparisons; 13 evaluated TNF inhibitors, 5 evaluated other biologics and 5 evaluated combination DMARDs. Analysis of the 23 comparisons in these trials showed a significant overall benefit for intensive treatment (RR 1.56; 95% CI 1.38, 1.76). There was marked heterogeneity in these studies; I2 was 74% (Table 2). A funnel plot showed a symmetrical pattern in these trials (result not shown). Four trials enrolled patients with disease durations no more than 6 months and these showed a similar benefit for intensive treatment (RR 1.47; 95%CI 1.03, 2.10) Comparison of the different intensive treatment regimens in early RA patients showed similar impacts of different intensive treatments; these ranged from a random risk ratio of 1.43 with TNF inhibitors to 2.00 with other biologics. TTT strategies also increased remissions with a random risk ratio of 1.51.

In the 26 established RA trials 1257/7266 patients achieved remission with intensive treatment compared with 308/5186 patients receiving monotherapy. Three trials evaluated two intensive treatment regimens and consequently there were 29 comparisons: 10 evaluated TNF inhibitors, 10 evaluated other biologics, 3 evaluated combination DMARDs and 6 evaluated JAK inhibitors. Analysis of these 29 comparisons trials showed a significant overall benefit for intensive treatment (RR 4.21; 95% CI 2.92, 6.07). There was marked heterogeneity in these studies; I2 was 86% (Table 2). A funnel plot showed an asymmetrical pattern in these trials (result not shown). Comparison of the different intensive treatment regimens in established RA patients showed some differences in the magnitude of effects; random risk ratios ranged from 2.41 with combination DMARDs to 6.81 with other biologics (tocilizumab, adalimumab and rituximab); however, as the confidence intervals overlapped there was no evidence these differences were significant. Only two trials used TTT strategies and although these increase remissions the 95% confidence intervals showed the finding may not have been significant (random risk ratio 2.39; 95% CI 0.90, 6.32).

Using a fixed effects model gave similar findings. In all trials the risk ratio was 2.06 (95%CI 1.94, 2.18), in early RA trials it was 1.64 (95% CI 1.54, 1.74) and in established RA the risk ratio was 3.32 (95% CI 2.94, 3.74). Interestingly the fixed model indicated TTT strategies in established RA in two trials may have been significant (risk ratio 2.19, 95% CI 1.50, 3.19.

Remission in head to head trials

Overall in the 6 trials 317/787 patients achieved remission with TNF inhibitors compared with 229/671 of patients receiving combination DMARD therapies. Analysis of these 6 trials using the random effects relative risk model (Table 3) showed there was a no different between treatment strategies (RR 1.06; 95% CI 0.93. 1.21). There was little heterogeneity between studies; I2 was 21%. Comparing 4 early RA and 2 established RA trials separately also showed no evidence of a significant difference between groups (Table 3). However, comparisons of the first 6 months results in the two established RA trials showed more remissions with TNF inhibitors using the random effects relative risk model (RR 1.74, 95% CI 1.14, 2.64). The fixed effects model gave similar findings (RR 1.90; 95% CI 1.17, 3.10).

Table 3 Effectiveness In Head To Head Trials Comparing Biologic with Combination DMARD Strategies Assessed By Random Risk Ratio and Heterogeneity

Frequency of remissions

There were marked differences in the frequency of remissions in active and control groups in both early and established RA (Fig. 2). In early RA the average frequency of remissions with active treatment was 49%: in 10 early RA trials 50% or more active patients achieved remissions; the highest rate was 86% in the U-Act-Early (tocilizumab) trial and the lowest rate was 18% in the St Clair (Infliximab) trial. In early RA controls the average frequency of remission was 34%: in four trials 50% or more controls achieved remissions; and the lowest rate in controls was 18% in the Image (rituximab) trial. The average difference in remission rates between active and control group in early RA trials was 15%.

Fig. 2
figure 2

Remissions in control and active groups shown as percent patients in each group in early and established RA

In established RA the average frequency of remissions with active treatment was 19%: in only one trial did 50% or more active patients achieved remission (65% in the Ticora trial of combination DMARDs); in 14 trials 10% or less active patients achieved remission and, in the Reflex, (rituximab) and RA Beam (baricitinib and adalimumab) trials only 3% of patients achieved remissions. In established RA controls the average frequency of remission was 6%: in 22 trials less than 5% of controls achieved remissions; and in the Reflex (rituximab) trial no control patient achieved an end-point remission. The average difference in remission rates between active and control group in early RA trials was 13%.

Quality and risk of Bias

Quality assessment, using the Cochrane Collaboration tool for assessing risk of bias, showed overall quality was high with low risks of bias (Table 1).

Discussion

TNF Inhibitors, other biologics and combination DMARDS were all effective in increasing remission in early and established RA. Treat to target strategies, which usually involved intensive DMARDs, were also effective. JAK inhibitors were similarly effective in established RA; there was no data about their impact in early disease. Although other biologics achieved numerically higher risk ratios in both early and established RA the overlapping confidence intervals gave no support to the view that these differences are clinically significant. The benefits of different types of intensive treatment were therefore broadly similar. Trials of varying durations, from 6 months to more than 12 months, all showed intensive treatments increased remissions. There was no evidence that patients with very early RA of no more than 6 months disease duration benefited more from intensive treatments. We excluded trials with durations of less than 6 months to ensure we did not disadvantage the assessment of intensive treatment strategies using slower acting DMARDs. The head-to-head trials supported the similarities between treatments with combination DMARD strategies and TNF inhibitor strategies, which achieved similar end-point remission rates. There was however, some evidence that TNF inhibitors increased the early remission rates, in keeping with their relatively rapid onset of action compared to conventional DMARD combinations.

TNF Inhibitors, other biologics and combination DMARDS were all effective in increasing remission in early and established RA. There was no evidence that patients with very early RA of no more than 6 months disease duration benefited more from intensive treatments. JAK inhibitors were similarly effective in established RA; there was no data about their impact in early disease. Although other biologics achieved numerically higher risk ratios in both early and established RA the overlapping confidence intervals gave no support to the view that these differences are clinically significant. The head-to-head trials supported the similarities between treatments with combination DMARD strategies and TNF inhibitor strategies, which achieved similar end-point remission rates. There was however, some evidence that TNF inhibitors increased the early remission rates, in keeping with their relatively rapid onset of action compared to conventional DMARD combinations.

The overall quality of the studies was relatively high. However, there was evidence of marked heterogeneity in their findings with most comparisons having high I2 values. This heterogeneity meant that in some intensive treatment arms in early RA over 70% patients achieved remission while in other intensive treatment arms in established RA under 10% patients achieved remission. These differences are likely to reflect patient selection more than treatment efficacy, with very early RA patients having no previous DMARDs are highly likely to achieve remission with intensive treatment while established RA patients who have failed multiple prior treatments are unlikely to do so.

The most likely explanation for the asymmetrical funnel plot in trials in established RA relates to specifically including studies using treatments in their licensed indication which were published between 2000 and 2017. A consequence is that potential intensive treatments which were evaluated in RA patients but were not found to be effective, were not included. Firstly, small initial studies with new drugs which would have shown negative results for remissions were not included as the treatments were never licensed for RA. An example is the spleen tyrosine kinase inhibitors [81]. Secondly, some TNF inhibitors were not effective in RA and were therefore not licensed; an example is Lenercept, which failed to show sustained benefit in clinical trials [82]. Finally, combinations with DMARDs were tried in the 1980’s, before remission was measured or reported; these trials were mainly negative [83]; subsequent trials of intensive DMARD combinations reporting remission which were published after 2000 studied treatments which were known to be effective in combination. These factors mean the funnel plot of remissions in established RA would not include small trials with negative findings because of the selection criteria used. As this report focuses on the benefits of different intensive treatment strategies using licensed treatments given at their approved dosages we do not think an asymmetric funnel plot changes our conclusions.

Our systematic review has a number of limitations. Firstly, studies not reporting remission data were excluded, though they often show clinically important improvements with intensive treatment. Secondly, studies reported remissions differently; for example, DAS and DAS28 remissions are similar but not identical. Thirdly, studies were of variable duration and comparing remission rates over 6 and 12 months or more is not ideal; however, variations in treatments and patient selection meant there was no evidence for one particular time point being best. Fourthly, studies differed in the way they handled non-responders, with some trials stopping treatment if patients had not responded within 3 months or so and applying non-responder imputations. This approach may alter the remission rates in the non-intensive treatment by making it appear smaller than it might have been if treatment was continued. Fifthly, as mentioned previously, studies enrolled different patient groups in whom the likelihood of achieving remissions was very different. Sixthly, the intensive combination DMARD regimens used in the trials have been combined together, even though they represent a wide range of different strategies, not all of which appeared highly effective. In one study by Schipper et al. [58] some patients in active and control groups had monotherapy and others had biologics, so the trial is not just a comparison of one treatment strategy; however, excluding it made no difference to the conclusions. Finally, there is debate about the benefits of combining the results of different trials in a meta-analysis, considering their potential degrees of clinical heterogeneity. As we have also undertaken extensive sub-group analyses we consider the approach we have taken is justified in this particular clinical context.

Our results have several implications for clinical practice. Firstly, they show that intensive treatment strategies lead to more remissions than conventional care in both early and established RA. This finding is generally supportive of the treat-to-target approach currently recommended [7, 8], although we have not attempted to dissociate the impact of giving intensive treatment from the impact of the target. Secondly, they show that initial treatment with conventional DMARD combinations has similar effectiveness to early biologics. This finding is supportive of the current recommendations about biologic treatments from the National Institute for Health and Care Excellence (NICE), which recommend trying combination DMARDs before biologic treatment [84]. Thirdly, they question whether remission is necessarily the ideal target for treatment in established RA, as it is only achieved by a minority of patients in most trials of intensive treatment. There may be greater value of aiming for low disease activity states, in which case these need to be measured in future trials. The EULAR good response criteria can be used to assess the frequency of achieving low disease activity states measures using DAS28. Current guidance on treat to target includes aiming for low disease activity in some patients.

One issue this review cannot address is treatment sequencing. Some experts believe most early RA patients should receive methotrexate monotherapy initially for a few months and only have intensive treatments if they fail to respond. Other experts recommend early intensive treatment followed by treatment tapering. It is possible to find individual trials within our systematic review, which support both options, but there is no systematic evidence to support or refute either approach. One final pair of inter-related uncertainties is the optimal time to assess remission and the most suitable assessment to evaluate its presence. Combining superiority and head-to-head trials (Tables 1 and 4) shows 23 (43%) lasted 12 months, 20 (38%) lasted 6 months and 10 (19%) lasted over 12 months, with the longest (BROSG trial evaluating combination DMARDs) lasting 3 years. This finding suggests trials of 12 months or longer seem preferable. Although most trials reported DAS28 remissions, this represents an historical target and there is now greater emphasis on stricter remission criteria.

Table 4 Details Of Head-To-Head Studies

Conclusions

Intensive treatment with combination DMARDs, biologics or JAK inhibitors increases the frequency of remission compared to control non-intensive strategies. The benefits are seen in both early and established RA. The relative merits of different remission criteria in trials is a complex question but changing criteria has the disadvantage of making it difficult to compare trials with newer criteria and those using more historic methods.