Introduction

Rheumatoid arthritis (RA), characterised by persisting joint inflammation and pain, leads to joint damage, disability and poor quality of life, which incur high medical and societal costs [1,2,3,4]. In developed countries, between 0.5 and 1% of adults have RA, and its long-term course means its prevalence rises with age [5].

Drug treatment for RA over the last two decades has focused on using disease-modifying anti-rheumatic drugs (DMARDs) with variable amounts of short-term glucocorticoids. Conventional synthetic DMARDs (csDMARDs), such as methotrexate, are widely used. They are often supplemented by biological DMARDs (bDMARDs), in particular tumour necrosis factor inhibitors such as etanercept [6]. bDMARDs are substantially more expensive than csDMARDs and are usually given in combination with methotrexate [7]. An additional new group of DMARDs, the Janus kinase inhibitors, has been available for a few years, but their use is minimal in the trial designs included in our systematic review.

The goal of DMARD treatment is to reduce disease activity, ideally by achieving remission or low disease activity (LDA). As csDMARDs and bDMARDs can be used in combination and doses adjusted according to clinical response, the concept of “treat-to-target” (TTT) has grown in recent years, supported by international reports and guidelines [8,9,10,11,12]. Its components comprise (a) setting the target, which is usually remission or LDA; (b) assessing disease activity every 1–3 months using measures like the disease activity score for 28 joints (DAS28); and (c) increasing csDMARDs, bDMARDs, and Janus kinase inhibitors to facilitate achievement of the target, using short-term glucocorticoids if needed [7]. As TTT is a relatively new strategy, there is uncertainty about both its efficacy and its cost-effectiveness, although economic analysis using observational data provides some support [13].
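As an illustration of components (a) and (b), the following minimal sketch checks a DAS28 score against a chosen target. The remission cut-off (DAS28 < 2.6) is the one referred to later in this review; the other cut-offs (LDA ≤ 3.2, moderate ≤ 5.1) are commonly cited values and are assumptions here, as are the function names. Individual trials used their own target definitions.

```python
def das28_category(score: float) -> str:
    """Map a DAS28 score to a disease activity category.

    Cut-offs other than remission (< 2.6) are assumed, commonly cited values.
    """
    if score < 2.6:
        return "remission"
    if score <= 3.2:
        return "low disease activity"
    if score <= 5.1:
        return "moderate disease activity"
    return "high disease activity"


def target_met(score: float, target: str = "remission") -> bool:
    """Return True if the score meets the chosen target ('remission' or 'lda')."""
    category = das28_category(score)
    if target == "remission":
        return category == "remission"
    if target == "lda":
        # Remission also satisfies a low disease activity target.
        return category in ("remission", "low disease activity")
    raise ValueError(f"unknown target: {target}")


# A patient with DAS28 of 3.0 meets an LDA target but not a remission target,
# so under a remission target treatment would be reviewed for escalation.
print(target_met(3.0, "lda"), target_met(3.0, "remission"))
```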

bDMARDs are expensive drugs and their use in TTT incurs high costs to healthcare providers. There is consequently a strong case to assess the underlying health economic rationale for using TTT approaches. As the impact of TTT differs in early and established RA patients [8, 10], economic evaluations need to consider these populations separately. Although systematic reviews by Schoels et al. [9] and Stoffer et al. [11] assessed the evidence supporting TTT, neither evaluated its economic impact. There have also been several new trials reported since these reviews were completed. We have consequently undertaken a comprehensive systematic review of both the clinical and health economic impacts of TTT, including randomised controlled trials (RCTs) published up until 2020.

Methods

This review was commissioned by the NIHR HTA Programme (Project 14/17/01) [14]. The protocol is registered as PROSPERO CRD42015017336.

Search Strategy

We followed PRISMA principles (http://www.prisma-statement.org/) (Supplement 1).

Initial searches involved MEDLINE, EMBASE, Cochrane Database of Systematic Reviews, Cochrane Central Register of Controlled Trials (CENTRAL), NHS Economic Evaluation Database (NHS EED), Health Technology Assessment Database (HTA), Database of Abstracts of Reviews of Effects (DARE), Web of Science Citation Index Expanded (WoS), Web of Science Citation Index and Conference Proceedings Index (WoS-CPI), EULAR (via Web of Science), ACR (via Web of Science) and ClinicalTrials.gov using terms for RA combined with TTT terms (after Schoels et al. [9]), and search filters for RCTs, systematic reviews and economic evaluations (Supplement 2).

A full systematic search was conducted from database inception to January 2016, refined by initial searches, on MEDLINE, EMBASE, CENTRAL, NHS EED, WoS, WoS-CPI, BIOSIS Previews, CINAHL, Econlit, ClinicalTrials.gov, International Clinical Trials Registry Platform and NICE Evidence. Additional TTT and RCT free-text terms and economic evaluation filters were added (Supplement 2) to increase the search sensitivity. No date or language limits were applied. Records from initial and full searches were combined and duplicates removed. An update search was performed on MEDLINE (via EBSCOhost) in November 2020, again without date or language limits. The results from the update search were de-duplicated against the original results.

Inclusion and Exclusion Criteria

We included RCTs (including cluster RCTs) in adults with clinically diagnosed RA, managed anywhere on the treatment pathway, that examined the effectiveness of one or more TTT strategies to guide treatment decisions for individual patients. Eligible comparators were (1) usual care (no TTT strategies); (2) a TTT strategy using an alternative treatment protocol; and (3) a TTT strategy using an alternative target. Outcomes were the proportion of patients achieving remission and LDA, and adverse effects. Sufficient description of the TTT strategy was required; meeting abstracts had to contain sufficient methodological details for critically appraising study quality. Included studies were limited to those published in the English language. We excluded animal models, preclinical and biological studies, trials of personalised medicine, trials of other designs, and trials designed to test an active drug against placebo in which both or all trial arms pursued the same target and treatment protocol.

Study Selection and Data Extraction

One reviewer (MMSJ) examined titles and abstracts of retrieved records; 5% were checked by another reviewer (ESH). Full texts of all included studies were examined by two reviewers; where necessary, discrepancies were resolved by discussion involving a third reviewer. For the update search, all titles, abstracts and full texts were examined by one reviewer (ESH) and checked by another (ELS or MMSJ).

Three reviewers undertook data extraction (ESH, MMSJ, ELS). For each paper, one reviewer, unblinded to authors or journal, extracted data relevant to the decision problem using a standardised form. Data on study, population and TTT characteristics, including adverse events (AEs), were extracted and checked by a second reviewer. Discrepancies were discussed and agreement was reached without needing to consult a third reviewer.

Quality Assessment

One reviewer (shared among ESH, MMSJ, ELS) assessed methodological quality of each RCT using Cochrane Collaboration risk of bias assessment criteria, evaluating sequence generation; allocation concealment; blinding of participants and personnel; blinding of outcome assessment; incomplete outcome data; and selective outcome reporting, each judged as high, low or unclear risk of bias [15]. We included three additional domains for cluster RCTs: recruitment bias (whether participants were recruited prior to clusters being randomised); risk of baseline differences between clusters; and attrition of clusters. We classified RCTs as overall ‘low risk’ of bias if they were rated as ‘low’ for each of three key domains: allocation concealment, blinding of outcome assessment and completeness of outcome data (attrition > 10% was rated ‘high’ [16]). RCTs at ‘high risk’ of bias for any of these domains were judged ‘high risk’ overall. RCTs neither being at ‘high risk’ for any domain, nor ‘low risk’ for all these domains, were judged ‘unclear risk’ overall. All quality assessments were checked by a second reviewer with discrepancies discussed and agreement reached.
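The overall rating rule described above can be sketched as a small function. This is only an illustration of the stated logic; the domain keys and the function name are hypothetical labels, not part of the Cochrane tool.

```python
from typing import Mapping

# The three key domains named in the text (key names are illustrative).
KEY_DOMAINS = (
    "allocation_concealment",
    "blinding_of_outcome_assessment",
    "incomplete_outcome_data",
)


def overall_risk(judgements: Mapping[str, str]) -> str:
    """Combine per-domain judgements ('low', 'high', 'unclear') into an overall rating."""
    ratings = [judgements[domain] for domain in KEY_DOMAINS]
    if any(rating == "high" for rating in ratings):
        return "high"      # high risk in any key domain
    if all(rating == "low" for rating in ratings):
        return "low"       # low risk in every key domain
    return "unclear"       # no high rating, but not all low


example = {
    "allocation_concealment": "low",
    "blinding_of_outcome_assessment": "unclear",
    "incomplete_outcome_data": "low",
    # Non-key domains do not affect the overall rating under this rule.
    "blinding_of_participants": "high",
}
print(overall_risk(example))
```

Under this rule a trial can be rated high risk for blinding of participants and personnel and still be low risk overall, because that domain is not one of the three key domains.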

Data Synthesis

Evidence of clinical effectiveness of TTT was organised by TTT comparisons: (1) TTT vs. usual care; (2) comparison of different treatment protocols; (3) comparison of different targets. Four trials not fitting this framework were examined separately [17,18,19,20]. Some trials made more than one comparison and appear under more than one category. Trials were further examined according to whether they used early or established RA populations [8, 10, 21] using definitions of early and established RA outlined in the trials; where no definition was provided a 3-year cut-off was used [21].

Results

Study Selection

Forty-nine papers reporting 22 RCTs were included from 17,631 records reviewed (Fig. 1); 42 papers reporting 16 RCTs from the original searches, and seven papers reporting six new RCTs and one updated RCT from the update searches. We excluded 17,418 on titles and abstracts: 213 publications were reviewed in detail and 164 papers describing 72 studies were excluded (107 not TTT, 18 not RCTs; 10 reporting no relevant outcomes; 29 excluded for diverse reasons).
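The selection counts above can be cross-checked with simple arithmetic. The sketch below uses only the figures reported in this paragraph; the variable names are illustrative.

```python
records_reviewed = 17_631
excluded_on_title_abstract = 17_418

# Publications taken forward to detailed (full-text) review.
full_text_reviewed = records_reviewed - excluded_on_title_abstract

# Reasons for exclusion at full-text review, as reported in the text.
full_text_exclusions = {
    "not TTT": 107,
    "not RCTs": 18,
    "no relevant outcomes": 10,
    "diverse reasons": 29,
}
excluded_at_full_text = sum(full_text_exclusions.values())

included_papers = full_text_reviewed - excluded_at_full_text

# Papers and trials by search round (original searches plus update search).
papers_by_search = 42 + 7
trials_by_search = 16 + 6

print(full_text_reviewed, excluded_at_full_text, included_papers)
print(papers_by_search, trials_by_search)
```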

Fig. 1 Flow diagram of study selection for systematic review of TTT strategies in rheumatoid arthritis

LDA low disease activity, RCT randomised controlled trial, TTT treat-to-target

Study Characteristics

The 49 papers described 22 trials of 5990 RA patients. The trials spanned four categories (Table 1) based on their main features. Eight trials (1977 patients) compared TTT with usual care [22,23,24,25,26,27,28,29]; seven trials (2418 patients) compared different treatment protocols [30,31,32,33,34,35,36]; six trials (1758 patients) compared different treatment targets [24, 26, 35, 37,38,39]; four trials (1143 patients) made other comparisons of conventional with intensive therapy [14, 17, 19, 20]. Thirteen trials studied early RA patients [17, 24, 25, 30,31,32,33,34,35,36,37,38,39]; five studied established RA patients [19, 26,27,28, 40]; and four studied early and established RA [20, 22, 23, 29]. Fifteen trials involved controls receiving less intensive treatment [17, 19, 20, 22,23,24,25,26,27,28,29,30, 33, 36, 40], including four with groups receiving different intensive treatment strategies [24, 26, 30, 36]. Five trials compared different intensive treatments without controls receiving less intensive therapy [30,31,32, 34, 35]. Three trials compared two different targets without controls receiving less intensive therapy [37,38,39]. Four trials were cluster randomised [22, 26, 27, 29] and 18 were not [17, 19, 20, 23,24,25, 28, 30,31,32,33,34,35,36,37,38,39,40].

Table 1 Characteristics of trials included in the systematic review of TTT strategies in RA

Risk of Bias

Twelve RCTs were judged at overall high risk of bias [17, 19, 20, 24, 28, 31, 33, 35,36,37,38,39], five at overall unclear risk of bias [25, 30, 32, 34, 40] and one at overall low risk of bias [23]. None were judged at high risk of bias for random sequence generation or allocation concealment; 13 had high risk of bias for blinding of participants and personnel [17, 19, 20, 23,24,25, 31,32,33,34, 38,39,40]; eight for blinding of outcome assessment [19, 20, 24, 31, 33, 37,38,39]; eight from reporting withdrawals > 10% [17, 19, 24, 28, 35, 36, 39, 40]; and five had high risk of bias as some outcomes reported in either the protocol [20, 25, 28, 32] or the methods section [35] were omitted from the results (Supplement 3).

All four cluster RCTs were considered at overall high risk of bias [22, 26, 27, 29]. Three were judged at high risk of bias for blinding of participants and personnel [22, 26, 27], one had high risk of bias for blinding of outcome assessment [22], all four had high risk of attrition bias [22, 26, 27, 29], one had high risk of bias for selective reporting [26], two had high risk of cluster recruitment bias (participants were recruited after clusters were randomised) [26, 27] and two had high risk of bias for cluster attrition (outcomes were limited to a subset of original clusters randomised) [27, 29] (Supplement 3).

Clinical Effectiveness

Heterogeneity across populations, comparisons, targets, treatment protocols and outcomes precluded meta-analysis. Trial findings were synthesised narratively focusing on proportions of patients achieving end-point remissions in 18 trials, and LDA or equivalent in four trials not reporting remissions (Table 2 and Fig. 2).

Table 2 Targets, low disease activity and remission for TTT strategies in RA

Fig. 2 Reported remissions at trial endpoints for TTT strategies vs standard care in rheumatoid arthritis

RA rheumatoid arthritis, TTT treat-to-target

TTT vs Usual Care

Four of the eight trials reported remissions: all found more remissions with intensive treatment; differences were significant in three trials. The TICORA trial [23] (two groups, early and established RA, 18-month treatment) reported the largest difference (intensive treatment 65%, conventional treatment 16%, p < 0.001). The T-4 trial [24] (four groups, early RA, 12-month treatment) reported a significant difference with DAS28-driven care compared with conventional treatment (38% vs 21%, p = 0.05). The Optimisation of Adalimumab trial [26, 41] (three groups, established RA, 18-month treatment) reported a significant difference between DAS28-driven care and conventional treatment (38% vs 16%, p = 0.027) in the ITT analysis but the completer analysis showed no significant difference. The STREAM trial [25] (two groups, early RA, 2-year treatment) reported more remissions with intensive than conventional treatment (66% vs 49%); this difference was not significant.

The Fransen trial [27] (two groups, established RA, 6-month treatment) only reported LDA; significantly more patients achieved LDA with intensive than conventional treatment (31% vs 16%, p = 0.028). Similarly, the Bergsten trial [28] reported a greater proportion of patients achieved LDA with intensive (48%) than conventional treatment (24%). The van Hulst trial [22] (two groups, early and established RA, 18-month treatment) reported more EULAR good responders (which includes LDA) with intensive (22%) than conventional treatment (18%) (significance unreported). In the Harrold trial [29], a similar proportion of patients achieved LDA with intensive (57%) and conventional treatment (55%).

Comparison of Treatment Protocols

All seven trials reported remissions [30,31,32,33,34,35,36]. Two trials involving conventionally treated controls reported significantly more remissions with intensive treatments. The U-Act-Early trial [36] (three groups, early RA, 24-month treatment) reported significantly more remissions with tocilizumab and methotrexate (86%) than methotrexate monotherapy (44%, p < 0.001). The FIN-RACo trial [33] (two groups, early RA, 24-month treatment) reported 37% remissions with intensive combinations and 18% remissions with monotherapy (p = 0.003).

Five trials compared different intensive treatment regimens. They comprised the BeSt trial [30, 42] (four groups, early RA, 12-month treatment), the CareRA trial [43, 44] (five groups, early RA, 24-month treatment), the COBRA-light trial [45] (two groups, early RA, 12-month treatment), the Saunders trial [34] (two groups, early RA, 12-month treatment) and the TEAR trial [35] (four groups, early RA, 24-month treatment). There were no significant differences in remissions between comparable groups in these trials.

Longer-term outcomes were reported for the BeSt and FIN-RACo trials (Table 3). In the BeSt trial [46, 47], there were similar remission rates in the four arms over 10-year follow-up. In the FIN-RACo trial [48] at 11 years, patients receiving initial combination therapy had significantly more remissions than with monotherapy (37% vs 19%), although there were no differences between groups at 5 years (29% vs 22%).

Table 3 Long-term outcome in FIN-RACo and BeSt trials of TTT strategies in RA: remissions from 1 to 11 years

Different Targets and Other Comparisons

The 10 trials reporting different targets and other comparisons all reported remissions. All four trials involving conventionally treated controls reported more remissions with intensive treatments, and the difference was significant in three. The CAMERA trial [17, 49, 50] (two groups, early RA, 2-year treatment) reported significantly more remissions with intensive than conventional treatment (50% vs 37%, p = 0.03). The BROSG trial [18] (two groups, established RA, 3-year treatment) also reported more remissions with intensive than conventional treatment (20% vs 14%); this difference was not significant. The TITRATE trial [19] (two groups, established RA, 1-year treatment) reported significantly more remissions with intensive than conventional treatment (32% vs 18%). The Mueller trial [20] compared a fixed regimen using certolizumab pegol with a TTT regimen using certolizumab pegol, and similarly reported significantly more remissions with more intensive treatment (68% vs 29%).

Four trials compared different treatment targets. The Hodkinson trial [37] (two groups, early RA, 12-month treatment) compared Clinical Disease Activity Index (CDAI) and Simple Disease Activity Index (SDAI) targets. The Tam trial [38] (two groups, early RA, 12-month treatment) compared SDAI and DAS28 remission targets. The ARCTIC trial [39] (two groups, early RA, 24-month treatment) compared a clinical target of DAS remission and no swollen joints with an ultrasound target of no power Doppler signal plus DAS remission and no swollen joints. The TEAR trial [35] (four groups, early RA, 24-month treatment) compared immediate treatment and step-up treatment to target LDA. All four trials reported no significant difference between groups.

Two TTT versus conventional treatment trials (T-4 and Optimisation of Adalimumab) also compared different treatment targets. In the T-4 trial [24], matrix metallopeptidase-3 (MMP-3)-guided treatment appeared less effective than DAS28-guided treatment. In the Optimisation of Adalimumab trial, [26, 41] swollen joint count (SJC)-guided treatment appeared less effective than DAS28-guided treatment, although this was not statistically significant.

Disease Duration

Thirteen trials studied early RA: five compared intensive management regimens with standard care and showed significantly more patients achieved remissions or LDA states with intensive management (T-4 [24], BeSt [30, 51], FIN-RACo [33], U-Act-Early [36], CAMERA [17, 49, 50]); one trial showed more remission with intensive treatment but the difference was not significant (STREAM [25]). Another seven early RA trials compared different intensive treatment regimens and showed comparable benefits (CareRA [43], COBRA-Light [32, 45], Saunders [34], Hodkinson [37], TEAR [35], Tam [38], ARCTIC [39]).

Five trials studied established RA: four compared intensive management regimens with standard care and showed significantly more patients achieved remissions or LDA states with intensive management (Fransen [27], Optimisation of Adalimumab [26, 41], Bergsten [28], TITRATE [19]); one trial found more remission with intensive treatment but the difference was not significant (BROSG [18]).

Four trials studied mixed populations of early and established RA: two trials showed significantly more remissions with intensive treatment than standard care (TICORA [23], Mueller [20]); the other two trials showed no significant impact of intensive treatment on LDA states (van Hulst [22], Harrold [29]).

Adverse Events

Five trials did not report harms (Fransen [27], van Hulst [22], BROSG [40], Hodkinson [37], Harrold [29]). Seventeen trials variously reported deaths, serious adverse events and withdrawals for adverse events (Table 4).

Table 4 Numbers of patients with adverse events in trials of treat-to-target strategies in rheumatoid arthritis

Deaths were reported in 13 trials: there were no deaths in five trials (U-Act-Early [36], STREAM [25], Mueller [20], Bergsten [28] and Tam [38]) and 29 deaths in the other eight trials (BeSt [51, 52], TEAR [35], Saunders [34] T-4 [24], TICORA [23], CareRA [44], TITRATE [19] and ARCTIC [39]). There were four deaths in three standard care arms and 13 deaths in 16 intensive treatment arms in which patients received a range of intensive treatments.

Serious adverse events were reported in 13 trials; all found some serious adverse events (TEAR [35], BeSt [30, 52], CareRA [44], COBRA-light [45], FIN-RACo [33], U-Act-Early [36], STREAM [25], T-4 [24], TITRATE [19], Mueller [20], Tam [38] and ARCTIC [39]). 397/3368 (12%) patients had a serious event: 346/2767 (13%) receiving intensive management and 51/614 (8%) receiving standard care. Serious adverse event rates varied substantially across trials: they were greatest in the 24-month U-Act-Early trial (49/317, 15%) [36] and least in the 12-month T-4 trial [24] (5/243, 2%).

Twelve trials reported withdrawals due to adverse events. No patients withdrew in four trials (STREAM [25], TICORA [23], Bergsten [28] and TITRATE [19]). In seven trials (TEAR [35], U-Act-Early [36], CAMERA [17], T-4 [24], Optimisation of Adalimumab [26], COBRA-light [32] and ARCTIC [39]), withdrawal rates varied substantially; they were highest in the CAMERA trial (38/299, 13%) [50]. In the CareRA trial [44], none of the low-risk patients withdrew from the trial; however, various numbers of patients withdrew from the three high-risk arms. There were 51/807 (6%) withdrawals in seven standard care arms and 239/3009 (8%) in 30 intensive treatment arms.

Cost-Effectiveness

The heterogeneity of data in the economic literature prohibited the construction of a single economic model that simultaneously compared all identified treatment strategies. Instead, each study considered in the clinical effectiveness section was evaluated separately. In some cases, a measure of cost-effectiveness was presented in the paper and could be extracted. In others, sufficient data were presented to allow an incremental cost-effectiveness ratio (ICER), such as the incremental cost per additional patient in remission, to be estimated. These estimates assumed a typical cost of biologic therapy of £9200 per annum and, for simplicity, no costs for csDMARDs or for RA-related hospitalisations other than rheumatology visits, which were each assumed to cost £128 [7]. The analyses undertaken were in line with recommendations from NICE, which has a direct medical cost and personal social services perspective [65]. Further details are provided in Wailoo et al. [14]. For jurisdictions that do consider indirect costs, such as lost productivity and costs falling upon the individual, the ICERs would become more favourable to more efficacious treatment regimens.
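To make the form of the estimate concrete, the following sketch computes an incremental cost per additional patient in remission. The unit costs (£9200 per year of biologic therapy, £128 per rheumatology visit) are those stated above; the scenario itself (one year of biologic therapy, four extra visits, remission rates of 50% vs 30%) is entirely hypothetical and does not correspond to any included trial.

```python
BIOLOGIC_COST_PER_YEAR_GBP = 9200   # assumed annual cost of biologic therapy (from the text)
VISIT_COST_GBP = 128                # assumed cost per rheumatology visit (from the text)


def cost_per_additional_remission(
    cost_intensive: float,
    cost_comparator: float,
    remission_intensive: float,
    remission_comparator: float,
) -> float:
    """Incremental cost divided by incremental proportion of patients in remission."""
    delta_effect = remission_intensive - remission_comparator
    if delta_effect == 0:
        raise ValueError("no difference in remission; the ratio is undefined")
    return (cost_intensive - cost_comparator) / delta_effect


# Hypothetical scenario: the intensive arm adds one year of biologic therapy
# and four extra visits per patient; the comparator incurs neither.
intensive_cost = 1 * BIOLOGIC_COST_PER_YEAR_GBP + 4 * VISIT_COST_GBP
comparator_cost = 0

icer = cost_per_additional_remission(intensive_cost, comparator_cost, 0.50, 0.30)
print(f"£{icer:,.0f} per additional patient in remission")
```

A negative incremental cost combined with a positive incremental effect corresponds to dominance (better outcomes at lower cost), in which case a ratio is not normally reported.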

In nine [25, 26, 28, 29, 34, 37, 38, 41, 43, 44, 53,54,55,56,57] of the 22 studies, there was insufficient evidence to make any clear conclusion regarding incremental cost-effectiveness. Further details are provided in Wailoo et al. [14].

The 13 remaining studies [17, 22,23,24, 27, 30, 33, 35, 36, 40] contained sufficient evidence to allow a measure of ICER to be estimated with some degree of confidence. In six studies, one intervention was estimated to dominate another (greater health-related benefits at reduced costs). From the data contained in FIN-RACo [33], it was estimated combination drug therapy likely dominates single-drug therapy. From the evidence in the TICORA trial [23], it was estimated intensive care dominates routine care. According to the data contained in the Fransen trial [27], systematic monitoring dominates usual care. Data from the van Hulst trial [22] implied that usual care dominated a nurse-led approach. Mueller et al. [20] showed that on top of a backbone of certolizumab pegol treatment, a TTT approach for csDMARD treatment had significantly better clinical outcomes than fixed csDMARD treatment. Finally, data from ARCTIC [39] showed that an ultrasound-guided tight control strategy was associated with a statistically significant increase in bDMARD use, but that there was no statistically significant difference in outcome measures.

The authors calculated ICERs for the remaining seven studies. For reference, NICE are unlikely to fund interventions that have a cost per QALY greater than £30,000 [64]. In TITRATE, Scott et al. [19] explicitly calculated a cost per QALY which was £43,972 using a medical and personal social services perspective. This value became £29,363 when indirect costs and personal costs were included, although these aspects are not included in the NICE reference case. Using data contained in the BROSG trial [40], an ICER for shared care versus hospital treatment of £1517 per QALY (£7571 when baseline utility differences were considered) was estimated. From evidence in CAMERA [17], it was estimated intensive therapy would be cost-effective when compared with conventional therapy, due to an estimated incremental cost of £110 per patient in remission. Using data in the T-4 trial [24], basing treatment decisions on DAS28 in combination with MMP-3 was estimated to cost less than £170 per patient in remission compared with treatments driven by DAS28 alone, MMP-3 alone and routine care, indicating that a combination protocol would be cost-effective [24]. Evidence from the TEAR trial [35] indicates the additional costs associated with the immediate use of etanercept or the use of etanercept before triple csDMARD therapy would not be justified by the gain in health-related benefits. The higher expense associated with initial combination therapy with infliximab in the BeSt trial [30, 42, 46, 47, 51, 52, 58,59,60,61,62,63] does not appear to be justified due to the lack of any significant gain in health-related benefits compared with initial combination therapy with prednisone. The evidence contained in the U-Act-Early trial [36] indicates that a significantly higher percentage of people treated with tocilizumab, as monotherapy or in combination with methotrexate, achieved sustained remission than of those treated with methotrexate monotherapy. However, since tocilizumab is a bDMARD, it would be associated with markedly higher costs compared with the costs associated with methotrexate, a csDMARD, on which 44% of patients achieved a sustained remission. The estimated ICER was a cost of £41,818 per additional sustained remission for early tocilizumab treatment.
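The cost per QALY estimates quoted above can be compared against the £30,000 reference value with a short check. This is a sketch only: the labels and the function name are illustrative, and the comparison is a simple threshold test rather than a full appraisal. Costs per remission are not included because they are not expressed per QALY.

```python
NICE_REFERENCE_GBP_PER_QALY = 30_000  # reference value quoted in the text

# Cost per QALY estimates reported in the text (labels are illustrative).
cost_per_qaly = {
    "TITRATE, medical and personal social services perspective": 43_972,
    "TITRATE, including indirect and personal costs": 29_363,
    "BROSG, shared care vs hospital treatment": 1_517,
    "BROSG, adjusted for baseline utility differences": 7_571,
}


def within_reference(icer: float, reference: float = NICE_REFERENCE_GBP_PER_QALY) -> bool:
    """Return True if the cost per QALY does not exceed the reference value."""
    return icer <= reference


for label, icer in cost_per_qaly.items():
    verdict = "within" if within_reference(icer) else "above"
    print(f"{label}: £{icer:,} per QALY ({verdict} the reference value)")
```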

Discussion

Our systematic review of 22 different trials of TTT strategies in RA, which enrolled 5990 RA patients, provides robust evidence supporting their use. Firstly, TTT strategies were effective overall; more patients achieved remission or LDA with TTT approaches than with standard care. Secondly, there was no evidence TTT strategies increased harms compared with usual care. Finally, TTT strategies were largely cost-effective. The trials employed a wide range of different TTT strategies and there was no evidence to favour any particular strategy; several different approaches were equally effective. Although TTT strategies were effective in a broad range of RA patients, the impact of this approach was most marked in early RA. Sixteen trials had high risk of bias, five unclear risk and one low risk of bias.

There are several challenges evaluating TTT strategies. Firstly, the trials involved a broad range of different treatment approaches in diverse patient groups using varying target definitions and durations. The extent of this diversity precluded undertaking a meta-analysis. Secondly, several TTT components, including treatment intensities, treatment targets, frequent review and personalised care, may all account for its benefits. Thirdly, although remission is the preferred target, LDA is also relevant and is achieved by more patients. The frequency of remission with TTT strategies also varied substantially. In two trials (U-Act-Early and TICORA) most intensively-treated patients achieved remission, but these findings were exceptional and lower numbers of patients achieved remission in the majority of trials. Finally, long-term follow-up of two trials (BeSt and FIN-RACo) gave diverse findings. In BeSt, all treatment groups had similar long-term outcomes. In FIN-RACo, initial intensive treatment maintained its benefit. BeSt was undertaken in the biologic era while FIN-RACo predated widespread biologic use; these follow-up findings may reflect post-trial treatment differences rather than the impact of one type of TTT strategy.

Evidence that TTT approaches were cost-effective was strongest in early RA. Conclusions about cost-effectiveness could be made in thirteen trials. Estimates from these studies indicated TTT was usually cost-effective; the exception was if bDMARDs were used as the initial treatment. Initial intensive treatment using csDMARDs appeared most cost-effective. Health economic evaluations of the TEAR trial provided particularly strong evidence in favour of the cost-effectiveness of initial treatment with csDMARD therapy and only using bDMARDs when patients do not respond [65].

TTT strategies have been followed for over 10 years [66]. Their use is well established in RA and is supported by many clinical guidelines; our findings support these recommendations [67,68,69,70]. RA guidelines are cautious about using bDMARDs as first-line therapies and do not encourage this approach; our economic analyses suggest such caution is appropriate [67,68,69,70]. Despite the strength of support for TTT strategies, there are many challenges implementing such approaches in routine practice. Not all specialists accept the guidance, and there are wide variations in views on treatment and the organisation of care approaches [71, 72]. In many centres, disease activity is not routinely measured using quantitative approaches [73], and not all centres have dedicated clinics for RA patients in general and early RA in particular. A range of additional factors influence TTT implementation in routine practice. The frequency of visits and the value of a 3-month assessment are important considerations [74]. Patients’ involvement in assessing their RA may have a crucial role using methods like the Patient-Reported Outcomes Measurement Information System [75]. There is also a role for training clinical staff in TTT methods [76], although the overall impact of training is uncertain [29].

Strengths of our systematic review include robust searching of several databases and secondary sources, following PRISMA principles, undertaking independent study selection and data extraction and assessing the quality of included studies. Heterogeneity across studies meant we were unable to undertake pair-wise meta-analysis and could make only a limited assessment of publication bias. Our conclusions also have several limitations. Firstly, there were quality concerns about some TTT trials. Risk of bias was high in 12 conventional and four cluster trials and was only low in TICORA [23]. Secondly, some trials had small sample sizes and uneven baseline variable distribution, exemplified by the STREAM trial [25]. Thirdly, each comparison and population group involved relatively few trials. Fourthly, there was heterogeneity in targets, treatment protocols, contact frequencies, outcomes and follow-up time points. Fifthly, in trials such as TEAR, treatment acceleration and escalation in different arms complicated defining the comparative effectiveness of the various strategies used. Sixthly, DAS28 < 2.6 remissions do not preclude ongoing disease activity and radiological progression, although the ARCTIC trial [39] found no difference in remission rates between a conventional tight control strategy aiming for clinical remission and an ultrasound remission-guided strategy. Finally, there are insufficient trials to be certain TTT is cost-effective among established RA patients.

Five previous systematic reviews have combined data from RCTs and non-randomised studies; two [9, 11] informed international recommendations [8, 10] concluding TTT was more effective than usual care. Assessments by Jurgens et al. [77], Schipper et al. [78] and Bakker et al. [79] highlighted the benefits of TTT compared with usual care. Knevel et al. [80] found no evidence to recommend one particular target over others in a synthesis of trials and non-randomised studies, concluding evidence supporting TTT was limited to early RA. Our systematic review had more stringent inclusion criteria, only examining RCTs to minimise bias, and included health economic assessments.

We conclude that TTT strategies are effective, safe and often cost-effective. Their impact is most marked in early RA, where there is strong evidence of increased remissions, and in patient groups containing both early and established RA. In established RA, TTT strategies increase LDA but their impact on remissions is less certain. The trials assessed a range of different TTT strategies and there is no reason to prefer any particular strategy.