Background

With recent advances in therapy, the current goal of treatment of rheumatoid arthritis (RA) is clinical remission. While 30% of patients treated with conventional synthetic disease-modifying antirheumatic drugs (csDMARDs) achieve remission, up to 50% of those treated with tumor necrosis factor inhibitors (TNFi) in a treat-to-target strategy achieve remission at 6 to 12 months, with better physical functioning, less radiographic damage, and lower risks of work loss [1,2,3].

With this growing population of patients, new questions have arisen about the most appropriate regimen to maintain remission. In particular, for patients treated with TNFi in combination with csDMARDs, what are the relative benefits and risks of continuing versus discontinuing TNFi? Discontinuation of TNFi could avoid potential overtreatment and eliminate associated costs and risks of toxicities [4]. Also, because patients in remission may experiment with unsupervised drug holidays, supervised discontinuation may improve overall adherence [5, 6]. However, TNFi discontinuation entails risks of increased RA activity. Previous reviews have reported that 40 to 50% of patients could maintain remission at least short-term after stopping TNFi, but loss of remission was 1.3 to 6.7 times more likely compared to those who continued treatment [4, 7,8,9,10,11,12].

TNFi discontinuation may take place in two clinical contexts: when remission has been achieved after short-term use of TNFi as induction therapy (i.e., an induction-withdrawal approach), or more commonly, among patients in stable remission after long-term treatment (i.e., maintenance discontinuation). Viewed in the Population-Intervention-Comparator-Outcome (PICO) framework, these populations differ. It is important to examine these populations separately because the duration of RA, recency of active RA, and duration of remission may influence the success of TNFi discontinuation [12]. Previous reviews have not distinguished these different clinical scenarios, even though information on each group is needed for accurate patient counseling.

That about one-half of patients can successfully discontinue TNFi suggests that there may be subsets of patients with either higher or lower likelihoods of success. If these subsets could be identified, TNFi discontinuation could be more effectively targeted. The most consistent predictors of successful TNFi discontinuation have been the depth of remission and early RA [13]. Associations with other clinical features, particularly biomarkers, are less certain [12,13,14,15]. Whether predictors differ between patients stopping induction treatment or maintenance treatment is unknown.

Our goals were as follows: (1) to perform a systematic review of the prevalence of remission after TNFi discontinuation, separately in patients receiving induction therapy or stopping maintenance treatment, and (2) to perform a scoping review of predictors of remission in these two populations. We focused on TNFi discontinuation because this is currently the most common treatment de-escalation decision in RA [16].

Methods

We performed two related literature reviews: a systematic review of the prevalence of sustained remission/low disease activity (LDA) after discontinuation of TNFi treatment in patients with RA (and when available, comparison to continuation of TNFi), and a scoping review of predictors of continued remission/LDA after TNFi discontinuation [17]. We examined both questions following a written protocol, which was registered at the Center for Open Science (osf.io/etzav). We followed the Preferred Reporting Items for Systematic Reviews and Meta-analyses 2020 recommendations (Supplement) [18].

Literature searches

We searched five bibliographic databases for relevant studies in any language published from January 1, 2005, to May 1, 2022: PubMed/MEDLINE, Embase, Web of Science, Cochrane Central Register of Controlled Trials, and Cochrane Reviews. We did not search before 2005 because discontinuation strategies were not used earlier. Search terms included “rheumatoid arthritis,” “tumor necrosis factor inhibitors,” individual medication names, “remission” or “low disease activity,” and “discontinuation” or “withdrawal” (Supplemental Table 1). We used EndNote20 for citation management. For the scoping review, one author also searched abstracts of congresses of the American College of Rheumatology and European League Against Rheumatism from 2010 to 2022 and Google through May 2023.

Study inclusion

Two authors independently reviewed the search results for relevant articles, first by title/abstract and subsequently full-text review. Discrepancies were resolved by discussion. We included full-length articles, reviews, conference abstracts, and trial registrations to identify primary articles and for the scoping review, but limited the systematic review to full-length articles. We included randomized controlled trials, single-arm trials, and observational studies that examined adults with RA who were in remission/LDA while on treatment with TNFi, and that reported patients’ remission status following discontinuation of TNFi treatment. We included articles regardless of the stringency of remission or RA activity index used, on the premise that investigators judged that RA activity was low enough that TNFi discontinuation was a reasonable consideration. Some studies had a controlled trial design to address a different primary question, but included TNFi discontinuation during follow-up as a secondary aim. We considered these as observational studies if TNFi discontinuation was not randomized.

We excluded cross-sectional studies, studies of other diseases or children or animals, case reports, letters, duplicate articles, and abstracts subsequently published as full-length articles. We also excluded studies of discontinuation of csDMARDs or other biologics unless the article included stratified data on TNFi. We excluded TNFi tapering studies and tapering arms of multi-arm trials (Supplemental Table 2). We focused on discontinuation rather than tapering, as tapering regimens vary, and discontinuation provides greater contrast to identify predictors. When more than one article was based on the same cohort, we included the article most relevant to the systematic or scoping review.

For the scoping review, we included full-length articles or conference abstracts that examined predictors of sustained remission/LDA after TNFi discontinuation. Predictors could be clinical, imaging, or biological markers. We allowed studies that included patients who discontinued other biologics, provided that most patients used TNFi, and allowed studies that reported predictors of remission in the entire cohort (i.e., not limited to those who discontinued TNFi).

Data extraction

For the systematic review, two authors independently extracted data on RA activity at the time of TNFi discontinuation, remission/LDA criteria, prevalence of remission/LDA during follow-up, and outcomes of re-treatment, using a standardized format. Two authors also independently assessed study quality, using the Cochrane Risk of Bias 2 (ROB2) tool for controlled trials and the Risk Of Bias In Non-randomized Studies of Interventions (ROBINS-I) tool for other studies [19, 20]. Results were compared and discrepancies resolved by discussion. For the scoping review, data on predictors and measures of association were extracted by one author and independently checked by a second author.

Statistical analysis

Our study outcome was the prevalence of remission/LDA after TNFi discontinuation. We pooled induction-withdrawal studies and maintenance discontinuation studies separately, and for each treatment strategy, we pooled the outcomes of Disease Activity Score 28 (DAS28) < 3.2, DAS28 < 2.6, or Simplified Disease Activity Index (SDAI) ≤ 3.3 separately. For the few studies that reported the outcome as the proportion that did not restart biologic treatment, we conservatively classified these as DAS28 < 3.2. Since relapses are time-dependent and more likely with longer follow-up, we pooled results reported at 24–36 weeks after discontinuation and 37–52 weeks after discontinuation separately. We computed pooled prevalences using random effects models fitted by restricted maximum likelihood estimation, with the double arcsine transformation, using the metafor package in R (version 4.2.2). We used the I² statistic to assess heterogeneity among studies. For studies that also provided data on sustained remission/LDA in patients who continued TNFi treatment, we pooled these results and computed relative risks and risk differences of remission/LDA between discontinuation and continuation arms, using random effects models implemented in OpenMeta (www.cebm.brown.edu/openmeta).
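The pooling steps described above can be illustrated with a minimal sketch. It is not the authors' analysis: the study counts are hypothetical, the between-study variance is estimated with the closed-form DerSimonian and Laird method as a simpler stand-in for restricted maximum likelihood, and the back-transformation uses the approximation sin²(t) rather than an exact small-sample inverse. All function names are illustrative.

```python
import math


def double_arcsine(events, n):
    """Averaged Freeman-Tukey double arcsine transform and its sampling variance."""
    t = 0.5 * (math.asin(math.sqrt(events / (n + 1)))
               + math.asin(math.sqrt((events + 1) / (n + 1))))
    return t, 1.0 / (4 * n + 2)


def back_transform(t):
    """Approximate inverse, sin^2(t); ignores the small-sample correction (assumption)."""
    t = min(max(t, 0.0), math.pi / 2)
    return math.sin(t) ** 2


def pool_proportions(studies):
    """Pool (events, n) pairs with a DerSimonian-Laird random effects model.

    DerSimonian-Laird is used here only because it has a closed form;
    the source reports restricted maximum likelihood estimation.
    """
    ys, vs = zip(*(double_arcsine(e, n) for e, n in studies))
    w = [1.0 / v for v in vs]                      # inverse-variance weights
    fixed = sum(wi * yi for wi, yi in zip(w, ys)) / sum(w)
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, ys))  # Cochran's Q
    df = len(studies) - 1
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0  # between-study variance
    w_re = [1.0 / (v + tau2) for v in vs]          # random effects weights
    pooled_t = sum(wi * yi for wi, yi in zip(w_re, ys)) / sum(w_re)
    se = math.sqrt(1.0 / sum(w_re))
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0  # I² in percent
    return {
        "pooled": back_transform(pooled_t),
        "ci_low": back_transform(pooled_t - 1.96 * se),
        "ci_high": back_transform(pooled_t + 1.96 * se),
        "tau2": tau2,
        "i2": i2,
    }


if __name__ == "__main__":
    # Hypothetical (patients in remission, patients who discontinued) per study.
    example = [(30, 60), (12, 40), (45, 70), (20, 25)]
    print(pool_proportions(example))
```

Pooling on the transformed scale stabilizes the variance of proportions near 0 or 1, which is why the transformation is applied before weighting and reversed only for reporting.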

We analyzed predictors at the time of TNFi discontinuation by comparing patients who maintained remission/LDA or not, based on the remission/LDA criterion in each study. For continuous predictors, we used mean values to compute standardized mean differences (SMD) between the groups and pooled the SMDs using DerSimonian and Laird random effects models in OpenMeta. SMDs represent the number of standard deviations by which two groups differ, with positive values indicating higher means in patients with sustained remission. For studies reporting medians, we used the methods of McGrath to estimate means [21]. For categorical predictors, we computed odds ratios for remission/LDA from reported proportions, or used the study’s reported odds ratios, and pooled these using random effects models in OpenMeta. If only hazard ratios were reported, we pooled these separately. We harmonized the direction of associations across studies so that successful discontinuation was the outcome.
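As an illustration of the two effect measures described above, the following sketch computes a standardized mean difference for a continuous predictor and an odds ratio for a categorical predictor. The inputs are hypothetical, and the specific formulas are assumptions: the source does not state whether a small-sample correction was applied to the SMD, so the uncorrected pooled-SD form is shown, and the 0.5 continuity correction for empty cells is a common convention rather than something the source reports.

```python
import math


def standardized_mean_difference(mean_rem, sd_rem, n_rem, mean_rel, sd_rel, n_rel):
    """SMD (pooled-SD form) and its variance.

    Positive values mean a higher mean in the sustained-remission group,
    matching the sign convention stated in the text.
    """
    pooled_sd = math.sqrt(((n_rem - 1) * sd_rem ** 2 + (n_rel - 1) * sd_rel ** 2)
                          / (n_rem + n_rel - 2))
    d = (mean_rem - mean_rel) / pooled_sd
    var = (n_rem + n_rel) / (n_rem * n_rel) + d ** 2 / (2 * (n_rem + n_rel))
    return d, var


def odds_ratio(rem_with, n_with, rem_without, n_without):
    """Odds ratio for remission/LDA (predictor present vs absent) with a 95% CI."""
    a, b = rem_with, n_with - rem_with
    c, d = rem_without, n_without - rem_without
    if 0 in (a, b, c, d):
        # Continuity correction for empty cells (an assumption of this sketch).
        a, b, c, d = a + 0.5, b + 0.5, c + 0.5, d + 0.5
    or_value = (a * d) / (b * c)
    se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    low = math.exp(math.log(or_value) - 1.96 * se_log)
    high = math.exp(math.log(or_value) + 1.96 * se_log)
    return or_value, low, high


if __name__ == "__main__":
    # Hypothetical RA duration (years): sustained remission vs relapse.
    print(standardized_mean_difference(5.0, 2.0, 30, 7.0, 2.0, 30))
    # Hypothetical counts: remission in 20/40 with a predictor vs 10/40 without.
    print(odds_ratio(20, 40, 10, 40))
```

Each study-level estimate and its variance would then enter a random effects model in the same way as any other effect size.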

In sensitivity analyses, we excluded studies rated as high risk of bias with the ROB2 tool, or serious or critical risk of bias with the ROBINS-I tool.

Results

Search results

Of 3035 unique articles identified in electronic searches and 2077 articles screened from secondary sources, we included 43 articles in the systematic review of the prevalence of sustained remission/LDA after discontinuation and 37 studies in the scoping review of predictors (Fig. 1). Of the 43 articles in the systematic review, 22 articles reported induction-withdrawal studies and 22 articles reported studies of maintenance TNFi discontinuation, with 1 article including both groups [22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51,52,53,54,55,56,57,58,59,60,61,62,63,64]. Data on predictors were reported in 12 induction-withdrawal articles [27, 33,34,35,36, 39, 41, 43, 65,66,67,68] and 22 maintenance discontinuation articles [44, 46, 47, 49,50,51,52, 54,55,56, 59,60,61,62, 64, 69,70,71,72,73,74,75,76].

Fig. 1 Flow diagram of study inclusion

Sustained remission/LDA in induction-withdrawal studies

These studies included 5 double-blind controlled trials [22,23,24,25,26], 1 open-label trial [27], and 16 studies in which TNFi discontinuation was observational [28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43] (Table 1 and Supplemental Table 3).

Table 1 Proportion of patients with sustained remission or low disease activity in induction-withdrawal studies

The criterion for TNFi discontinuation was DAS28 < 3.2 in 9 studies, DAS28 < 2.6 in 9 studies, and other indicators in 4 studies. The number of patients who discontinued TNFi ranged from 2 to 200 (median 34; total 1183), with larger samples in the trials. Seven studies examined etanercept, 5 examined infliximab, 5 examined adalimumab, 2 examined certolizumab, and 3 examined various TNFi. Follow-up varied from 24 to 96 weeks. Thirteen studies reported results at 37–52 weeks after TNFi discontinuation, and 6 studies reported results at 24–36 weeks. The proportion of patients with sustained remission/LDA after TNFi discontinuation varied widely (Table 1).

Remission prevalence after discontinuation

In the pooled analysis, 58% had DAS28 < 3.2 and 52% had DAS28 < 2.6 at 37–52 weeks after discontinuation, with high heterogeneity among studies (Fig. 2 and Supplemental table 4).

Fig. 2 Pooled proportions having sustained remission/low disease activity at 37–52 weeks after either discontinuation or continuation of tumor necrosis factor inhibitor treatment in induction-withdrawal studies. Circles represent the DAS28 < 3.2 outcome, squares represent the DAS28 < 2.6 outcome, and triangles represent the SDAI ≤ 3.3 outcome. Closed symbols represent tumor necrosis factor inhibitor discontinuation arms, and open symbols represent continuation arms. Error bars represent 95% confidence intervals

Only four studies reported SDAI-based results, and 40% of patients had SDAI ≤ 3.3 after discontinuation. The proportion remaining in remission/LDA was lower with more stringent definitions of remission. At 24–36 weeks after TNFi discontinuation, 36% of patients maintained DAS28 < 3.2, 73% had DAS28 < 2.6, and 12% had SDAI ≤ 3.3 (Supplemental table 4).

Sensitivity analysis and study heterogeneity

The double-blind controlled trials were rated as having a low or moderate risk of bias, while the open-label trial was rated as having a high risk of bias (Supplemental Fig. 1). Seven observational studies were judged as having a serious risk of bias (Supplemental Fig. 2). In the sensitivity analysis, pooled results were similar when only studies with low or moderate risk of bias were examined (Fig. 2 and Supplemental table 4).

We explored potential heterogeneity by disease activity, duration of RA, and study design among the 9 studies that reported DAS28 < 2.6 outcomes at 37–52 weeks. Among the six studies that required DAS28 < 2.6 at the time of discontinuation [23, 27, 28, 32, 36, 39], the proportion with DAS28 < 2.6 at follow-up 1 year later was 58% (95% CI 33, 82), compared to 42% (95% CI 20, 67) among the three studies that required DAS28 < 3.2 at TNFi discontinuation [22, 24, 26] (p = 0.42). Among the six studies in early RA [22, 23, 26,27,28, 32], the pooled proportion with DAS28 < 2.6 at follow-up was 63% (95% CI 42, 82), compared to 32% (95% CI 17, 49) in three studies in established RA [24, 36, 39] (p = 0.05). Among the five controlled trials [22,23,24, 26, 27], the pooled prevalence with DAS28 < 2.6 at follow-up was 47% (95% CI 30, 63), while among the four observational studies [28, 32, 36, 39], the pooled prevalence was 58% (95% CI 25, 88) (p = 0.84).

Retreatment

In five studies that reported on retreatment (64 patients combined) after relapse following TNFi discontinuation, 96% (95% CI 85, 100) regained remission/LDA after resuming TNFi treatment (Supplemental Table 3).

Remission prevalence with TNFi continuation in controlled studies

In the controlled studies, 85%, 73%, and 62% of patients who continued TNFi treatment maintained DAS28 < 3.2, DAS28 < 2.6, and SDAI ≤ 3.3, respectively, at 37–52 weeks’ follow-up (Fig. 2 and Supplemental table 4). In pooled analyses of controlled studies that compared those who discontinued TNFi to those who continued TNFi, the risk ratio of sustained DAS28 < 3.2 was 0.69, the risk ratio of sustained DAS28 < 2.6 was 0.58, and the risk ratio of sustained SDAI ≤ 3.3 was 0.59 (Supplemental table 5). Pooled risk differences were − 22.2%, − 27.3%, and − 18.4% for these outcomes, indicating that absolute relapses in the discontinuation group exceeded those in the paired continuation group by these amounts.
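To make the distinction between the relative and absolute measures concrete, the sketch below computes a risk ratio and a risk difference for a single two-arm study. The counts are hypothetical and are not taken from any included trial; the pooled values reported above come from random effects models across studies, which this single-study calculation does not reproduce.

```python
import math


def risk_ratio_and_difference(rem_disc, n_disc, rem_cont, n_cont):
    """Risk ratio and risk difference of sustained remission/LDA.

    Discontinuation arm is the numerator, continuation arm the reference,
    so RR < 1 and RD < 0 both indicate less sustained remission after stopping.
    Assumes no arm has zero events.
    """
    p_disc = rem_disc / n_disc
    p_cont = rem_cont / n_cont
    rr = p_disc / p_cont
    rd = p_disc - p_cont

    # Standard large-sample standard errors (log scale for the ratio).
    se_log_rr = math.sqrt(1 / rem_disc - 1 / n_disc + 1 / rem_cont - 1 / n_cont)
    se_rd = math.sqrt(p_disc * (1 - p_disc) / n_disc
                      + p_cont * (1 - p_cont) / n_cont)

    return {
        "rr": rr,
        "rr_ci": (math.exp(math.log(rr) - 1.96 * se_log_rr),
                  math.exp(math.log(rr) + 1.96 * se_log_rr)),
        "rd": rd,
        "rd_ci": (rd - 1.96 * se_rd, rd + 1.96 * se_rd),
    }


if __name__ == "__main__":
    # Hypothetical trial: 60/100 in remission after stopping, 85/100 after continuing.
    print(risk_ratio_and_difference(60, 100, 85, 100))
```

The risk difference is the quantity that corresponds to relapses attributable to discontinuation, because it subtracts the loss of remission that occurs even when treatment is continued.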

Sustained remission/LDA after discontinuation of maintenance TNFi

These studies included 3 double-blind controlled trials [44,45,46], 2 open-label trials [47, 48], and 17 studies in which TNFi discontinuation was observational [33, 49,50,51,52,53,54,55,56,57,58,59,60,61,62,63,64], including 4 registry studies [54, 55, 62, 63] (Table 2 and Supplemental table 6).

Table 2 Proportions with sustained remission or low disease activity among studies of discontinuation of maintenance tumor necrosis factor inhibitor treatment

Six studies used DAS28 < 3.2 as the criterion for discontinuation, 9 studies used DAS28 < 2.6, 2 studies used SDAI ≤ 3.3, and 5 studies used other criteria. Thirteen studies included patients treated with different TNFi. Minimum durations of remission/LDA were 3 months in 3 studies, 6 months in 9 studies, longer than 6 months in 3 studies, and unspecified in 7 studies. The number of patients who discontinued TNFi ranged from 4 to 717 (median 30; total 2142). Five studies reported outcomes at 24–36 weeks, 14 studies reported results at 37–52 weeks, and 3 studies reported outcomes at longer times.

Remission prevalence after discontinuation

In the pooled results, 48% of patients had DAS28 < 3.2 at 37–52 weeks after discontinuation, 47% had DAS28 < 2.6, and 46% had SDAI ≤ 3.3, with high heterogeneity among studies (Fig. 3 and Supplemental table 7). At 24–36 weeks after TNFi discontinuation, 85% of patients maintained DAS28 < 3.2, and 75% had DAS28 < 2.6.

Fig. 3 Pooled proportions having sustained remission/low disease activity at 37–52 weeks after either discontinuation or continuation of tumor necrosis factor inhibitor treatment in maintenance discontinuation studies. Circles represent the DAS28 < 3.2 outcome, squares represent the DAS28 < 2.6 outcome, and triangles represent the SDAI ≤ 3.3 outcome. Closed symbols represent tumor necrosis factor inhibitor discontinuation arms, and open symbols represent continuation arms. Error bars represent 95% confidence intervals

The blinded trials were rated as having a low or moderate risk of bias, while the open-label trials had high risk of bias (Supplemental Fig. 3). Nine observational studies were rated as having a low or moderate risk of bias (Supplemental Fig. 4).

Sensitivity analysis and study heterogeneity

In the sensitivity analysis, the proportions of patients with successful discontinuation among studies with low or moderate risk of bias were similar to, or somewhat lower than, the proportions among all studies (Fig. 3 and Supplemental table 7).

Examining heterogeneity by RA activity, DAS28 < 2.6 at 37–52 weeks after discontinuation was only slightly more common among studies that required DAS28 < 2.6 at enrollment [52, 57, 60] compared to studies that required DAS28 < 3.2 at enrollment [48, 50, 59] (53% (95% CI 38, 68) versus 45% (95% CI 25, 67)). All studies of maintenance treatment discontinuation examined patients with established RA. The proportion with DAS28 < 2.6 at follow-up was higher in the five observational studies [50, 52, 57, 59, 60] (53%; 95% CI 40, 66) than in the one clinical trial [48] (29%; 95% CI 25, 34) (p = 0.04).

Retreatment

Among 11 studies that reported on retreatment of relapses (360 patients combined), the pooled proportion of patients who regained remission was 86% (95% CI 71, 98) (Supplemental table 6).

Remission prevalence with TNFi continuation in controlled studies

Among patients in controlled studies who continued TNFi, 69% maintained DAS28 < 3.2, 64% maintained DAS28 < 2.6, and 53% maintained SDAI ≤ 3.3 at 37–52 weeks, although the number of studies was small (Fig. 3 and Supplemental table 7). In paired analyses of studies that reported both discontinuation and continuation arms, sustained remission/LDA was more likely among those who continued TNFi, with risk ratios that ranged from 0.47 to 0.57 (Supplemental table 8). Risk differences indicated that absolute rates of maintaining DAS28 < 3.2 were, on average, 33.4% lower with discontinuation, and of maintaining DAS28 < 2.6 were 32.1% lower with discontinuation.

Predictors of successful discontinuation in induction-withdrawal studies

Collectively, data on 18 different predictors were reported (Table 3 and Supplemental table 9) [27, 33,34,35,36, 39, 41, 43, 65,66,67,68]. However, only 8 predictors were reported by more than 3 studies, and pooling was limited because studies used different effect size measures. Older age was not predictive in studies that reported mean ages, but older age groups were less likely to have successful discontinuation in two studies that reported odds ratios [43, 65]. Mean duration of RA was shorter among patients with successful discontinuation. Longer duration of TNFi treatment prior to discontinuation was associated with lower likelihood of success.

Table 3 Predictors of sustained remission in induction-withdrawal studies and studies of discontinuation of maintenance tumor necrosis factor inhibitor (TNFi) treatment

Human leukocyte antigen (HLA) shared epitope, radiographic damage, and smoking were associated with a lower likelihood of successful discontinuation, based on one study [66]. Mean Health Assessment Questionnaire (HAQ) scores and mean disease activity scores were lower among patients with successful discontinuation. There were no associations with other predictors, including sex, seropositivity, and ultrasound measures. Serum matrix metalloproteinase-3 did not predict relapse in one study [39], while relapses were associated with lower proportions of peripheral blood naïve T cells and higher proportions of regulatory T cells in another study [33].

Few induction-withdrawal studies with low or moderate risk of bias reported on predictors (Supplemental table 10). Duration of RA was not clearly predictive in this subset.

Predictors of successful discontinuation of maintenance TNFi treatment

More information was available among these studies, with data on 17 predictors reported in more than 3 studies (Table 3 and Supplemental table 9) [44, 46, 47, 49,50,51,52, 54,55,56, 59,60,61,62, 64, 69,70,71,72,73,74,75]. Mean duration of RA was shorter among patients with successful discontinuation, as was the time to reach remission with TNFi treatment [55]. Patients treated with monoclonal TNFi tended to have more successful discontinuation than those receiving etanercept. Patients with more radiographic damage and obese patients were less likely to have successful discontinuation. Smoking, higher HAQ scores, and higher disease activity were associated with lower likelihoods of successful discontinuation only in two registry studies that reported hazard ratios [54, 55]. Higher multi-biomarker disease activity score was associated with lower odds of successful discontinuation in one study [69]. There were no associations with other variables, including length of remission, seropositivity, and ultrasound measures.

Selected laboratory biomarkers were examined in individual studies. Among 12 serum cytokines or cytokine receptors, lower levels of interleukin-2 and higher levels of soluble TNF receptor 1 at baseline predicted flare after treatment discontinuation in a small cohort [61]. Nagatani reported that relapse was associated with high serum interleukin-34, chemokine ligand-1, and interleukin-1β, and low serum interleukin-19 and interleukin-2 [64]. A low proportion of MerTK+CD206+ synovial tissue macrophages was strongly associated with the risk of flare after TNFi discontinuation [76].

Among studies with low or moderate risk of bias, successful discontinuation was less likely among patients with longer durations of RA and more radiographic damage, but was not associated with other clinical variables (Supplemental table 10).

Discussion

Discontinuation of TNFi treatment in patients with well-controlled RA has the potential to improve care by simplifying regimens, decreasing treatment-related side effects, and reducing costs, but comes with the risk of increased RA activity. Knowledge of the absolute risk of relapse is needed to inform decision-making. Because these risks, and the associated strength of evidence, may differ between short-term TNFi treatment in an induction-withdrawal strategy and discontinuation of long-term maintenance TNFi treatment, it is important to examine these risks separately. Our pooled results indicated that 58% of patients had DAS28 < 3.2 and 52% had DAS28 < 2.6 at approximately 1 year after withdrawal of induction treatment. Comparable proportions were 48% and 47% after discontinuation of maintenance TNFi treatment. Few studies reported SDAI remission or results at 24–36 weeks.

Two previous systematic reviews that included 16 and 12 studies, respectively, reported successful discontinuation in 53% and 62% of patients [7, 77]. These proportions were comparable to our findings in induction-withdrawal studies, but were higher than our results for maintenance discontinuation studies. However, these reviews pooled studies that had different criteria for remission and different lengths of follow-up, and did not distinguish between the two clinical scenarios of discontinuation, limiting the specificity of their results. In our analysis, successful discontinuation was more common in induction-withdrawal studies, which may reflect greater responsiveness in early RA. The proportion with successful discontinuation also decreased with increasing stringency of remission, particularly so for SDAI ≤ 3.3. Stratifying by the timing of responses is also important because more relapses would be expected with longer follow-up. In maintenance discontinuation studies, for example, DAS28 < 2.6 was maintained by 75% at 24–36 weeks but only 47% at 37–52 weeks. We did not observe a similar pattern in the induction-withdrawal studies, although few studies reported results at early times. These observations underscore differences by clinical scenario, outcome, and time.

Other reviews summarized discontinuation studies qualitatively [4, 8, 9, 12, 78,79,80] or included only controlled trials and focused on comparisons between discontinuation and continuation of TNFi [10, 11, 81,82,83]. In these meta-analyses, risk ratios for LDA with discontinuation ranged from 0.44 to 0.75, and risk ratios for DAS28 remission ranged from 0.45 to 0.71 [10, 80,81,82]. We focused on the absolute risks associated with discontinuation, because absolute frequencies of relapse are an important consideration in individual patient decision-making. Data on patients who continued TNFi treatment showed that, on average, 15% of patients did not maintain LDA and 27% did not maintain DAS28 remission for periods up to 1 year in induction studies, while 31% and 36% of patients who continued maintenance TNFi treatment similarly relapsed. These results provide useful context for interpreting the proportions in the discontinuation arms, highlighting that not all these relapses are necessarily attributable to TNFi discontinuation. Many would have been expected regardless of TNFi discontinuation. Risk differences assess this directly and indicate that relapses attributable to discontinuation ranged from 18 to 33%.

It is important to note that there was substantial heterogeneity among studies, even with the same design, outcome, and length of follow-up. This may be due to differences in inclusion criteria, patient selection, and depth of remission. That 15–47% of patients lose remission over 1 year despite continuing on TNFi treatment may be due to the limited specificity of these remission criteria, but also indicates that remission in RA does not indicate a cure.

Among induction-withdrawal studies, TNFi discontinuation was more successful in patients with early RA, approaching the prevalence seen in those who continued TNFi (63% versus 73%). Greater success in early RA and among patients with deeper remission has been suggested previously [13, 14, 78, 84]. In our pooled analysis, associations with a shorter duration of RA and lower disease activity were also supported by multiple studies, as were lower HAQ scores and shorter duration of TNFi treatment. RA activity and HAQ were not found to be associated with successful discontinuation in studies that dichotomized these measures, perhaps due to reduced statistical power. Age, sex, seropositivity, and methotrexate dose were not predictive of successful discontinuation in induction-withdrawal studies. There were few data on other predictors.

Among studies of maintenance TNFi treatment, discontinuation was more successful among patients with shorter RA durations and less radiographic damage, as identified previously [14]. Given that radiographic changes are cumulative, it is not clear if radiographic damage predicts the risk of relapse independent of RA duration. Shorter time to remission with TNFi treatment was also associated with successful discontinuation. Interestingly, monoclonal TNFi tended to have more successful discontinuation than etanercept. Whether this is related to patient selection or different immunological effects is unclear. We found no association with other clinical variables, including disease activity, in contrast to induction-withdrawal studies [14].

Few studies examined immunological biomarkers, and it is difficult to draw conclusions about prognostic importance based on single studies. Given the general absence of clinical predictors, it may be that immunological markers will be key to identifying which patients will be able to maintain remission after TNFi discontinuation. Although subclinical joint inflammation is common in clinical remission [85], our results did not support the prognostic value of ultrasound in studies of TNFi discontinuation. Power Doppler positivity in remission has been associated with higher odds of relapse in one study, but this study did not examine treatment discontinuation [85]. In three studies of biologic tapering, ultrasound abnormalities predicted relapse, indicating that further evaluation of the potential prognostic value of ultrasound is warranted [86,87,88]. Subclinical joint inflammation by magnetic resonance imaging (MRI) has also been observed in many patients in clinical remission, but MRI has not been found to predict relapses on biologic tapering [45, 88, 89]. We did not identify prognostic studies of MRI in the setting of TNFi discontinuation.

Our study is limited by the definitions of remission used in the primary studies, which may be considered too liberal. Few studies used SDAI remission as either the inclusion criterion or outcome, and none used American College of Rheumatology Boolean criteria. Interestingly, the more stringent SDAI criterion resulted in both lower proportions of remission and higher proportions of relapses, reflecting increased difficulty of maintaining this level of RA activity over time. We focused on TNFi discontinuation, given there are few discontinuation studies of other biologics or csDMARDs, or of tapering, and pooling results of different strategies or medications would decrease the specificity of any conclusions. We included both observational studies and controlled trials. Although several studies were judged to have a high risk of bias, results were generally similar after excluding such studies. Pooling of results in the predictor analysis was limited by the diversity of effect measures in the primary studies. We cannot exclude the possibility of publication bias, which is difficult to identify in the presence of heterogeneity [90]. We tried to minimize publication bias by using a comprehensive search strategy that included trial registrations, abstracts, and no language restrictions. We also included articles whose main objective was not to determine the prevalence of remission after TNFi discontinuation.

Conclusions

This study is the first to examine the outcomes of TNFi discontinuation separately in induction treatment and maintenance treatment. Almost one-half of patients were able to discontinue maintenance TNFi treatment and remain in remission for up to 1 year. More patients had successful discontinuation in induction-withdrawal studies, underscoring the differences in outcomes between these scenarios. In both cases, patients with shorter durations of RA were more likely to have successful discontinuation. After induction treatment with TNFi, approximately 6 in 10 patients with early RA would remain in remission for up to 1 year after discontinuation, but only 3 in 10 patients with established RA would do so. After discontinuation of maintenance TNFi treatment, approximately 5 in 10 patients would remain in remission for up to 1 year. These results may be useful in shared decision-making with patients who are contemplating treatment de-escalation. More research is needed to identify how risks of relapse vary in patient subgroups.