Introduction

Both the American College of Rheumatology (ACR) [1] and the European League Against Rheumatism (EULAR) [2] recommend assessment of disease activity as part of rheumatologists’ and nurse practitioners’ treatment decisions for patients with rheumatoid arthritis (RA). This recommendation encourages rheumatologists and nurse practitioners to accurately determine the level of disease activity at each patient visit and adjust therapy to achieve a target of low disease activity or remission. Several disease activity measures (DAMs) have been developed in attempts to quantify disease activity for use in real-world practice as well as in clinical trials to measure treatment benefit [3]. For clinical practice, ACR recommends the use of composite measures, including Disease Activity Score for 28 joints (DAS28), Clinical Disease Activity Index (CDAI), Routine Assessment of Patient Index Data 3 (RAPID3), Patient Activity Scale II (PAS II), and Simplified Disease Activity Index (SDAI), to assess the level of disease activity [4]. Each of these measures provides thresholds for remission, low/minimal, moderate, and high/severe levels of disease activity, but they differ in scale and in the components used to calculate the composite disease activity score.

The goal of treatment for RA is to achieve sustained remission or low disease activity [1, 2]. In the treat-to-target strategy recommended by ACR and EULAR [1, 2], patients should be followed at regular intervals and their treatment adjusted until the target disease activity is achieved. In our prior work using data from the Veterans Affairs Rheumatoid Arthritis (VARA) registry, we evaluated major therapeutic changes (MTC) among US Veterans with RA across a broad range of disease activity [5]. We found that more than half of the patients in the analysis did not receive a MTC despite moderate or severe disease activity. In a chart review of a subset of these patients, we found that the most common reason for no MTC was that the rheumatologist/nurse practitioner judged the RA to be under good control and thus no change in therapy was indicated. Notably, although rheumatologists and nurse practitioners collected the components needed to calculate DAS28 for each patient, those calculations were often not performed in real time during the clinic visit, so rheumatologists and nurse practitioners based treatment decisions on their clinical judgment without using DAMs. In subsequent work, we found that patients with a MTC had more frequent clinical improvement as measured by 20% improvement in ACR criteria (ACR20 response) than patients who did not have a MTC, even among patients with long-standing RA who had received multiple prior therapies in this patient population [6]. We also noted that patients with higher disease activity were more likely to receive MTC from the rheumatologist/nurse practitioner [5].

In the current analysis, we further explored the relationship between established DAM thresholds of disease activity and the rheumatologist/nurse practitioner decision to initiate a MTC. The two goals of this study were to (1) empirically determine the disease activity thresholds at which the rheumatologists and nurse practitioners were most likely to initiate MTC in the VARA population and (2) report the clinical response observed after a MTC based on such thresholds. For the first objective, the Youden Index [7] was used to determine the DAM threshold that best discriminated the decision to initiate a MTC. For the second objective, we estimated the impact of MTC on ACR20 response across 4 categories of disease activity levels as measured by DAS28, CDAI, and RAPID3: remission/low disease; low-moderate disease, which ranged from the lower bound of moderate disease to the Youden-identified threshold; high-moderate disease, which ranged from the Youden-identified threshold to the high bound of moderate disease activity; and high disease activity.

Methods

Population, data source, and study design

The study population included US Veterans enrolled in the VARA registry [7,8,9,10], a prospective, observational registry involving 11 Veterans Affairs (VA) medical centers. DAM components are recorded during routine visits using templated notes. The DAMs are extracted from medical notes stored in the VA Corporate Data Warehouse (CDW) [11] using validated extraction algorithms or data entered manually into the VARA registry database.

Patient data extracted from the CDW include pharmacy, laboratory, outpatient diagnoses, and electronic medical notes. Patient demographics, disease history, and duration of RA were collected from VARA enrollment data. Serologic samples collected to assess rheumatoid factor (RF) and anti-cyclic citrullinated peptide antibodies (ACPA) were assayed at a central laboratory on enrollment into the VARA registry. An additional chart review was performed to collect any data not identified in the CDW or VARA database.

A historical cohort design was used to compare the clinical response between patient visits with and without MTC. The unit of observation was an eligible patient visit to a rheumatology clinic during the study period (January 1, 2006, to September 30, 2017). Each eligible visit (i.e., a rheumatology visit with documented core clinical measures to compute DAS28, CDAI, and RAPID3) with ≥ 18 months of enrollment and 2 rheumatology visits with DAS28 during the previous 18 months was classified as having a MTC or no MTC. The study included a baseline measurement period (18 months before the eligible visit) to measure covariates and potential confounders and an exposure period (7 days prior to 30 days after the eligible visit) to assess if a MTC occurred. The 7-day pre-visit exposure period was selected to identify any interventions that may have occurred immediately prior to the visit (e.g., steroid dose escalation via telephone call or electronic message), and the 30-day post-visit period was designed to capture interventions that started at the visit. An outcome period (2–6 months after the eligible visit) was used to identify patients who achieved an ACR20 response (Supplemental Fig. S1).

Visit eligibility criteria

Eligible visits were identified for patients meeting the following criteria: enrolled in VARA registry, ≥ 18 years of age, rheumatology visit with all components of DAMs (DAS28, CDAI, RAPID3) documented (referenced as an eligible patient visit), ≥ 18 months of enrollment in VA health care system prior to the eligible visit, and 2 rheumatology visits with documented DAS28 scores during the 18-month baseline period ≥ 60 days apart from each other and ≥ 60 days before the eligible patient visit (to measure disease stability) (Supplemental Fig. S1). The key exclusion criteria included active cancer, organ transplant, diagnosis of other autoimmune disorders (e.g., systemic lupus erythematosus), any surgical procedure within 90 days after the eligible visit, or any hospitalization within 30 days of the eligible visit.

Youden Index and empirical decision threshold

The Youden Index is a measure of diagnostic accuracy that is used to identify optimal thresholds that discriminate a dichotomous outcome from a continuous scale [7]. The Youden Index has traditionally been used to identify optimal cut points for diagnostic tests. In this analysis, the Youden Index was used to identify the DAM value that maximized the correct classification of MTC where equal weighting was given to sensitivity and specificity.

The Youden Index (J) was calculated for each cut point/threshold (c), i.e., every value of the DAM.

$$ J(c)=\mathrm{sensitivity}(c)+\mathrm{specificity}(c)-1 $$

The goal of this analysis was to maximize J to identify the optimal cut point, where C represents the set of candidate cut points/thresholds:

$$ {c}^{\mathrm{opt}}=\arg\ {\max}_{c\in C}J(c) $$
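The search over candidate cut points can be illustrated with a minimal sketch. It assumes that a visit is classified as "MTC expected" when its DAM score is at or above the cut point c, that ties in J are resolved in favor of the lowest cut point, and that the candidate set C is every observed score; the function name and the toy data are hypothetical and are not taken from the registry.

```python
from typing import Sequence, Tuple


def youden_optimal_cutpoint(
    scores: Sequence[float], had_mtc: Sequence[bool]
) -> Tuple[float, float]:
    """Return (c_opt, J(c_opt)) over all observed score values.

    Assumption: a visit is classified as "MTC expected" when score >= c.
    """
    if len(scores) != len(had_mtc):
        raise ValueError("scores and had_mtc must have the same length")
    n_pos = sum(1 for y in had_mtc if y)
    n_neg = len(had_mtc) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("both MTC and non-MTC visits are required")

    best_c, best_j = None, float("-inf")
    for c in sorted(set(scores)):  # candidate set C: every observed DAM value
        true_pos = sum(1 for s, y in zip(scores, had_mtc) if y and s >= c)
        true_neg = sum(1 for s, y in zip(scores, had_mtc) if not y and s < c)
        # J(c) = sensitivity(c) + specificity(c) - 1
        j = true_pos / n_pos + true_neg / n_neg - 1.0
        if j > best_j:  # ties keep the lowest cut point (an assumption)
            best_c, best_j = c, j
    return best_c, best_j


# Toy data invented for illustration only
toy_scores = [2.1, 2.8, 3.5, 3.9, 4.1, 4.6, 5.3, 6.0]
toy_mtc = [False, False, False, True, False, True, True, True]
c_opt, j_opt = youden_optimal_cutpoint(toy_scores, toy_mtc)
```

Because sensitivity and specificity enter J with equal weight, the selected cut point does not favor avoiding missed MTC over avoiding unnecessary MTC, or vice versa.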

Measurements

Exposure: MTC

MTC has been previously defined [5, 6]. Briefly, a visit was associated with a MTC if (1) a new disease-modifying antirheumatic drug (DMARD) was initiated (including switching agents within the same drug class) either as a new agent or after a 90-day gap following the last date of prior therapy, (2) DMARD dose was escalated by ≥ 25%, (3) prednisone was initiated, (4) monthly average prednisone dose increased by 25%, and/or (5) intra-articular injection of ≥ 2 with corticosteroids.

Outcome: clinical improvement measured by ACR20 response criteria

An ACR20 response was defined as improvement of 20% in both tender and swollen joint counts and 20% improvement in 3 of the ACR core disease activity measures (patient assessment of pain, patient global assessment of disease activity, physician global assessment, patient assessment of physical function, and acute-phase reactant laboratory value) [12]. ACR20 response was chosen to measure the treatment effects because it is a validated and common outcome measure in clinical trials, has standardized outcome assessments across the DAMs, and can detect clinical response to treatment in a time frame consistent with routine follow-up care (~ 3 months) [13, 14]. A window of 2–6 months after the index visit to document outcomes was used to account for variability in observed visit intervals and reduce the risk of exposure misclassification due to subsequent treatment modification. If multiple visits with documented core clinical measures were observed during the follow-up period, data from the visit closest to 3 months after the index visit were used.
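The ACR20 rule described above can be expressed as a short sketch. It assumes that every component is scored so that lower values indicate less disease, that "3 of the ACR core disease activity measures" means at least 3 of the 5 listed, and that no improvement can be computed from a baseline of zero; the dictionary keys and function names are hypothetical.

```python
from typing import Dict

CORE_MEASURES = (
    "patient_pain",
    "patient_global",
    "physician_global",
    "physical_function",
    "acute_phase_reactant",
)


def pct_improvement(baseline: float, follow_up: float) -> float:
    """Fractional decrease from baseline (lower values mean less disease)."""
    if baseline <= 0:
        return 0.0  # assumption: no improvement is measurable from zero
    return (baseline - follow_up) / baseline


def acr20_response(baseline: Dict[str, float], follow_up: Dict[str, float]) -> bool:
    """True when both joint counts and >= 3 of 5 core measures improve by >= 20%."""
    joints_ok = all(
        pct_improvement(baseline[k], follow_up[k]) >= 0.20
        for k in ("tender_joints", "swollen_joints")
    )
    n_core = sum(
        pct_improvement(baseline[k], follow_up[k]) >= 0.20 for k in CORE_MEASURES
    )
    return joints_ok and n_core >= 3


# Toy visit pair invented for illustration only
index_visit = {
    "tender_joints": 10, "swollen_joints": 8, "patient_pain": 60,
    "patient_global": 50, "physician_global": 40,
    "physical_function": 1.5, "acute_phase_reactant": 20,
}
follow_up_visit = {
    "tender_joints": 7, "swollen_joints": 6, "patient_pain": 40,
    "patient_global": 45, "physician_global": 30,
    "physical_function": 1.0, "acute_phase_reactant": 19,
}
responded = acr20_response(index_visit, follow_up_visit)
```

In this toy pair, both joint counts and three core measures (pain, physician global assessment, and physical function) improve by at least 20%, so the visit counts as a response.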

Covariates: potential confounders between MTC and ACR20 response

Covariate adjustment was used to remove confounding between MTC and ACR20 response. Potential confounders included demographic characteristics, duration of RA, level of disease activity, Rheumatic Disease Comorbidity Index (RDCI) [15], disease stability [5], DMARD use at baseline, and MTC within 90 days of the eligible visit.

Standard criteria have been established to classify disease activity into remission, low, moderate, and high disease activity (Supplemental Table 1) [16,17,18]. For this analysis, we also evaluated a disease activity stratification that divided moderate disease activity at the Youden thresholds. With this method, categories of disease activity included the following: (1) remission and low disease activity for DAS28 (< 3.2), CDAI (< 10.0), and RAPID3 (< 2.0); (2) low-moderate (lower bound of moderate disease to the Youden-identified threshold) for DAS28 (3.20–4.02), CDAI (10.0–12.9), and RAPID3 (2.00–3.81); (3) high-moderate (greater than Youden-identified threshold to the high bound of moderate disease activity) for DAS28 (4.03–5.10), CDAI (13.0–22.0), and RAPID3 (3.82–4.0); and (4) high disease activity for DAS28 (> 5.1), CDAI (> 22.0), and RAPID3 (> 4.0).
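The four-category stratification can be written as a small lookup, shown here as a sketch. The cut values are those listed above; the handling of scores that fall between two reported ranges (for example, a DAS28 between 4.02 and 4.03) is an assumption, and the names `CUTS` and `disease_activity_category` are hypothetical.

```python
from typing import Dict, Tuple

# Per DAM: (upper bound of remission/low, first high-moderate value,
#           upper bound of moderate disease), as listed in the text.
CUTS: Dict[str, Tuple[float, float, float]] = {
    "DAS28": (3.2, 4.03, 5.1),
    "CDAI": (10.0, 13.0, 22.0),
    "RAPID3": (2.0, 3.82, 4.0),
}


def disease_activity_category(dam: str, score: float) -> str:
    """Map a DAM score to one of the four analysis categories."""
    low_upper, high_moderate_start, moderate_upper = CUTS[dam]
    if score < low_upper:
        return "remission/low"
    if score < high_moderate_start:  # assumption: gaps go to low-moderate
        return "low-moderate"
    if score <= moderate_upper:
        return "high-moderate"
    return "high"


example = disease_activity_category("DAS28", 4.5)
```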

Types of MTC were descriptively analyzed based on category: changes in oral prednisone (initiating medication, restarting medication after a gap, and/or increase in medication dose), intra-articular corticosteroid injections, changes in bDMARD, and changes in csDMARD.

Estimating the impact of MTC on ACR20 response

Crude (bivariate) associations between MTC and ACR20 response were represented by risk difference (RD) and risk ratio (RR) with 95% confidence intervals (CIs). Impacts of MTC on ACR20 response were further evaluated using G-computation [19, 20] for the marginal and disease activity level conditional effects. The population average generalized estimating equation (GEE) model with an exchangeable correlation structure [21] was used with the G-computation approach to account for within-patient correlation, as multiple visits per patient were possible. Since the G-computation approach allowed us to predict potential outcomes for the entire population under both treatment conditions (with MTC and without MTC), we first built a model using a complete case analysis (i.e., in visits during the follow-up window with documented core measures), and then applied this model to the full population, including those with missing data, to estimate potential outcomes for every patient. We computed 95% CIs using a bootstrapping method, in which the random sampling (1000 samples) was done with replacement [19].
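For the crude measures, a minimal sketch of RD and RR computed from a 2 × 2 table of visits is given below. The interval method is an assumption (Wald for the RD, log-scale Wald for the RR), since the text does not state how the crude CIs were computed, and these simple intervals ignore within-patient correlation; the function name and counts are hypothetical.

```python
import math
from typing import Tuple

Estimate = Tuple[float, float, float]  # (point estimate, lower, upper)


def crude_rd_rr(
    resp_mtc: int, n_mtc: int, resp_no_mtc: int, n_no_mtc: int, z: float = 1.96
) -> Tuple[Estimate, Estimate]:
    """Return RD and RR with Wald-type 95% CIs from visit counts."""
    p1 = resp_mtc / n_mtc        # ACR20 risk at visits with MTC
    p0 = resp_no_mtc / n_no_mtc  # ACR20 risk at visits without MTC

    rd = p1 - p0
    se_rd = math.sqrt(p1 * (1 - p1) / n_mtc + p0 * (1 - p0) / n_no_mtc)

    rr = p1 / p0
    # Standard error of log(RR); requires nonzero responder counts
    se_log_rr = math.sqrt(
        1 / resp_mtc - 1 / n_mtc + 1 / resp_no_mtc - 1 / n_no_mtc
    )
    return (
        (rd, rd - z * se_rd, rd + z * se_rd),
        (rr, rr * math.exp(-z * se_log_rr), rr * math.exp(z * se_log_rr)),
    )


# Toy counts invented for illustration only
(rd, rd_lo, rd_hi), (rr, rr_lo, rr_hi) = crude_rd_rr(30, 100, 10, 100)
```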

G-computation models were fit using patient age at visit, sex, race, ACPA status, RF status, disease duration, RDCI score, DAM stability (worsening or not), csDMARDs, bDMARDs, and prednisone dispensed in the month prior to visit and in the previous year, and the baseline MTC (MTC during previous 90 days). An interaction term was used to evaluate how the effect of MTC on ACR20 response was modified by different levels of disease activity. The marginal effect (overall effect) was produced by averaging the differences between the potential outcomes under MTC and the potential outcomes under no MTC, accounting for the fact that treatment effects vary across disease activity levels.
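The G-computation logic (fit an outcome model on complete cases, predict both potential outcomes for every visit, average the contrast, and bootstrap the result) can be sketched as follows. This is a deliberately simplified illustration, not the model used in the analysis: it replaces the covariate-adjusted GEE with stratum-specific response proportions (a saturated model in disease activity category and MTC), and it resamples patients rather than visits so that within-patient clustering is preserved. All names and data are hypothetical.

```python
import random
from collections import defaultdict
from typing import Dict, List, Optional, Tuple

# (patient_id, disease activity stratum, mtc 0/1, acr20 0/1 or None if missing)
Visit = Tuple[str, str, int, Optional[int]]


def fit_outcome_model(visits: List[Visit]) -> Dict[Tuple[str, int], float]:
    """Saturated outcome model: P(ACR20 | stratum, MTC) from complete cases."""
    counts = defaultdict(lambda: [0, 0])  # (stratum, mtc) -> [responders, n]
    for _, stratum, mtc, y in visits:
        if y is not None:
            counts[(stratum, mtc)][0] += y
            counts[(stratum, mtc)][1] += 1
    return {cell: r / n for cell, (r, n) in counts.items()}


def g_computation(visits: List[Visit]) -> Tuple[float, float]:
    """Marginal (RD, RR): predict both potential outcomes for every visit."""
    model = fit_outcome_model(visits)
    p1 = [model[(s, 1)] for _, s, _, _ in visits]  # every visit set to MTC
    p0 = [model[(s, 0)] for _, s, _, _ in visits]  # every visit set to no MTC
    mean1, mean0 = sum(p1) / len(p1), sum(p0) / len(p0)
    return mean1 - mean0, mean1 / mean0


def bootstrap_ci(
    visits: List[Visit], n_boot: int = 1000, seed: int = 0
) -> Tuple[float, float]:
    """Percentile 95% CI for the RD, resampling patients with replacement."""
    by_patient = defaultdict(list)
    for v in visits:
        by_patient[v[0]].append(v)
    ids = sorted(by_patient)
    rng = random.Random(seed)
    rds = []
    for _ in range(n_boot):
        sample = [v for pid in rng.choices(ids, k=len(ids)) for v in by_patient[pid]]
        try:
            rds.append(g_computation(sample)[0])
        except (KeyError, ZeroDivisionError):
            continue  # replicate lacked a stratum/MTC cell; skipped in this sketch
    rds.sort()
    return rds[int(0.025 * len(rds))], rds[int(0.975 * len(rds)) - 1]


# Toy visits invented for illustration only; None marks a missing outcome
toy_visits: List[Visit] = [
    ("a", "low", 1, 0), ("a", "low", 0, 0),
    ("b", "low", 1, 1), ("b", "low", 0, 0),
    ("c", "low", 0, 1), ("c", "low", 0, None),
    ("d", "high", 1, 1), ("d", "high", 0, 0),
    ("e", "high", 1, 1), ("e", "high", 0, 1),
    ("f", "high", 1, 0), ("f", "high", 1, None),
]
marginal_rd, marginal_rr = g_computation(toy_visits)
ci_low, ci_high = bootstrap_ci(toy_visits, n_boot=200)
```

Visits with a missing outcome do not contribute to the fitted proportions but still receive predictions, which mirrors the step of applying the complete-case model to the full population. Stratum-specific (conditional) effects follow from the same fitted proportions by contrasting the two predictions within a single stratum.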

The probability of ACR20 response was shown to be independent of follow-up month when conditioning on MTC and levels of disease activity [6]; we therefore did not adjust for the follow-up interval in our ACR20 response model and used the G-computation models to estimate the population-level effects under the assumption of no loss to follow-up.

Descriptive statistics included the number of observations and percentages for dichotomous and categorical variables, and the number of observations, means, standard deviations (SDs), and 95% CIs [11] for continuous variables. We used several statistical software packages for these analyses, including Microsoft SQL Server, SAS version 9.4, and Enterprise Guide version 7.1. Data preparation and statistical analyses were conducted using Stata 14.

Results

Study visits

The study population included 1776 patients with 12,094 eligible visits; among these, 9017 visits were without a MTC and 3077 visits had a MTC. Of the 12,094 eligible visits, 7322 were complete cases (had data to calculate DAS28 during the follow-up window) and 4722 visits had missing data. The median number of visits per patient was 5 (interquartile range 3 to 10). Of the 1776 patients included, 588 (33.1%) patients never experienced a MTC, 1050 (59.1%) patients were in both the MTC and non-MTC groups, and 138 (7.8%) patients were in the MTC group only during the study period. A total of 651 patients did not meet the inclusion criteria and were excluded from the analysis.

Patient demographic and clinical characteristics

Compared with patients at visits without MTC, patients at visits with MTC were younger, were less often of Caucasian race, and had a shorter duration of RA. The percentage of males, RF-positive status, ACPA-positive status, and disease stability were similar between visits with and without MTC (Table 1). Patients at visits with MTC had lower percentages of recent (within the past month) and established (within the past year) bDMARD and csDMARD dispensing episodes, but had a higher percentage of baseline use of prednisone compared to patients at visits without MTC. Patients at visits with MTC were also less likely than patients at visits without MTC to have had a MTC during the 90 days prior to the eligible visit.

Table 1 Demographic and clinical characteristics

Youden Index and empirical thresholds

Empirical thresholds for MTC based on the maximum Youden Index and 95% CI were 4.03 (3.70–4.36), 12.9 (10.4–15.4), and 3.81 (3.32–4.30) for DAS28, CDAI, and RAPID3, respectively (Fig. 1, Supplemental Fig. S2).

Fig. 1 Youden Index and empirical thresholds. ROC curves of disease activity measure thresholds for the likelihood to initiate MTC and empiric thresholds are shown. Empirical optimal thresholds with 95% CI, Youden indices, sensitivity and specificity at each threshold, and the AUC for the ROC curves are reported in the table. 95% CI, 95% confidence interval; AUC, area under the curve; DAS28, Disease Activity Score for 28 joints; CDAI, Clinical Disease Activity Index; MTC, major therapeutic change; ROC, receiver operating characteristic; RAPID3, Routine Assessment of Patient Index Data 3

Types of MTC

The most common type of MTC across all DAMs and levels of disease activity was changes to csDMARD (Supplemental Table S2). For visits with remission and low, low-moderate, and high-moderate disease activity, the second most common type of MTC was changes to oral prednisone, and for patients with high disease activity, it was changes to bDMARD therapy or to oral prednisone.

Description of MTC and frequency of ACR20 response

In the crude analysis (complete cases only), visits with remission or low disease activity were generally not associated with MTC, and visits with MTC had low rates of ACR20 responses in these patients (Tables 2 and 3). Approximately one fourth to one third of visits with high disease activity with MTC were associated with an ACR20 response (range 22.8–33.5%) regardless of DAM. MTC was generally associated with an increased prevalence of ACR20 response across all strata, but not all stratum-specific estimates reached statistical significance. When the moderate disease activity category was divided into low-moderate and high-moderate disease based on the Youden threshold, there was a difference in the risk of ACR20 response between the two groups, but the corresponding results on RD did not show a statistically significant difference.

Table 2 Crude descriptive analysis of the frequency of MTC and ACR20 response by DAM and disease activity category for visits with ACR20 responses
Table 3 Crude descriptive analysis of the frequency of MTC and ACR20 response by DAM and disease activity category combined with empirical threshold

Marginal and conditional model-based effects of MTC on ACR20 response

In the marginal analysis, the model developed on the complete cases was applied to the full population (12,094 visits). A visit with MTC was associated with a statistically significantly greater probability of ACR20 response across all DAMs: RRs for ACR20 response for visits with MTC vs without MTC ranged from 1.2 to 2.6 across DAMs; RDs ranged from 0.2 to 14.5% (Table 4). The stratum-specific effects varied by DAM. MTC was strongly associated with ACR20 response in categories of high disease activity across all DAMs, and again, a marked difference was observed in the percentage of visits with an ACR20 response and RD between the low-moderate and high-moderate disease activity groups separated by the Youden threshold. In all disease activity categories, MTC consistently showed improvements in ACR20 response in both RD and RR, but not all strata reached statistical significance, which is likely due to the smaller sample size of the study population.

Table 4 Adjusted marginal (overall) and conditional (severity category) effects of MTC on ACR20 response for the full population, with the missing ACR20 responses imputed through a causal prediction method

Discussion

In this observational study of US veterans enrolled in the VARA registry, we found that visits with a MTC were associated with an increased likelihood of ACR20 response during the 2–6 months after a MTC was initiated across all DAMs evaluated. However, there was a much greater RD for ACR20 response in patients with disease activity levels above the Youden threshold, as these patients had the greatest potential for response. Disease activity level establishes the indication for MTC and was the strongest predictor of MTC in patients with active disease [5]. The level of disease activity was predicted to act as an effect modifier, as RA patients with stable high disease activity may not respond to treatment, whereas patients with low disease activity or in remission may not receive additional benefit from treatment modification. As we have previously reported [5], clinical characteristics of patients with MTC differed from patients without MTC. Patients with MTC tended to have higher swollen/tender joint counts, greater pain scores, and more severe disease based on patient and physician global assessments of disease. After adjusting for disease activity scores, no clinical or administrative characteristics were predictive of MTC in multivariable regression models in this study population [5].

Disease activity in RA was originally defined as either high (when the rheumatologist/nurse practitioner decided treatment was needed) or low (when the rheumatologist/nurse practitioner decided the patient was in remission, and treatment was no longer needed) [22]. The original response criteria were established using these clinical observations [23]. The DAS28 was designed to measure RA disease activity and assess the treatment response in clinical trials. The development of the instrument was based on statistical analyses of a cohort of patients attending an outpatient clinic at the University Hospital of Nijmegen, and distinguished thresholds for low, moderate, and high levels of disease activity [24]. Current guidelines recommend MTC for patients with moderate and high disease activity [1, 2]. Our prior work and other studies suggest that the thresholds for levels of disease activity are not well aligned with rheumatologists’ and nurse practitioners’ current opinions of disease activity in real-world practice [5, 25, 26].

We used the Youden Index to determine how rheumatologists and nurse practitioners discriminated a need for MTC in a population of VARA patients. We identified disease activity scores for each DAM that maximized the discriminant ability for the decision to initiate or not initiate a MTC. The decision to initiate MTC was based on the rheumatologists’ and nurse practitioners’ clinical judgment and assessment of insufficient response. The empiric threshold for each DAM fell within the range of moderate disease. The empiric threshold for DAS28 was 4.03 (range of moderate disease per standard definition ≥ 3.2 to ≤ 5.1), for CDAI was 12.9 (range of moderate disease 10.1 to 22.0), and for RAPID3 was 3.81 (range of moderate disease 2.01 to 4.0) (Supplemental Fig. S2).

Rheumatologists and nurse practitioners consistently judged insufficient response at higher thresholds on the DAM scales than the lower bound of moderate disease, which is the threshold for insufficient response recommended by ACR guidelines. RA is a heterogeneous disease, and the trajectory of patients with moderate disease activity varies. In an analysis of data from the Corrona registry, the probability of moving from moderate to low disease between clinic visits was 47%, and from moderate to severe disease, it was 18%, whereas over 35% of patients remained in moderate disease after 6 months [27]. There are currently no reliable methods to predict which trajectory an individual patient’s disease may take, but our results show that patients across the spectrum of disease severity, including those with moderate disease by any DAM definition, can achieve significant clinical benefit from a MTC. Patients with low-moderate disease activity based on the Youden thresholds may therefore represent missed opportunities for clinical improvement with a MTC. In this study, we also observed clinical benefit (ACR20 response) in 7% of the non-MTC group, much as some patients receiving placebo in clinical trials show significant clinical improvement. However, ACR20 response was much less frequent in the non-MTC group than in the MTC group, which demonstrates the clinical benefit of the MTC. While we do not have a complete explanation for the ACR20 response in patients without MTC, we postulate that the reasons for the change in clinical activity are similar to those seen in placebo-controlled trials, which include disease variability and regression to the mean.

The strengths of this work are that all the patients had a rheumatologist-confirmed diagnosis of RA, the study population was nationwide, and clinical and pharmacy data were collected through a uniform medical record and data collection system. Additionally, patients receiving treatment in the VA system benefit from the reliability of prescribing and ready access to medications.

This study cohort comprises US veterans who are predominantly male, with longer disease duration, older age, and more comorbidities than other RA groups, and these results may not be generalizable to the general RA population. These patients with longstanding RA have likely cycled through multiple therapies, yet still realized a clinical benefit with a MTC. A similar analysis in a cohort of newly treated patients with shorter disease duration may show a larger impact with MTC, as rheumatologists and nurse practitioners may be more likely to target remission in patients with shorter duration of RA [28]. This analysis was population-based, whereas rheumatologists and nurse practitioners make decisions for individual patients. For patients with longstanding RA, rheumatologists and nurse practitioners may believe that moderate disease represents the most achievable or best possible outcome based on their understanding of a particular patient; those patients would appear to be undertreated in our analysis.

In summary, these data demonstrate the importance of evaluating real-world clinical data in the assessment of DAM thresholds and provide insight into how DAMs may be best applied in clinical practice. While it is evident that clinical improvement can occur at any level of disease activity, the greatest benefit is seen in patients with relatively higher disease activity, and the level of disease activity defined by our Youden-level assessment provides a good threshold for consideration in clinical practice. Further work will be needed to determine if guidelines should be adjusted to include these findings in directing the treatment of RA.

Conclusions

This work demonstrated that MTC was associated with clinical improvement across all DAMs, with the greatest change observed in patients with RA disease activity above the Youden threshold identified. Thus, the Youden level may be an important measure for consideration while making individualized treatment decisions.