Introduction

Rheumatoid arthritis (RA) is a chronic inflammatory disease of unknown etiology in which joint damage and physical disability are major adverse outcomes resulting in poor quality of life and premature mortality. Successful treatment of RA attains a state of low disease activity and/or clinical remission in order to prevent progressive joint destruction and maintain functional capabilities. Three major classes of disease-modifying anti-rheumatic drugs (DMARDs) have demonstrated efficacy in RA: conventional synthetic DMARDs (csDMARDs, e.g., methotrexate, leflunomide), targeted synthetic DMARDs (tsDMARDs, e.g., tofacitinib), and biological agents (bDMARDs, e.g., etanercept, adalimumab, infliximab) [1]. An international rheumatology task force has recommended a Treat-to-Target approach, in which defined measures of clinical disease activity guide treatment toward remission or a low disease activity state [2]. Both the American College of Rheumatology (ACR) and the European League Against Rheumatism (EULAR) recommend that the management of RA include systematic, longitudinal, and quantitative assessment of RA disease activity [3, 4].

There are many measures available to assess RA disease activity. These include several “composite” disease activity indices primarily composed of “clinical phenotypes,” tender and swollen joint counts (TJC and SJC, respectively), physician global assessment, and patient global assessment. Commonly used composite measures include the clinical disease activity index (CDAI) [5], the disease activity score (DAS) with acute phase reactants included (DAS28-ESR, DAS28-CRP) [6], a simplified disease activity index (SDAI) [7], and the Routine Assessment of Patient Index Data 3 (RAPID3) [8]. Another validated measure of RA disease activity is the multi-biomarker disease activity (MBDA) score, which is an objective molecular measure based on an algorithmic assessment of 12 serum biomarkers [9, 10]. Most RA disease activity measures categorize the amount of disease activity and can be used to inform treatment decisions at a single point in time. For example, the MBDA score is reported on a scale of 1–100 with categories for high (> 44), moderate (30–44), and low (< 30) disease activity. Patients in the high or moderate categories are considered to have active disease and may be appropriate for treatment intensification. For use in a Treat-to-Target approach, clinicians must be able to determine whether changes in a score over time are representative of a true change in disease activity. This requires that the short-term variability of disease activity measures be well characterized.

Short-term variability is indicative of inter-observer precision for composite disease activity indices or laboratory and/or biologic variability of biomarkers for molecular measures of disease activity. Only changes in scores that are greater than the short-term variability of a disease activity measure can be reliably correlated with true clinical changes in order to inform treatment decisions. Previous studies have evaluated the short-term variation in many composite clinical measures of disease activity in patients with clinically stable disease [11,12,13]. In addition, the laboratory variability of the MBDA score has been well characterized, including pre-analytical effects of blood sampling/handling, precision of the biomarker assay, and precision of the MBDA score [14, 15]. Although these previous studies have shown that this laboratory variation is minimal, the biologic variability of the individual biomarkers included in the MBDA score has not been assessed. For patients with RA, previous reports have shown short-term biological variations in individual biomarkers that appear to have exaggerated circadian rhythms or diurnal variation such as IL-6 [16, 17]. Determining the short-term biologic variability is therefore critical to characterizing the minimally important difference (MID) of the MBDA score in order to determine whether changes in score represent meaningful differences in a patient’s disease activity.

In this study, we determined the MID for the MBDA score. This was done by evaluating short-term biological variation of the MBDA score over a 24-h period (diurnal) and from day to day (daily) in RA patients on stable DMARD therapy. In order to assess the MID in patients with clinically active disease for whom clinicians are most likely to need guidance with respect to therapeutic decisions, a subgroup analysis was performed in patients with moderate to high disease activity at baseline.

Materials and methods

Study design and enrollment criteria

This prospective observational study was conducted at a single rheumatology clinical research center—the Altoona Center for Clinical Research (Duncansville, PA, USA). Serum samples were collected at multiple time points over 4 days. This time period allowed for diurnal variations in MBDA score to be evaluated over the first 24 h and daily variations to be evaluated over a period of multiple successive days.

Inclusion criteria were as follows: RA diagnosed using the ACR 2010 Revised Classification Criteria [18], age 21 to 80 years, positive blood test(s) for rheumatoid factor (RF) and/or anti-cyclic citrullinated peptide antibodies (CCP or ACPA), and clinically stable RA, defined as having received csDMARD, tsDMARD, and/or bDMARD treatment for more than 8 weeks with no medication change in the 4 weeks prior to enrollment. Non-DMARD medication use (e.g., glucocorticoids, NSAIDs) was also recorded at screening. There was no restriction on NSAID use, and corticosteroid use was allowed where the dose did not exceed 10 mg/day with no change in the 4 weeks prior to enrollment. Patients were screened up to 15 days in advance of enrollment, and screening results were provided to the investigator 10 days prior to enrollment. Patients with low, moderate, or high clinical disease activity as defined by CDAI, SDAI, or DAS28-ESR were included in the study [5, 7, 19]. Subject participation was timed so that subjects receiving parenteral biologic therapy would not have a scheduled dose during the 4-day study period.

Patients were excluded if they had a history of or current inflammatory joint disease other than RA or other systemic autoimmune disorders, known active or chronic infection of any kind, concomitant malignancies, or malignancies in the previous 4 years. Patients receiving tocilizumab or those taking opioids in the 7 days prior to and during the study visit (with the exception of subjects with high disease activity) were also excluded. Patients taking opioids were excluded in order to avoid confounding of clinical and patient-reported outcomes regarding pain, which might be masked by opioid use at the time of the clinical assessment of disease activity. Eligible subjects were enrolled by the rheumatologist study investigator. The protocol received Investigational Review Board approval, and all patients provided informed consent.

Serum collection and biomarker measurement

Patients were admitted to the research center for an overnight stay on day 1. Serum samples were collected at 8 a.m., 12 p.m. (noon), 4 p.m., 8 p.m., and 12 a.m. (midnight) on day 1, and at 8 a.m. and 12 p.m. (noon) on day 2. Patients were released after the noon serum sample was collected on day 2 and returned to the site for serum sample collection at a single time point (8 a.m.) on day 3 and day 4. All 8 a.m. samples were non-fasting. A schedule of sample collection is presented in Fig. 1.

Fig. 1 Study design. Blood sample collection schedule indicating time points for diurnal variation analysis and daily variation analysis for all patients. For the subgroup analysis of moderate to high MBDA score patients, the 12 a.m. (midnight) time point was excluded.

Clinical data collected included patient global assessment of pain activity, patient global assessment of disease activity, physician global assessment, swollen/tender joint examination, CDAI, RAPID3, and Health Assessment Questionnaire (HAQ). Clinical disease activity measurements were collected daily at each 8 a.m. sample collection.

Samples were collected as previously described for the MBDA test (Crescendo Bioscience, South San Francisco, CA, USA) [14]. Briefly, serum was collected in 4 mL BD SST™ transport tubes, mixed by gently inverting 4–5 times, allowed to clot upright at room temperature for 30–45 min, and then centrifuged for 15 min at 1000 to 1300 RCF in a swing bucket centrifuge at room temperature. Centrifuged BD SST Transport tubes were refrigerated (2–8 °C) for up to 78 h following collection, then shipped in a designated temperature-controlled container to the testing laboratory and frozen at − 80 °C until analysis.

Single archived, de-identified frozen serum samples were tested in random order at the testing laboratory (Crescendo Bioscience, South San Francisco, CA). The laboratory is certified under the CMS Clinical Laboratory Improvement Amendments and accredited by the College of American Pathologists. Biomarkers were measured by electrochemiluminescence-based multiplex immunoassays on the Meso Scale Discovery Multi-Array platform (Meso Scale Discovery, Bethesda, MD, USA) [15].

Twelve biomarkers were measured in serum samples as previously described [15] and included C-reactive protein (CRP), epidermal growth factor (EGF), leptin, interleukin 6 (IL-6), matrix metalloproteinase-1 (MMP-1), matrix metalloproteinase-3 (MMP-3), resistin, serum amyloid A (SAA), tumor necrosis factor receptor-1 (TNFR-I, TNFRSF1A), vascular cell adhesion molecule-1 (VCAM-1), vascular endothelial growth factor A (VEGF-A), and human cartilage glycoprotein-39 (YKL-40). The MBDA score was calculated for each sample using a validated algorithm [9, 10, 20]. MBDA scores were categorized as low (< 30), moderate (30–44), or high (> 44) [10].

Statistical analysis

Twenty-eight patients were enrolled in the study to investigate diurnal and daily variation. The mean and standard deviation (SD) of the MBDA score were calculated for the patients in each MBDA category based on their scores at baseline (day 1, 8 a.m.) and over the next three consecutive days. Daily and diurnal variation were estimated using a linear random-effects model in which patient, day, and time of day were included as predictors of MBDA score. The daily, diurnal, and unexplained variance components were summed to calculate the total variance of the MBDA score. In the subgroup analyses for the patients with a moderate to high MBDA score at baseline, the 12 a.m. (midnight) time point was excluded, as serum sampling in routine clinical practice would not be expected to occur in the middle of the night. MID was calculated as \( {z}_{0.95}\sqrt{2\times \mathrm{total}\ \mathrm{variance}\ \mathrm{of}\ \mathrm{MBDA}} \), where \( z_{0.95} \) is the standard normal deviate corresponding to the 95th percentile (90% confidence interval) [21]. The MID is the 90% upper confidence limit for the standard deviation of MBDA scores that are different due to chance (short-term variability, including laboratory variability), such that 90% of patients whose MBDA score changes by less than the MID have no change in disease activity. This threshold was selected in order to minimize the proportion of patients with a true change in RA disease activity whose change in MBDA score is interpreted as short-term variability (i.e., false negatives). The MID represents the largest expected change in MBDA score due to chance. All analyses were performed in R (version 3.2.4).
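The MID formula can be illustrated with a short calculation. The following is a minimal sketch in Python, not the R code used in the study. It assumes that the total variance is the simple sum of the daily, diurnal, and residual variance components. The function names and the example component values are hypothetical and serve only to show the arithmetic.

```python
from math import sqrt
from statistics import NormalDist


def total_variance(daily_var, diurnal_var, residual_var):
    """Sum the variance components (assumed additive) into a total variance."""
    return daily_var + diurnal_var + residual_var


def minimally_important_difference(total_var, percentile=0.95):
    """MID = z_p * sqrt(2 * total variance of the score)."""
    z = NormalDist().inv_cdf(percentile)  # standard normal deviate, ~1.645 at 0.95
    return z * sqrt(2.0 * total_var)


# Hypothetical variance components (illustrative values, not study estimates)
example_total = total_variance(daily_var=4.0, diurnal_var=6.0, residual_var=6.0)
example_mid = minimally_important_difference(example_total)
print(f"total SD = {sqrt(example_total):.1f}, MID = {example_mid:.1f}")
```

The factor of 2 under the square root reflects that a change score is the difference of two measurements, each of which carries the full short-term variance.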

Results

Patient characteristics

Baseline demographic data for the 28 eligible, enrolled patients are presented in Table 1. All categories of MBDA disease activity scores were represented at baseline with 6 patients (21.4%) having a low MBDA score, 13 patients (46.4%) having a moderate MBDA score, and 9 patients (32.1%) having a high MBDA score. The median age of patients was 66 years, 64% of patients were female, and 100% of patients were Caucasian (Table 1).

Table 1 Baseline clinical characteristics

All patients maintained stable DMARD therapy throughout the study. Ninety-three percent of patients were receiving methotrexate (MTX) therapy, 29% were receiving biologic therapy, and none were taking prednisone or other glucocorticoids. The median swollen and tender joint counts were both 4.5. Median CDAI (15.8), RAPID3 (6.5), and MBDA (39.5) scores corresponded to moderate disease activity; however, all three levels of disease activity were represented as measured by baseline MBDA scores. No patients received infusions or injections of their biologic agent during the conduct of the study. The mean and SD of the clinical measures over time are presented in Fig. 2.

Fig. 2 Daily variation of clinical measures (mean, standard error)

Daily effects on MBDA score

The mean and SD of the MBDA score for the patients in each MBDA category at each time point are presented in Fig. 3. The means of the MBDA score were stable within the moderate and high categories, but showed more fluctuation within the low MBDA category over time.

Fig. 3 Daily variation by baseline MBDA disease activity category

Diurnal effects on MBDA score

Diurnal variation of the MBDA score was evaluated by obtaining serum samples at seven time points in a 24-h period. The mean MBDA score and SD were calculated for the patients in each MBDA category based on their scores at the different time points. For graphical purposes, both the diurnal (within a day) and the daily (8 a.m. for each of 4 days) observations for the MBDA score for all three disease activity categories are presented in Fig. 4.

Fig. 4 Daily and diurnal variation by baseline MBDA category

MBDA scores throughout the day remained constant; however, scores began to increase in the evening, particularly at 12 a.m. and then returned to their daily average. This was readily apparent in all three disease activity categories, but was most pronounced in the low disease activity group (Fig. 4). Inspection of the concentrations of the individual biomarkers indicates the sources of the diurnal variance observed in the MBDA score (Fig. 5). Both IL-6 and leptin concentrations increased while EGF concentrations diminished. As EGF is inversely correlated with disease activity while IL-6 and leptin are positively correlated, an increase in the MBDA score would be expected.

Fig. 5 Diurnal and daily variation of individual biomarkers in clinically stable patients

Determination of MID

A linear model of the MBDA score, including random effects for patient, day, and time of day at all time points, reflects the changes in the MBDA score over time. In a combined daily-diurnal variation analysis including all patients, the total SD of MBDA scores was 4.7 and the MID was 11. In the subset analysis of patients with active RA (moderate/high disease activity categories, n = 22), the total SD of MBDA scores was 3.6, and the MID was therefore 8 MBDA units. When the MBDA scores were adjusted for age, sex, and adiposity [22], the MID remained unchanged at 8 units.

Discussion

Treat-to-Target approaches have demonstrated efficacy in RA treatment [2]; however, this approach requires RA disease activity to be accurately quantified in order to differentiate significant response to therapy from loss of therapeutic effect. Early in the understanding of RA pathogenesis, only clinical phenotypes were available for discerning the extent of disease activity including estimation of the number of swollen joints or tender joints and the physician’s overall global assessment. However, formal joint counts are often not performed in routine clinical practice settings [23,24,25] and have been shown to be poorly reproducible [26, 27]. Although some composite indices incorporate clinical and biomarker estimates of disease activity, these measures (CRP, erythrocyte sedimentation rates) are normal in up to 70% of patients with active disease [28] and lack the sensitivity to accurately detect and quantitate disease activity. Ultimately, none of these composite clinical measures have been proven to predict further joint damage for individual patients and continued structural damage has been observed in patients in clinical remission as determined by these indices [29]. In contrast, the MBDA test was developed to quantitate molecular disease activity in order to account for the heterogeneous biologic phenotypes that drive RA disease activity [15]. This objective measure of disease activity has been shown to predict permanent joint damage based on radiographic progression [30,31,32,33].

Determining underlying short-term biologic variability of the MBDA score is critical for appropriate use of the test in clinical practice. This allows clinicians to distinguish true changes in RA disease activity from variability within the measure. Here, we report on the MID of the MBDA score, which by design incorporates both sample handling and assay variability as well as daily and diurnal biologic variability of the biomarkers.

The daily and diurnal variability of the MBDA score in all patients was low (SD, 4.7 score units), and the corresponding MID was 11 units. This indicates that for patients in any MBDA disease activity category (low, moderate, high), an MBDA score change of at least 11 is necessary to be considered clinically significant. However, RA is likely well controlled in patients with low disease activity, and relative changes in MBDA score are unlikely to prompt a change in treatment regimen. In contrast, patients with moderate to high MBDA scores are often either initiating or not responding to DMARD therapy. As such, this subset of patients represents those with clinically active disease for whom clinicians are most likely to need guidance in therapeutic decision making. Among patients with moderate or high disease activity, the daily-diurnal variability (SD) was 3.6 units and the MID was 8. In this patient population, a change in MBDA score greater than or equal to 8 corresponds to a true change in disease activity.

The difference between the MID for all patients (low, moderate, or high disease activity) and the MID for those with active RA (moderate or high disease activity) can be explained by the diurnal behavior of the individual biomarkers. Three biomarkers, IL-6, leptin, and EGF, vary in concentration throughout the day (Fig. 5): IL-6 and leptin peak at night, whereas EGF reaches its nadir at night. Rising IL-6 and leptin concentrations temporarily increase the MBDA score, and because EGF is inversely correlated with RA disease activity, its nighttime decrease would also be expected to increase the score. Similar diurnal variations for these biomarkers have been previously described [16, 17, 34,35,36]. At higher disease activity levels, the other nine biomarkers contribute to the high score, lessening the total impact of the diurnal variation of IL-6, EGF, and leptin, whereas at low disease activity levels, all of the diurnal increase in the MBDA score results from these three biomarkers, which are known to follow a diurnal pattern.

The MID of 8 enables clinicians to identify which patients with active RA (moderate or high disease activity) demonstrate a meaningful decrease in disease activity (MBDA score decrease ≥ 8), no change in disease activity (MBDA score change of < 8), or an increase in disease activity (MBDA score increase ≥ 8). This is clinically relevant, as the MBDA score is a continuous variable in which significant changes in disease activity can be observed without any corresponding change in the disease category. For example, an MBDA score decrease from 65 to 50 does not result in a change in disease activity category; however, this change is greater than the MID and would represent a true decrease in disease activity. Conversely, an MBDA score change from 46 to 43 represents a change in category (high to moderate) but does not represent a true decrease in disease activity (change < 8). In the Treat-to-Target approach, the ability to distinguish true longitudinal changes in disease activity that exceed measurement error and short-term biologic variability is critical for clinicians to make appropriate treatment decisions and improve therapeutic response.
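The two examples in the preceding paragraph can be reproduced with a small helper. This is an illustrative sketch only: the function names are hypothetical, the category cut points are those stated in the Methods, and the MID of 8 applies to patients with moderate or high MBDA scores at baseline.

```python
MID_ACTIVE_RA = 8  # MID reported for patients with moderate/high baseline scores


def mbda_category(score):
    """Categorize an MBDA score: low (< 30), moderate (30-44), high (> 44)."""
    if score < 30:
        return "low"
    if score <= 44:
        return "moderate"
    return "high"


def interpret_change(baseline, follow_up, mid=MID_ACTIVE_RA):
    """Compare a score change with the MID and note any category change."""
    delta = follow_up - baseline
    if delta <= -mid:
        interpretation = "decrease in disease activity"
    elif delta >= mid:
        interpretation = "increase in disease activity"
    else:
        interpretation = "no change in disease activity"
    return {
        "delta": delta,
        "interpretation": interpretation,
        "category_changed": mbda_category(baseline) != mbda_category(follow_up),
    }


# The two worked examples from the text: 65 -> 50 and 46 -> 43
for baseline, follow_up in [(65, 50), (46, 43)]:
    print(baseline, follow_up, interpret_change(baseline, follow_up))
```

Keeping the category label and the MID comparison as separate outputs makes explicit that a category change and a true change in disease activity are distinct questions.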

While short-term variability for laboratory-based measures is expressed as the MID, the variability of composite clinical measures has been evaluated using several metrics. These measures of variability are well established in the literature and are used to identify true changes in disease activity. It is noteworthy that the MID for the MBDA score reported here is comparable to the short-term variability reported for other measures of RA disease activity. van Gestel et al. estimated a measurement error of 0.6 DAS units using inter-period correlation matrix analysis and discriminant validity [13], so that a change from baseline of 1.2 units is required to exceed measurement error (DAS range, 1–10 units). Heegard et al. [11] determined a least significant difference (LSD = 1.96 × SD) of 0.8 DAS units for DAS28-CRP from 30 patients with clinic visits 1 week apart (DAS28-CRP range, 1–9.6 units). Because the MBDA score was initially developed by correlation with the DAS, it can be converted to the same scale [15]. The MID of 8 for patients with high or moderate disease activity corresponds to 0.66 DAS28-CRP units, representing comparable or improved short-term variability relative to the DAS.
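One rough way to compare these thresholds across scales is to express each as a fraction of its scale range. The sketch below does this using only the values quoted in the preceding paragraph. Normalizing by range is an assumption made here for illustration and is not the conversion used by the authors; the metrics also differ in definition (for example, 1.645 versus 1.96 standard normal deviates), so the result is only an approximate comparison. The names used are hypothetical.

```python
# (threshold, scale minimum, scale maximum), as quoted in the text
thresholds = {
    "MBDA MID (active RA)": (8.0, 1.0, 100.0),
    "DAS change (van Gestel et al.)": (1.2, 1.0, 10.0),
    "DAS28-CRP LSD (Heegard et al.)": (0.8, 1.0, 9.6),
}


def fraction_of_range(threshold, scale_min, scale_max):
    """Express a change threshold as a fraction of the full scale range."""
    return threshold / (scale_max - scale_min)


relative = {name: fraction_of_range(*vals) for name, vals in thresholds.items()}
for name, frac in relative.items():
    print(f"{name}: {frac:.1%} of scale range")
```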

Although the MBDA score cannot be directly converted to other composite measures of disease activity, previous studies have demonstrated very similar measurement errors for DAS, CDAI, and SDAI. The LSD was determined to be 8.4 units for CDAI and 8.3 units for SDAI (CDAI and SDAI ranges, 0–76). In a test-retest study by Uhlig et al. where patients were evaluated at baseline and 5–7 days later, the smallest detectable difference (SDD = 1.96 × SD) was 1.32 DAS units for DAS28-ESR, 8.26 units for SDAI, 8.05 units for CDAI, and 1.48 for RAPID3 (RAPID3 range, 0–30) [12]. For patient reported outcomes, the minimal clinically important difference (MCID) or minimal clinically important improvement (MCII) can also be evaluated. Ward et al. report an MCID of 1.0 for DAS28-CRP, 1.2 for DAS28-ESR, 13 for SDAI, and 13 for CDAI [37]. Interestingly, Curtis et al. report MCID cut points for CDAI improvement as 12 units for high disease activity, 6 for moderate disease activity, and 1 for low disease activity (CDAI < 10) [38]; the MCID for worsening among people doing well in low disease activity was > 2 units. To this point, the MID of the MBDA for disease worsening (among people starting in low disease activity or remission) may be different than the MID of 8; however, determining a MID for worsening of the MBDA could not be assessed in the current study given the small sample size (n = 6 in low disease activity at baseline). Moreover, we would differentiate an MID, which is typically based on mathematical estimates of variability measured in stable patients over short periods of time, from an MCID, which is clinically anchored. Estimating the MID of the MBDA, as well as its MCID, for RA patients in low disease activity remains a topic for future investigation.

In summary, Treat-to-Target recommendations improve the management of RA in clinical practice and achieve optimal therapeutic outcomes by closely monitoring a patient’s disease activity [2]. The MBDA score provides an objective, molecular measure of disease activity; however, clinicians must be able to differentiate true changes in disease activity from short-term biologic variability in order to use the MBDA score in a Treat-to-Target approach. Here, we determined that changes in MBDA score greater than or equal to 8 represent true changes in disease activity for patients with active RA, while changes less than 8 represent biologic variability of the biomarkers. Knowledge of the diurnal and daily variation of the MBDA score and of the defined MID should therefore help practicing rheumatologists in decision-making and patient care when using the MBDA test.