Background

Assessments of disease activity may differ between patients and physicians. Studies have shown that a discrepancy occurs in approximately one third of cases; however, this can vary depending on how it is measured [1–6]. Discordance occurs in many chronic illnesses, including rheumatoid arthritis (RA), and may be present before treatment starts or during the course of therapy [1–15]. In RA, physicians tend to rate disease activity lower than patients; however, scoring differences in both directions have been reported [1–7, 9, 11–15]. Several reasons have been proposed for these differences, including functional impairment, limited knowledge about health, and differences in the signs and symptoms on which patients and physicians focus. Importantly, patients primarily concentrate on pain and tenderness when rating RA disease activity, and physicians focus on swollen joints and C-reactive protein (CRP) or erythrocyte sedimentation rate (ESR) [3–5, 9, 11–13, 15, 16]. As pain may be present even if the disease is controlled, the difference between patient and physician viewpoints may be exaggerated when the disease is not active [15]. This may be the case particularly in long-standing disease when joint damage and damage-related disability are present. It is also possible that patients self-assess signs and symptoms of other comorbidities and attribute them to their RA. Finally, physicians may relate the condition of one patient to that of the many patients seen over the years, while individual patients may have little more than their own situation and wellbeing as a comparator.

A better understanding of patient and physician bias (conscious and unconscious), perceptions, judgment, beliefs about what medicines can or cannot do, and disconnects in the assessment of RA signs and symptoms may enable clinicians to more effectively communicate with patients, improve treatment outcomes, and help patients achieve their varied treatment goals [16]. This includes a component of active listening and openness to patient input. In other chronic diseases, such strategies, including open-ended discussions with patients, have demonstrated an increased ability to identify patients at risk of non-adherence and allow for better alignment with patient goals [17, 18].

Decrease in pain and improvement of physical function may not be the only objectives for patients. Patients may also be concerned about the impact of RA on their overall activities and their ability to work. Studies have demonstrated that RA, whether early or late in the disease course, affects patients’ ability to work, which in turn can lead to presenteeism, absenteeism, unemployment, income loss, and early retirement [19–29].

To our knowledge, no published studies have evaluated whether discordance in the assessment of disease activity is connected to presenteeism, absenteeism, and work impairment in patients with RA. In addition, differences in assessments of disease activity in RA have not been rigorously examined in clinical trial populations. Publications to date have evaluated mostly clinic patients in observational studies; consequently, the results are often complicated by the fact that the patients did not receive identical therapy or monitoring and came from a single center and/or academic environment, in contrast to a large, multicenter clinical trial [1–7, 9, 11–13, 15].

The PRESERVE trial (ClinicalTrials.gov identifier, NCT00565409) was a multicenter two-period trial in adults with moderate RA [30]. Period 1 was the open-label, single treatment phase of the trial that evaluated responses to combination etanercept and methotrexate therapy for 36 weeks, and period 2 was the randomized, double-blind phase that investigated the outcomes of dose reduction or withdrawal of etanercept. This analysis only includes data from period 1, thereby focusing on discordance in patients with moderate disease activity who were being treated to a target of low disease activity. This report examines (1) the difference, i.e., discordance, between patients’ and physicians’ global assessments of disease activity at baseline and 36 weeks; (2) correlations between clinical parameters and discordance of global assessments at baseline and week 36; (3) baseline predictors of week-36 discordance; and (4) whether week-36 discordance is associated with work productivity.

Methods

Details of the PRESERVE trial have been presented elsewhere [30]. In brief, patients were 18–70 years of age with a diagnosis of RA based on the 1987 American College of Rheumatology criteria. Patients had a moderate disease activity score based on a 28-joint count (DAS28 >3.2 and ≤5.1) and were enrolled at 80 centers in Europe, Asia, Australia, and Latin America. Eligible patients were required to have taken stable doses of oral methotrexate 15–25 mg/week for at least 8 weeks prior to receiving 50 mg open-label etanercept once weekly plus methotrexate ≥10 mg/week for the 36 weeks of period 1. Exclusion criteria included use of disease-modifying anti-rheumatic drugs (DMARDs) other than methotrexate within 28 days before baseline or current or previous use of a biologic DMARD for RA.

The study was conducted in accordance with the International Conference on Harmonisation guideline for good clinical practice and the ethical principles of the Declaration of Helsinki. Written informed consent was obtained from all patients prior to enrollment. The study protocol and all consent forms were reviewed and approved by an institutional review board or an independent ethics committee at each participating center (see “Acknowledgments” for details).

Patients and physicians completed their respective global assessments of disease activity at baseline and weeks 4, 8, 12, 20, 28, and 36. Patients were blinded to physician global assessments and physicians were blinded to patient global assessments. The physician global assessment was performed prior to the physician having access to the C-reactive protein (CRP) levels from that visit. The global assessments used a numerical rating scale on which respondents were asked to rate disease activity by circling a number ranging from 0 (no disease activity) to 10 (extreme disease activity) [31]. At baseline and week 36, mean differences between patient and physician scores were categorized as positive discordance (patient global assessment – physician global assessment ≥2), negative discordance (patient global assessment – physician global assessment ≤ –2), or concordance (absolute difference between the two disease activity scores = 0 or 1). The cutoff of 2 was chosen by rounding to the closest whole number above the one standard deviation of 1.72 obtained from the mean difference between patient and physician global assessments.
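The categorization rule described above can be sketched in a few lines of Python. This is a minimal illustration only, not the analysis code used in the trial; the function name `classify_discordance` and the category labels are assumptions introduced here.

```python
def classify_discordance(patient_global: int, physician_global: int, cutoff: int = 2) -> str:
    """Categorize one patient/physician pair of global assessments.

    Both scores are on the 0-10 numerical rating scale described in the text.
    The default cutoff of 2 is the one reported in the Methods.
    """
    for score in (patient_global, physician_global):
        if not 0 <= score <= 10:
            raise ValueError("global assessments must lie between 0 and 10")

    # Difference is always patient minus physician, as in the text.
    difference = patient_global - physician_global
    if difference >= cutoff:
        return "positive discordance"   # patient rates activity higher
    if difference <= -cutoff:
        return "negative discordance"   # physician rates activity higher
    return "concordance"                # absolute difference of 0 or 1
```

For example, a patient score of 6 against a physician score of 3 would fall into the positive discordance category, whereas scores of 4 and 5 would be concordant.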

Concordance/discordance was determined for all patients, and for the subgroup of patients who achieved all three outcomes of swollen joint count (SJC) ≤1, tender joint count (TJC) ≤1, and CRP ≤1 mg/dL at 36 weeks, and for those patients who achieved the Boolean-based definition of clinical remission (SJC ≤1, TJC ≤1, CRP ≤1 mg/dL, and patient global assessment ≤1) [32]. Concordance/discordance was also determined for the patients who achieved remission according to the clinical disease activity index (CDAI ≤2.8) at 36 weeks. Additionally, the rates of Boolean and CDAI remission were evaluated according to baseline and week-36 discordance status.
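The subgroup and remission definitions above can be expressed as simple threshold checks. The sketch below is illustrative and its function names are hypothetical. The CDAI formula (sum of 28-joint swollen and tender counts plus the patient and physician global assessments, each on a 0–10 scale) is the standard published definition and is not restated in this report, so it should be read as an assumption here.

```python
def meets_joint_and_crp_criteria(sjc: int, tjc: int, crp_mg_dl: float) -> bool:
    """Subgroup criterion: SJC <= 1, TJC <= 1, and CRP <= 1 mg/dL."""
    return sjc <= 1 and tjc <= 1 and crp_mg_dl <= 1.0


def is_boolean_remission(sjc: int, tjc: int, crp_mg_dl: float, patient_global: float) -> bool:
    """Boolean-based remission: the three criteria above plus patient global <= 1."""
    return meets_joint_and_crp_criteria(sjc, tjc, crp_mg_dl) and patient_global <= 1


def cdai(sjc28: int, tjc28: int, patient_global: float, physician_global: float) -> float:
    """Clinical disease activity index (assumed standard definition, 0-10 global scales)."""
    return sjc28 + tjc28 + patient_global + physician_global


def is_cdai_remission(sjc28: int, tjc28: int, patient_global: float, physician_global: float) -> bool:
    """CDAI remission threshold used in the text: CDAI <= 2.8."""
    return cdai(sjc28, tjc28, patient_global, physician_global) <= 2.8
```

Note that the CRP threshold is expressed in mg/dL, whereas CRP values elsewhere in this report are given in mg/L; any input would need to be converted to a consistent unit first.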

Endpoints at 36 weeks according to concordance/discordance category included DAS28, CRP, TJC, SJC, health assessment questionnaire disability index (HAQ-DI), erythrocyte sedimentation rate (ESR), patient and physician global assessments, duration of morning joint stiffness, brief pain inventory (BPI), simplified disease activity index (SDAI), CDAI, patient general health visual analog scale (VAS), and functional assessment of chronic illness therapy-fatigue (FACIT-fatigue). Additionally, for the subgroup of patients who achieved the three outcomes of SJC ≤1, TJC ≤1, and CRP ≤1 mg/dL at 36 weeks, the measurement of the modified total Sharp score was characterized by concordance category.

Patient global assessment and patient general health VAS are assessments with several important differences. The patient global assessment requests that patients measure overall arthritis activity by asking them to circle a number between 0 and 10, with 0 indicating no arthritis activity and 10 indicating extreme activity. The patient general health VAS asks patients to indicate, “in general how would you rate your health over the last 2–3 weeks?” Patients place a mark on a 100-mm line, with 0 mm meaning “very well” and 100 mm meaning “extremely bad.” Thus, the patient global assessment measures arthritis disease activity and the patient general health VAS measures overall health.

Additionally, the extent to which work productivity activity impairment was associated with discordance was determined for each component of the work productivity activity impairment questionnaire for RA (WPAI:RA). The WPAI:RA is a validated tool for measuring work productivity that was used in the PRESERVE trial [33], consisting of four components: activity impairment, absenteeism, presenteeism, and overall work impairment [34, 35]. The questionnaire focuses on impairment due to RA only. Activity impairment includes patients not employed outside the home; the other components are measured for employed patients only. Absenteeism is missed work days due to health, and presenteeism is a reduction of productivity or diminished work capacity while at work [35]. Overall work impairment is calculated using both absenteeism and presenteeism. Each component of the WPAI:RA is scored from 0–100 %; higher scores indicate a worse outcome.
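To make the four WPAI:RA components concrete, the sketch below computes them from one patient's questionnaire answers. The formulas follow the standard published WPAI scoring rules, which are not spelled out in this report; the function and parameter names are hypothetical, and the code is an illustration rather than the scoring program used in the trial.

```python
from typing import Dict, Optional


def wpai_scores(hours_missed_health: float,
                hours_worked: float,
                productivity_impairment_0_10: float,
                activity_impairment_0_10: float) -> Dict[str, Optional[float]]:
    """Return the four WPAI components as percentages (higher = worse).

    Work-related components are None when no work hours were scheduled,
    e.g. for a patient who is not employed.
    """
    # Activity impairment applies to all patients, employed or not.
    activity = 100.0 * activity_impairment_0_10 / 10

    scheduled = hours_missed_health + hours_worked
    if scheduled <= 0:
        return {"absenteeism": None, "presenteeism": None,
                "overall_work_impairment": None, "activity_impairment": activity}

    absent_fraction = hours_missed_health / scheduled           # share of work time missed
    present_fraction = productivity_impairment_0_10 / 10        # impairment while working
    # Overall impairment combines time missed with reduced output during time worked.
    overall_fraction = absent_fraction + (1 - absent_fraction) * present_fraction

    return {"absenteeism": 100.0 * absent_fraction,
            "presenteeism": 100.0 * present_fraction,
            "overall_work_impairment": 100.0 * overall_fraction,
            "activity_impairment": activity}
```

Under these assumptions, a patient who missed 10 of 40 scheduled hours and rated productivity impairment as 5 out of 10 would have 25 % absenteeism, 50 % presenteeism, and 62.5 % overall work impairment.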

Statistical analysis

The analyses included patients in the period 1 population who received at least one dose of study medication and had data at baseline and week 36. Missing data at week 36 were not imputed; rather, the observed-case approach was used. Descriptive statistics were used to characterize differences in patient demographics, disease characteristics, and clinical endpoints among concordant and discordant groups. P values for demographic and baseline disease characteristics were generated using the F test from analysis of variance for continuous variables; the Cochran-Mantel-Haenszel test or Fisher’s exact test was used for categorical variables. The proportion of patients who shifted between baseline and week-36 concordance categories was determined.

Correlations between discordance (analyzed as continuous parameters) and clinical endpoints were determined using Pearson’s r correlation. Stepwise logistic regression was performed to determine significant baseline predictors of week-36 discordance. As binomial logistic regression requires two categories, the concordant and negative discordance groups were combined into one category and compared with positive discordance. The following parameters were included in the stepwise logistic regression analyses of baseline predictors: age, sex, disease duration, race, prior alcohol and tobacco use, rheumatoid factor status, body mass index, TJC-28, SJC-28, CRP, ESR, HAQ-DI, DAS28, CDAI, SDAI, FACIT-fatigue, patient general health, BPI, and duration of morning stiffness. Odds ratios were calculated to describe the strength of the association between baseline or week-36 parameters and the two discordance categories at 36 weeks.
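As an illustration of the correlation step, the sketch below computes Pearson’s r between a continuous discordance value (assumed here to be the patient global assessment minus the physician global assessment) and a clinical endpoint. The function name `pearson_r` is hypothetical, and this is not the statistical software used for the reported analyses.

```python
import math
from typing import Sequence


def pearson_r(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pearson product-moment correlation between two equal-length samples."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two samples of equal length with at least 2 values")

    mean_x = sum(xs) / n
    mean_y = sum(ys) / n

    # Sums of cross-products and squared deviations from the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when a sample has zero variance")

    return sxy / math.sqrt(sxx * syy)
```

Here `xs` would hold the per-patient differences in global assessments and `ys` the matching values of an endpoint such as BPI; pairs with a missing value would need to be dropped beforehand, in keeping with the observed-case approach.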

Descriptive statistics for change from baseline were determined for the WPAI:RA for each concordance group. Three of the four outcome scores (absenteeism, presenteeism, and overall work impairment due to RA) were evaluated for the subgroup of patients who were employed at both baseline and 36 weeks; activity impairment was measured for the full population included in this analysis.

Results

Patients

The PRESERVE trial enrolled 834 patients [30]. The first period (36 weeks) of the study was completed by 756 patients (90.6 %); 77 patients discontinued and results from one patient were not included due to a data discrepancy. This analysis includes patients in the period-1 population with data to week 36 (n = 763). There were 13 patients who did not complete period 1 but had week-36 data (e.g., their 28-week visit occurred late and was assigned to week 36), and 6 patients who completed period 1 but did not have week-36 data (e.g., their week-36 visit occurred early and was assigned to week 28); this accounts for the difference of 7 patients between period-1 completers and patients with week-36 data. Mean (SD) age was 48.2 (11.9) years, 82.8 % were female, 74.4 % were white, and duration of RA symptoms was 7.0 (6.9) years. Additional baseline disease characteristics are provided in Table 1.

Table 1 Demographics and baseline disease characteristics according to week-36 concordance category

Concordance

At baseline, 520/762 patient and physician global assessment scores (68.2 %; one patient had no baseline data available for one of the measures) were concordant (i.e., the difference between the scores was 0 (34.3 %) or 1 (34.0 %)). The number of patients with positively and negatively discordant scores was 194 (25.5 %) and 48 (6.3 %), respectively. Table 2 lists disease characteristics at baseline, week 36, and the change between baseline and week 36 for all patients, according to week-36 concordance category. At baseline, several clinical and patient-reported characteristics differed significantly according to concordance/discordance category: CRP, SJC-28, duration of morning stiffness, BPI, FACIT-fatigue, CDAI, SDAI, patient and physician global assessment, and patient general health. The values for most of these characteristics indicated more severe disease in the discordant than in the concordant patients. The exception was CRP; this was highest in the patients with concordance, with a mean (SD) of 13.4 (17.7) mg/L versus 9.9 (13.9) mg/L for positive discordance and 8.1 (7.1) mg/L for negative discordance.

Table 2 Disease characteristics at baseline and week 36, according to week-36 concordance category

After 36 weeks of therapy, concordance increased to 556/763 patients (72.9 %); the number of patients with positively and negatively discordant scores decreased to 189 (24.8 %) and 18 (2.4 %), respectively. Improvement between baseline and week 36 differed significantly according to concordance/discordance status for DAS28, CRP, ESR, TJC, SJC, BPI, HAQ-DI, FACIT-fatigue, CDAI, SDAI, patient and physician global assessment, and patient general health (p < 0.05 for change in CRP and TJC; p < 0.001 for ESR; and p < 0.0001 for all others; Table 2). Patients with concordant scores at week 36 had the best 36-week clinical and patient-reported outcomes, and for most measurements, the greatest improvement between baseline and week 36, compared with patients with positively or negatively discordant scores.

At week 36, patients with negative discordance exhibited the highest values of DAS28, CRP, ESR, TJC, SJC, CDAI, and SDAI, suggesting that physicians, more than their patients, may have looked at “objective” disease activity measures when determining their global assessment. In contrast, patients with positive discordance had the longest morning joint stiffness and worst values for BPI, HAQ-DI, FACIT-fatigue, and patient general health, suggesting that they, more than their physicians, focused on subjective outcomes when determining their global assessment.

Most patients who were concordant at the beginning of the study remained concordant at 36 weeks (Fig. 1). For the patients with positive or negative discordance at baseline, the greatest shift at 36 weeks was to concordance. The smallest shift from any category was to negative discordance.

Fig. 1 Shifts in discordance categories, baseline to week 36 (n = 762). Positive discordance: patient global assessment – physician global assessment ≥2. Negative discordance: patient global assessment – physician global assessment ≤ –2. Concordance: patient global assessment – physician global assessment = 0 or 1

Disease remission

Subgroup analysis results indicated that 442/755 patients (58.5 %) achieved SJC and TJC ≤1 and CRP ≤1 mg/dL at week 36. Of those 442 patients, 255 (57.7 %) also had a patient global assessment ≤1, thus meeting the Boolean-based criteria for clinical remission. Interestingly, the remaining 187 patients (42.3 %) had a patient global assessment >1, suggesting that these patients did not believe they were doing as well as, or indeed were not doing as well as, the objective criteria seemed to indicate, potentially because of their long disease duration.

Of the 255 patients who met Boolean remission criteria at week 36, 250 patients (98.0 %) were concordant, 5 (2.0 %) were negatively discordant, and none were positively discordant. However, of the 187 patients who had SJC and TJC ≤1, CRP ≤1 mg/dL, and patient global assessment >1, only 93 (49.7 %) were concordant; 2 (1.1 %) were negatively discordant, and 92 (49.2 %) were positively discordant. The patients who met Boolean remission criteria demonstrated greater improvement in clinical and patient-reported outcomes between baseline and week 36 than the patients with SJC and TJC ≤1, CRP ≤1 mg/dL, and patient global assessment >1. This was particularly true for the outcomes of patient general health and FACIT-fatigue (data not shown).

In comparison, 205/762 patients (26.9 %) achieved CDAI remission at week 36. Of those patients, 187 (91.2 %) were concordant, 2 (1.0 %) were negatively discordant, and 16 (7.8 %) were positively discordant. These results are similar to those for Boolean remission, with the exception that more patients in CDAI remission were positively discordant.

Relationship between discordance status and remission

An additional analysis found that for patients with positive discordance, concordance, and negative discordance at baseline, 44/194 (22.7 %), 193/520 (37.1 %), and 18/48 (37.5 %), respectively, achieved Boolean remission at week 36, and 44/193 (22.8 %), 145/520 (27.9 %), and 16/48 (33.3 %), respectively, achieved CDAI remission. In comparison, for patients with positive discordance, concordance, and negative discordance at week 36, 0/189 (0 %), 250/551 (45.4 %), and 5/18 (27.8 %), respectively, achieved Boolean remission at week 36 and 16/189 (8.5 %), 187/555 (33.7 %), and 2/18 (11.1 %), respectively, achieved CDAI remission.

Correlations and predictors of discordance

The baseline values of BPI, SJC, duration of morning stiffness, FACIT-fatigue, and patient general health significantly correlated with week-36 discordance, p < 0.0001 to p < 0.05 (Table 3), although the correlations were weak (r <0.25). At week 36, DAS28, duration of morning stiffness, HAQ-DI, FACIT-fatigue, CDAI, and SDAI correlated significantly but weakly with discordance, p < 0.0001 for all. BPI and patient general health demonstrated the strongest correlations, which were moderate, at week 36 (r = 0.48 and 0.58, respectively, p < 0.0001 for both).

Table 3 Correlation between week-36 discordance and measurements of disease at baseline and week 36, and change from baseline to week 36

Baseline predictors of week-36 positive discordance were patient general health, BPI, and CRP. The odds ratios (95 % confidence interval) were similar for the full population (patient general health: 1.02 (1.00, 1.03), BPI: 1.22 (1.11, 1.35), CRP: 0.98 (0.97, 1.00)) and the subpopulation of patients who achieved SJC and TJC ≤1 and CRP ≤1 mg/dL at week 36 (patient general health: 1.03 (1.01, 1.04), BPI: 1.24 (1.07, 1.43), CRP: 0.97 (0.94, 1.00)).

WPAI scores

At baseline, mean percent WPAI activity impairment was higher (greater impairment) for the patients with positive discordance (51.1 %) than for the patients with concordance (41.3 %) or negative discordance (42.8 %) (p < 0.0001 across means) (Table 4). This continued to week 36, with mean activity impairment of 35.4 %, 15.1 %, and 25.6 % for the positive discordance, concordant, and negative discordance groups, respectively, p < 0.0001. The greatest improvement between baseline and week 36 occurred in the patients who were concordant at week 36. For the subgroup of patients who were employed at baseline and 36 weeks and had WPAI data (n = 287), the mean percent impairment while working was higher for the patients with positive discordance at baseline and 36 weeks (50.0 % and 26.3 %, respectively) than for the patients with concordance (37.8 % and 10.7 %, respectively) or negative discordance (42.0 % and 12.0 %, respectively) (p ≤ 0.0026 across means) (Table 4). Similarly, the mean percent overall work impairment was higher for the patients with positive discordance at baseline and 36 weeks (54.6 % and 28.7 %, respectively) than for the patients with concordance (41.5 % and 12.2 %, respectively) or negative discordance (43.7 % and 12.0 %, respectively) (p ≤ 0.0019 across means). Mean percent work time missed did not differ significantly between the concordance categories; it was numerically highest in the positive discordance group at both baseline and week 36 (Table 4).

Table 4 Work productivity activity impairment questionnaire for rheumatoid arthritis (WPAI:RA) results at baseline, week 36, and change from baseline to week 36, according to week-36 concordance category

Discussion

We evaluated the rates of concordance and discordance of global disease activity assessments at baseline and after 36 weeks of open-label, single-arm treatment with etanercept and methotrexate in patients with moderate RA who participated in the PRESERVE trial. At baseline, 31.8 % of patient assessments of disease activity were discordant with the assessments of their physicians. This represents a disconnect between the patient and physician in verbal and non-verbal cues about how RA is affecting the patient. In most cases, discordance reflected higher ratings of disease activity by patients.

Compared with other studies that used a similar method of measurement, the rate of concordance in this study was comparable to that in several [1, 3, 4] and higher than that in others [2, 5]. As with other studies, we found that pain was an important contributor to the patient global assessment and discordance [2–5, 9, 11–13, 15]. We also noted that patients with relatively high levels of inflammatory activity, as assessed by an objective measure such as CRP, had a higher level of concordance than patients with lower levels of inflammation, similar to several other studies [1, 4, 5, 9, 12]. In our analysis, baseline CRP was highest in patients with concordance, with a mean of 13.4 mg/L, compared with 9.9 mg/L for positive discordance and 8.1 mg/L for negative discordance. This suggests that patients with high levels of inflammation tend to focus on indicators of potential joint damage, similar to clinicians.

At week 36, the rate of discordance was 27.1 % overall. Among the subgroup of 442 patients who achieved SJC and TJC ≤1 and CRP ≤1 mg/dL, the discordance rate was lower but still substantial at 22.4 %. This result provides additional evidence that even when the disease is controlled, pain may still be present, resulting in a considerable difference between patient and physician viewpoints [15]. Not surprisingly, for the 255 patients in this subgroup who also reported a patient global assessment ≤1 (thereby achieving a Boolean remission), the rate of discordance was 2 %. This is entirely negative discordance, and is comparable to the negative discordance of 2.4 % for the overall population. The overall decrease in discordance from 31.8 % at baseline to 27.1 % at week 36 occurred in the context of improved disease control. This may be a result that is independent of the gap in communication, or it may be due to the physician becoming more adept at interpreting patient cues, or patients becoming more expressive of how they feel.

This analysis of the PRESERVE trial provides a novel perspective in several ways. First, it focuses on patients with moderate RA (DAS28 >3.2 and ≤5.1) who received therapy with etanercept and methotrexate. Many previously reported studies did not specify the extent of disease activity, nor did they examine discordance following treatment with a tumor necrosis factor inhibitor. As the majority of patients seen in clinical practice have low or moderate RA activity [36, 37], the data presented in this report are relevant to most patients.

We also assessed the relationship between discordance, activity impairment, and work productivity. Previous studies have found that moderate RA (and even mild RA) is associated with work impairment [23, 24, 27]. Additionally, one study of patients with RA and other inflammatory and noninflammatory rheumatic diseases found a significant association between discordance and not working [2]. We examined this association more closely by using the WPAI:RA to measure four components of work productivity: activity impairment, absenteeism, presenteeism, and overall work impairment. All three components of the WPAI:RA related to employment were highest (i.e., worst) in patients with positive discordance; for two of these components the difference was statistically significant. These results suggest that discrepancies between patient and physician assessments of disease not only are associated with poorer clinical outcomes, but may also extend to other domains of a patient’s life such as work productivity, which could potentially have an economic impact on patients and society.

A strength of this study is that it evaluated data from a clinical trial population, thereby ensuring that all patients were managed in a consistent manner and data were collected at fixed intervals. This includes collection of work productivity data using a validated instrument, the WPAI:RA. Additionally, investigators completed the physician global assessment prior to having access to CRP values for that visit. This removed the potential for the CRP results to bias the physician assessment. In the clinic, patients sometimes bring their CRP or ESR results with them. Therefore, the current study may reflect the similarities and differences between patient global assessments and physician global assessments more accurately than studies conducted in the clinic [9, 38].

This study has several limitations. Importantly, we were limited to the assessments that were used in the clinical trial. Although it would have been of interest, we were not able to evaluate the effect of depression or cognitive impairment on discordance because no validated measurement tool for these conditions was used in the trial. In addition, as this was a clinical trial, the results may not be generalizable to all patients with RA. This study excluded patients with mild or severe disease activity and those with certain comorbid conditions. Because this was a clinical trial, physicians could not modify medication regimens at will; instead, they were required to follow the study protocol. Also, patients who left the study for any reason were not followed, and so we were unable to determine whether their concordance status changed over time.

Conclusions

The results of our analysis demonstrated that in patients with moderately active RA, the rate of concordance of patient and physician global assessments increased after 36 weeks of treatment with etanercept and methotrexate. Discordance significantly correlated with several clinical endpoints and was associated with decreased work productivity. Additional research into verbal and nonverbal patient/physician communication is needed to fully understand the discrepancies.