In daily clinical practice, the Disease Activity Score using 28 joint counts (DAS28) is used to monitor disease activity in patients with rheumatoid arthritis who are treated with disease-modifying anti-rheumatic drugs (DMARDs) and biological agents. The score informs the rheumatologist whether the treatment is producing the expected effect within an appropriate period of time or whether the treatment should be intensified.

In an article in the present issue, Vander Cruyssen and colleagues investigated which variables can best be measured to evaluate the effect of therapy and the remaining disease activity in daily clinical practice [1]. This study was based on a cohort of 511 patients with active refractory rheumatoid arthritis who were treated with infliximab [2]. Patients who were judged by their physicians to have an insufficient response at week 22 received a dose increase at week 30. According to the authors, the decision to increase the dose was based on clinical judgement, without knowledge of outcome measures such as the DAS28. The authors found that the DAS28, as a continuous composite index, correlated best with the decision to increase the dose of infliximab, which was used as a surrogate measure of insufficient response. The discriminative capacity of the DAS28 could be improved only slightly by including supplemental variables in the regression model. Recalculating the DAS28 coefficients in a discriminant function yielded similar coefficients and the same discriminative capacity as the original DAS28. To understand these results better, it helps to know how the Disease Activity Score and the DAS28 were developed in the 1990s.

The DAS28 was developed in a similar way to the Disease Activity Score, but the DAS28 contains reduced, ungraded joint counts and has different weights [3, 4]. The DAS28 was developed in a cohort from an outpatient clinic, using the data from 227 patients with early rheumatoid arthritis who were followed up for 9 years between 1985 and 1994. Because no gold standard for disease activity is available, decisions on DMARD therapy were used as an external standard of 'high' and 'low' disease activity in the development of the DAS28. The DAS28 formula optimally discriminated between these two clinically relevant states. The validity of the DAS28 was tested using a similar cohort from another clinic. Since their development, the Disease Activity Score and the DAS28 have been extensively validated [5].
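To make the 'reduced joint counts' and 'weights' concrete, the widely cited four-variable DAS28 based on the erythrocyte sedimentation rate can be sketched as below. The formula and its coefficients are not stated in this commentary; they are taken from the published DAS28 literature [4]. The function name, parameter names and input checks are illustrative assumptions, not part of any official implementation.

```python
import math


def das28_esr(tender28, swollen28, esr_mm_per_h, global_health_vas):
    """Four-variable DAS28 (ESR version), as commonly cited.

    tender28, swollen28: ungraded counts over 28 joints (0..28).
    esr_mm_per_h: erythrocyte sedimentation rate in mm/h (must be > 0).
    global_health_vas: patient global health on a 0..100 mm visual analogue scale.
    All names and validation rules here are assumptions made for illustration.
    """
    if not (0 <= tender28 <= 28 and 0 <= swollen28 <= 28):
        raise ValueError("joint counts must lie between 0 and 28")
    if esr_mm_per_h <= 0:
        raise ValueError("ESR must be positive (the natural log is taken)")
    if not 0 <= global_health_vas <= 100:
        raise ValueError("global health VAS must lie between 0 and 100")

    # Square roots and the logarithm dampen the influence of extreme values.
    return (
        0.56 * math.sqrt(tender28)
        + 0.28 * math.sqrt(swollen28)
        + 0.70 * math.log(esr_mm_per_h)
        + 0.014 * global_health_vas
    )


# Hypothetical patient: 4 tender joints, 4 swollen joints, ESR 20 mm/h, VAS 50 mm.
example_score = das28_esr(4, 4, 20, 50)
```

For the hypothetical patient above, the tender joint count contributes twice as much as the swollen joint count, which reflects the different weights mentioned in the text.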

An interesting finding from the study of Vander Cruyssen and colleagues is that they also used decisions to change (infliximab) treatment as a proxy for the underlying disease activity, and arrived at the same DAS28 as was found 20 years earlier in a cohort in which only conventional DMARDs were used, without a need to change its content or form. This means that the DAS28 discriminates between clinically relevant states of disease activity, rather than merely reflecting a 'readiness' of physicians and patients to start, stop or continue DMARD treatment. This reinforces the validity and generalisability of the DAS28.

The authors reached their conclusion based on a series of analyses comparing the performance of multiple measures in several ways. They used receiver-operating characteristic curves, sensitivity, specificity and predictive values to rank the measures in order of performance. As the authors state, these diagnostic statistics may be used to rank measures within a study, but it is difficult to generalise the values for sensitivity, specificity, and so on, beyond the study. This difficulty arises because the values of these statistics depend heavily on the distributions found in the study (see Figure 1 in [1]). Moreover, the use of sensitivity, specificity, and so on, does not reflect the way the DAS28 is used, as one would not use the DAS28 to 'diagnose' the physician's opinion on whether or not to increase the infliximab dose.
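The dependence of these statistics on the study distribution can be illustrated with a small sketch. The two cohorts below are invented for illustration only and are not data from [1]; the cut-off of 4.0 and all names are likewise assumptions. The same cut-off applied to two cohorts with different DAS28 distributions gives different sensitivity and specificity.

```python
def sensitivity_specificity(scores_increase, scores_no_increase, cutoff):
    """Treat 'score > cutoff' as a positive test for a dose increase.

    Returns (sensitivity, specificity) as fractions between 0 and 1.
    """
    true_pos = sum(1 for s in scores_increase if s > cutoff)
    true_neg = sum(1 for s in scores_no_increase if s <= cutoff)
    return (
        true_pos / len(scores_increase),
        true_neg / len(scores_no_increase),
    )


# Hypothetical cohort A: relatively high disease activity overall.
cohort_a_increase = [5.8, 5.1, 4.9, 4.4, 3.9]
cohort_a_no_increase = [4.1, 3.5, 3.0, 2.7, 2.2]

# Hypothetical cohort B: lower disease activity overall.
cohort_b_increase = [4.6, 4.2, 3.8, 3.6, 3.3]
cohort_b_no_increase = [3.4, 2.9, 2.5, 2.1, 1.8]

CUTOFF = 4.0  # arbitrary illustrative cut-off, not a recommended threshold

result_a = sensitivity_specificity(cohort_a_increase, cohort_a_no_increase, CUTOFF)
result_b = sensitivity_specificity(cohort_b_increase, cohort_b_no_increase, CUTOFF)
```

In both invented cohorts the scores are higher in patients with a dose increase, yet the same cut-off yields a sensitivity of 0.8 in cohort A and 0.4 in cohort B. The ranking of measures within one study can therefore be informative even when the absolute values do not transfer.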

However, the results of Vander Cruyssen and colleagues can best be understood by looking at Figure 1 in their article [1], which depicts the differences in disease activity measures between the two groups of patients, those with and those without a dose increase. Two lessons can be learned from this figure.

First, higher scores on the DAS28 and the other disease activity measures are found in patients in whom a decision was made to increase the dose of infliximab. Only a few other studies have used external criteria for high and low disease activity to study the validity of the Disease Activity Score and the DAS28. In a study performed in Italy in the late 1990s, the Disease Activity Score was found to be the best measure to discriminate between predefined states of low and high disease activity in a sample of 202 patients [6]. A recent study used a different, opinion-based approach, with expert rating (n = 35) of a sample of clinical profiles that were categorised into remission, low disease activity, moderate disease activity and high disease activity [7]. Interestingly, the cut-off criteria for the DAS28 found in this way were only slightly different from the established cut-off points for the DAS28, a result that can be regarded as confirmation of those cut-off points.

The second interesting finding from Vander Cruyssen and colleagues' study, which was not highlighted in the article, is that more than 50% of the patients in whom the infliximab dose was not increased had DAS28 >3.2, which means 'moderate' or 'high' disease activity. One may ask whether a dose increase would also have been indicated in these patients, as the aim is to reach low disease activity or even remission. This illustrates that the target of anti-rheumatic treatment shifts over time. It is therefore an extra advantage to use a continuous measure with absolute values to measure disease activity in daily clinical practice and clinical trials.
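The advantage of an absolute, continuous scale can be sketched as follows: the same scores can be re-read against whatever target is current. Only the 3.2 boundary is stated in this commentary; the 2.6 and 5.1 boundaries are the commonly cited DAS28 cut-off points from the literature. The function names and the list of scores are hypothetical.

```python
def das28_category(score):
    """Map a DAS28 value to a disease activity state using the commonly cited cut-offs."""
    if score < 2.6:
        return "remission"
    if score <= 3.2:
        return "low"
    if score <= 5.1:
        return "moderate"
    return "high"


def share_above_low(scores):
    """Fraction of patients with DAS28 > 3.2, i.e. moderate or high disease activity."""
    above = sum(1 for s in scores if s > 3.2)
    return above / len(scores)


# Hypothetical DAS28 values for patients whose dose was not increased.
no_increase_scores = [2.4, 3.0, 3.5, 4.0, 5.5]

categories = [das28_category(s) for s in no_increase_scores]
fraction_not_at_target = share_above_low(no_increase_scores)
```

If the treatment target moves from low disease activity to remission, only the threshold in `share_above_low` would need to change; the stored DAS28 values remain interpretable.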

Conclusion

The study of Vander Cruyssen and colleagues confirms that the DAS28 is a valid measure to monitor disease activity and to titrate treatment with biologicals [8].