Introduction

Sjögren’s syndrome (SS) is an autoimmune disease that affects exocrine glands, including the salivary and lacrimal glands. It is characterized by lymphocytic infiltration into the exocrine glands, leading to dry mouth and eyes. A number of autoantibodies, such as anti-SS-A and SS-B antibodies, are detected in patients with SS. SS is subcategorized into primary SS, which is not associated with other well-defined connective tissue diseases (CTDs), and secondary SS, which is associated with other well-defined CTDs [1]. Primary SS is further subcategorized into the glandular form and the extraglandular form.

The revised criteria for the diagnosis of SS issued by the Japanese Ministry of Health (JPN) (1999) (Table 1) [2] and the American-European Consensus Group classification criteria for SS (AECG) (2002) (Tables 2, 3) [1] are both commonly used in daily clinical practice and clinical studies in Japan. Thus, two diagnostic systems are being applied to the same disease, which could result in a heterogeneous pool of SS patients. This heterogeneity makes it difficult to analyze the diagnosis, efficacy of treatment, and prognosis of SS patients. A better alternative would be to use a unified set of criteria for the diagnosis of SS in Japan. Recently, the American College of Rheumatology (ACR) published the ACR classification criteria for SS (2012) (Table 4), which were proposed by the Sjögren’s International Collaborative Clinical Alliance (SICCA) [3]. The new criteria are designed to be used worldwide, in developing as well as developed countries. The SICCA established a uniform classification for SS based on a combination of objective tests that have known specificity for SS [3].

Table 1 The revised Japanese Ministry of Health criteria for the diagnosis of SS (1999)
Table 2 The American-European Consensus Group classification criteria for SS (2002)
Table 3 The American-European Consensus Group classification criteria for SS (2002): rules for classification
Table 4 The American College of Rheumatology classification criteria for SS (2012)

A comparison of these three sets of criteria reveals differences in their purpose and in the items they adopt (Table 5). The JPN criteria (1999) are intended as an aid for diagnosis, whereas the AECG criteria (2002) and the ACR criteria (2012) are intended for classification purposes in clinical studies and trials. Although the ACR criteria include only three objective items (Tables 4, 5) and are the simplest of the three sets, they may fail to identify SS patients with negative findings in labial salivary gland biopsy, because they do not include salivary secretion analysis or imaging studies. On the other hand, the JPN criteria combine oral examinations such as salivary secretion, sialography, and salivary gland scintigraphy with the three objective items adopted in the ACR criteria (Table 5). Only the AECG criteria include ocular and oral symptoms, which may cause false positives in patients with non-SS conditions such as aging or visual display terminal (VDT) syndrome (Table 5).

Table 5 Comparison of the items adopted in the JPN, AECG, and ACR criteria

The purpose of the present study was to validate the JPN criteria, AECG criteria, and ACR criteria for the diagnosis of SS in Japanese patients. The study identified the differences among these three classification sets.

Patients and methods

Study population

The study subjects were 694 patients (51 males and 643 females) with a diagnosis of SS or suspected SS who had been examined for all four items of the JPN criteria (pathology, oral, ocular, anti-SS-A/SS-B antibody) and were followed up in June 2012 at ten hospitals across Japan (Kanazawa Medical University Hospital, Nagasaki University Hospital, Hyogo Medical University Hospital, Keio University Hospital, Tokyo Women’s Medical University Hospital, Tsurumi University Hospital, Kyushu University Hospital, University of Occupational and Environmental Health Hospital, Kyoto University Hospital, and University of Tsukuba Hospital). These hospitals form part of the Research Team for Autoimmune Diseases, The Research Program for Intractable Disease of the Ministry of Health, Labor and Welfare (MHLW).

Data collection and analysis

We collected clinical data from the above ten hospitals using a questionnaire. We retrospectively examined the clinical diagnosis made by the physician in charge, as well as the satisfaction of the JPN, AECG, and ACR criteria. Because lissamine green ocular staining had not been adopted in Japan at the time of clinical examination, we regarded patients who had a positive rose bengal test or fluorescein staining test as having satisfied the ocular staining score in the ACR classification system.

We regarded the clinical diagnosis made by the physician in charge as the gold standard for the diagnosis of SS in this study. We compared the sensitivities and specificities of the JPN, AECG, and ACR diagnostic systems in the diagnosis of SS (both primary and secondary SS), primary SS, and secondary SS. Agreement among the three systems was assessed using the kappa coefficient.
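As a minimal sketch of how a criteria set can be scored against the physician's diagnosis, sensitivity is the fraction of physician-diagnosed SS patients who satisfy the criteria, and specificity is the fraction of non-SS patients who do not. The function name and the toy data below are illustrative assumptions, not the study's data or software.

```python
from typing import Sequence, Tuple


def sensitivity_specificity(
    reference: Sequence[bool], criteria: Sequence[bool]
) -> Tuple[float, float]:
    """Return (sensitivity, specificity) of `criteria` against `reference`.

    reference[i] is True if the physician in charge diagnosed patient i with SS.
    criteria[i] is True if patient i satisfies the criteria set under test.
    """
    if len(reference) != len(criteria):
        raise ValueError("reference and criteria must have the same length")

    pairs = list(zip(reference, criteria))
    tp = sum(1 for r, c in pairs if r and c)          # SS, criteria satisfied
    fn = sum(1 for r, c in pairs if r and not c)      # SS, criteria not satisfied
    tn = sum(1 for r, c in pairs if not r and not c)  # non-SS, criteria not satisfied
    fp = sum(1 for r, c in pairs if not r and c)      # non-SS, criteria satisfied

    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one SS and one non-SS patient")
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical toy data: four SS patients and two non-SS patients.
reference = [True, True, True, True, False, False]
criteria = [True, True, True, False, False, True]
sens, spec = sensitivity_specificity(reference, criteria)
```

In this toy example, three of the four SS patients satisfy the criteria and one of the two non-SS patients does not, so the sketch returns a sensitivity of 0.75 and a specificity of 0.5.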

Results

Diagnosis of SS (primary and secondary SS) and non-SS

Of the 694 patients, 499 patients did not have other well-defined CTDs, whereas 195 patients did. SS was diagnosed in 476 patients (302 primary SS, 174 secondary SS), whereas non-SS was diagnosed in 218 patients (197 without other CTDs, 21 with other CTDs) by the physician in charge (Table 6).

Table 6 Diagnosis of SS and non-SS
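The patient counts reported above can be cross-checked with simple arithmetic. The short sketch below uses only the numbers given in the text; the variable names are illustrative.

```python
# Counts reported in the text (diagnosis by the physician in charge).
primary_ss = 302          # SS without other well-defined CTDs
secondary_ss = 174        # SS with other well-defined CTDs
non_ss_without_ctd = 197  # non-SS without other CTDs
non_ss_with_ctd = 21      # non-SS with other CTDs

total_ss = primary_ss + secondary_ss                 # all SS patients
total_non_ss = non_ss_without_ctd + non_ss_with_ctd  # all non-SS patients
without_ctd = primary_ss + non_ss_without_ctd        # patients without other CTDs
with_ctd = secondary_ss + non_ss_with_ctd            # patients with other CTDs
total = total_ss + total_non_ss                      # whole study population

print(total_ss, total_non_ss, without_ctd, with_ctd, total)
```

The totals agree with the reported figures: 476 SS, 218 non-SS, 499 without other CTDs, 195 with other CTDs, and 694 patients overall.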

Sensitivities and specificities of the three diagnostic systems for SS

The sensitivities of JPN, AECG, and ACR in the diagnosis of all SS (302 primary SS and 174 secondary SS) were 79.6, 78.6, and 77.5 %, respectively, whereas the respective specificities in the diagnosis of all SS were 90.4, 90.4, and 83.5 %. The sensitivities of JPN, AECG, and ACR in the diagnosis of 302 primary SS were 82.1, 83.1, and 79.1 %, respectively, with specificities of 90.9, 90.9, and 84.8 %, respectively. The sensitivities of JPN, AECG, and ACR in the diagnosis of 174 secondary SS were 75.3, 70.7, and 74.7 %, respectively, with specificities of 85.7, 85.7, and 71.4 % (Table 7).

Table 7 Sensitivities and specificities of the three tested systems for diagnosing SS

Comparisons of the satisfaction of the three diagnostic systems

Figure 1 displays Venn diagrams comparing the satisfaction of the three diagnostic systems. Among all SS patients (n = 476), more patients satisfied only the AECG criteria (n = 42) than only the JPN criteria (n = 8) or only the ACR criteria (n = 6). The same tendency was also observed in patients with primary SS only and in those with secondary SS only. The diagrams indicate that the JPN and ACR diagnostic systems are similar, whereas the AECG diagnostic system differs from the other two. Table 8 shows the agreement among the three diagnostic systems, as assessed using the kappa coefficient. The data indicate a high level of agreement between the JPN and ACR diagnostic systems (kappa coefficient 0.74), but a low level of agreement between AECG and the other two (kappa coefficient 0.10–0.46) in the diagnosis of all SS, primary SS, and secondary SS.

Fig. 1 Venn diagrams showing a comparison of the satisfaction of the three tested systems. a Comparison of the satisfaction of the three tested systems, performed using data from all 476 SS patients (302 primary SS and 174 secondary SS). b Comparison of the satisfaction of the three tested systems using data on 302 patients with primary SS. c Comparison of the satisfaction of the three tested systems using data on 174 patients with secondary SS. Numbers show the numbers of patients who satisfied each set of criteria. None indicates the number of patients who did not satisfy the criteria of any of the three systems. JPN criteria: the revised Japanese Ministry of Health criteria for the diagnosis of SS (1999); AECG criteria: the American-European Consensus Group classification criteria for SS (2002); ACR criteria: the American College of Rheumatology classification criteria for SS (2012)
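The region counts in such Venn diagrams can be derived by tallying, for each patient, which of the three systems are satisfied. The sketch below is one possible way to do this; the function name, the data layout, and the toy records are assumptions for illustration and do not reproduce the study data.

```python
from collections import Counter
from typing import Iterable, Tuple

SYSTEMS = ("JPN", "AECG", "ACR")


def venn_regions(flags: Iterable[Tuple[bool, bool, bool]]) -> Counter:
    """Count patients per Venn region.

    Each record holds three booleans (JPN, AECG, ACR satisfied).
    The key of each region is the tuple of satisfied systems,
    or ("None",) when no system is satisfied.
    """
    counts: Counter = Counter()
    for record in flags:
        satisfied = tuple(name for name, ok in zip(SYSTEMS, record) if ok)
        counts[satisfied or ("None",)] += 1
    return counts


# Hypothetical toy records, one tuple per patient.
toy = [
    (True, True, True),     # satisfies all three systems
    (False, True, False),   # satisfies only AECG
    (False, True, False),   # satisfies only AECG
    (True, False, True),    # satisfies JPN and ACR but not AECG
    (False, False, False),  # satisfies none
]
regions = venn_regions(toy)
```

Here `regions[("AECG",)]` gives the number of patients who satisfied only the AECG criteria, which corresponds to the "only AECG" counts quoted in the text.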

Table 8 Agreement among the three tested systems, as assessed using the kappa coefficient
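For readers unfamiliar with the statistic, the following is a minimal sketch of a pairwise kappa coefficient for two binary classifications. It assumes Cohen's kappa, which the text does not specify, and the function name and toy data are illustrative.

```python
from typing import Sequence


def cohen_kappa(a: Sequence[bool], b: Sequence[bool]) -> float:
    """Cohen's kappa for two binary ratings of the same patients."""
    n = len(a)
    if n == 0 or n != len(b):
        raise ValueError("inputs must be non-empty and of equal length")

    # Observed agreement: both satisfied or both not satisfied.
    agree = sum(1 for x, y in zip(a, b) if bool(x) == bool(y))
    p_observed = agree / n

    # Agreement expected by chance from each system's positive rate.
    rate_a = sum(1 for x in a if x) / n
    rate_b = sum(1 for y in b if y) / n
    p_expected = rate_a * rate_b + (1 - rate_a) * (1 - rate_b)

    if p_expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (p_observed - p_expected) / (1 - p_expected)


# Hypothetical toy data: identical ratings give kappa = 1,
# ratings that agree only at chance level give kappa = 0.
identical = cohen_kappa([True, True, False, False], [True, True, False, False])
chance = cohen_kappa([True, True, False, False], [True, False, True, False])
```

Kappa corrects the raw proportion of agreement for the agreement expected by chance, so two systems that are each satisfied by most patients do not automatically obtain a high value.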

Discussion

Although it is difficult to select a gold standard system for the diagnosis of CTDs such as systemic lupus erythematosus (SLE), rheumatoid arthritis (RA), and SS, this issue is clinically relevant and important. For SLE, the ACR revised criteria for the classification of SLE (1997) [4] have been adopted for diagnosis in daily clinical practice and for classification purposes in clinical studies. Recently, the Systemic Lupus International Collaborating Clinics (SLICC) proposed new classification criteria for SLE [5], which has generated discussion about these two sets of criteria among expert rheumatologists. For RA, the 2010 RA classification criteria, an ACR/European League Against Rheumatism (EULAR) collaborative initiative [6], were published recently and are currently used not only in clinical studies for the classification of RA but also in daily clinical practice for the diagnosis of RA. Therefore, these diagnostic systems for SLE and RA could be regarded as the gold standard for both clinical studies and daily clinical practice. The AECG criteria have been adopted in Western countries for the diagnosis of SS. In Japan, however, both the AECG and JPN criteria are currently being used for the classification and diagnosis of SS. In addition, the new ACR criteria have been proposed as a uniform classification for SS. At present, there is no gold standard system for the diagnosis of SS in either clinical studies or daily clinical practice, apart from expert judgment. This situation could create a heterogeneous pool of SS patients, which makes it difficult to analyze the diagnosis, efficacy of treatment, and prognosis of SS patients. Establishing a single set of criteria for SS and selecting a gold standard system for the diagnosis of SS are important tasks in Japan.

The present study demonstrated that, relative to clinical judgment as the gold standard, the sensitivity of the JPN system for all SS and secondary SS, the sensitivity of the AECG system for primary SS, and the specificities of the JPN and AECG systems for all SS, primary SS, and secondary SS were the highest among the three systems for diagnosing SS in Japanese patients. The results also showed high agreement between the JPN and ACR systems, but low agreement between AECG and the other two diagnostic systems for all SS, primary SS, and secondary SS. These results indicate that the JPN and ACR criteria covered similar patient populations, although the sensitivity and specificity were higher for the JPN system than for the ACR system. Among the 302 patients with primary SS, 14 did not satisfy the ACR criteria for the diagnosis of SS, although they met both the JPN and AECG criteria. Further analysis of these 14 SS patients showed that 50 % had negative pathological findings, 70 % had negative ocular staining, and 50 % were negative for autoantibodies (data not shown). These SS patients could be misclassified as non-SS by the ACR criteria, resulting in the lower sensitivity of the ACR diagnostic system. On the other hand, among the 197 non-SS patients without other CTDs, ten patients satisfied the ACR criteria but neither the JPN nor the AECG criteria (data not shown). Further analysis of these ten patients indicated that 80 % were positive for lissamine green ocular staining (Schirmer’s test, rose bengal staining, and fluorescein staining were not performed), and 60 % were positive for anti-SS-A antibody (data not shown). Although these patients might be misdiagnosed as having primary SS by the ACR criteria, this could not be confirmed, because these patients could have been positive for the other ocular tests adopted by the JPN and AECG diagnostic systems, which were not performed.

The specificities of the JPN and AECG systems for all SS, primary SS, and secondary SS were the same in this study. This may be because the same number of non-SS patients (21 patients, including 18 patients without CTDs and 3 patients with CTDs) satisfied the JPN criteria and the AECG criteria. However, the JPN and AECG profiles of 20 of these 21 non-SS patients were completely different, highlighting the low agreement between JPN and AECG, as shown in Table 8.
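The shared specificities can be recovered from these false-positive counts and the non-SS group sizes reported in the Results. The sketch below assumes that all 218 non-SS patients form the comparison group for all SS, the 197 without other CTDs for primary SS, and the 21 with other CTDs for secondary SS; this pairing is inferred from the reported numbers rather than stated explicitly, and the function name is illustrative.

```python
def specificity_pct(non_ss_total: int, false_positives: int) -> float:
    """Specificity in percent, rounded to one decimal place."""
    true_negatives = non_ss_total - false_positives
    return round(100 * true_negatives / non_ss_total, 1)


# Non-SS patients who satisfied the JPN (or, equally, the AECG) criteria:
# 21 in total, 18 without other CTDs and 3 with other CTDs.
spec_all_ss = specificity_pct(218, 21)       # comparison group: all non-SS
spec_primary_ss = specificity_pct(197, 18)   # comparison group: non-SS without CTDs
spec_secondary_ss = specificity_pct(21, 3)   # comparison group: non-SS with CTDs

print(spec_all_ss, spec_primary_ss, spec_secondary_ss)
```

Under these assumptions the sketch yields 90.4, 90.9, and 85.7 %, matching the specificities reported for both the JPN and AECG systems.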

The sensitivity of AECG for primary SS was the highest among the three systems, whereas the sensitivity of JPN was the highest for all SS and secondary SS. Among the 302 primary SS patients, 19 patients satisfied only the AECG criteria. These 19 primary SS patients had high frequencies of dry eye (84.2 %) and dry mouth (100.0 %) but low frequencies of anti-SS-A antibody (10.5 %) and anti-SS-B antibody (0 %). These seronegative primary SS patients with symptoms of dryness could only be diagnosed by the AECG criteria, because only the AECG criteria include symptoms of dryness. This may be why the sensitivity of AECG for primary SS was the highest among the three systems.

The above findings suggest that the JPN criteria provided the best set of criteria for the diagnosis of SS in Japanese patients. Admittedly, however, the results of the present study do not allow us to confirm the superiority of the JPN criteria, owing to the inherent limitations of the study. First, we used the clinical judgment of the physician in charge as the gold standard. Because the JPN criteria are the criteria most commonly used in daily clinical practice in Japan, the clinical judgment could depend on whether the JPN criteria were satisfied. To avoid this bias, it would be better to use expert committee consensus based on clinical case scenarios as the gold standard for diagnosis. Second, only patients who had been examined for all four items of the JPN diagnostic system (pathology, oral, ocular, anti-SS-A/SS-B antibodies) were included in this study, and the methods used for ocular staining varied among the participating institutions. Third, the results of the study could include selection bias. For these reasons, a more rigorous validation study is needed, using randomly selected clinical case scenarios from various institutions and expert committee consensus diagnosis as the gold standard, to test the three diagnostic systems for SS, to unify the criteria used for the diagnosis of SS, and ultimately to select the gold standard set of criteria for the diagnosis of SS in Japan.

Currently, the JPN diagnostic system is only used in Japan, because ACR and EULAR have never validated the JPN system. Therefore, we strongly hope that an ACR/EULAR collaborative initiative will validate JPN as well as the AECG and ACR systems.

In conclusion, although this study has a few limitations, its results indicate the superiority of the JPN criteria, which had higher sensitivity and specificity for the diagnosis of SS in Japanese patients than the ACR and AECG criteria.