Key Summary Points

Why carry out this study?

Recently, new sets of diagnostic criteria were proposed, including criteria by the ACTTION-American Pain Society Pain Taxonomy (AAPT) group and Fibromyalgia Assessment Status (FAS) 2019 modified criteria for fibromyalgia (FM)

Because the appropriateness of the AAPT criteria and modified FAS criteria has not yet been assessed, we explored the performance of the AAPT criteria and modified FAS criteria for diagnosing FM compared to the existing American College of Rheumatology (ACR) criteria

What was learned from the study?

Although the AAPT criteria and modified FAS criteria have simplified the diagnostic criteria to facilitate patient identification, these criteria had lower diagnostic accuracy than most ACR criteria

The 2016 ACR criteria showed the best performance among the various diagnostic criteria

Digital Features

This article is published with digital features, including a summary slide to facilitate understanding of the article. To view digital features for this article go to https://doi.org/10.6084/m9.figshare.14555949.

Introduction

Fibromyalgia (FM) is a chronic pain disorder characterized by chronic widespread pain and associated symptoms, including fatigue, sleep disorder, depression, and anxiety [1]. FM has become a considerable problem for patients and healthcare providers that leads to functional impairment, poor quality of life, and socioeconomic burdens [2, 3]. Although many efforts have been implemented to improve the diagnostic accuracy of FM in recent decades, it remains underdiagnosed or under-recognized. A previous study showed that the diagnosis of FM requires > 2 years and that patients with chronic pain will see a mean of 3.7 different physicians during that period [4].

The American College of Rheumatology (ACR) released the first set of criteria to discriminate FM from other chronic pain disorders in 1990 [5]. The 1990 ACR criteria had several weaknesses, including the absence of extra-pain manifestations, the subjectivity of the tender point examination, and difficulty of implementation. Therefore, the same authors provided new, presumably improved preliminary FM criteria in 2010/2011 [6, 7]. These 2010/2011 ACR criteria for FM excluded the tender point examination and included a systemic symptom-based assessment of conditions including fatigue, sleep problems, and cognitive and somatic symptoms. However, these criteria led to misclassification because the included widespread pain index (WPI), which indicates the number of pain locations, does not consider the spatial distribution of these locations. By omitting the definition of generalized pain, the 2010/2011 ACR criteria provided uncertain discrimination between FM and localized functional pain syndromes. Smythe et al. [8] described other limitations of the 2010/2011 ACR criteria, including dilution, inconsistency, loss of specificity, and loss of the ability to recognize FM in patients with other disease states. To resolve these problems and improve the usefulness of the criteria, revised ACR criteria were released in 2016 [9]. These criteria minimized the misclassification of regional pain disorders as FM and eliminated the previously confusing recommendations regarding diagnostic exclusions.

Recently, to resolve problems related to diagnosis (e.g., the considerable complexity of the existing criteria), the Analgesic, Anesthetic, and Addiction Clinical Trial Translations Innovations Opportunities and Networks (ACTTION) public-private partnership with the US Food and Drug Administration (FDA) and the American Pain Society initiated the ACTTION-American Pain Society Pain Taxonomy (AAPT) to develop “core diagnostic criteria” that would be clinically useful for discriminating FM from chronic pain disorders [10]. The aim of this approach was to apply the multidimensional diagnostic framework to FM and identify new approaches to diagnose FM that might improve its identification in clinical practice. Concurrently, a modified version of the Fibromyalgia Assessment Status (FAS) criteria (originally published in 2009) was released [11]. The AAPT criteria and modified FAS criteria included fatigue and sleep disorder as the main associated symptoms and simplified the number of pain sites involved. The common goal of these new sets of criteria was to reduce the complexity of FM diagnosis and enable the FM criteria to be more easily implemented in clinical practice. Because the appropriateness of the AAPT criteria and modified FAS criteria has not yet been assessed, we validated the Korean versions of these new sets of criteria and evaluated the performance of these criteria for diagnosing FM compared to the existing ACR criteria.

Methods

Study Design and Population

The sample size was calculated using G*Power software; assuming sensitivity = 0.8, specificity = 0.8, and α = 0.05, the minimum sample size was 90 participants per group. Assuming a drop-out rate of approximately 10%, we attempted to recruit 99 patients with FM and 99 controls. In total, 203 patients (95 with FM and 108 with various rheumatic diseases such as rheumatoid arthritis, systemic lupus erythematosus, osteoarthritis, and myofascial pain syndrome) were invited to join the study during outpatient visits to Chonnam National University Hospital. All evaluations were performed between January 2020 and October 2020. Patients with FM had been diagnosed by experienced rheumatologists based on their clinical features before the study assessment; they were regularly followed up at Chonnam National University Hospital. The original diagnosis of FM was based on a history of chronic widespread pain, associated symptoms, and tender point examinations. Patients with rheumatoid arthritis met the 2010 ACR/European League Against Rheumatism criteria for rheumatoid arthritis [12], and patients with systemic lupus erythematosus satisfied the 1997 update of the 1982 ACR revised criteria [13]. Patients with osteoarthritis met the ACR classification criteria [14], and patients with myofascial pain syndrome were diagnosed using the Travell and Simons criteria [15]. The control group thus comprised 56 patients with rheumatoid arthritis, 24 patients with systemic lupus erythematosus, 16 patients with osteoarthritis, and 12 patients with myofascial pain syndrome. The study was approved by the institutional review board of Chonnam National University Hospital (IRB no. CNUH-2020-041). All patients provided written informed consent upon enrollment in the study. The methods of this study are adopted from our previous study [16].

Patients were interviewed using structured questionnaires, which included assessments of their sociodemographic characteristics and clinical manifestations (e.g., age, symptom duration, disease duration, marital status, educational level, employment status, family history, social history, and comorbidities). To assess FM symptom severity, visual analog scales were used by patients to rate their current levels of pain and fatigue. All patients were assessed using the 1990, 2010, 2011, and 2016 ACR criteria; AAPT criteria; and modified FAS criteria. A trained rheumatologist assessed tender points with reference to the standardized survey manual [17]. These were identified by direct palpation at 18 specific sites; a force of 4.0 kg was delivered via direct thumb palpation, in accordance with the standardized protocol. The tender point count was the sum of the number of such points, and a tender point score was calculated by summing the scores for each tender point as follows: 0, no tenderness; 1, light tenderness (confirmed verbally upon questioning); 2, moderate tenderness (spontaneous verbal response); 3, severe tenderness (physical movement away).
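The tender point count and tender point score described above can be illustrated with a minimal sketch. The function name and the example grades are hypothetical, and the sketch assumes that any site graded 1 or higher contributes to the count; the text does not state which grade threshold defines a positive tender point.

```python
from typing import Sequence, Tuple

N_SITES = 18  # number of palpated sites stated in the text
VALID_GRADES = (0, 1, 2, 3)  # 0 none, 1 light, 2 moderate, 3 severe


def tender_point_summary(grades: Sequence[int]) -> Tuple[int, int]:
    """Return (tender point count, tender point score) for one patient."""
    if len(grades) != N_SITES:
        raise ValueError(f"expected {N_SITES} grades, got {len(grades)}")
    if any(g not in VALID_GRADES for g in grades):
        raise ValueError("each grade must be 0, 1, 2, or 3")
    # Assumption: a site counts as tender when its grade is at least 1.
    count = sum(1 for g in grades if g >= 1)
    # The score is the plain sum of the per-site grades (range 0 to 54).
    score = sum(grades)
    return count, score


# Made-up grades for illustration only, not study data.
example_grades = [0] * 10 + [1, 2, 3, 1, 2, 3, 1, 2]
print(tender_point_summary(example_grades))
```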

Development of Korean Versions of the AAPT Criteria and Modified FAS Criteria

Three translators (including a rheumatologist) independently translated the AAPT criteria [10] and modified FAS criteria [11] into the Korean language. Reverse translation was independently performed by three bilingual native English speakers who were blinded to the original English version. Specific cultural adaptations were not performed in this study because the questionnaires used with the AAPT criteria and modified FAS criteria lack content related to cultural differences. The AAPT criteria are based on three items. FM can be diagnosed in patients when the following criteria are satisfied: multisite pain defined as ≥ 6 sites of pain from among 9 possible sites (head, left/right arms, chest, abdomen, upper back and spine, lower back and spine including the buttocks, and left/right legs), moderate to severe sleep problems or fatigue, and multisite pain plus fatigue or sleep problems for at least 3 months. The presence of another pain disorder or related symptoms does not rule out a diagnosis of FM. The modified FAS criteria are also based on three items including fatigue, quality of sleep, and chronic widespread pain. The level of fatigue and sleep quality is scored from 0 to 10, and chronic widespread pain is rated in 19 body regions. The total score is 39 points, and FM can be diagnosed when a patient’s score is ≥ 20 points.
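The two decision rules summarized above can be sketched as follows. This is an illustrative reading of the thresholds given in the text (at least 6 of 9 sites, at least 3 months, and a modified FAS total of at least 20 out of 39), not the official scoring software; the function and parameter names are hypothetical, and how "moderate to severe" sleep problems or fatigue is rated is left to the caller as a boolean.

```python
AAPT_TOTAL_SITES = 9      # possible pain sites in the AAPT criteria
AAPT_MIN_SITES = 6        # multisite pain threshold
AAPT_MIN_MONTHS = 3       # minimum symptom duration
FAS_MAX_REGIONS = 19      # body regions rated in the modified FAS criteria
FAS_CUTOFF = 20           # diagnostic cutoff on the 0 to 39 total


def meets_aapt(n_pain_sites: int,
               moderate_to_severe_sleep_or_fatigue: bool,
               duration_months: float) -> bool:
    """Apply the three AAPT items as described in the text."""
    if not 0 <= n_pain_sites <= AAPT_TOTAL_SITES:
        raise ValueError("n_pain_sites must be between 0 and 9")
    return (n_pain_sites >= AAPT_MIN_SITES
            and moderate_to_severe_sleep_or_fatigue
            and duration_months >= AAPT_MIN_MONTHS)


def modified_fas_score(fatigue: int, sleep: int, painful_regions: int) -> int:
    """Sum fatigue (0-10), sleep quality (0-10), and painful regions (0-19)."""
    if not (0 <= fatigue <= 10 and 0 <= sleep <= 10):
        raise ValueError("fatigue and sleep must be between 0 and 10")
    if not 0 <= painful_regions <= FAS_MAX_REGIONS:
        raise ValueError("painful_regions must be between 0 and 19")
    return fatigue + sleep + painful_regions


def meets_modified_fas(fatigue: int, sleep: int, painful_regions: int) -> bool:
    return modified_fas_score(fatigue, sleep, painful_regions) >= FAS_CUTOFF


# Made-up patient for illustration only.
print(meets_aapt(7, True, 12), meets_modified_fas(7, 6, 7))
```

Keeping the thresholds as named constants makes it easy to see that the AAPT rule is conjunctive (all three items must hold), whereas the modified FAS rule is a single cutoff on a summed score.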

To evaluate instrument comprehensibility, 20 patients with FM were interviewed to rate their comprehension of each question using the following four-point scale: 1, somewhat understandable; 2, moderately understandable; 3, understandable; 4, completely understandable. Questions were considered acceptable if they were scored as 3 or 4. Test-retest reliability was evaluated at 2-week intervals in 20 patients with FM. The first evaluation was assessed during a single clinic visit, and the second was assessed at the next clinic visit. During the 2-week period, no intervention was provided.

Construct validity was evaluated by comparing the AAPT criteria and modified FAS criteria with the 1990, 2010, 2011, and 2016 ACR criteria. Responses to the AAPT criteria and modified FAS criteria were also compared to those of the revised Fibromyalgia Impact Questionnaire (FIQR), the EuroQol five-dimensional questionnaire (EQ-5D), and the Multidimensional Health Assessment Questionnaire (MDHAQ). The quality of life and other dimensions of FM were assessed using the Korean version of the FIQR [18], which features 21 questions that are each rated using an 11-point numerical rating scale (score range 0–10, where 10 is the worst score). The FIQR is divided into three linked dimensions: function (nine questions), overall impact (two questions), and symptoms (10 questions). The summed function score (0–90) is divided by three, the summed overall impact score (0–20) is not transformed, and the summed symptom score (0–100) is divided by two. The total score is the sum of these three domain scores. The EQ-5D includes the following five dimensions: mobility, self-care, engagement in usual activities, pain/discomfort, and anxiety/depression [19]. Each dimension has the following three levels of severity: no, some, and extreme problems. The MDHAQ consists of 18 questions, including 8 on activities of daily living, 6 on advanced activities of daily living, and 4 on psychologic distress. The Korean version of the MDHAQ was used in this study [20]. For the evaluation of psychiatric symptoms, depression was evaluated using the Korean version of the Beck Depression Inventory (BDI) [21], which consists of 21 multiple-choice questions. Each item is rated on a 4-point scale, and the scores are summed to yield a total ranging from 0 to 63, where higher scores represent more severe depression. To evaluate the presence and severity of anxiety, the Korean version of the State-Trait Anxiety Inventory (STAI) was used [22]. 
The STAI includes the STAI-I subscale (anxiety associated with a specific event) and the STAI-II subscale (anxiety as a stable personality characteristic) and contains 40 items overall (20 per subscale).
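As a worked illustration of the FIQR scoring rule described above (function sum divided by three, overall impact sum unchanged, symptom sum divided by two), a minimal sketch is given below. The function name and example responses are hypothetical; only the item counts and divisors come from the text.

```python
from typing import Sequence


def fiqr_total(function_items: Sequence[int],
               impact_items: Sequence[int],
               symptom_items: Sequence[int]) -> float:
    """Return the FIQR total score (0 to 100) from the three domains."""
    expected = ((function_items, 9), (impact_items, 2), (symptom_items, 10))
    for items, n in expected:
        if len(items) != n:
            raise ValueError(f"expected {n} items, got {len(items)}")
        if any(not 0 <= x <= 10 for x in items):
            raise ValueError("each item must be rated from 0 to 10")
    function_score = sum(function_items) / 3   # 0-90 scaled to 0-30
    impact_score = sum(impact_items)           # 0-20, not transformed
    symptom_score = sum(symptom_items) / 2     # 0-100 scaled to 0-50
    return function_score + impact_score + symptom_score


# Made-up responses for illustration only.
print(fiqr_total([3] * 9, [4, 6], [5] * 10))
```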

Comparison of the 1990, 2010, 2011, and 2016 ACR Criteria with the AAPT Criteria and Modified FAS Criteria

To compare the sensitivity, specificity, positive predictive value (PPV), negative predictive value (NPV), and Cohen’s kappa coefficient among the previous sets of ACR criteria and the AAPT and modified FAS criteria, all patients underwent physical examinations for the evaluation of tender points in accordance with the 1990 ACR criteria. Then, they completed questionnaires as required by the 2010, 2011, and 2016 ACR criteria; AAPT criteria; and modified FAS criteria.
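For readers unfamiliar with Cohen's kappa, the agreement statistic named above can be sketched as follows. This is a generic textbook implementation for two sets of classifications of the same patients, not the SPSS or STATA routine used in the study, and the example labels are made up.

```python
from typing import Sequence


def cohens_kappa(a: Sequence[int], b: Sequence[int]) -> float:
    """Cohen's kappa for two equally long sequences of category labels."""
    if len(a) != len(b) or len(a) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(a)
    # Observed agreement: proportion of patients classified identically.
    observed = sum(1 for x, y in zip(a, b) if x == y) / n
    # Chance agreement: product of marginal proportions, summed over labels.
    labels = set(a) | set(b)
    expected = sum((list(a).count(lab) / n) * (list(b).count(lab) / n)
                   for lab in labels)
    if expected == 1.0:
        return 1.0  # both raters used a single identical label throughout
    return (observed - expected) / (1 - expected)


# Made-up classifications (1 = FM, 0 = not FM) by two sets of criteria.
print(cohens_kappa([1, 1, 0, 0], [1, 0, 0, 0]))
```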

Statistical Analyses

All statistical analyses were performed using IBM SPSS Statistics for Windows (version 20; IBM Corp., Armonk, NY, USA) and STATA (version 11.0; StataCorp, College Station, TX, USA). Values are expressed as means ± standard deviations for continuous variables and as percentages for categorical variables. The normality of the data distribution was assessed using the Kolmogorov-Smirnov test. Continuous variables were compared using Student’s t test or the Mann-Whitney U test, as appropriate. Qualitative variables were compared between the two groups using the chi-square test or Fisher’s exact test. Test-retest reliability was assessed by calculating Spearman correlation coefficients. The internal consistencies of the AAPT criteria and modified FAS criteria were assessed using Cronbach’s alpha, which measures how closely test items are interrelated and thus the extent to which they assess the same construct. When all items are closely related, Cronbach’s alpha will approach 1. Internal consistency was considered adequate if Cronbach’s alpha was at least 0.7. To assess construct validity, Spearman correlation coefficients were used to compare the results of the AAPT criteria and modified FAS criteria with those of the FIQR, MDHAQ, and EQ-5D and with those of the 1990, 2010, 2011, and 2016 ACR criteria. Receiver-operating characteristic curves were generated and area under the curve values were calculated, along with sensitivity, specificity, PPV, and NPV. A p value of < 0.05 was considered to indicate statistical significance.
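The internal consistency check described above can be sketched with the standard formula for Cronbach's alpha, alpha = k/(k - 1) × (1 - sum of item variances / variance of total scores), where k is the number of items. The sketch below uses sample variances from the Python standard library and made-up ratings; it is an illustration of the formula, not the SPSS procedure used in the study.

```python
from statistics import variance
from typing import Sequence

ADEQUATE_ALPHA = 0.7  # adequacy threshold stated in the text


def cronbach_alpha(rows: Sequence[Sequence[float]]) -> float:
    """Cronbach's alpha; each row holds one respondent's item scores."""
    if len(rows) < 2:
        raise ValueError("at least two respondents are required")
    k = len(rows[0])
    if k < 2 or any(len(r) != k for r in rows):
        raise ValueError("every row needs the same number (>= 2) of items")
    # Sample variance of each item across respondents.
    item_vars = [variance([r[i] for r in rows]) for i in range(k)]
    # Sample variance of each respondent's summed score.
    total_var = variance([sum(r) for r in rows])
    if total_var == 0:
        raise ValueError("total scores have zero variance")
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


# Made-up ratings: three items that move together perfectly.
ratings = [[1, 1, 1], [2, 2, 2], [3, 3, 3], [4, 4, 4]]
alpha = cronbach_alpha(ratings)
print(alpha, alpha >= ADEQUATE_ALPHA)
```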

Results

Of the 203 included patients, 95 with FM and 108 with various rheumatologic disorders completed the questionnaire at the time of enrollment. The mean overall patient age was 50.5 ± 11.7 years, and 82.3% of patients were women. The mean disease duration in all patients was 43.6 ± 50.7 months.

The baseline characteristics of patients with FM were compared to those of patients who had other diseases, as shown in Table 1. The mean age of patients with FM was 48.7 ± 10.7 years, and that of patients who had other rheumatologic diseases was 52.0 ± 12.4 years. In total, 86.3% of patients with FM and 78.7% of all other patients were women. Compared to patients who had other diseases, patients with FM scored significantly higher on the pain visual analog scale; fatigue visual analog scale; FIQR; tender point number and count in the 1990 ACR criteria; WPI and symptom severity score (SSS) in the 2010 ACR criteria; WPI and SSS in the 2011 ACR criteria; WPI and SSS in the 2016 ACR criteria; items 1, 2, and 3 in the AAPT criteria; and fatigue level, sleep quality level, and pain sites in the modified FAS criteria (all p < 0.001). In addition, scores on the EQ-5D, MDHAQ, BDI, and STAI-II were higher in patients with FM than in those who had other rheumatologic diseases (all p < 0.001).

Table 1 Baseline characteristics of 95 patients with fibromyalgia and 108 patients with other rheumatic diseases

The test-retest reliability of the AAPT criteria and modified FAS criteria was assessed in 20 patients over a 2-week interval (Table 2). The Spearman coefficients of the AAPT criteria ranged from 0.761 to 0.805, and those of the modified FAS criteria ranged from 0.752 to 0.805, both indicating acceptable test-retest reliability. The Cronbach’s alpha of the AAPT criteria was 0.814, and that of the modified FAS criteria was 0.897; both exceeded the prespecified threshold of 0.7 for adequate internal consistency.

Table 2 Test-retest reliabilities of the AAPT criteria and the modified FAS criteria in 20 patients with fibromyalgia

To evaluate the construct validities of the AAPT criteria and modified FAS criteria, we compared these criteria with the 1990, 2010, 2011, and 2016 ACR criteria (Table 3). Items 1, 2, and 3 of the AAPT criteria showed moderate to strong associations with the 1990, 2010, 2011, and 2016 ACR criteria, with correlation coefficients ranging from 0.477 to 0.770. Similarly, the fatigue, sleep, and pain domains of the modified FAS criteria were moderately to strongly associated with the existing ACR criteria, with correlation coefficients ranging from 0.535 to 0.940. In addition, the AAPT criteria and modified FAS criteria were significantly correlated with the FIQR, EQ-5D, MDHAQ, and BDI, but not the STAI. Using the AAPT criteria, FM was diagnosed in 56.8% of patients with a prior diagnosis of FM and in 5.6% of those who had other rheumatologic disorders. Using the modified FAS criteria, FM was diagnosed in 60.0% of patients with a prior diagnosis of FM and in 7.4% of those who had other rheumatologic disorders. In comparison, FM was diagnosed in 37.9%, 97.9%, 90.5%, and 94.7% of patients with a prior diagnosis of FM using the 1990, 2010, 2011, and 2016 ACR criteria, respectively.

Table 3 Comparison of correlation coefficients among the AAPT criteria; the modified FAS criteria; and the 1990, 2010, 2011, and 2016 ACR criteria

Table 4 shows the sensitivity, specificity, PPV, and NPV of the AAPT criteria and modified FAS criteria. The sensitivity and specificity of the AAPT criteria were 56.8% (95% confidence interval [CI] 46.3–66.9%) and 94.4% (95% CI 88.3–97.9%), respectively, and the PPV and NPV of the AAPT criteria were 90.0% (95% CI 80.2–95.2%) and 71.3% (95% CI 66.3–75.9%), respectively. The Cohen’s kappa coefficients were 0.376 between the 1990 ACR criteria and the AAPT criteria and 0.864 between the 2011 ACR criteria and the AAPT criteria. The sensitivity and specificity of the modified FAS criteria were 60.0% (95% CI 49.4–69.9%) and 92.6% (95% CI 85.9–96.8%), respectively, and the PPV and NPV of the modified FAS criteria were 87.7% (95% CI 78.2–93.4%) and 72.5% (95% CI 67.2–77.2%), respectively. The Cohen’s kappa coefficients were 0.323 between the 1990 ACR criteria and the modified FAS criteria, and 0.536 between the 2016 ACR criteria and the modified FAS criteria.
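The relationship among the four reported measures can be checked with a short sketch. The 2 × 2 cell counts below are not reported in the text; they are back-calculated here, as an assumption, from the reported percentages and the group sizes (95 patients with FM and 108 controls). The class and function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TwoByTwo:
    tp: int  # FM patients classified as FM
    fn: int  # FM patients not classified as FM
    fp: int  # controls classified as FM
    tn: int  # controls not classified as FM

    def sensitivity(self) -> float:
        return self.tp / (self.tp + self.fn)

    def specificity(self) -> float:
        return self.tn / (self.tn + self.fp)

    def ppv(self) -> float:
        return self.tp / (self.tp + self.fp)

    def npv(self) -> float:
        return self.tn / (self.tn + self.fn)


def as_percent(x: float) -> float:
    return round(100 * x, 1)


# Assumed counts, inferred from the reported percentages (not source data).
AAPT = TwoByTwo(tp=54, fn=41, fp=6, tn=102)
MODIFIED_FAS = TwoByTwo(tp=57, fn=38, fp=8, tn=100)

for name, table in (("AAPT", AAPT), ("modified FAS", MODIFIED_FAS)):
    print(name,
          as_percent(table.sensitivity()), as_percent(table.specificity()),
          as_percent(table.ppv()), as_percent(table.npv()))
```

Under these assumed counts, the computed values match the reported point estimates, which suggests the four measures are internally consistent with one another.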

Table 4 Sensitivity, specificity, positive and negative predictive values of the 1990, 2010, 2011, and 2016 ACR criteria; the AAPT criteria; and the modified FAS criteria

Figure 1 shows the receiver-operating characteristic curves comparing the diagnostic accuracies of the various sets of criteria. The areas under the curve of the AAPT criteria and modified FAS criteria were 0.852 (95% CI 0.801–0.903) and 0.903 (95% CI 0.861–0.944), respectively. The areas under the curve of the 1990, 2010, 2011, and 2016 ACR criteria were 0.931 (95% CI 0.891–0.972), 0.953 (95% CI 0.925–0.981), 0.960 (95% CI 0.936–0.984), and 0.970 (95% CI 0.951–0.989), respectively. Thus, the areas under the curve of the AAPT criteria and modified FAS criteria were lower than those of all four sets of ACR criteria.

Fig. 1 Receiver-operating characteristic curve analyses comparing the diagnostic accuracies of the 1990, 2010, 2011, and 2016 ACR criteria; the AAPT criteria; and the modified FAS criteria

Discussion

In this study, the new AAPT criteria and modified FAS criteria had lower diagnostic accuracies than the 1990, 2010, 2011, and 2016 ACR criteria, as shown by their lower sensitivities and lower area under the curve values. Furthermore, the 2016 ACR criteria demonstrated the best performance among the criteria in this study.

For several decades, extensive efforts have been made to improve the classification, diagnostic criteria, and screening criteria for clinical identification of patients with FM. First, the 1990 ACR criteria adopted tender point examination using an expert consensus approach [5]. Because of the difficulty and heterogeneity of tender point examination among physicians, subsequent ACR criteria eliminated this examination. Although the 2010/2011 ACR criteria introduced “widespread pain” and associated extra-pain symptoms [6, 7], other challenges persisted. Myofascial pain syndrome could be misdiagnosed as FM, and the spatial distribution of pain could be overlooked. To address these problems, the revised 2016 ACR criteria introduced the requirement for ≥ 4 of the 5 body regions to exhibit “generalized pain” rather than “widespread pain” [9]. Although the revised 2016 ACR criteria permitted the coexistence of other diseases, difficulties remain in the clinical diagnosis of FM due to the criteria complexity and various comorbidities [8]. To reduce time involved in patient diagnosis and to improve implementation in daily practice, the AAPT criteria and modified FAS criteria were recently developed [10, 11]. These criteria were proposed to reflect the current understanding of FM and to be easily used for the diagnosis and follow-up of patients with FM.

In this study, the AAPT criteria had lower diagnostic accuracy than the 1990, 2010, 2011, and 2016 ACR criteria. The AAPT FM working group introduced new criteria to improve the identification of patients with FM by simplifying the FM diagnostic criteria [10]. The concept of multisite pain was proposed as a substitute for chronic widespread pain, in which the presence of ≥ 6 sites of pain (from among 9 possible sites) is necessary for diagnosis with FM. In our study, 34 patients with FM who satisfied the WPI of the 2016 ACR criteria did not satisfy the definition of multisite pain (item 1) of the AAPT criteria, suggesting that multisite pain is a stricter requirement than the WPI in the existing ACR criteria. Similarly, the AAPT criteria only focused on sleep disturbance and fatigue among the various associated extra-pain symptoms to simplify the diagnostic criteria, such that patients with mild or fluctuating symptoms were not identified by the AAPT criteria. Salaffi et al. [11] compared the performances of the 2011 ACR criteria, 2016 ACR criteria, and AAPT criteria, showing that the AAPT criteria had the worst performance in terms of sensitivity, specificity, and correct classification. These findings are consistent with our results that the AAPT criteria had the lowest diagnostic accuracy among the six sets of criteria. Taken together, the findings emphasize that simplicity is prioritized in the AAPT criteria, rather than diagnostic accuracy. Thus, extra caution is necessary when the AAPT criteria are used for the diagnosis of patients with chronic pain.

As with the AAPT criteria, the current study showed that the modified FAS criteria had lower diagnostic accuracy than the 1990, 2010, 2011, and 2016 ACR criteria. The modified FAS criteria constitute the updated version of the FAS questionnaire developed in 2009 [23]. These criteria use a simplified rating of chronic widespread pain that involves description of the presence or absence of pain in 19 body regions, rather than assessing pain in those body regions using four-point numerical scales. Also similar to the AAPT criteria, the modified FAS criteria focus on fatigue and quality of sleep to simplify the diagnostic criteria. Thus, despite their simplicity, the modified FAS criteria had limitations similar to those of the AAPT criteria in diagnosing patients with FM, which led the two sets of criteria to demonstrate lower diagnostic accuracy than the existing ACR criteria. Although these two sets of criteria did not accomplish their intended goals, it remains important to facilitate the diagnosis of patients with FM without reducing accuracy. Further efforts are needed to facilitate FM diagnosis, particularly with respect to the challenges facing patients with chronic pain that is not adequately managed.

Conclusions

The AAPT criteria and modified FAS criteria showed lower sensitivity, specificity, and diagnostic accuracy than the 1990, 2010, 2011, and 2016 ACR criteria. Furthermore, the 2016 ACR criteria had the best diagnostic accuracy among the criteria assessed in this study. Further large population-based validation studies are needed to support our findings.