Background

Spondyloarthritis (SpA) is an umbrella term that describes a group of interrelated rheumatic conditions including ankylosing spondylitis (AS), psoriatic arthritis (PsA), spondyloarthritis associated with inflammatory bowel disease (IBD), and reactive arthritis [1]. The development of the Assessment in SpondyloArthritis International Society (ASAS) criteria has led to the subdivision of SpA into predominantly axial SpA and predominantly peripheral SpA, depending on the clinical presentation [2,3,4]. AS is regarded as the disease prototype, and it typically affects patients at a young age. In a multicenter cross-sectional survey in China, the mean ages at onset and at diagnosis of AS were 29.2 and 33.5 years, respectively [5]. Studies have shown that AS patients have greater work disability (WD) than the general population, with WD rates varying from 3 to 50% in western countries [6,7,8]. Patients with AS are 3.1 times more likely to withdraw from work than expected in the general population, and they are also more likely to experience a lower quality of life (QoL) [9, 10]. Patients with more severe AS showed significantly greater impairment in work and daily activities than patients with milder disease [11], and this loss of work productivity can lead to increased lifetime costs and socioeconomic burden [6, 7].

The use of biologics in the treatment of SpA has gained popularity over the past two decades. With better disease control, there is a growing interest in the assessment of health-related quality of life (HRQoL). This is particularly important for determining the impact and effectiveness of new pharmaceutical agents and for comparing different treatment regimens. Studies have shown that patients with axial SpA report a lower HRQoL than healthy controls, and this reduction in HRQoL is associated with fatigue, pain, increased disease activity, and decreased daily activity and exercise [12,13,14]. Furthermore, a lower HRQoL in SpA patients is associated with adverse psychological outcomes, including body image disturbance and a higher prevalence of depression and anxiety [15, 16].

Two main types of HRQoL instruments, disease-specific and generic, are used to assess patients with chronic diseases. Disease-specific tools provide an assessment of the disease state and treatment outcomes. For axial SpA, disease-specific tools for assessing functional disability include the Bath Ankylosing Spondylitis Functional Index (BASFI) [17], the Leeds Disability Questionnaire (LDQ) [18], and the Dougados Functional Index (DFI) [19]. Generic instruments are more useful for assessing disease impact because they allow comparisons between different disease populations. One such tool is the 36-item Short-form (SF-36) questionnaire [20,21,22], which provides a numerical measurement of a patient’s health. However, it does not incorporate preferences for health states and cannot be used directly in cost-utility analyses. The EuroQoL 5-dimension (EQ-5D) is a generic health measure instrument developed by the EuroQoL group, which allows a quantitative expression of the individual’s perception of their overall health status [23]. It serves as an important utility measure for clinical and economic appraisal, particularly in the cost-utility analysis of various health care interventions and the calculation of quality-adjusted life years (QALYs). It has been applied to the Chinese population previously [24] and has been validated in other spine conditions such as adolescent idiopathic scoliosis [25,26,27]. However, its applicability in Chinese patients with SpA is currently unknown. Hence, the aim of this study is to validate the use of EQ-5D in Chinese patients with SpA and to test its psychometric properties.

Methods

A total of 220 consecutive patients of Chinese ethnicity were prospectively recruited from 2 rheumatology specialist clinics between May and December 2017. All recruited patients fulfilled either the ASAS axial SpA criteria [2, 3] or peripheral SpA criteria [4] for diagnosis. All recruited patients were 18 years old or above. Patients who did not consent to participate, were non-Chinese, were illiterate, or were unable to comprehend the instruments were excluded. Subjects who consented were interviewed for a panel of sociodemographic and disease-associated parameters, disease activity and severity factors, and HRQoL scores that highlight the functional and mental health status. To assess the test-retest reliability of the study instruments, all subjects were interviewed again over the phone by the same research personnel, using the same study questionnaires, 2 weeks after their baseline interview.

Sociodemographic and disease-associated data

Patients’ smoking and drinking habits, education level, income, and occupation were recorded. Disease-associated data including disease duration, the presence of back pain and/or peripheral arthritis, dactylitis, enthesitis, and extra-articular manifestations such as uveitis, psoriasis, IBD, and history of sexually transmitted disease or dysentery were collected. Physical examination was performed to determine the tender joint count (TJC), the swollen joint count (SJC), and the dactylitis and enthesitis scores. An antero-posterior radiograph of the lumbosacral (LS) spine was used for grading of sacroiliitis according to the modified New York criteria [28] by a rheumatologist (HYC) who was blinded to the clinical data. Radiological sacroiliitis was graded as follows: 0, normal; 1, suspicious; 2, minimal sclerosis with some erosions; 3, erosion with widening of joint space and possible partial ankylosis; and 4, complete ankylosis. Bilateral sacroiliitis of grade 2 or above, or unilateral sacroiliitis of grade 3 or above, was defined as AS.
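
The radiographic definition of AS given above can be written as a short rule. The following is a minimal sketch, not the study's own code; the function name and the left/right argument names are hypothetical and are introduced only for illustration.

```python
def meets_radiographic_as(left_grade: int, right_grade: int) -> bool:
    """Apply the sacroiliitis rule described in the text.

    Grades run from 0 (normal) to 4 (complete ankylosis) for each
    sacroiliac joint. AS is defined as bilateral grade >= 2, or
    grade >= 3 on at least one side.
    """
    for grade in (left_grade, right_grade):
        if grade not in range(0, 5):
            raise ValueError("sacroiliitis grade must be an integer from 0 to 4")
    bilateral = left_grade >= 2 and right_grade >= 2
    unilateral = left_grade >= 3 or right_grade >= 3
    return bilateral or unilateral


# Illustrative inputs only (not patient data).
print(meets_radiographic_as(2, 2))
print(meets_radiographic_as(2, 1))
```

Under this rule, grade 2 on one side only does not meet the definition, whereas grade 3 on one side does, regardless of the other side.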

Disease activity and severity scores

All recruited patients completed the Bath Ankylosing Spondylitis Disease Activity Index (BASDAI) [29] and BASFI [17] to determine the disease activity and functional disability, respectively. Spinal mobility was assessed clinically to determine the Bath Ankylosing Spondylitis Metrology Index (BASMI) [30] score. The Bath Ankylosing Spondylitis Global Index (BASGI) [31] and C-reactive protein (CRP) were measured for calculation of the Ankylosing Spondylitis Disease Activity Score with CRP (ASDAS-CRP) [32], which is a composite disease activity measure of SpA. Human leucocyte antigen (HLA) B27 status was also checked as a poor prognostic marker.

Functional and mental health status

The SF-36 [20,21,22] was used for the assessment of mental and physical health and as a generic comparator questionnaire for the EQ-5D. The Work Productivity and Activity Impairment (WPAI) questionnaire [33] was used to assess impairment of work productivity and regular activity. The Oswestry Disability Index (ODI) [34, 35] was used to assess the functional disability caused by back pain. The Hospital Anxiety and Depression Scale (HADS) [16, 36] was used to assess mental health status.

The main study parameter was the EQ-5D, which is a standardized measure of health status developed by the EuroQoL group that allows a generic assessment of health status for clinical and economic appraisal [23]. It consists of a two-page questionnaire: the EQ-5D descriptive system and the EQ visual analogue scale (EQ VAS). The descriptive system comprises five domains: mobility, self-care, usual activities, pain/discomfort, and anxiety/depression. There are two versions of EQ-5D, namely the EQ-5D-3 level (EQ-5D-3L) and the EQ-5D-5 level (EQ-5D-5L) versions. For the EQ-5D-3L, each domain is scored on three levels (no problem, some problem, and extreme problem). We utilized the EQ-5D-5L version for this study, and each domain was scored on five levels, with one representing no problem and five representing extreme problem. Previous studies published by the EuroQoL group have shown that the five-level version could significantly increase reliability and sensitivity while maintaining the feasibility of the test, and it could potentially reduce ceiling effects [23]. The scores of the five domains are combined into a five-digit number, which is converted into a single index value. The EQ VAS allows patients to self-report their own perceived quality of life on a scale from 0 (worst) to 100 (best). Currently, no Chinese-specific EQ-5D-5L value set is available, and hence we adopted an indirect two-step approach to obtain the index value. We first converted the EQ-5D-5L into the EQ-5D-3L health status via a transition probability matrix [37]; subsequently, the EQ-5D-3L health status was scored according to a Chinese-specific EQ-5D-3L value set ranging from −0.149 for the worst health status (“33333”) to 1 for the best health status (“11111”).
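
The two-step scoring described above can be illustrated with a small sketch. It assumes that the index for a five-level state is taken as the probability-weighted average of three-level index values, which is one common way of applying a transition probability matrix. The function names are hypothetical, and the transition probabilities in the example are invented placeholders, not entries from the matrix in [37]. Only the two anchor values (1 for “11111” and −0.149 for “33333”) come from the text.

```python
from typing import Dict, Sequence


def profile_5l(levels: Sequence[int]) -> str:
    """Combine five domain levels (each 1-5) into a five-digit health state."""
    if len(levels) != 5 or any(level not in range(1, 6) for level in levels):
        raise ValueError("expected five domain levels, each between 1 and 5")
    return "".join(str(level) for level in levels)


def crosswalk_index(transition_probs: Dict[str, float],
                    value_set_3l: Dict[str, float]) -> float:
    """Probability-weighted mean of 3L index values for one 5L state.

    transition_probs maps each 3L state to the probability that the
    given 5L state corresponds to it; the probabilities must sum to 1.
    """
    total = sum(transition_probs.values())
    if abs(total - 1.0) > 1e-9:
        raise ValueError("transition probabilities must sum to 1")
    return sum(p * value_set_3l[state] for state, p in transition_probs.items())


# Order: mobility, self-care, usual activities, pain/discomfort, anxiety/depression.
state = profile_5l([1, 1, 2, 3, 1])

# Toy value set holding only the two anchor values quoted in the text.
toy_value_set = {"11111": 1.0, "33333": -0.149}
# Hypothetical probabilities, for illustration only.
toy_probs = {"11111": 0.9, "33333": 0.1}
index_value = crosswalk_index(toy_probs, toy_value_set)
print(state, round(index_value, 4))
```

In practice each five-level state would map to many three-level states, and the full matrix and value set would replace the toy dictionaries.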

Statistical analysis

Overall baseline descriptive characteristics were reported as mean ± standard deviation (SD). Differences between measures were compared using the independent t test and the Chi-squared test where appropriate. A floor or ceiling effect was considered present if at least 15% of patients achieved the lowest or highest possible score, respectively [38]. Internal consistency of the measurements was assessed using Cronbach’s alpha, with a value > 0.7 indicating adequacy [39]. Test-retest reliability over the 2-week period was assessed by weighted kappa for the five domains of EQ-5D-5L and by the intra-class correlation coefficient (ICC). An ICC of ≥ 0.7 was used to indicate good reproducibility [38]. A weighted kappa score of ≤ 0.2 was indicative of poor agreement, 0.21–0.4 was fair, 0.41–0.6 was moderate, 0.61–0.8 was good, and > 0.8 was very good [40].
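
The thresholds in this paragraph can be made concrete with a short sketch. It uses the standard formula for Cronbach’s alpha, alpha = k/(k − 1) × (1 − sum of item variances / variance of total scores), together with the 15% rule for floor and ceiling effects. This is an illustration under stated assumptions rather than the Stata procedure used in the study: the function names are hypothetical, and kappa values that fall exactly on a boundary are assigned to the lower category.

```python
from statistics import variance
from typing import Sequence, Tuple


def cronbach_alpha(items: Sequence[Sequence[float]]) -> float:
    """Cronbach's alpha; items[i][j] is respondent j's score on item i."""
    k = len(items)
    if k < 2:
        raise ValueError("at least two items are required")
    totals = [sum(scores) for scores in zip(*items)]  # total per respondent
    item_variance_sum = sum(variance(item) for item in items)
    return k / (k - 1) * (1 - item_variance_sum / variance(totals))


def floor_ceiling(scores: Sequence[float], worst: float, best: float,
                  threshold: float = 0.15) -> Tuple[bool, bool]:
    """Return (floor_effect, ceiling_effect) using the 15% rule."""
    n = len(scores)
    floor = sum(s == worst for s in scores) / n >= threshold
    ceiling = sum(s == best for s in scores) / n >= threshold
    return floor, ceiling


def kappa_agreement(kappa: float) -> str:
    """Label a weighted kappa value (boundaries go to the lower band)."""
    if kappa <= 0.2:
        return "poor"
    if kappa <= 0.4:
        return "fair"
    if kappa <= 0.6:
        return "moderate"
    if kappa <= 0.8:
        return "good"
    return "very good"


# Illustrative responses for one EQ-5D-5L domain (1 = no problem, 5 = extreme).
domain_scores = [1] * 10 + [2] * 9 + [5]
print(floor_ceiling(domain_scores, worst=5, best=1))
print(kappa_agreement(0.7))
```

For an EQ-5D-5L domain the best score is level 1, so a ceiling effect means that at least 15% of respondents report no problem in that domain.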

Disease activity was determined by BASDAI and ASDAS-CRP for analysis. In addition, the presence of peripheral arthritis, dactylitis, uveitis, and psoriasis and the HLA-B27 status were also used. All of these parameters were dichotomized into “no” or “yes” for analysis except for BASDAI and ASDAS-CRP. BASDAI was dichotomized into “low (score < 4)” or “high (score ≥ 4)” disease activity, and ASDAS-CRP was categorized as “inactive disease (< 1.3),” “moderate disease activity (1.3–< 2.1),” “high disease activity (2.1 to < 3.5),” and “very high disease activity (> 3.5).” The association of these factors representing disease activity with ODI, HADS depression, anxiety and total scores, EQ VAS, EQ-5D, BASFI, BASMI, SF-36, and its 10 domains (physical functioning, physical role, emotional role, vitality, emotional well-being, social functioning, bodily pain, general health, physical component score, mental component score) was assessed by the independent t test. Several parameters required multiple categories to distinguish disease activities. HADS depression and anxiety scores were categorized as “normal (0–7),” “borderline (8–10),” and “abnormal (11–21).” ODI was categorized into “minimal disability (0–20),” “moderate disability (21–40),” “severe disability (41–60),” and “crippled (61–80).” The association of these parameters with the instruments listed above was assessed with analysis of variance (ANOVA).
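
The cut-offs listed in this paragraph can be expressed as simple grouping functions, sketched below. The function names are hypothetical. Two choices are assumptions made for this illustration rather than rules stated by the study: an ASDAS-CRP score of exactly 3.5 is placed in the high category, and ODI scores above 80, which the listed categories do not cover, are returned as unclassified.

```python
def basdai_category(score: float) -> str:
    """Dichotomize BASDAI: low (< 4) or high (>= 4) disease activity."""
    return "low" if score < 4 else "high"


def asdas_crp_category(score: float) -> str:
    """Four ASDAS-CRP disease activity groups."""
    if score < 1.3:
        return "inactive"
    if score < 2.1:
        return "moderate"
    if score <= 3.5:  # assumption: exactly 3.5 counts as high
        return "high"
    return "very high"


def hads_category(score: int) -> str:
    """HADS depression or anxiety subscale (0-21)."""
    if not 0 <= score <= 21:
        raise ValueError("HADS subscale scores range from 0 to 21")
    if score <= 7:
        return "normal"
    if score <= 10:
        return "borderline"
    return "abnormal"


def odi_category(score: float) -> str:
    """ODI disability groups as listed in the text."""
    if score < 0:
        raise ValueError("ODI cannot be negative")
    if score <= 20:
        return "minimal disability"
    if score <= 40:
        return "moderate disability"
    if score <= 60:
        return "severe disability"
    if score <= 80:
        return "crippled"
    return "unclassified"  # assumption: above the listed range


# Illustrative values only (not patient data).
print(basdai_category(4.0), asdas_crp_category(2.4),
      hads_category(8), odi_category(41))
```

Two-group variables such as the BASDAI split would then feed an independent t test, and the multi-group variables an ANOVA, as described above.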

Spearman’s correlation was used to assess the validity of SF-36 and EQ-5D-5L scores against other instruments, including ODI, HADS, BASFI, BASMI, BASDAI, EQ VAS, and ASDAS-CRP. All statistical analyses were conducted using STATA version 13.0. A p value of < 0.05 was considered statistically significant, and 95% confidence intervals (CIs) were listed as appropriate.
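
For readers unfamiliar with the statistic, Spearman’s coefficient is the Pearson correlation computed on ranks, with tied values sharing their average rank. The sketch below uses only the Python standard library and is an illustration, not the STATA routine used in the study. The function names are hypothetical and the example values are invented.

```python
from math import sqrt
from typing import List, Sequence


def average_ranks(values: Sequence[float]) -> List[float]:
    """Rank values from 1..n, giving tied values their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the current run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_rho(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman's rank correlation: Pearson correlation of the ranks."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in rx))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in ry))
    if sd_x == 0 or sd_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (sd_x * sd_y)


# Invented example: higher disability scores paired with lower index values.
disability = [10, 22, 35, 48, 60]
index_values = [0.95, 0.80, 0.71, 0.55, 0.30]
print(spearman_rho(disability, index_values))
```

A negative coefficient here means that higher disability scores go with lower index values, which is the direction reported for the disease measures in this study.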

Results

A total of 220 Chinese patients with SpA were recruited consecutively, without any exclusions or refusals after recruitment. The mean age was 47.2 ± 14.1 years, and 67.3% of them were male. Baseline characteristics of the recruited SpA patients are shown in Table 1. In total, 61.4% of patients had low disease activity with a BASDAI of < 4, and 78.5% of patients were positive for HLA-B27. Very high disease activity by ASDAS-CRP, severe ODI (crippled), and dactylitis were uncommon.

Table 1 Demographic and clinical characteristics of patients

Table 2 lists the overall average scores for each study instrument. Cronbach’s alpha coefficient was 0.843 for the EQ-5D-5L score, indicating acceptable internal consistency. A ceiling effect was observed for all EQ-5D-5L domains except pain/discomfort (Fig. 1). No floor effect was observed. The ICC of the EQ-5D-5L was 0.828, supporting good reliability (Table 3). The overall ODI score was low, indicating that the overall disability level of our cohort was not severe. The same was observed for the BASFI and HADS scores.

Table 2 Descriptive statistics of baseline measures
Fig. 1 Distribution of EQ-5D-5L responses in the study cohort

Table 3 Reliability of EQ-5D-5L

Correlations between the various study instruments are listed in Table 4. Statistically significant negative correlations were observed between ODI, HADS, BASFI, BASMI, BASDAI, and ASDAS-CRP and the SF-36 and EQ-5D-5L scores. As an internal verification, SF-36 scores correlated positively with EQ VAS. The largest correlations were observed for ODI and BASFI, and the smallest for BASMI. No significant correlations were noted for back pain duration, psoriasis duration, swollen joints, or dactylitis score. Tender joints, however, correlated with poorer SF-36 functional and pain scores and with poorer EQ-5D-5L scores. Enthesitis score, CRP, and ESR were also negatively correlated with SF-36.

Table 4 Spearman correlation coefficient between measures

Most scores differentiated patients according to current back pain and spinal pain in the past week (Tables 5 and 6). Higher disease activity was well differentiated by EQ-5D-5L and SF-36 scores. Patients with a high BASDAI score had a lower EQ-5D score than those with a low BASDAI score (0.656 vs 0.874, p < 0.001). A similar pattern was observed across ASDAS-CRP categories for both EQ-5D and SF-36 scores (p < 0.001). Worse HADS depression and anxiety scores and worse ODI scores were associated with worse EQ-5D and SF-36 scores (p < 0.001). Consistent with this, a worse EQ VAS was associated with higher disease activity. No statistically significant differences were observed for the various instruments for reports of any back pain, peripheral arthritis, dactylitis, uveitis, or psoriasis. BASFI and BASMI were more sensitive to the presence of peripheral or axial SpA or AS.

Table 5 Independent t test
Table 6 ANOVA test

Discussion

SpA is a chronic debilitating disease that significantly reduces a patient’s QoL. Patients are required to undergo prolonged treatment regimens to help control a disease that cannot truly be eradicated. As such, patients must be monitored both physically and mentally throughout the management process to evaluate treatment outcomes, identify new concerns, and calculate the most cost-effective options. In addition, determining QALYs will help us understand the impact of disease in the general healthcare system and drive institutional policies based on cost-utility analyses. In this current climate where designing the most effective treatment strategies at the lowest cost is paramount, we as healthcare providers are tasked with gathering this information.

As such, this first psychometric validation study of the EQ-5D instrument in patients with SpA is a necessary step toward providing a platform to assemble cost-utility information on the disease impact of SpA and the effectiveness of our treatment. Testing the validity, reliability, and sensitivity of the study instrument is necessary to convince users of its applicability in measuring the HRQoL of patients with SpA and for cross-specialty, cross-disease comparisons of QALYs. Our results suggest that the EQ-5D is effective in measuring disease outcomes and severity, as it distinguishes between different levels of disease activity. The test-retest reliability of the EQ-5D in our cohort was also good, with a strong ICC and agreement between the five domains.

The EQ-5D instrument was also shown in this study to correlate well with SF-36 for disease status. Specific pain sources appear to be best differentiated by EQ-5D and to be most influential for poor outcome scores. A strong negative correlation was observed with tender joints as well as with current or past-week back pain. Higher disease activity, as indicated by BASDAI and ASDAS-CRP scores, was well differentiated by lower EQ-5D-5L and SF-36 domain and overall scores. This pattern is also evident for other physical and mental measures such as HADS and ODI scores. The instrument is limited in its ability to detect specific disease patterns such as the presence of peripheral arthritis, dactylitis, uveitis, or psoriasis. However, as a generic measurement, this is not its expected function, and such information should be obtained from disease-specific instruments such as BASFI and BASMI.

The validity of the EQ-5D instrument was also confirmed against the SF-36, which served as an internal consistency check. A ceiling effect was observed in this study for the EQ-5D measurement, but this is not unexpected. This finding has been demonstrated in other studies of chronic conditions as well [26, 41,42,43,44]. Nevertheless, overall results suggest that this tool is useful in conjunction with other disease-specific tools to monitor SpA patients.

The main limitation of this study is the use of phone interviews for the retest portion of the study. This was done because routine follow-up consultations were more than 2 weeks apart, and hence it was not practical to ask patients to return in close succession for questionnaire interviews only. Nevertheless, the scores suggest that our method was acceptable and still reached significant results. Also, the use of an indirect two-step approach to determine the EQ-5D-5L index value may introduce measurement error in the score, but given the lack of a culture-specific value set, this is still the best approach to generate the 5L scores.

Conclusions

The EQ-5D-5L demonstrates satisfactory psychometric properties for the assessment of patients with SpA. As this is a patient population with long-term follow-up and treatment, utilizing this generic measurement to study HRQoL changes and evaluate the economics of various treatment options is necessary. EQ-5D-5L scores appear to be most sensitive to pain and are useful for differentiating patients with joint or spinal pain. Future studies are required to determine the responsiveness of this measurement tool to changes in disease activity, across different treatment regimens, and in other similar chronic diseases.