Introduction

Caregiver burden is a state in which the physical and psychological well-being, family relations or financial status of the caregiver may be threatened by providing the necessary care to another person [1]. Caregiving, especially for older adults with dementia, usually causes burden among caregivers [2,3,4,5]. Studies have shown that burden can ultimately lead to depression [6] and to poor treatment outcomes for patients [7].

Thailand is becoming an aging society, which will increase the number of dependent individuals and the likelihood that households will need caregivers [8], so caregiver burden deserves attention. Studies have shown that 40 to 70% of caregivers perceive a burden [9, 10].

One of the oldest and most common instruments for assessing dementia caregiver burden is the Zarit Burden Interview (ZBI) [11, 12]. It measures multidimensional aspects of burden, including physical, emotional, financial and social burden and the relationship with the care receiver. It originated as a 29-item questionnaire but has since been revised to a 22-item form (ZBI-22) and translated into many languages [13]. Shorter forms of the ZBI, ranging from 1 to 14 items, have been developed by researchers over the past decades [14]. The most widely used is the ZBI-12, introduced by Bédard et al. [15]. The ZBI-12 has shown good psychometric properties in various languages and cultures [16,17,18,19,20,21,22,23,24,25]. Regarding factor structure, the number of dimensions reported for the ZBI ranges from 2 to 5 [15, 20]. Because of this multidimensional nature, burden may not be accurately captured by a global score [26,27,28]. The ZBI has demonstrated high correlations with other psychological tools [17, 23,24,25, 29]. For the ZBI-12, factor analysis revealed a two-factor rather than a unidimensional model, despite the scale being shorter [27]. The correlation between the ZBI-12 and ZBI-22 was 0.96 in the initial study. To capture a global burden score, a unidimensional ZBI was developed using item response theory (IRT), yielding a different set of items for the short scale compared with the former 12-item ZBI [30].

To our knowledge, only one study has examined the Thai version of the ZBI-12 using factor analysis [31]. The Thai ZBI has never been tested for psychometric properties among Thai dementia caregivers, nor has it been examined using IRT or the Rasch measurement model. Therefore, the present study aimed to examine the ZBI construct by means of convergent, discriminant and concurrent validity, using both Rasch analysis and confirmatory factor analysis (CFA).

Main text

Methods

Subjects

One hundred and two caregivers of patients with Alzheimer’s disease participated in the study; the patients had been diagnosed and treated by neurologists at Maharaj Nakorn Chiang Mai Hospital. Primary caregivers aged 18 years or older who had been providing care for at least 1 month were recruited. The exclusion criterion was inability to communicate due to either a language barrier or a severe mental health problem.

Data collection

Data were collected at an outpatient clinic through structured interviews by one physician (MP) who had no role in patient care planning. All participants gave written informed consent before completing the questionnaires. The questionnaires covered sociodemographic data, records related to caregiving and specific measures: the ZBI, the Perceived Stress Scale (PSS), the Patient Health Questionnaire (PHQ-9) and the EQ-5D.

Outcome measures

ZBI

The ZBI is a caregiver-reported questionnaire measuring the burden the respondent feels in providing care to the patient. Currently, it has two widely used forms, the ZBI-22 and the ZBI-12, each scored on a Likert scale from 0 (never) to 4 (nearly always) [15, 32]. Studies have shown that both the ZBI-22 and the ZBI-12 correlate highly with the Caregiver Activity Survey and with other tools [25, 33].

The Thai (translated) version of the ZBI was used in this study with permission from Professor Zarit and Mapi Research Trust [13]. In the study sample, Cronbach’s alpha was 0.921 for the ZBI-22 and 0.865 for the ZBI-12.

PSS

The PSS is a self-reporting, 10-item questionnaire measuring the extent to which individuals perceive stress [34]. The 5-response Likert scale ranges from 0 (not at all) to 4 (the most). The Thai version of the PSS showed a Cronbach’s alpha of 0.85. It correlated positively with other measures, including the State Trait Anxiety Inventory, but negatively with the Rosenberg Self-Esteem Scale [35]. The study sample showed a Cronbach’s alpha of 0.850.

PHQ-9

The PHQ-9 is a self-reporting, 9-item questionnaire measuring the extent to which an individual feels bothered due to depressive symptoms over the past 2 weeks [36]. The 4-response Likert scale ranges from 0 (not at all) to 3 (nearly every day). The Thai version PHQ-9 showed a Cronbach’s alpha of 0.79 and a positive association between the PHQ-9 and the HAM-D [37]. The study sample showed a Cronbach’s alpha of 0.849.

EQ-5D

The EQ-5D is a self-reporting questionnaire measuring health-related quality of life [38]. It comprises 5 items assessing 5 domains of health state: mobility, self-care, usual activities, pain and anxiety/depression, with a 5-response scale ranging from 1 (no problem) to 5 (severe problem). All 5 aspects were calculated to an index score with the maximum of 1.000 [39]. An intraclass correlation coefficient of 0.987 for the EQ-5D index score, and a significant correlation with WHOQOL-BREF were noted [40]. The study sample showed that Cronbach’s alpha was 0.723.
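The Cronbach’s alpha values reported for each measure above summarize internal consistency. As a minimal sketch of how such a coefficient is obtained from item-level responses, the following uses the standard formula; the function name and the toy responses are hypothetical and are not study data, and the reported values themselves were produced with SPSS.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's alpha for a list of respondents.

    rows: one list of item scores per respondent, all of equal length.
    Uses sample variances (n - 1 in the denominator).
    """
    k = len(rows[0])
    if k < 2 or len(rows) < 2:
        raise ValueError("need at least 2 items and 2 respondents")
    # Variance of each item across respondents
    item_vars = [variance([r[j] for r in rows]) for j in range(k)]
    # Variance of the total score across respondents
    total_var = variance([sum(r) for r in rows])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


# Hypothetical toy responses (3 items scored 0-4); NOT study data.
toy = [[0, 1, 0], [1, 1, 2], [2, 3, 2], [4, 3, 4], [3, 4, 3]]
alpha = cronbach_alpha(toy)
```

Alpha approaches 1 when items rise and fall together across respondents, which is why it is read as an index of internal consistency rather than of unidimensionality.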

Statistical analysis

Sociodemographic data were analyzed using descriptive statistics. Pearson’s or Spearman’s rank correlation was used for correlational analysis. Because the ZBI-12 items are also part of the ZBI-22, the same items appear in both forms, which leads to an overestimate of the “true” correlation; a corrected correlation between the two forms was therefore calculated [41].
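To illustrate why shared items inflate the correlation between a short form and its parent scale, the sketch below compares the uncorrected short-versus-full correlation with a part-remainder correlation (short form versus the items not included in it). This is only one simple way to remove the overlap and is an assumption made for illustration; it is not necessarily the correction described in [41]. The function names and the toy data are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson product-moment correlation of two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def overlap_correlations(rows, short_idx):
    """Return (uncorrected, part_remainder) correlations.

    rows: item responses of the full scale, one list per respondent.
    short_idx: positions of the items that make up the short form.
    """
    short_set = set(short_idx)
    short = [sum(r[j] for j in short_idx) for r in rows]
    full = [sum(r) for r in rows]
    # Remainder score: only the items that are NOT in the short form
    rest = [sum(v for j, v in enumerate(r) if j not in short_set) for r in rows]
    return pearson(short, full), pearson(short, rest)


# Hypothetical 4-item scale with a 2-item short form; NOT study data.
toy = [[0, 1, 1, 0], [1, 2, 0, 2], [2, 2, 3, 1], [3, 4, 2, 4], [4, 3, 4, 2]]
uncorrected, remainder = overlap_correlations(toy, [0, 1])
```

In this toy example the part-remainder value is lower than the uncorrected one, because the short-form items no longer contribute to both sides of the correlation.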

Based on measurement theory, a scale should demonstrate that all items contribute to the same construct and that its response categories form monotonically increasing steps. These properties can be examined with the Rasch model. The following analyses were conducted.

Correlation analysis

We tested the ZBI against the EQ-5D subscales, hypothesizing that the ZBI should relate more strongly to anxiety/depression than to mobility. We expected to find a low-to-moderate correlation between the ZBI and both the PSS and the PHQ-9, which would demonstrate concurrent validity.

Rasch analysis

The Rasch model belongs to the family of item-response latent trait models; it is a probabilistic logistic model in which the response to a particular item depends on the characteristics of both the person and the item. More details can be found elsewhere [42]. The partial credit Rasch model was used [43], with the following criteria. First, unidimensionality and local independence were evaluated: (a) the first principal component of the residuals (first contrast) should have an eigenvalue less than 2, (b) the disattenuated correlation should be > 0.7 and (c) item fit statistics (INFIT and OUTFIT mean-square), which indicate the consistency of each item with the other items, should lie between 0.70 and 1.30 [44]. To evaluate local independence, standardized residual correlations should be less than 0.3 [45]. Second, response categories should function properly, with ordered categories and thresholds [46]. Third, a reliability coefficient of 0.80 or higher for person reliability and 0.90 or higher for item reliability was considered acceptable.
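The criteria above can be made concrete with a small sketch of the partial credit model. It shows how category probabilities follow from a person measure and an item's step thresholds, how INFIT and OUTFIT mean-squares are formed from squared residuals, and how a disattenuated correlation is computed. All function names and numeric values are hypothetical; the sketch assumes person measures and thresholds are already known and does not reproduce the estimation performed by Winsteps.

```python
from math import exp, sqrt


def pcm_probabilities(theta, thresholds):
    """Category probabilities for one item under the partial credit model.

    theta: person measure in logits.
    thresholds: step difficulties in logits (one per step).
    Returns probabilities for categories 0..len(thresholds).
    """
    cumulative = [0.0]  # category 0 corresponds to an empty sum
    for delta in thresholds:
        cumulative.append(cumulative[-1] + (theta - delta))
    peak = max(cumulative)  # subtract the maximum for numerical stability
    weights = [exp(c - peak) for c in cumulative]
    total = sum(weights)
    return [w / total for w in weights]


def item_fit(observed, thetas, thresholds):
    """Return (infit, outfit) mean-squares for one item."""
    sq_resid, variances = [], []
    for x, theta in zip(observed, thetas):
        probs = pcm_probabilities(theta, thresholds)
        expected = sum(k * p for k, p in enumerate(probs))
        var = sum((k - expected) ** 2 * p for k, p in enumerate(probs))
        sq_resid.append((x - expected) ** 2)
        variances.append(var)
    # Outfit: unweighted mean of squared standardized residuals
    outfit = sum(r / v for r, v in zip(sq_resid, variances)) / len(observed)
    # Infit: information-weighted mean-square
    infit = sum(sq_resid) / sum(variances)
    return infit, outfit


def thresholds_ordered(thresholds):
    """True when each step threshold is higher than the previous one."""
    return all(a < b for a, b in zip(thresholds, thresholds[1:]))


def disattenuated(r_xy, rel_x, rel_y):
    """Correlation corrected for the unreliability of both measures."""
    return r_xy / sqrt(rel_x * rel_y)


# Hypothetical person measure and thresholds for a 0-4 item; NOT study data.
probs = pcm_probabilities(0.5, [-1.0, 0.0, 0.8, 1.5])
```

Mean-squares near 1.0 indicate responses about as variable as the model expects, which is why the 0.70 to 1.30 window is centered on 1.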

CFA

To test how well the data were modeled by a unidimensional construct, CFA was performed for both the ZBI-22 and the ZBI-12. The weighted least squares mean and variance adjusted (WLSMV) estimator was used because the items were ordinal and non-normally distributed. Model fit was assessed using the chi-square test (p > 0.05), the comparative fit index (CFI) and the Tucker-Lewis index (TLI), for which values of 0.95 or higher are preferable [47]. A root mean square error of approximation (RMSEA) value < 0.08 was indicative of acceptable model fit [48].
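The fit indices named above are all functions of the model and baseline chi-square statistics. The sketch below uses the common textbook formulas to show how they relate to the cutoffs stated in this section. It is an illustration only: the function names and chi-square values are hypothetical, and software such as Mplus applies estimator-specific adjustments (for example under WLSMV), so its output may differ from these simple formulas.

```python
from math import sqrt


def fit_indices(chi2_model, df_model, chi2_base, df_base, n):
    """RMSEA, CFI and TLI from model and baseline chi-square values.

    n is the sample size; this version uses n - 1 in the RMSEA denominator.
    """
    d_model = max(chi2_model - df_model, 0.0)  # model noncentrality
    d_base = max(chi2_base - df_base, d_model)  # baseline noncentrality
    rmsea = sqrt(d_model / (df_model * (n - 1)))
    cfi = 1.0 - d_model / d_base if d_base > 0 else 1.0
    ratio_base = chi2_base / df_base
    ratio_model = chi2_model / df_model
    tli = (ratio_base - ratio_model) / (ratio_base - 1.0)
    return {"rmsea": rmsea, "cfi": cfi, "tli": tli}


def acceptable(fit):
    """Apply the cutoffs stated in the text: CFI/TLI >= 0.95, RMSEA < 0.08."""
    return fit["cfi"] >= 0.95 and fit["tli"] >= 0.95 and fit["rmsea"] < 0.08


# Hypothetical chi-square values; only n = 102 is taken from the study.
fit = fit_indices(chi2_model=80.0, df_model=54, chi2_base=1000.0, df_base=66, n=102)
ok = acceptable(fit)
```

Because RMSEA divides by the sample size, the same misfit in chi-square terms yields a larger RMSEA in a small sample such as this one.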

Computer software

For CFA, Mplus, Version 8.4 was used (Muthén and Muthén 2015). Rasch analysis used Winsteps, Version 4.4.8 (Beaverton, Oregon: Winsteps.com). All other analyses were performed using IBM SPSS, Version 22 (SPSS Inc., Chicago, IL, USA).

Results

The average age of the caregiver sample was 55 years (SD = 12.9); most were women (77.5%). According to ZBI scores, the sample had a low level of burden. The quality of life index score was quite high on average, while perceived stress and depressive symptoms were low (Table 1).

Table 1 Demographic characteristics of participants

Regarding the distribution of the ZBI items, some had unacceptable kurtosis (beyond ±3), which was associated with a high frequency of responses in the zero category on these items (Additional file 1: Table S1).

Correlation analysis showed that the ZBI-22 had a coefficient of 0.855 (p < 0.01) with the ZBI-12 for the uncorrected correlation, and 0.784 (p < 0.01) for the corrected correlation. Both the ZBI-22 and the ZBI-12 were significantly related to the PHQ-9, the PSS, the EQ-5D index score and the mobility, pain and anxiety/depression subscales, but not to self-care and usual activities, indicating convergent and discriminant validity (Table 2).

Table 2 Zero-order correlations between variables

Rasch analysis showed that the unexplained variance in the first contrast yielded eigenvalues of 2.52 and 3.03, implying a possible second dimension. However, because the disattenuated correlation between person measures exceeded 0.7, the second dimension could be noise for the ZBI-22. Five items of the ZBI-22 and two items of the ZBI-12 were misfitting. Five pairs of items from the ZBI-22 and three pairs of items from the ZBI-12 had standardized residual correlations above 0.2, indicating item dependency in both forms of the ZBI. For category functioning, 33 to 50% of items showed disordered categories or thresholds (Table 3). For this reason, the five original rating categories were combined in different ways until the criteria were best met. This was achieved by rescaling as follows: 0 = 0; 1 = 1; 2 = 2; 3 = 3 and 4 = 3 for the ZBI-22, and 0 = 0; 1 = 1; 2 = 2; 3 = 2 and 4 = 3 for the ZBI-12. After rescaling, the data fitted the Rasch model better: the number of misfitting items decreased while reliability increased. All reliability values were in the acceptable range.
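The rescaling reported above amounts to a simple recoding of each response before re-estimation. A minimal sketch, with hypothetical constant and function names, is:

```python
# Category recoding as reported in the text (original score -> rescaled score).
ZBI22_RECODE = {0: 0, 1: 1, 2: 2, 3: 3, 4: 3}  # categories 3 and 4 combined
ZBI12_RECODE = {0: 0, 1: 1, 2: 2, 3: 2, 4: 3}  # categories 2 and 3 combined


def recode(responses, mapping):
    """Apply a category recoding to a list of 0-4 item responses."""
    return [mapping[v] for v in responses]


# Hypothetical responses from one respondent; NOT study data.
original = [0, 1, 2, 3, 4]
zbi22_rescaled = recode(original, ZBI22_RECODE)
zbi12_rescaled = recode(original, ZBI12_RECODE)
```

Both mappings reduce five categories to four, but they merge different adjacent pairs, so total scores from the two rescaled forms are not directly comparable.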

Table 3 Rasch analysis results of the ZBI

The CFA showed that the unidimensional model did not fit the data for either version of the ZBI. A three-factor model provided the best fit statistics for the ZBI-22, while a four-factor model with correlated error terms for items 11 and 12 provided the best fit statistics for the ZBI-12 (Additional file 2: Table S2).

Discussion

The present study aimed to evaluate the psychometric properties of the Thai version of the ZBI among caregivers of patients with Alzheimer’s disease. Consistent with related studies, neither the ZBI-22 nor the ZBI-12 demonstrated a unidimensional scale [49, 50], even though the ZBI-22 seemed to be favored over the ZBI-12. Three-factor and four-factor models fitted the data best for the ZBI-22 and ZBI-12, respectively. However, the disattenuated correlation (> 0.70) suggested that the ZBI-22, but not the ZBI-12, could be sufficiently unidimensional.

The pairs of error variances that CFA suggested should be correlated corresponded to the locally dependent item pairs identified by Rasch analysis. This was consistent with the study by Ballesteros et al. [30], in which both items, “should do more” and “could do a better job caring”, were excluded from the new 12-item ZBI. In addition to these two items, more pairs were shown to be locally dependent. Violations of local independence in a unidimensional scale can lead to inflated estimates of reliability, giving a false impression of the accuracy and precision of estimates [51].

Disordered categories and thresholds indicated that respondents had difficulty discriminating between response categories given their level of caregiver burden. In the ZBI-22, the response categories were collapsed from five to four by combining category 3 (quite frequently) and category 4 (nearly always). Oddly, for the ZBI-12, collapsing category 2 (sometimes) and category 3 (quite frequently) yielded better results. It remains unclear why the participants responded differently to the same items in the two scales.

Our findings yield two suggestions for making the interpretation of mean scores, or of changes in total scores, meaningful. The first is to revise or remove the locally dependent and misfitting items 11 and 12 of the ZBI-12 to make it a more unidimensional scale. The second is to select the fitting items with ordered categories and thresholds from the ZBI-22 to form a new short ZBI.

Taken together, the Thai version of the ZBI-12 may not be regarded as a unidimensional, interval-level rating scale of burden among caregivers of patients with Alzheimer’s disease. The ZBI-22 showed sufficient unidimensionality. Some items should be removed if the ZBI-12 is to be used.

Limitations and future study

Clinicians should interpret the results in light of the limited sample size. Replication in a larger sample is encouraged. Test–retest reliability, sensitivity to change and equivalence testing in different populations and cultures are warranted.