Cross-cultural adaptation and psychometric testing of the Dutch and German versions of the Evaluation of Daily Activity Questionnaire in people with rheumatoid arthritis

The Evaluation of Daily Activity Questionnaire (EDAQ) is a detailed patient-reported outcome measure of activity ability. The objective of this research was to assess the linguistic and cross-cultural validity and psychometric properties of the EDAQ in rheumatoid arthritis for Dutch and German speakers. The EDAQ was translated into Dutch and German using standard methods. A total of 415 participants (Dutch n = 252; German n = 163) completed two questionnaires about four weeks apart. The first included the EDAQ, Health Assessment Questionnaire (HAQ) and 36-item Short-Form v2 (SF-36v2) and the second, the EDAQ only. We examined construct validity using Rasch analysis for the two components (Self-Care and Mobility) of the Dutch and German EDAQ. Language invariance was also tested from the English version. We examined internal consistency, concurrent and discriminant validity and test–retest reliability in the 14 EDAQ domains. The Self-Care and Mobility components satisfied Rasch model requirements for fit, unidimensionality and invariance by language. Internal consistency for all 14 domains was mostly good to excellent (Cronbach’s alpha ≥ 0.80). Concurrent validity was mostly strong: HAQ rs = 0.65–0.87; SF-36v2 rs = −0.61 to −0.87. Test–retest reliability was excellent [ICC (2,1) = 0.77–0.97]. The EDAQ has good reliability and validity in both languages. The Dutch and German versions of the EDAQ can be used as a measure of daily activity in practice and research in the Netherlands and German-speaking countries. Electronic supplementary material: The online version of this article (10.1007/s00296-020-04657-7) contains supplementary material, which is available to authorized users.


Introduction
Patient-Reported Outcome Measures (PROMs) are used in clinical practice and research to identify the functional problems of patients with rheumatic and musculoskeletal conditions (RMDs) and to evaluate the effectiveness of rehabilitation for these problems. Most commonly, daily activities in the International Classification of Functioning, Disability and Health (ICF) domains of communication, mobility, self-care and domestic life are assessed [1]. There is no single PROM of daily activities widely used by rheumatology health professionals (RHPs) in Europe. Those commonly used in rheumatoid arthritis (RA) research, for example, include between 10 and 21 activities [2][3][4][5][6][7]. In practice, RHPs often prefer non-validated daily activity checklists, including up to 55 activities [8], because they provide detailed information for individual treatment planning. However, such checklists lack the established reliability and validity needed to evaluate rehabilitation.
Consequently, a reliable, valid comprehensive daily activity PROM, including activities identified as problematic by people with RMDs, would be valuable: for treatment planning and evaluation; and in audit and research. The Evaluation of Daily Activity Questionnaire (EDAQ) was developed in Sweden, with women with RA, to meet these needs [9]. It takes patients 30 min to complete at home, allowing reflection on activity abilities. During rehabilitation, RHPs can quickly focus on identified problems, allowing more time for solutions.
The EDAQ has been extensively updated, culturally and linguistically validated and psychometrically tested in English. British men and women with RMDs identified new items and domains to include. It has been tested in eight RMDs: RA, ankylosing spondylitis (AS), osteoarthritis (OA), systemic lupus erythematosus, systemic sclerosis, chronic pain, chronic upper limb disorders and primary Sjögren's syndrome [8,10,11,12]. Construct representation analysis identified that activities in the EDAQ rated as more difficult in RA require greater overall physical demand, bilateral hand use and fine hand use [13]. Content has been linked to the ICF [8] and ICF Core Sets for AS, OA, chronic widespread pain, low back pain and musculoskeletal conditions for post-acute care [11]. Over 80% of people with RMDs considered it the right length and helpful for discussing activity abilities with an RHP [10,12]. The English EDAQ is also available online. Patients can complete it, store results to their user profile and e-mail them to their RHP [14]. Cross-cultural adaptation of the EDAQ into other languages would enable its use in other countries, as well as cross-country comparisons of patients' rehabilitation needs and of the effects of rehabilitation on activities. To do this, cross-cultural invariance must be demonstrated, i.e. the EDAQ must work in a consistent manner across language versions [15,16].
The objectives of this study were to: linguistically validate and cross-culturally adapt the EDAQ for Dutch and German speakers with RA; test cross-cultural invariance across the Dutch, German and English versions of the EDAQ to ensure equivalent scaling; test the psychometric properties of the Dutch and German EDAQ; and further establish content validity in RA by linking the EDAQ to the ICF Core Set for RA.

Participants
Health service staff identified and recruited eligible participants. Dutch-speakers were recruited from one hospital in the Netherlands. German-speakers were recruited from three hospitals in Switzerland and from Swiss, German and Austrian arthritis patient associations. Participants from patient associations volunteered after reading study information on the associations' websites and completing an eligibility screening form. As the Dutch and German language EDAQs were tested for cross-cultural adaptation and invariance with the English EDAQ, data from the earlier English study were included [10].
Participants were eligible if they: had a confirmed diagnosis of RA; were able to read, write and understand Dutch or German (as applicable); and had not altered their disease-modifying medication regimen in the last 3 months and were not about to do so (as this could affect test-retest reliability).
Ethical approval was obtained, and all participants provided written informed consent.

Linguistic and cross-cultural validation
Recommended procedures were followed [17,18]. Validation occurred in two stages. In stage one, two independent forward translations were made for each of the Dutch and German versions from the original Swedish EDAQ; these were synthesised by expert panels in each country, respectively; independent back translations were made from each language into Swedish; followed by synthesis by the Swedish language expert panel for each language to check for equivalence of meaning. In stage two: additional items developed for the English EDAQ were forward/backward translated as above into Dutch and German. Harmonization of the Dutch, German and English versions by the research teams then ensured equivalence.
Field testing of the Dutch and German EDAQs was conducted with people with RA in the Netherlands and Switzerland, respectively, using cognitive debriefing interviews [17,19]. Participants completed the draft Dutch or German EDAQ at home and, within 2 weeks, were interviewed about comprehensibility, the relevance of the activities for people with RA and whether any important daily activities were missing. The results were discussed between the Dutch, German and English research teams. Further wording changes and additional items were agreed to ensure equivalence across these three versions.

Content validity
To further evaluate the content validity of the EDAQ, Part 2 items were systematically linked to the ICF Core Set for RA [20], using content linking rules [21].

Phase 2: Psychometric testing
Participants were mailed a questionnaire booklet which collected data to describe the recruited population: age, gender, marital, educational and employment status, disease duration and RA disease-modifying medication, as well as the EDAQ and the measures described below. Two to three weeks later, participants were mailed the EDAQ to complete for a second time at home to evaluate test-retest reliability. Two reminders were sent for each mailing, as necessary.

Measurement instruments
The EDAQ includes three parts: Part 1 comprises 10 numerical rating scales (NRS) to assess symptom severity, mood and life satisfaction, each scored on a 0 (none) to 10 (severe) scale. Part 2 comprises 138 activities in 14 domains. Twelve can be combined into two Components: Self-Care (Eating and Drinking; In the Bathroom and Personal Care; Getting Dressed/Undressed; Cooking; Cleaning the House; Laundry and Clothes Care; Communication); and Mobility (Bathing and Showering; Moving Indoors; Moving and Transfers; Moving Outdoors and Shopping; Gardening and Household Maintenance). The other two domains are Caring; and Leisure, Hobbies and Social Activities. Items are scored on a 4-point scale assessing ability to perform daily activities (0 = no difficulty, 3 = unable to do). If the person would not normally perform that activity (for reasons other than health), there is a "not applicable" option. Each item is answered twice by rating performance without (Section A) and then with (Section B) ergonomic solutions (e.g., alternative methods, assistive devices, environmental modifications). In ICF terminology, section A relates to capacity and B to performance [1]. Items are summed to produce total scores for Sections A and B within each domain, with any score reductions between Sections A and B denoting the impact of ergonomic solutions on improving activity ability. If there are missing items within a domain, a total domain score cannot be calculated. Higher scores indicate greater activity limitations. The optional Part 3 includes a list of assistive devices and whether owned and used [10]. Part 3 was not tested as it is not used as an outcome measure.
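The domain scoring rule described above can be illustrated with a minimal sketch. It assumes item responses are stored as integers 0 to 3, that a missing answer is recorded as `None`, and that the "not applicable" option is recorded with a hypothetical code `"NA"` that contributes nothing to the sum. The handling of "not applicable" items is an assumption made for illustration only; it is not specified in the text.

```python
from typing import Optional, Sequence

NOT_APPLICABLE = "NA"  # hypothetical code for the "not applicable" option


def domain_total(responses: Sequence[object]) -> Optional[int]:
    """Sum item scores (0-3) for one domain and one section (A or B).

    Returns None if any item is missing (None), because a total domain
    score cannot be calculated when items are missing. Skipping
    "not applicable" items is an assumption, not a documented rule.
    """
    total = 0
    for response in responses:
        if response is None:
            return None  # missing item: no domain total
        if response == NOT_APPLICABLE:
            continue  # assumed to contribute nothing to the sum
        if response not in (0, 1, 2, 3):
            raise ValueError(f"invalid item score: {response!r}")
        total += response
    return total


# Illustrative (made-up) responses for one domain
section_a = [2, 1, "NA", 3]  # without ergonomic solutions
section_b = [1, 1, "NA", 2]  # with ergonomic solutions
reduction = domain_total(section_a) - domain_total(section_b)
print(domain_total(section_a), domain_total(section_b), reduction)
```

In this made-up example the Section A total is 6, the Section B total is 4, and the reduction of 2 points would denote the impact of ergonomic solutions.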
The comparator health measures to assess concurrent validity were: (i) The Health Assessment Questionnaire (HAQ): assessing ability to perform 20 daily activities rated on a 0-3 scale (0 = not at all difficult; 3 = unable to do) [22,23]. These were summed to give a total score, as the HAQ20 does not score items worse if an assistive device is used [24]. Higher scores indicate greater activity limitations. (ii) The Physical Function, Bodily Pain and Vitality (fatigue) scales of the Medical Outcomes Study 36-item Short-Form version 2 (SF-36v2), with norm-based scoring [6,25]. Lower scores denote worse health states. (iii) Hand pain: measured using an 11-point NRS of hand/wrist pain in the past week during moderate activities (e.g., cooking a meal, doing housework/light gardening: 0 = no pain to 10 = severe). (iv) RA Quality of Life scale (RAQoL): 30 items about QoL answered yes (= 1) or no (= 0), with yes items summed to give a total score. Higher scores indicate worse QoL [26]. (v) Perceived health status: using a 5-point NRS asking about the effects of their condition in the last month (1 = very good: no symptoms/no limitations in daily activities to 5 = very poor: very severe symptoms/inability to carry out most activities). (vi) Perceived change in health status: at Test 2 only, a 5-point NRS asking how much arthritis troubled them compared to when last completing the questionnaire (1 = much less to 5 = much more).

Sample size
As Rasch analysis was used to assess the invariance of the EDAQ Part 2 across language versions, a sample size of at least 150 for each language was necessary. This number was determined to ensure: a uniform distribution of patients across the construct of activity limitation; that the precision of the estimates of both persons and items remains similar across the construct; and enough cases to test for invariance across groups. The sample does not need to be representative, as the mathematical model is independent of distribution, but it should have a good distribution across the activity domains [27]. At least 79 sets of repeated responses were required to demonstrate that a test-retest correlation of 0.7 differed from a background correlation (constant) of 0.45, with 90% power at the 1% significance level. A test-retest correlation of 0.7 is deemed a minimum acceptable level [28].

Cross-cultural adaptation and invariance
Rasch analysis is an iterative process of fitting data to the Rasch Measurement Model [29]. If the data meet the model expectations, then ordinal raw scores can be transformed into an interval level latent estimate [30]. Those expectations are associated with several assumptions underlying the model, namely stochastic ordering of items, local independence of items, unidimensionality and group invariance [31]. For unidimensionality, a t test of two estimates is made to ascertain if more than 5% of such estimates are different, or at least at the lower confidence interval for the proportion of different tests [32]. Rasch analysis of the English EDAQ Part 2 has already determined that total domain scores and component scores can be considered as unidimensional and total raw scores used [10]. Total domain and component scores can be converted to a Rasch metric when required for parametric analyses [33].
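The decision rule for the unidimensionality test can be sketched as follows. This is a hedged illustration only: it assumes that the number of significant person-level t tests has already been obtained from the Rasch software, and it uses a normal-approximation 95% confidence interval for the proportion. Which interval the software actually applies is not stated in the text, so the interval method and the function name `unidimensionality_check` are assumptions.

```python
import math
from typing import Tuple


def unidimensionality_check(
    n_significant: int,
    n_tests: int,
    threshold: float = 0.05,
    z: float = 1.96,
) -> Tuple[float, float, bool]:
    """Return (proportion, lower CI bound, supported).

    Unidimensionality is treated as supported when no more than 5% of
    t tests are significant, or when the lower bound of the confidence
    interval for that proportion does not exceed 5%.
    """
    if n_tests <= 0 or not 0 <= n_significant <= n_tests:
        raise ValueError("counts must satisfy 0 <= n_significant <= n_tests")
    proportion = n_significant / n_tests
    # Normal-approximation interval (an assumption, see lead-in)
    half_width = z * math.sqrt(proportion * (1 - proportion) / n_tests)
    lower = max(0.0, proportion - half_width)
    supported = proportion <= threshold or lower <= threshold
    return proportion, lower, supported


# Illustrative counts only, not study results
print(unidimensionality_check(20, 250))
print(unidimensionality_check(40, 250))
```

With these made-up counts, 20 significant tests out of 250 (8%) would still be acceptable because the lower bound falls below 5%, whereas 40 out of 250 (16%) would not.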
In cross-cultural adaptation, group invariance is crucial, as this determines if the adaptation has provided equivalent scaling, in this case across languages. Invariance is tested through Differential Item Functioning (DIF) [27]. As both the Dutch and German versions of the scale were made from the English version, invariance was tested from the English version for each, and across all three languages combined. The 12 domains of the two components of Self-Care and Mobility were used as testlets (as already determined in the earlier English Rasch analysis [10]), each being the summed score of the items within that domain [34]. If local dependency remained across the domains, these were further aggregated, as required. The analysis fits data to the Rasch model for each component (i.e. Self-Care and Mobility) at their respective domain levels. The RUMM2030 software was used [35].

Psychometric testing
The Statistical Package for the Social Sciences (SPSS) v25 was used for analyses [36], apart from linear weighted kappas, which were calculated using MedCalc [37]. As all measures consist of ordinal data, non-parametric statistical tests were used to assess the psychometrics, apart from intra-class correlation coefficients [ICC (2,1)] and sensitivity to change statistics, which were calculated using Rasch transformed data, as interval data are required for these calculations [33]. Ordinal data are summarized as medians and inter-quartile ranges. Normality of Rasch transformed data was tested using the Kolmogorov-Smirnov test and data summarized using means and standard deviations.
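For readers unfamiliar with ICC (2,1), the following is a minimal sketch of the standard two-way random effects, absolute agreement, single-measurement coefficient computed from ANOVA mean squares. It is not the SPSS procedure used in the study; the function name `icc_2_1` and the example data are hypothetical, and the input is assumed to be complete interval-level (e.g. Rasch transformed) scores with one row per participant and one column per test occasion.

```python
from typing import Sequence


def icc_2_1(data: Sequence[Sequence[float]]) -> float:
    """ICC(2,1): two-way random effects, absolute agreement, single measure.

    data[i][j] is the score of participant i at occasion j (no missing values).
    """
    n = len(data)        # participants
    k = len(data[0])     # occasions (2 for test and retest)
    if n < 2 or k < 2 or any(len(row) != k for row in data):
        raise ValueError("need a complete n x k table with n >= 2 and k >= 2")

    grand = sum(sum(row) for row in data) / (n * k)
    row_means = [sum(row) / k for row in data]
    col_means = [sum(row[j] for row in data) / n for j in range(k)]

    # Sums of squares for participants, occasions and residual error
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in data for x in row)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))

    return (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )


# Made-up scores: every participant scores exactly one point higher at retest
shifted = [[1.0, 2.0], [2.0, 3.0], [3.0, 4.0]]
print(round(icc_2_1(shifted), 3))
```

Because ICC (2,1) measures absolute agreement, the systematic one-point shift in this made-up example lowers the coefficient to about 0.667 even though the rank order of participants is perfectly preserved.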
Internal consistency was assessed using Cronbach's alpha, with results of ≥ 0.80 being good to excellent [39].
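As an illustration of the statistic, Cronbach's alpha for one domain can be sketched as below. This is a minimal example assuming complete item responses (one row per participant, one column per item) and sample variances; the study itself used SPSS, and the function name and data here are hypothetical.

```python
from statistics import variance
from typing import Sequence


def cronbach_alpha(rows: Sequence[Sequence[float]]) -> float:
    """Cronbach's alpha = k/(k-1) * (1 - sum(item variances) / variance(total))."""
    n_items = len(rows[0])
    if len(rows) < 2 or n_items < 2:
        raise ValueError("need at least 2 participants and 2 items")
    if any(len(row) != n_items for row in rows):
        raise ValueError("every participant needs a score for every item")

    # Sample variance of each item across participants
    item_variances = [variance([row[j] for row in rows]) for j in range(n_items)]
    # Sample variance of the summed domain score
    total_variance = variance([sum(row) for row in rows])
    if total_variance == 0:
        raise ValueError("total scores have zero variance")

    return n_items / (n_items - 1) * (1 - sum(item_variances) / total_variance)


# Made-up responses on a 0-3 scale for a three-item domain
responses = [
    [0, 0, 1],
    [1, 1, 1],
    [2, 1, 2],
    [3, 3, 2],
    [2, 2, 3],
]
alpha = cronbach_alpha(responses)
print(f"alpha = {alpha:.2f}, good to excellent: {alpha >= 0.80}")
```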
Concurrent validity of the Part 1 NRS and Part 2 domain total scores was assessed using Spearman's correlations with measures of related constructs. For Part 1, this was the SF-36v2 sub-scales, except for Satisfaction with Life, which was correlated with the RAQoL. For Part 2, this was the HAQ20, SF-36v2 sub-scales, RAQoL, Pain, Fatigue and Hand Pain NRS and Perceived Health Status.
Discriminant validity was assessed using Kruskal-Wallis tests to evaluate differences in scores between participants with different perceived health status groups.
Sensitivity to change was assessed by calculating the Standard Error of Measurement (SEM) and the Minimal Detectable Change at the 95% confidence level (MDC95). The formula used was: SEM = s × √(1 − r), where s = the mean of the standard deviations (SD) of Test 1 and Test 2 (retest), and r = the reliability coefficient for the test, i.e. Pearson's correlation coefficient between Test 1 and Test 2 values. Thereafter, the MDC95 was calculated using the formula: MDC95 = SEM × √2 × 1.96 [41,42].
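The two formulas above can be worked through in a short sketch. The input values are illustrative and are not taken from the study; how s is derived from the Test 1 and Test 2 standard deviations is left to the caller.

```python
import math


def standard_error_of_measurement(sd: float, reliability: float) -> float:
    """SEM = s * sqrt(1 - r)."""
    if sd < 0 or not 0 <= reliability <= 1:
        raise ValueError("sd must be >= 0 and reliability within [0, 1]")
    return sd * math.sqrt(1 - reliability)


def minimal_detectable_change_95(sem: float) -> float:
    """MDC95 = SEM * sqrt(2) * 1.96."""
    return sem * math.sqrt(2) * 1.96


# Illustrative values only: s = 5.0 score units, r = 0.91
sem = standard_error_of_measurement(5.0, 0.91)
mdc95 = minimal_detectable_change_95(sem)
print(f"SEM = {sem:.2f}, MDC95 = {mdc95:.2f}")
```

With these made-up inputs the SEM is 1.50 and the MDC95 is about 4.16, meaning that an individual change smaller than roughly 4.2 units could not be distinguished from measurement error.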
Floor and ceiling effects were considered present if > 15% of participants achieved either the lowest or highest scores in the 14 EDAQ Part 2 domains [43].
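The 15% criterion can be expressed as a short check. This is a sketch with made-up scores; the score range of each domain must be supplied by the caller, and the function name is hypothetical.

```python
from typing import Dict, Sequence


def floor_ceiling_effects(
    scores: Sequence[float],
    min_score: float,
    max_score: float,
    threshold: float = 0.15,
) -> Dict[str, object]:
    """Flag floor/ceiling effects when > 15% of participants score at an extreme."""
    if not scores:
        raise ValueError("scores must not be empty")
    n = len(scores)
    floor_proportion = sum(1 for s in scores if s == min_score) / n
    ceiling_proportion = sum(1 for s in scores if s == max_score) / n
    return {
        "floor_proportion": floor_proportion,
        "ceiling_proportion": ceiling_proportion,
        "floor_effect": floor_proportion > threshold,
        "ceiling_effect": ceiling_proportion > threshold,
    }


# Made-up domain totals for ten participants on a 0-30 scale
result = floor_ceiling_effects([0, 0, 0, 5, 10, 12, 20, 30, 7, 9], 0, 30)
print(result)
```

In this example 30% of participants have the lowest possible score (no difficulty), so a floor effect is flagged, while the 10% at the highest score does not exceed the threshold.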

Phase 1: validation
The only difficulties encountered during translation were identifying names for some assistive devices in Part 3. This was overcome with photographs and local therapists providing correct names. Cognitive debriefing interviews were undertaken with six Dutch- and five German-speaking participants. Average time to complete all three parts of the EDAQ was 30 (SD 8) minutes. Activities in the EDAQ Part 2 were considered culturally relevant by both Dutch and German participants. Some additional activities were added to existing items: use of smartphones and laptops/tablets (e.g. iPad) to the Communication domain; and bicycling to the Leisure, Hobbies and Social Activities domain. Ten considered it easy/partially easy to complete, with five highlighting the importance of carefully reading instructions. Eight commented that Part 2 (activities) was most relevant and three that Part 3 (assistive devices) was least relevant, because those participants had few or no assistive devices. Seven considered the EDAQ included the right range and number of activities. However, four thought there were too many. All 11 participants considered the EDAQ would capture the difficulties they face daily and enable discussions with rehabilitation health professionals.

Linking to the ICF Core Set for RA
The Part 2 EDAQ had good content validity, with 28/33 activities from the Communication, Mobility, Self-Care and Domestic Life items of the RA Core Set included. The five Core Set activities not included were not specific daily activities: carrying out daily routine (d230); interpersonal family (d760) and intimate (d770) relationships; and major life areas: remunerative (d850) and other work/employment (d859) (see Supplementary Table S1).

Participants
The sample consisted of 252 Dutch-speaking people from the Netherlands and 163 German-speaking people (87 from Switzerland, 70 from Germany and 6 from Austria). Their language group-specific demographic characteristics and disease duration are shown in Table 1 and health status in Table 2. Demographic and health data for the English-speaking participants (n = 383), included in the Rasch analysis, are also shown. The Dutch sample, compared to the German- and English-speaking samples, was older, with more men, shorter disease duration, less educational experience, fewer on biologics and better health status.

Rasch analysis
Fit of the data to the Rasch model for each component and language is shown in Tables S3 and S4.

Test-retest reliability
Part 1: the 10 NRS had moderate to good reliability (Table 4). Part 2: correlations between Test 1 and Test 2 domain scores were good or very good (rs = 0.75 to 0.93), apart from the Caring domain, which had only moderate correlations (Table 5). The domains' intra-class correlations were excellent [ICC (2,1) = 0.90 to 0.97], apart from Gardening and Household Maintenance (Dutch), which was lower at 0.77 (Table 5). Linear weighted kappa scores for individual items in each domain were mainly moderate to good (0.20-0.75 Dutch; 0.41-0.82 German version) (Supplementary Table 2).

Internal consistency
Cronbach's alpha values for the 14 domains were all excellent (≥ 0.85) apart from Communication (0.79) in the Dutch version (Table 5). All domains in both languages, therefore, had values consistent with group use (i.e. ≥ 0.7), and most with individual use (i.e. ≥ 0.85) [28]. Each domain can be used as a stand-alone measure, as well as collectively within the two components of Self-Care and Mobility.

Concurrent validity
In EDAQ Part 1, there were moderate to strong correlations between NRS and SF-36v2 Mental Health, Vitality and Bodily Pain scales and RAQoL, as relevant (rs = −0.42 to −0.73; Table 4). Perceived Health State was moderately to strongly correlated (rs = 0.47-0.65). Apart from being moderately correlated with physical function measures, the Caring domain was weakly correlated with other measures (Table 6).

Discriminant validity
There were significant differences in most Part 2 domain scores (p < 0.01), except for the Dutch Caring domain (p = 0.27), as many participants reported not performing Caring activities (Supplementary Table 7).

Sensitivity to change
Part 2 MDC95 domain scores ranged from 1.25 to 3.68, apart from Gardening and Household Maintenance, which was higher (7.77) for the Dutch EDAQ (Table 5).

Floor and ceiling effects
A strength of this study is that it included a large sample of people with RA, recruited from both out-patient clinics and patient associations. Recruitment was from three German-speaking countries. Two German versions have been developed: one for Switzerland, and one for Germany and Austria, as there are minor differences in written German. Although the English version has been tested in eight RMDs, the Dutch and German EDAQs have only been tested in RA. Further research is needed to establish whether the Dutch and German EDAQs are reliable and valid in other RMDs. This would allow the EDAQ to be used across these language versions/countries in a wide variety of RMDs commonly treated in rheumatology departments and in other settings.
The limitations of this study are that, in Phase 2, the acceptability and utility of the EDAQ were not investigated, as in the English studies, although Phase 1 participants endorsed these. Additionally, floor effects were observed across most domains (particularly in the Dutch sample). This is likely because the Dutch sample included a higher proportion of men, compared to the German and English samples. Men with RA tend to have fewer daily activity difficulties than women, predominantly because of their stronger grip force [47]. Moreover, more participants in this Dutch sample reported being in good health.
In conclusion, the Dutch and German EDAQs are valid, reliable measures of activity limitations which can be used with people with RA. Either the whole EDAQ, the Self-Care/Domestic Life or Mobility components or the individual domains can be used in clinical practice to identify clients' daily activity difficulties, facilitate discussion to find solutions, and evaluate the outcome. Equivalence between