Background

Fibromyalgia (FM) is a common disorder characterized by widespread musculoskeletal pain, stiffness, paraesthesia, non-restorative sleep, and fatigue, along with multiple tender points that are widely and symmetrically distributed [1]. FM is an important cause of disability and work absenteeism and has major personal, social, and economic consequences [2, 3]. It affects at least 2% of the general population in the United States [4], and a similar prevalence exists worldwide [5]. It is also common in Bangladesh, with a point prevalence of 4.4% [6]. Approximately 6–10% of all individuals in a physician’s waiting room have FM [7], so most physicians can expect to encounter at least one patient with FM daily.

There are several instruments available to measure the health status and physical function of patients with rheumatic diseases in general, including the Health Assessment Questionnaire (HAQ) [8] and the Arthritis Impact Measurement Scales (AIMS) [9]. Because these questionnaires do not fully reflect the multidimensionality of FM symptoms, the Fibromyalgia Impact Questionnaire (FIQ) was developed to specifically capture the total spectrum of problems related to FM [10, 11]. Since its introduction in 1991 [10], the FIQ has become a widely used and popular instrument in FM research. The scale has been translated and validated in several languages, including Hebrew, Swedish, Korean, German, French, Spanish, Italian, Turkish, and Dutch [12–21]. Overall, it has shown credible construct validity, adequate reliability, and excellent responsiveness to change and has been recommended as a primary endpoint in FM clinical trials [11, 22].

To date, however, the FIQ has not been translated into Bengali. With nearly 300 million speakers, Bengali is the sixth most spoken language in the world, the second most spoken in India, and the national language of Bangladesh [23]. Therefore, the aim of the present study was to cross-culturally adapt the FIQ into Bengali and to evaluate its reliability and construct validity in Bangladeshi patients with fibromyalgia.

Methods

The original US modified version of the FIQ consists of 10 items [11]. The first item contains 11 sub-items that measure the ability to perform activities of daily living in the past week on a 4-point Likert-type scale from 0 (always) to 3 (never); it is referred to as the physical function subscale. Items 2 and 3 ask for the number of days (0–7) the patient felt good and the number of days (0–7) the patient was unable to work during the past week. Items 4–10 are horizontal linear scales marked in 10 increments on which the patient rates work difficulty, pain, fatigue, morning tiredness, stiffness, anxiety, and depression in the past week. Scores on each of the 10 items are standardized to range from 0 to 10, with higher scores indicating greater impairment. The total score can therefore range from 0 to 100.
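
The scoring rules described above can be illustrated with a short sketch. It assumes that each item is rescaled linearly to 0–10 (the mean of the answered physical function sub-items multiplied by 10/3, and day counts multiplied by 10/7) and that the number of days felt good is reverse-scored so that higher values indicate greater impairment. The function name score_fiq and its arguments are hypothetical and serve only as an illustration; they are not part of the published instrument.

```python
def score_fiq(physical_function, days_felt_good, days_missed_work, scales):
    """Return (item_scores, total) for one respondent.

    physical_function: sub-item ratings, 0 (always) to 3 (never);
                       None marks a sub-item that was not answered.
    days_felt_good:    0-7 days in the past week (reverse-scored).
    days_missed_work:  0-7 days in the past week.
    scales:            seven 0-10 ratings for items 4-10.
    """
    answered = [r for r in physical_function if r is not None]
    if not answered:
        raise ValueError("no physical function sub-items answered")
    if len(scales) != 7:
        raise ValueError("expected seven 0-10 ratings for items 4-10")

    # Assumption: linear rescaling of every item to 0-10, higher = worse.
    item1 = (sum(answered) / len(answered)) * 10 / 3
    item2 = (7 - days_felt_good) * 10 / 7  # fewer good days = more impairment
    item3 = days_missed_work * 10 / 7

    items = [item1, item2, item3] + [float(s) for s in scales]
    return items, sum(items)
```

Under these assumptions, a respondent with the worst possible answer on every item receives a total of 100 and a respondent with the best possible answers receives 0.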

Cross-cultural translation of the FIQ

Cross-cultural adaptation was carried out in five stages according to the recommendations by Beaton et al. [24].

Stages I–III. Forward and backward translation

The forward translation was performed by two bilingual translators with different profiles whose mother tongue was Bengali. One translator had a medical background and was aware of the concepts being examined in the questionnaire. The other, a university business administration student, was neither aware nor informed of the concepts being quantified. Both translators and three of the authors then met to produce a synthesis version from the original questionnaire and the two translations. The synthesis version was back-translated into English by two independent translators who were proficient in English and had no medical background. One was a university lecturer in English and the other was a professional translator. Neither was aware or informed of the concepts explored.

Stage IV. Expert committee

An expert committee was formed that included health professionals, methodologists, and the translators involved in the process. The committee reviewed all translations and the original questionnaire to reach consensus on any discrepancy and to produce a preliminary version for field testing. Several critical decisions were made to achieve semantic, idiomatic, experiential, and conceptual equivalence between the original and back-translated versions. For some physical function sub-items that refer to activities not commonly practiced by Bengali women, such as using a washer and dryer and vacuuming a rug, the committee added culturally appropriate equivalents (Table 1). This resulted in a preliminary Bengali version of the FIQ that included 11 original and 11 added or modified sub-items for the physical function subscale.

Table 1 Original and added equivalent items of the physical function subscale used in pretesting the preliminary version in FM patients (n = 30)

Stage V. Pretest

The preliminary version of the Bengali FIQ (B-FIQ) was field tested in 30 female patients fulfilling the ACR criteria for FM [25]. Since only two patients were able to independently complete the questionnaire, the questionnaire was subsequently administered in a face-to-face interview. Consecutive FM out-patients who consented to participate were enrolled from the rheumatology wing of the Bangabandhu Sheikh Mujib Medical University (BSMMU). To test for face and content validity, patients were probed about what they thought was meant by each item and the chosen response. Additionally, patients were asked if they were accustomed to performing the activities.

Psychometric evaluation of the B-FIQ

Participants

A new group of 102 adult female FM patients fulfilling the 1990 ACR criteria for FM [25] was enrolled consecutively from the BSMMU and a satellite clinic of the Community Oriented Program for Control of Rheumatic Disorders (COPCORD) at Sonargaon, Narayangonj, Bangladesh. Patients with a history of autoimmune disease or inflammatory arthritis were excluded. A random sample of 40 patients was asked to visit the clinic again 7 days after the first visit. During that week no intervention was given. Additionally, two control groups matched for age and sex were enrolled: 50 patients with rheumatoid arthritis (RA) and 50 healthy persons. The healthy controls were a convenience sample of people without musculoskeletal complaints. The study protocol was approved by the ethics committee of the BSMMU, Dhaka, Bangladesh. All participants provided verbal informed consent and completed the pre-final B-FIQ and socio-demographic items in a face-to-face interview. The time needed to administer the B-FIQ was recorded for both the FM patients and the healthy controls.

Additional measures

FM patients were additionally administered a validated Bengali version of the HAQ [26] and relevant mental health subscales of the SF-36 (general health, vitality, role-emotional, and mental health) [27]. To assess current disease symptoms, pain, morning stiffness, fatigue, sleep disturbance, headache problem, and depression were recorded on 100 mm visual analog scales (VASs) with 100 denoting the worst possible condition. Also, a tender point count (TPC) was performed by a rheumatologist. Tenderness was assessed by applying about 4 kg finger pressure at all tender points until the fingernail bed blanched [25].

Statistical analysis

To assess the content validity of the physical function sub-items, a cutoff criterion of ≥25% impairment responses (‘occasionally’ or ‘never’) was set to indicate a valid item [10, 13]. Additionally, missing values were examined. Internal consistency of the physical function subscale and the total B-FIQ was assessed by Cronbach’s α, where a value ≥0.70 was considered adequate for group comparisons and a value between 0.90 and 0.95 adequate for individual comparisons [28]. Construct validity of the B-FIQ was assessed by examining Spearman correlations between the B-FIQ scores and the clinical symptom severity VASs, the scores on the HAQ and relevant mental scales of the SF-36, and the TPC [12–14, 21]. Known-groups validity was assessed by comparing B-FIQ scores of the FM patients with those of the healthy controls and RA patients [14] using Mann–Whitney U tests. Finally, test-retest reliability was assessed by computing Spearman correlations between the two sets of B-FIQ scores of the 40 FM patients who completed the B-FIQ again after 7 days. As with internal consistency, test-retest reliability coefficients ≥0.70 and between 0.90 and 0.95 were considered adequate for group-level and individual measurements over time, respectively [28]. All statistical analyses were performed using SPSS for Windows version 11.5.
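
As an illustration of the internal consistency criterion, the following minimal sketch computes Cronbach’s α from a respondents-by-items score matrix using the standard formula α = k/(k − 1) × (1 − Σ item variances / variance of the total score). It is not the SPSS procedure used in the study; the function names are hypothetical, complete data are assumed, and the treatment of values above 0.95 in interpret_alpha is an assumption, since the text only names the 0.90–0.95 band.

```python
from statistics import variance


def cronbach_alpha(rows):
    """rows: one list of k item scores per respondent, no missing values."""
    k = len(rows[0])
    if k < 2 or len(rows) < 2:
        raise ValueError("need at least two items and two respondents")
    # Sample variance of each item across respondents.
    item_vars = [variance([row[i] for row in rows]) for i in range(k)]
    # Sample variance of the summed score across respondents.
    total_var = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


def interpret_alpha(alpha):
    """Apply the thresholds stated in the text (>=0.70 group, 0.90-0.95 individual)."""
    if 0.90 <= alpha <= 0.95:
        return "individual"
    if alpha >= 0.70:
        # Assumption: values above 0.95 are still reported as group level.
        return "group"
    return "inadequate"
```

With these thresholds, a coefficient of 0.67 would be classified as inadequate and coefficients of 0.73 or 0.83 as adequate for group-level comparisons.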

Results

Pretesting of the FIQ

The mean age of the 30 patients included in the pretest was 32.1 ± 11.3 years; 53.3% were housewives and 33.3% were students. Of these patients, 23.3% had no education, whereas 20% and 43.3% had received primary or secondary education, respectively. Of the 22 physical function sub-items (Table 1), four original items (items 2, 5, 7, and 9) were not understood by most of the patients. The remaining original sub-items and all added equivalent sub-items were understood by the patients, although many patients were not accustomed to doing 6 of the activities mentioned (items 2.2, 6.1, 9.2, 9.3, 10, and 11.1). The remaining 12 sub-items were kept for the pre-final questionnaire, with clarification added to some items based on the comments made by the patients (items 1, 2.1, 6, and 8). As patients mentioned practicing different religions, item 5.2 (“Pray in usual way”) was changed to “Do worship / religious prayer in usual way”.

Psychometric evaluation of the B-FIQ

Descriptive characteristics of the included participants are listed in Table 2. The three samples were comparable regarding age, occupation, religion, and marital status. However, there were significant differences in educational level. On average, 9.6 ± 1.6 minutes were needed to administer the B-FIQ to the FM patients, compared with 7.4 ± 1.3 minutes for the healthy controls (P < 0.001 for the difference).

Table 2 Descriptive characteristics of the FM patients, healthy controls, and RA patients

Using the cutoff of ≥25% impairment, all 12 physical function sub-items met the criterion. Also, missing data within the physical function subscale of the B-FIQ were limited to 2.9% of the patients who did not “Dress vegetable with the help of a boti” (item 10) and 53.9% of the patients who did not “Clean (including sweeping) the yard” (item 9), suggesting adequate content validity of the majority of physical activities.
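
The cutoff check reported above can be sketched as follows. The sketch assumes that sub-item responses are coded 0 (always) to 3 (never), that the impairment responses ‘occasionally’ and ‘never’ correspond to codes 2 and 3, and that missing responses (None) are excluded from the denominator of the impairment rate. The function names are hypothetical.

```python
def missing_rate(responses):
    """Share of respondents with no answer (None) on a sub-item."""
    return sum(1 for r in responses if r is None) / len(responses)


def impairment_rate(responses):
    """Share of valid answers coded 2 ('occasionally') or 3 ('never')."""
    valid = [r for r in responses if r is not None]
    if not valid:
        raise ValueError("no valid responses for this sub-item")
    return sum(1 for r in valid if r >= 2) / len(valid)


def meets_cutoff(responses, cutoff=0.25):
    """True if at least `cutoff` of valid answers indicate impairment."""
    return impairment_rate(responses) >= cutoff
```

A sub-item for which exactly one in four valid responses indicates impairment would just meet the criterion under this reading of the ≥25% cutoff.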

The internal consistency of the initial 12 sub-items constituting the physical function subscale was inadequate, with Cronbach’s α = 0.67. Removing the item “Do worship / religious prayer in usual way”, which was tested as a second possible equivalent to “vacuum a rug”, raised Cronbach’s α to an acceptable level of 0.73. Consequently, this item was removed from the scale for further analyses and from the final B-FIQ (Additional file 1), which also made the B-FIQ more comparable to the original US FIQ. Cronbach’s α for all 10 individual items of the total B-FIQ was 0.83, also indicating acceptable internal consistency for group-level analyses.

As expected, B-FIQ scores were significantly correlated in the expected direction with most current FM symptoms recorded on the VASs (Table 3). Correlations were generally strong for similar aspects of health and moderate for different aspects of health. Exceptions were physical function, which correlated only with tiredness on the VAS and the TPC, and workdays missed, which correlated only with the TPC. Correlations with overall disease severity were moderate and correlations with the TPC weak to moderate. Additionally, the B-FIQ items were significantly correlated with all selected subscales from the HAQ and the SF-36, except for a lack of correlation between workdays missed and general health (Table 4). Correlations were generally strongest for similar aspects of health, such as morning tiredness vs. vitality and anxiety and depression vs. mental health. However, physical function was only weakly correlated with the HAQ disability index.

Table 3 Spearman correlations with other measures in FM patients (n = 102)
Table 4 Spearman correlations with selected measures in FM patients (n = 102)

FM patients scored significantly worse than healthy controls on all items of the B-FIQ (Table 5). The mean total B-FIQ score was more than 6 times higher in FM patients than in healthy controls. The B-FIQ was also able to discriminate between FM and RA patients, as demonstrated by significantly higher scores for FM patients on all items except number of workdays missed. Also, the mean total B-FIQ score was substantively higher in FM patients compared to RA patients.
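
For readers unfamiliar with the test behind these group comparisons, the following minimal sketch computes the Mann–Whitney U statistic by direct pairwise comparison, counting ties as one half. It does not reproduce the P values reported by SPSS, and the function name is hypothetical.

```python
def mann_whitney_u(x, y):
    """Return (U_x, U_y) for two independent samples.

    U_x counts the pairs (a, b), a from x and b from y, in which a > b,
    with each tied pair contributing 0.5.
    """
    u_x = 0.0
    for a in x:
        for b in y:
            if a > b:
                u_x += 1.0
            elif a == b:
                u_x += 0.5
    u_y = len(x) * len(y) - u_x
    return u_x, u_y
```

If every score in one group exceeds every score in the other, U for the higher group equals the product of the two sample sizes and U for the lower group is 0, which corresponds to complete separation between the groups.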

Table 5 Discriminant validity of the B-FIQ in FM patients, healthy controls, and RA patients

Mean scores of the 40 patients that completed the B-FIQ twice and test-retest correlations are presented in Table 6. Test-retest correlations were sufficiently high for group level comparisons for all items, except for anxiety and morning tiredness.
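
The test-retest coefficients are Spearman correlations, that is, Pearson correlations computed on ranks. The sketch below shows the calculation for paired scores from two administrations; tied values receive the mean of their rank positions. The function names are hypothetical, and the code is an illustration rather than the SPSS routine used in the study.

```python
from math import sqrt


def average_ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    ordered = sorted(values)
    rank_of = {}
    for v in set(values):
        first = ordered.index(v)   # 0-based position of first occurrence
        count = ordered.count(v)   # number of tied observations
        rank_of[v] = first + (count + 1) / 2
    return [rank_of[v] for v in values]


def spearman(x, y):
    """Spearman rank correlation between paired observations x and y."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must be paired and have length >= 2")
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx)
    sy = sum((b - my) ** 2 for b in ry)
    if sx == 0 or sy == 0:
        raise ValueError("correlation undefined when one variable is constant")
    return cov / sqrt(sx * sy)
```

A coefficient computed in this way for first-visit and second-visit scores would then be compared against the ≥0.70 threshold for group-level use.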

Table 6 One-week test-retest reliability of the B-FIQ in FM patients (n = 40)

Discussion

Tools for evaluating health status have been developed mainly for use in English-speaking countries [29, 30]. In order to assess the health status of different cultural groups and to compare the results of trials in different countries, the need for non-English language measures has increased [31]. To date, no validated disease-specific measure was available to assess the full spectrum of FM problems in Bengali patients. In this study, we developed and evaluated a cross-culturally adapted version of the FIQ for use in Bengali-speaking patients with FM. Overall, the findings suggest that the B-FIQ is a sufficiently reliable and valid measure of health status in Bangladeshi women with FM. However, some limitations were apparent that need further study, including the low correlation between the physical function scale and similar or conceptually related scales and the low test-retest reliability of the anxiety and morning tiredness items.

Cross-cultural adaptation of a questionnaire is more challenging than merely translating its items into another language. To be used across cultures, the items must not only be translated well linguistically but also be adapted to the specific culture to maintain content validity at a conceptual level [24]. This may involve changing or replacing items that are not experienced in the target culture [24, 31]. To adapt the FIQ for use in Bengali patients, five sub-items of the physical function scale (i.e., do laundry with a washer and dryer, vacuum a rug, walk several blocks, do yard work, and drive a car) were replaced in the final B-FIQ with culturally appropriate equivalent activities, since these activities are not commonly performed or understood by Bengali women. Similar modifications have been made to these items in previous cultural adaptations of the FIQ [12, 14, 21]. Additionally, several other physical function items were slightly modified or clarified. The finding that the sub-items of the physical function subscale in particular needed adaptation corresponds with findings from several translations of the widely used SF-36 health status measure, which also showed that the most difficult items to translate were physical functioning items that refer to activities not common outside the US [31]. In 2009, Bennett et al. [32] developed a revised version (the FIQR) in response to several deficiencies, including the fact that the functional questions were originally intended for women living in reasonably affluent countries. It is likely that the new physical function items of the FIQR are more appropriate to Bengali women as well and could provide better cross-cultural compatibility. Therefore, it would be worthwhile for future studies to validate the FIQR in Bangladeshi patients with FM.

During the pre-testing phase, it became clear that most Bengali FM patients were not able to complete the questionnaire by themselves. Although the FIQ was developed as a self-administered questionnaire, we decided to administer the questionnaire in a face-to-face interview context. This inability to self-complete the questionnaire was most likely due to patients’ lack of previous experience with research and participation in such studies. Most patients were not familiar with completing questionnaires and scoring VASs. Also, the low literacy rate among Bangladeshi women may have contributed.

Additionally, there was a notable difference in the time it took to administer the B-FIQ to the FM patients and the healthy controls in the psychometric evaluation study. The longer administration time in the FM patients may be the result of more cognitive dysfunction in this group, which is increasingly recognized as a key symptom of FM [33, 34].

The internal consistencies of the physical function scale and the total B-FIQ were adequate for group-level comparisons and suggest that scores on the sub-items of the physical function scale and the scores on all 10 items of the B-FIQ can be summed to create single total scores. With a Cronbach’s α of 0.83, the internal consistency of the total scale was comparable with previous translations of the FIQ, where α values have ranged between 0.72 and 0.93 [13, 15, 17–19, 21]. Internal consistency of the physical functioning scale (α = 0.73) was somewhat lower than the values of 0.86 and 0.91 found in previous studies [17, 20]. Since one item was removed from the pre-final B-FIQ due to its low item-total correlation, the physical function subscale of the final B-FIQ consists of 11 sub-items that are similar in content to those of the original US version.

The significant correlations between most B-FIQ items and other outcome measures suggest that the B-FIQ has adequate construct validity. The most notable exceptions were physical function, which only correlated with tiredness and the TPC, and workdays missed, which only correlated with the TPC. Additional analyses using selected scales from the HAQ and SF-36 as convergent measures showed that both of these items were significantly, but only weakly, related to the HAQ disability index and the SF-36 vitality scale.

The finding that workdays missed does not correlate well with many other outcomes was also apparent in previous translation studies [14, 17, 20, 21]. This difference with the original US questionnaire may be the result of cultural differences in employment and working conditions between countries. In their evaluation study of the Dutch FIQ, for instance, Zijlstra et al. [20] argued that the low correlations were probably due to the small number of women who had a job. In the Bengali socio-cultural situation, in particular, women often cannot skip work, especially housework, even if they feel very sick. The non-significant or weak correlations between the physical functioning scale and most other measures do not correspond with most previous studies. Although in the Korean version of the FIQ this item also correlated poorly with current FM symptoms [21], most studies did find moderate to high correlations with concurrent measures such as the HAQ and the SF-36 physical functioning scale [13–15, 17, 20, 21].

Except for stiffness, all B-FIQ items were significantly but only weakly or moderately correlated with the TPC. This is in accordance with previous studies that also showed weak [10, 16, 17, 21] or moderate [13, 15, 18, 19] correlations between FIQ items and TPCs. It is also in line with findings by Jacobs et al. [35], who found a weak correlation between TPCs and self-reported pain. They concluded that TPCs and self-reported pain represent different aspects of pain in FM. Callahan and Pincus even suggested that this is a specific feature of FM [36].

The B-FIQ was highly capable of discriminating between FM patients and healthy controls and between FM patients and RA patients. Score differences between FM patients and RA patients were significant for all items except workdays missed. This is consistent with the findings by Hedin et al. in Sweden [14]. FM patients also scored worse on the number of days felt good, suggesting that the FIQ may be a more appropriate instrument for evaluating FM patients than the HAQ or AIMS.

Finally, test-retest reliability was adequate for all B-FIQ items except morning tiredness and anxiety. Low test-retest coefficients have been previously reported for morning tiredness [17, 20, 21], but not for anxiety. Other studies, however, did report low reliability for varying items of the FIQ [12, 14, 16]. Perrot et al. [16] have suggested that this may be due to the variability of the multiple aspects of the FM syndrome.

Conclusions

The results of this study suggest that the B-FIQ is a reliable and valid measure of health status for female FM patients in Bangladesh. Future studies should further examine the issues related to the construct validity of the physical function subscale and the reliability of the anxiety item. Additionally, the responsiveness of the B-FIQ should be examined in longitudinal clinical trials.