The EORTC QLQ-CR29 quality of life questionnaire for colorectal cancer: validation of the Dutch version

Purpose To validate the Dutch version of the EORTC QLQ-CR29 quality of life questionnaire for colorectal cancer. Methods We translated and pilot-tested the original questionnaire in the Netherlands, following EORTC guidelines. We assessed factor structure, reliability and construct validity in different samples of patients from four hospitals. Results Of 296 patients, 236 (80 %) returned the questionnaire, and 27 out of 48 patients returned the retest questionnaire. In addition to the original three scales, we found a reliable bowel functioning scale (α = 0.80), reducing the number of individual items by five. Two of the other scales had sufficient to good reliability (urinary frequency: α = 0.71, original α = 0.75; body image: α = 0.80, original α = 0.84); the third, blood and mucus in stool, had only moderate reliability (α = 0.56, original α = 0.69). Item functioning was sufficient to excellent for all but two items (urinary incontinence and dysuria). Construct validity was similar to that in earlier studies. Conclusion We found a very satisfactory scale for bowel problems, in patients both with and without a stoma. The body image and urinary frequency scales were reliable, and construct validity was sufficient. We suggest that the questionnaire be adapted to decrease the number of individual items and improve the scales, thereby increasing the reliability of the entire questionnaire.


Introduction
Colorectal cancer is a prevalent cancer, and both the disease and its treatment strongly impact quality of life (QoL). To allow for the evaluation of new treatments, the European Organisation for Research and Treatment of Cancer (EORTC) developed the colorectal QoL module QLQ-CR38 [1] as an adjunct to the generic EORTC QLQ-C30. Later, this was revised to the shorter QLQ-CR29 [2] and validated in an international study [3]. The resulting QLQ-CR29 consisted of four scales and 19 individual items. Later validation studies were reported for the Polish [4] and Spanish [5] versions. Validation of the Danish QLQ-CR38 [6] suggested the QLQ-CR29 to be more valid than the QLQ-CR38. In the Spanish QLQ-CR29, the blood and mucus scale was not confirmed; in the Polish only the body image scale was reliable, and the urinary incontinence scale approached acceptable reliability. Construct validity was limited for the Polish version and showed ambiguous results for the Spanish. In both cases, the authors nevertheless concluded the questionnaire to be reliable and valid.
These equivocal results led us to assess the reliability and validity of the Dutch version and to assess whether additional scales might result in a reduction in the number of individual items.

Materials and methods

Translation and procedures
The QLQ-CR29 had been translated into Flemish/Dutch by the EORTC Quality of Life Group, following their Translation Procedure Manual Instructions [7]. Differences in the Dutch language exist between Belgium and the Netherlands, and pilot testing was undertaken to reword some items for a Dutch population, in 29 patients with colorectal cancer from the Leiden University Medical Center (LUMC). Suggested changes were discussed with experts of the EORTC, resulting in a final Dutch translation.
Consecutive patients were recruited from two academic and two peripheral hospitals in the western region of the Netherlands [LUMC (Departments of Surgery and Radiotherapy), Alrijne Hospital Leiden (former locations Diaconessen Hospital and Rijnland Hospital), and Erasmus Medical Center Rotterdam] between May 2011 and December 2012. In three departments (Diaconessen Hospital, LUMC-Surgery, and ErasmusMC), research nurses handed the questionnaire to the patients (n = 123, response rate 79 %) at the time of their follow-up visit, and in one hospital (Rijnland), the questionnaire was sent to patients (n = 80, response rate 83 %) who had undergone treatment for colorectal cancer between May and December 2011. In one department (LUMC-Radiotherapy), the questionnaire was sent to patients (n = 93, response rate 78 %) who participated in other studies [8,9]. Of the 296 patients receiving a questionnaire, 244 returned it, and we included 236 completed questionnaires (response 80 %). Time between surgery and filling out the questionnaire ranged from 5 months to 12 years. Unfortunately, no information is available on the non-responders, but given the nature of the task (filling out a short questionnaire), we do not expect major non-response bias.
For convergent validity, participants were additionally asked to fill out the EORTC QLQ-C30. For test-retest reliability, we approached patients who had indicated their willingness in the first questionnaire. The questionnaire was sent to every fifth participant within 2 weeks of returning the first questionnaire. Twenty-seven patients (out of 48 invited, 56 %) filled in the questionnaire twice, on average 19 days after the first (range 4-46 days). Patient characteristics are presented in Table 1.

Statistical analysis
We assessed item performance by the proportion of floor and ceiling effects and by test-retest reliability (intraclass correlation coefficients, ICCs). Since the QLQ-CR29 was shown to consist of only a few, mostly two-item, scales, we carried out a principal component analysis to detect potential additional subscales, based on eigenvalues (>1.0). Items 49-54 on bowel problems (patients without a stoma) and stoma problems (patients with a stoma), respectively, were treated as the same items for patients without and with a stoma. We used varimax rotation to facilitate interpretation [10]. We assessed scale reliability using Cronbach's α, for both the newly found scales and the original four scales. Subscales were constructed on the basis of the principal component analysis by adding the unweighted scores of the variables that loaded on a factor and normalizing to 0-100. Finally, we assessed construct validity as done in the earlier studies [4,5], using correlations with the QLQ-C30 (scores below 0.40 indicating no undue overlap between the constructs of the two questionnaires), and known-groups comparisons using Mann-Whitney U tests, comparing older (≥66 years) and younger (≤65 years) patients, patients with and without a stoma, and patients treated with curative and palliative intent.
Table 2 presents the item characteristics and the subscales detected. ICCs were low for urinary incontinence and dysuria. The percentage of respondents at floor was rather high (>50 %) in the blood and mucus in stool scale and for 19 individual items.
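The scale construction described in the statistical analysis (adding unweighted item scores and normalizing to 0-100) and the percentage of respondents at floor can be sketched as follows. This is a minimal illustration, not the authors' code: it assumes that every item is answered on a 1-4 response scale, that "floor" means the lowest possible score, and the function names and example responses are hypothetical.

```python
def scale_score(item_responses, low=1, high=4):
    """Sum unweighted item responses and rescale the sum to 0-100.

    Assumption: every item uses the same response range [low, high].
    """
    n_items = len(item_responses)
    raw_sum = sum(item_responses)
    lowest_sum = n_items * low
    widest_range = n_items * (high - low)
    return (raw_sum - lowest_sum) / widest_range * 100


def percent_at_floor(scores, floor=0):
    """Percentage of respondents whose score equals the lowest possible value."""
    at_floor = sum(1 for s in scores if s == floor)
    return 100 * at_floor / len(scores)


# Hypothetical responses of three patients to a six-item scale
patients = [
    [1, 1, 1, 1, 1, 1],
    [2, 3, 1, 2, 4, 2],
    [4, 4, 4, 4, 4, 4],
]
scores = [scale_score(p) for p in patients]
print(scores, percent_at_floor(scores))
```

With this convention a respondent answering the lowest category on every item scores 0 and one answering the highest category on every item scores 100, irrespective of the number of items in the scale.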

Factor analysis and reliability
Factor analysis revealed seven factors, among which the original urinary frequency (Cronbach's α = 0.71) and body image (α = 0.80) scales were reproduced (α in the original study [3] of 0.71 and 0.84, respectively). The original two-item stool frequency scale (items 52 and 53) had a lower α (0.68, originally 0.70 [3]) than when included in a larger factor, with all bowel and stoma problems included (items 49-54: α = 0.80). This latter scale also showed good reliability for patients with (α = 0.80) and without (α = 0.84) a stoma. The blood or mucus in stool scale was reproduced in the factor analysis but had a low α of 0.56 (originally 0.69 [3]). None of the remaining factors formed clearly interpretable scales, and their reliabilities were all below 0.70. We thus present construct validation for the original scales and items, as well as the new bowel/stoma problems scale.
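For readers unfamiliar with the reliability coefficient reported here, Cronbach's α for a scale of k items is k/(k-1) multiplied by one minus the ratio of the summed item variances to the variance of the total score. The sketch below is an illustrative implementation using only the Python standard library; the function name and the example data are hypothetical and are not taken from the study.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach's alpha for a list of respondents, each a list of item scores.

    alpha = k / (k - 1) * (1 - sum(item variances) / variance(total score))
    Sample variances (n - 1 denominator) are used throughout.
    """
    items = list(zip(*responses))  # transpose: one tuple per item
    k = len(items)
    if k < 2:
        raise ValueError("Cronbach's alpha needs at least two items")
    item_variance_sum = sum(variance(item) for item in items)
    total_variance = variance([sum(row) for row in responses])
    return k / (k - 1) * (1 - item_variance_sum / total_variance)


# Hypothetical data: five respondents, three items scored 1-4
example = [
    [1, 2, 1],
    [2, 2, 3],
    [3, 3, 3],
    [4, 3, 4],
    [2, 1, 2],
]
print(round(cronbach_alpha(example), 2))
```

Because α depends on the number of items as well as on their intercorrelations, a two-item scale such as stool frequency can show a lower α than a longer scale that contains the same items.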

Construct validity
Correlations between the subscales and the QLQ-C30 subscales were below 0.40, except for body image, which correlated moderately (r = 0.48) with social functioning.
Compared with older patients, younger patients had significantly worse sexual functioning (Table 3) and had fewer problems with urinary frequency and incontinence and with a dry mouth. Patients without a stoma had a better body image and less urinary incontinence. Patients treated with curative intent indicated more problems with blood and mucus in stool, defaecation problems, buttock pain, and stool frequency and fewer problems with hair loss and trouble with taste than patients treated with palliative intent.
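The known-groups comparisons rest on the Mann-Whitney U test, which compares the ranks of the scores in two groups. The following minimal sketch computes only the U statistics, using midranks for tied scores; it does not compute a p value, and the function names and example scores are hypothetical rather than study data.

```python
def midranks(values):
    """Rank values from 1..n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    return ranks


def mann_whitney_u(group_a, group_b):
    """Return (U_a, U_b) for two independent groups of scores."""
    pooled = list(group_a) + list(group_b)
    ranks = midranks(pooled)
    n_a, n_b = len(group_a), len(group_b)
    rank_sum_a = sum(ranks[:n_a])
    u_a = rank_sum_a - n_a * (n_a + 1) / 2
    u_b = n_a * n_b - u_a
    return u_a, u_b


# Hypothetical 0-100 scale scores for two patient groups
with_stoma = [33.3, 50.0, 66.7, 50.0]
without_stoma = [0.0, 16.7, 33.3, 16.7]
print(mann_whitney_u(with_stoma, without_stoma))
```

A rank-based test suits these data because many scores sit at the floor of the scale, so the distributions are skewed and contain many ties.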

Discussion
This study largely replicates the findings of the original study [3] and the Spanish validation [5]. As in the original study, the body image and urinary frequency scales were reliable, while the blood and mucus scale was only moderately reliable. An important result is that we found a reliable scale incorporating the items about bowel problems or stoma problems. Neither the Spanish nor the Polish study performed an exploratory factor analysis; both reported results only for the scales defined in the original paper [3]. Since the original stool frequency scale was incorporated in this new scale, the questionnaire still consists of four scales, but with 14 additional single items instead of 19. For reasons of reliability and multiple testing, it is recommended to have as few single items as possible, so this is an improvement.
Item performance was notably better in our study than in the Spanish validation, where ceiling effects were present in over 50 % of the scores in four domains (body image, anxiety, weight, and impotence). The patients in our sample scored markedly lower than those in the Spanish study, likely reflecting in part cultural values about body image and sexuality. Dysuria had similarly high floor effects in the Spanish [5] and Danish [6] studies. We recommend additional assessment of the items urinary incontinence and dysuria, which showed poor reliability and item performance.
Reliabilities of the items in the original study were higher than ours (ICCs > 0.55). The other studies did not report test-retest reliability.
Construct validity was sufficient, as shown by only limited overlap between the QLQ-CR29 and QLQ-C30 (similar to the original study [3], apart from the correlation between body image and social functioning, which only we found). We also found differences in scores between groups that were well interpretable. We found fewer differences between patients with and without a stoma than the original study [3] (which also saw differences for the urinary frequency scale and the faecal incontinence, sore skin, and embarrassment items). Further, patients receiving palliative treatment in that study reported more problems with hair loss, anxiety, faecal incontinence, and dyspareunia, whereas in our study they reported less blood and mucus in stool and buttock pain, and lower stool frequency. In conclusion, we were able to replicate earlier findings, but could also reduce the number of single items and thus improve on the QLQ-CR29 as published so far. We recommend that the remaining individual items be revised to improve their performance, and that, following that, more psychometric research be carried out to reduce the number of individual items.
helped with the data collection and especially all the patients who so generously donated their time and effort to our study.
Funding This study was funded by the Dutch Cancer Society (Grant Number UL-2009-4431).

Compliance with ethical standards
Conflict of interest The authors declare that they have no conflict of interest. Human and animal rights All procedures performed in studies involving human participants were in accordance with the ethical standards of the institutional committees and with the 1964 Helsinki Declaration and its later amendments or comparable ethical standards. Written consent was obtained from all participants included in the study.
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.