Background

Measurement of changes in quality of life (QoL) has become a standard outcome variable in evaluating different therapeutic regimens in cancer [1–3]. Standardized, validated and reliable questionnaires are available for the measurement of changes in QoL [4–7]. Additionally, the use of these instruments by clinicians caring for patients is being explored [8–11]. Tracking changes in QoL as patients progress through the course of disease and treatment requires longitudinal assessment.

Computer versions of these questionnaires have become available and can be used for longitudinal assessments [12–17]. These systems are well accepted by patients [12, 16–19] and allow for the collection of data without transcription errors [12, 17, 18]. Comparison of data collected at one point in time by computer versus paper suggests that the method of collecting the information does not have a large effect on the data collected [19], although some differences have been observed. Formatting of the questions has been found to have an effect [20], and there may be a tendency for patients to give more positive responses with the computer, especially if the format is simplified [12, 20].

A potential barrier to longitudinal measurements is that compliance may decrease over time. Patients may be initially willing to answer questions on the computer, but be less willing to do so on subsequent visits [14, 17]. Reasons for this may include time constraints due to office and patients' scheduling needs as well as patients feeling unwell. One method to deal with these realities of daily clinical practice is to offer patients the choice of taking home the questionnaires to complete if they state they do not have time or do not want to complete the questionnaires on the computer at that time. This would introduce two principal differences. The method of data collection would be different (paper vs. computer), and the location of completing the forms would be different (home vs. office). Asking patients about their QoL over the past several days would reduce the effect of answering the questions in the office or at home. If the instruments are measuring significant changes in life due to major events such as diagnosis of serious disease, surgery, chemotherapy, and remission, then the location and method of administration of the instrument should have minimal effect on responses.

Women attending a gynecologic oncology practice were enrolled in a longitudinal study of QoL. Women completed a computer version of a QoL questionnaire pre-operatively and again at six months. They were given the option of using the paper version at either time point, and the effect of this choice was examined. An additional issue examined was whether use of the paper version was widespread or sporadic. The goal of the study was to compare changes over time obtained when women used a touch-screen computer on two occasions with changes obtained when women used a paper version of the questionnaire on one of the occasions.

Methods

Patients who were scheduled to undergo surgery for endometrial cancer, ovarian cancer or an adnexal mass were invited to participate in a long-term study of QoL, complementary medicine use and diet. Women at two gynecologic oncology offices in Northeast Ohio were recruited from 2001 to 2003. Informed consent was obtained for participation in this IRB-approved study. Private office records and hospital discharge records were reviewed to abstract demographics and final pathology diagnosis. Baseline demographics were ascertained by interview with a research assistant pre-operatively. Patients completed the questionnaire pre-operatively and again at six months.

Computer kiosks with a 15-inch monitor were programmed with the Functional Assessment of Cancer Therapy-General questionnaire (FACT-G) along with an additional fatigue module [21]. The FACT-G is a 27-item questionnaire consisting of four domains: physical, emotional, social and functional well-being. Patients are asked how true each statement has been for them over the past seven days. Each domain comprises six to seven questions scored on a Likert-type scale ranging from 0 (not at all) to 4 (very much). Each domain appeared on one screen and patients touched their response to each individual question. Patients could change their answers by touching an alternate response on that screen but could not return to a previous screen. All questions had to be completed before the computer continued to the next screen. The touch-screen computer was designed so that the format of the questions closely matched the format of the questions on the paper form. Patients used the computer kiosk independently during their office visit, although the research assistant was available to answer questions. Patients were given the option of completing the questionnaire using the paper version at any time.
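The scoring described above can be illustrated with a minimal sketch. It assumes a domain score is the sum of the 0 to 4 item responses; the `reversed_items` parameter is a hypothetical input for reverse-coding negatively worded items, a step the text does not describe, and the example responses are made up rather than taken from the study.

```python
LIKERT_MIN, LIKERT_MAX = 0, 4  # 0 = "not at all", 4 = "very much"


def domain_score(responses, reversed_items=frozenset()):
    """Sum the 0-4 Likert responses for one domain.

    reversed_items is a hypothetical set of item positions to reverse-code
    (value becomes 4 - value); which items qualify is not stated in the text.
    """
    total = 0
    for index, value in enumerate(responses):
        if not LIKERT_MIN <= value <= LIKERT_MAX:
            raise ValueError(f"response {value!r} at item {index} is outside 0-4")
        total += (LIKERT_MAX - value) if index in reversed_items else value
    return total


# Made-up responses for a seven-item domain (not study data).
example = [0, 1, 4, 2, 3, 0, 1]
print(domain_score(example))                      # plain sum
print(domain_score(example, reversed_items={0}))  # first item reverse-coded
```

Because the kiosk required every question on a screen to be answered before continuing, a complete list of responses per domain is a reasonable input for computer-collected data; paper forms may need separate handling of skipped items.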

Statistical analyses

Patients were categorized into four groups: those who completed the FACT-G on the computer on both occasions (CC), those who completed the initial assessment by computer and used the paper format at six months (CP), those who completed the initial assessment on paper and the six-month assessment by computer (PC), and those who completed both assessments on paper (PP). Analysis of variance or the chi-square statistic was used to compare baseline demographic variables between patients who always used the computer and those who used the paper version at either time point.
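The four-group categorization can be sketched as follows. The function name, the mode labels and the example records are hypothetical and serve only to show how the two-letter codes are formed from the mode used at each assessment.

```python
from collections import Counter

# Hypothetical labels for the mode of administration at each assessment.
MODE_CODES = {"computer": "C", "paper": "P"}


def assign_group(baseline_mode, six_month_mode):
    """Return the two-letter group code: baseline mode, then six-month mode."""
    return MODE_CODES[baseline_mode] + MODE_CODES[six_month_mode]


# Made-up records of (baseline mode, six-month mode); not study data.
records = [
    ("computer", "computer"),
    ("computer", "paper"),
    ("paper", "computer"),
    ("paper", "paper"),
    ("computer", "computer"),
]

group_counts = Counter(assign_group(b, s) for b, s in records)
print(group_counts)
```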

Repeated measures analysis of variance was used to analyze change in the domain score from baseline to six months (time effect), whether there was an effect of group (CC, CP, PC and PP) and whether there was an interaction between group and time. Significance was set at p < 0.01 due to multiple comparisons. SPSS version 10.0 (Chicago, IL) was used for analysis.
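As an illustration of the group and interaction tests, the sketch below relies on a property of designs with exactly two time points: the group by time interaction F equals the F from a one-way analysis of variance on each patient's change score, and the group effect F equals the F from a one-way analysis of variance on each patient's average of the two scores. This is not the SPSS procedure used in the study; the helper `one_way_f` and the scores are hypothetical, the time main effect is not computed, and p-values are omitted.

```python
from statistics import mean


def one_way_f(groups):
    """Return (F, df_between, df_within) for a one-way ANOVA over lists of values."""
    all_values = [v for g in groups for v in g]
    grand_mean = mean(all_values)
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((v - mean(g)) ** 2 for g in groups for v in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_value = (ss_between / df_between) / (ss_within / df_within)
    return f_value, df_between, df_within


# Made-up (baseline, six-month) domain scores per patient; NOT study data.
scores = {
    "CC": [(20, 24), (18, 21), (22, 23), (15, 20)],
    "CP": [(19, 22), (21, 25), (17, 19)],
    "PC": [(16, 20), (20, 22), (18, 23)],
}

# Change scores carry the group x time interaction.
changes = [[post - pre for pre, post in pairs] for pairs in scores.values()]
# Per-patient averages carry the between-group effect.
averages = [[(pre + post) / 2 for pre, post in pairs] for pairs in scores.values()]

f_interaction, df1, df2 = one_way_f(changes)
f_group, _, _ = one_way_f(averages)
print(f"interaction: F({df1}, {df2}) = {f_interaction:.3f}")
print(f"group:       F({df1}, {df2}) = {f_group:.3f}")
```

The equivalence holds only with two time points; with three or more assessments a full repeated measures model would be needed.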

Results

A total of 187 patients were asked to participate in this longitudinal study and 151 agreed (81%). Following completion of the initial assessment, 32 patients were lost to follow-up, moved, missed the second appointment entirely or refused to complete the questionnaire the second time (16 patients with benign adnexal mass, 8 with endometrial cancer and 8 with ovarian cancer). A total of 119 patients (79% of patients who agreed to participate in the study) completed the FACT-G assessments at both time points. Forty patients had endometrial cancer, 40 had ovarian cancer and 39 had a benign adnexal mass. Twenty of the cancer patients had Stage III or IV disease. Virtually all of the patients were Caucasian (96.6%).
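The participation and retention figures above can be checked with a few lines of arithmetic. The counts are taken from the text; the variable names and the `pct` helper are illustrative only.

```python
invited = 187
enrolled = 151
# Patients who did not complete the second assessment, by diagnosis.
lost = {"benign adnexal mass": 16, "endometrial cancer": 8, "ovarian cancer": 8}

completed = enrolled - sum(lost.values())


def pct(part, whole):
    """Percentage of `whole` represented by `part`, to one decimal place."""
    return round(100 * part / whole, 1)


print("agreed to participate (%):", pct(enrolled, invited))
print("completed both assessments (%):", pct(completed, enrolled))
print("completed both assessments (n):", completed)
```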

Patients who took the paper version home returned the questionnaire by mail within a few days of their scheduled visit. Seventy-one (59.7%) patients used the computer at both visits (CC), 26 (21.8%) used the computer initially followed by the paper version at six months (CP), 17 (14.3%) used the paper version initially followed by the computer version (PC), and five (4.2%) used the paper version at both visits (PP). Patients in the PP group were excluded from statistical analyses because the group was too small (n = 5).

There were no differences in age (F = 0.225, p = 0.80) or level of education (χ² = 2.75, p = 0.60) between the CC, PC and CP groups (Table 1). Approximately 60% of the patients within each diagnosis group used the computer at both time points (Table 1). A slightly higher percentage of patients with a benign adnexal mass used the paper version of the FACT-G at the six-month visit (χ² = 11.07, p = 0.026), as they were more likely than patients with a cancer diagnosis to decline to come in for an office visit and to request that the FACT-G be sent home (Table 1). Four of the five patients in the PP group had ovarian cancer. Mean age of those in the PP group was similar to that of the other groups (61.2 years) and all had some college or were college graduates.

Table 1 Patient Demographics by Group

Physical well-being domain scores were significantly higher at six months than at baseline (Figure 1, F = 8.849, p = 0.004) and there was no effect of group (CC, CP, PC; p = 0.480) and no interaction between time and group (p = 0.457). Functional well-being scores were also higher at six months (Figure 2, F = 14.024, p < 0.001) and there was no effect of group (p = 0.453) and no interaction effect (p = 0.583). Emotional well-being scores were significantly higher at six months (Figure 3, F = 24.334, p < 0.001) and there was no effect of group (p = 0.943) and no interaction between group and time (p = 0.865). Social well-being scores did not increase with time (Figure 4, p = 0.14) and there was no effect of group (p = 0.185). There was a significant interaction between group and time (F = 5.671, p = 0.005), as the CP group had a higher score at baseline. There was no effect of time, group or interaction on fatigue scores (data not shown). Total scores were significantly higher at six months (Figure 5, F = 12.174, p = 0.001) and there was no effect of group (p = 0.756) and no interaction effect (p = 0.392).

Figure 1

Scores on the Physical Well-Being domain of the Functional Assessment of Cancer Therapy (FACT-G)

Figure 2

Scores on the Functional Well-Being domain of the Functional Assessment of Cancer Therapy (FACT-G)

Figure 3

Scores on the Emotional Well-Being domain of the Functional Assessment of Cancer Therapy (FACT-G)

Figure 4

Scores on the Social Well-Being domain of the Functional Assessment of Cancer Therapy (FACT-G)

Figure 5

Total scores on the Functional Assessment of Cancer Therapy (FACT-G)

Discussion

Physical, functional and emotional well-being scores, as well as total scores, improved significantly between baseline and six months. For each of these scores, there was no effect of group and no interaction between group and time, indicating that the women's responses were not affected by the method of data collection. There were also no significant effects of group where scores did not change over time (social well-being, fatigue). The one significant interaction effect was observed with the social well-being domain, and appeared to be due to a high baseline score in the CP group. At baseline, the CP group was treated the same as the CC group (they all used the computer), so it is not clear why there would be a high baseline score in the group that would use a paper version six months later. It is possible that, with the number of tests conducted, one spurious finding would be obtained. The trend across all the tests is very strong, however. There are clear and significant changes with time but not with the method of obtaining the data.

Given the choice between using the computer version and the paper version, a minority of women chose the paper version. Of the 238 total measurements, the paper version was used a total of 53 times (22%). Reasons for not using the computer included not wanting to come in to the physician's office at all and patient preference, as well as circumstances beyond the patients' control, such as scheduling complications and, on a small number of occasions, researcher unavailability. Designing strategies to increase computer availability may result in further reductions in patient use of the paper versions. If patients can log onto the computer using a unique identifier and complete the questionnaires on their own in the waiting room, the number of women who have to take questionnaires home or forgo completing them should decrease even further.
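The count of paper administrations follows from the group sizes reported in the Results: each patient contributes two measurements, and the number of paper uses per patient is the number of "P" letters in the group code. The sketch below reproduces that arithmetic; the variable names are illustrative.

```python
# Group sizes as reported in the Results (C = computer, P = paper;
# first letter is baseline, second is six months).
group_sizes = {"CC": 71, "CP": 26, "PC": 17, "PP": 5}

total_patients = sum(group_sizes.values())
total_measurements = 2 * total_patients  # two assessments per patient

# Each "P" in a group code is one paper administration per patient.
paper_uses = sum(n * code.count("P") for code, n in group_sizes.items())
paper_share = 100 * paper_uses / total_measurements

print("total measurements:", total_measurements)
print("paper administrations:", paper_uses)
print(f"paper share: {paper_share:.1f}%")
```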

The second assessment occurred six months following major surgery for all women. The majority of women with ovarian cancer received chemotherapy, but were not receiving it at six months. This time point therefore provides a relatively stable point at which to assess changes in QoL relative to pre-operative scores in these groups of women. It is possible that differences in method of data collection would be obtained if women were acutely ill at the time of measurement; however, the time frame of seven days used in the FACT-G reduces the likelihood that a separation of a day or two between using the computer in the office and the paper version at home will result in different responses. The time frame used in the FACT-G and the relatively stable time point chosen may therefore contribute to the lack of measurement effect obtained in these groups of women.

A limitation of this study is the lack of minority representation, which may reduce the generalizability of these results. Additionally, 19% of patients refused to participate in the study. Of the patients who did participate, 21% did not complete the second assessment, although this figure includes 16 women with a benign adnexal mass who may have returned to their referring physician, and women with cancer who moved or transferred their care. Nonetheless, the women who remained in the study may differ from those who did not agree to participate or who did not complete the second assessment. They may, for example, have a greater degree of commitment to the research process.

A second limitation is that women were not randomly assigned to use either the computer or the paper versions. This is a preliminary examination of existing data to determine whether there appeared to be a selection bias, or major effect, of using the paper version. Women with QoL scores that differed markedly from the norm, for example, might have chosen to take the paper version home. This did not appear to be the case, however, as highly significant effects of time were observed, but group and interaction effects were markedly non-significant. Related limitations include the remote, but possible, explanation that the first method of administration had an effect on participants at the second time point. Additionally, patient choice itself may have had an unmeasured effect. For example, women with a benign adnexal mass were more likely to forgo the second office visit and complete the questionnaire at home. Disease and questionnaire administration are therefore confounded. These limitations may have influenced group choice, as well as responses on the second measurement.

These exploratory data suggest that women respond to questionnaires presented on a computer in the same manner as to questionnaires on paper. This study therefore differed somewhat from studies that found differences by method of administration [12, 20]. An important consideration may be maintaining the same format of the questions in the two methods of administration. In this study, each domain was presented on one large screen so that all questions were listed together. The similarity of the format may have contributed to the finding that modes of administration are interchangeable; however, larger-scale studies that include randomization and assess women at different stages of treatment should be conducted to verify these findings.

Conclusions

Longitudinal measurements of health-related QoL are increasingly used in cancer patients. This study examined whether two different methods of measuring QoL (computer and paper) would provide interchangeable data. It appears that patients are dealing with issues of significant concern and that they are responding to the content of the questions and not the method of data collection. It is clearly desirable to standardize the method of data collection and have conditions remain constant across time. The results of this study, however, demonstrate that valid data are obtained with alternate methods of data collection, and this is preferable to forgoing data collection entirely.