Background

Self-completed postal questionnaires provide a convenient, cost-effective method of measuring patient outcomes, avoiding travel for participants and researchers. However, non-participation has been increasing over time [1]. If non-participation is not random, the sampled population may differ from the whole population, which may limit the generalisability and usefulness of the results [2].

Many researchers have investigated methods of improving responses to postal questionnaires. Their findings have been extensively reviewed [3–6]. Varied methods such as using a package of communication strategies, teasers on envelopes, personalising the questionnaire, non-monetary incentives and making clear that the study is based at a university have all been found to improve response rates [5]. Questionnaire burden may affect participation in self-completed postal questionnaires [5, 6]. It may also alter the answers given [7].

Changes in quality of life, mental and physical health occur after treatment on an Intensive Care Unit (ICU) [8–12]. In the Intensive Care Outcome Network (ICON) study we investigated quality of life, mental and physical health following treatment on an ICU using validated postal questionnaires. Whether questionnaire burden affects return rates or answers from patients after treatment on an ICU is unknown. In other hospitalised patient populations, we found two randomised studies, with conflicting results [7, 13].

We therefore undertook an early example of a Study Within a Trial (SWAT) [14] to investigate the effects of questionnaire burden on participation and answers.

Methods

Study design

We conducted a randomised controlled trial as a study within a trial (SWAT), within the ICON study. The ICON study assessed quality of life, the incidence of depression, and the incidence of post-traumatic stress disorder following at least 24 h treatment on an ICU; the protocol has been published [15]. The study received national ethics approval (REC 06/Q1605/17) and local research governance approval was obtained at each centre.

Study population

Patients from 26 UK ICUs took part (1 university hospital, 6 university-affiliated hospitals and 19 district general hospitals). We gave all patients a letter introducing the study at ICU discharge: it explained that they might receive mail from the study team. Patients were eligible if they received level 3 care (as defined by the Intensive Care Society, London [16]) on an ICU for at least 24 h. We excluded patients if they were under 16 years old. We also excluded patients not registered with a general practitioner or of no fixed abode (factors anticipated to prevent follow-up in the study). We excluded patients taking part in another questionnaire follow-up study run by the same research office.

Allocation

We used a pseudo-random number generator, built into GNU Libc (http://www.gnu.org/software/libc/), to allocate patients to a study group without restriction. Allocation occurred at the central research office at the point of enrolment.
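Allocation "without restriction" (simple randomisation) can be illustrated with a minimal sketch. It uses Python's standard `random` module rather than the GNU Libc generator used in the study; the function name `allocate`, the seed and the patient identifiers are hypothetical and chosen only for illustration.

```python
import random


def allocate(patient_ids, seed=None):
    """Assign each patient independently to group 'A' or 'B' with equal probability.

    This is simple (unrestricted) randomisation: no blocking or stratification
    is applied, so the two groups need not end up the same size.
    """
    rng = random.Random(seed)  # illustrative generator, not the study's GNU Libc PRNG
    return {pid: rng.choice("AB") for pid in patient_ids}


# Hypothetical example: ten patients enrolled at the central research office.
allocation = allocate(range(1, 11), seed=2008)
counts = {group: sum(1 for g in allocation.values() if g == group) for group in "AB"}
print(allocation)
print(counts)
```

Because each assignment is independent, small samples can be noticeably unbalanced, whereas with many thousands of patients the group sizes are expected to be close to equal.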

Interventions

The group A pack contained four questionnaire pages. The pack contained the one-page EuroQol 5 dimensions 3 level (EQ-5D-3L) questionnaire, two pages each containing a single visual analogue scale and a single-page demographics questionnaire.

The group B pack contained eight questionnaire pages. In addition to the contents of the group A pack, it contained a two-page Hospital Anxiety and Depression Scale (HADS) questionnaire and a two-page Post-traumatic stress disorder Checklist – Civilian version (PCL-C) questionnaire.

The initial packs sent at 3 months contained a personally addressed letter inviting participation. The 3-month packs also contained a three-page study information leaflet and a consent form (required by the Ethics Committee).

Packs at 12 and 24 months contained a covering letter and the questionnaires, but no information leaflet or consent form. However, if patients did not receive a 3-month pack because they remained in hospital, the 12-month pack contained the initial letter, study information leaflet and consent form. Questionnaire content remained the same at 3, 12 and 24 months.

Participants were unaware that we were sending different questionnaire packs.

Survey implementation

Both groups received questionnaire packs 3 months after ICU discharge. We sent further questionnaire packs at 12 and 24 months after ICU discharge to respondents who agreed to take part further. We sent repeat questionnaire packs if a reply did not arrive after 2 weeks. Repeat questionnaire packs were identical to the first pack other than slight changes to the wording of the letter that made it clear that this was a repeat mailing. At 3 months we did not contact patients who had spent over 75 days in hospital following their discharge from ICU; instead, we first contacted these patients at 12 months. Patients who remained in hospital at 12 months did not receive any follow-up.

We checked survival with the patient’s registered general practitioner and the National Health Service clinical spine application before posting each questionnaire pack. If the general practitioner informed us that a participant would be unable to complete a questionnaire, we did not send a pack.

We printed all documents using a high quality laser printer. We printed the invitation letter on Oxford University headed paper. The trial coordinator signed each letter. We printed each questionnaire on different coloured paper and bound them with a removable clip. All pages were single-sided and numbered. We used uniform design, large font size and generous spacing. All packs contained a Freepost addressed envelope for questionnaire return. All packs included an ICON branded pen with an ICON-labelled tea bag as incentives; the tea bag label invited the participant to enjoy a cup of tea whilst completing the questionnaire. Participants could also complete the questionnaires by telephone with a trained researcher.

Outcome measures

The primary outcome measure was the questionnaire return rate 3 months after ICU discharge, by group. For the first mailed pack we defined questionnaire return as a completed consent form agreeing to take part in the study and at least one questionnaire question or visual analogue scale completed and returned. For subsequent packs we defined questionnaire return as at least one questionnaire question or visual analogue scale completed and returned.
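The two definitions of questionnaire return above can be written as a single rule. The following is a minimal sketch only; the class `ReturnedPack`, its field names and the function `counts_as_return` are hypothetical and do not describe the study database.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ReturnedPack:
    """Hypothetical record of what came back in one returned pack."""
    consent_given: bool   # completed consent form agreeing to take part
    items_completed: int  # questionnaire questions or visual analogue scales completed


def counts_as_return(pack: Optional[ReturnedPack], first_mailed_pack: bool) -> bool:
    """Apply the return definition: at least one completed item, plus consent for the first pack."""
    if pack is None:               # nothing was returned
        return False
    if pack.items_completed < 1:   # a return needs at least one completed item
        return False
    if first_mailed_pack:          # the first mailed pack also needs a completed consent form
        return pack.consent_given
    return True


print(counts_as_return(ReturnedPack(consent_given=True, items_completed=3), first_mailed_pack=True))
print(counts_as_return(ReturnedPack(consent_given=False, items_completed=3), first_mailed_pack=True))
print(counts_as_return(ReturnedPack(consent_given=False, items_completed=1), first_mailed_pack=False))
```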

Secondary outcome measures were return rates at 12 and 24 months and the effect of questionnaire burden on the EQ-5D-3L weighted index score and individual domain scores at 3 months after ICU discharge. We defined a valid response as completion of all questions within an instrument. We undertook a post-hoc analysis of missingness on the invalid returns.

Sample size

The ICON study ran for 19 months before the randomised controlled trial started. During this time we sent 6028 patients the group B pack and a Short Form 36 version 2 questionnaire (SF36v2, a health-related quality of life questionnaire) at 3 months. The 3-month questionnaire return rate was 36%. We assumed a similar return rate for group B packs sent in the trial. We estimated 18,000 patients would take part in the next 28 months (allowing for site changes). Assuming a similar mortality 75 days following discharge from ICU (32%), power of 90% and a significance level of 0.05, this sample size was sufficient to detect a 2.9% change in the return rate.
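The arithmetic behind a detectable difference of this size can be sketched with a standard normal approximation for comparing two proportions. The calculation method actually used is not stated here, so the formula, the assumption of equal allocation between groups, the two-sided test and the function name `detectable_difference` are all assumptions made for illustration.

```python
from math import sqrt
from statistics import NormalDist


def detectable_difference(n_per_group, p_baseline, alpha=0.05, power=0.90):
    """Approximate smallest detectable absolute difference between two proportions.

    Uses the normal approximation with both groups' variance evaluated at the
    baseline proportion (an assumed simplification, two-sided test).
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_power = NormalDist().inv_cdf(power)
    return (z_alpha + z_power) * sqrt(2 * p_baseline * (1 - p_baseline) / n_per_group)


expected_patients = 18_000  # estimated enrolment over 28 months
mortality = 0.32            # assumed mortality by 75 days after ICU discharge
baseline_return = 0.36      # 3-month return rate observed before the trial

# Survivors eligible for a pack, split equally between the two groups (assumed).
n_per_group = expected_patients * (1 - mortality) / 2
delta = detectable_difference(n_per_group, baseline_return)
print(f"patients per group: {n_per_group:.0f}")
print(f"detectable difference: {delta:.1%}")
```

Under these assumptions the approximation gives a detectable difference of roughly 2.8 percentage points, the same order as the 2.9% quoted above; a different variance formula or a continuity correction would shift the figure slightly.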

Statistical analysis

We used an electronic form reader (Teleform v10, Cambridge, UK) to transcribe questionnaire responses into a database (MySQL v5.0, Oracle Corporation, Redwood Shores, CA). Study office personnel manually entered data that the electronic form reader could not interpret. We linked participant records with the Intensive Care National Audit & Research Centre (ICNARC) Case Mix Programme database to obtain admitting diagnoses and severity of illness scores [17].

Statistical analysis was undertaken using R Core v3.2.3 [18]. We used Fisher’s exact test to compare between-group rates of questionnaire return. We analysed questionnaire return rates for all patients to whom we sent a questionnaire. We used the Mann–Whitney “U” test to compare EQ-5D-3L weighted index scores and visual analogue scales between groups at the first time point. We used the Chi-squared test to compare the proportion of patients who responded as EQ-5D-3L level 1 (“No problems”) with a single collapsed category for those who responded as level 2 (“moderate”) or level 3 (“severe”) [19]. We did not correct for multiple testing. We did not include information from EQ-5D-3L respondents with any missing or invalid domain responses in our analysis of responses to the EQ-5D-3L questionnaire.
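The domain comparison described above (level 1 against levels 2 and 3 combined) can be illustrated as follows. The analysis itself was run in R; this Python version is a sketch only, the response counts are invented for illustration, and the test is computed without a continuity correction, which is an assumption because the choice is not stated.

```python
from math import erfc, sqrt


def collapse(levels):
    """Collapse EQ-5D-3L levels into [level 1 ("no problems"), level 2 or 3]."""
    no_problems = sum(1 for level in levels if level == 1)
    return [no_problems, len(levels) - no_problems]


def chi_squared_2x2(row_a, row_b):
    """Pearson chi-squared test for a 2x2 table, without continuity correction."""
    table = [row_a, row_b]
    total = sum(row_a) + sum(row_b)
    col_totals = [row_a[0] + row_b[0], row_a[1] + row_b[1]]
    statistic = 0.0
    for row in table:
        row_total = sum(row)
        for j, observed in enumerate(row):
            expected = row_total * col_totals[j] / total
            statistic += (observed - expected) ** 2 / expected
    # With 1 degree of freedom, the upper tail probability is erfc(sqrt(x / 2)).
    p_value = erfc(sqrt(statistic / 2))
    return statistic, p_value


# Hypothetical responses to one domain (NOT study data): 100 respondents per group.
group_a = [1] * 60 + [2] * 30 + [3] * 10
group_b = [1] * 45 + [2] * 40 + [3] * 15

statistic, p_value = chi_squared_2x2(collapse(group_a), collapse(group_b))
print(f"chi-squared = {statistic:.2f}, p = {p_value:.3f}")
```

Collapsing levels 2 and 3 yields a 2x2 table per domain, so each test has one degree of freedom.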

Results

Patients joined the study from May 2008 to September 2010 inclusive, as planned. The study database closed 28 months after recruitment of the final patient.

Fig. 1 Consort diagram

Table 1 Demographics
Table 2 Responders at 3 months
Table 3 Three, 12 and 24 month response rates

We randomised 18,134 of the 18,490 patients assessed for inclusion (see Fig. 1). 5410 patients died within 3 months of ICU discharge. We delayed follow-up until 12 months in 554 patients who spent over 75 days in hospital. 12,170 patients were eligible to receive a questionnaire pack at 3 months. Table 1 shows demographic data for all participants. Table 2 is a response analysis showing the characteristics of those patients that responded to the study at 3 months (an equivalent non-response analysis is included in Additional file 1: Table S1). Between the two groups responders were very similar, although, without correction for multiple testing, differences existed in self-reported university education and need for assistance. Table 3 shows response rates at 3, 12 and 24 months by group. Response rates were equivalent at all time points.

Table 4 shows valid responses to the EQ-5D-3L questionnaire at 3 months by group. Patients in group B reported worse function in the “anxiety and depression” (p = 0.017) and “mobility” (p = 0.003) domains of the EQ-5D-3L questionnaire at 3 months. Questionnaire burden did not affect answers to the “usual activities”, “pain/discomfort” and “self-care” domains at 3 months. Nor did it affect median EQ-5D-3L weighted index scores or EQ visual analogue scale scores at 3 months. Participants did not always complete every dimension. An analysis of missingness is shown in Additional file 2: Table S2.

Table 4 Three month questionnaire responses

Discussion

Halving the questionnaire burden had no effect on response rates. Most participants who returned a questionnaire at 3 months completed questionnaires at later follow-up points. The group sent the long questionnaire pack reported worse function in the “mobility” and “anxiety and depression” EQ-5D-3L domains.

We believe our study is by far the largest randomised controlled trial of questionnaire burden. Our short questionnaire pack was the EQ-5D-3L (four pages). The EQ-5D-3L was the shortest validated quality of life questionnaire for patients recovering from critical illnesses. Our long questionnaire pack also contained HADS and PCL-C questionnaires (an extra four pages).

We believe we followed best practice for postal questionnaires [3, 5, 6, 20]. Packs included an ICON branded pen and tea bag as non-monetary incentives. We used a non-financial incentive appropriate to the situation as these have evidence of benefit and because a financial incentive would have been too costly given the scale of our study [5, 21]. We used personalised letters on University of Oxford headed paper that detailed the patient's name, address and admission hospital [3, 5]. We printed each questionnaire on different coloured paper bound with a removable clip [5]. We used a package of postal communications, including sending out 2 copies of the questionnaires [3]. We used good data management practices to optimise our data capture. This included numbering the pages and using only single-sided pages to ensure participants did not miss questions printed on the back of pages [5]. We used a large font size and generous spacing to facilitate responses from older patients [22].

Our study has limitations. We only contacted patients by post, limiting our findings to postal-only questionnaires. We stopped sending questionnaires to patients who did not respond at the previous mailing point; however, recovery from critical illness may have increased responses over time. Additional telephone contact may have changed our results, although this took place very rarely. Our short questionnaire pack contained nine pages. Five of these pages were the letter, consent form and information leaflet. These pages were necessary as patients did not agree to take part before leaving hospital. Informed consent before discharge would have been a more complex approach. However, a (shorter) pack without these pages may have improved return rates [5].

Two studies have examined the effect of questionnaire burden in patients discharged from hospital [7, 13]. In the International Stroke Trial more participants responded to a six-question EuroQol instrument than to the longer SF-36 questionnaire [7]. Yet this difference was not seen with Picker Patient Experience questionnaires of different lengths [13].

Three systematic reviews have assessed the effect of reducing questionnaire burden on return rates [3, 5, 6]. They all included participants who had not recently been in hospital. Two report an increase in return rates with decreased questionnaire burden [5, 6]. The most recent, restricted to clinical randomised controlled trials, found only a marginal effect [3]. In patients recovering from severe illness, return rates to postal questionnaires are variable [7–12].

In the International Stroke Trial [7], where return rates were higher than in our study, patients agreed to take part in the trial before they received questionnaires. In the ICON study, we combined agreeing to take part and returning the first questionnaire. This combination may explain our lower initial return rates. Where patients agreed to take part in our study, later return rates were similarly high. In our study the longer questionnaire group reported worse function in the “mobility” and “anxiety and depression” EQ-5D-3L domains.

Two trials in recently hospitalised patients also studied the relationship between questionnaire burden and the answers given. Different Picker Patient Experience questionnaire lengths did not affect the answers given [13]. Conversely, more stroke survivors scored themselves as dependent using the (shorter) six-question EuroQol instrument than the SF-36 instrument [7]. The difference seen may reflect the different questionnaires used, rather than the questionnaire burden. In our study, questionnaire burden affected the answers given to the EQ-5D-3L questionnaire. When presented with additional questionnaires, four percent of respondents placed themselves in a worse category.

In patients discharged from an ICU, our study is large enough to be definitive. Reducing questionnaire length from eight to four pages has little effect on return rates. However, including extra questionnaires resulted in different EQ-5D-3L answers. This interaction suggests investigators should avoid mailings with multiple questionnaires. The effect adds to the case for minimising the burden to participants.

We undertook this study before the Study Within a Trial (SWAT) programme commenced [14, 23]. Our study demonstrates that large scale SWATs examining clinical outcomes can be undertaken. Our findings highlight the potential impact of trial methodologies on outcomes.

With the same response rate in the two groups, we did not expect different answers to EQ-5D-3L domains. The reasons underlying the different answers remain unclear. The extra questions may have resulted in a different response group. Alternatively, the extra questions may have caused the same group to respond differently. We need to understand better the links between questionnaire burden and the pattern of answers.

Conclusions

In patients treated on an intensive care unit, questionnaire burden affected the findings from the same questionnaire. This is a compelling reason to minimise the questionnaire burden. Halving the number of questionnaire pages had no effect on the return rate.