Background

Attrition can be defined as the loss of relevant individuals occurring after definition of the population to be included in a study. The first stage of attrition is non-response, a loss of subjects to recruitment and baseline data collection. Data on non-responders are not easily accessed [1], and assessment of non-response bias often depends on the availability of routine demographic data. The second stage of attrition is dropout of subjects who participated in the baseline data collection but were not followed up.

Dropout can be composed of four types: death, contact failure, inability to respond or refusal to respond. Predictors for attrition are usually investigated using baseline data. Investigation of, and adjustment for, attrition need to consider any risk factors that may be associated with the outcome of interest in each particular study and how these risk factors might be associated with different types of attrition [2, 3]. Knowledge of the patterns of attrition helps both to improve response rates and to examine the extent of any attrition bias [2]. Only a few large longitudinal studies have treated all four components of dropout separately [4, 5].

Numerous factors have been associated with increased mortality. Sociodemographic factors include being male, being older, having low education and living in an institution [2, 5–9]. Social networking factors include fewer social supports (if male), using informal and formal support and less variety of pursuits (i.e. activities, hobbies or interests) [5, 8, 10]. Physical and mental health factors include having functional difficulties, poor self-reported health, cognitive impairment and major disease [2, 5, 7–9, 11]. The associations of mortality with education and major disease are not independent of cognition and physical functioning [6, 7].

Refusal dropout has not been consistently related to any factor [4]. Investigation of non-contact attrition has consistently shown worse physical and psychological health in movers than in continuing responders, and has sometimes suggested increased age for movers [4, 5, 12]. In studies where contact failure, inability to respond and refusal to respond have been grouped together, dropouts are older, cognitively impaired and live alone [2, 4, 13–17]. Generic attrition has been linked to many demographic, social and health factors including being female (occasionally), fewer years of education, being single, living with others, in urban areas, in rented accommodation, or in better quality housing, smoking, lower income, low social participation, poor functioning, poor self-reported health, more chronic illnesses and depression [2, 4, 5, 13, 14, 16, 17].

Studies of attrition come from a variety of settings and use a variety of methods. Some have investigated factors using only univariate analyses [8–12, 18–20], which may account for the number of different factors found. However, even studies where multivariable analysis was used suggest that drop-out is a complex process that can be independently linked to more than one factor [2, 5, 7, 13, 14, 21–26]. See Chatfield et al. [27] for a systematic review of factors related to attrition in large population-based longitudinal studies of the elderly.

The aim of the MRC Cognitive Function and Ageing Study (MRC CFAS) is to estimate incidence and longitudinal effects of many disorders [28]. The study is now beginning to publish on the hypotheses that require the longitudinal data. Hence the potential for bias in the respondents at the second wave of interviewing (the incidence wave) needs to be evaluated [29]. The analysis presented in this paper investigates three questions: first, whether the original sample was representative of the centres concerned; second, which characteristics measured in the original prevalence screen were predictive of non-response for the second wave interview; and third, whether the dropout resulted in bias for the estimates of incidence and longitudinal estimates of disease.

Methods

MRC CFAS is a population-based longitudinal study primarily of dementia, but also of other disorders and their potential risk factors. The initial phases of the study have been described in detail elsewhere [28, 29]. Briefly, five centres in England and Wales (East Cambridgeshire, Gwynedd, Newcastle-upon-Tyne, Nottingham and Oxford) used identical methods to obtain approximately 2,500 screening interviews in each centre. All centres except Gwynedd obtained the population information from the appropriate Family Health Service Authorities (FHSA); all individuals aged 64 and over on defined dates were enumerated. Population-based samples stratified to ages 65–74 years and 75 and above were taken to achieve the 2,500 interviews at each centre. In Gwynedd the FHSA could not release names and addresses for sampling, and hence enumeration was undertaken by searching records in GP surgeries and ascertainment based on surgery size. All individuals who were enumerated (n = 18017) have been flagged at the Office for National Statistics (ONS) for deaths and emigrations. Of these, 17591 (98%) were successfully matched against their NHS service register records. At wave one an initial screening interview was attempted on all individuals, followed by a more detailed assessment interview on a 20% sub-sample of the respondents, biased towards the cognitively frail. One year later half of these assessed respondents were seen again, and then at two years (wave 2) all respondents (screen only and assessed groups) were re-interviewed using the two-phase sampling technique (see figure 1, the CFAS main papers [28, 29] and website [30] for more information). Once an individual had refused to be interviewed they were not recontacted.

Figure 1

Flow chart of individuals at the time of the second wave interviews. D: Died. R: Refused. M: Moved or uncontactable.

Structured interviews were undertaken in the respondent's own home by trained interviewers using computer-assisted interviewing. The screen interview collected information about demographics including marital status and educational ability, socio-economic factors (including Social Economic Group [31]), social support [32], cognitive impairment (measured using the Mini Mental State Examination [33]), functional ability (ADL and IADL and the Townsend Activities of Daily Living score [34, 35]), the organicity section of the Geriatric Mental State (GMS [36]), chronic diseases (including heart disease, cerebrovascular disease, Rose angina scale and intermittent claudication [37]), emotional problems (self-reported depression and anxiety), endocrine disorders and other disorders thought to influence dementia risk [38], self-perceived health and diseases of first degree relatives. A 20% subsample, chosen using different sampling fractions based on age, centre and cognitive ability, was selected for a more detailed diagnostic interview with the full mood and organicity sections of the GMS Automated Geriatric Examination for Computer Assisted Taxonomy (AGECAT [36]).

Data from the study have been released in stages. Version 6.2 of the data has been used for this analysis and information from ONS for deaths, loss to follow-up and emigrations has been censored at 31 December 2000.

At each interview stage individuals were classified as undertaking that interview successfully and, if not, reasons for non-response were ascertained. Once an individual had refused an interview or moved away they were not contacted again; however, individuals could temporarily say the interview was 'not convenient' and these were re-contacted at the next wave. Individuals who were unable to undertake the complete interview could have some sections completed by a proxy. In addition to drop out due to refusal or non-contact, individuals died between the stages. Exact dates of death for these individuals have been ascertained by ONS. The three aims have been investigated using two separate analyses. The first examined whether the initial wave 1 interviewed sample had a similar mortality at the end of wave 2 to those who moved or refused to be interviewed at wave 1 (are the longitudinal results biased by the initial non-response?). In this analysis individuals are classified as dead or still available for follow-up at the date the second wave of interviewing finished (dependent on the centre and original date of interview). The second analysis examines whether there are differences in the baseline interview characteristics between those who completed wave 2 and those who did not. In this analysis individuals are classified throughout the whole second wave interview process, but censored at the point of contact for the second interview (whether successful or not), and therefore there are fewer deaths than if the entire second wave time period were included.

It is possible that different types of drop out between waves have different effects. Predictors of drop out have been considered in three separate models for death, refusal and non-contact. The analysis presented compares rates for these three groups using comparison of proportions. Items that may predict drop out between waves have been investigated using logistic regression. Because of the study size, odds ratios of <0.67 or >1.5 have been used as the indication of important factors (except in the analysis of factors described in previous literature). This is to ensure that only meaningful differences that are statistically significant remain and that epidemiologically trivial results are discounted. Variables with more than two levels have been examined for a trend in the odds ratios, with similar limits on the indication of importance. The study design stratified the sample by ages 65–74 and 75 and above in equal numbers, and hence all analyses are adjusted for age, regardless of whether there is an age effect. Other demographic variables such as sex and centre were included when this helped to stabilise the model, but are not used in describing the difference in dropout. A full model for each dropout type, including not only those factors found within the data but also the effects found in the literature, has additionally been fitted and the results described. Confidence intervals for estimates where there is no natural reference category have been calculated using Floating Absolute Risks, which avoid the use of an arbitrary reference category [39]. The confidence intervals apportion the overall error, using the variance/covariance matrix, to each level of the categorical variable, enabling direct comparison of any value with another.
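The screening rule described above can be illustrated with a short sketch. This is not the study's analysis code: it computes an unadjusted odds ratio with a Wald 95% confidence interval from a 2×2 table, whereas the analyses reported here used age-adjusted logistic regression. The function names and the counts are hypothetical, and the reading of "important" as both statistically significant and outside the 0.67 to 1.5 band is an assumption based on the wording of the rule.

```python
import math


def odds_ratio_wald(a, b, c, d, z=1.96):
    """Unadjusted odds ratio and Wald 95% CI for a 2x2 table.

    a: factor present, dropped out    b: factor present, retained
    c: factor absent, dropped out     d: factor absent, retained
    """
    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


def is_important(odds_ratio, lower, upper, low=0.67, high=1.5):
    """Assumed rule: CI excludes 1 AND the OR lies outside [low, high]."""
    significant = lower > 1.0 or upper < 1.0
    large_enough = odds_ratio < low or odds_ratio > high
    return significant and large_enough


# Hypothetical counts, for illustration only (not taken from the study).
odds_ratio, lower, upper = odds_ratio_wald(120, 380, 200, 1300)
print(f"OR {odds_ratio:.2f} (95% CI {lower:.2f} to {upper:.2f})")
print("important:", is_important(odds_ratio, lower, upper))
```

Under this rule a factor with a tight interval around a small effect, such as an odds ratio of 1.1 with a confidence interval of 1.05 to 1.2, is statistically significant but is not flagged as important.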

Results

MRC CFAS identified 20234 individuals whose data could potentially be eligible for inclusion in the study (table 1). FHSA errors, duplicates and oversampling for an inflated estimate of refusal rate resulted in 2085 individuals who were never contacted. A further 132 individuals were identified from GP practices that had refused access to their patients; hence contact was attempted for 18017 individuals. Forty-eight individuals were ineligible because they did not speak sufficient English (or Welsh in Gwynedd) and one because they were related to the study team. A further 417 individuals were never traced; the majority had left one GP practice and not re-registered with a new address with a new GP. Of the 17551 eligible individuals contacted, 74% were successfully interviewed, with similar rates seen at each of the five centres. At the end of December 2000 (version 6.1 of the data) the vital status of all individuals as reported by ONS was 8943 (51%) alive and traced, 8537 (49%) dead, 24 (<1%) emigrated and 87 (<1%) with lost NHS registration.
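The audit trail in this paragraph can be checked with simple arithmetic. The sketch below uses only counts quoted in the text (the 13,004 completed wave 1 interviews are reported under "Longitudinal response"); the variable names are illustrative.

```python
# Counts quoted in the text; variable names are illustrative.
identified = 20234
never_contacted = 2085          # FHSA errors, duplicates, oversampling
gp_refused_access = 132
contact_attempted = identified - never_contacted - gp_refused_access

ineligible_language = 48
ineligible_related_to_team = 1
never_traced = 417
eligible_contacted = (contact_attempted - ineligible_language
                      - ineligible_related_to_team - never_traced)

interviewed_wave1 = 13004
response_rate = 100 * interviewed_wave1 / eligible_contacted

# Vital status at the end of December 2000 as reported by ONS.
vital_status = {
    "alive and traced": 8943,
    "dead": 8537,
    "emigrated": 24,
    "lost NHS registration": 87,
}
flagged_total = sum(vital_status.values())

print(contact_attempted)      # 18017
print(eligible_contacted)     # 17551
print(round(response_rate))   # 74
print(flagged_total)          # 17591
```

The vital status total of 17591 agrees with the number of enumerated individuals matched against NHS register records in the Methods.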

Table 1 Audit trail of individuals into the MRC CFAS study

Initial non-response and longitudinal bias

All individuals were classified as to whether they were alive or dead at the end of the second wave interview process (between 1994 and 1996 depending on centre and year of first interview), to ascertain whether those who undertook the first wave interview (prevalence screen) had similar death rates to those who refused or who had moved. Table 2 details each of the follow-up categories, whether they were successfully traced by ONS and whether they were alive or dead by the second stage. The death rates are similar between those seen at the wave 1 interview and those who refused (17% versus 20%, difference 3%, 95% Confidence Interval (CI) 2–5%). There is a higher death rate in those individuals who had moved prior to the wave 1 interview (27% versus 17%, difference 10%, 95% CI 4–16%). Only 2 of the 417 individuals never found were traced by ONS, and both were still alive.
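A comparison of proportions of this kind can be sketched as follows. The counts behind the reported percentages are in table 2 and are not reproduced in the text, so the numbers below are hypothetical and are not expected to match the reported intervals. The use of a Wald (normal approximation) interval is also an assumption; the text does not state which interval was used.

```python
import math


def difference_in_proportions(deaths_1, n_1, deaths_2, n_2, z=1.96):
    """Difference p1 - p2 with a Wald 95% confidence interval."""
    p1 = deaths_1 / n_1
    p2 = deaths_2 / n_2
    diff = p1 - p2
    se = math.sqrt(p1 * (1 - p1) / n_1 + p2 * (1 - p2) / n_2)
    return diff, diff - z * se, diff + z * se


# Hypothetical counts for illustration only: 200 deaths among 1000
# refusers versus 170 deaths among 1000 interviewed respondents.
diff, lower, upper = difference_in_proportions(200, 1000, 170, 1000)
print(f"difference {100 * diff:.1f}% "
      f"(95% CI {100 * lower:.1f}% to {100 * upper:.1f}%)")
```

With groups as large as those in this study the interval would be narrower than in this illustration, since the standard error shrinks as the group sizes grow.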

Table 2 Potential bias in the non-responders from the original sample and wave 2 interviews

Longitudinal response

The full audit trail, to wave 2 interview, of the 13,004 individuals who completed the wave 1 interview is also shown in table 2. Sixty-eight per cent (8826) of these individuals successfully completed the wave 2 interviews. Of the 4178 who did not complete the interview process, 1502 (36%) died, 2490 (60%) refused and 181 (4%) moved. The baseline characteristics of these individuals are compared in table 3. Many of the characteristics seem to show differences between the types of loss between waves, but univariate analyses would show too many factors as associated with attrition. The characteristics shown in this table have been included in a multivariable model using stepwise logistic regression if the factors showed increased or reduced risk in an unadjusted analysis (table 4). As stated previously, only odds ratios of <0.67 or >1.5 have been considered. A final model has also been fitted that adds to the best model from the data the factors that have been reported in the literature to be related to dropout.
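The percentages in this paragraph follow directly from the quoted counts, as the short check below shows. It uses only numbers from the text; the variable names are illustrative.

```python
# Counts quoted in the text; variable names are illustrative.
completed_wave1 = 13004
completed_wave2 = 8826
not_completed = completed_wave1 - completed_wave2

reasons = {"died": 1502, "refused": 2490, "moved": 181}

retention = round(100 * completed_wave2 / completed_wave1)
shares = {reason: round(100 * count / not_completed)
          for reason, count in reasons.items()}
itemised = sum(reasons.values())

print(not_completed)             # 4178
print(retention)                 # 68
print(shares)                    # {'died': 36, 'refused': 60, 'moved': 4}
print(not_completed - itemised)  # 5
```

The three itemised reasons sum to 4173, so five of the 4178 individuals are not itemised in the text.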

Table 3 Number (%) of respondents by status at wave 2 interview and characteristics measured at baseline. Missing data excluded
Table 4 Predictors for each type of loss to follow-up from the multivariable model.

Drop-out due to death

Individuals were more likely to drop out due to death if they were older, male, smoked, had a lower (or missing) MMSE score, had ADL impairment or a missing ADL score, or had fair/poor (or missing) self perceived health (table 4). This compares well with the previous literature; however, no effect of self reported depression (Odds ratio (OR) 1.0 95% Confidence interval (CI) 0.8–1.2) or level of education (OR 1.0 95% CI 0.9–1.2) was found on dropout due to death in our respondents (even when controlling for cognitive ability).

Drop-out due to refusal

Individuals who refused were also more likely to have a lower/missing MMSE. Individuals were less likely to refuse if they had more full-time years of education and as the accommodation became more dependent. There was a notable difference between the centres, with Cambridgeshire and Gwynedd, the two rural centres, having the highest likelihood of refusal (table 4). The literature findings of low social class (OR 1.1 95% CI 1.1–1.3), living with others (OR 1.4 95% CI 1.2–1.5), and being young or female (OR 1.3 95% CI 1.2–1.5) were all confirmed as being associated with refusal in our data, albeit some with weak but significant effects. However, neither chronic disease (OR 0.9 95% CI 0.8–0.9) nor self-reported depression (OR 0.9 95% CI 0.9–1.1) was associated with refusal.

Drop-out due to moving/non-contact

Individuals were more likely to move between first and second interview if they had symptoms suggestive of dementia or were unable to complete the cognitive assessment, were single, were smokers, or had self-reported depression. They were less likely to move if they lived in warden-controlled accommodation (table 4). The potential factors from the literature associated with having moved or being uncontactable were not all associated with loss in our data, for example being functionally impaired (OR 1.1 95% CI 0.7–1.9) or cognitively impaired (OR 1.3 95% CI 0.4–4.0), though these may have failed to reach conventional significance due to small numbers. Emotional problems (primarily depression and anxiety) (OR 1.5 95% CI 1.1–2.2) were again associated with moving, but this effect was better measured by the subset of individuals with depression (table 4) than by the complete group with emotional problems.

We have also investigated all non-mortality drop-out without subclassification. This has shown that, as in other studies, individuals who were cognitively impaired, female, poorly functioning or less educated were all more likely to drop out, but people living alone were less likely to drop out. In addition there was a weak, non-significant effect of ethnicity (Caucasian versus other), and no effects of smoking and self-perceived health.

Discussion

A difference in mortality was found between those initially able to undertake the study and those who refused, though the difference was small. The small number of individuals who had moved before the first interview had a higher mortality than those who did undertake the interview. This may be because ill health or inability to remain in their own home caused them to move, despite efforts to track down all the residential care homes in the centres themselves. There is much mobility amongst the old-age population to retirement communities and for closeness to family members [40–42]. Initial non-response bias may well have been generated by the 23% of the population who either refused or moved, especially with the higher death rate in the movers; however, the social class and general demographic data are close to those seen in the population of the centres concerned (C McCracken, unpublished data).

Using volunteer groups minimises longitudinal dropout; however, the initial response relative to the complete population is low. Other studies, like CFAS, have a fairly constant rate of refusal for both the initial non-response and the longitudinal component [14]; hence initially they may be less biased. However, with time and many longitudinal waves the two types of study will become increasingly similar in characteristics. Each analysis in any population-based study will require careful consideration of the potential bias that these two very different dropout mechanisms introduce.

Following a successful interview at baseline, the predictors for drop out to the second wave due to death were increased age, being male, being impaired for activities of daily living (ADL), having poor self perceived health, being a smoker and poor cognitive ability. These findings are almost identical to the predictors of 5-year mortality in the Canadian Study of Health and Aging (CSHA) [6], except that smoking, which had a weak effect in our data, was not a predictor in the CSHA. As with the CSHA, the large effect of institutionalisation that we found with a univariate analysis (18% in deceased at wave 2 compared with 3% in respondents) disappeared when physical and cognitive factors were introduced to a multivariable analysis. This study did not find any association between self-reported depression and death, in contrast to other studies [43]; however, the effect is not clear cut. There was no evidence with self-reported depression, but some evidence of an increased prevalence of depression using the AGECAT diagnosis, although this was not statistically significant.

Similarly, individuals who refused to take part in further interviews were more likely to have poor cognitive ability, but were also more likely to have fewer years of full-time education, to be living in their own home and to be living alone. The fact that attrition not due to death was related to only a few factors was encouraging, as this attrition is essentially a study design issue. These findings are quite similar to those in the Longitudinal Ageing Study Amsterdam [4], where refusers were found to have fewer years of full-time education; however, unlike in CFAS, they were more likely to be living with others. In CFAS there were also more refusals in the rural centres. It is interesting to compare this result with two North American studies, where more populated areas and very large cities were associated with higher dropout [5, 13].

Individuals who moved away or were uncontactable were more likely to be single, smokers, potentially demented and to have self-reported depression, and were less likely to have moved if they were already in warden-controlled accommodation at baseline. We found little suggestion of an age effect in our multivariable analysis, while univariate analyses have previously found movers to be slightly older than continuing respondents and the 'hard to find' slightly younger [4]. Our findings add to the sparse literature suggesting that movers have worse psychological and physical health [5].

Overall, individuals who were unable to undertake a proper cognitive assessment were more likely to experience all three types of attrition, and having an incomplete interview (measured by incomplete responses to self-reported health and ADL impairments) was associated with higher levels of mortality. This could possibly indicate that these individuals were already in terminal decline; however, their exact level of impairment was difficult to ascertain.

Conclusions

Factors that influence dropout can potentially influence any results from longitudinal studies. The findings presented here suggest that different types of dropout between waves will affect different results. No factors affected different types of attrition in different directions, which makes adjusting for attrition biases possible. The longitudinal estimates that will be most affected by the attrition biases presented here will be analyses related to age (mortality effects only), cognitive ability, poor functioning, smoking history, residential status/population mobility and self-perceived health status, and to a lesser extent changes in marital status, social contacts and self-reported depression.

CFAS would not appear to be more biased than any other longitudinal study on ageing; however, many of the larger studies have not properly investigated their drop-out mechanisms, making comparisons difficult [4]. Also, different ratios of the types of attrition will mean results from other studies may be affected differently to our own. All researchers should consider attrition bias in any analysis they undertake, and papers on the overall pattern and breakdown of attrition are useful to other researchers in the field.