Background

Longitudinal studies, in which the same participants are observed at multiple points in time, are vital to numerous areas of science, including medicine, public health, public opinion research, psychology, sociology, econometrics and political science [1]. Conducting such studies allows the analysis of change over time in phenomena of interest, which is argued to be the basis for causal inference as one of the ultimate goals of science [2]. However, how well participants can be retained over time largely determines whether these goals can be achieved [3]. Participant attrition (survey dropout) may lead to biased inferences and thus false conclusions, thereby threatening the validity of longitudinal studies [4].

Although participants’ health is often assumed to be one of the most important predictors of dropout, relatively few studies have analysed this purported relationship, and their results are mixed [3, 5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20]. For example, Banks and colleagues [5] analysed the relationship between several indicators of health, sociodemographic variables and attrition among older adults. They found that only the sociodemographic variables, and not health, predicted future attrition. Similarly, Wadsworth found in a study of a national birth cohort that response rates generally did not differ meaningfully between persons with and without serious physical illnesses [18]. Meneses and colleagues [8] analysed predictors of attrition among rural cancer survivors. In their study, physical health did not predict participant dropout; only mental health was related to study attrition. Another study analysed predictors of attrition in Australian women [10]. In contrast to the studies described above, it found that physical health as well as mental health was significantly related to study dropout. As a last example, Goldberg and colleagues analysed predictors of attrition in a French cohort [14]. Among all of the studied predictors, physical health indicators had the strongest associations with future dropout in their study.

Thus, although some previous studies have analysed the relationship between health and study dropout, their findings are mixed. Whereas some studies suggest that health is not significantly related to survey participation, others suggest that health aspects are among the strongest predictors of study dropout. Additionally, studies from Germany are missing. More research is therefore needed before firm conclusions regarding the potential contribution of health to survey dropout can be drawn [3]. As of yet, it might be speculated that worse health outcomes tend to be associated with increased dropout rates. The current study aims to examine how multiple aspects of health relate to survey dropout. Specifically, we examine which aspects of health (chronic conditions, physical functioning, depression, cognitive functioning) predict participant dropout in a large population-based sample of German middle-aged and older adults. We ask: How do the different aspects of health relate to future study dropout?

Methods

Sample

The 2008 wave of the German Aging Survey data was used, as described in a previous study [21]. This is a cohort-sequential longitudinal, population-based study of Germans above the age of 40 years. The data were provided to the main author by the Research Data Center of the German Center of Gerontology [22]. The interviews were conducted face-to-face, usually in the respondent’s place of residence, and in the German language. National probability sampling was used for the German Aging Survey. All participants of 2008 who gave written consent were re-contacted and asked whether they would participate in further waves of the survey in 2011, 2014, and 2017. To increase survey participation, information brochures and greeting cards were regularly sent to the addresses of all participants. The German Aging Survey collects a broad range of information on the life of older adults in Germany, including their health status. At every follow-up, mortality data were also collected, based on civil registry office records and on reports of relatives of the deceased. Numerous studies have used the German Aging Survey for empirical studies on the health of middle-aged and older adults in Germany [23,24,25,26,27,28]. The current study used data from all 2008 baseline participants who agreed to fill out a drop-off questionnaire, which resulted in a sample size of N = 4442.

Measures

Several indicators of health in the 2008 wave were used to distinguish different aspects of health: number of chronic conditions, physical functioning, depression, and cognitive functioning. The number of chronic conditions was assessed via self-report with a list of conditions in a drop-off questionnaire (including heart disease, circulatory disorders, joint problems, respiratory disease, stomach or intestinal disease, cancer, diabetes, gallbladder, liver or kidney disease, bladder disease, sleep disorders, visual impairment, hearing impairment). Physical functioning was measured with the Physical Functioning subscale of the German version of the Short Form 36 Health Survey [29]. Depression was assessed with the 15-item German short version of the Center for Epidemiological Studies Depression Scale (CESD Scale) [30]. Cognitive functioning was measured via the German version of the Digit Symbol Substitution Test derived from the Wechsler Intelligence Test [31]. Marital status was coded as 1 for married participants and for those who indicated that they lived in a long-term partnership, and as 0 for all other participants. Education was classified according to four levels: A low educational level (coded as 1) corresponds to participants who did not complete any vocational qualification and had at most a graduation degree. An intermediate educational level (coded as 2) corresponds to participants with vocational qualifications or participants who had the necessary qualifications for university entrance. An upper-intermediate educational level (coded as 3) corresponds to participants who had completed upgrading training, as is the case, for example, for a master craftsman in Germany. Finally, a high educational level corresponds to participants with completed university studies.
Additional covariates included age, income (as percentage of the population average divided by 100), network size (the number of important persons with whom the participant has regular contact) and sex. Similar to a previous study, information on future participation of baseline participants formed the basis of our dependent variable [21]. It was coded as 0 (did not drop out, i.e., participated in at least one further wave) or 1 (dropped out, i.e., did not participate in any further wave). Dropout could occur for several reasons, including death, inability to contact, inability to respond, and insufficient motivation.
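To make the coding rules above concrete, the following minimal Python sketch shows one way they could be expressed. The function names, the category labels, and the code 4 for the high educational level are assumptions made for illustration; the text states neither the survey's variable names nor the code of the highest level.

```python
from typing import Sequence

# Hypothetical category labels; the survey's actual labels are not given in the text.
# Codes 1-3 follow the text; 4 for "high" is an assumption.
EDUCATION_CODES = {"low": 1, "intermediate": 2, "upper-intermediate": 3, "high": 4}
PARTNERED_STATUSES = {"married", "long-term partnership"}


def code_partnership(status: str) -> int:
    """Return 1 for married or long-term partnered participants, 0 otherwise."""
    return 1 if status in PARTNERED_STATUSES else 0


def code_dropout(participated: Sequence[bool]) -> int:
    """Return 0 if at least one follow-up wave was attended, otherwise 1."""
    return 0 if any(participated) else 1


def participation_count(participated: Sequence[bool]) -> int:
    """Number of follow-up waves attended (alternative outcome)."""
    return sum(1 for attended in participated if attended)


# One flag per follow-up wave (2011, 2014, 2017); the values are made up.
waves = [False, True, False]
print(code_dropout(waves), participation_count(waves))
```

Under this coding, a participant who skips one wave but returns later still counts as retained, which is why the participation count is a useful complementary outcome.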

Data analysis

Spearman correlation and logistic regression analyses were used to examine the degree to which different aspects of health predict future participant dropout, similarly to a previous study [21]. Missing values in the 2008 wave were imputed rather than excluded, because participants with missing values might also be more likely to drop out in further survey waves; imputation thus decreases the potential for bias [32]. The missForest algorithm was used to impute missing values. It uses nonparametric random forests, which seem especially suited to the imputation of mixed-type data as in the current study, and it has been reported to outperform other imputation techniques [33]. Missing values were minimal for most variables except cognitive functioning (dropout: 0%; age: 0%; sex: 0%; education: 0%; physical functioning: 0%; network size: 0%; chronic conditions: 2%; depression: 2%; income: 8%; cognitive functioning: 21%). To assess the robustness of our results, several additional analyses were performed. Firstly, because some previous studies suggested different dropout reasons for men and women, we provide sex-stratified analyses [34]. Secondly, instead of analysing a dichotomous dropout variable, we analysed how an alternative outcome, the number of further waves in which the participants took part, is related to our health indicators. Thirdly, we analysed a non-imputed, list-wise deleted dataset. Lastly, since at least one study has found differences between predictors of dropout and predictors of mortality, we also conducted our analyses for a sample that excluded participants known to have died during follow-up [5]. This last sensitivity analysis thus investigates predictors of survey dropout for reasons other than mortality.
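As an illustration of the Spearman correlations mentioned above, the following Python sketch computes the coefficient as a Pearson correlation of average ranks, which handles the many ties that arise with a binary dropout variable. It is a minimal sketch rather than the software used in the study (which is not stated in the text); the function names and the toy data are invented for the example.

```python
from math import sqrt
from typing import List, Sequence


def average_ranks(values: Sequence[float]) -> List[float]:
    """Rank values from 1 to n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the block while the next sorted value is tied with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1  # ranks are 1-based
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation of two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def spearman(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman correlation: Pearson correlation of the average ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Toy data (not from the survey): physical functioning score and dropout flag.
physical_functioning = [90, 85, 70, 40, 30, 95]
dropout = [0, 0, 0, 1, 1, 0]
print(spearman(physical_functioning, dropout))  # negative for these toy data
```

Ranking before correlating makes the coefficient insensitive to the scale and skew of the health indicators, which is one common reason to prefer it for mixed-type data.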

Results

Descriptive statistics and Spearman inter-correlations are reported in Table 1. As can be seen, participants (49% female) were on average 61.80 years old (SD = 11.88). Forty percent of participants dropped out and did not participate in successive survey waves. Indicators of health correlated substantially and significantly with each other (0.19 ≤ |r| ≤ 0.51). Descriptive differences between those who dropped out and those who continued to participate are shown in Table 4 in the Appendix.

Table 1 Descriptive Statistics and Inter-correlations of Dropout, Health Aspects and Demographic Variables (N = 4442)

Logistic regression results are presented in Table 2. Better physical functioning predicted decreased odds of dropout. A higher number of chronic conditions also significantly predicted decreased odds of dropping out, whereas depression and cognitive functioning were not significantly related to dropout. Additionally, being older, female sex, lower levels of education, a lower income, a smaller network size, and not being in a partnership predicted increased odds of dropping out. Of the health-related variables, chronic conditions and physical functioning had the largest standardized coefficients. These results were replicated when the isolated contribution of each health aspect was tested in separate regression analyses, as shown in Table 3.
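The notion of a standardized logistic regression coefficient used above can be illustrated with a short Python sketch. It z-standardizes a predictor and fits the model by plain gradient ascent on the log-likelihood. This fitting method is a simple stand-in chosen to keep the example self-contained; statistical packages typically use other algorithms, and the software used in the study is not stated. All names and data below are invented for the example.

```python
from math import exp, sqrt
from typing import List, Sequence


def standardize(column: Sequence[float]) -> List[float]:
    """z-standardize a column using the sample standard deviation."""
    n = len(column)
    mean = sum(column) / n
    sd = sqrt(sum((v - mean) ** 2 for v in column) / (n - 1))
    return [(v - mean) / sd for v in column]


def sigmoid(z: float) -> float:
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + exp(-z))
    e = exp(z)
    return e / (1.0 + e)


def fit_logistic(rows: Sequence[Sequence[float]], y: Sequence[int],
                 lr: float = 0.5, steps: int = 5000) -> List[float]:
    """Fit intercept and slopes by gradient ascent on the mean log-likelihood."""
    n, p = len(rows), len(rows[0])
    beta = [0.0] * (p + 1)  # beta[0] is the intercept
    for _ in range(steps):
        grad = [0.0] * (p + 1)
        for row, target in zip(rows, y):
            z = beta[0] + sum(b * v for b, v in zip(beta[1:], row))
            err = target - sigmoid(z)
            grad[0] += err
            for j, v in enumerate(row):
                grad[j + 1] += err * v
        beta = [b + lr * g / n for b, g in zip(beta, grad)]
    return beta


# Toy data (not from the survey): physical functioning score and dropout flag.
physical_functioning = [95, 90, 85, 80, 75, 70, 60, 50, 40, 30]
dropout = [0, 0, 1, 0, 0, 1, 0, 1, 1, 1]

rows = [[z] for z in standardize(physical_functioning)]
beta = fit_logistic(rows, dropout)
# exp(slope) is the odds ratio per one standard deviation of the predictor.
print("slope:", beta[1], "odds ratio per SD:", exp(beta[1]))
```

Because every predictor is put on the same standard-deviation scale, the slopes can be compared in magnitude across predictors, which is the sense in which some health aspects have "the largest standardized coefficients".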

Table 2 Logistic Regression Results Predicting Study Dropout via Baseline Health and Demographic Variables (N = 4442)
Table 3 Logistic Regression Results Predicting Study Dropout via Baseline Health and Demographic Variables with Separate Analyses per Health Aspect (N = 4442)

We also conducted several additional robustness checks. First, we analysed women and men separately (Table 5 in the Appendix). Here we find associations similar to those in our main analyses, except for cognitive functioning: higher cognitive functioning strongly predicted decreased dropout in women but predicted a slight increase in dropout in men. Second, we analysed the participation count instead of a binary dropout variable (Table 6 in the Appendix). The participation count refers to the number of future waves in which the participants took part. Here we find that, broadly in line with the main analysis, chronic conditions, physical functioning, and cognitive functioning predicted the participation count. Third, we used a non-imputed, list-wise deleted dataset (Table 7 in the Appendix). Again, results similar to those of the main analysis were found. Fourth, we excluded participants who died during follow-up, thus testing whether cases of mortality significantly influenced our conclusions (Table 8 in the Appendix). Here we find that, again, more chronic conditions and better physical functioning significantly predicted decreased chances of dropout, which suggests that our results are robust.

Discussion

Some studies have analysed health as a predictor of study dropout but provided mixed results. To contribute to this literature, we considered four different aspects of health and investigated how they simultaneously predicted future study dropout in a large population-based sample. We found that the different aspects of health predicted dropout in a contrasting manner: Better physical and, in women, cognitive functioning predicted decreased odds of dropout, whereas fewer chronic conditions, and thus a better health status in terms of chronic conditions, predicted increased chances of study dropout. Thus, participants who have chronic conditions but are not negatively impacted by them are most likely to participate in future survey waves. Depression did not show any association with dropout once other health aspects were added to the model.

Our results support those studies that reported an association between health and study dropout [e.g., 10]. Similar to some other studies, we found that health was strongly associated with survey participation [14]. At the same time, the findings of the current study contribute to the literature by showing that different aspects of health, namely having chronic conditions and the impairment associated with them, might have differential associations with future survey participation. To our knowledge, this is one of the first studies to show that worse health might be associated with increased as well as decreased chances of future dropout, depending on the health aspect. This might also explain the mixed findings regarding health-related dropout reported in the literature, where, surprisingly, physical health was sometimes found not to be related to survey participation. Global measures of health might sometimes not predict dropout because the dropout-increasing associations of some health outcomes and the dropout-decreasing associations of other health outcomes are conflated in such a summary measure. In contrast to some previous studies, depression was not found to predict future study dropout after controlling for other aspects of health [e.g., 8]. In light of the current results, overall associations of health with study dropout might not have been consistently observed in the literature because they only emerge strongly once the diverging associations of different sub-aspects of health are disentangled, as has been shown in this and some previous studies.

How can these differential associations be explained? The literature has identified multiple main factors that predict survey participation and dropout [35]. Among them, ability (being able to participate) and motivation (being willing to participate) have to be distinguished [36, 37]. Limitations in physical and cognitive functioning likely impair one’s ability to participate in surveys and are thus associated with systematic dropout. Having chronic conditions, on the other hand, was associated with decreased odds of dropout. Having chronic conditions might increase one’s need to talk about these issues and might thus increase one’s willingness to participate in studies that address these topics. The lack of an association of depression with dropout in this study might be explained by the fact that depression is a prevalent co-morbidity of physical illness [38]. In line with this hypothesis, depression was associated with dropout in the univariate analysis, but the association disappeared once other aspects of health and demographic background information were controlled for. However, future studies should disentangle the observed associations between aspects of health and study dropout even further, for example by distinguishing different sub-constructs of depression and emotional states.

The interpretation of longitudinal studies should be considered in light of these results. In contrast to some previous work, strong associations of health with dropout were found. For example, an increase of one standard deviation in the number of chronic conditions was associated with 18% decreased odds of dropping out, and an increase of one standard deviation in physical functioning was associated with 15% reduced odds of dropout. Therefore, participants seem to participate selectively in further survey waves depending on their health. Biases in longitudinal studies might thus arise to the degree that health is related to the phenomena of interest. For example, in longitudinal studies participants could seem functionally healthier than they are, all the while seeming to suffer from more chronic conditions and multimorbidity, although these biases might not extend to mental health [e.g., 39,40,41,42]. Thus, authors of substantive research should be attentive to the potential health bias inherent in longitudinal research.
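The percentages above can be related to odds ratios and logistic regression coefficients with a few lines of arithmetic. The following Python sketch assumes that "18% decreased odds" corresponds to an odds ratio of 0.82 per standard deviation. The function names are invented for the example, and applying the odds ratio to the overall 40% dropout rate is only an illustration, since the reported associations are conditional on the other covariates.

```python
from math import log


def percent_change_to_odds_ratio(percent_change: float) -> float:
    """Convert a percent change in odds (e.g. -18) to an odds ratio (0.82)."""
    return 1.0 + percent_change / 100.0


def odds_ratio_to_coefficient(odds_ratio: float) -> float:
    """Logistic regression coefficient implied by an odds ratio."""
    return log(odds_ratio)


def apply_odds_ratio(probability: float, odds_ratio: float) -> float:
    """Probability obtained after multiplying the baseline odds by odds_ratio."""
    odds = probability / (1.0 - probability)
    new_odds = odds * odds_ratio
    return new_odds / (1.0 + new_odds)


or_chronic = percent_change_to_odds_ratio(-18)    # 0.82 per SD
or_functioning = percent_change_to_odds_ratio(-15)  # 0.85 per SD

print(odds_ratio_to_coefficient(or_chronic))      # about -0.198
print(odds_ratio_to_coefficient(or_functioning))  # about -0.163
# Illustration only: a 40% baseline dropout probability becomes roughly 35%.
print(apply_odds_ratio(0.40, or_chronic))
```

The last line shows why odds and probabilities should not be conflated: an 18% reduction in odds corresponds here to a drop of about five percentage points in probability, not eighteen.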

Research employing longitudinal methods needs to account for this potential bias. One often-suggested strategy is to impute missing data [43]. Importantly, for this strategy to be feasible, multiple indicators of health that can account for the differential associations of health aspects with dropout need to be included. At a minimum, indicators of chronic disease status and functional health need to be considered. Including only one overall health variable cannot account for the observed differential health-dependent dropout. Another strategy for performing longitudinal analyses while avoiding this bias is to use other data sources that do not suffer from selective health-related dropout, such as claims or health insurance data, which, however, might be susceptible to other biases [44,45,46].

The current study could be improved in multiple ways. First, different types of dropout were not differentiated, which might be needed to validate the supposed mechanisms. Second, due to the correlational nature of the research, causation cannot be ascertained. Although the current study statistically controlled for a range of potential covariates and considered several different health aspects, there is still a risk of residual confounding. Future studies are therefore needed that include an even more diverse set of variables as potential predictors of study dropout. Third, although the study used a large population-based sample of middle-aged and older adults, the sample did not include young adults. The associations of different aspects of health with attrition might differ between younger adults and middle-aged and older adults and should thus be analysed by future studies. Fourth, we only included baseline health variables. Future studies might include time-varying health variables in a more complex research design to study how the dynamics of health development predict future dropout. Furthermore, future studies should examine whether similar results are obtained with different study designs, such as RCTs, or with different survey topics [e.g., 35, 47, 48]. Lastly, it remains unclear why cognitive functioning seemed to predict dropout differently in men and women, which should be investigated by future studies.

Conclusion

The current study provides further evidence on how different indicators of health predict survey dropout. Participants with chronic conditions but minimal physical and cognitive disability are most likely to participate in follow-up studies. Health thus has a complex relationship with survey dropout and must be accounted for in longitudinal studies to provide accurate research results. Neglecting this systematic attrition due to health problems bears the risk of severely under- or overestimating health-related effects and trends.