Abstract
Longitudinal studies repeatedly measure an outcome of interest and covariates over a sequence of time points. Such studies play a vital role in many disciplines, including medicine, epidemiology, ecology and public health. However, the resulting data are often incomplete because of dropout or intermittent missingness, either of which can seriously bias the analysis. In this chapter we restrict attention to the dropout pattern. This raises a question for researchers: which methods can handle dropout while limiting bias in the results? This chapter considers key modelling techniques and basic issues in statistical data analysis for addressing dropout in longitudinal studies. The main objective is to provide an overview of the issues and methodologies that arise when subjects drop out of longitudinal studies, for both continuous and discrete outcomes. The chapter focusses on methods that are valid under the missing at random (MAR) mechanism, and the missingness patterns of interest are monotone; in the context of longitudinal data these are referred to as dropout. The fundamental concepts of dropout patterns and mechanisms are discussed. The techniques investigated for handling dropout are: (1) multiple imputation (MI); (2) likelihood-based methods, in particular generalized linear mixed models (GLMMs); (3) multiple-imputation-based generalized estimating equations (MI-GEE); and (4) weighted generalized estimating equations (WGEE). For each method, the key assumptions underlying its application are presented.
The existing literature examining the effectiveness of these methods in the analysis of incomplete longitudinal data is discussed in detail. Two application examples are presented to study the potential strengths and weaknesses of the methods under an MAR dropout mechanism.
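As a rough illustration of the pooling step in multiple imputation, the sketch below combines estimates from several imputed data sets using Rubin's rules: the pooled estimate is the average of the completed-data estimates, and the total variance adds the between-imputation variance, inflated by (1 + 1/M), to the average within-imputation variance. The function name `pool_rubin` and all numbers are hypothetical and are not taken from the chapter; this is a minimal sketch of the standard rules, not the authors' implementation.

```python
from statistics import mean, variance


def pool_rubin(estimates, variances):
    """Pool M completed-data estimates and their variances with Rubin's rules.

    `estimates` holds one point estimate per imputed data set and
    `variances` holds the matching squared standard errors.
    """
    m = len(estimates)
    if m < 2 or m != len(variances):
        raise ValueError("need at least two imputations and matching lengths")
    q_bar = mean(estimates)        # pooled point estimate
    w = mean(variances)            # average within-imputation variance
    b = variance(estimates)        # between-imputation variance (divisor m - 1)
    total = w + (1 + 1 / m) * b    # total variance of the pooled estimate
    return q_bar, total


# Hypothetical treatment-effect estimates from M = 5 imputed data sets.
estimates = [1.0, 1.2, 0.8, 1.1, 0.9]
variances = [0.04, 0.05, 0.04, 0.05, 0.04]
q_bar, total_var = pool_rubin(estimates, variances)
print(f"pooled estimate = {q_bar:.3f}, total variance = {total_var:.3f}")
```

The same pooling applies whatever analysis model produced the per-imputation estimates, which is what allows MI to be paired with GEE in the MI-GEE approach.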
Copyright information
© 2017 Springer Nature Singapore Pte Ltd.
About this chapter
Cite this chapter
Satty, A., Mwambi, H., Molenberghs, G. (2017). Statistical Methodologies for Dealing with Incomplete Longitudinal Outcomes Due to Dropout Missing at Random. In: Chen, DG., Chen, J. (eds) Monte-Carlo Simulation-Based Statistical Modeling. ICSA Book Series in Statistics. Springer, Singapore. https://doi.org/10.1007/978-981-10-3307-0_10
DOI: https://doi.org/10.1007/978-981-10-3307-0_10
Publisher Name: Springer, Singapore
Print ISBN: 978-981-10-3306-3
Online ISBN: 978-981-10-3307-0
eBook Packages: Mathematics and Statistics (R0)