Statistical Methodologies for Dealing with Incomplete Longitudinal Outcomes Due to Dropout Missing at Random

Satty, A.; Mwambi, H.; Molenberghs, G.

doi:10.1007/978-981-10-3307-0_10

Part of the book series: ICSA Book Series in Statistics (ICSABSS)

Abstract

Longitudinal studies are based on repeatedly measuring the outcome of interest and covariates over a sequence of time points. These studies play a vital role in many scientific disciplines, such as medicine, epidemiology, ecology and public health. However, data arising from such studies are often incomplete because of dropout or intermittent missingness, which can cause serious bias in the analysis of longitudinal data. In this chapter we confine our considerations to the dropout missingness pattern. Given the problems that dropout creates in longitudinal studies, researchers face the question: what methods can be used to handle these potential pitfalls? The goal is to use approaches that are less prone to producing biased results. This chapter considers some of the key modelling techniques and basic issues in statistical data analysis for addressing dropout in longitudinal studies. The main objective is to provide an overview of the issues and methodologies that arise when subjects drop out of longitudinal studies, for both continuous and discrete outcomes. The chapter focusses on methods that are valid under the missing at random (MAR) mechanism, and the missingness patterns of interest are monotone; in the context of longitudinal data these are referred to as dropout. The fundamental concepts of the patterns and mechanisms of dropout are discussed. The techniques investigated for handling dropout are: (1) multiple imputation (MI); (2) likelihood-based methods, in particular generalized linear mixed models (GLMMs); (3) multiple-imputation-based generalized estimating equations (MI-GEE); and (4) weighted generalized estimating equations (WGEE). For each method, the important assumptions underlying its application are presented. Existing literature that examines the effectiveness of these methods in the analysis of incomplete longitudinal data is discussed in detail. Two application examples are presented to study the potential strengths and weaknesses of the methods under an MAR dropout mechanism.
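
To make the two central notions of the abstract concrete, the following is a minimal sketch, not taken from the chapter, of how monotone dropout under an MAR mechanism might be simulated. The function names (`simulate_mar_dropout`, `is_monotone`), the random-intercept outcome model and the logistic dropout coefficients are all illustrative assumptions. Dropout at a visit depends only on the outcome at the previous visit, which has been observed, so the mechanism is MAR; once a subject drops out, every later value is missing, so the pattern is monotone.

```python
import math
import random


def simulate_mar_dropout(n_subjects=200, n_visits=4, seed=0):
    """Simulate longitudinal outcomes with monotone MAR dropout.

    Hypothetical set-up: a random-intercept model with a linear time trend,
    and a logistic dropout model driven by the previous observed outcome.
    """
    rng = random.Random(seed)
    data = []
    for _ in range(n_subjects):
        intercept = rng.gauss(0.0, 1.0)  # subject-level random intercept
        y = [intercept + 0.5 * t + rng.gauss(0.0, 1.0) for t in range(n_visits)]
        observed = [y[0]]  # the baseline visit is always observed
        for t in range(1, n_visits):
            # Dropout probability depends only on y[t - 1], which is observed,
            # so missingness is MAR given the observed history.
            p_drop = 1.0 / (1.0 + math.exp(-(-2.0 + 0.8 * y[t - 1])))
            if rng.random() < p_drop:
                break  # subject leaves the study and never returns
            observed.append(y[t])
        # Pad with None after dropout, which yields a monotone pattern.
        data.append(observed + [None] * (n_visits - len(observed)))
    return data


def is_monotone(row):
    """Return True if no observed value follows a missing one."""
    seen_missing = False
    for value in row:
        if value is None:
            seen_missing = True
        elif seen_missing:
            return False
    return True
```

Calling `simulate_mar_dropout()` returns one list per subject, with `None` marking visits after dropout. A row such as `[1.2, None, 0.7]` would be intermittent missingness rather than dropout, and `is_monotone` flags it as such.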

Author information

Authors and Affiliations

Faculty of Mathematical Sciences and Statistics, Alneelain University, Khartoum, Sudan
A. Satty
School of Mathematics, Statistics and Computer Science, University of KwaZulu-Natal, Private Bag X01 Scottsville 3209, Pietermaritzburg, South Africa
H. Mwambi
I-BioStat, Universiteit Hasselt & KU Leuven, Martelarenlaan 42, 3500, Hasselt, Belgium
G. Molenberghs

Corresponding author

Correspondence to H. Mwambi.

Editor information

Editors and Affiliations

Gillings School of Global Public Health, University of North Carolina, Chapel Hill, North Carolina, USA
Ding-Geng (Din) Chen
Risk Management, Credit Suisse Risk Management, New York, New York, USA
John Dean Chen

About this chapter

Cite this chapter

Satty, A., Mwambi, H., Molenberghs, G. (2017). Statistical Methodologies for Dealing with Incomplete Longitudinal Outcomes Due to Dropout Missing at Random. In: Chen, DG., Chen, J. (eds) Monte-Carlo Simulation-Based Statistical Modeling. ICSA Book Series in Statistics. Springer, Singapore. https://doi.org/10.1007/978-981-10-3307-0_10

DOI: https://doi.org/10.1007/978-981-10-3307-0_10
Published: 03 February 2017
Publisher Name: Springer, Singapore
Print ISBN: 978-981-10-3306-3
Online ISBN: 978-981-10-3307-0
eBook Packages: Mathematics and Statistics, Mathematics and Statistics (R0)