Skip to main content

Statistical Methodologies for Dealing with Incomplete Longitudinal Outcomes Due to Dropout Missing at Random

  • Chapter
  • First Online:
Monte-Carlo Simulation-Based Statistical Modeling

Part of the book series: ICSA Book Series in Statistics ((ICSABSS))

  • 3829 Accesses

Abstract

Longitudinal studies are based on repeatedly measuring the outcome of interest and covariates over a sequences of time points. These studies play a vital role in many disciplines of science, such as medicine, epidemiology, ecology and public health. However, data arising from such studies often show inevitable incompleteness due to dropouts or even intermittent missingness that can potentially cause serious bias problems in the analysis of longitudinal data. In this chapter we confine our considerations to the dropout missingness pattern. Given the problems that can arise when there are dropouts in longitudinal studies, the following question is forced upon researchers: What methods can be utilized to handle these potential pitfalls? The goal is to use approaches that better avoid the generation of biased results. This chapter considers some of the key modelling techniques and basic issues in statistical data analysis to address dropout problems in longitudinal studies. The main objective is to provide an overview of issues and different methodologies in the case of subjects dropping out in longitudinal data for both the case of continuous and discrete outcomes. The chapter focusses on methods that are valid under the missing at random (MAR) mechanism and the missingness patterns of interest will be monotone; these are referred to as dropout in the context of longitudinal data. The fundamental concepts of the patterns and mechanisms of dropout are discussed. The techniques that are investigated for handling dropout are: (1) Multiple imputation (MI); (2) Likelihood-based methods, in particular Generalized linear mixed models (GLMMs) ; (3) Multiple imputation based generalized estimating equations (MI-GEE) ; and (4) Weighted estimating equations (WGEE) . For each method, useful and important assumptions regarding its applications are presented. The existing literature in which we examine the effectiveness of these methods in the analysis of incomplete longitudinal data is discussed in detail. Two application examples are presented to study the potential strengths and weaknesses of the methods under an MAR dropout mechanism.

This is a preview of subscription content, log in via an institution to check access.

Access this chapter

eBook
USD 16.99
Price excludes VAT (USA)
  • Available as EPUB and PDF
  • Read on any device
  • Instant download
  • Own it forever
Softcover Book
USD 139.99
Price excludes VAT (USA)
  • Compact, lightweight edition
  • Dispatched in 3 to 5 business days
  • Free shipping worldwide - see info
Hardcover Book
USD 139.99
Price excludes VAT (USA)
  • Durable hardcover edition
  • Dispatched in 3 to 5 business days
  • Free shipping worldwide - see info

Tax calculation will be finalised at checkout

Purchases are for personal use only

Institutional subscriptions

References

  • Alosh, M. (2010). Modeling longitudinal count data with dropouts. Pharmaceutical Statistics, 9, 35–45.

    Article  Google Scholar 

  • Anderson, J. A., & Aitkin, M. (1985). Variance component models with binary response: Interviewer variability. Journal of the Royal Statistical Society, Series B, 47, 203–210.

    MathSciNet  Google Scholar 

  • Beunckens, C., Sotto, C., & Molenberghs, G. (2008). A simulation study comparing weighted estimating equations with multiple imputation based estimating equations for longitudinal binary data. Computational Statistics and Data Analysis, 52, 1533–1548.

    Article  MathSciNet  MATH  Google Scholar 

  • Birhanu, T., Molenberghs, G., Sotto, C., & Kenward, M. G. (2011). Doubly robust and multiple-imputation-based generalized estimating equations. Journal of Biopharmaceutical Statistics, 21, 202–225.

    Article  MathSciNet  Google Scholar 

  • Breslow, N. E., & Lin, X. (1995). Bias correction in generalised linear models with a single component of dispersion. Biometrika, 82, 81–91.

    Article  MathSciNet  MATH  Google Scholar 

  • Burton, A., Altman, D. G., Royston, P., & Holder, R. (2006). The design of simulation studies in medical statistics. Statistics in Medicine, 25, 4279–4292.

    Article  MathSciNet  Google Scholar 

  • Carpenter, J., & Kenward, M. (2013). Multiple imputation and its application. UK: Wiley.

    Book  MATH  Google Scholar 

  • Collins, L. M., Schafer, J. L., & Kam, C. M. (2001). A comparison of inclusive and restrictive strategies in modern missing data procedures. Psychological Methods, 6, 330–351.

    Article  Google Scholar 

  • De Backer, M., De Keyser, P., De Vroey, C., & Lesaffre, E. (1996). A 12-week treatment for dermatophyte toe onychomycosis: terbinafine 250mg/day vs. itraconazole 200mg/day? a double-blind comparative trial. British Journal of Dermatology, 134, 16–17.

    Article  Google Scholar 

  • Dempster, A. P., Laird, N. M., & Rubin, D. B. (1977). Maximum likelihood from incomplete data via the EM algorithm (with discussion). Journal of Royal Statistical Society: Series B, 39, 1–38.

    MathSciNet  MATH  Google Scholar 

  • Jansen, I., Beunckens, C., Molenberghs, G., Verbeke, G., & Mallinckrodt, C. (2006). Analyzing incomplete discrete longitudinal clinical trial data. Statistical Science, 21, 52–69.

    Article  MathSciNet  MATH  Google Scholar 

  • Laird, N. M., & Ware, J. H. (1982). Random-effects models for longitudinal data. Biometrics, 38, 963–974.

    Article  MATH  Google Scholar 

  • Liang, K. Y., & Zeger, S. L. (1986). Longitudinal data analysis using generalized linear models. Biometrika, 73, 13–22.

    Article  MathSciNet  MATH  Google Scholar 

  • Little, R. J. A., & Rubin, D. B. (1987). Statistical analysis with missing data. New York: Wiley.

    MATH  Google Scholar 

  • Little, R. J. A. (1995). Modeling the drop-out mechanism in repeated-measures studies. Journal of the American Statistical Association, 90, 1112–1121.

    Article  MathSciNet  MATH  Google Scholar 

  • Little, R. J., & DAgostino, R., Cohen, M. L., Dickersin, K., Emerson, S. S., Farrar, J., Frangakis, C., Hogan, J. W., Molenberghs, G., Murphy, S. A., Neaton, J. D., Rotnitzky, A., Scharfstein, D., Shih, W, J., Siegel, J. P., & Stern, H., (2012). The prevention and treatment of missing data in clinical trials. The New England Journal of Medicine, 367, 1355–1360.

    Google Scholar 

  • Mallinckrodt, C. H., Clark, W. S., & Stacy, R. D. (2001a). Type I error rates from mixedeffects model repeated measures versus fixed effects analysis of variance with missing values imputed via last observation carried forward. Drug Information Journal, 35, 1215–1225.

    Article  Google Scholar 

  • Mallinckrodt, C. H., Clark, W. S., & Stacy, R. D. (2001b). Accounting for dropout bias using mixed-effect models. Journal of Biopharmaceutical Statistics, 11, 9–21.

    Article  Google Scholar 

  • Mallinckrodt, C. H., Clark, W. S., Carroll, R. J., & Molenberghs, G. (2003a). Assessing response profiles from incomplete longitudinal clinical trial data under regulatory considerations. Journal of Biopharmaceutical Statistics, 13, 179–190.

    Article  MATH  Google Scholar 

  • Mallinckrodt, C. H., Sanger, T. M., Dube, S., Debrota, D. J., Molenberghs, G., Carroll, R. J., et al. (2003b). Assessing and interpreting treatment effects in longitudinal clinical trials with missing data. Biological Psychiatry, 53, 754–760.

    Article  Google Scholar 

  • Milliken, G. A., & Johnson, D. E. (2009). Analysis of messy data. Design experiments (2nd ed., Vol. 1). Chapman and Hall/CRC.

    Google Scholar 

  • Molenberghs, G., Kenward, M. G., & Lesaffre, E. (1997). The analysis of longitudinal ordinal data with non-random dropout. Biometrika, 84, 33–44.

    Article  MATH  Google Scholar 

  • Molenberghs, G., & Verbeke, G. (2005). Models for discrete longitudinal data. New York: Springer.

    MATH  Google Scholar 

  • Molenberghs, G., & Kenward, M. G. (2007). Missing data in clinical studies. England: Wiley.

    Book  Google Scholar 

  • Molenberghs, G., Beunckens, C., Sotto, C., & Kenward, M. (2008). Every missing not at random model has got a missing at random counterpart with equal fit. Journal of Royal Statistical Soceity: Series B, 70, 371–388.

    Article  MATH  Google Scholar 

  • Pinheiro, J. C., & Bates, D. M. (2000). Mixed effects models in S and S-Plus. New York: Springer.

    Book  MATH  Google Scholar 

  • Rubin, D. B. (1976). Inference and missing data. Biometrika, 63, 581–592.

    Article  MathSciNet  MATH  Google Scholar 

  • Rubin, D. B. (1978). Multiple imputations in sample surveys. In Proceedings of the Survey Research Methods Section (pp. 20–34). American Statistical Association.

    Google Scholar 

  • Rubin, D. B., & Schenker, N. (1986). Multiple imputation for interval estimation from simple random samples with ignorable nonresponse. Journal of the American Statistical Association, 81, 366–374.

    Article  MathSciNet  MATH  Google Scholar 

  • Rubin, D. B. (1987). Multiple imputation for nonresponse in surveys. New York: Wiley.

    Book  MATH  Google Scholar 

  • Rubin, D. B. (1996). Multiple imputation after 18+ years (with discussion). Journal of the American Statistical Association, 91, 473–520.

    Article  MATH  Google Scholar 

  • Schafer, J. L. (1997). Analysis of incomplete multivariate data. New York: Champan and Hall.

    Book  MATH  Google Scholar 

  • Schafer, J. L., & Olsen, M. K. (1998). Multiple imputation for multivariate missing-data problems: A data analysts perspective. Multivariate Behavioral Research, 33, 545–571.

    Article  Google Scholar 

  • Schafer, J. L. (1999). Multiple imputation: A primer. Statistical Methods in Medical Research, 8, 3–15.

    Article  Google Scholar 

  • Schafer, J. L., & Graham, J. W. (2002). Missing data: Our view of the state of the art. Psychological Methods, 7, 147–177.

    Article  Google Scholar 

  • Schafer, J. L. (2003). Multiple imputation in multivariate problems when the imputation and analysis models differ. Statistica Neerlandica, 57, 19–35.

    Article  MathSciNet  Google Scholar 

  • Stiratelli, R., Laird, N., & Ware, J. (1984). Random effects models for serial observations with dichotomous response. Biometrics, 40, 961–972.

    Article  Google Scholar 

  • Tanner, M. A., & Wong, W. H. (1987). The calculation of posterior distributions by data augmentation (with discussion). Journal of the American Statistical Association, 82, 528–550.

    Article  MathSciNet  MATH  Google Scholar 

  • Verbeke, G., & Molenberghs, G. (2000). Linear mixed models for longitudinal data. New York: Springer.

    MATH  Google Scholar 

  • Yoo, B. (2009). The impact of dichotomization in longitudinal data analysis: A simulation study. Pharmaceutical Statistics, 9, 298–312.

    Article  Google Scholar 

Download references

Author information

Authors and Affiliations

Authors

Corresponding author

Correspondence to H. Mwambi .

Editor information

Editors and Affiliations

Rights and permissions

Reprints and permissions

Copyright information

© 2017 Springer Nature Singapore Pte Ltd.

About this chapter

Cite this chapter

Satty, A., Mwambi, H., Molenberghs, G. (2017). Statistical Methodologies for Dealing with Incomplete Longitudinal Outcomes Due to Dropout Missing at Random. In: Chen, DG., Chen, J. (eds) Monte-Carlo Simulation-Based Statistical Modeling . ICSA Book Series in Statistics. Springer, Singapore. https://doi.org/10.1007/978-981-10-3307-0_10

Download citation

Publish with us

Policies and ethics