Abstract
In this chapter we discuss several models by which missing data can arise in clinical trials. The likelihood function is used as a basis for discussing different missing data mechanisms for incomplete responses in short-term and longitudinal studies, as well as for missing covariates. We critically discuss common ad hoc strategies for dealing with incomplete data, such as complete-case analyses and naive methods of imputation, and we review more broadly appropriate approaches for dealing with incomplete data in terms of asymptotic and empirical frequency properties. These methods include the EM algorithm, multiple imputation, and inverse probability weighted estimating equations. Simulation studies are reported which demonstrate how to implement these procedures and examine performance empirically.
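As a rough illustration of why a complete-case analysis can mislead and how inverse probability weighting can help, the following sketch simulates a binary response whose chance of being observed depends only on a fully observed binary covariate (a missing-at-random setting). It is not the simulation design used in the chapter. The sample size, the probabilities, and the function names `simulate`, `complete_case_mean`, and `ipw_mean` are all assumptions chosen for illustration.

```python
import random


def simulate(n=20000, seed=1):
    """Generate (x, y, r) triples; y is treated as missing whenever r == 0."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        x = 1 if rng.random() < 0.5 else 0              # fully observed covariate
        y = 1 if rng.random() < 0.2 + 0.5 * x else 0    # response; true mean is 0.45
        p_obs = 0.4 if x == 1 else 0.9                  # assumed P(R = 1 | X): MAR given X
        r = 1 if rng.random() < p_obs else 0            # 1 = response observed
        data.append((x, y, r))
    return data


def complete_case_mean(data):
    """Average the responses that happen to be observed, ignoring missingness."""
    observed = [y for _, y, r in data if r == 1]
    return sum(observed) / len(observed)


def ipw_mean(data):
    """Weight each observed response by 1 / estimated P(R = 1 | X)."""
    # Estimate the observation probability by the observed fraction in each stratum of X.
    pi_hat = {}
    for level in (0, 1):
        indicators = [r for x, _, r in data if x == level]
        pi_hat[level] = sum(indicators) / len(indicators)
    # Normalised (ratio-type) weighted mean; only observed responses are used.
    numerator = sum(y / pi_hat[x] for x, y, r in data if r == 1)
    denominator = sum(1 / pi_hat[x] for x, _, r in data if r == 1)
    return numerator / denominator


data = simulate()
cc = complete_case_mean(data)
ipw = ipw_mean(data)
print(f"true mean 0.450 | complete-case {cc:.3f} | IPW {ipw:.3f}")
```

Under these assumed probabilities the complete-case mean converges to about 0.354 rather than 0.45, because subjects with x = 1 (who tend to have y = 1) are observed less often. The weighted estimate targets the true mean provided the model for the observation probabilities is correctly specified.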
Acknowledgements
This work was supported by a Post-Graduate Scholarship to Michael McIsaac from the Natural Sciences and Engineering Research Council (NSERC) of Canada and grants to Richard Cook from NSERC (Grant No. 101093) and the Canadian Institutes of Health Research (Grant No. 105099). Richard Cook is a Tier I Canada Research Chair in Statistical Methods for Health Research.
Copyright information
© 2014 Springer-Verlag Berlin Heidelberg
About this chapter
Cite this chapter
McIsaac, M.A., Cook, R.J. (2014). Statistical Models and Methods for Incomplete Data in Randomized Clinical Trials. In: van Montfort, K., Oud, J., Ghidey, W. (eds) Developments in Statistical Evaluation of Clinical Trials. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-55345-5_1
DOI: https://doi.org/10.1007/978-3-642-55345-5_1
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-55344-8
Online ISBN: 978-3-642-55345-5
eBook Packages: Mathematics and Statistics (R0)