Abstract
This chapter provides an overview of missing data. We introduce the main types of missing data that occur in practice and discuss the practical consequences of each type for data analysis. We then describe practical solutions to the problem of missing data, covering common but flawed approaches as well as more powerful approaches such as multiple imputation, which is suitable for many, although not all, situations. Finally, we consider missing data as part of statistical inference more generally, and how it can be handled in both maximum likelihood and Bayesian approaches to inference.
Notes
- 1.
- 2. Continuing with the notation introduced in Sect. 4.2.1, here we denote the fully observed variables in our data by x and the partially observed variables by \(y = (y^{\text{obs}}, y^{\text{mis}})\), and we index the missing variables in y by I. We also assume that any or all of x, y and I may be multivariate arrays.
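As a rough illustration of the notation above, the following sketch builds a toy dataset with a fully observed variable x, a partially observed variable y, and an index set I of the missing entries of y. The data values and the use of `None` to mark a missing value are assumptions made for this example only; they do not come from the chapter.

```python
# Hypothetical toy data; None marks a missing value (an assumption of this sketch).
x = [1.2, 0.7, 3.1, 2.4]        # fully observed variable
y = [5.0, None, 4.2, None]      # partially observed variable

# I: positions of the missing entries of y.
I = [i for i, value in enumerate(y) if value is None]

# y_obs: the observed part of y, i.e. the entries whose positions are not in I.
y_obs = [value for i, value in enumerate(y) if i not in I]

print("I =", I)
print("y_obs =", y_obs)
```

Here the missing part of y is, by definition, unknown; only its positions I are available, and those positions are what an imputation method would fill in.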
Copyright information
© 2016 Springer International Publishing Switzerland
About this chapter
Cite this chapter
Baguley, T., Andrews, M. (2016). Handling Missing Data. In: Robertson, J., Kaptein, M. (eds) Modern Statistical Methods for HCI. Human–Computer Interaction Series. Springer, Cham. https://doi.org/10.1007/978-3-319-26633-6_4
DOI: https://doi.org/10.1007/978-3-319-26633-6_4
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-26631-2
Online ISBN: 978-3-319-26633-6
eBook Packages: Computer Science (R0)