Analysis of Missing Data

Graham, John W.

doi:10.1007/978-1-4614-4018-5_2

Part of the book series: Statistics for Social and Behavioral Sciences (SSBS)

Abstract

In this chapter, I first present older methods for handling missing data and then turn to the major new approaches. The methods I present make the missing at random (MAR) assumption. Included in this introduction are the EM algorithm for covariance matrices, normal-model multiple imputation (MI), and what I will refer to as full information maximum likelihood (FIML) methods. Before getting to these methods, however, I talk about the goals of analysis.

Notes

1. It is this random error that is missing from the data set imputed from the EM solution in the MVA module of SPSS (von Hippel 2004; this remains the case at least through version 20).
2. However, it is acceptable if variables are included in the imputation model that are not included in the analysis model.
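
Note 1 can be illustrated with a small simulation. The sketch below is not taken from the chapter: it assumes a single variable y with values missing completely at random, a fully observed predictor x, and complete-case regression estimates standing in for EM parameter estimates. All variable names and simulated values are hypothetical. It compares filling in the conditional mean alone with filling in the conditional mean plus a random normal residual.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed so the sketch is repeatable

# Hypothetical data: y depends linearly on x plus normal error.
n = 5000
x = rng.normal(size=n)
y = 0.6 * x + rng.normal(scale=0.8, size=n)

# Assumption: about 40% of y is missing completely at random; x is complete.
missing = rng.random(n) < 0.4
observed = ~missing

# Complete-case regression of y on x, used here as a stand-in for
# parameter estimates obtained from an EM run.
slope, intercept = np.polyfit(x[observed], y[observed], 1)
residuals = y[observed] - (intercept + slope * x[observed])
resid_sd = np.std(residuals, ddof=2)

# Imputation 1: conditional mean only (no random error restored).
y_mean_imp = y.copy()
y_mean_imp[missing] = intercept + slope * x[missing]

# Imputation 2: conditional mean plus a random normal residual.
y_noise_imp = y.copy()
y_noise_imp[missing] = (
    intercept
    + slope * x[missing]
    + rng.normal(scale=resid_sd, size=missing.sum())
)

var_full = y.var(ddof=1)
var_mean_imp = y_mean_imp.var(ddof=1)
var_noise_imp = y_noise_imp.var(ddof=1)

print(f"variance of y, full data:            {var_full:.3f}")
print(f"variance of y, mean-only imputation: {var_mean_imp:.3f}")
print(f"variance of y, mean plus residual:   {var_noise_imp:.3f}")
```

Under these assumptions the mean-only version understates the variance of y, while adding the residual draw brings it back toward the full-data value. Adding residual error alone does not reflect uncertainty in the parameter estimates, so this is not a complete multiple imputation procedure.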

References

Aiken, L. S., & West, S. G. (1991). Multiple regression: Testing and interpreting interactions. Newbury Park, CA: Sage.
Allison, P. D. (2002). Missing Data. Thousand Oaks, CA: Sage.
Arbuckle, J. L. (1995). Amos users’ guide. Chicago: Smallwaters.
Arbuckle, J. L. (2010). IBM SPSS Amos 19 User’s Guide. Crawfordville, FL: Amos Development Corporation.
Bentler, P. M., & Wu, E. J. C. (1995). EQS for Windows User’s Guide. Encino, CA: Multivariate Software, Inc.
Collins, L. M., & Wugalter, S. E. (1992). Latent class models for stage-sequential dynamic latent variables. Multivariate Behavioral Research, 27, 131–157.
Collins, L. M., Schafer, J. L., & Kam, C. M. (2001). A comparison of inclusive and restrictive strategies in modern missing data procedures. Psychological Methods, 6, 330–351.
Efron, B. (1982). The jackknife, the bootstrap, and other resampling plans. Philadelphia: Society for Industrial and Applied Mathematics.
Graham, J. W. (2003). Adding missing-data relevant variables to FIML-based structural equation models. Structural Equation Modeling, 10, 80–100.
Graham, J. W. (2009). Missing data analysis: making it work in the real world. Annual Review of Psychology, 60, 549–576.
Graham, J. W., Olchowski, A. E., & Gilreath, T. D. (2007). How Many Imputations are Really Needed? Some Practical Clarifications of Multiple Imputation Theory. Prevention Science, 8, 206–213.
Graham, J. W., & Coffman, D. L. (in press). Structural Equation Modeling with Missing Data. In R. Hoyle (Ed.), Handbook of Structural Equation Modeling. New York: Guilford Press.
Graham, J. W., Cumsille, P. E., & Elek-Fisk, E. (2003). Methods for handling missing data. In J. A. Schinka & W. F. Velicer (Eds.), Research Methods in Psychology (pp. 87–114). Volume 2 of Handbook of Psychology (I. B. Weiner, Editor-in-Chief). New York: John Wiley & Sons.
Graham, J. W., Cumsille, P. E., & Shevock, A. E. (in press). Methods for handling missing data. In J. A. Schinka & W. F. Velicer (Eds.), Research Methods in Psychology (pp. 000–000). Volume 3 of Handbook of Psychology (I. B. Weiner, Editor-in-Chief). New York: John Wiley & Sons.
Graham, J. W., & Donaldson, S. I. (1993). Evaluating interventions with differential attrition: The importance of nonresponse mechanisms and use of followup data. Journal of Applied Psychology, 78, 119–128.
Graham, J. W., & Hofer, S. M. (1991). EMCOV.EXE Users Guide. Unpublished manuscript, University of Southern California.
Graham, J. W., Hofer, S. M., Donaldson, S. I., MacKinnon, D. P., & Schafer, J. L. (1997). Analysis with missing data in prevention research. In K. Bryant, M. Windle, & S. West (Eds.), The science of prevention: methodological advances from alcohol and substance abuse research (pp. 325–366). Washington, D.C.: American Psychological Association.
Graham, J. W., Hofer, S. M., & MacKinnon, D. P. (1996). Maximizing the usefulness of data obtained with planned missing value patterns: an application of maximum likelihood procedures. Multivariate Behavioral Research, 31, 197–218.
Hansen, W. B., & Graham, J. W. (1991). Preventing alcohol, marijuana, and cigarette use among adolescents: Peer pressure resistance training versus establishing conservative norms. Preventive Medicine, 20, 414–430.
Jaccard, J. J., & Turrisi, R. (2003). Interaction effects in multiple regression. Newbury Park, CA: Sage Publications.
Jöreskog, K. G., & Sörbom, D. (2006). LISREL 8.8 for Windows [Computer software]. Lincolnwood, IL: Scientific Software International, Inc.
MacKinnon, D. P., Lockwood, C. M., Hoffman, J. M., West, S. G., & Sheets, V. (2002). A comparison of methods to test mediation and other intervening variable effects. Psychological Methods, 7(1), 83–104.
Mels, G. (2006). LISREL for Windows: Getting Started Guide. Lincolnwood, IL: Scientific Software International, Inc.
Muthén, L. K., & Muthén, B. O. (2010). Mplus User’s Guide (6th ed.). Los Angeles: Author.
Neale, M. C., Boker, S. M., Xie, G., & Maes, H. H. (2003). Mx: Statistical Modeling. VCU Box 900126, Richmond, VA 23298: Department of Psychiatry. 6th Edition.
Raudenbush, S. W., & Bryk, A. S. (2002). Hierarchical Linear Models: Applications and Data Analysis Methods, Second Edition. Newbury Park, CA: Sage.
Raudenbush, S. W., Rowan, B., & Kang, S. J. (1991). A multilevel, multivariate model for studying school climate with estimation via the EM algorithm and application to U.S. high-school data. Journal of Educational Statistics, 16, 295–330.
Rubin, D. B. (1987). Multiple imputation for nonresponse in surveys. New York: Wiley.
Rubin, D. B., & Thayer, D. T. (1982). EM algorithms for ML factor analysis. Psychometrika, 47, 69–76.
Schafer, J. L. (1997). Analysis of Incomplete Multivariate Data. New York: Chapman and Hall.
Schafer, J. L. (2001). Multiple imputation with PAN. In L. M. Collins & A. G. Sayer (Eds.), New Methods for the Analysis of Change (pp. 357–377). Washington, DC: American Psychological Association.
Schafer, J. L., & Olsen, M. K. (1998). Multiple imputation for multivariate missing data problems: A data analyst’s perspective. Multivariate Behavioral Research, 33, 545–571.
Schafer, J. L., & Yucel, R. M. (2002). Computational strategies for multivariate linear mixed-effects models with missing values. Journal of Computational and Graphical Statistics, 11, 437–457.
Tanner, M. A., & Wong, W. H. (1987). The calculation of posterior distributions by data augmentation (with discussion). Journal of the American Statistical Association, 82, 528–550.
von Hippel, P. T. (2004). Biases in SPSS 12.0 Missing Value Analysis. American Statistician, 58, 160–164.
Willett, J. B., & Sayer, A. G. (1994). Using covariance structure analysis to detect correlates and predictors of individual change over time. Psychological Bulletin, 116(2), 363–381.
Yuan, K.-H., & Bentler, P. M. (2000). Three likelihood-based methods for mean and covariance structure analysis with nonnormal missing data. Sociological Methodology, 30, 165–200.

Author information

Authors and Affiliations

Department of Biobehavioral Health, The Pennsylvania State University, Health & Human Development Bldg. East, University Park, PA, USA
John W. Graham

About this chapter

Cite this chapter

Graham, J.W. (2012). Analysis of Missing Data. In: Missing Data. Statistics for Social and Behavioral Sciences. Springer, New York, NY. https://doi.org/10.1007/978-1-4614-4018-5_2

DOI: https://doi.org/10.1007/978-1-4614-4018-5_2
Published: 10 May 2012
Publisher Name: Springer, New York, NY
Print ISBN: 978-1-4614-4017-8
Online ISBN: 978-1-4614-4018-5
eBook Packages: Mathematics and Statistics (R0)
