Analysis of Missing Data

Missing Data

Part of the book series: Statistics for Social and Behavioral Sciences (SSBS)

Abstract

In this chapter, I present older methods for handling missing data and then turn to the major new approaches. The methods I present here all make the MAR (missing at random) assumption. Included in this introduction are the EM algorithm for covariance matrices, normal-model multiple imputation (MI), and what I will refer to as FIML (full information maximum likelihood) methods. Before getting to these methods, however, I talk about the goals of analysis.

Notes

  1. It is this random error that is missing from the data set imputed from the EM solution in the MVA module of SPSS (von Hippel 2004; this remains the case at least through version 20).

  2. However, it is acceptable if variables are included in the imputation model that are not included in the analysis model.

References

  • Aiken, L. S., & West, S. G. (1991). Multiple regression: Testing and interpreting interactions. Newbury Park, CA: Sage.

  • Allison, P. D. (2002). Missing Data. Thousand Oaks, CA: Sage.

  • Arbuckle, J. L. (1995). Amos users’ guide. Chicago: Smallwaters.

  • Arbuckle, J. L. (2010). IBM SPSS Amos 19 User’s Guide. Crawfordville, FL: Amos Development Corporation.

  • Bentler, P. M., & Wu, E. J. C. (1995). EQS for Windows User’s Guide. Encino, CA: Multivariate Software, Inc.

  • Collins, L. M., & Wugalter, S. E. (1992). Latent class models for stage-sequential dynamic latent variables. Multivariate Behavioral Research, 27, 131–157.

  • Collins, L. M., Schafer, J. L., & Kam, C. M. (2001). A comparison of inclusive and restrictive strategies in modern missing data procedures. Psychological Methods, 6, 330–351.

  • Efron, B. (1982). The jackknife, the bootstrap, and other resampling plans. Philadelphia: Society for Industrial and Applied Mathematics.

  • Graham, J. W. (2003). Adding missing-data relevant variables to FIML-based structural equation models. Structural Equation Modeling, 10, 80–100.

  • Graham, J. W. (2009). Missing data analysis: making it work in the real world. Annual Review of Psychology, 60, 549–576.

  • Graham, J. W., Olchowski, A. E., & Gilreath, T. D. (2007). How Many Imputations are Really Needed? Some Practical Clarifications of Multiple Imputation Theory. Prevention Science, 8, 206–213.

  • Graham, J. W., & Coffman, D. L. (in press). Structural Equation Modeling with Missing Data. In R. Hoyle (Ed.), Handbook of Structural Equation Modeling. New York: Guilford Press.

  • Graham, J. W., Cumsille, P. E., & Elek-Fisk, E. (2003). Methods for handling missing data. In J. A. Schinka & W. F. Velicer (Eds.), Research Methods in Psychology (pp. 87–114). Volume 2 of Handbook of Psychology (I. B. Weiner, Editor-in-Chief). New York: John Wiley & Sons.

  • Graham, J. W., Cumsille, P. E., & Shevock, A. E. (in press). Methods for handling missing data. In J. A. Schinka & W. F. Velicer (Eds.), Research Methods in Psychology (pp. 000–000). Volume 3 of Handbook of Psychology (I. B. Weiner, Editor-in-Chief). New York: John Wiley & Sons.

  • Graham, J. W., & Donaldson, S. I. (1993). Evaluating interventions with differential attrition: The importance of nonresponse mechanisms and use of followup data. Journal of Applied Psychology, 78, 119–128.

  • Graham, J. W., & Hofer, S. M. (1991). EMCOV.EXE Users Guide. Unpublished manuscript, University of Southern California.

  • Graham, J. W., Hofer, S. M., Donaldson, S. I., MacKinnon, D. P., & Schafer, J. L. (1997). Analysis with missing data in prevention research. In K. Bryant, M. Windle, & S. West (Eds.), The science of prevention: Methodological advances from alcohol and substance abuse research (pp. 325–366). Washington, DC: American Psychological Association.

  • Graham, J. W., Hofer, S. M., & MacKinnon, D. P. (1996). Maximizing the usefulness of data obtained with planned missing value patterns: An application of maximum likelihood procedures. Multivariate Behavioral Research, 31, 197–218.

  • Hansen, W. B., & Graham, J. W. (1991). Preventing alcohol, marijuana, and cigarette use among adolescents: Peer pressure resistance training versus establishing conservative norms. Preventive Medicine, 20, 414–430.

  • Jaccard, J. J., & Turrisi, R. (2003). Interaction effects in multiple regression. Newbury Park, CA: Sage Publications.

  • Jöreskog, K. G., & Sörbom, D. (2006). LISREL 8.8 for Windows [Computer software]. Lincolnwood, IL: Scientific Software International, Inc.

  • MacKinnon, D. P., Lockwood, C. M., Hoffman, J. M., West, S. G., & Sheets, V. (2002). A comparison of methods to test mediation and other intervening variable effects. Psychological Methods, 7(1), 83–104.

  • Mels, G. (2006). LISREL for Windows: Getting Started Guide. Lincolnwood, IL: Scientific Software International, Inc.

  • Muthén, L. K., & Muthén, B. O. (2010). Mplus User’s Guide (6th ed.). Los Angeles: Author.

  • Neale, M. C., Boker, S. M., Xie, G., & Maes, H. H. (2003). Mx: Statistical Modeling (6th ed.). VCU Box 900126, Richmond, VA 23298: Department of Psychiatry.

  • Raudenbush, S. W., & Bryk, A. S. (2002). Hierarchical Linear Models: Applications and Data Analysis Methods, Second Edition. Newbury Park, CA: Sage.

  • Raudenbush, S. W., Rowan, B., & Kang, S. J. (1991). A multilevel, multivariate model for studying school climate with estimation via the EM algorithm and application to U.S. high-school data. Journal of Educational Statistics, 16, 295–330.

  • Rubin, D. B. (1987). Multiple imputation for nonresponse in surveys. New York: Wiley.

  • Rubin, D. B., & Thayer, D. T. (1982). EM algorithms for ML factor analysis. Psychometrika, 47, 69–76.

  • Schafer, J. L. (1997). Analysis of Incomplete Multivariate Data. New York: Chapman and Hall.

  • Schafer, J. L. (2001). Multiple imputation with PAN. In L. M. Collins & A. G. Sayer (Eds.), New Methods for the Analysis of Change (pp. 357–377). Washington, DC: American Psychological Association.

  • Schafer, J. L., & Olsen, M. K. (1998). Multiple imputation for multivariate missing data problems: A data analyst’s perspective. Multivariate Behavioral Research, 33, 545–571.

  • Schafer, J. L., & Yucel, R. M. (2002). Computational strategies for multivariate linear mixed-effects models with missing values. Journal of Computational and Graphical Statistics, 11, 437–457.

  • Tanner, M. A., & Wong, W. H. (1987). The calculation of posterior distributions by data augmentation (with discussion). Journal of the American Statistical Association, 82, 528–550.

  • von Hippel, P. T. (2004). Biases in SPSS 12.0 Missing Value Analysis. American Statistician, 58, 160–164.

  • Willett, J. B., & Sayer, A. G. (1994). Using covariance structure analysis to detect correlates and predictors of individual change over time. Psychological Bulletin, 116(2), 363–381.

  • Yuan, K.-H., & Bentler, P. M. (2000). Three likelihood-based methods for mean and covariance structure analysis with nonnormal missing data. Sociological Methodology, 30, 165–200.

Copyright information

© 2012 Springer Science+Business Media New York

About this chapter

Cite this chapter

Graham, J.W. (2012). Analysis of Missing Data. In: Missing Data. Statistics for Social and Behavioral Sciences. Springer, New York, NY. https://doi.org/10.1007/978-1-4614-4018-5_2
