Behavior Research Methods, Volume 50, Issue 2, pp 501–517

On using multiple imputation for exploratory factor analysis of incomplete data

  • Vahid Nassiri
  • Anikó Lovik
  • Geert Molenberghs
  • Geert Verbeke


A simple multiple imputation-based method is proposed to deal with missing data in exploratory factor analysis. Confidence intervals are obtained for the proportion of explained variance. Simulations and real data analysis are used to investigate and illustrate the use and performance of our proposal.


Keywords: Missing data; Multiple imputation; Exploratory factor analysis; Principal component analysis



Financial support from the IAP research network P7/06 of the Belgian Government (Belgian Science Policy) is gratefully acknowledged. The research leading to these results has also received funding from the European Seventh Framework Programme (FP7 2007–2013) under grant agreement no. 602552. We gratefully acknowledge support from the IWT-SBO ExaScience grant. We are grateful for suggestions made by anonymous referees, which have greatly helped to improve this manuscript.



Copyright information

© Psychonomic Society, Inc. 2018

Authors and Affiliations

  • Vahid Nassiri (1)
  • Anikó Lovik (1)
  • Geert Molenberghs (1, 2)
  • Geert Verbeke (1, 2)

  1. KU Leuven, BioStat, Leuven, Belgium
  2. Universiteit Hasselt, Biostat, Diepenbeek, Belgium
