The Goodness of Sample Loadings of Principal Component Analysis in Approximating to Factor Loadings with High Dimensional Data

  • Conference paper
Quantitative Psychology Research

Part of the book series: Springer Proceedings in Mathematics & Statistics (PROMS, volume 167)

Abstract

Guttman (Psychometrika 21:273–286, 1956) showed that the loadings of factor analysis (FA) and those of principal component analysis (PCA) approach each other as the number of variables p goes to infinity. Because PCA is computationally simpler than FA, PCA can be used as an approximation to FA when p is large. On the other hand, as p increases, the estimates may no longer be consistent. It is therefore necessary to consider simultaneously the closeness between the estimated FA and the estimated PCA loadings and the closeness between the estimated and the population FA loadings. Using Monte Carlo simulation, this article studies the behavior of three kinds of closeness under high-dimensional conditions: (1) between the estimated FA and the estimated PCA loadings, (2) between the estimated FA and the population FA loadings, and (3) between the estimated PCA and the population FA loadings. To deal with high dimensionality, a ridge method proposed by Yuan and Chan (Computational Statistics and Data Analysis 52:4842–4858, 2008) is employed. Closeness is measured by the average canonical correlation (CC) between two loading matrices and by its Fisher-z transformation. Results indicate that the Fisher-z transformed average CC between the estimated FA and the estimated PCA loadings is larger than both that between the estimated FA and the population FA loadings and that between the estimated PCA and the population FA loadings. It is concluded that, under high-dimensional conditions, closeness between the estimated FA and PCA loadings is easier to achieve than closeness between either set of estimated loadings and the population FA loadings.
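The closeness measure named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' code. It assumes that the canonical correlations between two p × m loading matrices are taken without centering the columns, in which case they equal the cosines of the principal angles between the two column spaces, and that the Fisher-z transformation is applied to the average CC. The function names, the simulated population, and the values of p, m, and N are hypothetical choices made for illustration only.

```python
import numpy as np


def average_canonical_correlation(A, B):
    """Mean canonical correlation between the column spaces of A and B.

    Assumption: columns are not centered, so the canonical correlations
    are the singular values of Qa' Qb, where Qa and Qb are orthonormal
    bases of the two column spaces.
    """
    Qa, _ = np.linalg.qr(A)
    Qb, _ = np.linalg.qr(B)
    s = np.linalg.svd(Qa.T @ Qb, compute_uv=False)
    # Clip to [0, 1] to guard against floating-point overshoot.
    return float(np.mean(np.clip(s, 0.0, 1.0)))


def fisher_z(r, eps=1e-12):
    """Fisher-z transform, arctanh(r), kept finite when r is numerically 1."""
    return float(np.arctanh(min(r, 1.0 - eps)))


def pca_loadings(S, m):
    """First m PCA loadings: eigenvectors scaled by sqrt(eigenvalues)."""
    vals, vecs = np.linalg.eigh(S)          # eigenvalues in ascending order
    return vecs[:, -m:] * np.sqrt(vals[-m:])


if __name__ == "__main__":
    # Hypothetical population: p variables, m factors, N observations.
    rng = np.random.default_rng(0)
    p, m, N = 50, 2, 100
    Lambda = rng.uniform(0.5, 0.9, size=(p, m))   # population FA loadings
    psi = rng.uniform(0.3, 0.7, size=p)           # unique variances
    Sigma = Lambda @ Lambda.T + np.diag(psi)

    X = rng.multivariate_normal(np.zeros(p), Sigma, size=N)
    S = np.cov(X, rowvar=False)

    cc = average_canonical_correlation(pca_loadings(S, m), Lambda)
    print("average CC (estimated PCA vs population FA):", cc)
    print("Fisher-z of average CC:", fisher_z(cc))
```

Because the measure depends only on the column spaces, it is unchanged by any nonsingular rotation of either loading matrix, so no rotation step is needed before two solutions are compared. The ridge adjustment of Yuan and Chan (2008) is not included in this sketch.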

References

  • Anderson, T. W. (1963). Asymptotic theory for principal component analysis. Annals of Mathematical Statistics, 34, 122–148.

  • Anderson, T. W. (2003). An introduction to multivariate statistical analysis (3rd ed.). New York: Wiley.

  • Bai, J., & Li, K. (2012). Statistical analysis of factor models of high dimension. The Annals of Statistics, 40, 436–465.

  • Beaujean, A. A. (2013). Factor analysis using R. Practical Assessment, Research & Evaluation, 18. Retrieved April 12, 2015, from http://pareonline.net/getvn.asp?v=18&n=4.

  • Bentler, P. M., & Kano, Y. (1990). On the equivalence of factors and components. Multivariate Behavioral Research, 25, 67–74.

  • Guttman, L. (1956). “Best possible” estimates of communalities. Psychometrika, 21, 273–286.

  • Hayashi, K., & Bentler, P. M. (2000). On the relations among regular, equal unique variances, and image factor analysis models. Psychometrika, 65, 59–72.

  • Johnstone, I. M., & Lu, A. Y. (2004). Sparse principal component analysis (Technical report). Department of Statistics, Stanford University.

  • Johnstone, I. M., & Lu, A. Y. (2009). Consistency and sparsity for principal components analysis in high dimensions. Journal of the American Statistical Association, 104, 682–693.

  • Krijnen, W. P. (2006). Convergence of estimates of unique variances in factor analysis, based on the inverse sample covariance matrix. Psychometrika, 71, 193–199.

  • Lawley, D. N., & Maxwell, A. E. (1971). Factor analysis as a statistical method (2nd ed.). New York: American Elsevier.

  • Liang, L., Hayashi, K., & Yuan, K.-H. (2015). On closeness between factor analysis and principal component analysis under high-dimensional conditions. In L. A. van der Ark, D. M. Bolt, W.-C. Wang, J. A. Douglas, & S.-M. Chow (Eds.), Quantitative psychology research: The 79th Annual Meeting of the Psychometric Society, Madison, Wisconsin, 2014 (pp. 209–221). New York: Springer.

  • Paul, D. (2007). Asymptotics of sample eigenstructure for a large dimensional spiked covariance model. Statistica Sinica, 17, 1617–1642.

  • Pourahmadi, M. (2013). High-dimensional covariance estimation. New York: Wiley.

  • Schneeweiss, H. (1997). Factors and principal components in the near spherical case. Multivariate Behavioral Research, 32, 375–401.

  • Schneeweiss, H., & Mathes, H. (1995). Factor analysis and principal components. Journal of Multivariate Analysis, 55, 105–124.

  • Yuan, K.-H. (2013). Ridge structural equation modeling with large p and/or small N. IMPS2013. The Netherlands: Arnhem.

  • Yuan, K.-H., & Chan, W. (2008). Structural equation modeling with near singular covariance matrices. Computational Statistics and Data Analysis, 52, 4842–4858.

Acknowledgment

Ke-Hai Yuan’s work was supported by the National Science Foundation under Grant No. SES-1461355. The authors are grateful to Dr. Daniel M. Bolt for his valuable comments on an earlier version of the manuscript.

Author information

Corresponding author

Correspondence to Kentaro Hayashi.

Copyright information

© 2016 Springer International Publishing Switzerland

About this paper

Cite this paper

Liang, L., Hayashi, K., Yuan, KH. (2016). The Goodness of Sample Loadings of Principal Component Analysis in Approximating to Factor Loadings with High Dimensional Data. In: van der Ark, L., Bolt, D., Wang, WC., Douglas, J., Wiberg, M. (eds) Quantitative Psychology Research. Springer Proceedings in Mathematics & Statistics, vol 167. Springer, Cham. https://doi.org/10.1007/978-3-319-38759-8_15
