The Goodness of Sample Loadings of Principal Component Analysis in Approximating to Factor Loadings with High Dimensional Data

Liang, Lu; Hayashi, Kentaro; Yuan, Ke-Hai

doi:10.1007/978-3-319-38759-8_15

Part of the book series: Springer Proceedings in Mathematics & Statistics (PROMS, volume 167)

Abstract

Guttman (Psychometrika, 21, 273–286, 1956) showed that the loadings of factor analysis (FA) and those of principal component analysis (PCA) approach each other as the number of variables p goes to infinity. Because the computation for PCA is simpler than that for FA, PCA can be used as an approximation to FA when p is large. On the other hand, as p increases, inconsistency of the estimates might become an issue. It is therefore necessary to consider simultaneously the closeness between the estimated FA and the estimated PCA loadings and the closeness between the estimated and the population FA loadings. Using Monte Carlo simulation, this article studies the behavior of three kinds of closeness under high-dimensional conditions: (1) between the estimated FA and the estimated PCA loadings, (2) between the estimated FA and the population FA loadings, and (3) between the estimated PCA and the population FA loadings. To deal with high dimensionality, a ridge method proposed by Yuan and Chan (Computational Statistics and Data Analysis, 52, 4842–4858, 2008) is employed. As measures of closeness, the average canonical correlation (CC) between two loading matrices and its Fisher-z transformation are employed. Results indicate that the Fisher-z transformed average CC between the estimated FA and the estimated PCA loadings is larger than that between the estimated FA and the population FA loadings, and also larger than that between the estimated PCA and the population FA loadings. It is concluded that, under high-dimensional conditions, closeness between the estimated FA and PCA loadings is easier to achieve than closeness between the estimated and the population FA loadings, and also easier than closeness between the estimated PCA and the population FA loadings.
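The closeness measure named in the abstract can be illustrated with a small sketch. It rests on several assumptions that are not taken from the chapter: canonical correlations are computed as the cosines of the principal angles between the column spaces of two p x m loading matrices (without centering), the toy population design and all function names are hypothetical, and only the PCA-versus-population comparison is shown, with no FA estimation and no ridge adjustment. The authors' exact definitions and simulation design may differ.

```python
import numpy as np


def average_canonical_correlation(A, B):
    """Mean canonical correlation between the column spaces of A and B (both p x m)."""
    # Orthonormal bases for the two column spaces.
    Qa, _ = np.linalg.qr(A)
    Qb, _ = np.linalg.qr(B)
    # Singular values of Qa' Qb are the cosines of the principal angles.
    s = np.linalg.svd(Qa.T @ Qb, compute_uv=False)
    return float(np.clip(s, 0.0, 1.0).mean())


def fisher_z(r, eps=1e-12):
    """Fisher z-transformation, with r kept just below 1 so the result stays finite."""
    return float(np.arctanh(min(r, 1.0 - eps)))


def pca_loadings(R, m):
    """First m PCA loadings: eigenvectors scaled by the square roots of the eigenvalues."""
    eigvals, eigvecs = np.linalg.eigh(R)
    top = np.argsort(eigvals)[::-1][:m]
    return eigvecs[:, top] * np.sqrt(eigvals[top])


def simulate_pca_vs_population(p=50, m=2, n=200, loading=0.7, seed=0):
    """Hypothetical toy design: each variable loads on exactly one factor."""
    rng = np.random.default_rng(seed)
    Lambda = np.zeros((p, m))
    for j in range(p):
        Lambda[j, j % m] = loading
    psi = 1.0 - loading ** 2  # unique variance, so each variable has variance 1
    F = rng.standard_normal((n, m))                 # factor scores
    E = rng.standard_normal((n, p)) * np.sqrt(psi)  # unique parts
    X = F @ Lambda.T + E
    R = np.corrcoef(X, rowvar=False)                # sample correlation matrix
    r = average_canonical_correlation(pca_loadings(R, m), Lambda)
    return r, fisher_z(r)


avg_cc, z = simulate_pca_vs_population()
print(f"average CC = {avg_cc:.3f}, Fisher z = {z:.3f}")
```

Because the measure depends only on the column spaces, it is unchanged when either loading matrix is rotated, which suits FA loadings that are identified only up to rotation. The Fisher-z transformation stretches values near 1, so small differences between high average CCs become easier to compare.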

References

Anderson, T. W. (1963). Asymptotic theory for principal component analysis. Annals of Mathematical Statistics, 34, 122–148.
Anderson, T. W. (2003). An introduction to multivariate statistical analysis (3rd ed.). New York: Wiley.
Bai, J., & Li, K. (2012). Statistical analysis of factor models of high dimension. The Annals of Statistics, 40, 436–465.
Beaujean, A. A. (2013). Factor analysis using R. Practical Assessment, Research & Evaluation, 18. Retrieved April 12, 2015, from http://pareonline.net/getvn.asp?v=18&n=4.
Bentler, P. M., & Kano, Y. (1990). On the equivalence of factors and components. Multivariate Behavioral Research, 25, 67–74.
Guttman, L. (1956). “Best possible” estimates of communalities. Psychometrika, 21, 273–286.
Hayashi, K., & Bentler, P. M. (2000). On the relations among regular, equal unique variances, and image factor analysis models. Psychometrika, 65, 59–72.
Johnstone, I. M., & Lu, A. Y. (2004). Sparse principal component analysis (Technical report). Department of Statistics, Stanford University.
Johnstone, I. M., & Lu, A. Y. (2009). Consistency and sparsity for principal components analysis in high dimensions. Journal of the American Statistical Association, 104, 682–693.
Krijnen, W. P. (2006). Convergence of estimates of unique variances in factor analysis, based on the inverse sample covariance matrix. Psychometrika, 71, 193–199.
Lawley, D. N., & Maxwell, A. E. (1971). Factor analysis as a statistical method (2nd ed.). New York: American Elsevier.
Liang, L., Hayashi, K., & Yuan, K.-H. (2015). On closeness between factor analysis and principal component analysis under high-dimensional conditions. In L. A. van der Ark, D. M. Bolt, W.-C. Wang, J. A. Douglas, & S.-M. Chow (Eds.), Quantitative psychology research: The 79th Annual Meeting of the Psychometric Society, Madison, Wisconsin, 2014 (pp. 209–221). New York: Springer.
Paul, D. (2007). Asymptotics of sample eigenstructure for a large dimensional spiked covariance model. Statistica Sinica, 17, 1617–1642.
Pourahmadi, M. (2013). High-dimensional covariance estimation. New York: Wiley.
Schneeweiss, H. (1997). Factors and principal components in the near spherical case. Multivariate Behavioral Research, 32, 375–401.
Schneeweiss, H., & Mathes, H. (1995). Factor analysis and principal components. Journal of Multivariate Analysis, 55, 105–124.
Yuan, K.-H. (2013). Ridge structural equation modeling with large p and/or small N. IMPS2013. The Netherlands: Arnhem.
Yuan, K.-H., & Chan, W. (2008). Structural equation modeling with near singular covariance matrices. Computational Statistics and Data Analysis, 52, 4842–4858.

Acknowledgment

Ke-Hai Yuan’s work was supported by the National Science Foundation under Grant No. SES-1461355. The authors are grateful to Dr. Daniel M. Bolt for his valuable comments on the earlier version of the manuscript.

Author information

Authors and Affiliations

Department of Psychology, University of Hawaii at Manoa, 2530 Dole Street, Sakamaki C400, Honolulu, HI, 96822, USA
Lu Liang & Kentaro Hayashi
Department of Psychology, University of Notre Dame, 123A Haggar Hall, Notre Dame, IN, 46556, USA
Ke-Hai Yuan

Corresponding author

Correspondence to Kentaro Hayashi.

Editor information

Editors and Affiliations

University of Amsterdam, Amsterdam, The Netherlands
L. Andries van der Ark
University of Wisconsin, Madison, Wisconsin, USA
Daniel M. Bolt
Education University of Hong Kong, Hong Kong, China
Wen-Chung Wang
University of Illinois, Champaign, Illinois, USA
Jeffrey A. Douglas
Umeå University, Umeå, Sweden
Marie Wiberg

About this paper

Cite this paper

Liang, L., Hayashi, K., Yuan, KH. (2016). The Goodness of Sample Loadings of Principal Component Analysis in Approximating to Factor Loadings with High Dimensional Data. In: van der Ark, L., Bolt, D., Wang, WC., Douglas, J., Wiberg, M. (eds) Quantitative Psychology Research. Springer Proceedings in Mathematics & Statistics, vol 167. Springer, Cham. https://doi.org/10.1007/978-3-319-38759-8_15

DOI: https://doi.org/10.1007/978-3-319-38759-8_15
Published: 05 August 2016
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-38757-4
Online ISBN: 978-3-319-38759-8
eBook Packages: Mathematics and Statistics; Mathematics and Statistics (R0)
