
Sparse Exploratory Factor Analysis


Abstract

Sparse principal component analysis has been a very active research area over the last decade. It produces component loadings with many zero entries, which facilitates their interpretation and helps avoid redundant variables. Classic factor analysis is another popular dimension reduction technique that shares similar interpretation problems and could greatly benefit from sparse solutions. Unfortunately, there are very few works considering sparse versions of classic factor analysis. Our goal is to contribute further in this direction. We revisit the most popular procedures for exploratory factor analysis: maximum likelihood and least squares. Sparse factor loadings are obtained for them by, first, adopting a special reparameterization and, second, introducing additional \(\ell _1\)-norm penalties into the standard factor analysis problems. As a result, we propose sparse versions of the major factor analysis procedures. We illustrate the developed algorithms on well-known psychometric problems, and critically compare our sparse solutions to those obtained by other existing methods.


References

  • Absil, P.-A., Mahony, R., & Sepulchre, R. (2008). Optimization algorithms on matrix manifolds. Princeton, NJ: Princeton University Press.

  • Boumal, N., Mishra, B., Absil, P.-A., & Sepulchre, R. (2014). MANOPT: A Matlab toolbox for optimization on manifolds. Journal of Machine Learning Research, 15, 1455–1459.

  • Choi, J., Zou, H., & Oehlert, G. (2011). A penalized maximum likelihood approach to sparse factor analysis. Statistics and Its Interface, 3, 429–436.

  • Del Buono, N., & Lopez, L. (2001). Runge–Kutta type methods based on geodesics for systems of ODEs on the Stiefel manifold. BIT Numerical Mathematics, 41(5), 912–923.

  • Edelman, A., Arias, T. A., & Smith, S. T. (1998). The geometry of algorithms with orthogonality constraints. SIAM Journal on Matrix Analysis and Applications, 20, 303–353.

  • Fontanella, S., Trendafilov, N., & Adachi, K. (2014). Sparse exploratory factor analysis. In Proceedings of COMPSTAT 2014 (pp. 281–288).

  • Hage, C., & Kleinsteuber, M. (2014). Robust PCA and subspace tracking from incomplete observations using \(\ell _0\)-surrogates. Computational Statistics, 29, 467–487.

  • Harman, H. H. (1976). Modern factor analysis (3rd ed.). Chicago, IL: University of Chicago Press.

  • Hirose, K., & Yamamoto, M. (2014). Estimation of an oblique structure via penalized likelihood factor analysis. Computational Statistics and Data Analysis, 79, 120–132.

  • Hirose, K., & Yamamoto, M. (2015). Sparse estimation via nonconcave penalized likelihood in a factor analysis model. Statistics and Computing, 25, 863–875.

  • Jolliffe, I. T. (2002). Principal component analysis (2nd ed.). New York, NY: Springer-Verlag.

  • Jöreskog, K. G. (1977). Factor analysis by least-squares and maximum likelihood methods. In K. Enslein, A. Ralston, & H. S. Wilf (Eds.), Mathematical methods for digital computers (pp. 125–153). New York, NY: John Wiley & Sons.

  • Luss, R., & Teboulle, M. (2013). Conditional gradient algorithms for rank-one matrix approximations with a sparsity constraint. SIAM Review, 55, 65–98.

  • MATLAB. (2014). MATLAB R2014b. Natick, MA: The MathWorks Inc.

  • Mulaik, S. A. (2010). The foundations of factor analysis (2nd ed.). Boca Raton, FL: Chapman and Hall/CRC.

  • Ning, N., & Georgiou, T. T. (2011). Sparse factor analysis via likelihood and \(\ell _1\)-regularization. In 50th IEEE Conference on Decision and Control and European Control Conference (CDC-ECC), Orlando, FL, USA, December 12–15, 2011.

  • Trendafilov, N. T. (2003). Dynamical system approach to factor analysis parameter estimation. British Journal of Mathematical and Statistical Psychology, 56, 27–46.

  • Trendafilov, N. T. (2014). From simple structure to sparse components: A review. Computational Statistics, 29, 431–454.

  • Trendafilov, N. T., & Adachi, K. (2015). Sparse versus simple structure loadings. Psychometrika, 80, 776–790.

  • Trendafilov, N. T., & Jolliffe, I. T. (2006). Projected gradient approach to the numerical solution of the SCoTLASS. Computational Statistics and Data Analysis, 50, 242–253.

  • Wen, Z., & Yin, W. (2013). A feasible method for optimization with orthogonality constraints. Mathematical Programming, 142, 397–434.


Acknowledgements

We are grateful to the reviewers for their careful reading of the manuscript and their helpful comments. We also thank Dr Kei Hirose, Osaka University, for his help with fanc.

Author information


Corresponding author

Correspondence to Nickolay T. Trendafilov.

Additional information

This work is supported by Grant RPG-2013-211 from The Leverhulme Trust, UK.

Appendices

Appendix 1

Here, we find the gradient of the penalty term \(P_\tau (Q)^\top P_\tau (Q)\) in (11) and (12), which can then be combined with the gradients of the objective functions of ML-, LS-, or GLS-EFA. Let us start with

$$\begin{aligned} d(P_\tau (Q)^\top P_\tau (Q)) = 2 d(P_\tau (Q))^\top P_\tau (Q) = 2 (dP_\tau )^\top P_\tau , \end{aligned}$$
(14)

which requires the calculation of \(d(P_\tau )\). At this point, we need an approximation of \(\text{ sign }(x)\), and we employ the one already used by Trendafilov and Jolliffe (2006), namely \(\text{ sign }(x) \approx \text{ tanh }(\gamma x)\) for some large \(\gamma > 0\), written \(\text{ th }(\gamma x)\) for short. See also Hage and Kleinsteuber (2014) and Luss and Teboulle (2013). Then

$$\begin{aligned} 2(dP_\tau )= & {} (d{\mathbf {q}}_\tau ) \odot [1_r + \text{ th }(\gamma {\mathbf {q}}_\tau )] + {\mathbf {q}}_\tau \odot [1_r - \text{ th }^2(\gamma {\mathbf {q}}_\tau )] \odot \gamma (d{\mathbf {q}}_\tau ),\nonumber \\= & {} (d{\mathbf {q}}_\tau ) \odot \left\{ 1_r + \text{ th }(\gamma {\mathbf {q}}_\tau ) + \gamma {\mathbf {q}}_\tau \odot [1_r - \text{ th }^2(\gamma {\mathbf {q}}_\tau )] \right\} , \end{aligned}$$
(15)

where \(1_r\) is an \(r \times 1\) vector with unit entries. The next differential to be found is:

$$\begin{aligned} d{\mathbf {q}}_\tau= & {} 1_p^\top \left\{ (dQ) \odot \text{ th }(\gamma Q) + Q \odot [1_{p \times r} - \text{ th }^2(\gamma Q)] \odot \gamma (dQ)\right\} \nonumber \\= & {} 1_p^\top \left\{ (dQ) \odot \left\{ \text{ th }(\gamma Q) + (\gamma Q) \odot [1_{p \times r} - \text{ th }^2(\gamma Q)] \right\} \right\} , \end{aligned}$$
(16)

where \(1_{p \times r}\) is a \(p \times r\) matrix with unit entries.

Now we are ready to find the gradient \(\nabla _Q\) of the penalty term with respect to Q. To simplify the notation, let

$$\begin{aligned} {{\mathbf {w}}} = 1_r + \text{ th }(\gamma {{\mathbf {q}}}_\tau ) + (\gamma {{\mathbf {q}}}_\tau ) \odot [1_r - \text{ th }^2(\gamma {{\mathbf {q}}}_\tau )], \end{aligned}$$
(17)

and

$$\begin{aligned} W = \text{ th }(\gamma Q) + (\gamma Q) \odot [1_{p \times r} - \text{ th }^2(\gamma Q)]. \end{aligned}$$
(18)

Going back to (14) and (15), we find that:

$$\begin{aligned} 2 (dP_\tau )^\top P_\tau= & {} \text{ trace } [(d{{\mathbf {q}}}_\tau ) \odot {{\mathbf {w}}} ]^\top P_\tau = \text{ trace } (d{{\mathbf {q}}}_\tau )^\top ({{\mathbf {w}}} \odot P_\tau ) \nonumber \\= & {} \text{ trace }\{ 1_p^\top [(dQ) \odot W] \}^\top ({{\mathbf {w}}} \odot P_\tau ) \nonumber \\= & {} \text{ trace } [(dQ)^\top \odot W^\top ] 1_p ({\mathbf {w}} \odot P_\tau ) \nonumber \\= & {} \text{ trace } (dQ)^\top \{W \odot [1_p ({\mathbf {w}} \odot P_\tau )]\}, \end{aligned}$$
(19)

making use of the identity \(\text{ trace } (A \odot B) C = \text{ trace } A (B^\top \odot C)\). Thus, the gradient \(\nabla _Q\) of the penalty term with respect to Q is:

$$\begin{aligned} \nabla _Q = W \odot [1_p ({\mathbf {w}} \odot P_\tau )]. \end{aligned}$$
(20)
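
For concreteness, (16)–(20) translate into a few lines of MATLAB. The sketch below is ours, not the authors' released code: consistently with the differential (16), it assumes \({\mathbf {q}}_\tau (Q) = [1_p^\top (Q \odot \text{ th }(\gamma Q))]^\top - \tau 1_r\), i.e. smoothed column-wise \(\ell _1\)-norms of Q shifted by the tuning value \(\tau\), and, consistently with (15), \(P_\tau = {\mathbf {q}}_\tau \odot [1_r + \text{ th }(\gamma {\mathbf {q}}_\tau )]/2\), a smoothed positive part of \({\mathbf {q}}_\tau\). Storing \({\mathbf {q}}_\tau\) as a column explains the transpose in the last line.

    % Gradient (20) of the penalty term P_tau(Q)'*P_tau(Q) with respect to Q.
    % Q is p x r orthonormal, tau is the sparsity tuning value, and g is the
    % large constant gamma in the tanh approximation of sign().
    function G = penalty_gradient(Q, tau, g)
        p    = size(Q, 1);
        thQ  = tanh(g*Q);                           % th(gamma*Q)
        qtau = sum(Q .* thQ, 1)' - tau;             % assumed q_tau, r x 1
        thq  = tanh(g*qtau);
        Ptau = qtau .* (1 + thq)/2;                 % assumed smooth positive part
        w    = 1 + thq + (g*qtau) .* (1 - thq.^2);  % w in (17)
        W    = thQ + (g*Q) .* (1 - thQ.^2);         % W in (18)
        G    = W .* (ones(p, 1) * (w .* Ptau)');    % (20)
    end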

Appendix 2

Here, we summarize some technical details of the numerical procedures employed in this work.

The gradients of the ML-, LS- and GLS-EFA objective functions with respect to the unknowns \(\{Q, D, \Psi \}\) are given in Trendafilov (2003) as the following block matrix: \(( -YQD^{2}, -Q^{\top } Y Q \odot D, -Y \odot \Psi )\). For ML-EFA, one has \(Y = 2 R_{ZZ}^{-1} (R-R_{ZZ}) R_{ZZ}^{-1}\), and for LS- and GLS-EFA it changes to \(Y = 4 (R - R_{ZZ})V^2\). Additionally, we need the gradient \(\nabla _Q\) of the penalty term \(P_\tau (Q)^\top P_\tau (Q)\) with respect to Q, which should be added to \(-YQD^{2}\); its derivation is given in detail in Appendix 1. A MATLAB transcription of these blocks is sketched below.
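
As an illustration only, the sketch assumes the reparameterized model correlation matrix \(R_{ZZ} = QD^2Q^\top + \Psi ^2\) (with loadings \(\Lambda = QD\)), and a weight matrix V for LS- and GLS-EFA, with \(V = I_p\) giving plain LS; these modeling details are our reading of Trendafilov (2003), not a verbatim excerpt.

    % Euclidean gradient blocks (-Y*Q*D^2, -(Q'*Y*Q) o D, -Y o Psi) of the
    % ML-, LS- and GLS-EFA objective functions; D and Psi are diagonal.
    function [gQ, gD, gPsi] = efa_gradients(R, Q, D, Psi, method, V)
        Rzz = Q*D^2*Q' + Psi^2;                % assumed model correlation matrix
        if strcmp(method, 'ml')
            Y = 2 * (Rzz \ (R - Rzz) / Rzz);   % 2*inv(Rzz)*(R-Rzz)*inv(Rzz)
        else                                   % 'ls' (V = I) or 'gls'
            Y = 4 * (R - Rzz) * V^2;
        end
        gQ   = -Y*Q*D^2;                       % add penalty gradient (20) here
        gD   = -(Q'*Y*Q) .* D;
        gPsi = -Y .* Psi;
    end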

The dynamical system approach employed in Trendafilov (2003) can be readily applied to solving (11) and (12). It involves numerical integration of matrix ordinary differential equations (ODEs) for \(\{Q, D, \Psi \}\) defined by their projected gradients; in particular, a projected gradient dynamical system for Q on the Stiefel manifold of all \(p \times r\) orthonormal matrices. A number of specialized numerical methods exist for such problems; see those listed in Trendafilov (2003), e.g. Del Buono and Lopez (2001). In contrast to the standard EFA alternating approaches (Jöreskog, 1977; Mulaik, 2010), the dynamical system approach gives matrix algorithms which produce a simultaneous solution for \(\{Q, D, \Psi \}\) by exploiting the geometry of their specific matrix structures. Moreover, such algorithms are globally convergent, i.e. convergence is reached independently of the starting (initial) point (Absil et al., 2008; Trendafilov, 2003).
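
The tangent-space projection underlying such a gradient flow is standard (Edelman et al., 1998); a minimal sketch, with the flows for D and \(\Psi\) omitted for brevity:

    % Projection of a Euclidean gradient G onto the tangent space of the
    % Stiefel manifold at Q: subtract Q times the symmetric part of Q'*G.
    function T = stiefel_proj(Q, G)
        S = (Q'*G + G'*Q)/2;
        T = G - Q*S;
    end

The projected gradient flow \(\dot{Q} = -\Pi _Q(\nabla _Q)\) then evolves on the manifold, and the specialized integrators cited above keep Q orthonormal along the whole trajectory.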

The numerical ODE solvers currently available in MATLAB (MATLAB, 2014) are not well suited to large optimization problems: they track the whole trajectory defined by the ODE, which is time-consuming and wasteful when only the asymptotic state is of interest. This limits the application of the proposed approach for solving (11) and (12) to rather small data sets.

An alternative is to employ iterative algorithms working directly on matrix manifolds (Absil et al., 2008; Edelman et al., 1998; Wen & Yin, 2013). The gradients listed above can be readily used for solving (11) and (12) with MANOPT, a free MATLAB-based toolbox for optimization on matrix manifolds (Boumal et al., 2014). The MANOPT code for solving (11) and (12) can be obtained from the authors upon request and will be made available online. Note that by choosing \(\mu = 0\), one obtains solutions to the standard ML-, LS- and GLS-EFA problems (9) and (10). A minimal MANOPT setup along these lines is sketched below.
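
The factories and the trustregions solver in this sketch are part of MANOPT (Boumal et al., 2014); the LS cost with an exact (non-smoothed) positive part is an illustrative stand-in for the penalized problem (11), not the authors' released code.

    % Sketch of penalized LS-EFA in MANOPT. Assumes MANOPT is on the path and
    % that R (p x p sample correlation matrix), r, mu, tau and g are defined.
    p = size(R, 1);
    elements.Q   = stiefelfactory(p, r);      % orthonormal p x r matrices
    elements.d   = euclideanfactory(r, 1);    % diagonal of D
    elements.psi = euclideanfactory(p, 1);    % diagonal of Psi
    problem.M    = productmanifold(elements);

    rzz  = @(x) x.Q * diag(x.d.^2) * x.Q' + diag(x.psi.^2);
    qtau = @(Q) sum(Q .* tanh(g*Q), 1)' - tau;     % assumed q_tau, as above
    Pt   = @(Q) max(qtau(Q), 0);                   % exact positive part
    problem.cost = @(x) norm(R - rzz(x), 'fro')^2 + mu * (Pt(x.Q)' * Pt(x.Q));

    x = trustregions(problem);   % MANOPT warns and approximates the gradient
                                 % numerically unless problem.egrad is supplied

Supplying problem.egrad from the gradient blocks above (with the penalty gradient (20) added to the Q block) avoids the numerical approximation and speeds up the solver.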


About this article


Cite this article

Trendafilov, N.T., Fontanella, S. & Adachi, K. Sparse Exploratory Factor Analysis. Psychometrika 82, 778–794 (2017). https://doi.org/10.1007/s11336-017-9575-8
