Abstract
Principal component analysis (PCA) is a widely used technique for data analysis and dimension reduction, with numerous applications in science and engineering. However, standard PCA suffers from the drawback that the principal components (PCs) are usually linear combinations of all the original variables, which makes the PCs difficult to interpret. To alleviate this drawback, various sparse PCA approaches have been proposed in the literature (Cadima and Jolliffe in J Appl Stat 22:203–214, 1995; d’Aspremont et al. in J Mach Learn Res 9:1269–1294, 2008; d’Aspremont et al. in SIAM Rev 49:434–448, 2007; Jolliffe in J Appl Stat 22:29–35, 1995; Journée et al. in J Mach Learn Res 11:517–553, 2010; Jolliffe et al. in J Comput Graph Stat 12:531–547, 2003; Moghaddam et al. in Advances in Neural Information Processing Systems 18:915–922, MIT Press, Cambridge, 2006; Shen and Huang in J Multivar Anal 99(6):1015–1034, 2008; Zou et al. in J Comput Graph Stat 15(2):265–286, 2006). Although these methods succeed in achieving sparsity, they sacrifice some important properties enjoyed by standard PCA, such as uncorrelatedness of the PCs and orthogonality of the loading vectors. Moreover, the total explained variance that they attempt to maximize can be too optimistic. In this paper we propose a new formulation for sparse PCA that aims to find sparse and nearly uncorrelated PCs with orthogonal loading vectors while explaining as much of the total variance as possible. We also develop a novel augmented Lagrangian method for solving a class of nonsmooth constrained optimization problems that is well suited to our formulation of sparse PCA. We show that it converges to a feasible point and, under some regularity assumptions, to a stationary point. In addition, we propose two nonmonotone gradient methods for solving the augmented Lagrangian subproblems and establish their global and local convergence. Finally, we compare our sparse PCA approach with several existing methods on synthetic (Zou et al. in J Comput Graph Stat 15(2):265–286, 2006), Pitprops (Jeffers in Appl Stat 16:225–236, 1967), and gene expression data (Chin et al. in Cancer Cell 10:529–541, 2006). The computational results demonstrate that the sparse PCs produced by our approach substantially outperform those produced by other methods in terms of total explained variance, correlation of PCs, and orthogonality of loading vectors. Moreover, experiments on random data show that our method can solve large-scale problems within a reasonable amount of time.
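As a rough illustration of the augmented Lagrangian framework the abstract refers to, the following self-contained Python sketch computes a single sparse loading vector by (locally) maximizing v'Σv − ρ‖v‖₁ subject to ‖v‖² = 1, solving each subproblem with plain proximal gradient steps. This is only a toy instance of the classical framework under our own assumptions; it is not the paper's formulation, and the paper uses nonmonotone gradient methods rather than this simple inner solver. All function names and parameter values below are illustrative.

import numpy as np

def soft_threshold(x, t):
    # Proximal operator of t * ||.||_1 (elementwise soft-thresholding).
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)

def sparse_loading_al(Sigma, rho=0.3, mu=1.0, n_outer=20, n_inner=300):
    # Toy augmented Lagrangian loop for
    #   min_v  -v' Sigma v + rho * ||v||_1   s.t.  ||v||^2 - 1 = 0,
    # with augmented Lagrangian
    #   L(v; lam, mu) = -v'Sigma v + rho*||v||_1
    #                   + lam*(||v||^2 - 1) + (mu/2)*(||v||^2 - 1)^2.
    # Purely illustrative; NOT the algorithm developed in the paper.
    eigvals, eigvecs = np.linalg.eigh(Sigma)
    v = eigvecs[:, -1].copy()        # warm start: dense leading PC
    sig_max = eigvals[-1]
    lam = 0.0                        # multiplier estimate for the constraint
    for _ in range(n_outer):
        # Crude step size from a local curvature bound of the smooth part
        # (assumes the iterates stay near the unit sphere).
        step = 1.0 / (2.0 * sig_max + 2.0 * abs(lam) + 6.0 * mu + 1.0)
        for _ in range(n_inner):     # proximal gradient on the AL subproblem
            c = v @ v - 1.0
            grad = -2.0 * Sigma @ v + 2.0 * (lam + mu * c) * v
            v = soft_threshold(v - step * grad, step * rho)
        lam += mu * (v @ v - 1.0)    # first-order multiplier update
        mu *= 1.5                    # gradually strengthen the penalty
    return v

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    A = rng.standard_normal((50, 10))
    Sigma = A.T @ A / 50.0           # sample covariance of synthetic data
    v = sparse_loading_al(Sigma)
    print("constraint violation:", abs(v @ v - 1.0))
    print("nonzeros:", int(np.count_nonzero(np.abs(v) > 1e-6)))

The outer loop is the standard first-order multiplier update lam ← lam + mu·c(v) with an increasing penalty parameter; the sparsity comes entirely from the soft-thresholding step, which is the proximal map of the ℓ₁ term.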
References
Alter O., Brown P., Botstein D.: Singular value decomposition for genome-wide expression data processing and modeling. Proc. Natl. Acad. Sci. 97, 10101–10106 (2000)
Barzilai J., Borwein J.M.: Two-point step size gradient methods. IMA J. Numer. Anal. 8, 141–148 (1988)
Beck A., Teboulle M.: A fast iterative shrinkage-thresholding algorithm for linear inverse problems. SIAM J. Imaging Sci. 2, 183–202 (2009)
Bertsekas D.P.: Nonlinear Programming. Athena Scientific, Belmont (1999)
Birgin E.G., Martínez J.M., Raydan M.: Nonmonotone spectral projected gradient methods on convex sets. SIAM J. Optim. 10, 1196–1211 (2000)
Burer S., Monteiro R.D.C.: A nonlinear programming algorithm for solving semidefinite programs via low-rank factorization. Math. Program. Ser. B 95, 329–357 (2003)
Cadima J., Jolliffe I.: Loadings and correlations in the interpretation of principal components. J. Appl. Stat. 22, 203–214 (1995)
Chin K., Devries S., Fridlyand J., Spellman P., Roydasgupta R., Kuo W.-L., Lapuk A., Neve R., Qian Z., Ryder T.: Genomic and transcriptional aberrations linked to breast cancer pathophysiologies. Cancer Cell 10, 529–541 (2006)
d’Aspremont A., Bach F.R., El Ghaoui L.: Optimal solutions for sparse principal component analysis. J. Mach. Learn. Res. 9, 1269–1294 (2008)
d’Aspremont A., El Ghaoui L., Jordan M.I., Lanckriet G.R.G.: A direct formulation for sparse PCA using semidefinite programming. SIAM Rev. 49, 434–448 (2007)
Hancock P., Burton A., Bruce V.: Face processing: human perception and principal components analysis. Memory Cogn. 24, 26–40 (1996)
Hastie T., Tibshirani R., Eisen M., Brown P., Ross D., Scherf U., Weinstein J., Alizadeh A., Staudt L., Botstein D.: ‘Gene Shaving’ as a method for identifying distinct sets of genes with similar expression patterns. Genome Biol. 1, 1–21 (2000)
Hastie T., Tibshirani R., Friedman J.: The Elements of Statistical Learning: Data Mining, Inference and Prediction. Springer, New York (2001)
Helmke U., Moore J.B.: Optimization and Dynamical Systems. Springer, London and New York (1994)
Hiriart-Urruty J.B., Lemaréchal C.: Convex Analysis and Minimization Algorithms I. Comprehensive Study in Mathematics, vol. 305. Springer, New York (1993)
Jeffers J.: Two case studies in the application of principal component analysis. Appl. Stat. 16, 225–236 (1967)
Jolliffe I.: Rotation of principal components: choice of normalization constraints. J. Appl. Stat. 22, 29–35 (1995)
Journée M., Nesterov Yu., Richtárik P., Sepulchre R.: Generalized power method for sparse principal component analysis. J. Mach. Learn. Res. 11, 517–553 (2010)
Jolliffe I.T., Trendafilov N.T., Uddin M.L.: A modified principal component technique based on the Lasso. J. Comput. Graph. Stat. 12, 531–547 (2003)
Lu Z., Zhang Y.: An augmented Lagrangian approach for sparse principal component analysis. Technical report, Department of Mathematics, Simon Fraser University, Burnaby, BC, V5A 1S6, Canada, July 2009
Moghaddam B., Weiss Y., Avidan S.: Spectral bounds for sparse PCA: exact and greedy algorithms. In: Weiss, Y., Schölkopf, B., Platt, J. (eds) Advances in Neural Information Processing Systems 18, pp. 915–922. MIT Press, Cambridge (2006)
Monteiro R.D.C.: Private communication (2009)
Nesterov Y.E.: Gradient methods for minimizing composite objective functions. CORE Discussion Paper 2007/76, September 2007
Robinson S.M.: Stability theory for systems of inequalities, Part 2: Differentiable nonlinear systems. SIAM J. Numer. Anal. 13, 497–513 (1976)
Robinson S.M.: Local structure of feasible sets in nonlinear programming, Part I: regularity. In: Pereira, V., Reinoza, A. (eds) Numerical Methods. Lecture Notes in Mathematics, vol. 1005. Springer, Berlin (1983)
Rockafellar R.T.: Convex Analysis. Princeton University Press, Princeton (1970)
Ruszczyński A.: Nonlinear Optimization. Princeton University Press, Princeton (2006)
Shen H., Huang J.Z.: Sparse principal component analysis via regularized low rank matrix approximation. J. Multivar. Anal. 99(6), 1015–1034 (2008)
Tseng P., Yun S.: A coordinate gradient descent method for nonsmooth separable minimization. Math. Program. 117, 387–423 (2009)
Wright S.J., Nowak R., Figueiredo M.A.T.: Sparse reconstruction by separable approximation. IEEE Trans. Signal Process. 57(3), 2479–2493 (2009)
Zou H., Hastie T., Tibshirani R.: Sparse principal component analysis. J. Comput. Graph. Stat. 15(2), 265–286 (2006)
Additional information
This work was supported in part by an NSERC Discovery Grant.
Cite this article
Lu, Z., Zhang, Y. An augmented Lagrangian approach for sparse principal component analysis. Math. Program. 135, 149–193 (2012). https://doi.org/10.1007/s10107-011-0452-4