Abstract
In this paper, we study the application of sparse principal component analysis (PCA) to clustering and feature selection problems. Sparse PCA seeks sparse factors, or linear combinations of the data variables, that explain a maximum amount of variance in the data while having only a limited number of nonzero coefficients. PCA is often used as a simple clustering technique, and sparse factors allow us here to interpret the clusters in terms of a reduced set of variables. We begin with a brief introduction to and motivation for sparse PCA, and detail our implementation of the algorithm in d’Aspremont et al. (SIAM Rev 49(3):434–448, 2007). We then apply these results to some classic clustering and feature selection problems arising in biology.
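For readers who want a concrete starting point, the following is a minimal Python sketch of the l1-penalized semidefinite relaxation behind this approach (d’Aspremont et al. 2007): maximize Tr(Sigma X) - rho * sum|X_ij| subject to Tr(X) = 1 and X positive semidefinite. It is an illustration only, not the paper's implementation; it hands the relaxation to the off-the-shelf cvxpy modeling package and a generic SDP solver rather than the smooth first-order method of Nesterov (2005) used in the paper, and the toy data, problem sizes, and penalty value rho = 0.5 are assumptions chosen for the example.

    import numpy as np
    import cvxpy as cp  # generic convex-optimization front end (assumed available)

    np.random.seed(0)
    A = np.random.randn(20, 5)      # toy data: 20 samples, 5 variables (illustrative)
    Sigma = A.T @ A / 20.0          # sample covariance matrix
    rho = 0.5                       # l1 penalty controlling sparsity (assumed value)

    # Semidefinite relaxation: max Tr(Sigma X) - rho * sum of |X_ij|
    # subject to Tr(X) = 1 and X positive semidefinite.
    X = cp.Variable((5, 5), symmetric=True)
    objective = cp.Maximize(cp.trace(Sigma @ X) - rho * cp.sum(cp.abs(X)))
    problem = cp.Problem(objective, [X >> 0, cp.trace(X) == 1])
    problem.solve()

    # The leading eigenvector of the optimal X approximates the sparse factor;
    # near-zero entries indicate variables dropped from the component.
    eigvals, eigvecs = np.linalg.eigh(X.value)
    print(np.round(eigvecs[:, -1], 3))

Increasing rho drives more loadings to (near) zero, trading explained variance for interpretability, which is the trade-off exploited in the clustering and gene selection experiments.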
References
Alizadeh A, Eisen M, Davis R, Ma C, Lossos I, Rosenwald A (2000) Distinct types of diffuse large B-cell lymphoma identified by gene expression profiling. Nature 403:503–511
Alon U, Barkai N, Notterman DA, Gish K, Ybarra S, Mack D, Levine AJ (1999) Broad patterns of gene expression revealed by clustering analysis of tumor and normal colon tissues probed by oligonucleotide arrays. Proc Natl Acad Sci USA 96:6745–6750
Cadima J, Jolliffe IT (1995) Loadings and correlations in the interpretation of principal components. J Appl Stat 22:203–214
Candès EJ, Tao T (2005) Decoding by linear programming. IEEE Trans Inf Theory 51(12):4203–4215
d’Aspremont A (2005) Smooth optimization with approximate gradient. arXiv:math.OC/0512344
d’Aspremont A, El Ghaoui L, Jordan MI, Lanckriet GRG (2007) A direct formulation for sparse PCA using semidefinite programming. SIAM Rev 49(3):434–448
Donoho DL, Tanner J (2005) Sparse nonnegative solutions of underdetermined linear equations by linear programming. Proc Natl Acad Sci 102(27):9446–9451
Guyon I, Weston J, Barnhill S, Vapnik V (2002) Gene selection for cancer classification using support vector machines. Mach Learn 46:389–422
Huang TM, Kecman V (2005) Gene extraction for cancer diagnosis by support vector machines-an improvement. Artif Intell Med 35:185–194
Jolliffe IT, Trendafilov NT, Uddin M (2003) A modified principal component technique based on the LASSO. J Comput Graph Stat 12:531–547
Moghaddam B, Weiss Y, Avidan S (2006a) Generalized spectral bounds for sparse LDA. In: International conference on machine learning
Moghaddam B, Weiss Y, Avidan S (2006b) Spectral bounds for sparse PCA: exact and greedy algorithms. Adv Neural Inf Process Syst 18
Moler C, Van Loan C (2003) Nineteen dubious ways to compute the exponential of a matrix, twenty-five years later. SIAM Rev 45(1):3–49
Nesterov Y (1983) A method of solving a convex programming problem with convergence rate O(1/k²). Sov Math Dokl 27(2):372–376
Nesterov Y (2005) Smooth minimization of non-smooth functions. Math Program 103(1):127–152
Pataki G (1998) On the rank of extreme matrices in semidefinite programs and the multiplicity of optimal eigenvalues. Math Oper Res 23(2):339–358
Srebro N, Shakhnarovich G, Roweis S (2006) An investigation of computational and informational limits in Gaussian mixture clustering. In: Proceedings of the 23rd international conference on machine learning, pp 865–872
Su Y, Murali TM, Pavlovic V, Schaffer M, Kasif S (2003) RankGene: identification of diagnostic genes based on expression data. Bioinformatics 19:1578–1579
Sturm JF (1999) Using SeDuMi 1.02, a MATLAB toolbox for optimization over symmetric cones. Optim Methods Softw 11:625–653
Tibshirani R (1996) Regression shrinkage and selection via the LASSO. J R Stat Soc Ser B 58(1):267–288
Vapnik V (1995) The nature of statistical learning theory. Springer, Berlin
Zhang Z, Zha H, Simon H (2002) Low rank approximations with sparse factors I: basic algorithms and error analysis. SIAM J Matrix Anal Appl 23(3):706–727
Zhang Z, Zha H, Simon H (2004) Low rank approximations with sparse factors II: penalized methods with discrete Newton-like iterations. SIAM J Matrix Anal Appl 25(4):901–920
Zou H, Hastie T (2005) Regularization and variable selection via the elastic net. J R Stat Soc Ser B Stat Methodol 67(2):301–320
Zou H, Hastie T, Tibshirani R (2006) Sparse principal component analysis. J Comput Graph Stat 15(2):265–286
Cite this article
Luss, R., d’Aspremont, A. Clustering and feature selection using sparse principal component analysis. Optim Eng 11, 145–157 (2010). https://doi.org/10.1007/s11081-008-9057-z