
Sparse Principal Component Analysis via Joint L2,1-Norm Penalty

  • Shi Xiaoshuang
  • Lai Zhihui
  • Guo Zhenhua
  • Wan Minghua
  • Zhao Cairong
  • Kong Heng
Part of the Lecture Notes in Computer Science book series (LNCS, volume 8272)

Abstract

Sparse principal component analysis (SPCA) is a popular method for obtaining sparse loadings in principal component analysis (PCA). It formulates PCA as a regression model with a lasso constraint, but the features selected by SPCA are chosen independently for each principal component (PC) and generally differ from one PC to another. We therefore modify the regression model by replacing the elastic net penalty with the L2,1-norm, which encourages row sparsity so that the same features are discarded jointly across all PCs, and we use this new "self-contained" regression model to build a framework for graph embedding methods that obtains sparse loadings via the L2,1-norm. An experiment on the Pitprops data illustrates the row sparsity of the modified regression model for PCA, and an experiment on the YaleB face database demonstrates the effectiveness of the model for PCA in graph embedding.
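For illustration only, the modified regression problem described above can be read as min_{A,B} ||X - X B A^T||_F^2 + lambda ||B||_{2,1} subject to A^T A = I, where ||B||_{2,1} is the sum of the Euclidean norms of the rows of the loading matrix B; a zero row of B means that feature is dropped from every PC at once. The Python sketch below shows one plausible way to solve such a problem by alternating an orthogonal Procrustes update for A with an iteratively reweighted ridge update for B (the standard reweighting trick for L2,1 minimization). The function name, hyperparameters, and alternating scheme are illustrative assumptions, not the authors' exact algorithm.

    # Minimal sketch (assumption: not the paper's exact solver) of sparse PCA
    # with a joint L2,1-norm row-sparsity penalty on the loading matrix B.
    import numpy as np

    def l21_sparse_pca(X, n_components, lam=1.0, n_iter=50, eps=1e-8):
        """Return a row-sparse loading matrix B (features x components) and A."""
        n, p = X.shape
        G = X.T @ X                                   # p x p Gram matrix
        # Initialize A and B with the leading ordinary PCA loadings.
        _, _, Vt = np.linalg.svd(X, full_matrices=False)
        A = Vt[:n_components].T                       # p x k, orthonormal columns
        B = A.copy()
        for _ in range(n_iter):
            # B-step: min ||XA - XB||_F^2 + lam * sum_i ||B[i, :]||_2,
            # via reweighted ridge: B = (G + lam*D)^{-1} G A with
            # D = diag(1 / (2 * row norms of B)).
            d = 1.0 / (2.0 * np.maximum(np.linalg.norm(B, axis=1), eps))
            B = np.linalg.solve(G + lam * np.diag(d), G @ A)
            # A-step: max tr(A^T G B) s.t. A^T A = I  ->  A = U V^T from SVD of G B.
            U, _, Vt2 = np.linalg.svd(G @ B, full_matrices=False)
            A = U @ Vt2
        B[np.linalg.norm(B, axis=1) < 1e-6] = 0.0     # clean up near-zero rows
        return B, A

    # Usage: rows of B that end up exactly zero correspond to features removed
    # from every principal component at once (row sparsity).
    if __name__ == "__main__":
        rng = np.random.default_rng(0)
        X = rng.standard_normal((100, 13))            # synthetic stand-in with 13
        X -= X.mean(axis=0)                           # variables, like Pitprops
        B, A = l21_sparse_pca(X, n_components=3, lam=5.0)
        print(np.linalg.norm(B, axis=1))              # row norms; larger lam zeroes more rows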

Keywords

SPCA · L2,1-norm · row-sparsity



Copyright information

© Springer International Publishing Switzerland 2013

Authors and Affiliations

  • Shi Xiaoshuang (1)
  • Lai Zhihui (1)
  • Guo Zhenhua (1)
  • Wan Minghua (1)
  • Zhao Cairong (1)
  • Kong Heng (1)
  1. Shenzhen Key Laboratory of Broadband Network & Multimedia, Graduate School at Shenzhen, Tsinghua University, Shenzhen, China
