Abstract
Partial least squares (PLS) regression combines dimensionality reduction and prediction using a latent variable model. It provides better predictive ability than principal component analysis because the dimension reduction takes into account both the independent and the response variables. However, PLS is prone to overfitting when there are few samples but many variables. We formulate a new criterion for sparse PLS by adding a structured sparsity constraint to the global SIMPLS optimization. The constraint is a sparsity-inducing norm, which selects the important variables shared among all the components. The optimization is solved by an augmented Lagrangian method, which obtains the PLS components and performs variable selection simultaneously. We propose a novel greedy algorithm to overcome the computational difficulties. Experiments demonstrate that our approach to PLS regression attains better performance with fewer selected predictors.
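To make the idea of variables "shared among all the components" concrete, here is a minimal NumPy sketch. It is not the augmented Lagrangian or greedy algorithm of the chapter. It assumes, as an illustration only, that the sparsity-inducing norm is a row-wise group norm (the sum of the Euclidean norms of the rows of the weight matrix), which the abstract does not specify. The function names `first_pls_weight` and `group_soft_threshold` and the toy matrix `W` are hypothetical.

```python
import numpy as np

def first_pls_weight(X, Y):
    """First PLS weight vector: dominant left singular vector of X^T Y
    computed on column-centered data."""
    Xc = X - X.mean(axis=0)
    Yc = Y - Y.mean(axis=0)
    U, _, _ = np.linalg.svd(Xc.T @ Yc, full_matrices=False)
    return U[:, 0]

def group_soft_threshold(W, lam):
    """Proximal operator of lam * sum_j ||W[j, :]||_2 (assumed penalty).
    Each row of W holds the weights of one variable across all components,
    so a row is either shrunk as a whole or set to zero as a whole."""
    norms = np.linalg.norm(W, axis=1, keepdims=True)
    scale = np.maximum(0.0, 1.0 - lam / np.maximum(norms, 1e-12))
    return scale * W

# Toy weight matrix: 3 variables (rows) by 2 components (columns).
W = np.array([[3.0, 4.0],
              [0.3, 0.4],
              [0.0, 0.0]])
W_sparse = group_soft_threshold(W, lam=1.0)

# A variable is selected only if its whole row survives the thresholding.
selected = np.flatnonzero(np.linalg.norm(W_sparse, axis=1) > 0)
```

Because the threshold acts on entire rows, a variable is dropped from every component at once, which is what distinguishes this kind of global selection from penalizing each component's weight vector separately.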
Acknowledgements
We would like to give special thanks to Douglas Rutledge, professor at AgroParisTech, for sharing his expert knowledge of chemometrics to interpret the variables selected in the octane data.
Copyright information
© 2013 Springer Science+Business Media New York
About this paper
Cite this paper
Liu, TY., Trinchera, L., Tenenhaus, A., Wei, D., Hero, A.O. (2013). Globally Sparse PLS Regression. In: Abdi, H., Chin, W., Esposito Vinzi, V., Russolillo, G., Trinchera, L. (eds) New Perspectives in Partial Least Squares and Related Methods. Springer Proceedings in Mathematics & Statistics, vol 56. Springer, New York, NY. https://doi.org/10.1007/978-1-4614-8283-3_7
DOI: https://doi.org/10.1007/978-1-4614-8283-3_7
Publisher Name: Springer, New York, NY
Print ISBN: 978-1-4614-8282-6
Online ISBN: 978-1-4614-8283-3
eBook Packages: Mathematics and Statistics (R0)