Variable Selection

  • Wolfgang Karl Härdle
  • Léopold Simar
Chapter

Abstract

Variable selection is central to statistical modeling. We are often interested not only in using a model for prediction but also in correctly identifying the relevant variables, that is, in recovering the correct model under given assumptions. It is known that, under certain conditions, the ordinary least squares (OLS) method produces poor prediction results and does not yield a parsimonious model, which leads to overfitting. The objective of variable selection methods is therefore to find the variables that are most relevant for prediction. Such methods are particularly important when the true underlying model has a sparse representation (many parameters close to zero). Identifying the relevant variables reduces the noise and thereby improves the prediction performance of the fitted model.
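
The sparsity argument can be illustrated with a minimal sketch. It assumes an orthonormal design, in which an L1-penalized least squares fit reduces to soft-thresholding the OLS coefficients. The abstract does not name a particular selection method, so the penalty level lam, the function names, and the coefficient values below are hypothetical choices made only for illustration.

```python
def soft_threshold(z, lam):
    """Shrink z toward zero by lam; values with |z| <= lam become exactly 0."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0


def select_variables(ols_coefs, lam):
    """Return the shrunken coefficients and the indices of retained variables.

    Assumes an orthonormal design, so each coefficient can be thresholded
    independently of the others.
    """
    shrunk = [soft_threshold(b, lam) for b in ols_coefs]
    selected = [j for j, b in enumerate(shrunk) if b != 0.0]
    return shrunk, selected


# Hypothetical OLS estimates: two strong signals and three near-zero terms.
ols_coefs = [2.5, -0.1, 0.05, -1.75, 0.2]
shrunk, selected = select_variables(ols_coefs, lam=0.25)
print(shrunk)    # [2.25, 0.0, 0.0, -1.5, 0.0]
print(selected)  # [0, 3]
```

The three coefficients close to zero are set exactly to zero, leaving a parsimonious model with two variables. The retained coefficients are also shrunk by lam, which is the price paid for the selection.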

Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  1. Ladislaus von Bortkiewicz Chair of Statistics, Humboldt-Universität zu Berlin, Berlin, Germany
  2. Institute of Statistics, Biostatistics and Actuarial Sciences, Université Catholique de Louvain, Louvain-la-Neuve, Belgium
