Abstract
We use the Lasso, its adaptive variant, or its thresholded variant as a procedure for variable selection. This essentially means that, for \( S_0 := \{j : \beta^0_j \neq 0\} \) being the true active set, we look for a Lasso procedure delivering an estimator \( \hat{S} \) of \( S_0 \) such that \( \hat{S} = S_0 \) with large probability. However, it is clear that very small coefficients \( |\beta^0_j| \) cannot be detected by any method. Moreover, irrepresentable conditions show that the Lasso, or any weighted variant, typically selects too many variables. In other words, unless one imposes very strong conditions, false positives cannot be avoided either. We therefore aim at estimators with oracle prediction error that nevertheless have few false positives. The latter is considered achieved when \( |\hat{S} \setminus S_*| = O(|S_*|) \), where \( S_* \subset S_0 \) is the set of coefficients the oracle would select. We show that the adaptive Lasso procedure, and also thresholding the initial Lasso, reaches this aim, assuming sparse eigenvalues or, alternatively, so-called "beta-min" conditions.
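To illustrate the thresholded-Lasso idea described above, the following is a minimal numpy sketch (not the book's code): a Lasso estimator is computed by cyclic coordinate descent, its support \( \hat{S} \) is read off, and then small coefficients are discarded by thresholding to reduce false positives. The penalty level `lam`, the threshold `0.5`, and the simulated design are illustrative choices, not recommendations from the chapter.

```python
import numpy as np

def lasso_cd(X, y, lam, n_iter=200):
    """Lasso via cyclic coordinate descent.

    Minimizes (1/(2n)) * ||y - X b||^2 + lam * ||b||_1.
    """
    n, p = X.shape
    beta = np.zeros(p)
    col_sq = (X ** 2).sum(axis=0) / n   # per-column curvature X_j'X_j / n
    r = y - X @ beta                    # current residual
    for _ in range(n_iter):
        for j in range(p):
            # univariate problem in coordinate j, holding the others fixed
            rho = X[:, j] @ r / n + col_sq[j] * beta[j]
            new = np.sign(rho) * max(abs(rho) - lam, 0.0) / col_sq[j]
            r += X[:, j] * (beta[j] - new)  # update residual incrementally
            beta[j] = new
    return beta

# Toy sparse regression: 3 active variables out of 30.
rng = np.random.default_rng(0)
n, p, s0 = 100, 30, 3
X = rng.standard_normal((n, p))
beta0 = np.zeros(p)
beta0[:s0] = 2.0
y = X @ beta0 + 0.5 * rng.standard_normal(n)

beta_hat = lasso_cd(X, y, lam=0.1)
S_lasso = {j for j in range(p) if beta_hat[j] != 0}          # initial Lasso support
S_thresh = {j for j in range(p) if abs(beta_hat[j]) > 0.5}   # thresholded support
```

By construction, thresholding can only shrink the selected set, so \( \hat{S}_{\text{thresh}} \subseteq \hat{S}_{\text{Lasso}} \); the point of the chapter's analysis is that, under sparse-eigenvalue or beta-min conditions, the retained set is of the same order as the oracle's.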
© 2011 Springer-Verlag Berlin Heidelberg
Cite this chapter
Bühlmann, P., van de Geer, S. (2011). Variable selection with the Lasso. In: Statistics for High-Dimensional Data. Springer Series in Statistics. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-20192-9_7
DOI: https://doi.org/10.1007/978-3-642-20192-9_7
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-20191-2
Online ISBN: 978-3-642-20192-9
eBook Packages: Mathematics and Statistics (R0)