An Algorithm for Iterative Selection of Blocks of Features
We focus on the problem of linear regression estimation in high dimension, when the parameter β is "sparse" (most of its coordinates are 0) and "blocky" (β_i and β_{i+1} are likely to be equal). Recently, some authors have defined estimators that take this information into account, such as the Fused-LASSO or the S-LASSO, among others. However, there are no theoretical results about these estimators in the general design matrix case. Here, we propose an alternative point of view, based on the Iterative Feature Selection method. We propose an iterative algorithm that takes into account the fact that β is sparse and blocky, with no prior knowledge of the position of the blocks. Moreover, we give a theoretical result ensuring that every step of our algorithm actually improves the statistical performance of the obtained estimator. We provide some simulations in which our method outperforms LASSO-type methods when the parameter is sparse and blocky. Finally, we give an application to real data (CGH arrays), which shows that our estimator can be used on large datasets.
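To make the "sparse and blocky" setting concrete, the following is a minimal sketch of a simulated high-dimensional regression with such a parameter. It is not the iterative algorithm proposed in the paper. The helper names, the dimensions, the block positions and the noise level are all illustrative assumptions, and the penalty shown is the usual fused-LASSO-type form (an ℓ1 term on the coordinates plus an ℓ1 term on successive differences).

```python
import numpy as np


def make_blocky_sparse_beta(p, blocks):
    """Build a length-p vector that is zero outside the given blocks.

    `blocks` is a list of (start, length, value) triples; this format is
    a hypothetical convention chosen for the illustration.
    """
    beta = np.zeros(p)
    for start, length, value in blocks:
        beta[start:start + length] = value
    return beta


def fused_lasso_penalty(beta, lam1, lam2):
    """l1 norm of beta (sparsity) plus l1 norm of its successive differences (blockiness)."""
    return lam1 * np.abs(beta).sum() + lam2 * np.abs(np.diff(beta)).sum()


# Illustrative sizes: far more coordinates (p) than observations (n).
rng = np.random.default_rng(0)
n, p = 50, 200

# Two constant blocks; every other coordinate is exactly zero.
beta = make_blocky_sparse_beta(p, [(20, 10, 2.0), (120, 15, -1.5)])

# Gaussian design and noisy responses y = X beta + noise (assumed noise level 0.5).
X = rng.standard_normal((n, p))
y = X @ beta + 0.5 * rng.standard_normal(n)

n_nonzero = int(np.count_nonzero(beta))        # sparsity: number of nonzero coordinates
n_jumps = int(np.count_nonzero(np.diff(beta)))  # blockiness: number of indices where beta changes
penalty = fused_lasso_penalty(beta, lam1=1.0, lam2=1.0)
```

A vector of this kind has few nonzero coordinates and even fewer jumps between successive coordinates, which is the structure that the Fused-LASSO and the S-LASSO penalize and that the method described above aims to exploit without knowing where the blocks are.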
Keywords: Feature Selection, Sparsity, Linear Regression, Grouped Variables, ArrayCGH
- 2. Alquier, P.: LASSO, iterative feature selection and the correlation selector: Oracle inequalities and numerical performances. Electron. J. Stat., 1129–1152 (2008)
- 6. Candès, E., Tao, T.: The Dantzig selector: statistical estimation when p is much larger than n. Ann. Statist. 35 (2007)
- 9. Hebiri, M.: Regularization with the smooth-LASSO procedure. Preprint LPMA, arXiv:0803.0668 (2008)
- 10. Hebiri, M., Van de Geer, S.: The smooth-lasso and other ℓ1 + ℓ2-penalized methods. arXiv:1003.4885 (2010)
- 11. Hoefling, H.: A path algorithm for the fused LASSO signal approximator. Preprint arXiv:0910.0526 (2009)
- 14. R Development Core Team: R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing, Vienna, Austria (2008). ISBN 3-900051-07-0
- 17. Slawski, M., zu Castell, W., Tutz, G.: Feature selection guided by structural information. To appear in the Annals of Applied Statistics
- 20. Tibshirani, R.J., Taylor, J.: Regularization path for least squares problems with generalized ℓ1 penalties (2009) (preprint)