An Algorithm for Iterative Selection of Blocks of Features

  • Pierre Alquier
Part of the Lecture Notes in Computer Science book series (LNCS, volume 6331)


We focus on the problem of linear regression estimation in high dimension, when the parameter β is "sparse" (most of its coordinates are 0) and "blocky" (β_i and β_{i+1} are likely to be equal). Recently, some authors have defined estimators that take this information into account, such as the Fused-LASSO [19] or the S-LASSO [10], among others. However, there are no theoretical results about these estimators for a general design matrix. Here, we propose an alternative point of view, based on the Iterative Feature Selection method [1]. We propose an iterative algorithm that takes into account the fact that β is sparse and blocky, with no prior knowledge of the position of the blocks. Moreover, we give a theoretical result ensuring that every step of our algorithm actually improves the statistical performance of the resulting estimator. We provide some simulations in which our method outperforms LASSO-type methods when the parameter is sparse and blocky. Finally, we give an application to real data (CGH arrays), which shows that our estimator can be used on large datasets.
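To make the two structural assumptions concrete, here is a minimal Python sketch of a parameter that is both sparse and blocky, together with simulated regression data. It is an illustration only and not the algorithm proposed in the paper. The helper names (`make_blocky_beta`, `sparsity`, `n_jumps`), the dimensions, the block positions, and the noise level are all hypothetical choices made for this example.

```python
import numpy as np

def make_blocky_beta(p, blocks):
    """Return a length-p vector that is zero except on the given blocks.

    `blocks` is a list of (start, stop, value) triples; `stop` is exclusive.
    (Hypothetical helper, not taken from the paper.)
    """
    beta = np.zeros(p)
    for start, stop, value in blocks:
        beta[start:stop] = value
    return beta

def sparsity(beta):
    """Number of nonzero coordinates of beta ("sparse" means this is small)."""
    return int(np.count_nonzero(beta))

def n_jumps(beta):
    """Number of indices i with beta[i] != beta[i+1] ("blocky" means this is small)."""
    return int(np.count_nonzero(np.diff(beta)))

# Assumed setting: n = 50 observations, p = 200 features (high dimension, p > n).
rng = np.random.default_rng(0)
n, p = 50, 200
beta = make_blocky_beta(p, [(20, 30, 2.0), (120, 140, -1.5)])

# Linear model Y = X beta + noise, with an arbitrary Gaussian design matrix.
X = rng.standard_normal((n, p))
Y = X @ beta + 0.5 * rng.standard_normal(n)

print("nonzero coordinates:", sparsity(beta))
print("jumps between neighbours:", n_jumps(beta))
```

In this sketch only 30 of the 200 coordinates are nonzero, and neighbouring coordinates differ at just 4 positions, which is the kind of structure that the estimators discussed in the abstract are meant to exploit.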


Keywords: Feature Selection; Sparsity; Linear Regression; Grouped Variables; ArrayCGH




References

  1. Alquier, P.: Iterative feature selection in regression estimation. Annales de l’Institut Henri Poincaré, Probability and Statistics 44(1), 47–88 (2008)
  2. Alquier, P.: LASSO, iterative feature selection and the correlation selector: Oracle inequalities and numerical performances. Electron. J. Stat., 1129–1152 (2008)
  3. Bickel, P., Ritov, Y., Tsybakov, A.: Simultaneous analysis of LASSO and Dantzig selector. The Annals of Statistics 37(4), 1705–1732 (2009)
  4. Breiman, L.: Better subset regression using the nonnegative garrote. Technometrics 37, 373–384 (1995)
  5. Bunea, F., Tsybakov, A., Wegkamp, M.: Sparsity oracle inequalities for the lasso. Electron. J. Stat. 1, 169–194 (2007)
  6. Candès, E., Tao, T.: The Dantzig selector: statistical estimation when p is much larger than n. Ann. Statist. 35 (2007)
  7. Chesneau, C., Hebiri, M.: Some theoretical results on the grouped variables lasso. Mathematical Methods of Statistics 17(4), 317–326 (2008)
  8. Friedman, J., Hastie, T., Höfling, H., Tibshirani, R.: Pathwise coordinate optimization. Ann. Appl. Statist. 1(2), 302–332 (2007)
  9. Hebiri, M.: Regularization with the smooth-LASSO procedure. Preprint LPMA, arXiv:0803.0668 (2008)
  10. Hebiri, M., Van de Geer, S.: The smooth-lasso and other ℓ1 + ℓ2-penalized methods. arXiv:1003.4885 (2010)
  11. Hoefling, H.: A path algorithm for the fused LASSO signal approximator. Preprint arXiv:0910.0526 (2009)
  12. Huang, J., Salim, A., Lei, K., O’Sullivan, K., Pawitan, Y.: Classification of array CGH data using smoothed logistic regression model. Statistics in Medicine 28(30), 3798–3810 (2009)
  13. Osborne, M., Presnell, B., Turlach, B.: On the LASSO and its dual. J. Comput. Graph. Statist. 9(2), 319–337 (2000)
  14. R Development Core Team: R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing, Vienna, Austria (2008) ISBN 3-900051-07-0
  15. Rapaport, F., Barillot, E., Vert, J.-P.: Classification of array-CGH data using fused SVM. Bioinformatics 24(13), 1375–1382 (2008)
  16. Rinaldo, A.: Properties and refinements of the fused LASSO. The Annals of Statistics 37(5B), 2922–2952 (2009)
  17. Slawski, M., zu Castell, W., Tutz, G.: Feature selection guided by structural information. To appear in the Annals of Applied Statistics
  18. Tibshirani, R.: Regression shrinkage and selection via the LASSO. J. Roy. Statist. Soc. Ser. B 58(1), 267–288 (1996)
  19. Tibshirani, R., Saunders, M., Rosset, S., Zhu, J., Knight, K.: Sparsity and smoothness via the fused lasso. JRSS-B 67(1), 91–108 (2005)
  20. Tibshirani, R.J., Taylor, J.: Regularization path for least squares problems with generalized ℓ1 penalties (2009) (preprint)
  21. Van de Geer, S., Bühlmann, P.: On the conditions used to prove oracle results for the lasso. Electronic Journal of Statistics 3, 1360–1392 (2009)
  22. Yuan, M., Lin, Y.: Model selection and estimation in regression with grouped variables. JRSS-B 68(1), 49–67 (2006)
  23. Zhao, P., Rocha, G., Yu, B.: The composite absolute penalties for grouped and hierarchical variable selection. The Annals of Statistics 37(6A), 3468–3497 (2009)

Copyright information

© Springer-Verlag Berlin Heidelberg 2010

Authors and Affiliations

  • Pierre Alquier (1, 2)
  1. LPMA (University Paris 7), Paris, France
  2. CREST (ENSAE)
