An Algorithm for Iterative Selection of Blocks of Features

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 6331)

Abstract

We focus on the problem of linear regression estimation in high dimension, when the parameter β is "sparse" (most of its coordinates are 0) and "blocky" (β_i and β_{i+1} are likely to be equal). Recently, several authors defined estimators that take this structure into account, such as the Fused-LASSO [19] or the S-LASSO [10], among others. However, there are no theoretical results on these estimators in the case of a general design matrix. Here, we propose an alternative point of view, based on the Iterative Feature Selection method [1]. We propose an iterative algorithm that exploits the fact that β is sparse and blocky, with no prior knowledge of the position of the blocks. Moreover, we give a theoretical result ensuring that every step of the algorithm improves the statistical performance of the estimator. We provide simulations in which our method outperforms LASSO-type methods when the parameter is sparse and blocky, as well as an application to real data (CGH arrays) showing that our estimator can be used on large datasets.
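
To make the setting concrete, here is a minimal sketch of the "sparse and blocky" regression problem together with a naive greedy block-selection heuristic. It illustrates the problem setting only and is not the algorithm analysed in the paper: the function name greedy_block_selection, its selection rule, and all parameter values are assumptions made for this example.

```python
# Illustrative sketch only (NOT the paper's algorithm): fit a linear model
# y = X @ beta + noise, where beta is sparse (mostly zero) and blocky
# (contiguous coordinates share the same value), by greedily adding the
# constant-valued contiguous block that most reduces the residual sum of
# squares at each step.
import numpy as np

rng = np.random.default_rng(0)
n, p = 100, 200

# Ground truth: beta is zero except on two contiguous blocks.
beta = np.zeros(p)
beta[20:35] = 1.5
beta[120:130] = -2.0

X = rng.standard_normal((n, p))
y = X @ beta + 0.5 * rng.standard_normal(n)

def greedy_block_selection(X, y, max_blocks=5, max_width=30, tol=1e-8):
    """Greedy matching pursuit over 'block directions': each candidate
    assigns one common coefficient to features [start, start + width)."""
    n, p = X.shape
    beta_hat = np.zeros(p)
    residual = y.copy()
    for _ in range(max_blocks):
        best = None  # (rss_gain, start, width, coefficient)
        for start in range(p):
            for width in range(1, min(max_width, p - start) + 1):
                # Direction obtained by giving every feature in the
                # block the same coefficient.
                z = X[:, start:start + width].sum(axis=1)
                denom = z @ z
                if denom == 0.0:
                    continue
                coef = (z @ residual) / denom  # least-squares step along z
                gain = coef * (z @ residual)   # resulting drop in RSS
                if best is None or gain > best[0]:
                    best = (gain, start, width, coef)
        if best is None or best[0] < tol:  # no block improves the fit
            break
        _, start, width, coef = best
        beta_hat[start:start + width] += coef
        residual = y - X @ beta_hat
    return beta_hat

beta_hat = greedy_block_selection(X, y)
print("l2 estimation error:", np.linalg.norm(beta_hat - beta))
```

By construction each accepted block strictly decreases the residual sum of squares, which loosely mirrors the paper's guarantee that every iteration improves the estimator; the exhaustive scan over (start, width) pairs costs O(n · p · max_width) per iteration, so this sketch is only meant for moderate dimensions.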

References

  1. Alquier, P.: Iterative feature selection in regression estimation. Annales de l'Institut Henri Poincaré, Probability and Statistics 44(1), 47–88 (2008)

  2. Alquier, P.: LASSO, iterative feature selection and the correlation selector: oracle inequalities and numerical performances. Electron. J. Stat. 2, 1129–1152 (2008)

  3. Bickel, P., Ritov, Y., Tsybakov, A.: Simultaneous analysis of LASSO and Dantzig selector. The Annals of Statistics 37(4), 1705–1732 (2009)

  4. Breiman, L.: Better subset regression using the nonnegative garrote. Technometrics 37, 373–384 (1995)

  5. Bunea, F., Tsybakov, A., Wegkamp, M.: Sparsity oracle inequalities for the LASSO. Electron. J. Stat. 1, 169–194 (2007)

  6. Candès, E., Tao, T.: The Dantzig selector: statistical estimation when p is much larger than n. Ann. Statist. 35 (2007)

  7. Chesneau, C., Hebiri, M.: Some theoretical results on the grouped variables LASSO. Mathematical Methods of Statistics 17(4), 317–326 (2008)

  8. Friedman, J., Hastie, T., Höfling, H., Tibshirani, R.: Pathwise coordinate optimization. Ann. Appl. Statist. 1(2), 302–332 (2007)

  9. Hebiri, M.: Regularization with the smooth-LASSO procedure. Preprint LPMA, arXiv:0803.0668 (2008)

  10. Hebiri, M., Van de Geer, S.: The smooth-LASSO and other ℓ1 + ℓ2-penalized methods. arXiv:1003.4885 (2010)

  11. Höfling, H.: A path algorithm for the fused LASSO signal approximator. Preprint arXiv:0910.0526 (2009)

  12. Huang, J., Salim, A., Lei, K., O'Sullivan, K., Pawitan, Y.: Classification of array CGH data using smoothed logistic regression model. Statistics in Medicine 28(30), 3798–3810 (2009)

  13. Osborne, M., Presnell, B., Turlach, B.: On the LASSO and its dual. J. Comput. Graph. Statist. 9(2), 319–337 (2000)

  14. R Development Core Team: R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing, Vienna, Austria (2008). ISBN 3-900051-07-0

  15. Rapaport, F., Barillot, E., Vert, J.-P.: Classification of array-CGH data using fused SVM. Bioinformatics 24(13), 1375–1382 (2008)

  16. Rinaldo, A.: Properties and refinements of the fused LASSO. The Annals of Statistics 37(5B), 2922–2952 (2009)

  17. Slawski, M., zu Castell, W., Tutz, G.: Feature selection guided by structural information. To appear in The Annals of Applied Statistics

  18. Tibshirani, R.: Regression shrinkage and selection via the LASSO. J. Roy. Statist. Soc. Ser. B 58(1), 267–288 (1996)

  19. Tibshirani, R., Saunders, M., Rosset, S., Zhu, J., Knight, K.: Sparsity and smoothness via the fused LASSO. J. Roy. Statist. Soc. Ser. B 67(1), 91–108 (2005)

  20. Tibshirani, R.J., Taylor, J.: Regularization path for least squares problems with generalized ℓ1 penalties. Preprint (2009)

  21. Van de Geer, S., Bühlmann, P.: On the conditions used to prove oracle results for the LASSO. Electron. J. Stat. 3, 1360–1392 (2009)

  22. Yuan, M., Lin, Y.: Model selection and estimation in regression with grouped variables. J. Roy. Statist. Soc. Ser. B 68(1), 49–67 (2006)

  23. Zhao, P., Rocha, G., Yu, B.: The composite absolute penalties for grouped and hierarchical variable selection. The Annals of Statistics 37(6A), 3468–3497 (2009)

Copyright information

© 2010 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Alquier, P. (2010). An Algorithm for Iterative Selection of Blocks of Features. In: Hutter, M., Stephan, F., Vovk, V., Zeugmann, T. (eds) Algorithmic Learning Theory. ALT 2010. Lecture Notes in Computer Science, vol. 6331. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-16108-7_7

  • DOI: https://doi.org/10.1007/978-3-642-16108-7_7

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-16107-0

  • Online ISBN: 978-3-642-16108-7

  • eBook Packages: Computer Science; Computer Science (R0)
