An Algorithm for Iterative Selection of Blocks of Features

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 6331)

Abstract

We focus on the problem of linear regression estimation in high dimension, when the parameter β is "sparse" (most of its coordinates are 0) and "blocky" (β_i and β_{i+1} are likely to be equal). Recently, several authors defined estimators that take this structure into account, such as the Fused-LASSO [19] or the S-LASSO [10], among others. However, there are no theoretical results on these estimators in the case of a general design matrix. Here, we propose an alternative point of view, based on the Iterative Feature Selection method [1]. We propose an iterative algorithm that exploits the fact that β is sparse and blocky, with no prior knowledge of the position of the blocks. Moreover, we give a theoretical result ensuring that every step of the algorithm improves the statistical performance of the estimator. We provide simulations in which our method outperforms LASSO-type methods when the parameter is sparse and blocky, as well as an application to real data (CGH arrays) showing that our estimator can be used on large datasets.
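
To make the setting concrete, here is a minimal sketch of the "sparse and blocky" regression problem together with a naive greedy block-selection heuristic. It illustrates the problem setting only and is not the algorithm analysed in the paper: the function name greedy_block_selection, its selection rule, and all parameter values are assumptions made for this example.

```python
# Illustrative sketch only (NOT the paper's algorithm): fit a linear model
# y = X @ beta + noise, where beta is sparse (mostly zero) and blocky
# (contiguous coordinates share the same value), by greedily adding the
# constant-valued contiguous block that most reduces the residual sum of
# squares at each step.
import numpy as np

rng = np.random.default_rng(0)
n, p = 100, 200

# Ground truth: beta is zero except on two contiguous blocks.
beta = np.zeros(p)
beta[20:35] = 1.5
beta[120:130] = -2.0

X = rng.standard_normal((n, p))
y = X @ beta + 0.5 * rng.standard_normal(n)

def greedy_block_selection(X, y, max_blocks=5, max_width=30, tol=1e-8):
    """Greedy matching pursuit over 'block directions': each candidate
    assigns one common coefficient to features [start, start + width)."""
    n, p = X.shape
    beta_hat = np.zeros(p)
    residual = y.copy()
    for _ in range(max_blocks):
        best = None  # (rss_gain, start, width, coefficient)
        for start in range(p):
            for width in range(1, min(max_width, p - start) + 1):
                # Direction obtained by giving every feature in the
                # block the same coefficient.
                z = X[:, start:start + width].sum(axis=1)
                denom = z @ z
                if denom == 0.0:
                    continue
                coef = (z @ residual) / denom  # least-squares step along z
                gain = coef * (z @ residual)   # resulting drop in RSS
                if best is None or gain > best[0]:
                    best = (gain, start, width, coef)
        if best is None or best[0] < tol:  # no block improves the fit
            break
        _, start, width, coef = best
        beta_hat[start:start + width] += coef
        residual = y - X @ beta_hat
    return beta_hat

beta_hat = greedy_block_selection(X, y)
print("l2 estimation error:", np.linalg.norm(beta_hat - beta))
```

By construction each accepted block strictly decreases the residual sum of squares, which loosely mirrors the paper's guarantee that every iteration improves the estimator; the exhaustive scan over (start, width) pairs costs O(n · p · max_width) per iteration, so this sketch is only meant for moderate dimensions.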

References

  1. Alquier, P.: Iterative feature selection in regression estimation. Annales de l'Institut Henri Poincaré, Probability and Statistics 44(1), 47–88 (2008)

  2. Alquier, P.: LASSO, iterative feature selection and the correlation selector: oracle inequalities and numerical performances. Electron. J. Stat. 2, 1129–1152 (2008)

  3. Bickel, P., Ritov, Y., Tsybakov, A.: Simultaneous analysis of LASSO and Dantzig selector. The Annals of Statistics 37(4), 1705–1732 (2009)

  4. Breiman, L.: Better subset regression using the nonnegative garrote. Technometrics 37, 373–384 (1995)

  5. Bunea, F., Tsybakov, A., Wegkamp, M.: Sparsity oracle inequalities for the LASSO. Electron. J. Stat. 1, 169–194 (2007)

  6. Candès, E., Tao, T.: The Dantzig selector: statistical estimation when p is much larger than n. Ann. Statist. 35 (2007)

  7. Chesneau, C., Hebiri, M.: Some theoretical results on the grouped variables LASSO. Mathematical Methods of Statistics 17(4), 317–326 (2008)

  8. Friedman, J., Hastie, T., Höfling, H., Tibshirani, R.: Pathwise coordinate optimization. Ann. Appl. Statist. 1(2), 302–332 (2007)

  9. Hebiri, M.: Regularization with the smooth-LASSO procedure. Preprint LPMA, arXiv:0803.0668 (2008)

  10. Hebiri, M., Van de Geer, S.: The smooth-LASSO and other ℓ1 + ℓ2-penalized methods. arXiv:1003.4885 (2010)

  11. Höfling, H.: A path algorithm for the fused LASSO signal approximator. Preprint arXiv:0910.0526 (2009)

  12. Huang, J., Salim, A., Lei, K., O'Sullivan, K., Pawitan, Y.: Classification of array CGH data using smoothed logistic regression model. Statistics in Medicine 28(30), 3798–3810 (2009)

  13. Osborne, M., Presnell, B., Turlach, B.: On the LASSO and its dual. J. Comput. Graph. Statist. 9(2), 319–337 (2000)

  14. R Development Core Team: R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing, Vienna, Austria (2008). ISBN 3-900051-07-0

  15. Rapaport, F., Barillot, E., Vert, J.-P.: Classification of array-CGH data using fused SVM. Bioinformatics 24(13), 1375–1382 (2008)

  16. Rinaldo, A.: Properties and refinements of the fused LASSO. The Annals of Statistics 37(5B), 2922–2952 (2009)

  17. Slawski, M., zu Castell, W., Tutz, G.: Feature selection guided by structural information. To appear in The Annals of Applied Statistics

  18. Tibshirani, R.: Regression shrinkage and selection via the LASSO. J. Roy. Statist. Soc. Ser. B 58(1), 267–288 (1996)

  19. Tibshirani, R., Saunders, M., Rosset, S., Zhu, J., Knight, K.: Sparsity and smoothness via the fused LASSO. J. Roy. Statist. Soc. Ser. B 67(1), 91–108 (2005)

  20. Tibshirani, R.J., Taylor, J.: Regularization path for least squares problems with generalized ℓ1 penalties. Preprint (2009)

  21. Van de Geer, S., Bühlmann, P.: On the conditions used to prove oracle results for the LASSO. Electron. J. Stat. 3, 1360–1392 (2009)

  22. Yuan, M., Lin, Y.: Model selection and estimation in regression with grouped variables. J. Roy. Statist. Soc. Ser. B 68(1), 49–67 (2006)

  23. Zhao, P., Rocha, G., Yu, B.: The composite absolute penalties for grouped and hierarchical variable selection. The Annals of Statistics 37(6A), 3468–3497 (2009)

Copyright information

© 2010 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Alquier, P. (2010). An Algorithm for Iterative Selection of Blocks of Features. In: Hutter, M., Stephan, F., Vovk, V., Zeugmann, T. (eds) Algorithmic Learning Theory. ALT 2010. Lecture Notes in Computer Science, vol. 6331. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-16108-7_7

  • DOI: https://doi.org/10.1007/978-3-642-16108-7_7

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-16107-0

  • Online ISBN: 978-3-642-16108-7

  • eBook Packages: Computer Science; Computer Science (R0)
