Statistics and Computing, Volume 26, Issue 5, pp 1059–1077

Extensions of stability selection using subsamples of observations and covariates

  • Andre Beinrucker
  • Ürün Dogan
  • Gilles Blanchard

Abstract

We introduce extensions of stability selection, a method introduced by Meinshausen and Bühlmann (J R Stat Soc 72:417–473, 2010) to stabilise variable selection methods. We propose to apply a base selection method repeatedly to random subsamples of the observations and to random subsets of the covariates under scrutiny, and to select covariates based on their selection frequency. We analyse the effects and benefits of these extensions. Our analysis generalises the theoretical results of Meinshausen and Bühlmann (2010) from the case of half-samples to subsamples of arbitrary size. We study theoretically, using a simplified score model, the effect of taking random covariate subsets. Finally, we validate these extensions in numerical experiments on both synthetic and real datasets, and compare the results in detail to those of the original stability selection method.
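As a rough illustration of the procedure summarised above (a minimal sketch, not the authors' implementation), the following Python snippet repeatedly applies a base selection method to a random subsample of the observations and to a random partition of the covariates, and records each covariate's selection frequency. The function name, its parameters (subsample fraction, number of covariate groups, regularisation strength) and the use of scikit-learn's Lasso as the base selector are all illustrative assumptions.

```python
import numpy as np
from sklearn.linear_model import Lasso

def extended_stability_selection(X, y, n_iter=100, subsample_frac=0.5,
                                 n_groups=2, alpha=0.1, seed=0):
    """Illustrative sketch: stability selection with subsampled
    observations and a random partition of the covariates."""
    rng = np.random.default_rng(seed)
    n, p = X.shape
    counts = np.zeros(p)  # how often each covariate is selected
    for _ in range(n_iter):
        # Random subsample of the observations (without replacement).
        rows = rng.choice(n, size=int(subsample_frac * n), replace=False)
        # Randomly partition the covariates into disjoint groups;
        # each covariate is under scrutiny exactly once per iteration.
        for group in np.array_split(rng.permutation(p), n_groups):
            # Run the base selection method (here: the Lasso) on the block.
            fit = Lasso(alpha=alpha).fit(X[np.ix_(rows, group)], y[rows])
            counts[group[fit.coef_ != 0]] += 1
    return counts / n_iter  # relative selection frequency per covariate

# Toy usage: the first five covariates carry the signal.
rng = np.random.default_rng(0)
X = rng.standard_normal((200, 50))
y = X[:, :5] @ np.ones(5) + 0.5 * rng.standard_normal(200)
freq = extended_stability_selection(X, y)
print(np.where(freq >= 0.6)[0])  # covariates kept at an illustrative threshold
```

Thresholding the returned frequencies yields the stable set; the threshold (0.6 here, purely for illustration) trades false positives against false negatives and is a tuning choice.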

Keywords

Variable selection · Stability selection · Subsampling

Notes

Acknowledgments

We are extremely grateful to Nicolai Meinshausen and Peter Bühlmann for communicating to us the R-code used by Meinshausen and Bühlmann (2010) as well as for numerous discussions. We are indebted to Richard Samworth and Rajen Shah for numerous discussions and for hosting the first author during part of this work. We thank Maurilio Gutzeit for helping us with part of the numerical experiments.

Supplementary material

Supplementary material 1: 11222_2015_9589_MOESM1_ESM.pdf (PDF, 320 KB)

References

  1. Alexander, D.H., Lange, K.: Stability selection for genome-wide association. Genet. Epidemiol. 35(7), 722–728 (2011)
  2. Arlot, S., Celisse, A.: A survey of cross-validation procedures for model selection. Stat. Surv. 4, 40–79 (2010)
  3. Bach, F.R.: Bolasso: model consistent Lasso estimation through the bootstrap. In: Proceedings of 25th International Conference on Machine Learning (ICML), pp. 33–40. ACM (2008)
  4. Beinrucker, A., Dogan, U., Blanchard, G.: Early stopping for mutual information based feature selection. In: Proceedings of 21st International Conference on Pattern Recognition (ICPR), pp. 975–978 (2012a)
  5. Beinrucker, A., Dogan, U., Blanchard, G.: A simple extension of stability feature selection. In: Pattern Recognition, vol. 7476 of Lecture Notes in Computer Science, pp. 256–265. Springer, New York (2012b)
  6. Bi, J., Bennett, K., Embrechts, M., Breneman, C., Song, M.: Dimensionality reduction via sparse support vector machines. J. Mach. Learn. Res. 3, 1229–1243 (2003)
  7. Breiman, L.: Random forests. Mach. Learn. 45(1), 5–32 (2001)
  8. Bühlmann, P., Yu, B.: Analyzing bagging. Ann. Stat. 30(4), 927–961 (2002)
  9. Bühlmann, P., Rütimann, P., van de Geer, S., Zhang, C.-H.: Correlated variables in regression: clustering and sparse estimation. J. Stat. Plan. Inference 143(11), 1835–1858 (2013)
  10. Cover, T.M., Thomas, J.A.: Elements of Information Theory, 2nd edn. Wiley-Interscience, New York (2006)
  11. Efron, B.: Bootstrap methods: another look at the jackknife. Ann. Stat. 7(1), 1–26 (1979)
  12. Embrechts, P., Klüppelberg, C., Mikosch, T.: Modelling Extremal Events: For Insurance and Finance, vol. 33 of Stochastic Modelling and Applied Probability. Springer, New York (1997)
  13. Escudero, G., Marquez, L., Rigau, G.: Boosting applied to word sense disambiguation. In: Proceedings of European Conference on Machine Learning (ECML), pp. 129–141 (2000)
  14. Fleuret, F.: Fast binary feature selection with conditional mutual information. J. Mach. Learn. Res. 5, 1531–1555 (2004)
  15. Friedman, J., Hastie, T., Tibshirani, R.: Regularization paths for generalized linear models via coordinate descent. J. Stat. Softw. 33(1), 1–22 (2010)
  16. Guyon, I.: Feature Extraction: Foundations and Applications, vol. 207. Springer, New York (2006)
  17. Hastie, T., Efron, B.: LARS: Least Angle Regression, Lasso and Forward Stagewise (2012). http://CRAN.R-project.org/package=lars. R package version 1.1
  18. Haury, A.-C., Mordelet, F., Vera-Licona, P., Vert, J.-P.: Tigress: trustful inference of gene regulation using stability selection. BMC Syst. Biol. 6(1), 145 (2012)
  19. He, Q., Lin, D.-Y.: A variable selection method for genome-wide association studies. Bioinformatics 27(1), 1–8 (2011)
  20. He, Z., Yu, W.: Stable feature selection for biomarker discovery. Comput. Biol. Chem. 34(4), 215–225 (2010)
  21. Hinton, G.E., Srivastava, N., Krizhevsky, A., Sutskever, I., Salakhutdinov, R.R.: Improving neural networks by preventing co-adaptation of feature detectors. arXiv preprint arXiv:1207.0580 (2012)
  22. Leadbetter, M.R., Lindgren, G., Rootzén, H.: Extremes and Related Properties of Random Sequences and Processes. Springer Series in Statistics. Springer, New York (1983)
  23. LeCun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proc. IEEE 86(11), 2278–2324 (1998)
  24. Lounici, K.: Sup-norm convergence rate and sign concentration property of Lasso and Dantzig estimators. Electron. J. Stat. 2, 90–102 (2008)
  25. MASH Consortium: The MASH project. http://www.mash-project.eu (2012). [Online; Accessed 19 Mar 2013]
  26. Meinshausen, N., Bühlmann, P.: Stability selection. J. R. Stat. Soc. 72(4), 417–473 (2010)
  27. Meinshausen, N., Yu, B.: Lasso-type recovery of sparse representations for high-dimensional data. Ann. Stat. 37(1), 246–270 (2009)
  28. Politis, D.N., Romano, J.P., Wolf, M.: Subsampling. Springer Series in Statistics. Springer, New York (1999)
  29. Sauerbrei, W., Schumacher, M.: A bootstrap resampling procedure for model building: application to the Cox regression model. Stat. Med. 11(16), 2093–2109 (1992)
  30. Schapire, R., Singer, Y.: Improved boosting algorithms using confidence-rated predictions. Mach. Learn. 37(3), 297–336 (1999)
  31. Shah, R.D., Samworth, R.J.: Variable selection with error control: another look at stability selection. J. R. Stat. Soc. 75(1), 55–80 (2013)
  32. Tibshirani, R.: Regression shrinkage and selection via the lasso. J. R. Stat. Soc. 58(1), 267–288 (1996)
  33. Wang, S., Nan, B., Rosset, S., Zhu, J.: Random Lasso. Ann. Appl. Stat. 5(1), 468–485 (2011)

Copyright information

© Springer Science+Business Media New York 2015

Authors and Affiliations

  • Andre Beinrucker (1)
  • Ürün Dogan (2)
  • Gilles Blanchard (1)

  1. University of Potsdam, Potsdam, Germany
  2. Microsoft/Skype Labs, London, UK
