Model Selection by Bootstrap Penalization for Classification

  • Magalie Fromont
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 3120)


We consider the binary classification problem. Given an i.i.d. sample drawn from the distribution of an \(\mathcal{X}\times\{0,1\}\)-valued random pair, we propose to estimate the so-called Bayes classifier by minimizing the sum of the empirical classification error and a penalty term based on Efron’s bootstrap or i.i.d. weighted bootstrap samples of the data. We obtain exponential inequalities for such bootstrap-type penalties, which allow us to derive non-asymptotic properties for the corresponding estimators. In particular, we prove that these estimators achieve the global minimax risk over sets of functions built from Vapnik-Chervonenkis classes. These results generalize those of Koltchinskii [12] and of Bartlett, Boucheron, and Lugosi [2] for Rademacher penalties, which can thus be seen as special cases of bootstrap-type penalties.
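As a rough illustration of the penalized criterion described in the abstract, the following Python sketch selects among finite classes of classifiers by minimizing the empirical error plus a Monte Carlo estimate of an Efron-bootstrap penalty. Everything specific here is an assumption made for illustration and is not taken from the paper: the threshold classifiers on a grid, the helper names (`threshold_class`, `bootstrap_penalty`, `select_model`), the number of resamples, and the particular form of the penalty, taken as the expected supremum over the class of the gap between the empirical loss and its bootstrap counterpart. The paper's exact penalty and constants may differ.

```python
import random


def empirical_error(classifier, sample):
    """Fraction of points (x, y) in the sample that the classifier mislabels."""
    return sum(classifier(x) != y for x, y in sample) / len(sample)


def threshold_class(grid_size):
    """Hypothetical model: classifiers x -> 1{x >= t}, t on a regular grid of [0, 1]."""
    return [lambda x, t=k / grid_size: int(x >= t) for k in range(grid_size + 1)]


def bootstrap_penalty(model, sample, n_boot=200, rng=random):
    """Monte Carlo estimate of E_W sup_f (P_n - P_n^W)(loss_f).

    P_n is the empirical measure and P_n^W its Efron bootstrap version
    (n points drawn with replacement). This form is an assumed stand-in
    for the bootstrap-type penalties mentioned in the abstract.
    """
    n = len(sample)
    # 0/1 loss of every classifier on every sample point, computed once.
    losses = [[int(f(x) != y) for x, y in sample] for f in model]
    means = [sum(row) / n for row in losses]
    total = 0.0
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]  # one Efron resample
        total += max(m - sum(row[i] for i in idx) / n
                     for row, m in zip(losses, means))
    return total / n_boot


def select_model(models, sample, n_boot=200, rng=random):
    """Return (name, criterion) minimizing empirical error + bootstrap penalty."""
    best = None
    for name, model in models.items():
        err = min(empirical_error(f, sample) for f in model)
        crit = err + bootstrap_penalty(model, sample, n_boot, rng)
        if best is None or crit < best[1]:
            best = (name, crit)
    return best
```

Calling `select_model({"coarse": threshold_class(2), "fine": threshold_class(8)}, sample)` on a list of `(x, y)` pairs returns the name of the class with the smallest penalized criterion. Richer classes fit the data better but tend to receive a larger penalty, which is the trade-off the criterion is meant to balance.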


Keywords (machine-generated, not supplied by the authors): Empirical Process; Independent Copy; Multinomial Vector; Structural Risk Minimization; Concentration Inequality




  1. Barron, A.R.: Logically smooth density estimation. Technical Report 56, Dept. of Statistics, Stanford Univ. (1985)
  2. Bartlett, P., Boucheron, S., Lugosi, G.: Model selection and error estimation. Mach. Learn. 48, 85–113 (2002)
  3. Bartlett, P., Bousquet, O., Mendelson, S.: Localized Rademacher complexities. In: Proc. of the 15th annual conf. on Computational Learning Theory, pp. 44–58 (2002)
  4. Birgé, L., Massart, P.: Minimum contrast estimators on sieves: exponential bounds and rates of convergence. Bernoulli 4, 329–375 (1998)
  5. Boucheron, S., Lugosi, G., Massart, P.: A sharp concentration inequality with applications. Random Struct. Algorithms 16, 277–292 (2000)
  6. Buescher, K.L., Kumar, P.R.: Learning by canonical smooth estimation. I: Simultaneous estimation, II: Learning and choice of model complexity. IEEE Trans. Autom. Control 41, 545–556, 557–569 (1996)
  7. Devroye, L., Lugosi, G.: Lower bounds in pattern recognition and learning. Pattern Recognition 28, 1011–1018 (1995)
  8. Efron, B.: The jackknife, the bootstrap and other resampling plans. CBMS-NSF Reg. Conf. Ser. Appl. Math. 38 (1982)
  9. Fromont, M.: Quelques problèmes de sélection de modèles : construction de tests adaptatifs, ajustement de pénalités par des méthodes de bootstrap (Some model selection problems: construction of adaptive tests, bootstrap penalization). Ph.D. thesis, Université Paris XI (2003)
  10. Giné, E., Zinn, J.: Bootstrapping general empirical measures. Ann. Probab. 18, 851–869 (1990)
  11. Haussler, D.: Sphere packing numbers for subsets of the Boolean n-cube with bounded Vapnik-Chervonenkis dimension. J. Comb. Theory A 69, 217–232 (1995)
  12. Koltchinskii, V.: Rademacher penalties and structural risk minimization. IEEE Trans. Inf. Theory 47, 1902–1914 (2001)
  13. Koltchinskii, V., Panchenko, D.: Rademacher processes and bounding the risk of function learning. In: High dimensional probability II. 2nd international conference, Univ. of Washington, DC (1999)
  14. Lozano, F.: Model selection using Rademacher penalization. In: Proceedings of the 2nd ICSC Symp. on Neural Computation, Berlin, Germany (2000)
  15. Lugosi, G., Nobel, A.B.: Adaptive model selection using empirical complexities. Ann. Statist. 27, 1830–1864 (1999)
  16. Lugosi, G., Wegkamp, M.: Complexity regularization via localized random penalties. (2003) (preprint)
  17. Lugosi, G., Zeger, K.: Concept learning using complexity regularization. IEEE Trans. Inf. Theory 42, 48–54 (1996)
  18. Mammen, E., Tsybakov, A.: Smooth discrimination analysis. Ann. Statist. 27, 1808–1829 (1999)
  19. Massart, P.: Some applications of concentration inequalities to statistics. Ann. Fac. Sci. Toulouse 9, 245–303 (2000)
  20. Massart, P.: Concentration inequalities and model selection. Lectures given at the St-Flour summer school of Probability Theory, in Lect. Notes Math. (2003) (to appear)
  21. Massart, P., Nedelec, E.: Risk bounds for statistical learning. (2003) (preprint)
  22. McDiarmid, C.: On the method of bounded differences. Surveys in combinatorics (Lond. Math. Soc. Lect. Notes) 141, 148–188 (1989)
  23. Præstgaard, J., Wellner, J.A.: Exchangeably weighted bootstraps of the general empirical process. Ann. Probab. 21, 2053–2086 (1993)
  24. Tsybakov, A.: Optimal aggregation of classifiers in statistical learning. (2001) (preprint)
  25. Vapnik, V.N., Chervonenkis, A.Y.: On the uniform convergence of relative frequencies of events to their probabilities. Theor. Probab. Appl. 16, 264–280 (1971)
  26. Vapnik, V.N., Chervonenkis, A.Y.: Teoriya raspoznavaniya obrazov. Statisticheskie problemy obucheniya. Nauka, Moscow (1974)
  27. Vapnik, V.N.: Estimation of dependences based on empirical data. Springer, New York (1982)

Copyright information

© Springer-Verlag Berlin Heidelberg 2004

Authors and Affiliations

  • Magalie Fromont (1)
  1. Laboratoire de mathématiques, Bât. 425, Université Paris XI, Orsay Cedex, France
