New approaches to statistical learning theory

  • Olivier Bousquet
Special Section on New Trends in Statistical Information Processing

Abstract

We present new tools from probability theory that can be applied to the analysis of learning algorithms. These tools make it possible to derive new bounds on the generalization performance of learning algorithms and to propose alternative measures of the complexity of the learning task, which in turn can be used to derive new learning algorithms.

Key words and phrases

Statistical learning theory, concentration inequalities, Rademacher averages, error bounds

Copyright information

© The Institute of Statistical Mathematics 2003

Authors and Affiliations

  • Olivier Bousquet (1)
  1. Max Planck Institute for Biological Cybernetics, Tübingen, Germany
