
A Consistent Strategy for Boosting Algorithms

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 2375)

Abstract

The probability of error of classification methods based on convex combinations of simple base classifiers produced by “boosting” algorithms is investigated. The main result of the paper is that certain regularized boosting algorithms provide Bayes-risk consistent classifiers under the sole assumption that the Bayes classifier may be approximated by a convex combination of the base classifiers. Non-asymptotic, distribution-free bounds are also developed that offer interesting new insight into how boosting works and help explain the success of these methods in practical classification problems.
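For orientation, the terms above have standard formal meanings; the notation below is generic and introduced only for illustration, not taken from the paper itself. Writing (X, Y) for a random observation–label pair with Y ∈ {−1, +1}, the risk of a classifier g and the Bayes risk are

\[
L(g) \;=\; \mathbb{P}\{\, g(X) \neq Y \,\},
\qquad
L^{*} \;=\; \inf_{g}\, L(g),
\]

and a boosting-type classifier built from base classifiers c_1, …, c_N is a thresholded weighted combination

\[
g_n(x) \;=\; \operatorname{sgn}\!\Big( \sum_{j=1}^{N} w_j\, c_j(x) \Big),
\qquad
w_j \ge 0, \quad \sum_{j=1}^{N} w_j \le \lambda .
\]

Bayes-risk consistency means L(g_n) → L^{*} in probability as the sample size grows, and the approximation assumption in the abstract asks that combinations of this form can achieve risk arbitrarily close to L^{*}. The explicit bound λ on the weights is only one common way of writing the regularization alluded to above; it is an assumption of this sketch, not necessarily the constraint analysed in the paper.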

The work of the first author was supported by DGI grant BMF2000-0807.

The work of the second author was supported by a Marie Curie Fellowship of the European Community “Improving Human Potential” programme under contract number HPMFCT-2000-00667.




Copyright information

© 2002 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Lugosi, G., Vayatis, N. (2002). A Consistent Strategy for Boosting Algorithms. In: Kivinen, J., Sloan, R.H. (eds) Computational Learning Theory. COLT 2002. Lecture Notes in Computer Science (LNAI), vol 2375. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-45435-7_21


  • DOI: https://doi.org/10.1007/3-540-45435-7_21

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-43836-6

  • Online ISBN: 978-3-540-45435-9

  • eBook Packages: Springer Book Archive
