Theoretical Views of Boosting

  • Robert E. Schapire
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 1572)

Abstract

Boosting is a general method for improving the accuracy of any given learning algorithm. Focusing primarily on the AdaBoost algorithm, we briefly survey theoretical work on boosting including analyses of AdaBoost’s training error and generalization error, connections between boosting and game theory, methods of estimating probabilities using boosting, and extensions of AdaBoost for multiclass classification problems. We also briefly mention some empirical work.
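For readers unfamiliar with the algorithm surveyed in the paper, the following is a minimal sketch of AdaBoost for binary classification, using decision stumps as weak learners. The function names (`adaboost_train`, `adaboost_predict`, `_best_stump`) and the use of NumPy are illustrative choices, not taken from the paper itself.

```python
import numpy as np

def adaboost_train(X, y, num_rounds=50):
    """Train AdaBoost with decision stumps as weak learners.
    X: (n, d) feature matrix; y: labels in {-1, +1}."""
    n, d = X.shape
    weights = np.full(n, 1.0 / n)       # initial distribution D_1(i) = 1/n
    ensemble = []                        # list of (alpha, stump) pairs

    for _ in range(num_rounds):
        stump = _best_stump(X, y, weights)
        pred = _stump_predict(stump, X)
        # weighted training error of this weak hypothesis
        err = np.clip(np.sum(weights[pred != y]), 1e-10, 1 - 1e-10)
        alpha = 0.5 * np.log((1 - err) / err)
        # reweight: misclassified examples get more weight next round
        weights *= np.exp(-alpha * y * pred)
        weights /= weights.sum()
        ensemble.append((alpha, stump))
    return ensemble

def adaboost_predict(ensemble, X):
    """Final hypothesis: sign of the weighted vote of the weak hypotheses."""
    votes = sum(alpha * _stump_predict(stump, X) for alpha, stump in ensemble)
    return np.sign(votes)

def _best_stump(X, y, weights):
    """Exhaustive search for the (feature, threshold, polarity) stump
    with the lowest weighted error."""
    best, best_err = None, np.inf
    for j in range(X.shape[1]):
        for thresh in np.unique(X[:, j]):
            for polarity in (1, -1):
                pred = polarity * np.where(X[:, j] >= thresh, 1, -1)
                err = np.sum(weights[pred != y])
                if err < best_err:
                    best_err, best = err, (j, thresh, polarity)
    return best

def _stump_predict(stump, X):
    j, thresh, polarity = stump
    return polarity * np.where(X[:, j] >= thresh, 1, -1)
```

The reweighting step is the heart of the algorithm: examples the current weak hypothesis gets wrong receive exponentially larger weight, so the next weak learner is forced to concentrate on them, which is what drives the training-error and margin analyses surveyed in the paper.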



Copyright information

© Springer-Verlag Berlin Heidelberg 1999

Authors and Affiliations

  • Robert E. Schapire
    1. AT&T Labs, Shannon Laboratory, Florham Park, USA
