Boosting Using Neural Networks

  • Chapter
Combining Artificial Neural Nets

Part of the book series: Perspectives in Neural Computing ((PERSPECT.NEURAL))

Summary

Boosting is a method for constructing a committee of weak learners that lowers the error rate in classification and the prediction error in regression. It works by iteratively constructing weak learners whose training sets are conditioned on the performance of the previous members of the ensemble. In classification we train the neural networks using stochastic gradient descent, and in regression we train them using conjugate gradient descent. We compare ensembles of neural networks to ensembles of trees and show that the neural network ensembles are superior. We also compare ensembles constructed using boosting to those constructed using bagging and show that boosting is generally superior. Finally, we stress the importance of using separate training, validation, and test sets in order to obtain good generalisation.
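
To make the procedure concrete, the sketch below shows an AdaBoost.M1-style committee of small neural networks built by boosting with resampling: each new member is trained on a sample of the data drawn according to weights that emphasise the examples the committee so far gets wrong. It is an illustrative approximation rather than the chapter's exact algorithm; the weak learner (scikit-learn's MLPClassifier, trained with stochastic gradient descent), the network size, the committee size, and the assumption that X and y are NumPy arrays are choices made only for this example.

# Sketch of AdaBoost.M1-style boosting by resampling with small neural
# networks as committee members. Assumes X and y are NumPy arrays.
import numpy as np
from sklearn.neural_network import MLPClassifier

def boost_networks(X, y, n_members=10, seed=0):
    rng = np.random.default_rng(seed)
    n = len(X)
    w = np.full(n, 1.0 / n)                  # example weights, initially uniform
    members, betas = [], []
    for _ in range(n_members):
        # Boosting by resampling: draw the next training set according to w.
        idx = rng.choice(n, size=n, p=w)
        net = MLPClassifier(hidden_layer_sizes=(5,), solver='sgd',
                            max_iter=500, random_state=seed)
        net.fit(X[idx], y[idx])
        miss = net.predict(X) != y           # mistakes on the original training set
        err = float(np.dot(w, miss))         # weighted error of this member
        if err >= 0.5 or err == 0.0:         # no longer a useful weak learner
            break
        beta = err / (1.0 - err)
        w = np.where(miss, w, w * beta)      # downweight correctly classified examples
        w /= w.sum()
        members.append(net)
        betas.append(beta)
    return members, betas

def committee_predict(members, betas, X, classes):
    # Each member casts a vote weighted by log(1/beta); the committee output
    # is the class with the largest total vote.
    votes = np.zeros((len(X), len(classes)))
    for net, beta in zip(members, betas):
        pred = net.predict(X)
        for k, c in enumerate(classes):
            votes[:, k] += np.log(1.0 / beta) * (pred == c)
    return classes[votes.argmax(axis=1)]

In use, the committee would be fit on the training set and applied to held-out data, for example:

members, betas = boost_networks(X_train, y_train)
y_pred = committee_predict(members, betas, X_test, np.unique(y_train))

A separate validation set would typically be used to choose the committee size, and a further test set to estimate generalisation, in line with the point stressed above.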

Copyright information

© 1999 Springer-Verlag London Limited

About this chapter

Cite this chapter

Sharkey, A.J.C. (1999). Boosting Using Neural Networks. In: Sharkey, A.J.C. (eds) Combining Artificial Neural Nets. Perspectives in Neural Computing. Springer, London. https://doi.org/10.1007/978-1-4471-0793-4_3

  • DOI: https://doi.org/10.1007/978-1-4471-0793-4_3

  • Publisher Name: Springer, London

  • Print ISBN: 978-1-85233-004-0

  • Online ISBN: 978-1-4471-0793-4

  • eBook Packages: Springer Book Archive
