AdaBoosting neural networks: Application to on-line character recognition

  • Holger Schwenk
  • Yoshua Bengio
Part VI: Speech, Vision, and Pattern Recognition
Part of the Lecture Notes in Computer Science book series (LNCS, volume 1327)


“Boosting” is a general method for improving the performance of any weak learning algorithm that consistently generates classifiers which need only perform slightly better than random guessing. A recently proposed and very promising boosting algorithm is AdaBoost [4]. It has been applied with great success to several benchmark machine learning problems using rather simple learning algorithms [3], in particular decision trees [1,2,5]. In this paper we use AdaBoost to improve the performance of a strong learning algorithm: a neural-network-based on-line character recognition system. In particular, we show that it can automatically learn a great variety of writing styles, even when the amount of training data available for each style varies considerably. Our system achieves an error rate of about 1.4% on a handwritten digit database of more than 200 writers.
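To make the boosting idea above concrete, the following is a minimal sketch of two-class AdaBoost, using one-feature decision stumps as the weak learners rather than the neural networks used in the paper. All function names are our own, and the stump search is a simplification for illustration; the reweighting and weighted-vote structure is the core of the AdaBoost algorithm of [4].

```python
import numpy as np

def train_stump(X, y, w):
    """Find the axis-aligned threshold stump with lowest weighted error.

    X: (n, d) array, y: labels in {-1, +1}, w: sample weights summing to 1.
    Returns (error, feature, threshold, sign).
    """
    best = None
    for j in range(X.shape[1]):
        for thr in np.unique(X[:, j]):
            for sign in (1, -1):
                pred = np.where(X[:, j] >= thr, sign, -sign)
                err = np.sum(w[pred != y])
                if best is None or err < best[0]:
                    best = (err, j, thr, sign)
    return best

def adaboost(X, y, T=10):
    """AdaBoost with decision stumps; returns a list of (alpha, j, thr, sign)."""
    n = len(y)
    w = np.full(n, 1.0 / n)              # start with uniform weights
    ensemble = []
    for _ in range(T):
        err, j, thr, sign = train_stump(X, y, w)
        err = max(err, 1e-12)            # guard against division by zero
        if err >= 0.5:                   # weak learner no better than chance
            break
        alpha = 0.5 * np.log((1 - err) / err)
        pred = np.where(X[:, j] >= thr, sign, -sign)
        w *= np.exp(-alpha * y * pred)   # up-weight misclassified examples
        w /= w.sum()                     # renormalize to a distribution
        ensemble.append((alpha, j, thr, sign))
    return ensemble

def predict(ensemble, X):
    """Weighted vote of the weak classifiers."""
    score = np.zeros(len(X))
    for alpha, j, thr, sign in ensemble:
        score += alpha * np.where(X[:, j] >= thr, sign, -sign)
    return np.where(score >= 0, 1, -1)
```

No single stump can label a middle interval of a 1-D axis correctly, but after three rounds the weighted vote can, which is the "weak learners combine into a strong one" effect the abstract refers to.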




References

  1. L. Breiman. Bias, variance, and arcing classifiers. Technical Report 460, Statistics Department, University of California at Berkeley, 1996.
  2. H. Drucker and C. Cortes. Boosting decision trees. In NIPS 8, pages 479–485, 1996.
  3. Y. Freund and R.E. Schapire. Experiments with a new boosting algorithm. In Machine Learning: Proceedings of the Thirteenth International Conference, pages 148–156, 1996.
  4. Y. Freund and R.E. Schapire. A decision-theoretic generalization of on-line learning and an application to boosting. Journal of Computer and System Sciences, to appear.
  5. J.R. Quinlan. Bagging, boosting and C4.5. In 14th National Conference on Artificial Intelligence, 1996.
  6. R.E. Schapire, Y. Freund, P. Bartlett, and W.S. Lee. Boosting the margin: a new explanation for the effectiveness of voting methods. In Machines That Learn, Snowbird, 1997.
  7. H. Schwenk and M. Milgram. Transformation invariant autoassociation with application to handwritten character recognition. In NIPS 7, pages 991–998. MIT Press, 1995.
  8. H. Schwenk and M. Milgram. Learning discriminant tangent models for handwritten character recognition. In ICANN'96, pages 585–590. Springer Verlag, 1996.
  9. H. Schwenk and M. Milgram. Constraint tangent distance for on-line character recognition. In International Conference on Pattern Recognition, pages D 520–524, 1996.
  10. P. Simard, Y. Le Cun, and J. Denker. Efficient pattern recognition using a new transformation distance. In NIPS 5, pages 50–58. Morgan Kaufmann, 1993.

Copyright information

© Springer-Verlag Berlin Heidelberg 1997

Authors and Affiliations

  • Holger Schwenk¹
  • Yoshua Bengio¹,²
  1. University of Montreal, Canada
  2. AT&T Bell Laboratories, Holmdel