AdaBoosting neural networks: Application to on-line character recognition
“Boosting” is a general method for improving the performance of any weak learning algorithm that consistently generates classifiers which need only perform slightly better than random guessing. A recently proposed and very promising boosting algorithm is AdaBoost. It has been applied with great success to several benchmark machine learning problems using rather simple learning algorithms, in particular decision trees [1,2,5]. In this paper we use AdaBoost to improve the performance of a strong learning algorithm: a neural-network-based on-line character recognition system. In particular, we show that it can be used to automatically learn a great variety of writing styles, even when the amount of training data available for each style varies considerably. Our system achieves an error rate of about 1.4% on a handwritten digit database of more than 200 writers.
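The abstract describes boosting only at a high level. As a rough, self-contained illustration of the underlying idea (a minimal AdaBoost sketch using decision stumps as the weak learner, not the paper's actual system, which boosts neural networks), one might write:

```python
import numpy as np

def best_stump(X, y, w):
    """Exhaustively pick the (feature, threshold, sign) decision stump
    with the lowest weighted training error."""
    stump, best_pred, best_err = None, None, np.inf
    for j in range(X.shape[1]):
        for t in np.unique(X[:, j]):
            for s in (1, -1):
                pred = np.where(X[:, j] <= t, s, -s)
                err = w[pred != y].sum()
                if err < best_err:
                    stump, best_pred, best_err = (j, t, s), pred, err
    return stump, best_pred, best_err

def train_adaboost(X, y, n_rounds=10):
    """AdaBoost with decision stumps; labels y must be in {-1, +1}."""
    n = len(X)
    w = np.full(n, 1.0 / n)                    # per-example weights
    stumps, alphas = [], []
    for _ in range(n_rounds):
        stump, pred, err = best_stump(X, y, w)
        if err >= 0.5:                         # weak-learning assumption violated
            break
        err = max(err, 1e-12)
        alpha = 0.5 * np.log((1 - err) / err)  # vote weight of this round's stump
        w = w * np.exp(-alpha * y * pred)      # up-weight misclassified examples
        w /= w.sum()
        stumps.append(stump)
        alphas.append(alpha)
    return stumps, alphas

def predict(stumps, alphas, X):
    """Weighted majority vote of all stumps."""
    score = np.zeros(len(X))
    for (j, t, s), a in zip(stumps, alphas):
        score += a * np.where(X[:, j] <= t, s, -s)
    return np.sign(score)
```

On a one-dimensional toy set where the positive class occupies an interior interval (so no single stump can separate it), three rounds of this sketch already reach zero training error; the paper applies the same reweight-and-vote principle to neural networks rather than stumps.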
- 1. L. Breiman. Bias, variance, and Arcing classifiers. Technical Report 460, Statistics Department, University of California at Berkeley, 1996.
- 2. H. Drucker and C. Cortes. Boosting decision trees. In NIPS 8, pages 479–485, 1996.
- 3. Y. Freund and R.E. Schapire. Experiments with a new boosting algorithm. In Machine Learning: Proceedings of the Thirteenth International Conference, pages 148–156, 1996.
- 4. Y. Freund and R.E. Schapire. A decision-theoretic generalization of on-line learning and an application to boosting. Journal of Computer and System Sciences, to appear.
- 5. J.R. Quinlan. Bagging, boosting and C4.5. In 14th National Conference on Artificial Intelligence, 1996.
- 6. R.E. Schapire, Y. Freund, P. Bartlett, and W.S. Lee. Boosting the margin: A new explanation for the effectiveness of voting methods. In Machines That Learn, Snowbird, 1997.
- 7. H. Schwenk and M. Milgram. Transformation invariant autoassociation with application to handwritten character recognition. In NIPS 7, pages 991–998. MIT Press, 1995.
- 8. H. Schwenk and M. Milgram. Learning discriminant tangent models for handwritten character recognition. In ICANN'96, pages 585–590. Springer Verlag, 1996.
- 9. H. Schwenk and M. Milgram. Constraint tangent distance for on-line character recognition. In International Conference on Pattern Recognition, pages D:520–524, 1996.
- 10. P. Simard, Y. Le Cun, and J. Denker. Efficient pattern recognition using a new transformation distance. In NIPS 5, pages 50–58. Morgan Kaufmann, 1993.