
Object Recognition with Gradient-Based Learning

  • Yann LeCun
  • Patrick Haffner
  • Léon Bottou
  • Yoshua Bengio
Chapter
Part of the Lecture Notes in Computer Science book series (LNCS, volume 1681)

Abstract

Finding an appropriate set of features is an essential problem in the design of shape recognition systems. This paper attempts to show that for recognizing simple objects with high shape variability, such as handwritten characters, it is possible, and even advantageous, to feed the system directly with minimally processed images and to rely on learning to extract the right set of features. Convolutional Neural Networks are shown to be particularly well suited to this task. We also show that these networks can be used to recognize multiple objects without requiring explicit segmentation of the objects from their surroundings. The second part of the paper presents the Graph Transformer Network model, which extends the applicability of gradient-based learning to systems that use graphs to represent features, objects, and their combinations.
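As an illustration of feeding minimally processed images directly into a learned feature extractor, the sketch below shows a small LeNet-style convolutional network. It is a minimal sketch only: the framework (PyTorch), the layer sizes, and the class count are assumptions made for illustration and are not the exact architecture or tooling used in the chapter.

```python
# Minimal sketch (not the authors' original implementation) of a small
# LeNet-style convolutional network for 32x32 grayscale character images.
# Layer sizes follow the spirit of the LeNet family but are illustrative
# assumptions, not the architecture described in the chapter.
import torch
import torch.nn as nn


class SmallConvNet(nn.Module):
    def __init__(self, num_classes: int = 10):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 6, kernel_size=5),   # 32x32 -> 28x28 learned feature maps
            nn.Tanh(),
            nn.AvgPool2d(2),                  # 28x28 -> 14x14 subsampling
            nn.Conv2d(6, 16, kernel_size=5),  # 14x14 -> 10x10
            nn.Tanh(),
            nn.AvgPool2d(2),                  # 10x10 -> 5x5
        )
        self.classifier = nn.Sequential(
            nn.Flatten(),
            nn.Linear(16 * 5 * 5, 120),
            nn.Tanh(),
            nn.Linear(120, num_classes),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, 1, 32, 32) minimally processed (size-normalized) images
        return self.classifier(self.features(x))


# Usage: class scores for a batch of eight random images
logits = SmallConvNet()(torch.randn(8, 1, 32, 32))
```

Because all layers are differentiable, the feature-extraction stage is trained jointly with the classifier by gradient descent rather than being hand-designed.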

Keywords

Radial Basis Function, Loss Function, Object Recognition, Convolutional Neural Network, Neural Information Processing System


Copyright information

© Springer-Verlag Berlin Heidelberg 1999

Authors and Affiliations

  • Yann LeCun¹
  • Patrick Haffner¹
  • Léon Bottou¹
  • Yoshua Bengio¹

  1. AT&T Shannon Lab, Red Bank, USA
