Automatic speech recognition with neural networks: Beyond nonparametric models

  • Paolo Frasconi
  • Marco Gori
  • Giovanni Soda
Part II The Quest of Perceptual Primitives
Part of the Lecture Notes in Computer Science book series (LNCS, volume 745)


In the last few years, different connectionist models have been applied to many perceptual tasks. Much effort has focused in particular on speech recognition tasks, in an attempt to explore the remarkable learning capabilities of connectionist models. In this paper we briefly review the most successful approaches to speech recognition in an attempt to assess their actual contribution to the field. A detailed analysis of the problems found in speech recognition allows us to identify some “desiderata” to be met in building challenging models. One of the most important targets is an effective model of the time dimension of speech. Moreover, many proposed connectionist models turn out to be severely limited by their inherent nonparametric structure, which makes learning many tasks very hard. We suggest methods for introducing prior knowledge into recurrent networks and briefly discuss how they can learn more effectively in the presence of “structured tasks”.

Index terms

Automatic speech recognition, learning from examples, neural networks





Copyright information

© Springer-Verlag Berlin Heidelberg 1993

Authors and Affiliations

  • Paolo Frasconi
    • 1
  • Marco Gori
    • 1
  • Giovanni Soda
    • 1
  1. Dipartimento di Sistemi e Informatica, Firenze, Italy
