Speaker-Independent Word Recognition with Backpropagation Networks

  • Conference paper
Artificial Neural Nets and Genetic Algorithms

Abstract

This paper presents a system that recognizes a limited vocabulary of spoken words in a speaker-independent manner. The system requires only minimal hardware support for acoustic preprocessing. In contrast to other approaches to word-level recognition, it reduces the information content of the speech signals with a compression algorithm before presenting them as inputs to a standard 3-layer backpropagation network. The network is trained on the utterances of the speakers in the training set, and the trained network is then used to recognize words spoken by unknown speakers. Recognition rates of up to 91% were obtained for unknown speakers of the same sex and up to 72% for a mix of male and female speakers. Because training times are short and the system is inexpensive, the approach is practical for a variety of applications.
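The pipeline described in the abstract (compress the signal, then classify it with a 3-layer backpropagation network) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not say which compression algorithm is used, so `compress` below simply averages absolute amplitudes over a fixed number of frames as a stand-in. The synthetic "utterances" from `toy_utterance`, the two-word vocabulary, the network sizes, and the learning rate are all assumptions chosen only to keep the example small.

```python
import math
import random


def compress(signal, n_frames):
    """Assumed stand-in for the paper's compression step: reduce a
    non-empty, variable-length signal to n_frames mean absolute amplitudes."""
    frame_len = len(signal) / n_frames
    features = []
    for i in range(n_frames):
        start = int(i * frame_len)
        end = max(start + 1, int((i + 1) * frame_len))
        frame = signal[start:end]
        features.append(sum(abs(s) for s in frame) / len(frame))
    return features


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


class ThreeLayerNet:
    """Input, hidden, and output layer trained with plain backpropagation."""

    def __init__(self, n_in, n_hidden, n_out, seed=0):
        rng = random.Random(seed)
        # One row of weights per unit; the last entry of each row is the bias.
        self.w_hidden = [[rng.uniform(-0.5, 0.5) for _ in range(n_in + 1)]
                         for _ in range(n_hidden)]
        self.w_out = [[rng.uniform(-0.5, 0.5) for _ in range(n_hidden + 1)]
                      for _ in range(n_out)]

    def forward(self, x):
        hidden = [sigmoid(sum(w * v for w, v in zip(row, x + [1.0])))
                  for row in self.w_hidden]
        output = [sigmoid(sum(w * v for w, v in zip(row, hidden + [1.0])))
                  for row in self.w_out]
        return hidden, output

    def train_step(self, x, target, lr=0.5):
        """One online gradient step; returns the squared error before the update."""
        hidden, output = self.forward(x)
        # Output deltas for squared error with sigmoid units.
        d_out = [(t - o) * o * (1 - o) for t, o in zip(target, output)]
        # Hidden deltas: output deltas propagated back through w_out.
        d_hid = [h * (1 - h) * sum(d * self.w_out[k][j] for k, d in enumerate(d_out))
                 for j, h in enumerate(hidden)]
        for k, d in enumerate(d_out):
            for j, v in enumerate(hidden + [1.0]):
                self.w_out[k][j] += lr * d * v
        for j, d in enumerate(d_hid):
            for i, v in enumerate(x + [1.0]):
                self.w_hidden[j][i] += lr * d * v
        return sum((t - o) ** 2 for t, o in zip(target, output))


def toy_utterance(loud_first, length, rng):
    """Hypothetical data: a 'word' is loud in its first or its second half."""
    half = length // 2
    loud = [rng.uniform(-1, 1) for _ in range(half)]
    quiet = [rng.uniform(-0.1, 0.1) for _ in range(length - half)]
    return loud + quiet if loud_first else quiet + loud


def recognize(net, signal, n_frames=4):
    """Return the index of the most active output unit (the recognized word)."""
    _, output = net.forward(compress(signal, n_frames))
    return output.index(max(output))


# Train on utterances of different lengths, standing in for different speakers.
N_FRAMES = 4
rng = random.Random(1)
training = []
for length in (40, 60, 80):
    for word in (0, 1):
        signal = toy_utterance(word == 0, length, rng)
        target = [1.0, 0.0] if word == 0 else [0.0, 1.0]
        training.append((compress(signal, N_FRAMES), target))

net = ThreeLayerNet(N_FRAMES, 3, 2)
epoch_errors = []
for _ in range(2000):
    epoch_errors.append(sum(net.train_step(x, t) for x, t in training))
first_err, last_err = epoch_errors[0], epoch_errors[-1]
```

Because the compression step maps every utterance to the same number of features, the network's input layer has a fixed size regardless of how long a word takes to say. Calling `recognize(net, signal)` on a signal that was not in the training list mimics the unknown-speaker test, although this toy setup says nothing about the recognition rates reported in the abstract.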

© 1993 Springer-Verlag/Wien

About this paper

Cite this paper

Freisleben, B., Bohn, CA. (1993). Speaker-Independent Word Recognition with Backpropagation Networks. In: Albrecht, R.F., Reeves, C.R., Steele, N.C. (eds) Artificial Neural Nets and Genetic Algorithms. Springer, Vienna. https://doi.org/10.1007/978-3-7091-7533-0_36

  • DOI: https://doi.org/10.1007/978-3-7091-7533-0_36

  • Publisher Name: Springer, Vienna

  • Print ISBN: 978-3-211-82459-7

  • Online ISBN: 978-3-7091-7533-0

  • eBook Packages: Springer Book Archive
