Speaker-Independent Word Recognition with Backpropagation Networks

Freisleben, Bernd; Bohn, Christian-Arved

doi:10.1007/978-3-7091-7533-0_36

Abstract

This paper presents a system that recognizes a limited vocabulary of spoken words in a speaker-independent manner. The system requires only minimal hardware support for acoustic preprocessing. In contrast to other approaches to word-level recognition, it reduces the information content of the speech signals with a compression algorithm before presenting them as inputs to a standard 3-layer backpropagation network. The network is trained on the utterances of the speakers in the training set and is then used to recognize words spoken by unknown speakers. Recognition rates of up to 91% were obtained for unknown speakers of the same sex, and up to 72% for a mix of male and female speakers. Because training times are short and the system is cost-effective, the approach is practically feasible for a variety of applications.
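
The pipeline outlined in the abstract (compress each utterance, then classify it with a 3-layer backpropagation network) can be illustrated with a minimal sketch. The abstract does not describe the actual compression algorithm, the acoustic features, or the network sizes, so everything concrete here is an assumption made for illustration: `compress` is a hypothetical stand-in that averages frames into a fixed number of segments, `make_word` generates synthetic toy "utterances", and the layer sizes, learning rate, and epoch count are arbitrary.

```python
import math
import random

def compress(frames, n_segments):
    """Hypothetical compression: average a variable-length sequence of
    feature frames into n_segments frames and flatten the result."""
    n, dim = len(frames), len(frames[0])
    out = []
    for s in range(n_segments):
        lo = s * n // n_segments
        hi = max(lo + 1, (s + 1) * n // n_segments)
        seg = frames[lo:hi]
        out.extend(sum(f[d] for f in seg) / len(seg) for d in range(dim))
    return out

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

class ThreeLayerNet:
    """Input, hidden, and output layer with sigmoid units, trained by
    plain backpropagation on squared error."""

    def __init__(self, n_in, n_hidden, n_out, seed=0):
        rng = random.Random(seed)
        # One row of weights per unit; the last entry of each row is the bias.
        self.w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in + 1)]
                   for _ in range(n_hidden)]
        self.w2 = [[rng.uniform(-0.5, 0.5) for _ in range(n_hidden + 1)]
                   for _ in range(n_out)]

    def forward(self, x):
        h = [sigmoid(sum(w * v for w, v in zip(row, x + [1.0])))
             for row in self.w1]
        y = [sigmoid(sum(w * v for w, v in zip(row, h + [1.0])))
             for row in self.w2]
        return h, y

    def train_step(self, x, target, lr=0.5):
        h, y = self.forward(x)
        # Output deltas, then hidden deltas computed from the old w2.
        d_out = [(t - o) * o * (1 - o) for t, o in zip(target, y)]
        d_hid = [hj * (1 - hj) *
                 sum(d_out[k] * self.w2[k][j] for k in range(len(d_out)))
                 for j, hj in enumerate(h)]
        for k, row in enumerate(self.w2):
            for j, v in enumerate(h + [1.0]):
                row[j] += lr * d_out[k] * v
        for j, row in enumerate(self.w1):
            for i, v in enumerate(x + [1.0]):
                row[i] += lr * d_hid[j] * v

    def predict(self, x):
        _, y = self.forward(x)
        return max(range(len(y)), key=y.__getitem__)

def make_word(label, n_frames):
    """Synthetic toy data (not speech): word 0 is a rising one-dimensional
    contour, word 1 a falling one; n_frames mimics speaking rate."""
    ramp = [[i / (n_frames - 1)] for i in range(n_frames)]
    return ramp if label == 0 else ramp[::-1]

N_SEGMENTS = 4
# "Training speakers" use utterance lengths 8 and 12.
train = [(compress(make_word(lbl, n), N_SEGMENTS), lbl)
         for n in (8, 12) for lbl in (0, 1)]
net = ThreeLayerNet(n_in=N_SEGMENTS, n_hidden=4, n_out=2)
for _ in range(2000):
    for x, lbl in train:
        net.train_step(x, [1.0 if k == lbl else 0.0 for k in range(2)])

# "Unknown speakers" use lengths not seen during training.
for n in (10, 15):
    for lbl in (0, 1):
        print(n, lbl, net.predict(compress(make_word(lbl, n), N_SEGMENTS)))
```

The design point this sketch is meant to show is that compression gives every utterance the same input dimensionality regardless of its duration, which is what allows a fixed-size feedforward network to be used at the word level. It says nothing about the recognition rates reported above, which depend on the real features and data.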

Author information

Authors and Affiliations

Dept. of Computer Science, University of Darmstadt, Alexanderstr. 10, D-6100, Darmstadt, Germany
Bernd Freisleben
Dept. Scientific Visualization of HLRZ, GMD Birlinghoven, P.O. Box 1316, D-5205, Sankt Augustin 1, Germany
Christian-Arved Bohn

Editor information

Editors and Affiliations

Institut für Informatik, Universität Innsbruck, Innsbruck, Austria
Rudolf F. Albrecht
Division of Statistics and Operational Research, UK
Colin R. Reeves
Division of Mathematics School of Mathematical and Information Sciences, Coventry University, Coventry, UK
Nigel C. Steele

About this paper

Cite this paper

Freisleben, B., Bohn, CA. (1993). Speaker-Independent Word Recognition with Backpropagation Networks. In: Albrecht, R.F., Reeves, C.R., Steele, N.C. (eds) Artificial Neural Nets and Genetic Algorithms. Springer, Vienna. https://doi.org/10.1007/978-3-7091-7533-0_36

DOI: https://doi.org/10.1007/978-3-7091-7533-0_36
Publisher Name: Springer, Vienna
Print ISBN: 978-3-211-82459-7
Online ISBN: 978-3-7091-7533-0
eBook Packages: Springer Book Archive
