Handling Context-Dependencies in Speech by LVQ

Mäntysalo, Jyri; Torkkola, Kari; Kohonen, Teuvo

doi:10.1007/978-1-4471-2063-6_93

Included in the following conference series:

International Conference on Artificial Neural Networks

Abstract

In the framework of phonemic speech recognition using codebooks trained by Learning Vector Quantization (LVQ) together with Hidden Markov Models (HMMs), a novel way to model context-dependencies in speech is presented. We use LVQ to map acoustic contextual data into context-independent phonemic form. The contextual data takes the form of concatenated averages of successive short-time feature vectors. This mapping eliminates the need to employ context-dependent phonemic HMMs and avoids the difficulties associated with them. Instead, simpler context-independent discrete-observation HMMs suffice. We report excellent results for a speaker-dependent task for Finnish.
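
The mapping described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not state the segment length, the number of segments, how utterance edges are handled, or which LVQ variant was used. The sketch assumes the basic LVQ1 update rule and clamps out-of-range frame indices to the utterance boundaries. All names (`context_vector`, `lvq1_step`, `classify`, `seg_len`, `n_segs`, `alpha`) are hypothetical.

```python
from typing import List, Sequence, Tuple

Vector = List[float]
Codebook = List[Tuple[Vector, str]]  # (codebook vector, phoneme label)


def mean_vector(frames: Sequence[Sequence[float]]) -> Vector:
    """Component-wise average of a non-empty run of feature vectors."""
    n = len(frames)
    return [sum(column) / n for column in zip(*frames)]


def context_vector(frames: Sequence[Sequence[float]], center: int,
                   seg_len: int, n_segs: int) -> Vector:
    """Concatenate the averages of n_segs successive segments around `center`.

    Assumption: indices outside the utterance are clamped to the first or
    last frame; the source does not say how edges are treated.
    """
    start = center - (seg_len * n_segs) // 2
    last = len(frames) - 1
    out: Vector = []
    for s in range(n_segs):
        first = start + s * seg_len
        segment = [frames[min(max(first + k, 0), last)] for k in range(seg_len)]
        out.extend(mean_vector(segment))
    return out


def nearest(codebook: Codebook, x: Sequence[float]) -> int:
    """Index of the codebook vector closest to x (squared Euclidean distance)."""
    def sqdist(a: Sequence[float], b: Sequence[float]) -> float:
        return sum((p - q) ** 2 for p, q in zip(a, b))
    return min(range(len(codebook)), key=lambda i: sqdist(codebook[i][0], x))


def lvq1_step(codebook: Codebook, x: Sequence[float], label: str,
              alpha: float) -> None:
    """One LVQ1 update: pull the winner toward x if its label matches,
    push it away otherwise. LVQ1 is an assumption made for illustration."""
    i = nearest(codebook, x)
    w, c = codebook[i]
    sign = 1.0 if c == label else -1.0
    codebook[i] = ([wj + sign * alpha * (xj - wj) for wj, xj in zip(w, x)], c)


def classify(codebook: Codebook, x: Sequence[float]) -> str:
    """Map a context vector to the context-independent label of its winner."""
    return codebook[nearest(codebook, x)][1]


if __name__ == "__main__":
    # Toy one-dimensional "features"; real front ends use e.g. cepstral vectors.
    frames = [[0.0], [2.0], [4.0], [6.0]]
    x = context_vector(frames, center=2, seg_len=2, n_segs=2)
    codebook = [([1.0, 5.0], "a"), ([5.0, 1.0], "i")]
    lvq1_step(codebook, [2.0, 4.0], "a", alpha=0.5)
    print(x, classify(codebook, x))
```

In a recognizer of the kind the abstract outlines, the label returned by `classify` for each frame position would serve as the discrete observation symbol fed to the context-independent HMMs. Because the context is already folded into the input vector, the HMM inventory itself does not need to grow with the number of phonemic contexts.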

Author information

Authors and Affiliations

Laboratory of Information and Computer Science, Helsinki University of Technology, Rakentajanaukio 2 C, SF-02150, Espoo, Finland
Jyri Mäntysalo & Teuvo Kohonen
Institut Dalle Molle D’Intelligence Artificielle Perceptive (IDIAP), C.P. 609, CH-1920, Martigny, Switzerland
Kari Torkkola

Editor information

Editors and Affiliations

Dutch Foundation for Neural Networks, University of Nijmegen, Geert Grooteplein 21, 6525 EZ, Nijmegen, The Netherlands
Stan Gielen & Bert Kappen

About this paper

Cite this paper

Mäntysalo, J., Torkkola, K., Kohonen, T. (1993). Handling Context-Dependencies in Speech by LVQ. In: Gielen, S., Kappen, B. (eds) ICANN ’93. ICANN 1993. Springer, London. https://doi.org/10.1007/978-1-4471-2063-6_93

DOI: https://doi.org/10.1007/978-1-4471-2063-6_93
Published: 10 April 2012
Publisher Name: Springer, London
Print ISBN: 978-3-540-19839-0
Online ISBN: 978-1-4471-2063-6
eBook Packages: Springer Book Archive
