Abstract
We present a novel way to model context dependencies in speech within a phonemic speech recognition framework that combines codebooks trained by Learning Vector Quantization (LVQ) with Hidden Markov Models (HMMs). We use LVQ to map acoustic contextual data into context-independent phonemic form. The contextual data consist of concatenated averages of successive short-time feature vectors. This mapping eliminates the need for context-dependent phonemic HMMs and avoids the difficulties associated with them. Instead, simpler context-independent discrete-observation HMMs suffice. We report excellent results on a speaker-dependent task for Finnish.
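The mapping described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the block length, the number of blocks, the clamping at utterance boundaries, and the use of the basic LVQ1 update rule are all assumptions made for illustration, and the function names (`context_vector`, `classify`, `lvq1_step`) are hypothetical. The sketch builds a context vector by averaging successive blocks of frames and concatenating the averages, then labels it with the nearest codebook vector; the resulting phoneme labels are the kind of discrete symbols a context-independent discrete-observation HMM could consume.

```python
from typing import List, Sequence, Tuple

Vector = List[float]
Codebook = List[Tuple[Vector, str]]  # (codebook vector, phoneme label)


def mean_vector(frames: Sequence[Sequence[float]]) -> Vector:
    """Component-wise average of a non-empty run of feature vectors."""
    n = len(frames)
    return [sum(col) / n for col in zip(*frames)]


def context_vector(frames: Sequence[Sequence[float]], t: int,
                   block_len: int = 2, n_blocks: int = 3) -> Vector:
    """Concatenate the averages of n_blocks successive blocks around frame t.

    block_len and n_blocks are illustrative values, not taken from the paper.
    Frame indices that fall outside the utterance are clamped to its ends
    (also an assumption).
    """
    start = t - (block_len * n_blocks) // 2
    last = len(frames) - 1
    out: Vector = []
    for b in range(n_blocks):
        idx = [min(max(start + b * block_len + k, 0), last)
               for k in range(block_len)]
        out.extend(mean_vector([frames[i] for i in idx]))
    return out


def sq_dist(a: Sequence[float], b: Sequence[float]) -> float:
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def classify(codebook: Codebook, x: Sequence[float]) -> str:
    """Return the phoneme label of the codebook vector nearest to x."""
    return min(codebook, key=lambda entry: sq_dist(entry[0], x))[1]


def lvq1_step(codebook: Codebook, x: Sequence[float], label: str,
              alpha: float = 0.05) -> None:
    """One basic LVQ1 update, in place.

    The nearest codebook vector moves toward x if its label matches the
    training label, and away from x otherwise.
    """
    i = min(range(len(codebook)), key=lambda j: sq_dist(codebook[j][0], x))
    w, w_label = codebook[i]
    sign = 1.0 if w_label == label else -1.0
    codebook[i] = ([wi + sign * alpha * (xi - wi) for wi, xi in zip(w, x)],
                   w_label)


if __name__ == "__main__":
    # Toy one-dimensional "feature vectors"; real features would be
    # multi-dimensional short-time spectral or cepstral vectors.
    frames = [[0.0], [1.0], [2.0], [3.0], [4.0], [5.0]]
    x = context_vector(frames, t=3)
    codebook = [([0.0, 0.0, 0.0], "a"), ([1.0, 3.0, 5.0], "i")]
    print(x, classify(codebook, x))
```

Because the context is folded into the input vector rather than into the model inventory, the label set stays at the size of the phoneme set, which is what lets the HMM stage remain context-independent.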
Copyright information
© 1993 Springer-Verlag London Limited
About this paper
Cite this paper
Mäntysalo, J., Torkkola, K., Kohonen, T. (1993). Handling Context-Dependencies in Speech by LVQ. In: Gielen, S., Kappen, B. (eds) ICANN ’93. ICANN 1993. Springer, London. https://doi.org/10.1007/978-1-4471-2063-6_93
DOI: https://doi.org/10.1007/978-1-4471-2063-6_93
Publisher Name: Springer, London
Print ISBN: 978-3-540-19839-0
Online ISBN: 978-1-4471-2063-6
eBook Packages: Springer Book Archive