Handling Context-Dependencies in Speech by LVQ
We present a novel way to model context dependencies in phonemic speech recognition that combines codebooks trained by Learning Vector Quantization (LVQ) with Hidden Markov Models (HMMs). LVQ is used to map acoustic contextual data into context-independent phonemic form. The contextual data consists of concatenated averages of successive short-time feature vectors. This mapping eliminates the need for context-dependent phonemic HMMs and the difficulties associated with them; simpler context-independent discrete-observation HMMs suffice. We report excellent results on a speaker-dependent task for Finnish.
Keywords: Hidden Markov Model, Speech Recognition, Quantization Error, Learning Vector Quantization, Output Symbol
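The mapping described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (`context_vector`, `lvq1_train`, `quantize`), the window sizes, the learning-rate schedule, and the choice of plain LVQ1 with squared Euclidean distance are all assumptions made for illustration. The sketch builds a context vector by concatenating averages of successive frame windows around the current frame, trains a labeled codebook, and emits the index of the nearest codebook vector as the discrete observation symbol that a context-independent HMM would consume.

```python
def context_vector(frames, t, n_windows=3, window_len=2):
    """Concatenate the averages of successive frame windows around frame t.

    n_windows and window_len are illustrative values, not taken from the paper.
    """
    dim = len(frames[0])
    start = t - (n_windows * window_len) // 2
    out = []
    for w in range(n_windows):
        acc = [0.0] * dim
        for k in range(window_len):
            # Clamp the index so windows near the utterance edges stay valid.
            i = min(max(start + w * window_len + k, 0), len(frames) - 1)
            for d in range(dim):
                acc[d] += frames[i][d]
        out.extend(a / window_len for a in acc)
    return out


def nearest(codebook, x):
    """Return the index of the codebook vector closest to x (squared Euclidean)."""
    best, best_dist = 0, float("inf")
    for i, (vec, _label) in enumerate(codebook):
        dist = sum((a - b) ** 2 for a, b in zip(vec, x))
        if dist < best_dist:
            best, best_dist = i, dist
    return best


def lvq1_train(codebook, samples, epochs=20, alpha0=0.1):
    """Basic LVQ1: pull the winner toward a sample of its own class, push it away otherwise.

    codebook is a list of (vector, phoneme_label) pairs and is updated in place.
    samples is a list of (context_vector, phoneme_label) pairs.
    """
    n_steps = epochs * len(samples)
    step = 0
    for _ in range(epochs):
        for x, label in samples:
            alpha = alpha0 * (1.0 - step / n_steps)  # linearly decaying rate (assumed)
            i = nearest(codebook, x)
            vec, cls = codebook[i]
            sign = 1.0 if cls == label else -1.0
            codebook[i] = ([v + sign * alpha * (a - v) for v, a in zip(vec, x)], cls)
            step += 1
    return codebook


def quantize(codebook, frames):
    """Map every frame to the index of its nearest codebook vector (the output symbol)."""
    return [nearest(codebook, context_vector(frames, t)) for t in range(len(frames))]
```

Because each codebook vector carries a phoneme label, the context is absorbed at the quantization stage: the symbol sequence returned by `quantize` already reflects neighboring frames, so the HMMs that follow need only one model per phoneme. A usage pattern would be to build `samples` from labeled training frames, call `lvq1_train`, and then pass `quantize(codebook, frames)` to a discrete-observation HMM trainer.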