ICANN ’93 pp 389-394

Handling Context-Dependencies in Speech by LVQ

  • Jyri Mäntysalo
  • Kari Torkkola
  • Teuvo Kohonen
Conference paper

Abstract

In the framework of phonemic speech recognition using codebooks trained by Learning Vector Quantization (LVQ) together with Hidden Markov Models (HMMs), a novel way to model context dependencies in speech is presented. We use LVQ to map acoustic contextual data into context-independent phonemic form. The contextual data take the form of concatenated averages of successive short-time feature vectors. This mapping eliminates the need to employ context-dependent phonemic HMMs and avoids the difficulties associated with them. Instead, simpler context-independent discrete-observation HMMs suffice. We report excellent results for a speaker-dependent task in Finnish.
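
The mapping described in the abstract can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the function names, the number and length of the averaged blocks (`n_blocks`, `block_len`), the edge clamping, the squared Euclidean distance, and the use of the basic LVQ1 update rule rather than whichever LVQ variant the authors applied. The sketch builds a context vector by concatenating averages of successive short-time feature vectors, then labels each frame with the phoneme class of the nearest codebook vector.

```python
from typing import List, Sequence, Tuple

Vector = List[float]
Codebook = List[Tuple[Vector, str]]  # (codebook vector, phoneme label)


def mean_vector(frames: Sequence[Vector]) -> Vector:
    """Component-wise average of a non-empty list of feature vectors."""
    dim = len(frames[0])
    return [sum(f[d] for f in frames) / len(frames) for d in range(dim)]


def context_vector(frames: Sequence[Vector], center: int,
                   n_blocks: int = 3, block_len: int = 2) -> Vector:
    """Concatenate the averages of n_blocks successive blocks of frames.

    The window is placed around `center`; indices beyond the signal
    edges are clamped (an illustrative choice, not the paper's).
    """
    start = center - (n_blocks * block_len) // 2
    out: Vector = []
    for b in range(n_blocks):
        idx = [min(max(start + b * block_len + k, 0), len(frames) - 1)
               for k in range(block_len)]
        out.extend(mean_vector([frames[i] for i in idx]))
    return out


def sq_dist(a: Vector, b: Vector) -> float:
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(codebook: Codebook, x: Vector) -> int:
    """Index of the codebook vector closest to x."""
    return min(range(len(codebook)), key=lambda i: sq_dist(codebook[i][0], x))


def lvq1_step(codebook: Codebook, x: Vector, label: str,
              alpha: float = 0.05) -> int:
    """One LVQ1 update: move the winner toward x if its label matches,
    away from x otherwise. Returns the index of the winner."""
    i = nearest(codebook, x)
    vec, lab = codebook[i]
    sign = 1.0 if lab == label else -1.0
    codebook[i] = ([w + sign * alpha * (xi - w) for w, xi in zip(vec, x)], lab)
    return i


def quantize(codebook: Codebook, frames: Sequence[Vector],
             n_blocks: int = 3, block_len: int = 2) -> List[str]:
    """Map every frame to the label of its nearest context codebook vector."""
    return [codebook[nearest(codebook,
                             context_vector(frames, t, n_blocks, block_len))][1]
            for t in range(len(frames))]
```

In a pipeline of the kind the abstract outlines, the label sequence returned by `quantize` would play the role of the discrete observation symbols fed to the context-independent HMMs. The HMM stage itself is not sketched here.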

Keywords

Hidden Markov Model, Speech Recognition, Quantization Error, Learning Vector Quantization, Output Symbol
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Copyright information

© Springer-Verlag London Limited 1993

Authors and Affiliations

  • Jyri Mäntysalo (1)
  • Kari Torkkola (2)
  • Teuvo Kohonen (1)
  1. Laboratory of Information and Computer Science, Helsinki University of Technology, Espoo, Finland
  2. Institut Dalle Molle d’Intelligence Artificielle Perceptive (IDIAP), Martigny, Switzerland