
Handling Context-Dependencies in Speech by LVQ

  • Conference paper
ICANN ’93 (ICANN 1993)


Abstract

In the framework of phonemic speech recognition using codebooks trained by Learning Vector Quantization (LVQ) together with Hidden Markov Models (HMMs), a novel way to model context-dependencies in speech is presented. We use LVQ to map acoustic contextual data into context-independent phonemic form. The contextual data consist of concatenated averages of successive short-time feature vectors. This mapping eliminates the need for context-dependent phonemic HMMs and the difficulties associated with them. Instead, simpler context-independent discrete-observation HMMs suffice. We report excellent results for a speaker-dependent task for Finnish.
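
The mapping described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the window layout (`group_size`, `groups_each_side`), the clamping of frame indices at utterance edges, and the use of the basic LVQ1 update rule are all assumptions made here for illustration, and every function name is hypothetical. The sketch builds a context vector by concatenating averages of successive short-time feature vectors, then nudges the nearest labeled codebook vector toward or away from it.

```python
from typing import List, Sequence, Tuple

Vector = List[float]
Codebook = List[Tuple[Vector, str]]  # (codebook vector, phoneme label)


def mean_vector(frames: Sequence[Sequence[float]]) -> Vector:
    """Component-wise average of a non-empty group of feature vectors."""
    n = len(frames)
    return [sum(column) / n for column in zip(*frames)]


def context_vector(frames: Sequence[Sequence[float]], center: int,
                   group_size: int, groups_each_side: int) -> Vector:
    """Concatenate averages of successive frame groups around `center`.

    Assumed layout: 2 * groups_each_side + 1 groups of `group_size`
    frames each; indices outside the utterance are clamped to its ends.
    """
    last = len(frames) - 1
    out: Vector = []
    for g in range(-groups_each_side, groups_each_side + 1):
        start = center + g * group_size - group_size // 2
        indices = [min(max(start + k, 0), last) for k in range(group_size)]
        out.extend(mean_vector([frames[i] for i in indices]))
    return out


def nearest(codebook: Codebook, x: Sequence[float]) -> int:
    """Index of the codebook vector closest to x (squared Euclidean)."""
    def sqdist(w: Sequence[float]) -> float:
        return sum((a - b) ** 2 for a, b in zip(w, x))
    return min(range(len(codebook)), key=lambda i: sqdist(codebook[i][0]))


def lvq1_step(codebook: Codebook, x: Sequence[float],
              label: str, alpha: float) -> None:
    """One LVQ1 update (assumed variant): move the winning codebook
    vector toward x if its label matches, away from x otherwise."""
    i = nearest(codebook, x)
    w, w_label = codebook[i]
    sign = 1.0 if w_label == label else -1.0
    codebook[i] = ([wi + sign * alpha * (xi - wi) for wi, xi in zip(w, x)],
                   w_label)
```

At recognition time, under the same assumptions, the label of `codebook[nearest(codebook, x)]` would serve as the discrete observation symbol passed to a context-independent HMM.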

Copyright information

© 1993 Springer-Verlag London Limited

About this paper

Cite this paper

Mäntysalo, J., Torkkola, K., Kohonen, T. (1993). Handling Context-Dependencies in Speech by LVQ. In: Gielen, S., Kappen, B. (eds) ICANN ’93. ICANN 1993. Springer, London. https://doi.org/10.1007/978-1-4471-2063-6_93

  • DOI: https://doi.org/10.1007/978-1-4471-2063-6_93

  • Publisher Name: Springer, London

  • Print ISBN: 978-3-540-19839-0

  • Online ISBN: 978-1-4471-2063-6

  • eBook Packages: Springer Book Archive
