Automatic Transformation of Speech Databases for Continuous Speech Recognition

  • S. Rieck
  • E. G. Schukat-Talamazzini
  • T. Kuhn
  • S. Kunzmann
  • E. Nöth
Conference paper
Part of the NATO ASI Series book series (volume 75)

Abstract

This paper describes a method for automatically generating the labels for a new speech database from an existing, manually labeled speech database. Such a transfer becomes necessary when new standards are introduced and the speech signals have to be resampled. A dynamic time warping (DTW) algorithm matches the original and the resampled speech signals, with the comparison carried out on mel-based features. To reduce computation time, the search space of the DTW algorithm is restricted. Several experiments with a normal-density Bayes classifier were carried out to check the quality of the new labelings. The results show only a slight decrease in performance when the new labelings are used.
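The abstract gives no details of the alignment itself, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes a Euclidean local distance, the three standard step directions, and a fixed-width band around the length-scaled diagonal as the search-space restriction; the function names `dtw_path` and `transfer_boundaries` and the parameter `band` are hypothetical. Each sequence is a list of per-frame feature vectors, standing in for the mel-based features.

```python
import math


def dtw_path(x, y, band):
    """Align feature sequences x and y inside a diagonal band.

    Returns (total_cost, path), where path is a list of (i, j) frame pairs.
    The band restriction and Euclidean distance are assumptions of this sketch.
    """
    n, m = len(x), len(y)
    inf = math.inf
    cost = [[inf] * m for _ in range(n)]
    back = [[None] * m for _ in range(n)]
    for i in range(n):
        # The band centre follows the diagonal, scaled for unequal lengths.
        centre = round(i * (m - 1) / max(n - 1, 1))
        for j in range(max(0, centre - band), min(m, centre + band + 1)):
            d = math.dist(x[i], y[j])
            if i == 0 and j == 0:
                cost[i][j] = d
                continue
            best, prev = inf, None
            # Predecessors: diagonal, vertical, horizontal step.
            for pi, pj in ((i - 1, j - 1), (i - 1, j), (i, j - 1)):
                if pi >= 0 and pj >= 0 and cost[pi][pj] < best:
                    best, prev = cost[pi][pj], (pi, pj)
            if prev is not None:
                cost[i][j] = best + d
                back[i][j] = prev
    if cost[n - 1][m - 1] == inf:
        raise ValueError("band too narrow: no path reaches the final frame pair")
    # Trace the optimal path back from the end to (0, 0).
    path, node = [], (n - 1, m - 1)
    while node is not None:
        path.append(node)
        node = back[node[0]][node[1]]
    path.reverse()
    return cost[n - 1][m - 1], path


def transfer_boundaries(path, boundaries):
    """Map label boundaries (frame indices in x) to frame indices in y.

    Each boundary is mapped to the first y frame aligned with it.
    """
    first_match = {}
    for i, j in path:
        first_match.setdefault(i, j)
    return [first_match[b] for b in boundaries]
```

Restricting the inner loop to the band reduces the work from roughly `n * m` distance computations to roughly `n * (2 * band + 1)`. If the band is too narrow for the length ratio of the two sequences, no admissible path exists and the sketch raises an error rather than returning a misleading alignment.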

Keywords

Recognition Rate, Speech Recognition, Speech Signal, Optimal Path, Dynamic Time Warping
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.


Copyright information

© Springer-Verlag Berlin Heidelberg 1992

Authors and Affiliations

  • S. Rieck (1)
  • E. G. Schukat-Talamazzini (1)
  • T. Kuhn (1)
  • S. Kunzmann (1)
  • E. Nöth (1)
  1. Lehrstuhl für Informatik 5 (Mustererkennung), Friedrich-Alexander-Universität Erlangen-Nürnberg, Erlangen, F.R. of Germany
