Automatic Transformation of Speech Databases for Continuous Speech Recognition
This paper describes a method for automatically generating the labels for a new speech database from an existing, manually labeled speech database. Such a transfer becomes necessary when new standards are introduced and the speech signals have to be resampled. A dynamic time warping (DTW) algorithm matches the original and the resampled speech signals, comparing them on mel-based features. To reduce computation time, the search space of the DTW algorithm is restricted. Several experiments with a normal-density Bayes classifier were carried out to check the quality of the new labelings. The results showed only a slight decrease in performance when the new labelings were used.
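The abstract gives no implementation details, so the following is only a minimal sketch of how a DTW alignment with a restricted search space could carry labels from one signal to another. It assumes that both signals have already been converted to per-frame feature vectors (mel-based in the paper; feature extraction is not shown here) and that the original database has one label per frame. The function names `banded_dtw` and `transfer_labels`, the Euclidean frame distance, and the diagonal band used as the search restriction are all assumptions made for illustration, not the authors' method.

```python
import numpy as np


def banded_dtw(x, y, band):
    """Align feature sequences x (n, d) and y (m, d) with DTW.

    Only cells within `band` frames of the straight line from (0, 0)
    to (n, m) are evaluated (an assumed form of search restriction).
    Returns the total alignment cost and the warping path as a list
    of (index_in_x, index_in_y) pairs.
    """
    n, m = len(x), len(y)
    slope = m / n
    # Widen the band if needed so that the end point stays reachable.
    band = max(band, int(np.ceil(slope)))
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        center = int(round(i * slope))
        lo = max(1, center - band)
        hi = min(m, center + band)
        for j in range(lo, hi + 1):
            d = np.linalg.norm(x[i - 1] - y[j - 1])  # assumed frame distance
            cost[i, j] = d + min(cost[i - 1, j - 1],
                                 cost[i - 1, j],
                                 cost[i, j - 1])
    # Backtrack from the end point to recover the warping path.
    path = []
    i, j = n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        step = int(np.argmin([cost[i - 1, j - 1],
                              cost[i - 1, j],
                              cost[i, j - 1]]))
        if step == 0:
            i, j = i - 1, j - 1
        elif step == 1:
            i -= 1
        else:
            j -= 1
    path.reverse()
    return float(cost[n, m]), path


def transfer_labels(labels, path, m):
    """Give each of the m new frames the label of its aligned old frame."""
    new_labels = [None] * m
    for i, j in path:
        new_labels[j] = labels[i]
    return new_labels


# Toy data: the "new" signal is the old one with every frame repeated.
old_feats = np.arange(10, dtype=float).reshape(-1, 1)
new_feats = np.repeat(old_feats, 2, axis=0)
old_labels = list("aaabbbcccc")

total_cost, path = banded_dtw(old_feats, new_feats, band=2)
new_labels = transfer_labels(old_labels, path, len(new_feats))
```

In this sketch the band keeps the number of evaluated cells roughly proportional to the sequence length times the band width, rather than to the product of both lengths, which is the kind of saving a restricted search space is meant to give.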