verbKEY - A Single-Chip Speech Control for the Automobile Environment
The article presents a novel speech recognition technology that has the potential to overcome some of the problems of in-car speech control. The verbKEY recognizer is based on the Associative-Dynamic (ASD) algorithm, which differs from established techniques such as HMM or DTW. The technology is designed to run on a 16-bit fixed-point DSP platform. It achieves high recognition performance and robustness while remaining highly cost-efficient, owing to its low memory consumption and low computational complexity. Typical applications such as dialling, word spotting, or menu structures for device control are processed by the continuous, real-time recognition engine with an accuracy higher than 98% for a 20-word vocabulary. The article describes a hardware prototype for command & control applications and the measures taken to improve robustness against environmental noise. Finally, the authors discuss some ergonomic aspects aimed at achieving a higher level of traffic safety.
Keywords: Automatic speech recognition, Associative-Dynamic classifier (ASD), robustness, telephone application, discriminative optimization