Automotive Speech Recognition

  • Harald Höge
  • Sascha Hohenner
  • Bernhard Kämmerer
  • Niels Kunstmann
  • Stefanie Schachtl
  • Martin Schönle
  • Panji Setiawan
Part of the Advances in Pattern Recognition book series (ACVPR)

In the coming years, speech recognition will be a commodity feature in cars. Communication systems integrated into the car infotainment system, including telephony, audio devices, and destination input for navigation, can be controlled by voice. For speech recognition technology, the biggest challenge is recognizing large vocabularies in noisy environments on cost-sensitive hardware platforms. Furthermore, intuitive dialog design coupled with natural-sounding text-to-speech systems has to be provided to achieve smooth man-machine interaction. This chapter describes commercially driven activities to develop and produce speech technology components for various automotive applications, covering the speech recognition, speaker characterization, speech synthesis, and dialog technology used, the platforms used, and a methodology for evaluating recognition performance.

Copyright information

© Springer-Verlag London Limited 2008

Authors and Affiliations

  • Harald Höge (1)
  • Sascha Hohenner (1)
  • Bernhard Kämmerer (1)
  • Niels Kunstmann (1)
  • Stefanie Schachtl (1)
  • Martin Schönle (1)
  • Panji Setiawan (2)

  1. Corporate Technology, Siemens AG, München, Germany
  2. Universität der Bundeswehr München, München, Germany
