Application of Speech Technology in Vehicles

  • Chapter
Speech Technology

Abstract

Speech technology has been regarded as one of the most interesting technologies for operating in-vehicle information systems. Cameron [1] has pointed out that people are more likely to use a speech system when at least one of four criteria is met: (1) they are offered no choice; (2) it corresponds to the privacy of their surroundings; (3) their hands or eyes are busy with another task; and (4) it is quicker than any alternative. For drivers, driving is a typical “hands and eyes are busy” task. In most situations, the driver is either alone in the car or accompanied by passengers who know each other well, so the “privacy of surroundings” criterion is also met. There is a long history of interest in applying speech technology to the control of in-vehicle information systems, and some commercial cars are already equipped with embedded speech technology. As early as 1996, the Mercedes-Benz S-Class introduced Linguatronic, the first generation of in-car speech systems for anybody who drives a car [2]. Since then, the number of in-vehicle applications using speech technology has been increasing [3].

References

  1. Cameron, H. (2000). Speech at the interface. In: Workshop on "Voice Operated Telecom Services". Ghent, Belgium, COST 249.

  2. Heisterkamp, P. (2001). Linguatronic - Product-level speech system for Mercedes-Benz Cars. In: Proc. HLT, San Diego, CA, USA.

  3. Hamerich, S. W. (2007). Towards advanced speech driven navigation systems for cars. In: 3rd IET Int. Conf. on Intelligent Environments, IE07, Sept. 24-25, Ulm, Germany.

  4. Goose, S., Djennane, S. (2002). WIRE3: Driving around the information super-highway. Pers. Ubiquitous Comput., 6, 164-175.

  5. Nass, C., Jonsson, I.-M., Harris, H., Reaves, B., Endo, J., Brave, S., Takayama, L. (2005). Improving automotive safety by pairing driver emotion and car voice emotion. In: CHI '05 Extended Abstracts on Human factors in Computing Systems. ACM Press, New York, NY.

  6. Nass, C., Brave, S. B. (2005). Wired for Speech: How Voice Activates and Advances the Human-Computer Relationship. MIT Press, Cambridge, MA.

  7. Bishop, R. (2005). Intelligent Vehicle Technology and Trends. Artech House, Boston.

  8. van de Weijer, C. (2008). Keynote 1: Dutch connected traffic in practice and in the future. In: IEEE Intelligent Vehicles Sympos. Eindhoven, The Netherlands, June 4-6.

  9. Gardner, M. (2008). Nomadic device integration in Aide. In: Proc. AIDE Final Workshop and Exhibition. April 15-16, Goteborg, Sweden.

  10. Johansson, E., Engstrom, J., Cherri, C., Nodari, E., Toffetti, A., Schindhelm, R., Gelau, C. (2004). Review of existing techniques and metrics for IVIS and ADAS assessment. EU Information Society Technology (IST) program IST-1-507674-IP: Adaptive Integrated Driver-Vehicle Interface (AIDE).

  11. Lee, J. D., Caven, B., Haake, S., Brown, T. L. (2001). Speech-based interaction with in-vehicle computer: The effect of speech-based e-mail on driver's attention to the roadway. Hum. Factors, 43, 631-640.

  12. Barón, A., Green, P. (2006). Safety and Usability of Speech Interfaces for In-Vehicle Tasks while Driving: A Brief Literature Review. Transportation Research Institute (UMTRI), The University of Michigan.

  13. Saad, F., Hjalmdahl, M., Cañas, J., Alonso, M., Garayo, P., Macchi, L., Nathan, F., Ojeda, L., Papakostopoulos, V., Panou, M., Bekiaris, E. (2004). Literature review of behavioural effects. EU Information Society Technology (IST) program: IST-1-507674-IP, Adaptive Integrated Driver-Vehicle Interface (AIDE).

  14. Treffner, P. J., Barrett, R. (2004). Hands-free mobile phone speech while driving degrades coordination and control. Transport. Res. F, 7, 229-246.

  15. Esbjornsson, M., Juhlin, O., Weilenmann, A. (2007). Drivers using mobile phones in traffic: An ethnographic study of interactional adaption. Int. J. Hum. Comput. Inter., Special Issue on: In-Use, In-Situ: Extending Field Research Methods, 22 (1), 39-60.

  16. Jonsson, I.-M., Chen, F. (2006). How big is the step for driving simulators to driving a real car? In: IEA 2006 Congress, Maastricht, The Netherlands, July 10-14.

  17. Chen, F., Jordan, P. (2008). Zonal adaptive workload management system: Limiting secondary task while driving. In: IEEE Intelligent Transportation System, IVs' 08, Eindhoven, The Netherlands, June 2-6.

  18. Esbjörnsson, M., Brown, B., Juhlin, O., Normark, D., Östergren, M., Laurier, E. (2006). Watching the cars go round and round: designing for active spectating. In: Proc. SIGCHI Conf. on Human Factors in computing systems, Montréal, Québec, Canada, 2006.

  19. Recarte, M. A., Nunes, L. M. (2003). Mental workload while driving: Effects on visual search, discrimination, and decision making. J. Exp. Psychol.: Appl., 9 (2), 119-137.

  20. Victor, T. W., Harbluk, J. L., Engstrom, J. A. (2005). Sensitivity of eye-movement measures to in-vehicle task difficulty. Transport. Res. Part F, 8 (2), 167-190.

  21. Hart, S. G., Staveland, L. E. (1988). Development of NASA-TLX (Task Load Index): Results of empirical and theoretical research. In: Hancock, P. A., Meshkati, N. (eds) Human Mental Workload. Elsevier Science Publishers B.V., North-Holland, 139-183.

  22. Pauzie, A., Sparpedon, A., Saulnier, G. (2007). Ergonomic evaluation of a prototype guidance system in an urban area. Discussion about methodologies and data collection tools, in Vehicle Navigation and Information Systems Conference. In: Proc. in conjunction with the Pacific Rim TransTech Conf. 6th Int. VNIS. "A Ride into the Future", Seattle, WA, USA.

  23. Wang, E., Chen, F. (2008). A new measurement for simulator driving performance in situation without interfere from other vehicles, International Journal of Transportation Systems F. AEI 2008. In: Applied Human Factors and Ergonomics 2008, 2nd Int. Conf., Las Vegas, USA, July 14-17.

  24. Wilson, G. F., Lambert, J. D., Russell, C. A. (2002). Performance enhancement with real-time physiologically controlled adaptive aiding. In: HFA Workshop: Psychophysiological Application to Human Factors, March 11-12, 2002. Swedish Center for Human Factors in Aviation.

  25. Wilson, G. F. (2002). Psychophysiological test methods and procedures. In: HFA Workshop: Psychophysiological Application to Human Factors, March 11-12, 2002. Swedish Center for Human Factors in Aviation.

  26. Lai, J., Cheng, K., Green, P., Tsimhoni, O. (2001). On the road and on the web? Comprehension of synthetic and human speech while driving. In: Conf. on Human Factors and Computing Systems, CHI 2001, 31 March-5 April 2001. Seattle, Washington, USA.

  27. Hermansky, H., Morgan, N. (1994). RASTA processing of speech. IEEE Trans. Speech Audio Process., 2 (4), 578-589.

  28. Kermorvant, C. (1999). A comparison of noise reduction techniques for robust speech recognition. IDIAP research report, IDIAP-RR-99-10, Dalle Molle Institute for Perceptual Artificial Intelligence, Valais, Switzerland.

  29. Furui, S. (1986). Speaker-independent isolated word recognition using dynamic features of speech spectrum. IEEE Trans. Acoustics, Speech Signal Process., 34 (1), 52-59.

  30. Mansour, D., Juang, B.-H. (1989). The short-time modified coherence representation and noisy speech recognition. IEEE Trans. Acoustics Speech Signal Process., 37 (6), 795-804.

  31. Hernando, J., Nadeu, C. (1997). Linear prediction of the one-sided autocorrelation sequence for noisy speech recognition. IEEE Trans. Speech Audio Process., 5 (1), 80-84.

  32. Chen, J., Paliwal, K. K., Nakamura, S. (2003). Cepstrum derived from differentiated power spectrum for robust speech recognition. Speech Commun., 41 (2-3), 469-484.

  33. Yuo, K.-H., Wang, H.-C. (1998). Robust features derived from temporal trajectory filtering for speech recognition under the corruption of additive and convolutional noises. In: Proc. IEEE Int. Conf. on Acoustics, Speech, and Signal Processing, April 21-24, 1997, Munich, Bavaria, Germany.

  34. Yuo, K.-H., Wang, H.-C. (1999). Robust features for noisy speech recognition based on temporal trajectory filtering of short-time autocorrelation sequences. Speech Commun., 28, 13-24.

  35. Lebart, K., Boucher, J. M. (2001). A new method based on spectral subtraction for speech dereverberation. Acta Acustica united with Acustica, 87, 359-366.

  36. Lee, C.-H., Soong, F. K., Paliwal, K. K. (1996). Automatic Speech and Speaker Recognition. Kluwer, Norwell.

  37. Gales, M. J. F., Young, S. J. (1995). Robust speech recognition in additive and convolutional noise using parallel model combination. Comput. Speech Lang., 9, 289-307.

  38. Gales, M. J. F., Young, S. J. (1996). Robust continuous speech recognition using parallel model combination. IEEE Trans. Speech Audio Process., 4 (5), 352-359.

  39. Acero, A., Deng, L., Kristjansson, T., Zhang, J. (2000). HMM adaptation using vector Taylor series for noisy speech recognition. In: Proc. ICASSP, June 05-09, 2000, Istanbul, Turkey.

  40. Kim, D. Y., Un, C. K., Kim, N. S. (1998). Speech recognition in noisy environments using first-order vector Taylor series. Speech Commun., 24 (1), 39-49.

  41. Visser, E., Otsuka, M., Lee, T.-W. (2003). A spatio-temporal speech enhancement scheme for robust speech recognition in noisy environments. Speech Commun., 41, 393-407.

  42. Farahani, G., Ahadi, S. M., Homayounpour, M. M. (2007). Features based on filtering and spectral peaks in autocorrelation domain for robust speech recognition. Comput. Speech Lang., 21, 187-205.

  43. Choi, E. H. C. (2004). Noise robust front-end for ASR using spectral subtraction, spectral flooring and cumulative distribution mapping. In: Proc. 10th Australian Int. Conf. on Speech Science & Technology. Macquarie University, Sydney, December 8-10.

  44. Fernandez, R., Corradini, A., Schlangen, D., Stede, M. (2007). Towards reducing and managing uncertainty in spoken dialogue systems. In: The Seventh International Workshop on Computational Semantics (IWCS-7). Tilburg, The Netherlands, Jan 10-12.

  45. Skantze, G. (2005). Exploring human error recovery strategies: Implications for spoken dialogue systems. Speech Commun., 45 (3), 325-341.

  46. Gellatly, A. W., Dingus, T. A. (1998). Speech recognition and automotive applications: using speech to perform in-vehicle tasks. In: Proc. Human Factors and Ergonomics Society 42nd Annual Meeting, October 5-9, 1998, Hyatt Regency Chicago, Chicago, Illinois.

  47. Greenberg, J., Tijerina, L., Curry, R., Artz, B., Cathey, L., Grant, P., Kochhar, D., Kozak, K., Blommer, M. (2003). Evaluation of driver distraction using an event detection paradigm. In: Proc. Transportation Research Board Annual Meetings, January 12-16, 2003, Washington, DC.

  48. McCallum, M. C., Campbell, J. L., Richman, J. B., Brown, J. (2004). Speech recognition and in-vehicle telematics devices: Potential reductions in driver distraction. Int. J. Speech Technol., 7, 25-33.

  49. Bernsen, N. O., Dybkjaer, L. (2002). A multimodal virtual co-driver's problems with the driver. In: ISCA Tutorial and Research Workshop on Multi-Modal Dialogue in Mobile Environments Proceedings. Kloster Irsee, Germany, June 17-19.

  50. Geutner, P., Steffens, F., Manstetten, D. (2002). Design of the VICO Spoken Dialogue System: Evaluation of User Expectations by Wizard-of-Oz Experiments. In: Proc. 3rd Int. Conf. on Language Resources and Evaluation (LREC 2002). Las Palmas, Spain, May.

  51. Villing, J., Larsson, S. (2006). Dico: A multimodal menu-based in-vehicle dialogue system. In: The 10th Workshop on the Semantics and Pragmatics of Dialogue, Brandial'06 (Sem-Dial 10). Potsdam, Germany, Sept 11-13.

  52. Larsson, S. (2002). Issue-based dialogue management. PhD Thesis, Goteborg University.

  53. Bringert, B., Ljunglöf, P., Ranta, A., Cooper, R. (2005). Multimodal dialogue systems grammars. In: The DIALOR'05, 9th Workshop on the Semantics and Pragmatics of Dialogue. Nancy (France), June 9-11, 2005.

  54. Oviatt, S. (2004). When do we interact multimodally? Cognitive load and multimodal communication patterns. In: Proc. 6th Int. Conf. on Multimodal Interfaces. Pennsylvania, Oct 14-15.

  55. Bernsen, O., Dybkjaer, L. (2001). Exploring natural interaction in the car. In: Proc. CLASS Workshop on Natural Interactivity and Intelligent Interactive Information Representation, Verona, Italy, Dec 2001.

  56. Esbjörnsson, M., Juhlin, O., Weilenmann, A. (2007). Drivers using mobile phones in traffic: An ethnographic study of interactional adaption. Int. J. Hum Comput Interact., Special Issue on In-Use, In-Situ: Extending Field Research Meth., 22 (1), 39-60.

  57. Jonsson, I.-M., Nass, C., Endo, J., Reaves, B., Harris, H., Ta, J. L., Chan, N., Knapp, S. (2004). Don't blame me I am only the driver: Impact of blame attribution on attitudes and attention to driving task. In: CHI '04 extended Abstracts on Human Factors in Computing Systems, Vienna, Austria.

  58. Jonsson, I.-M., Zajicek, M. (2005). Selecting the voice for an in-car information system for older adults. In: Human Computer Interaction Int. Las Vegas, Nevada, USA.

  59. Jonsson, I.-M., Zajicek, M., Harris, H., Nass, C. I. (2005). Thank you I did not see that: In-car speech-based information systems for older adults. In: Conf. on Human Factors in Computing Systems. ACM Press, Portland, OR.

  60. Jonsson, I. M., Nass, C. I., Harris, H., Takayama, L. (2005). Got Info? Examining the consequences of inaccurate information systems. In: Int. Driving Symp. on Human Factors in Driver Assessment, Training, and Vehicle Design. Rockport, Maine.

  61. Gross, J. J. (1999). Emotion and emotion regulation. In: Pervin, L. A., John, O. P. (eds) Handbook of Personality: Theory and Research. New York: Guilford, 525-552.

  62. Picard, R. W. (1997). Affective Computing. MIT Press, Cambridge, MA.

  63. Clore, G. L., Gasper, K. (2000). Feeling is believing: Some affective influences on belief. In: Frijda, N. H., Manstead, A. S. R., Bem, S. (eds) Emotions and Beliefs: How Feelings Influence Thoughts, Editions de la Maison des Sciences de l'Homme and Cambridge University Press (jointly published), Paris/Cambridge, 10-44.

  64. Gross, J. J. (1998). Antecedent- and response-focused emotion regulation: Divergent consequences for experience, expression, and physiology. J. Personality Social Psychol., 74, 224-237.

  65. Davidson, R. J. (1994). On emotion, mood, and related affective constructs. In: Ekman, P., Davidson, R. J. (eds) The Nature of Emotion, Oxford University Press, New York, 51-55.

  66. Bower, G. H., Forgas, J. P. (2000). Affect, memory, and social cognition. In: Eich, E., Kihlstrom, J. F., Bower, G. H., Forgas, J. P., Niedenthal, P. M. (eds) Cognition and Emotion. Oxford University Press, Oxford, 87-168.

  67. Groeger, J. A. (2000). Understanding Driving: Applying Cognitive Psychology to a Complex Everyday Task. Psychology Press, Philadelphia, PA.

  68. Lunenfeld, H. (1989). Human factor considerations of motorist navigation and information systems. In: Proc. Vehicle Navigation and Information Systems, September 11-13, Toronto, Canada.

  69. Srinivasan, R., Jovanis, P. (1997). Effect of in-vehicle route guidance systems on driver workload and choice of vehicle speed: Findings from a driving simulator experiment. In: Noy, Y. I. (ed) Ergonomics and Safety of Intelligent Driver Interfaces, Lawrence Erlbaum Associates Inc., Publishers, Mahwah, New Jersey, 97-114.

  70. Horswill, M., McKenna, F. (1999). The effect of interference on dynamic risk-taking judgments. Br. J. Psychol., 90, 189-199.

  71. Strayer, D., Drews, F., Johnston, W. (2003). Cell phone induced failures of visual attention during simulated driving. J. Exp. Psychol.: Appl., 9 (1), 23-32.

  72. Merat, N., Jamson, A. H. (2005). Shut up I'm driving! Is talking to an inconsiderate passenger the same as talking on a mobile telephone. In: 3rd Int. Driving Symp. on Human Factors in Driver Assessment, Training, and Vehicle Design. Rockport, Maine.

  73. Nass, C. et al. (2005). Improving automotive safety by pairing driver emotion and car voice emotion. In: CHI '05 Extended Abstracts on Human Factors in Computing Systems. ACM Press, New York, NY.

  74. Brouwer, W. H. (1993). Older drivers and attentional demands: consequences for human factors research. In: Proc. Human Factors and Ergonomics Society-Europe, Chapter on Aging and Human Factors. Soesterberg, Netherlands, 93-106.

  75. Ponds, R. W., Brouwer, W. H., Wolffelaar, P. C. (1988). Age differences in divided attention in a simulated driving task. J. Gerontol., 43 (6), 151-156.

  76. Zajicek, M., Hall, S. (1999). Solutions for elderly visually impaired people using the Internet. In: The 'Technology Push' and The User Tailored Information Environment, 5th Eur. Research Consortium for Informatics and Mathematics - ERCIM. 2000. Dagstuhl, Germany, November 28-December 1.

  77. Zajicek, M., Morrissey, W. (2001). Speech output for older visually impaired adults. In: Blandford, A., Vanderdonckt, J., Gray, P. (eds) People and Computers XV - Interacting without Frontiers, Springer Verlag, 503-513.

  78. Fiske, S., Taylor, S. (1991). Social Cognition. McGraw-Hill, New York, NY.

  79. Lazarsfeld, P., Merton, R. (1948). Mass communication-popular taste and organized social action. In: Bryson, L. (ed) Institute for Religious and Social Studies, New York.

  80. Rogers, E., Bhowmik, D. (1970). Homophily-Heterophily: Relational concepts for communication research. Public Opinion Q., 34, 523.

  81. Dulude, L. (2002). Automated telephone answering systems and aging. Behav. Inform. Technol., 21, 171-184.

  82. Van Der Laan, J., Heino, A., De Waard, D. (1997). A simple procedure for the assessment of acceptance of advanced transport telematics. Transport Res. C, 5 (1), 1-10.

  83. Dybkjær, L., Bernsen, N. O., Minker, W. (2004). Evaluation and usability of multimodal spoken language dialogue systems. Speech Commun., 43, 33-54.

  84. Graham, R., Aldridge, L., Carter, C., Lansdown, T. C. (1999). The design of in-car speech recognition interfaces for usability and user acceptance. In: Harris, D. (ed) Engineering Psychology and Cognitive Ergonomics, Ashgate, Aldershot, 313-320.

  85. Larsen, L. B. (2003). Assessment of spoken dialogue system usability - what are we really measuring? In: 8th Eur. Conf. on Speech Communication and Technology - Eurospeech 2003. September 1-4, Geneva, Switzerland.

  86. Zajicek, M., Jonsson, I. M. (2005). Evaluation and context for in-car speech systems for older adults. In: The 2nd Latin American Conf. on Human-Computer Interaction, CLIHC, Cuernavaca, México, October 23-26, 2005.

  87. Chen, F. (2004). Speech interaction system - how to increase its usability. In: The 8th Int. Conf. on Spoken Language Processing, Interspeech. ICSLP, Jeju Island, Korea, Oct 4-8, 2004.

  88. Norman, D. (2007). The Design of Future Things. Basic Books, New York.

  89. Jordan, P. W. (2000). Designing Pleasurable Products. Taylor & Francis, London and New York.

Author information

Corresponding author

Correspondence to Fang Chen.

Copyright information

© 2010 Springer Science+Business Media, LLC

About this chapter

Cite this chapter

Chen, F., Jonsson, IM., Villing, J., Larsson, S. (2010). Application of Speech Technology in Vehicles. In: Chen, F., Jokinen, K. (eds) Speech Technology. Springer, New York, NY. https://doi.org/10.1007/978-0-387-73819-2_11

  • DOI: https://doi.org/10.1007/978-0-387-73819-2_11

  • Publisher Name: Springer, New York, NY

  • Print ISBN: 978-0-387-73818-5

  • Online ISBN: 978-0-387-73819-2

  • eBook Packages: Engineering (R0)
