Application of Speech Technology in Vehicles

Chen, Fang; Jonsson, Ing-Marie; Villing, Jessica; Larsson, Staffan

doi:10.1007/978-0-387-73819-2_11

Abstract

Speech technology has been regarded as one of the most interesting technologies for operating in-vehicle information systems. Cameron [1] has pointed out that people are more likely to use a speech system when at least one of four criteria is met: (1) they are offered no choice; (2) it corresponds to the privacy of their surroundings; (3) their hands or eyes are busy with another task; and (4) it is quicker than any alternative. For the driver, driving is a typical “hands and eyes busy” task. In most situations, the driver is either alone in the car or accompanied by passengers who know each other well, so the “privacy of surroundings” criterion is also met. There is a long history of interest in applying speech technology to the control of in-vehicle information systems, and some commercial cars are already equipped with embedded speech technology. In 1996, the Mercedes-Benz S-Class introduced Linguatronic, the first generation of in-car speech systems for anybody who drives a car [2]. Since then, the number of in-vehicle applications using speech technology has been increasing [3].

References

Cameron, H. (2000). Speech at the interface. In: Workshop on "Voice Operated Telecom Services". Ghent, Belgium, COST 249.
Heisterkamp, P. (2001). Linguatronic - Product-level speech system for Mercedes-Benz Cars. In: Proc. HLT, San Diego, CA, USA.
Hamerich, S. W. (2007). Towards advanced speech driven navigation systems for cars. In: 3rd IET Int. Conf. on Intelligent Environments, IE07, Sept. 24-25, Ulm, Germany.
Goose, S., Djennane, S. (2002). WIRE3: Driving around the information super-highway. Pers. Ubiquitous Comput., 6, 164-175.
Nass, C., Jonsson, I.-M., Harris, H., Reaves, B., Endo, J., Brave, S., Takayama, L. (2005). Improving automotive safety by pairing driver emotion and car voice emotion. In: CHI '05 Extended Abstracts on Human factors in Computing Systems. ACM Press, New York, NY.
Nass, C., Brave, S. B. (2005). Wired for Speech: How Voice Activates and Advances the Human-Computer Relationship. MIT Press, Cambridge, MA.
Bishop, R. (2005). Intelligent Vehicle Technology and Trends. Artech House, Boston.
van de Weijer, C. (2008). Keynote 1: Dutch connected traffic in practice and in the future. In: IEEE Intelligent Vehicles Sympos. Eindhoven, The Netherlands, June 4-6.
Gardner, M. (2008). Nomadic device integration in Aide. In: Proc. AIDE Final Workshop and Exhibition. April 15-16, Goteborg, Sweden.
Johansson, E., Engstrom, J., Cherri, C., Nodari, E., Toffetti, A., Schindhelm, R., Gelau, C. (2004). Review of existing techniques and metrics for IVIS and ADAS assessment. EU Information Society Technology (IST) program IST-1-507674-IP: Adaptive Integrated Driver-Vehicle Interface (AIDE).
Lee, J. D., Caven, B., Haake, S., Brown, T. L. (2001). Speech-based interaction with in-vehicle computer: The effect of speech-based e-mail on driver's attention to the roadway. Hum. Factors, 43, 631-640.
Barón, A., Green, P. (2006). Safety and Usability of Speech Interfaces for In-Vehicle Tasks while Driving: A Brief Literature Review. Transportation Research Institute (UMTRI), The University of Michigan.
Saad, F., Hjalmdahl, M., Cañas, J., Alonso, M., Garayo, P., Macchi, L., Nathan, F., Ojeda, L., Papakostopoulos, V., Panou, M., Bekiaris, E. (2004). Literature review of behavioural effects. EU Information Society Technology (IST) program: IST-1-507674-IP, Adaptive Integrated Driver-Vehicle Interface (AIDE).
Treffner, P. J., Barrett, R. (2004). Hands-free mobile phone speech while driving degrades coordination and control. Transport. Res. F, 7, 229-246.
Esbjornsson, M., Juhlin, O., Weilenmann, A. (2007). Drivers using mobile phones in traffic: An ethnographic study of interactional adaption. Int. J. Hum. Comput. Inter., Special Issue on: In-Use, In-Situ: Extending Field Research Methods, 22 (1), 39-60.
Jonsson, I.-M., Chen, F. (2006). How big is the step for driving simulators to driving a real car? In: IEA 2006 Congress, Maastricht, The Netherlands, July 10-14.
Chen, F., Jordan, P. (2008). Zonal adaptive workload management system: Limiting secondary task while driving. In: IEEE Intelligent Transportation System, IVs' 08, Eindhoven, The Netherlands, June 2-6.
Esbjörnsson, M., Brown, B., Juhlin, O., Normark, D., Östergren, M., Laurier, E. (2006). Watching the cars go round and round: designing for active spectating. In: Proc. SIGCHI Conf. on Human Factors in computing systems, Montréal, Québec, Canada, 2006.
Recarte, M. A., Nunes, L. M. (2003). Mental workload while driving: Effects on visual search, discrimination, and decision making. J. Exp. Psychol.: Appl., 9 (2), 119-137.
Victor, T. W., Harbluk, J. L., Engstrom, J. A. (2005). Sensitivity of eye-movement measures to in-vehicle task difficulty. Transport. Res. Part F, 8 (2), 167-190.
Hart, S. G., Staveland, L. E. (1988). Development of NASA-TLX (Task Load Index): Results of empirical and theoretical research. In: Hancock, P. A., Meshkati, N. (eds) Human Mental Workload. Elsevier Science Publishers B.V., North-Holland, 139-183.
Pauzie, A., Sparpedon, A., Saulnier, G. (2007). Ergonomic evaluation of a prototype guidance system in an urban area. Discussion about methodologies and data collection tools, in Vehicle Navigation and Information Systems Conference. In: Proc. in conjunction with the Pacific Rim TransTech Conf. 6th Int. VNIS. "A Ride into the Future", Seattle, WA, USA.
Wang, E., Chen, F. (2008). A new measurement for simulator driving performance in situation without interfere from other vehicles, International Journal of Transportation Systems F. AEI 2008. In: Applied Human Factors and Ergonomics 2008, 2nd Int. Conf., Las Vegas, USA, July 14-17.
Wilson, G. F., Lambert, J. D., Russell, C. A. (2002). Performance enhancement with real-time physiologically controlled adaptive aiding. In: HFA Workshop: Psychophysiological Application to Human Factors, March 11-12, 2002. Swedish Center for Human Factors in Aviation.
Wilson, G. F. (2002). Psychophysiological test methods and procedures. In: HFA Workshop: Psychophysiological Application to Human Factors, March 11-12, 2002. Swedish Center for Human Factors in Aviation.
Lai, J., Cheng, K., Green, P., Tsimhoni, O. (2001). On the road and on the web? Comprehension of synthetic and human speech while driving. In: Conf. on Human Factors and Computing Systems, CHI 2001, 31 March-5 April 2001. Seattle, Washington, USA.
Hermansky, H., Morgan, N. (1994). RASTA processing of speech. IEEE Trans. Speech Audio Process., 2 (4), 578-589.
Kermorvant, C. (1999). A comparison of noise reduction techniques for robust speech recognition. IDIAP research report, IDIAP-RR-99-10, Dalle Molle Institute for perceptual Artificial Intelligence, Valais, Switzerland.
Furui, S. (1986). Speaker-independent isolated word recognition using dynamic features of speech spectrum. IEEE Trans. Acoustics, Speech Signal Process., 34 (1), 52-59.
Mansour, D., Juang, B.-H. (1989). The short-time modified coherence representation and noisy speech recognition. IEEE Trans. Acoustics Speech Signal Process., 37 (6), 795-804.
Hernando, J., Nadeu, C. (1997). Linear prediction of the one-sided autocorrelation sequence for noisy speech recognition. IEEE Trans. Speech Audio Process., 5 (1), 80-84.
Chen, J., Paliwal, K. K., Nakamura, S. (2003). Cepstrum derived from differentiated power spectrum for robust speech recognition. Speech Commun., 41 (2-3), 469-484.
Yuo, K.-H., Wang, H.-C. (1998). Robust features derived from temporal trajectory filtering for speech recognition under the corruption of additive and convolutional noises. In: Proc. IEEE Int. Conf. on Acoustics, Speech, and Signal Processing, April 21-24, 1997, Munich, Bavaria, Germany.
Yuo, K.-H., Wang, H.-C. (1999). Robust features for noisy speech recognition based on temporal trajectory filtering of short-time autocorrelation sequences. Speech Commun., 28, 13-24.
Lebart, K., Boucher, J. M. (2001). A new method based on spectral subtraction for speech dereverberation. Acta Acustica united with Acustica, 87, 359-366.
Lee, C.-H., Soong, F. K., Paliwal, K. K. (1996). Automatic Speech and Speaker Recognition. Kluwer, Norwell.
Gales, M. J. F., Young, S. J. (1995). Robust speech recognition in additive and convolutional noise using parallel model combination. Comput. Speech Lang., 9, 289-307.
Gales, M. J. F., Young, S. J. (1996). Robust continuous speech recognition using parallel model combination. IEEE Trans. Speech Audio Process., 4 (5), 352-359.
Acero, A., Deng, L., Kristjansson, T., Zhang, J. (2000). HMM adaptation using vector Taylor series for noisy speech recognition. In: Proc. ICASSP, June 05-09, 2000, Istanbul, Turkey.
Kim, D. Y., Un, C. K., Kim, N. S. (1998). Speech recognition in noisy environments using first-order vector Taylor series. Speech Commun., 24 (1), 39-49.
Visser, E., Otsuka, M., Lee, T.-W. (2003). A spatio-temporal speech enhancement scheme for robust speech recognition in noisy environments. Speech Commun., 41, 393-407.
Farahani, G., Ahadi, S. M., Homayounpour, M. M. (2007). Features based on filtering and spectral peaks in autocorrelation domain for robust speech recognition. Comput. Speech Lang., 21, 187-205.
Choi, E. H. C. (2004). Noise robust front-end for ASR using spectral subtraction, spectral flooring and cumulative distribution mapping. In: Proc. 10th Australian Int. Conf. on Speech Science & Technology. Macquarie University, Sydney, December 8-10.
Fernandez, R., Corradini, A., Schlangen, D., Stede, M. (2007). Towards reducing and managing uncertainty in spoken dialogue systems. In: The Seventh International Workshop on Computational Semantics (IWCS-7). Tilburg, The Netherlands, Jan 10-12.
Skantze, G. (2005). Exploring human error recovery strategies: Implications for spoken dialogue systems. Speech Commun., 45 (3), 325-341.
Gellatly, A. W., Dingus, T. A. (1998). Speech recognition and automotive applications: using speech to perform in-vehicle tasks. In: Proc. Human Factors and Ergonomics Society 42nd Annual Meeting, October 5-9, 1998, Hyatt Regency Chicago, Chicago, Illinois.
Greenberg, J., Tijerina, L., Curry, R., Artz, B., Cathey, L., Grant, P., Kochhar, D., Kozak, K., Blommer, M. (2003). Evaluation of driver distraction using an event detection paradigm. In: Proc. Transportation Research Board Annual Meetings, January 12-16, 2003, Washington, DC.
McCallum, M. C., Campbell, J. L., Richman, J. B., Brown, J. (2004). Speech recognition and in-vehicle telematics devices: Potential reductions in driver distraction. Int. J. Speech Technol., 7, 25-33.
Bernsen, N. O., Dybkjaer, L. (2002). A multimodal virtual co-driver's problems with the driver. In: ISCA Tutorial and Research Workshop on Multi-Modal Dialogue in Mobile Environments Proceedings. Kloster Irsee, Germany, June 17-19.
Geutner, P., Steffens, F., Manstetten, D. (2002). Design of the VICO Spoken Dialogue System: Evaluation of User Expectations by Wizard-of-Oz Experiments. In: Proc. 3rd Int. Conf. on Language Resources and Evaluation (LREC 2002). Las Palmas, Spain, May.
Villing, J., Larsson, S. (2006). Dico: A multimodal menu-based in-vehicle dialogue system. In: The 10th Workshop on the Semantics and Pragmatics of Dialogue, brandial'06 (Sem-Dial 10). Potsdam, Germany, Sept 11-13.
Larsson, S. (2002). Issue-based dialogue management. PhD Thesis, Goteborg University.
Bringert, B., Ljunglöf, P., Ranta, A., Cooper, R. (2005). Multimodal dialogue systems grammars. In: The DIALOR'05, 9th Workshop on the Semantics and Pragmatics of Dialogue. Nancy (France), June 9-11, 2005.
Oviatt, S. (2004). When do we interact multimodally? Cognitive load and multimodal communication patterns. In: Proc. 6th Int. Conf. on Multimodal Interfaces. Pennsylvania, Oct 14-15.
Bernsen, O., Dybkjaer, L. (2001). Exploring natural interaction in the car. In: Proc. CLASS Workshop on Natural Interactivity and Intelligent Interactive Information Representation, Verona, Italy, Dec 2001.
Esbjörnsson, M., Juhlin, O., Weilenmann, A. (2007). Drivers using mobile phones in traffic: An ethnographic study of interactional adaption. Int. J. Hum Comput Interact., Special Issue on In-Use, In-Situ: Extending Field Research Meth., 22 (1), 39-60.
Jonsson, I.-M., Nass, C., Endo, J., Reaves, B., Harris, H., Ta, J. L., Chan, N., Knapp, S. (2004). Don't blame me I am only the driver: Impact of blame attribution on attitudes and attention to driving task. In: CHI '04 extended Abstracts on Human Factors in Computing Systems, Vienna, Austria.
Jonsson, I.-M., Zajicek, M. (2005). Selecting the voice for an in-car information system for older adults. In: Human Computer Interaction Int. Las Vegas, Nevada, USA.
Jonsson, I.-M., Zajicek, M., Harris, H., Nass, C. I. (2005). Thank you I did not see that: In-car speech-based information systems for older adults. In: Conf. on Human Factors in Computing Systems. ACM Press, Portland, OR.
Jonsson, I. M., Nass, C. I., Harris, H., Takayama, L. (2005). Got Info? Examining the consequences of inaccurate information systems. In: Int. Driving Symp. on Human Factors in Driver Assessment, Training, and Vehicle Design. Rockport, Maine.
Gross, J. J. (1999). Emotion and emotion regulation. In: Pervin, L. A., John, O. P. (eds) Handbook of Personality: Theory and Research. Guilford, New York, 525-552.
Picard, R. W. (1997). Affective Computing. MIT Press, Cambridge, MA.
Clore, G. L., Gasper, K. (2000). Feeling is believing: Some affective influences on belief. In: Frijda, N. H., Manstead, A. S. R., Bem, S. (eds) Emotions and Beliefs: How Feelings Influence Thoughts. Editions de la Maison des Sciences de l'Homme and Cambridge University Press (jointly published), Paris/Cambridge, 10-44.
Gross, J. J. (1998). Antecedent- and response-focused emotion regulation: Divergent consequences for experience, expression, and physiology. J. Personality Social Psychol., 74, 224-237.
Davidson, R. J. (1994). On emotion, mood, and related affective constructs. In: Ekman, P., Davidson, R. J. (eds) The Nature of Emotion. Oxford University Press, New York, 51-55.
Bower, G. H., Forgas, J. P. (2000). Affect, memory, and social cognition. In: Eich, E., Kihlstrom, J. F., Bower, G. H., Forgas, J. P., Niedenthal, P. M. (eds) Cognition and Emotion. Oxford University Press, Oxford, 87-168.
Groeger, J. A. (2000). Understanding Driving: Applying Cognitive Psychology to a Complex Everyday Task. Psychology Press, Philadelphia, PA.
Lunenfeld, H. (1989). Human factor considerations of motorist navigation and information systems. In: Proc. Vehicle Navigation and Information Systems, September 11-13, Toronto, Canada.
Srinivasan, R., Jovanis, P. (1997). Effect of in-vehicle route guidance systems on driver workload and choice of vehicle speed: Findings from a driving simulator experiment. In: Noy, Y. I. (ed) Ergonomics and Safety of Intelligent Driver Interfaces. Lawrence Erlbaum Associates Inc., Publishers, Mahwah, New Jersey, 97-114.
Horswill, M., McKenna, F. (1999). The effect of interference on dynamic risk-taking judgments. Br. J. Psychol., 90, 189-199.
Strayer, D., Drews, F., Johnston, W. (2003). Cell phone induced failures of visual attention during simulated driving. J. Exp. Psychol.: Appl., 9 (1), 23-32.
Merat, N., Jamson, A. H. (2005). Shut up I'm driving! Is talking to an inconsiderate passenger the same as talking on a mobile telephone. In: 3rd Int. Driving Symp. on Human Factors in Driver Assessment, Training, and Vehicle Design. Rockport, Maine.
Nass, C. et al. (2005). Improving automotive safety by pairing driver emotion and car voice emotion. In: CHI '05 Extended Abstracts on Human Factors in Computing Systems. ACM Press, New York, NY.
Brouwer, W. H. (1993). Older drivers and attentional demands: consequences for human factors research. In: Proc. Human Factors and Ergonomics Society-Europe, Chapter on Aging and Human Factors. Soesterberg, Netherlands, 93-106.
Ponds, R. W., Brouwer, W. H., Wolffelaar, P. C. (1988). Age differences in divided attention in a simulated driving task. J. Gerontol., 43 (6), 151-156.
Zajicek, M., Hall, S. (1999). Solutions for elderly visually impaired people using the Internet. In: The 'Technology Push' and The User Tailored Information Environment, 5th Eur. Research Consortium for Informatics and Mathematics - ERCIM. 2000. Dagstuhl, Germany, November 28-December 1.
Zajicek, M., Morrissey, W. (2001). Speech output for older visually impaired adults. In: Blandford, A., Vanderdonckt, J., Gray, P. (eds) People and Computers XV - Interacting without Frontiers, Springer Verlag, 503-513.
Fiske, S., Taylor, S. (1991). Social Cognition. McGraw-Hill, New York, NY.
Lazarsfeld, P., Merton, R. (1948). Mass communication-popular taste and organized social action. In: Bryson, L. (ed) Institute for Religious and Social Studies, New York.
Rogers, E., Bhowmik, D. (1970). Homophily-Heterophily: Relational concepts for communication research. Public Opinion Q., 34, 523.
Dulude, L. (2002). Automated telephone answering systems and aging. Behav. Inform. Technol., 21, 171-184.
Van Der Laan, J., Heino, A., De Waard, D. (1997). A simple procedure for the assessment of acceptance of advanced transport telematics. Transport Res. C, 5 (1), 1-10.
Dybkjær, L., Bernsen, N. O., Minker, W. (2004). Evaluation and usability of multimodal spoken language dialogue systems. Speech Commun., 43, 33-54.
Graham, R., Aldridge, L., Carter, C., Lansdown, T. C. (1999). The design of in-car speech recognition interfaces for usability and user acceptance. In: Harris, D. (ed) Engineering Psychology and Cognitive Ergonomics, Ashgate, Aldershot, 313-320.
Larsen, L. B. (2003). Assessment of spoken dialogue system usability - what are we really measuring? In: 8th Eur. Conf. on Speech Communication and Technology - Eurospeech 2003. September 1-4, Geneva, Switzerland.
Zajicek, M., Jonsson, I. M. (2005). Evaluation and context for in-car speech systems for older adults. In: The 2nd Latin American Conf. on Human-Computer Interaction, CLIHC, Cuernavaca, México, October 23-26, 2005.
Chen, F. (2004). Speech interaction system - how to increase its usability. In: The 8th Int. Conf. on Spoken Language Processing, Interspeech. ICSL, Jeju Island, Korea, Oct 4-8, 2004.
Norman, D. (2007). The Design of Future Things. Basic Books, New York.
Jordan, P. W. (2000). Designing Pleasurable Products. Taylor & Francis, London and New York.

Author information

Authors and Affiliations

Interaction Design, Department of Computer Science and Engineering, Chalmers University of Technology, Gothenburg, Sweden
Fang Chen
Toyota Information Technology Center, Palo Alto, CA, USA
Ing-Marie Jonsson
Department of Philosophy, Linguistics and Theory of Science, University of Gothenburg, Göteborg, Sweden
Jessica Villing & Staffan Larsson

Corresponding author

Correspondence to Fang Chen.

Editor information

Editors and Affiliations

Department of Computing Science & Engineering, Chalmers University of Technology, 412 96, Göteborg, Sweden
Fang Chen
Department of Speech Sciences, University of Helsinki, 9, FIN-00014, Helsinki, Finland
Kristiina Jokinen

About this chapter

Cite this chapter

Chen, F., Jonsson, IM., Villing, J., Larsson, S. (2010). Application of Speech Technology in Vehicles. In: Chen, F., Jokinen, K. (eds) Speech Technology. Springer, New York, NY. https://doi.org/10.1007/978-0-387-73819-2_11

DOI: https://doi.org/10.1007/978-0-387-73819-2_11
Published: 17 April 2010
Publisher Name: Springer, New York, NY
Print ISBN: 978-0-387-73818-5
Online ISBN: 978-0-387-73819-2
eBook Packages: Engineering, Engineering (R0)