Language Resources and Evaluation, Volume 40, Issue 1, pp 67–85

Adaptation of an automotive dialogue system to users’ expertise and evaluation of the system

Original Paper

Abstract

Spoken dialogue systems (SDSs) can be used to operate devices, e.g. in the automotive environment. People using these systems usually have different levels of experience. However, most systems do not take this into account. In this paper, we present a method to build a dialogue system in an automotive environment that automatically adapts to the user’s experience with the system. We implemented the adaptation in a prototype and carried out exhaustive tests. Our usability tests show that adaptation increases both user performance and user satisfaction. We describe the tests that were performed, and the methods used to assess the test results. One of these methods is a modification of PARADISE, a framework for evaluating the performance of SDSs [Walker MA, Litman DJ, Kamm CA, Abella A (Comput Speech Lang 12(3):317–347, 1998)]. We discuss its drawbacks for the evaluation of SDSs like ours, the modifications we have carried out, and the test results.

Keywords

Adaptation, Evaluation

Abbreviations

SDS: Spoken dialogue system

ASR: Automatic speech recognition

GUI: Graphical user interface

PTT: Push to talk

AVM: Attribute value matrix

OOV: Out of vocabulary

US: User satisfaction

Acknowledgements

We thank Professor Klaus Schulz (LMU, Munich) for helpful discussions that clarified our ideas and for comments on earlier drafts. We would also like to express our gratitude to Stefan Pöhn (Berner & Mattner) for the programming, which helped to make our often chaotic ideas concrete. Thanks to Alexander Huber (BMW AG) for his continued and encouraging support. We are also indebted to the anonymous reviewers for their careful reading and helpful comments. Last but not least, we thank Laura Ramirez-Polo for amending the drafts of this article.

References

  1. Aguilera, E. J. G., Bernsen, N. O., Bescós, S. R., Dybkjær, L., Fanard, F.-X., Hernandez, P. C., Macq, B., Martin, O., Nikolakis, G., de la Orden, P. L., Paternò, F., Santoro, C., Trevisan, D., Tzovaras, D., & Vanderdonckt, J. (2004). Usability evaluation issues in natural interactive and multimodal systems—State of the art and current practice (draft version). Technical report, NISLab, University of Southern Denmark. Project SIMILAR SIG7 on Usability and Evaluation, Deliverable D16.
  2. Akyol, S., Libuda, L., & Kraiss, K.-F. (2001). Multimodale Benutzung adaptiver Kfz-Bordsysteme. In T. Jürgensohn & K.-P. Timpe (Eds.), Kraftfahrzeugführung (pp. 137–154). Berlin: Springer-Verlag.
  3. Allen, J. F., & Core, M. G. (1997). Draft of DAMSL: Dialog Act Markup in Several Layers. http://www.cs.rochester.edu/research/cisd/resources/damsl.
  4. Beringer, N., Kartal, U., Louka, K., Schiel, F., & Türk, U. (2002). PROMISE—A procedure for multimodal interactive system evaluation. Technical report, LMU München, Institut für Phonetik und sprachliche Kommunikation. Teilprojekt 1: Modalitätsspezifische Analysatoren, Report Nr. 23.
  5. Bernsen, N. O., & Dybkjær, L. (2001). Exploring natural interaction in the car. In International workshop on information presentation and natural multimodal dialogue, Verona, Italy, pp. 75–79.
  6. Bühner, M. (2004). Einführung in die Test- und Fragebogenkonstruktion. München: Pearson Studium.
  7. Carletta, J. (1996). Assessing agreement on classification tasks: The kappa statistic. Computational Linguistics, 22(2), 249–254.
  8. Clark, H. H. (1997). Using language. Cambridge, New York, Melbourne: Cambridge University Press.
  9. Cnossen, F., Meijman, T., & Rothengatter, T. (2004). Adaptive strategy changes as a function of task demands: A study of car drivers. Ergonomics, 47(2), 218–236.
  10. Core, M. G., & Allen, J. F. (1997). Coding dialogs with the DAMSL annotation scheme. In AAAI Fall 1997 symposium on communicative action in humans and machines, American Association for Artificial Intelligence (AAAI) (pp. 28–35). URL: http://www.citeseer.nj.nec.com/core97coding.htm.
  11. DIN EN ISO 9241-10 (1996). Ergonomische Anforderungen für Bürotätigkeiten mit Bildschirmgeräten, Teil 10: Grundsätze der Dialoggestaltung. DIN EN ISO 9241-10.
  12. Edelmann, W. (1996). Lernpsychologie (5th ed.). Weinheim: Psychologie Verlagsunion.
  13. Hagen, E., Said, T., & Eckert, J. (2004). Spracheingabe im neuen BMW 6er. Sonderheft ATZ/MTZ (Der neue BMW 6er), May, pp. 134–139.
  14. Haller, R. (2003). The display and control concept iDrive—Quick access to all driving and comfort functions. ATZ/MTZ Extra (The New BMW 5-Series), August, pp. 51–53.
  15. Hassel, L. (2006). Adaption eines Sprachbediensystems im Automobilbereich an den Erfahrungsgrad des Anwenders und Evaluation von Konzepten zur Verbesserung der Bedienbarkeit des Sprachsystems. PhD thesis, Ludwig Maximilian Universität, Abschlussarbeit für das Aufbaustudium Computerlinguistik.
  16. Hassel, L., & Hagen, E. (2005). Evaluation of a dialogue system in an automotive environment. In Proceedings of the 6th SIGdial workshop on discourse and dialogue, Lisbon, Portugal, 2–3 September 2005, pp. 155–165.
  17. Heisterkamp, P. (2001). Linguatronic—Product-level speech system for Mercedes-Benz cars. In Proceedings of the 1st international conference on human language technology research (HLT), San Diego, CA, USA.
  18. Hjalmarsson, A. (2002). Evaluating AdApt, a multi-modal conversational, dialogue system using PARADISE. Master’s thesis, Department of Speech, Music and Hearing, KTH Royal Institute of Technology, Stockholm, Sweden.
  19. Hof, A. (2007). Entwicklung eines adaptiven Hilfesystems für multimodale Anzeige-Bedienkonzepte im Fahrzeug. PhD thesis, Universität Regensburg, Philosophische Fakultät IV (Sprach- und Literaturwissenschaften), to appear 2007.
  20. Jokinen, K., Kanto, K., Kerminen, A., & Rissanen, J. (2004). Evaluation of adaptivity and user expertise in a speech-based e-mail system. In B. Gambäck, & K. Jokinen (Eds.), Proceedings of the 20th international conference on computational linguistics (ACL): “Robust and adaptive information processing for mobile speech interfaces: DUMAS final workshop”, Geneva, Switzerland, pp. 44–52.
  21. Landauer, T. K. (1997). Behavioral research methods in human–computer interaction. In M. A. Helander, T. K. Landauer, & P. V. Prabhu (Eds.), Handbook of human–computer interaction (2nd ed., pp. 203–227). North-Holland, Amsterdam, Lausanne, New York, USA: ZMMS Forschungsbericht, 96-3.
  22. Larsen, L. B. (2003a). Evaluation methodologies for spoken and multi modal dialogue systems—Revision 2. May 2003 (draft version). Presented at the COST 278 MC-Meeting in Stockholm, Sweden.
  23. Larsen, L. B. (2003b). Issues on the evaluation of spoken dialogue systems using objective and subjective measures. In Proceedings of the 8th IEEE workshop on automatic speech recognition and understanding (ASRU), St. Thomas, U.S. Virgin Islands, pp. 209–214.
  24. Libuda, L. (2001). Improving clarification dialogs in speech command systems with the help of user modeling: A conceptualization for an in-car user interface. In Online-Proceedings des 9. GI-Workshops: ABIS-Adaptivität und Benutzermodellierung in interaktiven Softwaresystemen. GI-Fachgruppe: Adaptivität und Benutzermodellierung in Interaktiven Softwaresystemen (ABIS).
  25. Mourant, R. R., Tsai, F.-J., Al-Shihabi, T., & Jaeger, B. K. (2001). Divided attention ability of young and older drivers. In Proceedings of the 80th annual meeting of the transportation research board. Available online at http://www.nrd.nhtsa.dot.gov/departments/nrd-13/driver-distraction/PDF/9.PD.
  26. Nielsen, J. (1993). Usability Engineering. Boston, USA: Academic Press Professional.
  27. NIST (2001). Common industry format for usability test reports. Technical report, National Institute of Standards and Technology. Version 2.0, 18 May 2001.
  28. Paek, T. (2001). Empirical methods for evaluating dialog systems. In ACL 2001 workshop on evaluation methodologies for language and dialogue systems, Toulouse, France, pp. 1–9.
  29. Piechulla, W., Mayser, C., Gehrke, H., & König, W. (2003). Reducing drivers’ mental workload by means of an adaptive man–machine interface. Transportation Research Part F: Traffic Psychology and Behaviour, 6(4), 233–248.
  30. Rasch, B., Friese, M., Hofmann, W., & Naumann, E. (2004). Quantitative Methoden - Band 1. Berlin, Heidelberg: Springer-Verlag.
  31. Rich, E. (1979). User modeling via stereotypes. Cognitive Science, 3, 329–354.
  32. Rogers, S., Fiechter, C.-N., & Thompson, C. (2000). Adaptive user interfaces for automotive environments. In Proceedings of the IEEE intelligent vehicles (IV) symposium, Detroit, USA, pp. 662–667.
  33. Schütz, W., & Schäfer, R. (2002). Towards more realistic modelling of a user’s evaluation process. In ABIS-workshop 2002: Personalization for the mobile world, 9th–11th October 2002, during a week of workshops “LLA02: Learning–teaching–adaptivity” (pp. 91–98). Hannover, Germany: Learning Lab Lower Saxony (L3S).
  34. Siegel, S., & Castellan, N. J. (1988). Nonparametric statistics for the behavioral sciences. Singapore: McGraw-Hill International.
  35. Walker, M. A., Litman, D. J., Kamm, C. A., & Abella, A. (1998). Evaluating spoken dialogue agents with PARADISE: Two case studies. Computer Speech and Language, 12(3), 317–347.
  36. Whittaker, S., Terveen, L., & Nardi, B. A. (2000). Let’s stop pushing the envelope and start addressing it: A reference task agenda for HCI. Human Computer Interaction, 15, 75–106.
  37. Wu, J. (2000). Accomodating both experts and novices in one interface. Universal Usability Guide. Department of Computer Science, University of Maryland, http://www.otal.umd.edu/UUGuide.

Copyright information

© Springer Science+Business Media 2006

Authors and Affiliations

  1. Centre for Information and Language Processing, Ludwig Maximilian University, Munich, Germany
  2. Forschungs- und Innovationszentrum, BMW AG, Munich, Germany
