Abstract

Identifying the weaknesses of a system, and establishing test criteria and measures that make different systems and concepts comparable, are among the major challenges in the evaluation of human–computer interfaces. We gave a synopsis of the evaluation of SLDSs in Section 1.5 (cf. also Fig. 1.3). In the following section, we outline further aspects of the evaluation of SLDSs and their components. In the remainder of this chapter, we present an in-depth evaluation of the performance of the emotion recognizers described in Chapters 4 and 5, and we describe our approach to measuring the usability of the dialogue manager described in Chapter 5.3.

Corresponding author

Correspondence to Johannes Pittermann.

Copyright information

© 2010 Springer Science+Business Media B.V.

About this chapter

Cite this chapter

Pittermann, J., Pittermann, A., Minker, W. (2010). Evaluation. In: Handling Emotions in Human-Computer Dialogues. Springer, Dordrecht. https://doi.org/10.1007/978-90-481-3129-7_6
