Abstract
Identifying the weaknesses of a system and establishing test criteria and measures that make different systems and concepts comparable are among the major challenges in the evaluation of human–computer interfaces. We gave a synopsis of the evaluation of spoken language dialogue systems (SLDSs) in Section 1.5 (cf. also Fig. 1.3). In the following section, we outline further aspects of the evaluation of SLDSs and their components. In the remainder of this chapter, we present an in-depth evaluation of the performance of the emotion recognizers described in Chapters 4 and 5, and we describe our approach to measuring the usability of the dialogue manager described in Section 5.3.
Copyright information
© 2010 Springer Science+Business Media B.V.
About this chapter
Cite this chapter
Pittermann, J., Pittermann, A., Minker, W. (2010). Evaluation. In: Handling Emotions in Human-Computer Dialogues. Springer, Dordrecht. https://doi.org/10.1007/978-90-481-3129-7_6
DOI: https://doi.org/10.1007/978-90-481-3129-7_6
Publisher Name: Springer, Dordrecht
Print ISBN: 978-90-481-3128-0
Online ISBN: 978-90-481-3129-7
eBook Packages: Humanities, Social Sciences and Law; Social Sciences (R0)