Decisive Factors in the Annotation of Emotions for Spoken Dialogue Systems
The recognition of human emotions is an important step towards implementing more natural computer interfaces. Careful annotation of the emotional corpora employed by researchers is fundamental to optimizing the performance of the emotion recognizers built on them. In this paper we discuss several aspects to be considered in order to obtain as much information as possible from this kind of corpora, and propose a novel method to include them automatically during the annotation procedure. The experimental results show that considering information about the user-system interaction context, as well as the neutral speaking style of users, yields a more fine-grained human annotation and can improve machine-learned annotation accuracy by 24.52%, compared with classical annotation based solely on acoustic features.
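The idea of enriching acoustic features with interaction context can be sketched as follows. The feature names, thresholds, and the simple rule-based labeller below are illustrative assumptions for exposition only, not the method or features described in the paper.

```python
# Sketch: combining acoustic features, the user's neutral speaking style,
# and user-system interaction context for emotion annotation.
# All feature names and thresholds are illustrative assumptions.

def normalise_to_neutral(acoustic, neutral_baseline):
    """Express each acoustic feature relative to the user's neutral style."""
    return {k: acoustic[k] - neutral_baseline[k] for k in acoustic}

def annotate(acoustic, neutral_baseline, context):
    """Label an utterance using normalised acoustics plus dialogue context."""
    rel = normalise_to_neutral(acoustic, neutral_baseline)
    # Context cue: repeated recognition errors often precede user frustration,
    # so a smaller acoustic deviation suffices to annotate a negative emotion.
    if context["consecutive_errors"] >= 2 and rel["energy"] > 0.1:
        return "angry"
    # Without contextual evidence, require a larger deviation from neutral.
    if rel["pitch"] > 0.2 and rel["energy"] > 0.2:
        return "angry"
    return "neutral"

# Example: slightly raised energy after two failed system turns.
utterance = {"pitch": 0.05, "energy": 0.15}
baseline = {"pitch": 0.0, "energy": 0.0}
print(annotate(utterance, baseline, {"consecutive_errors": 2}))
```

Note how the same acoustic evidence would be annotated as neutral in a context with no recognition errors, which is the kind of finer-grained distinction the interaction context makes possible.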