
Two-Level Fusion to Improve Emotion Classification in Spoken Dialogue Systems

  • Ramón López-Cózar
  • Zoraida Callejas
  • Martin Kroul
  • Jan Nouza
  • Jan Silovský
Part of the Lecture Notes in Computer Science book series (LNCS, volume 5246)

Abstract

This paper proposes a technique that enhances emotion classification in spoken dialogue systems by means of two fusion modules. The first module combines the emotion predictions generated by a set of classifiers, each of which deals with a different kind of information about the sentences uttered by the user. To do so, it applies several fusion methods, each producing its own prediction of the user’s emotional state. These predictions are then input to the second fusion module, which combines them to decide the user’s emotional state. Experiments were carried out with two emotion categories (‘Non-negative’ and ‘Negative’) and with classifiers that handle prosodic, acoustic, lexical and dialogue-act information. The results show that the first fusion module significantly increases the classification rates of a baseline and of the classifiers working separately, in line with previous findings in the literature. The novelty of the technique lies in the second fusion module, which improves the classification rate by a further 2.25% absolute.
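
The two-level architecture described above can be sketched in a few lines of code. The following is a minimal illustration in Python, assuming that each classifier outputs a posterior distribution over the two emotion categories; the first-level fusion methods shown (averaging, multiplication and majority voting) and the second-level averaging are illustrative assumptions, not necessarily the methods evaluated in the paper.

    import numpy as np

    CATEGORIES = ("Non-negative", "Negative")

    def mean_fusion(posteriors):
        # First-level method: average the classifiers' posteriors.
        return posteriors.mean(axis=0)

    def product_fusion(posteriors):
        # First-level method: multiply the posteriors and renormalise.
        p = posteriors.prod(axis=0)
        return p / p.sum()

    def vote_fusion(posteriors):
        # First-level method: one vote per classifier for its top category.
        votes = np.bincount(posteriors.argmax(axis=1), minlength=len(CATEGORIES))
        return votes / votes.sum()

    def two_level_fusion(posteriors, methods=(mean_fusion, product_fusion, vote_fusion)):
        # Level 1: each fusion method combines the classifier outputs into
        # its own prediction of the user's emotional state.
        level_one = np.array([m(posteriors) for m in methods])
        # Level 2: combine the first-level predictions (here by averaging)
        # to reach the final decision.
        return CATEGORIES[int(level_one.mean(axis=0).argmax())]

    # Hypothetical posteriors from the prosodic, acoustic, lexical and
    # dialogue-act classifiers for one user utterance.
    posteriors = np.array([[0.60, 0.40],   # prosodic
                           [0.30, 0.70],   # acoustic
                           [0.55, 0.45],   # lexical
                           [0.20, 0.80]])  # dialogue acts
    print(two_level_fusion(posteriors))    # -> Negative

In this sketch the second level simply averages the first-level outputs; any other combination rule (e.g. weighted voting) could be substituted there without changing the overall structure.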

Keywords

Emotion Recognition · Fusion Method · Emotion Category · Speech Database · Voice Sample



Copyright information

© Springer-Verlag Berlin Heidelberg 2008

Authors and Affiliations

  • Ramón López-Cózar (1)
  • Zoraida Callejas (1)
  • Martin Kroul (2)
  • Jan Nouza (2)
  • Jan Silovský (2)

  1. Dept. of Languages and Computer Systems, University of Granada, Spain
  2. Institute of Information Technology and Electronics, Technical University of Liberec, Czech Republic
