Robust Recognition of Emotion from Speech

  • Mohammed E. Hoque
  • Mohammed Yeasin
  • Max M. Louwerse
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4133)


This paper presents robust recognition of a subset of emotions by animated agents from salient spoken words. To develop and evaluate a model for each emotion in the chosen subset, both prosodic and acoustic features were used to extract the intonational patterns and correlates of emotion from speech samples. The computed features were projected using a combination of linear projection techniques to obtain a compact, clustered representation, which was then used to build emotion models with a set of classifiers organized in hierarchical fashion. The performance of these models was evaluated using a number of classifiers from the WEKA machine learning toolbox. Empirical analysis indicated that lexical information computed from both prosodic and acoustic features at the word level yielded robust classification of emotions.
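The pipeline described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the feature values are synthetic placeholders for word-level prosodic/acoustic measurements, PCA stands in for the paper's unnamed combination of linear projection techniques, and a nearest-centroid rule stands in for the hierarchical WEKA classifiers.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic word-level feature vectors: [mean F0 (Hz), F0 range (Hz),
# energy, duration (s)] for two hypothetical emotion classes.
# These distributions are illustrative, not from the paper's corpus.
happy = rng.normal(loc=[220.0, 80.0, 0.8, 0.30], scale=5.0, size=(50, 4))
sad = rng.normal(loc=[180.0, 30.0, 0.4, 0.45], scale=5.0, size=(50, 4))
X = np.vstack([happy, sad])
y = np.array([0] * 50 + [1] * 50)

# Linear projection (PCA via SVD) for a compact, clustered representation.
Xc = X - X.mean(axis=0)
_, _, Vt = np.linalg.svd(Xc, full_matrices=False)
Z = Xc @ Vt[:2].T  # keep the two leading components

# Classifier stand-in: nearest class centroid in the projected space.
centroids = np.array([Z[y == c].mean(axis=0) for c in (0, 1)])
pred = np.argmin(((Z[:, None, :] - centroids[None]) ** 2).sum(-1), axis=1)
accuracy = (pred == y).mean()
```

Because the two synthetic classes are well separated in pitch, the projected features cluster cleanly and even this trivial classifier separates them; the paper's contribution lies in showing that real word-level prosodic/acoustic features behave similarly under such projections.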


Keywords: emotion recognition, prosody, speech, machine learning





Copyright information

© Springer-Verlag Berlin Heidelberg 2006

Authors and Affiliations

  • Mohammed E. Hoque (1)
  • Mohammed Yeasin (1)
  • Max M. Louwerse (2)
  1. Department of Electrical and Computer Engineering / Institute for Intelligent Systems, The University of Memphis, Memphis, USA
  2. Department of Psychology / Institute for Intelligent Systems, The University of Memphis, Memphis, USA
