Perceiving Visual Emotions with Speech

  • Zhigang Deng
  • Jeremy Bailenson
  • J. P. Lewis
  • Ulrich Neumann
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4133)


Embodied Conversational Agents (ECAs) with realistic faces are becoming an intrinsic part of many graphics systems employed in HCI applications. A fundamental issue is how people visually perceive the affect of a speaking agent. In this paper we present the first study evaluating the relation between objective and subjective visual perception of emotion as displayed on a speaking human face, using both full video and sparse point-rendered representations of the face. We found that objective machine learning analysis of facial marker motion data is correlated with evaluations made by experimental subjects, and in particular that the lower face region provides informative cues for visual emotion perception. We also found that affect is still conveyed by the abstract point-rendered representation.


Keywords: Video Clip; Lower Face; Quadratic Discrimination Analysis; Facial Animation; Conversational Agent
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.





Copyright information

© Springer-Verlag Berlin Heidelberg 2006

Authors and Affiliations

  • Zhigang Deng (1)
  • Jeremy Bailenson (2)
  • J. P. Lewis (3)
  • Ulrich Neumann (4)

  1. Department of Computer Science, University of Houston, Houston
  2. Department of Communication, Stanford University
  3. Computer Graphics Lab, Stanford University
  4. Department of Computer Science, University of Southern California, Los Angeles
