Affective computing is currently one of the most active research topics and is attracting increasingly intensive attention. This strong interest is driven by a wide spectrum of promising applications in areas such as virtual reality, smart surveillance, and perceptual interfaces. Affective computing draws on a multidisciplinary background that includes psychology, cognitive science, physiology, and computer science. This paper focuses on several issues involved implicitly in the whole interactive feedback loop. Various methods for each issue are discussed in order to examine the state of the art. Finally, some research challenges and future directions are discussed.


Keywords: Facial Expression; Emotion Recognition; Facial Expression Recognition; Service Robot; Speech Synthesis





Copyright information

© Springer-Verlag Berlin Heidelberg 2005

Authors and Affiliations

  • Jianhua Tao (1)
  • Tieniu Tan (1)
  1. National Laboratory of Pattern Recognition (NLPR), Institute of Automation, Chinese Academy of Sciences, Beijing
