Creating Emotion Recognition Agents for Speech Signal

  • Valery A. Petrushin
Part of the Multiagent Systems, Artificial Societies, and Simulated Organizations book series (MASA, volume 3)


This chapter presents agents for emotion recognition in speech and their application to a real-world problem. The agents can recognize five emotional states (unemotional, happiness, anger, sadness, and fear) with good accuracy, and can be adapted to a particular environment depending on the parameters of the speech signal and the number of target emotions. A practical application has been developed using an agent that analyzes telephone-quality speech signals and distinguishes between two emotional states, “agitation” and “calm”. This agent has been used as part of a decision support system for prioritizing voice messages and assigning an appropriate human agent to respond to each message at a call center.


Keywords: Speech Signal, Emotion Recognition, Call Center, Emotional Content, Emotion Category





Copyright information

© Kluwer Academic Publishers 2002
