User Modeling and User-Adapted Interaction, Volume 11, Issue 4, pp 297–326

Modeling Emotion and Attitude in Speech by Means of Perceptually Based Parameter Values

  • Sylvie J. L. Mozziconacci


This study focuses on the perception of emotion and attitude in speech. It investigated the ability to identify vocal expressions of emotion and/or attitude in speech material. Systematic perception experiments were carried out to determine optimal values for three acoustic parameters: pitch level, pitch range, and speech rate. Speech was manipulated by varying these parameters around the values found in a selected subset of the speech material, which consisted of two sentences spoken by a male speaker expressing seven emotions or attitudes: neutrality, joy, boredom, anger, sadness, fear, and indignation. Listening tests were carried out with this speech material, and optimal values for pitch level, pitch range, and speech rate were derived for generating speech that expresses emotion or attitude from a neutral utterance. These values were perceptually tested in re-synthesized speech and in synthetic speech generated from LPC-coded diphones.

attitude, emotion, experimental phonetics, expression, perception, prosody, speech, speech technology





Copyright information

© Kluwer Academic Publishers 2001

Authors and Affiliations

  • Sylvie J. L. Mozziconacci, IPO, Center for User-System Interaction, Eindhoven, The Netherlands
