Emotion in Speech: The Acoustic Attributes of Fear, Anger, Sadness, and Joy

Journal of Psycholinguistic Research

Abstract

Decoders can detect emotion in voice with much greater accuracy than can be achieved by objective acoustic analysis. Studies that have established this advantage, however, used methods that may have favored decoders and disadvantaged acoustic analysis. In this study, we applied several methodologic modifications for the analysis of the acoustic differentiation of fear, anger, sadness, and joy. Thirty-one female subjects between the ages of 18 and 35 (encoders) were audio-recorded during an emotion-induction procedure and produced a total of 620 emotion-laden sentences. Twelve female judges (decoders), three for each of the four emotions, were assigned to rate the intensity of one emotion each. Their combined ratings were used to select 38 prototype samples per emotion. Past acoustic findings were replicated, and increased acoustic differentiation among the emotions was achieved. Multiple regression analysis suggested that some, although not all, of the acoustic variables were associated with decoders' ratings. Signal detection analysis gave some insight into this disparity. However, the analysis of the classic constellation of acoustic variables may not completely capture the acoustic features that influence decoders' ratings. Future analyses would likely benefit from the parallel assessment of respiration, phonation, and articulation.



About this article

Cite this article

Sobin, C., Alpert, M. Emotion in Speech: The Acoustic Attributes of Fear, Anger, Sadness, and Joy. J Psycholinguist Res 28, 347–365 (1999). https://doi.org/10.1023/A:1023237014909

  • DOI: https://doi.org/10.1023/A:1023237014909
