Decoders can detect emotion in voice with much greater accuracy than can be achieved by objective acoustic analysis. Studies that have established this advantage, however, used methods that may have favored decoders and disadvantaged acoustic analysis. In this study, we applied several methodologic modifications for the analysis of the acoustic differentiation of fear, anger, sadness, and joy. Thirty-one female subjects between the ages of 18 and 35 (encoders) were audio-recorded during an emotion-induction procedure and produced a total of 620 emotion-laden sentences. Twelve female judges (decoders), three for each of the four emotions, were assigned to rate the intensity of one emotion each. Their combined ratings were used to select 38 prototype samples per emotion. Past acoustic findings were replicated, and increased acoustic differentiation among the emotions was achieved. Multiple regression analysis suggested that some, although not all, of the acoustic variables were associated with decoders' ratings. Signal detection analysis gave some insight into this disparity. However, the analysis of the classic constellation of acoustic variables may not completely capture the acoustic features that influence decoders' ratings. Future analyses would likely benefit from the parallel assessment of respiration, phonation, and articulation.
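The abstract does not say how the signal detection analysis was computed. As an illustration only, the minimal sketch below shows one common way to summarize a judge's ability to pick out a target emotion: the sensitivity index d', computed from hit and false-alarm rates. The function name, the log-linear correction, and the counts in the usage lines are hypothetical assumptions and are not taken from the study.

```python
from statistics import NormalDist


def d_prime(hits, misses, false_alarms, correct_rejections):
    """Sensitivity index d' = z(hit rate) - z(false-alarm rate).

    Assumption: a log-linear correction (add 0.5 to each cell) is used so
    that rates of exactly 0 or 1 do not produce infinite z-scores.
    """
    hit_rate = (hits + 0.5) / (hits + misses + 1.0)
    fa_rate = (false_alarms + 0.5) / (false_alarms + correct_rejections + 1.0)
    z = NormalDist().inv_cdf  # inverse of the standard normal CDF
    return z(hit_rate) - z(fa_rate)


# Hypothetical counts for one judge rating one target emotion:
# chance-level responding gives d' of zero, better discrimination gives d' > 0.
chance = d_prime(hits=10, misses=10, false_alarms=10, correct_rejections=10)
better = d_prime(hits=18, misses=2, false_alarms=2, correct_rejections=18)
print(chance, better)
```

A d' near zero means the target emotion was not distinguished from the others; larger positive values mean better discrimination, independent of any overall tendency to respond "yes".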
Sobin, C., Alpert, M. Emotion in Speech: The Acoustic Attributes of Fear, Anger, Sadness, and Joy. J Psycholinguist Res 28, 347–365 (1999). https://doi.org/10.1023/A:1023237014909
- Multiple Regression Analysis
- Signal Detection
- Female Subject
- Methodologic Modification
- Great Accuracy