
Decoding emotions from nonverbal vocalizations: How much voice signal is enough?

Motivation and Emotion

Abstract

How much acoustic signal is enough for accurate recognition of nonverbal emotional vocalizations? Using a gating paradigm (seven gates from 100 to 700 ms), the current study probed the effect of stimulus duration on recognition accuracy of emotional vocalizations expressing anger, disgust, fear, amusement, sadness, and neutral states. Participants (n = 52) judged the emotional meaning of vocalizations presented at each gate. Recognition accuracy increased from gate 2 to gate 3 for all types of vocalizations. Neutral vocalizations were identified from the shortest amount of acoustic information relative to all other types of vocalizations. A shorter acoustic signal was required to decode amusement than fear, anger, and sadness, whereas anger and fear required equivalent amounts of acoustic information to be accurately recognized. These findings confirm that the time course of successful recognition of discrete vocal emotions varies by emotion type. Compared with prior studies, they further indicate that the type of auditory signal (speech prosody vs. nonverbal vocalizations) determines how quickly listeners recognize emotions from a speaker's voice.

Notes

  1. Based on existing studies arguing for a clear distinction between different types of positive nonverbal vocalizations (e.g., achievement/triumph, amusement, contentment, sensual pleasure, and relief—Sauter and Scott 2007; amusement, interest, relief, awe, compassion, sensory pleasure, enthusiasm, and triumph—Simon-Thomas et al. 2009), the term 'happiness' used in the Montreal Affective Voices (MAV; Belin et al. 2008) was replaced with 'amusement' in the current study, as it more accurately matches the stimuli (laughter) included in this battery.

  2. The lower acoustic variability of the MAV sounds, compared to other stimulus batteries (e.g., Lima et al. 2013), is ideal for studying the effects of stimulus duration on vocal emotion recognition: when presented with stimuli of lower acoustic variation, listeners may base their emotional judgments more on duration than on other acoustic properties of the voice (one way to quantify such variability is sketched after these notes).

  3. Mean recognition accuracy of 68.2% against a chance level of 12.5% (Belin et al. 2008), 51.1% against a chance level of 12.5% (Koeda et al. 2013), and 62.8% against a chance level of 11.1% (Vasconcelos et al. 2017); a simple way to test such accuracies against chance is sketched after these notes.

  4. The maximum duration of the gate (700 ms) was chosen to allow the use of the current vocalizations in ERP studies probing the time course of vocal emotional processing. In studies using this methodology, differences in stimulus duration across conditions may affect sensory ERP components such as the N1 (Stapells 2002), and potentially confound the interpretation of later processing stages involved in the cognitive evaluation of the stimulus.

References

  • Banse, R., & Scherer, K. R. (1996). Acoustic profiles in vocal emotion expression. Journal of Personality and Social Psychology, 70(3), 614–636.

  • Barr, R. G., Chen, S., Hopkins, B., & Westra, T. (1996). Crying patterns in preterm infants. Developmental Medicine and Child Neurology, 38(4), 345–355.

  • Bates, D., Maechler, M., Bolker, B., & Walker, S. (2014). lme4: Linear mixed-effects models using Eigen and S4 (R package version 1.1-7) [Computer software].

  • Baumeister, R. F., Bratslavsky, E., Finkenauer, C., & Vohs, K. D. (2001). Bad is stronger than good. Review of General Psychology, 5(4), 323–370.

  • Belin, P. (2006). Voice processing in human and non-human primates. Philosophical Transactions of the Royal Society of London B, 361(1476), 2091–2107.

  • Belin, P., Fecteau, S., & Bédard, C. (2004). Thinking the voice: Neural correlates of voice perception. Trends in Cognitive Sciences, 8(3), 129–135.

  • Belin, P., Fillion-Bilodeau, S., & Gosselin, F. (2008). The Montreal Affective Voices: A validated set of nonverbal affect bursts for research on auditory affective processing. Behavior Research Methods, 40(2), 531–539.

  • Bergmann, G., Goldbeck, T., & Scherer, K. R. (1988). Emotionale Eindruckswirkung von prosodischen Sprechmerkmalen [The effects of prosody on emotion inference]. Zeitschrift für Experimentelle und Angewandte Psychologie, 35, 167–200.

  • Boersma, P., & Weenink, D. (2005). Praat: Doing phonetics by computer [Computer program]. Available at http://www.praat.org

  • Bostanov, V., & Kotchoubey, B. (2004). Recognition of affective prosody: Continuous wavelet measures of event-related brain potentials to emotional exclamations. Psychophysiology, 41(2), 259–268.

  • Caron, J. E. (2002). From ethology to aesthetics: Evolution as a theoretical paradigm for research on laughter, humor, and other comic phenomena. Humor, 15(3), 245–282.

  • Cedrus Corporation. (1991). SuperLab: General purpose psychology testing software [Computer software].

  • Collignon, O., Girard, S., Gosselin, F., Saint-Amour, D., Lepore, F., & Lassonde, M. (2010). Women process multisensory emotion expressions more efficiently than men. Neuropsychologia, 48(1), 220–225.

  • Cornew, L., Carver, L., & Love, T. (2010). There's more to emotion than meets the eye: A processing bias for neutral content in the domain of emotional prosody. Cognition and Emotion, 24(7), 1133–1152.

  • Cowie, R., & Cornelius, R. R. (2003). Describing the emotional states that are expressed in speech. Speech Communication, 40(1–2), 5–32.

  • Edmonson, M. S. (1983). Notes on laughter. Anthropological Linguistics, 29, 23–33.

  • Ekman, P. (1992). An argument for basic emotions. Cognition and Emotion, 6, 169–200.

  • Gervais, M., & Wilson, D. S. (2005). The evolution and functions of laughter and humor: A synthetic approach. The Quarterly Review of Biology, 80(4), 395–430.

  • Greatbatch, D., & Clark, T. (2003). Displaying group cohesiveness: Humour and laughter in the public lectures of management gurus. Human Relations, 56(12), 1515–1544.

  • Hawk, S. T., Van Kleef, G. A., Fischer, A. H., & Van Der Schalk, J. (2009). "Worth a thousand words": Absolute and relative decoding of nonlinguistic affect vocalizations. Emotion, 9(3), 293–305.

  • Hendriks, M. C. P., Croon, M. A., & Vingerhoets, A. J. J. M. (2008). Social reactions to adult crying: The help-soliciting function of tears. The Journal of Social Psychology, 148(1), 22–42.

  • Hoffman, L., & Rovine, M. J. (2007). Multilevel models for the experimental psychologist: Foundations and illustrative examples. Behavior Research Methods, 39(1), 101–117.

  • Ito, T. A., Larsen, J. T., Smith, N. K., & Cacioppo, J. T. (1998). Negative information weighs more heavily on the brain: The negativity bias in evaluative categorizations. Journal of Personality and Social Psychology, 75(4), 887–900.

  • Jaeger, T. F. (2008). Categorical data analysis: Away from ANOVAs (transformation or not) and towards logit mixed models. Journal of Memory and Language, 59(4), 434–446.

  • Juslin, P. N., & Laukka, P. (2001). Impact of intended emotion intensity on cue utilization and decoding accuracy in vocal expression of emotion. Emotion, 1(4), 381–412.

  • Juslin, P. N., & Laukka, P. (2003). Communication of emotions in vocal expression and music performance: Different channels, same code? Psychological Bulletin, 129(5), 770–814.

  • Kipper, S., & Todt, D. (2001). Variation of sound parameters affects the evaluation of human laughter. Behaviour, 138(9), 1161–1178.

  • Koeda, M., Belin, P., Hama, T., Masuda, T., Matsuura, M., & Okubo, Y. (2013). Cross-cultural differences in the processing of non-verbal affective vocalizations by Japanese and Canadian listeners. Frontiers in Psychology, 4, 105.

  • Kuznetsova, A., Brockhoff, P. B., & Christensen, R. H. B. (2016). lmerTest: Tests in linear mixed effects models (R package version 2.0-20) [Computer software]. Retrieved April 15, 2016.

  • Latinus, M., & Belin, P. (2011). Human voice perception. Current Biology, 21(4), R143–R145.

  • Laukka, P. (2005). Categorical perception of vocal emotion expressions. Emotion, 5(3), 277–295.

  • Laukka, P., Elfenbein, H. A., Söder, N., Nordström, H., Althoff, J., Chui, W., et al. (2013). Cross-cultural decoding of positive and negative non-linguistic emotion vocalizations. Frontiers in Psychology, 4, 353.

  • Lima, C. F., Anikin, A., Monteiro, A. C., Scott, S. K., & Castro, S. L. (2018). Automaticity in the recognition of nonverbal emotional vocalizations. Emotion, 19(2), 219–233.

  • Lima, C. F., Castro, S. L., & Scott, S. K. (2013). When voices get emotional: A corpus of nonverbal vocalizations for research on emotion processing. Behavior Research Methods, 45(4), 1234–1245.

  • Liu, T., Pinheiro, A. P., Deng, G., Nestor, P. G., McCarley, R. W., & Niznikiewicz, M. A. (2012). Electrophysiological insights into processing nonverbal emotional vocalizations. NeuroReport, 23(2), 108–112.

  • Maas, C. J., & Hox, J. J. (2005). Sufficient sample sizes for multilevel modeling. Methodology, 1(3), 86–92.

  • McNeish, D. M., & Stapleton, L. M. (2016). The effect of small sample size on two-level model estimates: A review and illustration. Educational Psychology Review, 28(2), 295–314.

  • Mehu, M., & Dunbar, R. I. (2008). Naturalistic observations of smiling and laughter in human group interactions. Behaviour, 145(12), 1747–1780.

  • Meneses, J. A. C., & Díaz, J. M. M. (2017). Vocal emotion expressions effects on cooperation behavior. Psicológica, 38, 1–24.

  • Murphy, S. T., & Zajonc, R. B. (1993). Affect, cognition, and awareness: Affective priming with optimal and suboptimal stimulus exposures. Journal of Personality and Social Psychology, 64(5), 723–739.

  • Naranjo, C., Kornreich, C., Campanella, S., Noël, X., Vandriette, Y., Gillain, B., et al. (2011). Major depression is associated with impaired processing of emotion in music as well as in facial and vocal stimuli. Journal of Affective Disorders, 128(3), 243–251.

  • Nesse, R. M. (1990). Evolutionary explanations of emotions. Human Nature, 1(3), 261–289.

  • Paquette, S., Peretz, I., & Belin, P. (2013). The "Musical Emotional Bursts": A validated set of musical affect bursts to investigate auditory affective processing. Frontiers in Psychology, 4, 509.

  • Paulmann, S., & Kotz, S. A. (2008). An ERP investigation on the temporal dynamics of emotional prosody and emotional semantics in pseudo- and lexical-sentence context. Brain and Language, 105(1), 59–69.

  • Paulmann, S., & Pell, M. D. (2010). Contextual influences of emotional speech prosody on face processing: How much is enough? Cognitive, Affective, & Behavioral Neuroscience, 10(2), 230–242.

  • Pell, M. D. (2002). Evaluation of nonverbal emotion in face and voice: Some preliminary findings on a new battery of tests. Brain and Cognition, 48(2–3), 499–514.

  • Pell, M. D., & Kotz, S. A. (2011). On the time course of vocal emotion recognition. PLoS ONE, 6(11), e27256.

  • Pell, M. D., Rothermich, K., Liu, P., Paulmann, S., Sethi, S., & Rigoulot, S. (2015). Preferential decoding of emotion from human non-linguistic vocalizations versus speech prosody. Biological Psychology, 111, 14–25.

  • Pinheiro, A. P., Barros, C., Dias, M., & Kotz, S. A. (2017a). Laughter catches attention! Biological Psychology, 130, 11–21.

  • Pinheiro, A. P., Barros, C., Vasconcelos, M., Obermeier, C., & Kotz, S. A. (2017b). Is laughter a better vocal change detector than a growl? Cortex, 92, 233–248.

  • Pinheiro, A. P., Del Re, E., Mezin, J., Nestor, P. G., Rauber, A., McCarley, R. W., et al. (2013). Sensory-based and higher-order operations contribute to abnormal emotional prosody processing in schizophrenia: An electrophysiological investigation. Psychological Medicine, 43(3), 603–618.

  • Pinheiro, A. P., Rezaii, N., Rauber, A., Liu, T., Nestor, P. G., McCarley, R. W., et al. (2014). Abnormalities in the processing of emotional prosody from single words in schizophrenia. Schizophrenia Research, 152(1), 235–241.

  • Rigoulot, S., Wassiliwizky, E., & Pell, M. D. (2013). Feeling backwards? How temporal order in speech affects the time course of vocal emotion recognition. Frontiers in Psychology, 4, 367.

  • Ruffman, T., Henry, J. D., Livingstone, V., & Phillips, L. H. (2008). A meta-analytic review of emotion recognition and aging: Implications for neuropsychological models of aging. Neuroscience and Biobehavioral Reviews, 32(4), 863–881.

  • Salasoo, A., & Pisoni, D. B. (1985). Interaction of knowledge sources in spoken word identification. Journal of Memory and Language, 24(2), 210–231.

  • Sauter, D. A., & Eimer, M. (2010). Rapid detection of emotion from human vocalizations. Journal of Cognitive Neuroscience, 22(3), 474–481.

  • Sauter, D. A., Eisner, F., Calder, A. J., & Scott, S. K. (2010a). Perceptual cues in nonverbal vocal expressions of emotion. The Quarterly Journal of Experimental Psychology, 63(11), 2251–2272.

  • Sauter, D. A., Eisner, F., Ekman, P., & Scott, S. K. (2010b). Cross-cultural recognition of basic emotions through nonverbal emotional vocalizations. Proceedings of the National Academy of Sciences, 107(6), 2408–2412.

  • Sauter, D. A., & Scott, S. K. (2007). More than one kind of happiness: Can we recognize vocal expressions of different positive states? Motivation and Emotion, 31(3), 192–199.

  • Scheiner, E., Hammerschmidt, K., Jürgens, U., & Zwirner, P. (2002). Acoustic analyses of developmental changes and emotional expression in the preverbal vocalizations of infants. Journal of Voice, 16(4), 509–529.

  • Scherer, K. R. (1989). Vocal correlates of emotional arousal and affective disturbance. In A. Manstead & H. Wagner (Eds.), Handbook of social psychophysiology: Emotion and social behavior (pp. 165–197). London: Wiley.

  • Scherer, K. R., & Ellgring, H. (2007). Multimodal expression of emotion: Affect programs or componential appraisal patterns? Emotion, 7, 158–171.

  • Schirmer, A., & Kotz, S. A. (2006). Beyond the right hemisphere: Brain mechanisms mediating vocal emotional processing. Trends in Cognitive Sciences, 10(1), 24–30.

  • Schirmer, A., Kotz, S. A., & Friederici, A. D. (2002). Sex differentiates the role of emotional prosody during word processing. Cognitive Brain Research, 14(2), 228–233.

  • Schirmer, A., Kotz, S. A., & Friederici, A. D. (2005a). On the role of attention for the processing of emotions in speech: Sex differences revisited. Cognitive Brain Research, 24(3), 442–452.

  • Schirmer, A., Simpson, E., & Escoffier, N. (2007). Listen up! Processing of intensity change differs for vocal and nonvocal sounds. Brain Research, 1176, 103–112.

  • Schirmer, A., Striano, T., & Friederici, A. D. (2005b). Sex differences in the preattentive processing of vocal emotional expressions. NeuroReport, 16(6), 635–639.

  • Schirmer, A., Zysset, S., Kotz, S. A., & Von Cramon, D. Y. (2004). Gender differences in the activation of inferior frontal cortex during emotional speech perception. NeuroImage, 21(3), 1114–1123.

  • Schlegel, K., Vicaria, I. M., Isaacowitz, D. M., & Hall, J. A. (2017). Effectiveness of a short audiovisual emotion recognition training program in adults. Motivation and Emotion, 41(5), 646–660.

  • Schröder, M. (2003). Experimental study of affect bursts. Speech Communication, 40(1), 99–116.

  • Simon-Thomas, E. R., Keltner, D. J., Sauter, D., Sinicropi-Yao, L., & Abramson, A. (2009). The voice conveys specific emotions: Evidence from vocal burst displays. Emotion, 9(6), 838–846.

  • Sobin, C., & Alpert, M. (1999). Emotion in speech: The acoustic attributes of fear, anger, sadness, and joy. Journal of Psycholinguistic Research, 23(4), 347–365.

  • Stapells, D. R. (2002). Cortical event-related potentials to auditory stimuli. In J. Katz (Ed.), Handbook of clinical audiology (5th ed., pp. 378–406). Philadelphia: Lippincott Williams & Wilkins.

  • Van Bezooijen, R. (1984). Characteristics and recognizability of vocal expressions of emotion (Vol. 5). Berlin: Walter de Gruyter.

  • Vasconcelos, M., Dias, M., Soares, A. P., & Pinheiro, A. P. (2017). What is the melody of that voice? Probing unbiased recognition accuracy of nonverbal vocalizations with the Montreal Affective Voices. Journal of Nonverbal Behavior, 41(3), 239–267.

  • Vettin, J., & Todt, D. (2004). Laughter in conversation: Features of occurrence and acoustic structure. Journal of Nonverbal Behavior, 28(2), 93–115.

  • Vingerhoets, A., Bylsma, L., & Rottenberg, J. (2009). Crying: A biopsychosocial phenomenon. In T. Fögen (Ed.), Tears in the Graeco-Roman world (pp. 439–475). Berlin: De Gruyter.

  • Zimmer, U., Höfler, M., Koschutnig, K., & Ischebeck, A. (2016). Neuronal interactions in areas of spatial attention reflect avoidance of disgust, but orienting to danger. NeuroImage, 134, 94–104.

Acknowledgements

The authors are grateful to all participants who took part in this study.

Funding

This work was supported by a doctoral grant (SFRH/BD/92772/2013) awarded to PC and by grants IF/00334/2012, PTDC/MHN-PCN/3606/2012, and PTDC/MHC-PCN/0101/2014 awarded to APP. These grants were funded by the Portuguese Foundation for Science and Technology (Fundação para a Ciência e a Tecnologia, FCT) and by FEDER (European Regional Development Fund) through the European programs QREN (National Strategic Reference Framework) and COMPETE (Operational Programme 'Thematic Factors of Competitiveness').

Author information

Corresponding author

Correspondence to Ana P. Pinheiro.

Ethics declarations

Conflict of interest

No potential conflict of interest was reported by the authors.


Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary material 1 (DOCX 42 kb)


About this article


Cite this article

Castiajo, P., & Pinheiro, A. P. (2019). Decoding emotions from nonverbal vocalizations: How much voice signal is enough? Motivation and Emotion, 43, 803–813. https://doi.org/10.1007/s11031-019-09783-9

