Motivation and Emotion, Volume 35, Issue 2, pp 192–201

Is there an advantage for recognizing multi-modal emotional stimuli?

Original Paper

Abstract

Emotions can be recognized whether conveyed by facial expressions, linguistic cues (semantics), or prosody (voice tone). However, few studies have empirically documented the extent to which multi-modal emotion perception differs from uni-modal emotion perception. Here, we tested whether emotion recognition is more accurate for multi-modal stimuli by presenting stimuli with different combinations of facial, semantic, and prosodic cues. Participants judged the emotion conveyed by short utterances in six channel conditions. Results indicated that emotion recognition is significantly better in response to multi-modal versus uni-modal stimuli. When stimuli contained only one emotional channel, recognition tended to be higher in the visual modality (i.e., facial expressions, semantic information conveyed by text) than in the auditory modality (prosody), although this pattern was not uniform across emotion categories. The advantage for multi-modal recognition may reflect the automatic integration of congruent emotional information across channels, which enhances the accessibility of emotion-related knowledge in memory.


Keywords: Emotional prosody · Emotional semantics · Emotional facial expressions



Copyright information

© Springer Science+Business Media, LLC 2011

Authors and Affiliations

  1. Department of Psychology, University of Essex, Essex, UK
  2. School of Communication Sciences and Disorders, McGill University, Montreal, Canada
