Cross-Modal Representation in Humans and Nonhuman Animals: A Comparative Perspective

Abstract

Auditory–visual representations provide redundant information about vocalizing individuals (i.e., who is vocalizing), and studies have reported this ability in various vertebrate species. I introduce behavioral evidence of such abilities in animals and characterize the experimental paradigms that have been used in this field of study. I then compare vocal-type representation in nonhuman primates with that in humans, and discuss the evolution of human-specific phoneme representation (representation of articulatory gestures) that might relate to the faculty of language.

Corresponding author

Correspondence to Akihiro Izumi.

Copyright information

© 2013 Springer Science+Business Media New York

About this chapter

Cite this chapter

Izumi, A. (2013). Cross-Modal Representation in Humans and Nonhuman Animals: A Comparative Perspective. In: Belin, P., Campanella, S., Ethofer, T. (eds) Integrating Face and Voice in Person Perception. Springer, New York, NY. https://doi.org/10.1007/978-1-4614-3585-3_2
