Audio-Visual Perception of Everyday Natural Objects – Hemodynamic Studies in Humans
Our ability to perceive and recognize objects, people, and meaningful action events is a cognitive function of prime importance, characterized by an interplay of visual, auditory, and sensory-motor processing. One goal of sensory neuroscience is to better understand multisensory perception, including how information from the auditory and visual systems may merge to create stable, unified representations of objects and actions in our environment. This chapter summarizes and compares results from 49 paradigms published over the past decade that have explicitly examined human brain regions associated with audio-visual interactions. A series of meta-analyses compare and contrast distinct cortical networks preferentially activated under five major types of audio-visual interaction: (1) matching spatial and/or temporal features of nonnatural objects, (2–3) matching crossmodal features characteristic of natural objects (moving versus static images), (4) associating artificial audio-visual pairings (e.g., written/spoken language), and (5) networks activated when auditory and visual stimuli are incongruent. These meta-analysis results are discussed in the context of cognitive theories of how object knowledge representations may mesh with the multiple parallel pathways that appear to mediate audio-visual perception.
Keywords: Superior Parietal Lobule · Primary Auditory Cortex · Inferior Frontal Cortex · Action Sound · Early Visual Area
Thanks to Mr. Chris Frum for assistance with preparation of figures. Thanks also to Dr. David Van Essen, Donna Hanlon, and John Harwell for continual development of cortical data analysis and presentation with CARET software, and William J. Talkington, Mary Pettit, and two anonymous reviewers for helpful comments on earlier versions of the text. This work was supported by the NCRR/NIH COBRE grant P20 RR15574 (to the Sensory Neuroscience Research Center of West Virginia University) and subproject to JWL.