Abstract
Integrating audiovisual cues for simple events is disrupted when the sources are separated in space and time. By contrast, audiovisual speech perception appears resilient when either spatial or temporal disparities exist alone. We investigated whether speech perception is sensitive to the combination of spatial and temporal inconsistencies. Participants heard the bisyllable /aba/ while watching a face produce the incongruent bisyllable /ava/. We measured the degree of visual influence over auditory perception when the sound was asynchronous with respect to the facial motion (from −360 to +360 ms) and emanated from one of five locations equidistant from the participant. Although an interaction between spatial and temporal disparity was observed, it was not related to participants' perception of synchrony, nor did it indicate a linear relationship between the effects of spatial and temporal discrepancies. We conclude that either the complexity of the signal or the nature of the task reduces reliance on spatial and temporal contiguity in audiovisual speech perception.


Acknowledgments
This work was supported by the Natural Sciences and Engineering Research Council of Canada (NSERC).
Jones, J.A., Jarick, M. Multisensory integration of speech signals: the relationship between space and time. Exp Brain Res 174, 588–594 (2006). https://doi.org/10.1007/s00221-006-0634-0

