Multisensory integration of speech signals: the relationship between space and time

Research Note · Published in Experimental Brain Research

Abstract

Integrating audiovisual cues for simple events is affected when sources are separated in space and time. By contrast, audiovisual perception of speech appears resilient when either spatial or temporal disparities exist. We investigated whether speech perception is sensitive to the combination of spatial and temporal inconsistencies. Participants heard the bisyllable /aba/ while seeing a face produce the incongruent bisyllable /ava/. We tested the level of visual influence over auditory perception when the sound was asynchronous with respect to facial motion (from −360 to +360 ms) and emanated from five locations equidistant from the participant. Although an interaction between the spatial and temporal factors was observed, it was not related to participants’ perception of synchrony, nor did it indicate a linear relationship between the effects of spatial and temporal discrepancies. We conclude that either the complexity of the signal or the nature of the task reduces reliance on spatial and temporal contiguity for audiovisual speech perception.
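The design described above crosses two factors: spatial location of the sound source and audiovisual asynchrony. As a rough sketch of that condition grid, the snippet below enumerates the cells of such a factorial design. The specific azimuth values and the 90 ms SOA step are assumptions for illustration only; the abstract reports only the SOA range (−360 to +360 ms) and that there were five equidistant locations.

```python
from itertools import product

# Hypothetical azimuths (degrees) for the five loudspeaker locations;
# the abstract says only that they were equidistant from the participant.
locations = [-90, -45, 0, 45, 90]

# Audio offsets relative to facial motion. The -360..+360 ms range comes
# from the abstract; the 90 ms step is an illustrative assumption.
soas_ms = list(range(-360, 361, 90))

# Full factorial design: every location paired with every asynchrony.
conditions = list(product(locations, soas_ms))
print(len(conditions))  # 5 locations x 9 SOAs = 45 cells
```

A fully crossed grid like this is what permits testing whether spatial and temporal disparities interact, rather than measuring each factor in isolation.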



Acknowledgments

This work was supported by the Natural Sciences and Engineering Research Council of Canada (NSERC).

Author information


Corresponding author

Correspondence to Jeffery A. Jones.


About this article

Cite this article

Jones, J.A., Jarick, M. Multisensory integration of speech signals: the relationship between space and time. Exp Brain Res 174, 588–594 (2006). https://doi.org/10.1007/s00221-006-0634-0

