Experimental Brain Research, Volume 181, Issue 1, pp 173–181

Temporal recalibration during asynchronous audiovisual speech perception

  • Argiro Vatakis
  • Jordi Navarra
  • Salvador Soto-Faraco
  • Charles Spence
Research Article

Abstract

We investigated the consequences of monitoring an asynchronous audiovisual speech stream on the temporal perception of simultaneously presented vowel-consonant-vowel (VCV) audiovisual speech video clips. Participants made temporal order judgments (TOJs) regarding whether the speech-sound or the visual-speech gesture occurred first, for video clips presented at a range of stimulus onset asynchronies. Throughout the experiment, half of the participants also monitored a continuous stream of words presented audiovisually, superimposed over the VCV video clips. The continuous (adapting) speech stream was presented either in synchrony or with the auditory stream lagging by 300 ms. A significant shift in the point of subjective simultaneity (13 ms in the direction of the adapting stimulus) was observed in the TOJ task when participants monitored the asynchronous speech stream. This result suggests that the consequences of adapting to asynchronous speech extend beyond the case of simple audiovisual stimuli (as has recently been demonstrated by Navarra et al. in Cogn Brain Res 25:499–507, 2005) and can even affect the perception of more complex speech stimuli.

Keywords

Speech, Asynchrony, Temporal order judgment, Adaptation, Temporal recalibration, Audition, Vision

References

  1. Arnold DH, Johnston A, Nishida S (2005) Timing sight and sound. Vision Res 45:1275–1284
  2. Bergmann D, Spence C, Noesselt T (2006) Neural correlates of synchrony perception using audiovisual speech stimuli. Poster presented at the seventh annual meeting of the international multisensory research forum (IMRF), Dublin, Ireland
  3. Bernstein LE, Auer ET, Moore JK (2004) Audiovisual speech binding: convergence or association? In: Calvert GA, Spence C, Stein BE (eds) The handbook of multisensory processes. MIT, Cambridge, pp 203–223
  4. Bertelson P, de Gelder B (2004) The psychology of multimodal perception. In: Spence C, Driver J (eds) Crossmodal space and crossmodal attention. Oxford University Press, Oxford, pp 141–177
  5. Bushara KO, Grafman J, Hallett M (2001) Neural correlates of auditory-visual stimulus onset asynchrony detection. J Neurosci 21:300–304
  6. Calvert GA (2001) Crossmodal processing in the human brain: insights from functional neuroimaging studies. Cerebral Cortex 11:1110–1123
  7. Calvert GA, Bullmore ET, Brammer MJ, Campbell R, Williams SCR, McGuire PK, Woodruff PWR, Iversen SD, David AS (1997) Activation of auditory cortex during silent lipreading. Science 276:593–596
  8. Calvert GA, Spence C, Stein BE (eds) (2004) The handbook of multisensory processes. MIT, Cambridge
  9. Coren S, Ward LM, Enns JT (2004) Sensation and perception, 6th edn. Harcourt Brace, Fort Worth
  10. de Gelder B, Bertelson P (2003) Multisensory integration, perception and ecological validity. Trends Cogn Sci 7:460–467
  11. Dixon NF, Spitz L (1980) The detection of auditory visual desynchrony. Perception 9:719–721
  12. Engel GR, Dougherty WG (1971) Visual-auditory distance constancy. Nature 234:308
  13. Ernst MO, Banks MS (2002) Humans integrate visual and haptic information in a statistically optimal fashion. Nature 415:429–433
  14. Fendrich R, Corballis PM (2001) The temporal cross-capture of audition and vision. Percept Psychophys 63:719–725
  15. Finney DJ (1964) Probit analysis: statistical treatment of the sigmoid response curve. Cambridge University Press, London
  16. Fujisaki W, Shimojo S, Kashino M, Nishida S (2004) Recalibration of audiovisual simultaneity. Nat Neurosci 7:773–778
  17. Grant KW, van Wassenhove V, Poeppel D (2004) Detection of auditory (cross-spectral) and auditory-visual (cross-modal) synchrony. Speech Commun 44:43–53
  18. Hamilton RH, Shenton JT, Branch-Coslett H (2006) An acquired deficit of audiovisual speech processing. Brain Lang 98:66–73
  19. Hirsh IJ, Sherrick Jr CE (1961) Perceived order in different sense modalities. J Exp Psychol 62:423–432
  20. ITU-R BT 1359-1 (1998) Relative timing of sound and vision for broadcasting (Question ITU-R 35/11)
  21. Jack CE, Thurlow WR (1973) Effects of degree of visual association and angle of displacement on the “ventriloquism” effect. Percept Mot Skills 37:967–979
  22. Jackson CV (1953) Visual factors in auditory localization. Q J Exp Psychol 5:52–65
  23. King AJ (2005) Multisensory integration: strategies for synchronization. Curr Biol 15:R339–R341
  24. King AJ, Palmer AR (1985) Integration of visual and auditory information in bimodal neurones in the guinea-pig superior colliculus. Exp Brain Res 60:492–500
  25. Kopinska A, Harris LR (2004) Simultaneity constancy. Perception 33:1049–1060
  26. Lavie N (2005) Distracted and confused? Selective attention under load. Trends Cogn Sci 9:75–82
  27. Lavie N, Tsal Y (1994) Perceptual load as a major determinant of the locus of selection in visual attention. Percept Psychophys 56:183–197
  28. Lewald J, Guski R (2004) Auditory-visual temporal integration as a function of distance: no compensation of sound-transmission time in human perception. Neurosci Lett 357:119–122
  29. Macaluso E, George N, Dolan R, Spence C, Driver J (2004) Spatial and temporal factors during processing of audiovisual speech perception: a PET study. Neuroimage 21:725–732
  30. Massaro DW (2004) From multisensory integration to talking heads and language learning. In: Calvert GA, Spence C, Stein BE (eds) The handbook of multisensory processes. MIT, Cambridge, pp 153–176
  31. McDonald JJ, Teder-Sälejärvi WA, Di Russo F, Hillyard SA (2005) Neural basis of auditory-induced shifts in visual time-order perception. Nat Neurosci 8:1197–1202
  32. Miller LM, D’Esposito M (2005) Perceptual fusion and stimulus coincidence in the crossmodal integration of speech. J Neurosci 25:5884–5893
  33. Morein-Zamir S, Soto-Faraco S, Kingstone A (2003) Auditory capture of vision: examining temporal ventriloquism. Cogn Brain Res 17:154–163
  34. Moutoussis K, Zeki S (1997) A direct demonstration of perceptual asynchrony in vision. Proc R Soc Lond B 264:393–399
  35. Munhall K, Vatikiotis-Bateson E (2004) Spatial and temporal constraints on audiovisual speech perception. In: Calvert GA, Spence C, Stein BE (eds) The handbook of multisensory processes. MIT, Cambridge, pp 177–188
  36. Munhall KG, Gribble P, Sacco L, Ward M (1996) Temporal constraints on the McGurk effect. Percept Psychophys 58:351–362
  37. Navarra J, Vatakis A, Zampini M, Humphreys W, Soto-Faraco S, Spence C (2005) Exposure to asynchronous audiovisual speech extends the temporal window for audiovisual integration. Cogn Brain Res 25:499–507
  38. Nishida S, Johnston A (2002) Marker correspondence, not processing latency, determines temporal binding of visual attributes. Curr Biol 12:359–368
  39. Noesselt T, Fendrich R, Bonath B, Tyll S, Heinze HJ (2005) Closer in time when farther in space: Spatial factors in audiovisual temporal integration. Cogn Brain Res 25:443–458
  40. Rihs S (1995) The influence of audio on perceived picture quality and subjective audio-visual delay tolerance. In: Hamberg R, de Ridder H (eds) Proceedings of the MOSAIC workshop: advanced methods for the evaluation of television picture quality, 18 and 19 September 1995, Eindhoven, pp 133–137
  41. Scheier C, Nijhawan R, Shimojo S (1999) Sound alters visual temporal resolution. Invest Ophthalmol Vis Sci 40:4169
  42. Sekuler R, Sekuler AB, Lau R (1997) Sound alters visual motion perception. Nature 385:308
  43. Soto-Faraco S, Alsius A (2007) Access to the uni-sensory components in a cross-modal illusion. Neuroreport (in press)
  44. Spence C, Squire SB (2003) Multisensory integration: maintaining the perception of synchrony. Curr Biol 13:R519–R521
  45. Spence C, Shore DI, Klein RM (2001) Multisensory prior entry. J Exp Psychol Gen 130:799–832
  46. Stone JV, Hunkin NM, Porrill J, Wood R, Keeler V, Beanland M, Port M, Porter NR (2001) When is now? Perception of simultaneity. Proc R Soc Lond B Biol Sci 268:31–38
  47. Sugita Y, Suzuki Y (2003) Implicit estimation of sound-arrival time. Nature 421:911
  48. Tuomainen J, Andersen TS, Tiippana K, Sams M (2005) Audio-visual speech perception is special. Cognition 96:B13–B22
  49. Vatakis A, Spence C (2006a) Evaluating the influence of frame rate on the temporal aspects of audiovisual speech perception. Neurosci Lett 405:132–136
  50. Vatakis A, Spence C (2006b) Audiovisual synchrony perception for speech and music using a temporal order judgment task. Neurosci Lett 393:40–44
  51. Vatakis A, Spence C (2006c) Audiovisual synchrony perception for music, speech, and object actions. Brain Res 1111:134–142
  52. Vatakis A, Spence C (2007) Crossmodal binding: evaluating the ‘unity assumption’ using audiovisual speech stimuli. Percept Psychophys (in press)
  53. Vibell J, Klinge C, Zampini M, Spence C, Nobre AC (2007) Temporal order is coded temporally in the brain: early ERP latency shifts underlying prior entry in a crossmodal temporal order judgment task. J Cogn Neurosci 19:109–120
  54. Vroomen J, de Gelder B (2004) Temporal ventriloquism: sound modulates the flash-lag effect. J Exp Psychol Hum Percept Perform 30:513–518
  55. Vroomen J, Keetels M (2006) The spatial constraint in intersensory pairing: no role in temporal ventriloquism. J Exp Psychol Hum Percept Perform 32:1063–1071
  56. Vroomen J, Keetels M, de Gelder B, Bertelson P (2004) Recalibration of temporal order perception by exposure to audio-visual asynchrony. Cogn Brain Res 22:32–35
  57. Zampini M, Shore DI, Spence C (2003) Audiovisual temporal order judgments. Exp Brain Res 152:198–210
  58. Zampini M, Guest S, Shore DI, Spence C (2005) Audio-visual simultaneity judgments. Percept Psychophys 67:531–544

Copyright information

© Springer-Verlag 2007

Authors and Affiliations

  • Argiro Vatakis (1)
  • Jordi Navarra (1, 2)
  • Salvador Soto-Faraco (3)
  • Charles Spence (1)

  1. Crossmodal Research Laboratory, Department of Experimental Psychology, University of Oxford, Oxford, UK
  2. Grup de Recerca Neurociencia Cognitiva (GRNC), Parc Científic de Barcelona, Universitat de Barcelona, Barcelona, Spain
  3. ICREA and Parc Científic de Barcelona, Universitat de Barcelona, Barcelona, Spain
