Abstract
Understanding the mechanisms underlying the human perception of synchrony for simple and complex audiovisual stimuli represents an important, but as yet unresolved, issue in the field of cognitive science. Many questions remain open regarding the processes involved in the temporal integration of auditory and visual stimuli that give rise to a synchronous audiovisual experience of everyday events. This chapter outlines what is currently known about the mechanisms of audiovisual temporal perception and reviews the results of a series of studies of temporal perception using complex audiovisual stimuli. To date, two characteristics of the audiovisual temporal window of integration have been shown to be relatively consistent across the majority of studies: (1) it has a width on the order of several hundred milliseconds and (2) it is asymmetrical, being larger when the visual stimulus leads than when it lags. We provide an overview of research demonstrating that the temporal window of audiovisual integration for complex stimuli is modulated by the type, complexity, and properties of the particular experimental stimuli used, the familiarity of the observer with the stimuli presented, the degree of unity of the auditory and visual stimulus streams (for the case of speech stimuli), and the orientation of the visual stimulus (again for the case of speech stimuli).
Acknowledgments
A.V. was supported by a Newton Abraham Studentship from the Medical Sciences Division, University of Oxford. Correspondence regarding this article should be addressed to Argiro Vatakis, Institute for Language and Speech Processing, Artemidos 6 & Epidavrou, Athens, 151 25, Greece. E-mail: argiro.vatakis@gmail.com.
Copyright information
© 2010 Springer Science + Business Media, LLC
About this chapter
Cite this chapter
Vatakis, A., Spence, C. (2010). Audiovisual Temporal Integration for Complex Speech, Object-Action, Animal Call, and Musical Stimuli. In: Kaiser, J., Naumer, M. (eds) Multisensory Object Perception in the Primate Brain. Springer, New York, NY. https://doi.org/10.1007/978-1-4419-5615-6_7
DOI: https://doi.org/10.1007/978-1-4419-5615-6_7
Publisher Name: Springer, New York, NY
Print ISBN: 978-1-4419-5614-9
Online ISBN: 978-1-4419-5615-6
eBook Packages: Biomedical and Life Sciences (R0)