Abstract
Visual signals can strongly affect the perceived location of sound sources. Neurological mechanisms enable interactions between seeing and hearing to form a sense of space, and the effect of vision on auditory localisation percepts is therefore of fundamental importance. A sound source is either perceived at the location of the visual source or perceptually shifted in its direction. This bias is one form of visual capture. The extent of the interactions depends on temporal and spatial constraints, beyond which visual and auditory cues do not necessarily interact. These constraints and interactions differ for the localisation of sources along the horizontal and vertical planes, and they also vary with distance. While the traditional models of audiovisual interaction in space perception assume sensory integration, more recent models allow sensory cues either to interact or not. Models of visual dominance, modality appropriateness, and maximum likelihood estimation predict a single combined percept. The newer models of causal inference allow for varied perceptual outcomes, depending on the relationship between the different sensory cues. Finally, visual spatial cues can change how sounds are localised after the audiovisual experience. This well-known effect, the ventriloquism aftereffect, is possibly the main mechanism of auditory space learning and calibration. The ventriloquism aftereffect has been described with a causal inference model and with an inverse model. The current chapter discusses all of the above concepts, establishing a connection between psychophysical data and available models.
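The contrast drawn above between maximum likelihood estimation, which always yields one combined percept, and causal inference, which lets the cues interact or not, can be illustrated with a minimal sketch. It assumes Gaussian sensory noise, a Gaussian spatial prior, and a model-averaging read-out, which is one common formulation in the causal inference literature and not necessarily the one used in the chapter. All function names, parameter names, and example values (`sigma_v`, `sigma_a`, `sigma_p`, `prior_common`) are hypothetical and chosen for illustration only.

```python
import math


def normal_pdf(x, mean, var):
    """Density of a normal distribution with the given mean and variance."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2 * math.pi * var)


def mle_fusion(x_v, x_a, sigma_v, sigma_a):
    """Reliability-weighted (maximum likelihood) fusion of two cues.

    Always returns a single combined location estimate and its standard deviation.
    """
    vv, va = sigma_v ** 2, sigma_a ** 2
    w_v = va / (vv + va)  # the more reliable cue receives the larger weight
    estimate = w_v * x_v + (1 - w_v) * x_a
    sigma = math.sqrt(vv * va / (vv + va))
    return estimate, sigma


def p_common(x_v, x_a, sigma_v, sigma_a, sigma_p, mu_p=0.0, prior_common=0.5):
    """Posterior probability that both cues come from one source (assumed Gaussian model)."""
    vv, va, vp = sigma_v ** 2, sigma_a ** 2, sigma_p ** 2
    denom = vv * va + vv * vp + va * vp
    # Likelihood of both measurements under a single shared source
    quad = (x_v - x_a) ** 2 * vp + (x_v - mu_p) ** 2 * va + (x_a - mu_p) ** 2 * vv
    like_one = math.exp(-0.5 * quad / denom) / (2 * math.pi * math.sqrt(denom))
    # Likelihood under two independent sources drawn from the same spatial prior
    like_two = normal_pdf(x_v, mu_p, vv + vp) * normal_pdf(x_a, mu_p, va + vp)
    return like_one * prior_common / (
        like_one * prior_common + like_two * (1 - prior_common)
    )


def auditory_estimate(x_v, x_a, sigma_v, sigma_a, sigma_p, mu_p=0.0, prior_common=0.5):
    """Model-averaged auditory location: fused and auditory-only estimates,
    weighted by the posterior probability of a common cause."""
    vv, va, vp = sigma_v ** 2, sigma_a ** 2, sigma_p ** 2
    fused = (x_v / vv + x_a / va + mu_p / vp) / (1 / vv + 1 / va + 1 / vp)
    alone = (x_a / va + mu_p / vp) / (1 / va + 1 / vp)
    pc = p_common(x_v, x_a, sigma_v, sigma_a, sigma_p, mu_p, prior_common)
    return pc * fused + (1 - pc) * alone


if __name__ == "__main__":
    # Hypothetical values in degrees of azimuth: precise vision, coarser hearing
    for x_a in (2.0, 30.0):
        print(x_a, mle_fusion(0.0, x_a, 1.0, 3.0)[0],
              auditory_estimate(0.0, x_a, 1.0, 3.0, 20.0))
```

Under these assumptions, a small audiovisual discrepancy gives a high probability of a common cause and the auditory estimate is pulled toward the visual location, whereas a large discrepancy leaves the auditory estimate close to the auditory cue. The maximum likelihood rule, by contrast, shifts the percept toward vision regardless of the discrepancy.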
Acknowledgements
This project has received funding from the European Union’s Horizon 2020 research and innovation programme under the Marie Skłodowska-Curie grant No. 659114. Two anonymous reviewers contributed very constructive remarks.
Copyright information
© 2020 Springer Nature Switzerland AG
About this chapter
Cite this chapter
Mendonça, C. (2020). Psychophysical Models of Sound Localisation with Audiovisual Interactions. In: Blauert, J., Braasch, J. (eds) The Technology of Binaural Understanding. Modern Acoustics and Signal Processing. Springer, Cham. https://doi.org/10.1007/978-3-030-00386-9_11
DOI: https://doi.org/10.1007/978-3-030-00386-9_11
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-00385-2
Online ISBN: 978-3-030-00386-9
eBook Packages: Engineering, Engineering (R0)