Psychophysical Models of Sound Localisation with Audiovisual Interactions

The Technology of Binaural Understanding

Part of the book series: Modern Acoustics and Signal Processing (MASP)

Abstract

Visual signals can have an important impact on the perceived location of sound sources. Neural mechanisms enable interactions between seeing and hearing to form a sense of space. The effect of vision on auditory localisation percepts is fundamental: a sound source is either perceived at the location of the visual source or perceptually shifted toward it. This bias is one form of visual capture. The extent of these interactions depends on temporal and spatial constraints beyond which visual and auditory cues do not necessarily interact. These constraints and interactions vary for the localisation of sources in the horizontal plane, in the vertical plane, and in distance. While traditional models of audiovisual interaction in space perception assume sensory integration, more recent models allow sensory cues either to interact or not. Models of visual dominance, modality appropriateness, and maximum likelihood estimation predict a single combined percept. Newer models of causal inference allow for varied perceptual outcomes depending on the relationship between the different sensory cues. Finally, visual spatial cues can induce changes in how sounds are localised after the audiovisual experience. This well-known effect, the ventriloquism aftereffect, is possibly the main mechanism of auditory space learning and calibration. The ventriloquism aftereffect has been described with a causal inference model and with an inverse model. The current chapter discusses all of the above concepts, establishing a connection between psychophysical data and the available models.
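
To make the contrast between the two classes of model concrete, the sketch below illustrates (a) maximum-likelihood-estimation (MLE) cue fusion, which always yields a single reliability-weighted percept, and (b) a model-averaging Bayesian causal-inference estimate of the perceived auditory location, in the spirit of Körding et al. (2007). This is a minimal illustration, not the chapter's implementation: the function names, the Gaussian spatial prior centred at 0°, and the parameter values (p_common, sigma_p) are assumptions chosen for the example.

```python
import numpy as np

def mle_fusion(x_a, sigma_a, x_v, sigma_v):
    """Reliability-weighted (maximum-likelihood) fusion of an auditory
    measurement x_a and a visual measurement x_v with noise SDs sigma_a, sigma_v."""
    w_v = sigma_a**2 / (sigma_a**2 + sigma_v**2)      # weight given to the visual cue
    x_hat = w_v * x_v + (1.0 - w_v) * x_a             # single fused location estimate
    sigma_hat = np.sqrt(sigma_a**2 * sigma_v**2 / (sigma_a**2 + sigma_v**2))
    return x_hat, sigma_hat                           # fused SD <= either unimodal SD

def causal_inference_auditory(x_a, sigma_a, x_v, sigma_v, p_common=0.5, sigma_p=20.0):
    """Model-averaging causal-inference estimate of the auditory location.
    Assumes Gaussian likelihoods and a Gaussian spatial prior (mean 0, SD sigma_p).
    p_common and sigma_p are illustrative values, not fitted parameters."""
    # Likelihood of the pair of measurements under one common cause (C = 1)
    var1 = sigma_a**2 * sigma_v**2 + sigma_a**2 * sigma_p**2 + sigma_v**2 * sigma_p**2
    like_c1 = np.exp(-0.5 * ((x_a - x_v)**2 * sigma_p**2
                             + x_a**2 * sigma_v**2
                             + x_v**2 * sigma_a**2) / var1) / (2.0 * np.pi * np.sqrt(var1))
    # Likelihood under two independent causes (C = 2)
    va, vv = sigma_a**2 + sigma_p**2, sigma_v**2 + sigma_p**2
    like_c2 = (np.exp(-0.5 * x_a**2 / va) / np.sqrt(2.0 * np.pi * va)
               * np.exp(-0.5 * x_v**2 / vv) / np.sqrt(2.0 * np.pi * vv))
    # Posterior probability that both measurements share one cause
    post_c1 = like_c1 * p_common / (like_c1 * p_common + like_c2 * (1.0 - p_common))
    # Optimal auditory estimates under each causal structure
    s_c1 = (x_a / sigma_a**2 + x_v / sigma_v**2) / (1/sigma_a**2 + 1/sigma_v**2 + 1/sigma_p**2)
    s_c2 = (x_a / sigma_a**2) / (1/sigma_a**2 + 1/sigma_p**2)
    # Model averaging: mix the two estimates by the posterior over causal structures
    return post_c1 * s_c1 + (1.0 - post_c1) * s_c2, post_c1
```

For example, with a visual measurement at 0° (sigma_v = 2°) and an auditory measurement at 10° (sigma_a = 8°), MLE fusion places the percept less than 1° from the visual source, whereas under causal inference the estimate reverts toward the unimodal auditory estimate as the audiovisual disparity, and hence the probability of independent causes, grows.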

Acknowledgements

This project has received funding from the European Union’s Horizon 2020 research and innovation programme under the Marie Skłodowska-Curie grant No. 659114. Two anonymous reviewers contributed very constructive remarks.

Author information

Corresponding author

Correspondence to Catarina Mendonça.

Copyright information

© 2020 Springer Nature Switzerland AG

About this chapter

Cite this chapter

Mendonça, C. (2020). Psychophysical Models of Sound Localisation with Audiovisual Interactions. In: Blauert, J., Braasch, J. (eds) The Technology of Binaural Understanding. Modern Acoustics and Signal Processing. Springer, Cham. https://doi.org/10.1007/978-3-030-00386-9_11

  • DOI: https://doi.org/10.1007/978-3-030-00386-9_11

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-00385-2

  • Online ISBN: 978-3-030-00386-9

  • eBook Packages: Engineering, Engineering (R0)
