Abstract
Designing multimodal virtual environments promises revolutionary advances in human–computer interaction in the near future. In this paper, we report the results of an experimental investigation into the use of surround-sound systems to support visualization, drawing on growing knowledge of how spatial perception and attention work in the human brain. We designed two auditory–visual cross-modal experiments in which noise bursts and light blobs were presented synchronously but with spatial offsets. Sounds were presented in two ways: as free-field sounds and through a stereo speaker set. Participants were asked to localize the direction of the sound sources. In the first experiment, the visual stimuli were displaced vertically relative to the sounds; in the second, we used horizontal offsets. In both experiments, sounds were mislocalized in the direction of the visual stimuli in every condition (the ventriloquism effect), but the effect was stronger when the visual stimuli were displaced vertically than when they were displaced horizontally. Moreover, we found that the ventriloquism effect was strongest for centrally presented sounds, and the analyses revealed differences between the sound presentation modes. We interpret these results from the viewpoint of multimodal interface design. The findings highlight the importance of the cognitive features of multimodal perception in the design of virtual environment setups and may open the way to more realistic surround-based multimodal virtual reality simulations.
Acknowledgments
The research leading to these results has received funding from the European Community’s Research Infrastructure Action—grant agreement VISIONAIR 262044—under the 7th Framework Programme (FP7/2007-2013). Ágoston Török was supported by a Young Researcher Fellowship from the Hungarian Academy of Sciences. Thanks to Dénes Tóth for his help during the statistical analyses. Thanks to Orsolya Kolozsvári for her help in the preparation of the manuscript.
Cite this article
Török, Á., Mestre, D., Honbolygó, F. et al. It sounds real when you see it. Realistic sound source simulation in multimodal virtual environments. J Multimodal User Interfaces 9, 323–331 (2015). https://doi.org/10.1007/s12193-015-0185-4