Journal on Multimodal User Interfaces, Volume 9, Issue 4, pp 323–331

It sounds real when you see it. Realistic sound source simulation in multimodal virtual environments

  • Ágoston Török
  • Daniel Mestre
  • Ferenc Honbolygó
  • Pierre Mallet
  • Jean-Marie Pergandi
  • Valéria Csépe
Original Paper


Designing multimodal virtual environments promises revolutionary advances in human–computer interaction in the near future. In this paper, we report the results of an experimental investigation of whether surround-sound systems can support visualization, building on current knowledge of how spatial perception and attention work in the human brain. We designed two auditory–visual cross-modal experiments in which noise bursts and light blobs were presented synchronously but with spatial offsets. Sounds were presented in two ways: as free-field sounds and through a stereo speaker set. Participants were asked to localize the direction of the sound sources. In the first experiment, visual stimuli were displaced vertically relative to the sounds; in the second, we used horizontal offsets. In both experiments, sounds were mislocalized in the direction of the visual stimuli in every condition (the ventriloquism effect), but the effect was stronger when the visual stimuli were displaced vertically than when they were displaced horizontally. Moreover, we found that the ventriloquism effect is strongest for centrally presented sounds. The analyses also revealed differences between the sound presentation modes. We interpret these results from the viewpoint of multimodal interface design. The findings highlight the importance of the cognitive features of multimodal perception in the design of virtual environment setups and may open the way to more realistic surround-based multimodal virtual reality simulations.
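The ventriloquism effect described above is commonly modeled as reliability-weighted averaging of the auditory and visual position estimates: the less precise auditory cue is pulled toward the more precise visual cue. The sketch below illustrates this standard model; it is not the authors' analysis, and the noise parameters (`sigma_aud`, `sigma_vis`) are assumed values for illustration only.

```python
def integrated_estimate(x_aud, x_vis, sigma_aud, sigma_vis):
    """Maximum-likelihood fusion of two Gaussian position cues (in degrees).

    Each cue is weighted by its inverse variance, so the more reliable
    (lower-sigma) cue dominates the combined percept.
    """
    w_vis = sigma_aud**2 / (sigma_aud**2 + sigma_vis**2)
    return w_vis * x_vis + (1 - w_vis) * x_aud

# Sound at 0 deg, light displaced 10 deg. Vision is assumed twice as
# precise as hearing, so the perceived sound location shifts toward
# the light, reproducing the ventriloquism bias qualitatively.
perceived = integrated_estimate(x_aud=0.0, x_vis=10.0,
                                sigma_aud=4.0, sigma_vis=2.0)
```

With these assumed sigmas the visual weight is 0.8, so the sound is perceived 8 degrees toward the light. The stronger vertical (vs. horizontal) bias reported in the abstract is consistent with auditory elevation cues being less reliable (larger `sigma_aud`) than azimuth cues.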


Keywords: Surround system · Ventriloquist illusion · Multisensory integration · Multilevel modeling · Spatial audio



The research leading to these results has received funding from the European Community’s Research Infrastructure Action—grant agreement VISIONAIR 262044—under the 7th Framework Programme (FP7/2007-2013). Ágoston Török was supported by a Young Researcher Fellowship from the Hungarian Academy of Sciences. Thanks to Dénes Tóth for his help during the statistical analyses. Thanks to Orsolya Kolozsvári for her help in the preparation of the manuscript.

Supplementary material

Supplementary material 1 (DOCX, 25 kB)



Copyright information

© OpenInterface Association 2015

Authors and Affiliations

  • Ágoston Török 1, 2, 3
  • Daniel Mestre 4
  • Ferenc Honbolygó 1, 3
  • Pierre Mallet 4
  • Jean-Marie Pergandi 4
  • Valéria Csépe 1, 3

  1. Brain Imaging Centre, Research Centre for Natural Sciences, Hungarian Academy of Sciences, Budapest, Hungary
  2. Doctoral School of Psychology, Eötvös Loránd University, Budapest, Hungary
  3. Department of Cognitive Psychology, Faculty of Pedagogy and Psychology, Eötvös Loránd University, Budapest, Hungary
  4. ISM UMR 7287, Aix-Marseille Université, CNRS, Marseille cedex 09, France
