Abstract
In noisy contexts, sound discrimination improves when the auditory sources are separated in space. This phenomenon, known as Spatial Release from Masking (SRM), arises from the interaction between the auditory information reaching the ears and spatial attention resources. To examine the relative contribution of these two factors, we exploited an audio-visual illusion (visual capture of sound) in a hearing-in-noise task to create conditions in which the initial stimulation to the ears is held constant, while the perceived separation between speech and masker is changed illusorily. In two experiments, we asked participants to identify a string of five digits pronounced by a female voice, embedded in either energetic (Experiment 1) or informational (Experiment 2) noise, before reporting the perceived location of the heard digits. Critically, the distance between target digits and masking noise was manipulated both physically (from 22.5 to 75.0 degrees) and illusorily, by pairing target sounds with visual stimuli either at the same position (audio-visual congruent) or at a different position (15 degrees offset, leftward or rightward; audio-visual incongruent). The proportion of correctly reported digits increased with the physical separation between the target and masker, as expected from SRM. However, despite effective visual capture of sounds, performance was not modulated by illusory changes of target sound position. Our results are compatible with a limited role of central factors in the SRM phenomenon, at least in our experimental setting. Moreover, they add to the controversial literature on the limited effects of audio-visual capture on auditory stream separation.
Availability of data and materials
Valzolgher, C. (2021, March 23). Can visual capture of sound separate auditory streams? Retrieved from osf.io/vj7rg.
Acknowledgements
The authors are grateful to Dr. Massimo Vescovi for technical support and to two anonymous reviewers for comments on a previous version of the manuscript.
Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: C. V. was supported by a grant of the Università Italo-Francese/Université Franco-Italienne, the Zegna Founder's Scholarship and Associazione Amici di Claudio Demattè. F. P. was supported by a grant of the Agence Nationale de la Recherche (ANR-16-CE17-0016, VIRTUALHEARING3D, France), by a prize of the Foundation Medisite (France), by the Neurodis Foundation (France) and by a grant from the Italian Ministry for Research and University (MUR, PRIN 20177894ZH).
Ethics declarations
Conflict of interest
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Additional information
Communicated by Francesca Frassinetti.
Cite this article
Valzolgher, C., Giovanelli, E., Sorio, R. et al. Can visual capture of sound separate auditory streams?. Exp Brain Res 240, 813–824 (2022). https://doi.org/10.1007/s00221-021-06281-8
DOI: https://doi.org/10.1007/s00221-021-06281-8