Monaural Source Separation Using Spectral Cues

Pearlmutter, Barak A.; Zador, Anthony M.

doi:10.1007/978-3-540-30110-3_61

Barak A. Pearlmutter² &
Anthony M. Zador³

Part of the book series: Lecture Notes in Computer Science ((LNCS,volume 3195))

Included in the following conference series:

International Conference on Independent Component Analysis and Signal Separation

1372 Accesses
7 Citations

Abstract

The acoustic environment poses at least two important challenges. First, animals must localise sound sources using a variety of binaural and monaural cues; and second they must separate sources into distinct auditory streams (the “cocktail party problem”). Binaural cues include intra-aural intensity and phase disparity. The primary monaural cue is the spectral filtering introduced by the head and pinnae via the head-related transfer function (HRTF), which imposes different linear filters upon sources arising at different spatial locations.

Here we address the second challenge, source separation. We propose an algorithm for exploiting the monaural HRTF to separate spatially localised acoustic sources in a noisy environment. We assume that each source has a unique position in space, and is therefore subject to preprocessing by a different linear filter. We also assume prior knowledge of weak statistical regularities present in the sources. This framework can incorporate various aspects of acoustic transfer functions (echos, delays, multiple sensors, frequency-dependent attenuation) in a uniform fashion, treating them as cues for, rather than obstacles to, separation. To accomplish this, sources are represented sparsely in an overcomplete basis. This framework can be extended to make predictions about the neural representations required to separate acoustic sources.

This is a preview of subscription content, log in via an institution to check access.

Access this chapter

Log in via an institution

Chapter: USD 29.95; Price excludes VAT (USA)

eBook: USD 74.99; Price excludes VAT (USA)

Tax calculation will be finalised at checkout

Purchases are for personal use only

Institutional subscriptions

Preview

Unable to display preview. Download preview PDF.

References

Bregman, A.S.: Auditory Scene Analysis: The Perceptual Organization of Sound. MIT Press, Cambridge (1990) ISBN 0-262-02297-4
Book Google Scholar
Yost Jr., W.A., Dye, R.H., Sheft, S.: A simulated “cocktail party” with up to three sound sources. Percept Psychophys 58(7), 1026–1036 (1996)
Article Google Scholar
Chen, S.S., Donoho, D.L., Saunders, M.A.: Atomic decomposition by basis pursuit. SIAM Journal on Scientific Computing 20(1), 33–61 (1999)
Article MathSciNet Google Scholar
Lee, T.-W., Lewicki, M.S., Girolami, M., Sejnowski, T.J.: Blind source separation of more sources than mixtures using overcomplete representations. IEEE Signal Processing Letters 4(5), 87–90 (1999)
Google Scholar
Lewicki, M., Olshausen, B.A.: Inferring sparse, overcomplete image codes using an efficient coding framework. In: Advances in Neural Information Processing Systems 10, pp. 815–821. MIT Press, Cambridge (1998)
Google Scholar
Lewicki, M.S., Sejnowski, T.J.: Learning overcomplete representations. Neural Computation 12(2), 337–365 (2000)
Article Google Scholar
Zibulevsky, M., Pearlmutter, B.A.: Blind source separation by sparse decomposition in a signal dictionary. Neural Computation 13(4), 863–882 (2001)
Article Google Scholar
Bofill, P., Zibulevsky, M.: Underdetermined blind source separation using sparse representations. Signal Processing 81(11), 2353–2362 (2001)
Article Google Scholar
Rickard, S.T., Dietrich, F.: DOA estimation of manyW-disjoint orthogonal sources from two mixtures using DUET. In: Proceedings of the 10th IEEE Workshop on Statistical Signal and Array Processing (SSAP 2000), Pocono Manor, PA, August 2000, pp. 311–314 (2000)
Google Scholar
Cauwenberghs, G.: Monaural separation of independent acoustical components. In: Proc. IEEE Int. Symp. Circuits and Systems (ISCAS 1999), Orlando FL, vol. 5, pp. 62–65 (1999)
Google Scholar
Hochreiter, S., Mozer, M.C.: Monaural separation and classification of mixed signals: A support-vector regression perspective. In: Lee, T.-W., Jung, T.-P., Makeig, S., Sejnowski, T.J. (eds.) 3rd International Conference on Independent Component Analysis and Blind Signal Separation, San Diego, CA, December 9-12 (2001)
Google Scholar
Jang, G.-J., Lee, T.-W.: A maximum likelihood approach to single-channel source separation. Journal of Machine Learning Research 4, 1365–1392 (2003)
MathSciNet MATH Google Scholar
Roweis, S.T.: One microphone source separation. In: Advances in Neural Information Processing Systems 13, pp. 793–799. MIT Press, Cambridge (2001)
Google Scholar
Poggio, T., Torre, V., Koch, C.: Computational vision and regularization theory. Nature 317(6035), 314–319 (1985)
Article Google Scholar
Donoho, D.L., Elad, M.: Maximal sparsity representation via l1 minimization. Proceedings of the National Academy of Sciences 100, 2197–2202 (2003)
Article Google Scholar
Fletcher, R.: Semidefinite matrix constraints in optimization. SIAM J. Control and Opt. 23, 493–513 (1985)
Article MathSciNet Google Scholar
Hofman, P.M., Van Opstal, A.J.: Bayesian reconstruction of sound localization cues from responses to random spectra. Biol. Cybern. 86(4), 305–316 (2002)
Article Google Scholar
Knudsen, E.I., Konishi, M.: Mechanisms of sound localization in the barn owl. Journal of Comparative Physiology 133, 13–21 (1979)
Article Google Scholar
Wenzel, E.M., Arruda, M., Kistler, D.J., Wightman, F.L.: Localization using nonindividualized head-related transfer functions. J. Acoust. Soc. Am. 94(1), 111–123 (1993)
Article Google Scholar
Wightman, F.L., Kistler, D.J.: Headphone simulation of free-field listening. II: Psychophysical validation. J. Acoust. Soc. Am. 85(2), 868–878 (1989)
Article Google Scholar
Kulkarni, A., Colburn, H.S.: Role of spectral detail in sound-source localization. Nature 396(6713), 747–749 (1998)
Article Google Scholar
King, A.J., Parsons, C.H., Moore, D.R.: Plasticity in the neural coding of auditory space in the mammalian brain. Proc. Natl. Acad. Sci. USA 97(22), 11821–11828 (2000)
Article Google Scholar
Linkenhoker, B.A., Knudsen, E.I.: Incremental training increases the plasticity of the auditory space map in adult barn owls. Nature 419(6904), 293–296 (2002)
Article Google Scholar
Hofman, P.M., Van Riswick, J.G., Van Opstal, A.J.: Relearning sound localization with new ears. Nat. Neurosci. 1(5), 417–421 (1998)
Article Google Scholar
Shinn-Cunningham, B.G.: Models of plasticity in spatial auditory processing. Audiology and Neuro-Otology 6(4), 187–191 (2001)
Article Google Scholar
Bell, A.J., Sejnowski, T.J.: The ‘independent components’ of natural scenes are edge filters. Vision Research 37(23), 3327–3338 (1997)
Article Google Scholar
Olshausen, B.A., Field, D.J.: Sparse coding with an overcomplete basis set: A strategy employed by V1? Vision Research 37(23), 3311–3325 (1997)
Article Google Scholar
Riesenhuber, M., Poggio, T.: Models of object recognition. Nature Neuroscience 3 Suppl., 1199–1204 (2000)
Article Google Scholar
Olshausen, B., Field, D.J.: Emergence of simple-cell receptive field properties by learning a sparse code for natural images. Nature 381, 607–609 (1996)
Article Google Scholar
Olshausen, B.A., O’Connor, K.N.: A new window on sound. Nature Neuroscience 5, 292–293 (2002)
Article Google Scholar

Download references

Author information

Authors and Affiliations

Hamilton Institute, National University of Ireland, Maynooth, Co. Kildare, Ireland
Barak A. Pearlmutter
Cold Spring Harbor Laboratory, One Bungtown Rd, Cold Spring Harbor, NY, 11724, USA
Anthony M. Zador

Authors

Barak A. Pearlmutter
View author publications
You can also search for this author in PubMed Google Scholar
Anthony M. Zador
View author publications
You can also search for this author in PubMed Google Scholar

Editor information

Editors and Affiliations

Dept. of Architecture and Computer Technology, University of Granada, Spain
Carlos G. Puntonet & Alberto Prieto &

Rights and permissions

Reprints and permissions

Copyright information

About this paper

Cite this paper

Pearlmutter, B.A., Zador, A.M. (2004). Monaural Source Separation Using Spectral Cues. In: Puntonet, C.G., Prieto, A. (eds) Independent Component Analysis and Blind Signal Separation. ICA 2004. Lecture Notes in Computer Science, vol 3195. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-30110-3_61

Download citation

DOI: https://doi.org/10.1007/978-3-540-30110-3_61
Published: 27 October 2004
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-23056-4
Online ISBN: 978-3-540-30110-3
eBook Packages: Springer Book Archive

Publish with us

Policies and ethics