Monaural Source Separation Using Spectral Cues

  • Conference paper
Independent Component Analysis and Blind Signal Separation (ICA 2004)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 3195)

Abstract

The acoustic environment poses at least two important challenges. First, animals must localise sound sources using a variety of binaural and monaural cues; and second, they must separate sources into distinct auditory streams (the “cocktail party problem”). Binaural cues include interaural intensity and phase disparity. The primary monaural cue is the spectral filtering introduced by the head and pinnae via the head-related transfer function (HRTF), which imposes different linear filters upon sources arising at different spatial locations.

Here we address the second challenge, source separation. We propose an algorithm for exploiting the monaural HRTF to separate spatially localised acoustic sources in a noisy environment. We assume that each source has a unique position in space, and is therefore subject to preprocessing by a different linear filter. We also assume prior knowledge of weak statistical regularities present in the sources. This framework can incorporate various aspects of acoustic transfer functions (echoes, delays, multiple sensors, frequency-dependent attenuation) in a uniform fashion, treating them as cues for, rather than obstacles to, separation. To accomplish this, sources are represented sparsely in an overcomplete basis. The framework can also be extended to make predictions about the neural representations required to separate acoustic sources.
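
As a rough illustration of the idea described above, the following Python sketch treats two toy FIR filters as stand-ins for position-dependent HRTFs and recovers two sparse sources from a single mixture by l1-penalised least squares, solved with iterative soft thresholding (ISTA). The filters, the spike-train sources, the identity basis, the choice of solver, and all parameter values are assumptions made for this example; the abstract does not specify the paper's own basis or optimisation method.

```python
# Illustrative sketch only. Toy two-tap filters stand in for HRTFs, spikes in
# the identity basis stand in for sparse sources, and ISTA stands in for
# whatever sparse solver the paper actually uses.

def circ_conv(h, s):
    """Circularly convolve a short filter h with a signal s."""
    n = len(s)
    return [sum(h[k] * s[(i - k) % n] for k in range(len(h))) for i in range(n)]


def circ_corr(h, r):
    """Adjoint of circ_conv: circularly correlate residual r with filter h."""
    n = len(r)
    return [sum(h[k] * r[(i + k) % n] for k in range(len(h))) for i in range(n)]


def soft(v, t):
    """Soft-threshold v by t (the proximal operator of the l1 norm)."""
    return max(abs(v) - t, 0.0) * (1.0 if v > 0 else -1.0)


def separate(y, filters, lam=0.05, step=0.4, iters=500):
    """Estimate one sparse source per filter from the single mixture y.

    Minimises 0.5 * ||y - sum_j h_j * s_j||^2 + lam * sum_j ||s_j||_1.
    The default step of 0.4 is valid for the toy filters below; other
    filters need a step no larger than 1 / (largest eigenvalue of A A^T).
    """
    n = len(y)
    est = [[0.0] * n for _ in filters]
    for _ in range(iters):
        # Current model of the mixture: sum of filtered source estimates.
        parts = [circ_conv(h, s) for h, s in zip(filters, est)]
        mix = [sum(vals) for vals in zip(*parts)]
        resid = [yi - mi for yi, mi in zip(y, mix)]
        # Gradient step on the data term, then shrinkage towards sparsity.
        est = [
            [soft(si + step * gi, step * lam)
             for si, gi in zip(s, circ_corr(h, resid))]
            for h, s in zip(filters, est)
        ]
    return est


# Two hypothetical source positions, each with its own filter.
n = 16
h1 = [1.0, 0.5]    # position 1: direct path plus an in-phase echo
h2 = [1.0, -0.5]   # position 2: direct path plus an inverted echo

s1 = [0.0] * n
s1[3] = 1.0        # source 1 emits a click at sample 3
s2 = [0.0] * n
s2[10] = 1.0       # source 2 emits a click at sample 10

# A single (monaural) observation of both filtered sources.
y = [a + b for a, b in zip(circ_conv(h1, s1), circ_conv(h2, s2))]

est1, est2 = separate(y, [h1, h2])
peak1 = max(range(n), key=lambda i: abs(est1[i]))
peak2 = max(range(n), key=lambda i: abs(est2[i]))
print("source 1 peak at sample", peak1)
print("source 2 peak at sample", peak2)
```

On this toy mixture the largest recovered coefficient of each source should fall at the position of its true click, because each click is explained most cheaply (in the l1 sense) by the filter that actually shaped it. The l1 penalty shrinks the recovered amplitudes slightly below their true values, which is the usual trade-off of this kind of solver.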

Copyright information

© 2004 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Pearlmutter, B.A., Zador, A.M. (2004). Monaural Source Separation Using Spectral Cues. In: Puntonet, C.G., Prieto, A. (eds) Independent Component Analysis and Blind Signal Separation. ICA 2004. Lecture Notes in Computer Science, vol 3195. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-30110-3_61

  • DOI: https://doi.org/10.1007/978-3-540-30110-3_61

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-23056-4

  • Online ISBN: 978-3-540-30110-3

  • eBook Packages: Springer Book Archive
