A General Modular Framework for Audio Source Separation

  • Conference paper
Latent Variable Analysis and Signal Separation (LVA/ICA 2010)

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 6365)

Abstract

Most audio source separation methods are developed for a particular scenario, characterized by the number of sources and channels and by the characteristics of the sources and the mixing process. In this paper we introduce a general modular audio source separation framework based on a library of flexible source models that enable the incorporation of prior knowledge about the characteristics of each source. First, this framework generalizes several existing audio source separation methods and brings them under a common formulation. Second, it makes it possible to design and implement new efficient methods that have not yet been reported in the literature. We first introduce the framework by describing the flexible model, explaining its generality, and summarizing our modular implementation, which uses a Generalized Expectation-Maximization algorithm. We then illustrate these capabilities by applying the framework in several new and existing configurations to different source separation scenarios.

Copyright information

© 2010 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Ozerov, A., Vincent, E., Bimbot, F. (2010). A General Modular Framework for Audio Source Separation. In: Vigneron, V., Zarzoso, V., Moreau, E., Gribonval, R., Vincent, E. (eds) Latent Variable Analysis and Signal Separation. LVA/ICA 2010. Lecture Notes in Computer Science, vol 6365. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-15995-4_5

  • DOI: https://doi.org/10.1007/978-3-642-15995-4_5

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-15994-7

  • Online ISBN: 978-3-642-15995-4

  • eBook Packages: Computer Science, Computer Science (R0)
