
First Stereo Audio Source Separation Evaluation Campaign: Data, Algorithms and Results

  • Emmanuel Vincent
  • Hiroshi Sawada
  • Pau Bofill
  • Shoji Makino
  • Justinian P. Rosca
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4666)

Abstract

This article provides an overview of the first stereo audio source separation evaluation campaign, organized by the authors. Fifteen underdetermined stereo source separation algorithms have been applied to various audio data, including instantaneous, convolutive and real mixtures of speech or music sources. The data and the algorithms are presented and the estimated source signals are compared to reference signals using several objective performance criteria.
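As a rough illustration of the kind of objective comparison described above, the sketch below computes a simple signal-to-distortion ratio (in dB) between an estimated source signal and its reference. The abstract does not name the criteria used in the campaign, so the choice of this ratio, the function name `sdr_db`, and the toy signals are assumptions made purely for illustration.

```python
import math


def sdr_db(reference, estimate):
    """Signal-to-distortion ratio in dB between two equal-length signals.

    Illustrative only: this is an assumed, simplified criterion and not
    necessarily one of the criteria used in the campaign.
    """
    if len(reference) != len(estimate):
        raise ValueError("signals must have the same length")
    # Energy of the reference signal.
    signal_energy = sum(r * r for r in reference)
    # Energy of the error between estimate and reference.
    distortion_energy = sum((e - r) ** 2 for r, e in zip(reference, estimate))
    if distortion_energy == 0:
        return math.inf  # perfect estimate
    if signal_energy == 0:
        return -math.inf  # silent reference, non-zero error
    return 10.0 * math.log10(signal_energy / distortion_energy)


# Toy example (hypothetical data): a sine wave and an estimate scaled by 0.9.
reference = [math.sin(2 * math.pi * 5 * n / 100) for n in range(100)]
estimate = [0.9 * r for r in reference]
print(round(sdr_db(reference, estimate), 1))
```

A higher value means the estimate is closer to the reference; an estimate that is a scaled copy of the reference is penalised only for the gain error.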

Keywords

Source Separation, Blind Source Separation, Spatial Image, Reverberant Environment, Instantaneous Mixture
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.



Copyright information

© Springer-Verlag Berlin Heidelberg 2007

Authors and Affiliations

  • Emmanuel Vincent (1)
  • Hiroshi Sawada (2)
  • Pau Bofill (3)
  • Shoji Makino (2)
  • Justinian P. Rosca (4)
  1. METISS Group, IRISA-INRIA, Campus de Beaulieu, 35042 Rennes Cedex, France
  2. Signal Processing Research Group, NTT Communication Science Labs, 2-4, Hikaridai, Seika-cho, Soraku-gun, Kyoto 619-0237, Japan
  3. Departament d’Arquitectura de Computadors, Universitat Politècnica de Catalunya, Campus Nord Mòdul D6, Jordi Girona 1-3, 08034 Barcelona, Spain
  4. Siemens Corporate Research, 755 College Road East, Princeton, NJ 08540, USA
