The 2011 Signal Separation Evaluation Campaign (SiSEC2011): - Audio Source Separation -

  • Shoko Araki
  • Francesco Nesta
  • Emmanuel Vincent
  • Zbyněk Koldovský
  • Guido Nolte
  • Andreas Ziehe
  • Alexis Benichoux
Part of the Lecture Notes in Computer Science book series (LNCS, volume 7191)

Abstract

This paper summarizes the audio part of the 2011 community-based Signal Separation Evaluation Campaign (SiSEC2011). Four speech and music datasets were contributed, including datasets recorded in noisy or dynamic environments and a subset of the SiSEC2010 datasets. The participants addressed one or more of the four source separation tasks, and the results for each task were evaluated using different objective performance criteria. We provide an overview of the audio datasets, tasks and criteria. We also report the results achieved with the submitted systems, and discuss organization strategies for future campaigns.

Keywords

Source Separation; Blind Source Separation; Nonnegative Matrix Factorization; Speech Enhancement; Reverberation Time

Copyright information

© Springer-Verlag Berlin Heidelberg 2012

Authors and Affiliations

  • Shoko Araki (1)
  • Francesco Nesta (2)
  • Emmanuel Vincent (3)
  • Zbyněk Koldovský (4)
  • Guido Nolte (5)
  • Andreas Ziehe (5)
  • Alexis Benichoux (3)

  1. NTT Communication Science Labs., NTT Corporation, Japan
  2. Center of Information Technology, Fondazione Bruno Kessler - Irst, Italy
  3. Centre Inria Rennes, INRIA, Bretagne Atlantique, France
  4. Technical University of Liberec, Czech Republic
  5. Fraunhofer Institute FIRST IDA, Germany
