The 2015 Signal Separation Evaluation Campaign

  • Nobutaka Ono
  • Zafar Rafii
  • Daichi Kitamura
  • Nobutaka Ito
  • Antoine Liutkus
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 9237)

Abstract

In this paper, we report on the 2015 community-based Signal Separation Evaluation Campaign (SiSEC 2015). SiSEC 2015 comprised four speech and music datasets, including two new ones: “Professionally produced music recordings” and “Asynchronous recordings of speech mixtures”. Focusing on these, we give an overview of the campaign specifications, such as the tasks, datasets, and evaluation criteria, and summarize the performance of the submitted systems.
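SiSEC campaigns have traditionally scored submissions with energy-ratio criteria such as the signal-to-distortion ratio (SDR) from the BSS Eval family. As a rough illustration only, the following minimal Python sketch computes a simplified single-source SDR, taking the target component to be the orthogonal projection of the estimate onto the reference; the function name and toy signals are our own, and the official BSS Eval toolkit performs a more elaborate decomposition that additionally yields SIR and SAR.

import numpy as np

def simple_sdr(reference, estimate):
    # Simplified signal-to-distortion ratio (SDR) in dB for one source.
    # The "target" part of the estimate is its orthogonal projection onto
    # the reference; everything left over is counted as distortion.
    reference = np.asarray(reference, dtype=float)
    estimate = np.asarray(estimate, dtype=float)
    target = (estimate @ reference) / (reference @ reference) * reference
    distortion = estimate - target
    return 10.0 * np.log10(np.sum(target ** 2) / np.sum(distortion ** 2))

# Toy usage: a lightly corrupted copy of the reference scores about 20 dB.
rng = np.random.default_rng(0)
reference = rng.standard_normal(16000)             # 1 s of "source" at 16 kHz
estimate = reference + 0.1 * rng.standard_normal(16000)
print(f"SDR: {simple_sdr(reference, estimate):.1f} dB")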

Keywords

Source Separation · Deep Neural Network · Robust Principal Component Analysis · Permutation Problem · Glottal Closure Instant

Acknowledgment

We would like to thank Dr. Shigeki Miyabe for providing the new ASY dataset, and Mike Senior for giving us permission to use the MSD database to create the MSD100 corpus.


Copyright information

© Springer International Publishing Switzerland 2015

Authors and Affiliations

  • Nobutaka Ono, National Institute of Informatics, Tokyo, Japan
  • Zafar Rafii, Media Technology Lab, Gracenote, Emeryville, USA
  • Daichi Kitamura, SOKENDAI (The Graduate University for Advanced Studies), Hayama, Japan
  • Nobutaka Ito, NTT Communication Science Laboratories, NTT Corporation, Kyoto, Japan
  • Antoine Liutkus, INRIA, Villers-lès-Nancy, France
