Convolutive Audio Source Separation Using Robust ICA and Reduced Likelihood Ratio Jump

  • Dimitrios Mallis
  • Thomas Sgouros
  • Nikolaos Mitianoudis
Conference paper
Part of the IFIP Advances in Information and Communication Technology book series (IFIPAICT, volume 475)

Abstract

Audio source separation is the task of isolating sound sources that are active simultaneously in a room and captured by a set of microphones. Convolutive audio source separation with an equal number of sources and microphones has a number of shortcomings, including the complexity of frequency-domain ICA, the permutation ambiguity and the problem’s scalability with an increasing number of sensors. In this paper, the authors propose a multiple-microphone audio source separation algorithm based on previous work by Mitianoudis and Davies [1]. Complex FastICA is replaced by Robust ICA, which increases robustness and performance. The permutation ambiguity is solved using the Likelihood Ratio Jump solution, which is now modified to decrease computational complexity in the case of multiple microphones.
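The permutation ambiguity mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' algorithm: it only assumes the standard frequency-domain model, in which each frequency bin is an instantaneous complex mixture, and it uses hypothetical names (`A`, `S`, `X`, `W_perm`, `U`) and synthetic data. It shows that an unmixing matrix with swapped rows separates a bin just as well, so a bin-wise method cannot tell on its own which output belongs to which source.

```python
import numpy as np

rng = np.random.default_rng(0)
n_bins, n_frames, n_src = 4, 2000, 2  # illustrative sizes, not from the paper

# Hypothetical complex mixing matrix A(f) for every frequency bin.
A = rng.normal(size=(n_bins, n_src, n_src)) \
    + 1j * rng.normal(size=(n_bins, n_src, n_src))

# Synthetic sparse sources: Laplacian magnitudes with random phases.
S = rng.laplace(size=(n_bins, n_src, n_frames)) \
    * np.exp(2j * np.pi * rng.random((n_bins, n_src, n_frames)))

# Per-bin instantaneous mixing: X(f, t) = A(f) S(f, t).
X = A @ S

# Ideal per-bin unmixing matrices W(f) = A(f)^{-1}.
W = np.linalg.inv(A)

# Swap the rows of W in bin 2 only. The result is still a valid separator
# for that bin, but its outputs come out in the opposite order.
W_perm = W.copy()
W_perm[2] = W[2][::-1]

U = W_perm @ X

# In bin 0 output 0 is source 0; in bin 2 output 0 is source 1.
bin0_in_order = np.allclose(U[0, 0], S[0, 0])
bin2_swapped = np.allclose(U[2, 0], S[2, 1])
print(bin0_in_order, bin2_swapped)
```

A permutation-alignment step, such as the Likelihood Ratio Jump named in the abstract, is what restores a consistent source order across bins before the signals are transformed back to the time domain.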

References

  1. Mitianoudis, N., Davies, M.: Audio source separation of convolutive mixtures. IEEE Trans. Speech Audio Process. 11(5), 489–497 (2003)
  2. Smaragdis, P.: Blind separation of convolved mixtures in the frequency domain. Neurocomputing 22(1), 21–34 (1998)
  3. Mazur, R., Mertins, A.: An approach for solving the permutation problem of convolutive blind source separation based on statistical signal models. IEEE Trans. Audio Speech Lang. Process. 17(1), 117–126 (2009)
  4. Sawada, H., Araki, S., Makino, S.: Underdetermined convolutive blind source separation via frequency bin-wise clustering and permutation alignment. IEEE Trans. Audio Speech Lang. Process. 19(3), 516–527 (2011)
  5. Saito, S., Oishi, K., Furukawa, T.: Convolutive blind source separation using an iterative least-squares algorithm for non-orthogonal approximate joint diagonalization. IEEE Trans. Audio Speech Lang. Process. 23(12), 2434–2448 (2015)
  6. Sarmiento, A., Duran-Diaz, I., Cichocki, A., Cruces, S.: A contrast function based on generalised divergences for solving the permutation problem in convolved speech mixtures. IEEE Trans. Audio Speech Lang. Process. 23(11), 1713–1726 (2015)
  7. Wang, L., Ding, H., Yin, F.: A region-growing permutation alignment approach in frequency-domain blind source separation of speech mixtures. IEEE Trans. Audio Speech Lang. Process. 19(3), 2434–2448 (2011)
  8. Zarzoso, V., Comon, P.: Robust independent component analysis by iterative maximization of the Kurtosis contrast with algebraic optimal step size. IEEE Trans. Neural Netw. 21(2), 248–261 (2010)
  9. Hyvärinen, A.: The fixed-point algorithm and maximum likelihood estimation for independent component analysis. Neural Process. Lett. 10(1), 1–5 (1999)
  10. Hyvärinen, A.: Fast and robust fixed-point algorithms for independent component analysis. IEEE Trans. Neural Netw. 10(3), 626–634 (1999)
  11. Févotte, C., Gribonval, R., Vincent, E.: BSS EVAL toolbox user guide. Technical report, IRISA Technical Report 1706, Rennes, France, April 2005. http://www.irisa.fr/metiss/bss_eval/

Copyright information

© IFIP International Federation for Information Processing 2016

Authors and Affiliations

  • Dimitrios Mallis (1)
  • Thomas Sgouros (1)
  • Nikolaos Mitianoudis (1)
  1. Department of Electrical and Computer Engineering, Democritus University of Thrace, Xanthi, Greece
