Evolving Systems, Volume 9, Issue 4, pp 315–329

Convolutive audio source separation using robust ICA and an intelligent evolving permutation ambiguity solution

  • Dimitrios Mallis
  • Thomas Sgouros
  • Nikolaos Mitianoudis
Original Paper

Abstract

Audio source separation is the task of isolating sound sources that are active simultaneously in a room captured by a set of microphones. Convolutive audio source separation with an equal number of sources and microphones has several shortcomings, including the complexity of frequency-domain ICA, the permutation ambiguity, and the problem's scalability as the number of sensors increases. In this paper, the authors propose a multiple-microphone audio source separation algorithm based on previous work by Mitianoudis and Davies [IEEE Trans Speech Audio Process 11(5):489–497, 2003]. Complex FastICA is replaced by Robust ICA, increasing robustness and performance. The permutation ambiguity is solved using two methodologies. The first uses the Likelihood Ratio Jump solution, which is now modified to decrease computational complexity in the case of multiple microphones. The second applies the MuSIC algorithm as a preprocessing step to the previous solution, with promising results.
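
The permutation ambiguity mentioned in the abstract can be illustrated with a small sketch. In frequency-domain ICA each frequency bin is separated independently, so the order of the separated outputs may differ from bin to bin. The Python code below is not the Likelihood Ratio Jump or MuSIC approach of the paper; it is a generic envelope-correlation alignment, shown only to make the problem concrete. It assumes synthetic magnitude envelopes in which each source has an activity pattern shared across bins, and the names `align_permutations`, `env`, and `true_perms` are hypothetical.

```python
import itertools
import numpy as np


def align_permutations(env):
    """Align per-bin output order by envelope correlation (illustrative only).

    env has shape (K bins, N outputs, T frames). Returns a list of tuples
    where perms[k][i] is the output index in bin k assigned to slot i.
    """
    K, N, _ = env.shape
    perms = [tuple(range(N))]          # bin 0 defines the reference order
    centroid = env[0].copy()           # running mean envelope per slot
    for k in range(1, K):
        best, best_score = None, -np.inf
        # Exhaustive search over N! orderings: only practical for small N.
        for p in itertools.permutations(range(N)):
            score = sum(
                np.corrcoef(centroid[i], env[k, p[i]])[0, 1] for i in range(N)
            )
            if score > best_score:
                best, best_score = p, score
        perms.append(best)
        centroid = (k * centroid + env[k, list(best)]) / (k + 1)
    return perms


# Synthetic data (assumption): each source has one activity envelope that is
# shared across bins, scaled per bin and perturbed by a small noise term.
rng = np.random.default_rng(0)
K, N, T = 64, 2, 400
base = np.abs(rng.standard_normal((N, T)))
true_perms = [tuple(int(j) for j in rng.permutation(N)) for _ in range(K)]

env = np.empty((K, N, T))
for k in range(K):
    scale = rng.uniform(0.5, 1.5, size=(N, 1))
    clean = base * scale + 0.1 * np.abs(rng.standard_normal((N, T)))
    env[k] = clean[list(true_perms[k])]   # scramble the output order per bin

perms = align_permutations(env)

# Source label that ends up in each slot, per bin. After alignment the
# labels should be identical in every bin.
labels = [tuple(true_perms[k][perms[k][i]] for i in range(N)) for k in range(K)]
consistent = len(set(labels)) == 1
```

The exhaustive search over orderings grows factorially with the number of sources, which gives a sense of why the cost of permutation alignment becomes a concern as the number of microphones increases.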

Keywords

  • Convolutive source separation
  • Robust ICA
  • Permutation ambiguity

References

  1. Cheney M (2001) The linear sampling method and the music algorithm. Inverse Probl 17:591–595
  2. Comon P (1994) Independent component analysis, a new concept? Signal Process 36:287–314
  3. Comon P, Jutten C (2010) Handbook of blind source separation: independent component analysis and applications. Academic Press, Cambridge
  4. Févotte C, Gribonval R, Vincent E (2005) BSS_EVAL toolbox user guide. Technical report, IRISA technical report 1706, Rennes. http://www.irisa.fr/metiss/bss_eval/. Accessed 27 July 2017
  5. Griffiths L, Jim C (1982) An alternative approach to linearly constrained adaptive beamforming. IEEE Trans Ant Propag 30:27–34
  6. Herbig T, Gerl F, Minker W, Haeb-Umbach R (2011) Adaptive systems for unsupervised speaker tracking and speech recognition. Evol Syst 2(3):199–214
  7. Hyvärinen A (1999) The fixed-point algorithm and maximum likelihood estimation for independent component analysis. Neural Process Lett 10(1):1–5
  8. Hyvärinen A (1999) Fast and robust fixed-point algorithms for independent component analysis. IEEE Trans Neural Netw 10(3):626–634
  9. Hyvärinen A, Karhunen J, Oja E (2001) Independent component analysis. Wiley, New York, p 481+xxii
  10. Hyvärinen A, Oja E (2000) Independent component analysis: algorithms and applications. Neural Netw 13(4–5):411–430
  11. Ikeda S, Murata N (1998) A method of blind separation on temporal structure of signals. In: ICONIP, vol 98, Citeseer, pp 737–742
  12. Markovich-Golan S, Gannot S, Kellermann W (2017) Combined LCMV-TRINICON beamforming for separating multiple speech sources in noisy and reverberant environments. IEEE Trans Audio Speech Lang Process 25(2):320–332
  13. Mazur R, Mertins A (2009) An approach for solving the permutation problem of convolutive blind source separation based on statistical signal models. IEEE Trans Audio Speech Lang Process 17(1):117–126
  14. Mitianoudis N, Davies M (2003) Audio source separation of convolutive mixtures. IEEE Trans Speech Audio Process 11(5):489–497
  15. Mitianoudis N, Davies M (2003) Using beamforming in the audio source separation problem. In: Proc. Int. Symp. on signal processing and its applications, Paris, pp 89–92
  16. Mitianoudis N, Davies M (2004) Permutation alignment for frequency domain ICA using subspace beamforming methods. In: Proc. Int. workshop on independent component analysis and source separation (ICA2004), Granada, pp 127–132
  17. Mitianoudis N (2004) Audio source separation using independent component analysis. PhD thesis, Queen Mary London
  18. Moon T, Stirling W (2000) Mathematical methods and algorithms for signal processing. Prentice Hall, Upper Saddle River
  19. Parra L, Spence C (2000) Convolutive blind separation of non-stationary sources. IEEE Trans Speech Audio Process 8(3):320–327
  20. Saito S, Oishi K, Furukawa T (2015) Convolutive blind source separation using an iterative least-squares algorithm for non-orthogonal approximate joint diagonalization. IEEE Trans Audio Speech Lang Process 23(12):2434–2448
  21. Sarmiento A, Duran-Diaz I, Cichocki A, Cruces S (2015) A contrast function based on generalised divergences for solving the permutation problem in convolved speech mixtures. IEEE Trans Audio Speech Lang Process 23(11):1713–1726
  22. Sawada H, Araki S, Makino S (2011) Underdetermined convolutive blind source separation via frequency bin-wise clustering and permutation alignment. IEEE Trans Audio Speech Lang Process 19(3):516–527
  23. Sawada H, Mukai R, Araki S, Makino S (2003) A robust and precise method for solving the permutation problem of frequency-domain blind source separation. In: 4th Int. Symp. on independent component analysis and blind signal separation (ICA2003), Nara, Japan, pp 505–510
  24. Simpson A, Roma G, Grais E, Mason R, Hummersone C, Liutkus A, Plumbley M (2016) Evaluation of audio source separation models using hypothesis-driven non-parametric statistical methods. In: 24th European signal processing conference (EUSIPCO 2016), Budapest, Hungary
  25. Smaragdis P (1998) Blind separation of convolved mixtures in the frequency domain. Neurocomputing 22(1):21–34
  26. Wang L, Ding H, Yin F (2011) A region-growing permutation alignment approach in frequency-domain blind source separation of speech mixtures. IEEE Trans Audio Speech Lang Process 19(3):2434–2448
  27. Zarzoso V, Comon P (2010) Robust independent component analysis by iterative maximization of the kurtosis contrast with algebraic optimal step size. IEEE Trans Neural Netw 21(2):248–261
  28. Zhang K, Chan L (2010) Convolutive blind source separation by efficient blind deconvolution and minimal filter distortion. Neurocomputing 73(13–15):2580–2588

Copyright information

© Springer-Verlag GmbH Germany 2017

Authors and Affiliations

  1. Electrical and Computer Engineering Department, Democritus University of Thrace, Xanthi, Greece
