Convolutive audio source separation using robust ICA and an intelligent evolving permutation ambiguity solution

  • Original Paper
Evolving Systems

Abstract

Audio source separation is the task of isolating sound sources that are simultaneously active in a room and captured by a set of microphones. Convolutive audio source separation with an equal number of sources and microphones has a number of shortcomings, including the complexity of frequency-domain ICA, the permutation ambiguity and the problem’s scalability with an increasing number of sensors. In this paper, the authors propose a multiple-microphone audio source separation algorithm based on previous work by Mitianoudis and Davies [IEEE Trans Speech Audio Process 11(5):489–497, 2003]. Complex FastICA is replaced by Robust ICA, increasing robustness and performance. The permutation ambiguity is solved using two methodologies. The first uses the Likelihood Ratio Jump solution, which is now modified to decrease computational complexity in the case of multiple microphones. The second applies the MuSIC algorithm as a preprocessing step to the previous solution, with promising results.
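
The permutation ambiguity mentioned above can be illustrated with a minimal sketch. It is not the authors' algorithm: it only assumes an idealised two-source, two-microphone frequency-domain model in which each bin f has its own complex mixing matrix A(f), and ICA recovers an unmixing matrix W(f) = P(f) A(f)^-1 with an arbitrary, uncontrolled permutation P(f). The scaling ambiguity is ignored, and all function names are hypothetical.

```python
import random

IDENTITY = [[1, 0], [0, 1]]
SWAP = [[0, 1], [1, 0]]

def inv2(m):
    """Inverse of a 2x2 complex matrix given as nested lists."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def matmul2(p, q):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(p[i][k] * q[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def random_mixing(rng):
    """Random invertible 2x2 complex mixing matrix A(f) (illustrative only)."""
    while True:
        m = [[complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(2)]
             for _ in range(2)]
        if abs(m[0][0] * m[1][1] - m[0][1] * m[1][0]) > 0.1:
            return m

def simulate_bins(n_bins=8, seed=0):
    """Return, for each bin, which source ends up at output 0."""
    rng = random.Random(seed)
    orders = []
    for _ in range(n_bins):
        A = random_mixing(rng)
        P = rng.choice([IDENTITY, SWAP])  # assumption: ICA cannot control this
        W = matmul2(P, inv2(A))           # ideal unmixing, up to permutation
        G = matmul2(W, A)                 # global system; equals P numerically
        orders.append(0 if abs(G[0][0]) > abs(G[0][1]) else 1)
    return orders

orders = simulate_bins()
print(orders)  # a list of 0s and 1s, one entry per frequency bin
```

Every bin is perfectly separated on its own, yet output 0 may carry source 0 in one bin and source 1 in the next. A permutation alignment step, such as the two methodologies named in the abstract, has to make this order consistent across bins before the signals are transformed back to the time domain.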

Notes

  1. Dataset available at http://utopia.duth.gr/nmitiano/download.html.

  2. Dataset available from http://utopia.duth.gr/nmitiano/download.html.

  3. http://www.i3s.unice.fr/~zarzoso/robustica.html.

References

  • Cheney M (2001) The linear sampling method and the MUSIC algorithm. Inverse Probl 17:591–595

  • Comon P (1994) Independent component analysis—a new concept? Signal Process 36:287–314

  • Comon P, Jutten C (2010) Handbook of blind source separation: independent component analysis and applications. Academic Press, Cambridge

  • Févotte C, Gribonval R, Vincent E (2005) BSS_EVAL toolbox user guide. Technical report, IRISA technical report 1706, Rennes. http://www.irisa.fr/metiss/bss_eval/. Accessed 27 July 2017

  • Griffiths L, Jim C (1982) An alternative approach to linearly constrained adaptive beamforming. IEEE Trans Ant Propag 30:27–34

  • Herbig T, Gerl F, Minker W, Haeb-Umbach R (2011) Adaptive systems for unsupervised speaker tracking and speech recognition. Evol Syst 2(3):199–214

  • Hyvärinen A (1999) The fixed-point algorithm and maximum likelihood estimation for independent component analysis. Neural Process Lett 10(1):1–5

  • Hyvärinen A (1999) Fast and robust fixed-point algorithms for independent component analysis. IEEE Trans Neural Netw 10(3):626–634

  • Hyvärinen A, Karhunen J, Oja E (2001) Independent component analysis. Wiley, New York, p 481+xxii

  • Hyvärinen A, Oja E (2000) Independent component analysis: algorithms and applications. Neural Netw 13(4–5):411–430

  • Ikeda S, Murata N (1998) A method of blind separation on temporal structure of signals. In: ICONIP, vol 98, Citeseer, pp 737–742

  • Markovich-Golan S, Gannot S, Kellermann W (2017) Combined LCMV-TRINICON beamforming for separating multiple speech sources in noisy and reverberant environments. IEEE Trans Audio Speech Lang Process 25(2):320–332

  • Mazur R, Mertins A (2009) An approach for solving the permutation problem of convolutive blind source separation based on statistical signal models. IEEE Trans Audio Speech Lang Process 17(1):117–126

  • Mitianoudis N, Davies M (2003) Audio source separation of convolutive mixtures. IEEE Trans Speech Audio Process 11(5):489–497

  • Mitianoudis N, Davies M (2003) Using beamforming in the audio source separation problem. In: Proc. Int. Symp. on signal processing and its applications, Paris, pp 89–92

  • Mitianoudis N, Davies M (2004) Permutation alignment for frequency domain ICA using subspace beamforming methods. In: Proc. Int. workshop on independent component analysis and source separation (ICA2004), Granada, pp 127–132

  • Mitianoudis N (2004) Audio source separation using independent component analysis. PhD thesis, Queen Mary London

  • Moon T, Stirling W (2000) Mathematical methods and algorithms for signal processing. Prentice Hall, Upper Saddle River

  • Parra L, Spence C (2000) Convolutive blind separation of non-stationary sources. IEEE Trans Speech Audio Process 8(3):320–327

  • Saito S, Oishi K, Furukawa T (2015) Convolutive blind source separation using an iterative least-squares algorithm for non-orthogonal approximate joint diagonalization. IEEE Trans Audio Speech Lang Process 23(12):2434–2448

  • Sarmiento A, Duran-Diaz I, Cichocki A, Cruces S (2015) A contrast function based on generalised divergences for solving the permutation problem in convolved speech mixtures. IEEE Trans Audio Speech Lang Process 23(11):1713–1726

  • Sawada H, Araki S, Makino S (2011) Underdetermined convolutive blind source separation via frequency bin-wise clustering and permutation alignment. IEEE Trans Audio Speech Lang Process 19(3):516–527

  • Sawada H, Mukai R, Araki S, Makino S (2003) A robust and precise method for solving the permutation problem of frequency-domain blind source separation. In: 4th Int. Symp. on independent component analysis and blind signal separation (ICA2003). Nara, Japan, pp 505–510

  • Simpson A, Roma G, Grais E, Mason R, Hummersone C, Liutkus A, Plumbley M (2016) Evaluation of audio source separation models using hypothesis-driven non-parametric statistical methods. In: 24th European signal processing conference (EUSIPCO 2016). Budapest, Hungary

  • Smaragdis P (1998) Blind separation of convolved mixtures in the frequency domain. Neurocomputing 22(1):21–34

  • Wang L, Ding H, Yin F (2011) A region-growing permutation alignment approach in frequency-domain blind source separation of speech mixtures. IEEE Trans Audio Speech Lang Process 19(3):2434–2448

  • Zarzoso V, Comon P (2010) Robust independent component analysis by iterative maximization of the kurtosis contrast with algebraic optimal step size. IEEE Trans Neural Netw 21(2):248–261

  • Zhang K, Chan L (2010) Convolutive blind source separation by efficient blind deconvolution and minimal filter distortion. Neurocomputing 73(13–15):2580–2588

Author information

Corresponding author

Correspondence to Nikolaos Mitianoudis.

About this article

Cite this article

Mallis, D., Sgouros, T. & Mitianoudis, N. Convolutive audio source separation using robust ICA and an intelligent evolving permutation ambiguity solution. Evolving Systems 9, 315–329 (2018). https://doi.org/10.1007/s12530-017-9199-3

  • DOI: https://doi.org/10.1007/s12530-017-9199-3