Underdetermined Reverberant Audio-Source Separation Through Improved Expectation–Maximization Algorithm
Underdetermined reverberant audio-source separation is an important problem in speech and audio processing. Many separation algorithms have been proposed in which model parameters are estimated in the time–frequency domain, which leads to permutation ambiguity and poor separation performance. In addition, a crucial problem with existing expectation–maximization (EM) algorithms is that updating the model parameters at each iteration is time-consuming. In this paper, we present an improved EM algorithm that combines nonnegative matrix factorization (NMF) with time differences of arrival (TDOA) estimation and reduces the computational cost by properly selecting the initial values of the EM algorithm. In the proposed algorithm, an NMF source model is used to avoid the permutation ambiguity problem, and acoustic localization is achieved by transforming the TDOA. The model parameters are then updated to obtain better separation results. Finally, the source signals are separated using Wiener filters. Experimental results show that the proposed algorithm achieves better source separation performance than existing blind separation methods.
Keywords: Underdetermined mixture; Nonnegative matrix factorization; Time differences of arrival; Expectation–maximization
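The abstract does not give the model equations, so the following is only a minimal sketch of the final Wiener-filtering step under an assumed formulation: the local Gaussian model that is common in NMF-based multichannel separation, where each source has time–frequency variance v_j = W_j H_j and a per-frequency spatial covariance R_j. The function name `wiener_separate`, the argument names, and the array layouts are hypothetical choices made for this illustration and are not taken from the paper.

```python
import numpy as np


def wiener_separate(X, W, H, R, eps=1e-10):
    """Multichannel Wiener filtering under an assumed local Gaussian model.

    X : (F, N, I) complex mixture STFT (F bins, N frames, I channels)
    W : list of J nonnegative arrays, each (F, K)   (assumed NMF bases)
    H : list of J nonnegative arrays, each (K, N)   (assumed NMF activations)
    R : (J, F, I, I) complex spatial covariance matrices (assumed given)
    Returns the estimated source images, shape (J, F, N, I).
    """
    n_src = len(W)
    n_chan = X.shape[-1]

    # NMF source variances: v_j[f, n] = (W_j H_j)[f, n]
    V = np.stack([W[j] @ H[j] for j in range(n_src)])      # (J, F, N)

    # Per-source covariance of the spatial image: v_j[f, n] * R_j[f]
    sigma_c = V[..., None, None] * R[:, :, None, :, :]     # (J, F, N, I, I)

    # Mixture covariance is the sum over sources (eps keeps it invertible)
    sigma_x = sigma_c.sum(axis=0) + eps * np.eye(n_chan)   # (F, N, I, I)

    # Wiener gain G_j = Sigma_cj Sigma_x^{-1}, applied to every TF bin
    gain = sigma_c @ np.linalg.inv(sigma_x)                # (J, F, N, I, I)
    return np.einsum("jfnab,fnb->jfna", gain, X)
```

Because the gains of all sources sum to (approximately) the identity, the estimated source images add back up to the observed mixture. This holds even when there are more sources than microphones, which is the underdetermined case the paper targets.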
The authors would like to thank the anonymous reviewers for their insightful comments and helpful critiques of the manuscript that helped improve this paper. This work was partially supported by the National Natural Science Foundation of China (Grants 613300032, 61773128, 61673126, U1701261). Additionally, this work was partially supported by the Postdoctoral Science Foundation of China, No. 2018M643022.