Complex Extension of Infinite Sparse Factor Analysis for Blind Speech Separation
We present a blind source separation (BSS) method for speech signals that uses a complex extension of infinite sparse factor analysis (ISFA) in the frequency domain. The method is robust against the delayed signals that commonly occur in real environments, such as reflections, short-time reverberation, and differences in the arrival time of signals at the microphones. Conventional ISFA is a non-parametric Bayesian BSS method that has been applied only to time-domain signals because it can handle only real-valued signals. Our method uses complex normal distributions to estimate the source signals and the mixing matrix. Experimental results indicate that our method outperforms conventional ISFA in average signal-to-distortion ratio (SDR).
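To illustrate why delayed signals call for complex-valued modeling in the frequency domain, the following minimal sketch checks a standard property of the discrete Fourier transform: delaying a signal multiplies each frequency bin by a complex phase factor. It is not the ISFA model or the authors' code. The names `dft`, `circular_delay`, `source`, `gain`, and `delay` are hypothetical, and the delay is assumed to be circular so that the identity holds exactly.

```python
import cmath

def dft(x):
    """Naive discrete Fourier transform of a real or complex sequence."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n)]

def circular_delay(x, d):
    """Delay x by d samples, wrapping around (assumption: circular shift)."""
    n = len(x)
    return [x[(t - d) % n] for t in range(n)]

# Hypothetical toy source signal, attenuation, and delay (in samples).
source = [0.0, 1.0, 0.5, -0.3, 0.0, 0.2, -0.7, 0.1]
gain = 0.8
delay = 3

# Time domain: the microphone observes an attenuated, delayed copy.
mic = [gain * v for v in circular_delay(source, delay)]

# Frequency domain: the same relation is one complex coefficient per bin.
n = len(source)
S = dft(source)
X = dft(mic)
mixing = [gain * cmath.exp(-2j * cmath.pi * k * delay / n) for k in range(n)]
predicted = [mixing[k] * S[k] for k in range(n)]

# Largest mismatch between the observed and predicted spectra.
max_error = max(abs(a - b) for a, b in zip(X, predicted))
print(max_error)
```

In this toy setting the per-bin coefficients in `mixing` are complex even though the time-domain signals are real, which is why a real-valued model cannot represent a delay as a single mixing coefficient.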
Keywords: Blind source separation, Infinite sparse factor analysis, Non-parametric Bayes