Abstract
We consider the problem of online audio source separation. Existing algorithms adopt either a sliding block approach or a stochastic gradient approach, the latter being faster but less accurate. They also rely on either spatial cues or spectral cues alone and cannot separate certain mixtures. In this paper, we design a general online audio source separation framework that combines both approaches and both types of cues. The model parameters are estimated in the Maximum Likelihood (ML) sense using a Generalised Expectation Maximisation (GEM) algorithm with multiplicative updates. The separation performance is evaluated as a function of the block size and the step size and compared to that of an offline algorithm.
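The interplay between block size and step size can be illustrated with a toy sketch. It is not the GEM algorithm of the paper: the per-block mean below merely stands in for a per-block ML fit, and the function name, the parameter names and the convex-combination update rule are assumptions made for illustration. With a step size of 1 the estimate depends only on the newest block (a pure block approach); with a block of one frame and a small step size it behaves like a stochastic-approximation running average.

```python
import numpy as np

def online_block_estimates(frames, block_size, step_size):
    """Yield one smoothed parameter estimate per non-overlapping block.

    frames:     array of shape (n_frames, n_bins), e.g. power spectra.
    block_size: number of frames per block (hypothetical parameter name).
    step_size:  weight in (0, 1] given to the newest block's estimate.
    """
    frames = np.asarray(frames, dtype=float)
    estimate = None
    for start in range(0, frames.shape[0] - block_size + 1, block_size):
        block = frames[start:start + block_size]
        # Stand-in for the per-block ML fit (assumption: a simple mean).
        block_estimate = block.mean(axis=0)
        if estimate is None:
            # First block: nothing to blend with yet.
            estimate = block_estimate
        else:
            # Blend the previous estimate with the new block's estimate.
            estimate = (1.0 - step_size) * estimate + step_size * block_estimate
        yield estimate.copy()


if __name__ == "__main__":
    toy_frames = np.arange(12, dtype=float).reshape(6, 2)
    for est in online_block_estimates(toy_frames, block_size=2, step_size=0.5):
        print(est)
```

A larger block or a smaller step size smooths the estimate at the cost of slower tracking, which is the kind of trade-off the evaluation over block size and step size examines.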
Copyright information
© 2012 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Simon, L.S.R., Vincent, E. (2012). A General Framework for Online Audio Source Separation. In: Theis, F., Cichocki, A., Yeredor, A., Zibulevsky, M. (eds) Latent Variable Analysis and Signal Separation. LVA/ICA 2012. Lecture Notes in Computer Science, vol 7191. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-28551-6_49
DOI: https://doi.org/10.1007/978-3-642-28551-6_49
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-28550-9
Online ISBN: 978-3-642-28551-6
eBook Packages: Computer Science (R0)