Abstract
We propose a robust method to estimate the number of audio sources and the mixing matrix in a linear instantaneous mixture, even with more sources than sensors. Our method is based on a multiscale Short Time Fourier Transform (STFT), and relies on the assumption that in the neighborhood of some (unknown) scales and time-frequency points, only one source contributes to the mixture. Such time-frequency regions provide local estimates of the corresponding columns of the mixing matrix. Our main contribution is a new clustering algorithm called DEMIX to estimate the number of sources and the mixing matrix based on such local estimates. In contrast to DUET or other similar sparsity-based algorithms, which rely on a global scatter plot, our algorithm exploits a local confidence measure to weight the influence of each time-frequency point in the estimated matrix. Inspired by the work of Deville, the confidence measure relies on the time-frequency local persistence of the activity/inactivity of each source. Experiments are provided with stereophonic mixtures and show the improved performance of DEMIX compared to K-means or ELBG clustering algorithms.
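The idea that a time-frequency region dominated by one source yields a local estimate of a mixing-matrix column, together with a confidence value, can be illustrated with a minimal sketch. This is not the DEMIX implementation. It assumes a stereo mixture with a real mixing matrix, takes the principal direction of the local scatter of STFT coefficients as the direction estimate, and uses the eigenvalue ratio as a stand-in confidence measure. The function name `local_direction`, the input layout, and the test values are hypothetical.

```python
import math

def local_direction(x1, x2):
    """Estimate a mixing direction from one time-frequency neighborhood.

    x1, x2: equal-length sequences of complex STFT coefficients of the left
    and right channels over the same region (hypothetical input layout).
    Returns (theta, confidence): angle of the principal direction in radians
    and the ratio of the largest to the smallest eigenvalue.
    """
    # Entries of the 2x2 real scatter matrix [[a, b], [b, c]]. With a real
    # mixing matrix, real and imaginary parts share the same direction.
    a = sum(abs(u) ** 2 for u in x1)
    c = sum(abs(v) ** 2 for v in x2)
    b = sum(u.real * v.real + u.imag * v.imag for u, v in zip(x1, x2))

    # Closed-form eigen-decomposition of a symmetric 2x2 matrix.
    mean = (a + c) / 2
    radius = math.hypot((a - c) / 2, b)
    theta = 0.5 * math.atan2(2 * b, a - c)

    # A large ratio means the points lie close to a single line, which is
    # what a region dominated by one source produces.
    confidence = (mean + radius) / max(mean - radius, 1e-12)
    return theta, confidence

# Region with a single active source mixed along angle 0.3 rad.
s = [complex(1, 2), complex(-0.5, 0.3), complex(0.2, -1.1)]
one_source = local_direction([math.cos(0.3) * z for z in s],
                             [math.sin(0.3) * z for z in s])

# Same region with a second source along 1.2 rad: confidence drops.
t = [complex(0.7, -0.4), complex(1.0, 0.9), complex(-0.3, 0.6)]
two_sources = local_direction(
    [math.cos(0.3) * p + math.cos(1.2) * q for p, q in zip(s, t)],
    [math.sin(0.3) * p + math.sin(1.2) * q for p, q in zip(s, t)])
```

In a clustering step, such confidence values could be used to weight each local direction estimate, so that regions where several sources overlap contribute little to the estimated matrix.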
References
Yilmaz, O., Rickard, S.: Blind separation of speech mixtures via time-frequency masking. IEEE Transactions on Signal Processing 52, 1830–1847 (2004)
Abrard, F., Deville, Y., White, P.: From blind source separation to blind source cancellation in the underdetermined case: a new approach based on time-frequency analysis. In: ICA (2001)
Abrard, F.: Blind separation of dependent sources using the time-frequency ratio of mixtures approach. In: ISSPA 2003, Paris, France. IEEE, Los Alamitos (2003)
Bofill, P.: Underdetermined blind source separation using sparse representations. Signal Processing 81, 2353–2362 (2001)
O’Grady, P.D., Pearlmutter, B.A., Rickard, S.T.: Survey of sparse and non-sparse methods in source separation. International Journal of Imaging Systems and Technology (2005)
MacQueen, J.B.: Some methods for classification and analysis of multivariate observations. In: 5th Berkeley Symposium on Mathematical Statistics and Probability (1967)
Patanè, G., Russo, M.: The enhanced LBG algorithm. Neural Networks 14(9), 1219–1237 (2001)
Härdle, W., Simar, L. (eds.): Applied multivariate statistical analysis. Springer, Heidelberg (2003)
© 2006 Springer-Verlag Berlin Heidelberg
Cite this paper
Arberet, S., Gribonval, R., Bimbot, F. (2006). A Robust Method to Count and Locate Audio Sources in a Stereophonic Linear Instantaneous Mixture. In: Rosca, J., Erdogmus, D., Príncipe, J.C., Haykin, S. (eds) Independent Component Analysis and Blind Signal Separation. ICA 2006. Lecture Notes in Computer Science, vol 3889. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11679363_67
DOI: https://doi.org/10.1007/11679363_67
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-32630-4
Online ISBN: 978-3-540-32631-1
eBook Packages: Computer Science, Computer Science (R0)