Abstract
The problem of underdetermined audio source separation has been explored in the literature for many years. The instantaneous \(K\)-sensors, \(L\)-sources mixing scenario (where \(K<L\)) has been tackled by many different approaches, provided the sources remain quite distinct in the virtual positioning space spanned by the sensors. In this case, the source separation problem can be solved as a directional clustering problem along the source position angles in the mixture. The use of Laplacian Mixture Models in order to cluster and thus separate sparse sources in underdetermined mixtures will be explained in detail in this chapter. The novel Generalised Directional Laplacian Density will be derived in order to address the problem of modelling multidimensional angular data. The developed scheme demonstrates robust separation performance along with low processing time.
Notes
1. A \(\pi \)-periodicity is valid for the observed phenomenon, since data in \((\pi /2,3\pi /2)\) are symmetrical to those in \((-\pi /2,\pi /2)\) (see Fig. 7.1b). Hence, the use of the atan function instead of the extended atan2 function is justified. For the rest of the analysis, we will assume that \(\theta _n\) takes values in \((0,\pi )\) rather than \((-\pi /2,\pi /2)\). This implies that data in the 4th quadrant \((-\pi /2,0)\) are mapped with odd symmetry to the 2nd quadrant \((\pi /2,\pi )\). This is performed in order to facilitate the derivation of the Generalised Directional Laplacian Distribution and does not alter the actual data.
2. Note that for a positive integer \(n\), we have \(\varGamma (n)=(n-1)!\).
3. MATLAB code for the “GaussSep” algorithm is available from http://www.irisa.fr/metiss/members/evincent/software.
4. MATLAB code for the “DEMIX” algorithm is available from http://infoscience.epfl.ch/record/165878/files/.
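The angle mapping described in Note 1 can be sketched as follows. This is a minimal illustration, not the chapter's code: it assumes a two-sensor mixture in which \(\theta _n\) is the angle of the observation \((x_1(n),x_2(n))\), and the function name `direction_angles` is hypothetical. `np.arctan2` followed by a modulo-\(\pi \) operation is used only to avoid a division by zero when \(x_1(n)=0\); for all other points it returns the atan angle with the 4th-quadrant values shifted into the 2nd quadrant.

```python
import numpy as np

def direction_angles(x1, x2):
    """Return the pi-periodic direction angle of each point (x1[n], x2[n]).

    Assumption: theta_n is the angle of the 2-sensor observation.
    Angles that atan would place in (-pi/2, 0) are shifted by +pi,
    so the result lies in [0, pi) up to floating-point rounding.
    """
    x1 = np.asarray(x1, dtype=float)
    x2 = np.asarray(x2, dtype=float)
    # arctan2 covers (-pi, pi]; taking the result modulo pi folds opposite
    # directions (x and -x) onto the same angle.
    return np.mod(np.arctan2(x2, x1), np.pi)
```

Points that differ only in sign, such as \((1,1)\) and \((-1,-1)\), receive the same angle, which is the symmetry that Note 1 relies on.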
References
Araki, S., Sawada, H., Mukai, R., Makino, S.: Underdetermined blind sparse source separation for arbitrarily arranged multiple sensors. Signal Process. 87, 1833–1847 (2007)
Arberet, S., Gribonval, R., Bimbot, F.: A robust method to count and locate audio sources in a multichannel underdetermined mixture. IEEE Trans. Signal Process. 58(1), 121–133 (2010)
Attias, H.: Independent factor analysis. Neural Comput. 11(4), 803–851 (1999)
Banerjee, A., Dhillon, I.S., Ghosh, J., Sra, S.: Clustering on the unit hypersphere using von Mises-Fisher distributions. J. Mach. Learn. Res. 6, 1345–1382 (2005)
Bentley, J.: Modelling circular data using a mixture of von Mises and uniform distributions. Simon Fraser University, MSc thesis (2006)
Bilmes, J.: A gentle tutorial of the EM algorithm and its application to parameter estimation for Gaussian mixture and hidden mixture models. Technical Report, Department of Electrical Engineering and Computer Science, U.C. Berkeley, California (1998)
Blumensath, T., Davies, M.: Sparse and shift-invariant representations of music. IEEE Trans. Audio Speech Lang. Process. 14(1), 50–57 (2006)
Bofill, P., Zibulevsky, M.: Underdetermined blind source separation using sparse representations. Signal Process. 81(11), 2353–2362 (2001)
Cardoso, J.F.: Blind signal separation: statistical principles. Proc. IEEE 86(10), 2009–2025 (1998)
Cemgil, A., Févotte, C., Godsill, S.: Variational and stochastic inference for Bayesian source separation. Digit. Signal Process. 17, 891–913 (2007)
Cichocki, A., Amari, S.: Adaptive Blind Signal and Image Processing. Wiley, New York (2002)
Comon, P., Jutten, C.: Handbook of Blind Source Separation: Independent Component Analysis and Applications, 856 p. Academic Press, Waltham (2010)
Daudet, L., Sandler, M.: MDCT analysis of sinusoids: explicit results and applications to coding artifacts reduction. IEEE Trans. Speech Audio Process. 12(3), 302–312 (2004)
Davies, M., Mitianoudis, N.: A simple mixture model for sparse overcomplete ICA. IEE Proc. Vis. Image Signal Process. 151(1), 35–43 (2004)
Davies, M., Daudet, L.: Sparse audio representations using the MCLT. Signal Process. 86(3), 358–368 (2006)
Dempster, A.P., Laird, N., Rubin, D.: Maximum likelihood for incomplete data via the EM algorithm. J. R. Stat. Soc. Ser. B 39, 1–38 (1977)
Devroye, L.: Non-Uniform Random Variate Generation. Springer, New York (1986)
Dhillon, I., Sra, S.: Modeling data using directional distributions. Technical Report TR-03-06, University of Texas at Austin, Austin, TX (2003)
Duarte, L., Suyama, R., Rivet, B., Attux, R., Romano, J., Jutten, C.: Blind compensation of nonlinear distortions: application to source separation of post-nonlinear mixtures. IEEE Trans. Signal Process. 60(11), 5832–5844 (2012)
Duong, N., Vincent, E., Gribonval, R.: Under-determined reverberant audio source separation using a full-rank spatial covariance model. IEEE Trans. Audio Speech Lang. Process. 18(7), 1830–1840 (2010)
Eriksson, J., Koivunen, V.: Identifiability, separability, and uniqueness of linear ICA models. IEEE Signal Process. Lett. 11(7), 601–604 (2004)
Févotte, C., Godsill, S.: A Bayesian approach to blind separation of sparse sources. IEEE Trans. Audio Speech Lang. Process. 14(6), 2174–2188 (2006)
Févotte, C., Gribonval, R., Vincent, E.: BSS EVAL toolbox user guide. Technical Report 1706, IRISA, Rennes, France, April 2005. http://www.irisa.fr/metiss/bsseval/ (2005)
Fisher, N.: Statistical Analysis of Circular Data. Cambridge University Press, Cambridge (1993)
Girolami, M.: A variational method for learning sparse and overcomplete representations. Neural Comput. 13(11), 2517–2532 (2001)
Gribonval, R., Nielsen, M.: Sparse decomposition in unions of bases. IEEE Trans. Inf. Theory 49(12), 3320–3325 (2003)
Hyvärinen, A., Karhunen, J., Oja, E.: Independent Component Analysis. Wiley, New York, 481+xxii p. http://www.cis.hut.fi/projects/ica/book/ (2001)
Hyvärinen, A.: Independent component analysis in the presence of Gaussian noise by maximizing joint likelihood. Neurocomputing 22, 49–67 (1998)
Jammalamadaka, S., Sengupta, A.: Topics in Circular Statistics. World Scientific, Singapore (2001)
Jutten, C., Karhunen, J.: Advances in nonlinear blind source separation. In: Proceedings of 4th International Symposium on Independent Component Analysis and Blind Signal Separation (ICA2003), pp. 245–256. Nara, Japan (2003)
Kreyszig, E.: Advanced Engineering Mathematics, 1264 p. Wiley (2010)
Lee, T.W., Bell, A.J., Lambert, R.: Blind separation of delayed and convolved sources. In: Advances in Neural Information Processing Systems (NIPS), vol. 9, pp. 758–764 (1997)
Lee, T.W., Lewicki, M., Girolami, M., Sejnowski, T.: Blind source separation of more sources than mixtures using overcomplete representations. IEEE Signal Process. Lett. 4(5), 87–90 (1999)
Lewicki, M., Sejnowski, T.: Learning overcomplete representations. Neural Comput. 12, 337–365 (2000)
Lewicki, M.: Efficient coding of natural sounds. Nat. Neurosci. 5(4), 356–363 (2002)
MacQueen, J.: Some methods for classification and analysis of multivariate observations. In: Proceedings of 5th Berkeley Symposium on Mathematical Statistics and Probability, pp. 281–297. Berkeley, California (1967)
Mardia, K.V., Jupp, P.E.: Directional Statistics. Wiley, Chichester (1999)
Mitianoudis, N., Davies, M.: Permutation alignment for frequency domain ICA using subspace beamforming methods. In: Proceedings of the International Workshop on Independent Component Analysis and Source Separation (ICA2004), pp. 127–132. Granada, Spain (2004)
Mitianoudis, N., Stathaki, T.: Underdetermined source separation using mixtures of warped Laplacians. In: International Conference on Independent Component Analysis and Source Separation (ICA). London, UK (2007)
Mitianoudis, N.: A directional Laplacian density for underdetermined audio source separation. In: 20th International Conference on Artificial Neural Networks (ICANN). Thessaloniki, Greece (2010)
Mitianoudis, N., Davies, M.: Audio source separation of convolutive mixtures. IEEE Trans. Speech Audio Process. 11(5), 489–497 (2003)
Mitianoudis, N., Stathaki, T.: Batch and online underdetermined source separation using Laplacian mixture models. IEEE Trans. Audio Speech Lang. Process. 15(6), 1818–1832 (2007)
Moulines, E., Cardoso, J.F., Gassiat, E.: Maximum likelihood for blind separation and deconvolution of noisy signals using mixture models. In: Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP’97), pp. 3617–3620. Munich, Germany (1997)
O’Grady, P., Pearlmutter, B.: Hard-LOST: modified K-means for oriented lines. In: Proceedings of the Irish Signals and Systems Conference, pp. 247–252. Ireland (2004)
O’Grady, P., Pearlmutter, B.: Soft-LOST: EM on a mixture of oriented lines. In: Proceedings of the International Conference on Independent Component Analysis 2004, pp. 428–435. Granada, Spain (2004)
Pajunen, P., Hyvärinen, A., Karhunen, J.: Nonlinear blind source separation by self-organizing maps. In: Proceedings of the International Conference on Neural Information Processing, pp. 1207–1210. Hong Kong (1996)
Plumbley, M., Abdallah, S., Blumensath, T., Davies, M.: Sparse representations of polyphonic music. Signal Process. 86(3), 417–431 (2006)
Rickard, S., Balan, R., Rosca, J.: Real-time time-frequency based blind source separation. In: Proceedings of the ICA2001, pp. 651–656. San Diego, CA (2001)
Sawada, H., Araki, S., Makino, S.: A two-stage frequency-domain blind source separation method for underdetermined convolutive mixtures. In: IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA 2007), pp. 139–142 (2007)
Sawada, H., Araki, S., Makino, S.: Underdetermined convolutive blind source separation via frequency bin-wise clustering and permutation alignment. IEEE Trans. Audio Speech Lang. Process. 19(3), 516–527 (2011)
Sawada, H., Mukai, R., Araki, S., Makino, S.: A robust and precise method for solving the permutation problem of frequency-domain blind source separation. IEEE Trans. Speech Audio Process. 12(5), 75–87 (2004)
SiSEC 2008: Signal separation evaluation campaign. http://sisec2008.wiki.irisa.fr/tiki-index.php
SiSEC 2010: Signal separation evaluation campaign. http://sisec2010.wiki.irisa.fr/tiki-index.php
SiSEC 2011: Signal separation evaluation campaign. http://sisec.wiki.irisa.fr/tiki-index.php
Smaragdis, P., Boufounos, P.: Position and trajectory learning for microphone arrays. IEEE Trans. Audio Speech Lang. Process. 15(1), 358–368 (2007)
Smaragdis, P.: Blind separation of convolved mixtures in the frequency domain. Neurocomputing 22, 21–34 (1998)
Torkkola, K.: Blind separation of delayed and convolved sources. In: S. Haykin (ed.) Unsupervised Adaptive Filtering, vol. I, pp. 321–375. Wiley (2000)
Vincent, E., Arberet, S., Gribonval, R.: Underdetermined instantaneous audio source separation via local Gaussian modeling. In: 8th International Conference on Independent Component Analysis and Signal Separation (ICA), pp. 775–782. Paraty, Brazil (2009)
Vincent, E., Gribonval, R., Févotte, C., Nesbit, A., Plumbley, M., Davies, M., Daudet, L.: BASS-dB: the blind audio source separation evaluation database. http://bass-db.gforge.inria.fr/BASS-dB/
Winter, S., Kellermann, W., Sawada, H., Makino, S.: MAP based underdetermined blind source separation of convolutive mixtures by hierarchical clustering and L1-norm minimization. EURASIP J. Adv. Signal Process. 1, 12 p (2007)
Yilmaz, O., Rickard, S.: Blind separation of speech mixtures via time-frequency masking. IEEE Trans. Signal Process. 52(7), 1830–1847 (2004)
Zibulevsky, M., Kisilev, P., Zeevi, Y., Pearlmutter, B.: Blind source separation via multinode sparse representation. Adv. Neural Inf. Process. Syst. 14, 1049–1056 (2002)
Appendices
Appendix 1
Calculation of the Normalisation Parameter for the Generalised DLD
To estimate the normalisation coefficient \(c_p(k)\) of (7.27), we need to solve the following equation:
Following Eq. (B.8) and in a similar manner to the analysis in Appendix B.2 in [18], we can rewrite the above equation as follows:
Following a similar methodology to Appendix B.2 in [18], the above yields:
Using the definition of \(I_p(k)\), we can write
Appendix 2
Gradient Updates for \(\mathbf {m}\) and \(k\) for the MDDLD
The first-order derivatives of the log-likelihood in (7.28) with respect to \(\mathbf {m}\) are calculated below:
Before we estimate \(k\) from the log-likelihood (7.28), we derive the following property:
The above property can be generalised as follows:
The first-order derivative of the log-likelihood in (7.28) for the estimation of \(k\) is then calculated below:
Appendix 3
A Directional K-Means Algorithm
Assume that \(K\) is the number of clusters, \(\fancyscript{C}_i,\ i=1,\dots ,K\) are the clusters, \(\mathbf {m}_i\) are the cluster centres and \(\mathbf {X}=\{\mathbf {x}_1,\dots ,\mathbf {x}_n,\dots ,\mathbf {x}_N\}\) is a \(p\)-D angular dataset lying on the half-unit \(p\)-D sphere. The original \(K\)-means [36] minimises the following non-directional error function:
$$\begin{aligned} \sum _{i=1}^{K}\sum _{\mathbf {x}_n \in \fancyscript{C}_i}||\mathbf {x}_n-\mathbf {m}_i||^2 \end{aligned}$$
where \(||\cdot ||\) represents the Euclidean distance. Instead of using the squared Euclidean distance for the \(p\)-D Directional \(K\)-Means, we introduce the following distance function:
Like the original distance, the novel function \(D_l\) is monotonic, but it places more emphasis on the contribution of points closer to the cluster centre. In addition, \(D_l\) is periodic with period \(\pi \). The \(p\)-D Directional \(K\)-Means can thus be described as follows:
1. Randomly initialise \(K\) cluster centres \(\mathbf {m}_i\), where \(||\mathbf {m}_i||=1\).
2. Calculate the distance of all points \(\mathbf {x}_n\) to the cluster centres \(\mathbf {m}_i\), using \(D_l\).
3. The points with minimum distance to each centre \(\mathbf {m}_i\) form the new cluster \(\fancyscript{C}_i\).
4. The clusters \(\fancyscript{C}_i\) vote for their new centres \(\mathbf {m}_i^+\). To avoid averaging mistakes with directional data, vector averaging is employed to ensure the validity of the addition. The resulting average is normalised to the half-unit \(p\)-D sphere, with \(k_i\) the number of points assigned to \(\fancyscript{C}_i\):
$$\begin{aligned} \mathbf {m}_i^+=\frac{1}{k_i}\sum _{\mathbf {x}_n \in \fancyscript{C}_i }\mathbf {x}_n \end{aligned}\tag{7.53}$$
$$\begin{aligned} \mathbf {m}_i^+\leftarrow \mathbf {m}_i^+/||\mathbf {m}_i^+|| \end{aligned}\tag{7.54}$$
5. Repeat steps (2)–(4) until the means \(\mathbf {m}_i\) have converged.
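The five steps above can be sketched in NumPy as follows. This is an illustrative sketch under stated assumptions, not the chapter's implementation. The distance is taken to be \(D_l(\mathbf {x}_n,\mathbf {m}_i)=\sqrt{1-(\mathbf {m}_i^T\mathbf {x}_n)^2}\), which is \(\pi \)-periodic as the text requires, but this form is an assumption since the definition of \(D_l\) is not reproduced here. The function names, the optional `init` argument, the stopping tolerance and the handling of empty clusters are likewise hypothetical choices.

```python
import numpy as np

def directional_distance(X, M):
    """Assumed D_l: sqrt(1 - (m^T x)^2) for unit vectors, shape (N, K)."""
    cos = X @ M.T
    return np.sqrt(np.clip(1.0 - cos**2, 0.0, None))

def directional_kmeans(X, K, init=None, n_iter=100, tol=1e-10, seed=0):
    """Cluster unit-norm rows of X (shape (N, p)) into K directions."""
    X = np.asarray(X, dtype=float)
    if init is None:
        # Step 1: random centres (hypothetical choice: Gaussian draws).
        rng = np.random.default_rng(seed)
        M = rng.standard_normal((K, X.shape[1]))
    else:
        M = np.array(init, dtype=float)
    M /= np.linalg.norm(M, axis=1, keepdims=True)  # enforce ||m_i|| = 1

    for _ in range(n_iter):
        # Steps 2-3: assign each point to the centre at minimum D_l.
        labels = directional_distance(X, M).argmin(axis=1)
        M_new = M.copy()
        for i in range(K):
            members = X[labels == i]
            if len(members) == 0:
                continue  # assumption: an empty cluster keeps its old centre
            # Step 4: vector average (7.53), then renormalise (7.54).
            mean = members.mean(axis=0)
            norm = np.linalg.norm(mean)
            if norm > 0:
                M_new[i] = mean / norm
        # Step 5: stop once the centre directions no longer move.
        change = 1.0 - np.abs(np.sum(M_new * M, axis=1))
        M = M_new
        if change.max() < tol:
            break

    labels = directional_distance(X, M).argmin(axis=1)
    return M, labels
```

Calling `directional_kmeans(X, K)` returns the unit-norm centres and the cluster label of each point. As with the standard \(K\)-means, the result depends on the initial centres, so several random restarts may be needed in practice.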
Copyright information
© 2014 Springer-Verlag Berlin Heidelberg
About this chapter
Cite this chapter
Mitianoudis, N. (2014). Underdetermined Audio Source Separation Using Laplacian Mixture Modelling. In: Naik, G., Wang, W. (eds) Blind Source Separation. Signals and Communication Technology. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-55016-4_7
DOI: https://doi.org/10.1007/978-3-642-55016-4_7
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-55015-7
Online ISBN: 978-3-642-55016-4
eBook Packages: Engineering, Engineering (R0)