
An Expectation-Maximization Algorithm for Blind Separation of Noisy Mixtures Using Gaussian Mixture Model


Abstract

In this paper, we propose a new expectation-maximization (EM) algorithm, named GMM-EM, for the blind separation of noisy instantaneous mixtures, in which the non-Gaussianity of the independent sources is exploited by modeling their distributions with a Gaussian mixture model (GMM). The compatibility between the incomplete-data structure of the GMM and the hidden-variable nature of the source separation problem leads to efficient hierarchical learning that alternates between estimating the sources and the mixing matrix. Compared with conventional blind source separation algorithms, the proposed GMM-EM algorithm performs better on noisy mixtures because the covariance matrix of the additive Gaussian noise is treated as a parameter to be estimated. Furthermore, the GMM-EM algorithm handles underdetermined cases by incorporating whatever prior information is available and jointly estimating the mixing matrix and source signals in a Bayesian framework. Systematic simulations with both synthetic and real speech signals demonstrate the advantage of the proposed algorithm over conventional independent component analysis techniques, such as FastICA, especially for noisy and/or underdetermined mixtures. Moreover, it achieves performance similar to the recent null space component analysis technique at a lower computational cost.
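For readers who want to experiment, the following NumPy sketch generates data from the noisy instantaneous mixing model the paper assumes, \(\mathbf{x}(t) = {\mathbf{A}}\mathbf{s}(t) + \mathbf{w}(t)\), with GMM-distributed sources. It is an illustration under stated assumptions, not the authors' code: each source is given its own scalar two-component GMM prior here, whereas the appendices work with a joint GMM over the source vector, and all names and parameter values are arbitrary.

```python
# Illustrative sketch of the signal model assumed in the paper:
#   x(t) = A s(t) + w(t)
# with GMM-distributed sources and additive Gaussian noise.
import numpy as np

rng = np.random.default_rng(0)

T, N, K = 1000, 3, 2      # samples, sources, sensors (K < N: underdetermined)
M = 2                     # GMM components per source

# GMM prior per source: mixture weights, component means and variances
omega = np.array([0.5, 0.5])
mu = np.array([-1.0, 1.0])
var = np.array([0.1, 0.1])

# Draw a component label per source and sample, then draw the sample
labels = rng.choice(M, size=(N, T), p=omega)
S = rng.normal(mu[labels], np.sqrt(var[labels]))            # N x T sources

A = rng.standard_normal((K, N))                             # mixing matrix
Rw = 0.05 * np.eye(K)                                       # noise covariance
W = rng.multivariate_normal(np.zeros(K), Rw, size=T).T      # K x T noise
X = A @ S + W                                               # noisy mixtures
```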


Notes

  1. For a given likelihood function, a conjugate prior is a prior for which the posterior and the prior belong to the same family of distributions (e.g., with a Gaussian likelihood of known variance, a Gaussian prior on the mean yields a Gaussian posterior).

  2. Matlab codes can be found at: http://www.i3s.unice.fr/pcomon/TensorPackage.html.

  3. Available at: http://www.kecl.ntt.co.jp/icl/signal/sawada/webdemo/bssdemo.html.


Acknowledgments

This work was supported in part by the National Natural Science Foundation of China under Grant 61601477 and by the Engineering and Physical Sciences Research Council (EPSRC) under Grant EP/K014307/1.

Author information

Correspondence to Shan Wang.

Appendices

Appendix 1: Proof of Equation (13)

Note that

$$\begin{aligned}&f\left( \mathbf{s},Y|\mathbf{x},{\mathbf{A}}^g,\mathbf{R}_w^g,\varTheta ^g\right) \nonumber \\&\quad = \prod \limits _{t = 1}^T {f\left( \mathbf{s}(t)|\mathbf{x},{\mathbf{A}}^g ,\mathbf{R}_w^g,\varTheta ^g \right) f\left( y(t)|\mathbf{s}(t),\mathbf{x},{\mathbf{A}}^g ,\mathbf{R}_w^g,\varTheta ^g \right) } \end{aligned}$$
(39)

On the other hand,

$$\begin{aligned}&f(\mathbf{x},\mathbf{s},Y|{\mathbf{A}},\mathbf{R}_w,\varTheta ) \nonumber \\&\quad = \prod \limits _{t = 1}^T {f\left( \mathbf{x}(t)|\mathbf{s}(t),{\mathbf{A}},\mathbf{R}_w \right) \omega _{y(t)}\, f\left( \mathbf{s}(t)|y(t),\varTheta \right) } \end{aligned}$$
(40)

Substituting (39) and (40) into (11), it is straightforward to derive that

$$\begin{aligned} J= & {} \sum \limits _{t = 1}^T \sum \limits _{m = 1}^M \int _{\mathbf{s}} f\left( y(t) = m|\mathbf{s}(t),\mathbf{x},{\mathbf{A}}^g,\mathbf{R}_w^g,\varTheta ^g \right) f\left( \mathbf{s}(t)|\mathbf{x},{\mathbf{A}}^g,\mathbf{R}_w^g,\varTheta ^g \right) \nonumber \\&\quad \times \log \left[ \omega _m f\left( \mathbf{s}(t)|y(t) = m,\varTheta \right) \right] \mathrm{d}\mathbf{s} \nonumber \\&\quad + \sum \limits _{t = 1}^T \int _{\mathbf{s}} f\left( \mathbf{s}(t)|\mathbf{x},{\mathbf{A}}^g,\mathbf{R}_w^g,\varTheta ^g \right) \log f\left( \mathbf{x}(t)|\mathbf{s}(t),{\mathbf{A}},\mathbf{R}_w \right) \mathrm{d}\mathbf{s} \end{aligned}$$
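For orientation, maximizing the second term of \(J\) over \({\mathbf{A}}\) and \(\mathbf{R}_w\) yields the familiar closed-form updates of the linear-Gaussian M-step. The NumPy sketch below is a hedged illustration of that standard result under the stated model, not code reproduced from the paper; the companion updates of the GMM parameters \(\omega _m\), \({\varvec{\mu }}_m\), \(\mathbf{C}_m\) are omitted, and all names are assumptions.

```python
# Hedged sketch (not from the paper): the standard linear-Gaussian M-step
# obtained by maximizing the second term of J over A and R_w. Es[t] and
# Ess[t] are the posterior moments E[s(t)|x(t)] and E[s(t)s(t)^T|x(t)]
# available from the E-step derived in Appendix 2.
import numpy as np

def m_step(X, Es, Ess):
    """X: (K, T) mixtures; Es: (T, N); Ess: (T, N, N)."""
    T = X.shape[1]
    Sxs = X @ Es                   # sum_t x(t) E[s(t)]^T   -> (K, N)
    Sss = Ess.sum(axis=0)          # sum_t E[s(t) s(t)^T]   -> (N, N)
    A = Sxs @ np.linalg.inv(Sss)   # maximizer over the mixing matrix
    # (1/T) sum_t E[(x - A s)(x - A s)^T], simplified using the new A
    Rw = (X @ X.T - A @ Sxs.T) / T
    return A, Rw
```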

Appendix 2: Proof of Equation (14)

By Bayes' theorem, the posterior density of the sources satisfies

$$\begin{aligned}&f\left( \mathbf{s}(t)|\mathbf{x}(t),{\mathbf{A}}^g,\mathbf{R}_w^g,\varTheta ^g \right) \nonumber \\&\quad \propto f\left( \mathbf{x}(t)|\mathbf{s}(t),{\mathbf{A}}^g,\mathbf{R}_w^g \right) \sum \limits _{m = 1}^M {\omega _m^g f\left( \mathbf{s}(t)|y(t) = m,\varTheta ^g \right) } \end{aligned}$$
(41)

Hence,

$$\begin{aligned}&f\left( \mathbf{x}(t),\mathbf{s}(t)|{\mathbf{A}}^g,\mathbf{R}_w^g,\varTheta ^g \right) \nonumber \\&\quad = \frac{1}{\left| 2\pi \mathbf{R}_w^g \right| ^{1/2}}\exp \left\{ - \frac{1}{2}\left( \mathbf{x}(t) - {\mathbf{A}}^g \mathbf{s}(t)\right) ^{\mathrm{T}} (\mathbf{R}_w^g )^{ - 1} \left( \mathbf{x}(t) - {\mathbf{A}}^g \mathbf{s}(t)\right) \right\} \nonumber \\&\qquad \times \sum _{m = 1}^M \omega _m^g \frac{1}{\left| 2\pi \mathbf{C}_m^g \right| ^{1/2}}\exp \left\{ - \frac{1}{2}\left( \mathbf{s}(t) - {\varvec{\mu }}_m^g \right) ^{\mathrm{T}} (\mathbf{C}_m^g )^{ - 1} \left( \mathbf{s}(t) - {\varvec{\mu }}_m^g \right) \right\} \nonumber \\&\quad = \sum _{m = 1}^M \omega _m^g \frac{1}{\left| 2\pi \mathbf{R}_w^g \right| ^{1/2} \left| 2\pi \mathbf{C}_m^g \right| ^{1/2}}\exp \left\{ - \frac{1}{2}\left( \mathbf{x}(t) - {\mathbf{A}}^g \mathbf{s}(t)\right) ^{\mathrm{T}} (\mathbf{R}_w^g )^{ - 1} \left( \mathbf{x}(t) - {\mathbf{A}}^g \mathbf{s}(t)\right) \right. \nonumber \\&\qquad \left. - \frac{1}{2}\left( \mathbf{s}(t) - {\varvec{\mu }}_m^g \right) ^{\mathrm{T}} (\mathbf{C}_m^g )^{ - 1} \left( \mathbf{s}(t) - {\varvec{\mu }}_m^g \right) \right\} \end{aligned}$$
(42)

After a series of derivations (completing the square in \(\mathbf{s}(t)\)), the posterior density follows from (42) as

$$\begin{aligned} f(\mathbf{s}(t)|\mathbf{x}(t),{\mathbf{A}}^g,\mathbf{R}_w^g,\varTheta ^g ) = \sum _{m = 1}^M {\tilde{\omega } _{mt}^g \mathcal{N}\left[ {\mathbf{s}(t);{\varvec{\tilde{\mu }}}_{mt}^g,{\tilde{\mathbf{C}}}_{mt}^g } \right] } \end{aligned}$$

where

$$\begin{aligned} \left\{ \begin{array}{l} {\tilde{\mathbf{C}}}_{mt}^g = \left( ({\mathbf{A}}^g )^{\mathrm{T}} (\mathbf{R}_w^g )^{ - 1} {\mathbf{A}}^g + (\mathbf{C}_m^g )^{ - 1} \right) ^{ - 1} \\ {\varvec{\tilde{\mu }}}_{mt}^g = {\tilde{\mathbf{C}}}_{mt}^g \left( ({\mathbf{A}}^g )^{\mathrm{T}} (\mathbf{R}_w^g )^{ - 1} \mathbf{x}(t) + (\mathbf{C}_m^g )^{ - 1} {\varvec{\mu }}_m^g \right) \\ \tilde{\omega }_{mt}^g = \omega _m^g \dfrac{\left| {\tilde{\mathbf{C}}}_{mt}^g \right| ^{1/2}}{\left| 2\pi \mathbf{R}_w^g \right| ^{1/2} \left| \mathbf{C}_m^g \right| ^{1/2}} \exp \left\{ - \frac{1}{2}\left[ \mathbf{x}^{\mathrm{T}}(t) (\mathbf{R}_w^g )^{ - 1} \mathbf{x}(t) + ({\varvec{\mu }}_m^g )^{\mathrm{T}} (\mathbf{C}_m^g )^{ - 1} {\varvec{\mu }}_m^g - ({\varvec{\tilde{\mu }}}_{mt}^g )^{\mathrm{T}} ({\tilde{\mathbf{C}}}_{mt}^g )^{ - 1} {\varvec{\tilde{\mu }}}_{mt}^g \right] \right\} \end{array} \right. \end{aligned}$$
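For concreteness, a hedged NumPy transcription of these E-step quantities follows (illustrative names, not the authors' code). It evaluates the unnormalized \(\tilde{\omega }_{mt}^g\) in the log domain to avoid underflow and then normalizes the weights so that the posterior mixture integrates to one, a step the closed-form expression leaves implicit.

```python
# Hedged sketch of the E-step quantities above (illustrative names).
import numpy as np

def posterior_gmm(x, A, Rw, omega, mu, C):
    """x: (K,); A: (K, N); Rw: (K, K); omega: (M,); mu: (M, N); C: (M, N, N)."""
    Rw_inv = np.linalg.inv(Rw)
    AtRwA = A.T @ Rw_inv @ A
    M = len(omega)
    log_w, mu_t, C_t = np.empty(M), [], []
    for m in range(M):
        Cm_inv = np.linalg.inv(C[m])
        C_tilde = np.linalg.inv(AtRwA + Cm_inv)                    # \tilde{C}_{mt}
        mu_tilde = C_tilde @ (A.T @ Rw_inv @ x + Cm_inv @ mu[m])   # \tilde{mu}_{mt}
        # log of the unnormalized posterior weight \tilde{omega}_{mt}
        quad = (x @ Rw_inv @ x + mu[m] @ Cm_inv @ mu[m]
                - mu_tilde @ np.linalg.inv(C_tilde) @ mu_tilde)
        log_w[m] = (np.log(omega[m])
                    + 0.5 * np.linalg.slogdet(C_tilde)[1]
                    - 0.5 * np.linalg.slogdet(2 * np.pi * Rw)[1]
                    - 0.5 * np.linalg.slogdet(C[m])[1]
                    - 0.5 * quad)
        mu_t.append(mu_tilde)
        C_t.append(C_tilde)
    w = np.exp(log_w - log_w.max())        # normalize in a stable way
    return w / w.sum(), np.array(mu_t), np.array(C_t)
```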

Appendix 3: Definition of Similarity Score

To evaluate the separation performance of the proposed algorithm, we introduce the similarity score

$$\begin{aligned} \rho _{ii} = {{\sum _{t = 1}^T {s_i (t)\hat{s}_i (t)} } \Big / {\sqrt{\sum _{t = 1}^T {(s_i (t))^2 } \sum _{t = 1}^T {(\hat{s}_i (t))^2 } } }} \end{aligned}$$
(43)

where \(s_i(t)\) is the ith original source signal and \({{\hat{s}}_i(t)}\) is the ith recovered source signal. \(\rho _{ii}\) measures the similarity between the ith original source and the corresponding recovered source: the larger the value of \(\rho _{ii}\), the more closely the recovered source matches the original.
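A direct NumPy transcription of (43) might look as follows (the function name is hypothetical). Since blind separation recovers sources only up to permutation and sign, one would typically match each recovered source to an original and report \(|\rho _{ii}|\) in practice.

```python
# Similarity score of Eq. (43): normalized correlation between the i-th
# original source and the i-th recovered source.
import numpy as np

def similarity_score(s, s_hat):
    """rho_ii between original s and recovered s_hat (1-D arrays, length T)."""
    return np.sum(s * s_hat) / np.sqrt(np.sum(s ** 2) * np.sum(s_hat ** 2))
```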

Cite this article

Gu, F., Zhang, H., Wang, W. et al. An Expectation-Maximization Algorithm for Blind Separation of Noisy Mixtures Using Gaussian Mixture Model. Circuits Syst Signal Process 36, 2697–2726 (2017). https://doi.org/10.1007/s00034-016-0424-2

