Abstract
We compare EM, SEM, and MCMC algorithms for estimating the parameters of the Gaussian mixture model. We focus on estimation problems that arise when the likelihood function has a sharp ridge or saddle points, using both synthetic and empirical data with those features. The comparison includes Bayesian approaches with different prior specifications and various procedures for dealing with label switching. Although the stochastic algorithms more often produce degenerate solutions, we conclude that SEM and MCMC may converge faster and are better able to locate the global maximum of the likelihood function.
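As a point of reference for the algorithms compared above, the following is a minimal sketch of the baseline EM algorithm for a two-component univariate Gaussian mixture on synthetic data. The data, initial values, and iteration count are illustrative choices of ours, not the paper's experimental setup; SEM would replace the E-step below by a random draw of component labels from the responsibilities.

```python
# Minimal EM sketch for a two-component 1-D Gaussian mixture.
# All names (pi_k, mu, sigma) and the synthetic data are illustrative.
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data: two well-separated components.
x = np.concatenate([rng.normal(-2.0, 1.0, 300),
                    rng.normal(3.0, 1.0, 700)])

# Initial guesses for mixing weights, means, and standard deviations.
pi_k = np.array([0.5, 0.5])
mu = np.array([-1.0, 1.0])
sigma = np.array([1.0, 1.0])

def dens(x, mu, sigma):
    """Component-wise normal densities, shape (n, 2)."""
    z = (x[:, None] - mu) / sigma
    return np.exp(-0.5 * z**2) / (sigma * np.sqrt(2 * np.pi))

for _ in range(200):
    # E-step: posterior membership probabilities (responsibilities).
    r = pi_k * dens(x, mu, sigma)
    r /= r.sum(axis=1, keepdims=True)
    # M-step: responsibility-weighted maximum-likelihood updates.
    n_k = r.sum(axis=0)
    pi_k = n_k / len(x)
    mu = (r * x[:, None]).sum(axis=0) / n_k
    sigma = np.sqrt((r * (x[:, None] - mu)**2).sum(axis=0) / n_k)

print(np.sort(mu))
```

With well-separated components like these, EM converges quickly; the pathologies the paper studies (ridges and saddle points of the likelihood) arise with overlapping components or unequal variances, where starting values matter far more.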
Dias, J.G. and Wedel, M. An empirical comparison of EM, SEM and MCMC performance for problematic Gaussian mixture likelihoods. Statistics and Computing 14: 323–332 (2004). https://doi.org/10.1023/B:STCO.0000039481.32211.5a