Summary
Mixture model parameters are usually estimated by maximum likelihood using an Expectation-Maximization (EM) algorithm. However, it is well known that this method can converge to a critical point of the solution space that is not the global maximum. To mitigate this problem, strategies combining different algorithms can be used. In this paper, we compare, by means of numerical simulations, strategies using EM, Classification EM, Stochastic EM, and genetic algorithms for the optimization of mixture models. Our results indicate that two-stage procedures combining an exploration phase and an optimization phase provide the best results, especially when these methods are applied to several sets of initial conditions rather than to a single starting point.
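As an illustration of the two-stage idea, the following minimal sketch fits a one-dimensional Gaussian mixture by first running a few EM iterations from several random starting points (exploration), then running a long EM from the most promising candidate (optimization). It is not the procedure evaluated in the paper: it swaps in short EM runs for the exploration phase instead of the CEM, SEM, or genetic algorithm variants compared there, and the function names (`em`, `two_stage_em`), the default iteration counts, and the variance floor are all assumptions made for the example.

```python
import numpy as np


def mixture_density(x, w, mu, sigma):
    """Weighted component densities, shape (n_points, n_components)."""
    z = (x[:, None] - mu) / sigma
    return w * np.exp(-0.5 * z ** 2) / (sigma * np.sqrt(2.0 * np.pi))


def log_likelihood(x, w, mu, sigma):
    """Log-likelihood of 1-D data x under the Gaussian mixture."""
    total = mixture_density(x, w, mu, sigma).sum(axis=1)
    return float(np.sum(np.log(total + 1e-300)))


def em(x, w, mu, sigma, n_iter):
    """Run n_iter EM iterations from the given starting parameters."""
    for _ in range(n_iter):
        # E-step: posterior probability of each component for each point.
        dens = mixture_density(x, w, mu, sigma)
        resp = dens / (dens.sum(axis=1, keepdims=True) + 1e-300)
        # M-step: re-estimate weights, means, and standard deviations.
        nk = resp.sum(axis=0) + 1e-12
        w = nk / nk.sum()
        mu = (resp * x[:, None]).sum(axis=0) / nk
        var = (resp * (x[:, None] - mu) ** 2).sum(axis=0) / nk
        sigma = np.sqrt(np.maximum(var, 1e-6))  # assumed variance floor
    return w, mu, sigma


def two_stage_em(x, k, n_starts=10, short_iter=5, long_iter=200, seed=0):
    """Exploration with short EM runs, then one long EM from the best start."""
    rng = np.random.default_rng(seed)
    best_ll, best_params = -np.inf, None
    for _ in range(n_starts):
        # Random start: means drawn from the data, equal weights, common scale.
        start = (np.full(k, 1.0 / k),
                 rng.choice(x, size=k, replace=False),
                 np.full(k, x.std()))
        params = em(x, *start, short_iter)
        ll = log_likelihood(x, *params)
        if ll > best_ll:
            best_ll, best_params = ll, params
    final = em(x, *best_params, long_iter)
    return final, log_likelihood(x, *final), best_ll


# Usage on simulated data from two well-separated components.
rng = np.random.default_rng(42)
x = np.concatenate([rng.normal(-3.0, 1.0, 200), rng.normal(3.0, 1.0, 200)])
(w, mu, sigma), final_ll, explore_ll = two_stage_em(x, k=2)
print("weights:", w, "means:", mu, "std devs:", sigma)
```

Keeping the exploration runs short makes it cheap to try many starting points; only the candidate with the highest likelihood receives the full optimization budget.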
Notes
1 The term population, as used in GA theory, must not be confused with its statistical meaning. It refers here to the number of trials that are run simultaneously.
Acknowledgements
We would like to thank Gilles Celeux for his very useful comments about the use and the behavior of the EM, CEM, and SEM algorithms, and Sue Pester for rewriting assistance. We are also grateful to three anonymous referees whose comments helped to significantly improve the final version of the paper.
Cite this article
Berchtold, A. Optimization of Mixture Models: Comparison of Different Strategies. CompStat 19, 385–406 (2004). https://doi.org/10.1007/BF03372103
DOI: https://doi.org/10.1007/BF03372103