On Fitting Mixture Models

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 1654)

Abstract

Consider the problem of fitting a finite Gaussian mixture, with an unknown number of components, to observed data. This paper proposes a new minimum description length (MDL) type criterion, termed MMDL (for mixture MDL), to select the number of components of the model. MMDL is based on the identification of an “equivalent sample size” for each component, which does not coincide with the full sample size. We also introduce an algorithm that combines the standard expectation-maximization (EM) approach with a new agglomerative step, called agglomerative EM (AEM). The experiments reported here show that MMDL outperforms existing criteria of comparable computational cost. The good behavior of AEM, namely its robustness with respect to initialization, is also illustrated experimentally.
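As a rough illustration of the kind of procedure the abstract describes (EM fitting of a Gaussian mixture, plus a description-length penalty to select the number of components), the following is a minimal 1-D sketch. It uses a generic BIC/MDL-style penalty with the full sample size, not the paper's MMDL criterion with per-component equivalent sample sizes, and plain EM rather than the agglomerative AEM variant; the names `em_gmm_1d` and `select_k` are illustrative, not from the paper.

```python
import numpy as np

def em_gmm_1d(x, k, n_iter=200):
    """Plain EM for a 1-D Gaussian mixture with k components."""
    n = len(x)
    # Deterministic initialization: means at evenly spaced quantiles,
    # shared variance, uniform mixing weights.
    mu = np.quantile(x, (np.arange(k) + 0.5) / k)
    var = np.full(k, x.var())
    w = np.full(k, 1.0 / k)
    for _ in range(n_iter):
        # E-step: responsibilities r[i, j] = P(component j | x_i).
        logp = (-0.5 * (np.log(2 * np.pi * var)
                        + (x[:, None] - mu) ** 2 / var) + np.log(w))
        logp -= logp.max(axis=1, keepdims=True)  # stabilize before exp
        r = np.exp(logp)
        r /= r.sum(axis=1, keepdims=True)
        # M-step: re-estimate weights, means, and variances.
        nk = r.sum(axis=0) + 1e-12
        w = nk / n
        mu = (r * x[:, None]).sum(axis=0) / nk
        var = (r * (x[:, None] - mu) ** 2).sum(axis=0) / nk
        var = np.maximum(var, 1e-3)  # guard against collapsing components
    # Log-likelihood of the fitted mixture.
    dens = (w * np.exp(-0.5 * (x[:, None] - mu) ** 2 / var)
            / np.sqrt(2 * np.pi * var)).sum(axis=1)
    return mu, var, w, np.log(dens).sum()

def select_k(x, k_max=5):
    """Choose k by minimizing a BIC/MDL-style description length:
    -loglik + (free parameters / 2) * log(n), with 3k - 1 free
    parameters for a k-component 1-D mixture."""
    n = len(x)
    best_k, best_dl = 1, np.inf
    for k in range(1, k_max + 1):
        *_, ll = em_gmm_1d(x, k)
        dl = -ll + 0.5 * (3 * k - 1) * np.log(n)
        if dl < best_dl:
            best_k, best_dl = k, dl
    return best_k
```

On well-separated data this kind of penalized-likelihood selection recovers the number of components; the paper's contribution, per the abstract, is to replace the single global sample size in the penalty with an equivalent sample size identified for each component, and to add an agglomerative step to EM for robustness to initialization.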





Copyright information

© 1999 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Figueiredo, M.A.T., Leitão, J.M.N., Jain, A.K. (1999). On Fitting Mixture Models. In: Hancock, E.R., Pelillo, M. (eds) Energy Minimization Methods in Computer Vision and Pattern Recognition. EMMCVPR 1999. Lecture Notes in Computer Science, vol 1654. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-48432-9_5

  • DOI: https://doi.org/10.1007/3-540-48432-9_5

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-66294-5

  • Online ISBN: 978-3-540-48432-5

  • eBook Packages: Springer Book Archive
