Abstract

Basing cluster analysis on mixture models has become a classical and powerful approach. It is useful both for understanding clustering criteria and for suggesting meaningful ones. With Gaussian clustering models, for instance, parametrizing the variance matrix of a cluster in terms of its eigenvalue decomposition makes it possible to derive many general clustering criteria, from the simplest (the k-means criterion) to the most complex. Mixture models can also handle very different situations, such as quantitative data, binary data, spatial data, missing data, partially classified samples, order constraints on clusters, or outliers. This paper illustrates the application of mixture models to various clustering situations, from classical Gaussian models to multivariate binary observation vectors located at neighboring geographic sites.
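
As an illustration of the eigenvalue parametrization mentioned in the abstract, the following is a minimal sketch in Python with NumPy. It assumes the commonly used volume/orientation/shape form Σ = λ D A Dᵀ, where λ = |Σ|^(1/d), D holds the eigenvectors, and A is a diagonal matrix with determinant 1; the abstract itself does not spell out this form. The function names and the example matrix are hypothetical and chosen only for illustration.

```python
import numpy as np


def decompose_covariance(sigma):
    """Split a symmetric positive-definite matrix into (volume, orientation, shape).

    Assumed form: sigma = volume * D @ A @ D.T with det(A) = 1.
    """
    sigma = np.asarray(sigma, dtype=float)
    d = sigma.shape[0]
    eigvals, eigvecs = np.linalg.eigh(sigma)   # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1]          # reorder to descending
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    volume = np.prod(eigvals) ** (1.0 / d)     # lambda = |sigma|^(1/d)
    shape = np.diag(eigvals / volume)          # normalized so that det(A) = 1
    return volume, eigvecs, shape


def recompose_covariance(volume, orientation, shape):
    """Rebuild the variance matrix from its three components."""
    return volume * orientation @ shape @ orientation.T


# Hypothetical 2-D variance matrix for one cluster.
sigma = np.array([[4.0, 1.2],
                  [1.2, 1.0]])
lam, D, A = decompose_covariance(sigma)
sigma_back = recompose_covariance(lam, D, A)
```

Under this assumed form, constraining some of λ, D, and A to be equal across clusters, or fixing A to the identity, gives a family of models of increasing complexity. The spherical case with a common λ and equal proportions corresponds to the k-means criterion named in the abstract.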



Copyright information

© 2002 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Ambroise, C., Govaert, G. (2002). Clustering and Models. In: Gaul, W., Ritter, G. (eds) Classification, Automation, and New Media. Studies in Classification, Data Analysis, and Knowledge Organization. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-55991-4_1

  • DOI: https://doi.org/10.1007/978-3-642-55991-4_1

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-43233-3

  • Online ISBN: 978-3-642-55991-4

  • eBook Packages: Springer Book Archive
