Abstract
Basing cluster analysis on mixture models has become a classical and powerful approach. It is useful both for understanding and for suggesting meaningful clustering criteria. With Gaussian clustering models, for instance, parametrizing the variance matrix of each cluster by its eigenvalue decomposition yields many general clustering criteria, from the simplest (the k-means criterion) to the most complex. Mixture models can also handle very different situations, such as quantitative data, binary data, spatial data, missing data, partially classified samples, order constraints on clusters, or outliers. This paper illustrates the application of mixture models to various clustering situations, from classical Gaussian models to multivariate binary observation vectors located at neighboring geographic sites.
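As a brief illustration of the eigenvalue parametrization mentioned above, the following sketch writes it out in commonly used notation. The symbols (\lambda_k, D_k, A_k, W, d, n) are assumptions introduced here for illustration; the abstract itself does not fix any notation.

```latex
% Eigenvalue decomposition of the variance matrix of cluster k
% (d = dimension of the observations):
\Sigma_k = \lambda_k \, D_k A_k D_k^{\top},
\qquad \lambda_k = |\Sigma_k|^{1/d}, \quad |A_k| = 1,
% \lambda_k : volume of the cluster
% D_k       : orthogonal matrix of eigenvectors (orientation)
% A_k       : diagonal matrix of normalized eigenvalues (shape)

% Simplest case: equal proportions and \Sigma_k = \lambda I for all k.
% The classification log-likelihood of a partition P = (P_1, \dots, P_K)
% of n observations is then
L_C(P, \mu, \lambda)
  = -\frac{1}{2\lambda} \sum_{k=1}^{K} \sum_{x_i \in P_k} \lVert x_i - \mu_k \rVert^2
    - \frac{nd}{2} \log(2\pi\lambda),
% so maximizing L_C over the partition and the centers amounts to minimizing
W(P) = \sum_{k=1}^{K} \sum_{x_i \in P_k} \lVert x_i - \bar{x}_k \rVert^2 ,
% which is the k-means (within-cluster sum of squares) criterion.
```

Under these assumptions, letting the volume, shape, or orientation vary between clusters, or constraining them to be equal, gives the more general criteria the abstract refers to.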
© 2002 Springer-Verlag Berlin Heidelberg
Cite this paper
Ambroise, C., Govaert, G. (2002). Clustering and Models. In: Gaul, W., Ritter, G. (eds) Classification, Automation, and New Media. Studies in Classification, Data Analysis, and Knowledge Organization. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-55991-4_1
DOI: https://doi.org/10.1007/978-3-642-55991-4_1
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-43233-3
Online ISBN: 978-3-642-55991-4
eBook Packages: Springer Book Archive