Cluster Stability Assessment Based on Theoretic Information Measures

Pascual, Damaris; Pla, Filiberto; Sánchez, J. Salvador

doi:10.1007/978-3-540-85920-8_27

Conference paper

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 5197)

Abstract

Cluster validation, used to determine the right number of clusters, is an important issue in clustering. This work introduces a cluster validation strategy based on cluster stability. The proposed stability index is built on information-theoretic measures and captures how some of these measures vary across the clustering solutions obtained from different sample sets of the same problem. Experiments on synthetic and real databases show that the index is effective when the clustering algorithm relies on a data structure model suited to the problem.
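
The abstract does not state the formula of the stability index, so the following is only an illustrative sketch of the general idea, not the authors' method. It assumes variation of information (a standard entropy-based distance between two partitions) as the information measure, and assumes each clustering solution has already been reduced to a label list over the same shared points. The function names `entropy`, `mutual_information`, `variation_of_information`, and `instability` are hypothetical.

```python
import math
from collections import Counter
from itertools import combinations


def entropy(labels):
    """Shannon entropy (in nats) of a cluster labeling."""
    n = len(labels)
    return -sum((c / n) * math.log(c / n) for c in Counter(labels).values())


def mutual_information(a, b):
    """Mutual information (in nats) between two labelings of the same points."""
    n = len(a)
    count_a, count_b = Counter(a), Counter(b)
    joint = Counter(zip(a, b))
    return sum(
        (c / n) * math.log(c * n / (count_a[x] * count_b[y]))
        for (x, y), c in joint.items()
    )


def variation_of_information(a, b):
    """VI(a, b) = H(a) + H(b) - 2 I(a; b); zero when the partitions coincide."""
    return entropy(a) + entropy(b) - 2.0 * mutual_information(a, b)


def instability(labelings):
    """Mean pairwise VI over clusterings from different sample sets.

    Assumption: every labeling refers to the same points in the same order.
    Lower values mean the clustering solutions agree more (more stable).
    """
    pairs = list(combinations(labelings, 2))
    return sum(variation_of_information(a, b) for a, b in pairs) / len(pairs)


# Two runs that agree up to a renaming of the clusters, and one that does not.
run_1 = [0, 0, 1, 1]
run_2 = [1, 1, 0, 0]   # same partition as run_1, labels swapped
run_3 = [0, 1, 0, 1]   # statistically independent of run_1
```

Under these assumptions, one would compute `instability` for each candidate number of clusters and prefer the value at which the solutions vary least. The measure is invariant to cluster renaming, so `run_1` and `run_2` count as identical solutions.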

Author information

Authors and Affiliations

Center for Pattern Recognition and Data Mining, Universidad de Oriente, Av. Patricio Lumumba s/n, Santiago de Cuba, 90500, Cuba
Damaris Pascual
Dept. Llenguatges i Sistemes Informàtics, Universitat Jaume I, 12071, Castelló, Spain
Filiberto Pla & J. Salvador Sánchez

Editor information

José Ruiz-Shulcloper, Walter G. Kropatsch

Cite this paper

Pascual, D., Pla, F., Sánchez, J.S. (2008). Cluster Stability Assessment Based on Theoretic Information Measures. In: Ruiz-Shulcloper, J., Kropatsch, W.G. (eds) Progress in Pattern Recognition, Image Analysis and Applications. CIARP 2008. Lecture Notes in Computer Science, vol 5197. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-85920-8_27

DOI: https://doi.org/10.1007/978-3-540-85920-8_27
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-85919-2
Online ISBN: 978-3-540-85920-8
eBook Packages: Computer Science, Computer Science (R0)
