Definition
A central problem in clustering is deciding on the optimal partitioning of the data into clusters. In this context, visualizing the data set is a crucial means of verifying the clustering results. For large multidimensional data sets (e.g., with more than three dimensions), however, effective visualization is cumbersome, and perceiving clusters with the available visualization tools is difficult for people who are not accustomed to higher-dimensional spaces. The procedure of evaluating the results of a clustering algorithm is known as cluster validity. Cluster validity comprises a set of techniques for finding the set of clusters that best fits the natural partitions of a given data set, without any a priori class information. The outcome of the clustering process is validated by means of a cluster validity index.
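To make the notion of a validity index concrete, the following minimal sketch scores two candidate partitionings of the same points without using any class labels. It uses the mean silhouette width, chosen here purely as an illustration; the entry does not single out this index, and the function name `silhouette_score`, the Euclidean distance, and the toy data are all assumptions of this example.

```python
from math import dist  # Euclidean distance, Python 3.8+


def silhouette_score(points, labels):
    """Mean silhouette width of a partitioning (illustrative internal index).

    Values close to 1 suggest compact, well-separated clusters;
    values near 0 or below suggest a poor fit.
    """
    # Group the points by their assigned cluster label.
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)
    if len(clusters) < 2:
        raise ValueError("the silhouette needs at least two clusters")

    total = 0.0
    for p, lab in zip(points, labels):
        own = clusters[lab]
        if len(own) == 1:
            # Common convention: a singleton cluster contributes 0.
            continue
        # a: mean distance from p to the other members of its own cluster
        # (dist(p, p) == 0, so dividing by len(own) - 1 excludes p itself).
        a = sum(dist(p, q) for q in own) / (len(own) - 1)
        # b: smallest mean distance from p to the members of another cluster.
        b = min(
            sum(dist(p, q) for q in members) / len(members)
            for other, members in clusters.items()
            if other != lab
        )
        denom = max(a, b)
        if denom > 0:
            total += (b - a) / denom
    return total / len(points)


# Hypothetical toy data: two visibly separated groups in the plane.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
natural = [0, 0, 0, 1, 1, 1]    # follows the two groups
scrambled = [0, 1, 0, 1, 0, 1]  # ignores the groups

score_natural = silhouette_score(points, natural)
score_scrambled = silhouette_score(points, scrambled)
```

Under these assumptions, the partitioning that follows the two groups receives the higher score, which is the sense in which an index of this kind "validates" a clustering result. In practice the same comparison would be run over the outputs of a clustering algorithm for different parameter settings, such as the number of clusters.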
Historical Background
Clust...
Copyright information
© 2016 Springer Science+Business Media New York
About this entry
Cite this entry
Vazirgiannis, M. (2016). Clustering Validity. In: Liu, L., Özsu, M. (eds) Encyclopedia of Database Systems. Springer, New York, NY. https://doi.org/10.1007/978-1-4899-7993-3_616-2
Publisher Name: Springer, New York, NY
Online ISBN: 978-1-4899-7993-3
eBook Packages: Springer Reference Computer Sciences; Reference Module Computer Science and Engineering