An Efficient Similarity-Based Validity Index for Kernel Clustering Algorithm
The quality of clusterings, including those obtained by kernel-based methods, should be assessed. In this paper, by investigating the pairwise similarities inherent in the kernel matrix that the kernel function implicitly defines, we define two statistical similarity coefficients that describe the within-cluster and between-cluster similarities among the data items, respectively. Based on these two coefficients, we propose an efficient cluster validity index and a self-adaptive kernel clustering (SAKC) algorithm. The performance and effectiveness of the proposed validity index and the SAKC algorithm are demonstrated, in comparison with some existing methods, on two synthetic datasets and four real datasets from the UCI repository. The robustness of the new index with respect to the Gaussian kernel width is also tentatively explored.
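The abstract does not state how the two similarity coefficients are defined, so the following is only an illustrative sketch of the general idea: reading within-cluster and between-cluster similarity directly off a Gaussian kernel matrix. The function names, the use of plain mean kernel values, and the toy data are all assumptions made for this example and should not be taken as the paper's actual coefficients or validity index.

```python
import numpy as np

def gaussian_kernel_matrix(X, sigma):
    """Gaussian (RBF) kernel matrix K[i, j] = exp(-||x_i - x_j||^2 / (2 sigma^2))."""
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (X @ X.T)
    d2 = np.maximum(d2, 0.0)  # clamp tiny negative values caused by rounding
    return np.exp(-d2 / (2.0 * sigma ** 2))

def similarity_coefficients(K, labels):
    """Hypothetical stand-in coefficients: mean kernel value over pairs in the
    same cluster (self-pairs excluded) and over pairs in different clusters.
    Assumes at least two clusters and at least one cluster with two points."""
    labels = np.asarray(labels)
    same = labels[:, None] == labels[None, :]
    off_diag = ~np.eye(len(labels), dtype=bool)
    within = K[same & off_diag].mean()
    between = K[~same].mean()
    return within, between

# Toy data (assumed for illustration): two well-separated 2-D blobs.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 0.3, (20, 2)), rng.normal(3.0, 0.3, (20, 2))])
labels = np.array([0] * 20 + [1] * 20)

K = gaussian_kernel_matrix(X, sigma=1.0)
within, between = similarity_coefficients(K, labels)
print(f"within-cluster similarity:  {within:.4f}")
print(f"between-cluster similarity: {between:.4f}")
```

For a partition that matches the structure of the data, the within-cluster value would be expected to be high and the between-cluster value low; a validity index could then be built from some contrast of the two. Repeating the computation over several values of `sigma` is one simple way to probe sensitivity to the kernel width.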
Keywords: Data Item, Synthetic Dataset, Kernel Matrix, Validity Index, Ring Data