An Efficient Similarity-Based Validity Index for Kernel Clustering Algorithm

  • Yun-Wei Pu
  • Ming Zhu
  • Wei-Dong Jin
  • Lai-Zhao Hu
Part of the Lecture Notes in Computer Science book series (LNCS, volume 3971)

Abstract

The quality of clustering results, including those obtained by kernel-based methods, should be assessed. In this paper, by investigating the pairwise similarities inherent in the kernel matrix implicitly defined by the kernel function, we define two statistical similarity coefficients that describe the within-cluster and between-cluster similarities between data items, respectively. Based on these two coefficients, we propose an efficient cluster validity index and a self-adaptive kernel clustering (SAKC) algorithm. The performance and effectiveness of the proposed validity index and SAKC algorithm are demonstrated, in comparison with some existing methods, on two synthetic datasets and four real datasets from the UCI repository. The robustness of the new index with respect to the Gaussian kernel width is also tentatively explored.
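
The abstract does not give the definitions of the two similarity coefficients or of the validity index. As a rough illustration of the general idea only, the following sketch assumes the coefficients are the mean Gaussian-kernel values over within-cluster pairs and over between-cluster pairs, and uses their difference as a stand-in index. The function names, the averaging, and the difference-based index are all assumptions made for this example and are not taken from the paper.

```python
import math


def gaussian_kernel_matrix(points, width):
    """Return K with K[i][j] = exp(-||x_i - x_j||^2 / (2 * width^2))."""
    n = len(points)
    K = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            sq_dist = sum((a - b) ** 2 for a, b in zip(points[i], points[j]))
            K[i][j] = math.exp(-sq_dist / (2.0 * width ** 2))
    return K


def _mean(values):
    """Arithmetic mean, or 0.0 for an empty list."""
    return sum(values) / len(values) if values else 0.0


def similarity_coefficients(K, labels):
    """Assumed coefficients: mean kernel value over within-cluster pairs
    and over between-cluster pairs (each unordered pair counted once)."""
    within, between = [], []
    n = len(labels)
    for i in range(n):
        for j in range(i + 1, n):
            if labels[i] == labels[j]:
                within.append(K[i][j])
            else:
                between.append(K[i][j])
    return _mean(within), _mean(between)


def illustrative_index(K, labels):
    """Hypothetical stand-in index: within-cluster minus between-cluster
    similarity. Larger values suggest a better partition."""
    within, between = similarity_coefficients(K, labels)
    return within - between


# Two well-separated groups of three points each.
points = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1),
          (5.0, 5.0), (5.1, 5.0), (5.0, 5.1)]
K = gaussian_kernel_matrix(points, width=1.0)

good_labels = [0, 0, 0, 1, 1, 1]   # matches the two groups
bad_labels = [0, 1, 0, 1, 0, 1]    # mixes the two groups

print(illustrative_index(K, good_labels))
print(illustrative_index(K, bad_labels))
```

Under these assumptions, the partition that matches the two groups scores higher than the mixed one, because its within-cluster pairs are all close in feature space while its between-cluster pairs are all far apart. Changing `width` shows how sensitive such a kernel-based score can be to the Gaussian kernel width.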

Keywords

Data Item; Synthetic Dataset; Kernel Matrix; Validity Index; Ring Data
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Copyright information

© Springer-Verlag Berlin Heidelberg 2006

Authors and Affiliations

  • Yun-Wei Pu (1, 2, 3)
  • Ming Zhu (1, 2)
  • Wei-Dong Jin (1)
  • Lai-Zhao Hu (2)
  1. School of Information Science and Tech., Southwest Jiaotong University, Chengdu, China
  2. National EW Laboratory, Chengdu, China
  3. Computer Center, Kunming University of Science & Technology, Kunming, China