
Dempster-Shafer Evidence Theory Based Multi-Characteristics Fusion for Clustering Evaluation

  • Shihong Yue
  • Teresa Wu
  • Yamin Wang
  • Kai Zhang
  • Weixia Liu
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 6401)

Abstract

Clustering is a widely used unsupervised learning method that groups data with similar characteristics. The performance of a clustering method can generally be evaluated through validity indices. However, most validity indices are designed for specific algorithms and specific structures of the data space. Moreover, these indices consist of a few within- and between-cluster distance functions, and their applicability relies heavily on combining these functions correctly. In this research, we first summarize three common characteristics of any clustering evaluation: (1) the clustering outcome can be evaluated by a group of validity indices if some efficient validity indices are available, (2) the clustering outcome can be measured by an independent intra-cluster distance function, and (3) the clustering outcome can be measured by neighborhood-based functions. Considering the complementary and unstable nature of these evaluations, we then apply Dempster-Shafer (D-S) evidence theory to fuse the three characteristics into a new index, termed fused Multiple Characteristic Indices (fMCI). The fMCI is generally able to evaluate the clustering outcomes of arbitrary clustering methods on data spaces with more complex structures. We conduct a number of experiments to demonstrate that the fMCI is applicable to different clustering algorithms on different datasets and that it achieves more accurate and robust clustering evaluation than existing indices.
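
The fusion step named in the abstract rests on Dempster's rule of combination. The following is a minimal sketch of that rule only, not the authors' fMCI: the frame of discernment (candidate numbers of clusters 2, 3 and 4), the three mass functions and every name in the code are hypothetical values chosen for illustration, and the abstract does not say how the paper derives its masses from the three characteristics.

```python
from itertools import product


def combine(m1, m2):
    """Combine two mass functions with Dempster's rule.

    Each mass function maps a frozenset (a subset of the frame of
    discernment) to its basic probability mass; masses sum to 1.
    """
    fused = {}
    conflict = 0.0
    for (a, wa), (b, wb) in product(m1.items(), m2.items()):
        common = a & b
        if common:
            # Agreeing evidence: mass goes to the intersection.
            fused[common] = fused.get(common, 0.0) + wa * wb
        else:
            # Contradictory evidence: accumulate the conflict K.
            conflict += wa * wb
    if 1.0 - conflict < 1e-12:
        raise ValueError("sources are in total conflict")
    # Renormalise by 1 - K so the fused masses sum to 1 again.
    return {s: w / (1.0 - conflict) for s, w in fused.items()}


# Hypothetical frame: candidate numbers of clusters.
FRAME = frozenset({2, 3, 4})

# Hypothetical evidence from the three characteristics in the abstract.
m_indices = {frozenset({3}): 0.6, FRAME: 0.4}
m_intra = {frozenset({3}): 0.5, frozenset({4}): 0.3, FRAME: 0.2}
m_neighbor = {frozenset({3, 4}): 0.7, FRAME: 0.3}

fused = combine(combine(m_indices, m_intra), m_neighbor)
best = max(fused, key=fused.get)

for subset, mass in sorted(fused.items(), key=lambda kv: -kv[1]):
    print(sorted(subset), round(mass, 4))
```

In this toy setup the mass left on the whole frame expresses each source's ignorance, which is what lets an uncertain source defer to the others instead of voting against them.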

Keywords

Validity index, data structure, clustering algorithm, Dempster-Shafer evidence theory

Copyright information

© Springer-Verlag Berlin Heidelberg 2010

Authors and Affiliations

  • Shihong Yue (1)
  • Teresa Wu (2)
  • Yamin Wang (1)
  • Kai Zhang (1)
  • Weixia Liu (1)
  1. School of Electric Engineering and Automation, Tianjin University, Tianjin, China
  2. School of Computing, Informatics, Decision Systems Engineering, Arizona State University, Tempe, USA
