Evidence Accumulation Clustering Based on the K-Means Algorithm

  • Ana Fred
  • Anil K. Jain
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 2396)

Abstract

The idea of evidence accumulation for the combination of multiple clusterings was recently proposed [7]. Taking the K-means as the basic algorithm for the decomposition of data into a large number, k, of compact clusters, evidence on pattern association is accumulated, by a voting mechanism, over multiple clusterings obtained by random initializations of the K-means algorithm. This produces a mapping of the clusterings into a new similarity measure between patterns. The final data partition is obtained by applying the single-link method over this similarity matrix. In this paper we further explore and extend this idea, by proposing: (a) the combination of multiple K-means clusterings using variable k; (b) using cluster lifetime as the criterion for extracting the final clusters; and (c) the adaptation of this approach to string patterns. This leads to a more robust clustering technique, with fewer design parameters than the previous approach and potential applications in a wider range of problems.
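The pipeline described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the function names (`kmeans_labels`, `co_association`, `single_link_cut`, `longest_lifetime_partition`), the uniform random choice of k within `k_range`, the fixed iteration count, and the reading of "cluster lifetime" as the range of similarity thresholds over which the number of clusters stays constant are all assumptions made for illustration. It uses plain Python and numeric feature vectors only; the string-pattern adaptation is not covered.

```python
import random

def kmeans_labels(points, k, rng, iters=20):
    """Lloyd's K-means from a random initialization; returns one label per point."""
    centers = rng.sample(points, k)
    labels = [0] * len(points)
    for _ in range(iters):
        labels = [
            min(range(k), key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centers[c])))
            for p in points
        ]
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # an emptied cluster keeps its previous center
                centers[c] = tuple(sum(xs) / len(members) for xs in zip(*members))
    return labels

def co_association(points, n_runs, k_range, seed=0):
    """Voting step: fraction of runs in which each pair of points shares a cluster."""
    rng = random.Random(seed)
    n = len(points)
    votes = [[0] * n for _ in range(n)]
    for _ in range(n_runs):
        k = rng.randint(*k_range)  # variable k (assumed: uniform over the range)
        labels = kmeans_labels(points, k, rng)
        for i in range(n):
            for j in range(n):
                if labels[i] == labels[j]:
                    votes[i][j] += 1
    return [[v / n_runs for v in row] for row in votes]

def single_link_cut(sim, threshold):
    """Single-link clusters at a similarity threshold, computed as the
    connected components of the graph with an edge where sim > threshold."""
    n = len(sim)
    labels = [-1] * n
    current = 0
    for start in range(n):
        if labels[start] != -1:
            continue
        labels[start] = current
        stack = [start]
        while stack:
            i = stack.pop()
            for j in range(n):
                if labels[j] == -1 and sim[i][j] > threshold:
                    labels[j] = current
                    stack.append(j)
        current += 1
    return labels

def longest_lifetime_partition(sim):
    """Return the partition whose cluster count survives the widest threshold range."""
    levels = sorted({0.0} | {v for row in sim for v in row})
    lifetimes = {}  # number of clusters -> (threshold range, labels)
    for lo, hi in zip(levels, levels[1:]):
        labels = single_link_cut(sim, lo)  # same result for any threshold in [lo, hi)
        k = max(labels) + 1
        span, _ = lifetimes.get(k, (0.0, labels))
        lifetimes[k] = (span + (hi - lo), labels)
    return max(lifetimes.values(), key=lambda item: item[0])[1]

# Usage on two synthetic 2-D blobs (hypothetical data, for illustration only).
demo_rng = random.Random(42)
blobs = [(demo_rng.gauss(cx, 0.5), demo_rng.gauss(cy, 0.5))
         for cx, cy in [(0, 0), (6, 6)] for _ in range(15)]
sim = co_association(blobs, n_runs=30, k_range=(2, 5))
print(longest_lifetime_partition(sim))
```

Cutting the single-link hierarchy through connected components keeps the sketch short; for larger data sets a hierarchical clustering routine from a numerical library would likely be a better fit.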

Keywords

Machine Intelligence, Data Partition, Cluster Validity, Natural Cluster, Vote Mechanism

References

  1. T. A. Bailey and R. Dubes. Cluster validity profiles. Pattern Recognition, 15(2):61–83, 1982.
  2. J. Buhmann and M. Held. Unsupervised learning without overfitting: Empirical risk approximation as an induction principle for reliable clustering. In Sameer Singh, editor, International Conference on Advances in Pattern Recognition, pages 167–176. Springer Verlag, 1999.
  3. R. O. Duda, P. E. Hart, and D. G. Stork. Pattern Classification. Wiley, second edition, 2001.
  4. Y. El-Sonbaty and M. A. Ismail. On-line hierarchical clustering. Pattern Recognition Letters, pages 1285–1291, 1998.
  5. M. Figueiredo and A. K. Jain. Unsupervised learning of finite mixture models. IEEE Trans. Pattern Analysis and Machine Intelligence, 24(3):381–396, 2002.
  6. B. Fischer, T. Zoller, and J. Buhmann. Path based pairwise data clustering with application to texture segmentation. In M. Figueiredo, J. Zerubia, and A. K. Jain, editors, Energy Minimization Methods in Computer Vision and Pattern Recognition, volume 2134 of LNCS, pages 235–266. Springer Verlag, 2001.
  7. A. L. Fred. Finding consistent clusters in data partitions. In Josef Kittler and Fabio Roli, editors, Multiple Classifier Systems, volume 2096 of LNCS, pages 309–318. Springer, 2001.
  8. A. L. Fred and J. Leitão. Clustering under a hypothesis of smooth dissimilarity increments. In Proc. of the 15th Int’l Conference on Pattern Recognition, volume 2, pages 190–194, Barcelona, 2000.
  9. A. L. Fred, J. S. Marques, and P. M. Jorge. Hidden Markov models vs syntactic modeling in object recognition. In ICIP’97, 1997.
  10. M. Har-Even and V. L. Brailovsky. Probabilistic validation approach for clustering. Pattern Recognition, 16:1189–1196, 1995.
  11. A. Jain. Fundamentals of Digital Image Processing. Prentice-Hall, 1989.
  12. A. K. Jain and R. C. Dubes. Algorithms for Clustering Data. Prentice Hall, 1988.
  13. A. K. Jain, M. N. Murty, and P. J. Flynn. Data clustering: A review. ACM Computing Surveys, 31(3):264–323, September 1999.
  14. J. Kittler, M. Hatef, R. P. Duin, and J. Matas. On combining classifiers. IEEE Trans. Pattern Analysis and Machine Intelligence, 20(3):226–239, 1998.
  15. R. Kothari and D. Pitts. On finding the number of clusters. Pattern Recognition Letters, 20:405–416, 1999.
  16. Y. Man and I. Gath. Detection and separation of ring-shaped clusters using fuzzy clusters. IEEE Trans. Pattern Analysis and Machine Intelligence, 16(8):855–861, August 1994.
  17. A. Marzal and E. Vidal. Computation of normalized edit distance and applications. IEEE Trans. Pattern Analysis and Machine Intelligence, 2(15):926–932, 1993.
  18. G. McLachlan and K. Basford. Mixture Models: Inference and Application to Clustering. Marcel Dekker, New York, 1988.
  19. B. Mirkin. Concept learning and feature selection based on square-error clustering. Machine Learning, 35:25–39, 1999.
  20. N. R. Pal and J. C. Bezdek. On cluster validity for the fuzzy c-means model. IEEE Trans. Fuzzy Systems, 3:370–379, 1995.
  21. E. J. Pauwels and G. Frederix. Finding regions of interest for content-extraction. In Proc. of IS&T/SPIE Conference on Storage and Retrieval for Image and Video Databases VII, SPIE Vol. 3656, pages 501–510, San Jose, January 1999.
  22. E. S. Ristad and P. N. Yianilos. Learning string-edit distance. IEEE Trans. Pattern Analysis and Machine Intelligence, 20(5):522–531, May 1998.
  23. S. Roberts, D. Husmeier, I. Rezek, and W. Penny. Bayesian approaches to Gaussian mixture modelling. IEEE Trans. Pattern Analysis and Machine Intelligence, 20(11), November 1998.
  24. D. Stanford and A. E. Raftery. Principal curve clustering with noise. Technical report, University of Washington, http://www.stat.washington.edu/raftery, 1997.
  25. H. Tenmoto, M. Kudo, and M. Shimbo. MDL-based selection of the number of components in mixture models for pattern recognition. In Adnan Amin, Dov Dori, Pavel Pudil, and Herbert Freeman, editors, Advances in Pattern Recognition, volume 1451 of Lecture Notes in Computer Science, pages 831–836. Springer Verlag, 1998.
  26. C. Zahn. Graph-theoretical methods for detecting and describing gestalt structures. IEEE Trans. Computers, C-20(1):68–86, 1971.

Copyright information

© Springer-Verlag Berlin Heidelberg 2002

Authors and Affiliations

  • Ana Fred (1)
  • Anil K. Jain (2)
  1. Instituto de Telecomunicações, Instituto Superior Técnico, Lisbon, Portugal
  2. Department of Computer Science and Engineering, Michigan State University, USA
