Combining Multiple Clusterings Via k-Modes Algorithm

  • Huilan Luo
  • Fansheng Kong
  • Yixiao Li
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4093)


Clustering ensembles have emerged as a powerful method for improving both the robustness and the stability of unsupervised classification solutions. However, finding a consensus clustering from multiple partitions is a difficult problem that can be approached from graph-based, combinatorial or statistical perspectives. A consensus scheme via the k-modes algorithm is proposed in this paper. A combined partition is found as a solution to the corresponding categorical data clustering problem using the k-modes algorithm. This study compares the performance of the k-modes consensus algorithm with other fusion approaches for clustering ensembles. Experimental results demonstrate the effectiveness of the proposed method.
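The abstract's core idea, treating each object's labels across the ensemble as a categorical feature vector and clustering those vectors with k-modes, can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `kmodes_consensus`, the seeded random initialization from distinct label vectors, and the stopping rule are all assumptions made for illustration. The simple matching (Hamming) dissimilarity and the per-attribute most-frequent-value mode update follow the standard k-modes formulation.

```python
from collections import Counter
import random


def hamming(a, b):
    """Simple matching dissimilarity: number of attributes that differ."""
    return sum(x != y for x, y in zip(a, b))


def kmodes_consensus(partitions, k, max_iter=100, seed=0):
    """Hypothetical helper: combine partitions by k-modes on their label vectors.

    partitions: list of label lists, one per base clustering, all of length n.
    Returns a list of n consensus labels in range(k).
    """
    # One categorical vector per object: its label in each base partition.
    rows = list(zip(*partitions))
    distinct = sorted(set(rows))
    if k > len(distinct):
        raise ValueError("k exceeds the number of distinct label vectors")

    # Assumed initialization: k distinct label vectors chosen at random.
    modes = random.Random(seed).sample(distinct, k)

    labels = None
    for _ in range(max_iter):
        # Assignment step: nearest mode under simple matching dissimilarity.
        new_labels = [
            min(range(k), key=lambda j: hamming(row, modes[j])) for row in rows
        ]
        if new_labels == labels:
            break  # assignments are stable, so the modes cannot change either
        labels = new_labels

        # Update step: each mode takes the most frequent category per attribute.
        for j in range(k):
            members = [row for row, lab in zip(rows, labels) if lab == j]
            if members:  # an empty cluster keeps its previous mode
                modes[j] = tuple(
                    Counter(col).most_common(1)[0][0] for col in zip(*members)
                )
    return labels


# Three base partitions of six objects; the third disagrees on object 2.
base = [
    [0, 0, 0, 1, 1, 1],
    [1, 1, 1, 0, 0, 0],  # same grouping as the first, with labels swapped
    [0, 0, 1, 1, 1, 1],
]
consensus = kmodes_consensus(base, k=2)
print(consensus)
```

Because labels are compared only within each partition, the sketch needs no label-correspondence step between base clusterings. On this toy ensemble the consensus groups objects 0 to 2 together and objects 3 to 5 together, although which group receives label 0 depends on the initialization.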


Keywords: Consensus Function; Normalize Mutual Information; Cluster Ensemble; Consensus Cluster; Circle Data
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.





Copyright information

© Springer-Verlag Berlin Heidelberg 2006

Authors and Affiliations

  • Huilan Luo (1, 2)
  • Fansheng Kong (1)
  • Yixiao Li (1)

  1. Artificial Intelligence Institute, Zhejiang University, Hangzhou, China
  2. Institute of Information Engineering, Jiangxi University of Science and Technology, Gangzhou, China
