Frequent Closed Patterns Based Multiple Consensus Clustering

  • Atheer Al-Najdi
  • Nicolas Pasquier
  • Frédéric Precioso
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 9693)

Abstract

Clustering is one of the major tasks in data mining. However, selecting an algorithm to cluster a dataset is difficult, especially when there is no prior knowledge about the structure of the data. Consensus clustering methods can be used to combine multiple base clusterings into a new solution that provides a better partitioning. In this work, we present a new consensus clustering method that detects clustering patterns by mining frequent closed itemsets. Instead of generating a single consensus, this method generates multiple consensuses by varying the number of base clusterings, and links these solutions in a hierarchical representation that eases the selection of the best clustering. This hierarchical view also provides an analysis tool, for example to discover strong clusters or outlier instances.
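
The idea of detecting clustering patterns through closed itemsets can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes each instance is encoded as a transaction of (base clustering index, cluster label) items, and it enumerates closed itemsets by brute-force intersection, which is only practical for tiny examples. The names closed_patterns, group_by_agreement, and min_support are hypothetical, chosen for this illustration.

```python
def closed_patterns(base_clusterings, min_support=1):
    """Map each closed itemset to the set of instances that support it.

    base_clusterings: list of label lists, one per base clustering, all of
    the same length (one label per instance). Assumed encoding, not taken
    from the paper: an item is a (clustering index, cluster label) pair.
    """
    n = len(base_clusterings[0])
    # One transaction per instance: the base clusters it belongs to.
    transactions = [
        frozenset((k, labels[i]) for k, labels in enumerate(base_clusterings))
        for i in range(n)
    ]
    # Closed itemsets with non-empty support are exactly the non-empty
    # intersections of transactions; close the set under intersection.
    closed = set(transactions)
    frontier = set(transactions)
    while frontier:
        new = set()
        for a in frontier:
            for b in closed:
                c = a & b
                if c and c not in closed:
                    new.add(c)
        closed |= new
        frontier = new
    # Keep only the patterns supported by at least min_support instances.
    result = {}
    for itemset in closed:
        support = frozenset(
            i for i, t in enumerate(transactions) if itemset <= t
        )
        if len(support) >= min_support:
            result[itemset] = support
    return result


def group_by_agreement(patterns):
    """Group instance sets by the number of base clusterings that agree."""
    groups = {}
    for itemset, instances in patterns.items():
        groups.setdefault(len(itemset), set()).add(instances)
    return groups


# Three toy base clusterings of six instances (made-up labels).
base = [
    [0, 0, 0, 1, 1, 1],
    [0, 0, 1, 1, 1, 1],
    [0, 0, 0, 0, 1, 1],
]
groups = group_by_agreement(closed_patterns(base))
```

In this toy run, groups[3] holds the instance sets on which all three base clusterings agree, while groups[2] and groups[1] hold larger sets on which fewer clusterings agree. Moving from high to low agreement is one way to picture how varying the number of base clusterings could yield several candidate consensuses.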

Keywords

  • Unsupervised learning
  • Clustering
  • Consensus clustering
  • Ensemble clustering
  • Frequent closed patterns

Copyright information

© Springer International Publishing Switzerland 2016

Authors and Affiliations

  • Atheer Al-Najdi (1)
  • Nicolas Pasquier (1)
  • Frédéric Precioso (1)
  1. Univ. Nice Sophia Antipolis, CNRS, I3S, UMR 7271, Sophia Antipolis, France
