
Sophisticated SOM based genetic operators in multi-objective clustering framework

  • Naveen Saini
  • Sriparna Saha
  • Aditya Harsh
  • Pushpak Bhattacharyya
Article

Abstract

Multi-objective clustering partitions a given collection of objects into K groups based on some similarity/dissimilarity criterion while simultaneously optimizing several partition quality measures. This paper proposes an automated decomposition-based multi-objective clustering technique, SOMDEA_clust, which fuses a self-organizing map (SOM) with multi-objective differential evolution. A novel reproduction operator is designed in which an ensemble of multiple neighborhoods extracted using the SOM is used to construct a mating pool of variable size. The probabilities of selecting the different neighborhood sizes are updated based on their performance in generating improved solutions over the last few generations. A decomposition-based selection scheme is also used, which divides the multi-objective optimization (MOO) problem into a number of single-objective subproblems. The objective functions corresponding to these subproblems are optimized in a collaborative manner using MOO. The potential of the proposed framework is shown by clustering four real-life data sets and five artificial data sets, in comparison with existing multi-objective clustering techniques (MOCK, SMEA_clust, and MEA_clust), a single-objective genetic clustering technique (SOGA), and a traditional clustering technique (K-means). To show the utility of the SOM-based reproduction operators, another decomposition-based multi-objective clustering technique that does not use them, MDEA_clust, is also developed in this paper. To show the efficacy of the proposed technique in handling large data sets, two large-scale data sets with more than 5000 data points are also used.
As a real-life application, the proposed clustering technique is applied to scientific/web document clustering, where a set of scientific/web documents is partitioned based on content similarity. A semantic representation is used to convert each text document into a real-valued vector. Experimental results illustrate the effectiveness of fusing SOM and DE in developing a clustering technique.
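
The reproduction step described in the abstract can be illustrated with a rough sketch. The code below is not the authors' implementation: the abstract does not give the update rule, the SOM topology, or the DE variant, so the success-rate rule with a small floor, the 2-D neuron grid, and the DE/rand/1/bin operator are all assumptions chosen for illustration. SOM training is omitted; `grid` is a hypothetical list holding the grid coordinates of the neuron to which each solution is mapped.

```python
import random


def select_size(sizes, probs, rng):
    """Roulette-wheel choice of one neighborhood size (assumed selection rule)."""
    return rng.choices(sizes, weights=probs, k=1)[0]


def update_probs(successes, attempts, floor=0.05):
    """Recompute selection probabilities from recent success rates.

    successes[k] / attempts[k] is the fraction of recent offspring produced
    with neighborhood size k that improved a solution. The floor (an
    assumption) keeps every size selectable.
    """
    rates = [(s / a if a else 0.0) + floor for s, a in zip(successes, attempts)]
    total = sum(rates)
    return [r / total for r in rates]


def mating_pool(i, grid, size):
    """Indices of the `size` solutions whose neurons lie closest to neuron i."""
    def dist2(j):
        return (grid[i][0] - grid[j][0]) ** 2 + (grid[i][1] - grid[j][1]) ** 2

    others = [j for j in range(len(grid)) if j != i]
    return sorted(others, key=dist2)[:size]


def de_rand_1_bin(x, pool, F, CR, rng):
    """DE/rand/1/bin trial vector built from three distinct pool members."""
    a, b, c = rng.sample(pool, 3)
    j_rand = rng.randrange(len(x))  # forces at least one mutated component
    return [
        a[j] + F * (b[j] - c[j]) if (rng.random() < CR or j == j_rand) else x[j]
        for j in range(len(x))
    ]
```

In a full loop, one would draw a size with `select_size`, build the pool with `mating_pool`, generate a trial vector from the pooled solutions, record whether it improved any subproblem, and call `update_probs` every few generations.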

Keywords

Clustering; Cluster validity indices; Self-organizing map (SOM); Differential evolutionary algorithm (DE); Polynomial mutation; Multi-objective optimization (MOO)

Notes

Acknowledgments

Dr. Sriparna Saha would like to acknowledge the support of SERB Women in Excellence Award-SB/WEA/08/2017 for carrying out this work.

Copyright information

© Springer Science+Business Media, LLC, part of Springer Nature 2018

Authors and Affiliations

  1. Department of Computer Science and Engineering, Indian Institute of Technology Patna, Patna-801103, India
  2. Department of Computer Science and Engineering, University of Petroleum and Energy Studies, Uttarakhand, India
