Swarm based automatic clustering using nature inspired Emperor Penguins Colony algorithm

Harifi, Sasan; Khalilian, Madjid; Mohammadzadeh, Javad

doi:10.1007/s12530-023-09507-y

Original Paper
Published: 11 June 2023

Evolving Systems, Volume 14, pages 1083–1099 (2023)

Abstract

Nature acts as a source of concepts, mechanisms, and principles for designing artificial computing systems that deal with complex computational problems. Most heuristic and metaheuristic algorithms are derived from the behavior of biological or physical systems in nature. Clustering is the process of partitioning a set of data into groups of similar examples. Because the clustering problem is NP-hard, metaheuristics are an appropriate tool for solving it. Indeed, clustering can be treated as a special case of an optimization problem. Classic clustering requires the number of clusters to be known before clustering begins. This paper presents an algorithm that requires no such prior knowledge to group the data. We propose a swarm-based Emperor Penguins Colony (EPC) algorithm to solve both classic and automatic clustering problems. The proposed approach is compared with six state-of-the-art, popular, and improved nature-inspired algorithms, a partitioning-based heuristic algorithm, and a hierarchical clustering method on ten real-world datasets. The results show that classic and automatic clustering using the EPC algorithm performs better than the competing algorithms.
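To make the idea of "clustering as an optimization problem" concrete, the sketch below shows one way automatic clustering might be posed for a metaheuristic. It is not the EPC algorithm and not the authors' formulation, which the abstract does not describe. It assumes an encoding in which each candidate holds k_max activation thresholds followed by k_max centroids, uses a Davies-Bouldin-style index as the fitness to minimize, and substitutes plain random search for the swarm optimizer. All function names and parameters (decode, assign, davies_bouldin, random_search, k_max) are hypothetical.

```python
import numpy as np

def decode(candidate, k_max, dim):
    """Split a flat candidate into its active centroids (threshold > 0.5)."""
    thresholds = candidate[:k_max]
    centroids = candidate[k_max:].reshape(k_max, dim)
    active = thresholds > 0.5
    if active.sum() < 2:
        # Assumption: force at least two clusters by taking the two largest thresholds.
        active = np.zeros(k_max, dtype=bool)
        active[np.argsort(thresholds)[-2:]] = True
    return centroids[active]

def assign(data, centroids):
    """Label each point with the index of its nearest centroid."""
    dists = np.linalg.norm(data[:, None, :] - centroids[None, :, :], axis=2)
    return dists.argmin(axis=1)

def davies_bouldin(data, centroids, labels):
    """Davies-Bouldin-style index computed around the given centroids (lower is better)."""
    k = len(centroids)
    scatter = np.zeros(k)
    for i in range(k):
        members = data[labels == i]
        if len(members) == 0:
            return np.inf  # penalize solutions with an empty cluster
        scatter[i] = np.linalg.norm(members - centroids[i], axis=1).mean()
    sep = np.linalg.norm(centroids[:, None, :] - centroids[None, :, :], axis=2)
    np.fill_diagonal(sep, np.inf)  # ignore each cluster's distance to itself
    ratios = (scatter[:, None] + scatter[None, :]) / sep
    return ratios.max(axis=1).mean()

def random_search(data, k_max, iterations=300, seed=0):
    """Stand-in optimizer: sample candidates uniformly and keep the best one."""
    rng = np.random.default_rng(seed)
    dim = data.shape[1]
    low, high = data.min(axis=0), data.max(axis=0)
    best_centroids, best_score = None, np.inf
    for _ in range(iterations):
        thresholds = rng.uniform(0.0, 1.0, size=k_max)
        centers = rng.uniform(low, high, size=(k_max, dim))
        candidate = np.concatenate([thresholds, centers.ravel()])
        centroids = decode(candidate, k_max, dim)
        score = davies_bouldin(data, centroids, assign(data, centroids))
        if score < best_score:
            best_centroids, best_score = centroids, score
    return best_centroids, best_score

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    # Synthetic data for illustration only: two well-separated blobs.
    blobs = np.vstack([rng.normal(0.0, 0.5, (50, 2)), rng.normal(6.0, 0.5, (50, 2))])
    centroids, score = random_search(blobs, k_max=4)
    print("clusters found:", len(centroids), "index value:", score)
```

In this framing the number of clusters is not supplied by the user; it falls out of how many thresholds are active in the best candidate. A swarm algorithm such as EPC would replace the uniform sampling loop with its own update rules while the fitness function could stay the same.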

Data availability

The datasets used and/or analyzed during the current study are available from the corresponding author on reasonable request.

Author information

Authors and Affiliations

Department of Computer Engineering, Islamic Azad University, Karaj Branch, Karaj, Iran
Sasan Harifi, Madjid Khalilian & Javad Mohammadzadeh

Corresponding author

Correspondence to Madjid Khalilian.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Harifi, S., Khalilian, M. & Mohammadzadeh, J. Swarm based automatic clustering using nature inspired Emperor Penguins Colony algorithm. Evolving Systems 14, 1083–1099 (2023). https://doi.org/10.1007/s12530-023-09507-y

Received: 13 July 2021
Accepted: 03 May 2023
Published: 11 June 2023
Issue Date: December 2023
DOI: https://doi.org/10.1007/s12530-023-09507-y
