Abstract
In real-life scenarios, the number of clusters is usually unknown in advance, which prevents clustering algorithms from generating useful partitions. In addition, an appropriate subset of features is required to produce good-quality clusters. Selecting the optimal number of clusters and features is a challenging task in clustering. To address these problems, this paper proposes an automatic multi-objective clustering approach called HMOSHSSA. In HMOSHSSA, the spotted hyena and salp swarm algorithms are hybridized to obtain a better trade-off between their intensification and diversification capabilities. Two novel concepts, an encoding scheme and a threshold-setting method, are incorporated into HMOSHSSA. The encoding scheme is used to choose the number of clusters and features during the optimization process, and the variance of the dataset is used to set the threshold values for both clusters and features. A novel fitness function is proposed to improve the optimization process. The performance of the proposed algorithm is evaluated on eight well-known real-world datasets, and the statistical significance of the results is assessed through t-tests. The results indicate that the proposed approach is able to detect the optimal number of clusters and features from a given dataset without user intervention. The approach is also applied to microarray data analysis and image segmentation problems. HMOSHSSA outperformed the other considered algorithms in terms of the performance measures.
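The abstract does not spell out the encoding scheme, so the following is only a minimal sketch of how a threshold-based encoding of this kind is commonly decoded. The vector layout, the function name `decode`, and the fixed threshold values are all assumptions made for illustration; in the paper the thresholds are derived from the variance of the dataset, and that rule is not reproduced here.

```python
from typing import List, Sequence, Tuple


def decode(
    solution: Sequence[float],
    k_max: int,
    n_features: int,
    cluster_threshold: float,
    feature_threshold: float,
) -> Tuple[List[List[float]], List[int]]:
    """Decode one candidate solution into active centroids and features.

    Assumed (hypothetical) layout of the flat vector:
      [k_max cluster flags | n_features feature flags |
       k_max * n_features centroid coordinates, row by row]
    A cluster or feature counts as selected when its flag exceeds
    the corresponding threshold.
    """
    expected = k_max + n_features + k_max * n_features
    if len(solution) != expected:
        raise ValueError(f"expected {expected} values, got {len(solution)}")

    cluster_flags = solution[:k_max]
    feature_flags = solution[k_max:k_max + n_features]
    coords = solution[k_max + n_features:]

    # Indices of the clusters and features switched on by this candidate.
    active_clusters = [i for i, f in enumerate(cluster_flags)
                       if f > cluster_threshold]
    active_features = [j for j, f in enumerate(feature_flags)
                       if f > feature_threshold]

    # Keep only the selected centroids, restricted to the selected features.
    centroids = [
        [coords[i * n_features + j] for j in active_features]
        for i in active_clusters
    ]
    return centroids, active_features


# Illustrative candidate: up to 3 clusters, 2 features, thresholds fixed at 0.5.
candidate = [0.9, 0.2, 0.7,   # cluster flags
             0.8, 0.1,        # feature flags
             1.0, 2.0,        # centroid 0
             3.0, 4.0,        # centroid 1
             5.0, 6.0]        # centroid 2
centroids, features = decode(candidate, 3, 2, 0.5, 0.5)
```

With this layout a single real-valued vector fixes both the number of clusters and the feature subset, which is what lets an optimizer search over them without user-supplied values.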
Data availability statement
No dataset was generated in this work.
Funding
The authors did not receive support from any organization for the submitted work.
Ethics declarations
Conflict of Interest
The authors declare no conflict of interest.
About this article
Cite this article
Kumar, V., Kumari, R. & Kumar, S. HMOSHSSA: a novel framework for solving simultaneous clustering and feature selection problems. Multimed Tools Appl (2024). https://doi.org/10.1007/s11042-024-18726-7
DOI: https://doi.org/10.1007/s11042-024-18726-7