HMOSHSSA: a novel framework for solving simultaneous clustering and feature selection problems

Multimedia Tools and Applications

Abstract

In real-life scenarios, the number of clusters in a dataset is often unknown in advance, which prevents clustering algorithms from generating useful partitions. In addition, an appropriate number of features is required to produce good-quality clusters. Selecting the optimal number of clusters and features is therefore a challenging task in clustering. To address these problems, this paper proposes an automatic multi-objective clustering approach called HMOSHSSA. In HMOSHSSA, the spotted hyena and salp swarm algorithms are hybridized to obtain a better trade-off between their intensification and diversification capabilities. Two novel concepts, one for encoding and one for threshold setting, are incorporated into HMOSHSSA. The encoding scheme is used to choose the optimal number of clusters and features during the optimization process, and the variance of the dataset is used to set the threshold values for both clusters and features. A novel fitness function is proposed to improve the optimization process. The performance of the algorithm is evaluated on eight well-known real-world datasets, and the statistical significance of the results is assessed with t-tests. The results show that the proposed approach detects the optimal number of clusters and features from a given dataset without user intervention. The approach is also applied to microarray data analysis and image segmentation problems. HMOSHSSA outperformed the other algorithms considered on the reported performance measures.
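The abstract does not give the details of the encoding or of the variance-based thresholds, so the following is only a minimal sketch of how a threshold-based encoding for simultaneous cluster and feature selection could look. It is not the authors' method. The vector layout (cluster flags, then feature flags, then centroid coordinates), the function names `variance_threshold`, `decode` and `assign`, and the threshold formula are all assumptions made for illustration.

```python
import numpy as np


def variance_threshold(data):
    """Hypothetical threshold: mean per-feature variance after min-max scaling.

    Assumption for illustration only; the abstract says the dataset variance
    is used to set thresholds but does not state the formula.
    """
    data = np.asarray(data, dtype=float)
    span = data.max(axis=0) - data.min(axis=0)
    span[span == 0] = 1.0  # avoid division by zero for constant features
    scaled = (data - data.min(axis=0)) / span
    return float(scaled.var(axis=0).mean())


def decode(candidate, k_max, n_features, cluster_thr, feature_thr):
    """Decode one candidate vector (assumed layout) into active centroids.

    Layout: [k_max cluster flags | n_features feature flags | k_max * n_features centroid values].
    A cluster or feature is active when its flag exceeds the threshold.
    """
    candidate = np.asarray(candidate, dtype=float)
    cluster_flags = candidate[:k_max]
    feature_flags = candidate[k_max:k_max + n_features]
    centroids = candidate[k_max + n_features:].reshape(k_max, n_features)

    active_clusters = cluster_flags > cluster_thr
    active_features = feature_flags > feature_thr

    # Repair step (assumption): keep at least two clusters and one feature.
    if active_clusters.sum() < 2:
        active_clusters = np.zeros(k_max, dtype=bool)
        active_clusters[np.argsort(cluster_flags)[-2:]] = True
    if active_features.sum() < 1:
        active_features = np.zeros(n_features, dtype=bool)
        active_features[np.argmax(feature_flags)] = True

    return centroids[active_clusters][:, active_features], active_features


def assign(data, centroids, active_features):
    """Assign each point to its nearest active centroid on the selected features."""
    reduced = np.asarray(data, dtype=float)[:, active_features]
    distances = np.linalg.norm(reduced[:, None, :] - centroids[None, :, :], axis=2)
    return distances.argmin(axis=1)
```

In a sketch like this, a metaheuristic would evolve the candidate vectors, and each candidate would be decoded and scored by one or more fitness functions; the number of active flags then determines the number of clusters and features without the user fixing them beforehand.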

Data availability statement

No dataset was generated in this work.

Funding

The authors did not receive support from any organization for the submitted work.

Author information

Corresponding author

Correspondence to Vijay Kumar.

Ethics declarations

Conflict of Interest

The authors declare no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Kumar, V., Kumari, R. & Kumar, S. HMOSHSSA: a novel framework for solving simultaneous clustering and feature selection problems. Multimed Tools Appl (2024). https://doi.org/10.1007/s11042-024-18726-7

  • DOI: https://doi.org/10.1007/s11042-024-18726-7
