Abstract
Data science and data mining face high-dimensionality problems as data volumes grow, and conventional machine learning techniques perform unsatisfactorily on high-dimensional datasets. Feature selection reduces the dimensionality of the data by retaining only the relevant information in the dataset. The recently proposed Salp Swarm Algorithm (SSA) is a population-based meta-heuristic optimization technique inspired by the swarming behavior of sea salps. SSA fails to drive its initial random solutions to the global optimum because its exploration and exploitation depend entirely on the number of iterations. The proposed improved SSA (iSSA) aims to enhance the ability of the salps to explore divergent areas by randomly updating their locations. Randomizing the salp locations via Lévy flight enriches the exploitation potential of SSA, which helps the model converge toward the global optimum. The performance of the proposed iSSA is investigated using six different high-dimensional microarray datasets. In terms of convergence, the proposed model outperforms SSA, providing 0.1033% more confidence in the selected features. The simulation results show that iSSA provides more competitive and significant results than SSA.
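The Lévy-flight randomization mentioned above can be illustrated with a minimal sketch. It is not the authors' iSSA update rule, which the abstract does not specify. It assumes Mantegna's algorithm for drawing Lévy steps, and the names `levy_step`, `levy_update`, `step_scale`, and the default `beta=1.5` are hypothetical choices made only for illustration.

```python
import math
import random


def levy_step(beta=1.5, rng=random):
    """Draw one heavy-tailed step length using Mantegna's algorithm (assumed here)."""
    # Standard deviation of the numerator Gaussian from Mantegna's formula.
    sigma_u = (
        math.gamma(1 + beta) * math.sin(math.pi * beta / 2)
        / (math.gamma((1 + beta) / 2) * beta * 2 ** ((beta - 1) / 2))
    ) ** (1 / beta)
    u = rng.gauss(0.0, sigma_u)
    v = rng.gauss(0.0, 1.0)
    while v == 0.0:  # avoid division by zero in the (very unlikely) exact-zero draw
        v = rng.gauss(0.0, 1.0)
    return u / abs(v) ** (1 / beta)


def levy_update(position, lower, upper, step_scale=0.01, beta=1.5, rng=random):
    """Perturb each coordinate of a salp position with a Levy step, then clip to bounds.

    Scaling the step by the search range is an illustrative assumption,
    not a rule taken from the paper.
    """
    new_position = []
    for x, lo, hi in zip(position, lower, upper):
        candidate = x + step_scale * levy_step(beta, rng) * (hi - lo)
        new_position.append(min(max(candidate, lo), hi))  # keep the salp inside the bounds
    return new_position


if __name__ == "__main__":
    rng = random.Random(0)
    salp = [0.5] * 5
    print(levy_update(salp, [0.0] * 5, [1.0] * 5, rng=rng))
```

Most draws give small moves that refine a solution locally, while the heavy tail occasionally produces a long jump that can carry a salp out of a local optimum.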
Acknowledgements
The authors sincerely thank the Department of Science and Technology (DST), Government of India for funding this research project work under the Interdisciplinary Cyber Physical Systems (ICPS) scheme (Grant No. T-54).
About this article
Cite this article
Balakrishnan, K., Dhanalakshmi, R. & Khaire, U.M. Improved salp swarm algorithm based on the levy flight for feature selection. J Supercomput 77, 12399–12419 (2021). https://doi.org/10.1007/s11227-021-03773-w
DOI: https://doi.org/10.1007/s11227-021-03773-w