Abstract
Improving classification performance is an essential goal in many practical applications, and feature selection has become an important data preprocessing step in machine learning systems. However, many effective feature selection methods rely on heuristic search strategies and therefore suffer from high running costs. This paper proposes an efficient multiobjective feature selection method optimized by an artificial immune algorithm. It uses a clonal selection algorithm to explore the search space of candidate feature subsets. To match the objectives of feature selection, and drawing on results from biological research, the method introduces genome shuffling and a conditional lethal mutation mechanism to improve the search performance of the algorithm. The method is compared experimentally with 17 advanced feature selection methods on 21 benchmark datasets in terms of classification accuracy, number of selected features, and computational cost. The results show that the algorithm achieves the smallest number of selected features (only 3.26% compared to the lowest) and better average classification accuracy, at a much lower average computational cost than the other methods (only 3.62% compared to the lowest).
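To make the search idea concrete, the following is a minimal Python sketch of a generic clonal selection loop over binary feature masks. It is not the authors' implementation: the abstract does not specify the operators, so `shuffle_genomes` (each gene taken from a randomly chosen elite parent) and `is_lethal` (discarding masks that are empty or exceed a size cap) are simplified, hypothetical stand-ins for genome shuffling and conditional lethal mutation. The `affinity` function is likewise a toy substitute for classification accuracy, and all names and parameter values are assumptions chosen for illustration.

```python
import random


def affinity(mask, relevant, alpha=0.1):
    """Toy affinity (assumption): coverage of the relevant features minus a size penalty."""
    selected = {i for i, bit in enumerate(mask) if bit}
    coverage = len(selected & relevant) / len(relevant)
    return coverage - alpha * len(selected) / len(mask)


def hypermutate(mask, rate, rng):
    """Flip each bit independently with probability `rate`."""
    return [bit ^ 1 if rng.random() < rate else bit for bit in mask]


def shuffle_genomes(parents, rng):
    """Hypothetical stand-in for genome shuffling: each gene comes from a random parent."""
    return [rng.choice(parents)[i] for i in range(len(parents[0]))]


def is_lethal(mask, max_features):
    """Hypothetical lethal condition: no feature selected, or too many selected."""
    n_selected = sum(mask)
    return n_selected == 0 or n_selected > max_features


def random_mask(n_features, max_features, rng):
    """Draw a viable mask with between 1 and max_features selected features."""
    k = rng.randint(1, max_features)
    chosen = set(rng.sample(range(n_features), k))
    return [1 if i in chosen else 0 for i in range(n_features)]


def clonal_feature_search(n_features, relevant, pop_size=10, n_clones=5,
                          generations=40, rate=0.05, max_features=None, seed=0):
    rng = random.Random(seed)
    max_features = max_features or n_features

    def score(mask):
        return affinity(mask, relevant)

    pop = [random_mask(n_features, max_features, rng) for _ in range(pop_size)]
    for _ in range(generations):
        candidates = list(pop)
        for antibody in pop:
            for _ in range(n_clones):
                clone = hypermutate(antibody, rate, rng)
                if not is_lethal(clone, max_features):  # lethal clones are dropped
                    candidates.append(clone)
        # Keep the highest-affinity candidates (the best one always survives).
        pop = sorted(candidates, key=score, reverse=True)[:pop_size]
        # Recombine the top antibodies and let the child replace the weakest one.
        child = shuffle_genomes(pop[:3], rng)
        if not is_lethal(child, max_features):
            pop[-1] = child
    best = max(pop, key=score)
    return best, score(best)


if __name__ == "__main__":
    best, value = clonal_feature_search(20, {0, 3, 7}, max_features=10, seed=1)
    print("selected features:", [i for i, bit in enumerate(best) if bit])
    print("affinity:", round(value, 3))
```

In a realistic setting, `affinity` would be replaced by a cross-validated classifier score combined with a feature-count term, which is where most of the running cost of wrapper-style methods arises.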
Acknowledgements
This work was supported in part by the National Key Research and Development Program of China (No. 2020YFB1805400); in part by the National Natural Science Foundation of China (Nos. U1736212, U19A2068, 62032002, and 62002248); in part by the China Postdoctoral Science Foundation (Nos. 2019TQ0217, 2020M673277, and 2020M683345); in part by the Provincial Key Research and Development Program of Sichuan (No. 20ZDYF3145); in part by the Fundamental Research Funds for the Central Universities (No. YJ201933); and in part by the China International Postdoctoral Exchange Fellowship Program (Talent-Introduction).
Ethics declarations
Conflict of Interests
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
About this article
Cite this article
Zhu, Y., Li, T. & Lan, X. Feature selection optimized by the artificial immune algorithm based on genome shuffling and conditional lethal mutation. Appl Intell 53, 13972–13992 (2023). https://doi.org/10.1007/s10489-022-03971-w
DOI: https://doi.org/10.1007/s10489-022-03971-w