Feature selection optimized by the artificial immune algorithm based on genome shuffling and conditional lethal mutation

Applied Intelligence

Abstract

Improving classification performance is an essential goal in many practical applications, and feature selection has become an important data preprocessing step in machine learning systems. However, many effective feature selection methods based on heuristic search strategies suffer from high running costs. This paper proposes an efficient multiobjective feature selection method based on artificial immune algorithm optimization. It uses a clonal selection algorithm to explore the search space of feature subsets. Guided by the objectives of feature selection and by findings from biological research, the method introduces genome shuffling and a conditional lethal mutation mechanism to improve the search performance of the algorithm. The method is compared experimentally with 17 advanced feature selection methods on 21 benchmark datasets in terms of classification accuracy, number of selected features, and computational cost. The results show that the algorithm achieves the smallest number of selected features (only 3.26% compared to the lowest) and better average classification accuracy, with a much lower average computational cost than the other methods (only 3.62% compared to the lowest).
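The abstract does not spell out the operators, so the following is only a rough sketch of how a clonal-selection search over binary feature masks might look. Everything in it is an assumption made for illustration: the function name `clonal_selection_fs`, the single-cut recombination standing in for genome shuffling, the "discard a clone that is worse than its parent" rule standing in for conditional lethal mutation, and the scalarized toy fitness (the paper's method is multiobjective and evaluates real classifiers). It is not the authors' implementation.

```python
import random


def clonal_selection_fs(n_features, fitness, pop_size=10, n_clones=5,
                        generations=30, seed=0):
    """Toy clonal-selection search over binary feature masks (illustrative only)."""
    rng = random.Random(seed)
    # Each antibody is a binary mask: 1 keeps the feature, 0 drops it.
    pop = [[rng.randint(0, 1) for _ in range(n_features)]
           for _ in range(pop_size)]

    def mutate(mask):
        # Flip one randomly chosen bit.
        child = mask[:]
        i = rng.randrange(n_features)
        child[i] = 1 - child[i]
        return child

    def shuffle(a, b):
        # ASSUMED stand-in for genome shuffling: recombine two good
        # antibodies at a random cut point (requires n_features >= 2).
        cut = rng.randrange(1, n_features)
        return a[:cut] + b[cut:]

    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        elite = pop[: max(2, pop_size // 2)]
        next_pop = elite[:]  # elitism: the best antibodies always survive
        for parent in elite:
            parent_fit = fitness(parent)
            for _ in range(n_clones):
                clone = mutate(shuffle(parent, rng.choice(elite)))
                # ASSUMED stand-in for conditional lethal mutation: a clone
                # "dies" if it selects nothing or scores below its parent.
                if sum(clone) == 0 or fitness(clone) < parent_fit:
                    continue
                next_pop.append(clone)
        next_pop.sort(key=fitness, reverse=True)
        pop = next_pop[:pop_size]
    return pop[0]


# Hypothetical toy objective: reward covering three "relevant" features,
# with a small penalty per selected feature.
RELEVANT = {0, 3, 7}


def toy_fitness(mask):
    hits = sum(mask[i] for i in RELEVANT)
    return hits / len(RELEVANT) - 0.01 * sum(mask)


best = clonal_selection_fs(12, toy_fitness)
print(best, toy_fitness(best))
```

In a realistic setting the `fitness` callable would wrap a classifier's cross-validated accuracy on the masked features, which is where most of the running cost mentioned in the abstract would come from.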

Acknowledgements

This work was supported in part by the National Key Research and Development Program of China (No. 2020YFB1805400); in part by the National Natural Science Foundation of China (Nos. U1736212, U19A2068, 62032002, and 62002248); in part by the China Postdoctoral Science Foundation (Nos. 2019TQ0217, 2020M673277, and 2020M683345); in part by the Provincial Key Research and Development Program of Sichuan (No. 20ZDYF3145); in part by the Fundamental Research Funds for the Central Universities (No. YJ201933); and in part by the China International Postdoctoral Exchange Fellowship Program (Talent-Introduction).

Author information

Corresponding author

Correspondence to Tao Li.

Ethics declarations

Conflict of Interests

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Zhu, Y., Li, T. & Lan, X. Feature selection optimized by the artificial immune algorithm based on genome shuffling and conditional lethal mutation. Appl Intell 53, 13972–13992 (2023). https://doi.org/10.1007/s10489-022-03971-w

  • DOI: https://doi.org/10.1007/s10489-022-03971-w
