Improved salp swarm algorithm based on particle swarm optimization for feature selection

Abstract

Feature selection (FS) is a machine learning process commonly used to reduce the high dimensionality of datasets. It extracts the most representative information from large pools of data, reducing the computational effort of subsequent tasks such as classification. This article presents a hybrid optimization method for the FS problem that combines the salp swarm algorithm (SSA) with particle swarm optimization (PSO). The hybridization of the two approaches yields an algorithm called SSAPSO, in which the efficacy of both the exploration and the exploitation steps is improved. The performance of the proposed algorithm is verified in two experimental series: in the first, it is compared with similar approaches on benchmark functions; in the second, SSAPSO is used to determine the best set of features over different UCI datasets, removing redundant or confusing features from the original dataset while maintaining or improving classification accuracy. The experimental results provide evidence of the enhancement of SSAPSO in terms of performance and accuracy without increasing the computational effort.
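As a rough illustration of the hybridization described above, the following sketch combines the standard SSA leader update (Mirjalili et al., ref. 44) with the PSO velocity rule (Eberhart and Kennedy, ref. 11) on a sphere benchmark function. It is a minimal sketch, not the authors' exact SSAPSO formulation; the half-and-half population split, parameter values, and function names are illustrative assumptions.

```python
import math
import random

def sphere(x):
    """Benchmark objective f(x) = sum(x_i^2); global minimum 0 at the origin."""
    return sum(v * v for v in x)

def ssapso(obj, dim=10, n=30, iters=200, lb=-10.0, ub=10.0,
           w=0.7, cp=1.5, cg=1.5, seed=0):
    """Illustrative SSA/PSO hybrid: SSA leader moves, PSO follower velocities."""
    rng = random.Random(seed)
    X = [[rng.uniform(lb, ub) for _ in range(dim)] for _ in range(n)]
    V = [[0.0] * dim for _ in range(n)]                 # PSO velocities
    pbest = [row[:] for row in X]                       # personal bests
    pbest_fit = [obj(row) for row in X]
    g = min(range(n), key=lambda i: pbest_fit[i])
    gbest, gbest_fit = pbest[g][:], pbest_fit[g]        # food source / global best
    for t in range(iters):
        # SSA coefficient c1 decays over time, shifting exploration to exploitation.
        c1 = 2.0 * math.exp(-((4.0 * (t + 1) / iters) ** 2))
        for i in range(n):
            for j in range(dim):
                if i < n // 2:
                    # Leader salps: random step around the food source (SSA rule).
                    step = c1 * ((ub - lb) * rng.random() + lb)
                    X[i][j] = gbest[j] + step if rng.random() < 0.5 else gbest[j] - step
                else:
                    # Followers: PSO velocity update replaces the SSA chain rule.
                    V[i][j] = (w * V[i][j]
                               + cp * rng.random() * (pbest[i][j] - X[i][j])
                               + cg * rng.random() * (gbest[j] - X[i][j]))
                    X[i][j] += V[i][j]
                X[i][j] = min(max(X[i][j], lb), ub)     # clamp to the search bounds
            f = obj(X[i])
            if f < pbest_fit[i]:
                pbest[i], pbest_fit[i] = X[i][:], f
                if f < gbest_fit:
                    gbest, gbest_fit = X[i][:], f
    return gbest, gbest_fit
```

For the feature-selection experiments, each position would additionally be binarized (for instance by thresholding a transfer function) to mark selected features, and the objective would be replaced by a wrapper classifier's error on a UCI dataset; those details follow the paper itself, not this sketch.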


References

  1. Al-Ayyoub M, Jararweh Y, Rababah A, Aldwairi M (2017) Feature extraction and selection for Arabic tweets authorship authentication. J Ambient Intell Humaniz Comput 8(3):383–393

  2. Anderson PA, Bone Q (1980) Communication between individuals in salp chains II. Physiology. Proc R Soc Lond B Biol Sci 210(1181):559–574

  3. Arigbabu OA, Mahmood S, Ahmad SMS, Arigbabu AA (2016) Smile detection using hybrid face representation. J Ambient Intell Humaniz Comput 7(3):415–426

  4. Awada W, Khoshgoftaar TM, Dittman D, Wald R, Napolitano A (2012) A review of the stability of feature selection techniques for bioinformatics data. In: 2012 IEEE 13th international conference on information reuse and integration (IRI). IEEE, pp 356–363

  5. Chang PC, Lin JJ, Liu CH (2012) An attribute weight assignment and particle swarm optimization algorithm for medical database classifications. Comput Methods Prog Biomed 107(3):382–392

  6. Chen LH, Yang B, Wang Sj, Wang G, Li Hz, Liu Wb (2014) Towards an optimal support vector machine classifier using a parallel particle swarm optimization strategy. Appl Math Comput 239:180–197

  7. Chikh R, Chikhi S (2017) Clustered negative selection algorithm and fruit fly optimization for email spam detection. J Ambient Intell Humaniz Comput. https://doi.org/10.1007/s12652-017-0621-2

  8. Chuang LY, Yang CH, Yang CH (2009) Tabu search and binary particle swarm optimization for feature selection using microarray data. J Comput Biol 16(12):1689–1703

  9. Cuevas E, Cienfuegos M (2014) A new algorithm inspired in the behavior of the social-spider for constrained optimization. Expert Syst Appl 41(2):412–425

  10. Dash M, Liu H (2003) Consistency-based search in feature selection. Artif Intell 151(1–2):155–176

  11. Eberhart R, Kennedy J (1995) A new optimizer using particle swarm theory. In: Proceedings of the Sixth International Symposium on Micro Machine and Human Science (MHS'95). IEEE, pp 39–43

  12. El Aziz MA, Ewees AA, Hassanien AE (2016) Hybrid swarms optimization based image segmentation. In: Bhattacharyya S, Dutta P, De S, Klepac G (eds) Hybrid soft computing for image segmentation. Springer, Berlin, pp 1–21

  13. El Aziz MA, Hemdan AM, Ewees AA, Elhoseny M, Shehab A, Hassanien AE, Xiong S (2017) Prediction of biochar yield using adaptive neuro-fuzzy inference system with particle swarm optimization. In: 2017 IEEE PES PowerAfrica. IEEE, pp 115–120

  14. El Aziz MA, Ewees AA, Hassanien AE (2018a) Multi-objective whale optimization algorithm for content-based image retrieval. Multimed Tools Appl 77:26135–26172

  15. El Aziz MA, Ewees AA, Hassanien AE, Mudhsh M, Xiong S (2018b) Multi-objective whale optimization algorithm for multilevel thresholding segmentation. In: Hassanien A, Oliva D (eds) Advances in soft computing and machine learning in image processing. Springer, Berlin, pp 23–39

  16. Elaziz MEA, Ewees AA, Oliva D, Duan P, Xiong S (2017) A hybrid method of sine cosine algorithm and differential evolution for feature selection. In: Liu D, Xie S, Li Y, Zhao D, El-Alfy ES (eds) International conference on neural information processing. Springer, Berlin, pp 145–155

  17. Ewees AA, El Aziz MA, Elhoseny M (2017a) Social-spider optimization algorithm for improving ANFIS to predict biochar yield. In: 2017 8th International Conference on Computing, Communication and Networking Technologies (ICCCNT). IEEE, pp 1–6

  18. Ewees AA, El Aziz MA, Hassanien AE (2017b) Chaotic multi-verse optimizer-based feature selection. Neural Comput Appl. https://doi.org/10.1007/s00521-017-3131-4

  19. Ewees AA, Elaziz MA, Houssein EH (2018) Improved grasshopper optimization algorithm using opposition-based learning. Expert Syst Appl 112:156–172

  20. Gasca SJARE (2006) Eliminating redundancy and irrelevance using a new MLP-based feature selection method. Pattern Recognit 39(2):313–315

  21. Goldberg DE, Holland JH (1988) Genetic algorithms and machine learning. Mach Learn 3(2):95–99

  22. Guyon I, Elisseeff A (2003) An introduction to variable and feature selection. J Mach Learn Res 3:1157–1182

  23. Guyon I, Weston J, Barnhill S, Vapnik V (2002) Gene selection for cancer classification using support vector machines. Mach Learn 46(1–3):389–422

  24. Hafez AI, Hassanien AE, Zawbaa HM, Emary E (2015) Hybrid monkey algorithm with krill herd algorithm optimization for feature selection. In: 2015 11th International computer engineering conference (ICENCO). IEEE, pp 273–277

  25. Han J, Pei J, Kamber M (2011) Data mining: concepts and techniques. Elsevier, New York

  26. Henschke N, Everett JD, Richardson AJ, Suthers IM (2016) Rethinking the role of salps in the ocean. Trends Ecol Evol 31(9):720–733

  27. Ibrahim RA, Oliva D, Ewees AA, Lu S (2017) Feature selection based on improved runner-root algorithm using chaotic singer map and opposition-based learning. In: Liu D, Xie S, Li Y, Zhao D, El-Alfy ES (eds) International conference on neural information processing. Springer, Berlin, pp 156–166

  28. Ibrahim RA, Elaziz MA, Ewees AA, Selim IM, Lu S (2018) Galaxy images classification using hybrid brain storm optimization with moth flame optimization. J Astron Telesc Instrum Syst 4(3):038001

  29. Inbarani HH, Azar AT, Jothi G (2014) Supervised hybrid feature selection based on PSO and rough sets for medical diagnosis. Comput Methods Prog Biomed 113(1):175–185

  30. Jensen R, Goodarzi M, Freitas MP (2009) Feature selection and linear/nonlinear regression methods for the accurate prediction of glycogen synthase kinase-3beta inhibitory activities. J Chem Inf Model 49:824–832

  31. Karaboga D, Basturk B (2007) A powerful and efficient algorithm for numerical function optimization: artificial bee colony (ABC) algorithm. J Global Optim 39(3):459–471

  32. Karnan M, Thangavel K, Sivakuar R, Geetha K (2006) Ant colony optimization for feature selection and classification of microcalcifications in digital mammograms. In: 2006 International Conference on Advanced Computing and Communications (ADCOM 2006). IEEE, pp 298–303

  33. Kira K, Rendell LA (1992) The feature selection problem: traditional methods and a new algorithm. In: Proc AAAI 1992, San Jose, CA, pp 129–134

  34. Kohane IS, Butte AJ, Kho A (2002) Microarrays for an integrative genomics. MIT press, Cambridge

  35. Kohavi R (1994) Feature subset selection using the wrapper method, overfitting and dynamic search space topology. In: Proc AAAI Fall Symposium on Relevance, pp 109–113

  36. Kung SY, Luo Y, Mak MW (2010) Feature selection for genomic signal processing: unsupervised, supervised, and self-supervised scenarios. J Signal Process Syst 61(1):3–20

  37. Lai C, Reinders MJ, Wessels L (2006) Random subspace method for multivariate feature selection. Pattern Recognit Lett 27(10):1067–1076

  38. Li X, Wang G (2015) Optimal band selection for hyperspectral data with improved differential evolution. J Ambient Intell Humaniz Comput 6(5):675–688

  39. Li J, Wong L (2002) Identifying good diagnostic genes or genes groups from gene expression data by using the concept of emerging patterns. Bioinformatics 18:725–734

  40. Liu Y, Wang G, Chen H, Dong H, Zhu X, Wang S (2011) An improved particle swarm optimization for feature selection. J Bionic Eng 8(2):191–200

  41. Madin LP (1990) Aspects of jet propulsion in salps. Can J Zool 68(4):765–777

  42. Menghour K, Souici-Meslati L (2016) Hybrid ACO-PSO based approaches for feature selection. Int J Intell Eng Syst 9(3):65–79

  43. Mirjalili S, Mirjalili SM, Lewis A (2014) Grey wolf optimizer. Adv Eng Softw 69:46–61

  44. Mirjalili S, Gandomi AH, Mirjalili SZ, Saremi S, Faris H, Mirjalili SM (2017) Salp swarm algorithm: a bio-inspired optimizer for engineering design problems. Adv Eng Softw 114:163–191

  45. Modrzejewski M (1993) Feature selection using rough sets theory. In: Proceedings of the European Conference on Machine Learning, Vienna, Austria, pp 213–226

  46. Moradi P, Rostami M (2015) Integration of graph clustering with ant colony optimization for feature selection. Knowl Based Syst 84:144–161

  47. Neumann J, Schnörr C, Steidl G (2005) Combined SVM-based feature selection and classification. Mach Learn 61(1–3):129–150

  48. Niknam T, Amiri B (2010) An efficient hybrid approach based on PSO, ACO and k-means for cluster analysis. Appl Soft Comput 10(1):183–197

  49. Noman S, Shamsuddin SM, Hassanien AE (2009) Hybrid learning enhancement of RBF network with particle swarm optimization. In: Hassanien AE, Abraham A, Vasilakos AV, Pedrycz W (eds) Foundations of computational intelligence, vol 1. Springer, Berlin, pp 381–397

  50. Prabukumar M, Agilandeeswari L, Ganesan K (2017) An intelligent lung cancer diagnosis system using cuckoo search optimization and support vector machine classifier. J Ambient Intell Humaniz Comput. https://doi.org/10.1007/s12652-017-0655-5

  51. Raymer ML, Punch WF, Goodman ED, Kuhn LA, Jain AK (2000) Dimensionality reduction using genetic algorithms. IEEE Trans Evol Comput 4(2):164–171

  52. Rodrigues D, Pereira LA, Nakamura RY, Costa KA, Yang XS, Souza AN, Papa JP (2014) A wrapper approach for feature selection based on bat algorithm and optimum-path forest. Expert Syst Appl 41(5):2250–2258

  53. Saravanan RA, Rajesh Babu M (2017) Enhanced text mining approach based on ontology for clustering research project selection. J Ambient Intell Humaniz Comput. https://doi.org/10.1007/s12652-017-0637-7

  54. Sutherland KR, Weihs D (2017) Hydrodynamic advantages of swimming by salp chains. J R Soc Interface 14(133):20170298

  55. Tanaka K, Kurita T, Kawabe T (2007) Selection of import vectors via binary particle swarm optimization and cross-validation for kernel logistic regression. In: Proceedings of the International Joint Conference on Neural Networks (IJCNN 07). IEEE, Orlando, FL, USA, pp 1037–1042

  56. Thangavel K, Velayutham C (2011) Mammogram image analysis: bioinspired computational approach. In: Proceedings of the International Conference on Soft Computing for Problem Solving, pp 20–22

  57. Unler A, Murat A (2010) A discrete particle swarm optimization method for feature selection in binary classification problems. Eur J Oper Res 206(3):528–539

  58. Wang Y, Cen Y, Zhao R, Zhang L, Kan S, Hu S (2018) Compressed sensing based feature fusion for image retrieval. J Ambient Intell Humaniz Comput. https://doi.org/10.1007/s12652-018-0895-z

  59. Xie ZX, Hu QH, Yu DR (2006) Improved feature selection algorithm based on SVM and correlation. In: International Symposium on Neural Networks. Springer, Berlin, pp 1373–1380

  60. Yamuna G, Thamaraichelvi B (2016) Hybrid firefly swarm intelligence based feature selection for medical data classification and segmentation in SVD–NSCT domain. Int J Adv Res 4(9):744–760

  61. Yang Y, Slattery S, Ghani R (2002) A study of approaches to hypertext categorization. J Intell Inform Syst 18(2):219–241

  62. Yao YY (2003) Information-theoretic measures for knowledge discovery and data mining. In: Karmeshu (ed) Entropy measures, maximum entropy principle and emerging applications. Springer, Berlin, Heidelberg, pp 115–136

  63. Yeh WC, Yang YT, Lai CM (2016) A hybrid simplified swarm optimization method for imbalanced data feature selection. Aust Acad Bus Econ Rev 2(3):263–275

  64. Zhang H, Sun G (2002) Feature selection using tabu search method. Pattern Recognit 35(3):701–711

  65. Zhong DJN (2001) Using rough sets with heuristics for feature selection. J Intell Inform Syst 16:199–214

Acknowledgements

This work is supported by the Science and Technology Program of Shenzhen of China under Grant Nos. JCYJ20170818160208570 and JCYJ20170307160458368.

Author information

Corresponding author

Correspondence to Songfeng Lu.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Ibrahim, R.A., Ewees, A.A., Oliva, D. et al. Improved salp swarm algorithm based on particle swarm optimization for feature selection. J Ambient Intell Human Comput 10, 3155–3169 (2019). https://doi.org/10.1007/s12652-018-1031-9

Keywords

  • Salp swarm algorithm
  • Particle swarm optimization
  • Feature selection
  • Global optimization
  • Swarm techniques