Binary Harris Hawks Optimizer for High-Dimensional, Low Sample Size Feature Selection

  • Thaer Thaher
  • Ali Asghar Heidari
  • Majdi Mafarja
  • Jin Song Dong
  • Seyedali Mirjalili (Email author)
Chapter
Part of the Algorithms for Intelligent Systems book series (AIS)

Abstract

Feature selection is a preprocessing step that aims to eliminate features that may degrade the performance of machine learning techniques. This degradation typically stems from the presence of many irrelevant and/or redundant features. In this chapter, a binary variant of the recent Harris hawks optimizer (HHO) is proposed to boost the efficacy of wrapper-based feature selection techniques. HHO is a new, fast, and efficient swarm-based optimizer with a dynamic structure and various simple but effective exploratory and exploitative mechanisms (Levy flight, greedy selection, etc.). However, it was originally designed for continuous search spaces, so the binary HHO proposed here adapts it to binary feature spaces. The binary HHO is validated on a special type of feature selection dataset: hard datasets that are high dimensional, meaning that they contain a huge number of features, while providing only a low number of samples. Various experiments and comparisons reveal the improved stability of HHO in dealing with this type of dataset.
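
As a rough illustration of what moving a continuous optimizer into a binary, wrapper-based feature selection setting can involve, the sketch below maps a continuous position vector to a 0/1 feature mask and scores that mask with a classifier. Everything specific in it is an assumption made for illustration and is not taken from the chapter: the S-shaped (sigmoid) transfer function, the leave-one-out 1-nearest-neighbour classifier, the weight `alpha`, and the function names `sigmoid_transfer`, `binarize`, `loo_1nn_error`, and `fitness`. The HHO position update itself is not shown.

```python
import math
import random


def sigmoid_transfer(x):
    """S-shaped transfer function (assumed here): real value -> probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def binarize(position, rng):
    """Turn a continuous position vector into a 0/1 feature mask.

    Each dimension is set to 1 with probability sigmoid_transfer(x).
    """
    return [1 if rng.random() < sigmoid_transfer(x) else 0 for x in position]


def loo_1nn_error(X, y, mask):
    """Leave-one-out error rate of a 1-nearest-neighbour classifier
    that only looks at the features selected by mask."""
    selected = [j for j, bit in enumerate(mask) if bit]
    if not selected:
        return 1.0  # treat an empty feature subset as the worst case
    errors = 0
    for i in range(len(X)):
        best_dist, best_label = float("inf"), None
        for k in range(len(X)):
            if k == i:
                continue  # leave the query sample out
            dist = sum((X[i][j] - X[k][j]) ** 2 for j in selected)
            if dist < best_dist:
                best_dist, best_label = dist, y[k]
        errors += best_label != y[i]
    return errors / len(X)


def fitness(X, y, mask, alpha=0.99):
    """Wrapper fitness (lower is better): weighted sum of the error rate
    and the fraction of features kept. alpha is a hypothetical weight."""
    ratio = sum(mask) / len(mask)
    return alpha * loo_1nn_error(X, y, mask) + (1 - alpha) * ratio


if __name__ == "__main__":
    # Toy data: feature 0 separates the classes, feature 1 is misleading.
    X = [[0.0, 5.0], [0.1, -3.0], [1.0, 4.9], [1.1, -3.1]]
    y = [0, 0, 1, 1]
    rng = random.Random(0)
    mask = binarize([2.0, -2.0], rng)  # a hypothetical continuous position
    print(mask, fitness(X, y, mask))
```

In a full wrapper, an optimizer would repeatedly propose positions, binarize them, and keep the masks with the lowest fitness. With few samples and many features, the error estimate is noisy, which is one reason stability across runs matters for this kind of dataset.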

Keywords

Harris Hawk optimizer; Optimization; Feature selection; Neural networks; Artificial intelligence; Machine learning; Data science

Copyright information

© Springer Nature Singapore Pte Ltd. 2020

Authors and Affiliations

  • Thaer Thaher (1, 2)
  • Ali Asghar Heidari (3, 4)
  • Majdi Mafarja (5)
  • Jin Song Dong (4, 6)
  • Seyedali Mirjalili (7, 8), Email author

  1. College of Engineering and Information Technology, An-Najah National University, Nablus, Palestine
  2. Department of Information Technology, At-Tadamun Society, Nablus, Palestine
  3. School of Surveying and Geospatial Engineering, College of Engineering, University of Tehran, Tehran, Iran
  4. Department of Computer Science, School of Computing, National University of Singapore, Singapore, Singapore
  5. Department of Computer Science, Faculty of Engineering and Technology, Birzeit University, Birzeit, Palestine
  6. Institute for Integrated and Intelligent Systems, Griffith University, Nathan, Brisbane, Australia
  7. Torrens University Australia, Brisbane, Australia
  8. Griffith University, Brisbane, Australia