Evolutionary and Swarm-Based Feature Selection for Imbalanced Data Classification

  • Feras Namous
  • Hossam Faris (Email author)
  • Ali Asghar Heidari
  • Monther Khalafat
  • Rami S. Alkhawaldeh
  • Nazeeh Ghatasheh
Chapter
Part of the Algorithms for Intelligent Systems book series (AIS)

Abstract

Recently, the feature selection task has gained more attention in classification problems. This task aims to find the most important features in a large search space of potential solutions, which makes finding the optimal solution challenging. In this chapter, we study a metaheuristic-based approach for feature selection in binary classification problems. The scenario involves several highly imbalanced datasets. To handle the problem of imbalanced data, the common fitness function based on classification accuracy is replaced with two more effective fitness functions: the area under the ROC curve and the geometric mean. To evaluate the effectiveness of the developed approach, two popular metaheuristic algorithms are tested with the three fitness functions for classifying six imbalanced datasets. The chapter discusses the impact of the chosen fitness function on the final performance of the proposed methods. The results demonstrate that some fitness functions, such as the accuracy rate, can mislead the identification of relevant features in imbalanced datasets.
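
To illustrate why accuracy can be a misleading fitness signal on imbalanced data, here is a minimal Python sketch. It is not the chapter's implementation: the synthetic 95/5 class split, the trivial majority-class predictor, and all function names are assumptions chosen for illustration. It compares accuracy with the geometric mean of sensitivity and specificity and with the area under the ROC curve (computed here through the rank-based Mann-Whitney formulation).

```python
import math

def confusion_counts(y_true, y_pred):
    """Return (tp, tn, fp, fn) for binary labels in {0, 1}."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, tn, fp, fn

def accuracy(y_true, y_pred):
    tp, tn, _, _ = confusion_counts(y_true, y_pred)
    return (tp + tn) / len(y_true)

def geometric_mean(y_true, y_pred):
    """Square root of sensitivity times specificity."""
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    return math.sqrt(sensitivity * specificity)

def auc(y_true, scores):
    """AUC as the probability that a random positive outscores a random
    negative, counting ties as one half."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Synthetic, illustrative data: 95 majority-class and 5 minority-class samples.
y_true = [0] * 95 + [1] * 5

# A degenerate model that always predicts the majority class.
majority_pred = [0] * 100
majority_scores = [0.0] * 100

acc = accuracy(y_true, majority_pred)
gmean = geometric_mean(y_true, majority_pred)
roc_auc = auc(y_true, majority_scores)

print(f"accuracy={acc:.2f}  g-mean={gmean:.2f}  AUC={roc_auc:.2f}")
```

On this toy data the degenerate model scores high on accuracy while the geometric mean drops to zero and the AUC sits at chance level, so a search guided by accuracy alone could reward feature subsets that ignore the minority class entirely.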

Keywords

Feature selection; Evolutionary neural networks; Imbalanced data; Classification

Copyright information

© Springer Nature Singapore Pte Ltd. 2020

Authors and Affiliations

  • Feras Namous (1)
  • Hossam Faris (1, Email author)
  • Ali Asghar Heidari (2, 3)
  • Monther Khalafat (1)
  • Rami S. Alkhawaldeh (4)
  • Nazeeh Ghatasheh (4)

  1. King Abdullah II School for Information Technology, The University of Jordan, Amman, Jordan
  2. School of Surveying and Geospatial Engineering, College of Engineering, University of Tehran, Tehran, Iran
  3. Department of Computer Science, School of Computing, National University of Singapore, Singapore, Singapore
  4. Faculty of Information Technology and Systems, The University of Jordan, Aqaba, Jordan
