Efficient feature selection using one-pass generalized classifier neural network and binary bat algorithm with a novel fitness function

Abstract

In high-dimensional data, many features are either irrelevant to the machine learning task or redundant. This leads to two problems: overfitting and high computational overhead. This paper proposes a wrapper-based feature selection method that identifies the relevant subset of features for the machine learning task. The wrapper uses the Binary Bat algorithm to select candidate feature subsets and a One-pass Generalized Classifier Neural Network (OGCNN) to evaluate each subset with a novel fitness function. The proposed fitness function accounts for the entropy of sensitivity and specificity, along with the classifier's accuracy and the fraction of features selected. The fitness function is compared using four classifiers (Radial Basis Function Neural Network, Probabilistic Neural Network, Extreme Learning Machine and OGCNN) on six publicly available datasets. One-pass classifiers are chosen because they are computationally faster. The results suggest that OGCNN combined with the novel fitness function performs well in the majority of cases.
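
The abstract names the four ingredients of the fitness function but does not give its formula. The Python sketch below shows one way such a score could be assembled for a binary classification task. Everything specific in it is an assumption made for illustration and is not taken from the paper: the weighted-sum form, the weights `w_acc`, `w_ent` and `w_feat`, the function names, and the choice to compute the entropy over the normalised shares of sensitivity and specificity, so that it peaks when the two are equal.

```python
import math


def balance_entropy(sensitivity, specificity):
    """Entropy (base 2) of the normalised sensitivity/specificity shares.

    Assumed form: equals 1.0 when the two rates are equal and 0.0 when
    one of them is zero, so it rewards balanced classifiers.
    """
    total = sensitivity + specificity
    if total == 0:
        return 0.0
    entropy = 0.0
    for share in (sensitivity / total, specificity / total):
        if share > 0:
            entropy -= share * math.log2(share)
    return entropy


def confusion_counts(y_true, y_pred):
    """Return (tp, tn, fp, fn) for binary labels encoded as 0 and 1."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, tn, fp, fn


def fitness(y_true, y_pred, mask, w_acc=0.6, w_ent=0.2, w_feat=0.2):
    """Hypothetical wrapper fitness; higher is better.

    mask is a 0/1 list with one entry per feature (1 = selected).
    The weights are illustrative placeholders, not values from the paper.
    """
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    accuracy = (tp + tn) / len(y_true)
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    fraction_selected = sum(mask) / len(mask)
    return (
        w_acc * accuracy
        + w_ent * balance_entropy(sensitivity, specificity)
        + w_feat * (1.0 - fraction_selected)  # fewer features score higher
    )


if __name__ == "__main__":
    labels = [1, 1, 0, 0]
    mask = [1, 0, 0, 0]  # one of four features selected
    print(fitness(labels, [1, 1, 0, 0], mask))  # balanced, correct classifier
    print(fitness(labels, [1, 1, 1, 1], mask))  # always predicts the positive class
```

In a wrapper setting, a search procedure such as the Binary Bat algorithm would propose each `mask`, a classifier would be trained on the selected columns only, and its predictions would be scored this way. Under these assumptions, a classifier that always predicts one class is penalised twice: through lower accuracy and through a zero entropy term.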

Author information

Correspondence to Akshata K. Naik.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Ethical approval

This article does not contain any studies with human participants or animals performed by any of the authors.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Communicated by V. Loia.

About this article

Cite this article

Naik, A.K., Kuppili, V. & Edla, D.R. Efficient feature selection using one-pass generalized classifier neural network and binary bat algorithm with a novel fitness function. Soft Comput 24, 4575–4587 (2020). https://doi.org/10.1007/s00500-019-04218-6

Keywords

  • Feature selection
  • Wrapper approach
  • Bio-inspired algorithms
  • One-pass neural network