Efficient feature selection using one-pass generalized classifier neural network and binary bat algorithm with a novel fitness function

  • Methodologies and Application
  • Published in Soft Computing

Abstract

In high-dimensional data, many features are either irrelevant to the machine-learning task or redundant. This leads to two problems: overfitting and high computational overhead. This paper proposes a feature selection method that identifies the relevant subset of features for the machine-learning task using a wrapper approach. The wrapper uses the Binary Bat Algorithm to select candidate feature subsets and a One-pass Generalized Classifier Neural Network (OGCNN) to evaluate each subset with a novel fitness function. The proposed fitness function accounts for the entropy of sensitivity and specificity along with the classifier's accuracy and the fraction of selected features. The fitness function is compared using four classifiers (Radial Basis Function Neural Network, Probabilistic Neural Network, Extreme Learning Machine and OGCNN) on six publicly available datasets. One-pass classifiers are chosen because they are computationally faster. The results suggest that OGCNN combined with the novel fitness function performs well in the majority of cases.
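The abstract only names the ingredients of the fitness function; the exact functional form, signs, and weights are not reproduced on this page. The sketch below is therefore a minimal Python illustration, assuming a weighted linear combination in which accuracy is rewarded while the entropy term and the fraction of selected features are penalized. The weights `w_acc`, `w_ent`, `w_feat`, the penalty signs, and the helper `binary_entropy` are hypothetical choices, not the authors' published formulation.

```python
import numpy as np

def binary_entropy(p):
    """Shannon entropy (in bits) of a Bernoulli variable with parameter p."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -(p * np.log2(p) + (1.0 - p) * np.log2(1.0 - p))

def fitness(y_true, y_pred, mask, w_acc=0.8, w_ent=0.1, w_feat=0.1):
    """Score one candidate feature subset (higher is better).

    y_true, y_pred : binary labels and classifier predictions on held-out data
    mask           : boolean vector, True where a feature is selected
    w_acc, w_ent, w_feat : hypothetical weights, not taken from the paper
    """
    y_true = np.asarray(y_true, dtype=int)
    y_pred = np.asarray(y_pred, dtype=int)
    mask = np.asarray(mask, dtype=bool)

    tp = np.sum((y_true == 1) & (y_pred == 1))
    tn = np.sum((y_true == 0) & (y_pred == 0))
    fp = np.sum((y_true == 0) & (y_pred == 1))
    fn = np.sum((y_true == 1) & (y_pred == 0))

    accuracy = (tp + tn) / len(y_true)
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0

    # One possible reading of the "entropy of sensitivity and specificity":
    # average the binary entropies, which is largest when either rate
    # hovers around 0.5 (an uninformative classifier).
    ent = 0.5 * (binary_entropy(sensitivity) + binary_entropy(specificity))

    # Fraction of features retained by this candidate subset.
    feat_fraction = np.count_nonzero(mask) / mask.size

    # Reward accuracy, penalize the entropy term and large feature subsets.
    return w_acc * accuracy - w_ent * ent - w_feat * feat_fraction
```

In the wrapper described above, each bat's binary position vector would play the role of `mask`: the classifier is trained on the columns it selects, its validation predictions are scored by this function, and the score steers the Binary Bat Algorithm's position updates.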

Author information

Corresponding author

Correspondence to Akshata K. Naik.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Ethical approval

This article does not contain any studies with human participants or animals performed by any of the authors.

Additional information

Communicated by V. Loia.

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Naik, A.K., Kuppili, V. & Edla, D.R. Efficient feature selection using one-pass generalized classifier neural network and binary bat algorithm with a novel fitness function. Soft Comput 24, 4575–4587 (2020). https://doi.org/10.1007/s00500-019-04218-6
