A histogram based fuzzy ensemble technique for feature selection

  • Research Paper
  • Published in Evolutionary Intelligence

Abstract

Feature selection (FS) is an integral part of many machine learning problems, yielding classification models that are both more accurate and more time-efficient. In recent years, many new FS algorithms have been proposed that combine well-established algorithms in order to overcome the drawbacks of the individual constituents. The usual way of combining them is to let them operate consecutively or simultaneously; such rudimentary combinations often fail to properly exploit the advantages of the individual algorithms, which calls for an alternative approach to combination. In the proposed method, each constituent algorithm first runs to completion without interruption and produces its own results. After the most dominant features have been selected, the remainder of the combination is performed using the concept of a histogram, assigning a weightage to the fuzzy features based on the quality of the candidate solutions in which they appear. The outcomes of three popular algorithms with complementary exploitation–exploration trade-offs, namely the genetic algorithm (GA), binary particle swarm optimisation (BPSO) and ant colony optimisation (ACO), are combined in this way. Fourteen popular UCI datasets are then used to evaluate the proposed FS method. The results obtained by the proposed ensemble are compared with those of some popular FS models, such as the gravitational search algorithm, the histogram-based multi-objective GA, GA, BPSO and ACO, and the comparison shows that our algorithm outperforms the others.
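
To make the combination step concrete, the following is a minimal Python sketch of a histogram-based fuzzy ensemble of the kind described above. It assumes each candidate solution is a binary feature mask paired with its classification accuracy; the function name ensemble_select, the dominance threshold and the number of fuzzy features admitted are illustrative assumptions, not the authors' exact formulation.

# Illustrative sketch only: the thresholds and the weighting rule are assumptions,
# not the exact procedure of the paper.
from typing import List, Tuple


def ensemble_select(
    candidates: List[Tuple[List[int], float]],  # (binary feature mask, accuracy of that subset)
    n_features: int,
    dominant_ratio: float = 0.8,  # assumed cut-off: keep a feature outright if selected this often
    fuzzy_keep: int = 5,          # assumed number of additional fuzzy features to admit
) -> List[int]:
    # 1. Histogram: count how many candidate subsets (e.g. best solutions
    #    returned by GA, BPSO and ACO) select each feature.
    counts = [0] * n_features
    for mask, _acc in candidates:
        for j, bit in enumerate(mask):
            counts[j] += bit

    # 2. Dominant features: selected by at least dominant_ratio of all candidates.
    threshold = dominant_ratio * len(candidates)
    dominant = [j for j in range(n_features) if counts[j] >= threshold]

    # 3. Fuzzy features: every remaining feature is weighted by the summed quality
    #    (accuracy) of the candidate solutions in which it appears.
    weights = [0.0] * n_features
    for mask, acc in candidates:
        for j, bit in enumerate(mask):
            if bit and j not in dominant:
                weights[j] += acc

    # 4. Admit the highest-weighted fuzzy features alongside the dominant ones.
    fuzzy = sorted(
        (j for j in range(n_features) if j not in dominant and weights[j] > 0),
        key=lambda j: weights[j],
        reverse=True,
    )[:fuzzy_keep]
    return sorted(dominant + fuzzy)


# Toy usage: three candidate subsets over six features with their accuracies.
pool = [([1, 1, 0, 1, 0, 0], 0.91),
        ([1, 0, 1, 1, 0, 0], 0.88),
        ([1, 1, 0, 0, 1, 0], 0.90)]
print(ensemble_select(pool, n_features=6, fuzzy_keep=2))  # -> [0, 1, 3]

In the paper's setting, these quantities would come from the histogram and the fitness values produced by GA, BPSO and ACO rather than the fixed constants used here.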

Author information


Corresponding author

Correspondence to Pawan Kumar Singh.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.


About this article

Cite this article

Ghosh, M., Guha, R., Singh, P.K. et al. A histogram based fuzzy ensemble technique for feature selection. Evol. Intel. 12, 713–724 (2019). https://doi.org/10.1007/s12065-019-00279-6
