Evolutionary Intelligence, Volume 12, Issue 4, pp 713–724

A histogram based fuzzy ensemble technique for feature selection

  • Manosij Ghosh
  • Ritam Guha
  • Pawan Kumar Singh (corresponding author)
  • Vikrant Bhateja
  • Ram Sarkar
Research Paper

Abstract

Feature selection (FS) is an integral part of many machine learning problems, as it helps to provide a better and more time-efficient classification model. In recent times, many new FS algorithms have been proposed that combine well-established algorithms to overcome the drawbacks of the constituent algorithms. The usual way of combining them is to let them operate consecutively or simultaneously. In many cases, these rudimentary combinations do not properly carry over the advantages of the specific algorithms, which necessitates an alternative approach to combination. Initially, we allow the algorithms to generate their results without interrupting their flow. After the most dominant features are selected, the rest of the combination is done using the concept of a histogram, assigning a weight to the fuzzy features based on the quality of the candidate solutions in which they appear. In the proposed method, the outcomes of three popularly used algorithms with complementary exploitation–exploration trade-offs, namely genetic algorithm (GA), binary particle swarm optimisation (BPSO) and ant colony optimisation (ACO), are combined. The proposed FS method is evaluated on 14 popular UCI datasets. The results obtained by the proposed ensemble are compared with those of some popular FS models, namely gravitational search algorithm, histogram based multi-objective GA, GA, BPSO and ACO, and they show that our algorithm outperforms the others.
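
The abstract does not spell out the combination rule, so the following is only a minimal sketch of one way a histogram-style, quality-weighted combination of feature subsets could look. It is not the authors' implementation. The function name histogram_ensemble, the rule that a feature chosen by every candidate solution counts as dominant, the normalisation of weights by total quality, and the threshold parameter are all assumptions made for illustration.

```python
from typing import List, Sequence, Tuple


def histogram_ensemble(
    solutions: Sequence[Sequence[int]],
    qualities: Sequence[float],
    threshold: float = 0.5,
) -> Tuple[List[int], List[float]]:
    """Combine binary feature masks into a single feature subset.

    solutions : one 0/1 mask per candidate solution (e.g. from GA, BPSO, ACO).
    qualities : a quality score per candidate solution (e.g. accuracy).
    threshold : hypothetical cut-off on the normalised weight of a feature.
    Returns the selected feature indices and the normalised weights.
    """
    if not solutions or len(solutions) != len(qualities):
        raise ValueError("need one quality score per candidate solution")

    n_features = len(solutions[0])
    counts = [0] * n_features     # plain histogram: how often a feature appears
    weights = [0.0] * n_features  # histogram weighted by solution quality

    for mask, quality in zip(solutions, qualities):
        if len(mask) != n_features:
            raise ValueError("all masks must have the same length")
        for j, bit in enumerate(mask):
            if bit:
                counts[j] += 1
                weights[j] += quality

    total_quality = sum(qualities)
    if total_quality > 0:
        normalised = [w / total_quality for w in weights]
    else:
        normalised = [0.0] * n_features

    selected = []
    for j in range(n_features):
        if counts[j] == len(solutions):
            # Assumed rule: chosen by every candidate solution -> dominant.
            selected.append(j)
        elif normalised[j] >= threshold:
            # Assumed rule: remaining features compete on weighted votes.
            selected.append(j)
    return selected, normalised
```

With three masks [1, 1, 0, 0], [1, 0, 1, 0] and [1, 1, 0, 1] and qualities 0.9, 0.6 and 0.8, feature 0 is kept as dominant and feature 1 is kept because the two better solutions both contain it, while features 2 and 3 fall below the assumed threshold.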

Keywords

Feature selection; Histogram based fuzzy ensemble; Genetic algorithm; Binary particle swarm optimisation; Ant colony optimisation; UCI dataset

Notes

Compliance with ethical standards

Conflict of interest

The authors declare that they have no conflict of interest.

Copyright information

© Springer-Verlag GmbH Germany, part of Springer Nature 2019

Authors and Affiliations

  1. Department of Computer Science and Engineering, Jadavpur University, Kolkata, India
  2. Electronics and Communication Engineering Department, Shri Ramswaroop Memorial Group of Professional Colleges, Lucknow, India
