Fast unsupervised feature selection based on the improved binary ant system and mutation strategy

Abstract

The “curse of dimensionality” caused by high-dimensional datasets not only imposes high memory and computational costs but also degrades the performance of learning methods. Feature selection reduces the dimensionality of such datasets by discarding redundant and irrelevant features, which in turn improves the performance of the learning algorithm. In this paper, a new feature selection algorithm, referred to as FSBACOM, is presented based on the binary ant system (BAS). The proposed method seeks to improve feature selection by decreasing redundancy among the selected features and to reach an optimum solution in a short time while covering more of the search space. For this purpose, the features are organized sequentially in a circular graph, where each feature is connected to the next one by two edges, select and deselect. This representation of the search space reduces computational time significantly, particularly on high-dimensional datasets. Inspired by genetic algorithms and simulated annealing, a damped mutation strategy is introduced to avoid falling into local optima. In addition, a new mechanism is used to reduce the redundancy between selected features as far as possible. The performance of the proposed algorithm is compared with that of state-of-the-art feature selection algorithms using different classifiers on real-world datasets. The experimental results confirm that FSBACOM significantly reduces computational time and achieves better performance than the other feature selection methods.
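
The abstract gives only a high-level description, so the following is a minimal Python sketch of the general idea and not the authors' implementation. It assumes one pheromone pair per feature (deselect, select), an exponentially decaying bit-flip probability as the "damped" mutation, and mean absolute Pearson correlation as a stand-in redundancy measure. All function names, parameters (`p0`, `decay`, `rho`, `n_ants`, `n_iter`), and the pheromone update rule are hypothetical. The wrap-around edge of the circular graph is omitted because it does not affect any select/deselect choice.

```python
import math
import random


def construct_subset(pheromone, rng):
    """One ant visits the features in order and, at each one, takes either the
    deselect edge (0) or the select edge (1), weighted by pheromone."""
    subset = []
    for tau_deselect, tau_select in pheromone:
        p_select = tau_select / (tau_select + tau_deselect)
        subset.append(1 if rng.random() < p_select else 0)
    return subset


def damped_mutation(subset, iteration, rng, p0=0.2, decay=0.05):
    """Flip each bit with a probability that decays exponentially with the
    iteration count (assumed schedule, not taken from the paper)."""
    p_flip = p0 * math.exp(-decay * iteration)
    return [1 - bit if rng.random() < p_flip else bit for bit in subset]


def pearson(x, y):
    """Pearson correlation of two equal-length sequences; 0.0 if one is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def redundancy(columns, subset):
    """Mean absolute pairwise correlation among the selected feature columns.
    Requires at least two selected features."""
    chosen = [col for col, bit in zip(columns, subset) if bit]
    pairs = [(i, j) for i in range(len(chosen)) for j in range(i + 1, len(chosen))]
    return sum(abs(pearson(chosen[i], chosen[j])) for i, j in pairs) / len(pairs)


def search(columns, n_ants=10, n_iter=30, rho=0.1, seed=0):
    """Toy search loop: keep the least redundant subset seen so far and
    reinforce its edges after evaporating all pheromone by a factor (1 - rho)."""
    rng = random.Random(seed)
    pheromone = [[1.0, 1.0] for _ in columns]  # [deselect, select] per feature
    best, best_cost = None, float("inf")
    for iteration in range(n_iter):
        for _ in range(n_ants):
            subset = damped_mutation(construct_subset(pheromone, rng), iteration, rng)
            if sum(subset) < 2:  # redundancy needs at least one feature pair
                continue
            cost = redundancy(columns, subset)
            if cost < best_cost:
                best, best_cost = subset, cost
        if best is None:
            continue
        for i, bit in enumerate(best):
            pheromone[i][0] *= 1 - rho
            pheromone[i][1] *= 1 - rho
            pheromone[i][bit] += rho  # deposit on the edge the best subset used
    return best, best_cost
```

Calling `search(columns)` with a list of feature columns returns a 0/1 mask and its redundancy score. Because each ant makes one binary choice per feature, building a subset costs time linear in the number of features, which is consistent with the speed-up the abstract attributes to this representation. A practical objective would also need a relevance or subset-size term, since minimizing redundancy alone favors very small subsets.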

Author information

Corresponding author

Correspondence to Fardin Akhlaghian Tab.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Manbari, Z., Akhlaghian Tab, F. & Salavati, C. Fast unsupervised feature selection based on the improved binary ant system and mutation strategy. Neural Comput & Applic 31, 4963–4982 (2019). https://doi.org/10.1007/s00521-018-03991-z

Keywords

  • Data classification
  • High-dimensional data
  • Feature selection
  • Binary ant system
  • Filter approach
  • Mutation