Abstract
The “curse of dimensionality” caused by high-dimensional datasets not only imposes high memory and computational costs but also degrades the performance of learning methods. The main purpose of feature selection is to reduce the dimensionality of these datasets by discarding redundant and irrelevant features, which improves the performance of the learning algorithm. This paper presents a new feature selection algorithm, referred to as FSBACOM, based on the binary ant system (BAS). The proposed method seeks to improve feature selection by decreasing redundancy, and to reach an optimum solution in a short time while exploring a larger search space. For this purpose, the features are organized sequentially in a circular graph, where each feature is connected to the next one by two edges, select and deselect. This representation of the search space reduces computational time significantly, particularly on high-dimensional datasets. Inspired by genetic algorithms and simulated annealing, a damped mutation strategy is introduced to avoid falling into local optima. In addition, a new idea is used to reduce the redundancy between selected features as far as possible. The performance of the proposed algorithm is compared with that of state-of-the-art feature selection algorithms using different classifiers on real-world datasets. The experimental results confirm that FSBACOM significantly reduces computational time and achieves better performance than the other feature selection methods.
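The abstract gives no formulas, so the following is only an illustrative sketch of the three ideas it names, not the authors' FSBACOM implementation. It assumes that each feature carries one pheromone value per edge (deselect, select), that an ant picks the select edge with probability proportional to its pheromone, that the mutation rate decays geometrically with the iteration number, and that redundancy is measured as the mean absolute Pearson correlation between selected features. All function names and parameters (`ant_walk`, `mutation_rate`, `mutate`, `mean_redundancy`, `initial_rate`, `damping`) are hypothetical.

```python
import math
import random


def ant_walk(pheromone, rng):
    """One ant's tour over the circular feature graph; returns a 0/1 mask.

    pheromone[i] is an assumed (deselect, select) pair with a positive sum.
    """
    n = len(pheromone)
    start = rng.randrange(n)  # on a circle, any feature can be the start
    mask = [0] * n
    for step in range(n):
        i = (start + step) % n  # move to the next feature, wrapping around
        deselect, select = pheromone[i]
        p_select = select / (select + deselect)
        mask[i] = 1 if rng.random() < p_select else 0
    return mask


def mutation_rate(initial_rate, damping, iteration):
    """Assumed geometric damping: the rate shrinks as iterations advance."""
    return initial_rate * damping ** iteration


def mutate(mask, rate, rng):
    """Flip each bit independently with probability `rate`."""
    return [1 - bit if rng.random() < rate else bit for bit in mask]


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if constant)."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    denom = math.sqrt(var_x * var_y)
    return cov / denom if denom else 0.0


def mean_redundancy(columns, mask):
    """Mean absolute correlation over all pairs of selected feature columns."""
    selected = [col for col, bit in zip(columns, mask) if bit]
    pairs = [(a, b) for k, a in enumerate(selected) for b in selected[k + 1:]]
    if not pairs:
        return 0.0
    return sum(abs(pearson(a, b)) for a, b in pairs) / len(pairs)


if __name__ == "__main__":
    rng = random.Random(0)
    columns = [[1, 2, 3, 4], [2, 4, 6, 8], [4, 1, 3, 2]]  # toy feature columns
    pheromone = [(1.0, 1.0)] * len(columns)               # uniform start
    mask = ant_walk(pheromone, rng)
    mask = mutate(mask, mutation_rate(0.2, 0.9, iteration=5), rng)
    print(mask, mean_redundancy(columns, mask))
```

Because each feature offers only two outgoing edges, one tour makes a single binary decision per feature, so its cost grows linearly with the number of features. That is one plausible reading of why such a representation helps on high-dimensional data.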
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
About this article
Cite this article
Manbari, Z., Akhlaghian Tab, F. & Salavati, C. Fast unsupervised feature selection based on the improved binary ant system and mutation strategy. Neural Comput & Applic 31, 4963–4982 (2019). https://doi.org/10.1007/s00521-018-03991-z
DOI: https://doi.org/10.1007/s00521-018-03991-z