Abstract
Feature selection (FS) can be defined as the problem of finding the smallest subset of features from an original set with minimum information loss. Because FS problems are known to be NP-hard, a fast and effective search algorithm is needed to tackle them. In this paper, two incremental hill-climbing techniques (QuickReduct and CEBARKCC) are hybridized with the binary ant lion optimizer (BALO) in a model called HBALO. In the proposed approach, a pool of solutions (ants) is generated randomly and then enhanced by embedding the most informative features in the dataset, as selected by the two filter feature selection models. The resulting population is then used by the BALO algorithm to find the best solution. The proposed binary approaches are tested on a set of 18 well-known datasets from the UCI repository and compared with the most recent related approaches. The experimental results show the superior performance of the proposed approaches in searching the feature space for optimal feature combinations.
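The seeding step described in the abstract can be illustrated with a minimal sketch. It assumes discrete feature values and uses the textbook form of QuickReduct (greedy forward selection on the rough-set dependency degree). CEBARKCC, the BALO search itself, and the exact way the paper embeds the selected features are not reproduced here. The names `dependency`, `quickreduct`, and `seeded_population`, the toy data, and the choice to force the selected bits to 1 in every ant are assumptions made for illustration only.

```python
import random
from collections import defaultdict


def dependency(rows, labels, subset):
    """Rough-set dependency degree of the labels on `subset`.

    Rows are grouped by their values on `subset`; a group counts as
    consistent when all of its rows share one label. The result is the
    fraction of rows that fall in consistent groups.
    """
    groups = defaultdict(list)
    for row, label in zip(rows, labels):
        groups[tuple(row[i] for i in subset)].append(label)
    consistent = sum(len(ys) for ys in groups.values() if len(set(ys)) == 1)
    return consistent / len(rows)


def quickreduct(rows, labels):
    """Textbook QuickReduct: greedily add the feature that raises the
    dependency degree most, until it matches that of the full feature set."""
    n_features = len(rows[0])
    target = dependency(rows, labels, list(range(n_features)))
    reduct = []
    current = dependency(rows, labels, reduct)
    while current < target:
        candidates = [f for f in range(n_features) if f not in reduct]
        # Ties are broken in favour of the lowest feature index.
        scored = [(dependency(rows, labels, reduct + [f]), -f) for f in candidates]
        current, neg_f = max(scored)
        reduct.append(-neg_f)
    return sorted(reduct)


def seeded_population(n_ants, n_features, seed_features, rng):
    """Random binary ants with the filter-selected features switched on.

    Assumption: "embedding" is modelled by forcing the seed bits to 1.
    """
    population = []
    for _ in range(n_ants):
        ant = [rng.randint(0, 1) for _ in range(n_features)]
        for f in seed_features:
            ant[f] = 1
        population.append(ant)
    return population


# Toy data (hypothetical): feature 0 alone determines the label.
ROWS = [(0, 0, 1), (0, 1, 0), (1, 0, 1), (1, 1, 0)]
LABELS = [0, 0, 1, 1]

REDUCT = quickreduct(ROWS, LABELS)  # → [0]
POPULATION = seeded_population(5, 3, REDUCT, random.Random(0))
```

In a full pipeline, `POPULATION` would be handed to a binary optimizer as its starting pool. Seeding every ant is only one possible design; seeding a fraction of the pool would preserve more initial diversity.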
Ethics declarations
Conflict of interest
All authors declare that there is no conflict of interest.
Ethical standard
This article does not contain any studies with human participants or animals performed by any of the authors.
Additional information
Communicated by V. Loia.
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
About this article
Cite this article
Mafarja, M.M., Mirjalili, S. Hybrid binary ant lion optimizer with rough set and approximate entropy reducts for feature selection. Soft Comput 23, 6249–6265 (2019). https://doi.org/10.1007/s00500-018-3282-y
DOI: https://doi.org/10.1007/s00500-018-3282-y