Abstract
Selecting an optimal feature subset from all available features in the data is a vital data pre-processing task. It serves several purposes: reducing dimensionality, reducing the computational cost of subsequent data processing (e.g., clustering, classification and regression), and enhancing the performance of the data processing technique. To serve these purposes, feature selection approaches, which are fundamentally categorized into filters and wrappers, try to eliminate irrelevant, redundant and erroneous features from the data. Each category has its own advantages and disadvantages: wrappers generally provide higher classification performance than filters, whereas filters are computationally more efficient. To bring the advantages of wrappers and filters together, i.e., to obtain higher classification performance with a smaller feature subset in a shorter time, this paper proposes a differential evolution approach that combines filter and wrapper approaches through an improved information-theoretic local search mechanism. This mechanism is based on the concepts of fuzziness so that it can cope with both continuous and discrete datasets. To show its superiority, the proposed approach is examined and compared with traditional and recent evolutionary feature selection approaches on several benchmarks from different well-known data repositories.
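The general idea of a hybrid wrapper–filter search driven by differential evolution can be illustrated with a minimal sketch. It is not the method proposed in the paper: the fuzzy information-theoretic local search is replaced here by a much simpler stand-in (absolute Pearson correlation as the filter score), the wrapper is assumed to be a leave-one-out 1-NN classifier, and all function names, parameter values and the toy dataset are hypothetical.

```python
import random

def loo_1nn_accuracy(X, y, mask):
    """Wrapper score: leave-one-out accuracy of 1-NN on the selected features."""
    idx = [j for j, m in enumerate(mask) if m]
    if not idx:
        return 0.0
    correct = 0
    for i in range(len(X)):
        best_d, best_label = float("inf"), None
        for k in range(len(X)):
            if k == i:
                continue
            d = sum((X[i][j] - X[k][j]) ** 2 for j in idx)
            if d < best_d:
                best_d, best_label = d, y[k]
        correct += best_label == y[i]
    return correct / len(X)

def abs_correlation(X, y, j):
    """Filter score (stand-in assumption): |Pearson correlation| of feature j with the label."""
    n = len(X)
    col = [row[j] for row in X]
    mx, my = sum(col) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(col, y))
    vx = sum((a - mx) ** 2 for a in col)
    vy = sum((b - my) ** 2 for b in y)
    return abs(cov) / ((vx * vy) ** 0.5) if vx > 0 and vy > 0 else 0.0

def fitness(X, y, mask, alpha=0.9):
    """Trade off wrapper accuracy against subset size (alpha is an assumed weight)."""
    return alpha * loo_1nn_accuracy(X, y, mask) + (1 - alpha) * (1 - sum(mask) / len(mask))

def decode(vector):
    """A feature is selected when its continuous gene exceeds 0.5."""
    return [v > 0.5 for v in vector]

def de_feature_selection(X, y, pop_size=10, gens=20, F=0.5, CR=0.9, seed=0):
    rng = random.Random(seed)
    d = len(X[0])
    relevance = [abs_correlation(X, y, j) for j in range(d)]  # filter ranking, computed once
    pop = [[rng.random() for _ in range(d)] for _ in range(pop_size)]
    fit = [fitness(X, y, decode(v)) for v in pop]
    for _ in range(gens):
        for i in range(pop_size):
            # DE/rand/1/bin: mutate three distinct other individuals, then binomial crossover
            a, b, c = rng.sample([k for k in range(pop_size) if k != i], 3)
            jrand = rng.randrange(d)
            trial = [
                min(1.0, max(0.0, pop[a][j] + F * (pop[b][j] - pop[c][j])))
                if (rng.random() < CR or j == jrand) else pop[i][j]
                for j in range(d)
            ]
            trial_fit = fitness(X, y, decode(trial))
            # Filter-guided local search: try dropping the least relevant selected feature
            selected = [j for j, m in enumerate(decode(trial)) if m]
            if len(selected) > 1:
                worst = min(selected, key=lambda j: relevance[j])
                cand = trial[:]
                cand[worst] = 0.0
                cand_fit = fitness(X, y, decode(cand))
                if cand_fit >= trial_fit:
                    trial, trial_fit = cand, cand_fit
            # Greedy selection: keep the trial only if it is at least as good
            if trial_fit >= fit[i]:
                pop[i], fit[i] = trial, trial_fit
    best = max(range(pop_size), key=lambda k: fit[k])
    return decode(pop[best]), fit[best]

# Toy data (hypothetical): feature 0 separates the classes, features 1-4 are noise.
data_rng = random.Random(1)
y = [i % 2 for i in range(20)]
X = [[10.0 * y[i] + 0.1 * (i % 5)] + [data_rng.random() for _ in range(4)] for i in range(20)]
mask, score = de_feature_selection(X, y)
print(mask, round(score, 3))
```

In this sketch the filter score is computed once and only steers the local move, while every accept/reject decision is made by the wrapper-based fitness; this is one simple way to keep the number of costly classifier evaluations low while still letting the classifier have the final say.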
Ethics declarations
Conflict of interest
The author declares that he has no conflict of interest. In addition, this article does not contain any studies with human participants or animals performed by the author. The undersigned author declares that this manuscript is original, has not been published before, and is not currently being considered for publication elsewhere.
Additional information
Communicated by V. Loia.
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
About this article
Cite this article
Hancer, E. Differential evolution for feature selection: a fuzzy wrapper–filter approach. Soft Comput 23, 5233–5248 (2019). https://doi.org/10.1007/s00500-018-3545-7
DOI: https://doi.org/10.1007/s00500-018-3545-7