Investigating Multi-Operator Differential Evolution for Feature Selection
Classification algorithms are known to suffer performance degradation when the data contain a large number of features. Feature selection mitigates this problem by reducing the number of features in the data. Hence, in this paper, a feature selection approach based on a multi-operator differential evolution algorithm is proposed. The algorithm partitions the initial population into a number of sub-populations, each evolving under a distinct mutation strategy drawn from a pool. Periodically, the sub-populations exchange information to enhance their diversity. This multi-operator design reduces the sensitivity of standard differential evolution to the choice of a single mutation strategy. Two classifiers, namely decision trees and k-nearest neighbors, are used to evaluate the generated feature subsets. Experimental analysis has been conducted on several real data sets using 10-fold cross-validation. The analysis shows that the proposed algorithm successfully determines efficient feature subsets, which improve the classification accuracy of both classifiers under consideration. The usefulness of the proposed method on a large-scale data set has been demonstrated using the KDD Cup 1999 intrusion data set, from which it effectively removes irrelevant features.
Keywords: Feature Selection · Classification Accuracy · Differential Evolution · Intrusion Detection · Feature Subset
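The multi-operator scheme in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: all names are hypothetical, a toy fitness function stands in for the classifier wrapper (in the paper, decision-tree or k-NN accuracy under 10-fold cross-validation), and three classic DE mutation strategies form the operator pool, one per sub-population, with periodic migration of each sub-population's best individual.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-in fitness: the paper scores each feature subset by classifier
# accuracy under 10-fold cross-validation; here we simply reward selecting
# the first five "relevant" features and penalize subset size (hypothetical).
RELEVANT = set(range(5))

def fitness(mask):
    chosen = set(np.flatnonzero(mask))
    return len(chosen & RELEVANT) - 0.1 * len(chosen)

def decode(x):
    # Continuous DE vector in [0, 1]^d -> binary feature mask (threshold 0.5).
    return x > 0.5

# Operator pool: three standard DE mutation strategies.
def rand1(pop, i, best, F=0.5):
    a, b, c = pop[rng.choice(len(pop), 3, replace=False)]
    return a + F * (b - c)

def best1(pop, i, best, F=0.5):
    a, b = pop[rng.choice(len(pop), 2, replace=False)]
    return best + F * (a - b)

def curr_to_best1(pop, i, best, F=0.5):
    a, b = pop[rng.choice(len(pop), 2, replace=False)]
    return pop[i] + F * (best - pop[i]) + F * (a - b)

def mode_feature_selection(n_features=20, subpop_size=10, gens=60,
                           migrate_every=10, CR=0.9):
    strategies = [rand1, best1, curr_to_best1]
    # Partition the population: one sub-population per mutation strategy.
    subpops = [rng.random((subpop_size, n_features)) for _ in strategies]
    for g in range(gens):
        for pop, mutate in zip(subpops, strategies):
            fits = np.array([fitness(decode(x)) for x in pop])
            best = pop[fits.argmax()]
            for i in range(len(pop)):
                v = np.clip(mutate(pop, i, best), 0.0, 1.0)
                cross = rng.random(n_features) < CR
                cross[rng.integers(n_features)] = True  # force one gene over
                trial = np.where(cross, v, pop[i])
                if fitness(decode(trial)) >= fits[i]:   # greedy DE selection
                    pop[i] = trial
        # Periodic information exchange: best of sub-population k replaces a
        # random member of sub-population k+1, enhancing diversity.
        if (g + 1) % migrate_every == 0:
            bests = [p[np.argmax([fitness(decode(x)) for x in p])].copy()
                     for p in subpops]
            for k, p in enumerate(subpops):
                p[rng.integers(len(p))] = bests[(k + 1) % len(subpops)]
    allpop = np.vstack(subpops)
    scores = [fitness(decode(x)) for x in allpop]
    return decode(allpop[int(np.argmax(scores))])

mask = mode_feature_selection()
print("selected features:", sorted(np.flatnonzero(mask)))
```

Swapping the toy `fitness` for a cross-validated classifier score (e.g. scikit-learn's `cross_val_score` with `KNeighborsClassifier` on the masked columns) turns the sketch into the wrapper setup the paper evaluates.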
This work was supported by the Australian Centre for Cyber Security Research Funding Program, under grant no. PS38135.