Abstract
Dimensionality reduction is an active research area in data mining. Feature selection is an effective way to reduce dimensionality: it removes irrelevant information, makes the learned model easier to understand, and can improve prediction accuracy. In this paper, we address a challenge of filter methods: determining the number of significant features that achieves better performance. Filters do not evaluate feature subsets by accuracy; instead, they rank features by scores computed from certain criteria. To handle this challenge, we propose a hybrid filter-based approach to feature selection inspired by the concepts of chi-square, Relief-F, and mutual information. It assigns a score to each feature and then automatically determines a threshold value from the dataset in use, selecting the important subset of features used to build the model, which reduces the required execution time and memory. The proposed approach is analyzed empirically and theoretically to demonstrate its efficiency.
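The abstract does not state how the individual scores are combined or how the threshold is derived, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes discrete features, combines min-max normalised chi-square and mutual-information scores by simple averaging, and uses the mean combined score as the data-derived threshold. Relief-F is left out for brevity. The function names (`chi_square`, `mutual_information`, `select_features`), the averaging rule, and the mean-based threshold are all illustrative assumptions.

```python
import math
from collections import Counter


def chi_square(feature, labels):
    """Pearson chi-square statistic between a discrete feature and the class."""
    n = len(labels)
    f_counts = Counter(feature)
    c_counts = Counter(labels)
    joint = Counter(zip(feature, labels))
    stat = 0.0
    for f, nf in f_counts.items():
        for c, nc in c_counts.items():
            expected = nf * nc / n
            observed = joint.get((f, c), 0)
            stat += (observed - expected) ** 2 / expected
    return stat


def mutual_information(feature, labels):
    """Mutual information (in nats) between a discrete feature and the class."""
    n = len(labels)
    f_counts = Counter(feature)
    c_counts = Counter(labels)
    joint = Counter(zip(feature, labels))
    mi = 0.0
    for (f, c), nfc in joint.items():
        mi += (nfc / n) * math.log(nfc * n / (f_counts[f] * c_counts[c]))
    return mi


def min_max(scores):
    """Rescale scores to [0, 1] so different criteria are comparable."""
    lo, hi = min(scores), max(scores)
    if hi == lo:
        return [0.0 for _ in scores]
    return [(s - lo) / (hi - lo) for s in scores]


def select_features(columns, labels):
    """Score each feature column and keep those above a data-derived threshold.

    ASSUMPTION: scores are averaged and the threshold is the mean combined
    score; the source does not specify either rule.
    """
    chi = min_max([chi_square(col, labels) for col in columns])
    mi = min_max([mutual_information(col, labels) for col in columns])
    combined = [(a + b) / 2 for a, b in zip(chi, mi)]
    threshold = sum(combined) / len(combined)
    selected = [i for i, s in enumerate(combined) if s > threshold]
    return selected, combined, threshold


# Toy data (hypothetical): three discrete features, binary class.
labels = [0, 0, 0, 0, 1, 1, 1, 1]
columns = [
    [0, 0, 0, 0, 1, 1, 1, 1],  # feature 0: identical to the class
    [0, 1, 0, 1, 0, 1, 0, 1],  # feature 1: independent of the class
    [0, 0, 0, 1, 1, 1, 1, 0],  # feature 2: partly informative
]
selected, combined, threshold = select_features(columns, labels)
print(selected, combined, threshold)
```

On this toy data only feature 0 scores above the threshold. Because the threshold is computed from the scores themselves, no feature count has to be fixed in advance, which is the property the abstract emphasises.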
Copyright information
© 2020 Springer Nature Switzerland AG
About this paper
Cite this paper
Farghaly, H.M., Ali, A.A., El-Hafeez, T.A. (2020). Developing an Efficient Method for Automatic Threshold Detection Based on Hybrid Feature Selection Approach. In: Silhavy, R. (eds) Artificial Intelligence and Bioinspired Computational Methods. CSOC 2020. Advances in Intelligent Systems and Computing, vol 1225. Springer, Cham. https://doi.org/10.1007/978-3-030-51971-1_5
DOI: https://doi.org/10.1007/978-3-030-51971-1_5
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-51970-4
Online ISBN: 978-3-030-51971-1
eBook Packages: Intelligent Technologies and Robotics (R0)