Abstract
The advent of Big Data, and especially of datasets with high dimensionality, has made it necessary to identify the relevant features of the data. In this scenario, the importance of feature selection is beyond doubt, and different methods have been developed, although researchers do not agree on which one is best for any given setting. This chapter provides the reader with the foundations of feature selection (see Sect. 2.1) as well as a description of state-of-the-art feature selection methods (Sect. 2.2). These methods are then analyzed on several synthetic datasets (Sect. 2.3) in order to draw conclusions about their performance when dealing with a growing number of irrelevant features, noise in the data, redundancy and interaction between attributes, and a small ratio between the number of samples and the number of features. Finally, in Sect. 2.4, some state-of-the-art methods are analyzed to study their scalability, i.e. the impact of an increase in the size of the training set on the computational performance of an algorithm in terms of accuracy, training time and stability.
Part of the content of this chapter was previously published in Knowledge and Information Systems (https://doi.org/10.1007/s10115-012-0487-8 and https://doi.org/10.1007/s10115-017-1140-3).
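As a rough illustration of the kind of synthetic-data experiment the abstract describes, the sketch below builds a toy dataset with a known set of relevant features, pads it with irrelevant ones, and checks whether a simple ranker recovers the relevant features. Everything in it is an assumption made for illustration: the data generator, the function names (`make_synthetic`, `rank_features`, `relevant_recovered`) and the univariate correlation filter are hypothetical and are not taken from the chapter, whose datasets and methods may differ.

```python
import random
from math import sqrt


def make_synthetic(n_samples, n_irrelevant, seed=0):
    """Toy data (hypothetical): features 0 and 1 determine the class,
    the remaining n_irrelevant features are pure Gaussian noise."""
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n_samples):
        row = [rng.gauss(0.0, 1.0) for _ in range(2 + n_irrelevant)]
        X.append(row)
        y.append(1 if row[0] + row[1] > 0 else 0)
    return X, y


def pearson(a, b):
    """Pearson correlation of two equal-length sequences (0.0 if degenerate)."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b))
    var_a = sum((u - mean_a) ** 2 for u in a)
    var_b = sum((v - mean_b) ** 2 for v in b)
    if var_a == 0 or var_b == 0:
        return 0.0
    return cov / sqrt(var_a * var_b)


def rank_features(X, y):
    """Univariate filter: order feature indices by |correlation with the class|."""
    n_features = len(X[0])
    scores = [abs(pearson([row[j] for row in X], y)) for j in range(n_features)]
    return sorted(range(n_features), key=lambda j: scores[j], reverse=True)


def relevant_recovered(n_irrelevant, n_samples=500, seed=0):
    """True if the two top-ranked features are exactly the relevant ones."""
    X, y = make_synthetic(n_samples, n_irrelevant, seed)
    return sorted(rank_features(X, y)[:2]) == [0, 1]


if __name__ == "__main__":
    # Grow the number of irrelevant features and see if the ranking holds up.
    for k in (10, 50, 200):
        print(k, relevant_recovered(k))
```

Because the relevant features are known in advance, the ranking can be scored directly rather than through a downstream classifier. A univariate filter like this one evaluates each feature in isolation, so it would not be expected to detect relevance that only appears through interaction between attributes.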
© 2018 Springer International Publishing AG, part of Springer Nature
About this chapter
Cite this chapter
Bolón-Canedo, V., Alonso-Betanzos, A. (2018). Feature Selection. In: Recent Advances in Ensembles for Feature Selection. Intelligent Systems Reference Library, vol 147. Springer, Cham. https://doi.org/10.1007/978-3-319-90080-3_2
DOI: https://doi.org/10.1007/978-3-319-90080-3_2
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-90079-7
Online ISBN: 978-3-319-90080-3
eBook Packages: Engineering, Engineering (R0)