Abstract
The volume of data is growing rapidly in fields such as social media, bioinformatics and health care. These data often contain redundant, irrelevant or noisy features, which inflate dimensionality. In data mining, feature selection refers to the tools and techniques for reducing inputs to a manageable size for processing and analysis. It is also used in dimensionality reduction, machine learning and other data mining applications. This paper presents a survey of feature selection methods for obtaining relevant features. It also introduces the genetic algorithm as a feature selection method for the detection and diagnosis of biological problems. The discussion of genetic algorithms focuses mainly on medicine, where they can help physicians solve complex problems. Finally, the paper concludes with challenges and applications of feature selection.
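As a rough illustration of how a genetic algorithm can be applied to feature selection, the sketch below evolves binary masks over the feature set. It is a minimal example under stated assumptions, not the method evaluated in the paper: the synthetic data, the nearest-centroid fitness function, the size penalty, and all names (`make_data`, `centroid_accuracy`, `ga_select`) and parameter values are hypothetical choices made only for this demonstration.

```python
import random


def make_data(rng, n_samples=200, n_features=6):
    """Synthetic data (assumption): only features 0 and 2 carry class signal."""
    X, y = [], []
    for i in range(n_samples):
        label = i % 2  # alternate labels so both classes are always present
        row = [rng.gauss(0.0, 1.0) for _ in range(n_features)]
        row[0] += 2.0 * label
        row[2] -= 2.0 * label
        X.append(row)
        y.append(label)
    return X, y


def centroid_accuracy(X_train, y_train, X_test, y_test, cols):
    """Accuracy of a nearest-centroid classifier restricted to columns `cols`."""
    centroids = {}
    for label in (0, 1):
        rows = [x for x, t in zip(X_train, y_train) if t == label]
        centroids[label] = [sum(r[c] for r in rows) / len(rows) for c in cols]
    correct = 0
    for x, t in zip(X_test, y_test):
        dist = {
            label: sum((x[c] - m) ** 2 for c, m in zip(cols, centroids[label]))
            for label in (0, 1)
        }
        correct += min(dist, key=dist.get) == t
    return correct / len(y_test)


def ga_select(X, y, rng, pop_size=20, generations=15, p_mut=0.1, penalty=0.01):
    """Evolve a 0/1 mask over features; returns (best_mask, best_fitness)."""
    n = len(X[0])
    half = len(X) // 2
    X_tr, y_tr, X_te, y_te = X[:half], y[:half], X[half:], y[half:]

    def fitness(mask):
        cols = [i for i, bit in enumerate(mask) if bit]
        if not cols:
            return 0.0  # an empty subset cannot classify anything
        # Hold-out accuracy minus a small penalty per selected feature.
        return centroid_accuracy(X_tr, y_tr, X_te, y_te, cols) - penalty * len(cols)

    pop = [[rng.randint(0, 1) for _ in range(n)] for _ in range(pop_size)]
    for _ in range(generations):
        next_pop = [max(pop, key=fitness)[:]]  # elitism: keep the best mask
        while len(next_pop) < pop_size:
            # Tournament selection (size 2) for each parent.
            a = max(rng.sample(pop, 2), key=fitness)
            b = max(rng.sample(pop, 2), key=fitness)
            cut = rng.randint(1, n - 1)  # one-point crossover
            child = a[:cut] + b[cut:]
            # Bit-flip mutation with probability p_mut per gene.
            child = [1 - bit if rng.random() < p_mut else bit for bit in child]
            next_pop.append(child)
        pop = next_pop
    best = max(pop, key=fitness)
    return best, fitness(best)


rng = random.Random(0)
X, y = make_data(rng)
best_mask, best_score = ga_select(X, y, rng)
print(best_mask, round(best_score, 3))
```

Each individual is a bit string in which a 1 keeps the corresponding feature. Selection, crossover and mutation then search the space of subsets, while the penalty term nudges the search toward smaller subsets. In practice the fitness function would typically be a cross-validated score from the classifier of interest.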
Copyright information
© 2019 Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Colaco, S., Kumar, S., Tamang, A., Biju, V.G. (2019). A Review on Feature Selection Algorithms. In: Shetty, N., Patnaik, L., Nagaraj, H., Hamsavath, P., Nalini, N. (eds) Emerging Research in Computing, Information, Communication and Applications. Advances in Intelligent Systems and Computing, vol 906. Springer, Singapore. https://doi.org/10.1007/978-981-13-6001-5_11
DOI: https://doi.org/10.1007/978-981-13-6001-5_11
Publisher Name: Springer, Singapore
Print ISBN: 978-981-13-6000-8
Online ISBN: 978-981-13-6001-5
eBook Packages: Engineering, Engineering (R0)