Abstract
This paper introduces an approach to feature subset selection that divides the attributes of a supervised machine learning problem into two categories: essential and important features. Additionally, merging both kinds of features improves performance in the prediction task, as reported through measures such as accuracy and the Receiver Operating Characteristic (ROC) curve. The test-bed comprises eight binary and multi-class classification problems with up to five hundred attributes. Several classification algorithms, such as Ridor, PART, C4.5 and NBTree, have been tested to assess the proposal.
Acknowledgments
This work has been partially subsidized by the TIN2014-55894-C2-R project of the Spanish Inter-Ministerial Commission of Science and Technology (MICYT), FEDER funds, the P11-TIC-7528 project of the “Junta de Andalucía” (Spain) and by FCT, Portugal, under Grant UID/Multi/04046/2013.
Copyright information
© 2018 Springer International Publishing AG, part of Springer Nature
About this paper
Cite this paper
Tallón-Ballesteros, A.J., Correia, L., Xue, B. (2018). Featuring the Attributes in Supervised Machine Learning. In: de Cos Juez, F., et al. Hybrid Artificial Intelligent Systems. HAIS 2018. Lecture Notes in Computer Science, vol 10870. Springer, Cham. https://doi.org/10.1007/978-3-319-92639-1_29
DOI: https://doi.org/10.1007/978-3-319-92639-1_29
Published:
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-92638-4
Online ISBN: 978-3-319-92639-1
eBook Packages: Computer Science, Computer Science (R0)