Featuring the Attributes in Supervised Machine Learning

Tallón-Ballesteros, Antonio J.; Correia, Luís; Xue, Bing

doi:10.1007/978-3-319-92639-1_29

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 10870)

Included in the following conference series:

International Conference on Hybrid Artificial Intelligence Systems

Abstract

This paper introduces an approach to feature subset selection that sorts the attributes of a supervised machine learning problem into two categories: essential and important features. Additionally, fusing both kinds of features improves performance in the prediction task, as reported through measures such as accuracy and the Receiver Operating Characteristic (ROC) curve. The test-bed is composed of eight binary and multi-class classification problems with up to five hundred attributes. Several classification algorithms, such as Ridor, PART, C4.5 and NBTree, have been tested to assess the proposal.
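
The abstract does not state how the two categories are derived, so the following is only a minimal sketch of the general idea under assumptions of our own: each attribute is scored by its absolute Pearson correlation with a binary class label, attributes above a high threshold are called "essential", those above a lower threshold are called "important", and the fusion is simply the union of both groups. The scoring function, the thresholds (`essential_min`, `important_min`) and all names are hypothetical and are not taken from the paper.

```python
from statistics import mean, pstdev


def abs_correlation(xs, ys):
    """Absolute Pearson correlation between two equal-length numeric sequences."""
    sx, sy = pstdev(xs), pstdev(ys)
    if sx == 0 or sy == 0:
        return 0.0  # a constant column cannot correlate with anything
    mx, my = mean(xs), mean(ys)
    cov = mean([(x - mx) * (y - my) for x, y in zip(xs, ys)])
    return abs(cov / (sx * sy))


def categorise(columns, labels, essential_min=0.8, important_min=0.4):
    """Split attribute names into (essential, important) lists.

    ASSUMPTION: the two thresholds and the correlation score are
    illustrative stand-ins, not the criteria used in the paper.
    """
    essential, important = [], []
    for name, values in columns.items():
        score = abs_correlation(values, labels)
        if score >= essential_min:
            essential.append(name)
        elif score >= important_min:
            important.append(name)
    return essential, important


def fuse(essential, important):
    """Fusion of both kinds of features: here, simply their union."""
    return essential + important


# Toy data: six instances, binary class, three attributes.
labels = [0, 0, 0, 1, 1, 1]
columns = {
    "a": [0.1, 0.2, 0.1, 0.9, 0.8, 0.9],  # tracks the class closely
    "b": [1, 2, 3, 2, 3, 4],              # moderately related to the class
    "c": [5, 1, 5, 1, 5, 1],              # weakly related to the class
}

essential, important = categorise(columns, labels)
selected = fuse(essential, important)
print(essential, important, selected)
```

In a fuller experiment, the fused subset would then be passed to a classifier and compared against the essential-only and important-only subsets on accuracy and ROC-based measures, which is the kind of comparison the abstract reports.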

Acknowledgments

This work has been partially subsidized by the TIN2014-55894-C2-R project of the Spanish Inter-Ministerial Commission of Science and Technology (MICYT), FEDER funds, the P11-TIC-7528 project of the “Junta de Andalucía” (Spain) and by FCT, Portugal, under Grant UID/Multi/04046/2013.

Author information

Authors and Affiliations

Department of Languages and Computer Systems, University of Seville, Seville, Spain
Antonio J. Tallón-Ballesteros
BioISI - Faculdade de Ciências, Universidade de Lisboa, Lisboa, Portugal
Luís Correia
School of Engineering and Computer Science, Victoria University of Wellington, Wellington, New Zealand
Bing Xue

Corresponding author

Correspondence to Antonio J. Tallón-Ballesteros.

Editor information

Editors and Affiliations

Department of Mine Operating and Prospection, University of Oviedo, Oviedo, Spain
Francisco Javier de Cos Juez
Department of Computer Science, University of Oviedo, Oviedo, Spain
José Ramón Villar
Department of Computer Science, University of Oviedo, Oviedo, Spain
Enrique A. de la Cal
Department of Civil Engineering, University of Burgos, Burgos, Spain
Álvaro Herrero
University of A Coruña, A Coruña, Spain
Héctor Quintián
University of Salamanca, Salamanca, Spain
José António Sáez
University of Salamanca, Salamanca, Spain
Emilio Corchado

Cite this paper

Tallón-Ballesteros, A.J., Correia, L., Xue, B. (2018). Featuring the Attributes in Supervised Machine Learning. In: de Cos Juez, F., et al. Hybrid Artificial Intelligent Systems. HAIS 2018. Lecture Notes in Computer Science, vol 10870. Springer, Cham. https://doi.org/10.1007/978-3-319-92639-1_29

DOI: https://doi.org/10.1007/978-3-319-92639-1_29
Published: 08 June 2018
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-92638-4
Online ISBN: 978-3-319-92639-1
eBook Packages: Computer Science, Computer Science (R0)
