Featuring the Attributes in Supervised Machine Learning

  • Conference paper
Hybrid Artificial Intelligent Systems (HAIS 2018)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 10870)

Abstract

This paper introduces an approach to feature subset selection that characterises the attributes of a supervised machine learning problem into two categories: essential and important features. Additionally, fusing both kinds of features improves performance in the prediction task, as reported through measures such as accuracy and the Receiver Operating Characteristic (ROC) curve. The test-bed comprises eight binary and multi-class classification problems with up to five hundred attributes. Several classification algorithms, such as Ridor, PART, C4.5 and NBTree, have been tested to assess the proposal.
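As an illustration of the fusion and evaluation steps mentioned in the abstract, the following is a minimal sketch and not the authors' implementation. It assumes that the essential and important features are already available as two lists of column indices (how they are obtained is not described here), that fusion means their set union, and that the area under the ROC curve is used as a scalar summary of the ROC measure for a binary problem. All function names are hypothetical.

```python
from typing import List, Sequence


def fuse_subsets(essential: Sequence[int], important: Sequence[int]) -> List[int]:
    """Assumed fusion rule: sorted union of two attribute-index subsets."""
    return sorted(set(essential) | set(important))


def project(rows: Sequence[Sequence[float]], columns: Sequence[int]) -> List[List[float]]:
    """Keep only the selected attribute columns of every instance."""
    return [[row[c] for c in columns] for row in rows]


def accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Fraction of instances whose predicted label equals the true label."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("label sequences must be non-empty and of equal length")
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """Binary area under the ROC curve via pairwise comparison.

    Equals the probability that a random positive (label 1) receives a
    higher score than a random negative (label 0); ties count one half.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical index sets, for illustration only.
fused = fuse_subsets(essential=[0, 3], important=[3, 5])
reduced = project([[1, 2, 3, 4, 5, 6]], fused)
```

The reduced data would then be passed to whichever classifier is under study. The pairwise AUC shown here is quadratic in the number of instances and covers only the binary case; a multi-class problem would need an averaging scheme that this sketch does not cover.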

Acknowledgments

This work has been partially subsidized by the TIN2014-55894-C2-R project of the Spanish Inter-Ministerial Commission of Science and Technology (MICYT), FEDER funds, the P11-TIC-7528 project of the “Junta de Andalucía” (Spain), and by FCT, Portugal, under Grant UID/Multi/04046/2013.

Author information

Corresponding author

Correspondence to Antonio J. Tallón-Ballesteros.

Copyright information

© 2018 Springer International Publishing AG, part of Springer Nature

About this paper

Cite this paper

Tallón-Ballesteros, A.J., Correia, L., Xue, B. (2018). Featuring the Attributes in Supervised Machine Learning. In: de Cos Juez, F., et al. Hybrid Artificial Intelligent Systems. HAIS 2018. Lecture Notes in Computer Science, vol 10870. Springer, Cham. https://doi.org/10.1007/978-3-319-92639-1_29

  • DOI: https://doi.org/10.1007/978-3-319-92639-1_29

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-92638-4

  • Online ISBN: 978-3-319-92639-1

  • eBook Packages: Computer Science, Computer Science (R0)
