Feature Selection and Interpretable Feature Transformation: A Preliminary Study on Feature Engineering for Classification Algorithms

Tallón-Ballesteros, Antonio J.; Tuba, Milan; Xue, Bing; Hashimoto, Takako

doi:10.1007/978-3-030-03496-2_31

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 11315)

Included in the following conference series:

International Conference on Intelligent Data Engineering and Automated Learning

Abstract

This paper explores the limitation of consistency-based measures in the context of feature selection. Filters of this kind are not widely used in high-dimensional problems. Typically, the number of selected attributes is very small, and the resulting ability to make correct predictions is a drawback. The principal contribution of this work is a new approach within feature engineering that creates new attributes after the feature selection stage. Experiments on multi-class problems with feature spaces in the order of tens of thousands of attributes show that the new proposal yields some improvements. As a final insight, some new relationships were discovered through the combined application of feature selection and feature transformation. Additionally, a new measure for classification problems, which relates the number of features to the number of classes or labels, is also proposed.
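To illustrate what a consistency-based measure evaluates, the following minimal sketch computes the inconsistency rate of a feature subset in its commonly used form: instances that share the same values on the subset but do not belong to the majority class of that pattern count as inconsistent. This is an assumption about the general family of measures, not the specific filter or the new measure proposed in the paper; the function name `inconsistency_rate` and the toy data are hypothetical.

```python
from collections import Counter, defaultdict


def inconsistency_rate(rows, labels, subset):
    """Fraction of instances outside the majority class of their value pattern.

    rows   : sequence of tuples of discrete feature values
    labels : sequence of class labels, aligned with rows
    subset : indices of the features under evaluation
    """
    # Group class counts by the pattern of values on the chosen features.
    groups = defaultdict(Counter)
    for row, label in zip(rows, labels):
        pattern = tuple(row[i] for i in subset)
        groups[pattern][label] += 1

    # For each pattern, every instance beyond the majority class is inconsistent.
    inconsistent = sum(
        sum(counts.values()) - max(counts.values()) for counts in groups.values()
    )
    return inconsistent / len(rows)


# Hypothetical toy data: three discrete features, two classes.
rows = [
    ("a", "x", 0),
    ("a", "x", 1),
    ("a", "y", 0),
    ("b", "y", 1),
]
labels = ["pos", "neg", "pos", "neg"]

rate_first_two = inconsistency_rate(rows, labels, [0, 1])
rate_last_only = inconsistency_rate(rows, labels, [2])
```

In this toy data the single feature at index 2 is already fully consistent with the labels, so a filter driven by this criterion could stop at a one-attribute subset. That mirrors the observation in the abstract that such filters tend to select very few attributes.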

Acknowledgment

This work has been partially subsidised by the TIN2014-55894-C2-R and TIN2017-88209-C2-R projects (Spanish Inter-Ministerial Commission of Science and Technology (MICYT)), the P11-TIC-7528 project (“Junta de Andalucía”, Spain) and FEDER funds. It has also been supported by the Ministry of Education, Science and Technological Development of the Republic of Serbia, Grant no. III-44006.

Author information

Authors and Affiliations

University of Seville, Seville, Spain
Antonio J. Tallón-Ballesteros
Singidunum University, Belgrade, Serbia
Milan Tuba
Victoria University of Wellington, Wellington, New Zealand
Bing Xue
Chiba University of Commerce, Konodai Ichikawa City, Chiba, Japan
Takako Hashimoto

Corresponding author

Correspondence to Antonio J. Tallón-Ballesteros.

Editor information

Editors and Affiliations

University of Manchester, Manchester, UK
Hujun Yin
Rm 209, Building B, Autonomous University of Madrid, Madrid, Spain
David Camacho
Campus of Gualtar, University of Minho, Braga, Portugal
Paulo Novais
University of Seville, Seville, Spain
Antonio J. Tallón-Ballesteros

About this paper

Cite this paper

Tallón-Ballesteros, A.J., Tuba, M., Xue, B., Hashimoto, T. (2018). Feature Selection and Interpretable Feature Transformation: A Preliminary Study on Feature Engineering for Classification Algorithms. In: Yin, H., Camacho, D., Novais, P., Tallón-Ballesteros, A. (eds) Intelligent Data Engineering and Automated Learning – IDEAL 2018. IDEAL 2018. Lecture Notes in Computer Science, vol 11315. Springer, Cham. https://doi.org/10.1007/978-3-030-03496-2_31

DOI: https://doi.org/10.1007/978-3-030-03496-2_31
Published: 09 November 2018
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-03495-5
Online ISBN: 978-3-030-03496-2
eBook Packages: Computer Science, Computer Science (R0)
