Abstract
Feature selection for high-dimensional data is increasingly important as large datasets are generated and used more often. A current research problem is how to select, in a multidimensional space, only those features relevant to the analyzed problem. The implemented machine learning approach made it possible to identify feature profiles that distinguish two classes of observations. The logistic regression-based method selected 21 features, and the neural network-based method identified 10 features related to the examined problem. This substantially reduced the dimensionality of the data from the 406 original dimensions. Moreover, the two feature selection approaches gave consistent results: eight features were common to both methods. Applying the identified profiles also yielded high classification quality metrics; when logistic regression was used for both feature selection and classification, the weighted classification accuracy reached almost 94% and the F1 score 93%. These results indicate the high efficiency of the proposed feature selection solution and support the potential of the logistic regression-based approach for feature selection and classification in two-class problems.
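The quantities reported in the abstract (a class-weighted accuracy, an F1 score, and the number of features shared by two selected profiles) can be illustrated with a minimal sketch. It rests on assumptions that the abstract does not confirm: "weighted classification accuracy" is read here as the mean of per-class recall, the F1 score is computed for a single positive class, and all labels and feature names are hypothetical toy values, not the paper's data.

```python
from typing import Sequence, Set


def balanced_accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Mean of per-class recall for binary labels 0/1.

    Assumption: this is one possible reading of 'weighted accuracy'.
    """
    recalls = []
    for cls in (0, 1):
        idx = [i for i, t in enumerate(y_true) if t == cls]
        if not idx:
            raise ValueError(f"class {cls} has no samples")
        correct = sum(1 for i in idx if y_pred[i] == cls)
        recalls.append(correct / len(idx))
    return sum(recalls) / len(recalls)


def f1_score(y_true: Sequence[int], y_pred: Sequence[int], positive: int = 1) -> float:
    """F1 = 2*TP / (2*TP + FP + FN) for the chosen positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        return 0.0
    return 2 * tp / (2 * tp + fp + fn)


def profile_overlap(profile_a: Set[str], profile_b: Set[str]) -> Set[str]:
    """Features selected by both methods."""
    return profile_a & profile_b


# Hypothetical toy labels: 4 positives, 6 negatives.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 0, 1]

# Hypothetical feature profiles standing in for the two selection methods.
lr_profile = {"feat_003", "feat_017", "feat_120"}
nn_profile = {"feat_017", "feat_120", "feat_305"}

print("balanced accuracy:", balanced_accuracy(y_true, y_pred))
print("F1 score:", f1_score(y_true, y_pred))
print("shared features:", sorted(profile_overlap(lr_profile, nn_profile)))
```

Averaging per-class recall keeps a majority class from dominating the score, which is one reason a weighted accuracy is often reported next to F1 for two-class problems with unequal class sizes.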
Acknowledgements
This work has been supported by the European Union under the European Social Fund grant AIDA - POWR.03.02.00-00-I029 and the SUT grant for Support and Development of Research Potential no. 02/070/BK_23/0043.
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Sieradzka, K., Polańska, J. (2023). Feature Selection Methods Comparison: Logistic Regression-Based Algorithm and Neural Network Tools. In: Rocha, M., Fdez-Riverola, F., Mohamad, M.S., Gil-González, A.B. (eds) Practical Applications of Computational Biology and Bioinformatics, 17th International Conference (PACBB 2023). PACBB 2023. Lecture Notes in Networks and Systems, vol 743. Springer, Cham. https://doi.org/10.1007/978-3-031-38079-2_4
DOI: https://doi.org/10.1007/978-3-031-38079-2_4
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-38078-5
Online ISBN: 978-3-031-38079-2
eBook Packages: Intelligent Technologies and Robotics (R0)