Feature Selection Methods Comparison: Logistic Regression-Based Algorithm and Neural Network Tools

  • Conference paper
  • In: Practical Applications of Computational Biology and Bioinformatics, 17th International Conference (PACBB 2023)

Abstract

Feature selection for high-dimensional data is desirable, especially as extensive datasets are generated and used ever more often. The research problem considered here concerns appropriate feature selection in a multidimensional space, allowing only those features relevant to the analyzed problem to be retained. The implemented and applied machine learning approaches made it possible to recognize feature profiles distinguishing two classes of observations. The logistic regression-based method selected 21 features, while 10 features related to the examined problem were identified with the neural network approach, significantly reducing the dimensionality of the data from the original 406 dimensions. Moreover, the two feature selection approaches gave consistent results: eight features were common to both methods. Applying the recognized profiles also yielded very high classification quality metrics; with logistic regression used both for feature selection and for classification, the weighted classification accuracy reached almost 94% while the F1-score remained at a very high level of 93%. The achieved results indicate the high efficiency of the proposed feature selection solution and support the potential of the implemented logistic regression-based approach for feature selection and classification in two-class problems.
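Only the abstract of the method is reproduced here, so the snippet below is a minimal, hypothetical sketch of how such a comparison can be set up in Python: one feature subset is obtained from a regularized logistic regression, another from a small neural network whose inputs are ranked by SHAP values, and the two profiles are intersected. The specific choices (synthetic data, L1 penalty, network architecture, top-10 cut-off) are assumptions made for illustration and are not taken from the paper.

```python
# Illustrative sketch only. The paper does not publish code on this page, so
# everything below (synthetic data, an L1-penalized logistic regression as the
# regression-based selector, a small Keras network explained with SHAP as the
# network-based selector, the C value and the top-10 cut-off) is an assumption
# made for illustration, not the authors' implementation.
import numpy as np
import shap
import tensorflow as tf
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the 406-dimensional, two-class data described in the abstract.
X, y = make_classification(n_samples=500, n_features=406, n_informative=20, random_state=0)
X = StandardScaler().fit_transform(X)
X_train, X_test, y_train, y_test = train_test_split(X, y, stratify=y, random_state=0)

# Selector 1: logistic regression with an L1 penalty; keep features with non-zero coefficients.
lr = LogisticRegression(penalty="l1", solver="liblinear", C=0.05).fit(X_train, y_train)
lr_features = set(np.flatnonzero(lr.coef_[0]).tolist())

# Selector 2: a small feed-forward network; rank features by mean absolute SHAP value.
model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(X.shape[1],)),
    tf.keras.layers.Dense(32, activation="relu"),
    tf.keras.layers.Dense(1, activation="sigmoid"),
])
model.compile(optimizer="adam", loss="binary_crossentropy")
model.fit(X_train, y_train, epochs=20, batch_size=32, verbose=0)

predict = lambda x: model.predict(x, verbose=0)        # silence per-call progress bars
explainer = shap.Explainer(predict, X_train[:100])     # model-agnostic permutation explainer
shap_values = explainer(X_test[:20], max_evals=2 * X.shape[1] + 1)
importance = np.abs(shap_values.values).mean(axis=0).ravel()
nn_features = set(np.argsort(importance)[-10:].tolist())  # top-10, mirroring the count in the abstract

# Overlap between the two selected profiles (the paper reports 8 common features).
print("LR-selected:", len(lr_features), "NN-selected:", len(nn_features))
print("Common features:", sorted(lr_features & nn_features))
```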



Acknowledgements

This work has been supported by the European Union under the European Social Fund grant AIDA – POWR.03.02.00–00-I029 and by the SUT grant for Support and Development of Research Potential no. 02/070/BK_23/0043.

Author information

Corresponding authors

Correspondence to Katarzyna Sieradzka or Joanna PolaƄska.


Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper


Cite this paper

Sieradzka, K., PolaƄska, J. (2023). Feature Selection Methods Comparison: Logistic Regression-Based Algorithm and Neural Network Tools. In: Rocha, M., Fdez-Riverola, F., Mohamad, M.S., Gil-González, A.B. (eds) Practical Applications of Computational Biology and Bioinformatics, 17th International Conference (PACBB 2023). PACBB 2023. Lecture Notes in Networks and Systems, vol 743. Springer, Cham. https://doi.org/10.1007/978-3-031-38079-2_4
