
A Novel Entropy-Based Approach to Feature Selection

  • Conference paper
Intelligent Information and Database Systems (ACIIDS 2017)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 10191)


Abstract

The number of features in datasets has grown significantly in the age of big data, and processing such datasets demands computing power beyond the capability of traditional machines. A novel feature selection approach based on mutual information and selection gain is proposed. Using the Mackey-Glass, S&P 500, and TAIEX time series datasets, we investigate how well the proposed approach selects a compact, optimal or near-optimal subset of feature variables by comparing its results to those obtained by the brute-force method. The results indicate that the proposed approach can establish an optimal or near-optimal subset solution to the feature selection problem with very fast computation.
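The abstract describes a mutual-information-driven search evaluated against brute force. The paper's exact "selection gain" criterion is not reproduced on this page, so the following is only a minimal sketch in the same spirit: a histogram-based mutual-information estimate combined with a greedy, mRMR-style relevance-minus-redundancy score. Both the estimator and the scoring rule are assumptions for illustration, not the authors' method.

```python
import math
from collections import Counter

def mutual_information(xs, ys, bins=8):
    """Histogram-based estimate of MI(X; Y) in bits."""
    def discretize(vs):
        lo, hi = min(vs), max(vs)
        w = (hi - lo) / bins or 1.0  # guard against constant features
        return [min(int((v - lo) / w), bins - 1) for v in vs]
    dx, dy = discretize(xs), discretize(ys)
    n = len(xs)
    px, py = Counter(dx), Counter(dy)
    pxy = Counter(zip(dx, dy))
    mi = 0.0
    for (a, b), c in pxy.items():
        # p(a,b) * log2( p(a,b) / (p(a) * p(b)) ), all counts over n samples
        mi += (c / n) * math.log2(c * n / (px[a] * py[b]))
    return mi

def greedy_select(features, target, k):
    """Forward selection: at each step add the candidate feature with the
    largest relevance-minus-redundancy score (an mRMR-style stand-in for
    the paper's selection gain)."""
    selected, remaining = [], list(range(len(features)))
    while remaining and len(selected) < k:
        def score(j):
            rel = mutual_information(features[j], target)
            red = (sum(mutual_information(features[j], features[s])
                       for s in selected) / len(selected)) if selected else 0.0
            return rel - red
        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected
```

Such a greedy search evaluates O(k·d) candidate subsets instead of the 2^d subsets a brute-force comparison would enumerate, which is the kind of speed-up the abstract refers to.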



Acknowledgments

This study was supported by research project MOST 104-2221-E-008-116, funded by the Ministry of Science and Technology, Taiwan.

Author information

Corresponding author

Correspondence to Chunshien Li.


Copyright information

© 2017 Springer International Publishing AG

About this paper

Cite this paper

Tu, CH., Li, C. (2017). A Novel Entropy-Based Approach to Feature Selection. In: Nguyen, N., Tojo, S., Nguyen, L., Trawiński, B. (eds.) Intelligent Information and Database Systems. ACIIDS 2017. Lecture Notes in Computer Science, vol. 10191. Springer, Cham. https://doi.org/10.1007/978-3-319-54472-4_42

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-54471-7

  • Online ISBN: 978-3-319-54472-4

  • eBook Packages: Computer Science, Computer Science (R0)
