Abstract
The number of features in datasets has grown dramatically in the age of big data, and processing such datasets demands computing power beyond the capability of traditional machines. We propose a novel feature selection approach based on mutual information and selection gain. Using the Mackey-Glass, S&P 500, and TAIEX time series datasets, we investigated how well the proposed approach identifies a compact, optimal or near-optimal subset of feature variables by comparing its results with those of the brute-force method. The results show that the proposed approach finds an optimal or near-optimal solution to the feature selection problem with very fast computation.
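Since the paper's exact selection-gain formula is not reproduced on this page, the sketch below substitutes an mRMR-style gain (relevance minus redundancy, after Peng et al. 2005, cited below) as a stand-in: greedy forward selection over lagged values of a synthetic Mackey-Glass series, with the resulting subset checked against brute-force enumeration, mirroring the comparison described in the abstract. The lag count, subset size k, and the cross-validated linear-regression R^2 used to score subsets are illustrative assumptions, not the paper's protocol.

```python
# Minimal sketch: greedy mutual-information feature selection vs. brute force
# on lagged values of a synthetic Mackey-Glass series. The mRMR-style gain
# below is an assumption standing in for the paper's selection-gain measure.
from itertools import combinations

import numpy as np
from sklearn.feature_selection import mutual_info_regression
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score


def mackey_glass(n=1000, beta=0.2, gamma=0.1, tau=17, p=10, x0=1.2):
    """Euler integration of dx/dt = beta*x(t-tau)/(1 + x(t-tau)^p) - gamma*x(t)."""
    x = np.full(n + tau, x0)
    for t in range(tau, n + tau - 1):
        x[t + 1] = x[t] + beta * x[t - tau] / (1.0 + x[t - tau] ** p) - gamma * x[t]
    return x[tau:]


# Candidate features: lags 1..8 of the series; the target is the current value.
series = mackey_glass()
n_lags = 8
X = np.column_stack([series[n_lags - l:-l] for l in range(1, n_lags + 1)])
y = series[n_lags:]


def greedy_mi_select(X, y, k):
    """Forward selection: each step adds the feature with the largest gain,
    here MI(feature; target) minus mean MI(feature; already selected)."""
    relevance = mutual_info_regression(X, y, random_state=0)
    selected, remaining = [], list(range(X.shape[1]))
    while len(selected) < k:
        def gain(j):
            if not selected:
                return relevance[j]
            redundancy = np.mean([
                mutual_info_regression(X[:, [j]], X[:, s], random_state=0)[0]
                for s in selected
            ])
            return relevance[j] - redundancy
        best = max(remaining, key=gain)
        selected.append(best)
        remaining.remove(best)
    return selected


def subset_score(idx):
    """Judge a candidate subset by cross-validated linear-regression R^2."""
    idx = list(idx)
    return cross_val_score(LinearRegression(), X[:, idx], y, cv=5).mean()


k = 3
greedy = greedy_mi_select(X, y, k)
brute = max(combinations(range(X.shape[1]), k), key=subset_score)
print(f"greedy lags:      {[i + 1 for i in greedy]}, R^2 = {subset_score(greedy):.4f}")
print(f"brute-force lags: {[i + 1 for i in brute]}, R^2 = {subset_score(brute):.4f}")
```

The cost asymmetry this sketch exposes is the point of the abstract's comparison: the greedy pass evaluates on the order of k*d candidate features, while brute force must score all C(d, k) subsets, which grows combinatorially with the number of features d.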
References
Aksakalli, V., Malekipirbazari, M.: Feature selection via binary simultaneous perturbation stochastic approximation. Pattern Recogn. Lett. 75, 41–47 (2016)
Alibeigi, M., Hashemi, S., Hamzeh, A.: Unsupervised feature selection using feature density functions. Int. J. Electr. Electron. Eng. 3(7), 394–399 (2009)
Azmandian, F., Dy, J.G., Aslam, J.A., Kaeli, D.R.: Local kernel density ratio-based feature selection for outlier detection. In: ACML, pp. 49–64 (2012)
Dash, M., Liu, H.: Feature selection for classification. Intell. Data Anal. 1(3), 131–156 (1997)
Dash, M., Liu, H.: Consistency-based search in feature selection. Artif. Intell. 151(1), 155–176 (2003)
De Smith, M.J.: STATSREF: Statistical Analysis Handbook - a web-based statistics resource. The Winchelsea Press, Winchelsea (2015)
Forman, G.: An extensive empirical study of feature selection metrics for text classification. J. Mach. Learn. Res. 3, 1289–1305 (2003)
Geng, X., Hu, G.: Unsupervised feature selection by kernel density estimation in wavelet-based spike sorting. Biomed. Sig. Process. Control 7(2), 112–117 (2012)
Guyon, I., Elisseeff, A.: An introduction to variable and feature selection. J. Mach. Learn. Res. 3, 1157–1182 (2003)
Jain, A., Zongker, D.: Feature selection: evaluation, application, and small sample performance. IEEE Trans. Pattern Anal. Mach. Intell. 19(2), 153–158 (1997)
John, G.H., Kohavi, R., Pfleger, K.: Irrelevant features and the subset selection problem. In: Proceedings of the Eleventh International Conference on Machine Learning, pp. 121–129 (1994)
Kohavi, R., John, G.H.: Wrappers for feature subset selection. Artif. Intell. 97(1), 273–324 (1997)
Loughrey, J., Cunningham, P.: Overfitting in wrapper-based feature subset selection: the harder you try the worse it gets. In: Bramer, M., Coenen, F., Allen, T. (eds.) Research and Development in Intelligent Systems XXI, pp. 33–43. Springer, London (2005)
Mackey, M., Glass, L.: Oscillation and chaos in physiological control systems. Science 197(4300), 287–289 (1977)
Naghibi, T., Hoffmann, S., Pfister, B.: A semidefinite programming based search strategy for feature selection with mutual information measure. IEEE Trans. Pattern Anal. Mach. Intell. 37(8), 1529–1541 (2015)
Parzen, E.: On estimation of a probability density function and mode. Ann. Math. Stat. 33(3), 1065–1076 (1962)
Peng, H., Long, F., Ding, C.: Feature selection based on mutual information criteria of max-dependency, max-relevance, and min-redundancy. IEEE Trans. Pattern Anal. Mach. Intell. 27(8), 1226–1238 (2005)
Rosenblatt, M.: Remarks on some nonparametric estimates of a density function. Ann. Math. Stat. 27(3), 832–837 (1956)
Supriyanto, C., Yusof, N., Nurhadiono, B.: Two-level feature selection for Naive Bayes with kernel density estimation in question classification based on Bloom's cognitive levels. In: 2013 International Conference on Information Technology and Electrical Engineering (ICITEE), pp. 237–241 (2013)
Torkkola, K.: Feature extraction by non-parametric mutual information maximization. J. Mach. Learn. Res. 3, 1415–1438 (2003)
Zhang, J., Wang, S.: A novel single-feature and synergetic-features selection method by using ISE-based KDE and random permutation. Chin. J. Electron. 25(1), 114–120 (2016)
Acknowledgments
This study was supported by research grant MOST 104-2221-E-008-116 from the Ministry of Science and Technology, Taiwan.
Copyright information
© 2017 Springer International Publishing AG
About this paper
Cite this paper
Tu, C.H., Li, C. (2017). A Novel Entropy-Based Approach to Feature Selection. In: Nguyen, N., Tojo, S., Nguyen, L., Trawiński, B. (eds.) Intelligent Information and Database Systems. ACIIDS 2017. Lecture Notes in Computer Science, vol. 10191. Springer, Cham. https://doi.org/10.1007/978-3-319-54472-4_42
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-54471-7
Online ISBN: 978-3-319-54472-4