Supervised feature selection method via potential value estimation

Zhao, Long; Jiang, LinFeng; Dong, XiangJun

doi:10.1007/s10586-016-0635-0

Published: 17 September 2016

Cluster Computing, Volume 19, pages 2039–2049 (2016)

Abstract

Feature selection is an important step in dealing with high-dimensional data. To select category-related features, the importance of each feature needs to be measured. Existing importance measures cannot reflect the different distributions of the data space and have poor interpretability. In this paper, a new feature weight calculation method via potential value estimation is proposed. The potential values indicate the different data distributions in different dimensions. The quality of a data point is another parameter needed to calculate its potential value in the data field; it is related to the density and the type of the surrounding points. At the same time, the extraction of important features should consider not only the distribution of the feature itself but also its correlation with other features or categories. The method adopts \(S_{w}\) (potential value within a class) and \(S_{b}\) (potential value between different classes) to calculate the information entropy of each feature. The representative features are then selected to construct the classifier. To speed up computation, the data space is divided into grids, with a different grid for each dimension. By estimating the potential values of different data points on the same dimension, the correlation between feature and label is evaluated. Analysis and experiments indicate that the proposed method retains overall classification accuracy while using the fewest features. Its dimensionality reduction effect is significantly better than that of FRGDF and the other mutual information methods.
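The abstract does not give the formulas, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes a Gaussian-style potential function exp(-(d / sigma)^2) as commonly used in data field models, unit quality for every data point, no grid acceleration, and a simple ratio S_w / (S_w + S_b) in place of the entropy-based weight. The function name `feature_potential_scores` and the parameter `sigma` are hypothetical.

```python
import numpy as np

def feature_potential_scores(X, y, sigma=0.5):
    """Score each feature by within-class versus between-class potential.

    Assumptions (not taken from the paper): unit quality for every point,
    potential exp(-(d / sigma)**2), and the ratio S_w / (S_w + S_b) as score.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    n_features = X.shape[1]

    same = y[:, None] == y[None, :]   # pairs of points sharing a label
    np.fill_diagonal(same, False)     # a point does not act on itself
    diff = y[:, None] != y[None, :]   # pairs of points with different labels

    scores = np.empty(n_features)
    for j in range(n_features):
        col = X[:, j]
        span = col.max() - col.min()
        # Rescale the feature to [0, 1] so one sigma fits every dimension.
        col = (col - col.min()) / span if span > 0 else np.zeros_like(col)

        dist = np.abs(col[:, None] - col[None, :])
        potential = np.exp(-(dist / sigma) ** 2)

        s_w = potential[same].mean() if same.any() else 0.0  # within class
        s_b = potential[diff].mean() if diff.any() else 0.0  # between classes
        total = s_w + s_b
        scores[j] = s_w / total if total > 0 else 0.0
    return scores

# Toy data: feature 0 separates the two classes, feature 1 does not.
X_demo = np.array([[0.0, 0.0], [0.1, 1.0], [0.9, 0.1], [1.0, 0.9]])
y_demo = np.array([0, 0, 1, 1])
scores = feature_potential_scores(X_demo, y_demo)
print(scores)
```

Under these assumptions a feature scores close to 1 when points of the same class sit close together on that dimension and points of different classes sit far apart, so ranking features by the score and keeping the top few gives a simple filter-style selection.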

Acknowledgments

This work was supported by the Shandong Provincial Water Conservancy Scientific Research and Technology Promotion Project, project number SDSLKY201320 (Research on hidden danger intelligent warning system of water conservancy security based on big data). It was also partly supported by the National Natural Science Foundation of China (71271125, 61502260) and the Natural Science Foundation of Shandong Province, China (ZR2011FM028).

Author information

Authors and Affiliations

QiLu University of Technology, Jinan, 250353, Shandong, People’s Republic of China
Long Zhao, LinFeng Jiang & XiangJun Dong
State Key Laboratory of Software Engineering, Wuhan University, Hubei, People’s Republic of China
Long Zhao

Corresponding authors

Correspondence to LinFeng Jiang or XiangJun Dong.

About this article

Cite this article

Zhao, L., Jiang, L. & Dong, X. Supervised feature selection method via potential value estimation. Cluster Comput 19, 2039–2049 (2016). https://doi.org/10.1007/s10586-016-0635-0

Received: 29 May 2016
Revised: 09 August 2016
Accepted: 30 August 2016
Published: 17 September 2016
Issue Date: December 2016
DOI: https://doi.org/10.1007/s10586-016-0635-0
