
Feature selection based on mutual information with correlation coefficient

Published in Applied Intelligence

Abstract

Feature selection is an important preprocessing step in machine learning: it identifies the crucial features by removing irrelevant or redundant ones from the original feature set. Most feature selection algorithms focus on maximizing relevant information while minimizing redundant information. To remove more of the redundant information captured by the evaluation criterion, we propose a feature selection method based on mutual information with the correlation coefficient (CCMI). CCMI combines the correlation coefficient with mutual information to measure the relationship between different features: the absolute value of the correlation coefficient between two features serves as the weight of the redundancy term, which is itself expressed by mutual information, in the evaluation criterion. To select low-redundancy features effectively, the criterion also applies a minimization principle. Compared with 7 popular algorithms on 12 data sets, CCMI achieves the highest average classification accuracy for both SVM and KNN classifiers. The experimental results show that the features selected by CCMI have better classification capability.
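The paper's exact scoring function is not reproduced on this page, but the abstract pins down its ingredients: a mutual-information relevance term, a mutual-information redundancy term weighted by the absolute correlation coefficient between the candidate feature and each already-selected feature, and a minimization principle applied to the redundancy. The sketch below is a hedged reading of that description, assuming a standard greedy forward-selection scheme of the form J(f) = I(f; C) − max over s in S of |ρ(f, s)| · I(f; s); the function names, the Pearson correlation, and the use of a max over selected features (so the worst-case weighted redundancy is what gets minimized) are illustrative assumptions, not the authors' exact formulation.

```python
import numpy as np

def mutual_information(x, y):
    """Empirical mutual information (nats) between two discrete 1-D arrays."""
    mi = 0.0
    for xv in np.unique(x):
        px = np.mean(x == xv)
        for yv in np.unique(y):
            pxy = np.mean((x == xv) & (y == yv))
            if pxy > 0.0:
                py = np.mean(y == yv)
                mi += pxy * np.log(pxy / (px * py))
    return mi

def ccmi_select(X, y, k):
    """Greedy forward selection under the assumed CCMI-style criterion
    J(f) = I(f; y) - max_{s in S} |rho(f, s)| * I(f; s).
    X: (n_samples, n_features) array of discrete (or discretized),
    non-constant features; y: class labels.
    """
    n_features = X.shape[1]
    # Relevance of each feature to the class, computed once.
    relevance = [mutual_information(X[:, j], y) for j in range(n_features)]
    selected, remaining = [], list(range(n_features))
    while len(selected) < k and remaining:
        best_j, best_score = remaining[0], -np.inf
        for j in remaining:
            if selected:
                # Redundancy of candidate j: MI with each chosen feature,
                # weighted by |Pearson correlation|; keep the worst case.
                redundancy = max(
                    abs(np.corrcoef(X[:, j], X[:, s])[0, 1])
                    * mutual_information(X[:, j], X[:, s])
                    for s in selected
                )
            else:
                redundancy = 0.0
            score = relevance[j] - redundancy
            if score > best_score:
                best_j, best_score = j, score
        selected.append(best_j)
        remaining.remove(best_j)
    return selected

# Toy usage on synthetic discrete data.
rng = np.random.default_rng(0)
X = rng.integers(0, 3, size=(200, 10))
y = (X[:, 0] + X[:, 3]) % 2
print(ccmi_select(X, y, k=3))  # indices of the 3 selected features
```

On discrete or pre-discretized features this runs as written; continuous features would first need binning (or a k-NN MI estimator) before the counts-based estimator above is meaningful.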




Acknowledgements

The corresponding author gratefully acknowledges support from the National Key Research and Development Plan under Grant 2017YFB1402103, the National Natural Science Foundation of China under Grant 61971347, the Key Research and Development Program of Shaanxi under Grant 2021JZ-49, the Education Department of Shaanxi Province Key Laboratory Project under Grant 15JS079, and the Xi’an Science Program Project under Grant 2020KJRC0094.

Author information


Corresponding author

Correspondence to Hongfang Zhou.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.


About this article


Cite this article

Zhou, H., Wang, X. & Zhu, R. Feature selection based on mutual information with correlation coefficient. Appl Intell 52, 5457–5474 (2022). https://doi.org/10.1007/s10489-021-02524-x

