
Conditional mutual information-based feature selection algorithm for maximal relevance minimal redundancy

Applied Intelligence

Abstract

Many feature selection algorithms are based on mutual information and on three-dimensional mutual information (TDMI) among features and the class label. Because these algorithms do not consider TDMI among features, their feature selection performance can suffer. To address this problem, this paper investigates feature selection based on TDMI among features. Following the maximal relevance minimal redundancy criterion, joint mutual information between the class label and the feature set is adopted to describe relevance, and mutual information between feature sets is exploited to describe redundancy. Both quantities are then decomposed; the decomposition takes TDMI among features into account and yields an objective function. Finally, a feature selection algorithm based on conditional mutual information for maximal relevance minimal redundancy (CMI-MRMR) is proposed. To validate its performance, we compare CMI-MRMR with several feature selection algorithms. Experimental results show that CMI-MRMR can achieve better feature selection performance.
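To make the quantities named in the abstract concrete, the following is a minimal sketch of plug-in estimates of mutual information and conditional mutual information for discrete data, used inside a greedy relevance-minus-redundancy selection. The abstract does not state the CMI-MRMR objective function, so the score below is an assumed, generic stand-in: relevance I(f;C) minus the average of I(f;s) - I(f;s|C) over already selected features s, where that difference is a three-way interaction term. The function names (`entropy`, `mutual_info`, `cond_mutual_info`, `greedy_select`) and the toy data are hypothetical and are not taken from the paper.

```python
from collections import Counter
from math import log2


def entropy(*columns):
    """Joint Shannon entropy (bits) of one or more equal-length discrete columns."""
    rows = list(zip(*columns))
    n = len(rows)
    return -sum((c / n) * log2(c / n) for c in Counter(rows).values())


def mutual_info(x, y):
    """I(X;Y) = H(X) + H(Y) - H(X,Y)."""
    return entropy(x) + entropy(y) - entropy(x, y)


def cond_mutual_info(x, y, z):
    """I(X;Y|Z) = H(X,Z) + H(Y,Z) - H(X,Y,Z) - H(Z)."""
    return entropy(x, z) + entropy(y, z) - entropy(x, y, z) - entropy(z)


def greedy_select(features, label, k):
    """Greedy forward selection of k feature names.

    Assumed illustrative score (NOT the paper's objective):
        J(f) = I(f;C) - mean over selected s of [I(f;s) - I(f;s|C)]
    """
    selected = []
    remaining = list(features)

    def score(name):
        f = features[name]
        relevance = mutual_info(f, label)
        if not selected:
            return relevance
        redundancy = sum(
            mutual_info(f, features[s]) - cond_mutual_info(f, features[s], label)
            for s in selected
        )
        return relevance - redundancy / len(selected)

    while remaining and len(selected) < k:
        best = max(remaining, key=score)  # ties go to the earliest candidate
        selected.append(best)
        remaining.remove(best)
    return selected


# Toy data: the class is determined jointly by a and c; b duplicates a.
a = [0, 0, 0, 0, 1, 1, 1, 1]
b = list(a)
c = [0, 0, 1, 1, 0, 0, 1, 1]
label = [2 * ai + ci for ai, ci in zip(a, c)]
features = {"a": a, "b": b, "c": c}

print(greedy_select(features, label, 2))  # ['a', 'c']
```

In this toy example all three features are equally relevant to the class on their own, but once `a` is selected, its duplicate `b` is fully penalized as redundant while `c` is not, so `c` is chosen second. Plug-in estimates like these assume discretized features and become unreliable when the number of samples is small relative to the number of joint states.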



Acknowledgements

This work was supported by the National Natural Science Foundation of China (61771334).

Author information

Corresponding author

Correspondence to Jichang Guo.


About this article


Cite this article

Gu, X., Guo, J., Xiao, L. et al. Conditional mutual information-based feature selection algorithm for maximal relevance minimal redundancy. Appl Intell 52, 1436–1447 (2022). https://doi.org/10.1007/s10489-021-02412-4


  • DOI: https://doi.org/10.1007/s10489-021-02412-4
