
A Feature Selection Method Using Dynamic Dependency and Redundancy Analysis

  • Research Article – Computer Engineering and Computer Science
  • Published in: Arabian Journal for Science and Engineering

Abstract

Feature selection is an indispensable step in the data-preprocessing stage of data mining and pattern recognition. Numerical small-sample data are often high-dimensional, yet many traditional information-theoretic feature selection algorithms fail to assess the dependency and redundancy relations among such high-dimensional features. Dynamically identifying dependency-correlated features therefore remains an open problem. This paper proposes a feature selection algorithm based on dynamic weighted conditional mutual information (DWCMI). First, the algorithm uses interaction information to compute a coefficient that quantifies the interaction redundancy between features, and between features and the class label. Second, this coefficient dynamically adjusts the weight assigned to the maximum classification information and identifies dependency-correlated features. Finally, DWCMI is compared with six other feature selection algorithms on two classifiers over 12 datasets, using the classification metrics Precision_macro, Recall_macro and F1_macro. The experimental results show that DWCMI effectively selects both relevant and interacting features in small-sample data, improving the quality of the selected feature subset and increasing classification accuracy.
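The full DWCMI criterion is defined only in the paywalled article, so the sketch below is a minimal illustration of the information-theoretic building blocks the abstract names: mutual information as a relevance measure, interaction information as a redundancy/synergy coefficient, and a greedy loop that uses that coefficient to dynamically reweight candidate features. All function names, the sign convention I(X;Y;C) = I(X;Y|C) - I(X;Y), and the averaged coefficient are illustrative assumptions, not the authors' exact formulation.

```python
import numpy as np
from collections import Counter

def entropy(*cols):
    """Joint Shannon entropy (in bits) of one or more discrete columns."""
    counts = np.array(list(Counter(zip(*cols)).values()), dtype=float)
    p = counts / counts.sum()
    return float(-np.sum(p * np.log2(p)))

def mi(x, y):
    """Mutual information I(X;Y) = H(X) + H(Y) - H(X,Y)."""
    return entropy(x) + entropy(y) - entropy(x, y)

def cmi(x, y, z):
    """Conditional mutual information I(X;Y|Z)."""
    return entropy(x, z) + entropy(y, z) - entropy(z) - entropy(x, y, z)

def interaction_info(x, y, c):
    """Interaction information I(X;Y;C) = I(X;Y|C) - I(X;Y).
    Under this sign convention: > 0 means synergy, < 0 means redundancy."""
    return cmi(x, y, c) - mi(x, y)

def greedy_select(X, y, k):
    """Greedy forward selection with an interaction-weighted score.
    NOTE: this scoring rule is a hypothetical stand-in for the paper's
    DWCMI criterion, not a reproduction of it.
    X: (n_samples, n_features) discretised integer features; y: class labels."""
    selected, remaining = [], list(range(X.shape[1]))
    while len(selected) < k and remaining:
        def score(f):
            relevance = mi(X[:, f], y)  # relevance to the class label
            if not selected:
                return relevance
            # Coefficient from interaction information against the features
            # chosen so far: redundant candidates are penalised, interacting
            # (synergistic) candidates are rewarded.
            coef = np.mean([interaction_info(X[:, f], X[:, s], y)
                            for s in selected])
            return relevance + coef
        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected
```

On discretised small-sample data the effect is that a candidate strongly redundant with the already-selected features receives a negative coefficient and is demoted, while one that interacts synergistically with them is promoted, which matches the behaviour the abstract describes qualitatively.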



Acknowledgements

The experimental datasets were taken from the widely used UCI Machine Learning Repository (https://archive.ics.uci.edu/ml/datasets.html) and the ASU feature selection repository (http://featureselection.asu.edu/datasets.php).

Funding

This work was supported by the National Science and Technology Basic Work Special Project of China under Grant 2015FY111700-6.

Author information


Contributions

Zhang Li wrote the manuscript and read and approved the final version.

Corresponding author

Correspondence to Zhang Li.

Ethics declarations

Conflict of interest

The author declares that he has no competing interests.

Ethical approval

This study does not involve any ethical issues.


About this article


Cite this article

Li, Z. A Feature Selection Method Using Dynamic Dependency and Redundancy Analysis. Arab J Sci Eng 47, 10419–10433 (2022). https://doi.org/10.1007/s13369-022-06590-2
