
Applied Intelligence, Volume 50, Issue 1, pp 101–118

Feature selection with Symmetrical Complementary Coefficient for quantifying feature interactions

  • Rui Zhang
  • Zuoquan Zhang

Abstract

In machine learning and data mining, feature interaction is a ubiquitous issue that cannot be ignored and has attracted increasing attention in recent years. In this paper, we propose the Symmetrical Complementary Coefficient, which quantifies feature interactions well. Based on it, we improve the Sequential Forward Selection (SFS) algorithm and propose a new feature-subset search algorithm, SCom-SFS, which needs to consider only the interactions between adjacent features in a given sequence rather than all pairwise interactions. Moreover, the discovered feature interactions speed up the search for the optimal feature subset. In addition, we improve the ReliefF algorithm by screening representative samples from the original data set, which removes the need for random sampling; the improved ReliefF algorithm proves to be more efficient and reliable. Combining the two modified algorithms yields an effective and complete feature selection algorithm, RRSS. According to the experimental results, RRSS outperformed five classic and two recent feature selection algorithms in terms of the size of the resulting feature subset, accuracy, Kappa coefficient, and adjusted mean-square error (MSE).
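The abstract only summarizes SCom-SFS, and the exact definition of the Symmetrical Complementary Coefficient is not reproduced here. The following is a minimal sketch of the general idea under stated assumptions: a hypothetical complementarity proxy (the cross-validated gain of a feature pair over its stronger member) stands in for the SCom, a random forest serves as the scorer (as the keywords suggest), and the names cv_acc, complementarity, scom_sfs_sketch, and the weight lam are illustrative rather than from the paper.

```python
# Minimal sketch of an interaction-aware Sequential Forward Selection (SFS).
# NOT the paper's SCom-SFS: the Symmetrical Complementary Coefficient is
# replaced by a hypothetical proxy based on cross-validated accuracy.
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score


def cv_acc(X, y, cols):
    """Mean 5-fold cross-validated accuracy of a random forest on `cols`."""
    model = RandomForestClassifier(n_estimators=50, random_state=0)
    return cross_val_score(model, X[:, cols], y, cv=5).mean()


def complementarity(X, y, a, b):
    """Hypothetical stand-in for the SCom: the gain of the pair (a, b)
    over the stronger of the two features taken alone."""
    return cv_acc(X, y, [a, b]) - max(cv_acc(X, y, [a]), cv_acc(X, y, [b]))


def scom_sfs_sketch(X, y, k, lam=0.5):
    """Greedy forward selection that, mirroring the abstract's description,
    weighs only the interaction between each candidate and the feature
    selected immediately before it (the 'adjacent' feature)."""
    selected = []
    remaining = list(range(X.shape[1]))
    while len(selected) < k and remaining:
        def score(f):
            base = cv_acc(X, y, selected + [f])
            if not selected:
                return base  # no adjacent feature yet
            return base + lam * complementarity(X, y, selected[-1], f)
        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected


if __name__ == "__main__":
    X, y = load_breast_cancer(return_X_y=True)
    print("Selected feature indices:", scom_sfs_sketch(X, y, k=5))
```

Scoring only the (previous, candidate) pair keeps the number of interaction evaluations linear in the number of selection steps, whereas evaluating all pairwise interactions would be quadratic in the number of features; this is the efficiency argument the abstract makes for SCom-SFS.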

Keywords

Feature selection · ReliefF · Sequential Forward Selection · Feature interaction · Random Forest · Symmetrical Complementary Coefficient

Notes

Acknowledgements

Thanks to the UCI repository for the data sets provided. The breast cancer domain was obtained from the University Medical Centre, Institute of Oncology, Ljubljana, Yugoslavia; thanks go to M. Zwitter and M. Soklic for providing the data. The Statlog-Vehicle data set came from the Turing Institute, Glasgow, Scotland. Thanks also to the R language and the authors of its packages.


Copyright information

© Springer Science+Business Media, LLC, part of Springer Nature 2019

Authors and Affiliations

  1. School of Science, Beijing Jiaotong University, Beijing, China
