Neural Computing and Applications, Volume 31, Supplement 2, pp 989–997

Semi-supervised relevance index for feature selection

  • Frederico Coelho
  • Cristiano Castro
  • Antônio P. Braga
  • Michel Verleysen
Original Article

Abstract

This paper presents a new relevance index, based on mutual information, that exploits both labeled and unlabeled data. The proposed index takes into account the similarity between features as well as their joint influence on the output variable. Based on this principle, a feature selection method is developed that eliminates redundant and irrelevant features whose relevance index falls below a threshold. A strategy for setting this threshold is also proposed. Experiments show that the new method captures important joint relations between input and output variables, which are incorporated into a new clustering-based feature selection approach.
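The sketch below gives a rough, purely supervised illustration of threshold-based feature selection with mutual information. It is not the paper's semi-supervised index: it scores features with scikit-learn's mutual_info_classif and, as a stand-in for the paper's own threshold strategy, compares each score against that of an artificial random "probe" feature, a common heuristic for estimating the score an irrelevant feature would receive. The probe and the scikit-learn estimator are illustrative assumptions, not the authors' method.

```python
# Minimal sketch: threshold-based feature selection via mutual information.
# NOTE: this is a supervised stand-in, not the paper's semi-supervised index,
# and the random-probe threshold is an assumed heuristic, not the paper's.
import numpy as np
from sklearn.datasets import load_iris
from sklearn.feature_selection import mutual_info_classif

rng = np.random.default_rng(0)
X, y = load_iris(return_X_y=True)

# Append a pure-noise probe column; its MI with y approximates the score
# an irrelevant feature would receive, and serves as the threshold.
X_aug = np.hstack([X, rng.normal(size=(X.shape[0], 1))])

mi = mutual_info_classif(X_aug, y, random_state=0)
threshold = mi[-1]  # relevance score of the noise probe

# Keep only the original features that score above the probe.
selected = np.where(mi[:-1] > threshold)[0]
print("MI per feature:    ", np.round(mi[:-1], 3))
print("probe MI threshold:", round(threshold, 3))
print("selected features: ", selected)
```

On the iris data, all four features should comfortably exceed the probe's near-zero score, so the selector keeps them all; the interesting behavior appears when genuinely irrelevant columns are present.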

Keywords

Mutual information · Semi-supervised · Feature selection · Similarity criterion

Notes

Acknowledgements

This work was developed with financial support from CAPES and CNPq (Brazil).

Compliance with ethical standards

Conflict of interest

The authors declare that they have no conflict of interest.


Copyright information

© The Natural Computing Applications Forum 2017

Authors and Affiliations

  1. Universidade Federal de Minas Gerais, Belo Horizonte, Brazil
  2. Université Catholique de Louvain, Louvain-la-Neuve, Belgium
