A Geometric Approach to Feature Ranking Based Upon Results of Effective Decision Boundary Feature Matrix

  • Claudia Diamantini
  • Alberto Gemelli
  • Domenico Potena
Chapter
Part of the Studies in Computational Intelligence book series (SCI, volume 584)

Abstract

This chapter presents a new Feature Ranking (FR) method that computes the relative weight of features in their original domain by means of an algorithmic procedure. The method supports the selection of real-world features and is useful when the number of features has cost implications. Feature Extraction (FE) techniques, although accurate, weight artificial features, whereas weighting the real features is essential for obtaining readable models. Ranking accuracy is also important: heuristic methods, the other major family of ranking methods, based on generate-and-test procedures, produce readable models but are by definition approximate. The ranking method proposed here combines the advantages of both approaches: at its core is a feature extraction technique based on the Effective Decision Boundary Feature Matrix (EDBFM), extended with a geometrically justified procedure that computes the total weight of the real features. The modular design of the new method allows the inclusion of any FE technique that conforms to the EDBFM model; a thorough benchmark of the various solutions has been conducted.
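To make the geometric idea concrete, the following is a minimal sketch in Python (illustrative only; the function names, sampling scheme, and normalisation choice are assumptions, not the authors' implementation). Given unit normal vectors sampled on the effective decision boundary, the EDBFM is the average of their outer products; its eigenvectors with non-zero eigenvalues span the discriminative subspace, and one geometrically motivated way to rank the original features is to project the eigenvalue mass back onto the original coordinate axes.

```python
import numpy as np

def edbfm_from_normals(normals):
    """Effective Decision Boundary Feature Matrix from sampled normals.

    normals: (m, d) array of unit vectors N(x_k), each normal to the
    decision boundary at a point x_k on the effective decision boundary.
    Returns the d x d matrix (1/m) * sum_k N(x_k) N(x_k)^T.
    """
    N = np.asarray(normals, dtype=float)
    return N.T @ N / len(N)

def rank_features(edbfm):
    """Rank the original (real) features by projecting the EDBFM
    eigenvalue mass onto the coordinate axes: w_i = sum_j lambda_j * v_ij^2,
    which equals the i-th diagonal entry of the EDBFM."""
    eigvals, eigvecs = np.linalg.eigh(edbfm)   # EDBFM is symmetric
    weights = (eigvecs ** 2) @ eigvals         # per-feature weight
    weights /= weights.sum()                   # normalise to sum to 1
    order = np.argsort(weights)[::-1]          # descending rank
    return order, weights

# Toy usage: when the boundary normals mostly point along feature 0,
# feature 0 comes out as the top-ranked (most discriminative) feature.
rng = np.random.default_rng(0)
normals = rng.normal(size=(200, 3)) * np.array([3.0, 1.0, 0.2])
normals /= np.linalg.norm(normals, axis=1, keepdims=True)
order, weights = rank_features(edbfm_from_normals(normals))
print(order, np.round(weights, 3))
```

Because the weights live in the original feature space rather than in an extracted one, the ranking stays directly interpretable, which is the property the chapter emphasises.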

Keywords

Feature ranking · Feature weight · Effective decision boundary feature matrix · Classification

Copyright information

© Springer-Verlag Berlin Heidelberg 2015

Authors and Affiliations

  • Claudia Diamantini ¹
  • Alberto Gemelli ¹
  • Domenico Potena ¹

  1. Dipartimento di Ingegneria dell'Informazione, Università Politecnica delle Marche, Ancona, Italy
