Ensemble Feature Ranking

  • Kees Jong
  • Jérémie Mary
  • Antoine Cornuéjols
  • Elena Marchiori
  • Michèle Sebag
Part of the Lecture Notes in Computer Science book series (LNCS, volume 3202)

Abstract

A crucial issue for Machine Learning and Data Mining is Feature Selection: selecting the relevant features in order to focus the learning search. A relaxed setting of Feature Selection, known as Feature Ranking, ranks the features with respect to their relevance instead of committing to a subset.

This paper proposes an ensemble approach to Feature Ranking, aggregating the feature rankings extracted along independent runs of an evolutionary learning algorithm named ROGER. The convergence of ensemble feature ranking is studied from a theoretical perspective, and a statistical model is devised for the empirical validation, inspired by the complexity framework proposed in the Constraint Satisfaction domain. Comparative experiments demonstrate the robustness of the approach for learning (a limited kind of) non-linear concepts, specifically when the features significantly outnumber the examples.
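In a nutshell, the ensemble approach runs the learner several times independently, extracts a feature ranking from each run, and aggregates these rankings into a single consensus ranking. The sketch below illustrates only the aggregation step; it assumes median-rank aggregation (one natural choice among order-statistics aggregators), the helper name `ensemble_feature_ranking` is hypothetical, and ROGER itself is not reproduced — any learner producing per-run rankings could feed it.

```python
import numpy as np

def ensemble_feature_ranking(rankings):
    """Aggregate per-run feature rankings into a consensus ranking.

    rankings: (n_runs, n_features) array; rankings[t, i] is the rank
    of feature i in run t (0 = judged most relevant by that run).
    Returns feature indices ordered from most to least relevant,
    using the median rank over runs as the consensus score.
    """
    rankings = np.asarray(rankings)
    median_rank = np.median(rankings, axis=0)   # consensus score per feature
    return np.argsort(median_rank)              # most relevant features first

# Toy illustration: three independent runs ranking four features.
runs = [[0, 1, 2, 3],
        [1, 0, 2, 3],
        [0, 2, 1, 3]]
print(ensemble_feature_ranking(runs))  # -> [0 1 2 3]
```

Median aggregation is robust to a minority of runs that mis-rank a feature; other order statistics or the mean rank would work equally well in this sketch.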


Copyright information

© Springer-Verlag Berlin Heidelberg 2004

Authors and Affiliations

  • Kees Jong (1)
  • Jérémie Mary (2)
  • Antoine Cornuéjols (2)
  • Elena Marchiori (1)
  • Michèle Sebag (2)

  1. Department of Mathematics and Computer Science, Vrije Universiteit Amsterdam, The Netherlands
  2. Laboratoire de Recherche en Informatique, CNRS-INRIA, Université Paris-Sud, Orsay, France
