Ensemble Learning with Evolutionary Computation: Application to Feature Ranking

  • Kees Jong
  • Elena Marchiori
  • Michèle Sebag
Part of the Lecture Notes in Computer Science book series (LNCS, volume 3242)


Abstract

Exploiting the diversity of hypotheses produced by evolutionary learning, a new ensemble approach to Feature Selection is presented, which aggregates the feature rankings extracted from those hypotheses. A statistical model is devised to enable direct evaluation of the approach; comparative experimental results show that it behaves well on non-linear concepts when the features outnumber the examples.
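The aggregation step described above can be sketched in a few lines. The abstract does not specify the aggregation rule, so the Borda-style average-rank scheme below (and the example rankings) are illustrative assumptions, not the paper's actual method.

```python
# Sketch of ensemble feature ranking: several feature rankings (e.g. one
# per evolved hypothesis) are combined into a single consensus ranking.
# Borda-style average-rank aggregation is an assumption for illustration.

def aggregate_rankings(rankings):
    """Combine rankings (each a list of feature indices, best first)
    into one consensus ranking by total rank position."""
    n = len(rankings[0])
    totals = [0] * n
    for ranking in rankings:
        for position, feature in enumerate(ranking):
            totals[feature] += position
    # A lower total position means the feature is consistently top-ranked.
    return sorted(range(n), key=lambda f: totals[f])

# Three hypothetical rankings over 4 features, e.g. from independent
# evolutionary runs:
rankings = [[2, 0, 1, 3], [2, 1, 0, 3], [0, 2, 3, 1]]
print(aggregate_rankings(rankings))  # → [2, 0, 1, 3]
```

Feature 2 is ranked first by two of the three hypotheses, so it dominates the consensus even though one run disagrees, which is the intuition behind exploiting hypothesis diversity.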


References


  1. Bi, J., Bennett, K.P., Embrechts, M., Breneman, C.M., Song, M.: Dimensionality reduction via sparse support vector machines. J. of Machine Learning Research 3, 1229–1243 (2003)
  2. Bradley, A.P.: The use of the area under the ROC curve in the evaluation of machine learning algorithms. Pattern Recognition (1997)
  3. Breiman, L.: Arcing classifiers. Annals of Statistics 26(3), 801–845 (1998)
  4. Collobert, R., Bengio, S.: SVMTorch: Support vector machines for large-scale regression problems. J. of Machine Learning Research 1, 143–160 (2001)
  5. Esposito, R., Saitta, L.: Monte Carlo theory as an explanation of bagging and boosting. In: Proc. IJCAI 2003, pp. 499–504 (2003)
  6. Freund, Y., Schapire, R.E.: Experiments with a new boosting algorithm. In: Saitta, L. (ed.) Proc. ICML 1996, pp. 148–156. Morgan Kaufmann, San Francisco (1996)
  7. Giordana, A., Saitta, L.: Phase transitions in relational learning. Machine Learning 41, 217–251 (2000)
  8. Guerra-Salcedo, C., Whitley, D.: Genetic approach to feature selection for ensemble creation. In: Proc. GECCO 1999, pp. 236–243 (1999)
  9. Guyon, I., Elisseeff, A.: An introduction to variable and feature selection. J. of Machine Learning Research 3, 1157–1182 (2003)
  10. Guyon, I., Weston, J., Barnhill, S., Vapnik, V.: Gene selection for cancer classification using support vector machines. Machine Learning 46, 389–422 (2002)
  11. Hogg, T., Huberman, B.A., Williams, C.P. (eds.): Artificial Intelligence: Special Issue on Frontiers in Problem Solving: Phase Transitions and Complexity, vol. 81(1–2). Elsevier, Amsterdam (1996)
  12. Imamura, K., Heckendorn, R.B., Soule, T., Foster, J.A.: Abstention reduces errors: decision abstaining n-version genetic programming. In: Proc. GECCO 2002, pp. 796–803. Morgan Kaufmann, San Francisco (2002)
  13. John, G.H., Kohavi, R., Pfleger, K.: Irrelevant features and the subset selection problem. In: Proc. ICML 1994, pp. 121–129. Morgan Kaufmann, San Francisco (1994)
  14. Jong, K., Mary, J., Cornuéjols, A., Marchiori, E., Sebag, M.: Ensemble feature ranking. In: Boulicaut, J.-F., Esposito, F., Giannotti, F., Pedreschi, D. (eds.) PKDD 2004. LNCS (LNAI), vol. 3202, pp. 267–278. Springer, Heidelberg (2004) (to appear)
  15. Ling, C.X., Huang, J., Zhang, H.: AUC: a better measure than accuracy in comparing learning algorithms. In: Proc. IJCAI 2003 (2003)
  16. Rosset, S.: Model selection via the AUC. In: Proc. ICML 2004. Morgan Kaufmann, San Francisco (2004) (to appear)
  17. Sebag, M., Azé, J., Lucas, N.: ROC-based evolutionary learning: Application to medical data mining. In: Liardet, P., Collet, P., Fonlupt, C., Lutton, E., Schoenauer, M. (eds.) EA 2003. LNCS, vol. 2936, pp. 384–396. Springer, Heidelberg (2004)
  18. Song, D., Heywood, M.I., Zincir-Heywood, A.N.: A linear genetic programming approach to intrusion detection. In: Proc. GECCO 2002, pp. 2325–2336. Springer, Heidelberg (2002)
  19. Stoppiglia, H., Dreyfus, G., Dubois, R., Oussar, Y.: Ranking a random feature for variable and feature selection. J. of Machine Learning Research 3, 1399–1414 (2003)
  20. Vafaie, H., De Jong, K.: Genetic algorithms as a tool for feature selection in machine learning. In: Proc. ICTAI 1992, pp. 200–204 (1992)
  21. Yan, L., Dodier, R.H., Mozer, M., Wolniewicz, R.H.: Optimizing classifier performance via an approximation to the Wilcoxon-Mann-Whitney statistic. In: Proc. ICML 2003, pp. 848–855. Morgan Kaufmann, San Francisco (2003)

Copyright information

© Springer-Verlag Berlin Heidelberg 2004

Authors and Affiliations

  • Kees Jong¹
  • Elena Marchiori¹
  • Michèle Sebag²
  1. Department of Mathematics and Computer Science, Vrije Universiteit, Amsterdam, The Netherlands
  2. Laboratoire de Recherche en Informatique, CNRS-INRIA, Université Paris-Sud, Orsay, France
