Evolutionary Search of Optimal Features

  • Manuel del Valle
  • Luis F. Lago-Fernández
  • Fernando J. Corbacho
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4224)


Abstract

In data mining problems, the selection of appropriate input transformations is often crucial to obtaining good solutions. The purpose of such transformations is to project the original attribute space onto a new one that, being closer to the problem structure, allows for more compact and interpretable solutions. We address the automatic construction of input transformations for classification problems: an evolutionary algorithm searches the space of input transformations, and a linear method performs the classification in the resulting feature space. Our assumption is that, once a data representation that captures the problem structure has been found, even a linear classifier can reach a good solution. Linear methods are free from local minima, and a representation space closer to the problem structure generally yields more compact and interpretable solutions. We test the method on an artificial problem and on a real classification problem from the UCI database; in both cases we obtain low-error solutions that are also compact and interpretable.


Keywords: Feature Selection · Genetic Programming · Linear Discriminant Analysis · Optimal Feature · Data Mining Problem
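To make the approach concrete, the following is a minimal, hypothetical Python sketch (not the authors' implementation). Candidate feature transformations, restricted here to single pairwise arithmetic combinations of the original attributes, are evolved by mutation and selection, and each candidate is scored by the cross-validated accuracy of a linear discriminant classifier (scikit-learn's LinearDiscriminantAnalysis) trained on the constructed feature. The toy data set, the transformation language, and the simple mutation-only selection scheme are all assumptions made for illustration; the paper itself evolves richer genetic-programming expression trees.

```python
# Minimal sketch (not the authors' code) of evolutionary feature construction
# scored by a linear classifier. Assumptions for illustration: features are
# single pairwise arithmetic combinations of attributes, selection is a simple
# elitist mutation-only loop, and the data set is a toy XOR-like problem where
# the class is the sign of x0 * x1.
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)

# Toy data: no single original attribute separates the classes linearly,
# but the constructed feature x0 * x1 does.
X = rng.uniform(-1.0, 1.0, size=(400, 2))
y = (X[:, 0] * X[:, 1] > 0).astype(int)

OPS = {"add": np.add, "sub": np.subtract, "mul": np.multiply}

def random_feature(n_attrs):
    """A candidate transformation: (operator, attribute i, attribute j)."""
    return (rng.choice(list(OPS)), int(rng.integers(n_attrs)), int(rng.integers(n_attrs)))

def mutate(feat, n_attrs):
    """Randomly replace one component of the transformation."""
    op, i, j = feat
    k = rng.integers(3)
    if k == 0:
        op = rng.choice(list(OPS))
    elif k == 1:
        i = int(rng.integers(n_attrs))
    else:
        j = int(rng.integers(n_attrs))
    return (op, i, j)

def fitness(feat):
    """Cross-validated accuracy of LDA trained on the constructed feature alone."""
    op, i, j = feat
    Z = OPS[op](X[:, i], X[:, j]).reshape(-1, 1)
    return cross_val_score(LinearDiscriminantAnalysis(), Z, y, cv=5).mean()

# Simple evolutionary loop: rank by fitness, keep the best, refill by mutation.
n_attrs = X.shape[1]
population = [random_feature(n_attrs) for _ in range(20)]
for generation in range(30):
    population.sort(key=fitness, reverse=True)
    parents = population[:5]
    children = [mutate(parents[int(rng.integers(len(parents)))], n_attrs)
                for _ in range(15)]
    population = parents + children

best = max(population, key=fitness)
print("best transformation:", best, "cross-validated accuracy: %.3f" % fitness(best))
```

On this toy problem the loop typically recovers the product feature ("mul", 0, 1), on which the linear discriminant reaches near-perfect accuracy, illustrating the abstract's point that a linear classifier suffices once a representation matching the problem structure has been found.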





Copyright information

© Springer-Verlag Berlin Heidelberg 2006

Authors and Affiliations

  • Manuel del Valle (1)
  • Luis F. Lago-Fernández (1, 2)
  • Fernando J. Corbacho (1, 2)

  1. Escuela Politécnica Superior, Universidad Autónoma de Madrid, Madrid, Spain
  2. Cognodata Consulting, Madrid, Spain
