Evolutionary Feature Construction Using Information Gain and Gini Index

  • Mohammed A. Muharram
  • George D. Smith
Part of the Lecture Notes in Computer Science book series (LNCS, volume 3003)


Feature construction using genetic programming is carried out to study how including the evolved attributes affects the performance of a range of classification algorithms. Two different fitness functions are used in the genetic program, one based on information gain and the other on the Gini index. The classification algorithms used are three decision tree algorithms (C5, CART and CHAID) and an MLP neural network. The intention of the research is to ascertain whether the decision tree algorithms benefit more from features constructed by a genetic program whose fitness function incorporates the same fundamental learning mechanism as the splitting criterion of the associated decision tree.
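To make the two fitness measures concrete, the following is a minimal sketch of how a constructed feature could be scored by information gain or by Gini index reduction. The abstract does not specify how the fitness is computed, so everything here is an assumption made for illustration: the binary threshold split, the helper names (`entropy`, `gini`, `split_gain`, `fitness`) and the toy data are all hypothetical and are not taken from the paper.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a sequence of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def gini(labels):
    """Gini impurity of a sequence of class labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def split_gain(values, labels, threshold, impurity):
    """Impurity reduction from the binary split `value <= threshold`."""
    left = [y for v, y in zip(values, labels) if v <= threshold]
    right = [y for v, y in zip(values, labels) if v > threshold]
    if not left or not right:
        return 0.0  # a split that sends everything one way is useless
    n = len(labels)
    weighted = (len(left) / n) * impurity(left) + (len(right) / n) * impurity(right)
    return impurity(labels) - weighted


def fitness(values, labels, impurity):
    """Assumed fitness: best impurity reduction over midpoint thresholds."""
    distinct = sorted(set(values))
    thresholds = [(a + b) / 2 for a, b in zip(distinct, distinct[1:])]
    return max(
        (split_gain(values, labels, t, impurity) for t in thresholds),
        default=0.0,
    )


# Hypothetical toy data: two original attributes and a class label.
x1 = [1.0, 2.0, 3.0, 4.0]
x2 = [4.0, 1.0, 5.0, 2.0]
y = ["a", "b", "a", "b"]

# A candidate evolved attribute, here the expression x1 - x2.
constructed = [a - b for a, b in zip(x1, x2)]

print("information gain fitness:", fitness(constructed, y, entropy))
print("Gini index fitness:", fitness(constructed, y, gini))
print("information gain of x1 alone:", fitness(x1, y, entropy))
```

Passing the impurity function as a parameter keeps the two fitness variants identical apart from the measure itself, which mirrors the comparison the abstract describes. In this toy data the constructed attribute separates the classes better than `x1` alone under either measure.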





Copyright information

© Springer-Verlag Berlin Heidelberg 2004

Authors and Affiliations

  • Mohammed A. Muharram (1)
  • George D. Smith (1)
  1. School of Computing Sciences, UEA Norwich, Norwich, England
