Fitness Function Comparison for GA-Based Feature Construction

  • Conference paper
Current Topics in Artificial Intelligence (CAEPIA 2007)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 4788)

Abstract

When the primitive data representation yields attribute interactions, learning requires feature construction. MFE2/GA, a GA-based feature construction method, has been shown to learn more accurately than other methods when several complex attribute interactions exist. A new fitness function, based on the Minimum Description Length (MDL) principle, is proposed and implemented as part of the MFE3/GA system. Since the individuals of the GA population are collections of new features constructed to change the representation of data, an MDL-based fitness considers not only the part of the data left unexplained by the constructed features (errors), but also the complexity of the constructed features as a new representation (theory). An empirical study shows the advantage of the new fitness function over a fitness function not based on MDL, and both are compared to the performance baselines provided by relevant systems.
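The two-part idea in the abstract (theory plus errors) can be illustrated with a minimal sketch. This is not the MFE3/GA fitness function, whose exact encoding is not given here. It assumes, purely for illustration, that a constructed feature is a tuple of original attribute indices, that the theory cost is the number of bits needed to name those attributes, and that the error cost is the number of bits needed to say how many training examples are misclassified and which ones. The names `theory_bits`, `error_bits`, and `mdl_fitness` are hypothetical.

```python
import math


def theory_bits(features, n_attributes):
    """Bits to name the original attributes used by each constructed feature.

    Assumption: each feature is a tuple of attribute indices, and naming one
    attribute out of n_attributes costs log2(n_attributes) bits.
    """
    bits_per_attribute = math.log2(n_attributes)
    return sum(len(attrs) * bits_per_attribute for attrs in features)


def error_bits(n_examples, n_errors):
    """Bits to state how many examples are errors and which ones they are.

    Assumption: the error count is sent with a uniform code over
    0..n_examples, then the error positions as one of C(n, e) subsets.
    """
    count_cost = math.log2(n_examples + 1)
    position_cost = math.log2(math.comb(n_examples, n_errors))
    return count_cost + position_cost


def mdl_fitness(features, n_attributes, n_examples, n_errors):
    """Total description length in bits; a lower value means a fitter individual."""
    return theory_bits(features, n_attributes) + error_bits(n_examples, n_errors)


# Two hypothetical individuals evaluated on 100 examples with 8 attributes.
simple = [(0, 1)]                      # one feature over two attributes
complex_ = [(0, 1), (2, 3, 4), (5, 6)] # three features over seven attributes

print(mdl_fitness(simple, 8, 100, 10))
print(mdl_fitness(complex_, 8, 100, 10))
print(mdl_fitness(complex_, 8, 100, 2))
```

With the same number of errors, the individual with more constructed features gets a longer description and so a worse score; it is preferred only if its features remove enough errors to pay for their own complexity.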

Editor information

Daniel Borrajo, Luis Castillo, Juan Manuel Corchado

Copyright information

© 2007 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Shafti, L.S., Pérez, E. (2007). Fitness Function Comparison for GA-Based Feature Construction. In: Borrajo, D., Castillo, L., Corchado, J.M. (eds) Current Topics in Artificial Intelligence. CAEPIA 2007. Lecture Notes in Computer Science, vol 4788. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-75271-4_26

  • DOI: https://doi.org/10.1007/978-3-540-75271-4_26

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-75270-7

  • Online ISBN: 978-3-540-75271-4

  • eBook Packages: Computer Science, Computer Science (R0)
