Bloat Control and Generalization Pressure Using the Minimum Description Length Principle for a Pittsburgh Approach Learning Classifier System

  • Jaume Bacardit
  • Josep Maria Garrell
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4399)

Abstract

Bloat control and generalization pressure are key issues in the design of Pittsburgh Approach Learning Classifier Systems (LCS), because both are needed to achieve simple and accurate solutions in a reasonable time. In this paper we propose a method to achieve these objectives based on the Minimum Description Length (MDL) principle. This principle yields a metric that combines the accuracy and the complexity of a theory (rule set, instance set, etc.) in a single measure. We carried out an extensive comparison with our previous generalization pressure method across several domains and using two knowledge representations. The tests show that the MDL-based size control method is a good and robust choice.
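To make the idea of combining accuracy and complexity concrete, the following is a minimal sketch of a generic two-part MDL score, not the formulation used in the paper. The function names, the fixed `bits_per_condition` encoding cost, the `weight` parameter, and the representation of a rule as a tuple of conditions are all assumptions made for illustration. The score adds the bits needed to encode the theory to the bits needed to identify the examples it misclassifies, so a lower score is better.

```python
import math


def theory_length_bits(rule_set, bits_per_condition=8):
    """Hypothetical encoding cost: a fixed number of bits per rule condition."""
    return sum(len(rule) for rule in rule_set) * bits_per_condition


def exceptions_length_bits(n_examples, n_errors):
    """Bits needed to identify which examples are misclassified:
    log2 of the number of ways to choose n_errors out of n_examples."""
    return math.log2(math.comb(n_examples, n_errors))


def mdl_score(rule_set, n_examples, n_errors, weight=1.0):
    """Two-part score: weighted theory length plus exceptions length.
    `weight` (an assumed parameter) trades complexity against accuracy."""
    return (weight * theory_length_bits(rule_set)
            + exceptions_length_bits(n_examples, n_errors))


# Illustrative, made-up rule sets: each rule is a tuple of conditions.
compact = [("a1", "a2"), ("a3", "a4")]
bloated = [("a1", "a2", "a3")] * 10

# The compact set makes one more error but is far cheaper to encode.
print(mdl_score(compact, n_examples=100, n_errors=5))
print(mdl_score(bloated, n_examples=100, n_errors=4))
```

Under these assumed encoding costs, the compact rule set scores lower than the bloated one even though it makes one more error, which is the kind of pressure toward small, general theories that the abstract describes.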

Copyright information

© Springer Berlin Heidelberg 2007

Authors and Affiliations

  • Jaume Bacardit (1)
  • Josep Maria Garrell (2)
  1. Automated Scheduling, Optimisation and Planning research group, School of Computer Science and IT, University of Nottingham, Jubilee Campus, Wollaton Road, Nottingham, NG8 1BB, UK
  2. Intelligent Systems Research Group, Enginyeria i Arquitectura La Salle, Universitat Ramon Llull, Psg. Bonanova 8, 08022-Barcelona, Catalonia, Spain, Europe
