Evolutionary Intelligence

, Volume 1, Issue 3, pp 209–232 | Cite as

Genetic-based machine learning systems are competitive for pattern recognition

  • Albert Orriols-Puig
  • Jorge Casillas
  • Ester Bernadó-Mansilla
Review Article

Abstract

During the last decade, research on Genetic-Based Machine Learning has resulted in several proposals of supervised learning methodologies that use evolutionary algorithms to evolve rule-based classification models. Usually, these new GBML approaches are accompanied by little experimentation and there is a lack of comparisons among different proposals. Besides, the competitiveness of GBML systems with respect to non-evolutionary, highly-used machine learning techniques has only been partially studied. This paper reviews the state of the art in GBML, selects some of the best representatives of different families, and compares the accuracy and the interpretability of their models. The paper also analyzes the behavior of the GBML approaches with respect to some of the most influential machine learning techniques that belong to different learning paradigms such as decision trees, support vector machines, instance-based classifiers, and probabilistic classifiers. The experimental observations emphasize the suitability of GBML systems for performing classification tasks. Moreover, the analysis points out the strengths of the different systems, which can be used as recommendation guidelines on which systems should be employed depending on whether the user prefers to maximize the accuracy or the interpretability of the models.

Keywords

Pattern recognition Supervised learning Fuzzy logics Genetic-based machine learning Learning classifier systems Learning fuzzy-classifier systems 

Notes

Acknowledgments

The authors would like to warmly thank Raúl Giráldez (Pablo de Olavide University), Yusuke Nojima (Osaka Prefecture University), and Raúl Pérez (University of Granada) for providing us with the experimental results of HIDER, HMOF, and SLAVE, respectively. The authors also wish to acknowledge the Ministerio de Educación y Ciencia for its support under projects TIN2005-08386-C05-01 and TIN2005-08386-C05-04, and Generalitat de Catalunya for its support under grants 2005FI-00252 and 2005SGR-00302.

References

  1. 1.
    Aggarwal C (ed) (2007) Data streams: models and algorithms. Springer, HeidelbergGoogle Scholar
  2. 2.
    Aguilar-Ruiz JS, Giraldez R, Riquelme JC (2007) Natural encoding for evolutionary supervised learning. IEEE Trans Evol Comput 11(4):466–479CrossRefGoogle Scholar
  3. 3.
    Aguilar-Ruiz JS, Riquelme JC, Toro M (2003) Evolutionary learning of hierarchical decision rules. IEEE Trans Syst Man Cybern B 33(2):324–331CrossRefGoogle Scholar
  4. 4.
    Aha DW, Kibler DF, Albert MK (1991) Instance-based learning algorithms. Mach Learn 6(1):37–66Google Scholar
  5. 5.
    Alcalá-Fdez J, Sánchez L, García S, del Jesus MJ, Ventura S, Garrell JM, Otero J, Romero C, Bacardit J, Rivas VM, Fernández JC, Herrera F (2008) KEEL: a software tool to assess evolutionary algorithms to data mining problems. Soft Comput (forthcoming)Google Scholar
  6. 6.
    Anikow CZ (1993) A knowledge-intensive genetic algorithm for supervised learning. Mach Learn 13(2–3):189–228CrossRefGoogle Scholar
  7. 7.
    Asuncion A, Newman DJ (2007) UCI Machine learning repository: http://www.ics.uci.edu/∼mlearn/MLRepository.html. University of California
  8. 8.
    Bacardit J (2004) Pittsburgh genetic-based machine learning in the data mining era: representations, generalization and run-time. PhD thesis, Ramon Llull University, Barcelona, Catalonia, SpainGoogle Scholar
  9. 9.
    Bacardit J, Krasnogor N (2008) Empirical evaluation of ensemble techniques for a pittsburgh learning classifier system. In: Learning classifier systems, revised selected papers of the international workshop on learning classifier systems 2006–2007. Springer, HeidelbergGoogle Scholar
  10. 10.
    Bernadó-Mansilla E, Garrell JM (2003) Accuracy-based learning classifier systems: models, analysis and applications to classification tasks. Evol Comput 11(3):209–238CrossRefGoogle Scholar
  11. 11.
    Bernadó-Mansilla E, Ho TK (2005) Domain of competence of XCS classifier system in complexity measurement space. IEEE Trans Evol Comput 9(1):1–23CrossRefGoogle Scholar
  12. 12.
    Bernadó-Mansilla E, Llorà X, Garrell JM (2002) XCS and GALE: a comparative study of two learning classifier systems on data mining. In: Advances in learning classifier systems volume 2321 of LNAI. Springer, Heidelberg, pp 115–132Google Scholar
  13. 13.
    Bonelli P, Parodi A (1991) An efficient classifier system and its experimental comparison with two representative learning methods on three medical domains. In 4th international conference on genetic algorithms, pp 288–295Google Scholar
  14. 14.
    Bull L, Studley M, Bagnall A, Whittley I (2007) Learning classifier system ensembles with rule-sharing. IEEE Trans Evol Comput 11(4):496–502CrossRefGoogle Scholar
  15. 15.
    Butz MV (2006) Rule-based evolutionary online learning systems: a principled approach to LCS analysis and design volume 109 of studies in Fuzziness and Soft Computing. Springer, HeidelbergGoogle Scholar
  16. 16.
    Butz MV, Lanzi PL, Wilson SW (2008) Function approximation with XCS: hyperellipsoidal conditions, recursive least squares, and compaction. IEEE Trans Evol Comput. doi:10.1109/TEVC.2007.903551 (forthcoming)
  17. 17.
    Cantú-Paz E (2001) Efficient and accurate parallel genetic algorithms. Kluwer, DordrechtGoogle Scholar
  18. 18.
    Castillo L, González A, Pérez R (2001) Including a simplicity criterion in the selection of the best rule in a genetic fuzzy learning algorithm. Fuzzy Sets Syst 120:309–321MATHCrossRefGoogle Scholar
  19. 19.
    Corcoran AL, Sen S (1994) Using real-valued genetic algorithms to evolve rule sets for classification. In: International conference on evolutionary computation, pp 120–124Google Scholar
  20. 20.
    O. Cordón, Herrera F, Hoffmann F, Magdalena L (2001) Genetic fuzzy systems: evolutionary tuning and learning of fuzzy knowledge bases volume 19 of advances in fuzzy systems—aplications and theory. World ScientificGoogle Scholar
  21. 21.
    Dam HH, Lokan C, Abbass HA (2007) Evolutionary online data mining: an investigation in a dynamic environment. In: Evolutionary computation in dynamic and uncertain environments volume 51/2007 of studies in computational intelligence. Springer Berlin/Heidelberg, pp 153–178Google Scholar
  22. 22.
    de Jong KA, Spears W (1991) Learning concept classification rules using genetic algorithms. In: Proceedings of the international joint conference on artificial intelligence. Sidney, pp 651–656Google Scholar
  23. 23.
    de Jong KA, Spears WM, Gordon DF (1993) Using genetic algorithms for concept learning. Genetic algorithms for machine learning. In: John J, Grefenstette (eds) A special issue of machine learning, vol 13, 2–3, pp 161–188Google Scholar
  24. 24.
    Deb K, Pratap A, Agarwal S, Meyarivan T (2002) A fast and elitist multiobjective genetic algorithm: NSGA-II. IEEE Trans Evol Comput 6(2):182–197CrossRefGoogle Scholar
  25. 25.
    del Jesús MJ, Hoffmann F, Navascués LJ, Sánchez L (2004) Induction of fuzzy-rule-based classifiers with evolutionary boosting algorithms. IEEE Trans Fuzzy Syst 12(3):296–308CrossRefGoogle Scholar
  26. 26.
    Demšar J (2006) Statistical comparisons of classifiers over multiple data sets. J Mach Learn Res 7:1–30MathSciNetGoogle Scholar
  27. 27.
    Dietterich TG (1998) Approximate statistical tests for comparing supervised classification learning algorithms. Neural Comput 10(7):1895–1924CrossRefGoogle Scholar
  28. 28.
    Dixon PW, Corne DW, Oates MJ (2004) A ruleset reduction algorithm for the XCSI learning classifier system. In: Lecture Notes in Computer Science, vol 2661/2003. Springer, Heidelberg, pp 20–29Google Scholar
  29. 29.
    Drugowitsch J, Barry AM (2008) A formal framework and extensions for function approximation in learning classifier systems. Mach Learn 70(1):45–88CrossRefGoogle Scholar
  30. 30.
    Dunn OJ (1961) Multiple comparisons among means. J Am Stat Assoc 56:52–64MATHCrossRefMathSciNetGoogle Scholar
  31. 31.
    Eiben AE, Smith JE (2003) Introduction to evolutionary computing. Springer, BerlinMATHGoogle Scholar
  32. 32.
    Fisher RA (1959) Statistical methods and scientific inference. 2nd edn. Hafner Publishing Co, New YorkGoogle Scholar
  33. 33.
    Frank E, Witten IH (1998) Generating accurate rule sets without global optimization. In: Proceedings of the 15th international conference on machine learning. Morgan Kaufmann, San Francisco, pp 144–151Google Scholar
  34. 34.
    Freitas A (2002) Data mining and knowledge discovery with evolutionary algorithms. Springer, HeidelbergGoogle Scholar
  35. 35.
    Freund Y, Schapire RE (1996) Experiments with a new boosting algorithm. In: International conference on machine learning, pp 148–156Google Scholar
  36. 36.
    Friedman J, Hastie T, Tibshirani R (2000) Additive logistic regression: a statistical view of boosting. Ann Stat 32(2):337–374CrossRefMathSciNetGoogle Scholar
  37. 37.
    Friedman M (1937) The use of ranks to avoid the assumption of normality implicit in the analysis of variance. J Am Stat Assoc 32:675–701CrossRefGoogle Scholar
  38. 38.
    Friedman M (1940) A comparison of alternative tests of significance for the problem of m rankings. Ann Math Stat 11:86–92MATHCrossRefGoogle Scholar
  39. 39.
    Fu C, Davis L (2002) A modified classifier system compaction algorithm. In: GECCO’02: Proceedings of the 2002 genetic and evolutionary computation conference. Morgan Kaufmann Publishers Inc., San Francisco, pp 920–925Google Scholar
  40. 40.
    Fürnkranz J (1999) Separate-and-conquer rule learning. Artif Intell Rev 13(1):3–54Google Scholar
  41. 41.
    Giordana A, Neri F (1995) Search-intensive concept induction. Evol Comput 3(4):375–419CrossRefGoogle Scholar
  42. 42.
    Giráldez R, Aguilar-Ruiz JS, Riquelme JC (2002) Discretization oriented to decision rules generation. In: Knowledge-based intelligent information engineering systems and allied technologies (KES’02). IOS Press, pp 275–279Google Scholar
  43. 43.
    Goldberg DE (1989) Genetic algorithms in search, optimization and machine learning. 1st edn, Addison Wesley, ReadingGoogle Scholar
  44. 44.
    Goldberg DE (2002) The design of innovation: lessons from and for competent genetic algorithms. 1st edn. Kluwer, DordrechtGoogle Scholar
  45. 45.
    González A, Pérez R (1998) Completeness and consistency conditions for learning fuzzy rules. Fuzzy Sets Syst 96:37–51CrossRefGoogle Scholar
  46. 46.
    González A, Pérez R (1999) SLAVE: a genetic learning system based on an iterative approach. IEEE Trans Fuzzy Syst 7(2):176–191CrossRefGoogle Scholar
  47. 47.
    Greene DP, Smith SE (1993) Competition-based induction of decision models from examples. Mach Learn 13:229–257CrossRefGoogle Scholar
  48. 48.
    Herrera F (2008) Genetic fuzzy systems: taxonomy and current research trends and prospects. Evol Intell 1(1):27–46. doi:10.1007/s12065-007-0001-5 Google Scholar
  49. 49.
    Hochberg Y (1988) A sharper Bonferroni procedure for multiple tests of significance. Biometrika 75:800–802MATHCrossRefMathSciNetGoogle Scholar
  50. 50.
    Holland JH (1975) Adaptation in natural and artificial systems. The University of Michigan Press, Ann ArborGoogle Scholar
  51. 51.
    Holland JH (1976) Adaptation. In: Rosen R, Snell F (eds) Progress in theoretical biology, vol 4. Academic Press, New York, pp 263–293Google Scholar
  52. 52.
    Holland JH, Reitman JS (1978) Cognitive systems based on adaptive algorithms. In: Waterman DA, Hayes-Roth F (eds) Pattern-directed inference systems. Academic Press, San Diego, pp 313–329Google Scholar
  53. 53.
    Holm S (1979) A simple sequentially rejective multiple test procedure. Scand J Stat 6:65–70MATHMathSciNetGoogle Scholar
  54. 54.
    Ishibuchi H, Nojima Y (2007) Analysis of interpretability-accuracy tradeoff of fuzzy systems by multiobjective fuzzy genetics-based machine learning. Int J Approx Reason 44(1):4–31MATHCrossRefMathSciNetGoogle Scholar
  55. 55.
    Ishibuchi H, Yamamoto T (2005) Rule weight specification in fuzzy rule-based classification systems. IEEE Trans Fuzzy Syst 13(4):428–435CrossRefGoogle Scholar
  56. 56.
    Ishibuchi H, Yamamoto T, Murata T (2005) Hybridization of fuzzy GBML approaches for pattern classification problems. IEEE Trans Syst Man Cybern B Cybern 35(2):359–365CrossRefGoogle Scholar
  57. 57.
    John GH, Langley P (1995) Estimating continuous distributions in bayesian classifiers. In: 11th conference on uncertainty in artificial intelligence. Morgan Kaufmann, San Mateo, pp 338–345Google Scholar
  58. 58.
    Llorà X, Garrell JM (2001) Knowledge-independent data mining with fine-grained parallel evolutionary algorithms. In: GECCO’01: Proceedings of the 2th annual conference on genetic and evolutionary computation. Morgan Kaufmann Publishers, San Mateo, pp 461?‘468Google Scholar
  59. 59.
    Llorà X, Reddy R, Matesic B, Bhargava R (2007) Towards better than human capability in diagnosing prostate cancer using infrared spectroscopic imaging. In GECCO’07: Proceedings of the 9th annual conference on genetic and evolutionary computation. ACM, New York, pp 2098–2105Google Scholar
  60. 60.
    Llorà X, Sastry K (2006) Fast rule matching for learning classifier systems via vector instructions. In: GECCO’06: Proceedings of the 8th annual conference on Genetic and evolutionary computation. ACM, New York, pp 1513–1520Google Scholar
  61. 61.
    Llorà X, Sastry K, Yu T-L, Goldberg DE (2007) Do not match, inherit: fitness surrogates for genetics-based machine learning techniques. In: GECCO’07: Proceedings of the 9th annual conference on Genetic and evolutionary computation. ACM, New York, pp 1798–1805Google Scholar
  62. 62.
    Llorà X, Wilson SW (2004) Mixed decision trees: minimizing knowledge representation bias in lcs. In: GECCO’04: Proceedings of the genetic and evolutionary computation conference. Springer, LNCS, vol 3103, pp 797–809Google Scholar
  63. 63.
    Michalewicz Z (1999) Genetic algorithms + data structures = evolution programs. 3rd edn. Springer, Heidelberg Google Scholar
  64. 64.
    Nojima Y, Ishibuchi H, Kuwajima I (2008) Parallel distributed genetic fuzzy rule selection. Soft Comput (forthcomming)Google Scholar
  65. 65.
    Nurnberger A, Borgelt C, Klose A (1999) Improving naive Bayes classifiers using neuro-fuzzy learning. In: Proceedings of the 1999 conference on neural information processing, vol 1, Perth, pp 154–159Google Scholar
  66. 66.
    Núñez M, Fidalgo R, Morales R (2007) Learning in environments with unknown dynamics: towards more robust concept learners. J Mach Learn Res 8:2595–2628MathSciNetGoogle Scholar
  67. 66.
    Orriols-Puig A, Bernadó-Mansilla E (2004) Analysis of reduction algorithms for XCS classifier system. In: Recent advances in artificial intelligence research and development number 113 in 1. IOS Press, pp 383–390Google Scholar
  68. 68.
    Orriols-Puig A, Bernadó-Mansilla E (2006) Bounding XCS parameters for unbalanced datasets. In: GECCO’06: Proceedings of the 2006 genetic and evolutionary computation conference. ACM Press, New York, pp 1561–1568Google Scholar
  69. 69.
    Orriols-Puig A, Bernadó-Mansilla E (2008) Evolutionary rule-based systems for imbalanced datasets. Soft Comput J. doi:10.1007/s00500-008-0319-7
  70. 70.
    Orriols-Puig A, Bernadó-Mansilla E (2008) Revisiting UCS: description, fitness sharing and comparison with XCS. In: Advances at the Frontier of LCSs. Springer, HeidelbergGoogle Scholar
  71. 71.
    Otero J, Sánchez L (2006) Induction of descriptive fuzzy classifiers with the logitboost algorithm. Soft Comput 10(9):825–835CrossRefGoogle Scholar
  72. 72.
    Platt J (1998) Fast training of support vector machines using sequential minimal optimization. In: Advances in Kernel methods—support vector learning. MIT Press, CambridgeGoogle Scholar
  73. 73.
    Quinlan JR (1995) C4.5: programs for machine learning. Morgan Kaufmann Publishers, San MateoGoogle Scholar
  74. 74.
    Rissanen J (1978) Modeling by shortest data description. Automatica 14:465–471MATHCrossRefGoogle Scholar
  75. 75.
    Rivest RL (1987) Learning decision lists. Mach Learn 2(3):229–246Google Scholar
  76. 76.
    Sánchez L, Couso I (2007) Advocating the use of imprecisely observed data in genetic fuzzy systems. IEEE Trans Fuzzy Syst 15(4):551–562CrossRefGoogle Scholar
  77. 77.
    Sánchez L, Couso I, Casillas J (2007) Modeling vague data with genetic fuzzy systems under a combination of crisp and imprecise criteria. In: Proceedings of the 2007 IEEE symposium on computational intelligence in multicriteria decision making, pp 346–353Google Scholar
  78. 78.
    Schapire RE, Singer Y (1999) Improved boosting algorithms using confidence-rated predictions. Mach Learn 37(3):297–336MATHCrossRefGoogle Scholar
  79. 79.
    Sheskin DJ (2000) Handbook of parametric and nonparametric statistical procedures. Chapman & Hall, LondonGoogle Scholar
  80. 80.
    Smith SF (1980) A learning system based on genetic adaptive algorithms. PhD thesis. University of Pittsburgh, USAGoogle Scholar
  81. 81.
    Stone C, Bull L (2003) For real! XCS with continuous-valued inputs. Evol Comput 11(3):299–336CrossRefGoogle Scholar
  82. 82.
    Tammee K, Bull L, Ouen P (2007) Ycsc: a modified clustering technique based on lcs. J Digit Inf Manage 5(3):160–167Google Scholar
  83. 83.
    Theodoridis S, Koutroumbas K (2006) Pattern Recognition, 3rd edn. Elsevier, AmsterdamGoogle Scholar
  84. 84.
    Vapnik V (1995) The nature of statistical learning theory. Springer, New YorkMATHGoogle Scholar
  85. 85.
    Venturini G (1993) SIA: a supervised inductive algorithm with genetic search for learning attributes based concepts. In: Brazdil PB (eds) Machine learning: ECML-93 - Proc. of the European conference on machine learning. Springer, Berlin, pp 280–296Google Scholar
  86. 86.
    Wilcoxon F (1945) Individual comparisons by ranking methods. Biometrics 1:80–83CrossRefGoogle Scholar
  87. 87.
    Wilson SW (1995) Classifier fitness based on accuracy. Evol Comput 3(2):149–175CrossRefGoogle Scholar
  88. 88.
    Wilson SW (1998) Generalization in the XCS classifier system. In: 3rd annual conf. on genetic programming. Morgan Kaufmann, San Mateo, pp 665–674Google Scholar
  89. 87.
    Wilson SW (2000) Get real! XCS with continuous-valued inputs. In: Learning classifier systems. From foundations to applications LNAI, Springer, Berlin, pp 209–219Google Scholar
  90. 90.
    Wilson SW (2002) Classifiers that approximate functions. J Nat Comput 1(2):211–234MATHCrossRefGoogle Scholar
  91. 91.
    Wilson SW (2002) Compact rulesets from XCSI. In: Advances in learning classifier systems, 4th international workshop, Lecture Notes in Artificial Intelligence, vol 2321. Springer, Heidelberg, pp 197–210Google Scholar
  92. 92.
    Wilson SW (2008) Classifier conditions using gene expression programming. Technical report, IlliGAL Report No. 2008001, Urbana-Champaign IL 61801, USAGoogle Scholar
  93. 93.
    Witten IH, Frank E (2005) Data mining: practical machine learning tools and techniques, 2nd edn. Morgan Kaufmann, San FranciscoGoogle Scholar
  94. 94.
    Wu X, Kumar V, Quinlan JR, Ghosh J, Yang Q, Motoda H, McLachlan GJ, Ng A, Liu B, Yu PS, Zhou ZH, Steinbach M, Hand DJ, Steinberg D (2007) Top 10 algorithms in data mining. Knowl Inf Syst 14(1):1–37CrossRefGoogle Scholar

Copyright information

© Springer-Verlag 2008

Authors and Affiliations

  • Albert Orriols-Puig
    • 1
  • Jorge Casillas
    • 2
  • Ester Bernadó-Mansilla
    • 1
  1. 1.Grup de Recerca en Sistemes Intel·ligents, Enginyeria i Arquitectura La SalleUniversitat Ramon LlullBarcelonaSpain
  2. 2.Department of Computer Science and Artificial IntelligenceUniversity of GranadaGranadaSpain

Personalised recommendations