Memetic Computing, Volume 5, Issue 2, pp 95–130

Integrating memetic search into the BioHEL evolutionary learning system for large-scale datasets

Regular Research Paper

Abstract

Local search methods are widely used to improve the performance of evolutionary computation algorithms in all kinds of domains. Advanced and efficient exploration mechanisms become crucial in complex problems with very large search spaces, such as when applying evolutionary algorithms to large-scale data mining tasks. Recently, the GAssist Pittsburgh evolutionary learning system was extended with memetic operators for discrete representations that use information from the supervised learning process to heuristically edit classification rules and rule sets. In this paper we first adapt some of these operators to BioHEL, a different evolutionary learning system that applies the iterative learning approach, and then propose versions of these operators designed for continuous attributes and for dealing with noise. The performance of all these operators and their combinations is extensively evaluated on a broad range of synthetic large-scale datasets to identify the settings that offer the best balance between efficiency and accuracy. Finally, the best configurations identified are compared with other classes of machine learning methods on both synthetic and real-world large-scale datasets, where they show very competent performance.

Keywords

Memetic algorithms · Evolutionary algorithms · Evolutionary rule learning

Copyright information

© Springer-Verlag Berlin Heidelberg 2013

Authors and Affiliations

  1. Department of Computer Science, University College London, London, UK
  2. Interdisciplinary Computing and Complex Systems (ICOS) Research Group, School of Computer Science, University of Nottingham, Nottingham, UK
