Abstract
Local search methods are widely used to improve the performance of evolutionary computation algorithms across many domains. Advanced and efficient exploration mechanisms become crucial in complex problems with very large search spaces, such as when evolutionary algorithms are applied to large-scale data mining tasks. Recently, the GAssist Pittsburgh evolutionary learning system was extended with memetic operators for discrete representations that use information from the supervised learning process to heuristically edit classification rules and rule sets. In this paper we first adapt some of these operators to BioHEL, a different evolutionary learning system that applies the iterative learning approach, and then propose versions of these operators designed for continuous attributes and for dealing with noise. The performance of these operators, individually and in combination, is extensively evaluated on a broad range of synthetic large-scale datasets to identify the settings that offer the best balance between efficiency and accuracy. Finally, the best configurations identified are compared with other classes of machine learning methods on both synthetic and real-world large-scale datasets, where they show very competitive performance.
Acknowledgments
We acknowledge the support of the UK Engineering and Physical Sciences Research Council (EPSRC) under grant EP/H016597/1. We are grateful for the use of the University of Nottingham’s High Performance Computing Facility.
Cite this article
Calian, D.A., Bacardit, J. Integrating memetic search into the BioHEL evolutionary learning system for large-scale datasets. Memetic Comp. 5, 95–130 (2013). https://doi.org/10.1007/s12293-013-0108-4
DOI: https://doi.org/10.1007/s12293-013-0108-4