Integrating memetic search into the BioHEL evolutionary learning system for large-scale datasets

Calian, Dan Andrei; Bacardit, Jaume

doi:10.1007/s12293-013-0108-4

Regular Research Paper
Published: 16 February 2013

Memetic Computing, Volume 5, pages 95–130 (2013)

Abstract

Local search methods are widely used to improve the performance of evolutionary computation algorithms in all kinds of domains. Employing advanced and efficient exploration mechanisms becomes crucial in complex problems with very large search spaces, such as when applying evolutionary algorithms to large-scale data mining tasks. Recently, the GAssist Pittsburgh evolutionary learning system was extended with memetic operators for discrete representations that use information from the supervised learning process to heuristically edit classification rules and rule sets. In this paper we first adapt some of these operators to BioHEL, a different evolutionary learning system that applies the iterative learning approach, and afterwards propose versions of these operators designed for continuous attributes and for dealing with noise. The performance of all these operators and their combinations is extensively evaluated on a broad range of synthetic large-scale datasets to identify the settings that present the best balance between efficiency and accuracy. Finally, the identified best configurations are compared with other classes of machine learning methods on both synthetic and real-world large-scale datasets and show very competent performance.

Notes

Briefly described in the next subsection.

Acknowledgments

We acknowledge the support of the UK Engineering and Physical Sciences Research Council (EPSRC) under grant EP/H016597/1. We are grateful for the use of the University of Nottingham’s High Performance Computing Facility.

Author information

Authors and Affiliations

Department of Computer Science, University College London, Gower Street, London, WC1E 6BT, UK
Dan Andrei Calian
Interdisciplinary Computing and Complex Systems (ICOS) Research Group, School of Computer Science, University of Nottingham, Jubilee Campus, Wollaton Road, Nottingham, NG8 1BB, UK
Jaume Bacardit

Corresponding author

Correspondence to Jaume Bacardit.

Appendix

See Tables 16, 17, 18 and 19.

Table 16 ES1: Full cross-validation accuracy results on the Checkerboard datasets

Table 17 ES1: Full run-time (in s) results on the Checkerboard datasets

Table 18 ES2: Full training accuracy results on the large-scale synthetic datasets

Table 19 ES2: Full run-time (in s) results on the large-scale synthetic datasets

About this article

Cite this article

Calian, D.A., Bacardit, J. Integrating memetic search into the BioHEL evolutionary learning system for large-scale datasets. Memetic Comp. 5, 95–130 (2013). https://doi.org/10.1007/s12293-013-0108-4

Received: 27 March 2012
Accepted: 17 January 2013
Published: 16 February 2013
Issue Date: June 2013
DOI: https://doi.org/10.1007/s12293-013-0108-4
