Abstract
In this paper, we use the TensorFlow framework to investigate the benefits of applying data vectorization and fitness caching to domain evaluation in Genetic Programming (GP). For this purpose, we developed an independent engine, TensorGP, along with a testing suite for extracting comparative timing results across different architectures and between iterative and vectorized approaches. Our performance benchmarks further analyze the benefits of employing vectorization techniques and throughput-oriented hardware in several GP scenarios with varying tree sizes and domain resolutions. Specifically, we show that applying the TensorFlow eager execution model to the evolutionary process yields speedups of up to two orders of magnitude for a parallel approach running on a GPU, compared to a standard iterative approach on a typical symbolic regression problem. Lastly, we also demonstrate the performance benefits of explicit operator definition over operator composition in TensorGP.
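The core idea behind vectorized domain evaluation can be illustrated with a toy sketch. The function and variable names below are ours, not TensorGP's API, and NumPy stands in for TensorFlow's analogous tensor operations: instead of interpreting a candidate tree once per domain point, each terminal becomes a whole tensor, so one pass over the tree evaluates every point at once.

```python
import numpy as np

# Hypothetical example (not TensorGP's actual API): evaluate the candidate
# expression sin(x) + x * y over every point of a 2-D domain.

def eval_iterative(xs, ys):
    # Classic GP interpreter style: one tree evaluation per domain point.
    out = np.empty((len(xs), len(ys)))
    for i, x in enumerate(xs):
        for j, y in enumerate(ys):
            out[i, j] = np.sin(x) + x * y
    return out

def eval_vectorized(xs, ys):
    # Vectorized style: the terminals x and y are full tensors, so a single
    # pass over the tree covers the whole domain (TensorGP does the analogous
    # thing with TensorFlow ops, which can then run on a GPU).
    x, y = np.meshgrid(xs, ys, indexing="ij")
    return np.sin(x) + x * y

xs = np.linspace(-1.0, 1.0, 64)
ys = np.linspace(-1.0, 1.0, 64)
assert np.allclose(eval_iterative(xs, ys), eval_vectorized(xs, ys))
```

The two routines compute identical results; the vectorized form simply trades the per-point interpreter loop for a handful of bulk tensor operations, which is where throughput-oriented hardware pays off.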
Data Availability Statement
All datasets used in the experiments are publicly available.
Code Availability Statement
Part of the code can be found at https://github.com/AwardOfSky/TensorGP.
Notes
TensorGP repository available at https://github.com/AwardOfSky/TensorGP.
References
Abadi M, Barham P, Chen J, Chen Z, Davis A, Dean J, Devin M, Ghemawat S, Irving G, Isard M, et al. TensorFlow: a system for large-scale machine learning. In: 12th USENIX Symposium on Operating Systems Design and Implementation (OSDI 16). 2016; 265–283.
Agrawal A, Modi AN, Passos A, Lavoie A, Agarwal A, Shankar A, Ganichev I, Levenberg J, Hong M, Monga R, et al. TensorFlow Eager: a multi-stage, Python-embedded DSL for machine learning. arXiv preprint 2019; arXiv:1903.01855.
Andre D, Koza JR. Parallel genetic programming: a scalable implementation using the transputer network architecture. In: Advances in genetic programming. MIT Press; 1996. p. 317–37.
Arenas M, Romero G, Mora A, Castillo P, Merelo J. GPU parallel computation in bioinspired algorithms: a review. In: Advances in intelligent modelling and simulation. Springer; 2013. p. 113–34.
Augusto DA, Barbosa HJ. Accelerated parallel genetic programming tree evaluation with OpenCL. J Parallel Distrib Comput. 2013;73(1):86–100.
Baeta F, Correia J, Martins T, Machado P. TensorGP - genetic programming engine in TensorFlow. In: Castillo PA, Laredo JLJ, editors. Applications of Evolutionary Computation - 24th International Conference, EvoApplications 2021, Held as Part of EvoStar 2021, Virtual Event, Proceedings. Lecture Notes in Computer Science, vol. 12694. Springer; 2021. p. 763–78. https://doi.org/10.1007/978-3-030-72699-7_48.
Burlacu B, Kronberger G, Kommenda M. Operon C++: an efficient genetic programming framework for symbolic regression. In: Proceedings of the 2020 Genetic and Evolutionary Computation Conference Companion. 2020; 1562–1570.
Cano A, Ventura S. GPU-parallel subtree interpreter for genetic programming. In: Proceedings of the 2014 Annual Conference on Genetic and Evolutionary Computation. 2014; 887–894. ACM.
Cano A, Zafra A, Ventura S. Speeding up the evaluation phase of GP classification algorithms on GPUs. Soft Comput. 2012;16(2):187–202.
Cavaglia M, Staats K, Gill T. Finding the origin of noise transients in LIGO data with machine learning. arXiv preprint 2018; arXiv:1812.05225.
Chitty DM. A data parallel approach to genetic programming using programmable graphics hardware. In: Proceedings of the 9th Annual Conference on Genetic and Evolutionary Computation. 2007; 1566–1573. ACM.
Chitty DM. Fast parallel genetic programming: multi-core CPU versus many-core GPU. Soft Comput. 2012;16(10):1795–814.
Fortin FA, De Rainville FM, Gardner MAG, Parizeau M, Gagné C. DEAP: evolutionary algorithms made easy. J Mach Learn Res. 2012;13(1):2171–5.
Fu X, Ren X, Mengshoel OJ, Wu X. Stochastic optimization for market return prediction using financial knowledge graph. In: 2018 IEEE International Conference on Big Knowledge (ICBK). 2018; 25–32. IEEE.
Giacobini M, Tomassini M, Vanneschi L. Limiting the number of fitness cases in genetic programming using statistics. In: International Conference on Parallel Problem Solving from Nature. Springer; 2002. p. 371–80.
Handley S. On the use of a directed acyclic graph to represent a population of computer programs. In: Proceedings of the First IEEE Conference on Evolutionary Computation. IEEE World Congress on Computational Intelligence. 1994; 154–159. IEEE.
Keijzer M. Efficiently representing populations in genetic programming. In: Advances in genetic programming. MIT Press; 1996. p. 259–78.
Keijzer M. Alternatives in subtree caching for genetic programming. In: European Conference on Genetic Programming. Springer; 2004. p. 328–37.
Koza JR, Bennett F, Hutchings JL, Bade SL, Keane MA, Andre D. Evolving sorting networks using genetic programming and the rapidly reconfigurable Xilinx 6216 field-programmable gate array. In: Conference Record of the Thirty-First Asilomar Conference on Signals, Systems and Computers (Cat. No. 97CB36136). 1997;1: 404–410. IEEE.
Machado P, Cardoso A. Speeding up genetic programming. Procs 2nd Int Symp AI Adapt Syst CIMAF. 1999;99:217–22.
Matousek R, Hulka T, Dobrovsky L, Kudela J. Sum epsilon-tube error fitness function design for gp symbolic regression: Preliminary study. In: 2019 International Conference on Control, Artificial Intelligence, Robotics & Optimization (ICCAIRO). 2019; 78–83. IEEE.
McDermott J, White DR, Luke S, Manzoni L, Castelli M, Vanneschi L, Jaskowski W, Krawiec K, Harper R, De Jong K, et al. Genetic programming needs better benchmarks. In: Proceedings of the 14th annual conference on Genetic and evolutionary computation. 2012; 791–798.
de Melo VV, Fazenda ÁL, Sotto LFDP, Iacca G. A MIMD interpreter for genetic programming. In: International Conference on the Applications of Evolutionary Computation (Part of EvoStar). Springer; 2020. p. 645–58.
Moore G. Cramming more components onto integrated circuits. Proc IEEE. 1998;86(1):82–5. https://doi.org/10.1109/JPROC.1998.658762.
Pagie L, Hogeweg P. Evolutionary consequences of coevolving targets. Evol comput. 1997;5(4):401–18.
Poli R, Langdon WB, McPhee NF. A field guide to genetic programming. Lulu Enterprises, UK Ltd; 2008.
Rowland T, Weisstein EW. Tensor. From MathWorld—A Wolfram Web Resource. http://mathworld.wolfram.com/Tensor.html. Accessed 11 June 2021.
Staats K, Pantridge E, Cavaglia M, Milovanov I, Aniyan A. TensorFlow enabled genetic programming. In: Proceedings of the Genetic and Evolutionary Computation Conference Companion. 2017; 1872–1879. ACM.
Van der Walt S, Schönberger JL, Nunez-Iglesias J, Boulogne F, Warner JD, Yager N, Gouillart E, Yu T. scikit-image: image processing in Python. PeerJ. 2014;2:e453.
Wong P, Zhang M. Scheme: caching subtrees in genetic programming. In: 2008 IEEE Congress on Evolutionary Computation (IEEE World Congress on Computational Intelligence). 2008; 2678–2685. IEEE.
Funding
This work is funded by national funds through the FCT - Foundation for Science and Technology, I.P., within the scope of the project CISUC - UID/CEC/00326/2020, by the European Social Fund through the Regional Operational Program Centro 2020, and by the project grant DSAIPA/DS/0022/2018 (GADgET). We also thank the NVIDIA Corporation for the hardware granted to this research.
Author information
Authors and Affiliations
Contributions
This paper is an extended version of the work published in [6]. We first investigate the benefits of applying data vectorization and fitness caching methods to domain evaluation in Genetic Programming. For this purpose, an independent engine was developed, TensorGP, along with a testing suite for extracting comparative timing results across different architectures and between iterative and vectorized approaches. Our performance benchmarks demonstrate that by exploiting the TensorFlow eager execution model, performance gains of up to two orders of magnitude can be achieved with a parallel approach running on dedicated hardware, compared to a standard iterative approach. In summary, the contributions are as follows: (i) a new Genetic Programming engine; (ii) a benchmark speedup study with both an off-the-shelf framework and a direct implementation of a Genetic Programming algorithm; (iii) an evaluation of different execution modes; (iv) a showcase of the implementation of a particular operator, with an analysis of its performance for different execution types.
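The distinction behind contribution (iv), explicit operator definition versus operator composition, can be sketched as follows. The operator and its name are hypothetical (not taken from TensorGP), and NumPy stands in for TensorFlow's tensor ops: a composed operator is built by chaining generic primitives, each materializing an intermediate tensor, while an explicit definition expresses the same computation as one dedicated function that the framework can dispatch and optimize as a single unit.

```python
import numpy as np

# Hypothetical "smooth step" operator, written two ways.

def sstep_composed(x):
    # Composition of generic primitives: each call produces an
    # intermediate tensor before the next one runs.
    clipped = np.minimum(np.maximum(x, 0.0), 1.0)
    return clipped * clipped * (3.0 - 2.0 * clipped)

def sstep_explicit(x):
    # The same operator defined explicitly as one dedicated function;
    # a framework can treat this single call as one fused operation.
    c = np.clip(x, 0.0, 1.0)
    return c * c * (3.0 - 2.0 * c)

x = np.linspace(-0.5, 1.5, 5)
assert np.allclose(sstep_composed(x), sstep_explicit(x))
```

Both versions are numerically identical; the performance question studied in the paper is how much overhead the intermediate results and extra dispatches of the composed form cost under different execution modes.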
Corresponding author
Ethics declarations
Conflict of interest
On behalf of all authors, the corresponding author states that there is no conflict of interest.
Ethics approval
Not applicable.
Consent to participate
Not applicable.
Consent for publication
Not applicable.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
This article is part of the topical collection “Applications of bioinspired computing (to real world problems)” guest edited by Aniko Ekart, Pedro Castillo and Juanlu Jiménez-Laredo.
Rights and permissions
About this article
Cite this article
Baeta, F., Correia, J., Martins, T. et al. Exploring Genetic Programming in TensorFlow with TensorGP. SN COMPUT. SCI. 3, 154 (2022). https://doi.org/10.1007/s42979-021-01006-8