Abstract
Symbolic regression has many applications in data mining and scientific computing. Genetic Programming (GP) is the mainstream method for solving symbolic regression problems, but its execution speed on large datasets has long been a bottleneck. This paper describes a CUDA-based parallel symbolic regression algorithm that leverages the parallelism of the GPU to speed up fitness evaluation. The fitness evaluation step is performed entirely on the GPU and makes use of various GPU hardware resources. We compare training time and regression accuracy between the proposed approach and existing symbolic regression frameworks, including gplearn, TensorGP, and KarooGP. The proposed approach is the fastest of all the tested frameworks on both synthetic and large-scale benchmarks.
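To illustrate why fitness evaluation lends itself to GPU parallelism, the following is a minimal CPU-side sketch in Python, not the kernel described in the paper. It assumes a hypothetical postfix encoding of a GP tree, a stack-based interpreter, protected division, and mean squared error as the fitness measure; none of these details are stated in the abstract. The point it shows is that each data row is evaluated independently, so a data-parallel implementation could plausibly assign rows to GPU threads and then reduce the per-row errors.

```python
import math

# Hypothetical postfix encoding of a GP tree (an assumption for illustration):
# a token is an operator name, ("x", i) for input variable i,
# or ("c", value) for a constant.
BINARY = {
    "add": lambda a, b: a + b,
    "sub": lambda a, b: a - b,
    "mul": lambda a, b: a * b,
    # Protected division, a common GP convention (assumed here).
    "div": lambda a, b: a / b if abs(b) > 1e-9 else 1.0,
}
UNARY = {"sin": math.sin, "cos": math.cos}


def eval_program(program, row):
    """Evaluate one postfix program on one data row using an explicit stack."""
    stack = []
    for tok in program:
        if isinstance(tok, tuple):
            kind, val = tok
            stack.append(row[val] if kind == "x" else val)
        elif tok in UNARY:
            stack.append(UNARY[tok](stack.pop()))
        else:
            b = stack.pop()
            a = stack.pop()
            stack.append(BINARY[tok](a, b))
    return stack[0]


def fitness(program, X, y):
    """Mean squared error over the dataset.

    Each row is independent of the others, which is the property a
    data-parallel kernel would exploit before a final reduction.
    """
    errors = [(eval_program(program, row) - t) ** 2 for row, t in zip(X, y)]
    return sum(errors) / len(errors)


# Toy target: y = x0 * x0 + x1
X = [(1.0, 2.0), (2.0, 0.5), (-3.0, 1.0)]
y = [x0 * x0 + x1 for x0, x1 in X]

perfect = [("x", 0), ("x", 0), "mul", ("x", 1), "add"]  # x0*x0 + x1
worse = [("x", 0), ("x", 1), "add"]                     # x0 + x1

print(fitness(perfect, X, y), fitness(worse, X, y))
```

In this sketch the list comprehension in `fitness` is the part that grows with dataset size, which is consistent with the abstract's claim that fitness evaluation is the bottleneck on large datasets.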
Copyright information
© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Zhang, R., Lensen, A., Sun, Y. (2022). Speeding up Genetic Programming Based Symbolic Regression Using GPUs. In: Khanna, S., Cao, J., Bai, Q., Xu, G. (eds) PRICAI 2022: Trends in Artificial Intelligence. PRICAI 2022. Lecture Notes in Computer Science, vol 13629. Springer, Cham. https://doi.org/10.1007/978-3-031-20862-1_38
DOI: https://doi.org/10.1007/978-3-031-20862-1_38
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-20861-4
Online ISBN: 978-3-031-20862-1
eBook Packages: Computer Science (R0)