Linear Combination of Distance Measures for Surrogate Models in Genetic Programming

  • Martin Zaefferer
  • Jörg Stork
  • Oliver Flasch
  • Thomas Bartz-Beielstein
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11102)


Surrogate models are a well-established approach to reducing the number of expensive function evaluations in continuous optimization. In genetic programming, surrogate modeling remains a challenge because of the complex genotype-phenotype relationships. We investigate how different genotypic and phenotypic distance measures can be used to learn Kriging models as surrogates. We compare the measures and suggest using their linear combination in a kernel.

We test the resulting model in an optimization framework, using symbolic regression problem instances as a benchmark. Our experiments show that the model provides valuable information in two ways. First, it improves optimization performance compared to a model-free algorithm. Second, it reveals how much each distance measure contributes. The data indicate that a phenotypic distance measure is important during the early stages of an optimization run, when little data is available. In contrast, genotypic measures, such as the tree edit distance, contribute more during the later stages.
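The idea of combining distance measures linearly inside a kernel can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the nested-tuple program representation, the exponential kernel form exp(-Σ wᵢ dᵢ), the function names, and in particular the two distances, which are crude stand-ins (a node-count difference instead of the tree edit distance, and a mean absolute output difference on three sample inputs as the phenotypic measure).

```python
import math

# Hypothetical program representation: a nested tuple ("add" | "mul", left, right),
# the variable "x", or a numeric constant.
def evaluate(tree, x):
    """Evaluate a program tree at input x."""
    if isinstance(tree, tuple):
        op, left, right = tree
        a, b = evaluate(left, x), evaluate(right, x)
        return a + b if op == "add" else a * b
    return x if tree == "x" else float(tree)

def size(tree):
    """Count the nodes of a program tree."""
    if isinstance(tree, tuple):
        return 1 + size(tree[1]) + size(tree[2])
    return 1

def phenotypic_distance(t1, t2, inputs=(0.0, 0.5, 1.0)):
    """Stand-in phenotypic measure: mean absolute difference of outputs."""
    return sum(abs(evaluate(t1, v) - evaluate(t2, v)) for v in inputs) / len(inputs)

def genotypic_distance(t1, t2):
    """Stand-in genotypic measure: difference in node counts (NOT tree edit distance)."""
    return abs(size(t1) - size(t2))

def combined_kernel(t1, t2, weights, distances):
    """Assumed kernel form: exp(-sum_i w_i * d_i(t1, t2)) with non-negative weights."""
    combined = sum(w * d(t1, t2) for w, d in zip(weights, distances))
    return math.exp(-combined)

# Example usage with two small programs: x + 1 and x * (x + 1).
t1 = ("add", "x", 1)
t2 = ("mul", "x", ("add", "x", 1))
distances = (phenotypic_distance, genotypic_distance)
weights = (1.0, 0.1)  # illustrative values only

k_self = combined_kernel(t1, t1, weights, distances)
k_pair = combined_kernel(t1, t2, weights, distances)
```

In a Kriging model such weights would usually be fitted to the observed data rather than fixed by hand, and the fitted values could then be read as a rough indication of how much each measure contributes. Note that a kernel built this way is not guaranteed to be positive semi-definite for arbitrary distance measures, so a practical implementation may need an additional correction step.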


Genetic programming · Surrogate models · Distance measures



Copyright information

© Springer Nature Switzerland AG 2018

Authors and Affiliations

  • Martin Zaefferer (1), corresponding author
  • Jörg Stork (1)
  • Oliver Flasch (2)
  • Thomas Bartz-Beielstein (1)

  1. Institute of Data Science, Engineering, and Analytics, TH Köln, Gummersbach, Germany
  2. sourcewerk GmbH, Dortmund, Germany
