Abstract
We present the results of a community survey regarding genetic programming benchmark practices. Analysis shows broad consensus that improvement is needed in problem selection and experimental rigor. While views expressed in the survey dissuade us from proposing a large-scale benchmark suite, we find community support for creating a “blacklist” of problems which are in common use but have important flaws, and whose use should therefore be discouraged. We propose a set of possible replacement problems.
Acknowledgments
Thanks to Ricardo Segurado and the University College Dublin CSTAR statistics consultancy; thanks to Marilyn McGee-Lennon in the School of Computing Science at the University of Glasgow for her advice on survey design, and to the School itself for providing the supporting web service. Thanks to all those who participated in the GP survey and have engaged in discussion through the GP mailing list, the benchmark mailing list, and the GECCO 2012 debate. Thanks to the anonymous reviewers of this paper. David R. White is funded by the Scottish Informatics and Computer Science Alliance. James McDermott is funded by the Irish Research Council. Gabriel Kronberger is supported by the Austrian Research Promotion Agency, Josef Ressel-centre “Heureka!”. Wojciech Jaśkowski is supported by the Polish Ministry of Science and Education, grant no. 91-531/DS.
About this article
Cite this article
White, D.R., McDermott, J., Castelli, M. et al. Better GP benchmarks: community survey results and proposals. Genet Program Evolvable Mach 14, 3–29 (2013). https://doi.org/10.1007/s10710-012-9177-2