Expert-driven genetic algorithms for simulating evaluation functions

Abstract

In this paper we demonstrate how genetic algorithms can be used to reverse engineer an evaluation function’s parameters for computer chess. Our results show that using an appropriate expert (or mentor), we can evolve a program that is on par with top tournament-playing chess programs, outperforming a two-time World Computer Chess Champion. This performance gain is achieved by evolving a program that mimics the behavior of a superior expert. The resulting evaluation function of the evolved program consists of a much smaller number of parameters than the expert’s. The extended experimental results provided in this paper include a report on our successful participation in the 2008 World Computer Chess Championship. In principle, our expert-driven approach could be used in a wide range of problems for which appropriate experts are available.
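The idea of evolving parameters so that a program mimics an expert can be illustrated with a deliberately small sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the linear form of the evaluation function, the feature vectors standing in for chess positions, the fitness measure (mean absolute gap between candidate and expert scores), the GA operators, and all names such as `evaluate`, `fitness`, and `evolve`.

```python
import random

def evaluate(weights, features):
    """Linear evaluation: weighted sum of position features (an assumed form)."""
    return sum(w * f for w, f in zip(weights, features))

def fitness(weights, positions, expert_scores):
    """Mean absolute gap between candidate and expert scores; lower is better."""
    gaps = (abs(evaluate(weights, p) - s) for p, s in zip(positions, expert_scores))
    return sum(gaps) / len(positions)

def evolve(positions, expert_scores, n_params, pop_size=40, generations=60,
           mutation_rate=0.1, seed=0):
    """Toy GA: truncation selection, one-point crossover, Gaussian mutation, elitism."""
    rng = random.Random(seed)

    def score(candidate):
        return fitness(candidate, positions, expert_scores)

    population = [[rng.uniform(0.0, 10.0) for _ in range(n_params)]
                  for _ in range(pop_size)]
    history = []  # best fitness at the start of each generation
    for _ in range(generations):
        population.sort(key=score)
        history.append(score(population[0]))
        parents = population[: pop_size // 4]   # keep the best quarter as parents
        children = [parents[0][:]]              # elitism: best candidate survives unchanged
        while len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, n_params)    # one-point crossover (needs n_params >= 2)
            child = a[:cut] + b[cut:]
            child = [w + rng.gauss(0.0, 0.5) if rng.random() < mutation_rate else w
                     for w in child]
            children.append(child)
        population = children
    best = min(population, key=score)
    history.append(score(best))
    return best, history

def demo():
    """Evolve weights against a hypothetical expert that is only queried for scores."""
    rng = random.Random(1)
    hidden = [1.0, 3.0, 3.2, 5.0, 9.0]  # hypothetical expert weights, never shown to the GA
    positions = [[rng.randint(-2, 2) for _ in hidden] for _ in range(200)]
    expert_scores = [evaluate(hidden, p) for p in positions]
    return evolve(positions, expert_scores, n_params=len(hidden))

if __name__ == "__main__":
    best, history = demo()
    print("initial gap:", round(history[0], 3), "final gap:", round(history[-1], 3))
```

Because the best candidate is carried over unchanged, the recorded gap to the expert can never increase from one generation to the next. The expert is treated as a black box that only returns scores, which is the sense in which its parameters are "reverse engineered" in this sketch.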

Notes

  1. An evaluation unit in chess programs is commonly called a centipawn, i.e., 1/100th of the value of a pawn. Traditionally, a pawn is assigned a value of 100, and all other parameters are assigned relative values. However, the value of a pawn itself need not be exactly 100, so a unit of evaluation may no longer be exactly 1/100th of a pawn. Despite this inconsistency, the term centipawn is still used to denote the smallest evaluation unit.

  2. Note that Evol* and RandOrg (including the sets of parameters of their evaluation function) are essentially the same, except for the actual values assigned to these parameters.

  3. Our genetically evolved program participated under the name Falcon, which is the name we had used in previous championships. Even though a name reflecting evolution (such as FalconGA) might have been more appropriate, it is customary for participants to use the same program name every year, even when entering a substantially different version.

References

  1. S.G. Akl, M.M. Newborn, The principal continuation and the killer heuristic. in Proceedings of the 5th Annual ACM Computer Science Conference (ACM Press, Seattle, WA, 1977), pp. 466–473

  2. P. Aksenov, Genetic Algorithms for Optimising Chess Position Scoring. Master’s Thesis, University of Joensuu, Finland (2004)

  3. T.S. Anantharaman, Extension heuristics. ICCA J. 14(2), 47–65 (1991)

  4. J. Baxter, A. Tridgell, L. Weaver, Learning to play chess using temporal-differences. Mach. Learn. 40(3), 243–263 (2000)

  5. D.F. Beal, Experiments with the null move. in Advances in Computer Chess 5, ed. by D.F. Beal (Elsevier Science, Amsterdam, 1989), pp. 65–79

  6. D.F. Beal, M.C. Smith, Quantification of search extension benefits. ICCA J. 18(4), 205–218 (1995)

  7. Y. Björnsson, T.A. Marsland, Multi-cut pruning in alpha-beta search. in Proceedings of the First International Conference on Computers and Games, Tsukuba, Japan (1998), pp. 15–24

  8. Y. Björnsson, T.A. Marsland, Multi-cut alpha-beta-pruning in game-tree search. Theor. Comput. Sci. 252(1–2), 177–196 (2001)

  9. M. Block, M. Bader, E. Tapia, M. Ramirez, K. Gunnarsson, E. Cuevas, D. Zaldivar, R. Rojas, Using reinforcement learning in chess engines. Res. Comput. Sci. 35, 31–40 (2008)

  10. M.S. Campbell, T.A. Marsland, A comparison of minimax tree search algorithms. Artif. Intell. 20(4), 347–367 (1983)

  11. S. Chinchalkar, An upper bound for the number of reachable positions. ICCA J. 19(3), 181–183 (1996)

  12. O. David-Tabibi, A. Felner, N.S. Netanyahu, Blockage detection in pawn endings. in Proceedings of the 2004 International Conference on Computers and Games, ed. by H.J. van den Herik, Y. Björnsson, N.S. Netanyahu (Springer (LNCS 3846), Ramat-Gan, Israel, 2006), pp. 187–201

  13. O. David-Tabibi, M. Koppel, N.S. Netanyahu, Genetic algorithms for mentor-assisted evaluation function optimization. in Proceedings of the Genetic and Evolutionary Computation Conference (Atlanta, GA, 2008), pp. 1469–1476

  14. O. David-Tabibi, N.S. Netanyahu, Extended null-move reductions. in Proceedings of the 2008 International Conference on Computers and Games, ed. by H.J. van den Herik, X. Xu, Z. Ma, M.H.M. Winands (Springer (LNCS 5131), Beijing, China, 2008), pp. 205–216

  15. C. Donninger, Null move and deep search: Selective search heuristics for obtuse chess programs. ICCA J. 16(3), 137–143 (1993)

  16. J.J. Gillogly, The technology chess program. Artif. Intell. 3(1–3), 145–163 (1972)

  17. R. Gross, K. Albrecht, W. Kantschik, W. Banzhaf, Evolving chess playing programs. in Proceedings of the Genetic and Evolutionary Computation Conference (New York, NY, 2002), pp. 740–747

  18. A. Hauptman, M. Sipper, Using genetic programming to evolve chess endgame players. in Proceedings of the 2005 European Conference on Genetic Programming (Springer, Lausanne, Switzerland, 2005), pp. 120–131

  19. A. Hauptman, M. Sipper, Evolution of an efficient search algorithm for the Mate-in-N problem in chess. in Proceedings of the 2007 European Conference on Genetic Programming (Springer, Valencia, Spain, 2007), pp. 78–89

  20. E.A. Heinz, Extended futility pruning. ICCA J. 21(2), 75–83 (1998)

  21. R.M. Hyatt, A.E. Gower, H.L. Nelson, Cray Blitz. in Computers, Chess, and Cognition, ed. by T.A. Marsland, J. Schaeffer (Springer, New York, 1990), pp. 227–237

  22. G. Kendall, G. Whitwell, An evolutionary approach for the tuning of a chess evaluation function using population dynamics. in Proceedings of the 2001 Congress on Evolutionary Computation (IEEE Press, World Trade Center, Seoul, Korea, 2001), pp. 995–1002

  23. J. McCarthy, Chess as the Drosophila of AI. in Computers, Chess, and Cognition, ed. by T.A. Marsland, J. Schaeffer (Springer, New York, 1990), pp. 227–237

  24. H.L. Nelson, Hash tables in Cray Blitz. ICCA J. 8(1), 3–13 (1985)

  25. A. Reinefeld, An improvement to the Scout tree-search algorithm. ICCA J. 6(4), 4–14 (1983)

  26. J. Schaeffer, The history heuristic. ICCA J. 6(3), 16–19 (1983)

  27. J. Schaeffer, The history heuristic and alpha-beta search enhancements in practice. IEEE Trans. Pattern. Anal. Mach. Intell. 11(11), 1203–1212 (1989)

  28. J. Schaeffer, M. Hlynka, V. Jussila, Temporal difference learning applied to a high-performance game-playing program. in Proceedings of the 2001 International Joint Conference on Artificial Intelligence (Seattle, WA, 2001), pp. 529–534

  29. J.J. Scott, A chess-playing program. in Machine Intelligence 4, ed. by B. Meltzer, D. Michie (Edinburgh University Press, Edinburgh, 1969), pp. 255–265

  30. D.J. Slate, L.R. Atkin, Chess 4.5—The Northwestern University chess program. in Chess Skill in Man and Machine, ed. by P.W. Frey (Springer, New York, 2nd ed, 1983), pp. 82–118

  31. R.S. Sutton, A.G. Barto, Reinforcement Learning: An Introduction (MIT Press, Cambridge, MA, 1998)

  32. G. Tesauro, Practical issues in temporal difference learning. Mach. Learn. 8(3–4), 257–277 (1992)

  33. W. Tunstall-Pedoe, Genetic algorithms optimising evaluation functions. ICCA J. 14(3), 119–128 (1991)

  34. M.A. Wiering, TD Learning of Game Evaluation Functions with Hierarchical Neural Architectures. Master’s Thesis, University of Amsterdam (1995)

Author information

Corresponding author

Correspondence to Omid David-Tabibi.

Additional information

A preliminary version of this paper appeared in Proceedings of the 2008 Genetic and Evolutionary Computation Conference [13] and received the Best Paper Award in the conference’s Real-World Applications track.

Appendix

A. Experimental setup

Our experimental setup consisted of the following resources:

  • Falcon chess engine running under the UCI protocol, and Crafty 19, Junior 9, Fritz 8, and Hiarcs 8 running as native ChessBase engines.

  • Encyclopedia of Chess Middlegames (ECM) test suite, consisting of 879 positions.

  • Fritz 8 interface for automatic running of matches. The Fritz opening book was used for all games.

  • AMD Athlon 64 3200+ with 1 GB RAM and the Windows XP operating system.

B. Elo rating system

The Elo rating system, developed by Arpad Elo, is the official system for calculating the relative skill levels of players in chess. The following statistics from the January 2009 FIDE rating list provide a general impression of the meaning of the Elo rating system:

  • 21079 players have a rating above 2200 Elo.

  • 2886 players have a rating between 2400 and 2499, most of whom have either the title of International Master (IM) or Grandmaster (GM).

  • 876 players have a rating between 2500 and 2599, most of whom have the title of GM.

  • 188 players have a rating between 2600 and 2699, all of whom have the title of GM.

  • 32 players have a rating above 2700.

Only four players have ever had a rating of 2800 or above. A novice player is generally associated with rating values below 1400 Elo. Given the rating difference (RD) between player A and player B, the expected winning rate w (0 ≤ w ≤ 1) of player A is given by

$$ w = {\frac{1} {10^{-RD/400} + 1}}. $$
(1)
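As a quick numerical check of Eq. 1, the following minimal sketch evaluates the expected winning rate for a few rating differences. The function name `expected_score` is hypothetical, and RD is read here as the rating of player A minus the rating of player B.

```python
def expected_score(rating_diff: float) -> float:
    """Expected winning rate w of player A for RD = rating(A) - rating(B), per Eq. 1."""
    return 1.0 / (10.0 ** (-rating_diff / 400.0) + 1.0)

if __name__ == "__main__":
    for rd in (-200, 0, 100, 200, 400):
        print(f"RD = {rd:+4d}  ->  w = {expected_score(rd):.3f}")
```

Equal ratings give w = 0.5, and a 400-point advantage gives w = 10/11, roughly 0.91. The formula is symmetric in the sense that the expected rates of the two players always sum to 1.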

Given the winning rate of player A against player B (as is the case in our experiments), the expected rating difference between the two players can be derived from the above formula, i.e.,

$$ RD = -400 \log_{10}({\frac{1} {w}} - 1). $$
(2)
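For completeness, the intermediate steps that take Eq. 1 to Eq. 2 can be written out as follows. This is a worked derivation only; it introduces no new quantities, and it requires 0 < w < 1 so that the logarithm is defined.

$$ \begin{aligned} w &= \frac{1}{10^{-RD/400} + 1} \\ 10^{-RD/400} &= \frac{1}{w} - 1 \\ -\frac{RD}{400} &= \log_{10}\left(\frac{1}{w} - 1\right) \\ RD &= -400 \log_{10}\left(\frac{1}{w} - 1\right) \end{aligned} $$

For example, a winning rate of w = 0.64 gives 1/w − 1 = 0.5625 and hence RD ≈ 100 Elo.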

In addition, given the results of a series of N matches between two players, we can derive confidence intervals for their rating difference. Without loss of generality, let W, D, and L denote, respectively, the number of wins, draws, and losses of the first player. The mean score and standard deviation are given, respectively, by

$$ \overline{x} = {\frac{W + D/2} {N}} $$
(3)

and

$$ s = \sqrt{{\frac{W \cdot (1 - \overline{x})^2 + D \cdot(0.5 - \overline{x})^2 + L \cdot \overline{x}^2} {N - 1}}}. $$
(4)

Note that \(\overline{x}\) is essentially an estimate of the expected winning rate. Now, suppose that we are interested in computing, for example, the 95% confidence interval of the rating difference, which corresponds to approximately ± two standard deviations of the mean score. Since s is the standard deviation of a single game's score, the standard deviation of the mean score over N games is \(s/\sqrt{N}\). We therefore compute the lower and upper ends of the winning rate, i.e., \(w_{lo} = \overline{x} - 2s/\sqrt{N}\) and \(w_{hi} = \overline{x} + 2s/\sqrt{N}\). Substituting \(w_{lo}\) and \(w_{hi}\) in Eq. 2, we obtain the corresponding lower and upper ends of the 95% confidence interval of the rating difference. The interval for any other confidence level can be computed in the same way.
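The steps above can be put together in a short sketch that goes from a match result (W, D, L) to a rating-difference interval. The function names are hypothetical, and the sketch assumes that the spread of the mean score is the standard error \(s/\sqrt{N}\), with a multiplier of 2 standing in for the 95% level.

```python
import math

def rating_diff(w: float) -> float:
    """Eq. 2: rating difference implied by a winning rate w (requires 0 < w < 1)."""
    if not 0.0 < w < 1.0:
        raise ValueError("winning rate must lie strictly between 0 and 1")
    return -400.0 * math.log10(1.0 / w - 1.0)

def match_statistics(wins: int, draws: int, losses: int):
    """Eqs. 3 and 4: mean score and sample standard deviation of per-game scores."""
    n = wins + draws + losses
    if n < 2:
        raise ValueError("at least two games are needed")
    mean = (wins + draws / 2.0) / n
    variance = (wins * (1.0 - mean) ** 2
                + draws * (0.5 - mean) ** 2
                + losses * mean ** 2) / (n - 1)
    return mean, math.sqrt(variance)

def rating_interval(wins: int, draws: int, losses: int, z: float = 2.0):
    """Return (low, estimate, high) rating differences for the first player."""
    n = wins + draws + losses
    mean, s = match_statistics(wins, draws, losses)
    half_width = z * s / math.sqrt(n)  # assumption: standard error of the mean score
    return (rating_diff(mean - half_width),
            rating_diff(mean),
            rating_diff(mean + half_width))

if __name__ == "__main__":
    low, estimate, high = rating_interval(wins=40, draws=40, losses=20)
    print(f"RD estimate {estimate:.1f} Elo, interval [{low:.1f}, {high:.1f}]")
```

The interval is not symmetric around the estimate, because Eq. 2 is nonlinear in w. For very lopsided or very short matches, the lower or upper winning rate can leave the open interval (0, 1); the sketch raises an error in that case rather than returning an undefined rating difference.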

About this article

Cite this article

David-Tabibi, O., Koppel, M. & Netanyahu, N.S. Expert-driven genetic algorithms for simulating evaluation functions. Genet Program Evolvable Mach 12, 5–22 (2011). https://doi.org/10.1007/s10710-010-9103-4

Keywords

  • Computer chess
  • Fitness evaluation
  • Games
  • Genetic algorithms
  • Parameter tuning