Abstract
In this paper we demonstrate how genetic algorithms can be used to reverse engineer an evaluation function’s parameters for computer chess. Our results show that using an appropriate expert (or mentor), we can evolve a program that is on par with top tournament-playing chess programs, outperforming a two-time World Computer Chess Champion. This performance gain is achieved by evolving a program that mimics the behavior of a superior expert. The resulting evaluation function of the evolved program consists of a much smaller number of parameters than the expert’s. The extended experimental results provided in this paper include a report on our successful participation in the 2008 World Computer Chess Championship. In principle, our expert-driven approach could be used in a wide range of problems for which appropriate experts are available.
Notes
An evaluation unit in chess programs is commonly called a centipawn, i.e., 1/100th of the value of a pawn. Traditionally, a pawn is assigned a value of 100, and all other parameters are assigned relative values. However, the value of a pawn itself need not be exactly 100, so a unit of evaluation may no longer be exactly 1/100th of a pawn. Despite this inconsistency, the term centipawn is still used to denote the smallest evaluation unit.
Note that Evol* and RandOrg (including the sets of parameters of their evaluation function) are essentially the same, except for the actual values assigned to these parameters.
Our genetically evolved program participated under the name Falcon, which is the original name we had used in previous championships. Even though a name reflecting evolution (such as FalconGA) might have been more appropriate, it is customary that the participants use the same program name every year, even when using a substantially different version.
References
S.G. Akl, M.M. Newborn, The principal continuation and the killer heuristic. in Proceedings of the 5th Annual ACM Computer Science Conference (ACM Press, Seattle, WA, 1977), pp. 466–473
P. Aksenov, Genetic Algorithms for Optimising Chess Position Scoring. Master’s Thesis, University of Joensuu, Finland (2004)
T.S. Anantharaman, Extension heuristics. ICCA J. 14(2), 47–65 (1991)
J. Baxter, A. Tridgell, L. Weaver, Learning to play chess using temporal differences. Mach. Learn. 40(3), 243–263 (2000)
D.F. Beal, Experiments with the null move. in Advances in Computer Chess 5, ed. by D.F. Beal (Elsevier Science, Amsterdam, 1989), pp. 65–79
D.F. Beal, M.C. Smith, Quantification of search extension benefits. ICCA J. 18(4), 205–218 (1995)
Y. Björnsson, T.A. Marsland, Multi-cut pruning in alpha-beta search. in Proceedings of the First International Conference on Computers and Games, Tsukuba, Japan (1998), pp. 15–24
Y. Björnsson, T.A. Marsland, Multi-cut alpha-beta-pruning in game-tree search. Theor. Comput. Sci. 252(1–2), 177–196 (2001)
M. Block, M. Bader, E. Tapia, M. Ramirez, K. Gunnarsson, E. Cuevas, D. Zaldivar, R. Rojas, Using reinforcement learning in chess engines, Res. Comput. Sci. 35, 31–40 (2008)
M.S. Campbell, T.A. Marsland, A comparison of minimax tree search algorithms. Artif. Intell. 20(4), 347–367 (1983)
S. Chinchalkar, An upper bound for the number of reachable positions. ICCA J. 19(3), 181–183 (1996)
O. David-Tabibi, A. Felner, N.S. Netanyahu, Blockage detection in pawn endings. in Proceedings of the 2004 International Conference on Computers and Games, eds. by H.J. van den Herik, Y. Björnsson, N.S. Netanyahu (Springer (LNCS 3846), Ramat-Gan, Israel, 2006), pp. 187–201
O. David-Tabibi, M. Koppel, N.S. Netanyahu, Genetic algorithms for mentor-assisted evaluation function optimization. in Proceedings of the Genetic and Evolutionary Computation Conference (Atlanta, GA, 2008), pp. 1469–1476
O. David-Tabibi, N.S. Netanyahu, Extended null-move reductions. in Proceedings of the 2008 International Conference on Computers and Games, eds. by H.J. van den Herik, X. Xu, Z. Ma, M.H.M. Winands (Springer (LNCS 5131), Beijing, China, 2008), pp. 205–216
C. Donninger, Null move and deep search: Selective search heuristics for obtuse chess programs. ICCA J. 16(3), 137–143 (1993)
J.J. Gillogly, The technology chess program. Artif. Intell. 3(1–3), 145–163 (1972)
R. Gross, K. Albrecht, W. Kantschik, W. Banzhaf, Evolving chess playing programs. in Proceedings of the Genetic and Evolutionary Computation Conference (New York, NY, 2002), pp. 740–747
A. Hauptman, M. Sipper, Using genetic programming to evolve chess endgame players. in Proceedings of the 2005 European Conference on Genetic Programming (Springer, Lausanne, Switzerland, 2005), pp. 120–131
A. Hauptman, M. Sipper, Evolution of an efficient search algorithm for the Mate-in-N problem in chess. in Proceedings of the 2007 European Conference on Genetic Programming (Springer, Valencia, Spain, 2007), pp. 78–89
E.A. Heinz, Extended futility pruning. ICCA J. 21(2), 75–83 (1998)
R.M. Hyatt, A.E. Gower, H.L. Nelson, Cray Blitz. in Computers, Chess, and Cognition, eds. T.A. Marsland, J. Schaeffer (Springer, New York, 1990), pp. 227–237
G. Kendall, G. Whitwell, An evolutionary approach for the tuning of a chess evaluation function using population dynamics. in Proceedings of the 2001 Congress on Evolutionary Computation. (IEEE Press, World Trade Center, Seoul, Korea, 2001), pp. 995–1002
J. McCarthy, Chess as the Drosophila of AI. in Computers, Chess, and Cognition, eds. T.A. Marsland, J. Schaeffer (Springer, New York, 1990), pp. 227–237
H.L. Nelson, Hash tables in Cray Blitz. ICCA J. 8(1), 3–13 (1985)
A. Reinefeld, An improvement to the Scout tree-search algorithm. ICCA J. 6(4), 4–14 (1983)
J. Schaeffer, The history heuristic. ICCA J. 6(3), 16–19 (1983)
J. Schaeffer, The history heuristic and alpha-beta search enhancements in practice. IEEE Trans. Pattern Anal. Mach. Intell. 11(11), 1203–1212 (1989)
J. Schaeffer, M. Hlynka, V. Jussila, Temporal difference learning applied to a high-performance game-playing program. in Proceedings of the 2001 International Joint Conference on Artificial Intelligence (Seattle, WA, 2001), pp. 529–534
J.J. Scott, A chess-playing program. in Machine Intelligence 4, eds. B. Meltzer, D. Michie (Edinburgh University Press, Edinburgh, 1969), pp. 255–265
D.J. Slate, L.R. Atkin, Chess 4.5—The Northwestern University chess program. Chess skill in man and machine, ed. by P.W. Frey (Springer, New York, 2nd ed, 1983), pp. 82–118
R.S. Sutton, A.G. Barto, Reinforcement Learning: An Introduction (MIT Press, Cambridge, MA, 1998)
G. Tesauro, Practical issues in temporal difference learning. Mach. Learn. 8(3–4), 257–277 (1992)
W. Tunstall-Pedoe, Genetic algorithms optimising evaluation functions. ICCA J. 14(3), 119–128 (1991)
M.A. Wiering, TD Learning of Game Evaluation Functions with Hierarchical Neural Architectures. Master’s Thesis, University of Amsterdam (1995)
Additional information
A preliminary version of this paper appeared in Proceedings of the 2008 Genetic and Evolutionary Computation Conference [13] and received the Best Paper Award in the conference’s Real-World Applications track.
Appendix
A. Experimental setup
Our experimental setup consisted of the following resources:

The Falcon chess engine running under the UCI protocol, and Crafty 19, Junior 9, Fritz 8, and Hiarcs 8 running as native ChessBase engines.

Encyclopedia of Chess Middlegames (ECM) test suite, consisting of 879 positions.

The Fritz 8 interface for running matches automatically. The Fritz opening book was used for all games.

AMD Athlon 64 3200+ with 1 GB RAM and Windows XP operating system.
B. Elo rating system
The Elo rating system, developed by Arpad Elo, is the official system for calculating the relative skill levels of players in chess. The following statistics from the January 2009 FIDE rating list provide a general impression of the meaning of the Elo rating system:

21079 players have a rating above 2200 Elo.

2886 players have a rating between 2400 and 2499, most of whom have either the title of International Master (IM) or Grandmaster (GM).

876 players have a rating between 2500 and 2599, most of whom have the title of GM.

188 players have a rating between 2600 and 2699, all of whom have the title of GM.

32 players have a rating above 2700.
Only four players have ever had a rating of 2800 or above. A novice player is generally associated with rating values below 1400 Elo. Given the rating difference (RD) between player A and player B, the expected winning rate w (0 ≤ w ≤ 1) of player A is given by
\[ w = \frac{1}{10^{-RD/400} + 1}. \tag{1} \]
Given the winning rate of player A against player B (as is the case in our experiments), the expected rating difference between the two players can be derived from the above formula, i.e.,
\[ RD = -400 \log_{10}\left(\frac{1}{w} - 1\right). \tag{2} \]
In addition, given the results of a series of N matches between two players, we can derive confidence intervals for their rating difference. Without loss of generality, let W, D, and L denote, respectively, the number of wins, draws, and losses of the first player. The mean score and standard deviation are given, respectively, by
\[ \overline{x} = \frac{W + D/2}{N} \]
and
\[ s = \sqrt{\frac{W(1 - \overline{x})^2 + D\left(\frac{1}{2} - \overline{x}\right)^2 + L(0 - \overline{x})^2}{N(N-1)}}. \]
Note that \(\overline{x}\) is essentially an estimate of the expected winning rate. Now, suppose that we are interested in computing, for example, the 95% confidence interval (which corresponds to ± two standard deviations) of the rating difference. For this we compute the lower and upper ends of the winning rate, i.e., \(w_{lo} = \overline{x} - 2s\) and \(w_{hi} = \overline{x} + 2s\). Substituting \(w_{lo}\) and \(w_{hi}\) in Eq. 2 we obtain the corresponding lower and upper ends of the 95% confidence interval of the rating difference. For any other confidence level, the corresponding RD confidence interval can be computed by following the same steps.
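The procedure described in this appendix can be sketched in a few lines of Python. This is a minimal illustration, not code from the paper: the function names (`expected_score`, `rating_difference`, `rd_confidence_interval`) are hypothetical, the logistic form of the Elo relation is the standard one, and the standard deviation is assumed to be that of the mean score (i.e., the sample variance of the per-game scores divided by N), which is what a ± 2s interval around \(\overline{x}\) requires.

```python
import math


def expected_score(rd: float) -> float:
    """Expected winning rate of player A given the rating difference RD (A minus B)."""
    return 1.0 / (10.0 ** (-rd / 400.0) + 1.0)


def rating_difference(w: float) -> float:
    """Rating difference implied by a winning rate w, with 0 < w < 1."""
    if not 0.0 < w < 1.0:
        raise ValueError("winning rate must lie strictly between 0 and 1")
    return -400.0 * math.log10(1.0 / w - 1.0)


def rd_confidence_interval(wins: int, draws: int, losses: int, k: float = 2.0):
    """Return (lower, point estimate, upper) of the rating difference.

    k is the number of standard deviations; k = 2 corresponds roughly
    to a 95% confidence level.
    """
    n = wins + draws + losses
    if n < 2:
        raise ValueError("at least two games are needed")
    # Mean score: a win counts 1, a draw 1/2, a loss 0.
    mean = (wins + 0.5 * draws) / n
    # Sum of squared deviations of the per-game scores from the mean.
    ss = (wins * (1.0 - mean) ** 2
          + draws * (0.5 - mean) ** 2
          + losses * (0.0 - mean) ** 2)
    # Assumption: s is the standard deviation of the mean score.
    s = math.sqrt(ss / (n * (n - 1)))
    w_lo, w_hi = mean - k * s, mean + k * s
    return rating_difference(w_lo), rating_difference(mean), rating_difference(w_hi)


if __name__ == "__main__":
    lo, mid, hi = rd_confidence_interval(wins=40, draws=40, losses=20)
    print(f"RD estimate {mid:.1f}, interval [{lo:.1f}, {hi:.1f}]")
```

For a hypothetical 100-game match with 40 wins, 40 draws, and 20 losses, the mean score is 0.6, which corresponds to a rating difference of roughly 70 Elo points. Note that `rating_difference` raises an error when an interval end falls outside (0, 1), which can happen for very short or very one-sided matches.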
Cite this article
David-Tabibi, O., Koppel, M. & Netanyahu, N.S. Expert-driven genetic algorithms for simulating evaluation functions. Genet Program Evolvable Mach 12, 5–22 (2011). https://doi.org/10.1007/s10710-010-9103-4
DOI: https://doi.org/10.1007/s10710-010-9103-4
Keywords
 Computer chess
 Fitness evaluation
 Games
 Genetic algorithms
 Parameter tuning