Monte-Carlo Tree Search Solver
Abstract
Recently, Monte-Carlo Tree Search (MCTS) has advanced the field of computer Go substantially. In this article we investigate the application of MCTS to the game Lines of Action (LOA). A new MCTS variant, called MCTS-Solver, has been designed to play narrow tactical lines better in sudden-death games such as LOA. The variant differs from traditional MCTS with respect to its backpropagation and selection strategies. It is able to prove the game-theoretical value of a position given sufficient time. Experiments show that a Monte-Carlo LOA program using MCTS-Solver defeats a program using MCTS with a winning score of 65%. Moreover, MCTS-Solver performs much better than a program using MCTS against several different versions of the world-class αβ program MIA. Thus, MCTS-Solver constitutes genuine progress in using simulation-based search approaches in sudden-death games, significantly improving upon MCTS-based programs.
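The abstract states only that MCTS-Solver changes backpropagation and selection so that game-theoretical values can be proven. The following is a minimal sketch of how proven values could be propagated up a search tree, assuming a negamax-style convention in which every value is stored from the viewpoint of the player to move at that node. The names `Node`, `backpropagate`, `fully_expanded`, `WIN`, and `LOSS` are hypothetical and chosen for illustration; this is not the authors' implementation.

```python
import math

# Proven values, from the viewpoint of the player to move at a node (assumption).
WIN, LOSS = math.inf, -math.inf


class Node:
    """Hypothetical tree node; field names are illustrative only."""

    def __init__(self, parent=None):
        self.parent = parent
        self.children = []
        self.fully_expanded = False  # True once every legal move has a child
        self.proven = None           # WIN, LOSS, or None if not proven
        self.visits = 0
        self.total = 0.0             # sum of results for the player to move

    def add_child(self):
        child = Node(parent=self)
        self.children.append(child)
        return child


def backpropagate(node, result):
    """Pass a result in {-1, 0, +1} (from node's viewpoint) up to the root.

    Along the way, mark a node as proven when its children settle its value.
    """
    while node is not None:
        node.visits += 1
        node.total += result
        if node.proven is None and node.children:
            if any(c.proven == LOSS for c in node.children):
                # One move leads to a position that is lost for the opponent.
                node.proven = WIN
            elif node.fully_expanded and all(c.proven == WIN for c in node.children):
                # Every move leads to a position that is won for the opponent.
                node.proven = LOSS
        result = -result  # switch viewpoint when moving to the parent
        node = node.parent
```

As a usage example under the same assumptions: if a root has two children and both are marked `WIN` for the opponent, calling `backpropagate` on each child in turn leaves the root unproven after the first call and marks it `LOSS` after the second. A single child marked `LOSS` is enough to mark its parent `WIN`, even if the parent is not fully expanded.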
Keywords
Simulation Strategy, Good Move, Simulated Game, Move Category, Visit Count