Monte-Carlo Tree Search Solver

  • Mark H. M. Winands
  • Yngvi Björnsson
  • Jahn-Takeshi Saito
Part of the Lecture Notes in Computer Science book series (LNCS, volume 5131)

Abstract

Recently, Monte-Carlo Tree Search (MCTS) has advanced the field of computer Go substantially. In this article we investigate the application of MCTS to the game Lines of Action (LOA). A new MCTS variant, called MCTS-Solver, has been designed to play narrow tactical lines better in sudden-death games such as LOA. The variant differs from traditional MCTS with respect to its backpropagation and selection strategies. Given sufficient time, it is able to prove the game-theoretical value of a position. Experiments show that a Monte-Carlo LOA program using MCTS-Solver defeats a program using MCTS with a winning score of 65%. Moreover, MCTS-Solver performs much better than a program using MCTS against several different versions of the world-class αβ program MIA. Thus, MCTS-Solver constitutes genuine progress in using simulation-based search approaches in sudden-death games, significantly improving upon MCTS-based programs.
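
The abstract does not spell out how the modified backpropagation and selection work, so the following is only a minimal sketch of one plausible reading: proven wins and losses are stored in the tree, propagated upward, and used to skip moves already known to lose. Everything concrete here is an assumption made for illustration, not the authors' implementation: the toy game (a Nim variant in which players remove 1 to 3 stones and taking the last stone wins) stands in for LOA, and the names `Node`, `rollout`, `backpropagate`, `iterate`, and `solve` are hypothetical.

```python
import math
import random

class Node:
    def __init__(self, stones, parent=None):
        self.stones = stones              # stones left; the player to move takes 1-3
        self.parent = parent
        self.children = []
        self.untried = [k for k in (1, 2, 3) if k <= stones]
        self.visits = 0
        self.wins = 0.0                   # reward for the player who moved INTO this node
        # +1 / -1: proven win / loss for the player TO MOVE here; None = unknown.
        # With no stones left the opponent took the last one, so this is a loss.
        self.proven = -1 if stones == 0 else None

def rollout(stones, rng):
    """Random playout; returns +1 if the player to move at the start wins, else -1."""
    moves = 0
    while stones > 0:
        stones -= rng.randint(1, min(3, stones))
        moves += 1
    return 1 if moves % 2 == 1 else -1    # odd move count: the starter moved last

def backpropagate(node, result):
    """Walk up the tree; `result` is always from the view of the player to move at `node`."""
    while node is not None:
        node.visits += 1
        node.wins += (1 - result) / 2     # the mover into `node` wins when `result` is -1
        parent = node.parent
        if parent is not None and parent.proven is None:
            if node.proven == -1:
                parent.proven = 1         # one move into a lost position proves a win
            elif not parent.untried and all(c.proven == 1 for c in parent.children):
                parent.proven = -1        # every move leads to a won position for the opponent
        result = -result
        if parent is not None and parent.proven is not None:
            result = parent.proven        # a proven value overrides the sampled one
        node = parent

def iterate(root, rng, c=1.4):
    """One selection / expansion / simulation / backpropagation cycle (root must be unproven)."""
    node = root
    while not node.untried:
        # Selection: UCT over children that are not yet proven.
        open_children = [ch for ch in node.children if ch.proven is None]
        node = max(open_children, key=lambda ch: ch.wins / ch.visits
                   + c * math.sqrt(math.log(node.visits) / ch.visits))
    child = Node(node.stones - node.untried.pop(), node)
    node.children.append(child)
    result = child.proven if child.proven is not None else rollout(child.stones, rng)
    backpropagate(child, result)

def solve(stones, max_iters=10_000, seed=0):
    """Run the search until the root value is proven or the iteration budget is spent."""
    rng = random.Random(seed)
    root = Node(stones)
    for _ in range(max_iters):
        if root.proven is not None:
            break
        iterate(root, rng)
    return root.proven
```

In this toy game a position is lost for the player to move exactly when the stone count is a multiple of four, so `solve(5)` proves a win and `solve(8)` proves a loss. The sketch keeps a plain tree without transpositions and ignores draws, both of which a real LOA program would have to handle.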

Keywords

Simulation Strategy; Good Move; Simulated Game; Move Category; Visit Count
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Copyright information

© Springer-Verlag Berlin Heidelberg 2008

Authors and Affiliations

  • Mark H. M. Winands (1)
  • Yngvi Björnsson (2)
  • Jahn-Takeshi Saito (1)
  1. Games and AI Group, MICC, Faculty of Humanities and Sciences, Universiteit Maastricht, Maastricht, The Netherlands
  2. School of Computer Science, Reykjavík University, Reykjavík, Iceland
