Single-Player Monte-Carlo Tree Search

  • Maarten P. D. Schadd
  • Mark H. M. Winands
  • H. Jaap van den Herik
  • Guillaume M. J. -B. Chaslot
  • Jos W. H. M. Uiterwijk
Part of the Lecture Notes in Computer Science book series (LNCS, volume 5131)

Abstract

Classical methods such as A* and IDA* are a popular and successful choice for one-player games. However, they fail without an accurate admissible evaluation function. In this paper we investigate whether Monte-Carlo Tree Search (MCTS) is an interesting alternative for one-player games where A* and IDA* methods do not perform well. Therefore, we propose a new MCTS variant, called Single-Player Monte-Carlo Tree Search (SP-MCTS). The selection and backpropagation strategy in SP-MCTS are different from standard MCTS. Moreover, SP-MCTS makes use of a straightforward Meta-Search extension. We tested the method on the puzzle SameGame. It turned out that our SP-MCTS program gained the highest score so far on the standardized test set.

Preview

Unable to display preview. Download preview PDF.

Unable to display preview. Download preview PDF.

References

  1. 1.
    Biedl, T.C., Demaine, E.D., Demaine, M.L., Fleischer, R., Jacobsen, L., Munro, I.: The Complexity of Clickomania. In: Nowakowski, R.J. (ed.) More Games of No Chance, Proc. MSRI Workshop on Combinatorial Games, pp. 389–404. MSRI Publ., Berkeley. Cambridge University Press, Cambridge (2002)Google Scholar
  2. 2.
    Billings, D.: Personal Communication. University of Alberta, Canada (2007)Google Scholar
  3. 3.
    Bouzy, B., Cazanave, T.: Computer Go: An AI-Oriented Survey. Artificial Intelligence 132(1), 39–103 (2001)MATHCrossRefMathSciNetGoogle Scholar
  4. 4.
    Bouzy, B., Helmstetter, B.: Monte-Carlo Go Developments. In: van den Herik, H.J., Iida, H., Heinz, E.A. (eds.) Proceedings of the 10th Advances in Computer Games Conference (ACG-10), The Netherlands, pp. 159–174. Kluwer Academic, Dordrecht (2003)Google Scholar
  5. 5.
    Brügmann, B.: Monte Carlo Go. Technical report, Physics Department, Syracuse University (1993)Google Scholar
  6. 6.
    Cazenave, T., Borsboom, J.: Golois Wins Phantom Go Tournament. ICGA Journal 30(3), 165–166 (2007)Google Scholar
  7. 7.
    Chaslot, G.M.J.-B., Winands, M.H.M., Uiterwijk, J.W.H.M., van den Herik, H.J., Bouzy, B.: Progressive strategies for Monte-Carlo Tree Search. New Mathematics and Natural Computation 4(3), 343–357 (2008)CrossRefMathSciNetGoogle Scholar
  8. 8.
    Chaslot, G.M.J.B., de Jong, S., Saito, J.-T., Uiterwijk, J.W.H.M.: Monte-Carlo Tree Search in Production Management Problems. In: Schobbens, P.Y., Vanhoof, W., Schwanen, G. (eds.) Proceedings of the 18th BeNeLux Conference on Artificial Intelligence, Namur, Belgium, pp. 91–98 (2006)Google Scholar
  9. 9.
    Chaslot, G.M.J.B., Saito, J.-T., Bouzy, B., Uiterwijk, J.W.H.M., van den Herik, H.J.: Monte-Carlo Strategies for Computer Go. In: Schobbens, P.Y., Vanhoof, W., Schwanen, G. (eds.) Proceedings of the 18th BeNeLux Conference on Artificial Intelligence, Namur, Belgium, pp. 83–91 (2006)Google Scholar
  10. 10.
    Coulom, R.: Efficient selectivity and backup operators in Monte-Carlo tree search. In: van den Herik, H.J., Ciancarini, P., Donkers, H.H.L.M(J.) (eds.) CG 2006. LNCS, vol. 4630, pp. 72–83. Springer, Heidelberg (2007)CrossRefGoogle Scholar
  11. 11.
    Culberson, J.C., Schaeffer, J.: Pattern databases. Computational Intelligence 14(3), 318–334 (1998)CrossRefMathSciNetGoogle Scholar
  12. 12.
    Felner, A., Zahavi, U., Schaeffer, J., Holte, R.C.: Dual Lookups in Pattern Databases. In: IJCAI, Edinburgh, Scotland, pp. 103–108 (2005)Google Scholar
  13. 13.
    Gomes, C.P., Selman, B., McAloon, K., Tretkoff, C.: Randomization in Backtrack Search: Exploiting Heavy-Tailed Profiles for Solving Hard Scheduling Problems. In: AIPS, Pittsburg, PA, pp. 208–213 (1998)Google Scholar
  14. 14.
    Hart, P.E., Nilsson, N.J., Raphael, B.: A formal basis for the heuristic determination of minimum cost paths. IEEE Transactions on Systems Science and Cybernatics 4(2), 100–107 (1968)CrossRefGoogle Scholar
  15. 15.
    Junghanns, A.: Pushing the Limits: New Developments in Single Agent Search. PhD thesis, University of Alberta, Alberta, Canada (1999)Google Scholar
  16. 16.
    Kendall, G., Parkes, A., Spoerer, K.: A Survey of NP-Complete Puzzles. ICGA Journal 31(1), 13–34 (2008)Google Scholar
  17. 17.
    Kocsis, L., Szepesvári, C.: Bandit based Monte-Carlo Planning. In: Fürnkranz, J., Scheffer, T., Spiliopoulou, M. (eds.) ECML 2006. LNCS (LNAI), vol. 4212, pp. 282–293. Springer, Heidelberg (2006)Google Scholar
  18. 18.
    Kocsis, L., Szepesvári, C., Willemson, J.: Improved Monte-Carlo Search (2006), http://zaphod.aml.sztaki.hu/papers/cg06-ext.pdf
  19. 19.
    Korf, R.E.: Depth-first iterative deepening: An optimal admissable tree search. Artificial Intelligence 27(1), 97–109 (1985)MATHCrossRefMathSciNetGoogle Scholar
  20. 20.
    Moribe, K.: Chain shot! Gekkan ASCII, (November 1985) (in Japanese)Google Scholar
  21. 21.
    PDA Game Guide.com. Pocket PC Jawbreaker Game. The Ultimate Guide to PDA Games, Retrieved 7.1.2008 (2008), http://www.pdagameguide.com/jawbreaker-game.html
  22. 22.
    Sadikov, A., Bratko, I.: Solving 20 × 20 Puzzles. In: van den Herik, H.J., Uiterwijk, J.W.H.M., Winands, M.H.M., Schadd, M.P.D. (eds.) Proceedings of the Computer Games Workshop 2007 (CGW 2007), The Netherlands, pp. 157–164. Universiteit Maastricht, Maastricht (2007)Google Scholar
  23. 23.
    Tesauro, G., Galperin, G.R.: On-line policy improvement using Monte Carlo search. In: Mozer, M.C., Jordan, M.I., Petsche, T. (eds.) Advances in Neural Information Processing Systems, vol. 9, pp. 1068–1074. MIT Press, Cambridge (1997)Google Scholar
  24. 24.
    University of Alberta GAMES Group. GAMES Group News (Archives) (2002), http://www.cs.ualberta.ca/~games/archives.html
  25. 25.
    Vempaty, N.R., Kumar, V., Korf, R.E.: Depth-first versus best-first search. In: AAAI, Anaheim, California, USA, pp. 434–440. MIT Press, Cambridge (1991)Google Scholar

Copyright information

© Springer-Verlag Berlin Heidelberg 2008

Authors and Affiliations

  • Maarten P. D. Schadd
    • 1
  • Mark H. M. Winands
    • 1
  • H. Jaap van den Herik
    • 1
  • Guillaume M. J. -B. Chaslot
    • 1
  • Jos W. H. M. Uiterwijk
    • 1
  1. 1.Games and AI Group, MICC, Faculty of Humanities and SciencesUniversiteit MaastrichtMaastrichtThe Netherlands

Personalised recommendations