Investigating the Limits of Monte-Carlo Tree Search Methods in Computer Go
Monte-Carlo Tree Search methods have led to huge progress in computer Go. Still, program performance is uneven: most current Go programs are much stronger in some aspects of the game, such as local fighting and positional evaluation, than in others. Well-known weaknesses of many programs include (1) the handling of several simultaneous fights, including the two-safe-groups problem, and (2) dealing with coexistence in seki.
After a brief review of MCTS techniques, three research questions regarding the behavior of MCTS-based Go programs in specific types of Go situations are formulated. An extensive empirical study of ten leading Go programs then investigates their performance on two specifically designed test sets containing two-safe-groups and seki situations.
The results give a good indication of the state of the art in computer Go as of 2012/2013. They show that while a few of the very top programs can apparently already solve most of these evaluation problems in their playouts, the problems remain difficult to solve by global search.
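As background for readers unfamiliar with MCTS, the UCT selection rule (Kocsis and Szepesvári) that underlies many of the programs studied here can be sketched as follows. This is a minimal illustration, not the implementation of any particular program; the function name, the `(wins, visits)` statistics format, and the exploration constant `c` are all assumptions made for the example.

```python
import math

def uct_select(children, c=1.4):
    """Pick the index of the child maximizing the UCB1 value.

    `children` is a list of (wins, visits) pairs for the candidate
    moves at a node; `c` is an illustrative exploration constant
    (real programs tune this value, or replace the rule entirely).
    """
    total_visits = sum(v for _, v in children)

    def ucb(child):
        wins, visits = child
        if visits == 0:
            return float("inf")  # always try unvisited moves first
        # Exploitation term (win rate) plus exploration bonus.
        return wins / visits + c * math.sqrt(math.log(total_visits) / visits)

    return max(range(len(children)), key=lambda i: ucb(children[i]))

# Example: three candidate moves with (wins, visits) statistics.
stats = [(6, 10), (3, 4), (0, 0)]
print(uct_select(stats))  # the unvisited move (index 2) is selected first
```

The exploration term grows for rarely visited moves and shrinks as a move is sampled more often, which is what lets the search balance deep investigation of promising lines against coverage of the whole move set.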
Keywords: Tree Search · Regression Test · Global Search · Test Scenario · Game Tree
This project would not have been possible without the support of all the Go programs' authors. Many of them supported us by implementing extra GTP commands in their programs, and by helping to debug the test set through testing early versions with their programs.
Financial support was provided by NSERC, including a Discovery Accelerator Supplement grant for Müller which partially supported Huang’s stay.