On the Scalability of Parallel UCT

  • Richard B. Segal
Part of the Lecture Notes in Computer Science book series (LNCS, volume 6515)

Abstract

The parallelization of MCTS across multiple machines has proven surprisingly difficult. The limitations of existing algorithms were evident in the 2009 Computer Olympiad, where Zen, using a single four-core machine, defeated both Fuego with ten eight-core machines and Mogo with twenty thirty-two-core machines. This paper investigates the limits of parallel MCTS in order to understand why distributed parallelism has proven so difficult and to pave the way towards future distributed algorithms with better scaling. We first analyze the single-threaded scaling of Fuego and find that there is an upper bound on the play-quality improvements that can come from additional search. We then analyze the scaling of an idealized N-core shared-memory machine to determine the maximum amount of parallelism supported by MCTS. We show that parallel speedup depends critically on how much time is given to each player. We use this relationship to predict parallel scaling for time scales beyond what can be empirically evaluated due to the immense computation required. Our results show that MCTS can scale nearly perfectly to at least 64 threads when combined with virtual loss, but without virtual loss scaling is limited to just eight threads. We also find that, for competition time controls, scaling to thousands of threads is impossible, not necessarily because MCTS fails to scale, but because high levels of parallelism begin to run into the upper performance bound of Fuego itself.
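The abstract does not spell out how virtual loss works, so the following is only a minimal sketch of one common formulation, not the implementation used in Fuego or in the paper. It assumes UCB1 selection with an exploration constant `c`, and the names `Node`, `select_child`, and `backup` are hypothetical. A thread that descends into a child provisionally counts that visit as a loss, which lowers the child's score and nudges other threads towards different branches until the real playout result is backed up.

```python
import math
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical tree node; field names are illustrative assumptions."""
    visits: int = 0           # completed playouts through this node
    wins: float = 0.0         # total reward from completed playouts
    virtual_losses: int = 0   # playouts in flight, provisionally counted as losses
    children: list = field(default_factory=list)


def ucb1(child, parent_n, c=1.0):
    """UCB1 score in which in-flight playouts count as visits with zero reward."""
    n = child.visits + child.virtual_losses
    if n == 0:
        return math.inf  # unvisited children are tried first
    mean = child.wins / n
    return mean + c * math.sqrt(math.log(parent_n) / n)


def select_child(parent, c=1.0):
    """Pick the best child and mark it with a virtual loss."""
    parent_n = max(1, sum(ch.visits + ch.virtual_losses for ch in parent.children))
    best = max(parent.children, key=lambda ch: ucb1(ch, parent_n, c))
    best.virtual_losses += 1  # discourages other threads from following
    return best


def backup(child, reward):
    """Replace the virtual loss with the real playout result."""
    child.virtual_losses -= 1
    child.visits += 1
    child.wins += reward


# Two simulated threads selecting before either playout has finished.
a = Node(visits=10, wins=6.2)
b = Node(visits=10, wins=6.0)
root = Node(children=[a, b])
first = select_child(root)   # slightly better child
second = select_child(root)  # diverted by the virtual loss on the first
```

In a real multithreaded search the counter updates would need to be atomic or protected by a lock; the sketch is sequential and omits that detail.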

Keywords

Core Machine; Play Quality; Multiarmed Bandit; Parallel Scaling; Multiarmed Bandit Problem

Copyright information

© Springer-Verlag Berlin Heidelberg 2011

Authors and Affiliations

  • Richard B. Segal, IBM Research, Yorktown Heights, USA
