
Parallel Monte-Carlo Tree Search for HPC Systems

  • Tobias Graf
  • Ulf Lorenz
  • Marco Platzner
  • Lars Schaefers
Part of the Lecture Notes in Computer Science book series (LNCS, volume 6853)

Abstract

Monte-Carlo Tree Search (MCTS) is a simulation-based search method that has brought great success to applications such as Computer Go in the past few years. The power of MCTS depends strongly on the number of simulations computed per time unit and on the amount of memory available to store the data gathered during simulation. High-performance computing systems such as large compute clusters provide vast computation and memory resources and thus seem to be natural targets for running MCTS. So far, however, only a few publications deal with parallelizing MCTS for distributed-memory machines. In this paper, we present a novel approach to parallelizing MCTS that spreads both the work and the memory load evenly among all compute nodes of a distributed-memory HPC system. We describe our approach, termed UCT-Treesplit, and evaluate its performance using a state-of-the-art Go engine as an example.
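
As background only: the abstract names UCT but does not spell it out. The following is a minimal sketch of the standard UCB1 child-selection rule that UCT applies at each tree node, not the UCT-Treesplit method itself. The function names, the (wins, visits) representation, and the default exploration constant are assumptions made for illustration.

```python
import math


def ucb1_score(child_wins, child_visits, parent_visits, c=math.sqrt(2)):
    """UCB1 value of a child: mean reward plus an exploration bonus.

    The default c = sqrt(2) is an assumed constant that reproduces the
    textbook UCB1 term sqrt(2 * ln(n) / n_j).
    """
    if child_visits == 0:
        return math.inf  # unvisited children are tried first
    mean = child_wins / child_visits
    return mean + c * math.sqrt(math.log(parent_visits) / child_visits)


def select_child(children):
    """Return the index of the child with the highest UCB1 score.

    `children` is a list of (wins, visits) pairs (hypothetical layout);
    the parent visit count is taken as the sum of the children's visits.
    """
    parent_visits = sum(visits for _, visits in children)
    return max(
        range(len(children)),
        key=lambda i: ucb1_score(children[i][0], children[i][1], parent_visits),
    )
```

For instance, `select_child([(5, 10), (0, 0)])` picks the second child because it has not been visited yet. Each additional simulation refines these statistics, which is why simulation throughput and the memory available for node statistics matter.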

Keywords

UCT, HPC, Monte-Carlo Tree Search, distributed memory

Copyright information

© Springer-Verlag Berlin Heidelberg 2011

Authors and Affiliations

  • Tobias Graf (1)
  • Ulf Lorenz (2)
  • Marco Platzner (1)
  • Lars Schaefers (1)
  1. University of Paderborn, Paderborn, Germany
  2. TU Darmstadt, Darmstadt, Germany
