Monte-Carlo Tree Search Enhancements for Havannah

  • Jan A. Stankiewicz
  • Mark H. M. Winands
  • Jos W. H. M. Uiterwijk
Part of the Lecture Notes in Computer Science book series (LNCS, volume 7168)

Abstract

This article shows how the performance of a Monte-Carlo Tree Search (MCTS) player for Havannah can be improved by guiding the search in the playout and selection steps of MCTS. To improve the playout step of the MCTS algorithm, we used two techniques to direct the simulations, Last-Good-Reply (LGR) and N-grams. Experiments reveal that LGR gives a significant improvement, although the size of the gain depends on which LGR variant is used. Using N-grams to guide the playouts also achieves a significant increase in the winning percentage. Combining N-grams with LGR leads to a small additional improvement. To enhance the selection step of the MCTS algorithm, we initialize the visit and win counts of the new nodes based on pattern knowledge. By biasing the selection towards joint/neighbor moves, local connections, and edge/corner connections, a significant improvement in the performance is obtained. Experiments show that the best overall performance is obtained when combining the visit-and-win-count initialization with LGR and N-grams. In the best case, a winning percentage of 77.5% can be achieved against the default MCTS program.
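
The Last-Good-Reply idea mentioned in the abstract can be illustrated with a minimal sketch. It assumes the basic LGR-1 scheme as generally described in the literature (one stored reply per player and previous move), with an optional forgetting step. The class name, method names, and move representation are hypothetical, and the abstract does not state which variants or parameters the authors actually used.

```python
import random


class LastGoodReply:
    """Illustrative LGR-1 table (hypothetical names, not the authors' code).

    For each player it maps an opponent move to the reply that most
    recently followed that move in a playout the player won.
    """

    def __init__(self, forgetting=False):
        self.forgetting = forgetting
        self.replies = {0: {}, 1: {}}  # player -> {previous move: reply}

    def choose(self, player, prev_move, legal_moves, rng=random):
        """Play the stored reply if it is legal, otherwise a random move."""
        reply = self.replies[player].get(prev_move)
        if reply is not None and reply in legal_moves:
            return reply
        return rng.choice(list(legal_moves))

    def update(self, moves, winner):
        """Update the table from one playout.

        moves is a list of (player, move) pairs in the order played.
        """
        for (_, prev), (player, move) in zip(moves, moves[1:]):
            if player == winner:
                # Remember the winner's reply to the preceding move.
                self.replies[player][prev] = move
            elif self.forgetting and self.replies[player].get(prev) == move:
                # Assumed forgetting step: drop a stored reply that
                # was just played in a lost playout.
                del self.replies[player][prev]
```

In a playout, `choose` would be called for every move with the opponent's previous move, and `update` once after the playout result is known.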

Keywords

Good Reply, Selection Step, Decay Factor, Local Connection, Previous Move

Copyright information

© Springer-Verlag Berlin Heidelberg 2012

Authors and Affiliations

  • Jan A. Stankiewicz (1)
  • Mark H. M. Winands (1)
  • Jos W. H. M. Uiterwijk (1)
  1. Department of Knowledge Engineering, Maastricht University, The Netherlands
