
A Curling Agent Based on the Monte-Carlo Tree Search Considering the Similarity of the Best Action Among Similar States

  • Katsuki Ohto
  • Tetsuro Tanaka
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 10664)

Abstract

Curling is one of the most strategic winter sports, and curling strategies have recently attracted the attention of many computer scientists. The Digital Curling system is a framework for comparing curling strategies. Herein, we present a computer agent for the Digital Curling framework based on the Monte-Carlo Tree Search (MCTS). We implemented a novel action decision method based on MCTS for Markov decision processes with continuous state spaces. The experimental results show that our search method is effective both for agents with a simple simulation policy and for agents with a more complex, handcrafted one.
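The idea described in the title and abstract can be illustrated with a minimal sketch: a UCT-style search over a coarse grid of shot parameters, in which nearby (similar) states are hashed to the same table entry so that their action statistics are shared. Everything below (the toy simulate_shot physics, the state_key discretisation, and the one-dimensional action grid) is a hypothetical illustration under assumed names, not the paper's actual method or the Digital Curling simulator.

```python
import math
import random

# Hypothetical 1-D shot parameterisation and toy stochastic simulator; the
# real Digital Curling state and physics are not reproduced here.
def simulate_shot(state, action):
    """The executed shot deviates from the intended action (execution
    uncertainty); the reward is higher the closer the result is to 1.0."""
    executed = action + random.gauss(0.0, 0.05)
    result = state + executed
    return result, -abs(result - 1.0)  # next state, immediate reward

class Node:
    def __init__(self):
        self.visits = 0
        self.stats = {}  # discretised action -> (visit count, total reward)

def state_key(state, resolution=0.1):
    """Map similar states to the same key so they share action statistics."""
    return round(state / resolution)

def ucb_action(node, actions, c=1.4):
    """Pick the action maximising the UCB1 score over the node's statistics."""
    best, best_score = None, -float("inf")
    for a in actions:
        n, w = node.stats.get(a, (0, 0.0))
        if n == 0:
            return a  # try unvisited actions first
        score = w / n + c * math.sqrt(math.log(node.visits) / n)
        if score > best_score:
            best, best_score = a, score
    return best

def mcts_decide(state, candidate_actions, n_playouts=2000):
    """One-ply UCT-style decision: similar states map to one table entry,
    so statistics gathered in one state inform choices in nearby states."""
    table = {}
    node = table.setdefault(state_key(state), Node())
    for _ in range(n_playouts):
        node.visits += 1
        a = ucb_action(node, candidate_actions)
        _, reward = simulate_shot(state, a)
        n, w = node.stats.get(a, (0, 0.0))
        node.stats[a] = (n + 1, w + reward)
    # return the most visited action
    return max(node.stats, key=lambda a: node.stats[a][0])

if __name__ == "__main__":
    actions = [round(0.1 * i, 1) for i in range(21)]  # coarse action grid
    print(mcts_decide(state=0.3, candidate_actions=actions))
```

In the actual framework a shot has more parameters, execution noise is part of the official simulator, and the search extends beyond a single ply; the sketch only shows how statistics shared across similar states bias the immediate action choice.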


Copyright information

© Springer International Publishing AG 2017

Authors and Affiliations

  1. The University of Tokyo, Tokyo, Japan
