Adding Expert Knowledge and Exploration in Monte-Carlo Tree Search

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 6048)

Abstract

We present a new exploration term that is more efficient than classical UCT-like exploration terms. It efficiently combines expert rules, patterns extracted from datasets, All-Moves-As-First values, and classical online values. Because this improved bandit formula does not handle several important situations in computer Go (semeais, nakade), we present three further improvements that are central to the recent progress of our program MoGo.

  • We show an expert-based improvement of Monte-Carlo simulations for nakade situations; we also emphasize some limitations of this modification.

  • We show a technique that preserves diversity in the Monte-Carlo simulations and greatly improves the results on 19x19 boards.

  • Whereas the UCB-based exploration term is not efficient in MoGo, we show a new exploration term that is highly efficient in MoGo.

MoGo recently won a game with handicap 7 against a 9-dan professional player, Zhou JunXun, winner of the LG Cup 2007, and a game with handicap 6 against a 1-dan professional player, Li-Chen Chien.
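For readers unfamiliar with the baseline the abstract compares against, the following is a minimal sketch of the classical UCB1 score used in UCT-style tree search: an empirical win rate plus an exploration bonus that shrinks as a move is visited more often. It is not the exploration term proposed in the paper, whose formula is not given in this abstract. The function names, the `stats` layout, and the default constant `c` are assumptions made for illustration only.

```python
import math


def ucb1_score(wins, visits, parent_visits, c=math.sqrt(2)):
    """Classical UCB1 score: mean reward plus exploration bonus.

    `c` is a tunable exploration constant (hypothetical default).
    Unvisited moves get an infinite score so each is tried once.
    """
    if visits == 0:
        return math.inf
    mean = wins / visits
    bonus = c * math.sqrt(math.log(parent_visits) / visits)
    return mean + bonus


def select_move(stats, c=math.sqrt(2)):
    """Pick the move with the highest UCB1 score.

    `stats` maps a move label to a (wins, visits) pair; this layout
    is an assumption for the sketch, not MoGo's data structure.
    """
    # Guard against log(0) when no move has been visited yet.
    parent_visits = max(sum(v for _, v in stats.values()), 1)
    return max(
        stats,
        key=lambda m: ucb1_score(stats[m][0], stats[m][1], parent_visits, c),
    )
```

With equal visit counts the move with the higher win rate is selected, while a rarely visited move can outrank a well-explored one because its bonus term is larger. Setting `c=0` removes the exploration bonus and leaves only the empirical win rate.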

Copyright information

© 2010 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Chaslot, G., Fiter, C., Hoock, JB., Rimmel, A., Teytaud, O. (2010). Adding Expert Knowledge and Exploration in Monte-Carlo Tree Search. In: van den Herik, H.J., Spronck, P. (eds) Advances in Computer Games. ACG 2009. Lecture Notes in Computer Science, vol 6048. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-12993-3_1

  • DOI: https://doi.org/10.1007/978-3-642-12993-3_1

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-12992-6

  • Online ISBN: 978-3-642-12993-3

  • eBook Packages: Computer Science, Computer Science (R0)
