Upper Confidence Trees with Short Term Partial Information

  • Conference paper
Applications of Evolutionary Computation (EvoApplications 2011)

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 6624)

Abstract

We show some mathematical links between partially observable (PO) games in which information is regularly revealed and games with simultaneous actions. Using these links, we study the extension of Monte-Carlo Tree Search algorithms to PO games and to games with simultaneous actions. We apply the results to Urban Rivals, a free PO internet card game with more than 10 million registered users.
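To make the idea of a simultaneous-action decision node concrete, here is a minimal sketch, not the paper's implementation. The abstract does not say which selection rule is used at such nodes; the sketch assumes EXP3, a standard adversarial bandit algorithm, run independently by each player. The names `Exp3`, `play_simultaneous_node`, `gamma` and the payoff matrices are illustrative assumptions.

```python
import math
import random


class Exp3:
    """EXP3 adversarial bandit over k arms, rewards assumed in [0, 1]."""

    def __init__(self, k, gamma=0.1, rng=None):
        self.k = k
        self.gamma = gamma              # exploration rate (assumed value)
        self.log_weights = [0.0] * k    # log-domain weights avoid overflow
        self.rng = rng or random.Random()

    def probabilities(self):
        # Mix the normalised exponential weights with uniform exploration.
        m = max(self.log_weights)
        w = [math.exp(lw - m) for lw in self.log_weights]
        total = sum(w)
        return [(1 - self.gamma) * wi / total + self.gamma / self.k for wi in w]

    def draw(self):
        probs = self.probabilities()
        arm = self.rng.choices(range(self.k), weights=probs)[0]
        return arm, probs[arm]

    def update(self, arm, prob, reward):
        # Importance-weighted estimate: only the played arm is updated.
        self.log_weights[arm] += self.gamma * (reward / prob) / self.k


def play_simultaneous_node(payoff, rounds=20000, seed=0):
    """Two EXP3 learners pick actions at the same time in a zero-sum game.

    payoff[a][b] is the row player's reward in [0, 1]; the column player
    receives 1 - payoff[a][b]. Returns the empirical action frequencies.
    """
    rng = random.Random(seed)
    row = Exp3(len(payoff), rng=rng)
    col = Exp3(len(payoff[0]), rng=rng)
    row_counts = [0] * row.k
    col_counts = [0] * col.k
    for _ in range(rounds):
        a, pa = row.draw()   # neither player sees the other's choice
        b, pb = col.draw()
        r = payoff[a][b]
        row.update(a, pa, r)
        col.update(b, pb, 1.0 - r)
        row_counts[a] += 1
        col_counts[b] += 1
    return ([c / rounds for c in row_counts],
            [c / rounds for c in col_counts])


if __name__ == "__main__":
    # Matching pennies: the row player wins when both actions match.
    matching_pennies = [[1.0, 0.0], [0.0, 1.0]]
    print(play_simultaneous_node(matching_pennies))
```

A deterministic rule such as UCB can be exploited when both players move at once, which is why a randomised bandit is a natural choice in this sketch; whether the paper makes the same choice cannot be confirmed from the abstract alone.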


Copyright information

© 2011 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Teytaud, O., Flory, S. (2011). Upper Confidence Trees with Short Term Partial Information. In: Di Chio, C., et al. Applications of Evolutionary Computation. EvoApplications 2011. Lecture Notes in Computer Science, vol 6624. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-20525-5_16

  • DOI: https://doi.org/10.1007/978-3-642-20525-5_16

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-20524-8

  • Online ISBN: 978-3-642-20525-5

  • eBook Packages: Computer Science, Computer Science (R0)
