Abstract
We show some mathematical links between partially observable (PO) games in which information is regularly revealed and simultaneous-action games. Using these links, we study the extension of Monte-Carlo Tree Search algorithms to PO games and to games with simultaneous actions. We apply the results to Urban Rivals, a free PO internet card game with more than 10 million registered users.
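To make the simultaneous-action setting concrete, here is a minimal sketch of how a single tree node might choose actions when both players move at once. It assumes each player runs an adversarial bandit of the EXP3 kind (Auer et al.) over a zero-sum payoff matrix with rewards in [0, 1]. The class `Exp3`, the function `play_node`, the rock-paper-scissors matrix, and the value `gamma=0.1` are illustrative assumptions, not details taken from the abstract.

```python
import math
import random


class Exp3:
    """EXP3 adversarial bandit over a fixed set of actions (illustrative sketch)."""

    def __init__(self, n_actions, gamma=0.1, rng=None):
        self.n = n_actions
        self.gamma = gamma  # exploration rate, an assumed value
        self.log_weights = [0.0] * n_actions  # log-space avoids overflow
        self.rng = rng or random.Random()

    def probabilities(self):
        # Mix the normalized exponential weights with uniform exploration.
        m = max(self.log_weights)
        w = [math.exp(lw - m) for lw in self.log_weights]
        total = sum(w)
        return [(1 - self.gamma) * wi / total + self.gamma / self.n for wi in w]

    def sample(self):
        p = self.probabilities()
        action = self.rng.choices(range(self.n), weights=p)[0]
        return action, p[action]

    def update(self, action, prob, reward):
        # Importance-weighted reward estimate for the action actually played.
        self.log_weights[action] += self.gamma * (reward / prob) / self.n


def play_node(payoff, iterations=20000, seed=0):
    """Self-play at one simultaneous-action node.

    payoff[i][j] is the row player's reward in [0, 1]; the column player
    receives 1 - payoff[i][j]. Returns the empirical action frequencies.
    """
    rng = random.Random(seed)
    rows, cols = len(payoff), len(payoff[0])
    row_player = Exp3(rows, rng=rng)
    col_player = Exp3(cols, rng=rng)
    row_counts = [0] * rows
    col_counts = [0] * cols
    for _ in range(iterations):
        # Both players commit to an action without seeing the other's choice.
        i, p_i = row_player.sample()
        j, p_j = col_player.sample()
        reward = payoff[i][j]
        row_player.update(i, p_i, reward)
        col_player.update(j, p_j, 1.0 - reward)
        row_counts[i] += 1
        col_counts[j] += 1
    return ([c / iterations for c in row_counts],
            [c / iterations for c in col_counts])


# Rock-paper-scissors for the row player: win = 1, draw = 0.5, loss = 0.
RPS = [
    [0.5, 0.0, 1.0],
    [1.0, 0.5, 0.0],
    [0.0, 1.0, 0.5],
]

if __name__ == "__main__":
    row_freq, col_freq = play_node(RPS)
    print("row frequencies:", row_freq)
    print("column frequencies:", col_freq)
```

The returned frequencies are the time-averaged mixed strategies of the two players. A deterministic choice rule would be exploitable when moves are simultaneous, which is why this sketch samples from a randomized bandit instead.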
Copyright information
© 2011 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Teytaud, O., Flory, S. (2011). Upper Confidence Trees with Short Term Partial Information. In: Di Chio, C., et al. Applications of Evolutionary Computation. EvoApplications 2011. Lecture Notes in Computer Science, vol 6624. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-20525-5_16
DOI: https://doi.org/10.1007/978-3-642-20525-5_16
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-20524-8
Online ISBN: 978-3-642-20525-5
eBook Packages: Computer Science, Computer Science (R0)