Abstract
We investigate the use of Monte-Carlo Tree Search (MCTS) within the field of computer Poker, more specifically No-Limit Texas Hold’em. The hidden information in Poker results in so called miximax game trees where opponent decision nodes have to be modeled as chance nodes. The probability distribution in these nodes is modeled by an opponent model that predicts the actions of the opponents. We propose a modification of the standard MCTS selection and backpropagation strategies that explicitly model and exploit the uncertainty of sampled expected values. The new strategies are evaluated as a part of a complete Poker bot that is, to the best of our knowledge, the first exploiting no-limit Texas Hold’em bot that can play at a reasonable level in games of more than two players.
This is a preview of subscription content, log in via an institution.
Buying options
Tax calculation will be finalised at checkout
Purchases are for personal use only
Learn about institutional subscriptionsPreview
Unable to display preview. Download preview PDF.
References
Gilpin, A., Sandholm, T., Sørensen, T.: A heads-up no-limit Texas Hold’em poker player: discretized betting models and automatically generated equilibrium-finding programs. In: Proceedings of the 7th international joint conference on Autonomous agents and multiagent systems, International Foundation for Autonomous Agents and Multiagent Systems Richland, SC, vol. 2, pp. 911–918 (2008)
Billings, D.: Algorithms and assessment in computer poker. PhD thesis, Edmonton, Alta, Canada (2006)
Kocsis, L., Szepesvari, C.: Bandit based monte-carlo planning. In: Fürnkranz, J., Scheffer, T., Spiliopoulou, M. (eds.) ECML 2006. LNCS (LNAI), vol. 4212, pp. 282–293. Springer, Heidelberg (2006)
Russell, S., Norvig, P.: Artificial intelligence: A modern approach. Prentice Hall, Englewood Cliffs (2003)
Suffecool, K.: Cactus kev’s poker hand evaluator (July 2007), http://www.suffecool.net/poker/evaluator.html
Witten Ian, H., Eibe, F.: Data Mining: Practical machine learning tools and techniques. Morgan Kaufmann, San Francisco (2005)
Wang, Y., Witten, I.: Induction of model trees for predicting continuous classes (1996)
Coulom, R.: Efficient selectivity and backup operators in Monte-Carlo tree search. In: van den Herik, H.J., Ciancarini, P., Donkers, H.H.L.M(J.) (eds.) CG 2006. LNCS, vol. 4630, pp. 72–83. Springer, Heidelberg (2007)
Gelly, S., Wang, Y.: Exploration exploitation in go: UCT for Monte-Carlo go. In: Twentieth Annual Conference on Neural Information Processing Systems, NIPS 2006 (2006)
Chaslot, G., Winands, M., Herik, H., Uiterwijk, J., Bouzy, B.: Progressive strategies for monte-carlo tree search. New Mathematics and Natural Computation 4(3), 343 (2008)
Van Lishout, F., Chaslot, G., Uiterwijk, J.: Monte-Carlo Tree Search in Backgammon. In: Computer Games Workshop, pp. 175–184 (2007)
Auer, P., Cesa-Bianchi, N., Fischer, P.: Finite-time analysis of the multiarmed bandit problem. Machine Learning 47(2-3), 235–256 (2002)
Author information
Authors and Affiliations
Editor information
Editors and Affiliations
Rights and permissions
Copyright information
© 2009 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Van den Broeck, G., Driessens, K., Ramon, J. (2009). Monte-Carlo Tree Search in Poker Using Expected Reward Distributions. In: Zhou, ZH., Washio, T. (eds) Advances in Machine Learning. ACML 2009. Lecture Notes in Computer Science(), vol 5828. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-05224-8_28
Download citation
DOI: https://doi.org/10.1007/978-3-642-05224-8_28
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-05223-1
Online ISBN: 978-3-642-05224-8
eBook Packages: Computer ScienceComputer Science (R0)