Adapting Strategies to Opponent Models in Incomplete Information Games: A Reinforcement Learning Approach for Poker

  • Luís Filipe Teófilo
  • Nuno Passos
  • Luís Paulo Reis
  • Henrique Lopes Cardoso
Part of the Lecture Notes in Computer Science book series (LNCS, volume 7326)

Abstract

Research in the field of incomplete information games (IIG) requires the development of strategies that focus on optimizing the decision-making process, since there is no unequivocally best choice for a particular play. This paper describes the development and testing of an agent able to compete against human players in Poker, one of the most popular IIG. The methodology combines pre-defined opponent models with a reinforcement learning approach: the decision-making algorithm builds a different strategy against each type of opponent by identifying the opponent's type and adjusting the rewards of the actions in the corresponding strategy. The opponent models are simple classifications used by Poker experts. Each strategy is therefore adapted continuously throughout the games, steadily improving the agent's performance. Accordingly, two agents with the same structure but different rewarding conditions were developed and tested against other agents and against each other. The test results indicate that, after a training phase, the developed strategy outperforms basic and intermediate playing strategies, thus validating the approach.
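To illustrate the idea of keeping a separate, reward-adjusted strategy per opponent model, the sketch below shows one possible organization in Python. It is a minimal, hypothetical sketch and not the authors' implementation: the four opponent types, the `vpip`/`aggression` statistics and their thresholds, the epsilon-greedy action choice, and all class and parameter names are assumptions made for illustration only.

```python
import random
from collections import defaultdict

# Assumed opponent classification commonly used by poker experts;
# the paper only states that "simple classifications" are used.
OPPONENT_TYPES = ["loose-aggressive", "loose-passive",
                  "tight-aggressive", "tight-passive"]
ACTIONS = ["fold", "call", "raise"]


class OpponentAdaptiveAgent:
    """Minimal sketch: one action-value table per opponent model,
    updated with a simple incremental reinforcement-learning rule."""

    def __init__(self, learning_rate=0.1, exploration=0.05):
        self.alpha = learning_rate      # step size for reward adjustment
        self.epsilon = exploration      # exploration rate (assumed)
        # q[opponent_type][state][action] -> estimated reward
        self.q = {t: defaultdict(lambda: {a: 0.0 for a in ACTIONS})
                  for t in OPPONENT_TYPES}

    def classify_opponent(self, stats):
        """Hypothetical classifier mapping observed frequencies to a type."""
        loose = stats["vpip"] > 0.30            # plays many hands
        aggressive = stats["aggression"] > 1.0  # raises more than it calls
        return (("loose-" if loose else "tight-") +
                ("aggressive" if aggressive else "passive"))

    def choose_action(self, opponent_type, state):
        """Epsilon-greedy choice from the strategy tied to this opponent model."""
        if random.random() < self.epsilon:
            return random.choice(ACTIONS)
        values = self.q[opponent_type][state]
        return max(values, key=values.get)

    def update(self, opponent_type, state, action, reward):
        """Adjust the reward estimate of the action taken against this model."""
        old = self.q[opponent_type][state][action]
        self.q[opponent_type][state][action] = old + self.alpha * (reward - old)
```

Under these assumptions, a hand would be played by first calling `classify_opponent` on the observed statistics, then `choose_action` at each decision point, and finally `update` with the hand's outcome, so that only the strategy associated with the identified opponent type is adjusted.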

Keywords

Incomplete Information Games · Opponent Modeling · Reinforcement Learning · Poker



Copyright information

© Springer-Verlag Berlin Heidelberg 2012

Authors and Affiliations

  • Luís Filipe Teófilo (1, 2)
  • Nuno Passos (2)
  • Luís Paulo Reis (1, 3)
  • Henrique Lopes Cardoso (1, 2)

  1. LIACC – Artificial Intelligence and Computer Science Lab., University of Porto, Portugal
  2. FEUP – Faculty of Engineering, DEI, University of Porto, Portugal
  3. EEUM – School of Engineering, DSI, University of Minho, Portugal