Abstract
Counterfactual Regret Minimisation (CFR) is the leading technique for approximating Nash equilibria in imperfect-information games. It was an integral part of Libratus, the first AI to beat professionals at heads-up no-limit Texas Hold'em poker. However, current implementations of CFR rely on a tabular game representation and hand-crafted abstractions to reduce the state space, limiting their ability to scale to larger and more complex games. More recently, techniques such as Deep CFR (DCFR), Variance-Reduced Monte Carlo CFR (VR-MCCFR) and Double Neural CFR (DN-CFR) have been proposed to alleviate CFR's shortcomings, both by learning the game state and by reducing the overall computation through aggressive sampling. To properly test potential performance improvements, a class of games harder than poker is required, especially since current poker agents are already superhuman. The trading card game Yu-Gi-Oh! was selected because its game interactions are highly sophisticated, its state space is many orders of magnitude larger than poker's, and simulator implementations already exist. It also introduces the concept of a meta-strategy, where a player strategically chooses a specific set of cards from a large pool to play. Overall, this work evaluates whether newer CFR methods scale to harder games by comparing the relative performance of existing techniques, such as regular CFR and heuristic agents, against the newer DCFR, while also examining whether these agents can provide automated evaluation of meta-strategies.
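As a concrete illustration of the core update rule inside CFR (this sketch is not code from the paper), regret matching can be run in self-play on rock-paper-scissors: each player accumulates counterfactual regret for each action, plays proportionally to positive regret, and the *average* strategy converges toward the Nash equilibrium, which for this game is uniform. All names and constants below are illustrative.

```python
# Minimal regret-matching self-play for rock-paper-scissors (0=rock, 1=paper, 2=scissors).
# Regret matching is the per-information-set update at the heart of tabular CFR
# (Zinkevich et al., 2007); in two-player zero-sum self-play the time-averaged
# strategy approaches a Nash equilibrium.

ACTIONS = 3

def payoff(a, b):
    """Utility of action a against action b: +1 win, 0 draw, -1 loss."""
    return [0, 1, -1][(a - b) % 3]

def current_strategy(regret_sum):
    """Map cumulative positive regrets to a mixed strategy (regret matching)."""
    positives = [max(r, 0.0) for r in regret_sum]
    total = sum(positives)
    if total > 0:
        return [p / total for p in positives]
    return [1.0 / ACTIONS] * ACTIONS  # fall back to uniform when no positive regret

def train(iterations=20000):
    # Small asymmetric seed so the self-play dynamics are non-trivial;
    # from exactly uniform play the regrets would never move.
    regret_sum = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
    strategy_sum = [[0.0] * ACTIONS, [0.0] * ACTIONS]
    for _ in range(iterations):
        strats = [current_strategy(r) for r in regret_sum]
        for p in range(2):
            for a in range(ACTIONS):
                strategy_sum[p][a] += strats[p][a]
            # Expected utility of each action against the opponent's current mix.
            opp = strats[1 - p]
            action_util = [sum(opp[b] * payoff(a, b) for b in range(ACTIONS))
                           for a in range(ACTIONS)]
            node_util = sum(strats[p][a] * action_util[a] for a in range(ACTIONS))
            # Regret = how much better action a would have done than the mix played.
            for a in range(ACTIONS):
                regret_sum[p][a] += action_util[a] - node_util
    # Normalised average strategy approximates the Nash equilibrium (uniform here).
    return [[s / sum(ss) for s in ss] for ss in strategy_sum]

avg = train()
```

Full CFR extends this same update to every information set of a sequential game; DCFR replaces the per-set regret tables with neural networks so the regrets generalise across states instead of being stored exhaustively.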
References
Dueling Book. https://www.duelingbook.com/
Project Ignis: Card scripts for EDOPro. https://github.com/ProjectIgnis/CardScripts
Project Ignis: EDOPro. https://github.com/ProjectIgnis/EDOPro
Korf, R.E., Chickering, D.M.: Best-first minimax search. Artif. Intell. 84(1), 299–337 (1996). https://doi.org/10.1016/0004-3702(95)00096-8
Akira: Yu-Gi-Oh! World Championship 2013. https://roadoftheking.com/yu-gi-oh-world-championship-2013/
Bowling, M., Burch, N., Johanson, M., Tammelin, O.: Heads-up limit Hold’em poker is solved. Science 347(6218), 145–149 (2015)
Brown, N., Ganzfried, S., Sandholm, T.: Hierarchical abstraction, distributed equilibrium computation, and post-processing, with application to a champion no-limit Texas Hold’em agent. In: Workshops at the Twenty-Ninth AAAI Conference on Artificial Intelligence (2015)
Brown, N., Lerer, A., Gross, S., Sandholm, T.: Deep counterfactual regret minimization. CoRR abs/1811.00164 (2018). http://arxiv.org/abs/1811.00164
Brown, N., Sandholm, T.: Superhuman AI for heads-up no-limit poker: Libratus beats top professionals. Science 359(6374), 418–424 (2018). https://doi.org/10.1126/science.aao1733
Browne, C., et al.: A survey of Monte Carlo tree search methods. IEEE Trans. Comput. Intell. AI Games 4(1), 1–43 (2012). https://doi.org/10.1109/TCIAIG.2012.2186810
Cowling, P.I., Ward, C.D., Powley, E.J.: Ensemble determinization in Monte Carlo tree search for the imperfect information card game Magic: The Gathering. IEEE Trans. Comput. Intell. AI Games 4(4), 241–257 (2012)
Dockhorn, A., Mostaghim, S.: Introducing the Hearthstone-AI competition. arXiv preprint arXiv:1906.04238 (2019)
Grad, Ł.: Helping AI to play Hearthstone using neural networks. In: 2017 Federated Conference on Computer Science and Information Systems (FedCSIS), pp. 131–134 (2017). https://doi.org/10.15439/2017F561
James, S., Konidaris, G., Rosman, B.: An analysis of Monte Carlo tree search. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 31 (2017)
Blair, J., Mutchler, D., Liu, C.: Games with imperfect information (1993)
Johanson, M., Bard, N., Lanctot, M., Gibson, R.G., Bowling, M.: Efficient Nash equilibrium approximation through Monte Carlo counterfactual regret minimization. In: AAMAS, pp. 837–846. Citeseer (2012)
Konami: Yu-Gi-Oh! Duel Links. https://www.konami.com/yugioh/duel_links/en/
Li, H., Hu, K., Zhang, S., Qi, Y., Song, L.: Double neural counterfactual regret minimization. In: International Conference on Learning Representations (2020). https://openreview.net/forum?id=ByedzkrKvH
Matros, A.: Lloyd Shapley and chess with imperfect information. Games Econ. Behav. 108, 600–613 (2018). https://doi.org/10.1016/j.geb.2017.12.003. Special issue in honor of Lloyd Shapley: seven topics in game theory
Schmid, M., Burch, N., Lanctot, M., Moravcik, M., Kadlec, R., Bowling, M.: Variance reduction in Monte Carlo counterfactual regret minimization (VR-MCCFR) for extensive form games using baselines. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 33, pp. 2157–2164 (2019)
Silver, D., et al.: Mastering chess and shogi by self-play with a general reinforcement learning algorithm. arXiv preprint arXiv:1712.01815 (2017)
Silver, D., et al.: Mastering the game of go without human knowledge. Nature 550(7676), 354–359 (2017)
SinL0rtuen: New format's top tiers - let's make a list together II. https://www.pojo.biz/board/showthread.php?t=991471
Syrgkanis, V., Agarwal, A., Luo, H., Schapire, R.E.: Fast convergence of regularized learning in games. arXiv preprint arXiv:1507.00407 (2015)
Ward, C.D., Cowling, P.I.: Monte Carlo search applied to card selection in magic: the gathering. In: 2009 IEEE Symposium on Computational Intelligence and Games, pp. 9–16 (2009). https://doi.org/10.1109/CIG.2009.5286501
Zinkevich, M., Johanson, M., Bowling, M., Piccione, C.: Regret minimization in games with incomplete information. In: Advances in Neural Information Processing Systems, vol. 20, pp. 1729–1736 (2007)
Copyright information
© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG
Cite this paper
Adams, D. (2022). The Feasibility of Deep Counterfactual Regret Minimisation for Trading Card Games. In: Aziz, H., Corrêa, D., French, T. (eds) AI 2022: Advances in Artificial Intelligence. Lecture Notes in Computer Science, vol. 13728. Springer, Cham. https://doi.org/10.1007/978-3-031-22695-3_11
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-22694-6
Online ISBN: 978-3-031-22695-3