Deep Reinforcement Learning in Strategic Board Game Environments

Xenou, Konstantia; Chalkiadakis, Georgios; Afantenos, Stergos

doi:10.1007/978-3-030-14174-5_16

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 11450)

Included in the following conference series:

European Conference on Multi-Agent Systems

Abstract

In this paper we propose a novel Deep Reinforcement Learning (DRL) algorithm that uses the concept of “action-dependent state features” and exploits it to approximate the Q-values locally, employing a deep neural network with parallel Long Short-Term Memory (LSTM) components, each one responsible for computing an action-related Q-value. As such, all computations occur simultaneously, and there is no need to employ “target” networks or experience replay, techniques regularly used in the DRL literature. Moreover, our algorithm does not require previous training experiences, but trains itself online during game play. We tested our approach in the Settlers of Catan multi-player strategic board game. Our results confirm the effectiveness of our approach: it outperforms several competitors, including the state-of-the-art jSettler heuristic algorithm devised for this particular domain.
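The idea of one Q-value approximator per action, trained online without a target network or replay buffer, can be illustrated with a minimal sketch. This is not the authors' implementation: each LSTM component is replaced here by a plain linear approximator to keep the example short and dependency-free, and all names (ActionHead, LocalQAgent, the feature vectors, and the hyperparameters) are hypothetical.

```python
import random


class ActionHead:
    """Linear stand-in (an assumption) for one per-action LSTM component."""

    def __init__(self, n_features, lr=0.1):
        self.w = [0.0] * n_features
        self.lr = lr

    def q_value(self, features):
        # Q-value of this head's action, computed from its own features only.
        return sum(w * x for w, x in zip(self.w, features))

    def update(self, features, target):
        # One semi-gradient step toward the TD target; returns the TD error.
        error = target - self.q_value(features)
        self.w = [w + self.lr * error * x for w, x in zip(self.w, features)]
        return error


class LocalQAgent:
    """Keeps one independent head per action and learns online."""

    def __init__(self, actions, n_features, gamma=0.9, epsilon=0.1, seed=0):
        self.heads = {a: ActionHead(n_features) for a in actions}
        self.gamma = gamma
        self.epsilon = epsilon
        self.rng = random.Random(seed)

    def q_values(self, features_by_action):
        # Each head is evaluated independently of the others.
        return {a: self.heads[a].q_value(f)
                for a, f in features_by_action.items()}

    def act(self, features_by_action):
        # Epsilon-greedy choice over the heads' outputs.
        if self.rng.random() < self.epsilon:
            return self.rng.choice(sorted(features_by_action))
        q = self.q_values(features_by_action)
        return max(sorted(q), key=q.get)

    def learn(self, features, action, reward, next_features_by_action, done):
        # The TD target is built from the current heads: no target network,
        # and the transition is used once and discarded: no replay buffer.
        next_q = 0.0
        if not done:
            next_q = max(self.q_values(next_features_by_action).values(),
                         default=0.0)
        target = reward + self.gamma * next_q
        return self.heads[action].update(features, target)


if __name__ == "__main__":
    # Toy usage with two made-up actions and hand-written feature vectors.
    agent = LocalQAgent(["build", "trade"], n_features=2, epsilon=0.0)
    feats = {"build": [1.0, 0.0], "trade": [0.0, 1.0]}
    for _ in range(200):
        agent.learn(feats["build"], "build", 1.0, {}, done=True)
    print(agent.act(feats), agent.q_values(feats))
```

Only the head of the action actually taken is updated, so the learning signal stays local to that action's features. Swapping the linear head for a recurrent one would change q_value and update but not the surrounding control flow.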

Notes

1. We remark that no factored state representation was assumed in [25]; rather, each state was linked to a single action-dependent feature (with its set of values).
2. More accurately, in our implementation this is done in a pseudo-parallel manner: all LSTMs are executed independently, and the final action is selected given their outputs.
3. http://nand.net/jsettlers/.
4. In general, our action and game setup follows [4].
5. Compare this number to the 500,000 learning experiences required by the DRL agent in [4].

References

1. Afantenos, S., Kow, E., Asher, N., Perret, J.: Discourse parsing for multi-party chat dialogues. Proc. EMNLP 2015, 928–937 (2015)
2. Anschel, O., Baram, N., Shimkin, N.: Deep reinforcement learning with averaged target DQN. CoRR abs/1611.01929 (2016)
3. Bellman, R.: Dynamic programming. Courier Corporation, Chelmsford (2013)
4. Cuayáhuitl, H., Keizer, S., Lemon, O.: Strategic dialogue management via deep reinforcement learning. In: Proceedings of the NIPS Deep Reinforcement Learning Workshop (NIPS 2015) (2015)
5. Dearden, R., Friedman, N., Russell, S.: Bayesian Q-learning. In: AAAI/IAAI, pp. 761–768 (1998)
6. Dobre, M.S., Lascarides, A.: Online learning and mining human play in complex games. In: 2015 IEEE Conference on Computational Intelligence and Games (CIG), pp. 60–67. IEEE (2015)
7. Finnman, P., Winberg, M.: Deep reinforcement learning compared with Q-table learning applied to backgammon (2016)
8. Guhe, M., Lascarides, A.: Game strategies for the Settlers of Catan. In: Computational Intelligence and Games (CIG), pp. 1–8. IEEE (2014)
9. van Hasselt, H., Guez, A., Silver, D.: Deep reinforcement learning with double Q-learning. CoRR abs/1509.06461 (2015)
10. Hausknecht, M., Stone, P.: Deep recurrent Q-learning for partially observable MDPs. CoRR, abs/1507.06527 7(1) (2015)
11. Hochreiter, S., Schmidhuber, J.: Long short-term memory. Neural Comput. 9(8), 1735–1780 (1997)
12. Karamalegkos, E.: Monte Carlo tree search in the “Settlers of Catan” strategy game, Senior Undergraduate Diploma thesis, School of Electrical and Computer Engineering, Technical University of Crete (2014). https://goo.gl/rU9vG8
13. Keizer, S., et al.: Evaluating persuasion strategies and deep reinforcement learning methods for negotiation dialogue agents. In: Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics: Volume 2, Short Papers, vol. 2, pp. 480–484 (2017)
14. Kok, J.R., Vlassis, N.: Collaborative multiagent reinforcement learning by payoff propagation. J. Mach. Learn. Res. 7(Sep), 1789–1828 (2006)
15. Lai, M.: Giraffe: using deep reinforcement learning to play Chess. arXiv preprint arXiv:1509.01549 (2015)
16. Lillicrap, T.P., et al.: Continuous control with deep reinforcement learning. CoRR abs/1509.02971 (2015)
17. Mnih, V., Kavukcuoglu, K., Silver, D., et al.: Human-level control through deep reinforcement learning. Nature 518(7540), 529–533 (2015)
18. Oh, J., Guo, X., Lee, H., Lewis, R.L., Singh, S.: Action-conditional video prediction using deep networks in Atari games. In: Advances in Neural Information Processing Systems, pp. 2863–2871 (2015)
19. Osband, I., Blundell, C., Pritzel, A., Roy, B.V.: Deep exploration via bootstrapped DQN. CoRR abs/1602.04621 (2016)
20. Panousis, K.P.: Real-time planning and learning in the “Settlers of Catan” strategy game, Senior Undergraduate Diploma thesis, School of Electrical and Computer Engineering, Technical University of Crete (2014). https://goo.gl/4Hpx8w
21. Pfeiffer, M.: Reinforcement learning of strategies for Settlers of Catan. In: International Conference on Computer Games: Artificial Intelligence (2018)
22. Russell, S.J., Zimdars, A.: Q-decomposition for reinforcement learning agents. In: Proceedings of the 20th International Conference on Machine Learning (ICML-03), pp. 656–663 (2003)
23. Schmidhuber, J.: Deep learning in neural networks: an overview. Neural Netw. 61, 85–117 (2015)
24. Silver, D., et al.: Mastering the game of Go without human knowledge. Nature 550, 354–359 (2017)
25. Stone, P., Veloso, M.: Team-partitioned, opaque-transition reinforcement learning. In: Proceedings of the Third Annual Conference on Autonomous Agents, pp. 206–212. ACM (1999)
26. Sutton, R.S., Barto, A.G.: Reinforcement Learning: An Introduction. MIT Press, Cambridge (1998)
27. Szita, I., Chaslot, G., Spronck, P.: Monte-Carlo tree search in Settlers of Catan. In: van den Herik, H.J., Spronck, P. (eds.) ACG 2009. LNCS, vol. 6048, pp. 21–32. Springer, Heidelberg (2010). https://doi.org/10.1007/978-3-642-12993-3_3
28. Thomas, R.S.: Real-time decision making for adversarial environments using a plan-based heuristic. Ph.D. thesis, Northwestern University (2003)

Author information

Authors and Affiliations

School of Electrical and Computer Engineering, Technical University of Crete, Chania, Greece
Konstantia Xenou & Georgios Chalkiadakis
Institut de recherche en informatique de Toulouse (IRIT), Université Paul Sabatier, Toulouse, France
Stergos Afantenos

Corresponding author

Correspondence to Georgios Chalkiadakis.

Editor information

Editors and Affiliations

University of Bergen, Bergen, Norway
Marija Slavkovik

Cite this paper

Xenou, K., Chalkiadakis, G., Afantenos, S. (2019). Deep Reinforcement Learning in Strategic Board Game Environments. In: Slavkovik, M. (eds) Multi-Agent Systems. EUMAS 2018. Lecture Notes in Computer Science, vol 11450. Springer, Cham. https://doi.org/10.1007/978-3-030-14174-5_16

DOI: https://doi.org/10.1007/978-3-030-14174-5_16
Published: 15 February 2019
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-14173-8
Online ISBN: 978-3-030-14174-5
eBook Packages: Computer Science, Computer Science (R0)