
Automatic Generation of a Sub-optimal Agent Population with Learning

  • Conference paper

Part of the book series: Advances in Intelligent Systems and Computing (AISC, volume 931)

Abstract

Most modern solutions for video game balancing are directed towards specific games. We are currently researching general methods for automatic multiplayer game balancing. The problem is modeled as a meta-game, in which game-play changes the rules of another game. This way, a Machine Learning agent that learns to play the meta-game learns how to change the base game according to some balancing metric. An issue, however, lies in generating a high volume of game-play training data in which agents of different skill levels compete against each other. To this end, we propose the automatic generation of a population of surrogate agents by sampling during learning. In Reinforcement Learning, an agent learns in a trial-and-error fashion, gradually improving its policy, the mapping from world state to the action to perform. This means that at each successive learning step the agent follows a sub-optimal strategy, or eventually the optimal one. We store the agent's policy at the end of each training episode. The process is evaluated in simple environments with distinct properties. The quality of the generated population is evaluated by the diversity of the difficulty the agents have in solving their tasks.
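A minimal sketch of the population-generation idea described above: tabular Q-learning on the FrozenLake-v0 environment (see the notes below), snapshotting the greedy policy at the end of every training episode so that early snapshots serve as weak surrogate players and later ones approach optimality. The choice of algorithm and all hyperparameters here are illustrative assumptions, not details taken from the paper.

    import gym
    import numpy as np

    env = gym.make("FrozenLake-v0")
    n_states = env.observation_space.n
    n_actions = env.action_space.n
    Q = np.zeros((n_states, n_actions))
    alpha, gamma, epsilon = 0.1, 0.99, 0.1  # assumed hyperparameters

    population = []  # one policy snapshot per training episode
    for episode in range(2000):
        state = env.reset()
        done = False
        while not done:
            # epsilon-greedy exploration
            if np.random.rand() < epsilon:
                action = env.action_space.sample()
            else:
                action = int(np.argmax(Q[state]))
            next_state, reward, done, _ = env.step(action)
            # one-step Q-learning update
            Q[state, action] += alpha * (
                reward + gamma * np.max(Q[next_state]) - Q[state, action]
            )
            state = next_state
        # snapshot the greedy policy induced by the current Q-table;
        # early snapshots are sub-optimal, later ones near-optimal
        population.append(np.argmax(Q, axis=1).copy())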


Notes

  1. www.dota2.com/.

  2. https://gym.openai.com/envs/FrozenLake-v0/.

  3. https://gym.openai.com/envs/CartPole-v0/.

  4. https://gym.openai.com/envs/MountainCar-v0/.
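As a hedged illustration of the evaluation criterion from the abstract, the diversity of difficulty across the stored population could be approximated by running each snapshot greedily and measuring the spread of its success rates. This continues the variables from the sketch above; using the standard deviation of per-policy success rates as the diversity proxy is our assumption, not necessarily the paper's metric.

    def success_rate(env, policy, episodes=100):
        # average return of a deterministic policy; on FrozenLake this
        # equals the fraction of episodes that reach the goal
        total = 0.0
        for _ in range(episodes):
            state, done = env.reset(), False
            while not done:
                state, reward, done, _ = env.step(int(policy[state]))
                total += reward
        return total / episodes

    rates = [success_rate(env, p) for p in population]
    diversity = float(np.std(rates))  # larger spread = more varied skill levels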


Acknowledgments

This work is supported by: Portuguese Foundation for Science and Technology (FCT) under grant SFRH/BD/129445/2017; LIACC (PEst-UID/CEC/00027/2013); IEETA (UID/CEC/00127/2013).

Author information

Correspondence to Simão Reis.


Copyright information

© 2019 Springer Nature Switzerland AG

About this paper


Cite this paper

Reis, S., Reis, L.P., Lau, N. (2019). Automatic Generation of a Sub-optimal Agent Population with Learning. In: Rocha, Á., Adeli, H., Reis, L., Costanzo, S. (eds) New Knowledge in Information Systems and Technologies. WorldCIST'19 2019. Advances in Intelligent Systems and Computing, vol 931. Springer, Cham. https://doi.org/10.1007/978-3-030-16184-2_7
