Abstract
Most modern solutions for video game balancing are directed towards specific games. We are researching general methods for automatic multiplayer game balancing. The problem is modeled as a meta-game, in which game-play changes the rules of another game. In this way, a Machine Learning agent that learns to play the meta-game learns how to change a base game according to some balancing metric. An open issue, however, is the generation of a high volume of game-play training data, in which agents of different skill levels compete against each other. To this end, we propose the automatic generation of a population of surrogate agents by sampling during learning. In Reinforcement Learning, an agent learns in a trial-and-error fashion, gradually improving its policy, i.e., the mapping from world state to the action to perform. This means that at each successful training step the agent follows a sub-optimal strategy, or eventually the optimal one. We therefore store the agent's policy at the end of each training episode. The process is evaluated in simple environments with distinct properties. The quality of the generated population is assessed by the diversity of difficulty the agents exhibit in solving their tasks.
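The population-generation idea described in the abstract can be sketched as follows. This is a minimal, hypothetical illustration, not the paper's actual setup: it uses tabular Q-learning on a toy one-dimensional corridor (the environment, reward values, and hyper-parameters are all assumptions made here), and snapshots the Q-table after every episode so that early snapshots act as low-skill surrogate agents while later ones approach the optimal policy.

```python
import copy
import random

# Hypothetical toy setup (not the paper's environment): a one-dimensional
# corridor of cells 0..GOAL; the agent starts at 0 and must reach GOAL.
GOAL = 5
ACTIONS = (-1, +1)  # move left / move right

def run_episode(q, epsilon, alpha=0.5, gamma=0.9, max_steps=50):
    """Run one tabular Q-learning episode, updating q in place."""
    state = 0
    for _ in range(max_steps):
        # Epsilon-greedy action selection over the current Q-table.
        if random.random() < epsilon:
            action = random.choice(ACTIONS)
        else:
            action = max(ACTIONS, key=lambda a: q.get((state, a), 0.0))
        nxt = min(max(state + action, 0), GOAL)
        reward = 1.0 if nxt == GOAL else -0.01  # small per-step penalty
        best_next = max(q.get((nxt, a), 0.0) for a in ACTIONS)
        old = q.get((state, action), 0.0)
        q[(state, action)] = old + alpha * (reward + gamma * best_next - old)
        state = nxt
        if state == GOAL:
            break

def generate_population(episodes=30, seed=0):
    """Train one agent and snapshot its policy (Q-table) after every
    episode; the snapshots form a population of sub-optimal agents."""
    random.seed(seed)
    q = {}
    population = []
    for ep in range(episodes):
        epsilon = max(0.05, 1.0 - ep / episodes)  # decaying exploration
        run_episode(q, epsilon)
        population.append(copy.deepcopy(q))  # frozen surrogate agent
    return population
```

Because each snapshot is a deep copy, later training does not alter earlier agents, so the resulting population spans a range of skill levels from near-random to near-optimal play.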
Acknowledgments
This work is supported by: Portuguese Foundation for Science and Technology (FCT) under grant SFRH/BD/129445/2017; LIACC (PEst-UID/CEC/00027/2013); IEETA (UID/CEC/00127/2013).
Cite this paper
Reis, S., Reis, L.P., Lau, N. (2019). Automatic Generation of a Sub-optimal Agent Population with Learning. In: Rocha, Á., Adeli, H., Reis, L., Costanzo, S. (eds) New Knowledge in Information Systems and Technologies. WorldCIST'19 2019. Advances in Intelligent Systems and Computing, vol 931. Springer, Cham. https://doi.org/10.1007/978-3-030-16184-2_7
Print ISBN: 978-3-030-16183-5
Online ISBN: 978-3-030-16184-2