
Automatic Generation of a Sub-optimal Agent Population with Learning

  • Conference paper

Part of the book series: Advances in Intelligent Systems and Computing (AISC, volume 931)

Abstract

Most modern solutions for video game balancing are directed towards specific games. We are currently researching general methods for automatic multiplayer game balancing. The problem is modeled as a meta-game, in which game-play changes the rules of another game. This way, a Machine Learning agent that learns to play the meta-game learns how to change the base game according to some balancing metric. An issue, however, lies in generating a high volume of game-play training data in which agents of different skill levels compete against each other. To this end, we propose the automatic generation of a population of surrogate agents by sampling during learning. In Reinforcement Learning, an agent learns in a trial-and-error fashion, gradually improving its policy, the mapping from world state to the action to perform. This means that at each successive learning step the agent follows a sub-optimal strategy, or eventually the optimal one. We store the agent's policy at the end of each training episode. The process is evaluated in simple environments with distinct properties. The quality of the generated population is evaluated by the diversity of the difficulty the agents have in solving their tasks.
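A minimal sketch of the population-generation idea described above: tabular Q-learning on the FrozenLake-v0 environment (see the notes below), snapshotting the greedy policy at the end of every training episode so that early snapshots serve as weak surrogate players and later ones approach optimality. The choice of algorithm and all hyperparameters here are illustrative assumptions, not details taken from the paper.

    import gym
    import numpy as np

    env = gym.make("FrozenLake-v0")
    n_states = env.observation_space.n
    n_actions = env.action_space.n
    Q = np.zeros((n_states, n_actions))
    alpha, gamma, epsilon = 0.1, 0.99, 0.1  # assumed hyperparameters

    population = []  # one policy snapshot per training episode
    for episode in range(2000):
        state = env.reset()
        done = False
        while not done:
            # epsilon-greedy exploration
            if np.random.rand() < epsilon:
                action = env.action_space.sample()
            else:
                action = int(np.argmax(Q[state]))
            next_state, reward, done, _ = env.step(action)
            # one-step Q-learning update
            Q[state, action] += alpha * (
                reward + gamma * np.max(Q[next_state]) - Q[state, action]
            )
            state = next_state
        # snapshot the greedy policy induced by the current Q-table;
        # early snapshots are sub-optimal, later ones near-optimal
        population.append(np.argmax(Q, axis=1).copy())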


Notes

  1. www.dota2.com/.

  2. https://gym.openai.com/envs/FrozenLake-v0/.

  3. https://gym.openai.com/envs/CartPole-v0/.

  4. https://gym.openai.com/envs/MountainCar-v0/.
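As a hedged illustration of the evaluation criterion from the abstract, the diversity of difficulty across the stored population could be approximated by running each snapshot greedily and measuring the spread of its success rates. This continues the variables from the sketch above; using the standard deviation of per-policy success rates as the diversity proxy is our assumption, not necessarily the paper's metric.

    def success_rate(env, policy, episodes=100):
        # average return of a deterministic policy; on FrozenLake this
        # equals the fraction of episodes that reach the goal
        total = 0.0
        for _ in range(episodes):
            state, done = env.reset(), False
            while not done:
                state, reward, done, _ = env.step(int(policy[state]))
                total += reward
        return total / episodes

    rates = [success_rate(env, p) for p in population]
    diversity = float(np.std(rates))  # larger spread = more varied skill levels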


Acknowledgments

This work is supported by: Portuguese Foundation for Science and Technology (FCT) under grant SFRH/BD/129445/2017; LIACC (PEst-UID/CEC/00027/2013); IEETA (UID/CEC/00127/2013).

Author information

Correspondence to Simão Reis.


Copyright information

© 2019 Springer Nature Switzerland AG

About this paper


Cite this paper

Reis, S., Reis, L.P., Lau, N. (2019). Automatic Generation of a Sub-optimal Agent Population with Learning. In: Rocha, Á., Adeli, H., Reis, L., Costanzo, S. (eds) New Knowledge in Information Systems and Technologies. WorldCIST'19 2019. Advances in Intelligent Systems and Computing, vol 931. Springer, Cham. https://doi.org/10.1007/978-3-030-16184-2_7
