Self-adaptive MCTS for General Video Game Playing

  • Chiara F. Sironi
  • Jialin Liu
  • Diego Perez-Liebana
  • Raluca D. Gaina
  • Ivan Bravi
  • Simon M. Lucas
  • Mark H. M. Winands
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 10784)

Abstract

Monte-Carlo Tree Search (MCTS) has shown particular success in General Game Playing (GGP) and General Video Game Playing (GVGP), and many enhancements and variants have been developed. Recently, an on-line adaptive parameter tuning mechanism for MCTS agents has been proposed that achieves almost the same performance as off-line tuning in GGP.

In this paper we apply the same approach to GVGP and use the popular General Video Game AI (GVGAI) framework, in which the time allowed to make a decision is only 40 ms. We design three Self-Adaptive MCTS (SA-MCTS) agents that optimize on-line the parameters of a standard non-Self-Adaptive MCTS agent of GVGAI. The three agents select the parameter values using Naïve Monte-Carlo, an Evolutionary Algorithm and an N-Tuple Bandit Evolutionary Algorithm respectively, and are tested on 20 single-player games of GVGAI.

The SA-MCTS agents achieve more robust results on the tested games. With the same time setting, they perform similarly to the baseline standard MCTS agent in the games for which the baseline agent performs well, and significantly improve the win rate in the games for which the baseline agent performs poorly. As validation, we also test the performance of non-Self-Adaptive MCTS instances that use the parameter settings sampled most often during the on-line tuning of each of the three SA-MCTS agents for each game. Results show that these parameter settings improve the win rate on the games Wait for Breakfast and Escape by factors of 4 and 150, respectively.
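To make the idea of on-line parameter selection more concrete, the following is a minimal sketch of a tuner in the spirit of Naïve Monte-Carlo, not the authors' implementation. It assumes that each MCTS simulation yields a reward in [0, 1], that exploration picks each parameter value independently from per-parameter statistics, and that exploitation reuses the best combination seen so far. The class name, the parameter names `K` and `rollout_depth`, their candidate values, and the epsilon settings are all hypothetical.

```python
import random


class NaiveMonteCarloTuner:
    """Hypothetical on-line tuner: per-parameter (local) and per-combination (global) statistics."""

    def __init__(self, param_space, eps_explore=0.75, eps_local=0.1, seed=None):
        self.param_space = param_space            # name -> list of candidate values
        self.names = list(param_space)
        self.eps_explore = eps_explore            # probability of exploring a new combination
        self.eps_local = eps_local                # probability of a random value per parameter
        self.rng = random.Random(seed)
        # Each statistic is [total_reward, sample_count].
        self.local = {n: {v: [0.0, 0] for v in vals} for n, vals in param_space.items()}
        self.combos = {}

    @staticmethod
    def _mean(stat):
        total, count = stat
        return total / count if count else 0.0

    def _pick_local(self, name):
        # Epsilon-greedy choice of one parameter value, ignoring the other parameters.
        values = self.param_space[name]
        if self.rng.random() < self.eps_local:
            return self.rng.choice(values)
        return max(values, key=lambda v: self._mean(self.local[name][v]))

    def select(self):
        # Explore: build a combination from the local statistics.
        # Exploit: reuse the combination with the best mean reward so far.
        explore = not self.combos or self.rng.random() < self.eps_explore
        if explore:
            combo = tuple(self._pick_local(n) for n in self.names)
        else:
            combo = max(self.combos, key=lambda c: self._mean(self.combos[c]))
        return dict(zip(self.names, combo))

    def update(self, params, reward):
        # Credit the reward to the whole combination and to each individual value.
        combo = tuple(params[n] for n in self.names)
        stats = [self.combos.setdefault(combo, [0.0, 0])]
        stats += [self.local[n][params[n]] for n in self.names]
        for stat in stats:
            stat[0] += reward
            stat[1] += 1

    def most_sampled(self):
        # Combination evaluated most often, e.g. for later use in a fixed-parameter agent.
        combo = max(self.combos, key=lambda c: self.combos[c][1])
        return dict(zip(self.names, combo))


if __name__ == "__main__":
    # Assumed candidate values, for illustration only.
    tuner = NaiveMonteCarloTuner({"K": [0.6, 1.4, 2.0], "rollout_depth": [5, 10, 15]}, seed=0)
    for _ in range(200):
        params = tuner.select()
        # Stand-in for the outcome of one MCTS simulation run with `params`.
        reward = 1.0 if params["rollout_depth"] == 10 else 0.2
        tuner.update(params, reward)
    print(tuner.most_sampled())
```

In a self-adaptive agent of this kind, `select` would be called before each simulation and `update` after it, so tuning shares the same 40 ms decision budget as the search itself. `most_sampled` mirrors the validation step described above, where the most frequently sampled setting is reused in a non-self-adaptive agent.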

Keywords

MCTS, On-line tuning, Self-adaptive, Robust game playing, General video game playing

Notes

Acknowledgments

This work is partially funded by the Netherlands Organisation for Scientific Research (NWO) in the framework of the project GoGeneral, grant number 612.001.121, and the EPSRC IGGI Centre for Doctoral Training, grant number EP/L015846/1.

Copyright information

© Springer International Publishing AG, part of Springer Nature 2018

Authors and Affiliations

  1. Games and AI Group, Department of Data Science and Knowledge Engineering, Maastricht University, Maastricht, The Netherlands
  2. Game AI Group, School of Electronic Engineering and Computer Science, Queen Mary University of London, London, UK
