The Computer Games Journal, Volume 8, Issue 3–4, pp 143–156

Intelligent Adjustment of Game Properties at Run Time Using Multi-armed Bandits

  • Zahra Amiri
  • Yoones A. Sekhavat


Dynamically modifying game properties according to players’ preferences can be an essential factor in successful game design. This paper proposes a technique based on the multi-armed bandit (MAB) approach for intelligent, dynamic theme selection in a video game. The epsilon-greedy algorithm is used to implement the MAB approach and to apply players’ preferences in the game. To evaluate the efficacy of the proposed technique, a 3D roll-ball game with four different themes was developed. In this game, the color of the gaming environment and the speed of the player are the two game properties that determine a theme. The results of a user study performed on this system show that the technique has the potential to be used as a toolkit for determining players’ preferences in real time.
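The theme-selection idea summarized above can be illustrated with a minimal epsilon-greedy sketch. It is not the authors' implementation: the class name, the theme labels (color and speed pairs), the value of epsilon, and the reward signal (assumed here to be a number in [0, 1] standing in for some measure of player preference) are all hypothetical choices made for illustration.

```python
import random


class EpsilonGreedyThemeSelector:
    """Hypothetical epsilon-greedy bandit; each arm is one game theme."""

    def __init__(self, themes, epsilon=0.1, rng=None):
        if not themes:
            raise ValueError("at least one theme is required")
        self.themes = list(themes)
        self.epsilon = epsilon            # probability of exploring (assumed value)
        self.rng = rng or random.Random()
        self.counts = {t: 0 for t in self.themes}    # times each theme was shown
        self.values = {t: 0.0 for t in self.themes}  # running mean reward per theme

    def select(self):
        """Explore a random theme with probability epsilon, else exploit the best."""
        if self.rng.random() < self.epsilon:
            return self.rng.choice(self.themes)
        return max(self.themes, key=lambda t: self.values[t])

    def update(self, theme, reward):
        """Fold one observed reward into the running mean for that theme."""
        self.counts[theme] += 1
        n = self.counts[theme]
        self.values[theme] += (reward - self.values[theme]) / n


def simulate(rounds=500, seed=0):
    """Toy run against made-up preference probabilities (not study data)."""
    rng = random.Random(seed)
    # Four hypothetical themes, each a (color, speed) pair.
    preference = {
        ("blue", "slow"): 0.3,
        ("blue", "fast"): 0.5,
        ("red", "slow"): 0.4,
        ("red", "fast"): 0.7,
    }
    selector = EpsilonGreedyThemeSelector(preference, epsilon=0.1, rng=rng)
    for _ in range(rounds):
        theme = selector.select()
        # Simulated feedback: 1.0 if the player "liked" the theme, else 0.0.
        reward = 1.0 if rng.random() < preference[theme] else 0.0
        selector.update(theme, reward)
    return selector


if __name__ == "__main__":
    result = simulate()
    for theme, count in result.counts.items():
        print(theme, count, round(result.values[theme], 2))
```

In a real game the reward would come from player behavior or feedback rather than a simulated coin flip. A small epsilon keeps the game mostly on the theme that currently looks best while still occasionally sampling the others.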


Video games · Difficulty adjustment · Multi-armed bandit · Balancing



This work was carried out in the Cognitive Augmented Reality Lab at the Faculty of Multimedia, Tabriz Islamic Art University.



Copyright information

© Springer Science+Business Media, LLC, part of Springer Nature 2019

Authors and Affiliations

  1. Faculty of Multimedia, Tabriz Islamic Art University, Tabriz, Iran
