Evaluating Human-like Behaviors of Video-Game Agents Autonomously Acquired with Biological Constraints

  • Nobuto Fujii
  • Yuichi Sato
  • Hironori Wakama
  • Koji Kazai
  • Haruhiro Katayose
Part of the Lecture Notes in Computer Science book series (LNCS, volume 8253)


Designing the behavioral patterns of video game agents (non-player characters, NPCs) is a crucial aspect of video game development. Various systems have been proposed for acquiring behavioral patterns automatically, and some have obtained patterns stronger than those of human players, but the resulting patterns look mechanical. When human players play video games with NPCs as opponents or supporters, the NPCs’ behavioral patterns must be not only strong but also human-like. We propose the autonomous acquisition of NPC behaviors that emulate those of human players. Instead of implementing straightforward heuristics, the behaviors are acquired through reinforcement learning (Q-learning) and A* pathfinding, with biological constraints imposed. In computational simulations using “Infinite Mario Bros.”, human-like behaviors that imply human cognitive processes were obtained by imposing four biological constraints: sensory error, perceptual and motion delay, physical fatigue, and a balance between repetition and novelty. We evaluated the human-likeness of the acquired behavioral patterns through subjective assessments, and we discuss the possibility of implementing the proposed system.
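To make the idea of learning under biological constraints concrete, the following is a minimal sketch and not the authors' implementation. It runs tabular Q-learning on a hypothetical one-dimensional corridor rather than on Infinite Mario Bros., and it models only two of the four constraints: sensory error (the perceived position is occasionally off by one cell) and perceptual delay (the agent sees a position that is a few steps old). The function names `perceive` and `train`, the corridor environment, the reward values, and all parameter defaults are assumptions made for illustration; A* pathfinding, motion delay, fatigue, and the repetition/novelty balance are omitted.

```python
import random


def perceive(history, delay, noise_prob, n_states, rng):
    """Return the state the agent believes it is in (hypothetical helper).

    The percept is `delay` steps old (perceptual delay) and, with
    probability `noise_prob`, shifted by one cell (sensory error).
    """
    seen = history[max(0, len(history) - 1 - delay)]
    if rng.random() < noise_prob:
        seen = min(n_states - 1, max(0, seen + rng.choice((-1, 1))))
    return seen


def train(n_states=8, episodes=300, delay=1, noise_prob=0.1,
          alpha=0.2, gamma=0.95, epsilon=0.1, seed=0):
    """Tabular Q-learning on a corridor; the goal is the rightmost cell."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(n_states)]  # actions: 0 = left, 1 = right
    for _ in range(episodes):
        state, history = 0, [0]
        obs = perceive(history, delay, noise_prob, n_states, rng)
        for _ in range(200):  # step cap per episode
            if rng.random() < epsilon:
                action = rng.randrange(2)  # explore
            else:
                best = max(q[obs])  # exploit, breaking ties at random
                action = rng.choice([a for a in (0, 1) if q[obs][a] == best])
            state = min(n_states - 1, max(0, state + (1 if action else -1)))
            history.append(state)
            done = state == n_states - 1
            reward = 1.0 if done else -0.01  # assumed reward scheme
            next_obs = perceive(history, delay, noise_prob, n_states, rng)
            # Q-learning update, applied to the perceived (not true) states.
            target = reward + (0.0 if done else gamma * max(q[next_obs]))
            q[obs][action] += alpha * (target - q[obs][action])
            obs = next_obs
            if done:
                break
    return q
```

Calling `train(delay=0, noise_prob=0.0)` removes both constraints and yields an unconstrained learner, which can serve as a baseline when comparing the value tables or resulting trajectories.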


Autonomous strategy acquisition; Machine learning; Biological constraints; Video game agent; Infinite Mario Bros.





Copyright information

© Springer International Publishing Switzerland 2013

Authors and Affiliations

  • Nobuto Fujii
    • 1
  • Yuichi Sato
    • 1
  • Hironori Wakama
    • 1
  • Koji Kazai
    • 1
  • Haruhiro Katayose
    • 1
  1. Graduate School of Science and Technology, Japan Research Fellow of Japan Society for the Promotion of Science, Kwansei Gakuin University, Sanda, Japan
