Competitive Reinforcement Learning in Atari Games

  • Mark McKenzie
  • Peter Loxley
  • William Billingsley
  • Sebastien Wong
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 10400)

Abstract

This research describes a study into the ability of a state-of-the-art reinforcement learning algorithm to learn to perform multiple tasks. We demonstrate that the limitations of learning two tasks can be mitigated with a competitive training method, and we show that this approach improves the generalisation of the system when performing unforeseen tasks. The learning agent assessed is an altered version of the DeepMind deep Q-network (DQN), which has been demonstrated to outperform human players on a number of Atari 2600 games. The key finding of this paper is that there are significant degradations in performance when learning more than one game, and that the extent of the degradation depends on both the similarity and the comparative complexity of the two games.
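The multi-game setup described above can be pictured with a short sketch. The competitive training method itself is not detailed in the abstract, so the following Python sketch only illustrates the baseline it builds on: a single DQN-style agent whose training episodes alternate between two Atari 2600 games, sharing one Q-network and one replay buffer. The game IDs, network size, schedule, and hyperparameters are illustrative assumptions, and the target network and frame stacking used in the full DQN are omitted for brevity.

# Minimal multi-game DQN sketch (illustrative, not the authors' code).
import random
from collections import deque

import ale_py                      # provides the ALE/* Atari environments
import gymnasium as gym
import numpy as np
import torch
import torch.nn as nn

gym.register_envs(ale_py)          # needed on recent gymnasium releases
GAMES = ["ALE/Pong-v5", "ALE/Breakout-v5"]   # assumed pair of games

def preprocess(frame):
    # Grayscale and downsample a 210x160x3 RGB frame to a flat float vector.
    gray = frame.astype(np.float32).mean(axis=2) / 255.0
    return gray[::2, ::2].reshape(-1)        # 105 * 80 = 8400 features

envs = [gym.make(g) for g in GAMES]
n_actions = max(e.action_space.n for e in envs)   # output head covers the larger action set
qnet = nn.Sequential(nn.Linear(8400, 256), nn.ReLU(), nn.Linear(256, n_actions))
opt = torch.optim.Adam(qnet.parameters(), lr=1e-4)
replay = deque(maxlen=50_000)
gamma, eps, batch_size = 0.99, 0.1, 32

def act(state, n_valid):
    # Epsilon-greedy over the actions that are valid for the current game.
    if random.random() < eps:
        return random.randrange(n_valid)
    with torch.no_grad():
        q = qnet(torch.from_numpy(state))
    return int(q[:n_valid].argmax())

for episode in range(1000):
    env = envs[episode % len(envs)]          # alternate games every episode
    n_valid = env.action_space.n
    obs, _ = env.reset()
    state, done = preprocess(obs), False
    while not done:
        a = act(state, n_valid)
        obs, r, terminated, truncated, _ = env.step(a)
        next_state, done = preprocess(obs), terminated or truncated
        replay.append((state, a, float(np.sign(r)), next_state, float(done)))
        state = next_state
        if len(replay) >= 1_000:             # one gradient step per environment step
            s, a_b, r_b, s2, d = map(np.array, zip(*random.sample(replay, batch_size)))
            s, s2 = torch.from_numpy(s), torch.from_numpy(s2)
            q = qnet(s)[torch.arange(batch_size), torch.from_numpy(a_b)]
            with torch.no_grad():
                best_next = qnet(s2).max(1).values
                target = torch.from_numpy(r_b).float() + gamma * best_next * torch.from_numpy(1.0 - d).float()
            loss = nn.functional.smooth_l1_loss(q, target)
            opt.zero_grad(); loss.backward(); opt.step()

In this kind of interleaved schedule, the degradation the paper reports would show up as the shared network's scores on one game dropping while it trains on the other; how one would mitigate that competitively is the subject of the paper itself.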

Keywords

Reinforcement learning · DQN · Atari

Copyright information

© Springer International Publishing AG 2017

Authors and Affiliations

  • Mark McKenzie (1)
  • Peter Loxley (2)
  • William Billingsley (2)
  • Sebastien Wong (1)
  1. Defence Science and Technology Group, Adelaide, Australia
  2. University of New England, Armidale, Australia