On the Cross-Domain Reusability of Neural Modules for General Video Game Playing

Conference paper
Part of the Communications in Computer and Information Science book series (CCIS, volume 614)

Abstract

We consider a general approach to knowledge transfer in which an agent learning with a neural network adapts how it reuses existing networks as it learns in a new domain. Networks trained for a new domain improve performance by selectively routing activation through previously learned neural structure, regardless of how or for what that structure was originally learned. We present a neuroevolution implementation of the approach and apply it to reinforcement learning domains. This approach is more general than previous transfer methods for reinforcement learning: it is domain-agnostic and requires no prior assumptions about the nature of task relatedness or mappings. We analyze the method's performance and applicability in high-dimensional Atari 2600 general video game playing.
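
To make the core idea concrete, the sketch below illustrates one way a target network might selectively route activation through a frozen, previously trained module. This is a minimal illustration under stated assumptions, not the authors' implementation: all shapes, the route_gate parameter, and the two-module layout are hypothetical, and in the paper's setting the target network's weights and routing would be evolved by neuroevolution rather than randomly initialized.

    import numpy as np

    rng = np.random.default_rng(0)

    def mlp_forward(x, weights):
        """Forward pass through a stack of tanh layers."""
        h = x
        for W in weights:
            h = np.tanh(h @ W)
        return h

    # A module trained earlier on some source task; its weights stay frozen.
    source_weights = [rng.standard_normal((8, 16)), rng.standard_normal((16, 4))]

    # Target-network parameters; here random stand-ins for an evolved genome.
    W_in = rng.standard_normal((8, 4))
    W_out = rng.standard_normal((8, 2))     # consumes [own 4 | reused 4] features
    route_gate = rng.uniform(0, 1, size=4)  # evolved per-feature routing strengths

    def target_forward(x):
        own = np.tanh(x @ W_in)                  # features learned in the new domain
        reused = mlp_forward(x, source_weights)  # activation routed through old module
        h = np.concatenate([own, route_gate * reused])
        return h @ W_out                         # action values for the new task

    obs = rng.standard_normal(8)  # e.g. a compressed game observation
    print(target_forward(obs))    # two action outputs

Freezing source_weights mirrors the point made in the abstract: reuse does not depend on how or for what the source module was trained; only the routing into and out of it adapts to the new domain.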

Keywords

Reinforcement Learning, Transfer Learning, Target Task, Target Network, Burst Phase

Acknowledgments

This research was supported in part by NSF grant DBI-0939454, NIH grant R01-GM105042, and an NPSC fellowship sponsored by NSA.

Copyright information

© Springer International Publishing Switzerland 2016

Authors and Affiliations

Department of Computer Science, The University of Texas at Austin, Austin, USA
