Abstract
The tangled program graph (TPG) framework was recently proposed as an emergent process for decomposing tasks and simultaneously composing solutions by organizing code into graphs of teams of programs. The initial evaluation assessed the ability of TPG to discover agents capable of playing Atari game titles under the Arcade Learning Environment. This is an example of ‘visual’ reinforcement learning, i.e. agents are evolved directly from the frame buffer without recourse to hand-designed features. TPG evolved solutions competitive with state-of-the-art deep reinforcement learning solutions, but at a fraction of the complexity. One simplifying assumption was that the visual input could be down-sampled from a \(210 \times 160\) resolution to \(42 \times 32\). In this work, we consider the challenging 3D first-person shooter environment of ViZDoom and require that agents be evolved at the original visual resolution of \(320 \times 240\) pixels. In addition, we address issues in developing agents capable of operating in multiple ViZDoom task environments simultaneously. The resulting TPG solutions retain all the emergent properties and computational efficiency of the original work. Moreover, the solutions appear to generalize across multiple task scenarios, whereas equivalent deep reinforcement learning solutions have focused on single task scenarios alone.
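The down-sampling simplification mentioned above (\(210 \times 160\) reduced to \(42 \times 32\), a factor of 5 per dimension) can be sketched with simple stride-based subsampling; this is an illustrative assumption, not the paper's actual preprocessing pipeline:

```python
def downsample(frame, factor=5):
    # Keep every `factor`-th row and column. Purely illustrative:
    # the paper's actual down-sampling scheme may differ.
    return [row[::factor] for row in frame[::factor]]

# A 210x160 Atari-resolution frame reduces to 42x32.
frame = [[0] * 160 for _ in range(210)]
small = downsample(frame)
print(len(small), len(small[0]))  # 42 32
```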
Notes
2. For an illustration of the incremental construction of TPG individuals, see the presentation slides of Stephen Kelly from EuroGP’17: http://stephenkelly.ca/research_files/skelly-mheywood-eurogp-2017.pdf.
3. Conversely, the deep reinforcement learning approach of [17] cropped the original visual space to \(84 \times 84\) pixels.
6. Note that \(H_{size}\) is the number of teams present and reflects the number of agents (root teams) at initialization. However, as teams are subsumed into graphs, the number of agents (root teams) will decrease. Likewise, application of the variation operator could switch an action from a team pointer to an atomic action, breaking a graph into two smaller graphs and resulting in an increase in the number of agents.
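The root-team bookkeeping described in note 6 can be sketched as follows. The data structure is a hypothetical simplification, not the actual TPG implementation: a team's actions are either atomic actions (strings) or pointers to other teams (integer ids), and a root team is any team no other team points to:

```python
def count_root_teams(teams):
    # teams: hypothetical dict of team id -> action list, where an action
    # is either an atomic action (str) or a pointer to another team (int).
    # Root teams (agents) are teams that no other team points to.
    pointed_to = {a for actions in teams.values()
                  for a in actions if isinstance(a, int)}
    return sum(1 for team_id in teams if team_id not in pointed_to)

# Team 1 points to team 2, so team 2 is subsumed: one root agent remains.
print(count_root_teams({1: ["left", 2], 2: ["right", "shoot"]}))  # 1
# Switching team 1's pointer to an atomic action breaks the graph in two.
print(count_root_teams({1: ["left", "shoot"], 2: ["right", "shoot"]}))  # 2
```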
7. Any negative normalized fitness values are treated as 0, thus producing a number in the range \([0, 1]\).
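The clamping described in note 7 can be sketched as below. The normalization scheme and names are illustrative assumptions (normalizing against a per-task reference score), not the paper's exact formula; only the clamp-negatives-to-zero behaviour is taken from the note:

```python
def normalized_fitness(score, reference_max):
    # Hypothetical scheme: normalize a task score against a reference
    # maximum, then treat negatives as 0 so the result lies in [0, 1].
    if reference_max == 0:
        return 0.0
    return max(0.0, min(1.0, score / reference_max))

print(normalized_fitness(-12, 100))  # 0.0
print(normalized_fitness(50, 100))   # 0.5
```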
References
Alvernaz, S., Togelius, J.: Autoencoder-augmented neuroevolution for visual Doom playing. In: IEEE Conference on Computational Intelligence and Games, pp. 1–8 (2017)
Bellemare, M.G., Naddaf, Y., Veness, J., Bowling, M.: The Arcade Learning Environment: an evaluation platform for general agents. J. Artif. Intell. Res. 47, 253–279 (2012)
Braylan, A., Hollenbeck, M., Meyerson, E., Miikkulainen, R.: Reuse of neural modules for general video game playing. In: Proceedings of the AAAI Conference on Artificial Intelligence, pp. 353–359 (2016)
Guo, X., Singh, S., Lewis, R., Lee, H.: Deep learning for reward design to improve Monte Carlo Tree Search in ATARI games. In: Proceedings of the AAAI Conference on Artificial Intelligence, pp. 1519–1525 (2016)
Hausknecht, M., Lehman, J., Miikkulainen, R., Stone, P.: A neuroevolution approach to general Atari game playing. IEEE Trans. Comput. Intell. AI Games 6(4), 355–366 (2014)
Jia, B., Ebner, M.: Evolving game state features from raw pixels. In: McDermott, J., Castelli, M., Sekanina, L., Haasdijk, E., García-Sánchez, P. (eds.) EuroGP 2017. LNCS, vol. 10196, pp. 52–63. Springer, Cham (2017). https://doi.org/10.1007/978-3-319-55696-3_4
Kelly, S., Heywood, M.I.: Knowledge transfer from Keepaway soccer to half-field offense through program symbiosis: building simple programs for a complex task. In: ACM Genetic and Evolutionary Computation Conference, pp. 1143–1150 (2015)
Kelly, S., Heywood, M.I.: Emergent tangled graph representations for Atari game playing agents. In: McDermott, J., Castelli, M., Sekanina, L., Haasdijk, E., García-Sánchez, P. (eds.) EuroGP 2017. LNCS, vol. 10196, pp. 64–79. Springer, Cham (2017). https://doi.org/10.1007/978-3-319-55696-3_5
Kelly, S., Heywood, M.I.: Multi-task learning in Atari video games with emergent tangled program graphs. In: ACM Genetic and Evolutionary Computation Conference, pp. 195–202 (2017)
Kempka, M., Wydmuch, M., Runc, G., Toczek, J., Jaśkowski, W.: ViZDoom: a Doom-based AI research platform for visual reinforcement learning. In: IEEE Conference on Computational Intelligence and Games, pp. 1–8 (2016)
Kirkpatrick, J., Pascanu, R., Rabinowitz, N.C., Veness, J., Desjardins, G., Rusu, A.A., Milan, K., Quan, J., Ramalho, T., Grabska-Barwinska, A., Hassabis, D., Clopath, C., Kumaran, D., Hadsell, R.: Overcoming catastrophic forgetting in neural networks. CoRR abs/1612.00796 (2016)
Kunanusont, K., Lucas, S.M., Pérez-Liébana, D.: General video game AI: learning from screen capture. In: IEEE Conference on Computational Intelligence and Games, pp. 2078–2085 (2017)
Lample, G., Chaplot, D.S.: Playing FPS games with deep reinforcement learning. In: Proceedings of the AAAI Conference on Artificial Intelligence, pp. 2140–2146 (2017)
Lichodzijewski, P.: A symbiotic bid-based framework for problem decomposition using Genetic Programming. Ph.D. thesis, Faculty of Computer Science, Dalhousie University (2011)
Lichodzijewski, P., Heywood, M.I.: Symbiosis, complexification and simplicity under GP. In: Proceedings of the ACM Genetic and Evolutionary Computation Conference, pp. 853–860 (2010)
Loiacono, D., Lanzi, P., Togelius, J., Onieva, E., Pelta, D., Butz, M., Lonneker, T., Cardamone, L., Perez, D., Sáez, Y., Preuss, M., Quadflieg, J.: The 2009 simulated car racing championship. IEEE Trans. Comput. Intell. AI Games 2(2), 131–147 (2010)
Mnih, V., Kavukcuoglu, K., Silver, D., Rusu, A.A., Veness, J., Bellemare, M.G., Graves, A., Riedmiller, M., Fidjeland, A.K., Ostrovski, G., Petersen, S., Beattie, C., Sadik, A., Antonoglou, I., King, H., Kumaran, D., Wierstra, D., Legg, S., Hassabis, D.: Human-level control through deep reinforcement learning. Nature 518(7540), 529–533 (2015)
Poulsen, A.P., Thorhauge, M., Funch, M.H., Risi, S.: DLNE: a hybridization of deep learning and neuroevolution for visual control. In: IEEE Conference on Computational Intelligence and Games, pp. 1–8 (2017)
Ratcliffe, D.S., Devlin, S., Kruschwitz, U., Citi, L.: Clyde: a deep reinforcement learning DOOM playing agent. In: AAAI Workshop on What’s Next for AI in Games, pp. 983–990 (2017)
Whiteson, S., Kohl, N., Miikkulainen, R., Stone, P.: Evolving keepaway soccer players through task decomposition. Mach. Learn. 59(1), 5–30 (2005)
Wu, Y., Tian, Y.: Training agent for first-person shooter game with actor-critic curriculum learning. In: International Conference on Learning Representations, pp. 1–10 (2017)
Yannakakis, G.N., Togelius, J.: A panorama of artificial and computational intelligence in games. IEEE Trans. Comput. Intell. AI Games 7(4), 317–335 (2015)
Acknowledgments
This research was supported by NSERC grant CRDJ 499792.
© 2018 Springer International Publishing AG, part of Springer Nature
Cite this paper
Smith, R.J., Heywood, M.I. (2018). Scaling Tangled Program Graphs to Visual Reinforcement Learning in ViZDoom. In: Castelli, M., Sekanina, L., Zhang, M., Cagnoni, S., García-Sánchez, P. (eds) Genetic Programming. EuroGP 2018. Lecture Notes in Computer Science(), vol 10781. Springer, Cham. https://doi.org/10.1007/978-3-319-77553-1_9
Print ISBN: 978-3-319-77552-4
Online ISBN: 978-3-319-77553-1