Abstract
The tangled program graph (TPG) framework was recently proposed as an emergent process for decomposing tasks and simultaneously composing solutions by organizing code into graphs of teams of programs. The initial evaluation assessed the ability of TPG to discover agents capable of playing Atari game titles under the Arcade Learning Environment. This is an example of ‘visual’ reinforcement learning, i.e. agents are evolved directly from the frame buffer without recourse to hand-designed features. TPG evolved solutions competitive with state-of-the-art deep reinforcement learning solutions, but at a fraction of the complexity. One simplifying assumption was that the visual input could be down-sampled from a \(210 \times 160\) resolution to \(42 \times 32\). In this work, we consider the challenging 3D first-person shooter environment of ViZDoom and require that agents be evolved at the original visual resolution of \(320 \times 240\) pixels. In addition, we address issues in developing agents capable of operating in multiple ViZDoom task environments simultaneously. The resulting TPG solutions retain all the emergent properties and computational efficiency of the original work. Moreover, the solutions appear to generalize across multiple task scenarios, whereas equivalent deep reinforcement learning solutions have focused on single task scenarios alone.
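The down-sampling simplification mentioned above (\(210 \times 160\) reduced to \(42 \times 32\), a factor of 5 per dimension) can be sketched with simple stride-based subsampling; this is an illustrative assumption, not the paper's actual preprocessing pipeline:

```python
def downsample(frame, factor=5):
    # Keep every `factor`-th row and column. Purely illustrative:
    # the paper's actual down-sampling scheme may differ.
    return [row[::factor] for row in frame[::factor]]

# A 210x160 Atari-resolution frame reduces to 42x32.
frame = [[0] * 160 for _ in range(210)]
small = downsample(frame)
print(len(small), len(small[0]))  # 42 32
```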
Notes
2. For an illustration of the incremental construction of TPG individuals, see the presentation slides of Stephen Kelly from EuroGP’17: http://stephenkelly.ca/research_files/skelly-mheywood-eurogp-2017.pdf.
3. Conversely, the deep reinforcement learning approach of [17] cropped the original visual space to \(84 \times 84\) pixels.
6. Note that \(H_{size}\) is the number of teams present and reflects the number of agents (root teams) at initialization. However, as teams are subsumed into graphs, the number of agents (root teams) will decrease. Likewise, application of the variation operator could switch an action from a team pointer to an atomic action, breaking a graph into two smaller graphs and resulting in an increase in the number of agents.
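The root-team bookkeeping described in note 6 can be sketched as follows. The data structure is a hypothetical simplification, not the actual TPG implementation: a team's actions are either atomic actions (strings) or pointers to other teams (integer ids), and a root team is any team no other team points to:

```python
def count_root_teams(teams):
    # teams: hypothetical dict of team id -> action list, where an action
    # is either an atomic action (str) or a pointer to another team (int).
    # Root teams (agents) are teams that no other team points to.
    pointed_to = {a for actions in teams.values()
                  for a in actions if isinstance(a, int)}
    return sum(1 for team_id in teams if team_id not in pointed_to)

# Team 1 points to team 2, so team 2 is subsumed: one root agent remains.
print(count_root_teams({1: ["left", 2], 2: ["right", "shoot"]}))  # 1
# Switching team 1's pointer to an atomic action breaks the graph in two.
print(count_root_teams({1: ["left", "shoot"], 2: ["right", "shoot"]}))  # 2
```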
7. Any negative normalized fitness values are treated as 0, thus producing a number in the range \([0, 1]\).
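The clamping described in note 7 can be sketched as below. The normalization scheme and names are illustrative assumptions (normalizing against a per-task reference score), not the paper's exact formula; only the clamp-negatives-to-zero behaviour is taken from the note:

```python
def normalized_fitness(score, reference_max):
    # Hypothetical scheme: normalize a task score against a reference
    # maximum, then treat negatives as 0 so the result lies in [0, 1].
    if reference_max == 0:
        return 0.0
    return max(0.0, min(1.0, score / reference_max))

print(normalized_fitness(-12, 100))  # 0.0
print(normalized_fitness(50, 100))   # 0.5
```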
References
Alvernaz, S., Togelius, J.: Autoencoder-augmented neuroevolution for visual Doom playing. In: IEEE Conference on Computational Intelligence and Games, pp. 1–8 (2017)
Bellemare, M.G., Naddaf, Y., Veness, J., Bowling, M.: The Arcade Learning Environment: an evaluation platform for general agents. J. Artif. Intell. Res. 47, 253–279 (2012)
Braylan, A., Hollenbeck, M., Meyerson, E., Miikkulainen, R.: Reuse of neural modules for general video game playing. In: Proceedings of the AAAI Conference on Artificial Intelligence, pp. 353–359 (2016)
Guo, X., Singh, S., Lewis, R., Lee, H.: Deep learning for reward design to improve Monte Carlo Tree Search in ATARI games. In: Proceedings of the AAAI Conference on Artificial Intelligence, pp. 1519–1525 (2016)
Hausknecht, M., Lehman, J., Miikkulainen, R., Stone, P.: A neuroevolution approach to general Atari game playing. IEEE Trans. Comput. Intell. AI Games 6(4), 355–366 (2014)
Jia, B., Ebner, M.: Evolving game state features from raw pixels. In: McDermott, J., Castelli, M., Sekanina, L., Haasdijk, E., García-Sánchez, P. (eds.) EuroGP 2017. LNCS, vol. 10196, pp. 52–63. Springer, Cham (2017). https://doi.org/10.1007/978-3-319-55696-3_4
Kelly, S., Heywood, M.I.: Knowledge transfer from Keepaway soccer to half-field offense through program symbiosis: building simple programs for a complex task. In: ACM Genetic and Evolutionary Computation Conference, pp. 1143–1150 (2015)
Kelly, S., Heywood, M.I.: Emergent tangled graph representations for Atari game playing agents. In: McDermott, J., Castelli, M., Sekanina, L., Haasdijk, E., García-Sánchez, P. (eds.) EuroGP 2017. LNCS, vol. 10196, pp. 64–79. Springer, Cham (2017). https://doi.org/10.1007/978-3-319-55696-3_5
Kelly, S., Heywood, M.I.: Multi-task learning in Atari video games with emergent tangled program graphs. In: ACM Genetic and Evolutionary Computation Conference, pp. 195–202 (2017)
Kempka, M., Wydmuch, M., Runc, G., Toczek, J., Jaśkowski, W.: ViZDoom: a Doom-based AI research platform for visual reinforcement learning. In: IEEE Conference on Computational Intelligence and Games, pp. 1–8 (2016)
Kirkpatrick, J., Pascanu, R., Rabinowitz, N.C., Veness, J., Desjardins, G., Rusu, A.A., Milan, K., Quan, J., Ramalho, T., Grabska-Barwinska, A., Hassabis, D., Clopath, C., Kumaran, D., Hadsell, R.: Overcoming catastrophic forgetting in neural networks. CoRR abs/1612.00796 (2016)
Kunanusont, K., Lucas, S.M., Pérez-Liébana, D.: General video game AI: learning from screen capture. In: IEEE Conference on Computational Intelligence and Games, pp. 2078–2085 (2017)
Lample, G., Chaplot, D.S.: Playing FPS games with deep reinforcement learning. In: Proceedings of the AAAI Conference on Artificial Intelligence, pp. 2140–2146 (2017)
Lichodzijewski, P.: A symbiotic bid-based framework for problem decomposition using Genetic Programming. Ph.D. thesis, Faculty of Computer Science, Dalhousie University (2011)
Lichodzijewski, P., Heywood, M.I.: Symbiosis, complexification and simplicity under GP. In: Proceedings of the ACM Genetic and Evolutionary Computation Conference, pp. 853–860 (2010)
Loiacono, D., Lanzi, P., Togelius, J., Onieva, E., Pelta, D., Butz, M., Lonneker, T., Cardamone, L., Perez, D., Sáez, Y., Preuss, M., Quadflieg, J.: The 2009 simulated car racing championship. IEEE Trans. Comput. Intell. AI Games 2(2), 131–147 (2010)
Mnih, V., Kavukcuoglu, K., Silver, D., Rusu, A.A., Veness, J., Bellemare, M.G., Graves, A., Riedmiller, M., Fidjeland, A.K., Ostrovski, G., Petersen, S., Beattie, C., Sadik, A., Antonoglou, I., King, H., Kumaran, D., Wierstra, D., Legg, S., Hassabis, D.: Human-level control through deep reinforcement learning. Nature 518(7540), 529–533 (2015)
Poulsen, A.P., Thorhauge, M., Funch, M.H., Risi, S.: DLNE: a hybridization of deep learning and neuroevolution for visual control. In: IEEE Conference on Computational Intelligence and Games, pp. 1–8 (2017)
Ratcliffe, D.S., Devlin, S., Kruschwitz, U., Citi, L.: Clyde: a deep reinforcement learning DOOM playing agent. In: AAAI Workshop on What’s Next for AI in Games, pp. 983–990 (2017)
Whiteson, S., Kohl, N., Miikkulainen, R., Stone, P.: Evolving keepaway soccer players through task decomposition. Mach. Learn. 59(1), 5–30 (2005)
Wu, Y., Tian, Y.: Training agent for first-person shooter game with actor-critic curriculum learning. In: International Conference on Learning Representations, pp. 1–10 (2017)
Yannakakis, G.N., Togelius, J.: A panorama of artificial and computational intelligence in games. IEEE Trans. Comput. Intell. AI Games 7(4), 317–335 (2015)
Acknowledgments
This research was supported by NSERC grant CRDJ 499792.
© 2018 Springer International Publishing AG, part of Springer Nature
Cite this paper
Smith, R.J., Heywood, M.I. (2018). Scaling Tangled Program Graphs to Visual Reinforcement Learning in ViZDoom. In: Castelli, M., Sekanina, L., Zhang, M., Cagnoni, S., García-Sánchez, P. (eds) Genetic Programming. EuroGP 2018. Lecture Notes in Computer Science(), vol 10781. Springer, Cham. https://doi.org/10.1007/978-3-319-77553-1_9
Print ISBN: 978-3-319-77552-4
Online ISBN: 978-3-319-77553-1