Scaling Tangled Program Graphs to Visual Reinforcement Learning in ViZDoom

Conference paper

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 10781)

Abstract

The tangled program graph (TPG) framework was recently proposed as an emergent process for decomposing tasks and simultaneously composing solutions by organizing code into graphs of teams of programs. The initial evaluation assessed the ability of TPG to discover agents capable of playing Atari game titles under the Arcade Learning Environment. This is an example of ‘visual’ reinforcement learning, i.e., agents are evolved directly from the frame buffer without recourse to hand-designed features. TPG was able to evolve solutions competitive with state-of-the-art deep reinforcement learning solutions, but at a fraction of the complexity. One simplifying assumption was that the visual input could be downsampled from a \(210 \times 160\) resolution to \(42 \times 32\). In this work, we consider the challenging 3D first-person shooter environment of ViZDoom and require that agents be evolved at the original visual resolution of \(320 \times 240\) pixels. In addition, we address issues in developing agents capable of operating in multiple ViZDoom task environments simultaneously. The resulting TPG solutions retain the emergent properties and computational efficiency of the original work. Moreover, the solutions appear to generalize across multiple task scenarios, whereas equivalent deep reinforcement learning solutions have focused on single-task scenarios alone.
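
To make the team-of-programs organization concrete, the following is a minimal Python sketch of how a TPG agent might map a frame buffer to an action: each program in a team bids on the current state, the highest bidder wins, and its action is either an atomic game action or a pointer to another team, in which case evaluation recurses through the graph. The class names, the linear bid form, and the button set are illustrative assumptions, not the authors' implementation.

```python
import random

ATOMIC_ACTIONS = ["MOVE_LEFT", "MOVE_RIGHT", "ATTACK"]  # example button set

class Program:
    """One bidding program: here, a toy weighted sum over sampled pixels."""

    def __init__(self, n_inputs, n_terms=8):
        self.idx = [random.randrange(n_inputs) for _ in range(n_terms)]
        self.w = [random.uniform(-1.0, 1.0) for _ in range(n_terms)]
        # The action is either an atomic game action or a pointer to a Team.
        self.action = random.choice(ATOMIC_ACTIONS)

    def bid(self, state):
        return sum(w * state[i] for w, i in zip(self.w, self.idx))

class Team:
    """A team of programs; the highest bidder's action is followed."""

    def __init__(self, programs):
        self.programs = programs

    def act(self, state, visited=None):
        visited = set() if visited is None else visited
        visited.add(id(self))
        # Visit programs in descending bid order, skipping pointers that
        # would revisit a team so that graph traversal remains cycle-free.
        for p in sorted(self.programs, key=lambda q: q.bid(state), reverse=True):
            if not isinstance(p.action, Team):
                return p.action
            if id(p.action) not in visited:
                return p.action.act(state, visited)
        return ATOMIC_ACTIONS[0]  # every outgoing pointer was already visited

# Usage: a single root team acting on a flattened 320x240 grayscale frame.
frame = [random.random() for _ in range(320 * 240)]
root = Team([Program(n_inputs=len(frame)) for _ in range(4)])
print(root.act(frame))
```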


Notes

  1. 1344 pixels in the downsampled visual interface of [8, 9] versus 76,800 pixels in the TPG deployment demonstrated for ViZDoom.

  2. For an illustration of the incremental construction of TPG individuals, see the presentation slides of Stephen Kelly from EuroGP’17: http://stephenkelly.ca/research_files/skelly-mheywood-eurogp-2017.pdf.

  3. Conversely, the deep reinforcement learning approach of [17] cropped the original visual space to \(84 \times 84\) pixels.

  4. For example, [1] assumes a \(120 \times 160\) visual input and [13] assumes \(60 \times 108\).

  5. https://github.com/mwydmuch/ViZDoom/tree/master/scenarios. A minimal interaction loop at the full \(320 \times 240\) resolution is sketched after these notes.

  6. Note that \(H_{size}\) is the number of teams present, which equals the number of agents (root teams) at initialization. However, as teams are subsumed into graphs, the number of root teams decreases. Likewise, application of the variation operator can switch an action from a team pointer back to an atomic action, breaking a graph into two smaller graphs and increasing the number of agents; this bookkeeping, together with the fitness normalization of the next note, is sketched in code after these notes.

  7. Any negative normalized fitness value is treated as 0, thus producing a number in the range [0, 1].

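For note 5's scenarios, a minimal ViZDoom interaction loop at the paper's native \(320 \times 240\) resolution might look like the sketch below; the config path and the three-button action assume the bundled basic scenario.

```python
from vizdoom import DoomGame, ScreenFormat, ScreenResolution

game = DoomGame()
game.load_config("scenarios/basic.cfg")  # from the ViZDoom repository
game.set_screen_resolution(ScreenResolution.RES_320X240)
game.set_screen_format(ScreenFormat.GRAY8)  # single-channel frame buffer
game.init()

game.new_episode()
while not game.is_episode_finished():
    frame = game.get_state().screen_buffer  # 240x320 pixel array
    action = [0, 0, 1]  # [MOVE_LEFT, MOVE_RIGHT, ATTACK]: press ATTACK
    reward = game.make_action(action)
game.close()
```
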
References

  1. Alvernaz, S., Togelius, J.: Autoencoder-augmented neuroevolution for visual Doom playing. In: IEEE Conference on Computational Intelligence and Games, pp. 1–8 (2017)

  2. Bellemare, M.G., Naddaf, Y., Veness, J., Bowling, M.: The arcade learning environment: an evaluation platform for general agents. J. Artif. Intell. Res. 47, 253–279 (2012)

  3. Braylan, A., Hollenbeck, M., Meyerson, E., Miikkulainen, R.: Reuse of neural modules for general video game playing. In: Proceedings of the AAAI Conference on Artificial Intelligence, pp. 353–359 (2016)

  4. Guo, X., Singh, S., Lewis, R., Lee, H.: Deep learning for reward design to improve Monte Carlo Tree Search in ATARI games. In: Proceedings of the AAAI Conference on Artificial Intelligence, pp. 1519–1525 (2016)

  5. Hausknecht, M., Lehman, J., Miikkulainen, R., Stone, P.: A neuroevolution approach to general Atari game playing. IEEE Trans. Comput. Intell. AI Games 6(4), 355–366 (2014)

  6. Jia, B., Ebner, M.: Evolving game state features from raw pixels. In: McDermott, J., Castelli, M., Sekanina, L., Haasdijk, E., García-Sánchez, P. (eds.) EuroGP 2017. LNCS, vol. 10196, pp. 52–63. Springer, Cham (2017). https://doi.org/10.1007/978-3-319-55696-3_4

  7. Kelly, S., Heywood, M.I.: Knowledge transfer from Keepaway soccer to half-field offense through program symbiosis: Building simple programs for a complex task. In: ACM Genetic and Evolutionary Computation Conference, pp. 1143–1150 (2015)

  8. Kelly, S., Heywood, M.I.: Emergent tangled graph representations for Atari game playing agents. In: McDermott, J., Castelli, M., Sekanina, L., Haasdijk, E., García-Sánchez, P. (eds.) EuroGP 2017. LNCS, vol. 10196, pp. 64–79. Springer, Cham (2017). https://doi.org/10.1007/978-3-319-55696-3_5

  9. Kelly, S., Heywood, M.I.: Multi-task learning in Atari video games with emergent tangled program graphs. In: ACM Genetic and Evolutionary Computation Conference, pp. 195–202 (2017)

  10. Kempka, M., Wydmuch, M., Runc, G., Toczek, J., Jaśkowski, W.: ViZDoom: a Doom-based AI research platform for visual reinforcement learning. In: IEEE Conference on Computational Intelligence and Games, pp. 1–8 (2016)

  11. Kirkpatrick, J., Pascanu, R., Rabinowitz, N.C., Veness, J., Desjardins, G., Rusu, A.A., Milan, K., Quan, J., Ramalho, T., Grabska-Barwinska, A., Hassabis, D., Clopath, C., Kumaran, D., Hadsell, R.: Overcoming catastrophic forgetting in neural networks. CoRR abs/1612.00796 (2016)

  12. Kunanusont, K., Lucas, S.M., Pérez-Liébana, D.: General video game AI: learning from screen capture. In: IEEE Conference on Computational Intelligence and Games, pp. 2078–2085 (2017)

  13. Lample, G., Chaplot, D.S.: Playing FPS games with deep reinforcement learning. In: Proceedings of the AAAI Conference on Artificial Intelligence, pp. 2140–2146 (2017)

  14. Lichodzijewski, P.: A symbiotic bid-based framework for problem decomposition using Genetic Programming. Ph.D. thesis, Faculty of Computer Science, Dalhousie University (2011)

  15. Lichodzijewski, P., Heywood, M.I.: Symbiosis, complexification and simplicity under GP. In: Proceedings of the ACM Genetic and Evolutionary Computation Conference, pp. 853–860 (2010)

  16. Loiacono, D., Lanzi, P., Togelius, J., Onieva, E., Pelta, D., Butz, M., Lonneker, T., Cardamone, L., Perez, D., Sáez, Y., Preuss, M., Quadflieg, J.: The 2009 simulated car racing championship. IEEE Trans. Comput. Intell. AI Games 2(2), 131–147 (2010)

  17. Mnih, V., Kavukcuoglu, K., Silver, D., Rusu, A.A., Veness, J., Bellemare, M.G., Graves, A., Riedmiller, M., Fidjeland, A.K., Ostrovski, G., Petersen, S., Beattie, C., Sadik, A., Antonoglou, I., King, H., Kumaran, D., Wierstra, D., Legg, S., Hassabis, D.: Human-level control through deep reinforcement learning. Nature 518(7540), 529–533 (2015)

  18. Poulsen, A.P., Thorhauge, M., Funch, M.H., Risi, S.: DLNE: a hybridization of deep learning and neuroevolution for visual control. In: IEEE Conference on Computational Intelligence and Games, pp. 1–8 (2017)

  19. Ratcliffe, D.S., Devlin, S., Kruschwitz, U., Citi, L.: Clyde: a deep reinforcement learning DOOM playing agent. In: AAAI Workshop on What’s Next for AI in Games, pp. 983–990 (2017)

  20. Whiteson, S., Kohl, N., Miikkulainen, R., Stone, P.: Evolving keepaway soccer players through task decomposition. Mach. Learn. 59(1), 5–30 (2005)

  21. Wu, Y., Tian, Y.: Training agent for first-person shooter game with actor-critic curriculum learning. In: International Conference on Learning Representations, pp. 1–10 (2017)

  22. Yannakakis, G.N., Togelius, J.: A panorama of artificial and computational intelligence in games. IEEE Trans. Comput. Intell. AI Games 7(4), 317–335 (2015)


Acknowledgments

This research was supported by NSERC grant CRDJ 499792.

Author information

Corresponding author: Robert J. Smith.

Copyright information

© 2018 Springer International Publishing AG, part of Springer Nature

About this paper

Cite this paper

Smith, R.J., Heywood, M.I. (2018). Scaling Tangled Program Graphs to Visual Reinforcement Learning in ViZDoom. In: Castelli, M., Sekanina, L., Zhang, M., Cagnoni, S., García-Sánchez, P. (eds) Genetic Programming. EuroGP 2018. Lecture Notes in Computer Science, vol 10781. Springer, Cham. https://doi.org/10.1007/978-3-319-77553-1_9

  • DOI: https://doi.org/10.1007/978-3-319-77553-1_9

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-77552-4

  • Online ISBN: 978-3-319-77553-1

  • eBook Packages: Computer Science, Computer Science (R0)
