
Crawling in Rogue’s Dungeons with (Partitioned) A3C

  • Conference paper
  • First Online:
Machine Learning, Optimization, and Data Science (LOD 2018)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 11331)

Abstract

Rogue is a famous dungeon-crawling video game of the 1980s, the ancestor of its genre. Rogue-like games are characterized by the need to explore partially observable, randomly generated labyrinths that differ on every run, preventing any form of level replay. As such, they provide a natural and challenging task for reinforcement learning, requiring the acquisition of complex, non-reactive behaviors involving memory and planning. In this article we show how, by exploiting a version of Asynchronous Advantage Actor-Critic (A3C) partitioned over different game situations, the agent is able to reach the stairs and descend to the next level in 98% of cases.
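
The core idea summarized above is to partition the A3C learner over distinct game situations. The following Python sketch illustrates one way such a partitioned dispatch could look; the situation names, the classifier, and the ActorCritic stand-in are assumptions made here for illustration only, not the authors' exact implementation (the real source code is available at [2]).

    # Hypothetical sketch of a situation-partitioned A3C dispatcher.
    # Situations, classifier, and the ActorCritic stand-in are assumptions
    # for illustration; they are not the authors' exact design.
    import random

    SITUATIONS = ["corridor", "room_with_stairs", "room_without_stairs"]

    class ActorCritic:
        """Stand-in for one per-situation A3C network (policy + value heads)."""
        def __init__(self, n_actions):
            self.n_actions = n_actions

        def act(self, observation):
            # A real network would sample from its softmax policy over the
            # screen representation; a random action keeps the sketch runnable.
            return random.randrange(self.n_actions)

    def classify_situation(observation):
        # Placeholder: the partition is decided by the current game situation
        # (e.g. whether the stairs are visible on screen).  Assumed logic.
        return random.choice(SITUATIONS)

    # One model per partition; only the selected model acts on (and, during
    # training, learns from) the current transition.
    models = {s: ActorCritic(n_actions=4) for s in SITUATIONS}

    def step(observation):
        return models[classify_situation(observation)].act(observation)

The appeal of this kind of partitioning is that each sub-network only has to master a narrow family of situations, which can simplify credit assignment compared with a single monolithic policy.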


Notes

  1. For a good agent, on average, little more than one hundred moves are typically enough.

  2. Source code and weights are publicly available at [2].

  3. Source code and weights are publicly available at [2].

  4. A video of our agent playing is available at https://youtu.be/1j6_165Q46w.

References

  1. RMSPropOptimizer. https://www.tensorflow.org/api_docs/python/tf/train/RMSPropOptimizer

  2. Asperti, A., Cortesi, D., Sovrano, F.: Partitioned A3C for rogueinabox. https://github.com/Francesco-Sovrano/Partitioned-A3C-for-RogueInABox

  3. Asperti, A., Pieri, C.D., Maldini, M., Pedrini, G., Sovrano, F.: A modular deep-learning environment for rogue. WSEAS Trans. Syst. Control 12, 362–373 (2017). http://www.wseas.org/multimedia/journals/control/2017/a785903-070.php

  4. Asperti, A., Pieri, C.D., Pedrini, G.: Rogueinabox: an environment for rogue like learning. Int. J. Comput. 2, 146–154 (2017). http://www.iaras.org/iaras/filedownloads/ijc/2017/006-0022(2017).pdf

  5. Bellemare, M.G., Naddaf, Y., Veness, J., Bowling, M.: The arcade learning environment: an evaluation platform for general agents. J. Artif. Intell. Res. (JAIR) 47, 253–279 (2013). https://doi.org/10.1613/jair.3912

  6. Cerny, V., Dechterenko, F.: Rogue-like games as a playground for artificial intelligence – evolutionary approach. In: Chorianopoulos, K., Divitini, M., Hauge, J.B., Jaccheri, L., Malaka, R. (eds.) ICEC 2015. LNCS, vol. 9353, pp. 261–271. Springer, Cham (2015). https://doi.org/10.1007/978-3-319-24589-8_20

  7. Dilokthanakul, N., Kaplanis, C., Pawlowski, N., Shanahan, M.: Feature control as intrinsic motivation for hierarchical reinforcement learning. CoRR abs/1705.06769 (2017). http://arxiv.org/abs/1705.06769

  8. van Hasselt, H., Guez, A., Silver, D.: Deep reinforcement learning with double Q-learning. CoRR abs/1509.06461 (2015). http://arxiv.org/abs/1509.06461

  9. Jaderberg, M., et al.: Reinforcement learning with unsupervised auxiliary tasks. CoRR abs/1611.05397 (2016). http://arxiv.org/abs/1611.05397

  10. Jaderberg, M., Simonyan, K., Zisserman, A., Kavukcuoglu, K.: Spatial transformer networks. In: Cortes, C., Lawrence, N.D., Lee, D.D., Sugiyama, M., Garnett, R. (eds.) Advances in Neural Information Processing Systems: Annual Conference on Neural Information Processing Systems 2015, Montreal, Quebec, Canada, 7–12 December 2015, vol. 28, pp. 2017–2025 (2015). http://papers.nips.cc/paper/5854-spatial-transformer-networks

  11. Kempka, M., Wydmuch, M., Runc, G., Toczek, J., Jaskowski, W.: ViZDoom: a doom-based AI research platform for visual reinforcement learning. CoRR abs/1605.02097 (2016). http://arxiv.org/abs/1605.02097

  12. Klyubin, A.S., Polani, D., Nehaniv, C.L.: Empowerment: a universal agent-centric measure of control. In: Proceedings of the IEEE Congress on Evolutionary Computation, CEC 2005, Edinburgh, UK, 2–4 September 2005, pp. 128–135 (2005). https://doi.org/10.1109/CEC.2005.1554676

  13. Kulkarni, T.D., Narasimhan, K., Saeedi, A., Tenenbaum, J.B.: Hierarchical deep reinforcement learning: integrating temporal abstraction and intrinsic motivation. CoRR abs/1604.06057 (2016). http://arxiv.org/abs/1604.06057

  14. Miyoshi, K.: Unreal implementation. https://github.com/miyosuda/unreal

  15. Mnih, V., et al.: Asynchronous methods for deep reinforcement learning. CoRR abs/1602.01783 (2016). http://arxiv.org/abs/1602.01783

  16. Mnih, V., et al.: Human-level control through deep reinforcement learning. Nature 518(7540), 529–533 (2015). https://doi.org/10.1038/nature14236

  17. Singh, S.P., Barto, A.G., Chentanez, N.: Intrinsically motivated reinforcement learning. In: Advances in Neural Information Processing Systems: Neural Information Processing Systems, NIPS 2004, Vancouver, British Columbia, Canada, 13–18 December 2004, vol. 17, pp. 1281–1288 (2004). http://papers.nips.cc/paper/2552-intrinsically-motivated-reinforcement-learning

  18. Song, Y., Xu, M., Zhang, S., Huo, L.: Generalization tower network: a novel deep neural network architecture for multi-task learning. CoRR abs/1710.10036 (2017). http://arxiv.org/abs/1710.10036

  19. Sun, R., Peterson, T.: Multi-agent reinforcement learning: weighting and partitioning. Neural Netw. 12(4–5), 727–753 (1999). https://doi.org/10.1016/S0893-6080(99)00024-6

  20. Sutton, R.S., Barto, A.G.: Introduction to Reinforcement Learning, 1st edn. MIT Press, Cambridge (1998)

  21. Vezhnevets, A.S., et al.: Feudal networks for hierarchical reinforcement learning. CoRR abs/1703.01161 (2017). http://arxiv.org/abs/1703.01161

  22. Wang, Z., et al.: Sample efficient actor-critic with experience replay. CoRR abs/1611.01224 (2016). http://arxiv.org/abs/1611.01224

Author information

Correspondence to Andrea Asperti.

Copyright information

© 2019 Springer Nature Switzerland AG

About this paper

Cite this paper

Asperti, A., Cortesi, D., Sovrano, F. (2019). Crawling in Rogue’s Dungeons with (Partitioned) A3C. In: Nicosia, G., Pardalos, P., Giuffrida, G., Umeton, R., Sciacca, V. (eds) Machine Learning, Optimization, and Data Science. LOD 2018. Lecture Notes in Computer Science, vol. 11331. Springer, Cham. https://doi.org/10.1007/978-3-030-13709-0_22

  • DOI: https://doi.org/10.1007/978-3-030-13709-0_22

  • Published:

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-13708-3

  • Online ISBN: 978-3-030-13709-0

  • eBook Packages: Computer Science, Computer Science (R0)
