
Crawling in Rogue’s Dungeons with (Partitioned) A3C

  • Conference paper
  • First Online:
Machine Learning, Optimization, and Data Science (LOD 2018)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 11331)

Abstract

Rogue is a famous dungeon-crawling video game of the 1980s, the ancestor of its genre. Rogue-like games are characterized by the need to explore partially observable, randomly generated labyrinths that differ on every run, preventing any form of level replay. As such, they provide a natural and challenging task for reinforcement learning, requiring the acquisition of complex, non-reactive behaviors involving memory and planning. In this article we show how, by exploiting a version of Asynchronous Advantage Actor-Critic (A3C) partitioned over different game situations, the agent is able to reach the stairs and descend to the next level in 98% of cases.
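
The core idea summarized above is to partition the A3C learner over distinct game situations. The following Python sketch illustrates one way such a partitioned dispatch could look; the situation names, the classifier, and the ActorCritic stand-in are assumptions made here for illustration only, not the authors' exact implementation (the real source code is available at [2]).

    # Hypothetical sketch of a situation-partitioned A3C dispatcher.
    # Situations, classifier, and the ActorCritic stand-in are assumptions
    # for illustration; they are not the authors' exact design.
    import random

    SITUATIONS = ["corridor", "room_with_stairs", "room_without_stairs"]

    class ActorCritic:
        """Stand-in for one per-situation A3C network (policy + value heads)."""
        def __init__(self, n_actions):
            self.n_actions = n_actions

        def act(self, observation):
            # A real network would sample from its softmax policy over the
            # screen representation; a random action keeps the sketch runnable.
            return random.randrange(self.n_actions)

    def classify_situation(observation):
        # Placeholder: the partition is decided by the current game situation
        # (e.g. whether the stairs are visible on screen).  Assumed logic.
        return random.choice(SITUATIONS)

    # One model per partition; only the selected model acts on (and, during
    # training, learns from) the current transition.
    models = {s: ActorCritic(n_actions=4) for s in SITUATIONS}

    def step(observation):
        return models[classify_situation(observation)].act(observation)

The appeal of this kind of partitioning is that each sub-network only has to master a narrow family of situations, which can simplify credit assignment compared with a single monolithic policy.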


Notes

  1. For a good agent, on average, little more than one hundred moves are typically enough.

  2. Source code and weights are publicly available at [2].

  3. Source code and weights are publicly available at [2].

  4. A video of our agent playing is available at https://youtu.be/1j6_165Q46w.

References

  1. RMSPropOptimizer. https://www.tensorflow.org/api_docs/python/tf/train/RMSPropOptimizer

  2. Asperti, A., Cortesi, D., Sovrano, F.: Partitioned A3C for rogueinabox. https://github.com/Francesco-Sovrano/Partitioned-A3C-for-RogueInABox

  3. Asperti, A., Pieri, C.D., Maldini, M., Pedrini, G., Sovrano, F.: A modular deep-learning environment for rogue. WSEAS Trans. Syst. Control 12, 362–373 (2017). http://www.wseas.org/multimedia/journals/control/2017/a785903-070.php

  4. Asperti, A., Pieri, C.D., Pedrini, G.: Rogueinabox: an environment for rogue like learning. Int. J. Comput. 2, 146–154 (2017). http://www.iaras.org/iaras/filedownloads/ijc/2017/006-0022(2017).pdf

  5. Bellemare, M.G., Naddaf, Y., Veness, J., Bowling, M.: The arcade learning environment: an evaluation platform for general agents. J. Artif. Intell. Res. (JAIR) 47, 253–279 (2013). https://doi.org/10.1613/jair.3912

  6. Cerny, V., Dechterenko, F.: Rogue-like games as a playground for artificial intelligence – evolutionary approach. In: Chorianopoulos, K., Divitini, M., Hauge, J.B., Jaccheri, L., Malaka, R. (eds.) ICEC 2015. LNCS, vol. 9353, pp. 261–271. Springer, Cham (2015). https://doi.org/10.1007/978-3-319-24589-8_20

  7. Dilokthanakul, N., Kaplanis, C., Pawlowski, N., Shanahan, M.: Feature control as intrinsic motivation for hierarchical reinforcement learning. CoRR abs/1705.06769 (2017). http://arxiv.org/abs/1705.06769

  8. van Hasselt, H., Guez, A., Silver, D.: Deep reinforcement learning with double Q-learning. CoRR abs/1509.06461 (2015). http://arxiv.org/abs/1509.06461

  9. Jaderberg, M., et al.: Reinforcement learning with unsupervised auxiliary tasks. CoRR abs/1611.05397 (2016). http://arxiv.org/abs/1611.05397

  10. Jaderberg, M., Simonyan, K., Zisserman, A., Kavukcuoglu, K.: Spatial transformer networks. In: Cortes, C., Lawrence, N.D., Lee, D.D., Sugiyama, M., Garnett, R. (eds.) Advances in Neural Information Processing Systems: Annual Conference on Neural Information Processing Systems 2015, Montreal, Quebec, Canada, 7–12 December 2015, vol. 28, pp. 2017–2025 (2015). http://papers.nips.cc/paper/5854-spatial-transformer-networks

  11. Kempka, M., Wydmuch, M., Runc, G., Toczek, J., Jaskowski, W.: ViZDoom: a doom-based AI research platform for visual reinforcement learning. CoRR abs/1605.02097 (2016). http://arxiv.org/abs/1605.02097

  12. Klyubin, A.S., Polani, D., Nehaniv, C.L.: Empowerment: a universal agent-centric measure of control. In: Proceedings of the IEEE Congress on Evolutionary Computation, CEC 2005, Edinburgh, UK, 2–4 September 2005, pp. 128–135 (2005). https://doi.org/10.1109/CEC.2005.1554676

  13. Kulkarni, T.D., Narasimhan, K., Saeedi, A., Tenenbaum, J.B.: Hierarchical deep reinforcement learning: integrating temporal abstraction and intrinsic motivation. CoRR abs/1604.06057 (2016). http://arxiv.org/abs/1604.06057

  14. Miyoshi, K.: Unreal implementation. https://github.com/miyosuda/unreal

  15. Mnih, V., et al.: Asynchronous methods for deep reinforcement learning. CoRR abs/1602.01783 (2016). http://arxiv.org/abs/1602.01783

  16. Mnih, V., et al.: Human-level control through deep reinforcement learning. Nature 518(7540), 529–533 (2015). https://doi.org/10.1038/nature14236

  17. Singh, S.P., Barto, A.G., Chentanez, N.: Intrinsically motivated reinforcement learning. In: Advances in Neural Information Processing Systems: Neural Information Processing Systems, NIPS 2004, Vancouver, British Columbia, Canada, 13–18 December 2004, vol. 17, pp. 1281–1288 (2004). http://papers.nips.cc/paper/2552-intrinsically-motivated-reinforcement-learning

  18. Song, Y., Xu, M., Zhang, S., Huo, L.: Generalization tower network: a novel deep neural network architecture for multi-task learning. CoRR abs/1710.10036 (2017). http://arxiv.org/abs/1710.10036

  19. Sun, R., Peterson, T.: Multi-agent reinforcement learning: weighting and partitioning. Neural Netw. 12(4–5), 727–753 (1999). https://doi.org/10.1016/S0893-6080(99)00024-6

  20. Sutton, R.S., Barto, A.G.: Introduction to Reinforcement Learning, 1st edn. MIT Press, Cambridge (1998)

  21. Vezhnevets, A.S., et al.: Feudal networks for hierarchical reinforcement learning. CoRR abs/1703.01161 (2017). http://arxiv.org/abs/1703.01161

  22. Wang, Z., et al.: Sample efficient actor-critic with experience replay. CoRR abs/1611.01224 (2016). http://arxiv.org/abs/1611.01224

Author information

Correspondence to Andrea Asperti.

Copyright information

© 2019 Springer Nature Switzerland AG

About this paper

Cite this paper

Asperti, A., Cortesi, D., Sovrano, F. (2019). Crawling in Rogue’s Dungeons with (Partitioned) A3C. In: Nicosia, G., Pardalos, P., Giuffrida, G., Umeton, R., Sciacca, V. (eds) Machine Learning, Optimization, and Data Science. LOD 2018. Lecture Notes in Computer Science, vol. 11331. Springer, Cham. https://doi.org/10.1007/978-3-030-13709-0_22

  • DOI: https://doi.org/10.1007/978-3-030-13709-0_22

  • Published:

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-13708-3

  • Online ISBN: 978-3-030-13709-0

  • eBook Packages: Computer Science, Computer Science (R0)
