Abstract
Rogue is a famous dungeon-crawling video game of the 1980s, the ancestor of its genre. Rogue-like games are known for requiring the player to explore partially observable, randomly generated labyrinths that differ on every run, preventing any form of level replay. As such, they serve as a natural and challenging task for reinforcement learning, requiring the acquisition of complex, non-reactive behaviors involving memory and planning. In this article we show how, exploiting a version of Asynchronous Advantage Actor-Critic (A3C) partitioned over different situations, the agent is able to reach the stairs and descend to the next level in 98% of cases.
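The core idea stated in the abstract, partitioning A3C over different situations, can be illustrated with a minimal sketch: a dispatcher maps the current game state to a discrete situation label and delegates action selection to the sub-policy trained for that situation. The situation predicates, labels, and toy policies below are illustrative assumptions, not the authors' actual implementation (see [2] for the real code).

```python
# Hypothetical sketch of situation-partitioned policy selection.
# Each situation label would correspond to a separately trained A3C network;
# here, simple callables stand in for those networks.

def detect_situation(state):
    """Map a raw game state to a discrete situation label (assumed predicates)."""
    if state.get("on_stairs"):
        return "descend"
    if state.get("door_visible"):
        return "reach_door"
    return "explore"

class PartitionedAgent:
    def __init__(self, policies):
        # policies: dict mapping situation label -> policy callable
        self.policies = policies

    def act(self, state):
        # dispatch to the sub-policy responsible for the current situation
        situation = detect_situation(state)
        return self.policies[situation](state)

# Toy stand-ins for trained A3C sub-policies.
agent = PartitionedAgent({
    "descend": lambda s: "go_downstairs",
    "reach_door": lambda s: "move_to_door",
    "explore": lambda s: "explore_step",
})

print(agent.act({"on_stairs": True}))    # -> go_downstairs
print(agent.act({"door_visible": True}))  # -> move_to_door
```

Partitioning in this style lets each sub-policy specialize on a narrower state distribution, which is the motivation the abstract gives for the approach.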
Notes
- 1.
For a good agent, on average, little more than one hundred moves are typically enough.
- 2.
Source code and weights are publicly available at [2].
- 3.
Source code and weights are publicly available at [2].
- 4.
A video of our agent playing is available at https://youtu.be/1j6_165Q46w.
References
RMSPropOptimizer. https://www.tensorflow.org/api_docs/python/tf/train/RMSPropOptimizer
Asperti, A., Cortesi, D., Sovrano, F.: Partitioned A3C for rogueinabox. https://github.com/Francesco-Sovrano/Partitioned-A3C-for-RogueInABox
Asperti, A., Pieri, C.D., Maldini, M., Pedrini, G., Sovrano, F.: A modular deep-learning environment for rogue. WSEAS Trans. Syst. Control 12, 362–373 (2017). http://www.wseas.org/multimedia/journals/control/2017/a785903-070.php
Asperti, A., Pieri, C.D., Pedrini, G.: Rogueinabox: an environment for rogue like learning. Int. J. Comput. 2, 146–154 (2017). http://www.iaras.org/iaras/filedownloads/ijc/2017/006-0022(2017).pdf
Bellemare, M.G., Naddaf, Y., Veness, J., Bowling, M.: The arcade learning environment: an evaluation platform for general agents. J. Artif. Intell. Res. (JAIR) 47, 253–279 (2013). https://doi.org/10.1613/jair.3912
Cerny, V., Dechterenko, F.: Rogue-like games as a playground for artificial intelligence – evolutionary approach. In: Chorianopoulos, K., Divitini, M., Hauge, J.B., Jaccheri, L., Malaka, R. (eds.) ICEC 2015. LNCS, vol. 9353, pp. 261–271. Springer, Cham (2015). https://doi.org/10.1007/978-3-319-24589-8_20
Dilokthanakul, N., Kaplanis, C., Pawlowski, N., Shanahan, M.: Feature control as intrinsic motivation for hierarchical reinforcement learning. CoRR abs/1705.06769 (2017). http://arxiv.org/abs/1705.06769
van Hasselt, H., Guez, A., Silver, D.: Deep reinforcement learning with double Q-learning. CoRR abs/1509.06461 (2015). http://arxiv.org/abs/1509.06461
Jaderberg, M., et al.: Reinforcement learning with unsupervised auxiliary tasks. CoRR abs/1611.05397 (2016). http://arxiv.org/abs/1611.05397
Jaderberg, M., Simonyan, K., Zisserman, A., Kavukcuoglu, K.: Spatial transformer networks. In: Cortes, C., Lawrence, N.D., Lee, D.D., Sugiyama, M., Garnett, R. (eds.) Advances in Neural Information Processing Systems: Annual Conference on Neural Information Processing Systems 2015, Montreal, Quebec, Canada, 7–12 December 2015, vol. 28, pp. 2017–2025 (2015). http://papers.nips.cc/paper/5854-spatial-transformer-networks
Kempka, M., Wydmuch, M., Runc, G., Toczek, J., Jaskowski, W.: ViZDoom: a doom-based AI research platform for visual reinforcement learning. CoRR abs/1605.02097 (2016). http://arxiv.org/abs/1605.02097
Klyubin, A.S., Polani, D., Nehaniv, C.L.: Empowerment: a universal agent-centric measure of control. In: Proceedings of the IEEE Congress on Evolutionary Computation, CEC 2005, Edinburgh, UK, 2–4 September 2005, pp. 128–135 (2005). https://doi.org/10.1109/CEC.2005.1554676
Kulkarni, T.D., Narasimhan, K., Saeedi, A., Tenenbaum, J.B.: Hierarchical deep reinforcement learning: integrating temporal abstraction and intrinsic motivation. CoRR abs/1604.06057 (2016). http://arxiv.org/abs/1604.06057
Miyoshi, K.: Unreal implementation. https://github.com/miyosuda/unreal
Mnih, V., et al.: Asynchronous methods for deep reinforcement learning. CoRR abs/1602.01783 (2016). http://arxiv.org/abs/1602.01783
Mnih, V., et al.: Human-level control through deep reinforcement learning. Nature 518(7540), 529–533 (2015). https://doi.org/10.1038/nature14236
Singh, S.P., Barto, A.G., Chentanez, N.: Intrinsically motivated reinforcement learning. In: Advances in Neural Information Processing Systems: Neural Information Processing Systems, NIPS 2004, Vancouver, British Columbia, Canada, 13–18 December 2004, vol. 17, pp. 1281–1288 (2004). http://papers.nips.cc/paper/2552-intrinsically-motivated-reinforcement-learning
Song, Y., Xu, M., Zhang, S., Huo, L.: Generalization tower network: a novel deep neural network architecture for multi-task learning. CoRR abs/1710.10036 (2017). http://arxiv.org/abs/1710.10036
Sun, R., Peterson, T.: Multi-agent reinforcement learning: weighting and partitioning. Neural Netw. 12(4–5), 727–753 (1999). https://doi.org/10.1016/S0893-6080(99)00024-6
Sutton, R.S., Barto, A.G.: Introduction to Reinforcement Learning, 1st edn. MIT Press, Cambridge (1998)
Vezhnevets, A.S., et al.: Feudal networks for hierarchical reinforcement learning. CoRR abs/1703.01161 (2017). http://arxiv.org/abs/1703.01161
Wang, Z., et al.: Sample efficient actor-critic with experience replay. CoRR abs/1611.01224 (2016). http://arxiv.org/abs/1611.01224
Copyright information
© 2019 Springer Nature Switzerland AG
About this paper
Cite this paper
Asperti, A., Cortesi, D., Sovrano, F. (2019). Crawling in Rogue’s Dungeons with (Partitioned) A3C. In: Nicosia, G., Pardalos, P., Giuffrida, G., Umeton, R., Sciacca, V. (eds) Machine Learning, Optimization, and Data Science. LOD 2018. Lecture Notes in Computer Science(), vol 11331. Springer, Cham. https://doi.org/10.1007/978-3-030-13709-0_22
DOI: https://doi.org/10.1007/978-3-030-13709-0_22
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-13708-3
Online ISBN: 978-3-030-13709-0
eBook Packages: Computer Science, Computer Science (R0)