Abstract
Advances in reinforcement learning research have recently produced agents that match, and sometimes exceed, human performance on complex tasks. Most interesting real-world problems, however, are not restricted to a single agent: they involve multiple agents acting in the same environment, and such settings have proven challenging to solve. In this work we present a study of a homogeneous open population of agents modelled as a multi-agent reinforcement learning (MARL) system. We propose a centralised learning approach with decentralised execution, in which all agents individually execute the same policy. Using the SimuLane highway traffic simulator as a test-bed, we show experimentally that initialising the multi-agent scenario with a policy learnt in the single-agent setting, and then fine-tuning it to the task, outperforms agents that learn in the multi-agent setting from scratch. Specifically, we contribute an open-population MARL configuration, a method for transferring knowledge from a single- to a multi-agent setting, and a training procedure for a homogeneous open population of agents.
M. Legrand—Contribution done during the master thesis studies at the Vrije Universiteit Brussel.
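The training procedure described in the abstract — learn a policy in the single-agent setting, copy it to every agent in the population (centralised learning, decentralised execution with a shared policy), then fine-tune in the multi-agent simulator — can be sketched as follows. This is a minimal toy illustration, not the paper's implementation: the names `Policy`, `train_single_agent`, and `init_population`, the dummy reward, and the tabular weights are all assumptions made for the sake of a runnable example.

```python
import copy
import random

class Policy:
    """Toy parametric policy: a weight table maps a state index to action scores."""
    def __init__(self, n_states=4, n_actions=2):
        self.w = [[0.0] * n_actions for _ in range(n_states)]

    def act(self, state):
        # Greedy action: index of the highest-scoring action for this state.
        scores = self.w[state]
        return max(range(len(scores)), key=scores.__getitem__)

def train_single_agent(policy, steps=100):
    # Stand-in for single-agent training (e.g. deep Q-learning in SimuLane):
    # nudge weights towards actions that earn a dummy reward.
    rng = random.Random(0)
    for _ in range(steps):
        s = rng.randrange(len(policy.w))
        a = rng.randrange(len(policy.w[s]))
        reward = 1.0 if a == s % 2 else -1.0  # placeholder reward signal
        policy.w[s][a] += 0.1 * reward
    return policy

def init_population(single_agent_policy, n_agents):
    # Centralised learning, decentralised execution: every agent in the
    # open population starts from a copy of the same shared policy.
    return [copy.deepcopy(single_agent_policy) for _ in range(n_agents)]

pretrained = train_single_agent(Policy())
population = init_population(pretrained, n_agents=5)
# Each agent begins from the single-agent policy rather than from scratch;
# multi-agent fine-tuning would continue training from this initialisation.
```

The key design point is the initialisation step: agents are not trained from random weights in the multi-agent environment, but inherit the single-agent policy and only then adapt to the presence of other agents.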
Notes
- 1. We have previously demonstrated an earlier version of SimuLane at BNAIC 2017 [12].
- 2. We note that we ran simulations for other intermediate values as well, but show here only the two extremes, for the sake of graph legibility.
References
Amato, C., Oliehoek, F.A.: Scalable planning and learning for multiagent POMDPs. In: AAAI, pp. 1995–2002 (2015)
Boutsioukis, G., Partalas, I., Vlahavas, I.: Transfer learning in multi-agent reinforcement learning domains. In: Sanner, S., Hutter, M. (eds.) EWRL 2011. LNCS (LNAI), vol. 7188, pp. 249–260. Springer, Heidelberg (2012). https://doi.org/10.1007/978-3-642-29946-9_25
Busoniu, L., Babuska, R., De Schutter, B.: A comprehensive survey of multiagent reinforcement learning. IEEE Trans. Syst. Man Cybern. Part C 38(2), 156–172 (2008)
De Hauwere, Y.M.: Sparse interactions in multi-agent reinforcement learning. Ph.D. thesis, Vrije Universiteit Brussel (2011)
Duchi, J., Hazan, E., Singer, Y.: Adaptive subgradient methods for online learning and stochastic optimization. J. Mach. Learn. Res. 12(Jul), 2121–2159 (2011)
Espeholt, L., et al.: IMPALA: scalable distributed deep-RL with importance weighted actor-learner architectures. arXiv preprint arXiv:1802.01561 (2018)
Foerster, J., Assael, Y.M., de Freitas, N., Whiteson, S.: Learning to communicate with deep multi-agent reinforcement learning. In: Advances in Neural Information Processing Systems, pp. 2137–2145 (2016)
Foerster, J., et al.: Stabilising experience replay for deep multi-agent reinforcement learning. arXiv preprint arXiv:1702.08887 (2017)
Gupta, J.K., Egorov, M., Kochenderfer, M.: Cooperative multi-agent control using deep reinforcement learning. In: Sukthankar, G., Rodriguez-Aguilar, J.A. (eds.) AAMAS 2017. LNCS (LNAI), vol. 10642, pp. 66–83. Springer, Cham (2017). https://doi.org/10.1007/978-3-319-71682-4_5
Heinerman, J., Rango, M., Eiben, A.E.: Evolution, individual learning, and social learning in a swarm of real robots. In: 2015 IEEE Symposium Series on Computational Intelligence, pp. 1055–1062. IEEE (2015)
Legrand, M.: Deep reinforcement learning for autonomous vehicle control among human drivers. Master dissertation, Vrije Universiteit Brussel (2017). http://ai.vub.ac.be/sites/default/files/thesis_legrand.pdf
Legrand, M., Rădulescu, R., Roijers, D.M., Nowé, A.: The SimuLane highway traffic simulator for multi-agent reinforcement learning. In: BNAIC, pp. 394–395 (2017)
Lillicrap, T.P., et al.: Continuous control with deep reinforcement learning. arXiv preprint arXiv:1509.02971 (2015)
Lin, L.J.: Self-improving reactive agents based on reinforcement learning, planning and teaching. Mach. Learn. 8(3–4), 293–321 (1992)
Littman, M.L.: Value-function reinforcement learning in Markov games. Cogn. Syst. Res. 2(1), 55–66 (2001)
Lowe, R., Wu, Y., Tamar, A., Harb, J., Abbeel, P., Mordatch, I.: Multi-agent actor-critic for mixed cooperative-competitive environments. In: Advances in Neural Information Processing Systems, pp. 6382–6393 (2017)
Mnih, V., et al.: Asynchronous methods for deep reinforcement learning. CoRR abs/1602.01783 (2016)
Mnih, V., et al.: Playing Atari with deep reinforcement learning. CoRR abs/1312.5602 (2013)
Mnih, V., et al.: Human-level control through deep reinforcement learning. Nature 518(7540), 529–533 (2015)
Mordatch, I., Abbeel, P.: Emergence of grounded compositional language in multi-agent populations. arXiv preprint arXiv:1703.04908 (2017)
Mossalam, H., Assael, Y., Roijers, D., Whiteson, S.: Multi-objective deep reinforcement learning. In: NIPS Workshop on Deep RL (2016)
Nowé, A., Vrancx, P., De Hauwere, Y.M.: Game theory and multi-agent reinforcement learning. In: Wiering, M., van Otterlo, M. (eds.) Reinforcement Learning: State of the Art, pp. 441–470. Springer, Heidelberg (2012). https://doi.org/10.1007/978-3-642-27645-3_14
Rusu, A.A., et al.: Progressive neural networks. arXiv preprint arXiv:1606.04671 (2016)
Schulman, J., Levine, S., Abbeel, P., Jordan, M., Moritz, P.: Trust region policy optimization. In: International Conference on Machine Learning, pp. 1889–1897 (2015)
Silver, D., et al.: Mastering the game of Go with deep neural networks and tree search. Nature 529(7587), 484–489 (2016)
Srivastava, N., Hinton, G., Krizhevsky, A., Sutskever, I., Salakhutdinov, R.: Dropout: a simple way to prevent neural networks from overfitting. J. Mach. Learn. Res. 15, 1929–1958 (2014)
Steckelmacher, D., Roijers, D.M., Harutyunyan, A., Vrancx, P., Plisnier, H., Nowé, A.: Reinforcement learning in POMDPs with memoryless options and option-observation initiation sets. In: AAAI, pp. 4099–4106 (2018)
Sutton, R.S., Barto, A.G.: Reinforcement Learning: An Introduction. MIT Press, Cambridge (1998)
Taylor, M.E., Stone, P.: Transfer learning for reinforcement learning domains: a survey. J. Mach. Learn. Res. 10(Jul), 1633–1685 (2009)
Van Hasselt, H., Guez, A., Silver, D.: Deep reinforcement learning with double Q-learning. In: AAAI, pp. 2094–2100 (2016)
Watkins, C.J.C.H.: Learning from delayed rewards. Ph.D. thesis, University of Cambridge, England (1989)
Wiggers, A.J., Oliehoek, F.A., Roijers, D.M.: Structure in the value function of two-player zero-sum games of incomplete information. In: ECAI 2016, pp. 1628–1629 (2016)
Zhang, C., Lesser, V.: Coordinating multi-agent reinforcement learning with limited communication. In: Proceedings of the 2013 International Conference on Autonomous Agents and Multi-agent Systems, pp. 1101–1108 (2013)
Acknowledgements
This work is supported by Flanders Innovation & Entrepreneurship (VLAIO) through SBO project 140047: Stable MultI-agent LEarnIng for neTworks (SMILE-IT), by the European Union FET Proactive Initiative project 64089: Deferred Restructuring of Experience in Autonomous Machines (DREAM), and by the Security-Driven Engineering of Cloud-Based Applications (SeCLOUD) project.
Copyright information
© 2019 Springer Nature Switzerland AG
Cite this paper
Rădulescu, R., Legrand, M., Efthymiadis, K., Roijers, D.M., Nowé, A. (2019). Deep Multi-agent Reinforcement Learning in a Homogeneous Open Population. In: Atzmueller, M., Duivesteijn, W. (eds) Artificial Intelligence. BNAIC 2018. Communications in Computer and Information Science, vol 1021. Springer, Cham. https://doi.org/10.1007/978-3-030-31978-6_8
Print ISBN: 978-3-030-31977-9
Online ISBN: 978-3-030-31978-6