Deep Multi-agent Reinforcement Learning in a Homogeneous Open Population

  • Conference paper

In: Artificial Intelligence (BNAIC 2018)

Part of the book series: Communications in Computer and Information Science (CCIS, volume 1021)

Abstract

Advances in reinforcement learning research have recently produced agents that are competent at, and sometimes exceed human performance in, complex tasks. Most interesting real-world problems, however, are not restricted to a single agent, but instead involve multiple agents acting in the same environment, and such settings have proven challenging to solve. In this work we present a study of a homogeneous open population of agents modelled as a multi-agent reinforcement learning (MARL) system. We propose a centralised learning approach with decentralised execution, in which all agents are given the same policy to execute individually. Using the SimuLane highway traffic simulator as a test-bed, we show experimentally that initialising the multi-agent scenario with a policy learnt in the single-agent setting, and then fine-tuning it to the task, outperforms agents that learn in the multi-agent setting from scratch. Specifically, we contribute an open population MARL configuration, a method to transfer knowledge from the single- to the multi-agent setting, and a training procedure for a homogeneous open population of agents.
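The training procedure the abstract describes can be illustrated with a small, purely hypothetical sketch. The snippet below is not the authors' implementation: the one-lane environment, the tabular Q-table, and all names (LANE_LEN, step_population, train, and so on) are stand-ins for SimuLane and the deep network used in the paper, and the population is "open" only in the weak sense that agents leave once they reach the exit. It shows the three phases: learning a single-agent policy, copying it as the single shared policy of the population, and fine-tuning it while every agent executes it individually, next to a from-scratch multi-agent baseline.

```python
import numpy as np

# Minimal sketch, assuming a toy one-lane environment and tabular Q-learning
# in place of SimuLane and the paper's deep network.
#   Phase 1: learn a policy in the single-agent version of the task.
#   Phase 2: copy it as the shared (centrally learnt) policy and fine-tune it
#            while every agent in the population executes it individually.

rng = np.random.default_rng(0)

LANE_LEN = 10                 # cells in the toy lane; the last cell is the exit
N_STATES = 2 * LANE_LEN       # state = (own cell, is the next cell occupied?)
N_ACTIONS = 2                 # 0 = stay, 1 = advance one cell


def state_index(pos, blocked):
    return pos * 2 + int(blocked)


def occupied(cell, positions):
    # The exit cell can hold any number of agents; other cells hold at most one.
    return cell < LANE_LEN - 1 and cell in positions


def step_population(positions, q, eps):
    """One decentralised step: every agent acts with the same shared Q-table."""
    transitions, total_reward = [], 0.0
    for i in rng.permutation(len(positions)):
        pos = positions[i]
        if pos == LANE_LEN - 1:
            continue                      # this agent has already exited
        blocked = occupied(pos + 1, positions)
        s = state_index(pos, blocked)
        a = rng.integers(N_ACTIONS) if rng.random() < eps else int(np.argmax(q[s]))
        r = -1.0                          # time penalty
        if a == 1:
            if blocked:
                r -= 5.0                  # "collision": cell is taken, no movement
            else:
                positions[i] = pos + 1
                if positions[i] == LANE_LEN - 1:
                    r += 5.0              # reached the exit
        s2 = state_index(positions[i], occupied(positions[i] + 1, positions))
        transitions.append((s, a, r, s2))
        total_reward += r
    return transitions, total_reward


def train(q, n_agents, episodes, alpha=0.1, gamma=0.95, eps=0.1):
    """Centralised Q-learning updates on one shared table; returns a final score."""
    returns = []
    for _ in range(episodes):
        positions = list(range(n_agents))     # agents start bumper to bumper
        ep_return = 0.0
        for _ in range(4 * LANE_LEN):
            transitions, r = step_population(positions, q, eps)
            ep_return += r
            for s, a, rew, s2 in transitions:
                q[s, a] += alpha * (rew + gamma * q[s2].max() - q[s, a])
            if all(p == LANE_LEN - 1 for p in positions):
                break
        returns.append(ep_return)
    return float(np.mean(returns[-20:]))


# Phase 1: single-agent policy learnt from scratch.
q_single = np.zeros((N_STATES, N_ACTIONS))
train(q_single, n_agents=1, episodes=300)

# Phase 2a: transfer -- initialise the shared policy with the single-agent one
# and fine-tune it in the multi-agent setting.
q_transfer = q_single.copy()
score_transfer = train(q_transfer, n_agents=4, episodes=100)

# Phase 2b: baseline -- learn the multi-agent task from scratch.
q_scratch = np.zeros((N_STATES, N_ACTIONS))
score_scratch = train(q_scratch, n_agents=4, episodes=100)

print(f"fine-tuned from single-agent policy: {score_transfer:.1f}")
print(f"learnt from scratch:                 {score_scratch:.1f}")
```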

M. Legrand: contribution made during master's thesis studies at the Vrije Universiteit Brussel.

Notes

  1. We have previously demonstrated an earlier version of SimuLane at BNAIC 2017 [12].

  2. We ran simulations for other intermediate values as well, but show only the two extremes here, for the sake of graph legibility.

References

  1. Amato, C., Oliehoek, F.A.: Scalable planning and learning for multiagent POMDPs. In: AAAI, pp. 1995–2002 (2015)

  2. Boutsioukis, G., Partalas, I., Vlahavas, I.: Transfer learning in multi-agent reinforcement learning domains. In: Sanner, S., Hutter, M. (eds.) EWRL 2011. LNCS (LNAI), vol. 7188, pp. 249–260. Springer, Heidelberg (2012). https://doi.org/10.1007/978-3-642-29946-9_25

  3. Busoniu, L., Babuska, R., Schutter, B.D.: A comprehensive survey of multiagent reinforcement learning. IEEE Trans. Syst. Man Cybern. Part C 38(2), 156–172 (2008)

  4. De Hauwere, Y.M.: Sparse interactions in multi-agent reinforcement learning. Ph.D. thesis, Vrije Universiteit Brussel (2011)

  5. Duchi, J., Hazan, E., Singer, Y.: Adaptive subgradient methods for online learning and stochastic optimization. J. Mach. Learn. Res. 12(Jul), 2121–2159 (2011)

  6. Espeholt, L., et al.: IMPALA: scalable distributed deep-RL with importance weighted actor-learner architectures. arXiv preprint arXiv:1802.01561 (2018)

  7. Foerster, J., Assael, Y.M., de Freitas, N., Whiteson, S.: Learning to communicate with deep multi-agent reinforcement learning. In: Advances in Neural Information Processing Systems, pp. 2137–2145 (2016)

  8. Foerster, J., et al.: Stabilising experience replay for deep multi-agent reinforcement learning. arXiv preprint arXiv:1702.08887 (2017)

  9. Gupta, J.K., Egorov, M., Kochenderfer, M.: Cooperative multi-agent control using deep reinforcement learning. In: Sukthankar, G., Rodriguez-Aguilar, J.A. (eds.) AAMAS 2017. LNCS (LNAI), vol. 10642, pp. 66–83. Springer, Cham (2017). https://doi.org/10.1007/978-3-319-71682-4_5

  10. Heinerman, J., Rango, M., Eiben, A.E.: Evolution, individual learning, and social learning in a swarm of real robots. In: 2015 IEEE Symposium Series on Computational Intelligence, pp. 1055–1062. IEEE (2015)

  11. Legrand, M.: Deep reinforcement learning for autonomous vehicle control among human drivers. Master dissertation, Vrije Universiteit Brussel (2017). http://ai.vub.ac.be/sites/default/files/thesis_legrand.pdf

  12. Legrand, M., Rădulescu, R., Roijers, D.M., Nowé, A.: The SimuLane highway traffic simulator for multi-agent reinforcement learning. BNAIC 2017, 394–395 (2017)

  13. Lillicrap, T.P., et al.: Continuous control with deep reinforcement learning. arXiv preprint arXiv:1509.02971 (2015)

  14. Lin, L.J.: Self-improving reactive agents based on reinforcement learning, planning and teaching. Mach. Learn. 8(3–4), 293–321 (1992)

  15. Littman, M.L.: Value-function reinforcement learning in Markov games. Cogn. Syst. Res. 2(1), 55–66 (2001)

  16. Lowe, R., Wu, Y., Tamar, A., Harb, J., Abbeel, O.P., Mordatch, I.: Multi-agent actor-critic for mixed cooperative-competitive environments. In: Advances in Neural Information Processing Systems, pp. 6382–6393 (2017)

  17. Mnih, V., et al.: Asynchronous methods for deep reinforcement learning. CoRR abs/1602.01783 (2016)

  18. Mnih, V., et al.: Playing Atari with deep reinforcement learning. CoRR abs/1312.5602 (2013)

  19. Mnih, V., et al.: Human-level control through deep reinforcement learning. Nature 518(7540), 529–533 (2015)

  20. Mordatch, I., Abbeel, P.: Emergence of grounded compositional language in multi-agent populations. arXiv preprint arXiv:1703.04908 (2017)

  21. Mossalam, H., Assael, Y., Roijers, D., Whiteson, S.: Multi-objective deep reinforcement learning. In: NIPS Workshop on Deep RL (2016)

  22. Nowé, A., Vrancx, P., De Hauwere, Y.M.: Game theory and multi-agent reinforcement learning. In: Wiering, M., van Otterlo, M. (eds.) Reinforcement Learning: State of the Art, pp. 441–470. Springer, Heidelberg (2012). https://doi.org/10.1007/978-3-642-27645-3_14

  23. Rusu, A.A., et al.: Progressive neural networks. arXiv preprint arXiv:1606.04671 (2016)

  24. Schulman, J., Levine, S., Abbeel, P., Jordan, M., Moritz, P.: Trust region policy optimization. In: International Conference on Machine Learning, pp. 1889–1897 (2015)

  25. Silver, D., et al.: Mastering the game of Go with deep neural networks and tree search. Nature 529(7587), 484–489 (2016)

  26. Srivastava, N., Hinton, G., Krizhevsky, A., Sutskever, I., Salakhutdinov, R.: Dropout: a simple way to prevent neural networks from overfitting. J. Mach. Learn. Res. 15, 1929–1958 (2014)

  27. Steckelmacher, D., Roijers, D.M., Harutyunyan, A., Vrancx, P., Plisnier, H., Nowé, A.: Reinforcement learning in POMDPs with memoryless options and option-observation initiation sets. AAAI 2018, 4099–4106 (2018)

  28. Sutton, R.S., Barto, A.G.: Reinforcement Learning: An Introduction. MIT Press, Cambridge (1998)

  29. Taylor, M.E., Stone, P.: Transfer learning for reinforcement learning domains: a survey. J. Mach. Learn. Res. 10(Jul), 1633–1685 (2009)

  30. Van Hasselt, H., Guez, A., Silver, D.: Deep reinforcement learning with double Q-learning. AAAI 16, 2094–2100 (2016)

  31. Watkins, C.J.C.H.: Learning from delayed rewards. Ph.D. thesis, University of Cambridge, England (1989)

  32. Wiggers, A.J., Oliehoek, F.A., Roijers, D.M.: Structure in the value function of two-player zero-sum games of incomplete information. In: ECAI 2016, pp. 1628–1629 (2016)

  33. Zhang, C., Lesser, V.: Coordinating multi-agent reinforcement learning with limited communication. In: Proceedings of the 2013 International Conference on Autonomous Agents and Multi-agent Systems, pp. 1101–1108 (2013)

Acknowledgements

This work is supported by Flanders Innovation & Entrepreneurship (VLAIO) through SBO project 140047: Stable MultI-agent LEarnIng for neTworks (SMILE-IT), by the European Union FET Proactive Initiative project 64089: Deferred Restructuring of Experience in Autonomous Machines (DREAM), and by the Security-Driven Engineering of Cloud-Based Applications (SeCLOUD) project.

Author information

Corresponding author: Roxana Rădulescu

Copyright information

© 2019 Springer Nature Switzerland AG

About this paper

Cite this paper

Rădulescu, R., Legrand, M., Efthymiadis, K., Roijers, D.M., Nowé, A. (2019). Deep Multi-agent Reinforcement Learning in a Homogeneous Open Population. In: Atzmueller, M., Duivesteijn, W. (eds) Artificial Intelligence. BNAIC 2018. Communications in Computer and Information Science, vol 1021. Springer, Cham. https://doi.org/10.1007/978-3-030-31978-6_8

  • DOI: https://doi.org/10.1007/978-3-030-31978-6_8

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-31977-9

  • Online ISBN: 978-3-030-31978-6

  • eBook Packages: Computer Science, Computer Science (R0)
