
A Top-Down Approach to Attain Decentralized Multi-agents

Chapter in: Handbook of Reinforcement Learning and Control

Part of the book series: Studies in Systems, Decision and Control (SSDC, volume 325)


Abstract

In the modern age, with the enormous amounts of data being collected, machine learning has blossomed into a necessary tool in nearly all academic and industrial fields. In many cases, machine learning deals with a static dataset, e.g., a fixed training set, as in the recognition of hand-written digits. Yet there is also a need to apply machine learning to streams of data, i.e., online learning. One such approach is reinforcement learning, in which an agent seeks to maximize a cumulative reward signal it receives from the environment. An extension of this notion is multi-agent reinforcement learning, where many agents each seek to maximize their own cumulative reward signal, with the stipulation that each agent can observe only its own local observation and communication from other agents, which may have limited bandwidth; that is, the agents must act in a decentralized manner. In this chapter, we demonstrate a method, centralized expert supervises multi-agents (CESMA), that obtains decentralized multi-agents through a top-down approach: we first obtain a solution with a centralized controller, and then decentralize it using imitation learning.
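
To make the top-down idea concrete, here is a minimal, illustrative Python sketch of the decentralization step: an already-trained centralized expert relabels the states visited by the decentralized learners (in the spirit of DAgger-style imitation learning), and each agent then fits its own policy to the expert's actions using only its local observation. The ToyEnv, CentralizedExpert, and LinearPolicy classes below are hypothetical stand-ins for illustration, not the chapter's implementation.

# A minimal sketch of decentralizing a centralized expert via imitation learning.
# All environment/expert/policy classes are toy placeholders (assumptions).
import numpy as np

rng = np.random.default_rng(0)
N_AGENTS, OBS_DIM, N_ACTIONS = 2, 4, 3


class ToyEnv:
    """Stand-in multi-agent environment: random local observations per agent."""

    def reset(self):
        return [rng.normal(size=OBS_DIM) for _ in range(N_AGENTS)]

    def step(self, actions):
        # Dynamics are irrelevant for this sketch; just return fresh observations.
        return [rng.normal(size=OBS_DIM) for _ in range(N_AGENTS)], False


class CentralizedExpert:
    """Stand-in for a controller already trained on the joint observation."""

    def act(self, joint_obs):
        flat = np.concatenate(joint_obs)
        return [int(np.abs(flat).argmax() % N_ACTIONS) for _ in range(N_AGENTS)]


class LinearPolicy:
    """Decentralized learner: softmax-linear policy over one agent's local observation."""

    def __init__(self):
        self.W = np.zeros((OBS_DIM, N_ACTIONS))

    def act(self, obs):
        return int((obs @ self.W).argmax())

    def fit(self, obs_batch, act_batch, lr=0.1, epochs=50):
        # Plain softmax regression on (local observation, expert action) pairs.
        X, y = np.asarray(obs_batch), np.asarray(act_batch)
        for _ in range(epochs):
            logits = X @ self.W
            p = np.exp(logits - logits.max(axis=1, keepdims=True))
            p /= p.sum(axis=1, keepdims=True)
            p[np.arange(len(y)), y] -= 1.0          # gradient of cross-entropy
            self.W -= lr * X.T @ p / len(y)


env, expert = ToyEnv(), CentralizedExpert()
policies = [LinearPolicy() for _ in range(N_AGENTS)]
datasets = [([], []) for _ in range(N_AGENTS)]       # (local obs, expert action) per agent

for it in range(5):                                   # DAgger-style outer iterations
    joint_obs = env.reset()
    for t in range(20):
        # Learners act from local observations only (decentralized execution).
        learner_actions = [pi.act(o) for pi, o in zip(policies, joint_obs)]
        # The centralized expert relabels the visited joint state.
        expert_actions = expert.act(joint_obs)
        for i in range(N_AGENTS):
            datasets[i][0].append(joint_obs[i])
            datasets[i][1].append(expert_actions[i])
        joint_obs, _ = env.step(learner_actions)
    # Supervised step: each agent imitates the expert on its aggregated dataset.
    for pi, (obs_b, act_b) in zip(policies, datasets):
        pi.fit(obs_b, act_b)

print("trained", N_AGENTS, "decentralized policies by imitating the centralized expert")

The point the sketch tries to convey is that the supervised targets come from the centralized expert, while each learner's inputs are restricted to its own local observation; this restriction is what yields policies that can be executed in a decentralized manner.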



Author information

Correspondence to Alex Tong Lin.


Copyright information

© 2021 Springer Nature Switzerland AG

About this chapter


Cite this chapter

Lin, A.T., Montúfar, G., Osher, S.J. (2021). A Top-Down Approach to Attain Decentralized Multi-agents. In: Vamvoudakis, K.G., Wan, Y., Lewis, F.L., Cansever, D. (eds) Handbook of Reinforcement Learning and Control. Studies in Systems, Decision and Control, vol 325. Springer, Cham. https://doi.org/10.1007/978-3-030-60990-0_14
