
A Top-Down Approach to Attain Decentralized Multi-agents

Chapter in: Handbook of Reinforcement Learning and Control

Part of the book series: Studies in Systems, Decision and Control (SSDC, volume 325)


Abstract

In the modern age, with the enormous amounts of data being collected, machine learning has blossomed into a necessary tool in nearly all academic and industrial fields. In many cases, machine learning deals with a static dataset, e.g., a fixed training set, as in the recognition of hand-written digits. Yet there is also a need to apply machine learning to streams of data, i.e., online learning. One such approach is reinforcement learning, in which an agent seeks to maximize a cumulative reward signal it receives from the environment. An extension of this notion is multi-agent reinforcement learning, where many agents each seek to maximize their own cumulative reward signal, with the stipulation that each agent can observe only its own local observation and communication from other agents, which may have limited bandwidth; that is, the agents must act in a decentralized manner. In this chapter, we demonstrate a method, centralized expert supervises multi-agents (CESMA), that obtains decentralized multi-agents through a top-down approach: we first obtain a solution with a centralized controller, and then decentralize it using imitation learning.
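
To make the top-down idea concrete, here is a minimal, illustrative Python sketch of the decentralization step: an already-trained centralized expert relabels the states visited by the decentralized learners (in the spirit of DAgger-style imitation learning), and each agent then fits its own policy to the expert's actions using only its local observation. The ToyEnv, CentralizedExpert, and LinearPolicy classes below are hypothetical stand-ins for illustration, not the chapter's implementation.

# A minimal sketch of decentralizing a centralized expert via imitation learning.
# All environment/expert/policy classes are toy placeholders (assumptions).
import numpy as np

rng = np.random.default_rng(0)
N_AGENTS, OBS_DIM, N_ACTIONS = 2, 4, 3


class ToyEnv:
    """Stand-in multi-agent environment: random local observations per agent."""

    def reset(self):
        return [rng.normal(size=OBS_DIM) for _ in range(N_AGENTS)]

    def step(self, actions):
        # Dynamics are irrelevant for this sketch; just return fresh observations.
        return [rng.normal(size=OBS_DIM) for _ in range(N_AGENTS)], False


class CentralizedExpert:
    """Stand-in for a controller already trained on the joint observation."""

    def act(self, joint_obs):
        flat = np.concatenate(joint_obs)
        return [int(np.abs(flat).argmax() % N_ACTIONS) for _ in range(N_AGENTS)]


class LinearPolicy:
    """Decentralized learner: softmax-linear policy over one agent's local observation."""

    def __init__(self):
        self.W = np.zeros((OBS_DIM, N_ACTIONS))

    def act(self, obs):
        return int((obs @ self.W).argmax())

    def fit(self, obs_batch, act_batch, lr=0.1, epochs=50):
        # Plain softmax regression on (local observation, expert action) pairs.
        X, y = np.asarray(obs_batch), np.asarray(act_batch)
        for _ in range(epochs):
            logits = X @ self.W
            p = np.exp(logits - logits.max(axis=1, keepdims=True))
            p /= p.sum(axis=1, keepdims=True)
            p[np.arange(len(y)), y] -= 1.0          # gradient of cross-entropy
            self.W -= lr * X.T @ p / len(y)


env, expert = ToyEnv(), CentralizedExpert()
policies = [LinearPolicy() for _ in range(N_AGENTS)]
datasets = [([], []) for _ in range(N_AGENTS)]       # (local obs, expert action) per agent

for it in range(5):                                   # DAgger-style outer iterations
    joint_obs = env.reset()
    for t in range(20):
        # Learners act from local observations only (decentralized execution).
        learner_actions = [pi.act(o) for pi, o in zip(policies, joint_obs)]
        # The centralized expert relabels the visited joint state.
        expert_actions = expert.act(joint_obs)
        for i in range(N_AGENTS):
            datasets[i][0].append(joint_obs[i])
            datasets[i][1].append(expert_actions[i])
        joint_obs, _ = env.step(learner_actions)
    # Supervised step: each agent imitates the expert on its aggregated dataset.
    for pi, (obs_b, act_b) in zip(policies, datasets):
        pi.fit(obs_b, act_b)

print("trained", N_AGENTS, "decentralized policies by imitating the centralized expert")

The point the sketch tries to convey is that the supervised targets come from the centralized expert, while each learner's inputs are restricted to its own local observation; this restriction is what yields policies that can be executed in a decentralized manner.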



Author information

Correspondence to Alex Tong Lin.


Copyright information

© 2021 Springer Nature Switzerland AG

About this chapter


Cite this chapter

Lin, A.T., Montúfar, G., Osher, S.J. (2021). A Top-Down Approach to Attain Decentralized Multi-agents. In: Vamvoudakis, K.G., Wan, Y., Lewis, F.L., Cansever, D. (eds) Handbook of Reinforcement Learning and Control. Studies in Systems, Decision and Control, vol 325. Springer, Cham. https://doi.org/10.1007/978-3-030-60990-0_14
