Abstract
In the modern age, with enormous amounts of data being collected, machine learning has become a necessary tool in nearly all academic and industrial fields. In many cases, machine learning operates on a static dataset, i.e., a fixed training set, as in the recognition of hand-written digits. Yet there is also a need to apply machine learning to streams of data, i.e., online learning. One approach is reinforcement learning, in which an agent seeks to maximize a cumulative reward signal it receives from the environment. An extension of this notion is multi-agent reinforcement learning, where many agents each seek to maximize their own cumulative reward signal, with the stipulation that each agent can observe only its own local observation and communication from other agents, which may have limited bandwidth; that is, the agents must act in a decentralized manner. In this chapter, we demonstrate a method, centralized expert supervises multi-agents (CESMA), that obtains decentralized multi-agents through a top-down approach: we first obtain a solution with a centralized controller, and then decentralize it using imitation learning.
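The top-down recipe the abstract describes can be sketched in a few lines. The following toy example is a hypothetical illustration, not the chapter's implementation: a centralized "expert" with access to the joint observation labels the states that the decentralized learners visit (a DAgger-style aggregation loop, cf. Ross et al.), and each agent is refit by supervised learning on its local observation only. The dynamics, the expert rule (move each agent toward the group mean), and the lookup-table policies are all invented for illustration.

```python
import random

random.seed(0)
N_AGENTS = 2

def expert(joint_obs):
    # Hypothetical centralized expert: with full joint information, it tells
    # each agent to step toward the mean of all agents' positions.
    mean = sum(joint_obs) / len(joint_obs)
    return [1 if mean > pos else -1 for pos in joint_obs]

def learner_action(table, local_obs):
    # Decentralized policy: a lookup table over this agent's *local*
    # observation only; unseen observations get a random action.
    return table.get(local_obs, random.choice([-1, 1]))

# DAgger-style decentralization loop: roll out the current learners, query
# the centralized expert for labels on the visited states, aggregate the
# per-agent datasets, and refit each decentralized policy.
tables = [dict() for _ in range(N_AGENTS)]
dataset = [[] for _ in range(N_AGENTS)]  # per-agent (local_obs, expert_label) pairs
for _ in range(20):
    joint = [random.randint(-3, 3) for _ in range(N_AGENTS)]
    for _ in range(10):
        labels = expert(joint)  # expert labels the state the learners visited
        for i in range(N_AGENTS):
            dataset[i].append((joint[i], labels[i]))
        # Step the toy dynamics using the *learners'* actions, so the
        # aggregated dataset covers on-policy states.
        joint = [pos + learner_action(tables[i], pos)
                 for i, pos in enumerate(joint)]
    # "Refit" each agent: majority expert label per local observation.
    for i in range(N_AGENTS):
        votes = {}
        for obs, act in dataset[i]:
            votes.setdefault(obs, []).append(act)
        tables[i] = {obs: max(set(a), key=a.count) for obs, a in votes.items()}
```

Because each table is keyed only on the agent's local observation, the learned policies are executable without the centralized controller; the gap between the expert's joint-information labels and what is recoverable from local observations is exactly the partial-observability difficulty the chapter addresses.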
References
Buşoniu, L., Babuška, R., De Schutter, B.: A comprehensive survey of multiagent reinforcement learning. IEEE Trans. Syst. Man Cybern. Part C (Appl. Rev.) 38(2), 156–172 (2008)
Dobbe, R., Fridovich-Keil, D., Tomlin, C.: Fully decentralized policies for multi-agent systems: an information theoretic approach. In: Guyon, I., Luxburg, U.V., Bengio, S., Wallach, H., Fergus, R., Vishwanathan, S., Garnett, R. (eds.) Advances in Neural Information Processing Systems, vol. 30, pp. 2941–2950. Curran Associates, Inc., New York (2017)
Evans, R., Gao, J.: DeepMind AI reduces Google data centre cooling bill by 40%. DeepMind Blog (2017)
Kober, J., Bagnell, J.A., Peters, J.: Reinforcement learning in robotics: a survey. Int. J. Robot. Res. 32(11), 1238–1274 (2013)
Kraemer, L., Banerjee, B.: Multi-agent reinforcement learning as a rehearsal for decentralized planning. Neurocomputing 190, 82–94 (2016)
Levine, S., Finn, C., Darrell, T., Abbeel, P.: End-to-end training of deep visuomotor policies. J. Mach. Learn. Res. 17(1), 1334–1373 (2016)
Lillicrap, T.P., Hunt, J.J., Pritzel, A., Heess, N., Erez, T., Tassa, Y., Silver, D., Wierstra, D.: Continuous control with deep reinforcement learning. CoRR (2015). arXiv:1509.02971
Lin, A.T., Debord, M.J., Estabridis, K., Hewer, G.A., Osher, S.J.: CESMA: centralized expert supervises multi-agents. CoRR (2019). arXiv:1902.02311
Littman, M.L.: Markov games as a framework for multi-agent reinforcement learning. In: Cohen, W.W., Hirsh, H. (eds.) Machine Learning Proceedings 1994, pp. 157–163. Morgan Kaufmann, San Francisco (1994)
Lowe, R., Wu, Y., Tamar, A., Harb, J., Abbeel, P., Mordatch, I.: Multi-agent actor-critic for mixed cooperative-competitive environments. CoRR (2017). arXiv:1706.02275
Matignon, L., Laurent, G.J., Le Fort Piat, N.: Review: independent reinforcement learners in cooperative Markov games: a survey regarding coordination problems. Knowl. Eng. Rev. 27(1), 1–31 (2012)
Mnih, V., Kavukcuoglu, K., Silver, D., Graves, A., Antonoglou, I., Wierstra, D., Riedmiller, M.A.: Playing Atari with deep reinforcement learning. CoRR (2013). arXiv:1312.5602
Mnih, V., Kavukcuoglu, K., Silver, D., Rusu, A.A., Veness, J., Bellemare, M.G., Graves, A., Riedmiller, M., Fidjeland, A.K., Ostrovski, G., et al.: Human-level control through deep reinforcement learning. Nature 518(7540), 529 (2015)
Oliehoek, F.A.: Decentralized POMDPs. In: Wiering, M., van Otterlo, M. (eds.) Reinforcement Learning: State-of-the-Art, pp. 471–503. Springer, Berlin (2012)
Panait, L., Luke, S.: Cooperative multi-agent learning: the state of the art. Auton. Agents Multi-agent Syst. 11(3), 387–434 (2005)
Paulos, J., Chen, S.W., Shishika, D., Kumar, V.: Decentralization of multiagent policies by learning what to communicate (2018)
Ross, S., Bagnell, D.: Efficient reductions for imitation learning. In: Proceedings of the Thirteenth International Conference on Artificial Intelligence and Statistics, pp. 661–668 (2010)
Ross, S., Gordon, G.J., Bagnell, J.A.: A reduction of imitation learning and structured prediction to no-regret online learning. CoRR (2010). arXiv:1011.0686
Rummery, G.A., Niranjan, M.: On-line Q-learning using connectionist systems, vol. 37. University of Cambridge, Department of Engineering Cambridge, England (1994)
Schaal, S., Ijspeert, A., Billard, A.: Computational approaches to motor learning by imitation. Philos. Trans. R. Soc. Lond. Ser. B: Biol. Sci. 358(1431), 537–547 (2003)
Shoham, Y., Leyton-Brown, K.: Multiagent Systems: Algorithmic, Game-Theoretic, and Logical Foundations. Cambridge University Press, New York (2008)
Silver, D., Huang, A., Maddison, C.J., Guez, A., Sifre, L., Van Den Driessche, G., Schrittwieser, J., Antonoglou, I., Panneershelvam, V., Lanctot, M., et al.: Mastering the game of go with deep neural networks and tree search. Nature 529(7587), 484 (2016)
Silver, D., Schrittwieser, J., Simonyan, K., Antonoglou, I., Huang, A., Guez, A., Hubert, T., Baker, L., Lai, M., Bolton, A., et al.: Mastering the game of go without human knowledge. Nature 550(7676), 354 (2017)
Sutton, R.S.: Learning to predict by the methods of temporal differences. Mach. Learn. 3(1), 9–44 (1988)
Sutton, R.S., Barto, A.G.: Reinforcement Learning: An Introduction. MIT Press, Cambridge (2018)
Tan, M.: Multi-agent reinforcement learning: independent versus cooperative agents. In: Huhns, M.N., Singh, M.P. (eds.) Readings in Agents, pp. 487–494. Morgan Kaufmann Publishers Inc., San Francisco (1998)
Wang, Z., Schaul, T., Hessel, M., Van Hasselt, H., Lanctot, M., De Freitas, N.: Dueling network architectures for deep reinforcement learning (2015). arXiv:1511.06581
Watkins, C.J.C.H., Dayan, P.: Q-learning. Mach. Learn. 8(3), 279–292 (1992)
Williams, R.J.: Simple statistical gradient-following algorithms for connectionist reinforcement learning. Mach. Learn. 8(3–4), 229–256 (1992)
© 2021 Springer Nature Switzerland AG
Cite this chapter
Lin, A.T., Montúfar, G., Osher, S.J. (2021). A Top-Down Approach to Attain Decentralized Multi-agents. In: Vamvoudakis, K.G., Wan, Y., Lewis, F.L., Cansever, D. (eds) Handbook of Reinforcement Learning and Control. Studies in Systems, Decision and Control, vol 325. Springer, Cham. https://doi.org/10.1007/978-3-030-60990-0_14
Print ISBN: 978-3-030-60989-4
Online ISBN: 978-3-030-60990-0