Abstract
Centralized training with decentralized execution has become a standard setting for multi-agent reinforcement learning. As the number of agents grows, actors that rely only on their own local observations, even when paired with centralized critics, become a performance bottleneck in complex scenarios. Recent research has shown that agents can learn when to communicate so that information is shared efficiently, i.e., agents communicate with each other at the right time during the execution phase to complete cooperative tasks. In this paper, we therefore propose a model that learns when to communicate with the support of a centralized critic, so that each agent can adaptively control its communication guided by a critic trained on global environmental information. Experiments in a cooperative scenario demonstrate the advantages of the model: agents learn to block communication at appropriate times under the centralized-critic setting and cooperate with each other on the task.
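As a rough illustration of the gating idea the abstract describes (not the authors' implementation), the sketch below shows actors that decide from their own local observations whether to broadcast a message, with silent agents excluded from the pooled message, and a centralized critic that scores the joint state-action pair during training only. All class and function names, the linear encoders, and the hard-threshold gate are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

class GatedCommAgent:
    """Actor that decides, from its local observation alone, whether to communicate."""

    def __init__(self, obs_dim, msg_dim):
        # Linear stand-ins for networks that would be learned in practice.
        self.w_gate = rng.normal(size=obs_dim) * 0.1          # gate scorer
        self.w_msg = rng.normal(size=(msg_dim, obs_dim)) * 0.1  # message encoder

    def gate(self, obs):
        # Hard threshold on a learned score: 1.0 = broadcast, 0.0 = stay silent.
        return float(self.w_gate @ obs > 0.0)

    def message(self, obs):
        return self.w_msg @ obs

def communication_round(agents, observations):
    """One execution-phase round: only agents whose gate is open contribute,
    and everyone receives the mean of the broadcast messages."""
    gates, msgs = [], []
    for agent, obs in zip(agents, observations):
        g = agent.gate(obs)
        gates.append(g)
        msgs.append(g * agent.message(obs))  # closed gate contributes zeros
    n_open = max(sum(gates), 1.0)            # avoid division by zero
    pooled = sum(msgs) / n_open              # mean over broadcasting agents
    return gates, pooled

def centralized_critic(joint_obs, joint_actions, w):
    """Training-time value estimate over global information; discarded at execution."""
    x = np.concatenate([*joint_obs, *joint_actions])
    return w @ x
```

At execution time each actor needs only its own observation and the pooled message, which is what makes the scheme decentralized; the critic, which sees every agent's observation and action, exists only to provide learning signal during centralized training.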
Acknowledgments
This work was supported in part by the National Key Research and Development Program of China (No. 2017YFB1001902), the National Natural Science Foundation of China (No. 61876151, 62032018) and the Fundamental Research Funds for the Central Universities (No. 3102019DX1005).
Copyright information
© 2022 Springer Nature Singapore Pte Ltd.
Cite this paper
Sun, Q., Yao, Y., Yi, P., Zhou, X., Yang, G. (2022). Learning When to Communicate Among Actors with the Centralized Critic for the Multi-agent System. In: Sun, Y., et al. Computer Supported Cooperative Work and Social Computing. ChineseCSCW 2021. Communications in Computer and Information Science, vol 1492. Springer, Singapore. https://doi.org/10.1007/978-981-19-4549-6_11
Publisher Name: Springer, Singapore
Print ISBN: 978-981-19-4548-9
Online ISBN: 978-981-19-4549-6