Abstract
In cooperative multi-agent reinforcement learning (MARL), where agents have access only to partial observations, efficiently leveraging local information is critical. Through long-term observation, agents can build awareness of their teammates to alleviate the restriction of partial observability. However, previous MARL methods usually neglect learning such awareness from local information to improve collaboration. To address this problem, we propose a novel framework, multi-agent local information decomposition for awareness of teammates (LINDA), with which agents learn to decompose local information and build awareness of each teammate. We model awareness as random variables and perform representation learning to ensure the informativeness of awareness representations by maximizing the mutual information between an agent's awareness and the actual trajectory of the corresponding teammate. LINDA is agnostic to specific algorithms and can be flexibly integrated with different MARL methods. Extensive experiments show that the proposed framework learns informative awareness from local partial observations, enables better collaboration, and significantly improves learning performance, especially on challenging tasks.
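To make the decomposition idea concrete, the following is a minimal numpy sketch, not the paper's implementation: an agent's local hidden state is mapped to a mean and log-variance for one stochastic awareness vector per teammate, sampled with the reparameterization trick; a diagonal-Gaussian log-density stands in for the variational term used in a mutual-information lower bound. All names, dimensions, and the linear heads (`W_mu`, `W_logvar`) are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

N_AGENTS = 3   # team size (hypothetical)
HID_DIM = 8    # local hidden-state size (hypothetical)
AW_DIM = 4     # per-teammate awareness size (hypothetical)

# Hypothetical linear "decomposition" heads: map agent i's local hidden
# state to a Gaussian (mean, log-variance) for each teammate's awareness.
W_mu = rng.normal(size=(N_AGENTS, HID_DIM, AW_DIM))
W_logvar = rng.normal(scale=0.1, size=(N_AGENTS, HID_DIM, AW_DIM))

def decompose(h):
    """Split local information h into N_AGENTS stochastic awareness vectors."""
    mu = np.einsum('d,jdk->jk', h, W_mu)        # (N_AGENTS, AW_DIM)
    logvar = np.einsum('d,jdk->jk', h, W_logvar)
    # Reparameterized sample: awareness is a random variable, not a point.
    eps = rng.normal(size=mu.shape)
    return mu + np.exp(0.5 * logvar) * eps, mu, logvar

def gaussian_log_prob(x, mu, logvar):
    """Diagonal-Gaussian log-density; in a variational MI lower bound this
    plays the role of log q(teammate trajectory embedding | awareness)."""
    return -0.5 * np.sum(logvar + (x - mu) ** 2 / np.exp(logvar)
                         + np.log(2 * np.pi))

h_i = rng.normal(size=HID_DIM)                  # agent i's local hidden state
awareness, mu, logvar = decompose(h_i)          # one vector per teammate
```

In training, maximizing the log-density of each teammate's actual trajectory embedding under the predicted awareness distribution would tighten the mutual-information bound; here it only illustrates where that term enters.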
Acknowledgements
This work was supported by National Natural Science Foundation of China (Grant No. 61773198).
About this article
Cite this article
Cao, J., Yuan, L., Wang, J. et al. LINDA: multi-agent local information decomposition for awareness of teammates. Sci. China Inf. Sci. 66, 182101 (2023). https://doi.org/10.1007/s11432-021-3479-9