LINDA: multi-agent local information decomposition for awareness of teammates

  • Research Paper
  • Published in Science China Information Sciences

Abstract

In cooperative multi-agent reinforcement learning (MARL), where agents have access only to partial observations, efficiently leveraging local information is critical. By observing over long horizons, agents can build awareness of their teammates to alleviate the restrictions of partial observability. However, previous MARL methods usually neglect learning such awareness from local information, missing an opportunity for better collaboration. To address this problem, we propose a novel framework, multi-agent local information decomposition for awareness of teammates (LINDA), with which agents learn to decompose their local information and build awareness of each teammate. We model awareness as random variables and ensure that the learned representations are informative by maximizing the mutual information between each awareness representation and the actual trajectory of the corresponding teammate. LINDA is agnostic to the underlying algorithm and can be flexibly integrated with different MARL methods. Extensive experiments show that the proposed framework learns informative awareness from local partial observations, enabling better collaboration, and significantly improves learning performance, especially on challenging tasks.
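
The paper's implementation is not reproduced on this page, but the abstract's core mechanism — decomposing an agent's local trajectory encoding into one stochastic awareness representation per teammate, trained by maximizing a mutual-information objective — can be sketched. The minimal PyTorch sketch below is illustrative only: the names AwarenessEncoder, awareness_mi_loss, n_agents, and the trajectory decoder are hypothetical, and the VIB-style variational bound (reconstruction term plus KL regularizer) is a standard stand-in for maximizing mutual information, not necessarily the paper's exact objective.

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class AwarenessEncoder(nn.Module):
        """Decompose an agent's local trajectory encoding into one stochastic
        awareness representation per teammate (illustrative sketch only)."""

        def __init__(self, hidden_dim: int, n_agents: int, z_dim: int):
            super().__init__()
            # One Gaussian head per teammate, emitting a mean and a log-variance.
            self.heads = nn.ModuleList(
                [nn.Linear(hidden_dim, 2 * z_dim) for _ in range(n_agents)]
            )

        def forward(self, h: torch.Tensor):
            # h: (batch, hidden_dim), e.g. a GRU state summarizing local history.
            mus, logvars, zs = [], [], []
            for head in self.heads:
                mu, logvar = head(h).chunk(2, dim=-1)
                # Reparameterization trick keeps the sampling step differentiable.
                z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)
                mus.append(mu); logvars.append(logvar); zs.append(z)
            # Each returned tensor has shape (batch, n_agents, z_dim).
            return torch.stack(zs, 1), torch.stack(mus, 1), torch.stack(logvars, 1)

    def awareness_mi_loss(z, mu, logvar, teammate_traj, decoder, beta=1e-3):
        """Variational lower bound on I(awareness; teammate trajectory):
        reconstruct each teammate's trajectory encoding from the sampled
        awareness, with a KL term toward a standard-normal prior (VIB-style)."""
        recon = decoder(z)  # hypothetical decoder: (batch, n_agents, traj_dim)
        recon_loss = F.mse_loss(recon, teammate_traj)
        kl = -0.5 * (1.0 + logvar - mu.pow(2) - logvar.exp()).mean()
        return recon_loss + beta * kl

In a value-decomposition pipeline such as QMIX, each agent would concatenate its sampled awareness vectors with its hidden state before the individual utility head, and a loss like awareness_mi_loss would be added to the TD objective as an auxiliary term; the paper itself specifies the exact wiring and objective.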

Acknowledgements

This work was supported by the National Natural Science Foundation of China (Grant No. 61773198).

Author information

Correspondence to De-Chuan Zhan.

About this article

Cite this article

Cao, J., Yuan, L., Wang, J. et al. LINDA: multi-agent local information decomposition for awareness of teammates. Sci. China Inf. Sci. 66, 182101 (2023). https://doi.org/10.1007/s11432-021-3479-9
