Cooperative Multi-Agent Reinforcement Learning with Dynamic Target Localization: A Reward Sharing Approach

  • Conference paper
AI 2023: Advances in Artificial Intelligence (AI 2023)

Abstract

Cooperation in multi-agent reinforcement learning (MARL) facilitates the acquisition of complex problem-solving skills and promotes more efficient and effective decision-making among agents. Numerous strategies for cooperative learning in MARL exist, including joint action learning, task decomposition, role assignment, and communication protocols. However, deploying these strategies in a complex and dynamic environment remains challenging. To address such challenges, we propose a technique that uses reward sharing to enhance cooperation in partially observable multi-agent environments. As an extension of reward shaping, reward sharing allows agents to work together towards a global objective while still pursuing their local objectives. This approach can foster cooperation and reduce competition between agents without explicit communication, ultimately leading to faster learning and better performance. This study compares three reward sharing techniques, the Performance Incentive (PI), the Observer’s Share (OS), and the Synergy Achievement (SA), in the context of dynamic target localization, focusing on simulation studies. The proposed reward sharing techniques are then evaluated under the effects of objective prioritization, varying agent counts, and a range of map sizes. The results show that the proposed reward sharing techniques enhance agent performance, that increasing the number of agents leads to higher rewards, and that average rewards are negatively correlated with map size.
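The abstract does not define the PI, OS, or SA schemes, so the following is only a minimal sketch of the general idea of reward sharing: each agent's reward is blended with a team-level signal so that local and global objectives are both represented. The function name `share_rewards`, the mixing weight `share`, and the use of the team mean as the global signal are illustrative assumptions, not the authors' formulation.

```python
from typing import List


def share_rewards(local_rewards: List[float], share: float = 0.5) -> List[float]:
    """Blend each agent's local reward with the team mean (hypothetical scheme).

    share = 0.0 keeps purely local rewards; share = 1.0 gives every agent
    the team mean. This is an assumed illustration, not PI, OS, or SA.
    """
    if not 0.0 <= share <= 1.0:
        raise ValueError("share must be in [0, 1]")
    if not local_rewards:
        return []
    team_mean = sum(local_rewards) / len(local_rewards)
    # Each agent keeps (1 - share) of its own reward and receives
    # `share` times the team mean as the cooperative component.
    return [(1.0 - share) * r + share * team_mean for r in local_rewards]


if __name__ == "__main__":
    # One agent located the target this step; the other two did not.
    print(share_rewards([1.0, 0.0, 0.0], share=0.5))
```

Under this assumed scheme the total team reward is unchanged by the blending; only its distribution across agents shifts, so agents that did not locate the target still receive a learning signal.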

Author information

Corresponding author

Correspondence to Helani Wickramaarachchi.

Copyright information

© 2024 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper

Wickramaarachchi, H., Kirley, M., Geard, N. (2024). Cooperative Multi-Agent Reinforcement Learning with Dynamic Target Localization: A Reward Sharing Approach. In: Liu, T., Webb, G., Yue, L., Wang, D. (eds) AI 2023: Advances in Artificial Intelligence. AI 2023. Lecture Notes in Computer Science, vol 14472. Springer, Singapore. https://doi.org/10.1007/978-981-99-8391-9_25

  • DOI: https://doi.org/10.1007/978-981-99-8391-9_25

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-99-8390-2

  • Online ISBN: 978-981-99-8391-9

  • eBook Packages: Computer Science, Computer Science (R0)
