
Investigating Partner Diversification Methods in Cooperative Multi-agent Deep Reinforcement Learning

Part of the Communications in Computer and Information Science book series (CCIS, volume 1333)

Abstract

Overfitting to learning partners is a known problem in multi-agent reinforcement learning (MARL) due to the co-evolution of learning agents. Previous works explicitly add diversity to learning partners to mitigate this problem. However, since there are many approaches to introducing diversity, it is unclear which one should be used under which circumstances. In this work, we clarify the situation and show that widely used methods such as partner sampling and population-based training are unreliable at introducing diversity in fully cooperative multi-agent Markov decision processes. We find that generating pre-trained partners is a simple yet effective procedure for achieving diversity. Finally, we highlight the impact of diversified learning partners on the generalization of learning agents, using cross-play and ad-hoc team performance as evaluation metrics.
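The cross-play evaluation mentioned above can be illustrated with a minimal toy sketch. Everything here is an illustrative assumption, not the paper's code: we stand in for self-play training with a lever-coordination game in which each independent run converges to an arbitrary shared convention, then pair policies from different runs to measure zero-shot coordination.

```python
import random

def selfplay_train(n_actions=5, seed=0):
    # Stand-in for an independent self-play run: each run settles on an
    # arbitrary convention (here, a preferred lever index). Distinct seeds
    # model distinct pre-trained partners.
    rng = random.Random(seed)
    return rng.randrange(n_actions)

def crossplay_score(policy_a, policy_b):
    # Reward 1 if both agents follow the same convention, else 0.
    return 1.0 if policy_a == policy_b else 0.0

# Generate a pool of pre-trained partners from independent runs.
pool = [selfplay_train(seed=s) for s in range(8)]

# Cross-play matrix: diagonal entries pair a policy with itself
# (self-play score); off-diagonal entries pair policies that never
# trained together (zero-shot coordination).
matrix = [[crossplay_score(a, b) for b in pool] for a in pool]
n = len(pool)
diag = sum(matrix[i][i] for i in range(n)) / n
offdiag = (sum(sum(row) for row in matrix) - diag * n) / (n * n - n)
print(diag, offdiag)
```

In this toy setting the diagonal is always perfect, while the off-diagonal average drops whenever independent runs converge to incompatible conventions, which is the gap that cross-play is designed to expose.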

Keywords

  • Coordination
  • Deep reinforcement learning
  • Multi-agent system
  • Generalization




Corresponding author

Correspondence to Nat Dilokthanakul.


Copyright information

© 2020 Springer Nature Switzerland AG

About this paper


Cite this paper

Charakorn, R., Manoonpong, P., Dilokthanakul, N. (2020). Investigating Partner Diversification Methods in Cooperative Multi-agent Deep Reinforcement Learning. In: Yang, H., Pasupa, K., Leung, A.C.S., Kwok, J.T., Chan, J.H., King, I. (eds) Neural Information Processing. ICONIP 2020. Communications in Computer and Information Science, vol 1333. Springer, Cham. https://doi.org/10.1007/978-3-030-63823-8_46


  • DOI: https://doi.org/10.1007/978-3-030-63823-8_46

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-63822-1

  • Online ISBN: 978-3-030-63823-8

  • eBook Packages: Computer Science (R0)