Improving Multi-agent Reinforcement Learning with Imperfect Human Knowledge

Han, Xiaoxu; Tang, Hongyao; Li, Yuan; Kou, Guang; Liu, Leilei

doi:10.1007/978-3-030-61616-8_30

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 12397)

Included in the following conference series:

International Conference on Artificial Neural Networks

Abstract

Multi-agent reinforcement learning has achieved great success in many decision-making tasks. However, challenges such as inefficient exploration and long training times remain, and they are major obstacles to applying it in the real world. Incorporating human knowledge into the learning process is regarded as a promising way to mitigate these problems. This paper proposes a novel approach that uses imperfect human knowledge to improve the performance of multi-agent reinforcement learning. We leverage logic rules, a popular form of human knowledge, as part of the action space in reinforcement learning. During trial-and-error learning, the values of both the rules and the original actions are estimated, so logic rules can be selected flexibly and efficiently to assist learning. Moreover, we design a new exploration scheme in which rules are preferentially explored in the early training stage. Finally, we evaluate and analyze our approach in challenging StarCraft II micromanagement scenarios. The empirical results show that our approach outperforms the state-of-the-art multi-agent reinforcement learning method in both final performance and learning speed.
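The idea of treating rules as extra entries in the action space, with exploration biased toward rules early in training, can be illustrated with a minimal single-agent sketch. This is not the authors' implementation: the function names, the linear decay schedule, the parameter values, and the example rule `low_health_retreat` are all assumptions made for illustration, and the value estimates are passed in as a plain list rather than produced by a learned network.

```python
import random
from typing import Callable, List, Sequence

# Hypothetical form of a logic rule: it maps a state to a primitive action index.
Rule = Callable[[Sequence[float]], int]


def select_option(q_values: Sequence[float], n_primitive: int, step: int,
                  rng: random.Random, epsilon: float = 0.1,
                  rule_bias_start: float = 0.9, decay_steps: int = 1000) -> int:
    """Pick an index in the augmented action space [primitive actions | rules].

    Indices below n_primitive are original actions; the remaining indices are rules.
    """
    n_total = len(q_values)
    if rng.random() < epsilon:
        # Assumed schedule: the chance of exploring only among rules
        # decays linearly from rule_bias_start to 0 over decay_steps.
        rule_bias = rule_bias_start * max(0.0, 1.0 - step / decay_steps)
        if n_total > n_primitive and rng.random() < rule_bias:
            return rng.randrange(n_primitive, n_total)
        return rng.randrange(n_total)
    # Greedy choice over actions and rules alike, so a poorly valued
    # (imperfect) rule is simply not selected.
    return max(range(n_total), key=lambda i: q_values[i])


def resolve(option: int, n_primitive: int, rules: List[Rule],
            state: Sequence[float]) -> int:
    """Turn a selected option into the primitive action to execute."""
    if option < n_primitive:
        return option
    return rules[option - n_primitive](state)


def low_health_retreat(state: Sequence[float]) -> int:
    """Hypothetical rule: retreat (action 0) if own health < 0.3, else attack (action 1)."""
    own_health = state[0]
    return 0 if own_health < 0.3 else 1


if __name__ == "__main__":
    rng = random.Random(0)
    q = [0.1, 0.5, 0.2]  # two primitive actions followed by one rule
    option = select_option(q, n_primitive=2, step=0, rng=rng)
    print(resolve(option, 2, [low_health_retreat], state=[0.2, 5.0]))
```

Because a rule occupies the same value table as the original actions, its estimated value decides how often it is used, which is one way to tolerate knowledge that is only sometimes correct.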

Notes

1. https://github.com/oxwhirl/smac

Author information

Authors and Affiliations

College of Intelligence and Computing, Tianjin University, Tianjin, 300350, China
Xiaoxu Han, Hongyao Tang & Leilei Liu
Artificial Intelligence Research Center, National Innovation Institute of Defense Technology, Beijing, 100072, China
Xiaoxu Han & Guang Kou
Academy of Military Sciences, Beijing, 100091, China
Yuan Li

Corresponding authors

Correspondence to Yuan Li or Guang Kou.

Editor information

Editors and Affiliations

Department of Applied Informatics, Comenius University in Bratislava, Bratislava, Slovakia
Igor Farkaš
Department of Applied Mathematics and Computer Science, Technical University of Denmark, Kgs. Lyngby, Denmark
Paolo Masulli
Department of Informatics, University of Hamburg, Hamburg, Germany
Stefan Wermter

About this paper

Cite this paper

Han, X., Tang, H., Li, Y., Kou, G., Liu, L. (2020). Improving Multi-agent Reinforcement Learning with Imperfect Human Knowledge. In: Farkaš, I., Masulli, P., Wermter, S. (eds) Artificial Neural Networks and Machine Learning – ICANN 2020. ICANN 2020. Lecture Notes in Computer Science, vol 12397. Springer, Cham. https://doi.org/10.1007/978-3-030-61616-8_30

DOI: https://doi.org/10.1007/978-3-030-61616-8_30
Published: 14 October 2020
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-61615-1
Online ISBN: 978-3-030-61616-8
eBook Packages: Computer Science, Computer Science (R0)
