Multi-agent Deep Reinforcement Learning for Pursuit-Evasion Game Scalability

Xu, Lin; Hu, Bin; Guan, Zhihong; Cheng, Xinming; Li, Tao; Xiao, Jiangwen

doi:10.1007/978-981-32-9682-4_69

Part of the book series: Lecture Notes in Electrical Engineering (LNEE, volume 592)

Included in the following conference series:

Chinese Intelligent Systems Conference

Abstract

In a pursuit-evasion game, the pursuers can usually capture the evaders when the deployment environment is similar to the one on which the pursuers were trained. However, when some pursuers break down or new pursuers join, the number of agents in practice differs from the number used in training; in other words, the environment has changed. In a multi-agent deep reinforcement learning algorithm, this means that the input and output dimensions of the network have changed, so the trained pursuers may fail to capture the evaders in a real-world application. To solve this problem, we propose a multi-agent reinforcement learning framework that allows the pursuers to still capture the evaders when the number of pursuers changes. Based on the deep deterministic policy gradient (DDPG) framework and a bidirectional recurrent neural network (Bi-RNN), we propose a scalable deep reinforcement learning method for pursuit-evasion games and apply it to a multi-agent pursuit-evasion game in a 2D dynamic environment. In this game, the evaders are faster than the pursuers, but there are fewer evaders than pursuers. Our experimental results show that this algorithm improves the scalability and stability of multi-agent pursuit-evasion games.
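
The scalability argument in the abstract can be illustrated with a small sketch. It is not the authors' implementation; it only shows, under stated assumptions, why a recurrent network that runs over the agent axis does not fix the number of agents in its weight shapes. The function names (`init_params`, `bi_rnn_actor`), the plain tanh recurrence, and the dimensions (4-dimensional observations, 2-dimensional actions, 8 hidden units) are all hypothetical choices made for illustration, and no DDPG training loop or critic is included.

```python
import numpy as np

def init_params(obs_dim, hidden_dim, act_dim, seed=0):
    """Random weights for a toy bidirectional RNN actor (illustrative only)."""
    rng = np.random.default_rng(seed)

    def w(*shape):
        return rng.normal(scale=0.1, size=shape)

    # No weight shape depends on the number of agents.
    return {
        "W_in_f": w(obs_dim, hidden_dim), "W_h_f": w(hidden_dim, hidden_dim),
        "W_in_b": w(obs_dim, hidden_dim), "W_h_b": w(hidden_dim, hidden_dim),
        "W_out": w(2 * hidden_dim, act_dim),
    }

def bi_rnn_actor(params, obs):
    """Map per-agent observations (n_agents, obs_dim) to actions (n_agents, act_dim)."""
    n_agents = obs.shape[0]
    hidden_dim = params["W_h_f"].shape[0]
    h_f = np.zeros((n_agents, hidden_dim))
    h_b = np.zeros((n_agents, hidden_dim))

    # Forward pass over the agent axis: agent i sees a summary of agents 0..i-1.
    h = np.zeros(hidden_dim)
    for i in range(n_agents):
        h = np.tanh(obs[i] @ params["W_in_f"] + h @ params["W_h_f"])
        h_f[i] = h

    # Backward pass over the agent axis: agent i sees a summary of agents i+1..n-1.
    h = np.zeros(hidden_dim)
    for i in reversed(range(n_agents)):
        h = np.tanh(obs[i] @ params["W_in_b"] + h @ params["W_h_b"])
        h_b[i] = h

    # Each agent's action depends on both directions; tanh bounds it to [-1, 1].
    return np.tanh(np.concatenate([h_f, h_b], axis=1) @ params["W_out"])

# The same parameters serve teams of different sizes.
params = init_params(obs_dim=4, hidden_dim=8, act_dim=2)
rng = np.random.default_rng(1)
actions_3 = bi_rnn_actor(params, rng.normal(size=(3, 4)))
actions_5 = bi_rnn_actor(params, rng.normal(size=(5, 4)))
print(actions_3.shape, actions_5.shape)
```

A fully connected actor over the concatenated observations of all agents would need a new input layer whenever an agent is added or removed, whereas here the loops simply run for more or fewer steps with the same weights. Whether such a policy still captures the evaders after the team size changes is an empirical question that this sketch does not address.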

This work was supported in part by the National Natural Science Foundation of China under Grants 61672245, 61873287, 61672112, 61572210, and 61633011.

Author information

Authors and Affiliations

School of Artificial Intelligence and Automation, Huazhong University of Science and Technology, Wuhan, 430074, China
Lin Xu, Zhihong Guan & Jiangwen Xiao
Wuhan National Laboratory for Optoelectronics, Huazhong University of Science and Technology, Wuhan, 430074, China
Bin Hu
School of Information Science and Engineering, Central South University, Changsha, 410083, China
Xinming Cheng
School of Electronics and Information, Yangtze University, Jingzhou, 434023, China
Tao Li

Corresponding author

Correspondence to Bin Hu.

Editor information

Editors and Affiliations

Beihang University, Beijing, China
Yingmin Jia
Beijing University of Posts and Telecommunications, Beijing, China
Junping Du
University of Science and Technology Beijing, Beijing, China
Weicun Zhang

About this paper

Cite this paper

Xu, L., Hu, B., Guan, Z., Cheng, X., Li, T., Xiao, J. (2020). Multi-agent Deep Reinforcement Learning for Pursuit-Evasion Game Scalability. In: Jia, Y., Du, J., Zhang, W. (eds) Proceedings of 2019 Chinese Intelligent Systems Conference. CISC 2019. Lecture Notes in Electrical Engineering, vol 592. Springer, Singapore. https://doi.org/10.1007/978-981-32-9682-4_69

DOI: https://doi.org/10.1007/978-981-32-9682-4_69
Published: 08 September 2019
Publisher Name: Springer, Singapore
Print ISBN: 978-981-32-9681-7
Online ISBN: 978-981-32-9682-4
eBook Packages: Intelligent Technologies and Robotics (R0)
