
Learning Deception Using Fuzzy Multi-Level Reinforcement Learning in a Multi-Defender One-Invader Differential Game

Published in: International Journal of Fuzzy Systems (2022)

Abstract

Differential games are a class of game-theoretic problems governed by differential equations. They are typically defined in the continuous domain and solved with the calculus of variations, but modelling and solving them is not a straightforward task. Like game theory in general, differential games often involve social dilemmas and social behaviours, and modelling these social phenomena with mathematical tools is often problematic. In this paper, we model deception as a way to increase the pay-off in differential games. Deception is modelled as a bi-level policy system in which each level is realized by a fuzzy controller, and the fuzzy controllers are trained with a novel hierarchical fuzzy actor-critic learning algorithm. A deceitful player plays against multiple opponents: although the player has a single true goal, it can also choose several fake goals, and the intention is to find a strategy for switching between the fake goals and the true goal in order to fool the opponents. The simulation platform is the game of guarding territories, a specific form of pursuit–evasion game. We propose a method that increases the number of defenders with minimal changes to the policies, yielding a universal structure that is not affected by the curse of dimensionality. We show that a discerning invader capable of deception can improve its performance against the defenders by increasing its chance of invasion. We investigate the single-invader single-defender game and the single-invader multi-defender game, and we study both a superior invader and agents with equal speeds. In all of these situations, the invader increases its pay-off by using deception rather than playing honestly. Deception is modelled with a two-level policy system: the lower-level policy controls the invasion actions toward each goal, while the higher-level policy controls deception when a successful game is not initially possible.
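As a rough illustration of the two-level policy structure described in the abstract, the sketch below pairs a simple high-level goal-switching policy (choosing between the true goal and a fake goal) with a small zero-order Takagi-Sugeno fuzzy steering controller at the lower level. This is a minimal sketch under our own assumptions, not the paper's hierarchical fuzzy actor-critic learning algorithm; all class names, rule counts, and parameter values are illustrative.

import numpy as np

# Illustrative two-level invader policy (hypothetical names and parameters).
# High level: pick which territory to head for (true goal or a fake goal).
# Low level: a tiny zero-order Takagi-Sugeno controller that turns the
# invader toward the chosen territory.

def tri_mf(x, a, b, c):
    """Triangular membership function on [a, c] with peak at b."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x < b else (c - x) / (c - b)

class LowLevelFuzzySteering:
    """Maps the bearing error to the chosen goal onto a steering rate."""
    def __init__(self):
        # Rule centres (rad) and consequents (rad/s); the consequents are the
        # quantities that would be tuned by learning in the full method.
        self.centres = np.linspace(-np.pi, np.pi, 7)
        self.consequents = np.linspace(-1.0, 1.0, 7)

    def steer(self, bearing_error):
        width = self.centres[1] - self.centres[0]
        w = np.array([tri_mf(bearing_error, c - width, c, c + width)
                      for c in self.centres])
        if w.sum() == 0.0:
            return 0.0
        return float(w @ self.consequents / w.sum())

class HighLevelGoalSwitcher:
    """Chooses the territory to pursue: the true goal or one of the fake goals."""
    def __init__(self, n_goals, epsilon=0.2):
        self.q = np.zeros(n_goals)   # one value per (true or fake) goal
        self.epsilon = epsilon

    def choose(self, rng):
        if rng.random() < self.epsilon:
            return rng.integers(len(self.q))
        return int(np.argmax(self.q))

def invader_step(pos, heading, goals, switcher, steering, rng, speed=1.0, dt=0.1):
    """One control step: pick a goal, steer toward it, advance the kinematics."""
    g = switcher.choose(rng)
    desired = np.arctan2(goals[g][1] - pos[1], goals[g][0] - pos[0])
    error = np.arctan2(np.sin(desired - heading), np.cos(desired - heading))
    heading += steering.steer(error) * dt
    pos += speed * dt * np.array([np.cos(heading), np.sin(heading)])
    return pos, heading, g

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    goals = [np.array([5.0, 5.0]), np.array([5.0, -5.0])]  # true goal + one fake goal
    pos, heading = np.array([0.0, 0.0]), 0.0
    switcher, steering = HighLevelGoalSwitcher(len(goals)), LowLevelFuzzySteering()
    for _ in range(50):
        pos, heading, g = invader_step(pos, heading, goals, switcher, steering, rng)
    print("final position:", pos, "last chosen goal:", g)

In the method proposed in the paper, both levels are fuzzy controllers whose parameters are tuned by the hierarchical fuzzy actor-critic learner; here the high-level goal values and the low-level rule consequents are simply fixed placeholders.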

Code/Data Availability

The software and dataset are archived in the Machine Learning and Robotics Laboratory, Carleton University. They are available from the corresponding author on reasonable request.

Funding

This research was funded by the Natural Sciences and Engineering Research Council of Canada (NSERC) under Grant Nos. RGPIN-2017-06379 and RGPIN-2017-06261.

Author information

Contributions

AA: Methodology, Software, Writing - original draft. HS: Supervision, Writing - review and editing. MA: Supervision, Writing - review and commenting.

Corresponding author

Correspondence to Amirhossein Asgharnia.

Ethics declarations

Competing interests

The authors have no financial or proprietary interests in any material discussed in this article.

Ethical Approval

Not applicable (this article does not contain any studies with human participants or animals performed by any of the authors).

Consent to Participate

Not applicable (this article does not contain any studies with human participants or animals performed by any of the authors).

Consent for Publication

All authors have approved the manuscript and agree with its publication in the International Journal of Fuzzy Systems.

About this article

Cite this article

Asgharnia, A., Schwartz, H. & Atia, M. Learning Deception Using Fuzzy Multi-Level Reinforcement Learning in a Multi-Defender One-Invader Differential Game. Int. J. Fuzzy Syst. 24, 3015–3038 (2022). https://doi.org/10.1007/s40815-022-01352-6
