
Learning Deception Using Fuzzy Multi-Level Reinforcement Learning in a Multi-Defender One-Invader Differential Game

Published in: International Journal of Fuzzy Systems (2022)

Abstract

Differential games are a class of game-theoretic problems governed by differential equations. They are typically defined in the continuous domain and solved with the calculus of variations, but modelling and solving them is not a straightforward task. Like game theory in general, differential games often involve social dilemmas and social behaviours, and modelling these social phenomena with mathematical tools is often problematic. In this paper, we model deception as a way to increase the pay-off in differential games. Deception is modelled as a bi-level policy system in which each level is realized by a fuzzy controller, and the fuzzy controllers are trained with a novel hierarchical fuzzy actor-critic learning algorithm. A deceitful player plays against multiple opponents: although the player has a single true goal, it can also choose several fake goals, and the intention is to find a strategy for switching between the fake goals and the true goal in order to fool the opponents. The simulation platform is the game of guarding territories, a specific form of pursuit–evasion game. We propose a method that increases the number of defenders with minimal changes to the policies, yielding a universal structure that is not affected by the curse of dimensionality. We show that a discerning invader capable of deception can improve its performance against the defenders by increasing its chance of invasion. We investigate the single-invader single-defender game and the single-invader multi-defender game, and we study both a superior invader and agents with equal speeds. In all of these situations, the invader increases its pay-off by using deception rather than playing honestly. Deception is modelled with a two-level policy system: the lower-level policy controls the invasion actions toward each goal, while the higher-level policy controls deception when a successful game is not initially possible.
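As a rough illustration of the two-level policy structure described in the abstract, the sketch below pairs a simple high-level goal-switching policy (choosing between the true goal and a fake goal) with a small zero-order Takagi-Sugeno fuzzy steering controller at the lower level. This is a minimal sketch under our own assumptions, not the paper's hierarchical fuzzy actor-critic learning algorithm; all class names, rule counts, and parameter values are illustrative.

import numpy as np

# Illustrative two-level invader policy (hypothetical names and parameters).
# High level: pick which territory to head for (true goal or a fake goal).
# Low level: a tiny zero-order Takagi-Sugeno controller that turns the
# invader toward the chosen territory.

def tri_mf(x, a, b, c):
    """Triangular membership function on [a, c] with peak at b."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x < b else (c - x) / (c - b)

class LowLevelFuzzySteering:
    """Maps the bearing error to the chosen goal onto a steering rate."""
    def __init__(self):
        # Rule centres (rad) and consequents (rad/s); the consequents are the
        # quantities that would be tuned by learning in the full method.
        self.centres = np.linspace(-np.pi, np.pi, 7)
        self.consequents = np.linspace(-1.0, 1.0, 7)

    def steer(self, bearing_error):
        width = self.centres[1] - self.centres[0]
        w = np.array([tri_mf(bearing_error, c - width, c, c + width)
                      for c in self.centres])
        if w.sum() == 0.0:
            return 0.0
        return float(w @ self.consequents / w.sum())

class HighLevelGoalSwitcher:
    """Chooses the territory to pursue: the true goal or one of the fake goals."""
    def __init__(self, n_goals, epsilon=0.2):
        self.q = np.zeros(n_goals)   # one value per (true or fake) goal
        self.epsilon = epsilon

    def choose(self, rng):
        if rng.random() < self.epsilon:
            return rng.integers(len(self.q))
        return int(np.argmax(self.q))

def invader_step(pos, heading, goals, switcher, steering, rng, speed=1.0, dt=0.1):
    """One control step: pick a goal, steer toward it, advance the kinematics."""
    g = switcher.choose(rng)
    desired = np.arctan2(goals[g][1] - pos[1], goals[g][0] - pos[0])
    error = np.arctan2(np.sin(desired - heading), np.cos(desired - heading))
    heading += steering.steer(error) * dt
    pos += speed * dt * np.array([np.cos(heading), np.sin(heading)])
    return pos, heading, g

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    goals = [np.array([5.0, 5.0]), np.array([5.0, -5.0])]  # true goal + one fake goal
    pos, heading = np.array([0.0, 0.0]), 0.0
    switcher, steering = HighLevelGoalSwitcher(len(goals)), LowLevelFuzzySteering()
    for _ in range(50):
        pos, heading, g = invader_step(pos, heading, goals, switcher, steering, rng)
    print("final position:", pos, "last chosen goal:", g)

In the method proposed in the paper, both levels are fuzzy controllers whose parameters are tuned by the hierarchical fuzzy actor-critic learner; here the high-level goal values and the low-level rule consequents are simply fixed placeholders.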

Code/Data Availability

The software and dataset are archived in the Machine Learning and Robotics Laboratory, Carleton University. They are available from the corresponding author on reasonable request.

Funding

This research was funded by the Natural Sciences and Engineering Research Council of Canada (NSERC) under Grant Nos. RGPIN-2017-06379 and RGPIN-2017-06261.

Author information

Contributions

AA: Methodology, Software, Writing - original draft. HS: Supervision, Writing - review and editing. MA: Supervision, Writing - review and commenting.

Corresponding author

Correspondence to Amirhossein Asgharnia.

Ethics declarations

Competing interests

The authors have no financial or proprietary interests in any material discussed in this article.

Ethical Approval

Not applicable (this article does not contain any studies with human participants or animals performed by any of the authors).

Consent to Participate

Not applicable (this article does not contain any studies with human participants or animals performed by any of the authors).

Consent for Publication

All authors have approved the manuscript and agree with its publication in the International Journal of Fuzzy Systems.

About this article

Cite this article

Asgharnia, A., Schwartz, H. & Atia, M. Learning Deception Using Fuzzy Multi-Level Reinforcement Learning in a Multi-Defender One-Invader Differential Game. Int. J. Fuzzy Syst. 24, 3015–3038 (2022). https://doi.org/10.1007/s40815-022-01352-6
