Scalable Learning of Intrusion Response Through Recursive Decomposition

Hammar, Kim; Stadler, Rolf

doi:10.1007/978-3-031-50670-3_9


Part of the book series: Lecture Notes in Computer Science (LNCS, volume 14167)

Included in the following conference series:

International Conference on Decision and Game Theory for Security

Abstract

We study automated intrusion response for an IT infrastructure and formulate the interaction between an attacker and a defender as a partially observed stochastic game. To solve the game, we follow an approach where attack and defense strategies co-evolve through reinforcement learning and self-play toward an equilibrium. Solutions proposed in previous work prove the feasibility of this approach for small infrastructures but do not scale to realistic scenarios due to the exponential growth in computational complexity with the infrastructure size. We address this problem by introducing a method that recursively decomposes the game into subgames with low computational complexity, which can be solved in parallel. Applying optimal stopping theory, we show that the best response strategies in these subgames exhibit threshold structures, which allows us to compute them efficiently. To solve the decomposed game, we introduce an algorithm called Decompositional Fictitious Self-Play (DFSP), which learns Nash equilibria through stochastic approximation. We evaluate the learned strategies in an emulation environment where real intrusions and response actions can be executed. The results show that the learned strategies approximate an equilibrium and that DFSP significantly outperforms a state-of-the-art algorithm for a realistic infrastructure configuration.
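As a rough illustration of how strategies can co-evolve toward an equilibrium through repeated best responses, the sketch below runs classic fictitious play on a toy 2x2 zero-sum matrix game. It is not the DFSP algorithm from the chapter: the payoff matrix, the function name `fictitious_play`, and the "defender"/"attacker" labels are assumptions made purely for this example, and the game has no partial observability, decomposition, or learning component.

```python
# Toy illustration only: classic fictitious play on a 2x2 zero-sum matrix game.
# This is NOT the chapter's DFSP algorithm; names and payoffs are hypothetical.


def fictitious_play(payoff, iterations=5000):
    """Return the empirical mixed strategies of both players.

    payoff[i][j] is the reward of the row player (maximizer) when the row
    player picks action i and the column player (minimizer) picks action j.
    """
    n_rows, n_cols = len(payoff), len(payoff[0])
    # Action counts; both players start with one arbitrary play of action 0.
    row_counts = [1] + [0] * (n_rows - 1)
    col_counts = [1] + [0] * (n_cols - 1)

    for _ in range(iterations):
        # Cumulative payoff of each action against the opponent's history.
        row_values = [
            sum(payoff[i][j] * col_counts[j] for j in range(n_cols))
            for i in range(n_rows)
        ]
        col_values = [
            sum(payoff[i][j] * row_counts[i] for i in range(n_rows))
            for j in range(n_cols)
        ]
        # Each player best-responds to the opponent's empirical strategy.
        best_row = max(range(n_rows), key=lambda i: row_values[i])
        best_col = min(range(n_cols), key=lambda j: col_values[j])
        row_counts[best_row] += 1
        col_counts[best_col] += 1

    row_total, col_total = sum(row_counts), sum(col_counts)
    row_strategy = [c / row_total for c in row_counts]
    col_strategy = [c / col_total for c in col_counts]
    return row_strategy, col_strategy


# Matching pennies: the unique Nash equilibrium mixes uniformly.
MATCHING_PENNIES = [[1, -1], [-1, 1]]
defender, attacker = fictitious_play(MATCHING_PENNIES)
print("defender strategy:", defender)
print("attacker strategy:", attacker)
```

In this toy game the empirical action frequencies of both players drift toward the uniform mixed strategy, which is the equilibrium of matching pennies. The chapter's setting differs substantially, since best responses there are learned in subgames of a partially observed stochastic game rather than computed exactly from a payoff matrix.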

Author information

Authors and Affiliations

Division of Network and Systems Engineering, KTH Royal Institute of Technology, Stockholm, Sweden
Kim Hammar & Rolf Stadler

Corresponding author

Correspondence to Kim Hammar.

Editor information

Editors and Affiliations

University of Florida, Gainesville, FL, USA
Jie Fu
Czech Technical University in Prague, Prague, Czech Republic
Tomas Kroupa
University of Avignon, Avignon, France
Yezekael Hayel

Cite this paper

Hammar, K., Stadler, R. (2023). Scalable Learning of Intrusion Response Through Recursive Decomposition. In: Fu, J., Kroupa, T., Hayel, Y. (eds) Decision and Game Theory for Security. GameSec 2023. Lecture Notes in Computer Science, vol 14167. Springer, Cham. https://doi.org/10.1007/978-3-031-50670-3_9

DOI: https://doi.org/10.1007/978-3-031-50670-3_9
Published: 29 December 2023
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-50669-7
Online ISBN: 978-3-031-50670-3
eBook Packages: Computer Science, Computer Science (R0)
