Abstract
Moving target defense (MTD) is a proactive defense approach that aims to thwart attacks by continuously changing the attack surface of a system (e.g., changing host or network configurations), thereby increasing the adversary's uncertainty and attack cost. To maximize the impact of MTD, a defender must strategically choose when and what changes to make, taking into account both the characteristics of its system and the adversary's observed activities. Finding an optimal strategy for MTD presents a significant challenge, especially when facing a resourceful and determined adversary who may respond to the defender's actions. In this paper, we propose a multi-agent partially-observable Markov decision process model of MTD and formulate a two-player general-sum game between the adversary and the defender. To solve this game, we propose a multi-agent reinforcement learning framework based on the double oracle algorithm. Finally, we provide experimental results to demonstrate the effectiveness of our framework in finding optimal policies.
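The core of the proposed framework is the double oracle loop: maintain restricted strategy sets for both players, solve the restricted game for an equilibrium, and let each player's best-response oracle add new strategies until neither can improve. The sketch below illustrates this loop on a small zero-sum matrix game; in the paper the oracles are deep RL agents and the game is general-sum, whereas here the oracles are exact argmax responses and the restricted game is solved approximately with fictitious play. All names and the example game are illustrative.

```python
# Double oracle on a zero-sum matrix game (rock-paper-scissors).
# Row player's payoff; the column player receives the negation.
PAYOFF = [[0, -1, 1],
          [1, 0, -1],
          [-1, 1, 0]]

def fictitious_play(rows, cols, iters=2000):
    """Approximate equilibrium mixed strategies of the restricted game."""
    rc = [0] * len(rows)   # empirical counts of row actions
    cc = [0] * len(cols)   # empirical counts of column actions
    rc[0] = cc[0] = 1
    for _ in range(iters):
        # Each player best-responds to the opponent's empirical mixture.
        r = max(range(len(rows)), key=lambda i: sum(
            PAYOFF[rows[i]][cols[j]] * cc[j] for j in range(len(cols))))
        c = max(range(len(cols)), key=lambda j: -sum(
            PAYOFF[rows[i]][cols[j]] * rc[i] for i in range(len(rows))))
        rc[r] += 1
        cc[c] += 1
    return [x / sum(rc) for x in rc], [x / sum(cc) for x in cc]

def best_response_row(col_mix, cols):
    """Oracle: best pure row action over the FULL action set."""
    return max(range(len(PAYOFF)), key=lambda i: sum(
        PAYOFF[i][cols[j]] * col_mix[j] for j in range(len(cols))))

def best_response_col(row_mix, rows):
    return max(range(len(PAYOFF[0])), key=lambda j: -sum(
        PAYOFF[rows[i]][j] * row_mix[i] for i in range(len(rows))))

def double_oracle():
    rows, cols = [0], [0]          # start each player with one strategy
    while True:
        row_mix, col_mix = fictitious_play(rows, cols)
        br_r = best_response_row(col_mix, cols)
        br_c = best_response_col(row_mix, rows)
        grew = False
        if br_r not in rows:
            rows.append(br_r); grew = True
        if br_c not in cols:
            cols.append(br_c); grew = True
        if not grew:               # no new best responses: equilibrium
            return rows, cols, row_mix, col_mix

rows, cols, rm, cm = double_oracle()
print(sorted(rows), sorted(cols))  # → [0, 1, 2] [0, 1, 2]
```

In the paper's setting, `fictitious_play` is replaced by an equilibrium solver for the restricted general-sum game, and each `best_response_*` oracle trains a deep RL policy against the opponent's current mixture over the MTD environment.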
Acknowledgments
This research was partially supported by the NSF (CAREER Grant IIS-1905558) and ARO (W911NF1910241).
Copyright information
© 2020 Springer Nature Switzerland AG
Cite this paper
Eghtesad, T., Vorobeychik, Y., Laszka, A. (2020). Adversarial Deep Reinforcement Learning Based Adaptive Moving Target Defense. In: Zhu, Q., Baras, J.S., Poovendran, R., Chen, J. (eds) Decision and Game Theory for Security. GameSec 2020. Lecture Notes in Computer Science(), vol 12513. Springer, Cham. https://doi.org/10.1007/978-3-030-64793-3_4
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-64792-6
Online ISBN: 978-3-030-64793-3