Abstract
Moving target defense (MTD) is a proactive defense approach that aims to thwart attacks by continuously changing the attack surface of a system (e.g., changing host or network configurations), thereby increasing the adversary's uncertainty and attack cost. To maximize the impact of MTD, a defender must strategically choose when and what changes to make, taking into account both the characteristics of its system and the adversary's observed activities. Finding an optimal strategy for MTD presents a significant challenge, especially when facing a resourceful and determined adversary who may respond to the defender's actions. In this paper, we propose a multi-agent partially-observable Markov decision process model of MTD and formulate a two-player general-sum game between the adversary and the defender. To solve this game, we propose a multi-agent reinforcement learning framework based on the double oracle algorithm. Finally, we provide experimental results to demonstrate the effectiveness of our framework in finding optimal policies.
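The core of the proposed framework is the double oracle loop: maintain restricted strategy sets for both players, solve the restricted game for an equilibrium, and let each player's best-response oracle add new strategies until neither can improve. The sketch below illustrates this loop on a small zero-sum matrix game; in the paper the oracles are deep RL agents and the game is general-sum, whereas here the oracles are exact argmax responses and the restricted game is solved approximately with fictitious play. All names and the example game are illustrative.

```python
# Double oracle on a zero-sum matrix game (rock-paper-scissors).
# Row player's payoff; the column player receives the negation.
PAYOFF = [[0, -1, 1],
          [1, 0, -1],
          [-1, 1, 0]]

def fictitious_play(rows, cols, iters=2000):
    """Approximate equilibrium mixed strategies of the restricted game."""
    rc = [0] * len(rows)   # empirical counts of row actions
    cc = [0] * len(cols)   # empirical counts of column actions
    rc[0] = cc[0] = 1
    for _ in range(iters):
        # Each player best-responds to the opponent's empirical mixture.
        r = max(range(len(rows)), key=lambda i: sum(
            PAYOFF[rows[i]][cols[j]] * cc[j] for j in range(len(cols))))
        c = max(range(len(cols)), key=lambda j: -sum(
            PAYOFF[rows[i]][cols[j]] * rc[i] for i in range(len(rows))))
        rc[r] += 1
        cc[c] += 1
    return [x / sum(rc) for x in rc], [x / sum(cc) for x in cc]

def best_response_row(col_mix, cols):
    """Oracle: best pure row action over the FULL action set."""
    return max(range(len(PAYOFF)), key=lambda i: sum(
        PAYOFF[i][cols[j]] * col_mix[j] for j in range(len(cols))))

def best_response_col(row_mix, rows):
    return max(range(len(PAYOFF[0])), key=lambda j: -sum(
        PAYOFF[rows[i]][j] * row_mix[i] for i in range(len(rows))))

def double_oracle():
    rows, cols = [0], [0]          # start each player with one strategy
    while True:
        row_mix, col_mix = fictitious_play(rows, cols)
        br_r = best_response_row(col_mix, cols)
        br_c = best_response_col(row_mix, rows)
        grew = False
        if br_r not in rows:
            rows.append(br_r); grew = True
        if br_c not in cols:
            cols.append(br_c); grew = True
        if not grew:               # no new best responses: equilibrium
            return rows, cols, row_mix, col_mix

rows, cols, rm, cm = double_oracle()
print(sorted(rows), sorted(cols))  # → [0, 1, 2] [0, 1, 2]
```

In the paper's setting, `fictitious_play` is replaced by an equilibrium solver for the restricted general-sum game, and each `best_response_*` oracle trains a deep RL policy against the opponent's current mixture over the MTD environment.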
Acknowledgments
This research was partially supported by the NSF (CAREER Grant IIS-1905558) and ARO (W911NF1910241).
Copyright information
© 2020 Springer Nature Switzerland AG
Cite this paper
Eghtesad, T., Vorobeychik, Y., Laszka, A. (2020). Adversarial Deep Reinforcement Learning Based Adaptive Moving Target Defense. In: Zhu, Q., Baras, J.S., Poovendran, R., Chen, J. (eds) Decision and Game Theory for Security. GameSec 2020. Lecture Notes in Computer Science(), vol 12513. Springer, Cham. https://doi.org/10.1007/978-3-030-64793-3_4
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-64792-6
Online ISBN: 978-3-030-64793-3