Adversarial Deep Reinforcement Learning Based Adaptive Moving Target Defense

  • Conference paper

Decision and Game Theory for Security (GameSec 2020)

Part of the book series: Lecture Notes in Computer Science (LNSC, volume 12513)

Abstract

Moving target defense (MTD) is a proactive defense approach that aims to thwart attacks by continuously changing the attack surface of a system (e.g., changing host or network configurations), thereby increasing the adversary’s uncertainty and attack cost. To maximize the impact of MTD, a defender must strategically choose when and what changes to make, taking into account both the characteristics of its system and the adversary’s observed activities. Finding an optimal strategy for MTD presents a significant challenge, especially when facing a resourceful and determined adversary who may respond to the defender’s actions. In this paper, we propose a multi-agent partially observable Markov decision process model of MTD and formulate a two-player general-sum game between the adversary and the defender. To solve this game, we propose a multi-agent reinforcement learning framework based on the double oracle algorithm. Finally, we provide experimental results to demonstrate the effectiveness of our framework in finding optimal policies.
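The paper's framework pairs the double oracle algorithm with deep reinforcement learning best-response oracles. As a minimal illustration of the double oracle loop itself (not the paper's implementation: here exact best responses over a small zero-sum matrix game stand in for the RL oracles, and the restricted game is solved by linear programming rather than as a general-sum game), one might sketch:

```python
import numpy as np
from scipy.optimize import linprog

def solve_zero_sum(A):
    """Nash equilibrium of a zero-sum matrix game via LP.
    Row player maximizes A; column player minimizes.
    Returns (row_mixed, col_mixed, game_value)."""
    m, n = A.shape
    shift = 1.0 - A.min()          # shift payoffs so the value is positive
    B = A + shift
    # Row player: min sum(x) s.t. B^T x >= 1, x >= 0; value = 1 / sum(x).
    res = linprog(c=np.ones(m), A_ub=-B.T, b_ub=-np.ones(n),
                  bounds=[(0, None)] * m)
    v = 1.0 / res.x.sum()
    p = res.x * v
    # Column player: max sum(y) s.t. B y <= 1, y >= 0.
    res = linprog(c=-np.ones(n), A_ub=B, b_ub=np.ones(m),
                  bounds=[(0, None)] * n)
    q = res.x / res.x.sum()
    return p, q, v - shift

def double_oracle(A, tol=1e-8):
    """Grow each player's restricted strategy set with best responses
    until neither oracle finds a profitable deviation."""
    rows, cols = [0], [0]
    while True:
        p, q, v = solve_zero_sum(A[np.ix_(rows, cols)])
        # Expand the restricted mixed strategies to the full strategy space.
        p_full = np.zeros(A.shape[0]); p_full[rows] = p
        q_full = np.zeros(A.shape[1]); q_full[cols] = q
        # Oracles: best pure responses to the restricted equilibrium.
        # (In the paper these are replaced by trained RL policies.)
        br_row = int(np.argmax(A @ q_full))
        br_col = int(np.argmin(p_full @ A))
        improved = False
        if (A @ q_full)[br_row] > v + tol and br_row not in rows:
            rows.append(br_row); improved = True
        if (p_full @ A)[br_col] < v - tol and br_col not in cols:
            cols.append(br_col); improved = True
        if not improved:
            return p_full, q_full, v

# Matching pennies: unique mixed equilibrium (1/2, 1/2), value 0.
A = np.array([[1.0, -1.0], [-1.0, 1.0]])
p, q, v = double_oracle(A)
```

The appeal of this scheme, in the MTD setting and elsewhere, is that the full strategy space (all policies) is never enumerated: only the small restricted game is solved exactly, while the expensive step — computing a best response — is delegated to an oracle, which the paper instantiates with deep RL.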


Acknowledgments

This research was partially supported by the NSF (CAREER Grant IIS-1905558) and ARO (W911NF1910241).

Author information

Correspondence to Taha Eghtesad.


Copyright information

© 2020 Springer Nature Switzerland AG

About this paper

Cite this paper

Eghtesad, T., Vorobeychik, Y., Laszka, A. (2020). Adversarial Deep Reinforcement Learning Based Adaptive Moving Target Defense. In: Zhu, Q., Baras, J.S., Poovendran, R., Chen, J. (eds) Decision and Game Theory for Security. GameSec 2020. Lecture Notes in Computer Science, vol 12513. Springer, Cham. https://doi.org/10.1007/978-3-030-64793-3_4

  • DOI: https://doi.org/10.1007/978-3-030-64793-3_4

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-64792-6

  • Online ISBN: 978-3-030-64793-3

  • eBook Packages: Computer Science, Computer Science (R0)
