Abstract
Reinforcement learning has shown much success in games such as chess, backgammon, and Go [21, 22, 24]. However, in most of these games, agents have full knowledge of the environment at all times. In this paper, we describe a deep learning model in which agents successfully adapt to different classes of opponents and learn the optimal counter-strategy using reinforcement learning in a game under partial observability. We apply our model to \(\mathsf {FlipIt}\) [25], a two-player security game in which both players, the attacker and the defender, compete for ownership of a shared resource and only receive information on the current state of the game upon making a move. Our model is a deep neural network combined with Q-learning, trained to maximize the defender’s time of ownership of the resource. Despite the noisy information, our model successfully learns a cost-effective counter-strategy that outperforms its opponent’s strategies, demonstrating the advantages of deep reinforcement learning in game-theoretic scenarios. We also extend \(\mathsf {FlipIt}\) to a game with a larger action space by introducing a new lower-cost move, and generalize the model to n-player \(\mathsf {FlipIt}\).
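The core of the approach, Q-learning over the defender's partial observations of \(\mathsf {FlipIt}\), can be sketched with a toy tabular version. The paper uses a deep network; here the environment, its periodic attacker, the move cost, and all hyperparameters are illustrative assumptions, and the defender's only observation is the time since its own last move:

```python
import random

class FlipItEnv:
    """Toy FlipIt-style environment: a periodic attacker flips the resource
    every `period` ticks. The defender cannot see the attacker; it observes
    only the time since its own last move, mirroring the game's partial
    observability. Dynamics and costs are illustrative assumptions."""

    def __init__(self, period=5, move_cost=2.0, horizon=200, max_obs=30):
        self.period, self.move_cost = period, move_cost
        self.horizon, self.max_obs = horizon, max_obs

    def reset(self):
        self.t, self.owner, self.since = 0, "defender", 0
        return self.since                        # observed state

    def step(self, action):                      # action: 0 = wait, 1 = flip
        reward = 0.0
        if action == 1:                          # defender flips: pay cost,
            reward -= self.move_cost             # take ownership
            self.owner, self.since = "defender", 0
        else:
            self.since = min(self.since + 1, self.max_obs)
        self.t += 1
        if self.t % self.period == 0:            # stealthy attacker flips
            self.owner = "attacker"
        if self.owner == "defender":
            reward += 1.0                        # one tick of ownership
        return self.since, reward, self.t >= self.horizon

def train(episodes=300, alpha=0.1, gamma=0.95, eps=0.1, seed=0):
    """Tabular Q-learning [34] over the defender's observable state."""
    random.seed(seed)
    env = FlipItEnv()
    Q = [[0.0, 0.0] for _ in range(env.max_obs + 1)]
    for _ in range(episodes):
        s, done = env.reset(), False
        while not done:
            if random.random() < eps:            # epsilon-greedy exploration
                a = random.randrange(2)
            else:
                a = 0 if Q[s][0] >= Q[s][1] else 1
            s2, r, done = env.step(a)
            # Q-learning update: Q(s,a) += alpha * (r + gamma * max Q(s') - Q(s,a))
            Q[s][a] += alpha * (r + gamma * max(Q[s2]) - Q[s][a])
            s = s2
    return Q
```

In the paper's setting the tabular Q is replaced by a deep network that generalizes over the much larger state and action spaces; the reward structure (ownership time minus move cost) is what drives the cost-effective counter-strategy described in the abstract.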
References
Alpcan, T., Başar, T.: Network Security: A Decision and Game-Theoretic Approach. Cambridge University Press, Cambridge (2010)
Buczak, A.L., Guven, E.: A survey of data mining and machine learning methods for cyber security intrusion detection. IEEE Commun. Surv. Tutor. 18(2), 1153–1176 (2016)
Ding, D., Han, Q.-L., Xiang, Y., Ge, X., Zhang, X.-M.: A survey on security control and attack detection for industrial cyber-physical systems. Neurocomputing 275, 1674–1683 (2018)
Falliere, N., Murchu, L.O., Chien, E.: W32. Stuxnet dossier. Symantec White Paper (2011)
Feng, X., Zheng, Z., Hu, P., Cansever, D., Mohapatra, P.: Stealthy attacks meets insider threats: a three-player game model. In: IEEE Military Communications Conference (MILCOM), pp. 25–30, October 2015
Gueye, A., Marbukh, V., Walrand, J.C.: Towards a metric for communication network vulnerability to attacks: a game theoretic approach. In: Krishnamurthy, V., Zhao, Q., Huang, M., Wen, Y. (eds.) GameNets 2012. LNICST, vol. 105, pp. 259–274. Springer, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35582-0_20
Hu, P., Li, H., Fu, H., Cansever, D., Mohapatra, P.: Dynamic defense strategy against advanced persistent threat with insiders. In: IEEE Conference on Computer Communications (INFOCOM), pp. 747–755, April 2015
Kingma, D.P., Ba, J.: Adam: a method for stochastic optimization. CoRR abs/1412.6980 (2014)
Kunreuther, H., Heal, G.: Interdependent security. J. Risk Uncertain. 26(2), 231–249 (2003)
Laszka, A., Horvath, G., Felegyhazi, M., Buttyán, L.: FlipThem: modeling targeted attacks with \(\mathsf {FlipIt}\) for multiple resources. In: Poovendran, R., Saad, W. (eds.) GameSec 2014. LNCS, vol. 8840, pp. 175–194. Springer, Cham (2014). https://doi.org/10.1007/978-3-319-12601-2_10
Laszka, A., Johnson, B., Grossklags, J.: Mitigating covert compromises. In: Chen, Y., Immorlica, N. (eds.) WINE 2013. LNCS, vol. 8289, pp. 319–332. Springer, Heidelberg (2013). https://doi.org/10.1007/978-3-642-45046-4_26
Laszka, A., Johnson, B., Grossklags, J.: Mitigation of targeted and non-targeted covert attacks as a timing game. In: Das, S.K., Nita-Rotaru, C., Kantarcioglu, M. (eds.) GameSec 2013. LNCS, vol. 8252, pp. 175–191. Springer, Cham (2013). https://doi.org/10.1007/978-3-319-02786-9_11
Milosevic, N., Dehghantanha, A., Choo, K.-K.R.: Machine learning aided android malware classification. Comput. Electr. Eng. 61, 266–274 (2017)
Mnih, V., et al.: Playing Atari with deep reinforcement learning. CoRR abs/1312.5602 (2013)
Mnih, V., et al.: Human-level control through deep reinforcement learning. Nature 518(7540), 529–533 (2015)
Oakley, L., Oprea, A.: \(\sf QFlip\): an adaptive reinforcement learning strategy for the \(\sf FlipIt\) security game. In: Alpcan, T., Vorobeychik, Y., Baras, J.S., Dán, G. (eds.) GameSec 2019. LNCS, vol. 11836, pp. 364–384. Springer, Cham (2019). https://doi.org/10.1007/978-3-030-32430-8_22
Paszke, A., et al.: Automatic differentiation in PyTorch. In: Neural Information Processing Systems (2017)
Pham, V., Cid, C.: Are we compromised? Modelling security assessment games. In: Grossklags, J., Walrand, J. (eds.) GameSec 2012. LNCS, vol. 7638, pp. 234–247. Springer, Heidelberg (2012). https://doi.org/10.1007/978-3-642-34266-0_14
Schwartz, N.D., Drew, C.: RSA faces angry users after breach. New York Times, page B1, 8 June 2011
Silver, D., et al.: Mastering the game of Go with deep neural networks and tree search. Nature 529(7587), 484–489 (2016)
Silver, D., et al.: A general reinforcement learning algorithm that masters chess, shogi, and go through self-play. Science 362(6419), 1140–1144 (2018)
Silver, D., et al.: Mastering the game of Go without human knowledge. Nature 550(7676), 354–359 (2017)
Sutton, R.S., Barto, A.G.: Reinforcement Learning: An Introduction. The MIT Press, Cambridge (2018)
Tesauro, G.: Temporal difference learning and TD-Gammon. Commun. ACM 38(3), 58–68 (1995)
van Dijk, M., Juels, A., Oprea, A., Rivest, R.L.: \(\mathsf {FlipIt}\): the game of “stealthy takeover”. J. Cryptology 26(4), 655–713 (2013)
Wang, Z., et al.: Sample efficient actor-critic with experience replay. CoRR abs/1611.01224 (2016)
Watkins, C.J.C.H., Dayan, P.: Q-learning. Mach. Learn. 8(3), 279–292 (1992)
Xiao, L., Wan, X., Lu, X., Zhang, Y., Wu, D.: IoT security techniques based on machine learning. ArXiv abs/1801.06275 (2018)
Copyright information
© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG
Cite this paper
Greige, L., Chin, P. (2022). Deep Reinforcement Learning for FlipIt Security Game. In: Benito, R.M., Cherifi, C., Cherifi, H., Moro, E., Rocha, L.M., Sales-Pardo, M. (eds) Complex Networks & Their Applications X. COMPLEX NETWORKS 2021. Studies in Computational Intelligence, vol 1072. Springer, Cham. https://doi.org/10.1007/978-3-030-93409-5_68
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-93408-8
Online ISBN: 978-3-030-93409-5