INNES: An intelligent network penetration testing model based on deep reinforcement learning

Li, Qianyu; Hu, Miao; Hao, Hao; Zhang, Min; Li, Yang

doi:10.1007/s10489-023-04946-1

INNES: An intelligent network penetration testing model based on deep reinforcement learning

Published: 01 September 2023

Volume 53, pages 27110–27127, (2023)
Cite this article

Applied Intelligence Aims and scope Submit manuscript

Qianyu Li¹,
Miao Hu¹,
Hao Hao²,
Min Zhang¹ &
…
Yang Li¹

455 Accesses
1 Citation
Explore all metrics

Abstract

Penetration testing (PT) is a crucial way to ensure the security of computer systems. However, it requires a high threshold and can only be implemented by trained experts. Automated tools can reduce the pressure of talent shortages, and reinforcement learning (RL) is a promising approach for achieving automated PT. Due to the unreasonable characterization of the PT process and the low efficiency of RL data, the applicability of the model is limited, and it is difficult to reuse, which hinders its practical application. In this paper, we propose an INNES (INtelligent peNEtration teSting) model based on deep reinforcement learning (DRL). First, the model characterizes the key elements of PT more reasonably based on the Markov decision process (MDP), fully considering the commonality of the PT process in different scenarios to improve its applicability. Second, the DQN_valid algorithm is designed to constrain the agent’s action space, to improve the agent’s decision-making accuracy, and avoid invalid exploration, according to the feature that enables the effective action space to gradually increase during the PT process. The experimental results show that our model is not only effective for automated PT in the network environment but also has portability, which provides a possible future direction for practical application of intelligent PT based on RL.

This is a preview of subscription content, log in via an institution to check access.

Access this article

Log in via an institution

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Institutional subscriptions

Machine Learning: Algorithms, Real-World Applications and Research Directions

Article 22 March 2021

Machine learning and deep learning

Article Open access 08 April 2021

Interpreting Black-Box Models: A Review on Explainable Artificial Intelligence

Article Open access 24 August 2023

Availability of data and materials

The data that support the findings of this study are available from the corresponding author, Min, upon reasonable request.

Notes

https://github.com/jcabrale/CyberBattleSim
The scenarios and the model code are available at https://github.com/liqianyu513/INNES-model.

References

ZC (2022) Research on internet security situation awareness prediction technology based on improved rbf neural network algorithm. Journal of Computational and Cognitive Engineering
Zennaro FM, Erdodi L (2020) Modeling penetration testing with reinforcement learning using capture-the-flag challenges and tabular q-learning
Wani A, Revathi S, Khaliq R (2021) Sdn-based intrusion detection system for iot using deep learning classifier (idsiot-sdl). CAAI Transactions on Intelligence Technology 003:006
Google Scholar
RV, AK, AA (2022) Revisiting shift cipher technique for amplified data security. Journal of Computational and Cognitive Engineering
Stefinko Y, Piskozub A, Banakh R (2016) Manual and automated penetration testing. benefits and drawbacks. modern tendency. In: 2016 13th International Conference on Modern Problems of Radio Engineering. Telecommunications and Computer Science (TCSET)
Abu-Dabaseh F, Alshammari E (2018) Automated penetration testing : An overview. In: 4th International Conference on Natural Language Computing (NATL 2018)
Tabassum M, Sharma T, Mohanan S (2021) Ethical hacking and penetrate testing using kali and metasploit framework
Nadeem A, Verwer S, Moskal S, Yang SJ (2022) Alert-driven attack graph generation using s-pdfa. IEEE transactions on dependable and secure computing 2:19
Google Scholar
Stergiopoulos GGD, Dedousis P (2022) Automatic analysis of attack graphs for risk mitigation and prioritization on large-scale and complex networks in industry 4.0. In: International Journal of Information Security pp. 37–59
Zhang Y, Zhou T, Zhu J, Wang Q (2020) Domain-independent intelligent planning technology and its application to automated penetration testing oriented attack path discovery. J Electron Inf Technol 42(9):2095–2107
Google Scholar
Hoffmann J (2015) Simulated penetration testing: From" dijkstra" to" turing test++". Proceedings of the International Conference on Automated Planning and Scheduling 25:364–372
Article Google Scholar
Silva C, Fernandes B, Feitosa EL, Garcia VC (2022) Piracema.io: A rules-based tree model for phishing prediction. Expert Systems with Applications 191, 116239
Bruce Schneier (2018) Artificial intelligence and the attack/defense balance. IEEE Security & Privacy 16(2):96–96
Article MathSciNet Google Scholar
Debnath S, Roy P, Namasudra S, Crespo RG (2022) Audio-visual automatic speech recognition towards education for disabilities.J Autism Dev Disord
Gutub A (2022) Boosting image watermarking authenticity spreading secrecy from counting-based secret-sharing. CAAI Transactions on Intelligence Technology
Gupta A, Namasudra S (2022) A novel technique for accelerating live migration in cloud computing. Autom Softw Eng
Manjari KSG, Verma M (2022) Qest: Quantized and efficient scene text detector using deep learning. Transactions on Asian and Low-Resource Language Information Processing
Hunt MS, Stephanie DO, Msci A (2023) The role of data analytics and artificial intelligence (ai) in ocular telehealth. Ocular Telehealth 213–232
Witte J, Gao K, Zll A (2023) Artificial intelligence: The future of sustainable agriculture? a research agenda. Publications of Darmstadt Technical University, Institute for Business Studies (BWL)
Google Scholar
Nath K (2023) The role of artificial intelligence in the modeling, analysis and inspection of ultrasonic welding processes - a review. Int J Comput Mater Sci Eng
Ravi V, Soman KP, Alazab M, Sriram S, KS (2020) A comprehensive tutorial and survey of applications of deep learning for cyber security
Sutton RS, Barto AG, et al (1998) Introduction to reinforcement learning
Sutton RS, Barto AG (2018) Reinforcement Learning: An Introduction
Spieker H, Gotlieb A, Marijan D, Mossige M (2018) Reinforcement learning for automatic test case prioritization and selection in continuous integration
Shen Z, Yin H, Jing L, Liang Y, Wang J (2022) A cooperative routing protocol based on q-learning for underwater optical-acoustic hybrid wireless sensor networks. IEEE sensors journal (22-1)
Mnih V, Kavukcuoglu K, Silver D, Graves A, Antonoglou I, Wierstra D, Riedmiller M (2013) Playing atari with deep reinforcement learning. Computer Science
Volodymyr, Mnih, Koray, Kavukcuoglu, David, Silver, Andrei, A, Rusu, Joel (2015) Human-level control through deep reinforcement learning. Nature
Liu Q, Zhai JW, Zhang ZZ, Zhong S, Xu J (2018) A survey on deep reinforcement learning. Chinese Journal of Computers
Liu W, Su S, Tang T, Wang X (2021) A dqn-based intelligent control method for heavy haul trains on long steep downhill section. Transportation Research Part C Emerging Technologies 129(10):103249
Article Google Scholar
Zhang C, Zheng K, Tian Y, Pi Y, Chen R, Xue W, Yang T, An D (2022) Advertising impression resource allocation strategy with multi-level budget constraint dqn in real-time bidding. Neurocomputing (Jun.1), 488
Ahmadi H, Ashtiani M, Azgomi MA, Saheb-Nassagh R (2022) A dqn-based agent for automatic software refactoring. Information and software technology (Jul.), 147
Shmaryahu D, Shani G, Hoffmann J, Steinmetz M (2017) Partially observable contingent planning for penetration testing. In: Iwaise: First International Workshop on Artificial Intelligence in Security vol. 33
Pretschner A (2017) Automated attack planning using a partially observable model for penetration testing of industrial control systems
Ghanem MC, Chen TM (2018) Reinforcement learning for intelligent penetration testing. In: 2018 Second World Conference on Smart Trends in Systems, Security and Sustainability (WorldS4) pp. 185–192
Zhou T-y, Zang Y-c, Zhu J-h, Wang Q-x (2019) Nig-ap: a new method for automated penetration testing. Frontiers of Information Technology & Electronic Engineering 20(9):1277–1288
Article Google Scholar
Nguyen H, Teerakanok S, Inomata A, Uehara T (2021) The proposal of double agent architecture using actor-critic algorithm for penetration testing. In: 7th International Conference on Information Systems Security and Privacy
Tran K, Akella A, Standen M, Kim J, Bowman D, Richer T, Lin CT (2021) Deep hierarchical reinforcement agents for automated penetration testing
Walter E, Ferguson-Walter K, Ridley A (2021) Incorporating deception into cyberbattlesim for autonomous defense
Calderon P (2017) Nmap : Network exploration and security auditing cookbook : A complete guide to mastering nmap and its scripting engine, covering practical tasks for penetration testers and system administrators
Team M (2021) Cyberbattlesim. Created by Christian Seifert, Michael Betser, William Blum, James Bono, Kate Farris, Emily Goren, Justin Grana, Kristian Holsheimer, Brandon Marken, Joshua Neil, Nicole Nichols, Jugal Parikh, Haoran Wei

Download references

Acknowledgements

This work was supported by the National Natural Science Foundation of China under Grant No.61971413.

Author information

Authors and Affiliations

College of Electronic Engineering, National University of Defense Technology, Hefei, 230031, China
Qianyu Li, Miao Hu, Min Zhang & Yang Li
Shandong Computer Science Center (National Supercomputing Center in Jinan), Qilu University of Technology (Shandong Academy of Sciences), Jinan, 250353, China
Hao Hao

Authors

Qianyu Li
View author publications
You can also search for this author in PubMed Google Scholar
Miao Hu
View author publications
You can also search for this author in PubMed Google Scholar
Hao Hao
View author publications
You can also search for this author in PubMed Google Scholar
Min Zhang
View author publications
You can also search for this author in PubMed Google Scholar
Yang Li
View author publications
You can also search for this author in PubMed Google Scholar

Corresponding author

Correspondence to Min Zhang.

Ethics declarations

Competing Interests

We declare that we have no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendix A Network scenario details

To verify the validity of our model, we conducted experiments on two parts of the scenarios. One part is the scenarios included in the currently open-source CyberBattleSim platform developed by Microsoft. Sample1 and Sample2 are randomly generated by the scenario automatic generation module contained by CyberBattleSim. The other part is the scenarios that we abstractly construct from the real network environment. In RL, the input and output of the agent, namely the scale of state space and action space, need to be defined before the neural network structure design. However, when applying RL to solve PT problems, it is nearly impossible to fully describe all the elements in the network scenarios. Therefore, we limit the size of the problem so that agents can be applied to all scenarios within that range. In this paper, the network scenarios limit of the PT tasks that the agents we build can perform is shown in Table 8. We design the network scenarios according to the scope given in the table.

Table 8 The elements included in our network scenarios

Full size table

Figure 8 shows the structure of each scenario built in this paper. In the figure, the start node is the starting point of the agent. It is not included in the network scenario, so each scenario contains 9 nodes. The lines between nodes represent the reachability between them. The black line A\( \rightarrow \)B indicates that the agent can discover node B after vulnerability exploitation at node A, and the red line A\( \rightarrow \)B indicates that the agent can discover the credential of node B after vulnerability exploitation at node A. The red nodes indicate the starting point of the penetration testing or the agent already owned nodes before testing began, and the green nodes indicate the penetration testing target. In node information, None means that the element does not exist on the node. None on the Start node means that this information does not participate in the agent’s decision. We set up a search operation on the Start node to discover other nodes.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

Reprints and permissions

About this article

Cite this article

Li, Q., Hu, M., Hao, H. et al. INNES: An intelligent network penetration testing model based on deep reinforcement learning. Appl Intell 53, 27110–27127 (2023). https://doi.org/10.1007/s10489-023-04946-1

Download citation

Accepted: 05 August 2023
Published: 01 September 2023
Issue Date: November 2023
DOI: https://doi.org/10.1007/s10489-023-04946-1

Keywords

Access this article

Log in via an institution

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Institutional subscriptions

INNES: An intelligent network penetration testing model based on deep reinforcement learning

Abstract

Access this article

Similar content being viewed by others