Dr.PathFinder: hybrid fuzzing with deep reinforcement concolic execution toward deeper path-first search

Jeon, Seungho; Moon, Jongsub

doi:10.1007/s00521-022-07008-8

Original Article
Published: 25 February 2022

Neural Computing and Applications, volume 34, pages 10731–10750 (2022)

Abstract

Fuzzing is an effective approach to discovering bugs in programs, especially memory corruption bugs, using randomly generated test cases. However, without prior knowledge of the target program, a fuzzer can generate only a limited number of test cases because of sanity checks. To solve this problem, recent studies have proposed hybrid fuzzers that observe the context of a target program using symbolic execution; these fuzzers generate test cases that bypass the sanity checks. While hybrid fuzzers can reach “deep” bugs in the target program, they generate many ineffective test cases. In this paper, we propose a concolic execution algorithm that incorporates deep reinforcement learning, together with Dr.PathFinder, a hybrid fuzzing solution built on it. When the reinforcement learning agent encounters a branch during concolic execution, it evaluates the state and determines the search path. In this process, “shallow” paths are pruned, and “deep” paths are searched first. This reduces unnecessary exploration, which allows efficient memory usage and alleviates the state explosion problem. In experiments with the CB-multios dataset for deep bug cases, Dr.PathFinder discovered approximately five times more bugs than AFL and two times more than Driller-AFL. In addition to finding more bugs, Dr.PathFinder generated 19 times fewer test cases and used at least 2% less memory than Driller-AFL. While it performed well in finding bugs located on deep paths, Dr.PathFinder was limited in finding bugs located on shallow paths; we discuss this limitation.
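
The branch-level decision described in the abstract can be illustrated with a small sketch. The Python below is not the authors' implementation: the toy branch tree, the hand-written `score` function (a stand-in for the trained agent's value estimate), the `prune_below` threshold, and all function names are assumptions introduced for illustration. It shows only the general idea of scoring each successor at a branch, discarding low-scoring ("shallow") successors, and expanding the highest-scoring ("deep") one first.

```python
import heapq
from typing import Callable, List, Tuple

Trace = Tuple[int, ...]  # sequence of branch decisions taken so far

MAX_DEPTH = 4  # hypothetical depth of the toy target program


def toy_successors(trace: Trace) -> List[Trace]:
    """Toy target: taking branch 0 ends the path; branch 1 goes deeper."""
    if len(trace) >= MAX_DEPTH or (trace and trace[-1] == 0):
        return []
    return [trace + (0,), trace + (1,)]


def toy_score(trace: Trace) -> float:
    """Hand-written placeholder for a learned value estimate (assumption)."""
    if trace and trace[-1] == 0:
        return 0.0  # path that is about to terminate: treated as shallow
    return float(len(trace))


def deep_first_search(
    root: Trace,
    successors: Callable[[Trace], List[Trace]],
    score: Callable[[Trace], float],
    prune_below: float,
    budget: int,
) -> Tuple[List[Trace], int]:
    """Expand the highest-scoring state first and prune low-scoring branches."""
    frontier = [(-score(root), root)]  # negate so heapq pops the best score
    visited: List[Trace] = []
    pruned = 0
    while frontier and len(visited) < budget:
        _, state = heapq.heappop(frontier)
        visited.append(state)
        for child in successors(state):
            child_score = score(child)
            if child_score < prune_below:
                pruned += 1  # shallow successor: never explored
                continue
            heapq.heappush(frontier, (-child_score, child))
    return visited, pruned


if __name__ == "__main__":
    visited, pruned = deep_first_search((), toy_successors, toy_score, 0.5, 100)
    print(len(visited), "states visited,", pruned, "pruned")
```

On this toy tree, a threshold of 0.5 leads to 5 visited states and 4 pruned successors, whereas disabling pruning (a threshold below every score) requires 9 visits to exhaust the same tree.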

Notes

This number only counts vulnerabilities reported at https://lcamtuf.coredump.cx/afl/.
Some applications based on reinforcement learning models start learning with a stochastic policy and then switch to a deterministic one. In this experiment, however, Dr.PathFinder maintained a stochastic policy so that it continues to reflect new states, and the actions taken in them, as fuzzing proceeds.
In the DQN algorithm, the policy depends on the Q-network. Therefore, training the Q-network is equivalent to improving the policy.
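
The last two notes can be made concrete with a short sketch. The notes do not say which stochastic policy Dr.PathFinder uses, so the epsilon-greedy rule below is an assumption (it is a common choice with DQN), and the tabular update is a simplified stand-in for a gradient step on the Q-network. All names and constants are hypothetical.

```python
import random
from typing import List, Sequence


def greedy_action(q_values: Sequence[float]) -> int:
    """Deterministic policy: always pick the action with the largest Q-value."""
    return max(range(len(q_values)), key=lambda a: q_values[a])


def epsilon_greedy_action(
    q_values: Sequence[float], epsilon: float, rng: random.Random
) -> int:
    """Stochastic policy: explore with probability epsilon, else act greedily."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    return greedy_action(q_values)


def q_update(
    q_values: Sequence[float],
    action: int,
    reward: float,
    next_q_values: Sequence[float],
    alpha: float = 0.5,
    gamma: float = 0.9,
) -> List[float]:
    """One Q-learning step: move Q(s, a) toward reward + gamma * max Q(s', .)."""
    target = reward + gamma * max(next_q_values, default=0.0)
    updated = list(q_values)
    updated[action] += alpha * (target - updated[action])
    return updated


if __name__ == "__main__":
    q = [1.0, 0.0]                 # Q-values for two branch directions
    print(greedy_action(q))        # policy before the update
    q = q_update(q, action=1, reward=4.0, next_q_values=[])
    print(q, greedy_action(q))     # policy after the update
```

Because the policy is read directly from the Q-values, the single update to the second action (from 0.0 to 2.0) changes the greedy choice from action 0 to action 1, which is the sense in which training the Q-function improves the policy. Keeping `epsilon` above zero keeps the policy stochastic throughout.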

Acknowledgements

This research was supported by a grant from the Korea Health Technology R&D Project through the Korea Health Industry Development Institute (KHIDI) and funded by the Ministry of Health & Welfare, Republic of Korea (Grant Number: HI19C0791).

Author information

Authors and Affiliations

Division of Information Security, Graduate School of Information Security, Korea University, Seoul, Republic of Korea
Seungho Jeon & Jongsub Moon

Corresponding author

Correspondence to Jongsub Moon.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Jeon, S., Moon, J. Dr.PathFinder: hybrid fuzzing with deep reinforcement concolic execution toward deeper path-first search. Neural Comput & Applic 34, 10731–10750 (2022). https://doi.org/10.1007/s00521-022-07008-8

Received: 19 April 2021
Accepted: 30 January 2022
Published: 25 February 2022
Issue Date: July 2022
DOI: https://doi.org/10.1007/s00521-022-07008-8
