My Fuzzer Beats Them All! Developing a Framework for Fair Evaluation and Comparison of Fuzzers

Part of the Lecture Notes in Computer Science book series (LNSC, volume 12972)

Abstract

Fuzzing has become one of the most popular techniques to identify bugs in software. To improve the fuzzing process, a plethora of techniques have recently appeared in academic literature. However, evaluating and comparing these techniques is challenging as fuzzers depend on randomness when generating test inputs. Existing evaluations commonly follow best practices for fuzzing evaluations only partially. We argue that the reasons for this are twofold. First, it is unclear whether the proposed guidelines are necessary, due to the lack of comprehensive empirical data in the case of fuzz testing. Second, no framework yet exists that integrates statistical evaluation techniques to enable fair comparison of fuzzers.

To address these limitations, we introduce a novel fuzzing evaluation framework called SENF (Statistical EvaluatioN of Fuzzers). We demonstrate the practical applicability of our framework by utilizing the most widespread fuzzer, AFL, as our baseline fuzzer and exploring the impact of different evaluation parameters (e.g., the number of repetitions or run-time), compilers, seeds, and fuzzing strategies. Using our evaluation framework, we show that supposedly small changes to the parameters can have a major influence on the measured performance of a fuzzer.
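As an illustration of what a statistical comparison of two fuzzers over repeated trials can look like, the following minimal sketch computes a two-sided Mann-Whitney U test and the Vargha-Delaney A12 effect size. The abstract does not state which tests SENF applies, so the choice of these two statistics is an assumption, as are the function names and the made-up bug counts. It is not the SENF implementation.

```python
import math
from itertools import product


def pair_score(xs, ys):
    """Count pairs (x, y) with x > y; ties count as half a win."""
    return sum(1.0 if x > y else 0.5 if x == y else 0.0
               for x, y in product(xs, ys))


def vargha_delaney_a12(xs, ys):
    """Estimate the probability that a value from xs exceeds one from ys."""
    return pair_score(xs, ys) / (len(xs) * len(ys))


def mann_whitney_u(xs, ys):
    """Two-sided Mann-Whitney U test using the normal approximation.

    No tie or continuity correction is applied, so the p-value is only
    approximate, especially for small samples or heavily tied data.
    """
    n, m = len(xs), len(ys)
    u = pair_score(xs, ys)                      # U statistic for xs
    mean = n * m / 2.0                          # expected U under H0
    sd = math.sqrt(n * m * (n + m + 1) / 12.0)  # std. dev. of U under H0
    z = (u - mean) / sd
    p = math.erfc(abs(z) / math.sqrt(2.0))      # two-sided tail probability
    return u, p


# Hypothetical bug counts from 10 repeated trials per fuzzer (made-up data).
fuzzer_a = [12, 15, 14, 13, 16, 15, 14, 17, 13, 15]
fuzzer_b = [10, 11, 13, 9, 12, 11, 10, 12, 11, 10]

u, p = mann_whitney_u(fuzzer_a, fuzzer_b)
a12 = vargha_delaney_a12(fuzzer_a, fuzzer_b)
print(f"U={u:.1f}  p={p:.4f}  A12={a12:.2f}")
```

The p-value indicates whether the observed difference is plausibly due to chance, while A12 expresses how large the difference is. Reporting both avoids declaring a winner from a single run or from a difference that is statistically detectable but practically negligible.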

Notes

  1. https://github.com/CyberGrandChallenge/.

  2. Common Weakness Enumeration (CWE) is a list of software and hardware problem types (https://cwe.mitre.org/).

  3. https://github.com/mirrorer/afl/blob/master/qemu_mode/README.qemu.

  4. https://gitlab.com/laf-intel/laf-llvm-pass/tree/master.

Acknowledgements

Funded by the Deutsche Forschungsgemeinschaft (DFG, German Research Foundation) under Germany’s Excellence Strategy – EXC 2092 CASA – 390781972.

This work has been partially funded by the Deutsche Forschungsgemeinschaft (DFG, German Research Foundation) – SFB 1119 – 236615297.

Author information

Corresponding author

Correspondence to David Paaßen.

Copyright information

© 2021 Springer Nature Switzerland AG

About this paper

Cite this paper

Paaßen, D., Surminski, S., Rodler, M., Davi, L. (2021). My Fuzzer Beats Them All! Developing a Framework for Fair Evaluation and Comparison of Fuzzers. In: Bertino, E., Shulman, H., Waidner, M. (eds) Computer Security – ESORICS 2021. ESORICS 2021. Lecture Notes in Computer Science, vol 12972. Springer, Cham. https://doi.org/10.1007/978-3-030-88418-5_9

  • DOI: https://doi.org/10.1007/978-3-030-88418-5_9

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-88417-8

  • Online ISBN: 978-3-030-88418-5

  • eBook Packages: Computer Science, Computer Science (R0)