StringFuzz: A Fuzzer for String Solvers
In this paper, we introduce StringFuzz: a modular SMT-LIB problem instance transformer and generator for string solvers. We supply a repository of instances generated by StringFuzz in SMT-LIB 2.0/2.5 format. We systematically compare Z3str3, CVC4, Z3str2, and Norn on groups of such instances, and identify those that are particularly challenging for some solvers. We briefly explain our observations and show how StringFuzz helped discover causes of performance degradations in Z3str3.
In recent years, many algorithms for solving string constraints have been developed and implemented in SMT solvers such as Norn, CVC4, and Z3 (e.g., Z3str2 and Z3str3). To validate and benchmark these solvers, their developers have relied on hand-crafted input suites [1, 4, 5] or real-world examples from a limited set of industrial applications [2, 11]. These test suites have helped developers identify implementation defects and develop more sophisticated solving heuristics. Unfortunately, as more features are added to solvers, these benchmarks often remain stagnant, leaving a growing share of functionality untested. As such, there is an acute need for a more robust, inexpensive, and automatic way of generating benchmarks to test the correctness and performance of SMT solvers.
Fuzzing has been used to test all kinds of software, including SAT solvers. Inspired by the utility of fuzzers, we introduce StringFuzz and describe its value as an exploratory testing tool. We demonstrate its efficacy by presenting limitations it helped discover in leading string solvers. To the best of our knowledge, StringFuzz is the only tool aimed at automatic generation of string constraints. StringFuzz can be used to mutate or transform existing benchmarks, as well as randomly generate structured instances. These instances can be scaled with respect to a variety of parameters, e.g., length of string constants, depth of concatenations (concats) and regular expressions (regexes), number of variables, number of length constraints, and many more.
1. The StringFuzz tool: In Sect. 2, we describe a modular fuzzer that can transform and generate SMT-LIB 2.0/2.5 string and regex instances. Scaling inputs (e.g., long string constants, deep concatenations) are particularly useful in identifying asymptotic behaviors in solvers, and StringFuzz has many options to generate them. We briefly document StringFuzz’s components and modular architecture. We provide example use cases to demonstrate its utility as an exploratory solver testing tool.
2. A repository of SMT-LIB 2.0/2.5 instances: In Sect. 3, we present a repository of SMT-LIB 2.0/2.5 string and regex instance suites that we generated using StringFuzz. This repository consists of two categories: new instances generated by StringFuzz (generated), and instances derived by transforming a small suite of industrial benchmarks (transformed).
3. Experimental Results and Analysis: In Sect. 4, we compare the performance of Z3str3, CVC4, Z3str2, and Norn on the StringFuzz suites Concats-Balanced, Concats-Big, Concats-Extracts-Small, and Different-Prefix. We highlight these suites because they make some solvers perform poorly, but not others. We analyze our experimental results, and pinpoint algorithmic limitations in Z3str3 that cause poor performance.
stringfuzzg: This tool generates SMT-LIB instances. It supports several generators and options that specify its output. Details can be found in Table 1a.
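To make the idea of a scalable generator concrete, the following is a minimal Python sketch, not StringFuzz's own code. The function `concat_instance` and its parameters `depth`, `literal_length`, and `seed` are hypothetical names chosen for illustration. The sketch assumes the SMT-LIB 2.5-style string syntax (`str.++`, sort `String`); other dialects may name these operators differently.

```python
import random
import string


def concat_instance(depth, literal_length, seed=0):
    """Build one equation over a right-nested concatenation of `depth` variables.

    Hypothetical illustration: both the nesting depth and the length of the
    string constants are knobs that can be scaled independently.
    """
    rng = random.Random(seed)  # Seeded so that instances are reproducible.

    def literal():
        chars = (rng.choice(string.ascii_lowercase) for _ in range(literal_length))
        return '"' + "".join(chars) + '"'

    lines = []
    expr = literal()
    for i in range(depth):
        lines.append(f"(declare-fun v{i} () String)")
        expr = f"(str.++ v{i} {expr})"  # Wrap the expression one level deeper.
    lines.append(f"(assert (= {expr} {literal()}))")
    lines.append("(check-sat)")
    return "\n".join(lines)


if __name__ == "__main__":
    print(concat_instance(depth=3, literal_length=4))
```

Increasing `depth` or `literal_length` while holding the other fixed yields a family of instances along one axis, which is the kind of input that exposes asymptotic behavior in a solver.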
stringfuzzx: This tool transforms SMT-LIB instances. It supports several transformers and options that specify its input and output, which are explained in Table 1b. Note that the transformers Translate and Reverse also preserve satisfiability under certain conditions.
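The following minimal Python sketch illustrates why a reversing transformation can preserve satisfiability. It is not StringFuzz's implementation: the nested-tuple expression format and the functions `reverse_expr` and `evaluate` are hypothetical, and the sketch assumes that reversing means mirroring every string literal and the argument order of every concatenation. Under that assumption, a model of the original equation becomes a model of the reversed equation once each variable's value is reversed. The sketch covers only equations over concatenations.

```python
# Expressions are nested tuples: ("concat", a, b, ...), ("var", name), ("lit", text).
# This tiny format is a hypothetical stand-in for an SMT-LIB syntax tree.

def reverse_expr(expr):
    """Mirror an expression: reverse literals and the order of concat arguments."""
    kind = expr[0]
    if kind == "lit":
        return ("lit", expr[1][::-1])
    if kind == "var":
        return expr
    if kind == "concat":
        return ("concat",) + tuple(reverse_expr(arg) for arg in reversed(expr[1:]))
    raise ValueError(f"unsupported node: {kind}")


def evaluate(expr, model):
    """Evaluate an expression to a Python string under a variable assignment."""
    kind = expr[0]
    if kind == "lit":
        return expr[1]
    if kind == "var":
        return model[expr[1]]
    if kind == "concat":
        return "".join(evaluate(arg, model) for arg in expr[1:])
    raise ValueError(f"unsupported node: {kind}")


# Original equation: x ++ "ab" = "cdab", satisfied by x = "cd".
lhs = ("concat", ("var", "x"), ("lit", "ab"))
rhs = ("lit", "cdab")
model = {"x": "cd"}

# Reversed equation: "ba" ++ x = "badc", satisfied by the reversed model x = "dc".
rev_lhs, rev_rhs = reverse_expr(lhs), reverse_expr(rhs)
rev_model = {name: value[::-1] for name, value in model.items()}
```

Operators whose meaning depends on position within a string would need separate treatment, which is one reason such a transformation preserves satisfiability only under certain conditions.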
A third tool takes an SMT-LIB instance as input and outputs its properties: the number of variables/literals, the max/median syntactic depth of expressions, the max/median literal length, etc.
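Properties of this kind can be computed with a small s-expression parser. The Python sketch below is an illustration under stated assumptions rather than the tool's actual implementation: the names `parse`, `depth`, `literals`, and `stats` are hypothetical, string literals are assumed to be double-quoted with `""` as the only escape, and depth is measured on the argument of each `assert`.

```python
import re
from statistics import median

# A token is a quoted string literal, a parenthesis, or a bare symbol.
TOKEN = re.compile(r'"(?:[^"]|"")*"|[()]|[^\s()"]+')


def parse(text):
    """Parse SMT-LIB text into nested lists of tokens, one list per command."""
    stack = [[]]
    for token in TOKEN.findall(text):
        if token == "(":
            stack.append([])
        elif token == ")":
            finished = stack.pop()
            stack[-1].append(finished)
        else:
            stack[-1].append(token)
    return stack[0]


def depth(node):
    """Atoms have depth 0; a list is one deeper than its deepest child."""
    if not isinstance(node, list):
        return 0
    return 1 + max((depth(child) for child in node), default=0)


def literals(node):
    """Yield the contents of every string literal below a node."""
    if isinstance(node, list):
        for child in node:
            yield from literals(child)
    elif node.startswith('"'):
        yield node[1:-1]


def stats(text):
    """Summarize variables, literals, and expression depth of an instance."""
    commands = [c for c in parse(text) if isinstance(c, list) and c]
    asserts = [c for c in commands if c[0] == "assert"]
    lengths = [len(lit) for a in asserts for lit in literals(a)]
    depths = [depth(a[1]) for a in asserts]
    return {
        "variables": sum(c[0] in ("declare-fun", "declare-const") for c in commands),
        "literals": len(lengths),
        "max_depth": max(depths, default=0),
        "median_depth": median(depths) if depths else 0,
        "max_literal_length": max(lengths, default=0),
        "median_literal_length": median(lengths) if lengths else 0,
    }


EXAMPLE = """
(declare-fun x () String)
(declare-fun y () String)
(assert (= (str.++ x "ab") "cdab"))
(assert (= y "hello world"))
(check-sat)
"""
```

On `EXAMPLE`, `stats` reports 2 variables, 3 literals, a maximum depth of 2, and a maximum literal length of 11.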
Table 1. StringFuzz built-in (a) generators and (b) transformers.
Regex Generating Capabilities. StringFuzz can generate and transform instances with regex constraints. For example, the command “stringfuzzg regex -r 2 -d 1 -t 1 -M 3 -X 10” produces this instance:
Example Use Case. In Sect. 3 we use StringFuzz to generate benchmark suites in a batch mode. We can also use StringFuzz for on-line exploratory debugging. For example, the script below repeatedly feeds random StringFuzz instances to CVC4 until the solver produces an error:
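A loop of this kind can be sketched in Python as follows. This is a minimal sketch rather than the authors' script: the function `fuzz_until_error` is hypothetical, the generator and solver command lines are supplied by the caller, and the sketch assumes that a failure shows up either as a non-zero exit status or as the word "error" in the solver's standard output (SMT-LIB solvers report errors as `(error "...")` responses).

```python
import subprocess
import sys


def fuzz_until_error(generator_cmd, solver_cmd, max_rounds=1000, timeout=10):
    """Feed generated instances to a solver until one makes it fail.

    Returns the text of the offending instance, or None if no failure is
    seen within `max_rounds`. Both commands are lists of arguments.
    """
    for _ in range(max_rounds):
        # Ask the generator for one random instance on its standard output.
        instance = subprocess.run(
            generator_cmd, capture_output=True, text=True, check=True
        ).stdout
        try:
            # Pipe the instance into the solver's standard input.
            result = subprocess.run(
                solver_cmd, input=instance, capture_output=True,
                text=True, timeout=timeout,
            )
        except subprocess.TimeoutExpired:
            continue  # A timeout is not counted as an error in this sketch.
        if result.returncode != 0 or "error" in result.stdout.lower():
            return instance
    return None


if __name__ == "__main__":
    # Usage: python fuzz_loop.py "<generator command>" "<solver command>"
    failing = fuzz_until_error(sys.argv[1].split(), sys.argv[2].split())
    print(failing if failing is not None else "no error found")
```

Returning the instance text, instead of only reporting that a failure occurred, keeps the failing input available for later minimization and debugging.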
3 Instance Suites
In this section, we describe the benchmark suites that we generated with StringFuzz and on which we conducted our experimental evaluation. Table 2a lists instances that were generated by stringfuzzg. Table 2b lists instances derived from existing seed instances by iteratively applying stringfuzzx. Every transformed instance is named according to its seed and the transformations it underwent. For example, z3-regex-1-fuzz-graft.smt2 was produced by applying Fuzz and then Graft to z3-regex-1.smt2.
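The naming convention can be expressed as a short Python helper. The function `transformed_name` is a hypothetical illustration inferred from the single example above, assuming that transformer names are lowercased and appended to the seed's base name in the order they were applied.

```python
from pathlib import Path


def transformed_name(seed, transformers):
    """Derive a transformed instance's file name from its seed and transformers."""
    seed_path = Path(seed)
    # Append each transformer name, lowercased, in order of application.
    suffix = "".join(f"-{name.lower()}" for name in transformers)
    return f"{seed_path.stem}{suffix}{seed_path.suffix}"


if __name__ == "__main__":
    print(transformed_name("z3-regex-1.smt2", ["Fuzz", "Graft"]))
```

With the seed and transformers from the example, the helper yields `z3-regex-1-fuzz-graft.smt2`.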
Table 2. Repository of 10,258 SMT-LIB 2.0/2.5 instances.
4 Experimental Results and Analysis
Usefulness to Z3str3: A Case Study. StringFuzz’s ability to produce scaling instances helped uncover several implementation issues and performance limitations in Z3str3. Scaling inputs can reveal issues that would normally be out of scope for unit tests or industrial benchmarks. Three different performance and implementation bugs were identified and fixed in Z3str3 as a result of testing with the StringFuzz scaling suites Lengths-Long and Concats-Big.
StringFuzz also helped identify a number of performance-related issues and opportunities for new heuristics in Z3str3. For example, by examining Z3str3’s execution traces on the instances in the Concats-Big suite, we discovered a potential new heuristic. In particular, Z3str3 does not make full use of the solving context (e.g., the fact that some terms are empty strings) to simplify the concatenation of a long list of string terms before trying to reason about the equivalences among sub-terms. Z3str3 therefore introduces a large number of unnecessary intermediate variables and propagations.
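The kind of context-aware simplification described above can be sketched in Python. This is an illustration of the idea only, not Z3str3's code: the function `simplify_concat`, the `(kind, value)` term format, and the `known_empty` set are hypothetical. The sketch drops every term already known to be the empty string and merges adjacent constants, so that later reasoning sees a shorter concatenation.

```python
def simplify_concat(terms, known_empty):
    """Simplify a flat concatenation using facts from the solving context.

    `terms` is a list of ("var", name) or ("lit", text) pairs, and
    `known_empty` is the set of variable names known to equal "".
    """
    simplified = []
    for kind, value in terms:
        if kind == "var" and value in known_empty:
            continue  # The variable contributes nothing to the concatenation.
        if kind == "lit" and value == "":
            continue  # Empty constants can be dropped as well.
        if kind == "lit" and simplified and simplified[-1][0] == "lit":
            # Merge with the preceding constant instead of adding a new term.
            simplified[-1] = ("lit", simplified[-1][1] + value)
        else:
            simplified.append((kind, value))
    return simplified


if __name__ == "__main__":
    terms = [("var", "a"), ("lit", "x"), ("var", "b"), ("lit", "y"), ("var", "c")]
    print(simplify_concat(terms, known_empty={"b"}))
```

In the example, knowing that `b` is empty reduces five terms to three, because the constants on either side of `b` become adjacent and are merged.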
5 Related Work
Many solver developers create their own test suites to validate their solvers [1, 4, 5]. Several popular instance suites are also publicly available for solver testing and benchmarking, such as the Kaluza [2] and Kausler [11] suites. There are likewise several fuzzers and instance generators currently available, but none of them can generate or transform string and regex instances. For example, the FuzzSMT [9] tool generates SMT-LIB instances with bit-vectors and arrays, but does not support strings or regexes. The SMTpp [8] tool pre-processes and simplifies instances, but does not generate new ones or fuzz existing ones.
- 1. CVC4 regression test suite. https://github.com/CVC4/CVC4/tree/master/test/regress
- 2. Kaluza benchmark suite. http://webblaze.cs.berkeley.edu/2010/kaluza/
- 3. Stringfuzz source code, benchmark suites, and supplemental material. http://stringfuzz.dmitryblotsky.com
- 4. Z3str2 test suite. https://github.com/z3str/Z3-str/tree/master/tests
- 5. Z3str3 test scripts. https://github.com/Z3Prover/z3/tree/master/src/test
- 7. Berzish, M., Ganesh, V., Zheng, Y.: Z3str3: a string solver with theory-aware heuristics. In: Stewart, D., Weissenbacher, G. (eds.) 2017 Formal Methods in Computer Aided Design, FMCAD 2017, Vienna, Austria, 2–6 October 2017, pp. 55–59. IEEE (2017)
- 8. Bonichon, R., Déharbe, D., Dobal, P., Tavares, C.: SMTpp: preprocessors and analyzers for SMT-LIB. In: Proceedings of the 13th International Workshop on Satisfiability Modulo Theories, SMT 2015 (2015)
- 9. Brummayer, R., Biere, A.: Fuzzing and delta-debugging SMT solvers. In: Proceedings of the 7th International Workshop on Satisfiability Modulo Theories, SMT 2009, pp. 1–5. ACM, New York, NY, USA (2009)
- 11. Kausler, S., Sherman, E.: Evaluation of string constraint solvers in the context of symbolic execution. In: Proceedings of the 29th ACM/IEEE International Conference on Automated Software Engineering, ASE 2014, pp. 259–270. ACM, New York, NY, USA (2014)
- 13. Zheng, Y., Zhang, X., Ganesh, V.: Z3-str: a Z3-based string solver for web application analysis. In: Meyer, B., Baresi, L., Mezini, M. (eds.) Joint Meeting of the European Software Engineering Conference and the ACM SIGSOFT Symposium on the Foundations of Software Engineering, ESEC/FSE 2013, Saint Petersburg, Russian Federation, 18–26 August 2013, pp. 114–124. ACM (2013)
Open Access This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.
The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.