StringFuzz: A Fuzzer for String Solvers

In this paper, we introduce StringFuzz: a modular SMT-LIB problem instance transformer and generator for string solvers. We supply a repository of instances generated by StringFuzz in SMT-LIB 2.0/2.5 format. We systematically compare Z3str3, CVC4, Z3str2, and Norn on groups of such instances, and identify those that are particularly challenging for some solvers. We briefly explain our observations and show how StringFuzz helped discover causes of performance degradations in Z3str3.


Introduction
In recent years, many algorithms for solving string constraints have been developed and implemented in SMT solvers such as Norn [6], CVC4 [12], and Z3 (e.g., Z3str2 [13] and Z3str3 [7]). To validate and benchmark these solvers, their developers have relied on hand-crafted input suites [1,4,5] or real-world examples from a limited set of industrial applications [2,11]. These test suites have helped developers identify implementation defects and develop more sophisticated solving heuristics. Unfortunately, as more features are added to solvers, these benchmarks often remain stagnant, leaving increasing functionality untested. As such, there is an acute need for a more robust, inexpensive, and automatic way of generating benchmarks to test the correctness and performance of SMT solvers.
Fuzzing has been used to test all kinds of software including SAT solvers [10]. Inspired by the utility of fuzzers, we introduce StringFuzz and describe its value as an exploratory testing tool. We demonstrate its efficacy by presenting limitations it helped discover in leading string solvers. To the best of our knowledge, StringFuzz is the only tool aimed at automatic generation of string constraints. StringFuzz can be used to mutate or transform existing benchmarks, as well as randomly generate structured instances. These instances can be scaled with respect to a variety of parameters, e.g., length of string constants, depth of concatenations (concats) and regular expressions (regexes), number of variables, number of length constraints, and many more.

1. The StringFuzz tool: In Sect. 2, we describe a modular fuzzer that can transform and generate SMT-LIB 2.0/2.5 string and regex instances. Scaling inputs (e.g., long string constants, deep concatenations) are particularly useful in identifying asymptotic behaviors in solvers, and StringFuzz has many options to generate them. We briefly document StringFuzz's components and modular architecture. We provide example use cases to demonstrate its utility as an exploratory solver testing tool.

2. A repository of SMT-LIB 2.0/2.5 instances: We present a repository of SMT-LIB 2.0/2.5 string and regex instance suites that we generated using StringFuzz in Sect. 3. This repository consists of two categories: one with new instances generated by StringFuzz (generated); and another with transformed instances generated from a small suite of industrial benchmarks (transformed).

3. Experimental Results and Analysis: We compare the performance of Z3str3, CVC4, Z3str2, and Norn on the StringFuzz suites Concats-Balanced, Concats-Big, Concats-Extracts-Small, and Different-Prefix in Sect. 4. We highlight these suites because they make some solvers perform poorly, but not others. We analyze our experimental results, and pinpoint algorithmic limitations in Z3str3 that cause poor performance.

StringFuzz
Implementation and Architecture. StringFuzz is implemented as a Python package, and comes with several executables to generate, transform, and analyze SMT-LIB 2.0/2.5 string and regex instances. Its components are implemented as UNIX "filters" to enable easy integration with other tools (including each other). For example, the outputs of generators can be piped into transformers, and transformers can be chained to produce a stream of tuned inputs to a solver. StringFuzz is composed of the following tools:

stringfuzzg: This tool generates SMT-LIB instances. It supports several generators and options that specify its output. Details can be found in Table 1a.

stringfuzzx: This tool transforms SMT-LIB instances. It supports several transformers and options that specify its output and input, which are explained in Table 1b. Note that transformers Translate and Reverse also preserve satisfiability under certain conditions.

stringstats: This tool takes an SMT-LIB instance as input and outputs its properties: the number of variables/literals, the max/median syntactic depth of expressions, the max/median literal length, etc.

(a) stringfuzzg built-in generators.

| Name | Generates instances that have ... |
|---|---|
| Concats | Long concats and optional random extracts. |
| Lengths | Many variables (and their concats) with length constraints. |
| Overlaps | An expression of the form A.X = X.B. |
| Equality | An equality among concats, each with variables or constants. |
| Regex | Regexes of varying complexity. |
| Random-Text | Totally random ASCII text. |
| Random-AST | Random string and regex constraints. |

(b) stringfuzzx built-in transformers.

| Name | Description |
|---|---|
| Fuzz | Replaces literals and operators with similar ones. |
| Graft | Randomly swaps non-leaf nodes with leaf nodes. |
| Multiply^a | Multiplies integers and repeats strings by N. |
| Reverse^b | Reverses all string literals and concat arguments. |
| Rotate | Rotates compatible nodes in syntax tree. |
| Translate^b | Permutes the alphabet. |
| Unprintable | Replaces characters in literals with unprintable ones. |

^a Can guarantee satisfiable output instances from satisfiable input instances [3].
^b Can guarantee that input and output instances are equisatisfiable [3].
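To make the Overlaps row of the generator table concrete, the following is a minimal sketch of how an instance of the form A.X = X.B could be emitted in SMT-LIB 2.5 syntax. It is not StringFuzz's own generator: the function names `random_literal` and `overlaps_instance`, the choice of A and B as random lowercase constants, and the `seed` parameter are assumptions made for illustration.

```python
import random
import string


def random_literal(rng, length):
    """Return a random lowercase ASCII string of the given length."""
    return "".join(rng.choice(string.ascii_lowercase) for _ in range(length))


def overlaps_instance(length, seed=0):
    """Build an SMT-LIB 2.5 instance asserting A.X = X.B.

    A and B are random constants of the same length (an assumption of this
    sketch); X is the only string variable.
    """
    rng = random.Random(seed)
    a = random_literal(rng, length)
    b = random_literal(rng, length)
    lines = [
        "(declare-fun X () String)",
        f'(assert (= (str.++ "{a}" X) (str.++ X "{b}")))',
        "(check-sat)",
    ]
    return "\n".join(lines)


print(overlaps_instance(4))
```

Increasing `length` scales the size of the constants, which is the kind of parameter the text describes as useful for exposing asymptotic solver behavior.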
We organized StringFuzz to be easily extended. To show this, we note that while the whole project contains 3,183 lines of code, it takes an average of 45 lines of code to create a transformer. StringFuzz can be installed from source, or from the Python PIP package repository.
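As an illustration of how small a transformer can be, here is a hedged sketch of a Reverse-style transformation over a toy syntax tree. The representation (nested Python lists with a hypothetical `Lit` class for string literals) and the function name `reverse` are assumptions of this sketch and do not reflect StringFuzz's internal AST or API. Only literals and `str.++` are handled; regex operators are out of scope here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Lit:
    """A string literal node (hypothetical representation)."""
    value: str


def reverse(node):
    """Reverse every string literal and the argument order of every concat."""
    if isinstance(node, Lit):
        return Lit(node.value[::-1])
    if isinstance(node, list):
        head, *args = node
        args = [reverse(arg) for arg in args]
        if head == "str.++":
            # Only concat arguments are reordered; other operators keep theirs.
            args.reverse()
        return [head, *args]
    # Anything else (e.g. a variable name) is left unchanged.
    return node


expr = ["=", ["str.++", Lit("ab"), "X"], Lit("abc")]
print(reverse(expr))
```

Applying `reverse` twice returns the original tree, which is a convenient sanity check for a transformation of this kind.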
Example Use Case. In Sect. 3 we use StringFuzz to generate benchmark suites in a batch mode. We can also use StringFuzz for on-line exploratory debugging. For example, the script below repeatedly feeds random StringFuzz instances to CVC4 until the solver produces an error:

    while stringfuzzg -r random-ast -m \
        | tee instance.smt25 | cvc4 --lang smt2.5 --tlimit=5000 --strings-exp; do
        sleep 0
    done
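The same exploratory loop can be sketched in Python, which may be convenient when the failing instance needs further processing. This is an assumption-laden sketch rather than part of StringFuzz: the function `fuzz_until_failure` and its parameters are hypothetical, and the two command lists simply repeat the invocations from the shell script. A non-zero solver exit status is treated as "an error", mirroring the shell loop's stopping condition.

```python
import subprocess

# Command lines copied from the shell script in the text.
GENERATOR_CMD = ["stringfuzzg", "-r", "random-ast", "-m"]
SOLVER_CMD = ["cvc4", "--lang", "smt2.5", "--tlimit=5000", "--strings-exp"]


def fuzz_until_failure(generator_cmd, solver_cmd,
                       instance_path="instance.smt25", max_rounds=1000):
    """Feed generated instances to a solver until it exits with an error.

    Returns the 1-based round in which the solver failed, or None if it
    never failed within max_rounds. The last instance is kept on disk.
    """
    for round_no in range(1, max_rounds + 1):
        # Run the generator and capture the instance it prints.
        generated = subprocess.run(generator_cmd, capture_output=True, check=True)
        # Save the instance first (like `tee`), so a crash leaves it behind.
        with open(instance_path, "wb") as handle:
            handle.write(generated.stdout)
        # Pipe the instance into the solver's standard input.
        solved = subprocess.run(solver_cmd, input=generated.stdout,
                                capture_output=True)
        if solved.returncode != 0:
            return round_no
    return None
```

With both tools installed, `fuzz_until_failure(GENERATOR_CMD, SOLVER_CMD)` would leave the offending instance in `instance.smt25`. Unlike the shell loop, this sketch stops after `max_rounds` iterations.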

Instance Suites
In this section, we describe the benchmark suites we generated with StringFuzz, and on which we conducted our experimental evaluation. Table 2a lists instances that were generated by stringfuzzg. Table 2b lists instances derived from existing seed instances by iteratively applying stringfuzzx. Every transformed instance is named according to its seed and the transformations it underwent. For example, z3-regex-1-fuzz-graft.smt2 was transformed by applying Fuzz and then Graft to z3-regex-1.smt2.
The Amazon category contains 472 instances derived from two seeds supplied by our industrial collaborators. The Regex category is seeded by the Z3str2 regex test suite [4], which contains 42 instances. Through cumulative transformations we expanded the 42 seeds to 7,551 unique instances. Finally, the Sanitizer category is obtained from five industrial e-mail address and IPv4 sanitizers.
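The naming convention for transformed instances can be sketched as follows. The helper `transformed_name` is hypothetical and is not claimed to be how the repository was produced. It only reproduces the pattern visible in the example z3-regex-1-fuzz-graft.smt2: seed stem, then lower-cased transformer names in application order, then the seed's extension.

```python
from pathlib import PurePosixPath


def transformed_name(seed_filename, transformers):
    """Derive an instance name from its seed and the applied transformers."""
    path = PurePosixPath(seed_filename)
    # Seed stem followed by each transformer name, in application order.
    parts = [path.stem] + [name.lower() for name in transformers]
    return "-".join(parts) + path.suffix


print(transformed_name("z3-regex-1.smt2", ["Fuzz", "Graft"]))
# z3-regex-1-fuzz-graft.smt2
```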

Experimental Results and Analysis
We generated several problem instance suites with StringFuzz that made one solver perform poorly, but not others. They are Concats-Balanced, Concats-Big, Concats-Extracts-Small, and Different-Prefix. Figure 1 shows the suites that [...]

StringFuzz also helped identify a number of performance-related issues and opportunities for new heuristics in Z3str3. For example, by examining Z3str3's execution traces on the instances in the Concats-Big suite, we discovered a potential new heuristic. In particular, Z3str3 does not make full use of the solving context (e.g., the fact that some terms are empty strings) to simplify concatenations of a long list of string terms before trying to reason about the equivalences among subterms. Z3str3 therefore introduces a large number of unnecessary intermediate variables and propagations.
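The idea behind the potential heuristic can be illustrated with a small sketch: flatten a nested concatenation and drop the terms the solving context already knows to be empty, before any reasoning about subterm equivalences takes place. This is not Z3str3's implementation. The tuple-based term representation and the names `flatten_concat`, `simplify_concat`, and `known_empty` are assumptions chosen for illustration.

```python
def flatten_concat(term):
    """Flatten nested ("concat", ...) tuples into a flat list of leaf terms."""
    if isinstance(term, tuple) and term[:1] == ("concat",):
        leaves = []
        for arg in term[1:]:
            leaves.extend(flatten_concat(arg))
        return leaves
    return [term]


def simplify_concat(term, known_empty):
    """Drop leaves known to equal the empty string from a concatenation."""
    return [leaf for leaf in flatten_concat(term) if leaf not in known_empty]


nested = ("concat", "a", ("concat", "b", ("concat", "c", "d")))
print(simplify_concat(nested, known_empty={"b", "d"}))  # ['a', 'c']
```

In this toy example a concatenation of four terms shrinks to two, so fewer intermediate variables would be needed to relate it to other terms.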

Related Work
Many solver developers create their own test suites to validate their solvers [1,4,5]. Several popular instance suites are also publicly available for solver testing and benchmarking, such as the Kaluza [2] and Kausler [11] suites. There are likewise several fuzzers and instance generators currently available, but none of them can generate or transform string and regex instances. For example, the FuzzSMT [9] tool generates SMT-LIB instances with bit-vectors and arrays, but does not support strings or regexes. The SMTpp [8] tool pre-processes and simplifies instances, but does not generate new ones or fuzz existing ones.