
1 Introduction

Path-merging [1, 7, 8] is a technique that speeds up Dynamic Symbolic Execution (DSE) by collapsing the paths within a code region into a single disjunctive logical constraint. Java Ranger (JR) [12] is a path-merging tool for Java programs that summarizes symbolic branches during execution. JR generates the disjunctive constraint for a code region predicated on a symbolic branch by applying a sequence of transformations. For example, JR alternates between substituting values for local variables in its summary and inlining method summaries to eliminate dynamically dispatched method invocations. See [11] for more information.

2 Path Merging Extensions and Results

Despite handling many Java language features, the version of JR that participated in SV-COMP 2020 [10] did not support symbolic execution of string functions. It also did not summarize arrayload and arraystore statements that occur outside a code region predicated on a symbolic branch. For example, if a and i are symbolic, JR could summarize a region of the form \(if(a) \; \{ myval = arr[i] ...\}\) but not the bare statement \(myval = arr[i]\). More precisely, the features newly added to JR are:

  1. Summarizing Array Creation of Symbolic Size: to support the creation of symbolic-sized single- and multi-dimensional arrays, we bound the symbolic size to several concrete values and execute the program on each of them.

  2. Summarizing ArrayLoad and ArrayStore: to support an arrayload or arraystore at a symbolic index, we create a disjunctive constraint that describes the possible valuations. This constraint is then pushed onto the path condition. For example, for a symbolic index i and an array arr of size 3, we encode an arrayload of the form \(myval=arr[i]\) as

    $$myval:=\text {ite}(i==0, arr[0], \text {ite}(i==1, arr[1],arr[2]))$$

    Similarly, we encode the arraystore of the form \(arr[i]=myval\) as

    $$ \begin{array}{c} arr[0]_{new}:=\text {ite}(i==0,myval,arr[0]_{old}) \\ \wedge \; arr[1]_{new}:=\text {ite}(i==1, myval, arr[1]_{old}) \\ \wedge \; arr[2]_{new}:=\text {ite}(i==2, myval,arr[2]_{old}) \end{array} $$

    where \(arr[k]_{old}\) and \(arr[k]_{new}\) denote the old and the new values of the array arr at the concrete index k.

  3. Symbolically Executing Symbolic Strings: we added support for basic string operations of the String and StringBuilder classes; these include, but are not limited to, charAt, concat, contains, endsWith, equals, indexOf, length, replace, startsWith, isEmpty and substring.
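The arrayload and arraystore encodings from item 2 can be illustrated with a minimal Java sketch. It is not JR's implementation: the class and method names are hypothetical, the constraint is rendered as a plain string, and the "evaluation" simply runs the same if-then-else chain on concrete values to show that it agrees with ordinary array semantics for in-bounds indices. Out-of-bounds indices are assumed to be handled elsewhere.

```java
import java.util.Arrays;

// Hypothetical illustration of the ite-based array encoding; not JR code.
public class SymbolicArraySketch {

    // Renders the nested ite term for a load arr[idx] from an array of the given size.
    // The last element acts as the default (innermost) branch.
    static String encodeLoad(String arr, String idx, int size) {
        String term = arr + "[" + (size - 1) + "]";
        for (int k = size - 2; k >= 0; k--) {
            term = "ite(" + idx + "==" + k + ", " + arr + "[" + k + "], " + term + ")";
        }
        return term;
    }

    // Evaluates the same ite chain on concrete values (assumes 0 <= i < arr.length).
    static int evalLoad(int[] arr, int i) {
        int result = arr[arr.length - 1];
        for (int k = arr.length - 2; k >= 0; k--) {
            result = (i == k) ? arr[k] : result;
        }
        return result;
    }

    // Evaluates the store encoding: every cell k gets ite(i == k, v, old[k]).
    static int[] evalStore(int[] oldArr, int i, int v) {
        int[] newArr = new int[oldArr.length];
        for (int k = 0; k < oldArr.length; k++) {
            newArr[k] = (i == k) ? v : oldArr[k];
        }
        return newArr;
    }

    public static void main(String[] args) {
        System.out.println(encodeLoad("arr", "i", 3));
        System.out.println(evalLoad(new int[] {10, 20, 30}, 1));
        System.out.println(Arrays.toString(evalStore(new int[] {10, 20, 30}, 2, 99)));
    }
}
```

Note that the store produces one conjunct per array cell, so the size of the constraint grows linearly with the array length.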

2.1 Run Configuration

In addition to the JR configurations used in SV-COMP 2020 [10], we used the following options to turn on the added features:

  • symbolic.jrarrays=true: to enable the above array features.

  • symbolic.strings=true: to enable symbolic execution of strings.

  • symbolic.string_dp=z3str3: to use Z3’s default string theory.

  • symbolic.string_dp_timeout_ms=3000: for timeout on the string queries.

Table 1. Results of the JR version that participated in SV-COMP 2020 versus the improved 2023 version

2.2 Results

To understand the value of the JR extensions above, we compared the old JR tool [9] from SV-COMP 2020, which supported neither symbolic arrays nor symbolic strings, with the JR version participating in 2023. We ran both versions on the verification tasks used in SV-COMP 2023. The results in Table 1 show an increase in the number of correctly solved tasks from 429 to 475 and, more importantly, a reduction in incorrect results from 97 to zero. These improvements demonstrate the value of the added support.

Unfortunately, because the current version of JR does not support witness generation, none of its correct false verdicts counted toward the SV-COMP 2023 score [2], so JR scored 400 points instead of 675. In the future, we plan to extend JR to support witness generation.

3 Formula Structure in Path-Merged String Constraints

Fig. 1. loopCharAt example

Fig. 1 shows loopCharAt, an SV-COMP 2023 verification task [3] (based on an example of Avgerinos et al. [1]) that can benefit dramatically from path-merging. The task accepts a symbolic string arg and checks each character to see whether it is the letter ‘B’; if so, it increments counter. The assertion fails if the value of counter can be 121.

Fig. 2. Average running time by size and query type

For a symbolic string of length n, this code has \(2^n\) execution paths, since each character can be B or not B independently. But applying path merging to the if statement leads to a single execution path for a given length string. While JR sees this expected asymptotic benefit (one path per string length), reaching the assertion failure takes more than 2 hours, well beyond the competition time limit. Most time is spent in the solver, so we investigated whether changing the syntax of the query could improve performance.
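The following Java sketch illustrates the path-count argument above. It is a reconstruction from the prose description, not the benchmark source: the class and method names are hypothetical. The first method has the branching shape described for loopCharAt; the second mirrors, at source level, what a merged summary of the if statement computes (the counter update becomes an if-then-else expression). JR itself works on bytecode, so the second method only illustrates the form of the summary.

```java
// Hypothetical sketch reconstructed from the description of loopCharAt; not the benchmark itself.
public class LoopCharAtSketch {

    // Branching form: under plain DSE each iteration forks on a symbolic character.
    static int countBranching(String arg) {
        int counter = 0;
        for (int i = 0; i < arg.length(); i++) {
            if (arg.charAt(i) == 'B') {
                counter++;
            }
        }
        return counter;
    }

    // Merged form: the region is summarized as counter' = ite(c == 'B', counter + 1, counter),
    // leaving a single path per string length.
    static int countMerged(String arg) {
        int counter = 0;
        for (int i = 0; i < arg.length(); i++) {
            counter = (arg.charAt(i) == 'B') ? counter + 1 : counter;
        }
        return counter;
    }

    // Number of paths explored without merging for a string of length n (valid for 0 <= n < 63).
    static long unmergedPaths(int n) {
        return 1L << n;
    }

    public static void main(String[] args) {
        System.out.println(countBranching("ABBA") + " " + countMerged("ABBA"));
        System.out.println(unmergedPaths(10));
    }
}
```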

Each query generated from the satisfiability of the assert statement asks whether an n-character string can contain 121 (or more generally, k) B characters; this query is satisfiable if \(0 \le k \le n\). We used a script to generate variations of the query for different values of n and k, and different semantically equivalent ways of expressing the constraints. We then measured the time to solve the queries using Z3 4.8.15 with the seq string solver, on an Intel i7-3770 workstation running Ubuntu 20.04. The choice of k appeared to have little effect on performance, so we report the results of averaging over runs with \(0 \le k \le n + 1\). Figure 2 shows how the running time grows with n, and that the query style has a large impact on performance.

We describe the query styles in order of increasing overhead. Because no complex string operations are needed, an equivalent query can be expressed in a simple bit-vector (QF_BV) logic. This was by far the fastest, and the only style where the running time appears to grow linearly with n. The remaining styles use a logic of strings and integers (QF_SLIA), and we started with the constraint style that seemed most natural to write by hand (“clean”) and sequentially added complexities to make the constraints increasingly similar to those JR produces. All these QF_SLIA styles appear to slow down as a cubic polynomial in n, as illustrated by the best-fit lines. Two features of JR’s queries had little effect on performance: expressing the string length with a series of inequalities (in JR these come from the loop), and introducing a temporary variable corresponding to each update of the counter. A modest but measurable slowdown came from expressing the effect of the merged region with OR and AND operations, instead of the functional if-then-else operator. A final dramatic slowdown came from constraining the value of each character via its character code (= (str.to_code (str.at s 0)) 66) (natural because Java’s char is an integer type) instead of as a one-character string (= (str.at s 0) "B"). These results suggest that this verification task could become feasible in 15 minutes if either JR or solvers can transform the slow-to-solve forms into fast-to-solve ones.
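To make the difference between the last two styles concrete, here is a small hypothetical Java generator for queries in a "clean" functional style. It is not the script used in the experiments, and the exact text of the measured queries is an assumption; only the two character-constraint forms, (= (str.at s 0) "B") and (= (str.to_code (str.at s 0)) 66), are taken from the text. The generated query asks whether a string of length n can contain exactly k occurrences of B.

```java
// Hypothetical query generator; the layout of the generated SMT-LIB text is an assumption.
public class QueryStyleSketch {

    enum CharStyle { ONE_CHAR_STRING, CHAR_CODE }

    // Constraint stating that the character at position i is 'B', in the chosen style.
    static String charIsB(int i, CharStyle style) {
        String at = "(str.at s " + i + ")";
        return style == CharStyle.CHAR_CODE
                ? "(= (str.to_code " + at + ") 66)"   // 66 is the code point of 'B'
                : "(= " + at + " \"B\")";
    }

    // Builds a QF_SLIA query: can a string of length n contain exactly k 'B' characters?
    static String query(int n, int k, CharStyle style) {
        if (n < 1) {
            throw new IllegalArgumentException("n must be >= 1");
        }
        StringBuilder sb = new StringBuilder();
        sb.append("(set-logic QF_SLIA)\n");
        sb.append("(declare-const s String)\n");
        sb.append("(assert (= (str.len s) ").append(n).append("))\n");
        // The leading 0 keeps the sum well-formed even when n == 1.
        sb.append("(assert (= ").append(k).append(" (+ 0");
        for (int i = 0; i < n; i++) {
            sb.append(" (ite ").append(charIsB(i, style)).append(" 1 0)");
        }
        sb.append(")))\n");
        sb.append("(check-sat)\n");
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.print(query(3, 2, CharStyle.ONE_CHAR_STRING));
        System.out.print(query(3, 2, CharStyle.CHAR_CODE));
    }
}
```

The two outputs differ only in how each character is constrained, which makes it possible to time that single change in isolation with a solver of one's choice.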

4 Data-Availability Statement

Java Ranger is developed at the University of Minnesota and is continuously maintained on GitHub [6]. Readers interested in reproducing Java Ranger's results in the competition can find an artifact at [4, 5].