Abstract
There are many hard verification problems that are currently only solvable by applying several verifiers that are based on complementing technologies. Conditional model checking (CMC) is a successful solution for cooperation between verification tools. In CMC, the first verifier outputs a condition describing the state space that it successfully verified. The second verifier uses the condition to focus its verification on the unverified state space. To use arbitrary second verifiers, we recently proposed a reducer-based approach. One can use the reducer-based approach to construct a conditional verifier from a reducer and a (non-conditional) verifier: the reducer translates the condition into a residual program that describes the unverified state space and the verifier can be any off-the-shelf verifier (that does not need to understand conditions). Until now, only one reducer was available. But for a systematic investigation of the reducer concept, we need several reducers. To fill this gap, we developed FRed, a Framework for exploring different REDucers. Given an existing reducer, FRed allows us to derive various new reducers, which differ in their trade-off between size and precision of the residual program. For our experiments, we derived seven different reducers. Our evaluation on the largest and most diverse public collection of verification problems shows that we need all seven reducers to solve hard verification tasks that were not solvable before with the considered verifiers.
Replication package available on Zenodo [12].
Funded in part by the Deutsche Forschungsgemeinschaft (DFG) – 418257054 (Coop).
1 Introduction
Due to the undecidability of software verification, even after more than 40 years of research on automatic software verification [31], some hard verification tasks cannot be solved by a single verifier alone. To increase the number of solvable tasks, one needs to combine the strengths of distinct verifiers. Several combinations [3, 8, 9, 20, 23, 25, 32, 33, 37] were proposed in the literature. One promising combination is conditional model checking (CMC) [9], which, unlike others, neither modifies the programs nor requires the combined techniques to know each other. CMC works as follows: If the first verifier gives up on the verification task, it outputs a condition that describes the state space that it successfully verified. The (conditional) second verifier uses the condition of the first verifier to focus its work on the still-unverified state space. Note that one can easily extend the CMC approach to more than two verifiers by letting all verifiers generate conditions.
To easily construct conditional verifiers (i.e., verifiers that understand conditions) from existing off-the-shelf verifiers, a recent work proposed the concept of reducer-based CMC [13]. Instead of making a verifier aware of conditions, reducer-based CMC constructs a conditional verifier from an existing verifier by plugging a reducer in front of the verifier. The reducer is a preprocessor that, given the original program and the condition as input, translates the condition into a (residual) program, a format that is understandable by classic verifiers.
The construction of a reducer, especially proving its soundness, is complex, and so far there exists only one reducer. However, this reducer’s translation is very precise and, therefore, may construct programs that are orders of magnitude larger than the original program. To solve this problem, and to support systematic experimentation with different reducers, we propose the formal framework FRed, which streamlines and simplifies the construction of new reducers from existing ones. Its underlying idea is to construct a new reducer \(r=F \circ R\), a so-called fold-reducer, by sequentially composing an existing reducer R with a folder F. A folder uses a heuristic that specifies how to modify the program constructed by the existing reducer. More concretely, a folder defines which program locations of the program constructed by the existing reducer are collapsed into a new location and, thus, specifies how to coarsen the program. However, to avoid false alarms, the specified coarsening must not add new program behavior.
New conditional verifiers CV can be constructed with FRed according to the equation \(CV = V \circ (F \circ R)\), where \(r=(F \circ R)\) is the fold-reducer composed of the existing reducer R and a folder F, V is an arbitrary verifier, and \(\circ \) is the sequential composition operator. Figure 1 illustrates this construction in the context of reducer-based CMC. We used this construction to build 49 conditional verifiers, which use the already existing reducer, one of seven folders, and one of seven verifiers. Our large experimental study revealed that using several reducers (with different folders) can make the overall verification more effective.
Contributions. We make the following contributions:

We introduce FRed, a framework for the composition of new reducers from existing reducers and folding heuristics.

We prove that FRed derives valid reducers in case the existing reducer is valid and the folding heuristic adheres to a correctness constraint.

We use our framework FRed to derive seven new reducers from the existing reducer ParComp [13] and use them in various conditional verifiers.

We experimentally show that the overall effectiveness of reducer-based CMC can be increased using various reducers.

Our reducers and all experimental data are available for replication and to construct further conditional model checkers (see Sect. 6).
2 Background
Program Representation. Following the literature [8, 10], we model a program by a control-flow automaton (CFA) \(C = (L, \ell _0,G)\) consisting of a set \(L\) of locations, an initial location \(\ell _0\in L\), and a set of control-flow edges \(G\subseteq L\times Ops\times L\). The set \(Ops\) describes all possible operations. In our presentation, we only consider operations on integer variables that are either boolean expressions (so-called assume operations) or assignments. However, our implementation supports C programs. In the following, we use \(\mathcal {L}\) for the superset of all location sets and \(\mathcal {C}\) for the set of all CFAs. A CFA \(C=(L,\ell _0,G)\) is deterministic (i.e., representable as a C program) if for all control-flow edges \((\ell , op_1, \ell _1), (\ell , op_2, \ell _2)\in G\) either \(op_1=op_2\) and \(\ell _1=\ell _2\), or \(op_1\) and \(op_2\) are assume operations with \(op_1\equiv \lnot op_2\).
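The CFA model and the determinism requirement can be illustrated with a minimal Python sketch. The encoding is an assumption made only for this illustration: locations and operations are strings, an assume operation is written `[cond]`, and \(op_1\equiv \lnot op_2\) is approximated by a purely syntactic check against `[!(cond)]`. The names `CFA`, `is_assume`, `negates`, and `is_deterministic` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CFA:
    locations: frozenset  # set L of locations
    initial: str          # initial location l_0
    edges: frozenset      # set G of (source, op, target) triples


def is_assume(op):
    # Assumed convention: assume operations are written "[cond]".
    return op.startswith("[") and op.endswith("]")


def negates(op1, op2):
    # Syntactic stand-in for op1 == not op2; a real check would be semantic.
    c1, c2 = op1[1:-1], op2[1:-1]
    return c1 == f"!({c2})" or c2 == f"!({c1})"


def is_deterministic(cfa):
    for (src1, op1, dst1) in cfa.edges:
        for (src2, op2, dst2) in cfa.edges:
            if src1 != src2 or (op1, dst1) == (op2, dst2):
                continue  # different source, or the very same edge
            # Two distinct edges leaving one location must be complementary assumes.
            if not (is_assume(op1) and is_assume(op2) and negates(op1, op2)):
                return False
    return True
```

With this encoding, a branch `l0 -[x>0]-> l1`, `l0 -[!(x>0)]-> l2` is accepted, whereas two edges with the same assignment leaving `l0` for different targets are rejected.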
The left of Fig. 2 shows our example program absPow, which computes \(f(N)=2^{\lceil \log _2N\rceil }\) for \(N\ne 0\) and, e.g., ensures the property \(f(N)\ne 0\). Next to program absPow, its deterministic CFA is shown, which contains one edge per assignment and two edges for each condition of an if- or while-statement. The two edges per if- or while-statement are labeled with the condition and its negation and represent the two evaluations of the condition.
Program Semantics. We use an operational semantics and represent a program’s state by a pair of location \(\ell \) (the value of the program counter) and concrete data state c. In our representation, a concrete data state is a mapping from the program variables into the set of integer values. Now, a concrete path \(\pi \) of a CFA \(C = (L, \ell _0,G)\) is a sequence \((\ell _0, c_0){\mathop {\rightarrow }\limits ^{g_1}}\dots {\mathop {\rightarrow }\limits ^{g_n}} (\ell _n, c_n)\) such that for all \(1\le i\le n:\) \(g_i = (\ell _{i-1}, op_i , \ell _i)\in G\) and \(c_{i-1}{\mathop {\rightarrow }\limits ^{op_i}}c_i\), i.e., (a) in case of assume operations, \(c_{i-1}\models op_i\) and \(c_{i-1}=c_i\) or (b) in case of assignments, \(c_i = \mathrm {SP}_{op_i} (c_{i-1})\), where \(\mathrm {SP}\) is the strongest-post operator of the semantics. We let \(paths(C)\) be the set of all concrete paths of a CFA \(C\). Given a concrete path \(\pi =(\ell _0, c_0){\mathop {\rightarrow }\limits ^{g_1}}\dots {\mathop {\rightarrow }\limits ^{g_n}} (\ell _n, c_n)\), we derive its execution \(ex(\pi )=c_0c_1\dots c_n\). Finally, we define \(ex(C):=\{ex(\pi )\mid \pi \in paths(C)\}\) to be the executions of a CFA \(C\).
Condition. After an (incomplete) verification run, a condition sums up which concrete paths of a program have been explored [9]. We model the condition as an automaton describing the syntactical program paths that have been verified and the assumptions that have been made on these paths (i.e., which concrete data states were included). Thus, the condition’s edges are labeled by pairs of program edges and assumptions. We model assumptions as state conditions, letting \(\varPhi \) denote the set of all state conditions. Accepting states subsume explored paths, i.e., if a path’s prefix is accepted by the condition, the path has been explored. Non-explored paths either end in a non-accepting state or, more often, have a prefix that ends in a state \(q\) from which no further transition is applicable. Typically, the latter means that the verifier did not explore beyond the prefix.
The automaton on the right of Fig. 2 shows a condition for our example program absPow. For the sake of presentation, we left out the assumptions, which are all true. The condition states that the else-branch of the outermost if-statement was explored and that the verifier performed a BFS-like exploration of the if-branch, which split the exploration of the inner if-branch and which is interrupted after one loop unrolling. Formally, a condition is defined as follows.^{Footnote 1}
Definition 1
A condition \(A=(Q,\varSigma ,\delta ,q_0,F)\) consists of

a finite set \(Q\) of states, an initial state \(q_0\in Q\), and accepting states \(F\subseteq Q\),

an alphabet \(\varSigma \subseteq 2^G\times \varPhi \), and

a transition relation \(\delta \subseteq Q\times \varSigma \times Q\) with \(\lnot \exists (q_f, \cdot ,q)\in \delta :\) \(q_f\in F\wedge q\notin F\).
We let \(\mathcal {A}\) be the set of conditions.
As already said, a condition describes which paths of a program have been looked at. The following definition formalizes this coverage property. Note that we use \(c\models \varphi \) to describe that a concrete data state c satisfies a state condition \(\varphi \).
Definition 2
A condition \(A=(Q,\varSigma ,\delta ,q_0,F)\) covers a concrete path \(\pi =(\ell _0, c_0){\mathop {\rightarrow }\limits ^{g_1}}\dots {\mathop {\rightarrow }\limits ^{g_n}}(\ell _n, c_n)\) if there exists a run \(\rho =q_0\xrightarrow {(G_1,\varphi _1)} \dots \xrightarrow {(G_k,\varphi _k)}q_k\) in \(A\) such that (a) \(0\le k\le n\), (b) \(q_k\in F\), and (c) \(\forall 1\le i\le k: g_i\in G_i \wedge (c_i\models \varphi _i)\).
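The coverage check of Definition 2 can be sketched in Python as follows. The representation is an assumption made for this sketch: a transition is a tuple `(q, edge_set, phi, q_next)` where `phi` is a Python predicate over data states, and a path is given as the list of its steps `(g_i, c_i)` for \(i=1,\dots ,n\). The function name `covers` is hypothetical.

```python
def covers(delta, q0, accepting, steps):
    """Return True if some run over a prefix of the path ends in an accepting state.

    delta:     iterable of (q, edge_set, phi, q_next) transitions
    accepting: set F of accepting states
    steps:     list of (g_i, c_i) pairs for i = 1..n
    """
    current = {q0}
    if current & accepting:  # case k = 0: the empty prefix is already accepted
        return True
    for g, c in steps:
        # All states reachable by one matching transition: g in G_i and c |= phi_i.
        current = {
            q_next
            for (q, edge_set, phi, q_next) in delta
            if q in current and g in edge_set and phi(c)
        }
        if current & accepting:  # prefix of length k accepted
            return True
        if not current:          # no run can be extended any further
            return False
    return False
```

The sketch tracks the set of condition states reachable after each step, so it also handles conditions whose transition relation is not deterministic.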
Reducer. The CMC approach suggests that after an incomplete verification run, a second verifier should use the produced condition to explore only the uncovered paths. However, many verifiers do not understand conditions. To overcome this problem, reducer-based CMC [13] suggests extending verifiers with a preprocessing step that translates the condition into a residual program. A residual program may overapproximate those program paths that are not covered by the condition, but must not introduce additional program behavior. We follow reducer-based CMC [13] and use reducers to compute residual programs.
Definition 3
A reducer is a function \(red: \mathcal {C}\times \mathcal {A}\rightarrow \mathcal {C}\) satisfying the residual property: \(\begin{array}{l} \forall C\!\in \!\mathcal {C}, \forall A\!\in \!\mathcal {A}: \mathrm {ex}(C){\setminus }\{\mathrm {ex}(\pi )\mid \! A~\mathrm {covers}~\pi \}\subseteq \mathrm {ex}(red(C,A)) \subseteq \mathrm {ex}(C).\end{array}\)
First, a reducer for a specific class of conditions was proposed [26]. Then, reducer-based CMC [13] generalized the first approach to use a reducer, named ParComp, which supports all kinds of conditions, and showed that it is indeed a reducer [13]. To compute a residual program, the reducer ParComp performs a parallel composition of the program and the condition. Starting in the initial location and initial condition state, it matches CFA edges with condition transitions that subsume the respective CFA edge. If no matching condition transition exists, ParComp switches to consider CFA edges only. Additionally, it stops exploring states containing a final state \(q\in F\) since the condition covers all longer paths.
However, the reducer ParComp has one drawback. Verifiers often unfold the program, e.g., unroll loops or inspect branches separately. Due to partially explored paths, some of the unfoldings become part of the condition and will be encoded in the residual program generated by ParComp. Thus, the residual program constructed by ParComp may become orders of magnitude larger than the original program, resulting in increased parsing costs for the second verifier. Additionally, a verifier \(v_2\) analyzing the residual program generated by ParComp is forced to apply the same unfoldings on the non-covered paths as the condition-generating verifier \(v_1\). However, it might be more effective or efficient if verifier \(v_2\) unfolded certain program structures of the original program less often (or never). To tackle this problem, we present the framework FRed that extends reducers like ParComp to let them compute smaller residual programs with fewer unfoldings at the cost of adding more explored paths to the residual program, i.e., computing less precise residual programs.
3 FRed: Fold-Reducers from Reducers
To assist a systematic exploration of the reducer design space, we present the framework FRed. With FRed one can methodically derive new reducers from existing ones, thereby controlling the precision and size of the produced residual programs. One only needs to define how to compress the residual programs computed by the original reducer. Currently, FRed is limited to the class of path-preserving reducers. Path-preserving reducers have the advantage that they keep the reference to the original program within the syntactical structure of the residual program, i.e., except for location renaming they encode a subset of the syntactical paths of the original program. This makes it easier to derive new reducers from them. Next, we formally define a path-preserving reducer, where \(\mathcal {U}\) is the universe of location markers (e.g., condition states).
Definition 4
A reducer \(ppr\) is path-preserving if for any generated residual program \(ppr((L,\ell _0,G),A)=(L_r,\ell _{0,r}, G_r)\) it is valid that (a) \(L_r\subseteq L\times \mathcal {U}\) for some \(\mathcal {U}\), (b) \(\ell _{0,r}=(\ell _0,\cdot )\), and (c) \(\forall ((\ell ,u),op,(\ell ',u'))\in G_r: \exists (\ell ,op,\ell ')\in G\).
Given a path-preserving reducer like ParComp, the goal of FRed is to derive new reducers that produce smaller, less precise residual programs. Our idea is that the new reducers aggregate certain similar behavior of the residual program \(C_r\) produced by the given path-preserving reducer. So far, the framework FRed supports syntactical aggregations that unite location states of the program \(C_r\). These aggregations can be used to revert loop-unfoldings or separation of branches, the main cause for large residual programs. Additionally, these aggregations are simple to compute. One needs to define only a partitioning of \(C_r\)’s location states into equivalence classes. However, to get proper reducers, the derived reducers must not introduce new program behavior. Transferred to our aggregations, this means that we must not combine location states of \(C_r\) that refer to different locations of the original program. We introduce the concept of a location-consistent partitioner that computes partitions respecting this requirement.
Definition 5
A location-consistent partitioner is a function p that maps a set \(L_r\subseteq \mathcal {L}\times \mathcal {U}\) to a partition \(\{L_1,\dots ,L_n\}\) of \(L_r\) s.t. \(\forall 1\le i\le n\!: |\{\ell \mid (\ell ,\cdot )\in L_i\}| =1\). We use \(\mathcal {P}\) for the set of all location-consistent partitioners.
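As examples, we consider the two extreme location-consistent partitioners \(\mathrm {cfa}\) and \(\mathrm {sep}\). Partitioner \(\mathrm {cfa}\) groups all elements with the same location, i.e., \(\mathrm {cfa}(L_r)=\big \{\{(\ell ',u)\in L_r\mid \ell '=\ell \}\,\big |\,(\ell ,\cdot )\in L_r\big \}\), and \(\mathrm {sep}\) never groups elements, i.e., \(\mathrm {sep}(L_r)=\big \{\{\ell _r\}\mid \ell _r\in L_r\big \}\).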
All remaining location-consistent partitioners group subsets of elements with the same location. Often, they are context dependent, i.e., they take into account the structure of the original program or the program \(C_r\) generated by the path-preserving reducer. For instance, we use a partitioner \(\mathrm {lh}_{L'}\) that combines locations referring to the same loop head in the original program and keeps all other elements separate. The partitioner is parameterized by the loop heads \(L'\) of the original program.
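The three partitioners discussed above can be sketched in Python. This is an illustration under stated assumptions rather than the FRed implementation: a location state is a pair `(location, marker)`, a partition is a set of frozensets, and the function names are hypothetical. In particular, `lh_partitioner` follows the prose description (group elements that share a loop-head location from \(L'\), keep everything else separate).

```python
from collections import defaultdict


def cfa_partitioner(loc_states):
    # Group all location states that share the same original location.
    groups = defaultdict(set)
    for (loc, marker) in loc_states:
        groups[loc].add((loc, marker))
    return {frozenset(g) for g in groups.values()}


def sep_partitioner(loc_states):
    # Never group: every location state forms its own partition element.
    return {frozenset({s}) for s in loc_states}


def lh_partitioner(loop_heads):
    # Parameterized by the loop heads L' of the original program.
    def partition(loc_states):
        heads = {s for s in loc_states if s[0] in loop_heads}
        return cfa_partitioner(heads) | sep_partitioner(set(loc_states) - heads)
    return partition


def is_location_consistent(partition):
    # Every partition element must refer to exactly one original location.
    return all(len({loc for (loc, _) in block}) == 1 for block in partition)
```

All three functions only ever merge elements with equal first component, so their results pass `is_location_consistent` by construction.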
A partitioning of the nodes of a graph, e.g., a CFA, induces a coarser graph. Each set of nodes becomes a node of the new graph and there exists an edge between two sets of nodes if there exists an edge between two nodes in the original graph, one in each set. A folder applies this principle to compress a residual program computed by a path-preserving reducer. A location-consistent partitioner defines the partitioning of location states. Furthermore, the new initial program location is the set of location states that contains the original initial location. Due to the partitioner’s properties, exactly one such set exists.
Definition 6
A folder \(\mathrm {fold}: \mathcal {C}\times \mathcal {P}\rightarrow \mathcal {C}\) compresses a CFA \(C_r=(L_r,\ell _{0,r},G_r)\) with a location-consistent partitioner \(p\) such that \(\mathrm {fold}(C_r,p)=(p(L_r),\ell _{0,p},G_p)\) with
\(\ell _{0,r}\!\in \!\ell _{0,p}\) and \(G_p=\!\big \{(\ell _p,op,\ell '_p) \,\big |\, \ell _p,\ell '_p\in p(L_r) \wedge \exists (\ell ,op,\ell ')\!\in \!G_r: \ell \!\in \!\ell _p \wedge \ell '\!\in \!\ell '_p\big \}\).
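A folder along the lines of Definition 6 can be sketched in a few lines of Python. The data representation (location states as `(location, marker)` pairs, edges as triples, partition elements as frozensets) and the helper `group_by_location`, which plays the role of a partitioner that merges equal locations, are assumptions of this sketch.

```python
from collections import defaultdict


def group_by_location(loc_states):
    # Example partitioner: merge all location states with the same location.
    groups = defaultdict(set)
    for (loc, marker) in loc_states:
        groups[loc].add((loc, marker))
    return {frozenset(g) for g in groups.values()}


def fold(locations, initial, edges, partitioner):
    """Compress the CFA (locations, initial, edges) along the given partition."""
    partition = partitioner(locations)
    # Map every location state to the unique partition element containing it.
    block_of = {state: block for block in partition for state in block}
    # An edge between two blocks exists iff some original edge connects members.
    folded_edges = {(block_of[src], op, block_of[dst]) for (src, op, dst) in edges}
    return partition, block_of[initial], folded_edges
```

Composing a path-preserving reducer with `fold` and a fixed partitioner then gives the kind of fold-reducer the section goes on to define.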
We use folders to construct so-called fold-reducers from an existing path-preserving reducer. To this end, we concatenate the path-preserving reducer with a folder.
Definition 7
Let p be a location-consistent partitioner and ppr a path-preserving reducer. The fold-reducer for p and ppr is \(\mathrm {FoldRed}_{p}^{ppr}(C,A):=\mathrm {fold}(ppr(C,A),p)\).
Figure 3 shows five residual programs constructed from program absPow (Fig. 2, left) and the condition for it (Fig. 2, right). The residual programs differ in their program size and structure. They were constructed by the seven different fold-reducers used in the evaluation, all of them using the reducer ParComp [13], but we converted them into a more readable form using proper if- and while-statements instead of gotos. Note that for this example, some fold-reducers constructed the same residual program. To construct the residual programs in Figs. 3a and 3e, the partitioners \(\mathrm {cfa}\) and \(\mathrm {sep}\) could be used, respectively. For the residual program in Fig. 3b, we used partitioner \(\mathrm {lh}_{L'}\) with \(L'=\{\ell _4\}\). The partitioner used to construct the program in Fig. 3c undoes unfoldings of if-statements but keeps loop-unfoldings. Finally, the program in Fig. 3d is generated with a partitioner that allows loop-unfoldings up to a given bound of ten and then folds them. However, loop heads of the same iteration are always combined.
Above, we used fold-reducers to compute residual programs. In general, we plan to use fold-reducers in the construction of conditional verifiers. Thus, we must show that fold-reducers are reducers. Syntactically, fold-reducers look like reducers. It remains to be shown that fold-reducers fulfill the residual property.
Theorem 1
Every fold-reducer \(\mathrm {FoldRed}_{p}^{ppr}\) is a reducer.
Proof
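We need to show that \(\text {ex}(C)\setminus \{\text {ex}(\pi )\mid A~\text {covers}~\pi \}\subseteq \text {ex}(\mathrm {FoldRed}_{p}^{ppr}(C,A))\subseteq \text {ex}(C)\). Since ppr is a reducer, \(\text {ex}(C)\setminus \{\text {ex}(\pi )\mid A~\text {covers}~\pi \}\subseteq \text {ex}(ppr(C,A))\). Thus, it suffices to show that \(\text {ex}(ppr(C,A))\subseteq \text {ex}(\mathrm {FoldRed}_{p}^{ppr}(C,A))\subseteq \text {ex}(C)\).
In the following, let \(C=(L_o,\ell _{0,o}, G_o)\), \(ppr(C,A)=(L_r, \ell _{0,r}, G_r)\), and \(\mathrm {FoldRed}_{p}^{ppr}(C,A)=(L_f,\ell _{0,f},G_f)\). Due to the requirements on \(p\) and the definition of the fold-reducer, there exists a unique function \(h: L_r\rightarrow L_f\) with \(\forall \ell _r\in L_r: \ell _r\in h(\ell _r)\) and \(h(\ell _{0,r})=\ell _{0,f}\).
Part I) \(\text {ex}(ppr(C,A))\subseteq \text {ex}(\mathrm {FoldRed}_{p}^{ppr}(C,A))\):
Part II) \(\text {ex}(\mathrm {FoldRed}_{p}^{ppr}(C,A))\subseteq \text {ex}(C)\):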
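The two inclusions \(\text {ex}(ppr(C,A))\subseteq \text {ex}(\mathrm {FoldRed}_{p}^{ppr}(C,A))\) and \(\text {ex}(\mathrm {FoldRed}_{p}^{ppr}(C,A))\subseteq \text {ex}(C)\) can be argued along the following lines. This is a sketch derived from Definitions 4 to 7, not the original proof text; the auxiliary function \(\mathrm {loc}\) is introduced here only for the argument.

```latex
\textbf{First inclusion.} Let
$(\ell_{r,0},c_0)\xrightarrow{g_1}\dots\xrightarrow{g_n}(\ell_{r,n},c_n)\in paths(ppr(C,A))$
with $g_i=(\ell_{r,i-1},op_i,\ell_{r,i})\in G_r$. By the definition of the folder,
\[
  (h(\ell_{r,i-1}),\,op_i,\,h(\ell_{r,i}))\in G_f \quad\text{for all } 1\le i\le n,
  \qquad h(\ell_{r,0})=\ell_{0,f}.
\]
The operations are unchanged, so the data states $c_0\dots c_n$ are a valid
execution of the folded program.

\textbf{Second inclusion.} Since $p$ is location-consistent, every
$\ell_f\in L_f$ refers to exactly one original location; call it
$\mathrm{loc}(\ell_f)$. Every edge $(\ell_f,op,\ell_f')\in G_f$ has a witness
$((\ell,u),op,(\ell',u'))\in G_r$ with $(\ell,u)\in\ell_f$ and
$(\ell',u')\in\ell_f'$, and path preservation gives
\[
  (\mathrm{loc}(\ell_f),\,op,\,\mathrm{loc}(\ell_f')) = (\ell,op,\ell')\in G_o,
  \qquad \mathrm{loc}(\ell_{0,f})=\ell_{0,o}.
\]
Hence every path of the folded program maps, location by location, to a path
of $C$ with the same operations and therefore the same execution.
```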
In practice, arbitrary fold-reducers are unsatisfactory since they may produce nondeterministic CFAs, which cannot be translated to C programs. Figure 4 shows an example of a nondeterministic CFA generated by a fold-reducer. In the example, the nondeterminism is caused by the partitioner \(\mathrm {lh}_{\{\ell _4\}}\), which only combines loop heads. Generally, also the condition may cause nondeterminism.^{Footnote 2} To solve the nondeterminism problem, we transform a fold-reducer into a deterministic fold-reducer that generates deterministic residual programs from deterministic input programs. The basic idea is to adapt the partitioner to compute a coarser partitioning. The coarser partitioning combines all partition elements of the original partition that would cause the residual program to be nondeterministic.
Algorithm 1 shows how to compute such a coarser partitioning from the original partitioning. Starting with the original partitioning, it combines partitions of its current partitioning as long as there exist two CFA edges causing nondeterminism, i.e., they consider the same operation and start in the same partition element, but end in different partition elements.
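The coarsening loop described for Alg. 1 can be sketched in Python as follows. This is a reconstruction from the prose description only, not the algorithm as printed in the paper; the function name `make_deterministic` and the data representation (partition as a set of frozensets of location states, edges as `(source, op, target)` triples) are assumptions.

```python
def make_deterministic(partition, edges):
    """Merge partition elements until no two edges with the same operation
    start in the same element but end in different elements."""
    blocks = {frozenset(b) for b in partition}
    changed = True
    while changed:
        changed = False
        block_of = {state: block for block in blocks for state in block}
        seen_target = {}  # (source block, op) -> target block seen so far
        for (src, op, dst) in edges:
            key = (block_of[src], op)
            target = block_of[dst]
            other = seen_target.setdefault(key, target)
            if other != target:
                # Conflict: same source block and operation, different targets.
                blocks -= {other, target}
                blocks.add(other | target)
                changed = True
                break  # block_of is stale now, so restart the scan
    return blocks
```

Each merge reduces the number of partition elements by one, so the loop terminates. Whether the result stays location-consistent depends on the input program being deterministic, as discussed in the text.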
Attentive readers will have noticed that Alg. 1 uses the program \(C_r\) generated by the path-preserving reducer to adapt the partitioning. Since multiple programs may consider the same set of location states but different control-flow edges, it is impossible to adapt the partitioner without knowledge of \(C_r\). Thus, a deterministic fold-reducer must use different adaptations of the partitioner p. The correct adaptation depends on the input program and the path-preserving reducer. We use an adaptation \(\mathrm {det}\) that depends on the original program and the path-preserving reducer ppr used by the fold-reducer.
The adapted partitioner returns the partitioning computed by the original partitioner except for one case. When the original program \(C\) is deterministic and the adapted partitioner is given the location states of the program computed by the path-preserving reducer, the partition is adapted with Alg. 1. Note that we do not apply Alg. 1 for nondeterministic original programs, because it then may combine partitions considering different location states of the original program, thus resulting in a location-inconsistent partitioner. However, to use the adapted partitioner in a fold-reducer, it must remain location-consistent.
Lemma 1
For a given CFA \(C\), condition \(A\), path-preserving reducer \(ppr\), and location-consistent partitioner p, the function \(\mathrm {det}_{ppr(C,A), p}\) is a location-consistent partitioner.
Knowing that the adapted partitioner remains location-consistent, we explain how to derive a deterministic fold-reducer from a fold-reducer. The idea is simple. The deterministic fold-reducer uses for each input program a dedicated variant of the original fold-reducer. This dedicated variant uses the prescribed adaptation \(\mathrm {det}(ppr(C,A),p)\) of the original partitioner to the original program.
Definition 8
Let \(\mathrm {FoldRed}_{p}^{ppr}\) be a fold-reducer. We define the deterministic fold-reducer to be \(\mathrm {FoldRed}_{p,ppr}^{\mathrm {det}}(C,A):=\mathrm {FoldRed}_{\mathrm {det}_{ppr(C,A),p}}^{ppr}(C,A)\).
We already showed that the proposed adaptation of the location-consistent partitioner results in a location-consistent partitioner. Now, we can easily conclude that deterministic fold-reducers guarantee the residual property and, thus, can be used to construct conditional verifiers.
Corollary 1
Every deterministic fold-reducer \(\mathrm {FoldRed}_{p,ppr}^{\mathrm {det}}\) is a reducer.
While the previous property is mandatory, we build deterministic fold-reducers to produce deterministic programs when given deterministic programs. The subsequent proposition certifies this property of deterministic fold-reducers.
Proposition 1
Given a deterministic fold-reducer \(\mathrm {FoldRed}_{p,ppr}^{\mathrm {det}}\), a deterministic control-flow automaton \(C\), and a condition \(A\), the residual program \(\mathrm {FoldRed}_{p,ppr}^{\mathrm {det}}(C,A)\) is deterministic.
4 Evaluation
The main goals of our experiments are to systematically investigate different (fold-)reducers and to find out whether fold-reducers can overcome the problem that reducer ParComp sometimes generates too large and too precise residual programs. Since ParComp was the only available reducer, our goal was to counteract its weaknesses (i.e., the sometimes large residual programs); investigating whether one needs to settle for ParComp’s weakness is beyond the scope of this evaluation. Another goal of our evaluation is to compare CMC with fold-reducers against non-cooperative combinations, especially sequential combinations. This leads us to three research questions:

RQ 1. Do distinct fold-reducers generate different residual programs?

RQ 2. Can fold-reducers be better than reducer ParComp and is there a reducer that dominates the others?

RQ 3. Can reducer-based CMC replace non-cooperative verifier combinations?
4.1 Experimental Setup
CMC Configurations. A reducer-based CMC configuration consists of (1) a condition-generating verifier \(v_1\), (2) a reducer r, and (3) a second verifier \(v_2\) (cf. Fig. 1). For components \(v_1\) and r, we use CPAchecker [14] in revision r32965 since it already provides condition-generating verifiers and reducer ParComp [13].
As in other works [9, 13], we use a predicate analysis [15] and a value analysis [16], both using a time limit of 100 s^{Footnote 3}, as condition-generating verifiers. If they do not succeed within 100 s, they give up and output a condition. For verifier \(v_2\), we use the three tools [29], ESBMC [34], and VeriAbs [30] that performed best on the reachability categories of SV-COMP 2020^{Footnote 4} as well as , which performed best in the SoftwareSystems category of SV-COMP 2020. For all four tools, we use their version submitted to SV-COMP 2020. Additionally, we used three well-maintained analyses, k-Induction [7], predicate analysis [15], and value analysis [16], which are part of the award-winning sequential composition of CPAchecker [29]. For them, we also use CPAchecker revision r32965.
We investigated seven fold-reducers r, which we implemented in the FRed plugin for CPAchecker. All fold-reducers inline functions and typically use the deterministic fold-reducer variant of the reducers described in Sect. 3. Only the CFA and the SEP reducers already generate deterministic residual programs and do not need to use the deterministic variant. The seven fold-reducers are:

CFA: Fold-reducer that uses partitioner \(\mathrm {cfa}\), i.e., it combines elements with the same location state and, thus, reconstructs those parts of the original CFA that have not been fully explored.

LH: Fold-reducer that is based on partitioner \(\mathrm {lh}_{L'}\) and undoes loop-unfoldings. It combines all elements with the same loop-head location state from \(L'\).

LHC: Fold-reducer that also aims at reverting loop-unfoldings, but avoids combining loop executions started in different contexts, i.e., reached on different syntactical paths ignoring finished loops.

LHB: Fold-reducer that limits loop-unfoldings, i.e., keeps loop-unfoldings up to a given bound (we use 10) and afterwards collapses the unfoldings.

LHBC: Fold-reducer that, like LHB, limits loop-unfoldings up to a bound of 10, but additionally separates loop executions with different contexts like LHC.

NLH: Fold-reducer that undoes branch- but not loop-unfoldings (keeps different loop iterations separated).

SEP: Fold-reducer that never combines elements, uses partitioner \(\mathrm {sep}\) (same as ParComp [13]).
Combining each fold-reducer r with all second verifiers \(v_2\), we obtain 49 conditional verifiers \(v_2 \circ r\). Combining the conditional verifiers with the condition-generating verifier gives us 84 reducer-based CMC configurations.^{Footnote 5}
Tasks. For our evaluation, we considered the well-established benchmark set^{Footnote 6} from the competition on software verification [4]. We focused on the 6 907 tasks of the ReachSafety categories, because all considered analyses can verify the property “no call to function __VERIFIER_error() is reachable”. For each condition-generating verifier \(v_1\), we created a task set that excludes all tasks for which all reducers reported an error (\(\approx \)11%) as well as all easy tasks (\(\approx \)45%). A task is considered easy if it does not require CMC because it can be solved in 100 s by \(v_1\) or in 1 000 s^{Footnote 7} by all verifiers \(v_2\). Thus, we only look at tasks for which CMC can contribute additional value (2 949 tasks for CMC with \(v_1\) = predicate analysis and 3 046 tasks for CMC with \(v_1\) = value analysis).
Execution Framework. We performed our experiments on machines with 33 GB of memory and an Intel Xeon E3-1230 v5 CPU (8 processing units and a frequency of 3.4 GHz). The machines run an Ubuntu 18.04 operating system (Linux kernel 4.15). We use BenchExec [17] to run our experiments. To ensure that all CMC configurations with the same verifier \(v_1\) use the same conditions, we run the condition-generating verifiers \(v_1\) once with a runtime limit of 100 s^{Footnote 8} and a memory limit of 15 GB. The generated conditions are then used when running the conditional verifiers with a runtime limit of 900 s and a memory limit of 15 GB.
Replication Support. Our experimental data are available online (see Sect. 6).
4.2 Experimental Results
RQ 1 (Different residual programs?) Already our example (Fig. 3) shows that residual programs generated by different reducers can significantly differ in the program size and the branching structure. To further investigate the difference of residual programs, we searched our tasks for programs for which all seven reducers generated residual programs with different numbers of program locations, and selected the program sqrt_Householder_interval.c. Figure 5 shows graph shapes of the CFAs of the residual programs generated by the seven fold-reducers. In a graph shape, the width of line i is proportional to the number of CFA nodes with a shortest path of length i from the initial location. We observe that the graph shapes differ in their height and width. Thus, residual programs differ in their branching structure. Finally, we looked at the size increase of the residual programs, i.e., number of locations of residual program (\(L_\mathrm {residual}\)) divided by number of locations of original program (\(L_\mathrm {original}\)). Figure 6 shows boxplots depicting for each reducer the distribution of the size increases of its residual programs. We observe that the boxes differ in size, the median (middle line), and the whiskers, which supports that residual programs from distinct reducers differ.
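The two measures used for this research question can be computed as sketched below. This is a minimal Python illustration under assumed data structures (edges as `(source, op, target)` triples), not the tooling used to produce Figs. 5 and 6; the function names are hypothetical. The graph shape is obtained by a breadth-first search that counts, for each distance i, the nodes whose shortest path from the initial location has length i.

```python
from collections import Counter, deque


def graph_shape(initial, edges):
    """Return a list whose i-th entry is the number of nodes at BFS distance i."""
    successors = {}
    for (src, _op, dst) in edges:
        successors.setdefault(src, []).append(dst)
    distance = {initial: 0}
    queue = deque([initial])
    while queue:
        node = queue.popleft()
        for nxt in successors.get(node, []):
            if nxt not in distance:  # first visit yields the shortest distance
                distance[nxt] = distance[node] + 1
                queue.append(nxt)
    widths = Counter(distance.values())
    return [widths[i] for i in range(max(widths) + 1)]


def size_increase(num_residual_locations, num_original_locations):
    # Ratio L_residual / L_original used for the boxplots.
    return num_residual_locations / num_original_locations
```

The length of the returned list corresponds to the height of a graph shape and its largest entry to the maximal width.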
RQ 2 (Better than ParComp and existence of dominating reducer?) To answer this research question, we study the number of tasks solved correctly by the CMC configurations. We focus on correctly solved tasks and exclude incorrectly solved tasks, which are an unreliable source of information caused by an unsound CMC configuration, e.g., due to an unsound verifier or a bug in one of the CMC configurations. For each CMC configuration, we report the numbers for the full task set^{Footnote 9} and for a restricted task set that only considers those tasks that cannot be solved by the two verifiers in the CMC configuration and, thus, requires cooperation, e.g., via CMC. Table 1 shows the numbers for the CMC configurations using the predicate analysis (upper part) and the value analysis (lower part) for the conditiongenerating verifier \(v_1\). The total number of tasks considered in each column are reported at the top. The CMC configurations are fixed by the reducer (row) and the verifier \(v_2\) (columns). Column ‘+All’ displays the numbers of correctly solved tasks by CMC configurations with any verifier \(v_2\), but excluding tasks that one of the CMC configurations solved incorrectly.^{Footnote 10} Similarly, row ‘All’ uses any reducer. The last row is discussed later.
Looking at Table 1, we first observe that there exist verifier combinations for which the CMC configurations using the SEP reducer, which is identical to reducer ParComp, do not solve the most tasks (bold numbers). We also observe that for some CMC configurations the best reducer differs between the full and the restricted task set. Also, the best reducers differ when changing the condition-generating verifier. Hence, the best reducer depends on (1) the task set, and (2) the verifier combination. Additionally, we observe that the numbers in row ‘All’ are often larger than in the previous rows. Thus, we are more effective when using different reducers. Moreover, our raw data revealed that for each of the seven reducers there exist tasks that can only be solved by a verifier combination when using this particular reducer. Therefore, we need all seven reducers.
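The bookkeeping behind these observations can be sketched as plain set operations. The sketch assumes that, for one fixed verifier combination, we know the set of correctly solved tasks per reducer; the reducer names `A`, `B`, `C`, the task identifiers, and all function names are hypothetical and do not reproduce the data in Table 1.

```python
def restricted_task_set(tasks, solved_by_v1, solved_by_v2):
    """Tasks that neither verifier solves alone and that thus require cooperation."""
    return tasks - solved_by_v1 - solved_by_v2


def solved_with_any_reducer(solved_by_reducer):
    """Tasks solved with at least one reducer (the idea behind row 'All')."""
    return set().union(*solved_by_reducer.values())


def solved_only_with(solved_by_reducer):
    """For each reducer, the tasks that no other reducer solves."""
    unique = {}
    for reducer, tasks in solved_by_reducer.items():
        others = set().union(
            *(t for r, t in solved_by_reducer.items() if r != reducer)
        )
        unique[reducer] = tasks - others
    return unique


# Hypothetical example data for a single verifier combination.
solved = {"A": {"t1", "t2"}, "B": {"t2", "t3"}, "C": {"t2"}}
all_row = solved_with_any_reducer(solved)
unique = solved_only_with(solved)
hard = restricted_task_set({"t1", "t2", "t3", "t4"}, {"t2"}, {"t4"})
```

In this toy data, the union over all reducers is larger than the best single reducer, and reducers `A` and `B` each contribute a task that no other reducer solves, which mirrors the argument for keeping several reducers.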
RQ 3 (Replacement for noncooperative verifier combinations?) To answer this question, we compare CMC with fold-reducers against a combination that executes verifiers \(v_1\) and \(v_2\) in sequence, using the same program for both verifiers and without exchanging any information. This combination is identical to CMC with the identity reducer ID, which returns the input program. Row ID in Table 1 shows the number of tasks solved correctly by the sequential composition. Obviously, the sequential composition does not solve any task in the restricted task set, which only contains tasks that cannot be solved by \(v_1\) or \(v_2\). To solve these tasks, one needs cooperation approaches like reducer-based CMC. For the full task set, we observe that, except for one case, row ID solves more tasks than the other rows. Hence, reducer-based CMC should only be used for hard verification tasks that cannot be solved by single verifiers and, thus, need cooperation.
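Why the identity reducer yields the sequential composition can be seen in a small sketch. It models verifiers and reducers as plain Python functions, which is a simplifying assumption: the real tools exchange program files and condition automata, and all names below are hypothetical.

```python
def identity_reducer(program, condition):
    """Reducer ID: ignores the condition and returns the input program."""
    return program


def make_conditional_verifier(reducer, verifier):
    """Build a conditional verifier from a reducer and an off-the-shelf verifier."""
    def conditional_verifier(program, condition):
        residual_program = reducer(program, condition)
        return verifier(residual_program)
    return conditional_verifier


def cmc(v1, conditional_v2, program):
    """Run v1 first; if it gives up (verdict None), pass its condition on."""
    verdict, condition = v1(program)
    if verdict is not None:
        return verdict
    return conditional_v2(program, condition)


# Hypothetical toy verifiers: v1 always gives up, v2 records what it is given.
seen_by_v2 = []


def toy_v1(program):
    return None, "condition-from-v1"


def toy_v2(program):
    seen_by_v2.append(program)
    return "true"


result = cmc(toy_v1, make_conditional_verifier(identity_reducer, toy_v2), "original program")
```

With `identity_reducer`, the second verifier receives exactly the original program, so no information from the first verifier reaches it. Swapping in a reducer that actually uses the condition is the only change needed to obtain a cooperative configuration.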
4.3 Threats to Validity
In theory, our reducers fulfill the residual condition. However, in practice our reducer implementation might contain bugs that lead to residual programs that add or miss program behavior, i.e., violate the residual condition. In principle, such bugs can lead to residual programs that fulfill the same property as the original program but are easier to verify. Hence, some of the correctly solved tasks might come from such bugs. Furthermore, our results concerning the reducers may not generalize. First, we considered a subset of the SV-COMP tasks and analyses that are run in SV-COMP. The analyses are likely trained on the tasks. However, CMC configurations that unfold the original program a lot, and thus generate residual programs that look different from the original program, also solved many tasks. We are therefore confident that our results apply to other programs. Second, we used specific time limits for the condition-generating verifier \(v_1\) and the conditional verifier (reducer plus verifier \(v_2\)). While we chose common time limits, our results may look different when using different limits.
5 Related Work
Our work is based on the idea of conditional model checking (CMC) [9], which combines analyses via condition passing. The early conditional model checkers [9] used the condition to directly steer the exploration of the second analysis. Translating the condition into a residual program was first proposed in 2015 [26]. Besides slicing, that approach constructs the residual program from a parallel combination of condition and program. Recently, reducer-based CMC [13] generalized the idea of residual programs and introduced the concept of a reducer. The proposed reducer was similar to the earlier parallel combination [26]. In this paper, we construct multiple new reducers from the original reducer [13].
Combination of Analyses. One type of combination testifies verification results. These combinations try to confirm alarms [18, 25, 28, 35, 44, 47] or proofs [1, 39, 41, 45], possibly excluding unconfirmed results. Violation and correctness witnesses [5, 6] provide a tool-independent exchange format for alarms and proofs, enabling other tools to check a verifier’s result. Further combinations join forces of different analyses. On the one hand, analysis domains are integrated [8, 10, 23, 24, 33] to get more precise domains than the pure product. On the other hand, interleavings of analysis algorithms are proposed [3, 27, 36, 37] to benefit from (intermediate) results of other algorithms. A third class of combinations distributes the verification effort among different tools. CMC [9] and reducer-based CMC [13], which we apply, belong to this class. Often, the program parts that could not be verified by the first analyzer are encoded with programs. Sometimes annotations (assertions) are added [19, 20, 21, 46], while program trimming [32] adds assume statements to the original program. Reducer-based CMC [13] and program partitioning [43] output a new program describing a subset of the original program paths. Abstraction-driven concolic testing [27] interleaves concolic testing and predicate abstraction to construct test cases for test goals. CoVeriTest [11] recently generalized this approach. Conditional static analysis [49] splits the program paths into subsets, runs one dataflow analysis on each subset and finally combines the results of these restricted analyses.
Program Transformation for Verification. Our work uses fold-reducers to transform the original program to remove already-verified paths. Like any reducer, fold-reducers may unfold the structure (execution paths) of the original program. Moreover, fold-reducers use a folder that aims at reverting some of the unfoldings introduced by the existing reducer used in the fold-reducer. Likewise, verification refactoring [53] heuristically undoes compiler optimizations to ease verification. Programs-from-proofs [42] pursues the same goal, but it unfolds the program structure to ease verification. Program partitioning [43] and abstraction-driven concolic testing [27] transform the original program to remove tested or infeasible program paths. Unfolding the program structure is a common approach to remove infeasible paths [2, 38, 48] or improve the analysis result [40, 50, 51]. In contrast, folding is used less often. Examples are compiler optimizations like constant propagation [52] and common-subexpression elimination [22].
6 Conclusion
One solution to the problem of verifying complex software systems is to improve verification algorithms and theories. An orthogonal solution is to combine existing techniques. Conditional model checking (CMC) is a promising approach to combine the strengths of different verifiers. To construct new conditional model checkers from existing model checkers in an implementation-less and configurable manner (off-the-shelf, plug-and-play), the concept of reducer-based CMC was recently proposed [13]. Instead of spending developer resources on adapting existing verifiers to make them understand conditions (the information exchange format in CMC), reducer-based CMC suggests putting reducers in front of existing, off-the-shelf verifiers. The task of a reducer is to convert the condition into a format that the verifier already understands, namely program code. Until now, only one reducer existed. Our experiments revealed that there is a lot of potential for improving the effectiveness by using different kinds of reducers.
Developing new reducers can be a laborious task. One must define how to compute the residual program from the input condition and program. Moreover, one must prove that the reducer fulfills the residual property, a correctness property for the reducer. To systematically study reducers, we developed the framework FRed, which simplifies the development of new reducers. FRed allows us to derive a new reducer from an existing one and a heuristic that describes how to coarsen the residual program generated by the existing reducer. To prove that the derived reducer is indeed a reducer, one only needs to show that the specified heuristic is a location-consistent partitioner, a property much simpler than the residual property. Our experience with FRed is that developing and implementing a new heuristic takes at most a few hours. In the future, algorithm selection could be applied to choose the most suitable reducer for a task.
Data Availability Statement The reducers and all experimental data are publicly available for replication on a web page^{Footnote 11} and as a replication package [12].
Notes
 1.
This paper considers only conditions that are represented as automata, while CMC in general [9] is not restricted to a particular representation.
 2.
Theoretically, the nondeterminism may also be caused by a nondeterministic, original program. However, we assume that the original program is deterministic.
 3.
We chose a time limit of 100 s because a large proportion of the solvable tasks (\({>}\)86%) were solved in less than 100 s.
 4.
 5.
We excluded the 14 combinations in which verifiers \(v_1\) and \(v_2\) are identical because they do not describe a cooperation between different verifiers, but are basically identical to a verification with a single verifier with some additional overhead.
 6.
 7.
We grant CMC 1 000 s. We use a standard time limit of 900 s for the conditional verifier and, as already explained, 100 s for the condition-generating verifier \(v_1\).
 8.
To not interrupt condition writing, we applied the limit to the verification algorithm. Imprecise enforcement or condition writing may result in runtimes larger than 100 s.
 9.
Remember that the full task set depends on the condition-generating verifier \(v_1\) because we only look at tasks for which CMC can contribute additional value.
 10.
For +All, the tasks in the restricted set are neither solved by \(v_1\) nor any verifier \(v_2\).
 11.
References
Albert, E., Puebla, G., Hermenegildo, M.V.: Abstraction-carrying code. In: Proc. LPAR, LNCS, vol. 3452, pp. 380–397. Springer (2004). https://doi.org/10.1007/9783540322757_25
Balakrishnan, G., Sankaranarayanan, S., Ivancic, F., Wei, O., Gupta, A.: SLR: Path-sensitive analysis through infeasible-path detection and syntactic language refinement. In: Proc. SAS, LNCS, vol. 5079, pp. 238–254. Springer (2008). https://doi.org/10.1007/9783540691662_16
Beckman, N., Nori, A.V., Rajamani, S.K., Simmons, R.J.: Proofs from tests. In: Proc. ISSTA, pp. 3–14. ACM (2008). https://doi.org/10.1145/1390630.1390634
Beyer, D.: Advances in automatic software verification: SV-COMP 2020. In: Proc. TACAS (2), LNCS, vol. 12079, pp. 347–367. Springer (2020). https://doi.org/10.1007/9783030452377_21
Beyer, D., Dangl, M., Dietsch, D., Heizmann, M.: Correctness witnesses: Exchanging verification results between verifiers. In: Proc. FSE, pp. 326–337. ACM (2016). https://doi.org/10.1145/2950290.2950351
Beyer, D., Dangl, M., Dietsch, D., Heizmann, M., Stahlbauer, A.: Witness validation and stepwise testification across software verifiers. In: Proc. FSE, pp. 721–733. ACM (2015). https://doi.org/10.1145/2786805.2786867
Beyer, D., Dangl, M., Wendler, P.: Boosting k-induction with continuously-refined invariants. In: Proc. CAV, LNCS, vol. 9206, pp. 622–640. Springer (2015). https://doi.org/10.1007/9783319216904_42
Beyer, D., Gulwani, S., Schmidt, D.: Combining model checking and dataflow analysis. In: Handbook of Model Checking, pp. 493–540. Springer (2018). https://doi.org/10.1007/9783319105758_16
Beyer, D., Henzinger, T.A., Keremoglu, M.E., Wendler, P.: Conditional model checking: A technique to pass information between verifiers. In: Proc. FSE. ACM (2012). https://doi.org/10.1145/2393596.2393664
Beyer, D., Henzinger, T.A., Théoduloz, G.: Program analysis with dynamic precision adjustment. In: Proc. ASE, pp. 29–38. IEEE (2008). https://doi.org/10.1109/ASE.2008.13
Beyer, D., Jakobs, M.C.: CoVeriTest: Cooperative verifier-based testing. In: Proc. FASE, LNCS, vol. 11424, pp. 389–408. Springer (2019). https://doi.org/10.1007/9783030167226_23
Beyer, D., Jakobs, M.C.: Replication package for article ‘FRed: Conditional model checking via reducers and folders’ in: Proc. SEFM 2020 (2020). https://doi.org/10.5281/zenodo.3953565
Beyer, D., Jakobs, M.C., Lemberger, T., Wehrheim, H.: Reducer-based construction of conditional verifiers. In: Proc. ICSE, pp. 1182–1193. ACM (2018). https://doi.org/10.1145/3180155.3180259
Beyer, D., Keremoglu, M.E.: CPAchecker: A tool for configurable software verification. In: Proc. CAV, LNCS, vol. 6806, pp. 184–190. Springer (2011). https://doi.org/10.1007/9783642221101_16
Beyer, D., Keremoglu, M.E., Wendler, P.: Predicate abstraction with adjustable-block encoding. In: Proc. FMCAD, pp. 189–197. FMCAD (2010)
Beyer, D., Löwe, S.: Explicit-state software model checking based on CEGAR and interpolation. In: Proc. FASE, LNCS, vol. 7793, pp. 146–162. Springer (2013). https://doi.org/10.1007/9783642370571_11
Beyer, D., Löwe, S., Wendler, P.: Reliable benchmarking: Requirements and solutions. Int. J. Softw. Tools Technol. Transfer 21(1), 1–29 (2017). https://doi.org/10.1007/s100090170469y
Chebaro, O., Kosmatov, N., Giorgetti, A., Julliand, J.: Program slicing enhances a verification technique combining static and dynamic analysis. In: Proc. SAC, pp. 1284–1291. ACM (2012). https://doi.org/10.1145/2245276.2231980
Christakis, M., Müller, P., Wüstholz, V.: Collaborative verification and testing with explicit assumptions. In: Proc. FM, LNCS, vol. 7436, pp. 132–146. Springer (2012). https://doi.org/10.1007/9783642327599_13
Christakis, M., Müller, P., Wüstholz, V.: Guiding dynamic symbolic execution toward unverified program executions. In: Proc. ICSE, pp. 144–155. ACM (2016). https://doi.org/10.1145/2884781.2884843
Christakis, M., Wüstholz, V.: Bounded abstract interpretation. In: Proc. SAS, LNCS, vol. 9837, pp. 105–125. Springer (2016). https://doi.org/10.1007/9783662534137_6
Cocke, J.: Global common subexpression elimination. In: Proc. Symposium on Compiler Optimization, pp. 20–24. ACM (1970). https://doi.org/10.1145/800028.808480
Cousot, P., Cousot, R.: Systematic design of program-analysis frameworks. In: Proc. POPL, pp. 269–282. ACM (1979). https://doi.org/10.1145/567752.567778
Cousot, P., Cousot, R., Feret, J., Mauborgne, L., Miné, A., Monniaux, D., Rival, X.: Combination of abstractions in the ASTRÉE static analyzer. In: Proc. ASIAN’06, LNCS, vol. 4435, pp. 272–300. Springer (2008). https://doi.org/10.1007/9783540775058_23
Csallner, C., Smaragdakis, Y.: Check ‘n’ Crash: Combining static checking and testing. In: Proc. ICSE, pp. 422–431. ACM (2005). https://doi.org/10.1145/1062455.1062533
Czech, M., Jakobs, M., Wehrheim, H.: Just test what you cannot verify! In: Proc. FASE, LNCS, vol. 9033, pp. 100–114. Springer (2015). https://doi.org/10.1007/9783662466759_7
Daca, P., Gupta, A., Henzinger, T.A.: Abstraction-driven concolic testing. In: Proc. VMCAI, LNCS, vol. 9583, pp. 328–347. Springer (2016). https://doi.org/10.1007/9783662491225_16
Dams, D., Namjoshi, K.S.: Orion: High-precision methods for static error analysis of C and C++ programs. In: Proc. FMCO, LNCS, vol. 4111, pp. 138–160. Springer (2005). https://doi.org/10.1007/11804192_7
Dangl, M., Löwe, S., Wendler, P.: CPAchecker with support for recursive programs and floating-point arithmetic (competition contribution). In: Proc. TACAS, LNCS, vol. 9035, pp. 423–425. Springer (2015). https://doi.org/10.1007/9783662466810_34
Darke, P., Prabhu, S., Chimdyalwar, B., Chauhan, A., Kumar, S., Chowdhury, A.B., Venkatesh, R., Datar, A., Medicherla, R.K.: VeriAbs: Verification by abstraction and test generation (competition contribution). In: Proc. TACAS, LNCS, vol. 10806, pp. 457–462. Springer (2018). https://doi.org/10.1007/9783319899633_32
D’Silva, V., Kröning, D., Weissenbacher, G.: A survey of automated techniques for formal software verification. IEEE Trans. CAD Integr. Circ. Syst. 27(7), 1165–1178 (2008). https://doi.org/10.1109/TCAD.2008.923410
Ferles, K., Wüstholz, V., Christakis, M., Dillig, I.: Failure-directed program trimming. In: Proc. ESEC/FSE, pp. 174–185. ACM (2017). https://doi.org/10.1145/3106237.3106249
Fischer, J., Jhala, R., Majumdar, R.: Joining data flow with predicates. In: Proc. FSE, pp. 227–236. ACM (2005). https://doi.org/10.1145/1081706.1081742
Gadelha, M.Y.R., Monteiro, F.R., Cordeiro, L.C., Nicole, D.A.: Esbmc v6.0: Verifying C programs using \(k\)-induction and invariant inference (competition contribution). In: Proc. TACAS (3), LNCS, vol. 11429, pp. 209–213. Springer (2019). https://doi.org/10.1007/9783030175023_15
Ge, X., Taneja, K., Xie, T., Tillmann, N.: DyTa: Dynamic symbolic execution guided with static verification results. In: Proc. ICSE, pp. 992–994. ACM (2011). https://doi.org/10.1145/1985793.1985971
Godefroid, P., Nori, A.V., Rajamani, S.K., Tetali, S.: Compositional may-must program analysis: Unleashing the power of alternation. In: Proc. POPL, pp. 43–56. ACM (2010). https://doi.org/10.1145/1706299.1706307
Gulavani, B.S., Henzinger, T.A., Kannan, Y., Nori, A.V., Rajamani, S.K.: Synergy: A new algorithm for property checking. In: Proc. FSE, pp. 117–127. ACM (2006). https://doi.org/10.1145/1181775.1181790
Gulwani, S., Jain, S., Koskinen, E.: Control-flow refinement and progress invariants for bound analysis. In: Proc. PLDI, pp. 375–385. ACM (2009). https://doi.org/10.1145/1542476.1542518
Henzinger, T.A., Jhala, R., Majumdar, R., Necula, G.C., Sutre, G., Weimer, W.: Temporal-safety proofs for systems code. In: Proc. CAV, LNCS, vol. 2404, pp. 526–538. Springer (2002). https://doi.org/10.1007/3540456570_45
Holley, L.H., Rosen, B.K.: Qualified dataflow problems. In: Proc. POPL, pp. 68–82. ACM (1980). https://doi.org/10.1145/567446.567454
Jakobs, M.C., Wehrheim, H.: Certification for configurable program analysis. In: Proc. SPIN, pp. 30–39. ACM (2014). https://doi.org/10.1145/2632362.2632372
Jakobs, M.C., Wehrheim, H.: Programs from proofs: A framework for the safe execution of untrusted software. ACM Trans. Program. Lang. Syst. 39(2), 7:1–7:56 (2017). https://doi.org/10.1145/3014427
Jalote, P., Vangala, V., Singh, T., Jain, P.: Program partitioning: A framework for combining static and dynamic analysis. In: Proc. WODA, pp. 11–16. ACM (2006). https://doi.org/10.1145/1138912.1138916
Li, K., Reichenbach, C., Csallner, C., Smaragdakis, Y.: Residual investigation: predictive and precise bug detection. In: Proc. ISSTA, pp. 298–308. ACM (2012). https://doi.org/10.1145/2338965.2336789
Necula, G.C.: Proof-carrying code. In: Proc. POPL, pp. 106–119. ACM (1997). https://doi.org/10.1145/263699.263712
Necula, G.C., McPeak, S., Weimer, W.: CCured: Type-safe retrofitting of legacy code. In: Proc. POPL, pp. 128–139. ACM (2002). https://doi.org/10.1145/503272.503286
Post, H., Sinz, C., Kaiser, A., Gorges, T.: Reducing false positives by combining abstract interpretation and bounded model checking. In: Proc. ASE, pp. 188–197. IEEE (2008). https://doi.org/10.1109/ASE.2008.29
Sharma, R., Dillig, I., Dillig, T., Aiken, A.: Simplifying loop-invariant generation using splitter predicates. In: Proc. CAV, LNCS, vol. 6806, pp. 703–719. Springer (2011). https://doi.org/10.1007/9783642221101_57
Sherman, E., Dwyer, M.B.: Structurally defined conditional dataflow static analysis. In: Proc. TACAS (2), LNCS, vol. 10806, pp. 249–265. Springer (2018). https://doi.org/10.1007/9783319899633_15
Steffen, B.: Property-oriented expansion. In: Proc. SAS, LNCS, vol. 1145, pp. 22–41. Springer (1996). https://doi.org/10.1007/3540617396_31
Thakur, A.V., Govindarajan, R.: Comprehensive path-sensitive dataflow analysis. In: Proc. CGO, pp. 55–63. ACM (2008). https://doi.org/10.1145/1356058.1356066
Wegman, M.N., Zadeck, F.K.: Constant propagation with conditional branches. In: Proc. POPL, pp. 291–299. ACM (1985). https://doi.org/10.1145/318593.318659
Yin, X., Knight, J.C., Weimer, W.: Exploiting refactoring in formal verification. In: Proc. DSN, pp. 53–62. IEEE (2009). https://doi.org/10.1109/DSN.2009.5270355
Rights and permissions
Open Access This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence and indicate if changes were made.
The images or other third party material in this chapter are included in the chapter's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.
Copyright information
© 2020 The Author(s)
About this paper
Cite this paper
Beyer, D., Jakobs, MC. (2020). FRed: Conditional Model Checking via Reducers and Folders. In: de Boer, F., Cerone, A. (eds) Software Engineering and Formal Methods. SEFM 2020. Lecture Notes in Computer Science, vol 12310. Springer, Cham. https://doi.org/10.1007/9783030587680_7
DOI: https://doi.org/10.1007/9783030587680_7
Publisher Name: Springer, Cham
Print ISBN: 9783030587673
Online ISBN: 9783030587680
eBook Packages: Computer Science, Computer Science (R0)