
1 Introduction

Due to the undecidability of software verification, even after more than 40 years of research on automatic software verification [31], some hard verification tasks cannot be solved by a single verifier alone. To increase the number of solvable tasks, one needs to combine the strengths of distinct verifiers. Several combinations [3, 8, 9, 20, 23, 25, 32, 33, 37] were proposed in the literature. One promising combination is conditional model checking (CMC) [9], which unlike others does not modify the programs nor let the combined techniques know each other. CMC works as follows: If the first verifier gives up on the verification task, it outputs a condition that describes the state space that it successfully verified. The (conditional) second verifier uses the condition of the first verifier to focus its work on the still-unverified state space. Note that one can easily extend the CMC approach to more than two verifiers by letting all verifiers generate conditions.

To easily construct conditional verifiers (i.e., verifiers that understand conditions) from existing off-the-shelf verifiers, a recent work proposed the concept of reducer-based CMC [13]. Instead of making a verifier aware of conditions, reducer-based CMC constructs a conditional verifier from an existing verifier by plugging a reducer in front of the verifier. The reducer is a preprocessor that given the original program and the condition as input, translates the condition into a (residual) program, a format that is understandable by classic verifiers.

The construction of a reducer, especially proving its soundness, is complex, and so far only one reducer exists. However, this reducer’s translation is very precise and may therefore construct programs that are orders of magnitude larger than the original program. To solve this problem, and to support systematic experimentation with different reducers, we propose the formal framework FRed, which streamlines and simplifies the construction of new reducers from existing ones. Its underlying idea is to construct a new reducer \(r=F \circ R\), a so-called fold-reducer, by sequentially composing an existing reducer R with a folder F. A folder uses a heuristic that specifies how to modify the program constructed by the existing reducer. More concretely, a folder defines which program locations of the program constructed by the existing reducer are collapsed into a new location and, thus, specifies how to coarsen the program. However, to avoid false alarms, the specified coarsening must not add new program behavior.

New conditional verifiers CV can be constructed with FRed according to the equation \(CV = V \circ (F \circ R)\), where \(r=(F \circ R)\) is the fold-reducer composed of the existing reducer R and a folder F, V is an arbitrary verifier, and \(\circ \) is the sequential composition operator. Figure 1 illustrates this construction in the context of reducer-based CMC. We used this construction to build 49 conditional verifiers, which use the already existing reducer, one of seven folders, and one of seven verifiers. Our large experimental study revealed that using several reducers (with different folders) can make the overall verification more effective.

Fig. 1. Reducer-based CMC configuration \((v_2 \circ r) \circ v_1\) with FRed

Contributions. We make the following contributions:

  • We introduce FRed, a framework for the composition of new reducers from existing reducers and folding heuristics.

  • We prove that FRed derives valid reducers in case the existing reducer is valid and the folding heuristic adheres to a correctness constraint.

  • We use our framework FRed to derive seven new reducers from the existing reducer ParComp  [13] and use them in various conditional verifiers.

  • We experimentally show that the overall effectiveness of reducer-based CMC can be increased using various reducers.

  • Our reducers and all experimental data are available for replication and to construct further conditional model checkers (see Sect. 6).

2 Background

Program Representation. Following the literature [8, 10], we model a program by a control-flow automaton (CFA) \(C = (L, \ell _0,G)\) consisting of a set \(L\) of locations, an initial location \(\ell _0\in L\), and a set of control-flow edges \(G\subseteq L\times Ops\times L\). The set \(Ops\) describes all possible operations. In our presentation, we only consider operations on integer variables that are either boolean expressions (so-called assume operations) or assignments. However, our implementation supports C programs. In the following, we use \(\mathcal {L}\) for the superset of all location sets and \(\mathcal {C}\) for the set of all CFAs. A CFA \(C=(L,\ell _0,G)\) is deterministic (i.e., representable as a C program) if for all control-flow edges \((\ell , op_1, \ell _1), (\ell , op_2, \ell _2)\in G\) either \(op_1=op_2\) and \(\ell _1=\ell _2\), or \(op_1\) and \(op_2\) are assume operations with \(op_1\equiv \lnot op_2\).
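To make the definition concrete, the following Python sketch encodes control-flow edges as triples and checks the determinism condition. It is an illustration only: the names `Op`, `is_negation`, and `is_deterministic` are hypothetical, the example edges are invented rather than taken from a figure, and the semantic check \(op_1\equiv \lnot op_2\) is approximated by a purely syntactic comparison against the form `!(e)`.

```python
from itertools import combinations
from typing import NamedTuple


class Op(NamedTuple):
    kind: str  # "assume" or "assign"
    text: str  # e.g. "N < 0" or "N = -N"


def is_negation(op1, op2):
    """Syntactic stand-in (an assumption) for op1 being equivalent to not op2."""
    return op1.text == f"!({op2.text})" or op2.text == f"!({op1.text})"


def is_deterministic(edges):
    """Check the determinism condition on a set of edges (src, op, dst)."""
    outgoing = {}
    for src, op, dst in edges:
        outgoing.setdefault(src, set()).add((op, dst))
    for succs in outgoing.values():
        # Distinct pairs differ in operation or target, so they are only
        # allowed if they are complementary assume operations.
        for (op1, _), (op2, _) in combinations(succs, 2):
            both_assume = op1.kind == "assume" and op2.kind == "assume"
            if not (both_assume and is_negation(op1, op2)):
                return False
    return True


# Hypothetical CFA fragment: a branch followed by an assignment.
branch_cfa = {
    ("l0", Op("assume", "N < 0"), "l1"),
    ("l0", Op("assume", "!(N < 0)"), "l2"),
    ("l1", Op("assign", "N = -N"), "l2"),
}
# A second target for the same assignment breaks determinism.
broken_cfa = branch_cfa | {("l1", Op("assign", "N = -N"), "l3")}
```

A real implementation would need a semantic equivalence check on assume operations; the syntactic test above merely keeps the sketch self-contained.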

Fig. 2. Example program absPow, its CFA, and a condition for our example absPow with accepting state \(q_f\) and assumptions elided (all true)

The left of Fig. 2 shows our example program absPow, which computes \(f(N)=2^{\lceil \log _2|N|\rceil }\) for \(N\ne 0\) and e.g., ensures the property \(f(N)\ne 0\). Next to program absPow, its deterministic CFA is shown, which contains one edge per assignment and two edges for each condition of an if- or while-statement. The two edges per if- or while-statement are labeled with the condition and its negation and represent the two evaluations of the condition.

Program Semantics. We use an operational semantics and represent a program’s state by a pair of location \(\ell \) (the value of the program counter) and concrete data state c. In our representation, a concrete data state is a mapping from the program variables into the set of integer values. Now, a concrete path \(\pi \) of a CFA \(C = (L, \ell _0,G)\) is a sequence \((\ell _0, c_0){\mathop {\rightarrow }\limits ^{g_1}}\dots {\mathop {\rightarrow }\limits ^{g_n}} (\ell _n, c_n)\) such that for all \(1\le i\le n:\) \(g_i = (\ell _{i-1}, op_i , \ell _i)\in G\) and \(c_{i-1}{\mathop {\rightarrow }\limits ^{op_i}}c_i\), i.e., (a) in case of assume operations, \(c_{i-1}\models op_i\) and \(c_{i-1}=c_i\) or (b) in case of assignments, \(c_i = \mathrm {SP}_{op_i} (c_{i-1})\) and \(SP\) is the strongest-post operator of the semantics. We let \(paths(C)\) be the set of all concrete paths of a CFA \(C\). Given a concrete path \(\pi =(\ell _0, c_0){\mathop {\rightarrow }\limits ^{g_1}}\dots {\mathop {\rightarrow }\limits ^{g_n}} (\ell _n, c_n)\), we derive its execution \(ex(\pi )=c_0c_1\dots c_n\). Finally, we define \(ex(C):=\{ex(\pi )\mid \pi \in paths(C)\}\) to be the executions of a CFA \(C\).

Condition. After an (incomplete) verification run, a condition sums up which concrete paths of a program have been explored [9]. We model the condition as an automaton describing the syntactical program paths that have been verified and the assumptions that have been made on these paths (i.e., which concrete data states were included). Thus, the condition’s edges are labeled by pairs of program edges and assumptions. We model assumptions as state conditions, letting \(\varPhi \) denote the set of all state conditions. Accepting states subsume explored paths, i.e., if a path’s prefix is accepted by the condition, the path has been explored. Non-explored paths either end in a non-accepting state or more often have a prefix that ends in a state \(q\) from which no further transition is applicable. Typically, the latter means that the verifier did not explore beyond the prefix.

The automaton on the right of Fig. 2 shows a condition for our example program absPow. For the sake of presentation, we left out the assumptions, which are all true. The condition states that the else-branch of the outermost if-statement was explored and that the verifier performed a BFS-like exploration of the if-branch, which split the exploration of the inner if-branch and which was interrupted after one loop unrolling. Formally, a condition is defined as follows.

Definition 1

A condition \(A=(Q,\varSigma ,\delta ,q_0,F)\) consists of

  • a finite set \(Q\) of states, an initial state \(q_0\in Q\), and accepting states \(F\subseteq Q\),

  • an alphabet \(\varSigma \subseteq 2^G\times \varPhi \), and

  • a transition relation \(\delta \subseteq Q\times \varSigma \times Q\) with \(\lnot \exists (q_f, \cdot ,q)\in \delta :\) \(q_f\in F\wedge q\notin F\).

We let \(\mathcal {A}\) be the set of conditions.

As already said, a condition describes which paths of a program have been looked at. The following definition formalizes this coverage property. Note that we use \(c\models \varphi \) to describe that a concrete data state c satisfies a state condition \(\varphi \).

Definition 2

A condition \(A=(Q,\varSigma ,\delta ,q_0,F)\) covers a concrete path \(\pi =(\ell _0, c_0){\mathop {\rightarrow }\limits ^{g_1}}\dots {\mathop {\rightarrow }\limits ^{g_n}}(\ell _n, c_n)\) if there exists a run \(\rho =q_0\xrightarrow {(G_1,\varphi _1)} \dots \xrightarrow {(G_k,\varphi _k)}q_k\) in \(A\) such that (a) \(0\le k\le n\), (b) \(q_k\in F\), and (c) \(\forall 1\le i\le k: g_i\in G_i \wedge (c_i\models \varphi _i)\).
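The coverage check of Definition 2 can be sketched in Python as follows. This is a minimal sketch under stated assumptions: transitions are encoded as tuples `(q, edge_set, phi, q_next)`, state conditions \(\varphi\) are modeled as Python predicates over a data state, and the function name `covers` as well as the example edges are hypothetical.

```python
def covers(delta, q0, accepting, path_edges, data_states):
    """Return True if the condition covers the concrete path.

    path_edges  = [g_1, ..., g_n]
    data_states = [c_0, c_1, ..., c_n]
    """
    current = {q0}
    if current & accepting:  # run of length k = 0
        return True
    # Step i consumes edge g_i and checks the assumption on c_i.
    for g, c in zip(path_edges, data_states[1:]):
        current = {
            q_next
            for (q, edge_set, phi, q_next) in delta
            if q in current and g in edge_set and phi(c)
        }
        if current & accepting:  # some prefix of the path is accepted
            return True
        if not current:          # no transition applicable anymore
            return False
    return False


# Hypothetical example: the condition accepts everything after edge g1.
g1 = ("l0", "N < 0", "l1")
g2 = ("l1", "N = -N", "l2")
g3 = ("l0", "!(N < 0)", "l3")
delta = [("q0", frozenset({g1}), lambda c: True, "qf")]

covered = covers(delta, "q0", {"qf"}, [g1, g2],
                 [{"N": -2}, {"N": -2}, {"N": 2}])
uncovered = covers(delta, "q0", {"qf"}, [g3], [{"N": 3}, {"N": 3}])
```

Tracking a set of condition states handles nondeterministic conditions; a path is covered as soon as any run over one of its prefixes reaches an accepting state.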

Reducer. The CMC approach suggests that after an incomplete verification run, a second verifier should use the produced condition to explore only the uncovered paths. However, many verifiers do not understand conditions. To overcome this problem, reducer-based CMC [13] suggests extending verifiers with a preprocessing step that translates the condition into a residual program. A residual program may overapproximate those program paths that are not covered by the condition, but must not introduce additional program behavior. We follow reducer-based CMC [13] and use reducers to compute residual programs.

Definition 3

A reducer is a function \(red: \mathcal {C}\times \mathcal {A}\rightarrow \mathcal {C}\) satisfying the residual property: \(\begin{array}{l} \forall C\!\in \!\mathcal {C}, \forall A\!\in \!\mathcal {A}: \mathrm {ex}(C){\setminus }\{\mathrm {ex}(\pi )\mid \! A~\mathrm {covers}~\pi \}\subseteq \mathrm {ex}(red(C,A)) \subseteq \mathrm {ex}(C).\end{array}\)

First, a reducer for a specific class of conditions was proposed [26]. Then, reducer-based CMC [13] generalized the first approach to use a reducer, named ParComp, which supports all kinds of conditions, and showed that it is indeed a reducer [13]. To compute a residual program, the reducer ParComp performs a parallel composition of the program and the condition. Starting in the initial location and initial condition state, it matches CFA edges with condition transitions that subsume the respective CFA edge. If no matching condition transition exists, ParComp switches to consider CFA edges only. Additionally, it stops exploring states containing a final state \(q\in F\) since the condition covers all longer paths.

However, the reducer ParComp has one drawback. Verifiers often unfold the program, e.g., unroll loops or inspect branches separately. Due to partially explored paths, some of the unfoldings become part of the condition and will be encoded in the residual program generated by ParComp. Thus, the residual program constructed by ParComp may become orders of magnitude larger than the original program, resulting in increased parsing costs for the second verifier. Additionally, a verifier \(v_2\) analyzing the residual program generated by ParComp is forced to apply the same unfoldings on the non-covered paths as the condition-generating verifier \(v_1\). However, it might be more effective or efficient if verifier \(v_2\) unfolded certain program structures of the original program less often (or never). To tackle this problem, we present the framework FRed that extends reducers like ParComp to let them compute smaller residual programs with fewer unfoldings at the cost of adding more explored paths to the residual program, i.e., computing less precise residual programs.

3 FRed: Fold-Reducers from Reducers

To assist a systematic exploration of the reducer design space, we present the framework FRed. With FRed one can methodically derive new reducers from existing ones, thereby controlling the precision and size of the produced residual programs. One only needs to define how to compress the residual programs computed by the original reducer. Currently, FRed is limited to the class of path-preserving reducers. Path-preserving reducers have the advantage that they keep the reference to the original program within the syntactical structure of the residual program, i.e., except for location renaming they encode a subset of the syntactical paths of the original program. This makes it easier to derive new reducers from them. Next, we formally define a path-preserving reducer, where \(\mathcal {U}\) is the universe of location markers (e.g., condition states).

Definition 4

A reducer \(ppr\) is path-preserving if for any generated residual program \(ppr((L,\ell _0,G),A)=(L_r,\ell _{0,r}, G_r)\) it is valid that (a) \(L_r\subseteq L\times \mathcal {U}\) for some \(\mathcal {U}\), (b) \(\ell _{0,r}=(\ell _0,\cdot )\), and (c) \(\forall ((\ell ,u),op,(\ell ',u'))\in G_r: \exists (\ell ,op,\ell ')\in G\).
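The three requirements of Definition 4 translate directly into a check on a single residual program. The Python sketch below is illustrative only; the function name `is_path_preserving_output` and the example CFAs are hypothetical, and residual locations are assumed to be pairs `(location, marker)`.

```python
def is_path_preserving_output(original, residual):
    """Check requirements (a)-(c) of Definition 4 for one residual program."""
    L, l0, G = original
    L_r, l0_r, G_r = residual
    # (a) every residual location refers to an original location
    locations_ok = all(loc in L for (loc, _marker) in L_r)
    # (b) the residual initial location refers to the original one
    initial_ok = l0_r[0] == l0
    # (c) every residual edge projects to an original edge
    edges_ok = all((src[0], op, dst[0]) in G for (src, op, dst) in G_r)
    return locations_ok and initial_ok and edges_ok


# Hypothetical original program: a single loop between l0 and l1.
original = (
    {"l0", "l1"},
    "l0",
    {("l0", "i < N", "l1"), ("l1", "i = i + 1", "l0")},
)
# Residual program that unrolls the loop once, using markers q0 and q1.
unrolled = (
    {("l0", "q0"), ("l1", "q0"), ("l0", "q1")},
    ("l0", "q0"),
    {(("l0", "q0"), "i < N", ("l1", "q0")),
     (("l1", "q0"), "i = i + 1", ("l0", "q1"))},
)
# Residual program with an edge that has no counterpart in the original.
invalid = (
    unrolled[0],
    unrolled[1],
    unrolled[2] | {(("l0", "q1"), "i = 0", ("l1", "q0"))},
)
```

Note that the definition quantifies over all programs and conditions, so a check like this can only refute path preservation for a given reducer, never establish it.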

Given a path-preserving reducer like ParComp, the goal of FRed is to derive new reducers that produce smaller, less precise residual programs. Our idea is that the new reducers aggregate certain similar behavior of the residual program \(C_r\) produced by the given path-preserving reducer. So far, the framework FRed supports syntactical aggregations that unite location states of the program \(C_r\). These aggregations can be used to revert loop-unfoldings or separation of branches, the main cause for large residual programs. Additionally, these aggregations are simple to compute. One needs to define only a partitioning of \(C_r\)’s location states into equivalence classes. However, to get proper reducers, the derived reducers must not introduce new program behavior. Transferred to our aggregations, this means that we must not combine location states of \(C_r\) that refer to different locations of the original program. We introduce the concept of a location-consistent partitioner that computes partitions respecting this requirement.

Definition 5

A location-consistent partitioner is a function p that maps a set \(L_r\subseteq \mathcal {L}\times \mathcal {U}\) to a partition \(\{L_1,\dots ,L_n\}\) of \(L_r\) s.t. \(\forall 1\le i\le n\!: |\{\ell \mid (\ell ,\cdot )\in L_i\}| =1\). We use \(\mathcal {P}\) for the set of all location-consistent partitioners.

As examples, we consider the two extreme location-consistent partitioners \(\mathrm {cfa}\) and \(\mathrm {sep}\) as defined in the following. Partitioner \(\mathrm {cfa}\) groups all elements with the same location and \(\mathrm {sep}\) never groups elements.

$$ \mathrm {cfa}(L_r) = \big \{\{(\ell ,u) \!\in \! L_r\mid \ell = \ell ' \} \,\big |\, \exists (\ell ',\cdot ) \!\in \! L_r\big \} \quad ~~ \mathrm {sep}(L_r) = \big \{\{(\ell ,u)\} \,\big |\, (\ell ,u) \!\in \! L_r\big \} $$

All remaining location-consistent partitioners group subsets of elements with same locations. Often, they are context dependent, i.e., they take into account the structure of the original program or the program \(C_r\) generated by the path-preserving reducer. For instance, we use the following partitioner that combines locations referring to the same loop head in the original program. The partitioner is parameterized by the loop heads \(L'\) of the original program.

$$\begin{array}{l l l} \mathrm {lh}_{L'}(L_r) &{} = &{} \mathrm {cfa}\big (\{(\ell , u)\in L_r\mid \ell \in L'\}\big ) ~\cup ~ \mathrm {sep}\big (\{(\ell , u)\in L_r\mid \ell \notin L'\}\big )\\ \end{array}$$
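The three partitioners can be sketched in Python as follows. This is a minimal sketch, assuming that location states are pairs `(location, marker)` and that a partition is a set of frozensets; the helper `is_location_consistent` and the example data are hypothetical additions used to illustrate Definition 5.

```python
def cfa(L_r):
    """Group all location states that share the same original location."""
    groups = {}
    for (loc, marker) in L_r:
        groups.setdefault(loc, set()).add((loc, marker))
    return {frozenset(group) for group in groups.values()}


def sep(L_r):
    """Keep every location state in its own partition element."""
    return {frozenset({state}) for state in L_r}


def lh(loop_heads):
    """Build the partitioner that merges loop heads and separates the rest."""
    def partition(L_r):
        heads = {state for state in L_r if state[0] in loop_heads}
        return cfa(heads) | sep(set(L_r) - heads)
    return partition


def is_location_consistent(partition):
    """Each partition element must refer to exactly one original location."""
    return all(len({loc for (loc, _) in block}) == 1 for block in partition)


# Hypothetical location states of a residual program.
L_r = {("l4", "q1"), ("l4", "q2"), ("l5", "q1"), ("l5", "q3")}
by_cfa = cfa(L_r)          # two elements, one per original location
by_sep = sep(L_r)          # four singleton elements
by_lh = lh({"l4"})(L_r)    # l4 states merged, l5 states kept apart
```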

A partitioning of the nodes of a graph, e.g., a CFA, induces a coarser graph. Each set of nodes becomes a node of the new graph and there exists an edge between two sets of nodes if there exists an edge between two nodes in the original graph, one in each set. A folder applies this principle to compress a residual program computed by a path-preserving reducer. A location-consistent partitioner defines the partitioning of location states. Furthermore, the new initial program location is the set of location states that contains the original initial location. Due to the partitioner’s properties, exactly one such set exists.

Definition 6

A folder \(\mathrm {fold}: \mathcal {C}\times \mathcal {P}\rightarrow \mathcal {C}\) compresses a CFA \(C_r=(L_r,\ell _{0,r},G_r)\) with a location-consistent partitioner \(p\) such that

$$\begin{aligned} \mathrm {fold}((L_r,\ell _{0,r},G_r),p)=(p(L_r),\ell _{0,p}, G_p)~\text {with } \end{aligned}$$

\(\ell _{0,r}\!\in \!\ell _{0,p}\) and  \(G_p=\!\big \{(\ell _p,op,\ell '_p) \,\big |\, \ell _p,\ell '_p\in p(L_r) \wedge \exists (\ell ,op,\ell ')\!\in \!G_r: \ell \!\in \!\ell _p \wedge \ell '\!\in \!\ell '_p\big \}\).
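As an illustration of Definition 6, the following Python sketch folds a residual program with a given partitioner. The names `fold` and `merge_same_location` and the unrolled-loop example are hypothetical; partition elements are represented as frozensets so that they can serve as locations of the folded CFA.

```python
def fold(cfa_r, partitioner):
    """Compress the CFA (L_r, l0_r, G_r) according to the partitioner."""
    L_r, l0_r, G_r = cfa_r
    blocks = partitioner(L_r)
    # Every location state belongs to exactly one partition element.
    block_of = {state: block for block in blocks for state in block}
    G_p = {(block_of[src], op, block_of[dst]) for (src, op, dst) in G_r}
    return blocks, block_of[l0_r], G_p


def merge_same_location(L_r):
    """Partitioner that groups location states by original location."""
    groups = {}
    for (loc, marker) in L_r:
        groups.setdefault(loc, set()).add((loc, marker))
    return {frozenset(group) for group in groups.values()}


# Hypothetical residual program: a loop unrolled twice.
G_r = {
    (("l4", "q0"), "i < N", ("l5", "q0")),
    (("l5", "q0"), "i = i + 1", ("l4", "q1")),
    (("l4", "q1"), "i < N", ("l5", "q1")),
    (("l5", "q1"), "i = i + 1", ("l4", "q2")),
}
L_r = {state for (src, _, dst) in G_r for state in (src, dst)}

folded_locs, folded_init, folded_edges = fold(
    (L_r, ("l4", "q0"), G_r), merge_same_location
)
```

In this example the five location states collapse into two locations and the four edges into two, which restores the original loop.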

We use folders to construct so-called fold-reducers from an existing path-preserving reducer. To this end, we compose the path-preserving reducer with a folder.

Definition 7

Let p be a location-consistent partitioner and ppr a path-preserving reducer. The fold-reducer for p and ppr is

$$\begin{aligned} \textsc {FoldRed}_p^{ppr}(C,A):=\mathrm {fold}(ppr(C,A),p). \end{aligned}$$

Fig. 3. Five residual programs with increasing program sizes and varying program structure, constructed by the seven fold-reducers considered in the evaluation

Figure 3 shows five residual programs constructed from program absPow (Fig. 2, left) and the condition for it (Fig. 2, right). The residual programs differ in their program size and structure. They were constructed by the seven different fold-reducers used in the evaluation, all of them using the reducer ParComp [13], but we converted them into a more readable form using proper if- and while-statements instead of gotos. Note that for this example, some fold-reducers constructed the same residual program. To construct the residual programs in Figs. 3a and 3e, the partitioners \(\mathrm {cfa}\) and \(\mathrm {sep}\) could be used, respectively. For the residual program in Fig. 3b, we used partitioner \(\mathrm {lh}_{L'}\) with \(L'=\{\ell _4\}\). The partitioner used to construct the program in Fig. 3c undoes unfoldings of if-statements but keeps loop-unfoldings. Finally, the program in Fig. 3d is generated with a partitioner that allows loop-unfoldings up to a given bound of ten and then folds them. However, loop heads of the same iteration are always combined.

Above, we used fold-reducers to compute residual programs. In general, we plan to use fold-reducers in the construction of conditional verifiers. Thus, we must show that fold-reducers are reducers. Syntactically, fold-reducers look like reducers. It remains to be shown that fold-reducers fulfill the residual property.

Theorem 1

Every fold-reducer \(\textsc {FoldRed}_{p}^{ppr}\) is a reducer.

Proof

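We need to show that \(\text {ex}(C)\setminus \{\text {ex}(\pi )\mid A~\text {covers}~\pi \}\subseteq \text {ex}(\textsc {FoldRed}_{p}^{ppr}(C,A))\subseteq \text {ex}(C)\). Since ppr is a reducer, \(\text {ex}(C)\setminus \{\text {ex}(\pi )\mid A~\text {covers}~\pi \}\subseteq \text {ex}(ppr(C,A))\). Thus, it suffices to show that \(\text {ex}(ppr(C,A))\subseteq \text {ex}(\textsc {FoldRed}_{p}^{ppr}(C,A))\subseteq \text {ex}(C)\).

In the following, let \(C=(L_o,\ell _{0,o}, G_o)\), \(ppr(C,A)=(L_r, \ell _{0,r}, G_r)\), and \(\textsc {FoldRed}_{p}^{ppr}(C,A)=(L_f,\ell _{0,f},G_f)\). Due to the requirements on \(p\) and the definition of the fold-reducer, there exists a unique function \(h: L_r\rightarrow L_f\) with \(\forall \ell _r\in L_r: \ell _r\in h(\ell _r)\) and \(h(\ell _{0,r})=\ell _{0,f}\).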

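Part I) \(\text {ex}(ppr(C,A))\subseteq \text {ex}(\textsc {FoldRed}_{p}^{ppr}(C,A))\):

Part II) \(\text {ex}(\textsc {FoldRed}_{p}^{ppr}(C,A))\subseteq \text {ex}(C)\):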

Fig. 4. Nondeterministic residual program built from program absPow, the condition from Fig. 2, and a fold-reducer using reducer ParComp and partitioner \(lh_{\{\ell _4\}}\)

In practice, arbitrary fold-reducers are unsatisfactory since they may produce non-deterministic CFAs, which cannot be translated to C programs. Figure 4 shows an example of a non-deterministic CFA generated by a fold-reducer. In the example, the non-determinism is caused by the partitioner \(lh_{\{\ell _4\}}\), which only combines loop heads. Generally, the condition may also cause non-determinism. To solve the non-determinism problem, we transform a fold-reducer into a deterministic fold-reducer that generates deterministic residual programs from deterministic input programs. The basic idea is to adapt the partitioner to compute a coarser partitioning. The coarser partitioning combines all partition elements of the original partition that would cause the residual program to be non-deterministic.

Algorithm 1 shows how to compute such a coarser partitioning from the original partitioning. Starting with the original partitioning, it combines partitions of its current partitioning as long as there exist two CFA edges causing non-determinism, i.e., they consider the same operation and start in the same partition element, but end in different partition elements.
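The merging loop described above can be sketched in Python as follows. This is a sketch reconstructed from the prose description only, not a transcription of Alg. 1; the function name `determinize` and the example states are hypothetical, and the sketch merges one conflicting pair of partition elements per round.

```python
def determinize(edges, partition):
    """Coarsen a partition until no two edges with the same operation start
    in the same partition element but end in different ones."""
    blocks = {frozenset(block) for block in partition}
    changed = True
    while changed:
        changed = False
        block_of = {state: block for block in blocks for state in block}
        target = {}  # (source element, operation) -> target element
        for src, op, dst in edges:
            key = (block_of[src], op)
            seen = target.setdefault(key, block_of[dst])
            if seen != block_of[dst]:
                # Conflict: merge the two target elements and start over.
                blocks -= {seen, block_of[dst]}
                blocks.add(seen | block_of[dst])
                changed = True
                break
    return blocks


# Hypothetical unrolled loop; a and b are copies of the same loop head.
a, b, e = ("l4", "q0"), ("l4", "q1"), ("l4", "q2")
c, d = ("l5", "q0"), ("l5", "q1")
edges = {
    (a, "i < N", c),
    (b, "i < N", d),
    (c, "i = i + 1", b),
    (d, "i = i + 1", e),
}
# Only a and b are merged initially, so "i < N" has two possible targets.
initial = {frozenset({a, b}), frozenset({c}), frozenset({d}), frozenset({e})}
coarser = determinize(edges, initial)
```

In this example, merging c and d resolves the first conflict but creates a second one on `i = i + 1`, which in turn forces e into the loop-head element. The loop terminates because every round reduces the number of partition elements.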


Attentive readers already noticed that Alg. 1 uses the program \(C_r\) generated by the path-preserving reducer to adapt the partitioning. Since multiple programs may consider the same set of location states but different control-flow edges, it is impossible to adapt the partitioner without knowledge of \(C_r\). Thus, a deterministic fold-reducer must use different adaptions of the partitioner p. The correct adaption depends on the input program and the path-preserving reducer. We use the following adaption, which depends on the original program and the path-preserving reducer ppr used by the fold-reducer.

$$\mathrm {det}_{ppr(C,A), p}(L):=\left\{ \begin{array}{c l} \mathrm {det}(ppr(C,A),p) &{} \text {if } ppr(C,A)=(L,\cdot ,\cdot ) \wedge C \text {~deterministic}\\ p(L) &{} \mathrm {else} \end{array}\right. $$

The adapted partitioner returns the partitioning computed by the original partitioner except for one case. When the original program \(C\) is deterministic and the adapted partitioner is given the location states of the program computed by the path-preserving reducer, the partition is adapted with Alg. 1. Note that we neglect to apply Alg. 1 for non-deterministic original programs, because it then may combine partitions considering different location states of the original program, thus, resulting in a location-inconsistent partitioner. However, to use the adapted partitioner in a fold-reducer, it must remain location-consistent.

Lemma 1

For a given CFA \(C\), condition \(A\), path-preserving reducer \(ppr\), location-consistent partitioner p, function \(\mathrm {det}_{ppr(C,A), p}\) is a location-consistent partitioner.

Knowing that the adapted partitioner remains location-consistent, we explain how to derive a deterministic fold-reducer from a fold-reducer. The idea is simple. The deterministic fold-reducer uses for each input program a dedicated variant of the original fold-reducer. This dedicated variant uses the prescribed adaption \(\mathrm {det}(ppr(C,A),p)\) of the original partitioner to the original program.

Definition 8

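Let \(\textsc {FoldRed}_{p}^{ppr}\) be a fold-reducer. We define the deterministic fold-reducer to be \(\textsc {FoldRed}_{p,ppr}^{\mathrm {det}}(C,A):=\textsc {FoldRed}_{\mathrm {det}_{ppr(C,A), p}}^{ppr}(C,A)\).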

We already showed that the proposed adaption of the location-consistent partitioner results in a location-consistent partitioner. Now, we can easily conclude that deterministic fold-reducers guarantee the residual property and, thus, can be used to construct conditional verifiers.

Corollary 1

Every deterministic fold-reducer \(\textsc {FoldRed}_{p,ppr}^{\mathrm {det}}\) is a reducer.

While the previous property is mandatory, we build deterministic fold-reducers to produce deterministic programs when given deterministic programs. The subsequent proposition certifies this property of deterministic fold-reducers.

Proposition 1

Given a deterministic fold-reducer \(\textsc {FoldRed}_{p,ppr}^{\mathrm {det}}\), a deterministic control-flow automaton \(C\), and a condition \(A\), the residual program \(\textsc {FoldRed}_{p,ppr}^{\mathrm {det}}(C,A)\) is deterministic.

4 Evaluation

The main goals of our experiments are to systematically investigate different (fold-)reducers and to find out whether fold-reducers can overcome the problem that reducer ParComp sometimes generates residual programs that are too large and too precise. Since ParComp was the only available reducer, our goal was to counteract its weaknesses (i.e., the sometimes large residual programs); investigating whether one needs to settle for ParComp’s weakness is beyond the scope of this evaluation. Another goal of our evaluation is to compare CMC with fold-reducers against non-cooperative combinations, especially sequential combinations. This leads us to three research questions:

  • RQ 1. Do distinct fold-reducers generate different residual programs?

  • RQ 2. Can fold-reducers be better than reducer ParComp, and is there a reducer that dominates the others?

  • RQ 3. Can reducer-based CMC replace non-cooperative verifier combinations?

4.1 Experimental Setup

CMC Configurations. A reducer-based CMC configuration consists of (1) a condition-generating verifier \(v_1\), (2) a reducer r, and (3) a second verifier \(v_2\) (cf. Figure 1). For components \(v_1\) and r, we use CPAchecker [14] in revision r32965 since it already provides condition-generating verifiers and reducer ParComp  [13].

As in other works [9, 13], we use a predicate analysis [15] and a value analysis [16], both using a time limit of 100 s, as condition-generating verifiers. If they do not succeed within 100 s, they give up and output a condition. For verifier \(v_2\), we use the three tools   [29], ESBMC [34], and VeriAbs [30] that performed best on the reachability categories of SV-COMP 2020 as well as , which performed best in the SoftwareSystems category of SV-COMP 2020. For all four tools, we use their version submitted to SV-COMP 2020. Additionally, we used three well-maintained analyses, kInduction [7], predicate analysis [15], and value analysis [16], which are part of the award-winning sequential composition of CPAchecker [29]. For them, we also use CPAchecker revision r32965.

We investigated seven fold-reducers r, which we implemented in the FRed plug-in for CPAchecker. All fold-reducers inline functions and typically use the deterministic fold-reducer variant of the reducers described in Sect. 3. Only the CFA and the SEP reducers already generate deterministic residual programs and do not need to use the deterministic variant. The seven fold-reducers are:

  • CFA Fold-reducer that uses partitioner \({cfa}\), i.e., it combines elements with same location states and, thus, reconstructs those parts of the original CFA that have not been fully explored.

  • LH Fold-reducer that is based on partitioner \(lh_{L'}\) and undoes loop-unfoldings. It combines all elements with the same loop-head location state from \(L'\).

  • LHC Fold-reducer that also aims at reverting loop-unfoldings, but avoids combining loop executions started in different contexts, i.e., reached on different syntactical paths ignoring finished loops.

  • LHB Fold-reducer that limits loop-unfoldings, i.e., keeps loop-unfoldings up to a given bound (we use 10) and afterwards collapses the unfoldings.

  • LHBC Fold-reducer that like LHB limits loop-unfoldings up to a bound of 10, but additionally separates loop executions with different contexts like LHC.

  • NLH Fold-reducer that undoes branch- but not loop-unfoldings (keeps different loop iterations separated).

  • SEP Fold-reducer that never combines elements, uses partitioner \({sep}\) (same as ParComp  [13]).

Combining each fold-reducer r with all second verifiers \(v_2\), we obtain 49 conditional verifiers \(v_2 \circ r\). Combining the conditional verifiers with the condition-generating verifiers gives us 84 reducer-based CMC configurations.

Tasks. For our evaluation, we considered the well-established benchmark set from the competition on software verification [4]. We focused on the 6 907 tasks of the ReachSafety categories, because all considered analyses can verify the property “no call to function __VERIFIER_error() is reachable”. For each condition-generating verifier \(v_1\), we created a task set that excludes all tasks for which all reducers reported an error (\(\approx \)11%) as well as all easy tasks (\(\approx \)45%). A task is considered easy if it does not require CMC because it can be solved in 100 s by \(v_1\) or in 1 000 s by all verifiers \(v_2\). Thus, we only look at tasks for which CMC can contribute additional value (2 949 tasks for CMC with \(v_1\) = predicate analysis and 3 046 tasks for CMC with \(v_1\) = value analysis).

Execution Framework. We performed our experiments on machines with 33 GB of memory and an Intel Xeon E3-1230 v5 CPU (8 processing units and a frequency of 3.4 GHz). The machines run Ubuntu 18.04 (Linux kernel 4.15). We use BenchExec [17] to run our experiments. To ensure that all CMC configurations with the same verifier \(v_1\) use the same conditions, we run the condition-generating verifiers \(v_1\) once with a runtime limit of 100 s and a memory limit of 15 GB. The generated conditions are then used when running the conditional verifiers with a runtime limit of 900 s and a memory limit of 15 GB.

Replication Support. Our experimental data are available online (see Sect. 6).

Fig. 5. Shape graphs (indicating structure) of residual programs constructed from program sqrt_Householder_interval.c by respective fold-reducer

Fig. 6. Boxplot for size increase of residual programs

Table 1. Number of verification tasks solved correctly by each CMC configuration that uses the predicate analysis (upper part) or the value analysis (lower part) for condition generation; last column combines the previous columns

4.2 Experimental Results

RQ 1 (Different residual programs?) Already our example (Figure 3) shows that residual programs generated by different reducers can significantly differ in the program size and the branching structure. To further investigate the difference of residual programs, we searched our tasks for programs for which all seven reducers generated residual programs with different numbers of program locations, and selected the program sqrt_Householder_interval.c. Figure 5 shows graph shapes of the CFAs of the residual programs generated by the seven fold-reducers. In a graph shape, the width of line i is proportional to the number of CFA nodes with a shortest path of length i from the initial location. We observe that the graph shapes differ in their height and width. Thus, residual programs differ in their branching structure. Finally, we looked at the size increase of the residual programs, i.e., number of locations of residual program (\(|L_\mathrm {residual}|\)) divided by number of locations of original program (\(|L_\mathrm {original}|\)). Figure 6 shows boxplots depicting for each reducer the distribution of the size increases of its residual programs. We observe that the boxes differ in size, the median (middle line) and the whiskers, which supports that residual programs from distinct reducers differ.
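The two measures used for RQ 1 are easy to state in code. The following Python sketch is an illustration only; the function names `graph_shape` and `size_increase` and the small example graph are hypothetical and unrelated to the plotting scripts used for Figs. 5 and 6. The shape is computed with a breadth-first search, so entry i of the result is the number of locations whose shortest path from the initial location has length i.

```python
from collections import deque


def graph_shape(initial, edges):
    """Count, per shortest-path distance, the locations reachable from
    the initial location."""
    successors = {}
    for src, _op, dst in edges:
        successors.setdefault(src, []).append(dst)
    depth = {initial: 0}
    queue = deque([initial])
    while queue:
        node = queue.popleft()
        for nxt in successors.get(node, []):
            if nxt not in depth:  # first visit yields the shortest distance
                depth[nxt] = depth[node] + 1
                queue.append(nxt)
    widths = [0] * (max(depth.values()) + 1)
    for d in depth.values():
        widths[d] += 1
    return widths


def size_increase(residual_locations, original_locations):
    """Ratio |L_residual| / |L_original|."""
    return len(residual_locations) / len(original_locations)


# Hypothetical CFA: a branch that joins again, followed by one more edge.
diamond = {
    (0, "a", 1), (0, "!(a)", 2),
    (1, "x = 1", 3), (2, "x = 2", 3),
    (3, "y = x", 4),
}
shape = graph_shape(0, diamond)
ratio = size_increase({1, 2, 3, 4, 5, 6}, {1, 2, 3})
```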

RQ 2 (Better than ParComp and existence of dominating reducer?) To answer this research question, we study the number of tasks solved correctly by the CMC configurations. We focus on correctly solved tasks and exclude incorrectly solved tasks, which are an unreliable source of information caused by an unsound CMC configuration, e.g., due to an unsound verifier or a bug in one of the CMC configurations. For each CMC configuration, we report the numbers for the full task set and for a restricted task set that only considers those tasks that cannot be solved by the two verifiers in the CMC configuration and, thus, require cooperation, e.g., via CMC. Table 1 shows the numbers for the CMC configurations using the predicate analysis (upper part) and the value analysis (lower part) for the condition-generating verifier \(v_1\). The total number of tasks considered in each column is reported at the top. The CMC configurations are fixed by the reducer (row) and the verifier \(v_2\) (columns). Column ‘+All’ displays the numbers of correctly solved tasks by CMC configurations with any verifier \(v_2\), but excluding tasks that one of the CMC configurations solved incorrectly. Similarly, row ‘All’ uses any reducer. The last row is discussed later.

Looking at Table 1, we first observe that there exist verifier combinations for which the CMC configurations using the SEP reducer, which is identical to reducer ParComp, do not solve the most tasks (bold numbers). We also observe that for some CMC configurations the best reducer differs between the full and the restricted task set. Also, the best reducers differ when changing the condition-generating verifier. Hence, the best reducer depends on (1) the task set, and (2) the verifier combination. Additionally, we observe that the numbers in row ‘All’ are often larger than in the previous rows. Thus, we are more effective when using different reducers. Moreover, our raw data revealed that for all seven reducers there exist tasks that can only be solved by a verifier combination when using this particular reducer. Therefore, we need all seven reducers.

RQ 3 (Replacement for non-cooperative verifier combinations?) To answer this question, we compare CMC with fold-reducers against a combination that executes verifiers \(v_1\) and \(v_2\) in sequence, using the same program for both verifiers and without exchanging any information. This combination is identical to CMC with the identity reducer ID, which returns the input program. Row ID in Table 1 shows the number of tasks solved correctly by the sequential composition. Obviously, the sequential composition does not solve any task in the restricted task set, which only contains tasks that cannot be solved by \(v_1\) and \(v_2\). To solve these tasks, one needs cooperation approaches like reducer-based CMC. For the full task set, we observe that, except for one case, row ID solves more tasks than the other rows. Hence, reducer-based CMC should only be used for hard verification tasks that cannot be solved by single verifiers and, thus, need cooperation.
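The relation between the sequential composition and reducer-based CMC can be made concrete with a small sketch. It is an illustration under simplifying assumptions: verifiers are modeled as plain Python functions, \(v_1\) returns a pair of verdict and condition, \(v_2\) returns a verdict, and programs and conditions are opaque strings. All names (`reducer_based_cmc`, `identity_reducer`, the toy verifiers) are hypothetical and not part of any tool.

```python
def identity_reducer(program, condition):
    """Reducer ID: ignores the condition and returns the input program."""
    return program


def reducer_based_cmc(program, v1, reducer, v2):
    """Run v1; if it gives up, reduce the program with v1's condition
    and hand the residual program to v2."""
    verdict, condition = v1(program)
    if verdict != "unknown":
        return verdict
    residual = reducer(program, condition)
    return v2(residual)


# Hypothetical toy verifiers for demonstration.
seen_by_v2 = []


def v1_gives_up(program):
    return ("unknown", "condition for " + program)


def v2_records_input(program):
    seen_by_v2.append(program)
    return "true"


result = reducer_based_cmc("original.c", v1_gives_up, identity_reducer, v2_records_input)
print(result)      # true
print(seen_by_v2)  # ['original.c']
```

With `identity_reducer`, the second verifier receives exactly the original program, so no information from \(v_1\) reaches \(v_2\); swapping in another reducer is the only change needed to obtain a cooperative configuration.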

4.3 Threats to Validity

In theory, our reducers fulfill the residual condition. However, in practice our reducer implementation might contain bugs that lead to residual programs that add or miss program behavior, i.e., violate the residual condition. In principle, such bugs can lead to residual programs that fulfill the same property as the original program but are easier to verify. Hence, some of the correctly solved tasks might come from such bugs. Furthermore, our results concerning the reducers may not generalize. First, we considered a subset of the SV-COMP tasks and analyses that are run in SV-COMP. The analyses are likely trained on the tasks. However, CMC configurations that unfold the original program a lot, and thus generate residual programs that look different from the original program, also solved many tasks. We are confident that our results apply to other programs. Second, we used specific time limits for the condition-generating verifier \(v_1\) and the conditional verifier (reducer plus verifier \(v_2\)). While we chose common time limits, our results may look different when using different limits.

5 Related Work

Our work is based on the idea of conditional model checking (CMC) [9], which combines analyses via condition passing. The early conditional model checkers [9] used the condition to directly steer the exploration of the second analysis. Translating the condition into a residual program was first proposed in 2015 [26]. Besides slicing, they construct the residual program from a parallel combination of condition and program. Recently, reducer-based CMC [13] generalized the idea of residual programs and introduced the concept of a reducer. The proposed reducer was similar to the earlier parallel combination [26]. In this paper, we construct multiple, new reducers from the original reducer [13].

Combination of Analyses. One type of combination validates verification results. These combinations try to confirm alarms [18, 25, 28, 35, 44, 47] or proofs [1, 39, 41, 45], possibly excluding unconfirmed results. Violation and correctness witnesses [5, 6] provide a tool-independent exchange format for alarms and proofs, enabling other tools to check a verifier’s result. Further combinations join forces of different analyses. On the one hand, analysis domains are integrated [8, 10, 23, 24, 33] to get more precise domains than the pure product. On the other hand, interleavings of analysis algorithms are proposed [3, 27, 36, 37] to benefit from (intermediate) results of other algorithms. A third class of combinations distributes the verification effort among different tools. CMC [9] and reducer-based CMC [13], which we apply, belong to this class. Often, the program parts that could not be verified by the first analyzer are encoded as programs. Sometimes annotations (assertions) are added [19, 20, 21, 46], while program trimming [32] adds assume statements to the original program. Reducer-based CMC [13] and program partitioning [43] output a new program describing a subset of the original program paths. Abstraction-driven concolic testing [27] interleaves concolic testing and predicate abstraction to construct test cases for test goals. CoVeriTest [11] recently generalized this approach. Conditional static analysis [49] splits the program paths into subsets, runs one dataflow analysis on each subset, and finally combines the results of these restricted analyses.

Program Transformation for Verification. Our work uses fold-reducers to transform the original program to remove already-verified paths. Like any reducer, fold-reducers may unfold the structure (execution paths) of the original program. Moreover, fold-reducers use a folder that aims at reverting some of the unfoldings introduced by the existing reducer used in the fold-reducer. Likewise, verification refactoring [53] heuristically undoes compiler optimizations to ease verification. Programs-from-proofs [42] pursues the same goal, but it unfolds the program structure to ease verification. Program partitioning [43] and abstraction-driven concolic testing [27] transform the original program to remove tested or infeasible program paths. Unfolding the program structure is a common approach to remove infeasible paths [2, 38, 48] or improve the analysis result [40, 50, 51]. In contrast, folding is used less often. Examples are compiler optimizations like constant propagation [52] and common-subexpression elimination [22].

6 Conclusion

One solution to the problem of verifying complex software systems is to improve verification algorithms and theories. An orthogonal solution is to combine existing techniques. Conditional model checking (CMC) is a promising approach to combine the strengths of different verifiers. To construct new conditional model checkers from existing model checkers in an implementation-less and configurable manner (off-the-shelf, plug-and-play), the concept of reducer-based CMC was recently proposed [13]. Instead of spending developer resources on adapting existing verifiers to make them understand conditions (the information exchange format in CMC), reducer-based CMC suggests putting reducers in front of existing, off-the-shelf verifiers. The task of a reducer is to convert the condition into a format that the verifier already understands, namely program code. Until now, only one reducer existed. Our experiments revealed that there is a lot of potential for improving the effectiveness by using different kinds of reducers.

Developing new reducers can be a laborious task. One must define how to compute the residual program from the input condition and program. Moreover, one must prove that the reducer fulfills the residual property, a correctness property for the reducer. To systematically study reducers, we developed the framework FRed, which simplifies the development of new reducers. FRed allows us to derive the new reducer from an existing one and a heuristic that describes how to coarsen the residual program generated by the existing reducer. To prove that the derived reducer is indeed a reducer, one only needs to show that the specified heuristic is a location-consistent partitioner, a property much simpler than the residual property. Our experience with FRed is that developing and implementing a new heuristic takes at most a few hours. In the future, algorithm selection could be applied to choose the most suitable reducer for a task.
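The derivation of a new reducer from an existing reducer and a coarsening heuristic can be sketched as a function composition. The following is only a toy illustration of the idea \(r=F \circ R\): programs are modeled as hypothetical sets of edges `(source, operation, target)`, the heuristic is a function that maps each location of the residual program to the name of the block it is collapsed into, and the sketch does not check that the heuristic is a location-consistent partitioner. All names are invented for this example and do not reflect the FRed implementation.

```python
def fold(edges, block_of):
    """Collapse locations: every edge is redirected to the blocks of its endpoints."""
    return {(block_of[src], op, block_of[dst]) for (src, op, dst) in edges}


def fold_reducer(reducer, partitioner):
    """Build a new reducer by running `reducer` and folding its output
    according to the partition chosen by `partitioner`."""
    def composed(program, condition):
        residual = reducer(program, condition)
        return fold(residual, partitioner(residual))
    return composed


# Hypothetical existing reducer that returns an unfolded residual program.
def toy_reducer(program, condition):
    return {
        ("l0_a", "x = 0", "l1_a"),
        ("l0_a", "x = 1", "l1_b"),
        ("l1_a", "skip", "l2_a"),
        ("l1_b", "skip", "l2_a"),
    }


# Hypothetical heuristic: merge all copies of the same original location,
# identified here by the name prefix before the underscore.
def by_original_location(edges):
    locations = {loc for (src, _, dst) in edges for loc in (src, dst)}
    return {loc: loc.split("_")[0] for loc in locations}


new_reducer = fold_reducer(toy_reducer, by_original_location)
folded = new_reducer(program=None, condition=None)
print(len(folded))  # 3
```

In this toy run, the two copies `l1_a` and `l1_b` are merged into one location `l1`, so the folded program has three locations instead of four; a different heuristic only requires exchanging the `partitioner` argument.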

Data Availability Statement. The reducers and all experimental data are publicly available for replication on a web page (Footnote 11) and as a replication package [12].