Our goal is to define a configurable and flexible framework for predicate-based approaches that is helpful both in theory (by simplifying the development and study of approaches) and in practice (by being customizable for different use cases). In addition, a mature and efficient implementation of this framework should allow reliable scientific experiments with, and practical application of, the approaches that are integrated now or in the future.
The core of our framework is defined as a CPA for predicate-based analyses, which we name the Predicate CPA \(\mathbb {P}\). It is an extension of an existing CPA for predicate abstraction with adjustable-block encoding (ABE) [21], and a preliminary version was already published [25]. The Predicate CPA \(\mathbb {P}= (D_\mathbb {P}, {\varPi }_\mathbb {P}, \rightsquigarrow _\mathbb {P}, \mathsf {merge}_\mathbb {P}, \mathsf {stop}_\mathbb {P}, \mathsf {prec}_\mathbb {P})\) consists of the abstract domain \(D_\mathbb {P}\), the set \({\varPi }_\mathbb {P}\) of precisions, the transfer relation \(\rightsquigarrow _\mathbb {P}\), the merge operator \(\mathsf {merge}_\mathbb {P}\), the stop operator \(\mathsf {stop}_\mathbb {P}\), and the operator \(\mathsf {prec}_\mathbb {P}\) for dynamic precision adjustment. Additionally, we will define an operator \(\mathsf {fcover}_\mathbb {P}\) for Impact-style forced covering and an operator \(\mathsf {refine}_\mathbb {P}\) for refinements. In the following, we define and describe these parts in more detail. We also provide an extended version of the CPA algorithm, and in the next section we describe how to express various algorithms for software verification using the concepts defined here. The examples in this section illustrate some cases that occur when verifying the running example program given in Fig. 2 using one of these algorithms from Sect. 4.
Abstract Domain, Precisions, and CPA Operators
The abstract domain \(D_\mathbb {P}= (C, \mathcal {E}_\mathbb {P}, [\![ \cdot ]\!]_\mathbb {P})\) consists of the set C of concrete states, the semilattice \(\mathcal {E}_\mathbb {P}\) over abstract states, and the concretization function \([\![ \cdot ]\!]_\mathbb {P}\). The semilattice \(\mathcal {E}_\mathbb {P}= (E_\mathbb {P}, \sqsubseteq _\mathbb {P})\) consists of the set \(E_\mathbb {P}\) of abstract states and the partial order \(\sqsubseteq _\mathbb {P}\).
Abstract States
Because of the use of adjustable-block encoding [21], an abstract state \(e\in E_\mathbb {P}\) of the Predicate CPA is a triple \((\psi ,{{l^\psi }^{}_{\!\!}},\varphi )\) of an abstraction formula \(\psi \), the abstraction location \({{l^\psi }^{}_{\!\!}}\) (the program location where \(\psi \) was computed), and a path formula \(\varphi \). Both formulas are first-order formulas over predicates over the program variables from the set \(X\), and an abstract state represents all concrete states that satisfy their conjunction: \([\![ (\psi ,{{l^\psi }^{}_{\!\!}},\varphi ) ]\!]_\mathbb {P}= \{(c,\,\cdot ) \in C\mid c \models (\psi \wedge \varphi )\}\). The partial order \(\sqsubseteq _\mathbb {P}\) is defined as \((\psi _1,{{l^\psi }^{}_{\!\!1}},\varphi _1) \sqsubseteq _\mathbb {P}(\psi _2,{{l^\psi }^{}_{\!\!2}},\varphi _2) = \left( (\psi _1\wedge \varphi _1) \Rightarrow (\psi _2\wedge \varphi _2)\right) \), i.e., an abstract state is less than or equal to another state if the conjunction of the formulas of the first state implies the conjunction of the formulas of the other state. Abstract states where the path formula \(\varphi \) is \( true \) are called abstraction states, other abstract states are intermediate states. The transfer relation produces only intermediate states, and at the end of a block of program operations the operator \(\mathsf {prec}\) computes an abstraction state from an intermediate state. The initial abstract state is the abstraction state \(( true , l _ INIT , true )\).
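As an illustration, the abstract states and the partial order \(\sqsubseteq _\mathbb {P}\) can be sketched in a few lines of Python. This is a toy model, not the framework's implementation: formulas are represented as Python predicates over variable assignments, and brute-force enumeration over two boolean variables stands in for the SMT-based implication check; all names are our own.

```python
from dataclasses import dataclass
from itertools import product

VARS = ["x", "y"]  # tiny universe of boolean program variables

def all_assignments():
    for values in product([False, True], repeat=len(VARS)):
        yield dict(zip(VARS, values))

def implies(f, g):
    """Brute-force entailment check (a stand-in for an SMT query)."""
    return all(g(a) for a in all_assignments() if f(a))

@dataclass(frozen=True)
class PredState:
    psi: object      # abstraction formula, as a predicate on assignments
    abs_loc: int     # program location where psi was computed
    phi: object      # path formula

def leq(e1, e2):
    """Partial order: (psi1 and phi1) implies (psi2 and phi2)."""
    return implies(lambda a: e1.psi(a) and e1.phi(a),
                   lambda a: e2.psi(a) and e2.phi(a))
```

For example, the state with formulas \(x \wedge y\) is less than or equal to the state with formulas \(x\) at the same abstraction location, but not vice versa.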
The path formula of an abstract state is always represented syntactically as an SMT formula. The representation of the abstraction formula, however, can be configured. We can either use a binary-decision diagram (BDD) [31], as in classic predicate abstraction [15, 47], or an SMT formula similar to the path formula. Using BDDs allows performing cheap entailment checks between abstraction states at the cost of an increased effort for constructing the BDDs.
Precisions
A precision \(\pi \in {\varPi }_\mathbb {P}\) of the Predicate CPA is a mapping from program locations to sets of predicates over the program variables. This allows using a different abstraction level at each location in the program (lazy abstraction). The initial precision is typically the mapping \(\pi ( l ) = \emptyset \), for all \( l \in L \). The Predicate CPA does not use dynamic precision adjustment [19] during an execution of the CPA algorithm: instead the precision is adjusted only during a refinement step, if the predicate refinement strategy is used. The only operation that changes its behavior based on the precision is the predicate abstraction that may be computed at block ends by the operator \(\mathsf {prec}_\mathbb {P}\).
Transfer Relation
The transfer relation \((\psi ,{{l^\psi }^{}_{\!\!}},\varphi ) \rightsquigarrow ((\psi ,{{l^\psi }^{}_{\!\!}},\varphi '),\pi )\) for a CFA edge \((l_i, op _i, l_j)\) produces a successor state \((\psi ,{{l^\psi }^{}_{\!\!}},\varphi ')\) such that the abstraction formula and location stay unchanged and the path formula \(\varphi '\) is created by applying the strongest-postcondition operator for the current CFA edge to the previous path formula: \(\varphi ' = {\mathsf {SP}_{ op _i}({\varphi })}\). Note that this is an inexpensive, purely syntactical operation that does not involve any actual solving, and that it is a precise operation, i.e., it does not perform any form of abstraction.
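To make the purely syntactic nature of this operation concrete, the following Python sketch builds SSA-indexed path formulas as plain strings; the operation syntax (`x := expr` and `[cond]`) and the `@`-index notation are our own simplifications, not part of the framework.

```python
import re

def sp(op, phi, ssa):
    """Strongest postcondition as a purely syntactic formula rewrite.

    phi is a formula string over SSA-indexed variables, ssa maps each
    program variable to its current SSA index, and op is either an
    assignment 'x := expr' or an assume edge '[cond]'.
    """
    def indexed(expr):
        # Attach the current SSA index to every variable occurrence.
        return re.sub(r"\b([a-z]+)\b",
                      lambda m: f"{m.group(1)}@{ssa[m.group(1)]}", expr)
    if op.startswith("[") and op.endswith("]"):          # assume edge
        return f"{phi} & ({indexed(op[1:-1])})", dict(ssa)
    var, expr = (s.strip() for s in op.split(":="))      # assignment edge
    rhs = indexed(expr)                                  # old indices on RHS
    new = dict(ssa)
    new[var] = ssa[var] + 1                              # fresh index on LHS
    return f"{phi} & ({var}@{new[var]} = {rhs})", new
```

No solver is involved: applying `sp("x := x + 1", "true", {"x": 1, "y": 1})` simply appends the conjunct `(x@2 = x@1 + 1)` and bumps the SSA index of `x`.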
Merge Operator
The merge operator \(\mathsf {merge}_\mathbb {P}\) combines intermediate states that belong to the same block (i.e., their abstraction formulas and abstraction locations are the same) and keeps all other abstract states separate:
$$\begin{aligned}&\mathsf {merge}_\mathbb {P}\left( \left( \psi _1,{{l^\psi }^{}_{\!\!1}},\varphi _1\right) , \left( \psi _2,{{l^\psi }^{}_{\!\!2}},\varphi _2\right) , \pi \right) \\&\quad {=} {\left\{ \begin{array}{ll} \left( \psi _2, {{l^\psi }^{}_{\!\!2}}, \varphi _1 \vee \varphi _2\right) &{} \text {if}\quad (\psi _1 = \psi _2) \wedge \left( {{l^\psi }^{}_{\!\!1}} = {{l^\psi }^{}_{\!\!2}}\right) \\ \left( \psi _2,{{l^\psi }^{}_{\!\!2}},\varphi _2\right) &{} \text {otherwise} \end{array}\right. } \end{aligned}$$
This definition is common for analyses based on adjustable-block encoding (ABE) [21]. By merging abstract states inside each block, the number of abstract states in the ARG is kept small, and no precision is lost due to merging, because the path formula of an abstract state exactly represents the path(s) from the block start without abstraction. At the same time, the analysis avoids the loss of information that would occur, and would lead to a path-insensitive analysis, if states were merged across blocks. The result is that the ARG, if projected to contain only abstraction states, forms an abstract-reachability tree (ART) like in a path-sensitive analysis without ABE. This is necessary for being able to reconstruct abstract paths, for example during refinement and for reporting concrete error paths.
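The case distinction of \(\mathsf {merge}_\mathbb {P}\) can be sketched directly; abstract states are triples and formulas are plain strings here, since the operation itself is syntactic (names are illustrative):

```python
def merge_p(e1, e2, pi):
    """ABE merge: disjoin path formulas only inside the same block,
    i.e., when both abstraction formula and abstraction location agree;
    otherwise keep the existing state e2 unchanged."""
    (psi1, l1, phi1), (psi2, l2, phi2) = e1, e2
    if psi1 == psi2 and l1 == l2:
        return (psi2, l2, f"({phi1}) | ({phi2})")
    return e2
```

Two intermediate states of the same block are thus joined into one state whose path formula is the disjunction of both path formulas.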
Stop Operator
The stop operator \(\mathsf {stop}_\mathbb {P}\) checks coverage only for abstraction states and always returns \( false \) for intermediate states:
$$\begin{aligned}&\mathsf {stop}_\mathbb {P}((\psi ,{{l^\psi }^{}_{\!\!}},\varphi ), R, \pi ) \\&\quad {=} {\left\{ \begin{array}{ll} \exists (\psi ',{{l^\psi }^{}_{\!\!}}',\varphi ') \in R: \varphi '= true \wedge (\psi ,{{l^\psi }^{}_{\!\!}},\varphi ) \sqsubseteq _\mathbb {P}(\psi ',{{l^\psi }^{}_{\!\!}}',\varphi ') &{}\text {if}\quad \varphi = true \\ false &{} \text {otherwise} \end{array}\right. } \end{aligned}$$
Because the path formula of an abstraction state is always \( true \), the first case is equivalent to checking if there exists an abstraction state \((\psi ',\cdot , true )\) in the set R whose abstraction formula \(\psi '\) is implied by the abstraction formula \(\psi \) of the current abstraction state \((\psi , {{l^\psi }^{}_{\!\!}}, \varphi )\). If abstraction formulas are represented by BDDs, this is an efficient operation, otherwise a potentially costly SMT query is required. The coverage check for intermediate states is omitted for efficiency, because it would always need to involve (potentially many) SMT queries. Note that this implies that infinitely long sequences of intermediate states must be avoided, otherwise the analysis would not terminate.
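A sketch of \(\mathsf {stop}_\mathbb {P}\) with toy formulas: abstraction formulas are Python predicates checked by brute-force entailment (standing in for the BDD or SMT entailment check), and the path formula of an abstraction state is the literal string "true".

```python
from itertools import product

VARS = ["x", "y"]

def assignments():
    for vals in product([False, True], repeat=len(VARS)):
        yield dict(zip(VARS, vals))

def entails(f, g):
    """Brute-force entailment check (a stand-in for BDDs or SMT)."""
    return all(g(a) for a in assignments() if f(a))

def stop_p(e, reached, pi):
    """Coverage check: only abstraction states can be covered, and only
    by other abstraction states whose abstraction formula is implied."""
    psi, loc, phi = e
    if phi != "true":
        return False                  # intermediate states: never covered
    return any(phi2 == "true" and entails(psi, psi2)
               for (psi2, _loc2, phi2) in reached)
```

A state with the stronger abstraction formula \(x \wedge y\) is covered by a reached state with formula \(x\), but not the other way around, and intermediate states are always reported as uncovered.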
Precision-Adjustment Operator
The precision-adjustment operator \(\mathsf {prec}_\mathbb {P}\) either returns the input abstract state and precision, or converts an intermediate state into an abstraction state performing predicate abstraction. The decision is made by the block-adjustment operator \(\mathsf {blk}\) [21], which returns \( true \) or \( false \) depending on whether the current block ends at the current abstract state and thus an abstraction should be computed. The decision can be based on the current abstract state as well as on information about the current program location. We define the following common choices for \(\mathsf {blk}\): \(\mathsf {blk}^{{ lf}}\) returns \( true \) at loop heads, function calls/returns, and at the error location \({ l _ ERR }\), leading to a behavior similar to large-block encoding (LBE) [11]. \(\mathsf {blk}^{l}\) returns \( true \) only at loop heads and at the error location \({ l _ ERR }\). The abstraction at the error location is needed for detecting the reachability of abstract error states due to the satisfiability check that is implicitly done by the abstraction computation if the precision is not empty. \(\mathsf {blk}^ never \) always returns \( false \). This will prevent all abstractions and (due to how \(\mathsf {stop}_\mathbb {P}\) is defined) also prevents coverage between abstract states. This means that an analysis with \(\mathsf {blk}^ never \) will unroll the CFA endlessly until other reasons prevent this. We will show a meaningful application of \(\mathsf {blk}^ never \) in Sect. 4.1 (BMC).
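For concreteness, the three block-adjustment operators can be sketched as follows; the location sets are hypothetical stand-ins for information that would come from the CFA.

```python
# Hypothetical location sets for a small example program.
LOOP_HEADS = {4}
CALL_RETURN_LOCS = {2, 7}
ERR_LOC = 9

def blk_lf(e, l):
    """Block ends at loop heads, function calls/returns, and l_ERR (LBE-like)."""
    return l in LOOP_HEADS or l in CALL_RETURN_LOCS or l == ERR_LOC

def blk_l(e, l):
    """Block ends only at loop heads and at l_ERR."""
    return l in LOOP_HEADS or l == ERR_LOC

def blk_never(e, l):
    """Never end a block, i.e., never compute abstractions (e.g., for BMC)."""
    return False
```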
The boolean predicate abstraction [7] \(({\varphi })^{\rho }_{\mathbb {B}}\) of a formula \(\varphi \) for a set \(\rho \) of predicates is the strongest boolean combination of predicates from \(\rho \) that is implied by \(\varphi \). It can be computed using an SMT solver by solving \(\varphi \wedge \bigwedge _{p_i\in \rho } (v_{p_i} \Leftrightarrow p_i)\) and enumerating all its models with respect to the fresh boolean variables \(v_{p_1},\ldots ,v_{p_{|\rho |}}\). For each model we create a conjunction over the predicates from \(\rho \), with each predicate \(p_i\) being negated if the model maps the corresponding variable \(v_{p_i}\) to \( false \). The result of \(({\varphi })^{\rho }_{\mathbb {B}}\) is the disjunction of all these conjunctions. To create an abstraction state from an intermediate state \((\psi ,{{l^\psi }^{}_{\!\!}},\varphi )\) at program location \( l \) (which is tracked by another CPA that runs in parallel to the Predicate CPA as a sibling component within the same Composite CPA and from which the location can be retrieved), we compute the boolean predicate abstraction \(({\psi \wedge \varphi })^{\pi ( l )}_{\mathbb {B}}\) for the formula \(\psi \wedge \varphi \) and the set \(\pi ( l )\) of predicates from the precision, after adjusting the variable names of \(\psi \) to match those of \(\varphi \) (because the variables from \(\psi \) need to match the ’oldest’ variables in \(\varphi \)). Thus, we can define the precision-adjustment operator as
$$\begin{aligned} \mathsf {prec}_\mathbb {P}\left( \left( \psi ,{{l^\psi }^{}_{\!\!}},\varphi \right) ,\pi ,R\right) = {\left\{ \begin{array}{ll} \left( \left( ({\psi \wedge \varphi })^{\pi ( l )}_{\mathbb {B}}, l , true \right) , \pi \right) &{} \text {if}\quad \mathsf {blk}\left( \left( \psi ,{{l^\psi }^{}_{\!\!}},\varphi \right) , l\right) \\ \left( \left( \psi ,{{l^\psi }^{}_{\!\!}},\varphi \right) , \pi \right) &{}\text {otherwise } \end{array}\right. } \end{aligned}$$
Note that, if an abstraction is going to be computed, the current path formula \(\varphi \) precisely represents all the paths within this block (i.e., from the last abstraction state to the current abstract state). Thus, we name this path formula the block formula for the block ending in the current abstract state. If the precision is empty for the current program location, the outcome of the abstraction computation will always simply be \( true \) and no SMT queries are necessary. If the precision for the current program location is \(\{ false \}\), the abstraction computation will be equivalent to a simple satisfiability check, and the outcome will always be either \( true \) or \( false \).
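The model-enumeration scheme for boolean predicate abstraction can be illustrated with the following sketch; brute-force enumeration of assignments replaces the SMT solver's model enumeration, and the result is returned both as a predicate and as the set of satisfiable predicate valuations (one cube per enumerated model class). All names are our own.

```python
from itertools import product

VARS = ["x", "y"]

def assignments():
    for vals in product([False, True], repeat=len(VARS)):
        yield dict(zip(VARS, vals))

def bool_abstraction(phi, preds):
    """Strongest boolean combination of preds implied by phi.

    Enumerates the models of phi (brute force instead of an SMT
    solver) and collects the predicate valuation induced by each; the
    abstraction is the disjunction of the corresponding cubes.
    """
    cubes = set()
    for a in assignments():
        if phi(a):
            cubes.add(tuple(p(a) for p in preds))
    def psi(a):                      # abstraction, again as a predicate
        return tuple(p(a) for p in preds) in cubes
    return psi, cubes
```

Abstracting \(x \wedge y\) with the single predicate \(x\) yields the over-approximation \(x\) (it also admits states with \(\lnot y\)), and abstracting \(x \vee y\) with the same predicate yields \( true \), matching the remark that an empty or too-coarse precision produces trivial abstractions.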
Refinement
The refinement operator \(\mathsf {refine}_\mathbb {P}\) takes as input two sets \(\mathsf {reached}\subseteq E\times {\varPi }\) of reached abstract states and \(\mathsf {waitlist}\subseteq E\times {\varPi }\) of frontier abstract states and expects \(\mathsf {reached}\) to contain an abstract error state at error location \({ l _ ERR }\) that represents a specification violation. \(\mathsf {refine}_\mathbb {P}\) either returns the sets unchanged (if the abstract error state is reachable, i.e., there is a feasible error path), or modified such that they can be used for continuing the state-space exploration with an increased precision (if the error path is infeasible). The operator works in four steps.
Abstract-Counterexample Construction
The first step is to construct the set of abstract paths between the initial abstract state and the abstract error state. Traditionally, in an abstract reachability tree, there would exist exactly one such abstract path. Because we use ABE, however, intermediate states can be merged, and thus the abstract states form an abstract reachability graph, where several paths can exist from the initial abstract state to the abstract error state. All these abstract paths to the abstract error state contain the same sequence of abstraction states with varying sequences of intermediate states in between. This is due to the fact that abstraction states are never merged, and intermediate states are merged only locally within a block. Thus, the ARG, if projected to the abstraction states, still forms a tree. The initial abstract state is always an abstraction state by definition, and our choices of the block-adjustment operator \(\mathsf {blk}\) ensure that all abstract error states are also abstraction states. Thus, we define as abstract counterexample the sequence \({\langle e_0,\ldots ,e_n \rangle }\) that begins with the initial abstract state (\(e_0 = e_ INIT \)), ends with the abstract error state \(e_n\), and contains all abstraction states \(e_1,\ldots ,e_{n-1}\) on paths between these two abstract states. This sequence can be reconstructed from the ARG by following a single arbitrary abstract path backwards from the abstract error state (using the information tracked by the ARG CPA), without needing to explicitly enumerate all (potentially exponentially many) abstract paths between the initial abstract state and the abstract error state.
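The backward walk described above can be sketched as follows; the parent map mimics the information tracked by the ARG CPA, and all names are illustrative.

```python
def abstract_counterexample(parent, is_abstraction, e_err):
    """Follow one arbitrary ARG path backwards from the abstract error
    state and collect the abstraction states on it; by construction the
    same sequence of abstraction states lies on every path to e_err."""
    seq, e = [], e_err
    while e is not None:
        if is_abstraction(e):
            seq.append(e)
        e = parent.get(e)           # one arbitrary parent suffices
    return list(reversed(seq))
```

Because intermediate states are merged only within blocks, skipping them while walking backwards loses nothing: the reconstructed sequence of abstraction states is unique.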
Feasibility Check
From an abstract counterexample \({\langle e_0,\ldots ,e_n \rangle }\) we can create a sequence \({\langle \varphi _1,\ldots ,\varphi _n \rangle }\) of block formulas where each \(\varphi _i\) represents all paths between \(e_{i-1}\) and \(e_i\). Note that each \(\varphi _i\) is also exactly the same formula as the path formula that was used as input when computing the abstraction for state \(e_i\). Then we check whether there exists a feasible concrete path that is represented by one of the abstract paths of the abstract counterexample by checking the counterexample formula \(\bigwedge _{i=1}^n \varphi _i\) for satisfiability in a single SMT query. If satisfiable, the analysis has found a violation of the specification and terminates. Otherwise, i.e., if all abstract paths to the abstract error state are infeasible under the concrete program semantics, we say that the abstract counterexample is spurious, and a refinement of the abstract model is necessary to eliminate this infeasible error path from the ARG.
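A sketch of the feasibility check, with brute-force satisfiability over a tiny boolean universe standing in for the single SMT query on the counterexample formula:

```python
from itertools import product

VARS = ["x", "y"]

def assignments():
    for vals in product([False, True], repeat=len(VARS)):
        yield dict(zip(VARS, vals))

def is_feasible(block_formulas):
    """Satisfiability of the conjunction of all block formulas; if no
    assignment satisfies all of them, the counterexample is spurious."""
    return any(all(f(a) for f in block_formulas) for a in assignments())
```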
Interpolation
To refine the abstract model, \(\mathsf {refine}_\mathbb {P}\) uses Craig interpolation [38] to discover abstract facts that allow eliminating the infeasible error path. Given a sequence \({\widehat{\varphi }={\langle \varphi _1,\ldots ,\varphi _n \rangle }}\) of formulas whose conjunction is unsatisfiable, a sequence \({\langle \tau _0,\ldots ,\tau _n \rangle }\) is an inductive sequence of interpolants for \(\widehat{\varphi }\) if
1. \(\tau _0 = true \) and \(\tau _n = false \),
2. \(\forall i\in \{1,\ldots ,n\} : \tau _{i-1}\wedge \varphi _i \Rightarrow \tau _i\), and
3. for all \(i\in \{1,\ldots ,n-1\}\), \(\tau _i\) references only variables that occur both in \(\bigwedge _{j=1}^i \varphi _j\) and in \(\bigwedge _{j=i+1}^n \varphi _j\).
Note that every interpolation sequence starts with no assumption (\(\tau _0 = true \)) and ends with a contradiction (\(\tau _n = false \)), and that \(\tau _i\Rightarrow \lnot \bigwedge _{j=i+1}^n \varphi _j\) follows from the definition, for all \(i\in \{1,\ldots ,n\}\). For many common SMT theories, interpolants are guaranteed to exist and can be computed using off-the-shelf SMT solvers from a proof of unsatisfiability for \(\bigwedge _{i=1}^n \varphi _i\). Note that in general there exist many possible sequences of interpolants for a single infeasible error path.
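The three conditions can be turned into a small checker. This is a toy sketch, not how interpolants are computed: formulas are paired with their sets of variable names so that the vocabulary condition can be checked, and brute-force entailment over three boolean variables stands in for the SMT solver.

```python
from itertools import product

VARS = ["x", "y", "z"]

def assignments():
    for vals in product([False, True], repeat=len(VARS)):
        yield dict(zip(VARS, vals))

def entails(f, g):
    """Brute-force entailment check (a stand-in for an SMT query)."""
    return all(g(a) for a in assignments() if f(a))

def is_inductive_interpolant_seq(phis, taus):
    """Check conditions 1-3 for a candidate interpolant sequence.

    phis and taus are lists of pairs (formula, variable-name set);
    taus has one more element than phis.
    """
    ok = entails(lambda a: True, taus[0][0])              # tau_0 = true
    ok &= not any(taus[-1][0](a) for a in assignments())  # tau_n = false
    for i in range(1, len(taus)):                         # inductiveness
        ok &= entails(lambda a, i=i: taus[i - 1][0](a) and phis[i - 1][0](a),
                      taus[i][0])
    for i in range(1, len(taus) - 1):                     # vocabulary condition
        pre = set().union(*(v for _, v in phis[:i]))
        post = set().union(*(v for _, v in phis[i:]))
        ok &= taus[i][1] <= pre & post
    return bool(ok)
```

For the unsatisfiable sequence \(\langle x,\; x \Rightarrow y,\; \lnot y \rangle\), the sequence \(\langle true , x, y, false \rangle\) is a valid inductive sequence of interpolants, whereas \(\langle true , y, y, false \rangle\) is not (the first inductiveness step fails).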
Refinement Strategies
Lastly, \(\mathsf {refine}_\mathbb {P}\) needs to refine the precision of the analysis such that afterwards the analysis is guaranteed to not encounter the same error path again. A refinement strategy uses the current spurious abstract counterexample \({\langle e_0,\ldots ,e_n \rangle }\) and the corresponding sequence \({\langle \tau _0,\ldots ,\tau _n \rangle }\) of interpolants to modify the sets \(\mathsf {reached}\) and \(\mathsf {waitlist}\). For this step, two common approaches exist. Afterwards, the refinement is finished, the modified sets \(\mathsf {reached}\) and \(\mathsf {waitlist}\) are returned to the analysis, and the analysis continues with building the abstract model (which will now be more precise).
Impact Refinement. One refinement strategy is to perform a refinement similar to the function Refine of the Impact algorithm [61]. The Impact refinement strategy takes each abstraction state \(e_i\) of the abstract counterexample and conjoins the corresponding interpolant \(\tau _i\) to its abstraction formula \(\psi _i\). If an abstract state is actually strengthened by this (i.e., the previous abstraction formula did not already imply the interpolant), we also need to recheck all coverage relations of this abstract state. Figure 3a outlines such a situation: an abstract state \(e_i'\) previously covered by another abstract state \(e_i\) is no longer covered, because the abstraction formula of \(e_i\) was strengthened by the refinement. In this case, we uncover all leaf abstract states in the subgraph of the ARG that starts with the uncovered abstract state \(e_i'\) and re-add them to the set \(\mathsf {waitlist}\). We also check for each of the strengthened abstract states whether it is now covered by any other abstract state at the same program location. If this is successful, i.e., if a strengthened abstract state \(e_j\) is now covered by another abstract state \(e_j'\) as shown in Fig. 3b, we mark the subgraph that starts with that strengthened abstract state \(e_j\) as covered and remove all leaves therein from \(\mathsf {waitlist}\) (we do not need to expand covered abstract states). The only change to the set \(\mathsf {reached}\) is the removal of all abstract states whose abstraction formula is now equivalent to \( false \), together with their successors. Due to the properties of interpolants, this is guaranteed to be the case for at least the abstract error state.
Predicate Refinement. Another refinement strategy is used for traditional lazy predicate abstraction. It extracts the atoms of the interpolants as predicates, collects these predicates in a precision \(\pi \), and restarts (a part of) the analysis with the existing precision extended by \(\pi \).
The precision \(\pi \) is a mapping from program locations to sets of predicates, and we add predicates to the precision only for program locations where they are necessary. Assume that, starting from an abstract counterexample \({\langle e_0,\ldots ,e_n \rangle }\) with abstraction states at program locations \({\langle l _0,\ldots , l _n \rangle }\), we obtained a sequence \({\langle \tau _0,\ldots ,\tau _n \rangle }\) of interpolants and extracted a sequence \({\langle \rho _0,\ldots ,\rho _n \rangle }\) of sets of predicates. Then we add each predicate to the precision for the program location that corresponds to the point in the abstract counterexample where the predicate appears in the interpolant, i.e., \(\pi ( l ) = \bigcup _{i=0}^n (\rho _i\text { if } l = l _i\text { else }\emptyset )\). Note that due to the properties of interpolants, \(\pi ({ l _ ERR })\) will always be \(\{ false \}\). We take the precision \(\pi \) with the new predicates and the existing precision \(\pi _n\) that is associated in the set \(\mathsf {reached}\) with the abstract error state \(e_n\) and join them element-wise to create the new precision \(\pi '\) with \(\forall l \in L : \pi '( l ) = \pi _n( l ) \cup \pi ( l )\) that will be used in the subsequent analysis.
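The precision bookkeeping of this strategy is straightforward; a sketch with precisions as plain dictionaries from locations to predicate sets (predicates shown as strings, all names our own):

```python
def precision_from_interpolants(abs_locs, pred_sets):
    """pi(l) = union of rho_i over all i with l_i = l, i.e., predicates
    are added only at the locations where they are needed (lazy abstraction)."""
    pi = {}
    for l, rho in zip(abs_locs, pred_sets):
        pi.setdefault(l, set()).update(rho)
    return pi

def join_precisions(pi_a, pi_b):
    """Element-wise union: pi'(l) = pi_a(l) | pi_b(l)."""
    return {l: pi_a.get(l, set()) | pi_b.get(l, set())
            for l in set(pi_a) | set(pi_b)}
```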
Finally, the sets \(\mathsf {reached}\) and \(\mathsf {waitlist}\) are prepared for continuing with the analysis. We remove only those parts of the ARG for which the new predicates are necessary. For this, we determine the first abstract state of the abstract counterexample for which the new precision \(\pi '\) would lead to more predicates being used in the abstraction computation than the originally used predicates and call this the pivot abstract state. Then we remove the subgraph of the ARG that starts with the pivot abstract state from the sets \(\mathsf {reached}\) and \(\mathsf {waitlist}\), as well as all abstract states that were covered by one of the removed abstract states. To ensure that the removed parts of the ARG get re-explored, we take all remaining parents of removed abstract states, replace the precision with which they are associated in \(\mathsf {reached}\) with the new precision \(\pi '\), and add them to the set \(\mathsf {waitlist}\). This has not only the effect of avoiding the re-exploration of unchanged parts of the ARG, but also leads to the new predicates being used only in the relevant part of the ARG, with other parts of the program state space being explored with different (possibly more abstract and thus more efficient) precisions.
Forced Covering
Forced coverings were introduced for lazy abstraction with interpolants (Impact) [61] to make the analysis converge faster. Typically, when the CPA algorithm creates a new successor abstract state for an Impact analysis, this new abstract state is too abstract to be covered by existing abstract states, because the Impact refinement strategy makes every new abstraction state equivalent to \( true \). If an abstract state cannot be covered, the analysis needs to create further successors of it, leading to more abstract states and possibly more refinements. The idea of forced covering is to strengthen new abstract states such that they are immediately covered by existing abstract states, if possible.
We define an operator \(\mathsf {fcover}_\mathbb {P}: 2^{E\times {\varPi }} \times E\times {\varPi }\rightarrow 2^{E\times {\varPi }}\) that takes as input the set \(\mathsf {reached}\) of reachable abstract states and an abstract state \(e\) with precision \(\pi \) and returns an updated set \(\mathsf {reached}'\) of reachable abstract states. The operator may replace \(e\) and other abstract states in \(\mathsf {reached}\) with strengthened versions, if it can guarantee that this is sound and if afterwards the strengthened version of \(e\) is covered by another abstract state in \(\mathsf {reached}'\). A trivial implementation of this operator is \(\mathsf {fcover}^ id (\mathsf {reached}, e, \pi ) = \mathsf {reached}\), which does not strengthen abstract states and returns the set \(\mathsf {reached}\) unchanged.
An alternative implementation is \(\mathsf {fcover}^{\textsc {Impact}}\), which adopts the strategy for forced coverings presented for lazy abstraction with interpolants [61]. We extend this approach here to support adjustable-block encoding. Because the Predicate CPA does not attempt to cover intermediate states (only abstraction states), we also only attempt forced coverings for abstraction states. Figure 4 shows a sketch of the concept of forced covering in Impact to help visualize the following explanation: Given an abstraction state \(e\) that should be covered if possible, the candidate abstract states for covering are those abstraction states that belong to the same location, were created before \(e\), and are not covered themselves. For each candidate \(e'\), we first determine the nearest common ancestor abstraction state \(\hat{e}\) of \(e\) and \(e'\) (using the information tracked by the ARG CPA). Now let us denote the abstraction formulas of \(e'\) and \(\hat{e}\) with \(\psi '\) and \(\hat{\psi }\), respectively, and let \(\varphi \) be the path formula that represents the paths from \(\hat{e}\) to \(e\). We then determine whether \(\psi '\) also holds for \(e\) by checking if \(\hat{\psi } \wedge \varphi \implies \psi '\) holds, i.e., whether it is impossible to reach a concrete state that is not represented by \(\psi '\) when starting at \(\hat{e}\) and following the paths to \(e\). If this holds, we can strengthen the abstraction formula of \(e\) with \(\psi '\) (which immediately lets us cover \(e\) by \(e'\)). Furthermore, if there are abstraction states along the paths from \(\hat{e}\) to \(e\), we need to strengthen these states, too, in order to keep the ARG well-formed. We can do so by computing interpolants at the appropriate locations along the paths for the query that we have just solved and strengthen the abstract states with the interpolants. If the query does not hold, we switch to the next candidate abstract state and try again. 
Finally, \(\mathsf {fcover}^{\textsc {Impact}}\) returns an updated set \(\mathsf {reached}\) with strengthened abstract states, or the original set \(\mathsf {reached}\) if forced covering was unsuccessful for each of the candidates. Note that this forced-covering strategy is similar to interpolation-based refinement with the Impact refinement strategy, except that we attempt to prove that \(\psi '\) instead of \( false \) holds at the end of the path, and that the refined path does not start at the initial abstract state but at \(\hat{e}\).
An Extended CPA Algorithm
In order to be able to use all the features of the Predicate CPA and support approaches such as lazy abstraction, we also need to slightly extend the CPA algorithm. The extended version, which we call the CPA++ algorithm, is shown as Algorithm 2. Compared to the original version (Algorithm 1), it has the following differences:
1. CPA++ gets \(\mathsf {reached}\) and \(\mathsf {waitlist}\) as input and returns updated versions of both of them, instead of getting an initial abstract state and returning a set of reachable abstract states.
2. CPA++ calls a function \(\mathsf {abort}\) to determine whether it should abort early for each found abstract state (lines 16–17).
3. CPA++ calls the precision-adjustment operator immediately for each new abstract state (line 7) instead of only before expanding an abstract state.
4. CPA++ attempts a forced covering by calling \(\mathsf {fcover}\) before expanding an abstract state (lines 3–5).
The first two changes allow calling CPA++ iteratively and keep expanding the same set of abstract states, which is necessary for CEGAR with lazy abstraction (where we want to abort as soon as we find an abstract error state and continue after refinement without restarting from scratch; \(\mathsf {abort}(e)\) is typically implemented to return \( true \) if \(e\) is an abstract state at error location \({ l _ ERR }\)). The new position of the call to the precision-adjustment operator is necessary because previously the resulting abstract states (\(\widehat{e}\) in Algorithm 1) were never put into \(\mathsf {reached}\). However, we need the abstract states resulting from \(\mathsf {prec}\) to be in \(\mathsf {reached}\), because among them are the abstraction states of the Predicate CPA, which are necessary for refinement.
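Under simplifying assumptions (precisions elided, all operators passed as plain functions, \(\mathsf {fcover}\) reporting whether the state became covered), the CPA++ loop can be sketched as follows; this is our own condensation, not a faithful reproduction of Algorithm 2.

```python
def cpa_pp(reached, waitlist, transfer, merge, stop, prec, fcover, abort):
    """Extended CPA loop: fcover before expanding, prec applied to every
    new successor before it enters reached, and an early-abort hook."""
    while waitlist:
        e = waitlist.pop()
        reached, covered = fcover(reached, e)
        if covered:
            continue                                   # forced covering succeeded
        for e2 in transfer(e):
            e2 = prec(e2, reached)                     # adjust precision immediately
            for e_old in list(reached):                # try to merge with old states
                e_new = merge(e2, e_old)
                if e_new != e_old:
                    reached.discard(e_old); waitlist.discard(e_old)
                    reached.add(e_new); waitlist.add(e_new)
            if not stop(e2, reached):
                reached.add(e2); waitlist.add(e2)
                if abort(e2):                          # e.g., abstract error state
                    return reached, waitlist
    return reached, waitlist

# Tiny demo: plain reachability in a graph, aborting at 'error location' 3.
graph = {0: [1, 2], 1: [3], 2: [3], 3: []}
reached, waitlist = cpa_pp(
    {0}, {0},
    transfer=lambda e: graph[e],
    merge=lambda e_new, e_old: e_old,      # merge-sep: never merge
    stop=lambda e, r: e in r,
    prec=lambda e, r: e,
    fcover=lambda r, e: (r, False),        # trivial fcover: no strengthening
    abort=lambda e: e == 3)
```

Because the loop returns both \(\mathsf {reached}\) and \(\mathsf {waitlist}\), a caller can refine and then resume exactly where the exploration stopped, which is the point of changes 1 and 2.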
Similar changes to the CPA algorithm have been used previously [22, 25]; we now combine them in order to provide an all-encompassing algorithm for reachability that we can use as building block for our unifying framework for predicate-based software verification.