Abstract
Many applications of computer-aided verification require bit-precise reasoning as provided by satisfiability modulo theories (SMT) solvers for the theory of quantifier-free fixed-size bit-vectors. The current state of the art in solving bit-vector formulas in SMT relies on bit-blasting, where a given formula is eagerly translated into propositional logic (SAT) and handed to an underlying SAT solver. Bit-blasting is efficient in practice, but may not scale if the input size can not be reduced sufficiently during preprocessing. A recent score-based local search approach lifts stochastic local search from the bit-level (SAT) to the word-level (SMT) without bit-blasting and proved to be quite effective on hard satisfiable instances, particularly in the context of symbolic execution. However, it still relies on brute-force randomization and restarts to achieve completeness. Guided by a completeness proof, we simplified, extended and formalized our propagation-based variant of this approach. We obtained a clean, simple and more precise algorithm that does not rely on score-based local search techniques and does not require brute-force randomization or restarts to achieve completeness. It further yields a substantial gain in performance. In this article, we present and discuss our complete propagation-based local search approach for bit-vector logics in SMT in detail. We further provide an extended and extensive experimental evaluation, including an analysis of randomization effects.
1 Introduction
A majority of applications in the field of hardware and software verification requires bit-precise reasoning as provided by satisfiability modulo theories (SMT) solvers for the quantifier-free theory of fixed-size bit-vectors. In many of these applications, e.g., (constrained random) test case generation [33, 38, 40] or white-box fuzz testing [21], a majority of the problems is satisfiable. For this kind of problem, local search procedures are useful even though they do not allow us to determine unsatisfiability. Previous work [18, 36] showed that local search approaches for bit-vector logics in SMT are orthogonal to other approaches, which suggests that they are particularly beneficial within a portfolio setting [36].
Current state-of-the-art SMT solvers for the quantifier-free theory of fixed-size bit-vectors [4, 13, 14, 16, 34] employ the so-called bit-blasting approach (e.g., [30]), where an input formula is eagerly translated to propositional logic (SAT) and handed to an underlying SAT solver. While efficient in practice, bit-blasting approaches heavily rely on rewriting and other techniques [9,10,11,12, 17, 19, 20, 24, 25] to simplify the input during preprocessing and may not scale if the input size can not be reduced sufficiently. In [18], Fröhlich et al. proposed to attack the problem from a different angle and presented a score-based local search approach for bit-vector logics, which lifts stochastic local search (SLS) from the bit-level (SAT) to the word-level (SMT) without bit-blasting. In recent years, and in particular since the SAT Challenge 2012 [2], a new generation of SLS SAT solvers with very simple architecture [3] achieved remarkable results not only in the random but also in the combinatorial tracks of recent SAT competitions [1, 2, 7]. Previous attempts to utilize SLS techniques in SMT by integrating an SLS SAT solver into the DPLL(T) framework of the SMT solver MathSAT [13] were not able to compete with bit-blasting [23]. In contrast, the word-level local search approach in [18] showed promising initial results. However, [18] does not fully exploit the word-level structure but rather simulates bit-level local search by focusing on single bit flips.
Hence, in [36], we proposed a propagation-based extension of [18], which introduced an additional strategy to propagate assignments from the outputs to the inputs. This significantly improved performance. Our results further suggested that these techniques may be beneficial in a sequential portfolio setting [39] in combination with bit-blasting. However, down-propagating assignments as presented in [36] utilizes inverse value computation only, which can get stuck if no inverse value can be found. In that case, [36] still falls back on score-based local search techniques and requires brute-force randomization and restarts to achieve completeness, as does [18]. Further, inverse value computation as presented in [36] is too restrictive for some operators, and focusing only on inverse values when down-propagating assignments is incomplete and may inadvertently prune the search.
In this paper, guided by a formal completeness proof, we present a simple, precise and complete local search variant of the procedure proposed in [36]. Our approach does not use score-based local search techniques as described in [18] but relies on propagation of assignments only. It further does not require brute-force randomization or restarts to achieve completeness. To determine propagation paths, we extend the concept of controlling inputs to the word-level, which allows further pruning of the search. To propagate assignments down, we lift the concept of “backtracing” of Automatic Test Pattern Generation (ATPG) [32], which goes back to the PODEM algorithm [22], to the word-level. We further provide a formalization of backtracing for the bit-level and the word-level. Note that in contrast to backtracing in ATPG, our algorithm works with complete assignments. Existing algorithms for word-level ATPG [27, 28] are based on branch-and-bound, use neither backtracing nor complete assignments, and in general lack formal treatment.
We implemented our techniques in our SMT solver Boolector [34] and show that combining our propagation-based approach with bit-blasting within a sequential portfolio setting is beneficial in terms of performance. We provide an extensive experimental evaluation, including an analysis of randomization effects as a result of different seeds for the random number generator, in particular in comparison to the score-based local search approach in [18] as implemented in Boolector. Our results show that our techniques yield a substantial gain in performance.
This article extends and revises work presented earlier in [35]. We provide a more detailed description of the propagation-based local search approach introduced in [35], including extensive examples illustrating the core concepts of our approach. We further include a complete set of rules for determining assignments during backtracing. Our previous experimental evaluation of a sequential portfolio combination of our propagation-based technique with bit-blasting was a virtual experiment. For this paper, we implemented such a sequential portfolio combination within Boolector and provide an extensive experimental evaluation of our techniques. This evaluation includes an in-depth analysis of the performance of our propagation-based local search approach compared to the score-based local search approach presented in [18], and an evaluation of randomization effects of both techniques, which were not included in previous work.
2 Overview
Our propagation-based local search procedure is based on propagating target assignments from the outputs to the inputs and does not need to rely on restarts or brute-force randomization to achieve completeness. Local search procedures are in general incomplete in the sense that they do not allow us to determine unsatisfiability. Hence, in the following, we restrict our notion of completeness to satisfiable input problems and use it synonymously with the more established property of probabilistically approximately complete (PAC) [26], which is commonly used in the AI community to discuss completeness properties of local search algorithms. It follows the traditional notion of nondeterministic computation of Turing machines, which entails that we treat probabilistic choices as nondeterministic choices [26].
The basic idea of our approach is illustrated in Fig. 1 and described more precisely in pseudo code in Fig. 2. It is applied to propositional formulas (the bit-level) and quantifier-free bit-vector formulas (the word-level) as follows.
Given a formula \(\phi \), we assume without loss of generality that \(\phi \) is a directed acyclic graph (DAG) with a single root r (the so-called root constraint or output of \(\phi \)). We use the letter \(\sigma \) to refer to complete but non-satisfying assignments to all inputs and operators in \(\phi \). We further identify complete satisfying assignments with the letter \(\omega \). Starting from a random but non-satisfying initial assignment \(\sigma _1\) with \(\sigma _1(r) = 0\), our goal is to reach a satisfying assignment \(\omega \) with \(\omega (r)=1\) by iteratively changing the values of primary inputs. We identify \(\omega (r) = 1\) as the target value of output r (line 3), denoted as \(0 \rightsquigarrow 1\) in Fig. 1, and propagate this value along a path towards the primary inputs (lines 4–7). We also refer to this process as “backtracing” [22]. Recursively propagating target value \({\omega (r) = 1}\) from the output to the primary inputs yields a new value \({x_i \ne \sigma _i(v_i)}\) for an input \(v_i\) (e.g., \(x_1\) for \(v_1\) in Fig. 1). By updating assignment \(\sigma _{i}\) on input \(v_i\) to \(\sigma _{i+1}(v_i) = x_i\) (e.g., \(\sigma _2(v_1) = x_1\) in Fig. 1) without changing the values of other primary inputs but recomputing consistent values for inner nodes (lines 8–9), we move from \(\sigma _i\) to \(\sigma _{i+1}\) and repeat this process until we reach a satisfying assignment, i.e., \(\sigma _{i+1} = \omega \).
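For the bit-level case, the loop sketched above can be written down compactly as follows. This is only an illustrative sketch: the AIG encoding, helper names, and the uniform random choice among admissible inputs are assumptions of this sketch, not Boolector's implementation.

```python
import random

# Nodes of a single-rooted AIG: ('var', name), ('not', n), ('and', n, m).
# sigma assigns Boolean values to the primary inputs only; gate values
# are recomputed consistently on demand.

def value(node, sigma):
    """Consistent value of a node under the current input assignment."""
    if node[0] == 'var':
        return sigma[node[1]]
    if node[0] == 'not':
        return 1 - value(node[1], sigma)
    return value(node[1], sigma) & value(node[2], sigma)

def backtrace(node, sigma):
    """Propagate the target value from `node` down to a primary input,
    preferring controlling inputs; the bit-level backtracing value
    always flips the selected input."""
    while node[0] != 'var':
        if node[0] == 'not':
            node = node[1]
        else:
            n, m = node[1], node[2]
            # an and-gate with value 0 has its 0-inputs as controlling inputs
            cands = ([c for c in (n, m) if value(c, sigma) == 0]
                     if value(node, sigma) == 0 else [n, m])
            node = random.choice(cands)
    return node[1], 1 - sigma[node[1]]

def solve(root, sigma, max_moves=1000):
    """Iterate moves until the root is satisfied (input assumed satisfiable)."""
    for _ in range(max_moves):
        if value(root, sigma) == 1:
            return sigma
        v, x = backtrace(root, sigma)
        sigma[v] = x  # one move: update a single primary input
    return None
```

For instance, for \(\phi = a \wedge \lnot b\) starting from the non-satisfying assignment \(\{a \mapsto 0, b \mapsto 1\}\), two moves suffice to reach the satisfying assignment \(\{a \mapsto 1, b \mapsto 0\}\).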
When down-propagating assignments, we identify path selection (line 5) and selecting the value to propagate (line 6) as the only sources of nondeterminism. However, we aim to maximally reduce nondeterministic choices without sacrificing completeness. Hence, on the bit-level, path selection prioritizes controlling inputs w.r.t. the current assignment, a well-known concept from ATPG, while value selection for a selected input is uniquely defined. On the word-level, we introduce the corresponding new notion of essential inputs, which lifts the bit-level concept of controlling inputs to the word-level, and restrict value selection to the computation of what we refer to as consistent and inverse values.
As expected for local search, our propagation-based approach is not able to determine unsatisfiability. Thus, the algorithm in Fig. 2 does not terminate if a given input formula is unsatisfiable. When determining satisfiability, however, our propagation-based local search approach is complete (PAC), i.e., there exists a nondeterministic choice of moves that leads to a solution.
In the following, we first introduce and formalize our propagation-based approach on the bit-level and prove its completeness. We then lift it to the word-level and prove its completeness on the word-level. We further analyze randomization effects as a result of using different seeds for the random number generator and show that our techniques yield substantial performance improvements, in particular in combination with bit-blasting within a sequential portfolio setting.
3 Bit-level
For the sake of simplicity and without loss of generality we consider a fixed Boolean formula \(\phi \) and restrict the set of Boolean operators to \(\{\wedge ,\lnot \}\). We interpret \(\phi \) as a single-rooted And-Inverter Graph (AIG) [31], where an AIG is a DAG represented as a 5-tuple (r, N, G, V, E).
The set of nodes \(N = G \,\cup \, V\) contains the single root node \(r \in N\), and is further partitioned into a set of gates G and a set of primary inputs (or variables) V. We require that the set of variables is nonempty, i.e., \({V \ne \emptyset }\), and assume that the Boolean constants \(\mathbb {B}= \{0,1\}\), i.e., \(\{ false , true \}\), do not occur in N. This assumption is without loss of generality since every occurrence of true and false as input to a gate \(g \in G\) can be eliminated through rewriting.
The set of gates \(G = A \, \cup \, I\) consists of a set of and-gates A and a set of inverter-gates I. We write \(g = n \wedge m\) if \(g \in A\), and \(g = \lnot n\) if \(g \in I\). We further refer to the children of a node \(g \in G\) as its (gate) inputs (e.g., n or m). Let \(E = E_A \cup E_I\) be the edge relation between nodes, with \({E_A:A \rightarrow N^2}\) and \(E_I:I \rightarrow N\) describing edges from and- resp. inverter-gates to their input(s). We write \(E(g) = (n, m)\) for \(g = n \wedge m\) and \(E(g) = n\) for \(g = \lnot n\), and further introduce the notation \(g \rightarrow n\) for an edge between a gate g and one of its inputs n.
We define a complete assignment \(\sigma \) of the given fixed formula \(\phi \) as a complete function \({\sigma :N \rightarrow \mathbb {B}}\). Similarly, a partial assignment \(\alpha \) of formula \(\phi \) is defined as a partial function \({\alpha :N \rightarrow \mathbb {B}}\). We say that a complete assignment \(\sigma \) is consistent on a gate \(g \in I\) with \(g = \lnot n\) iff \(\sigma (g) = \lnot \sigma (n)\), and consistent on a gate \(g \in A\) with \({g = n \wedge m}\) iff \(\sigma (g) = \sigma (n) \wedge \sigma (m)\).
A complete assignment \(\sigma \) is globally consistent on \(\phi \) (or just consistent) iff it is consistent on all gates \(g \in G\). An assignment \(\sigma \) is satisfying if it is consistent (thus complete) and satisfies the root, i.e., \({\sigma (r) = 1}\). We use the letter \(\omega \) to denote a satisfying assignment. A formula \(\phi \) is satisfiable if it has a satisfying assignment. We use \(\mathcal {C}\) to denote the set of consistent assignments, and \(\mathcal {W}\) with \({{\mathcal {W}} \subseteq {\mathcal {C}}}\) to denote the set of satisfying assignments of formula \(\phi \).
Given two consistent assignments \(\sigma \) and \(\sigma '\), we say that \(\sigma '\) is obtained from \(\sigma \) by flipping the (assignment of a) variable \(v \in V\), written as \({\sigma \xrightarrow {v} \sigma '}\), iff \(\sigma (v) = \lnot \sigma '(v)\) and \(\sigma (u) = \sigma '(u)\) for all \(u \in V\backslash \{v\}\). We also refer to flipping a variable as a move. Note that \(\sigma '(g)\) for gates \(g \in G\) is defined implicitly due to consistency of assignment \(\sigma '\) after fixing the values for the primary inputs V.
Given a set of variables V that can be flipped nondeterministically, let \(S:{\mathcal {C}} \rightarrow \mathbb {P}({\mathcal {M}})\) be a (local search) strategy that maps a consistent assignment to a set of possible moves \({\mathcal {M}} = V\). Note that in general, there exist different notions of strategy, e.g., as in the context of game theory or synthesis. In the context of local search, using the term “strategy” as defined above is well established, e.g., [26, 37]. Further note that since a move corresponds to flipping a variable, the set of possible moves \(\mathcal {M}\) corresponds to the set of variables V and is redundant on the bit-level. However, we use the same notation on the word-level, where \(\mathcal {M}\) captures the set of moves valid under a strategy as a set of input/value pairs, since a word-level move additionally requires identifying the new value of an input.
A move \(v \in V\) is valid under strategy S for assignment \(\sigma \in {\mathcal {C}}\) if \(v \in S(\sigma )\). Similarly, a sequence of moves \(\mu = {(v_1, \ldots , v_k)} \in V^*\) of length \({k = |\mu |}\) with \({v_1,\ldots ,v_k \in V}\) is valid under strategy S iff there exists a sequence of consistent assignments \({(\sigma _1, \ldots , \sigma _{k+1}) \in \mathcal {C}^*}\) such that \({\sigma _i \xrightarrow {v_i} \sigma _{i+1}}\) and \(v_i \in S(\sigma _i)\) for \({1 \le i \le k}\). In this case, assignment \(\sigma _{k+1}\) can be reached from assignment \(\sigma _1\) under strategy S (with k moves), also written as \(\sigma _1 \rightarrow ^* \sigma _{k+1}\).
Definition 1
(Complete Strategy) If formula \(\phi \) is satisfiable, then a strategy S is called complete iff for all consistent assignments \({\sigma \in \mathcal C}\) there exists a satisfying assignment \({\omega \in \mathcal W}\) such that \(\omega \) can be reached from \(\sigma \) under S, i.e., \(\sigma \rightarrow ^* \omega \).
Given an assignment \({\sigma \in \mathcal C}\) and a satisfying assignment \({\omega \in \mathcal W}\), let \(\Delta (\sigma ,\omega ) = \{ {v \in V} \mid {\sigma (v) \ne \omega (v)} \}\) be the set of variables with different values in \(\sigma \) and \(\omega \). Thus, \(\text {HD}(\sigma ,\omega ) = |\Delta (\sigma ,\omega )|\) is the Hamming Distance between \(\sigma \) and \(\omega \) on V.
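Both notions are straightforward to compute. The following sketch (with hypothetical dictionary-based assignments over the primary inputs V) also illustrates the observation used below: flipping a variable in \(\Delta (\sigma ,\omega )\) reduces the Hamming Distance by exactly one.

```python
def delta(sigma, omega, variables):
    """Delta(sigma, omega): variables assigned differently in sigma and omega."""
    return {v for v in variables if sigma[v] != omega[v]}

def hamming_distance(sigma, omega, variables):
    """HD(sigma, omega) = |Delta(sigma, omega)| on the primary inputs V."""
    return len(delta(sigma, omega, variables))
```

For example, with \(\sigma = \{a \mapsto 0, b \mapsto 1, c \mapsto 0\}\) and \(\omega = \{a \mapsto 1, b \mapsto 1, c \mapsto 1\}\) we have \(\Delta (\sigma ,\omega ) = \{a, c\}\) and \(\text {HD}(\sigma ,\omega ) = 2\); flipping a yields distance 1.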
Definition 2
(Distance-Reducing Strategy) A strategy S is (nondeterministically) distance reducing, if for all assignments \({\sigma \in \mathcal{C} \backslash \mathcal{W}}\) there exists a satisfying assignment \({\omega \in \mathcal W}\) and a move \(\sigma \xrightarrow {v} \sigma '\) valid under S which reduces the Hamming Distance. That is, move \(v \in \mathcal M\) is in \(\Delta (\sigma ,\omega )\), thus \(\text {HD}(\sigma ,\omega ) - \text {HD}(\sigma '\!,\omega ) = 1\).
Obviously, any distance-reducing strategy can reach a satisfying assignment (though not necessarily \(\omega \)) within at most \(\text {HD}(\sigma ,\omega )\) moves. This first observation is the key argument in the completeness proofs for our propagation-based strategies (both on the bit-level and word-level).
Proposition 3
A distance-reducing strategy is also complete.
In the following, our ultimate goal is to define a strategy that maximally reduces nondeterministic choices without sacrificing completeness. In the algorithm shown in Fig. 2, path selection (selecting the backtracing node in line 5) and value selection (selecting the backtracing value in line 6) while down-propagating assignments constitute the only sources of nondeterminism. As we will show later, in contrast to value selection on the word-level, selecting a backtracing value on the bit-level is uniquely defined. When selecting a backtracing node on the bit-level, nondeterminism can be reduced by utilizing the notion of controlling inputs from ATPG [32], which is defined as follows.
Definition 4
(Controlling Input) Let node \({n \in N}\) be an input of a gate \({g \in G}\), i.e., \(g \rightarrow n\), and let \(\sigma \) be a complete assignment consistent on g. We say that input n is controlling under \(\sigma \) if for all complete assignments \(\sigma '\) consistent on g with \(\sigma (n) = \sigma '(n)\) we have \(\sigma (g) = \sigma '(g)\).
In other words, gate input n is controlling if the assignment of g (i.e., its output value) remains unchanged as long as the assignment of n does not change.
Given an assignment \(\sigma \) consistent on \(g \in G\), we denote a target value t for gate g as \(\sigma (g) \rightsquigarrow t\). On the bit-level, target value t is implicitly given through assignment \(\sigma \) as \(t = \lnot \sigma (g)\), i.e., t can not be reached as long as the controlling inputs of g remain unchanged. On the word-level, t may be any value \(t \ne \sigma (g)\).
Example 5
Figure 3 shows all possible assignments \(\sigma \) consistent on a gate \(g \in G\). At the outputs we denote the current assignment \(\sigma (g)\) and target value t as \(\sigma (g) \rightsquigarrow t\) with \(t = \lnot \sigma (g)\), e.g., \(0 \rightsquigarrow 1\). At the inputs we show their assignment under \(\sigma \). All controlling inputs are indicated with an underline. Note that and-gate \(g = n \wedge m\) has no controlling inputs for \(\sigma (g) = 1\).
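Definition 4 can be checked exhaustively for the binary and-gate by varying the respective other input. The following illustrative sketch reproduces the underlined cases of Fig. 3 (input names n, m are those of the text; the encoding is an assumption of this sketch).

```python
def controlling_inputs(an, am):
    """Controlling inputs of g = n & m under sigma(n) = an, sigma(m) = am,
    checked directly against Definition 4 by varying the other input."""
    g = an & am
    ctrl = []
    if all((an & m2) == g for m2 in (0, 1)):  # g fixed for every value of m
        ctrl.append('n')
    if all((n2 & am) == g for n2 in (0, 1)):  # g fixed for every value of n
        ctrl.append('m')
    return ctrl
```

In particular, for \(\sigma (g) = 1\) (both inputs assigned 1) the function returns the empty list, matching the observation above that the and-gate then has no controlling input.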
We define a sequence of nodes \({\pi = (n_1, \ldots , n_k) \in N^+}\) as a path of length k with \(k = |\pi |\) iff \(n_i \rightarrow n_{i+1}\) for \({0< i < k}\), also written as \({n_1 \rightarrow \ldots \rightarrow n_k}\). A path \(\pi \) is rooted if \({n_1 = r}\), and (fully) expanded if \({n_k \in V}\). We refer to \(n_k \in V\) as the leaf of \(\pi \) in this case. As a restriction on \(\phi \), we require all nodes \(n \in N\) to be reachable from the root r, i.e., there exists a path \(\pi \) such that \({\pi = (r, \ldots , n)}\). We further require all paths to be acyclic, i.e., for all \(n\in N\) there exists no path \({n \rightarrow ^+ n}\). Note that as a consequence of representing \(\phi \) as a DAG, this is the case for any path in \(\phi \). Given a path \({\pi = (\ldots , g)}\) with gate \({g \in G}\) and \({g \rightarrow n}\), we say that \(\pi .n = (\ldots ,g,n)\) is an extension of path \(\pi \) with node n.
Definition 6
(Path Selection) Let \(\sigma \in \mathcal C\) be a complete assignment and \({\pi = (\ldots , g)}\) a path as above. Gate input \(n\) can be selected w.r.t. assignment \(\sigma \) to extend path \(\pi \) to \(\pi .n\) if input n is controlling or if gate g has no controlling input.
Path selection based on the notion of controlling inputs as introduced above exploits observability don’t cares as defined in the context of ATPG [32]. Similarly, we adopt the ATPG idea of backtracing [22, 32] as follows.
Definition 7
(Backtracing Step) Given a complete assignment \(\sigma \in \mathcal C\) and a gate \({g \in G}\) with \(g \rightarrow n\). A backtracing step w.r.t. assignment \(\sigma \) selects gate input n w.r.t. assignment \(\sigma \) as in Definition 6 and determines a backtracing value x for input n as follows. If \(g = \lnot n\), then \(x = \sigma (g)\). Else, if \(g = n \wedge m\), then \(x = \lnot \sigma (g)\).
As an important observation it turns out that performing a bit-level backtracing step always flips the value of the selected input under \(\sigma \). For a selected input, the backtracing value is therefore always unique. This can be checked easily by considering all possible scenarios shown in Fig. 3.
Proposition 8
A backtracing step yields as backtracing value \(x = \lnot \sigma (n)\).
Example 9
Consider \({g = n \wedge m}\) and the assignment \(\sigma = \{g \mapsto 0, n \mapsto 0,\) \({m \mapsto 1\}}\) consistent on g as depicted in Fig. 3. Assume that \({t = \lnot \sigma (g) = 1}\) is the target value of g, i.e., \(\sigma (g) \rightsquigarrow t\) with \(0 \rightsquigarrow 1\). We select n as the single controlling input of g (underlined), which yields backtracing value \({x = \lnot \sigma (n) = 1}\).
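A single backtracing step (Definitions 6 and 7) can be sketched as follows; the gate encoding and helper names are illustrative assumptions. The function returns all admissible (input, backtracing value) pairs to expose the remaining nondeterminism of path selection.

```python
def backtracing_step(gate, sigma):
    """All admissible (input, backtracing value) pairs for one bit-level
    backtracing step w.r.t. assignment sigma (consistent on the gate)."""
    if gate[0] == 'not':
        n = gate[1]
        return [(n, sigma[gate])]  # Definition 7: x = sigma(g)
    _, n, m = gate
    g = sigma[gate]
    # Definition 6: select a controlling input if one exists; an and-gate
    # with sigma(g) = 0 has exactly its 0-inputs as controlling inputs.
    selectable = [c for c in (n, m) if sigma[c] == 0] if g == 0 else [n, m]
    return [(c, 1 - g) for c in selectable]  # Definition 7: x = ~sigma(g)
```

On the assignment of Example 9 this yields the single pair (n, 1), and every returned value indeed flips the selected input, as stated by Proposition 8.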
A trace \({\tau = (\pi , \alpha )}\) is a rooted path \(\pi = (n_1, \ldots , n_k)\) labelled with a partial assignment \(\alpha \), where \(\alpha \) is defined exactly on \(\{n_1, \ldots , n_k\}\). A trace \((\pi , \alpha )\) is (fully) expanded, if \(\pi \) is a fully expanded path, i.e., node \(n_k \in V\) is the leaf of \(\pi \) and \(\tau \).
Let \(\sigma \in \mathcal{C} \backslash \mathcal{W}\) be a complete consistent but non-satisfying assignment. Then the notion of extension is lifted from paths to traces as follows. Given a trace \(\tau = (\pi ,\alpha )\) with \(\pi = (\ldots , g)\), \(g \in G\) and \(g \rightarrow n\), a backtracing (or propagation) step w.r.t. \(\sigma \) and target value \(t = \lnot \sigma (g) = \alpha (g)\) yields backtracing value \(x = \lnot \sigma (n) = \alpha '(n)\) and extends trace \(\tau \) to \(\tau ' = (\pi ',\alpha ')\) (also denoted as \(\tau \rightarrow \tau '\)), where path \(\pi ' = \pi .n\) is an extension of \(\pi \) and \(\alpha '(m) = \alpha (m)\) for all nodes m in \(\pi \).
We define the root trace \(\rho = {((r), \{ r \mapsto 1 \})}\) as a trace that maps root r to its target value \({\omega (r) = 1}\). A propagation trace w.r.t. assignment \(\sigma \) is a (possibly partial) trace \(\tau \) that starts from the root trace \(\rho \) and is extended by propagation steps w.r.t. assignment \(\sigma \), denoted as \(\rho \rightarrow ^* \tau \).
Note that given a path \(\pi \) and \(\sigma \), partial assignment \(\alpha \) is redundant on the bit-level. However, we use the same notation on the word-level, where \(\alpha \) captures updates to \(\sigma \) along \(\pi \), which (in contrast to the bit-level) are not uniquely defined.
Definition 10
(Propagation Strategy) Given a non-satisfying but consistent assignment \(\sigma \in \mathcal{C} \backslash \mathcal{W}\!\), the set of valid moves \(\mathcal{P}(\sigma )\) for \(\sigma \) under propagation strategy \(\mathcal P\) contains exactly the leaves of all expanded propagation traces w.r.t. \(\sigma \).
In the following, we present and prove the main lemma of this paper, which then immediately gives completeness of strategy \(\mathcal P\) in Theorem 12. It is reused for proving completeness of the extension of \(\mathcal P\) to the word-level in Theorem 25.
Lemma 11
(Propagation Lemma) Given a non-satisfying but consistent assignment \(\sigma \in \mathcal{C} \backslash \mathcal{W}\), then for any satisfying assignment \(\omega \in \mathcal W\), used as an oracle, there exists a fully expanded propagation trace \(\tau \) w.r.t. \(\sigma \) with leaf \(v \in \Delta (\sigma ,\omega )\).
Proof
The basic idea of our completeness proof is to inductively extend the root trace \(\rho \) to traces \(\tau = (\pi ,\alpha )\), i.e., \(\rho \rightarrow ^*\tau \), through propagation steps, which all satisfy the (key) invariant

$$\begin{aligned} \alpha (n) = \omega (n) \ne \sigma (n) \quad \text {for all nodes}\ n\ \text {in}\ \pi . \end{aligned}$$
(1)
The root trace \(\rho = ((r), \{r \mapsto \omega (r)\})\) obviously satisfies this invariant. Now, let \(\tau = (\pi ,\alpha )\) be a trace that satisfies the invariant but is not fully expanded, i.e., \(\pi = (r,\ldots , g)\) with \(g \in G\) and \(\alpha (g) = \omega (g) \ne \sigma (g)\). Since \(\sigma (g) \ne \omega (g)\), gate g has at least one input n with \(\sigma (n) \ne \omega (n)\). If g has no controlling input, then by Definition 6 we may select such an input n. Otherwise, input n is selected as any controlling input; note that every controlling input n satisfies \(\sigma (n) \ne \omega (n)\), since \(\sigma (n) = \omega (n)\) together with consistency of \(\omega \) on g would imply \(\sigma (g) = \omega (g)\). In both cases we select \(x = \omega (n) \ne \sigma (n)\) as backtracing value using Proposition 8. Trace \(\tau \) is now extended by n with backtracing value x to \(\tau '\), i.e., \({\tau \rightarrow \tau '}\), which in turn concludes the inductive proof of Eq. (1). Any fully expanded propagation trace \(\tau = (\pi , \alpha )\) with leaf \(v \in V\), as generated above, also satisfies the invariant in Eq. (1). Thus, we have \(\alpha (v) = \omega (v) \ne \sigma (v)\) with \(v \in \Delta (\sigma ,\omega )\).
In essence, given assignments \(\sigma \) and \(\omega \) as above, our propagation strategy propagates target value \(\omega (r)\) from root r towards the primary inputs, ultimately producing a fully expanded propagation trace \(\tau = (\pi ,\alpha )\). In case of nondeterministic choices for extending the trace we use \(\omega \) as an oracle to pick an input n with \(\sigma (n) \ne \omega (n)\), which can be selected according to Definition 6. The oracle allows us to ensure that for all nodes \(n \in \pi \), \(\alpha (n) = \omega (n)\), which yields \(\alpha (v) = \omega (v) \ne \sigma (v)\) and consequently \(v \in \Delta (\sigma , \omega )\) for leaf v of \(\tau \). Thus, using Lemma 11, our propagation strategy turns out to be distance reducing, and therefore, according to Proposition 3, complete. Figure 4 illustrates the basic idea of our proof, which, in the following, serves as a basis for lifting our approach from the bit-level to the word-level.
Theorem 12
Under the assumptions of the previous Lemma 11 we also get \(v \in \mathcal P (\sigma )\) for leaf v. Thus, \(\mathcal P\) is distance reducing and, as a consequence, complete.
4 Word-level
In the following, we only consider bit-vector expressions of fixed bit-width \({w \in \mathbb {N}}\). We denote a bit-vector expression n of width w as \(n_{[w]}\), but will omit the bit-width if the context allows. We refer to the ith bit of \(n_{[w]}\) as n[i] with \({1 \le i \le w}\) and, for the sake of simplicity, define bit indices as starting from 1 rather than 0. We interpret n[1] as the least significant bit (LSB) and n[w] as the most significant bit (MSB), and denote bit ranges over n from bit index j down to index i as \(n[j\mathop {:}i]\). In string representations of bit-vectors, we interpret the bit at the far left index as the MSB and the bit at the far right index as the LSB. We further define \( ctz \) to be the common function that counts the number of trailing zeroes of a given bit-vector, i.e., the number of consecutive 0-bits starting from the LSB, e.g., \( ctz (0101) = 0\) and \( ctz (111100) = 2\). Similarly, \( clz \) is the common function to compute the number of leading zeroes, i.e., the number of consecutive 0-bits starting from the MSB, e.g., \( clz (0101) = 1\) and \( clz (111100) = 0\).
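Both functions have direct implementations. The following minimal sketch operates on w-bit values represented as (masked) Python integers, which is an assumption of this sketch; it reproduces the examples given above.

```python
def ctz(x, w):
    """Count trailing zeroes of a w-bit vector (ctz(0) = w):
    the number of consecutive 0-bits starting from the LSB."""
    for i in range(w):
        if (x >> i) & 1:
            return i
    return w

def clz(x, w):
    """Count leading zeroes of a w-bit vector (clz(0) = w):
    the number of consecutive 0-bits starting from the MSB."""
    for i in range(w):
        if (x >> (w - 1 - i)) & 1:
            return i
    return w
```

For example, ctz(0101) = 0, ctz(111100) = 2, clz(0101) = 1 and clz(111100) = 0, as in the text.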
For the sake of simplicity and without loss of generality we consider a fixed single-rooted quantifier-free bit-vector formula \(\phi \) and interpret Boolean expressions as bit-vector expressions of bit-width one. The set of bit-vector operators is restricted to \( \mathcal O = \{ \mathrel { \& },\sim ,=,<,\mathop {<<},\mathop {>>},+,\cdot ,\div ,\bmod ,\circ ,{[{}\mathop {:}{}]},\text {if-then-else}\} \) and interpreted according to Table 1. The selection of operators in \(\mathcal O\) is rather arbitrary but provides a good compromise between effective and efficient word-level rewriting and compact encodings for bit-blasting approaches. It is complete, though, in the sense that all operators defined in SMT-LIB [5] (in particular signed operators) can be modeled in a compact way. Note that our methods are not restricted to single-rootedness or this particular selection of operators, and can easily be lifted to any other set of operators or the multi-rooted case.
We interpret formula \(\phi \) as a single-rooted DAG represented as an 8-tuple \((r,N,\kappa ,O,F,V,B,E)\). The set of nodes \({N = O \cup V \cup B}\) contains the single root node \({r \in N}\) of bit-width one, and is further partitioned into a set of operator nodes O, a set of primary inputs (or bit-vector variables) V, and a set of bit-vector constants \(B \subseteq \mathbb {B}^*\), which are denoted in either decimal or binary notation if the context allows. The bit-width of a node is given by \({\kappa :N \rightarrow \mathbb {N}}\), thus \(\kappa (r) = 1\). Operator nodes are interpreted as bit-vector operators via \(F : O \rightarrow \mathcal{O}\), which in turn determines their arity and input and output bit-widths as defined in Table 1. The edge relation between nodes is given as \({E = E_1 \cup E_2 \cup E_3}\), with \({E_i:O \rightarrow N^i}\) describing the set of edges from unary, binary, and ternary operator nodes to their input(s), respectively. We again use the notation \({o \rightarrow n}\) for an edge between an operator node o and one of its inputs n.
We only consider well-formed formulas, where the bit-widths of all operator nodes and their inputs conform to the conditions imposed via interpretation F as defined in Table 1. For instance, we denote a bit-vector addition node o with inputs n and m as \({o = n + m}\), where \({o \in O}\) is of arity 2 with \({F(o) = +}\), and therefore \({\kappa (o) = \kappa (n) = \kappa (m)}\). In the following, if more convenient we will use the functional notation \(o = \diamond (n_1,\ldots ,n_k)\) for an operator node \(o \in O\) of arity k with inputs \(n_1,\ldots ,n_k\) and \(F(o) = \diamond \), e.g., \(+(n, m)\). Note that the semantics of all operators in \(\mathcal O\) correspond to their SMT-LIB counterparts listed in Table 1, with three exceptions. Given a logical shift operation \(n \mathop {<<} m\) or \(n \mathop {>>} m\), w.l.o.g. and as implemented in our SMT solver Boolector [34], we restrict bit-width \(\kappa (n)\) to \(2^{\kappa (m)}\!\). Further, as implemented by Boolector and other state-of-the-art SMT solvers, e.g., MathSAT [13], Yices [16] and Z3 [14], we define an unsigned division by zero to return the greatest possible value rather than introducing uninterpreted functions, i.e., \(x \div 0 =\; \sim 0\). Similarly, \(x \bmod 0 = x\).
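The division-by-zero convention just described can be sketched on w-bit unsigned values as follows; the helper names and integer encoding are illustrative, not Boolector's internals.

```python
def bvudiv(x, y, w):
    """Unsigned division on w-bit values; division by zero yields the
    greatest possible value ~0, i.e., all ones."""
    mask = (1 << w) - 1
    return mask if y == 0 else (x // y) & mask

def bvurem(x, y, w):
    """Unsigned remainder on w-bit values; remainder by zero yields the
    dividend, i.e., x mod 0 = x."""
    return x if y == 0 else x % y
```

For instance, with w = 4 we get \(5 \div 0 = 15\) (i.e., \(\sim 0\) on four bits) and \(5 \bmod 0 = 5\), while nonzero divisors behave as usual.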
A complete assignment \(\sigma \) of a given fixed \(\phi \) is a complete function \(\sigma :N \rightarrow \mathbb {B}^*\) with \(\sigma (n) \in \mathbb {B}^{\kappa (n)}\!\), and a partial assignment is a partial function \(\alpha :N \rightarrow \mathbb {B}^*\) with \(\alpha (n) \in \mathbb {B}^{\kappa (n)}\!\). Given an operator node \(o \in O\) with \(o = \diamond (n_1, \ldots , n_k)\) and \({\diamond \in \mathcal{O}}\), a complete assignment \(\sigma \) is consistent on o if \(\sigma (o) = f (\sigma (n_1), \ldots , \sigma (n_k))\) where \(f :\mathbb {B}^{\kappa (n_1)} \times \cdots \times \mathbb {B}^{\kappa (n_k)} \rightarrow \mathbb {B}^{\kappa (o)}\!\) is determined by the semantics of operator \(\diamond \) as defined in the SMTLIB standard [5] (with the exceptions discussed above).
A complete assignment is (globally) consistent on \(\phi \) (or just consistent), iff it is consistent on all bitvector operator nodes \({o \in O}\) and \(\sigma (b) = b\) for all bitvector constants \(b \in B\). A satisfying assignment \(\omega \) is a complete and consistent assignment that satisfies the root, i.e., \({\omega (r) = 1}\). In the following, we will again use the letter \(\mathcal C\) to denote the set of complete and consistent assignments, and the letter \(\mathcal W\) with \(\mathcal{W} \subseteq \mathcal{C}\) to denote the set of satisfying assignments of formula \(\phi \).
Given a bitvector variable \(v \in V\) with \(\kappa (v) = w\) and assignments \({\sigma ,\sigma '\!\!\in \mathcal C}\), we adopt the notion of obtaining assignment \(\sigma '\) from assignment \(\sigma \) by assigning a new value x to variable v with \({x \in \mathbb {B}^w}\) and \(x \ne \sigma (v)\), written as \({\sigma \xrightarrow {v \mapsto x} \sigma '}\), which we refer to as a move. The set of wordlevel moves is thus defined as \({\mathcal{M} = \{ (v,x) \mid v \in V, x \in \mathbb {B}^{\kappa (v)}\}}\), and accordingly, a wordlevel propagation strategy \(\mathcal P\) is defined as a function \(\mathcal{P} :\mathcal{C} \rightarrow \mathbb {P}(\mathcal{M})\), which maps a consistent assignment to a set of moves. We lift propagation strategy \(\mathcal P\) from the bitlevel to the wordlevel by first introducing our new notion of essential inputs, which lifts and extends the bitlevel notion of controlling inputs to the wordlevel.
Definition 13
(Essential Inputs) Let \(n \in N\) be an input of a bitvector operator node \({o \in O}\), i.e., \({o \rightarrow n}\), and let \(\sigma \) be a complete assignment consistent on o. Further, let t be the target value of o, i.e., \(\sigma (o) \rightsquigarrow t\), with \({t \ne \sigma (o)}\). We say that n is an essential input under \(\sigma \) w.r.t. target value t, if for all complete assignments \(\sigma '\) consistent on o with \({\sigma (n) = \sigma '(n)}\), we have \({\sigma '(o) \ne t}\).
In other words, an input n to an operator node o is essential w.r.t. some target value t, if o can not assume t as long as the assignment of n does not change. As an example, consider the bitvector operators and their essential inputs under some consistent assignment \(\sigma \) w.r.t. some target value t as depicted in Fig. 5.
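Since all operators range over fixed-size bitvectors, Definition 13 can be checked by brute force for small bitwidths. The sketch below (our own illustration; the names and encoding are not from the article) enumerates all values of the remaining inputs and reproduces cases (a) and (b) of the example:

```python
from itertools import product

def is_essential(op, widths, values, idx, t):
    """Input `idx` is essential iff no choice of the other inputs lets the
    operator produce target value t while input `idx` keeps its value."""
    ranges = [
        [values[i]] if i == idx else range(1 << widths[i])
        for i in range(len(values))
    ]
    return all(op(*vals) != t for vals in product(*ranges))

W = 2
mask = (1 << W) - 1
add = lambda a, b: (a + b) & mask
band = lambda a, b: a & b

# (a) o := n + m, t = 10: '+' never has essential inputs.
assert not is_essential(add, [W, W], [0b00, 0b11], 0, 0b10)
# (b) o := n & m, t = 01, sigma = {n -> 10, m -> 11}: n essential, m not.
assert is_essential(band, [W, W], [0b10, 0b11], 0, 0b01)
assert not is_essential(band, [W, W], [0b10, 0b11], 1, 0b01)
```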
Example 14
Consider the bitvector operators \( \{+,\;\mathrel { \& },\;\mathop {<<},\;\cdot ,\;\div ,\;\bmod ,\;\circ \}\) of bitwidth 2 as depicted in Fig. 5. For an operator node o, at the outputs we denote given assignment \(\sigma (o)\) and target value t as \(\sigma (o) \rightsquigarrow t\) (e.g., \({10 \rightsquigarrow 01}\)). At the inputs we show their assignment under \(\sigma \). Essential inputs (under \(\sigma \) w.r.t. target value t) are indicated with an underline.

(a)
Given \({{\mathbf {o} := n + m}}\) with \({t = 10}\) and \(\sigma = \{ {o \mapsto 11},\; {n \mapsto 00},\; {m \mapsto 11}\,\}\). Operator \(+\) has no essential inputs, independent of \(\sigma \) and t.

(b)
Given \( {{\mathbf {o} := n \mathrel { \& } m}}\) with \({t = 01}\) and \(\sigma = \{ {o \mapsto 10},\; {n \mapsto 10},\; {m \mapsto 11} \,\}\). Input n is essential since \( t \mathrel { \& } \sigma (n) \ne t\) and thus, it is not possible to find a value x for m such that \( \sigma (n) \mathrel { \& } x = t\). Input m, however, is not essential since \( t \mathrel { \& } \sigma (m) = t\).

(c)
Given \({{\mathbf {o} := n \mathop {<<} m}}\) with \({t = 10}\) and \(\sigma = \{ {o \mapsto 00},\; {n \mapsto 00},\; {m \mapsto 1}\}\). Input n is obviously essential, since shifting 00 can never result in the nonzero target value \(t = 10\). Input m, however, is not essential, since it is possible to simply select, e.g., \(x = 01\) for n such that \(t = 10 = x \mathop {<<} \sigma (m) = 01 \mathop {<<} 1\).

(d)
Given \({\mathbf {o := n \cdot m}}\) with \({t = 10}\) and \({\sigma = \{ {o \mapsto 00},\; {n \mapsto 00},\; {m \mapsto 10}\}}\). Input n is essential since \(t \ne 00\) but \(\sigma (n) = 00\), and thus, it is not possible to find a value x for m such that \(\sigma (n) \cdot x = t\). Input m, however, is not essential since we could pick, e.g., \(x = 01\) for n to obtain \(t = 10 = x \cdot \sigma (m) = 01 \cdot 10\).

(e)
Given \({{\mathbf {o} := n \div m}}\) with \({t = 10}\) and \({\sigma = \{ {o \mapsto 01},\; {n \mapsto 01},\; {m \mapsto 01}\}}\). Input n is essential since \({\sigma (n) < t}\), and thus, it is not possible to find a value x for m such that \({\sigma (n) \div x = t}\). Input m, however, is not essential, since we could pick, e.g., \(x = 10\) for n to obtain \(t = 10 = x \div \sigma (m) = 10 \div 01\).

(f)
Given \({{\mathbf {o} := n \bmod m}}\) with \({t = 10}\) and \(\sigma = \{ {o \mapsto 00},\; {n \mapsto 01},\; {m \mapsto 01}\}\). Since \({\sigma (n) = 01 < 10 = t}\), it is not possible to find a value x for m such that \(\sigma (n)\bmod x = t\). However, since \({\sigma (m) = 01}\) but \({t \ne 00}\), it is also not possible to find a value x for n such that \({x\bmod \sigma (m) = t}\). Hence, both inputs are essential.

(g)
Given \({{\mathbf {o} := n \circ m}}\) with \({t = 11}\) and \(\sigma = \{ {o \mapsto 01},\; {n \mapsto 0},\; {m \mapsto 1}\}\). Input n is essential since \({\sigma (n) \ne t[2:2]}\), and thus, it is not possible to find a value x for m such that \(\sigma (n) \circ x = t\). Input m, however, is not essential since it already matches the corresponding slice of the target value.
Note that bitlevel expressions (AIGs) can be represented by bitvectors of bitwidth one, which can be interpreted as wordlevel Boolean expressions. In this sense, the notion of controlling inputs can also be applied to Boolean expressions on the wordlevel.
Proposition 15
When applied to bitlevel expressions, the notion of essential inputs exactly matches the notion of controlling inputs.
Proof
For applying the notion of essential inputs to bitlevel expressions, consider the operator set \(O = \{ \lnot , \wedge \} = G\) and operator \(o \in G\) with \(o \rightarrow n\). Target value \({t \ne \sigma (o)}\) as in Definition 13 implies \(t = \lnot \sigma (o)\) for operator o. This exactly matches the implicit definition of the target value of a Boolean operator on the bitlevel. Now assume that input n is essential w.r.t. target value\(\;t\). Then, if \({\sigma (n) = \sigma '(n)}\), by Definition 13 we have that \({\sigma '(o) \ne t}\), and therefore \({\sigma '(o) = \lnot t = \sigma (o)}\), which exactly matches the notion of controlling inputs as in Definition 4. The other direction (applying the notion of controlling inputs to wordlevel Boolean expressions exactly matches the notion of essential inputs) works in the same way. \(\square \)
The definition of a (rooted and expanded) path as a sequence of nodes \(\pi = (n_1, \ldots , n_k) \in N^*\) is lifted from the bitlevel to the wordlevel in the natural way. Corresponding restrictions and implications of Sect. 3 apply. The notions of path selection and path extension are lifted to the wordlevel as follows.
Definition 16
(Path Extension) Given a path \({\pi = (\ldots , o)}\) with \({o \in O}\) and \({o \rightarrow n}\), we say that \({\pi .n = (\ldots , o, n)}\) is an extension of path \(\pi \) with node n.
Definition 17
(Path Selection) Given a complete consistent assignment \(\sigma \in \mathcal C\), a path \(\pi = (\ldots , o)\) as in Definition 16 above, and \(\sigma (o) \rightsquigarrow t\), i.e., \(t \ne \sigma (o)\), then input n can be selected w.r.t. \(\sigma \) and target value t to extend \(\pi \) to \(\pi .n\) if n is essential or if o has no essential input (in both cases essential under \(\sigma \) w.r.t. \(\,t\)).
Figure 6 shows examples for all combinations of essential (underlined) and nonessential inputs for all bitvector operators in \(\mathcal O\). For an operator node o, an output value \(\sigma (o) \rightsquigarrow t\) indicates the desired transition from current assignment \(\sigma (o)\) to target value t, and an input value shows its assignment under \(\sigma \). Underlined blue cases indicate that this input is a single essential input and will therefore always be selected. Any other case (both inputs are essential or no input is essential) represents a nondeterministic choice during path selection.
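Path selection according to Definition 17 can be sketched as follows, assuming the set of essential inputs under the current assignment has already been computed (the function name is illustrative only, not solver code):

```python
import random

def select_input(inputs, essential):
    """Extend the path (Definition 17): if the operator has essential
    inputs, pick one of them; otherwise pick any input."""
    candidates = essential if essential else inputs
    return random.choice(candidates)

# A single essential input is always selected:
assert select_input(["n", "m"], essential=["n"]) == "n"
# Without essential inputs, the choice is nondeterministic:
assert select_input(["n", "m"], essential=[]) in ("n", "m")
```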
In contrast to value selection on the bitlevel, where a backtracing step always yields the flipped assignment of the selected input as backtracing value, on the wordlevel, selecting a backtracing value is not uniquely defined but a source of nondeterminism. We consider three variants of value selection, under the following assumptions. Let t be the target value of an operator node \(o \in O\), and let \(\sigma \in \mathcal C\) be a complete assignment such that \(\sigma (o) \ne t\). Further, assume that input n with \(o \rightarrow n\) is selected w.r.t. target value t and \(\sigma \) as in Definition 17 above.
Definition 18
(Random Value) Any value x with \(\kappa (x) = \kappa (n)\) is called a random value for input n.
Definition 19
(Consistent Value) A random value x is a consistent value for input n w.r.t. target value t, if there is a complete assignment \(\sigma '\) consistent on operator node o with \(\sigma '(n) = x\) and \(\sigma '(o) = t\).
In other words, a value is consistent for an input if the target value can be produced after changing the values of other inputs where necessary. We compute a consistent value as the backtracing value x for input n as described in Table 2.
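As an illustration of consistent value computation, consider bitwise and: a value x for input n of \(o := n \mathrel{\&} m\) is consistent w.r.t. t exactly if x covers all bits set in t, since no choice of m can set an output bit that is unset in x. A sketch of this single-operator case (the function name is ours; Table 2 covers all operators):

```python
import random

def consistent_value_and(t, width):
    """Consistent value x for input n of o := n & m w.r.t. target t:
    x must have a 1 wherever t does (no m can lift o to t otherwise);
    all remaining bits are free and chosen randomly."""
    return t | random.getrandbits(width)

x = consistent_value_and(0b01, 2)
# Consistency witness: choosing m = t then yields x & m = t.
assert x & 0b01 == 0b01
```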
However, in some cases, restricting the notion of consistent values even further may be beneficial. Consider the following motivating example.
Example 20
Consider a formula \(\phi := 274177_{[65]} \cdot v = 18446744073709551617_{[65]}\). Computing \(x = 18446744073709551617_{[65]} \div 274177_{[65]} = 67280421310721_{[65]}\) immediately concludes with a satisfying assignment for \(\phi \).
The chances to select \(x = 67280421310721_{[65]}\) if consistent values for the multiplication operator are chosen as described in Table 2 are arbitrarily small. Hence, we also consider the notion of inverse values, which utilize the inverse of an operator.
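For multiplication by an odd constant c, such an inverse value can be computed directly via the multiplicative inverse of c modulo \(2^w\), which reproduces Example 20 in one step. A sketch in Python (the three-argument `pow(c, -1, m)` requires Python 3.8+; this is not the general rule for even factors):

```python
# Inverse value for input v of c * v with odd c: since c is invertible
# modulo 2**w, x = t * c^-1 mod 2**w is the unique inverse value.
w = 65
c = 274177
t = 18446744073709551617  # 2**64 + 1
x = (t * pow(c, -1, 1 << w)) % (1 << w)
assert x == 67280421310721
assert (c * x) % (1 << w) == t
```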
Definition 21
(Inverse Value) A consistent value x is an inverse value for input n w.r.t. target value t and assignment \(\sigma \), if there exists a complete assignment \(\sigma '\) consistent on operator node o with \(\sigma '(n) = x\), \(\sigma '(o) = t\) and \(\sigma '(m) = \sigma (m)\) for all inputs m with \(o \rightarrow m\) and \(m \ne n\).
In other words, a value is an inverse value for input n if the target value can be produced for an operator node without changing the assignments of its other inputs. Consequently, an inverse value for input n is also consistent. We compute an inverse value as the backtracing value x for input n as described in Tables 3 and 4.
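Two simple instances of inverse value computation, sketched for \(+\) (an inverse value always exists) and for \(\mathrel{\&}\) (an inverse value exists only if \(t \mathrel{\&} \sigma(m) = t\)). The helper names are our own; Tables 3 and 4 give the full rules:

```python
def inverse_value_add(t, sigma_m, width):
    """o := n + m: an inverse value for n always exists,
    x = t - sigma(m) modulo 2**width."""
    return (t - sigma_m) % (1 << width)

def inverse_value_and(t, sigma_m):
    """o := n & m: an inverse value for n exists only if
    t & sigma(m) == t; then t itself is one such value
    (bits where sigma(m) is 0 are free)."""
    return t if t & sigma_m == t else None

W = 2
assert inverse_value_add(0b10, 0b11, W) == 0b11   # 11 + 11 = 10 (mod 4)
assert inverse_value_and(0b01, 0b11) == 0b01      # 01 & 11 = 01
assert inverse_value_and(0b01, 0b00) is None      # 00 & x can never be 01
```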
Note that inverse value computation as initially presented in [36] is too restrictive for some operators, which renders the procedure incomplete since it may inadvertently prune the search space. We therefore require that inverse value computation can generate all possible values for all operators in \(\mathcal O\), which is the case for the rules for inverse value computation described in Tables 3 and 4.
Definition 22
(Backtracing Step) Let \(\sigma \in \mathcal C\) be a complete consistent assignment. Given an operator node \(o \in O\) with \(o \rightarrow n\) and a target value \({t \ne \sigma (o)}\), then a backtracing step selects input n of operator node o w.r.t. \(\sigma \) as in Definition 17 and selects a backtracing value x for n as a consistent (and optionally inverse) value w.r.t. \(\sigma \) and t if such a value exists, and a random value otherwise.
Note that it is not always possible to find an inverse value for input n, e.g., \( {o := n \mathrel { \& } m}\) with \(\sigma = \{ o \mapsto 00, n \mapsto 00, m \mapsto 00 \}\) and \(t = 01\). Further, even for operators that always admit inverse values, e.g., operator \(+\), selecting only inverse values may lead to inadvertently pruning the search space, see Example 23 below.
Example 23
Consider formula \(\phi := v + v + 2_{[2]} = 0_{[2]}\) with root \(r := p_2 = 0_{[2]}\), where \(p_2 := v + p_1\) and \(p_1 := v + 2_{[2]}\), and a complete consistent assignment \(\sigma _1 = \{ v \mapsto 00\), \(p_1 \mapsto 10\), \(p_2 \mapsto 10\), \(r \mapsto 0\}\), as shown in Fig. 7a. Let \({t = 1}\) be the target value of root r, i.e., our goal is to find a value for bitvector variable v such that \({p_2 = 00}\), and thus, formula \(\phi \) is satisfied. Assume that as in Fig. 7ab, only inverse values are selected for \(+\) operators during propagation. Down propagating target values along the path indicated by blue arrows in Fig. 7a, the move \(v \mapsto 10 = \alpha _1(v)\) is generated, which consequently yields assignment \(\sigma _2 = \{ v \mapsto 10\), \(p_1 \mapsto 00\), \(p_2 \mapsto 10\), \(r \mapsto 0 \}\) as indicated in Fig. 7b. Selecting the other possible propagation path, the same move is produced. Thus, \(\sigma _2\) is independent of which of the two paths is selected. Since \({\sigma _2(r) \ne t}\), target value t is again propagated down, which generates move \(v \mapsto 00 = \alpha _2(v)\), again independently of which path is selected. With this, we move back to the initial assignment \(\sigma _1\). Consequently, a satisfying assignment, e.g., \(\omega (v) = 01\) or \(\omega '(v) = 11\), can not be reached by only selecting inverse values. However, selecting a consistent but noninverse value for \(p_1\) as, e.g., in Fig. 7c, generates move \(v \mapsto 01 = \alpha _1' (v)\), which yields a satisfying assignment \(\omega = \{ v \mapsto 01\), \(p_1 \mapsto 11\), \(p_2 \mapsto 00\), \(r \mapsto 1\} \).
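The cycle described in Example 23 can be reproduced with a few lines of arithmetic modulo 4. The sketch below (our own illustration) models only the propagation path \(r \rightarrow p_2 \rightarrow p_1 \rightarrow v\); as noted in the example, the other path yields the same moves:

```python
# Using inverse values only, down-propagation cycles between v = 00 and
# v = 10 and never reaches a satisfying assignment (v = 01 or v = 11).
M = 4  # 2-bit arithmetic

def inverse_only_move(v):
    t_p1 = (0 - v) % M     # inverse value for input p1 of p2 := v + p1
    return (t_p1 - 2) % M  # inverse value for input v of p1 := v + 2

# The satisfying assignments are v = 01 and v = 11:
sat = [v for v in range(M) if (v + v + 2) % M == 0]
assert sat == [1, 3]

seen, v = [], 0b00
for _ in range(4):
    seen.append(v)
    v = inverse_only_move(v)
assert seen == [0b00, 0b10, 0b00, 0b10]  # the search cycles
```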
As shown in Example 23, a propagation strategy using only inverse values without further randomization is incomplete. Hence, when performing a backtracing step, we in general select some consistent noninverse value, if no inverse value exists, and otherwise nondeterministically choose between consistent (but not necessarily inverse) and inverse values. Since all operators in \(\mathcal O\) are surjective for our selected semantics (i.e., they can produce any target value, e.g., \({\sim \!0 \bmod 0 = \;\,\sim \!0}\)), it is not necessary to select inconsistent random values. For other sets of operators, however, this might be necessary. For the sake of completeness we therefore included the selection of random values in the formal definition of backtracing steps.
Note that since on the bitlevel the backtracing value for a selected input is uniquely determined (see Proposition 8), the issue of value selection is specific to the wordlevel. Further, when interpreting AIGs as wordlevel expressions, the notion of backtracing steps on the bitlevel as in Definition 7 exactly matches the wordlevel notion as in Definition 22 using Proposition 15. As a side note, the problem of value selection during wordlevel backtracing and subsequent wordlevel propagation is similar to the problem of making a theory decision (“model assignment”) and propagating this decision in MCSat [15, 29].
The wordlevel propagation strategy \(\mathcal P\) is defined in exactly the same way as for the bitlevel (see Definition 10) except that the wordlevel notion of backtracing based on essential inputs and consistent and inverse value selection (Definition 22) replaces bitlevel backtracing based on controlling inputs (Definition 7), and the set of valid moves \(\mathcal P(\sigma )\) contains not only the leafs of all expanded propagation traces but also their updated assignments, i.e., \((v,\alpha (v))\) for a leaf v. Further important concepts defined on the bitlevel in Sect. 3 can be extended naturally to the wordlevel. These concepts include (expanded) paths and traces, leafs, and trace extension. We omit formal definitions accordingly.
Proposition 8, which is substantial for the bitlevel proof of Lemma 11, does not directly apply on the wordlevel due to the more sophisticated selection of backtracing values. We lift Proposition 8 to the wordlevel as follows.
Proposition 24
Let \(\sigma \in \mathcal{C}\) be a complete consistent assignment, and let \(\omega \) be a satisfying assignment \(\omega \in \mathcal W\). Given operator node \(o \in O\) with \(o \rightarrow n\) and target value \(t = \omega (o) \ne \sigma (o)\), i.e., \(\sigma (o)\rightsquigarrow t\), then there exists a backtracing step w.r.t. assignment \(\sigma \) and target value t, which selects input n and backtracing value \(x = \omega (n) \ne \sigma (n)\).
Proof
First, assume that operator node o has an essential input w.r.t. assignment \(\sigma \). Then we select an arbitrary essential input n of o. Since target value \(t = \omega (o) \ne \sigma (o)\), we get \(\sigma (n) \ne \omega (n)\) by contraposition of Definition 13. Similarly, if o has no essential inputs, then we select n as an arbitrary input with \(\sigma (n) \ne \omega (n)\), which has to exist since \(\omega (o) \ne \sigma (o)\). In both cases, we can select \(x = \omega (n) \ne \sigma (n)\) as backtracing value, which is consistent for operator node o w.r.t. assignment \(\sigma \) and target value t since \(\omega \) is consistent. Picking a random value as backtracing value, which is the last case in Definition 22, can not occur under the given assumptions since, as already discussed, \(\omega \) is consistent on o. \(\square \)
Using Proposition 24 instead of Proposition 8, the bitlevel proof of Lemma 11 can then be lifted to the wordlevel by replacing every occurrence of gate g with operator node o, and the notion of “controlling” input with “essential” input.
Theorem 25
Theorem 12 and Lemma 11 also apply on the wordlevel, and thus, propagation strategy \(\mathcal P\) is also complete on the wordlevel.
Note that even though Proposition 24 would allow us to restrict the selection of consistent and inverse backtracing values to be different from the current input node value, i.e., \(x \ne \sigma (n)\), we do not enforce this property. Restricting value selection to a value \(x \ne \sigma (n)\) interferes with path selection, in particular in the case where an input node is selected for which the current value is the only consistent or inverse value. We leave the exploration of this optimization to future work.
5 Experimental evaluation
We implemented our propagation strategy within our SMT solver Boolector [34] and consider the following configurations.

(1)
Bb The core Boolector engine, which implements a bitblasting approach. This configuration is identical to the version that entered the QF_BV track of the SMT competition 2016 and uses (internal) version bbc of our SAT solver Lingeling [8] as back end solver.

(2)
Bsls The scorebased local search approach of [18] as implemented in Boolector [36], with random walks enabled. This approach lifts stochastic local search for SAT to the wordlevel and iteratively moves from a nonsatisfying towards a satisfying assignment by flipping single bits or incrementing, decrementing and (bitwise) negating the values of the primary inputs. Moves are in general selected as the best (improving) moves according to some score function, and if no such move exists, a random value is chosen. If random walks are enabled, with a certain probability some random (and not necessarily the best) move is performed. This configuration mainly corresponds to the default configuration of [18] as implemented in Z3 [14] except for the score definition, which differs due to implementation issues (as described in [36]).

(3)
Paig The bitlevel configuration of our propagationbased approach, which operates on the AIG representation of a given input as bitblasted by Boolector.

(4)
Pw The wordlevel configuration of our propagationbased approach, which directly operates on the given bitvector formula, with inverse values prioritized over consistent values during backtracing with a ratio of 99:1.
Note that the choice of rewriting and other simplification techniques applied prior to the actual decision procedure may considerably influence its performance. In order to provide the same basis for comparison and avoid skewed results due to differences in the rewriting and simplification techniques applied by Z3 [14] versus Boolector, we do not compare our propagationbased approach against the original implementation of [18] in Z3 but against our implementation of [18] in Boolector (configuration Bsls). All configurations of Boolector apply the same set of rewriting and simplification techniques in the same order.
Since [35], and in particular for the SMT competition 2016, we have improved several core components of Boolector, which affects all of the configurations above. The default configurations of Paig and Pw therefore show major improvements in comparison to [35]. In comparison to [36], however, the default configuration of Bsls seems to perform worse. This is solely due to minor changes within the scorebased local search engine of Boolector that affect the random number generator (RNG). We will show that the difference in the number of solved instances compared to [36] lies within the expected variance caused by randomization effects. Note that unless otherwise noted, the default configurations of all local search configurations Bsls, Paig and Pw use a seed value of 0 for the RNG.
We compiled a set of in total 16436 benchmarks and included all benchmarks with status sat or unknown in the QF_BV category of the SMTLIB [6] benchmark library except those proved by Bb to be unsatisfiable within a time limit of 1200 s. We further excluded all benchmarks solved by Boolector via rewriting only. Note that our benchmark set is the same set we already used in [36] and [35]. Previously, all benchmarks in the Sage2 family that used nonSMTLIBv2 compliant operators had to be explicitly excluded from the set above. However, since the SMT competition 2016, these benchmarks have been removed from SMTLIB.
All experiments were performed on a cluster with 30 nodes of 2.83 GHz Intel Core 2 Quad machines with 8 GB of memory using Ubuntu 14.04.3 LTS. Each run is limited to use 7 GB of main memory. In terms of runtime we consider CPU time only. In case of a time out or memory out, the time limit is used as runtime.
Note that the results in [18] indicate that there still exists a considerable gap between the performance of stateoftheart bitblasting and wordlevel local search. However, the latter significantly outperforms bitblasting on several instances. We therefore evaluated our local search configurations with regard to an application within a sequential portfolio setting and applied time limits of 1 and 10 s for the local search configurations, and a limit of 1200 s for the bitblasting and the sequential portfolio configurations.
We evaluated our propagationbased strategy in comparison to the scorebased local search approach in [18], in particular in terms of robustness with respect to randomization effects. We ran a batch of 21 runs of each configuration Pw, Paig and Bsls with different seeds for the RNG of Boolector (one with default seed 0 and 20 with different random seeds) with a time limit of 10 s. Table 5 summarizes the results of configurations Bb, Bsls, Paig and Pw with a time limit of 10 s and default seed 0 for the local search configurations. As further illustrated in Figs. 8 and 9, overall, our wordlevel propagation strategy Pw clearly outperforms our bitlevel propagation strategy Paig and the scorebased local search approach Bsls.
Figure 9 shows the results of Pw, Paig and Bsls over all 21 runs with different seeds in terms of number of solved instances and runtime as boxandwhiskers plots, with the results of the runs with default seed 0 indicated with a red diamond. As a measure for robustness we use the standard deviation (SD) and the interquartile range (IQR), i.e., the distance between the lower quartile and the upper quartile, of the results of all 21 runs with different seeds, where lower values indicate a higher level of robustness. In terms of number of solved instances, for configuration Pw (SD: 17.88, IQR: 27) both the SD and the IQR are less than half of those of Bsls (SD: 44.9, IQR: 60) and less than a third of those of Paig (SD: 62.6, IQR: 82). These results suggest that compared to Paig, both Pw and Bsls profit from directly working on the wordlevel, and overall, our wordlevel propagationbased strategy is indeed more robust with respect to randomization effects than the scorebased local search approach of [18].
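The robustness metrics used above can be computed with standard library means. A sketch (the run counts below are purely hypothetical and serve only to illustrate the computation, not data from our evaluation):

```python
import statistics

def robustness_metrics(solved_counts):
    """Sample standard deviation and interquartile range of the number of
    solved instances over a batch of runs (lower = more robust)."""
    sd = statistics.stdev(solved_counts)
    q1, _, q3 = statistics.quantiles(solved_counts, n=4)
    return sd, q3 - q1

# Hypothetical counts for 5 runs (illustration only):
sd, iqr = robustness_metrics([100, 102, 98, 101, 99])
assert round(sd, 2) == 1.58
assert iqr == 3.0
```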
Even though overall Pw outperforms Paig and Bsls, configuration Pw seems to struggle on some benchmarks in the families sage, Sage2 and stp_samples in comparison to Bsls (457 instances) and Paig (38 instances). As an interesting observation, when bitblasting the benchmarks in question, for the majority of benchmarks more than 50% of the bitvector expressions contain bits that have been simplified to the Boolean constants \(\{0,1\}\) on the bitlevel. Our bitlevel strategy Paig operates on the bitblasted AIG layer, where all constant bits are eliminated via rewriting, and therefore always propagates target values that can actually be assumed. Our wordlevel strategy Pw, however, does not know which bits can be simplified to constant bits and may therefore determine and propagate target values that can never be assumed. Configuration Bsls, on the other hand, also does not have any explicit information on constant bits but considers them implicitly when exploring the neighborhood prior to performing a move, since any neighbor with constant bits not matching their value will not result in a score improvement.
In an additional experiment, we evaluated the models of the 457 benchmarks on which Pw seems to have a disadvantage over Bsls and identified an interesting pattern.
For more than 80% (374 instances) out of all 457 instances the assignment of more than 50% of the inputs was 0, and for 80% (293 instances) out of these instances, for more than 30% of the nonzero inputs only one bit was set to 1.
Hence, since Bsls starts with an initial assignment where all inputs are set to 0, for this kind of benchmark its focus on single bit flips allows it to quickly move the initial assignment towards a satisfying assignment.
For more than 60% of all 457 instances, Bsls required less than 50 moves (in comparison, the maximum number of moves for all solved instances is 3086).
Configuration Pw starts with the same initial assignment as Bsls, however, as mentioned above, the fact that the majority of these 457 benchmarks contains a considerable amount of expressions with constant bits seems to handicap Pw.
These results suggest that for this set of benchmarks the strategy of Bsls is advantageous over Pw and in particular profits from an initial assignment where all inputs are set to 0. Hence, in an additional experiment we introduce configurations Bsls1 and Pw1 where we initialized the inputs with all bits set to 1 rather than 0. Figure 10 shows the performance of Bsls1 and Pw1 in comparison to Bsls and Pw over 21 runs with different seeds (again, one with default seed 0 and 20 with different random seeds) with a time limit of 10 s. Table 6 further summarizes the results of Bsls, Bsls1, Pw and Pw1 with default seed 0. Overall, configuration Bsls obviously profits considerably from initializing the inputs with 0 since in comparison to Bsls the number of solved instances of Bsls1 drops by almost 10%. In particular on the set of 457 benchmarks where Bsls had an advantage over our propagationbased strategy Pw, initializing the inputs with 1 resulted in Bsls1 only solving 42 instances (9.2%) within a time limit of 10 s. Our propagationbased strategy, on the other hand, is much more robust than Bsls with respect to the input initialization value and seems to overall even profit from initializing the inputs with 1 rather than 0.
Figure 8c shows the performance of our propagationbased configuration Pw compared to our bitblasting configuration Bb with a time limit of 10 s. As summarized in Table 5, even though there exists a considerable gap in the number of solved instances between Bb and Pw (within 10 s, Bb solves almost 2000 instances more than Pw), on 2650 benchmarks, Pw outperforms Bb by at least a factor of 10. In an additional experiment, we evaluated Pw with a time limit of 1200 s, which increases the number of solved instances compared to a time limit of 10 s by 7% (571 instances). These results suggest a combination of both configurations within a sequential portfolio setting [39], where our propagationbased strategy is run for a certain amount of time prior to invoking the bitblasting engine. However, in practice, the number of propagation steps performed is a more reliable metric than the actual runtime of Pw within a sequential portfolio setting. In the following, we distinguish two sequential portfolio configurations.

(1)
Bb+PwvirtualXs A virtual sequential portfolio combination of Pw and Bb, where we assume that Pw is run exactly X seconds prior to invoking Bb.

(2)
Bb+PwX The sequential portfolio combination of Pw and Bb as implemented in Boolector, where configuration Pw is run with a limit of X propagation steps prior to invoking Bb. Note that this configuration won the QF_BV division of the main track of the SMT competition 2016 with X=1000=1k.
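The control flow of such a sequential portfolio can be sketched as follows; `propagate_once` and `bitblast_solve` are hypothetical stand-ins for the actual engines in Boolector, and the toy "propagation step" merely enumerates candidates in place of a real propagation move:

```python
def portfolio_solve(formula, step_limit, propagate_once, bitblast_solve):
    """Run the local search engine for at most `step_limit` propagation
    steps, then fall back to bitblasting."""
    assignment = None
    for _ in range(step_limit):
        assignment, sat = propagate_once(formula, assignment)
        if sat:
            return "sat", assignment   # solved by local search
    return bitblast_solve(formula)     # fall back to bitblasting

# Toy instance: find x with (x * 3) % 16 == 5.
def step(formula, a):
    a = 0 if a is None else a + 1
    return a, (a * 3) % 16 == 5

result, model = portfolio_solve(None, 1000, step, lambda f: ("unknown", None))
assert result == "sat" and (model * 3) % 16 == 5
```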
Figure 11 illustrates the performance of a virtual sequential portfolio combination Bb+Pwvirtual1s in comparison to the bitblasting configuration Bb with a time limit of 1200 s, where we assume that configuration Pw is run for 1 s before falling back to the bitblasting engine. Overall, configuration Bb+Pwvirtual1s solves 63 instances more than Bb, and further outperforms Bb in terms of runtime by at least a factor of 10 on almost 2400 benchmarks.
Figure 12 shows the performance of the sequential portfolio combinations Bb+Pw1k, Bb+Pw10k, Bb+Pw50k and Bb+Pw100k in comparison to configuration Bb with a time limit of 1200 s, where Pw is run with a limit of 1000, 10,000, 50,000 and 100,000 propagation steps before invoking the bitblasting engine. With a limit of 1k propagation steps, configuration Bb+Pw1k already solves 41 instances more than Bb. It further outperforms Bb in terms of runtime by at least a factor of 10 on more than 2400 benchmarks. Increasing the propagation step limit for configuration Pw to 10k, 50k and 100k further increases performance in terms of runtime, with 2601 (Bb+Pw10k), 2649 (Bb+Pw50k) and 2657 (Bb+Pw100k) instances solved at least a factor of 10 faster than with configuration Bb. In terms of number of solved instances, configuration Bb+Pw10k shows the best performance with a plus of 52 instances compared to Bb. Configurations Bb+Pw50k and Bb+Pw100k still solve 50 and 45 more instances than Bb, but lose instances compared to Bb+Pw10k due to the increasing overhead introduced for those instances not solved within the given propagation step limit.
In an additional experiment with configurations Bb+Pw1k and Bb+Pw10k, we compiled a set of 21172 unsatisfiable benchmarks containing all QF_BV benchmarks in SMTLIB with status unsat and determined the overhead introduced by Pw. With a total of 1237 s for configuration Bb+Pw1k, the overhead for the unsatisfiable instances is negligible compared to the performance gain of almost 102k s on the satisfiable instances. For configuration Bb+Pw10k, the overhead for the unsatisfiable instances is larger by a factor of 10 (10,316 s), which is still an order of magnitude less than the performance gain of more than 116k s on the satisfiable instances.
Table 7 summarizes the results of configurations Bb, Bb+Pwvirtual1s and Bb+Pw10k, and gives a more detailed overview by benchmark family with a time limit of 1200 s. As shown in Fig. 12, a propagation step limit of 100k (Bb+Pw100k) almost corresponds to virtually limiting the runtime of Pw to 1 s (Bb+Pwvirtual1s), in particular when considering the number of instances solved by at least a factor of 10 faster than Bb. A propagation limit of \(10\,000\) (Bb+Pw10k), however, yields the best results in terms of number of solved instances and the overall runtime.
Figure 13 shows the randomization effects on our propagation-based strategy Pw in terms of the number of solved instances when introducing different levels of nondeterminism during value selection and different path selection strategies, over 21 runs with different seeds and a time limit of 10 s.
In terms of value selection, the default configuration of Pw prioritizes inverse values over consistent values during backtracing with a ratio of 99:1. As illustrated in Fig. 13a, decreasing this ratio, i.e., increasing the probability to choose consistent values over inverse values, increases the level of nondeterminism of our backtracing algorithm and, as a consequence, the variance in terms of performance. The default ratio of 99:1 has a standard deviation (SD) of 17.9; decreasing the ratio of inverse to consistent values to 50:50 and 0:100 (consistent values only) increases the SD to 23.5 and 38.9, respectively. When decreasing the level of nondeterminism by increasing the ratio of inverse to consistent values to 100:0 (inverse values only), on the other hand, the SD drops to 14.1. Overall, as shown in Fig. 13a, a higher probability to choose inverse over consistent values also increases performance. However, as shown in Sect. 4, using inverse values only (ratio 100:0) is incomplete.
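The weighted choice between inverse and consistent values can be sketched as follows. The function name and signature are hypothetical, and the sketch assumes both candidate values always exist; in the actual algorithm an inverse value may not exist, in which case a consistent value must be used.

```python
import random

def pick_backtrace_value(inverse_value, consistent_value,
                         ratio=(99, 1), rng=random):
    """Choose between an inverse and a consistent value when propagating a
    target value down to one operand. With the default ratio (99, 1), the
    inverse value is chosen with probability 0.99; ratio (100, 0) always
    picks the inverse value, ratio (0, 100) always the consistent one."""
    inv_w, con_w = ratio
    # Sample uniformly from [0, inv_w + con_w); the inverse value wins
    # on the first inv_w units of weight.
    if rng.random() * (inv_w + con_w) < inv_w:
        return inverse_value
    return consistent_value
```

Note that ratio (100, 0) corresponds exactly to the incomplete configuration discussed above: without the occasional consistent value, completeness is lost.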
In terms of path selection, choosing inputs at random without any prioritization corresponds to the maximum level of nondeterminism. Prioritizing controlling inputs of Boolean operators already reduces nondeterminism during path selection, and prioritizing essential inputs for all word-level operators reduces it even further. Figure 13b shows the influence of decreasing the level of nondeterminism during path selection in terms of the number of solved instances over 21 runs with different seeds and a time limit of 10 s. By default, Pw prioritizes essential inputs for all word-level operators. Utilizing only controlling inputs of Boolean operators already decreases performance, and not prioritizing inputs but choosing randomly decreases performance even further. Prioritizing essential inputs for all word-level operators yields the best results.
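The three path selection strategies can be sketched as follows, assuming the caller has already computed the sets of essential and controlling inputs for the current node; the name and signature are illustrative, not the actual interface.

```python
import random

def select_input(inputs, essential, controlling,
                 strategy="essential", rng=random):
    """Pick the operand through which backtracing continues.
    strategy "random":      choose uniformly among all inputs (maximum
                            nondeterminism)
    strategy "controlling": prefer controlling inputs of Boolean operators
    strategy "essential":   prefer essential inputs of any word-level
                            operator (the default, least nondeterminism)
    Each strategy falls back to a uniform random choice among all inputs
    when its priority set is empty."""
    if strategy == "essential" and essential:
        return rng.choice(sorted(essential))
    if strategy == "controlling" and controlling:
        return rng.choice(sorted(controlling))
    return rng.choice(sorted(inputs))
```

Narrowing the candidate set from all inputs down to essential inputs is what reduces nondeterminism, and, per the results above, what yields the best performance.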
6 Conclusion
In this article, we presented our complete propagation-based local search strategy for the theory of quantifier-free fixed-size bit-vectors, which we previously presented in [35], in more detail.
We defined a complete set of rules for determining backtracing values when propagating assignments towards the primary inputs and provided extensive examples to illustrate the core concepts of our approach. We further provided a more extensive experimental evaluation, including an analysis of randomization effects caused by using different seeds for the random number generator. Motivated by the experimental results in [35], which showed the potential of a sequential portfolio combination of our propagation-based strategy and a state-of-the-art bit-blasting approach, we implemented this combination in our SMT solver Boolector. Our results confirm a considerable gain in performance.
Our procedure was evaluated on problems in the theory of quantifier-free bit-vectors in SMT. However, it is not restricted to bit-vector logics. Applying our strategy to other logics is probably the most intriguing direction for future work.
When combined with bit-blasting, our propagation-based techniques may learn properties of the input formula that might be useful for the bit-blasting engine. We leave learning and passing these properties to the bit-blasting engine to future work. Further, extending our propagation-based techniques with strategies for conflict detection and resolution during backtracing, as well as lemma generation, to obtain an algorithm that is also able to prove unsatisfiability is another challenge for future work. A possible direction would be to incorporate techniques from the MCSat approach for bit-vectors presented in [41].
Finally, we would like to thank Andreas Fröhlich and the reviewers for helpful comments, and Holger Hoos for fruitful discussions on the relation between nondeterministic completeness and the notion of probabilistically approximately complete (PAC).
Notes
All experimental data of this evaluation can be found at http://fmv.jku.at/fmsd16.
References
Balint A, Belov A, Heule MJH, Järvisalo M (eds) (2013) SAT competition 2013, Department of Computer Science Series of Publications B, vol B-2013-1. University of Helsinki
Balint A, Belov A, Järvisalo M, Sinz C (2015) Overview and analysis of the SAT challenge 2012 solver competition. Artif Intell 223:120–155
Balint A, Schöning U (2012) Choosing probability distributions for stochastic local search and the role of make versus break. In: SAT, Lecture Notes in Computer Science, vol 7317, pp 16–29. Springer
Barrett C, Conway CL, Deters M, Hadarean L, Jovanovic D, King T, Reynolds A, Tinelli C (2011) CVC4. In: CAV, Lecture Notes in Computer Science, vol 6806, pp 171–177. Springer
Barrett C, Fontaine P, Tinelli C (2015) The SMT-LIB standard: version 2.5. Technical report, Department of Computer Science, The University of Iowa. https://www.SMTLIB.org
Barrett C, Stump A, Tinelli C (2010) SMT-LIB. https://www.SMTLIB.org
Belov A, Heule MJH, Järvisalo M (eds) (2014) SAT competition 2014, Department of Computer Science Series of Publications B, vol B-2014-2. University of Helsinki
Biere A (2016) Splatz, Lingeling, Plingeling, Treengeling, YalSAT entering the SAT competition 2016. In: SAT competition 2016—solver and benchmark descriptions, Department of Computer Science Series of Publications B, vol B-2016-1, pp 44–45. University of Helsinki
Brummayer R (2009) Efficient SMT solving for bit-vectors and the extensional theory of arrays. Ph.D. thesis, Johannes Kepler University Linz
Bruttomesso R (2008) RTL verification: from SAT to SMT(BV). Ph.D. thesis, University of Trento
Bruttomesso R, Cimatti A, Franzén A, Griggio A, Hanna Z, Nadel A, Palti A, Sebastiani R (2007) A lazy and layered SMT(BV) solver for hard industrial verification problems. In: CAV, Lecture Notes in Computer Science, vol 4590, pp 547–560. Springer
Bruttomesso R, Pek E, Sharygina N, Tsitovich A (2010) The OpenSMT solver. In: TACAS, Lecture Notes in Computer Science, vol 6015, pp 150–153. Springer
Cimatti A, Griggio A, Schaafsma BJ, Sebastiani R (2013) The MathSAT5 SMT solver. In: TACAS, Lecture Notes in Computer Science, vol 7795, pp 93–107. Springer
de Moura LM, Bjørner N (2008) Z3: an efficient SMT solver. In: TACAS, Lecture Notes in Computer Science, vol 4963, pp 337–340. Springer
de Moura LM, Jovanovic D (2013) A modelconstructing satisfiability calculus. In: VMCAI, Lecture Notes in Computer Science, vol 7737, pp 1–12. Springer
Dutertre B (2014) Yices 2.2. In: CAV, Lecture Notes in Computer Science, vol 8559, pp 737–744. Springer
Franzén A (2010) Efficient solving of the satisfiability modulo bit-vectors problem and some extensions to SMT. Ph.D. thesis, University of Trento
Fröhlich A, Biere A, Wintersteiger CM, Hamadi Y (2015) Stochastic local search for satisfiability modulo theories. In: Bonet B, Koenig S (eds) Proceedings of the twentyninth AAAI conference on artificial intelligence, Jan 25–30, 2015, Austin, Texas, USA, pp 1136–1143. AAAI Press
Ganesh V (2007) Decision procedures for bit-vectors, arrays and integers. Ph.D. thesis, Stanford University
Ganesh V, Dill DL (2007) A decision procedure for bit-vectors and arrays. In: CAV, Lecture Notes in Computer Science, vol 4590, pp 519–531. Springer
Godefroid P, Levin MY, Molnar DA (2008) Automated whitebox fuzz testing. In: NDSS. The Internet Society
Goel P (1981) An implicit enumeration algorithm to generate tests for combinational logic circuits. IEEE Trans Comput 30(3):215–222
Griggio A, Phan Q, Sebastiani R, Tomasi S (2011) Stochastic local search for SMT: combining theory solvers with WalkSAT. In: FroCoS, Lecture Notes in Computer Science, vol 6989, pp 163–178. Springer
Hadarean L, Bansal K, Jovanovic D, Barrett C, Tinelli C (2014) A tale of two solvers: eager and lazy approaches to bit-vectors. In: CAV, Lecture Notes in Computer Science, vol 8559, pp 680–695. Springer
Hansen TA (2012) A constraint solver and its application to machine code test generation. Ph.D. thesis, University of Melbourne
Hoos HH (1999) On the runtime behaviour of stochastic local search algorithms for SAT. In: AAAI/IAAI, pp 661–666. AAAI Press/The MIT Press
Huang C, Cheng K (2000) Assertion checking by combined word-level ATPG and modular arithmetic constraint-solving techniques. In: DAC, pp 118–123
Iyer MA (2003) Race: a word-level ATPG-based constraints solver system for smart random simulation. In: ITC, pp 299–308. IEEE Computer Society
Jovanović D, Barrett C, de Moura L (2013) The design and implementation of the model constructing satisfiability calculus. In: Formal methods in computeraided design, FMCAD 2013, Portland, OR, USA, Oct 20–23, 2013, pp 173–180. IEEE
Kroening D, Strichman O (2008) Decision procedures—an algorithmic point of view. Texts in theoretical computer science. An EATCS series. Springer, Heidelberg
Kuehlmann A, Paruthi V, Krohm F, Ganai MK (2002) Robust Boolean reasoning for equivalence checking and functional property verification. IEEE Trans CAD Integr Circuits Syst 21(12):1377–1394
Kunz W, Stoffel D (1997) Reasoning in Boolean networks: logic synthesis and verification using testing techniques. Kluwer Academic Publishers, Norwell
Naveh Y, Rimon M, Jaeger I, Katz Y, Vinov M, Marcus E, Shurek G (2007) Constraintbased random stimuli generation for hardware verification. AI Mag 28(3):13–30
Niemetz A, Preiner M, Biere A (2015) Boolector 2.0. JSAT 9:53–58
Niemetz A, Preiner M, Biere A (2016) Precise and complete propagation based local search for satisfiability modulo theories. In: CAV (1), Lecture Notes in Computer Science, vol 9779, pp 199–217. Springer
Niemetz A, Preiner M, Biere A, Fröhlich A (2015) Improving local search for bit-vector logics in SMT with path propagation. In: DIFTS@FMCAD, pp 1–10
Selman B, Kautz HA, Cohen B (1994) Noise strategies for improving local search. In: AAAI, pp 337–343. AAAI Press/The MIT Press
Tillmann N, Schulte W (2005) Parameterized unit tests. In: ESEC/SIGSOFT FSE, pp 253–262. ACM
Xu L, Hutter F, Hoos HH, Leyton-Brown K (2008) SATzilla: portfolio-based algorithm selection for SAT. J Artif Intell Res (JAIR) 32:565–606
Yuan J, Pixley C, Aziz A (2006) Constraintbased verification. Springer, Berlin
Zeljic A, Wintersteiger CM, Rümmer P (2016) Deciding bit-vector formulas with mcSAT. In: SAT, Lecture Notes in Computer Science, vol 9710, pp 249–266. Springer
Acknowledgements
Open access funding provided by Austrian Science Fund (FWF).
Supported by Austrian Science Fund (FWF) under NFN Grant S11408-N23 (RiSE).
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.
Niemetz, A., Preiner, M. & Biere, A. Propagation based local search for bit-precise reasoning. Form Methods Syst Des 51, 608–636 (2017). https://doi.org/10.1007/s10703-017-0295-6