Propagation based local search for bit-precise reasoning

Many applications of computer-aided verification require bit-precise reasoning as provided by satisfiability modulo theories (SMT) solvers for the theory of quantifier-free fixed-size bit-vectors. The current state-of-the-art in solving bit-vector formulas in SMT relies on bit-blasting, where a given formula is eagerly translated into propositional logic (SAT) and handed to an underlying SAT solver. Bit-blasting is efficient in practice, but may not scale if the input size cannot be reduced sufficiently during preprocessing. A recent score-based local search approach lifts stochastic local search from the bit-level (SAT) to the word-level (SMT) without bit-blasting and proved to be quite effective on hard satisfiable instances, particularly in the context of symbolic execution. However, it still relies on brute-force randomization and restarts to achieve completeness. Guided by a completeness proof, we simplified, extended and formalized our propagation-based variant of this approach. We obtained a clean, simple and more precise algorithm that does not rely on score-based local search techniques and does not require brute-force randomization or restarts to achieve completeness. It further yields a substantial gain in performance. In this article, we present and discuss our complete propagation-based local search approach for bit-vector logics in SMT in detail. We further provide an extended and extensive experimental evaluation including an analysis of randomization effects.


Introduction
A majority of applications in the field of hardware and software verification requires bit-precise reasoning as provided by satisfiability modulo theories (SMT) solvers for the quantifier-free theory of fixed-size bit-vectors. In many of these applications, e.g., (constrained random) test case generation [33,38,40] or white box fuzz testing [21], a majority of the problems is satisfiable. For problems of this kind, local search procedures are useful even though they cannot determine unsatisfiability. Previous work [18,36] showed that local search approaches for bit-vector logics in SMT are orthogonal to other approaches, which suggests that they are particularly beneficial within a portfolio setting [36].
Current state-of-the-art SMT solvers for the quantifier-free theory of fixed-size bit-vectors [4,13,14,16,34] employ the so-called bit-blasting approach (e.g., [30]), where an input formula is eagerly translated to propositional logic (SAT) and handed to an underlying SAT solver. While efficient in practice, bit-blasting approaches heavily rely on rewriting and other techniques [9-12,17,19,20,24,25] to simplify the input during preprocessing and may not scale if the input size cannot be reduced sufficiently. In [18], Fröhlich et al. proposed to attack the problem from a different angle and presented a score-based local search approach for bit-vector logics, which lifts stochastic local search (SLS) from the bit-level (SAT) to the word-level (SMT) without bit-blasting. In previous years, and in particular since the SAT challenge 2012 [2], a new generation of SLS for SAT solvers with very simple architecture [3] achieved remarkable results not only in the random but also in the combinatorial tracks of recent SAT competitions [1,2,7]. Previous attempts to utilize SLS techniques in SMT by integrating an SLS SAT solver into the DPLL(T)-framework of the SMT solver MathSAT [13] were not able to compete with bit-blasting [23]. In contrast, the word-level local search approach in [18] showed promising initial results. However, [18] does not fully exploit the word-level structure but rather simulates bit-level local search by focusing on single bit flips.
Hence, in [36], we proposed a propagation-based extension of [18], which introduced an additional strategy to propagate assignments from the outputs to the inputs. This significantly improved performance. Our results further suggested that these techniques may be beneficial in a sequential portfolio setting [39] in combination with bit-blasting. However, down-propagating assignments as presented in [36] utilizes inverse value computation only, which can get stuck if no inverse value can be found. In that case, [36] still falls back on score-based local search techniques and requires brute-force randomization and restarts to achieve completeness, as does [18]. Further, inverse value computation as presented in [36] is too restrictive for some operators, and focusing only on inverse values when down-propagating assignments is incomplete and may inadvertently prune the search.
In this paper, guided by a formal completeness proof, we present a simple, precise and complete local search variant of the procedure proposed in [36]. Our approach does not use score-based local search techniques as described in [18] but relies on propagation of assignments only. It further does not require brute-force randomization or restarts to achieve completeness. To determine propagation paths, we extend the concept of controlling inputs to the word-level, which allows us to further prune the search. To propagate assignments down, we lift the concept of "backtracing" of Automatic Test Pattern Generation (ATPG) [32], which goes back to the PODEM algorithm [22], to the word-level. We further provide a formalization of backtracing for the bit-level and the word-level. Note that in contrast to backtracing in ATPG, our algorithm works with complete assignments. Existing algorithms for word-level ATPG [27,28] are based on branch-and-bound, use neither backtracing nor complete assignments, and in general lack formal treatment.
We implemented our techniques in our SMT solver Boolector [34] and show that combining our propagation-based approach with bit-blasting within a sequential portfolio setting is beneficial in terms of performance. We provide an extensive experimental evaluation, including an analysis of randomization effects as a result of different seeds for the random number generator, in particular in comparison to the score-based local search approach in [18] as implemented in Boolector. Our results show that our techniques yield a substantial gain in performance.
This article extends and revises work presented earlier in [35]. We provide a more detailed description of the propagation-based local search approach introduced in [35], including extensive examples illustrating the core concepts of our approach. We further include a complete set of rules for determining assignments during backtracing. Our previous experimental evaluation of a sequential portfolio combination of our propagation-based technique with bit-blasting was a virtual experiment. For this paper, we implemented such a sequential portfolio combination within Boolector and provide an extensive experimental evaluation of our techniques. This evaluation includes an in-depth analysis of the performance of our propagation-based local search approach compared to the score-based local search approach presented in [18] and the evaluation of randomization effects of both techniques, which were not included in previous work.

Overview
Our propagation-based local search procedure is based on propagating target assignments from the outputs to the inputs and does not need to rely on restarts or brute-force randomization to achieve completeness. Local search procedures are in general incomplete in the sense that they cannot determine unsatisfiability. Hence, in the following, we restrict our notion of completeness to satisfiable input problems and use it synonymously to the more established property of probabilistically approximately complete (PAC) [26], which is commonly used in the AI community to discuss completeness properties of local search algorithms. It follows the traditional notion of non-deterministic computation of Turing machines, which entails that we treat probabilistic choices as non-deterministic choices [26].
The basic idea of our approach is illustrated in Fig. 1 and described more precisely in pseudo code in Fig. 2. It is applied to propositional formulas (the bit-level) and quantifier-free bit-vector formulas (the word-level) as follows.
Given a formula φ, we assume without loss of generality that φ is a directed acyclic graph (DAG) with a single root r (the so-called root constraint or output of φ). We use the letter σ to refer to complete but non-satisfying assignments to all inputs and operators in φ. We further identify complete satisfying assignments with the letter ω. Starting from a random but non-satisfying initial assignment σ_1 with σ_1(r) = 0, our goal is to reach a satisfying assignment ω with ω(r) = 1 by iteratively changing the values of primary inputs. We identify ω(r) = 1 as the target value of output r (line 3), denoted as 0 ⇝ 1 in Fig. 1, and propagate this value along a path towards the primary inputs (lines 4-7). We also refer to this process as "backtracing" [22]. Recursively propagating target value ω(r) = 1 from the output to the primary inputs yields a new value x_i ≠ σ_i(v_i) for an input v_i (e.g., x_1 for v_1 in Fig. 1). By updating assignment σ_i on input v_i to σ_{i+1}(v_i) = x_i (e.g., σ_2(v_1) = x_1 in Fig. 1) without changing the value of other primary inputs but recomputing consistent values for inner nodes (lines 8-9), we move from σ_i to σ_{i+1} and repeat this process until we reach a satisfying assignment, i.e., σ_{i+1} = ω.

Fig. 1 Basic idea of our propagation-based local search approach. Starting from an unsatisfying assignment σ_1, we force root r to assume its target value ω(r) = 1 and iteratively propagate this information towards the inputs until we find a solution ω.

When down-propagating assignments, we identify path selection (line 5) and selecting the value to propagate (line 6) as the only sources of non-determinism. However, we aim to maximally reduce non-deterministic choices without sacrificing completeness. Hence, on the bit-level, path selection prioritizes controlling inputs w.r.t. the current assignment, a well-known concept from ATPG, while value selection for a selected input is uniquely defined.
On the word-level, we introduce the corresponding new notion of essential inputs, which lifts the bit-level concept of controlling inputs to the word-level, and restrict value selection to the computation of what we refer to as consistent and inverse values.
As expected for local search, our propagation-based approach is not able to determine unsatisfiability. Thus the algorithm in Fig. 2 does not terminate in case that a given input formula is unsatisfiable. When determining satisfiability, however, our propagation-based local search approach is complete (PAC), i.e., there exists a non-deterministic choice of moves that leads to a solution.
In the following, we first introduce and formalize our propagation-based approach on the bit-level and prove its completeness. We then lift it to the word-level, and prove its completeness on the word-level. We further analyze randomization effects as result of using different seeds for the random number generator and show that our techniques yield substantial performance improvements, in particular in combination with bit-blasting within a sequential portfolio setting.

Bit-level
For the sake of simplicity and without loss of generality we consider a fixed Boolean formula φ and restrict the set of Boolean operators to {∧, ¬}. We interpret φ as a single-rooted And-Inverter-Graph (AIG) [31], where an AIG is a DAG represented as a 5-tuple (r, N , G, V, E).
The set of nodes N = G ∪ V contains the single root node r ∈ N, and is further partitioned into a set of gates G and a set of primary inputs (or variables) V. We require that the set of variables is non-empty, i.e., V ≠ ∅, and assume that the Boolean constants B = {0, 1}, i.e., {false, true}, do not occur in N. This assumption is without loss of generality since every occurrence of true and false as input to a gate g ∈ G can be eliminated through rewriting.
The set of gates G = A ∪ I consists of a set of and-gates A and a set of inverter-gates I. We write g = n ∧ m if g ∈ A, and g = ¬n if g ∈ I. We further refer to the children of a node g ∈ G as its (gate) inputs (e.g., n or m). Let E = E_A ∪ E_I be the edge relation between nodes, with E_A : A → N^2 and E_I : I → N describing edges from and- resp. inverter-gates to their input(s). We write E(g) = (n, m) for g = n ∧ m and E(g) = n for g = ¬n and further introduce the notation g → n for an edge between a gate g and one of its inputs n.
We define a complete assignment σ of the given fixed formula φ as a complete function σ : N → B. Similarly, a partial assignment α of formula φ is defined as a partial function α : N → B. We say that a complete assignment σ is consistent on a gate g ∈ I with g = ¬n iff σ (g) = ¬σ (n), and consistent on a gate g ∈ A with g = n ∧ m iff σ (g) = σ (n) ∧ σ (m).
A complete assignment σ is globally consistent on φ (or just consistent) iff it is consistent on all gates g ∈ G. An assignment σ is satisfying if it is consistent (thus complete) and satisfies the root, i.e., σ (r ) = 1. We use the letter ω to denote a satisfying assignment. A formula φ is satisfiable if it has a satisfying assignment. We use C to denote the set of consistent assignments, and W with W ⊆ C to denote the set of satisfying assignments of formula φ.
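The notions of consistency and satisfaction above can be made concrete with a small sketch. The encoding below (gates as tuples keyed by hypothetical node names, with every other name treated as a primary input) is an assumption made for illustration only and is not taken from Boolector.

```python
from typing import Dict, Tuple

# Hypothetical encoding: a gate is ("and", n, m) or ("not", n); every node
# name that is not a key of `gates` is treated as a primary input.
Gate = Tuple[str, ...]


def evaluate(gates: Dict[str, Gate], inputs: Dict[str, int]) -> Dict[str, int]:
    """Extend values of the primary inputs to a complete, consistent assignment."""
    sigma = dict(inputs)

    def value(node: str) -> int:
        if node not in sigma:
            gate = gates[node]
            if gate[0] == "not":
                sigma[node] = 1 - value(gate[1])
            else:  # and-gate
                sigma[node] = value(gate[1]) & value(gate[2])
        return sigma[node]

    for g in gates:
        value(g)
    return sigma


def is_satisfying(gates: Dict[str, Gate], root: str, inputs: Dict[str, int]) -> bool:
    """A consistent assignment is satisfying iff it maps the root to 1."""
    return evaluate(gates, inputs)[root] == 1


# Example AIG for r = a AND (NOT b)
GATES = {"nb": ("not", "b"), "r": ("and", "a", "nb")}
```

Since the values of all gates follow from the values of the primary inputs, a consistent assignment is fully described by its restriction to V, which is why a move only needs to name a variable.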
Given two consistent assignments σ and σ', we say that σ' is obtained from σ by flipping the assignment of a variable v ∈ V, written as σ -(v)→ σ', if σ'(v) = ¬σ(v) and σ'(u) = σ(u) for all other variables u ∈ V. We also refer to flipping a variable as a move. Note that σ'(g) for gates g ∈ G is defined implicitly due to consistency of assignment σ' after fixing the values for the primary inputs V.
Given a set of variables V that can be flipped non-deterministically, let S : C → P(M) be a (local search) strategy that maps a consistent assignment to a set of possible moves M = V. Note that in general, there exist different notions of strategy, e.g., as in the context of game theory or synthesis. In the context of local search, using the term "strategy" as defined above is well established, e.g., [26,37]. Further note that since a move corresponds to flipping a variable, the set of possible moves M corresponds to the set of variables V and is redundant on the bit-level. However, we use the same notation on the word-level, where M captures the set of moves valid under a strategy as a set of input value pairs, since a word-level move additionally requires identifying the new value of an input.

Definition 2 (Distance-Reducing Strategy) A strategy S is (non-deterministically) distance reducing, if for all assignments σ ∈ C\W there exists a satisfying assignment ω ∈ W and a move σ -(v)→ σ' valid under S which reduces the Hamming distance. That is, move v ∈ M is in Δ(σ, ω), the set of variables on which σ and ω differ, thus HD(σ, ω) − HD(σ', ω) = 1.
Obviously, any distance reducing strategy can reach a satisfying assignment (though not necessarily ω) within at most HD(σ, ω) moves. This first observation is the key argument in the completeness proofs for our propagation based strategies (both on the bit-level and word-level).
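The bound of HD(σ, ω) moves can be illustrated with a few lines of Python. This is a minimal sketch under the assumption that the Hamming distance counts the variables on which two assignments differ; the helper names `delta`, `hamming_distance` and `flip` are hypothetical.

```python
def delta(sigma, omega, variables):
    """Set of variables on which the two assignments differ."""
    return {v for v in variables if sigma[v] != omega[v]}


def hamming_distance(sigma, omega, variables):
    """Number of variables on which the two assignments differ."""
    return len(delta(sigma, omega, variables))


def flip(sigma, v):
    """Return a copy of sigma with the Boolean value of variable v flipped."""
    flipped = dict(sigma)
    flipped[v] = 1 - flipped[v]
    return flipped
```

Flipping a variable from the difference set lowers the distance by exactly one, whereas flipping any other variable raises it by one. A strategy that always offers at least one move of the first kind can therefore reach a satisfying assignment in at most HD(σ, ω) moves.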

Proposition 3 A distance reducing strategy is also complete.
In the following, our ultimate goal is to define a strategy that maximally reduces non-deterministic choices without sacrificing completeness. In the algorithm shown in Fig. 2, path selection (selecting the backtracing node in line 5) and value selection (selecting the backtracing value in line 6) while down-propagating assignments constitute the only sources of non-determinism. As we will show later, in contrast to value selection on the word-level, selecting a backtracing value on the bit-level is uniquely defined. When selecting a backtracing node on the bit-level, non-determinism can be reduced by utilizing the notion of controlling inputs from ATPG [32], which is defined as follows.
Definition 4 (Controlling Input) Let node n ∈ N be an input of a gate g ∈ G, i.e., g → n, and let σ be a complete assignment consistent on g. We say that input n is controlling under σ if for all complete assignments σ' consistent on g with σ(n) = σ'(n) we have σ(g) = σ'(g).
In other words, gate input n is controlling if the assignment of g (i.e., its output value) remains unchanged as long as the assignment of n does not change.
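The definition of controlling inputs can be checked by brute force on a single gate. The sketch below assumes a hypothetical tuple encoding of gates, ("and", n, m) or ("not", n), with distinct input names; it fixes the value of one input and varies all others.

```python
from itertools import product


def gate_output(gate, values):
    """Output of an and-gate or inverter-gate for the given input values."""
    kind, *ins = gate
    if kind == "not":
        return 1 - values[ins[0]]
    return values[ins[0]] & values[ins[1]]


def is_controlling(gate, sigma, n):
    """Check the definition directly: keep sigma(n) fixed, vary the other inputs,
    and test whether the gate output can differ from its value under sigma."""
    _, *ins = gate
    others = [m for m in ins if m != n]
    expected = gate_output(gate, sigma)
    for bits in product((0, 1), repeat=len(others)):
        trial = {n: sigma[n], **dict(zip(others, bits))}
        if gate_output(gate, trial) != expected:
            return False
    return True
```

For an and-gate this reproduces the familiar rule that exactly the inputs with value 0 are controlling, and that no input is controlling when the output is 1. The single input of an inverter is always controlling.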
Given an assignment σ consistent on g ∈ G, we denote a target value t for gate g as σ(g) ⇝ t. On the bit-level, target value t is implicitly given through assignment σ as t = ¬σ(g), i.e., t cannot be reached as long as the controlling inputs of g remain unchanged. On the word-level, t may be any value t ≠ σ(g).

Example 5 Figure 3 shows all possible assignments σ consistent on a gate g ∈ G. At the outputs we denote current assignment σ(g) and target value t as σ(g) ⇝ t with t = ¬σ(g), e.g., 0 ⇝ 1. At the inputs we show their assignment under σ. All controlling inputs are indicated with an underline. Note that and-gate g = n ∧ m has no controlling inputs for σ(g) = 1.

We define a sequence of nodes π = (n_1, ..., n_k) ∈ N^+ as a path of length k with k = |π| iff n_i → n_{i+1} for 0 < i < k, also written as n_1 → ... → n_k. A path π is rooted if n_1 = r, and (fully) expanded if n_k ∈ V. We refer to n_k ∈ V as the leaf of π in this case. As a restriction on φ, we require all nodes n ∈ N to be reachable from the root r, i.e., there exists a path π such that π = (r, ..., n). We further require all paths to be acyclic, i.e., for all n ∈ N there exists no path n →+ n. Note that as a consequence of representing φ as a DAG, this is the case for any path in φ. Given a path π = (..., g) with gate g ∈ G and g → n, we say that π.n = (..., g, n) is an extension of path π with node n.

Definition 6 (Path Selection) Given a complete assignment σ ∈ C and a path π = (..., g) as above. Gate input n can be selected w.r.t. assignment σ to extend path π to π.n if input n is controlling or if gate g has no controlling input.
Path selection based on the notion of controlling inputs as introduced above exploits observability don't cares as defined in the context of ATPG [32]. Similarly, we adopt the ATPG idea of backtracing [22,32] as follows.

Definition 7 (Backtracing Step) Given a complete assignment σ ∈ C and a gate g ∈ G with g → n. A backtracing step w.r.t. assignment σ selects gate input n w.r.t. assignment σ as in Definition 6 and determines a backtracing value x for input n as follows. If g = ¬n, then x = σ(g). Else, if g = n ∧ m, then x = ¬σ(g).
As an important observation it turns out that performing a bit-level backtracing step always flips the value of the selected input under σ . For a selected input, the backtracing value is therefore always unique. This can be checked easily by considering all possible scenarios shown in Fig. 3.
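The observation that a backtracing step always flips the selected input can be verified exhaustively. The following sketch implements path and value selection for a single gate under a hypothetical tuple encoding, ("and", n, m) or ("not", n); the function names are illustrative only.

```python
def selectable_inputs(gate, sigma):
    """Path selection: controlling inputs if any exist, otherwise all inputs."""
    kind, *ins = gate
    if kind == "not":
        return list(ins)
    controlling = [n for n in ins if sigma[n] == 0]
    return controlling or list(ins)


def backtracing_value(gate, sigma_g):
    """Value selection: sigma(g) for an inverter, the negation of sigma(g)
    for an and-gate."""
    return sigma_g if gate[0] == "not" else 1 - sigma_g


def check_flips(gate, sigma, sigma_g):
    """True iff the backtracing value flips every selectable input."""
    x = backtracing_value(gate, sigma_g)
    return all(x == 1 - sigma[n] for n in selectable_inputs(gate, sigma))
```

Running `check_flips` over the four consistent assignments of an and-gate and the two of an inverter covers the same scenarios that the case analysis over Fig. 3 refers to.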
We define the root trace ρ = ((r), {r ↦ 1}) as a trace that maps root r to its target value ω(r) = 1. A propagation trace w.r.t. assignment σ is a (possibly partial) trace τ that starts from the root trace ρ and is extended by propagation steps w.r.t. assignment σ, denoted as ρ →* τ.
Note that given a path π and σ , partial assignment α is redundant on the bit-level. However, we use the same notation on the word-level, where α captures updates to σ along π, which (in contrast to the bit-level) are not uniquely defined.
Definition 10 (Propagation Strategy) Given a non-satisfying but consistent assignment σ ∈ C\W, the set of valid moves P(σ) for σ under propagation strategy P contains exactly the leaves of all expanded propagation traces w.r.t. σ.
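A minimal sketch of the propagation strategy on the bit-level may help to see how path selection determines the set of valid moves. It assumes a hypothetical tuple encoding of AIGs and resolves non-deterministic choices with a seeded random number generator. The `max_moves` bound is an addition for illustration, since the procedure does not terminate on unsatisfiable inputs.

```python
import random


def evaluate(gates, inputs):
    """Complete, consistent assignment induced by the primary input values."""
    sigma = dict(inputs)

    def value(node):
        if node not in sigma:
            kind, *ins = gates[node]
            vals = [value(n) for n in ins]
            sigma[node] = 1 - vals[0] if kind == "not" else vals[0] & vals[1]
        return sigma[node]

    for g in gates:
        value(g)
    return sigma


def selectable_inputs(gate, sigma):
    """Controlling inputs if any exist, otherwise all inputs."""
    kind, *ins = gate
    controlling = list(ins) if kind == "not" else [n for n in ins if sigma[n] == 0]
    return controlling or list(ins)


def valid_moves(gates, root, sigma):
    """Leaves of all expanded propagation traces; assumes sigma[root] == 0."""
    leaves, seen, stack = set(), set(), [root]
    while stack:
        node = stack.pop()
        if node in seen:
            continue
        seen.add(node)
        if node in gates:
            stack.extend(selectable_inputs(gates[node], sigma))
        else:
            leaves.add(node)
    return leaves


def local_search(gates, root, inputs, seed=0, max_moves=1000):
    """Flip one valid leaf per move until the root evaluates to 1."""
    rng = random.Random(seed)
    inputs = dict(inputs)
    for _ in range(max_moves):
        sigma = evaluate(gates, inputs)
        if sigma[root] == 1:
            return inputs
        v = rng.choice(sorted(valid_moves(gates, root, sigma)))
        inputs[v] = 1 - inputs[v]  # the bit-level backtracing value is the flip
    return None
```

For r = a ∧ ¬b and the initial values a = 0, b = 1, both inputs of r are controlling, so both variables are valid moves. After either move exactly one controlling input remains, and the second move reaches the satisfying assignment a = 1, b = 0.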
In the following, we present and prove the main Lemma of this paper, which then immediately gives completeness of strategy P in Theorem 12. It is reused for proving completeness of the extension of P to the word-level in Theorem 25.
Fig. 4 The basic idea of our completeness proof, using a satisfying assignment ω as an oracle.

Lemma 11 (Propagation Lemma) Given a non-satisfying but consistent assignment σ ∈ C\W, then for any satisfying assignment ω ∈ W, used as an oracle, there exists a fully expanded propagation trace τ w.r.t. σ with leaf v ∈ Δ(σ, ω).

Proof. We show by induction that there exists a propagation trace τ = (π, α) w.r.t. σ that satisfies the invariant

α(n) = ω(n) ≠ σ(n) for all n ∈ π. (1)

The root trace ρ = ((r), {r ↦ ω(r)}) obviously satisfies this invariant. Now, let τ = (π, α) be a trace that satisfies the invariant but is not fully expanded, i.e., π = (r, ..., g) with g ∈ G and α(g) = ω(g) ≠ σ(g). Since σ(g) ≠ ω(g) it follows that g has at least one input n with σ(n) ≠ ω(n). If g has no controlling input, then by Definition 6 it is allowed to select n as an input with σ(n) ≠ ω(n). Otherwise, input n is selected as any controlling input, which necessarily satisfies σ(n) ≠ ω(n), since ω(n) = σ(n) would imply ω(g) = σ(g) by Definition 4. In both cases we select x = ω(n) ≠ σ(n) as backtracing value using Proposition 8. Trace τ is now extended by n with backtracing value x to τ', i.e., τ → τ', which in turn concludes the inductive proof of Eq. (1). Any fully expanded propagation trace τ = (π, α) with leaf v ∈ V, as generated above, also satisfies the invariant in Eq. (1). Thus, we have α(v) = ω(v) ≠ σ(v), and therefore v ∈ Δ(σ, ω).

In essence, given assignment σ and ω as above, our propagation strategy propagates target value ω(r) from root r towards the primary inputs, ultimately producing a fully expanded propagation trace τ = (π, α). In case of non-deterministic choices for extending the trace we use ω as an oracle to pick an input n with σ(n) ≠ ω(n), which can be selected according to Definition 6. The oracle allows us to ensure that for all nodes n ∈ π, α(n) = ω(n), which yields α(v) = ω(v) ≠ σ(v) and consequently v ∈ Δ(σ, ω) for leaf v of τ. Thus, using Lemma 11, our propagation strategy turns out to be distance reducing, and therefore, according to Proposition 3, complete. Figure 4 illustrates the basic idea of our proof, which, in the following, serves as a basis for lifting our approach from the bit-level to the word-level.

Theorem 12 Under the assumptions of the previous Lemma 11 we also get v ∈ P(σ) for leaf v. Thus, P is distance reducing and, as a consequence, complete.

Table 1 The set of considered bit-vector operators (w, p, q, i, j ∈ N). Columns: Operator, SMT-LIB, Arity, Bit-width (Output, Input).

Word-level
In the following, we only consider bit-vector expressions of fixed bit-width w ∈ N. We denote a bit-vector expression n of width w as n_[w], but will omit the bit-width if the context allows. We refer to the i-th bit of n_[w] as n[i] with 1 ≤ i ≤ w and, for the sake of simplicity, define bit indices as starting from 1 rather than 0. We interpret n[1] as the least significant bit (LSB) and n[w] as the most significant bit (MSB), and denote bit ranges over n from bit index j down to index i as n[j : i]. In string representations of bit-vectors, we interpret the bit at the far left index as the MSB and the bit at the far right index as the LSB. We further define ctz to be the common function that counts the number of trailing zeroes of a given bit-vector, i.e., the number of consecutive 0-bits starting from the LSB, e.g., ctz(0101) = 0 and ctz(111100) = 2. Similarly, clz is the common function to compute the number of leading zeroes, i.e., the number of consecutive 0-bits starting from the MSB, e.g., clz(0101) = 1 and clz(111100) = 0.
For the sake of simplicity and without loss of generality we consider a fixed single-rooted quantifier-free bit-vector formula φ and interpret Boolean expressions as bit-vector expressions of bit-width one. The set of bit-vector operators is restricted to O = {&, ∼, =, <, <<, >>, +, ·, ÷, mod, •, [ : ], if-then-else} and interpreted according to Table 1. The selection of operators in O is rather arbitrary but provides a good compromise between effective and efficient word-level rewriting and compact encodings for bit-blasting approaches. It is complete, though, in the sense that all operators defined in SMT-LIB [5] (in particular signed operators) can be modeled in a compact way. Note that our methods are not restricted to single-rootedness or this particular selection of operators, and can easily be lifted to any other set of operators or the multi-rooted case.
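The indexing conventions and the functions ctz and clz can be sketched on binary strings as follows. The helper names `bit` and `bit_range` are hypothetical and only mirror the notation n[i] and n[j : i] with 1-based indices and the MSB on the left.

```python
def ctz(bits: str) -> int:
    """Number of consecutive 0-bits starting from the LSB (right end)."""
    return len(bits) - len(bits.rstrip("0"))


def clz(bits: str) -> int:
    """Number of consecutive 0-bits starting from the MSB (left end)."""
    return len(bits) - len(bits.lstrip("0"))


def bit(bits: str, i: int) -> int:
    """n[i] with 1-based index i, where index 1 is the LSB."""
    return int(bits[len(bits) - i])


def bit_range(bits: str, j: int, i: int) -> str:
    """n[j : i], the slice from bit index j down to bit index i."""
    return bits[len(bits) - j: len(bits) - i + 1]
```

With these helpers, `ctz("111100")` evaluates to 2 and `clz("0101")` to 1, matching the examples in the text.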
We interpret formula φ as a single-rooted DAG represented as an 8-tuple (r, N, κ, O, F, V, B, E). The set of nodes N = O ∪ V ∪ B contains the single root node r ∈ N of bit-width one, and is further partitioned into a set of operator nodes O, a set of primary inputs (or bit-vector variables) V, and a set of bit-vector constants B ⊆ B*, which are denoted in either decimal or binary notation if the context allows. The bit-width of a node is given by κ : N → N, thus κ(r) = 1. Operator nodes are interpreted as bit-vector operators via F : O → O, which in turn determines their arity and input and output bit-widths as defined in Table 1. The edge relation E between nodes describes the set of edges from unary, binary, and ternary operator nodes to their input(s), respectively. We again use the notation o → n for an edge between an operator node o and one of its inputs n.
We only consider well-formed formulas, where the bit-widths of all operator nodes and their inputs conform to the conditions imposed via interpretation F as defined in Table 1. For instance, we denote a bit-vector addition node o with inputs n and m as o = n + m, where o ∈ O of arity 2 with F(o) = +, and therefore κ(o) = κ(n) = κ(m). In the following, if more convenient we will use the functional notation o = ⋄(n_1, ..., n_k) for operator node o ∈ O of arity k with inputs n_1, ..., n_k and F(o) = ⋄, e.g., +(n, m). Note that the semantics of all operators in O correspond to their SMT-LIB counterparts listed in Table 1, with three exceptions. Given a logical shift operation n << m or n >> m, w.l.o.g. and as implemented in our SMT solver Boolector [34], we restrict bit-width κ(n) to 2^κ(m). Further, as implemented by Boolector and other state-of-the-art SMT solvers, e.g., MathSAT [13], Yices [16] and Z3 [14], we define an unsigned division by zero to return the greatest possible value rather than introducing uninterpreted functions, i.e., x ÷ 0 = ∼0.

A complete assignment σ of a given fixed φ is a complete function σ : N → B* with σ(n) ∈ B^κ(n), and a partial assignment is a partial function α : N → B* with α(n) ∈ B^κ(n). A complete assignment σ is consistent on a bit-vector operator node o ∈ O with o = ⋄(n_1, ..., n_k) iff σ(o) = ⋄(σ(n_1), ..., σ(n_k)), where the result is determined by the semantics of operator ⋄ as defined in the SMT-LIB standard [5] (with the exceptions discussed above).
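The two conventions stated explicitly above, the bit-width restriction for shifts and the result of an unsigned division by zero, can be sketched on Python integers. The function names and the explicit width parameters are assumptions made for this illustration.

```python
def udiv(x: int, y: int, w: int) -> int:
    """Unsigned division on w-bit values; division by zero yields all ones."""
    mask = (1 << w) - 1
    return mask if y == 0 else (x // y) & mask


def shl(x: int, shift: int, w: int, shift_width: int) -> int:
    """Logical left shift of a w-bit value by a shift_width-bit shift amount."""
    # Restriction from the text: the shifted operand has width 2^(shift width).
    if w != 1 << shift_width:
        raise ValueError("bit-width of shifted operand must be 2**shift_width")
    return (x << shift) & ((1 << w) - 1)


def lshr(x: int, shift: int, w: int, shift_width: int) -> int:
    """Logical right shift under the same bit-width restriction."""
    if w != 1 << shift_width:
        raise ValueError("bit-width of shifted operand must be 2**shift_width")
    return x >> shift
```

For a 2-bit operand the shift amount is a single bit, so `shl(0b01, 1, 2, 1)` yields `0b10`, and `udiv(x, 0, w)` returns the greatest w-bit value for every x.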
A complete assignment is (globally) consistent on φ (or just consistent), iff it is consistent on all bit-vector operator nodes o ∈ O and σ (b) = b for all bit-vector constants b ∈ B. A satisfying assignment ω is a complete and consistent assignment that satisfies the root, i.e., ω(r ) = 1. In the following, we will again use the letter C to denote the set of complete and consistent assignments, and the letter W with W ⊆ C to denote the set of satisfying assignments of formula φ.
Given a bit-vector variable v ∈ V with κ(v) = w and assignments σ, σ' ∈ C. We adopt the notion of obtaining assignment σ' from assignment σ by assigning a new value x to variable v with x ∈ B^w and x ≠ σ(v), written as σ -(v ↦ x)→ σ', which we refer to as a move. The set of word-level moves is thus defined as M = {(v, x) | v ∈ V, x ∈ B^κ(v)}, and accordingly, a word-level strategy S is defined as a function S : C → P(M), which maps a consistent assignment to a set of moves.

We lift propagation strategy P from the bit-level to the word-level by first introducing our new notion of essential inputs, which lifts and extends the bit-level notion of controlling inputs to the word-level. In other words, an input n to an operator node o is essential w.r.t. some target value t if o cannot assume t as long as the assignment of n does not change. As an example, consider the bit-vector operators and their essential inputs under some consistent assignment σ w.r.t. some target value t as depicted in Fig. 5.

[...] Input m, however, is not essential, since it is possible to simply select, e.g., x = 01 for n such that t = 10 = x << σ(m) = 01 << 1.
(d) Given o := n · m with t = 10 and σ = {o ↦ 00, n ↦ 00, m ↦ 10}. Input n is essential since t ≠ 00 but σ(n) = 00, and thus, it is not possible to find a value x for m such that σ(n) · x = t. Input m, however, is not essential since we could pick, e.g., x = 01 for n to obtain t = 10 = x · σ(m) = 01 · 10.
(e) Given o := n ÷ m with t = 10 and σ = {o ↦ 01, n ↦ 01, m ↦ 01}. Input n is essential since σ(n) < t, and thus, it is not possible to find a value x for m such that σ(n) ÷ x = t. Input m, however, is not essential, since we could pick, e.g., x = 10 for n to obtain t = 10 = x ÷ σ(m) = 10 ÷ 01.
(f) Given o := n mod m with t = 10 and σ = {o ↦ 00, n ↦ 01, m ↦ 01}. Since σ(n) = 01 < 10 = t, it is not possible to find a value x for m such that σ(n) mod x = t. However, since σ(m) = 01 but t ≠ 00, it is also not possible to find a value x for n such that x mod σ(m) = t. Hence, both inputs are essential.
[...] and thus, it is not possible to find a value x for m such that σ(n) • x = t. Input m, however, is not essential since it already matches the corresponding slice of the target value.
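For small bit-widths, the informal characterization of essential inputs can be checked by enumeration. The sketch below is a brute-force illustration for 2-bit binary operators, not the rule-based check used in practice. It assumes that the remainder by zero returns the first operand, as for bvurem in SMT-LIB; this convention is not spelled out in the text above.

```python
W = 2
MASK = (1 << W) - 1

# Operator semantics on W-bit values.
OPS = {
    "mul": lambda a, b: (a * b) & MASK,
    "udiv": lambda a, b: MASK if b == 0 else a // b,  # x / 0 = all ones
    "urem": lambda a, b: a if b == 0 else a % b,      # assumption: x mod 0 = x
}


def is_essential(op, values, pos, t):
    """values = (sigma(n), sigma(m)); pos selects the input under test.

    The input at position pos is essential w.r.t. target t iff no value of
    the other input makes the operator produce t."""
    f = OPS[op]
    for x in range(MASK + 1):
        args = list(values)
        args[1 - pos] = x
        if f(*args) == t:
            return False
    return True
```

The checks `is_essential("mul", (0b00, 0b10), 0, 0b10)` and `is_essential("urem", (0b01, 0b01), 1, 0b10)` both return True, in line with examples (d) and (f), whereas input m in examples (d) and (e) is reported as not essential.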
Note that bit-level expressions (AIGs) can be represented by bit-vectors of bit-width one, which can be interpreted as word-level Boolean expressions. In this sense, the notion of controlling inputs can also be applied to Boolean expressions on the word-level.

Fig. 5 [...] At the outputs, σ(o) ⇝ t indicates the desired transition from current assignment σ(o) to target value t, and an input value shows its assignment under σ. Underlined blue cases indicate that this input is a single essential input and will therefore always be selected. Any other case (both inputs are essential or no input is essential) represents a non-deterministic choice during path selection.

In contrast to value selection on the bit-level, where a backtracing step always yields the flipped assignment of the selected input as backtracing value, on the word-level, selecting a backtracing value is not uniquely defined but a source of non-determinism. We consider three variants of value selection, under the following assumptions. Let t be the target value of an operator node o ∈ O, and let σ ∈ C be a complete assignment such that σ(o) ≠ t. Further, assume that input n with o → n is selected w.r.t. target value t and σ as in Definition 17 above. In other words, a value is consistent for an input if it allows the operator to produce the target value after changing the values of other inputs if necessary. We compute a consistent value as backtracing value x for input n as described in Table 2.
However, in some cases, restricting the notion of consistent values even further may be beneficial. Consider the following motivating example. The chances of selecting x = 67280421310721_[65] are arbitrarily small if consistent values for the multiplication operator are chosen as described in Table 2. Hence, we also consider the notion of inverse values, which utilize the inverse of an operator. In other words, a value is an inverse value for input n if it allows the operator node to produce the target value without changing the assignment of its other inputs. Consequently, an inverse value for input n is also consistent. We compute an inverse value as backtracing value x for input n as described in Tables 3 and 4.
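The difference between consistent and inverse values can be made explicit by enumeration over a small bit-width. The sketch below follows the two informal characterizations above for the first input of a binary operator. It is a brute-force illustration with hypothetical function names, not the rule-based computation of Tables 2 to 4.

```python
def inverse_values(f, w, other, t):
    """Values x for the selected input with f(x, other) == t, i.e., the target
    is reached without changing the assignment of the other input."""
    return [x for x in range(1 << w) if f(x, other) == t]


def consistent_values(f, w, t):
    """Values x for the selected input such that some value y of the other
    input gives f(x, y) == t."""
    return [x for x in range(1 << w)
            if any(f(x, y) == t for y in range(1 << w))]


def bv_and(a, b):
    return a & b


def bv_mul_2bit(a, b):
    return (a * b) & 0b11
```

For a 2-bit bitwise and with the other input fixed to 00 and target 01, `inverse_values` returns an empty list while `consistent_values` returns the two values with the least significant bit set. For 2-bit multiplication with the other input 10 and target 10, the inverse values are 01 and 11, and the consistent values additionally include 10.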
Note that inverse value computation as initially presented in [36] is too restrictive for some operators, which renders it incomplete since it may inadvertently prune the search. We therefore require that inverse value computation allows generating all possible values for all operators in O, which is the case for the rules for inverse value computation as described in Tables 3 and 4.

Tables 3 and 4 (surviving excerpt of the rule that uses the multiplicative inverse): Let y = σ(m) >> ctz(σ(m)), thus y is odd. We compute y^−1 as its multiplicative inverse modulo 2^w, e.g., via the Extended Euclidean algorithm (similar to word-level rewriting techniques that require solving for a variable, e.g., [18]), and determine x as (t >> ctz(σ(m))) · y^−1 except that all bits in x[w : w − ctz(σ(m)) + 1] are set arbitrarily, with w = κ(n).

Note that it is not always possible to find an inverse value for input n, e.g., o := n & m with σ = {o ↦ 00, n ↦ 00, m ↦ 00} and t = 01. Further, even for operators that allow to always produce inverse values, e.g., operator +, doing so may lead to inadvertently pruning the search space, see Example 23 below.
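The rule based on the multiplicative inverse can be sketched on Python integers as follows. This is a hedged illustration that enumerates all inverse values of n for o = n · m instead of picking one at random. The existence conditions (σ(m) = 0 requires t = 0, and t must have at least as many trailing zeroes as σ(m)) are filled in here as assumptions, and `pow(y, -1, modulus)` requires Python 3.8 or later.

```python
def ctz(x: int, w: int) -> int:
    """Number of trailing zero bits of a w-bit value (w if x is zero)."""
    return w if x == 0 else (x & -x).bit_length() - 1


def mul_inverse_values(t: int, m: int, w: int) -> list:
    """All w-bit values x with (x * m) mod 2^w == t."""
    if m == 0:
        return list(range(1 << w)) if t == 0 else []
    k = ctz(m, w)
    if ctz(t, w) < k:
        return []  # t is not divisible by 2^k, so no inverse value exists
    y = m >> k                       # odd part of sigma(m)
    y_inv = pow(y, -1, 1 << w)       # multiplicative inverse modulo 2^w
    low = ((t >> k) * y_inv) & ((1 << (w - k)) - 1)  # fixed lower w - k bits
    # The upper k bits x[w : w - k + 1] can be chosen arbitrarily.
    return [(high << (w - k)) | low for high in range(1 << k)]
```

For w = 4, σ(m) = 0110 and t = 1100, the odd part is y = 3 with inverse 11 modulo 16, which fixes the lower three bits of x to 010 and leaves the top bit free, so the inverse values are 0010 and 1010.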
As shown in Example 23, a propagation strategy using only inverse values without further randomization is incomplete. Hence, when performing a backtracing step, we in general select some consistent non-inverse value if no inverse value exists, and otherwise nondeterministically choose between consistent (but not necessarily inverse) and inverse values. Since all operators in O are surjective for our selected semantics (i.e., they can produce any target value, e.g., ∼0 mod 0 = ∼0), it is not necessary to select inconsistent random values. For other sets of operators, however, this might be necessary. For the sake of completeness, we therefore included the selection of random values in the formal definition of backtracing steps. Note that since on the bit-level the backtracing value for a selected input is uniquely determined (see Proposition 8), the issue of value selection is specific to the word-level. Further, when interpreting AIGs as word-level expressions, the notion of backtracing steps on the bit-level as in Definition 7 exactly matches the word-level notion as in Definition 22 using Proposition 15. As a side note, the problem of value selection during word-level backtracing and subsequent word-level propagation is similar to the problem of making a theory decision ("model assignment") and propagating this decision in MCSat [15,29].
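The choice between inverse and consistent values can be illustrated for the bit-wise and operator. The sketch below is an assumption-laden illustration rather than the rules of Tables 2 to 4: all function names are hypothetical, `s` denotes the current value of the other input, and the parameter `p_inverse` models a probabilistic preference for inverse values.

```python
import random


def inverse_value_and(t: int, s: int, w: int, rng: random.Random):
    """Return some x with (x & s) == t, or None if no such x exists."""
    mask = (1 << w) - 1
    if (t & s) != t:
        return None  # t needs a 1 where s has a 0
    free = ~s & mask  # bits of x that s masks out anyway
    return t | (rng.getrandbits(w) & free)


def consistent_value_and(t: int, w: int, rng: random.Random) -> int:
    """Return some x such that (x & m) == t for at least one m.

    x only needs a 1 wherever t has a 1; all other bits are arbitrary.
    """
    return t | rng.getrandbits(w)


def select_value_and(t, s, w, rng, p_inverse=0.99):
    """Prefer an inverse value with probability p_inverse if one exists,
    otherwise fall back to a consistent (not necessarily inverse) value."""
    inv = inverse_value_and(t, s, w, rng)
    if inv is not None and rng.random() < p_inverse:
        return inv
    return consistent_value_and(t, w, rng)
```

For the example from the text with t = 01 and the other input at 00, `inverse_value_and` returns `None`, so `select_value_and` always falls back to a consistent value whose lowest bit is set.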
The word-level propagation strategy P is defined in exactly the same way as for the bit-level (see Definition 10) except that the word-level notion of backtracing based on essential inputs and consistent and inverse value selection (Definition 22) replaces bit-level backtracing based on controlling inputs (Definition 7), and the set of valid moves P(σ) contains not only the leaves of all expanded propagation traces but also their updated assignments, i.e., (v, α(v)) for a leaf v. Further important concepts defined on the bit-level in Sect. 3 can be extended naturally to the word-level. These concepts include (expanded) paths and traces, leaves, and trace extension. We omit formal definitions accordingly.
Proposition 8, which is substantial for the bit-level proof of Lemma 11, does not directly apply on the word-level due to the more sophisticated selection of backtracing values. We lift Proposition 8 to the word-level as follows.

Proposition 24
Let σ ∈ C be a complete consistent assignment, and let ω be a satisfying assignment ω ∈ W. Given operator node o ∈ O with o → n and target value t = ω(o) ≠ σ(o), i.e., σ(o) ≠ t, then there exists a backtracing step w.r.t. assignment σ and target value t, which selects input n and backtracing value x = ω(n) ≠ σ(n).
Proof. First, assume that operator node o has an essential input w.r.t. assignment σ. Then we select an arbitrary essential input n of o. Since target value t = ω(o) ≠ σ(o), we get σ(n) ≠ ω(n) by contraposition of Definition 13. Similarly, if o has no essential inputs, then we select n as an arbitrary input with σ(n) ≠ ω(n), which has to exist since ω(o) ≠ σ(o).
In both cases, we can select x = ω(n) ≠ σ(n) as backtracing value, which is consistent for operator node o w.r.t. assignment σ and target value t since ω is consistent. Picking a random value as backtracing value, which is the last case in Definition 22, cannot occur under the given assumptions since, as already discussed, ω is consistent on o.
Using Proposition 24 instead of Proposition 8, the bit-level proof of Lemma 11 can then be lifted to the word-level by replacing every occurrence of gate g with operator node o, and the notion of "controlling" input with "essential" input.
Theorem 25
Theorem 12 and Lemma 11 also apply on the word-level, and thus, propagation strategy P is also complete on the word-level.
Note that even though Proposition 24 would allow us to restrict the selection of consistent and inverse backtracing values to be different from the current input node value, i.e., x ≠ σ(n), we do not enforce this property. Restricting value selection to a value x ≠ σ(n) interferes with path selection, in particular in the case where an input node is selected for which the current value is the only consistent or inverse value. We leave the exploration of this optimization to future work.

Experimental evaluation
We implemented our propagation strategy within our SMT solver Boolector [34] and consider the following configurations.
(1) Bb The core Boolector engine, which implements a bit-blasting approach. This configuration is identical to the version that entered the QF_BV track of the SMT competition 2016 and uses (internal) version bbc of our SAT solver Lingeling [8] as back end solver.
(2) Bsls The score-based local search approach of [18] as implemented in Boolector [36], with random walks enabled. This approach lifts stochastic local search for SAT to the word-level and iteratively moves from a non-satisfying towards a satisfying assignment by flipping single bits or incrementing, decrementing and (bit-wise) negating the values of the primary inputs. Moves are in general selected as the best (improving) moves according to some score function, and if no such move exists, a random value is chosen. If random walks are enabled, with a certain probability some random (and not necessarily the best) move is performed. This configuration mainly corresponds to the default configuration of [18] as implemented in Z3 [14] except for the score definition, which differs due to implementation issues (as described in [36]).
(3) Paig The bit-level configuration of our propagation-based approach, which operates on the AIG representation of a given input as bit-blasted by Boolector.
(4) Pw The word-level configuration of our propagation-based approach, which directly operates on the given bit-vector formula, with inverse values prioritized over consistent values during backtracing with a probability of 99 to 1.
Note that the choice of rewriting and other simplification techniques applied prior to the actual decision procedure may considerably influence its performance. In order to provide the same basis for comparison and avoid skewed results due to differences in the rewriting and simplification techniques applied by Z3 [14] versus Boolector, we do not compare our propagation-based approach against the original implementation of [18] in Z3 but against our implementation of [18] in Boolector (configuration Bsls). All configurations of Boolector apply the same set of rewriting and simplification techniques in the same order.
Since [35] and in particular for the SMT competition 2016, we improved several core components of Boolector, which affects all the configurations above. The default configurations of Paig and Pw therefore show major improvements in comparison to [35]. In comparison to [36], the default configuration of Bsls, however, seems to perform worse. This is solely due to minor changes within the score-based local search engine of Boolector that affect the random number generator (RNG). We will show that the difference in the number of solved instances compared to [36] lies within the expected variance caused by randomization effects. Note that unless otherwise noted, the default configuration of all local search configurations Bsls, Paig and Pw uses a seed of value 0 for the RNG.
We compiled a set of 16436 benchmarks in total and included all benchmarks with status sat or unknown in the QF_BV category of the SMT-LIB [6] benchmark library except those proved by Bb to be unsatisfiable within a time limit of 1200 s. We further excluded all benchmarks solved by Boolector via rewriting only. Note that our benchmark set is the same set we already used in [36] and [35]. Previously, all benchmarks in the Sage2 family that used non-SMT-LIBv2 compliant operators had to be explicitly excluded from the set above. However, since the SMT competition 2016, these benchmarks have been removed from SMT-LIB.
All experiments were performed on a cluster with 30 nodes of 2.83 GHz Intel Core 2 Quad machines with 8 GB of memory using Ubuntu 14.04.3 LTS. Each run is limited to use 7 GB of main memory. In terms of runtime we consider CPU time only. In case of a time out or memory out, the time limit is used as runtime.
Note that the results in [18] indicate that there still exists a considerable gap between the performance of state-of-the-art bit-blasting and word-level local search. However, the latter significantly outperforms bit-blasting on several instances. We therefore evaluated our local search configurations with regard to an application within a sequential portfolio setting and apply a limit of 1 and 10 s for the local search configurations, and a limit of 1200 s for the bit-blasting and the sequential portfolio configurations.
We evaluated our propagation-based strategy in comparison to the score-based local search approach in [18], in particular in terms of robustness with respect to randomization effects. We ran a batch of 21 runs of each configuration Pw, Paig and Bsls with different seeds for the RNG of Boolector (one with default seed 0 and 20 with different random seeds) with a time limit of 10 s. Table 5 summarizes the results of configurations Bb, Bsls, Paig and Pw with a time limit of 10 s and default seed 0 for the local search configurations. As further illustrated in Figs. 8 and 9, overall, our word-level propagation strategy Pw clearly outperforms our bit-level propagation strategy Paig and the score-based local search approach Bsls. Figure 9 shows the results of Pw, Paig and Bsls over all 21 runs with different seeds in terms of number of solved instances and runtime as box-and-whiskers plots, with the results of the runs with default seed 0 indicated with a red diamond. As a measure for robustness we use the standard deviation (SD) and the inter-quartile range (IQR), i.e., the distance between the lower quartile and the upper quartile, of the results of all 21 runs with different seeds, where lower values indicate a higher level of robustness. In terms of number of solved instances, for configuration Pw (SD: 17.88, IQR: 27) both the SD and the IQR are less than half of the SD and the IQR of Bsls (SD: 44.9, IQR: 60) and less than a third of the SD and IQR of Paig (SD: 62.6, IQR: 82). These results suggest that compared to Paig, both Pw and Bsls profit from directly working on the word-level, and overall, our word-level propagation-based strategy is indeed more robust with respect to randomization effects than the score-based local search approach of [18].
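The two robustness measures can be computed as in the following sketch. It assumes the sample standard deviation and the inclusive quartile method of Python's `statistics` module; the text does not state which variants were used, so both choices are assumptions, and the function name `robustness` is hypothetical. The input is a list with one entry per seed, e.g., the number of solved instances of each of the 21 runs.

```python
import statistics


def robustness(results):
    """Return (SD, IQR) for a list of per-seed results.

    Lower values indicate a higher level of robustness.
    Assumption: sample SD and 'inclusive' quartiles.
    """
    sd = statistics.stdev(results)
    q1, _median, q3 = statistics.quantiles(results, n=4, method="inclusive")
    return sd, q3 - q1
```

Applied to the per-seed counts of two configurations, the configuration with the smaller pair of values is the one whose performance depends less on the chosen seed.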
Even though Pw overall outperforms Paig and Bsls, on some benchmarks in the families sage, Sage2 and stp_samples configuration Pw seems to struggle in comparison to Bsls (457 instances) and Paig (38 instances). As an interesting observation, when bit-blasting the benchmarks in question, for the majority of benchmarks more than 50% of the bit-vector expressions contain bits that have been simplified to the Boolean constants {0, 1} on the bit-level. Our bit-level strategy Paig operates on the bit-blasted AIG layer where all constant bits are eliminated via rewriting, and therefore always propagates target values that can actually be assumed. Our word-level strategy Pw, however, does not know which bits can be simplified to constant bits and may therefore determine and propagate target values that can never be assumed. Configuration Bsls, on the other hand, also does not have any explicit information on constant bits but considers them implicitly when exploring the neighborhood prior to performing a move, since any neighbor with constant bits not matching their value will not result in score improvement. In an additional experiment, we evaluated the models of the 457 benchmarks on which Pw seems to have a disadvantage over Bsls and identified an interesting pattern.
For more than 80% (374 instances) out of all 457 instances the assignment of more than 50% of the inputs was 0, and for 80% (293 instances) out of these instances, for more than 30% of the non-zero inputs only one bit was set to 1.
Hence, since Bsls starts with an initial assignment where all inputs are set to 0, for this kind of benchmarks its focus on single bit flips allows it to quickly move the initial assignment towards a satisfying assignment.
For more than 60% of all 457 instances, Bsls required fewer than 50 moves (in comparison, the maximum number of moves for all solved instances is 3086).
Configuration Pw starts with the same initial assignment as Bsls. However, as mentioned above, the fact that the majority of these 457 benchmarks contain a considerable number of expressions with constant bits seems to handicap Pw.
These results suggest that for this set of benchmarks the strategy of Bsls is advantageous over Pw and in particular profits from an initial assignment where all inputs are set to 0. Hence, in an additional experiment we introduce configurations Bsls-1 and Pw-1 where we initialized the inputs with all bits set to 1 rather than 0. Figure 10 shows the performance of Bsls-1 and Pw-1 in comparison to Bsls and Pw over 21 runs with different seeds (again, one with default seed 0 and 20 with different random seeds) with a time limit of 10 s. Table 6 further summarizes the results of Bsls, Bsls-1, Pw and Pw-1 with default seed 0. Overall, configuration Bsls profits considerably from initializing the inputs with 0, since in comparison to Bsls the number of solved instances of Bsls-1 drops by almost 10%. In particular on the set of 457 benchmarks where Bsls had an advantage over our propagation-based strategy Pw, initializing the inputs with 1 resulted in Bsls-1 only solving 42 instances (9.2%) within a time limit of 10 s. Our propagation-based strategy, on the other hand, is much more robust than Bsls with respect to the input initialization value and seems to overall even profit from initializing the inputs with 1 rather than 0.
Fig. 9 Number of solved instances and runtime over 21 runs (with different seeds) of configurations Pw, Paig and Bsls with a time limit of 10 s.
Figure 8c shows the performance of our propagation-based configuration Pw compared to our bit-blasting configuration Bb with a time limit of 10 s. As summarized in Table 5, even though there exists a considerable gap in the number of solved instances between Bb and Pw (within 10 s, Bb solves almost 2000 instances more than Pw), on 2650 benchmarks, Pw outperforms Bb by at least a factor of 10. In an additional experiment, we evaluated Pw with a time limit of 1200 s, which increases the number of solved instances compared to a time limit of 10 s by 7% (571 instances). These results suggest a combination of both configurations within a sequential portfolio setting [39], where our propagation-based strategy is run for a certain amount of time prior to invoking the bit-blasting engine. However, in practice, the number of propagation steps performed is a more reliable metric than the actual runtime of Pw within a sequential portfolio setting. In the following, we distinguish two sequential portfolio configurations.
(1) Bb+Pw-virtual-Xs A virtual sequential portfolio combination of Pw and Bb, where we assume that Pw is run exactly X seconds prior to invoking Bb.
(2) Bb+Pw-X The sequential portfolio combination of Pw and Bb as implemented in Boolector, where configuration Pw is run with a limit of X propagation steps prior to invoking Bb. Note that this configuration won the QF_BV division of the main track of the SMT competition 2016 with X = 1000 (1k).
Figure 11 illustrates the performance of a virtual sequential portfolio combination Bb+Pw-virtual-1s in comparison to the bit-blasting configuration Bb with a time limit of 1200 s, where we assume that configuration Pw is run for 1 s before falling back to the bit-blasting engine. Overall, configuration Bb+Pw-virtual-1s solves 63 instances more than Bb, and further outperforms Bb in terms of runtime by at least a factor of 10 on almost 2400 benchmarks. Figure 12 shows the performance of the sequential portfolio combinations Bb+Pw-1k, Bb+Pw-10k, Bb+Pw-50k and Bb+Pw-100k in comparison to configuration Bb with a time limit of 1200 s, where Pw is run with a limit of 1000, 10,000, 50,000 and 100,000 propagation steps before invoking the bit-blasting engine. With a limit of 1k propagation steps, configuration Bb+Pw-1k already solves 41 instances more than Bb. It further outperforms Bb in terms of runtime by at least a factor of 10 on more than 2400 benchmarks. Increasing the propagation step limit for configuration Pw to 10k, 50k and 100k further increases performance in terms of runtime, with 2601 (Bb+Pw-10k), 2649 (Bb+Pw-50k) and 2657 (Bb+Pw-100k) instances solved by at least a factor of 10 faster than with configuration Bb. In terms of number of solved instances, configuration Bb+Pw-10k shows the best performance with a plus of 52 instances compared to Bb.
Configurations Bb+Pw-50k and Bb+Pw-100k still solve 50 and 45 more instances than Bb, but lose instances compared to Bb+Pw-10k due to the increasing overhead introduced for those instances not solved within the given propagation step limit.
Fig. 12 Bb versus our sequential portfolio configurations Bb+Pw-1k, Bb+Pw-10k, Bb+Pw-50k and Bb+Pw-100k with a time limit of 1200 s.
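The runtime of a virtual sequential portfolio such as Bb+Pw-virtual-Xs can be derived from the individual runtimes of Pw and Bb. The following sketch shows one plausible way to do this and is not taken from the evaluation scripts: the function name and the convention of passing `None` for an unsolved instance are hypothetical, and charging the full X seconds before Bb starts is an assumption. As in the evaluation setup, the time limit is used as runtime for unsolved instances.

```python
from typing import Optional


def virtual_portfolio_time(t_pw: Optional[float], t_bb: Optional[float],
                           x: float, limit: float) -> float:
    """Runtime of running Pw for at most x seconds and then Bb.

    t_pw / t_bb: standalone runtimes in seconds, None if unsolved.
    Returns the time limit if the combination does not solve the instance.
    """
    if t_pw is not None and t_pw <= x:
        return t_pw              # Pw solves the instance in its time slot
    if t_bb is not None and x + t_bb <= limit:
        return x + t_bb          # Pw's slot is pure overhead for Bb
    return limit                 # counted as a time out
```

Under these assumptions, an instance counts as solved by at least a factor of 10 faster than Bb if the returned value is at most a tenth of the standalone runtime of Bb.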
In an additional experiment with configurations Bb+Pw-1k and Bb+Pw-10k, we compiled a set of 21172 unsatisfiable benchmarks containing all QF_BV benchmarks in SMT-LIB with status unsat and determined the overhead introduced by Pw. With a total of 1237 s for configuration Bb+Pw-1k, the overhead for the unsatisfiable instances is negligible compared to the performance gain of almost 102k s on the satisfiable instances. For configuration Bb+Pw-10k, the overhead for the unsatisfiable instances is larger by a factor of 10 (10,316 s), which is still an order of magnitude less than the performance gain of more than 116k s on the satisfiable instances.
Table 7 summarizes the results of configurations Bb, Bb+Pw-virtual-1s and Bb+Pw-10k, and gives a more detailed overview by benchmark family with a time limit of 1200 s. As shown in Fig. 12, a propagation step limit of 100k (Bb+Pw-100k) almost corresponds to virtually limiting the runtime of Pw to 1 s (Bb+Pw-virtual-1s), in particular when considering the number of instances solved by at least a factor of 10 faster than Bb. A propagation limit of 10,000 (Bb+Pw-10k), however, yields the best results in terms of number of solved instances and the overall runtime.
Figure 13 shows the influence on randomization effects of our propagation-based strategy Pw in terms of the number of solved instances when introducing different levels of non-determinism during value selection and different path selection strategies over 21 runs with different seeds and a time limit of 10 s.
In terms of value selection, the default configuration of Pw prioritizes inverse values over consistent values during backtracing with a probability of 99:1. As illustrated in Fig. 13a, decreasing this ratio, i.e., increasing the probability to choose consistent values over inverse values, increases the level of non-determinism of our backtracing algorithm, and as a consequence, the variance in terms of performance. The default ratio of 99:1 has an SD of 17.9, and when decreasing the ratio of inverse to consistent values to 50:50 and 0:100 (consistent values only), the SD increases to 23.5 and 38.9, respectively. When decreasing the level of non-determinism by increasing the ratio of inverse to consistent values to 100:0 (inverse values only), on the other hand, the SD drops to 14.1. Overall, as shown in Fig. 13a, a higher
In terms of path selection, not prioritizing inputs but choosing randomly corresponds to a maximum level of non-determinism. Prioritizing controlling inputs for Boolean operators already decreases non-determinism during path selection. However, utilizing essential inputs for all word-level operators decreases non-determinism even further. Figure 13b shows the influence of decreasing the level of non-determinism during path selection in terms of the number of solved instances over 21 runs with different seeds and a time limit of 10 s. By default, Pw prioritizes essential inputs for all word-level operators. Utilizing only controlling inputs of Boolean operators already decreases performance, and not prioritizing inputs but choosing randomly decreases performance even further. Prioritizing essential inputs for all word-level operators yields the best results.
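Path selection based on essential inputs can be illustrated with a brute-force check for small bit-widths. The sketch below rests on one reading of the notion of essential inputs, namely that an input is essential w.r.t. target value t if the operator cannot produce t as long as this input keeps its current value. Both function names are hypothetical, and an actual implementation would use operator-specific rules instead of enumeration.

```python
import random
from itertools import product


def essential_inputs(op, values, t, w):
    """Indices i such that op cannot yield t while input i keeps values[i].

    op: function on w-bit integers, values: current input values.
    Brute force over all values of the other inputs (small w only).
    """
    result = []
    for i in range(len(values)):
        others = [j for j in range(len(values)) if j != i]
        reachable = False
        for combo in product(range(1 << w), repeat=len(others)):
            args = list(values)
            for j, v in zip(others, combo):
                args[j] = v
            if op(*args) == t:
                reachable = True
                break
        if not reachable:
            result.append(i)
    return result


def select_input(op, values, t, w, rng):
    """Prefer an essential input; otherwise choose any input at random."""
    essential = essential_inputs(op, values, t, w)
    candidates = essential if essential else list(range(len(values)))
    return rng.choice(candidates)
```

For bit-wise and with inputs 00 and 11 and target 01, only the first input is essential, so `select_input` always backtracks along it. For addition modulo 2^w no input is ever essential, and the choice is fully random.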

Conclusion
In this paper, we presented our complete propagation-based local search strategy for the theory of quantifier-free fixed-size bit-vectors, which we previously presented in [35], in more detail.
We defined a complete set of rules for determining backtracing values when propagating assignments towards the primary inputs and provided extensive examples to illustrate the core concepts of our approach. We further provided a more extensive experimental evaluation, including an analysis of randomization effects caused by using different seeds for the random number generator. Motivated by the experimental results in [35], which showed the potential of a sequential portfolio combination of our propagation-based strategy and a state-of-the-art bit-blasting approach, we implemented this combination in our SMT solver Boolector. Our results confirm a considerable gain in performance.
Our procedure was evaluated on problems in the theory of quantifier-free bit-vectors in SMT. However, it is not restricted to bit-vector logics. Applying our strategy to other logics is probably the most intriguing direction for future work.
When combined with bit-blasting, our propagation-based techniques may learn properties of the input formula that might be useful for the bit-blasting engine. We leave learning and passing these properties to the bit-blasting engine to future work. Further, extending our propagation-based techniques by introducing strategies for conflict detection and resolution during backtracing as well as lemma generation to obtain an algorithm that is able to also prove unsatisfiability is another challenge for future work. A possible direction would be incorporating techniques from the MCSat for bit-vectors approach presented in [41].
Finally, we would like to thank Andreas Fröhlich and the reviewers for helpful comments, and Holger Hoos for fruitful discussions on the relation between non-deterministic completeness and the notion of probabilistically approximately complete (PAC).