figure a
figure b

1 Introduction

Satisfiability Modulo Theories (SMT) solvers serve as back-end reasoning engines for a wide range of applications in formal methods (e.g., [13, 14, 21, 23, 35]). In particular, the theory of fixed-size bit-vectors, in combination with arrays, uninterpreted functions and floating-point arithmetic, have received increasing interest in recent years, as witnessed by the high and increasing numbers of benchmarks submitted to the SMT-LIB benchmark library [5] and the number of participants in corresponding divisions in the annual SMT competition (SMT-COMP) [42]. State-of-the-art SMT solvers supporting (a subset of) these theories include Boolector [31], cvc5  [3], MathSAT [15], STP [19], Yices2 [17] and Z3 [25]. Among these, Boolector had been largely dominating the quantifier-free divisions with bit-vectors and arrays in SMT-COMP over the years [2].

Boolector was originally published in 2009 by Brummayer and Biere [11] as an SMT solver for the quantifier-free theories of fixed-size bit-vectors and arrays. Since 2012, Boolector has been mainly developed and maintained by the authors of this paper, who have extended it with support for uninterpreted functions and lazy handling of non-recursive lambda terms [32, 38, 39], local search strategies for quantifier-free bit-vectors [33, 34], and quantified bit-vector formulas [40].

While Boolector is still competitive in terms of performance, it has several limitations. Its code base consists of largely monolithic C code, with a rigid architecture focused on a very specialized, tight integration of bit-vectors and arrays. Consequently, it is cumbersome to maintain, and adding new features is difficult and time intensive. Further, Boolector requires manual management of memory and reference counts from API users; terms and sorts are tied to a specific solver instance and cannot be shared across instances; all preprocessing techniques are destructive, which disallows incremental preprocessing; and due to architectural limitations, incremental solving with quantifiers is not supported.

In 2018, we forked Boolector in preparation for addressing these issues, and entered an improved and extended version of this fork as Bitwuzla in the SMT competition 2020 [26]. At that time, Bitwuzla extended Boolector with: support for floating-point arithmetic by integrating SymFPU [8] (a library of bit-vector encodings of floating-point operations); a novel generalization of its propagation-based local search strategy [33] to ternary values [27]; unsat core extraction; and since 2022, support for reasoning about quantified formulas for all supported theories and their combinations. This version of Bitwuzla was already made available on GitHub at [28], but not officially released. However, architectural and structural limitations inherited from Boolector remained. Thus, to overcome these limitations and address the above issues, we decided to discard the existing code base and rewrite Bitwuzla from scratch.

In this paper, we present the first official release of Bitwuzla, an SMT solver for the (quantified and quantifier-free) theories of fixed-size bit-vectors, arrays, floating-point arithmetic, uninterpreted functions and their combinations. Its name (pronounced as bitvootslah) is derived from an Austrian dialect expression that can be translated as someone who tinkers with bits. Bitwuzla is written in , inspired by techniques implemented in Boolector. That is, rather than only redesigning problematic aspects of Boolector, we carefully dissected and (re)evaluated its parts to serve as guidance when writing a new solver from scratch. In that sense, it is not a reimplementation of Boolector, but can be considered its superior successor. Bitwuzla is available on GitHub [28] under the MIT license, and its documentation is available at [29].

2 Architecture

Bitwuzla supports reasoning about quantifier-free and quantified formulas over fixed-size bit-vectors, floating-point arithmetic, arrays and uninterpreted functions as standardized in SMT-LIB [4]. In this section, we provide an overview of Bitwuzla ’s system architecture and its core components as given in Fig. 1.

Fig. 1.
figure 1

Bitwuzla system architecture.

Bitwuzla consists of two main components: the Solving Context and the Node Manager. The Solving Context can be seen as a solver instance that determines satisfiability of a set of formulas and implements the lazy, abstraction/refinement-based SMT paradigm lemmas on demand [6, 24] (in contrast to SMT solvers like cvc5 and Z3, which are based on the CDCL(\(\mathcal {T}\)) [36] framework). The Node Manager is responsible for constructing and maintaining nodes and types and is shared across multiple Solving Context instances.

Bitwuzla provides a comprehensive API as its main interface, with a C and Python API built on top. All features of the API are also accessible to C and Python users. The API documentation is available at [29]. The API exports Term, Sort, Bitwuzla, and Option classes for constructing nodes and types, configuring solver options and constructing Bitwuzla solver instances (the external representation of Solving Contexts). Term and Sort objects may be used in multiple Bitwuzla instances. The parser interacts with the solver instance via the API. A textual command line interface (CLI) builds on top of the parser, supporting SMT-LIBv2 [4] and BTOR2 [35] as input languages.

2.1 Node Manager

Bitwuzla represents formulas and terms as reference-counted, immutable nodes in a directed acyclic graph. The Node Manager is responsible for constructing and managing these nodes and employs hash-consing to maximize sharing of subgraphs. Automatic reference counting allows the Node Manager to determine when to delete nodes. Similarly, types are constructed and managed by the Type Manager, which is maintained by the Node Manager. Nodes and types are stored globally (thread-local) in the Node Database and Type Database, which has the key advantage that they can be shared between arbitrarily many solving contexts within one thread. This is one of the key differences to Boolector ’s architecture, where terms and types are manually reference counted and tied to a single solver instance, which does not allow sharing between solver instances.

2.2 Solving Context

A Solving Context is the internal equivalent of a solver instance and determines the satisfiability of a set of asserted formulas (assertions). Solving Contexts are fully configurable via options and provide an incremental interface for adding and removing assertions via push and pop. Incremental solving allows users to perform multiple satisfiability checks with similar sets of assertions while reusing work from earlier checks. On the API level, Bitwuzla also supports satisfiability queries under a given set of assumptions (SMT-LIB command check-sat-assuming), which are internally handled via push and pop.

Nodes and types constructed via the Node Manager may be shared between multiple Solving Contexts. If the set of assertions is satisfiable, the Solving Context provides a model for the input formula. It further allows to query model values for any term, based on this model (SMT-LIB command get-value). In case of unsatisfiable queries, the Solving Context can be configured to extract an unsatisfiable core and unsat assumptions.

A Solving Context consists of three main components: a Rewriter, a Preprocessor and a Solver Engine. The Rewriter and Preprocessor perform local (node level) and global (over all assertions) simplifications, whereas the Solver Engine is the central solving engine, managing theory solvers and their interaction.

Preprocessor. As a first step of each satisfiability check, prior to solving, the preprocessor applies a pipeline of preprocessing passes in a predefined order to the current set of assertions until fixed-point. Each preprocessing pass implements a set of satisfiability-preserving transformations. All passes can be optionally disabled except for one mandatory transformation, the reduction of the full set of operators supported on the API level to a reduced operator set: Boolean connectives are expressed by means of \(\{\lnot , \wedge \}\), quantifier \(\exists \) is represented in terms of \(\forall \), inequalities are represented in terms of < and >, signed bit-vector operators are expressed in terms of unsigned operators, and more. These reduction transformations are a subset of the term rewrites performed by the Rewriter, and rewriting is implemented as one preprocessing pass. Additionally, Bitwuzla implements 7 preprocessing passes, which are applied sequentially, after rewriting, until no further transformations are possible: and flattening, which splits a top-level \(\wedge \) into its subformulas, e.g., \(a \wedge (b \wedge (c = d))\) into {a, b, \(c = d\)}; substitution, which replaces all occurrences of a constant x with a term t if \(x = t\) is derived on the top level; skeleton preprocessing, which simplifies the Boolean skeleton of the input formula with a SAT solver; embedded constraints, which substitutes all occurrences of top-level constraints in subterms of other top-level constraints with true; extract elimination, which eliminates bit-vector extracts over constants; lambda elimination, which applies beta reduction on lambda terms; and normalization of arithmetic expressions.

Preprocessing in Bitwuzla is fully incremental: all passes are applied to the current set of assertions, from all assertion levels, and simplifications derived from lower levels are applied to all assertions of higher levels (including assumptions). Assertions are processed per assertion level i, starting from \(i = 0\), and for each level \(i > 0\), simplifications are applied based on information from all levels \(j \le i\). Note that when solving under assumptions, Bitwuzla internally pushes an assertion level and handles these assumptions as assertions of that level. When a level i is popped, the assertions of that level are popped, and the state of the preprocessor is backtracked to the state that was associated with level \(i-1\). Note that preprocessing assertion levels \(i < j\) with information derived from level j requires to not only restore the state of the preprocessor, but to also reconstruct the assertions on levels \(i < j\) when level j is popped to the state before level j was pushed, and is left to future work.

Boolector, on the other hand, only performs preprocessing based on top-level assertions (assertion level 0) and does not incorporate any information from assumptions or higher assertion levels.

Rewriter. The rewriter transforms terms via a predefined set of rewrite rules into semantically equivalent normal forms. This transformation is local in the sense that it is independent from the current set of assertions. We distinguish between required and optional rewrite rules, and further group rules into so-called rewrite levels from 0–2. The set of required rules consists of operator elimination rewrites, which are considered level 0 rewrites and ensure that nodes only contain operators from a reduced base set. For example, the two’s complement \(-x\) of a bit-vector term x is rewritten to \((\sim \!x + 1)\) by means of one’s complement and bit-vector addition. Optional rewrite rules are grouped into level 1 and level 2. Level 1 rules perform rewrites that only consider the immediate children of a node, whereas level 2 rules may consider multiple levels of children. If not implemented carefully, level 2 rewrites can potentially destroy sharing of subterms and consequently increase the overall size of the formula. For example, rewriting \((t + 0)\) to t is considered a level 1 rewrite rule, whereas rewriting \((a - b = c)\) to \((b + c = a)\) is considered a level 2 rule since it may introduce an additional bit-vector addition \((b + c)\) if \((a - b)\) occurs somewhere else in the formula. The maximum rewrite level of the rewriter can be configured by the user.

Rewriting is applied on the current set of assertions as a preprocessing pass and, as all other passes, applied until fixed-point. That is, on any given term, the rewriter applies rewrite rules until no further rewrite rules can be applied. For this, the rewriter must guarantee that no set of applied rewrite rules may lead to cyclic rewriting of terms. Additionally, all components of the solving context apply rewriting on freshly created nodes to ensure that all nodes are always fully normalized. In order to avoid processing nodes more than once, the rewriter maintains a cache that maps nodes to their fully rewritten form.

Solver Engine. After preprocessing, the solving context sends the current set of assertions to the Solver Engine, which implements a lazy SMT paradigm called lemmas on demand [6, 24]. However, rather than using a propositional abstraction of the input formula as in [6, 24], it implements a bit-vector abstraction similar to Boolector [12, 38]. At its core, the Solver Engine maintains a bit-vector theory solver and a solver for each supported theory. Quantifier reasoning is handled by a dedicated quantifiers module, implemented as a theory solver. The Solver Engine manages all theory solvers, the distribution of relevant terms, and the processing of lemmas generated by the theory solvers.

The bit-vector solver is responsible for reasoning about the bit-vector abstraction of the input assertions and lemmas generated during solving, which includes all propositional and bit-vector terms. Theory atoms that do not belong to the bit-vector theory are abstracted as Boolean constants, and bit-vector terms whose operator does not belong to the bit-vector theory are abstracted as bit-vector constants. For example, an array select operation of type bit-vector is abstracted as a bit-vector constant, while an equality between two arrays is abstracted as a Boolean constant.

If the bit-vector abstraction is satisfiable, the bit-vector solver produces a satisfying assignment, and the floating-point, array, function and quantifier solvers check this assignment for theory consistency. If a solver finds a theory inconsistency, i.e., a conflict between the current satisfying assignment and the solver’s theory axioms, it produces a lemma to refine the bit-vector abstraction and rule out the detected inconsistency. Theory solvers are allowed to send any number of lemmas, with the only requirement that if a theory solver does not send a lemma, the current satisfying assignment is consistent with the theory.

Finding a satisfying assignment for the bit-vector abstraction and the subsequent theory consistency checks are implemented as an abstraction/refinement loop as given in Algorithm 1. Whenever a theory solver sends lemmas, the loop is restarted to get a new satisfying assignment for the refined bit-vector abstraction. The loop terminates if the bit-vector abstraction is unsatisfiable, or if the bit-vector abstraction is satisfiable and none of the theory solvers report any theory inconsistencies. Note that the abstraction/refinement algorithm may return unknown if the input assertions include quantified formulas.

figure i

Backtrackable Data Structures. Every component of the Solver Context except for the Rewriter depends on the current set of assertions. When incrementally, the assertion stack is modified by adding (SMT-LIB command push) and removing (SMT-LIB command pop) assertions. In contrast to Boolector, Bitwuzla supports saving and restoring the internal solver state, i.e., the state of the Solving Context, corresponding to these push and pop operations by means of backtrackable data structures. These data structures are custom variants of mutable data structures provided in the standard library, extended with an interface to save and restore their state on push and pop calls. This allows the solver to take full advantage of incremental solving by reusing work from previous satisfiability checks and backtracking to previous states. Further, this enables incremental preprocessing. Bitwuzla’s backtrable data structures are conceptually similar to context-dependent data structures in cvc5 [3].

3 Theory Solvers

The Solver Engine maintains a theory solver for each supported theory and implements a module for handling quantified formulas as a dedicated theory solver. The central engine of the Solver Engine is the bit-vector theory solver, which reasons about a bit-vector abstraction of the current set of input assertions, refined with lemmas generated by other theory solvers. The theories of fixed-size bit-vectors, arrays, floating-point arithmetic, and uninterpreted functions are combined via a model-based theory combination approach similar to [12, 38].

Theory combination is based on candidate models produced by the bit-vector theory solver for the bit-vector abstraction (function \(T_{BV}\)::solve() in Algorithm 1). For each candidate model, each theory solver checks consistency with the axioms of the corresponding theory (functions \(T_{*}\)::check() in Algorithm 1). If a theory solver requests a model value for a term that is not part of the current bit-vector abstraction, the theory solver who “owns” that term is queried for a value. If this value or the candidate model is inconsistent with the axioms of the theory querying the value, it sends a lemma to refine the bit-vector abstraction.

3.1 Arrays

The array theory solver implements and extends the array procedure from [12] with support for reasoning over (equalities of) nested arrays and non-extensional constant arrays. This is in contrast to Boolector, which generalizes the lemmas on demand procedure for extensional arrays as described in [12] to non-recursive first-order lambda terms [37, 38], without support for nested arrays. Generalizing arrays to lambda terms allows to use the same procedure for arrays and uninterpreted functions and enables a natural, compact representation and extraction of extended array operations such as memset, memcpy and array initialization patterns as described in [39]. As an example, \( memset (a,i,n,e)\), which updates n elements of array a within range \([i,i+n[\) to a value e starting from index i, can be represented as \(\lambda j\,.\, ite (i \le j < i + n,e,a[j])\). Reasoning over equalities involving arbitrary lambda terms (including these operations), however, requires higher-order reasoning, which is not supported by Boolector. Further, extensionality over standard array operators that are represented as lambda terms (e.g., store) requires special handling, which makes the procedure unnecessarily complex. Bitwuzla, on the other hand, implements separate theory solvers for arrays and uninterpreted functions. Consequently, since it does not generalize arrays to lambda terms, it cannot utilize the elegant representation of Boolector for the extended array operations of [39]. Thus, currently, extracting and reasoning about these operations is not yet supported. Instead of representing such operators as lambda terms, we plan to introduce specific array operators. This will allow a seamless integration into Bitwuzla’s array procedure, with support for reasoning about extensionality involving these operators. We will also add support for reasoning about extensional constant arrays in the near future.

3.2 Bit-Vectors

The bit-vector theory solver implements two orthogonal approaches: the classic bit-blasting technique employed by most state-of-the-art bit-vector solvers, which eagerly translates the current bit-vector abstraction to SAT; and the ternary propagation-based local search approach presented in [27]. Since local search procedures only allow to determine satisfiability, they are particularly effective as a complementary strategy, in combination with (rather than instead of) bit-blasting [27, 33]. Bitwuzla’s bit-vector solver allows to combine local search with bit-blasting in a sequential portfolio setting: the local search procedure is run until a predefined resource limit is reached before falling back on the bit-blasting procedure. Currently, Bitwuzla allows combining these two approaches only in this particular setting. We plan to explore more interleaved configurations, possibly while sharing information between the procedures as future work.

Bit-Blasting. Bitwuzla implements the eager reduction of the bit-vector abstraction to propositional logic in two phases. First, it constructs an And-Inverter-Graph (AIG) circuit representation of the abstraction while applying AIG-level rewriting techniques [10]. This AIG circuit is then converted into Conjunctive Normal Form (CNF) via Tseitin transformation and sent to the SAT solver back-end. Note that for assertions from levels \(> 0\), Bitwuzla leverages solving under assumptions in the SAT solver in order to be able to backtrack to lower assertion levels on pop. Bitwuzla supports CaDiCaL [7], CryptoMiniSat [41], and Kissat [7] as SAT back-ends and uses CaDiCaL as its default SAT solver.

Local Search. Bitwuzla implements an improved version of the ternary propagation-based local search procedure described in [27]. This procedure is a generalization of the propagation-based local search approach implemented in Boolector [33] and addresses one of its main weaknesses: its obliviousness to bits that can be simplified to constant values. Propagation-based local search is based on propagating target values from the outputs to the inputs, does not require bit-blasting, brute-force randomization or restarts, and lifts the concept of backtracing of Automatic Test Pattern Generation (ATPG) [22] to the word-level. Boolector additionally implements the stochastic local search (SLS) approach presented in [18], optionally augmented with a propagation-based strategy [34]. Bitwuzla, however, only implements our ternary propagation-based approach since it was shown to significantly outperform these approaches [33].

3.3 Floating-Point Arithmetic

The solver for the theory of floating-point arithmetic implements an eager translation of floating-point atoms in the bit-vector abstraction to equisatisfiable formulas in the theory of bit-vectors, a process sometimes referred to as word-blasting. To translate floating-point expressions to the word-level, Bitwuzla integrates SymFPU [9], a library of bit-vector encodings of floating-point operations. SymFPU uses templated types for Booleans (un)signed bit-vectors, rounding modes and floating-point formats, which allows utilizing solver-specific representations. SymFPU has also been integrated into cvc5 [3].

3.4 Uninterpreted Functions

For the theory of uninterpreted functions (UF), Bitwuzla implements dynamic Ackermannization [16], which is a lazy form of Ackermann’s reduction. The UF solver checks whether the current satisfying assignment of the bit-vector abstraction is consistent with the function congruence axiom \(\bar{a} = \bar{b} \rightarrow f(\bar{a}) = f(\bar{b})\) and produces a lemma whenever the axiom is violated.

3.5 Quantifiers

Quantified formulas are handled by the quantifiers module, which is treated as a theory solver and implements model-based quantifier instantiation [20] for all supported theories and their combinations. In the bit-vector abstraction, quantified formulas are abstracted as Boolean constants. Based on the assignment of these constants, the quantifiers solver produces instantiation or Skolemization lemmas. If the constant is assigned to true, the quantifier is treated as universal quantifier and the solver produces instantiation lemmas. If the constant is assigned to false, the solver generates a Skolemization lemma. Bitwuzla allows to combine quantifiers with all supported theories as well as incremental solving and unsat core extraction. This is in contrast to Boolector, which only supports sequential reasoning about quantified bit-vector formulas and, generally, does not provide unsat cores for unsatisfiable instances.

Table 1. Solved instances and total runtime on solved instances (non-incremental).
Table 2. Solved queries and total runtime on solved queries (incremental).

4 Evaluation

We evaluate the overall performance of Bitwuzla on all non-incremental and incremental benchmarks of all supported logics in SMT-LIB [5]. We further include logics with floating-point arithmetic that are classified as containing linear integer arithmetic (LRA). Bitwuzla does not support LRA reasoning, but the benchmarks in these logics currently only involve to-floating-point conversion (SMT-LIB command to_fp) from real values, which is supported.

We compare against Boolector [31] and the SMT-COMP 2022 version of Bitwuzla [26] (configuration SC22), which, at that time, was an improved and extended version of Boolector and won several divisions in all tracks of SMT-COMP 2022 [2]. Boolector did not participate in SMT-COMP 2022, thus we use the current version of Boolector available on GitHub (commit 13a8a06d) [1]. Further, since Boolector does not support logics involving floating-point arithmetic, quantified logics other than pure quantified bit-vectors and incremental solving when quantifiers are involved, we also compare against the SMT-COMP 2022 versions of cvc5 [3] and Z3 [25]. Both solvers are widely used, high performance SMT solvers with support for a wide range of theories, including the theories supported by Bitwuzla. Note that this version of cvc5 uses a sequential portfolio of multiple configurations for some logics.

We ran all experiments on a cluster with Intel Xeon E5-2620 v4 CPUs. We allocated one CPU core and 8GB of RAM for each solver and benchmark pair, and used a 1200 s s time limit, the same time limit as used in SMT-COMP 2022 [2].

Table 1 shows the number of solved benchmarks for each solver in the non-incremental quantifier-free (QF_) and quantified divisions. Overall, Bitwuzla solves the largest number of benchmarks in the quantified divisions, considerably improving over SC22 and Boolector with over 600 and 4,200 solved benchmarks, respectively. Bitwuzla also takes the lead in the quantifier-free divisions, with 44 more solved instances compared to SC22, and more than 650 solved benchmarks compared to cvc5. On the 140,438 commonly solved instances between Bitwuzla, SC22, cvc5, and Z3 over all divisions, Bitwuzla is the fastest solver with 203,838s, SC22 is slightly slower with 208,310s, cvc5 is \(2.85\times \) slower (586,105s), and Z3 is \(5.1\times \) slower (1,049,534s).

Table 2 shows the number of solved incremental check-sat queries for each solver in the incremental divisions. Again, Bitwuzla solves the largest number of queries overall and in the quantifier-free divisions. For the quantified divisions, Bitwuzla solves 42,770 queries, the second largest number of solved queries after Z3 (45,373), and more than 3700 more queries than SC22 (39,040). On benchmarks of the ABVFPLRA division, Bitwuzla significantly outperforms SC22 due to the occurrence of nested arrays, which were unsupported in SC22.

The artifact of this evaluation is archived and available in the Zenodo open-access repository at https://zenodo.org/record/7864687.

5 Conclusion

Our experimental evaluation shows that Bitwuzla is a state-of-the-art SMT solver for the quantified and quantifier-free theories of fixed-size bit-vectors, arrays, floating-point arithmetic, and uninterpreted functions. Bitwuzla has been extensively tested for robustness and correctness with Murxla [30], an API fuzzer for SMT solvers, which is an integral part of its development workflow. We have outlined several avenues for future work throughout the paper. We further plan to add support for the upcoming SMT-LIB version 3 standard, when finalized.