cvc5: A Versatile and Industrial-Strength SMT Solver

Abstract. cvc5 is the latest SMT solver in the cooperating validity checker series and builds on the successful code base of CVC4. This paper serves as a comprehensive system description of cvc5's architectural design and highlights the major features and components introduced since CVC4 1.8. We evaluate cvc5's performance on all benchmarks in SMT-LIB and provide a comparison against CVC4 and Z3.

CVC3, written with the aim of creating a flexible and performant architecture that could last far into the future. The fact that CVC4 has integrated over a decade's worth of SMT research and development while becoming increasingly robust and performance-competitive attests to the success of that endeavor.
In this paper, we introduce cvc5, the next solver in the series. cvc5 is not a rewrite of CVC4 and indeed builds on its successful code base and architecture. Compared to other SMT solvers, cvc5 supports a diverse set of theories (all standard SMT-LIB theories, and many non-standard theories) and features beyond regular SMT solving such as higher-order reasoning and syntax-guided synthesis (SyGuS) [3]. The name change rather acknowledges both a (mostly) new team of developers as well as the significant evolution the tool has undergone since CVC4 was described in a tool paper published in 2011 [21]. Moreover, cvc5 comes with updated documentation, new and improved APIs, and more user-friendly installation. Most importantly, it introduces several significant new features. Like its predecessors, cvc5 is available under the 3-clause BSD open source license and runs on all major platforms (Linux, macOS, and Windows).
We make the following contributions:
- An in-depth description of the architectural design of cvc5 and how its pieces and modules work together.
- A comprehensive summary of all features that have been added to the solver since CVC4 was introduced in [21].
- A description of major features introduced since CVC4 1.8, the final version of CVC4, including:
  • a new C++ API, and new Python and Java APIs that build on top of it;
  • a new theory solver for the theory of fixed-size bit-vectors;
  • a new and extensive proof-production module;
  • a new procedure for non-linear arithmetic; and
  • a syntax-guided quantifier-instantiation procedure [96].
- Evidence, based on experimental evaluation and industrial use cases, that cvc5 is in fact both versatile and industrial-strength.

Architecture and Core Components
cvc5 supports reasoning about quantifier-free and quantified formulas in a wide range of background theories and their combinations, including all theories standardized in SMT-LIB [22]. It further natively supports several non-standard theories and theory extensions. These include, among others, separation logic, the theory of sequences, the theory of finite sets and relations, and the extension of the theory of reals with transcendental functions.
In this section, we start with a brief overview of the core components of cvc5, and then discuss them in more detail in the following subsections. A high-level overview of the system architecture is given in Figure 1. The central engine of cvc5 is the SMT Solver module, which is based on the CDCL(T) framework [99] and relies on a customized version of the MiniSat propositional solver [57] at its core. The SMT Solver consists of several components: the Rewriter and the Preprocessor modules, which apply simplifications locally (at the term level) and globally (on the whole input formula), respectively; the Propositional Engine, which serves as a manager for the CDCL(T) SAT solver; and the Theory Engine, which manages theory combination and all theory-specific and quantified reasoning procedures.
Besides standard satisfiability checking, cvc5 provides additional functionality such as abduction, interpolation, syntax-guided synthesis (SyGuS) [3], and quantifier elimination. Each of these features is implemented as an additional solver built on top of the SMT Solver. The SyGuS Solver is the main entry point for synthesis queries, which encode SyGuS problems as (higher-order) satisfiability problems with both semantic and syntactic constraints [114]. The Quantifier Elimination Solver performs quantifier elimination based on tracking the quantifier instantiations of the SMT Solver [116]. The Abduction Solver and the Interpolation Solver are both SyGuS-based [110] and thus are built as layers on top of the SyGuS Solver.
cvc5 provides a C++ API as the main interface, not just for external client software, but also for its own parser and for additional language bindings in Java and Python. cvc5 also provides a textual command-line interface (CLI), built on top of the parser, which supports SMT-LIBv2 [25], SyGuS2 [104] and TPTP [134] as input languages. The Proof Module can output formal unsatisfiability proofs in three proof formats: Alethe [128], Lean 4 [88], and LFSC [133].

The SMT Solver Module
The SMT Solver module is the centerpiece of cvc5 and is responsible for handling all SMT queries. Its functionality includes, in addition to satisfiability checking, constructing models for satisfiable input formulas and extracting assumptions, cores, and proof objects for unsatisfiable formulas. The main components of the SMT Solver module are described below.
Preprocessor. Before any satisfiability check, cvc5 applies to each formula from an input problem a sequence of satisfiability-preserving transformations. We distinguish between (i) required normalization passes, e.g., removal of ite terms; (ii) optional simplification passes aimed at making the formula easier to solve, e.g., finding entailed theory literals; and (iii) optional reduction passes that transform the formula from one logic to another, e.g., from non-linear integer arithmetic to a bit-vector problem with configurable bit-width. Currently, cvc5 implements 34 passes, executed in a fixed order. Optional passes can be enabled and disabled via configuration options. Preprocessing passes are self-contained, and adding or modifying passes does not require knowledge of the internals of the SMT solver engine.
Propositional Engine. The Propositional Engine serves as the core CDCL(T) engine [99], which takes the Boolean abstraction of the input formula (together with any lemmas produced during solving) and produces a satisfying assignment for that abstraction. Its main components are the Clausifier and the propositional satisfiability (SAT) solver. The Clausifier converts the Boolean abstraction into Conjunctive Normal Form (CNF), which then serves as input for the SAT solver. In cvc5, as in CVC4, we use a customized version of MiniSat [57] as the core SAT solver. Extensions we have added to MiniSat include: the production of resolution proofs; native support for pushing and popping assertions; and a Decision Engine [12], which can be used to create customized decision heuristics for MiniSat.
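To make the Clausifier's role concrete, the following Python sketch converts a small Boolean formula into CNF. It assumes a textbook Tseitin-style encoding, which the text above does not confirm as cvc5's exact scheme; the names tseitin and brute_force_sat are hypothetical and not part of cvc5.

```python
from itertools import count, product


def tseitin(formula):
    """Encode a formula built from ('var', name), ('not', f),
    ('and', f, g) and ('or', f, g) as an equisatisfiable CNF.
    Returns (variable ids, clauses); literals are signed integers."""
    fresh = count(1)
    var_ids = {}
    clauses = []

    def encode(f):
        kind = f[0]
        if kind == "var":
            if f[1] not in var_ids:
                var_ids[f[1]] = next(fresh)
            return var_ids[f[1]]
        if kind == "not":
            return -encode(f[1])
        a, b = encode(f[1]), encode(f[2])
        out = next(fresh)  # fresh variable naming this sub-formula
        if kind == "and":    # out <-> (a and b)
            clauses.extend([[-out, a], [-out, b], [out, -a, -b]])
        elif kind == "or":   # out <-> (a or b)
            clauses.extend([[out, -a], [out, -b], [-out, a, b]])
        else:
            raise ValueError(f"unknown connective: {kind}")
        return out

    clauses.append([encode(formula)])  # the whole formula must hold
    return var_ids, clauses


def brute_force_sat(clauses):
    """Stand-in for the SAT solver: try every assignment."""
    n = max(abs(lit) for clause in clauses for lit in clause)
    for bits in product([False, True], repeat=n):
        if all(any(bits[abs(lit) - 1] == (lit > 0) for lit in clause)
               for clause in clauses):
            return True
    return False
```

The exhaustive check only stands in for the SAT solver so that the sketch is self-contained; it is exponential and meant for tiny inputs.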
During its search, the Propositional Engine asserts a theory literal (¬)p to the Theory Engine as soon as the SAT solver assigns a truth value to the propositional variable abstracting the atom p. We refer to the set of all such literals as the currently asserted literals. When checking the consistency of the set L of currently asserted literals in the overall background theory T, we distinguish between two levels of effort: standard and full, depending on whether the SAT solver has a partial or full model, respectively, for the Boolean abstraction. At standard effort, a theory solver may optionally perform some lightweight consistency checking. At full effort, the theory solver must either produce a lemma (following the splitting-on-demand approach [23]) or determine whether L is satisfiable or not and, in the latter case, produce a conflict clause, a clause that is valid in the theory T but is inconsistent with L.
Rewriter. The Rewriter module is responsible for converting terms via a set of rewrite rules into semantically equivalent normal forms. In contrast to preprocessing, rewriting is done during solving. In fact, all major components of cvc5 invoke the Rewriter to ensure that the terms they work with are normalized, thereby simplifying their implementation. Rewrite rules are applied locally, i.e., independent of the currently asserted literals, and are divided into required and optional rules, of which the latter can be enabled or disabled by the user. The Rewriter maintains a cache to avoid processing any term more than once.
Examples of rewrites include simplifications such as x + 0 ⇝ x, normalizations that sort the operands of associative and commutative operators, and operator eliminations such as x ≤ y ⇝ y + 1 > x (when x and y have integer sort). In certain contexts, e.g., enumerative SyGuS approaches, aggressive rewriting rules, which would be detrimental to SMT solving, can be beneficial. Such rules are implemented in an Extended Rewriter, which is enabled when needed.
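The flavor of such local, cached rewriting can be sketched in a few lines of Python. The sketch below covers only two of the rule kinds mentioned above (dropping a neutral 0 from a sum and sorting the operands of a commutative operator); the tuple-based term encoding and the function name rewrite are assumptions made for illustration and do not reflect cvc5's internal node representation.

```python
from functools import lru_cache

# Terms are nested tuples: ('var', 'x'), ('const', 0), ('+', t1, t2, ...),
# or (op, t1, ..., tn) for any other operator.


@lru_cache(maxsize=None)  # plays the role of the rewriter's cache
def rewrite(term):
    kind = term[0]
    if kind in ("var", "const"):
        return term
    # Rewrite children first so that rules see normalized operands.
    args = [rewrite(arg) for arg in term[1:]]
    if kind == "+":
        # Flatten nested sums and drop the neutral element 0.
        flat = []
        for arg in args:
            flat.extend(arg[1:] if arg[0] == "+" else [arg])
        flat = [arg for arg in flat if arg != ("const", 0)]
        if not flat:
            return ("const", 0)
        if len(flat) == 1:
            return flat[0]
        # Sort operands by their printed form to obtain a canonical order.
        return ("+", *sorted(flat, key=repr))
    return (kind, *args)
```

Because the result depends only on the term itself and not on any asserted literals, caching by term is sound in this sketch, which mirrors the locality property described above.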
To help automate improvements to the Rewriter, we developed a workflow that detects and enumerates new rewrite rule candidates using the SyGuS solver [101]. It works by detecting and suggesting critical pairs, i.e., pairs of equivalent terms that are not rewritten to the same term by the current rules.
Theory Engine. The Theory Engine is the main entry point for checking the theory consistency of the theory literals asserted by the Propositional Engine. It dispatches each of these literals to the appropriate theory solvers and is further responsible for dispatching any propagated literals or lemmas generated by the theory solvers back to the Propositional Engine.
When multiple theory solvers are enabled, the Combination Engine submodule is responsible for coordinating between them. Like CVC4, cvc5 uses the polite theory combination mechanism [74,108,130]. This includes propagating or performing case splits on equalities and disequalities between shared terms (terms appearing in the literals of more than one theory solver). As in CVC4, the algorithm for computing these splits is based on care graphs [75].
The Combination Engine controls the Model Manager, which is responsible for combining models from multiple theories and constructs a model for the input formula. The Model Manager also maintains an equivalence relation E over all the terms in the input formula, induced by all of the currently asserted literals that are equalities. When invoked, the Model Manager has the responsibility of assigning concrete values to each equivalence class of E with the assistance of the individual theory solvers, which provide values for terms in their theory. Typically, the Model Manager is invoked only when the theory solvers have reached a saturation point that allows the Theory Engine to conclude that the input problem is satisfiable (and thus, a model can be constructed successfully).
As in CVC4, each sub-formula of the input that starts with a quantifier is abstracted by a propositional variable. When any such variable or its negation is asserted, the Theory Engine dispatches the corresponding quantified formula to the Quantifiers Module, which generates suitable quantifier instantiations. Since certain techniques for handling quantified formulas, e.g., E-matching [89], require knowledge of the state and terms known by the other theory solvers, this module has access to all equality information from all theory solvers.
Theory Solvers. cvc5 supports a wide range of theories, including all theories standardized in SMT-LIB. Each theory solver relies on an Equality Engine Module, which implements congruence closure over a configurable set of operators, typically those that belong to the solver's theory. The Equality Engine is responsible for quickly detecting conflicts due to equality reasoning. In addition, all theories communicate reasoning steps to the rest of the system via the Theory Inference Manager. Every theory solver emits lemmas, conflict clauses, and propagated literals through this interface. The Theory Inference Manager implements or simplifies common usage patterns such as caching and rewriting lemmas, proof construction, and collection of statistics. Every lemma or conflict sent from a theory is associated with a unique identifier for its kind, the inference identifier, which is a crucial debugging aid. Below, we briefly survey the theory solvers in cvc5, along with their main implementation techniques.
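The kind of conflict detection performed by an equality engine can be illustrated with a deliberately naive congruence closure in Python. This is a sketch under simplifying assumptions: it recomputes congruences by a quadratic fixpoint loop, has no backtracking, and the class name ToyEqualityEngine and its methods are hypothetical rather than cvc5's interface.

```python
class ToyEqualityEngine:
    """Constants are strings; applications are tuples ('f', arg1, ...)."""

    def __init__(self):
        self.parent = {}   # union-find forest over all known terms
        self.diseqs = []   # asserted disequalities

    def add_term(self, t):
        if t not in self.parent:
            self.parent[t] = t
            if isinstance(t, tuple):
                for arg in t[1:]:
                    self.add_term(arg)

    def find(self, t):
        self.add_term(t)
        while self.parent[t] != t:
            self.parent[t] = self.parent[self.parent[t]]  # path halving
            t = self.parent[t]
        return t

    def _close(self):
        # Merge f(s1..sn) and f(t1..tn) whenever all si ~ ti, to fixpoint.
        changed = True
        while changed:
            changed = False
            apps = [t for t in self.parent if isinstance(t, tuple)]
            for i, s in enumerate(apps):
                for t in apps[i + 1:]:
                    if (s[0] == t[0] and len(s) == len(t)
                            and self.find(s) != self.find(t)
                            and all(self.find(a) == self.find(b)
                                    for a, b in zip(s[1:], t[1:]))):
                        self.parent[self.find(s)] = self.find(t)
                        changed = True

    def assert_eq(self, s, t):
        self.parent[self.find(s)] = self.find(t)
        self._close()

    def assert_diseq(self, s, t):
        self.add_term(s)
        self.add_term(t)
        self.diseqs.append((s, t))

    def in_conflict(self):
        self._close()
        return any(self.find(s) == self.find(t) for s, t in self.diseqs)
```

For example, asserting f(f(f(a))) = a and f(f(f(f(f(a))))) = a together with f(a) ≠ a leads this engine to report a conflict, since the two equalities entail f(a) = a by congruence.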
Linear Arithmetic. The linear arithmetic solver [78] extends the simplex procedure adapted for SMT by Dutertre and de Moura [56]. It implements a sum-of-infeasibilities-based heuristic [79], an integration with the external GLPK LP solver [80], and certain heuristics proposed by Griggio [63]. Integer problems are handled by solving their real relaxation before using branching [64] and cutting planes [54] to find integer solutions. The branch-and-bound method optionally generates lemmas consisting of ternary clauses inspired by unit-cube tests [39].
Non-linear Arithmetic. For non-linear arithmetic problems, cvc5 resorts to linear abstraction and refinement. It uses a combination of independent sub-solvers integrated with the linear arithmetic solver and invoked only when the linear abstraction is satisfiable. One sub-solver implements cylindrical algebraic coverings [1], while the other sub-solvers are based on incremental linearization [45]. A variety of lemma schemas are used to assert properties of non-linear functions (e.g., multiplication and trigonometric functions) in a counterexample-guided fashion [123]. Non-linear integer problems are solved by incremental linearization and incomplete techniques based on reductions to bit-vectors.
Arrays. As in CVC4, the array solver is based on a decision procedure by de Moura and Bjørner [91] but following the more detailed description by Jovanović and Barrett [75]. An alternative experimental implementation based on an approach by Christ and Hoenicke [43] is also available.
Bit-Vectors. For the theory of fixed-size bit-vectors, cvc5's main approach is bit-blasting, which refers to the process of translating bit-vector problems into equisatisfiable SAT problems, and is applied after preprocessing. In cvc5, we distinguish two modes for bit-blasting: lazy and eager. Lazy bit-blasting seamlessly integrates with the CDCL(T) infrastructure of cvc5 and fully supports the combination of bit-vectors with any theory supported by cvc5. It further leverages the full power of cvc5's Equality Engine for reasoning about equalities over bit-vector terms and also uses the solve-under-assumptions feature [57] supported by many state-of-the-art SAT solvers. For problems that can be fully reduced to bit-vectors, cvc5 can also be used in eager mode. This mode does not rely on solving under assumptions, but instead directly asserts all of the bit-blasted constraints to the SAT solver, which usually enables more simplifications. Additionally, cvc5 supports the Ackermannization and eager bit-blasting of constraints involving uninterpreted functions and sorts [66].
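As a small illustration of what bit-blasting produces, the Python sketch below translates bit-vector addition into one Boolean formula per output bit using a ripple-carry adder. The choice of a ripple-carry encoding and all names (bit_vars, blast_add, evaluate, to_env) are assumptions for illustration; the text above does not specify which circuit encodings cvc5 uses.

```python
def bit_vars(name, width):
    """Boolean variables for the bits of a bit-vector, least significant first."""
    return [("var", f"{name}{i}") for i in range(width)]


def blast_add(xs, ys):
    """Ripple-carry adder: returns one Boolean formula per bit of
    xs + ys modulo 2^width (the final carry is discarded)."""
    out, carry = [], ("const", False)
    for x, y in zip(xs, ys):
        half = ("xor", x, y)
        out.append(("xor", half, carry))
        carry = ("or", ("and", x, y), ("and", carry, half))
    return out


def evaluate(formula, env):
    """Evaluate a Boolean formula under an assignment to its variables."""
    kind = formula[0]
    if kind == "var":
        return env[formula[1]]
    if kind == "const":
        return formula[1]
    a, b = evaluate(formula[1], env), evaluate(formula[2], env)
    if kind == "xor":
        return a != b
    if kind == "and":
        return a and b
    if kind == "or":
        return a or b
    raise ValueError(f"unknown connective: {kind}")


def to_env(name, value, width):
    """Assignment of the bit variables of `name` that encodes `value`."""
    return {f"{name}{i}": bool((value >> i) & 1) for i in range(width)}
```

In an actual solver the resulting formulas would be clausified and handed to the SAT solver; here they are only evaluated, which is enough to check that the encoding agrees with addition modulo 2^width.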
Datatypes. For quantifier-free constraints over datatypes, we use a rule-based procedure that follows calculi already implemented in CVC4 [24,112] and that optimizes the sharing of selectors over multiple constructors [125].
Floating-Point Arithmetic. Formulas in the theory of floating-point arithmetic are translated to equisatisfiable formulas in the theory of bit-vectors, in a process referred to as word-blasting. For this, cvc5 integrates the SymFPU [37] library, which was first used in CVC4 and has also been integrated in the Bitwuzla SMT solver [92]. This approach admits several optimizations compared to earlier solvers, which translate directly to the bit-level, e.g., CNF or AIGs. Another difference from older approaches [38] is that translation is done at the formula level instead of the term level. Conversions between real and floating-point terms are treated as uninterpreted functions and refined if the models of the real arithmetic and the floating-point solver do not agree. The refinement lemmas use the monotonicity of the conversion functions to constrain the floating-point and real arithmetic terms to matching intervals that exclude the current model.
Sets and Relations. cvc5 implements a solver for the parametric theory of finite sets, i.e., sets whose elements are of any sort supported by cvc5. The core decision procedure for sets is extended with support for cardinality constraints [13]. The set theory solver is extended with a sub-module that specializes in relational constraints [87], where relations are modeled as sets of tuples.
Separation Logic. In separation logic, the semantics of constraints assume a location and data type for specifying the model of the heap. cvc5 supports an extension of the SMT-LIB language for separation logic [73], in which the location and data types of the heap can be any sort supported by cvc5. The classical separation logic connectives are treated as theory predicates which are lazily reduced to constraints over sets and uninterpreted functions [115].
Strings and Sequences. For strings and sequences, cvc5 implements a solver consisting of multiple layered components. At its core, the solver reasons about length constraints and word equations [84], supplemented with reasoning about code points to handle conversions between strings and integers efficiently [119]. Extended functions such as string replacement are lazily reduced to word equations after context-dependent simplifications [126]. When necessary, the regular expressions in input problems are unfolded and derivatives are computed [85]. The string theory solver further incorporates aggressive simplification rules that rely on abstractions to derive facts about string terms [118]. Finally, conflicts are detected eagerly on partial assignments from the SAT solver by computing the congruence closure and constant prefixes and suffixes of string terms.
Uninterpreted Functions. The theory of uninterpreted functions is handled in largely the same way as in CVC4. It follows Simplify's approach [53] extended with support for fixed finite cardinality constraints [121]. This extension is used in combination with finite-model-finding techniques for finding finite models based on minimal interpretations of uninterpreted sorts.
Quantifiers. Quantified formulas are all handled by the Quantifiers Module, which resembles a theory solver. The module contains many sub-solvers, all based on some form of quantifier instantiation, and each specializing in solving specific classes of quantified formulas. The Quantifiers Module relies on heuristic E-matching when uninterpreted functions are present [89]. This technique is supplemented by conflict-based instantiation for detecting when an instantiation is in conflict with the currently asserted literals [16,124]. The Quantifiers Module additionally incorporates finite-model-finding techniques, which are useful for detecting satisfiable input problems [122]. It also relies on enumerative approaches when other techniques are incomplete [109]. For quantifiers over linear arithmetic, it uses a specialized counterexample-guided approach for quantifier instantiation [116]. An extension of this technique is used for quantified bit-vector logics [95]. For other quantified logics in pure background theories, e.g., over floating-point or non-linear arithmetic, cvc5 relies on syntax-guided quantifier instantiation [96]. The Quantifiers Module also contains sub-solvers implementing more advanced solving paradigms, including: a module for doing Skolemization with inductive strengthening and enumeration of sub-goals for inductive theorem proving problems [117], a finite-model-finding technique for recursive functions [113], and a solver for syntax-guided synthesis [114].

Proof Module
The Proof Module of cvc5 was built from scratch and replaces the proof system of CVC4 [67,77], which was incomplete and suffered from a number of architectural shortcomings. The design of cvc5's proof module was guided by the following principles. First, the overhead incurred by proof production should be at most linear in the solving time. Second, the emitted proofs should be detailed enough to enable efficient (i.e., polynomial) checking, ensuring that proof checking is inherently simpler than solving. Third, disabling a system component when in proof production mode because it lacks adequate proof generation capabilities should be done rarely and only if the component is not crucial for performance. Finally, given the different needs of users and the trade-offs offered by different proof systems, proof production should be flexible enough to allow the emission of proofs in different formats.
Following these design principles, the Proof Module in cvc5 produces detailed proofs for nearly all of its theories, rewrite rules, preprocessing passes, internal SAT solvers, and theory combination engines. It further supports eager and lazy proof production with built-in proof reconstruction. This enables proof production for some notoriously challenging functionalities, such as substitution and rewriting (common, for example, in simplification under global assumptions and in string solving [126]). Furthermore, although it maintains internally a single proof representation, cvc5 is able to emit proofs in multiple formats, including those supported by the LFSC [133] proof checker and the Lean 4 [88], Isabelle/HOL [100] and Coq [30] proof assistants.

Node Manager
Formulas and terms are represented uniformly in cvc5 as nodes in a directed acyclic graph, reference-counted and managed by the Node Manager. The Node Manager further maintains a Skolem Manager, which is responsible for tracking Skolem symbols introduced during solving. All cvc5 instances in the same thread share the same Node Manager instance.
Nodes are immutable and are aggressively shared using hash consing: whenever a new node is about to be created, the Node Manager checks whether a node with the same structure already exists, and if it does, it returns a reference to the existing node instead. Besides saving memory, this ensures that syntactic equality checks can be performed in constant time (by comparing the unique ids assigned to each node). Reference counting allows the Node Manager to determine when to dispose of nodes. Weak references are used whenever possible to limit the overhead of reference counting.
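Hash consing as described above can be sketched in Python as follows. The sketch omits reference counting (Python's runtime manages object lifetimes) and uses hypothetical names (ToyNode, ToyNodeManager, mk_node) that are not cvc5's API.

```python
class ToyNode:
    """A term node; treated as immutable once created."""
    __slots__ = ("id", "kind", "children")

    def __init__(self, node_id, kind, children):
        self.id = node_id
        self.kind = kind
        self.children = children


class ToyNodeManager:
    def __init__(self):
        self._table = {}  # (kind, child ids) -> unique node

    def mk_node(self, kind, *children):
        # Children are already unique, so their ids identify the structure.
        key = (kind, tuple(child.id for child in children))
        node = self._table.get(key)
        if node is None:
            node = ToyNode(len(self._table), kind, tuple(children))
            self._table[key] = node
        return node
```

With this scheme, two structurally identical terms are the same object, so syntactic equality reduces to comparing ids (or object identity) instead of traversing both terms.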
Nodes store 96 bits of metadata (id, reference count, kind, and number of children) and a variable number of pointers to child nodes. The kind of a node can be an operator kind, e.g., addition, or a leaf kind, e.g., a variable. Optional additional static information associated with nodes can be stored separately in hash maps referred to as node attributes. Since node attributes are managed by the Node Manager, which may be shared by multiple solver instances, attributes must only be used to capture inherent node properties (i.e., properties that are independent of run-time options).
Many theory solvers, including those for quantifiers, strings, arrays, non-linear arithmetic, and sets, introduce terms with Skolem (i.e., fresh) constants during solving. Such constants are centrally generated by the Skolem Manager, which also associates with each of them a term of the same sort, the constant's witness form. If the computed witness form for a constant matches that of a previously used constant, the previous constant can be reused. This not only provides a deterministic way of generating fresh constants during solving but also allows the system to minimize the number of introduced constants. This reuse is crucial for performance in some theory solvers [120].
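The reuse of Skolem constants keyed by witness form can be sketched as a simple lookup table. In the Python sketch below, witness forms are arbitrary hashable terms and the class ToySkolemManager, its method mk_skolem, and the naming scheme for constants are hypothetical choices for illustration.

```python
class ToySkolemManager:
    def __init__(self):
        self._by_witness = {}  # witness form -> Skolem constant name

    def mk_skolem(self, witness_form, prefix="sk"):
        """Return the constant for this witness form, creating it on first use."""
        if witness_form not in self._by_witness:
            name = f"{prefix}_{len(self._by_witness)}"
            self._by_witness[witness_form] = name
        return self._by_witness[witness_form]
```

Two requests with the same witness form yield the same constant, so the numbering of constants depends only on the order in which distinct witness forms are first requested.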

Context-Dependent Data Structures
Certain applications of SMT solvers require multiple satisfiability checks with similar assertions. To support such applications, the SMT-LIB standard includes commands to save (with a push command) the current set of user-level assertions and restore (with a pop command) a previous set. This allows the solver to reuse parts of the work from earlier satisfiability checks and amortizes startup cost. Most of the state of cvc5 depends directly or indirectly on the current set of assertions. So whenever the user pushes or pops, cvc5 has to save or restore the corresponding state. Similarly, whenever the SAT solver makes a decision or backtracks to a previous decision point, each theory solver has to save or restore the corresponding information.
To support these operations, cvc5 defines a notion of context level, which increases with each push and decreases with each pop operation, and implements context-dependent data structures. These data structures behave similarly to corresponding mutable data structures provided in the C++ standard library, except that they are associated with a context level and automatically save and restore their state as the context increases or decreases. For efficiency reasons, this state data is stored using a region-based custom allocator that allocates one region per context level, allowing all state data associated with a level to be freed simultaneously by simply freeing the corresponding region.

Figure 2. Solving the integer equation 2 · x = 4 with the low-level Python API:
    s = Solver()
    i = s.getIntegerSort()
    x = s.mkConst(i, "x")
    s.assertFormula(
        s.mkTerm(kinds.Equal,
                 s.mkTerm(kinds.Mult, x, s.mkInteger(2)),
                 s.mkInteger(4)))
    s.checkSat()
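The behavior of a context-dependent data structure can be sketched in Python with an undo trail. This is an illustrative substitute for the region-based allocator described in this subsection, not a rendering of it; the names ToyContext and ToyCDMap are hypothetical.

```python
_MISSING = object()  # marks "key was absent before this write"


class ToyContext:
    def __init__(self):
        self.level = 0
        self._trail = []  # undo actions, oldest first
        self._marks = []  # trail length recorded at each push

    def push(self):
        self._marks.append(len(self._trail))
        self.level += 1

    def pop(self):
        mark = self._marks.pop()
        while len(self._trail) > mark:
            self._trail.pop()()  # undo the most recent change first
        self.level -= 1

    def on_pop(self, undo):
        self._trail.append(undo)


class ToyCDMap:
    """A dictionary whose writes are rolled back when the context pops."""

    def __init__(self, context):
        self._context = context
        self._data = {}

    def __setitem__(self, key, value):
        old = self._data.get(key, _MISSING)

        def undo():
            if old is _MISSING:
                del self._data[key]
            else:
                self._data[key] = old

        self._context.on_pop(undo)
        self._data[key] = value

    def __getitem__(self, key):
        return self._data[key]

    def __contains__(self, key):
        return key in self._data
```

A map written at level 0, then modified after a push, returns to its level-0 contents after the matching pop, without the client code saving or restoring anything explicitly.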

Highlighted Features
In this section, we discuss features that are new in cvc5 as well as some of the more prominent user- and developer-facing features. We compare them to their counterparts in CVC4 when applicable.
Application Programming Interfaces (APIs). cvc5 provides a lean, comprehensive, and feature-complete C++ API, which also serves as the main interface for the parser module and the basis for all other language bindings. The parser module uses the same API as external users, without any special privileges. cvc5's C++ API has been designed and written from scratch and thus is not backwards compatible with CVC4's C++ API. It is centered around the Solver class, which represents a cvc5 instance and implements methods for tasks such as creating terms, asserting formulas, and issuing checks.
cvc5's Python API is built on top of cvc5's C++ API using Cython [29] and makes all of cvc5's features accessible to Python users. It is a straightforward translation of the C++ API without added syntactic sugar such as operator overloading. Additionally, however, cvc5 provides a higher-level layer on top of its Python API, which is more user-friendly and pythonic. This layer provides automatic solver management, allows SMT terms to be constructed using Python infix operators, and converts Python objects to SMT terms of the appropriate sort. This leads to much more succinct code, as shown in Figure 2, which compares using the high- and low-level Python APIs to solve the integer equation 2 · x = 4. The higher-level Python API is based on and designed to work as a drop-in replacement for Z3py, the Python API of Z3 [90].
cvc5's Java API is implemented via the Java Native Interface (JNI), which allows Java applications to invoke native code and vice versa [83]. In contrast, CVC4 uses SWIG [28] to semi-automatically generate bindings. One of the challenges of developing a Java API, and the main motivation for implementing it manually instead of using SWIG, is the interaction between Java's garbage collector and cvc5's reference-counting mechanism for terms and sorts. The new API implements the AutoCloseable interface to destroy the underlying C++ objects in the expected order. It mostly mirrors the C++ API and supports operator overloading, iterators, and exceptions. There are a few differences from the C++ API, such as using arbitrary-precision integer pairs, specifically, pairs of Java BigInteger objects, to represent rational numbers. In contrast to the old Java API, the new API puts greater emphasis on using Java-native types such as List<T> instead of wrapper classes for C++ types such as std::vector<T>.
Documentation. We provide comprehensive documentation for both cvc5 users [8] and developers [6]. User documentation contains instructions for building and installing cvc5 and its dependencies, extensive documentation and examples of common use cases for all available APIs, and a thorough description of all supported non-standard theories with examples. Developer documentation provides details of cvc5 internals and instructions for contributions, including guidelines for coding and testing, and a recommended development workflow.
Proofs. As mentioned above, cvc5 has a new proof system. Proofs are stored internally using a new custom intermediate representation. Multiple output proof formats are supported via target-specific post-processing transformations on this internal representation. The final proof object can then be pretty-printed and saved in a text file. The currently supported output proof formats include LFSC [133], Alethe [128], and the language of the Lean 4 [88] proof assistant.
CVC4 proofs exclusively used the LFSC format. cvc5 continues support for LFSC but with a new, more user-friendly syntax. LFSC is a logical framework, based on Edinburgh LF [69], which was explicitly designed to facilitate the production and checking of fine-grained proofs in SMT. It comes with a small and high-performance proof checker, which is generic in the sense that it takes as input both a proof term p and a proof signature, a definition of the data types and proof rules used to construct p. The checker verifies that p is well-formed with respect to the provided signature. We have defined proof signatures for all the individual theories supported by cvc5. These definitions can be combined together as needed to define a proof system for any combination of those theories. When emitting proofs in LFSC, cvc5 includes all the relevant signatures as a preamble to the proof term.
The Alethe proof format is a flexible proof format for SMT solvers based on SMT-LIB. It includes both coarse- and fine-grained steps and was first implemented in the veriT solver [34]. Alethe proofs can be checked via reconstruction within Isabelle/HOL [15,129] as well as within Coq, the latter via the SMTCoq plugin [5,58]. Our main motivation for producing Alethe proofs is to leverage these proof reconstruction infrastructures, thus enabling the trustworthy integration of cvc5 in Isabelle/HOL and Coq. Users of these tools can leverage the integration to dispatch selected goals to cvc5 for proving, thereby increasing the level of automation available to them without requiring a larger trusted core. These integrations represent ongoing work in cvc5 and are being carried out in close collaboration with both Isabelle/HOL and Coq experts.
Although we aim to have a similar full integration in the Lean 4 [88] proof assistant in the future, cvc5 currently only supports the use of Lean 4 as an external checker; i.e., cvc5 can emit proofs as Lean terms (for a subset of the theories supported by cvc5), and Lean 4 can then check these proofs. Since the underlying logic of Lean 4 is an extension of that of LFSC, this functionality follows an approach similar to that used for LFSC by modeling cvc5 proof rules as Lean types and reducing proof checking to type checking.
Syntax-Guided Synthesis. cvc5 has native support for syntax-guided synthesis (SyGuS) problems [3]. As mentioned, the cvc5 core has a dedicated module for encoding SyGuS problems into (higher-order) SMT formulas, annotated with syntactic restrictions. These restrictions are represented via a deep embedding into the theory of datatypes. Internally, after encoding the SyGuS problem, a sub-module of the quantifiers theory, called the synthesis engine, is the main entry point for solving. Based on the shape of the input, it uses one of three approaches. If the input problem has no syntactic restrictions, and is in single invocation form [114], that is, all functions to synthesize are applied to the same argument list, then it uses a quantifier-instantiation based approach. Otherwise, it uses one of two enumerative approaches, depending on the properties of the input [111]. The SyGuS solver also implements further refinements and extensions of the enumerative approaches, including algorithms for decision-tree learning [4] for programming-by-example problems, extended rewriting for enumeration [101], piecewise-independent unification [17], and static grammar-reduction techniques. Furthermore, the SyGuS solver contains specialized procedures to support an efficient implementation of interpolation and abduction.
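The basic idea behind the enumerative approaches can be illustrated with a naive enumerator in Python. The sketch below assumes a fixed toy grammar T ::= x | 0 | 1 | T + T | T * T and a specification given by input-output examples; it omits the datatype embedding, rewriting-based pruning, and other refinements described above, and the names enumerate_terms and synthesize are hypothetical.

```python
from itertools import product


def enumerate_terms(size):
    """Yield (text, function) for every term with exactly `size` symbols
    in the grammar T ::= x | 0 | 1 | T + T | T * T."""
    if size == 1:
        yield "x", lambda x: x
        yield "0", lambda x: 0
        yield "1", lambda x: 1
        return
    # One symbol is the operator; split the rest between both operands.
    for left_size in range(1, size - 1):
        right_size = size - 1 - left_size
        lefts = list(enumerate_terms(left_size))
        rights = list(enumerate_terms(right_size))
        for (lt, lf), (rt, rf) in product(lefts, rights):
            yield f"({lt} + {rt})", lambda x, lf=lf, rf=rf: lf(x) + rf(x)
            yield f"({lt} * {rt})", lambda x, lf=lf, rf=rf: lf(x) * rf(x)


def synthesize(examples, max_size=7):
    """Return the first (text, function) consistent with all examples,
    trying smaller terms first, or None if none is found."""
    for size in range(1, max_size + 1):
        for text, fn in enumerate_terms(size):
            if all(fn(x) == y for x, y in examples):
                return text, fn
    return None
```

Given the examples (0, 1), (1, 3), and (2, 5), this enumerator finds a five-symbol term equivalent to 2 · x + 1. Many of the enumerated candidates are equivalent to one another (for instance x + 0 and x), which is the redundancy that rewriting-based pruning is meant to remove.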
Interpolation and Abduction. cvc5 computes abducts and Craig interpolants [51] using solvers built on top of the SyGuS solver. The solver for interpolation translates an interpolation query into a SyGuS conjecture whose solutions are interpolants. Specifically, given quantifier-free formulas A and C over any combination of the theories supported by cvc5, the interpolation solver solves for B in the SyGuS conjecture (A → B) ∧ (B → C), with the syntactic restriction that B's free symbols range over the symbols shared by A and C. Any synthesized solution for B is, by construction, a Craig interpolant for A and C.
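The two requirements on an interpolant, namely the semantic conjecture (A → B) ∧ (B → C) and the restriction of B to shared symbols, can be checked on a small propositional instance. The sketch below is only an illustration under simplifying assumptions: formulas are Python predicates over Boolean assignments, validity is decided by truth-table enumeration rather than by an SMT query, and the helper names `is_valid` and `is_interpolant` are hypothetical.

```python
from itertools import product


def is_valid(formula, variables):
    """Check validity by enumerating all Boolean assignments (toy stand-in for a solver call)."""
    return all(formula(dict(zip(variables, values)))
               for values in product([False, True], repeat=len(variables)))


def is_interpolant(a, a_vars, b, b_vars, c, c_vars):
    """Check that b is a Craig interpolant for a and c."""
    # Syntactic restriction: b may only mention symbols shared by a and c.
    shared = set(a_vars) & set(c_vars)
    if not set(b_vars) <= shared:
        return False
    # Semantic conjecture: (a -> b) and (b -> c) must both be valid.
    all_vars = sorted(set(a_vars) | set(c_vars))
    return (is_valid(lambda m: (not a(m)) or b(m), all_vars)
            and is_valid(lambda m: (not b(m)) or c(m), all_vars))


# A = p and q, C = q or r; the only shared symbol is q.
A = lambda m: m["p"] and m["q"]
C = lambda m: m["q"] or m["r"]

print(is_interpolant(A, ["p", "q"], lambda m: m["q"], ["q"], C, ["q", "r"]))
print(is_interpolant(A, ["p", "q"], lambda m: m["p"], ["p"], C, ["q", "r"]))
```

In this instance the candidate q satisfies both requirements. The candidate p is rejected because it mentions a symbol that does not occur in C.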
Abduction is the process of constructing a formula B that, when added to a formula A, suffices to prove some goal formula C (equivalently, that makes the formula F = A ∧ B ∧ ¬C unsatisfiable). cvc5's abduction solver reduces this problem to a SyGuS one where B is the formula to be synthesized and the unsatisfiability of F is the semantic constraint. Optionally, the user can also impose syntactic restrictions on the abduct B. The SyGuS solver implements specific optimizations for abduction queries, such as using unsat cores to prune classes of invalid candidate solutions [110].
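As a rough illustration of abduction as a search problem, the sketch below enumerates candidate abducts from a very small candidate space (single literals) and returns the first one that makes A ∧ B ∧ ¬C unsatisfiable. It rests on several assumptions: formulas are propositional Python predicates, satisfiability is decided by brute-force enumeration, the helpers `is_sat`, `literal_candidates`, and `find_abduct` are hypothetical, and the sketch additionally requires A ∧ B to be satisfiable so that trivial abducts such as false are excluded.

```python
from itertools import product


def is_sat(formula, variables):
    """Brute-force satisfiability check over Boolean assignments."""
    return any(formula(dict(zip(variables, values)))
               for values in product([False, True], repeat=len(variables)))


def literal_candidates(variables):
    """Yield (name, predicate) pairs for each positive and negative literal."""
    for v in variables:
        yield v, (lambda m, v=v: m[v])
        yield f"(not {v})", (lambda m, v=v: not m[v])


def find_abduct(a, c, variables):
    """Return the name of the first literal B with A and B consistent and A and B entailing C."""
    for name, b in literal_candidates(variables):
        consistent = is_sat(lambda m: a(m) and b(m), variables)
        entails = not is_sat(lambda m: a(m) and b(m) and not c(m), variables)
        if consistent and entails:
            return name
    return None


# A = (p -> q), goal C = q.
A = lambda m: (not m["p"]) or m["q"]
C = lambda m: m["q"]
print(find_abduct(A, C, ["p", "q"]))
```

A user-supplied grammar for B would correspond to replacing `literal_candidates` with an enumerator over that grammar.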
Non-Linear Arithmetic. The new sub-solver for non-linear arithmetic is based on cylindrical algebraic coverings and closely follows [1], with some notable extensions. The implementation uses the libpoly library [76], which provides polynomial arithmetic and most algebraic routines required for the computation of cylindrical algebraic decompositions and coverings. Infeasible subsets are computed by tracking all contributing assertions for every covering. The infeasible subset is then obtained from the union of assertions from the top-level covering. The sub-solver implements several different variable orderings, as these can have a significant impact on run-times in practice. Apart from classical variable orderings used for cylindrical algebraic decomposition, some experimental orderings based on machine learning have been implemented, roughly following ideas from England et al. [59]. (Mixed real-) integer problems are supported by dynamically injecting intervals into coverings to cover gaps that do not contain integers.
Higher-Order Logic. cvc5 has been extended with partial support for higher-order logic [18]. The extension is based on a pragmatic approach in which λ-abstractions are eliminated eagerly via lambda lifting [71]. This approach is used with the theory solver for the quantifier-free fragment of the theory of equality with uninterpreted functions (EUF) and with the quantifier-instantiation technique based on E-matching with triggers [53,89]. For the EUF solver, we added support for (dis)equality constraints between functions, via an extensionality inference rule, and for partial applications of (curried) functions. For quantifier instantiation, we modified several of the data structures for E-matching to incorporate matching in the presence of equalities between function values, function variables, and partial function applications. The extension also uses custom axioms, such as an axiom simulating how functions are updated, to improve the generation of new λ-abstractions, since cvc5 does not yet perform HO-unification, which would allow it to synthesize arbitrary λ-abstractions.
New Bit-Vector Solver. cvc5 features a new bit-blasting solver, which supports the use of off-the-shelf SAT solvers such as CaDiCaL [31] or CryptoMiniSat [131] as SAT back-ends for both the eager and lazy bit-blasting approaches. In contrast, CVC4's lazy bit-blasting solver relied on a customized version of MiniSat and did not allow the use of more recent state-of-the-art SAT solvers.
Int-Blasting. In addition to bit-blasting, cvc5 implements int-blasting techniques, which reduce bit-vector problems to equisatisfiable non-linear integer arithmetic problems [97,138]. These techniques are orthogonal to bit-blasting and especially effective on unsatisfiable formulas over large bit-widths.
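The basic idea behind int-blasting can be sketched as follows: a bit-vector of width k is represented by an integer in the range [0, 2^k), and each bit-vector operator is replaced by integer arithmetic. The translations below are a simplified illustration with hypothetical function names, and the encodings used in [97,138] may differ in detail. Note that multiplication of two variables yields a non-linear integer term, and that bitwise operators need a bit-by-bit arithmetic encoding.

```python
def int_bvadd(a, b, k):
    """bvadd on width-k vectors as integer addition modulo 2^k."""
    return (a + b) % 2**k


def int_bvmul(a, b, k):
    """bvmul as integer multiplication modulo 2^k (non-linear when a and b are variables)."""
    return (a * b) % 2**k


def int_bvnot(a, k):
    """bvnot as subtraction from the all-ones value 2^k - 1."""
    return 2**k - 1 - a


def int_bvand(a, b, k):
    """bvand as a weighted sum of products of individual bits, extracted with div and mod."""
    return sum(2**i * (((a // 2**i) % 2) * ((b // 2**i) % 2)) for i in range(k))


# Compare the integer encodings with native bit operations for all 4-bit values.
K = 4
MASK = 2**K - 1
agree = all(
    int_bvadd(a, b, K) == (a + b) & MASK
    and int_bvmul(a, b, K) == (a * b) & MASK
    and int_bvand(a, b, K) == a & b
    and int_bvnot(a, K) == ~a & MASK
    for a in range(2**K) for b in range(2**K)
)
print(agree)
```

The size of the arithmetic encoding of addition and multiplication does not depend on the number of bits, which is one intuition for why such techniques can help on large bit-widths.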
Syntax-Guided Quantifier Instantiation. cvc5 features a new theory-agnostic enumerative quantifier-instantiation technique we call syntax-guided quantifier instantiation [96]. This technique leverages cvc5's SyGuS solver to synthesize terms for quantifier instantiation in a counterexample-guided manner.
Unsatisfiable Cores. In cvc5, unsat (short for unsatisfiable) core extraction has been completely overhauled. It now uses the new proof infrastructure for tracking preprocessing transformations, which, unlike CVC4's, supports most of the preprocessing passes. Unsat cores can be extracted either from the constructed proof or via the tracked preprocessing combined with assumption-based unsat core extraction [47]. For the latter, cvc5 uses the solve-under-assumptions feature available in the MiniSat-based SAT engine. This is a lightweight solution that requires neither the generation of proofs in the SAT solver nor full preprocessing proofs. However, if a user requests both unsat cores and proofs, cvc5 switches to proof-based unsat core extraction using the new proof infrastructure.
Distributed and Central Policies for Equality Reasoning. As mentioned in Section 2, the Combination Engine manages theory combination, and theory solvers manage their interactions with the rest of the system via their Equality Engine. In contrast to CVC4, the policy for assigning an Equality Engine to a theory solver in cvc5 is configurable. In the distributed policy, a new Equality Engine is generated and assigned for each theory solver. These theory solvers perform congruence closure and their theory-specific reasoning locally. The advantage of this approach is that the constraints are local to the theory and thus do not lead to overhead when combined with other theories. In the central policy, a single, shared Equality Engine is assigned to all theory solvers. The advantage of this approach is that communication of facts between theory solvers happens automatically, which in turn can trigger theory propagations more eagerly. Both policies use the same core Equality Engine Module. Each theory solver has been refactored to be agnostic with respect to the equality policy.
Decision Heuristic. For Boolean reasoning, in addition to MiniSat's decision heuristic, cvc5 implements a separate decision heuristic which uses the original Boolean structure of the input to keep track of the justified parts of the input constraints, i.e., the parts where it can infer the value of terms based on a (partial) assignment to sub-terms. To make decisions, this new heuristic traverses assertions not satisfied by the currently asserted literals, computing the desired values (starting with true as the desired value for the root) for each term until it finds an unasserted literal that would contribute towards a desired value. This heuristic is a reimplementation and extension of a heuristic [12] implemented in CVC4. The heuristic optionally prioritizes assertions that most frequently contributed to conflicts in the past using a dynamic ordering scheme.
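The traversal described above can be sketched on a small formula representation. In the sketch below, formulas are nested tuples over the connectives and, or, and not; `value` performs a three-valued evaluation under the current partial assignment, and `decide` descends from an unsatisfied assertion with a desired value until it reaches an unassigned literal. The representation and the function names are assumptions made for illustration; the sketch omits conflict handling and the dynamic ordering of assertions.

```python
def value(node, assignment):
    """Three-valued evaluation: True, False, or None if not yet justified."""
    kind = node[0]
    if kind == "lit":
        return assignment.get(node[1])
    if kind == "not":
        v = value(node[1], assignment)
        return None if v is None else not v
    vals = [value(child, assignment) for child in node[1:]]
    if kind == "and":
        if any(v is False for v in vals):
            return False
        return True if all(v is True for v in vals) else None
    # kind == "or"
    if any(v is True for v in vals):
        return True
    return False if all(v is False for v in vals) else None


def decide(node, desired, assignment):
    """Return (atom, polarity) for an unassigned literal that contributes to `desired`."""
    if value(node, assignment) is not None:
        return None  # this part of the formula already has a value
    kind = node[0]
    if kind == "lit":
        return (node[1], desired)
    if kind == "not":
        return decide(node[1], not desired, assignment)
    # For both "and" and "or", a child without a value should take the desired value.
    for child in node[1:]:
        decision = decide(child, desired, assignment)
        if decision is not None:
            return decision
    return None


def next_decision(assertions, assignment):
    """Pick a decision from the first assertion not yet satisfied, desiring True at its root."""
    for assertion in assertions:
        if value(assertion, assignment) is not True:
            decision = decide(assertion, True, assignment)
            if decision is not None:
                return decision
    return None


# Assertions: (a or b) and ((not a) or c).
assertions = [
    ("or", ("lit", "a"), ("lit", "b")),
    ("or", ("not", ("lit", "a")), ("lit", "c")),
]
print(next_decision(assertions, {}))
print(next_decision(assertions, {"a": True}))
print(next_decision(assertions, {"a": True, "c": True}))
```

In this example b is never decided on: once a is true the first assertion is justified, so the heuristic moves on to the second assertion.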
Additional Features. Many more aspects and features have been improved and implemented with the goal of providing useful information to users and developers. Notable examples include: a complete overhaul of CVC4's mechanism for collecting statistics; improved bookkeeping for information about theory lemmas; and a general mechanism for communicating additional information to users such as quantifier instantiations and terms enumerated by the SyGuS solver.

Evaluation
We evaluate cvc5's overall performance (commit 5f998504) by comparing it against Z3 4.8.12 [90] and CVC4 1.8. Z3 is a widely used, high-performance SMT solver which, like cvc5, supports a wide range of theories. We compare against CVC4 to illustrate some of the performance improvements implemented as part of the move to cvc5. To run CVC4 optimally, we use the same command-line options as those in CVC4's competition script for SMT-COMP 2020 [9]. Similarly, for cvc5, we use a (slightly updated) version of the competition script from SMT-COMP 2021 [7]. For some logics, e.g., quantified logics, these scripts try multiple options in a sequential portfolio. We ran all experiments on a cluster equipped with Intel Xeon E5-2620 v4 CPUs. We allocated one CPU core and 8GB of RAM for each solver and benchmark pair and ran each benchmark with a 20-minute time limit, the same time limit used at SMT-COMP 2021 [102]. We used all non-incremental SMT-LIB [22] benchmarks for our evaluation, with the exception of 45 (misclassified) benchmarks that have quantifiers in quantifier-free logics and 1128 (misclassified) benchmarks that have non-linear literals in linear arithmetic logics. These are known misclassifications in the current release of SMT-LIB. Note that many benchmarks in SMT-LIB come from industrial applications. Table 1 shows the number of solved benchmarks for each solver using the same divisions as those used for SMT-COMP 2021. There were no disagreements among the solvers on the satisfiability of benchmarks. Overall, cvc5 solves the largest number of benchmarks. Compared to CVC4, cvc5 solves fewer benchmarks in the quantifier-free linear integer arithmetic division due to refactorings related to adding proof support. In the quantifier-free equality and bit-vector division, cvc5 also solves fewer benchmarks, which we attribute to the fact that the new bit-vector solver has not yet been optimized for theory combination.
Finally, for quantifier-free string benchmarks, there have been bug fixes since CVC4 that affected performance.
In addition to regularly participating in SMT-COMP, cvc5 and CVC4 also participate in the CADE ATP System Competition (CASC) and in SyGuS-Comp [103]. In CASC, cvc5 tends to perform in the middle of the pack on untyped theorem divisions (unsatisfiable quantified UF in SMT-LIB parlance), and towards the top of the pack on theorems with arithmetic. The last time SyGuS-Comp was held was in 2019, when CVC4 won four out of five tracks.
CVC4 is used extensively in industry, and our users are in the process of updating to cvc5. Examples of its use include: a back-end for Zelkova, a tool developed at Amazon to reason about AWS Access Policies [10,11,33]; a back-end for Boogie [20], which is used in many projects including Dafny [81] and the Move Prover [137], a tool used to formally verify smart contracts; a back-end at Certora, another company engaged in formal verification of smart contracts [138]; a back-end for Sledgehammer [32], a tool for discharging proof obligations in Isabelle used by Isabelle's own industrial users; and a back-end for SPARK [70], a development environment for safety-critical Ada programs.

Future Work
We briefly highlight a few current development directions for cvc5.
Optimization Solver. Optimization modulo theories (OMT) [136] is an extension of SMT, which requires a solver not only to determine satisfiability but also to return a satisfying assignment (if any) that optimizes one or more objectives. OMT is already supported by several solvers including MathSAT [46] and Z3. cvc5 already has internal infrastructure for supporting OMT queries. We aim to improve and expose (through the APIs) this capability in the near future.
Theory of Bags. cvc5 has preliminary support for a theory of multisets (or bags) that can be implemented via a reduction to linear integer arithmetic [107]. We plan to extend this theory with higher-order combinators such as map and fold. With these combinators, and encoding relational tables as bags of tuples, cvc5 will be able to support several commonly-used table operations, with the goal of facilitating reasoning about SQL queries and database applications.
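The connection between bags and integer arithmetic can be illustrated by viewing a bag as a function from elements to non-negative multiplicities, so that each bag operator becomes an arithmetic operation on multiplicities. The sketch below is an illustration under that assumption and is not cvc5's reduction from [107]; the Python helper names are hypothetical. The final example shows a map combinator applied to a table encoded as a bag of tuples, where projecting a column preserves duplicates as in SQL.

```python
def combine(a, b, op):
    """Build a bag whose multiplicity for each element is op(m_a, m_b)."""
    result = {}
    for elem in set(a) | set(b):
        mult = op(a.get(elem, 0), b.get(elem, 0))
        if mult > 0:
            result[elem] = mult
    return result


def union_disjoint(a, b):
    """Multiplicities are added."""
    return combine(a, b, lambda m, n: m + n)


def union_max(a, b):
    """Multiplicity is the maximum of the two."""
    return combine(a, b, max)


def intersection_min(a, b):
    """Multiplicity is the minimum of the two."""
    return combine(a, b, min)


def difference_subtract(a, b):
    """Multiplicities are subtracted, truncated at zero."""
    return combine(a, b, lambda m, n: max(m - n, 0))


def bag_map(f, a):
    """Apply f to every element; multiplicities of equal images are summed."""
    result = {}
    for elem, mult in a.items():
        image = f(elem)
        result[image] = result.get(image, 0) + mult
    return result


a = {"x": 2, "y": 1}
b = {"x": 1, "z": 3}
print(union_disjoint(a, b))
print(difference_subtract(a, b))

# A table with columns (customer, amount) as a bag of tuples; project the amount column.
orders = {("alice", 10): 2, ("bob", 10): 1, ("bob", 25): 1}
print(bag_map(lambda row: row[1], orders))
```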
Floating-Point Arithmetic. In addition to word-blasting, we plan to leverage our work on invertibility conditions [36] to lift the local search approach for bit-vectors from [93,94] to floating-point arithmetic.
Internal Portfolio. Due to the computational complexity of SMT, there is often no single strategy that works best for all problems. As a result, users of SMT solvers often rely on portfolio approaches to try different sets of options, either in parallel or sequentially, as we did in Section 4. Implementing portfolio approaches that use the solver as a black box is sub-optimal because some work, such as parsing, has to be duplicated. The cvc5 roadmap includes plans to support portfolio solving internally, thereby avoiding that additional overhead. We further plan to provide predefined portfolios tuned for specific use cases. As one example of the different needs of different use cases, some applications prefer the solver to always return quickly (even if the answer is "unknown") whereas others expect the solver to try as hard as possible to solve a given problem.
New Parser. cvc5's current parser is inherited from CVC4 and is based on the ANTLR 3 parser generator [105]. In addition to relying on a now deprecated version of ANTLR, the parser is unacceptably slow on large inputs and provides no API for user applications to interact with. A new parser using Flex [106] and Bison [49] is in development. The new parser will also provide an API allowing users to parse whole files or individual terms.