Satisfiability Modulo Finite Fields

We study satisfiability modulo the theory of finite fields and give a decision procedure for this theory. We implement our procedure for prime fields inside the cvc5 SMT solver. Using this theory, we construct SMT queries that encode translation validation verification conditions for various zero knowledge proof compilers applied to Boolean computations. We evaluate our procedure on these benchmarks. Our experiments show that our implementation is superior to previous approaches (which encode field arithmetic using integers or bit-vectors).


Introduction
Finite fields are critical to the design of recent cryptosystems. For instance, elliptic curve operations are defined in terms of operations in a finite field. Also, Zero-Knowledge Proofs (ZKPs) and Multi-Party Computations (MPCs), powerful tools for building secure and private systems, often require key properties of the system to be expressed as operations in a finite field.
Field-based cryptosystems already safeguard everything from our money to our privacy. Over 80% of our TLS connections, for example, use elliptic curves [4,68]. Private cryptocurrencies [34,61,90] built on ZKPs have billion-dollar market capitalizations [46,47]. And MPC protocols have been used to operate auctions [17], facilitate sensitive cross-agency collaboration in the US federal government [5], and compute cross-company pay gaps [8]. These systems safeguard our privacy, assets, and government data. Their importance justifies spending considerable effort to ensure that the systems are free of bugs that could compromise the resources they are trying to protect; thus, they are prime targets for formal verification.
However, verifying field-based cryptosystems is challenging, in part because current automated verification tools do not reason directly about finite fields. Many tools use Satisfiability Modulo Theories (SMT) solvers as a back-end [9,29,35,94,96]. SMT solvers [7,10,12,20,28,37,75,78,79] are automated reasoners that determine the satisfiability of formulas in first-order logic with respect to one or more background theories. They combine propositional search with specialized reasoning procedures for these theories, which model common data types such as Booleans, integers, reals, bit-vectors, arrays, algebraic datatypes, and more. Since SMT solvers do not currently support a theory of finite fields, SMT-based tools must encode field operations using another theory.
There are two natural ways to represent finite fields using commonly supported theories in SMT, but both are ultimately inefficient. Recall that a finite field of prime order can be represented as the integers with addition and multiplication performed modulo a prime p. Thus, field operations can be represented using integers or bit-vectors: both support addition, multiplication, and modular reduction. However, both approaches fall short. Non-linear integer reasoning is notoriously challenging for SMT solvers, and bit-vector solvers perform abysmally on fields of cryptographic size (hundreds of bits).
In this paper, we develop for the first time a direct solver for finite fields within an SMT solver. We use well-known ideas from computer algebra (specifically, Gröbner bases [21] and triangular decomposition [6,100]) to form the basis of our decision procedure. However, we improve on this baseline in two important ways. First, our decision procedure does not manipulate field polynomials (i.e., those of form X^p − X). As expected, this results in a loss of completeness at the Gröbner basis stage. However, surprisingly, this often does not matter. Furthermore, completeness is recovered during the model construction algorithm (albeit in a rather rudimentary way). This modification turns out to be crucial for obtaining reasonable performance. Second, we implement a proof-tracing mechanism in the Gröbner basis engine, thereby enabling it to compute unsatisfiable cores, which is also very beneficial in the context of SMT solving. Finally, we implement all of this as a theory solver for prime-order fields inside the cvc5 SMT solver.
To guide research in this area, we also give a first set of QF_FF (quantifier-free, finite field) benchmarks, obtained from the domain of ZKP compiler correctness. ZKP compilers translate from high-level computations (e.g., over Booleans, bit-vectors, arrays, etc.) to systems of finite field constraints that are usable by ZKPs. We instrument existing ZKP compilers to produce translation validation [87] verification conditions, i.e., conditions that represent desirable correctness properties of a specific compilation. We give these compilers concrete Boolean computations (which we sample at random), and construct SMT formulas capturing the correctness of the ZKP compilers' translations of those computations into field constraints. We represent the formulas using both our new theory of finite fields and also the alternative theory encodings mentioned above.
We evaluate our tool on these benchmarks and compare it to the approaches based on bit-vectors, integers, and pure computer algebra (without SMT). We find that our tool significantly outperforms the other solutions. Compared to the best previous solution (we list prior alternatives in Section 7), it is 6× faster and it solves 2× more benchmarks.
In sum, our contributions are:
1. a definition of the theory of finite fields in the context of SMT;
2. a decision procedure for this theory that avoids field polynomials and produces unsatisfiable cores;
3. the first public theory solver for this theory (implemented in cvc5); and
4. the first set of QF_FF benchmarks, which encode translation validation queries for ZKP compilers on Boolean computations.

Related Work
There is a large body of work on computer algebra, with many algorithms implemented in various tools [1,18,33,39,51,54,60,74,101,102]. However, the focus in this work is on quickly constructing useful algebraic objects (e.g., a Gröbner basis), rather than on searching for a solution to a set of field constraints.
One line of recent work [56,57] by Hader and Kovács considers SMT-oriented field reasoning. One difference with our work is that it scales poorly with field size because it uses field polynomials to achieve completeness. Furthermore, their solver is not public.
Others consider verifying field constraints used in ZKPs. One paper surveys possible approaches [98], and another considers proof-producing ZKP compilation [26]. However, neither develops automated, general-purpose tools.
Still other works study automated reasoning for non-linear arithmetic over reals and integers [3, 25, 27, 31, 49, 62-64, 72, 76, 97, 99]. A key challenge is reasoning about comparisons. We work over finite fields and do not consider comparisons because they are used for neither elliptic curves nor most ZKPs.

Algebra
Here, we summarize algebraic definitions and facts that we will use; see [73, Chapters 1 through 8] or [36, Part IV] for a full presentation.
Finite Fields. A finite field is a finite set equipped with binary operations + and × that have identities (0 and 1 respectively), have inverses (save that there is no multiplicative inverse for 0), and satisfy associativity, commutativity, and distributivity. The order of a finite field is the size of the set. All finite fields have order q = p^e for some prime p (called the characteristic) and positive integer e. Such an integer q is called a prime power.
Up to isomorphism, the field of order q is unique and is denoted F_q, or F when the order is clear from context. The fields F_{q^d} for d > 1 are called extension fields of F_q. In contrast, F_q may be called the base field. We write F ⊂ G to indicate that F is a field that is isomorphic to the result of restricting field G to some subset of its elements (but with the same operations). We note in particular that F_q ⊂ F_{q^d}. A field of prime order p is called a prime field.
Polynomials. For a finite field F and formal variables X_1, ..., X_k, F[X_1, ..., X_k] denotes the set of polynomials in X_1, ..., X_k with coefficients in F. By taking the variables to be in F, a polynomial f ∈ F[X_1, ..., X_k] can be viewed as a function from F^k → F. However, by taking the variables to be in an extension G of F, f can also be viewed as a function from G^k → G.
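For a set of polynomials S = {f_1, ..., f_m} ⊂ F[X_1, ..., X_k], the set I = {g_1 f_1 + ··· + g_m f_m : g_i ∈ F[X_1, ..., X_k]} is called the ideal generated by S and is denoted ⟨f_1, ..., f_m⟩ or ⟨S⟩. In turn, S is called a basis for the ideal I.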
The variety of an ideal I in field G ⊃ F is denoted V_G(I), and is the set {x ∈ G^k : ∀f ∈ I, f(x) = 0}. That is, V_G(I) contains the common zeros of polynomials in I, viewed as functions over G. Note that for any set of polynomials S that generates I, V_G(I) contains exactly the common zeros of S in G. When the space G is just F, we denote the variety as V(I). An ideal I that contains 1 contains all polynomials and is called trivial.
One can show that if I is trivial, then V(I) = ∅. However, the converse does not hold. For instance, X^2 + 1 ∈ F_3[X] has no zeros in F_3, but 1 ∉ ⟨X^2 + 1⟩. The field polynomial for field F_q in variable X is X^q − X. Its zeros are all of F_q and it has no additional zeros in any extension of F_q. Thus, for an ideal I of polynomials in F[X_1, ..., X_k] that contains field polynomials for each variable X_i, I is trivial iff V(I) = ∅. For this reason, field polynomials are a common tool for ensuring the completeness of ideal-based reasoning techniques [50,56,98].
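Both facts about zeros can be checked by enumeration in small prime fields. The following sketch is only an illustration: the helper names are ours, and X^2 + 1 over F_3 is simply one convenient polynomial without zeros in its base field.

```python
def roots_in_prime_field(coeffs, p):
    """Return the zeros in F_p of sum(coeffs[i] * X**i), found by enumeration."""
    return [
        x for x in range(p)
        if sum(c * pow(x, i, p) for i, c in enumerate(coeffs)) % p == 0
    ]

def field_polynomial(p):
    """Coefficients (lowest degree first) of the field polynomial X^p - X."""
    coeffs = [0] * (p + 1)
    coeffs[1] = -1
    coeffs[p] = 1
    return coeffs

# The field polynomial vanishes on every element of F_5.
all_of_f5 = roots_in_prime_field(field_polynomial(5), 5)

# X^2 + 1 has no zero in F_3, yet the ideal it generates is not trivial:
# every non-zero multiple of a degree-2 polynomial has degree >= 2, so 1 is not one.
no_zeros = roots_in_prime_field([1, 0, 1], 3)
```

Enumeration is only feasible for tiny fields; it is used here purely to make the definitions concrete.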
Representation. We represent F_p as the set of integers {0, 1, ..., p − 1}, with the operations + and × performed modulo p. The representation of F_{p^e} with e > 1 is more complex. Unfortunately, the set {0, 1, ..., p^e − 1} with + and × performed modulo p^e is not a field because multiples of p do not have multiplicative inverses. Instead, we represent F_{p^e} as the set of polynomials in F_p[X] of degree less than e. The operations + and × are performed modulo q(X), an irreducible polynomial of degree e [73, Chapter 6]. There are p^e such polynomials, and so long as q(X) is irreducible, all (save 0) have inverses. Note that this definition of F_{p^e} generalizes F_p, and captures the fact that F_p ⊂ F_{p^e}.
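To make the representation concrete, here is a minimal sketch under our own assumptions: the function names are hypothetical, and the choice of F_9 = F_3[X]/(X^2 + 1) is ours (X^2 + 1 is irreducible over F_3 because it has degree 2 and no zeros there). It contrasts the integers modulo p^e with the polynomial representation.

```python
from itertools import product

def inverse_mod(a, n):
    """Inverse of a modulo n, or None when a is not invertible."""
    try:
        return pow(a, -1, n)  # Python 3.8+; raises ValueError if gcd(a, n) != 1
    except ValueError:
        return None

def integers_mod_form_field(n):
    """Brute-force check: does every non-zero residue mod n have an inverse?"""
    return all(inverse_mod(a, n) is not None for a in range(1, n))

P = 3  # characteristic of the toy extension field

def mul_f9(u, v):
    """Multiply a0 + a1*X and b0 + b1*X in F_3[X] modulo X^2 + 1."""
    a0, a1 = u
    b0, b1 = v
    # X^2 is replaced by -1, because X^2 + 1 = 0 in the quotient.
    return ((a0 * b0 - a1 * b1) % P, (a0 * b1 + a1 * b0) % P)

F9 = list(product(range(P), repeat=2))        # all polynomials of degree < 2
F9_NONZERO = [u for u in F9 if u != (0, 0)]
f9_is_field = all(
    any(mul_f9(u, v) == (1, 0) for v in F9_NONZERO) for u in F9_NONZERO
)
```

In this sketch, `integers_mod_form_field(9)` fails because 3 and 6 have no inverse modulo 9, while every non-zero element of the polynomial representation has one.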

Ideal Membership
The ideal membership problem is to determine whether a given polynomial p is in the ideal generated by a given set of polynomials D. We summarize definitions and facts relevant to algorithms for this problem; see [32] for a full presentation.

Monomial Ordering
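In F[X_1, ..., X_k], a monomial is a polynomial of the form X_1^{e_1} ··· X_k^{e_k} with non-negative integers e_i. A monomial ordering is a total ordering on monomials such that for all monomials p, q, r, if p < q, then pr < qr. With respect to a fixed ordering, lm(f) denotes the leading (greatest) monomial of a polynomial f, and lt(f) denotes its leading term (the leading monomial together with its coefficient).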
Reduction. For polynomials p and d, if lm(d) divides a term t of p, then we say that p reduces to r modulo d (written p →_d r) for r = p − (t / lt(d)) · d. For a set of polynomials D, we write p →_D r if p →_d r for some d ∈ D. Let →*_D be the transitive closure of →_D. We define p ⇒_D r to hold when p →*_D r and there is no r′ such that r →_D r′.
Reduction is a sound (but incomplete) algorithm for ideal membership. That is, one can show that p ⇒_D 0 implies p ∈ ⟨D⟩, but the converse does not hold in general.
Gröbner Bases. Define the s-polynomial for polynomials p and q by spoly(p, q) = (L / lt(p)) · p − (L / lt(q)) · q, where L = lcm(lm(p), lm(q)). A Gröbner basis (GB) [21] is a set of polynomials P characterized by the following equivalent conditions:
1. ∀p, p′ ∈ P, spoly(p, p′) ⇒_P 0 (closure under the reduction of s-polynomials)
2. ∀p ∈ ⟨P⟩, p ⇒_P 0 (reduction is a complete test for ideal membership)
Gröbner bases are useful for deciding ideal membership. From the first characterization, one can build algorithms for constructing a Gröbner basis for any ideal [21]. Then, the second characterization gives an ideal membership test. When P is a GB, the relation ⇒_P is a function (i.e., →_P is confluent), and it can be efficiently computed [1,21]; thus, this test is efficient.
A Gröbner basis engine takes a set of generators G for some ideal I and computes a Gröbner basis for I. We describe the high-level design of such engines here. An engine constructs a sequence of bases G_0, G_1, G_2, ... (with G_0 = G) until some G_i is a Gröbner basis. Each G_i is constructed from G_{i−1} according to one of three types of steps. First, for some p, q ∈ G_{i−1} such that spoly(p, q) ⇒_{G_{i−1}} r ≠ 0, the engine can set G_i = G_{i−1} ∪ {r}. Second, for some p ∈ G_{i−1} such that p ⇒_{G_{i−1} \ {p}} r ≠ p, the engine can set G_i = (G_{i−1} \ {p}) ∪ {r}. Third, for some p ∈ G_{i−1} such that p ⇒_{G_{i−1} \ {p}} 0, the engine can set G_i = G_{i−1} \ {p}. Notice that all rules depend on the current basis; some add polynomials, and some remove them. In general, it is unclear which sequence of steps will construct a Gröbner basis most quickly: this is an active area of research [1,18,43,45].
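As a small worked example of the first kind of step (our own illustration, not taken from the paper), take the lexicographic ordering with X > Y, coefficients in any prime field, and G_0 = {p, q} with p = XY − 1 and q = Y^2 − 1. Both leading coefficients are 1, so leading terms and leading monomials coincide.

```latex
\begin{align*}
\operatorname{lcm}(\operatorname{lm}(p), \operatorname{lm}(q))
  &= \operatorname{lcm}(XY,\; Y^2) = XY^2, \\
\operatorname{spoly}(p, q)
  &= \frac{XY^2}{XY}\, p - \frac{XY^2}{Y^2}\, q
   = Y\,(XY - 1) - X\,(Y^2 - 1)
   = X - Y .
\end{align*}
```

Neither lm(p) = XY nor lm(q) = Y^2 divides a term of X − Y, so X − Y is irreducible modulo G_0 and non-zero. Hence G_0 is not a Gröbner basis, and an engine may set G_1 = G_0 ∪ {X − Y}.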

Zero Knowledge Proofs
Zero-knowledge proofs allow one to prove that some secret data satisfies a public property, without revealing the data itself. See [95] for a full presentation; we give a brief overview here. There are two parties: a verifier V and a prover P. V knows a public instance x and asks P to show that it has knowledge of a secret witness w satisfying a public predicate ϕ(x, w). To do so, P runs an efficient (i.e., polytime in a security parameter λ) proving algorithm Prove(ϕ, x, w) → π and sends the resulting proof π to V. Then, V runs an efficient verification algorithm Verify(ϕ, x, π) → {0, 1} that accepts or rejects the proof. A system for Zero-Knowledge Proofs of knowledge (ZKPs) is a (Prove, Verify) pair with:
- completeness: if ϕ(x, w), then Pr[Verify(ϕ, x, Prove(ϕ, x, w)) = 0] ≤ negl(λ);
- computational knowledge soundness [16]: (informal) a polytime adversary that does not know w satisfying ϕ can produce an acceptable π with probability at most negl(λ);
- zero-knowledge [52]: (informal) π reveals nothing about w, other than its existence.
ZKP applications are manifold. ZKPs are the basis of private cryptocurrencies such as Zcash and Monero, which have a combined market capitalization of $2.80B as of 30 June 2022 [46,47]. They've also been proposed for auditing sealed court orders [48], operating private gun registries [65], designing privacy-preserving middleboxes [55] and more [24,58]. This breadth of applications is possible because implemented ZKPs are very general: they support any ϕ checkable in polytime. However, ϕ must be first compiled to a cryptosystem-compatible computation language. The most common language is a rank-1 constraint system (R1CS). In an R1CS C, x and w are together encoded as a vector z ∈ F^m. The system C is defined by three matrices A, B, C ∈ F^{n×m}; it is satisfied when Az • Bz = Cz, where • is the elementwise product. Thus, the predicate can be viewed as n distinct constraints, where constraint i has form (Σ_j A_{ij} z_j) × (Σ_j B_{ij} z_j) − (Σ_j C_{ij} z_j) = 0. Note that each constraint is a degree ≤ 2 polynomial in m variables that z must be a zero of. For security reasons, F must be large: its prime must have ≈255 bits.
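The satisfaction condition Az • Bz = Cz can be sketched directly. The code below is an illustration under our own assumptions: the function names, the toy prime 11, the layout z = (1, x, y, out) with a leading constant 1, and the two example constraints (x · y = out and x · (1 − x) = 0) are all hypothetical choices made for the example.

```python
def mat_vec(M, z, p):
    """Matrix-vector product over F_p."""
    return [sum(m * v for m, v in zip(row, z)) % p for row in M]

def r1cs_satisfied(A, B, C, z, p):
    """Check Az * Bz = Cz (elementwise product) over F_p."""
    az, bz, cz = (mat_vec(M, z, p) for M in (A, B, C))
    return all((a * b - c) % p == 0 for a, b, c in zip(az, bz, cz))

P = 11  # toy prime; deployed systems use primes of roughly 255 bits

# Assumed layout: z = (1, x, y, out).
A = [[0, 1, 0, 0],    # x
     [0, 1, 0, 0]]    # x
B = [[0, 0, 1, 0],    # y
     [1, -1, 0, 0]]   # 1 - x
C = [[0, 0, 0, 1],    # out
     [0, 0, 0, 0]]    # 0

good = r1cs_satisfied(A, B, C, [1, 1, 4, 4], P)  # x = 1, y = 4, out = 4
bad = r1cs_satisfied(A, B, C, [1, 3, 4, 1], P)   # x = 3 violates x * (1 - x) = 0
```

Each row triple (A_i, B_i, C_i) contributes one degree ≤ 2 constraint, matching the per-constraint view in the text.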
Encoding. The efficiency of the ZKP scales quasi-linearly with n. Thus, it's useful to encode ϕ as an R1CS with a minimal number of constraints. Since equisatisfiability (not logical equivalence) is needed, encodings may introduce new variables.
As an example, consider the Boolean computation a ← c_1 ∨ ··· ∨ c_k. Assume that c′_1, ..., c′_k ∈ F are elements in z that are 0 or 1 such that c_i ↔ (c′_i = 1). How can one ensure that a′ ∈ F (also in z) is 0 or 1 and a ↔ (a′ = 1)? Given that there are k − 1 ORs, natural approaches use Θ(k) constraints. One clever approach is to introduce variable x′ and enforce constraints x′ (Σ_i c′_i) = a′ and (1 − a′)(Σ_i c′_i) = 0. If any c_i is true, a′ must be 1 to satisfy the second constraint; setting x′ to the sum's inverse satisfies the first. If all c_i are false, the first constraint ensures a′ is 0. This encoding is correct when the sum does not overflow; thus, k must be smaller than F's characteristic.
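The OR encoding can be checked exhaustively over a tiny field. This sketch assumes the two constraints read x′ · s = a′ and (1 − a′) · s = 0 with s the sum of the c′_i, which is our reading of the description above; the function names and the toy prime 7 are ours.

```python
from itertools import product

def or_encoding_outputs(c_bits, p):
    """All a' in F_p for which some x' satisfies both constraints."""
    s = sum(c_bits) % p
    return sorted({
        a for a in range(p) for x in range(p)
        if (x * s - a) % p == 0 and ((1 - a) * s) % p == 0
    })

def encoding_is_correct(k, p):
    """Does the encoding force a' = OR(c_1..c_k) for every Boolean input?"""
    return all(
        or_encoding_outputs(bits, p) == [int(any(bits))]
        for bits in product((0, 1), repeat=k)
    )

ok_small = encoding_is_correct(3, 7)     # k < characteristic
ok_edge = encoding_is_correct(6, 7)      # largest k below the characteristic
overflow = encoding_is_correct(7, 7)     # the sum of seven 1s wraps to 0 in F_7
```

The last case shows the overflow caveat: when k equals the characteristic, the all-true input sums to 0 and the constraints wrongly force a′ = 0.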
Optimizations like this can be quite complex. Thus, ZKP programmers use constraint synthesis libraries [14,71] or compilers [13,26,82,83,85,93,103] to generate an R1CS from a high-level description. Such tools support objects like Booleans, fixed-width integers, arrays, and user-defined data-types. The correctness of these tools is critical to the correctness of any system built with them.

SMT
We assume usual terminology for many-sorted first order logic with equality ([40] gives a complete presentation). Let Σ be a many-sorted signature including a sort Bool and symbol family ≈_σ (abbreviated ≈) with sort σ × σ → Bool for all σ in Σ. A theory is a pair T = (Σ, I), where Σ is a signature and I is a class of Σ-interpretations. A Σ-formula ϕ is satisfiable (resp., unsatisfiable) in T if it is satisfied by some (resp., no) interpretation in I. Given a (set of) formula(s) S, we write S |=_T ϕ if every interpretation M ∈ I that satisfies S also satisfies ϕ.
Fig. 1: Signature of the theory of F_q
When using the CDCL(T) framework for SMT, the reasoning engine for each theory is encapsulated inside a theory solver. Here, we mention the fragment of CDCL(T) that is relevant for our purposes ([80] gives a complete presentation).
The goal of CDCL(T) is to check a formula ϕ for satisfiability. A core module manages a propositional search over the propositional abstraction of ϕ and communicates with the theory solver. As the core constructs partial propositional assignments for the abstract formula, the theory solver is given the literals that correspond to the current propositional assignment. When the propositional assignment is completed (or, optionally, before), the theory solver must determine whether its literals are jointly satisfiable. If so, it must be able to provide an interpretation in I (which includes an assignment to theory variables) that satisfies them. If not, it may indicate a strict subset of the literals which are unsatisfiable: an unsatisfiable core. Smaller unsatisfiable cores usually accelerate the propositional search.

The Theory of Finite Fields
We define the theory T_{F_q} of the finite field F_q, for any order q. Its sort and symbols are indexed by the parameter q; we omit q when clear from context.
The signature of the theory is given in Figure 1. It includes sort F, which intuitively denotes the sort of elements of F_q and is represented in our proposed SMT-LIB format as (_ FiniteField q). There is a constant symbol for each element of F_q, and function symbols for addition and multiplication. Other finite field operations (e.g., negation, subtraction, and inverses) naturally reduce to this signature.
An interpretation M of T_{F_q} must interpret:
- F as F_q,
- n ∈ {0, ..., q − 1} as the n-th element of F_q in lexicographical order,
- + as addition in F_q,
- × as multiplication in F_q, and
- ≈ as equality in F_q.
Fig. 2: The decision procedure for F_q.
1 Function DecisionProcedure:
    Input: A set of F-literals L in variables X
    Output: UNSAT and a core C ⊆ L, or
    Output: SAT and a model M : X → F
2   P ← empty set; W_i ← fresh, ∀i;
Note that in order to avoid ambiguity, we require that the sort of any constant ffn must be ascribed. For instance, the n-th element of F_q would be (as ffn (_ FiniteField q)). The sorts of non-nullary function symbols need not be ascribed: they can be inferred from their arguments.

Decision Procedure
Recall (§2.4) that a CDCL(T) theory solver for F must decide the satisfiability of a set of F-literals. At a high level, our decision procedure comprises three steps. First, we reduce to a problem concerning a single algebraic variety. Second, we use a GB-based test for unsatisfiability that is fast and sound, but incomplete. Third, we attempt model construction. Figure 2 shows pseudocode for the decision procedure; we will explain it incrementally.

Algebraic Reduction
Let L = {ℓ_1, ..., ℓ_|L|} be a set of literals. Each F-literal has the form ℓ_i = s_i ▷◁ t_i, where s_i and t_i are F-terms and ▷◁ ∈ {≈, ≉}. Let X = {X_1, ..., X_k} denote the free variables in L. Let E, D ⊆ {1, ..., |L|} be the sets of indices corresponding to equalities and disequalities in L, respectively. Let [[t]] ∈ F[X] denote the natural interpretation of an F-term t as a polynomial (Fig. 3). Let P_E ⊂ F[X] be the set of interpretations of the equalities; i.e., P_E = {[[s_i]] − [[t_i]] : i ∈ E}. To simplify, we reduce disequalities to equalities using a classic technique [89]: we introduce a fresh variable W_i for each i ∈ D and define P′_D as P′_D = {W_i ([[s_i]] − [[t_i]]) − 1 : i ∈ D}. Note that each p ∈ P′_D has zeros for exactly the values of X where its analog among the disequalities, [[s_i]] − [[t_i]], is non-zero.
Fig. 3: Interpreting F-terms as polynomials
We define P to be P_E ∪ P′_D (constructed in lines 2 to 6, Fig. 2) and note three useful properties of P. First, L is satisfiable if and only if V(⟨P⟩) is non-empty. Second, for any P′ ⊂ P, if V(⟨P′⟩) = ∅, then {π(p) : p ∈ P′} is an unsatisfiable core, where π maps a polynomial to the literal it is derived from. Third, from any x ∈ V(⟨P⟩) one can immediately construct a model. Thus, our theory solver reduces to understanding properties of the variety V(⟨P⟩).
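The reduction of a disequality to an equality with a fresh variable rests on the fact that a field element is non-zero exactly when it has a multiplicative inverse. A brute-force sketch over a toy prime field (function names and the prime 13 are our own choices) checks that s ≉ t holds exactly when W · (s − t) − 1 = 0 has a solution for W.

```python
def diseq_holds(s, t, p):
    """The original literal: s and t differ in F_p."""
    return (s - t) % p != 0

def witness(s, t, p):
    """A value W with W * (s - t) - 1 = 0 in F_p, or None if none exists."""
    for w in range(p):
        if (w * (s - t) - 1) % p == 0:
            return w
    return None

P = 13
# The disequality holds exactly when the equality with a fresh W is solvable.
agree = all(
    diseq_holds(s, t, P) == (witness(s, t, P) is not None)
    for s in range(P) for t in range(P)
)
```

When a witness exists it is the inverse of s − t, so it is unique; this is why the extra variables W_i do not create spurious models.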

Incomplete Unsatisfiability and Cores
Recall (§2.2) that if 1 ∈ ⟨P⟩, then V(⟨P⟩) is empty. We can answer this ideal membership query using a Gröbner basis engine (line 7, Fig. 2). Let GB be a subroutine that takes a list of polynomials and computes a Gröbner basis for the ideal that they generate, according to some monomial ordering. We use grevlex: the ordering for which GB engines are typically most efficient [44]. We compute GB(P) and check whether 1 ⇒_{GB(P)} 0. If so, we report that V(⟨P⟩) is empty. If not, recall (§2.2) that V(⟨P⟩) may still be empty; we proceed to attempt model construction (lines 9 to 11, Fig. 2, described in the next subsection).
If 1 does reduce by the Gröbner basis, then identifying a subset of P which is sufficient to reduce 1 yields an unsatisfiable core. To construct such a subset, we formalize the inferences performed by the Gröbner basis engine as a calculus for proving ideal membership.
Figure 4 presents IdealCalc: our ideal membership calculus. IdealCalc proves facts of the form p ∈ ⟨P⟩, where p is a polynomial and P is the set of generators for an ideal. The G rule states that the generators are in the ideal. The Z rule states that 0 is in the ideal. The S rule states that for any two polynomials in the ideal, their s-polynomial is in the ideal too. The R↑ and R↓ rules state that if p →_q r with q in the ideal, then p is in the ideal if and only if r is.
The soundness of IdealCalc follows immediately from the definition of an ideal. Completeness relies on the existence of algorithms for computing Gröbner bases using only s-polynomials and reduction [21,43,45]. We prove both properties in Appendix A.
Fig. 4: IdealCalc: a calculus for ideal membership.
Fig. 5: Finding common zeros for a Gröbner basis (function FindZero). After handling trivial cases, FindZero uses ApplyRule to apply the first applicable rule from Figure 6.
By instrumenting a Gröbner basis engine and reduction engine, one can construct IdealCalc proof trees. Then, for a conclusion 1 ∈ ⟨P⟩, traversing the proof tree to its leaves gives a subset P′ ⊆ P such that 1 ∈ ⟨P′⟩. The procedure CoreFromTree (called in line 8, Fig. 2) performs this traversal, by accessing a proof tree recorded by the GB procedure and the reductions. The proof of Theorem 2 explains our instrumentation in more detail (Appendix A).
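The traversal itself is a plain graph search from the conclusion to the generators it depends on. The sketch below is hypothetical: the dictionary-based proof record, the names `premises` and `literal_of`, and the toy example are our assumptions and do not reflect the data structures of cvc5 or CoCoALib.

```python
def core_from_tree(conclusion, premises, literal_of):
    """Collect the literals of all generators reachable from `conclusion`.

    premises:   maps each derived polynomial (by name) to the names of the
                polynomials used to derive it; generators have no entry.
    literal_of: maps each generator name to the literal it came from.
    """
    core, seen, stack = set(), set(), [conclusion]
    while stack:
        node = stack.pop()
        if node in seen:
            continue
        seen.add(node)
        if node in premises:          # derived by an S or R step: visit premises
            stack.extend(premises[node])
        else:                         # a leaf: one of the original generators
            core.add(literal_of[node])
    return core

# Toy record: g1 = X - Y, g2 = Y - 1, g3 = X - 2, g4 = Z^2 - Z.
# r = g1 - g3 = 2 - Y, and r + g2 = 1, so 1 is in the ideal without using g4.
literal_of = {"g1": "x = y", "g2": "y = 1", "g3": "x = 2", "g4": "z*z = z"}
premises = {"r": ["g1", "g3"], "1": ["r", "g2"]}
core = core_from_tree("1", premises, literal_of)
```

In this toy record, the literal behind g4 is never reached, so it is left out of the core.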

Completeness through Model Construction
As discussed, we still need a complete decision procedure for determining if V(⟨P⟩) is empty. We call this procedure FindZero; it is a backtracking search for an element of V(⟨P⟩). It also serves as our model construction procedure.
Figure 5 presents FindZero as a recursive search. It maintains two data structures: a Gröbner basis B and partial map M : X′ → F from variables to field elements. By applying a branching rule (which we will discuss in the next paragraph), FindZero obtains a disjunction of single-variable assignments X′_i → z, which it branches on. FindZero branches on an assignment X′_i → z by adding it to M and updating B to GB(B ∪ {X′_i − z}).
Figure 6 shows the branching rules of FindZero. Each rule comprises antecedents (conditions that must be met for the rule to apply) and a conclusion (a disjunction of single-variable assignments to branch on). The Univariate rule applies when B contains a polynomial p that is univariate in some variable X′_i that M does not have a value for. The rule branches on the univariate roots of p. The Triangular rule comes from work on triangular decomposition [70]. It applies when B is zero-dimensional. It computes a univariate minimal polynomial p(X′_i) in some unassigned variable X′_i, and branches on the univariate roots of p. The final rule Exhaust has no conditions and simply branches on all possible values for all unassigned variables.
Fig. 6: Branching rules for FindZero.
Theorem 3 (FindZero Correctness). If V(⟨B⟩) = ∅ then FindZero returns ⊥; otherwise, it returns a member of V(⟨B⟩). (Proof: Appendix B)
Correctness and Efficiency. The branching rules achieve a careful balance between correctness and efficiency. The Exhaust rule is always applicable, but a full exhaustive search over a large field is unreasonable (recall: ZKPs operate over ≈255-bit fields). The Triangular and Univariate rules are important alternatives to exhaustion. They create a far smaller set of branches, but apply only when the variety has dimension zero or the basis has a univariate polynomial.
As an example of the importance of Univariate, consider the univariate system X^2 = 2, in a field where 2 is not a perfect square (e.g., F_5). X^2 − 2 is already a (reduced) Gröbner basis, and it does not contain 1, so FindZero applies. With the Univariate rule, FindZero computes the univariate zeros of X^2 − 2 (there are none) and exits. Without it, the Exhaust rule creates |F| branches.
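Whether 2 is a perfect square depends on the field, which is easy to check for small primes. The sketch below (helper names ours) finds the zeros of X^2 − 2 by enumeration and cross-checks with Euler's criterion; it only illustrates the branch counts and is not how a solver would find roots in a 255-bit field.

```python
def square_roots(a, p):
    """All x in F_p with x * x = a, found by enumeration (tiny p only)."""
    return [x for x in range(p) if (x * x - a) % p == 0]

def is_square(a, p):
    """Euler's criterion, valid for an odd prime p and a not divisible by p."""
    return pow(a, (p - 1) // 2, p) == 1

roots_f5 = square_roots(2, 5)   # no zeros: Univariate yields zero branches
roots_f7 = square_roots(2, 7)   # two zeros: 2 is a square in F_7

# Branches explored per rule for X^2 - 2 over F_5.
univariate_branches = len(roots_f5)
exhaust_branches = 5
```

Over F_5 the Univariate rule closes the search immediately, whereas over F_7 the same polynomial has two zeros and the system is satisfiable.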
As an example of when Triangular is critical, consider a system from [70]. The system is unsatisfiable, it has dimension 0, and its ideal does not contain 1. Moreover, our solver computes a (reduced) Gröbner basis for it that does not contain any univariate polynomials. Thus, Univariate does not apply. However, Triangular does, and with it, FindZero quickly terminates. Without Triangular, Exhaust would create at least |F| branches.
In the above examples, Exhaust performs very poorly. However, that is not always the case. For example, in the system X_1 + X_2 = 0, using Exhaust to guess X_1, and then using the Univariate rule to determine X_2, is quite reasonable. In general, Exhaust is a powerful tool for solving underconstrained systems. Our experiments will show that despite including Exhaust, our procedure performs quite well on our benchmarks. We reflect on its performance in Section 8.
Field polynomials: a road not taken. By guaranteeing completeness through (potential) exhaustion, we depart from prior work. Typically, one ensures completeness by including field polynomials in the ideal (§2.2). Indeed, this is the approach suggested [98] and taken [57] by prior work. However, field polynomials induce enormous overhead in the Gröbner basis engine because their degree is so large. The result is a procedure that is only efficient for tiny fields [57]. In our experiments, we compare our system's performance to what it would be if it used field polynomials. The results confirm that deferring completeness to FindZero is far superior for our benchmarks.

Implementation
We have implemented our decision procedure for prime fields in the cvc5 SMT solver [7] as a theory solver. It is exposed through cvc5's SMT-LIB, C++, Java, and Python interfaces. Our implementation comprises ≈2k lines of C++. For the algebraic sub-routines of our decision procedure (§4), it uses CoCoALib [1]. To compute unsatisfiable cores (§4.2), we inserted hooks into CoCoALib's Gröbner basis engine (17 lines of C++).
Our theory solver makes sparse use of the interface between it and the rest of the SMT solver. It acts only once a full propositional assignment has been constructed. It then runs the decision procedure, reporting either satisfiability (with a model) or unsatisfiability (with an unsatisfiable core).

Benchmark Generation
Recall that one motivation for this work is to enable translation validation for compilers to field constraint systems (R1CSs) used in zero-knowledge proofs (ZKPs). Our benchmarks are SMT formulas that encode translation validation queries for compilers from Boolean computations to R1CS. At a high level, each benchmark is generated as follows.
Through step 3, we construct SMT queries that are satisfiable, unsatisfiable, and of unknown status. Through step 5, we construct queries solvable using bit-vector reasoning, integer reasoning, or a stand-alone computer algebra system.

Examples
We describe our benchmark generator in full and give the definitions of soundness and determinism in Appendix C. Here, we give three example benchmarks. Our examples are based on a Boolean formula Ψ. Our convention is to mark field variables with a prime, but not Boolean variables. Using the technique from Section 2.3, CirC compiles this formula to a two-constraint system. Each Boolean input x_i corresponds to field element x′_i, and r′ corresponds to the result of Ψ.
Soundness. An R1CS is sound if it ensures the output r′ corresponds to the value of Ψ (when given valid inputs). Concretely, our system is sound if the corresponding soundness formula, stated over Ψ and s′ as defined above, is valid. This is an UNSAT benchmark, because the formula is valid.
Determinism. An R1CS is deterministic if the values of the inputs uniquely determine the value of the output. To represent this in a formula, we use two copies of the constraint system: one with primed variables, and one with double-primed variables. Our example is deterministic if the corresponding formula over both copies is valid.
Unsoundness. Removing constraints from the system can give a formula that is not valid (a SAT benchmark). For example, if we remove (1 − r′)s′ = 0, then the soundness formula can be falsified.

Experiments
Our experiments show that our approach: 1. scales well with the size of F (unlike a BV-based approach), 2. would scale poorly with the size of F if field polynomials were used, 3. benefits from unsatisfiable cores, and 4. substantially outperforms all reasonable alternatives.
Our test bed is a cluster with Intel Xeon E5-2637 v4 CPUs. Each run is limited to one physical core, 8GB memory, and 300s.
Throughout, we generate benchmarks for two correctness properties (soundness and determinism), three different ZKP compilers, and three different statuses (sat, unsat, and unknown).We vary the field size, encoding, number of inputs, and number of terms, depending on the experiment.We evaluate our cvc5 extension, Bitwuzla (commit 27f6291), and z3 (version 4.11.2).

Comparison with Bit-Vectors
Since bit-vector solvers scale poorly with bit-width, one would expect the effectiveness of a BV encoding of our properties to degrade as the field size grows. To validate this, we generate BV-encoded benchmarks for varying bit-widths and evaluate state-of-the-art bit-vector solvers on them. Though our applications of interest use b = 255, we will see that the BV-based approach does not scale to fields this large. Thus, for this set of experiments we use b ∈ {5, 10, ..., 60}, and we sample formulas with 4 inputs and 8 intermediate terms.
Figure 7a shows performance of three bit-vector solvers (cvc5 [7], Bitwuzla [78], and z3 [75]) and our F solver as a cactus plot; Table 1 splits the solved instances by property and status. We see that even for these small bit-widths, the field-based approach is already superior. The bit-vector solvers are more competitive on the soundness benchmarks, since these benchmarks include only half as many field operations as the determinism benchmarks. For our benchmarks, Bitwuzla is the most efficient BV solver. We further examine the time that it and our solver take to solve the 9 benchmarks they can both solve at all bit-widths. Figure 7b plots the total solve time against b. While the field-based solver's runtime is nearly independent of field size, the bit-vector solvers slow down substantially as the field grows.
In sum, the BV approach scales poorly with field size and is already inferior on fields of size at least 2^40.

The Cost of Field Polynomials
Recall that our decision procedure does not use field polynomials (§4.3), but our implementation optionally includes them (§5). In this experiment, we measure the cost they incur. We use propositional formulas in 2 variables with 4 terms, and we take b ∈ {4, ..., 12}, and include SAT and unknown benchmarks. Figure 8a compares the performance of our tool with and without field polynomials. For many benchmarks, field polynomials cause a slowdown greater than 100×. To better show the effect of the field size, we consider the solve time for the SAT benchmarks, at varying values of b. Figure 8b shows how solve times change as b grows: using field polynomials causes exponential growth. For UNSAT benchmarks, both configurations complete within 1s. This is because (for these benchmarks) the GB is just {1} and CoCoA's GB engine is good at discovering that (and exiting) without considering the field polynomials.
This growth is predictable.GB engines can take time exponential (or worse) in the degree of their inputs.A simple example illustrates this fact: consider computing a Gröbner basis with X 2 b − X and X 2 − X.The former reduces to 0 modulo the latter, but the reduction takes 2 b − 1 steps.
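The step count in this example can be checked with a small sketch. The helper below is hypothetical (it is not CoCoA's reduction routine) and uses integer coefficients, which is enough here because every coefficient involved is 0, 1, or −1. It repeatedly cancels the leading term of $X^{2^b} - X$ against $X^2 - X$ and counts the steps.

```python
def reduction_steps(b: int) -> int:
    """Count the steps needed to reduce X^(2^b) - X modulo X^2 - X.

    A polynomial is a dict {exponent: coefficient}. Each step cancels the
    leading term c*X^d (d >= 2) by subtracting c*X^(d-2) * (X^2 - X),
    which replaces c*X^d with c*X^(d-1).
    """
    f = {2 ** b: 1, 1: -1}  # X^(2^b) - X
    steps = 0
    while f and max(f) >= 2:
        d = max(f)
        c = f.pop(d)
        f[d - 1] = f.get(d - 1, 0) + c
        if f[d - 1] == 0:
            del f[d - 1]
        steps += 1
    assert not f  # the remainder is the zero polynomial
    return steps
```

Each step lowers the degree by exactly one, so the count grows as $2^b - 1$: the field polynomial alone forces work exponential in the bit-width b.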

The Benefit of UNSAT Cores
Section 4.2 describes how we compute unsatisfiable (UNSAT) cores in the F solver by instrumenting our Gröbner basis engine. In this experiment, we measure the benefit of doing so. We generate Boolean formulas with 2, 4, 6, 8, 10, and 12 variables; and $2^0, 2^1, 2^2, 2^3, 2^4, 2^5, 2^6$, and $2^7$ intermediate terms, for a 255-bit field. We vary the number of intermediate terms widely in order to generate benchmarks of widely variable difficulty. We configure our solver with and without GB instrumentation.
Figure 9a shows the results. For many soundness benchmarks, the cores cause a speedup of more than 10×. As expected, only the soundness benchmarks benefit. Soundness benchmarks have non-trivial Boolean structure, so the SMT core makes many queries to the theory solver. Returning good UNSAT cores shrinks the propositional search space, reduces the number of theory queries, and thus reduces solve time. However, determinism benchmarks are just a conjunction of theory literals, so the SMT core makes only one theory query. For them, returning a good UNSAT core has no benefit, but it also induces little overhead.

Comparison to Pure Computer Algebra
In this experiment, we compare our SMT-based approach (which integrates computer-algebra techniques into SMT) against a stand-alone use of computer algebra. We encode the Boolean structure of our formulas in $F_p$ (see Appendix C.4). When run on such an encoding, our SMT solver makes just one query to its field solver, so it cannot benefit from the search optimizations present in CDCL(T). For this experiment, we use the same benchmark set as the last.
Figure 9b compares the pure F approach with our SMT-based approach. For benchmarks that encode soundness properties, the SMT-based approach is clearly dominant. The intuition here is that computer algebra systems are not optimized for Boolean reasoning. If a problem has non-trivial Boolean structure, a cooperative approach like SMT has clear advantages. SMT's advantage is less pronounced for determinism benchmarks, as these manifest as a single query to the finite field solver; still, in this case, our encoding seems to have some benefit much of the time.

Main Experiment
In our main experiment, we compare our approach against all reasonable alternatives: a pure computer-algebra approach (§7.4), a BV approach with Bitwuzla (the best BV solver on our benchmarks, §7.1), an NIA approach with cvc5 and z3, and our own tool without UNSAT cores (§7.3). We use the same benchmark set as the last experiment; this uses a 255-bit field.
Figure 10 shows the results as a cactus plot. Table 2 shows the number of solved instances for each system, split by property and status. Bitwuzla quickly runs out of memory on most of the benchmarks. A pure computer-algebra approach outperforms Bitwuzla and cvc5's NIA solver. The NIA solver of z3 does a bit better, but our field-aware SMT solver is the best by far. Moreover, its best configuration uses UNSAT cores. Comparing the total solve time of ff-cvc5 and nia-z3 on commonly solved benchmarks, we find that ff-cvc5 reduces total solve time by 6×. In sum, the techniques we describe in this paper yield a tool that substantially outperforms all alternatives on our benchmarks.

Discussion and Future Work
We've presented a basic study of the potential of an SMT theory solver for finite fields based on computer algebra. Our experiments have focused on translation validation for ZKP compilers, as applied to Boolean input computations. The solver shows promise, but much work remains.
As discussed (Sec. 5), our implementation makes limited use of the interface exposed to a theory solver for CDCL(T). It does no work until a full propositional assignment is available. It also submits no lemmas to the core solver. Exploring which lightweight reasoning should be performed during propositional search and what kinds of lemmas are useful is a promising direction for future work.
Our model construction (Sec. 4.3) is another weakness. Without univariate polynomials or a zero-dimensional ideal, it falls back to exhaustive search. If a solution over an extension field is acceptable, then there are $\Theta(|F|^d)$ solutions, so an exhaustive search seems likely to quickly succeed. Of course, we need a solution in the base field. If the base field is closed, then every solution is in the base field. Our fields are finite (and thus, not closed), but for our benchmarks, they seem to bear some empirical resemblance to closed fields (e.g., the GB-based test for an empty variety never fails, even though it is theoretically incomplete). For this reason, exhaustive search may not be completely unreasonable for our benchmarks. Indeed, our experiments show that our procedure is effective on our benchmarks, including for SAT instances. However, the worst-case performance of this kind of model construction is clearly abysmal. We think that a more intelligent search procedure and better use of ideas from computer algebra [6,69] would both yield improvement.
Theory combination is also a promising direction for future work. The benchmarks we present here are in the QF_FF logic: they involve only Booleans and finite fields. Reasoning about different fields in combination with one another would have natural applications to the representation of elliptic curve operations inside ZKPs. Reasoning about datatypes, arrays, and bit-vectors in combination with fields would also have natural applications to the verification of ZKP compilers.

      f ← f − (t / lm(g)) · g
    return f
  fn buchberger(P′):
    Q ← unordered pairs from P′
    while Q not empty:
      s ← reduce(spoly(p, q), P′)
      if s = 0: continue
      for g ∈ P′: add (s, g) to Q
      add s to P′
    return P′

Fig. 11: The inIdeal(f, P) algorithm for testing whether f ∈ ⟨P⟩. We instrument it to build IdealCalc proof trees. Then, its correctness as a test for ideal membership implies the completeness of IdealCalc.
By augmenting inIdeal, we prove that the IdealCalc calculus is complete. We augment inIdeal to produce a proof tree that concludes f ∈ ⟨P⟩ when inIdeal returns true. We introduce a global map M from polynomials to proofs that they are in ⟨P⟩. Each entry of M contains the key polynomial p, a rule kind k (e.g., G, Z, ...), and a finite sequence of antecedent polynomials $p_1, \ldots, p_k$. We denote the entry $p \to (k, p_1, \ldots, p_k)$. The entry represents the inference that if each $p_i$ ∈ ⟨P⟩, then p ∈ ⟨P⟩, by k. An entry is valid if M contains a valid entry for all antecedent polynomials. If an entry for p is valid, then a simple recursion extracts a proof tree for p ∈ ⟨P⟩ from M.
First, we describe the modifications that ensure the Gröbner basis polynomials are provably in ⟨P⟩. At the beginning of inIdeal, for each g ∈ P, we add g → (G) to M. For each spoly(p, q) call (line 3), we add spoly(p, q) → (S, p, q). For each reduction step (line 2) called from buchberger (line 3), we add f′ → (R↓, f, g) to M, where f′ denotes the new value of the variable f. Each of these modifications adds entries whose antecedents already have valid entries in M. Thus, for every execution of buchberger's loop, and all g ∈ P′, M contains a valid entry that shows g is in ⟨P⟩. Now, we ensure that f is provably in ⟨P⟩. Before inIdeal calls reduce (line 1), we add 0 → (Z) to M. For each reduction step (line 2) called from inIdeal (line 1), we add f → (R↑, f′, g), where f′ denotes the new value of the variable f. If reduce returns 0, a backwards induction over the loop of reduce shows that every value f takes has a valid entry in M. Thus, the original value of f has a valid entry in M, and we can construct an IdealCalc proof that f ∈ ⟨P⟩ whenever inIdeal returns true.
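The "simple recursion" that extracts a proof tree from M can be sketched as follows. This is an illustrative reading of the construction, not the instrumentation inside the solver: polynomials are stood in for by string labels, and the rule names "G", "S", "Z", and "R_up" are hypothetical spellings of the rule kinds mentioned above.

```python
from typing import Dict, FrozenSet, Optional, Tuple

# M maps a polynomial (here: a string label) to (rule kind, antecedent labels).
Entry = Tuple[str, Tuple[str, ...]]


def extract_proof(M: Dict[str, Entry], p: str) -> Optional[tuple]:
    """Return a nested (polynomial, rule, subproofs) tree for p, or None
    if p has no valid entry (a missing or circular antecedent)."""

    def go(q: str, visiting: FrozenSet[str]) -> Optional[tuple]:
        if q not in M or q in visiting:
            return None
        rule, antecedents = M[q]
        subproofs = []
        for a in antecedents:
            sub = go(a, visiting | {q})
            if sub is None:
                return None  # an antecedent is not provable, so q is invalid
            subproofs.append(sub)
        return (q, rule, subproofs)

    return go(p, frozenset())


# Toy map: g1 and g2 are generators, s is their s-polynomial, and f reduces
# to 0 in one step using g1 (so f is justified from 0 and g1).
M_EXAMPLE: Dict[str, Entry] = {
    "g1": ("G", ()),
    "g2": ("G", ()),
    "s": ("S", ("g1", "g2")),
    "0": ("Z", ()),
    "f": ("R_up", ("0", "g1")),
}
```

Calling `extract_proof(M_EXAMPLE, "f")` yields a tree whose leaves are the Z and G entries, mirroring the validity condition: an entry is usable only if all of its antecedents are.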

B Proof of Correctness for FindZero
We prove that FindZero is correct (Theorem 3).
Proof. It suffices to show that for each branching rule that results in $\bigvee_j (X_{i_j} - r_j)$, we have $V(\langle B \rangle) \subseteq \bigcup_j V(\langle B \cup \{X_{i_j} - r_j\} \rangle)$. First, consider an application of Univariate with univariate $p(X_i)$. Fix z ∈ V(⟨B⟩). The $X_i$-coordinate of z is a zero of p, so for some j it equals $r_j$, and z ∈ $V(\langle B \cup \{X_i - r_j\} \rangle)$.
Next, consider an application of Triangular to variable $X_i$ with minimal polynomial $p(X_i)$. By the definition of minimal polynomial, any zero z of ⟨B⟩ has a value for $X_i$ that is a root of p. Let that root be r. Then, z ∈ $V(\langle B \cup \{X_i - r\} \rangle)$.
Finally, consider an application of Exhaust.The desired property is immediate.

C Benchmark Generation
Recall that one motivation for this work is to enable the verification of field constraint systems used in zero-knowledge proofs (ZKPs). Recall (§2.3) that ZKPs consume rank-1 constraint systems (R1CSs). Thus, we craft benchmarks which test the correctness of R1CSs produced by ZKP compilers. In this work, we only consider the behavior of ZKP compilers on Boolean computations.
At a high level, our benchmark generator samples a random propositional formula, compiles it to an R1CS C using a compiler, and then builds an SMT formula that tests a correctness property of C. We implement our generator in ≈1.1k lines of Rust, building on the CirC compiler infrastructure's intermediate representation and SMT backend [83]. Our implementation is public with an open-source license.

C.1 Correctness Properties
We obtain a constraint system C by inputting a propositional formula $\Psi(x_1, \ldots, x_m)$ to an R1CS compiler. The compiler outputs:
- C: the constraint system itself
- a map from each $x_i$ to a variable $X_i$ in C
- Y: a variable in C that represents the value of Ψ.
We construct formulas that test two correctness properties of this output: soundness and determinism.
Soundness. An R1CS encoding C of a propositional formula Ψ is sound if all solutions of C correspond to valid solutions of Ψ. More precisely, the encoding is sound if the following holds:

Determinism. A constraint system C is deterministic if the output is uniquely determined by the inputs. More precisely, let C′ be a copy of C with primed variables. Then C is deterministic if the following holds:

Soundness is important because it is a kind of functional correctness property for C: it relates C to its claimed specification Ψ. Determinism is weaker: if C is non-deterministic, it cannot be sound for any specification. Determinism is interesting because it can be tested without the specification Ψ.
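Read as formulas, the two properties might be written as in the sketch below. This is one plausible formalization derived from the prose definitions, not necessarily the exact statement used by the generator; in particular, the way Boolean inputs $x_i$ are tied to field variables $X_i$ (here, $X_i = 1$ exactly when $x_i$ is true) and the implicit universal quantification over all variables of C and C′ are assumptions.

```latex
% Soundness (sketch): any solution of C that agrees with the Boolean inputs
% forces Y to encode the value of \Psi.
\[
  \Big( C \;\wedge\; \bigwedge_{i=1}^{m} \big( X_i = 1 \leftrightarrow x_i \big) \Big)
  \;\rightarrow\;
  \big( Y = 1 \leftrightarrow \Psi(x_1, \ldots, x_m) \big)
\]

% Determinism (sketch): two solutions with equal inputs have equal outputs.
\[
  \Big( C \;\wedge\; C' \;\wedge\; \bigwedge_{i=1}^{m} X_i = X_i' \Big)
  \;\rightarrow\;
  Y = Y'
\]
```

Under this reading, the SMT query for each property is the negation of the corresponding implication, so an unsatisfiable query means the property holds.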

C.2 Formula Distribution
We sample from a distribution of propositional formulas parameterized by:
- v: the number of input variables
- t: the number of intermediate terms
- p: a parameter for a geometric distribution
- O: a set of fixed-arity and variadic Boolean operators

Our sampler constructs a propositional formula Ψ in variables $x_1, \ldots, x_v$. In each step i, it maintains a set $T_i$ of intermediate terms. $T_0$ is empty, and each $T_{i+1}$ is obtained by adding a single term to $T_i$. For steps i ∈ [1, v], the added term is $x_i$; thus, $T_v = \{x_1, \ldots, x_v\}$. For steps i ∈ [v + 1, v + t − 1], we sample a uniformly random operator o ∈ O, independently sample terms $s_1, \ldots, s_k$ from $T_i$ (as described next), and add $o(s_1, \ldots, s_k)$. If o has fixed arity, then k is that arity; otherwise, k = 2 + g, where g is drawn from the geometric distribution parameterized by p. We sample random elements from $T_i = \{t_1, \ldots, t_i\}$ according to a discrete, weighted distribution where element $t_i$ has weight $i^2$. The $t_i$ are ordered first by the number of intermediate terms they're already children of and second by the number of the step in which they were added; thus, the least-used term that is oldest is most likely to be selected. Finally, in step v + t, all the elements of $T_{v+t-1}$ that have not been used already are combined using a uniformly random variadic operator from O.
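The sampling procedure can be sketched in a few lines. The sketch below is not the Rust generator; it makes several assumptions that the text leaves open: the operator set `OPERATORS` is hypothetical, g counts failures before the first success (so g ≥ 0), terms are sampled with replacement, and terms are ordered from most-used and newest to least-used and oldest so that the weight $i^2$ favors the least-used, oldest term.

```python
import random

# Hypothetical operator set O: name -> fixed arity, or None for variadic.
OPERATORS = {"not": 1, "implies": 2, "and": None, "or": None, "xor": None}


def sample_formula(v, t, p, rng, ops=OPERATORS):
    """Sample a formula as nested tuples (op, child, ...); inputs are strings."""
    terms = [f"x{i}" for i in range(1, v + 1)]  # T_v = {x_1, ..., x_v}
    uses = [0] * v                               # times each term is a child
    variadic = sorted(o for o, arity in ops.items() if arity is None)

    def pick(k):
        # Assumed ordering: most-used/newest first, so the least-used, oldest
        # term is last and receives the largest weight i^2.
        order = sorted(range(len(terms)), key=lambda j: (uses[j], j), reverse=True)
        weights = [(i + 1) ** 2 for i in range(len(order))]
        return rng.choices(order, weights=weights, k=k)

    for _ in range(t - 1):                       # steps v+1 .. v+t-1
        op = rng.choice(sorted(ops))
        k = ops[op]
        if k is None:
            g = 0
            while rng.random() >= p:             # geometric: failures before success
                g += 1
            k = 2 + g
        children = pick(k)
        for j in children:
            uses[j] += 1
        terms.append((op, *[terms[j] for j in children]))
        uses.append(0)

    # Step v+t: combine every term that is not yet a child of another term.
    unused = [terms[j] for j in range(len(terms)) if uses[j] == 0]
    return (rng.choice(variadic), *unused)
```

For example, `sample_formula(4, 8, 0.5, random.Random(0))` mirrors the small-field setting of 4 inputs and 8 intermediate terms. Because every term is either used by a later term or collected in the final step, every input variable occurs in the result.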

C.3 Compilers
We consider three compilers. First is the ZoKrates reference compiler [38,104], which compiles from an eponymous language to R1CS. It supports a wide variety of types, including Booleans. To interact with this compiler, we encode our propositional formula as a ZoKrates program. Second is CirC [83]: a compiler infrastructure for circuits that can produce R1CS. For this compiler, we encode our propositional formula directly using CirC's IR. Third is the CirC-based ZoKrates compiler, ZoKCirC [83]. For this compiler, we once again encode our propositional formula as a ZoKrates program.

C.4 Non-F encodings
A rank-1 constraint system C can be directly represented in the QF_FF logic. To our knowledge, our work introduces the first SMT solver which can handle such queries directly. However, prior to our work, one could handle such queries by representing field arithmetic using other SMT theories, or by mapping the Boolean structure into the finite field, and then using a computer algebra system.
Thus, for comparative purposes, we consider alternate encodings of our formulas based on: bit-vectors (BV), non-linear integer arithmetic (NIA), and pure field arithmetic (PureFF). In our BV encoding, we represent prime field elements as bit-vectors of width w (with $w = 2\lceil \log_2 p \rceil$) and compute the unsigned remainder modulo p after each operation. In our NIA encoding, we represent prime field elements as integers and compute the remainder modulo p after each operation. In the PureFF encoding, we map the Boolean structure of our SMT formula into $F_p$. We represent false as 0, true as 1, Boolean variables as field variables with the requirement that they are equal to 0 or 1, ∧ as multiplication, ¬x as 1 − x, and the rest of the propositional operators accordingly. This results in a formula which is just a conjunction of $F_p$-literals: it can be solved using a stand-alone decision procedure for $F_p$ (i.e., without an SMT solver).
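As a minimal sketch of the PureFF mapping, the functions below evaluate the Boolean operators arithmetically modulo a small prime. The encodings of ∧ and ¬ are the ones stated above; the encodings of ∨ and ⊕ are assumptions (the text only says the remaining operators are handled "accordingly"), and the prime 17 is purely illustrative. The helper `bv_width` computes the BV width $2\lceil \log_2 p \rceil$, which is wide enough to hold any product of two reduced field elements before the remainder is taken.

```python
P = 17  # small illustrative prime; the benchmarks use far larger fields


def f_not(x):
    return (1 - x) % P           # ¬x as 1 - x


def f_and(x, y):
    return (x * y) % P           # ∧ as multiplication


def f_or(x, y):
    return (x + y - x * y) % P   # assumed encoding of ∨


def f_xor(x, y):
    return (x + y - 2 * x * y) % P  # assumed encoding of ⊕


def is_bool(x):
    return (x * (x - 1)) % P == 0   # the constraint "x equals 0 or 1"


def bv_width(p):
    # (p - 1).bit_length() equals ceil(log2(p)) for every integer p >= 2.
    return 2 * (p - 1).bit_length()
```

On inputs restricted to {0, 1}, each function agrees with the corresponding Boolean operator, and over a prime field `is_bool` holds only for 0 and 1.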

C.5 SAT benchmarks
If the compilers we use are correct, soundness and determinism will hold, and our formulas will be unsatisfiable. To ensure that our benchmark set has some SAT formulas, we inject potential bugs in one of two ways. The first way is to remove the final constraint from C. All of our compilers use the final constraint to relate the output variable $X_o$ to the rest of the constraint system, so omitting it yields a non-deterministic and unsound system. The second way is to remove a random constraint from C; this is not guaranteed to compromise the correctness of the constraint system. In sum, dropping no constraints, the last constraint, or a random constraint produces benchmarks that are respectively: unsatisfiable, satisfiable, and unknown.
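The effect of dropping the final constraint can be seen on a toy example. The sketch below hand-writes a three-constraint R1CS for an AND gate over a tiny field and checks determinism by exhaustive enumeration. Both the constraint system and the brute-force check are illustrative assumptions; the actual benchmarks are compiler-generated and are posed as SMT queries rather than enumerated.

```python
from itertools import product

P = 5  # tiny illustrative prime field

# Each rank-1 constraint is (a, b, c): three linear forms over z = (1, X1, X2, Y),
# meaning <a, z> * <b, z> = <c, z> (mod P).
AND_GATE = [
    ((0, 1, 0, 0), (-1, 1, 0, 0), (0, 0, 0, 0)),  # X1 * (X1 - 1) = 0
    ((0, 0, 1, 0), (-1, 0, 1, 0), (0, 0, 0, 0)),  # X2 * (X2 - 1) = 0
    ((0, 1, 0, 0), (0, 0, 1, 0), (0, 0, 0, 1)),   # X1 * X2 = Y (final constraint)
]


def dot(u, z):
    return sum(ui * zi for ui, zi in zip(u, z)) % P


def satisfies(constraints, z):
    return all((dot(a, z) * dot(b, z)) % P == dot(c, z) for a, b, c in constraints)


def is_deterministic(constraints):
    """True if every input pair (X1, X2) with a solution admits exactly one Y."""
    outputs = {}
    for x1, x2, y in product(range(P), repeat=3):
        if satisfies(constraints, (1, x1, x2, y)):
            outputs.setdefault((x1, x2), set()).add(y)
    return all(len(ys) == 1 for ys in outputs.values())
```

With all three constraints, `is_deterministic(AND_GATE)` holds; with the final constraint removed (`AND_GATE[:-1]`), Y is unconstrained and the check fails, which corresponds to a satisfiable determinism query.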

C.6 Field Size
The final parameter in benchmark generation is b: the number of bits in the field modulus p. Generally, our generator sets the modulus to be the least prime greater than $2^{b-1}$. There is one exception: for b = 255, it uses the scalar field modulus of the BLS12-381 elliptic curve. This specific field is used in industrial deployments of ZKPs [61].
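The modulus selection rule can be sketched as follows. The function names are hypothetical and the primality test is a generic Miller-Rabin routine, not the generator's own code. The BLS12-381 scalar field modulus is the standard published constant.

```python
# Scalar field modulus of BLS12-381 (a 255-bit prime).
BLS12_381_SCALAR_MODULUS = (
    0x73EDA753299D7D483339D80809A1D80553BDA402FFFE5BFEFFFFFFFF00000001
)

_BASES = (2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37)


def is_prime(n):
    """Miller-Rabin with fixed bases. This base set is deterministic for
    n below about 3.3 * 10^24; beyond that it is only a probable-prime test."""
    if n < 2:
        return False
    for q in _BASES:
        if n % q == 0:
            return n == q
    d, s = n - 1, 0
    while d % 2 == 0:
        d //= 2
        s += 1
    for a in _BASES:
        x = pow(a, d, n)
        if x in (1, n - 1):
            continue
        for _ in range(s - 1):
            x = x * x % n
            if x == n - 1:
                break
        else:
            return False  # a witnesses that n is composite
    return True


def field_modulus(b):
    """Least prime greater than 2^(b-1), except for the b = 255 special case."""
    if b == 255:
        return BLS12_381_SCALAR_MODULUS
    n = 2 ** (b - 1) + 1
    while not is_prime(n):
        n += 1
    return n
```

For instance, `field_modulus(5)` is 17 and `field_modulus(10)` is 521; in both cases the result has exactly b bits.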
Fig. 7: The performance of field-based and BV-based approaches (with various BV solvers) when the field size ranges from 5 to 60 bits. (b) Total solve time for (field-based) cvc5 and (BV-based) Bitwuzla on commonly solved instances at all bit-widths.

Fig. 8: Solve times, with and without field polynomials. The field size varies from 4 to 12 bits. The benchmarks are all SAT or unknown. (b) Each series is one property at different numbers of bits.

Fig. 9: (a) Our SMT solver with and without UNSAT cores. (b) Our SMT solver compared with a pure computer algebra system.

Table 1: Solved small-field benchmarks by tool, property, and status.

Table 2: Solved benchmarks by tool, property, and status.