
1 Introduction

Zero-Knowledge Proofs (ZKPs) are powerful tools for building privacy-preserving systems. They allow one entity, the prover \(\mathcal {P}\), to convince another, the verifier \(\mathcal {V}\), that some secret data satisfies a public property, without revealing anything else about the data. ZKPs underlie a large (and growing!) set of critical applications, from billion-dollar private cryptocurrencies, like Zcash [24, 53] and Monero [2], to research into auditable sealed court orders [20], private gun registries [26], privacy-preserving middleboxes [23], and zero-knowledge proofs of exploitability [11]. This breadth of applications is possible because of the generality of ZKPs. In general, \(\mathcal {P}\) knows a secret witness w, whereas \(\mathcal {V}\) knows a property \(\phi \) and a public instance x. \(\mathcal {P}\) must show that \(\phi (x,w)= \top \). Typically, x and w are vectors of variables in a finite field \(\mathbb {F}\), and \(\phi \) can be any system of equations over the variables, using the operations \(+\) and \(\times \). Because \(\phi \) itself is an input to \(\mathcal {P}\) and \(\mathcal {V}\), and because of the expressivity of field equations, a single implementation of \(\mathcal {P}\) and \(\mathcal {V}\) can serve many different purposes.

Humans find it difficult to express themselves directly with field equations, so they use ZKP compilers. A ZKP compiler converts a high-level predicate \(\phi '\) into an equivalent system of field equations \(\phi \). In other words, a ZKP compiler generalizes a ZKP: by compiling \(\phi '\) to \(\phi \) and then using a ZKP for \(\phi \), one obtains a ZKP for \(\phi '\). There are many industrial [3, 5, 6, 14, 21, 45, 55, 66] and academic [4, 18, 28, 29, 46, 48, 50, 54, 63] ZKP compilers.

The correctness of a ZKP compiler is critical for security: a bug in the compiler could admit proofs of false statements. But verification is challenging for three reasons. First, the definition of correctness for a ZKP compiler is non-trivial; we discuss it later in this section. Second, ZKP compilers span multiple domains. The high-level predicate \(\phi '\) is typically expressed in a language with common types such as Booleans and fixed-width integers, while the output \(\phi \) is over a large, prime-order field. Thus, any compiler correctness definition must span these domains. Third, ZKP compilers are evolving and performance-critical; verification must not inhibit future changes or degrade compiler performance.

In this work, we develop tools for automatically verifying the field-blaster of a ZKP compiler. A ZKP compiler's field-blaster is the pass that converts a formula over Booleans, fixed-width integers, and finite-field elements into a system of field equations; as a transformation from bit-like types to field equations, the field-blaster exemplifies the challenge of cross-domain verification.

Our paper makes three contributions. First, we formulate a precise correctness definition for a ZKP compiler. Our definition ensures that a correct compiler preserves the completeness and soundness of the underlying ZK proof system.Footnote 1 More specifically, given a ZK proof system where statements are specified in a low-level language L, and a compiler from a high-level language H to L, if the compiler is correct by our definition, it extends the ZK proof systemā€™s soundness and completeness properties to statements in H. Further, our definition is preserved under sequential composition, so proving the correctness of each compiler pass individually suffices to prove correctness of the compiler itself.

Second, we give an architecture for a verifiable field-blaster. In our architecture, a field-blaster is a set of "encoding rules." We give verification conditions (VCs) for these rules, and we show that if the VCs hold, then the field-blaster is correct. Our approach supports automated verification because (bounded versions of) the VCs can be checked automatically. This reduces both the up-front cost of verification and its maintenance cost.

Third, we do a case study. Using our architecture, we implement a new field-blaster for CirC [46] ("SIR-see"), an infrastructure used by state-of-the-art ZKP compilers. We verify bounded versions of our field-blaster's VCs using SMT-based finite-field reasoning [47], and show that our field-blaster does not compromise CirC's performance. We also report on four bugs that our verification effort uncovered, including a soundness bug that allowed the prover to "lie" about the results of certain bit-vector comparisons. We note that the utility of our techniques is not limited to CirC: most ZKP compilers include something like the field-blaster we describe here.

In the next sections, we discuss related work (Sect. 1.1), give background on ZKPs and CirC (Sect. 2), present a field-blasting example (Sect. 3), describe our architecture (Sect. 4), give our verification conditions (Sect. 5), and present the case study (Sect. 6).

1.1 Related Work

Verified Compilers. There is a rich body of work on verifying the correctness of traditional compilers. We focus on compilation for ZKPs; this requires different correctness definitions that relate bit-like types to prime field elements. In the next paragraphs, we discuss more fine-grained differences.

Compiler verification efforts fall into two broad categories: automated, which leverages automated reasoning solvers, and foundational, which builds manual proofs in proof assistants (e.g., Coq [8] or Isabelle [44]). CompCert [36], for example, is a Coq-verified C compiler with verified optimization passes (e.g., [40]). Closest to our work is backend verification, which proves correct the translation from an intermediate representation to machine code. CompCert's lowering [37] is verified, as is CakeML's [31] lowering to different ISAs [19, 57]. While such foundational verification offers strong guarantees, it imposes a heavy proof burden; creating CompCert, for example, took an expert team eight years [56], and any updates to compiler code require updates to proofs.

Automated verification, in contrast, does not require writing and maintaining manual proofs.Footnote 2 Cobalt [34], Rhodium [35], and PEC [32] are domain-specific languages (DSLs) for writing automatically verified compiler optimizations and analyses. Most closely related to our work is Alive [39], a DSL for expressing verified peephole optimizations: local rewrites that transform snippets of LLVM IR [1] into better-performing ones. Alive addresses transformations over fixed types (while we address lowering to finite-field equations) and formulates correctness in the presence of undefined behavior (while we formulate correctness for ZKPs). Beyond Alive, Alive2 [38] provides translation validation [41, 51] for LLVM [33], and VeRA [10] verifies range analysis in the Firefox JavaScript engine.

There is also work on verified compilation for domains more closely related to ZKPs. The Porcupine [15] compiler automatically synthesizes representations for fully-homomorphic encryption [62], and Gillar [58] proves that optimization passes in the Qiskit [60] quantum compiler are semantics-preserving. While these works compile from high-level languages to circuit representations, the correctness definitions for their domains do not apply to ZKP compilers.

Verified Compilation to Cryptographic Proofs. Prior works on verified compilation for ZKPs (or similar) take the foundational approach (with attendant proof-maintenance burdens), and they do not formulate a satisfactory definition of compiler correctness. PinocchioQ [18] builds on CompCert [36]. The authors formulate a correctness definition that preserves the existential soundness of a ZKP but does not consider completeness, knowledge soundness, or zero-knowledge (see Sect. 2.2). Leo [14] is a ZKP compiler that produces (partial) ACL2 [27] proofs of correct compilation; work to emit proofs from its field-blaster is ongoing.

Recent work defines security for reductions of knowledge [30]. These let \(\mathcal {P}\) convince \(\mathcal {V}\) that it knows a witness for an instance of relation \(\mathcal {R}_1\) by proving it knows a witness for an instance of an easier-to-prove relation \(\mathcal {R}_2\). Unlike ZKP compilers, \(\mathcal {P}\) and \(\mathcal {V}\) interact to derive \(\mathcal {R}_2\) using \(\mathcal {V}\)'s randomness (e.g., proving that two polynomials are nonzero w.h.p. by proving that a random linear combination of them is), whereas ZKP compilers run ahead of time and non-interactively.

Further afield, Ecne [65] is a tool that attempts to verify that the input to a ZKP encodes a deterministic computation; it does not consider any notion of a specification of the intended behavior. A different work [25] attempts to automatically verify that a "widget" given to a ZKP meets some specification, considering widgets that could be constructed manually or with a compiler. Our focus is on verifying a compiler pass.

2 Background

2.1 Logic

We assume the usual terminology for many-sorted first-order logic with equality ([17] gives a complete presentation). We assume every signature includes the sort \(\textsf{Bool}\), constants \(\textsf{True}\) and \(\textsf{False}\) of sort \(\textsf{Bool}\), and a symbol family \(\approx _\sigma \) (abbreviated \(\approx \)) with sort \(\sigma \times \sigma \rightarrow \textsf{Bool}\) for each sort \(\sigma \). We also assume a family of conditionals: symbols \( ite _\sigma \) ("if-then-else", abbreviated \( ite \)) of sort \(\textsf{Bool}\times \sigma \times \sigma \rightarrow \sigma \).

A theory is a pair \(\mathcal {T} = (\varSigma , \textbf{I})\), where \(\varSigma \) is a signature and \(\textbf{I}\) is a class of \(\varSigma \)-interpretations. A \(\varSigma \)-formula is a term of sort \(\textsf{Bool}\). A \(\varSigma \)-formula \(\phi \) is satisfiable (resp., unsatisfiable) in \(\mathcal {T}\) if it is satisfied by some (resp., no) interpretation in \(\textbf{I}\). We focus on two theories. The first is \(\mathcal {T}_{ BV }\), the SMT-LIB theory of bit-vectors [52, 61], with signature \(\varSigma _{ BV }\) including a bit-vector sort \(\textsf{BV}_{[n]}\) for each \(n>0\), with bit-vector constants \(c_{[n]}\) of sort \(\textsf{BV}_{[n]}\) for each \(c\in [0,2^n-1]\), and operators including & and \(|\) (bitwise AND, OR) and \(+_{[n]}\) (addition modulo \(2^n\)). We write t[i] for the \(i^{th}\) bit of bit-vector t, where t[0] is the least-significant bit. The other theory is \(\mathcal {T}_{ F _{p}}\), the theory corresponding to the finite field of order p, for some prime p [47]. This theory has signature \(\varSigma _{ F _{p}}\) containing the sort \(\textsf{FF}_{p}\), constant symbols \(0, \ldots , p-1\), and operators \(+\) and \(\times \).

In this paper, we assume all interpretations interpret sorts and symbols in the same way. We write \(\textsf{dom}(v)\) for the set interpreting the sort of a variable v. We assume that \(\textsf{Bool}\), \(\textsf{True}\), and \(\textsf{False}\) are interpreted as \(\{\top , \bot \}\), \(\top \), and \(\bot \), respectively; \(\varSigma _{ BV }\)-interpretations follow the SMT-LIB standard; and \(\varSigma _{ F _{p}}\)-interpretations interpret symbols as the corresponding elements and operations in \(\mathbb {F}_p\), a finite field of order p (for concreteness, this could be the integers modulo p). Note that only the values of variables can vary between two interpretations.

For a signature \(\varSigma \), let t be a \(\varSigma \)-term of sort \(\sigma \), with free variables \(x_1,\ldots ,x_n\), respectively of sort \(\sigma _1,\ldots ,\sigma _n\). We define the function \(\hat{t}: \textsf{dom}(x_1) \times \cdots \times \textsf{dom}(x_n) \rightarrow \textsf{dom}(t)\) as follows. Let \(\mathbf {\textsf{x}}\in \textsf{dom}(x_1) \times \cdots \times \textsf{dom}(x_n)\), and let \(\mathcal {M}\) be an interpretation that interprets each \(x_i\) as \(\textsf{x}_i\). Then \(\hat{t}(\mathbf {\textsf{x}}) = t^{\mathcal {M}}\) (i.e., the interpretation of t in \(\mathcal {M}\)). For example, the term \(t = a \wedge \lnot a\) defines \(\hat{t}:\textsf{Bool}\rightarrow \textsf{Bool}= \lambda \,x.\,\bot \). In the following, we use the standard font (e.g., x) for logical variables and a sans serif font (e.g., \(\textsf{x}\)) for meta-variables standing for values (i.e., elements of \(\sigma ^\mathcal {M}\) for some \(\sigma \) and \(\mathcal {M}\)). Also, abusing notation, we conflate single variables (of both kinds) with vectors of variables when the distinction does not matter. Note that a formula \(\phi \) is satisfiable if there exist values \(\textsf{x}\) such that \(\hat{\phi }(\textsf{x}) = \top \); it is valid if for all values \(\textsf{x}\), \(\hat{\phi }(\textsf{x}) = \top \).

For terms s,Ā t and variable x, \(t[x\mapsto s]\) denotes t with all occurrences of x replaced with s. For a sequence of variable-term pairs, \(S = (x_1 \mapsto s_1, \dots , x_n \mapsto s_n)\), t[S] is defined to be \(t[x_1\mapsto s_1]\cdots [x_n\mapsto s_n]\).

Fig. 1. The information flow for a zero-knowledge proof.

2.2 Zero Knowledge Proofs

As mentioned above, zero-knowledge proofs (ZKPs) make it possible to prove that some secret data satisfies a public property, without revealing the data itself. See [59] for a full presentation; we give a brief overview here, and then describe how general-purpose ZKPs are used.

Overview and Definitions. In a cryptographic proof system, there are two parties: a verifier \(\mathcal {V}\) and a prover \(\mathcal {P}\). \(\mathcal {V}\) knows a public instance \(\textsf{x}\) and asks \(\mathcal {P}\) to show that it has knowledge of a secret witness \(\textsf{w}\) satisfying a public predicate \(\phi (x, w)\) from a predicate class \(\varPhi \) (a set of formulas), i.e., that \(\hat{\phi }(\textsf{x},\textsf{w})=\top \). Figure 1 illustrates the workflow. First, a trusted party runs an efficient (i.e., polytime in an implicit security parameter \(\lambda \)) algorithm \(\textsf{Setup}(\phi )\), which produces a proving key \(\textsf{pk}\) and a verifying key \(\textsf{vk}\). Then, \(\mathcal {P}\) runs an efficient algorithm \(\textsf{Prove}(\textsf{pk}, \textsf{x}, \textsf{w}) \rightarrow \pi \) and sends the resulting proof \(\pi \) to \(\mathcal {V}\). Finally, \(\mathcal {V}\) runs an efficient verification algorithm \(\textsf{Verify}(\textsf{vk}, \textsf{x}, \pi ) \rightarrow \{\top ,\bot \}\) that accepts or rejects the proof. A zero-knowledge argument of knowledge for class \(\varPhi \) is a tuple \(\Pi = (\textsf{Setup}, \textsf{Prove}, \textsf{Verify})\) with three informal properties for every \(\phi \in \varPhi \) and every \(\textsf{x}\in \textsf{dom}(x),\textsf{w}\in \textsf{dom}(w)\):

  • perfect completeness: if \(\hat{\phi }(\textsf{x},\textsf{w})\) holds, then \(\textsf{Verify}(\textsf{vk}, \textsf{x}, \pi )\) holds;

  • computational knowledge soundness [9]: an efficient adversary that does not know \(\textsf{w}\) cannot produce a \(\pi \) such that \(\textsf{Verify}(\textsf{vk}, \textsf{x}, \pi )\) holds; and

  • zero-knowledge [22]: \(\pi \) reveals nothing about \(\textsf{w}\), other than its existence.

Technically, the system is an "argument" rather than a "proof" because soundness only holds against efficient adversaries. Also note that knowledge soundness requires that an entity must "know" a valid witness to produce a proof; it is not enough for a valid witness to simply exist. We give more precise definitions in Appendix A.

Representations for ZKPs. As mentioned above, ZKP applications are manifold (Sect. 1), from cryptocurrencies to private registries. This breadth of applications is possible because ZKPs support a broad class of predicates. Most commonly, these predicates are expressed as rank-1 constraint systems (R1CSs). Recall that \(\mathbb {F}_p\) is a prime-order finite field (also called a prime field); we drop the subscript p when it is not important. In an R1CS, \(\textsf{x}\) and \(\textsf{w}\) are vectors of elements in \(\mathbb {F}\); let \(\textsf{z}\in \mathbb {F}^m\) be their concatenation. The function \(\hat{\phi }\) can be defined by three matrices \(\textsf{A}, \textsf{B}, \textsf{C}\in \mathbb {F}^{n \times m}\); \(\hat{\phi }(\textsf{x},\textsf{w})\) holds when \(\textsf{A}\textsf{z}\circ \textsf{B}\textsf{z}= \textsf{C}\textsf{z}\), where \(\circ \) is the element-wise product. Thus, \(\phi \) can be viewed as n conjoined constraints, where each constraint i has the form \((\sum _j a_{ij}z_j) \times (\sum _j b_{ij}z_j) \approx (\sum _j c_{ij}z_j)\) (where the \(a_{ij}\), \(b_{ij}\), and \(c_{ij}\) are constant symbols from \(\varSigma _{ F _{p}}\), and the \(z_j\) are a vector of variables of sort \(\textsf{FF}_{p}\)). That is, each constraint enforces a single non-linear multiplication.
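The matrix form above is easy to check directly. The following is a minimal sketch (not CirC's code; the prime 97 and the helper names are illustrative) that verifies \(\textsf{A}\textsf{z}\circ \textsf{B}\textsf{z}= \textsf{C}\textsf{z}\) for a single-constraint R1CS enforcing one multiplication:

```python
# Sketch: checking an R1CS instance over a toy field F_p.
p = 97  # illustrative prime; real ZKP fields have p on the order of 2^255

def mat_vec(M, z):
    """Multiply matrix M by vector z over F_p."""
    return [sum(a * b for a, b in zip(row, z)) % p for row in M]

def r1cs_holds(A, B, C, z):
    """Check Az o Bz = Cz, where o is the element-wise product."""
    Az, Bz, Cz = mat_vec(A, z), mat_vec(B, z), mat_vec(C, z)
    return all((a * b) % p == c for a, b, c in zip(Az, Bz, Cz))

# One constraint enforcing z1 * z2 = z3, over z = (1, z1, z2, z3):
A = [[0, 1, 0, 0]]
B = [[0, 0, 1, 0]]
C = [[0, 0, 0, 1]]
assert r1cs_holds(A, B, C, [1, 3, 5, 15])      # 3 * 5 = 15
assert not r1cs_holds(A, B, C, [1, 3, 5, 14])  # 3 * 5 != 14
```

The constant-1 entry of z is the standard trick that lets the linear combinations include constants.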

2.3 Compilation Targeting Zero Knowledge Proofs

To write a ZKP about a high-level predicate \(\phi \), that predicate is first compiled to an R1CS. A ZKP compiler from class \(\varPhi \) (a set of \(\varSigma \)-formulas) to class \(\varPhi '\) (a set of \(\varSigma '\)-formulas) is an efficient algorithm \(\textsf{Compile}(\phi \in \varPhi ) \rightarrow (\phi '\in \varPhi ', \textsf{Ext}_x, \textsf{Ext}_w)\). Given a predicate \(\phi (x,w)\), it returns a predicate \(\phi '(x', w')\) as well as two efficient and deterministic algorithms, the instance and witness extenders: \(\textsf{Ext}_x : \textsf{dom}(x) \rightarrow \textsf{dom}(x')\) and \(\textsf{Ext}_w: \textsf{dom}(x)\times \textsf{dom}(w) \rightarrow \textsf{dom}(w')\).Footnote 3 For example, CirC [46] can compile a Boolean-returning C function (in a subset of C) to an R1CS.

At a high level, \(\phi \) and \(\phi '\) should be "equisatisfiable", with \(\textsf{Ext}_x\) and \(\textsf{Ext}_w\) mapping satisfying values for \(\phi \) to satisfying values for \(\phi '\). That is, for all \(\textsf{x}\in \textsf{dom}(x)\) and \(\textsf{w}\in \textsf{dom}(w)\) such that \(\hat{\phi }(\textsf{x}, \textsf{w}) = \top \), if \(\textsf{x}' = \textsf{Ext}_x(\textsf{x})\) and \(\textsf{w}' = \textsf{Ext}_w(\textsf{x}, \textsf{w})\), then \(\hat{\phi '}(\textsf{x}', \textsf{w}') = \top \). Furthermore, for any \(\textsf{x}\), it should be impossible to (efficiently) find a \(\textsf{w}'\) satisfying \(\hat{\phi '}(\textsf{Ext}_x(\textsf{x}), \textsf{w}') = \top \) without knowing a \(\textsf{w}\) satisfying \(\hat{\phi }(\textsf{x}, \textsf{w}) = \top \). In Sect. 5.1, we precisely define correctness for a predicate compiler.
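To make the definition concrete, here is a toy, hypothetical instance (our own example, not from the case study): compiling the Boolean predicate \(\phi(x,w) = x \wedge w\) to field equations via the 0/1 encoding, with extenders mapping Booleans to field values. The names phi_prime, Ext_x, and Ext_w are illustrative.

```python
# Toy sketch of Compile's outputs for phi(x, w) = x AND w (hypothetical).
# phi'(x', w') := x'*w' = 1  and  x'(x'-1) = 0  and  w'(w'-1) = 0, over F_p.
p = 97  # illustrative prime

def phi(x, w):
    return x and w

def phi_prime(xp, wp):
    return ((xp * wp) % p == 1
            and (xp * (xp - 1)) % p == 0
            and (wp * (wp - 1)) % p == 0)

Ext_x = lambda x: 1 if x else 0      # instance extender
Ext_w = lambda x, w: 1 if w else 0   # witness extender

# Satisfying values for phi map to satisfying values for phi'.
for x in (False, True):
    for w in (False, True):
        if phi(x, w):
            assert phi_prime(Ext_x(x), Ext_w(x, w))
```

The converse direction (knowing a satisfying \(\textsf{w}'\) implies knowing a satisfying \(\textsf{w}\)) is the soundness half of the definition in Sect. 5.1.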

One can build a ZKP for class \(\varPhi \) from a compiler from \(\varPhi \) to \(\varPhi '\) and a ZKP for \(\varPhi '\). Essentially, one runs the compiler to get a predicate \(\phi '\in \varPhi '\), as well as \(\textsf{Ext}_x\) and \(\textsf{Ext}_w\). Then, one writes a ZKP to show that \(\hat{\phi '}(\textsf{Ext}_x(\textsf{x}), \textsf{Ext}_w(\textsf{x}, \textsf{w})) = \top \). In Appendix A, we give this construction in full and prove it is secure.

Optimization. The primary challenge when using ZKPs is cost: typically, \(\textsf{Prove}\) is at least three orders of magnitude slower than checking \(\phi \) directly [64]. Since \(\textsf{Prove}\)'s cost scales with n (the constraint count), it is critical for the compiler to minimize n. The space of optimizations is large and complex, for two reasons. First, the compiler can introduce fresh variables. Second, only equisatisfiability, not logical equivalence, is needed. Compilers in this space exploit equisatisfiability heavily to efficiently represent high-level constructs (e.g., Booleans, bit-vectors, arrays, ...) as an R1CS.

As a (simple!) example, consider the Boolean computation \(a \approx c_1 \vee \dots \vee c_k\). Assume that \(c'_1, \dots , c'_k\) are variables of sort \(\textsf{FF}_{}\) and that we add constraints \(c'_i(1-c'_i) \approx 0\) to ensure that each \(c'_i\) is 0 or 1. Assume further that \((c'_i \approx 1)\) encodes \(c_i\) for each i. How can one additionally ensure that \(a'\) (also of sort \(\textsf{FF}_{}\)) is forced to be 0 or 1 and that \((a' \approx 1)\) is a correct encoding of a? Given that there are \(k-1\) ORs, natural approaches use \(\Theta (k)\) constraints. One clever approach is to introduce a variable \(x'\) and enforce the constraints \(x'(\sum _i c'_i)\approx a'\) and \((1-a')(\sum _i c'_i)\approx 0\). In any interpretation where some \(c_i\) is true, the interpretation of \(a'\) must be 1 to satisfy the second constraint; setting \(x'\) to the sum's inverse satisfies the first. If all \(c_i\) are false, the first constraint ensures \(a'\) is 0. This technique assumes the sum does not overflow; since ZKP fields are typically large (e.g., with p on the order of \(2^{255}\)), this is usually a safe assumption.
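A brute-force check of this gadget over a toy field confirms the claim that \(a'\) is forced to the OR of the inputs (and, in particular, to a 0/1 value). This is a sketch; k = 3 and p = 97 are illustrative choices, not values from the text.

```python
# Exhaustively check the OR gadget:
#   x'*(sum c'_i) = a'   and   (1 - a')*(sum c'_i) = 0   over F_97, k = 3.
from itertools import product

p = 97

def gadget_satisfiable(cs, a):
    """Is there an x' in F_p satisfying both constraints for bits cs and output a?"""
    s = sum(cs) % p
    return any((x * s) % p == a and ((1 - a) * s) % p == 0 for x in range(p))

for cs in product([0, 1], repeat=3):
    want = 1 if any(cs) else 0
    for a in range(p):  # try every field value for a', not just 0 and 1
        assert gadget_satisfiable(cs, a) == (a == want)
```

Note that the check ranges a' over the whole field: no choice of \(x'\) can make the constraints hold for any value of \(a'\) other than the correct 0/1 encoding of the OR.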

Fig. 2. The architecture of CirC.

CirC. CirC [46] is an infrastructure for building compilers from high-level languages (e.g., a C subset) to R1CSs. It has been used in research projects [4, 12] and in industrial R&D. Figure 2 shows the structure of an R1CS compiler built with CirC. First, the front-end of the compiler converts the source program into CirC-IR, a term IR based on SMT-LIB that includes Booleans, bit-vectors, fixed-size arrays, tuples, and prime fields.Footnote 4 Second, the compiler optimizes and simplifies the IR so that the only remaining sorts are Booleans, bit-vectors, and the target prime field. Third, the compiler lowers the simplified IR to an R1CS predicate over the target field. For ZKPs built with CirC, the completeness, soundness, and zero-knowledge of the end-to-end system depend on the correctness of CirC itself.

3 Overview andĀ Example

To start, we view CirC's lowering pass as two passes (Fig. 2). The first pass, "(finite-)field-blasting," converts a many-sorted IR (representable as a (\(\varSigma _{ BV }\cup \varSigma _{ F _{}}\))-formula) to a conjunction of field equations (\(\varSigma _{ F _{}}\)-equations). The second pass, "flattening," converts this conjunction of field equations to an R1CS.

Our focus is on verifying the first pass. We begin with a worked example of how to field-blast a small snippet of CirC-IR (Sect. 3.1). This example illustrates four key ideas (Sect. 3.2) that inspire our field-blaster's architecture.

3.1 An Example ofĀ Field-Blasting

We start with an example CirC-IR predicate expressed as a (\(\varSigma _{ BV }\cup \varSigma _{ F _{}}\))-formula:

$$ \begin{aligned} \phi \triangleq (x_0 \oplus w_0) \wedge (w_1 +_{[4]} x_1 \approx w_1) \wedge (x_2~ \& ~w_1 \approx x_2) \wedge (x_3 \approx w_2 \times w_2) \end{aligned}$$
(1)

The predicate includes the XOR of two Booleans ("\(\oplus \)"), a bit-vector sum, a bit-vector AND, and a field product. \(x_0\) and \(w_0\) are of sort \(\textsf{Bool}\); \(x_1\), \(x_2\), and \(w_1\) are of sort \(\textsf{BV}_{[4]}\); and \(x_3\) and \(w_2\) are of sort \(\textsf{FF}_{p}\). We assume that \(p \gg 2^4\). Table 1 summarizes the new variables and assertions we create during field-blasting; we describe the origin of each assertion and new variable in the next paragraphs.

Lowering Clause One (Booleans). We begin with the Boolean term \((x_0 \oplus w_0)\). We will use 1 and 0 to represent \(\top \) and \(\bot \). We introduce variables \(x'_0\) and \(w'_0\) of sort \(\textsf{FF}_{p}\) to represent \(x_0\) and \(w_0\) respectively. To ensure that \(w'_0\) is 0 or 1, we assert: \(w'_0(w'_0-1) \approx 0\).Footnote 5 \(x_0 \oplus w_0\) is then represented by the expression \(1 - x'_0 - w'_0 + 2x'_0w'_0\). Setting this equal to 1 enforces that \(x_0 \oplus w_0\) must be true. These new assertions and fresh variables are reflected in the first three rows of the table.

Table 1. New variables and assertions when compiling the example \(\phi \).

Lowering Clauses Two and Three (Bit-vectors). Before describing how to bit-blast the second and third clauses in \(\phi \), we discuss bit-vector representations in general. A bit-vector t can be viewed as a sequence of b bits or as a non-negative integer less than \(2^b\). These two views suggest two natural representations in a prime-order field: first, as one field element \(t'_u\) whose unsigned value agrees with t (assuming the field's size is at least \(2^b\)); second, as b elements \(t'_0, \dots , t'_{b-1}\) that encode the bits of t as 0 or 1 (in our encoding, \(t'_0\) is the least-significant bit and \(t'_{b-1}\) is the most-significant bit). The first representation is simple, but with it, some field values (e.g., \(2^b\)) do not correspond to any possible bit-vector. With the second approach, by including the equations \(t'_i(t'_i-1)\approx 0\) in our system, we ensure that any satisfying assignment corresponds to a valid bit-vector; however, the extra b equations increase the size of our compiler's output.

We represent \(\phi \)ā€™s \(w_1\) bit-wise: as \(w'_{1,0}, \dots , w'_{1,3}\), and we represent the instance variable \(x_1\) as \(x'_{1,u}\).Footnote 6 For the constraint \(w_1 +_{[4]} x_1 \approx w_1\), we compute the sum in the field and bit-decompose the result to handle overflow. First, we introduce new variable \(s'\) and set it equal to \(x'_{1,u} + \sum _{i=0}^3 2^iw'_{1,i}\). Then, we bit-decompose \(s'\), requiring \(s' \approx \sum _{i=0}^4 2^is'_{i}\), and \(s'_i(s'_i-1) \approx 0\) for \(i\in [0,4]\). Finally, we assert \(s'_i\approx w'_{1,i}\) for \(i\in [0,3]\). This forces the lowest 4 bits of the sum to be equal to \(w_1\).
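This lowering can be checked exhaustively for 4-bit inputs. The sketch below (illustrative; the helper name lower_bv_add is ours) computes the sum in the field, bit-decomposes it into 5 bits, and keeps the low 4 bits, matching bit-vector addition modulo \(2^4\):

```python
# Sketch of the clause-two lowering: add in the field, bit-decompose the
# 5-bit sum, and keep the low 4 bits to implement addition modulo 2^4.
p = 97  # illustrative prime with p >> 2^4, as the text assumes

def lower_bv_add(x_u, w_bits):
    # The sum is at most 15 + 15 = 30 < 2^5 < p, so no field wraparound.
    s = (x_u + sum(2**i * b for i, b in enumerate(w_bits))) % p
    s_bits = [(s >> i) & 1 for i in range(5)]
    assert s == sum(2**i * b for i, b in enumerate(s_bits)) % p  # bit-sum eq.
    return s_bits[:4]  # the low 4 bits are the bit-vector result

for x in range(16):
    for w in range(16):
        w_bits = [(w >> i) & 1 for i in range(4)]
        got = sum(2**i * b for i, b in enumerate(lower_bv_add(x, w_bits)))
        assert got == (x + w) % 16
```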

The constraint \( x_2~ \& ~w_1\approx x_2\) is more challenging. Since \(x_2\) is an instance variable, we initially encode it as \(x'_{2,u}\). Then, we consider the bit-wise AND. There is no obvious way to encode a bit-wise operation, other than bit-by-bit. So, we convert \(x'_{2,u}\) to a bit-wise representation: We introduce witness variables \(x'_{2,0}, \dots , x'_{2, 3}\) and equations \(x'_{2,i}(x'_{2,i}-1)\approx 0\) as well as equation \(x'_{2,u} \approx \sum _{i=0}^3 2^i x'_{2,i}\). Then, for each i we require \(x'_{2,i} w'_{1,i} \approx x'_{2,i}\).
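The bit-split plus per-bit constraints can also be checked exhaustively. In the sketch below (illustrative; and_clause_holds is our name), the per-bit equations \(x'_{2,i} w'_{1,i} \approx x'_{2,i}\) hold exactly when \(x_2~\&~w_1 = x_2\):

```python
# Sketch of the clause-three lowering: bit-split x'_2,u, then assert
# x_i * w_i = x_i for each bit, which holds iff x & w == x.
p = 97  # illustrative prime

def and_clause_holds(x, w):
    x_bits = [(x >> i) & 1 for i in range(4)]  # the introduced witness bits
    w_bits = [(w >> i) & 1 for i in range(4)]
    # The bit-sum equation relating x'_2,u to its bits:
    assert x == sum(2**i * b for i, b in enumerate(x_bits)) % p
    return all((xi * wi) % p == xi for xi, wi in zip(x_bits, w_bits))

for x in range(16):
    for w in range(16):
        assert and_clause_holds(x, w) == ((x & w) == x)
```

Per bit, \(x_i w_i = x_i\) says that whenever \(x_i = 1\), \(w_i\) must also be 1, which is exactly the condition for the AND to leave \(x_2\) unchanged.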

Lowering the Final Clause (Field Elements). Finally, we consider the field equation \(x_3 \approx w_2 \times w_2\). Our target is field equations as well, so lowering this clause is straightforward: we simply introduce primed variables and copy the equation.

3.2 Key Ideas

This example highlights four ideas that guide the design of our field-blaster:

  1. fresh variables and assertions: Field-blasting uses two primitive operations: creating new variables in \(\phi '\) (e.g., \(w'_0\) to represent \(w_0\)) and adding new assertions to \(\phi '\) (e.g., \(w'_0(w'_0-1)\approx 0\)).

  2. encodings: For a term t in \(\phi \), we construct a field term (or collection of field terms) in \(\phi '\) that represents the value of t. For example, the Boolean \(w_0\) is represented as the field element \(w'_0\) that is 0 or 1.

  3. operator rules: If t is an operator applied to some arguments, we can encode t given encodings of the arguments. For example, if t is \(x_0 \oplus w_0\), and \(x_0\) is encoded as \(x_0'\) and \(w_0\) as \(w_0'\), then t can be encoded as \(1-x'_0-w'_0+2x'_0w'_0\).

  4. conversions: Some sorts can be represented by encodings of different kinds. If a term has multiple possible encodings, the compiler may need to convert between them to apply some operator rule. For example, we converted \(x_2\) from an unsigned encoding to a bit-wise encoding before handling an AND.

4 Architecture

In this section, we present our field-blaster architecture. To compile a predicate \(\phi \) to a system of field equations \(\phi '\), our architecture processes each term t in \(\phi \) using a post-order traversal. Informally, it represents each t as an "encoding" in \(\phi '\): a term (or collection of terms) over variables in \(\phi '\). Each encoding is produced by a small algorithm called an "encoding rule".

Below, we define the type of encodings \(\textsf{Enc}\) (Sect. 4.1), the five different types of encoding rules (Sect. 4.2), and a calculus that iteratively applies these rules to compile all of \(\phi \) (Sect. 4.3).

4.1 Encodings

Table 2. Encodings for each term sort. Only bit-vectors have two encoding kinds.

Table 2 presents our tagged union type \(\textsf{Enc}\) of possible term encodings. Each variant comprises the term being encoded, its tag (the encoding kind), and a sequence of field terms. The encoding kinds are \(\texttt{bit}\) (a Boolean as 0/1), \(\texttt{uint}\) (a bit-vector as an unsigned integer), \(\texttt{bits}\) (a bit-vector as a sequence of bits), and \(\texttt{field}\) (a field term trivially represented as a field term). Each encoding has an intended semantics: a condition under which the encoding is considered valid. For instance, a \(\texttt{bit}\) encoding of Boolean t is valid if the field term f is equal to \( ite (t, 1, 0)\).
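In code, such a tagged union might look like the following hypothetical sketch (the names Kind and Enc are ours, for illustration; this is not CirC's actual implementation, which is in Rust):

```python
# Hypothetical sketch of the tagged union Enc: the encoded term, a kind
# tag, and the sequence of field terms that represent it.
from dataclasses import dataclass
from enum import Enum, auto

class Kind(Enum):
    BIT = auto()    # Boolean as a 0/1 field element
    UINT = auto()   # bit-vector as one unsigned field element
    BITS = auto()   # bit-vector as a sequence of 0/1 field elements
    FIELD = auto()  # field term represented as itself

@dataclass
class Enc:
    term: str          # (the name of) the term being encoded
    kind: Kind
    field_terms: list  # one term for BIT/UINT/FIELD; b terms for BITS

# A 4-bit bit-vector witness encoded bit-wise:
e = Enc("w1", Kind.BITS, ["w1_0", "w1_1", "w1_2", "w1_3"])
assert e.kind is Kind.BITS and len(e.field_terms) == 4
```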

4.2 Encoding Rules

An encoding rule is an algorithm that takes and/or returns encodings, in order to represent some part of the input predicate as field terms and equations.

Primitive Operations. A rule can perform two primitive operations: creating new variables and emitting assertions. In our pseudocode, the primitive function \(\textsf{fresh}(\textsf{name}, t, \textsf{isInst}) \rightarrow x'\) creates a fresh variable. Argument \(\textsf{isInst}\) is a Boolean indicating whether \(x'\) is an instance variable (as opposed to a witness). Argument t is a field term (over variables from \(\phi \) and previously defined primed variables) that expresses how to compute a value for \(x'\). For example, to create a field variable \(w'\) that represents the Boolean witness variable w, a rule can call \(\textsf{fresh}(w', ite (w, 1, 0), \bot )\). The compiler uses t to help create the \(\textsf{Ext}_x\) and \(\textsf{Ext}_w\) algorithms. A rule asserts a formula \(t'\) (over primed variables) by calling \(\textsf{assert}(t')\).

Fig. 3. Pseudocode for some bit-vector rules: \(\textsf{variable}\) uses a \(\texttt{uint}\) encoding for instances and bit-splits witnesses to ensure they're well-formed, \(\textsf{const}\) bit-splits the constant it's given, \(\textsf{assertEq}\) asserts unsigned or bit-wise equality, and \(\textsf{convert}\) either does a bit-sum or a bit-split.

Rule Types. There are five types of rules: (1) Variable rules \(\textsf{variable}(t, \textsf{isInst}) \rightarrow e\) take a variable t and its instance/witness status and return an encoding of that variable made up of fresh variables. (2) Constant rules \(\textsf{const}(t) \rightarrow e\) take a constant term t and produce an encoding of t comprising terms that depend only on t. Since t is a constant, the terms in e can be evaluated to field constants (see the calculus in Sect. 4.3).Footnote 7 The \(\textsf{const}\) rule cannot call \(\textsf{fresh}\) or \(\textsf{assert}\). (3) Equality rules \(\textsf{assertEq}(e, e')\) take two encodings of the same kind and emit assertions that equate the underlying terms. (4) Conversion rules \(\textsf{convert}(e, \textsf{kind}')\rightarrow e'\) take an encoding and convert it to an encoding of a different kind. Conversions are only non-trivial for bit-vectors, which have two encoding kinds: \(\texttt{uint}\) and \(\texttt{bits}\). (5) Operator rules apply to terms t of the form \(o(t_1, \dots , t_n)\). Each operator rule takes t, o, and encodings of the child terms \(t_i\) and returns an encoding of t. Some operator rules require specific kinds of encodings; before using such an operator rule, our calculus (Sect. 4.3) calls the convert rule to ensure the input encodings are of the correct kind. Figure 3 gives pseudocode for the first four rule types, as applied to bit-vectors. Figure 4 gives pseudocode for two bit-vector operator encoding rules. A field-blaster uses many operator rules: in our case study (Sect. 6) there are 46.
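As an illustration of how the primitives and rule types fit together, here is a hypothetical Python sketch of a bit-vector \(\textsf{variable}\) rule in the spirit of Fig. 3 (all names and details are ours, not CirC's): instances get a single \(\texttt{uint}\) encoding, while witnesses are bit-split into fresh variables constrained to 0 or 1.

```python
# Hypothetical sketch of a bit-vector variable rule (not CirC's actual code).
assertions = []   # formulas emitted via assert_
fresh_defs = []   # (name, defining term, is_instance) triples from fresh()

def fresh(name, defining_term, is_inst):
    """Create a fresh variable; defining_term says how to compute its value."""
    fresh_defs.append((name, defining_term, is_inst))
    return name

def assert_(formula):
    """Emit a field equation into the compiled system."""
    assertions.append(formula)

def bv_variable(t, b, is_inst):
    """Encode a bit-vector variable t of width b."""
    if is_inst:
        # Instances are supplied by the verifier, so a uint encoding suffices.
        u = fresh(f"{t}_u", f"uint({t})", True)
        return ("uint", [u])
    # Witnesses are bit-split; constraining each bit to 0/1 ensures every
    # satisfying assignment corresponds to a valid bit-vector.
    bits = [fresh(f"{t}_{i}", f"bit({t},{i})", False) for i in range(b)]
    for v in bits:
        assert_(f"{v}*({v} - 1) = 0")
    return ("bits", bits)

kind, terms = bv_variable("w1", 4, is_inst=False)
assert kind == "bits" and len(terms) == 4 and len(assertions) == 4
```

The defining terms passed to fresh are what the compiler would later use to assemble \(\textsf{Ext}_x\) and \(\textsf{Ext}_w\).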

Fig. 4. Pseudocode for some bit-vector operator rules. bvZeroExt zero-extends a bit-vector; for bit-wise encodings, it adds zero bits, and for unsigned encodings, it simply copies the original encoding. bvMulUint multiplies bit-vectors, all assumed to be unsigned encodings. We show only the case where the multiplication cannot overflow in the field: in this case, the rule performs the multiplication in the field and bit-splits the result to implement reduction modulo \(2^b\). The rules use ff2bv, which converts a field element to a bit-vector (discussed in Sect. 6.1).

4.3 Calculus

We now give a non-deterministic calculus describing how our field-blaster applies rules to compile a predicate \(\phi (x,w)\) into a system of field equations.

A calculus state is a tuple of three items: (E, A, F). The encoding store E is a (multi-)map from terms to sets of encodings. The assertions formula A is the conjunction of all field equations asserted via \(\textsf{assert}\). The fresh-variable definition sequence F consists of pairs (v, t), each matching a single call to \(\textsf{fresh}(v,t,\dots )\).
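One possible concrete representation of this state (illustrative only; CirC's implementation differs):

```python
from dataclasses import dataclass, field

@dataclass
class State:
    """The calculus state (E, A, F) of Sect. 4.3, with illustrative types."""
    E: dict = field(default_factory=dict)  # term -> set of encodings
    A: list = field(default_factory=list)  # conjunction of asserted field eqs
    F: list = field(default_factory=list)  # sequence of (var, defining term)

    def add_encoding(self, t, e):
        """E ∪ (t ↦ e): add encoding e to t's encoding set."""
        self.E.setdefault(t, set()).add(e)
```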

Figure 5 shows the transitions of our calculus. We denote the result of a rule as \(A', F', e' \leftarrow r(\dots )\), where \(A'\) is a formula capturing any new assertions, \(F'\) is a sequence of pairs capturing any new variable definitions, and \(e'\) is the rule's return value. We may omit one or more results if they are always absent for a particular rule. For encoding store E, \(E \cup (t \mapsto e)\) denotes the store with e added to t's encoding set.

Fig. 5. The transition rules of our rewriting calculus.

There are five kinds of transitions. The \(\textsf{Const}\) transition adds an encoding for a constant term. The \(\textsf{const}\) rule returns an encoding e whose terms depend on the constant c; \(e'\) is a new encoding identical to e, except that each of its terms has been evaluated to obtain a field constant. The \(\textsf{Var}\) transition adds an encoding for a variable term. The \(\textsf{Conv}\) transition takes a term that is already encoded and re-encodes it with a new encoding kind. The \(\textsf{kinds}\) operator returns all legal values of \(\textsf{kind}\) for encodings of a given sort. The \(\textsf{Op}_{r}\) transition applies operator rule r. This transition is only possible if r's operator kind agrees with o, and if its input encoding kinds agree with \(\vec e\). The \(\textsf{Finish}\) transition applies when \(\phi \) has been encoded. It uses \(\textsf{const}\) and \(\textsf{assertEq}\) to build assertions that hold when \(\phi = \top \). Rather than producing a new calculus state, it returns the outputs of the calculus: the assertions and the variable definitions.

To meet the requirements of the ZKP compiler, our calculus must return two extension functions: \(\textsf{Ext}_x\) and \(\textsf{Ext}_w\) (Sect. 2.2). Both can be constructed from the fresh variable definitions F. One subtlety is that \(\textsf{Ext}_x(x)\) (which assigns values to fresh instance variables) is a function of x only; it cannot depend on the witness variables of \(\phi \). We ensure this by allowing fresh instance variables to be created only by the \(\textsf{variable}\) rule, and only when it is called with \(\textsf{isInst} = \top \).
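A hedged sketch of how \(\textsf{Ext}_x\) and \(\textsf{Ext}_w\) might be built from F, assuming a term evaluator `eval_term(t, env)` is available (all names here are hypothetical):

```python
def make_extenders(F, inst_vars, eval_term):
    """Build Ext_x and Ext_w from the definition sequence F (a sketch).

    `F` is a list of (var, term) pairs in creation order; `inst_vars` is
    the set of fresh instance variables; `eval_term(t, env)` evaluates a
    term under an assignment.  Ext_x consults only the instance
    assignment, which suffices because instance definitions never mention
    witness variables.
    """
    def ext_x(x_env):
        env, out = dict(x_env), {}
        for v, t in F:
            if v in inst_vars:
                env[v] = out[v] = eval_term(t, env)
        return out

    def ext_w(x_env, w_env):
        env, out = {**x_env, **w_env}, {}
        for v, t in F:
            env[v] = eval_term(t, env)
            if v not in inst_vars:
                out[v] = env[v]
        return out

    return ext_x, ext_w

# Tiny illustrative evaluator: a term is a variable name or
# ("ite", cond_var, then_val, else_val).
def eval_term(t, env):
    if isinstance(t, str):
        return env[t]
    _op, c, a, b = t
    return a if env[c] else b

F = [("x'", "x"), ("w'", ("ite", "w", 1, 0))]
ext_x, ext_w = make_extenders(F, {"x'"}, eval_term)
```

Here `ext_x({"x": 5})` assigns only the fresh instance variable, while `ext_w` evaluates the whole definition sequence and returns the fresh witness values.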

Strategy. Our calculus is non-deterministic: multiple transitions are possible in some situations; for example, some conversion is almost always applicable. The strategy that decides which transition to apply affects field-blaster performance (Appendix D) but not correctness.

5 Verification Conditions

In this section, we first define correctness for a ZKP compiler (Sect. 5.1). Then, we give verification conditions (VCs) for each type of encoding rule (Sect. 5.2). Finally, we show that if these VCs hold, our calculus is a correct ZKP compiler (Sect. 5.3).

5.1 Correctness Definition

Definition 1 (Correctness)

A ZKP compiler \(\textsf{Compile}(\phi ) \rightarrow (\phi ', \textsf{Ext}_x, \textsf{Ext}_w)\) is correct if it is demonstrably complete and demonstrably sound.

  • demonstrable completeness: For all \(\textsf{x}\in \textsf{dom}(x),\textsf{w}\in \textsf{dom}(w)\) such that \(\hat{\phi }(\textsf{x},\textsf{w})=\top \),

    $$ \hat{\phi '}(\textsf{Ext}_x(\textsf{x}),\textsf{Ext}_w(\textsf{x},\textsf{w})) = \top $$
  • demonstrable soundness: There exists an efficient algorithm \(\textsf{Inv}(\textsf{x}', \textsf{w}') \rightarrow \textsf{w}\) such that for all \(\textsf{x}\in \textsf{dom}(x),\textsf{w}'\in \textsf{dom}(w')\) such that \(\hat{\phi '}(\textsf{Ext}_x(\textsf{x}),\textsf{w}')=\top \),

    $$ \hat{\phi }(\textsf{x}, \textsf{Inv}(\textsf{Ext}_x(\textsf{x}),\textsf{w}'))=\top $$

Demonstrable completeness (respectively, soundness) requires the existence of a witness for \(\phi '\) (resp., \(\phi \)) when a witness exists for \(\phi \) (resp., \(\phi '\)); this existence is demonstrated by an efficient algorithm \(\textsf{Ext}_w\) (resp., \(\textsf{Inv}\)) that computes the witness.
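To illustrate Definition 1, consider a toy "compiler" for the predicate \(\phi (x, w) = x \wedge w\) over Booleans, with output equations over a tiny prime field; the check below exhaustively verifies both properties (the field, the predicate, and all names are illustrative, not from the paper's implementation):

```python
P = 7  # a tiny illustrative prime field

# Toy predicate: phi(x, w) = x AND w, with Boolean instance x, witness w.
def phi(x, w):
    return x and w

# Compiled predicate phi'(x', w') over F_P: w' must be Boolean-valued
# (w' * (w' - 1) = 0) and x' * w' = 1 must hold.
def phi_prime(x1, w1):
    return (w1 * (w1 - 1)) % P == 0 and (x1 * w1) % P == 1

Ext_x = lambda x: int(x)        # instance extension
Ext_w = lambda x, w: int(w)     # witness extension
Inv   = lambda x1, w1: w1 == 1  # witness inverter for soundness

# Demonstrable completeness: phi(x, w) implies phi'(Ext_x(x), Ext_w(x, w)).
for x in (False, True):
    for w in (False, True):
        if phi(x, w):
            assert phi_prime(Ext_x(x), Ext_w(x, w))

# Demonstrable soundness: phi'(Ext_x(x), w') implies phi(x, Inv(...)).
for x in (False, True):
    for w1 in range(P):
        if phi_prime(Ext_x(x), w1):
            assert phi(x, Inv(Ext_x(x), w1))
```

When \(x = \bot \), the instance extension yields \(x' = 0\), so \(x' w' = 1\) is unsatisfiable and soundness holds vacuously; when \(x = \top \), the only satisfying \(w'\) is 1, which `Inv` maps back to \(\top \).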

Correct ZKP compilers are important for two reasons. First, since sequential composition preserves correctness, one can prove a multi-pass compiler is correct pass-by-pass. Second, a correct ZKP compiler from \(\varPhi \) to \(\varPhi '\) can be used to generalize a ZKP for \(\varPhi '\) to one for \(\varPhi \). We prove both properties in Appendix A.

Theorem 1 (Compiler Composition)

If \(\textsf{Compile}'\) and \(\textsf{Compile}''\) are correct, then the compiler \(\textsf{Compose}(\textsf{Compile}', \textsf{Compile}'')\) (Appendix A) is correct.

Theorem 2 (ZKP Generalization)

(informal) Given a correct ZKP compiler \(\textsf{Compile}\) from \(\varPhi \) to \(\varPhi '\) and a ZKP for \(\varPhi '\), we can construct a ZKP for \(\varPhi \).

5.2 Rule VCs

Recall (Sect. 4) that our language manipulates encodings through five types of encoding rules. We give verification conditions for each type of rule. Intuitively, these capture the correctness of each rule in isolation. Next, we'll show that they imply the correctness of a ZKP compiler that follows our calculus.

Our VCs quantify over valid encodings. That is, they have the form: "for any valid encoding e of term t, ..." We can quantify over an encoding e by making each \(t_i \in \textsf{terms}(e)\) a fresh variable, and quantifying over the \(t_i\). Encoding validity is captured by a predicate \( valid (e,t)\), which is defined to be the validity condition in Table 2. Each VC containing encoding variables \(\textbf{e}\) implicitly represents a conjunction of instances of that VC, one for each possible tuple of kinds of \(\textbf{e}\), which is fixed for each instance. If a VC contains \( valid (e,t)\), the sort of t is constrained to be compatible with \(\textsf{kind}(e)\). For a kind and a sort to be compatible, they must occur in the same row of Table 2. We define the equality predicate \( equal (e,e')\) as \(\bigwedge _i \textsf{terms}(e)[i] \approx \textsf{terms}(e')[i]\).

Table 3. VCs related to encoding uniqueness.

Encoding Uniqueness. First, we require the uniqueness of valid encodings, for any fixed encoding kind. Table 3 shows the VCs that ensure this. Each row is a formula that must be valid, for all compatible encodings and terms. The first two rows ensure that there is a bijection from terms to their valid encodings (in the first row, we consider only instances for which \(\textsf{kind}(e)=\textsf{kind}(e')\)). The function \( fromTerm (t, \textsf{kind}) \rightarrow e\) maps a term and an encoding kind to a valid encoding of that kind, and the function \( toTerm (e)\rightarrow t\) maps a valid encoding to its encoded term. The third and fourth rows ensure that \( fromTerm \) and \( toTerm \) are correctly defined. We will use \( toTerm \) in our proof of calculus soundness (Appendix B) and we will use \( fromTerm \) to optimize VCs for faster verification (Sect. 6.1).

For an example of the \( valid \), \( fromTerm \), and \( toTerm \) functions, consider a Boolean b encoded as an encoding e with \(\textsf{kind}\) \(\texttt{bit}\) and whose \(\textsf{terms}\) consist of a single field element f. Validity is defined as \( valid (e, b) = f \approx ite (b, 1, 0)\), \( toTerm (e)\) is defined as \(f \approx 1\), and \( fromTerm (b, \texttt{bit})\) is \((b, \texttt{bit}, ite (b, 1, 0))\).
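This Boolean example can be checked exhaustively over a tiny field; the sketch below renders \( valid \), \( fromTerm \), and \( toTerm \) for the \(\texttt{bit}\) kind and verifies the uniqueness properties of Table 3 (the representation and field size are our own choices):

```python
P = 5  # tiny illustrative prime field

# A `bit` encoding of a Boolean b is a single field term f,
# valid iff f = ite(b, 1, 0).
def valid(f, b):
    return f == (1 if b else 0)

def to_term(f):
    """Map a valid `bit` encoding back to the Boolean it encodes."""
    return f == 1

def from_term(b):
    """Map a Boolean to its unique valid `bit` encoding."""
    return 1 if b else 0

# Table 3-style uniqueness checks, exhaustive over F_P x Bool:
for b in (False, True):
    # fromTerm produces a valid encoding ...
    assert valid(from_term(b), b)
    for f in range(P):
        if valid(f, b):
            # ... every valid encoding maps back to its term ...
            assert to_term(f) == b
            # ... and the valid encoding of a fixed kind is unique.
            assert f == from_term(b)
```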

VCs for Encoding Rules. Table 4 shows our VCs for the rules of Fig. 5. For each rule application, A and F denote, respectively, the assertions and the variable declarations generated when that rule is applied. We explain some of the VCs in detail.

Table 4. VCs for encoding rules.

First, consider a rule \(r_o\) for operator o applied to inputs \(t_1, \dots , t_k\). The rule takes input encodings \(e_1, \dots , e_k\) and returns an output \(e'\). It is sound if the validity of its inputs and its assertions imply the validity of its output. It is complete if the validity of its inputs implies its assertions and the validity of its output, after substituting fresh variable definitions.

Second, consider a variable rule. Its input is a variable term t, and it returns \(e'\), a putative encoding thereof. Note that \(e'\) does not actually contain t, though the substitutions in F may bind the fresh variables of \(e'\) to functions of t. For the rule to be sound when t is a witness variable \((t \in w)\), the assertions must imply that \(e'\) is valid for some term \(t'\). For the rule to be sound when t is an instance variable \((t \in x)\), the assertions must imply that \(e'\) is valid for t, when the instance variables in \(e'\) are replaced with their definition (\(F_x\) denotes F, restricted to its declarations of instance variables).Footnote 8 For the variable rule to be complete (for an instance or a witness), the assertions and the validity of \(e'\) for t must follow from F.

Third, consider a constant rule. Its input is a constant term t, and it returns an encoding e. Recall that the terms of e are always evaluated, yielding \(e'\) which only contains constant terms. Thus, correctness depends only on the fact that e is always a valid encoding of the input t. This can be captured with a single VC.

5.3 A Correct Field-Blasting Calculus

Given rules that satisfy these verification conditions, we show that the calculus of Sect. 4.3 is a correct ZKP compiler. The proof is in Appendix B.

Theorem 3 (Correctness)

With rules that satisfy the conditions of Sect. 5.2, the calculus of Sect. 4.3 is demonstrably complete and sound (Def. 1).

6 Case Study: A Verifiable Field-Blaster for CirC

We implemented and partially verified a field-blaster for CirC [46]. Our implementation is based on a refactoring of CirC's original field blaster to conform to our encoding rules (Sect. 4.2) and consists of \(\approx \)850 lines of code (LOC).Footnote 9 As described below, we have (partially) verified our encoding rules, but trust our calculus (Sect. 4.3, \(\approx \)150 LOC) and our flattening implementations (Fig. 2, \(\approx \)160 LOC).

While porting rules, we found 4 bugs in CirC's original field-blaster (see Appendix G), including a severe soundness bug. Given a ZKP compiled with CirC, the bug allowed a prover to incorrectly compare bit-vectors. The prover, for example, could claim that the unsigned value of 0010 is greater than or less than that of 0001. A patch to fix all 4 bugs (in the original field blaster) has been upstreamed, and we are in the process of upstreaming our new field-blaster implementation into CirC.

6.1 Verification Evaluation

Our implementation constructs the VCs from Sect. 5.2 and emits them as SMT-LIB (extended with a theory of finite fields [47]). We verify them with cvc5, because it can solve formulas over bit-vectors and prime fields [47]. The verification is partial in that it is bounded in two ways. We set \(b \in \mathbb {N}\) to be the maximum bit-width of any bit-vector and \(a \in \mathbb {N}\) to be the maximum number of arguments to any n-ary operator. In our evaluation, we used \(a=4\) and \(b=4\). These bounds are small, but they were sufficient to find the bugs mentioned above.

Optimizing Completeness VCs. Generally, cvc5 verifies soundness VCs more quickly than completeness VCs. This is surprising at first glance. To see why, consider the soundness (S) and completeness (C) conditions for a conversion rule from e to \(e'\) that generates assertions A and definitions F:

$$ S \triangleq (A \wedge valid (e, t)) \rightarrow valid (e', t) $$
$$ C \triangleq ( valid (e, t) \rightarrow (A \wedge valid (e', t)))[F] $$

In both, t is a variable, e contains variables, and there are variables in \(e'\) and A that are defined by F. In C, though, some variables are replaced by their definitions in F, which makes the number of variables (and thus the search space) seem smaller for C than for S. Yet, cvc5 is slower on C.

The problem is that, while the field operations in A are standard (e.g., \(+\), \(\times \), and \(=\)), the definitions in F use a CirC-IR operator that (once embedded into SMT-LIB) is hard for cvc5 to reason about. That operator, (ff2bv b), takes a prime field element x and returns a bit-vector v. If x's integer representative is less than \(2^b\), then v's unsigned value is equal to x; otherwise, v is zero.

The ff2bv operator is trivial to evaluate but hard to embed. cvc5's SMT-LIB extension for prime fields only supports \(+\), \(\times \), and \(=\), so no operator can directly relate x to v. Instead, we encode the relationship through b Booleans that represent the bits of v. To test whether \(x < 2^b\), we use the polynomial \(f(x) = \prod _{i=0}^{2^b-1}(x - i)\), which is zero only on \([0,2^b-1]\). The bit-splitting essentially forces cvc5 to guess v's value; further, f's high degree slows down the Gröbner basis computations that form the foundation of cvc5's field solver.
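The semantics of ff2bv and the range polynomial can be sketched and sanity-checked in a few lines (the prime below is illustrative; CirC's field is far larger):

```python
P = 97  # a small illustrative prime, far smaller than CirC's field

def ff2bv(x, b):
    """Semantics of (ff2bv b): if x's integer representative is < 2^b,
    return it as a b-bit unsigned value; otherwise return 0."""
    x = x % P
    return x if x < 2**b else 0

def in_range_poly(x, b):
    """f(x) = prod_{i=0}^{2^b - 1} (x - i) mod P: zero exactly on [0, 2^b)."""
    prod = 1
    for i in range(2**b):
        prod = (prod * (x - i)) % P
    return prod

b = 3
for x in range(P):
    # The polynomial test agrees with the range check inside ff2bv.
    assert (in_range_poly(x, b) == 0) == (x < 2**b)
```

The degree of f is \(2^b\), which is what makes this embedding expensive for the solver even though evaluating ff2bv directly is trivial.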

To optimize verification of the completeness VCs, we reason about CirC-IR directly. First, we use the uniqueness of valid encodings and the \( fromTerm \) function. Since the VC assumes \( valid (e, t)\), we know e is equal to \( fromTerm (t, \textsf{kind}(e))\). We use this equality to eliminate e from the completeness VC, leaving:

$$ (A \wedge valid (e', t))[F][e\mapsto fromTerm (t, \textsf{kind}(e))] $$

Since F defines all variables in A and \(e'\), the only variable after substitution is t. So, when t is a Boolean or small bit-vector, an exhaustive search is very effective;Footnote 10 we implemented such a solver in 56 LOC, using CirC's IR as a library.

For soundness VCs, this approach is less effective. The \( fromTerm \) substitution still applies, but if F introduces fresh field variables, they are not eliminated; the final formula then contains field variables, so exhaustive search is infeasible.

Fig. 6. VCs verified by different solvers. 'uniq' denotes the VCs of Table 3; others are from Table 4. 'C' denotes completeness; 'S': soundness.

Fig. 7. The performance of CirC with the verified and unverified field-blaster. Metrics are summed over the 61 functions in the Z# standard library.

Verification Results. We ran our VC verification on machines with Intel Xeon E5-2637 v4 CPUs.Footnote 11 Each attempt is limited to one physical core, 8GB memory, and 30 min. Figure 6 shows the number of VCs verified by cvc5 and our exhaustive solver. As expected, the exhaustive solver is effective on completeness VCs for Boolean and bit-vector rules, but ineffective on soundness VCs for rules that introduce fresh field variables. There are four VCs that neither solver verifies within 30 min: bvadd with (\(b = 4\), \(a = 4\)), and bvmul with (\(b = 3\), \(a = 4\)) and (\(b = 4\), \(a \ge 3\)). Most other VCs verify instantly. In Appendix E, we analyze how VC verification time depends on a and b.

6.2 Performance and Output Quality Evaluation

We compare CirC with our field-blaster ("Verified") against CirC with its original field-blaster ("Unverified")Footnote 12 on three metrics: compiler runtime, memory usage, and the final R1CS constraint count. Our benchmark set is the standard library for CirC's Z# input language (which extends ZoKrates [16, 68] v0.6.2). Our testbed runs Linux with 32GB memory and an AMD Ryzen 2700.

There is no difference in constraints, but the verified field-blaster slightly improves compiler performance: -8% time and -2% memory (Fig. 7). We think that the small improvement is unrelated to the fact that the new field blaster is verified. In Appendix E, we discuss compiler performance further.

7 Discussion

In this work, we present the first automatically verifiable field-blaster. We view the field-blaster as a set of rules; if some (automatically verifiable) conditions hold for each rule, then the field-blaster is correct. We implemented a performant and partially verified field-blaster for CirC, finding 4 bugs along the way.

Our approach has limitations. First, we require the field-blaster to be written as a set of encoding rules. Second, we only verify our rules for bit-vectors of bounded size and operators of bounded arity. Third, we assume that each rule is a pure function: for example, it does not return different results depending on the time. Future work might avoid the last two limitations through bit-width-independent reasoning [42, 43, 67] and a DSL (and compiler) for encoding rules. It would also be interesting to extend our approach to: a ZKP with a non-prime field [7, 13], a compiler IR with partial or non-deterministic semantics, or a compiler with correctness that depends on computational assumptions.