
1 Introduction

Zero-Knowledge Proofs (ZKPs) are powerful tools for building privacy-preserving systems. They allow one entity, the prover \(\mathcal {P}\), to convince another, the verifier \(\mathcal {V}\), that some secret data satisfies a public property, without revealing anything else about the data. ZKPs underlie a large (and growing!) set of critical applications, from billion-dollar private cryptocurrencies, like Zcash [24, 53] and Monero [2], to research into auditable sealed court orders [20], private gun registries [26], privacy-preserving middleboxes [23], and zero-knowledge proofs of exploitability [11]. This breadth of applications is possible because of the generality of ZKPs. In general, \(\mathcal {P}\) knows a secret witness w, whereas \(\mathcal {V}\) knows a property \(\phi \) and a public instance x. \(\mathcal {P}\) must show that \(\phi (x,w)= \top \). Typically, x and w are vectors of variables in a finite field \(\mathbb {F}\), and \(\phi \) can be any system of equations over the variables, using the operations \(+\) and \(\times \). Because \(\phi \) itself is an input to \(\mathcal {P}\) and \(\mathcal {V}\), and because of the expressivity of field equations, a single implementation of \(\mathcal {P}\) and \(\mathcal {V}\) can serve many different purposes.

Humans find it difficult to express themselves directly with field equations, so they use ZKP compilers. A ZKP compiler converts a high-level predicate \(\phi '\) into an equivalent system of field equations \(\phi \). In other words, a ZKP compiler generalizes a ZKP: by compiling \(\phi '\) to \(\phi \) and then using a ZKP for \(\phi \), one obtains a ZKP for \(\phi '\). There are many industrial [3, 5, 6, 14, 21, 45, 55, 66] and academic [4, 18, 28, 29, 46, 48, 50, 54, 63] ZKP compilers.

The correctness of a ZKP compiler is critical for security: a bug in the compiler could admit proofs of false statements. But verification is challenging for three reasons. First, the definition of correctness for a ZKP compiler is non-trivial; we discuss it later in this section. Second, ZKP compilers span multiple domains. The high-level predicate \(\phi '\) is typically expressed in a language with common types such as Booleans and fixed-width integers, while the output \(\phi \) is over a large, prime-order field. Thus, any compiler correctness definition must span these domains. Third, ZKP compilers are evolving and performance-critical; verification must not inhibit future changes or degrade compiler performance.

In this work, we develop tools for automatically verifying the field-blaster of a ZKP compiler. A ZKP compiler's field-blaster is the pass that converts a formula over Booleans, fixed-width integers, and finite-field elements into a system of field equations; as a transformation from bit-like types to field equations, the field-blaster exemplifies the challenge of cross-domain verification.

Our paper makes three contributions. First, we formulate a precise correctness definition for a ZKP compiler. Our definition ensures that a correct compiler preserves the completeness and soundness of the underlying ZK proof system.Footnote 1 More specifically, given a ZK proof system where statements are specified in a low-level language L, and a compiler from a high-level language H to L, if the compiler is correct by our definition, it extends the ZK proof systemā€™s soundness and completeness properties to statements in H. Further, our definition is preserved under sequential composition, so proving the correctness of each compiler pass individually suffices to prove correctness of the compiler itself.

Second, we give an architecture for a verifiable field-blaster. In our architecture, a field-blaster is a set of "encoding rules." We give verification conditions (VCs) for these rules, and we show that if the VCs hold, then the field-blaster is correct. Our approach supports automated verification because (bounded versions of) the VCs can be checked automatically. This reduces both the up-front cost of verification and its maintenance cost.

Third, we do a case study. Using our architecture, we implement a new field-blaster for CirC [46] ("SIR-see"), an infrastructure used by state-of-the-art ZKP compilers. We verify bounded versions of our field-blaster's VCs using SMT-based finite-field reasoning [47], and show that our field-blaster does not compromise CirC's performance. We also report on four bugs that our verification effort uncovered, including a soundness bug that allowed the prover to "lie" about the results of certain bit-vector comparisons. We note that the utility of our techniques is not limited to CirC: most ZKP compilers include something like the field-blaster we describe here.

In the next sections, we discuss related work (Sect. 1.1), give background on ZKPs and CirC (Sect. 2), present a field-blasting example (Sect. 3), describe our architecture (Sect. 4), give our verification conditions (Sect. 5), and present the case study (Sect. 6).

1.1 Related Work

Verified Compilers. There is a rich body of work on verifying the correctness of traditional compilers. We focus on compilation for ZKPs; this requires different correctness definitions that relate bit-like types to prime field elements. In the next paragraphs, we discuss more fine-grained differences.

Compiler verification efforts fall into two broad categories: automated, which leverages automated reasoning solvers, and foundational, which builds manual proofs in proof assistants (e.g., Coq [8] or Isabelle [44]). CompCert [36], for example, is a Coq-verified C compiler with verified optimization passes (e.g., [40]). Closest to our work is backend verification, which proves correct the translation from an intermediate representation to machine code. CompCert's lowering [37] is verified, as is CakeML's [31] lowering to different ISAs [19, 57]. While such foundational verification offers strong guarantees, it imposes a heavy proof burden; creating CompCert, for example, took an expert team eight years [56], and any updates to compiler code require updates to proofs.

Automated verification, in contrast, does not require writing and maintaining manual proofs.Footnote 2 Cobalt [34], Rhodium [35], and PEC [32] are domain-specific languages (DSLs) for writing automatically verified compiler optimizations and analyses. Most closely related to our work is Alive [39], a DSL for expressing verified peephole optimizations: local rewrites that transform snippets of LLVM IR [1] into better-performing ones. Alive addresses transformations over fixed types (while we address lowering to finite-field equations) and formulates correctness in the presence of undefined behavior (while we formulate correctness for ZKPs). Beyond Alive, Alive2 [38] provides translation validation [41, 51] for LLVM [33], and VeRA [10] verifies range analysis in the Firefox JavaScript engine.

There is also work on verified compilation for domains more closely related to ZKPs. The Porcupine [15] compiler automatically synthesizes representations for fully-homomorphic encryption [62], and Gillar [58] proves that optimization passes in the Qiskit [60] quantum compiler are semantics-preserving. While these works compile from high-level languages to circuit representations, the correctness definitions for their domains do not apply to ZKP compilers.

Verified Compilation to Cryptographic Proofs. Prior works on verified compilation for ZKPs (or similar) take the foundational approach (with attendant proof-maintenance burdens), and they do not formulate a satisfactory definition of compiler correctness. PinocchioQ [18] builds on CompCert [36]. The authors formulate a correctness definition that preserves the existential soundness of a ZKP but does not consider completeness, knowledge soundness, or zero-knowledge (see Sect. 2.2). Leo [14] is a ZKP compiler that produces (partial) ACL2 [27] proofs of correct compilation; work to emit proofs from its field-blaster is ongoing.

Recent work defines security for reductions of knowledge [30]. These let \(\mathcal {P}\) convince \(\mathcal {V}\) that it knows a witness for an instance of relation \(\mathcal {R}_1\) by proving it knows a witness for an instance of an easier-to-prove relation \(\mathcal {R}_2\). Unlike ZKP compilers, \(\mathcal {P}\) and \(\mathcal {V}\) interact to derive \(\mathcal {R}_2\) using \(\mathcal {V}\)'s randomness (e.g., proving that two polynomials are nonzero w.h.p. by proving that a random linear combination of them is), whereas ZKP compilers run ahead of time and non-interactively.

Further afield, Ecne [65] is a tool that attempts to verify that the input to a ZKP encodes a deterministic computation; it does not consider any notion of a specification of the intended behavior. A different work [25] attempts to automatically verify that a "widget" given to a ZKP meets some specification, considering widgets that could be constructed manually or with a compiler. Our focus is on verifying a compiler pass.

2 Background

2.1 Logic

We assume the usual terminology for many-sorted first-order logic with equality ([17] gives a complete presentation). We assume every signature includes the sort \(\textsf{Bool}\), constants \(\textsf{True}\) and \(\textsf{False}\) of sort \(\textsf{Bool}\), and a symbol family \(\approx _\sigma \) (abbreviated \(\approx \)) with sort \(\sigma \times \sigma \rightarrow \textsf{Bool}\) for each sort \(\sigma \). We also assume a family of conditionals: symbols \( ite _\sigma \) ("if-then-else", abbreviated \( ite \)) of sort \(\textsf{Bool}\times \sigma \times \sigma \rightarrow \sigma \).

A theory is a pair \(\mathcal {T} = (\varSigma , \textbf{I})\), where \(\varSigma \) is a signature and \(\textbf{I}\) is a class of \(\varSigma \)-interpretations. A \(\varSigma \)-formula is a term of sort \(\textsf{Bool}\). A \(\varSigma \)-formula \(\phi \) is satisfiable (resp., unsatisfiable) in \(\mathcal {T}\) if it is satisfied by some (resp., no) interpretation in \(\textbf{I}\). We focus on two theories. The first is \(\mathcal {T}_{ BV }\), the SMT-LIB theory of bit-vectors [52, 61], with signature \(\varSigma _{ BV }\) including a bit-vector sort \(\textsf{BV}_{[n]}\) for each \(n>0\), with bit-vector constants \(c_{[n]}\) of sort \(\textsf{BV}_{[n]}\) for each \(c\in [0,2^n-1]\), and operators including & and \(|\) (bitwise AND, OR) and \(+_{[n]}\) (addition modulo \(2^n\)). We write t[i] for the \(i^{th}\) bit of bit-vector t, where t[0] is the least-significant bit. The other theory is \(\mathcal {T}_{ F _{p}}\), the theory corresponding to the finite field of order p, for some prime p [47]. This theory has signature \(\varSigma _{ F _{p}}\) containing the sort \(\textsf{FF}_{p}\), constant symbols \(0, \ldots , p-1\), and operators \(+\) and \(\times \).

In this paper, we assume all interpretations interpret sorts and symbols in the same way. We write \(\textsf{dom}(v)\) for the set interpreting the sort of a variable v. We assume that \(\textsf{Bool}\), \(\textsf{True}\), and \(\textsf{False}\) are interpreted as \(\{\top , \bot \}\), \(\top \), and \(\bot \), respectively; \(\varSigma _{ BV }\)-interpretations follow the SMT-LIB standard; and \(\varSigma _{ F _{p}}\)-interpretations interpret symbols as the corresponding elements and operations in \(\mathbb {F}_p\), a finite field of order p (for concreteness, this could be the integers modulo p). Note that only the values of variables can vary between two interpretations.

For a signature \(\varSigma \), let t be a \(\varSigma \)-term of sort \(\sigma \), with free variables \(x_1,\ldots ,x_n\), respectively of sort \(\sigma _1,\ldots ,\sigma _n\). We define the function \(\hat{t}: \textsf{dom}(x_1) \times \cdots \times \textsf{dom}(x_n) \rightarrow \textsf{dom}(t)\) as follows. Let \(\mathbf {\textsf{x}}\in \textsf{dom}(x_1) \times \cdots \times \textsf{dom}(x_n)\), and let \(\mathcal {M}\) be an interpretation that interprets each \(x_i\) as \(\textsf{x}_i\). Then \(\hat{t}(\mathbf {\textsf{x}}) = t^{\mathcal {M}}\) (i.e., the interpretation of t in \(\mathcal {M}\)). For example, the term \(t = a \wedge \lnot a\) defines \(\hat{t}:\textsf{Bool}\rightarrow \textsf{Bool}= \lambda \,x.\,\bot \). In the following, we use the standard font (e.g., x) for logical variables and a sans serif font (e.g., \(\textsf{x}\)) for meta-variables standing for values (i.e., elements of \(\sigma ^\mathcal {M}\) for some \(\sigma \) and \(\mathcal {M}\)). Also, abusing notation, we conflate single variables (of both kinds) with vectors of variables when the distinction does not matter. Note that a formula \(\phi \) is satisfiable if there exist values \(\textsf{x}\) such that \(\hat{\phi }(\textsf{x}) = \top \); it is valid if for all values \(\textsf{x}\), \(\hat{\phi }(\textsf{x}) = \top \).

For terms s,Ā t and variable x, \(t[x\mapsto s]\) denotes t with all occurrences of x replaced with s. For a sequence of variable-term pairs, \(S = (x_1 \mapsto s_1, \dots , x_n \mapsto s_n)\), t[S] is defined to be \(t[x_1\mapsto s_1]\cdots [x_n\mapsto s_n]\).

Fig. 1. The information flow for a zero-knowledge proof.

2.2 Zero Knowledge Proofs

As mentioned above, zero-knowledge proofs (ZKPs) make it possible to prove that some secret data satisfies a public property, without revealing the data itself. See [59] for a full presentation; we give a brief overview here, and then describe how general-purpose ZKPs are used.

Overview and Definitions. In a cryptographic proof system, there are two parties: a verifier \(\mathcal {V}\) and a prover \(\mathcal {P}\). \(\mathcal {V}\) knows a public instance \(\textsf{x}\) and asks \(\mathcal {P}\) to show that it has knowledge of a secret witness \(\textsf{w}\) satisfying a public predicate \(\phi (x, w)\) from a predicate class \(\varPhi \) (a set of formulas), i.e., that \(\hat{\phi }(\textsf{x},\textsf{w})=\top \). Figure 1 illustrates the workflow. First, a trusted party runs an efficient (i.e., polytime in an implicit security parameter \(\lambda \)) algorithm \(\textsf{Setup}(\phi )\), which produces a proving key \(\textsf{pk}\) and a verifying key \(\textsf{vk}\). Then, \(\mathcal {P}\) runs an efficient algorithm \(\textsf{Prove}(\textsf{pk}, \textsf{x}, \textsf{w}) \rightarrow \pi \) and sends the resulting proof \(\pi \) to \(\mathcal {V}\). Finally, \(\mathcal {V}\) runs an efficient verification algorithm \(\textsf{Verify}(\textsf{vk}, \textsf{x}, \pi ) \rightarrow \{\top ,\bot \}\) that accepts or rejects the proof. A zero-knowledge argument of knowledge for class \(\varPhi \) is a tuple \(\Pi = (\textsf{Setup}, \textsf{Prove}, \textsf{Verify})\) with three informal properties for every \(\phi \in \varPhi \) and every \(\textsf{x}\in \textsf{dom}(x),\textsf{w}\in \textsf{dom}(w)\):

  • perfect completeness: if \(\hat{\phi }(\textsf{x},\textsf{w})\) holds, then \(\textsf{Verify}(\textsf{vk}, \textsf{x}, \pi )\) holds;

  • computational knowledge soundness [9]: an efficient adversary that does not know \(\textsf{w}\) cannot produce a \(\pi \) such that \(\textsf{Verify}(\textsf{vk}, \textsf{x}, \pi )\) holds; and

  • zero-knowledge [22]: \(\pi \) reveals nothing about \(\textsf{w}\), other than its existence.

Technically, the system is an "argument" rather than a "proof" because soundness only holds against efficient adversaries. Also note that knowledge soundness requires that an entity must "know" a valid witness to produce a proof; it is not enough for a valid witness to simply exist. We give more precise definitions in Appendix A.

Representations for ZKPs. As mentioned above, ZKP applications are manifold (Sect. 1), from cryptocurrencies to private registries. This breadth of applications is possible because ZKPs support a broad class of predicates. Most commonly, these predicates are expressed as rank-1 constraint systems (R1CSs). Recall that \(\mathbb {F}_p\) is a prime-order finite field (also called a prime field); we drop the subscript p when it is not important. In an R1CS, \(\textsf{x}\) and \(\textsf{w}\) are vectors of elements in \(\mathbb {F}\); let \(\textsf{z}\in \mathbb {F}^m\) be their concatenation. The function \(\hat{\phi }\) can be defined by three matrices \(\textsf{A}, \textsf{B}, \textsf{C}\in \mathbb {F}^{n \times m}\); \(\hat{\phi }(\textsf{x},\textsf{w})\) holds when \(\textsf{A}\textsf{z}\circ \textsf{B}\textsf{z}= \textsf{C}\textsf{z}\), where \(\circ \) is the element-wise product. Thus, \(\phi \) can be viewed as n conjoined constraints, where each constraint i has the form \((\sum _j a_{ij}z_j) \times (\sum _j b_{ij}z_j) \approx (\sum _j c_{ij}z_j)\) (where the \(a_{ij}\), \(b_{ij}\), and \(c_{ij}\) are constant symbols from \(\varSigma _{ F _{p}}\), and the \(z_j\) are a vector of variables of sort \(\textsf{FF}_{p}\)). That is, each constraint enforces a single non-linear multiplication.
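The matrix form above is easy to check directly. The following is a minimal sketch (not CirC's code; the prime 97 and the helper names are illustrative) that verifies \(\textsf{A}\textsf{z}\circ \textsf{B}\textsf{z}= \textsf{C}\textsf{z}\) for a single-constraint R1CS enforcing one multiplication:

```python
# Sketch: checking an R1CS instance over a toy field F_p.
p = 97  # illustrative prime; real ZKP fields have p on the order of 2^255

def mat_vec(M, z):
    """Multiply matrix M by vector z over F_p."""
    return [sum(a * b for a, b in zip(row, z)) % p for row in M]

def r1cs_holds(A, B, C, z):
    """Check Az o Bz = Cz, where o is the element-wise product."""
    Az, Bz, Cz = mat_vec(A, z), mat_vec(B, z), mat_vec(C, z)
    return all((a * b) % p == c for a, b, c in zip(Az, Bz, Cz))

# One constraint enforcing z1 * z2 = z3, over z = (1, z1, z2, z3):
A = [[0, 1, 0, 0]]
B = [[0, 0, 1, 0]]
C = [[0, 0, 0, 1]]
assert r1cs_holds(A, B, C, [1, 3, 5, 15])      # 3 * 5 = 15
assert not r1cs_holds(A, B, C, [1, 3, 5, 14])  # 3 * 5 != 14
```

The constant-1 entry of z is the standard trick that lets the linear combinations include constants.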

2.3 Compilation Targeting Zero Knowledge Proofs

To write a ZKP about a high-level predicate \(\phi \), that predicate is first compiled to an R1CS. A ZKP compiler from class \(\varPhi \) (a set of \(\varSigma \)-formulas) to class \(\varPhi '\) (a set of \(\varSigma '\)-formulas) is an efficient algorithm \(\textsf{Compile}(\phi \in \varPhi ) \rightarrow (\phi '\in \varPhi ', \textsf{Ext}_x, \textsf{Ext}_w)\). Given a predicate \(\phi (x,w)\), it returns a predicate \(\phi '(x', w')\) as well as two efficient and deterministic algorithms, the instance and witness extenders: \(\textsf{Ext}_x : \textsf{dom}(x) \rightarrow \textsf{dom}(x')\) and \(\textsf{Ext}_w: \textsf{dom}(x)\times \textsf{dom}(w) \rightarrow \textsf{dom}(w')\).Footnote 3 For example, CirC [46] can compile a Boolean-returning C function (in a subset of C) to an R1CS.

At a high level, \(\phi \) and \(\phi '\) should be "equisatisfiable", with \(\textsf{Ext}_x\) and \(\textsf{Ext}_w\) mapping satisfying values for \(\phi \) to satisfying values for \(\phi '\). That is, for all \(\textsf{x}\in \textsf{dom}(x)\) and \(\textsf{w}\in \textsf{dom}(w)\) such that \(\hat{\phi }(\textsf{x}, \textsf{w}) = \top \), if \(\textsf{x}' = \textsf{Ext}_x(\textsf{x})\) and \(\textsf{w}' = \textsf{Ext}_w(\textsf{x}, \textsf{w})\), then \(\hat{\phi '}(\textsf{x}', \textsf{w}') = \top \). Furthermore, for any \(\textsf{x}\), it should be impossible to (efficiently) find a \(\textsf{w}'\) satisfying \(\hat{\phi '}(\textsf{Ext}_x(\textsf{x}), \textsf{w}') = \top \) without knowing a \(\textsf{w}\) satisfying \(\hat{\phi }(\textsf{x}, \textsf{w}) = \top \). In Sect. 5.1, we precisely define correctness for a predicate compiler.
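To make the definition concrete, here is a toy, hypothetical instance (our own example, not from the case study): compiling the Boolean predicate \(\phi(x,w) = x \wedge w\) to field equations via the 0/1 encoding, with extenders mapping Booleans to field values. The names phi_prime, Ext_x, and Ext_w are illustrative.

```python
# Toy sketch of Compile's outputs for phi(x, w) = x AND w (hypothetical).
# phi'(x', w') := x'*w' = 1  and  x'(x'-1) = 0  and  w'(w'-1) = 0, over F_p.
p = 97  # illustrative prime

def phi(x, w):
    return x and w

def phi_prime(xp, wp):
    return ((xp * wp) % p == 1
            and (xp * (xp - 1)) % p == 0
            and (wp * (wp - 1)) % p == 0)

Ext_x = lambda x: 1 if x else 0      # instance extender
Ext_w = lambda x, w: 1 if w else 0   # witness extender

# Satisfying values for phi map to satisfying values for phi'.
for x in (False, True):
    for w in (False, True):
        if phi(x, w):
            assert phi_prime(Ext_x(x), Ext_w(x, w))
```

The converse direction (knowing a satisfying \(\textsf{w}'\) implies knowing a satisfying \(\textsf{w}\)) is the soundness half of the definition in Sect. 5.1.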

One can build a ZKP for class \(\varPhi \) from a compiler from \(\varPhi \) to \(\varPhi '\) and a ZKP for \(\varPhi '\). Essentially, one runs the compiler to get a predicate \(\phi '\in \varPhi '\), as well as \(\textsf{Ext}_x\) and \(\textsf{Ext}_w\). Then, one writes a ZKP to show that \(\hat{\phi '}(\textsf{Ext}_x(\textsf{x}), \textsf{Ext}_w(\textsf{x}, \textsf{w})) = \top \). In Appendix A, we give this construction in full and prove it is secure.

Optimization. The primary challenge when using ZKPs is cost: typically, \(\textsf{Prove}\) is at least three orders of magnitude slower than checking \(\phi \) directly [64]. Since \(\textsf{Prove}\)'s cost scales with n (the constraint count), it is critical for the compiler to minimize n. The space of optimizations is large and complex, for two reasons. First, the compiler can introduce fresh variables. Second, only equisatisfiability, not logical equivalence, is needed. Compilers in this space exploit equisatisfiability heavily to efficiently represent high-level constructs (e.g., Booleans, bit-vectors, arrays, ...) as an R1CS.

As a (simple!) example, consider the Boolean computation \(a \approx c_1 \vee \dots \vee c_k\). Assume that \(c'_1, \dots , c'_k\) are variables of sort \(\textsf{FF}_{}\) and that we add constraints \(c'_i(1-c'_i) \approx 0\) to ensure that each \(c'_i\) is 0 or 1. Assume further that \((c'_i \approx 1)\) encodes \(c_i\) for each i. How can one additionally ensure that \(a'\) (also of sort \(\textsf{FF}_{}\)) is forced to be 0 or 1 and that \((a' \approx 1)\) is a correct encoding of a? Given that there are \(k-1\) ORs, natural approaches use \(\Theta (k)\) constraints. One clever approach is to introduce a variable \(x'\) and enforce the constraints \(x'(\sum _i c'_i)\approx a'\) and \((1-a')(\sum _i c'_i)\approx 0\). In any interpretation where some \(c_i\) is true, the interpretation of \(a'\) must be 1 to satisfy the second constraint; setting \(x'\) to the sum's inverse satisfies the first. If all \(c_i\) are false, the first constraint ensures \(a'\) is 0. This technique assumes the sum does not overflow; since ZKP fields are typically large (e.g., with p on the order of \(2^{255}\)), this is usually a safe assumption.
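A brute-force check of this gadget over a toy field confirms the claim that \(a'\) is forced to the OR of the inputs (and, in particular, to a 0/1 value). This is a sketch; k = 3 and p = 97 are illustrative choices, not values from the text.

```python
# Exhaustively check the OR gadget:
#   x'*(sum c'_i) = a'   and   (1 - a')*(sum c'_i) = 0   over F_97, k = 3.
from itertools import product

p = 97

def gadget_satisfiable(cs, a):
    """Is there an x' in F_p satisfying both constraints for bits cs and output a?"""
    s = sum(cs) % p
    return any((x * s) % p == a and ((1 - a) * s) % p == 0 for x in range(p))

for cs in product([0, 1], repeat=3):
    want = 1 if any(cs) else 0
    for a in range(p):  # try every field value for a', not just 0 and 1
        assert gadget_satisfiable(cs, a) == (a == want)
```

Note that the check ranges a' over the whole field: no choice of \(x'\) can make the constraints hold for any value of \(a'\) other than the correct 0/1 encoding of the OR.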

Fig. 2. The architecture of CirC.

CirC. CirC [46] is an infrastructure for building compilers from high-level languages (e.g., a C subset) to R1CSs. It has been used in research projects [4, 12] and in industrial R&D. Figure 2 shows the structure of an R1CS compiler built with CirC. First, the front-end of the compiler converts the source program into CirC-IR, a term IR based on SMT-LIB that includes Booleans, bit-vectors, fixed-size arrays, tuples, and prime fields.Footnote 4 Second, the compiler optimizes and simplifies the IR so that the only remaining sorts are Booleans, bit-vectors, and the target prime field. Third, the compiler lowers the simplified IR to an R1CS predicate over the target field. For ZKPs built with CirC, the completeness, soundness, and zero-knowledge of the end-to-end system depend on the correctness of CirC itself.

3 Overview andĀ Example

To start, we view CirC's lowering pass as two passes (Fig. 2). The first pass, "(finite-)field-blasting," converts a many-sorted IR (representable as a (\(\varSigma _{ BV }\cup \varSigma _{ F _{}}\))-formula) to a conjunction of field equations (\(\varSigma _{ F _{}}\)-equations). The second pass, "flattening," converts this conjunction of field equations to an R1CS.

Our focus is on verifying the first pass. We begin with a worked example of how to field-blast a small snippet of CirC-IR (Sect. 3.1). This example illustrates four key ideas (Sect. 3.2) that inspire our field-blaster's architecture.

3.1 An Example ofĀ Field-Blasting

We start with an example CirC-IR predicate expressed as a (\(\varSigma _{ BV }\cup \varSigma _{ F _{}}\))-formula:

$$ \begin{aligned} \phi \triangleq (x_0 \oplus w_0) \wedge (w_1 +_{[4]} x_1 \approx w_1) \wedge (x_2~ \& ~w_1 \approx x_2) \wedge (x_3 \approx w_2 \times w_2) \end{aligned}$$
(1)

The predicate includes the XOR of two Booleans ("\(\oplus \)"), a bit-vector sum, a bit-vector AND, and a field product. \(x_0\) and \(w_0\) are of sort \(\textsf{Bool}\); \(x_1\), \(x_2\), and \(w_1\) are of sort \(\textsf{BV}_{[4]}\); and \(x_3\) and \(w_2\) are of sort \(\textsf{FF}_{p}\). We assume that \(p \gg 2^4\). Table 1 summarizes the new variables and assertions we create during field-blasting; we describe the origin of each assertion and new variable in the next paragraphs.

Lowering Clause One (Booleans). We begin with the Boolean term \((x_0 \oplus w_0)\). We will use 1 and 0 to represent \(\top \) and \(\bot \). We introduce variables \(x'_0\) and \(w'_0\) of sort \(\textsf{FF}_{p}\) to represent \(x_0\) and \(w_0\) respectively. To ensure that \(w'_0\) is 0 or 1, we assert: \(w'_0(w'_0-1) \approx 0\).Footnote 5 \(x_0 \oplus w_0\) is then represented by the expression \(1 - x'_0 - w'_0 + 2x'_0w'_0\). Setting this equal to 1 enforces that \(x_0 \oplus w_0\) must be true. These new assertions and fresh variables are reflected in the first three rows of the table.

Table 1. New variables and assertions when compiling the example \(\phi \).

Lowering Clauses Two and Three (Bit-vectors). Before describing how to bit-blast the second and third clauses in \(\phi \), we discuss bit-vector representations in general. A bit-vector t can be viewed as a sequence of b bits or as a non-negative integer less than \(2^b\). These two views suggest two natural representations in a prime-order field: first, as one field element \(t'_u\) whose unsigned value agrees with t (assuming the field's size is at least \(2^b\)); second, as b elements \(t'_0, \dots , t'_{b-1}\) that encode the bits of t as 0 or 1 (in our encoding, \(t'_0\) is the least-significant bit and \(t'_{b-1}\) is the most-significant bit). The first representation is simple, but with it, some field values (e.g., \(2^b\)) do not correspond to any possible bit-vector. With the second approach, by including the equations \(t'_i(t'_i-1)\approx 0\) in our system, we ensure that any satisfying assignment corresponds to a valid bit-vector; however, the extra b equations increase the size of our compiler's output.

We represent \(\phi \)ā€™s \(w_1\) bit-wise: as \(w'_{1,0}, \dots , w'_{1,3}\), and we represent the instance variable \(x_1\) as \(x'_{1,u}\).Footnote 6 For the constraint \(w_1 +_{[4]} x_1 \approx w_1\), we compute the sum in the field and bit-decompose the result to handle overflow. First, we introduce new variable \(s'\) and set it equal to \(x'_{1,u} + \sum _{i=0}^3 2^iw'_{1,i}\). Then, we bit-decompose \(s'\), requiring \(s' \approx \sum _{i=0}^4 2^is'_{i}\), and \(s'_i(s'_i-1) \approx 0\) for \(i\in [0,4]\). Finally, we assert \(s'_i\approx w'_{1,i}\) for \(i\in [0,3]\). This forces the lowest 4 bits of the sum to be equal to \(w_1\).
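This lowering can be checked exhaustively for 4-bit inputs. The sketch below (illustrative; the helper name lower_bv_add is ours) computes the sum in the field, bit-decomposes it into 5 bits, and keeps the low 4 bits, matching bit-vector addition modulo \(2^4\):

```python
# Sketch of the clause-two lowering: add in the field, bit-decompose the
# 5-bit sum, and keep the low 4 bits to implement addition modulo 2^4.
p = 97  # illustrative prime with p >> 2^4, as the text assumes

def lower_bv_add(x_u, w_bits):
    # The sum is at most 15 + 15 = 30 < 2^5 < p, so no field wraparound.
    s = (x_u + sum(2**i * b for i, b in enumerate(w_bits))) % p
    s_bits = [(s >> i) & 1 for i in range(5)]
    assert s == sum(2**i * b for i, b in enumerate(s_bits)) % p  # bit-sum eq.
    return s_bits[:4]  # the low 4 bits are the bit-vector result

for x in range(16):
    for w in range(16):
        w_bits = [(w >> i) & 1 for i in range(4)]
        got = sum(2**i * b for i, b in enumerate(lower_bv_add(x, w_bits)))
        assert got == (x + w) % 16
```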

The constraint \( x_2~ \& ~w_1\approx x_2\) is more challenging. Since \(x_2\) is an instance variable, we initially encode it as \(x'_{2,u}\). Then, we consider the bit-wise AND. There is no obvious way to encode a bit-wise operation, other than bit-by-bit. So, we convert \(x'_{2,u}\) to a bit-wise representation: We introduce witness variables \(x'_{2,0}, \dots , x'_{2, 3}\) and equations \(x'_{2,i}(x'_{2,i}-1)\approx 0\) as well as equation \(x'_{2,u} \approx \sum _{i=0}^3 2^i x'_{2,i}\). Then, for each i we require \(x'_{2,i} w'_{1,i} \approx x'_{2,i}\).
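The bit-split plus per-bit constraints can also be checked exhaustively. In the sketch below (illustrative; and_clause_holds is our name), the per-bit equations \(x'_{2,i} w'_{1,i} \approx x'_{2,i}\) hold exactly when \(x_2~\&~w_1 = x_2\):

```python
# Sketch of the clause-three lowering: bit-split x'_2,u, then assert
# x_i * w_i = x_i for each bit, which holds iff x & w == x.
p = 97  # illustrative prime

def and_clause_holds(x, w):
    x_bits = [(x >> i) & 1 for i in range(4)]  # the introduced witness bits
    w_bits = [(w >> i) & 1 for i in range(4)]
    # The bit-sum equation relating x'_2,u to its bits:
    assert x == sum(2**i * b for i, b in enumerate(x_bits)) % p
    return all((xi * wi) % p == xi for xi, wi in zip(x_bits, w_bits))

for x in range(16):
    for w in range(16):
        assert and_clause_holds(x, w) == ((x & w) == x)
```

Per bit, \(x_i w_i = x_i\) says that whenever \(x_i = 1\), \(w_i\) must also be 1, which is exactly the condition for the AND to leave \(x_2\) unchanged.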

Lowering the Final Clause (Field Elements). Finally, we consider the field equation \(x_3 \approx w_2 \times w_2\). Our target is field equations as well, so lowering this clause is straightforward: we simply introduce primed variables and copy the equation.

3.2 Key Ideas

This example highlights four ideas that guide the design of our field-blaster:

  1. fresh variables and assertions: Field-blasting uses two primitive operations: creating new variables in \(\phi '\) (e.g., \(w'_0\) to represent \(w_0\)) and adding new assertions to \(\phi '\) (e.g., \(w'_0(w'_0-1)\approx 0\)).

  2. encodings: For a term t in \(\phi \), we construct a field term (or collection of field terms) in \(\phi '\) that represents the value of t. For example, the Boolean \(w_0\) is represented as the field element \(w'_0\) that is 0 or 1.

  3. operator rules: If t is an operator applied to some arguments, we can encode t given encodings of the arguments. For example, if t is \(x_0 \oplus w_0\), and \(x_0\) is encoded as \(x_0'\) and \(w_0\) as \(w_0'\), then t can be encoded as \(1-x'_0-w'_0+2x'_0w'_0\).

  4. conversions: Some sorts can be represented by encodings of different kinds. If a term has multiple possible encodings, the compiler may need to convert between them to apply some operator rule. For example, we converted \(x_2\) from an unsigned encoding to a bit-wise encoding before handling an AND.

4 Architecture

In this section, we present our field-blaster architecture. To compile a predicate \(\phi \) to a system of field equations \(\phi '\), our architecture processes each term t in \(\phi \) using a post-order traversal. Informally, it represents each t as an "encoding" in \(\phi '\): a term (or collection of terms) over variables in \(\phi '\). Each encoding is produced by a small algorithm called an "encoding rule".

Below, we define the type of encodings \(\textsf{Enc}\) (Sect. 4.1), the five different types of encoding rules (Sect. 4.2), and a calculus that iteratively applies these rules to compile all of \(\phi \) (Sect. 4.3).

4.1 Encodings

Table 2. Encodings for each term sort. Only bit-vectors have two encoding kinds.

Table 2 presents our tagged union type \(\textsf{Enc}\) of possible term encodings. Each variant comprises the term being encoded, its tag (the encoding kind), and a sequence of field terms. The encoding kinds are \(\texttt{bit}\) (a Boolean as 0/1), \(\texttt{uint}\) (a bit-vector as an unsigned integer), \(\texttt{bits}\) (a bit-vector as a sequence of bits), and \(\texttt{field}\) (a field term trivially represented as a field term). Each encoding has an intended semantics: a condition under which the encoding is considered valid. For instance, a \(\texttt{bit}\) encoding of Boolean t is valid if the field term f is equal to \( ite (t, 1, 0)\).
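In code, such a tagged union might look like the following hypothetical sketch (the names Kind and Enc are ours, for illustration; this is not CirC's actual implementation, which is in Rust):

```python
# Hypothetical sketch of the tagged union Enc: the encoded term, a kind
# tag, and the sequence of field terms that represent it.
from dataclasses import dataclass
from enum import Enum, auto

class Kind(Enum):
    BIT = auto()    # Boolean as a 0/1 field element
    UINT = auto()   # bit-vector as one unsigned field element
    BITS = auto()   # bit-vector as a sequence of 0/1 field elements
    FIELD = auto()  # field term represented as itself

@dataclass
class Enc:
    term: str          # (the name of) the term being encoded
    kind: Kind
    field_terms: list  # one term for BIT/UINT/FIELD; b terms for BITS

# A 4-bit bit-vector witness encoded bit-wise:
e = Enc("w1", Kind.BITS, ["w1_0", "w1_1", "w1_2", "w1_3"])
assert e.kind is Kind.BITS and len(e.field_terms) == 4
```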

4.2 Encoding Rules

An encoding rule is an algorithm that takes and/or returns encodings, in order to represent some part of the input predicate as field terms and equations.

Primitive Operations. A rule can perform two primitive operations: creating new variables and emitting assertions. In our pseudocode, the primitive function \(\textsf{fresh}(\textsf{name}, t, \textsf{isInst}) \rightarrow x'\) creates a fresh variable. Argument \(\textsf{isInst}\) is a Boolean indicating whether \(x'\) is an instance variable (as opposed to a witness). Argument t is a field term (over variables from \(\phi \) and previously defined primed variables) that expresses how to compute a value for \(x'\). For example, to create a field variable \(w'\) that represents the Boolean witness variable w, a rule can call \(\textsf{fresh}(w', ite (w, 1, 0), \bot )\). The compiler uses t to help create the \(\textsf{Ext}_x\) and \(\textsf{Ext}_w\) algorithms. A rule asserts a formula \(t'\) (over primed variables) by calling \(\textsf{assert}(t')\).

Fig. 3. Pseudocode for some bit-vector rules: \(\textsf{variable}\) uses a \(\texttt{uint}\) encoding for instances and bit-splits witnesses to ensure they're well-formed, \(\textsf{const}\) bit-splits the constant it's given, \(\textsf{assertEq}\) asserts unsigned or bit-wise equality, and \(\textsf{convert}\) either does a bit-sum or a bit-split.

Rule Types. There are five types of rules: (1) Variable rules \(\textsf{variable}(t, \textsf{isInst}) \rightarrow e\) take a variable t and its instance/witness status and return an encoding of that variable made up of fresh variables. (2) Constant rules \(\textsf{const}(t) \rightarrow e\) take a constant term t and produce an encoding of t comprising terms that depend only on t. Since t is a constant, the terms in e can be evaluated to field constants (see the calculus in Sect. 4.3).Footnote 7 The \(\textsf{const}\) rule cannot call \(\textsf{fresh}\) or \(\textsf{assert}\). (3) Equality rules \(\textsf{assertEq}(e, e')\) take two encodings of the same kind and emit assertions that equate the underlying terms. (4) Conversion rules \(\textsf{convert}(e, \textsf{kind}')\rightarrow e'\) take an encoding and convert it to an encoding of a different kind. Conversions are only non-trivial for bit-vectors, which have two encoding kinds: \(\texttt{uint}\) and \(\texttt{bits}\). (5) Operator rules apply to terms t of the form \(o(t_1, \dots , t_n)\). Each operator rule takes t, o, and encodings of the child terms \(t_i\) and returns an encoding of t. Some operator rules require specific kinds of encodings; before using such an operator rule, our calculus (Sect. 4.3) calls the convert rule to ensure the input encodings are of the correct kind. Figure 3 gives pseudocode for the first four rule types, as applied to bit-vectors. Figure 4 gives pseudocode for two bit-vector operator encoding rules. A field-blaster uses many operator rules: in our case study (Sect. 6) there are 46.
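As an illustration of how the primitives and rule types fit together, here is a hypothetical Python sketch of a bit-vector \(\textsf{variable}\) rule in the spirit of Fig. 3 (all names and details are ours, not CirC's): instances get a single \(\texttt{uint}\) encoding, while witnesses are bit-split into fresh variables constrained to 0 or 1.

```python
# Hypothetical sketch of a bit-vector variable rule (not CirC's actual code).
assertions = []   # formulas emitted via assert_
fresh_defs = []   # (name, defining term, is_instance) triples from fresh()

def fresh(name, defining_term, is_inst):
    """Create a fresh variable; defining_term says how to compute its value."""
    fresh_defs.append((name, defining_term, is_inst))
    return name

def assert_(formula):
    """Emit a field equation into the compiled system."""
    assertions.append(formula)

def bv_variable(t, b, is_inst):
    """Encode a bit-vector variable t of width b."""
    if is_inst:
        # Instances are supplied by the verifier, so a uint encoding suffices.
        u = fresh(f"{t}_u", f"uint({t})", True)
        return ("uint", [u])
    # Witnesses are bit-split; constraining each bit to 0/1 ensures every
    # satisfying assignment corresponds to a valid bit-vector.
    bits = [fresh(f"{t}_{i}", f"bit({t},{i})", False) for i in range(b)]
    for v in bits:
        assert_(f"{v}*({v} - 1) = 0")
    return ("bits", bits)

kind, terms = bv_variable("w1", 4, is_inst=False)
assert kind == "bits" and len(terms) == 4 and len(assertions) == 4
```

The defining terms passed to fresh are what the compiler would later use to assemble \(\textsf{Ext}_x\) and \(\textsf{Ext}_w\).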

Fig. 4. Pseudocode for some bit-vector operator rules. bvZeroExt zero-extends a bit-vector; for bit-wise encodings, it adds zero bits, and for unsigned encodings, it simply copies the original encoding. bvMulUint multiplies bit-vectors, all assumed to be unsigned encodings. We show only the case where the multiplication cannot overflow in the field: in this case, the rule performs the multiplication in the field and bit-splits the result to implement reduction modulo \(2^b\). The rules use ff2bv, which converts a field element to a bit-vector (discussed in Sect. 6.1).

4.3 Calculus

We now give a non-deterministic calculus describing how our field-blaster applies rules to compile a predicate \(\phi (x,w)\) into a system of field equations.

A calculus state is a tuple of three items: (E, A, F). The encoding store E is a (multi-)map from terms to sets of encodings. The assertions formula A is the conjunction of all field equations asserted via \(\textsf{assert}\). The fresh-variable definition sequence F consists of pairs (v, t), each matching a single call to \(\textsf{fresh}(v,t,\dots )\).
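One possible concrete representation of this state (illustrative only; CirC's implementation differs):

```python
from dataclasses import dataclass, field

@dataclass
class State:
    """The calculus state (E, A, F) of Sect. 4.3, with illustrative types."""
    E: dict = field(default_factory=dict)  # term -> set of encodings
    A: list = field(default_factory=list)  # conjunction of asserted field eqs
    F: list = field(default_factory=list)  # sequence of (var, defining term)

    def add_encoding(self, t, e):
        """E ∪ (t ↦ e): add encoding e to t's encoding set."""
        self.E.setdefault(t, set()).add(e)
```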

Figure 5 shows the transitions of our calculus. We denote the result of a rule as \(A', F', e' \leftarrow r(\dots )\), where \(A'\) is a formula capturing any new assertions, \(F'\) is a sequence of pairs capturing any new variable definitions, and \(e'\) is the rule's return value. We may omit one or more results if they are always absent for a particular rule. For encoding store E, \(E \cup (t \mapsto e)\) denotes the store with e added to t's encoding set.

Fig. 5. The transition rules of our rewriting calculus.

There are five kinds of transitions. The \(\textsf{Const}\) transition adds an encoding for a constant term. The \(\textsf{const}\) rule returns an encoding e whose terms depend on the constant c; \(e'\) is a new encoding identical to e, except that each of its terms has been evaluated to obtain a field constant. The \(\textsf{Var}\) transition adds an encoding for a variable term. The \(\textsf{Conv}\) transition takes a term that is already encoded and re-encodes it with a new encoding kind. The \(\textsf{kinds}\) operator returns all legal values of \(\textsf{kind}\) for encodings of a given sort. The \(\textsf{Op}_{r}\) transition applies operator rule r. This transition is only possible if r's operator kind agrees with o, and if its input encoding kinds agree with \(\vec e\). The \(\textsf{Finish}\) transition applies when \(\phi \) has been encoded. It uses \(\textsf{const}\) and \(\textsf{assertEq}\) to build assertions that hold when \(\phi = \top \). Rather than producing a new calculus state, it returns the outputs of the calculus: the assertions and the variable definitions.

To meet the requirements of the ZKP compiler, our calculus must return two extension functions: \(\textsf{Ext}_x\) and \(\textsf{Ext}_w\) (Sect. 2.2). Both can be constructed from the fresh variable definitions F. One subtlety is that \(\textsf{Ext}_x(x)\) (which assigns values to fresh instance variables) is a function of x only; it cannot depend on the witness variables of \(\phi \). We ensure this by allowing fresh instance variables to be created only by the \(\textsf{variable}\) rule, and only when it is called with \(\textsf{isInst} = \top \).
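A hedged sketch of how \(\textsf{Ext}_x\) and \(\textsf{Ext}_w\) might be built from F, assuming a term evaluator `eval_term(t, env)` is available (all names here are hypothetical):

```python
def make_extenders(F, inst_vars, eval_term):
    """Build Ext_x and Ext_w from the definition sequence F (a sketch).

    `F` is a list of (var, term) pairs in creation order; `inst_vars` is
    the set of fresh instance variables; `eval_term(t, env)` evaluates a
    term under an assignment.  Ext_x consults only the instance
    assignment, which suffices because instance definitions never mention
    witness variables.
    """
    def ext_x(x_env):
        env, out = dict(x_env), {}
        for v, t in F:
            if v in inst_vars:
                env[v] = out[v] = eval_term(t, env)
        return out

    def ext_w(x_env, w_env):
        env, out = {**x_env, **w_env}, {}
        for v, t in F:
            env[v] = eval_term(t, env)
            if v not in inst_vars:
                out[v] = env[v]
        return out

    return ext_x, ext_w

# Tiny illustrative evaluator: a term is a variable name or
# ("ite", cond_var, then_val, else_val).
def eval_term(t, env):
    if isinstance(t, str):
        return env[t]
    _op, c, a, b = t
    return a if env[c] else b

F = [("x'", "x"), ("w'", ("ite", "w", 1, 0))]
ext_x, ext_w = make_extenders(F, {"x'"}, eval_term)
```

Here `ext_x({"x": 5})` assigns only the fresh instance variable, while `ext_w` evaluates the whole definition sequence and returns the fresh witness values.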

Strategy. Our calculus is non-deterministic: multiple transitions are possible in some situations; for example, some conversion is almost always applicable. The strategy that decides which transition to apply affects field-blaster performance (Appendix D) but not correctness.

5 Verification Conditions

In this section, we first define correctness for a ZKP compiler (Sect. 5.1). Then, we give verification conditions (VCs) for each type of encoding rule (Sect. 5.2). Finally, we show that if these VCs hold, our calculus is a correct ZKP compiler (Sect. 5.3).

5.1 Correctness Definition

Definition 1 (Correctness)

A ZKP compiler \(\textsf{Compile}(\phi ) \rightarrow (\phi ', \textsf{Ext}_x, \textsf{Ext}_w)\) is correct if it is demonstrably complete and demonstrably sound.

  • demonstrable completeness: For all \(\textsf{x}\in \textsf{dom}(x),\textsf{w}\in \textsf{dom}(w)\) such that \(\hat{\phi }(\textsf{x},\textsf{w})=\top \),

    $$ \hat{\phi '}(\textsf{Ext}_x(\textsf{x}),\textsf{Ext}_w(\textsf{x},\textsf{w})) = \top $$
  • demonstrable soundness: There exists an efficient algorithm \(\textsf{Inv}(\textsf{x}', \textsf{w}') \rightarrow \textsf{w}\) such that for all \(\textsf{x}\in \textsf{dom}(x),\textsf{w}'\in \textsf{dom}(w')\) such that \(\hat{\phi '}(\textsf{Ext}_x(\textsf{x}),\textsf{w}')=\top \),

    $$ \hat{\phi }(\textsf{x}, \textsf{Inv}(\textsf{Ext}_x(\textsf{x}),\textsf{w}'))=\top $$

Demonstrable completeness (respectively, soundness) requires the existence of a witness for \(\phi '\) (resp., \(\phi \)) when a witness exists for \(\phi \) (resp., \(\phi '\)); this existence is demonstrated by an efficient algorithm \(\textsf{Ext}_w\) (resp., \(\textsf{Inv}\)) that computes the witness.
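To illustrate Definition 1, consider a toy "compiler" for the predicate \(\phi (x, w) = x \wedge w\) over Booleans, with output equations over a tiny prime field; the check below exhaustively verifies both properties (the field, the predicate, and all names are illustrative, not from the paper's implementation):

```python
P = 7  # a tiny illustrative prime field

# Toy predicate: phi(x, w) = x AND w, with Boolean instance x, witness w.
def phi(x, w):
    return x and w

# Compiled predicate phi'(x', w') over F_P: w' must be Boolean-valued
# (w' * (w' - 1) = 0) and x' * w' = 1 must hold.
def phi_prime(x1, w1):
    return (w1 * (w1 - 1)) % P == 0 and (x1 * w1) % P == 1

Ext_x = lambda x: int(x)        # instance extension
Ext_w = lambda x, w: int(w)     # witness extension
Inv   = lambda x1, w1: w1 == 1  # witness inverter for soundness

# Demonstrable completeness: phi(x, w) implies phi'(Ext_x(x), Ext_w(x, w)).
for x in (False, True):
    for w in (False, True):
        if phi(x, w):
            assert phi_prime(Ext_x(x), Ext_w(x, w))

# Demonstrable soundness: phi'(Ext_x(x), w') implies phi(x, Inv(...)).
for x in (False, True):
    for w1 in range(P):
        if phi_prime(Ext_x(x), w1):
            assert phi(x, Inv(Ext_x(x), w1))
```

When \(x = \bot \), the instance extension yields \(x' = 0\), so \(x' w' = 1\) is unsatisfiable and soundness holds vacuously; when \(x = \top \), the only satisfying \(w'\) is 1, which `Inv` maps back to \(\top \).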

Correct ZKP compilers are important for two reasons. First, since sequential composition preserves correctness, one can prove a multi-pass compiler is correct pass-by-pass. Second, a correct ZKP compiler from \(\varPhi \) to \(\varPhi '\) can be used to generalize a ZKP for \(\varPhi '\) to one for \(\varPhi \). We prove both properties in Appendix A.

Theorem 1 (Compiler Composition)

If \(\textsf{Compile}'\) and \(\textsf{Compile}''\) are correct, then the compiler \(\textsf{Compose}(\textsf{Compile}', \textsf{Compile}'')\) (Appendix A) is correct.

Theorem 2 (ZKP Generalization)

(informal) Given a correct ZKP compiler \(\textsf{Compile}\) from \(\varPhi \) to \(\varPhi '\) and a ZKP for \(\varPhi '\), we can construct a ZKP for \(\varPhi \).

5.2 Rule VCs

Recall (Sect. 4) that our language manipulates encodings through five types of encoding rules. We give verification conditions for each type of rule. Intuitively, these capture the correctness of each rule in isolation. Next, we'll show that they imply the correctness of a ZKP compiler that follows our calculus.

Our VCs quantify over valid encodings. That is, they have the form: "for any valid encoding e of term t, ..." We can quantify over an encoding e by making each \(t_i \in \textsf{terms}(e)\) a fresh variable, and quantifying over the \(t_i\). Encoding validity is captured by a predicate \( valid (e,t)\), which is defined to be the validity condition in Table 2. Each VC containing encoding variables \(\textbf{e}\) implicitly represents a conjunction of instances of that VC, one for each possible tuple of kinds of \(\textbf{e}\), which is fixed for each instance. If a VC contains \( valid (e,t)\), the sort of t is constrained to be compatible with \(\textsf{kind}(e)\). For a kind and a sort to be compatible, they must occur in the same row of Table 2. We define the equality predicate \( equal (e,e')\) as \(\bigwedge _i \textsf{terms}(e)[i] \approx \textsf{terms}(e')[i]\).

Table 3. VCs related to encoding uniqueness.

Encoding Uniqueness. First, we require the uniqueness of valid encodings, for any fixed encoding kind. Table 3 shows the VCs that ensure this. Each row is a formula that must be valid, for all compatible encodings and terms. The first two rows ensure that there is a bijection from terms to their valid encodings (in the first row, we consider only instances for which \(\textsf{kind}(e)=\textsf{kind}(e')\)). The function \( fromTerm (t, \textsf{kind}) \rightarrow e\) maps a term and an encoding kind to a valid encoding of that kind, and the function \( toTerm (e)\rightarrow t\) maps a valid encoding to its encoded term. The third and fourth rows ensure that \( fromTerm \) and \( toTerm \) are correctly defined. We will use \( toTerm \) in our proof of calculus soundness (Appendix B) and we will use \( fromTerm \) to optimize VCs for faster verification (Sect. 6.1).

For an example of the \( valid \), \( fromTerm \), and \( toTerm \) functions, consider a Boolean b encoded as an encoding e with \(\textsf{kind}\) \(\texttt{bit}\) and whose \(\textsf{terms}\) consist of a single field element f. Validity is defined as \( valid (e, b) = f \approx ite (b, 1, 0)\), \( toTerm (e)\) is defined as \(f \approx 1\), and \( fromTerm (b, \texttt{bit})\) is \((b, \texttt{bit}, ite (b, 1, 0))\).
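This Boolean example can be checked exhaustively over a tiny field; the sketch below renders \( valid \), \( fromTerm \), and \( toTerm \) for the \(\texttt{bit}\) kind and verifies the uniqueness properties of Table 3 (the representation and field size are our own choices):

```python
P = 5  # tiny illustrative prime field

# A `bit` encoding of a Boolean b is a single field term f,
# valid iff f = ite(b, 1, 0).
def valid(f, b):
    return f == (1 if b else 0)

def to_term(f):
    """Map a valid `bit` encoding back to the Boolean it encodes."""
    return f == 1

def from_term(b):
    """Map a Boolean to its unique valid `bit` encoding."""
    return 1 if b else 0

# Table 3-style uniqueness checks, exhaustive over F_P x Bool:
for b in (False, True):
    # fromTerm produces a valid encoding ...
    assert valid(from_term(b), b)
    for f in range(P):
        if valid(f, b):
            # ... every valid encoding maps back to its term ...
            assert to_term(f) == b
            # ... and the valid encoding of a fixed kind is unique.
            assert f == from_term(b)
```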

VCs for Encoding Rules. Table 4 shows our VCs for the rules of Fig. 5. For each rule application, A and F denote, respectively, the assertions and the variable declarations generated when that rule is applied. We explain some of the VCs in detail.

Table 4. VCs for encoding rules.

First, consider a rule \(r_o\) for operator o applied to inputs \(t_1, \dots , t_k\). The rule takes input encodings \(e_1, \dots , e_k\) and returns an output \(e'\). It is sound if the validity of its inputs and its assertions imply the validity of its output. It is complete if the validity of its inputs implies its assertions and the validity of its output, after substituting fresh variable definitions.

Second, consider a variable rule. Its input is a variable term t, and it returns \(e'\), a putative encoding thereof. Note that \(e'\) does not actually contain t, though the substitutions in F may bind the fresh variables of \(e'\) to functions of t. For the rule to be sound when t is a witness variable \((t \in w)\), the assertions must imply that \(e'\) is valid for some term \(t'\). For the rule to be sound when t is an instance variable \((t \in x)\), the assertions must imply that \(e'\) is valid for t, when the instance variables in \(e'\) are replaced with their definition (\(F_x\) denotes F, restricted to its declarations of instance variables).Footnote 8 For the variable rule to be complete (for an instance or a witness), the assertions and the validity of \(e'\) for t must follow from F.

Third, consider a constant rule. Its input is a constant term t, and it returns an encoding e. Recall that the terms of e are always evaluated, yielding \(e'\) which only contains constant terms. Thus, correctness depends only on the fact that e is always a valid encoding of the input t. This can be captured with a single VC.

5.3 A Correct Field-Blasting Calculus

Given rules that satisfy these verification conditions, we show that the calculus of Sect. 4.3 is a correct ZKP compiler. The proof is in Appendix B.

Theorem 3 (Correctness)

With rules that satisfy the conditions of Sect. 5.2, the calculus of Sect. 4.3 is demonstrably complete and sound (Def. 1).

6 Case Study: A Verifiable Field-Blaster for CirC

We implemented and partially verified a field-blaster for CirC [46]. Our implementation is based on a refactoring of CirC's original field blaster to conform to our encoding rules (Sect. 4.2) and consists of \(\approx \)850 lines of code (LOC).Footnote 9 As described below, we have (partially) verified our encoding rules, but trust our calculus (Sect. 4.3, \(\approx \)150 LOC) and our flattening implementations (Fig. 2, \(\approx \)160 LOC).

While porting rules, we found 4 bugs in CirC's original field-blaster (see Appendix G), including a severe soundness bug. Given a ZKP compiled with CirC, the bug allowed a prover to incorrectly compare bit-vectors. The prover, for example, could claim that the unsigned value of 0010 is greater than or less than that of 0001. A patch to fix all 4 bugs (in the original field blaster) has been upstreamed, and we are in the process of upstreaming our new field-blaster implementation into CirC.

6.1 Verification Evaluation

Our implementation constructs the VCs from Sect. 5.2 and emits them as SMT-LIB (extended with a theory of finite fields [47]). We verify them with cvc5, because it can solve formulas over bit-vectors and prime fields [47]. The verification is partial in that it is bounded in two ways. We set \(b \in \mathbb {N}\) to be the maximum bit-width of any bit-vector and \(a \in \mathbb {N}\) to be the maximum number of arguments to any n-ary operator. In our evaluation, we used \(a=4\) and \(b=4\). These bounds are small, but they were sufficient to find the bugs mentioned above.

Optimizing Completeness VCs. Generally, cvc5 verifies soundness VCs more quickly than completeness VCs. This is surprising at first glance. To see why, consider the soundness (S) and completeness (C) conditions for a conversion rule from e to \(e'\) that generates assertions A and definitions F:

$$ S \triangleq (A \wedge valid (e, t)) \rightarrow valid (e', t) $$
$$ C \triangleq ( valid (e, t) \rightarrow (A \wedge valid (e', t)))[F] $$

In both, t is a variable, e contains variables, and there are variables in \(e'\) and A that are defined by F. In C, though, some variables are replaced by their definitions in F, which makes the number of variables (and thus the search space) seem smaller for C than for S. Yet, cvc5 is slower on C.

The problem is that, while the field operations in A are standard (e.g., \(+\), \(\times \), and \(=\)), the definitions in F use a CirC-IR operator that (once embedded into SMT-LIB) is hard for cvc5 to reason about. That operator, (ff2bv b), takes a prime field element x and returns a bit-vector v. If x's integer representative is less than \(2^b\), then v's unsigned value is equal to x; otherwise, v is zero.

The ff2bv operator is trivial to evaluate but hard to embed. cvc5's SMT-LIB extension for prime fields only supports \(+\), \(\times \), and \(=\), so no operator can directly relate x to v. Instead, we encode the relationship through b Booleans that represent the bits of v. To test whether \(x < 2^b\), we use the polynomial \(f(x) = \prod _{i=0}^{2^b-1}(x - i)\), which is zero only on \([0,2^b-1]\). The bit-splitting essentially forces cvc5 to guess v's value; further, f's high degree slows down the Gröbner basis computations that form the foundation of cvc5's field solver.
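The semantics of ff2bv and the range polynomial can be sketched and sanity-checked in a few lines (the prime below is illustrative; CirC's field is far larger):

```python
P = 97  # a small illustrative prime, far smaller than CirC's field

def ff2bv(x, b):
    """Semantics of (ff2bv b): if x's integer representative is < 2^b,
    return it as a b-bit unsigned value; otherwise return 0."""
    x = x % P
    return x if x < 2**b else 0

def in_range_poly(x, b):
    """f(x) = prod_{i=0}^{2^b - 1} (x - i) mod P: zero exactly on [0, 2^b)."""
    prod = 1
    for i in range(2**b):
        prod = (prod * (x - i)) % P
    return prod

b = 3
for x in range(P):
    # The polynomial test agrees with the range check inside ff2bv.
    assert (in_range_poly(x, b) == 0) == (x < 2**b)
```

The degree of f is \(2^b\), which is what makes this embedding expensive for the solver even though evaluating ff2bv directly is trivial.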

To optimize verification of the completeness VCs, we reason about CirC-IR directly. First, we use the uniqueness of valid encodings and the \( fromTerm \) function. Since the VC assumes \( valid (e, t)\), we know e is equal to \( fromTerm (t, \textsf{kind}(e))\). We use this equality to eliminate e from the completeness VC, leaving:

$$ (A \wedge valid (e', t))[F][e\mapsto fromTerm (t, \textsf{kind}(e))] $$

Since F defines all variables in A and \(e'\), the only variable after substitution is t. So, when t is a Boolean or small bit-vector, an exhaustive search is very effective;Footnote 10 we implemented such a solver in 56 LOC, using CirC's IR as a library.

For soundness VCs, this approach is less effective. The \( fromTerm \) substitution still applies, but if F introduces fresh field variables, they are not eliminated; the final formula then contains field variables, so exhaustive search is infeasible.

Fig. 6. VCs verified by different solvers. 'uniq' denotes the VCs of Table 3; others are from Table 4. 'C' denotes completeness; 'S': soundness.

Fig. 7. The performance of CirC with the verified and unverified field-blaster. Metrics are summed over the 61 functions in the Z# standard library.

Verification Results. We ran our VC verification on machines with Intel Xeon E5-2637 v4 CPUs.Footnote 11 Each attempt is limited to one physical core, 8GB memory, and 30 min. Figure 6 shows the number of VCs verified by cvc5 and our exhaustive solver. As expected, the exhaustive solver is effective on completeness VCs for Boolean and bit-vector rules, but ineffective on soundness VCs for rules that introduce fresh field variables. There are four VCs that neither solver verifies within 30 min: bvadd with (\(b = 4\), \(a = 4\)), and bvmul with (\(b = 3\), \(a = 4\)) and (\(b = 4\), \(a \ge 3\)). Most other VCs verify instantly. In Appendix E, we analyze how VC verification time depends on a and b.

6.2 Performance and Output Quality Evaluation

We compare CirC with our field-blaster ("Verified") against CirC with its original field-blaster ("Unverified")Footnote 12 on three metrics: compiler runtime, memory usage, and the final R1CS constraint count. Our benchmark set is the standard library for CirC's Z# input language (which extends ZoKrates [16, 68] v0.6.2). Our testbed runs Linux with 32GB memory and an AMD Ryzen 2700.

There is no difference in constraints, but the verified field-blaster slightly improves compiler performance: -8% time and -2% memory (Fig. 7). We think that the small improvement is unrelated to the fact that the new field blaster is verified. In Appendix E, we discuss compiler performance further.

7 Discussion

In this work, we present the first automatically verifiable field-blaster. We view the field-blaster as a set of rules; if some (automatically verifiable) conditions hold for each rule, then the field-blaster is correct. We implemented a performant and partially verified field-blaster for CirC, finding 4 bugs along the way.

Our approach has limitations. First, we require the field-blaster to be written as a set of encoding rules. Second, we only verify our rules for bit-vectors of bounded size and operators of bounded arity. Third, we assume that each rule is a pure function: for example, it does not return different results depending on the time. Future work might avoid the last two limitations through bit-width-independent reasoning [42, 43, 67] and a DSL (and compiler) for encoding rules. It would also be interesting to extend our approach to: a ZKP with a non-prime field [7, 13], a compiler IR with partial or non-deterministic semantics, or a compiler with correctness that depends on computational assumptions.