Abstract
Zero Knowledge Proofs (ZKPs) are cryptographic protocols by which a prover convinces a verifier of the truth of a statement without revealing any other information. Typically, statements are expressed in a high-level language and then compiled to a low-level representation on which the ZKP operates. Thus, a bug in a ZKP compiler can compromise the statement that the ZK proof is supposed to establish. This paper takes a step towards ZKP compiler correctness by partially verifying a field-blasting compiler pass, a pass that translates Boolean and bit-vector logic into equivalent operations in a finite field. First, we define correctness for field-blasters and ZKP compilers more generally. Next, we describe the specific field-blaster using a set of encoding rules and define verification conditions for individual rules. Finally, we connect the rules and the correctness definition by showing that if our verification conditions hold, the field-blaster is correct. We have implemented our approach in the CirC ZKP compiler and have proved bounded versions of the corresponding verification conditions. We show that our partially verified field-blaster does not hurt the performance of the compiler or its output; we also report on four bugs uncovered during verification.
You have full access to this open access chapter, Download conference paper PDF
Similar content being viewed by others
1 Introduction
Zero-Knowledge Proofs (ZKPs) are powerful tools for building privacy-preserving systems. They allow one entity, the prover \(\mathcal {P}\) , to convince another, the verifier \(\mathcal {V}\) , that some secret data satisfies a public property, without revealing anything else about the data. ZKPs underlie a large (and growing!) set of critical applications, from billion-dollar private cryptocurrencies, like ZcashĀ [24, 53] and MoneroĀ [2], to research into auditable sealed court ordersĀ [20], private gun registriesĀ [26], privacy-preserving middleboxesĀ [23], and zero-knowledge proofs of exploitabilityĀ [11]. This breadth of applications is possible because of the generality of ZKPs. In general, \(\mathcal {P}\) knows a secret witness w, whereas \(\mathcal {V}\) knows a property \(\phi \) and a public instance x. \(\mathcal {P}\) must show that \(\phi (x,w)= \top \). Typically, x and w are vectors of variables in a finite field \(\mathbb {F}\), and \(\phi \) can be any system of equations over the variables, using operations \(+\) and \(\times \). Because \(\phi \) itself is an input to \(\mathcal {P}\) and \(\mathcal {V}\) , and because of the expressivity of field equations, a single implementation of \(\mathcal {P}\) and \(\mathcal {V}\) can serve many different purposes.
Humans find it difficult to express themselves directly with field equations, so they use ZKP compilers. A ZKP compiler converts a high-level predicate \(\phi '\) into an equivalent system of field equations \(\phi \). In other words, a ZKP compiler generalizes a ZKP: by compiling \(\phi '\) to \(\phi \) and then using a ZKP for \(\phi \), one obtains a ZKP for \(\phi '\). There are many industrial Ā [3, 5, 6, 14, 21, 45, 55, 66] and academic Ā [4, 18, 28, 29, 46, 48, 50, 54, 63] ZKP compilers.
The correctness of a ZKP compiler is critical for securityā a bug in the compiler could admit proofs of false statementsā but verification is challenging for three reasons. First, the definition of correctness for a ZKP compiler is non-trivial; we discuss later in this section. Second, ZKP compilers span multiple domains. The high-level predicate \(\phi '\) is typically expressed in a language with common types such as Booleans and fixed-width integers, while the output \(\phi \) is over a large, prime-order field. Thus, any compiler correctness definition must span these domains. Third, ZKP compilers are evolving and performance-critical; verification must not inhibit future changes or degrade compiler performance.
In this work, we develop tools for automatically verifying the field-blaster of a ZKP compiler. A ZKP compilerās field-blaster is the pass that converts from a formula over Booleans, fixed-width integers, and finite-field elements, to a system of field equations; as a transformation from bit-like types to field equations, the field-blaster exemplifies the challenge of cross-domain verification.
Our paper makes three contributions. First, we formulate a precise correctness definition for a ZKP compiler. Our definition ensures that a correct compiler preserves the completeness and soundness of the underlying ZK proof system.Footnote 1 More specifically, given a ZK proof system where statements are specified in a low-level language L, and a compiler from a high-level language H to L, if the compiler is correct by our definition, it extends the ZK proof systemās soundness and completeness properties to statements in H. Further, our definition is preserved under sequential composition, so proving the correctness of each compiler pass individually suffices to prove correctness of the compiler itself.
Second, we give an architecture for a verifiable field-blaster. In our architecture, a field-blaster is a set of āencoding rules.ā We give verification conditions (VCs) for these rules, and we show that if the VCs hold, then the field-blaster is correct. Our approach supports automated verification because (bounded versions of) the VCs can be checked automatically. This reduces both the up-front cost of verification and its maintenance cost.
Third, we do a case study. Using our architecture, we implement a new field-blaster for CirCĀ [46] (āSIR-seeā), an infrastructure used by state-of-the-art ZKP compilers. We verify bounded versions of our field-blasterās VCs using SMT-based finite-field reasoningĀ [47], and show that our field blaster does not compromise CirCās performance. We also report on four bugs that our verification effort uncovered, including a soundness bug that allowed the prover to ālieā about the results of certain bit-vector comparisons. We note that the utility of our techniques is not limited to CirC: most ZKP compilers include something like the field-blaster we describe here.
In the next sections, we discuss related work (Sect.Ā 1.1), give background on ZKPs and CirC (Sect.Ā 2), present a field-blasting example (Sect.Ā 3), describe our architecture (Sect.Ā 4), give our verification conditions (Sect.Ā 5), and present the case study (Sect.Ā 6).
1.1 Related Work
Verified Compilers. There is a rich body of work on verifying the correctness of traditional compilers. We focus on compilation for ZKPs; this requires different correctness definitions that relate bit-like types to prime field elements. In the next paragraphs, we discuss more fine-grained differences.
Compiler verification efforts fall into two broad categories: automatedāverification leveraging automated reasoning solversāand foundationalāmanual verification using proof assistants (e.g., CoqĀ [8] or IsabelleĀ [44]). CompCertĀ [36], for example, is a Coq-verified C compiler with verified optimization passes (e.g.,Ā [40]). Closest to our work is backend verification, which proves correct the translation from an intermediate representation to machine code. CompCertās loweringĀ [37] is verified, as is CakeMLāsĀ [31] lowering to different ISAsĀ [19, 57]. While such foundational verification offers strong guarantees, it imposes a heavy proof burden; creating CompCert, for example, took an expert team eight yearsĀ [56], and any updates to compiler code require updates to proofs.
Automated verification, in contrast, does not require writing and maintaining manual proofs.Footnote 2 CobaltĀ [34], RhodiumĀ [35], and PECĀ [32] are domain-specific languages (DSLs) for writing automatically-verified compiler optimizations and analyses. Most closely related to our work is AliveĀ [39], a DSL for expressing verified peephole optimizations, local rewrites that transform snippets of LLVM IRĀ [1] to better-performing ones. Alive addresses transformations over fixed types (while we address lowering to finite field equations) and formulates correctness in the presence of undefined behavior (while we formulate correctness for ZKPs). Beyond Alive, Alive2Ā [38] provides translation validationĀ [41, 51] for LLVMĀ [33], and VeRAĀ [10] verifies range analysis in the Firefox JavaScript engine.
There is also work on verified compilation for domains more closely related to ZKPs. The PorcupineĀ [15] compiler automatically synthesizes representations for fully-homomorphic encryptionĀ [62], and GillarĀ [58] proves that optimization passes in the QiskitĀ [60] quantum compiler are semantics-preserving. While these works compile from high-level languages to circuit representations, the correctness definitions for their domains do not apply to ZKP compilers.
Verified Compilation to Cryptographic Proofs. Prior works on verified compilation for ZKPs (or similar) take the foundational approach (with attendant proof maintenance burdens), and they do not formulate a satisfactory definition of compiler correctness. PinocchioQĀ [18] builds on CompCertĀ [36]. The authors formulate a correctness definition that preserves the existential soundness of a ZKP but does not consider completeness, knowledge soundness, or zero-knowledge (see Sect.Ā 2.2). LeoĀ [14] is a ZKP compiler that produces (partial) ACL2Ā [27] proofs of correct compilation; work to emit proofs from its field-blaster is ongoing.
Recent work defines security for reductions of knowledgeĀ [30]. These let \(\mathcal {P}\) convince \(\mathcal {V}\) that it knows a witness for an instance of relation \(\mathcal {R}_1\) by proving it knows a witness for an instance of an easier-to-prove relation \(\mathcal {R}_2\). Unlike ZKP compilers, \(\mathcal {P}\) and \(\mathcal {V}\) interact to derive \(\mathcal {R}_2\) using \(\mathcal {V}\) ās randomness (e.g., proving that two polynomials are nonzero w.h.p. by proving that a random linear combination of them is), whereas ZKP compilers run ahead of time and non-interactively.
Further afield, EcneĀ [65] is a tool that attempts to verify that the input to a ZKP encodes a deterministic computation. It does not consider any notion of a specification of the intended behavior. A different workĀ [25] attempts to automatically verify that a āwidgetā given to a ZKP meets some specification. They consider widgets that could be constructed manually or with a compiler. Our focus is on verifying a compiler pass.
2 Background
2.1 Logic
We assume usual terminology for many-sorted first-order logic with equality ( [17] gives a complete presentation). We assume every signature includes the sort \(\textsf{Bool}\) , constants \(\textsf{True}\) and \(\textsf{False}\) of sort \(\texttt{Bool}\) , and symbol family \(\approx _\sigma \) (abbreviated \(\approx \)) with sort \(\sigma \times \sigma \rightarrow \textsf{Bool}\) for each sort \(\sigma \). We also assume a family of conditionals: symbols \( ite _\sigma \) (āif-then-elseā, abbreviated \( ite \) ) of sort \(\textsf{Bool}\times \sigma \times \sigma \rightarrow \sigma \).
A theory is a pair \(\mathcal {T} = (\varSigma , \textbf{I})\), where \(\varSigma \) is a signature and \(\textbf{I}\) is a class of \(\varSigma \)-interpretations. A \(\varSigma \) -formula is a term of sort \(\textsf{Bool}\) . A \(\varSigma \) -formula \(\phi \) is satisfiable (resp., unsatisfiable) in \(\mathcal {T}\) if it is satisfied by some (resp., no) interpretation in \(\textbf{I}\) . We focus on two theories. The first is \(\mathcal {T}_{ BV }\), the SMT-LIB theory of bit-vectorsĀ [52, 61], with signature \(\varSigma _{ BV }\) including a bit-vector sort \(\textsf{BV}_{[n]}\) for each \(n>0\) with bit-vector constants \(c_{[n]}\) of sort \(\textsf{BV}_{[n]}\) for each \(c\in [0,2^n-1]\), and operators including & and \(|\) (bitwise and, or) and \(+_{[n]}\) (addition modulo \(2^n\)). We write t[i] to refer to the \(i^ th \) bit of bit-vector t, where t[0] is the least-significant bit. The other theory is \(\mathcal {T}_{ F _{p}}\), which is the theory corresponding to the finite field of order p, for some prime pĀ [47]. This theory has signature \(\varSigma _{ F _{p}}\) containing the sort \(\textsf{FF}_{p}\), constant symbols \(0, \ldots , p-1\), and operators \(+\) and \(\times \).
In this paper, we assume all interpretations interpret sorts and symbols in the same way. We write \(\textsf{dom}(v)\) for the set interpreting the sort of a variable v. We assume that \(\textsf{Bool}\), \(\textsf{True}\), and \(\textsf{False}\) are interpreted as \(\{\top , \bot \}\), \(\top \), and \(\bot \), respectively; \(\varSigma _{ BV }\)-interpretations follow the SMT-LIB standard; and \(\varSigma _{ F _{p}}\)-interpretations interpret symbols as the corresponding elements and operations in \(\mathbb {F}_p\), a finite field of order p (for concreteness, this could be the integers modulo p). Note that only the values of variables can vary between two interpretations.
For a signature \(\varSigma \), let t be a \(\varSigma \)-term of sort \(\sigma \), with free variables \(x_1,\ldots ,x_n\), respectively of sort \(\sigma _1,\ldots ,\sigma _n\). We define the function \(\hat{t}: \textsf{dom}(x_1) \times \cdots \times \textsf{dom}(x_n) \rightarrow \textsf{dom}(t)\) as follows. Let \(\mathbf {\textsf{x}}\in \textsf{dom}(x_1) \times \cdots \times \textsf{dom}(x_n)\). Let \(\mathcal {M}\) be an interpretation that interprets each \(x_i\) as \( \textsf{x}_i\). Then \(\hat{t}(\mathbf {\textsf{x}}) = t^{\mathcal {M}}\) (i.e., the interpretation of t in \(\mathcal {M}\)). For example, the term \(t = a \wedge \lnot a\) defines \(\hat{t}:\textsf{Bool}\rightarrow \textsf{Bool}= \lambda \,x.\,\bot \). In the following, we follow the convention used above in using the standard font (e.g., x) for logical variables and a sans serif font (e.g., \(\textsf{x}\)) to denote meta-variables standing for values (i.e., elements of \(\sigma ^\mathcal {M}\) for some \(\sigma \) and \(\mathcal {M}\)). Also, abusing notation, weāll conflate single variables (of both kinds) with vectors of variables when the distinction doesnāt matter. Note that a formula \(\phi \) is satisfiable if there exist values \(\textsf{x}\) such that \(\hat{\phi }(\textsf{x}) = \top \). It is valid if for all values \(\textsf{x}\), \(\hat{\phi }(\textsf{x}) = \top \).
For terms s,Ā t and variable x, \(t[x\mapsto s]\) denotes t with all occurrences of x replaced with s. For a sequence of variable-term pairs, \(S = (x_1 \mapsto s_1, \dots , x_n \mapsto s_n)\), t[S] is defined to be \(t[x_1\mapsto s_1]\cdots [x_n\mapsto s_n]\).
2.2 Zero Knowledge Proofs
As mentioned above, Zero-knowledge proofs (ZKPs) make it possible to prove that some secret data satisfies a public propertyāwithout revealing the data itself. SeeĀ [59] for a full presentation; we give a brief overview here, and then describe how general-purpose ZKPs are used.
Overview and Definitions. In a cryptographic proof system, there are two parties: a verifier \(\mathcal {V}\) and a prover \(\mathcal {P}\) . \(\mathcal {V}\) knows a public instance \(\textsf{x}\) and asks \(\mathcal {P}\) to show that it has knowledge of a secret witness \(\textsf{w}\) satisfying a public predicate \(\phi (x, w)\) from a predicate class \(\varPhi \) (a set of formulas) (i.e., \(\hat{\phi }(\textsf{x},\textsf{w})=\top \)). FigureĀ 1 illustrates the workflow. First, a trusted party runs an efficient (i.e., polytime in an implicit security parameter \(\lambda \)) algorithm \(\textsf{Setup}(\phi )\) which produces a proving key \(\textsf{pk}\) and a verifying key \(\textsf{vk}\) . Then, \(\mathcal {P}\) runs an efficient algorithm \(\textsf{Prove}(\textsf{pk}, \textsf{x}, \textsf{w}) \rightarrow \pi \) and sends the resulting proof \(\pi \) to \(\mathcal {V}\) . Finally, \(\mathcal {V}\) runs an efficient verification algorithm \(\textsf{Verify}(\textsf{vk}, \textsf{x}, \pi ) \rightarrow \{\top ,\bot \}\) that accepts or rejects the proof. A zero-knowledge argument of knowledge for class \(\varPhi \) is a tuple \(\Pi = (\textsf{Setup}, \textsf{Prove}, \textsf{Verify})\) with three informal properties for every \(\phi \in \varPhi \) and every \(\textsf{x}\in \textsf{dom}(x),\textsf{w}\in \textsf{dom}(w)\):
-
perfect completeness: if \(\hat{\phi }(\textsf{x},\textsf{w})\) holds, then \(\textsf{Verify}(\textsf{vk}, \textsf{x}, \pi )\) holds;
-
computational knowledge soundnessĀ [9]: an efficient adversary that does not know \(\textsf{w}\) cannot produce a \(\pi \) such that \(\textsf{Verify}(\textsf{vk}, \textsf{x}, \pi )\) holds; and
-
zero-knowledgeĀ [22]: \(\pi \) reveals nothing about \(\textsf{w}\), other than its existence.
Technically, the system is an āargumentā rather than a āproofā because soundness only holds against efficient adversaries. Also note that knowledge soundness requires that an entity must āknowā a valid \(w'\) to produce a proof; it is not enough for a valid \(w'\) to simply exist. We give more precise definitions in AppendixĀ A.
Representations for ZKPs. As mentioned above, ZKP applications are manifold (Sect.Ā 1)āfrom cryptocurrencies to private registries. This breadth of applications is possible because ZKPs support a broad class of predicates. Most commonly, these predicates are expressed as rank-1 constraint systems (R1CSs). Recall that \(\mathbb {F}_p\) is a prime-order finite field (also called a prime field). We will drop the subscript p when it is not important. In an R1CS, \(\textsf{x}\) and \(\textsf{w}\) are vectors of elements in \(\mathbb {F}\) ; let \(\textsf{z}\in \mathbb {F}^m\) be their concatenation. The function \(\hat{\phi }\) can be defined by three matrices \(\textsf{A}, \textsf{B}, \textsf{C}\in \mathbb {F}^{n \times m}\); \(\hat{\phi }(\textsf{x},\textsf{w})\) holds when \(\textsf{A}\textsf{z}\circ \textsf{B}\textsf{z}= \textsf{C}\textsf{z}\), where \(\circ \) is the element-wise product. Thus, \(\phi \) can be viewed as n conjoined constraints, where each constraint i is of the form \((\sum _j a_{ij}z_j) \times (\sum _j b_{ij}z_j) \approx (\sum _j c_{ij}z_j)\) (where the \(a_{ij}\), \(b_{ij}\) and \(c_{ij}\) are constant symbols from \(\varSigma _{ F _{p}}\), and the \(z_j\) are a vector of variables of sort \(\textsf{FF}_{p}\)). That is, each constraint enforces a single non-linear multiplication.
2.3 Compilation Targeting Zero Knowledge Proofs
To write a ZKP about a high-level predicate \(\phi \), that predicate is first compiled to an R1CS. A ZKP compiler from class \(\varPhi \) (a set of \(\varSigma \)-formulas) to class \(\varPhi '\) (a set of \(\varSigma '\)-formulas) is an efficient algorithm \(\textsf{Compile}(\phi \in \varPhi ) \rightarrow (\phi '\in \varPhi ', \textsf{Ext}_x, \textsf{Ext}_w)\). Given a predicate \(\phi (x,w)\), it returns a predicate \(\phi '(x', w')\) as well as two efficient and deterministic algorithms, instance and witness extenders: \(\textsf{Ext}_x : \textsf{dom}(x) \rightarrow \textsf{dom}(x')\) and \(\textsf{Ext}_w: \textsf{dom}(x)\times \textsf{dom}(w) \rightarrow \textsf{dom}(w')\).Footnote 3 For example, CirCĀ [46] can compile a Boolean-returning C function (in a subset of C) to an R1CS.
At a high-level, \(\phi \) and \(\phi '\) should be āequisatisfiableā, with \(\textsf{Ext}_x\) and \(\textsf{Ext}_w\) mapping satisfying values for \(\phi \) to satisfying values for \(\phi '\). That is, for all \(\textsf{x}\in \textsf{dom}(x)\) and \(\textsf{w}\in \textsf{dom}(w)\) such that \(\hat{\phi }(\textsf{x}, \textsf{w}) = \top \), if \(\textsf{x}' = \textsf{Ext}_x(\textsf{x})\) and \(\textsf{w}' = \textsf{Ext}_w(\textsf{x}, \textsf{w})\), then \(\hat{\phi '}(\textsf{x}', \textsf{w}') = \top \). Furthermore, for any \(\textsf{x}\), it should be impossible to (efficiently) find \(\textsf{w}'\) satisfying \(\hat{\phi '}(\textsf{Ext}_x(\textsf{x}), \textsf{w}') = \top \) without knowing a \(\textsf{w}\) satisfying \(\hat{\phi }(\textsf{x}, \textsf{w}) = \top \). In Sect.Ā 5.1, we precisely define correctness for a predicate compiler.
One can build a ZKP for class \(\varPhi \) from a compiler from \(\varPhi \) to \(\varPhi '\) and a ZKP for \(\varPhi '\). Essentially, one runs the compiler to get a predicate \(\phi '\in \varPhi '\), as well as \(\textsf{Ext}_x\) and \(\textsf{Ext}_w\). Then, one writes a ZKP to show that \(\hat{\phi '}(\textsf{Ext}_x(\textsf{x}), \textsf{Ext}_w(\textsf{x}, \textsf{w})) = \top \). In AppendixĀ A, we give this construction in full and prove it is secure.
Optimization. The primary challenge when using ZKPs is cost: typically, \(\textsf{Prove}\) is at least three orders of magnitude slower than checking \(\phi \) directlyĀ [64]. Since \(\textsf{Prove}\) ās cost scales with n (the constraint count), it is critical for the compiler to minimize n. The space of optimizations is large and complex, for two reasons. First, the compiler can introduce fresh variables. Second, only equisatifiabilityānot logical equivalenceāis needed. Compilers in this space exploit equisatisfiability heavily to efficiently represent high-level constructs (e.g., Booleans, bit-vectors, arrays, ...) as an R1CS.
As a (simple!) example, consider the Boolean computation \(a \approx c_1 \vee \dots \vee c_k\). Assume that \(c'_1, \dots , c'_k\) are variables of sort \(\textsf{FF}_{}\) and that we add constraints \(c'_i(1-c'_i) \approx 0\) to ensure that \(c'_i\) has to be 0 or 1 for each i. Assume further that \((c'_i \approx 1)\) encodes \(c_i\) for each i. How can one additionally ensure that \(a'\) (also of sort \(\textsf{FF}_{}\)) is also forced to be equal to 0 or 1 and that \((a' \approx 1)\) is a correct encoding of a? Given that there are \(k-1\) ORs, natural approaches use \(\Theta (k)\) constraints. One clever approach is to introduce variable \(x'\) and enforce constraints \(x'(\sum _i c'_i)\approx a'\) and \((1-a')(\sum _i c'_i)\approx 0\). In any interpretation where any \(c_i\) is true, the corresponding interpretation for \(a'\) must be 1 to satisfy the second constraint; setting \(x'\) to the sumās inverse satisfies the first. If all \(c_i\) are false, the first constraint ensures \(a'\) is 0. This technique assumes the sum does not overflow; since ZKP fields are typically large (e.g., with p on the order of \(2^{255}\)), this is usually a safe assumption.
CirC. CirCĀ [46] is an infrastructure for building compilers from high-level languages (e.g., a C subset), to R1CSs. It has been used in research projectsĀ [4, 12], and in industrial R &D. FigureĀ 2 shows the structure of an R1CS compiler built with CirC. First, the front-end of the compiler converts the source program into CirC-IR. CirC-IR is a term IR based on SMT-LIB that includes: Booleans, bit-vectors, fixed-size arrays, tuples, and prime fields.Footnote 4 Second, the compiler optimizes and simplifies the IR so that the only remaining sorts are Booleans, bit-vectors, and the target prime field. Third, the compiler lowers the simplified IR to an R1CS predicate over the target field. For ZKPs built with CirC, the completeness, soundness, and zero-knowledge of the end-to-end system depend on the correctness of CirC itself.
3 Overview andĀ Example
To start, we view CirCās lowering pass as two passes (Fig.Ā 2). The first pass, ā(finite-)field-blasting,ā converts a many-sorted IR (representable as a (\(\varSigma _{ BV }\cup \varSigma _{ F _{}}\))-formula) to a conjunction of field equations (\(\varSigma _{ F _{}}\)-equations). The second pass, āflattening,ā converts this conjunction of field equations to an R1CS.
Our focus is on verifying the first pass. We begin with a worked example of how to field-blast a small snippet of CirC-IR (Sect.Ā 3.1). This example will illustrate four key ideas (Sect.Ā 3.2) that inspire our field-blasterās architecture.
3.1 An Example ofĀ Field-Blasting
We start with an example CirC-IR predicate expressed as a (\(\varSigma _{ BV }\cup \varSigma _{ F _{}}\))-formula:
The predicate includes: the XOR of two Booleans (ā\(\oplus \)ā), a bit-vector sum, a bit-vector AND, and a field product. \(x_0\) and \(w_0\) are of sort \(\texttt{Bool}\), \(x_1\), \(x_2\), and \(w_1\) are of sort \(\textsf{BV}_{[4]}\), and \(x_3\) and \(w_2\) are of sort \(\textsf{FF}_{p}\). Weāll assume that \(p \gg 2^4\). TableĀ 1 summarizes the new variables and assertions we create during field-blasting; we describe the origin of each assertion and new variable in the next paragraphs.
Lowering Clause One (Booleans). We begin with the Boolean term \((x_0 \oplus w_0)\). We will use 1 and 0 to represent \(\top \) and \(\bot \). We introduce variables \(x'_0\) and \(w'_0\) of sort \(\textsf{FF}_{p}\) to represent \(x_0\) and \(w_0\) respectively. To ensure that \(w'_0\) is 0 or 1, we assert: \(w'_0(w'_0-1) \approx 0\).Footnote 5 \(x_0 \oplus w_0\) is then represented by the expression \(1 - x'_0 - w'_0 + 2x'_0w'_0\). Setting this equal to 1 enforces that \(x_0 \oplus w_0\) must be true. These new assertions and fresh variables are reflected in the first three rows of the table.
Lowering Clause Two and Three (Bit-vectors). Before describing how to bit-blast the second and third clauses in \(\phi \), we discuss bit-vector representations in general. A bit-vector t can be viewed as a sequence of b bits or as a non-negative integer less than \(2^b\). These two views suggest two natural representations in a prime-order field: first, as one field element \(t'_u\), whose unsigned value agrees with t (assuming the fieldās size is at least \(2^b\)); second, as b elements \(t'_0, \dots , t'_{b-1}\), that encode the bits of t as 0 or 1 (in our encoding, \(t'_0\) is the least-significant bit and \(t'_{b-1}\) is the most-significant bit). The first representation is simple, but with it, some field values (e.g., \(2^b\)) donāt corresponding to any possible bit-vector. With the second approach, by including equations \(t'_i(t'_i-1)\approx 0\) in our system, we ensure that any satisfying assignment corresponds to a valid bit-vector. However, the extra b equations increase the size of our compilerās output.
We represent \(\phi \)ās \(w_1\) bit-wise: as \(w'_{1,0}, \dots , w'_{1,3}\), and we represent the instance variable \(x_1\) as \(x'_{1,u}\).Footnote 6 For the constraint \(w_1 +_{[4]} x_1 \approx w_1\), we compute the sum in the field and bit-decompose the result to handle overflow. First, we introduce new variable \(s'\) and set it equal to \(x'_{1,u} + \sum _{i=0}^3 2^iw'_{1,i}\). Then, we bit-decompose \(s'\), requiring \(s' \approx \sum _{i=0}^4 2^is'_{i}\), and \(s'_i(s'_i-1) \approx 0\) for \(i\in [0,4]\). Finally, we assert \(s'_i\approx w'_{1,i}\) for \(i\in [0,3]\). This forces the lowest 4 bits of the sum to be equal to \(w_1\).
The constraint \( x_2~ \& ~w_1\approx x_2\) is more challenging. Since \(x_2\) is an instance variable, we initially encode it as \(x'_{2,u}\). Then, we consider the bit-wise AND. There is no obvious way to encode a bit-wise operation, other than bit-by-bit. So, we convert \(x'_{2,u}\) to a bit-wise representation: We introduce witness variables \(x'_{2,0}, \dots , x'_{2, 3}\) and equations \(x'_{2,i}(x'_{2,i}-1)\approx 0\) as well as equation \(x'_{2,u} \approx \sum _{i=0}^3 2^i x'_{2,i}\). Then, for each i we require \(x'_{2,i} w'_{1,i} \approx x'_{2,i}\).
Lowering the Final Clause (Field Elements). Finally, we consider the field equation \(x_2 \approx w_2 \times w_2\). Our target is also field equations, so lowering this is straightforward. We simply introduce primed variables and copy the equation.
3.2 Key Ideas
This example highlights four ideas that guide the design of our field-blaster:
-
1.
fresh variables and assertions: Field-blasting uses two primitive operations: creating new variables in \(\phi '\) (e.g., \(w'_0\) to represent \(w_0\)) and adding new assertions to \(\phi '\) (e.g., \(w'_0(w'_0-1)\approx 0\)).
-
2.
encodings: For a term t in \(\phi \), we construct a field term (or collection of field terms) in \(\phi '\) that represent the value of t. For example, the Boolean \(w_0\) is represented as the field element \(w'_0\) that is 0 or 1.
-
3.
operator rules: if t is an operator applied to some arguments, we can encode t given encodings of the arguments. For example, if t is \(x_0 \oplus w_0\), and \(x_0\) is encoded as \(x_0'\) and \(w_0\) as \(w_0'\), then t can be encoded as \(1-x'_0-w'_0+2x'_0w'_0\).
-
4.
conversions: Some sorts can be represented by encodings of different kinds. If a term has multiple possible encodings, the compiler may need to convert between them to apply some operator rule. For example, we converted \(x_2\) from an unsigned encoding to a bit-wise encoding before handling an AND.
4 Architecture
In this section, we present our field-blaster architecture. To compile a predicate \(\phi \) to a system of field equations \(\phi '\), our architecture processes each term t in \(\phi \) using a post-order traversal. Informally, it represents each t as an āencodingā in \(\phi '\): a term (or collection of terms) over variables in \(\phi '\). Each encoding is produced by a small algorithm called an āencoding ruleā.
Below, we define the type of encodings \(\textsf{Enc}\) (Sect.Ā 4.1), the five different types of encoding rules (Sect.Ā 4.2), and a calculus that iteratively applies these rules to compile all of \(\phi \) (Sect.Ā 4.3).
4.1 Encodings
TableĀ 2 presents our tagged union type \(\textsf{Enc}\) of possible term encodings. Each variant comprises the term being encoded, its tag (the encoding kind), and a sequence of field terms. The encoding kinds are \(\texttt{bit}\) (a Boolean as 0/1), \(\texttt{uint}\) (a bit-vector as an unsigned integer), \(\texttt{bits}\) (a bit-vector as a sequence of bits), and \(\texttt{field}\) (a field term trivially represented as a field term). Each encoding has an intended semantics: a condition under which the encoding is considered valid. For instance, a \(\texttt{bit}\) encoding of Boolean t is valid if the field term f is equal to \( ite (t, 1, 0)\).
4.2 Encoding Rules
An encoding rule is an algorithm that takes and/or returns encodings, in order to represent some part of the input predicate as field terms and equations.
Primitive Operations. A rule can perform two primitive operations: creating new variables and emitting assertions. In our pseudocode, the primitive function \(\textsf{fresh}(\textsf{name}, t, \mathsf {is\texttt{I}nst}) \rightarrow x'\) creates a fresh variable. Argument \(\mathsf {is\texttt{I}nst}\) is a Boolean indicating whether \(x'\) is an instance variable (as opposed to a witness). Argument t is a field term (over variables from \(\phi \) and previously defined primed variables) that expresses how to compute a value for \(x'\). For example, to create a field variable \(w'\) that represents Boolean witness variable w, a rule can call \(\textsf{fresh}(w', ite (w, 1, 0), \bot )\). The compiler uses t to help create the \(\textsf{Ext}_x\) and \(\textsf{Ext}_w\) algorithms. A rule asserts a formula \(t'\) (over primed variables) by calling \(\textsf{assert}(t')\).
Rule Types. There are five types of rules: (1) Variable rules \(\textsf{variable}(t, \mathsf {is\texttt{I}nst}) \rightarrow e\) take a variable t and its instance/witness status and return an encoding of that variable made up of fresh variables. (2) Constant rules \(\textsf{const}(t) \rightarrow e\) take a constant term t and produce an encoding of t comprising terms that depend only on t. Since t is a constant, the terms in e can be evaluated to field constants (see the calculus in Sect.Ā 4.3).Footnote 7 The \(\textsf{const}\) rule cannot call \(\textsf{fresh}\) or \(\textsf{assert}\) . (3) Equality rules \(\textsf{assertEq}(e, e')\) take two encodings of the same kind and emit assertions that equate the underlying terms. (4) Conversion rules \(\textsf{convert}(e, \textsf{kind}')\rightarrow e'\) take an encoding and convert it to an encoding of a different kind. Conversions are only non-trivial for bit-vectors, which have two encoding kinds: \(\texttt{uint}\) and \(\texttt{bits}\) . (5) Operator rules apply to terms t of form \(o(t_1, \dots , t_n)\). Each operator rule takes t, o, and encodings of the child terms \(t_i\) and returns an encoding of t. Some operator rules require specific kinds of encodings; before using such an operator rule, our calculus (Sect.Ā 4.3) calls the convert rule to ensure the input encodings are the correct kind. FigureĀ 3 gives pseudocode for the first four rule types, as applied to bit-vectors. FigureĀ 4 gives pseudocode for two bit-vector operator encoding rules. A field blaster uses many operator rules: in our case study (Sect.Ā 6) there are 46.
4.3 Calculus
We now give a non-deterministic calculus describing how our field-blaster applies rules to compile a predicate \(\phi (x,w)\) into a system of field equations.
A calculus state is a tuple of three items: (E,Ā A,Ā F). The encoding store E is a (multi-)map from terms to sets of encodings. The assertions formula A is a conjunction of all field equations asserted via \(\textsf{assert}\) . The fresh variable definitions sequence F is a sequence consisting of pairs, where each pair (v,Ā t) matches a single call to \(\textsf{fresh}(v,t,\dots )\).
FigureĀ 5 shows the transitions of our calculus. We denote the result of a rule as \(A', F', e' \leftarrow r(\dots )\), where \(A'\) is a formula capturing any new assertions, \(F'\) is a sequence of pairs capturing any new variable definitions, and \(e'\) is the ruleās return value. We may omit one or more results if they are always absent for a particular rule. For encoding store E, \(E \cup (t \mapsto e)\) denotes the store with e added to tās encoding set.
There are five kinds of transitions. The \(\textsf{Const}\) transition adds an encoding for a constant term. The \(\textsf{const}\) rule returns an encoding e whose terms depend on the constant c; \(e'\) is a new encoding identical to e, except that each of its terms has been evaluated to obtain a field constant. The \(\textsf{Var}\) transition adds an encoding for a variable term. The \(\textsf{Conv}\) transition takes a term that is already encoded and re-encodes it with a new encoding kind. The \(\textsf{kinds}\) operator returns all legal values of \(\textsf{kind}\) for encodings of a given sort. The \(\textsf{Op}_{r}\) transition applies operator rule r. This transition is only possible if rās operator kind agrees with o, and if its input encoding kinds agree with \(\vec e\). The \(\textsf{Finish}\) transition applies when \(\phi \) has been encoded. It uses \(\textsf{const}\) and \(\textsf{assertEq}\) to build assertions that hold when \(\phi = \top \). Rather than producing a new calculus state, it returns the outputs of the calculus: the assertions and the variable definitions.
To meet the requirements of the ZKP compiler, our calculus must return two extension function: \(\textsf{Ext}_x\) and \(\textsf{Ext}_w\) (Sect.Ā 2.2). Both can be constructed from the fresh variable definitions F. One subtlety is that \(\textsf{Ext}_x(x)\) (which assigns values to fresh instance variables) is a function of x onlyāit cannot depend on the witness variables of \(\phi \). We ensure this by allowing fresh instance variables to only be created by the \(\textsf{variable}\) rule, and only when it is called with \(\mathsf {is\texttt{I}nst}= \top \).
Strategy. Our calculus is non-deterministic: multiple transitions are possible in some situations; for example, some conversion is almost always applicable. The strategy that decides which transition to apply affects field blaster performance (AppendixĀ D) but not correctness.
5 Verification Conditions
In this section, we first define correctness for a ZKP compiler (Sect.Ā 5.1). Then, we give verification conditions (VCs) for each type of encoding rule (Sect.Ā 5.2). Finally, we show that if these VCs hold, our calculus is a correct ZKP compiler (Sect.Ā 5.3).
5.1 Correctness Definition
Definition 1 (Correctness)
A ZKP compiler \(\textsf{Compile}(\phi ) \rightarrow (\phi ', \textsf{Ext}_x, \textsf{Ext}_w)\) is correct if it is demonstrably complete and demonstrably sound.
-
demonstrable completeness: For all \(\textsf{x}\in \textsf{dom}(x),\textsf{w}\in \textsf{dom}(w)\) such that \(\hat{\phi }(\textsf{x},\textsf{w})=\top \),
$$ \hat{\phi '}(\textsf{Ext}_x(\textsf{x}),\textsf{Ext}_w(\textsf{x},\textsf{w})) = \top $$ -
demonstrable soundness: There exists an efficient algorithm \(\textsf{Inv}(\textsf{x}', \textsf{w}') \rightarrow \textsf{w}\) such that for all \(\textsf{x}\in \textsf{dom}(x),\textsf{w}'\in \textsf{dom}(w')\) such that \(\hat{\phi '}(\textsf{Ext}_x(\textsf{x}),\textsf{w}')=\top \),
$$ \hat{\phi }(\textsf{x}, \textsf{Inv}(\textsf{Ext}_x(\textsf{x}),\textsf{w}'))=\top $$
Demonstrable completeness (respectively, soundness) requires the existence of a witness for \(\phi '\) (resp., \(\phi \)) when a witness exists for \(\phi \) (resp., \(\phi '\)); this existence is demonstrated by an efficient algorithm \(\textsf{Ext}_w\) (resp., \(\textsf{Inv}\)) that computes the witness.
Correct ZKP compilers are important for two reasons. First, since sequential composition preserves correctness, one can prove a multi-pass compiler is correct pass-by-pass. Second, a correct ZKP compiler from \(\varPhi \) to \(\varPhi '\) can be used to generalize a ZKP for \(\varPhi '\) to one for \(\varPhi \). We prove both properties in AppendixĀ A.
Theorem 1 (Compiler Composition)
If \(\textsf{Compile}'\) and \(\textsf{Compile}''\) are correct, then the compiler \(\textsf{Compose}(\textsf{Compile}', \textsf{Compile}'')\) (AppendixĀ A) is correct.
Theorem 2 (ZKP Generalization)
(informal) Given a correct ZKP compiler \(\textsf{Compile}\) from \(\varPhi \) to \(\varPhi '\) and a ZKP for \(\varPhi '\), we can construct a ZKP for \(\varPhi \).
5.2 Rule VCs
Recall (Sect.Ā 4) that our language manipulates encodings through five types of encoding rules. We give verification conditions for each type of rule. Intuitively, these capture the correctness of each rule in isolation. Next, weāll show that they imply the correctness of a ZKP compiler that follows our calculus.
Our VCs quantify over valid encodings. That is, they have the form: āfor any valid encoding e of term t, ...ā We can quantify over an encoding e by making each \(t_i \in \textsf{terms}(e)\) a fresh variable, and quantifying over the \(t_i\). Encoding validity is captured by a predicate \( valid (e,t)\), which is defined to be the validity condition in TableĀ 2. Each VC containing encoding variables \(\textbf{e}\) implicitly represents a conjunction of instances of that VC, one for each possible tuple of kinds of \(\textbf{e}\), which is fixed for each instance. If a VC contains \( valid (e,t)\), the sort of t is constrained to be compatible with \(\textsf{kind}(e)\). For a kind and a sort to be compatible, they must occur in the same row of TableĀ 2. We define the equality predicate \( equal (e,e')\) as \(\bigwedge _i \textsf{terms}(e)[i] \approx \textsf{terms}(e')[i]\).
Encoding Uniqueness. First, we require the uniqueness of valid encodings, for any fixed encoding kind. TableĀ 3 shows the VCs that ensure this. Each row is a formula that must be valid, for all compatible encodings and terms. The first two rows ensure that there is a bijection from terms to their valid encodings (in the first row, we consider only instances for which \(\textsf{kind}(e)=\textsf{kind}(e')\)). The function \( fromTerm (t, \textsf{kind}) \rightarrow e\) maps a term and an encoding kind to a valid encoding of that kind, and the function \( toTerm (e)\rightarrow t\) maps a valid encoding to its encoded term. The third and fourth rows ensure that \( fromTerm \) and \( toTerm \) are correctly defined. We will use \( toTerm \) in our proof of calculus soundness (AppendixĀ B) and we will use \( fromTerm \) to optimize VCs for faster verification (Sect.Ā 6.1).
For an example of the \( valid \) , \( fromTerm \) , and \( toTerm \) functions, consider a Boolean b encoded as an encoding e with \(\textsf{kind}\) \(\texttt{bit}\) and whose \(\textsf{terms}\) consist of a single field element f. Validity is defined as \( valid (e, b) = f \approx ite (b, 1, 0)\), \( toTerm (f)\) is defined as \(f \approx 1\), and \( fromTerm (b, \texttt{bit})\) is \((b, \texttt{bit}, ite (b, 1, 0))\).
VCs for Encoding Rules. TableĀ 4 shows our VCs for the rules of Fig.Ā 5. For each rule application, A and F denote, respectively, the assertions and the variable declarations generated when that rule is applied. We explain some of the VCs in detail.
First, consider a rule \(r_o\) for operator o applied to inputs \(t_1, \dots , t_k\). The rule takes input encodings \(e_1, \dots , e_k\) and returns an output \(e'\). It is sound if the validity of its inputs and its assertions imply the validity of its output. It is complete if the validity of its inputs implies its assertions and the validity of its output, after substituting fresh variable definitions.
Second, consider a variable rule. Its input is a variable term t, and it returns \(e'\), a putative encoding thereof. Note that \(e'\) does not actually contain t, though the substitutions in F may bind the fresh variables of \(e'\) to functions of t. For the rule to be sound when t is a witness variable \((t \in w)\), the assertions must imply that \(e'\) is valid for some term \(t'\). For the rule to be sound when t is an instance variable \((t \in x)\), the assertions must imply that \(e'\) is valid for t, when the instance variables in \(e'\) are replaced with their definition (\(F_x\) denotes F, restricted to its declarations of instance variables).Footnote 8 For the variable rule to be complete (for an instance or a witness), the assertions and the validity of \(e'\) for t must follow from F.
Third, consider a constant rule. Its input is a constant term t, and it returns an encoding e. Recall that the terms of e are always evaluated, yielding \(e'\) which only contains constant terms. Thus, correctness depends only on the fact that e is always a valid encoding of the input t. This can be captured with a single VC.
5.3 A Correct Field-Blasting Calculus
Given rules that satisfy these verification conditions, we show that the calculus of Sect.Ā 4.3 is a correct ZKP compiler. The proof is in AppendixĀ B.
Theorem 3 (Correctness)
With rules that satisfy the conditions of Sect.Ā 5.2, the calculus of Sect.Ā 4.3 is demonstrably complete and sound (Def.Ā 1).
6 Case Study: A Verifiable Field-Blaster forĀ CirC
We implemented and partially verified a field-blaster for CirCĀ [46]. Our implementation is based on a refactoring of CirCās original field blaster to conform to our encoding rules (Sect.Ā 4.2) and consists of \(\approx \)850 lines of code (LOC).Footnote 9 As described below, we have (partially) verified our encoding rules, but trust our calculus (Sect.Ā 4.3, \(\approx \)150 LOC) and our flattening implementations (Fig.Ā 2, \(\approx \)160 LOC).
While porting rules, we found 4 bugs in CirCās original field-blaster (see AppendixĀ G), including a severe soundness bug. Given a ZKP compiled with CirC, the bug allowed a prover to incorrectly compare bit-vectors. The prover, for example, could claim that the unsigned value of 0010 is greater than or less than that of 0001. A patch to fix all 4 bugs (in the original field blaster) has been upstreamed, and we are in the process of upstreaming our new field blaster implementation into CirC.
6.1 Verification Evaluation
Our implementation constructs the VCs from Sect.Ā 5.2 and emits them as SMT-LIB (extended with a theory of finite fieldsĀ [47]). We verify them with cvc5, because it can solve formulas over bit-vectors and prime fieldsĀ [47]. The verification is partial in that it is bounded in two ways. We set \(b \in \mathbb {N}\) to be the maximum bit-width of any bit-vector and \(a \in \mathbb {N}\) to be the maximum number of arguments to any n-ary operator. In our evaluation, we used \(a=4\) and \(b=4\). These bounds are small, but they were sufficient to find the bugs mentioned above.
Optimizing Completeness VCs. Generally, cvc5 verifies soundness VCs more quickly than completeness VCs. This is surprising at first glance. To see why, consider the soundness (S) and completeness (C) conditions for a conversion rule from e to \(e'\) that generates assertions A and definitions F:
In both, t is a variable, e contains variables, and there are variables in \(e'\) and A that are defined by F. In C, though, some variables are replaced by their definitions in Fāwhich makes the number of variables (and thus the search space)āseem smaller for C than S. Yet, cvc5 is slower on C.
The problem is that, while the field operations in A are standard (e.g., \(+\), \(\times \), and \(=\)), the definitions in F use a CirC-IR operator that (once embedded into SMT-LIB) is hard for cvc5 to reason about. That operator, (ff2bv b), takes a prime field element x and returns a bit-vector v. If xās integer representative is less than \(2^b\), then vās unsigned value is equal to x; otherwise, v is zero.
The ff2bv operator is trivial to evaluate but hard to embed. cvc5ās SMT-LIB extension for prime fields only supports \(+\), \(\times \) and \(=\), so no operator can directly relate x to v. Instead, we encode the relationship through b Booleans that represent the bits of v. To test whether \(x < 2^b\), we use the polynomial \(f(x) = \prod _{i=0}^{2^b-1}(x - i)\), which is zero only on \([0,2^b-1]\). The bit-splitting essentially forces cvc5 to guess vās value; further, fās high degree slows down the Grƶbner basis computations that form the foundation of cvc5ās field solver.
To optimize verification of the completeness VCs, we reason about CirC-IR directly. First, we use the uniqueness of valid encodings and the \( fromTerm \) function. Since the VC assumes \( valid (e, t)\), we know e is equal to \( fromTerm (t, \textsf{kind}(e))\). We use this equality to eliminate e from the completeness VC, leaving:
Since F defines all variables in A and \(e'\), the only variable after substitution is t. So, when t is a Boolean or small bit-vector, an exhaustive search is very effective;Footnote 10 we implemented such a solver in 56 LOC, using CirCās IR as a library.
For soundness VCs, this approach is less effective. The \( fromTerm \) substitution still applies, but if F introduces fresh field variables, they are not eliminated and thus, the final formula contains field variables, so exhaustion is infeasible.
Verification Results. We ran our VC verification on machines with Intel Xeon E5-2637 v4 CPUs.Footnote 11 Each attempt is limited to one physical core, 8GB memory, and 30Ā min. FigureĀ 6 shows the number of VCs verified by cvc5 and our exhaustive solver. As expected, the exhaustive solver is effective on completeness VCs for Boolean and bit-vector rules, but ineffective on soundness VCs for rules that introduce fresh field variables. There are four VCs that neither solver verifies within 30Ā min: bvadd with (\(b = 4\), \(a = 4\)), and bvmul with (\(b = 3\), \(a = 4\)) and (\(b = 4\), \(a \ge 3\)). Most other VCs verify instantly. In AppendixĀ E, we analyze how VC verification time depends on a and b.
6.2 Performance andĀ Output Quality Evaluation
We compare CirC with our field-baster (āVerifiedā) against CirC with its original field-blaster (āUnverifiedā)Footnote 12 on three metrics: compiler runtime, memory usage, and the final R1CS constraint count. Our benchmark set is the standard library for CirCās Z# input language (which extends ZoKratesĀ [16, 68] v0.6.2). Our testbed runs Linux with 32GB memory and an AMD Ryzen 2700.
There is no difference in constraints, but the verified field-blaster slightly improves compiler performance: ā8% time and ā2% memory (Fig.Ā 7). We think that the small improvement is unrelated to the fact that the new field blaster is verified. In AppendixĀ E, we discuss compiler performance further.
7 Discussion
In this work, we present the first automatically verifiable field-blaster. We view the field-blaster as a set of rules; if some (automatically verifiable) conditions hold for each rule, then the field-blaster is correct. We implemented a performant and partially verified field-blaster for CirC, finding 4 bugs along the way.
Our approach has limitations. First, we require the field-blaster to be written as a set of encoding rules. Second, we only verify our rules for bit-vectors of bounded size and operators of bounded arity. Third, we assume that each rule is a pure function: for example, it doesnāt return different results depending on the time. Future work might avoid the last two limitations through bit-width-independent reasoning Ā [42, 43, 67] and a DSL (and compiler) for encoding rules. It would also be interesting to extend our approach to: a ZKP with a non-prime fieldĀ [7, 13], a compiler IR with partial or non-deterministic semantics, or a compiler with correctness that depends on computational assumptions.
Notes
- 1.
Roughly speaking, a ZK proof system is complete if it is possible to prove every true statement, and is sound if it is infeasible to prove false ones.
- 2.
Automated verification generally leverages solvers. This is a particularly appealing approach in our setting, since CirC (our compiler infrastructure of interest) already supports compilation to SMT formulas.
- 3.
For technical reasons, the runtime of \(\textsf{Ext}_x\) and the size of its description must be \(\textsf{poly}(\lambda ,|x|)\)ānot just \(\textsf{poly}(\lambda )\) (AppendixĀ A). .
- 4.
We list all CirC-IR operators for Booleans, bit-vectors, and prime fields in AppendixĀ C. Almost all are from SMT-LIB.
- 5.
Later (Sect.Ā 5), we will see that āwell-formednessā constraints like this are unnecessary for instance variables, such as \(x_0\). .
- 6.
We represent \(w_1\) bit-wise so that we can ensure the representation is well-formed with constraints \(w'_{1,i}(w'_{1,i}-1)\approx 0\). As previously noted, such well-formedness constraints are not needed for an instance variable like \(x_1\).(See footnote 5).
- 7.
Having \(\textsf{const}(t)\) return terms that depend on t (rather than directly returning constants) is useful for constructing verification conditions for \(\textsf{const}\) .
- 8.
The different soundness conditions for instance and witness variables play a key role in the proof of TheoremĀ 3. Essentially: since the condition for instances replaces variables with their definitions, the validity of the encodings of instance variables need not be explicitly enforced in A. This is why some constraints could be omitted in our field-blasting example.(See footnote 5).
- 9.
Our implementation is in Rust, as is CirC.
- 10.
So long as the exhaustive solver reasons directly about all CirC-IR operators.
- 11.
We omit the completeness VCs for ff2bv. See AppendixĀ C.
- 12.
After fixing the bugs we found. See Sect.Ā 6.
References
LLVM language reference manual. https://llvm.org/docs/LangRef.html
Monero technical specs. https://monerodocs.org/technical-specs/ (2022)
Airscript. https://github.com/0xPolygonMiden/air-script
Angel, S., Blumberg, A.J., Ioannidis, E., Woods, J.: Efficient representation of numerical optimization problems for SNARKs. In: USENIX Security (2022)
BellƩs-MuƱoz, M., Isabel, M., MuƱoz-Tapia, J.L., Rubio, A., Baylina, J.: Circom: a circuit description language for building zero-knowledge applications. IEEE Transactions on Dependable and Secure Computing (2022)
Bellman. https://github.com/zkcrypto/bellman
Ben-Sasson, E., Bentov, I., Horesh, Y., Riabzev, M.: Scalable zero knowledge with no trusted setup. In: CRYPTO (2019)
Bertot, Y., CastĆ©ran, P.: Interactive theorem proving and program development: CoqāArt: the calculus of inductive constructions. Springer, Heidelberg (2013). https://doi.org/10.1007/978-3-662-07964-5
Blum, M., Feldman, P., Micali, S.: Non-interactive zero-knowledge and its applications. In: STOC (1988)
Brown, F., Renner, J., Nƶtzli, A., Lerner, S., Shacham, H., Stefan, D.: Towards a verified range analysis for JavaScript JITs. In: PLDI (2020)
Campanelli, M., Gennaro, R., Goldfeder, S., Nizzardo, L.: Zero-knowledge contingent payments revisited: attacks and payments for services. In: CCS (2017)
Chen, E., Zhu, J., Ozdemir, A., Wahby, R.S., Brown, F., Zheng, W.: Silph: a framework for scalable and accurate generation of hybrid MPC protocols (2023)
Chiesa, A., Hu, Y., Maller, M., Mishra, P., Vesely, N., Ward, N.: Marlin: preprocessing zkSNARKs with universal and updatable SRS. In: EUROCRYPT (2020)
Chin, C., Wu, H., Chu, R., Coglio, A., McCarthy, E., Smith, E.: Leo: a programming language for formally verified, zero-knowledge applications (2021). https://ia.cr/2021/651
Cowan, M., Dangwal, D., Alaghi, A., Trippel, C., Lee, V.T., Reagen, B.: Porcupine: a synthesizing compiler for vectorized homomorphic encryption. In: PLDI (2021)
Eberhardt, J., Tai, S.: ZoKratesāscalable privacy-preserving off-chain computations. In: IEEE Blockchain (2018)
Enderton, H.B.: A mathematical introduction to logic. Elsevier (2001)
Fournet, C., Keller, C., Laporte, V.: A certified compiler for verifiable computing. In: CSF (2016)
Fox, A., Myreen, M.O., Tan, Y.K., Kumar, R.: Verified compilation of CakeML to multiple machine-code targets. In: CPP (2017)
Frankle, J., Park, S., Shaar, D., Goldwasser, S., Weitzner, D.: Practical accountability of secret processes. In: USENIX Security (2018)
Goldberg, L., Papini, S., Riabzev, M.: Cairo - a Turing-complete STARK-friendly CPU architecture (2021). https://ia.cr/2021/0163
Goldwasser, S., Micali, S., Rackoff, C.: The knowledge complexity of interactive proof-systems. In: STOC (1985)
Grubbs, P., Arun, A., Zhang, Y., Bonneau, J., Walfish, M.: Zero-knowledge middleboxes. In: USENIX Security (2022)
Hopwood, D., Bowe, S., Hornby, T., Wilcox, N.: Zcash protocol specification. https://raw.githubusercontent.com/zcash/zips/master/protocol/protocol.pdf (2016)
Jiang, K., Chait-Roth, D., DeStefano, Z., Walfish, M., Wies, T.: Less is more: refinement proofs for probabilistic proofs. IEEE S &P (2023)
Kamara, S., Moataz, T., Park, A., Qin, L.: A decentralized and encrypted national gun registry. In: IEEE S &P (2021)
Kaufmann, M., Manolios, P., Moore, J.S.: Computer-aided reasoning: ACL2 case studies, vol. 4. Springer, NY (2013). https://doi.org/10.1007/978-1-4757-3188-0
Kosba, A., Papadopoulos, D., Papamanthou, C., Song, D.: MIRAGE: succinct arguments for randomized algorithms with applications to universal zk-SNARKs. In: USENIX Security (2020)
Kosba, A., Papamanthou, C., Shi, E.: xJsnark: A framework for efficient verifiable computation. In: IEEE S &P (2018)
Kothapalli, A., Parno, B.: Algebraic reductions of knowledge (2022). https://ia.cr/2022/009
Kumar, R., Myreen, M.O., Norrish, M., Owens, S.: CakeML: A verified implementation of ML. In: POPL (2014)
Kundu, S., Tatlock, Z., Lerner, S.: Proving optimizations correct using parameterized program equivalence. In: PLDI (2009)
Lattner, C., Adve, V.: LLVM: a compilation framework for lifelong program analysis & transformation. In: CGO (2004)
Lerner, S., Millstein, T., Chambers, C.: Automatically proving the correctness of compiler optimizations. In: PLDI (2003)
Lerner, S., Millstein, T., Rice, E., Chambers, C.: Automated soundness proofs for dataflow analyses and transformations via local rules. In: POPL (2005)
Leroy, X.: Formal verification of a realistic compiler. Commun. ACM 52(7), 107ā115 (2009)
Leroy, X.: A formally verified compiler back-end. J. Autom. Reason. 43(4), 363ā446 (2009)
Lopes, N.P., Lee, J., Hur, C.K., Liu, Z., Regehr, J.: Alive2: bounded translation validation for LLVM. In: PLDI (2021)
Lopes, N.P., Menendez, D., Nagarakatte, S., Regehr, J.: Provably correct peephole optimizations with Alive. In: PLDI (2015)
Mullen, E., Zuniga, D., Tatlock, Z., Grossman, D.: Verified peephole optimizations for CompCert. In: PLDI (2016)
Necula, G.C.: Translation validation for an optimizing compiler. In: PLDI (2000)
Niemetz, A., Preiner, M., Reynolds, A., Zohar, Y., Barrett, C., Tinelli, C.: Towards bit-width-independent proofs in SMT solvers. In: CADE (2019)
Niemetz, A., Preiner, M., Reynolds, A., Zohar, Y., Barrett, C., Tinelli, C.: Towards satisfiability modulo parametric bit-vectors. J. Autom. Reason. 65(7), 1001ā1025 (2021)
Nipkow, T., Wenzel, M., Paulson, L.C.: Isabelle/HOL: a proof assistant for higher-order logic. Springer, Heidelberg (2002). https://doi.org/10.1007/3-540-45949-9
Ozdemir, A., Brown, F., Wahby, R.S.: CirC: Compiler infrastructure for proof systems, software verification, and more. In: IEEE S &P (2022)
Ozdemir, A., Kremer, G., Tinelli, C., Barrett, C.: Satisfiability modulo finite fields. In: submission (2022). https://ia.cr/2023/091
Ozdemir, A., Wahby, R., Whitehat, B., Boneh, D.: Scaling verifiable computation using efficient set accumulators. In: USENIX Security (2020)
Ozdemir, A., Wahby, R.S., Brown, F., Barrett, C.: Bounded verification for finite-field-blasting. Cryptology ePrint Archive (2023) (Full version)
Parno, B., Howell, J., Gentry, C., Raykova, M.: Pinocchio: nearly practical verifiable computation. Commun. ACM 59(2), 103ā112 (2016)
Pnueli, A., Siegel, M., Singerman, E.: Translation validation. In: TACAS (1998)
Ranise, S., Tinelli, C., Barrett, C.: SMT fixed size bit-vectors theory. https://smtlib.cs.uiowa.edu/theories-FixedSizeBitVectors.shtml (2017)
Sasson, E.B., Chiesa, A., Garman, C., Green, M., Miers, I., Tromer, E., Virza, M.: Zerocash: decentralized anonymous payments from Bitcoin. In: IEEE S &P (2014)
Setty, S., Braun, B., Vu, V., Blumberg, A.J., Parno, B., Walfish, M.: Resolving the conflict between generality and plausibility in verified computation. In: EuroSys (2013)
Stewart, G., Beringer, L., Cuellar, S., Appel, A.W.: Compositional CompCert. In: POPL (2015)
Tan, Y.K., Myreen, M.O., Kumar, R., Fox, A., Owens, S., Norrish, M.: The verified CakeML compiler backend. J. Funct. Programm. 29, E2 (2019)
Tao, R., et al.: Giallar: push-button verification for the Qiskit quantum compiler. In: PLDI (2022)
Thaler, J.: Proofs, Arguments, and Zero-Knowledge. Manuscript (2022)
The Qiskit authors and maintainers: Qiskit: an open-source framework for quantum computing (2021). https://doi.org/10.5281/zenodo.2573505. The Qiskit maintainers request that the full list of Qiskit contributors be included in any citation. Regretfully, we cannot comply, as the list is two pages long
Tinelli, C.: SMT core theory. https://smtlib.cs.uiowa.edu/theories-Core.shtml (2015)
Viand, A., Jattke, P., Hithnawi, A.: SoK: fully homomorphic encryption compilers. In: IEEE S &P (2021)
Wahby, R.S., Setty, S., Howald, M., Ren, Z., Blumberg, A.J., Walfish, M.: Efficient RAM and control flow in verifiable outsourced computation. In: NDSS (2015)
Walfish, M., Blumberg, A.J.: Verifying computations without reexecuting them. Commun. ACM 58(2), 74ā84 (2015)
Wang, F.: Ecne: automated verification of ZK circuits (2022). https://0xparc.org/blog/ecne
Zohar, Y., et al.: Bit-precise reasoning via Int-blasting. In: CADE (2022)
ZoKrates. https://zokrates.github.io/
Acknowledgements
We appreciate the help and guidance of Andres Nƶtzli, Dan Boneh, and Evan Laufer.
This material is in part based upon work supported by the DARPA SIEVE program and the Simons foundation. Any opinions, findings, and conclusions or recommendations expressed in this report are those of the author(s) and do not necessarily reflect the views of DARPA. It is also funded in part by NSF grant number 2110397 and the Stanford Center for Automated Reasoning.
Author information
Authors and Affiliations
Corresponding author
Editor information
Editors and Affiliations
Appendices
A Zero-Knowledge Proofs and Compilers
This appendix is available in the full version of the paperĀ [49].
B Compiler Correctness Proofs
This appendix is available in the full version of the paperĀ [49].
C CirC-IR
This appendix is available in the full version of the paperĀ [49].
D Optimizations toĀ theĀ CirC Field-Blaster
This appendix is available in the full version of the paperĀ [49].
E Verified Field-Blaster Performance Details
This appendix is available in the full version of the paperĀ [49].
F Verifier Performance Details
This appendix is available in the full version of the paperĀ [49].
G Bugs Found inĀ theĀ CirC Field Blaster
This appendix is available in the full version of the paperĀ [49].
Rights and permissions
Open Access This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.
The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.
Copyright information
Ā© 2023 The Author(s)
About this paper
Cite this paper
Ozdemir, A., Wahby, R.S., Brown, F., Barrett, C. (2023). Bounded Verification forĀ Finite-Field-Blasting. In: Enea, C., Lal, A. (eds) Computer Aided Verification. CAV 2023. Lecture Notes in Computer Science, vol 13966. Springer, Cham. https://doi.org/10.1007/978-3-031-37709-9_8
Download citation
DOI: https://doi.org/10.1007/978-3-031-37709-9_8
Published:
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-37708-2
Online ISBN: 978-3-031-37709-9
eBook Packages: Computer ScienceComputer Science (R0)