Introduction

Distributed ledger technology (DLT) and blockchain have been the subject of intense research in recent years, because they allow parties that do not fully trust each other to reach consensus without the need for a trusted third party. However, in public and permissionless ledgers, transactions can be viewed by everyone in the network. This is a hindrance that must be overcome if those transactions contain privacy-sensitive information.

In order to protect private information, one possible approach is to use a Trusted Execution Environment (TEE), such as Intel SGX [39]. The idea is that any private data must appear in the blockchain in encrypted form. Only the owners of the underlying cryptographic keys are able to decrypt it. Validation of this information must be done inside the TEE, where the cryptographic keys can be embedded. Therefore, private data is only visible after decryption, which occurs inside a controlled environment. Put differently, a TEE offers protection against information leakage by restricting manipulation of private data to a region of memory that cannot be accessed by other processes on the same machine, or even by its administrator. Nevertheless, attacks on SGX have been proposed in the literature [21, 58], showing that this technology is vulnerable to branch prediction and side-channel attacks, respectively.

A different approach to securing private data is the zero knowledge proof (ZKP), a cryptographic technique that has been used to provide privacy by design in the context of DLT and blockchain. In short, ZKP allows an entity called the prover to convince another party, called the verifier, that a given statement is true without revealing more information than strictly necessary to convince her.

In previous works [40, 48] ING described some preliminary results. The purpose of this work is to extend them in order to provide a complete survey of Zero Knowledge Range Proof (ZKRP) protocols.

In summary, ZKRP allows one to prove that a secret integer belongs to a certain interval. For example, if we define this interval to be all integers between 18 and 200, a person can use a ZKRP scheme to prove that she is over 18. This gives her permission, according to some regulation, to consume a given service without revealing her exact age. In the context of payment systems, if party A wants to transfer money to party B, then ZKRP can be used to prove that the amount of money in the transaction is positive. Otherwise, if the amount were negative, the transaction would in fact transfer money in the opposite direction, i.e. from B to A.

In the following sections we describe in detail the algorithms necessary to implement ZKRP and instantiate the underlying parameters in order to obtain an appropriate level of security. We also compare the different schemes with regard to proof size and the complexity of the prover and verifier algorithms.

Contributions

There are many surveys about zero knowledge proofs, but they mostly address the theoretical foundations of the proposed cryptographic constructions. The main goal of this survey is to bridge the gap between papers aimed at the cryptographic community and developers who are more focused on implementation aspects. There is currently a joint effort by academia and industry to produce a standard for implementing ZKPs [14]. In particular, some ZK gadgets were identified as important building blocks for the construction of solutions to more complex problems. Among the ZK gadgets that were discussed, we highlight ZK Range Proofs, ZK Set Membership and cryptographic accumulators. Although those primitives are related to each other, the focus of this work is on ZK Range Proofs. In Sect. 7 we discuss these related works and give references for the interested reader.

Next we summarize the main contributions of this work:

  • Survey possible use cases for ZKRP and indicate which papers in the literature present important contributions for the construction of efficient solutions to these use cases.

  • Describe in detail the algorithms required for representative constructions of ZKRP. In particular, we describe how the Fiat–Shamir heuristic must be implemented in order to obtain non-interactive protocols.

  • Update the survey presented in the work by Canard et al. [13] with newer proposals.

  • Present our open source implementation [49] of the algorithms detailed in Sect. 4.

Organization

In Sect. 2 we describe possible use cases for ZKRPs. In Sect. 3 we give fundamental results that are important to understand the rest of the document. In Sect. 4 we describe in detail how to implement ZKRP using different strategies. In Sect. 5 we describe our implementation, while in Sect. 6 we compare the schemes with respect to proof size, prover and verifier complexities. In Sect. 7 we discuss related work and give some final remarks.

Applications

In order to give the reader a motivation to investigate further on Zero Knowledge Proofs, we present in this section some interesting applications.

  • Over 18. ZKRP can be used to prove someone is over 18 without revealing her exact age. Thus it is possible to allow the person to consume some service without requiring her to show paper documents, which contain more information than necessary for the age validation. In this situation it is important to have a trusted party generate a commitment, as described in subsection 3.2, which attests that the information contained in it is correct. The person cannot generate the commitment by herself, because a malicious user could use fake data to prove the desired statement even though the real data does not satisfy that property.

  • Know Your Customer (KYC). As explained above, ZKRP allows one to validate that a given piece of private information belongs to a numeric interval. This property may be used to ensure compliance while preserving a client’s privacy. For example, an interesting use case is so-called anonymous credentials, where a trusted party attests that a user credential contains attributes whose values are correct, making it possible to prove certain properties in a zero knowledge way.

  • Mortgage risk assessment. It is possible to prove that the salary of an individual is above some threshold in order to get a mortgage approved. In general, threshold validation is a key verification that must be performed in financial risk assessment. Therefore ZKRP turns out to be very important for financial institutions.

  • Rating and investment grading. The problem of rating companies according to their level of productivity or financial health can be modeled by determining a partition of a numeric interval, given by a sequence of increasing numbers \(A_0, A_1, \ldots , A_k\), such that the highest score is attributed to companies rated above \(A_k\) (or sometimes below \(A_0\)). The company’s health is measured to obtain a value x, and the resulting grade depends upon which sub-interval x belongs to. Therefore, it is necessary to verify if \(x \in [A_i, A_{i+1})\) for each \(0 \le i \le k\). As far as we know there is no research applying ZKRP to this specific problem, where more efficient constructions might exist when compared to the straightforward solution of using ZKRP \(k+1\) times.

  • Electronic voting. This is an important topic that has attracted the attention of many researchers in recent years. Different solutions [1, 26, 28, 36] were proposed for different types of elections. Some solutions are based on zero knowledge proofs, like ZKRP, proof of shuffling, proof of decryption and other related techniques, while others use different cryptographic primitives, like homomorphic threshold encryption and Multi-Party Computation (MPC).

  • Electronic auctions and procurement. Secure electronic auctions have been a focus of research for a long time [44], and they are an important motivation for the study of ZKRPs, since ZKRP is one of the main cryptographic techniques that can be used to construct secure auction protocols. In particular, we highlight the secure constructions [47, 50, 54] proposed for Vickrey auctions, where the winner pays the second highest bid. A complementary problem to electronic auctions is procurement, where parties compete to offer the lowest price. According to the World Bank report [59], the volume of bribes in public sector procurement is roughly US$200 billion per year.

The applications described above are general purpose, but could also be interesting in the context of DLT and blockchain technology. Next we focus on applications that are important in the specific scenario of DLT and blockchain:

  • Confidential Transactions and Mimblewimble. In 2016, Maxwell [46] proposed Confidential Transactions (CT), which use Pedersen commitments [51] to hide transaction amounts. Instead of publishing the amounts being spent in the clear, each party uses the commitment scheme to hide the amount, which makes it infeasible for an adversary to obtain any information about transaction denominations. Since a Pedersen commitment is homomorphic, transaction outputs can be added up without opening the underlying commitments. Also, the commitment can be used to generate a ZKRP, which is sufficient to validate that a transaction is correct. For instance, it is necessary to show that the amount lies in the interval \([0, 2^n)\), where \(2^n\) is considerably smaller than the size of the underlying group used to construct the Pedersen commitment, ensuring there is no overflow, and \(2^n\) is big enough to deal with every possible valid denomination.

    However, the usage of ZKRP would make transactions too large. Namely, CT with just two outputs and 32 bits of precision would require a ZKRP of roughly 5 KB, leading to transactions whose total size is 5.4 KB. Thus, the ZKRP would account for almost 93% of the transaction size. Therefore, in order to use CT in Bitcoin, we would need 160 GB for ZKRPs alone. If Bulletproofs were used in place of the underlying range proof used in CT, this requirement would be reduced to only 17 GB.

    Mimblewimble [52] is an optimization to CT that can make the size of the ledger even smaller, by aggregating and compressing transactions in such a way that avoids the necessity to download old and unspent transactions outputs.

  • Provisions. Provisions [23] is a protocol that allows a Bitcoin exchange to prove it is solvent, by showing that each account has a positive balance, and also showing that the exchange holds an amount of funds that is larger than or equal to the sum of all individual account balances in the system. The challenge here is to calculate a single zero knowledge proof based on the information provided by different participants. This is difficult because each individual balance is encrypted using distinct keys, thus combining them is not straightforward and requires MPC. Bulletproofs has an MPC protocol that solves this problem efficiently. For instance, if we consider a cryptocurrency exchange with 2 million clients, the current implementation of Provisions requires 62 MB of ZKRPs. However, using Bulletproofs this number can be reduced to less than 2 KB, which corresponds to an optimization factor of 300.
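To make the role of the range proof in Confidential Transactions concrete, the following minimal Python sketch shows that homomorphic commitments alone cannot rule out negative amounts. The group parameters `P`, `Q`, `G`, `H`, the blinding values and the helper names are assumptions chosen for illustration; the group is far too small to be secure, and the range check is done in the clear, whereas a ZKRP would establish the same fact without revealing the amount.

```python
# Toy parameters (illustrative assumptions, NOT secure): P = 2*Q + 1 is a safe
# prime and G, H are squares modulo P, so both generate the subgroup of order Q.
P, Q = 2039, 1019
G, H = 4, 9

def commit(value: int, blinding: int) -> int:
    """Pedersen-style commitment g^value * h^blinding in the toy group."""
    return (pow(G, value % Q, P) * pow(H, blinding % Q, P)) % P

def in_range(value: int, n_bits: int) -> bool:
    """The statement a range proof would establish, checked here in the clear."""
    return 0 <= value < 2 ** n_bits

# One input worth 10 is spent into two outputs: 11 and "-1".
inp = commit(10, 700)
out_a = commit(11, 300)
out_b = commit(-1, 400)  # -1 is silently reduced to Q - 1 in the exponent

# The homomorphic balance check passes, since 11 + (-1) = 10 and 300 + 400 = 700.
balance_ok = (out_a * out_b) % P == inp

# Only a range check on each output exposes the wrapped-around negative amount.
outputs_ok = in_range(11, 8) and in_range(-1 % Q, 8)
print(balance_ok, outputs_ok)
```

The sketch uses an 8-bit range so that \(2^n\) stays well below the toy group order, mirroring the requirement that \(2^n\) be considerably smaller than the size of the underlying group.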

Fundamentals

In this section we define commitment schemes, zero knowledge proofs and other important components that are necessary to understand this work. The purpose of this section is not to present very formal definitions. For those, the reader can consult Goldreich’s book [33].

Notation. The notation \(x \in _R S\) is used when variable x is set to a random element of set S. We use the notation of Camenisch and Stadler [10] for proofs of knowledge:

$$\begin{aligned} {\mathrm {PK}}\{(\delta , \gamma ) : y = g^\delta h^\gamma \wedge (u \le \delta \le v)\}, \end{aligned}$$

which denotes a proof of knowledge of integers \(\delta\) and \(\gamma\) such that \(y = g^\delta h^\gamma\) and \(u \le \delta \le v\). In other words, this notation means that y is the commitment to the secret value \(\delta\), which is contained in the interval \([u, v]\). Greek letters are used to denote values that must be known only to the prover. For instance, \(\delta\) is her private data, while \(\gamma\) is a random value that is used to hide \(\delta\).

Finally, we use the notation \(x {\mathop {=}\limits ^{?}}y\) to denote a check of whether x is equal to y.

Assumptions

The constructions presented in this paper are based on the assumptions described in this section.

The strong RSA assumption first appeared in the work of Fujisaki and Okamoto [31]. It is a stronger assumption than the conventional RSA assumption, because any adversary who can break the RSA assumption would also be able to break the strong RSA assumption. In [22] it is shown how to replace the strong RSA assumption by the standard RSA assumption in many ZKP applications, including ZKRP.

Definition 1

(RSA assumption) Given an RSA modulus n, an RSA exponent e and an element \(y \in {\mathbb {Z}}_n^\star\), it is infeasible to find an integer x such that \(y = x^e \pmod {n}\).

Definition 2

(Strong RSA assumption) Given an RSA modulus n and an element \(y \in {\mathbb {Z}}_n^\star\), it is infeasible to find integers \(e \ne \pm 1\) and x, such that \(y = x^e \pmod {n}\).

Definition 3

(Discrete Logarithm assumption) Given a group \({\mathbb {G}}\) of prime order q, a generator \(g \in {\mathbb {G}}\) and an arbitrary element \(y \in {\mathbb {G}}\), it is infeasible to find \(x \in {\mathbb {Z}}_q\) such that \(y = g^x\).

Definition 4

(q-Strong Diffie-Hellman assumption) Given groups \({\mathbb {G}}_1\) and \({\mathbb {G}}_T\), associated with a secure bilinear pairing map e; given a generator \(g \in {\mathbb {G}}_1\) and powers \(g^x, \ldots , g^{x^q}\), for \(x \in _R {\mathbb {Z}}_p\), it is infeasible for an adversary to output \((c, g^{1/(x+c)})\), where \(c \in {\mathbb {Z}}_p\).

It is important to remark that these assumptions no longer hold if quantum computers come into existence. Therefore, research on quantum-resistant ZKPs is a very important subject.

Commitment

In short, a cryptographic commitment allows someone to compute a value that hides some message without ambiguity, in the sense that no one will later be able to argue that this value corresponds to a different message. In other words, given the impossibility of changing the hidden message, we say that the user committed to that message. The purpose of using a commitment scheme is to allow a prover to compute zero knowledge proofs where the hidden message is the underlying witness w.

Definition 5

A commitment scheme is defined by algorithms \({\mathrm {Commit}}\) and \({\mathrm {Open}}\) as follows:

  • \(c = {\mathrm {Commit}}(m, r)\). Given a message m and randomness r, compute as output a value c that, informally, hides message m and such that it is hard to compute message \(m'\) and randomness \(r'\) that satisfies \({\mathrm {Commit}}(m', r') = {\mathrm {Commit}}(m, r)\). In particular, it is hard to invert function \({\mathrm {Commit}}\) to find m or r.

  • \(b = {\mathrm {Open}}(c, m, r)\). Given a commitment c, a message m and randomness r, the algorithm returns true if and only if \(c = {\mathrm {Commit}}(m, r)\).

A commitment scheme has 2 properties:

  • Binding. Given a commitment c, it is hard to compute a different pair of message and randomness whose commitment is c. This property guarantees that there is no ambiguity in the commitment scheme, and thus after c is published it is hard to open it to a different value.

  • Hiding. It is hard to compute any information about m given c.

A well known commitment scheme is the Pedersen commitment [51]. Given a group \({\mathbb {G}}\) of prime order p in which the discrete logarithm problem is infeasible, and generators g and h of \({\mathbb {G}}\), the commitment is computed as follows:

$$\begin{aligned} c = {\mathrm {Commit}}(m, r) = g^m h^r. \end{aligned}$$

In order to open this commitment, given message m and randomness r, we simply recompute it and compare with c. An interesting property is that Pedersen commitment is homomorphic. Namely, we have that for arbitrary messages \(m_1\) and \(m_2\) and randomness \(r_1\) and \(r_2\), such that \(c_i = {\mathrm {Commit}}(m_i, r_i)\) for \(i \in \{1,2\}\), then

$$\begin{aligned} c_1\cdot c_2 = {\mathrm {Commit}}(m_1 + m_2, r_1 + r_2). \end{aligned}$$

Pedersen commitment is commonly implemented using groups over elliptic curves. Also, it is important to remark that if the discrete logarithm of h with respect to g is known, then it is easy to generate \(m'\) and \(r'\) such that \({\mathrm {Commit}}(m', r') = {\mathrm {Commit}}(m, r)\), breaking the binding property. Thus in order to generate h securely, we must use a hash function that maps binary public strings to elliptic curve points [6].
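The properties discussed above can be checked on a toy instance. The following Python sketch is an illustration under stated assumptions: it works in a small subgroup of \({\mathbb {Z}}_P^\star\) rather than on an elliptic curve, and the parameters `P`, `Q`, `G`, `H` and all function names are hypothetical choices made here. Because the group is tiny, the discrete logarithm of h with respect to g can be found by brute force, which lets the sketch demonstrate why knowing it breaks the binding property.

```python
import secrets

# Toy parameters (illustrative assumptions, NOT secure): P = 2*Q + 1 is a safe
# prime and G, H are squares modulo P, so both generate the subgroup of order Q.
P, Q = 2039, 1019
G, H = 4, 9

def commit(m: int, r: int) -> int:
    """Commit(m, r) = g^m * h^r."""
    return (pow(G, m % Q, P) * pow(H, r % Q, P)) % P

def open_commitment(c: int, m: int, r: int) -> bool:
    """Open(c, m, r): recompute the commitment and compare it with c."""
    return c == commit(m, r)

m1, r1 = 25, secrets.randbelow(Q)
m2, r2 = 17, secrets.randbelow(Q)
c1, c2 = commit(m1, r1), commit(m2, r2)

# Homomorphic property: c1 * c2 = Commit(m1 + m2, r1 + r2).
homomorphic_ok = (c1 * c2) % P == commit(m1 + m2, r1 + r2)

# Breaking binding when x = log_g(h) is known. Brute force works only because
# the group is tiny. Since c1 = g^(m1 + x*r1), any (m', r') with
# m' + x*r' = m1 + x*r1 (mod Q) opens the same commitment.
x = next(k for k in range(1, Q) if pow(G, k, P) == H)
m_fake = 99
r_fake = (r1 + (m1 - m_fake) * pow(x, -1, Q)) % Q  # modular inverse, Python 3.8+
binding_broken = open_commitment(c1, m_fake, r_fake)
print(homomorphic_ok, binding_broken)
```

In a real deployment h would be derived by hashing a public string to a group element, so that nobody, including the party that generates the parameters, learns `x`.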

Another commitment scheme that will be required later in this document is the Fujisaki-Okamoto commitment [31]. The formula to calculate the commitment itself is the same as in the Pedersen commitment, namely \(g^m h^r\). The difference is the underlying group, which for Fujisaki-Okamoto is an RSA group \({\mathbb {Z}}_n^\star\), where \(n=pq\) and p and q are safe primes, which means that \((p-1)/2\) and \((q-1)/2\) are also prime numbers. Also, the domain over which the randomness r is chosen is different, because the Fujisaki-Okamoto commitment requires \(r \in [-2^{s}n+1,2^s n - 1]\), with s chosen in such a manner that \(2^{-s}\) is negligible. Interestingly, in the original paper [31] Fujisaki and Okamoto propose an interactive protocol for Zero Knowledge Range Proofs, but unfortunately its performance is not good enough for practical usage.

Zero knowledge proofs

Zero knowledge proofs (ZKP) were proposed in 1989 by Goldwasser, Micali and Rackoff [34]. Using this kind of cryptographic primitive it is possible to show that some statement is true about secret data, without revealing any other information about the secret beyond this statement. Since then, ZKP has become an important field of research, because it provides a new characterization of the complexity class NP, using so-called interactive proofs, and also because it is very useful for constructing many cryptographic primitives. Given an element x of a language \({\mathcal {L}} \in NP\), an entity called the prover is able to convince a verifier that x indeed belongs to \({\mathcal {L}}\), i.e. there exists a witness w for x. In particular we are interested in proofs of knowledge (PoK), where the prover not only convinces the verifier that some witness exists, but also shows that the prover in fact knows a specific witness w. A desirable characteristic of such proof systems is succinctness, informally meaning that the proof size is small and thus can be verified efficiently. Such constructions are called zk-SNARKs [37]. However, although asymptotically good, zk-SNARKs still have some limitations, and for specific problems it turns out that different approaches achieve better performance.

Nowadays ZKP is being used to provide privacy to DLT and blockchain. For instance, it makes it possible to design private payment systems. In summary, we would like to permit parties to transfer digital money while hiding not only their identities but also the amount being transferred, known as the denomination. ZKP can be used to hide this information while still permitting validation of transactions. An important validation is showing that the denomination is positive, otherwise some payer would be able to receive money by using negative amounts. In this context zk-SNARKs do not provide good performance when compared to protocols designed specifically for this purpose. The focus of this document is to describe different constructions of ZKRP and compare them, in order to understand when to use each scheme in practice. More concretely, ZKRP allows some party Alice, known as the prover, who possesses a secret \(\delta\), to prove to another party Bob, known as the verifier, that \(\delta\) belongs to the interval \([u, v)\), for arbitrary integers u and v.

Definition 6

A Non-Interactive Zero Knowledge (NIZK) proof scheme is defined by algorithms \({\mathrm {Setup}}\), \({\mathrm {Prove}}\) and \({\mathrm {Verify}}\) as follows:

  • \({\mathrm {Setup}}\) algorithm is responsible for the generation of parameters. Concretely, we have that \({\mathrm {params}}= {\mathrm {Setup}}(\lambda )\), where the input is the security parameter \(\lambda\) and the output is the parameters of the ZKP system of algorithms.

  • \({\mathrm {Prove}}\) syntax is given by \({\mathrm {proof}}= {\mathrm {Prove}}(x, w)\). The algorithm receives as input an instance x of some NP-language \({\mathcal {L}}\), and the witness w, and outputs the zero knowledge proof.

  • \({\mathrm {Verify}}\) algorithm receives the proof as input and outputs a bit b, which is equal to 1 if the verifier accepts the proof.

It is important to remark that not all ZKP schemes are non-interactive. On the contrary, most ZKP protocols described in the literature are in fact interactive. In general, the prover must answer challenge messages sent by the verifier in order to convince him that the proof is valid, which requires multiple rounds of communication. In the context of DLT and blockchain applications, we would like to avoid this communication, because either (i) validating nodes cannot properly agree on how to choose those challenges, since in many constructions they must be chosen randomly, while the verification algorithm must be deterministic in order to reach consensus; or (ii) it would make the communication complexity of the system very poor. Nevertheless, the Fiat–Shamir heuristic [30] is a generic technique that converts interactive ZKP schemes into non-interactive protocols. The drawback of this heuristic is that the resulting cryptosystem is proven secure only in the random oracle model [4] (ROM). In particular, it is straightforward to make the ZKRP schemes described in this document non-interactive using the Fiat–Shamir heuristic.
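As an illustration of the Fiat–Shamir heuristic, the Python sketch below makes Schnorr's proof of knowledge of a discrete logarithm non-interactive by deriving the challenge from a hash of the public values and the prover's first message. Schnorr's protocol is used here only because it is short; it is not one of the ZKRP schemes of this survey. The toy parameters, the use of SHA-256 and the byte encoding inside `challenge` are assumptions made for this example.

```python
import hashlib
import secrets

# Toy parameters (illustrative assumptions, NOT secure): G generates the
# subgroup of prime order Q in the multiplicative group modulo P = 2*Q + 1.
P, Q, G = 2039, 1019, 4

def challenge(*values: int) -> int:
    """Fiat-Shamir: the hash replaces the verifier's random challenge."""
    data = b"|".join(str(v).encode() for v in values)
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % Q

def prove(y: int, x: int) -> tuple[int, int]:
    """Prove knowledge of x such that y = g^x."""
    k = secrets.randbelow(Q)      # prover's ephemeral secret
    a = pow(G, k, P)              # first message of the interactive protocol
    c = challenge(G, y, a)        # challenge derived deterministically
    z = (k + c * x) % Q           # response
    return a, z

def verify(y: int, proof: tuple[int, int]) -> bool:
    """Recompute the challenge and check g^z = a * y^c."""
    a, z = proof
    c = challenge(G, y, a)
    return pow(G, z, P) == (a * pow(y, c, P)) % P

x = secrets.randbelow(Q)          # witness
y = pow(G, x, P)                  # public instance
proof = prove(y, x)
print(verify(y, proof))
```

Because the challenge is recomputed from the proof itself, every validating node reaches the same verdict without any interaction with the prover, which is the property needed for consensus.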

A zero knowledge proof scheme has the following properties:

  • Completeness. Given a witness w that satisfies instance x, we have that \({\mathrm {Verify}}(\mathrm {Prove}(x,w)) = 1\).

  • Soundness. If the witness w does not satisfy x, then the probability \({\mathrm {Prob}}[\mathrm {Verify}(\mathrm {Prove}(x,w)) = 1]\) is sufficiently low.

  • Zero Knowledge. Given the interaction between prover and verifier, we call this interaction a view. In order to capture the zero knowledge property we use a polynomial-time simulator, which has access to the same input given to the verifier (including its randomness), but no access to the input of the prover, to generate a simulated view. We say that the ZKP scheme has perfect zero knowledge if the simulated view, under the assumption that \(x \in {\mathcal {L}}\), has the same distribution as the original view. We say that the ZKP scheme has statistical zero knowledge if those distributions are statistically close. We say that the ZKP scheme has computational zero knowledge if there is no polynomial-time distinguisher for those distributions. Intuitively, the existence of such a simulator means that whatever the verifier can compute from the interaction with the prover could already be computed before that interaction, hence the verifier learned nothing from it. Also, we say that it is a proof of knowledge if we can find an extractor, which has rewindable black-box access to the prover, that can compute the witness w with non-negligible probability.
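The simulator idea can be illustrated with Schnorr's interactive proof of knowledge of a discrete logarithm, chosen here only as a compact example and not taken from the schemes surveyed in this document. In the hedged Python sketch below, `real_transcript` uses the witness, while `simulated_transcript` produces an accepting view for a given challenge without it, by picking the response first and solving for the first message. This covers only the honest-verifier setting, and the toy parameters and names are assumptions.

```python
import secrets

# Toy parameters (illustrative assumptions, NOT secure): G generates the
# subgroup of prime order Q in the multiplicative group modulo P = 2*Q + 1.
P, Q, G = 2039, 1019, 4

def real_transcript(x: int, c: int) -> tuple[int, int, int]:
    """View (a, c, z) produced by a prover who knows the witness x."""
    k = secrets.randbelow(Q)
    a = pow(G, k, P)
    z = (k + c * x) % Q
    return a, c, z

def simulated_transcript(y: int, c: int) -> tuple[int, int, int]:
    """View produced WITHOUT the witness: choose z first, then solve for a."""
    z = secrets.randbelow(Q)
    # y has order Q, so y^(Q - c) equals y^(-c); hence a = g^z * y^(-c).
    a = (pow(G, z, P) * pow(y, (Q - c) % Q, P)) % P
    return a, c, z

def accepts(y: int, transcript: tuple[int, int, int]) -> bool:
    """Verifier's check g^z = a * y^c."""
    a, c, z = transcript
    return pow(G, z, P) == (a * pow(y, c, P)) % P

x = 321                      # witness, known only to the real prover
y = pow(G, x, P)             # public instance
c = secrets.randbelow(Q)     # honest verifier's random challenge
print(accepts(y, real_transcript(x, c)), accepts(y, simulated_transcript(y, c)))
```

In both functions z is uniform in \({\mathbb {Z}}_Q\) and a is determined by z and c, so for a fixed challenge the two views have the same distribution, which is the perfect zero knowledge case described above.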

Bilinear pairings

Some constructions of ZKRP are based on the existence of a secure bilinear map \({\mathbf {bp}}= (\mathbb {G}_1, \mathbb {G}_2, \mathbb {G}_t, e, g_1, g_2)\), where \({\mathbb {G}}_1\), \({\mathbb {G}}_2\) and \({\mathbb {G}}_t\) are groups of sufficiently large prime order, \(g_1\) and \(g_2\) are generators of \({\mathbb {G}}_1\) and \({\mathbb {G}}_2\) respectively, and e is an appropriate choice of bilinear map, satisfying the usual requirements: (i) non-degeneracy; (ii) efficient computability; and (iii) bilinearity. This cryptographic primitive is key to the constructions we will present in the next sections, and care must be taken when instantiating it [32, 60]. Barreto-Naehrig [3] elliptic curves allow bilinear maps to be implemented efficiently.
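For reference, the bilinearity and non-degeneracy requirements can be stated as follows, writing the groups multiplicatively and denoting their common prime order by p (the symbol p is an assumption made here, since the text above does not name the order):

$$\begin{aligned} e(g_1^a, g_2^b) = e(g_1, g_2)^{ab} \quad \text {for all } a, b \in {\mathbb {Z}}_p, \qquad e(g_1, g_2) \ne 1. \end{aligned}$$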

Zero knowledge range proofs

The first constructions of ZKRP protocols were presented decades ago, with schemes like the one proposed in 1995 by Damgård [25] and in 1997 by Fujisaki and Okamoto [31]. Unfortunately those proposals are not efficient enough to be used in practice. The first practical construction was proposed by Boudot in 2001 [29]. In this document we focus on constructions that came after Boudot’s proposal.

In this section we describe in detail different strategies to achieve ZKRP. We can distinguish two different ways to commit to the secret: integer and binary. For each representation distinct strategies exist. A summary of the main characteristics of each family of constructions follows.

  1. Integer representation proposals:

    • Square decomposition. One of the ideas that can be used to obtain zero knowledge range proofs is the decomposition of the secret element into a sum of squares, as proposed in 2001 by Boudot [29]. In 2003 Lipmaa et al. [43] improved the construction using Lagrange’s four squares theorem. In 2005 Groth [36] observed that if the element is of the form \(4n+1\), then it is possible to get the same result by decomposing into only three squares. The drawback of this approach is that the algorithm by Rabin and Shallit [53], required for the decomposition into squares, runs in time \({\mathcal {O}}(k^4)\), where k is the size of the secret. Both Lipmaa [43] and Groth [36] improved this algorithm, but in practice it still leads to poor performance for the prover’s algorithm. In 2017, Couteau, Peters and Pointcheval showed how to remove the strong RSA assumption requirement [22], providing an elegant description of previous schemes. In particular, the original constructions do not need to be modified in order to remove the strong RSA assumption. Also, they proposed a construction that allows faster verification and lower communication, at the price of a less efficient prover. This is good for DLT applications, because the verification in general must be executed by multiple parties and the proof must be stored in the ledger. Applications like anonymous credentials indeed require big secrets, and in this case the square decomposition is a good strategy to follow.

    • Signature-based Another idea for the prover is to prove, in a blind way, that he knows a signature on the secret. Initially, all elements in the interval are signed, then the proof that the prover knows the signature means that this integer belongs to the expected interval. In fact this interval can be any possible finite set, which means that this solution can be used to construct ZK Set Membership. In 2008 Camenisch, Chaabouni and shelat used bilinear pairings to construct an efficient ZKSM scheme [11] that may be used also for ZKRP. If the interval contains N numbers, this solution would require communication of \({\mathcal {O}}{(N)}\) digital signatures. The authors describe in the paper how to use u-ary representation to reduce the communication complexity to \({\mathcal {O}}{(\frac{N}{\log {N}})}\). In 2010, Chaabouni et al. improved the communication complexity by a factor of 2 [17].

  2. Binary representation proposals:

    • Multi-base decomposition. A common approach to building ZKRP schemes is to decompose the secret into its bit representation, which makes it possible to prove that it belongs to the interval by using Boolean arithmetic. Basically, the prover must commit to each bit of the secret; provide a zero knowledge proof that it is indeed a bit; and provide a zero knowledge proof that the representation is valid. This last condition may easily be achieved by using homomorphic commitments. If instead of the bit representation we use a u-ary representation, then we can obtain more efficient constructions, as pointed out in [11]. Another possible strategy is to use the so-called multi-base decomposition [44, 56], which is an alternative way to represent the secret and yields ZKRP schemes that are good for the case of small secrets.

    • Two-tiered homomorphic commitments [35]. In 2011 Groth proposed a new method to construct ZKRP that achieves communication complexity \({\mathcal {O}}(N^{1/3})\), where N is the bit-length of the secret. Groth constructed an argument for batch multiplication of elements in \({\mathbb {Z}}_p\), which can be used to prove that \(u_i \cdot v_i = w_i\), where \(u_i, v_i, w_i \in {\mathbb {Z}}_p\) for \(0 \le i < N\). To construct ZKRP, if the bits of the secret are given by \(w_i\), then the argument can be used to show that \(w_i \cdot w_i = w_i\), which convinces the verifier that in fact \(w_i \in \{0,1\}\). Also, he showed how to prove that \(w = \sum _{i=0}^{N-1}{w_i \cdot 2^i}\), thus proving \(w \in [0, 2^N)\). The argument can be easily adapted to a general interval \([A, B)\). The key idea of Groth’s construction is to use bilinear pairings to commit to a vector of Pedersen commitments. For instance, given pairing \(e: {\mathbb {G}}_1 \times {\mathbb {G}}_2 \rightarrow {\mathbb {G}}_T\) and elements \(v, u_1, \ldots , u_N \in {\mathbb {G}}_2\), we can commit to the vector \([c_1, \ldots , c_N] \in {\mathbb {G}}^N_1\) by choosing a random \(t \in {\mathbb {G}}_1\) and computing \(C = e(t, v)\prod _{i=1}^N{e(c_i, u_i)}.\)

    • Bulletproofs. Unfortunately, all the above-mentioned schemes depend on a trusted setup, which may not be desirable in the context of cryptocurrencies. For instance, if an adversary is able to circumvent this trusted setup, he would be able to create money out of thin air. Recently, Bünz et al. [7] proposed a new idea to construct ZKRP, which they called Bulletproofs. They proposed to use an inner product proof in order to achieve ZKRP with very small proof sizes. Also, they showed how to use a component called multi-exponentiation in order to optimize their construction. The authors also provided an efficient implementation that shows their proposal is adequate for many practical scenarios. However, this proposal was not included in the comparison by Canard et al. [13], so one of the contributions of this work is to analyze how Bulletproofs compares to the other proposals.
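The bit-decomposition strategy from the list above can be sketched in a few lines of Python. The example commits to each bit of a secret and checks, using only the homomorphic property, that the bit commitments recombine into a commitment to the secret itself. It is a partial illustration under stated assumptions: the toy group and the function names are chosen here, and the zero knowledge proof that each commitment hides 0 or 1, which a real scheme needs, is omitted.

```python
import secrets

# Toy parameters (illustrative assumptions, NOT secure): P = 2*Q + 1 is a safe
# prime and G, H are squares modulo P, so both generate the subgroup of order Q.
P, Q = 2039, 1019
G, H = 4, 9

def commit(m: int, r: int) -> int:
    """Pedersen-style commitment g^m * h^r in the toy group."""
    return (pow(G, m % Q, P) * pow(H, r % Q, P)) % P

def commit_bits(value: int, n_bits: int):
    """Commit separately to each of the n_bits least significant bits."""
    bits = [(value >> i) & 1 for i in range(n_bits)]
    rands = [secrets.randbelow(Q) for _ in bits]
    comms = [commit(b, r) for b, r in zip(bits, rands)]
    return bits, rands, comms

def recombine(comms: list[int]) -> int:
    """Compute the product of c_i^(2^i), a commitment to sum(b_i * 2^i)."""
    acc = 1
    for i, c in enumerate(comms):
        acc = (acc * pow(c, 2 ** i, P)) % P
    return acc

value, n_bits = 173, 8       # claim: value lies in [0, 2^8)
bits, rands, comms = commit_bits(value, n_bits)

# The prover can open the recombined commitment with the weighted randomness.
r_total = sum(r << i for i, r in enumerate(rands)) % Q
representation_ok = recombine(comms) == commit(value, r_total)
print(representation_ok)
```

If every `comms[i]` is additionally proven to hide a bit, the recombined commitment can only hide a value in \([0, 2^N)\) for N bits, which is the range statement.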

Square decomposition construction

In this section we describe the algorithms necessary to implement the ZKRP proposed by Boudot [29] in 2001. This construction requires some building blocks, like the zero knowledge proof that two commitments hide the same secret and the zero knowledge proof that the secret is a square.

We denote the zero knowledge proof that two commitments hide the same secret by \({\mathrm {PK}}_{\mathrm {SS}}= \{x, r_1, r_2 : E = g_1^x h_1^{r_1} \wedge F = g_2^x h_2^{r_2}\}\). The parameters for the \({\mathrm {PK}}_{\mathrm {SS}}\) scheme are given by \({\mathrm {params}}_{\mathrm {SS}}= (t, \ell , s_1, s_2)\), which must be set in order to achieve the desired level of security. Namely, soundness is given by \(2^{t-1}\), while the zero knowledge property is guaranteed given that \(1/\ell\) is negligible. Next we present algorithms \({\mathrm {Prove}}_{\mathrm {SS}}\) and \({\mathrm {Verify}}_{\mathrm {SS}}\). It is important to remark that the discrete logarithm of \(g_1\) with respect to \(h_1\), or its inverse, must be unknown, otherwise the commitment is not secure. Analogously, the same condition must hold for \(g_2\) and \(h_2\). The hash function outputs 2t-bit strings. Finally, \(s_1\) and \(s_2\) must be chosen in order to have secure commitments, i.e. \(2^{-s_i}\) must be negligible for \(i \in \{1,2\}\).

figure a
figure b
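As an illustration of the kind of computation involved in \({\mathrm {PK}}_{\mathrm {SS}}\), the following Python sketch implements a Fiat–Shamir proof of equality of two committed values over a toy modulus. It is a minimal sketch and not the implementation discussed in this paper: the modulus, the bases, the bound `b` on the secret, the way values are hashed and all function names are assumptions made for the example, and the exact steps of the algorithms in the figures above may differ.

```python
import hashlib
import secrets

# Toy parameters (assumptions for illustration only; far too small to be secure).
n = 2039 * 2063                    # product of two small safe primes
g1, h1, g2, h2 = 4, 9, 25, 49      # squares mod n; mutual discrete logs assumed unknown
t, ell, s1, s2 = 40, 20, 20, 20    # security parameters (t, l, s_1, s_2)
b = 2**16                          # assumed public upper bound on the secret x

def commit(g, h, x, r):
    return pow(g, x, n) * pow(h, r, n) % n

def challenge(W1, W2):
    # Hash truncated to 2t bits, as required in the text.
    digest = hashlib.sha256(f"{W1}|{W2}".encode()).digest()
    return int.from_bytes(digest, "big") % 2**(2 * t)

def prove_ss(x, r1, r2):
    # Random masks, chosen much larger than the values they hide.
    omega = 1 + secrets.randbelow(2**(ell + t) * b - 1)
    eta1 = 1 + secrets.randbelow(2**(ell + t + s1) * n - 1)
    eta2 = 1 + secrets.randbelow(2**(ell + t + s2) * n - 1)
    c = challenge(commit(g1, h1, omega, eta1), commit(g2, h2, omega, eta2))
    # Responses are computed over the integers, since the group order is unknown.
    return c, omega + c * x, eta1 + c * r1, eta2 + c * r2

def verify_ss(E, F, proof):
    c, D, D1, D2 = proof
    W1 = commit(g1, h1, D, D1) * pow(E, -c, n) % n   # needs Python 3.8+
    W2 = commit(g2, h2, D, D2) * pow(F, -c, n) % n
    return c == challenge(W1, W2)

x, r1, r2 = 1234, 5678, 91011
E, F = commit(g1, h1, x, r1), commit(g2, h2, x, r2)
proof = prove_ss(x, r1, r2)
```

The same response `D` is used for both commitments, which is what ties the two openings to a single secret.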

We denote the zero knowledge proof that a secret is a square by \({\mathrm {PK}}_{\mathrm {S}}= \{x, r : E = g^{x^2}h^r\}\). We have that \({\mathrm {params}}_{\mathrm {S}}= (t, \ell , s)\) represents the parameters for the \({\mathrm {PK}}_{\mathrm {S}}\) scheme, with the same soundness and zero knowledge guarantees in terms of t and \(\ell\) as for \({\mathrm {PK}}_{\mathrm {SS}}\). Algorithms 3 and 4 correspond to \({\mathrm {Prove}}_{\mathrm {S}}\) and \({\mathrm {Verify}}_{\mathrm {S}}\), respectively. Also, the discrete logarithm of g with respect to h, or its inverse, must be unknown, otherwise the commitment is not secure.

figure c
figure d

We denote the zero knowledge proof that a secret belongs to a larger interval, originally proposed by Chan et al. [19], by \({\mathrm {PK}}_{\mathrm {LI}}= \{x, r : E = g^x h^r \wedge x \in [-2^{t+\ell }b, 2^{t+\ell }b] \}\). We have that \({\mathrm {params}}_{\mathrm {LI}}= (t, \ell , s)\) represents the parameters for the \({\mathrm {PK}}_{\mathrm {LI}}\) scheme, so that completeness is achieved with probability greater than \(1-2^{-\ell }\), the soundness error is \(2^{-(t-1)}\), and the zero knowledge property is guaranteed if \(1/\ell\) is negligible. Algorithms 5 and 6 correspond to \({\mathrm {Prove}}_{\mathrm {LI}}\) and \({\mathrm {Verify}}_{\mathrm {LI}}\), respectively. Also, the discrete logarithm of g with respect to h, or its inverse, must be unknown.

figure e
figure f

Before describing Boudot’s ZKRP construction, we first need a proof with tolerance, denoted by \({\mathrm {PK}}_{\mathrm {WT}}= \{x, r : E = g^x h^r \wedge x \in [a-\theta , b+\theta ] \}\), where \(\theta = 2^{t + \ell + 1} \sqrt{b-a}\), as shown in Algorithms 7 and 8.

figure g
figure h

Algorithms 9 and 10 describe the ZKRP scheme proposed by Boudot [29] in 2001.

figure i
figure j

Signature-based construction

The idea of the protocol is that the verifier initially computes a digital signature for each element in the target set S. The prover then blinds the signature of her element by raising it to a randomly chosen exponent \(v \in {\mathbb {Z}}_p\), such that it is computationally infeasible to determine which element was signed. The prover uses the pairing to compute the proof, and the bilinearity of the pairing allows the verifier to check that one of the elements of S was indeed chosen. Algorithms 11, 12 and 13 show the details of this protocol. The scheme depends upon Boneh-Boyen digital signatures, summarized next.

Boneh-Boyen [27] signatures Briefly, the signer's private key is given by \(x \in _R {\mathbb {Z}}_p\) and the public key is \(y = g^x\). Given a message m, the digital signature is calculated as \(\sigma = g^{1/(x+m)}\), and verification is achieved by checking \(e(\sigma , y g^m) {\mathop {=}\limits ^{?}}e(g,g)\).

Boneh-Boyen signatures are based on the q-Strong Diffie-Hellman assumption, described in Definition 4.

figure k
figure l
figure m

Range Proof In order to obtain ZKRP, we can decompose the secret \(\delta\) into base u, as follows:

$$\begin{aligned} \delta = \sum _{0 \le j < \ell }{\delta _j u^j}. \end{aligned}$$

Therefore, if each \(\delta _j\) belongs to the interval [0, u), then we have that \(\delta \in [0, u^\ell )\). The ZKSM algorithms can be easily adapted to carry out this computation, as shown in Algorithms 14, 15 and 16.

figure n
figure o
figure p
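The base-u decomposition that underlies this range proof can be sketched in a few lines of Python. This only illustrates the arithmetic on the prover's side, with hypothetical function names; the commitments and signatures of the actual protocol are omitted. Note that the sketch uses exactly \(\ell\) digits, indexed from 0 to \(\ell - 1\), which is what guarantees \(\delta < u^\ell\).

```python
def decompose(delta, u, ell):
    """Return digits [d_0, ..., d_{ell-1}] such that delta = sum(d_j * u**j)."""
    if not 0 <= delta < u**ell:
        raise ValueError("delta is outside [0, u^ell)")
    digits = []
    for _ in range(ell):
        delta, digit = divmod(delta, u)   # peel off the least significant digit
        digits.append(digit)
    return digits

def recompose(digits, u):
    return sum(d * u**j for j, d in enumerate(digits))

digits = decompose(1000, 57, 5)
```

Each digit then becomes the subject of one set membership proof for the set \(\{0, \ldots , u-1\}\).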

In order to obtain Zero Knowledge Range Proofs for arbitrary ranges [a, b), we show that \(\delta \in [a, a + u^\ell )\) and \(\delta \in [b - u^\ell , b)\), by running the ZKRP scheme described in Algorithm 15 twice. Namely, we have to prove that \(\delta - b + u^\ell \in [0, u^\ell )\) and \(\delta - a \in [0, u^\ell )\).
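The reduction to two proofs over \([0, u^\ell )\) can be checked with plain integers. The sketch below is only an illustration with hypothetical function names; it assumes \(b - a \le u^\ell\), since otherwise the intersection of the two shifted intervals is not [a, b). The numeric values are the parameters and interval reported in the Implementation section.

```python
def shifted_witnesses(delta, a, b, u, ell):
    """Values that must both lie in [0, u^ell) for delta to be in [a, b)."""
    width = u**ell
    if b - a > width:
        raise ValueError("interval is wider than u^ell; choose a larger u or ell")
    return delta - a, delta - b + width

def in_range(delta, a, b, u, ell):
    width = u**ell
    return all(0 <= w < width for w in shifted_witnesses(delta, a, b, u, ell))

u, ell = 57, 5
a, b = 347184000, 599644800
```

In the protocol itself, each of the two shifted values is proven to lie in \([0, u^\ell )\) in zero knowledge instead of being checked in the clear.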

Bulletproofs construction

In this section we show a detailed description of the algorithms necessary to implement the Bulletproofs ZKRP protocol.

Notation Given an array \({\mathbf {a}}\in {\mathbb {G}}^n\), we use Python notation to represent array slices:

$$\begin{aligned} {\mathbf {a}}_{[:\ell ]}= & {} [a_1, \ldots , a_\ell ] \in {\mathbb {G}}^\ell ,\\ {\mathbf {a}}_{[\ell :]}= & {} [a_{\ell +1}, \ldots , a_n] \in {\mathbb {G}}^{n - \ell } \end{aligned}$$

Given \(k \in {\mathbb {Z}}_p\), we denote the vector containing the first n powers of k by

$$\begin{aligned} {\mathbf {k}}^n = [1, k, k^2, \ldots , k^{n-1}]. \end{aligned}$$

Given \({\mathbf {g}}= [g_1, \ldots , g_n] \in {\mathbb {G}}^n\) and \({\mathbf {a}}\in {\mathbb {Z}}_p^n\), we define \({\mathbf {g}}^{\mathbf {a}}\) as follows:

$$\begin{aligned} {\mathbf {g}}^{\mathbf {a}}= \prod _{i=1}^n g_i^{a_i}. \end{aligned}$$

Given \(c \in {\mathbb {Z}}_p\), notation \({\mathbf {b}}= c.{\mathbf {a}}\in {\mathbb {Z}}_p^n\) denotes the vector such that \(b_i = c.a_i\). Also, \({\mathbf {a}}\circ {\mathbf {b}}= (a_1 b_1, \ldots , a_n b_n)\) is the Hadamard product. A vector polynomial of degree d is given by \(p(X) = \sum _{i=0}^d {\mathbf {p}}_i X^i \in {\mathbb {Z}}_p^n[X]\), where each coefficient \({\mathbf {p}}_i\) is a vector in \({\mathbb {Z}}_p^n\). The inner product of two such polynomials is given by

$$\begin{aligned} \langle {\mathbf {l}}(X), {\mathbf {r}}(X) \rangle = \sum _{i=0}^d \sum _{j=0}^d \langle {\mathbf {l}}_i, {\mathbf {r}}_j \rangle X^{i+j} \in {\mathbb {Z}}_p[X]. \end{aligned}$$
(1)
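To make the notation concrete, the following Python sketch computes the inner product of two vector polynomials over a toy prime field, letting both indices range over all coefficients so that every cross term \(\langle {\mathbf {l}}_i, {\mathbf {r}}_j \rangle\) is counted. The modulus and function names are assumptions chosen for the example. The sketch also exposes the property the range proof relies on: evaluating the resulting polynomial at a point gives the same value as taking the inner product of the two evaluated vectors.

```python
p = 1019  # toy prime modulus (assumption)

def inner(a, b):
    return sum(x * y for x, y in zip(a, b)) % p

def poly_inner(l, r):
    """l and r are lists of coefficient vectors; returns the coefficients of <l(X), r(X)>."""
    t = [0] * (len(l) + len(r) - 1)
    for i, li in enumerate(l):
        for j, rj in enumerate(r):
            t[i + j] = (t[i + j] + inner(li, rj)) % p
    return t

def eval_vec_poly(coeffs, x):
    size = len(coeffs[0])
    return [sum(c[k] * pow(x, i, p) for i, c in enumerate(coeffs)) % p for k in range(size)]

def eval_poly(t, x):
    return sum(c * pow(x, i, p) for i, c in enumerate(t)) % p

l = [[1, 2], [3, 4]]   # l(X) = (1, 2) + (3, 4).X
r = [[5, 6], [7, 8]]   # r(X) = (5, 6) + (7, 8).X
t = poly_inner(l, r)
```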

Setup

Many ZKRP constructions depend on a trusted setup. Briefly, the parameters necessary to generate and verify the underlying zero knowledge proofs must be computed by a trusted party, because if such parameters are generated together with a trapdoor, then this trapdoor could be used to subvert the protocol, for example by generating money out of thin air.

In order to avoid the trusted setup, Bulletproofs use the Nothing Up My Sleeve (NUMS) strategy, where a hash function [6] is utilized to compute the generators that are necessary for the Pedersen commitments. Algorithm 17 describes the specific case where the underlying elliptic curve is the Koblitz curve secp256k1 [15, 38].

figure q
figure r
figure s
figure t
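One common way to realize the NUMS strategy is try-and-increment hashing: hash a public label together with a counter until the digest is the x-coordinate of a curve point. The Python sketch below illustrates this idea for secp256k1; it is an assumption-laden example, not a transcription of Algorithm 17. The label format, the counter encoding and the function names are hypothetical.

```python
import hashlib

# secp256k1 field prime; the curve equation is y^2 = x^3 + 7.
P = 2**256 - 2**32 - 977

def nums_point(label: bytes):
    """Derive a curve point from a public label by try-and-increment hashing."""
    counter = 0
    while True:
        digest = hashlib.sha256(label + counter.to_bytes(4, "big")).digest()
        x = int.from_bytes(digest, "big")
        if x < P:
            rhs = (pow(x, 3, P) + 7) % P
            y = pow(rhs, (P + 1) // 4, P)   # square root candidate, valid because P % 4 == 3
            if y * y % P == rhs:            # rhs is a quadratic residue: (x, y) is on the curve
                return x, y
        counter += 1

def nums_generators(count: int, tag: bytes = b"bulletproofs-demo"):
    """Hypothetical helper: one independent generator per index."""
    return [nums_point(tag + b"/" + str(i).encode()) for i in range(count)]

gens = nums_generators(4)
```

Because every input to the hash is public, anyone can recompute the generators, and nobody is expected to know a discrete logarithm relation between them.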

Inner product argument

In this section we present the main building block of Bulletproofs, which is the inner product argument. In summary, using this ZKP protocol the prover convinces a verifier that she knows two vectors whose inner product is equal to a given public value. First we describe the initialization procedure in Algorithm 22. Afterwards we present the main protocol, given by Algorithm 23.

figure u
figure v
figure w
figure x

Since each level of the recursion in Algorithm 23 halves the size of the problem, the resulting proof has logarithmic size.
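The halving step can be illustrated with a compact Python sketch of the recursion, written here over a toy multiplicative group instead of an elliptic curve. This is a simplified example under stated assumptions, not the implementation evaluated in this paper: the group (squares modulo the safe prime 2039), the fixed generators, the transcript hashing and all function names are chosen for illustration, and the vector length must be a power of two.

```python
import hashlib

p, q = 2039, 1019   # toy safe prime p = 2q + 1 (assumption); exponents live in Z_q

def inner(a, b):
    return sum(x * y for x, y in zip(a, b)) % q

def multi_exp(bases, exps):
    out = 1
    for base, e in zip(bases, exps):
        out = out * pow(base, e, p) % p
    return out

def challenge(L, R):
    # Fiat-Shamir challenge in [1, q-1]; a real transcript would also bind the statement.
    digest = hashlib.sha256(f"{L}|{R}".encode()).digest()
    return int.from_bytes(digest, "big") % (q - 1) + 1

def fold(g, h, x):
    """Combine the two halves of each generator vector into one half-size vector."""
    m, xi = len(g) // 2, pow(x, -1, q)
    g2 = [pow(lo, xi, p) * pow(hi, x, p) % p for lo, hi in zip(g[:m], g[m:])]
    h2 = [pow(lo, x, p) * pow(hi, xi, p) % p for lo, hi in zip(h[:m], h[m:])]
    return g2, h2

def prove(g, h, u, a, b):
    rounds = []
    while len(a) > 1:
        m = len(a) // 2
        L = multi_exp(g[m:], a[:m]) * multi_exp(h[:m], b[m:]) * pow(u, inner(a[:m], b[m:]), p) % p
        R = multi_exp(g[:m], a[m:]) * multi_exp(h[m:], b[:m]) * pow(u, inner(a[m:], b[:m]), p) % p
        rounds.append((L, R))
        x = challenge(L, R)
        xi = pow(x, -1, q)
        g, h = fold(g, h, x)
        a = [(lo * x + hi * xi) % q for lo, hi in zip(a[:m], a[m:])]
        b = [(lo * xi + hi * x) % q for lo, hi in zip(b[:m], b[m:])]
    return rounds, a[0], b[0]

def verify(g, h, u, P, rounds, a, b):
    for L, R in rounds:
        x = challenge(L, R)
        xi = pow(x, -1, q)
        g, h = fold(g, h, x)
        P = pow(L, x * x % q, p) * P * pow(R, xi * xi % q, p) % p
    return P == pow(g[0], a, p) * pow(h[0], b, p) * pow(u, a * b % q, p) % p

g, h, u = [4, 9, 25, 49], [121, 169, 289, 361], 529   # squares mod p, hence of order q
a, b = [1, 2, 3, 4], [5, 6, 7, 8]
P = multi_exp(g, a) * multi_exp(h, b) * pow(u, inner(a, b), p) % p
rounds, a_final, b_final = prove(g, h, u, a, b)
```

For vectors of length n, the proof consists of \(\log _2 n\) pairs (L, R) plus two scalars, which is where the logarithmic proof size comes from.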

Range proof argument

Given a secret value v, if we want to prove it belongs to the interval \([0, 2^n)\), then we do the following:

  • Prove that \({\mathbf {a}}_L \in \{0,1\}^n\) is the bit-decomposition of v. In other words, we show that

    $$\begin{aligned} \langle {\mathbf {a}}_L, {\mathbf {2}}^n \rangle = v. \end{aligned}$$
  • Define \({\mathbf {a}}_R\) as the component-wise complement of \({\mathbf {a}}_L\), which means that, for every \(i \in [1,n]\), if the i-th bit of \({\mathbf {a}}_L\) is 0, then the i-th bit of \({\mathbf {a}}_R\) is equal to 1. Conversely, if the i-th bit of \({\mathbf {a}}_L\) is 1, then the i-th bit of \({\mathbf {a}}_R\) is equal to 0. Equivalently, this condition can be concisely described by Eqs. 2 and 3.

    $$\begin{aligned} {\mathbf {a}}_L \circ {\mathbf {a}}_R= & {} {\mathbf {0}}^n, \end{aligned}$$
    (2)
    $$\begin{aligned} {\mathbf {a}}_R= & {} {\mathbf {a}}_L - 1^n \pmod {2}. \end{aligned}$$
    (3)

    In order to prove that \({\mathbf {a}}_L\) and \({\mathbf {a}}_R\) satisfy both relations, we can randomly choose \(y \in {\mathbb {Z}}_p\) and compute:

    $$\begin{aligned} \langle {\mathbf {a}}_L, {\mathbf {a}}_R \circ {\mathbf {y}}^n \rangle= & {} 0,\\ \langle {\mathbf {a}}_L - {\mathbf {1}}^n - {\mathbf {a}}_R, {\mathbf {y}}^n \rangle= & {} 0. \end{aligned}$$

    These two equations can be combined into a single inner product, by randomly choosing \(z \in {\mathbb {Z}}_p\), and computing

    $$\begin{aligned} \langle {\mathbf {a}}_L - z.{\mathbf {1}}^n, {\mathbf {y}}^n \circ ({\mathbf {a}}_R + z.{\mathbf {1}}^n) + z^2.{\mathbf {2}}^n \rangle = z^2 v + \delta (y,z), \end{aligned}$$
    (4)

    where \(\delta (y,z) = (z - z^2) \langle {\mathbf {1}}^n, {\mathbf {y}}^n \rangle - z^3 \langle {\mathbf {1}}^n, {\mathbf {2}}^n \rangle \in {\mathbb {Z}}_p\).
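Equation 4 can be checked numerically. The Python sketch below does so over a toy prime field, under the assumption that \({\mathbf {a}}_R = {\mathbf {a}}_L - {\mathbf {1}}^n\) is taken over \({\mathbb {Z}}_p\), so that its entries are 0 or \(-1\) modulo p; this is the convention under which the second inner product above vanishes. The modulus, the bit length and the function names are chosen for the example only.

```python
import secrets

p = 2**61 - 1   # toy prime modulus (assumption)
n = 8           # bit length of the range [0, 2^n)

def inner(a, b):
    return sum(x * y for x, y in zip(a, b)) % p

def powers(k, size):
    return [pow(k, i, p) for i in range(size)]

def eq4_sides(v, y, z):
    """Return the left-hand and right-hand sides of Eq. 4 for the low n bits of v."""
    aL = [(v >> i) & 1 for i in range(n)]     # least significant bit first, matching 2^n
    aR = [(bit - 1) % p for bit in aL]        # assumed convention: a_R = a_L - 1^n in Z_p
    yn, two_n, ones = powers(y, n), powers(2, n), [1] * n
    left = [(a - z) % p for a in aL]
    right = [(yi * (r + z) + z * z * ti) % p for yi, r, ti in zip(yn, aR, two_n)]
    delta = ((z - z * z) * inner(ones, yn) - pow(z, 3, p) * inner(ones, two_n)) % p
    return inner(left, right), (z * z * v + delta) % p

y = 1 + secrets.randbelow(p - 1)
z = 1 + secrets.randbelow(p - 1)
```

For a value outside \([0, 2^n)\) the n extracted bits no longer recombine to v, so the two sides differ for any nonzero z.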

If the prover could send the vectors in Eq. 4, then the verifier would be able to check the inner product himself. However, these vectors reveal information about \({\mathbf {a}}_L\), and therefore about the bits of the secret value v. To solve this problem the prover randomly chooses vectors \({\mathbf {s}}_L\) and \({\mathbf {s}}_R\) in order to blind \({\mathbf {a}}_L\) and \({\mathbf {a}}_R\), respectively. Consider the following polynomials:

$$\begin{aligned} l[X]= & {} {\mathbf {a}}_L - z.{\mathbf {1}}^n + {\mathbf {s}}_L.X \in {\mathbb {Z}}_p^n[X],\\ r[X]= & {} {\mathbf {y}}^n \circ ({\mathbf {a}}_R + z.{\mathbf {1}}^n + {\mathbf {s}}_R.X) + z^2.{\mathbf {2}}^n \in {\mathbb {Z}}_p^n[X],\\ t[X]= & {} \langle l[X], r[X] \rangle = t_0 + t_1.X + t_2.X^2 \in {\mathbb {Z}}_p[X], \end{aligned}$$

where the above inner product is computed as defined in Eq. 1.

Note that the constant terms of l[X] and r[X] correspond to the vectors in Eq. 4. Therefore if the prover publishes l[x] and r[x] for a specific \(x \in {\mathbb {Z}}_p\), then we have that terms \({\mathbf {s}}_L\) and \({\mathbf {s}}_R\) ensure no information about \({\mathbf {a}}_L\) and \({\mathbf {a}}_R\) is revealed.

Explicitly, we have that

$$\begin{aligned} t_1 = \langle {\mathbf {a}}_L - z.{\mathbf {1}}^n, {\mathbf {y}}^n \circ {\mathbf {s}}_R \rangle + \langle {\mathbf {s}}_L, {\mathbf {y}}^n \circ ({\mathbf {a}}_R + z.{\mathbf {1}}^n) + z^2.{\mathbf {2}}^n \rangle , \end{aligned}$$
(5)

and

$$\begin{aligned} t_2 = \langle {\mathbf {s}}_L, {\mathbf {y}}^n \circ {\mathbf {s}}_R \rangle . \end{aligned}$$
(6)
figure y
figure z
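The coefficients \(t_0\), \(t_1\) and \(t_2\) follow mechanically from the constant and linear terms of l[X] and r[X]. The Python sketch below computes them over a toy prime field, so that one can check that \(t[x] = \langle l[x], r[x] \rangle\) at a random point x and that \(t_0\) equals the right-hand side of Eq. 4. As assumptions for the example, \({\mathbf {a}}_R = {\mathbf {a}}_L - {\mathbf {1}}^n\) is taken over \({\mathbb {Z}}_p\), and the modulus, bit length and function names are hypothetical.

```python
import secrets

p = 2**61 - 1   # toy prime modulus (assumption)
n = 8           # bit length of the range [0, 2^n)

def inner(a, b):
    return sum(x * y for x, y in zip(a, b)) % p

def blinded_polynomials(v, y, z, sL, sR):
    """Return (l0, l1), (r0, r1) with l[X] = l0 + l1.X and r[X] = r0 + r1.X."""
    aL = [(v >> i) & 1 for i in range(n)]
    aR = [(bit - 1) % p for bit in aL]        # assumed convention: a_R = a_L - 1^n in Z_p
    yn = [pow(y, i, p) for i in range(n)]
    l0 = [(a - z) % p for a in aL]
    r0 = [(yi * (a + z) + z * z * pow(2, i, p)) % p
          for i, (yi, a) in enumerate(zip(yn, aR))]
    r1 = [yi * s % p for yi, s in zip(yn, sR)]
    return (l0, list(sL)), (r0, r1)

def t_coefficients(l, r):
    (l0, l1), (r0, r1) = l, r
    return inner(l0, r0), (inner(l0, r1) + inner(l1, r0)) % p, inner(l1, r1)

def evaluate(c0, c1, x):
    """Evaluate the degree-one vector polynomial c0 + c1.X at x."""
    return [(a + b * x) % p for a, b in zip(c0, c1)]

v = 173
y, z, x = (1 + secrets.randbelow(p - 1) for _ in range(3))
sL = [secrets.randbelow(p) for _ in range(n)]
sR = [secrets.randbelow(p) for _ in range(n)]
l, r = blinded_polynomials(v, y, z, sL, sR)
t0, t1, t2 = t_coefficients(l, r)
```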

Bulletproofs can be made non-interactive using the Fiat–Shamir heuristic. Concretely, we compute \(x = {\mathrm {Hash}}(T_1, T_2)\), \(y = {\mathrm {Hash}}(A, S)\), and \(z = {\mathrm {Hash}}(A, S, y)\) in Algorithms 25 and 26.
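A minimal Python sketch of this challenge derivation is shown below. The choice of SHA-256, the length-prefixed encoding and the function names are assumptions made for the example; the inputs stand for byte encodings of the commitments A, S, \(T_1\) and \(T_2\), and the scalar field is taken to be that of secp256k1.

```python
import hashlib

# Order of the secp256k1 group, used here as the scalar field modulus.
q = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141

def hash_to_scalar(*elements: bytes) -> int:
    h = hashlib.sha256()
    for e in elements:
        h.update(len(e).to_bytes(4, "big"))   # length prefix avoids ambiguous concatenation
        h.update(e)
    return int.from_bytes(h.digest(), "big") % q

def challenges(A: bytes, S: bytes, T1: bytes, T2: bytes):
    y = hash_to_scalar(A, S)
    z = hash_to_scalar(A, S, y.to_bytes(32, "big"))
    x = hash_to_scalar(T1, T2)
    return y, z, x

y, z, x = challenges(b"A-bytes", b"S-bytes", b"T1-bytes", b"T2-bytes")
```

Prover and verifier derive the same challenges from the same commitments, so no interaction is required.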

Optimizations

The algorithms described in the previous section can be optimized in two ways, as follows:

  • Multi-exponentiation In the inner-product argument presented in Section 4.3.2 it is necessary to compute many exponentiations, which is an expensive operation. For instance, in the k-th round of the protocol we must perform \(\frac{n}{2^{k-1}}\) exponentiations, thus in total we must execute 4n exponentiations. It is possible to reduce this number to a single multi-exponentiation of size 2n by postponing these computations to the last round.

    Concretely, given \({\mathbf {g}}= [g_1, \ldots , g_n]\) and \({\mathbf {h}}= [h_1, \ldots , h_n]\), it is possible to compute g and h, the generators obtained in the last round, by using the following expressions:

    $$\begin{aligned} g= & {} \prod _{i=1}^n g_i^{s_i} \in {\mathbb {G}},\\ h= & {} \prod _{i=1}^n h_i^{1/s_i} \in {\mathbb {G}}, \end{aligned}$$

    where

    $$\begin{aligned} s_i = \prod _{j=1}^{\log _2{n}} x_j^{b(i,j)} \end{aligned}$$

    and

    $$\begin{aligned} b(i,j) = {\left\{ \begin{array}{ll} 1, &{} \text {if the } j\text {-th bit of } i-1 \text { is 1}\\ -1, &{} \text {otherwise.} \end{array}\right. } \end{aligned}$$

    Therefore, verification can be performed by

    $$\begin{aligned} {\mathbf {g}}^{a.{\mathbf {s}}} \cdot {\mathbf {h}}^{b.{\mathbf {s}}^{-1}} \cdot u^{a.b} {\mathop {=}\limits ^{?}}P \cdot \prod _{j=1}^{\log _2{n}} L_j^{x_j^2} \cdot R_j^{x_j^{-2}}. \end{aligned}$$
  • Aggregation If multiple range proofs use the same underlying interval, then it is possible to aggregate them into one single ZKRP. Using this optimization, new proofs can be added while increasing the total size of the proof only logarithmically. Suppose we want to aggregate m range proofs. Then, while the naive strategy would lead to a proof whose size is m times larger, the aggregation procedure in Bulletproofs makes the proof grow only by an additive term of \(2\log _2{m}\) group elements.

    In practice, applications like Confidential Transactions [46], Mimblewimble [52] and Provisions [23] benefit considerably from aggregation, because such applications must execute many ZKRPs over the same interval.
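The multi-exponentiation optimization can be checked on a toy example: folding the generators round by round must give the same result as a single multi-exponentiation with the exponents \(s_i\). The Python sketch below does this over the group of squares modulo a small safe prime. The group, the sample challenges and the function names are assumptions made for the example. It also assumes that the bits of \(i-1\) are indexed from the most significant one, so that the j-th bit corresponds to the j-th halving round.

```python
p, q = 2039, 1019   # toy safe prime p = 2q + 1 (assumption); exponents live in Z_q

def multi_exp(bases, exps):
    out = 1
    for base, e in zip(bases, exps):
        out = out * pow(base, e, p) % p
    return out

def fold(gens, xs, invert=False):
    """Halve the generator vector once per challenge, as in the recursive protocol."""
    for x in xs:
        xi = pow(x, -1, q)
        lo_exp, hi_exp = (x, xi) if invert else (xi, x)
        m = len(gens) // 2
        gens = [pow(lo, lo_exp, p) * pow(hi, hi_exp, p) % p
                for lo, hi in zip(gens[:m], gens[m:])]
    return gens[0]

def s_vector(xs, size):
    """s_i = prod_j x_j^{b(i,j)}, with the list index playing the role of i - 1."""
    k = len(xs)
    s = []
    for idx in range(size):
        si = 1
        for j, x in enumerate(xs):
            bit = (idx >> (k - 1 - j)) & 1          # round j + 1 reads the bits from the MSB
            si = si * (x if bit else pow(x, -1, q)) % q
        s.append(si)
    return s

g = [4, 9, 25, 49, 121, 169, 289, 361]      # squares mod p, hence of order q
h = [529, 841, 961, 1369, 1681, 1849, 16, 36]
xs = [3, 5, 7]                              # sample challenges, one per round
s = s_vector(xs, len(g))
s_inv = [pow(si, -1, q) for si in s]
```

The final generators are then obtained as `multi_exp(g, s)` and `multi_exp(h, s_inv)`, without computing any of the intermediate generator vectors.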

Implementation

We implemented the constructions described in Sects. 4.1, 4.2 and 4.3. The scheme based on square decomposition, i.e. Boudot’s construction, was implemented in Java and Solidity, while the signature-based scheme and Bulletproofs were implemented in Golang, based on the libsecp256k1 library available in Go-Ethereum. We also provide an implementation of the verification algorithm for Bulletproofs in Solidity. We used BN128 pairing-friendly elliptic curves, thus achieving 128 bits of security. The performance is summarized in Table 1, and the measurements were carried out on a computer with a 64-bit Intel i5-6300U 2.40 GHz CPU, 16 GB of RAM and Ubuntu 18.04. The implementation is available on Github [49] and is a proof of concept, thus it should not be used in production without a prior review.

Optimal values for u and \(\ell\) can be calculated as described in the original paper [11]. We used \(u = 57\) and \(\ell = 5\) for the interval [347184000, 599644800), obtaining a communication complexity of 30,976 bits, while the previous work, based on Boudot’s proposal [29], requires 48,946 bits.

Table 1 Time complexity

Although the schemes were implemented in different languages, Table 1 shows that, regardless of the asymptotic complexity of each proposal, in practice the schemes based on square decomposition are one order of magnitude slower than the other implementations, mainly because they involve large variables. Also, it is possible to see that the Bulletproofs optimizations, represented in the last row, significantly reduce the verification time.

A qualitative comparison is presented in the next section, showing the expected performance for different range sizes.

Comparison

In this section we compare the representative schemes for different strategies, as presented in Section 4, with respect to proof size and the complexity of the \({\mathrm {Prove}}\) and \({\mathrm {Verify}}\) algorithms. It is important to emphasize that in this section we used different schemes than the ones we implemented, because we followed the qualitative comparison provided in [13]. However, schemes that follow the same strategy have similar performance, which allows us to have a good overview of the existing constructions. Namely, Boudot’s proposal [29] has very similar performance when compared to Lipmaa’s [43] and Groth’s [36] constructions. Specifically, we used the proposal by Lipmaa et al. [44] to represent the multi-base solution; the proposal by Lipmaa [43] to represent the square decomposition strategy; the scheme by Camenisch et al. [11] for the signature-based implementation; and we included the Bulletproofs construction.

Compared to other proposals in the literature, we found that for very big intervals the best strategy is to use the square decomposition, as in the construction by Boudot [29], since verification does not depend on the size of the secret. However, it is important to remark that finding the decomposition into squares consumes a considerable amount of computational resources, which makes the Prover’s algorithm somewhat inefficient. This scenario may arise in the context of anonymous credentials [20, 24, 45]. On the other hand, for small secrets, Schoenmakers’s strategy [57] is the most efficient scheme with respect to the \({\mathrm {Prove}}\) algorithm.

In Figures 1, 2 and 3, the horizontal axis represents the bit-length of b, where b is the largest element of the underlying range [a, b] used for the zero knowledge range proof scheme.

Fig. 1 Proof size

Fig. 2 Prover complexity

Fig. 3 Verifier complexity

It is possible to conclude that in general Bulletproofs offers the best performance, but depending on the requirements of the chosen use case, other strategies may be more advantageous. For DLT applications, the proof size and the verifier’s complexity are more important metrics than the prover’s complexity, which means that Bulletproofs seems to be the best approach to implement a ZKRP protocol.

Related work and final remarks

In this document we described in detail the construction of ZKRP protocols, which were implemented and open-sourced. Most works on this subject are devoted to constructing non-interactive protocols by using the Fiat–Shamir heuristic. Those constructions therefore rely on the random oracle model [4], which is considered a weaker model, since there are schemes that are proven secure in this model but turn out to be insecure in practice. Chaabouni et al. [18] proposed a solution to this problem, which allows the construction of non-interactive ZKRP in the standard model.

The focus of this work was on ZKRP, which is closely related to Zero Knowledge Set Membership protocols. More information on this topic can be found in Chaabouni’s Ph.D. thesis [16].

Another related topic is the cryptographic accumulator [2, 8, 12]. It not only allows membership in a set to be verified, but also permits elements to be dynamically added to and removed from the set. Accumulators can be used as a replacement for ZK Set Membership, and thus can be a building block to construct ZK Range Proofs, as pointed out in [11]. Recently, Bünz et al. [5] used groups of unknown order to construct accumulators that allow batching. The technique is a generalization of the proof of exponent by Schnorr [55], and can be used to prove knowledge of a homomorphism pre-image. This work makes it possible to reduce the huge amount of memory that is necessary to validate transactions in Bitcoin. In summary, the accumulator is used to store unspent transaction outputs (UTXOs), which can be added and removed using only constant-size memory. Also, the underlying digital signature scheme used in the signature-based ZKRP [11], namely Boneh-Boyen signatures, can be replaced, and the construction presented here can be adapted to use the digital signature proposed by Camenisch and Lysyanskaya [9].

It is possible to use Zero Knowledge Set Membership or accumulators to validate user information without revealing it. A possible scenario is to perform KYC operations. For example, it would be possible to validate that the country of residence of a user is one belonging to the European Union, without revealing which country. Similarly, it is possible to validate membership in whitelists or blacklists, which would be important in the context of Anti-Money Laundering (AML) solutions for example.

The holy grail for privacy on DLT systems is the construction of private smart contracts. Ethereum [61] allows smart contracts to be built over a blockchain. They can be seen as generic applications running in a distributed way, therefore avoiding the need for a centralized solution. In other words, a smart contract is a piece of code that is run by all participants in the Ethereum network. However, since there is no mechanism to provide privacy to the system, all the information in smart contracts is visible to every other party, which constitutes a huge issue in many scenarios. This problem could be solved by using zk-SNARKs [37], but they require a trusted setup, and this problem is even worse in the case of smart contracts, because we need a new setup for each contract. Hawk [41] is an interesting proposal to implement private smart contracts; however, it not only needs a new setup for each contract, but also requires a trusted manager, who can view the users’ private information. Bulletproofs is an interesting proposal regarding private smart contracts, since it avoids the trusted setup and offers a generic ZKP protocol with small proofs.

Recent breakthroughs in cryptography permit us to construct new protocols and achieve privacy on demand. These new cryptographic algorithms can be ultimately considered as tools that can be reused in different problems. ZKRP is an important tool that is necessary in order to construct a toolbox to deal with more complex applications.

Finally, an important research topic is the construction of post-quantum zero knowledge proofs. Recently, Libert et al. [42] proposed a construction of ZKRP based on lattices. However, the proof size is 3.54 MB for secrets as large as \(2^{1000}\). Although this secret is huge, the size of the proof cannot be made considerably smaller when the secret is smaller. Hence, optimizing this construction would reduce the existing gap between conventional schemes and quantum-resistant ones.