1 Introduction

In a zero-knowledge proof system, a prover and verifier interact so that the verifier is convinced of the validity of the statement being proved, but learns nothing more [13]. In a proof of knowledge, the verifier is also convinced of the fact that the prover knows a “witness” that testifies to the validity of the statement being proved [1, 13]. Zero-knowledge proofs and zero-knowledge proofs of knowledge are basic primitives in cryptography, and the fact that any statement in \(\mathcal {NP}\) can be proved in zero-knowledge has made them widely applicable [12]. An important question that has been considerably studied relates to the round complexity of zero-knowledge protocols. We know the following:

  • Assuming the existence of 2-round perfectly-hiding commitments (which can be constructed from families of claw-free functions [8, Sect. 2.4.5]), there exist 5-round zero-knowledge proof systems with negligible soundness error for all \(\mathcal {NP}\). This was proven by Goldreich and Kahan [10]. (We remark that if a language L has a four-round zero-knowledge proof then \(\overline{L}\in\mathcal {MA}\) [15]. Thus, five rounds is the minimal number of rounds, unless co-\(\mathcal{NP}\subseteq\mathcal{MA}\).)

  • Assuming the existence of one-way functions, there exist constant-round zero-knowledge arguments of knowledge with negligible soundness error for all \(\mathcal {NP}\) (an argument is a “proof” where soundness is only guaranteed computationally in the presence of a polynomial-time cheating prover). This was proven by Feige and Shamir [6, 7], who presented a five-round protocol. The existence of four-round zero-knowledge arguments of knowledge, under the assumption that one-way functions exist, was shown by Bellare et al. [2]. This is minimal for black-box zero-knowledge [11].

However, to the best of our knowledge, it has never been proven that there exist constant-round zero-knowledge proofs of knowledge for all \(\mathcal {NP}\). In this note, we prove the following theorem:

Theorem 1

Assuming the existence of 2-round perfectly-hiding commitments, every \(\mathcal {NP}\) relation has a 5-round computational zero-knowledge proof of knowledge with negligible knowledge error.

The Goldreich–Kahan Proof System [10]

Our construction is very similar to that of [10], and we thus begin by describing the latter. Informally, the zero-knowledge proof of [10] works as follows:

  1. The prover sends the first message of a perfectly-hiding commitment scheme.

  2. The verifier commits to a string q of length n using the perfectly-hiding commitment.

  3. The prover begins n parallel executions of the three-round proof system of [12] (or, equivalently, of [3]) and sends the first prover message in each execution.

  4. The verifier decommits to the string q.

  5. The prover concludes the proof based on the string q.

The above is zero-knowledge because by rewinding the simulator can learn the string q before it prepares the first message of the proof system. Simulation then follows from known techniques which work when the verifier-queries can be guessed or otherwise obtained ahead of time. Before proceeding, we warn that despite the fact that this strategy is intuitively appealing, and even possibly “obvious”, it is highly non-trivial to analyze. Indeed, the proof by [10] that this is zero-knowledge is quite involved, and contains an important and novel proof technique. The problem that arises that makes this non-trivial is discussed at length in [10] and in our proof below. (We remark that this technique is of general importance as it turns out that this problem arises in many cryptographic settings where simulation is used.)

The reason why the protocol of [10] seems to not be a proof of knowledge is that in order to extract, one must obtain multiple different responses from the prover relative to the same first message of the proof system of [12] or [3]. However, the verifier (and thus the extractor) is bound to its query before the prover sends its commitment, and this commitment may in turn be computed as a function of the verifier’s first real message (i.e., the commitment). Thus the extractor cannot change the query without the prover changing its first message.

Our Zero-Knowledge Proof of Knowledge

We solve the aforementioned problem in the protocol of [10] by essentially running a semi-simulatable coin-tossing protocol in order to choose the string q in between the first and second prover messages of the proof system of [12] or [3]. (We do not use a fully simulatable coin-tossing protocol because we need it to be constant-round and secure even if the prover is computationally unbounded. The only such known protocol [17] requires a constant-round zero-knowledge system for proofs of knowledge, which is exactly what we are trying to build.) Informally, our protocol can be described as follows:

  1. The prover begins n parallel executions of the proof system of [12] (or, equivalently, of [3]) and commits to the first prover message in each execution. The prover also sends the first message of a perfectly-hiding commitment scheme.

  2. The verifier commits to a string \(q_{1}\) of length n using a perfectly-hiding commitment scheme.

  3. The prover commits to a string \(q_{2}\) of length n using a perfectly-binding commitment scheme.

  4. The verifier decommits to the string \(q_{1}\).

  5. The prover decommits to \(q_{2}\) and concludes the proof based on the string \(q=q_{1}\oplus q_{2}\).

The intuitive reasoning as to why this protocol is zero-knowledge is the same as for [10]. The simulator guesses ahead of time a string q, runs the verifier in order to obtain \(q_{1}\), and then rewinds the verifier in order to set its \(q_{2}\) such that \(q_{1}\oplus q_{2}=q\). We note that proving this again requires the techniques of [10].

The reason why this protocol is also a proof of knowledge is that it is now possible for an extractor to rewind the prover multiple times relative to the same first message in order to obtain multiple openings with different strings \(q_{1}\oplus q_{2}\). This enables us to apply the extraction strategy of the basic protocols of [3, 12], albeit with some additional complications.

Remark

  1. It is possible to use the technique of [19] in order to construct a 7-round zero-knowledge system for proofs of knowledge for all \(\mathcal {NP}\) with a simpler proof of security. The advantage of the protocol presented here is in its minimal number of rounds.

  2. Our method for obtaining 5-round zero-knowledge proofs of knowledge can be applied to any 3-round public-coin zero-knowledge proof with the property that simulation can be carried out if the verifier query is known ahead of time and extraction works by obtaining two (or more) valid prover-answers relative to the same first prover message. One important application of this is that our method constitutes a highly-efficient generic construction of a zero-knowledge system for proofs of knowledge from any Σ-protocol [5].

Semi-Simulatable Coin Tossing

As we have mentioned, our protocol for constant-round zero-knowledge proofs of knowledge works by having the prover and verifier jointly choose the verifier-query q via a type of coin-tossing protocol. In Sect. 3, we isolate this subprotocol and show that it achieves a level of security that we call “semi-simulatable coin tossing”. Informally speaking, this means that if \(P_{1}\) is corrupted then the protocol is secure according to the standard ideal/real model simulation-based definitions of secure computation. Furthermore, if \(P_{2}\) is corrupted, it is guaranteed that the output of \(P_{1}\) is either “abort” or a uniformly distributed string. We remark that although the case of \(P_{2}\) being corrupted is not simulatable, the fact that \(P_{1}\) is nevertheless guaranteed to output a uniformly distributed string (or abort) means that a meaningful security level is obtained. We believe that this constant-round coin-tossing protocol, which is highly efficient, is of independent interest.
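As a rough illustration of the message flow only, the coin-tossing subprotocol can be sketched in Python, with the two parties named after their roles in the proof system (prover P and verifier V). Everything below is an assumption made for the sketch rather than the instantiation used in this note: the toy group (the order-11 subgroup of \(\mathbb{Z}_{23}^{*}\)), a Pedersen-style commitment standing in for the two-round perfectly-hiding scheme, an ElGamal-style commitment standing in for the perfectly-binding scheme, the fixed public value `H_PB`, and all function names. Parameters this small offer no security.

```python
import secrets

# Toy parameters (assumption): the order-11 subgroup of Z_23^*, generated by 4.
P_MOD, Q_ORD, G = 23, 11, 4
N_BITS = 3                       # strings of n = 3 bits fit in Z_11
H_PB = pow(G, 7, P_MOD)          # hypothetical public key for the binding scheme


def first_message():
    """P's first message alpha: a second generator h = g^x with x != 0."""
    x = 1 + secrets.randbelow(Q_ORD - 1)
    return pow(G, x, P_MOD)


def commit_ph(alpha, m, r):
    """Pedersen-style commitment g^m * alpha^r (perfectly hiding)."""
    return (pow(G, m, P_MOD) * pow(alpha, r, P_MOD)) % P_MOD


def commit_pb(m, r):
    """ElGamal-style commitment (g^r, g^m * H_PB^r) (perfectly binding)."""
    return (pow(G, r, P_MOD), (pow(G, m, P_MOD) * pow(H_PB, r, P_MOD)) % P_MOD)


def coin_toss():
    """Run the five messages with two honest parties; return q = q1 XOR q2."""
    alpha = first_message()                               # P -> V
    q1, r1 = secrets.randbelow(2 ** N_BITS), secrets.randbelow(Q_ORD)
    c1 = commit_ph(alpha, q1, r1)                         # V -> P
    q2, r2 = secrets.randbelow(2 ** N_BITS), secrets.randbelow(Q_ORD)
    c2 = commit_pb(q2, r2)                                # P -> V
    # V -> P: decommitment (q1, r1); P aborts on an invalid opening.
    if commit_ph(alpha, q1, r1) != c1:
        return None
    # P -> V: decommitment (q2, r2); V rejects on an invalid opening.
    if commit_pb(q2, r2) != c2:
        return None
    return q1 ^ q2
```

In this toy group, the commitments `commit_ph(alpha, m, r)` for a fixed m and all r cover the whole subgroup, which is the sense in which the first scheme hides m perfectly. The map `(m, r) -> commit_pb(m, r)` is injective, which is the sense in which the second scheme binds perfectly.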

Organization

In order to keep this note brief, we assume familiarity with the definitions of zero-knowledge and zero-knowledge proofs of knowledge; see [8, Chap. 4] for details. We prove that our protocol is a proof of knowledge using Definition 4.7.3 of [8].

2 Constant-Round Zero-Knowledge Proof of Knowledge

Our constant-round zero-knowledge proof of knowledge is based on n parallel repetitions of the basic proof system for the Hamiltonian Cycle problem, which is \(\mathcal{NP}\)-complete. We therefore obtain a proof system for any language in \(\mathcal{NP}\). We consider directed graphs (and the existence of directed Hamiltonian cycles). Our methodology also works for the 3-coloring protocol of [12], but it is simpler to describe it based on Hamiltonicity. See Appendix A for a full description of the basic Hamiltonicity proof system.

We use a two-round perfectly-hiding commitment scheme. Such a scheme can be constructed from families of claw-free functions. We denote the first message of such a scheme by α, and a commitment to m using α and randomness r by \(C_{\mathrm {ph}}^{\alpha}(m;r)\). In addition, we use a non-interactive perfectly-binding commitment scheme; a commitment to m using randomness r is denoted \(C_{\mathrm {pb}}(m;r)\). Perfectly-binding commitment schemes can be constructed from 1–1 one-way functions. The zero-knowledge proof of knowledge system can be found in Protocol 2.

PROTOCOL 2

(Constant-Round ZKPOK)

  • Common Input: a directed graph G=(V,E) with \(n\stackrel {\mathrm {def}}{=}|V|\).

  • Auxiliary Input to Prover: a directed Hamiltonian cycle, \(C\subseteq E\), in G.

  • The protocol:

    1. Prover’s first step (P1): The prover P sends n independent copies of the first message (BP1) for the basic proof of Hamiltonicity, described in Appendix A. In addition, P sends the first message α of a perfectly-hiding commitment scheme.

    2. Verifier’s first step (V1): The verifier V chooses a random string \(q_{1}\in_{R}\{0,1\}^{n}\) and computes \(c_{1}=C_{\mathrm {ph}}^{\alpha}(q_{1};r_{1})\) for a random \(r_{1}\) of the appropriate length. V sends \(c_{1}\) to P.

    3. Prover’s second step (P2): P chooses a random string \(q_{2}\in_{R}\{0,1\}^{n}\) and computes \(c_{2}=C_{\mathrm {pb}}(q_{2};r_{2})\) for a random \(r_{2}\) of the appropriate length. P sends \(c_{2}\) to V.

    4. Verifier’s second step (V2): V decommits to \(c_{1}\) by sending \(q_{1}\) and \(r_{1}\).

    5. Prover’s third step (P3): If \(C_{\mathrm {ph}}^{\alpha}(q_{1};r_{1})\neq c_{1}\), then P aborts and halts. Otherwise, P decommits to \(c_{2}\) by sending \(q_{2}\) and \(r_{2}\). P computes \(q=q_{1}\oplus q_{2}\). Denoting \(q=(q^{1},\ldots,q^{n})\), P sends the second message (BP2) of the basic proof of Hamiltonicity for each of the n copies, based on the verifier query \(q^{i}\) in the ith copy.

    6. Verifier’s output: V computes \(q=q_{1}\oplus q_{2}\). If \(C_{\mathrm {pb}}(q_{2};r_{2})\neq c_{2}\) or the response of the prover is not accepting in all n copies, based on the query \(q^{i}\) in the ith copy, then V outputs reject. Otherwise, V outputs accept.

Theorem 3

Assuming that \(C_{\mathrm {ph}}\) is a perfectly-hiding commitment scheme and that \(C_{\mathrm {pb}}\) is a perfectly-binding commitment scheme, Protocol 2 is a computational zero-knowledge proof of knowledge of Hamiltonicity, with knowledge error \(\kappa(n)=2^{-n}\).

Proof

We begin by proving that the protocol is a proof of knowledge with knowledge error \(\kappa(n)=2^{-n}\). We use Definition 4.7.3 in [8], and remark that this implies soundness against an all-powerful cheating prover (thus this also proves that it is an interactive proof system). We construct an extractor K that works as follows:

  1. K invokes \(P^{*}_{x,y,r}\), where x is the common input graph, and y and r are the auxiliary input and random tape of \(P^{*}\), respectively. K receives the first prover message P1.

  2. K continues the execution to the end of the proof, running the honest verifier.

    (a) If the proof is not accepting, then K outputs ⊥ and halts.

    (b) If the proof is accepting, then K rewinds \(P^{*}_{x,y,r}\) to the beginning, and reruns the execution playing the honest verifier with fresh random coins. K repeats this until another accepting proof is obtained. (Note that since \(P^{*}_{x,y,r}\) is deterministic, the same first prover message is obtained each time.)

  3. Let q be the resulting string in the first accepting transcript (where \(q=q_{1}\oplus q_{2}\)), and let q′ be the string in the second accepting transcript (where \(q'=q'_{1}\oplus q'_{2}\)). If q=q′ then K outputs ⊥ and halts. Otherwise, let i be such that \(q^{i}\neq q'^{i}\). Since both transcripts are accepting, K has obtained valid responses to both query 0 and query 1 in the ith copy, relative to the same first prover message. Thus, K can extract a Hamiltonian cycle \(C\subseteq E\). K outputs C and halts.
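The last step of K relies on the standard extraction property of the basic Hamiltonicity proof. The following is a minimal sketch under an assumption about Appendix A, which is not reproduced here: the response to query 0 reveals the permutation π applied to G, and the response to query 1 reveals the edges of a Hamiltonian cycle in the permuted graph. The function names and the encoding of graphs as edge sets are hypothetical.

```python
def extract_cycle(perm, permuted_cycle):
    """Map a cycle in pi(G) back to G.

    perm[u] is pi(u) for each vertex u of G (from the answer to query 0);
    permuted_cycle lists the edges (a, b) of a cycle in pi(G) (answer to query 1).
    """
    inverse = {image: u for u, image in perm.items()}
    return [(inverse[a], inverse[b]) for a, b in permuted_cycle]


def is_hamiltonian_cycle(vertices, edges, cycle):
    """Check that `cycle` uses only graph edges and visits every vertex exactly once."""
    if len(cycle) != len(vertices) or any(e not in edges for e in cycle):
        return False
    successor = dict(cycle)
    if set(successor) != set(vertices):
        return False
    # Follow successors from an arbitrary start; all vertices must be reached.
    start = next(iter(vertices))
    seen, current = set(), start
    while current not in seen:
        seen.add(current)
        current = successor[current]
    return current == start and seen == set(vertices)


# Usage: a 4-vertex directed graph and the two answers for one copy.
vertices = {0, 1, 2, 3}
edges = {(0, 1), (1, 2), (2, 3), (3, 0), (0, 2)}
perm = {0: 2, 1: 0, 2: 3, 3: 1}                          # answer to query 0
cycle_in_permuted = [(2, 0), (0, 3), (3, 1), (1, 2)]     # answer to query 1
cycle = extract_cycle(perm, cycle_in_permuted)
```

Applying the inverse permutation to each revealed cycle edge yields a cycle in G itself, which is why two accepting answers that differ in a single copy suffice.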

It is immediate that if K does not output ⊥ then it outputs a valid Hamiltonian cycle C. We now claim that K runs in expected polynomial time. Let p(x,y,r) be the probability that \(P^{*}_{x,y,r}\) convinces an honest verifier upon common input x and (y,r) as above. The important point to notice is that the probability that each iteration concludes in Step 2(b) is exactly p(x,y,r). Thus, the expected number of required iterations is 1/p(x,y,r). In addition, the probability that K reaches Step 2(b) is exactly p(x,y,r). Finally, the cost of each iteration is polynomial in n. Thus, the expected running-time of K is

$$p(x,y,r)\cdot\frac{1}{p(x,y,r)}\cdot \mathrm {poly}(n) + \bigl(1-p(x,y,r)\bigr)\cdot \mathrm {poly}(n) = \mathrm {poly}(n). $$
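A short Monte Carlo sketch of this calculation, counting complete protocol executions rather than steps. The model is an assumption made for illustration: each execution with fresh verifier coins is accepting independently with probability p, and the function names are hypothetical. Since K performs one initial execution and, with probability p, an expected 1/p further executions, the expected count is 1 + p·(1/p) = 2 for every p > 0.

```python
import random


def extractor_executions(p, rng):
    """Number of executions K performs against a prover that succeeds with probability p."""
    count = 1                      # the initial execution in Step 2
    if rng.random() >= p:
        return count               # Step 2(a): not accepting, K halts
    while True:                    # Step 2(b): rerun until a second accepting proof
        count += 1
        if rng.random() < p:
            return count


def average_executions(p, trials=20000, seed=0):
    """Empirical mean of the number of executions over independent trials."""
    rng = random.Random(seed)
    return sum(extractor_executions(p, rng) for _ in range(trials)) / trials
```

The empirical average should stay close to 2 regardless of how small p is, although individual runs take longer as p shrinks.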

It remains to show that the probability that K outputs a valid cycle C is at least \(p(x,y,r)-2^{-n}\). Now, K outputs ⊥ if the first execution of the proof is not accepting or if q=q′. The probability that the first execution of the proof is not accepting is 1−p(x,y,r), by the definition of p(x,y,r). Next, we claim that the probability that K outputs ⊥ due to the fact that q=q′ is at most \(2^{-n}\); we denote this event by collision (because q and q′ collide). Note that this event can only happen if the first execution was accepting. Thus, denoting the event that the first execution was accepting by \(\mathsf{accept}_1\) we have that

$$\mathrm {Pr}[\mathsf {accept}_1\ \wedge\ \mathsf {collision}] = \mathrm {Pr}[\mathsf {accept}_1]\cdot \mathrm {Pr}[\mathsf {collision}\mid \mathsf {accept}_1]. $$

Now, let \(S\subseteq\{0,1\}^{n+m}\) be the set of pairs of strings \((q_{1},r_{1})\in\{0,1\}^{n+m}\) for which \(P^{*}_{x,y,r}\) concludes with an accepting proof (we denote by m=m(n) the length of the random string needed to commit to an n-bit string using \(C_{\mathrm {ph}}\)). We have

$$\mathrm {Pr}[\mathsf {accept}_1] = p(x,y,r) = \mathrm {Pr}_{(q_1,r_1)\in _R\{0,1\}^{n+m}} \bigl[ \vphantom {2^{2^2}}(q_1,r_1)\in S \bigr] = \frac{|S|}{2^{n+m}}. $$

Next, observe that the event collision depends solely on the values \((q_{1},r_{1})\) used in the first execution and \((q'_{1},r'_{1})\) used in the second execution. In particular, the string \(q_{2}\) chosen by \(P^{*}_{x,y,r}\) is a deterministic function of the commitment value \(C_{\mathrm {ph}}^{\alpha}(q_{1};r_{1})\) computed by V, and thus the string \(q=q_{1}\oplus q_{2}\) is determined by \((q_{1},r_{1})\). To be concrete, let g be the (inefficient) function such that \(q_{2}=g(C_{\mathrm {ph}}^{\alpha}(q_{1};r_{1}))\), where \(q_{2}\) is the value committed to by \(P^{*}_{x,y,r}\) after receiving \(C_{\mathrm {ph}}^{\alpha}(q_{1};r_{1})\) from V. Using this notation, we have that collision is the event that

$$q_1 \oplus g \bigl(C_{\mathrm {ph}}^\alpha(q_1;r_1) \bigr) = q'_1 \oplus g \bigl(C_{\mathrm {ph}}^\alpha \bigl(q'_1;r'_1\bigr) \bigr). $$

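We therefore have

$$\mathrm {Pr}[\mathsf {collision}\mid \mathsf {accept}_1] = \mathrm {Pr}_{(q_1,r_1),(q'_1,r'_1)\in S} \bigl[q_1 \oplus g\bigl(C_{\mathrm {ph}}^\alpha (q_1;r_1)\bigr) = q'_1 \oplus g \bigl(C_{\mathrm {ph}}^\alpha\bigl(q'_1;r'_1 \bigr)\bigr) \bigr]. $$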

We now prove that

$$ \mathrm {Pr}_{(q_1,r_1),(q'_1,r'_1)\in S} \bigl[q_1 \oplus g\bigl(C_{\mathrm {ph}}^\alpha (q_1;r_1)\bigr) = q'_1 \oplus g \bigl(C_{\mathrm {ph}}^\alpha\bigl(q'_1;r'_1 \bigr)\bigr) \bigr] \leq \frac{1}{2^n\cdot p(x,y,r)}. $$
(1)

For every value \(v\in\{0,1\}^{n}\), define

$$S_v = \bigl\{\vphantom {2^{2^2}}(q_1,r_1)\in S \mid q_1\oplus g\bigl(C_{\mathrm {ph}}^\alpha (q_1;r_1) \bigr)=v \bigr\}; $$

the set of all pairs \((q_{1},r_{1})\) for which \(P^{*}_{x,y,r}\) concludes with an accepting proof and the resulting query q based on \(P^{*}_{x,y,r}\)’s reply equals v (i.e., \(v=q_{1}\oplus g(C_{\mathrm {ph}}^{\alpha}(q_{1};r_{1}))\)). Note that for every v, it holds that

$$\mathrm {Pr}_{(q_1,r_1)\in S} \bigl[\vphantom {2^{2^2}}q_1\oplus g\bigl( C_{\mathrm {ph}}^\alpha (q_1;r_1)\bigr)=v \bigr] = \mathrm {Pr}_{(q_1,r_1)\in S} \bigl[\vphantom {2^{2^2}}(q_1,r_1)\in S_{v} \bigr] = \frac{|S_{v}|}{|S|}. $$

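Combining the above with the fact that \((q_{1},r_{1})\) and \((q_{1}',r_{1}')\) are independent, we have

$$ \mathrm {Pr}_{(q_1,r_1),(q'_1,r'_1)\in S} \bigl[q_1 \oplus g\bigl(C_{\mathrm {ph}}^\alpha (q_1;r_1)\bigr) = q'_1 \oplus g \bigl(C_{\mathrm {ph}}^\alpha\bigl(q'_1;r'_1 \bigr)\bigr) \bigr] = \sum_{v\in \{0,1\}^n} \mathrm {Pr}_{(q_1,r_1)\in S} \bigl[q_1\oplus g\bigl( C_{\mathrm {ph}}^\alpha (q_1;r_1)\bigr)=v \bigr]^2 = \sum_{v\in \{0,1\}^n} \biggl(\frac{|S_v|}{|S|} \biggr)^2. $$
(2)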

Observe now that the sets \(\{S_{v}\}_{v\in \{0,1\}^{n}}\) partition the set S; this is due to the fact that it is not possible for \(q_{1}\oplus g(C_{\mathrm {ph}}^{\alpha}(q_{1};r_{1}))\) to equal two distinct values v and w.

Next, we claim that for every \(v\in\{0,1\}^{n}\) it holds that \(|S_{v}|\leq 2^{m}\). In order to see this, for every v define

$$S'_v= \bigl\{(q_1,r_1)\in \{0,1\}^{n+m} \mid q_1\oplus g\bigl(C_{\mathrm {ph}}^\alpha(q_1;r_1) \bigr) = v \bigr\} $$

and observe that \(S_{v}\subseteq S'_{v}\) since \(S\subseteq\{0,1\}^{n+m}\) (the difference between \(S'_{v}\) and \(S_{v}\) is that in \(S_{v}\) there is also a requirement that the transcript be accepting). However, by the perfect hiding property, for every v the set \(S'_{v}\) can contain at most a \(2^{-n}\) fraction of all pairs \((q_{1},r_{1})\in\{0,1\}^{n+m}\). Otherwise, if \(S'_{v}\) contained a larger fraction, then it would be possible to guess the committed value with probability greater than \(2^{-n}\). In order to see this, let ϵ be such that \(|S'_{v}| = \epsilon \cdot2^{n+m}\), and assume that \(\epsilon>2^{-n}\). Then, given a commitment value \(c=C_{\mathrm {ph}}^{\alpha}(q_{1};r_{1})\) for a random \(q_{1}\in_{R}\{0,1\}^{n}\) and a random \(r_{1}\), one can (inefficiently) compute \(v\oplus g(c)\) and output this as a guess for the committed value \(q_{1}\). This guess is correct if and only if \((q_{1},r_{1})\in S'_{v}\), which occurs with probability \(\frac{|S'_{v}|}{|\{0,1\}^{n+m}|}=\epsilon >2^{-n}\), in contradiction to the perfect hiding property, which guarantees that it is possible to guess the committed value with probability at most \(2^{-n}\) (since there are \(2^{n}\) possible values). Thus, \(|S'_{v}|\leq 2^{-n}\cdot2^{n+m}=2^{m}\) and by the fact that \(S_{v}\subseteq S'_{v}\) we conclude that \(|S_{v}|\leq 2^{m}\).

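Finally, since the sets \(S_{v}\) partition S and each is of size at most \(2^{m}\), the value \(\sum_{v\in \{0,1\}^{n}} (|S_{v}|/|S|)^{2}\) is maximized when some \(S_{v}\) are of maximal size (i.e., of size \(2^{m}\)) and the others are empty. Recalling that \(|S|=p(x,y,r)\cdot 2^{n+m}\), and observing that the number of non-empty sets \(S_{v}\) in the sum of Eq. (2) when these are maximal is \(p(x,y,r)\cdot 2^{n}\), we combine the above and conclude that:

$$ \mathrm {Pr}_{(q_1,r_1),(q'_1,r'_1)\in S} \bigl[q_1 \oplus g\bigl(C_{\mathrm {ph}}^\alpha (q_1;r_1)\bigr) = q'_1 \oplus g \bigl(C_{\mathrm {ph}}^\alpha\bigl(q'_1;r'_1 \bigr)\bigr) \bigr] = \sum_{v\in \{0,1\}^n} \biggl(\frac{|S_v|}{|S|} \biggr)^2 \leq p(x,y,r)\cdot 2^n\cdot \biggl(\frac{2^m}{p(x,y,r)\cdot 2^{n+m}} \biggr)^2 = \frac{1}{2^n\cdot p(x,y,r)}, $$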

completing the proof of Eq. (1). We therefore conclude that

$$\mathrm {Pr}[\mathsf {accept}_1\ \wedge\ \mathsf {collision}] \leq\frac{1}{2^n} $$

and so

$$\mathrm {Pr}\bigl[K^{P^*_{x,y,r}}(x) \neq\bot \bigr] \geq p(x,y,r) - \frac{1}{2^n}, $$

as required.
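The counting step behind Eq. (1) can be sanity-checked numerically on small parameters. The sketch below is only such a check, under stated assumptions: the sizes \(|S_{v}|\) are drawn at random subject to \(|S_{v}|\leq 2^{m}\), exact rational arithmetic is used, and all function names are hypothetical.

```python
from fractions import Fraction
import random


def collision_probability(sizes):
    """Sum over v of (|S_v| / |S|)^2 for the given list of sizes |S_v|."""
    total = sum(sizes)
    return sum(Fraction(s, total) ** 2 for s in sizes)


def bound(sizes, n, m):
    """The right-hand side 1 / (2^n * p) of Eq. (1), where p = |S| / 2^(n+m)."""
    p = Fraction(sum(sizes), 2 ** (n + m))
    return 1 / (2 ** n * p)


def check_random_instances(n=4, m=3, trials=200, seed=1):
    """Check the inequality on random size vectors with each |S_v| <= 2^m."""
    rng = random.Random(seed)
    for _ in range(trials):
        sizes = [rng.randint(0, 2 ** m) for _ in range(2 ** n)]
        if sum(sizes) == 0:
            continue               # p = 0: the prover never convinces the verifier
        assert collision_probability(sizes) <= bound(sizes, n, m)
    return True
```

The bound is met with equality when every non-empty \(S_{v}\) has size exactly \(2^{m}\), for example `[8] * 4 + [0] * 12` with n=4 and m=3, where both sides equal 1/4.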

Zero-Knowledge

Next, we prove that Protocol 2 is zero-knowledge. The proof of this fact is similar to the proof in [10]. We first present a simplified strategy for a black-box simulator \(\mathcal {S}\) given oracle access to a verifier \(V^{*}\) (with a fixed input, auxiliary input and random tape), and then explain how to modify it. The simplified simulator \(\mathcal {S}\) works as follows:

  1. \(\mathcal {S}\) chooses a random string \(q\in_{R}\{0,1\}^{n}\). Then, for the prover message in the ith execution, \(\mathcal {S}\) generates a commitment to a random permutation of G if \(q^{i}=0\), and to a simple n-cycle if \(q^{i}=1\). \(\mathcal {S}\) hands \(V^{*}\) all of the commitments. In addition, \(\mathcal {S}\) chooses α like an honest prover would and hands it to \(V^{*}\).

  2. \(\mathcal {S}\) receives from \(V^{*}\) its commitment \(c_{1}\). \(\mathcal {S}\) chooses random \(q_{2},r_{2}\), computes \(c_{2}=C_{\mathrm {pb}}(q_{2};r_{2})\) and hands \(c_{2}\) to \(V^{*}\).

  3. \(\mathcal {S}\) receives the decommitment \(q_{1},r_{1}\) from \(V^{*}\). If \(C_{\mathrm {ph}}^{\alpha}(q_{1};r_{1})\neq c_{1}\), then \(\mathcal {S}\) simulates P aborting, outputs whatever \(V^{*}\) outputs, and halts. Otherwise, \(\mathcal {S}\) proceeds to the next step.

  4. Rewinding phase: \(\mathcal {S}\) rewinds \(V^{*}\) until \(q_{1}\oplus q_{2}=q\):

    (a) \(\mathcal {S}\) fixes \(q_{2}=q_{1}\oplus q\), where q is the string it chose initially and \(q_{1}\) is the string that it received from \(V^{*}\) in the decommitment.

    (b) \(\mathcal {S}\) rewinds \(V^{*}\) back to the point that it needs to send \(c_{2}\). \(\mathcal {S}\) chooses a random \(r_{2}\) and hands \(V^{*}\) the commitment \(c_{2}=C_{\mathrm {pb}}(q_{2};r_{2})\).

    (c) \(\mathcal {S}\) receives \(q'_{1},r'_{1}\) from \(V^{*}\).

      (i) If \(C_{\mathrm {ph}}^{\alpha}(q'_{1};r'_{1})\neq c_{1}\), then \(\mathcal {S}\) returns to Step 4(b) and repeats using fresh randomness (we stress that \(q_{2}\) is the same each time, whereas \(r_{2}\) is fresh).

      (ii) If \(C_{\mathrm {ph}}^{\alpha}(q'_{1};r'_{1}) = c_{1}\) and \(q'_{1}\neq q_{1}\), then \(\mathcal {S}\) outputs ambiguous and halts.

      (iii) Otherwise, \(\mathcal {S}\) completes the proof by decommitting either to the entire graph (for \(q^{i}=0\)) or the simple cycle (for \(q^{i}=1\)).

  5. \(\mathcal {S}\) outputs whatever \(V^{*}\) outputs.

The intuition behind this simulation is clear. \(\mathcal {S}\) repeatedly rewinds until the string q is the one that it initially chose. In this case, it can decommit appropriately and conclude the proof. The fact that the result is computationally indistinguishable from a real proof by an honest prover follows from the hiding property of the perfectly-binding commitments.

The problem with the above simplified strategy is that \(\mathcal {S}\) actually may not run in expected polynomial time. In order to see this, denote by ϵ(n) the probability that \(V^{*}\) decommits to \(c_{1}\) in the first iteration (before rewinding) and by δ(n) the probability that \(V^{*}\) decommits to \(c_{1}\) in all later iterations; these probabilities are over the choice of \(q_{2}\) and \(r_{2}\). We stress that although the initial commitments do not change, ϵ(n) may not equal δ(n). This is because ϵ(n) is based on the case that \(c_{2}\) is a random commitment to a random \(q_{2}\), whereas δ(n) is based on the case that \(c_{2}\) is a random commitment to a fixed \(q_{2}\). Nevertheless, it follows immediately from the hiding property of \(C_{\mathrm {pb}}\) that the difference between ϵ(n) and δ(n) is negligible; otherwise, one could use this fact to distinguish commitments. Now, the probability that \(\mathcal {S}\) runs the rewinding phase is ϵ(n), and the expected number of rewinding iterations in the rewinding phase is 1/δ(n). Let μ(n) be a negligible function, such that ϵ(n)−δ(n)=μ(n). We have that the expected running time of \(\mathcal {S}\) is

$$\bigl(1-\epsilon (n)\bigr)\cdot \mathrm {poly}(n) + \epsilon (n) \cdot\frac{1}{\delta(n)}\cdot \mathrm {poly}(n) = \mathrm {poly}'(n)\cdot\frac{\epsilon (n)}{\epsilon (n)-\mu(n)}. $$

It may be tempting at this point to conclude that the above is polynomial because μ(n) is negligible, and so ϵ(n)−μ(n) is almost the same as ϵ(n). This is true for “large” values of ϵ(n). For example, if ϵ(n)>2μ(n) then ϵ(n)−μ(n)>ϵ(n)/2. This then implies that ϵ(n)/(ϵ(n)−μ(n))<2. Unfortunately, however, this is not true in general. For example, consider the case that \(\mu(n)=2^{-n/2}\) and \(\epsilon(n)=\mu(n)+2^{-n}=2^{-n/2}+2^{-n}\). Then,

$$\frac{\epsilon (n)}{\epsilon (n)-\mu(n)} = \frac{2^{-n/2}+2^{-n}}{2^{-n}} = 2^{n/2} + 1, $$

which is exponential in n. This technical problem was observed and solved by [10], and we use their solution here.
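To see the blow-up concretely, the following sketch evaluates the ratio ϵ(n)/(ϵ(n)−μ(n)) with exact rationals for the illustrative choice \(\mu(n)=2^{-n/2}\) and \(\epsilon(n)=\mu(n)+2^{-n}\) (n even), and contrasts it with a “large” ϵ(n)>2μ(n). The function names and the specific choice ϵ(n)=3μ(n) are assumptions made for the example.

```python
from fractions import Fraction


def slowdown(eps, mu):
    """The factor eps / (eps - mu) in the simplified simulator's expected running time."""
    return eps / (eps - mu)


def bad_case(n):
    """mu = 2^(-n/2) and eps = mu + 2^(-n), for even n: the factor is 2^(n/2) + 1."""
    mu = Fraction(1, 2 ** (n // 2))
    eps = mu + Fraction(1, 2 ** n)
    return slowdown(eps, mu)


def good_case(n):
    """eps = 3 * mu > 2 * mu: the factor stays below 2."""
    mu = Fraction(1, 2 ** (n // 2))
    return slowdown(3 * mu, mu)
```

Both functions are negligible in n, yet only the second case keeps the expected number of rewinding iterations within a constant factor of 1/ϵ(n).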

The problem described above is solved by ensuring that the simulator \(\mathcal {S}\) never runs “too long”. Specifically, if \(\mathcal {S}\) proceeds to the rewinding phase of the simulation, then it first estimates the value of ϵ(n). This is done by repeating Steps 2 and 3 of the simulation (choosing random \(q_{2}\) and \(r_{2}\) each time) until m=m(n) successful decommitments occur (for a polynomial m(n) to be determined below), where a successful decommitment is one where \(V^{*}\) decommits to \(q_{1}\), the string it first decommitted to. We remark that as in the original strategy, if \(V^{*}\) correctly decommits to a different \(q'_{1}\neq q_{1}\) then \(\mathcal {S}\) outputs ambiguous. Then, an estimate \(\tilde{\epsilon }\) of ϵ is taken to be m/T, where T is the overall number of attempts until m successful decommitments occurred. As shown in [10], this suffices to ensure that the probability that \(\tilde{\epsilon }\) is not within a constant factor of ϵ(n) is at most \(2^{-n}\). This can be proven using the following bound, which is proven in Appendix B:

Lemma 2.1

(Tail Inequality for Geometric Variables [14])

Let \(X_{1},\ldots,X_{m}\) be m independent random variables with geometric distribution with parameter ϵ (i.e., for every i, \(\mathrm {Pr}[X_{i}=j]=(1-\epsilon)^{j-1}\epsilon\)). Let \(X=\sum_{i=1}^{m} X_{i}\) and let μ=E[X]=m/ϵ. Then, for every Δ,

$$\mathrm {Pr}\bigl[X \geq(1+\Delta)\mu\bigr] \leq e^{-\frac{m\Delta^2}{2(1+\Delta )}}. $$

Define \(X_{i}\) to be the random variable that equals the number of attempts needed to obtain the ith successful decommitment (not including the attempts up until the (i−1)th successful decommitment), let \(X=\sum_{i=1}^{m} X_{i}\), and let Δ=±1/2. Clearly, each \(X_{i}\) has a geometric distribution with parameter ϵ. It therefore follows that

$$\mathrm {Pr}\biggl[\vphantom {2^{2^2}}X \leq\frac{m}{2\epsilon }\ \vee X \geq\frac {3m}{2\epsilon } \biggr] \leq2\cdot \mathrm {Pr}\biggl[\vphantom {2^{2^2}}X \geq\frac {3}{2}\cdot \frac{m}{\epsilon } \biggr] \leq2\cdot e^{-\frac{m}{12}}. $$

Stated in words, the probability that the estimate \(\tilde{\epsilon }=m/X\) is not between 2ϵ/3 and 2ϵ is at most \(2e^{-m/12}\). Thus, if m(n)=12n, it follows that the probability that \(\tilde{\epsilon }\) is not within the above bounds is at most \(2e^{-n}\), which is less than \(2^{-n}\) for all \(n\geq 3\), as required.
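The estimation step can be sketched as follows. This is an illustration under a simplifying assumption, not the simulator itself: the verifier is modelled as a hypothetical callback `decommits()` that succeeds independently with probability ϵ on each attempt, and the ambiguous case is ignored.

```python
import math
import random


def estimate_epsilon(decommits, m):
    """Repeat until m successful decommitments occur; return the estimate m / T."""
    attempts = successes = 0
    while successes < m:
        attempts += 1
        if decommits():
            successes += 1
    return m / attempts


def failure_bound(m):
    """Upper bound 2 * exp(-m / 12) on Pr[estimate outside [2*eps/3, 2*eps]]."""
    return 2 * math.exp(-m / 12)


# Usage with m(n) = 12n for n = 20 and a modelled verifier with eps = 0.2.
rng = random.Random(0)
eps, n = 0.2, 20
estimate = estimate_epsilon(lambda: rng.random() < eps, m=12 * n)
```

With m=12n the estimate falls outside the interval [2ϵ/3, 2ϵ] only with probability at most `failure_bound(12 * n)`, independently of how small ϵ is; a smaller ϵ only increases the number of attempts T.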

Next, \(\mathcal {S}\) repeats the following rewinding phase up to n times: \(\mathcal {S}\) runs the rewinding phase in Step 4 of the simulation. However, \(\mathcal {S}\) limits the number of rewinding attempts in each rewinding phase to \(n/\tilde{\epsilon }\) iterations. We have the following cases:

  1. If within \(n/\tilde{\epsilon }\) rewinding iterations, \(\mathcal {S}\) obtains a successful decommitment from \(V^{*}\) to \(q_{1}\), then it completes the proof as described. It can do so in this case because the string is q as required.

  2. If \(\mathcal {S}\) obtains a valid decommitment to some \(q'_{1}\neq q_{1}\) then it outputs ambiguous.

  3. If \(\mathcal {S}\) does not obtain any correct decommitment within this time, then \(\mathcal {S}\) aborts this attempted rewinding phase.

As mentioned, the above phase is repeated up to n times, each time using independent coins. If the simulator \(\mathcal {S}\) doesn’t successfully conclude in any of the n attempts, then it halts and outputs fail. We will show that this strategy ensures that the probability that \(\mathcal {S}\) outputs fail is negligible.

In addition to the above, \(\mathcal {S}\) keeps a count of its overall running time and if it reaches \(2^{n}\) steps, then it halts, outputting fail. (This additional time-out is needed to ensure that \(\mathcal {S}\) does not run too long in the case that the estimate \(\tilde{\epsilon }\) is not within a constant factor of ϵ(n). Recall that this “bad event” can only happen with probability at most \(2^{-n}\).)
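The bounded rewinding strategy can be sketched as a pair of nested loops. This is a simplified model under stated assumptions: the verifier is again a hypothetical callback `decommits()`, and the ambiguous output and the overall \(2^{n}\)-step time-out are omitted.

```python
import math
import random


def bounded_rewinding(decommits, n, eps_estimate):
    """Up to n phases, each limited to n / eps_estimate rewinding attempts.

    Returns the total number of attempts used on success, or None for "fail".
    """
    per_phase = math.ceil(n / eps_estimate)
    used = 0
    for _ in range(n):                 # at most n independent phases
        for _ in range(per_phase):     # each phase is cut off after n / eps~ attempts
            used += 1
            if decommits():
                return used
    return None


# Usage: a modelled verifier that decommits with probability 0.25 per attempt.
rng = random.Random(0)
attempts = bounded_rewinding(lambda: rng.random() < 0.25, n=16, eps_estimate=0.25)
```

The cap of n phases with \(n/\tilde{\epsilon }\) attempts each gives a strict bound of \(n^{2}/\tilde{\epsilon }\) on the number of rewinding attempts, which is what makes the running time controllable once \(\tilde{\epsilon }\) is a good estimate.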

We first claim that \(\mathcal {S}\) runs in expected polynomial-time.

Claim 2.2

Simulator \(\mathcal {S}\) runs in expected-time that is polynomial in n.

Proof

Observe that in the first and all later iterations, all of \(\mathcal {S}\)’s work takes a strictly polynomial number of steps. We therefore need to bound only the number of rewinding iterations. Before proceeding, however, we stress that rewinding iterations only take place if \(V^{*}\) provides a valid decommitment in the first place. Thus, rewinding occurs only with probability ϵ(n).

Now, \(\mathcal {S}\) first rewinds in order to obtain an estimate \(\tilde{\epsilon }\) of ϵ(n). This involves repeating until m(n)=12n successful decommitments are obtained. Therefore, the expected number of repetitions in order to obtain \(\tilde{\epsilon }\) equals exactly 12n/ϵ(n) (since the expected number of trials for a single success is 1/ϵ(n)). After the estimate \(\tilde{\epsilon }\) has been obtained, \(\mathcal {S}\) runs the rewinding phase of Step 4 for a maximum of n times, in each phase limiting the number of rewinding attempts to \(n/\tilde{\epsilon }\).

Given the above, we are ready to compute the expected running-time of \(\mathcal {S}\). In order to do this, we differentiate between two cases. In the first case, we consider what happens if \(\tilde{\epsilon }\) is not within a constant factor of ϵ(n). The only thing we can say about \(\mathcal {S}\)’s running-time in this case is that it is bounded by \(2^{n}\) (since this is an overall bound on its running-time). However, since this event happens with probability at most \(2^{-n}\), this case adds only a polynomial number of steps to the overall expected running-time. We now consider the second case, where \(\tilde{\epsilon }\) is within a constant factor of ϵ(n) and thus \(\epsilon (n)/\tilde{\epsilon }= O(1)\). In this case, we can bound the expected running-time of \(\mathcal {S}\) by

$$\mathrm {poly}(n)\cdot \epsilon (n) \cdot \biggl(\frac{12n}{\epsilon (n)}+n\cdot\frac{n}{\tilde{\epsilon }} \biggr) = \mathrm {poly}(n) \cdot\frac{\epsilon (n)}{\tilde{\epsilon }} = \mathrm {poly}(n), $$

and this concludes the analysis. □

Next, we prove that the probability that \(\mathcal {S}\) outputs fail is negligible.

Claim 2.3

The probability that \(\mathcal {S}\) outputs fail is negligible in n.

Proof

Notice that the probability that \(\mathcal {S}\) outputs fail is less than or equal to the probability that it does not obtain a successful decommitment in any of the n rewinding phase attempts plus the probability that it runs for \(2^{n}\) steps.

We first claim that the probability that \(\mathcal {S}\) runs for \(2^{n}\) steps is negligible. We have already shown in Claim 2.2 that \(\mathcal {S}\) runs in expected polynomial-time. Therefore, the probability that an execution will deviate so far from its expectation and run for \(2^{n}\) steps is negligible. (It is enough to use Markov’s inequality to establish this fact.)

We now continue by considering the probability that in all n rewinding phase attempts, \(\mathcal {S}\) does not obtain a successful decommitment within \(n/\tilde{\epsilon }\) steps. Consider the following two possible cases (recall that ϵ(n) equals the probability that \(V^{*}\) decommits when \(q_{2}\) is random, and μ(n) is the negligible difference between ϵ(n) and δ(n), the probability that \(V^{*}\) decommits when \(q_{2}\) is fixed):

  1. Case 1: ϵ(n)≤2μ(n): In this case, \(V^{*}\) decommits to \(c_{1}\) with only negligible probability. This means that the probability that \(\mathcal {S}\) even reaches the rewinding phase is negligible. Thus, \(\mathcal {S}\) only outputs fail with negligible probability.

  2. Case 2: ϵ(n)>2μ(n): Recall that \(V^{*}\) successfully decommits in any iteration with probability δ(n)=ϵ(n)−μ(n). Thus, the expected number of iterations needed until \(V^{*}\) successfully decommits is 1/(ϵ(n)−μ(n)). Now, since in this case ϵ(n)>2μ(n) we have that μ(n)<ϵ(n)/2 and so the expected number of rewinding attempts required to obtain a successful decommitment to \(q_{1}\) is less than 2/ϵ(n).

    Assuming that \(\tilde{\epsilon }\) is within a constant factor of ϵ(n), we have that \(2/\epsilon (n)=O(1/\tilde{\epsilon })\) and so the expected number of rewindings in any given rewinding attempt is bounded by \(O(1/\tilde{\epsilon })\). Therefore, by Markov’s inequality, the probability that \(\mathcal {S}\) tries more than \(n/\tilde{\epsilon }\) iterations in any given rewinding phase attempt is at most O(1/n). It follows that the probability that \(\mathcal {S}\) tries more than this number of iterations in n independent rewinding phases is negligible in n (specifically, it is bounded by \(O(1/n)^{n}\)). This covers the case that \(\tilde{\epsilon }\) is within a constant factor of ϵ(n). However, the probability that \(\tilde{\epsilon }\) is not within a constant factor of ϵ(n) is also negligible.

Putting the above together, we have that \(\mathcal {S}\) outputs fail with negligible probability only. □
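For concreteness, the Markov step in Case 2 above can be written out with explicit constants. This is a sketch assuming that the estimate satisfies \(\tilde{\epsilon }\leq 2\epsilon (n)\), which is the upper bound guaranteed by the estimation step except with probability at most \(2^{-n}\). The symbol Y, denoting the number of rewinding attempts until the first successful decommitment in a single phase, is introduced here only for illustration.

$$\begin{aligned}
\mathrm {E}[Y] &= \frac{1}{\epsilon (n)-\mu(n)} < \frac{2}{\epsilon (n)} && \text{(since } \mu(n)<\epsilon (n)/2\text{)},\\
\mathrm {Pr}\biggl[Y > \frac{n}{\tilde{\epsilon }} \biggr] &\leq \frac{\mathrm {E}[Y]\cdot \tilde{\epsilon }}{n} < \frac{2\tilde{\epsilon }}{n\cdot \epsilon (n)} \leq \frac{4}{n} && \text{(Markov, using } \tilde{\epsilon }\leq 2\epsilon (n)\text{)},\\
\mathrm {Pr}[\text{all } n \text{ phases exceed the limit}] &< \biggl(\frac{4}{n} \biggr)^{n}.
\end{aligned} $$

The last line uses the independence of the n rewinding phases and is negligible in n.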

Next, we prove the following:

Claim 2.4

The probability that \(\mathcal {S}\) outputs ambiguous is negligible in n.

Proof

The proof of this claim is identical to the proof of this fact in [10]. Intuitively, if there exists an infinite sequence of inputs x for which \(\mathcal {S}\) outputs ambiguous with non-negligible probability, then this can be used to break the computational binding of the \(C_{\mathrm {ph}}\) commitment scheme. The only subtlety is that \(\mathcal {S}\) runs in expected polynomial-time, whereas an attacker for the binding of the commitment scheme must run in strict polynomial-time. Nevertheless, this can be overcome by simply truncating \(\mathcal {S}\) to twice its expected running time. By Markov’s inequality, this reduces the success probability of the binding attack by at most 1/2, and so this is still non-negligible. □

It remains to prove that the output distribution generated by \(\mathcal {S}\) is computationally indistinguishable from the output of V in a real proof with an honest prover. We have already shown that \(\mathcal {S}\) outputs fail or ambiguous with only negligible probability. Thus, the only difference between the output distribution generated by \(\mathcal {S}\) and the output distribution generated in a real proof is that in the case that \(q_i=0\) the unopened commitments in the simulated transcript are all to 0, and not to the rest of the graph apart from the cycle. Indistinguishability therefore reduces to the hiding property of the perfectly binding (and computationally hiding) commitment scheme. Once again, the proof of this reduction is identical to the proof of this fact in [10] and so the details are omitted. This completes the proof.  □

Reducing the Knowledge Error to Zero

As shown in [1], if it is possible to find a valid witness to the statement being proved in time poly(n)/κ(n) and it is possible to detect when the extractor “fails” (i.e., in our case, outputs ⊥ because of the event “\(\mathsf {accept}_1\wedge \mathsf {collision}\)”), then the knowledge error of a proof of knowledge can be reduced to 0. This is achieved by running the existing knowledge extractor, and in the case of such a failure, finding a valid witness to the statement being proved and outputting it. For the case of Hamiltonicity, it is possible to naively find a cycle in time n!. Thus, in order to reduce the knowledge error to zero using this procedure, we simply need to run the basic Hamiltonicity protocol \(n\log n\) times in parallel, instead of just n times in parallel. This yields

$$\mathrm {Pr}[\mathsf {accept}_1\ \wedge\ \mathsf {collision}] \leq\frac{1}{2^{n\log n}} < \frac{1}{n!}, $$

and so in the case that this event occurs, the extractor can find a Hamiltonian cycle using the naive procedure, without affecting its polynomial expected running time. We therefore conclude:

Corollary 4

Assuming the existence of 2-round statistically-hiding commitment schemes, every \(\mathcal {NP}\) relation has a 5-round computational zero-knowledge system for proofs of knowledge, with zero knowledge error.
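The inequality \(2^{-n\log n}<1/n!\) used above follows from \(2^{n\log n}=n^{n}>n!\) for n≥2. The following sketch checks it with exact integer arithmetic; taking the logarithm to base 2 and rounding it up to an integer number of repetitions are assumptions made here for illustration, as are the function names.

```python
import math


def parallel_repetitions(n: int) -> int:
    """Number of parallel repetitions, n * ceil(log2 n) (rounding up is an assumption)."""
    return n * math.ceil(math.log2(n))


def error_beats_brute_force(n: int) -> bool:
    """Check 2^(-reps) < 1/n!, i.e. 2^reps > n!, using exact integers."""
    return 2 ** parallel_repetitions(n) > math.factorial(n)


# The bound holds for every graph size tested, starting from n = 2.
holds_for_small_n = all(error_beats_brute_force(n) for n in range(2, 200))
```

Intuitively, the extractor falls back to the n!-time search only with probability below 1/n!, so the fallback contributes less than one step to the expected running time.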

3 Semi-Simulatable Coin Tossing

In our protocol for constant-round zero-knowledge proofs of knowledge, the verifier’s query q is essentially chosen via a type of coin-tossing protocol. See Protocol 5 for a description of this coin-tossing protocol; recall that \(C_{\mathrm {ph}}^{\alpha}(x;r)\) is a perfectly-hiding commitment to x using the receiver-message α, and \(C_{\mathrm {pb}}\) is a perfectly-binding commitment scheme. For the sake of this section, we assume familiarity with the definitions of secure two-party computation; see [9, Chap. 7] and [4].

PROTOCOL 5

(Semi-Simulatable Coin Tossing)

  • Common Input: a security parameter \(1^{n}\) and a length parameter \(\ell\).

  • The protocol:

    1. Party \(P_2\)’s first step: \(P_2\) sends \(P_1\) the receiver-message α of the perfectly-hiding commitment scheme.

    2. Party \(P_1\)’s first step: \(P_1\) chooses a random string \(x\in_R\{0,1\}^{\ell}\) and computes \(c_{1}=C_{\mathrm {ph}}^{\alpha}(x;r)\) for a random r of the appropriate length. \(P_1\) sends \(c_1\) to \(P_2\).

    3. Party \(P_2\)’s second step: \(P_2\) chooses a random string \(y\in_R\{0,1\}^{\ell}\) and computes \(c_{2}=C_{\mathrm {pb}}(y;s)\) for a random s of the appropriate length. \(P_2\) sends \(c_2\) to \(P_1\).

    4. Party \(P_1\)’s second step: \(P_1\) decommits to \(c_1\) by sending x and r.

    5. Party \(P_2\)’s third step: If \(C_{\mathrm {ph}}^{\alpha}(x;r)\neq c_{1}\), then \(P_2\) outputs ⊥ and halts. Otherwise, \(P_2\) decommits to \(c_2\) by sending y and s.

    6. Outputs:

      (a) \(P_1\) checks that \(C_{\mathrm {pb}}(y;s)=c_{2}\). If not, it outputs ⊥. Otherwise, it outputs \(x\oplus y\).

      (b) \(P_2\) outputs \(x\oplus y\).
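The message flow of Protocol 5 can be sketched as follows for an honest execution. This is an illustration only: the hash-based `commit` function is a hypothetical stand-in that is neither perfectly hiding nor perfectly binding, the receiver-message α is modelled as a random salt, and the output is taken to be the bitwise XOR of x and y. The fixed lengths of `alpha` and of the commitment randomness are likewise assumptions.

```python
import hashlib
import secrets


def commit(value: bytes, rand: bytes, alpha: bytes = b"") -> bytes:
    """Toy hash commitment; a stand-in only (NOT perfectly hiding or perfectly binding)."""
    return hashlib.sha256(alpha + rand + value).digest()


def xor(a: bytes, b: bytes) -> bytes:
    """Bitwise XOR of two equal-length byte strings."""
    return bytes(u ^ v for u, v in zip(a, b))


def coin_toss(ell_bytes: int = 16):
    """Honest run of the five messages; returns (output of P1, output of P2)."""
    # Step 1: P2 -> P1, the receiver-message alpha.
    alpha = secrets.token_bytes(16)
    # Step 2: P1 -> P2, c1 = C_ph^alpha(x; r).
    x, r = secrets.token_bytes(ell_bytes), secrets.token_bytes(32)
    c1 = commit(x, r, alpha)
    # Step 3: P2 -> P1, c2 = C_pb(y; s).
    y, s = secrets.token_bytes(ell_bytes), secrets.token_bytes(32)
    c2 = commit(y, s)
    # Step 4: P1 -> P2, the decommitment (x, r).
    # Step 5: P2 verifies c1; on failure it outputs None (standing for ⊥) and halts.
    if commit(x, r, alpha) != c1:
        return None, None
    # Step 5 (continued): P2 -> P1, the decommitment (y, s).
    # Step 6: P1 verifies c2 before outputting; P2 outputs x XOR y.
    out1 = xor(x, y) if commit(y, s) == c2 else None
    out2 = xor(x, y)
    return out1, out2
```

In an honest run both checks pass and the two parties output the same string. The order of the messages is the point of the sketch: y is committed in Step 3, before x is revealed in Step 4.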

Protocol 5 has 5 rounds of communication (which is minimal by [16]) and is highly efficient; in particular, it does not use zero-knowledge proofs or arguments, as does the constant-round coin tossing protocol of [17]. In addition, the proof of the zero-knowledge property in Theorem 3 demonstrates that in the case that \(P_1\) (who is the verifier in Protocol 2) is corrupted, it is possible to prove security under the standard simulation-based definitions of [4]. This is because the simulation strategy for the proof of zero-knowledge works by the simulator first choosing the string q at random and then obtaining an execution in which the resulting query is q (with probability that is negligibly close to the probability that the verifier doesn’t abort). Thus, the same strategy works when the simulator receives a random string R from the trusted party and must generate an execution in which \(x\oplus y=R\), with probability that is negligibly close to the probability that \(P_1\) does not cause \(P_2\) to abort.

In addition, observe that when \(P_2\) is corrupted, the output of \(P_1\) from the coin-tossing protocol is uniformly distributed or ⊥. This holds because \(P_1\) commits to x using a perfectly-hiding commitment and \(P_2\) commits to y using a perfectly-binding commitment. Thus, \(P_2\) is fully committed to y before it knows anything (in an information-theoretic sense) about x. This implies that \(x\oplus y\) is uniformly distributed and so a corrupted \(P_2\) can either cause \(P_1\) to output this uniform string, or to abort and output ⊥. We call this level of security “semi-simulatable coin tossing”, and define it below.

In the following definition, we refer to the coin-tossing functionality f defined by \(f(1^{\ell},1^{\ell})=(U_{\ell},U_{\ell})\), meaning that each party inputs the length \(1^{\ell}\) and receives as output the same uniformly-distributed \(\ell\)-bit string. In addition, we denote by \(\mathsf {output}_{1}(\mbox {\textsc {real}}_{\pi,\mathcal {A}(z)}(1^{\ell},1^{\ell},n))\) the output of party \(P_1\) after interacting with adversary \(\mathcal {A}\) in the protocol π, with inputs \(1^{\ell}\) and security parameter n.
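The uniformity argument for a corrupted \(P_2\) rests on a simple fact: once y is fixed independently of x, the map taking x to the bitwise XOR of x and y is a bijection on \(\ell\)-bit strings, so a uniform x yields a uniform output. The following sketch verifies this exhaustively for a small, hypothetical length; it takes the protocol output to be the XOR of x and y, and the function name is illustrative.

```python
from collections import Counter
from itertools import product


def xor_output_counts(y, ell):
    """For a fixed y in {0,1}^ell, count each value x XOR y as x ranges over {0,1}^ell."""
    return Counter(
        tuple(a ^ b for a, b in zip(x, y))
        for x in product((0, 1), repeat=ell)
    )


ell = 3  # hypothetical small length, so that exhaustive enumeration is cheap
uniform_for_every_y = all(
    set(xor_output_counts(y, ell).values()) == {1}
    and len(xor_output_counts(y, ell)) == 2 ** ell
    for y in product((0, 1), repeat=ell)
)
```

Every output string is hit exactly once for each fixed y, which is why the only influence left to a corrupted \(P_2\) is whether to abort.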

Definition 6

A protocol \(\pi=(P_1,P_2)\) is a semi-simulatable coin-tossing protocol if the following holds:

  1. For every non-uniform probabilistic polynomial-time adversary \(\mathcal {A}\) controlling \(P_1\) in the real model, there exists a non-uniform probabilistic polynomial-time adversary/simulator \(\mathcal {S}\) for the ideal model such that

    $$\bigl\{\mbox {\textsc {ideal}}_{f,\mathcal {S}(z)}\bigl(1^\ell,1^\ell,n\bigr) \bigr\}_{z\in \{0,1\}^*;\ell,n\in {\mathbb {N}}} \stackrel {\mathrm {c}}{\equiv }\bigl\{\mbox {\textsc {real}}_{\pi,\mathcal {A}(z)}\bigl(1^\ell,1^\ell,n \bigr) \bigr\}_{z\in \{0,1\}^*;\ell,n\in {\mathbb {N}}}. $$
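  2. For every non-uniform probabilistic polynomial-time adversary \(\mathcal {A}\) controlling \(P_2\), and for every \(R_1,R_2\in\{0,1\}^{\ell}\), it holds that

    $$\mathrm {Pr}\bigl[\mathsf {output}_{1}\bigl(\mbox {\textsc {real}}_{\pi,\mathcal {A}(z)}\bigl(1^\ell,1^\ell,n\bigr)\bigr)=R_1\bigr] = \mathrm {Pr}\bigl[\mathsf {output}_{1}\bigl(\mbox {\textsc {real}}_{\pi,\mathcal {A}(z)}\bigl(1^\ell,1^\ell,n\bigr)\bigr)=R_2\bigr]. $$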

Based on the above discussion, we obtain the following theorem:

Theorem 7

If \(C_{\mathrm {ph}}\) is a two-round perfectly-hiding commitment scheme and \(C_{\mathrm {pb}}\) is a perfectly-binding commitment scheme, then Protocol 5 is a 5-round semi-simulatable coin-tossing protocol.