Simulation extractable versions of Groth’s zk-SNARK revisited

Zero-knowledge succinct non-interactive arguments of knowledge (zk-SNARKs) are the most efficient proof systems in terms of proof size and verification. Currently, Groth’s scheme from EUROCRYPT 2016, Groth16, is the state-of-the-art and is widely deployed in practice. Groth16 is originally proven to achieve knowledge soundness, which does not guarantee the non-malleability of proofs. There has been considerable progress in presenting new zk-SNARKs or modifying Groth16 to efficiently achieve strong simulation extractability, which is shown to be a necessary requirement in some applications. In this paper, we revise the random-oracle-based variant of Groth16 proposed by Bowe and Gabizon, BG18, the most efficient one in terms of prover efficiency and CRS size among the candidates, and present a more efficient variant that saves 2 pairings in the verification and 1 group element in the proof.
This supersedes our preliminary construction, presented in CANS 2020 (Baghery et al. in CANS 20, volume 12579 of LNCS, Springer, Heidelberg. pp 453-461, 2020), which saved 1 pairing in the verification and was proven in the generic group model. Our new construction also improves on BG18 in that our proofs are in the algebraic group model with random oracles and reduce security to standard computational assumptions in bilinear groups (as opposed to using the full power of the generic group model (GGM)). We implement our proposed simulation extractable zk-SNARK (SE zk-SNARK) along with BG18 in the Arkworks library, and compare the efficiency of our scheme with some related works. Our experiments confirm that our SE zk-SNARK is more efficient than all previous simulation extractable (SE) schemes in most dimensions and that its efficiency is very close to that of the original Groth16.


Introduction
Non-interactive zero-knowledge (NIZK) proof systems [11] are a fundamental family of cryptographic primitives that have appeared recently in a wide range of practical applications. A NIZK proof system allows a party to prove that for a public statement x, she knows a witness w such that (x, w) ∈ R, for some relation R, without leaking any information about w and without interaction with the verifier. Due to their impressive advantages, NIZK proof systems are used ubiquitously to build larger cryptographic protocols and systems. (A preliminary version of this paper appeared in the Proceedings of the 19th International Conference on Cryptology and Network Security, CANS 2020 [7].)
Zero-knowledge Succinct Arguments of Knowledge (zk-SNARKs) are among the most interesting NIZK proof systems in practice, as they allow generating very short proofs for NP-complete languages, which can be verified in less than 10 milliseconds [20,22]. Zk-SNARKs have had a tremendous impact in practice and have found numerous applications. Verifiable computation systems [32], privacy-preserving (PP) cryptocurrencies [8], PP smart contract systems [28], PP proof-of-stake protocols [24], and efficient ledger verification protocols [13] are some of the best known applications that use zk-SNARKs to prove different statements very efficiently while guaranteeing the privacy of the prover. Because of their practical importance, particularly in large-scale applications like blockchains, even minimal savings, especially in proof size or verification cost, are considered relevant.
In 2016, Groth [22] introduced the most efficient zk-SNARK for Quadratic Arithmetic Programs (QAPs), Groth16, which is still the state-of-the-art construction. It is constructed using bilinear groups; its proof consists of 3 group elements (2 from $\mathbb{G}_1$ and 1 from $\mathbb{G}_2$) and the cost of verification is dominated by 3 pairing computations. In the original paper, it is proven to achieve knowledge soundness in the generic group model (GGM). In 2018, Fuchsbauer, Kiltz, and Loss [19] defined the algebraic group model (AGM) and reproved its security in this weaker model. The proof of Groth16 is malleable, as shown in [23]. Generating non-malleable proofs is a necessary requirement in building various cryptographic schemes, including universally composable protocols [24,28], cryptocurrencies (e.g. Zcash) [8], signature-of-knowledge schemes [23], etc. Practical systems like the Zcash cryptocurrency [8], which uses the original Groth16 [22], make extra efforts to ensure the non-malleability of transactions and of the proofs of the underlying proof system. Considering such concerns, in practice it is important to have a stronger notion of knowledge soundness, known as (strong) Simulation Extractability (SE). This notion guarantees that a valid witness can be extracted from any adversary producing a proof accepted by the verifier, even after seeing an arbitrary number of simulated proofs.
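To make the malleability of Groth16 proofs concrete, the following toy sketch re-randomizes an accepting proof in two well-known ways. It rests on loud assumptions: group elements are represented by their discrete logarithms modulo a stand-in prime (so it offers no security), the whole public-input contribution to the verification equation is collapsed into a single hypothetical scalar `ic`, and the names `verify`, `maul_scale` and `maul_shift` are ours, not taken from any library.

```python
import random

P = 2**61 - 1  # toy prime standing in for the group order p (assumption)

def inv(x):
    """Modular inverse in Z_P (three-argument pow, Python 3.8+)."""
    return pow(x, -1, P)

def verify(alpha, beta, delta, ic, proof):
    """Groth16-style check written in the exponent:
    A*B = alpha*beta + ic + C*delta (ic is a simplified public-input term)."""
    A, B, C = proof
    return (A * B - alpha * beta - ic - C * delta) % P == 0

def maul_scale(proof, r):
    """(A, B, C) -> (r*A, B/r, C): the product A*B is unchanged."""
    A, B, C = proof
    return (r * A % P, inv(r) * B % P, C)

def maul_shift(proof, r, delta):
    """(A, B, C) -> (A, B + r*delta, C + r*A): both sides gain r*A*delta."""
    A, B, C = proof
    return (A, (B + r * delta) % P, (C + r * A) % P)

rng = random.Random(2016)  # fixed seed so the run is reproducible
alpha, beta, delta, ic, A, B = (rng.randrange(1, P) for _ in range(6))
# Solve the verification equation for C, as a simulator with the trapdoor would.
C = (A * B - alpha * beta - ic) * inv(delta) % P
proof = (A, B, C)

for mauled in (maul_scale(proof, 5), maul_shift(proof, 7, delta)):
    # A different proof for the same statement still verifies.
    print(mauled != proof, verify(alpha, beta, delta, ic, mauled))
```

In a real bilinear group both transformations only need public values (the proof itself and the crs element for δ in the second source group), which is why knowledge soundness alone does not rule them out.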
There have been considerable efforts to construct new SE zk-SNARKs or refine Groth's zk-SNARK to achieve SE and guarantee the non-malleability of proofs. First, in 2017 Groth and Maller [23] proposed an SE zk-SNARK which is very efficient in terms of proof size but very inefficient in terms of Common Reference String (crs) size and prover time. They also showed how one can use SE zk-SNARKs to build Signature of Knowledge (SoK) schemes [16] with succinct signatures. In 2018, Bowe and Gabizon [14] proposed a construction based on Groth16 that is less efficient in terms of proof size (5 group elements vs 3 in the original version) and needs a Random Oracle (RO) that returns group elements (in addition to the GGM), but has almost no overhead in the crs size or additional cost for the prover. In [29], Lipmaa proposed several constructions, including a QAP-based SE zk-SNARK that is efficient in terms of proof size and has the same verification complexity as [14,23], but is less efficient in terms of crs size and prover time compared to [14] and Groth16. In [2], Atapoor and Baghery used the traditional OR technique to achieve SE in Groth16. Their variant requires 1 pairing less for verification in comparison with previous SE constructions; however, it comes with an overhead in proof generation and crs size, and an even larger overhead in the proof size. For a particular instantiation they add ≈ 52,000 constraints to the underlying QAP instance, which adds a fixed overhead to the prover and crs size that can be considerable for mid-size circuits. They show that for a circuit with $10 \times 10^6$ Multiplication (Mul) gates their prover is about 10% slower, but the slowdown can be larger for circuits with fewer than $10 \times 10^6$ gates. In [26], Kim, Lee, and Oh proposed a QAP-based SE zk-SNARK with the same crs size and prover time as [29], but with slightly shorter proofs and more efficient verification.
These works also differ significantly in the assumptions they make for security. The scheme of Groth and Maller [23] is based on a knowledge assumption and other falsifiable computational assumptions, all of which are q-type assumptions where q is the size of the circuit. In that work, the authors avoid the generic group model by making a concrete knowledge assumption that is essential for extracting the witness. On the other hand, the construction of Bowe and Gabizon [14] uses the full power of the generic group model to prove security, plus the assumption that a certain hash function to group elements is a random oracle. All the constructions of Lipmaa [29] are proven secure in a weaker notion of the AGM, where the adversary has access to a random oracle that allows it to sample random elements obliviously in the group, i.e. without knowing their discrete logarithms.
Recently, Baghery, Kohlweiss, Siim, and Volkhov [6] explored another direction. Instead of modifying Groth16 to achieve strong SE, they first show that the original construction of Groth16 achieves weak SE with non-black-box extraction. Weak SE allows proof randomization, so the proof is malleable, while it guarantees that a proof cannot be changed to prove a new statement. Then, building on the first result, they proposed two efficient constructions of Groth16 that achieve weak SE with black-box extraction, which is shown to be necessary for UC-security. Both weak and strong SE zk-SNARKs can be lifted to achieve black-box simulation extractability with a simpler compiler [3,6], rather than with the COCO framework [27], which is constructed to lift (knowledge) sound NIZK proof systems to achieve black-box SE. However, to realize the standard ideal functionality defined for NIZK arguments, one would need to use a strong SE NIZK with black-box extraction [21]. Therefore, constructing a more efficient strong SE zk-SNARK would also allow building a more efficient black-box SE zk-SNARK to be used in UC-secure protocols.
Our Contributions. Our main contribution is to revise the simulation extractable variants of Groth16 presented in [14] and [2], to achieve better efficiency and get the best of both constructions, namely, achieving strong simulation extractability in Groth16 with minimal overhead.
Our focus is mainly on Bowe and Gabizon's variation [14], which has the most efficient prover and the shortest crs among the (strong) SE zk-SNARKs [2,14,23,26,29], while it uses an RO which returns group elements. To achieve (strong) [...]
[Table 1 notes: A typical set of values is $n = m = 10^6$ and $l = 10$. In the case of crs size and prover's computation we omit constants. In [23], n Mul gates and m wires translate to 2n squaring gates and 2m wires. In [2], SE is achieved with an OR approach which requires to add constraints and [...]]
In this paper, using the same approach as [14] with some subtle modifications, we construct a strong SE zk-SNARK that results in the most efficient (strong) simulation extractable variant of Groth16 in terms of crs size, prover complexity, and verification time. Our SE zk-SNARK uses a sophisticated modification of Boneh-Boyen signatures [12] to prove knowledge of the DLOG of $\delta'$, which requires 1 less $\mathbb{G}_1$ element in the proof and 2 pairings less in the verification in comparison with the argument of Bowe and Gabizon [14], at the cost of one additional exponentiation in the verification. Our construction supersedes and improves a preliminary version of this work presented at CANS 2020 [7], where in all constructions verification required at least one additional pairing and proofs were in the GGM.
Our construction modifies the proof generation of Groth16 slightly and includes the PoK of the DLOG of $[\delta']_2$ w.r.t. $[\delta]_2$ inside the original proof of Groth16. Using this, we manage to save 1 element in the proof and 2 pairings in the verification of Bowe and Gabizon's construction [14], at the cost of a single exponentiation in $\mathbb{G}_2$ in the verification. This construction shows that, using a random oracle, we can achieve strong SE in Groth16 at the cost of one additional $\mathbb{G}_2$ element in the proof and one new exponentiation in $\mathbb{G}_2$ in the verification. When verifying a large number of proofs, where verifiers of our constructions gain efficiency by using multi-scalar exponentiations, our construction achieves almost the same efficiency as Groth16.
Table 1 presents a comparison of our proposed variant of Groth16 with several other constructions for a particular instance of arithmetic circuit satisfiability. As can be seen, in comparison with Bowe and Gabizon's construction [14], our construction retains most of the properties and requires 2 fewer pairings in the verification, at the cost of 1 additional exponentiation in the verification. We also compare our construction with the results initially obtained and presented in CANS 2020 [7]. We note that in both our constructions the hash function maps into $\mathbb{Z}_p$ and not to a source group as in [14], which is an additional practical advantage. In comparison with Atapoor and Baghery's construction [2], both of our variants have a negligible overhead in the proof generation and crs size, and a smaller overhead in proof size. Above all, our best construction requires 3 pairings in the verification, instead of 4. We reduce security to a q-DLOG assumption in the AGM with random oracles, where q is the size of the circuit. In contrast, our preliminary result [7] was in the GGM but only required the hash function to be collision resistant.
As a part of our contribution, we also present an open-source prototype implementation of our constructions and of Bowe and Gabizon's scheme in the Arkworks library, which is currently one of the most popular Rust ecosystems for developing and programming with zk-SNARKs. We then use our implementations, along with the implementations of Groth16 [22] and Groth-Maller [23] that already exist in the Arkworks library, to present a comprehensive benchmark for the relevant simulation extractable zk-SNARKs [14,22,23]. Full details of our empirical analysis are reported in Sect. 4, in Table 2. As expected, the implementation results show that our new construction is more efficient than the first one, that it is more efficient than all previous SE zk-SNARKs in most dimensions and, more importantly, that it has an efficiency profile very close to that of the original Groth16, particularly when we need to verify a large number of proofs.
Finally, we highlight that, using the technique proposed in [23], both of our proposed SE zk-SNARKs can be turned into succinct SoK schemes, which would be more efficient than previous constructions. In general, due to relying on non-falsifiable assumptions, succinct SoK schemes have better efficiency in comparison with constructions that are built under standard assumptions [5,10,16]. We also note that to achieve strong (non-black-box) SE, our proposed zk-SNARKs require minimal changes in comparison with the original Groth16. Therefore, one can use the same compiler or ad-hoc approach proposed in [3] and [6], respectively, to construct a more efficient strong black-box SE zk-SNARK for UC-protocols [21].
Organization. In Sect. 2, we introduce notation and the relevant security definitions, and recall the Boneh-Boyen signature scheme. In Sect. 3, we present our new and most efficient SE zk-SNARK, whose efficiency is very close to that of Groth16. We evaluate the practical efficiency of both presented constructions in Sect. 4 using a prototype Rust implementation in the Arkworks library. We also compare the efficiency of our constructions with several relevant SE zk-SNARKs in the same section. Finally, we conclude the paper in Sect. 5. For the sake of completeness, in Appendix A we also recall our first SE zk-SNARK [7], which relaxes the RO in Bowe and Gabizon's scheme [14] to a collision resistant hash function and also saves 1 pairing in the verification. We implement that scheme as well and include it in our benchmarks.
Novelty. Compared to the conference version published in CANS 2020 [7], this version includes a more efficient construction, presented in Sect. 3, and a prototype Rust implementation of our constructions along with Bowe and Gabizon's scheme [14] in the Arkworks library, followed by a comprehensive efficiency comparison of relevant SE zk-SNARKs that is reported in detail in Sect. 4.

Notation and bilinear groups
We let BGgen be a probabilistic polynomial time algorithm which on input $1^\lambda$, where $\lambda$ is the security parameter, returns the description of an asymmetric bilinear group $gk = (p, \mathbb{G}_1, \mathbb{G}_2, \mathbb{G}_T, e, P_1, P_2)$, where $\mathbb{G}_1$, $\mathbb{G}_2$ and $\mathbb{G}_T$ are groups of prime order p, the elements $P_1, P_2$ are generators of $\mathbb{G}_1, \mathbb{G}_2$ respectively, $e : \mathbb{G}_1 \times \mathbb{G}_2 \to \mathbb{G}_T$ is an efficiently computable, non-degenerate bilinear map, and there is no efficiently computable isomorphism between $\mathbb{G}_1$ and $\mathbb{G}_2$.
Elements in $\mathbb{G}_i$ are denoted implicitly as $[a]_i = aP_i$, where $i \in \{1, 2, T\}$ and $P_T = e(P_1, P_2)$. With this notation, $e([a]_1, [b]_2) = [ab]_T$. We extend this notation naturally to vectors and matrices. We denote by negl(λ) an arbitrary negligible function in λ.
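The implicit notation can be mimicked with a small sketch. This is only a toy "exponent model" and not a secure or real pairing group: every element is stored as its discrete logarithm, the modulus `P` is a stand-in prime, and the names `Elem`, `bracket` and `pairing` are hypothetical helpers introduced for illustration.

```python
from dataclasses import dataclass

P = 2**61 - 1  # toy prime standing in for the group order p (assumption)

@dataclass(frozen=True)
class Elem:
    """A group element [a]_i, stored insecurely as (i, a mod P)."""
    group: str  # "1", "2" or "T"
    dlog: int   # discrete logarithm with respect to the generator P_i

    def __add__(self, other):
        # Group operation (written additively): [a]_i + [b]_i = [a + b]_i
        assert self.group == other.group, "elements live in different groups"
        return Elem(self.group, (self.dlog + other.dlog) % P)

    def __rmul__(self, scalar):
        # Scalar multiplication: k * [a]_i = [k * a]_i
        return Elem(self.group, (scalar * self.dlog) % P)

def bracket(a, i):
    """Implicit notation: [a]_i = a * P_i."""
    return Elem(str(i), a % P)

def pairing(x, y):
    """e([a]_1, [b]_2) = [a * b]_T."""
    assert x.group == "1" and y.group == "2", "e takes (G_1, G_2) inputs"
    return Elem("T", (x.dlog * y.dlog) % P)

a, b = 12345, 67890
lhs = pairing(bracket(a, 1), bracket(b, 2))
print(lhs == bracket(a * b, "T"))                      # pairing multiplies exponents
print(pairing(3 * bracket(a, 1), bracket(b, 2)) == 3 * lhs)  # bilinearity in the first slot
```

The same model is convenient for checking verification equations "in the exponent", which is how the proofs later in the paper reason about them.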

Definitions
For an algorithm A, let Im(A) be the image of A, i.e. the set of valid outputs of A. By y ← A(x; r) we denote the fact that A, given an input x and a randomizer r, outputs y.
We use the definitions of NIZK arguments from [22]. Let $\mathcal{R}$ be a relation generator, such that $\mathcal{R}(1^\lambda)$ returns a polynomial-time decidable binary relation R = {(x, w)}. Here, x is the statement and w is the witness. The security parameter λ can be deduced from the description of R. The relation generator also outputs auxiliary information $z_R$ that will be given to the honest parties and the adversary. In our constructions, $z_R$ will be the description of a bilinear group. As in [22], $z_R$ is the value returned by BGgen($1^\lambda$), and is given as an input to the parties.
Let $L_R = \{x : \exists w, (x, w) \in R\}$ be an NP-language. A NIZK argument system for $\mathcal{R}$ consists of a tuple of PPT algorithms (K, P, V, Sim), such that:
CRS generator: K is a PPT algorithm that, given $(R, z_R)$, where $(R, z_R) \in \mathrm{Im}(\mathcal{R}(1^\lambda))$, outputs crs := (crs_P, crs_V) and stores the trapdoors of crs as ts. We distinguish crs_P (needed by the prover) from crs_V (needed by the verifier).
Prover: P is a PPT algorithm that, given $(R, z_R, crs_P, x, w)$, where (x, w) ∈ R, outputs an argument π. Otherwise, it outputs ⊥.
Verifier: V is a PPT algorithm that, given $(R, z_R, crs_V, x, \pi)$, returns either 0 (reject) or 1 (accept).
Simulator: Sim is a PPT algorithm that, given $(R, z_R, crs, ts, x)$, outputs a simulated argument π.
Besides succinct proofs, i.e. polynomial in λ, an SE zk-SNARK is required to satisfy completeness, simulation extractability, and zero-knowledge.
Intuitively, perfect completeness states that an honest prover P always convinces an honest verifier V.
Definition 2 (Computational Knowledge-Soundness [22]) A non-interactive argument is computationally (adaptively) knowledge-sound for $\mathcal{R}$ if, for every non-uniform PPT A, there exists a non-uniform PPT extractor $\mathrm{Ext}_A$ such that, for all λ, the following probability is negl(λ):
$$\Pr\left[(crs, ts) \leftarrow K(R, z_R),\ (x, \pi) \leftarrow A(R, z_R, crs),\ w \leftarrow \mathrm{Ext}_A(\mathrm{trans}_A) : (x, w) \notin R \wedge V(R, z_R, crs_V, x, \pi) = 1\right].$$
Here, $\mathrm{trans}_A$ is a list containing all of A's inputs and outputs. Intuitively, the definition states that if an adversary can convince the verifier, she knows the witness. A knowledge-sound argument is also called an argument of knowledge.
Here, Q is the set of statements queried by the adversary to the simulation oracle O, and $\mathrm{trans}_A$ is a list containing all of A's inputs and outputs. Note that this variant of simulation extractability allows proof randomization, while it ensures that a proof cannot be changed to prove a new statement.
Here, Q is the set of simulated statement-proof pairs generated by the adversary's queries to the simulation oracle O, and $\mathrm{trans}_A$ is a list containing all of A's inputs and outputs. Note that both variants of simulation extractability imply knowledge soundness (given in Def. 2), as they are stronger notions in which the adversary is additionally allowed to query the proof simulation oracle.
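Definition 5 (Zero-Knowledge (ZK) [22]) A non-interactive argument is computationally ZK for $\mathcal{R}$ if, for all λ, all $(R, z_R) \in \mathrm{Im}(\mathcal{R}(1^\lambda))$, and all non-uniform PPT A, $\varepsilon_0 \approx_c \varepsilon_1$, where
$$\varepsilon_b = \Pr\left[(crs, ts) \leftarrow K(R, z_R) : A^{O_b(\cdot,\cdot)}(R, z_R, crs) = 1\right].$$
Here, the oracle $O_0(x, w)$ returns ⊥ (reject) if (x, w) ∉ R, and otherwise it returns $P(R, z_R, crs_P, x, w)$. Similarly, $O_1(x, w)$ returns ⊥ (reject) if (x, w) ∉ R, and otherwise it returns $\mathrm{Sim}(R, z_R, crs, ts, x)$. Intuitively, a non-interactive argument is ZK if it does not leak extra information beyond the truth of the statement.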

Boneh-Boyen signatures
We briefly recall one of the constructions of Boneh-Boyen signatures [12], that is used implicitly in our constructions.
Messages are elements of $\mathbb{Z}_p$, and signatures are elements of $\mathbb{G}_1$. The secret key is $sk \in \mathbb{Z}_p$, and the public key (verification key) is $[sk]_2 \in \mathbb{G}_2$. To sign a message $m \in \mathbb{Z}_p$, the signer computes $[\sigma]_1 = \left[\frac{1}{sk + m}\right]_1$. The verifier accepts the signature if the equation $e([\sigma]_1, [sk]_2 + [m]_2) = [1]_T$ holds. Boneh-Boyen signatures are existentially unforgeable under the q-SDH assumption. We use them in our constructions as proofs of knowledge of the secret key in the AGM.
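As a sanity check of the signing and verification equations, here is a minimal sketch in a toy exponent model. It is an illustration only: elements are represented by their discrete logarithms (so the "public key" below is literally the secret key, which a real verifier would never see), `P` is a stand-in prime, and the function names are ours.

```python
P = 2**61 - 1  # toy prime standing in for the group order p (assumption)

def sign(sk, m):
    """Return the discrete log of sigma = [1 / (sk + m)]_1."""
    denom = (sk + m) % P
    if denom == 0:
        raise ValueError("sk + m = 0 mod p: this message cannot be signed")
    return pow(denom, -1, P)  # modular inverse, Python 3.8+

def verify(pk, m, sigma):
    """Check e(sigma, [sk]_2 + [m]_2) = [1]_T in the exponent:
    sigma * (sk + m) = 1 mod p. Here pk is the dlog of [sk]_2 (toy model)."""
    return sigma * ((pk + m) % P) % P == 1

sk = 123456789
sigma = sign(sk, 42)
print(verify(sk, 42, sigma))   # the honest signature is accepted
print(verify(sk, 43, sigma))   # the same signature fails for another message
```

In a real instantiation the left-hand side is a single pairing of the signature with the publicly computable element $[sk]_2 + m[1]_2$, so the verifier never needs `sk`.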

Algebraic group model
The algebraic group model, or AGM for short [19], assumes that adversaries are algebraic, i.e. they construct their output group elements as a linear combination of previously seen group elements. This model is a weakening of the GGM [31,33], since algebraic adversaries have direct access to group elements and can use their representation. In the asymmetric algebraic group model, it is assumed that, for every element π in $\mathbb{G}_1$, $\mathbb{G}_2$ output by the adversary, it also outputs a set of coefficients in the field that express π as a linear combination of previously received group elements in the same source group. For elements in $\mathbb{G}_T$, the adversary also outputs the coefficients that express every such output element as a linear combination of elements in $\mathbb{G}_T$ that the adversary has received or can compute as the pairing of elements in $\mathbb{G}_1$ and $\mathbb{G}_2$ it has received.
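The requirement placed on an algebraic adversary can be pictured with a short sketch. It is a toy model under stated assumptions: group elements are replaced by their discrete logarithms modulo a stand-in prime, and `algebraic_adversary` is a hypothetical adversary that we made up to show the interface (an output together with its representation).

```python
P = 2**61 - 1  # toy prime standing in for the group order p (assumption)

def is_valid_representation(output, seen, coeffs):
    """Check that output = sum_k coeffs[k] * seen[k], i.e. that the claimed
    coefficients really explain the output element."""
    if len(seen) != len(coeffs):
        return False
    return output % P == sum(k * q for k, q in zip(coeffs, seen)) % P

def algebraic_adversary(seen):
    """Hypothetical adversary: combines the first and last seen elements and
    returns the output together with its coefficient vector."""
    coeffs = [0] * len(seen)
    coeffs[0] += 3
    coeffs[-1] += 5
    output = sum(k * q for k, q in zip(coeffs, seen)) % P
    return output, coeffs

seen = [11, 22, 33]            # elements received so far (as toy dlogs)
out, coeffs = algebraic_adversary(seen)
print(is_valid_representation(out, seen, coeffs))      # representation is consistent
print(is_valid_representation(out + 1, seen, coeffs))  # a wrong element is rejected
```

A reduction in the AGM uses exactly these coefficients: it treats the output as a known polynomial in the unknown exponents of the received elements.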
Several works (e.g. [17]) have proven security in the AGM with random oracles. In this case, the adversary has oracle access to a certain function $H : \{0, 1\}^* \to R$, and the assumption is that for every element π output by the adversary in $\mathbb{G}_1$, $\mathbb{G}_2$, it also outputs a set of coefficients in the field that express π as a linear combination of all previously received group elements, including those obtained as a response to a hash query if the range R is a group.
Note that when the range R of the hash function is a group, the oracle allows the adversary to sample obliviously from it, i.e. without knowing the discrete logarithm. In our case, the range of the RO is a field (whose size is the order of the elliptic curve group) and therefore, in our model, the adversary cannot obliviously sample in the group. As discussed by Lipmaa [29], we could consider strengthening our model by giving the adversary access to another oracle $H_2$ mapping to group elements, which grants this additional power. This model is more realistic, since in practice there usually exist hash-to-group algorithms that allow sampling in the curve without knowing the discrete logarithm.
Although the strengthened model is very meaningful and is a more realistic idealization of elliptic curves, we have not considered it, since it complicates the proof significantly, even though these additional elements, chosen uniformly at random and independently of the input of the adversary, intuitively cannot help the adversary except with negligible probability.
Following the work of Fuchsbauer et al. [19], we will prove that the security of our scheme reduces to the $(q_1, q_2)$-DLOG assumption, for a certain $(q_1, q_2)$ that depends on the size of supported instances. We note that to improve efficiency, as in [29], we rely on the asymmetric AGM, as opposed to the proof of Groth16 in [19,22].

SE variant of Groth16 in the ROM
To achieve (strong) simulation extractability, the prover of Bowe and Gabizon's construction [14] replaces all the computations which depend on $\delta$ given in the crs by some $\delta'$ of its choice, which it must give as part of the proof, together with a proof of knowledge of the DLOG of $\delta'$ w.r.t. $\delta$, which given some element [...]. In their analysis, H is an RO and their proof requires 2 pairings for verification. In Fig. 1, we describe an SE variant of Groth16 that uses a new technique to shorten the proof and verifies it with a single verification equation which requires 3 pairings, just as Groth16. The security analysis is done in the AGM, assuming the underlying hash function is a random oracle. The SE proof is built using a sequence of games. As part of the reduction, we need to rewrite in the AGM part of the proof of Bowe and Gabizon's construction [14], which is also in the random oracle model but in the generic group model.
Apart from the efficiency gain, from a security point of view one additional advantage of our construction is that the RO maps to elements in $\mathbb{Z}_p$ and does not need the property that H can sample elements of $\mathbb{G}$ obliviously (i.e. soundness does not use that the DLOG of image elements is hard).
The idea of Bowe and Gabizon of using a PoK of the DLOG of $\delta'$ was also used in our preliminary results presented in [7], included in Appendix A. The construction we present below improves on both previous works by choosing $\delta'$ as before but then replacing it by $\delta' + \delta m$ to create and verify the proof at once, where [...]. The intuition is that the adversary needs to know the division in the exponent of C by $\delta' + \delta m$. However, this is a degree one polynomial in $\delta$, and this is hard to do unless $\delta' = \zeta\delta$. The verification of this variant requires one additional exponentiation in $\mathbb{G}_2$. In the description of the new construction, we highlight the changes to Groth16 with gray background. We emphasize that the original scheme corresponds to m = 0 and ζ = 1.
(Footnote fragment: [...] to the univariate one by writing each new variable as a degree one polynomial of the same variable. These additional variables ultimately do not change the total degree of the polynomial that the adversary constructs as his output, which is what determines the loss in the reduction.)
Theorem 1 (Completeness, ZK, strong SE) The variant of Groth16 described in Fig. 1 is a non-interactive zero-knowledge argument that guarantees 1) perfect completeness, 2) perfect zero-knowledge and 3) strong simulation-extractability in the asymmetric Algebraic Group Model and the RO Model.
Proof To see why perfect completeness holds, it is easiest to rewrite the scheme in such a way that the terms A, B, C correspond exactly to Groth16, except that the original term $\delta$ is replaced by $\delta' + \delta m$. The prover creates A, B with the randomizers $r_a\delta'$, $r_b\delta'$, where $r_a, r_b \leftarrow \mathbb{Z}_p$. Then, it receives m and reinterprets A, B as being created for the randomized $\delta' + \delta m$ and some random values $s_a, s_b$. This means the prover finds the value $s_a$ such that $r_a\delta' = s_a(\delta' + \delta m)$. Solving the equation, we get $s_a = r_a\delta'/(\delta' + \delta m)$, and analogously for $s_b$. Then it computes C as in the original Groth16 paper but for $s_a, s_b$ and $\delta' + \delta m$, instead of $\delta$. Rewriting, we obtain: [...] Completeness easily follows from these formulae (in fact, it is identical to the completeness of Groth16 replacing $\delta$ by $\delta' + \delta m$). Similarly, perfect zero-knowledge can be argued in a standard way. Simulation extractability is proven by reduction in the AGM to the knowledge soundness of Groth16.
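The reinterpretation step of the completeness argument can be checked numerically. The sketch below is a toy illustration under assumptions we state explicitly: everything is computed on discrete logarithms modulo a stand-in prime, the honest prover's choice is written as $\delta' = \zeta\delta$, the public-input contribution is collapsed into one hypothetical scalar `ic`, and the function names are ours rather than the paper's.

```python
P = 2**61 - 1  # toy prime standing in for the group order p (assumption)

def inv(x):
    """Modular inverse in Z_P (Python 3.8+)."""
    return pow(x, -1, P)

def reinterpret_randomizer(r, zeta, m):
    """Solve r * delta' = s * (delta' + m * delta) for s when delta' = zeta * delta.
    The common factor delta cancels, leaving s = r * zeta / (zeta + m)."""
    return r * zeta % P * inv((zeta + m) % P) % P

def verify(alpha, beta, delta, delta_prime, m, ic, proof):
    """Modified check in the exponent (simplified public-input term ic):
    A * B = alpha * beta + ic + C * (delta' + m * delta)."""
    A, B, C = proof
    return (A * B - alpha * beta - ic - C * (delta_prime + m * delta)) % P == 0

delta, zeta, m, r_a = 1234567, 89, 1011, 1213
delta_prime = zeta * delta % P
s_a = reinterpret_randomizer(r_a, zeta, m)
# The two ways of writing the randomizer of A agree.
print(r_a * delta_prime % P == s_a * (delta_prime + m * delta) % P)

# With m = 0 and zeta = 1 the check collapses to the Groth16-style equation.
alpha, beta, ic, A, B = 3, 5, 7, 11, 13
C = (A * B - alpha * beta - ic) * inv(delta) % P
print(verify(alpha, beta, delta, delta, 0, ic, (A, B, C)))
```

The second check mirrors the remark that the original scheme corresponds to m = 0 and ζ = 1.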
Since the adversary is algebraic, for each output element it is possible to extract a list of coefficients that express it as a linear combination of previously seen elements. The view of an adversary A that has made a sequence of queries $x_1, \ldots, x_v$ to Sim(ts, ·), and received answers $\{\pi_j\}_{j=1}^{v}$, is the set Q, the union of the elements in the crs together with those from the replies of Sim(ts, ·); namely, [...], where [...], and $m_j \in \mathbb{Z}_p$ is the message that the simulator receives from the RO for each $A_j, B_j, \delta'_j$. Let $Q_1$ be the elements of Q in group $\mathbb{G}_1$ and $Q_2$ the elements in group $\mathbb{G}_2$. Now, assume that the adversary A has produced elements $([A, C]_1, [B, \delta']_2)$. The coefficients extracted for output element $[Y]_i$, for $i \in \{1, 2\}$, corresponding to element $q \in Q_i$ will be denoted by $k_{Y,q}$, so that for each element Y we have $Y = \sum_{q \in Q_i} k_{Y,q}\, q$. The reduction proceeds in a series of games, G0, ..., G4.
G0: This is the original simulation extractability soundness game. The adversary wins if the proof $\pi = ([A, C]_1, [B, \delta']_2)$ for some statement $(a_1, \ldots, a_l)$ is accepted and it is not the result of some previous query for the same statement.
G1: This game is the same as the previous one, except that it aborts if π is accepted but $k_{\delta',\delta} = -m$.
G2: This game is the same as the previous one, except that it aborts if π is accepted but, for some j = 1, ..., v, $\delta' = k_{\delta',\delta'_j}\delta'_j + k_{\delta',\delta}\delta$ and $m = m_j k_{\delta',\delta'_j} - k_{\delta',\delta}$.
G3: This game is the same as the previous one, except that it aborts if π is accepted but $\delta' \neq k_{\delta',\delta}\delta$.
G4: This game is the same as the previous one, except that an abort occurs if π is accepted but, to compute π, the adversary uses any of the answers of the simulation oracle.
From G3 on, it is clear that the reduction can extract $\zeta = \mathrm{DLOG}_{\delta}\,\delta'$ from the adversary, from which it can transform the adversary's output to a proof for Groth16 as [...]. Additionally, since in G4 the adversary does not use any of the answers to the simulation oracle, soundness in that game is implied by the knowledge soundness of Groth16.
We now proceed to bound the difference in the advantage in these games of any algebraic adversary A. Clearly, the advantages of A in G0 and G1 differ by at most 1/p, since the output of the random oracle is a uniform value chosen independently of the constants extracted, and the adversary can only be lucky in guessing this value with probability 1/p.
Fig. 1 A simulation-extractable variation of Groth16 for R. H is a family of collision resistant hash functions that map to $\mathbb{Z}_p^*$

Next we prove the following lemma:
Lemma 1 For all PPT algebraic adversaries A there exists an adversary B against the (v + 2, 1)-DLOG Assumption such that [...]
Proof Both games are identical except if adversary A outputs $\delta' \neq k_{\delta',\delta}\delta$. We show that in this case there exists another adversary B that breaks the (v + 2, 1)-DLOG Assumption.
Given some group key gk [...]. It then chooses $m_1, \ldots, m_v$ random values in $\mathbb{Z}_p$. It will store these values and give them as replies to the hash queries related to the simulation queries of A. Next, for j = 1, ..., v, it defines [...]. It programs the public parameters so that it can divide by $\delta$ and by $\delta'_j + m_j\delta$ for any j; that is, it defines the new group key included in the public parameters to be $gk' = (p, \mathbb{G}_1, \mathbb{G}_2, \mathbb{G}_T, e, \delta\prod_{j=1}^{v}(\delta'_j + m_j\delta)P_1, P_2)$. This can be computed from the input of B, since $\delta\prod_{j=1}^{v}(\delta'_j + m_j\delta)$ is a polynomial of degree (v + 1) in the indeterminate z.
Then, adversary B samples $x, \alpha, \beta \leftarrow \mathbb{Z}_p$, computes the common reference string honestly based on the new group key gk', and sends all this information to A. Note that this requires computing some expressions involving x, α, β divided by $\delta$, but B can do that by computing [...]. The terms in $\mathbb{G}_1$ have maximal degree v + 2, so they can be computed by B. Whenever B receives a simulation query $x_j$, it sets [...]. For this, it will use the fact that it can compute $(\delta'_j + m_j\delta)^{-1}P_1$ as [...]. If adversary A breaks simulation extractability for some $x = (a_1, \ldots, a_l)$, it has produced elements $(A, B, C, \delta')$ that pass the verification equation, so:
$$C = \frac{AB - ic - \alpha\beta}{\delta' + m\delta}.$$
We now study the denominator and numerator of this expression.
For a second, consider $\vec{\delta} = (\delta, \delta'_1, \ldots, \delta'_v)$ as formal variables and define the polynomial $P_{\delta'}(\vec{\delta})$ [...]. The polynomial $P_B(\vec{\delta})$ is defined analogously for the coefficients $k_{B,q}$, with $q \in Q_2$. On the other hand, we also define $R_A(\vec{\delta})$, $R_C(\vec{\delta})$ in a similar way, except that the result is not a polynomial but a sum of rational functions, since the view of A in $\mathbb{G}_1$ includes terms that have $\delta$, $\delta'_j + m_j\delta$ in the denominator.
If adversary A successfully distinguishes between the two games, $k_{\delta',\delta} \neq -m$, so $P_{\delta'}(\vec{\delta}) + m\delta$ is a polynomial of degree one in $\delta$. Further, there is no j such that $P_{\delta'}(\vec{\delta}) + m\delta = \chi(\delta'_j + m_j\delta)$ for some $\chi \in \mathbb{Z}_p$, since this would imply $\delta' = k_{\delta',\delta'_j}\delta'_j + k_{\delta',\delta}\delta$ and $m = m_j k_{\delta',\delta'_j} - k_{\delta',\delta}$, which is also an abort condition. If A is successful in distinguishing between the two games, $P_{\delta'}(\vec{\delta}) \neq k_{\delta',\delta}\delta$, and we are left with two possibilities: [...] But this equation cannot hold, since, as we argued, $P_{\delta'}(\vec{\delta}) + m\delta$ is not a multiple of $\delta$ or of $\delta'_j + m_j\delta$, the only terms that appear as denominators in any term in $R_C(\vec{\delta})$.
Note that this is a polynomial in $\vec{\delta}$, since $\delta \prod_{j=1}^{v} (\delta_j + m_j\delta)$ cancels out any of the denominators that appear in the terms of $R_A(\vec{\delta})$. Replacing $\delta = dZ + f$ and $\delta_j = d_j Z + f_j$ in $T$, we get a polynomial $T(Z)$ that depends on a single variable. Since $C = \frac{AB - ic - \alpha\beta}{\delta' + m\delta}$, $T(z) = 0$. On the other hand, $T(Z) \neq 0$ except with probability $1/p$. This is justified as follows: if $T(Z)$ were 0, all its coefficients would have to be 0. In particular, take the leading term in $Z$ of $T(Z)$: this is an expression involving only $d, d_j$, which are information-theoretically hidden from $\mathcal{A}$. If we think of this polynomial as a multivariate one of total degree $v + 3$ in the variables $d, d_j$, the probability that $\mathcal{A}$ chooses the coefficients $k_{A,q}, k_{B,q}, k_{\delta',q}, k_{C,q}$ such that this polynomial evaluates to 0 at $d, d_j$ can be bounded by $(v + 3)/p$. Therefore, $\mathcal{B}$ can solve the DLOG challenge by factoring $T$ and trying all the possible roots.
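The final step of the reduction, recovering $z$ as a root of a nonzero polynomial $T$ with $T(z) = 0$, can be illustrated with a toy sketch. Everything concrete below is an assumption made for illustration and is not taken from the paper: the tiny group (the order-101 subgroup of $\mathbb{Z}_{607}^*$, written multiplicatively), the example polynomial, and the brute-force root search, which stands in for a proper polynomial factoring algorithm over $\mathbb{Z}_p$.

```python
# Toy illustration: recover a discrete logarithm z from a nonzero polynomial T
# with T(z) = 0. Parameters are hypothetical and far too small to be secure.

P_ORDER = 101                                 # prime group order p
Q_MOD = 607                                   # prime modulus with p | (q - 1)
G = pow(2, (Q_MOD - 1) // P_ORDER, Q_MOD)     # generator of the order-p subgroup


def eval_poly(coeffs, x, p=P_ORDER):
    """Evaluate sum(coeffs[i] * x**i) mod p using Horner's rule."""
    acc = 0
    for c in reversed(coeffs):
        acc = (acc * x + c) % p
    return acc


def roots_mod_p(coeffs, p=P_ORDER):
    """Brute-force root search (stand-in for factoring T over Z_p)."""
    return [x for x in range(p) if eval_poly(coeffs, x, p) == 0]


def solve_dlog_from_poly(coeffs, challenge):
    """Try every root r of T and return the one with G^r == challenge."""
    for r in roots_mod_p(coeffs):
        if pow(G, r, Q_MOD) == challenge:
            return r
    return None


def poly_with_roots(roots, p=P_ORDER):
    """Coefficients (lowest degree first) of prod_r (X - r) mod p."""
    coeffs = [1]
    for r in roots:
        shifted = [0] + coeffs                         # X * coeffs
        scaled = [(-r * c) % p for c in coeffs] + [0]  # -r * coeffs
        coeffs = [(a + b) % p for a, b in zip(shifted, scaled)]
    return coeffs


secret_z = 37                                 # unknown to the solver
challenge = pow(G, secret_z, Q_MOD)           # the DLOG challenge
T = poly_with_roots([5, 37, 90])              # some nonzero T with T(z) = 0
recovered = solve_dlog_from_poly(T, challenge)
```

The solver only needs the challenge element and the coefficients of $T$; since $T$ has at most $\deg T$ roots, testing each candidate against the challenge costs at most $\deg T$ exponentiations.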
Lemma 2 For all PPT algebraic adversaries $\mathcal{A}$ there exists an adversary $\mathcal{B}$ against the [...]
Proof Both games are identical unless adversary $\mathcal{A}$ outputs an accepting proof that is built using the output of some simulation query. We show that in this case there exists another adversary $\mathcal{B}$ that breaks the $(v + 2, 1)$-DLOG Assumption.
Given some group key $gk$ [...]. It then chooses random values $m_1, \ldots, m_v$ in $\mathbb{Z}_p$. It will store these values and give them as replies to the hash queries related to the simulation queries of $\mathcal{A}$. Next, for $j = 1, \ldots, v$, it defines [...] and, as in the previous lemma, [...]. It programs the public parameters to compute $\delta$ and $\delta_j + m_j\delta$ roots for any $j$; that is, it defines the new group key included in the public parameters to be $gk$ [...] is a polynomial of degree $(v + 1)$ in the indeterminate $z$.
Then, adversary $\mathcal{B}$ samples $x \leftarrow \mathbb{Z}_p$, computes the common reference string honestly based on the new group key $gk$, and sends all this information to $\mathcal{A}$. This can be computed from $\mathcal{B}$'s input, since it only requires computing polynomials of degree at most 2 in $z$ in each source group.
Whenever $\mathcal{B}$ receives a simulation query $x_j$, it samples $\zeta_j, f_{A,j}, d_{A,j}, f_{B,j}, d_{B,j} \leftarrow \mathbb{Z}_p$, sets [...] and computes [...]. For this, it uses the fact that it can compute $\delta_j + m_j\delta_j$ roots in $\mathbb{G}_1$.
If adversary $\mathcal{A}$ distinguishes between both games, it outputs some $x = (a_1, \ldots, a_j)$ and $(A, B, C, \delta')$ that pass the verification equation and, further, it is possible to extract some $\zeta$ such that $\delta' + m\delta = (\zeta + m)\delta$. Therefore, it holds that: [...]
For a moment, consider [...] as formal variables. Define the polynomial [...]. Define $R_A(\vec{Y})$, $R_C(\vec{Y})$ in a similar way, with the coefficients $k_{A,q}, k_{C,q}$, $q \in Q_1$, extracted from the adversary, except that the result is not a polynomial but a sum of rational functions, since the view of $\mathcal{A}$ in $\mathbb{G}_1$ includes terms that have $\delta$ or $\delta_j + m_j\delta$ in the denominator. Note that [...] are polynomials in $\vec{Y}$ of degree $v + 2$, since all possible denominators are cancelled out. Multiplying both sides of equation (2) by $\delta \prod_{j=1}^{v} (\delta_j + m_j\delta)$ and replacing each group element by the corresponding polynomial, we get the following polynomial: [...]
If adversary $\mathcal{A}$ distinguishes between the two games, there is at least one coefficient of $P_C(\vec{Y})$ or $P_A(\vec{Y})$ accompanying $A_j$ or $C_j$ which is not zero, or at least one coefficient of $P_B(\vec{Y})$ accompanying $\delta_j$ or $B_j$ which is not zero. We show that this implies in all cases that $T(\vec{Y}) \neq 0$.
We start by arguing that $k_{A,\alpha} = 1$ and $k_{B,\beta} = 1$, since otherwise the term $\alpha\beta$ in equation (3) cannot be cancelled out. In other words, $R_A(\vec{Y}) = \alpha + \ldots$ and $P_B(\vec{Y}) = \beta + \ldots$, so $P_A(\vec{Y}) = \delta \prod_{j=1}^{v} (\delta_j + m_j\delta)\alpha + \ldots$ We next argue all cases of interest separately:
(a) If the coefficient $k_{B,\delta_j} \neq 0$ for some $j$, then in $P_A(\vec{Y})P_B(\vec{Y})$ the coefficient of $\alpha\delta \prod_{j=1}^{v} (\delta_j + m_j\delta)\delta_j$ is $k_{B,\delta_j}$, but it is 0 for the rest of the terms ($P_C$ can have no $\delta_j$ terms because the group is asymmetric, $ic(\vec{Y})$ does not have $\delta_j$ terms by definition, and the last term has no monomials without $\beta$). Therefore, the coefficient of this monomial is not zero and $T(\vec{Y}) \neq 0$.
(b) Similarly, if the coefficient $k_{B,B_j} \neq 0$ for some $j$, then in [...]
Finally, we show that if $T(\vec{Y}) \neq 0$, there exists an adversary against the $(v + 2, 1)$-DLOG Assumption. Indeed, suppose that $T(\vec{Y}) \neq 0$. Define the univariate polynomial $T(Z)$ as the result of substituting each variable in $\vec{Y}$ by an expression in the same indeterminate $Z$, as $\alpha =$ [...] If $T(Z)$ is not zero and we know from expression (2) that $T(z) = 0$, adversary $\mathcal{B}$ can find $z$ by factoring $T$, solving the DLOG challenge. On the other hand, to argue that $T(Z) \neq 0$ except with probability $(v + 3)/p$, we resort to the same argument as in the last step of Lemma 1.
This concludes the reduction to the knowledge soundness of Groth16, which was reduced in the symmetric AGM to the $(2n - 1)$-DLOG Assumption.
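The probability bound $(v + 3)/p$ used in the last step of both lemmas follows the pattern of the Schwartz-Zippel lemma: a nonzero polynomial of total degree $D$ over $\mathbb{Z}_p$ vanishes at a uniformly random point with probability at most $D/p$. The following exhaustive toy check makes this concrete under stated assumptions: the polynomial $T(Y_1, Y_2) = Y_1 Y_2 - Y_1^2$, the prime $p = 101$ and the offsets $f_1 = 3$, $f_2 = 7$ are hypothetical choices, not objects from the proof. It substitutes $Y_i = d_i Z + f_i$ and counts the pairs $(d_1, d_2)$ for which the leading coefficient in $Z$ vanishes.

```python
# Toy check of the Schwartz-Zippel style bound for a hypothetical polynomial
# T(Y1, Y2) = Y1*Y2 - Y1^2 of total degree 2 over Z_p, with p = 101.

p = 101


def poly_mul(a, b):
    """Multiply two univariate polynomials (coefficients lowest degree first) mod p."""
    out = [0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] = (out[i + j] + x * y) % p
    return out


def poly_sub(a, b):
    """Subtract two univariate polynomials mod p, padding to equal length."""
    n = max(len(a), len(b))
    a = a + [0] * (n - len(a))
    b = b + [0] * (n - len(b))
    return [(x - y) % p for x, y in zip(a, b)]


def substituted_T(d1, f1, d2, f2):
    """Return T(d1*Z + f1, d2*Z + f2) as a polynomial in Z."""
    y1 = [f1 % p, d1 % p]
    y2 = [f2 % p, d2 % p]
    return poly_sub(poly_mul(y1, y2), poly_mul(y1, y1))


# Count the (d1, d2) pairs whose leading (Z^2) coefficient is zero.
bad = sum(
    1
    for d1 in range(p)
    for d2 in range(p)
    if substituted_T(d1, 3, d2, 7)[2] == 0
)
total = p * p
degree = 2
```

Here the leading coefficient is $d_1 d_2 - d_1^2$, which does not depend on $f_1, f_2$, and the fraction `bad / total` of vanishing choices stays below `degree / p`, as the bound predicts.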

Empirical analysis
We evaluate the efficiency of our simulation extractable variants of Groth's zk-SNARK using a prototype implementation in Arkworks, an ecosystem written in Rust for developing and programming with zk-SNARKs. Prototype implementations of both Groth16 [22] and Groth and Maller's zk-SNARK [23] are already available in the Arkworks library. To obtain a fair and comprehensive comparison, we also provide an efficient implementation of Bowe and Gabizon's construction [14] and of our initial construction [7] in the same library. Our empirical analysis is done with the elliptic curves BLS12-381, MNT4-298, MNT6-298, MNT4-753 and MNT6-753. BLS12-381 is estimated to achieve between 117 and 120 bits of security [30], and the other four curves are estimated to achieve, respectively, $2^{77}$, $2^{87}$, $2^{113}$ and $2^{137}$ security [9]. All experiments are done on a desktop machine with Ubuntu 20.04.2 LTS, an Intel Core i9-9900 processor at base frequency 3.1 GHz, and 128 GB of memory. Proof generation is done in multi-thread mode with 16 threads, while proof verification is done in single-thread mode.
Following the benchmark strategy of the Arkworks library, we report the Per-Constraint Proving Time (PCPT) and the verification time for both of the proposed constructions in Sect. 3 and Appendix A, and compare their efficiency with the (weak or strong) SE zk-SNARKs of Groth16 [22], Groth-Maller (GM17) [23] and Bowe-Gabizon (BG18) [14]. Motivated by blockchain and large-scale applications like Zcash [8], we also compare the (deterministic) verification time of all constructions for the case where one needs to verify a large number of proofs for a particular language simultaneously. In the verification step of our constructions, one needs to compute exponentiations in $\mathbb{G}_2$ and $\mathbb{G}_T$, which can be optimized by Multi-Scalar Multiplication (MSM) techniques.
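To give an idea of why MSM helps, the sketch below contrasts independent exponentiations with interleaved (Straus-style) exponentiation, one of the simplest MSM variants, in which all terms share the same squarings. It is only an illustration under stated assumptions: the group is the multiplicative group modulo a small prime rather than $\mathbb{G}_2$ or $\mathbb{G}_T$, the function names are hypothetical, and production libraries typically rely on more advanced bucket-based methods.

```python
# Toy multi-scalar multiplication (written multiplicatively): compute
# prod_i bases[i] ** scalars[i] mod modulus. Scalars must be non-negative.


def msm_naive(bases, scalars, modulus):
    """One independent exponentiation per term, then multiply the results."""
    acc = 1
    for b, e in zip(bases, scalars):
        acc = (acc * pow(b, e, modulus)) % modulus
    return acc


def msm_interleaved(bases, scalars, modulus):
    """Straus-style interleaving: a single chain of squarings shared by all terms."""
    acc = 1
    nbits = max((e.bit_length() for e in scalars), default=0)
    for bit in range(nbits - 1, -1, -1):      # most significant bit first
        acc = (acc * acc) % modulus           # one shared squaring per bit
        for b, e in zip(bases, scalars):
            if (e >> bit) & 1:                # multiply in bases whose bit is set
                acc = (acc * b) % modulus
    return acc


MODULUS = 607                                  # small prime, illustration only
bases = [3, 5, 7, 11]
scalars = [12, 0, 255, 1023]
same_result = msm_interleaved(bases, scalars, MODULUS) == msm_naive(bases, scalars, MODULUS)
```

With $k$ terms and $b$-bit scalars, the naive approach performs roughly $k \cdot b$ squarings, whereas the interleaved version performs only $b$; this sharing is what makes verifying many proofs together cheaper than verifying them one by one.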
Table 2 presents an empirical analysis of our constructions and compares them with several relevant SE zk-SNARKs for an R1CS instance with 400,000 constraints and 10 input variables. The reported times are average values over 100 iterations for proof generation and 10,000 iterations for verification. As can be seen, similar to the BG18 construction [14], the provers of our constructions are almost as efficient as that of Groth's protocol, while, due to a different NP characterization, the GM17 scheme is considerably less efficient than the other schemes. For instance, to generate a proof for an arithmetic circuit with 400,000 constraints on the BLS12-381 curve, Groth16, BG18, and both of our constructions require ≈ 2.01 seconds, while GM17 needs ≈ 4.41 seconds. Among the compared strong SE constructions, GM17 has the shortest proof, namely 2 elements from $\mathbb{G}_1$ and 1 element from $\mathbb{G}_2$, and our construction in Sect. 3 has the second shortest proof, namely 2 elements from $\mathbb{G}_1$ and 2 elements from $\mathbb{G}_2$.
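The per-constraint figures implied by these totals can be worked out directly. The short calculation below assumes that PCPT is simply the total proving time divided by the number of constraints; the dictionary keys are labels chosen here for illustration, and the input numbers are the BLS12-381 timings quoted above.

```python
# Per-Constraint Proving Time from the BLS12-381 totals quoted in the text.
# Assumed definition: PCPT = total proving time / number of constraints.

CONSTRAINTS = 400_000
proving_seconds = {
    "Groth16, BG18, ours": 2.01,
    "GM17": 4.41,
}

pcpt_microseconds = {
    scheme: seconds / CONSTRAINTS * 1e6
    for scheme, seconds in proving_seconds.items()
}

# How much slower GM17 proving is relative to the other schemes.
gm17_slowdown = proving_seconds["GM17"] / proving_seconds["Groth16, BG18, ours"]
```

Under this assumption, proving costs about 5.0 microseconds per constraint for Groth16, BG18 and both proposed constructions, and about 11.0 microseconds for GM17, a factor of roughly 2.2.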
In the last two columns of Table 2, we report the verification time of all constructions for the case where we need to verify $10^2$ or $10^3$ proofs of the same language. When verifying a large number of proofs, our constructions use the MSM technique to compute the needed exponentiations in all proofs at the same time, which saves on total verification time. As can be seen, our construction presented in Sect. 3 has the most efficient verification among the strong SE constructions and, above all, when verifying a large number of proofs, the total verification time of both of our constructions improves significantly using the MSM technique. In particular, the verification of our second construction is very close in efficiency to that of the original Groth16. For instance, in the case of BLS12-381, when we verify 100 proofs, the total verification time is ≈ 0.190 seconds for Groth16 and ≈ 0.194 seconds for our second construction. The gap is small, and the larger the number of proofs we verify, the smaller it gets.
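As a quick sanity check of how small this gap is, the following calculation derives the amortized per-proof verification time and the relative overhead from the two BLS12-381 totals quoted above. The dictionary keys are illustrative labels, and the figures are only the 100-proof totals given in the text.

```python
# Amortized verification cost for 100 proofs on BLS12-381, from the quoted totals.

N_PROOFS = 100
total_seconds = {
    "Groth16": 0.190,
    "ours (second construction)": 0.194,
}

per_proof_ms = {
    scheme: seconds / N_PROOFS * 1e3
    for scheme, seconds in total_seconds.items()
}

# Relative overhead of the second construction with respect to Groth16.
relative_gap = (
    total_seconds["ours (second construction)"] - total_seconds["Groth16"]
) / total_seconds["Groth16"]
```

This gives about 1.90 ms per proof for Groth16 against about 1.94 ms for the second construction, a relative overhead of roughly 2%.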

Conclusion
Over the last few years, various zk-SNARKs have been proposed that achieve (strong) simulation extractability [2,14,23,29], a security property that is stronger than knowledge soundness and prevents attacks from adversaries who have seen simulated proofs. Simulation extractability implies non-malleability of proofs [23], and its variant with black-box extraction is shown to be sufficient for achieving UC-security in NIZK arguments [21]. SE zk-SNARKs allow us to build succinct signature-of-knowledge schemes [16,23], and they can also be used to build chameleon hash functions [25].
In this paper, we revised the SE variation of Groth16 proposed in [14] and presented a new variation. Our initial construction from CANS 2020 ([7], Appendix A) requires 4 pairings in verification, instead of 5 in [14], and also avoids random oracles in exchange for using a collision resistant hash function. It has a more efficient prover, CRS size, and proof size in comparison with [2], which also has 4 pairings in the verification. Our new variant uses some subtle modifications to shorten the proof size and significantly improve the verification of Bowe and Gabizon's construction [14]. In this variant, we showed that, using a random oracle, we can achieve strong SE in Groth16 at the cost of one additional $\mathbb{G}_2$ element in the proof and one new exponentiation in $\mathbb{G}_2$ in the verification, where the latter introduces negligible overhead to the verification of Groth16 in cases where one needs to verify a large number of proofs for the same circuit (e.g., Zcash [8]).
We evaluated the empirical performance of our constructions in the Arkworks library. Our evaluations showed that our constructions are among the most efficient SE zk-SNARKs. In particular, in large-scale applications, the CRS, the prover, and the verifier of our new SE zk-SNARK are almost as efficient as those of the original Groth16. The only difference is that in our case the proof consists of 4 group elements, instead of 3 in the original construction of Groth16. This seems to be a minimal cost to achieve strong SE in Groth16.
Fig. 2 Our initial strong SE variant of Groth16 for R along with a modification of the Boneh-Boyen signature. In the protocol, H is a family of collision resistant hash functions that map to $\mathbb{Z}_p^*$ [...] [7] is changed to another PoK in the GGM that relies on the collision resistance property of the hash function. In Fig. 2, the elements $[\alpha\beta, t(x), \gamma t(x)]_T$ are redundant and can in fact be computed from the rest of the elements in the crs. Alternatively, one can describe Groth16 as corresponding to $\zeta = 1$, $\gamma = 0$, where the proof consists only of $[A, C]_1, [B]_2$. Differences with Groth16 are highlighted. We briefly give an intuition behind the construction in the following.
Avoiding Random Oracle. In [7], it is proven that the variation of Groth16 described in Fig. 2 guarantees (1) perfect completeness, (2) perfect zero-knowledge and (3) simulation-extractability in the asymmetric GGM. The proof uses the collision resistance property of the hash function and the GGM. Roughly speaking, the new variable $\gamma$ gives some additional guarantees because, to compute $\frac{t(x)(\gamma + m)}{\delta' + \delta m}$ from $D_j$ such that $m_j = m$, it is necessary to know both $\frac{1}{\delta' + \delta m}$ and $\frac{\gamma}{\delta' + \delta m}$, but this is only possible when $\delta' + \delta m = k\delta$. Then, either one knows the DLOG of $\delta'$ with respect to $\delta$ ($k - m$), which is straightforward, or one has re-used $\delta_j$ and $m_j$ from some $j$th query. The last case is ruled out because it implies that the same message had to be re-used, $m = m_j$, which breaks the collision resistance of the hash.
[...] while in the other terms it is 0, in which case $T(\vec{Y}) \neq 0$.
(c) If the coefficient $k_{A,A_j} \neq 0$ for some $j$, then in $P_A(\vec{Y})P_B(\vec{Y})$ the coefficient of the monomial $A_j\beta$ is $k_{A,A_j}$, while in the other terms it is 0 (because in $P_C$ there can be no $\beta$ term and $ic(\vec{Y})$ does not have $A_j$ terms by definition). Therefore, $T(\vec{Y}) \neq 0$.
(d) If the coefficient $k_{A,C_j} \neq 0$ for some $j$, the analysis is the same as in (b). Therefore, $T(\vec{Y}) \neq 0$.
(e) If the coefficient $k_{C,A_j} \neq 0$, the only term with $A_j$ would be $P_C(\vec{Y})(\zeta_j + m)\delta$, since we ruled out case (c). Therefore, $T(\vec{Y}) \neq 0$.
(f) If the coefficient $k_{C,C_j} \neq 0$, the only term with $C_j$ would be $P_C(\vec{Y})(\zeta_j + m)\delta$, since we ruled out case (d). Therefore, $T(\vec{Y}) \neq 0$.

Table 1
A comparison of our proposed variations of Groth16 along with the other SE zk-SNARKs for arithmetic circuit satisfiability with $n$ Mul gates (constraints) and $m$ wires (variables), of which $l$ are public input wires (variables) [...] variables, resulting in $n' \approx n + 52{,}000$, $m' \approx m + 52{,}000$, and $l' = l + 4$. $\mathbb{G}_1$, $\mathbb{G}_2$ and $\mathbb{G}_T$: group elements, $E_i$: exponentiation in group $\mathbb{G}_i$, $M_i$: multiplication in group $\mathbb{G}_i$, $P$: pairings. GGM: Generic group model, ROM: Random oracle model, AGM: Algebraic group model, HAK: Hash algebraic knowledge assumption, LCR: Linear collision resistance hash functions, CRH: Collision resistant hash, VE: Number of verification equations, WSE: Weak simulation extractable, SSE: Strong simulation extractable

Table 2
A comparison of the practical efficiency of our proposed variants of Groth16 SNARKs with several elliptic curves. The benchmarks are done with an R1CS instance with 400,000 constraints and 10 input values; proving times are averaged over 100 iterations and verification times over $10^3$ iterations. Proof generation is done in a multi-thread setting with 16 threads, while verification is done in a single-thread setting