Exponentiating in Pairing Groups
Abstract
We study exponentiations in pairing groups for the most common security levels and show that, although the Weierstrass model is preferable for pairing computation, it can be worthwhile to map to alternative curve representations for the non-pairing group operations in protocols.
At the turn of the century it was shown that elliptic curves can be used to build powerful cryptographic primitives: bilinear pairings [14, 36, 49]. Pairings are used in a large variety of protocols, and even when considering the recent breakthrough paper which shows how to instantiate multilinear maps using ideal lattices [26], pairings remain the preferred choice for a bilinear map due to their superior performance. Algorithms to compute cryptographic pairings involve computations on elements in all three pairing groups, \(\mathbb {G}_1\), \(\mathbb {G}_2\) and \(\mathbb {G}_T\), but protocols usually require many additional standalone exponentiations in any of these three groups. In fact, protocols often compute only a single pairing but require many operations in any or all of \(\mathbb {G}_1\), \(\mathbb {G}_2\) and \(\mathbb {G}_T\) [13, 28, 47]. In this work, we use such scenarios as a motivation to enhance the performance of group operations that are not the pairing computation.
Using non-Weierstrass models for elliptic curve group operations can give rise to significant speedups (cf. [9, 10, 31, 43]). Such alternative models have not found the same success within pairing computations, since Miller’s algorithm [42] not only requires group operations, but also relies on the computation of functions with divisors corresponding to these group operations. These functions are somewhat inherent in the Weierstrass group law, which is why Weierstrass curves remain faster for the pairings themselves [17]. Nevertheless, this does not mean that alternative curve models cannot be used to give speedups in the standalone group operations in pairing-based protocols. The purpose of this paper is to determine which curve models are applicable in the most popular pairing scenarios, and to report the speedups achieved when employing them. In order to obtain meaningful results, we have implemented curve arithmetic in different models that target the 128-, 192- and 256-bit security levels. Specifically, we have implemented group exponentiations and pairings on BN curves [4] (embedding degree \(k=12\)), KSS curves [38] (\(k=18\)) and BLS curves [3] (\(k=12\) and \(k=24\)). We use GLV [25] and GLS [23] decompositions of dimensions \(2\), \(4\), \(6\) and \(8\) to speed up the scalar multiplication.
The goal of this work is not to set new software speed records, but to illustrate the improved performance that is possible from employing different curve models in the pairing groups \(\mathbb {G}_1\) and \(\mathbb {G}_2\). In order to provide meaningful benchmark results, we have designed our library using recoding techniques [21, 29] such that all code runs in constant time, i.e. the runtime of the code is independent of any secret input material. Our implementations use state-of-the-art algorithms for computations in the various groups [24] and for evaluating the pairing [2]. For any particular curve or security level, we assume that the ratios between our various benchmark results remain (roughly) invariant when implemented for different platforms or when the bottleneck arithmetic functions are converted to assembly. We therefore believe that our table of timings provides implementers and protocol designers with good insight as to the relative computational expense of operating in pairing groups versus computing the pairing(s).
2 Preliminaries
A cryptographic pairing \(e: \mathbb {G}_1 \times \mathbb {G}_2 \rightarrow \mathbb {G}_T\) is a bilinear map that relates the three groups \(\mathbb {G}_1\), \(\mathbb {G}_2\) and \(\mathbb {G}_T\), each of prime order \(r\). These groups are defined as follows. For distinct primes \(p\) and \(r\), let \(k\) be the smallest positive integer such that \(r\mid p^k-1\). Assume that \(k>1\). For an elliptic curve \(E/\mathbb {F}_p\) such that \(r\mid \#E(\mathbb {F}_p)\), we can choose \(\mathbb {G}_1 = E(\mathbb {F}_p)[r]\) to be the order-\(r\) subgroup of \(E(\mathbb {F}_p)\). We have \(E[r] \subset E(\mathbb {F}_{p^k})\), and \(\mathbb {G}_2\) can be taken as the (order-\(r\)) subgroup of \(E(\mathbb {F}_{p^k})\) of \(p\)-eigenvectors of the \(p\)-power Frobenius endomorphism on \(E\). Let \(\mathbb {G}_T\) be the group of \(r\)-th roots of unity in \(\mathbb {F}_{p^k}^*\). The embedding degree \(k\) is very large (i.e. \(k \approx r\)) for general curves, but must be kept small (i.e. \(k <50\)) if computations in \(\mathbb {F}_{p^k}\) are to be feasible in practice – this means that so-called pairing-friendly curves must be constructed in a special way. In Sect. 2.1 we recall the best known techniques for constructing such curves with embedding degrees that target the 128-, 192- and 256-bit security levels – \(k\) is varied to optimally balance the size of \(r\) and the size of \(\mathbb {F}_{p^k}\), which respectively determine the complexity of the best known elliptic curve and finite field discrete logarithm attacks.
2.1 Parameterized Families of Pairing-Friendly Curves with Sextic Twists
The most suitable pairing-friendly curves for our purposes come from parameterized families, such that the parameters to find a suitable curve \(E(\mathbb {F}_p)\) can be written as univariate polynomials. For the four families we consider, we give below the polynomials \(p(x)\), \(r(x)\) and \(t(x)\), where \(t(x)\) is such that \(n(x) = p(x)+1-t(x)\) is the cardinality of the desired curve, which has \(r(x)\) as a factor. All of the curves found from these constructions have \(j\)-invariant zero, which means they can be written in Weierstrass form as \(y^2=x^3+b\). Instances of these pairing-friendly families can be found by searching through integer values \(x\) of an appropriate size until we find \(x=x_0\) such that \(p=p(x_0)\) and \(r=r(x_0)\) are simultaneously prime, at which point we can simply test different values for \(b\) until the curve \(E: y^2=x^3+b\) has an \(n\)-torsion point.
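The definition of the embedding degree can be illustrated by brute force on toy parameters. The following is only a sketch: the function name `embedding_degree`, the search bound and the small primes are illustrative assumptions and are not parameters from this paper.

```python
def embedding_degree(p, r, bound=50):
    """Return the smallest k >= 1 with r | p^k - 1, or None if no k <= bound works."""
    acc = 1
    for k in range(1, bound + 1):
        acc = acc * p % r          # acc = p^k mod r
        if acc == 1:
            return k
    return None


# Toy examples (far too small for cryptographic use).
print(embedding_degree(103, 97))   # a pairing-friendly toy pair
print(embedding_degree(5, 13))
print(embedding_degree(2, 97, bound=10))   # order of 2 mod 97 exceeds the bound
```

For a randomly chosen pair \((p, r)\) of cryptographic size, the loop would essentially never terminate within a small bound, which is the reason pairing-friendly curves need a special construction.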
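The search procedure just described can be sketched for the BN family. The display equations for the families are not reproduced in this text, so the polynomials below are the BN polynomials as they are commonly stated in the literature [4]; treat them, and the helper names `is_prime`, `bn_params` and `find_bn`, as assumptions of this sketch. Trial division is only adequate for the toy sizes used here.

```python
def is_prime(n):
    """Trial division; adequate only for toy-sized inputs."""
    if n < 2:
        return False
    i = 2
    while i * i <= n:
        if n % i == 0:
            return False
        i += 1
    return True


def bn_params(x):
    """Evaluate the (assumed) BN polynomials p(x), r(x), t(x) at an integer x."""
    p = 36 * x**4 + 36 * x**3 + 24 * x**2 + 6 * x + 1
    r = 36 * x**4 + 36 * x**3 + 18 * x**2 + 6 * x + 1
    t = 6 * x**2 + 1
    return p, r, t


def find_bn(start=1):
    """Return the first x >= start for which p(x) and r(x) are both prime."""
    x = start
    while True:
        p, r, t = bn_params(x)
        if is_prime(p) and is_prime(r):
            return x, p, r, t
        x += 1


x0, p, r, t = find_bn()
print(x0, p, r, t)
print(p + 1 - t == r)   # for BN curves the group order n(x) equals r(x)
```

For a real instance one would replace `is_prime` by a probabilistic primality test and search over values of \(x\) of roughly 63 bits.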
For the above families, which all have \(k=2^i3^j\), the best practice to construct the full extension field \(\mathbb {F}_{p^k}\) is to use a tower of (intermediate) quadratic and cubic extensions [5, 40]. Since \(6 \mid k\), we can always use a sextic twist \(E'(\mathbb {F}_{p^{k/6}})\) to represent elements of \(\mathbb {G}_2\subset E(\mathbb {F}_{p^k})[r]\) as elements of an isomorphic group \(\mathbb {G}_2' = E'(\mathbb {F}_{p^{k/6}})[r]\). This shows that group operations in \(\mathbb {G}_2\) can be performed on points with coordinates in an extension field with degree one sixth the size, which is the best we can do for elliptic curves [50, Proposition X.5.4].
In all cases considered in this work, the most preferable sextic extension from \(\mathbb {F}_{p^{k/6}}=\mathbb {F}_p(\xi )\) to \(\mathbb {F}_{p^{k}}=\mathbb {F}_{p^{k/6}}(z)\) is constructed by taking \(z\in \mathbb {F}_{p^k}\) as a root of the polynomial \(z^6-\xi \), which is irreducible in \(\mathbb {F}_{p^{k/6}}[z]\). We describe the individual towers in the four cases as follows: the BN and BLS cases with \(k=12\) preferably take \(p \equiv 3 \mathrm{~mod~ }{4}\), so that \(\mathbb {F}_{p^2}\) can be constructed as \(\mathbb {F}_{p^2}=\mathbb {F}_p[u]/(u^2+1)\), and take \(\xi =u+1\) for the sextic extension to \(\mathbb {F}_{p^{12}}\). For \(k=18\) KSS curves, we prefer that \(2\) is not a cube in \(\mathbb {F}_p\), so that \(\mathbb {F}_{p^3}\) can be constructed as \(\mathbb {F}_{p^3}=\mathbb {F}_p[u]/(u^3+2)\), before taking \(\xi =u\) to extend to \(\mathbb {F}_{p^{18}}\). For \(k=24\) BLS curves, we again prefer to construct \(\mathbb {F}_{p^2}\) as \(\mathbb {F}_{p^2}=\mathbb {F}_p[u]/(u^2+1)\), on top of which we take \(\mathbb {F}_{p^4}=\mathbb {F}_{p^2}[v]/(v^2-(u+1))\) (it is easily shown that \(v^2-u\) cannot be irreducible [18, Proposition 1]), and use \(\xi =v\) for the sextic extension. All of these constructions agree with the towers used in the “speed-record” literature [1, 2, 18, 48].
2.2 The GLV and GLS Algorithms
The GLV [25] and GLS [23] methods both use an efficient endomorphism to speed up elliptic curve scalar multiplications. The GLV method relies on endomorphisms specific to the shape of the curve \(E\) that are unrelated to the Frobenius endomorphism. On the other hand, the GLS method works over extension fields where Frobenius becomes non-trivial, so it does not rely on \(E\) having a special shape. However, if \(E\) is both defined over an extension field and has a special shape, then the two can be combined [23, Sect. 3] to give higher-dimensional decompositions, which can further enhance performance.
Since in this paper we have \(E/\mathbb {F}_p: y^2=x^3+b\) and \(p \equiv 1 \mathrm{~mod~ }{3}\), we can use the GLV endomorphism \(\phi : (x,y) \mapsto (\zeta x, y)\) in \(\mathbb {G}_1\) where \(\zeta ^3=1\) and \(\zeta \in \mathbb {F}_p \setminus \{1\}\). In this case \(\phi \) satisfies \(\phi ^2+\phi +1=0\) in the endomorphism ring \(\mathrm{End}(E)\) of \(E\), so on \(\mathbb {G}_1\) it corresponds to scalar multiplication by \(\lambda _\phi \), where \(\lambda _\phi ^2+\lambda _\phi +1 \equiv 0 \mathrm{~mod~ }{r}\), meaning we get a 2-dimensional decomposition in \(\mathbb {G}_1\). Since \(\mathbb {G}_2'\) is always defined over an extension field herein, we can combine the GLV endomorphism above with the Frobenius map to get higher-dimensional GLS decompositions. The standard way to do this in the pairing context [24] is to use the untwisting isomorphism \(\varPsi \) to move points from \(\mathbb {G}_2'\) to \(\mathbb {G}_2\), where the \(p\)-power Frobenius \(\pi _p\) can be applied (since \(E\) is defined over \(\mathbb {F}_p\), while \(E'\) is not), before using the twisting isomorphism \(\varPsi ^{-1}\) to move this result back to \(\mathbb {G}_2'\). We define \(\psi \) as \(\psi = \varPsi ^{-1} \circ \pi _p \circ \varPsi \), which (even though \(\varPsi \) and \(\varPsi ^{-1}\) are defined over \(\mathbb {F}_{p^k}\)) can be explicitly described over \(\mathbb {F}_{p^{k/6}}\). The GLS endomorphism \(\psi \) satisfies \(\varPhi _k(\psi ) = 0\) in \(\mathrm{End}(E')\) [24, Lemma 1], where \(\varPhi _k(\cdot )\) is the \(k\)-th cyclotomic polynomial, so it corresponds to scalar multiplication by \(\lambda _\psi \), where \(\varPhi _k(\lambda _\psi ) \equiv 0 \mathrm{~mod~ }{r}\), i.e. \(\lambda _\psi \) is a primitive \(k\)-th root of unity modulo \(r\).
For the curves with \(k=12\), we thus obtain a \(4\)-dimensional decomposition in \(\mathbb {G}_2' \subset E'(\mathbb {F}_{p^2})\); for \(k=18\) curves, we get a \(6\)-dimensional decomposition in \(\mathbb {G}_2'\subset E'(\mathbb {F}_{p^3})\); and for \(k=24\) curves, we get an \(8\)-dimensional decomposition in \(\mathbb {G}_2'\subset E'(\mathbb {F}_{p^4})\).
To obtain the \(d\) mini-scalars \(s_0, \dots , s_{d-1}\) from the scalar \(s\) and the close vector \((\hat{s}_0,\dots ,\hat{s}_{d-1})\), we compute \((s_0, \dots , s_{d-1}) = (s, 0, \dots , 0) - (\hat{s}_0,\dots ,\hat{s}_{d-1})\) in \(\mathbb {Z}^d\). We can then compute \([s]P_0\) via the multi-exponentiation \(\sum _{i=0}^{d-1} [s_i] P_i\). The typical way to do this is to start by making all of the \(s_i\) positive: we simultaneously negate any (\(s_i,P_i\)) pair for which \(s_i < 0\) (this can be done in a side-channel resistant way using bitmasks). We then precompute all possible sums \(\sum _{i=0}^{d-1} [b_i]P_i\), for the \(2^d\) combinations of \(b_i \in \{0,1\}\), and store them in a lookup table. When simultaneously processing the \(j\)-th bits of the \(d\) mini-scalars, this allows us to update the running value with only one point addition, before performing a single point doubling. In each case however, this standard approach requires individual attention for further optimization – this is what we describe in Sect. 3.
We aim to create constant-time programs: implementations which have an execution time independent of any secret material (e.g. the scalar). This means that we always execute exactly the same number of point additions and doublings, independent of the input. In order to achieve this in the setting of scalar multiplication using the GLV/GLS method, we use the recoding techniques from [21, 29]. This recoding technique not only guarantees that the program performs a constant number of point operations, but that the recoding itself is done in constant time as well. Furthermore, an advantage of this method is that the lookup table size is reduced by a factor of two, since we only store lookup elements for which the multiple of the first point \(P_0\) is odd. Besides reducing the memory, this reduces the time to create the lookup table.
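The statement that \(\phi \) acts on \(\mathbb {G}_1\) as multiplication by a root \(\lambda _\phi \) of \(\lambda ^2+\lambda +1\) modulo \(r\) can be checked numerically on a toy curve. The sketch below assumes the toy primes \(p=103\) and \(r=97\) (obtained from the standard BN polynomials at \(x=1\), which is not a parameter from this paper), finds \(b\) by brute-force point counting, and uses plain affine arithmetic; all helper names are hypothetical and nothing here is constant time.

```python
p, r = 103, 97   # toy primes (assumption); p = 1 mod 3 and r = 1 mod 3


def add(P, Q):
    """Affine addition on y^2 = x^3 + b over F_p; None is the point at infinity."""
    if P is None:
        return Q
    if Q is None:
        return P
    (x1, y1), (x2, y2) = P, Q
    if x1 == x2 and (y1 + y2) % p == 0:
        return None
    if P == Q:
        lam = 3 * x1 * x1 * pow(2 * y1 % p, -1, p) % p
    else:
        lam = (y2 - y1) * pow((x2 - x1) % p, -1, p) % p
    x3 = (lam * lam - x1 - x2) % p
    return x3, (lam * (x1 - x3) - y1) % p


def mul(k, P):
    """Double-and-add scalar multiplication [k]P for k >= 0."""
    R = None
    while k:
        if k & 1:
            R = add(R, P)
        P = add(P, P)
        k >>= 1
    return R


def curve_order(b):
    """Count points on y^2 = x^3 + b over F_p by brute force."""
    roots = {}
    for y in range(p):
        roots[y * y % p] = roots.get(y * y % p, 0) + 1
    return 1 + sum(roots.get((x**3 + b) % p, 0) for x in range(p))


b = next(c for c in range(1, p) if curve_order(c) == r)
P = next((x, y) for x in range(p) for y in range(p) if (y * y - x**3 - b) % p == 0)
zeta = next(z for z in range(2, p) if pow(z, 3, p) == 1)      # cube root of unity in F_p
lams = [l for l in range(2, r) if (l * l + l + 1) % r == 0]   # roots of l^2 + l + 1 mod r

phi_P = (zeta * P[0] % p, P[1])                               # phi(P) = (zeta * x, y)
matches = [l for l in lams if mul(l, P) == phi_P]
print(b, P, zeta, lams, matches)
```

Exactly one of the two roots in `lams` matches the chosen \(\zeta \); picking the other cube root of unity selects the other root.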
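The table-based multi-exponentiation described above can be sketched generically. This is a simplified illustration and not the routine used in the library: it is not constant time, it assumes the mini-scalars have already been made non-negative, and it doubles before adding in each iteration. The group is passed in as the hypothetical callables `add` and `dbl`; the demonstration uses the additive group of integers modulo an arbitrary modulus as a stand-in for an elliptic curve group.

```python
def multi_exp(scalars, points, add, dbl, zero):
    """Compute sum_i [scalars[i]] points[i] with one 2^d-entry lookup table."""
    d = len(scalars)
    # table[m] holds the sum of points[i] over the set bits i of m.
    table = [zero] * (1 << d)
    for m in range(1, 1 << d):
        low = (m & -m).bit_length() - 1          # index of the lowest set bit
        table[m] = add(table[m & (m - 1)], points[low])
    acc = zero
    for j in reversed(range(max(s.bit_length() for s in scalars))):
        acc = dbl(acc)
        # Gather the j-th bit of every mini-scalar into one table index.
        m = sum(((s >> j) & 1) << i for i, s in enumerate(scalars))
        acc = add(acc, table[m])
    return acc


# Stand-in group: integers modulo n under addition (assumption for illustration).
n, P, lam = 1009, 5, 6
s = 1000
mini = [4, 4, 3, 4]                              # base-6 digits of 1000, low digit first
points = [P * lam**i % n for i in range(4)]      # P_i = [lam^i] P
result = multi_exp(mini, points,
                   add=lambda a, b: (a + b) % n,
                   dbl=lambda a: 2 * a % n,
                   zero=0)
print(result, s * P % n)
```

In a real implementation the table lookup and the sign handling would be done with masking so that the memory access pattern does not depend on the scalar.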
3 Strategies for GLV in \(\mathbb {G}_1\) and GLS in \(\mathbb {G}_2\)
This section presents our high-level strategy for \(2\)-GLV on \(\mathbb {G}_1\), \(4\)-GLS in \(\mathbb {G}_2\) in the two \(k=12\) families, \(6\)-GLS in \(\mathbb {G}_2\) for the KSS curves with \(k=18\), and \(8\)-GLS in \(\mathbb {G}_2\) for the BLS curves with \(k=24\). We use the following abbreviations for elliptic curve operations that we require: DBL – for the doubling of a projective point, ADD – for the addition between two projective points, MIX – for the addition between a projective point and an affine point, and AFF – for the addition between two affine points to give a projective point.
3.1 \(2\)-GLV on \(\mathbb {G}_1\)
For both BLS families and the KSS family, we get a simple GLV scalar decomposition and obtain the mini-scalars by writing \(s\) as a linear function in \(\lambda _\phi \). This has the additional advantage that both \(s_0\) and \(s_1\) are positive. For BN curves, we use the algorithm from [46] for the decomposition. In this setting, the mini-scalars can be negative, so we must ensure that they become positive (see Sect. 2.2) before using Algorithm 1 to generate the lookup table.
3.2 \(4\)-GLS on \(\mathbb {G}_2\) for BN and BLS Curves with \(k=12\)
In the BLS case, we have \(\lambda _\psi (x) = x\), which means \(|\lambda _\psi | \approx r^{1/4}\), so we get a 4-dimensional decomposition in \(\mathbb {G}_2\) by writing the scalar \(0 \le s <r\) in base \(|\lambda _\psi |\) as \(s=\sum _{i=0}^3 s_i |\lambda _\psi |^i\), with \(0\le s_i <|\lambda _\psi |\) [24, Example 3]. On the other hand, the mini-scalars resulting from the decomposition on BN curves in [24, Example 5] can be negative.
3.3 \(6\)-GLS on \(\mathbb {G}_2\) for KSS Curves with \(k=18\)
To decompose the scalar for \(6\)-GLS on \(\mathbb {G}_2\) for KSS curves, we use the technique from [46], after which we must ensure all the \(s_i\) are non-negative according to Sect. 2.2. In this case, the decision of the window size (being \(w=1\)) is again trivial, since a window of size \(w=2\) requires a lookup table of size \(2^{11}\). On input of \(P_i\) corresponding to \(s_i>0\), for \(0\le i \le 5\), we generate the 32 elements of the lookup table as follows. We use Algorithm 3 to produce \(T[0],\ldots ,T[7]\) (using \(P_0,\ldots , P_3\)). We compute \(T[8] \leftarrow \mathtt{AFF}(T[0],P_4)\) and \(T[i] \leftarrow \mathtt{MIX}(T[i-8],P_4)\) for \(9\le i \le 15\). Next, we compute \(T[16] \leftarrow \mathtt{AFF}(T[0],P_5)\) and \(T[i] \leftarrow \mathtt{MIX}(T[i-16],P_5)\) for \(17\le i \le 31\).
3.4 \(8\)-GLS on \(\mathbb {G}_2\) for BLS Curves with \(k=24\)
BLS curves with \(k=24\) have \(\lambda _\psi (x)=x\), which means \(|\lambda _\psi | \approx r^{1/8}\), so one can compute an 8-dimensional decomposition in \(\mathbb {G}_2\) by writing the scalar \(0 \le s <r\) in base \(|\lambda _\psi |\) as \(s=\sum _{i=0}^7 s_i |\lambda _\psi |^i\), with \(0\le s_i <|\lambda _\psi |\) [24, Example 4]. We use the 8-dimensional decomposition strategy studied in [15]: the idea is to split the lookup table (a single large lookup table would consist of \(128\) entries) into two lookup tables consisting of eight elements each. In this case, we need to compute twice the number of point additions when simultaneously processing the mini-scalars (see Table 3), but we save around \(120\) point additions in generating the lookup table(s). Let \(T_1\) be the table consisting of the 8 entries \(P_0+\sum _{i=1}^3[b_i]P_i\), for \(b_i \in \{0,1\}\), which is generated using Algorithm 3 on \(P_0,\dots ,P_3\). The second table, \(T_2\), consists of the 8 entries \(P_4+\sum _{i=5}^7[b_i]P_i\) for \(b_i \in \{0,1\}\), and can be precomputed as \(T_2[j] \leftarrow \psi ^4(T_1[j])\), for \(j = 0,\dots ,7\). With the specific tower construction for \(k=24\) BLS curves (see Sect. 2.1), the map \(\psi ^4: \mathbb {G}_2 \rightarrow \mathbb {G}_2\) significantly simplifies to \(\psi ^4: (x,y) \mapsto (c_x x, c_y y)\), where the constants \(c_x\) and \(c_y\) are in \(\mathbb {F}_p\).
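The base-\(\lambda _\psi \) decomposition for the \(k=12\) BLS case amounts to repeated division with remainder. The sketch below assumes the seed \(x = 2^{106}-2^{72}+2^{69}-1\) (the value quoted for this curve in Sect. 4.4, with its signs read so that \(x \equiv 7 \bmod 12\)) and the commonly stated BLS12 polynomial \(r(x)=x^4-x^2+1\); the function name `decompose` is hypothetical, and the routine is not constant time.

```python
def decompose(s, lam, d):
    """Write s = sum_i s_i * lam^i with 0 <= s_i < lam; returns [s_0, ..., s_{d-1}]."""
    digits = []
    for _ in range(d):
        s, digit = divmod(s, lam)
        digits.append(digit)
    assert s == 0, "scalar does not fit in d base-lam digits"
    return digits


x = 2**106 - 2**72 + 2**69 - 1      # assumed seed for the k = 12 BLS curve
r = x**4 - x**2 + 1                 # assumed BLS12 group order polynomial r(x)

s = r - 1                           # largest scalar in the range 0 <= s < r
mini = decompose(s, x, 4)
print([m.bit_length() for m in mini], r.bit_length())
```

Since \(r < x^4\) for this seed, every scalar below \(r\) fits in four digits, each of roughly a quarter of the bit length of \(r\).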
4 Alternate Curve Models for Exponentiations in Groups \(\mathbb {G}_1\) and \(\mathbb {G}_2\)
An active research area in ECC involves optimizing elliptic curve arithmetic through the use of various curve models and coordinate systems (see [9, 31] for an overview). For example, in ECC applications the fastest arithmetic to realize a group operation on Weierstrass curves of the form \(y^2=x^3+b\) requires \(16\) field multiplications [9], while a group addition on an Edwards curve can incur as few as \(8\) field multiplications [33]. While alternative curve models are not favorable over Weierstrass curves in the pairing computation itself [17], they can still be used to speed up the elliptic curve operations in \(\mathbb {G}_1\) and \(\mathbb {G}_2\).
4.1 Three Non-Weierstrass Models

\(\mathcal {W}\) – Weierstrass: all curves in this paper have \(j\)-invariant zero and Weierstrass form \(y^2=x^3+b\). The fastest formulas on such curves use Jacobian coordinates [8].

\(\mathcal {J}\) – Extended Jacobi quartic: if an elliptic curve has a point of order 2, then it can be written in (extended) Jacobi quartic form as \(\mathcal {J} :y^2=dx^4+ax^2+1\) [11, Sect. 3] – these curves were first considered for cryptographic use in [11, Sect. 3]. The fastest formulas work on the corresponding projective curve given by \(\mathcal {J} :Y^2Z^2=dX^4+aX^2Z^2+Z^4\) and use the 4 extended coordinates \((X :Y :Z :T)\) to represent a point, where \(x=X/Z\), \(y=Y/Z\) and \(T=X^2/Z\) [34].

\(\mathcal {H}\) – Generalized Hessian: if an elliptic curve (over a finite field) has a point of order 3, then it can be written in generalized Hessian form as \(\mathcal {H} :x^3+y^3+c=dxy\) [20, Theorem 2]. The authors of [37, 51] studied Hessian curves of the form \(x^3+y^3+1=dxy\) for use in cryptography, and this was later generalized to include the parameter \(c\) [20]. The fastest formulas for ADD/MIX/AFF are from [7] while the fastest DBL formulas are from [32] – they work on the homogeneous projective curve given by \(\mathcal {H} :X^3+Y^3+cZ^3=dXYZ\), where \(x=X/Z\), \(y=Y/Z\). We note that the \(j\)-invariant zero version of \(\mathcal {H}\) has \(d=0\) (see Sect. 4.3), so in Table 1 we give updated costs that include this speedup.

\(\mathcal {E}\) – Twisted Edwards: if an elliptic curve has a point of order 4, then it can be written in twisted Edwards form as \(\mathcal {E} :ax^2+y^2=1+dx^2y^2\) [6, Theorem 3.3]. However, if the field of definition, \(K\), has \(\#K \equiv 1 \mathrm{~mod~ }{4}\), then \(4 \mid E\) is enough to write \(E\) in twisted Edwards form [6, Sect. 3] (i.e. we do not necessarily need a point of order 4). Twisted Edwards curves [19] were introduced to cryptography in [6, 10] and the best formulas are from [33].
Table 1. The costs of necessary operations for computing group exponentiations on four models of elliptic curves. Costs are reported as \(\mathbf {T}_{\mathbf{M}, \mathbf{S}, \mathbf{d}, \mathbf{a}}\), where \(\mathbf{M}\) is the cost of a field multiplication, \(\mathbf{S}\) is the cost of a field squaring, \(\mathbf{d}\) is the cost of multiplication by a curve constant, \(\mathbf{a}\) is the cost of a field addition (we have counted multiplications by 2 as additions), and \(\mathbf {T}\) is the total number of multiplications, squarings, and multiplications by curve constants.

Model/coords | Requires | DBL cost | ADD cost | MIX cost | AFF cost
--- | --- | --- | --- | --- | ---
\(\mathcal {W}\)/Jac. | – | \(\mathbf{7}_{2,5,0,14}\) | \(\mathbf{16}_{11,5,0,13}\) | \(\mathbf{11}_{7,4,0,14}\) | \(\mathbf{6}_{4,2,0,12}\)
\(\mathcal {J}\)/ext. | pt. of order 2 | \(\mathbf{9}_{1,7,1,12}\) | \(\mathbf{13}_{7,3,3,19}\) | \(\mathbf{12}_{6,3,3,18}\) | \(\mathbf{11}_{5,3,3,18}\)
\(\mathcal {H}\)/proj. | pt. of order 3 | \(\mathbf{7}_{6,1,0,11}\) | \(\mathbf{12}_{12,0,0,3}\) | \(\mathbf{10}_{10,0,0,3}\) | \(\mathbf{8}_{8,0,0,3}\)
\(\mathcal {E}\)/ext. | pt. of order 4, or \(4 \mid E\) and \(\#K \equiv 1 \mathrm{~mod~ }{4}\) | \(\mathbf{9}_{4,4,1,7}\) | \(\mathbf{10}_{9,0,1,7}\) | \(\mathbf{9}_{8,1,0,7}\) | \(\mathbf{8}_{7,0,1,7}\)
For each model, we summarize the cost of the required group operations in Table 1. The total number of field multiplications is reported in bold for each group operation – this includes multiplications, squarings and multiplications by constants. We note that in the context of plain ECC these models have been studied with small curve constants; in pairing-based cryptography, however, we must put up with whatever constants we get under the transformation to the non-Weierstrass model. The only exception we found in this work is for the \(k=12\) BLS curves, where \(\mathbb {G}_1\) can be transformed to a Jacobi quartic curve with \(a=-1/2\), which gives a worthwhile speedup [34].
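To compare the models under a given cost ratio, the operation counts of Table 1 can be plugged into a rough cost model for the main loop of an exponentiation (one doubling and one addition per iteration). The sketch below is only an estimate under stated assumptions: the weights for squarings, constant multiplications and additions are free parameters (the value `s=0.8` is a commonly used rule of thumb, not a measurement from this paper), and table generation and conversion costs are ignored.

```python
# (M, S, d, a) counts per operation, copied from Table 1.
COSTS = {
    "W/Jac.":  {"DBL": (2, 5, 0, 14), "ADD": (11, 5, 0, 13), "MIX": (7, 4, 0, 14), "AFF": (4, 2, 0, 12)},
    "J/ext.":  {"DBL": (1, 7, 1, 12), "ADD": (7, 3, 3, 19), "MIX": (6, 3, 3, 18), "AFF": (5, 3, 3, 18)},
    "H/proj.": {"DBL": (6, 1, 0, 11), "ADD": (12, 0, 0, 3), "MIX": (10, 0, 0, 3), "AFF": (8, 0, 0, 3)},
    "E/ext.":  {"DBL": (4, 4, 1, 7), "ADD": (9, 0, 1, 7), "MIX": (8, 1, 0, 7), "AFF": (7, 0, 1, 7)},
}


def op_cost(model, op, s=1.0, d=1.0, a=0.0):
    """Cost of one operation in units of M, with S = s*M, d = d*M and a = a*M."""
    n_m, n_s, n_d, n_a = COSTS[model][op]
    return n_m + s * n_s + d * n_d + a * n_a


def loop_cost(model, iterations, add_op="MIX", **weights):
    """Cost of `iterations` rounds of one doubling plus one addition of type add_op."""
    per_round = op_cost(model, "DBL", **weights) + op_cost(model, add_op, **weights)
    return iterations * per_round


for model in COSTS:
    print(model, loop_cost(model, 1), loop_cost(model, 1, add_op="ADD"), loop_cost(model, 1, s=0.8))
```

With the default weights the per-operation cost reduces to the bold totals of Table 1, which gives a quick consistency check of the transcription.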
4.2 Applicability of Alternative Curve Models for \(k\in \{12,18,24\}\)
In this section we prove the existence or non-existence of points of orders 2, 3 and 4 in the groups \(E(\mathbb {F}_p)\) and \(E'(\mathbb {F}_{p^{k/6}})\) for the pairing-friendly families considered in this work. These proofs culminate in Table 2, which summarizes the alternative curve models that are available for \(\mathbb {G}_1\) and \(\mathbb {G}_2\) in the scenarios we consider. We can study \(\#E(\mathbb {F}_p)\) directly from the polynomial parameterizations in Sect. 2.1, while for \(\#E'(\mathbb {F}_{p^e})\) (where \(e=k/6\)) we do the following. With the explicit recursion in [12, Corollary VI.2] we determine the parameters \(t_e\) and \(f_e\) which are related by the CM equation \(4p^e=t_e^2+3f_e^2\) (since all our curves have CM discriminant \(D=-3\)). This allows us to compute the order of the correct sextic twist, which by [30, Proposition 2] is one of \(n_{e,1}' = p^{e}+1-(-3f_e+t_e)/2\) or \(n_{e,2}' = p^e+1-(3f_e+t_e)/2\). For \(k=12\) and \(k=24\) BLS curves, we assume that \(p \equiv 3 \mathrm{~mod~ }{4}\) so that \(\mathbb {F}_{p^2}\) can be constructed (optimally) as \(\mathbb {F}_{p^2}=\mathbb {F}_p[u]/(u^2+1)\). Finally, since \(p \equiv 3 \mathrm{~mod~ }{4}\), \(E(\mathbb {F}_p)\) must contain a point of order 4 if we are to write \(E\) in twisted Edwards form; however, since \(E'\) is defined over \(\mathbb {F}_{p^e}\), if \(e\) is even then \(4 \mid E'\) is enough to write \(E'\) in twisted Edwards form (see Sect. 4.1).
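The computation of the two candidate sextic twist orders can be sketched as follows. The recursion \(t_0=2\), \(t_1=t\), \(t_{i+1}=t\,t_i-p\,t_{i-1}\) is the usual Lucas-type recursion for traces over extension fields; the function names and the toy BN-style values \(p=103\), \(t=7\), \(r=97\) are assumptions of this sketch and do not come from this paper.

```python
from math import isqrt


def trace_over_extension(t, p, e):
    """Trace t_e of Frobenius over F_{p^e}, from t_0 = 2, t_1 = t."""
    prev, cur = 2, t
    for _ in range(e - 1):
        prev, cur = cur, t * cur - p * prev
    return cur


def sextic_twist_orders(t, p, e):
    """Return the two candidates [n'_{e,1}, n'_{e,2}] for a D = -3 curve."""
    q = p**e
    te = trace_over_extension(t, p, e)
    f_sq, rem = divmod(4 * q - te * te, 3)
    fe = isqrt(f_sq)
    assert rem == 0 and fe * fe == f_sq, "CM equation 4q = t^2 + 3f^2 not satisfied"
    return [q + 1 - (sign * 3 * fe + te) // 2 for sign in (-1, 1)]


p, t, r = 103, 7, 97                      # toy values (assumption)
candidates = sextic_twist_orders(t, p, 2)
correct = [n for n in candidates if n % r == 0]
print(candidates, correct)
```

Because the sign of \(f_e\) is not determined by the CM equation, the sketch simply selects the candidate that is divisible by \(r\), which is what identifies the twist containing \(\mathbb {G}_2'\).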
Proposition 1
Let \(E/\mathbb {F}_p\) be a BN curve with sextic twist \(E'/\mathbb {F}_{p^2}\). The groups \(E(\mathbb {F}_p)\) and \(E'(\mathbb {F}_{p^2})\) do not contain points of order 2, 3 or 4.
Proof
From (1) we always have \(\#E(\mathbb {F}_p) \equiv 1 \mathrm{~mod~ }{6}\). Remark 2.13 of [44] shows that we have \(\#E'(\mathbb {F}_{p^2}) = (p+1-t)(p-1+t)\), which from (1) gives that \(\#E'(\mathbb {F}_{p^2}) \equiv 1 \mathrm{~mod~ }{6}\). \(\square \)
Proposition 2
For \(p \equiv 3 \mathrm{~mod~ }{4}\), let \(E/\mathbb {F}_p\) be a \(k=12\) BLS curve with sextic twist \(E'/\mathbb {F}_{p^2}\). The group \(E(\mathbb {F}_p)\) contains a point of order \(3\) and can contain a point of order 2, but not 4, while the group \(E'(\mathbb {F}_{p^2})\) does not contain a point of order 2, 3 or 4.
Proof
From [12, Corollary VI.2] we have \(t_2(x) = t(x)^2-2p(x)\), which with (2) and \(4p(x)^2=t_2(x)^2+3f_2(x)^2\) allows us to deduce that the correct twist order is \(n_{2,2}'\), which gives \(n_{2,2}'(x) \equiv 1 \mathrm{~mod~ }{12}\) for \(x \equiv 1 \mathrm{~mod~ }{3}\), i.e. \(E'\) does not have points of order \(2\), \(3\) or \(4\). For \(E\), (2) reveals that \(3 \mid \#E\), and furthermore that \(x \equiv 4 \mathrm{~mod~ }{6}\) implies \(\#E\) is odd, while for \(x \equiv 1 \mathrm{~mod~ }{6}\) we have \(4 \mid \#E\). The assumption \(p \equiv 3 \mathrm{~mod~ }{4}\) holds if and only if \(x \equiv 7 \mathrm{~mod~ }{12}\), which actually implies \(p \equiv 7 \mathrm{~mod~ }{12}\). Now, to have a point of order 4 on \(E/\mathbb {F}_p:y^2=x^3+b\), the fourth division polynomial \(\psi _4(x)=2x^6+40bx^3-16b^2\) must have a root \(\alpha \in \mathbb {F}_p\), which requires \(\alpha ^3 = -10b\pm 6b\sqrt{3}\) and hence that \(3\) is a square in \(\mathbb {F}_p\). However, [35, Sect. 5, Theorem 2(b)] says that \(3\) is a quadratic residue in \(\mathbb {F}_p\) if and only if \(p \equiv \pm b^2 \mathrm{~mod~ }{12}\), where \(b\) is coprime to \(3\), which cannot happen for \(p \equiv 7 \mathrm{~mod~ }{12}\), so \(E\) does not have a point of order 4. \(\square \)
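The quadratic residuosity of \(3\) used at the end of this proof can be checked numerically with Euler's criterion. The sketch below uses an arbitrary list of small primes (an assumption for illustration) and confirms that \(3\) is a square modulo \(p\) exactly when \(p \equiv \pm 1 \bmod 12\), so that it is a non-residue for \(p \equiv 7 \bmod 12\).

```python
def is_quadratic_residue(a, p):
    """Euler's criterion for an odd prime p not dividing a."""
    return pow(a, (p - 1) // 2, p) == 1


small_primes = [5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 43, 47, 59, 61, 67, 71, 73, 103]
for p in small_primes:
    print(p, p % 12, is_quadratic_residue(3, p))
```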
Proposition 3
Let \(E/\mathbb {F}_p\) be a \(k=18\) KSS curve with sextic twist \(E'/\mathbb {F}_{p^3}\). The group \(E(\mathbb {F}_p)\) does not contain a point of order 2, 3 or 4, while the group \(E'(\mathbb {F}_{p^3})\) contains a point of order 3 but does not contain a point of order 2 or 4.
Proof
From [12, Corollary VI.2] we have \(t_3(x) = t(x)^3-3p(x)t(x)\). With (3) and \(4p(x)^3=t_3(x)^2+3f_3(x)^2\) it follows that \(n_{3,1}'(x)\) is the correct twist order. We have \(n_{3,1}'(x) \equiv 3 \mathrm{~mod~ }{12}\) for \(x \equiv 14 \mathrm{~mod~ }{42}\), i.e. \(E'\) has a point of order 3 but no points of order 2 or 4. For \(E\) we have \(\#E \equiv 1 \mathrm{~mod~ }{6}\) from (3), which means there are no points of order 2, 3, or 4. \(\square \)
Proposition 4
For \(p \equiv 3 \mathrm{~mod~ }{4}\), let \(E/\mathbb {F}_p\) be a BLS curve with \(k=24\) and sextic twist \(E'/\mathbb {F}_{p^4}\). The group \(E(\mathbb {F}_p)\) can contain points of order 2 or 3 (although not simultaneously), but not 4, while the group \(E'(\mathbb {F}_{p^4})\) can contain a point of order 2, but does not contain a point of order 3 or 4.
Proof
Again, [12, Corollary VI.2] gives \(t_4(x)=t(x)^4-4p(x)t(x)^2+2p(x)^2\), and from (4) and \(4p(x)^4=t_4(x)^2+3f_4(x)^2\) we get \(n_{4,1}'(x)\) as the correct twist order. For \(x \equiv 1 \mathrm{~mod~ }{6}\) we have \(n_{4,1}'(x) \equiv 1 \mathrm{~mod~ }{12}\) (so no points of order 2, 3, or 4), while for \(x \equiv 4 \mathrm{~mod~ }{6}\) we have \(n_{4,1}'(x) \equiv 4 \mathrm{~mod~ }{12}\). Recall from the proof of Proposition 2 that \((\alpha ,\beta ) \in E'(\mathbb {F}_{p^4})\) is a point of order 4 if we have \(\alpha \in \mathbb {F}_{p^4}\) such that \(\alpha ^3 = (-10\pm 6\sqrt{3})b'\). The curve equation gives \(\beta ^2= (-9\pm 6\sqrt{3})b'\), i.e. \(b'\) must be a square in \(\mathbb {F}_{p^4}\), which implies that \((0,\pm \sqrt{b'})\) are points of order 3 on \(E'(\mathbb {F}_{p^4})\), which contradicts \(n_{4,1}'(x) \equiv 1 \mathrm{~mod~ }{3}\). Thus, \(E'(\mathbb {F}_{p^4})\) cannot have points of order 3 or 4. For \(E\), from (4) we have \(\#E(\mathbb {F}_p) \equiv 3 \mathrm{~mod~ }{12}\) if \(x \equiv 4 \mathrm{~mod~ }{6}\), but \(\#E \equiv 0 \mathrm{~mod~ }{12}\) if \(x \equiv 1 \mathrm{~mod~ }{6}\). Thus, there is a point of order 3 on \(E\), as well as a point of order 2 if \(x \equiv 1 \mathrm{~mod~ }{6}\). So it remains to check whether there is a point of order 4 when \(x \equiv 1 \mathrm{~mod~ }{6}\). Taking \(x \equiv 1 \mathrm{~mod~ }{12}\) gives rise to \(p \equiv 1 \mathrm{~mod~ }{4}\), so take \(x \equiv 7 \mathrm{~mod~ }{12}\). This implies that \(p \equiv 7 \mathrm{~mod~ }{12}\), and the same argument as in the proof of Proposition 2 shows that there is no point of order 4. \(\square \)
Table 2. Optional curve models for \(\mathbb {G}_1\) and \(\mathbb {G}_2\) in popular pairing implementations.

Family-\(k\) | \(\mathbb {G}_1\) algorithm | \(\mathbb {G}_1\) models avail. | \(\mathbb {G}_2\) algorithm | \(\mathbb {G}_2\) models avail. | Follows from
--- | --- | --- | --- | --- | ---
BN-\(12\) | 2-GLV | \(\mathcal {W}\) | 4-GLS | \(\mathcal {W}\) | Proposition 1
BLS-\(12\) | 2-GLV | \(\mathcal {H}, \mathcal {J}, \mathcal {W}\) | 4-GLS | \(\mathcal {W}\) | Proposition 2
KSS-\(18\) | 2-GLV | \(\mathcal {W}\) | 6-GLS | \(\mathcal {H}, \mathcal {W}\) | Proposition 3
BLS-\(24\) | 2-GLV | \(\mathcal {H}, \mathcal {J}, \mathcal {W}\) | 8-GLS | \(\mathcal {E}, \mathcal {J}, \mathcal {W}\) | Proposition 4
4.3 Translating Endomorphisms to the Non-Weierstrass Models
In this section we investigate whether the GLV and GLS endomorphisms from Sect. 2.2 translate to the Jacobi quartic and Hessian models. Whether the endomorphisms translate desirably depends on how efficiently they can be computed on the non-Weierstrass model. It is not imperative that the endomorphisms do translate desirably, but it can aid efficiency: if the endomorphisms are not efficient on the alternative model, then our exponentiation routine also incurs the cost of passing points back and forth between the two models – this cost is small but could be non-negligible for high-dimensional decompositions. On the other hand, if the endomorphisms are efficient on the non-Weierstrass model, then the groups \(\mathbb {G}_1\) and/or \(\mathbb {G}_2\) can be defined so that all exponentiations take place directly on this model, and the computation of the pairing can be modified to include an initial conversion back to Weierstrass form.
We essentially show that the only scenario in which the endomorphisms are efficiently computable on the alternative model is the case of the GLV endomorphism \(\phi \) on Hessian curves.
For GLS on Hessian curves, there is no obvious or simple way to perform the analogous untwisting or twisting isomorphisms directly between \(\mathcal {H}'(\mathbb {F}_{p^{k/6}})\) and \(\mathcal {H}(\mathbb {F}_{p^k})\), which suggests that we must pass back and forth to the Weierstrass curve/s to determine the explicit formulas for the GLS endomorphism on \(\mathcal {H}'\). The composition of these maps \(\psi _{\mathcal {H}'} = \tau \circ \varPsi _{\mathcal {W}}^{1} \circ \pi _p \circ \varPsi _{\mathcal {W}} \circ \tau ^{1}\) does not appear to simplify to be anywhere near as efficient as the GLS endomorphism is on the Weierstrass curve. Consequently, our GLS routine will start with a Weierstrass point in \(\mathcal {W}'(\mathbb {F}_{p^{k/6}})\), where we compute \(d1\) applications of \(\psi \in \mathrm{End}(\mathcal {W}')\), before using (6) to convert the \(d\) points to \(\mathcal {H}'(\mathbb {F}_{p^{k/6}})\), where the remainder of the routine takes place (save the final conversion back to \(\mathcal {W}'\)). Note that since we are converting affine Weierstrass points to \(\mathcal {H}'\) via (6), this only incurs two multiplications each time. However, the results are now projective points on \(\mathcal {H}'\) meaning that the more expensive full addition formulas must be used to generate the remainder of the lookup table.
4.4 Curve Choices for Pairings at the \(128\), \(192\) and \(256\)bit Security Levels
The specific curves we choose in this section can use any of the alternative models that are available in the specific cases as shown in Table 2. The only exception occurs for \(k=24\), for which we are forced to choose between having a point of order 2 or 3 (see Proposition 4) in \(\mathbb {G}_1\) – we opt for the point of order 3 and the Hessian model, as this gives enhanced performance. Note that these curves do not sacrifice any efficiency in the pairing computation compared to previously chosen curves in the literature (in terms of the field sizes, hammingweights and towering options).
The \({{\varvec{k}}}=\mathbf{12}\) BN Curve. Since no alternative models are available for the BN family, we use the curve that was first seen in [45] and subsequently used to achieve speed records at the 128bit security level [2], which results from substituting \(x=(2^{62}+2^{55}+1)\) into (1), and taking \(E/\mathbb {F}_p:y^2=x^3+2\) and \(E'/\mathbb {F}_{p^2}: y^2=x^3+(1u)\), where \(\mathbb {F}_{p^2}=\mathbb {F}_p[u]/(u^2+1)\).
The \({{\varvec{k}}}=\mathbf{12}\) BLS Curve. Setting \(x=2^{106}-2^{72}+2^{69}-1\) in (2) gives a 635-bit prime \(p\) and a 424-bit prime \(r\). Let \(\mathbb {F}_{p^2}=\mathbb {F}_p[u]/(u^2+1)\) and let \(\xi =u+1\). The Weierstrass forms corresponding to \(\mathbb {G}_1\) and \(\mathbb {G}_2\) are \(\mathcal {W}/\mathbb {F}_p: y^2=x^3+1\) and \(\mathcal {W'}/\mathbb {F}_{p^2}: y^2=x^3+\xi \). Only \(\mathbb {G}_1\) has options for alternative models (see Table 2): the Hessian curve \(\mathcal {H}/\mathbb {F}_p :x^3+y^3+2=0\) and the Jacobi quartic curve \(\mathcal {J}/\mathbb {F}_p:y^2= \frac{3}{16}x^4+\frac{3}{4}x^2+1\) are both isomorphic to \(\mathcal {W}\) over \(\mathbb {F}_p\).
The \({{\varvec{k}}}=\mathbf{18}\) KSS Curve. Setting \(x=2^{64}-2^{51}+2^{47}+2^{28}\) in (3) gives a 508-bit prime \(p\) and a 376-bit prime \(r\). Let \(\mathbb {F}_{p^3}=\mathbb {F}_p[u]/(u^3+2)\). The Weierstrass forms for \(\mathbb {G}_1\) and \(\mathbb {G}_2\) are \(\mathcal {W}/\mathbb {F}_p: y^2=x^3+2\) and \(\mathcal {W'}/\mathbb {F}_{p^3}: y^2=x^3-u^2\). Only \(\mathbb {G}_2\) allows for an alternative model (see Table 2): the Hessian curve \(\mathcal {H'}/\mathbb {F}_{p^3} :x^3+y^3+2u\sqrt{-1}=0\) is isomorphic to \(\mathcal {W'}\) over \(\mathbb {F}_{p^3}\).
The \({{\varvec{k}}}=\mathbf{24}\) BLS Curve. Setting \(x=2^{63}-2^{47}+2^{38}\) in (3) gives a 629-bit prime \(p\) and a \(504\)-bit prime \(r\). Let \(\mathbb {F}_{p^2}=\mathbb {F}_p[u]/(u^2+1)\) and \(\mathbb {F}_{p^4}=\mathbb {F}_{p^2}[v]/(v^2-(u+1))\). The Weierstrass forms corresponding to \(\mathbb {G}_1\) and \(\mathbb {G}_2\) are \(\mathcal {W}/\mathbb {F}_p: y^2=x^3+4\) and \(\mathcal {W'}/\mathbb {F}_{p^4}: y^2=x^3+4v\). This gives us the option of a Hessian model in \(\mathbb {G}_1\): the curve \(\mathcal {H}/\mathbb {F}_p :x^3+y^3+4=0\) is isomorphic to \(\mathcal {W}\) over \(\mathbb {F}_p\). In \(\mathbb {G}_2\) we have both the Jacobi quartic and twisted Edwards models as options. Let \(\theta =(u+1)v\) and set \(a=3\theta /4\) and \(d=(4A^2-3\theta ^2)/4\). The curve \(\mathcal {J}/\mathbb {F}_{p^4}:y^2= dx^4+ax^2+1\) is isomorphic to \(\mathcal {W'}\) over \(\mathbb {F}_{p^4}\). For the twisted Edwards model, we take \(\alpha = \theta = (u+1)v\), \(s = 1/(\alpha \sqrt{3}) \in \mathbb {F}_{p^4}\), \(a'=(3\alpha s+2)/s\) and \(d'=(3\alpha s-2)/s\); the curve \(\mathcal {E}/\mathbb {F}_{p^4}:a'x^2+y^2= 1+d'x^2y^2\) is then isomorphic to \(\mathcal {W'}\).
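The sizes of \(p\) and \(r\) can be recomputed from the seeds as a quick sanity check. The following is a minimal sketch, assuming that the family equations referenced in this section are the standard BN, BLS12, KSS18 and BLS24 polynomial parameterizations of \(p(x)\) and \(r(x)\); the function and variable names are illustrative only. It checks bit lengths and the divisibility conditions on the seed, not primality.

```python
# Sketch: recompute p and r from each seed x and report their bit lengths.
# Assumption: the polynomials below are the standard family parameterizations;
# they are not copied from the paper's equations, which are not shown here.

def exact_div(n, d):
    """Divide n by d, insisting that the division is exact."""
    q, rem = divmod(n, d)
    assert rem == 0, "seed does not satisfy the family's congruence condition"
    return q

def bn12(x):
    p = 36*x**4 + 36*x**3 + 24*x**2 + 6*x + 1
    r = 36*x**4 + 36*x**3 + 18*x**2 + 6*x + 1
    return p, r

def bls12(x):
    r = x**4 - x**2 + 1
    p = exact_div((x - 1)**2 * r, 3) + x
    return p, r

def kss18(x):
    p = exact_div(x**8 + 5*x**7 + 7*x**6 + 37*x**5 + 188*x**4
                  + 259*x**3 + 343*x**2 + 1763*x + 2401, 21)
    r = exact_div(x**6 + 37*x**3 + 343, 343)
    return p, r

def bls24(x):
    r = x**8 - x**4 + 1
    p = exact_div((x - 1)**2 * r, 3) + x
    return p, r

# Seeds for the four curves of this section.
SEEDS = {
    "BN-12":  (bn12,  -(2**62 + 2**55 + 1)),
    "BLS-12": (bls12, 2**106 - 2**72 + 2**69 - 1),
    "KSS-18": (kss18, 2**64 - 2**51 + 2**47 + 2**28),
    "BLS-24": (bls24, 2**63 - 2**47 + 2**38),
}

def bit_lengths():
    """Map each family name to (bits of p, bits of r)."""
    return {name: tuple(v.bit_length() for v in family(x))
            for name, (family, x) in SEEDS.items()}

if __name__ == "__main__":
    for name, (p_bits, r_bits) in bit_lengths().items():
        print(f"{name}: p has {p_bits} bits, r has {r_bits} bits")
```

Running the script prints the size of \(p\) and \(r\) for each seed, which can be compared against the bit lengths quoted above.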
5 Exponentiations in \(\mathbb {G}_T\)
Table 3. Optimal scenarios for group exponentiations. For both GLV on \(\mathbb {G}_1\) and GLS on \(\mathbb {G}_2\) in all four families, we give the decomposition dimension \(d\), the maximum sizes (in bits) of the mini-scalars \(\Vert s_i\Vert _\infty \), the optimal window size \(w\), and the optimal curve model.
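| Sec. level | Family-\(k\) | \(\mathbb {G}_1\): \(d\) | \(\mathbb {G}_1\): \(\Vert s_i\Vert _\infty \) | \(\mathbb {G}_1\): \(w\) | \(\mathbb {G}_1\): Curve | \(\mathbb {G}_2\): \(d\) | \(\mathbb {G}_2\): \(\Vert s_i\Vert _\infty \) | \(\mathbb {G}_2\): \(w\) | \(\mathbb {G}_2\): Curve |
|---|---|---|---|---|---|---|---|---|---|
| 128-bit | BN-12 | 2 | 128 | 2 | Weierstrass | 4 | 64 | 1 | Weierstrass |
| 192-bit | BLS-12 | 2 | 212 | 3 | Hessian | 4 | 106 | 1 | Weierstrass |
| 192-bit | KSS-18 | 2 | 192 | 3 | Weierstrass | 6 | 63 | 1 | Hessian |
| 256-bit | BLS-24 | 2 | 252 | 3 | Hessian | 8 | 63 | 1 | twisted Edwards |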
6 Results
In Table 3 we summarize the optimal curve choices in each scenario. We first note that Jacobi quartic curves were unable to outperform the Weierstrass, Hessian or twisted Edwards curves in any of the scenarios. This is because the small number of operations saved in a Jacobi quartic group addition was not enough to outweigh the slower Jacobi quartic doublings (see Table 1), and because of the extra computation incurred by the need to pass back and forth between \(\mathcal {J}\) and \(\mathcal {W}\) to compute the endomorphisms (see Sect. 4.3). On the other hand, while employing the Hessian and twisted Edwards forms also requires us to pass back and forth to compute the endomorphisms, the group law operations on these models are significantly faster than Weierstrass operations across the board, so Hessian and twisted Edwards curves reigned supreme whenever they could be employed; we give the concrete comparisons below. In Table 3 we also present the bounds we used on the maximum sizes of the mini-scalars resulting from a \(d\)-dimensional decomposition. In some cases, like those where decomposing \(s\) involves writing \(s\) in base \(\lambda _\phi \) or \(\lambda _\psi \), these bounds are trivially tight. However, for both GLV and GLS on BN curves, and for GLS on KSS curves, the bounds presented are those we obtained experimentally from hundreds of millions of scalar decompositions, meaning that the theoretical bounds could be a few bits larger; determining such bounds could be done using techniques similar to those in [41].
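To illustrate why the bounds are trivially tight when \(s\) is written in base \(\lambda \), here is a minimal sketch for the 4-dimensional GLS decomposition on the \(k=12\) BLS curve. It assumes that \(\lambda _\psi \equiv x \pmod r\), which holds for the standard BLS12 parameterization because \(p \equiv x \pmod r\) and \(r = x^4-x^2+1\); the helper names are hypothetical and this is not the paper's implementation.

```python
# Sketch: 4-dimensional decomposition of a scalar s in base lambda = x
# for the k = 12 BLS curve (assumption: lambda_psi = x mod r).

X = 2**106 - 2**72 + 2**69 - 1   # seed of the k = 12 BLS curve
R = X**4 - X**2 + 1              # group order r (standard BLS12 polynomial)
D = 4                            # decomposition dimension for G_2

def decompose_base_lambda(s, lam=X, d=D):
    """Return [s_0, ..., s_{d-1}] with s = sum(s_i * lam**i), 0 <= s_i < lam."""
    digits = []
    for _ in range(d):
        s, digit = divmod(s, lam)
        digits.append(digit)
    # Any s < r < lam**d is fully consumed after d digits.
    assert s == 0, "scalar too large for a d-digit base-lambda expansion"
    return digits

def recompose(digits, lam=X, modulus=R):
    """Evaluate sum(s_i * lam**i) modulo r."""
    return sum(si * pow(lam, i, modulus) for i, si in enumerate(digits)) % modulus

if __name__ == "__main__":
    s = pow(3, 1000, R)                      # an arbitrary scalar in [0, r)
    mini = decompose_base_lambda(s)
    print([m.bit_length() for m in mini])    # each entry is at most 106
    print(recompose(mini) == s)
```

Since every digit is smaller than \(x < 2^{106}\), each mini-scalar fits in 106 bits, matching the GLS entry for this curve in Table 3.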
Table 4. Benchmark results for an optimal ate pairing and group exponentiations in \(\mathbb {G}_1\), \(\mathbb {G}_2\) and \(\mathbb {G}_T\) in millions (M) of clock cycles for the best curve models. These results have been obtained on an Intel Core i7-3520M CPU, averaged over thousands of random instances.
| Sec. level | Family-\(k\) | Pairing \(e\) | Exp. in \(\mathbb {G}_1\) | Exp. in \(\mathbb {G}_2\) | Exp. in \(\mathbb {G}_T\) |
|---|---|---|---|---|---|
| 128-bit | BN-12 | 7.0 | 0.9 \((\mathcal {W})\) | 1.8 \((\mathcal {W})\) | 3.1 |
| 192-bit | BLS-12 | 47.2 | 4.4 \((\mathcal {H})\) | 10.9 \((\mathcal {W})\) | 17.5 |
| 192-bit | KSS-18 | 63.3 | 3.5 \((\mathcal {W})\) | 9.8 \((\mathcal {H})\) | 15.7 |
| 256-bit | BLS-24 | 115.0 | 5.2 \((\mathcal {H})\) | 27.6 \((\mathcal {E})\) | 47.1 |
In [1] it was first proposed to use \(k=12\) BLS curves for the 192-bit security level, by showing that pairings on these curves are significantly faster than pairings on \(k=18\) KSS curves. Our pairing timings add further weight to their claim. However, our timings also show that KSS curves are slightly faster for exponentiations in all three groups. There are many circumstances where Table 4 could guide implementers to make more efficient decisions when deploying a protocol. As one example, we refer to Boneh and Franklin’s original identity-based encryption scheme [14, Sect. 4.1], where the sender computes a pairing between a public element \(P_{\mathrm {pub}}\) and an identity’s public key \(Q_\mathtt{ID}\), i.e. the sender computes \(g_\mathtt{ID}=e(P_{\mathrm {pub}},Q_\mathtt{ID})\). The sender then chooses a random exponent \(s\) and computes \(g_\mathtt{ID}^s\) (which is hashed to become part of a ciphertext). In this case Table 4 shows that the sender would be much better off computing the scalar multiplication \([s]P_{\mathrm {pub}}\) (assuming \(P_{\mathrm {pub}} \in \mathbb {G}_1\), or else we could compute \([s]Q_\mathtt{ID}\)) before computing the pairing \(e([s]P_{\mathrm {pub}},Q_\mathtt{ID}) = g_\mathtt{ID}^s\).
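The saving in this example can be read off Table 4 directly. The short sketch below simply adds the reported timings for the two orderings, assuming \(P_{\mathrm {pub}} \in \mathbb {G}_1\); the dictionary layout and function name are illustrative, and the figures are the cycle counts from Table 4 rather than new measurements.

```python
# Sketch: cost (in millions of cycles) of obtaining g_ID^s in two ways,
# using the timings reported in Table 4. Assumes P_pub lies in G_1.

TABLE4 = {
    # family: (pairing, exp. in G_1, exp. in G_2, exp. in G_T)
    "BN-12":  (7.0,   0.9, 1.8,  3.1),
    "BLS-12": (47.2,  4.4, 10.9, 17.5),
    "KSS-18": (63.3,  3.5, 9.8,  15.7),
    "BLS-24": (115.0, 5.2, 27.6, 47.1),
}

def sender_costs(family):
    """Return (pairing then G_T exponentiation, G_1 exponentiation then pairing)."""
    pairing, exp_g1, _exp_g2, exp_gt = TABLE4[family]
    pair_then_exp = pairing + exp_gt   # e(P_pub, Q_ID)^s
    exp_then_pair = exp_g1 + pairing   # e([s]P_pub, Q_ID)
    return pair_then_exp, exp_then_pair

if __name__ == "__main__":
    for family in TABLE4:
        slow, fast = sender_costs(family)
        print(f"{family}: {slow:.1f}M vs {fast:.1f}M cycles "
              f"(saves {slow - fast:.1f}M)")
```

The difference between the two orderings is just the gap between the \(\mathbb {G}_T\) and \(\mathbb {G}_1\) exponentiation columns, so the saving grows with the security level.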
1. We note that for particular KSS \(k=18\) curves, large savings may arise in this algorithm because some of the coefficients \(a_i\) in \(\alpha = \sum _{i=0}^5 a_i\psi ^i\) (from Sect. 5.2 of [46]) are zero. In the case of the KSS curve we use, around \(2/3\) of the computations vanish due to \(a_2=a_4=a_5=0\) and \(a_1=1\).
Acknowledgment
We thank the reviewer who pointed out that having \(4 \mid \#E(K)\) and \(\#K \equiv 1 \bmod 4\) is sufficient to write \(E/K\) in twisted Edwards form.
