
1 Introduction

“Let \(\mathbb {G} \) be a group of prime order q.” This defines the requirements for the main group in many cryptographic systems [1, 9, 16, 18, 19], most often with the intention that \(\mathbb {G} \) will be the group of points on an elliptic curve. However, practical implementations usually do not quite deliver a group of prime order q, at least not without significant caveats. Implementations of prime-order curves usually have incomplete or variable-time addition formulas. For example, OpenSSL 1.0.1f, LibTomCrypt 1.17, PolarSSL 1.3.9 and Crypto++ 5.6.2 all use a branch to decide whether the inputs to their point-addition functions are equal, so that they can call the doubling function instead. Some of these libraries also have branches to detect cases where two points add to the identity point, or where one of them is the identity point. Even if this does not introduce timing variations in ECDH or ECDSA signing, it may introduce timing variations in other systems. Furthermore, there is a special case when encoding and decoding the identity point, which is at infinity in the Weierstrass model.

This problem can be mitigated by using complete addition laws. While such laws exist for prime-order curves [6, 11], they are faster and much simpler for other elliptic curves such as (twisted) Edwards curves [4, 5, 14], Hessian curves [15], Jacobi quartics [8] or Jacobi intersections [20, 26]. These curves have a cofactor, denoted h, where the order of the curve is \(h\cdot q\) for some large prime q. The cofactor h is always divisible by 3 for Hessian curves, and by 4 for the other models.

1.1 Pitfalls of a Cofactor

Many authors consider the advantages of a non-prime-order group, such as the points on an Edwards curve, to outweigh the disadvantages. But the disadvantages are not negligible. There are several pitfalls which appear specifically for \(h>1\):

Small-Subgroup Attacks. Here an attacker sends a point whose order divides h, and a hapless user multiplies it by some scalar and uses the result. This will either result in a point known to the attacker (if the scalar is known to be divisible by h), or worse it may give the attacker information about the scalar. If the scalar is a private key, then leaking a few bits is a minor problem, though it is devastating to password-authenticated key exchange (PAKE) protocols [13].

Leaking a few bits of a scalar is a much more serious problem if the scalar is arithmetically related to a private key. Menezes and Ustaoglu used scalar leaks through the cofactor in their attack [27] on MQV [25] and HMQV [24]. HMQV was designed to avoid this weakness, but not successfully.

A related attack is to replace a point P with \(P + T\), where T lies in a small subgroup. If the user multiplies by a scalar s, they will get \(sP+sT\) instead of sP, where the difference sT gives away the low-order bits of s. Therefore, it isn’t always enough to reject points in the small subgroup.
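The leak through \(sT\) can be seen on a very small example. The following is a minimal sketch, assuming a toy curve that is not from the paper (the complete Edwards curve \(x^2+y^2=1-x^2y^2\) over GF(19), which has \(20 = 4\cdot 5\) points); the helper names are illustrative only.

```python
# Toy parameters (an assumption for illustration, not taken from the paper):
# the complete Edwards curve x^2 + y^2 = 1 + d*x^2*y^2 over GF(19) with d = -1,
# which has 20 = 4 * 5 points, so h = 4 and q = 5.
P_MOD, A, D = 19, 1, 18

def inv(z):
    """Field inverse via Fermat's little theorem."""
    return pow(z % P_MOD, P_MOD - 2, P_MOD)

def add(p1, p2):
    """Affine twisted Edwards addition law (complete on this curve)."""
    x1, y1 = p1
    x2, y2 = p2
    k = D * x1 * x2 * y1 * y2 % P_MOD
    x3 = (x1 * y2 + y1 * x2) * inv(1 + k) % P_MOD
    y3 = (y1 * y2 - A * x1 * x2) * inv(1 - k) % P_MOD
    return (x3, y3)

def neg(pt):
    """Negation on an Edwards curve: (x, y) -> (-x, y)."""
    x, y = pt
    return (-x % P_MOD, y)

def mul(n, pt):
    """Naive scalar multiplication by repeated addition (fine for a toy curve)."""
    acc = (0, 1)
    for _ in range(n):
        acc = add(acc, pt)
    return acc

P = (2, 8)                                  # some point on the curve
T = (1, 0)                                  # a point of order 4
SMALL = [(0, 1), T, mul(2, T), mul(3, T)]   # the subgroup generated by T

def leaked_low_bits(s):
    """Difference between the tampered result s*(P+T) and the honest s*P."""
    diff = add(mul(s, add(P, T)), neg(mul(s, P)))   # this is s*T
    return SMALL.index(diff)                         # position of s*T in <T>
```

On this toy curve the index returned by `leaked_low_bits(s)` is the residue of `s` modulo 4, which is the low-order information that the text warns about.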

The usual defense against these attacks is to multiply certain points by h, and possibly to abort the protocol if the result is the identity. But one must decide which points to take these steps on, and the extra factor of h can complicate the arithmetic. In a prime-order group, this attack is easier to mitigate: at most, one must check for the identity point in the proper places.

Non-injective Behavior. Multiplication by a scalar is a 1-to-1 function if the scalar is relatively prime to the group order. In a prime-order group, this is any scalar in \([1,q-1]\), and is true of a random scalar with high probability. The same is not true in a composite-order group. This means that adding a small-order element to e.g. a public key can produce the same result, possibly resulting in identity misbinding. This can be mitigated by making scalars relatively prime to h — exactly the opposite of techniques which clear the cofactor.

Covert Channels. Non-injective behavior may make it easier to exfiltrate data through the cofactor component, even in protocols where behavior is otherwise deterministic.

Implementation-Defined Behavior. Some systems, such as Ed25519 [7], do not specify behavior when the inputs have a nonzero h-torsion component. In particular, Ed25519 signature verification can be different for batched vs singleton signatures, or between different implementations. This can cause disruption in protocols where all parties must agree on whether a signature is valid, such as a blockchain or Byzantine agreement. In other cases, it may make it easier to fingerprint implementations.

Nontrivial Modifications. If a system or protocol is specified and proved secure on a prime-order group, then both the system and the proof may need to be changed for a group with a cofactor \(h>1\). Usually the modification is small. Often it is enough simply to multiply the outputs by h. However, only an expert will be able to tell exactly what modification is required. Cryptographic proofs are difficult, and this may represent enough work to prevent adoption.

1.2 Our Contribution

The cofactor pitfalls can be avoided by using a related group of prime order q. The most obvious choice is the order-q subgroup of the elliptic curve. But validating membership in that subgroup is slow and complex, requiring either an extra scalar multiplication, checking for roots of division polynomials, or inverting multiple isogenies.

We propose two new ways to build a group of prime order q based on an Edwards or twisted Edwards curve \(\mathcal {E}\) of order 4q, thus eliminating the cofactor. In the first proposal of this paper, the group is \(\mathcal {E}/\mathcal {E} [4]\). That is, two points on the curve \(\mathcal {E} \) are considered equal if their difference has small order (dividing 4). This requires three changes:

  • The function which checks equality of group elements must be modified.

  • The function which encodes points before sending them must encode these equal points as equal sequences of bits.

  • The function which decodes points must be designed to accept only the possible outputs of the encoding function.

Our second, improved proposal uses a different group \(\psi (\mathcal {E})\), which in the usual case will be \(2\mathcal {E}/\mathcal {E} [2]\). That is, only the even (2q-order) points of \(\mathcal {E} \) are used, and two points are considered the same if they differ by a point of order 2. This requires the same three changes.

The difficult parts are the encoding and decoding routines, which are the main contributions of this paper. We describe the encoding algorithm as “compression” because its output is an element of the underlying field \(\mathbb {F} \) rather than the usual two elements. In fact, it will be a “non-negative” element of \(\mathbb {F}\), which allows us to save an additional bit.

It is important to note that internally, the points used in the first proposal can be any points on \(\mathcal {E} \), and in the second proposal they can be any even points on \(\mathcal {E} \). Points which differ by a point of order 4 (resp. 2) are considered equal, and will be encoded to binary strings in the same way. This is similar to using projective coordinates: two values in memory may be considered the same point and encode to the same binary string, even though the X, Y and Z coordinates are different. This is how using a prime-order \(\psi (\mathcal {E})\) instead of \(\mathcal {E} \) mitigates small-subgroup attacks. Points of small order can appear internally, but they are considered equal to the identity element. Likewise \(P+T\) can appear internally with T in a small-order subgroup, but it is considered equal to P and is encoded in the same way.

With the combination of the complete Edwards group law and our point encoding, protocols can gain the simplicity, security and speed benefits of (twisted) Edwards curves without any cofactor-related difficulty. The cost is a small increase in code complexity in the point encoding and decoding functions. On balance, we believe that our encoding can make the design of the entire system simpler. In terms of overhead, our encoding and decoding perform as well as existing point compression and decompression algorithms.

Designers often use untwisted Edwards, twisted Edwards or Montgomery curves. Montgomery curves give simple Diffie-Hellman protocols, and twisted Edwards curves give a speed boost but have incomplete formulas in fields of order 3 (mod 4). Our second proposal adds flexibility for curve choice. The same wire format can be used for a Montgomery curve as for its 4-isogenous Edwards and twisted Edwards curves. Furthermore, for twisted Edwards curves of cofactor 4, the subgroup we use avoids the incomplete cases in the addition laws.

Our group \(\psi (\mathcal {E})\) and encoding algorithm can be used on curves with cofactor greater than 4. It still divides the cofactor by 4, so \(\psi (\mathcal {E})\) will not have prime order. Additionally, \(\psi (\mathcal {E})\) is not of the form \(2\mathcal {E}/\mathcal {E} [2]\) if \(\mathcal {E} \) has full 2-torsion.

We call this technique “Decaf” after the procedure which divides the effect of coffee by 4. We have built reference and optimized implementations of Decaf, and have posted them online at http://sourceforge.net/p/ed448goldilocks/code/ci/decaf/tree/. Our code carries out essentially all the operations described in this paper and appendices on the curve Ed448-Goldilocks, reducing the cofactor from 4 to 1.

2 Definitions and Notation

Finite Field. Let \(\mathbb {F}\) be a finite field whose characteristic is neither 2 nor 3.

Even Elements. An element g of an Abelian group \(\mathbb {G} \) is said to be even if \(g=2h\) for some \(h\in \mathbb {G} \). The even elements form a subgroup denoted \(2\mathbb {G} \).

Torsion Elements. An element g of a group \(\mathbb {G} \) is a k-torsion element if \(k\cdot g = 0_\mathbb {G} \). The k-torsion elements of an Abelian group form a subgroup usually denoted \(\mathbb {G} [k]\). The k-torsion subgroup of an elliptic curve over a finite field has order dividing \(k^2\); in particular, the 2-torsion subgroup has size 1, 2 or 4.
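As a small illustration of these two definitions, the sketch below uses the cyclic group \(\mathbb {Z}/20\mathbb {Z}\) as a stand-in for a group of order 4q with \(q=5\). The choice of group and the function names are assumptions made for this example only.

```python
N = 20                       # stand-in group order 4*q with q = 5 (illustrative choice)
G = range(N)                 # the cyclic group Z/20Z under addition mod 20

def torsion(k):
    """Elements g with k*g = 0, i.e. the subgroup G[k]."""
    return [g for g in G if (k * g) % N == 0]

def even_elements():
    """Elements of the form 2*h, i.e. the subgroup 2G."""
    return sorted({(2 * h) % N for h in G})

# Sizes of the two quotient groups considered in this paper.
size_G_mod_G4 = N // len(torsion(4))                      # |G / G[4]|
size_2G_mod_G2 = len(even_elements()) // len(torsion(2))  # |2G / G[2]|
```

In this stand-in group both quotients have order 5, matching the prime q.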

Projective Space. Denote by \(\mathbb {P} ^n(\mathbb {F})\) the n-dimensional projective space over \(\mathbb {F} \). Its elements are written as ratios \((X:Y:Z:\ldots )\), usually in upper-case. As a traditional short-cut, we usually write the elements of \(\mathbb {P} ^2(\mathbb {F})\) as a lower-case tuple (x, y) equivalent to (x : y : 1), with the understanding that the equations involving these points may have to be extended to cover “points at infinity” of the form (X : Y : 0).

Twisted Edwards Curves. Twisted Edwards curves have two parameters, a and d. They are specified as

$$\mathcal {E} _{a,d} := \left\{ (x,y)\in \mathbb {P} ^2(\mathbb {F}): a\cdot x^2 + y^2 = 1 + d\cdot x^2\cdot y^2\right\} $$

Another form, extended homogeneous coordinates [22], is used for high performance and simpler formulas:

$$\mathcal {E} _{a,d} := \left\{ (X:Y:Z:T)\in \mathbb {P} ^3(\mathbb {F}): XY=ZT\text { and }a\cdot X^2 + Y^2 = Z^2 + d\cdot T^2\right\} $$

We will use “untwisted” to mean \(a=1\). “Twisted” is the general case, which we sometimes narrow to \(a=-1\). The identity point of any Edwards curve is \((0,1) = (0:1:1:0)\).
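The relation between the affine and extended homogeneous forms can be checked numerically. This is a minimal sketch on an assumed toy curve (\(a=1\), \(d=-1\) over GF(19), not a parameter set from the paper); the function names are hypothetical.

```python
P_MOD, A, D = 19, 1, 18   # toy curve parameters, chosen only for illustration

def on_curve_affine(x, y):
    """a*x^2 + y^2 = 1 + d*x^2*y^2."""
    return (A * x * x + y * y - 1 - D * x * x * y * y) % P_MOD == 0

def to_extended(x, y, z=1):
    """Lift (x, y) to (X : Y : Z : T) = (x*z : y*z : z : x*y*z)."""
    return (x * z % P_MOD, y * z % P_MOD, z % P_MOD, x * y * z % P_MOD)

def on_curve_extended(X, Y, Z, T):
    """XY = ZT and a*X^2 + Y^2 = Z^2 + d*T^2."""
    return ((X * Y - Z * T) % P_MOD == 0 and
            (A * X * X + Y * Y - Z * Z - D * T * T) % P_MOD == 0)

def to_affine(X, Y, Z, T):
    """Recover (x, y) = (X/Z, Y/Z); assumes Z is nonzero."""
    z_inv = pow(Z, P_MOD - 2, P_MOD)
    return (X * z_inv % P_MOD, Y * z_inv % P_MOD)
```

For example, `to_extended(0, 1)` gives `(0, 1, 1, 0)`, the identity point stated above, and scaling by any nonzero `z` yields another representation of the same point.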

An Edwards curve is called “complete” if d and ad are nonsquare in \(\mathbb {F} \), which also implies that a is square. A complete Edwards curve has no points at infinity, and supports fast addition formulas which are complete in that they compute the correct answer for any two input points [5].

Montgomery Curves. A Montgomery curve has two parameters, called A and B. It has the form

$$\mathcal {M} _{B,A} := \left\{ (u,v)\in \mathbb {P} ^2(\mathbb {F}): Bv^2 = u\cdot (u^2 + Au + 1)\right\} $$

The identity point of this curve is a point at infinity, namely (0 : 1 : 0). The curve is “untwisted” if \(B=1\). Over \(3\pmod 4\) fields, any twisted Montgomery curve can be put into a form with \(B=1\), but over \(1\pmod 4\) fields, this is not true. In particular, \(B\ne 1\) is potentially useful to handle the twist of Curve25519, which has cofactor 4.

Jacobi Quartic Curves. A Jacobi quartic curve has two parameters, called A and e, and is defined by

$$\mathcal {J} _{e,A} := \left\{ (s,t)\in \mathbb {P} ^2(\mathbb {F}): t^2 = es^4 + 2As^2 + 1\right\} $$

with an identity point at (0, 1). The curve is “untwisted” if \(e=1\). We will only consider curves with \(e=a^2\) in this paper; such curves always have full 2-torsion.

The Curve Parameters. As a corollary of Ahmadi and Granger’s work [2], for any \(a,d\in \mathbb {F} \backslash \{0,1\}\), the following curves are isogenous:

$$\mathcal {E} _{a,d};\ \ \mathcal {E} _{-a,d-a};\ \ \mathcal {M} _{a,2-4d/a};\ \ \mathcal {J} _{a^2,a-2d}$$

Specifically, the Edwards, twisted Edwards and Montgomery curves are all 2-isogenous to the Jacobi quartic, and thus 4-isogenous to each other. We will write the 2-isogenies explicitly in Sects. 4.1 and 5. Since our point encoding works on this family of isogenous curves, we will consider these specific curves parameterized by \(a\) and \(d\).

We will write \(\mathcal {E} \) as a shorthand for \(\mathcal {E} _{a,d}\), \(\mathcal {J} \) for \(\mathcal {J} _{a^2,a-2d}\), and \(\mathcal {M} \) for \(\mathcal {M} _{a,2-4d/a}\).

Coset. In an Abelian group \(\mathbb {G} \), the coset of a subgroup \(H\subset \mathbb {G} \) with respect to an element \(g\in \mathbb {G} \) is \(H+g := \{h+g:h\in H\}\).

Non-negative Field Elements. Let \(p>2\) be prime. Define a residue \(x\in \mathbb {F} =\mathbb {Z}/p\mathbb {Z} \) to be “non-negative” if the least absolute residue for x is in \([0,(p-1)/2]\), and “negative” otherwise. This definition can be generalized (easily but non-canonically) to extension fields. Define |x| to be x or \(-x\), whichever is non-negative. Define \(\sqrt{x}\) to be an arbitrary square root of x, not necessarily the non-negative one.

We chose this definition of non-negative because it is easy to evaluate, and it works over every odd-characteristic field. Alternative choices would be to distinguish by the low bit, or for fields \(3\pmod {4}\), by the Legendre symbol. We avoided the Legendre symbol because it restricts field choices and is somewhat expensive to compute.
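The definition of non-negative elements and of |x| translates directly into code for a prime field. This is a minimal sketch with integers modulo p as field elements; the function names are illustrative.

```python
def is_negative(x, p):
    """True if the residue of x mod p lies in ((p-1)/2, p-1]."""
    return (x % p) > (p - 1) // 2

def field_abs(x, p):
    """Return x or -x mod p, whichever is non-negative."""
    x %= p
    return p - x if is_negative(x, p) else x
```

For every nonzero x exactly one of x and \(-x\) is non-negative, which is why restricting to non-negative elements saves one bit.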

Encoding. For sets S and T, an encoding from S to T is an efficient function \(\text {enc} : S\rightarrow T\) with efficient left-inverse \(\text {dec} : T\rightarrow S\uplus \{\bot \}\), which fails by returning \(\bot \) on every element of \(T\backslash \text {enc}[S]\). We are interested in an encoding from an elliptic curve \(\mathcal {E} \) over the field \(\mathbb {F} \) to a binary set \(\{0,1\}^n\) for some fixed n. We assume that the implementer has already chosen an encoding from \(\mathbb {F} \) to binary. Since encodings can be composed and distributed over products, it suffices to encode to a set such as \(\mathbb {F} \), \(\mathbb {F} ^2\) or \(\mathbb {F} \times \{0,1\}\) which has a natural encoding to binary.

Compression. Since most elliptic curve forms are defined as subsets of \(\mathbb {P} ^2(\mathbb {F})\), they admit a straightforward encoding to \(\mathbb {F} ^2\) (and thence to binary) with a finite number of special cases corresponding to points at infinity. We call an encoding “point compression” or simply “compression” if its codomain is smaller than \(\mathbb {F} ^2\) when naturally encoded to binary. Most of the encoding algorithms in this paper map to the set \(\mathbb {F} \) or to its non-negative elements, and so are point compression functions. The set of non-negative elements of \(\mathbb {F} \) generally requires one fewer bit to encode than \(\mathbb {F} \) itself.

3 An Edwards-Only Solution

There is a simple way to remove the cofactor of 4 from an untwisted Edwards curve. A complete Edwards curve \(\mathcal {E} _{a,d}\) has a 4-torsion subgroup of size exactly 4, whose coset with respect to \(P=(x,y)\) is

$$\mathcal {E} [4] + P = \left\{ (x,y); (y/\sqrt{a},-x\sqrt{a}); (-x,-y); (-y/\sqrt{a},x\sqrt{a})\right\} $$

Of this coset, there is exactly one representative point such that y and xy are both non-negative, and x is nonzero. We can define the encoding of P to be the y-value of this representative. Note that the representation of the identity point is \((0,-1)\), so the identity point encodes to \(0\in \mathbb {F} \).
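The representative-selection rule can be exercised by brute force on a small curve. The sketch below assumes a toy complete Edwards curve with \(a=1\) (so \(\sqrt{a}=1\)) and \(d=-1\) over GF(19), which is not a curve from the paper; it searches the coset directly rather than using an optimized formula.

```python
P_MOD, A, D = 19, 1, 18   # toy parameters (assumption): 20 = 4 * 5 points

def is_negative(v):
    return v % P_MOD > (P_MOD - 1) // 2

# All affine points, found by exhaustive search (only feasible for a toy field).
POINTS = [(x, y) for x in range(P_MOD) for y in range(P_MOD)
          if (A * x * x + y * y - 1 - D * x * x * y * y) % P_MOD == 0]

def coset(pt):
    """E[4] + P for a = 1: (x,y), (y,-x), (-x,-y), (-y,x)."""
    x, y = pt
    return [(x, y), (y, -x % P_MOD), (-x % P_MOD, -y % P_MOD), (-y % P_MOD, x)]

def encode(pt):
    """y-coordinate of the representative with x != 0 and y, x*y non-negative."""
    for cx, cy in coset(pt):
        if cx != 0 and not is_negative(cy) and not is_negative(cx * cy):
            return cy
    raise ValueError("no representative found")
```

On this curve the 20 points fall into 5 cosets, every member of a coset encodes to the same field element, and the identity encodes to 0.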

Similar solutions apply to incomplete Edwards curves. For curves whose 4-torsion group is \(Z_4\), there is exactly one representative with y and y/x both finite and non-negative. For curves with full 2-torsion, there is exactly one representative with x finite and both y and \((y^2+ax^2)/xy\) non-negative.

The usual addition formulas for incomplete Edwards curves produce the wrong answer (0/0) for operations involving points at infinity, but are otherwise complete. Therefore, if the decoding operation chooses a coset representative in a subgroup that contains no points at infinity (e.g. in the prime-order subgroup), then it is safe to use these curves. However, there is not an obvious way to make this section’s decoding formulas restrict to a subgroup.

Furthermore, this format is not compatible with the fast, simple Montgomery ladder on Montgomery curves. We will remedy these problems using a slightly more complex encoding.

4 A Solution from the Jacobi Quartic

On the Jacobi quartic \(\mathcal {J} _{a^2,a-2d}\), the coset of the 2-torsion group with respect to \(P = (s,t)\) is exactly

$$\mathcal {J} [2] + P = \left\{ (s,t); (-s,-t); (1/as,-t/as^2); (-1/as,t/as^2)\right\} $$

So a similar solution applies on \(\mathcal {J} \) modulo its 2-torsion: we can encode a point P by the s-coordinate of the coset representative (s, t), where s is non-negative and finite, and t/s is non-negative or infinite. Call this encoding \(\text {enc}_\mathcal {J} (P)\), and call the corresponding decoding algorithm \(\text {dec}_\mathcal {J} \). Note that the identity point encodes to \(0\in \mathbb {F} \).

4.1 From the Jacobi Quartic to Edwards Curves

The curves \(\mathcal {E} _{a,d}\) and \(\mathcal {J} _{a^2,a-2d}\) are isogenous by the map

$$ \phi _{a}(s,t) = \left( \frac{2s}{1+as^2},\ \frac{1-as^2}{t}\right) \text { with dual } \bar{\phi }_{a}(x,y) = \left( \frac{x}{y},\ \frac{2-y^2-ax^2}{y^2}\right) $$

Note that swapping (a, d) with \((-a,d-a)\) results in the same curve \(\mathcal {J} _{a^2,a-2d}\), and gives an isogeny \(\phi _{-a}\) to the curve \(\mathcal {E} _{-a,d-a}\).

We will need the following lemma, whose trivial proof is omitted:

Lemma 1

Let \(\phi \) be a homomorphism from an abelian group \(\mathbb {G} \) to another abelian group \(\mathbb {H} \), and let \(\mathbb {G} '\) be a subgroup of \(\mathbb {G} \). Then \(\phi \) acts as a well-defined homomorphism from \(\mathbb {G}/\mathbb {G} '\) to \(\phi [\mathbb {G} ]/\phi [\mathbb {G} ']\) which is a subgroup of \(\mathbb {H}/\phi [\mathbb {G} ']\). Furthermore, if \(\ker \phi \subseteq \mathbb {G} '\), then \(\phi \) acts as an isomorphism between these groups.

Since the isogeny \(\phi _a\) is a group homomorphism whose kernel is in \(\mathcal {J} [2]\), we can extend the encoding on \(\mathcal {J}/\mathcal {J} [2]\) to an encoding on \(\phi _a[\mathcal {J} ]/\phi _a[\mathcal {J} [2]]\):

$$\text {enc}(P) := \text {enc}_\mathcal {J} (\phi _{a}^{-1}(P)) \text { with } \text {dec}(b) := \phi _a(\text {dec}_\mathcal {J} (b)) $$

The lemma shows that both encoding and decoding are well-defined. In particular, P has two preimages under \(\phi _{a}\), but they represent the same element of \(\mathcal {J}/\mathcal {J} [2]\) and have the same encoding under \(\text {enc}_\mathcal {J} \).

Let \(\psi (\mathcal {E})\) denote the group \(\phi _a[\mathcal {J} ]/\phi _a[\mathcal {J} [2]]\). If the 4-torsion group of \(\mathcal {E} \) is cyclic, then \(\psi (\mathcal {E})\) is more simply expressed as \(2\mathcal {E}/\mathcal {E} [2]\).

4.2 Encoding

When encoding from \(\psi (\mathcal {E})\), we are given a point \(P=(x,y)\) in the image of \(\phi _{a}\) on \(\mathcal {E} \). We need to efficiently compute s where \((s,t) = \phi _{a}^{-1}(x,y)\). We know that

$$\begin{aligned} x= & {} 2s/(1+as^2) \\ \text {so}\,s= & {} (1\pm \sqrt{1-ax^2})/ax \end{aligned}$$

Also,

$$\begin{aligned} y= & {} (1-as^2)/t \\ \text {so }t/s= & {} (1-as^2)/sy\\= & {} \mp 2\sqrt{1-ax^2}/xy \end{aligned}$$

It turns out to be particularly straightforward to compute this encoding from the popular extended homogeneous coordinates. Explicit formulas are given in Appendix A.1.
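The two equations above already determine an affine encoder. The following is a minimal sketch, not the optimized formulas of Appendix A.1: it works in affine coordinates on an assumed toy curve (\(a=1\), \(d=-1\) over GF(19), with \(p\equiv 3 \pmod 4\) so that square roots are a single exponentiation), and all names are illustrative.

```python
P_MOD, A, D = 19, 1, 18   # toy curve (assumption), order 20 = 4 * 5

def inv(z):
    return pow(z % P_MOD, P_MOD - 2, P_MOD)

def is_negative(v):
    return v % P_MOD > (P_MOD - 1) // 2

def field_abs(v):
    v %= P_MOD
    return P_MOD - v if is_negative(v) else v

def sqrt(n):
    """Square root for p = 3 (mod 4); returns None if n is not a square."""
    n %= P_MOD
    r = pow(n, (P_MOD + 1) // 4, P_MOD)
    return r if r * r % P_MOD == n else None

def add(p1, p2):
    """Affine Edwards addition, used here only to enumerate even points."""
    x1, y1 = p1
    x2, y2 = p2
    k = D * x1 * x2 * y1 * y2 % P_MOD
    return ((x1 * y2 + y1 * x2) * inv(1 + k) % P_MOD,
            (y1 * y2 - A * x1 * x2) * inv(1 - k) % P_MOD)

def encode(pt):
    """Encode an even point (x, y) as the non-negative s with t/s non-negative."""
    x, y = pt
    if x == 0:                       # (0, 1) and (0, -1) are the identity class
        return 0
    r = sqrt(1 - A * x * x)
    if r is None or y == 0:
        raise ValueError("point is not in the image of phi_a")
    # s = (1 - r)/(a*x) pairs with t/s = +2r/(x*y); pick the sign of r
    # that makes t/s non-negative.
    if is_negative(2 * r * inv(x * y)):
        r = -r % P_MOD
    s = (1 - r) * inv(A * x) % P_MOD
    return field_abs(s)              # negating s keeps t/s unchanged

POINTS = [(x, y) for x in range(P_MOD) for y in range(P_MOD)
          if (A * x * x + y * y - 1 - D * x * x * y * y) % P_MOD == 0]
EVEN = sorted({add(p, p) for p in POINTS})   # the even points 2E
```

On this toy curve the 10 even points produce exactly 5 distinct encodings, and a point (x, y) encodes the same way as \((-x,-y)\), which differs from it by the 2-torsion point \((0,-1)\).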

4.3 Decoding

To decode, we are given s and must compute

$$(x,y) = \left( \frac{2s}{1+as^2},\ \frac{1-as^2}{\sqrt{a^2 s^4 + (2a-4d)s^2 + 1}}\right) $$

with the square root t taken so that t/s is non-negative. This requires the “inverse square root trick” to compute 1/s and t at the same time, with care to avoid division by 0. The exact formulas are given in Appendix A.2. The input must be rejected if s is negative or if it is not a field element (e.g. if it is the binary encoding of a number \(\ge p\)), or if the square root doesn’t exist.

It is simplest to decode to projective form, so that the denominators need not be cleared. It is also relatively easy to decode to affine form by batching a computation of \(1/(1+as^2)\) with the square root. Decoded points always have a well-defined affine form on curves with cofactor exactly 4, because those curves have no points at infinity in the image of \(\phi _{a}\).
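A direct affine decoder following the formula above can be sketched as follows. It is a simplified illustration on an assumed toy curve (\(a=1\), \(d=-1\) over GF(19)) and deliberately uses a separate square root and separate inversions rather than the inverse square root trick of Appendix A.2; the names are hypothetical.

```python
P_MOD, A, D = 19, 1, 18   # toy curve (assumption), cofactor exactly 4

def inv(z):
    return pow(z % P_MOD, P_MOD - 2, P_MOD)

def is_negative(v):
    return v % P_MOD > (P_MOD - 1) // 2

def sqrt(n):
    """Square root for p = 3 (mod 4); returns None if n is not a square."""
    n %= P_MOD
    r = pow(n, (P_MOD + 1) // 4, P_MOD)
    return r if r * r % P_MOD == n else None

def decode(s):
    """Return (x, y) for a valid encoding s, or None to reject."""
    if not 0 <= s <= (P_MOD - 1) // 2:       # negative or not a field element
        return None
    if s == 0:
        return (0, 1)                         # identity
    t = sqrt(A * A * s**4 + (2 * A - 4 * D) * s * s + 1)
    if t is None:                             # s is not on the Jacobi quartic
        return None
    if is_negative(t * inv(s)):               # choose t with t/s non-negative
        t = -t % P_MOD
    den = (1 + A * s * s) % P_MOD
    if den == 0 or t == 0:                    # would be a point at infinity
        return None
    return (2 * s * inv(den) % P_MOD, (1 - A * s * s) * inv(t) % P_MOD)

def on_curve(pt):
    x, y = pt
    return (A * x * x + y * y - 1 - D * x * x * y * y) % P_MOD == 0
```

On this toy curve exactly \(q=5\) of the 19 field elements decode successfully, and every decoded point satisfies the curve equation.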

4.4 Completeness

Importantly, if the cofactor of \(\mathcal {J} \) is exactly 4, then the image \(\psi (\mathcal {E})\) contains no points at infinity. An easy way to see this is that if \(\phi _a(s,t)\) were at infinity, then \(\bar{\phi }_{a}(\phi _a(s,t))\) would be either at infinity or at \((0,-1)\). In either case, it would be a nontrivial 2-torsion point [21]. But it cannot be a 2-torsion point, because \(\bar{\phi }_{a}\circ \phi _a\) is the doubling map on \(\mathcal {J} \) (by definition of an isogeny), and its image is exactly the subgroup of order q.

4.5 Equality

Ordinarily, testing for equality in a quotient group \(\mathbb {G}/\mathbb {H} \) requires testing whether \(P=Q+H\) for each \(H\in \mathbb {H} \). But if the cofactor is exactly 4, then equality testing is actually easier on \(\psi (\mathcal {E})\) than on \(\mathcal {E} \). In this case, two points \((X_1:Y_1:Z_1:T_1)\) and \((X_2:Y_2:Z_2:T_2)\) are equal if and only if

$$X_1\cdot Y_2 = X_2\cdot Y_1$$

This is because X/Y is the s-coordinate of the image Q of \(\bar{\phi }_{a}(X:Y:Z:T)\) on \(\mathcal {J} \). The only other point with that s-coordinate has a nontrivial 2-torsion component (it is \((0,-1)-Q\)), but the image \((\bar{\phi }_a\circ \phi _a)[\mathcal {J} ]\) is the prime-order subgroup \(\mathcal {J} [q]\).

In particular, for a curve of cofactor exactly 4, a point (X : Y : Z : T) is equal to the identity precisely when \(X=0\).
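The cross-multiplication test is short enough to check exhaustively on a small example. This sketch assumes a toy curve of cofactor exactly 4 (\(a=1\), \(d=-1\) over GF(19), not from the paper) and uses illustrative helper names.

```python
P_MOD, A, D = 19, 1, 18   # toy curve (assumption), order 20 = 4 * 5

def inv(z):
    return pow(z % P_MOD, P_MOD - 2, P_MOD)

def add(p1, p2):
    """Affine Edwards addition, used here only to enumerate even points."""
    x1, y1 = p1
    x2, y2 = p2
    k = D * x1 * x2 * y1 * y2 % P_MOD
    return ((x1 * y2 + y1 * x2) * inv(1 + k) % P_MOD,
            (y1 * y2 - A * x1 * x2) * inv(1 - k) % P_MOD)

def to_extended(pt, z):
    """(x, y) -> (X : Y : Z : T) with an arbitrary nonzero scale z."""
    x, y = pt
    return (x * z % P_MOD, y * z % P_MOD, z % P_MOD, x * y * z % P_MOD)

def decaf_equal(p1, p2):
    """Equality in psi(E): X1*Y2 == X2*Y1."""
    X1, Y1, _, _ = p1
    X2, Y2, _, _ = p2
    return (X1 * Y2 - X2 * Y1) % P_MOD == 0

def is_identity(p):
    """A point of psi(E) is the identity exactly when X == 0."""
    return p[0] % P_MOD == 0

POINTS = [(x, y) for x in range(P_MOD) for y in range(P_MOD)
          if (A * x * x + y * y - 1 - D * x * x * y * y) % P_MOD == 0]
EVEN = sorted({add(p, p) for p in POINTS})   # the even points 2E
```

On the even points of this curve, `decaf_equal` returns true exactly when the two points are identical or differ by the 2-torsion point \((0,-1)\), regardless of the projective scaling.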

4.6 Security

Using Decaf gives the security benefits of a prime-order group without weakening well-studied cryptographic assumptions. In particular:

  • The discrete logarithm problem is equivalent on \(\mathcal {E}, \mathcal {J} \) and \(\psi (\mathcal {E})\). The same is true for computational Diffie-Hellman, gap DH, static DH, strong DH, and should hold for similar computational problems.

  • If the Decaf group \(\psi (\mathcal {E})\) has prime order q, then the DDH problem is equivalent on \(\mathcal {E} [q], \mathcal {J} [q]\) and \(\psi (\mathcal {E})\). The same is true for decision linear, and should hold for similar decision problems. These decision problems are easy on groups with a small cofactor, such as \(\mathcal {E} \) itself.

The straightforward proofs of these reductions are omitted.

4.7 Batch Encoding

On a server which needs to generate signatures and/or ephemeral keys at prodigious rates, it may be advantageous to batch the point encoding algorithm.

The encoding algorithm listed above cannot be batched easily because of the inverse square root computation. However, the square root can be avoided if we wish to compress 2P instead of P, that is, if P is computed as \((k/2\,\text {mod}\,q)\cdot B\) instead of \(k\cdot B\). In this case, we can simply evaluate the dual 2-isogeny \(\bar{\phi }\) from \(\mathcal {E} \) to \(\mathcal {J} \):

  • Compute 1/(xy) and \(t/s = (2-y^2-ax^2)/xy\).

  • If t/s is non-negative, then output \(|s|=|x/y| = |x^2/xy|\).

  • Otherwise output \(|1/s| = |y/x| = |y^2/xy|\).

The computation of 1/(xy) can be batched over multiple points using Montgomery’s trick.
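The three steps above combine naturally with Montgomery's trick. The following is a minimal sketch on an assumed toy curve with \(a=1\) over GF(19); the function names are illustrative, and the handling of points with \(xy=0\) (whose doubles are in the identity class) is an addition made for this example.

```python
P_MOD, A = 19, 1     # toy field and curve parameter a (assumption)

def is_negative(v):
    return v % P_MOD > (P_MOD - 1) // 2

def field_abs(v):
    v %= P_MOD
    return P_MOD - v if is_negative(v) else v

def batch_inverse(values):
    """Montgomery's trick: invert all (nonzero) values with one field inversion."""
    prefix, acc = [], 1
    for v in values:
        prefix.append(acc)               # product of all earlier values
        acc = acc * v % P_MOD
    inv_acc = pow(acc, P_MOD - 2, P_MOD)  # the only inversion
    out = [0] * len(values)
    for i in reversed(range(len(values))):
        out[i] = inv_acc * prefix[i] % P_MOD
        inv_acc = inv_acc * values[i] % P_MOD
    return out

def batch_encode_doubles(points):
    """For each P = (x, y), return the encoding of 2P without any square root."""
    prods = [x * y % P_MOD for x, y in points]
    invs = batch_inverse([v if v else 1 for v in prods])   # never invert zero
    out = []
    for (x, y), xy, ixy in zip(points, prods, invs):
        if xy == 0:                      # 2P is (0, 1) or (0, -1): identity class
            out.append(0)
            continue
        t_over_s = (2 - y * y - A * x * x) * ixy % P_MOD
        num = x * x if not is_negative(t_over_s) else y * y
        out.append(field_abs(num * ixy))  # |x/y| or |y/x|
    return out
```

For instance, on this toy curve \(2\cdot (2,8) = (15,6)\), and `batch_encode_doubles([(2, 8)])` returns the same value, 5, that the square-root-based encoder assigns to \((15,6)\).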

4.8 Performance

Overall, Decaf’s performance is very similar to a traditional point compression scheme. Encoding and decoding take one field exponentiation each.

Fig. 1. Cost of encoding and decoding algorithms. \(M=\) multiply, \(I=\) inversion, \(I_2=\) inverse square root, \(L=\) Legendre symbol. Squarings are treated as \(0.8\,M\) and multiplies by constants as \(0.2\,\mathrm{M}\), but columns are rounded to the nearest M.

A comparison to existing point encoding algorithms is shown in Fig. 1. It shows:

  • The encoding and decoding costs.

  • The cost to clear the cofactor if one remains.

  • The order of the resulting points on the curve, with \(4q\rightarrow q\) meaning a cofactor that will most likely be cleared.

  • The extra factor induced by encoding and cofactor clearing.

  • The size in bits of the encoding’s codomain.

If inversion I is implemented using Fermat’s little theorem, it is likely to be slightly more expensive than an inverse square root \(I_2\). In practice, implementations that need both I and \(I_2\) with \(|\mathbb {F} |\equiv 3\pmod 4\) often implement inversion as \(x/(\pm \sqrt{x^2})^2\), costing \(M+2S\) more, and this is usually close to optimal anyway.
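The reuse of an inverse square root routine for inversion can be sketched in a few lines. This example assumes a prime field with \(p\equiv 3\pmod 4\), where \(z^{(p-3)/4}\) is \(\pm 1/\sqrt{z}\) for any nonzero square z; the function names are illustrative.

```python
def inv_sqrt(z, p):
    """For p = 3 (mod 4) and nonzero square z, returns r with r*r*z == 1."""
    return pow(z, (p - 3) // 4, p)

def invert(x, p):
    """1/x computed as x * (1/sqrt(x^2))^2: one I_2, one multiply, two squarings."""
    r = inv_sqrt(x * x % p, p)     # r = +-1/x, sign unknown
    return x * r * r % p           # squaring r removes the sign
```

The unknown sign of the square root does not matter, because only \(r^2 = 1/x^2\) is used.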

The (x, y) method is uncompressed, and \((x,\mathop {\text {sign}}y)\) is classically compressed. These methods do not remove the cofactor, so many protocols will remove it at the cost of two doublings \(\approx 12\,\mathrm{M}\). This changes the order of the internal points from 4q to q. The third row is compression with order checking. The order checking can be accomplished by inverting a 2-isogeny twice: the first inversion requires an inverse square root, but the second requires only checking that the root exists, i.e. computing a Legendre symbol.

The first proposal (Sect. 3) is a quotient group on an untwisted Edwards curve. It is slightly more expensive on a twisted Edwards curve, and is dangerous for such curves when \(|\mathbb {F} |\equiv 3\pmod 4\) because the internal points can have order 4. The second proposal (Sect. 4) avoids this problem, and gives an encoding compatible with several curve models, but at the cost of about 8 extra field multiplications and correspondingly higher complexity.

A downside of methods which include an inverse square root \(I_2\) is that they cannot use an EGCD-based inversion method. They also cannot be batched using Montgomery’s batch inversion trick, which accomplishes N inversions using one inversion and \(3(N-1)\) multiplications. The batchable encoding method (Sect. 4.7) replaces the inverse square root in encoding with an inversion but multiplies by an extra factor of 2.

Our methods cost less in total than point compression plus clearing the cofactor. Even for operations which do not need to clear the cofactor (e.g. key generation), the overhead from our encoding is relatively small. Fast key generation operations cost on the order of 3 M per bit of the curve’s order, so the difference in encoding costs is well under 1% for cryptographically useful curves.

5 Compatibility with Montgomery Curves

The Montgomery ladder on Montgomery curves is a very simple and fast way to implement scalar multiplication for Diffie-Hellman (DH) key exchange. In its simplest form, the ladder discards sign information, making it inherently incompatible with any point encoding format that conveys sign information. Furthermore, it does not distinguish between the curve and its quadratic twist, necessitating the use of twist-safe curves [3]. However, we would like to interoperate with the Montgomery ladder with minimal changes. For example, if a protocol uses a \((u,\mathop {\text {sign}}v)\) format, then the ladder can be modified to compute the sign, or the protocol can be changed to discard the sign bit for DH outputs.

We will show how to use Decaf with the Montgomery ladder on the curve

$$\mathcal {M} _{a,2-4d/a}: av^2 = u\cdot (u^2+(2-4d/a)\cdot u+1)$$

where conveniently the value of \((A+2)/4\) is \(1-d/a\). The curve \(\mathcal {M} _{a,2-4d/a}\) is isogenous to \(\mathcal {J} _{a^2,a-2d}\) by the maps

$$\phi (s,t) = \left( \frac{1}{as^2}, -\frac{t}{as^3}\right) \text { and } \bar{\phi }(u,v) = \left( \frac{1-u^2}{2av}, \frac{a(u+1)^4+8du (u^2+1)}{4a^2 v^2}\right) $$

More simply, \(\phi (s,t) = (as^2,ts) + T_2\), where the 2-torsion point \(T_2\) can be ignored due to the quotient. This means that Montgomery ladder implementations can take input in Decaf format, simply by starting the ladder at \(u = as^2\).
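As a consistency check (a short derivation added here, using only the curve equations defined earlier), the point \((u,v) = (as^2, ts)\) does lie on \(\mathcal {M} _{a,2-4d/a}\) whenever (s, t) lies on \(\mathcal {J} _{a^2,a-2d}\):

$$\begin{aligned} av^2= & {} a\cdot t^2 s^2 \\= & {} as^2\cdot \left( a^2s^4 + (2a-4d)s^2 + 1\right) \\= & {} u\cdot \left( u^2 + (2-4d/a)\cdot u + 1\right) \end{aligned}$$

Here the second line substitutes the Jacobi quartic equation \(t^2 = a^2s^4 + 2(a-2d)s^2 + 1\), and the third uses \(u^2 = a^2s^4\) and \((2-4d/a)\cdot u = (2a-4d)s^2\).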

When the ladder finishes, it is possible to efficiently encode the output point in the Decaf point format, including the correct sign information for v. However, recovering the sign information is complicated. Furthermore, it is possible to reject elements on the twist rather than on the curve, which the usual Montgomery ladder does not do, and it is possible to do all of this with only one field exponentiation (an inverse square root). This means that the Montgomery ladder will behave exactly the same as a standard decoding, scalar multiplication and encoding. We give the full details of how to do this in Appendix B. Some of the formulas in that section may be of independent interest.

It is also possible (and complicated) to do these things with existing point formats such as \((u,\mathop {\text {sign}}v)\), but almost no implementations do. Instead, since the Montgomery ladder is used almost exclusively for Diffie-Hellman, most implementations clear the cofactor and output only u, losing the information about v. This leaves the Montgomery ladder code very simple. It is also easy to do this with the Decaf encoding, by clearing the cofactor and outputting \(|1/\sqrt{au}|\). The implementation should abort on \(u=0\) and \(u=\infty \), which lie in a small subgroup. This will also reject points on the twist, because even points on the twist have either \(u=0,u=\infty \) or au nonsquare.

An Edwards or twisted Edwards implementation can interoperate with this simpler behavior simply by computing \(|s| = |x/y|\), instead of encoding any sign information.

6 Hashing to the Curve

Some protocols require a map from \(\mathbb {F} \) to a curve [10, 19, 23], to build a hash function which is either indifferentiable from a random oracle, or at least suitable for encoding computational Diffie-Hellman (CDH) challenges in the random oracle model. We could do this by using Elligator 1 or 2 on \(\mathcal {E} \) or \(\mathcal {M} \). Since Decaf only operates on subgroups of these curves, we would need to double the output of the map to make sure it is in the subgroup.

However, there is a better solution. We can instead use Elligator 2 on the Jacobi quartic \(\mathcal {J} \), since it has a point of order 2. Then we can translate this point to the Edwards and Montgomery curves using the isogeny. That way, the groups and maps implemented by these curves are all compatible. The formulas for Elligator 2 are found in Appendix C. It is important to note that Elligator 2 provides a 1:1 map to a group of order \(h\cdot q\), not of order \((h/4)\cdot q\). Therefore, the map may be up to 4:1 once the isogeny and quotient are applied.

This map is suitable for deriving CDH challenges from a random oracle. That is, it is still suitable for use in derivatives of BLS [10], SPAKE2 [1], SPEKE [23] and possibly Dragonfly [19]. These protocols do not require a random oracle map to \(\mathbb {G}\). They only require a map from strings to the curve which is at most k-to-1 for small k, hits at least a \(1/\ell \) fraction of the points for small \(\ell \), and whose inverse is efficiently sampleable. When a full random oracle map to \(\mathbb {G}\) is required, Brier et al.’s result [12] shows that mapping two independently chosen field elements and adding them is sufficient.

It is still possible to use Elligator 2 as a partial steganographic encoding for public keys, as in EKE. One may invert the isogeny to obtain a point on \(\mathcal {J} \), randomize its 2-torsion components, and apply the inverse map defined by Elligator 2. Unfortunately, this requires an extra randomization step and an extra inverse-square-root operation compared to the original Elligator 2.

7 Future Work

We do not believe that Decaf is the last word in cofactor-reducing compression algorithms. It would be useful, for example, to eliminate the cofactor of 8 in Curve25519. Additionally, an improved encoding scheme with simpler formulas would make this technique more compelling.

8 Conclusion

We have shown a straightforward way to implement a prime-order group \(\mathbb {G} \) using Edwards, twisted Edwards, Jacobi quartic and Montgomery curves. All four curve shapes implement the same group and so are compatible, except that as usual it is complicated to make the Montgomery ladder retain sign information. Our technique is otherwise similar in complexity and performance to traditional point compression techniques, though it may improve performance by making faster curves safe. Furthermore, we have shown how to implement an Elligator-like map from \(\mathbb {F} \) to \(\mathbb {G} \), which is also compatible with all 4 models.