1 Introduction

During the past five to ten years, elliptic-curve cryptography (ECC) has taken over public-key cryptography on the internet and in security applications. Many protocols such as Signal (https://signal.org) or TLS 1.3 rely on the small key sizes and efficient computations to achieve forward secrecy, often meaning that keys are used only once. However, it is also important to notice that security does not break down if keys are reused. Indeed, some implementations of TLS, such as Microsoft’s SChannel, reuse keys for some fixed amount of time rather than for one connection [2]. Google’s QUIC (https://chromium.org/quic) relies on servers keeping their keys fixed for a while to achieve quick session resumption. Several more examples are given by Freire, Hofheinz, Kiltz, and Paterson in their paper [25] formalizing non-interactive key exchange. Some applications require this functionality and for many it provides significant savings in terms of roundtrips or implementation complexity. Finding a post-quantum system that permits non-interactive key exchange while still offering decent performance is considered an open problem. Our paper presents a solution to this problem.

Isogeny-based cryptography is a relatively new kind of elliptic-curve cryptography, whose security relies on (various incarnations of) the problem of finding an explicit isogeny between two given isogenous elliptic curves over a finite field \(\mathbb F_q\). One of the main selling points is that quantum computers do not seem to make the isogeny-finding problem substantially easier. This contrasts with regular elliptic-curve cryptography, which is based on the discrete-logarithm problem in a group and therefore falls prey to a polynomial-time quantum algorithm designed by Shor in 1994 [57].

The first proposal of an isogeny-based cryptosystem was made by Couveignes in 1997 [17]. It described a non-interactive key exchange protocol where the space of public keys equals the set of \(\mathbb F_q\)-isomorphism classes of ordinary elliptic curves over \(\mathbb F_q\) whose endomorphism ring is a given order \(\mathcal O\) in an imaginary quadratic field and whose trace of Frobenius has a prescribed value. It is well-known that the ideal-class group \(\mathrm {cl}(\mathcal O)\) acts freely and transitively on this set through the application of isogenies. Couveignes’ central observation was that the commutativity of \(\mathrm {cl}(\mathcal O)\) naturally allows for a key-exchange protocol in the style of Diffie and Hellman [23]. His work was only circulated privately and thus not picked up by the community; the corresponding paper [17] was never formally published and posted on ePrint only in 2006. The method was eventually independently rediscovered by Rostovtsev and Stolbunov in 2004 (in Stolbunov’s master’s thesis [60] and published on ePrint as [54] in 2006). In 2010, Childs, Jao and Soukharev [12] showed that breaking the Couveignes–Rostovtsev–Stolbunov scheme amounts to solving an instance of the abelian hidden-shift problem, for which quantum algorithms with a time complexity of \(L_q[1/2]\) are known to exist; see [43, 52]. While this may be tolerable (e.g., classical subexponential factorization methods have not ended the widespread use of RSA), a much bigger concern is that the scheme is unacceptably slow: despite recent clever speed-ups due to De Feo, Kieffer, and Smith [21, 41], several minutes are needed for a single key exchange at a presumed classical security level of 128 bits. Nevertheless, in view of its conceptual simplicity, compactness, and flexibility, it seems a shame to discard the Couveignes–Rostovtsev–Stolbunov scheme.

The attack due to Childs–Jao–Soukharev strongly relies on the fact that \(\mathrm {cl}(\mathcal O)\) is commutative, hence indirectly on the fact that \(\mathcal O\) is commutative. This led Jao and De Feo [38] to consider the use of supersingular elliptic curves, whose full ring of endomorphisms is an order in a quaternion algebra; in particular it is non-commutative. Their resulting (interactive) key-agreement scheme, which nowadays goes under the name “Supersingular Isogeny Diffie–Hellman” (SIDH), has attracted almost the entire focus of isogeny-based cryptography over the past six years. The current state-of-the-art implementation is SIKE [37], which was recently submitted to the NIST competition on post-quantum cryptography [48].

It should be stressed that SIDH is not the Couveignes–Rostovtsev–Stolbunov scheme in which one substitutes supersingular elliptic curves for ordinary elliptic curves; in fact SIDH is much more reminiscent of a cryptographic hash function from 2006 due to Charles, Goren, and Lauter [11]. SIDH’s public keys consist of the codomain of a secret isogeny and the image points of certain public points under that isogeny. Galbraith, Petit, Shani, and Ti showed in [29] that SIDH keys succumb to active attacks and thus should not be reused, unless combined with a CCA transform such as the Fujisaki–Okamoto transform [26].

In this paper we show that adapting the Couveignes–Rostovtsev–Stolbunov scheme to supersingular elliptic curves is possible, provided that one restricts to supersingular elliptic curves defined over a prime field \(\mathbb F_p\). Instead of the full ring of endomorphisms, which is non-commutative, one should consider the subring of \(\mathbb F_p\)-rational endomorphisms, which is again an order \(\mathcal O\) in an imaginary quadratic field. As before \(\mathrm {cl}(\mathcal O)\) acts via isogenies on the set of \(\mathbb F_p\)-isomorphism classes of elliptic curves whose \(\mathbb F_p\)-rational endomorphism ring is isomorphic to \(\mathcal O\) and whose trace of Frobenius has a prescribed value; in fact if \(p \ge 5\) then there is only one option for this value, namely 0, in contrast with the ordinary case. See e.g.  [70, Theorem 4.5], with further details to be found in [8, 22] and in Sect. 3 of this paper. Starting from these observations, the desired adaptation of the Couveignes–Rostovtsev–Stolbunov scheme almost unrolls itself; the details can be found in Sect. 4. We call the resulting scheme CSIDH, where the C stands for “commutative”.Footnote 1

While this fails to address Jao and De Feo’s initial motivation for using supersingular elliptic curves, which was to avoid the \(L_q[1/2]\) quantum attack due to Childs–Jao–Soukharev, we show that CSIDH eliminates the main problem of the Couveignes–Rostovtsev–Stolbunov scheme, namely its inefficiency. Indeed, in Sect. 8 we will report on a proof-of-concept implementation which carries out a non-interactive key exchange at a presumed classical security level of 128 bits and a conjectured post-quantum security level of 64 bits in about 80 ms, while using key sizes of only 64 bytes. This is over 2000 times fasterFootnote 2 than the current state-of-the-art instantiation of the Couveignes–Rostovtsev–Stolbunov scheme by De Feo, Kieffer and Smith [21, 41], which itself presents many new ideas and speedups to even achieve that speed.

For comparison, we remark that SIDH, which is the NIST submission with the smallest combined key and ciphertext length, uses public keys and ciphertexts of over 300 bytes each. More precisely, SIKE’s version p503 uses uncompressed keys that are 378 bytes long [37] to achieve CCA security. The optimized SIKE implementation is about ten times faster than our proof-of-concept C implementation, but even at \(80\,\mathrm {ms}\), CSIDH is practical.

Another major advantage of CSIDH is that we can efficiently validate public keys, making it possible to reuse a key without the need for transformations to confirm that the other party’s key was honestly generated.

Finally we note that just like the original Couveignes–Rostovtsev–Stolbunov scheme, CSIDH relies purely on the isogeny-finding problem; no extra points are sent that could potentially harm security, as argued in [50].

To summarize, CSIDH is a new cryptographic primitive that can serve as a drop-in replacement for the (EC)DH key-exchange protocol while maintaining security against quantum computers. It provides a non-interactive (static–static) key exchange with full public-key validation. The speed is practical while the public-key size is the smallest for key exchange or KEM in the portfolio of post-quantum cryptography. This makes CSIDH particularly attractive in the common scenario of prioritizing bandwidth over computational effort. In addition, CSIDH is compatible with 0-RTT protocols such as QUIC.

Why supersingular? To understand where the main speed-up comes from, it suffices to record that De Feo–Kieffer–Smith had the idea of choosing a field of characteristic p, where p is congruent to \(-1\) modulo all small odd primes \(\ell \) up to a given bound. They then look for an ordinary elliptic curve \(E / \mathbb F_p\) such that \(\# E(\mathbb F_p)\) is congruent to 0 modulo as many of these \(\ell \)’s as possible, i.e., such that points of order \(\ell \) exist over \(\mathbb F_p\). These properties ensure that \(\ell \mathcal O\) decomposes as a product of two prime ideals \(\mathfrak {l}= (\ell , \pi - 1)\) and \(\overline{\mathfrak {l}}= (\ell , \pi + 1)\), where \(\pi \) denotes the Frobenius endomorphism. For such primes the action of the corresponding ideal classes \([\mathfrak {l}]\) and \([\overline{\mathfrak {l}}] = [\mathfrak {l}]^{-1}\) can be computed efficiently through an application of Vélu-type formulae to E (resp. its quadratic twist \(E^t\)), the reason being that only \(\mathbb F_p\)-rational points are involved. If this works for enough primes \(\ell \), we can expect that a generic element of \(\mathrm {cl}(\mathcal O)\) can be written as a product of small integral powers of such \([\mathfrak {l}]\), so that the class-group action can be computed efficiently. However, finding an ordinary elliptic curve \(E / \mathbb F_p\) such that \(\# E(\mathbb F_p)\) is congruent to 0 modulo many small primes \(\ell \) is hard, and the main focus of De Feo–Kieffer–Smith is on speeding up this search. In the end it is only practical to enforce this for 7 primes, thus they cannot take full advantage of the idea.

However, in the supersingular case the property \(\# E(\mathbb F_p) = p + 1\) implies that \(\# E(\mathbb F_p)\) is congruent to 0 modulo all primes \(\ell \mid p+1\) that we started from in building p! Concretely, our proof-of-concept implementation uses 74 small odd primes, corresponding to prime ideals \(\mathfrak {l}_1, \mathfrak {l}_2, \ldots , \mathfrak {l}_{74}\) for which we heuristically expect that almost all elements of our class group, whose size is about 256 bits, can be written as \([\mathfrak {l}_1]^{e_1} [\mathfrak {l}_2]^{e_2} \cdots [\mathfrak {l}_{74}]^{e_{74}}\), where the exponents \(e_i\) are taken from the range \(\{-5,\dots ,5\}\); indeed, one verifies that \(\log \,(2 \cdot 5 + 1)^{74} \approx 255.9979\). The action of such an element can be computed as the composition of at most \(5 \cdot 74 = 370\) easy isogeny evaluations. This should be compared to using 7 small primes, where the same approach would require exponents in a range of length about \(2^{256/7} \approx 2^{36}\), in view of which De Feo–Kieffer–Smith also resort to other primes with less beneficial properties, which requires working in extensions of \(\mathbb F_p\).
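The counting argument in the preceding paragraph can be checked numerically. The following minimal sketch recomputes the three figures quoted there; the variable names are ours and are not taken from the implementation reported in Sect. 8.

```python
from math import log2

NUM_PRIMES = 74     # number of small odd primes in the proof-of-concept parameters
MAX_EXPONENT = 5    # exponents e_i are taken from {-5, ..., 5}

# Number of exponent vectors (e_1, ..., e_74), measured in bits (base-2 logarithm).
keyspace_bits = NUM_PRIMES * log2(2 * MAX_EXPONENT + 1)

# Upper bound on the number of small-degree isogenies in one group-action evaluation.
max_isogenies = NUM_PRIMES * MAX_EXPONENT

# With only 7 primes, each exponent range would need about this many bits
# to cover a class group of roughly 256 bits.
bits_per_exponent_7_primes = 256 / 7

print(f"{keyspace_bits:.4f}")               # 255.9979
print(max_isogenies)                        # 370
print(f"{bits_per_exponent_7_primes:.2f}")  # 36.57
```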

The use of supersingular elliptic curves over \(\mathbb F_p\) has various other advantages. For instance, their trace of Frobenius t is 0, so that the absolute value of the discriminant \(|t^2 - 4p| = 4p\) is as large as possible. As a consequence, generically the size of the class group \(\mathrm {cl}(\mathcal O)\) is close to its maximal possible value for a fixed choice of p. Conversely, this implies that for a fixed security level we can make a close-to-minimal choice for p, which directly affects the key size. Note that this contrasts with the CM construction from [9], which could in principle be used to construct ordinary elliptic curves having many points of small order, but whose endomorphism rings have very small class groups, ruling them out for the Couveignes–Rostovtsev–Stolbunov key exchange.

To explain why key validation works, note that we work over \(\mathbb F_p\) with \(p \equiv 3 \pmod 8\) and start from the curve \(E_0:y^2 = x^3 + x\) with \(\mathbb F_p\)-rational endomorphism ring \(\mathcal O= \mathbb Z[\pi ]\). As it turns out, all Montgomery curves \(E_A:y^2 = x^3 + Ax^2 + x\) over \(\mathbb F_p\) that are supersingular appear in the \(\mathrm {cl}(\mathcal O)\)-orbit of \(E_0\). Moreover their \(\mathbb F_p\)-isomorphism class is uniquely determined by A. So all one needs to do upon receiving a candidate public key \(y^2 = x^3 + Ax^2 + x\) is check for supersingularity, which is an easy task; see Sect. 5. The combination of large size of \(\mathrm {cl}(\mathcal O)\) and representation by a single \(\mathbb F_p\)-element A explains the small key size of 64 bytes.
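For toy parameters, the validation criterion can be illustrated by brute-force point counting. The sketch below is not the supersingularity test of Sect. 5: it simply counts points on \(E_A\) over \(\mathbb F_p\) and compares the result with \(p+1\), which is only feasible for tiny p. The prime \(p = 419\) satisfies \(p \equiv 3 \pmod 8\); the function names are hypothetical.

```python
def legendre(v, p):
    """Legendre symbol of v modulo an odd prime p, returned as -1, 0 or 1."""
    s = pow(v % p, (p - 1) // 2, p)   # Euler's criterion
    return -1 if s == p - 1 else s


def is_supersingular_montgomery(A, p):
    """Brute-force check that y^2 = x^3 + A x^2 + x has exactly p + 1 points.

    Assumes p >= 5 is prime and small enough to enumerate all x in F_p.
    """
    if (A * A - 4) % p == 0:
        return False  # A = +-2 gives a singular cubic, not an elliptic curve
    # #E(F_p) = p + 1 + sum over x of legendre(f(x)), so the curve is
    # supersingular (trace 0) exactly when the character sum vanishes.
    char_sum = sum(legendre(x * x * x + A * x * x + x, p) for x in range(p))
    return char_sum == 0


p = 419
# All coefficients A that would pass validation for this toy prime.
valid_keys = [A for A in range(p) if is_supersingular_montgomery(A, p)]
print(len(valid_keys), valid_keys[:5])
```

Since \(-1\) is a non-square modulo p, the curve \(E_{-A}\) is the quadratic twist of \(E_A\), so the list of valid coefficients is symmetric under negation.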

1.1 One-Way Group Actions

Although non-interactive key exchange is the main application of our primitive, it is actually more general: It is (conjecturally) an instance of Couveignes’ hard homogeneous spaces [17], ultimately nothing but a finite commutative group action for which some operations are easy to compute while others are hard. Such group actions were first formalized and studied by Brassard and Yung [7]. We summarize Couveignes’ definition:

Definition 1

A hard homogeneous space consists of a finite commutative group G acting freely and transitively on some set X.

The following tasks are required to be easy (e.g., polynomial-time):

  • Compute the group operations in G.

  • Sample randomly from G with (close to) uniform distribution.

  • Decide validity and equality of a representation of elements of X.

  • Compute the action of a group element \(g\in G\) on some \(x\in X\).

The following problems are required to be hard (e.g., not polynomial-time):

  • Given \(x,x'\in X\), find \(g\in G\) such that \(g*x = x'\).

  • Given \(x,x',y\in X\) such that \(x'=g*x\), find \(y'=g*y\).

Any such primitive immediately implies a natural Diffie–Hellman protocol: Alice and Bob’s private keys are random elements \(a, b\) of G, their public keys are \(a*x_0\) resp. \(b*x_0\), where \(x_0\in X\) is a public fixed element, and the shared secret is \(b*(a*x_0)=a*(b*x_0)\). The private keys are protected by the difficulty of the first hard problem above, while the shared secret is protected by the second problem. Note that traditional Diffie–Hellman on a cyclic group C is an instance of this, where X is the set of generators of C and G is the multiplicative group \((\mathbb Z/{\#C})^*\) acting by exponentiation.
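The classical instance mentioned above can be written out as a group action in a few lines. The sketch below uses deliberately tiny and insecure toy parameters (a subgroup C of order 11 in \((\mathbb Z/23)^*\)); the names `act` and `sample_group_element` are ours and purely illustrative.

```python
import secrets
from math import gcd

P = 23    # toy prime modulus (insecure, for illustration only)
Q = 11    # order of the cyclic group C generated by X0
X0 = 2    # public fixed element x_0: 2 has order 11 modulo 23


def act(g, x):
    """Action of g in G = (Z/Q)^* on a generator x of C, by exponentiation."""
    return pow(x, g, P)


def sample_group_element():
    """Sample a uniformly random element of (Z/Q)^* by rejection."""
    while True:
        g = secrets.randbelow(Q)
        if g != 0 and gcd(g, Q) == 1:
            return g


a = sample_group_element()           # Alice's private key
b = sample_group_element()           # Bob's private key
alice_public = act(a, X0)            # a * x_0
bob_public = act(b, X0)              # b * x_0
alice_shared = act(a, bob_public)    # a * (b * x_0)
bob_shared = act(b, alice_public)    # b * (a * x_0)
assert alice_shared == bob_shared    # commutativity of G
```

Only `act` and the sampling routine depend on the concrete group; replacing them by a class-group action on curves leaves the protocol flow unchanged.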

1.2 Notation and Terminology

We stress that throughout this paper, we consider two elliptic curves defined over the same field to be identical whenever they are isomorphic over that field. Note that we do not identify curves that are only isomorphic over some extension field, as opposed to what is done in SIDH, for instance. In the same vein, for an elliptic curve E defined over a finite field \(\mathbb F_p\), we let \(\mathrm {End}_p(E)\) be the subring of the endomorphism ring \(\mathrm {End}(E)\) consisting of endomorphisms defined over \(\mathbb F_p\).Footnote 3 This subring is always isomorphic to an order in an imaginary quadratic number field. Conversely, for a given order \(\mathcal O\) in an imaginary quadratic field and an element \(\pi \in \mathcal O\), we let \(\mathcal {E}\!\ell \ell _p(\mathcal O,\pi )\) denote the set of elliptic curves E defined over \(\mathbb F_p\) with \(\mathrm {End}_p(E)\cong \mathcal O\) such that \(\pi \) corresponds to the \(\mathbb F_p\)-Frobenius endomorphism of E. In particular, this implies that \(\varphi \circ \beta = \beta \circ \varphi \) for all \(\mathbb F_p\)-isogenies \(\varphi \) between two curves in \(\mathcal {E}\!\ell \ell _p(\mathcal O,\pi )\) and all \(\beta \in \mathcal O\) interpreted as endomorphisms.

Ideals are always assumed to be non-zero.

The notation “\(\log \)” refers to the base-2 logarithm.

Acknowledgements. This project started during a research retreat on post-quantum cryptography, organized by the European PQCRYPTO and ECRYPT-CSA projects in Tenerife from 29 January until 1 February 2018. We would like to thank Jeffrey Burdges, whose quest for a flexible post-quantum key exchange protocol made us look for speed-ups of the Couveignes–Rostovtsev–Stolbunov scheme. We are grateful to Luca De Feo, Jean Kieffer, and Ben Smith for sharing a draft of their paper in preparation, and to Daniel J. Bernstein, Luca De Feo, Jeroen Demeyer, Léo Ducas, Steven Galbraith, David Jao, and Fré Vercauteren for helpful feedback.

2 Isogeny Graphs

Good mixing properties of the underlying isogeny graph are relevant for the security of isogeny-based cryptosystems. Just as in the original Couveignes–Rostovtsev–Stolbunov cryptosystem, in our case this graph is obtained by taking the union of several large subgraphs (each being a union of large isomorphic cycle graphs) on the same vertex set, one for each prime \(\ell \) under consideration; see Fig. 1 for a (small) example. Such a graph is the Schreier graph associated with our class-group action and the chosen generators. We refer to the lecture notes of De Feo [19, Sect. 14.1] for more background and to [40] for a discussion of its rapid mixing properties. One point of view on this is that one can quickly move between distant nodes in the subgraph corresponding to one generator by switching to the subgraph corresponding to another generator. This thereby replaces the square-and-multiply algorithm in exponentiation-based cryptosystems (such as classical Diffie–Hellman).

Fig. 1. Union of the supersingular \(\ell \)-isogeny graphs for \(\ell \in \{3, 5, 7\}\) over \(\mathbb F_{419}\). CSIDH makes use of the larger component, corresponding to curves whose ring of \(\mathbb F_{419}\)-rational endomorphisms is isomorphic to \(\mathbb Z[\sqrt{-419}]\).

The goal of this section is to analyze the structure of the individual cycles.

Definition 2

For a field k and a prime \(\ell \not \mid \mathrm {char}\, k\), the k-rational \(\ell \)-isogeny graph \(G_{k, \ell }\) is defined as having all the elliptic curves defined over k as its vertices, and having a directed edge \((E_1, E_2)\) for each k-rational \(\ell \)-isogeny from \(E_1\) to \(E_2\).Footnote 4

Remark 3

A priori \(G_{k, \ell }\) is a directed graph, but given two elliptic curves \(E_1\) and \(E_2\) whose j-invariants are not in \(\{0, 1728\}\), there are exactly as many edges \((E_2,E_1)\) as \((E_1,E_2)\), obtained by taking dual isogenies. Annoyingly, the nodes with j-invariants 0 and 1728 are more complicated, since these are exactly the curves with extra automorphisms: an elliptic curve E in \(G_{k,\ell }\) has fewer incoming than outgoing edges if and only if either \(j(E)=0\) and \(\sqrt{-3}\in k\), or if \(j(E)=1728\) and \(\sqrt{-1}\in k\). Throughout this paper, we will assume for simplicity that \(\sqrt{-3}, \sqrt{-1} \notin k\), so that neither of these automorphisms are defined over k and we may view \(G_{k, \ell }\) as an undirected graph. In the case of a finite prime field \(k = \mathbb F_p\), it suffices to restrict to \(p \equiv 11 \pmod {12}\), which will be satisfied in the class of instantiations we suggest.
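The congruence condition in Remark 3 can be sanity-checked with Euler's criterion: \(p \equiv 3 \pmod 4\) rules out \(\sqrt{-1}\) and \(p \equiv 2 \pmod 3\) rules out \(\sqrt{-3}\). The following small sketch, with a helper name of our choosing, verifies this for a few primes \(p \equiv 11 \pmod {12}\), including the primes 83 and 419 used in the figures.

```python
def is_square_mod(v, p):
    """Euler's criterion: True iff v is a non-zero square modulo the odd prime p."""
    return pow(v % p, (p - 1) // 2, p) == 1


PRIMES_11_MOD_12 = (11, 23, 47, 59, 71, 83, 419)

for p in PRIMES_11_MOD_12:
    assert p % 12 == 11
    has_sqrt_minus_1 = is_square_mod(-1, p)   # would give extra automorphisms at j = 1728
    has_sqrt_minus_3 = is_square_mod(-3, p)   # would give extra automorphisms at j = 0
    print(p, has_sqrt_minus_1, has_sqrt_minus_3)   # both False for every p in the list
```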

If \(k = \mathbb F_q\) is a finite field, then \(G_{k, \ell }\) is a finite graph that is the disjoint union of ordinary connected components and supersingular connected components. The ordinary components were studied in Kohel’s PhD thesis [42]. Due to their regular structure, these components later became known as isogeny volcanoes.

In general (e.g. over non-prime fields), the supersingular components may bear no similarity at all to the volcanoes of the ordinary case. Traditionally, following Pizer [51], one instead studies the unique supersingular component of \(G_{k,\ell }\) where \(k = \overline{\mathbb F}_q\), which turns out to be a finite \((\ell {+}1)\)-regular Ramanujan graph and forms the basis for the SIDH protocol.

However, Delfs and Galbraith [22] showed that if \(k = \mathbb F_p\) is a finite prime field, then all connected components are volcanoes, even in the supersingular case (where the depth is at most 1 at \(\ell =2\) and 0 otherwise). We present a special case of a unified statement, restricting our attention to the cases in which \(G_{\mathbb F_p,\ell }\) is a cycle. Recall that \(\mathrm {End}_p(E)\) is an order \(\mathcal O\) in the imaginary quadratic field

$$ K = \mathbb Q\bigl (\sqrt{t^2 - 4p}\,\bigr ) \text{, } $$

where \(|t|\le 2\sqrt{p}\) denotes the (absolute value of the) trace of the Frobenius endomorphism, and that two curves are isogenous over \(\mathbb F_p\) if and only if their traces of Frobenius are equal [66, Theorem 1].

Theorem 4

(Kohel, Delfs–Galbraith). Let \(p \ge 5\) be a prime number and let V be a connected component of \(G_{\mathbb F_p, \ell }\). Assume that \(p \equiv 11 \pmod {12}\) or that V contains no curve with j-invariant 0 or 1728. Let t be the trace of Frobenius common to all vertices in V, and let K be as above. Assume that \(\ell \not \mid t^2 - 4p\).

Then all elliptic curves in V have the same \(\mathbb F_p\)-rational endomorphism ring \(\mathcal O\subseteq K\), and \(\mathcal O\) is locally maximal at \(\ell \). Moreover if \(t^2 - 4p\) is a (non-zero) square modulo \(\ell \), then V is a cycle whose length equals the order of \([\mathfrak {l}]\) in \(\mathrm {cl}(\mathcal O)\), where \(\mathfrak {l}\) is a prime ideal dividing \(\ell \mathcal O\). If not, then V consists of a single vertex and no edges.

Proof

In the case of an ordinary component this is just a special case of [65, Theorem 7]. In the case of a supersingular component this follows from the proof of [22, Theorem 2.7]. (In both cases, we could alternatively (re)prove this theorem by proving that an \(\ell \)-isogeny can only change the conductor of the endomorphism ring of an elliptic curve locally at \(\ell \) and applying Theorem 7.)     \(\square \)

In the ordinary case a curve and its quadratic twist can never appear in the same component because they have a different trace of Frobenius. This is the main difference with the supersingular case, where this possibility is not excluded. To avoid confusion, we clarify that by the quadratic twist of a given elliptic curve \(E:y^2 = f(x)\) over \(\mathbb F_p\) we mean the curve \(E^t:dy^2 = f(x)\), where \(d \in \mathbb F_p^*\) is any non-square. If \(p \equiv 3 \pmod 4\) and \(j(E) = 1728\) then this may deviate from what some readers are used to, because in this case \(E^t\) and E are \(\mathbb F_p\)-isomorphic. Note that such a curve is necessarily supersingular.

Remark 5

In fact, if \(p \equiv 3 \bmod 4\) then there are two non-isomorphic curves over \(\mathbb F_p\) with j-invariant 1728, namely \(y^2 = x^3 - x\) and \(y^2 = x^3 + x\), whose endomorphism rings are the full ring of integers \(\mathbb Z[(1+ \sqrt{-p})/2]\) and the order \(\mathbb Z[\sqrt{-p}]\) of conductor 2 respectively. The connected component of each curve is “symmetric”: if E is n steps along \(G_{\mathbb F_p,\ell }\) in one direction from a curve of j-invariant 1728 then the curve that is n steps in the other direction is the quadratic twist of E. In the case of \(G_{\mathbb F_{83},3}\) we can see this in Fig. 2, which is taken from [22, Fig. 8].

It is also interesting to observe that the symmetry around \(j=1728\) confirms the known fact that the class numbers of \(\mathbb Z[(1+ \sqrt{-p})/2]\) and \(\mathbb Z[\sqrt{-p}]\) are odd, at least in the case that \(p \equiv 3 \pmod 4\); see [47].

Fig. 2. The two supersingular components of \(G_{\mathbb F_{83},3}\). The curves in the top component have \(\mathbb F_p\)-rational endomorphism ring \(\mathbb Z[(1+\sqrt{-83})/2]\), while those in the lower component correspond to \(\mathbb Z[\sqrt{-83}]\). Running clockwise through these components corresponds to the repeated action of \([(3,\pi -1)]\).

3 The Class-Group Action

It is well-known that the ideal-class group of an imaginary quadratic order \(\mathcal O\) acts freely via isogenies on the set of elliptic curves with \(\mathbb F_p\)-rational endomorphism ring \(\mathcal O\). Using this group action on a set of ordinary elliptic curves for cryptographic purposes was first put forward by Couveignes [17] and independently rediscovered later by Rostovtsev and Stolbunov [54, 60]. Our suggestion is to use the equivalent of their construction in the supersingular setting, thus the following discussion covers both cases at once. For concreteness, we focus on prime fields with \(p\ge 5\) and point out that the ordinary (but not the supersingular) case generalizes to all finite fields. We recall the following standard lemma:

Lemma 6

Let \(E/\mathbb F_p\) be an elliptic curve and G a finite \(\mathbb F_p\)-rational (i.e., stable under the action of the \(\mathbb F_p\)-Frobenius) subgroup of E. Then there exists an elliptic curve \(E'/\mathbb F_p\) and a separable isogeny \(\varphi :E\rightarrow E'\) defined over \(\mathbb F_p\) with kernel G. The codomain \(E'\) and isogeny \(\varphi \) are unique up to \(\mathbb F_p\)-isomorphism.Footnote 5

Proof

[59, Proposition III.4.12, Remark III.4.13.2, and Exercise III.3.13e].     \(\square \)

The ideal-class group. We recall the definitions and basic properties of class groups of quadratic orders that will be needed in the following. This section is based on [18, Sect. 7]. Let K be a quadratic number field and \(\mathcal O\subseteq K\) an order (that is, a subring which is a free \(\mathbb Z\)-module of rank 2). The norm of an \(\mathcal O\)-ideal \(\mathfrak {a}\subseteq \mathcal O\) is defined as \(\mathrm {N}(\mathfrak {a})=|\mathcal O/\mathfrak {a}|\); it is equal to \(\gcd (\{\mathrm {N}(\alpha )\mid \alpha \in \mathfrak {a}\})\). Norms are multiplicative: \(\mathrm {N}(\mathfrak {a}\mathfrak {b})=\mathrm {N}(\mathfrak {a})\mathrm {N}(\mathfrak {b})\).

A fractional ideal of \(\mathcal O\) is an \(\mathcal O\)-submodule of K of the form \(\alpha \mathfrak {a}\), where \(\alpha \in K^*\) and \(\mathfrak {a}\) is an \(\mathcal O\)-ideal.Footnote 6 Fractional ideals can be multiplied and conjugated in the evident way, and the norm extends multiplicatively to fractional ideals. A fractional \(\mathcal O\)-ideal \(\mathfrak {a}\) is invertible if there exists a fractional \(\mathcal O\)-ideal \(\mathfrak {b}\) such that \(\mathfrak {a}\mathfrak {b}=\mathcal O\). If such a \(\mathfrak {b}\) exists, we define \(\mathfrak {a}^{-1} = \mathfrak {b}\). Clearly all principal fractional ideals \(\alpha \mathcal O\), where \(\alpha \in K^*\), are invertible.

By construction, the set of invertible fractional ideals \(I(\mathcal O)\) forms an abelian group under ideal multiplication. This group contains the principal fractional ideals \(P(\mathcal O)\) as a (clearly normal) subgroup, hence we may define the ideal-class group of \(\mathcal O\) as the quotient

$$ \mathrm {cl}(\mathcal O) = I(\mathcal O) / P(\mathcal O) \,\text{. } $$

Every ideal class \([\mathfrak {a}]\in \mathrm {cl}(\mathcal O)\) has an integral representative, and for any non-zero \(M\in \mathbb Z\) there even exists an integral representative of norm coprime to M.

There is a unique maximal order of K with respect to inclusion called the ring of integers and denoted \(\mathcal O_K\). The conductor of \(\mathcal O\) (in \(\mathcal O_K\)) is the index \(f=[\mathcal O_K:\mathcal O]\). Away from the conductor, ideals are well-behaved; every \(\mathcal O\)-ideal of norm coprime to the conductor is invertible and factors uniquely into prime ideals.

The class-group action. Fix a prime \(p\ge 5\) and an (ordinary or supersingular) elliptic curve E defined over \(\mathbb F_p\). The Frobenius endomorphism \(\pi \) of E satisfies a characteristic equation

$$ \pi ^2 - t\pi + p = 0 $$

in \(\mathrm {End}_p(E)\), where \(t\in \mathbb Z\) is the trace of Frobenius. The curve E is supersingular if and only if \(t=0\). The \(\mathbb F_p\)-rational endomorphism ring \(\mathrm {End}_p(E)\) is an order \(\mathcal O\) in the imaginary quadratic field \(K = \mathbb Q(\sqrt{\varDelta })\), where \(\varDelta =t^2-4p\). We note that \(\mathcal O\) always contains the Frobenius endomorphism \(\pi \), and hence the order \(\mathbb Z[\pi ]\).

Any invertible ideal \(\mathfrak {a}\) of \(\mathcal O\) splits into a product of \(\mathcal O\)-ideals as \((\pi \mathcal O)^r\mathfrak {a}_s\), where \(\mathfrak {a}_s\nsubseteq \pi \mathcal O\). This defines an elliptic curve \(E/\mathfrak {a}\) and an isogeny

$$ \varphi _\mathfrak {a}:E\rightarrow E/\mathfrak {a}$$

of degree \(\mathrm {N}(\mathfrak {a})\) as follows [70]: the separable part of \(\varphi _\mathfrak {a}\) has kernel \(\bigcap _{\alpha \in \mathfrak {a}_s}\ker \alpha \), and the purely inseparable part consists of r iterations of Frobenius. The isogeny \(\varphi _\mathfrak {a}\) and codomain \(E/\mathfrak {a}\) are both defined over \(\mathbb F_p\) and are unique up to \(\mathbb F_p\)-isomorphism (by Lemma 6), justifying the notation \(E/\mathfrak {a}\). Multiplication of ideals corresponds to the composition of isogenies. Since principal ideals correspond to endomorphisms, two ideals lead to the same codomain if and only if they are equal up to multiplication by a principal fractional ideal. Moreover, every \(\mathbb F_p\)-isogeny \(\psi \) between curves in \(\mathcal {E}\!\ell \ell _p(\mathcal O,\pi )\) comes from an invertible \(\mathcal O\)-ideal in this way, and the ideal \(\mathfrak {a}_s\) can be recovered from \(\psi \) as \(\mathfrak {a}_s=\{\alpha \in \mathcal O\mid \ker \alpha \supseteq \ker \psi \}\). In other words:

Theorem 7

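Let \(\mathcal O\) be an order in an imaginary quadratic field and \(\pi \in \mathcal O\) such that \(\mathcal {E}\!\ell \ell _p(\mathcal O,\pi )\) is non-empty. Then the ideal-class group \(\mathrm {cl}(\mathcal O)\) acts freely and transitively on the set \(\mathcal {E}\!\ell \ell _p(\mathcal O,\pi )\) via the map

$$ \mathrm {cl}(\mathcal O) \times \mathcal {E}\!\ell \ell _p(\mathcal O,\pi ) \rightarrow \mathcal {E}\!\ell \ell _p(\mathcal O,\pi ) \text{, } \qquad ([\mathfrak {a}], E) \mapsto E/\mathfrak {a}\text{, } $$

in which \(\mathfrak {a}\) is chosen as an integral representative.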

Proof

See [70, Theorem 4.5]. Erratum: [55, Theorem 4.5].     \(\square \)

To emphasize the fact that we are dealing with a group action, we will from now on write \([\mathfrak {a}] *E\) or simply \([\mathfrak {a}]E\) for the curve \(E/\mathfrak {a}\) defined above.

The structure of the class group. The class group \(\mathrm {cl}(\mathcal O)\) is a finite abelian group whose cardinality is asymptotically [58]

$$ \#\mathrm {cl}(\mathcal O) \approx \sqrt{|\varDelta |} \text{. } $$

More precise heuristics actually predict that \(\#\mathrm {cl}(\mathcal O)\) grows a little bit faster than \(\sqrt{|\varDelta |}\), but the ratio is logarithmically bounded so we content ourselves with the above estimate. The exact structure of the class group can be computed in subexponential time \(L_{|\varDelta |}[1/2;\sqrt{2}+o(1)]\) using an algorithm of Hafner and McCurley [33]. Unfortunately, this requires too much computation for the sizes of \(\varDelta \) we are working with, but there are convincing heuristics concerning the properties of the class group we need. See Sect. 7.1 for these arguments. If the absolute value \(|t |\) of the trace of Frobenius is “not too big”, the discriminant \(\varDelta \) is about the size of p, hence by the above approximation we may assume \(\#\mathrm {cl}(\mathcal O)\approx \sqrt{p}\). This holds in particular when E is supersingular, where \(t=0\), hence \(|\varDelta |=4p\).
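For very small discriminants the class number can be computed directly, which gives a feeling for the estimate \(\#\mathrm {cl}(\mathcal O) \approx \sqrt{|\varDelta |}\). The sketch below counts reduced primitive binary quadratic forms of discriminant \(\varDelta \), a classical method that is exponential in the size of \(\varDelta \) and unrelated to the Hafner–McCurley algorithm; the function name is ours. It is applied to \(\varDelta = -4 \cdot 419\), the discriminant of \(\mathbb Z[\sqrt{-419}]\).

```python
from math import gcd, sqrt


def class_number(D):
    """Class number of the imaginary quadratic order of discriminant D < 0.

    Counts primitive forms (a, b, c) with b^2 - 4ac = D that are reduced:
    -a < b <= a <= c, and b >= 0 whenever a == c.
    """
    assert D < 0 and D % 4 in (0, 1)
    count = 0
    a = 1
    while 3 * a * a <= -D:                # reduced forms satisfy a <= sqrt(|D| / 3)
        for b in range(-a + 1, a + 1):    # -a < b <= a
            if (b * b - D) % (4 * a) != 0:
                continue
            c = (b * b - D) // (4 * a)
            if c < a or (a == c and b < 0):
                continue
            if gcd(gcd(a, b), c) == 1:    # keep primitive forms only
                count += 1
        a += 1
    return count


p = 419
D = -4 * p
print(class_number(D))       # 27
print(round(sqrt(-D), 1))    # 40.9
```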

We are interested in primes \(\ell \) that split in \(\mathcal O\), i.e., such that there exist (necessarily conjugate) distinct prime ideals \(\mathfrak {l},\overline{\mathfrak {l}}\) of \(\mathcal O\) with \(\ell \mathcal O=\mathfrak {l}\overline{\mathfrak {l}}\). Such \(\ell \) are known as Elkies primes in the point-counting literature. The ideal \(\mathfrak {l}\) is generated as \(\mathfrak {l}=(\ell ,\pi -\lambda )\), where \(\lambda \in \mathbb Z/{\ell }\) is an eigenvalue of the Frobenius endomorphism \(\pi \) on the \(\ell \)-torsion, and its conjugate is \(\overline{\mathfrak {l}}=(\ell ,\pi -p/\lambda )\), where by abuse of notation \(p/\lambda \) denotes any integral representative of that quotient modulo \(\ell \). Note that \(\ell \) splits in \(\mathcal O\) if and only if \(\varDelta \) is a non-zero square modulo \(\ell \).
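The splitting criterion and the Frobenius eigenvalues are easy to compute for small \(\ell \). The sketch below uses the supersingular toy case \(p = 419\), \(t = 0\); the function names are hypothetical, and \(\ell \) is assumed to be an odd prime different from p. For \(\ell \in \{3, 5, 7\}\), which divide \(p + 1\), the eigenvalues come out as \(\pm 1\).

```python
def splits(p, t, ell):
    """True iff Delta = t^2 - 4p is a non-zero square modulo the odd prime ell."""
    delta = (t * t - 4 * p) % ell
    return delta != 0 and pow(delta, (ell - 1) // 2, ell) == 1


def frobenius_eigenvalues(p, t, ell):
    """Roots of x^2 - t x + p modulo ell, found by brute force."""
    return [x for x in range(ell) if (x * x - t * x + p) % ell == 0]


p, t = 419, 0   # supersingular case: trace of Frobenius is 0
for ell in (3, 5, 7, 11, 13):
    print(ell, splits(p, t, ell), frobenius_eigenvalues(p, t, ell))
# 3 True [1, 2]
# 5 True [1, 4]
# 7 True [1, 6]
# 11 False []
# 13 True [6, 7]
```

When \(\ell \) splits, the two eigenvalues \(\lambda \) and \(p/\lambda \) give the generators \((\ell ,\pi -\lambda )\) and \((\ell ,\pi -p/\lambda )\) of the two prime ideals above \(\ell \).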

Computing the group action. Any element of the class group can be represented as a product of small prime ideals [10, Propositions 9.5.2 and 9.5.3], hence we describe how to compute \([\mathfrak {l}]E\) for a prime ideal \(\mathfrak {l}=(\ell ,\pi -\lambda )\). There are (at least) the following ways to proceed, which vary in efficiency depending on the circumstances [21, 41]:

  • Find \(\mathbb F_p\)-rational roots of the modular polynomial \(\varPhi _\ell (j(E),y)\) to determine the two j-invariants of possible codomains (i.e., up to four non-isomorphic curves, though in the ordinary case wrong twists can easily be ruled out); compute the kernel polynomials [42] \(\chi \in \mathbb F_p[x]\) for the corresponding isogenies (if they exist); if \((x^p,y^p)=[\lambda ](x,y)\) modulo \(\chi \) and the curve equation, then the codomain was correct, else another choice is correct.

  • Factor the \(\ell \)th division polynomial \(\psi _\ell (E)\) over \(\mathbb F_p\); collect irreducible factors with the right Frobenius eigenvalues (as above); use Kohel’s algorithm [42, Sect. 2.4] to compute the codomain.

  • Find a basis of the \(\ell \)-torsion—possibly over an extension field—and compute the eigenspaces of Frobenius; apply Vélu’s formulas [69] to a basis point of the correct eigenspace to compute the codomain.

As observed in [21, 41], the last method is the fastest if the necessary extension fields are small. The optimal case is \(\lambda =1\); in that case, the curve has a point of order \(\ell \) defined over the base field \(\mathbb F_p\). If in addition \(p/\lambda =-1\), the other eigenspace of Frobenius modulo \(\ell \) is defined over \(\mathbb F_{p^2}\), so both codomains can easily be computed using Vélu’s formulas over an at most quadratic extension (but in fact, a good choice of curve model allows for pure prime field computations, see Sect. 8; alternatively one could switch to the quadratic twist). Note that if \(p\equiv -1\pmod \ell \), then \(\lambda =1\) automatically implies \(p/\lambda =-1\).

Much of De Feo–Kieffer–Smith’s work [21, 41] is devoted to finding an ordinary elliptic curve E with many small Elkies primes \(\ell \) such that both E and its quadratic twist \(E^t\) have an \(\mathbb F_p\)-rational \(\ell \)-torsion point. Despite considerable effort leading to various improvements, the results are discouraging. With the best parameters found within 17 000 h of CPU time, evaluating one class-group action still requires several minutes of computation to complete. This suggests that without new ideas, the original Couveignes–Rostovtsev–Stolbunov scheme will not become anything close to practical in the foreseeable future.

4 Construction and Design Choices

In this section, we discuss the construction of our proposed group action and justify our design decisions. For algorithmic details, see Sect. 8. Notice that the main obstacle to performance in the Couveignes–Rostovtsev–Stolbunov scheme—constructing a curve with highly composite order—becomes trivial when using supersingular curves instead of ordinary curves, since for \(p\ge 5\) any supersingular elliptic curve over \(\mathbb F_p\) has exactly \(p+1\) rational points.

The cryptographic group action described below is a straightforward implementation of this construction. Note that we require \(p\equiv 3\pmod 4\) so that we can easily write down a supersingular elliptic curve over \(\mathbb F_p\) and so that an implementation may use curves in Montgomery form. It turns out that this choice is also beneficial for other reasons. In principle, this constraint is not necessary for the theory to work, although the structure of the isogeny graph changes slightly (see [22] and Remark 3 for details).

Parameters. Fix a large prime p of the form \(4\ \cdot \ \ell _1\cdots \ell _n-1\), where the \(\ell _i\) are small distinct odd primes. Fix the elliptic curve \(E_0:y^2=x^3+x\) over \(\mathbb F_p\); it is supersingular since \(p\equiv 3\pmod 4\). The Frobenius endomorphism \(\pi \) satisfies \(\pi ^2=-p\), so its \(\mathbb F_p\)-rational endomorphism ring is an order in the imaginary quadratic field \(\mathbb Q(\sqrt{-p})\). More precisely, Proposition 8 (below) shows \(\mathrm {End}_p(E_0) = \mathbb Z[\pi ]\), which has conductor 2.
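To make the parameter shape concrete, here is a small sketch with toy values (the primes \(3,5,7\) are an assumption for illustration and give no security). It builds \(p=4\cdot \ell _1\cdots \ell _n-1\), checks primality by trial division, and counts the points of \(E_0\) by brute force; all helper names are hypothetical.

```python
from math import prod

def is_prime(n):
    """Trial division; adequate for toy sizes only."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True

def csidh_prime(ells):
    """Return p = 4 * l_1 * ... * l_n - 1, or None if that number is not prime."""
    p = 4 * prod(ells) - 1
    return p if is_prime(p) else None

def count_points(A, p):
    """#E_A(F_p) for E_A: y^2 = x^3 + A*x^2 + x, by brute force over all x."""
    n = 1  # the point at infinity
    for x in range(p):
        rhs = (x * x * x + A * x * x + x) % p
        if rhs == 0:
            n += 1                      # a single point with y = 0
        elif pow(rhs, (p - 1) // 2, p) == 1:
            n += 2                      # two points (x, y) and (x, -y)
    return n

p = csidh_prime([3, 5, 7])
print(p, p % 8, count_points(0, p))     # prime, residue mod 8, #E_0(F_p)
```

For these toy values the count equals \(p+1\), in line with the supersingularity of \(E_0\) stated above.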

Rational Elkies primes. By Theorem 4, the choices made above imply that the \(\ell _i\)-isogeny graph is a disjoint union of cycles. Moreover, since \(\pi ^2-1\equiv 0\pmod {\ell _i}\) the ideals \(\ell _i\mathcal O\) split as \(\ell _i\mathcal O=\mathfrak {l}_i\overline{\mathfrak {l}_i}\), where \(\mathfrak {l}_i=(\ell _i,\pi -1)\) and \(\overline{\mathfrak {l}_i}=(\ell _i,\pi +1)\). In other words, all the \(\ell _i\) are Elkies primes. In particular, we can use any one of the three algorithms described at the end of Sect. 3 to walk along the cycles.

Furthermore, the kernel of \(\varphi _{\mathfrak {l}_i}\) is the intersection of the kernels of the scalar multiplication \([\ell _i]\) and the endomorphism \(\pi -1\). That is, it is the subgroup generated by a point \(P\) of order \(\ell _i\) which lies in the kernel of \(\pi -1\) or, in other words, is defined over \(\mathbb F_p\). Similarly, the kernel of \(\varphi _{\overline{\mathfrak {l}_i}}\) is generated by a point Q of order \(\ell _i\) that is defined over \(\mathbb F_{p^2}\) but not \(\mathbb F_p\) and such that \(\pi (Q)=-Q\). This greatly simplifies and accelerates the implementation, since it allows performing all computations over the base field (see Sect. 8 for details).

Sampling from the class group. Ideally, we would like to know the exact structure of the ideal-class group \(\mathrm {cl}(\mathcal O)\) to be able to sample elements uniformly at random. However, such a computation is currently not feasible for the size of discriminant we need, hence we resort to heuristic arguments. Assuming that the \(\mathfrak {l}_i\) do not have very small order and are “evenly distributed” in the class group, we can expect ideals of the form \(\mathfrak {l}_1^{e_1}\mathfrak {l}_2^{e_2}\cdots \mathfrak {l}_n^{e_n}\) for small \(e_i\) to lie in the same class only very occasionally. For efficiency reasons, it is desirable to sample the exponents \(e_i\) from a short range centered around zero, say \(\{-m,\dots ,m\}\) for some integer m. We will argue in Sect. 7.1 that choosing m such that \(2m+1\ge \root n \of {\#\mathrm {cl}(\mathcal O)}\) is sufficient. Since the prime ideals \(\mathfrak {l}_i\) are fixed global parameters, the ideal \(\prod _i\mathfrak {l}_i^{e_i}\) may simply be represented as a vector \((e_1,\dots ,e_n)\).
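The sampling step can be sketched as follows. This is an illustration only: the class number is replaced by the rough estimate \(\sqrt{p}\) from Sect. 3, the toy values \(p=419\), \(n=3\) are assumptions, and the function names are hypothetical.

```python
import secrets
from math import isqrt

def smallest_m(n, class_number_estimate):
    """Smallest m >= 0 with (2m + 1)^n >= class_number_estimate."""
    m = 0
    while (2 * m + 1) ** n < class_number_estimate:
        m += 1
    return m

def sample_private_key(n, m):
    """Exponent vector (e_1, ..., e_n), each e_i uniform in {-m, ..., m}."""
    return [secrets.randbelow(2 * m + 1) - m for _ in range(n)]

p, n = 419, 3                      # toy values, no security
m = smallest_m(n, isqrt(p))        # sqrt(p) stands in for #cl(O)
print(m, sample_private_key(n, m))
```

Each coordinate is uniform on its range, but the induced distribution on ideal classes is only heuristically close to uniform, as discussed in Sect. 7.1.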

Evaluating the class-group action. Computing the action of an ideal class represented by \(\prod _i\mathfrak {l}_i^{e_i}\) on an elliptic curve E proceeds as outlined in Sect. 3. Since \(\pi ^2=-p\equiv 1\pmod {\ell _i}\), we are now in the favourable situation that the eigenvalues of Frobenius on all \(\ell _i\)-torsion subgroups are \(+1\) and \(-1\). Hence we can efficiently compute the action of \(\mathfrak {l}_i\) (resp. \(\overline{\mathfrak {l}_i}\)) by finding an \(\mathbb F_p\)-rational point (resp. \(\mathbb F_{p^2}\)-rational with Frobenius eigenvalue \(-1\)) of order \(\ell _i\) and applying Vélu-type formulas. This step could simply be repeated for each ideal \(\mathfrak {l}_i^{\pm 1}\) whose action is to be evaluated, but see Sect. 8 for a more efficient method.
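The following toy sketch evaluates the action one isogeny step at a time. It is not the optimized method of Sect. 8: it uses affine arithmetic, the toy prime \(p=419\) (an assumption for illustration), and the codomain formula \(A'=\tau (A-3\sigma )\) for Montgomery curves from [53, Proposition 1], with \(\tau =\prod x_T\) and \(\sigma =\sum (x_T-1/x_T)\) over the non-zero kernel points \(T\). Negative exponents are handled by passing to the quadratic twist \(E_{-A}\) (cf. Remark 5). All function names are hypothetical.

```python
import random

P_TOY, ELLS = 419, (3, 5, 7)   # toy parameters: 419 = 4*3*5*7 - 1, no security

def ec_add(P, Q, A, p):
    """Affine addition on y^2 = x^3 + A*x^2 + x over F_p; None is infinity."""
    if P is None:
        return Q
    if Q is None:
        return P
    (x1, y1), (x2, y2) = P, Q
    if x1 == x2 and (y1 + y2) % p == 0:
        return None
    if x1 == x2:   # doubling
        lam = (3 * x1 * x1 + 2 * A * x1 + 1) * pow(2 * y1 % p, -1, p) % p
    else:
        lam = (y2 - y1) * pow((x2 - x1) % p, -1, p) % p
    x3 = (lam * lam - A - x1 - x2) % p
    return (x3, (lam * (x1 - x3) - y1) % p)

def ec_mul(k, P, A, p):
    """Double-and-add scalar multiplication [k]P."""
    R = None
    while k:
        if k & 1:
            R = ec_add(R, P, A, p)
        P = ec_add(P, P, A, p)
        k >>= 1
    return R

def rational_point_of_order(ell, A, p, rng):
    """Random F_p-rational point of order ell, assuming #E_A(F_p) = p + 1."""
    while True:
        x = rng.randrange(p)
        rhs = (x * x * x + A * x * x + x) % p
        y = pow(rhs, (p + 1) // 4, p)       # square-root candidate (p = 3 mod 4)
        if y * y % p != rhs:
            continue                        # this x belongs to the twist
        Q = ec_mul((p + 1) // ell, (x, y), A, p)
        if Q is not None:
            return Q

def step(A, ell, p, rng):
    """Montgomery coefficient of [l]E_A for l = (ell, pi - 1)."""
    K = rational_point_of_order(ell, A, p, rng)
    tau, sigma, T = 1, 0, K
    for _ in range(ell - 1):                # T runs over the non-zero kernel points
        x = T[0]
        tau = tau * x % p
        sigma = (sigma + x - pow(x, -1, p)) % p
        T = ec_add(T, K, A, p)
    return tau * (A - 3 * sigma) % p

def action(exponents, A, p=P_TOY, ells=ELLS, rng=None):
    """Apply prod l_i^{e_i} to E_A; negative exponents go through the twist E_{-A}."""
    rng = rng or random.Random()
    for ell, e in zip(ells, exponents):
        for _ in range(abs(e)):
            A = step(A, ell, p, rng) if e > 0 else -step(-A % p, ell, p, rng) % p
    return A

a, b = (2, -1, 1), (-1, 1, -2)
print(action(a, action(b, 0)), action(b, action(a, 0)))
```

The two printed coefficients illustrate commutativity: applying the vectors in either order should lead to the same Montgomery coefficient.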

5 Representing and Validating \(\mathbb F_p\)-isomorphism Classes

A major unsolved problem of SIDH is its lack of public-key validation, i.e., the inability to verify that a public key was honestly generated. This shortcoming leads to polynomial-time active attacks [29] on static variants for which countermeasures are expensive. For example, the actively secure variant SIKE [37] applies a transformation proposed by Hofheinz, Hövelmanns, and Kiltz [36] which is similar to the Fujisaki–Okamoto transform [26], essentially doubling the running time on the recipient’s side compared to an ephemeral key exchange.

The following proposition tackles this problem for our family of CSIDH instantiations. Moreover, it shows that the Montgomery coefficient forms a unique representative for the \(\mathbb F_p\)-isomorphism class resulting from the group action, hence may serve as a shared secret without taking j-invariants.

Proposition 8

Let \(p\ge 5\) be a prime such that \(p\equiv 3\pmod 8\), and let \(E/\mathbb F_p\) be a supersingular elliptic curve. Then \(\mathrm {End}_p(E) = \mathbb Z[\pi ]\) if and only if there exists \(A \in \mathbb F_p\) such that E is \(\mathbb F_p\)-isomorphic to the curve \(E_A:y^2 = x^3 + Ax^2 + x\). Moreover, if such an \(A\) exists then it is unique.

Proof

First suppose that E is isomorphic over \(\mathbb F_p\) to \(E_A\) for some \(A\in \mathbb F_p\). If \(E_A\) has full \(\mathbb F_p\)-rational \(2\)-torsion, then Table 1 of [16] shows that either \(E_A\) or its quadratic twist must have order divisible by 8. However, both have cardinality \(p+1\equiv 4\pmod 8\). Hence \(E_A\) can only have one \(\mathbb F_p\)-rational point of order 2. With Theorem 2.7 of [22], we can conclude \(\mathrm {End}_p(E)=\mathrm {End}_p(E_A)=\mathbb Z[\pi ]\).

Now assume that \(\mathrm {End}_p(E) = \mathbb Z[\pi ]\). By Theorem 7, the class group \(\mathrm {cl}(\mathbb Z[\pi ])\) acts transitively on \(\mathcal {E}\ell \ell _p(\mathbb Z[\pi ],\pi )\), so in particular there exists \([\mathfrak {a}]\in \mathrm {cl}(\mathbb Z[\pi ])\) such that \([\mathfrak {a}]E_0 = E\), where \(E_0:y^2 = x^3 + x\). Choosing a representative \(\mathfrak {a}\) that has norm coprime to 2p yields a separable \(\mathbb F_p\)-isogeny \(\varphi _\mathfrak {a}:E_0 \rightarrow E\) of odd degree. Thus, by [53, Proposition 1] there exists an \(A\in \mathbb F_p\) and a separable isogeny \(\psi :E_0 \rightarrow E_A:y^2 = x^3 + Ax^2 + x\) defined over \(\mathbb F_p\) such that \(\ker \psi =\ker \varphi _\mathfrak {a}\). As isogenies defined over \(\mathbb F_p\) with given kernel are unique up to post-composition with \(\mathbb F_p\)-isomorphisms (Lemma 6), we conclude that \(E\) is \(\mathbb F_p\)-isomorphic to \(E_A\).

Finally, let \(B\in \mathbb F_p\) such that \(E_A\cong E_B:Y^2 = X^3 + BX^2 + X\). Then by [59, Proposition III.3.1(b)] there exist \(u\in \mathbb F_p^*\) and \(r,s,t\in \mathbb F_p\) such that

$$ x = u^2X + r \,,\quad y = u^3Y + su^2X + t\,\text{. } $$

Substituting this into the curve equation for \(E_A\) and subtracting the equation of \(E_B\) (scaled by \(u^6\)) gives an expression that equals zero in the function field and thus leads to a linear relation over \(\mathbb F_p\) between the functions 1, X, \(X^2\), Y, and \(XY\). Writing \(\infty \) for the point at infinity of \(E_B\), it follows from Riemann–Roch [59, Theorem 5.4] that \(\mathcal {L}(5(\infty ))\) is a 5-dimensional \(\mathbb F_p\)-vector space with basis \(\{1,X,Y,X^2,XY\}\). Hence the obtained linear relation must be trivial, and a straightforward computation yields the relations

$$\begin{aligned} s = t&= 0 \,,&3r^2 + 2Ar + 1&= u^4 \,,\\ 3r + A&= Bu^2 \,,&r^3 + Ar^2 + r&= 0. \end{aligned}$$

But since \(E_A\) only has a single \(\mathbb F_p\)-rational point of order 2, the only \(r\in \mathbb F_p\) such that \(r^3 + Ar^2 + r = 0\) is simply \(r = 0\). In that case \(u^4 = 1\), and hence \(u = \pm 1\) since \(p\equiv 3\pmod 8\). In particular, \(u^2 = 1\) and thus \(A = B\).     \(\square \)

Therefore, by choosing public keys to consist of a Montgomery coefficient \(A\in \mathbb F_p\), Proposition 8 guarantees that A represents a curve in the correct isogeny class \(\mathcal {E}\ell \ell _p(\mathcal O,\pi )\), where \(\pi =\sqrt{-p}\) and \(\mathcal O=\mathbb Z[\pi ]\), under the assumption that it is smooth (i.e. \(A\notin \{\pm 2\}\)) and supersingular.

Verifying supersingularity. As \(p\ge 5\), an elliptic curve E defined over \(\mathbb F_p\) is supersingular if and only if \(\#E(\mathbb F_p)=p+1\) [59, Exercise 5.10]. In general, proving that an elliptic curve has a given order N is easy if the factorization of N is known; exhibiting a subgroup (or in particular, a single point) whose order d is a divisor of N greater than \(4\sqrt{p}\) implies the order must be correct. Indeed, the condition \(d>4\sqrt{p}\) implies that there exists only one multiple of d in the Hasse interval \([p+1-2\sqrt{p};\, p+1+2\sqrt{p}]\) [35]. This multiple must be the group order by Lagrange’s theorem.

Now note that a random point generally has very large order \(d\). In our case \( E(\mathbb F_p) \cong \mathbb Z/{4} \times \prod _{i=1}^n\mathbb Z/{\ell _i} \text{, }\) so that \(\ell _i\mid d\) with probability \((\ell _i-1)/\ell _i\). Ignoring the even part, this shows that the expected order is lower bounded by

$$ \prod _{i=1}^n \Big (\ell _i-1+\frac{1}{\ell _i}\Big ) \,\text{. } $$

This product is about the same size as p, and it is easily seen that a random point will with overwhelming probability have order (much) greater than \(4\sqrt{p}\). This observation leads to a straightforward verification method, see Algorithm 1.

[Algorithm 1: Verifying supersingularity.]

If the condition \(d>4\sqrt{p}\) does not hold at the end of Algorithm 1, the point P had too small order to prove \(\#E(\mathbb F_p)=p+1\). In this case one may retry with a new random point P (although this outcome has negligible probability and could just be ignored). There is no possibility of wrongly classifying an ordinary curve as supersingular.
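The verification method described above can be sketched as follows. This is one possible reading of the procedure based on the surrounding text, not a transcription of Algorithm 1; it uses the standard x-only Montgomery ladder, toy parameters are assumed in the usage line, and the function names are hypothetical.

```python
import random

def xdbl(P, A24, p):
    """x-only doubling on y^2 = x^3 + A*x^2 + x, with A24 = (A + 2) / 4."""
    X, Z = P
    t0, t1 = (X + Z) ** 2 % p, (X - Z) ** 2 % p
    t2 = (t0 - t1) % p
    return (t0 * t1 % p, t2 * (t1 + A24 * t2) % p)

def xadd(P, Q, D, p):
    """x-only differential addition: x(P + Q) from x(P), x(Q) and D = x(P - Q)."""
    (X1, Z1), (X2, Z2), (X3, Z3) = P, Q, D
    a = (X1 - Z1) * (X2 + Z2) % p
    b = (X1 + Z1) * (X2 - Z2) % p
    return (Z3 * (a + b) ** 2 % p, X3 * (a - b) ** 2 % p)

def ladder(k, P, A, p):
    """Montgomery ladder: x([k]P) as (X : Z); Z == 0 encodes infinity."""
    X, Z = P
    if Z == 0:
        return (1, 0)
    if X == 0:      # P = (0, 0) has order 2; differential addition degenerates
        return (0, 1) if k & 1 else (1, 0)
    A24 = (A + 2) * pow(4, -1, p) % p
    R0, R1 = (1, 0), P
    for bit in bin(k)[2:]:
        if bit == "1":
            R0, R1 = xadd(R0, R1, P, p), xdbl(R1, A24, p)
        else:
            R0, R1 = xdbl(R0, A24, p), xadd(R0, R1, P, p)
    return R0

def is_supersingular(A, p, ells, rng, max_tries=64):
    """True / False, or None if every attempted point had too small order."""
    for _ in range(max_tries):
        P = (rng.randrange(1, p), 1)    # random x; the point may lie on the twist
        d = 1
        for ell in ells:
            Q = ladder((p + 1) // ell, P, A, p)
            if ladder(ell, Q, A, p)[1] != 0:
                return False            # [p + 1]P is not infinity
            if Q[1] != 0:
                d *= ell                # ell divides the order of P
            if d * d > 16 * p:          # d > 4 * sqrt(p)
                return True
    return None

print(is_supersingular(0, 419, [3, 5, 7], random.Random()))
```

With the toy prime \(p=419\) a single random point succeeds only with moderate probability, because the full product \(3\cdot 5\cdot 7\) is needed to exceed \(4\sqrt{p}\); hence the retry loop. For cryptographic sizes the text argues that a retry is almost never needed.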

Note moreover that if x-only Montgomery arithmetic is used (as we suggest) and the point P is obtained by choosing a random x-coordinate in \(\mathbb F_p\), there is no need to differentiate between points defined over \(\mathbb F_p\) and \(\mathbb F_{p^2}\); any x-coordinate in \(\mathbb F_p\) works. Indeed, any point that has an x-coordinate in \(\mathbb F_p\) but is only defined over \(\mathbb F_{p^2}\) corresponds to an \(\mathbb F_p\)-rational point on the quadratic twist, which is supersingular if and only if the original curve is supersingular.

There are more optimized variants of this algorithm; the bulk of the work lies in the scalar multiplications required to compute the points \(Q_i=[(p+1)/\ell _i]P\). Since they are all multiples of P with shared factors, one may more efficiently compute all \(Q_i\) at the same time using a divide-and-conquer strategy (at the expense of higher memory usage). See Sect. 8, and in particular Algorithm 3, for details.
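The divide-and-conquer idea can be illustrated independently of any curve arithmetic. The sketch below is in the spirit of the strategy just described, not a transcription of Algorithm 3; `mul(k, P)` is a hypothetical scalar-multiplication callback, and the usage lines replace the curve group by the additive group \(\mathbb Z/420\) purely for demonstration.

```python
def all_cofactor_multiples(P, ells, mul):
    """Return [Q_1, ..., Q_n] with Q_i = [prod of all ells[j], j != i] P.

    The list is split in half; the points needed for one half are first
    multiplied by all primes of the other half, so shared factors are
    processed once instead of once per index.
    """
    if len(ells) == 1:
        return [P]
    h = len(ells) // 2
    left, right = ells[:h], ells[h:]
    prod_left = prod_right = 1
    for l in left:
        prod_left *= l
    for l in right:
        prod_right *= l
    return (all_cofactor_multiples(mul(prod_right, P), left, mul)
            + all_cofactor_multiples(mul(prod_left, P), right, mul))

# Stand-in group Z/420 with generator 1; "points" are integers modulo 420.
mul = lambda k, x: k * x % 420
P4 = mul(4, 1)                              # clear the cofactor 4 first
print(all_cofactor_multiples(P4, [3, 5, 7], mul))
```

To obtain \(Q_i=[(p+1)/\ell _i]P\) one starts from \([4]P\), as in the usage lines; the recursion depth is logarithmic in \(n\), and the intermediate points held on the call stack account for the higher memory usage mentioned above.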

6 Non-interactive Key Exchange

Starting from the class-group action on supersingular elliptic curves and the parameter choices outlined in Sects. 3 and 4, one obtains the following non-interactive key-exchange protocol.

[Figure: the non-interactive key-exchange protocol.]
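The message flow of a Diffie–Hellman-style non-interactive key exchange over a commutative group action can be sketched generically. In the sketch below the group action and the public-key validation are passed in as callbacks; the stand-in action used in the usage lines (translation on \(\mathbb Z/N\)) is NOT CSIDH and offers no security, it only exercises the flow. All names and values are hypothetical.

```python
import secrets

def keygen(act, start, n, m):
    """Private key: exponent vector in {-m..m}^n; public key: act(private, start)."""
    private = [secrets.randbelow(2 * m + 1) - m for _ in range(n)]
    return private, act(private, start)

def shared_secret(act, validate, private, peer_public):
    """Validate the peer's public key, then apply the own private key to it."""
    if not validate(peer_public):
        raise ValueError("invalid public key")
    return act(private, peer_public)

# Stand-in commutative action (not CSIDH): Z^n acts on Z/N by translation.
N, WEIGHTS = 1009, (1, 17, 300)

def toy_act(exponents, x):
    return (x + sum(e * w for e, w in zip(exponents, WEIGHTS))) % N

def toy_validate(x):
    return isinstance(x, int) and 0 <= x < N

a_priv, a_pub = keygen(toy_act, 0, 3, 5)
b_priv, b_pub = keygen(toy_act, 0, 3, 5)
print(shared_secret(toy_act, toy_validate, a_priv, b_pub)
      == shared_secret(toy_act, toy_validate, b_priv, a_pub))
```

In the setting of this paper, `act` would be the class-group action on Montgomery coefficients and `validate` the supersingularity check of Sect. 5; no message needs to be exchanged beyond the two static public keys.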

Remark 9

Besides key exchange, we expect that our cryptographic group action will have several other applications, given the resemblance with traditional Diffie–Hellman and the ease of verifying the correctness of public keys. We refer to previous papers on group actions for a number of suggestions in this direction, in particular Brassard–Yung [7], Couveignes [17, Sect. 4], and Stolbunov [61]. We highlight the following 1-bit identification scheme, which in our case uses a key pair \(([\mathfrak {a}], A)\) as above. One randomly samples an element \([\mathfrak {b}]\in \mathrm {cl}(\mathcal O)\) and commits to a curve \(E' = [\mathfrak {b}]E_0\). Depending on a challenge bit \(b\), one then releases either \([\mathfrak {b}]\) or \([\mathfrak {c}] := [\mathfrak {b}][\mathfrak {a}]^{-1} \), as depicted in Fig. 3. As already pointed out in Stolbunov’s PhD thesis [62, Sect. 2.B], this can be turned into a signature scheme by repeated application of the 1-bit protocol and by applying the Fiat–Shamir [24] or Unruh [68] transformation. However, we point out that it is not immediately clear how to represent \([\mathfrak {c}] \) in a way that is efficiently computable and leaks no information about the secret key \([\mathfrak {a}]\). We leave a resolution of this issue for future research, but mention that a related problem was recently tackled by Galbraith, Petit and Silva [30] who studied a similar triangular identification protocol in the context of SIDH.

Fig. 3. A 1-bit identification protocol.

7 Security

The central problem of our new primitive is the following analogue to the classical discrete-logarithm problem.

Problem 10

(Key recovery). Given two supersingular elliptic curves \(E,E'\) defined over \(\mathbb F_p\) with the same \(\mathbb F_p\)-rational endomorphism ring \(\mathcal O\), find an ideal \(\mathfrak {a}\) of \(\mathcal O\) such that \([\mathfrak {a}]E=E'\). This ideal must be represented in such a way that the action of \([\mathfrak {a}]\) on a curve can be evaluated efficiently, for instance \(\mathfrak {a}\) could be given as a product of ideals of small norm.

Note that just like in the classical group-based scenario, security notions of Diffie–Hellman schemes built from our primitive rely on slightly different hardness assumptions (cf. Sect. 1.1) that are straightforward translations of the computational and decisional Diffie–Hellman problems. However, continuing the analogy with the classical case, and since we are not aware of any ideas to attack the key exchange without recovering one of the keys, we will assume in the following analysis that the best approach to breaking the key-exchange protocol is to solve Problem 10.

We point out that the “inverse Diffie-Hellman problem” is easy in the context of CSIDH: given \([\mathfrak {a}]E_0\) we can compute \([\mathfrak {a}]^{-1}E_0\) by mere quadratic twisting; see Remark 5. This contrasts with the classical group-based setting [28, Sect. 21.1]. Note that just like identifying a point \((x,y)\) with its inverse \((x,-y)\) in an ECDLP setting, this implies a security loss of one bit under some attacks: an attacker may consider the curves \([\mathfrak {a}]E\) and \([\mathfrak {a}]^{-1}E\) identical, which reduces the search space by half.

No torsion-point images. One of the most worrying properties of SIDH seems to be that Alice and Bob publish the images of known points under their secret isogenies along with the codomain curve, i.e., a public key is of the form \((E',\varphi (P),\varphi (Q))\) where \(\varphi :E\rightarrow E'\) is a secret isogeny and \(P,Q\in E\) are publicly known points. Although thus far nobody has succeeded in making use of this extra information to break the original scheme, Petit presented an attack using these points when overstretched, highly asymmetric parameters are used [50]. The Couveignes–Rostovtsev–Stolbunov scheme, and consequently our new scheme CSIDH, does not transmit such additional points—a public key consists of only an elliptic curve. Thus we are confident that a potential future attack against SIDH based on these torsion points would not apply to CSIDH.

Chosen-ciphertext attacks. As explained in Sect. 5, the CSIDH group action features efficient public-key validation. This implies it can be used without applying a CCA transform such as the Fujisaki–Okamoto transform [26], thus enabling efficient non-interactive key exchange and other applications in a post-quantum world.

7.1 Classical Security

We begin by considering classical attacks.

Exhaustive key search. The most obvious approach to attack any cryptosystem is to simply search through all possible keys. In the following, we will argue that our construction provides sufficient protection against key search attacks, including dumb brute force and (less naïvely) a meet-in-the-middle approach.

As explained in Sect. 4, a private key of our scheme consists of an exponent vector \((e_1,\dots ,e_n)\) where each \(e_i\) is in the range \(\{-m,\dots ,m\}\), representing the ideal class \([\mathfrak {l}_1^{e_1}\mathfrak {l}_2^{e_2}\cdots \mathfrak {l}_n^{e_n}]\in \mathrm {cl}(\mathcal O)\). There may (and typically will) be multiple such vectors that represent the same ideal class and thus form equivalent private keys. However, we argue (heuristically) that the number of short representations per ideal class is small. Here and in the following, “short” means that all \(e_i\) are in the range \(\{-m,\dots ,m\}\). The maximum number of such short representations immediately yields the min-entropy of our sampling method, which measures the amount of work a brute-force attacker has to do while conducting an exhaustive search for the key.

We assume in the following discussion that \(\mathrm {cl}(\mathcal O)\) is “almost cyclic” in the sense that it has a very large cyclic component, say of order N not much smaller than \(\#\mathrm {cl}(\mathcal O)\). According to a heuristic of Cohen and Lenstra, this is true with high probability for a “random” imaginary quadratic field [13, Sect. 9.I], and this conjecture is in line with our own experimental evidence. So suppose

$$ \rho :\mathrm {cl}(\mathcal O) \twoheadrightarrow (\mathbb Z/{N},+) $$

is a surjective group homomorphism (which may be thought of as a projection to the large cyclic subgroup followed by an isomorphism) and define \(\alpha _i=\rho ([\mathfrak {l}_i])\). We may assume that \(\alpha _1=1\); this can be done without loss of generality whenever at least one of the \([\mathfrak {l}_i]\) has order N in the class group. For some fixed \([\mathfrak {a}]\in \mathrm {cl}(\mathcal O)\), any short representation \([\mathfrak {l}_1^{e_1}\mathfrak {l}_2^{e_2}\cdots \mathfrak {l}_n^{e_n}]=[\mathfrak {a}]\) yields a short solution to the linear congruence

$$ e_1 + e_2\alpha _2 + \dots + e_n\alpha _n \equiv \rho ([\mathfrak {a}]) \pmod N \text{, } $$

so counting solutions to this congruence gives an upper bound on the number of short representations of \([\mathfrak {a}]\). These solutions are exactly the points in some shifted version (i.e., a coset) of the integer lattice spanned by the rows of the matrix

$$ L = \begin{pmatrix} N &{} 0 &{} 0 &{} \cdots &{} 0 \\ -\alpha _2 &{} 1 &{} 0 &{} \cdots &{} 0 \\ -\alpha _3 &{} 0 &{} 1 &{} \cdots &{} 0 \\ \vdots &{} \vdots &{} \vdots &{} \ddots &{} \vdots \\ -\alpha _n &{} 0 &{} 0 &{} \cdots &{} 1 \\ \end{pmatrix} {\text{, }} $$

so by applying the Gaussian heuristic [49, Chap. 2, Definition 8] one expects

$$ {{\text {vol}}\,[-m;m]^n} \,/\, {\det L} = (2m+1)^n/N $$

short solutions. Since we assumed \(\mathrm {cl}(\mathcal O)\) to be almost cyclic, this ratio is not much bigger than \((2m+1)^n/\#\mathrm {cl}(\mathcal O)\), which is not very large for our choice of m as small as possible with \((2m+1)^n\ge \#\mathrm {cl}(\mathcal O)\).
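The counting argument can be tried out on a tiny instance. The sketch below enumerates all short vectors for hypothetical toy values of \(N\), \(m\) and the \(\alpha _i\) (chosen arbitrarily, with \(\alpha _1=1\) as in the text) and compares the worst-case number of short solutions with the Gaussian-heuristic average \((2m+1)^n/N\).

```python
from itertools import product
from math import log2

def short_solution_counts(alphas, N, m):
    """counts[c] = number of e in {-m..m}^n with sum(e_i * alpha_i) = c (mod N)."""
    counts = [0] * N
    for e in product(range(-m, m + 1), repeat=len(alphas)):
        counts[sum(ei * ai for ei, ai in zip(e, alphas)) % N] += 1
    return counts

alphas, N, m = (1, 5, 11, 20), 73, 1     # hypothetical toy values
counts = short_solution_counts(alphas, N, m)
total = (2 * m + 1) ** len(alphas)

print("average number of short solutions:", total / N)
print("maximum over all residues:        ", max(counts))
print("min-entropy of the sampler (bits):", log2(total / max(counts)))
print("entropy of uniform on Z/N (bits): ", log2(N))
```

The average over all residues is exactly \((2m+1)^n/N\), since every short vector lands in exactly one residue class; the quantity of interest for brute-force security is the maximum, which determines the min-entropy.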

As a result, we expect the complexity of a brute-force search to be around \(2^{\log \sqrt{p}-\varepsilon }\) for some positive \(\varepsilon \) that is small relative to \(\log \sqrt{p}\). To verify our claims, we performed computer experiments with many choices of p of up to 40 bits (essentially brute-forcing the number of representations for all elements) and found no counterexamples to the heuristic result that our sampling method loses only a few bits of brute-force security compared to uniform sampling from the class group. For our sizes of p, the min-entropy was no more than 4 bits less than that of a perfectly uniform distribution on the class group (i.e. \(\varepsilon \le 4\)). Of course this loss factor may grow in some way with bigger choices of p (a plot of the data points for small sizes suggests an entropy loss proportional to \(\log \log p\)), but we see no indication for it to explode beyond a few handfuls of bits, as long as we find m and n so that \((2m+1)^n\) is not much larger than \(\#\mathrm {cl}(\mathcal O)\).

Meet-in-the-middle key search. Since a private key trivially decomposes into a product of two smooth ideals drawn from smaller sets (e.g. splitting \([\mathfrak {l}_1^{e_1}\mathfrak {l}_2^{e_2}\cdots \mathfrak {l}_n^{e_n}]\) as \([\mathfrak {l}_1^{e_1}\cdots \mathfrak {l}_{\nu }^{e_{\nu }}]\cdot [\mathfrak {l}_{\nu +1}^{e_{\nu +1}}\cdots \mathfrak {l}_n^{e_n}]\) for some \(\nu \in \{1,\dots ,n\}\)), the usual time-memory trade-offs à la baby-step giant-step [56] with an optimal time complexity of \(O\big (\sqrt{\#\mathrm {cl}(\mathcal O)}\big )\approx O(\root 4 \of {p})\) apply. Another interpretation of this algorithm is finding a path between two nodes in the underlying isogeny graph by constructing a breadth-first tree starting from each of them, each using a certain subset of the edges, and looking for a collision. Details, including a memoryless variation of this concept, can be found in Delfs and Galbraith’s paper [22], and for the ordinary case in [27].

Remark 11

The algorithms mentioned thus far scale exponentially in the size of the key space, hence they are asymptotically more expensive than the quantum attacks outlined below, which are subexponential in the class-group size. This implies one could possibly balance the costs of the different attacks and use a key space smaller than \(\#\mathrm {cl}(\mathcal O)\) without any loss of security (unless the key space is chosen particularly badly, e.g., as a subgroup), which leads to improved performance. We leave a more thorough analysis of this idea for future work.

Pohlig–Hellman-style attacks. Notice that the set we are acting on does not form a group with efficiently computable operations (that are compatible with the action of \(\mathrm {cl}(\mathcal O)\)). Thus there seems to be no way to apply Pohlig–Hellman-style algorithms making use of the decomposition of finite abelian groups. In fact, the Pohlig–Hellman algorithm relies on efficiently computable homomorphisms to proper subgroups, which in the setting at hand would correspond to an efficient algorithm that “projects” a given curve to the orbit of \(E_0\) under a subgroup action. Therefore, we believe the structure of the class group to be largely irrelevant (assuming it is big enough); in particular, we do not require it to have a large prime-order subgroup.

7.2 Quantum Security

We now discuss the state of quantum algorithms to solve Problem 10.

Grover’s algorithm and claw finding. Grover search [32] via claw finding, as described in [38], is fully applicable to CSIDH as well, leading to an attack on Problem 10 in \(O(\root 6 \of {p})\) calls to a quantum oracle that computes our group action. The idea is to split the search space for collisions into a classical \(O(\root 6 \of {p})\) target part and a \(O(\root 3 \of {p})\) search part on which a quantum search is applied. Our choices of \(p\) that lead to classical security are also immediately large enough to imply quantum security against this attack (cf. [48, Sect. 4.A.5 in Call for Proposals]). That is, the number of queries to our quantum oracle necessary to solve Problem 10 is larger than the number of quantum queries to an AES oracle needed to retrieve the key of the corresponding AES instantiation via Grover’s algorithm. For example, an AES-128 key can be recovered with approximately \(2^{64}\) (quantum) oracle queries, which requires us to set \(p>2^{384}\). However, \(p\) is much larger than that (see Table 1) due to the existence of subexponential quantum attacks.

The abelian hidden-shift problem. A crucial result by Kuperberg [43] is an algorithm to solve the hidden-shift problem with time, query and space complexity \(2^{O(\sqrt{\log N})}\) in an abelian group \(H\) of order \(N\). He also showed that any abelian hidden-shift problem reduces to a dihedral hidden-subgroup problem on a different but closely related oracle. A subsequent alternative algorithm by Regev [52] achieves polynomial quantum space complexity with an asymptotically worse time and query complexity of \(2^{O(\sqrt{\log N\log \log N})}\). A follow-up algorithm by Kuperberg [44] uses \(2^{O(\sqrt{\log N})}\) time, queries and classical space, but only \(O(\log {N})\) quantum space. All these algorithms have subexponential time and space complexity.

Attacking the isogeny problem. The relevance of these quantum algorithms to Problem 10 has been observed by Childs–Jao–Soukharev [12] in the ordinary case and by Biasse–Jao–Sankar [4] in the supersingular setting. By defining functions as \(f_0:[\mathfrak {b}]\mapsto [\mathfrak {b}]E\) and \(f_1:[\mathfrak {b}]\mapsto [\mathfrak {b}]E'=[\mathfrak {b}][\mathfrak {a}]E\), the problem can be viewed as an abelian hidden-shift problem with respect to \(f_0\) and \(f_1\). We note that each query requires evaluating the functions \(f_i\) on arbitrary ideal classes (i.e. without being given a representative that is a product of ideals of small prime norm) which is non-trivial. However, Childs–Jao–Soukharev show this can be done in subexponential time and space [12, Sect. 4].

Subexponential vs. practical. An important remark about all these quantum algorithms is that they do not immediately lead to estimates for runtime and memory requirements on concrete instantiations with \(H = \mathrm {cl}(\mathcal O)\). Although the algorithms by Kuperberg and Regev are shown to have subexponential complexity in the limit, this asymptotic behavior is not enough to understand the space and time complexity on actual (small) instances. For example, Kuperberg’s first paper [43, Theorem 3.1] mentions \(O(2^{3\sqrt{\log N}})\) oracle queries to achieve a non-negligible success probability when N is a power of a small integer. It also presents a second algorithm that runs in \(\tilde{O}(3^{\sqrt{2\log _3 N}}) = O(2^{1.8\sqrt{\log N}})\) [43, Theorem 5.1]. His algorithms handle arbitrary group structures but he does not work out more exact counts for those. Of course, this does not contradict the time complexity of \(2^{O(\sqrt{\log N})}\) as stated above, but for a concrete security analysis the hidden constants certainly matter a lot and ignoring the O typically underestimates the security. Childs–Jao–Soukharev [12, Theorem 5.2] prove a query complexity of

$$\begin{aligned} L_N\big [1/2,\sqrt{2} \big ] = \exp { \Big [\big (\sqrt{2} + o(1)\big ) \sqrt{\ln {N}\ln {\ln {N}}} \Big ] }, \end{aligned} \tag{1}$$

where \(N = \# \mathrm {cl}(\mathcal O)\), for using Regev’s algorithm for solving the hidden-shift problem. This estimates only the query complexity, so does not include the cost of queries to the quantum oracle (i.e. the isogeny oracle). Childs–Jao–Soukharev present two algorithms to compute the isogeny oracle, the fastest of which is due to Bisson [5]. In [12, Remark 4.8] Childs–Jao–Soukharev give an upper bound of

$$\begin{aligned} L_p [1/2,1/\sqrt{2}] = \exp { \Big [\big (1/\sqrt{2} + o(1)\big ) \sqrt{\ln {p}\ln {\ln {p}}} \Big ] } \end{aligned} \tag{2}$$

on the running time of Bisson’s algorithm.

Remark 12

Childs–Jao–Soukharev compute the total cost for computing the secret isogeny in [12, Remark 5.5] to be \(L_p[1/2,3/\sqrt{2}]\) (using Regev and Bisson’s algorithms, requiring only polynomial space). They appear to obtain this by setting \(N=p\) when multiplying (1) and (2), but as \(N \sim \sqrt{p}\) this is an overestimation and should be \(L_p[1/2,1+1/\sqrt{2}]\). Either way, this is the largest asymptotic complexity of the estimates. Also, Galbraith and Vercauteren [31] point out this algorithm actually has superpolynomial space complexity due to the high memory usage of the isogeny oracle in [12], but see [39].
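To get a feeling for the difference between the two constants, one can evaluate the \(L\)-notation numerically. The sketch below simply drops the \(o(1)\) term, which is an assumption made for illustration only; the resulting numbers are therefore not security estimates. The choice of a 512-bit \(p\) is likewise just an example.

```python
from math import log, sqrt

def log2_L(bits, c):
    """log2 of exp(c * sqrt(ln x * ln ln x)) for x = 2^bits, i.e. L_x[1/2, c]
    with the o(1) term dropped."""
    ln_x = bits * log(2)
    return c * sqrt(ln_x * log(ln_x)) / log(2)

bits_p = 512
for label, c in (("3/sqrt(2)", 3 / sqrt(2)), ("1 + 1/sqrt(2)", 1 + 1 / sqrt(2))):
    print(f"L_p[1/2, {label}] ~ 2^{log2_L(bits_p, c):.1f}")

# Query complexity (1) evaluated at N ~ sqrt(p), compared with L_p[1/2, 1]:
print(log2_L(bits_p // 2, sqrt(2)), log2_L(bits_p, 1.0))
```

The last line shows that \(L_N[1/2,\sqrt{2}]\) with \(N\approx \sqrt{p}\) is close to \(L_p[1/2,1]\), which is where the corrected constant \(1+1/\sqrt{2}\) comes from after multiplying with (2).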

Childs–Jao–Soukharev additionally compute the total time \(L_p[1/2,1/\sqrt{2}]\) for computing the secret isogeny combining Kuperberg [43] and Bisson. This requires superpolynomial storage (also before considering the memory usage of the oracle). Note that in this combination the costs of the oracle computation dominate asymptotically.

It is important to mention that asymptotically worse algorithms may provide practical improvements on our “small” instances over either of the algorithms studied by Childs–Jao–Soukharev: For example, Couveignes [17, Sect. 5] provides heuristic arguments that one can find smooth representatives of ideal classes by computing the class-group structure (which can be done in polynomial time on a quantum computer [34]) and applying a lattice-basis-reduction algorithm such as LLL [45] to its lattice of relations. This might be more efficient than using Childs–Jao–Soukharev’s subexponential oracle. However, note that this method makes evaluating the oracle several times harder for the attacker than for legitimate users, thus immediately giving a few additional bits of security, since users only evaluate the action of very smooth ideals by construction. We believe further research in this direction is necessary and important, since it will directly impact the cost of an attack, but we consider a detailed analysis of all these algorithms and possible trade-offs to be beyond the scope of this work.

Remark 13

After we posted a first version of this paper on the Cryptology ePrint Archive, there were three independent attempts at assessing the security of CSIDH.

Biasse, Iezzi, and Jacobson [3] work out some more details of the attack ideas mentioned above for Regev’s algorithm. They focus on the class-group-computation part of the oracle and they work out how to represent random elements of the class group as a product of small prime ideals. Their analysis is purely asymptotic and an assessment of the actual cost on specific instances is explicitly left for future work.

Bonnetain and Schrottenloher [6] determine (quantum) query complexities for breaking CSIDH under the assumption that the quantum memory can be made very large, which implies that Kuperberg’s faster algorithms would be applicable. They estimate the number of oracle queries as \((5\pi ^2/4) 2^{1.8\sqrt{\log N}}\). The 1.8 appears to approximate the \(\sqrt{2\log 3}\) in Kuperberg [43, Theorem 5.1]. They state \(2^{1.8\sqrt{\log N} + 2.3}\) for the number of qubits.

While we ignored Kuperberg’s algorithm due to the large memory costs, they take the stance that “the most time-efficient version is relevant”, and so do not ignore this algorithm. For small N the number of qubits stated in [6] might be possible, which makes Kuperberg’s algorithm indeed relevant for these sizes. However, this also highlights the high cost of computing the oracle, which Childs–Jao–Soukharev placed at \(L_p[1/2,1/\sqrt{2}]\). Bonnetain and Schrottenloher investigate the oracle computation using Couveignes’ LLL idea and improve it using better lattice basis reduction.

The current version of Bonnetain–Schrottenloher [6] also presents concrete estimates for the attack costs for our parameter sets, but unfortunately this version ignores most of the cost of evaluating isogenies. For example: (1) Algorithm 2 in our paper makes heavy use of input-dependent branches, which is impossible in superposition [39, Sect. 4]; (2) [6] skips finding points of order \(\ell _i\), which are needed to generate the kernel of the \(\ell _i\)-isogeny; (3) [6] applies a result for multiplication costs in \(\mathbb F_{2^n}\) to multiplications in \(\mathbb F_p\). We analyzed the (significantly higher) cost of a quantum oracle for isogeny evaluation and conclude that the current estimates of Bonnetain–Schrottenloher do not imply that the 512-bit parameters stated below are broken under NIST level 1.

Jao, LeGrow, Leonardi, and Ruiz-Lopez recently made a preprint [39] of their MathCrypt paper available to us. They address the issue of superpolynomial space in the oracle computation identified by Galbraith and Vercauteren (stated above) and give a new algorithm for finding short representations of elements. Their paper focuses on the asymptotic analysis of the oracle step so that they achieve overall polynomial quantum space, but does not obtain any concrete cost estimates.

7.3 Instantiations

Finally we present estimates for some sizes of \(p\).

Security estimates. As explained in Sect. 7.1, the best classical attack has query complexity \(O(\sqrt[4]{p})\), and the number of queries has been worked out for different quantum attacks. We consider [12] in combination with Regev and Kuperberg (\(L_p\big [1/2,3/\sqrt{2}\big ]\) and \(L_p\big [1/2,1/\sqrt{2}\big ]\), respectively) as well as the pure query complexity of Regev’s and Kuperberg’s algorithms (\(L_N\big [1/2,\sqrt{2}\big ]\), \(O(2^{3\sqrt{\log N}})\), and \(O(2^{1.8\sqrt{\log N}})\), respectively). We summarize the resulting attack complexities, ignoring the memory costs and without restricting the maximum depth of quantum circuits, for some sizes of \(p\) in Table 1. We note again that we expect these complexities to be subject to more careful analysis, taking into account the implicit constants, the (in-)feasibility of long sequential quantum operations, and the large memory requirement. We also include the recent estimates on the query complexity and full attack complexity by Bonnetain and Schrottenloher [6].

We point out a recent analysis [1] which shows that the classical attack on SIDH (which is the same for CSIDH) is likely slower in practice than current parameter estimates assumed, which is due to the huge memory requirements of the searches. Similarly, the cost of the quantum attacks is significantly higher than just the query complexity times the cost of the group action because evaluating the oracle in superposition is significantly more expensive than a regular group action.

Table 1. Estimated attack complexities ignoring limits on depth. The three rightmost columns state costs for the complete attack; the others state classical and quantum query complexities. All numbers are rounded to whole bits and use \(N=\#\mathrm {cl}(\mathcal O)=\sqrt{p}\), \(o(1)=0\), and all hidden \(O\)-constants 1, except for numbers taken from [6].
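The entries of Table 1 that are given by closed formulas can be recomputed with a few lines of Python. The sketch below assumes the usual definition \(L_x[1/2,c]=\exp \big ((c+o(1))\sqrt{\ln x\,\ln \ln x}\big )\), which is not restated in this section, takes the logarithm in the exponents \(2^{c\sqrt{\log N}}\) to base 2, and follows the conventions of the caption (\(N=\sqrt{p}\), \(o(1)=0\), all hidden constants 1). The function and key names are illustrative, the values taken from [6] are not reproduced, and the rounding used for the table may differ in the last bit.

```python
import math


def log2_L(bits, c):
    """log2 of L_x[1/2, c] for x = 2**bits, with the o(1) term set to 0.

    Assumes L_x[1/2, c] = exp(c * sqrt(ln x * ln ln x)).
    """
    ln_x = bits * math.log(2)
    return c * math.sqrt(ln_x * math.log(ln_x)) / math.log(2)


def estimates(log_p):
    """Attack-cost exponents (in bits) for a prime p of about log_p bits."""
    log_n = log_p / 2  # N = #cl(O) is taken to be sqrt(p)
    return {
        "classical": log_p / 4,                                # p^(1/4)
        "regev": log2_L(log_n, math.sqrt(2)),                  # L_N[1/2, sqrt 2]
        "kuperberg_3": 3 * math.sqrt(log_n),                   # 2^(3 sqrt(log N))
        "kuperberg_1.8": 1.8 * math.sqrt(log_n),               # 2^(1.8 sqrt(log N))
        "cjs_regev": log2_L(log_p, 3 / math.sqrt(2)),          # L_p[1/2, 3/sqrt 2]
        "cjs_kuperberg": log2_L(log_p, 1 / math.sqrt(2)),      # L_p[1/2, 1/sqrt 2]
    }


if __name__ == "__main__":
    for log_p in (512, 1024, 1792):
        row = {name: round(value) for name, value in estimates(log_p).items()}
        print(log_p, row)
```

For \(\log p=512\), for instance, the classical estimate evaluates to 128 bits and the \(2^{3\sqrt{\log N}}\) estimate to 48 bits.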

Recall that public keys consist of a single element \(A\in \mathbb F_p\), which may be represented using \(\lceil \log p\rceil \) bits. A private key is represented as a list of n integers in \(\{-m,\dots ,m\}\), where m was chosen such that \(n \log (2m+1)\approx \log \sqrt{p}\), hence it may be stored using roughly \((\log p)/2\) bits. Therefore the rows of Table 1 correspond to public key sizes of 64, 128, and 224 bytes, and private keys are approximately half that size when encoded optimally.

Security levels. We approximate security levels as proposed by NIST for the post-quantum standardization effort [48, Sect. 4.A.5]. That is, the \(k\)-bit security level means that the required effort for the best attacks is at least as large as that needed for a key-retrieval attack on a block cipher with a \(k\)-bit key (e.g. AES-\(k\) for \(k\in \{128,192,256\}\)). In other words, under the assumption that the attacks query an oracle on a circuit at least as costly as AES, we should have a query complexity of at least \(2^{k-1}\) resp. \(\sqrt{2^k}\) to a classical resp. quantum oracle. NIST further restricts the power of the quantum computation to circuits of maximum depth \(2^{40}\) up to \(2^{96}\), meaning that theoretically optimal tradeoffs (such as the formulas in Table 1 above) might not be possible for cryptographic sizes.

The parameters for CSIDH-\(\log p\) were chosen such that the query complexity of Regev’s attack on the hidden-shift problem (see the third column in Table 1) is roughly \(2^{k/2}\), which should match NIST levels 1-3, as the group-action computation has depth at least as large as AES.

Some other algorithms give lower estimates, which makes it necessary to evaluate the exact cost of the oracle queries or compute the lower-order terms in the complexity. The analysis in [6, Table 8] states lower overall costs compared to AES. While this is a significant improvement, we believe that this does not affect our security claim when accounting precisely for the actual cost of oracle queries, as stated above. Our preliminary analysis shows costs of more than \(2^{50}\) qubit operations for evaluating the oracle for \(\log p = 512\), where [6] assumes \(2^{37}\). This means that the NIST levels are reached even with the low query numbers in [6]. More analysis is certainly needed and it is unclear whether that will result in larger or smaller choices of p.

Note that adjusting parameters only involves changing the prime \(p\) (and a few numbers derived from it) and is therefore very simple, should it turn out that our initial estimates are insufficient.

8 Implementation

In this section, we outline our most important tricks to make the system easier to implement or the code faster. As pointed out earlier, the crucial step is to use a field of size \(4\cdot \ell _1\cdots \ell _n-1\), where the \(\ell _i\) are small distinct odd primes; this implies that all \(\ell _i\) are Elkies primes for a supersingular elliptic curve over \(\mathbb F_p\) and that the action of ideals \((\ell _i,\pi \pm 1)\) can be computed efficiently using \(\mathbb F_p\)-rational points. See Sect. 4 for these design decisions. The following section focuses on lower-level implementation details.

Montgomery curves. The condition \(p\,+\,1\,\equiv \,4\pmod 8\) implies that all curves under consideration can be put in the form \(y^2 = x^3 + Ax^2 + x\) (cf. Proposition 8) for some \(A\in \mathbb F_p\) via an \(\mathbb F_p\)-isomorphism. This is commonly referred to as the Montgomery form [46] of an elliptic curve and is popular due to the very efficient arithmetic on its \(x\)-line. This extends well to computations of isogenies on the \(x\)-line, as was first shown by Costello–Longa–Naehrig [15, Sect. 3]. Our implementation uses exactly the same formulas for operations on curves. For isogeny computations on Montgomery curves we use a projectivized variant (to avoid almost all inversions) of the formulas from Costello–Hisil [14] and Renes [53]. This can be done as follows.

For a fixed prime \(\ell \ge 3\), a point P of order \(\ell \), and an integer \(k\in \{1,\dots ,\ell -1\}\), let \((X_k:Z_k)\) be the projectivized x-coordinate of \([k]P\). Then by defining \(c_i\in \mathbb F_p\) such that

$$ \prod _{i=1}^{\ell -1}(Z_iw + X_i) = \sum _{i=0}^{\ell -1}c_iw^i $$

as polynomials in w, we observe that

$$ (\tau (A-3\sigma ) : 1) = \big ( Ac_0c_{\ell -1} - 3(c_0c_{\ell -2}-c_1c_{\ell -1}) : c_{\ell -1}^2 \big ) \text{, } $$

where

$$ \tau = \prod _{i=1}^{\ell -1}\frac{X_i}{Z_i}\text{, } \quad \sigma = \sum _{i=1}^{\ell -1}\left( \frac{X_i}{Z_i}-\frac{Z_i}{X_i}\right) $$

and \(A\) is the Montgomery coefficient of the domain curve. By noticing that \(x([k]P) = x([\ell -k]P)\) for all \(k\in \{1,\ldots ,(\ell -1)/2\}\) we can reduce the computation needed by about half. That is, we can compute \((\tau (A-3\sigma ) : 1)\) iteratively using a number of field operations that is linear in \(\ell \) (see Footnote 13), noting that \(\tau (A-3\sigma )\) is the Montgomery coefficient of the codomain curve of an isogeny with kernel \(\langle P\rangle \) [53, Proposition 1]. If necessary, a single division at the end of the computation suffices to obtain an affine curve constant. We refer to the implementation for more details.
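The identity between the projective expression in the \(c_i\) and the affine value \(\tau (A-3\sigma )\) is purely algebraic in the \(X_i\) and \(Z_i\), so it can be checked numerically without any curve arithmetic. The following Python sketch does so over the toy field \(\mathbb F_{419}\), with arbitrary nonzero values standing in for the coordinates of the points \([k]P\). The prime, the helper names, and the random inputs are illustrative assumptions and are not taken from the implementation accompanying this paper; the sketch also multiplies out the full polynomial instead of tracking only the four coefficients that are needed.

```python
import random


def mul_linear(coeffs, z, x, p):
    """Multiply sum(coeffs[i] * w**i) by (z*w + x) modulo p."""
    out = [0] * (len(coeffs) + 1)
    for i, c in enumerate(coeffs):
        out[i] = (out[i] + c * x) % p
        out[i + 1] = (out[i + 1] + c * z) % p
    return out


def codomain_projective(A, pts, p):
    """Numerator and denominator of the right-hand side in the c_i."""
    c = [1]
    for X, Z in pts:
        c = mul_linear(c, Z, X, p)
    # c[0] = c_0, c[1] = c_1, c[-2] = c_{l-2}, c[-1] = c_{l-1}
    num = (A * c[0] * c[-1] - 3 * (c[0] * c[-2] - c[1] * c[-1])) % p
    den = c[-1] * c[-1] % p
    return num, den


def codomain_affine(A, pts, p):
    """tau * (A - 3 * sigma) computed directly from the definitions."""
    tau, sigma = 1, 0
    for X, Z in pts:
        x = X * pow(Z, -1, p) % p
        tau = tau * x % p
        sigma = (sigma + x - pow(x, -1, p)) % p
    return tau * (A - 3 * sigma) % p


if __name__ == "__main__":
    p, ell = 419, 7
    rng = random.Random(0)
    pts = [(rng.randrange(1, p), rng.randrange(1, p)) for _ in range(ell - 1)]
    num, den = codomain_projective(5, pts, p)
    print(num * pow(den, -1, p) % p == codomain_affine(5, pts, p))
```

The single inversion of `den` at the end corresponds to the one division mentioned above for obtaining an affine curve constant.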

Note that for a given prime \(\ell \), we could reduce the number of field operations by finding an appropriate representative of the isogeny formulas modulo (a factor of) the \(\ell \)-division polynomial \(\psi _\ell \) (as done in [15] for 3- and 4-isogenies). Although this would allow for a more efficient implementation, we do not pursue this now for the sake of simplicity.

Rational points. Recall that the goal is to evaluate the action of (the class of) an ideal \( \mathfrak {l}_1^{e_1}\cdots \mathfrak {l}_n^{e_n} \) on a curve \(E\), where each \(\mathfrak {l}_i=(\ell _i,\pi -1)\) is a prime ideal of small odd norm \(\ell _i\) and the \(e_i\) are integers in a short range \(\{-m,\dots ,m\}\). We assume E is given in the form \(E_A:y^2=x^3+Ax^2+x\).

The obvious way to do this is to consider each factor \(\mathfrak {l}_i^{\pm 1}\) in this product and to find the abscissa of a point \(P\) of order \(\ell _i\) on E, which (depending on the sign) is defined over \(\mathbb F_p\) or \(\mathbb F_{p^2}\!\setminus \!\mathbb F_p\). This exists by our choice of p and \(\ell _i\) (cf. Sect. 4). Finding such an abscissa amounts to sampling a random \(\mathbb F_p\)-rational x-coordinate, checking whether \(x^3+Ax^2+x\) is a square or not (for \(\mathfrak {l}_i^{+1}\) resp. \(\mathfrak {l}_i^{-1}\)) in \(\mathbb F_p\) (and resampling if it was wrong), followed by a multiplication by \((p+1)/\ell _i\) and repeating from the start if the result is \(\infty \). The kernel of the isogeny given by \(\mathfrak {l}_i^{\pm 1}\) is then \(\langle P\rangle \), so the isogeny may be computed using Vélu-type formulas. Repeating this procedure for all \(\mathfrak {l}_i^{\pm 1}\) gives the result.
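As an illustration of this straightforward method, the following Python sketch evaluates the action over the toy prime \(p=419=4\cdot 3\cdot 5\cdot 7-1\), drawing a fresh point for every single isogeny step. It is a sketch under stated assumptions rather than the implementation accompanying this paper: the prime, the function names, and the use of the affine value \(\tau (A-3\sigma )\) for the codomain are illustrative choices, while the \(x\)-only doubling, differential addition, and ladder are the standard Montgomery formulas. The code is variable-time and the parameters are far too small to be secure.

```python
import random

P = 419            # toy prime 4*3*5*7 - 1 (illustrative only)
ELLS = (3, 5, 7)   # the small odd primes dividing (P + 1) / 4


def xdbl(Q, A):
    """x-only doubling on y^2 = x^3 + A x^2 + x."""
    X, Z = Q
    t0, t1 = (X + Z) ** 2 % P, (X - Z) ** 2 % P
    t2 = (t0 - t1) % P                      # equals 4XZ
    a24 = (A + 2) * pow(4, -1, P) % P
    return t0 * t1 % P, t2 * (t1 + a24 * t2) % P


def xadd(Q, R, D):
    """x(Q + R) from x(Q), x(R) and D = x(Q - R)."""
    u = (Q[0] - Q[1]) * (R[0] + R[1]) % P
    v = (Q[0] + Q[1]) * (R[0] - R[1]) % P
    return D[1] * (u + v) ** 2 % P, D[0] * (u - v) ** 2 % P


def ladder(k, Q, A):
    """Montgomery ladder: projective x-coordinate of [k]Q."""
    R0, R1 = (1, 0), Q                      # invariant: R1 - R0 = Q
    for bit in bin(k)[2:]:
        if bit == "1":
            R0, R1 = xadd(R0, R1, Q), xdbl(R1, A)
        else:
            R0, R1 = xdbl(R0, A), xadd(R0, R1, Q)
    return R0


def step(A, ell, sign, rng):
    """Apply the ideal l^{sign} of norm ell to E_A; return the new coefficient."""
    while True:
        x = rng.randrange(1, P)
        rhs = (x * x * x + A * x * x + x) % P
        if rhs == 0:
            continue
        is_square = pow(rhs, (P - 1) // 2, P) == 1
        if is_square != (sign > 0):         # wrong field of definition: resample
            continue
        K = ladder((P + 1) // ell, (x, 1), A)
        if K[1] == 0:                       # cofactor multiplication gave infinity
            continue
        tau, sigma = 1, 0
        for k in range(1, (ell - 1) // 2 + 1):
            X, Z = ladder(k, K, A)
            xk = X * pow(Z, -1, P) % P
            tau = tau * xk * xk % P         # x([k]K) = x([ell - k]K)
            sigma = (sigma + 2 * (xk - pow(xk, -1, P))) % P
        return tau * (A - 3 * sigma) % P


def action(A, exps, rng):
    """Apply l_1^{e_1} ... l_n^{e_n} to E_A, one isogeny at a time."""
    for ell, e in zip(ELLS, exps):
        for _ in range(abs(e)):
            A = step(A, ell, 1 if e > 0 else -1, rng)
    return A
```

Because the kernel of each step is determined by \(\ell _i\) and the sign, the output does not depend on the random points that were drawn. Applying two exponent vectors in either order therefore yields the same coefficient, as required for a key exchange.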

However, fixing a sign before sampling a random point effectively means wasting about half of all random points, including an ultimately useless square test. Moreover, deciding on a prime \(\ell _i\) before sampling a point and doing the cofactor multiplication wastes another proportion of the points, including both an ultimately useless square test and a scalar multiplication. Both of these issues can be remedied by not fixing an \(\ell _i\) before sampling a point, but instead taking any x-coordinate, determining the smallest field of definition (i.e. \(\mathbb F_p\) or \(\mathbb F_{p^2}\)) of the corresponding point, and then performing whatever isogeny computations are possible using that point (based on its field of definition and order). The steps are detailed in Algorithm 2.

[Algorithm 2: evaluating the class-group action]

Due to the commutativity of \(\mathrm {cl}(\mathcal O)\), and since we only decrease (the absolute value of) each \(e_i\) once we successfully applied the action of \(\mathfrak {l}_i^{\pm 1}\) to the current curve, this algorithm indeed computes the action of \([\mathfrak {l}_1^{e_1}\mathfrak {l}_2^{e_2}\cdots \mathfrak {l}_n^{e_n}]\).

Remark 14

Since the probability that a random point has order divisible by \(\ell _i\) (and hence leads to an isogeny step in Algorithm 2) grows with \(\ell _i\), the isogeny steps for big \(\ell _i\) are typically completed before those for small \(\ell _i\). Hence it may make sense to sample the exponents \(e_i\) for ideals \(\mathfrak {l}_i\) from different ranges depending on the size of \(\ell _i\), or to not include any very small \(\ell _i\) in the factorization of \(p+1\) at all to reduce the expected number of repetitions of the loop above. Note moreover that doing so may also improve the performance of straightforward constant-time adaptions of our algorithms, since it yields stronger upper bounds on the maximum number of required loop iterations (at the expense of slightly higher cost per isogeny computation). Varying the choice of the \(\ell _i\) can also lead to performance improvements if the resulting prime p has lower Hamming weight. Finding such a p is a significant computational effort but needs to be done only once; all users can use the same finite field.
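To make the first observation of this remark concrete: since each \(\ell _i\) divides \(p+1\) exactly once, a uniformly random point \(P\) in a group of order \(p+1\) satisfies \([(p+1)/\ell _i]P\ne \infty \) with probability \(1-1/\ell _i\). The Python sketch below confirms this by exhaustive counting in the cyclic group \(\mathbb Z/420\), which serves as a stand-in (an assumption made purely for illustration) for the group of points of a curve over \(\mathbb F_{419}\); the function name is hypothetical.

```python
from fractions import Fraction


def step_probability(ell, group_order):
    """Fraction of elements P of Z/group_order with (group_order/ell) * P != 0."""
    cofactor = group_order // ell
    hits = sum(1 for P in range(group_order) if cofactor * P % group_order != 0)
    return Fraction(hits, group_order)


if __name__ == "__main__":
    for ell in (3, 5, 7):
        print(ell, step_probability(ell, 420))
```

In this model the probability of a successful isogeny step is \(2/3\) for \(\ell =3\) but already \(6/7\) for \(\ell =7\).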

Remark 15

Algorithm 2 is obviously strongly variable-time when implemented naïvely. Indeed, the number of points computed in the isogeny formulas is linear in the degree, hence the iteration counts of certain loops in our implementation are very directly related to the private key. We note that it would not be very hard to create a constant-time implementation based on this algorithm by always performing the maximal required number of iterations in each loop and only storing the results that were actually needed (using constant-time conditional instructions), although this incurs quite a bit of useless computation, leading to a doubling of the number of curve operations on average. We leave the design of optimized constant-time algorithms for future work.

Public-key validation. Recall that the public-key validation method outlined in Sect. 5 essentially consists of computing \([(p+1)/\ell _i]P\) for each i, where P is a random point on E. Performing this computation in the straightforward way is simple and effective. On the other hand, a divide-and-conquer approach, such as the following recursive algorithm, yields better speeds at the expense of slightly higher memory usage. Note that Algorithm 3 only operates on public data, hence need not be constant-time in a side-channel resistant implementation.

[Algorithm 3: recursive (divide-and-conquer) computation of the points \([(p+1)/\ell _i]P\)]

This routine can be used for verifying that an elliptic curve \(E/\mathbb F_p\) is supersingular as follows: Pick a random point \(P\in E(\mathbb F_p)\) and run Algorithm 3 on input [4]P and \((\ell _1,\dots ,\ell _n)\) to obtain the points \(Q_i=[(p+1)/\ell _i]P\). Then continue like in Algorithm 1 to verify that E is supersingular using these precomputed points.
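The divide-and-conquer idea can be sketched in Python independently of the underlying group. The code below is a minimal illustration in the spirit of Algorithm 3 and not a transcription of it: the function name, the `mul` callback for scalar multiplication, and the use of the additive group \(\mathbb Z/420\) as a stand-in for the points of a curve over \(\mathbb F_{419}\) are assumptions made for the example.

```python
from math import prod


def batch_cofactor(P, ells, mul):
    """Return [Q_1, ..., Q_n] with Q_i = [prod(ells) / ells[i]] P.

    `mul(k, P)` must compute the scalar multiple [k]P in the group at hand.
    """
    if len(ells) == 1:
        return [P]
    half = len(ells) // 2
    left, right = ells[:half], ells[half:]
    # Each half only needs the product of the primes in the other half.
    return (batch_cofactor(mul(prod(right), P), left, mul)
            + batch_cofactor(mul(prod(left), P), right, mul))


if __name__ == "__main__":
    order = 420                                  # plays the role of p + 1

    def mul(k, P):
        return k * P % order

    P = 11                                       # plays the role of a random point
    print(batch_cofactor(mul(4, P), [3, 5, 7], mul))
```

Each level of the recursion halves the list of primes, so the scalars shrink as the recursion deepens, at the price of keeping one intermediate point per level in memory.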

In practice, it is not necessary to run Algorithm 3 as a black-box function until it returns all the points \(Q_1,\dots ,Q_n\): The order checking in Algorithm 1 can be performed as soon as a new point \(Q_i\) becomes available, i.e., in the base case of Algorithm 3. This reduces the memory usage (since the points \(Q_i\) can be discarded immediately after use) and increases the speed (since the algorithm terminates as soon as enough information was obtained) of public-key validation using Algorithms 1 and 3. We note that the improved performance of this algorithm compared to Algorithm 1 alone essentially comes from a time-space trade-off, hence the memory usage is higher (cf. Sect. 8.1). On severely memory-constrained devices one may instead opt for the naïve algorithm, which requires less space but is slower.

8.1 Performance Results

On top of a minimal implementation in the sage computer algebra system [67] for demonstrative purposes, we created a somewhat optimized proof-of-concept implementation of the CSIDH group action for a particular 512-bit prime p. While this implementation features 512-bit field arithmetic written in assembly (for Intel Skylake processors), it also contains generic C code supporting other field sizes and can therefore easily be ported to other computer architectures or parameter sets if desired (see Footnote 14).

The prime p is chosen as \(p=4\cdot \ell _1\cdots \ell _{74}-1\) where \(\ell _1\) through \(\ell _{73}\) are the smallest 73 odd primes and \(\ell _{74}=587\) is the smallest prime distinct from the other \(\ell _i\) that renders p prime. This parameter choice implies that public keys have a size of 64 bytes. Private keys are stored in 37 bytes for simplicity, but an optimal encoding would reduce this to only 32 bytes. Table 2 summarizes performance numbers for our proof-of-concept implementation. Note that private-key generation is not listed as it only consists of sampling n random integers in a small range \(\{-m,\dots ,m\}\), which has negligible cost.
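The construction of \(p\) and the stated key sizes can be reproduced with the following Python sketch. It is an illustration only: the function names are hypothetical, primality is tested with a Miller-Rabin test on fixed bases (probabilistic, not a proof), the 37-byte figure is modelled as 4 bits per exponent, and \(m\) is taken to be the integer for which \(n\log (2m+1)\) is closest to \(\log \sqrt{p}\), which is our reading of the approximate condition stated in Sect. 7.3.

```python
from math import log2, prod


def is_probable_prime(n, bases=(2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37)):
    """Miller-Rabin test with fixed bases (a probabilistic check)."""
    if n < 2:
        return False
    for q in bases:
        if n % q == 0:
            return n == q
    d, s = n - 1, 0
    while d % 2 == 0:
        d, s = d // 2, s + 1
    for a in bases:
        x = pow(a, d, n)
        if x in (1, n - 1):
            continue
        for _ in range(s - 1):
            x = x * x % n
            if x == n - 1:
                break
        else:
            return False
    return True


def build_prime(count=73, bound=1000):
    """Smallest `count` odd primes plus the first larger prime making p prime."""
    odd_primes = [q for q in range(3, bound)
                  if all(q % d for d in range(2, int(q ** 0.5) + 1))]
    ells = odd_primes[:count]
    base = 4 * prod(ells)
    for q in odd_primes[count:]:
        if is_probable_prime(base * q - 1):
            return ells + [q], base * q - 1
    raise ValueError("no suitable prime below the search bound")


def key_sizes(p, n):
    """Key sizes in bytes under the assumptions stated in the lead-in."""
    m = min(range(1, 16), key=lambda t: abs(n * log2(2 * t + 1) - log2(p) / 2))
    return {
        "m": m,
        "public": (p.bit_length() + 7) // 8,
        "private_stored": (4 * n + 7) // 8,      # 4 bits per exponent
        "private_optimal": (((2 * m + 1) ** n - 1).bit_length() + 7) // 8,
    }


if __name__ == "__main__":
    ells, p = build_prime()
    print(ells[-1], p.bit_length(), p % 8, key_sizes(p, len(ells)))
```

Note that \(p\equiv 3\pmod 8\) holds by construction, since \(p+1\) is four times an odd number.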

Table 2. Performance numbers of our proof-of-concept implementation, averaged over 10 000 runs on an Intel Skylake i5 processor clocked at \(3.5\,\mathrm {GHz}\).

We emphasize that both our implementations are intended as a proof of concept and unfit for production use; in particular, they are explicitly not side-channel resistant and may contain any number of bugs. We leave the design of hardened and more optimized implementations for future work.