Abstract
In this paper, we initiate the study of side-channel leakage in hash-and-sign lattice-based signatures, with particular emphasis on the two efficient implementations of the original GPV lattice-trapdoor paradigm for signatures, namely NIST second-round candidate Falcon and its simpler predecessor DLP. Both of these schemes implement the GPV signature scheme over NTRU lattices, achieving great speed-ups over the general lattice case. Our results are mainly threefold.
First, we identify a specific source of side-channel leakage in most implementations of those schemes, namely, the one-dimensional Gaussian sampling steps within lattice Gaussian sampling. It turns out that the implementations of these steps often leak the Gram–Schmidt norms of the secret lattice basis.
Second, we elucidate the link between this leakage and the secret key, by showing that the entire secret key can be efficiently reconstructed solely from those Gram–Schmidt norms. The result makes heavy use of the algebraic structure of the corresponding schemes, which work over a power-of-two cyclotomic field.
Third, we concretely demonstrate the side-channel attack against DLP (but not Falcon due to the different structures of the two schemes). The challenge is that timing information only provides an approximation of the Gram–Schmidt norms, so our algebraic recovery technique needs to be combined with pruned tree search in order to apply it to approximate values. Experimentally, we show that around \(2^{35}\) DLP traces are enough to reconstruct the entire key with good probability.
1 Introduction
Lattice-Based Signatures. Lattice-based cryptography has proved to be a versatile way of achieving a very wide range of cryptographic primitives with strong security guarantees that are also believed to hold in the postquantum setting. For a while, it was largely confined to the realm of theoretical cryptography, mostly concerned with asymptotic efficiency, but it has made major strides towards practicality in recent years. Significant progress has been made in terms of practical constructions, refined concrete security estimates and fast implementations. As a result, lattice-based schemes are seen as strong contenders in the NIST postquantum standardization process.
In terms of practical signature schemes in particular, lattice-based constructions broadly fit within either of two large frameworks: Fiat–Shamir type constructions on the one hand, and hash-and-sign constructions on the other.
Fiat–Shamir lattice-based signatures rely on a variant of the Fiat–Shamir paradigm [16] developed by Lyubashevsky, called "Fiat–Shamir with aborts" [31], which has proved particularly fruitful. It has given rise to numerous practically efficient schemes [2, 8, 23], including the two second-round NIST candidates Dilithium [10, 33] and qTESLA [5].
The hash-and-sign family has a longer history, dating back to Goldreich–Goldwasser–Halevi [22] signatures as well as NTRUSign [24]. Those early proposals were shown to be insecure [12, 19, 21, 40], however, due to a statistical dependence between the distribution of signatures and the signing key. That issue was only overcome with the development of lattice trapdoors by Gentry, Peikert and Vaikuntanathan [20]. In the GPV scheme, signatures follow a distribution that is provably independent of the secret key (a discrete Gaussian supported on the public lattice), but which is hard to sample from without knowing a secret, short basis of the lattice. The scheme is quite attractive from a theoretical standpoint (for example, it is easier to establish QROM security for it than for Fiat–Shamir type schemes), but suffers from large keys and a potentially costly procedure for discrete Gaussian sampling over a lattice. Several follow-up works have then striven to improve its concrete efficiency [13, 34, 37, 42, 49], culminating in two main efficient and compact implementations: the scheme of Ducas, Lyubashevsky and Prest (DLP) [11], and its successor, NIST second round candidate Falcon [47], both instantiated over NTRU lattices [24] in power-of-two cyclotomic fields. One can also mention NIST first round candidates pqNTRUSign [52] and DRS [44] as members of this family, the latter of which actually fell prey to a clever statistical attack [51] in the spirit of those against GGH and NTRUSign.
Side-Channel Analysis of Lattice-Based Signatures. With the NIST postquantum standardization process underway, it is crucial to investigate the security of lattice-based schemes not only in a pure algorithmic sense, but also with respect to implementation attacks, such as side-channels. For lattice-based signatures constructed using the Fiat–Shamir paradigm, this problem has received a significant amount of attention in the literature, with numerous works [4, 6, 7, 14, 43, 50] pointing out vulnerabilities with respect to timing attacks, cache attacks, power analysis and other types of side-channels. Those attacks have proved particularly devastating against schemes using discrete Gaussian sampling, such as the celebrated BLISS signature scheme [8]. In response, several countermeasures have also been proposed [27, 28, 39], some of them provably secure [3, 4], but the side-channel arms race does not appear to have subsided quite yet.
In contrast, the case of hash-and-sign lattice-based signatures, including DLP and Falcon, remains largely unexplored, despite concerns being raised regarding their vulnerability to side-channel attacks. For example, the NIST status report on first round candidates, announcing the selection of Falcon to the second round, notes that “more work is needed to ensure that the signing algorithm is secure against side-channel attacks”. The relative lack of cryptanalytic works regarding these schemes can probably be attributed to the fact that the relationship between secret keys and the information that leaks through side-channels is a lot more subtle than in the Fiat–Shamir setting.
Indeed, in Fiat–Shamir style schemes, the signing algorithm uses the secret key very directly (it is combined linearly with other elements to form the signature), and as a result, side-channel leakage on sensitive variables, like the random nonce, easily leads to key exposure. By comparison, the way the signing key is used in GPV type schemes is much less straightforward. The key is used to construct the trapdoor information used for the lattice discrete Gaussian sampler; in the case of the samplers [13, 20, 30] used in GPV, DLP and Falcon, that information is essentially the Gram–Schmidt orthogonalization (GSO) of a matrix associated with the secret key. Moreover, due to the way that GSO matrix is used in the sampling algorithm, only a small amount of information about it is liable to leak through side-channels, and how that small amount relates to the signing key is far from clear. To the best of our knowledge, neither the problem of identifying a clear side-channel leakage, nor that of relating such leakage to the signing key, has been tackled in the literature so far.
Our Contributions. In this work, we initiate the study of how side-channel leakage impacts the security of hash-and-sign lattice-based signatures, focusing our attention on the two most notable practical schemes in that family, namely DLP and Falcon. Our contributions towards that goal are mainly threefold.
First, we identify a specific leakage of the implementations of both DLP and Falcon (at least in its original incarnation) with respect to timing side-channels. As noted above, the lattice discrete Gaussian sampler used in signature generation relies on the Gram–Schmidt orthogonalization of a certain matrix associated with the secret key. Furthermore, the problem of sampling a discrete Gaussian distribution supported over the lattice is reduced to sampling one-dimensional discrete Gaussians with standard deviations computed from the norms of the rows of that GSO matrix. In particular, the one-dimensional sampler has to support varying standard deviations, which is not easy to do in constant time. Unsurprisingly, the target implementations both leak that standard deviation through timing side-channels; specifically, they rely on rejection sampling, and the acceptance rate of the corresponding loop is directly related to the standard deviation. As a result, timing attacks will reveal the Gram–Schmidt norms of the matrix associated to the secret key (or rather, an approximation thereof, to a precision increasing with the number of available samples).
Second, we use algebraic number theoretic techniques to elucidate the link between those Gram–Schmidt norms and the secret key. In fact, we show that the secret key can be entirely reconstructed from the knowledge of those Gram–Schmidt norms (at least if they are known exactly), in a way which crucially relies on the algebraic structure of the corresponding lattices.
Since both DLP and Falcon work in an NTRU lattice, the signing key can be expressed as a pair (f, g) of small elements in a cyclotomic ring \(\mathcal R= \mathbb {Z}[\zeta ]\) (of power-of-two conductor, in the case of those schemes). The secret, short basis of the NTRU lattice is constructed by blocks from the multiplication matrices of f and g (and related elements F, G) in a certain basis of \(\mathcal R\) as a \(\mathbb {Z}\)-algebra (DLP uses the usual power basis, whereas Falcon uses the power basis in bit-reversed order; this apparently small difference interestingly plays a crucial role in this work). It is then easily seen that the Gram matrix of the first half of the lattice basis is essentially the multiplication matrix associated with the element \(u = f\bar{f}+g\bar{g}\), where the bar denotes the complex conjugation \(\bar{\zeta } = \zeta ^{-1}\). From that observation, we deduce that knowing the Gram–Schmidt norms of the lattice basis is essentially equivalent to knowing the leading principal minors of the multiplication matrix of u, which is a real, totally positive element of \(\mathcal R\).
We then give general efficient algorithms, both for the power basis (DLP case) and for the bit-reversed order power basis (Falcon case), which recover an arbitrary totally positive element u (up to a possible automorphism of the ambient field) given the leading principal minors of its multiplication matrix. The case of the power basis is relatively easy: we can actually recover the coefficients iteratively one by one, with each coefficient given as a solution of a quadratic equation over \(\mathbb {Q}\) depending only on the minors and the previous coefficients. The bit-reversed order power basis is more contrived, however; recovery is then carried out recursively, by reduction to the successive subfields of the power-of-two cyclotomic tower.
Finally, to complete the recovery, we need to deduce f and g from u. We show that this can be done using the public key \(h = g/f\bmod q\): we can use it to reconstruct both the relative norm \(f\bar{f}\) of f, and the ideal \((f)\subset \mathcal R\). That data can then be plugged into the Gentry–Szydlo algorithm [21] to obtain f in polynomial time, and hence g. Those steps, though simple, are also of independent interest, since they can be applied to the side-channel attack against BLISS described in [14], in order to get rid of the expensive factorization of an algebraic norm, and hence make the attack efficient for all keys (instead of a small percentage of weak keys as originally stated).
Our third contribution is to actually collect timing traces for the DLP scheme and mount the concrete key recovery. This is not an immediate consequence of the previous points, since our totally positive element recovery algorithm a priori requires the exact knowledge of Gram–Schmidt norms, whereas side-channel leakage only provides approximations (and since some of the squared Gram–Schmidt norms are rational numbers of very large height, recovering them exactly would require an unrealistic number of traces). As a result, the recovery algorithm has to be combined with some pruned tree search in order to account for approximate inputs. In practice, for the larger parameters of DLP signatures (with a claimed security level of 192 bits), we manage to recover the key with good probability using \(2^{33}\) to \(2^{35}\) DLP timing traces.
Carrying out such an experiment in the Falcon setting, however, is left as a challenging open problem for further work. This is because adapting the bit-reversed order totally positive recovery algorithm to deal with approximate inputs appears to be much more difficult (instead of sieving integers whose square lies in some specified interval, one would need to find the cyclotomic integers whose square lies in some target set, which does not even look simple to describe).
The source code of the attack is available at https://github.com/yuyang-crypto/Key_Recovery_from_GSnorms.
Related Work. As noted above, the side-channel security of Fiat–Shamir lattice-based signatures has been studied extensively, including in [4, 6, 7, 14, 43, 50]. However, the only implementation attacks we are aware of against hash-and-sign schemes are fault analysis papers [15, 35]: side-channel attacks have not been described so far to the best of our knowledge.
Aside from the original implementations of DLP and Falcon, which are the focus of this paper, several others have appeared in the literature. However, they usually do not aim for side-channel security [36, 41] or only make the base discrete Gaussian sampler (with fixed standard deviation) constant time [29], but do not eliminate the leakage of the varying standard deviations. As a result, those implementations are also vulnerable to the attacks of this paper.
This is not the case, however, for Pornin’s very recent, updated implementation of Falcon, which uses a novel technique proposed by Prest, Ricosset and Rossi [48], combined with other recent results on constant time rejection sampling for discrete Gaussian distribution [4, 53] in order to eliminate the timing leakage of the lattice discrete Gaussian sampler. This technique applies to discrete Gaussian sampling over \(\mathbb {Z}\) with varying standard deviations, when those deviations only take values in a small range. It is then possible to eliminate the dependence on the standard deviation in the rejection sampling by scaling the target distribution to match the acceptance rate of the maximal possible standard deviation. The small range ensures that the overhead of this countermeasure is relatively modest. Thanks to this countermeasure, we stress that the most recent official implementation of Falcon is already protected against the attacks of this paper. Nevertheless, we believe our results underscore the importance of applying such countermeasures.
Organization of the Paper. Following some preliminary material in Sect. 2, Sect. 3 is devoted to recalling some general facts about signature generation for hash-and-sign lattice-based schemes. Section 4 then gives a roadmap of our attack strategy, and provides some details about the final steps (how to deduce the secret key from the totally positive element \(u=f\bar{f}+g\bar{g}\)). Section 5 describes our main technical contribution: the algorithms that recover u from the Gram–Schmidt norms, both in the DLP and in the Falcon setting. Section 6 delves into the details of the side-channel leakage, showing how the implementations of the Gaussian samplers of DLP and Falcon do indeed reveal the Gram–Schmidt norms through timing side-channels. Finally, Sect. 7 presents our concrete experiments against DLP, including the tree search strategy to accommodate approximate Gram–Schmidt norms and experimental results in terms of timing and number of traces.
Notation. We use bold lowercase letters for vectors and bold uppercase for matrices. The zero vector is \(\mathbf {0}\). We denote by \(\mathbb {N}\) the non-negative integer set and by \(\log \) the natural logarithm. Vectors are in row form, and we write \(\mathbf {B}= (\mathbf {b}_0,\dotsc , \mathbf {b}_{n-1})\) to denote that \(\mathbf {b}_i\) is the i-th row of \(\mathbf {B}\). For a matrix \(\mathbf {B}\in \mathbb {R}^{n\times m}\), we denote by \(\mathbf {B}_{i,j}\) the entry in the i-th row and j-th column of \(\mathbf {B}\), where \(i\in \{{0,\dotsc , n-1}\}\) and \(j\in \{{0,\dotsc ,m-1}\}\). For \(I\subseteq [0,n), J \subseteq [0, m)\), we denote by \(\mathbf {B}_{I\times J}\) the submatrix \((\mathbf {B}_{i,j})_{i\in I,j\in J}\). In particular, we write \(\mathbf {B}_{I} = \mathbf {B}_{I\times I}\). Let \(\mathbf {B}^t\) denote the transpose of \(\mathbf {B}\).
Given \(\mathbf {u}= (u_0,\dotsc ,u_{n-1})\) and \(\mathbf {v}= (v_0,\dotsc ,v_{n-1})\), their inner product is \(\langle {\mathbf {u}, \mathbf {v}}\rangle = \sum _{i=0}^{n-1}u_iv_i\). The \(\ell _2\)-norm of \(\mathbf {v}\) is \(\Vert \mathbf {v}\Vert = \sqrt{\langle {\mathbf {v}, \mathbf {v}}\rangle }\) and the \(\ell _\infty \)-norm is \(\Vert \mathbf {v}\Vert _\infty = \max _i|v_i|\). The determinant of a square matrix \(\mathbf {B}\) is denoted by \(\det (\mathbf {B})\), so that \(\det \left( \mathbf {B}_{[0,i]}\right) \) is the i-th leading principal minor of \(\mathbf {B}\).
Let D be a distribution. We write \(z\hookleftarrow D\) when the random variable z is sampled from D, and denote by D(x) the probability that \(z=x\). The expectation of a random variable z is \(\mathbb {E}[z]\). We write \(\mathcal {N}(\mu , \sigma ^2)\) for the normal distribution of mean \(\mu \) and variance \(\sigma ^2\). We let U(S) be the uniform distribution over a finite set S. For a real-valued function f and any countable set S in the domain of f, we write \(f(S)= \sum _{x\in S} f(x)\).
2 Preliminaries
A lattice \(\mathcal {L}\) is a discrete additive subgroup of \(\mathbb {R}^m\). If it is generated by \(\mathbf {B}\in \mathbb {R}^{n\times m}\), we also write \(\mathcal {L}:= \mathcal {L}(\mathbf {B}) = \{{\mathbf {x}\mathbf {B}\mid \mathbf {x}\in \mathbb {Z}^n}\}\). If \(\mathbf {B}\) has full row rank, then we call \(\mathbf {B}\) a basis and n the rank of \(\mathcal {L}\).
2.1 Gram–Schmidt Orthogonalization
Let \(\mathbf {B}= (\mathbf {b}_0,\dotsc , \mathbf {b}_{n-1}) \in \mathbb {Q}^{n\times m}\) be of rank n. The Gram-Schmidt orthogonalization of \(\mathbf {B}\) is \(\mathbf {B}= \mathbf {L}\mathbf {B}^*\), where \(\mathbf {L}\in \mathbb {Q}^{n\times n}\) is lower-triangular with 1 on its diagonal and \(\mathbf {B}^* = (\mathbf {b}_0^*,\dotsc , \mathbf {b}_{n-1}^*)\) is a matrix with pairwise orthogonal rows. We call \(\Vert \mathbf {b}_i^*\Vert \) the i-th Gram-Schmidt norm of \(\mathbf {B}\), and let \(\Vert \mathbf {B}\Vert _{GS} = \max _i \Vert \mathbf {b}_i^*\Vert \).
The Gram matrix of \(\mathbf {B}\) is \(\mathbf {G}= \mathbf {B}\mathbf {B}^t\), and satisfies \(\mathbf {G}= \mathbf {L}\mathbf {D}\mathbf {L}^t\) where \(\mathbf {D}= \mathrm {diag}\left( \Vert \mathbf {b}_i^*\Vert ^2\right) \). This is also known as the Cholesky decomposition of \(\mathbf {G}\), and such a decomposition exists for any symmetric positive definite matrix. The next proposition follows from the triangular structure of \(\mathbf {L}\).
Proposition 1
Let \(\mathbf {B}\in \mathbb {Q}^{n\times m}\) be of rank n and \(\mathbf {G}\) its Gram matrix. Then for every integer \(0\le k\le n-1\), we have \(\det \left( \mathbf {G}_{[0,k]}\right) = \prod _{i=0}^{k} \Vert \mathbf {b}_i^*\Vert ^2\).
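As a quick sanity check of Proposition 1, the following Python sketch (our own illustration, on an arbitrary small basis) computes the Gram–Schmidt norms through the Cholesky factor of the Gram matrix and compares leading principal minors to products of squared norms:

```python
import numpy as np

# Numerical check of Proposition 1: the k-th leading principal minor of the
# Gram matrix G = B B^t equals the product of the first k+1 squared
# Gram-Schmidt norms.
B = np.array([[2.0, 1.0, 0.0],
              [1.0, 3.0, 1.0],
              [0.0, 1.0, 2.0]])
G = B @ B.T
C = np.linalg.cholesky(G)        # G = C C^t with C = L * diag(||b_i*||)
gs_norms_sq = np.diag(C) ** 2    # squared Gram-Schmidt norms ||b_i*||^2
for k in range(len(B)):
    minor = np.linalg.det(G[:k + 1, :k + 1])
    assert np.isclose(minor, np.prod(gs_norms_sq[:k + 1]))
```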
Let \(\mathbf {M}= \left( \begin{array}{cc} \mathbf {A}&{} \mathbf {B}\\ \mathbf {C}&{} \mathbf {D}\end{array} \right) \), where \(\mathbf {A}\in \mathbb {R}^{n\times n}\) and \(\mathbf {D}\in \mathbb {R}^{m\times m}\) are invertible matrices. Then \(\mathbf {M}/ \mathbf {A}= \mathbf {D}- \mathbf {C}\mathbf {A}^{-1}\mathbf {B}\in \mathbb {R}^{m\times m}\) is called the Schur complement of \(\mathbf {A}\). It holds that

$$\begin{aligned} \det (\mathbf {M}) = \det (\mathbf {A})\det (\mathbf {M}/ \mathbf {A}). \end{aligned}$$
(1)
2.2 Parametric Statistics
Let \(D_p\) be some distribution determined by parameter p. Let \(\mathbf {X} = (X_1,\dotsc ,X_n)\) be a vector of observed samples of \(X\hookleftarrow D_p\). The log-likelihood function with respect to \(\mathbf {X}\) is

$$\begin{aligned} \ell _\mathbf {X}(p) = \sum _{i=1}^{n} \log D_p(X_i). \end{aligned}$$
Provided the log-likelihood function is bounded, a maximum likelihood estimator for samples \(\mathbf {X}\) is a real \(\text {MLE}(\mathbf {X})\) maximizing \(\ell _\mathbf {X}(p)\). The Fisher information is

$$\begin{aligned} \mathcal {I}(p) = -\mathbb {E}\left[ \frac{\partial ^2}{\partial p^2} \log D_p(X)\right] . \end{aligned}$$
It is known (e.g. [26, Theorem 6.4.2]) that, seen as a random variable, \(\sqrt{n}(\text {MLE}(\mathbf {X}) - p)\) converges in distribution to \(\mathcal {N}(0, \mathcal {I}(p)^{-1})\). When the target distribution is geometric, maximum likelihood estimators and the Fisher information are well known. The second statement of the next lemma directly comes from a Gaussian tail bound.
Lemma 1
Let \(\text {Geo}_p\) denote a geometric distribution with parameter p, and \(\mathbf {X} = (X_1,\cdots ,X_n)\) be samples from \(\text {Geo}_p\). Then we have \(\text {MLE}(\mathbf {X}) = \frac{n}{\sum _{i=1}^n X_i}\) and \(\sqrt{n}(\text {MLE}(\mathbf {X}) - p)\) converges in distribution to \(\mathcal N(0, p^2(1-p))\). In particular, when n is large, then for any \(\alpha \ge 1\), we have \(|\text {MLE}(\mathbf {X}) - p| \le \alpha \cdot p\sqrt{\frac{1-p}{n}}\) except with probability at most \(2\exp (-\alpha ^2/2)\).
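The following self-contained simulation (our own sketch, not part of the paper) checks Lemma 1 empirically for a geometric distribution:

```python
import math
import random

# Empirical illustration of Lemma 1: estimate the parameter of a geometric
# distribution via the maximum likelihood estimator n / sum(X_i).
random.seed(1)
p, n = 0.37, 100_000
# Inverse-CDF sampling of Geo_p on {1, 2, ...}
X = [1 + math.floor(math.log(1.0 - random.random()) / math.log(1 - p))
     for _ in range(n)]
mle = n / sum(X)
bound = 3 * p * math.sqrt((1 - p) / n)   # alpha = 3: fails w.p. <= 2*exp(-9/2)
assert abs(mle - p) <= bound
print(f"true p = {p}, MLE = {mle:.5f}, bound = {bound:.5f}")
```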
2.3 Discrete Gaussian Distributions
Let \(\rho _{\sigma ,\mathbf {c}}(\mathbf {x}) = \exp \left( -\frac{\Vert \mathbf {x}-\mathbf {c}\Vert ^2}{2\sigma ^2}\right) \) be the n-dimensional Gaussian function with center \(\mathbf {c}\in \mathbb {R}^n\) and standard deviation \(\sigma \). When \(\mathbf {c}=\mathbf {0}\), we just write \(\rho _{\sigma }(\mathbf {x})\). The discrete Gaussian over a lattice \(\mathcal {L}\) with center \(\mathbf {c}\) and standard deviation parameter \(\sigma \) is defined by the probability function

$$\begin{aligned} D_{\mathcal {L},\sigma ,\mathbf {c}}(\mathbf {x}) = \frac{\rho _{\sigma ,\mathbf {c}}(\mathbf {x})}{\rho _{\sigma ,\mathbf {c}}(\mathcal {L})} \quad \text {for all } \mathbf {x}\in \mathcal {L}. \end{aligned}$$
In this work, the case \(\mathcal {L}= \mathbb {Z}\) is of particular interest. It is well known that \(\int _{-\infty }^{\infty } \rho _{\sigma , c}(x)\text {d}x = \sigma \sqrt{2\pi }\). Notice that \(D_{\mathbb {Z},\sigma ,c}\) is equivalent to \(i + D_{\mathbb {Z},\sigma ,c-i}\) for an arbitrary \(i \in \mathbb {Z}\), hence it suffices to consider the case where \(c \in [0,1)\). The half discrete integer Gaussian, denoted by \(D^+_{\mathbb {Z},\sigma ,c}\), is defined by

$$\begin{aligned} D^+_{\mathbb {Z},\sigma ,c}(x) = \frac{\rho _{\sigma ,c}(x)}{\rho _{\sigma ,c}(\mathbb {N})} \quad \text {for all } x\in \mathbb {N}. \end{aligned}$$
We again omit the center when it is \(c=0\). For any \(\epsilon >0\), the (scaled) smoothing parameter \(\eta _{\epsilon }'(\mathbb {Z})\) is the smallest \(s>0\) such that \(\rho _{1/s\sqrt{2\pi }}(\mathbb {Z})\le 1+\epsilon \). In practice, \(\epsilon \) is very small, say \(2^{-50}\). The smoothing parameter allows one to quantify precisely how the discrete Gaussian differs from the standard Gaussian function.
Lemma 2
([38], implicit in Lemma 4.4). If \(\sigma \ge \eta _{\epsilon }'(\mathbb {Z})\), then \(\rho _\sigma (c + \mathbb {Z}) \in [\frac{1-\epsilon }{1+\epsilon }, 1]\rho _\sigma (\mathbb {Z})\) for any \(c \in [0,1)\).
Corollary 1
If \(\sigma \ge \eta _{\epsilon }'(\mathbb {Z})\), then \(\rho _\sigma (\mathbb {Z})\in [1, \frac{1+\epsilon }{1-\epsilon }]\sqrt{2\pi }\sigma \).
Proof
Since \(\int _{0}^{1} \rho _{\sigma }(\mathbb {Z}+c)\text {d}c = \int _{-\infty }^{\infty } \rho _{\sigma }(x)\text {d}x = \sqrt{2\pi }\sigma \), the proof is completed by Lemma 2. \(\square \)
2.4 Power-of-Two Cyclotomic Fields
For the rest of this article, we let \(n = 2^\ell \) for some integer \(\ell \ge 1\). We let \(\zeta _n\) be a primitive 2n-th root of unity. Then \(\mathcal K_n = \mathbb {Q}(\zeta _n)\) is the n-th power-of-two cyclotomic field, and comes together with its ring of algebraic integers \(\mathcal R_n = \mathbb {Z}[\zeta _n]\). It is also equipped with n field automorphisms forming the Galois group, which is commutative in this case. It can be seen that \(\mathcal K_{n/2}=\mathbb {Q}(\zeta _{n/2})\) is the subfield of \(\mathcal K_n\) fixed by the automorphism \(\sigma (\zeta _n)=-\zeta _n\) of \(\mathcal K_{n}\), as \(\zeta _n^2 = \zeta _{n/2}\). This leads to a tower of field extensions and their corresponding rings of integers

$$\begin{aligned} \mathbb {Q}= \mathcal K_1 \subseteq \mathcal K_2 \subseteq \cdots \subseteq \mathcal K_{n/2} \subseteq \mathcal K_{n}, \qquad \mathbb {Z}= \mathcal R_1 \subseteq \mathcal R_2 \subseteq \cdots \subseteq \mathcal R_{n/2} \subseteq \mathcal R_{n}. \end{aligned}$$
Given an extension \(\mathcal K_n|\mathcal K_{n/2}\), the relative trace \(\mathrm {Tr}: \mathcal K_n\rightarrow \mathcal K_{n/2}\) is the \(\mathcal K_{n/2}\)-linear map given by \(\mathrm {Tr}(f) = f+\sigma (f)\). Similarly, the relative norm is the multiplicative map \(\mathrm {N}(f)=f\cdot \sigma (f) \in \mathcal K_{n/2}\). Both maps send integers in \(\mathcal K_{n}\) to integers in \(\mathcal K_{n/2}\). For all \(f \in \mathcal K_n\), it holds that \(f = (\mathrm {Tr}(f) + \zeta _n\mathrm {Tr}(\zeta _n^{-1}f))/2\).
We are also interested in the field automorphism \(\zeta _n\mapsto \zeta _n^{-1}=\bar{\zeta _n}\), which corresponds to the complex conjugation. We call adjoint the image \(\bar{f}\) of f under this automorphism. The fixed subfield \(\mathcal K^+_n := \mathbb {Q}(\zeta _n+\zeta _n^{-1})\) is known as the totally real subfield and contains the self-adjoint elements, that is, those such that \(f =\bar{f}\). Another way to describe self-adjoint elements is to say that all their complex embeddings are in fact real. Elements whose embeddings are all positive are called totally positive elements, and we denote their set by \(\mathcal K^{++}_n\). A standard example of such an element is given by \(f\bar{f}\) for any \(f\in \mathcal K_n\). It is well-known that the Galois automorphisms act as permutations of these embeddings, so that a totally positive element stays positive under the action of the Galois group.
Representation of Cyclotomic Numbers. We also have \(\mathcal K_n \simeq \mathbb {Q}[x]/(x^n+1)\) and \(\mathcal R_n \simeq \mathbb {Z}[x]/(x^n+1)\), so that elements in cyclotomic fields can be seen as polynomials. In this work, each \(f = \sum _{i=0}^{n-1} f_i\zeta _n^i\in \mathcal K_{n}\) is identified with its coefficient vector \((f_0,\cdots ,f_{n-1})\). Then the inner product of f and g is \(\langle {f, g}\rangle = \sum _{i=0}^{n-1}f_ig_i\), and we write \(\Vert f\Vert \), resp. \(\Vert f\Vert _\infty \), the \(\ell _2\)-norm, resp. \(\ell _\infty \)-norm, of f. In this representation, it can be checked that \(\bar{f} = (f_0, -f_{n-1}, \dotsc , -f_1)\) and that \(\langle { f,gh}\rangle = \langle {f\bar{g}, h}\rangle \) for all \(f,g,h \in \mathcal K_n\). In particular, the constant coefficient of \(f\bar{g}\) is \(\langle {f, g}\rangle =\langle {f\bar{g}, 1}\rangle \). A self-adjoint element f has coefficients \((f_0, f_1, \dotsc , f_{n/2-1}, 0, -f_{n/2-1}, \dotsc , -f_1)\).
Elements in \(\mathcal K_n\) can also be represented by their matrix of multiplication in the basis \(1, \zeta _n, \dotsc , \zeta _n^{n-1}\). In other words, the map \(\mathcal A_{n}: \mathcal K_n \rightarrow \mathbb {Q}^{n\times n}\) defined by

$$\begin{aligned} \mathcal A_{n}(f) = \left( \begin{array}{cccc} f_0 &{} f_1 &{} \cdots &{} f_{n-1} \\ -f_{n-1} &{} f_0 &{} \cdots &{} f_{n-2} \\ \vdots &{} \vdots &{} \ddots &{} \vdots \\ -f_1 &{} -f_2 &{} \cdots &{} f_0 \end{array} \right) , \end{aligned}$$

whose i-th row is the coefficient vector of \(\zeta _n^i f\),
is a ring isomorphism. We have \(fg=g\cdot \mathcal A_n(f)\). We can also see that \(\mathcal A_n(\bar{f}) = \mathcal A_n(f)^t\) which justifies the term “adjoint”. We deduce that the matrix of a self-adjoint element is symmetric. It can be observed that a totally positive element \(A \in \mathcal K_n\) corresponds to the symmetric positive definite matrix \(\mathcal A_n(A)\).
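For concreteness, here is a small sketch (function names are ours) of the operator \(\mathcal A_n\) and the adjoint, checking the identities \(fg = g\cdot \mathcal A_n(f)\) and \(\mathcal A_n(\bar{f}) = \mathcal A_n(f)^t\) in \(\mathcal R_4\):

```python
import numpy as np

def A(f):
    """Multiplication matrix A_n(f): row i holds the coefficients of zeta^i * f."""
    n = len(f)
    M = np.zeros((n, n), dtype=int)
    row = np.array(f)
    for i in range(n):
        M[i] = row
        row = np.concatenate(([-row[-1]], row[:-1]))   # multiply by zeta (x^n = -1)
    return M

def adjoint(f):
    """f-bar: image of f under zeta -> zeta^{-1}."""
    return [f[0]] + [-c for c in f[:0:-1]]

f, g = [1, 2, 0, -1], [3, -1, 2, 1]
assert (np.array(g) @ A(f) == np.array(f) @ A(g)).all()  # both give coeffs of f*g
assert (A(adjoint(f)) == A(f).T).all()                   # "adjoint" = transpose
```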
For efficiency reasons, the scheme Falcon uses another representation corresponding to the tower structure. If \(f=(f_0,\dotsc , f_{n-1})\in \mathcal K_n\), we let \(f_e=\mathrm {Tr}(f)/2 = (f_0, f_2,\dotsc , f_{n-2})\) and \(f_o=\mathrm {Tr}(\zeta _n^{-1}f)/2 = (f_1, f_3, \dotsc , f_{n-1})\). Let \(\mathbf {P}_{n} \in \mathbb {Z}^{n\times n}\) be the permutation matrix corresponding to the bit-reversal order. We define \(\mathcal F_{n}(f) = \mathbf {P}_{n}\mathcal A_{n}(f)\mathbf {P}_{n}^t\). In particular, it is also symmetric positive definite when f is a totally positive element. As shown in [13], it holds that

$$\begin{aligned} \mathcal F_{n}(f) = \left( \begin{array}{cc} \mathcal F_{n/2}(f_e) &{} \mathcal F_{n/2}(f_o) \\ \mathcal F_{n/2}(\zeta _{n/2}f_o) &{} \mathcal F_{n/2}(f_e) \end{array} \right) . \end{aligned}$$
(2)
2.5 NTRU Lattices
Given \(f,g \in \mathcal R_n\) such that f is invertible modulo some \(q \in \mathbb {Z}\), we let \(h=f^{-1}g \bmod q\). The NTRU lattice determined by h is \(\mathcal L_\text {NTRU}= \{ (u,v) \in \mathcal R_n^2\,:\, u+vh = 0 \bmod q\}\). Two bases of this lattice are of particular interest for cryptography:

$$\begin{aligned} \mathbf {B}_{h} = \left( \begin{array}{cc} -h &{} 1 \\ q &{} 0 \end{array} \right) \quad \text {and}\quad \mathbf {B}_{f,g} = \left( \begin{array}{cc} g &{} -f \\ G &{} -F \end{array} \right) , \end{aligned}$$
where \(F,G \in \mathcal R_n\) are such that \(fG-gF = q\). Indeed, the former basis usually acts as the public key, while the latter is the secret key, also called the trapdoor basis, when f, g, F, G are short vectors. In practice, these matrices are represented using either the operator \(\mathcal A_n\) [11] or \(\mathcal F_n\) [47]:

$$\begin{aligned} \mathbf {B}^{\mathcal A}_{f,g} = \left( \begin{array}{cc} \mathcal A_n(g) &{} -\mathcal A_n(f) \\ \mathcal A_n(G) &{} -\mathcal A_n(F) \end{array} \right) \quad \text {and}\quad \mathbf {B}^{\mathcal F}_{f,g} = \left( \begin{array}{cc} \mathcal F_n(g) &{} -\mathcal F_n(f) \\ \mathcal F_n(G) &{} -\mathcal F_n(F) \end{array} \right) . \end{aligned}$$
3 Hash-and-Sign over NTRU Lattices
Gentry, Peikert and Vaikuntanathan introduced in [20] a generic and provably secure hash-and-sign framework based on trapdoor sampling. This paradigm has then been instantiated over NTRU lattices giving rise to practically efficient cryptosystems: DLP [11] and Falcon [47] signature schemes.
In the NTRU-based hash-and-sign scheme, the secret key is a pair of short polynomials \((f, g) \in \mathcal R_n^2\) and the public key is \(h = f^{-1}g \bmod q\). The trapdoor basis \(\mathbf {B}_{f,g}\) (of \(\mathcal L_\text {NTRU}\)) derives from (f, g) by computing \(F,G\in \mathcal R_n\) such that \(fG-gF =q\). In both the DLP signature and Falcon, the trapdoor basis has a bounded Gram-Schmidt norm: \(\Vert \mathbf {B}_{f,g}\Vert _{GS}\le 1.17\sqrt{q}\) for compact signatures.
The signing and verification procedures are described at a high level as follows: to sign a message m, the signer computes \(c = H(m) \in \mathbb {Z}_q^n\) for a hash function H, then uses the trapdoor basis to sample, with a lattice Gaussian sampler, a short pair \((s_1, s_2)\in \mathcal R_n^2\) such that \(s_1 + s_2h = c \bmod q\); the signature is \(s_2\). To verify, one recomputes \(s_1 = H(m) - s_2h \bmod q\) and accepts if and only if \((s_1, s_2)\) is short enough.
Lattice Gaussian samplers [20, 42] are nowadays a standard tool to generate signatures provably statistically independent of the secret basis. However, such samplers are also a notorious target for side-channel attacks. This work makes no exception and attacks non-constant-time implementations of the lattice Gaussian samplers at the heart of both DLP and Falcon, which are based on the KGPV sampler [30] or its ring variant [13]. Precisely, while previous attacks targeted Gaussians with public standard deviations, our attack learns the secret-dependent Gaussian standard deviations involved in the KGPV sampler.
3.1 The KGPV Sampler and Its Variant
The KGPV sampler is a randomized variant of Babai's nearest plane algorithm [1]: instead of rounding each center to the closest integer, the KGPV sampler determines the integral coefficients according to some integer Gaussians. It is shown in [20] that under a certain smoothness condition, the algorithm outputs a sample from a distribution negligibly close to the target Gaussian. Its formal description is given in Algorithm 3.1.
Note that in the KGPV sampler (or its ring variant), the standard deviations of integer Gaussians are inversely proportional to the Gram-Schmidt norms of the input basis. In the DLP scheme, \(\mathbf {B}\) is in fact the trapdoor basis \(\mathbf {B}^{\mathcal A}_{f,g} \in \mathbb {Z}^{2n\times 2n}\).
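To fix ideas, the following is a minimal, non-constant-time Python sketch of the KGPV sampler (our own simplified rendering of Algorithm 3.1; sample_z is a naive enumeration-based integer Gaussian used only for demonstration). It makes explicit where the secret-dependent standard deviations \(\sigma _i = \sigma /\Vert \mathbf {b}_i^*\Vert \) appear:

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_z(sigma, c, tau=10):
    """Toy integer Gaussian D_{Z,sigma,c} by enumeration (demonstration only;
    the real schemes use the rejection samplers discussed in Sect. 6)."""
    zs = np.arange(int(np.floor(c - tau * sigma)), int(np.ceil(c + tau * sigma)) + 1)
    w = np.exp(-((zs - c) ** 2) / (2 * sigma ** 2))
    return rng.choice(zs, p=w / w.sum())

def kgpv_sample(B, sigma, c):
    """Sketch of the KGPV randomized nearest-plane algorithm: returns a point
    of L(B) roughly distributed as a discrete Gaussian centered at c."""
    B = np.array(B, dtype=float)
    Bstar = B.copy()
    for i in range(len(B)):                            # Gram-Schmidt orthogonalization
        for j in range(i):
            Bstar[i] -= (B[i] @ Bstar[j]) / (Bstar[j] @ Bstar[j]) * Bstar[j]
    t, v = np.array(c, dtype=float), np.zeros(len(c))
    for i in reversed(range(len(B))):
        d_i = (t @ Bstar[i]) / (Bstar[i] @ Bstar[i])   # current center
        sigma_i = sigma / np.linalg.norm(Bstar[i])     # leaks ||b_i*|| via timing
        z = sample_z(sigma_i, d_i)                     # 1-dim Gaussian sampling step
        t -= z * B[i]
        v += z * B[i]
    return v

print(kgpv_sample([[3, 1], [1, 4]], sigma=6.0, c=[0.5, -1.3]))
```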
The Ducas–Prest Sampler. Falcon uses a variant of the KGPV algorithm which stems naturally from Ducas–Prest’s fast Fourier nearest plane algorithm [13]. It exploits the tower structure of power-of-two cyclotomic rings. Just like the KGPV sampler, the Ducas-Prest sampler fundamentally relies on integer Gaussian sampling to output Gaussian vectors. We omit its algorithmic description, as it is not needed in this work. Overall, what matters is to understand that the standard deviations of involved integer Gaussians are also in the form \(\sigma _i = \sigma /\Vert \mathbf {b}_{i}^*\Vert \), but that \(\mathbf {B}= \mathbf {B}^{\mathcal F}_{f,g}\) in this context.
4 Side-Channel Attack Against Trapdoor Samplers: A Roadmap
Our algorithm proceeds as follows:

1. Side-channel leakage: extract the \(\Vert \mathbf {b}_i^*\Vert \)'s associated to \(\mathbf {B}_{f,g}^\mathcal A\), resp. \(\mathbf {B}_{f,g}^\mathcal F\), via the timing leakage of the integer Gaussian sampler in the DLP scheme, resp. Falcon.
2. Totally positive recovery: from the given \(\Vert \mathbf {b}_i^*\Vert \)'s, recover a Galois conjugate u of \(f\overline{f} + g\overline{g} \in \mathcal K^{++}_n\).
3. Final recovery: compute f from u and the public key \(g/f \bmod q\).
Steps 1 and 2 of the attack are the focus of Sects. 6 and 5 respectively. Below we describe how the third step is performed. First we recover the element \(f\overline{g}\), using the fact that it has small coefficients. More precisely, the \(j^{\text {th}}\) coefficient is \(\langle {f, \zeta _n^jg}\rangle \) where f and \(\zeta _n^jg\) are independent and identically distributed according to \(D_{\mathbb {Z}^n,r}\), with \(r=1.17\sqrt{\frac{q}{2n}}\). By [32, Lemma 4.3], we know that all these coefficients are of size much smaller than q/2 with high probability. Now, we can compute \(v = u\overline{h}(1+h\overline{h})^{-1} \bmod q\), where \(h = f^{-1}g\bmod q\) is the public verification key. We readily see that \(v = f\overline{g} \bmod q\) if and only if \(u = f\overline{f} + g\overline{g}\). If u is a different conjugate of \(f\overline{f}+g\overline{g}\), then most likely the coefficients of v will look random in \((-q/2, q/2]\). This can mostly be attributed to the NTRU assumption, that is, h being indistinguishable from a random element modulo q. When this happens, we just consider another conjugate of u, until we obtain a distinguishably small element, which must then be \(f\overline{g}\) (not just in reduction modulo q, but in fact over the integers).
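A compact sketch of this distinguishing loop follows; mul_q, inv_q and conj are hypothetical stand-ins for arithmetic in \(\mathbb {Z}_q[x]/(x^n+1)\) (polynomial multiplication, inversion, and the adjoint \(f\mapsto \bar{f}\)), and bound would be set somewhat below q/2:

```python
# Hypothetical helpers assumed: mul_q(a, b, q) and inv_q(a, q) for arithmetic
# in Z_q[x]/(x^n + 1), and conj(a) computing the adjoint a-bar. Sketch only.
def find_f_gbar(u_conjugates, h, q, bound):
    hbar = conj(h)
    hh = mul_q(h, hbar, q)
    hh[0] = (hh[0] + 1) % q                 # 1 + h * hbar
    w = mul_q(hbar, inv_q(hh, q), q)        # hbar * (1 + h*hbar)^{-1} mod q
    for u in u_conjugates:
        v = mul_q(u, w, q)                  # = f*gbar mod q iff u = f*fbar + g*gbar
        v = [c - q if c > q // 2 else c for c in v]   # lift to (-q/2, q/2]
        if max(abs(c) for c in v) < bound:  # distinguishably small => over Z
            return u, v
    return None
```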
Once this is done, we can then deduce the reduction modulo q of \(f\bar{f} \equiv f\bar{g} / \bar{h} \pmod q\), which again coincides with \(f\bar{f}\) over the integers with high probability (if we again lift elements of \(\mathbb {Z}_q\) to \((-q/2,q/2]\), except for the constant coefficient, which should be lifted positively). This boils down to the fact that with high probability \(f\overline{f}\) has its constant coefficient in (0, q) and the others are in \((-q/2, q/2)\). Indeed, the constant coefficient of \(f\overline{f}\) is \(\Vert f\Vert ^2\), and the others are \(\langle {f, \zeta _n^jf}\rangle \)’s with \(j\ge 1\). By some Gaussian tail bound, we can show \(\Vert f\Vert ^2\le q\) with high probability. As for \(\langle {f, \zeta _n^jf}\rangle \)’s, despite the dependency between f and \(\zeta _n^jf\), we can still expect \(|\langle {f, \zeta _n^jf}\rangle | < q/2\) for all \(j\ge 1\) with high probability. We leave details in the full version [17] for interested readers.
Next, we compute the ideal (f) from the knowledge of \(f\overline{f}\) and \(f\overline{g}\). Indeed, as f and g are co-prime from the key generation algorithm, we directly have \((f) = (f\overline{f}) + (f\overline{g})\). At this point, we have obtained both the ideal (f) and the relative norm \(f\bar{f}\) of f on the totally real subfield. That data is exactly what we need to apply the Gentry–Szydlo algorithm [21], and finally recover f itself in polynomial time. Note furthermore that the practicality of the Gentry–Szydlo algorithm for the dimensions we consider (\(n=512\)) has been validated in previous work [14].
Comparison with Existing Method. As part of their side-channel analysis of the BLISS signature scheme, Espitau et al. [14] used the Howgrave-Graham–Szydlo algorithm to recover an NTRU secret f from \(f\overline{f}\). They successfully solved a small proportion \(({\approx } 7\%)\) of NTRU instances with \(n=512\) in practice. The Howgrave-Graham–Szydlo algorithm first recovers the ideal (f) and then calls the Gentry–Szydlo algorithm as we do above. The bottleneck of this method is in its reliance on integer factorization for ideal recovery: the integers involved can become quite large for an arbitrary f, so that recovery cannot be done in classical polynomial time in general. This is why only a small proportion of instances can be solved in practice.
However, the technique we describe above bypasses this expensive factorization step by exploiting the arithmetic properties of the NTRU secret key. In particular, it is immediate to obtain a two-element description of (f), so that the Gentry-Szydlo algorithm can be run as soon as \(f\bar{f}\) and \(f\bar{g}\) are computed. This significantly improves the applicability and efficiency of Espitau et al.'s side-channel attack against BLISS [14]. The question of avoiding the reliance on the Gentry–Szydlo algorithm altogether, using the knowledge of \(f\overline{g}\) and \(f\overline{f}\), remains open, however.
5 Recovering Totally Positive Elements
Totally positive elements in \(\mathcal K_n\) correspond to symmetric positive definite matrices with an inner structure coming from the algebra of the field. In particular, it is enough to know only one row of the matrix to recover the corresponding field element. Hence it can be expected that being given the diagonal part of the LDL decomposition also suffices to perform a recovery. In this section, we show that this is indeed the case, provided we know the diagonal exactly.
Recall on the one hand that the \(\mathcal A_n\) representation is a skew-circulant matrix, in which each diagonal consists of identical entries. On the other hand, the \(\mathcal F_n\) representation does not follow the circulant structure, but it is compatible with the tower-of-rings structure, i.e. its sub-matrices are the \(\mathcal F_{n/2}\) representations of elements in the subfield \(\mathcal K_{n/2}\). Each operator leads to a distinct approach, which is described in Sects. 5.1 and 5.2 respectively.
While the algorithms of this section can be used independently, they are naturally related to hash-and-sign over \(\text {NTRU}\) lattices. Let \(\mathbf {B}\) be a matrix representation of some secret key \((g,-f)\), and \(\mathbf {G}= \mathbf {B}\mathbf {B}^t\). Then the diagonal part of \(\mathbf {G}\)’s LDL decomposition contains the \(\Vert \mathbf {b}_i^*\Vert \)’s, and \(\mathbf {G}\) is a matrix representation of \(f\overline{f} + g\overline{g} \in \mathcal K^{++}_n\). As illustrated in Sect. 4, the knowledge of \(u=f\overline{f} + g\overline{g}\) allows to recover the secret key in polynomial time. Therefore results in this section pave the way for a better use of secret Gram-Schmidt norms.
In practice however, we will obtain only approximations of the \(\Vert \mathbf {b}_i^*\Vert \)’s. The algorithms of this section must then be tweaked to handle the approximation error. The case of \(\mathcal A_n\) is dealt with in Sect. 7.1. While we do not solve the “approximate” case of \(\mathcal F_n\), we believe our “exact” algorithms to be of independent interest to the community.
5.1 Case of the Power Basis
The goal of this section is to obtain the next theorem. It involves the heuristic argument that certain rational quadratic equations always admit exactly one integer root, which will correspond to a coefficient of the recovered totally positive element. Experimentally, when it happens that there are two integer roots and the wrong one is chosen, the algorithm "fails" with overwhelming probability at the next step: the next discriminant does not lead to integer roots.
Theorem 1
Let \(u\in \mathcal R_n \cap \mathcal K^{++}_n\). Write \(\mathcal A_n(u) = \mathbf {L}\cdot \mathrm {diag}(\lambda _i)_i \cdot \mathbf {L}^t\). There is a (heuristic) algorithm \(\mathsf {Recovery}_\mathcal A\) that, given \(\lambda _i\)’s, computes u or \(\sigma (u)\). It runs in \(\widetilde{O}(n^3\log \Vert u\Vert _\infty )\).
The complexity analysis is given in the full version [17]. In Sect. 7.2, a version tweaked to handle approximations of the \(\lambda _i\)’s is given, and may achieve quasi-quadratic complexity. It is in any case very efficient in practice, and it is used in our attack against DLP signature.
We now describe Algorithm 5.1. By Proposition 1, \(\prod _{j=0}^{i} \lambda _j = \det \left( \mathcal A_n(u)_{[0,i]}\right) \) is an integer, thus we take \(m_i = \prod _{j=0}^{i} \lambda _j\) instead of \(\lambda _i\) as input for integrality. It holds that \(u_0 = \det \left( \mathcal A_n(u)_{[0,0]}\right) = \lambda _0\). By the self-adjointness of u, we only need to consider the first n/2 coefficients. For any \(0 \le i < n/2-1\), we have

$$\begin{aligned} \mathcal A_n(u)_{[0,i+1]} = \left( \begin{array}{cc} \mathcal A_n(u)_{[0,i]} &{} \mathbf {v}_i^t \\ \mathbf {v}_i &{} u_0 \end{array} \right) , \quad \text {where } \mathbf {v}_i = (u_{i+1}, u_i, \dotsc , u_1). \end{aligned}$$
By the definition of the Schur complement and Proposition 1, we see that

$$\begin{aligned} \frac{\det \left( \mathcal A_n(u)_{[0,i+1]}\right) }{\det \left( \mathcal A_n(u)_{[0,i]}\right) } = u_0 - \mathbf {v}_i \left( \mathcal A_n(u)_{[0,i]}\right) ^{-1}\mathbf {v}_i^t, \end{aligned}$$
where the left-hand side is actually \(\lambda _{i+1}\), and the right-hand side gives a quadratic equation in \(u_{i+1}\) with rational coefficients that can be computed from the knowledge of \((u_0,\dotsc , u_i)\). When \(i=0\), the equation is equivalent to \(\lambda _0\lambda _1 = u_0^2-u_1^2\): there are two candidates of \(u_1\) up to sign. Once \(u_1\) is chosen, for \(i\ge 1\), the quadratic equation should have with very high probability a unique integer solution, i.e. the corresponding \(u_{i+1}\). This leads to Algorithm 5.1. Note that the sign of \(u_1\) determines whether the algorithm recovers u or \(\sigma (u)\). This comes from the fact that \(\mathcal A_n(u) = \mathrm {diag}((-1)^i)_{i\le n} \cdot \mathcal A_n(\sigma (u))\cdot \mathrm {diag}((-1)^i)_{i\le n}\).
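For concreteness, here is a self-contained sketch of \(\mathsf {Recovery}_\mathcal A\) in exact rational arithmetic (all function names are ours): it takes the integer minors \(m_0,\dotsc , m_{n/2-1}\) and returns \((u_0,\dotsc , u_{n/2-1})\), asserting the heuristic uniqueness of the integer root for \(i\ge 1\):

```python
from fractions import Fraction
import math

def solve_rational(G, b):
    """Solve G x = b exactly over Q by Gauss-Jordan elimination."""
    k = len(G)
    A = [list(map(Fraction, G[i])) + [Fraction(b[i])] for i in range(k)]
    for col in range(k):
        piv = next(r for r in range(col, k) if A[r][col] != 0)
        A[col], A[piv] = A[piv], A[col]
        inv = 1 / A[col][col]
        A[col] = [x * inv for x in A[col]]
        for r in range(k):
            if r != col and A[r][col] != 0:
                f = A[r][col]
                A[r] = [x - f * y for x, y in zip(A[r], A[col])]
    return [A[i][k] for i in range(k)]

def integer_roots(a, b, c):
    """Integer solutions of a*x^2 + 2*b*x + c = 0 (a, b, c rational)."""
    disc = b * b - a * c
    if disc < 0:
        return []
    num, den = disc.numerator, disc.denominator
    rn, rd = math.isqrt(num), math.isqrt(den)
    if rn * rn != num or rd * rd != den:
        return []
    r = Fraction(rn, rd)
    return sorted({int(x) for x in ((-b + r) / a, (-b - r) / a)
                   if x.denominator == 1})

def recovery_A(m):
    """Recover (u_0, ..., u_{n/2-1}) from minors m_i (sketch of Alg. 5.1)."""
    u = [m[0]]                                   # u_0 = lambda_0 = m_0
    for k in range(1, len(m)):
        lam = Fraction(m[k], m[k - 1])           # lambda_k
        G = [[u[abs(r - c)] for c in range(k)] for r in range(k)]
        e = [1] + [0] * (k - 1)                  # v = x*e + w, x = u_k unknown
        w = [0] + [u[k - j] for j in range(1, k)]
        Ge, Gw = solve_rational(G, e), solve_rational(G, w)
        a = Ge[0]                                # e G^{-1} e^t
        b = sum(wi * gi for wi, gi in zip(w, Ge))
        cw = sum(wi * gi for wi, gi in zip(w, Gw))
        roots = integer_roots(a, b, cw - u[0] + lam)
        if k == 1:
            x = max(roots)                       # sign choice: u vs sigma(u)
        else:
            assert len(roots) == 1               # heuristic: unique integer root
            x = roots[0]
        u.append(x)
    return u

# Example: u = 3 + x - x^3 in R_4 has minors (3, 8, 21) and is recovered from them.
assert recovery_A([3, 8, 21]) == [3, 1]
```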
5.2 Case of the Bit-Reversed Order Basis
In this section, we are given the diagonal part of the LDL decomposition \(\mathcal F_n(u)=\mathbf {L}'\mathrm {diag}(\lambda _i)\mathbf {L}'^t\), which rewrites as \((\mathbf {L}'^{-1}\mathbf {P}_n)\mathcal A_n(u)(\mathbf {L}'^{-1}\mathbf {P}_n)^t = \mathrm {diag}(\lambda _i)\). Since the triangular structure is shuffled by the bit-reversal representation, recovering u from the \(\lambda _i\)'s is not as straightforward as in the previous section. Nevertheless, the compatibility of the \(\mathcal F_n\) operator with the tower of extensions can be exploited. It gives a recursive approach that stems from natural identities between the trace and norm maps relative to the extension \(\mathcal K_n\,|\,\mathcal K_{n/2}\), crucially uses the self-adjointness and total positivity of u, and fundamentally relies on computing square roots in \(\mathcal R_n\).
Theorem 2
Let \(u\in \mathcal R_n \cap \mathcal K^{++}_n\). Write \(\mathcal F_n(u) =\mathbf {L}'\cdot \mathrm {diag}(\lambda _i)_i \cdot \mathbf {L}'^t\). There is a (heuristic) algorithm that, given the \(\lambda _i\)’s, computes a conjugate of u. It runs in \(\widetilde{O}(n^3\log \Vert u\Vert _\infty )\).
The recursiveness of the algorithm and its reliance on square roots will force it to always work “up to Galois conjugation”. In particular, at some point we will assume heuristically that only one of the conjugates of a value computed within the algorithm is in a given coset of the subgroup of relative norms in the quadratic subfield. Since that constraint only holds with negligible probability for random values, the heuristic is essentially always verified in practice. Recall that we showed in Sect. 4 how to recover the needed conjugate in practice by a distinguishing argument.
The rest of the section describes the algorithm, while the complexity analysis is presented in the full version [17]. First, we observe from

$$\begin{aligned} \overline{\mathrm {Tr}(u)} = \overline{u} + \overline{\sigma (u)} = \overline{u} + \sigma (\overline{u}) = u + \sigma (u) = \mathrm {Tr}(u) \end{aligned}$$
that \(\mathrm {Tr}(u)\) is self-adjoint. The positivity of u implies that \(\mathrm {Tr}(u) \in \mathcal K^{++}_{n/2}\). From Eq. (2), we know that the n/2 first minors of \(\mathcal F_n(u)\) are the minors of \(\mathcal F_{n/2}(\mathrm {Tr}(u)/2)\). The identity above also shows that \(\mathrm {Tr}(\zeta _n^{-1}u)\) is a square root of the element \(\zeta _{n/2}^{-1}\mathrm {Tr}(\zeta _n^{-1}u)\overline{\mathrm {Tr}(\zeta _n^{-1}u)}\) in \(\mathcal K_{n/2}\). Thus, if we knew \(\mathrm {Tr}(\zeta _n^{-1}u)\overline{\mathrm {Tr}(\zeta _n^{-1}u)}\), we could reduce the problem of computing \(u\in \mathcal K_n\) to computations in \(\mathcal K_{n/2}\), more precisely, recovering a totally positive element from “its minors” and a square root computation.
It turns out that \(\mathrm {Tr}(\zeta _n^{-1}u)\overline{\mathrm {Tr}(\zeta _n^{-1}u)}\) can be computed by going down the tower as well. One can see that

$$\begin{aligned} 4\,\mathrm {N}(u) = \mathrm {Tr}(u)^2 - \mathrm {Tr}(\zeta _n^{-1}u)\overline{\mathrm {Tr}(\zeta _n^{-1}u)}, \end{aligned}$$
(3)
where \(\mathrm {N}(u)\) is totally positive since u (and therefore \(\sigma (u)\)) is. This identity can be thought of as a "number field version" of the \(\mathcal F_n\) representation. Indeed, recall that \(u_e = (1/2)\mathrm {Tr}(u)\) and \(u_o=(1/2)\mathrm {Tr}(\zeta _n^{-1}u)\). Then by the block determinant formula and the fact that \(\mathcal F_n\) is a ring isomorphism, we see that

$$\begin{aligned} \det \mathcal F_n(u) = \det \left( \begin{array}{cc} \mathcal F_{n/2}(u_e) &{} \mathcal F_{n/2}(u_o) \\ \mathcal F_{n/2}(\overline{u_o}) &{} \mathcal F_{n/2}(u_e) \end{array} \right) = \det \mathcal F_{n/2}\left( u_e^2 - u_o\overline{u_o}\right) = \det \mathcal F_{n/2}(\mathrm {N}(u)). \end{aligned}$$
This strongly suggests a link between the successive minors of \(\mathcal F_n(u)\) and the element \(\mathrm {N}(u)\). The next lemma makes this relation precise, and essentially amounts to taking Schur complements in the above formula.
Lemma 3
Let \(u\in \mathcal K^{++}_n\) and \(\widehat{u} = \frac{2\mathrm {N}(u)}{\mathrm {Tr}(u)}\in \mathcal K^{++}_{n/2}\). Then for \(0< k< n/2\), we have

$$\begin{aligned} \det \left( \mathcal F_n(u)_{[0,\frac{n}{2}+k)}\right) = \det \left( \mathcal F_{n/2}(u_e)\right) \cdot \det \left( \mathcal F_{n/2}(\widehat{u})_{[0,k)}\right) . \end{aligned}$$
Proof
Let \(\mathbf {G}= \mathcal F_{n}(u)\) and \(\mathbf {B}= \mathcal F_{n/2}(u_o)_{[0,\frac{n}{2}) \times [0,k)}\) in order to write

$$\begin{aligned} \mathbf {G}_{[0,\frac{n}{2}+k)} = \left( \begin{array}{cc} \mathcal F_{n/2}(u_e) &{} \mathbf {B}\\ \mathbf {B}^t &{} \mathcal F_{n/2}(u_e)_{[0,k)} \end{array} \right) \end{aligned}$$
with \(\mathbf {B}^t=\mathcal F_{n/2}(\overline{u_o})_{[0,k) \times [0,\frac{n}{2})}\). Let \(\mathbf {S}= \mathbf {G}_{[0, \frac{n}{2} + k)}/\mathcal F_{n/2}(u_e) = \mathcal F_{n/2}(u_e)_{[0,k)} - \mathbf {B}^t\mathcal F_{n/2}(u_e)^{-1}\mathbf {B}\). Since \(\mathcal F_n\) is a ring isomorphism, a routine computation shows that \(\mathbf {S}= \mathcal F_{n/2}(\widehat{u})_{[0,k)}\). The proof follows from Eq. (1). \(\square \)
Lemma 3 tells us that knowing \(\mathrm {Tr}(u)\) and the principal minors of \(\mathcal F_n(u)\) is enough to recover those of \(\mathcal F_{n/2}(\widehat{u})\), so that the computations in \(\mathcal K_{n}\) are again reduced to computing a totally positive element in \(\mathcal K_{n/2}\) from its minors. Then from Eq. (3), we can obtain \(\mathrm {Tr}(\zeta _n^{-1}u)\overline{\mathrm {Tr}(\zeta _n^{-1}u)}\). The last step is then to compute a square root of \(\zeta _{n/2}^{-1}\mathrm {Tr}(\zeta _n^{-1}u)\overline{\mathrm {Tr}(\zeta _n^{-1}u)}\) in \(\mathcal K_{n/2}\) to recover \(\pm \mathrm {Tr}(\zeta _n^{-1}u)\). In particular, this step will lead to u or its conjugate \(\sigma (u)\). As observed above, this translates ultimately in recovering only a conjugate of u.
Lastly, when \(n=2\), that is, when we work in \(\mathbb {Q}(i)\), a totally positive element is in fact in \(\mathbb {Q}_+\). This leads to Algorithm 5.2, which is presented in the general context of \(\mathcal K_n\) to fit the description above, for the sake of simplicity. The algorithm \(\mathsf {TowerRoot}\) of Step 9 computes square roots in \(\mathcal K_n\) and a quasi-quadratic version for integers is presented and analyzed in the next section.
The whole procedure constructs a binary tree, as illustrated in Fig. 1. The algorithm can be made to rely essentially only on algebraic integers, which also helps in analyzing its complexity. This gives the claim of Theorem 2 (see the full version [17] for details). At Step 6, the algorithm finds the (heuristically unique) conjugate \(\widehat{u}\) of \(\widetilde{u}\) such that \(\widehat{u}\cdot u^+\) is a relative norm (since we must have \(\widehat{u}\cdot u^+ = \mathrm {N}(u)\) by the above). In practice, in the integral version of this algorithm, we carry out this test not by checking for being a norm, but with an integrality test.
5.2.1 Computing Square Roots in Cyclotomic Towers
In this section, we will focus on computing square roots of algebraic integers: given \(s = t^2 \in \mathcal R_n\), compute t. The reason for focusing on integers is that both our Algorithm 5.2 and practical applications deal only with algebraic integers. A previous approach was suggested in [25], relying on finding primes with small splitting pattern in \(\mathcal R_n\), computing square roots in several finite fields and brute-forcing to find the correct candidate. A hassle in analyzing this approach is to first find a prime large enough compared to an arbitrary input, and that splits into, say, two factors in \(\mathcal R_n\). Omitting the cost of finding such a prime, this algorithm can be shown to run in \(\widetilde{O}(n^2(\log \Vert s\Vert _\infty )^2)\). Our recursive approach does not theoretically rely on finding a correct prime, and again exploits the tower structure to achieve the next claim.
Theorem 3
Given a square s in \(\mathcal R_n\), there is a deterministic algorithm that computes \(t \in \mathcal R_n\) such that \(t^2=s\) in time \(\widetilde{O}(n^2\log \Vert s\Vert _\infty )\).
Recall that the subfield \(\mathcal K_{n/2}\) is fixed by the automorphism \(\sigma (\zeta _n) = -\zeta _n\). For any element t in \(\mathcal R_n\), recall that \(t = \frac{1}{2}(\mathrm {Tr}(t) + \zeta _n\mathrm {Tr}(\zeta _n^{-1} t))\), where \(\mathrm {Tr}\) is the trace relative to this extension. We can also see that

$$\begin{aligned} \mathrm {N}(t) = t\,\sigma (t) = \frac{1}{4}\left( \mathrm {Tr}(t)^2 - \zeta _{n/2}\,\mathrm {Tr}(\zeta _n^{-1}t)^2\right) \end{aligned}$$
(4)
for the relative norm. Hence recovering \(\mathrm {Tr}(t)\) and \(\mathrm {Tr}(\zeta _n^{-1} t)\) can be done by computing the square roots of elements in \(\mathcal R_{n/2}\) determined by s and \(\mathrm {N}(t)\). The fact that \(\mathrm {N}(s) = \mathrm {N}(t)^2\) leads to Algorithm 5.3.
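Below is a self-contained recursive sketch of \(\mathsf {TowerRoot}\) (Algorithm 5.3) on coefficient lists; for simplicity it replaces the \(\mathsf {CheckSqr}\) sign test by naive trial of both signs of \(\mathrm {N}(t)\), verifying the final candidate by squaring (all names are ours):

```python
import math

def nmul(a, b):
    """Multiplication in Z[x]/(x^m + 1), elements as coefficient lists."""
    m, c = len(a), [0] * len(a)
    for i, ai in enumerate(a):
        if ai:
            for j, bj in enumerate(b):
                if i + j < m:
                    c[i + j] += ai * bj
                else:
                    c[i + j - m] -= ai * bj
    return c

def muly(a):      # multiply by the subfield variable y (y^k = -1)
    return [-a[-1]] + a[:-1]

def muly_inv(a):  # multiply by y^{-1} = -y^{k-1}
    return a[1:] + [-a[0]]

def tower_sqrt(s):
    """Return t with t*t = s in Z[x]/(x^m + 1) (up to sign), else raise."""
    m = len(s)
    if m == 1:                                   # base case: the ring Z
        r = math.isqrt(s[0]) if s[0] >= 0 else -1
        if r < 0 or r * r != s[0]:
            raise ValueError("not a square")
        return [r]
    se, so = s[0::2], s[1::2]                    # s = s_e + x*s_o over R_{m/2}
    # N(s) = s_e^2 - y*s_o^2; its square root gives +-N(t) for t = t_e + x*t_o
    Ns = [p - q for p, q in zip(nmul(se, se), muly(nmul(so, so)))]
    Nt = tower_sqrt(Ns)
    for sign in (1, -1):
        # Right sign => Tr(t)^2 = Tr(s) + 2N(t) and 4y*t_o^2 = Tr(s) - 2N(t)
        try:
            Trt = tower_sqrt([2 * (p + sign * q) for p, q in zip(se, Nt)])
            U = muly_inv([2 * (p - sign * q) for p, q in zip(se, Nt)])
            if any(c % 2 for c in Trt) or any(c % 4 for c in U):
                continue
            to = tower_sqrt([c // 4 for c in U])
        except ValueError:
            continue
        te = [c // 2 for c in Trt]
        for to_signed in (to, [-c for c in to]):
            t = [0] * m
            t[0::2], t[1::2] = te, to_signed
            if nmul(t, t) == s:                  # final verification
                return t
    raise ValueError("not a square")

assert tower_sqrt(nmul([1, 2, 0, -1], [1, 2, 0, -1])) in ([1, 2, 0, -1], [-1, -2, 0, 1])
```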
Notice that square roots are only known up to sign. This means that an algorithm exploiting the tower structure of fields must perform several sign checks to ensure that it will lift the correct root to the next extension. For our algorithm, we only need to check for the sign of \(\mathrm {N}(t)\) (the signs of \(\mathrm {Tr}(t)\) and \(\mathrm {Tr}(\zeta _n^{-1} t)\) can be determined by checking if their current values allow to recover s). This verification happens at Step 6 of Algorithm 5.3, where after computing the square root of \(\mathrm {N}(s)\), we know \((-1)^b\mathrm {N}(t)\) for some \(b\in \{0,1\}\). It relies on noticing that from Eq. (4), \(T_b := \mathrm {Tr}(s)+2\cdot (-1)^b\mathrm {N}(t)\) is a square in \(\mathcal K_{n/2}\) if and only if \(b=0\), in which case \(T_b = \mathrm {Tr}(t)^2\). (Else, \(\zeta _n^{-2}T_b\) is the square \(\mathrm {Tr}(\zeta _n^{-1} t)^2\) in \(\mathcal K_{n/2}\).) This observation can be extended to a sign check that runs in \(\widetilde{O}(n\cdot \log \Vert s\Vert _\infty )\). The detailed analysis is given in the full version [17].
In practice, we can use the following approach: since n is small, we can easily precompute a prime integer p such that \(p -1\equiv n\bmod 2n\). For such a prime, there is a primitive \(n^{\text {th}}\) root \(\omega \) of unity in \(\mathbb {F}_p\), and such a root cannot be a square in \(\mathbb {F}_p\) (else 2n would divide \(p-1\)). Checking squareness then amounts to checking which of \(T_b(\omega )\) or \(\omega ^{-2}T_b(\omega )\) is a square \(\bmod \, p\) by computing a Legendre symbol. While we need such primes for every power of 2 smaller than n, these checks are in any case done in quasi-linear time. Compared to [25], the size of p here does not matter.
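A sketch of this precomputation and of the basic squareness test (our own rendering; sympy's isprime is used for brevity, and the helper names are ours):

```python
from sympy import isprime

def find_prime(n):
    """Smallest prime p with p - 1 = n (mod 2n)."""
    p = n + 1
    while not isprime(p):
        p += 2 * n                       # keeps p - 1 = n (mod 2n)
    return p

def primitive_nth_root(n, p):
    """A primitive n-th root of unity mod p (n a power of two, n | p - 1)."""
    for a in range(2, p):
        w = pow(a, (p - 1) // n, p)
        if pow(w, n // 2, p) != 1:       # order exactly n, so w^{n/2} = -1
            return w

def is_square_mod(v, p):
    """Legendre symbol test for an odd prime p."""
    return pow(v % p, (p - 1) // 2, p) == 1

def check_sqr(T, omega, p):
    """Evaluate T (coefficients over the subfield) at omega and test squareness."""
    val = sum(c * pow(omega, i, p) for i, c in enumerate(T)) % p
    return is_square_mod(val, p)
```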
Let us denote by \(\mathsf {SQRT}(n, S)\) the complexity of Algorithm 5.3 for an input \(s \in \mathcal R_n\) with coefficients of size \(S = \log \Vert s\Vert _\infty \). Using e.g. FFT-based multiplication of polynomials, \(\mathrm {N}(s)\) can be computed in \(\widetilde{O}(n S)\), and has bitsize at most \(2S+\log n\). Recall that the so-called canonical embedding of any \(s\in \mathcal K_n\) is the vector \(\tau (s)\) of its evaluations at the roots of \(x^n+1\). It is well-known that it satisfies \(\Vert \tau (s) \Vert = \sqrt{n}\Vert s\Vert \), so that \(\Vert \tau (s)\Vert _\infty \le n \Vert s\Vert _\infty \) by norm equivalence. If \(s=t^2\) we see that \(\Vert \tau (s)\Vert _\infty = \Vert \tau (t)\Vert _\infty ^2\). Using again norm equivalence, we obtain \(\Vert t\Vert _\infty \le \sqrt{n}\Vert s\Vert _\infty ^{1/2}\). In the case of \(\mathrm {N}(s) = \mathrm {N}(t)^2\), we obtain that \(\mathrm {N}(t)\) has size at most \(S+\log n\). The cost of \(\mathsf {CheckSqr}\) is at most \(\widetilde{O}(n S)\), so we obtain

$$\begin{aligned} \mathsf {SQRT}(n, S) \le \mathsf {SQRT}\left( \tfrac{n}{2}, 2S+\log n\right) + 2\,\mathsf {SQRT}\left( \tfrac{n}{2}, S+\log n\right) + \widetilde{O}(nS). \end{aligned}$$
A tedious computation (see the full version [17] for details) gives us Theorem 3.
6 Side-Channel Leakage of the Gram–Schmidt Norms
Our algorithms in Sect. 5 rely on the knowledge of the exact Gram-Schmidt norms \(\Vert \mathbf {b}_i^*\Vert \). In this section, we show that in the original implementations of DLP and Falcon, approximations of \(\Vert \mathbf {b}_i^*\Vert \)’s can be obtained by exploiting the leakage induced by a non constant-time rejection sampling.
In previous works targeting the rejection phase, the standard deviation of the sampler was a public constant. This work deals with a different situation, as both the centers and the standard deviations used by the samplers of DLP and Falcon are secret values determined by the secret key. These samplers output Gaussian vectors by relying on an integer Gaussian sampler, which performs rejection sampling. The secret standard deviation for the \(i^{\text {th}}\) integer Gaussian is computed as \(\sigma _i =\sigma /\Vert \mathbf {b}_i^*\Vert \) for some fixed \(\sigma \), so that exposure of the \(\sigma _i\)’s means the exposure of the Gram-Schmidt norms. The idea of the attack stems from the simple observation that the acceptance rate of the sampler is essentially a linear function of its current \(\sigma _i\). In this section, we show how, by a timing attack, one may recover all acceptance rates from sufficiently many signatures by computing a well-chosen maximum likelihood estimator. Recovering approximations of the \(\Vert \mathbf {b}_i^*\Vert \)’s then follows straightforwardly.
6.1 Leakage in the DLP Scheme
We first target the Gaussian sampling in the original implementation [46], described in Algorithms 6.1 and 6.2. It samples "shifted" Gaussian integers by relying on three layers of Gaussian integer sampling with rejection. More precisely, the target Gaussian distribution at the "top" layer has a center which depends on secret data and varies during each call. To deal with the varying center, the "shifted" sample is generated by combining a zero-centered sampler with rejection sampling. The zero-centered sampler has the same standard deviation as the "shifted" one, and that standard deviation depends on the secret key. At the "intermediate" layer, also by rejection sampling, the sampler rectifies a public zero-centered sample into a secret-dependent one.
At the “bottom” layer, the algorithm \(\mathsf {IntSampler}\) actually follows the BLISS sampler [8] that is already subject to side-channel attacks [7, 14, 43]. We stress again that our attack does not target this algorithm, so that the reader can assume a constant-time version of it is actually used here. The weakness we are exploiting is a non constant-time implementation of Algorithm 6.2 in the “intermediate” layer. We now describe how to actually approximate the \(\sigma _i\)’s using this leakage.
Let \(\widehat{\sigma } = \sqrt{\frac{1}{2\log (2)}}\) be the standard deviation of the Gaussian at the "bottom" layer and \(k_i = \lceil \frac{\sigma _i}{\widehat{\sigma }}\rceil \). It can be verified that the average acceptance probability of Algorithm 6.2 is \(AR(\sigma _i) = \frac{\rho _{\sigma _i}(\mathbb {Z})}{\rho _{k_i\widehat{\sigma }}(\mathbb {Z})}\). As required by the KGPV algorithm, we know that \(k_i\widehat{\sigma } \ge \sigma _i \ge \eta _{\epsilon }'(\mathbb {Z})\) and by Corollary 1 we have \(AR(\sigma _i) \in \frac{\sigma _i}{k_i\widehat{\sigma }}\cdot \left[ \frac{1-\epsilon }{1+\epsilon }, 1\right] \). Since \(\epsilon \) is very small in this context, we do not lose much by assuming that \(AR(\sigma _i) = \frac{\sigma _i}{k_i\widehat{\sigma }}\).
Next, for a given \(\sigma _i\), the number of trials before Algorithm 6.2 outputs its result follows a geometric distribution \(\text {Geo}_{AR(\sigma _i)}\). We let \(\overline{AR}_i\) be maximum likelihood estimators for the \(AR(\sigma _i)\)’s associated to N executions of the KGPV sampler, that we compute using Lemma 1. We now want to determine the \(k_i\)’s to compute \(\overline{\sigma _i} = k_i\hat{\sigma }\overline{AR}_i\). Concretely, for the suggested parameters, we can set \(k_i = 3\) for all i at the beginning and measure \(\overline{AR}_i\). Because the first half of the \(\sigma _i\)’s are in a small interval and increase slowly, it may be the case at some step that \(\overline{AR}_{i+1}\) is significantly smaller than \(\overline{AR}_{i}\) (say, \(1.1\cdot \overline{AR}_{i+1} < \overline{AR}_{i}\)). This means that \(k_{i+1} = k_i+1\), and we then increase by one all the next \(k_{i}\)’s. This approach can be done until \(\overline{AR}_{n}\) is obtained, and works well in practice. Lastly, Lemma 1 tells us that for large enough \(\alpha \) and p, taking \(N\ge 2^{2(p+\log \alpha )}\) implies \(|\overline{\sigma }_i-\sigma _i|\le 2^{-p}\cdot \sigma _i\) for all \(0\le i< 2n\) with high probability.
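This estimation loop can be summarized by the following sketch (our own illustration; the hypothetical input trial_counts would be extracted from the timing traces):

```python
import math

SIGMA_HAT = math.sqrt(1 / (2 * math.log(2)))   # bottom-layer standard deviation

def estimate_sigmas(trial_counts):
    """trial_counts[i] is the list of observed iteration counts of the
    Algorithm 6.2 rejection loop for index i, over N signing traces."""
    sigmas, k, prev_ar = [], 3, None
    for counts in trial_counts:
        ar = len(counts) / sum(counts)       # MLE of the acceptance rate (Lemma 1)
        if prev_ar is not None and 1.1 * ar < prev_ar:
            k += 1                           # sigma_i crossed a multiple of sigma_hat
        sigmas.append(k * SIGMA_HAT * ar)    # sigma_i = k_i * sigma_hat * AR(sigma_i)
        prev_ar = ar
    return sigmas
```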
From [11], the constant \(\sigma \) is publicly known. This allows us to compute the approximations \(\overline{b_i} = \frac{\sigma }{\overline{\sigma }_i}\), which we expect to approximate the \(\Vert \mathbf {b}_i^*\Vert \)'s to p bits of accuracy.
6.2 Leakage in the Falcon Scheme
We now describe how the original implementation of Falcon presents a similar leakage of Gram–Schmidt norms via timing side-channels. In contrast to the previous section, the integer sampler of Falcon is based on one public half-Gaussian sampler and some rejection sampling to reflect sensitive standard deviations and centers. The procedure is shown in Algorithm 6.3.
Our analysis does not target the half-Gaussian sampler \(D_{\mathbb {Z},\widehat{\sigma }}^{+}\), where \(\widehat{\sigma }=2\), so we omit its description. It can be implemented in a constant-time way [29], but this has no bearing on the leakage we describe.
We first consider \(c_i\) and \(\sigma _i\) to be fixed. Following Algorithm 6.3, we let \(p(z,b) = \exp \left( \frac{z^2}{2\widehat{\sigma }^2} - \frac{(b + (2b-1)z-c_i)^2}{2\sigma _i^2} \right) \) be the acceptance probability and note that, since \((z,b)\mapsto b+(2b-1)z\) maps \(\mathbb {Z}_{\ge 0}\times \{0,1\}\) bijectively onto \(\mathbb {Z}\),

\(\sum _{z\ge 0,\, b\in \{0,1\}} \rho _{\widehat{\sigma }}^{+}(z)\, p(z,b) = \rho _{\sigma _i}(\mathbb {Z}-c_i).\)

Then the average acceptance probability for fixed \(c_i\) and \(\sigma _i\) satisfies

\(AR(c_i,\sigma _i) = \frac{\rho _{\sigma _i}(\mathbb {Z}-c_i)}{2\rho _{\widehat{\sigma }}^{+}(\mathbb {N})}.\)
As \(\widehat{\sigma } \ge \sigma _i \ge \eta _{\epsilon }'(\mathbb {Z})\) for a very small \(\epsilon \), we can again use Lemma 2 to see that \(\rho _{\sigma _i}(\mathbb {Z}-c_i) \approx \rho _{\sigma _i}(\mathbb {Z})\). This allows us to treat the average acceptance probability as a function \(AR(\sigma _i)\) independent of the center. Using that \(2\rho _{\widehat{\sigma }}^+(\mathbb {N}) = \rho _{\widehat{\sigma }}(\mathbb {Z})+1\) combined with Corollary 1, we write \(AR(\sigma _i) = \frac{\sigma _i\sqrt{2\pi }}{1+2\sqrt{2\pi }}\). Then an application of Lemma 1 gives the number of traces needed to approximate \(\sigma _i\) up to any desired accuracy.
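As in the DLP case, the acceptance rate can be estimated from iteration counts and then inverted. A minimal sketch, under the same hypothetical iters data layout as before:

```python
import math

def falcon_sigmas(iters):
    """Estimate Falcon's sigma_i's by inverting AR(s) = s*sqrt(2*pi)/(1+2*sqrt(2*pi)).

    iters[i] lists the observed numbers of trials of Algorithm 6.3 per output sample.
    """
    c = (1.0 + 2.0 * math.sqrt(2.0 * math.pi)) / math.sqrt(2.0 * math.pi)
    return [c * len(t) / sum(t) for t in iters]  # sigma_i ~ c * AR_i
```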
7 Practical Attack Against the DLP Scheme
For the methods of Sect. 6, measurement errors are inevitable in practice, and a practical attack has to take them into account. In this section, we show that it is feasible to recover a totally positive element even from noisy diagonal coefficients of its LDL decomposition.
First, we adapt the algorithm \(\mathsf {Recovery}_\mathcal A\) (Algorithm 5.1) to noisy input in Sect. 7.1. To determine each coefficient, we now need to solve a quadratic inequality instead of an equation, due to the noise. As a consequence, each quadratic inequality may yield several candidates for the coefficient. Depending on whether candidates exist, the algorithm either extends the prefix, hopefully toward a valid solution, or eliminates it. The algorithm thus behaves as a tree search.
We then detail in Sect. 7.2 some implementation techniques that accelerate the recovery algorithm in the context of the DLP scheme. While the algorithm is easy to follow, adapting it to the practical noisy case is not trivial.
Finally, we report experimental results in Sect. 7.3. In summary, given the full timing leakage of about \(2^{34}\) signatures, one can in practice break, with good probability, the DLP parameters claimed to achieve 192-bit security. We give some theoretical support for this value in Sect. 7.4.
7.1 Totally Positive Recovery with Noisy Inputs
Section 5.1 sketched the exact recovery algorithm. To tackle measurement errors, we introduce a new parameter denoting the error bound. The modified algorithm proceeds in the same way: given a prefix \((A_0,\cdots , A_{l-1})\), it computes all possible \(A_l\)’s satisfying the error bound condition, and extends or eliminates the prefix depending on whether it can lead to a valid solution. A formal algorithmic description is provided in Algorithm 7.1. For convenience, we use the (noisy) diagonal coefficients (i.e. the secret Gram–Schmidt norms) of the LDL decomposition as input; indeed, Proposition 1 has shown the equivalence between the diagonal part and the principal minors. In addition, we include the prefix in the input for ease of description. The initial prefix is \(\mathsf{prefix} = \overline{A_0} = \lfloor {\overline{d_0}}\rceil \). Clearly, the correct A must be in the final candidate list.
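For illustration, a minimal depth-first variant of this tree search is sketched below (the pruning of Sect. 7.2 is omitted); candidates_for is a hypothetical callback that solves the quadratic inequality and returns the integers \(A_l\) compatible with the noisy \(\overline{d_l}\) within the error bound.

```python
def noisy_recovery(d_noisy, n, candidates_for):
    """Depth-first search over prefixes (A_0, ..., A_{l-1}).

    d_noisy: noisy diagonal of the LDL decomposition (squared Gram-Schmidt norms).
    candidates_for(prefix, d_next): integers A_l consistent with d_next, possibly empty.
    Returns all full-length candidates; the correct A must be among them.
    """
    solutions = []
    stack = [[round(d_noisy[0])]]           # initial prefix: A_0 = round(d_0)
    while stack:
        prefix = stack.pop()
        if len(prefix) == n:
            solutions.append(prefix)        # complete candidate, to be verified
            continue
        for a in candidates_for(prefix, d_noisy[len(prefix)]):
            stack.append(prefix + [a])      # no candidate = the prefix is eliminated
    return solutions
```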
7.2 Practical Tweaks in the DLP Setting
We implemented our side-channel attack against the DLP signature scheme. The following techniques significantly boost the practical performance of the recovery algorithm and reduce the number of required signatures.
Fast Computation of the Quadratic Equation. Exploiting the Toeplitz structure of \(\mathcal A_n(A)\), we propose a fast algorithm to compute the quadratic equation, i.e. \((Q_a, Q_b, Q_c)\), that requires only O(l) multiplications and additions. The idea is as follows. Let \(\mathbf {T}_i = \mathcal A_n(A)_{[0,i]}\). Let \(\mathbf {u}_i = (A_1,\cdots , A_i)\) and \(\mathbf {v}_i = (A_i,\cdots , A_1)\); then

\(\mathbf {T}_i = \begin{pmatrix} \mathbf {T}_{i-1} & \mathbf {v}_i^t \\ \mathbf {v}_i & A_0 \end{pmatrix} = \begin{pmatrix} A_0 & \mathbf {u}_i \\ \mathbf {u}_i^t & \mathbf {T}_{i-1} \end{pmatrix}.\)

Let \(\mathbf {r}_i = \mathbf {v}_i\mathbf {T}_{i-1}^{-1}\) and \(\mathbf {s}_i = \mathbf {u}_i\mathbf {T}_{i-1}^{-1}\), which is the reverse of \(\mathbf {r}_i\), and let \(d_i = A_0 - \langle {\mathbf {v}_i, \mathbf {r}_i}\rangle = A_0 - \langle {\mathbf {u}_i, \mathbf {s}_i}\rangle \). A straightforward computation, inverting \(\mathbf {T}_{i-1}\) blockwise through its Schur complement \(d_{i-1}\), leads to

\(d_i = d_{i-1} - \frac{\left( A_i - \langle {\mathbf {s}_{i-1}, \mathbf {v}_{i-1}}\rangle \right) ^2}{d_{i-1}}.\)

Let \(f_i = \langle {\mathbf {r}_i, \mathbf {u}_i}\rangle = \langle {\mathbf {s}_i, \mathbf {v}_i}\rangle \); then the quadratic equation satisfied by \(A_i\) is

\(A_i^2 - 2f_{i-1}A_i + f_{i-1}^2 + d_{i-1}(d_i - d_{i-1}) = 0.\)

Remark that \(d_i\) is the square of the last Gram–Schmidt norm. Because \(\overline{d_i}\), a noisy \(d_i\), is the input, combining \( f_{i-1}, \mathbf {v}_{i-1}, \mathbf {r}_{i-1}\) determines all possible \(A_i\)’s. Once \(A_i\) is recovered, one can then compute \(\mathbf {r}_i\), \(\mathbf {s}_i\) according to

\(\mathbf {r}_i = \left( \frac{A_i - f_{i-1}}{d_{i-1}},\ \mathbf {r}_{i-1} - \frac{A_i - f_{i-1}}{d_{i-1}}\,\mathbf {s}_{i-1}\right) \)

and further compute \(d_{i}, f_{i}\). As the recovery algorithm starts with \(i=1\) (i.e. \(\mathsf{prefix} = A_0\)), we can compute the sequences \(\{{d_{i}}\}, \{{f_{i}}\}, \{{\mathbf {r}_i}\}, \{{\mathbf {s}_i}\}\) on the fly.
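A sketch of one extension step of this recursion, written directly from the formulas above (the code and names are ours), might be:

```python
def extend(prefix, r, s, d, f, A_next):
    """One O(l) step of the Toeplitz (Levinson-style) recursion.

    Given the state (r_i, s_i, d_i, f_i) for prefix = (A_0, ..., A_i) and a
    candidate coefficient A_next = A_{i+1}, return the updated state.
    """
    t = (A_next - f) / d                   # first coordinate of r_{i+1}
    r_new = [t] + [ri - t * si for ri, si in zip(r, s)]
    s_new = r_new[::-1]                    # s_{i+1} is the reverse of r_{i+1}
    d_new = d - (A_next - f) ** 2 / d      # next squared Gram-Schmidt norm
    u_new = prefix[1:] + [A_next]          # u_{i+1} = (A_1, ..., A_{i+1})
    f_new = sum(ri * ui for ri, ui in zip(r_new, u_new))
    return r_new, s_new, d_new, f_new
```

The recursion is bootstrapped at \(i=1\) with \(\mathbf {r}_1 = \mathbf {s}_1 = (A_1/A_0)\), \(d_1 = A_0 - A_1^2/A_0\) and \(f_1 = A_1^2/A_0\).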
Remark 1
The input matrix is very well conditioned, so we can use a precision of only \(O(\log n)\) bits.
Remark 2
The above method implies an algorithm of complexity \(\widetilde{O}(n^2)\) for the exact case (Sect. 5.1).
Pruning. We expect that when a mistake is made in the prefix, the error committed on the subsequent Gram–Schmidt norms will be larger. We therefore propose to prune prefixes when \(\sum _{k=i}^j e_k^2/\sigma _k^2\ge B_{j-i}\) for some i, j, where \(e_k\) is the difference between the measured k-th squared Gram–Schmidt norm and that of the prefix. The bound \(B_l\) is selected so that, for \(e_k\) a Gaussian of standard deviation \(\sigma _k\), the sum stays below \(B_l\) except with probability \(\tau /\sqrt{l}\). The failure probability \(\tau \) is geometrically decreased until the correct solution is found.
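Under this Gaussian heuristic, the normalized sum of l terms follows a chi-squared distribution with l degrees of freedom, so a natural choice of bound (a sketch using SciPy; the quantile-based choice is our reading of the criterion above) is:

```python
import math
from scipy.stats import chi2

def pruning_bounds(max_len, tau):
    """B_l such that a chi-squared(l) variable exceeds B_l with probability tau/sqrt(l)."""
    return [chi2.ppf(1.0 - tau / math.sqrt(l), df=l) for l in range(1, max_len + 1)]
```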
Verifying Candidates. Let \(A = f\overline{f} + g\overline{g}\); then \(f\overline{f} = A(1+h\overline{h})^{-1}\bmod q\). As mentioned in Sect. 4, all coefficients of \(f\overline{f}\) except the constant one are much smaller than the modulus q. This can be used to check whether a candidate is correct. In addition, since both A(x) and \(A(-x)\) occur among the final candidates, we also check \(A(1+h(-x)\overline{h}(-x))^{-1}\) to ensure that the correct \(A(-x)\) is not eliminated. Once either A(x) or \(A(-x)\) is found, we terminate the algorithm.
The Use of Symplecticity. As observed in [18], the trapdoor basis \(\mathbf {B}_{f,g}\) is q-symplectic, and thus \(\Vert \mathbf {b}_i^*\Vert \cdot \Vert \mathbf {b}_{2n-1-i}^*\Vert = q\). Based on this, we combine the samples of the i-th and \((2n-1-i)\)-th Gaussians to approximate \(\Vert \mathbf {b}_i^*\Vert \). This refines the approximations and thus reduces the number of signatures needed for a practical attack.
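Since \(\sigma _i = \sigma /\Vert \mathbf {b}_i^*\Vert \), the symplectic relation translates into \(\sigma _i\cdot \sigma _{2n-1-i} = \sigma ^2/q\), and the two independent estimates can be merged, e.g. by a geometric mean (our own choice of combination):

```python
import math

def combine(sigma_est, sigma, q):
    """Merge the i-th and (2n-1-i)-th estimates using sigma_i * sigma_{2n-1-i} = sigma^2 / q."""
    n2 = len(sigma_est)
    out = []
    for i in range(n2):
        paired = sigma**2 / (q * sigma_est[n2 - 1 - i])  # alternative estimate of sigma_i
        out.append(math.sqrt(sigma_est[i] * paired))     # geometric mean reduces the variance
    return out
```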
7.3 Experimental Results
We validate the recovery algorithm on practical DLP instances. Experiments are conducted on the parameter set claimed for 192-bit security, in which \(n = 512\).
The leakage data we extract is the number of iterations of the centered Gaussian samplings (Algorithm 6.2). To obtain it, we added some instrumentation to Prest’s C++ implementation [46]. The centered Gaussian samplings depend only on the secret key itself, not on the hashed message; hence, instead of executing the complete signing procedure, we only perform the centered Gaussian samplings. By sample size we mean the number of collected Gaussian samples. In fact, considering the rejection sampling in Algorithm 6.1, about N/2 signatures are required to generate N samples per centered Gaussian.
We tested our algorithm on ten instances; the results are shown in Table 1. Producing the dataset of \(2^{36.5}\) samples for a given key took about 36 hours on our 48-core machine (two weeks for all ten distinct keys).
In one instance, the recovery algorithm found millions of candidate solutions whose Gram–Schmidt norms were closer to the noisy ones than those of the correct solution, in the sense that they had a larger \(\tau \). This indicates that the recovery algorithm is relatively close to optimal.
7.4 Precision Required on the Gram–Schmidt Norms
We now try to give a closed formula for the number of samples needed. Recall that the relative error on the squared Gram–Schmidt norms is \(\varTheta (1/\sqrt{N})\), where N is the number of samples.
A fast recovery corresponds to the case where only one root is close to an integer; in particular, increasing the new coefficient by one must change the Gram–Schmidt norm by \(\varOmega (1/\sqrt{N})\). This is not an equivalence, because there is another root of the quadratic form, but we will assume this condition is enough.
Let \(\mathbf {b}_1\) be the first row of \(\begin{pmatrix} \mathcal A_n(f)&\mathcal A_n(g)\end{pmatrix}\), and \(\mathbf {b}_i\) the i-th row for \(i\ge 2\). We define \(pb_i\) as the projection of \(\mathbf {b}_1\) orthogonally to \(\mathbf {b}_2,\dots ,\mathbf {b}_{i-1}\), and expect that \(\Vert pb_i\Vert \approx \sqrt{\frac{2n-i+2}{2n}}\Vert \mathbf {b}_1\Vert \). Consider the Gram matrix of the family \(\mathbf {b}_1,\dots ,\mathbf {b}_{i-1},\mathbf {b}_{i}\pm \frac{pb_i}{\Vert \mathbf {b}_1\Vert ^2}\): we have changed only the top-right/bottom-left coefficients by (roughly) \(\pm 1\), besides the bottom-right coordinate. Clearly this does not change the i-th Gram–Schmidt vector, so the absolute change in the i-th Gram–Schmidt norm squared is

\(\left| \frac{2\langle \mathbf {b}_i, pb_i\rangle }{\Vert \mathbf {b}_1\Vert ^2} \pm \frac{\Vert pb_i\Vert ^2}{\Vert \mathbf {b}_1\Vert ^4}\right| .\)

Since the i-th Gram–Schmidt norm squared is itself roughly \(\Vert pb_i\Vert ^2\), the corresponding relative change is about \(2|\langle \mathbf {b}_i, pb_i\rangle |/(\Vert \mathbf {b}_1\Vert ^2\Vert pb_i\Vert ^2)\).
Getting only one solution at each step with constant probability then corresponds to

\(\frac{2|\langle \mathbf {b}_i, pb_i\rangle |}{\Vert \mathbf {b}_1\Vert ^2\Vert pb_i\Vert ^2} = \varOmega \left( \frac{1}{\sqrt{N}}\right) \)

(assuming the scalar product is distributed as a Gaussian), which means a total number of samples of

\(N = \varTheta (nq^2).\)

This gives roughly \(2^{29}\) samples, which is similar to what the search algorithm requires.
Getting only one solution at each step with probability \(1-1/n\) corresponds instead to requiring the same condition even when the Gaussian scalar product falls in its lowest \(1/n\) quantile, and hence \(N=\varTheta (n^3q^2)\). This would be \(2^{57}\) samples.
8 Conclusion and Future Work
In this paper, we have investigated the side-channel security of the two main efficient hash-and-sign lattice-based signature schemes: DLP and Falcon (focusing on their original implementations, although our results carry over to several later implementations as well). The two main takeaways of our analysis are that:
1. the Gram–Schmidt norms of the secret basis leak through timing side-channels; and
2. knowing the Gram–Schmidt norms allows one to fully recover the secret key.
Interestingly, however, there is a slight mismatch between those two results: the side-channel leakage only provides approximate values of the Gram–Schmidt norms, whereas secret key recovery a priori requires exact values. We are able to bridge this gap in the case of DLP by combining the recovery algorithm with a pruned tree search. This lets us mount a concrete attack that, in practice, recovers the key from \(2^{33}\) to \(2^{35}\) DLP traces for the high-security parameters of DLP (claiming 192 bits of security).
However, the gap remains in the case of Falcon: we do not know how to modify our recovery algorithm to deal with approximate inputs, and as a result we cannot yet turn it into a concrete attack. This is left as a challenging open problem for future work.
Also left for future work on the more theoretical side is the problem of giving an intrinsic description of our recovery algorithms in terms of algebraic quantities associated with the corresponding totally positive elements (or equivalently, to give an algebraic interpretation of the LDL decomposition for algebraically structured self-adjoint matrices). In particular, in the Falcon case, our approach shows that the Gram–Schmidt norms characterize the Galois conjugacy class of a totally positive element. This strongly suggests that they should admit a nice algebraic description, but it remains elusive for now.
The final recovery step in our attack, namely computing f from \(f\bar{f} + g\bar{g}\), relies heavily on the NTRU structure. Further investigation is needed to understand the impact of Gram–Schmidt norm leakage on hash-and-sign schemes over other lattices. For non-structured lattices, however, there appears to be a strong obstruction to at least a full key recovery attack, simply due to the dimension of the problem: there are only n Gram–Schmidt norms, but \(O(n^2)\) secret coefficients to recover.
On a positive note, we finally recall that the problem of finding countermeasures against the leakage discussed in this paper is fortunately already solved, thanks to the recent work of Prest, Ricosset and Rossi [48]. That countermeasure has very recently been implemented in Falcon [45], so the leak can be considered patched. Its overhead is modest in the case of Falcon, thanks to the small range in which the possible standard deviations lie; however, it could become more costly for samplers that need to accommodate a wider range of standard deviations.
An alternative countermeasure could be to use Peikert’s convolution sampling [42] instead of the KGPV approach, as it eliminates the need for varying standard deviations and is easier to implement even without floating-point arithmetic [9]. It does have the drawback of sampling wider Gaussians, however, and hence leads to less compact parameter choices.
Notes
1. The scaling factor is \((\sqrt{2\pi })^{-1}\) before the smoothing parameter \(\eta _{\epsilon }(\mathbb {Z})\) in [38].
2. Each root of \(x^n+1\) describes one complex embedding by means of evaluation.
3. This describes the discriminant of \(T^2-\mathrm {Tr}(u)T + \mathrm {N}(u)\), whose roots are u and \(\sigma (u)\) in \(\mathcal K_n\). It is then not surprising that \(\mathrm {Tr}(\zeta _n^{-1}u)\overline{\mathrm {Tr}(\zeta _n^{-1}u)}\) is a square only in \(\mathcal K_n\).
References
Babai, L.: On Lovász’ lattice reduction and the nearest lattice point problem. Combinatorica 6(1), 1–13 (1986)
Bai, S., Galbraith, S.D.: An improved compression technique for signatures based on learning with errors. In: Benaloh, J. (ed.) CT-RSA 2014. LNCS, vol. 8366, pp. 28–47. Springer, Cham (2014). https://doi.org/10.1007/978-3-319-04852-9_2
Barthe, G., et al.: Masking the GLP lattice-based signature scheme at any order. In: Nielsen, J.B., Rijmen, V. (eds.) EUROCRYPT 2018, Part II. LNCS, vol. 10821, pp. 354–384. Springer, Cham (2018). https://doi.org/10.1007/978-3-319-78375-8_12
Barthe, G., Belaïd, S., Espitau, T., Fouque, P.A., Rossi, M., Tibouchi, M.: GALACTICS: Gaussian sampling for lattice-based constant-time implementation of cryptographic signatures, revisited. In: Cavallaro, L., Kinder, J., Wang, X., Katz, J. (eds.) ACM CCS 2019, pp. 2147–2164. ACM Press (2019)
Bindel, N., et al.: qTESLA. Technical report, National Institute of Standards and Technology (2019). https://csrc.nist.gov/projects/post-quantum-cryptography/round-2-submissions
Bootle, J., Delaplace, C., Espitau, T., Fouque, P.-A., Tibouchi, M.: LWE without modular reduction and improved side-channel attacks against BLISS. In: Peyrin, T., Galbraith, S. (eds.) ASIACRYPT 2018, Part I. LNCS, vol. 11272, pp. 494–524. Springer, Cham (2018). https://doi.org/10.1007/978-3-030-03326-2_17
Groot Bruinderink, L., Hülsing, A., Lange, T., Yarom, Y.: Flush, gauss, and reload – a cache attack on the BLISS lattice-based signature scheme. In: Gierlichs, B., Poschmann, A.Y. (eds.) CHES 2016. LNCS, vol. 9813, pp. 323–345. Springer, Heidelberg (2016). https://doi.org/10.1007/978-3-662-53140-2_16
Ducas, L., Durmus, A., Lepoint, T., Lyubashevsky, V.: Lattice signatures and bimodal Gaussians. In: Canetti, R., Garay, J.A. (eds.) CRYPTO 2013, Part I. LNCS, vol. 8042, pp. 40–56. Springer, Heidelberg (2013). https://doi.org/10.1007/978-3-642-40041-4_3
Ducas, L., Galbraith, S., Prest, T., Yu, Y.: Integral matrix gram root and lattice Gaussian sampling without floats. In: Canteaut, A., Ishai, Y. (eds.) EUROCRYPT 2020. LNCS, vol. 12107, pp. 608–637. Springer, Cham (2020)
Ducas, L., et al.: CRYSTALS-Dilithium: a lattice-based digital signature scheme. IACR TCHES 2018(1), 238–268 (2018). https://tches.iacr.org/index.php/TCHES/article/view/839
Ducas, L., Lyubashevsky, V., Prest, T.: Efficient identity-based encryption over NTRU lattices. In: Sarkar, P., Iwata, T. (eds.) ASIACRYPT 2014, Part II. LNCS, vol. 8874, pp. 22–41. Springer, Heidelberg (2014). https://doi.org/10.1007/978-3-662-45608-8_2
Ducas, L., Nguyen, P.Q.: Learning a zonotope and more: cryptanalysis of NTRUSign countermeasures. In: Wang, X., Sako, K. (eds.) ASIACRYPT 2012. LNCS, vol. 7658, pp. 433–450. Springer, Heidelberg (2012). https://doi.org/10.1007/978-3-642-34961-4_27
Ducas, L., Prest, T.: Fast Fourier orthogonalization. In: ISSAC, pp. 191–198 (2016)
Espitau, T., Fouque, P.A., Gérard, B., Tibouchi, M.: Side-channel attacks on BLISS lattice-based signatures: exploiting branch tracing against strongSwan and electromagnetic emanations in microcontrollers. In: Thuraisingham, B.M., Evans, D., Malkin, T., Xu, D. (eds.) ACM CCS 2017, pp. 1857–1874. ACM Press, October/November 2017
Espitau, T., Fouque, P., Gérard, B., Tibouchi, M.: Loop-abort faults on lattice-based signature schemes and key exchange protocols. IEEE Trans. Comput. 67(11), 1535–1549 (2018). https://doi.org/10.1109/TC.2018.2833119
Fiat, A., Shamir, A.: How to prove yourself: practical solutions to identification and signature problems. In: Odlyzko, A.M. (ed.) CRYPTO 1986. LNCS, vol. 263, pp. 186–194. Springer, Heidelberg (1987). https://doi.org/10.1007/3-540-47721-7_12
Fouque, P.A., Kirchner, P., Tibouchi, M., Wallet, A., Yu, Y.: Key Recovery from Gram-Schmidt Norm Leakage in Hash-and-Sign Signatures over NTRU Lattices. IACR Cryptology ePrint Archive, report 2019/1180 (2019)
Gama, N., Howgrave-Graham, N., Nguyen, P.Q.: Symplectic lattice reduction and NTRU. In: Vaudenay, S. (ed.) EUROCRYPT 2006. LNCS, vol. 4004, pp. 233–253. Springer, Heidelberg (2006). https://doi.org/10.1007/11761679_15
Gentry, C., Jonsson, J., Stern, J., Szydlo, M.: Cryptanalysis of the NTRU signature scheme (NSS) from Eurocrypt 2001. In: Boyd, C. (ed.) ASIACRYPT 2001. LNCS, vol. 2248, pp. 1–20. Springer, Heidelberg (2001). https://doi.org/10.1007/3-540-45682-1_1
Gentry, C., Peikert, C., Vaikuntanathan, V.: Trapdoors for hard lattices and new cryptographic constructions. In: Ladner, R.E., Dwork, C. (eds.) 40th ACM STOC, pp. 197–206. ACM Press, May 2008
Gentry, C., Szydlo, M.: Cryptanalysis of the revised NTRU signature scheme. In: Knudsen, L.R. (ed.) EUROCRYPT 2002. LNCS, vol. 2332, pp. 299–320. Springer, Heidelberg (2002). https://doi.org/10.1007/3-540-46035-7_20
Goldreich, O., Goldwasser, S., Halevi, S.: Public-key cryptosystems from lattice reduction problems. In: Kaliski Jr., B.S. (ed.) CRYPTO 1997. LNCS, vol. 1294, pp. 112–131. Springer, Heidelberg (1997). https://doi.org/10.1007/BFb0052231
Güneysu, T., Lyubashevsky, V., Pöppelmann, T.: Practical lattice-based cryptography: a signature scheme for embedded systems. In: Prouff, E., Schaumont, P. (eds.) CHES 2012. LNCS, vol. 7428, pp. 530–547. Springer, Heidelberg (2012). https://doi.org/10.1007/978-3-642-33027-8_31
Hoffstein, J., Howgrave-Graham, N., Pipher, J., Silverman, J.H., Whyte, W.: NTRUSign: digital signatures using the NTRU lattice. In: Joye, M. (ed.) CT-RSA 2003. LNCS, vol. 2612, pp. 122–140. Springer, Heidelberg (2003). https://doi.org/10.1007/3-540-36563-X_9
Hoffstein, J., Lieman, D., Silverman, J.H.: Polynomial rings and efficient public key authentication (1999)
Hogg, R.V., McKean, J.W., Craig, A.T.: Introduction to Mathematical Statistics, 8th edn. Pearson, London (2018)
Hülsing, A., Lange, T., Smeets, K.: Rounded Gaussians - fast and secure constant-time sampling for lattice-based crypto. In: Abdalla, M., Dahab, R. (eds.) PKC 2018, Part II. LNCS, vol. 10770, pp. 728–757. Springer, Cham (2018). https://doi.org/10.1007/978-3-319-76581-5_25
Karmakar, A., Roy, S.S., Reparaz, O., Vercauteren, F., Verbauwhede, I.: Constant-time discrete Gaussian sampling. IEEE Trans. Comput. 67(11), 1561–1571 (2018)
Karmakar, A., Roy, S.S., Vercauteren, F., Verbauwhede, I.: Pushing the speed limit of constant-time discrete Gaussian sampling. A case study on the Falcon signature scheme. In: DAC 2019 (2019)
Klein, P.N.: Finding the closest lattice vector when it’s unusually close. In: Shmoys, D.B. (ed.) 11th SODA, pp. 937–941. ACM-SIAM, January 2000
Lyubashevsky, V.: Fiat-Shamir with aborts: applications to lattice and factoring-based signatures. In: Matsui, M. (ed.) ASIACRYPT 2009. LNCS, vol. 5912, pp. 598–616. Springer, Heidelberg (2009). https://doi.org/10.1007/978-3-642-10366-7_35
Lyubashevsky, V.: Lattice signatures without trapdoors. In: Pointcheval, D., Johansson, T. (eds.) EUROCRYPT 2012. LNCS, vol. 7237, pp. 738–755. Springer, Heidelberg (2012). https://doi.org/10.1007/978-3-642-29011-4_43
Lyubashevsky, V., et al.: CRYSTALS-DILITHIUM. Technical report, National Institute of Standards and Technology (2019). https://csrc.nist.gov/projects/post-quantum-cryptography/round-2-submissions
Lyubashevsky, V., Prest, T.: Quadratic time, linear space algorithms for Gram-Schmidt orthogonalization and Gaussian sampling in structured lattices. In: Oswald, E., Fischlin, M. (eds.) EUROCRYPT 2015, Part I. LNCS, vol. 9056, pp. 789–815. Springer, Heidelberg (2015). https://doi.org/10.1007/978-3-662-46800-5_30
McCarthy, S., Howe, J., Smyth, N., Brannigan, S., O’Neill, M.: BEARZ attack FALCON: implementation attacks with countermeasures on the FALCON signature scheme. In: Obaidat, M.S., Samarati, P. (eds.) SECRYPT, pp. 61–71 (2019)
McCarthy, S., Smyth, N., O’Sullivan, E.: A practical implementation of identity-based encryption over NTRU lattices. In: O’Neill, M. (ed.) IMACC 2017. LNCS, vol. 10655, pp. 227–246. Springer, Cham (2017). https://doi.org/10.1007/978-3-319-71045-7_12
Micciancio, D., Peikert, C.: Trapdoors for lattices: simpler, tighter, faster, smaller. In: Pointcheval, D., Johansson, T. (eds.) EUROCRYPT 2012. LNCS, vol. 7237, pp. 700–718. Springer, Heidelberg (2012). https://doi.org/10.1007/978-3-642-29011-4_41
Micciancio, D., Regev, O.: Worst-case to average-case reductions based on Gaussian measures. SIAM J. Comput. 37(1), 267–302 (2007)
Micciancio, D., Walter, M.: Gaussian sampling over the integers: efficient, generic, constant-time. In: Katz, J., Shacham, H. (eds.) CRYPTO 2017, Part II. LNCS, vol. 10402, pp. 455–485. Springer, Cham (2017). https://doi.org/10.1007/978-3-319-63715-0_16
Nguyen, P.Q., Regev, O.: Learning a parallelepiped: cryptanalysis of GGH and NTRU signatures. In: Vaudenay, S. (ed.) EUROCRYPT 2006. LNCS, vol. 4004, pp. 271–288. Springer, Heidelberg (2006). https://doi.org/10.1007/11761679_17
Oder, T., Speith, J., Höltgen, K., Güneysu, T.: Towards practical microcontroller implementation of the signature scheme Falcon. In: Ding, J., Steinwandt, R. (eds.) PQCrypto 2019. LNCS, vol. 11505, pp. 65–80. Springer, Cham (2019). https://doi.org/10.1007/978-3-030-25510-7_4
Peikert, C.: An efficient and parallel Gaussian sampler for lattices. In: Rabin, T. (ed.) CRYPTO 2010. LNCS, vol. 6223, pp. 80–97. Springer, Heidelberg (2010). https://doi.org/10.1007/978-3-642-14623-7_5
Pessl, P., Bruinderink, L.G., Yarom, Y.: To BLISS-B or not to be: attacking strongSwan’s implementation of post-quantum signatures. In: Thuraisingham, B.M., Evans, D., Malkin, T., Xu, D. (eds.) ACM CCS 2017, pp. 1843–1855. ACM Press, October/November 2017
Plantard, T., Sipasseuth, A., Dumondelle, C., Susilo, W.: DRS. Technical report, National Institute of Standards and Technology (2017). https://csrc.nist.gov/projects/post-quantum-cryptography/round-1-submissions
Pornin, T.: New Efficient, Constant-Time Implementations of Falcon, August 2019. https://falcon-sign.info/falcon-impl-20190802.pdf
Prest, T.: Proof-of-concept implementation of an identity-based encryption scheme over NTRU lattices (2014). https://github.com/tprest/Lattice-IBE
Prest, T., et al.: FALCON. Technical report, National Institute of Standards and Technology (2019). https://csrc.nist.gov/projects/post-quantum-cryptography/round-2-submissions
Prest, T., Ricosset, T., Rossi, M.: Simple, fast and constant-time Gaussian sampling over the integers for Falcon. In: Second PQC Standardization Conference (2019)
Stehlé, D., Steinfeld, R.: Making NTRU as secure as worst-case problems over ideal lattices. In: Paterson, K.G. (ed.) EUROCRYPT 2011. LNCS, vol. 6632, pp. 27–47. Springer, Heidelberg (2011). https://doi.org/10.1007/978-3-642-20465-4_4
Tibouchi, M., Wallet, A.: One bit is all it takes: a devastating timing attack on BLISS’s non-constant time sign flips. Cryptology ePrint Archive, Report 2019/898 (2019). https://eprint.iacr.org/2019/898
Yu, Y., Ducas, L.: Learning strikes again: the case of the DRS signature scheme. In: Peyrin, T., Galbraith, S. (eds.) ASIACRYPT 2018, Part II. LNCS, vol. 11273, pp. 525–543. Springer, Cham (2018). https://doi.org/10.1007/978-3-030-03329-3_18
Zhang, Z., Chen, C., Hoffstein, J., Whyte, W.: pqNTRUSign. Technical report, National Institute of Standards and Technology (2017). https://csrc.nist.gov/projects/post-quantum-cryptography/round-1-submissions
Zhao, R.K., Steinfeld, R., Sakzad, A.: FACCT: FAst, Compact, and Constant-Time Discrete Gaussian Sampler over Integers. IACR Cryptology ePrint Archive, report 2018/1234 (2018)
Acknowledgements
This work is supported by the European Union Horizon 2020 Research and Innovation Program, Grant 780701 (PROMETHEUS). It has also received French government support managed by the National Research Agency under the “Investing for the Future” program, through the national project RISQ P141580-2660001/DOS0044216 and the project TYREX granted by the CominLabs excellence laboratory with reference ANR-10-LABX-07-01.
© 2020 International Association for Cryptologic Research