Abstract
In this paper, we initiate the study of side-channel leakage in hash-and-sign lattice-based signatures, with particular emphasis on the two efficient implementations of the original GPV lattice-trapdoor paradigm for signatures, namely NIST second-round candidate Falcon and its simpler predecessor DLP. Both of these schemes implement the GPV signature scheme over NTRU lattices, achieving great speedups over the general lattice case. Our results are mainly threefold.
First, we identify a specific source of side-channel leakage in most implementations of those schemes, namely, the one-dimensional Gaussian sampling steps within lattice Gaussian sampling. It turns out that the implementations of these steps often leak the Gram–Schmidt norms of the secret lattice basis.
Second, we elucidate the link between this leakage and the secret key, by showing that the entire secret key can be efficiently reconstructed solely from those Gram–Schmidt norms. The result makes heavy use of the algebraic structure of the corresponding schemes, which work over a power-of-two cyclotomic field.
Third, we concretely demonstrate the side-channel attack against DLP (but not Falcon, due to the different structures of the two schemes). The challenge is that timing information only provides an approximation of the Gram–Schmidt norms, so our algebraic recovery technique needs to be combined with a pruned tree search in order to apply it to approximate values. Experimentally, we show that around \(2^{35}\) DLP traces are enough to reconstruct the entire key with good probability.
1 Introduction
Lattice-Based Signatures. Lattice-based cryptography has proved to be a versatile way of achieving a very wide range of cryptographic primitives with strong security guarantees that are also believed to hold in the post-quantum setting. For a while, it was largely confined to the realm of theoretical cryptography, mostly concerned with asymptotic efficiency, but it has made major strides towards practicality in recent years. Significant progress has been made in terms of practical constructions, refined concrete security estimates and fast implementations. As a result, lattice-based schemes are seen as strong contenders in the NIST post-quantum standardization process.
In terms of practical signature schemes in particular, lattice-based constructions broadly fit within either of two large frameworks: Fiat–Shamir type constructions on the one hand, and hash-and-sign constructions on the other.
Fiat–Shamir lattice-based signatures rely on a variant of the Fiat–Shamir paradigm [16] developed by Lyubashevsky, called “Fiat–Shamir with aborts” [31], which has proved particularly fruitful. It has given rise to numerous practically efficient schemes [2, 8, 23] including the two second-round NIST candidates Dilithium [10, 33] and qTESLA [5].
The hash-and-sign family has a longer history, dating back to Goldreich–Goldwasser–Halevi (GGH) signatures [22] as well as NTRUSign [24]. Those early proposals were shown to be insecure [12, 19, 21, 40], however, due to a statistical dependence between the distribution of signatures and the signing key. That issue was only overcome with the development of lattice trapdoors by Gentry, Peikert and Vaikuntanathan [20]. In the GPV scheme, signatures follow a distribution that is provably independent of the secret key (a discrete Gaussian supported on the public lattice), but which is hard to sample from without knowing a secret, short basis of the lattice. The scheme is quite attractive from a theoretical standpoint (for example, it is easier to establish QROM security for it than for Fiat–Shamir type schemes), but suffers from large keys and a potentially costly procedure for discrete Gaussian sampling over a lattice. Several follow-up works have then striven to improve its concrete efficiency [13, 34, 37, 42, 49], culminating in two main efficient and compact implementations: the scheme of Ducas, Lyubashevsky and Prest (DLP) [11], and its successor, NIST second-round candidate Falcon [47], both instantiated over NTRU lattices [24] in power-of-two cyclotomic fields. One can also mention NIST first-round candidates pqNTRUSign [52] and DRS [44] as members of this family, the latter of which actually fell prey to a clever statistical attack [51] in the spirit of those against GGH and NTRUSign.
Side-Channel Analysis of Lattice-Based Signatures. With the NIST post-quantum standardization process underway, it is crucial to investigate the security of lattice-based schemes not only in a purely algorithmic sense, but also with respect to implementation attacks, such as side-channel attacks. For lattice-based signatures constructed using the Fiat–Shamir paradigm, this problem has received a significant amount of attention in the literature, with numerous works [4, 6, 7, 14, 43, 50] pointing out vulnerabilities with respect to timing attacks, cache attacks, power analysis and other types of side channels. Those attacks have proved particularly devastating against schemes using discrete Gaussian sampling, such as the celebrated BLISS signature scheme [8]. In response, several countermeasures have also been proposed [27, 28, 39], some of them provably secure [3, 4], but the side-channel arms race does not appear to have subsided quite yet.
In contrast, the case of hash-and-sign lattice-based signatures, including DLP and Falcon, remains largely unexplored, despite concerns being raised regarding their vulnerability to side-channel attacks. For example, the NIST status report on first-round candidates, announcing the selection of Falcon to the second round, notes that “more work is needed to ensure that the signing algorithm is secure against side-channel attacks”. The relative lack of cryptanalytic works regarding these schemes can probably be attributed to the fact that the relationship between secret keys and the information that leaks through side channels is a lot more subtle than in the Fiat–Shamir setting.
Indeed, in Fiat–Shamir style schemes, the signing algorithm uses the secret key very directly (it is combined linearly with other elements to form the signature), and as a result, side-channel leakage on sensitive variables, like the random nonce, easily leads to key exposure. By comparison, the way the signing key is used in GPV type schemes is much less straightforward. The key is used to construct the trapdoor information for the lattice discrete Gaussian sampler; in the case of the samplers [13, 20, 30] used in GPV, DLP and Falcon, that information is essentially the Gram–Schmidt orthogonalization (GSO) of a matrix associated with the secret key. Moreover, due to the way that GSO matrix is used in the sampling algorithm, only a small amount of information about it is liable to leak through side channels, and how that small amount relates to the signing key is far from clear. To the best of our knowledge, neither the problem of identifying a clear side-channel leakage, nor that of relating such a leakage to the signing key, has been tackled in the literature so far.
Our Contributions. In this work, we initiate the study of how side-channel leakage impacts the security of hash-and-sign lattice-based signatures, focusing our attention on the two most notable practical schemes in that family, namely DLP and Falcon. Our contributions towards that goal are mainly threefold.
First, we identify a specific leakage of the implementations of both DLP and Falcon (at least in its original incarnation) with respect to timing side channels. As noted above, the lattice discrete Gaussian sampler used in signature generation relies on the Gram–Schmidt orthogonalization of a certain matrix associated with the secret key. Furthermore, the problem of sampling a discrete Gaussian distribution supported over the lattice is reduced to sampling one-dimensional discrete Gaussians with standard deviations computed from the norms of the rows of that GSO matrix. In particular, the one-dimensional sampler has to support varying standard deviations, which is not easy to do in constant time. Unsurprisingly, the target implementations both leak that standard deviation through timing side channels; specifically, they rely on rejection sampling, and the acceptance rate of the corresponding loop is directly related to the standard deviation. As a result, timing attacks will reveal the Gram–Schmidt norms of the matrix associated to the secret key (or rather, an approximation thereof, to a precision increasing with the number of available samples).
Second, we use algebraic number theoretic techniques to elucidate the link between those Gram–Schmidt norms and the secret key. In fact, we show that the secret key can be entirely reconstructed from the knowledge of those Gram–Schmidt norms (at least if they are known exactly), in a way which crucially relies on the algebraic structure of the corresponding lattices.
Since both DLP and Falcon work in an NTRU lattice, the signing key can be expressed as a pair (f, g) of small elements in a cyclotomic ring \(\mathcal R= \mathbb {Z}[\zeta ]\) (of power-of-two conductor, in the case of those schemes). The secret, short basis of the NTRU lattice is constructed by blocks from the multiplication matrices of f and g (and related elements F, G) in a certain basis of \(\mathcal R\) as a \(\mathbb {Z}\)-algebra (DLP uses the usual power basis, whereas Falcon uses the power basis in bit-reversed order; this apparently small difference interestingly plays a crucial role in this work). It is then easily seen that the Gram matrix of the first half of the lattice basis is essentially the multiplication matrix associated with the element \(u = f\bar{f}+g\bar{g}\), where the bar denotes the complex conjugation \(\bar{\zeta } = \zeta ^{-1}\). From that observation, we deduce that knowing the Gram–Schmidt norms of the lattice basis is essentially equivalent to knowing the leading principal minors of the multiplication matrix of u, which is a real, totally positive element of \(\mathcal R\).
We then give general efficient algorithms, both for the power basis (DLP case) and for the bit-reversed order power basis (Falcon case), which recover an arbitrary totally positive element u (up to a possible automorphism of the ambient field) given the leading principal minors of its multiplication matrix. The case of the power basis is relatively easy: we can actually recover the coefficients iteratively one by one, with each coefficient given as a solution of a quadratic equation over \(\mathbb {Q}\) depending only on the minors and the previous coefficients. The bit-reversed order power basis is more contrived, however; recovery is then carried out recursively, by reduction to the successive subfields of the power-of-two cyclotomic tower.
Finally, to complete the recovery, we need to deduce f and g from u. We show that this can be done using the public key \(h = g/f\bmod q\): we can use it to reconstruct both the relative norm \(f\bar{f}\) of f, and the ideal \((f)\subset \mathcal R\). That data can then be plugged into the Gentry–Szydlo algorithm [21] to obtain f in polynomial time, and hence g. Those steps, though simple, are also of independent interest, since they can be applied to the side-channel attack against BLISS described in [14], in order to get rid of the expensive factorization of an algebraic norm, and hence make the attack efficient for all keys (instead of a small percentage of weak keys as originally stated).
Our third contribution is to actually collect timing traces for the DLP scheme and mount the concrete key recovery. This is not an immediate consequence of the previous points, since our totally positive element recovery algorithm a priori requires the exact knowledge of the Gram–Schmidt norms, whereas side-channel leakage only provides approximations (and since some of the squared Gram–Schmidt norms are rational numbers of very large height, recovering them exactly would require an unrealistic number of traces). As a result, the recovery algorithm has to be combined with some pruned tree search in order to account for approximate inputs. In practice, for the larger parameters of DLP signatures (with a claimed security level of 192 bits), we manage to recover the key with good probability using \(2^{33}\) to \(2^{35}\) DLP timing traces.
Carrying out such an experiment in the Falcon setting, however, is left as a challenging open problem for further work. This is because adapting the bit-reversed order totally positive recovery algorithm to deal with approximate inputs appears to be much more difficult (instead of sieving integers whose square lies in some specified interval, one would need to find the cyclotomic integers whose square lies in some target set, which does not even look simple to describe).
The source code of the attack is available at https://github.com/yuyangcrypto/Key_Recovery_from_GSnorms.
Related Work. As noted above, the side-channel security of Fiat–Shamir lattice-based signatures has been studied extensively, including in [4, 6, 7, 14, 43, 50]. However, the only implementation attacks we are aware of against hash-and-sign schemes are fault analysis papers [15, 35]: side-channel attacks have not been described so far to the best of our knowledge.
Aside from the original implementations of DLP and Falcon, which are the focus of this paper, several others have appeared in the literature. However, they usually do not aim for side-channel security [36, 41] or only make the base discrete Gaussian sampler (with fixed standard deviation) constant time [29], but do not eliminate the leakage of the varying standard deviations. As a result, those implementations are also vulnerable to the attacks of this paper.
This is not the case, however, for Pornin’s very recent, updated implementation of Falcon, which uses a novel technique proposed by Prest, Ricosset and Rossi [48], combined with other recent results on constant-time rejection sampling for discrete Gaussian distributions [4, 53], in order to eliminate the timing leakage of the lattice discrete Gaussian sampler. This technique applies to discrete Gaussian sampling over \(\mathbb {Z}\) with varying standard deviations, when those deviations only take values in a small range. It is then possible to eliminate the dependence on the standard deviation in the rejection sampling by scaling the target distribution to match the acceptance rate of the maximal possible standard deviation. The small range ensures that the overhead of this countermeasure is relatively modest. Thanks to this countermeasure, we stress that the most recent official implementation of Falcon is already protected against the attacks of this paper. Nevertheless, we believe our results underscore the importance of applying such countermeasures.
Organization of the Paper. Following some preliminary material in Sect. 2, Sect. 3 is devoted to recalling some general facts about signature generation for hash-and-sign lattice-based schemes. Section 4 then gives a roadmap of our attack strategy, and provides some details about the final steps (how to deduce the secret key from the totally positive element \(u=f\bar{f}+g\bar{g}\)). Section 5 describes our main technical contribution: the algorithms that recover u from the Gram–Schmidt norms, both in the DLP and in the Falcon setting. Section 6 delves into the details of the side-channel leakage, showing how the implementations of the Gaussian samplers of DLP and Falcon do indeed reveal the Gram–Schmidt norms through timing side channels. Finally, Sect. 7 presents our concrete experiments against DLP, including the tree search strategy to accommodate approximate Gram–Schmidt norms and experimental results in terms of timing and number of traces.
Notation. We use bold lowercase letters for vectors and bold uppercase for matrices. The zero vector is \(\mathbf {0}\). We denote by \(\mathbb {N}\) the set of non-negative integers and by \(\log \) the natural logarithm. Vectors are in row form, and we write \(\mathbf {B}= (\mathbf {b}_0,\dotsc , \mathbf {b}_{n-1})\) to denote that \(\mathbf {b}_i\) is the i-th row of \(\mathbf {B}\). For a matrix \(\mathbf {B}\in \mathbb {R}^{n\times m}\), we denote by \(\mathbf {B}_{i,j}\) the entry in the i-th row and j-th column of \(\mathbf {B}\), where \(i\in \{0,\dotsc , n-1\}\) and \(j\in \{0,\dotsc ,m-1\}\). For \(I\subseteq [0,n), J \subseteq [0, m)\), we denote by \(\mathbf {B}_{I\times J}\) the submatrix \((\mathbf {B}_{i,j})_{i\in I,j\in J}\). In particular, we write \(\mathbf {B}_{I} = \mathbf {B}_{I\times I}\). Let \(\mathbf {B}^t\) denote the transpose of \(\mathbf {B}\).
Given \(\mathbf {u}= (u_0,\dotsc ,u_{n-1})\) and \(\mathbf {v}= (v_0,\dotsc ,v_{n-1})\), their inner product is \(\langle {\mathbf {u}, \mathbf {v}}\rangle = \sum _{i=0}^{n-1}u_iv_i\). The \(\ell _2\)-norm of \(\mathbf {v}\) is \(\Vert \mathbf {v}\Vert = \sqrt{\langle {\mathbf {v}, \mathbf {v}}\rangle }\) and the \(\ell _\infty \)-norm is \(\Vert \mathbf {v}\Vert _\infty = \max _i |v_i|\). The determinant of a square matrix \(\mathbf {B}\) is denoted by \(\det (\mathbf {B})\), so that \(\det \left( \mathbf {B}_{[0,i]}\right) \) is the i-th leading principal minor of \(\mathbf {B}\).
Let D be a distribution. We write \(z\hookleftarrow D\) when the random variable z is sampled from D, and denote by D(x) the probability that \(z=x\). The expectation of a random variable z is \(\mathbb {E}[z]\). We write \(\mathcal {N}(\mu , \sigma ^2)\) for the normal distribution of mean \(\mu \) and variance \(\sigma ^2\). We let U(S) be the uniform distribution over a finite set S. For a real-valued function f and any countable set S in the domain of f, we write \(f(S)= \sum _{x\in S} f(x)\).
2 Preliminaries
A lattice \(\mathcal {L}\) is a discrete additive subgroup of \(\mathbb {R}^m\). If it is generated by \(\mathbf {B}\in \mathbb {R}^{n\times m}\), we also write \(\mathcal {L}:= \mathcal {L}(\mathbf {B}) = \{{\mathbf {x}\mathbf {B}\mid \mathbf {x}\in \mathbb {Z}^n}\}\). If \(\mathbf {B}\) has full row rank, then we call \(\mathbf {B}\) a basis and n the rank of \(\mathcal {L}\).
2.1 Gram–Schmidt Orthogonalization
Let \(\mathbf {B}= (\mathbf {b}_0,\dotsc , \mathbf {b}_{n-1}) \in \mathbb {Q}^{n\times m}\) be of rank n. The Gram–Schmidt orthogonalization of \(\mathbf {B}\) is \(\mathbf {B}= \mathbf {L}\mathbf {B}^*\), where \(\mathbf {L}\in \mathbb {Q}^{n\times n}\) is lower-triangular with 1 on its diagonal and \(\mathbf {B}^* = (\mathbf {b}_0^*,\dotsc , \mathbf {b}_{n-1}^*)\) is a matrix with pairwise orthogonal rows. We call \(\Vert \mathbf {b}_i^*\Vert \) the i-th Gram–Schmidt norm of \(\mathbf {B}\), and let \(\Vert \mathbf {B}\Vert _{GS} = \max _i \Vert \mathbf {b}_i^*\Vert \).
The Gram matrix of \(\mathbf {B}\) is \(\mathbf {G}= \mathbf {B}\mathbf {B}^t\), and satisfies \(\mathbf {G}= \mathbf {L}\mathbf {D}\mathbf {L}^t\) where \(\mathbf {D}= \mathrm {diag}\left( \Vert \mathbf {b}_i^*\Vert ^2\right) \). This is also known as the Cholesky decomposition of \(\mathbf {G}\), and such a decomposition exists for any symmetric positive definite matrix. The next proposition follows from the triangular structure of \(\mathbf {L}\).
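As a quick numerical illustration of this relation (our own sketch, with illustrative variable names), the diagonal of the Cholesky factor of the Gram matrix consists exactly of the Gram–Schmidt norms:

```python
# Numerical check: for G = B B^t = L D L^t, the diagonal of the Cholesky
# factor of G consists of the Gram-Schmidt norms ||b_i*|| of B.
import numpy as np

B = np.array([[2., 1, 0, 0, 1, 0],
              [0, 3, 1, 0, 0, 1],
              [1, 0, 2, 1, 0, 0],
              [0, 1, 0, 3, 1, 1]])   # full row rank
G = B @ B.T
C = np.linalg.cholesky(G)            # G = C C^t, C lower triangular
gs_norms = np.diag(C)                # square roots of the LDL diagonal

# explicit Gram-Schmidt orthogonalization for comparison
Bstar = B.copy()
for i in range(B.shape[0]):
    for j in range(i):
        Bstar[i] -= (B[i] @ Bstar[j]) / (Bstar[j] @ Bstar[j]) * Bstar[j]
assert np.allclose(gs_norms, np.linalg.norm(Bstar, axis=1))
```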
Proposition 1
Let \(\mathbf {B}\in \mathbb {Q}^{n\times m}\) be of rank n and \(\mathbf {G}\) its Gram matrix. Then for every integer \(0\le k\le n-1\), we have \(\det \left( \mathbf {G}_{[0,k]}\right) = \prod _{i=0}^{k} \Vert \mathbf {b}_i^*\Vert ^2\).
Let \(\mathbf {M}= \begin{pmatrix} \mathbf {A} & \mathbf {B}\\ \mathbf {C} & \mathbf {D}\end{pmatrix}\), where \(\mathbf {A}\in \mathbb {R}^{n\times n}\), \(\mathbf {D}\in \mathbb {R}^{m\times m}\) are invertible matrices; then \(\mathbf {M}/ \mathbf {A}= \mathbf {D} - \mathbf {C}\mathbf {A}^{-1}\mathbf {B}\in \mathbb {R}^{m\times m}\) is called the Schur complement of \(\mathbf {A}\). It holds that
\[\det (\mathbf {M}) = \det (\mathbf {A})\cdot \det (\mathbf {M}/\mathbf {A}). \tag{1}\]
2.2 Parametric Statistics
Let \(D_p\) be some distribution determined by a parameter p. Let \(\mathbf {X} = (X_1,\dotsc ,X_n)\) be a vector of observed samples of \(X\hookleftarrow D_p\). The log-likelihood function with respect to \(\mathbf {X}\) is
\[\ell _{\mathbf {X}}(p) = \sum _{i=1}^{n} \log D_p(X_i).\]
Provided the log-likelihood function is bounded, a maximum likelihood estimator for samples \(\mathbf {X}\) is a real \(\text {MLE}(\mathbf {X})\) maximizing \(\ell _\mathbf {X}(p)\). The Fisher information is
\[\mathcal {I}(p) = -\mathbb {E}\left[ \frac{\partial ^2}{\partial p^2}\log D_p(X)\right] .\]
Seen as a random variable, it is known (e.g. [26, Theorem 6.4.2]) that \(\sqrt{n}(\text {MLE}(\mathbf {X}) - p)\) converges in distribution to \(\mathcal {N}(0, \mathcal {I}(p)^{-1})\). When the target distribution is geometric, maximum likelihood estimators and the Fisher information are well known. The second statement of the next lemma directly comes from a Gaussian tail bound.
Lemma 1
Let \(\text {Geo}_p\) denote a geometric distribution with parameter p, and \(\mathbf {X} = (X_1,\cdots ,X_n)\) be samples from \(\text {Geo}_p\). Then we have \(\text {MLE}(\mathbf {X}) = \frac{n}{\sum _{i=1}^n X_i}\) and \(\sqrt{n}(\text {MLE}(\mathbf {X}) - p)\) converges in distribution to \(\mathcal N(0, p^2(1-p))\). In particular, when n is large, then for any \(\alpha \ge 1\), we have \(|\text {MLE}(\mathbf {X}) - p| \le \alpha \cdot p\sqrt{\frac{1-p}{n}}\) except with probability at most \(2\exp (-\alpha ^2/2)\).
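To make the estimator and the tail bound of Lemma 1 concrete, here is a minimal numerical sketch (all names are illustrative, not taken from the attack code):

```python
# Estimate the parameter of a geometric distribution by its MLE and compare
# the observed error with the alpha * p * sqrt((1 - p)/n) bound of Lemma 1.
import math
import random

def sample_geo(p):
    """Number of Bernoulli(p) trials up to and including the first success."""
    k = 1
    while random.random() >= p:
        k += 1
    return k

p, n, alpha = 0.37, 200_000, 3.0
xs = [sample_geo(p) for _ in range(n)]
mle = n / sum(xs)                              # MLE(X) = n / sum(X_i)
bound = alpha * p * math.sqrt((1 - p) / n)     # fails w.p. <= 2 exp(-alpha^2/2)
print(f"MLE = {mle:.5f}, error = {abs(mle - p):.5f}, bound = {bound:.5f}")
```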
2.3 Discrete Gaussian Distributions
Let \(\rho _{\sigma ,\mathbf {c}}(\mathbf {x}) = \exp \left( -\frac{\Vert \mathbf {x}-\mathbf {c}\Vert ^2}{2\sigma ^2}\right) \) be the n-dimensional Gaussian function with center \(\mathbf {c}\in \mathbb {R}^n\) and standard deviation \(\sigma \). When \(\mathbf {c}=\mathbf {0}\), we just write \(\rho _{\sigma }(\mathbf {x})\). The discrete Gaussian over a lattice \(\mathcal {L}\) with center \(\mathbf {c}\) and standard deviation parameter \(\sigma \) is defined by the probability function
\[D_{\mathcal {L},\sigma ,\mathbf {c}}(\mathbf {x}) = \frac{\rho _{\sigma ,\mathbf {c}}(\mathbf {x})}{\rho _{\sigma ,\mathbf {c}}(\mathcal {L})} \quad \text {for all } \mathbf {x}\in \mathcal {L}.\]
In this work, the case \(\mathcal {L}= \mathbb {Z}\) is of particular interest. It is well known that \(\int _{-\infty }^{\infty } \rho _{\sigma , c}(x)\,\text {d}x = \sigma \sqrt{2\pi }\). Notice that \(D_{\mathbb {Z},\sigma ,c}\) is equivalent to \(i + D_{\mathbb {Z},\sigma ,c-i}\) for an arbitrary \(i \in \mathbb {Z}\), hence it suffices to consider the case where \(c \in [0,1)\). The half discrete integer Gaussian, denoted by \(D^+_{\mathbb {Z},\sigma ,c}\), is defined by
\[D^+_{\mathbb {Z},\sigma ,c}(x) = \frac{\rho _{\sigma ,c}(x)}{\rho _{\sigma ,c}(\mathbb {N})} \quad \text {for all } x\in \mathbb {N}.\]
We again omit the center when \(c=0\). For any \(\epsilon >0\), the (scaled)^{Footnote 1} smoothing parameter \(\eta _{\epsilon }'(\mathbb {Z})\) is the smallest \(s>0\) such that \(\rho _{1/(s\sqrt{2\pi })}(\mathbb {Z})\le 1+\epsilon \). In practice, \(\epsilon \) is very small, say \(2^{-50}\). The smoothing parameter makes it possible to quantify precisely how the discrete Gaussian differs from the standard Gaussian function.
Lemma 2
([38], implicit in Lemma 4.4). If \(\sigma \ge \eta _{\epsilon }'(\mathbb {Z})\), then \(\rho _\sigma (c + \mathbb {Z}) \in [\frac{1-\epsilon }{1+\epsilon }, 1]\cdot \rho _\sigma (\mathbb {Z})\) for any \(c \in [0,1)\).
Corollary 1
If \(\sigma \ge \eta _{\epsilon }'(\mathbb {Z})\), then \(\rho _\sigma (\mathbb {Z})\in [1, \frac{1+\epsilon }{1-\epsilon }]\cdot \sqrt{2\pi }\sigma \).
Proof
Notice that \(\int _{0}^{1} \rho _{\sigma }(\mathbb {Z}+c)\,\text {d}c = \int _{-\infty }^{\infty } \rho _{\sigma }(x)\,\text {d}x = \sqrt{2\pi }\sigma \); the proof is completed by Lemma 2. \(\square \)
2.4 Power-of-Two Cyclotomic Fields
For the rest of this article, we let \(n = 2^\ell \) for some integer \(\ell \ge 1\). We let \(\zeta _n\) be a primitive 2n-th root of unity. Then \(\mathcal K_n = \mathbb {Q}(\zeta _n)\) is the n-th power-of-two cyclotomic field, and comes together with its ring of algebraic integers \(\mathcal R_n = \mathbb {Z}[\zeta _n]\). It is also equipped with n field automorphisms forming the Galois group, which is commutative in this case. It can be seen that \(\mathcal K_{n/2}=\mathbb {Q}(\zeta _{n/2})\) is the subfield of \(\mathcal K_n\) fixed by the automorphism \(\sigma (\zeta _n)=-\zeta _n\) of \(\mathcal K_{n}\), as \(\zeta _n^2 = \zeta _{n/2}\). This leads to a tower of field extensions and their corresponding rings of integers
\[\mathbb {Q}= \mathcal K_1 \subseteq \mathcal K_2 \subseteq \dotsb \subseteq \mathcal K_n \quad \text {and}\quad \mathbb {Z}= \mathcal R_1 \subseteq \mathcal R_2 \subseteq \dotsb \subseteq \mathcal R_n.\]
Given the extension \(\mathcal K_n/\mathcal K_{n/2}\), the relative trace \(\mathrm {Tr}: \mathcal K_n\rightarrow \mathcal K_{n/2}\) is the \(\mathcal K_{n/2}\)-linear map given by \(\mathrm {Tr}(f) = f+\sigma (f)\). Similarly, the relative norm is the multiplicative map \(\mathrm {N}(f)=f\cdot \sigma (f) \in \mathcal K_{n/2}\). Both maps send integers in \(\mathcal K_{n}\) to integers in \(\mathcal K_{n/2}\). For all \(f \in \mathcal K_n\), it holds that \(f = (\mathrm {Tr}(f) + \zeta _n\mathrm {Tr}(\zeta _n^{-1}f))/2\).
We are also interested in the field automorphism \(\zeta _n\mapsto \zeta _n^{-1}=\bar{\zeta _n}\), which corresponds to the complex conjugation. We call adjoint the image \(\bar{f}\) of f under this automorphism. The fixed subfield \(\mathcal K^+_n := \mathbb {Q}(\zeta _n+\zeta _n^{-1})\) is known as the totally real subfield and contains the self-adjoint elements, that is, those such that \(f =\bar{f}\). Another way to describe self-adjoint elements is to say that all their complex embeddings^{Footnote 2} are in fact real. Elements whose embeddings are all positive are called totally positive elements, and we denote their set by \(\mathcal K^{++}_n\). A standard example of such an element is given by \(f\bar{f}\) for any nonzero \(f\in \mathcal K_n\). It is well known that the Galois automorphisms act as permutations of these embeddings, so that a totally positive element stays totally positive under the action of the Galois group.
Representation of Cyclotomic Numbers. We also have \(\mathcal K_n \simeq \mathbb {Q}[x]/(x^n+1)\) and \(\mathcal R_n \simeq \mathbb {Z}[x]/(x^n+1)\), so that elements in cyclotomic fields can be seen as polynomials. In this work, each \(f = \sum _{i=0}^{n-1} f_i\zeta _n^i\in \mathcal K_{n}\) is identified with its coefficient vector \((f_0,\cdots ,f_{n-1})\). Then the inner product of f and g is \(\langle {f, g}\rangle = \sum _{i=0}^{n-1}f_ig_i\), and we write \(\Vert f\Vert \), resp. \(\Vert f\Vert _\infty \), for the \(\ell _2\)-norm, resp. \(\ell _\infty \)-norm, of f. In this representation, it can be checked that \(\bar{f} = (f_0, -f_{n-1}, \dotsc , -f_1)\) and that \(\langle { f,gh}\rangle = \langle {f\bar{g}, h}\rangle \) for all \(f,g,h \in \mathcal K_n\). In particular, the constant coefficient of \(f\bar{g}\) is \(\langle {f, g}\rangle =\langle {f\bar{g}, 1}\rangle \). A self-adjoint element f has coefficients \((f_0, f_1, \dotsc , f_{n/2-1}, 0, -f_{n/2-1}, \dotsc , -f_1)\).
Elements in \(\mathcal K_n\) can also be represented by their matrix of multiplication in the basis \(1, \zeta _n, \dotsc , \zeta _n^{n-1}\). In other words, the map \(\mathcal A_{n}: \mathcal K_n \rightarrow \mathbb {Q}^{n\times n}\) defined by
\[\mathcal A_n(f) = \begin{pmatrix} f \\ \zeta _n f \\ \vdots \\ \zeta _n^{n-1} f \end{pmatrix}\]
(each row standing for the coefficient vector of the corresponding element)
is a ring isomorphism. We have \(fg=g\cdot \mathcal A_n(f)\). We can also see that \(\mathcal A_n(\bar{f}) = \mathcal A_n(f)^t\) which justifies the term “adjoint”. We deduce that the matrix of a selfadjoint element is symmetric. It can be observed that a totally positive element \(A \in \mathcal K_n\) corresponds to the symmetric positive definite matrix \(\mathcal A_n(A)\).
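The following small sketch (numpy; the helper names are ours) makes these identities concrete in \(\mathcal R_8\):

```python
# Check fg = g * A_n(f) and A_n(f-bar) = A_n(f)^t in Z[x]/(x^n + 1).
import numpy as np

def poly_mul_mod(f, g):
    """Product of f and g in Z[x]/(x^n + 1) (negacyclic convolution)."""
    n, res = len(f), [0] * len(f)
    for i, fi in enumerate(f):
        for j, gj in enumerate(g):
            if i + j < n:
                res[i + j] += fi * gj
            else:
                res[i + j - n] -= fi * gj      # x^n = -1
    return res

def A(f):
    """Multiplication matrix: row i is the coefficient vector of zeta^i * f."""
    rows, cur = [], list(f)
    for _ in range(len(f)):
        rows.append(list(cur))
        cur = [-cur[-1]] + cur[:-1]            # multiply by x mod x^n + 1
    return np.array(rows)

f = [1, 2, 0, -1, 3, 0, 1, -2]
g = [2, -1, 1, 0, 0, 1, -1, 3]
assert (np.array(g) @ A(f)).tolist() == poly_mul_mod(f, g)
fbar = [f[0]] + [-c for c in reversed(f[1:])]  # adjoint: zeta -> zeta^{-1}
assert (A(fbar) == A(f).T).all()
```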
For efficiency reasons, the scheme Falcon uses another representation corresponding to the tower structure. If \(f=(f_0,\dotsc , f_{n-1})\in \mathcal K_n\), we let \(f_e=\mathrm {Tr}(f)/2 = (f_0, f_2,\dotsc , f_{n-2})\) and \(f_o=\mathrm {Tr}(\zeta _n^{-1}f)/2 = (f_1, f_3, \dotsc , f_{n-1})\). Let \(\mathbf {P}_{n} \in \mathbb {Z}^{n\times n}\) be the permutation matrix corresponding to the bit-reversal order. We define \(\mathcal F_{n}(f) = \mathbf {P}_{n}\mathcal A_{n}(f)\mathbf {P}_{n}^t\). In particular, it is also symmetric positive definite when f is a totally positive element. As shown in [13], it holds that
\[\mathcal F_n(f) = \begin{pmatrix} \mathcal F_{n/2}(f_e) & \mathcal F_{n/2}(f_o)\\ \mathcal F_{n/2}(\zeta _{n/2}f_o) & \mathcal F_{n/2}(f_e) \end{pmatrix}. \tag{2}\]
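Equation (2) is easy to check numerically; the sketch below (numpy; illustrative, with the bit-reversal permutation built explicitly) verifies the four blocks for an element of \(\mathcal R_8\):

```python
# Verify F_n(f) = P_n A_n(f) P_n^t and the block identity (2) for n = 8.
import numpy as np

def A(f):
    rows, cur = [], list(f)
    for _ in range(len(f)):
        rows.append(list(cur))
        cur = [-cur[-1]] + cur[:-1]            # multiply by x mod x^n + 1
    return np.array(rows)

def P(n):
    """Bit-reversal permutation matrix on {0, ..., n-1}."""
    bits = n.bit_length() - 1
    M = np.zeros((n, n), dtype=int)
    for i in range(n):
        M[i, int(format(i, f'0{bits}b')[::-1], 2)] = 1
    return M

def F(f):
    return P(len(f)) @ A(f) @ P(len(f)).T

f = [1, -2, 3, 0, 5, -1, 0, 2]
m = len(f) // 2
fe, fo = f[0::2], f[1::2]                      # f_e = Tr(f)/2, f_o = Tr(zeta^{-1}f)/2
zfo = [-fo[-1]] + fo[:-1]                      # zeta_{n/2} * f_o in R_{n/2}
Fn = F(f)
assert (Fn[:m, :m] == F(fe)).all() and (Fn[:m, m:] == F(fo)).all()
assert (Fn[m:, :m] == F(zfo)).all() and (Fn[m:, m:] == F(fe)).all()
```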
2.5 NTRU Lattices
Given \(f,g \in \mathcal R_n\) such that f is invertible modulo some \(q \in \mathbb {Z}\), we let \(h=f^{-1}g \bmod q\). The NTRU lattice determined by h is \(\mathcal L_\text {NTRU}= \{ (u,v) \in \mathcal R_n^2\,:\, u+vh = 0 \bmod q\}\). Two bases of this lattice are of particular interest for cryptography:
\[\mathbf {B}_{h} = \begin{pmatrix} q & 0\\ -h & 1 \end{pmatrix} \quad \text {and}\quad \mathbf {B}_{f,g} = \begin{pmatrix} g & -f\\ G & -F \end{pmatrix},\]
where \(F,G \in \mathcal R_n\) are such that \(fG-gF = q\). Indeed, the former basis usually acts as the public key, while the latter is the secret key, also called the trapdoor basis, when f, g, F, G are short vectors. In practice, these matrices are represented using either the operator \(\mathcal A_n\) [11] or \(\mathcal F_n\) [47]:
\[\mathbf {B}^{\mathcal A}_{f,g} = \begin{pmatrix} \mathcal A_n(g) & -\mathcal A_n(f)\\ \mathcal A_n(G) & -\mathcal A_n(F) \end{pmatrix} \quad \text {and}\quad \mathbf {B}^{\mathcal F}_{f,g} = \begin{pmatrix} \mathcal F_n(g) & -\mathcal F_n(f)\\ \mathcal F_n(G) & -\mathcal F_n(F) \end{pmatrix}.\]
3 Hash-and-Sign over NTRU Lattices
Gentry, Peikert and Vaikuntanathan introduced in [20] a generic and provably secure hash-and-sign framework based on trapdoor sampling. This paradigm has since been instantiated over NTRU lattices, giving rise to practically efficient cryptosystems: the DLP [11] and Falcon [47] signature schemes.
In the NTRU-based hash-and-sign scheme, the secret key is a pair of short polynomials \((f, g) \in \mathcal R_n^2\) and the public key is \(h = f^{-1}g \bmod q\). The trapdoor basis \(\mathbf {B}_{f,g}\) (of \(\mathcal L_\text {NTRU}\)) derives from (f, g) by computing \(F,G\in \mathcal R_n\) such that \(fG-gF =q\). In both the DLP signature scheme and Falcon, the trapdoor basis has a bounded Gram–Schmidt norm, \(\Vert \mathbf {B}_{f,g}\Vert _{GS}\le 1.17\sqrt{q}\), for compact signatures.
The signing and verification procedure is described at a high level as follows: to sign a message, the signer hashes it to a point of \(\mathcal R_n/q\mathcal R_n\), uses the trapdoor basis to sample a nearby lattice point with a lattice Gaussian sampler, and outputs the (short) difference as the signature; the verifier checks that the signature is indeed short and consistent with the hashed message modulo q.
Lattice Gaussian samplers [20, 42] are nowadays a standard tool to generate signatures that are provably statistically independent of the secret basis. However, such samplers are also a notorious target for side-channel attacks. This work is no exception, and attacks non-constant-time implementations of the lattice Gaussian samplers at the heart of both DLP and Falcon, which are based on the KGPV sampler [30] or its ring variant [13]. Precisely, while previous attacks targeted Gaussians with public standard deviations, our attack learns the secret-dependent Gaussian standard deviations involved in the KGPV sampler.
3.1 The KGPV Sampler and Its Variant
The KGPV sampler is a randomized variant of Babai’s nearest plane algorithm [1]: instead of rounding each center to the closest integer, the KGPV sampler determines the integral coefficients according to some integer Gaussians. It is shown in [20] that under a certain smoothness condition, the algorithm outputs a sample from a distribution negligibly close to the target Gaussian. Its formal description is given in Algorithm 3.1.
Note that in the KGPV sampler (or its ring variant), the standard deviations of the integer Gaussians are inversely proportional to the Gram–Schmidt norms of the input basis. In the DLP scheme, \(\mathbf {B}\) is in fact the trapdoor basis \(\mathbf {B}^{\mathcal A}_{f,g} \in \mathbb {Z}^{2n\times 2n}\).
The Ducas–Prest Sampler. Falcon uses a variant of the KGPV algorithm which stems naturally from Ducas–Prest’s fast Fourier nearest plane algorithm [13]. It exploits the tower structure of power-of-two cyclotomic rings. Just like the KGPV sampler, the Ducas–Prest sampler fundamentally relies on integer Gaussian sampling to output Gaussian vectors. We omit its algorithmic description, as it is not needed in this work. Overall, what matters is to understand that the standard deviations of the involved integer Gaussians are also of the form \(\sigma _i = \sigma /\Vert \mathbf {b}_{i}^*\Vert \), but that \(\mathbf {B}= \mathbf {B}^{\mathcal F}_{f,g}\) in this context.
4 Side-Channel Attack Against Trapdoor Samplers: A Roadmap
Our algorithm proceeds as follows:

1. Side-channel leakage: extract the \(\Vert \mathbf {b}_i^*\Vert \)’s associated to \(\mathbf {B}_{f,g}^\mathcal A\), resp. \(\mathbf {B}_{f,g}^\mathcal F\), via the timing leakage of the integer Gaussian sampler in the DLP scheme, resp. Falcon.

2. Totally positive recovery: from the given \(\Vert \mathbf {b}_i^*\Vert \)’s, recover a Galois conjugate u of \(f\overline{f} + g\overline{g} \in \mathcal K^{++}_n\).

3. Final recovery: compute f from u and the public key \(h = g/f \bmod q\).
Steps 1 and 2 of the attack are the focus of Sects. 6 and 5 respectively. Below we describe how the third step is performed. First we recover the element \(f\overline{g}\), using the fact that it has small coefficients. More precisely, the \(j^{\text {th}}\) coefficient is \(\langle {f, \zeta _n^jg}\rangle \), where f and \(\zeta _n^jg\) are independent and identically distributed according to \(D_{\mathbb {Z}^n,r}\), with \(r=1.17\sqrt{\frac{q}{2n}}\). By [32, Lemma 4.3], we know that all these coefficients are of size much smaller than q/2 with high probability. Now, we can compute \(v = u\overline{h}(1+h\overline{h})^{-1} \bmod q\), where \(h = f^{-1}g\bmod q\) is the public verification key. We readily see that \(v = f\overline{g} \bmod q\) if and only if \(u = f\overline{f} + g\overline{g}\). If u is another conjugate of \(f\overline{f}+g\overline{g}\), then most likely the coefficients of v will look random in \((-q/2, q/2]\). This can mostly be interpreted as the NTRU assumption, that is, h being indistinguishable from a random element modulo q. When this happens, we just consider another conjugate of u, until we obtain a distinguishably small element, which must then be \(f\overline{g}\) (not just in reduction modulo q, but in fact over the integers).
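To see why this test isolates \(f\overline{g}\), note that when \(u = f\overline{f}+g\overline{g}\), working modulo q with \(h = g/f\):
\[v = u\overline{h}\,(1+h\overline{h})^{-1} = (f\overline{f}+g\overline{g})\cdot \frac{\overline{g}}{\overline{f}}\cdot \frac{f\overline{f}}{f\overline{f}+g\overline{g}} = f\overline{g} \pmod q,\]
since \(1+h\overline{h} = (f\overline{f}+g\overline{g})/(f\overline{f}) \bmod q\).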
Once this is done, we can then deduce the reduction modulo q of \(f\bar{f} \equiv f\bar{g} / \bar{h} \pmod q\), which again coincides with \(f\bar{f}\) over the integers with high probability (if we again lift elements of \(\mathbb {Z}_q\) to \((-q/2,q/2]\), except for the constant coefficient, which should be lifted positively). This boils down to the fact that with high probability \(f\overline{f}\) has its constant coefficient in (0, q) and the others in \((-q/2, q/2)\). Indeed, the constant coefficient of \(f\overline{f}\) is \(\Vert f\Vert ^2\), and the others are the \(\langle {f, \zeta _n^jf}\rangle \)’s with \(j\ge 1\). By a Gaussian tail bound, we can show \(\Vert f\Vert ^2\le q\) with high probability. As for the \(\langle {f, \zeta _n^jf}\rangle \)’s, despite the dependency between f and \(\zeta _n^jf\), we can still expect \(|\langle {f, \zeta _n^jf}\rangle | < q/2\) for all \(j\ge 1\) with high probability. We leave the details in the full version [17] for interested readers.
Next, we compute the ideal (f) from the knowledge of \(f\overline{f}\) and \(f\overline{g}\). Indeed, as f and g are coprime from the key generation algorithm, we directly have \((f) = (f\overline{f}) + (f\overline{g})\). At this point, we have obtained both the ideal (f) and the relative norm \(f\bar{f}\) of f on the totally real subfield. That data is exactly what we need to apply the Gentry–Szydlo algorithm [21], and finally recover f itself in polynomial time. Note furthermore that the practicality of the Gentry–Szydlo algorithm for the dimensions we consider (\(n=512\)) has been validated in previous work [14].
Comparison with Existing Method. As part of their side-channel analysis of the BLISS signature scheme, Espitau et al. [14] used the Howgrave-Graham–Szydlo algorithm to recover an NTRU secret f from \(f\overline{f}\). They successfully solved a small proportion \(({\approx } 7\%)\) of NTRU instances with \(n=512\) in practice. The Howgrave-Graham–Szydlo algorithm first recovers the ideal (f) and then calls the Gentry–Szydlo algorithm as we do above. The bottleneck of this method is its reliance on integer factorization for ideal recovery: the integers involved can become quite large for an arbitrary f, so that recovery cannot be done in classical polynomial time in general. This is why only a small proportion of instances can be solved in practice.
However, the technique we describe above bypasses this expensive factorization step by exploiting the arithmetic properties of the NTRU secret key. In particular, it is immediate to obtain a two-element description of (f), so that the Gentry–Szydlo algorithm can be run as soon as \(f\bar{f}\) and \(f\bar{g}\) are computed. This significantly improves the applicability and efficiency of Espitau et al.’s side-channel attack against BLISS [14]. The question of avoiding the reliance on the Gentry–Szydlo algorithm by using the knowledge of \(f\overline{g}\) and \(f\overline{f}\) remains open, however.
5 Recovering Totally Positive Elements
Totally positive elements in \(\mathcal K_n\) correspond to symmetric positive definite matrices with an inner structure coming from the algebra of the field. In particular, it is enough to know a single row of the matrix to recover the corresponding field element. Hence it can be expected that being given the diagonal part of the LDL decomposition also suffices to perform a recovery. In this section, we show that this is indeed the case, provided the diagonal is known exactly.
Recall on the one hand that the \(\mathcal A_n\) representation is a skew-circulant matrix, in which each diagonal consists of the same entries. On the other hand, the \(\mathcal F_n\) representation does not follow the circulant structure, but it is compatible with the tower of rings structure, i.e. its submatrices are the \(\mathcal F_{n/2}\) representations of elements in the subfield \(\mathcal K_{n/2}\). Each operator leads to a distinct approach, described in Sects. 5.1 and 5.2 respectively.
While the algorithms of this section can be used independently, they are naturally related to hash-and-sign over \(\text {NTRU}\) lattices. Let \(\mathbf {B}\) be a matrix representation of some secret key \((g,f)\), and \(\mathbf {G}= \mathbf {B}\mathbf {B}^t\). Then the diagonal part of \(\mathbf {G}\)’s LDL decomposition contains the \(\Vert \mathbf {b}_i^*\Vert \)’s, and \(\mathbf {G}\) is a matrix representation of \(f\overline{f} + g\overline{g} \in \mathcal K^{++}_n\). As illustrated in Sect. 4, the knowledge of \(u=f\overline{f} + g\overline{g}\) allows us to recover the secret key in polynomial time. Therefore the results in this section pave the way for a better use of secret Gram–Schmidt norms.
In practice however, we will obtain only approximations of the \(\Vert \mathbf {b}_i^*\Vert \)’s. The algorithms of this section must then be tweaked to handle the approximation error. The case of \(\mathcal A_n\) is dealt with in Sect. 7.1. While we do not solve the “approximate” case of \(\mathcal F_n\), we believe our “exact” algorithms to be of independent interest to the community.
5.1 Case of the Power Basis
The goal of this section is to obtain the next theorem. It involves the heuristic argument that certain rational quadratic equations always admit exactly one integer root, which corresponds to a coefficient of the recovered totally positive element. Experimentally, when it happens that there are two integer roots and the wrong one is chosen, the algorithm “fails” with overwhelming probability at the next step: the next discriminant does not lead to integer roots.
Theorem 1
Let \(u\in \mathcal R_n \cap \mathcal K^{++}_n\). Write \(\mathcal A_n(u) = \mathbf {L}\cdot \mathrm {diag}(\lambda _i)_i \cdot \mathbf {L}^t\). There is a (heuristic) algorithm \(\mathsf {Recovery}_\mathcal A\) that, given the \(\lambda _i\)’s, computes u or \(\sigma (u)\). It runs in \(\widetilde{O}(n^3\log \Vert u\Vert _\infty )\).
The complexity analysis is given in the full version [17]. In Sect. 7.2, a version tweaked to handle approximations of the \(\lambda _i\)’s is given, which may achieve quasi-quadratic complexity. It is in any case very efficient in practice, and it is used in our attack against the DLP scheme.
We now describe Algorithm 5.1. By Proposition 1, \(m_i := \prod _{j=0}^{i} \lambda _j = \det \left( \mathcal A_n(u)_{[0,i]}\right) \) is an integer, thus we take \(m_i\) instead of \(\lambda _i\) as input for integrality. It holds that \(u_0 = \det \left( \mathcal A_n(u)_{[0,0]}\right) = \lambda _0\). By the self-adjointness of u, we only need to consider the first n/2 coefficients. Let \(\mathbf {v}_i = (u_{i+1}, \dotsc , u_1)\). For any \(0 \le i < n/2-1\), we have
\[\mathcal A_n(u)_{[0,i+1]} = \begin{pmatrix} \mathcal A_n(u)_{[0,i]} & \mathbf {v}_i^t\\ \mathbf {v}_i & u_0 \end{pmatrix}.\]
By the definition of the Schur complement and Proposition 1, we see that
\[\frac{m_{i+1}}{m_i} = u_0 - \mathbf {v}_i\, \mathcal A_n(u)_{[0,i]}^{-1}\, \mathbf {v}_i^t,\]
where the left-hand side is actually \(\lambda _{i+1}\), and the right-hand side gives a quadratic equation in \(u_{i+1}\) with rational coefficients that can be computed from the knowledge of \((u_0,\dotsc , u_i)\). When \(i=0\), the equation is equivalent to \(\lambda _0\lambda _1 = u_0^2-u_1^2\): there are two candidates for \(u_1\), up to sign. Once \(u_1\) is chosen, for \(i\ge 1\), the quadratic equation has, with very high probability, a unique integer solution, i.e. the corresponding \(u_{i+1}\). This leads to Algorithm 5.1. Note that the sign of \(u_1\) determines whether the algorithm recovers u or \(\sigma (u)\). This comes from the fact that \(\mathcal A_n(u) = \mathrm {diag}((-1)^i)_{i\le n} \cdot \mathcal A_n(\sigma (u))\cdot \mathrm {diag}((-1)^i)_{i\le n}\).
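The following proof-of-concept sketch (our own, with exact integer minors as input and illustrative names) implements this coefficient-by-coefficient recovery. It exploits the fact that for self-adjoint u, \(\mathcal A_n(u)\) is the symmetric Toeplitz matrix \((u_{|j-k|})_{j,k}\), so each leading principal minor is a quadratic polynomial in the newest coefficient:

```python
# Proof-of-concept of Recovery_A with exact inputs.
from fractions import Fraction
from math import isqrt

def toeplitz_minor(coeffs, size):
    """Exact determinant of the size x size matrix (coeffs[|j-k|])."""
    M = [[Fraction(coeffs[abs(j - k)]) for k in range(size)] for j in range(size)]
    det = Fraction(1)
    for p in range(size):
        piv = next((r for r in range(p, size) if M[r][p] != 0), None)
        if piv is None:
            return Fraction(0)
        if piv != p:
            M[p], M[piv] = M[piv], M[p]
            det = -det
        det *= M[p][p]
        for r in range(p + 1, size):
            ratio = M[r][p] / M[p][p]
            for c in range(p, size):
                M[r][c] -= ratio * M[p][c]
    return det

def integer_roots(a, b, c):
    """Integer roots of a x^2 + b x + c = 0 with rational a != 0, b, c."""
    disc = b * b - 4 * a * c
    if disc < 0 or disc.denominator != 1 or isqrt(disc.numerator) ** 2 != disc.numerator:
        return []
    roots = {(-b + s * isqrt(disc.numerator)) / (2 * a) for s in (1, -1)}
    return sorted(int(r) for r in roots if r.denominator == 1)

def recovery_A(minors):
    """Recover (u_0, ..., u_{n/2-1}) from the minors m_i = prod_{j<=i} lambda_j."""
    u = [int(minors[0])]                         # u_0 = lambda_0 = m_0
    for i in range(1, len(minors)):
        # interpolate det(x) = a x^2 + b x + c at x = 0, 1, -1
        d0, d1, dm1 = (toeplitz_minor(u + [x], i + 1) for x in (0, 1, -1))
        a, b = (d1 + dm1) / 2 - d0, (d1 - dm1) / 2
        roots = integer_roots(a, b, d0 - minors[i])
        if i == 1:
            roots = [r for r in roots if r >= 0]  # sign choice: u vs sigma(u)
        assert roots, "wrong branch chosen earlier (heuristic failure)"
        u.append(roots[0])                        # heuristically unique for i >= 2
    return u
```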
5.2 Case of the Bit-Reversed Order Basis
In this section, we are given the diagonal part of the LDL decomposition \(\mathcal F_n(u)=\mathbf {L}'\mathrm {diag}(\lambda _i)\mathbf {L}'^t\), which rewrites as \((\mathbf {L}'^{-1}\mathbf {P}_n)\mathcal A_n(u)(\mathbf {L}'^{-1}\mathbf {P}_n)^t = \mathrm {diag}(\lambda _i)\). Since the triangular structure is shuffled by the bit-reversal representation, recovering u from the \(\lambda _i\)’s is not as straightforward as in the previous section. Nevertheless, the compatibility of the \(\mathcal F_n\) operator with the tower of extensions can be exploited. This gives a recursive approach that stems from natural identities between the trace and norm maps relative to the extension \(\mathcal K_n/\mathcal K_{n/2}\), crucially uses the self-adjointness and total positivity of u, and fundamentally relies on computing square roots in \(\mathcal R_n\).
Theorem 2
Let \(u\in \mathcal R_n \cap \mathcal K^{++}_n\). Write \(\mathcal F_n(u) =\mathbf {L}'\cdot \mathrm {diag}(\lambda _i)_i \cdot \mathbf {L}'^t\). There is a (heuristic) algorithm that, given the \(\lambda _i\)’s, computes a conjugate of u. It runs in \(\widetilde{O}(n^3\log \Vert u\Vert _\infty )\).
The recursiveness of the algorithm and its reliance on square roots will force it to always work “up to Galois conjugation”. In particular, at some point we will assume heuristically that only one of the conjugates of a value computed within the algorithm is in a given coset of the subgroup of relative norms in the quadratic subfield. Since that constraint only holds with negligible probability for random values, the heuristic is essentially always verified in practice. Recall that we showed in Sect. 4 how to recover the needed conjugate in practice by a distinguishing argument.
The rest of the section describes the algorithm, while the complexity analysis is presented in the full version [17]. First, we observe from
\[\overline{\mathrm {Tr}(f)} = \overline{f} + \overline{\sigma (f)} = \overline{f} + \sigma (\overline{f}) = \mathrm {Tr}(\overline{f}) \qquad (f \in \mathcal K_n)\]
that \(\mathrm {Tr}(u)\) is self-adjoint. The positivity of u implies that \(\mathrm {Tr}(u) \in \mathcal K^{++}_{n/2}\). From Eq. (2), we know that the n/2 first minors of \(\mathcal F_n(u)\) are the minors of \(\mathcal F_{n/2}(\mathrm {Tr}(u)/2)\). The identity above also shows that \(\mathrm {Tr}(\zeta _n^{-1}u)\) is a square root of the element \(\zeta _{n/2}^{-1}\mathrm {Tr}(\zeta _n^{-1}u)\overline{\mathrm {Tr}(\zeta _n^{-1}u)}\) in \(\mathcal K_{n/2}\). Thus, if we knew \(\mathrm {Tr}(\zeta _n^{-1}u)\overline{\mathrm {Tr}(\zeta _n^{-1}u)}\), we could reduce the problem of computing \(u\in \mathcal K_n\) to computations in \(\mathcal K_{n/2}\), more precisely, to recovering a totally positive element from “its minors” and a square root computation.
It turns out that \(\mathrm {Tr}(\zeta _n^{-1}u)\overline{\mathrm {Tr}(\zeta _n^{-1}u)}\) can be computed by going down the tower as well. One can see that
\[\mathrm {Tr}(\zeta _n^{-1}u)\overline{\mathrm {Tr}(\zeta _n^{-1}u)} = \mathrm {Tr}(u)^2 - 4\mathrm {N}(u), \tag{3}\]
where \(\mathrm {N}(u)\) is totally positive since u (and therefore \(\sigma (u)\)) is. This identity^{Footnote 3} can be thought of as a “number field version” of the \(\mathcal F_n\) representation. Indeed, recall that \(u_e = (1/2)\mathrm {Tr}(u)\) and \(u_o=(1/2)\mathrm {Tr}(\zeta _n^{-1}u)\). Then by the block determinant formula and the fact that \(\mathcal F_n\) is a ring isomorphism, we see that
\[\det \left( \mathcal F_n(u)\right) = \det \left( \mathcal F_{n/2}(u_e)\right) \cdot \det \left( \mathcal F_{n/2}\left( u_e - u_o\overline{u_o}\,u_e^{-1}\right) \right) = \det \left( \mathcal F_{n/2}(u_e)\right) \cdot \det \left( \mathcal F_{n/2}\left( \mathrm {N}(u)/u_e\right) \right) .\]
This strongly suggests a link between the successive minors of \(\mathcal F_n(u)\) and the element \(\mathrm {N}(u)\). The next lemma makes this relation precise, and essentially amounts to taking Schur complements in the above formula.
Lemma 3
Let \(u\in \mathcal K^{++}_n\) and \(\widehat{u} = \frac{2\mathrm {N}(u)}{\mathrm {Tr}(u)}\in \mathcal K^{++}_{n/2}\). Then for \(0< k< n/2\), we have
\[\det \left( \mathcal F_n(u)_{[0,\frac{n}{2}+k)}\right) = \det \left( \mathcal F_{n/2}(u_e)\right) \cdot \det \left( \mathcal F_{n/2}(\widehat{u})_{[0,k)}\right) .\]
Proof
Let \(\mathbf {G}= \mathcal F_{n}(u)\) and \(\mathbf {B}= \mathcal F_{n/2}(u_o)_{[0,\frac{n}{2}) \times [0,k)}\) in order to write
\[\mathbf {G}_{[0,\frac{n}{2}+k)} = \begin{pmatrix} \mathcal F_{n/2}(u_e) & \mathbf {B}\\ \mathbf {B}^t & \mathcal F_{n/2}(u_e)_{[0,k)} \end{pmatrix}\]
with \(\mathbf {B}^t=\mathcal F_{n/2}(\overline{u_o})_{[0,k) \times [0,\frac{n}{2})}\). Let \(\mathbf {S}= \mathbf {G}_{[0, \frac{n}{2} + k)}/\mathcal F_{n/2}(u_e) = \mathcal F_{n/2}(u_e)_{[0,k)} - \mathbf {B}^t\mathcal F_{n/2}(u_e)^{-1}\mathbf {B}\). Since \(\mathcal F_n\) is a ring isomorphism, a routine computation shows that \(\mathbf {S}= \mathcal F_{n/2}(\widehat{u})_{[0,k)}\). The proof follows from Eq. (1). \(\square \)
Lemma 3 tells us that knowing \(\mathrm {Tr}(u)\) and the principal minors of \(\mathcal F_n(u)\) is enough to recover those of \(\mathcal F_{n/2}(\widehat{u})\), so that the computations in \(\mathcal K_{n}\) are again reduced to computing a totally positive element in \(\mathcal K_{n/2}\) from its minors. Then from Eq. (3), we can obtain \(\mathrm {Tr}(\zeta _n^{-1}u)\overline{\mathrm {Tr}(\zeta _n^{-1}u)}\). The last step is then to compute a square root of \(\zeta _{n/2}^{-1}\mathrm {Tr}(\zeta _n^{-1}u)\overline{\mathrm {Tr}(\zeta _n^{-1}u)}\) in \(\mathcal K_{n/2}\) to recover \(\pm \mathrm {Tr}(\zeta _n^{-1}u)\). In particular, this step will lead to u or its conjugate \(\sigma (u)\). As observed above, this ultimately translates into recovering only a conjugate of u.
Lastly, when \(n=2\), that is, when we work in \(\mathbb {Q}(i)\), a totally positive element is in fact in \(\mathbb {Q}_+\). This leads to Algorithm 5.2, which is presented in the general context of \(\mathcal K_n\) to fit the description above, for the sake of simplicity. The algorithm \(\mathsf {TowerRoot}\) of Step 9 computes square roots in \(\mathcal K_n\) and a quasiquadratic version for integers is presented and analyzed in the next section.
The whole procedure constructs a binary tree, as illustrated in Fig. 1. The algorithm can be made to rely essentially only on algebraic integers, which also helps in analyzing its complexity. This gives the claim of Theorem 2 (see the full version [17] for details). At Step 6, the algorithm finds the (heuristically unique) conjugate \(\widehat{u}\) of \(\widetilde{u}\) such that \(\widehat{u}\cdot u^+\) is a relative norm (since we must have \(\widehat{u}\cdot u^+ = \mathrm {N}(u)\) by the above). In practice, in the integral version of this algorithm, we carry out this test not by checking for being a norm, but via an integrality test.
5.2.1 Computing Square Roots in Cyclotomic Towers
In this section, we focus on computing square roots of algebraic integers: given \(s = t^2 \in \mathcal R_n\), compute t. The reason for focusing on integers is that both our Algorithm 5.2 and practical applications deal only with algebraic integers. A previous approach was suggested in [25], relying on finding primes with a small splitting pattern in \(\mathcal R_n\), computing square roots in several finite fields and brute-forcing to find the correct candidate. A hassle in analyzing this approach is to first find a prime that is large enough compared to an arbitrary input, and that splits into, say, two factors in \(\mathcal R_n\). Omitting the cost of finding such a prime, this algorithm can be shown to run in \(\widetilde{O}(n^2(\log \Vert s\Vert _\infty )^2)\). Our recursive approach does not theoretically rely on finding a correct prime, and again exploits the tower structure to achieve the next claim.
Theorem 3
Given a square s in \(\mathcal R_n\), there is a deterministic algorithm that computes \(t \in \mathcal R_n\) such that \(t^2=s\) in time \(\widetilde{O}(n^2\log \Vert s\Vert _\infty )\).
Recall that the subfield \(\mathcal K_{n/2}\) is fixed by the automorphism \(\sigma (\zeta _n) = -\zeta _n\). For any element t in \(\mathcal R_n\), recall that \(t = \frac{1}{2}(\mathrm {Tr}(t) + \zeta _n\mathrm {Tr}(\zeta _n^{-1} t))\), where \(\mathrm {Tr}\) is the trace relative to this extension. We can also see that
\[\mathrm {Tr}(t)^2 = \mathrm {Tr}(s) + 2\mathrm {N}(t) \quad \text {and}\quad \mathrm {Tr}(\zeta _n^{-1}t)^2 = \zeta _{n/2}^{-1}\left( \mathrm {Tr}(s) - 2\mathrm {N}(t)\right) \tag{4}\]
for the relative norm \(\mathrm {N}(t) = t\cdot \sigma (t)\). Hence recovering \(\mathrm {Tr}(t)\) and \(\mathrm {Tr}(\zeta _n^{-1} t)\) can be done by computing the square roots of elements in \(\mathcal R_{n/2}\) determined by s and \(\mathrm {N}(t)\). The fact that \(\mathrm {N}(s) = \mathrm {N}(t)^2\) leads to Algorithm 5.3.
Notice that square roots are only known up to sign. This means that an algorithm exploiting the tower structure of fields must perform several sign checks to ensure that it lifts the correct root to the next extension. For our algorithm, we only need to check the sign of \(\mathrm {N}(t)\) (the signs of \(\mathrm {Tr}(t)\) and \(\mathrm {Tr}(\zeta _n^{-1} t)\) can be determined by checking whether their current values allow to recover s). This verification happens at Step 6 of Algorithm 5.3, where after computing the square root of \(\mathrm {N}(s)\), we know \((-1)^b\mathrm {N}(t)\) for some \(b\in \{0,1\}\). It relies on noticing that, by Eq. (4), \(T_b := \mathrm {Tr}(s)+2\cdot (-1)^b\mathrm {N}(t)\) is a square in \(\mathcal K_{n/2}\) if and only if \(b=0\), in which case \(T_b = \mathrm {Tr}(t)^2\). (Else, \(\zeta _n^{-2}T_b\) is the square \(\mathrm {Tr}(\zeta _n^{-1} t)^2\) in \(\mathcal K_{n/2}\).) This observation can be extended to a sign check that runs in \(\widetilde{O}(n\cdot \log \Vert s\Vert _\infty )\). The detailed analysis is given in the full version [17].
In practice, we can use the following approach: since n is small, we can easily precompute a prime integer p such that \(p-1\equiv n\bmod 2n\). For such a prime, there is a primitive \(n^{\text {th}}\) root \(\omega \) of unity in \(\mathbb {F}_p\), and such a root cannot be a square in \(\mathbb {F}_p\) (else 2n would divide \(p-1\)). Checking squareness then amounts to checking which of \(T_b(\omega )\) or \(\omega ^{-2}T_b(\omega )\) is a square \(\bmod \, p\) by computing a Legendre symbol. While we need such primes for any power of 2 that is smaller than n, in any case, this check is done in quasi-linear time. Compared to [25], the size of p here does not matter.
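For concreteness, here is a self-contained proof-of-concept of the recursive square root computation (our own sketch, not the reference implementation). For brevity it replaces the Legendre-symbol sign check of \(\mathsf {CheckSqr}\) by simply trying both signs of \(\mathrm {N}(t)\) and letting the wrong branch fail, which is enough for small examples but gives up the stated complexity bound:

```python
# Proof-of-concept of TowerRoot: recursive square roots in R_n = Z[x]/(x^n+1).
from math import isqrt

def negacyclic_mul(f, g):
    """Product in Z[x]/(x^n + 1) of two coefficient lists of length n."""
    n, res = len(f), [0] * len(f)
    for i, fi in enumerate(f):
        if fi:
            for j, gj in enumerate(g):
                if i + j < n:
                    res[i + j] += fi * gj
                else:
                    res[i + j - n] -= fi * gj
    return res

def sigma(f):
    """The automorphism zeta_n -> -zeta_n: negate odd coefficients."""
    return [(-c if i % 2 else c) for i, c in enumerate(f)]

def mul_zeta_inv(f):
    """Multiplication by zeta^{-1} = -zeta^{n-1}."""
    return f[1:] + [-f[0]]

def tower_root(s):
    """Return t with t^2 = s, or None if s is not a square in R_n."""
    n = len(s)
    if n == 1:                                    # base case: R_1 = Z
        if s[0] < 0:
            return None
        r = isqrt(s[0])
        return [r] if r * r == s[0] else None
    Ns = negacyclic_mul(s, sigma(s))[0::2]        # N(s) = N(t)^2 in R_{n/2}
    rN = tower_root(Ns)                           # rN = +-N(t)
    if rN is None:
        return None
    tr_s = [2 * c for c in s[0::2]]               # Tr(s) = 2 s_e
    for sg in (1, -1):                            # guess the sign of N(t)
        # Eq. (4): Tr(t)^2 = Tr(s)+2N(t), Tr(z^{-1}t)^2 = z^{-1}(Tr(s)-2N(t))
        te = tower_root([a + 2 * sg * b for a, b in zip(tr_s, rN)])
        if te is None:
            continue
        to = tower_root(mul_zeta_inv([a - 2 * sg * b for a, b in zip(tr_s, rN)]))
        if to is None:
            continue
        for s1 in (1, -1):                        # roots are known up to sign
            for s2 in (1, -1):
                t = [0] * n                       # t = (Tr(t)+zeta Tr(zeta^{-1}t))/2
                t[0::2] = [s1 * c // 2 for c in te]
                t[1::2] = [s2 * c // 2 for c in to]
                if negacyclic_mul(t, t) == s:
                    return t
    return None

t = [3, -1, 4, 1, -5, 9, -2, 6]                   # an element of R_8
r = tower_root(negacyclic_mul(t, t))
assert r == t or r == [-c for c in t]
```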
Let us denote by \(\mathsf {SQRT}(n, S)\) the complexity of Algorithm 5.3 for an input \(s \in \mathcal R_n\) with coefficients of size \(S = \log \Vert s\Vert _\infty \). Using e.g. FFT-based multiplication of polynomials, \(\mathrm {N}(s)\) can be computed in \(\widetilde{O}(n S)\), and has bitsize at most \(2S+\log n\). Recall that the so-called canonical embedding of any \(s\in \mathcal K_n\) is the vector \(\tau (s)\) of its evaluations at the roots of \(x^n+1\). It is well known that it satisfies \(\Vert \tau (s) \Vert = \sqrt{n}\Vert s\Vert \), so that \(\Vert \tau (s)\Vert _\infty \le n \Vert s\Vert _\infty \) by norm equivalence. If \(s=t^2\), we see that \(\Vert \tau (s)\Vert _\infty = \Vert \tau (t)\Vert _\infty ^2\). Using again norm equivalence, we obtain \(\Vert t\Vert _\infty \le \sqrt{n}\Vert s\Vert _\infty ^{1/2}\). In the case of \(\mathrm {N}(s) = \mathrm {N}(t)^2\), we obtain that \(\mathrm {N}(t)\) has size at most \(S+\log n\). The cost of \(\mathsf {CheckSqr}\) is at most \(\widetilde{O}(n S)\), so we obtain
\[\mathsf {SQRT}(n, S) \le \mathsf {SQRT}\left( \tfrac{n}{2},\, 2S+\log n\right) + 2\cdot \mathsf {SQRT}\left( \tfrac{n}{2},\, S+\log n\right) + \widetilde{O}(nS).\]
A tedious computation (see the full version [17] for details) gives us Theorem 3.
6 Side-Channel Leakage of the Gram–Schmidt Norms
Our algorithms in Sect. 5 rely on the knowledge of the exact Gram–Schmidt norms \(\Vert \mathbf {b}_i^*\Vert \). In this section, we show that in the original implementations of DLP and Falcon, approximations of the \(\Vert \mathbf {b}_i^*\Vert \)’s can be obtained by exploiting the leakage induced by non-constant-time rejection sampling.
In previous works targeting the rejection phase, the standard deviation of the sampler was a public constant. This work deals with a different situation, as both the centers and the standard deviations used by the samplers of DLP and Falcon are secret values determined by the secret key. These samplers output Gaussian vectors by relying on an integer Gaussian sampler, which performs rejection sampling. The secret standard deviation for the \(i^{\text {th}}\) integer Gaussian is computed as \(\sigma _i =\sigma /\Vert \mathbf {b}_i^*\Vert \) for some fixed \(\sigma \), so that exposure of the \(\sigma _i\)’s means exposure of the Gram–Schmidt norms. The idea of the attack stems from the simple observation that the acceptance rate of the sampler is essentially a linear function of its current \(\sigma _i\). In this section, we show how, by a timing attack, one may recover all acceptance rates from sufficiently many signatures by computing a well-chosen maximum likelihood estimator. Recovering approximations of the \(\Vert \mathbf {b}_i^*\Vert \)’s then follows straightforwardly.
6.1 Leakage in the DLP Scheme
We first target the Gaussian sampling in the original implementation [46], described in Algorithms 6.1 and 6.2. It samples “shifted” Gaussian integers by relying on three layers of Gaussian integer sampling with rejection. More precisely, the target Gaussian distribution at the “top” layer has a center which depends on secret data and varies during each call. To deal with the varying center, the “shifted” sample is generated by combining a zero-centered sampler and rejection sampling. Yet the zero-centered sampler has the same standard deviation as the “shifted” one, and that standard deviation depends on the secret key. At the “intermediate” layer, also by rejection sampling, the sampler rectifies a public zero-centered sample into a secret-dependent one.
At the “bottom” layer, the algorithm \(\mathsf {IntSampler}\) actually follows the BLISS sampler [8] that is already subject to side-channel attacks [7, 14, 43]. We stress again that our attack does not target this algorithm, so the reader can assume a constant-time version of it is used here. The weakness we are exploiting is a non-constant-time implementation of Algorithm 6.2 in the “intermediate” layer. We now describe how to actually approximate the \(\sigma _i\)’s using this leakage.
Let \(\widehat{\sigma } = \sqrt{\frac{1}{2\log (2)}}\) be the standard deviation of the Gaussian at the “bottom” layer and \(k_i = \lceil \frac{\sigma _i}{\widehat{\sigma }}\rceil \). It can be verified that the average acceptance probability of Algorithm 6.2 is \(AR(\sigma _i) = \frac{\rho _{\sigma _i}(\mathbb {Z})}{\rho _{k_i\widehat{\sigma }}(\mathbb {Z})}\). As required by the KGPV algorithm, we know that \(k_i\widehat{\sigma } \ge \sigma _i \ge \eta _{\epsilon }'(\mathbb {Z})\), and by Corollary 1 we have \(AR(\sigma _i) \in \frac{\sigma _i}{k_i\widehat{\sigma }}\cdot \left[ \frac{1-\epsilon }{1+\epsilon }, 1\right] \). Since \(\epsilon \) is very small in this context, we do not lose much by assuming that \(AR(\sigma _i) = \frac{\sigma _i}{k_i\widehat{\sigma }}\).
Next, for a given \(\sigma _i\), the number of trials before Algorithm 6.2 outputs its result follows a geometric distribution \(\text {Geo}_{AR(\sigma _i)}\). We let \(\overline{AR}_i\) be the maximum likelihood estimator for \(AR(\sigma _i)\) associated to N executions of the KGPV sampler, which we compute using Lemma 1. We now want to determine the \(k_i\)’s to compute \(\overline{\sigma _i} = k_i\widehat{\sigma }\,\overline{AR}_i\). Concretely, for the suggested parameters, we can set \(k_i = 3\) for all i at the beginning and measure \(\overline{AR}_i\). Because the first half of the \(\sigma _i\)’s lie in a small interval and increase slowly, it may be the case at some step that \(\overline{AR}_{i+1}\) is significantly smaller than \(\overline{AR}_{i}\) (say, \(1.1\cdot \overline{AR}_{i+1} < \overline{AR}_{i}\)). This means that \(k_{i+1} = k_i+1\), and we then increase by one all the subsequent \(k_{i}\)’s. This approach can be carried on until \(\overline{AR}_{n}\) is obtained, and works well in practice. Lastly, Lemma 1 tells us that for large enough \(\alpha \) and p, taking \(N\ge 2^{2(p+\log \alpha )}\) implies \(|\overline{\sigma }_i-\sigma _i|\le 2^{-p}\cdot \sigma _i\) for all \(0\le i< 2n\) with high probability.
From [11], the constant \(\sigma \) is publicly known. This gives us approximations \(\overline{b_i} = \frac{\sigma }{\overline{\sigma }_i}\), which we expect to match \(\Vert \mathbf {b}_i^*\Vert \) up to p bits of accuracy.
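A sketch of this estimation pipeline follows (the data layout and names are ours: for each index i, one list of rejection-loop iteration counts gathered from the timing traces):

```python
# Turn per-index rejection-loop counts into Gram-Schmidt norm estimates,
# including the heuristic detection of the k_i increments described above.
import math

SIGMA_HAT = math.sqrt(1 / (2 * math.log(2)))   # "bottom" layer deviation

def gram_schmidt_norms(counts, sigma):
    """counts[i]: iteration counts of Algorithm 6.2 observed for index i;
    sigma: the public constant of the scheme. Returns estimates of ||b_i*||."""
    k, norms, prev_ar = 3, [], None            # k_i = 3 for the first indices
    for c in counts:
        ar = len(c) / sum(c)                   # MLE of the acceptance rate
        if prev_ar is not None and 1.1 * ar < prev_ar:
            k += 1                             # sigma_i passed a multiple of sigma_hat
        sigma_i = k * SIGMA_HAT * ar           # AR(sigma_i) ~ sigma_i / (k_i sigma_hat)
        norms.append(sigma / sigma_i)          # ||b_i*|| = sigma / sigma_i
        prev_ar = ar
    return norms
```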
6.2 Leakage in the Falcon Scheme
We now describe how the original implementation of Falcon presents a similar leakage of Gram–Schmidt norms via timing sidechannels. In contrast to the previous section, the integer sampler of Falcon is based on one public halfGaussian sampler and some rejection sampling to reflect sensitive standard deviations and centers. The procedure is shown in Algorithm 6.3.
Our analysis does not target the half-Gaussian sampler \(D_{\mathbb {Z},\widehat{\sigma }}^{+}\), where \(\widehat{\sigma }=2\), so we omit its description. It can be implemented in a constant-time way [29], but this has no bearing on the leakage we describe.
We first consider \(c_i\) and \(\sigma _i\) to be fixed. Following Algorithm 6.3, we let \(p(z,b) = \exp \left( \frac{z^2}{2\widehat{\sigma }^2} - \frac{(b + (2b-1)z-c_i)^2}{2\sigma _i^2} \right) \) be the acceptance probability and note that \(\rho _{\widehat{\sigma }}(z)\cdot p(z,b) = \rho _{\sigma _i,c_i}(b+(2b-1)z)\).
Then, since \(b+(2b-1)z\) ranges over all of \(\mathbb {Z}\) as z ranges over \(\mathbb {N}\) and b over \(\{0,1\}\), the average acceptance probability for fixed \(c = c_i\) and \(\sigma _i\) satisfies
$$AR = \frac{\rho _{\sigma _i}(\mathbb {Z}-c)}{2\rho _{\widehat{\sigma }}^+(\mathbb {N})}.$$
As \(\widehat{\sigma } \ge \sigma _i \ge \eta _{\epsilon }'(\mathbb {Z})\) for a very small \(\epsilon \), we can again use Lemma 2 to see that \(\rho _{\sigma _i}(\mathbb {Z}-c) \approx \rho _{\sigma _i}(\mathbb {Z})\). This allows us to treat the average acceptance probability as a function \(AR(\sigma _i)\), independent of c. Using that \(2\rho _{\widehat{\sigma }}^+(\mathbb {N}) = \rho _{\widehat{\sigma }}(\mathbb {Z})+1\) combined with Corollary 1, we write \(AR(\sigma _i) = \frac{\sigma _i\sqrt{2\pi }}{1+2\sqrt{2\pi }}\). Then an application of Lemma 1 gives the number of traces needed to approximate \(\sigma _i\) up to a desired accuracy.
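Concretely, inverting this relation turns a measured acceptance rate directly into an estimate of \(\sigma _i\). A minimal sketch (our illustration, assuming the acceptance rate is measured from timing traces as in the DLP case):

import math

def falcon_sigma_from_ar(accepted, trials):
    """Estimate sigma_i from the measured acceptance rate of Algorithm 6.3,
    inverting AR(sigma_i) = sigma_i*sqrt(2*pi) / (1 + 2*sqrt(2*pi))."""
    ar = accepted / trials                     # geometric-distribution MLE
    return ar * (1 + 2 * math.sqrt(2 * math.pi)) / math.sqrt(2 * math.pi)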
7 Practical Attack Against the DLP Scheme
The methods of Sect. 6 inevitably come with measurement errors, which a practical attack has to take into account. In this section, we show that it is feasible to compute a totally positive element even from noisy diagonal coefficients of its LDL decomposition.
First, we adapt the algorithm \(\mathsf {Recovery}_\mathcal A\) (Algorithm 5.1) to noisy inputs in Sect. 7.1. Due to the noise, determining each coefficient now requires solving a quadratic inequality instead of an equation, and each such inequality may admit several candidate coefficients. Depending on whether candidates exist, the algorithm either extends the prefix, hopefully towards a valid solution, or eliminates it as wrong. The algorithm thus behaves as a tree search.
Then we detail in Sect. 7.2 some implementation techniques that accelerate the recovery algorithm in the context of the DLP scheme. While the algorithm is easy to follow, adapting it to the practical noisy case is not trivial.
Finally, we report experimental results in Sect. 7.3. In summary, given the full timing leakage of about \(2^{34}\) signatures, one can in practice break, with good probability, the DLP parameter set claimed to reach 192-bit security. We provide some theoretical support for this value in Sect. 7.4.
7.1 Totally Positive Recovery with Noisy Inputs
Section 5.1 sketched the exact recovery algorithm. To tackle measurement errors, we introduce a new parameter denoting the error bound. The modified algorithm proceeds in the same way: given a prefix \((A_0,\cdots , A_{l-1})\), it computes all possible \(A_l\)’s satisfying the error bound condition, and extends or eliminates the prefix according to whether it can lead to a valid solution. A formal algorithmic description is provided in Algorithm 7.1. For convenience, we use the (noisy) diagonal coefficients (i.e. the secret Gram–Schmidt norms) of the LDL decomposition as input; indeed, Proposition 1 has shown the equivalence between the diagonal part and the principal minors. In addition, we include the prefix in the input for ease of description. The initial prefix is \(\mathsf{prefix} = \overline{A_0} = \lfloor {\overline{d_0}}\rceil \). Clearly, the correct A must be in the final candidate list.
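The control flow can be sketched as follows. This is a simplified rendering, not Algorithm 7.1 itself; the callback extend stands for the step solving the quadratic inequality, a possible implementation of which is sketched in Sect. 7.2 below.

def recover(d_noisy, eps, extend, n):
    """Depth-first tree search over prefixes (A_0, ..., A_{l-1}).
    d_noisy: noisy squared Gram-Schmidt norms; eps: error bound;
    extend: callback (prefix, d_next, eps) -> integer candidates A_l.
    Returns all full-length candidates for A."""
    stack = [[round(d_noisy[0])]]        # initial prefix: A_0 = round(d_0)
    solutions = []
    while stack:
        prefix = stack.pop()
        if len(prefix) == n:             # a complete candidate survived
            solutions.append(prefix)
            continue
        for a in extend(prefix, d_noisy[len(prefix)], eps):
            stack.append(prefix + [a])   # wrong prefixes die out here
    return solutions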
7.2 Practical Tweaks in the DLP Setting
Targeting the DLP signature scheme, we implemented our side-channel attack. The following techniques significantly boost the practical performance of the recovery algorithm and reduce the number of required signatures.
Fast Computation of the Quadratic Equation. Exploiting the Toeplitz structure of \(\mathcal A_n(A)\), we propose a fast algorithm to compute the quadratic equation, i.e. \((Q_a, Q_b, Q_c)\), that requires only O(l) multiplications and additions. The idea is as follows. Let \(\mathbf {T}_i = \mathcal A_n(A)_{[0,i]}\), \(\mathbf {u}_i = (A_1,\cdots , A_i)\) and \(\mathbf {v}_i = (A_i,\cdots , A_1)\); then
$$\mathbf {T}_i = \begin{pmatrix} \mathbf {T}_{i-1} & \mathbf {v}_i^T \\ \mathbf {v}_i & A_0 \end{pmatrix}.$$
Let \(\mathbf {r}_i = \mathbf {v}_i\mathbf {T}_{i-1}^{-1}\), \(\mathbf {s}_i = \mathbf {u}_i\mathbf {T}_{i-1}^{-1}\) (which is the reverse of \(\mathbf {r}_i\)) and \(d_i = A_0 - \langle {\mathbf {v}_i, \mathbf {r}_i}\rangle = A_0 - \langle {\mathbf {u}_i, \mathbf {s}_i}\rangle \). A straightforward computation leads to
$$d_i = d_{i-1} - \frac{\left( A_i - \langle \mathbf {s}_{i-1}, \mathbf {v}_{i-1}\rangle \right) ^2}{d_{i-1}}.$$
Let \(f_i = \langle {\mathbf {r}_i, \mathbf {u}_i}\rangle = \langle {\mathbf {s}_i, \mathbf {v}_i}\rangle \); then the quadratic equation in \(A_i\) is
$$A_i^2 - 2f_{i-1}A_i + \left( f_{i-1}^2 + d_{i-1}(d_i - d_{i-1})\right) = 0.$$
Remark that \(d_i\) is the square of the last Gram–Schmidt norm. Since the input is \(\overline{d_i}\), a noisy version of \(d_i\), combining it with \( f_{i-1}, \mathbf {v}_{i-1}, \mathbf {r}_{i-1}\) determines all possible \(A_i\)’s. Once \(A_i\) is recovered, one can then compute \(\mathbf {r}_i\), \(\mathbf {s}_i\) according to
$$\mathbf {r}_i = \left( \frac{A_i - f_{i-1}}{d_{i-1}},\ \mathbf {r}_{i-1} - \frac{A_i - f_{i-1}}{d_{i-1}}\,\mathbf {s}_{i-1}\right) , \qquad \mathbf {s}_i = \mathrm {reverse}(\mathbf {r}_i),$$
and further compute \(d_{i}, f_{i}\). As the recovery algorithm starts with \(i=1\) (i.e. \(\mathsf{prefix} = A_0\)), we can compute the sequences \(\{{d_{i}}\}, \{{f_{i}}\}, \{{\mathbf {r}_i}\}, \{{\mathbf {s}_i}\}\) on the fly.
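Under the formulas above, one recursion step and the enumeration of candidate \(A_i\)'s from the noisy \(\overline{d_i}\) could look as follows (a minimal floating-point Python sketch; the function and variable names are ours). The recursion is initialized with \(\mathbf {r}_0 = \mathbf {s}_0 = ()\), \(d_0 = A_0\) and \(f_0 = 0\), so the first call with i = 1 reproduces \(\mathbf {r}_1 = (A_1/A_0)\).

import math

def levinson_step(A, r_prev, s_prev, d_prev, f_prev, i):
    """Given r_{i-1}, s_{i-1}, d_{i-1}, f_{i-1} and A = (A_0, ..., A_i),
    compute r_i, s_i, d_i, f_i in O(i) operations."""
    t = (A[i] - f_prev) / d_prev
    r = [t] + [rj - t * sj for rj, sj in zip(r_prev, s_prev)]
    s = r[::-1]                                   # s_i is the reverse of r_i
    d = d_prev - (A[i] - f_prev) ** 2 / d_prev    # squared Gram-Schmidt norm
    v = A[i:0:-1]                                 # v_i = (A_i, ..., A_1)
    f = sum(sj * vj for sj, vj in zip(s, v))      # f_i = <s_i, v_i>
    return r, s, d, f

def candidates(d_prev, f_prev, d_noisy, eps):
    """Integer candidates A_i allowed by the quadratic inequality
    |d_{i-1} - (A_i - f_{i-1})^2 / d_{i-1} - d_noisy| <= eps."""
    lo = max(d_prev * (d_prev - d_noisy - eps), 0.0)
    hi = d_prev * (d_prev - d_noisy + eps)
    if hi < 0:
        return []                                 # no real root: dead prefix
    out = set()
    for sign in (1, -1):                          # the two roots' intervals
        a, b = f_prev + sign * math.sqrt(lo), f_prev + sign * math.sqrt(hi)
        if a > b:
            a, b = b, a
        out.update(range(math.ceil(a), math.floor(b) + 1))
    return sorted(out)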
Remark 1
The input matrix is very well conditioned, so we can use a precision of only \(O(\log n)\) bits.
Remark 2
The above method implies an algorithm of complexity \(\widetilde{O}(n^2)\) for the exact case (Sect. 5.1).
Pruning. We expect that when a mistake is made in the prefix, the error committed in the Gram–Schmidt norms becomes larger. We therefore propose to prune a prefix when \(\sum _{k=i}^j e_k^2/\sigma _k^2\ge B_{j-i}\) for some i, j, where \(e_k\) is the difference between the measured k-th squared Gram–Schmidt norm and that of the prefix. The bound \(B_l\) is selected so that, for \(e_k\) a Gaussian of standard deviation \(\sigma _k\), the condition holds except with probability \(\tau /\sqrt{l}\). The failure probability \(\tau \) is geometrically decreased until the correct solution is found.
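Assuming the normalized errors \(e_k/\sigma _k\) are modeled as independent standard Gaussians, the partial sums follow a \(\chi ^2\) distribution with l degrees of freedom, so the bounds \(B_l\) can be read off its tail quantiles. A minimal sketch using SciPy:

from scipy.stats import chi2

def pruning_bounds(max_len, tau):
    """B_l such that a chi-squared variable with l degrees of freedom
    exceeds B_l with probability tau / sqrt(l)."""
    return {l: chi2.isf(tau / l ** 0.5, df=l) for l in range(1, max_len + 1)}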
Verifying Candidates. Let \(A = f\overline{f} + g\overline{g}\); then \(f\overline{f} = A(1+h\overline{h})^{-1}\bmod q\). As mentioned in Sect. 4, all coefficients of \(f\overline{f}\) except the constant one are much smaller than the modulus q. This can be used to check whether a candidate is correct. In addition, since both A(x) and \(A(-x)\) may appear in the final candidate list, we also check \(A(-x)(1+h(x)\overline{h}(x))^{-1}\) to ensure that the correct A(x) is not eliminated. Once either A(x) or \(A(-x)\) is found, we terminate the algorithm.
The Use of Symplecticity. As observed in [18], the trapdoor basis \(\mathbf {B}_{f,g}\) is q-symplectic, and thus \(\Vert \mathbf {b}_i^*\Vert \cdot \Vert \mathbf {b}_{2n-1-i}^*\Vert = q\). Based on this, we combine the samples of the i-th and \((2n-1-i)\)-th Gaussians to approximate \(\Vert \mathbf {b}_i^*\Vert \). This helps to refine the approximations and thus to reduce the number of signatures needed for a practical attack.
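One natural way to combine the two noisy estimates (our illustration, using a geometric-mean correction so that each refined pair satisfies the product constraint exactly):

import math

def refine_gs_estimates(est, q):
    """Refine noisy estimates of the ||b_i*||'s using
    ||b_i*|| * ||b_{2n-1-i}*|| = q."""
    m = len(est)
    out = list(est)
    for i in range(m // 2):
        j = m - 1 - i
        out[i] = math.sqrt(est[i] * q / est[j])  # geometric mean of both views
        out[j] = q / out[i]                      # enforce the product constraint
    return out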
7.3 Experimental Results
We validate the recovery algorithm on practical DLP instances. Experiments are conducted on the parameter set claimed to reach 192-bit security.
The leakage data we extracted is the number of iterations of the centered Gaussian samplings (Algorithm 6.2). To obtain it, we added some instrumentation to Prest’s C++ implementation [46]. The centered Gaussian samplings depend only on the secret key itself, not on the hashed message; hence, instead of executing the complete signing procedure, we only perform the centered Gaussian samplings. By sample size we mean the number of collected Gaussian samples. In fact, due to the rejection sampling in Algorithm 6.1, about N/2 signatures are required to generate N samples per centered Gaussian.
We tested our algorithm on ten instances, and the results are shown in Table 1. Producing the dataset of \(2^{36.5}\) samples for a given key took about 36 hours on our 48-core machine (two weeks for all 10 distinct keys).
In one instance, the recovery algorithm found millions of candidate solutions with Gram–Schmidt norms closer to the noisy ones than the correct solution, in the sense that they had a larger \(\tau \). This indicates that the recovery algorithm is relatively close to optimal.
7.4 Precision Required on the Gram–Schmidt Norms
We attempt here to give a closed formula for the number of samples needed. Recall that the relative error with respect to the (squared) Gram–Schmidt norm is \(\varTheta (1/\sqrt{N})\), where N is the number of samples.
A fast recovery corresponds to the case where only one root is close to an integer; in particular, increasing the new coefficient by one must change the Gram–Schmidt norm by \(\varOmega (1/\sqrt{N})\). This is not an equivalence because there is another root of the quadratic form, but we will assume this is enough.
Let \(b_1\) be the first row of \(\begin{pmatrix} \mathcal A_n(f)&\mathcal A_n(g)\end{pmatrix}\), and \(b_i\) the i-th row for \(i\ge 2\). We define \(pb_i\) as the projection of \(b_1\) orthogonally to \(b_2,\dots ,b_{i-1}\). We expect that \(\Vert pb_i\Vert \approx \sqrt{\frac{2n-i+2}{2n}}\Vert b_1\Vert \). Consider the Gram matrix of the family \(b_1,\dots ,b_{i-1},b_{i}\pm \frac{pb_i}{\Vert b_1\Vert ^2}\). We have indeed changed only the top-right/bottom-left coefficients by \(\pm 1\), besides the bottom-right coordinate. Clearly this does not change the i-th Gram–Schmidt vector; so the absolute change in the i-th Gram–Schmidt norm squared is
The Gram–Schmidt norm squared is roughly \(\Vert pb_i\Vert ^2\).
Getting only one solution at each step with constant probability corresponds to
(assuming the scalar product is distributed as a Gaussian) which means a total number of samples of
This gives roughly \(2^{29}\) samples, which is similar to what the search algorithm requires.
Getting only one solution at each step with probability \(11/n\) corresponds to
and \(N=\varTheta (n^3q^2)\). This would be \(2^{57}\) samples.
8 Conclusion and Future Work
In this paper, we have investigated the side-channel security of the two main efficient hash-and-sign lattice-based signature schemes: DLP and Falcon (focusing on their original implementations, although our results carry over to several later implementations as well). The two main takeaways of our analysis are that:
1. the Gram–Schmidt norms of the secret basis leak through timing side channels; and
2. knowing the Gram–Schmidt norms allows one to fully recover the secret key.
Interestingly, however, there is a slight mismatch between those two results: the side-channel leakage only provides approximate values of the Gram–Schmidt norms, whereas secret key recovery a priori requires exact values. We are able to bridge this gap in the case of DLP by combining the recovery algorithm with a pruned tree search. This lets us mount a concrete attack that recovers the key from \(2^{33}\) to \(2^{35}\) DLP traces in practice, for the high-security parameters of DLP (claiming 192 bits of security).
However, the gap remains in the case of Falcon: we do not know how to modify our recovery algorithm to deal with approximate inputs, and as a result we cannot apply it in a concrete attack. This is left as a challenging open problem for future work.
Also left for future work on the more theoretical side is the problem of giving an intrinsic description of our recovery algorithms in terms of algebraic quantities associated with the corresponding totally positive elements (or equivalently, to give an algebraic interpretation of the LDL decomposition for algebraically structured self-adjoint matrices). In particular, in the Falcon case, our approach shows that the Gram–Schmidt norms characterize the Galois conjugacy class of a totally positive element. This strongly suggests that they should admit a nice algebraic description, but it remains elusive for now.
The final recovery step in our attack, namely computing f from \(f\bar{f} + g\bar{g}\), heavily relies on the structure of NTRU lattices. Further investigation is needed to understand the impact of Gram–Schmidt norm leakage in hash-and-sign schemes over other lattices. For unstructured lattices, however, there appears to be a strong obstruction to at least a full key recovery attack, simply due to the dimension of the problem: there are only n Gram–Schmidt norms but \(O(n^2)\) secret coefficients to recover.
On a positive note, we finally recall that the problem of finding countermeasures against the leakage discussed in this paper is fortunately already solved, thanks to the recent work of Prest, Ricosset and Rossi [48]. That countermeasure has very recently been implemented in Falcon [45], so the leak can be considered patched. The overhead of the countermeasure is modest in the case of Falcon, thanks to the small range in which the possible standard deviations occur; however, it could become more costly for samplers that need to accommodate a wider range of standard deviations.
An alternate possible countermeasure could be to use Peikert’s convolution sampling [42] in preference to the KGPV approach, as it eliminates the need for varying standard deviations, and is easier to implement even without floating point arithmetic [9]. It does have the drawback of sampling wider Gaussians, however, and hence leads to less compact parameter choices.
Notes
1. The scaling factor is \((\sqrt{2\pi })^{-1}\) before the smoothing parameter \(\eta _{\epsilon }(\mathbb {Z})\) in [38].
2. Each root of \(x^n+1\) describes one complex embedding by means of evaluation.
3. This describes the discriminant of \(T^2-\mathrm {Tr}(u)T + \mathrm {N}(u)\), whose roots are u and \(\sigma (u)\) in \(\mathcal K_n\). It is then not surprising that \(\mathrm {Tr}(\zeta _n^{-1}u)\overline{\mathrm {Tr}(\zeta _n^{-1}u)}\) is a square only in \(\mathcal K_n\).
References
Babai, L.: On Lovász’ lattice reduction and the nearest lattice point problem. Combinatorica 6(1), 1–13 (1986)
Bai, S., Galbraith, S.D.: An improved compression technique for signatures based on learning with errors. In: Benaloh, J. (ed.) CT-RSA 2014. LNCS, vol. 8366, pp. 28–47. Springer, Cham (2014). https://doi.org/10.1007/978-3-319-04852-9_2
Barthe, G., et al.: Masking the GLP lattice-based signature scheme at any order. In: Nielsen, J.B., Rijmen, V. (eds.) EUROCRYPT 2018, Part II. LNCS, vol. 10821, pp. 354–384. Springer, Cham (2018). https://doi.org/10.1007/978-3-319-78375-8_12
Barthe, G., Belaïd, S., Espitau, T., Fouque, P.-A., Rossi, M., Tibouchi, M.: GALACTICS: Gaussian sampling for lattice-based constant-time implementation of cryptographic signatures, revisited. In: Cavallaro, L., Kinder, J., Wang, X., Katz, J. (eds.) ACM CCS 2019, pp. 2147–2164. ACM Press (2019)
Bindel, N., et al.: qTESLA. Technical report, National Institute of Standards and Technology (2019). https://csrc.nist.gov/projects/post-quantum-cryptography/round-2-submissions
Bootle, J., Delaplace, C., Espitau, T., Fouque, P.-A., Tibouchi, M.: LWE without modular reduction and improved side-channel attacks against BLISS. In: Peyrin, T., Galbraith, S. (eds.) ASIACRYPT 2018, Part I. LNCS, vol. 11272, pp. 494–524. Springer, Cham (2018). https://doi.org/10.1007/978-3-030-03326-2_17
Groot Bruinderink, L., Hülsing, A., Lange, T., Yarom, Y.: Flush, Gauss, and reload – a cache attack on the BLISS lattice-based signature scheme. In: Gierlichs, B., Poschmann, A.Y. (eds.) CHES 2016. LNCS, vol. 9813, pp. 323–345. Springer, Heidelberg (2016). https://doi.org/10.1007/978-3-662-53140-2_16
Ducas, L., Durmus, A., Lepoint, T., Lyubashevsky, V.: Lattice signatures and bimodal Gaussians. In: Canetti, R., Garay, J.A. (eds.) CRYPTO 2013, Part I. LNCS, vol. 8042, pp. 40–56. Springer, Heidelberg (2013). https://doi.org/10.1007/978-3-642-40041-4_3
Ducas, L., Galbraith, S., Prest, T., Yu, Y.: Integral matrix Gram root and lattice Gaussian sampling without floats. In: Canteaut, A., Ishai, Y. (eds.) EUROCRYPT 2020. LNCS, vol. 12107, pp. 608–637. Springer, Cham (2020)
Ducas, L., et al.: CRYSTALS-Dilithium: a lattice-based digital signature scheme. IACR TCHES 2018(1), 238–268 (2018). https://tches.iacr.org/index.php/TCHES/article/view/839
Ducas, L., Lyubashevsky, V., Prest, T.: Efficient identity-based encryption over NTRU lattices. In: Sarkar, P., Iwata, T. (eds.) ASIACRYPT 2014, Part II. LNCS, vol. 8874, pp. 22–41. Springer, Heidelberg (2014). https://doi.org/10.1007/978-3-662-45608-8_2
Ducas, L., Nguyen, P.Q.: Learning a zonotope and more: cryptanalysis of NTRUSign countermeasures. In: Wang, X., Sako, K. (eds.) ASIACRYPT 2012. LNCS, vol. 7658, pp. 433–450. Springer, Heidelberg (2012). https://doi.org/10.1007/978-3-642-34961-4_27
Ducas, L., Prest, T.: Fast Fourier orthogonalization. In: ISSAC, pp. 191–198 (2016)
Espitau, T., Fouque, P.-A., Gérard, B., Tibouchi, M.: Side-channel attacks on BLISS lattice-based signatures: exploiting branch tracing against strongSwan and electromagnetic emanations in microcontrollers. In: Thuraisingham, B.M., Evans, D., Malkin, T., Xu, D. (eds.) ACM CCS 2017, pp. 1857–1874. ACM Press, October/November 2017
Espitau, T., Fouque, P.-A., Gérard, B., Tibouchi, M.: Loop-abort faults on lattice-based signature schemes and key exchange protocols. IEEE Trans. Comput. 67(11), 1535–1549 (2018). https://doi.org/10.1109/TC.2018.2833119
Fiat, A., Shamir, A.: How to prove yourself: practical solutions to identification and signature problems. In: Odlyzko, A.M. (ed.) CRYPTO 1986. LNCS, vol. 263, pp. 186–194. Springer, Heidelberg (1987). https://doi.org/10.1007/3-540-47721-7_12
Fouque, P.-A., Kirchner, P., Tibouchi, M., Wallet, A., Yu, Y.: Key recovery from Gram–Schmidt norm leakage in hash-and-sign signatures over NTRU lattices. IACR Cryptology ePrint Archive, report 2019/1180 (2019)
Gama, N., Howgrave-Graham, N., Nguyen, P.Q.: Symplectic lattice reduction and NTRU. In: Vaudenay, S. (ed.) EUROCRYPT 2006. LNCS, vol. 4004, pp. 233–253. Springer, Heidelberg (2006). https://doi.org/10.1007/11761679_15
Gentry, C., Jonsson, J., Stern, J., Szydlo, M.: Cryptanalysis of the NTRU signature scheme (NSS) from Eurocrypt 2001. In: Boyd, C. (ed.) ASIACRYPT 2001. LNCS, vol. 2248, pp. 1–20. Springer, Heidelberg (2001). https://doi.org/10.1007/3-540-45682-1_1
Gentry, C., Peikert, C., Vaikuntanathan, V.: Trapdoors for hard lattices and new cryptographic constructions. In: Ladner, R.E., Dwork, C. (eds.) 40th ACM STOC, pp. 197–206. ACM Press, May 2008
Gentry, C., Szydlo, M.: Cryptanalysis of the revised NTRU signature scheme. In: Knudsen, L.R. (ed.) EUROCRYPT 2002. LNCS, vol. 2332, pp. 299–320. Springer, Heidelberg (2002). https://doi.org/10.1007/3-540-46035-7_20
Goldreich, O., Goldwasser, S., Halevi, S.: Public-key cryptosystems from lattice reduction problems. In: Kaliski Jr., B.S. (ed.) CRYPTO 1997. LNCS, vol. 1294, pp. 112–131. Springer, Heidelberg (1997). https://doi.org/10.1007/BFb0052231
Güneysu, T., Lyubashevsky, V., Pöppelmann, T.: Practical lattice-based cryptography: a signature scheme for embedded systems. In: Prouff, E., Schaumont, P. (eds.) CHES 2012. LNCS, vol. 7428, pp. 530–547. Springer, Heidelberg (2012). https://doi.org/10.1007/978-3-642-33027-8_31
Hoffstein, J., Howgrave-Graham, N., Pipher, J., Silverman, J.H., Whyte, W.: NTRUSign: digital signatures using the NTRU lattice. In: Joye, M. (ed.) CT-RSA 2003. LNCS, vol. 2612, pp. 122–140. Springer, Heidelberg (2003). https://doi.org/10.1007/3-540-36563-X_9
Hoffstein, J., Lieman, D., Silverman, J.H.: Polynomial rings and efficient public key authentication (1999)
Hogg, R.V., McKean, J.W., Craig, A.T.: Introduction to Mathematical Statistics, 8th edn. Pearson, London (2018)
Hülsing, A., Lange, T., Smeets, K.: Rounded Gaussians – fast and secure constant-time sampling for lattice-based crypto. In: Abdalla, M., Dahab, R. (eds.) PKC 2018, Part II. LNCS, vol. 10770, pp. 728–757. Springer, Cham (2018). https://doi.org/10.1007/978-3-319-76581-5_25
Karmakar, A., Roy, S.S., Reparaz, O., Vercauteren, F., Verbauwhede, I.: Constant-time discrete Gaussian sampling. IEEE Trans. Comput. 67(11), 1561–1571 (2018)
Karmakar, A., Roy, S.S., Vercauteren, F., Verbauwhede, I.: Pushing the speed limit of constant-time discrete Gaussian sampling. A case study on the Falcon signature scheme. In: DAC 2019 (2019)
Klein, P.N.: Finding the closest lattice vector when it’s unusually close. In: Shmoys, D.B. (ed.) 11th SODA, pp. 937–941. ACM-SIAM, January 2000
Lyubashevsky, V.: Fiat-Shamir with aborts: applications to lattice and factoring-based signatures. In: Matsui, M. (ed.) ASIACRYPT 2009. LNCS, vol. 5912, pp. 598–616. Springer, Heidelberg (2009). https://doi.org/10.1007/978-3-642-10366-7_35
Lyubashevsky, V.: Lattice signatures without trapdoors. In: Pointcheval, D., Johansson, T. (eds.) EUROCRYPT 2012. LNCS, vol. 7237, pp. 738–755. Springer, Heidelberg (2012). https://doi.org/10.1007/978-3-642-29011-4_43
Lyubashevsky, V., et al.: CRYSTALS-DILITHIUM. Technical report, National Institute of Standards and Technology (2019). https://csrc.nist.gov/projects/post-quantum-cryptography/round-2-submissions
Lyubashevsky, V., Prest, T.: Quadratic time, linear space algorithms for Gram–Schmidt orthogonalization and Gaussian sampling in structured lattices. In: Oswald, E., Fischlin, M. (eds.) EUROCRYPT 2015, Part I. LNCS, vol. 9056, pp. 789–815. Springer, Heidelberg (2015). https://doi.org/10.1007/978-3-662-46800-5_30
McCarthy, S., Howe, J., Smyth, N., Brannigan, S., O’Neill, M.: BEARZ attack FALCON: implementation attacks with countermeasures on the FALCON signature scheme. In: Obaidat, M.S., Samarati, P. (eds.) SECRYPT, pp. 61–71 (2019)
McCarthy, S., Smyth, N., O’Sullivan, E.: A practical implementation of identity-based encryption over NTRU lattices. In: O’Neill, M. (ed.) IMACC 2017. LNCS, vol. 10655, pp. 227–246. Springer, Cham (2017). https://doi.org/10.1007/978-3-319-71045-7_12
Micciancio, D., Peikert, C.: Trapdoors for lattices: simpler, tighter, faster, smaller. In: Pointcheval, D., Johansson, T. (eds.) EUROCRYPT 2012. LNCS, vol. 7237, pp. 700–718. Springer, Heidelberg (2012). https://doi.org/10.1007/978-3-642-29011-4_41
Micciancio, D., Regev, O.: Worst-case to average-case reductions based on Gaussian measures. SIAM J. Comput. 37(1), 267–302 (2007)
Micciancio, D., Walter, M.: Gaussian sampling over the integers: efficient, generic, constant-time. In: Katz, J., Shacham, H. (eds.) CRYPTO 2017, Part II. LNCS, vol. 10402, pp. 455–485. Springer, Cham (2017). https://doi.org/10.1007/978-3-319-63715-0_16
Nguyen, P.Q., Regev, O.: Learning a parallelepiped: cryptanalysis of GGH and NTRU signatures. In: Vaudenay, S. (ed.) EUROCRYPT 2006. LNCS, vol. 4004, pp. 271–288. Springer, Heidelberg (2006). https://doi.org/10.1007/11761679_17
Oder, T., Speith, J., Höltgen, K., Güneysu, T.: Towards practical microcontroller implementation of the signature scheme Falcon. In: Ding, J., Steinwandt, R. (eds.) PQCrypto 2019. LNCS, vol. 11505, pp. 65–80. Springer, Cham (2019). https://doi.org/10.1007/978-3-030-25510-7_4
Peikert, C.: An efficient and parallel Gaussian sampler for lattices. In: Rabin, T. (ed.) CRYPTO 2010. LNCS, vol. 6223, pp. 80–97. Springer, Heidelberg (2010). https://doi.org/10.1007/978-3-642-14623-7_5
Pessl, P., Bruinderink, L.G., Yarom, Y.: To BLISS-B or not to be: attacking strongSwan’s implementation of post-quantum signatures. In: Thuraisingham, B.M., Evans, D., Malkin, T., Xu, D. (eds.) ACM CCS 2017, pp. 1843–1855. ACM Press, October/November 2017
Plantard, T., Sipasseuth, A., Dumondelle, C., Susilo, W.: DRS. Technical report, National Institute of Standards and Technology (2017). https://csrc.nist.gov/projects/post-quantum-cryptography/round-1-submissions
Pornin, T.: New Efficient, Constant-Time Implementations of Falcon, August 2019. https://falcon-sign.info/falcon-impl-20190802.pdf
Prest, T.: Proof-of-concept implementation of an identity-based encryption scheme over NTRU lattices (2014). https://github.com/tprest/Lattice-IBE
Prest, T., et al.: FALCON. Technical report, National Institute of Standards and Technology (2019). https://csrc.nist.gov/projects/post-quantum-cryptography/round-2-submissions
Prest, T., Ricosset, T., Rossi, M.: Simple, fast and constant-time Gaussian sampling over the integers for Falcon. In: Second PQC Standardization Conference (2019)
Stehlé, D., Steinfeld, R.: Making NTRU as secure as worst-case problems over ideal lattices. In: Paterson, K.G. (ed.) EUROCRYPT 2011. LNCS, vol. 6632, pp. 27–47. Springer, Heidelberg (2011). https://doi.org/10.1007/978-3-642-20465-4_4
Tibouchi, M., Wallet, A.: One bit is all it takes: a devastating timing attack on BLISS’s non-constant-time sign flips. Cryptology ePrint Archive, Report 2019/898 (2019). https://eprint.iacr.org/2019/898
Yu, Y., Ducas, L.: Learning strikes again: the case of the DRS signature scheme. In: Peyrin, T., Galbraith, S. (eds.) ASIACRYPT 2018, Part II. LNCS, vol. 11273, pp. 525–543. Springer, Cham (2018). https://doi.org/10.1007/978-3-030-03329-3_18
Zhang, Z., Chen, C., Hoffstein, J., Whyte, W.: pqNTRUSign. Technical report, National Institute of Standards and Technology (2017). https://csrc.nist.gov/projects/post-quantum-cryptography/round-1-submissions
Zhao, R.K., Steinfeld, R., Sakzad, A.: FACCT: fast, compact, and constant-time discrete Gaussian sampler over integers. IACR Cryptology ePrint Archive, report 2018/1234 (2018)
Acknowledgements
This work is supported by the European Union Horizon 2020 Research and Innovation Program, Grant 780701 (PROMETHEUS). It has also received French government support managed by the National Research Agency in the “Investing for the Future” program, under the national project RISQ P1415802660001/DOS0044216, and under the project TYREX granted by the CominLabs excellence laboratory with reference ANR-10-LABX-07-01.
© 2020 International Association for Cryptologic Research