
1 Introduction

1.1 Learning with Rounding

The learning with rounding (LWR) problem, introduced by Banerjee, Peikert, and Rosen [BPR12], concerns the cryptographic properties of the function \(f_{\mathbf{s}}:{\mathbb Z}_q^n \rightarrow {\mathbb Z}_p\) given by

$$\begin{aligned} f_{\mathbf{s}}(\mathbf{x}) = \left\lfloor \langle \mathbf{x}, \mathbf{s}\rangle \right\rceil _p = \left\lfloor (p/q) \cdot \langle \mathbf{x}, \mathbf{s}\rangle \right\rceil \end{aligned}$$

where \(\mathbf{s} \in {\mathbb Z}_q^n\) is a secret key, \(\langle \mathbf{x}, \mathbf{s}\rangle \) is the inner product of \(\mathbf{x}\) and \(\mathbf{s}\) mod q, and \(\left\lfloor \cdot \right\rceil \) denotes rounding to the closest integer. In this work we are interested in the algorithmic hardness of the tasks of learning the secret \(\mathbf{s}\) and of distinguishing \(f_\mathbf{s}\) from a random function given uniform and independent samples of the form \((\mathbf{x}, f_\mathbf{s}(\mathbf{x}))\).

Learning with rounding was proposed as a deterministic variant of the learning with errors (LWE) problem [Reg05]. In this problem \(f_{\mathbf{s}}\) is replaced by the randomized function \(g_{\mathbf{s}}:{\mathbb Z}_q^n \rightarrow {\mathbb Z}_q\) given by \(g_{\mathbf{s}}(\mathbf{x}) = \langle \mathbf{x}, \mathbf{s}\rangle + e\), where e is sampled from some error distribution over \({\mathbb Z}_q\) independently for every input \(\mathbf{x} \in {\mathbb Z}_q^n\).
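For concreteness, here is a minimal Python sketch of both sample generators. The parameters n, q, p and the Gaussian noise width are illustrative choices for this sketch, not values prescribed by the text.

```python
import random

n, q, p = 8, 1024, 16                         # illustrative parameters (p divides q)
s = [random.randrange(q) for _ in range(n)]   # secret s in Z_q^n

def round_p(y):
    """The rounding function |y]_p = round((p/q) * y) mod p, ties rounded up."""
    return ((p * y + q // 2) // q) % p

def lwr_sample():
    """One LWR sample (x, f_s(x)) for uniform x in Z_q^n."""
    x = [random.randrange(q) for _ in range(n)]
    ip = sum(xi * si for xi, si in zip(x, s)) % q
    return x, round_p(ip)

def lwe_sample(sigma=3.0):
    """One LWE sample (x, <x,s> + e); e is a rounded Gaussian for illustration."""
    x = [random.randrange(q) for _ in range(n)]
    ip = sum(xi * si for xi, si in zip(x, s)) % q
    e = round(random.gauss(0, sigma))
    return x, (ip + e) % q
```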

In spite of the superficial similarities between the two problems, the cryptographic hardness of LWE is much better understood. Extending works of Regev [Reg05], Peikert [Pei09], and others, Brakerski et al. [BLP+13] gave a polynomial-time reduction from finding an approximate shortest vector in an arbitrary lattice to the task of distinguishing \(g_{\mathbf{s}}\) from a random function given access to uniform and independent samples \((\mathbf{x}, g_{\mathbf{s}}(\mathbf{x}))\) when e is drawn from the discrete Gaussian distribution of sufficiently large standard deviation. Their reduction is versatile in two important aspects. First, it is meaningful for any modulus q that exceeds the standard deviation of the noise. Second, it does not assume a bound on the number of samples given to the distinguisher.

In contrast, the hardness of the learning with rounding problem has only been established for restricted settings of the parameters. In their work, Banerjee, Peikert, and Rosen show that if \(f_{\mathbf{s}}\) can be efficiently distinguished from a random function given m random samples with advantage \(\delta \), then so can \(g_{\mathbf{s}}\) with advantage \(\delta - O(m B p / q)\), where the noise e is supported on the range of integers \(\{-B, \dots , B\}\) modulo q. From here one can conclude the hardness of distinguishing \(f_{\mathbf{s}}\) from a random function given m random samples assuming the hardness of learning with errors, but only when the modulus q is superpolynomially larger than mBp, and in particular superpolynomial in the security parameter.

Alwen et al. [AKPW13] give a reduction from LWE to the same problem assuming that \(q_{\mathrm {max}}\) is at least as large as 2n m B p and \(q_{\mathrm {max}}^2\) does not divide q, where \(q_{\mathrm {max}}\) is the largest prime divisor of q. This reduction can be meaningful even for values of q that are polynomially related to the security parameter. For example, when q is a prime number then the improvement over the reduction of Banerjee, Peikert, and Rosen is substantial.

However, the result of Alwen et al. does not apply to all (sufficiently large) values of the modulus q. For example, it does not cover values of q that are powers of two. In this case the rounding function is particularly natural, as it outputs the \(\log p\) most significant bits of \(\langle \mathbf{x}, \mathbf{s}\rangle \) in binary representation. Moreover, rounding with a small prime q (which p cannot divide) necessarily introduces noticeable bias, consequently requiring some form of deterministic extraction. Finally, the work of Alwen et al. does not include a treatment of the significantly more efficient ring variant of LWR.

1.2 Our Results

We establish the cryptographic hardness of the function \(f_{\mathbf{s}}\) in the following three settings:

  • One-Wayness: In Theorem 1 in Sect. 2 we show that any algorithm that recovers the secret \(\mathbf{s}\) from m independent random samples of the form \((\mathbf{x}, f_{\mathbf{s}}(\mathbf{x}))\) with probability at least \(\varepsilon \) also recovers the secret \(\mathbf{s}\) from m independent random samples of the form \((\mathbf{x}, \left\lfloor g_{\mathbf{s}}(\mathbf{x}) \right\rceil _p)\) with probability at least \(\varepsilon ^2 / (1 + 2Bp/q)^m\). Therefore, if the function \(G(\mathbf{x}_1, \dots , \mathbf{x}_m, \mathbf{s}) = (\mathbf{x}_1, \dots , \mathbf{x}_m, g_{\mathbf{s}}(\mathbf{x}_1), \dots , g_{\mathbf{s}}(\mathbf{x}_m))\) is one-way under some B-bounded distribution (i.e. if the search version of LWE is hard) then we conclude that

    $$F(\mathbf{x}_1, \dots , \mathbf{x}_m, \mathbf{s}) = (\mathbf{x}_1, \dots , \mathbf{x}_m, f_{\mathbf{s}}(\mathbf{x}_1), \dots , f_{\mathbf{s}}(\mathbf{x}_m))$$

    is also one-way, as long as \(q \ge 2 m B p\). In Theorem 2 in Sect. 2.2 we show that the ring variants of the LWE and LWR problems (defined in that section) are related in an analogous manner.

  • Pseudorandomness: In Theorem 3 in Sect. 3 we show that if there exists an efficient distinguisher that tells apart m independent random samples \((\mathbf{x}, f_{\mathbf{s}}(\mathbf{x}))\) from m independent random samples of the form \((\mathbf{x}, \left\lfloor u \right\rceil _p)\), then LWE secrets can be learned efficiently assuming \(q \ge 2mBp\). In particular, when p divides q, the above function F is a pseudorandom generator assuming the hardness of learning with errors. Theorem 3 improves upon several aspects of the work of Alwen et al.: first, we do not impose any number-theoretic restrictions on q; second, we relax their stronger requirement \(q \ge 2nmBp\); third, unlike theirs, our reduction is sample preserving; and fourth, we believe our proof is considerably simpler. On the other hand, the complexity of their reduction has a better dependence on the modulus q and the distinguishing probability.

  • Hardness of learning from samples with uniform noise: In Theorem 5 in Sect. 4 we give an efficient reduction that takes as input independent random samples of the form \((\mathbf{x}, g_{\mathbf{s}}(\mathbf{x}))\) and produces independent random samples of the form \((\mathbf{x}, f_{\mathbf{s}}(\mathbf{x}))\) provided that p divides q and the noise e of \(g_{\mathbf{s}}\) is uniformly distributed over the integers in the range \([-q/2p, \dots , q/2p)\). Therefore if \(f_\mathbf{s}\) can be distinguished efficiently from a random function for any number of independent random samples, so can \(g_{\mathbf{s}}\). By a reduction of Chow [Cho13] in the other direction (Theorem 6), the two problems are in fact equivalent. These reductions do not impose any additional restriction on p, q and the number of LWR samples m. The learning with errors problem under this noise distribution is not known to be as hard as the learning with errors problem with discrete Gaussian noise when the number of samples is unbounded in terms of q and n. The existence of a reduction to the case of discrete Gaussian noise is an interesting open problem.

  • Noise flooding: In addition, our technique allows for an improved analysis of noise flooding. The noise flooding technique is ubiquitous in the LWE cryptographic literature. Roughly speaking, it is used to rerandomize a faulty sample \(\bigl (\mathbf{x},\langle \mathbf{x},\mathbf{s}\rangle +e_\mathsf{bad}\bigr )\) into one of the form \(\bigl (\mathbf{x},\langle \mathbf{x},\mathbf{s}\rangle +e_\mathsf{good}\bigr )\), where \(e_\mathsf{good}\) is distributed according to the error distribution implicit in \(g_{\mathbf{s}}(\cdot )\) while \(e_\mathsf{bad}\) is not. Most of the time, the desired error distribution is a discrete Gaussian over \({\mathbb Z}_q\), whereas \(e_\mathsf{bad}\) is some arbitrary B-bounded element in \({\mathbb Z}_q\). The most common method is to draw a fresh Gaussian error e and set \(e_\mathsf{good}=e_\mathsf{bad}+e\), which results in the distribution of \(e_\mathsf{good}\) being within statistical distance \(B/\sigma \) of the desired Gaussian. However, this requires choosing \(\sigma \), and hence \(q > \sigma \), so that \(B/\sigma \) is negligible; in particular, it requires setting q to be larger than any polynomial in the security parameter. Even worse, often the bound B is polynomially related to the standard deviation \(\sigma '\) of another discrete Gaussian used in the construction. This means that \(q/\sigma '\) also grows faster than any polynomial in the security parameter, which is not ideal as the quantity \(q/\sigma '\) corresponds to the strength of the assumption one is making on the hardness of the underlying lattice problem. In Sect. 5 we use techniques from Sect. 2 to give a simple proof that noise flooding can be used whenever \(q=\varOmega \bigl ( B\sqrt{m}\bigr )\). In particular, it can be used even when q is polynomial in the security parameter.

Conventions. We write \(x \leftarrow X\) for a uniform sample from the set X, \(R(\mathbf{x})\) for the function \((R(x_1), \dots , R(x_n))\), and \({\mathbb Z}_q^{n*}\) for the set of vectors in \({\mathbb Z}_q^n\) which are not zero-divisors. Namely, \({\mathbb Z}_q^{n*}=\{\mathbf{x} \in {\mathbb Z}_q^n:\gcd (x_1, \dots , x_n, q) = 1\}\). All algorithms are assumed to be randomized.

2 One-Wayness of LWR

In this section we prove the following theorem. We say a distribution over \({\mathbb Z}_q\) is B-bounded if it is supported over the interval of integers \(\{-B, \dots , B\}\), where \(B \le (q - 1)/2\). We say a B-bounded distribution e is balanced if \({{\mathrm{Pr}}}[e \le 0] \ge 1/2\) and \({{\mathrm{Pr}}}[e \ge 0] \ge 1/2\).

Theorem 1

Let p, q, n, m, and B be integers such that \(q > 2pB\). For every algorithm \(\mathsf {Learn}\),

$$ \mathop {\Pr }\nolimits _{\mathbf{A},\mathbf{s},\mathbf{e}}[\mathsf {Learn}(\mathbf{A},\left\lfloor \mathbf{A}\mathbf{s}+\mathbf{e} \right\rceil _p) = \mathbf{s}]\ge \frac{\Pr _{\mathbf{A},\mathbf{s}}[\mathsf {Learn}(\mathbf{A},\left\lfloor \mathbf{A}\mathbf{s} \right\rceil _p) = \mathbf{s}]^2}{(1+2pB/q)^m}, $$

where \(\mathbf{A} \leftarrow {\mathbb Z}_q^{m \times n}\), the noise \(\mathbf{e}\) is independent over all m coordinates, B-bounded and balanced in each coordinate, and \(\mathbf{s}\) is chosen from any distribution supported on \({\mathbb Z}_q^{n*}\).

The assumptions made on the secret and error distribution in Theorem 1 are extremely mild. The condition \(\mathbf{s} \in {\mathbb Z}_q^{n*}\) is satisfied for at least a \(1 - O(1/2^n)\) fraction of secrets \(\mathbf{s} \leftarrow {\mathbb Z}_q^n\). While a B-bounded error distribution may not be balanced, it can always be converted to a 2B-bounded and balanced error distribution by a suitable constant shift. The discrete Gaussian distribution of standard deviation \(\sigma \) is \(e^{-\varOmega (t^2)}\)-statistically close to being \(t\sigma \)-bounded and balanced for every \(t \ge 1\).

Theorem 2 in Sect. 2.2 concerns the ring variants of the LWR and LWE problems and will be proved in an analogous manner.

We now outline the proof of Theorem 1. Let \(\mathsf{X}_{\mathbf{s}}\) denote the distribution of a single LWR sample \((\mathbf{a},\left\lfloor \langle \mathbf{a},\mathbf{s}\rangle \right\rceil _p)\) where \(\mathbf{a}\leftarrow {\mathbb Z}_q^n\), and let \(\mathsf{Y}_{\mathbf{s}}\) denote the distribution of a single rounded LWE sample \((\mathbf{a},\left\lfloor \langle \mathbf{a},\mathbf{s}\rangle +e \right\rceil _p)\). To prove Theorem 1 we will fix \(\mathbf{s}\) and look at the ratio of probabilities of any possible instance under the product distributions \(\mathsf{X}_{\mathbf{s}}^m\) and \(\mathsf{Y}_{\mathbf{s}}^m\), respectively. If this ratio were always bounded by a sufficiently small quantity K, then it would follow that the success probability of any search algorithm for LWR deteriorates by at most a factor of K when it is run on rounded LWE instances instead.

While it happens that there are exceptional instances for which the ratio of probabilities under \(\mathsf{X}_{\mathbf{s}}^m\) and \(\mathsf{Y}_{\mathbf{s}}^m\) can be large, our proof of Theorem 1 will show that such instances cannot occur too often under the rounded LWE distribution and therefore do not significantly affect the success probability of the inversion algorithm. This can be shown by a standard probabilistic analysis, but we opt instead to work with a measure of distributions that is particularly well suited for bounding ratios of probabilities: the Rényi divergence.

The use of Rényi divergence in our analysis accounts for our quantitative improvement over the result of Banerjee, Peikert, and Rosen, who used statistical distance in its place. Rényi divergence has been used in a related context: Bai, Langlois, Lepoint, Stehlé, and Steinfeld [BLL+15] use it to obtain tighter bounds for several lattice-based primitives.

2.1 Proof of Theorem 1

Given two distributions \(\mathsf{X}\) and \(\mathsf{Y}\) over \(\varOmega \), their Rényi divergence of order 2 (in the exponentiated form standard in this context) is \(\mathrm {RD}_2(\mathsf{X} \Vert \mathsf{Y}) = {{\mathrm{E}}}_{a\leftarrow \mathsf{X}} [ \Pr [ \mathsf{X} = a] / \Pr [ \mathsf{Y} = a]]\).
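As a quick illustration (ours, not part of the paper's formalism), the following Python helper computes \(\mathrm {RD}_2\) for distributions over a finite set represented as probability dictionaries:

```python
def rd2(X, Y):
    """RD_2(X || Y) = E_{a <- X}[Pr[X=a] / Pr[Y=a]] = sum_a Pr[X=a]^2 / Pr[Y=a].

    X and Y are dicts mapping outcomes to probabilities; the support of X
    must be contained in that of Y (otherwise the divergence is infinite).
    """
    return sum(px * px / Y[a] for a, px in X.items())

# Example: a fair coin against a slightly biased one.
print(rd2({0: 0.5, 1: 0.5}, {0: 0.6, 1: 0.4}))  # 0.25/0.6 + 0.25/0.4 ~ 1.042
```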

Lemma 1

Let \(\mathsf{X}_{\mathbf{s}}\) be the distribution of a single LWR sample and let \(\mathsf{Y}_{\mathbf{s}}\) be that of a single rounded LWE sample. Assume \(B < q/2p\). For every \(\mathbf{s} \in {\mathbb Z}_q^{n*}\) and every noise distribution that is B-bounded and balanced, \(\mathrm {RD}_2\bigl (\mathsf{X}_\mathbf{s}\Vert \mathsf{Y}_\mathbf{s}\bigr )\le 1 + 2Bp/q\).

Proof

By the definition of Rényi divergence,

$$\begin{aligned} \mathrm {RD}_2\bigl (\mathsf{X}_\mathbf{s}\Vert \mathsf{Y}_\mathbf{s}\bigr ) = {{\mathrm{E}}}_{\mathbf{a}\leftarrow \mathbb {Z}_q^n} \frac{{{\mathrm{Pr}}}\bigl [\mathsf{X}_\mathbf{s}=(\mathbf{a},\left\lfloor \langle \mathbf{a},\mathbf{s}\rangle \right\rceil _p)\bigr ]}{{{\mathrm{Pr}}}\bigl [\mathsf{Y}_\mathbf{s}=(\mathbf{a},\left\lfloor \langle \mathbf{a},\mathbf{s}\rangle \right\rceil _p)\bigr ]} = {{\mathrm{E}}}_{\mathbf{a} \leftarrow \mathbb {Z}_q^n} \frac{1}{{{\mathrm{Pr}}}_{e} \bigl [\left\lfloor \langle \mathbf{a},\mathbf{s}\rangle +e \right\rceil _p=\left\lfloor \langle \mathbf{a}, \mathbf{s}\rangle \right\rceil _p\bigr ]}. \end{aligned}$$

Let \(\mathsf{BAD_{\mathbf{s}}}\) be the set \(\big \{\mathbf{a}\in {\mathbb Z}_q^n: \big |\langle \mathbf{a},\mathbf{s}\rangle - \frac{q}{p}\left\lfloor \langle \mathbf{a},\mathbf{s}\rangle \right\rceil _p\big | > \frac{q}{2p} - B\big \}\). These are the \(\mathbf{a}\) for which \(\langle \mathbf{a},\mathbf{s}\rangle \) is dangerously close (within distance B) to a rounding boundary. When \(\mathbf{a}\notin \mathsf{BAD}_\mathbf{s}\), no B-bounded noise can move the inner product across a boundary, so \({{\mathrm{Pr}}}_{e}\bigl [\left\lfloor \langle \mathbf{a}, \mathbf{s}\rangle +e \right\rceil _p=\left\lfloor \langle \mathbf{a},\mathbf{s}\rangle \right\rceil _p\bigr ] = 1\). Since \(\gcd (s_1,\dots ,s_n,q)=1\), the inner product \(\langle \mathbf{a}, \mathbf{s}\rangle \) is uniformly distributed over \({\mathbb Z}_q\), so \({{\mathrm{Pr}}}[\mathbf{a} \in \mathsf{BAD}_\mathbf{s}] \le (2B - 1)p/q\). When \(\mathbf{a}\in \mathsf{BAD}_\mathbf{s}\), the event \(\left\lfloor \langle \mathbf{a},\mathbf{s}\rangle +e \right\rceil _p = \left\lfloor \langle \mathbf{a},\mathbf{s}\rangle \right\rceil _p\) still holds in at least one of the two cases \(e \le 0\) or \(e \ge 0\). By our assumptions on the noise distribution, \({{\mathrm{Pr}}}_{e}\bigl [\left\lfloor \langle \mathbf{a}, \mathbf{s}\rangle +e \right\rceil _p=\left\lfloor \langle \mathbf{a},\mathbf{s}\rangle \right\rceil _p\bigr ] \ge 1/2\). Conditioning on the event \(\mathbf{a} \in \mathsf{BAD}_{\mathbf{s}}\), we conclude that

$$\begin{aligned} \mathrm {RD}_2(\mathsf{X}_\mathbf{s}\Vert \mathsf{Y}_\mathbf{s}) \le 1 \cdot {{\mathrm{Pr}}}[\mathbf{a}\notin \mathsf{BAD}_\mathbf{s}] + 2 \cdot {{\mathrm{Pr}}}[ \mathbf{a}\in \mathsf{BAD}_\mathbf{s}] \le 1 + \frac{2Bp}{q}.\quad \square \end{aligned}$$

To complete the proof of Theorem 1 we need two elementary properties of Rényi divergence.

Claim

For any two distributions \(\mathsf{X}\) and \(\mathsf{Y}\), (1) \(\mathrm {RD}_2(\mathsf{X}^m \Vert \mathsf{Y}^m) = \mathrm {RD}_2(\mathsf{X} \Vert \mathsf{Y})^m\) and (2) for any event E, \({{\mathrm{Pr}}}[\mathsf{Y} \in E] \ge {{\mathrm{Pr}}}[\mathsf{X} \in E]^2/\mathrm {RD}_2(\mathsf{X} \Vert \mathsf{Y})\).

Proof

Property (1) follows immediately from independence of the m samples. Property (2) is the Cauchy-Schwarz inequality applied to the functions

$$\begin{aligned} f(a) = \frac{{{\mathrm{Pr}}}[\mathsf{X} = a]}{\sqrt{{{\mathrm{Pr}}}[\mathsf{Y} = a]}};\text { and }g(a) = \sqrt{{{\mathrm{Pr}}}[\mathsf{Y} = a]}. \quad \square \end{aligned}$$
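Explicitly, restricting the sums to the event E, the Cauchy-Schwarz step reads

$$\begin{aligned} {{\mathrm{Pr}}}[\mathsf{X} \in E] = \sum _{a \in E} f(a)g(a) \le \Bigl (\sum _{a} f(a)^2\Bigr )^{1/2}\Bigl (\sum _{a \in E} g(a)^2\Bigr )^{1/2} = \mathrm {RD}_2(\mathsf{X} \Vert \mathsf{Y})^{1/2}\, {{\mathrm{Pr}}}[\mathsf{Y} \in E]^{1/2}, \end{aligned}$$

and squaring both sides gives property (2).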

Proof

(Proof of Theorem 1). Fix \(\mathbf{s}\) such that \(\gcd (\mathbf{s}, q) = 1\) and the randomness of \(\mathsf {Learn}\). By Lemma 1 and part (1) of Claim 2.1, \(\mathrm {RD}_2(\mathsf{X}_\mathbf{s}^m\Vert \mathsf{Y}_\mathbf{s}^m) \le (1 + 2Bp/q)^m\). Letting E be the event \(\{(\mathbf{A}, \mathbf{y}):\mathsf {Learn}(\mathbf{A}, \mathbf{y}) = \mathbf{s}\}\), by part (2) of Claim 2.1,

$$\begin{aligned} \mathop {\Pr }\nolimits _{\mathbf{A},\mathbf{e}}[\mathsf {Learn}(\mathbf{A},\left\lfloor \mathbf{A}\mathbf{s}+\mathbf{e} \right\rceil _p) = \mathbf{s}] \ge \frac{\Pr _{\mathbf{A}}[\mathsf {Learn}(\mathbf{A},\left\lfloor \mathbf{A}\mathbf{s} \right\rceil _p) = \mathbf{s}]^2}{(1+2pB/q)^m}. \end{aligned}$$

To obtain the theorem, we average over \(\mathbf{s}\) and the randomness of \(\mathsf {Learn}\) and apply the Cauchy-Schwarz inequality.    \(\square \)

2.2 Hardness over Rings

For many applications it is more attractive to use a ring version of LWR (RLWR), introduced together with LWR by Banerjee, Peikert, and Rosen [BPR12]. Its advantage is that \(\mathbf{A}\) admits a much more compact description for an output \(\left\lfloor \mathbf{A}\mathbf{s}+\mathbf{e} \right\rceil _p\) of the same size. In the following theorem we give a variant of Theorem 1 for RLWR based on the hardness of ring LWE. This theorem is not needed for the remaining sections of the paper.

Theorem 2

Let p, q, n, k, and B be integers such that \(q > 2pB\). Let \(R_q\) be the ring \({\mathbb Z}_q[x]/g(x)\), where g is a polynomial of degree n over \({\mathbb Z}_q\), and let f be an arbitrary function over \(R_q\). For every algorithm \(\mathsf {Learn}\),

$$ \mathop {\Pr }\nolimits _{\mathbf{a},s,\mathbf{e}}[\mathsf {Learn}(\mathbf{a},\left\lfloor \mathbf{a}s+\mathbf{e} \right\rceil _p) = f(s)]\ge \frac{\Pr _{\mathbf{a},s}[\mathsf {Learn}(\mathbf{a},\left\lfloor \mathbf{a}s \right\rceil _p) = f(s)]^2}{(1+2pB/q)^{nk}}, $$

where \(\mathbf{a}\leftarrow R_q^k\), the noise \(\mathbf{e}\) is independent over all k coordinates, B-bounded and balanced in each coordinate, and s is chosen from any distribution supported on the set of all units in \(R_q\).

An element in \(R_q={\mathbb Z}_q[x]/g(x)\) can be represented as a polynomial (in x) of degree less than n with coefficients in \({\mathbb Z}_q\). Here, for \(a\in R_q\), \(\left\lfloor a \right\rceil _p\) is an element in \({\mathbb Z}_p[x]/g(x)\) obtained by applying the function \(\left\lfloor \cdot \right\rceil _p\) to each coefficient of a separately. A distribution over the ring \(R_q\) is B-bounded and balanced if every coefficient is drawn independently from a B-bounded and balanced distribution over \({\mathbb Z}_q\).
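To make the ring notation concrete, here is a small Python sketch of multiplication in \(R_q\) followed by coefficient-wise rounding. The modulus polynomial \(g(x) = x^4 + 1\) and the parameters are illustrative assumptions of this sketch, not choices made by the paper.

```python
import random

q, p = 1024, 16                # illustrative parameters for this sketch
g = [1, 0, 0, 0, 1]            # g(x) = x^4 + 1, coefficients by increasing degree
n = len(g) - 1

def ring_mul(a, s):
    """Product a*s in R_q = Z_q[x]/g(x); inputs are length-n coefficient lists."""
    prod = [0] * (2 * n - 1)
    for i, ai in enumerate(a):
        for j, sj in enumerate(s):
            prod[i + j] = (prod[i + j] + ai * sj) % q
    for i in range(2 * n - 2, n - 1, -1):   # reduce modulo the monic g
        c = prod[i]
        for j in range(n + 1):
            prod[i - n + j] = (prod[i - n + j] - c * g[j]) % q
    return prod[:n]

def round_ring(a):
    """Apply |.]_p to each coefficient of a, as in the theorem."""
    return [((p * c + q // 2) // q) % p for c in a]

# One RLWR sample (a, |a*s]_p) for a uniform a and a fixed secret s:
s = [random.randrange(q) for _ in range(n)]
a = [random.randrange(q) for _ in range(n)]
print(a, round_ring(ring_mul(a, s)))
```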

The bound in Theorem 2 matches the bound in Theorem 1 since k can be chosen such that nk is on the order of m. Theorem 2 follows from Claim 2.1 and the following variant of Lemma 1.

Lemma 2

Assume \(B<q/2p\). For every unit \(s\in R_q\) and noise distribution \(\chi \) that is B-bounded and balanced over \(R_q\), \(\mathrm {RD}_2\bigl (\mathsf{X}_s\Vert \mathsf{Y}_s\bigr )\le \bigl (1+2pB/q\bigr )^n\), where \(\mathsf{X}_s\) is the random variable \(\bigl (a,\left\lfloor a\cdot s \right\rceil _p\bigr )\) and \(\mathsf{Y}_s\) is the random variable \(\bigl (a,\left\lfloor a\cdot s + e \right\rceil _p\bigr )\) with \(a\leftarrow R_q\) and \(e\leftarrow \chi \).

Since the proof is very similar to the proof of Lemma 1, we defer it to Appendix A.

3 Pseudorandomness of LWR

In this section we prove the following theorem. Throughout this section we will implicitly assume that algorithms have access to the prime factorization of the modulus q.

Theorem 3

For every \(\varepsilon > 0\), n, m, \(q > 2pB\), and algorithm \(\mathsf {Dist}\) such that

$$\begin{aligned} \bigl |{{\mathrm{Pr}}}_{\mathbf{A},\mathbf{s}}\bigl [\mathsf {Dist}\bigl (\mathbf{A}, \left\lfloor \mathbf{As} \right\rceil _p\bigr )=1\bigr ] - {{\mathrm{Pr}}}_{\mathbf{A},\mathbf{u}}\bigl [\mathsf {Dist}\bigl (\mathbf{A}, \left\lfloor \mathbf{u} \right\rceil _p\bigr )=1\bigr ]\bigr | \ge \varepsilon , \end{aligned}$$
(1)

where \(\mathbf{A}\leftarrow {\mathbb Z}_q^{m\times n}\), \(\mathbf{s}\leftarrow \{0,1\}^n\), and \(\mathbf{u}\leftarrow {\mathbb Z}_q^m\), there exists an algorithm \(\mathsf {Learn}\) that runs in time polynomial in n, m, the number of divisors of q, and the running time of \(\mathsf {Dist}\) such that

$$\begin{aligned} {{\mathrm{Pr}}}_{\mathbf{A},\mathbf{s}}\bigl [\mathsf {Learn}\bigl (\mathbf{A}, \mathbf{As} + \mathbf{e}\bigr )=\mathbf{s}\bigr ] \ge \Bigl (\frac{\varepsilon }{4qm} - \frac{2^n}{p^m}\Bigr )^2 \cdot \frac{1}{(1 + 2Bp/q)^m} \end{aligned}$$
(2)

for any noise distribution \(\mathbf{e}\) that is B-bounded and balanced in each coordinate.

One unusual aspect of this theorem is that the secret is a uniformly distributed binary string in \({\mathbb Z}_q^n\). This assumption can be made essentially without loss of generality: Brakerski et al. [BLP+13] show that under discrete Gaussian noise, learning a binary secret in \(\{0,1\}^n\) from LWE samples is as hard as learning a secret uniformly sampled from \({\mathbb Z}_q^{\varOmega (n/\log q)}\). The assumption (1) can also be stated with \(\mathbf{s}\) sampled uniformly from \({\mathbb Z}_q^n\): In Sect. 3.4 we show that distinguishing LWR samples from random ones is no easier for uniformly distributed secrets than it is for any other distribution on secrets, including the uniform distribution over binary secrets. (When q is prime, the proof of Theorem 3 can be carried out for \(\mathbf{s}\) uniformly distributed over \({\mathbb Z}_q^n\) so these additional steps are not needed.)

To prove Theorem 3 we follow a sequence of standard steps originating from Yao [Yao82] and Goldreich and Levin [GL89]: In Lemma 3 we convert the distinguisher \(\mathsf {Dist}\) into a predictor that, given a sequence of LWR samples and a challenge vector \(\mathbf{a}\), guesses the inner product \(\langle \mathbf{a}, \mathbf{s}\rangle \) in \({\mathbb Z}_q\) with significant advantage. In Lemma 4 we show how to use this predictor to efficiently learn the entries of the vector \(\mathbf{s}\) modulo \(q'\) for some divisor \(q' > 1\) of q. If the entries of the secret \(\mathbf{s}\) are bits, \(\mathbf{s}\) is then fully recovered given LWR samples. By Theorem 1 the learner’s advantage does not deteriorate significantly when the LWR samples are replaced by LWE samples.

Our proof resembles the work of Micciancio and Mol [MM11] who give, to the best of our knowledge, the only sample preserving search-to-decision reduction for LWE (including its variants). Unlike our theorem, theirs imposes certain number-theoretic restrictions on q. Also, while Micciancio and Mol work with a problem that is “dual” to LWE, we work directly with LWR samples.

3.1 Predicting the Inner Product

Lemma 3

For all \(\varepsilon \) (possibly negative), n, m, q, every polynomial-time computable function R on \({\mathbb Z}_q\), and every algorithm \(\mathsf {Dist}\) such that

$$ \mathop {\Pr }\nolimits _{\mathbf{A},\mathbf{s}}\bigl [\mathsf {Dist}\bigl (\mathbf{A}, R(\mathbf{A}\mathbf{s})\bigr )=1\bigr ]-\mathop {\Pr }\nolimits _{\mathbf{A},\mathbf{u}}\bigl [ \mathsf {Dist}\bigl (\mathbf{A},R(\mathbf{u})\bigr )=1\bigr ] = \varepsilon , $$

there exists an algorithm \(\mathsf {Pred}\) whose running time is polynomial in its input size and the running time of \(\mathsf {Dist}\) such that

$$ \mathop {\Pr }\nolimits _{\mathbf{A},\mathbf{s},\mathbf{a}}\bigl [\mathsf {Pred}\bigl ( \mathbf{A},R(\mathbf{A}\mathbf{s}), \mathbf{a}\bigr ) =\langle \mathbf{a},\mathbf{s}\rangle \bigr ]= \frac{1}{q}+\frac{\varepsilon }{mq}, $$

where the probabilities are taken over \(\mathbf{A}\leftarrow {\mathbb Z}_q^{m \times n}\), \(\mathbf{u}\leftarrow {\mathbb Z}_q^m\), the random coins of the algorithms, and secret \(\mathbf{s}\) sampled from an arbitrary distribution.

Here, \(R(\mathbf{y})\) is the vector obtained by applying R to every coordinate of the vector \(\mathbf{y}\).

Proof

Consider the following algorithm \(\mathsf {Pred}\) (a Python sketch follows the numbered steps). On input \((\mathbf{A}, \mathbf{b}) = ((\mathbf{a}_1, b_1), \dots , (\mathbf{a}_m, b_m))\) (\(\mathbf{a}_j \in {\mathbb Z}_q^n\), \(b_j \in {\mathbb Z}_q\)) and \(\mathbf{a} \in {\mathbb Z}_q^n\):

  1. Sample a random index \(i \leftarrow \{1, \dots , m\}\) and a random \(c \leftarrow {\mathbb Z}_q\).

  2. Obtain \(\mathbf{A}', \mathbf{b}'\) from \(\mathbf{A}, \mathbf{b}\) by replacing \(\mathbf{a}_i\) with \(\mathbf{a}\), \(b_i\) with R(c), and every \(b_j\) for \(j > i\) with an independent element of the form \(R(u_j), u_j \leftarrow {\mathbb Z}_q\).

  3. If \(\mathsf {Dist}(\mathbf{A}',\mathbf{b}') = 1\), output c. Otherwise, output a uniformly random element in \({\mathbb Z}_q\).
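Here is the promised sketch of \(\mathsf {Pred}\) in Python (an illustration with zero-based indices; dist stands for the assumed distinguisher, given as a callable):

```python
import random

def pred(A, b, a, dist, q, R):
    """Hybrid-argument predictor from Lemma 3 (sketch).

    A: list of m vectors over Z_q; b: list of m post-processed values;
    a: challenge vector; dist: assumed distinguisher taking (A', b') and
    returning 0 or 1; R: the post-processing map (e.g. rounding)."""
    m = len(A)
    i = random.randrange(m)            # step 1: random hybrid index (0-based)
    c = random.randrange(q)            # step 1: random guess for <a, s>
    A2 = A[:i] + [a] + A[i + 1:]       # step 2: splice in the challenge vector
    b2 = (b[:i] + [R(c)]
          + [R(random.randrange(q)) for _ in range(m - i - 1)])
    if dist(A2, b2) == 1:              # step 3: keep the guess ...
        return c
    return random.randrange(q)         # ... or answer uniformly at random
```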

Let \(\mathbf{h}_i=\bigl (R(\langle \mathbf{a}_1,\mathbf{s}\rangle ), \dots ,R(\langle \mathbf{a}_i,\mathbf{s}\rangle ),R(u_{i+1}), \dots ,R(u_m)\bigr )\), for i ranging from 0 to m. Then \(\mathbf{h}_m = R(\mathbf{As})\) and \(\mathbf{h}_0 = R(\mathbf{u})\), so by the assumption on \(\mathsf {Dist}\) it follows that

$$ {{\mathrm{E}}}_i\Bigl [\mathop {\Pr }\nolimits _{\mathbf{A},\mathbf{s},\mathbf{u}}\bigl [\mathsf {Dist}\bigl (\mathbf{A}, \mathbf{h}_{i}\bigr )=1\bigr ]-\mathop {\Pr }\nolimits _{\mathbf{A},\mathbf{s},\mathbf{u}}\bigl [ \mathsf {Dist}\bigl (\mathbf{A},\mathbf{h}_{i-1}\bigr )=1\bigr ]\Bigr ] = \frac{\varepsilon }{m}. $$

Conditioned on the choice of i,

$$\begin{aligned} {{\mathrm{Pr}}}\bigl [\mathsf {Pred}(\mathbf{A}, \mathbf{b}, \mathbf{a}) = \langle \mathbf{a},\mathbf{s}\rangle \bigr ]&= {{\mathrm{Pr}}}\bigl [\mathsf {Dist}(\mathbf{A}',\mathbf{b}')=1 \, \text {and}\, c = \langle \mathbf{a},\mathbf{s}\rangle \bigr ] + \frac{1}{q} \cdot {{\mathrm{Pr}}}\bigl [\mathsf {Dist}(\mathbf{A}',\mathbf{b}') \ne 1\bigr ] \\&= \frac{1}{q} \cdot {{\mathrm{Pr}}}\bigl [\mathsf {Dist}(\mathbf{A}',\mathbf{b}')=1 \mid c = \langle \mathbf{a},\mathbf{s}\rangle \bigr ] + \frac{1}{q} \cdot {{\mathrm{Pr}}}\bigl [\mathsf {Dist}(\mathbf{A}',\mathbf{b}') \ne 1\bigr ] \\&= \frac{1}{q}+\frac{1}{q}\cdot \bigl ({{\mathrm{Pr}}}\bigl [\mathsf {Dist}(\mathbf{A}',\mathbf{b}')=1\mid c=\langle \mathbf{a}, \mathbf{s}\rangle \bigr ]-{{\mathrm{Pr}}}\bigl [\mathsf {Dist}(\mathbf{A}',\mathbf{b}')=1\bigr ]\bigr ). \end{aligned}$$

When \(\mathbf{b} = R(\mathbf{As})\), the distribution \((\mathbf{A'}, \mathbf{b'})\) is the same as \((\mathbf{A}, \mathbf{h}_{i-1})\), while \((\mathbf{A'}, \mathbf{b'})\) conditioned on \(c = \langle \mathbf{a}, \mathbf{s}\rangle \) is the same as \((\mathbf{A}, \mathbf{h}_i)\). Averaging over i yields the desired advantage of \(\mathsf {Pred}\).    \(\square \)

3.2 Learning the Secret

Lemma 4

There exists an oracle algorithm \(\mathsf {List}\) such that for every algorithm \(\mathsf {Pred}\) satisfying \(|{{\mathrm{Pr}}}[\mathsf {Pred}(\mathbf{a}) = \langle \mathbf{a}, \mathbf{s}\rangle ] - 1/q| \ge \varepsilon \), the algorithm \(\mathsf {List}^{\mathsf {Pred}}(\varepsilon )\) runs in time polynomial in n, \(1/\varepsilon \), and the number of divisors of q, and with probability at least \(\varepsilon /4\) outputs a list of entries \((q', \mathbf{s'})\) containing at least one entry such that \(q' > 1\), \(q'\) divides q, and \(\mathbf{s'} = \mathbf{s} \mod q'\). The probabilities are taken over \(\mathbf{a} \leftarrow {\mathbb Z}_q^n\), any distribution on \(\mathbf{s}\), and the randomness of the algorithms.

When q is a prime number, the conclusion of the theorem implies that the list must contain the secret \(\mathbf{s}\). When q is a composite, the assumption does not in general guarantee full recovery of \(\mathbf{s}\). For example, the predictor \(\mathsf {Pred}(\mathbf{a}) = \langle \mathbf{a}, \mathbf{s}\rangle \mod q'\) has advantage \(\varepsilon = (q' - 1)/q\) but does not distinguish between pairs of secrets that are congruent modulo \(q'\). In this case \(\mathsf {List}\) cannot hope to learn any information on \(\mathbf{s}\) beyond the value of \(\mathbf{s}\) modulo \(q'\).

The proof of Lemma 4 makes use of the following result of Akavia, Goldwasser, and Safra [AGS03] on learning heavy Fourier coefficients, extending earlier work of Kushilevitz and Mansour and others. Recall that the Fourier coefficients of a function \(h:{\mathbb Z}_q^n \rightarrow {\mathbb C}\) are the complex numbers \(\hat{h}(\mathbf{a}) = {{\mathrm{E}}}_{\mathbf{x} \leftarrow {\mathbb Z}_q^n}[h(\mathbf{x}) \omega ^{-\langle \mathbf{a}, \mathbf{x}\rangle }]\), where \(\omega = e^{2\pi i/q}\) is a primitive q-th root of unity. Our functions of interest all map into the unit complex circle \({\mathbb T}= \{c \in {\mathbb C}:|c| = 1\}\), so we specialize the result to this setting.
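For intuition, the following brute-force Python computation mirrors the definition of a Fourier coefficient (exponential in n, so only for toy parameters; the point of \(\mathsf {AGS}\) is precisely to avoid this enumeration):

```python
import cmath
import math
from itertools import product

def fourier_coeff(h, a, q, n):
    """hat h(a) = E_{x <- Z_q^n}[ h(x) * omega^{-<a,x>} ], omega = e^{2 pi i / q}."""
    omega = cmath.exp(2j * math.pi / q)
    total = 0.0
    for x in product(range(q), repeat=n):
        ip = sum(ai * xi for ai, xi in zip(a, x)) % q
        total += h(x) * omega ** (-ip)
    return total / q ** n

# Toy check: the character h(x) = omega^{<s,x>} has hat h(s) = 1, hat h(a) = 0 otherwise.
q, n, s = 5, 2, (2, 3)
omega = cmath.exp(2j * math.pi / q)
h = lambda x: omega ** (sum(si * xi for si, xi in zip(s, x)) % q)
print(abs(fourier_coeff(h, s, q, n)))       # ~ 1.0
print(abs(fourier_coeff(h, (1, 1), q, n)))  # ~ 0.0
```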

Theorem 4

(Akavia et al. [AGS03]). There is an algorithm \(\mathsf {AGS}\) that, given query access to a function \(h:{\mathbb Z}_q^n\rightarrow \mathbb T\), outputs a list of size at most \(2/\varepsilon ^2\) which contains all \(\mathbf{a} \in {\mathbb Z}_q^n\) such that \(|\hat{h}(\mathbf{a})| \ge \varepsilon \) in time polynomial in n, \(\log {q}\), and \(1/\varepsilon \) with probability at least 1/2.

We will also need the following property of the Fourier transform of random variables. For completeness the proof is given below.

Claim

For every random variable Z over \({\mathbb Z}_q\) there exists a nonzero r in \({\mathbb Z}_q\) such that \(|{{\mathrm{E}}}[\omega ^{rZ}]| \ge |{{\mathrm{Pr}}}[Z = 0] - 1/q|\).

Proof

(Proof of Lemma 4). We first replace \(\mathsf {Pred}\) by the following algorithm: Sample a uniformly random unit (invertible element) u from \({\mathbb Z}_q^*\) and output \(u^{-1} \mathsf {Pred}(u\mathbf{a})\). This transformation does not affect the advantage of \(\mathsf {Pred}\) but ensures that for fixed \(\mathbf{s}\) and randomness of \(\mathsf {Pred}\), the value \({{\mathrm{E}}}_{\mathbf{a}}[\omega ^{r(\mathsf {Pred}(\mathbf{a}) - \langle \mathbf{a}, \mathbf{s}\rangle )}]\) is the same for all r with the same \(\gcd (r, q)\).

Algorithm \(\mathsf {List}\) works as follows: For every divisor \(r < q\) of q run \(\mathsf {AGS}\) with oracle access to the function \(h_r(\mathbf{a}) = \omega ^{r \cdot \mathsf {Pred}(\mathbf{a})}\) and output \((q' = q/r, \mathbf{s'}/r \mod q')\) for every \(\mathbf{s'}\) in the list produced by \(\mathsf {AGS}\).

We now assume \(\mathsf {Pred}\) satisfies the assumption of the lemma and analyze \(\mathsf {List}\). By Claim 3.2 there exists a nonzero \(r \in {\mathbb Z}_q\) such that \(|{{\mathrm{E}}}[\omega ^{r(\mathsf {Pred}(\mathbf{a}) - \langle \mathbf{a}, \mathbf{s}\rangle )}]| \ge \varepsilon \). By Markov’s inequality and the convexity of the absolute value, with probability at least \(\varepsilon /2\) over the choice of \(\mathbf{s}\) and the randomness of \(\mathsf {Pred}\), the quantity \(|{{\mathrm{E}}}_{\mathbf{a}}[\omega ^{r(\mathsf {Pred}(\mathbf{a}) - \langle \mathbf{a}, \mathbf{s}\rangle )}]|\) is at least \(\varepsilon /2\). We fix \(\mathbf{s}\) and the randomness of \(\mathsf {Pred}\) and assume this is the case. By our discussion on \(\mathsf {Pred}\), the expectation of interest is the same for all r with the same \(\gcd (r, q)\), so we may and will assume without loss of generality that r is a divisor of q.

Since \({{\mathrm{E}}}_{\mathbf{a}}[\omega ^{r(\mathsf {Pred}(\mathbf{a}) - \langle \mathbf{a}, \mathbf{s}\rangle )}] = \hat{h}_r(r\mathbf{s})\), by Theorem 4, the r-th run of \(\mathsf {AGS}\) outputs a list containing \(r\mathbf{s}\) with probability at least 1/2. Since \((r\mathbf{s})/r \mod q' = \mathbf{s} \mod q'\), it follows that the entry \((q', \mathbf{s} \mod q')\) appears in the output of \(\mathsf {List}\) with probability at least \((1/2)(\varepsilon /2) = \varepsilon /4\). Regarding time complexity, \(\mathsf {List}\) makes a call to \(\mathsf {AGS}\) for every divisor of q except q itself, so its running time is polynomial in n, \(1/\varepsilon \), and the number of divisors of q.    \(\square \)

Proof

(Proof of Claim 3.2). Let \(\varepsilon = {{\mathrm{Pr}}}[Z = 0] - 1/q\) and \(h(a) = q({{\mathrm{Pr}}}[Z = a] - {{\mathrm{Pr}}}[U = a])\), where \(U \leftarrow {\mathbb Z}_q\) is a uniform random variable. By Parseval’s identity from Fourier analysis,

$$\begin{aligned} \sum _{r \in {\mathbb Z}_q} |\hat{h}(r)|^2 = {{\mathrm{E}}}_{a \leftarrow {\mathbb Z}_q}[h(a)^2] \ge \frac{1}{q} h(0)^2 = q\varepsilon ^2. \end{aligned}$$

Expanding the Fourier coefficients of h, we obtain \(\hat{h}(r) = {{\mathrm{E}}}[\omega ^{-rZ}] - {{\mathrm{E}}}[\omega ^{-rU}]\). Therefore \(\hat{h}(0) = 0\), so \(|\hat{h}(r)|^2 = |{{\mathrm{E}}}[\omega ^{-rZ}]|^2\) must be at least as large as \(q\varepsilon ^2/(q - 1) \ge \varepsilon ^2\) for at least one nonzero value of r, giving a slightly stronger conclusion than desired.    \(\square \)

3.3 Proof of Theorem 3

On input \((\mathbf{A}, \mathbf{b})\), algorithm \(\mathsf {Learn}\) runs \(\mathsf {List}^{\mathsf {Pred}(\mathbf{A}, \left\lfloor \mathbf{b} \right\rceil _p, \cdot )}(\varepsilon /2qm)\) and outputs any \(\mathbf{s} \in \{0,1\}^n\) appearing in the list such that \(\left\lfloor \mathbf{As} \right\rceil _p = \left\lfloor \mathbf{b} \right\rceil _p\) (or the message fail if no such \(\mathbf{s}\) exists). By Theorem 1,

$$\begin{aligned} {{\mathrm{Pr}}}[\mathsf {Learn}(\mathbf{A}, \left\lfloor \mathbf{As + e} \right\rceil _p) = \mathbf{s}] \ge \frac{{{\mathrm{Pr}}}[\mathsf {Learn}(\mathbf{A}, \left\lfloor \mathbf{As} \right\rceil _p) = \mathbf{s}]^2}{(1 + 2Bp/q)^m}. \end{aligned}$$

For \(\mathsf {Learn}(\mathbf{A}, \left\lfloor \mathbf{As} \right\rceil _p)\) to output \(\mathbf{s}\) it is sufficient that \(\mathbf{s}\) appears in the output of \(\mathsf {List}^{\mathsf {Pred}(\mathbf{A}, \left\lfloor \mathbf{As} \right\rceil _p, \cdot )}(\varepsilon /2qm)\) and that no other \(\mathbf{s'} \in \{0,1\}^n\) satisfies \(\left\lfloor \mathbf{As'} \right\rceil _p = \left\lfloor \mathbf{As} \right\rceil _p\). By Lemmas 3 and 4, the list contains \(\mathbf{s} \mod q'\) for some \(q'\) with probability at least \(\varepsilon /4qm\). Since \(\mathbf{s}\) is binary, \(\mathbf{s} \mod q' = \mathbf{s}\). By a union bound, the probability that \(\left\lfloor \mathbf{As'} \right\rceil _p = \left\lfloor \mathbf{As} \right\rceil _p\) for some \(\mathbf{s'} \ne \mathbf{s}\) is at most \(2^np^{-m}\) and so

$$\begin{aligned} {{\mathrm{Pr}}}[\mathsf {Learn}(\mathbf{A}, \left\lfloor \mathbf{As + e} \right\rceil _p) = \mathbf{s}] \ge \frac{(\varepsilon /4qm - 2^np^{-m})^2}{(1 + 2Bp/q)^m}. \end{aligned}$$

3.4 Rerandomizing the Secret

Lemma 5

Let S be any distribution supported on \({\mathbb Z}_q^{n*}\). For every function R on \({\mathbb Z}_q\), there is a polynomial-time transformation that (1) maps the distribution \((\mathbf{A}, R(\mathbf{As}))_{\mathbf{A} \leftarrow {\mathbb Z}_q^{m\times n}, \mathbf{s} \leftarrow S}\) to \((\mathbf{A}, R(\mathbf{As}))_{\mathbf{A} \leftarrow {\mathbb Z}_q^{m\times n}, \mathbf{s} \leftarrow {\mathbb Z}_q^{n*}}\) and (2) maps the distribution \((\mathbf{A}, R(\mathbf{u}))_{\mathbf{A} \leftarrow {\mathbb Z}_q^{m\times n}, \mathbf{u} \leftarrow {\mathbb Z}_q^m}\) to itself.

In particular, it follows that the distinguishing advantage (1) can be preserved when the secret is chosen uniformly from \({\mathbb Z}_q^{n*}\) instead of uniformly from \(\{0,1\}^n - \{0^n\}\). The sets \({\mathbb Z}_q^{n*}\) and \(\{0,1\}^n - \{0^n\}\) can be replaced by \({\mathbb Z}_q^n\) and \(\{0,1\}^n\), respectively, if we allow for failure with probability \(O(2^{-n})\).

To prove Lemma 5 we need a basic fact from algebra. We omit the easy proof.

Claim

Multiplication by an \(n \times n\) invertible matrix over \({\mathbb Z}_q\) is a transitive action on \({\mathbb Z}_q^{n*}\).

Proof

(Proof of Lemma 5). Choose a uniformly random invertible matrix \(\mathbf{P} \in {\mathbb Z}_q^{n\times n}\) and apply the map \(f(\mathbf{a},b) = (\mathbf{Pa},b)\) to every row. Clearly this map satisfies the second condition. For the first condition, we write \(f(\mathbf{a}, R(\langle \mathbf{a}, \mathbf{s}\rangle )) = (\mathbf{Pa}, R(\langle \mathbf{a}, \mathbf{s}\rangle ))\), which is identically distributed as \((\mathbf{a}, R(\langle \mathbf{a}, \mathbf{P}^{-t}\mathbf{s}\rangle ))\). By Claim 3.4, for every \(\mathbf{s}\) in the support of S the orbit of \(\mathbf{P}^{-t}\mathbf{s}\) is \({\mathbb Z}_q^{n*}\), so by symmetry \(\mathbf{P}^{-t}\mathbf{s}\) is uniformly random in \({\mathbb Z}_q^{n*}\). Therefore the first condition also holds.    \(\square \)
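In code, the rerandomization might look as follows. This is a sketch only: it uses sympy for the exact integer determinant, and the rejection loop is sensible only for small parameters.

```python
import math
import random
from sympy import Matrix

def random_invertible(n, q):
    """Sample P uniformly from GL_n(Z_q) by rejection: P is invertible mod q
    exactly when gcd(det P, q) = 1."""
    while True:
        P = [[random.randrange(q) for _ in range(n)] for _ in range(n)]
        if math.gcd(int(Matrix(P).det()) % q, q) == 1:
            return P

def rerandomize(samples, n, q):
    """Map each sample (a, R(<a,s>)) to (P a, R(<a,s>)); the implicit secret
    becomes P^{-t} s, which is uniform over Z_q^{n*} by Claim 3.4."""
    P = random_invertible(n, q)
    return [([sum(P[i][j] * a[j] for j in range(n)) % q for i in range(n)], b)
            for a, b in samples]
```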

4 Equivalence of LWR and LWE with Uniform Errors

When the number of LWR samples is not a priori bounded, we show that the pseudorandomness (resp. one-wayness) of LWR follows from the pseudorandomness (resp. one-wayness) of LWE with a uniform noise distribution over the range of integers \([-\frac{q}{2p},\dots ,\frac{q}{2p})\). We use a rejection sampling based approach to reject LWE samples which are likely to be rounded to the wrong value in \({\mathbb Z}_p\). This comes at the cost of throwing away samples, and indeed the sample complexity of our reduction grows with q.

Theorem 5

Let p and q be integers such that p divides q. Then there is a reduction \(\mathsf {R}\) with query access to independent samples such that for every \(\mathbf{s} \in {\mathbb Z}_q^{n*}\):

  • Given query access to samples \((\mathbf{a}, \langle \mathbf{a},\mathbf{s}\rangle + e)\), \(\mathbf{a} \leftarrow {\mathbb Z}_q^n, e \leftarrow [-\frac{q}{2p},\dots ,\frac{q}{2p}\big ) \subset {\mathbb Z}_q\), \(\mathsf {R}\) outputs samples from the distribution \((\mathbf{a}, \left\lfloor \langle \mathbf{a},\mathbf{s}\rangle \right\rceil _p)\), \(\mathbf{a} \leftarrow {\mathbb Z}_q^n\),

  • Given query access to uniform samples \((\mathbf{a}, u)\), \(\mathbf{a} \leftarrow {\mathbb Z}_q^n, u \leftarrow {\mathbb Z}_q\), \(\mathsf {R}\) outputs a uniform sample \((\mathbf{a}, v)\), \(\mathbf{a} \leftarrow {\mathbb Z}_q^n\), \(v \leftarrow {\mathbb Z}_p\).

In both cases, the expected running time and sample complexity of the reduction is O(q / p).

Proof

We view the set \((q/p){\mathbb Z}_p\) as a subset of \({\mathbb Z}_q\). The reduction \(\mathsf {R}\) queries its oracle until it obtains the first sample \((\mathbf{a}, b) \in {\mathbb Z}_q^n \times {\mathbb Z}_q\) such that b is in the set \((q/p){\mathbb Z}_p\) and outputs \((\mathbf{a}, (p/q)b) \in {\mathbb Z}_q^n \times {\mathbb Z}_p\). In both cases of interest b is uniformly distributed in \({\mathbb Z}_q\), so the expected number of query calls until success is q / p.

When the queried samples are uniformly distributed in \({\mathbb Z}_q^n \times {\mathbb Z}_q\), the output is also uniformly distributed in \({\mathbb Z}_q^n \times {\mathbb Z}_p\). For queried samples of the form \((\mathbf{a}, \langle \mathbf{a},\mathbf{s}\rangle + e)\), we calculate the probability mass function of the output distribution. For every possible output \((\mathbf{a}', b')\), we have

$$\begin{aligned} {{\mathrm{Pr}}}\big [\mathsf {R} \, \text {outputs}\, (\mathbf{a}', b')\big ]&= {{\mathrm{Pr}}}\big [\mathbf{a} = \mathbf{a}' \, \text {and}\, \langle \mathbf{a},\mathbf{s}\rangle + e = (q/p)b'\ \big |\ \langle \mathbf{a},\mathbf{s}\rangle + e \in (q/p){\mathbb Z}_p\big ] \\&= {{\mathrm{Pr}}}_{\mathbf{a}}[\mathbf{a}=\mathbf{a}'] \cdot \frac{{{\mathrm{Pr}}}_{e} \bigl [\langle \mathbf{a},\mathbf{s}\rangle +e=(q/p)b'\ \big |\ \mathbf{a}=\mathbf{a}'\bigr ]}{{{\mathrm{Pr}}}_{e}\bigl [\langle \mathbf{a},\mathbf{s}\rangle +e\in (q/p) {\mathbb Z}_p\ \big |\ \mathbf{a}=\mathbf{a}'\bigr ]} \\&= q^{-n}\cdot \left\{ \begin{array}{ll} \frac{p/q}{p/q}, &{}\text {if}\, (q/p)b'-\langle \mathbf{a}',\mathbf{s}\rangle \in \bigl [-\frac{q}{2p},\dots ,\frac{q}{2p}\bigr ) \\ 0, &{} \text {otherwise.}\end{array}\right. \\&= \left\{ \begin{array}{ll} q^{-n}, &{}\text {if}\, b'=\left\lfloor \langle \mathbf{a}',\mathbf{s}\rangle \right\rceil _p \\ 0, &{}\text {otherwise.}\end{array}\right. \end{aligned}$$

This is the probability mass function of the distribution \((\mathbf{a}, \left\lfloor \langle \mathbf{a},\mathbf{s}\rangle \right\rceil _p)\), as desired.    \(\square \)
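A minimal Python sketch of the reduction \(\mathsf {R}\), with the sample oracle modeled as a zero-argument callable; the testing oracle shown below is our own illustrative construction.

```python
import random

def reduce_to_lwr(sample_oracle, q, p):
    """The reduction R of Theorem 5 (sketch): query until the second component
    lands in (q/p)Z_p, then rescale it to Z_p.  Expected q/p queries."""
    assert q % p == 0
    while True:
        a, b = sample_oracle()
        if b % (q // p) == 0:
            return a, (b * p) // q        # (p/q) * b, an element of Z_p

# Illustrative oracle: LWE samples with noise uniform over [-q/2p, q/2p).
def lwe_uniform_oracle(s, n, q, p):
    def oracle():
        a = [random.randrange(q) for _ in range(n)]
        e = random.randrange(-q // (2 * p), q // (2 * p))
        return a, (sum(ai * si for ai, si in zip(a, s)) + e) % q
    return oracle
```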

The following theorem, whose proof appears in the M.Eng. thesis of Chow [Cho13], shows that distinguishing LWR samples from uniform and inverting LWR samples are not substantially harder than they are for LWE samples under the above noise distribution.

Theorem 6

For all m, n, p, q such that p divides q, every \(\varepsilon \) (possibly negative), and every polynomial-time algorithm \(\mathsf {Dist}\) such that

$$\begin{aligned} {{\mathrm{Pr}}}_{\mathbf{A},\mathbf{s}}\bigl [\mathsf {Dist}\bigl (\mathbf{A}, \mathbf{As}+\mathbf{e}\bigr )=1\bigr ] - {{\mathrm{Pr}}}_{\mathbf{A},\mathbf{u}}\bigl [\mathsf {Dist}\bigl (\mathbf{A}, \mathbf{u}\bigr )=1\bigr ] = \varepsilon , \end{aligned}$$

there exists a polynomial time algorithm \(\mathsf {Dist'}\) such that

$$\begin{aligned} {{\mathrm{Pr}}}_{\mathbf{A},\mathbf{s}}\bigl [\mathsf {Dist}'\bigl (\mathbf{A}, \left\lfloor \mathbf{As} \right\rceil _p\bigr )=1\bigr ] - {{\mathrm{Pr}}}_{\mathbf{A},\mathbf{u}}\bigl [\mathsf {Dist}'\bigl (\mathbf{A}, \left\lfloor \mathbf{u} \right\rceil _p\bigr )=1\bigr ] = \frac{\varepsilon }{q}, \end{aligned}$$

where \(\mathbf{A}\leftarrow {\mathbb Z}_q^{m\times n}\), the noise \(\mathbf{e}\) is independent over all m coordinates and uniform over the set \([-\frac{q}{2p},\dots ,\frac{q}{2p})\subseteq {\mathbb Z}_q\) in each coordinate, and \(\mathbf{s}\) is chosen from any distribution supported on \({\mathbb Z}_q^{n*}\).

Proof

Consider the following algorithm \(\mathsf {Dist}'\).

On input \((\mathbf{A}, \mathbf{b}) = ((\mathbf{a}_1, b_1), \dots , (\mathbf{a}_m, b_m))\) (\(\mathbf{a}_j \in {\mathbb Z}_q^n\), \(b_j \in {\mathbb Z}_p\)):

  1. Sample a random \(\mathbf{r} \leftarrow {\mathbb Z}^n_q\) and a random \(\mathbf{c}\leftarrow {\mathbb Z}_q^m\).

  2. Obtain \(\mathbf{A}', \mathbf{b}'\in {\mathbb Z}_q^{m\times n}\times {\mathbb Z}_q^m\) from \(\mathbf{A}, \mathbf{b}\) by letting \(\mathbf{A}' = \mathbf{A} - \mathbf{c}\bullet \mathbf{r}\), \(\mathbf{b}'=\frac{q}{p}\cdot \mathbf{b} - \mathbf{c}\).

  3. If \(\mathsf {Dist}(\mathbf{A}',\mathbf{b}') = 1\), output 1. Otherwise, output 0.

Here, \(\mathbf{c} \bullet \mathbf{r}\) is the outer product of the vectors \(\mathbf{c}\) and \(\mathbf{r}\).
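In code, the transformation of steps 1-3 might look as follows (a sketch; dist stands for the assumed LWE distinguisher, given as a callable):

```python
import random

def dist_prime(A, b, dist, q, p):
    """Dist' from Theorem 6 (sketch).  A is an m x n matrix over Z_q, b is a
    length-m vector over Z_p, and dist is the assumed LWE distinguisher."""
    m, n = len(A), len(A[0])
    r = [random.randrange(q) for _ in range(n)]       # step 1
    c = [random.randrange(q) for _ in range(m)]       # step 1
    # Step 2: A' = A - c (outer product) r,  b' = (q/p) * b - c.
    A2 = [[(A[i][j] - c[i] * r[j]) % q for j in range(n)] for i in range(m)]
    b2 = [((q // p) * b[i] - c[i]) % q for i in range(m)]
    return 1 if dist(A2, b2) == 1 else 0              # step 3
```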

When \(\mathbf{b} = \left\lfloor \mathbf{u} \right\rceil _p\), \((\mathbf{A}',\mathbf{b}')\) is distributed as \((\mathbf{A}',\mathbf{u})\). When \(\mathbf{b} = \left\lfloor \mathbf{As} \right\rceil _p\), we can write

$$\begin{aligned} (\mathbf{A}',\mathbf{b}')&= (\mathbf{A} - \mathbf{c}\bullet \mathbf{r}, \tfrac{q}{p}\cdot \left\lfloor \mathbf{As} \right\rceil _p - \mathbf{c}) \\&= (\mathbf{A}',\tfrac{q}{p}\cdot \left\lfloor \mathbf{A's}+\mathbf{c}\cdot \langle \mathbf{r},\mathbf{s}\rangle \right\rceil _p - \mathbf{c} )\\&= (\mathbf{A}',\mathbf{A's}+\mathbf{c}\cdot \langle \mathbf{r},\mathbf{s}\rangle - \{\mathbf{A's}+\mathbf{c}\cdot \langle \mathbf{r},\mathbf{s}\rangle \}_p - \mathbf{c} ) \\&= (\mathbf{A}',\mathbf{A's}+\mathbf{c}\cdot (\langle \mathbf{r},\mathbf{s}\rangle -1) - \{\mathbf{A's}+\mathbf{c}\cdot \langle \mathbf{r},\mathbf{s}\rangle \}_p) \end{aligned}$$

where \(\{x\}_p = x - \frac{q}{p}\cdot \left\lfloor x \right\rceil _p\). Conditioned on \(\langle \mathbf{r},\mathbf{s}\rangle =1\), \((\mathbf{A}',\mathbf{b}')\) is distributed as \((\mathbf{A}', \mathbf{A's}+\{\mathbf{u}\}_p)\), which is the same as \((\mathbf{A}', \mathbf{A's}+\mathbf{e})\) where each coordinate of \(\mathbf{e}\) is uniformly drawn from the set \([-\frac{q}{2p},\dots ,\frac{q}{2p})\subseteq {\mathbb Z}_q\). In this case \(\mathsf {Dist}\) has distinguishing advantage \(\varepsilon \). Conditioned on \(\langle \mathbf{r},\mathbf{s}\rangle \ne 1\), \((\mathbf{A}',\mathbf{b}')\) is distributed uniformly over \({\mathbb Z}_q^{m\times n}\times {\mathbb Z}_q^m\) and \(\mathsf {Dist}\) has zero distinguishing advantage. Since for any \(\mathbf{s}\in {\mathbb Z}_q^{n*}\) the probability that \(\langle \mathbf{r},\mathbf{s}\rangle =1\) equals 1/q over the random choice of \(\mathbf{r}\), the overall distinguishing advantage is \(\varepsilon /q\).    \(\square \)

5 Noise Flooding

In this section, let \(\chi _\sigma \) denote the discrete Gaussian distribution on \({\mathbb Z}_q\) with standard deviation \(\sigma \): \(\chi _\sigma (x)\) is proportional to \(\exp \bigl (-\pi (x/\sigma )^2\bigr )\). Often in applications of LWE, one is given a sample \((\mathbf{a},b)\) with \(b=\langle \mathbf{a},\mathbf{s}\rangle +e\) for \(e\leftarrow \chi _\sigma \) and by performing various arithmetic operations obtains a new pair \((\mathbf{a}',b')\) with \(b'=\langle \mathbf{a}',\mathbf{s}'\rangle +e'\). Sometimes, the noise quantity \(e'\) obtained is not distributed according to a Gaussian, but is only subject to an overall bound on its absolute value. If the proof of security needs \((\mathbf{a}',b')\) to be an LWE instance, then sometimes the “noise flooding” technique is used where a fresh Gaussian \(x\leftarrow \chi _{\sigma '}\) is drawn and \(b'\) is set to \(\langle \mathbf{a}',\mathbf{s}'\rangle +e'+x\). As long as \(e'+\chi _{\sigma '}\approx _\mathsf{s}\chi _{\sigma '}\) the resulting \((\mathbf{a}',b')\) is statistically close to a fresh LWE instance. This technique in some form or another appears in many places, for example [AIK11, GKPV10, DGK+10, OPW11]. Unfortunately, \(e'+\chi _{\sigma '}\approx _\mathsf{s}\chi _{\sigma '}\) requires q to be large and so the applications also carry this requirement. In this section we bound the continuous analogue of Rényi divergence between \(e'+\chi _{\sigma '}\) and \(\chi _{\sigma '}\) and show that the noise flooding technique can be used even when q is polynomial in the security parameter, as long as the number of samples is also bounded.
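As a toy illustration of the flooding step (our sketch, not the paper's construction; note that random.gauss takes a standard deviation, which differs from the width parameter of \(\chi _\sigma \) by a constant factor, immaterial here):

```python
import random

def flood(a_prime, b_prime, sigma_prime, q):
    """Noise-flooding rerandomization (sketch): add a fresh rounded Gaussian
    x to b' = <a',s'> + e', drowning out the B-bounded term e'."""
    x = round(random.gauss(0, sigma_prime))
    return a_prime, (b_prime + x) % q
```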

We remark that our main result in this section, Corollary 1, follows from general results in prior work which bound the Rényi divergence between Gaussians. For example, Lemma 4.2 of [LSS14] implies Corollary 1 below. However, we are unaware of a theorem in the literature with a simple statement which subsumes Corollary 1. We include a proof for completeness.

Claim

Let \(\varPsi _\alpha \) be the continuous Gaussian on \({\mathbb R}\) with standard deviation \(\alpha \): \(\varPsi _\alpha (x)=\alpha ^{-1}e^{-\pi (x/\alpha )^2}\). Then for any \(\beta \in {\mathbb R}\),

$$\mathrm {RD}_2(\beta +\varPsi _\alpha \Vert \varPsi _\alpha ) =e^{2\pi (\beta /\alpha )^2}.$$

Proof

We have

$$\begin{aligned} \mathrm {RD}_2(\beta +\varPsi _\alpha \Vert \varPsi _\alpha )= & {} \alpha ^{-1}\int _{-\infty }^\infty e^{-\bigl (\pi /\alpha ^2\bigr ) \bigl [2(x-\beta )^{2}-x^2\bigr ]}dx\\= & {} \alpha ^{-1}\cdot e^{2\pi \bigl (\beta /\alpha \bigr )^2}\int _{-\infty }^\infty e^{-\bigl (\pi /\alpha ^2\bigr )\bigl [(x-2\beta )^2\bigr ]}dx\\= & {} e^{2\pi \bigl (\beta /\alpha \bigr )^2}. \end{aligned}$$

We have used the substitution \(u=x-2\beta \) and the identity \(\int _{\mathbb {R}}e^{-\pi cu^2}du=c^{-1/2}\) for all \(c>0\).    \(\square \)
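A quick numeric sanity check of the claim (an illustration only; the step size and integration cutoff are arbitrary choices of ours):

```python
import math

def rd2_shifted_gaussian(beta, alpha, step=1e-3, lim=20.0):
    """Riemann-sum estimate of RD_2(beta + Psi_alpha || Psi_alpha).

    The integrand Psi_alpha(x - beta)^2 / Psi_alpha(x) is expanded in closed
    form to avoid underflowing the ratio in the tails."""
    total, x = 0.0, -lim
    while x < lim:
        total += math.exp(-(math.pi / alpha ** 2)
                          * (2 * (x - beta) ** 2 - x ** 2)) / alpha * step
        x += step
    return total

beta, alpha = 0.3, 1.0
print(rd2_shifted_gaussian(beta, alpha))             # ~ 1.7603
print(math.exp(2 * math.pi * (beta / alpha) ** 2))   # = 1.7603...
```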

Corollary 1

Fix \(m,q,k\in {\mathbb Z}\), a bound B, and a standard deviation \(\sigma \) such that \(B<\sigma <q\). Moreover, let \(e\in {\mathbb Z}_q\) be such that \(|e|\le B\). If \(\sigma =\varOmega \bigl (B\sqrt{m/\log k}\bigr )\), then

$$\mathrm {RD}_2\bigl ((e+\chi _\sigma )^m\Vert \chi _\sigma ^m\bigr )=\mathrm {poly}(k)$$

where \(\mathsf{X}^m\) denotes m independent samples from \(\mathsf{X}\).

Proof

Rényi divergence cannot grow by applying a function to both distributions. Since the discrete Gaussians \(e+\chi _\sigma \) and \(\chi _\sigma \) are obtained from the continuous Gaussians \(\beta +\varPsi _\alpha \) and \(\varPsi _\alpha \) by scaling and rounding, where \(\beta =|e|/q\) and \(\alpha =\sigma /q\), we see that

$$\mathrm {RD}_2\bigl (e+\chi _\sigma \Vert \chi _\sigma \bigr )\le \mathrm {RD}_2\bigl (\beta +\varPsi _\alpha \Vert \varPsi _\alpha \bigr )=\exp \bigl (2\pi (\beta / \alpha )^2\bigr )\le \exp \bigl (2\pi (B/\sigma )^2\bigr ).$$

Therefore, \(\mathrm {RD}_2\bigl ((e+\chi _\sigma )^m\Vert \chi _\sigma ^m\bigr )\le \exp \bigl ( 2\pi m(B/\sigma )^2\bigr )\), and the result follows.    \(\square \)