Learning with errors (LWE) was proposed by O. Regev in 2005 (see Regev, 2009) and can be regarded as a dual form of the SIS problem. LWE has very important applications in modern cryptography, such as LWE-based fully homomorphic encryption. The main purpose of this chapter is to explain the mathematical principles of the LWE problem in detail, especially the polynomial time equivalence between the average case LWE problem and hard problems on lattices, which generalizes the Ajtai reduction principle and effectively settles the computational complexity of the LWE problem.

3.1 Circulant Matrix

The circulant matrix is a simple and elegant class of special matrices with important applications in many fields of engineering. In Sect. 7.7 of ‘Modern Cryptography’, we explain and demonstrate the basic properties of circulant matrices in detail. See the monograph Zheng (2022) on circulant matrices for more details.

Let T be a square matrix of order n,

$$\begin{aligned} T=\left( \begin{array}{ccc|c} 0 &{} \cdots &{} 0 &{} 1\\ \hline &{} &{} &{} 0\\ &{} I_{n-1} &{} &{} \vdots \\ &{} &{} &{} 0 \\ \end{array} \right) _{n\times n}, \end{aligned}$$
(3.1.1)

where \(I_{n-1}\) is the identity matrix of order \(n-1\). Obviously, T defines a linear transformation \(x\rightarrow Tx\), \(x\in \mathbb {R}^n\), of \(\mathbb {R}^n\rightarrow \mathbb {R}^n\). The characteristic polynomial of T is \(f(x)=x^n-1\), so \(T^n=I_n\). We use column notation for vectors in \(\mathbb {R}^n\), and \(\{e_0,e_1,\dots ,e_{n-1}\}\) is the standard basis of \(\mathbb {R}^n\), i.e.

$$\begin{aligned} e_0=\begin{pmatrix} 1 \\ 0 \\ 0 \\ \vdots \\ 0 \end{pmatrix},\ e_1= \begin{pmatrix} 0 \\ 1 \\ 0 \\ \vdots \\ 0 \end{pmatrix},\ \dots ,\ e_{n-1}= \begin{pmatrix} 0 \\ 0 \\ \vdots \\ 0 \\ 1 \end{pmatrix}. \end{aligned}$$
(3.1.2)

We write \(e_m=e_k\) whenever \(m\equiv k\ (\text {mod}\ n)\) and \(0\leqslant k\leqslant n-1\); it is then easy to see that

$$\begin{aligned} Te_k=e_{k+1},\ \text {and}\ T^k(e_0)=e_k,\ 0\leqslant k\leqslant n-1. \end{aligned}$$
(3.1.3)

Definition 3.1.1

Let \(\alpha =\begin{pmatrix} \alpha _0 \\ \vdots \\ \alpha _{n-1} \end{pmatrix}\in \mathbb {R}^n\). The circulant matrix \(T^{*}(\alpha )\) generated by \(\alpha \) is defined by

$$\begin{aligned} T^*(\alpha )=[\alpha ,T\alpha ,\dots ,T^{n-1}\alpha ]_{n\times n}\in \mathbb {R}^{n\times n}. \end{aligned}$$
(3.1.4)

It is easy to verify that the circulant matrix generated by a linear combination of vectors is the corresponding linear combination of the circulant matrices, i.e.

$$\begin{aligned} T^*(a\alpha +b\beta )=aT^*(\alpha )+bT^*(\beta ). \end{aligned}$$
(3.1.5)

In particular, for any \(\alpha =\begin{pmatrix} \alpha _0 \\ \vdots \\ \alpha _{n-1} \end{pmatrix}\in \mathbb {R}^n\), the circulant matrix \(T^{*}(\alpha )\) generated by \(\alpha \) can be written as

$$\begin{aligned} T^*(\alpha )=T^*\left( \sum \limits _{i=0}^{n-1} \alpha _i e_i \right) =\sum \limits _{i=0}^{n-1} \alpha _i T^*(e_i), \end{aligned}$$
(3.1.6)

therefore, any circulant matrix is a linear combination of the circulant matrices generated by the standard basis vectors \(e_i\). It is easy to verify that

$$\begin{aligned} T^*(e_k)=T^k,\ 0\leqslant k\leqslant n-1. \end{aligned}$$
(3.1.7)

In particular, the identity matrix \(I_n\) is the circulant matrix generated by the vector \(e_0\). The basic properties of circulant matrices are summarized in the following lemma, and the corresponding proofs can be found in Sect. 7.7 of Zheng (2022).

Lemma 3.1.1

Let \(\alpha =\begin{pmatrix} \alpha _0 \\ \alpha _1 \\ \vdots \\ \alpha _{n-1} \end{pmatrix}\), \(\beta =\begin{pmatrix} \beta _0 \\ \beta _1 \\ \vdots \\ \beta _{n-1} \end{pmatrix}\) be two vectors in \(\mathbb {R}^n\). Then we have:

(i) \(T^*(\alpha )=\alpha _0 I_n+\alpha _1 T+\cdots +\alpha _{n-1}T^{n-1}\).

(ii) \(T^*(\alpha )\cdot T^*(\beta )=T^*(\beta )\cdot T^*(\alpha )\).

(iii) \(T^*(\alpha )\cdot T^*(\beta )=T^*(T^*(\alpha )\beta )\).

(iv) \(\text {det}(T^*(\alpha ))=\prod _{i=0}^{n-1} \alpha (w_i)\), where \(\alpha (x)=\alpha _0+\alpha _1 x+\cdots +\alpha _{n-1}x^{n-1}\) is the polynomial corresponding to \(\alpha \) and \(w_0,w_1,\dots ,w_{n-1}\) are the n-th roots of unity.

(v) \(T^*(\alpha )\) is an invertible matrix if and only if the polynomial \(\alpha (x)\) corresponding to \(\alpha \) and \(x^n-1\) are coprime, i.e. \((\alpha (x),x^n-1)=1\).

Working modulo the characteristic polynomial \(x^n-1\), we construct a one-to-one correspondence between polynomial quotient rings and n dimensional vectors, which is called the geometric theory of polynomial rings. We consider the following three polynomial quotient rings. Let \(\mathbb {R}[x]\), \(\mathbb {Z}[x]\) and \(\mathbb {Z}_q[x]\) be the rings of polynomials in one variable over \(\mathbb {R}\), \(\mathbb {Z}\) and \(\mathbb {Z}_q\) respectively, and define

$$\begin{aligned} \overline{R}=\mathbb {R}[x]/<x^n-1>=\left\{ \sum \limits _{i=0}^{n-1}a_i x^i\ |\ a_i\in \mathbb {R}\right\} , \end{aligned}$$
(3.1.8)
$$\begin{aligned} R=\mathbb {Z}[x]/<x^n-1>=\left\{ \sum \limits _{i=0}^{n-1}a_i x^i\ |\ a_i\in \mathbb {Z}\right\} , \end{aligned}$$
(3.1.9)

and

$$\begin{aligned} R_q=\mathbb {Z}_q[x]/<x^n-1>=\left\{ \sum \limits _{i=0}^{n-1}a_i x^i\ |\ a_i\in \mathbb {Z}_q\right\} . \end{aligned}$$
(3.1.10)

In fact, the right hand sides of the above formulas are sets of representatives of the corresponding polynomial quotient rings.

For any \(\alpha (x)=\alpha _0+\alpha _1 x+\cdots +\alpha _{n-1} x^{n-1}\in \overline{R}\), we construct the following correspondence

$$\begin{aligned} \alpha (x)=\alpha _0+\alpha _1 x+\cdots +\alpha _{n-1} x^{n-1}\in \overline{R}\longleftrightarrow \alpha =\begin{pmatrix} \alpha _0 \\ \alpha _1 \\ \vdots \\ \alpha _{n-1} \end{pmatrix}\in \mathbb {R}^n, \end{aligned}$$
(3.1.11)

written as \(\alpha (x)\longleftrightarrow \alpha \) or \(\alpha \longleftrightarrow \alpha (x)\). Then (3.1.11) gives a one-to-one correspondence between \(\overline{R}\) and \(\mathbb {R}^n\). In the same way, \(\alpha (x)\longleftrightarrow \alpha \) also gives one-to-one correspondences \(R\rightarrow \mathbb {Z}^n\) and \(R_q\rightarrow \mathbb {Z}_q^n\). It is not hard to see that these correspondences are Abelian group isomorphisms. To establish ring isomorphisms, we introduce the concept of convolution multiplication of vectors.

Definition 3.1.2

For any two vectors \(\alpha \), \(\beta \) in \(\mathbb {R}^n\), \(\mathbb {Z}^n\) or \(\mathbb {Z}_q^n\), we define the convolution \(\alpha \otimes \beta \) by

$$\begin{aligned} \alpha \otimes \beta =T^*(\alpha )\cdot \beta . \end{aligned}$$
(3.1.12)

Under the above definition, \(\mathbb {R}^n\), \(\mathbb {Z}^n\) and \(\mathbb {Z}_q^n\) each become a commutative ring with unit element. Obviously, the convolution defined by (3.1.12) is closed on \(\mathbb {Z}^n\) and \(\mathbb {Z}_q^n\): if \(\alpha \in \mathbb {Z}^n\), then \(T^*(\alpha )\in \mathbb {Z}^{n\times n}\), thus \(\alpha \otimes \beta =T^*(\alpha )\beta \in \mathbb {Z}^n\), and similarly for \(\mathbb {Z}_q^n\). Based on property (iii) of Lemma 3.1.1,

$$\begin{aligned} T^*(\alpha \otimes \beta )=T^*(T^*(\alpha )\beta )=T^*(\alpha )T^*(\beta )=T^*(\beta )T^*(\alpha )=T^*(\beta \otimes \alpha ), \end{aligned}$$

so we have \(\alpha \otimes \beta =\beta \otimes \alpha \). On the other hand,

$$\begin{aligned} (\alpha +\alpha ')\otimes \beta =T^*(\alpha +\alpha ')\beta =T^*(\alpha )\beta +T^*(\alpha ')\beta =\alpha \otimes \beta +\alpha '\otimes \beta , \end{aligned}$$

hence, \(\mathbb {R}^n\), \(\mathbb {Z}^n\) and \(\mathbb {Z}_q^n\) are commutative rings with the same unit element \(e_0\): since \(T^*(e_0)=I_n\), we have

$$\begin{aligned} e_0\otimes \beta =T^*(e_0)\beta =I_n\beta =\beta . \end{aligned}$$

Lemma 3.1.2

Suppose \(\overline{R}\), R and \(R_q\) are defined by (3.1.8), (3.1.9) and (3.1.10), then we have the following three ring isomorphisms:

$$\begin{aligned} \overline{R}\cong \mathbb {R}^n,\ R\cong \mathbb {Z}^n\ \text {and}\ R_q\cong \mathbb {Z}_q^n. \end{aligned}$$

Proof

We only prove \(\overline{R}\cong \mathbb {R}^n\); the other two conclusions can be proved in the same way. For any \(\alpha (x)\in \overline{R}\), \(\alpha (x)\longleftrightarrow \alpha \in \mathbb {R}^n\) is a one-to-one correspondence and an Abelian group isomorphism. It remains to prove

$$\begin{aligned} \alpha (x)\beta (x)\longleftrightarrow \alpha \otimes \beta ,\ \forall \alpha (x),\beta (x)\in \overline{R}. \end{aligned}$$
(3.1.13)

Let \(\beta (x)=\beta _0+\beta _1 x+\cdots +\beta _{n-1}x^{n-1}\), then

$$\begin{aligned} \begin{aligned} x\beta (x)&=\beta _0 x+\beta _1 x^2+\cdots +\beta _{n-2}x^{n-1}+\beta _{n-1}x^n \\&=\beta _{n-1}+\beta _0 x+\cdots +\beta _{n-2}x^{n-1}, \end{aligned} \end{aligned}$$

so \(x\beta (x)\longleftrightarrow T\beta \). For all k, \(0\leqslant k\leqslant n-1\), we know

$$\begin{aligned} x^k \beta (x)\longleftrightarrow T^k\beta . \end{aligned}$$

Let \(\alpha (x)=\alpha _0+\alpha _1 x+\cdots +\alpha _{n-1}x^{n-1}\), it follows that

$$\begin{aligned} \alpha (x)\beta (x)=\sum \limits _{k=0}^{n-1} \alpha _k x^k \beta (x)\longleftrightarrow \sum \limits _{k=0}^{n-1} \alpha _k T^k \beta =T^*(\alpha )\beta =\alpha \otimes \beta . \end{aligned}$$

Therefore, \(\overline{R}\cong \mathbb {R}^n\). Similarly, \(R\cong \mathbb {Z}^n\) and \(R_q\cong \mathbb {Z}_q^n\).    \(\square \)
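
Lemma 3.1.2 is easy to check numerically. The following sketch (in Python with numpy; the helper names T_star, conv and polymul_mod are ours, not from the text) builds \(T^*(\alpha )\), computes \(\alpha \otimes \beta =T^*(\alpha )\beta \), and compares it with the coefficient vector of \(\alpha (x)\beta (x)\ \text {mod}\ x^n-1\), as in (3.1.13):

```python
import numpy as np

def T_star(a):
    """Circulant matrix T*(a) = [a, Ta, ..., T^{n-1} a]; column k is a shifted by k."""
    n = len(a)
    return np.column_stack([np.roll(a, k) for k in range(n)])

def conv(a, b):
    """Convolution a (x) b = T*(a) b, cf. Definition 3.1.2."""
    return T_star(a) @ b

def polymul_mod(a, b):
    """Coefficients of a(x) b(x) mod x^n - 1 (fold x^{n+j} back to x^j)."""
    n = len(a)
    full = np.convolve(a, b)          # plain product, degree <= 2n - 2
    out = full[:n].copy()
    out[: n - 1] += full[n:]          # reduction by x^n = 1
    return out

rng = np.random.default_rng(0)
n = 6
a, b = rng.integers(-9, 10, n), rng.integers(-9, 10, n)
assert np.array_equal(conv(a, b), polymul_mod(a, b))   # ring isomorphism (3.1.13)
assert np.array_equal(conv(a, b), conv(b, a))          # commutativity of the convolution
```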

Since \(\mathbb {R}^n\) is a Euclidean space, the Euclidean distances in \(\mathbb {Z}^n\) and \(\mathbb {Z}_q^n\) can be defined as the Euclidean distance in \(\mathbb {R}^n\), which is called the embedding of the Euclidean distance into \(\mathbb {Z}^n\) and \(\mathbb {Z}_q^n\). By Lemma 3.1.2, we identify \(\overline{R}\), R, \(R_q\) with \(\mathbb {R}^n\), \(\mathbb {Z}^n\), \(\mathbb {Z}_q^n\) and write \(\overline{R}=\mathbb {R}^n\), \(R=\mathbb {Z}^n\), \(R_q=\mathbb {Z}_q^n\). Therefore, the polynomial rings \(\overline{R}\), R and \(R_q\) also carry a Euclidean distance, which constitutes the geometry of the polynomial ring. For any polynomial \(\alpha (x)\in \overline{R}\), we define

$$\begin{aligned} |\alpha (x)|=|\alpha |,\ \text {if}\ \alpha (x)\longleftrightarrow \alpha . \end{aligned}$$
(3.1.14)

Lemma 3.1.3

For any \(\alpha (x)\), \(\beta (x)\in \overline{R}\) (or R, \(R_q\)), we have

$$\begin{aligned} |\alpha (x)\beta (x)|\leqslant \sqrt{n}|\alpha (x)|\cdot |\beta (x)|. \end{aligned}$$

Proof

To prove this lemma, it suffices to prove that for any \(\alpha \), \(\beta \in \mathbb {R}^n\) (and likewise for \(\mathbb {Z}^n\) or \(\mathbb {Z}_q^n\)), we have

$$\begin{aligned} |\alpha \otimes \beta |\leqslant \sqrt{n}|\alpha |\cdot |\beta |. \end{aligned}$$
(3.1.15)

By Definition 3.1.2,

$$\begin{aligned} \alpha \otimes \beta =T^*(\alpha )\beta =[\alpha ,T\alpha ,\dots ,T^{n-1}\alpha ]\beta =\begin{pmatrix} b_1 \\ b_2 \\ \vdots \\ b_n \end{pmatrix}\in \mathbb {R}^n. \end{aligned}$$

Let \(\overline{\alpha }\) be the conjugate vector of \(\alpha \), i.e.

$$\begin{aligned} \alpha =\begin{pmatrix} \alpha _0 \\ \alpha _1 \\ \vdots \\ \alpha _{n-1} \end{pmatrix}\Rightarrow \overline{\alpha }=\begin{pmatrix} \alpha _{n-1} \\ \alpha _{n-2} \\ \vdots \\ \alpha _0 \end{pmatrix}, \end{aligned}$$

then the circulant matrix \(T^*(\alpha )\) generated by \(\alpha \) can be written row by row as

$$\begin{aligned} T^*(\alpha )=\begin{pmatrix} \overline{\alpha }^{T}T^T \\ \overline{\alpha }^{T}(T^T)^2 \\ \vdots \\ \overline{\alpha }^{T}(T^T)^n \end{pmatrix}, \end{aligned}$$

where \(T^T\) is the transpose of T. So \(b_i=\overline{\alpha }^{T}(T^T)^i \beta \ (1\leqslant i\leqslant n)\), and since \(T^T\) is orthogonal, the Cauchy-Schwarz inequality gives

$$\begin{aligned} |b_i|\leqslant |\alpha |\cdot |\beta |,\ 1\leqslant i\leqslant n. \end{aligned}$$

It follows that

$$\begin{aligned} |\alpha \otimes \beta |= \left( \sum \limits _{i=1}^{n}b_i^2 \right) ^{\frac{1}{2}}\leqslant \sqrt{n}|\alpha |\cdot |\beta |. \end{aligned}$$

We complete the proof.    \(\square \)
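
The bound of Lemma 3.1.3 can likewise be tested numerically (a self-contained sketch; the conv helper is our own name):

```python
import numpy as np

def conv(a, b):
    """alpha (x) beta = T*(alpha) beta, as in Definition 3.1.2."""
    n = len(a)
    return np.column_stack([np.roll(a, k) for k in range(n)]) @ b

rng = np.random.default_rng(1)
n = 8
for _ in range(1000):
    a, b = rng.normal(size=n), rng.normal(size=n)
    # inequality (3.1.15): |a (x) b| <= sqrt(n) |a| |b|
    assert np.linalg.norm(conv(a, b)) <= np.sqrt(n) * np.linalg.norm(a) * np.linalg.norm(b) + 1e-9
```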

Finally, we discuss the relation between circulant matrices and lattices. Let \(B\in \mathbb {R}^{n\times n}\) be a square matrix of order n; the lattice \(L(B)\subset \mathbb {R}^n\) generated by B is defined by

$$\begin{aligned} L(B)=\{Bx\ |\ x\in \mathbb {Z}^n\}. \end{aligned}$$

If B is an invertible matrix, then L(B) is called an n dimensional full rank lattice.

Definition 3.1.3

Let \(L(B)\subset \mathbb {R}^n\) be a lattice. We call L(B) a cyclic lattice if L(B) is closed under the linear transformation T, i.e. for any \(\alpha \in L(B)\) we have \(T\alpha \in L(B)\). If \(L(B)\subset \mathbb {Z}^n\) is a cyclic lattice, then L(B) is called a cyclic integer lattice.

Lemma 3.1.4

Let \(\alpha \in \mathbb {R}^n\), then the lattice \(L(T^*(\alpha ))\) generated by the circulant matrix \(T^*(\alpha )\) is a cyclic lattice, which is the smallest cyclic lattice containing \(\alpha \).

Proof

Based on the definition \(T^*(\alpha )=[\alpha ,T\alpha ,\dots ,T^{n-1}\alpha ]\), we get

$$\begin{aligned} L(T^*(\alpha ))=\left\{ \sum \limits _{i=0}^{n-1}a_i T^i \alpha \ |\ a_i\in \mathbb {Z}\right\} . \end{aligned}$$

For any \(\beta \in L(T^*(\alpha ))\),

$$\begin{aligned} \beta =\sum \limits _{i=0}^{n-1}b_i T^i \alpha \Rightarrow T\beta \in L(T^*(\alpha )),\ b_i\in \mathbb {Z}, \end{aligned}$$

so \(L(T^*(\alpha ))\) is a cyclic lattice. Now assume L is any cyclic lattice containing \(\alpha \). Since \(\alpha \in L\), we have \(T\alpha \in L,\dots ,T^{n-1}\alpha \in L\), and hence every linear combination with integer coefficients satisfies

$$\begin{aligned} \sum \limits _{i=0}^{n-1} a_i T^i \alpha \in L\Rightarrow L(T^*(\alpha ))\subset L. \end{aligned}$$

This means that \(L(T^*(\alpha ))\) is the smallest cyclic lattice containing \(\alpha \).    \(\square \)
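
The closedness can also be seen from the matrix identity \(T\cdot T^*(\alpha )=T^*(\alpha )\cdot T\) (a special case of Lemma 3.1.1 (ii), since \(T=T^*(e_1)\)): for any lattice vector \(v=T^*(\alpha )x\) with \(x\in \mathbb {Z}^n\), we get \(Tv=T^*(\alpha )(Tx)\in L(T^*(\alpha ))\). A minimal numerical check (our own sketch):

```python
import numpy as np

n = 5
T = np.roll(np.eye(n, dtype=int), 1, axis=0)    # T e_k = e_{k+1 mod n}, cf. (3.1.1)
a = np.arange(1, n + 1)
T_star = np.column_stack([np.roll(a, k) for k in range(n)])  # [a, Ta, ..., T^{n-1} a]
# T T*(a) = T*(a) T, so T(T*(a) x) = T*(a)(T x) with T x in Z^n
assert np.array_equal(T @ T_star, T_star @ T)
```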

Lemma 3.1.5

Let \(L(B)\subset \mathbb {R}^n\) be a cyclic lattice, \(\alpha \in L(B)\) be a lattice vector, then there is an integer matrix \(D\in \mathbb {Z}^{n\times n}\) such that

$$\begin{aligned} T^*(\alpha )=BD. \end{aligned}$$
(3.1.16)

Proof

Since \(\alpha \in L(B)\) and L(B) is a cyclic lattice, we have \(T\alpha \in L(B)\), \(T^2 \alpha \in L(B),\dots \), \(T^{n-1}\alpha \in L(B)\). Let (\(0\leqslant k\leqslant n-1\))

$$\begin{aligned} T^k \alpha =Bd_k,\ d_k\in \mathbb {Z}^n,\ D=[d_0,d_1,\dots ,d_{n-1}]_{n\times n}\in \mathbb {Z}^{n\times n}, \end{aligned}$$

the circulant matrix \(T^*(\alpha )\) generated by \(\alpha \) could be written as

$$\begin{aligned} T^*(\alpha )=[\alpha ,T\alpha ,\dots ,T^{n-1}\alpha ]=[Bd_0,Bd_1,\dots ,Bd_{n-1}]=BD. \end{aligned}$$

Lemma 3.1.5 holds.    \(\square \)

Let \(L\subset \mathbb {R}^n\) be a lattice. For any \(x\in \mathbb {R}^n\), there exists \(u_x\in L\) such that

$$\begin{aligned} |x-u_x|=\min \limits _{\alpha \in L}|\alpha -x|=|x-L|. \end{aligned}$$
(3.1.17)

\(u_x\) is called the nearest lattice vector to x. We define the covering radius \(\rho (L)\) of L by

$$\begin{aligned} \rho (L)=\max \limits _{x\in \mathbb {R}^n}|x-u_x|=\max \limits _{x\in \mathbb {R}^n}|x-L|. \end{aligned}$$
(3.1.18)

Obviously, the covering radius \(\rho (L)\) has the property that every ball \(N(x,\rho (L))\) of radius \(\rho (L)\) contains at least one lattice vector. If \(L_1\subset L\) is a sublattice, then for any \(x\in \mathbb {R}^n\),

$$\begin{aligned} |x-L|\leqslant |x-L_1|\Rightarrow \rho (L)\leqslant \rho (L_1). \end{aligned}$$
(3.1.19)

If \(L=L(B)\), we write \(\rho (L)=\rho (B)\). The final goal of this section is to prove the existence of the covering radius and give an upper bound estimate of \(\rho (L)\) using Babai’s nearest plane algorithm.

Let \(L=L(B)\), and let \(S=\{s_1,s_2,\dots ,s_n\}\subset L\) be n linearly independent lattice vectors. \(S^*=\{s_1^*,s_2^*,\dots ,s_n^*\}\) is the orthogonal basis obtained from S by the Gram-Schmidt method. We define

$$\begin{aligned} \sigma (S)=\left( \sum \limits _{i=1}^{n} |s_i^*|^2\right) ^{\frac{1}{2}}. \end{aligned}$$
(3.1.20)

Lemma 3.1.6

(Babai) Let \(L=L(B)\subset \mathbb {R}^n\) be a full rank lattice and \(S\subset L\) a set of n linearly independent lattice vectors. Then for any \(t\in \mathbb {R}^n\), there exists a lattice vector \(w\in L\) such that

$$\begin{aligned} |t-w|\leqslant \frac{1}{2}\sigma (S). \end{aligned}$$
(3.1.21)

In particular, the covering radius \(\rho (L)\) of L exists and satisfies \(\rho (L)\leqslant \frac{1}{2}\sigma (S)\).

Proof

Without loss of generality, we only prove the case \(S=B\): since \(L(S)\subset L(B)\) is a full rank sublattice, a vector \(w\in L(S)\) satisfying (3.1.21) also lies in L(B), and \(\rho (L)\leqslant \rho (S)\leqslant \frac{1}{2}\sigma (S)\). Let \(B=[\beta _1,\beta _2,\dots ,\beta _n]\), with corresponding orthogonal basis \(B^*=[\beta _1^*,\beta _2^*,\dots ,\beta _n^*]\). Babai's algorithm is based on the following two techniques:

(1) Rounding off (see Theorem 7 of Chap. 7 in Zheng (2022))

For any \(x\in \mathbb {R}^n\), write \(x=\sum \nolimits _{i=1}^{n} x_i \beta _i^*\), where \(x_i\in \mathbb {R}\). Let \(\delta _i\in \mathbb {Z}\) be the nearest integer to \(x_i\), and define

$$\begin{aligned}{}[x]_B=\sum \limits _{i=1}^n \delta _i \beta _i^*,\ \{x\}_B=\sum \limits _{i=1}^n a_i \beta _i^*,\ -\frac{1}{2}<a_i\leqslant \frac{1}{2},\ 1\leqslant i\leqslant n. \end{aligned}$$

It is easy to see \(x=[x]_B+\{x\}_B\), where \([x]_B\in L\) is a lattice vector.

(2) Nearest plane

Let \(U=L(\beta _1,\beta _2,\dots ,\beta _{n-1})\subset \mathbb {R}^n\) be the \(n-1\) dimensional subspace spanned by \(\beta _1,\dots ,\beta _{n-1}\), and let

$$\begin{aligned} L'=\sum \limits _{i=1}^{n-1} \mathbb {Z}\beta _i\subset L\ \text {is a sublattice of}\ L. \end{aligned}$$

After \(x\in \mathbb {R}^n\) is given, let \(v\in L\) be such that \(U+v\) is the plane nearest to x. Let \(x'\) be the orthogonal projection of x onto \(U+v\), let \(y\in L'\) be the nearest lattice vector to \(x-v\), and let \(w=y+v\) be an approximation of the nearest lattice vector to x in L. Based on the above definitions, we can prove that (see (7.82) of Chap. 7 in Zheng (2022))

$$\begin{aligned} \left\{ \begin{array}{l} U=L(\beta _1,\beta _2,\dots ,\beta _{n-1})=L(\beta _1^*,\beta _2^*,\dots ,\beta _{n-1}^*) \\ v=\delta _n \beta _n\in L \\ x'=\sum \limits _{i=1}^{n-1} x_i \beta _i^*+ \delta _n\beta _n^* \\ y\ \text {is the nearest lattice vector of}\ x-v\ \text {in}\ L' \\ w=y+v\in L \end{array} \right. . \end{aligned}$$
(3.1.22)

Since \(v=\delta _n \beta _n\) and \(x'=\sum \nolimits _{i=1}^{n-1}x_i \beta _i^*+\delta _n \beta _n^*\), we have

$$\begin{aligned} |x-x'|=|x_n-\delta _n||\beta _n^*|\leqslant \frac{1}{2}|\beta _n^*|. \end{aligned}$$

The distance between any two distinct planes in \(\{U+z\ |\ z\in L\}\) is at least \(|\beta _n^*|\), and \(|x-x'|\) is the distance from x to the nearest plane, so we have

$$\begin{aligned} |x-x'|\leqslant |x-u_x|. \end{aligned}$$

Let \(w=y+v=y+\delta _n \beta _n\in L\); we now prove

$$\begin{aligned} |x-w|^2=|x-x'|^2+|x'-w|^2. \end{aligned}$$
(3.1.23)

This is because

$$\begin{aligned} x-x'=(x_n-\delta _n)\beta _n^*,\ x'-w=x'-v-y\in U, \end{aligned}$$

therefore,

$$\begin{aligned} (x-x')\bot (x'-w), \end{aligned}$$

and (3.1.23) holds. By the induction hypothesis, applying the algorithm to \(L'\) in dimension \(n-1\) gives:

$$\begin{aligned} |x'-w|^2\leqslant \frac{1}{4}\left( |\beta _1^*|^2+\cdots +|\beta _{n-1}^*|^2\right) . \end{aligned}$$

It follows that

$$\begin{aligned} |x-w|^2\leqslant \frac{1}{4}(|\beta _1^*|^2+\cdots +|\beta _{n-1}^*|^2+|\beta _n^*|^2)=\left( \frac{1}{2}\sigma (B)\right) ^2. \end{aligned}$$

Taking \(x=t\in \mathbb {R}^n\), we get \(w\in L\) such that

$$\begin{aligned} |t-w|\leqslant \frac{1}{2}\sigma (B). \end{aligned}$$

This lemma holds.    \(\square \)
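
The proof above is algorithmic. The following is a minimal sketch of Babai's nearest plane procedure (function names and the test basis are ours, not from the text; at each level, rounding a Gram-Schmidt coordinate plays the role of choosing the nearest plane, from \(\beta _n\) down to \(\beta _1\)):

```python
import numpy as np

def gram_schmidt(B):
    """Gram-Schmidt orthogonalization of the columns of B (not normalized)."""
    n = B.shape[1]
    Bs = np.zeros_like(B, dtype=float)
    for i in range(n):
        v = B[:, i].astype(float)
        for j in range(i):
            v -= (B[:, i] @ Bs[:, j]) / (Bs[:, j] @ Bs[:, j]) * Bs[:, j]
        Bs[:, i] = v
    return Bs

def babai_nearest_plane(B, t):
    """Return w in L(B) with |t - w| <= sigma(B)/2."""
    Bs = gram_schmidt(B)
    w, r = np.zeros(B.shape[0]), t.astype(float)
    for i in reversed(range(B.shape[1])):
        c = round((r @ Bs[:, i]) / (Bs[:, i] @ Bs[:, i]))  # nearest plane index
        w, r = w + c * B[:, i], r - c * B[:, i]
    return w

rng = np.random.default_rng(0)
B = np.array([[3., 1., 0., 1.], [0., 2., 1., 0.], [1., 0., 4., 1.], [0., 1., 0., 3.]])
t = rng.normal(size=4) * 10
w = babai_nearest_plane(B, t)
Bs = gram_schmidt(B)
sigma = np.sqrt(sum(Bs[:, i] @ Bs[:, i] for i in range(4)))
assert np.linalg.norm(t - w) <= sigma / 2 + 1e-9           # bound (3.1.21)
```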

The calculation of the covering radius of a lattice is also a kind of hard problem. We define the covering radius problem (\(\text {CDP}_{\gamma }\)) in its approximation version, with approximation parameter \(\gamma \).

Definition 3.1.4

(\(\text {CDP}_{\gamma }\)) Let L be a full rank lattice and \(\gamma (n)\) an approximation parameter. The \(\text {CDP}_{\gamma }\) problem is to find an r such that

$$\begin{aligned} \rho (L)\leqslant r\leqslant \gamma (n)\rho (L). \end{aligned}$$
(3.1.24)

3.2 SIS and Knapsack Problem on Ring

Let q be a positive integer, \(\mathbb {Z}_q\) be the residue class ring \(\text {mod}\ q\), and \(\mathbb {Z}_q[x]\) be the ring of polynomials in one variable over \(\mathbb {Z}_q\). By (3.1.10), we have the quotient ring \(R_q\) of \(\mathbb {Z}_q[x]\)

$$\begin{aligned} R_q=\mathbb {Z}_q[x]/<x^n-1>\cong (\mathbb {Z}_q^n,+,\otimes ). \end{aligned}$$
(3.2.1)

To define the SIS problem on \(R_q\), note that any m polynomials \(A=\{a_1(x),\dots ,a_m(x)\}\subset R_q\) can be regarded as an m dimensional vector over \(R_q\), i.e. \(A=(a_1(x),\dots ,a_m(x))\in R_q^m\), with the norm |A| defined by

$$\begin{aligned} |A|=\left( \sum \limits _{i=1}^{m} |a_i(x)|^2\right) ^{\frac{1}{2}}=\left( \sum \limits _{i=1}^{m} |a_i|^2\right) ^{\frac{1}{2}}, \end{aligned}$$
(3.2.2)

where \(a_i(x)\longleftrightarrow a_i\in \mathbb {Z}_q^n\).

Definition 3.2.1

Let \(\beta >0\) be a positive real number and n, m, q be positive integers. The SIS problem on \(R_q\) is defined as follows: given a uniformly distributed vector \(A=(a_1(x),\dots ,a_m(x))\in R_q^m\), find an m dimensional vector \(z=(z_1(x),z_2(x),\dots ,z_m(x))\in R_q^m\) such that

$$\begin{aligned} \left\{ \begin{array}{l} f_A(z)=\sum \limits _{i=1}^m a_i(x)z_i(x)=0 \\ 0<|z|=\left( \sum \limits _{i=1}^m |z_i(x)|^2\right) ^{\frac{1}{2}}\leqslant \beta \end{array} \right. . \end{aligned}$$
(3.2.3)

This problem is denoted as \(R_q-\text {SIS}_{q,\beta ,m}\).

Remark 3.2.1

By the above definition, \(f_A(z)\in R_q\), so \(f_A(z)=0\) is equivalent to

$$\begin{aligned} f_A(z)=\sum \limits _{i=1}^{m} a_i(x)z_i(x)\equiv 0\ (\text {mod}\ x^n-1), \end{aligned}$$

here \(0<|z|\leqslant \beta \) is computed in the real number field \(\mathbb {R}\).

Remark 3.2.2

In order to guarantee that the \(R_q-\text {SIS}_{q,\beta ,m}\) problem has a solution, we only need \(m>\text {log}_2 q\), which differs greatly from the requirement \(m>n\text {log}q\) of the classical SIS problem (see Sect. 2.2 in the last chapter). In fact, if \(A=(a_1(x),a_2(x),\dots ,a_m(x))\) is given, the selection of \(z=(z_1(x),\dots ,z_m(x))\) can be considered in \(\mathbb {Z}_q^n\). For each \(z_i(x)\longleftrightarrow z_i\in \mathbb {Z}_q^n\), choose each coordinate of \(z_i\) as 0 or 1, so that the n dimensional vector \(z_i\) has a short length. There are \(2^n\) such short vectors \(z_i\), so there are \(2^{mn}\) choices of z in total. If \(2^{mn}>q^n\), i.e. \(mn>n\text {log}_2 q\), that is \(m>\text {log}_2 q\), then since \(f_A\) takes at most \(q^n\) values in \(R_q\), the pigeonhole principle yields two distinct \(z'\in R_q^m\), \(z''\in R_q^m\) with

$$\begin{aligned} f_A(z')=f_A(z'')\Rightarrow f_A(z'-z'')=0. \end{aligned}$$

So \(z=z'-z''\) is a nonzero solution satisfying (3.2.3).
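
The counting argument can be made concrete for toy parameters (a brute force sketch; all names and the parameter values are ours): with \(n=3\), \(q=5\), \(m=4\) we have \(2^{mn}=4096>q^n=125\), so a collision \(f_A(z')=f_A(z'')\) among 0-1 vectors must occur.

```python
import itertools
import numpy as np

def conv_mod(a, b, q):
    """Cyclic convolution a (x) b over Z_q: (a (x) b)_k = sum_{i+j=k mod n} a_i b_j."""
    n = len(a)
    c = np.zeros(n, dtype=int)
    for i in range(n):
        for j in range(n):
            c[(i + j) % n] = (c[(i + j) % n] + a[i] * b[j]) % q
    return c

n, q, m = 3, 5, 4                       # m > log2(q): 2^{mn} = 4096 > q^n = 125
rng = np.random.default_rng(0)
A = rng.integers(0, q, size=(m, n))     # uniform a_1, ..., a_m in Z_q^n

seen = {}
for bits in itertools.product([0, 1], repeat=m * n):
    z = np.array(bits, dtype=int).reshape(m, n)
    fA = np.zeros(n, dtype=int)
    for i in range(m):
        fA = (fA + conv_mod(A[i], z[i], q)) % q
    key = tuple(fA)
    if key in seen:                     # pigeonhole: f_A(z') = f_A(z'')
        print("f_A(z' - z'') = 0 for the nonzero vector z' - z''")
        break
    seen[key] = z
```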

Geometric definition of \(R_q-\text {SIS}_{q,\beta ,m}\):

Given m vectors \(A=(a_1,a_2,\dots ,a_m)\) uniformly distributed on \(\mathbb {Z}_q^n\), \(a_i\in \mathbb {Z}_q^n\), find a group of nonzero short vectors \(z=(z_1,z_2,\dots ,z_m)\), \(z_i\in \mathbb {Z}_q^n\), such that

$$\begin{aligned} \left\{ \begin{array}{l} f_A(z)=\sum \limits _{i=1}^m a_i \otimes z_i=0 \\ |z_i|\leqslant \sqrt{n},\ 1\leqslant i\leqslant m \end{array} \right. . \end{aligned}$$
(3.2.4)

Obviously, the \(R_q-\text {SIS}\) problem is a special case of the knapsack problem on a ring.

Definition 3.2.2

(Knapsack problem on a ring) Let R be a commutative ring with identity, \(a_1,\dots ,a_m\) be m nonzero elements in R, \(X\subset R\) with \(|X|=2^n\), and let \(b\in R\) be the target element. The knapsack problem on R is to find m elements \(z_1,z_2,\dots ,z_m\in X\) such that

$$\begin{aligned} f_A(z)=\sum \limits _{i=1}^m a_i z_i=b,\ \forall z_i\in X. \end{aligned}$$
(3.2.5)

If \(R=\mathbb {Z}\) is the ring of integers and \(X=\{0,1\}\), or \(X=\{0,1,\dots ,2^n-1\}\), then the above problem is the classical knapsack problem. It has been proved that the computational complexity of solving the knapsack problem on \(\mathbb {Z}\) is subexponential; special instances, such as those given by super increasing sequences, can be solved in polynomial time. If \(R=R_q\) and \(b=0\), then the above problem becomes the SIS problem on \(R_q\). The main result of this section is the following theorem:

Theorem 3.2.1

Let \(m=O(\log n)\), \(k=\tilde{O}(\log n)\), \(q\geqslant 4mkn^{\frac{5}{2}}\), and \(\gamma \geqslant 16mkn^3\). If we can solve the knapsack problem (3.2.6) on \(R_q\), then there exists a probabilistic polynomial time algorithm solving the covering radius problem \(\text {CDP}_{\gamma }\) for any n dimensional full rank cyclic lattice.

The knapsack problem on \(R_q\) in Theorem 3.2.1 is a more general case of (3.2.4), which is summarized as follows.

Knapsack problem on \(R_q\): Choose m vectors \(A=(a_1,a_2,\dots ,a_m)\) uniformly at random on \(\mathbb {Z}_q^n\) and any target vector \(b\in \mathbb {Z}_q^n\); find a set of short vectors \(z=(z_1,z_2,\dots ,z_m)\) such that

$$\begin{aligned} f_A(z)=\sum \limits _{i=1}^m a_i \otimes z_i=b,\ |z_i|\leqslant \sqrt{n},\ 1\leqslant i\leqslant m. \end{aligned}$$
(3.2.6)

From Theorem 3.2.1, the knapsack problem on \(R_q\) in the average case is, with positive probability, at least as hard as the covering radius problem on an arbitrary full rank cyclic lattice; this is another reduction principle from the worst case to the average case in the sense of Ajtai.

The core idea of the proof of Theorem 3.2.1 is to approximate the covering radius \(\rho (L)\) of L by \(\frac{1}{2}\sigma (S)\) for any cyclic lattice \(L=L(B)\subset \mathbb {R}^n\) under the assumption that (3.2.6) is solvable, where \(S=\{s_1,s_2,\dots ,s_n\}\subset L\) is a set of n linearly independent vectors, and

$$\begin{aligned} \sigma (S)=\left( \sum \limits _{i=1}^{n} |s_i^*|^2\right) ^{\frac{1}{2}}. \end{aligned}$$

Here \(\{s_1^*,s_2^*,\dots ,s_n^*\}\) is the orthogonal basis obtained from S by the Gram-Schmidt algorithm. Since \(|s_i^*|\leqslant |s_i|\ (1\leqslant i\leqslant n)\), we have

$$\begin{aligned} \sigma (S)=\left( \sum \limits _{i=1}^{n} |s_i^*|^2\right) ^{\frac{1}{2}}\leqslant \left( \sum \limits _{i=1}^{n} |s_i|^2\right) ^{\frac{1}{2}}. \end{aligned}$$
(3.2.7)

By Lemma 3.1.6, \(\rho (L)\leqslant \frac{1}{2}\sigma (S)\). The core steps of approximating \(\rho (L)\) by \(\frac{1}{2}\sigma (S)\) are summarized as follows.

(1) Reduction algorithm

Randomly choose a set \(S=\{s_1,s_2,\dots ,s_n\}\subset L\) of n linearly independent lattice vectors, and assume that

$$\begin{aligned} |S|=|s_n|=\max \limits _{1\leqslant i\leqslant n}|s_i|. \end{aligned}$$

If \(\frac{1}{2}\sigma (S)\leqslant \gamma \rho (L)\), then the \(\text {CDP}_{\gamma }\) problem on L is solved. If \(\sigma (S)>2\gamma \rho (L)\), we can find a lattice vector \(s'\in L\), such that

$$\begin{aligned} |s'|\leqslant \frac{1}{2} |s_n|=\frac{1}{2} |S|, \end{aligned}$$

and \(s_1,s_2,\dots ,s_{n-1},s'\) are linearly independent. Replace S with the new set of vectors \(S'=\{s_1,s_2,\dots ,s_{n-1},s'\}\), that is, replace \(s_n\) with \(s'\) in S. Repeating this process n times, we get

$$\begin{aligned} |S'|\leqslant \frac{1}{2}|S|. \end{aligned}$$
(3.2.8)

Repeating the above reduction algorithm, we find a set of linearly independent vectors \(S\subset L\) such that

$$\begin{aligned} |S|\leqslant \frac{2\gamma }{\sqrt{n}} \rho (L), \end{aligned}$$
(3.2.9)

and the computational complexity of the algorithm is polynomial. Based on (3.2.9), we have

$$\begin{aligned} \rho (L)\leqslant \frac{1}{2}\sigma (S)\leqslant \frac{\sqrt{n}}{2}|S|\leqslant \gamma \rho (L). \end{aligned}$$

So the \(\text {CDP}_{\gamma }\) problem is solved.

(2) Approximation of the standard orthogonal basis

Let \(\{e_0,e_1,\dots ,e_{n-1}\}\subset \mathbb {Z}_q^n\) be the standard orthogonal basis and \(L=L(B)\subset \mathbb {R}^n\) be a given cyclic lattice. Define the parameter

$$\begin{aligned} \beta =\left( \frac{4nq}{\gamma }+\frac{\sqrt{n}}{2}\right) \sigma (S), \end{aligned}$$
(3.2.10)

where \(S=\{s_1,s_2,\dots ,s_n\}\subset L\) is a set of n linearly independent vectors, such that

$$\begin{aligned} \sigma (S)>2\gamma \rho (L). \end{aligned}$$
(3.2.11)

To find \(s'\) in the reduction algorithm, note that by Lemma 3.1.6 there is a lattice vector \(c\in L\) such that

$$\begin{aligned} |c-\beta e_0|\leqslant \frac{1}{2}\sigma (S). \end{aligned}$$
(3.2.12)

Since T is an orthogonal matrix, it is an orthogonal linear transformation in \(\mathbb {R}^n\), i.e.

$$\begin{aligned} |T\alpha |=|\alpha |,\ \forall \alpha \in \mathbb {R}^n. \end{aligned}$$

Therefore, for any \(0\leqslant k\leqslant n-1\),

$$\begin{aligned} |T^k(c-\beta e_0)|=|c-\beta e_0|\leqslant \frac{1}{2}\sigma (S). \end{aligned}$$

Note that \(T^k e_0=e_k\), so

$$\begin{aligned} |T^k c-{\beta }e_k|\leqslant \frac{1}{2} \sigma (S). \end{aligned}$$

Because \(c\in L\) and L is a cyclic lattice, we have \(T^k c\in L\ (0\leqslant k\leqslant n-1)\). The circulant matrix \(T^*(c)=[c,Tc,\dots ,T^{n-1}c]\) thus implements the approximation of the standard orthogonal basis.

In order to give a complete proof of Theorem 3.2.1, we denote

$$\begin{aligned} B'=q(T^*(c))^{-1}B. \end{aligned}$$
(3.2.13)

Lemma 3.2.1

The lattice \(L(B')\) generated by \(B'\) satisfies \(q\mathbb {Z}^n\subset L(B')\).

Proof

By Lemma 3.1.5, since \(c\in L\) and L is a cyclic lattice, there exists an integer matrix \(D\in \mathbb {Z}^{n\times n}\) such that

$$\begin{aligned} T^*(c)=BD\Rightarrow B^{-1}T^*(c)\in \mathbb {Z}^{n\times n}, \end{aligned}$$

thus,

$$\begin{aligned} B'(B^{-1}T^*(c))=q(T^*(c))^{-1}\cdot B\cdot B^{-1}T^*(c)=qI_n. \end{aligned}$$

Since \(B^{-1}T^*(c)=D\in \mathbb {Z}^{n\times n}\), each column \(qe_j\ (0\leqslant j\leqslant n-1)\) of the above matrix lies in \(L(B')\), hence \(q\mathbb {Z}^n\subset L(B')\).    \(\square \)

Based on Lemma 3.2.1, \(q\mathbb {Z}^n\) is an additive subgroup of \(L(B')\). Randomly choose mk vectors \(x_{ij}^{'}\in G\ (1\leqslant i\leqslant m,1\leqslant j\leqslant k)\) in the quotient group \(G=L(B')/q\mathbb {Z}^n\); the integral vectors \(w_{ij}^{'}\) of \(x_{ij}^{'}\), obtained by rounding each coordinate to the nearest integer, are defined by

$$\begin{aligned} w_{ij}^{'}=[x_{ij}^{'}]\in \mathbb {Z}^n,\ 1\leqslant i\leqslant m,\ 1\leqslant j\leqslant k. \end{aligned}$$

Let

$$\begin{aligned} a_i\equiv \sum \limits _{j=1}^{k} w_{ij}^{'}\ (\text {mod}\ q)\Rightarrow a_i\in \mathbb {Z}_q^n, \end{aligned}$$
(3.2.14)

\(A=(a_1,a_2,\dots ,a_m)\) consists of the above m vectors in \(\mathbb {Z}_q^n\). Consider the knapsack problem on \(R_q=(\mathbb {Z}_q^n,+,\otimes )\),

$$\begin{aligned} f_A(z)=\sum \limits _{i=1}^m a_i\otimes z_i,\ \forall z_i\in \mathbb {Z}_q^n,\ |z_i|\leqslant \sqrt{n}. \end{aligned}$$

If we can solve the knapsack problem on \(R_q\), then collisions of \(f_A(z)\) can also be found. So there are integral vectors \(y=(y_1,y_2,\dots ,y_m)\) and \(\hat{y}=(\hat{y}_1,\hat{y}_2,\dots ,\hat{y}_m)\), \(y\ne \hat{y}\), such that

$$\begin{aligned} f_A(y-\hat{y})=\sum \limits _{i=1}^{m} a_i\otimes (y_i-\hat{y}_i)=0,\ \forall |y_i|\leqslant \sqrt{n},\ |\hat{y}_i|\leqslant \sqrt{n}, \end{aligned}$$
(3.2.15)

where

$$\begin{aligned} y=(y_1,y_2,\dots ,y_m),\ \hat{y}=(\hat{y}_1,\hat{y}_2,\dots ,\hat{y}_m). \end{aligned}$$
(3.2.16)

Based on the vectors y and \(\hat{y}\) in \(\mathbb {Z}_q^n\), we define

$$\begin{aligned} \left\{ \begin{array}{l} x_{ij}=\frac{1}{q}T^*(c)x_{ij}^{'}\ \text {and}\ w_{ij}=\frac{1}{q}T^*(c)w_{ij}^{'} \\ s'=\sum \limits _{i=1}^{m} \sum \limits _{j=1}^{k} (x_{ij}-w_{ij}) \otimes (y_{i}-\hat{y}_{i}) \end{array} \right. . \end{aligned}$$
(3.2.17)

The \(s'\) defined by the above formula is exactly the \(s'\) required in the reduction algorithm. First, we prove the following lemma.

Lemma 3.2.2

\(x_{ij}\in L(B)\) is a lattice vector in the given cyclic lattice L (\(1\leqslant i\leqslant m\), \(1\leqslant j\leqslant k\)), and if \(f_A(y)=f_A(\hat{y})\), then \(s'\in L(B)\) is also a lattice vector.

Proof

Since \(x_{ij}^{'}\in L(B')\), there is \(\alpha \in \mathbb {Z}^n\) such that \(x_{ij}^{'}=B'\alpha \), we get

$$\begin{aligned} x_{ij}=\frac{1}{q}T^*(c)B'\alpha =\frac{1}{q}T^*(c)\cdot q(T^*(c))^{-1}\cdot B\alpha =B\alpha \in L(B). \end{aligned}$$

To prove \(s'\in L(B)\), we use (3.2.17) and the linearity of circulant matrices (see (3.1.5)):

$$\begin{aligned} \begin{aligned} s'&=\sum x_{ij}\otimes (y_{i}-\hat{y}_{i})-\sum w_{ij}\otimes (y_{i}-\hat{y}_{i}) \\&=\sum T^*(x_{ij}) (y_{i}-\hat{y}_{i})-\sum T^*(w_{ij}) (y_{i}-\hat{y}_{i}) \\&=\sum \limits _{i=1}^{m} T^*(\sum \limits _{j=1}^{k} x_{ij})(y_i-\hat{y}_i)-\sum \limits _{i=1}^{m} T^*(\sum \limits _{j=1}^{k} w_{ij})(y_i-\hat{y}_i). \end{aligned} \end{aligned}$$
(3.2.18)

Based on the first conclusion, \(x_{ij}\in L(B)\Rightarrow \sum \nolimits _{j=1}^{k} x_{ij}\in L(B)\), since \(y_i\) and \(\hat{y}_i\) are integral vectors in \(\mathbb {Z}_q^n\), it follows that

$$\begin{aligned} T^*\left( \sum \limits _{j=1}^{k} x_{ij}\right) (y_i-\hat{y}_i)\in L(B). \end{aligned}$$

Next we prove the second term of (3.2.18) is also a lattice vector. By the definition of \(w_{ij}\),

$$\begin{aligned} w_{ij}=\frac{1}{q}T^*(c)w_{ij}^{'},\ \text {then}\ \sum \limits _{j=1}^k w_{ij}=\frac{1}{q}T^*(c)\left( \sum \limits _{j=1}^k w_{ij}^{'}\right) . \end{aligned}$$

Hence,

$$\begin{aligned} T^*\left( \sum \limits _{j=1}^k w_{ij}\right) =\frac{1}{q}T^*(c)T^*\left( \sum \limits _{j=1}^k w_{ij}^{'}\right) . \end{aligned}$$

The second term of (3.2.18) could be written as

$$\begin{aligned} \begin{aligned} \sum \limits _{i=1}^m T^*(\sum \limits _{j=1}^k w_{ij})(y_i-\hat{y}_i)&=\frac{1}{q}T^*(c)\sum \limits _{i=1}^m T^*(\sum \limits _{j=1}^k w_{ij}^{'})(y_i-\hat{y}_i) \\&=\frac{1}{q}T^*(c)\sum \limits _{i=1}^m \sum \limits _{j=1}^k w_{ij}^{'}\otimes (y_i-\hat{y}_i).\end{aligned} \end{aligned}$$
(3.2.19)

Since

$$\begin{aligned} \sum \limits _{i=1}^m \sum \limits _{j=1}^k w_{ij}^{'}\otimes (y_i-\hat{y}_i)\equiv \sum \limits _{i=1}^m a_{i}\otimes (y_i-\hat{y}_i)\ (\text {mod}\ q)\equiv f_A(y)-f_A(\hat{y})\ (\text {mod}\ q), \end{aligned}$$

and \(f_A(y)=f_A(\hat{y})\), the sum \(\sum \nolimits _{i=1}^m \sum \nolimits _{j=1}^k w_{ij}^{'}\otimes (y_i-\hat{y}_i)\) equals qu for some \(u\in \mathbb {Z}^n\); since \(c\in L(B)\) and L(B) is a cyclic lattice, (3.2.19) equals \(T^*(c)u\in L(B)\). Hence the second term of (3.2.18) is in L(B), i.e.

$$\begin{aligned} \sum \limits _{i=1}^m T^*(\sum \limits _{j=1}^k w_{ij})(y_i-\hat{y}_i)\in L(B). \end{aligned}$$

Finally we have \(s'\in L(B)\) based on (3.2.18).    \(\square \)

Lemma 3.2.3

The lattice vector \(s'\) defined in (3.2.17) satisfies

$$\begin{aligned} |s'|\leqslant \frac{1}{2}|s_n|=\frac{1}{2}|S|. \end{aligned}$$
(3.2.20)

Proof

It suffices to prove \(|s'|\leqslant \sigma (S)/2\sqrt{n}\): since

$$\begin{aligned} \sigma (S)\leqslant \left( \sum \limits _{i=1}^n |s_i|^2\right) ^{\frac{1}{2}}\leqslant \sqrt{n} |S|=\sqrt{n} |s_n|, \end{aligned}$$

it follows that \(|s'|\leqslant \frac{1}{2}|s_n|\), and the lemma is proved. Based on the definition of \(s'\),

$$\begin{aligned} |s'|\leqslant \sum \limits _{i=1}^m \sum \limits _{j=1}^k |(x_{ij}-w_{ij})\otimes (y_i-\hat{y}_i)|. \end{aligned}$$
(3.2.21)

It follows that

$$\begin{aligned} x_{ij}-w_{ij}=\frac{1}{q}T^*(c)(x_{ij}^{'}-w_{ij}^{'})=\frac{1}{q}c \otimes (x_{ij}^{'}-w_{ij}^{'}). \end{aligned}$$

Let \(\alpha =c-\beta e_0\), then \(|\alpha |\leqslant \frac{1}{2}\sigma (S)\) (see (3.2.12)), and

$$\begin{aligned} \begin{aligned} x_{ij}-w_{ij}&=\frac{1}{q}(\alpha +\beta e_0)\otimes (x_{ij}^{'}-w_{ij}^{'})=\frac{1}{q}T^*(\alpha +\beta e_0)(x_{ij}^{'}-w_{ij}^{'}) \\&=\frac{1}{q}\beta T^*(e_0) (x_{ij}^{'}-w_{ij}^{'})+\frac{1}{q} T^*(\alpha ) (x_{ij}^{'}-w_{ij}^{'}) \\&=\frac{\beta }{q}(x_{ij}^{'}-w_{ij}^{'})+\frac{1}{q} T^*(\alpha ) (x_{ij}^{'}-w_{ij}^{'}). \end{aligned} \end{aligned}$$

Since

$$\begin{aligned} |x_{ij}^{'}-w_{ij}^{'}|\leqslant \frac{1}{2}\sqrt{n}, \end{aligned}$$

combining this with (3.1.15) from the last section, we have (\(\beta \) is determined by (3.2.10))

$$\begin{aligned} \begin{aligned} |x_{ij}-w_{ij}|&\leqslant \frac{\beta }{q}|x_{ij}^{'}-w_{ij}^{'}|+\frac{1}{q} |\alpha \otimes (x_{ij}^{'}-w_{ij}^{'})| \\&\leqslant \frac{\beta }{q}\cdot \frac{1}{2}\sqrt{n}+\frac{1}{q}\cdot \frac{\sqrt{n}}{2}\cdot \sqrt{n}\cdot \frac{1}{2}\sigma (S) \\&= \frac{\beta }{q}\cdot \frac{\sqrt{n}}{2}+\frac{1}{q}\sqrt{n}\cdot \frac{\sigma (S)}{2}\cdot \frac{\sqrt{n}}{2} \\&= \sigma (S) \left( \frac{2n^{\frac{3}{2}}}{\gamma }+\frac{n}{2q}\right) \\&\leqslant \sigma (S)\left( \frac{1}{8}\cdot \frac{1}{mkn^{\frac{3}{2}}}+\frac{1}{8}\cdot \frac{1}{mkn^{\frac{3}{2}}}\right) \\&=\frac{1}{4}\sigma (S)\frac{1}{mkn^{\frac{3}{2}}}. \end{aligned} \end{aligned}$$

Based on (3.2.21), we get

$$\begin{aligned} |s'|\leqslant mk\sqrt{n}\max \limits _{i,j}|x_{ij}-w_{ij}|\cdot \max \limits _{i}|y_i-\hat{y}_i| \end{aligned}$$
$$\begin{aligned} \qquad \leqslant mk\sqrt{n}\cdot 2\sqrt{n}\max \limits _{i,j}|x_{ij}-w_{ij}|\leqslant \frac{\sigma (S)}{2\sqrt{n}}. \end{aligned}$$

So we complete the proof of Lemma 3.2.3.    \(\square \)

From the above lemmas, the reduction algorithm required in Theorem 3.2.1 is established. However, we must still prove that \(\{a_i\}_{i=1}^m\subset \mathbb {Z}_q^n\) determined by (3.2.14) is uniformly distributed, so that the knapsack problem on \(R_q\) is solved in the average case. Next we prove that \(\{a_i\}_{i=1}^m\) is almost uniformly distributed in \(\mathbb {Z}_q^n\), that is, the statistical distance between the distribution of \(\{a_i\}\) and the uniform distribution is sufficiently small. We first prove the following lemma.

Lemma 3.2.4

Let \(B'=q(T^*(c))^{-1}B\), then the covering radius \(\rho (B')\) of \(L(B')\) satisfies

$$\begin{aligned} \rho (B')\leqslant \frac{1}{8n}, \end{aligned}$$

where L(B) is a full rank cyclic lattice, \(c\in L(B)\) is given by (3.2.12).

Proof

Based on the definition of covering radius,

$$\begin{aligned} \rho (B')=\max \limits _{x\in \mathbb {R}^n}|x-u_x|=\max \limits _{x\in \mathbb {R}^n}|x-L(B')|. \end{aligned}$$

Let \(t'\in \mathbb {R}^n\) be a vector achieving the maximum value above, i.e. \(|t'-L(B')|=\rho (B')\), so that

$$\begin{aligned} |t'-B'z|\geqslant \rho (B'),\ \forall \ z\in \mathbb {Z}^n. \end{aligned}$$

Denote

$$\begin{aligned} t=\frac{1}{q}T^*(c)t'. \end{aligned}$$

Suppose \(Bz_0\in L(B)\) is the nearest lattice vector of t, then we have

$$\begin{aligned} \begin{aligned} \rho (B)&\geqslant \text {dist}(t,L(B))=|t-Bz_0| \\&=|\frac{1}{q}T^*(c)t'-Bz_0|=|\frac{1}{q}T^*(c)(t'-B'z_0)| \\&\geqslant \frac{1}{q} |t'-B'z_0|\min \limits _d \frac{|c\otimes d|}{|d|} \\&\geqslant \frac{1}{q}\rho (B') \min \limits _{d\in \mathbb {R}^n,d\ne 0} \frac{|c\otimes d|}{|d|}.\end{aligned} \end{aligned}$$
(3.2.22)

For any \(d\in \mathbb {R}^n\), \(d\ne 0\), we estimate \(|c\otimes d|\) from below. Since \(c=\beta e_0+\alpha \) with \(|\alpha |\leqslant \frac{1}{2}\sigma (S)\),

$$\begin{aligned} \begin{aligned} |c\otimes d|&=|(\beta e_0+\alpha )\otimes d| \\&\geqslant \beta |d|-|\alpha \otimes d| \\&\geqslant \beta |d|-\sqrt{n}|\alpha | |d| \\&\geqslant |d|\left( \beta -\frac{1}{2}\sqrt{n}\sigma (S)\right) \\&=|d|\frac{4nq}{\gamma }\sigma (S). \end{aligned} \end{aligned}$$

By (3.2.22), we have (see (3.2.11))

$$\begin{aligned} \begin{aligned} \rho (B)&\geqslant \frac{1}{q}\rho (B')\cdot \frac{4nq}{\gamma }\sigma (S) \\&\geqslant 8n \rho (B')\rho (B). \end{aligned} \end{aligned}$$

This implies \(\rho (B')\leqslant \frac{1}{8n}\). Lemma 3.2.4 holds.    \(\square \)

Lemma 3.2.5

Let \(\Lambda =L(B)\) be a lattice and \(Q\subset \mathbb {R}^n\) a convex body containing a ball of radius \(r\geqslant \rho (\Lambda )\). Then the number of lattice vectors of L(B) contained in Q satisfies

$$\begin{aligned} \frac{\text {Vol}(Q)}{\text {det}(\Lambda )} \left( 1-\frac{\rho (\Lambda )n}{r}\right) \leqslant |L(B)\cap Q|\leqslant \frac{\text {Vol}(Q)}{\text {det}(\Lambda )} \left( 1+\frac{2\rho (\Lambda )n}{r}\right) . \end{aligned}$$

Proof

See Lyubashevsky and Micciancio (2006) or Lyubashevsky (2010).    \(\square \)

Based on the above lemma, taking \(\Lambda =L(B')\), we estimate the distribution of the vectors \(\{a_{ij}\}\) in \(\mathbb {Z}_q^n\). From the definition

$$\begin{aligned} a_{ij}\equiv w_{ij}^{'}\ (\text {mod}\ q),\ a_i\equiv \sum \limits _{j=1}^k a_{ij}\ (\text {mod}\ q), \end{aligned}$$
(3.2.23)

where \(w_{ij}^{'}\) is the rounding vector of \(x_{ij}^{'}\in G=L(B')/q\mathbb {Z}^n\). The cube centered at \(w_{ij}^{'}\) with side length 1 contains the ball centered at \(w_{ij}^{'}\) with radius \(\frac{1}{2}\). Since \(\rho (L(B'))\leqslant \frac{1}{8n}<\frac{1}{2}\) by Lemma 3.2.4, Lemma 3.2.5 shows that the number N of lattice vectors of \(L(B')\) in this cube satisfies

$$\begin{aligned} \frac{1}{\text {det}(B')}\left( 1-\frac{1}{4}\right) \leqslant N\leqslant \frac{1}{\text {det}(B')}(1+\frac{1}{2}). \end{aligned}$$

For any \(a\in R_q=\mathbb {Z}_q^n\), because \(x_{ij}^{'}\) is uniformly selected in \(L(B')/q\mathbb {Z}^n\), which contains

$$\begin{aligned} |L(B')/q\mathbb {Z}^n|=\frac{q^n}{\text {det}(B')} \end{aligned}$$

elements, we have \(\text {Pr}\{a_{ij}=a\}=N\cdot \frac{\text {det}(B')}{q^n}\), and therefore,

$$\begin{aligned} \left| {\text {Pr}\{a_{ij}=a\}-\frac{1}{q^n}} \right| \leqslant \frac{1}{2q^n}. \end{aligned}$$
(3.2.24)

Now we estimate the probability distribution of \(\{a_{i}\}_{i=1}^m\).

Lemma 3.2.6

Let G be a finite Abelian group and \(A_1,A_2,\dots ,A_k\) be k independent random variables on G, such that for any element \(x\in G\),

$$\begin{aligned} \left| {\text {Pr}\{A_i=x\}-\frac{1}{|G|}} \right| \leqslant \frac{1}{2|G|}. \end{aligned}$$

Then the statistical distance between \(\xi =\sum \nolimits _{i=1}^k A_i\) and the uniform distribution on G satisfies

$$\begin{aligned} \Delta (\xi ,U(G))\leqslant 2^{-(k+1)}. \end{aligned}$$

Proof

We use mathematical induction to prove that the following inequality holds for any positive integer k,

$$\begin{aligned} \left| {\text {Pr}\{\xi =x\}-\frac{1}{|G|}} \right| \leqslant \frac{1}{2^k |G|},\ \forall x\in G. \end{aligned}$$

If \(k=1\), the inequality above holds by hypothesis. Assume it holds for \(k-1\); denote \(\xi '=\sum \nolimits _{i=1}^{k-1} A_i\), so \(\xi =\xi '+A_k\), and we have

$$\begin{aligned} \begin{aligned} \left| {\text {Pr}\{\xi =x\}-\frac{1}{|G|}} \right|&=\left| {\sum \limits _{a\in G} Pr\{\xi '=a,A_k=x-a\}-\frac{1}{|G|}} \right| \\&=\left| {\sum \limits _{a\in G} \text {Pr}\{\xi '=a\}\text {Pr}\{A_k=x-a\}-\frac{1}{|G|}} \right| \\&=\left| {\sum \limits _{a\in G} \left( \text {Pr}\{\xi '=a\}-\frac{1}{|G|}\right) \left( \text {Pr}\{A_k=x-a\}-\frac{1}{|G|} \right) } \right| \\&\leqslant \sum \limits _{a\in G} \frac{1}{2^{k-1}|G|}\cdot \frac{1}{2|G|}=\frac{1}{2^k |G|}. \end{aligned} \end{aligned}$$

Thus,

$$\begin{aligned} \Delta (\xi ,U(G))=\frac{1}{2}\sum \limits _{x\in G} \left| {\text {Pr}\{\xi =x\}-\frac{1}{|G|}} \right| \leqslant \frac{1}{2}\sum \limits _{x\in G} \frac{1}{2^k |G|}=2^{-(k+1)}. \end{aligned}$$

This lemma holds.    \(\square \)
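
For \(G=\mathbb {Z}_q\), the lemma can be checked exactly by convolving probability mass functions (a sketch; the maximally biased alternating pmf is our choice, and it attains the bound):

```python
import numpy as np

q, k = 8, 6
d = np.array([1, -1] * (q // 2)) / (2 * q)   # maximal bias allowed by the hypothesis
p = 1 / q + d                                # pmf on Z_q with |p(x) - 1/q| = 1/(2q)

def add_mod_q(p1, p2, q):
    """pmf of (X + Y) mod q for independent X ~ p1, Y ~ p2."""
    out = np.zeros(q)
    for i in range(q):
        for j in range(q):
            out[(i + j) % q] += p1[i] * p2[j]
    return out

pk = p.copy()
for _ in range(k - 1):                       # pmf of A_1 + ... + A_k
    pk = add_mod_q(pk, p, q)

delta = 0.5 * np.abs(pk - 1 / q).sum()       # statistical distance to uniform
assert delta <= 2.0 ** (-(k + 1)) + 1e-12
print(delta, 2.0 ** (-(k + 1)))              # equal: this pmf attains the bound
```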

From (3.2.23), (3.2.24) and Lemma 3.2.6, we know that each \(a_i=\sum \nolimits _{j=1}^k a_{ij}\) is almost uniformly distributed on \(\mathbb {Z}_q^n\), i.e. the statistical distance between \(a_i\) and the uniform distribution is at most \(2^{-(k+1)}\), which is sufficiently small. Therefore, the knapsack instances sampled via \(f_A(z)\) are average case instances of the knapsack problem on \(R_q\). So far, we have completed the proof of Theorem 3.2.1.

3.3 LWE Problem

The LWE problem is to solve a system of random linear equations under a given probability distribution. To better understand the LWE problem, let us start with the learning parity with errors problem (LPE). Let \(\mathbb {Z}_2=\{0,1\}\) be the finite field with 2 elements, and let \(n\geqslant 1\) and \(\varepsilon \geqslant 0\) be given parameters. The two point distribution \(\xi \) with parameter \(\varepsilon \) on \(\mathbb {Z}_2\) is

$$\begin{aligned} \text {Pr}\{\xi =0\}=1-\varepsilon ,\ \text {Pr}\{\xi =1\}=\varepsilon . \end{aligned}$$

For \(a,b\in \mathbb {Z}_2\), if the probability that a and b have the same parity is \(1-\varepsilon \), i.e.

$$\begin{aligned} \text {Pr}\{a\equiv b\ (\text {mod}\ 2)\}=1-\varepsilon , \end{aligned}$$

denoted as \(a\equiv _{\varepsilon } b\). The LPE problem is: given m independent vectors \(\{a_1,a_2,\dots ,a_m\}\), \(a_i\in \mathbb {Z}_2^n\), uniformly distributed on \(\mathbb {Z}_2^n\), and \(b=\begin{pmatrix} b_1 \\ \vdots \\ b_m \end{pmatrix}\in \mathbb {Z}_2^m\), find a vector \(s\in \mathbb {Z}_2^n\) such that the following m random congruence equations hold simultaneously

$$\begin{aligned} \left\{ \begin{array}{l} b_1\equiv _{\varepsilon }<a_1,s>\ (\text {mod}\ 2)\\ b_2\equiv _{\varepsilon }<a_2,s>\ (\text {mod}\ 2)\\ \vdots \\ b_m\equiv _{\varepsilon }<a_m,s>\ (\text {mod}\ 2) \end{array} \right. , \end{aligned}$$
(3.3.1)

where \(<a_i,s>\) is the inner product of two vectors in \(\mathbb {Z}_2^n\). If \(\varepsilon =0\), the distribution \(\xi \) becomes trivial, and (3.3.1) becomes m deterministic congruence equations; in this case the LPE problem can be solved by the Gauss elimination method using only n equations, with computational complexity polynomial in n. If \(\varepsilon >0\), the LPE problem is nontrivial, and its computational complexity is exponential in n; for example, the maximum likelihood algorithm requires O(n) random congruence equations and has computational complexity \(2^{O(n)}\). In 2003, Blum et al. (2003) proposed a subexponential algorithm whose computational complexity and number of required congruence equations are both \(2^{O(n/\log n)}\), which is the best result for the LPE problem so far.
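
The \(\varepsilon =0\) case can be sketched as follows (function names and parameters are ours). For \(\varepsilon >0\) the same elimination fails, because every pivot operation spreads the error bits across the equations.

```python
import numpy as np

def lpe_samples(s, m, eps, rng):
    """m LPE samples: b = As + e (mod 2), each error bit is 1 with probability eps."""
    A = rng.integers(0, 2, size=(m, len(s)))
    e = (rng.random(m) < eps).astype(int)
    return A, (A @ s + e) % 2

def solve_gf2(A, b):
    """Gauss elimination mod 2 (free variables are set to 0)."""
    A, b = A.copy() % 2, b.copy() % 2
    m, n = A.shape
    row, pivots = 0, []
    for col in range(n):
        piv = next((r for r in range(row, m) if A[r, col]), None)
        if piv is None:
            continue
        A[[row, piv]], b[[row, piv]] = A[[piv, row]], b[[piv, row]]
        for r in range(m):
            if r != row and A[r, col]:
                A[r], b[r] = (A[r] + A[row]) % 2, (b[r] + b[row]) % 2
        pivots.append(col)
        row += 1
    s = np.zeros(n, dtype=int)
    for r, col in enumerate(pivots):
        s[col] = b[r]
    return s

rng = np.random.default_rng(0)
s = rng.integers(0, 2, size=8)
A, b = lpe_samples(s, m=16, eps=0.0, rng=rng)   # eps = 0: trivial error distribution
s_hat = solve_gf2(A, b)
assert np.array_equal((A @ s_hat) % 2, b)        # with m = 2n, s_hat = s w.h.p.
```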

Generalizing the LPE problem from \(\text {mod}\ 2\) to the general case \(\text {mod}\ q\) yields the LWE problem. Due to the important role of the LWE problem in modern quantum-resistant cryptosystems, we introduce the related concepts and results in detail in this section. First, we define the random congruence equation with error on the integer ring \(\mathbb {Z}\). Let \(n\geqslant 1\), \(q\geqslant 2\) be two positive integers, \(\mathbb {Z}_q\) be the residue class ring \(\text {mod}\ q\), and \(\chi \) be a probability distribution on \(\mathbb {Z}_q\).

Definition 3.3.1

Let \(a,b\in \mathbb {Z}\) and \(e\in \mathbb {Z}_q\). If

$$\begin{aligned} \text {Pr}\{a\equiv b+e\ (\text {mod}\ q)\}=\chi (e), \end{aligned}$$
(3.3.2)

we say a and b are congruent \(\text {mod}\ q\) under the distribution \(\chi \), denoted as

$$\begin{aligned} a\equiv _{\chi } b+e\ (\text {mod}\ q),\ \text {or}\ a=_{\chi } b+e. \end{aligned}$$
(3.3.3)

The above formula is called a random congruence equation with error under \(\chi \), and it is sometimes abbreviated as \(a=b+e\).

Based on the above random congruence equation, we give the definition of the LWE distribution \(A_{s,\chi }\).

Definition 3.3.2

Let \(s\in \mathbb {Z}_q^n\) and let \(\chi \) be a given distribution on \(\mathbb {Z}_q\). The LWE distribution \(A_{s,\chi }=(a,b)\in \mathbb {Z}_q^n \times \mathbb {Z}_q\) generated by s and \(\chi \) satisfies:

(1) \(a\in \mathbb {Z}_q^n\) is uniformly distributed;

(2) \(b=_{\chi } <a,s>+e\), where \(<a,s>\) is the inner product of a and s in \(\mathbb {Z}_q\), and \(e\in \mathbb {Z}_q\) has the distribution \(\chi \), i.e. \(e\leftarrow \chi \). We call \(A_{s,\chi }=(a,b)\in \mathbb {Z}_q^n \times \mathbb {Z}_q\) the LWE distribution; s is called the private key and \(\chi \) the error distribution. If \(b\in \mathbb {Z}_q\) is uniformly distributed, then \(A_{s,\chi }\) is called the uniform LWE distribution.

Under the LWE distribution \(A_{s,\chi }=(a,b)\in \mathbb {Z}_q^n \times \mathbb {Z}_q\), for a given error \(e\in \mathbb {Z}_q\), finding the private key \(s=(s_1,s_2,\dots ,s_n)'\in \mathbb {Z}_q^n\) essentially amounts to solving the random knapsack problem on the ring \(\mathbb {Z}_q\):

$$\begin{aligned} b\equiv a_1 s_1+a_2 s_2+\cdots +a_n s_n\ (\text {mod}\ q), \end{aligned}$$

that is, solve \(s\in \mathbb {Z}_q^n\) under the probability distribution \(\chi (e)\). Next, we give the definition of the LWE problem \(\text {LWE}_{n,q,\chi ,m}\) with parameters \(n\geqslant 1\), \(q\geqslant 2\), \(m\geqslant 1\) and \(\chi \).

Definition 3.3.3

For any m independent samples \((a_i,b_i)\in \mathbb {Z}_q^n \times \mathbb {Z}_q\ (1\leqslant i\leqslant m)\) of \(A_{s,\chi }\) and randomly selected samples of the error distribution \(e=\begin{pmatrix} e_1 \\ \vdots \\ e_m \end{pmatrix}\), \(e_i\in \mathbb {Z}_q\), \(e_i\leftarrow \chi \), the \(\text {LWE}_{n,q,\chi ,m}\) problem is to find the private key \(s\in \mathbb {Z}_q^n\) with high probability (larger than \(1-\delta \)). In other words, find \(s\in \mathbb {Z}_q^n\) satisfying

$$\begin{aligned} \left\{ \begin{array}{l} b_1=_{\chi }<a_1,s>+e_1\\ b_2=_{\chi }<a_2,s>+e_2\\ \vdots \\ b_m=_{\chi }<a_m,s>+e_m \end{array} \right. . \end{aligned}$$
(3.3.4)

Remark 3.3.1

If \(\chi \) is the trivial distribution, i.e. \(\chi (0)=1\) and \(\chi (k)=0\) for \(1\leqslant k<q\), then the samples of \(\chi \) are \(e=\begin{pmatrix} 0 \\ \vdots \\ 0 \end{pmatrix}\), and (3.3.4) becomes m deterministic congruence equations

$$\begin{aligned} \left\{ \begin{array}{l} b_1\equiv<a_1,s>\ (\text {mod}\ q)\\ b_2\equiv<a_2,s>\ (\text {mod}\ q)\\ \vdots \\ b_m\equiv <a_m,s>\ (\text {mod}\ q) \end{array} \right. . \end{aligned}$$

Based on Gauss elimination, we can compute the unique private key \(s\in \mathbb {Z}_q^n\) from n such congruence equations, and the computational complexity is polynomial.

Remark 3.3.2

Let \(q=2\) and \(\chi \) be the two point distribution with parameter \(\varepsilon \) on \(\mathbb {Z}_2\); then the LWE problem on \(\mathbb {Z}_2\) is just the LPE problem. For any error sample \(e=\begin{pmatrix} e_1 \\ \vdots \\ e_m \end{pmatrix}\), if \(e_i=1\), from

$$\begin{aligned} \text {Pr}\{b_i\equiv <a_i,s>+1\ (\text {mod}\ 2)\}=\varepsilon , \end{aligned}$$

we can get

$$\begin{aligned} \text {Pr}\{b_i\equiv <a_i,s>\ (\text {mod}\ 2)\}=1-\varepsilon . \end{aligned}$$

Matrix representation of the \(\text {LWE}_{n,q,\chi ,m}\) problem

Let \(A=[a_1,a_2,\dots ,a_m]_{n\times m}\in \mathbb {Z}_q^{n\times m}\) be a uniformly distributed random matrix, \(b=\begin{pmatrix} b_1 \\ b_2 \\ \vdots \\ b_m \end{pmatrix}\in \mathbb {Z}_q^m\), and \(e=\begin{pmatrix} e_1 \\ e_2 \\ \vdots \\ e_m \end{pmatrix}\in \mathbb {Z}_q^m\) be the error vector with \(e\leftarrow \chi ^m\); find the private key \(s\in \mathbb {Z}_q^n\) such that

$$\begin{aligned} b\equiv _{\chi } A's+e\ (\text {mod}\ q), \end{aligned}$$
(3.3.5)

where \(A'\) is the transpose matrix of A, and (3.3.5) is a system of random congruence equations with errors. The probability that the i-th congruence equation holds is \(\chi (e_i)\), so

$$\begin{aligned} \text {Pr}\{b\equiv _{\chi } A's+e\ (\text {mod}\ q)\}=\Pi _{i=1}^m \chi (e_i)=\chi (e). \end{aligned}$$
(3.3.6)

Let \(\Lambda _q(A)\) and \(\Lambda _q^{\bot }(A)\) be q ary integral lattices (see Sect. 7.3 of Chap. 7 in Zheng (2022)), defined by:

$$\begin{aligned} \left\{ \begin{array}{l} \Lambda _q(A)=\{A'x\ |\ x\in \mathbb {Z}^n\}+q\mathbb {Z}^m \\ \Lambda _q^{\bot }(A)=\{x\in \mathbb {Z}^m\ |\ Ax\equiv 0\ (\text {mod}\ q)\} \end{array} \right. . \end{aligned}$$
(3.3.7)

Since \(\Lambda _q(A)=q\Lambda _q^{\bot }(A)^{*}\) and \(A's\in \Lambda _q(A)\), the geometric meaning of \(\text {LWE}_{n,q,\chi ,m}\) is, for a given \(b\in \mathbb {Z}_q^m\), to find a lattice vector \(A's\) near b such that the difference \(b-A's\) has the distribution \(\chi ^m\); in this sense LWE is dual to the SIS problem.
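
A sketch of the matrix form (3.3.5) for concrete parameters (the parameter values and the rounded Gaussian used as a stand-in for \(\chi \) are our assumptions, not prescribed by the text):

```python
import numpy as np

def lwe_instance(n, m, q, sigma, rng):
    """Sample s, uniform A in Z_q^{n x m}, errors e <- chi^m, and b = A's + e mod q."""
    s = rng.integers(0, q, size=n)                  # private key
    A = rng.integers(0, q, size=(n, m))             # uniform public matrix
    e = np.rint(rng.normal(0.0, sigma, size=m)).astype(int) % q  # assumed chi
    b = (A.T @ s + e) % q                           # b = A's + e (mod q)
    return A, b, s, e

rng = np.random.default_rng(0)
A, b, s, e = lwe_instance(n=16, m=32, q=97, sigma=2.0, rng=rng)
# b lies near the lattice point A's of Lambda_q(A); their difference is the error
assert np.array_equal((b - A.T @ s) % 97, e)
```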

Lemma 3.3.1

Suppose \(A\in \mathbb {Z}_q^{n\times m}\) is a uniformly distributed random matrix, \(A=[A_1,A_2]\), where \(A_1\in \mathbb {Z}_q^{n\times n}\) is an invertible matrix. Let \(\overline{A}=A_1^{-1}A=[I_n,A_1^{-1}A_2]\); then \(A_{s,\chi }\) and \(\overline{A}_{s,\chi }\) have the same probability distribution.

Proof

From Lemma 2.1.1 in Chap. 2, if A is uniformly distributed, then \(\overline{A}\) is also a uniform random matrix. Assume \(b\in \mathbb {Z}_q^m\), \(e\in \mathbb {Z}_q^m\) and \(s\in \mathbb {Z}_q^n\) satisfy

$$\begin{aligned} b\equiv _{\chi }A's+e\ (\text {mod}\ q). \end{aligned}$$

Let \(A_1^{*}=(A_1^{'})^{-1}\) and \(\overline{s}=A_1^{'}s\). Since \(\overline{A}=A_1^{-1}A\), the transpose satisfies \(\overline{A}'=A'A_1^{*}\), hence

$$\begin{aligned} \overline{A}'\overline{s}=A'A_1^{*}A_1^{'}s=A's, \end{aligned}$$

and therefore

$$\begin{aligned} b\equiv _{\chi } \overline{A}'\overline{s}+e\ (\text {mod}\ q). \end{aligned}$$

Thus \((\overline{A},b)\) is an LWE instance with private key \(\overline{s}\) and the same error e. Obviously, \(s\mapsto \overline{s}=A_1^{'}s\) is a bijection of \(\mathbb {Z}_q^n\), so \(\overline{s}\) is uniformly distributed whenever s is, and the two LWE distributions coincide.

The lemma holds.    \(\square \)

The above \(\overline{A}=[I_n,A_1^{-1}A_2]\) is called the normal form of the LWE problem.

Lemma 3.3.2

Let x, y, z be three random variables on \(\mathbb {Z}_q\) such that x and y are independent and \(z\equiv x+y\ (\text {mod}\ q)\). If x is uniformly distributed on \(\mathbb {Z}_q\), then so is z.

Proof

For any integer \(0\leqslant i\leqslant q-1\), we compute the probability that z takes the value i:

$$\begin{aligned} \begin{aligned} \text {Pr}\{z=i\}&=\sum \limits _{j=0}^{q-1} Pr\{x=j,y=i-j\} \\&=\sum \limits _{j=0}^{q-1} Pr\{x=j\}Pr\{y=i-j\} \\&=\frac{1}{q} \sum \limits _{j=0}^{q-1} Pr\{y=i-j\}=\frac{1}{q}. \end{aligned} \end{aligned}$$

   \(\square \)

Lemma 3.3.3

In the LWE distribution \(A_{s,\chi }=(a,b)\), b is uniformly distributed if and only if \(b-<a,s>\) is uniformly distributed.

Proof

If \(b-<a,s>\) is uniformly distributed, from \(b=(b-<a,s>)+<a,s>\) and Lemma 3.3.2, we get b is uniform. On the other hand, if b is uniform, from \(b-<a,s>=b+(-<a,s>)\) and Lemma 3.3.2 again, \(b-<a,s>\) is also uniformly distributed.    \(\square \)

According to Definition 3.3.1, the above lemma gives an equivalent condition for \(A_{s,\chi }\) to be a uniform LWE distribution. An equivalent form of the LWE problem is the decision LWE problem, which we call the D-LWE problem.

Definition 3.3.4

(D-LWE problem) Given a uniformly distributed \(a\in \mathbb {Z}_q^n\), \(s\in \mathbb {Z}_q^n\), and \(e\in \mathbb {Z}_q\) with the distribution \(\chi \), decide whether \(<a,s>+e\) is uniform, with positive probability over the choice of s.

The D-LWE problem seems easy; however, its difficulty is equivalent to that of the LWE problem. We will prove this equivalence in detail in Sect. 3.4. Here we focus on the probability distribution \(\chi \) of the LWE problem. Usually, \(\chi \) is taken to be the discrete Gauss distribution on \(\mathbb {Z}_q\). In Chap. 1, we discussed in detail the discretization onto a lattice of a continuous random variable with Gauss distribution on \(\mathbb {R}^n\). The discrete Gauss distribution on \(\mathbb {Z}_q\) is actually the discretization of the Gauss distribution onto \(\mathbb {Z}_q\).

Recall the definition of the Gauss function \(\rho _s(x)\) in Chap. 1 (see (3.2.1)):

$$\begin{aligned} \rho _s(x)=e^{-\frac{\pi }{s^2}|x|^2},\ x\in \mathbb {R}^n. \end{aligned}$$
(3.3.8)

If \(n=1\), then \(\frac{1}{s}\rho _s(x)\) is the density function of a continuous random variable on \(\mathbb {R}\). Reducing the corresponding random variable \(\text {mod}\ 1\) yields a continuous random variable defined on the interval \(\mathbb {T}\equiv [0,1)\ (\text {mod}\ 1)\) of length 1, with the density function

$$\begin{aligned} \psi _{\beta }(x)=\sum \limits _{k=-\infty }^{+\infty }\frac{1}{\beta } e^{-\frac{\pi }{\beta ^2} (x-k)^2},\ x\in \mathbb {T}. \end{aligned}$$
(3.3.9)

It is easy to see that

$$\begin{aligned} \int \limits _{\mathbb {T}} \psi _{\beta }(x) \textrm{d}x=\int \limits _0^1 \psi _{\beta }(x) \textrm{d}x=\int \limits _{\mathbb {R}}\frac{1}{\beta } \rho _{\beta }(x)\textrm{d}x=1. \end{aligned}$$

In order to estimate the statistical distance between the random variables defined by different values of \(\beta \), we first prove the following two lemmas.

Lemma 3.3.4

Let t and l be positive real numbers, \(x,y\in \mathbb {R}^n\) satisfy

$$\begin{aligned} |x|\leqslant t,\ \text {and}\ |x-y|\leqslant l. \end{aligned}$$

Then

$$\begin{aligned} \rho _s(y)\geqslant \left( 1-\frac{\pi }{s^2}(2tl+l^2)\right) \rho _s(x). \end{aligned}$$
(3.3.10)

Proof

For any \(z\in \mathbb {R}\), we have

$$\begin{aligned} e^{-z}\geqslant 1-z. \end{aligned}$$

Therefore,

$$\begin{aligned} \begin{aligned} \rho _s(y)&=e^{-\frac{\pi }{s^2} |y|^2}\geqslant e^{-\frac{\pi }{s^2} (|x|+|y-x|)^2} \\&\geqslant e^{-\frac{\pi }{s^2} (|x|^2+2l|x|+l^2)} \\&\geqslant e^{-\frac{\pi }{s^2} (|x|^2+2tl+l^2)} \\&\geqslant (1-\frac{\pi }{s^2}(2tl+l^2))\rho _s(x). \end{aligned} \end{aligned}$$

   \(\square \)

Lemma 3.3.5

Let \(0<\alpha <\beta \leqslant 2\alpha \). Then the statistical distance between \(\psi _{\alpha }\) and \(\psi _{\beta }\) satisfies

$$\begin{aligned} \Delta (\psi _{\alpha },\psi _{\beta })\leqslant \frac{9}{2}(\frac{\beta }{\alpha }-1). \end{aligned}$$
(3.3.11)

Proof

Based on

$$\begin{aligned} \int \limits _0^1 \psi _{\beta }(x) \textrm{d}x=1, \end{aligned}$$

it follows that

$$\begin{aligned} \begin{aligned}&\int \limits _0^1 |\psi _{\beta }(x)-\psi _{\alpha }(x)|\textrm{d}x \\&\quad =\int \limits _0^1 \left| {\sum \limits _{k=-\infty }^{+\infty } \left( \frac{1}{\beta } e^{-\frac{\pi }{\beta ^2} |k-x|^2}-\frac{1}{\alpha } e^{-\frac{\pi }{\alpha ^2} |k-x|^2}\right) } \right| \textrm{d}x \\&\quad \leqslant \sum \limits _{k=-\infty }^{+\infty } \int \limits _0^1 \left| {\frac{1}{\beta } e^{-\frac{\pi }{\beta ^2} |x-k|^2}-\frac{1}{\alpha } e^{-\frac{\pi }{\alpha ^2} |x-k|^2}} \right| \textrm{d}x \\&\quad =\int \limits _{-\infty }^{+\infty } \left| {\frac{1}{\beta } e^{-\frac{\pi }{\beta ^2} |x|^2}-\frac{1}{\alpha } e^{-\frac{\pi }{\alpha ^2} |x|^2}} \right| \textrm{d}x. \end{aligned} \end{aligned}$$

Substituting \(x=\alpha y\), we get

$$\begin{aligned} \int \limits _0^1 |\psi _{\beta }(x)-\psi _{\alpha }(x)|\textrm{d}x\leqslant \int \limits _{-\infty }^{+\infty } \left| {\frac{1}{\beta /\alpha }e^{-\frac{\pi }{(\beta /\alpha )^2} |y|^2}-e^{-\pi |y|^2}} \right| \textrm{d}y. \end{aligned}$$
(3.3.12)

Without loss of generality, assume \(\alpha =1\), \(\beta =1+\varepsilon \), where \(0<\varepsilon \leqslant 1\); we estimate the right-hand side of (3.3.12):

$$\begin{aligned} \begin{aligned}&\int \limits _{\mathbb {R}} \left| {e^{-\pi |x|^2}-\frac{1}{1+\varepsilon }e^{-\frac{\pi }{(1+\varepsilon )^2} |x|^2}} \right| \textrm{d}x \\&\quad \leqslant \int \limits _{\mathbb {R}} \left| {e^{-\pi |x|^2}-e^{-\frac{\pi }{(1+\varepsilon )^2} |x|^2}} \right| \textrm{d}x+\int \limits _{\mathbb {R}} (1-\frac{1}{1+\varepsilon }) e^{-\frac{\pi }{(1+\varepsilon )^2} |x|^2} \textrm{d}x \\&\quad =\int \limits _{\mathbb {R}} \left| {e^{-\pi |x|^2}-e^{-\frac{\pi }{(1+\varepsilon )^2} |x|^2}} \right| \textrm{d}x+\varepsilon \\&\quad =\int \limits _{\mathbb {R}} \left| {1-e^{-\pi (1-\frac{1}{(1+\varepsilon )^2}) x^2}} \right| \cdot e^{-\frac{\pi }{(1+\varepsilon )^2} x^2} \textrm{d}x+\varepsilon . \end{aligned} \end{aligned}$$

For any \(z\geqslant 0\), we have

$$\begin{aligned} 1-z\leqslant e^{-z} \leqslant 1\Rightarrow 0\leqslant 1-e^{-z}\leqslant z, \end{aligned}$$

and

$$\begin{aligned} \begin{aligned}&\left| {e^{-\pi (1-\frac{1}{(1+\varepsilon )^2}) x^2}-1} \right| \leqslant \pi (1-\frac{1}{(1+\varepsilon )^2})x^2 \\&\quad =\frac{\pi }{(1+\varepsilon )^2}(2\varepsilon +\varepsilon ^2)x^2\leqslant 2\pi \varepsilon x^2. \end{aligned} \end{aligned}$$

Finally,

$$\begin{aligned} \int \limits _0^1 |\psi _{\alpha }(x)-\psi _{\beta }(x)|\textrm{d}x\leqslant 2\pi \varepsilon \int \limits _{\mathbb {R}} x^2 e^{-\frac{\pi }{(1+\varepsilon )^2}x^2} \textrm{d}x+\varepsilon = \varepsilon +\varepsilon (1+\varepsilon )^3 \leqslant 9\varepsilon . \end{aligned}$$

Since \(\varepsilon =\frac{\beta }{\alpha }-1\), based on (3.3.12),

$$\begin{aligned} \Delta (\psi _{\alpha },\psi _{\beta })=\frac{1}{2}\int _0^1 |\psi _{\alpha }(x)-\psi _{\beta }(x)|\textrm{d}x\leqslant \frac{9}{2}(\frac{\beta }{\alpha }-1). \end{aligned}$$

We complete the proof of this lemma.    \(\square \)
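As a sanity check of Lemma 3.3.5, one can estimate \(\Delta (\psi _{\alpha },\psi _{\beta })\) numerically and compare it with the bound \(\frac{9}{2}(\frac{\beta }{\alpha }-1)\). The sketch below is our own illustration; the series truncation and grid size are assumptions.

```python
import numpy as np

def psi(beta, x, K=20):
    # Wrapped Gauss density (3.3.9), series truncated at |k| <= K.
    k = np.arange(-K, K + 1)
    return np.sum(np.exp(-np.pi * (x - k) ** 2 / beta ** 2)) / beta

alpha, beta = 0.3, 0.33          # 0 < alpha < beta <= 2*alpha
xs = np.linspace(0.0, 1.0, 20000, endpoint=False)
gap = [abs(psi(beta, x) - psi(alpha, x)) for x in xs]
delta = 0.5 * np.mean(gap)       # statistical distance, as in the proof
print(delta, 4.5 * (beta / alpha - 1))  # delta should not exceed the bound
```

In this example the bound \(\frac{9}{2}(\beta /\alpha -1)=0.45\) is quite loose; the actual statistical distance is much smaller.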

In order to obtain the discrete Gauss distribution on \(\mathbb {Z}_q\), we describe a discretization procedure for continuous random variables. Let \(\mathbb {T}\) be any interval of length 1 on \(\mathbb {R}\), denoted as

$$\begin{aligned} \mathbb {T}\equiv [0,1)\ (\text {mod}\ 1). \end{aligned}$$

If \(\varphi (x)\) is the density function of a continuous random variable \(\varphi \) on \(\mathbb {T}\), we define a discrete random variable \(\overline{\varphi }\) on \(\mathbb {Z}_q\) by

$$\begin{aligned} \overline{\varphi }=\lfloor q\varphi \rceil , \end{aligned}$$
(3.3.13)

that is, if \(\varphi \) takes a value \(x\in \mathbb {T}\), then \(\overline{\varphi }\) takes the value \(\lfloor qx \rceil \ \text {mod}\ q\), where \(\lfloor x \rceil \) is the closest integer to x. When x runs over [0, 1), obviously \(\lfloor qx \rceil \) runs over \(\mathbb {Z}_q\), so \(\overline{\varphi }\) defined in (3.3.13) is indeed a discrete random variable on \(\mathbb {Z}_q\). We call \(\overline{\varphi }\) the discretization of \(\varphi \).

Lemma 3.3.6

If \(\varphi \) is a continuous random variable on \(\mathbb {T}\) with the density function \(\varphi (x)\), then \(\overline{\varphi }\) is a discrete random variable on \(\mathbb {Z}_q\), and its probability distribution \(\overline{\varphi }(k)\) is

$$\begin{aligned} \text {Pr}\{\overline{\varphi }=k\}=\overline{\varphi }(k)=\int \limits _{(k-\frac{1}{2})/q}^{(k+\frac{1}{2})/q} \varphi (x) \textrm{d}x,\ k\in \mathbb {Z}_q. \end{aligned}$$

Proof

$$\begin{aligned} \begin{aligned}&\text {Pr}\{\overline{\varphi }=k\}=\text {Pr}\{\lfloor q\varphi \rceil =k\}=\text {Pr} \left\{ {k-\frac{1}{2}\leqslant q\varphi<k+\frac{1}{2}} \right\} \\&\quad =\text {Pr}\left\{ {\left( k-\frac{1}{2}\right) /q\leqslant \varphi <\left( k+\frac{1}{2}\right) /q} \right\} =\int \limits _{(k-\frac{1}{2})/q}^{(k+\frac{1}{2})/q} \varphi (x) \textrm{d}x. \end{aligned} \end{aligned}$$

   \(\square \)

Definition 3.3.5

The discrete Gauss distribution \(\overline{\psi }_{\beta }\) on \(\mathbb {Z}_q\) is defined by

$$\begin{aligned} \overline{\psi }_{\beta }(k)=\int \limits _{(k-\frac{1}{2})/q}^{(k+\frac{1}{2})/q} \psi _{\beta }(x) \textrm{d}x, \end{aligned}$$
(3.3.14)

where \(\psi _{\beta }(x)\) is the continuous Gauss distribution on \(\mathbb {T}\) in (3.3.9) and \(\beta \) is called the parameter of discrete Gauss distribution.
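The sketch below (added for illustration; the number of integration points and the series truncation are our assumptions) computes \(\overline{\psi }_{\beta }(k)\) by numerically integrating (3.3.14), and also shows the equivalent sampling view (3.3.13): draw \(x\sim \psi _{\beta }\) and output \(\lfloor qx\rceil \ \text {mod}\ q\).

```python
import numpy as np

def psi_bar(beta, q, k, K=20, pts=2000):
    # psi_bar_beta(k) of (3.3.14): integrate the wrapped Gauss density
    # (3.3.9) over [(k - 1/2)/q, (k + 1/2)/q] by a Riemann sum.
    xs = np.linspace((k - 0.5) / q, (k + 0.5) / q, pts, endpoint=False)
    js = np.arange(-K, K + 1)
    vals = [np.sum(np.exp(-np.pi * (x - js) ** 2 / beta ** 2)) / beta for x in xs]
    return np.mean(vals) / q            # the interval has length 1/q

q, beta = 17, 0.1
print(sum(psi_bar(beta, q, k) for k in range(q)))   # ~= 1.0

# Sampling view of (3.3.13): x ~ psi_beta is a Gaussian of standard
# deviation beta/sqrt(2*pi) reduced mod 1; round q*x and reduce mod q.
rng = np.random.default_rng(0)
x = rng.normal(0.0, beta / np.sqrt(2 * np.pi), size=10) % 1.0
print(np.rint(q * x).astype(int) % q)
```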

In the LWE problem, usually we suppose \(\chi =\overline{\psi }_{\beta }\) is a discrete Gauss distribution. The main result in this chapter is the following theorem.

Theorem 3.3.1

Let \(m=\text {Poly}(n)\), \(q\leqslant 2^{\text {Poly}(n)}\), and \(\chi =\overline{\psi }_{\alpha }\) be the discrete Gauss distribution with parameter \(\alpha \), where \(0<\alpha <1\) and \(\alpha q\geqslant 2\sqrt{n}\). Then, under quantum reduction, solving the \(\text {D-LWE}_{n,q,\chi ,m}\) problem is at least as hard as solving the \(\text {GapSVP}_{\gamma }\) or \(\text {SIVP}_{\gamma }\) problem on any n dimensional full rank lattice L, where \(\gamma =\tilde{O}(\frac{n}{\alpha })\).

The proof of Theorem 3.3.1 will be given in the next section. Here we only introduce the idea of this proof. The proof of Theorem 3.3.1 is mainly divided into the following two steps:

(1) Using the quantum reduction algorithm, prove that the \(\text {LWE}_{n,q,\chi ,m}\) problem is at least as hard as the hard problems on any lattice, such as the GapSVP and SIVP problems.

(2) Prove that the difficulty of the \(\text {D-LWE}_{n,q,\chi ,m}\) problem is not lower than that of the \(\text {LWE}_{n,q,\chi ,m}\) problem (see Theorem 3.4.1 in the next section). The original proof of Theorem 3.4.1 required the modulus q to be a prime number, such as \(q=2\). It was later generalized to the general case \(q\leqslant 2^{\text {Poly}(n)}\) (see Regev (2009) and Peikert (2009)), which completes the proof of Theorem 3.3.1.

3.4 Proof of the Main Theorem

In this section, we mainly prove that the difficulty of solving the D-LWE problem is not lower than that of the hard problems on lattice; that is, if there is an algorithm for solving the D-LWE problem, then there exists a quantum algorithm to solve the hard problems on lattice. The whole proof is divided into three parts. In order to better understand them, we first introduce the definition of the DGS problem.

Definition 3.4.1

\(\text {DGS}_{\phi }\): given an n dimensional lattice L with generated matrix B, a real number \(r>\phi (B)\), where \(\phi \) is a real function of B. The goal is to output a sample from the discrete Gauss distribution \(D_{L,r}\).

The DGS problem is also called the discrete Gauss sampling problem. We will see from this proof that the difficulty of the DGS problem is polynomially equivalent to that of the hard problems on lattice. Next we introduce the idea of proving that the D-LWE problem is at least as difficult as the hard problems on lattice. The proof is divided into three parts, given in Sects. 3.4.1, 3.4.2 and 3.4.3. In Sect. 3.4.1, we prove that if there is an algorithm to solve the LWE problem, then there is a quantum algorithm to solve the \(\text {DGS}_{\sqrt{2n}\eta _{\varepsilon }(L)/\alpha }\) problem. In Sect. 3.4.2, we give a reduction algorithm from the \(\text {GIVP}_{2\sqrt{n}\phi }\) problem to the \(\text {DGS}_{\phi }\) problem, which completes the proof that the LWE problem is not less difficult than the hard problems on lattice. In Sect. 3.4.3, we further prove that solving the \(\text {D-LWE}_{n,q,\chi ,m}\) problem is at least as hard as solving the \(\text {LWE}_{n,q,\chi ,m}\) problem, and complete the proof of Theorem 3.3.1. The flowchart of the whole proof is shown in Fig. 3.1.

Fig. 3.1 The flowchart of the proof of Theorem 3.3.1

3.4.1 From LWE to DGS

In this subsection, we will solve the \(\text {DGS}_{\sqrt{2n}\eta _{\varepsilon }(L)/\alpha }\) problem by the algorithm for the \(\text {LWE}_{n,q,\psi _{\alpha },m}\) problem. The main conclusion is the following Lemma 3.4.1, and its proof depends on Lemmas 3.4.2 and 3.4.3. We state these three lemmas first.

Lemma 3.4.1

Let \(m=\text {Poly}(n)\), \(\varepsilon =\varepsilon (n)\) be a negligible function of n, \(q=q(n)\) be a positive integer, \(\alpha =\alpha (n)\in (0,1)\), \(\alpha q\geqslant 2\sqrt{n}\), \(\chi =\psi _{\alpha }\). Assume that we have an algorithm W that solves the \(\text {LWE}_{n,q,\psi _{\alpha },m}\) problem given a polynomial number of samples, then there exists an efficient quantum algorithm for the \(\text {DGS}_{\sqrt{2n}\eta _{\varepsilon }(L)/\alpha }\) problem.

Lemma 3.4.2

For any n dimensional lattice L and a real number \(r>2^{2n}\lambda _n(L)\), there exists an efficient algorithm that outputs a sample from a distribution within statistical distance \(2^{-\Omega (n)}\) of the discrete Gauss distribution \(D_{L,r}\); that is, the statistical distance is exponentially small in n.

Lemma 3.4.3

Let \(m=\text {Poly}(n)\), \(\varepsilon =\varepsilon (n)\) be a negligible function of n, \(q=q(n)\geqslant 2\) be a positive integer, \(\alpha =\alpha (n)\in (0,1)\). Assume that we have an algorithm W that solves the \(\text {LWE}_{n,q,\psi _{\alpha },m}\) problem given a polynomial number of samples, then there exists a constant \(c>0\) and an efficient quantum algorithm that, given any n dimensional lattice L, a real number \(r>\sqrt{2}q \eta _{\varepsilon }(L)\) and \(n^c\) samples from \(D_{L,r}\), outputs a sample from \(D_{L,r\sqrt{n}/(\alpha q)}\).

Proof of Lemma 3.4.1: Given an n dimensional lattice L and a real number \(r>\sqrt{2n}\eta _{\varepsilon }(L)/\alpha \), our goal is to output a sample from the discrete Gauss distribution \(D_{L,r}\). The idea of the proof is an iterative descent. Let

$$\begin{aligned} r_i=r(\alpha q/\sqrt{n})^i,\quad i=1,2,\dots ,3n. \end{aligned}$$

Based on Lemma 1.3.6 in Chap. 1,

$$\begin{aligned} r_{3n}>2^{3n}r>2^{3n} \sqrt{2n}\eta _{\varepsilon }(L)/\alpha \geqslant 2^{3n}\sqrt{2n}\sqrt{\frac{\ln 1/\varepsilon }{\pi }}\frac{\lambda _n(L)}{n}>2^{2n}\lambda _n(L). \end{aligned}$$

By Lemma 3.4.2, we can produce samples from the discrete Gauss distribution \(D_{L,r_{3n}}\). Let c be the constant from Lemma 3.4.3; we generate \(n^c\) samples from \(D_{L,r_{3n}}\). According to Lemma 3.4.3, we can then get samples from the distribution \(D_{L,r_{3n}\sqrt{n}/(\alpha q)}\), i.e. \(D_{L,r_{3n-1}}\). We repeat this process; since

$$\begin{aligned} r_1=r\alpha q/\sqrt{n}>\sqrt{2n}\eta _{\varepsilon }(L)/\alpha \cdot \alpha q/\sqrt{n}=\sqrt{2}q \eta _{\varepsilon }(L), \end{aligned}$$

which satisfies the condition of Lemma 3.4.3, finally we can output a sample from \(D_{L,r_1 \sqrt{n}/(\alpha q)}=D_{L,r}\). The lemma holds.

   \(\square \)
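The numbers below illustrate the iteration just described: since \(\alpha q\geqslant 2\sqrt{n}\), each application of Lemma 3.4.3 divides the Gauss parameter by \(\alpha q/\sqrt{n}\geqslant 2\), so after 3n descents from \(r_{3n}\) we land exactly on r. The toy parameters are our own choice, at the borderline \(\alpha q=2\sqrt{n}\).

```python
import math

n, q = 64, 4093
alpha = 2 * math.sqrt(n) / q        # the borderline case alpha*q = 2*sqrt(n)
step = alpha * q / math.sqrt(n)     # >= 2 whenever alpha*q >= 2*sqrt(n)
r = 10.0
radii = [r * step ** i for i in range(3 * n + 1)]   # r_0 = r, ..., r_{3n}
print(radii[-1] / (2 ** (3 * n) * r))   # ~= 1: r_{3n} >= 2^{3n} r, large
# enough for Lemma 3.4.2; each backward step r_i -> r_{i-1} uses Lemma 3.4.3.
```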

Proof of Lemma 3.4.2: By the LLL algorithm (Lenstra et al. (1982)), we can choose the generated matrix \(B=[b_1,b_2,\dots ,b_n]\) of L satisfying \(|b_i|\leqslant 2^n \lambda _n(L)\), \(1\leqslant i\leqslant n\). Suppose F(B) is the basic neighborhood of lattice L. The algorithm in Lemma 3.4.2 can be achieved by the following steps. First we generate a sample y from the continuous Gauss distribution \(D_r\), where

$$\begin{aligned} D_r(x)=\frac{\rho _r(x)}{r^n},\ \forall x\in \mathbb {R}^n. \end{aligned}$$

We set \(y'=y\ \text {mod}\ L\in F(B)\) and \(x=y-y'\in L\). Denote the distribution of x by \(\xi \); next we prove that the statistical distance between \(\xi \) and \(D_{L,r}\) is exponentially small. Note that

$$\begin{aligned} |y'|\leqslant \text {diam}(F(B))\leqslant \sum \limits _{i=1}^n |b_i|\leqslant n 2^n \lambda _n(L), \end{aligned}$$

where

$$\begin{aligned} \text {diam}(F(B))=\max \{|u-v|\ \big |\ u,v\in F(B)\}. \end{aligned}$$

Based on Lemma 1.3.4 in Chap. 1,

$$\begin{aligned} \rho (L\backslash \sqrt{n}rN)<(r\sqrt{2\pi e}e^{-\pi r^2})^n \rho (L), \end{aligned}$$

here N is the unit ball. This means \(\rho (L\backslash \sqrt{n}rN)\) is exponentially small, so we can always assume \(|x|\leqslant \sqrt{n}r\). Applying Lemma 3.3.4 with \(t=\sqrt{n}r\) and \(l=n2^n \lambda _n(L)\), by some simple calculations we get

$$\begin{aligned} \begin{aligned} \text {Pr}\{\xi =x\}&=\int \limits _{x+F(B)} D_r(y) \textrm{d}y\geqslant \int \limits _{x+F(B)} (1-2^{-\Omega (n)})D_r(x) \textrm{d}y \\&=(1-2^{-\Omega (n)})D_r(x) \text {det}(L). \end{aligned} \end{aligned}$$

On the other hand, from Lemma 1.3.2 in Chap. 1,

$$\begin{aligned} \text {Pr}\{D_{L,r}=x\}=\frac{\rho _{r}(x)}{\rho _{r}(L)}=\frac{\rho _r(x)}{\text {det}(L^*)r^n \rho _{1/r}(L^*)}\leqslant \frac{\rho _r(x)}{\text {det}(L^*)r^n}=D_r(x) \text {det}(L). \end{aligned}$$

So we have

$$\begin{aligned} \text {Pr}\{\xi =x\}\geqslant (1-2^{-\Omega (n)}) \text {Pr}\{D_{L,r}=x\}. \end{aligned}$$

Summing over \(x\in L\) on both sides, we get that the statistical distance between \(\xi \) and \(D_{L,r}\) is exponentially small. The lemma holds.    \(\square \)
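A minimal sketch of this sampling procedure, for a toy 3-dimensional lattice of our own choosing: draw \(y\sim D_r\), reduce it modulo the basis to obtain \(y'\in F(B)\), and output \(x=y-y'\in L\). For r far above \(\lambda _n(L)\) the output is statistically close to \(D_{L,r}\), as the proof shows.

```python
import numpy as np

rng = np.random.default_rng(0)
B = np.array([[2.0, 1.0, 0.0],
              [0.0, 3.0, 1.0],
              [0.0, 0.0, 4.0]])          # columns b_1, b_2, b_3 of a toy basis
r = 1e4                                   # r >> lambda_n(L) for this lattice

y = rng.normal(0.0, r / np.sqrt(2 * np.pi), size=3)   # y ~ D_r on R^3
c = np.linalg.solve(B, y)                # coordinates of y in the basis B
x = B @ np.floor(c)                      # x = y - y' in L, with y' in F(B)
print(x)                                 # one approximate sample of D_{L,r}
```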

Definition 3.4.2

(1) \(\text {CVP}_{L,d}\): given an n dimensional lattice L, a target vector \(t\in \mathbb {R}^n\), a real number d, \(\text {dist}(t,L)\leqslant d\), find \(u\in L\) such that

$$\begin{aligned} |u-t|=\min \limits _{x\in L,|x-t|\leqslant d} |x-t|. \end{aligned}$$

(2) \(\text {CVP}_{L,d}^{(q)}\): given an n dimensional lattice L with generated matrix B, a target vector \(t\in \mathbb {R}^n\), a real number d, \(\text {dist}(t,L)\leqslant d\), denote \(K_L(t)=\{u\in L\ |\ |u-t|=\min \limits _{x\in L}|x-t|\}\), i.e. \(K_L(t)\) is the closest vector to t in the lattice L, output \(B^{-1}K_L(t)\ \text {mod}\ q\in \mathbb {Z}_q^n\).

The CVP problem is also called the closest vector problem. In order to prove Lemma 3.4.3, we need the following two lemmas, Lemmas 3.4.4 and 3.4.14. In Lemma 3.4.4, we use the samples of \(D_{L, r}\) to solve the \(\text {CVP}_{L^*,\alpha q/(\sqrt{2}r)}\) problem, and Lemma 3.4.14 shows that we can generate a sample of \(D_{L,r\sqrt{n}/\alpha q}\) from the algorithm for the \(\text {CVP}_{L^*,\alpha q/(\sqrt{2}r)}\) problem, so that we complete the proof of Lemma 3.4.3. The following content is divided into two parts. In the first part, we use Lemmas 3.4.5 to 3.4.11 to prove Lemma 3.4.4, i.e. to solve the \(\text {CVP}_{L^*,\alpha q/(\sqrt{2}r)}\) problem based on the samples of \(D_{L,r}\). In the second part, we prove Lemma 3.4.14 according to Lemmas 3.4.12 and 3.4.13, achieving the transition from solving \(\text {CVP}_{L^*,\alpha q/(\sqrt{2}r)}\) to sampling \(D_{L,r\sqrt{n}/\alpha q}\).

Lemma 3.4.4

Let \(m=\text {Poly}(n)\), \(\varepsilon =\varepsilon (n)\) be a negligible function of n, \(q=q(n)\geqslant 2\) be a positive integer, \(\alpha =\alpha (n)\in (0,1)\). Assume that we have an algorithm W that solves \(\text {LWE}_{n,q,\psi _{\alpha },m}\) given a polynomial number of samples, then there exists a constant \(c>0\) and an efficient algorithm that, given any n dimensional lattice L, a real number \(r>\sqrt{2}q \eta _{\varepsilon }(L)\), and \(n^c\) samples from \(D_{L,r}\), solves the \(\text {CVP}_{L^*,\alpha q/(\sqrt{2}r)}\) problem.

Proof

This lemma follows directly from Lemmas 3.4.5 to 3.4.11 below.    \(\square \)

Lemma 3.4.5 shows the relationship of difficulty between the CVP and \(\text {CVP}^{(q)}\) problems.

Lemma 3.4.5

Given an n dimensional lattice L, a real number \(d<\lambda _1(L)/2\), \(q\geqslant 2\) is a positive integer. There exists an efficient algorithm to solve the \(\text {CVP}_{L,d}\) problem based on the algorithm for \(\text {CVP}_{L,d}^{(q)}\).

Proof

Let \(x\in \mathbb {R}^n\) satisfying \(\text {dist}(x,L)\leqslant d\) be the target vector, and define sequences \(\{x_i\}\) and \(\{a_i\}\) as follows: \(x_1=x\),

$$\begin{aligned} a_i=B^{-1}K_L(x_i)\in \mathbb {Z}^n,\ i\geqslant 1, \end{aligned}$$

which is the coefficient vector of the closest vector to \(x_i\) in lattice L,

$$\begin{aligned} x_{i+1}=(x_i-B(a_i\ \text {mod}\ q))/q,\ i\geqslant 1, \end{aligned}$$

it is easy to prove

$$\begin{aligned} a_{i+1}=(a_i-(a_i\ \text {mod}\ q))/q, \end{aligned}$$

and

$$\begin{aligned} |x_{i+1}-Ba_{i+1}|\leqslant \frac{d}{q^i}. \end{aligned}$$

That is, the distance from \(x_{n+1}\) to the lattice L is no more than \(\frac{d}{q^n}\), which becomes sufficiently small when n is large enough. Based on the nearest plane algorithm of Babai (1985), we can find \(y\in L\) such that y is the closest vector to \(x_{n+1}\) in the lattice L. Let \(y=Ba\); then \(a_{n+1}=a\). Combined with

$$\begin{aligned} a_{i+1}=(a_i-(a_i\ \text {mod}\ q))/q, \end{aligned}$$

we recover \(a_n,a_{n-1},\dots ,a_1\) in turn, which completes the solution of the \(\text {CVP}_{L,d}\) problem. This lemma holds.    \(\square \)
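The recursion in this proof is easy to phrase as code. In the sketch below (our illustration), `cvp_q_oracle(t)` stands for the assumed \(\text {CVP}^{(q)}\) oracle returning \(B^{-1}K_L(t)\ \text {mod}\ q\), and `babai(B, t)` for Babai's nearest plane algorithm; both names are placeholders for routines the lemma presupposes.

```python
import numpy as np

def cvp_from_cvp_q(B, x, q, steps, cvp_q_oracle, babai):
    # Forward pass: x_{i+1} = (x_i - B (a_i mod q)) / q, recording a_i mod q.
    residues = []
    xi = np.array(x, dtype=float)
    for _ in range(steps):
        ai_mod_q = cvp_q_oracle(xi)          # a_i mod q, a vector in Z_q^n
        residues.append(ai_mod_q)
        xi = (xi - B @ ai_mod_q) / q
    # x_{steps+1} is within d/q^steps of L, so Babai's nearest plane
    # algorithm finds its closest lattice vector's coefficients exactly.
    a = babai(B, xi)
    # Backward pass: a_i = q * a_{i+1} + (a_i mod q).
    for ai_mod_q in reversed(residues):
        a = q * a + ai_mod_q
    return B @ a                             # the closest vector K_L(x)
```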

We introduced the LWE distribution \(A_{s,\chi }\) in Definition 3.3.2, where \(\chi \) is a distribution on \(\mathbb {Z}_q\). If the value space of \(\chi \) is changed to \(\mathbb {T}=[0,1)\), we obtain another form of the LWE distribution.

Definition 3.4.3

Let \(s\in \mathbb {Z}_q^n\), e be a random variable on \(\mathbb {T}\) with density function \(\phi \). The LWE distribution \(A_{s,\phi }=(a,b)\in \mathbb {Z}_q^n\times \mathbb {T}\) generated by s and \(\phi \) satisfies:

(1) \(a\in \mathbb {Z}_q^n\) is uniformly distributed.

(2) \(b=a\cdot s/q+e\ \text {mod}\ 1\).

The LWE distribution we discuss later in this section is always \(A_{s,\phi }\).

Lemma 3.4.6

Let \(q=q(n)\geqslant 1\) be a positive integer. Given \(s'\in \mathbb {Z}_q^n\) and samples from \(A_{s,\psi _{\alpha }}\) for some unknown \(s\in \mathbb {Z}_q^n\) and \(\alpha <1\), there exists an efficient algorithm that determines whether \(s'=s\) with probability exponentially close to 1.

Proof

Let (a, x) be a sample from the LWE distribution \(A_{s,\psi _{\alpha }}\), \(\mathbb {T}=[0,1)\), and let \(\xi \) be a random variable on \(\mathbb {T}\) with density function p(y) such that

$$\begin{aligned} \xi =x-a\cdot s'/q=e+a\cdot (s-s')/q. \end{aligned}$$
(3.4.1)

The steps of the algorithm are as follows. Generate n samples \(y_1,y_2,\dots ,y_n\) of \(\xi \) and compute

$$\begin{aligned} z=\frac{1}{n}\sum \limits _{i=1}^n \cos (2\pi y_i). \end{aligned}$$

If \(z>0.02\), then we confirm \(s=s'\); otherwise, we decide \(s\ne s'\). Next we prove the correctness of this algorithm.

If \(s=s'\), by (3.4.1) we get \(\xi =e\) with the distribution \(\psi _{\alpha }\). On the other hand, if \(s\ne s'\), then there is \(1\leqslant j\leqslant n\) such that \(s_j\ne s_j^{'}\), where \(s_j\) and \(s_j^{'}\) are the jth coordinates of s and \(s'\), respectively. Let \(g=\text {gcd}(q,s_j-s_j^{'})\), \(k=q/g\), and \(a_j\) be the jth coordinate of a. It is not hard to see that the distribution of \(a_j(s_j-s_j^{'})\ \text {mod}\ q\) has period g, i.e. the distribution of \(a_j(s_j-s_j^{'})/q \ \text {mod}\ 1\) has period \(g/q=1/k\), with \(k\geqslant 2\). Since \(\xi \) can be regarded as the sum of \(a_j(s_j-s_j^{'})/q \ \text {mod}\ 1\) and an independent random variable, the distribution of \(\xi \) also has period 1/k. Let \(\tilde{z}\) be the expectation of \(\cos (2\pi \xi )\),

$$\begin{aligned} \tilde{z}=E[\cos (2\pi \xi )]=\int \limits _0^1 \cos (2\pi y)p(y)\textrm{d}y=\text {Re}\int \limits _0^1 e^{2\pi i y}p(y)\textrm{d}y. \end{aligned}$$

When \(s=s'\), the distribution of \(\xi \) is \(\psi _{\alpha }\), and the right-hand side of the above formula can be computed as \(\tilde{z}=e^{-\pi \alpha ^2}\). When \(s\ne s'\), the distribution of \(\xi \) is periodic with period 1/k; since \(e^{2\pi i y}p(y)\) has period 1, its integral over any interval of length 1 is the same, so

$$\begin{aligned} \begin{aligned} \int \limits _0^1 e^{2\pi i y}p(y)\textrm{d}y&=\int \limits _{\frac{1}{k}}^{1+\frac{1}{k}} e^{2\pi i y}p(y)\textrm{d}y \\&=\int \limits _0^1 e^{2\pi i (y+\frac{1}{k})}p(y)\textrm{d}y \\&=e^{\frac{2\pi i}{k}}\int \limits _0^1 e^{2\pi i y}p(y)\textrm{d}y. \end{aligned} \end{aligned}$$

From \(k\geqslant 2\) we know \(\tilde{z}=0\). By the Chebyshev inequality,

$$\begin{aligned} \text {Pr}\{|z-\tilde{z}|\leqslant 0.01\}\geqslant 1-\frac{\text {Var}[\cos (2\pi \xi )]}{0.01^2 n}. \end{aligned}$$

In fact, since the \(\cos (2\pi y_i)\) are bounded i.i.d. random variables, the Chernoff–Hoeffding bound strengthens this estimate, and the probability of \(|z-\tilde{z}|\leqslant 0.01\) is exponentially close to 1 when n is large enough. Thus, we confirm \(s\ne s'\) with probability exponentially close to 1 if \(z\leqslant 0.02\), and \(s=s'\) if \(z>0.02\). We complete the proof.    \(\square \)
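The test in this proof is short enough to state as code. The sketch below is our own illustration; the parameters n, q, \(\alpha \), the sample count m, and the random seed are arbitrary choices. It generates samples from \(A_{s,\psi _{\alpha }}\) and applies the threshold \(z>0.02\) to a correct and an incorrect candidate.

```python
import numpy as np

def accepts(samples, s_prime, q, threshold=0.02):
    # z = (1/m) sum cos(2*pi*y_i), where y = x - <a, s'>/q mod 1;
    # E[z] = exp(-pi*alpha^2) if s' = s, and E[z] = 0 otherwise.
    z = np.mean([np.cos(2 * np.pi * ((x - a @ s_prime / q) % 1.0))
                 for a, x in samples])
    return z > threshold

rng = np.random.default_rng(1)
n, q, alpha, m = 8, 97, 0.05, 2000
s = rng.integers(0, q, n)
samples = []
for _ in range(m):
    a = rng.integers(0, q, n)
    e = rng.normal(0.0, alpha / np.sqrt(2 * np.pi))  # Gaussian error; reduced
    samples.append((a, (a @ s / q + e) % 1.0))       # mod 1, giving psi_alpha
print(accepts(samples, s, q), accepts(samples, rng.integers(0, q, n), q))
# Typically prints: True False
```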

Based on Lemma 3.4.6 and the algorithm for \(\text {LWE}_{n,q,\psi _{\alpha },m}\), for any \(\beta \leqslant \alpha \) and samples from \(A_{s,\psi _{\beta }}\), the following Lemma 3.4.7 gives an algorithm to solve s with probability close to 1.

Lemma 3.4.7

Let \(q=q(n)\geqslant 2\) be a positive integer, \(\alpha =\alpha (n)\in (0,1)\). Assume that we have an algorithm W that solves \(\text {LWE}_{n,q,\psi _{\alpha },m}\) with a polynomial number of samples, then there exists an efficient algorithm \(W'\) to solve s with probability exponentially close to 1 for some samples from \(A_{s,\psi _{\beta }}\), where \(\beta \leqslant \alpha \) and \(\beta \) is unknown.

Proof

Assume the algorithm W needs \(n^c\) samples, where \(c>0\) is a constant. Define the set Z by

$$\begin{aligned} Z=\{\gamma \ |\ \gamma =\delta n^{-2c} \alpha ^2\in [0,\alpha ^2],\ \delta \in \mathbb {Z}\}. \end{aligned}$$

The steps of algorithm \(W'\) are as follows. For each \(\gamma \in Z\), we repeat the following process n times. Each time we take \(n^c\) samples from \(A_{s,\psi _{\beta }}\) and add independent samples from \(\psi _{\sqrt{\gamma }}\) to the second component of each, so we obtain \(n^c\) samples from \(A_{s,\psi _{\sqrt{\beta ^2+\gamma }}}\). We solve \(s'\) by algorithm W and check whether \(s'=s\) by the algorithm of Lemma 3.4.6. If \(s'=s\), we output \(s'\) and the algorithm terminates. Next we prove that the above algorithm solves s with probability exponentially close to 1. Assume

$$\begin{aligned} \Gamma =\min \{\gamma \in Z,\gamma \geqslant \alpha ^2-\beta ^2\}. \end{aligned}$$

From the definition of the set Z

$$\begin{aligned} \Gamma \leqslant \alpha ^2-\beta ^2+n^{-2c} \alpha ^2. \end{aligned}$$

Let \(\alpha '=\sqrt{\beta ^2+\Gamma }\), we have

$$\begin{aligned} \alpha \leqslant \alpha '\leqslant \sqrt{\alpha ^2+n^{-2c} \alpha ^2} \leqslant (1+n^{-2c})\alpha . \end{aligned}$$

Based on Lemma 3.3.5,

$$\begin{aligned} \Delta (\psi _{\alpha },\psi _{\alpha '})\leqslant \frac{9}{2} \left( \frac{\alpha '}{\alpha }-1 \right) \leqslant \frac{9}{2}n^{-2c}. \end{aligned}$$

Therefore, the statistical distance between the joint distribution of \(n^c\) samples from \(\psi _{\alpha }\) and that of \(n^c\) samples from \(\psi _{\alpha '}\) is no more than \(9n^{-c}\), which means the probability that the algorithm W solves s successfully is at least \(1-9n^{-c}\geqslant \frac{1}{2}\) for n large enough. It follows that the probability that all n repetitions fail to solve s is no more than \(2^{-n}\). The lemma holds.    \(\square \)
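One step of \(W'\), padding the unknown error \(\psi _{\beta }\) up to a level the oracle expects, looks as follows in code. This is our own sketch; the sample format \((a,b)\in \mathbb {Z}_q^n\times \mathbb {T}\) and the generator `rng` are assumptions consistent with the examples above.

```python
import numpy as np

def pad_noise(samples, gamma, rng):
    # Turn A_{s, psi_beta} samples into A_{s, psi_sqrt(beta^2+gamma)} samples
    # by adding independent psi_sqrt(gamma) noise to the second component.
    out = []
    for a, b in samples:
        extra = rng.normal(0.0, np.sqrt(gamma / (2 * np.pi)))
        out.append((a, (b + extra) % 1.0))
    return out

# W' tries every gamma on the grid Z = {delta * n^(-2c) * alpha^2}; for the
# smallest gamma >= alpha^2 - beta^2, the padded parameter alpha' lies within
# (1 + n^(-2c)) * alpha, close enough for W by Lemma 3.3.5.
```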

To prove our main result, we need two properties about the Gauss function and statistical distance.

Lemma 3.4.8

For any n dimensional lattice L, \(c\in \mathbb {R}^n\), \(\varepsilon >0\), \(r\geqslant \eta _{\varepsilon }(L)\), we have

$$\begin{aligned} \rho _r(L+c)\in r^n \text {det}(L^*)(1\pm \varepsilon ). \end{aligned}$$
(3.4.2)

Proof

Based on Lemma 1.3.2 in Chap. 1,

$$\begin{aligned} \begin{aligned} \rho _r(L+c)&=\sum \limits _{x\in L}\rho _{r,-c}(x)=\text {det}(L^*) \sum \limits _{y\in L^*}\hat{\rho }_{r,-c}(y) \\&=r^n \text {det}(L^*) \sum \limits _{y\in L^*}e^{2\pi i c\cdot y}\rho _{1/r}(y) \\&=r^n \text {det}(L^*) (1+\sum \limits _{y\in L^*\backslash \{0\}}e^{2\pi i c\cdot y}\rho _{1/r}(y)). \end{aligned} \end{aligned}$$

From \(r\geqslant \eta _{\varepsilon }(L)\), it follows that \(\rho _{1/r}(L^* \backslash \{0\})\leqslant \varepsilon \), and

$$\begin{aligned} \left| {\sum \limits _{y\in L^*\backslash \{0\}}e^{2\pi i c\cdot y}\rho _{1/r}(y)} \right| \leqslant \sum \limits _{y\in L^*\backslash \{0\}} \rho _{1/r}(y)\leqslant \varepsilon . \end{aligned}$$

We get

$$\begin{aligned} \rho _r(L+c)=r^n \text {det}(L^*) \left( 1+\sum \limits _{y\in L^*\backslash \{0\}}e^{2\pi i c\cdot y}\rho _{1/r}(y) \right) \in r^n \text {det}(L^*) (1\pm \varepsilon ). \end{aligned}$$

The proof is complete.    \(\square \)

Lemma 3.4.9

For any n dimensional lattice L, \(u\in \mathbb {R}^n\), \(\varepsilon <\frac{1}{2}\), and two positive real numbers r, s, let \(t=\sqrt{r^2+s^2}\) and assume \(rs/t=1/\sqrt{1/r^2+1/s^2}\geqslant \eta _{\varepsilon }(L)\). Let \(\xi \) be the sum of a discrete Gauss distribution \(D_{L+u,r}\) and a noise distribution \(D_s\); then

$$\begin{aligned} \Delta (\xi ,D_t)\leqslant 2\varepsilon . \end{aligned}$$
(3.4.3)

Proof

Let the density function of \(\xi \) be Y(x), then

$$\begin{aligned} \begin{aligned} Y(x)&=\frac{1}{s^n \rho _r(L+u)} \sum \limits _{y\in L+u} \rho _r(y) \rho _s(x-y)\\&=\frac{1}{s^n \rho _r(L+u)} \sum \limits _{y\in L+u} \text {exp}\left( -\pi \left( \left| {\frac{y}{r}} \right| ^2+ \left| {\frac{x-y}{s}} \right| ^2\right) \right) \\&=\frac{1}{s^n \rho _r(L+u)} \sum \limits _{y\in L+u} \text {exp}\left( -\pi \left( \frac{r^2+s^2}{r^2\,s^2} \left| {y-\frac{r^2}{r^2+s^2}x} \right| ^2+\frac{1}{r^2+s^2}|x|^2\right) \right) \\&=\text {exp}\left( -\frac{\pi }{r^2+s^2}|x|^2\right) \frac{1}{s^n \rho _r(L+u)} \\&\quad \sum \limits _{y\in L+u} \text {exp}\left( -\pi \left( \frac{r^2+s^2}{r^2\,s^2}\left| {y-\frac{r^2}{r^2+s^2}x} \right| ^2\right) \right) \\&=\frac{\rho _t(x)}{s^n}\frac{\rho _{rs/t,(r/t)^2 x-u}(L)}{\rho _{r,-u}(L)} \\&=\frac{\rho _t(x)}{s^n}\frac{\hat{\rho }_{rs/t,(r/t)^2 x-u}(L^*)}{\hat{\rho }_{r,-u}(L^*)} \\&=\frac{\rho _t(x)}{t^n}\frac{(t/rs)^n \hat{\rho }_{rs/t,(r/t)^2 x-u}(L^*)}{(1/r)^n \hat{\rho }_{r,-u}(L^*)}. \end{aligned} \end{aligned}$$
(3.4.4)

Based on the Fourier transform property of Gauss function in Lemma 1.2.1 in Chap. 1, we get

$$\begin{aligned} \hat{\rho }_{rs/t,(r/t)^2 x-u}(w)=\text {exp}(-2\pi i ((r/t)^2 x-u)\cdot w) (rs/t)^n \rho _{t/rs}(w), \end{aligned}$$

and

$$\begin{aligned} \hat{\rho }_{r,-u}(w)=\text {exp}(2\pi i u\cdot w) r^n \rho _{1/r}(w). \end{aligned}$$

Since \(r\geqslant \frac{rs}{t}\geqslant \eta _{\varepsilon }(L)\),

$$\begin{aligned} |1-(t/rs)^n \hat{\rho }_{rs/t,(r/t)^2 x-u}(L^*)|\leqslant \rho _{t/rs}(L^*\backslash \{0\})\leqslant \varepsilon , \end{aligned}$$
$$\begin{aligned} |1-(1/r)^n \hat{\rho }_{r,-u}(L^*)|\leqslant \rho _{1/r}(L^*\backslash \{0\})\leqslant \varepsilon . \end{aligned}$$

It follows that

$$\begin{aligned} 1-2\varepsilon \leqslant \frac{1-\varepsilon }{1+\varepsilon }\leqslant \frac{(t/rs)^n \hat{\rho }_{rs/t,(r/t)^2 x-u}(L^*)}{(1/r)^n \hat{\rho }_{r,-u}(L^*)}\leqslant \frac{1+\varepsilon }{1-\varepsilon }\leqslant 1+4\varepsilon . \end{aligned}$$

By (3.4.4),

$$\begin{aligned} |Y(x)-\frac{\rho _t(x)}{t^n}|\leqslant 4\varepsilon \frac{\rho _t(x)}{t^n}. \end{aligned}$$

Integrate for \(x\in \mathbb {R}^n\),

$$\begin{aligned} \Delta (\xi ,D_t)=\frac{1}{2}\int \limits _{\mathbb {R}^n}|Y(x)-\frac{\rho _t(x)}{t^n}|\textrm{d}x \leqslant 2\varepsilon . \end{aligned}$$

We complete the proof.    \(\square \)

Lemma 3.4.10

For any n dimensional lattice L, vectors \(z,u\in \mathbb {R}^n\), real numbers \(r,\alpha >0\), \(\varepsilon <\frac{1}{2}\), \(\eta _{\varepsilon }(L)\leqslant 1/\sqrt{1/r^2+(|z|/\alpha )^2}\), let v be a random variable of the discrete Gauss distribution \(D_{L+u,r}\), e be a random variable of Gauss distribution with mean 0 and standard deviation \(\alpha /\sqrt{2\pi }\), \(\xi \) be a random variable of Gauss distribution with mean 0 and standard deviation \(\sqrt{(r|z|)^2+\alpha ^2}/\sqrt{2\pi }\), then

$$\begin{aligned} \Delta (z\cdot v+e,\xi )\leqslant 2\varepsilon . \end{aligned}$$
(3.4.5)

In particular,

$$\begin{aligned} \Delta (z\cdot v+e \mod 1,\psi _{\sqrt{(r|z|)^2+\alpha ^2}})\leqslant 2\varepsilon . \end{aligned}$$
(3.4.6)

Proof

Let the random variable h have the distribution \(D_{\alpha /|z|}\); then the standard deviation of h is \(\alpha /(|z|\sqrt{2\pi })\), and the standard deviation of \(z\cdot h\) is \(|z|\cdot \alpha /(|z|\sqrt{2\pi })=\alpha /\sqrt{2\pi }\), the same as that of e. Since both have Gauss distributions, the distributions of \(z\cdot h\) and e are the same, i.e. \(z\cdot v+e\) and \(z\cdot (v+h)\) have the same distribution. Based on Lemma 3.4.9 with \(s=\alpha /|z|\), the statistical distance between \(v+h\) and \(D_{\sqrt{r^2+(\alpha /|z|)^2}}\) is no more than \(2\varepsilon \),

$$\begin{aligned} \Delta (v+h,D_{\sqrt{r^2+(\alpha /|z|)^2}})\leqslant 2\varepsilon . \end{aligned}$$

By the property of statistical distance,

$$\begin{aligned} \Delta (z\cdot (v+h),z\cdot D_{\sqrt{r^2+(\alpha /|z|)^2}})\leqslant 2\varepsilon . \end{aligned}$$

Here the standard deviation of \(z\cdot D_{\sqrt{r^2+(\alpha /|z|)^2}}\) is

$$\begin{aligned} |z|\cdot \sqrt{r^2+(\alpha /|z|)^2}/\sqrt{2\pi }=\sqrt{(r|z|)^2+\alpha ^2}/\sqrt{2\pi }, \end{aligned}$$

which is the same as that of \(\xi \). Note that both of the two random variables have Gauss distributions; therefore, \(z\cdot D_{\sqrt{r^2+(\alpha /|z|)^2}}\) and \(\xi \) have the same distribution, i.e.

$$\begin{aligned} \Delta (z\cdot v+e,\xi )\leqslant 2\varepsilon , \end{aligned}$$

Reducing both random variables mod 1, we obtain

$$\begin{aligned} \Delta (z\cdot v+e\ \text {mod}\ 1,\psi _{\sqrt{(r|z|)^2+\alpha ^2}})\leqslant 2\varepsilon . \end{aligned}$$

The lemma holds.    \(\square \)

Lemma 3.4.11

Let \(\varepsilon =\varepsilon (n)\) be a negligible function of n, \(q=q(n)\geqslant 2\) be a positive integer, \(\alpha =\alpha (n)\in (0,1)\). Assume we have an algorithm W to solve s given a polynomial number of samples from \(A_{s,\psi _{\beta }}\) for any \(\beta \leqslant \alpha \) (\(\beta \) is unknown), then there exists an efficient algorithm that given an n dimensional lattice L, a real number \(r>\sqrt{2}q \eta _{\varepsilon }(L)\) and a polynomial number of samples from \(D_{L,r}\), to solve the \(\text {CVP}_{L^*,\alpha q/(\sqrt{2}r)}^{(q)}\) problem.

Proof

For a given \(x\in \mathbb {R}^n\) with \(\text {dist}(x,L^*)\leqslant \alpha q/(\sqrt{2}r)\), denote the generated matrix of L by B, so that the generated matrix of \(L^*\) is \((B^T)^{-1}\); our goal is to solve \(s=B^T K_{L^*}(x)\ \text {mod}\ q\). The idea of algorithm \(W'\) is to generate a polynomial number of samples from \(A_{s,\psi _{\beta }}\) and solve s by the algorithm W.

The steps of algorithm \(W'\) are as follows: let \(v\in L\) be a sample from the discrete Gauss distribution \(D_{L,r}\), \(a=B^{-1}v\ \text {mod}\ q\), and e be a random variable with Gauss distribution of mean 0 and standard deviation \(\alpha /(2\sqrt{\pi })\); then there is \(\beta \leqslant \alpha \) such that the statistical distance between \((a,x\cdot v/q+e\ \text {mod}\ 1)\) and \(A_{s,\psi _{\beta }}\) is negligible. Next we prove the correctness of this algorithm.

Firstly, note that the distribution of a is almost uniform, i.e. the statistical distance between a and the uniform distribution is negligible. This is because for any \(a_0\in \mathbb {Z}_q^n\), we have

$$\begin{aligned} \text {Pr}\{a=a_0\}=\frac{\rho _r(qL+Ba_0)}{\rho _r(L)}=\frac{\rho _{r/q}(L+Ba_0/q)}{\rho _r(L)}. \end{aligned}$$

Since \(q\eta _{\varepsilon }(L)<r\), i.e. \(r/q>\eta _{\varepsilon }(L)\), based on Lemma 3.4.8,

$$\begin{aligned} \rho _{r/q}(L+Ba_0/q)\in (r/q)^n \text {det}(L^*)(1\pm \varepsilon ),\ \forall a_0\in \mathbb {Z}_q^n. \end{aligned}$$

Hence \(\text {Pr}\{a=a_0\}\) lies within a factor \(1\pm \varepsilon \) of a quantity independent of \(a_0\), which implies a is almost uniformly distributed.

Secondly, we consider the distribution of \(x\cdot v/q+e\ \text {mod}\ 1\). Let \(x'=x-K_{L^*}(x)\), from \(\text {dist}(x,L^*)\leqslant \alpha q/(\sqrt{2}r)\) we get \(|x'|\leqslant \alpha q/(\sqrt{2}r)\) and

$$\begin{aligned} x\cdot v/q+e\ \text {mod}\ 1=(x'/q)\cdot v+e+K_{L^*}(x)\cdot v/q\ \text {mod}\ 1. \end{aligned}$$
(3.4.7)

We compute the distributions of \(K_{L^*}(x)\cdot v/q\ \text {mod}\ 1\) and \((x'/q)\cdot v+e\), respectively. It is easy to see

$$\begin{aligned} K_{L^*}(x)\cdot v=(B^T K_{L^*}(x))\cdot (B^{-1}v), \end{aligned}$$

therefore,

$$\begin{aligned} K_{L^*}(x)\cdot v\ \text {mod}\ q=(B^T K_{L^*}(x))\cdot (B^{-1}v)\ \text {mod}\ q=s\cdot a\ \text {mod}\ q. \end{aligned}$$

This means \(K_{L^*}(x)\cdot v/q\ \text {mod}\ 1\) and \(s\cdot a/q\ \text {mod}\ 1\) have the same distribution. To get the distribution of \((x'/q)\cdot v+e\), note that, conditioned on a, v has the discrete Gauss distribution \(D_{qL+Ba,r}\), and e has Gauss distribution with mean 0 and standard deviation \(\alpha /(2\sqrt{\pi })\). Let \(\beta =\sqrt{(r|x'|/q)^2+\alpha ^2/2}\leqslant \alpha \); then

$$\begin{aligned} 1/\sqrt{1/r^2+(\sqrt{2}|x'|/\alpha q)^2}\geqslant r/\sqrt{2}>q\eta _{\varepsilon }(L)=\eta _{\varepsilon }(qL) \end{aligned}$$

satisfies the condition of Lemma 3.4.10. By Lemma 3.4.10, \((x'/q)\cdot v+e\) almost has the distribution \(\psi _{\beta }\): the statistical distance between them is negligible. From (3.4.7), \(x\cdot v/q+e\ \text {mod}\ 1\) and \(\psi _{\beta }+s\cdot a/q\ \text {mod}\ 1\) then have almost the same distribution. Altogether, the statistical distance between \((a,x\cdot v/q+e\ \text {mod}\ 1)\) and \(A_{s,\psi _{\beta }}\) is negligible, so the algorithm \(W'\) is correct. We complete the proof.    \(\square \)
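The heart of \(W'\) is the map from one \(D_{L,r}\) sample to one LWE sample, which we sketch below. This is our own illustration: B, the target x, and the generator `rng` are assumed inputs, and v is a sample from \(D_{L,r}\) supplied to the reduction.

```python
import numpy as np

def lwe_sample(B, x, v, q, alpha, rng):
    # a = B^{-1} v mod q: v is a lattice point, so B^{-1} v is integral.
    a = np.rint(np.linalg.solve(B, v)).astype(int) % q
    # e has mean 0 and standard deviation alpha / (2 sqrt(pi)).
    e = rng.normal(0.0, alpha / (2 * np.sqrt(np.pi)))
    b = (x @ v / q + e) % 1.0            # b = x . v / q + e mod 1
    return a, b                          # distributed (nearly) as A_{s, psi_beta}
```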

Combining the above Lemmas 3.4.5, 3.4.7 and 3.4.11, we obtain the conclusion of Lemma 3.4.4 immediately, which shows that we can solve the \(\text {CVP}_{L^*,\alpha q/(\sqrt{2}r)}\) problem from samples of \(D_{L,r}\). In order to prove Lemma 3.4.3 completely, we introduce techniques of quantum computation to show that there is an efficient quantum algorithm to generate a sample from \(D_{L,r\sqrt{n}/\alpha q}\) based on the algorithm for the \(\text {CVP}_{L^*,\alpha q/(\sqrt{2}r)}\) problem.

Definition 3.4.4

For a real number \(a\in \mathbb {R}\) and a vector \(x\in \mathbb {R}^n\), we define the Dirac notation \(a|x\rangle =ax\). Let A be a finite or countable set in \(\mathbb {R}^n\), f be a function from \(\mathbb {R}^n\) to \(\mathbb {R}\), a quantum state is defined by

$$\begin{aligned} \sum \limits _{x\in A}f(x)|x\rangle =\sum \limits _{x\in A}f(x)x, \end{aligned}$$
(3.4.8)

if \(\sum \nolimits _{x\in A}f(x)x\) converges.

The theory of Dirac notation and quantum states is an important part of quantum physics. Since it involves content far beyond the scope of this book, we will not introduce it in detail; we only state Lemmas 3.4.12, 3.4.13 and 3.4.14 here. The reader may refer to Nielsen and Chuang (2000), Shor (1997) for details. The following Lemma 3.4.12 gives the discrete Gauss quantum state on a lattice, where the lattice L satisfies \(L\subset \mathbb {Z}^n\).

Lemma 3.4.12

Given an n dimensional lattice \(L\subset \mathbb {Z}^n\), \(r>2^{2n}\lambda _n(L)\), there exists an efficient quantum algorithm to output a state within negligible \(l_2\) distance from the following state

$$\begin{aligned} \sum \limits _{x\in L}\sqrt{\rho _r(x)}|x\rangle =\sum \limits _{x\in L}\rho _{\sqrt{2}r}(x)|x\rangle . \end{aligned}$$
(3.4.9)

Let L be an n dimensional lattice, R a positive number, and \(L/R=\{x/R\ |\ x\in L\}\) the lattice obtained by scaling down L by a factor of R. The following Lemma 3.4.13 states that the quantum state on the lattice is essentially concentrated on points of norm at most \(\sqrt{n}\).

Lemma 3.4.13

Let R be a positive integer, L be an n dimensional lattice such that \(\lambda _1(L)>2\sqrt{n}\), F be the basic neighborhood of L. \(v_1\) and \(v_2\) are defined by

$$\begin{aligned} v_1=\sum \limits _{x\in L/R,|x|<\sqrt{n}} \rho (x)|x\ \text {mod}\ L\rangle \end{aligned}$$
(3.4.10)

and

$$\begin{aligned} \begin{aligned} v_2&=\sum \limits _{x\in L/R} \rho (x)|x\ \text {mod}\ L\rangle \\&=\sum \limits _{x\in L/R\cap F}\sum \limits _{y\in L} \rho (x-y)|x\rangle . \end{aligned} \end{aligned}$$
(3.4.11)

Then the \(l_2\) distance between \(\frac{v_1}{|v_1|}\) and \(\frac{v_2}{|v_2|}\) is negligible.

The following Lemma 3.4.14 gives an algorithm to generate a sample from \(D_{L,\sqrt{n}/(\sqrt{2}d)}\) based on the algorithm for the \(\text {CVP}_{L^*,d}\) problem.

Lemma 3.4.14

Given an n dimensional lattice L, a real number \(d<\lambda _1(L^*)/2\), if there exists an algorithm to solve the \(\text {CVP}_{L^*,d}\) problem, then there is an efficient quantum algorithm to generate a sample from the discrete Gauss distribution \(D_{L,\sqrt{n}/(\sqrt{2}d)}\).

According to Lemma 1.3.6 in Chap. 1, when \(r>\sqrt{2}q\eta _{\varepsilon }(L)\), we have

$$\begin{aligned} \frac{\alpha q}{\sqrt{2}r}< \frac{\alpha }{2\eta _{\varepsilon }(L)}\leqslant \frac{\alpha }{2}\sqrt{\frac{\pi }{\ln (1/\varepsilon )}}\lambda _1(L^*)<\frac{\lambda _1(L^*)}{2}, \end{aligned}$$

Replacing d in Lemma 3.4.14 with \(\alpha q/(\sqrt{2}r)\), there exists a quantum algorithm to generate a sample from the discrete Gauss distribution \(D_{L,r\sqrt{n}/\alpha q}\) given the algorithm for the \(\text {CVP}_{L^*,\alpha q/(\sqrt{2}r)}\) problem.

Combining Lemma 3.4.4 with Lemma 3.4.14, for \(r>\sqrt{2}q\eta _{\varepsilon }(L)\) we have proved that, given the algorithm for the \(\text {LWE}_{n,q,\psi _{\alpha },m}\) problem and a polynomial number of samples from \(D_{L,r}\), one can solve the \(\text {CVP}_{L^*,\alpha q/(\sqrt{2}r)}\) problem and further generate a sample from \(D_{L,r\sqrt{n}/\alpha q}\); this completes the proof of Lemma 3.4.3. So far we have obtained the main Lemma 3.4.1 of this subsection and finished the first part of the proof of Theorem 3.3.1, i.e. solving the \(\text {DGS}_{\sqrt{2n}\eta _{\varepsilon }(L)/\alpha }\) problem from the algorithm for the \(\text {LWE}_{n,q,\psi _{\alpha },m}\) problem.

3.4.2 From DGS to Hard Problems on Lattice

In this subsection, we prove that if there is an algorithm to solve the DGS problem, then there exists a probabilistic polynomial algorithm to solve the hard problems on lattice. Take the GIVP problem as an example: find a set \(S=\{s_i\}\subset L\) of n linearly independent vectors in L such that

$$\begin{aligned} |S|=\max |s_i|\leqslant \gamma (n)\phi (B), \end{aligned}$$

where \(\gamma (n)\geqslant 1\) is a function of n, B is the generated matrix of L, and \(\phi (B)\) is a real function of B. In particular, if \(\phi =\lambda _n\), the GIVP problem becomes the SIVP problem. To complete the reduction from the hard problems on lattice to the DGS problem, we first introduce the following two lemmas. Lemma 3.4.15 shows that with positive probability, a sample from the discrete Gauss distribution does not lie in a given plane of dimension at most \(n-1\).

Lemma 3.4.15

Given an n dimensional lattice \(L\subset \mathbb {R}^n\), \(\varepsilon \leqslant \frac{1}{10}\), \(r\geqslant \sqrt{2}\eta _{\varepsilon }(L)\), let H be a plane in \(\mathbb {R}^n\) with dimension no more than \(n-1\), x be a sample from the discrete Gauss distribution \(D_{L,r}\), then

$$\begin{aligned} \text {Pr}\{x\not \in H\}\geqslant \frac{1}{10}. \end{aligned}$$

Proof

Write \(h=(h_1,h_2,\dots ,h_n)\) for the points of H; without loss of generality, we suppose that H is the plane \(h_1=0\), i.e. the plane of all points with first coordinate 0, and let \(x=(x_1,x_2,\dots ,x_n)\). Consider the expectation \(E[e^{-\pi (x_1/r)^2}]\); based on Lemma 1.3.2 in Chap. 1, we have

$$\begin{aligned} \begin{aligned}&\mathop {E}\limits _{x\sim D_{L,r}} [e^{-\pi (x_1/r)^2}] \\&=\frac{1}{\rho _r(L)}\sum \limits _{x\in L}e^{-\pi (\sqrt{2}x_1/r)^2}e^{-\pi (x_2/r)^2}\cdots e^{-\pi (x_n/r)^2} \\&=\frac{\text {det}(L^*)r^n}{\sqrt{2}\rho _r(L)}\sum \limits _{y\in L^*}e^{-\pi (ry_1/\sqrt{2})^2}e^{-\pi (ry_2)^2}\cdots e^{-\pi (ry_n)^2} \\&\leqslant \frac{\text {det}(L^*)r^n}{\sqrt{2}\rho _r(L)} \rho _{\sqrt{2}/r}(L^*), \end{aligned} \end{aligned}$$

where \(y=(y_1,y_2,\dots ,y_n)\in L^*\). Since \(r/\sqrt{2}\geqslant \eta _{\varepsilon }(L)\), we get

$$\begin{aligned} \rho _{\sqrt{2}/r}(L^*)=1+\rho _{\sqrt{2}/r}(L^*\backslash \{0\})\leqslant 1+\varepsilon . \end{aligned}$$

It follows that

$$\begin{aligned} \mathop {E}\limits _{x\sim D_{L,r}} [e^{-\pi (x_1/r)^2}]\leqslant \frac{\text {det}(L^*)r^n}{\sqrt{2}\rho _r(L)}(1+\varepsilon ). \end{aligned}$$

By Lemma 1.3.2 in Chap. 1 again,

$$\begin{aligned} \rho _r(L)=\text {det}(L^*)r^n \rho _{1/r}(L^*)\geqslant \text {det}(L^*)r^n, \end{aligned}$$

therefore,

$$\begin{aligned} \mathop {E}\limits _{x\sim D_{L,r}} [e^{-\pi (x_1/r)^2}]\leqslant \frac{1+\varepsilon }{\sqrt{2}}<\frac{9}{10}. \end{aligned}$$

On the other hand, since \(x_1=0\) for every \(x\in H\),

$$\begin{aligned} \begin{aligned} \mathop {E}\limits _{x\sim D_{L,r}} [e^{-\pi (x_1/r)^2}]&\geqslant \sum \limits _{x\in L\cap H} \frac{\rho _r(x)}{\rho _r(L)} e^{-\pi (x_1/r)^2} \\&= \sum \limits _{x\in L\cap H} \frac{\rho _r(x)}{\rho _r(L)}=\text {Pr}\{x\in H\}. \end{aligned} \end{aligned}$$

According to the above two inequalities,

$$\begin{aligned} \text {Pr}\{x\in H\}\leqslant \frac{9}{10}, \end{aligned}$$

that is,

$$\begin{aligned} \text {Pr}\{x\not \in H\}\geqslant \frac{1}{10}. \end{aligned}$$

The lemma holds.    \(\square \)

Based on Lemma 3.4.15, the following lemma shows that it is possible to find n linearly independent vectors from \(n^2\) independent samples of the discrete Gauss distribution \(D_{L,r}\) with probability close to 1, which provides a guarantee for solving the GIVP problem later.

Lemma 3.4.16

Given an n dimensional lattice \(L\subset \mathbb {R}^n\), \(\varepsilon \leqslant \frac{1}{10}\), \(r\geqslant \sqrt{2}\eta _{\varepsilon }(L)\), the probability that a set of \(n^2\) vectors chosen independently from \(D_{L,r}\) contains no n linearly independent vectors is exponentially small.

Proof

Let \(x_1,x_2,\dots ,x_{n^2}\) be \(n^2\) independent samples from \(D_{L,r}\). For \(i=1,2,\dots ,n-1\), let \(B_i\) be the event that

$$\begin{aligned} \text {dim\ span}(x_1,x_2,\dots ,x_{in})=\text {dim\ span}(x_1,x_2,\dots ,x_{(i+1)n})<n. \end{aligned}$$

If none of the events \(B_1,B_2,\dots ,B_{n-1}\) happens, then

$$\begin{aligned} \text {dim\ span}(x_1,x_2,\dots ,x_{n^2})=n, \end{aligned}$$

i.e. there exist n linearly independent vectors among these \(n^2\) samples. Next we estimate the probability of \(B_i\). By Lemma 3.4.15,

$$\begin{aligned} \text {Pr}\{x_j\in \text {span}(x_1,x_2,\dots ,x_{in})\}\leqslant \frac{9}{10},\ \forall \, in+1\leqslant j\leqslant (i+1)n. \end{aligned}$$

Thus,

$$\begin{aligned} \text {Pr}\{x_{in+1},x_{in+2},\dots ,x_{(i+1)n}\in \text {span}(x_1,x_2,\dots ,x_{in})\}\leqslant \left( \frac{9}{10}\right) ^n, \end{aligned}$$

that is,

$$\begin{aligned} \text {Pr}\{B_i\}\leqslant \left( \frac{9}{10}\right) ^n, \forall \, i=1,2,\dots ,n-1. \end{aligned}$$

It follows that

$$\begin{aligned} \text {Pr}\{\overline{B_1}\cap \overline{B_2}\cap \cdots \cap \overline{B_{n-1}}\}=1-\text {Pr}\{B_1\cup \cdots \cup B_{n-1}\}\geqslant 1-(n-1)\left( \frac{9}{10}\right) ^n, \end{aligned}$$

this means the probability that none of \(B_1,B_2,\dots ,B_{n-1}\) happens is close to 1, i.e. the probability that there are n linearly independent vectors in these \(n^2\) independent samples from \(D_{L,r}\) is close to 1. We complete the proof.    \(\square \)

Based on the above preparations, let us prove the main conclusion of this subsection.

Lemma 3.4.17

Given an n dimensional lattice L, \(\varepsilon =\varepsilon (n)\leqslant \frac{1}{10}\), \(\phi (L)\geqslant \sqrt{2}\eta _{\varepsilon }(L)\), if there exists an algorithm for the \(\text {DGS}_{\phi }\) problem, then there is a probabilistic polynomial algorithm to solve the \(\text {GIVP}_{2\sqrt{n}\phi }\) problem.

Proof

By the LLL algorithm we choose the generated matrix \(S=[s_1,s_2,\dots ,s_n]\) of lattice L such that \(|s_i|\leqslant 2^n \lambda _n(L)\), \(1\leqslant i\leqslant n\). Let

$$\begin{aligned} \tilde{\lambda }_n=|S|=\max \limits _{1\leqslant i\leqslant n}|s_i| \end{aligned}$$

be the length of the longest column vector in S, then

$$\begin{aligned} \lambda _n(L)\leqslant \tilde{\lambda }_n \leqslant 2^n \lambda _n(L). \end{aligned}$$

For each \(i\in \{0,1,\dots ,2n\}\), let \(r_i=2^{-i} \tilde{\lambda }_n\); we generate \(n^2\) independent samples from \(D_{L,r_i}\) using the algorithm for the \(\text {DGS}_{\phi }\) problem, and denote the corresponding sets of \(n^2\) vectors by \(S_0,S_1,\dots ,S_{2n}\). If \(\tilde{\lambda }_n\leqslant \phi (L)\), we have

$$\begin{aligned} \tilde{\lambda }_n=|S|\leqslant 2\sqrt{n}\phi (L), \end{aligned}$$

so S is a solution of the \(\text {GIVP}_{2\sqrt{n}\phi }\) problem. If \(\phi (L)<\tilde{\lambda }_n\), then there exists \(i\in \{0,1,\dots ,2n\}\) such that \(\phi (L)\leqslant r_i\leqslant 2\phi (L)\): according to Lemma 1.3.6 in Chap. 1,

$$\begin{aligned} \tilde{\lambda }_n\leqslant 2^n \lambda _n(L)\leqslant 2^n n\sqrt{\frac{\pi }{\ln (1/\varepsilon )}}\eta _{\varepsilon }(L)<2^{2n+1}\phi (L), \end{aligned}$$

combining \(r_0=\tilde{\lambda }_n>\phi (L)\) with \(r_{2n}=2^{-2n}\tilde{\lambda }_n<2\phi (L)\), we know there is \(r_i\) satisfying \(\phi (L)\leqslant r_i\leqslant 2\phi (L)\). By Lemma 3.4.16, the probability that \(S_i\) contains n linearly independent vectors \(v_1,v_2,\dots ,v_n\) is close to 1. Based on Lemma 1.3.4 in Chap. 1, the probability that each \(|v_i|\) is no more than \(\sqrt{n}r_i\leqslant 2\sqrt{n}\phi (L)\) is also close to 1. Let \(V=[v_1,v_2,\dots ,v_n]\); then \(|V|\leqslant 2\sqrt{n}\phi (L)\), so we have found a solution of the \(\text {GIVP}_{2\sqrt{n}\phi }\) problem. This lemma holds.    \(\square \)
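The search over the radii \(r_i\) translates into the following sketch (ours; `dgs_oracle(r, m)` is the assumed \(\text {DGS}_{\phi }\) sampler returning m vectors from \(D_{L,r}\), and `lll_basis` is an LLL-reduced basis as a matrix of columns; both are placeholders for routines the lemma presupposes).

```python
import numpy as np

def givp_from_dgs(lll_basis, n, phi_L, dgs_oracle):
    lam = max(np.linalg.norm(lll_basis[:, i]) for i in range(n))  # lambda_tilde
    if lam <= phi_L:
        return lll_basis                 # the LLL basis already solves GIVP
    for i in range(2 * n + 1):           # scan r_i = 2^{-i} * lambda_tilde
        r = lam / 2 ** i
        if not (phi_L <= r <= 2 * phi_L):
            continue
        V = np.array(dgs_oracle(r, n * n)).T    # n^2 samples from D_{L, r_i}
        picked = []                      # greedily keep independent vectors
        for j in range(n * n):
            trial = picked + [V[:, j]]
            if np.linalg.matrix_rank(np.column_stack(trial)) == len(trial):
                picked.append(V[:, j])
            if len(picked) == n:         # |v_i| <= 2 sqrt(n) phi(L) w.h.p.
                return np.column_stack(picked)
    return None                          # fails only with small probability
```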

In Chap. 2, we proved that the hard problems on lattice, such as the GIVP and GapSIVP problems, can be reduced to the SIS problem, so the difficulties of solving these hard problems on lattice are comparable. In Lemma 3.4.17, we proved that if there is an algorithm for the DGS problem, then there is a probabilistic polynomial algorithm that solves the GIVP problem with positive probability, and hence also the other hard problems on lattice. So far we have completed the second part of the proof of Theorem 3.3.1. In the first part, we proved that if there is an algorithm for the LWE problem, then there exists a quantum algorithm to solve the DGS problem. Combining the two parts, we conclude that the hard problems on lattice can be solved given an algorithm for the LWE problem; that is, the difficulty of solving the LWE problem is not lower than that of the hard problems on lattice.

3.4.3 From D-LWE to LWE

In this subsection, we will finish the third part of the proof for Theorem 3.3.1, i.e. the difficulty of the D-LWE problem is at least as high as that of the LWE problem, which is given in the following Theorem 3.4.1.

Theorem 3.4.1

Let \(n\geqslant 1\) be a positive integer, \(2\leqslant q\leqslant \text {Poly}(n)\) be a prime number, and \(\chi \) be a distribution on \(\mathbb {Z}_q\). Assume that we have an algorithm W that decides, with probability close to 1, whether a sample comes from the LWE distribution \(A_{s,\chi }\) or from the uniform distribution U. Then there exists an algorithm \(W'\) that solves s, given some samples from the LWE distribution \(A_{s,\chi }\), with probability close to 1.

Proof

Let \(s=(s_1,s_2,\dots ,s_n)\in \mathbb {Z}_q^n\). We give the steps of the algorithm \(W'\) for solving \(s_1\); the coordinates \(s_2,\dots ,s_n\) can be solved in the same way. For \(k\in \mathbb {Z}_q\), consider the following transformation of an LWE sample (a, b), where a is uniformly distributed on \(\mathbb {Z}_q^n\), \(b=a\cdot s+e\), \(e\leftarrow \chi \),

$$\begin{aligned} (a,b)\longrightarrow (a+(l,0,\dots ,0),b+lk), \end{aligned}$$

here \(l\in \mathbb {Z}_q\) is uniformly distributed and chosen freshly for each sample. If \(k=s_1\), then

$$\begin{aligned} b+lk=a\cdot s+e+ls_1=(a+(l,0,\dots ,0))\cdot s+e, \end{aligned}$$

note that \(a+(l,0,\dots ,0)\) is also uniform on \(\mathbb {Z}_q^n\); therefore, \((a+(l,0,\dots ,0),b+lk)\) has the LWE distribution \(A_{s,\chi }\).

On the other hand, if \(k\ne s_1\), write \(b+lk=(a+(l,0,\dots ,0))\cdot s+e+l(k-s_1)\). Since q is prime and \(k-s_1\not \equiv 0\ (\text {mod}\ q)\), \(l(k-s_1)\) is uniform on \(\mathbb {Z}_q\) and independent of \(a+(l,0,\dots ,0)\) (the uniform vector a masks l). By Lemma 3.3.2, \(b+lk\) is uniform on \(\mathbb {Z}_q\), so \((a+(l,0,\dots ,0),b+lk)\) is uniform. By the algorithm W, we determine whether \((a+(l,0,\dots ,0),b+lk)\) comes from the LWE distribution \(A_{s,\chi }\) or the uniform distribution, and thus check whether \(s_1\) is equal to k. Since the number of possible values of k is \(q\leqslant \text {Poly}(n)\), we can always find the solution of \(s_1\) by trying them all. After solving \(s_2,s_3,\dots ,s_n\) in the same way, we get the solution s. The theorem holds.    \(\square \)
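The proof translates directly into the following sketch (ours; `dlwe_oracle` is the assumed distinguisher W, returning True when its input looks like \(A_{s,\chi }\)), which recovers the first coordinate \(s_1\); the other coordinates are recovered by shifting the corresponding position of a instead.

```python
import numpy as np

def recover_s1(samples, q, dlwe_oracle, rng):
    for k in range(q):                       # q <= Poly(n) candidates
        shifted = []
        for a, b in samples:
            l = int(rng.integers(0, q))      # fresh uniform l per sample
            a2 = a.copy()
            a2[0] = (a2[0] + l) % q          # a -> a + (l, 0, ..., 0)
            shifted.append((a2, (b + l * k) % q))   # b -> b + l*k
        if dlwe_oracle(shifted):             # LWE-like exactly when k = s_1
            return k
    return None
```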

In Theorem 3.4.1, we proved that the difficulty of the D-LWE problem is not lower than that of the LWE problem, which completes the whole proof of Theorem 3.3.1. Along the chain of reductions from the D-LWE problem to the LWE problem, and then to the hard problems on lattice, the difficulty does not increase. We will further discuss the LWE cryptosystem and its probability of decryption error in the next chapter.