1 Introduction

Lattices in cryptography have been actively used as the foundation for constructing efficient or high-functional cryptosystems such as public-key encryptions [17, 26, 41], fully homomorphic encryptions [10, 22], and multilinear maps [21]. The security of lattice-based cryptography is based on the hardness of solving the (approximate) shortest vector problems (SVP) in the underlying lattice [15, 32, 35, 36]. In order to put lattice-based cryptography into practical use, we must precisely estimate the secure parameters in theory and practice by analyzing the previously known efficient algorithms for solving the SVP.

Currently the most efficient algorithms for solving the SVP are perhaps a series of BKZ algorithms [13, 14, 46, 47]. Numerous efforts have been made to estimate the security of lattice-based cryptography by analyzing the BKZ algorithms. Lindner and Peikert [32] gave an estimation of secure key sizes by connecting the computational cost of BKZ algorithm with the root Hermite factor from their experiment using the NTL-BKZ [49]. Furthermore, van de Pol and Smart [51] estimated the key sizes of fully homomorphic encryptions using a simulator based on Chen-Nguyen’s BKZ 2.0 [13]. Lepoint and Naehrig [31] gave a more precise estimation using the parameters of the full-version of BKZ 2.0 paper [14]. On the other hand, Liu and Nguyen [33] estimated the secure key sizes of some LWE-based cryptosystems by considering the BDD in the associated q-ary lattice. Aono et al. [7] gave another security estimation for LWE-based cryptosystems by considering the challenge data from the Darmstadt Lattice Challenge [50]. Recently, Albrecht et al. presented a comprehensive survey on the state-of-the-art of hardness estimation for the LWE problem [5].

The attacks analyzed above are usually called “lattice-based attacks”, and they share a generic framework consisting of two parts:

(1) Lattice reduction: This step aims to decrease the norm of vectors in the basis by performing a lattice reduction algorithm such as the LLL or BKZ algorithm.

(2) Point search: This step finds a short vector in the lattice with the reduced basis by performing the enumeration algorithm.

In order to obtain concrete and practical security parameters for lattice-based cryptosystems, it is necessary to investigate the trade-offs between the computational cost of a lattice reduction and that of a lattice point search.

For our total cost estimation, we further limit the lattice-based attack model by (1) using our improved progressive BKZ algorithm for lattice reduction, and (2) using the standard (sometimes randomized) lattice vector enumeration algorithm with sound pruning [20]. To predict the computational cost under this model, we propose a simulation method to generate the computing time of lattice reduction and the lengths of the Gram-Schmidt vectors of the basis to be computed.

BKZ Algorithms: Let \(B = (\mathbf{b}_1,\ldots ,\mathbf{b}_n)\) be the basis of the lattice. The BKZ algorithms perform the following local point search and update process from index \(i=1\) to \(n-1\). The local point search algorithm, which is essentially the same as the algorithm used in the second part of the lattice-based attacks, finds a short vector in the local block \(B_i = \pi _i (\mathbf{b}_i,\ldots ,\mathbf{b}_{i+\beta -1})\) of the fixed blocksize \(\beta \) (the blocksize shrinks to \(n-i+1\) for large \(i \ge n-\beta +1\)). Here, the lengths of vectors are measured under the projection \(\pi _i\) which is defined in Sect. 2.1. Then, the update process applies lattice reduction for the degenerated basis \((\mathbf{b}_1,\ldots ,\mathbf{b}_{i-1},\mathbf{v},\) \(\mathbf{b}_i,\ldots ,\mathbf{b}_n)\) after inserting vector \(\mathbf{v}\) at i-th index.

The point search subroutine finds a short vector in some searching radius \(\alpha \cdot \mathrm {GH}(B_i)\) with some probability which is defined over random local blocks of the fixed dimension. Here, \(\mathrm {GH}(B_i)\) is an approximation of the length of the shortest vector in the sublattice generated by \(B_i\).

In the classical BKZ algorithms [46, 47], the local point search calls a single execution of a lattice vector enumeration algorithm with a reasonable pruning for searching tree. The BKZ 2.0 algorithm proposed by Chen and Nguyen [13] uses the extreme pruning technique [20], which performs the lattice enumeration with success probability p for \(\lfloor 1/p \rceil \) different bases \(G_1,\ldots ,G_{\lfloor 1/p \rceil }\) obtained by randomizing the local basis \(B_i\). They use the fixed searching radius as \(\sqrt{1.1} \cdot \mathrm {GH}(B_i)\). We stress that BKZ 2.0 is practically the fastest algorithm for solving the approximate SVP of large dimensions. Indeed, many top-records in the Darmstadt Lattice Challenge [50] have been solved by BKZ 2.0 (Table 1).

Table 1. Technical comparison from BKZ 2.0

Our Contributions: In this paper we revisit progressive BKZ algorithms, which have been mentioned in several studies; these include [13, 19, 25, 45, 48]. The main idea of progressive BKZ is that performing BKZ iteratively starting with a small blocksize is practically faster than the direct execution of BKZ with a larger blocksize. The method used to increase the blocksize \(\beta \) strongly affects the overall computational cost of progressive BKZ. The research goal here is to find an optimal method of increasing the blocksize \(\beta \) according to the other parameters in the BKZ algorithms.

One major difference between BKZ 2.0 and our algorithm is the usage of randomized enumeration in local blocks. To find a very short vector in each local block efficiently, BKZ 2.0 uses the randomizing technique in [20]. Then, it reduces each block to decrease the cost of lattice enumeration. Although it is significantly faster than the enumeration without pruning, it introduces overhead because the bases are not good in practice after they have been randomized. To avoid this overhead, we adopted the algorithm with a single enumeration with a low probability.

Moreover, as a rule of thumb, BKZ with a large blocksize and heavy pruning (i.e., a low success probability) is generally better in both speed and basis quality than BKZ with a small blocksize and little pruning (i.e., a high success probability). We pursue this idea and add the freedom to choose the radius \(\alpha \cdot \mathrm {GH}(L)\) of the enumeration of the local block; this value is fixed in BKZ 2.0 as \(\sqrt{1.1} \cdot \mathrm {GH}(L)\).

To optimize the algorithm, we first discuss techniques for optimizing the BKZ parameters of enumeration subroutine, including the blocksize \(\beta \), success probability p of enumeration, and \(\alpha \) to set the searching radius of enumeration as \(\alpha \cdot \mathrm {GH}(B_i)\). We then show the parameter relationship that minimizes the computational cost for enumeration of a BKZ-\(\beta \)-reduced basis. Next, we introduce the new usage of full enumeration cost (FEC), derived from Gama-Nguyen-Regev’s cost estimation [20] with a Gaussian heuristic radius and without pruning, to define the quality of the basis and to predict the cost after BKZ-\(\beta \) is performed. Using this metric, we can determine the timing for increasing blocksize \(\beta \) that provides an optimized strategy; in previous works, the timing was often heuristic.

Furthermore, we propose a new BKZ simulator to predict the Gram-Schmidt lengths \(\Vert \mathbf{b}^*_i\Vert \) after BKZ-\(\beta \). Some previous works aimed to find a short vector as fast as possible, and did not consider other quantities. However, additional information is needed to analyze the security of lattice-based cryptosystems. In the literature, a series of works on lattice basis reduction [13, 14, 19, 44] have attempted to predict the Gram-Schmidt lengths \(\Vert \mathbf{b}^*_i\Vert \) after lattice reduction. In particular, Schnorr’s GSA is the first simulator of Gram-Schmidt lengths, and the information it provides is used to analyze the random sampling algorithm. We follow this idea, i.e., predicting Gram-Schmidt lengths to analyze other algorithms.

Our simulator is based on the Gaussian heuristic with some modifications, and is computable directly from the lattice dimension and the blocksize. On the other hand, Chen-Nguyen’s simulator must compute the values sequentially; it has an inherent problem of accumulative error, if we use the strategy that changes blocksize many times. We also investigate the computational cost of our implementation of the new progressive BKZ, and show our estimation for solving challenge problems in the Darmstadt SVP Challenge and Ideal Lattice Challenge [50]. Our cost estimation is derived by setting the computation model and by curve fitting based on results from computer experiments. Using our improved progressive BKZ, we solved Ideal Lattice Challenge of 600 and 652 dimensions in the exact expected times of \(2^{20.7}\) and \(2^{24.0}\) s, respectively, on a standard PC.

Finally, we compare our algorithm with several previous algorithms. In particular, compared with Chen-Nguyen’s BKZ 2.0 algorithm [13, 14] and Schnorr’s blocksize doubling strategy [48], our algorithm is significantly faster. For example, to find a vector shorter than \(1.05 \cdot \mathrm {GH}(L)\), which is required by the SVP Challenge [50], our algorithm is approximately 50 times faster than BKZ 2.0 in a simulator-based comparison up to 160 dimensions.

Roadmap: In Sect. 2 we introduce the basic facts on lattices. In Sect. 3 we give an overview of BKZ algorithms, including Chen-Nguyen’s BKZ 2.0 [13] and its cost estimation; we also state some heuristic assumptions. In Sect. 4, we propose the optimized BKZ parameters under Schnorr’s geometric series assumption (GSA). In Sect. 5, we explain the basic variant of the proposed progressive BKZ algorithm and its simulator for the cost estimation. In Sect. 6, we discuss the optimized block strategy that improves the speed of the proposed progressive BKZ algorithm. In Sect. 7, we show the cost estimation for processing local blocks based on our implementation. Due to space limitations, we omit the details of our implementation (see the full version [8] for the details). We then discuss an extended strategy using many random reduced bases [20] besides our progressive BKZ in Sect. 8. Finally, Sect. 9 gives the results of our simulation to solve the SVP Challenge problems and compares these results with previous works (Fig. 1).

Fig. 1. Roadmap of this paper: optimizing parameters from local to global

2 Lattice and Shortest Vector

A lattice L is generated by a basis B which is a set of linearly independent vectors \({\mathbf{b}_1,\dots ,\mathbf{b}_n}\) in \(\mathbb {R}^m\). We will refer to it as \(L({\mathbf{b}_1,\dots ,\mathbf{b}_n})=\{\sum _{i=1}^{n}{x_i}{\mathbf{b}_i},{x_i}\in \mathbb {Z}\}\). Throughout this paper, we assume \(m=O(n)\) to analyze the computational cost, though it is not essential. The length of \(\mathbf{v}\in \mathbb {R}^m\) is the standard Euclidean norm \(\Vert \mathbf{v}\Vert := \sqrt{\mathbf{v}\cdot \mathbf{v}}\), where the dot product of any two lattice vectors \(\mathbf{v}=(v_1,\ldots ,v_m)\) and \(\mathbf{w}=(w_1,\ldots ,w_m)\) is defined as \(\mathbf{v}\cdot \mathbf{w}=\sum _{i=1}^{m} v_i w_i\). For natural numbers i and j with \(i<j\), [i : j] is the set of integers \(\{ i, i+1,\ldots , j\}\). Particularly, [1 : j] is denoted by [j].

The gamma function \(\varGamma (s)\) is defined for \(s > 0\) by \(\varGamma (s)=\int _0^\infty t^{s-1}\cdot e^{-t} dt\). The beta function is \(\mathrm{B}(x,y)=\int ^1_0 t^{x-1} (1-t)^{y-1} dt\). We denote by Ball\(_n(R)\) the n-dimensional Euclidean ball of radius R, and then its volume \(V_n(R)=R^n\cdot \frac{\pi ^{n/2}}{\varGamma (n/2+1)}\). Stirling’s approximation yields \(\varGamma (n/2+1)\approx \sqrt{\pi n} (n/2)^{n/2} e^{-n/2}\) and \(V_n(1)^{-1/n} \approx \sqrt{n/(2\pi e)} \approx \sqrt{n/17}\).
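The accuracy of the approximation \(V_n(1)^{-1/n} \approx \sqrt{n/(2\pi e)}\) can be checked numerically. The following is a minimal sketch that works in the log domain to avoid overflow; the function names are illustrative and do not come from the paper.

```python
import math


def log_unit_ball_volume(n: int) -> float:
    """Natural log of V_n(1) = pi^(n/2) / Gamma(n/2 + 1)."""
    return (n / 2.0) * math.log(math.pi) - math.lgamma(n / 2.0 + 1.0)


def ball_volume(n: int, radius: float) -> float:
    """V_n(R) = R^n * V_n(1), the volume of the n-dimensional ball of radius R."""
    return math.exp(n * math.log(radius) + log_unit_ball_volume(n))


def inv_root_unit_volume(n: int) -> float:
    """V_n(1)^(-1/n), the factor that appears in the Gaussian heuristic."""
    return math.exp(-log_unit_ball_volume(n) / n)


def stirling_estimate(n: int) -> float:
    """The approximation sqrt(n / (2 * pi * e)) obtained from Stirling's formula."""
    return math.sqrt(n / (2.0 * math.pi * math.e))
```

Comparing `inv_root_unit_volume(n)` with `stirling_estimate(n)` for growing n shows the relative gap shrinking, since the neglected factor is \((\pi n)^{1/(2n)}\), which tends to 1.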

2.1 Gram-Schmidt Basis and Projective Sublattice

For a given lattice basis \({B=(\mathbf{b}_1,\dots ,\mathbf{b}_n)}\), we define its Gram-Schmidt orthogonal basis \(B^*=(\mathbf{b}_1^*,\dots ,\mathbf{b}_n^*)\) by \(\mathbf{b}^*_i=\mathbf{b}_i - \sum _{j=1}^{i-1}\mu _{ij}{} \mathbf{b}_j^*\) for \(1\le j < i \le n\), where \(\mu _{ij}=(\mathbf{b}_i\cdot \mathbf{b}_j^*)/\Vert \mathbf{b}_j^*\Vert ^2\) are the Gram-Schmidt coefficients (abbreviated as GS-coefficients). We sometimes refer to \(\Vert \mathbf{b}^*_i\Vert \) as the Gram-Schmidt lengths (abbreviated as GS-lengths). We also use the Gram-Schmidt variables (abbreviated as GS-variables) to denote the set of GS-coefficients \(\mu _{ij}\) and lengths \(||\mathbf{b}^*_i||\). The lattice determinant is defined as \(\det (L) := \prod _{i=1}^n \Vert \mathbf{b}^*_i\Vert \) and it is equal to the volume \(\mathrm{vol}(L)\) of the fundamental parallelepiped. We denote the orthogonal projection by \(\pi _i:\mathbb R^m \mapsto \mathrm{span}(\mathbf{b}_1\,,\dots ,\) \(\mathbf{b}_{i-1})^\perp \) for \(i\in \{1,\dots ,n\}\). In particular, \(\pi _1( \cdot )\) is used as the identity map.
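As an illustration of these definitions, the GS-variables can be computed with a few lines of floating-point code. This is a minimal sketch with hypothetical function names; practical lattice libraries use more careful numerics.

```python
import math
from typing import List, Tuple

Vector = List[float]


def dot(v: Vector, w: Vector) -> float:
    """Standard dot product of two vectors of equal length."""
    return sum(a * b for a, b in zip(v, w))


def gram_schmidt(basis: List[Vector]) -> Tuple[List[Vector], List[List[float]]]:
    """Return (B*, mu) where mu[i][j] is filled only for j < i (0-based indices)."""
    n = len(basis)
    bstar: List[Vector] = []
    mu = [[0.0] * n for _ in range(n)]
    for i in range(n):
        v = [float(x) for x in basis[i]]
        for j in range(i):
            # mu_ij = (b_i . b_j*) / ||b_j*||^2
            mu[i][j] = dot(basis[i], bstar[j]) / dot(bstar[j], bstar[j])
            v = [a - mu[i][j] * b for a, b in zip(v, bstar[j])]
        bstar.append(v)
    return bstar, mu


def gs_lengths(basis: List[Vector]) -> List[float]:
    """GS-lengths ||b_i*||; their product equals det(L)."""
    bstar, _ = gram_schmidt(basis)
    return [math.sqrt(dot(v, v)) for v in bstar]
```

For the basis ((3, 0), (1, 2)), the GS-lengths are 3 and 2, whose product 6 is the lattice determinant.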

We denote the local block by the projective sublattice

$$\begin{aligned} L_{[i:j]}:= L(\pi _i(\mathbf{b}_i),\pi _i(\mathbf{b}_{i+1}), \dots ,\pi _i(\mathbf{b}_j)) \end{aligned}$$

for \(j\in \{i,i+1,\dots ,n\}\). We sometimes use \(B_i\) to denote the lattice whose basis is \((\pi _i(\mathbf{b}_i),\ldots ,\) \(\pi _i(\mathbf{b}_j))\) of projective sublattice \( L_{[i:j]}\). That is, we omit the change of blocksize \(\beta =j-i+1\) if it is clear by context.

2.2 Shortest Vector and Gaussian Heuristic

A non-zero vector in a lattice L that has the minimum norm is called the shortest vector. We use \(\lambda _1(L)\) to denote the norm of the shortest vector. The notion is also defined for a projective sublattice as \(\lambda _1(L_{[i:j]})\) (we occasionally refer to this as \(\lambda _1(B_i)\) in this paper).

The shortest vector problem (SVP) is the problem of finding a vector of length \(\lambda _1(L)\). For a function \(\gamma (n)\) of a lattice dimension n, the standard definition of \(\gamma \)-approximate SVP is the problem of finding a vector shorter than \(\gamma (n) \cdot \lambda _1(L)\).

Let L be an n-dimensional lattice and let \(S \subset \mathbb {R}^m\) be a continuous (usually convex and symmetric) set. The Gaussian heuristic says that the number of points in \(S\cap L\) is approximately \(\mathrm{vol}(S)/\mathrm{vol}(L)\).

In particular, taking S as the origin-centered ball of radius R, the number of lattice points is approximately \(V_n(R) / \mathrm{vol}(L)\). Choosing R so that the volume of the ball equals that of the lattice gives an estimate of the length \(\lambda _1\) of the shortest vector:

$$\begin{aligned} \lambda _1(L) \approx \det (L)^{1/n}/V_n(1)^{1/n} = \frac{(\varGamma (n/2+1) \det (L))^{1/n}}{\sqrt{\pi }} \end{aligned}$$

This is usually called the Gaussian heuristic of a lattice, and we denote it by \(\mathrm {GH}(L) = \det (L)^{1/n}/V_n(1)^{1/n}\).
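Since \(\det (L)\) is the product of the GS-lengths, \(\mathrm {GH}(L)\) can be evaluated directly from them. The sketch below assumes the GS-lengths are given as a list of positive floats; the function name is illustrative.

```python
import math
from typing import Sequence


def gaussian_heuristic(gs_lengths: Sequence[float]) -> float:
    """GH(L) = det(L)^(1/n) / V_n(1)^(1/n), computed in the log domain."""
    n = len(gs_lengths)
    log_det = sum(math.log(x) for x in gs_lengths)
    log_vn1 = (n / 2.0) * math.log(math.pi) - math.lgamma(n / 2.0 + 1.0)
    return math.exp((log_det - log_vn1) / n)
```

The same function applies to a projective sublattice \(L_{[i:j]}\) by passing only the GS-lengths \(\Vert \mathbf{b}^*_i\Vert ,\ldots ,\Vert \mathbf{b}^*_j\Vert \).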

For our analysis, we use the following lemma on the randomly generated points.

Lemma 1

Let \(x_1,\ldots ,x_K\) be K points uniformly sampled from the n-dimensional unit ball. Then, the expected value of the shortest length of vectors from origin to these points is

$$\begin{aligned} \mathbf{E} \Big [ \min _{i\in [K]} ||x_i|| \Big ] = K \cdot B\Big (K,\frac{n+1}{n} \Big ) := K\cdot \int ^1_0 t^{1/n} (1-t)^{K-1} dt. \end{aligned}$$

In particular, letting \(K=1\), the expected value is \(n/(n+1)\).


Proof. Since the cumulative distribution function of each \(\Vert x_i\Vert \) is \(F_i(r) = r^n\), the cumulative distribution function of the shortest length of the vectors is \(F_{\min }(r) = 1- (1-F_i(r))^K = 1-(1-r^n)^K.\) Its probability density function is \(P_{\min }(r)= \frac{dF_{\min }}{dr} =Kn\cdot r^{n-1} (1-r^n)^{K-1}.\) Therefore, by the substitution \(t=r^n\), the expected value of the shortest length of the vectors is

$$\begin{aligned} \int _0^1 r P_{\min }(r) dr = K\cdot \int ^1_0 t^{1/n} (1-t)^{K-1} dt. \end{aligned}$$

   \(\Box \)
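Lemma 1 can be sanity-checked numerically by comparing the closed form, written with gamma functions via \(\mathrm{B}(x,y)=\varGamma (x)\varGamma (y)/\varGamma (x+y)\), against a direct quadrature of the integral. This is a minimal sketch; the function names and the midpoint rule are choices made for this illustration only.

```python
import math


def expected_min_norm(K: int, n: int) -> float:
    """K * B(K, (n+1)/n), evaluated through log-gamma for numerical stability."""
    x, y = float(K), 1.0 + 1.0 / n
    log_beta = math.lgamma(x) + math.lgamma(y) - math.lgamma(x + y)
    return K * math.exp(log_beta)


def expected_min_norm_numeric(K: int, n: int, steps: int = 100000) -> float:
    """Midpoint-rule approximation of K * int_0^1 t^(1/n) (1-t)^(K-1) dt."""
    h = 1.0 / steps
    total = 0.0
    for s in range(steps):
        t = (s + 0.5) * h
        total += t ** (1.0 / n) * (1.0 - t) ** (K - 1)
    return K * total * h
```

For \(K=1\) the closed form reduces to \(n/(n+1)\), and the expected minimum decreases as K grows.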

2.3 Enumeration Algorithm [20, 28, 46]

We explain the enumeration algorithm for finding a short vector in the lattice. The pseudo code of the enumeration algorithm is given in [20, 46]. For given lattice basis \((\mathbf{b}_1,\ldots ,\mathbf{b}_n)\), and its Gram-Schmidt basis \((\mathbf{b}^*_1,\ldots ,\mathbf{b}^*_n)\), the enumeration algorithm considers a search tree whose nodes are labeled by vectors. The root of the search tree is the zero vector; for each node labeled by \(\mathbf{v} \in L\) at depth \(k\in [n]\), its children have labels \(\mathbf{v} + a_{n-k}\cdot \mathbf{b}_{n-k}\) \((a_{n-k}\in \mathbb {Z})\) whose projective length \(\Vert \pi _{n-k}(\sum _{i=n-k}^n a_i \cdot \mathbf{b}_i )\Vert \) is smaller than a bounding value \(R_{k+1} \in (0,\Vert \mathbf{b}_1\Vert ]\). After searching all possible nodes, the enumeration algorithm finds a lattice vector shorter than \(R_n\) at a leaf of depth n, or its projective length is somehow short at a node of depth \(k<n\). It is clear that by taking \(R_k=\Vert \mathbf{b}_1\Vert \) for all \(k\in [n]\), the enumeration algorithm always finds the shortest vector \(\mathbf{v}_1\) in the lattice, namely \(\Vert \mathbf{v}_1\Vert =\lambda _1(L)\).

Because \(\Vert \mathbf{b}_1\Vert \) is often larger than \(\lambda _1(L)\), we can set a better searching radius \(R_n=\mathrm {GH}(L)\) to decrease the computational cost. We call this the full enumeration algorithm and define the full enumeration cost \(\mathrm {FEC}(B)\) as the cost of the algorithm for this basis. With the same argument in [20], we can evaluate \(\mathrm {FEC}(B)\) using the following equation.

$$\begin{aligned} \begin{aligned} \mathrm {FEC}(B)=\sum _{k=1}^{n}\frac{V_k(\mathrm {GH}(L))}{\prod _{i=n-k+1}^n \Vert \mathbf{b}_i^* \Vert }. \end{aligned} \end{aligned}$$
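The sum defining \(\mathrm {FEC}(B)\) depends only on the GS-lengths, so it can be evaluated without running any enumeration. The following sketch assumes the GS-lengths are given as a list; names are illustrative. Each term is computed in the log domain because the volumes and the products of GS-lengths can be very large or very small.

```python
import math
from typing import Sequence


def full_enum_cost(gs_lengths: Sequence[float]) -> float:
    """FEC(B) = sum_{k=1}^{n} V_k(GH(L)) / prod_{i=n-k+1}^{n} ||b_i*||."""
    n = len(gs_lengths)
    log_det = sum(math.log(x) for x in gs_lengths)
    log_vn1 = (n / 2.0) * math.log(math.pi) - math.lgamma(n / 2.0 + 1.0)
    log_gh = (log_det - log_vn1) / n  # log of GH(L)

    total = 0.0
    log_tail = 0.0  # log of prod_{i=n-k+1}^{n} ||b_i*||
    for k in range(1, n + 1):
        log_tail += math.log(gs_lengths[n - k])  # 0-based index of b*_{n-k+1}
        log_vk = k * log_gh + (k / 2.0) * math.log(math.pi) - math.lgamma(k / 2.0 + 1.0)
        total += math.exp(log_vk - log_tail)
    return total
```

The last term (\(k=n\)) is always 1 by the definition of \(\mathrm {GH}(L)\), and the value is invariant under scaling of the lattice, so only the shape of the GS-length profile matters.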

Because full enumeration is a cost-intensive algorithm, several improvements have been proposed by considering the trade-offs between running time, searching radius, and success probability [20, 47]. Gama-Nguyen-Regev [20] proposed a cost estimation model of the lattice enumeration algorithm to optimize the bounding functions of \(R_1, \ldots , R_n\), which were mentioned above. The success probability p of finding a single vector within a radius c is given by

$$\begin{aligned} p = \mathop {\Pr }\limits _{ (x_1,\ldots ,x_n) \leftarrow c \cdot S_n} \Big [ \sum _{i=1}^\ell x_i^2 < R_\ell ^2\ \mathrm{for}\ \forall \ \ell \in [n] \Big ], \end{aligned}$$

where \(S_n\) is the surface of the n-dimensional unit ball. Then, the cost of the enumeration algorithm can be estimated by the number of processed nodes, i.e.,

$$\begin{aligned} N = \frac{1}{2} \sum _{k=1}^n \frac{\mathrm{vol}\{ (x_1,\ldots ,x_k)\in \mathbb {R}^k : \sum _{i=1}^\ell x_i^2 < R_\ell ^2\ \mathrm{for}\ \forall \ \ell \in [k] \}}{\prod _{i=n-k+1}^n \Vert \mathbf{b}^*_i\Vert }. \end{aligned} \qquad (1)$$

Note that the factor 1/2 comes from the symmetry of the lattice, since \(\mathbf{v}\) and \(-\mathbf{v}\) have the same length. Using the methodology in [20], Chen-Nguyen proposed a method to find the optimal bounding functions \(R_1,\ldots ,R_n\) that minimize N subject to p.
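The success probability p of a given bounding function can be estimated by straightforward Monte Carlo sampling on the sphere of radius c. The sketch below is only an illustration and is not the estimation method of [20]: it samples uniform points by normalizing Gaussian vectors, and it uses a non-strict comparison with a tiny relative slack so that rounding at \(\ell =n\), where the prefix sum equals \(c^2\) by construction, is not counted as a miss. All names are hypothetical.

```python
import math
import random
from typing import Sequence


def estimate_success_probability(bounds: Sequence[float], c: float,
                                 trials: int = 20000, seed: int = 0) -> float:
    """Estimate Pr[sum_{i<=l} x_i^2 <= R_l^2 for all l] for x uniform on c * S_n.

    bounds holds (R_1, ..., R_n). The slack factor is an assumption of this
    sketch, added to absorb floating-point rounding.
    """
    rng = random.Random(seed)
    n = len(bounds)
    hits = 0
    for _ in range(trials):
        g = [rng.gauss(0.0, 1.0) for _ in range(n)]
        norm = math.sqrt(sum(x * x for x in g))
        prefix = 0.0
        inside = True
        for ell in range(n):
            x = c * g[ell] / norm
            prefix += x * x
            if prefix > bounds[ell] ** 2 * (1.0 + 1e-9):
                inside = False
                break
        if inside:
            hits += 1
    return hits / trials
```

With \(R_\ell = c\) for all \(\ell \) (no pruning) the estimate is 1, while a linear bounding function \(R_\ell ^2 = (\ell /n)\, c^2\) gives a much smaller probability.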

In this paper, we use the lattice enumeration cost, abbreviated as ENUM cost, to denote the number N in Eq. (1). For a lattice L defined by a basis B and parameters \(\alpha >0\) and \(p\in [0,1]\), we use \(\mathrm {ENUMCost}(B;\alpha ,p)\) to denote the minimized cost N of lattice enumeration with radius \(c=\alpha \cdot \mathrm {GH}(L)\) subject to the success probability p. This notion is also defined for a projective sublattice.

3 Lattice Reduction Algorithms

Lattice reduction algorithms transform a given lattice basis \((\mathbf{b}_1,\ldots ,\mathbf{b}_n)\) to another basis whose Gram-Schmidt lengths are relatively shorter.

LLL Algorithm [30]: The LLL algorithm transforms the basis \((\mathbf{b}_1,\ldots ,\mathbf{b}_n)\) using the following two operations: size reduction \(\mathbf{b}_i \leftarrow \mathbf{b}_i - \lfloor \mu _{ij} \rceil \mathbf{b}_j\) for \(j\in [i-1]\), and neighborhood swaps between \(\mathbf{b}_i\) and \(\mathbf{b}_{i+1}\) if \(\Vert \mathbf{b}^*_{i+1} \Vert ^2 \le 1/2\Vert \mathbf{b}^*_{i} \Vert ^2\). These operations are repeated until no update occurs.

BKZ Algorithms [46, 47]: For a given lattice basis and a fixed blocksize \(\beta \), the BKZ algorithm processes the following operation in the local block \(B_i\), i.e., the projected sublattice \(L_{[i:i+\beta -1]}\) of blocksize \(\beta \), starting from the first index \(i=1\) to \(i=n-1\). Note that the blocksize \(\beta \) necessarily shrinks to \(n-i+1\) for large \(i > n-\beta +1\), and thus we sometimes use \(\beta '\) to denote the dimension of \(B_i\), i.e., \(\beta '=\min (\beta ,n-i+1)\).

At index i, the standard implementation of the BKZ algorithm calls the enumeration algorithm for the local block \(B_i\). Let \(\mathbf{v}\) be a short vector found by the enumeration algorithm. Then the BKZ algorithm inserts \(\mathbf{v}\) between \(\mathbf{b}_{i-1}\) and \(\mathbf{b}_i\), and constructs the degenerated basis \((\mathbf{b}_1,\ldots ,\mathbf{b}_{i-1}, \mathbf{v},\mathbf{b}_i,\ldots ,\mathbf{b}_{\min (i+\beta -1,n)})\). To this basis, we apply the LLL algorithm (or BKZ with a smaller blocksize) so that a basis of shorter independent vectors can be obtained. One set of these procedures from \(i=1\) to \(n-1\) is usually called a tour. The original version of the BKZ algorithm stops when no update occurs during \(n-1\) iterations. In this paper, we refer to the BKZ algorithm with blocksize \(\beta \) as BKZ-\(\beta \).

HKZ Reduced Basis: The lattice basis \((\mathbf{b}_1,\ldots ,\mathbf{b}_n)\) is called Hermite-Korkine-Zolotarev (HKZ) reduced [38, Chapter 2] if it is size-reduced, i.e., \(|\mu _{ij}|\le 1/2\) for all \(j<i\), and \(\pi _i(\mathbf{b}_i)\) is the shortest vector in the projective sublattice \(L_{[i:n]}\) for all i. We can estimate the Gram-Schmidt lengths of the HKZ-reduced basis by using the Gaussian heuristic as \(\Vert \mathbf{b}^*_i\Vert =\mathrm {GH}(L_{[i:n]})\). Since the HKZ-reduced basis is completely reduced in this sense, we will use this to discuss the lower bound of computing time in Sect. 8.2.

3.1 Some Heuristic Assumptions in BKZ

Gaussian Heuristic in Small Dimensions: Chen and Nguyen observed that the length \(\lambda _1(B_i)\) of the shortest vector in the local block \(B_i\) is usually larger than \(\mathrm {GH}(B_i)\) in small dimensions i.e., small \(\beta '\) [13]. They gave the averaged values of \(\Vert \mathbf{b}^*_i\Vert /\det (L)^{1/n}\) for the last indexes of highly reduced bases to modify their BKZ simulator, see [13, Appendix C]. For their 50 simulated values for \(\Vert \mathbf{b}^*_{n-49}\Vert ,\ldots ,\Vert \mathbf{b}^*_{n}\Vert \), we define the modified Gaussian heuristic constant by

$$\begin{aligned} \tau _i := \frac{\lambda _1(\pi _{n-i+1}(L))}{\mathrm {GH}(\pi _{n-i+1}(L))} = \frac{ \Vert \mathbf{b}^*_{n-i+1}\Vert }{V_i(1)^{-1/i}\cdot \prod _{j=n-i+1}^{n} \Vert \mathbf{b}^*_j \Vert ^{1/i} }. \end{aligned}$$

We will use \(\tau _i\) for \(i\le 50\) to denote these modifying constants; for \(i > 50\) we define \(\tau _i=1\) following Chen-Nguyen’s simulator [13].

In the rest of this paper, we assume that the shortest vector lengths of \(\beta \)-dimensional local blocks \(B_i\) of reduced bases satisfy

$$\begin{aligned} \lambda _1(B_i) \approx \left\{ \begin{array}{ll} \tau _\beta \cdot \mathrm {GH}(B_i) &{} (\beta \le 50 )\\ \mathrm {GH}(B_i) &{} (\beta > 50 )\\ \end{array} \right. \end{aligned}$$

on average.

We note that there exists a mathematical theory to guarantee \(\tau _i \rightarrow 1\) for random lattices when the dimension goes to infinity [42]. Though it does not give the theoretical guarantee \(\tau _i=1\) for BKZ local blocks, they are very close in our preliminary experiments.

Geometric Series Assumption (GSA): Schnorr [44] introduced geometric series assumption (GSA), which says that the Gram-Schmidt lengths \(\Vert \mathbf{b}_i^*\Vert \) in the BKZ-reduced basis decay geometrically with quotient r for \(i=1,\dots ,n\), namely, \(\Vert \mathbf{b}_i^*\Vert ^2/\Vert \mathbf{b}_1\Vert ^2=r^{i-1}\), for some \(r \in [3/4,1)\). Here r is called the GSA constant. Figure 2 shows the Gram-Schmidt lengths of a 240-dimensional reduced basis after processing BKZ-100 using our algorithm and parameters.

Fig. 2. Semi-log graph of \(\Vert \mathbf{b}_{\mathbf{i}}^{\mathbf{*}} \Vert \) of a 240-dimensional highly reduced basis
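Under GSA, \(\log \Vert \mathbf{b}^*_i\Vert \) is a linear function of i with slope \(\frac{1}{2}\log r\), which is why a semi-log plot of a well-reduced basis looks close to a straight line. The sketch below generates an ideal GSA profile and recovers r from a list of GS-lengths by a least-squares line fit. The fitting procedure and the function names are assumptions made for this illustration, not the method used in the paper.

```python
import math
from typing import List, Sequence


def gsa_profile(n: int, r: float, b1_norm: float = 1.0) -> List[float]:
    """GS-lengths satisfying ||b_i*||^2 / ||b_1||^2 = r^(i-1) for i = 1..n."""
    return [b1_norm * r ** ((i - 1) / 2.0) for i in range(1, n + 1)]


def fit_gsa_constant(gs_lengths: Sequence[float]) -> float:
    """Estimate r from the least-squares slope of log ||b_i*|| against i."""
    n = len(gs_lengths)
    xs = list(range(1, n + 1))
    ys = [math.log(v) for v in gs_lengths]
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    slope = (sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
             / sum((x - x_mean) ** 2 for x in xs))
    return math.exp(2.0 * slope)  # slope = (1/2) * log r
```

On real reduced bases the fit is only approximate, because GSA does not hold exactly at the first and last indexes.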

It is known that GSA does not hold exactly in the first and last indexes [11]. Several previous works [3, 11, 44] aimed to modify the reduction algorithm that outputs the reduced basis satisfying GSA. However, it seems difficult to obtain such a reduced basis in practice. In this paper, we aim to modify the parameters in the first and last indexes so that the proposed simulator performs with optimal efficiency (See Sect. 5.1).

3.2 Chen-Nguyen’s BKZ 2.0 Algorithm [13]

We recall Chen-Nguyen’s BKZ 2.0 Algorithm in this section. The outline of the BKZ 2.0 algorithm is described in Fig. 3.

Fig. 3. Outline of BKZ 2.0

Speed-Up Techniques for BKZ 2.0: BKZ 2.0 employs four major speed-up techniques that differentiate it from the original BKZ:

1. BKZ 2.0 employs the extreme pruning technique [20], which attempts to find shorter vectors in the local blocks \(B_i\) with low probability p by randomizing basis \(B_i\) to more blocks \(G_1,\ldots ,G_M\) where \(M = \lfloor 1/p \rceil \).

2. For the search radius \(\min \{ \Vert \mathbf{b}^*_i\Vert , \alpha \cdot \mathrm {GH}(B_i) \}\) in the enumeration algorithm of the local block \(B_i\), Chen and Nguyen set the value as \(\alpha = \sqrt{1.1}\) from their experiments, while the previous works set the radius as \(\Vert \mathbf{b}_i^*\Vert \).

3. In order to reduce the cost of the enumeration algorithm, BKZ 2.0 preprocesses the local blocks by executing the sequence of BKZ algorithm, e.g., 3 tours of BKZ-50 and then 5 tours of BKZ-60, and so on. The parameters blocksize, number of rounds and number of randomized bases, are precomputed to minimize the total enumeration cost.

4. BKZ 2.0 uses the terminating condition introduced in [23], which aborts BKZ within small number of tours. It can find a short vector faster than the full execution of BKZ.

Chen-Nguyen’s BKZ 2.0 Simulator: In order to predict the computational cost and the quality of the output basis, they also propose a simulating procedure for the BKZ 2.0 algorithm. Let \((\ell _1,\ldots ,\ell _n)\) be the simulated values of the GS-lengths \(\Vert \mathbf{b}^*_i\Vert \) for \(i=1,\ldots ,n\). Then, the simulated values of the determinant and the Gaussian heuristic are represented by \(\prod _{j=1}^n \ell _j\) and \(\mathrm {GH}(B_i) = V_{\beta '}(1)^{-1/\beta '} \prod _{j=i}^{i+\beta '-1} \ell _j^{1/\beta '}\), respectively, where \(\beta '=\min \{\beta ,n-i+1\}\).

They simulate a BKZ tour of blocksize \(\beta \) assuming that each enumeration procedure finds a vector of projective length \(\mathrm {GH}(B_i)\). Roughly speaking, their simulator updates \((\ell _i,\ell _{i+1})\) to \((\ell '_i,\ell '_{i+1})\) for \(i=1,\ldots ,n-1\), where \(\ell '_i = \mathrm {GH}(B_i)\) and \(\ell '_{i+1} = \ell _{i+1}\cdot (\ell _i/\ell '_i)\). Here, the last 50 GS-lengths are modified using an HKZ reduced basis. The details of their simulator are given in [13, Algorithm 3].
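The rough update rule above can be sketched as follows. This is a simplified illustration of the description in this paragraph only and is not [13, Algorithm 3]: it omits the HKZ-based modification of the last 50 GS-lengths, and the guard that skips an update when \(\mathrm {GH}(B_i)\) is not shorter than the current \(\ell _i\) is an assumption of this sketch. Function names are hypothetical.

```python
import math
from typing import List, Sequence


def log_gh(log_lengths: Sequence[float]) -> float:
    """log GH of a block given the logs of its GS-lengths."""
    k = len(log_lengths)
    log_vk1 = (k / 2.0) * math.log(math.pi) - math.lgamma(k / 2.0 + 1.0)
    return (sum(log_lengths) - log_vk1) / k


def simulate_tour(lengths: Sequence[float], beta: int) -> List[float]:
    """One simplified simulated tour: l'_i = GH(B_i), l'_{i+1} = l_{i+1} * l_i / l'_i."""
    log_l = [math.log(x) for x in lengths]
    n = len(log_l)
    for i in range(n - 1):  # corresponds to i = 1, ..., n-1 in the text
        block = log_l[i:min(i + beta, n)]  # block of dimension beta' = min(beta, n-i+1)
        new_i = log_gh(block)
        if new_i < log_l[i]:  # assumption: update only when the block head gets shorter
            log_l[i + 1] += log_l[i] - new_i  # keeps the product l_i * l_{i+1} fixed
            log_l[i] = new_i
    return [math.exp(x) for x in log_l]
```

Because each step preserves \(\ell _i \cdot \ell _{i+1}\), the simulated determinant is unchanged by a tour. The values are updated sequentially, so every index depends on the updates made before it.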

They also present the upper and lower bounds for the number of processed nodes during the lattice enumeration of blocksize \(\beta \). From [14, Table 4], we extrapolate the costs as

$$\begin{aligned} \begin{array}{l} \log _2(\mathrm{Cost}_{\beta }) = 0.000784314\beta ^2 + 0.366078\beta -6.125 \end{array} \end{aligned}$$

Then, the total enumeration cost of performing the BKZ 2.0 algorithm using blocksize \(\beta \) and t tours is given by

$$\begin{aligned} t\cdot \sum _{i=1}^{n-1} \mathrm{Cost}_{\min \{\beta ,n-i+1\}}. \end{aligned}$$

To convert the number of nodes into single-threaded time in seconds, we divide by the constant \(4\cdot 10^9 / 200 = 2\cdot 10^7\) nodes per second, because they assumed that processing one node requires 200 clock cycles on a standard CPU, and we assume that the CPU runs at 4.0 GHz.
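The cost model above is simple enough to evaluate directly. The following sketch uses the extrapolated quadratic, the per-tour sum, and the node rate exactly as stated in the text; the function and constant names are illustrative.

```python
def log2_enum_cost(beta: int) -> float:
    """Extrapolated log2 of the number of enumeration nodes for blocksize beta."""
    return 0.000784314 * beta ** 2 + 0.366078 * beta - 6.125


def bkz2_total_nodes(n: int, beta: int, tours: int) -> float:
    """t * sum_{i=1}^{n-1} Cost_{min(beta, n-i+1)}."""
    per_tour = sum(2.0 ** log2_enum_cost(min(beta, n - i + 1)) for i in range(1, n))
    return tours * per_tour


# 4.0 GHz clock divided by 200 cycles per node, as assumed in the text.
NODES_PER_SECOND = 4e9 / 200


def bkz2_seconds(n: int, beta: int, tours: int) -> float:
    """Single-threaded running time estimate in seconds."""
    return bkz2_total_nodes(n, beta, tours) / NODES_PER_SECOND
```

For example, `log2_enum_cost(100)` evaluates to about 38.33, i.e., roughly \(2^{38.3}\) nodes for a single enumeration in a block of size 100 under this extrapolation.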

We note that there are several models to extrapolate \(\log _2(\mathrm{Cost}_{\beta })\). Indeed, Lepoint and Naehrig [31] consider two models by a quadratic interpolation and a linear interpolation from the table. Albrecht et al. [5] showed another BKZ 2.0 cost estimation that uses an interpolation using the cost model \(\log _2(\mathrm{Cost}_{\beta }) = O(n\log n)\). It is a highly non-trivial task to find a proper interpolation that estimates a precise cost of the BKZ 2.0 algorithm.

We further mention that the upper bound of the simulator is somewhat debatable, because they use the enumeration radius \(c=\min \{ \sqrt{1.1} \cdot \mathrm {GH}(B_i),\Vert \mathbf{b}_i^*\Vert \} \) for \(i < n-30\) in their experiment whereas they assume \(c=\mathrm {GH}(B_i)\) for the cost estimation in their upper bound simulation. Thus, the actual cost of BKZ 2.0 could differ by a factor of \(1.1^{O(\beta )}\).

Fig. 4. Plain BKZ algorithm

4 Optimizing Parameters in Plain BKZ

In this section we consider the plain BKZ algorithm described in Fig. 4, and roughly predict the GS-lengths of the output basis, which are determined by the GSA constant r. Using this analysis, we can obtain the optimal settings for parameters \((\alpha ,p)\) in Step 4 of the plain BKZ algorithm of blocksize \(\beta \).

4.1 Relationship of Parameters \(\alpha , p, \beta , r\)

We fix the values of parameters \((\beta ,\alpha )\) and assume that the lattice dimension n is sufficiently large.

Suppose that we found a vector \(\mathbf{v}\) of \(\Vert \mathbf{v}\Vert < \alpha \cdot \mathrm {GH}(B_i)\) in the local block \(B_i\). We update the basis \(B_i\) by inserting \(\mathbf{v}\) at i-th index, and perform LLL or small blocksize BKZ on the updated basis.

When the lattice dimension is large, Rogers’ theorem [42] says that approximately \(\alpha ^n/2\) vector pairs \((\mathbf{v},-\mathbf{v})\) exist within the ball of radius \(c=\alpha \cdot \mathrm {GH}(L)\). Since the pruning probability is defined for a single vector pair, we expect the actual probability that the enumeration algorithm finds at least one vector shorter than c is roughly

$$\begin{aligned} 1-(1-p)^{\alpha ^n/2} \approx p\cdot \frac{\alpha ^n}{2}. \end{aligned} \qquad (5)$$

From relation (5), there may exist one lattice vector in the searching space by setting parameter p as

$$\begin{aligned} p = \frac{2}{\alpha ^\beta }. \end{aligned} \qquad (6)$$

Remark 1

The probability setting of Eq. (6) is an optimal choice under our assumption. If p is smaller, the enumeration algorithm finds no short vector with high probability and basis updating at i-th index does not occur, which is a waste of time. On the other hand, if we take a larger p so that there exist \(p\cdot \alpha ^\beta /2>1\) vector pairs, the computational time of the enumeration algorithm increases more than \(p\cdot \alpha ^\beta /2\) times [20]. Although it can find shorter vectors, this is also a waste of time from the viewpoint of basis updating.

Assume that one vector is found using the enumeration, and also assume that the distribution of it is the same as the random point in the \(\beta \)-dimensional ball of radius \(\alpha \cdot \mathrm {GH}(B_i)\). Then, the expected value of \(\Vert \mathbf{v}\Vert \) is \(\frac{\beta }{\beta +1} \alpha \cdot \mathrm {GH}(B_i)\) by letting \(K=1\) in Lemma 1. Thus, we can expect that this is the value \(\Vert \mathbf{b}^*_i\Vert \) after update.

Therefore, after executing a sufficient number of BKZ tours, we can expect that all the lengths \(\Vert \mathbf{b}^*_i\Vert \) of the Gram-Schmidt basis satisfy

$$\begin{aligned} \Vert \mathbf{b}^*_i\Vert = \frac{\beta }{\beta +1} \alpha \cdot \mathrm {GH}(B_i) \end{aligned} \qquad (7)$$

on average. Hence, under Schnorr’s GSA, we have the relation

$$\begin{aligned} \Vert \mathbf{b}^*_i\Vert = \frac{\alpha \beta }{\beta +1} \cdot V_{\beta }(1)^{-1/\beta } \Vert \mathbf{b}^*_i\Vert \prod _{j=1}^{\beta } r^{(j-1)/2\beta }, \end{aligned}$$

and the GSA constant is

$$\begin{aligned} r = \left( \frac{\beta +1}{\alpha \beta } \right) ^{\frac{4}{\beta -1}} \cdot V_\beta (1)^{\frac{4}{\beta (\beta -1)} }. \end{aligned}$$

Therefore, by fixing \((\alpha ,\beta )\), we can set the probability p and obtain r as a rough prediction of the quality of the output lattice basis. We will use the relations (6) and (9) to set our parameters. Note that any two of \(\beta ,\alpha ,p\) and r are determined from the other two values.
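The relations (6) and (9) can be evaluated directly. The following is a minimal sketch, assuming the standard unit-ball volume \(V_k(1)=\pi ^{k/2}/\Gamma (k/2+1)\); the function names are ours and are chosen only for illustration.

```python
import math


def log_unit_ball_volume(k: int) -> float:
    """Natural log of V_k(1) = pi^(k/2) / Gamma(k/2 + 1)."""
    return 0.5 * k * math.log(math.pi) - math.lgamma(0.5 * k + 1.0)


def pruning_probability(alpha: float, beta: int) -> float:
    """Eq. (6): p = 2 / alpha^beta (no clamping to [0, 1] is applied)."""
    return 2.0 / alpha ** beta


def gsa_constant(alpha: float, beta: int) -> float:
    """Eq. (9): r = ((beta+1)/(alpha*beta))^(4/(beta-1)) * V_beta(1)^(4/(beta*(beta-1)))."""
    if beta < 2:
        raise ValueError("beta must be at least 2")
    log_r = (4.0 / (beta - 1)) * math.log((beta + 1) / (alpha * beta))
    log_r += (4.0 / (beta * (beta - 1))) * log_unit_ball_volume(beta)
    return math.exp(log_r)
```

Working in the log domain avoids underflow of \(V_\beta (1)\) for large \(\beta \). As a sanity check, a block with \(\Vert \mathbf{b}^*_j\Vert = r^{(j-1)/2}\) and r taken from this function should satisfy Eq. (7) at its first index.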

Remark 2

Our estimation is somewhat pessimistic; i.e., in our experiments, the vectors found during the BKZ algorithm are often shorter than the estimate in Eq. (7). This gap mainly comes from the estimation in (5), which can be explained as follows. Let \((R_1,\ldots ,R_\beta )\) be a bounding function of probability p for a vector of length \(\Vert v\Vert \). Then, the probability \(p'\) for a shorter vector of length \(\Vert v'\Vert \) equals the probability of the scaled bounding function \((R'_1,\ldots ,R'_\beta )\) where \(R'_i = \min \{ 1.0,R_i \cdot \Vert v\Vert /\Vert v'\Vert \}\). Here, \(p'\) is clearly larger than p due to \(R'_i \ge R_i\) for \(i\in [\beta ]\). Therefore, when the above parameters are used, the quality of the output basis is better than that derived from Eq. (9) if we perform a sufficient number of tours. Hence, within a few tours, our algorithm can output a basis whose quality is as good as predicted by our estimation in this section.
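The scaling of the bounding function in this remark can be written out as follows. This is only an illustrative sketch; the function name and the list representation of \((R_1,\ldots ,R_\beta )\) are our assumptions.

```python
from typing import List, Sequence


def scale_bounding_function(radii: Sequence[float],
                            norm_v: float,
                            norm_v_short: float) -> List[float]:
    """Return R'_i = min(1.0, R_i * ||v|| / ||v'||) for a shorter vector v'."""
    if not 0.0 < norm_v_short <= norm_v:
        raise ValueError("expected 0 < ||v'|| <= ||v||")
    factor = norm_v / norm_v_short
    return [min(1.0, r * factor) for r in radii]
```

Since the scaling factor is at least 1, every \(R'_i\) is at least \(R_i\), which is the reason the remark gives for \(p'\) being larger than p.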

Fig. 5. Relation between \(\beta \) and r that minimizes the computational cost

4.2 Optimizing Parameters

Now for a fixed parameter pair \((\beta ,r)\), the cost \(\mathrm {ENUMCost}(B_i;\alpha ,p)\) of the enumeration algorithm in local block \(B_i\) satisfying GSA is computable. Concretely, we compute \(\alpha \) using the relation (9), fix p by (6), and simulate the Gram-Schmidt lengths of \(B_i\) using \(\Vert \mathbf{b}^*_i\Vert =r^{(i-1)/2}\). Using the computation technique in [20], for several GSA constants r, we search for the optimal blocksize \(\beta \) that minimizes the enumeration cost \(\mathrm {ENUMCost}(B_i;\alpha ,p)\). The small squares in Fig. 5 show the results. From these points, we find the functions \(f_1(\beta )\) and \(f_2(\beta )\), whose graphs are also in the figure.

We explain how to find these functions \(f_1(\beta )\) and \(f_2(\beta )\). Suppose lattice dimension n is sufficiently large, and suppose the cost of the enumeration algorithm is roughly dominated by the probability p times the factor at \(k=n/2\) in the summation (1). Then \(\mathrm {ENUMCost}(B_i;\alpha ,p)\) is approximately

$$\begin{aligned} D = p \cdot \frac{V_{\beta /2}(\alpha \cdot \mathrm {GH}(B_r) ) }{\prod _{i=\beta /2+1}^\beta \Vert \mathbf{b}^*_i\Vert } = 2\alpha ^{-\beta /2} \frac{V_{\beta /2}(1) V_\beta (1)^{-1/2} }{r^{\beta ^2/16}}, \end{aligned}$$

where from Eq. (9) we have obtained

$$\begin{aligned} D \approx Const. \times r^{(\beta ^2 - 2\beta )/16} \cdot \left( \frac{\beta }{e\pi } \right) ^{\beta /4}, \text{ and } \frac{\partial \log D}{\partial \beta } \approx \frac{\beta -1}{8} \log r + \frac{1}{4} + \frac{1}{4} \log \frac{\beta }{e\pi }. \end{aligned}$$

In order to minimize D, we roughly need the above derivative to be zero; thus, we use the following function of \(\beta \) for our cost estimation with constants \(c_i\)

$$\begin{aligned} \log (r) = -2\cdot (\log \beta +1 - \log (e\pi ) )/(\beta -1) = \frac{ \log c_1 \beta }{c_2\beta + c_3}. \end{aligned}$$

From this observation, we fix the fitting function model as \(f(\beta ) = \frac{ \log (c_1 \beta + c_2) }{c_3\beta + c_4}.\)

By using the least squares method implemented in gnuplot, we find the coefficients \(c_i\) so that \(f(\beta )\) is a good approximation of the pairs \((\beta _i,\log (r_i))\). In our curve fitting, we split the range of \(\beta \) into the interval [40, 100] and the larger values. This split is needed so that \(\log (r)\) converges to 0 when \(\beta \) is sufficiently large; our curve fitting using a single natural function did not achieve this. Curves \(f_1(\beta )\) and \(f_2(\beta )\) in Fig. 5 are the results of our curve fitting for the range [40, 100] and the larger one, respectively.

For the range of \(\beta \in [40,100]\), we have obtained

$$\begin{aligned} \mathrm{log}(r) = f_1(\beta ) := -18.2139/(\beta + 318.978) \end{aligned}$$

and for the larger blocksize \(\beta >100\),

$$\begin{aligned} \mathrm{log}(r)=f_2(\beta ) := (-1.06889/(\beta -31.0345))\cdot \mathrm{log}(0.417419\beta -25.4889). \end{aligned}$$

Note that we will use the relation (10) when the blocksize is smaller than 40.

Moreover, we obtain pairs of \(\beta \) and the minimized \(\mathrm {ENUMCost}(B_i;\alpha ,p)\), in accordance with the above experiments. Using the curve fitting in gnuplot that minimizes \(\sum _{\beta } |f(\beta ) - \log _2 \mathrm {ENUMCost}(B_i;\alpha ,p) |^2\), we find the extrapolating formula

$$\begin{aligned} \log _2\mathrm {MINCost}(\beta ) := \left\{ \begin{array}{ll} 0.1375\beta + 7.153 &{} (\beta \in [60,105]) \\ 0.000898\beta ^2 + 0.270\beta -16.97 &{} (\beta > 105) \end{array} \right. \end{aligned}$$

to \(\log _2 \mathrm {ENUMCost}(B_i;\alpha ,p)\). We will use this as the standard of the enumeration cost of blocksize \(\beta \).
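For reference, the extrapolating formula can be coded as a piecewise function. This is a small sketch with a function name of our choosing; since the fit is stated only for \(\beta \ge 60\), smaller inputs are rejected rather than extrapolated.

```python
def log2_min_cost(beta: float) -> float:
    """log2 of MINCost(beta) from the fitted piecewise formula."""
    if beta < 60:
        raise ValueError("the fitted formula is stated only for beta >= 60")
    if beta <= 105:
        # Linear branch, fitted on beta in [60, 105].
        return 0.1375 * beta + 7.153
    # Quadratic branch, used for beta > 105.
    return 0.000898 * beta ** 2 + 0.270 * beta - 16.97
```

The two branches come from separate fits, so they agree only approximately at \(\beta =105\).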

Remark 3

Our estimation from the real experiments is \(0.25\beta \cdot \mathrm {ENUMCost}(B_i;\alpha ,p)\) (see Sect. 7.1 of the full version [8]), which crosses over the estimation of the BKZ 2.0 simulator (3) at \(\beta =873\). Thus, the performance of BKZ 2.0 might be better for some extremely large blocksizes, while our algorithm performs better for the realizable blocksizes \(<200\).

4.3 Parameter Settings in Step 4 in Fig. 4

Using the above arguments, we can fix the optimized pair \((\alpha ,p)\) for each blocksize \(\beta \). That is, to process a local block of blocksize \(\beta \) in Step 4 of the plain BKZ algorithm in Fig. 4, we compute the corresponding r by Eqs. (10) and (11), and additionally obtain the parameters \(\alpha \) by Eq. (9) and p by Eq. (6). These are our basic parameter settings.
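The basic parameter setting can be sketched as a short pipeline from \(\beta \) to \((r,\alpha ,p)\). We assume here that \(\log \) in the fitted functions \(f_1\) and \(f_2\) is the natural logarithm (with this reading the two curves nearly agree at \(\beta =100\)), and we invert Eq. (9) for \(\alpha \). The function names are ours, and blocksizes below 40, which use relation (10), are not covered.

```python
import math
from typing import Tuple


def log_unit_ball_volume(k: int) -> float:
    """Natural log of V_k(1) = pi^(k/2) / Gamma(k/2 + 1)."""
    return 0.5 * k * math.log(math.pi) - math.lgamma(0.5 * k + 1.0)


def fitted_log_r(beta: int) -> float:
    """log(r) from the fitted curves f_1 (40 <= beta <= 100) and f_2 (beta > 100)."""
    if beta < 40:
        raise ValueError("relation (10) is used below 40; not covered by this sketch")
    if beta <= 100:
        return -18.2139 / (beta + 318.978)
    return (-1.06889 / (beta - 31.0345)) * math.log(0.417419 * beta - 25.4889)


def block_parameters(beta: int) -> Tuple[float, float, float]:
    """Return (r, alpha, p) for blocksize beta."""
    log_r = fitted_log_r(beta)
    # Eq. (9) solved for alpha:
    #   alpha = (beta+1)/beta * V_beta(1)^(1/beta) * r^(-(beta-1)/4)
    log_alpha = (math.log((beta + 1) / beta)
                 + log_unit_ball_volume(beta) / beta
                 - (beta - 1) / 4.0 * log_r)
    alpha = math.exp(log_alpha)
    p = 2.0 / alpha ** beta  # Eq. (6)
    return math.exp(log_r), alpha, p
```

Calling `block_parameters(60)` returns one such triple; the modifications for the first and last indexes described below are not part of this sketch.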

Modifying Blocksize at First Indexes: We sometimes encounter the phenomenon in which the actual \(\mathrm {ENUMCost}(B_i;\alpha ,p)\) in small indexes is much smaller than that in middle indexes. This is because \(||\mathbf{b}^*_i||\) is smaller than \(\mathrm {GH}(B_i)\) in small indexes. In other words, \(\mathbf{b}_i\) is hard to update using the enumeration of blocksize \(\beta \). To speed up the lattice reduction, we use a heuristic method that enlarges the blocksizes as follows.

From the discussion in the above subsection, we know the theoretical value of the enumeration cost at blocksize \(\beta \). On the other hand, in the actual computing of BKZ algorithms, the enumeration cost is increased because the sequence \((||\mathbf{b}^*_i||,\ldots ,||\mathbf{b}^*_{i+\beta -1}||)\), which mainly affects the computing cost, does not follow the GSA of slope r exactly. Thus, we define the expected enumeration cost in blocksize \(\beta \) as \(\beta \cdot \mathrm {MINCost}(\beta )\). With this expectation, we reset the blocksize as the minimum \(\beta \) satisfying \(\mathrm {ENUMCost}(B_{[i:i+\beta -1]};\alpha ,p) > \beta \cdot \mathrm {MINCost}(\beta )\).

Modifying \((\alpha ,p)\) at Last Indexes: For large indexes such as \(i>n-\beta \), the blocksize of a local block shrinks to \(\beta '=\min (\beta ,n-i+1)\). In our implementation, we increase the success probability to a new \(p'\), while \(\mathrm {ENUMCost}(B_i;\alpha ',p')\) is smaller than \(\beta \cdot \mathrm {MINCost}(\beta )\). We also reset the radius as \(\alpha '=(2/p')^{1/\beta }\) from Eq. (6).

Fig. 6. Our progressive BKZ algorithm (basic variant)

5 Our Proposed Progressive BKZ: Basic Variant

In this section, we explain the basic variant of our proposed progressive BKZ algorithm.

In general, if the blocksize of the BKZ algorithm increases, a shorter vector \(\mathbf{b}_1\) can be computed; however, the running cost will eventually increase. The progressive BKZ algorithm starts a BKZ algorithm with a relatively small blocksize \(\beta _{start}\) and increases the blocksize to \(\beta _{end}\) by some criteria. The idea of the progressive BKZ algorithm has been mentioned in several works, for example, [13, 25, 45, 48]. The research challenge in the progressive BKZ algorithm is to find an effective criterion for increasing blocksizes that minimizes the total running time.

In this paper we employ the full enumeration cost (FEC) in Sect. 2.3, in order to evaluate the quality of the basis for finding the increasing criteria. Recall that the FEC of basis \(B=(\mathbf{b}_1,\ldots ,\mathbf{b}_n)\) of n-dimensional lattice L is defined by \(\mathrm {FEC}(B)=\sum _{k=1}^{n}\frac{V_k(\mathrm {GH}(L))}{\prod _{i=n-k+1}^n \Vert \mathbf{b}_i^* \Vert }\), where \(\Vert \mathbf{b}_i^*\Vert \) represents the GS-lengths. Note that \(\mathrm {FEC}(B)\) eventually decreases after performing several tours of the BKZ algorithm using the fixed blocksize \(\beta \).

Moreover, we construct a simulator that evaluates the GS-lengths by the optimized parameters \(\alpha , p, \beta , r\) for the BKZ algorithm described in the local block discussion in Sect. 4.3. The simulator for an n-dimensional lattice only depends on the blocksize \(\beta \) of the local block; we denote by \(\mathrm{Sim}\text{- }\mathrm{GS}\text{- }\mathrm{lengths}(n,\beta )\) the simulated GS-lengths \((\ell _1,\ldots ,\ell _n)\). The construction of simulator will be presented in Sect. 5.1.

For this purpose, we define some functions on the simulated GS-lengths \((\ell _1,\ldots ,\ell _n)\). \(\mathrm{Sim}\text{- }\mathrm{GH}(\ell _1,\ldots ,\ell _n) = V_{n}(1)^{-1/n} \prod _{j=1}^{n} \ell _j^{1/n}\) is the simulated Gaussian heuristic. The simulated value of full enumeration cost is

$$\begin{aligned} \mathrm{Sim}\text{- }\mathrm{FEC}(\ell _1, \ldots ,\ell _n):=\sum _{k=1}^{n}\frac{V_k(\mathrm{Sim}\text{- }\mathrm{GH}(\ell _1,\ldots ,\ell _n))}{\prod _{i=n-k+1}^n\ \ell _i }. \end{aligned}$$

Further, for \((\ell _1,\ldots ,\ell _n)=\mathrm{Sim}\text{- }\mathrm{GS}\text{- }\mathrm{lengths}(n,\beta )\), we use the notation

\(\mathrm{Sim}\text{- }\mathrm{FEC}(n,\beta ) := \mathrm{Sim}\text{- }\mathrm{FEC}(\ell _1,\ldots ,\ell _n)\) in particular. The simulated enumeration cost \(\mathrm{Sim}\text{- }\mathrm{ENUMCost}(\ell _1,\ldots ,\ell _{\beta };\alpha ,p)\) is defined by \(\mathrm {ENUMCost}(B;\alpha ,p)\) for a lattice basis B that has GS-lengths \(\Vert \mathbf{b}^*_i \Vert =\ell _i\) for \(i\in [\beta ]\).
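The two definitions above translate directly into code. The sketch below assumes \(V_k(R)=V_k(1)\cdot R^k\) with the standard unit-ball volume, and evaluates each term of the sum in the log domain; the function names are ours.

```python
import math
from typing import Sequence


def log_unit_ball_volume(k: int) -> float:
    """Natural log of V_k(1) = pi^(k/2) / Gamma(k/2 + 1)."""
    return 0.5 * k * math.log(math.pi) - math.lgamma(0.5 * k + 1.0)


def sim_gh(lengths: Sequence[float]) -> float:
    """Sim-GH = V_n(1)^(-1/n) * prod(l_j)^(1/n)."""
    n = len(lengths)
    log_gh = (-log_unit_ball_volume(n) + sum(math.log(x) for x in lengths)) / n
    return math.exp(log_gh)


def sim_fec(lengths: Sequence[float]) -> float:
    """Sim-FEC = sum_{k=1..n} V_k(Sim-GH) / prod_{i=n-k+1..n} l_i."""
    n = len(lengths)
    log_gh = math.log(sim_gh(lengths))
    total = 0.0
    log_tail = 0.0  # running log of l_{n-k+1} * ... * l_n
    for k in range(1, n + 1):
        log_tail += math.log(lengths[n - k])
        total += math.exp(log_unit_ball_volume(k) + k * log_gh - log_tail)
    return total
```

Two properties are useful as checks: the value is invariant under scaling all \(\ell _i\) by a common factor, and the term for \(k=n\) always equals 1. For very large costs one would also keep the sum itself in the log domain, which this sketch does not do.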

The key point of our proposed progressive BKZ algorithm is to increase the blocksize \(\beta \) if \(\mathrm {FEC}(B)\) becomes smaller than \(\mathrm{Sim}\text{- }\mathrm{FEC}(n,\beta )\). In other words, we perform the BKZ tours of blocksize \(\beta \) while \(\mathrm {FEC}(B) > \mathrm{Sim}\text{- }\mathrm{FEC}(n,\beta )\). We describe the proposed progressive BKZ in Fig. 6.
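The control flow of this rule can be sketched as follows. This is not a transcription of Fig. 6: the callables `fec`, `sim_fec_target` and `bkz_tour` are hypothetical placeholders for \(\mathrm {FEC}(B)\), \(\mathrm{Sim}\text{- }\mathrm{FEC}(n,\beta )\) and one BKZ tour, and the cap `max_tours` is our own safeguard.

```python
from typing import Callable, TypeVar

Basis = TypeVar("Basis")


def progressive_bkz(basis: Basis,
                    n: int,
                    beta_start: int,
                    beta_end: int,
                    fec: Callable[[Basis], float],
                    sim_fec_target: Callable[[int, int], float],
                    bkz_tour: Callable[[Basis, int], Basis],
                    max_tours: int = 10_000) -> Basis:
    """Run BKZ tours of blocksize beta while FEC(B) > Sim-FEC(n, beta), then raise beta by one."""
    for beta in range(beta_start, beta_end + 1):
        target = sim_fec_target(n, beta)
        tours = 0
        while fec(basis) > target:
            if tours >= max_tours:
                raise RuntimeError("FEC target not reached within max_tours")
            basis = bkz_tour(basis, beta)
            tours += 1
    return basis
```

The point of the design is that the stopping test depends only on a precomputable target per blocksize, so no tour is spent waiting for the basis to stop changing.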

Remark 4

In the basic variant of our progressive BKZ described in this section, we increase the blocksize \(\beta \) in increments of one in Step 2. However, we will present an optimal strategy for increasing the blocksize in Sect. 6.

5.1 \(\mathrm{Sim}\text{- }\mathrm{GS}\text{- }\mathrm{lengths}(n,\beta )\): Predicting Gram-Schmidt Lengths

In the following, we construct a simulator for predicting the Gram-Schmidt lengths \(\Vert \mathbf{b}^*_i\Vert \) obtained from the plain BKZ algorithm of blocksize \(\beta \).

Our simulator consists of two phases. First, we generate approximated GS-lengths using Gaussian heuristics; we then modify it for the first and last indexes of GSA in Sect. 3.1. We will explain how to compute \((\ell _1,\ldots ,\ell _n)\) as the output of \(\mathrm{Sim}\text{- }\mathrm{GS}\text{- }\mathrm{lengths}(n,\beta )\).

First Phase: Our simulator computes the initial value of \((\ell _1,\ldots ,\ell _n)\).

We start from the last index by setting \(\ell _n=1\), and compute \(\ell _i\) backwards. From Eqs. (2) and (7) we are able to simulate the GS-lengths \(\ell _i\) by solving the following equation of \(\ell _i\):

$$\begin{aligned} \ell _i = \max \left\{ \frac{\beta '}{\beta '+1 }\alpha , \tau _{\beta '} \right\} \cdot \mathrm {GH}(\ell _i,\ldots ,\ell _{i+\beta '-1}),\ \mathrm{where}\ \beta '=\min (\beta ,n-i+1). \end{aligned}$$

Here, \(\alpha \) is the optimized radius parameter in Sect. 4.3 and \(\tau _{\beta '}\) is the coefficient of the modified Gaussian heuristic.
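Because \(\ell _i\) appears on both sides of the equation, it helps to solve it in closed form: taking logarithms gives \((\beta '-1)\log \ell _i = \beta '\log c - \log V_{\beta '}(1) + \sum _{j=i+1}^{i+\beta '-1}\log \ell _j\), where c is the maximum in the equation. The sketch below implements this backward recursion. The coefficient \(\tau _{\beta '}\) is defined in Sect. 3.1 and is not reproduced here, so it is passed in as a callable `tau`; this interface and the function names are our assumptions.

```python
import math
from typing import Callable, List


def log_unit_ball_volume(k: int) -> float:
    """Natural log of V_k(1) = pi^(k/2) / Gamma(k/2 + 1)."""
    return 0.5 * k * math.log(math.pi) - math.lgamma(0.5 * k + 1.0)


def first_phase_gs_lengths(n: int, beta: int, alpha: float,
                           tau: Callable[[int], float]) -> List[float]:
    """First-phase simulation: l_n = 1, then l_i computed backwards."""
    if beta < 2 or n < 1:
        raise ValueError("need beta >= 2 and n >= 1")
    log_ell = [0.0] * n  # log l_n = 0, i.e. l_n = 1
    for i in range(n - 2, -1, -1):  # 0-based index; the text uses i + 1
        bp = min(beta, n - i)  # beta' = min(beta, n - i + 1) in 1-based indexing
        coef = max(bp / (bp + 1) * alpha, tau(bp))
        tail = sum(log_ell[i + 1:i + bp])  # logs of the other block members
        log_ell[i] = (bp * math.log(coef) - log_unit_ball_volume(bp) + tail) / (bp - 1)
    return [math.exp(x) for x in log_ell]
```

With a real \(\tau \) from Sect. 3.1 substituted for `tau`, the returned list is the starting point that the second phase then modifies.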

This simple simulation in the first phase is sufficient for smaller blocksizes (\(\beta <30\)). However, for simulating larger blocksizes, we must modify the GS-lengths of the first and last indexes in Sect. 3.1.

Second Phase: To modify the results of the simple simulation, we consider our two modifying methods described in Sect. 4.3. We recall that \(\mathrm {MINCost}(\beta )\) is the standard value of the enumeration cost of blocksize \(\beta \).

We first consider the modification for the last indexes \(i>n-\beta +1\), i.e., a situation in which the blocksize is smaller than \(\beta \). We select the modified probability \(p_i\) at index i so that \(\mathrm{Sim}\text{- }\mathrm{ENUMCost}(\ell _i,\ldots ,\ell _n;\alpha _i,p_i) = \mathrm {MINCost}(\beta )\), where \(\ell _i,\ldots ,\ell _n\) is the result of the first simulation, and we use \(\alpha _i = (2/p_i)^{1/(n-i+1)}\) in accordance with Eq. (6). After all (\(\alpha _i,p_i\)) for \(n-\beta +1\le i \le n\) are fixed, we modify the GS-lengths by solving the following equation of \(\ell _i\) again:

$$\begin{aligned} \ell _i = \max \left\{ \frac{\beta '}{\beta '+1 } {\alpha _i} , \tau _{\beta '} \right\} \cdot \mathrm {GH}(\ell _i,\ldots ,\ell _n)\ \mathrm{where}\ \beta '=n-i+1. \end{aligned}$$

Next, using the modified \((\ell _1,\ldots ,\ell _n)\), we again modify the first indexes as follows. We determine the integer parameter \(b>0\) for the size of enlargement. For \(b=1,2,\ldots \), we reset the blocksize at index i as \(\beta _i := \beta + \max \{ (b-i+1)/2, b-2(i-1) \} \) for \(i \in \{1,\ldots ,b\}\). Using these blocksizes, we recompute the GS-lengths by solving Eq. (13) from \(i=\beta _i\) to 1. Then, we compute \(\mathrm{Sim}\text{- }\mathrm{ENUMCost}(\ell _1,\ldots ,\ell _{\beta +b};\alpha ,p)\). We select the maximum b such that this simulated enumeration cost is smaller than \(2\cdot \mathrm {MINCost}(\beta )\).

Fig. 7. Left figure: Semi-log graph of \(\Vert \mathbf{b}^*_i\Vert \) of reduced random lattices from the SVP Challenge problem generator: Simulation (bold lines) vs. Experiment (small squares). Right figure: The root Hermite factor of reduced random 300-dimensional bases after BKZ-\(\beta \). Simulation (bold red lines) vs. Experiment (thin blue lines) (Color figure online).

Experimental Result of Our GS-lengths Simulator: We performed some experiments on the GS-lengths for some random lattices from the Darmstadt SVP Challenge [50]. We computed the GS-lengths for 120, 150 and 200 dimensions using the proposed progressive BKZ algorithm, with ending blocksizes of 40, 60, and 100, respectively (Note that the starting blocksize is irrelevant to the quality of the GS-lengths). The simulated result is shown in Fig. 7. Almost all small squares of the computed GS-lengths are plotted on the bold line obtained by our above simulation. Our simulator can precisely predict the GS-lengths of these lattices. The progress of the first vector, which uses 300-dimensional lattices, is also shown in the figure.

5.2 Expected Number of BKZ Tours at Step 3

At Step 3 in the proposed algorithm (Fig. 6) we iterate the BKZ tour with blocksize \(\beta \) as long as the full enumeration cost \(\mathrm {FEC}(B)\) is larger than the simulated cost \(\mathrm{Sim}\text{- }\mathrm{FEC}(n,\beta )\). In the following we estimate the expected number of BKZ tours (we denote it as \(\sharp tours\)) at blocksize \(\beta \).

In order to estimate \(\sharp tours\), we first compute \((\ell _1,\ldots ,\ell _n)\) as the output of \(\mathrm{Sim}\text{- }\mathrm{GS}\text{- }\mathrm{lengths}(n,\beta -1)\), and update it by using the modified Chen-Nguyen BKZ 2.0 simulator described in Sect. 3.2, until \(\mathrm{Sim}\text{- }\mathrm{FEC}(\ell _1,\ldots ,\ell _n)\) is smaller than \(\mathrm{Sim}\text{- }\mathrm{FEC}(n,\beta )\). We simulate a BKZ tour by updating the pair \((\ell _i,\ell _{i+1})\) to \((\ell '_i,\ell '_{i+1})\) for \(i=1,\ldots ,n-1\) according to the following rule:

$$\begin{aligned} \begin{array}{ll} &{}\ell '_i = \max \left\{ \frac{\beta }{\beta +1}\alpha , \tau _\beta \right\} \cdot \mathrm {GH}(\ell _i,\ldots ,\ell _{\min (n,i+\beta -1)})\\ \mathrm{and} &{} \ell '_{i+1} = \ell _{i+1}\cdot (\ell _i/\ell '_i). \end{array} \end{aligned}$$
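One simulated tour under this rule can be sketched as below. We assume the pairs are updated sequentially, so that index \(i+1\) sees the value already modified at index i, and we pass the factor \(\max \{ \frac{\beta }{\beta +1}\alpha , \tau _\beta \}\) as a single precomputed argument `coef`; both choices, and the function names, are ours.

```python
import math
from typing import List, Sequence


def log_unit_ball_volume(k: int) -> float:
    """Natural log of V_k(1) = pi^(k/2) / Gamma(k/2 + 1)."""
    return 0.5 * k * math.log(math.pi) - math.lgamma(0.5 * k + 1.0)


def simulate_tour(lengths: Sequence[float], beta: int, coef: float) -> List[float]:
    """Apply the update rule for i = 1, ..., n-1 and return the new GS-lengths."""
    ell = list(lengths)
    n = len(ell)
    for i in range(n - 1):  # 0-based index; the text uses i + 1
        end = min(n, i + beta)  # block is ell[i:end]
        k = end - i
        log_gh = (-log_unit_ball_volume(k)
                  + sum(math.log(x) for x in ell[i:end])) / k
        new_i = coef * math.exp(log_gh)
        ell[i + 1] *= ell[i] / new_i  # keeps the product l_i * l_{i+1} fixed
        ell[i] = new_i
    return ell
```

Each step preserves \(\ell _i\cdot \ell _{i+1}\), so the product of all GS-lengths, i.e. the lattice volume, is unchanged by a simulated tour.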

At the simulation of the t-th BKZ tour, write the input GS-lengths as \((\ell '_1,\ldots ,\ell '_n)\); i.e., the output of the \((t-1)\)-th BKZ tour. We further denote the output of the t-th BKZ tour as \((\ell _1,\ldots ,\ell _n)\). Suppose they satisfy

$$\begin{aligned} \mathrm{Sim}\text{- }\mathrm{FEC}(\ell '_1,\ldots ,\ell '_n)> \mathrm{Sim}\text{- }\mathrm{FEC}(n,\beta ) > \mathrm{Sim}\text{- }\mathrm{FEC}(\ell _1,\ldots ,\ell _n). \end{aligned}$$

Then, our estimation of \(\sharp tours\) is the interpolated value:

$$\begin{aligned} {\sharp tours} = (t-1) + \frac{\mathrm{Sim}\text{- }\mathrm{FEC}(\ell '_1,\ldots ,\ell '_n) - \mathrm{Sim}\text{- }\mathrm{FEC}(n,\beta ) }{\mathrm{Sim}\text{- }\mathrm{FEC}(\ell '_1,\ldots ,\ell '_n) - \mathrm{Sim}\text{- }\mathrm{FEC}(\ell _1,\ldots ,\ell _n)}. \end{aligned}$$
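The interpolation is a one-line computation; the sketch below uses argument names of our choosing and checks the bracketing condition before interpolating.

```python
def interpolated_tours(t: int, fec_before: float, fec_after: float,
                       fec_target: float) -> float:
    """Interpolated number of tours when the target is crossed during tour t.

    fec_before: Sim-FEC of the input of tour t (output of tour t-1)
    fec_after:  Sim-FEC of the output of tour t
    fec_target: Sim-FEC(n, beta)
    """
    if not fec_before > fec_target > fec_after:
        raise ValueError("target must lie strictly between the two FEC values")
    return (t - 1) + (fec_before - fec_target) / (fec_before - fec_after)
```

For example, if the third tour lowers the simulated cost from 10 to 4 and the target is 7, the estimate is 2.5 tours.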

Note that this estimation can also be used for other BKZ strategies, although here we estimate the number of BKZ tours needed to go from a BKZ-\((\beta -1)\) basis to a BKZ-\(\beta \) basis using the BKZ-\(\beta \) algorithm. We will estimate the tours for other combinations of starting and ending blocksizes, and use them in the algorithm.

6 Our Progressive BKZ: Optimizing Blocksize Strategy

We propose how to optimally increase the blocksize \(\beta \) in the proposed progressive BKZ algorithm. Several heuristic strategies for increasing the blocksizes have been proposed. The following sequences of blocksizes after LLL-reduction have been used in the literature:

$$\begin{aligned} \begin{array}{ccccccl} 20 &{} \rightarrow 21 &{} \rightarrow 22 &{} \rightarrow 23 &{} \rightarrow 24 &{} \rightarrow \cdots &{} \text { Gama and Nguyen}~ [19] \\ 2 &{} \rightarrow 4 &{} \rightarrow 8 &{} \rightarrow 16 &{} \rightarrow 32 &{} \rightarrow \cdots &{} \text { Schnorr and Shevchenko}~ [48], \\ 2 &{} \rightarrow 4 &{} \rightarrow 6 &{}\rightarrow 8 &{}\rightarrow 10 &{}\rightarrow \cdots &{} \text { Haque, Rahman, and Pieprzyk}~ [25], \\ 50 &{}\rightarrow 60 &{}\rightarrow 70 &{}\rightarrow 80 &{}\rightarrow 90 &{}\rightarrow \cdots &{} \text { Chen and Nguyen}~ [13,14] \\ \end{array} \end{aligned}$$

The timings for changing to the next blocksize were not explicitly given. These strategies sometimes continue the BKZ tours until no update occurs, as in the original BKZ. In this section we try to find the sequence of the blocksizes that minimizes the total cost of the progressive BKZ to find a BKZ-\(\beta \) reduced basis. To find this strategy, we consider all the possible combinations of blocksizes used in our BKZ algorithm and the timing to increase the blocksizes.

Notations on Blocksize Strategy: We say a lattice basis B of dimension n is \(\beta \)-reduced when \(\mathrm {FEC}(B)\) is smaller than \(\mathrm{Sim}\text{- }\mathrm{FEC}(n,\beta )\). For a tuple of blocksizes \((\beta ^{alg},\beta ^{start},\beta ^{goal})\) satisfying \(2\le \beta ^{start} < \beta ^{goal} \le \beta ^{alg}\), the notation

$$\begin{aligned} \beta ^{start} {\,\mathop {\rightarrow }\limits ^{{\beta ^{alg}}}\,}\beta ^{goal} \end{aligned}$$

denotes the following BKZ process. The input is a \(\beta ^{start}\)-reduced basis B, and the algorithm updates B using the tours of BKZ-\(\beta ^{alg}\) algorithm with parameters in Sect. 4.3. It stops when \(\mathrm {FEC}(B) < \mathrm{Sim}\text{- }\mathrm{FEC}(n,\beta ^{goal})\).

\(\mathrm {TimeBKZ}(n,\beta ^{start} {\,\mathop {\rightarrow }\limits ^{{\beta ^{alg}}}\,}\beta ^{goal})\) is the computing time in seconds of this algorithm. We provide a concrete simulating procedure in this and the next sections. We assume that \(\mathrm {TimeBKZ}\) is a function of \(n,\beta ^{alg},\beta ^{start}\) and \(\beta ^{goal}\).

To obtain a BKZ-\(\beta \) reduced basis from an LLL reduced basis, many blocksize strategies are considered as follows:

$$\begin{aligned} \beta ^{goal}_0=\mathrm{LLL}{\,\mathop {\rightarrow }\limits ^{{\beta _1^{alg}}}\,}\beta _1^{goal} {\,\mathop {\rightarrow }\limits ^{{\beta _2^{alg}}}\,}\beta _2^{goal} {\,\mathop {\rightarrow }\limits ^{{\beta _3^{alg}}}\,}\cdots {\,\mathop {\rightarrow }\limits ^{{\beta _D^{alg}}}\,}\beta _D^{goal} (=\beta ). \end{aligned}$$

We denote this sequence as \(\{ (\beta _j^{alg},\beta ^{goal}_j) \}_{j=1,\ldots ,D}\), and regard it as the progressive BKZ given in Fig. 8.

Fig. 8. Our progressive BKZ algorithm with blocksize strategy

6.1 Optimizing Blocksize Strategies

Our goal in this section is to find the optimal sequence that minimizes the total computing time

$$\begin{aligned} \sum _{i=1}^{D} \mathrm {TimeBKZ}(n,\beta ^{goal}_{i-1} {\,\mathop {\rightarrow }\limits ^{{\beta _i^{alg}}}\,}\beta _i^{goal}) \end{aligned}$$

of the progressive BKZ algorithm to find a BKZ-\(\beta ^{goal}_D\) basis.

Based on our experimental results, which are given in Sect. 7, we can estimate the computing time of the BKZ algorithm:

$$\begin{aligned} \begin{array}{l} \displaystyle \mathrm {TimeBKZ}(n,\beta ^{start} {\,\mathop {\rightarrow }\limits ^{{\beta ^{alg}}}\,}\beta ^{goal})\ \mathrm{[sec.]} \\ \displaystyle = \sum _{t=1}^{\sharp tours} \Big [ 1.5\cdot 10^{-10} \cdot (\beta ^{alg})^2 n^3 + 1.5\cdot 10^{-8} \cdot \beta ^{alg} \sum _{i=1}^{n-1} \mathrm{ENUMCost}(B_i;\alpha ,p) \Big ] \end{array} \end{aligned}$$

when dimension n is small (\(n < 400\)), and

$$\begin{aligned} \begin{array}{l} \displaystyle \mathrm {TimeBKZ}(n,\beta ^{start} {\,\mathop {\rightarrow }\limits ^{{\beta ^{alg}}}\,}\beta ^{goal})\ \mathrm{[sec.]} \\ \displaystyle = \sum _{t=1}^{\sharp tours} \Big [ 2.5 \cdot 10^{-4} \cdot \frac{n-\beta ^{alg}}{250-\beta ^{alg}} \cdot n^2 + 3.0\cdot 10^{-8}\cdot \beta ^{alg} \sum _{i=1}^{n-1} \mathrm{ENUMCost}(B_i;\alpha ,p) \Big ] \end{array} \end{aligned}$$

when dimension n is large (\(n \ge 400\)). The two formulas differ because the implementations use different floating-point types to compute the Gram-Schmidt variables. The former and latter implementations employ \(\mathtt{quad\_float}\) and \(\mathtt{RR}\) (320 bits), respectively, where \(\mathtt{RR}\) is the arbitrary precision floating point type in the NTL library [49]. To compute \(\sharp tours\) we use the procedure in Sect. 5.2. The input of the \(\mathrm {ENUMCost}\) function is from \(\mathrm{Sim}\text{- }\mathrm{GS}\text{- }\mathrm{lengths}(n,\beta ^{start})\) at the first tour. From the second tour on, we use the GS-lengths updated by the Chen-Nguyen simulator with blocksize \(\beta ^{alg}\).
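The summand of the two estimates, i.e. the time of a single tour, can be sketched as follows. The function name is ours, and the argument `enum_cost_sum` stands for \(\sum _{i=1}^{n-1}\mathrm {ENUMCost}(B_i;\alpha ,p)\), which is assumed to be computed elsewhere.

```python
def tour_seconds(n: int, beta_alg: int, enum_cost_sum: float) -> float:
    """Estimated seconds for one BKZ tour with blocksize beta_alg in dimension n."""
    if n < 400:
        # Small-dimension model (quad_float implementation).
        reduction = 1.5e-10 * beta_alg ** 2 * n ** 3
        enumeration = 1.5e-8 * beta_alg * enum_cost_sum
    else:
        # Large-dimension model (RR implementation); the fraction
        # (n - beta)/(250 - beta) is only meaningful for beta_alg < 250.
        if beta_alg >= 250:
            raise ValueError("large-dimension model requires beta_alg < 250")
        reduction = 2.5e-4 * (n - beta_alg) / (250 - beta_alg) * n ** 2
        enumeration = 3.0e-8 * beta_alg * enum_cost_sum
    return reduction + enumeration
```

Summing this quantity over the \(\sharp tours\) simulated tours, with the enumeration costs recomputed from the updated GS-lengths at each tour, gives \(\mathrm {TimeBKZ}\) for one step of the strategy.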

Using these computing time estimations, we discuss how to find the optimal blocksize strategy (15) that minimizes the total computing time. In this optimizing procedure, the input consists of n and \(\beta \), the lattice dimension and the goal blocksize. We denote by \(\mathrm {TimeBKZ}(n,\beta ^{goal})\) the minimized time in seconds to find a \(\beta \)-reduced basis from an LLL reduced basis, that is, the minimum of (16) among the possible blocksize strategies. By definition, we have

$$\begin{aligned} \mathrm {TimeBKZ}(n,\beta ^{goal}) = \min _{\beta ',\beta ^{alg}} \Big \{ \mathrm {TimeBKZ}(n,\beta ') + \mathrm {TimeBKZ}(n,\beta ' {\,\mathop {\rightarrow }\limits ^{{\beta ^{alg}}}\,}\beta ^{goal} ) \Big \} \end{aligned}$$

where we take the minimum over the pair of blocksizes \((\beta ',\beta ^{alg})\) satisfying \(\beta ' < \beta ^{goal} \le \beta ^{alg}\).

For the given \((n,\beta )\), our optimizing algorithm computes \(\mathrm {TimeBKZ}(n,\bar{\beta })\) from small \(\bar{\beta }\) to the target \(\bar{\beta }= \beta \). As the base case, we define that \(\mathrm {TimeBKZ}(n,20)\) represents the time to compute a BKZ-20 reduced basis using a fixed blocksize, starting from an LLL reduced basis:

$$\begin{aligned} \mathrm {TimeBKZ}(n,20) := \min _{\beta ^{alg}} \Big \{ \mathrm {TimeBKZ}(n, \mathrm{LLL} {\,\mathop {\rightarrow }\limits ^{{\beta ^{alg}}}\,}20) \Big \}. \end{aligned}$$
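The recurrence and its base case form a dynamic program over the goal blocksize. The sketch below assumes a callable `step_time(start, alg, goal)` that returns \(\mathrm {TimeBKZ}(n,\beta ^{start} \rightarrow \beta ^{goal})\) using BKZ-\(\beta ^{alg}\) for a fixed n, with `start=None` standing for an LLL reduced basis. This interface, the upper bound `beta_alg_max` on \(\beta ^{alg}\), and the function names are our assumptions.

```python
import math
from typing import Callable, Dict, List, Optional, Tuple

StepTime = Callable[[Optional[int], int, int], float]


def optimize_strategy(beta_goal: int, beta_alg_max: int, step_time: StepTime,
                      base: int = 20) -> Tuple[float, List[Tuple[int, int]]]:
    """Return (minimal total time, [(beta_alg_j, beta_goal_j), ...])."""
    if not base <= beta_goal <= beta_alg_max:
        raise ValueError("need base <= beta_goal <= beta_alg_max")
    best: Dict[int, float] = {}
    choice: Dict[int, Tuple[Optional[int], int]] = {}
    # Base case: reach the base blocksize directly from an LLL reduced basis.
    best[base], alg0 = min((step_time(None, a, base), a)
                           for a in range(base, beta_alg_max + 1))
    choice[base] = (None, alg0)
    # Recurrence: minimise over the previous goal and the algorithm blocksize.
    for goal in range(base + 1, beta_goal + 1):
        best[goal] = math.inf
        for prev in range(base, goal):
            for alg in range(goal, beta_alg_max + 1):
                t = best[prev] + step_time(prev, alg, goal)
                if t < best[goal]:
                    best[goal] = t
                    choice[goal] = (prev, alg)
    # Walk the stored choices backwards to recover the strategy.
    strategy: List[Tuple[int, int]] = []
    g: Optional[int] = beta_goal
    while g is not None:
        prev, alg = choice[g]
        strategy.append((alg, g))
        g = prev
    strategy.reverse()
    return best[beta_goal], strategy
```

The number of `step_time` evaluations grows quadratically in the goal blocksize and linearly in `beta_alg_max`, and each evaluation involves a tour simulation, so the precomputation is heavy.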

6.2 Simulating Time to Find Short Vectors in Random Lattices

In this section, we give our simulating result of finding short vectors for random lattices. For the given lattice dimension n and the target length, we simulate the necessary BKZ blocksize \(\beta \) so that \(\ell _1\) of \(\mathrm{Sim}\text{- }\mathrm{GS}\text{- }\mathrm{lengths}(n,\beta )\) is smaller than the target length. Then, we simulate \(\mathrm {TimeBKZ}(n,\beta )\) by using the method in Sect. 6.1.

As an example, in Table 2, we show the optimized blocksize strategy and computing time to find a 102-reduced basis in dimension \(n=600\). We estimate that blocksize 102 is necessary to find a vector shorter than \(n\cdot \det (L)^{1/n}\), which is the condition to enter the Hall of Fame in the Approximate Ideal Lattice Challenge [50].

Table 2. The optimized blocksize strategy and computational time in seconds in 600-dimensional lattice.
Table 3. Simulated \(\log _2(\)Time [sec.]) of our algorithm and BKZ 2.0 for large dimensions to find short vectors. The time is after LLL-reduced basis. Because the estimate for BKZ 2.0 is only the cost for enumeration, our algorithm appears to be slow in small blocksizes.

Table 3 shows the blocksize and predicted total computing time in seconds to find a vector shorter than \(n\cdot \mathrm {GH}(L)\) (this corresponds to the n-approximate SVP from the learning with errors problem [41]), \(n\cdot \det (L)^{1/n}\) (from the Approximate Ideal Lattice Challenge published in Darmstadt [50]), and \(\sqrt{n}\cdot \mathrm {GH}(L)\). For comparison, the simulating result of BKZ 2.0 is given to find \(n\cdot \det (L)^{1/n}\). Recall that their estimated cost in seconds is given by \(\sharp \mathrm{ENUM} / (2\cdot 10^7)\). From Table 3, our algorithm is asymptotically faster than BKZ 2.0.

6.3 Comparing with Other Heuristic Blocksize Strategies

In this section, we compare the blocksize strategy of our progressive BKZ in Fig. 8 with other heuristic strategies. Using a random 256-dimensional basis, we experimented and simulated the progressive BKZ to find a BKZ-128 reduced basis with the three following strategies:

                                   (Schnorr-Shevchenko’s doubling strategy [48])

                                   (Simplest step-by-step in Fig. 6)

                                   (Optimized blocksize strategy in Fig. 8)

In our experiment, our simple and optimized strategies take about 27.1 min and about 11.5 min, respectively, to achieve a BKZ-64 basis after the LLL reduction. On the other hand, Schnorr-Shevchenko’s doubling strategy takes about 21 min.

After that, the doubling strategy switches to BKZ-128 and takes about 14 single-core days to process the first index alone, while our strategies comfortably continue the execution of progressive BKZ.

Our simulator predicts that it takes about \(2^{25.3}\), \(2^{25.1}\) and \(2^{37.3}\) s to finish BKZ-128 by our simple, optimized, and Schnorr-Shevchenko’s doubling strategy, respectively. Our strategy is about 5000 times faster than the doubling strategy.

Interestingly, we find that the computing time of the simple blocksize strategy is close to that of the optimized strategy in many simulations when the blocksize is larger than about 100. Hence, the simple blocksize strategy would be preferable to the optimized blocksize strategy in practice, because the latter needs heavy precomputation as in Sect. 6.1.

7 Our Implementation and Cost Estimation for Processing Local Blocks

In this section we describe how to derive the estimation of the computing times of Eqs. (17) and (18) of Step 3–10 in Fig. 6. Note that, due to the page limitation, we omit most of the detailed description; see the full version [8].

The total computing time is the sum of times to process local blocks (corresponds to Step 5–8 in Fig. 6):

$$\begin{aligned} \begin{array}{l} \mathrm {TimeBKZ}(n,\beta ^{start} {\,\mathop {\rightarrow }\limits ^{{\beta ^{alg}}}\,}\beta ^{goal}) = \\ \displaystyle \sum _{t=1}^{\sharp tours} \sum _{i=1}^{n-1} \Big [ \text {Time of processing local block } B_i \text { with parameters } (\alpha ,p) \Big ]. \end{array} \end{aligned}$$

We constructed our model of computing time for small dimensional lattices (\(dim<400\)) as follows.

$$\begin{aligned} \begin{array}{l} \displaystyle Time_{Sim\text {-}small}(dim,\beta ,A_1,W_1) = \\ \displaystyle \sum _{\beta ^{start}}^{\beta ^{goal}} \sum _{t=1}^{\sharp tours} \Big [ A_1 \cdot \beta ^2 n^3 + W_1\cdot \beta \sum _{i=1}^{n-1} \mathrm{ENUMCost}(B_i;\alpha ,p)\ \Big ] \text {[sec.]}. \end{array} \end{aligned}$$

For large dimensions (\(dim\ge 400\)), the model is

$$\begin{aligned} \begin{array}{l} \displaystyle Time_{Sim\text {-}large}(dim,\beta ,A_2,W_2) = \\ \displaystyle \sum _{\beta ^{start}}^{\beta ^{goal}} \sum _{t=1}^{\sharp tours} \Big [A_2\cdot \frac{n-\beta }{H-\beta } \cdot Hn^2 + W_2\cdot \beta \sum _{i=1}^{n-1} \mathrm{ENUMCost}(B_i;\alpha ,p) \Big ] \text {[sec.]}. \end{array} \end{aligned}$$

Fig. 9. Result of our parameter fitting for cost estimation. Left Figure: implementation for small dimensional lattices. Right Figure: implementation for large dimensional lattices. In both graphs, experimental results are plotted by small squares and the simulating results are drawn in bold lines.

In this section, we conduct the computer experiments with the simple blocksize strategy:

$$\begin{aligned} 2 {\,\mathop {\rightarrow }\limits ^{{20}}\,}20 {\,\mathop {\rightarrow }\limits ^{{21}}\,}21 {\,\mathop {\rightarrow }\limits ^{{22}}\,}22 {\,\mathop {\rightarrow }\limits ^{{23}}\,}23 {\,\mathop {\rightarrow }\limits ^{{24}}\,}24 {\,\mathop {\rightarrow }\limits ^{{25}}\,}\cdots \end{aligned}$$

using a lattice generated by the SVP challenge problem generator, and then we estimate the undetermined coefficients \(W_1\), \(W_2\), \(A_1\) and \(A_2\) from the experimental computing times after BKZ-55, i.e., \(\beta ^{start}=55\).

We find the suitable coefficients \((A_1, W_1)\) by using the standard curve fitting method in semi-log scale, which minimize

$$\begin{aligned} \begin{array}{l} \displaystyle \sum _{dim\in \{ 200,300\}} \sum _{\beta =55} \left| \log \Big ( T(dim,\beta ,A_1,W_1 )\Big ) - \log \Big (Time_{Exp}(dim,\beta ) \Big ) \right| ^2 \end{array}, \end{aligned}$$

where \(T(dim, \beta , A_1, W_1) = Time_{Sim{\text{- }}small}(dim, \beta , A_1, W_1)\) in the small dimensional situation. For the large dimensional situation, we use the result of \(dim \in \{ 600,800\}\) to fix \(A_2\) and \(W_2\).

We find suitable coefficients

$$\begin{aligned} \begin{array}{ll} A_1 = 1.5\cdot 10^{-10} &{} \mathrm{and} \ W_1= 1.5\cdot 10^{-8} \\ A_2 = 10^{-6} &{} \mathrm{and} \ W_2= 3.0\cdot 10^{-8}. \end{array} \end{aligned}$$

The fitting results are given in Fig. 9. Using the Eqs. (20) and (21) with the above coefficients (22), we can estimate the computing times of our progressive BKZ algorithm.

8 Pre/Post-Processing the Entire Basis

In this section, we consider an extended strategy that enhances the speed of our progressive BKZ by pre/post-processing the entire basis.

In pre-processing we first generate a number of randomized bases from the input basis. Each basis is then reduced by using the proposed progressive BKZ algorithm. Finally, in the post-processing, we perform the enumeration algorithm for each reduced basis with some low probability. This strategy is essentially the same as the extreme pruning technique [20]. However, it is important to note that we do not generate a randomized basis inside the progressive BKZ. Our simulator for the proposed progressive BKZ is so precise that we can also estimate the speedup by the pre/post-processing using our simulator.

8.1 Algorithm for Finding Nearly Shortest Vectors

In the following, we construct an algorithm for finding a vector shorter than \(\gamma \cdot \mathrm {GH}(L)\) with a reasonable probability using the strategy above, and we analyze the total computing time using our simulator for the BKZ algorithm.

Concretely, for a given lattice basis B of dimension n, the pre-processing part generates M randomized bases \(B_i = U_i B\) by multiplying unimodular matrices \(U_i\) for \(i = 1,\ldots ,M\). Next, we apply our progressive BKZ to each \(B_i\) to obtain a BKZ-\(\beta \) reduced basis. The cost of obtaining the randomized reduced bases is estimated by \(M\cdot (\mathrm{TimeRandomize}(n) + \mathrm {TimeBKZ}(n,\beta ))\). Here, \(\mathrm{TimeRandomize}\) includes the cost of generating a random unimodular matrix and of the matrix multiplication, which is in general negligible compared with \(\mathrm {TimeBKZ}\). Thus we assume that the computational cost of lattice reduction is \(M\cdot \mathrm {TimeBKZ}(n,\beta )\).
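The randomization step \(B_i = U_i B\) can be illustrated with the short Python sketch below. The text does not specify how the unimodular matrices are sampled, so the construction here (a product of random elementary row operations) is only an assumption, and the function names `random_unimodular`, `mat_mul` and `det_int` are hypothetical helpers introduced for the example.

```python
import random

def random_unimodular(n, num_ops=None, seed=None):
    """Integer matrix with determinant +1 or -1 (requires n >= 2).

    Built by applying random elementary row operations to the identity;
    this sampling method is an illustrative assumption.
    """
    rng = random.Random(seed)
    U = [[int(i == j) for j in range(n)] for i in range(n)]
    for _ in range(4 * n if num_ops is None else num_ops):
        i, j = rng.sample(range(n), 2)
        op = rng.choice(("add", "swap", "negate"))
        if op == "add":        # row_i <- row_i + c * row_j, c in {-1, +1}
            c = rng.choice((-1, 1))
            U[i] = [a + c * b for a, b in zip(U[i], U[j])]
        elif op == "swap":     # exchange two rows
            U[i], U[j] = U[j], U[i]
        else:                  # flip the sign of one row
            U[i] = [-a for a in U[i]]
    return U

def mat_mul(A, B):
    """Exact integer matrix product A * B."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

def det_int(M):
    """Exact integer determinant via fraction-free (Bareiss) elimination."""
    M = [row[:] for row in M]
    n = len(M)
    sign, prev = 1, 1
    for k in range(n - 1):
        if M[k][k] == 0:
            for r in range(k + 1, n):
                if M[r][k] != 0:
                    M[k], M[r] = M[r], M[k]
                    sign = -sign
                    break
            else:
                return 0
        for i in range(k + 1, n):
            for j in range(k + 1, n):
                M[i][j] = (M[i][j] * M[k][k] - M[i][k] * M[k][j]) // prev
        prev = M[k][k]
    return sign * M[n - 1][n - 1]
```

Because \(\det U_i = \pm 1\), the rows of \(U_i B\) generate the same lattice as the rows of B, which `det_int` lets one check on small examples.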

Finally, in the post-processing part, we execute the standard enumeration algorithm with the search radius parameter \(\alpha =\gamma \) and probability parameter \(p=2\cdot \gamma ^{-n}/M\). By an argument similar to that in Sect. 4.1, there exist about \(\gamma ^n /2\) short vector pairs in \(\mathrm{Ball}_n(\gamma \cdot \mathrm {GH}(L))\). Therefore, the probability that one enumeration finds the desired vector is about \((\gamma ^n /2) \cdot (2\cdot \gamma ^{-n}/M) = 1/M\), and the total probability of success over the M bases is \(1-(1-1/M)^M \approx 0.632\).
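The probability argument above can be checked numerically with the following Python sketch. The function names are hypothetical and the code only restates the arithmetic of the paragraph; it assumes the M enumerations succeed independently, as the formula \(1-(1-1/M)^M\) does.

```python
import math

def per_basis_probability(gamma, n, M):
    """Expected number of short vector pairs times the pruning probability p."""
    expected_pairs = gamma ** n / 2.0
    p = 2.0 * gamma ** (-n) / M
    return expected_pairs * p          # equals 1/M up to rounding

def total_success_probability(M):
    """Probability that at least one of M independent trials succeeds."""
    return 1.0 - (1.0 - 1.0 / M) ** M

for M in (1, 10, 100, 1000):
    print(M, per_basis_probability(1.05, 120, M), total_success_probability(M))
```

As M grows, the total success probability decreases towards \(1 - 1/e\), which is the value 0.632 quoted in the text.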

Consequently, the total computing cost in our model is

$$\begin{aligned} M\cdot \left( \mathrm {TimeBKZ}(n,\beta ) + \frac{\mathrm {ENUMCost}(B;\gamma ,p=2\cdot \gamma ^{-n}/M)}{ 6\cdot 10^7}\right) \ \mathrm{[sec.]}, \end{aligned}$$

where \(\mathrm {TimeBKZ}(n, \beta )\) and \(\mathrm {ENUMCost}(B;\gamma ,p)\) are defined in Sect. 6.1 and Sect. 2.3, respectively. We can optimize this total cost by finding the minimum of formula (23) over the parameters \((\beta ,M)\). Here, note that the constant \(6\cdot 10^7\) comes from our best benchmarking record of lattice enumeration. In Table 4, we provide the detailed simulation results for \(\gamma =1.05\) to analyze the hardness of the Darmstadt SVP Challenge in several dimensions. A comparison with previous works is given in Sect. 9 (see line C in Fig. 10).
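The minimization of formula (23) over \((\beta ,M)\) can be sketched as an exhaustive search in Python. The simulator itself is not reproduced here, so `time_bkz(n, beta)` and `enum_cost(n, beta, gamma, p)` are hypothetical callables standing in for \(\mathrm {TimeBKZ}\) and for \(\mathrm {ENUMCost}\) evaluated on the simulated BKZ-\(\beta \) reduced basis; the candidate grids `betas` and `Ms` are also assumptions.

```python
def total_cost_seconds(n, beta, M, gamma, time_bkz, enum_cost,
                       nodes_per_sec=6e7):
    """Evaluate formula (23) for one parameter pair (beta, M)."""
    p = 2.0 * gamma ** (-n) / M        # per-basis probability parameter
    return M * (time_bkz(n, beta)
                + enum_cost(n, beta, gamma, p) / nodes_per_sec)

def optimize_beta_M(n, gamma, betas, Ms, time_bkz, enum_cost):
    """Return (cost, beta, M) minimizing formula (23) over the given grids."""
    return min(
        (total_cost_seconds(n, b, M, gamma, time_bkz, enum_cost), b, M)
        for b in betas for M in Ms
    )
```

A larger \(\beta \) raises the reduction cost but lowers the enumeration cost, while a larger M lowers the per-basis probability p at the price of more reductions; the search simply balances these terms.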

Table 4. The cost of solving SVP Challenge using our optimal blocksize strategy

8.2 Lower Bound of the Cost by an Idealized Algorithm

Here we discuss the lower bound of the total computing cost of the proposed progressive BKZ algorithm (or other reduction algorithm) with the pre/post-processing.

The total cost is estimated as the sum of the computational times for the randomization, the progressive BKZ algorithm, and the enumeration algorithm under the following extremely idealized assumptions. Note that we believe that they go beyond the most powerful cryptanalysis achievable in the future, and thus we say that this is the lower bound in our model.

(a) The cost for the randomization becomes negligibly small. The algorithm for randomizing the basis need not be limited to multiplying by random unimodular matrices, and we could find an ideal randomization at a negligibly small cost. Thus, \(\mathrm{TimeRandomize}(n)=0\).

(b) The cost for the progressive BKZ algorithm does not become lower than that of computing the Gram-Schmidt lengths. Even if the progressive BKZ algorithm were ideally improved, we would still need the Gram-Schmidt basis computation used for the enumeration algorithm or the LLL algorithm. The computation of the Gram-Schmidt basis (even if it is performed approximately using floating-point operations with sufficient precision) requires \(\varTheta (n^3)\) floating-point arithmetic operations via the Cholesky factorization algorithm (see, for example, [38, Chapter 5]). A modern CPU can perform a floating-point operation in one clock cycle, and it can work at about 4.0 GHz. Thus, we assume that the lower bound of the time in seconds is \((4.0\cdot 10^9)^{-1} \cdot n^3\).

(c) The reduced basis obtained by the progressive BKZ (or other reduction algorithm) becomes ideally reduced. We define the simulated \(\gamma \) -approximate HKZ basis \(B_{\gamma \text{- }HKZ}\) by a basis satisfying

$$\begin{aligned} ||\mathbf{b}^*_i|| = \tau _{n-i+1} \mathrm {GH}(L_{[i:n]})\ \mathrm{for}\ i=2,\ldots ,n\ \mathrm{and}\ ||\mathbf{b}_1||=\gamma \cdot \mathrm {GH}(L). \end{aligned}$$

For any fixed \(\gamma \) and p, we assume this basis minimizes the cost for enumeration over any basis satisfying \(||\mathbf{b}_1|| \ge \gamma \cdot \mathrm {GH}(L)\).

Therefore, the lower bound of the total cost of the idealized algorithm in seconds is given by

$$\begin{aligned} \min _{M\in \mathbb {N}} M\cdot \left( (4.0\cdot 10^9)^{-1} \cdot n^3 + \frac{ENUMCost( B_{\gamma {\text{- }}HKZ};\alpha ,p/M)}{6\cdot 10^7} \right) . \end{aligned}$$

Setting \(\gamma =1.05\), we analyze the lower bound of the cost of entering the SVP Challenge (see line D in Fig. 10).
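The idealized bound can be evaluated along the same lines. In the Python sketch below, `enum_cost_hkz(n, gamma, p)` is a hypothetical callable standing in for \(\mathrm {ENUMCost}\) on the simulated \(\gamma \)-approximate HKZ basis, and the sketch assumes the same parameter choice as in formula (23), namely radius parameter \(\gamma \) and per-basis probability \(2\cdot \gamma ^{-n}/M\); neither is taken from an actual implementation.

```python
def gs_floor_seconds(n, flops_per_sec=4.0e9):
    """Assumption (b): n^3 floating-point operations at 4.0 GHz."""
    return n ** 3 / flops_per_sec

def lower_bound_seconds(n, gamma, Ms, enum_cost_hkz, nodes_per_sec=6e7):
    """Return (cost, M) minimizing the idealized total cost over Ms."""
    best = None
    for M in Ms:
        p = 2.0 * gamma ** (-n) / M    # assumed per-basis probability
        cost = M * (gs_floor_seconds(n)
                    + enum_cost_hkz(n, gamma, p) / nodes_per_sec)
        if best is None or cost < best[0]:
            best = (cost, M)
    return best
```

For example, `gs_floor_seconds(120)` is \(120^3/(4.0\cdot 10^9) = 4.32\cdot 10^{-4}\) seconds, so in this model the reduction term only matters when M is very large.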

9 Simulation Results for SVP Challenge and Comparison

In this section, we give our simulation results using our proposed progressive BKZ algorithm together with the pre/post-processing strategy in Sect. 8.1 for solving the Darmstadt SVP Challenge [50], which asks for a vector shorter than \(1.05 \cdot \mathrm {GH}(L)\) in a random lattice L of dimension n.

We also simulate the cost estimation of Lindner and Peikert [32] and that of Chen and Nguyen [13] in the same model. The summary of our simulation results and the latest records published in the SVP Challenge are given in Fig. 10. The outlines of our estimations A to D in Fig. 10 are given below.

From our simulation, the proposed progressive BKZ algorithm is about 50 times faster than BKZ 2.0 and about 100 times slower than the idealized algorithm that achieves the lower bound in our model of Sect. 8.2.

Fig. 10. Comparing cost in seconds. A: Lindner-Peikert estimation, B: Chen-Nguyen’s BKZ 2.0 simulation, C: Simulating estimation of our randomized BKZ-then-ENUM algorithm, D: Lower bound in the randomized BKZ-then-ENUM strategy. Records in the SVP Challenge are indicated by the black circles “\(\bullet \)”, and our experimental results are indicated by the white circles “\(\circ \)”.

A: Lindner-Peikert’s Estimation [32]: From the experiments using the BKZ implementation in the NTL library [49], they estimated that the BKZ algorithm can find a short vector of length \(\delta ^n \det (L)^{1/n}\) in \(2^{1.8/\log _2(\delta ) -110}\) [sec.] in the n-dimensional lattice. The computing time of Lindner-Peikert’s model becomes

$$\begin{aligned} Time_\mathrm{LP}= 2^{1.8/\log _2(\delta ) -110}\ \mathrm{with}\ \delta =1.05^{1/n}\cdot V_n(1)^{-1/n^2}, \end{aligned}$$

because this \(\delta \) attains \(1.05 \cdot \mathrm {GH}(L)=\delta ^n \det (L)^{1/n}\).
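The Lindner-Peikert estimate above is a closed formula and can be evaluated directly. The Python sketch below uses the standard identities \(V_n(1)=\pi ^{n/2}/\varGamma (n/2+1)\) and \(\mathrm {GH}(L)=V_n(1)^{-1/n}\det (L)^{1/n}\); the function names are hypothetical, and the log-gamma function is used only to avoid overflow in high dimensions.

```python
import math

def log_unit_ball_volume(n):
    """Natural log of V_n(1) = pi^(n/2) / Gamma(n/2 + 1)."""
    return 0.5 * n * math.log(math.pi) - math.lgamma(0.5 * n + 1.0)

def lp_delta(n, gamma=1.05):
    """delta = gamma^(1/n) * V_n(1)^(-1/n^2), so that
    delta^n * det(L)^(1/n) = gamma * GH(L)."""
    return math.exp(math.log(gamma) / n - log_unit_ball_volume(n) / n ** 2)

def lp_time_seconds(n, gamma=1.05):
    """Time_LP = 2^(1.8 / log2(delta) - 110) in seconds."""
    return 2.0 ** (1.8 / math.log2(lp_delta(n, gamma)) - 110.0)

for n in (100, 120, 140, 160):
    print(n, lp_delta(n), lp_time_seconds(n))
```

Since the required \(\delta \) approaches 1 as n grows, the exponent \(1.8/\log _2(\delta )\) and hence the estimated time increase quickly with the dimension.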

B: Chen-Nguyen’s BKZ 2.0 [13, 14]: We estimated the cost of BKZ 2.0 using the simulator in Sect. 3.2. Following the original paper [13], we assume that the blocksize is fixed, and the estimation is the minimum of (4) over all possible pairs of the blocksize \(\beta \) and the number t of tours. To convert the number of nodes into single-threaded time, we divide it by \(2\cdot 10^7\).

C: Our Estimation: We searched the minimum cost using the estimation (23) over M and \(\beta \) with setting \(\gamma =1.05\).

D: Lower Bound in Our Model: We searched the minimum cost using the estimation (24) over M with setting \(\gamma =1.05\).

Records of SVP Challenge: From the hall of fame in the SVP Challenge [50] and the reporting paper [18], we list in Fig. 10 the records that report the computing time with a single thread, as black circles “\(\bullet \)”. Moreover, we performed experiments on our proposed progressive BKZ algorithm using the pre/post-processing strategy in Sect. 8.1 up to 123 dimensions; these results are indicated by the white circles “\(\circ \)” in Fig. 10.

10 Conclusions and Future Work

We proposed an improved progressive BKZ algorithm with optimized parameters and a block-increasing strategy. We also gave a simulator that can precisely predict the Gram-Schmidt lengths computed using the proposed progressive BKZ. We also presented an efficient implementation of the enumeration algorithm and the LLL algorithm, and the total cost of the proposed progressive BKZ algorithm was precisely evaluated by the sharp simulator.

Moreover, we showed a comparison with other algorithms by simulating the cost of solving the instances from the Darmstadt SVP Challenge. Our progressive BKZ algorithm is about 50 times faster than the BKZ 2.0 proposed by Chen and Nguyen for solving the SVP Challenges up to 160 dimensions. Finally, we discussed a computational lower bound of the proposed progressive BKZ algorithm under certain ideal assumptions. These simulation results contribute to the estimation of the secure parameter sizes used in lattice based cryptography.

We outline some directions for future work: (1) constructing a BKZ simulator without using our \(\mathrm {ENUMCost}\), (2) adapting our simulator to other strategies, such as the BKZ-then-Sieve strategy, for computing a short vector more efficiently, and (3) estimating the secure key length of lattice-based cryptosystems using the lower bound of the proposed progressive BKZ.