Sieving for Shortest Vectors in Lattices Using Angular Locality-Sensitive Hashing
Abstract
By replacing the brute-force list search in sieving algorithms with Charikar’s angular locality-sensitive hashing (LSH) method, we get both theoretical and practical speedups for solving the shortest vector problem (SVP) on lattices. Combining angular LSH with a variant of Nguyen and Vidick’s heuristic sieve algorithm, we obtain heuristic time and space complexities for solving SVP of \(2^{0.3366n + o(n)}\) and \(2^{0.2075n + o(n)}\) respectively, while combining the same hash family with Micciancio and Voulgaris’ GaussSieve algorithm leads to an algorithm with (conjectured) heuristic time and space complexities of \(2^{0.3366n + o(n)}\). Experiments with the GaussSieve variant show that in moderate dimensions the proposed HashSieve algorithm already outperforms the GaussSieve, and the practical increase in the space complexity is much smaller than the asymptotic bounds suggest, and can be further reduced with probing. Extrapolating to higher dimensions, we estimate that a fully optimized and parallelized implementation of the GaussSieve-based HashSieve algorithm might need a few core years to solve SVP in dimension 130 or even 140.
Keywords
Lattices, Shortest vector problem (SVP), Sieving algorithms, Approximate nearest neighbor problem, Locality-sensitive hashing (LSH)
1 Introduction
Lattice Cryptography. Over the past few decades, lattice-based cryptography has attracted wide attention from the cryptographic community, due to e.g. its presumed resistance against quantum attacks [10], average-case hardness guarantees [3], the existence of lattice-based fully homomorphic encryption schemes [16], and efficient cryptographic primitives like NTRU [17]. An important problem related to lattice cryptography is to estimate the hardness of the underlying hard lattice problems, such as finding short vectors; a good understanding is critical for accurately choosing parameters in lattice cryptography [28, 39].
Finding Short Vectors. Given a basis \(\{\varvec{b}_1, \dots , \varvec{b}_n\} \subset \mathbb {R}^n\) of an n-dimensional lattice \(\mathcal {L} = \sum _{i=1}^n \mathbb {Z} \varvec{b}_i\), finding a shortest nonzero lattice vector (with respect to the Euclidean norm) or approximating it up to a constant factor is well known to be NP-hard under randomized reductions [4, 21]. For large approximation factors, various fast algorithms for finding short vectors are known, such as the lattice basis reduction algorithms LLL [26] and BKZ [43, 44]. The latter has a block-size parameter \(\beta \) which can be tuned to obtain a tradeoff between the time complexity and the quality of the output; the higher \(\beta \), the longer the algorithm takes and the shorter the vectors in the output basis. BKZ uses an algorithm for solving the exact shortest vector problem (SVP) in lattices of dimension \(\beta \) as a subroutine, and the runtime of BKZ largely depends on the runtime of this subroutine. Estimating the complexity of solving exact SVP therefore has direct consequences for the estimated hardness of solving approximate SVP with BKZ.
Finding Shortest Vectors. In the original description of BKZ, enumeration was used as the SVP subroutine [14, 20, 38, 44]. This method has a low (polynomial) space complexity, but its runtime is superexponential (\(2^{\varOmega (n \log n)}\)), which is known to be suboptimal: sieving [5], the Voronoi cell algorithm [32], and the recent discrete Gaussian sampling approach [2] all run in single exponential time (\(2^{O(n)}\)). The main drawbacks of the latter methods are that their space complexities are exponential in n as well, and due to larger hidden constants in the exponents enumeration is commonly still considered more practical than these other methods in moderate dimensions n [34].
Sieving in Arbitrary Lattices. On the other hand, these other SVP algorithms are relatively new, and recent improvements have shown that at least sieving may be able to compete with enumeration in the future. While the original work of Ajtai et al. [5] showed only that sieving solves SVP in time and space \(2^{O(n)}\), later work showed that one can provably solve SVP in arbitrary lattices in time \(2^{2.47n + o(n)}\) and space \(2^{1.24n + o(n)}\) [35, 40]. Heuristic analyses of sieving algorithms further suggest that one may be able to solve SVP in time \(2^{0.42n + o(n)}\) and space \(2^{0.21n + o(n)}\) [7, 33, 35], or optimizing for time, in time \(2^{0.38n + o(n)}\) and space \(2^{0.29n + o(n)}\) [7, 45, 46]. Other works have shown how to speed up sieving in practice [11, 15, 19, 29, 30, 41], and sieving recently made its way to the top 25 of the SVP challenge hall of fame [42], using the GaussSieve algorithm [23, 33].
Contributions. In this work we show how to obtain exponential tradeoffs and speedups for sieving using (angular) locality-sensitive hashing [12, 18], a technique from the field of nearest neighbor searching. In short, for each list vector \(\varvec{w}\) we store low-dimensional, lossy sketches (hashes), such that vectors that are nearby have a higher probability of having the same sketch (hash value) than vectors which are far apart. To search the list for nearby vectors we then do not go through the entire list of lattice vectors, but only consider those vectors that have at least one matching sketch (hash value) in one of the hash tables. Storing all list vectors in exponentially many hash tables requires exponentially more space, but searching for nearby vectors can then be done exponentially faster as well, as many distant vectors are not considered for reductions. Optimizing for time, the resulting HashSieve algorithm has heuristic time and space complexities both bounded by \(2^{0.3366n + o(n)}\), while tuning the parameters differently, we get a continuous heuristic tradeoff between the space and time complexities as illustrated by the solid blue curve in Fig. 1.
From a Tradeoff to a Speedup. Applying angular LSH to a variant of the Nguyen-Vidick sieve [35], we further obtain an algorithm with heuristic time and space complexities of \(2^{0.3366n + o(n)}\) and \(2^{0.2075n + o(n)}\) respectively, as illustrated by the blue point in Fig. 1. The key observation is that the hash tables of the HashSieve can be processed sequentially (similar to [8]), storing one hash table at a time. The resulting algorithm achieves the same heuristic speedup, but the asymptotic space complexity remains the same as in the original NV-sieve algorithm. This improvement is explained in detail in the full version. Note that this speedup does not appear to be compatible with the GaussSieve and only works with the NV-sieve, which may make the resulting algorithm slower in moderate dimensions, even though the memory used is much smaller.
Experimental Results. Practical experiments with the (GaussSieve-based) HashSieve algorithm validate our heuristic analysis, and show that (i) already in low dimensions, the HashSieve outperforms the GaussSieve; and (ii) the increase in the space complexity is significantly smaller than one might guess from only looking at the leading exponent of the space complexity. We also show how to further reduce the space complexity at almost no cost by a technique called probing, which reduces the required number of hash tables by a factor \(\text {poly}(n)\). In the end, these results will be an important guide for estimating the hardness of exact SVP in moderate dimensions, and for the hardness of approximate SVP in high dimensions using BKZ with sieving as the SVP subroutine.

Nguyen and Vidick considered LSH families based on Euclidean distances [6], while we will argue that it seems more natural to consider hash families based on angular distances or cosine similarities [12].

Nguyen and Vidick focused on the worst-case difference between nearby and faraway vectors, while we will focus on the average-case difference.
To illustrate the second point: the smallest angle between pairwise reduced vectors in the GaussSieve may be only slightly bigger than \(60^{\circ }\) (i.e. hardly any bigger than angles of nonreduced vectors), while in high dimensions the average angle between two pairwise reduced vectors is actually close to \(90^{\circ }\).
Outlook. Although this work focuses on applying angular LSH to sieving, more generally this work could be considered the first to succeed in applying LSH to lattice algorithms. Various recent follow-up works have already further investigated the use of different LSH methods and other nearest neighbor search methods in the context of lattice sieving [8, 9, 25, 31], and an open problem is whether other lattice algorithms (e.g. provable sieving algorithms, the Voronoi cell algorithm) may benefit from related techniques as well.
Roadmap. In Sect. 2 we describe the technique of (angular) LSH for finding near(est) neighbors, and Sect. 3 describes how to apply these techniques to the GaussSieve. Section 4 states the main result regarding the time and space complexities of sieving using angular LSH, and describes the technique of probing. In Sect. 5 we finally describe experiments performed using the GaussSieve-based HashSieve, and possible consequences for the estimated complexity of SVP in high dimensions. The full version [24] contains details on how angular LSH may be combined with the NV-sieve, and how the memory can be reduced to obtain a memory-wise asymptotically superior NV-sieve-based HashSieve.
2 Locality-Sensitive Hashing
2.1 Introduction
The near(est) neighbor problem is the following [18]: Given a list of n-dimensional vectors \(L = \{\varvec{w}_1, \varvec{w}_2, \dots , \varvec{w}_N\} \subset \mathbb {R}^n\), preprocess L in such a way that, when later given a target vector \(\varvec{v} \notin L\), one can efficiently find an element \(\varvec{w} \in L\) which is close(st) to \(\varvec{v}\). While in low (fixed) dimensions n there may be ways to answer these queries in time sublinear or even logarithmic in the list size N, in high dimensions it seems hard to do better than with a naive brute-force list search of time O(N). This inability to efficiently store and query lists of high-dimensional objects is sometimes referred to as the “curse of dimensionality” [18].
Fortunately, if we know that the list of objects L has a certain structure, or if we know that there is a significant gap between what is meant by “nearby” and “far away,” then there are ways to preprocess L such that queries can be answered in time sublinear in N. For instance, for the Euclidean norm, if it is known that the closest point \(\varvec{w}^* \in L\) lies at distance \(\Vert \varvec{v} - \varvec{w}^*\Vert = r_1\), and all other points \(\varvec{w} \in L\) are at distance at least \(\Vert \varvec{v} - \varvec{w}\Vert \ge r_2 = (1 + \varepsilon ) r_1\) from \(\varvec{v}\), then it is possible to preprocess L using time and space \(O(N^{1 + \rho })\), and answer queries in time \(O(N^{\rho })\), where \(\rho = (1 + \varepsilon )^{-2} < 1\) [6]. For \(\varepsilon > 0\), this corresponds to a sublinear time and subquadratic (superlinear) space complexity in N.
2.2 Hash Families
The method of [6] described above, as well as the method we will use later, relies on using locality-sensitive hash functions [18]. These are functions h which map an n-dimensional vector \(\varvec{v}\) to a low-dimensional sketch of \(\varvec{v}\), such that vectors which are nearby in \(\mathbb {R}^n\) have a high probability of having the same sketch, while vectors which are far away have a low probability of having the same image under h. Formalizing this property leads to the following definition of a locality-sensitive hash family \(\mathcal {H}\). Here, we assume D is a certain similarity measure^{1}, and the set U below may be thought of as (a subset of) the natural numbers \(\mathbb {N}\).
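Definition 1
A family \(\mathcal {H} = \{h: \mathbb {R}^n \rightarrow U\}\) is called \((r_1, r_2, p_1, p_2)\)-sensitive for a similarity measure D if for any \(\varvec{v}, \varvec{w} \in \mathbb {R}^n\) we have:

If \(D(\varvec{v}, \varvec{w}) \le r_1\) then \(\mathbb {P}_{h \in \mathcal {H}}[h(\varvec{v}) = h(\varvec{w})] \ge p_1\).

If \(D(\varvec{v}, \varvec{w}) \ge r_2\) then \(\mathbb {P}_{h \in \mathcal {H}}[h(\varvec{v}) = h(\varvec{w})] \le p_2\).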
Note that if we are given a hash family \(\mathcal {H}\) which is \((r_1, r_2, p_1, p_2)\)-sensitive with \(p_1 \gg p_2\), then we can use \(\mathcal {H}\) to distinguish between vectors which are at most \(r_1\) away from \(\varvec{v}\), and vectors which are at least \(r_2\) away from \(\varvec{v}\) with non-negligible probability, by only looking at their hash values (and that of \(\varvec{v}\)).
2.3 Amplification
Before turning to how such hash families may actually be constructed or used to find nearest neighbors, note that in general it is unknown whether efficiently computable \((r_1, r_2, p_1, p_2)\)-sensitive hash families even exist for the ideal setting of \(r_1 \approx r_2\) and \(p_1 \approx 1\) and \(p_2 \approx 0\). Instead, one commonly first constructs an \((r_1, r_2, p_1, p_2)\)-sensitive hash family \(\mathcal {H}\) with \(p_1 \approx p_2\), and then uses several AND- and OR-compositions to turn it into an \((r_1, r_2, p_1', p_2')\)-sensitive hash family \(\mathcal {H}'\) with \(p_1' > p_1\) and \(p_2' < p_2\), thereby amplifying the gap between \(p_1\) and \(p_2\).

AND-composition. Given an \((r_1, r_2, p_1, p_2)\)-sensitive hash family \(\mathcal {H}\), we can construct an \((r_1, r_2, p_1^k, p_2^k)\)-sensitive hash family \(\mathcal {H}'\) by taking k different, pairwise independent functions \(h_1, \dots , h_k \in \mathcal {H}\) and a one-to-one mapping \(f: U^k \rightarrow U\), and defining \(h \in \mathcal {H}'\) as \(h(\varvec{v}) = f(h_1(\varvec{v}), \dots , h_k(\varvec{v}))\). Clearly \(h(\varvec{v}) = h(\varvec{w})\) iff \(h_i(\varvec{v}) = h_i(\varvec{w})\) for all \(i \in [k]\), so if \(\mathbb {P}[h_i(\varvec{v}) = h_i(\varvec{w})] = p_j\) for all i, then \(\mathbb {P}[h(\varvec{v}) = h(\varvec{w})] = p_j^k\) for \(j = 1,2\).

OR-composition. Given an \((r_1, r_2, p_1, p_2)\)-sensitive hash family \(\mathcal {H}\), we can construct an \((r_1, r_2, 1 - (1 - p_1)^t, 1 - (1 - p_2)^t)\)-sensitive hash family \(\mathcal {H}'\) by taking t different, pairwise independent functions \(h_1, \dots , h_t \in \mathcal {H}\), and defining \(h \in \mathcal {H}'\) by the relation \(h(\varvec{v}) = h(\varvec{w})\) iff \(h_i(\varvec{v}) = h_i(\varvec{w})\) for at least one \(i \in [t]\). Clearly \(h(\varvec{v}) \ne h(\varvec{w})\) iff \(h_i(\varvec{v}) \ne h_i(\varvec{w})\) for all \(i \in [t]\), so if \(\mathbb {P}[h_i(\varvec{v}) \ne h_i(\varvec{w})] = 1 - p_j\) for all i, then \(\mathbb {P}[h(\varvec{v}) \ne h(\varvec{w})] = (1 - p_j)^t\) for \(j = 1,2\).^{2}
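As a rough numerical illustration of the two compositions, the following Python sketch evaluates the collision probability obtained by first AND-composing k hashes and then OR-composing t of the resulting hashes. The function names and the sample values of p, k and t are illustrative assumptions, and the underlying hash functions are assumed to behave independently.

```python
def and_composition(p: float, k: int) -> float:
    """Collision probability after AND-composing k independent hashes."""
    return p ** k


def or_composition(p: float, t: int) -> float:
    """Collision probability after OR-composing t independent hashes."""
    return 1.0 - (1.0 - p) ** t


def amplify(p: float, k: int, t: int) -> float:
    """AND over k hashes, followed by OR over t such combined hashes."""
    return or_composition(and_composition(p, k), t)


if __name__ == "__main__":
    # Illustrative values only: a small gap p1 > p2 before amplification.
    p1, p2 = 2 / 3, 1 / 2
    k, t = 10, 70
    print("p1' =", amplify(p1, k, t))
    print("p2' =", amplify(p2, k, t))
```

With these sample values the amplified probability for nearby vectors ends up above \(p_1\) while the one for faraway vectors drops well below \(p_2\), which is the gap amplification described above.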
2.4 Finding Nearest Neighbors
To use these hash families to find nearest neighbors, we may use the following method first described in [18]. First, we choose \(t \cdot k\) random hash functions \(h_{i,j} \in \mathcal {H}\), and we use the AND-composition to combine k of them at a time to build t different hash functions \(h_1, \dots , h_t\). Then, given the list L, we build t different hash tables \(T_1, \dots , T_t\), where for each hash table \(T_i\) we insert \(\varvec{w}\) into the bucket labeled \(h_i(\varvec{w})\). Finally, given the vector \(\varvec{v}\), we compute its t images \(h_i(\varvec{v})\), gather all the candidate vectors that collide with \(\varvec{v}\) in at least one of these hash tables (an OR-composition) in a list of candidates, and search this set of candidates for a nearest neighbor.
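The method above can be sketched in a few lines of Python. As the base hash family, this sketch assumes the random-hyperplane hash \(h_{\varvec{a}}(\varvec{v}) = [\langle \varvec{a}, \varvec{v} \rangle \ge 0]\) associated with Charikar's angular LSH [12]; the class name `AngularLSHIndex` and its methods are hypothetical, and vectors are assumed to be stored as tuples of floats.

```python
import random
from collections import defaultdict


def dot(a, b):
    """Inner product of two equal-length sequences."""
    return sum(x * y for x, y in zip(a, b))


class AngularLSHIndex:
    """t hash tables, each keyed by k random-hyperplane bits (AND-composition)."""

    def __init__(self, n, k, t, seed=0):
        rng = random.Random(seed)
        # planes[i][j] plays the role of the hash function h_{i,j}.
        self.planes = [
            [[rng.gauss(0.0, 1.0) for _ in range(n)] for _ in range(k)]
            for _ in range(t)
        ]
        self.tables = [defaultdict(list) for _ in range(t)]

    def hash(self, i, v):
        """Combined hash h_i(v): one sign bit per hyperplane of table i."""
        return tuple(dot(a, v) >= 0 for a in self.planes[i])

    def insert(self, w):
        """Insert w into the bucket labeled h_i(w) of every table T_i."""
        for i, table in enumerate(self.tables):
            table[self.hash(i, w)].append(w)

    def candidates(self, v):
        """Vectors colliding with v in at least one table (OR-composition)."""
        found = set()
        for i, table in enumerate(self.tables):
            found.update(table.get(self.hash(i, v), ()))
        return list(found)
```

A query then only needs to compare \(\varvec{v}\) with the vectors returned by `candidates(v)` rather than with the whole list.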
Clearly, the quality of this algorithm for finding nearest neighbors depends on the quality of the underlying hash family \(\mathcal {H}\) and on the parameters k and t. Larger values of k and t amplify the gap between the probabilities of finding ‘good’ (nearby) and ‘bad’ (faraway) vectors, which makes the list of candidates shorter, but larger parameters come at the cost of having to compute many hashes (both during the preprocessing and querying phases) and having to store many hash tables in memory. The following lemma shows how to balance k and t so that the overall time complexity is minimized.
Lemma 1
(1) Time for preprocessing the list: \(\tilde{O}(k N^{1 + \rho })\).
(2) Space complexity of the preprocessed data: \(\tilde{O}(N^{1 + \rho })\).
(3) Time for answering a query \(\varvec{v}\): \(\tilde{O}(N^{\rho })\).
(3a) Hash evaluations of the query vector \(\varvec{v}\): \(O(N^{\rho })\).
(3b) List vectors to compare to the query vector \(\varvec{v}\): \(O(N^{\rho })\).
Although Lemma 1 only shows how to choose k and t to minimize the time complexity, we can also tune k and t so that we use more time and less space. In a way this algorithm can be seen as a generalization of the naive brute-force search solution for finding nearest neighbors, as \(k = 0\) and \(t = 1\) corresponds to checking the whole list for nearby vectors in linear time and linear space.
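To make the balancing of k and t concrete, the sketch below assumes the standard parameter choices \(\rho = \log (1/p_1) / \log (1/p_2)\), \(k = \log N / \log (1/p_2)\) and \(t = N^{\rho }\) (these match the values used in Proposition 1, but are stated here as an assumption). It then estimates, for arbitrary k and t, the probability that a nearby vector collides in at least one table and an upper estimate of the number of faraway vectors that end up as candidates. All function names are illustrative.

```python
import math


def balanced_parameters(N, p1, p2):
    """Assumed time-optimal choice: rho, k = log N / log(1/p2), t = N^rho."""
    rho = math.log(1 / p1) / math.log(1 / p2)
    k = math.log(N) / math.log(1 / p2)
    t = N ** rho
    return rho, k, t


def query_profile(N, p1, p2, k, t):
    """Return (success probability for a nearby vector, expected far candidates).

    The second value treats all N list vectors as faraway with collision
    probability exactly p2, so it is an upper estimate.
    """
    success = 1.0 - (1.0 - p1 ** k) ** t
    far_candidates = t * N * p2 ** k
    return success, far_candidates


if __name__ == "__main__":
    N, p1, p2 = 2 ** 20, 2 / 3, 1 / 2
    rho, k, t = balanced_parameters(N, p1, p2)
    print("rho, k, t:", rho, k, t)
    print("balanced:", query_profile(N, p1, p2, k, t))
    print("brute force (k=0, t=1):", query_profile(N, p1, p2, 0, 1))
```

With the balanced choice each table contributes about one faraway candidate, so roughly \(t = N^{\rho }\) vectors are compared per query, while setting \(k = 0\) and \(t = 1\) recovers the linear-time search over all N vectors.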
2.5 Angular Hashing
3 From the GaussSieve to the HashSieve
Let us now describe how locality-sensitive hashing can be used to speed up sieving algorithms, and in particular how we can speed up the GaussSieve of Micciancio and Voulgaris [33]. We have chosen this algorithm as our main focus since it seems to be the most practical sieving algorithm to date, which is further motivated by the extensive attention it has received in recent years [15, 19, 23, 29, 30, 41] and by the fact that the highest sieving record in the SVP challenge database was obtained using (a modification of) the GaussSieve [23, 42]. Note that the same ideas can also be applied to the Nguyen-Vidick sieve [35], which has proven complexity bounds. Details on this combination are in the full version.
3.1 The GaussSieve Algorithm
While the space complexity of the GaussSieve is reasonably well understood, there are no proven bounds on the time complexity of this algorithm. One might estimate that the time complexity is determined by the double loop over L: at any time each pair of vectors \(\varvec{w}_1, \varvec{w}_2 \in L\) was compared at least once to see if one could reduce the other, so the time complexity is at least quadratic in \(|L|\). The algorithm further seems to show a similar asymptotic behavior as the NV-sieve [35], for which the asymptotic time complexity is heuristically known to be quadratic in \(|L|\), i.e., of the order \(2^{0.415n + o(n)}\). One might therefore conjecture that the GaussSieve also has a time complexity of \(2^{0.415n + o(n)}\), which closely matches previous experiments with the GaussSieve in high dimensions [23].
3.2 The GaussSieve with Angular Reductions
3.3 The HashSieve with Angular Reductions
3.4 The (GaussSieve-Based) HashSieve Algorithm
3.5 Relation with Leveled Sieving
Overall, the crucial modification going from the GaussSieve to the HashSieve is that by using hash tables and looking up vectors to reduce the target vector with in these hash tables, we make the search space smaller; instead of comparing a new vector to all vectors in L, we only compare the vector to a much smaller subset of candidates \(C \subset L\), which mostly contains good candidates for reduction, and does not contain many of the ‘bad’ vectors in L which are not useful for reductions anyway.
In a way, the idea of the HashSieve is similar to the technique previously used in two- and three-level sieving [45, 46]. There, the search space of candidate nearby vectors was reduced by partitioning the space into regions, and for each vector storing in which region it lies. In those algorithms, two nearby vectors in adjacent regions are not considered for reductions, which means one needs more vectors to saturate the space (a higher space complexity) but less time to search the list of candidates for nearby vectors (a lower time complexity). The key difference between leveled sieving and our method is in the way the partitions of \(\mathbb {R}^n\) are chosen: using giant balls in leveled sieving (similar to the Euclidean LSH method of [6]), and using intersections of half-spaces in the HashSieve.
4 Theoretical Results
For analyzing the time complexity of sieving with angular LSH, for clarity of exposition we will analyze the GaussSieve-based HashSieve and assume that the GaussSieve has a time complexity which is quadratic in the list size, i.e. a time complexity of \(2^{0.415n + o(n)}\). We will then show that using angular LSH, we can reduce the time complexity to \(2^{0.337n + o(n)}\). Note that although practical experiments in high dimensions seem to verify this assumption [23], in reality it is not known whether the time complexity of the GaussSieve is quadratic in \(|L|\). At first sight this therefore may not guarantee a heuristic time complexity of the order \(2^{0.337n + o(n)}\). In the full version we illustrate how the same techniques can be applied to the sieve of Nguyen and Vidick [35], for which the heuristic time complexity is in fact known to be at most \(2^{0.415n + o(n)}\), and for which we get the same speedup. This implies that indeed, with sieving we can provably solve SVP in time and space \(2^{0.337n + o(n)}\) under the same heuristic assumptions of Nguyen and Vidick [35]. For clarity of exposition, in the main text we will continue focusing on the GaussSieve due to its better practical performance, even though theoretically one might rather apply this analysis to the algorithm of Nguyen and Vidick due to their heuristic bounds on the time and space complexities.
4.1 High-Dimensional Intuition
So for now, suppose that the GaussSieve has a time complexity quadratic in \(|L|\) and that \(|L| \le 2^{0.208n + o(n)}\). To estimate the complexities of the HashSieve, we will use the following assumption previously described in [35]:
Heuristic 1
The angle \(\varTheta (\varvec{v}, \varvec{w})\) between random sampled/list vectors \(\varvec{v}\) and \(\varvec{w}\) follows the same distribution as the distribution of angles \(\varTheta (\varvec{v}, \varvec{w})\) obtained by drawing \(\varvec{v}, \varvec{w} \in \mathbb {R}^n\) at random from the unit sphere.
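The distribution referred to in Heuristic 1 is easy to explore numerically. The following Python sketch samples pairs of uniformly random directions (obtained by normalizing standard Gaussian vectors) and reports the mean angle and the fraction of angles below \(60^{\circ }\). The function names, sample size and seed are arbitrary choices made for illustration.

```python
import math
import random


def random_angle_degrees(n, rng):
    """Angle between two independent, uniformly random directions in R^n."""
    v = [rng.gauss(0.0, 1.0) for _ in range(n)]
    w = [rng.gauss(0.0, 1.0) for _ in range(n)]
    inner = sum(x * y for x, y in zip(v, w))
    norms = math.sqrt(sum(x * x for x in v) * sum(y * y for y in w))
    cosine = max(-1.0, min(1.0, inner / norms))  # guard against rounding
    return math.degrees(math.acos(cosine))


def angle_summary(n, samples=500, seed=1):
    """Return (mean angle in degrees, fraction of sampled angles below 60)."""
    rng = random.Random(seed)
    angles = [random_angle_degrees(n, rng) for _ in range(samples)]
    mean = sum(angles) / samples
    below_60 = sum(a < 60.0 for a in angles) / samples
    return mean, below_60


if __name__ == "__main__":
    for n in (3, 10, 50, 100):
        print(n, angle_summary(n))
```

As the dimension grows, the sampled angles concentrate around \(90^{\circ }\) and angles below \(60^{\circ }\) become rare.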
Note that under this assumption, in high dimensions angles close to \(90^{\circ }\) are much more likely to occur between list vectors than smaller angles. So one might guess that for two vectors \(\varvec{w}_1, \varvec{w}_2 \in L\) (which necessarily have an angle larger than \(60^{\circ }\)), with high probability their angle is close to \(90^{\circ }\). On the other hand, vectors that can reduce one another always have an angle less than \(60^{\circ }\), and by similar arguments we expect this angle to always be close to \(60^{\circ }\). Under the extreme assumption that all ‘reduced angles’ between vectors that are unable to reduce each other are exactly \(90^{\circ }\) (and non-reduced angles are at most \(60^{\circ }\)), we obtain the following estimate for the costs of the HashSieve algorithm.
Proposition 1
Assuming that reduced vectors are always pairwise orthogonal, the HashSieve with parameters \(k = 0.2075n + o(n)\) and \(t = 2^{0.1214n + o(n)}\) heuristically solves SVP in time and space \(2^{0.3289n + o(n)}\). We further obtain the tradeoff between the space and time complexities indicated by the dashed line in Fig. 1.
Proof
If all reduced angles are \(90^{\circ }\), then we can simply let \(\theta _1 = \frac{\pi }{3}\) and \(\theta _2 = \frac{\pi }{2}\) and use the hash family described in Sect. 2.5 with \(p_1 = \frac{2}{3}\) and \(p_2 = \frac{1}{2}\). Applying Lemma 1, we can perform a single search in time \(N^{\rho } = 2^{0.1214n + o(n)}\) using \(t = 2^{0.1214n + o(n)}\) hash tables, where \(\rho = \frac{\log (1 / p_1)}{\log (1 / p_2)} = \log _2(\frac{3}{2}) \approx 0.585\). Since we need to perform these searches \(\tilde{O}(|L|) = \tilde{O}(N)\) times, the time complexity is of the order \(\tilde{O}(N^{1 + \rho }) = 2^{0.3289n + o(n)}\). \(\square \)
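The exponents in Proposition 1 can be reproduced with a few lines of arithmetic. The Python sketch below assumes the collision probability \(p = 1 - \theta / \pi \) of the random-hyperplane hash (consistent with \(p_1 = \frac{2}{3}\) and \(p_2 = \frac{1}{2}\) above) and a list size \(N = 2^{cn}\) with \(c = 0.2075\); the function name and return format are illustrative.

```python
import math


def hashsieve_exponents(list_exp, theta1, theta2):
    """Leading exponents (per dimension n) for a list of size 2^(list_exp * n).

    theta1: angle (radians) at which vectors count as 'nearby'.
    theta2: angle (radians) assumed for all 'faraway' vectors.
    """
    p1 = 1.0 - theta1 / math.pi
    p2 = 1.0 - theta2 / math.pi
    rho = math.log(1 / p1) / math.log(1 / p2)
    return {
        "rho": rho,
        "k_coeff": list_exp / math.log2(1 / p2),  # k = log N / log(1/p2)
        "t_exp": rho * list_exp,                  # t = N^rho
        "time_exp": (1 + rho) * list_exp,         # N searches of cost N^rho
    }


if __name__ == "__main__":
    result = hashsieve_exponents(0.2075, math.pi / 3, math.pi / 2)
    for name, value in result.items():
        print(f"{name}: {value:.4f}")
```

Changing `theta2` away from \(\frac{\pi }{2}\) gives a quick feel for how sensitive the estimate is to the orthogonality assumption.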
4.2 Heuristically Solving SVP in Time and Space \(2^{0.3366n + o(n)}\)
Of course, in practice not all reduced angles are actually \(90^{\circ }\), and one should carefully analyze the real probability that a vector \(\varvec{w}\) whose angle with \(\varvec{v}\) is more than \(60^{\circ }\) is found as a candidate due to a collision in one of the hash tables. The following central theorem follows from this analysis and shows how to choose the parameters to optimize the asymptotic time complexity. A rigorous proof of Theorem 1 based on the NV-sieve can be found in the full version.
Theorem 1
Note that the optimized values in Theorem 1 and Proposition 1, and the associated curves in Fig. 1 are very similar. So the simple estimate based on the intuition that in high dimensions “everything is orthogonal” is not far off.
4.3 Heuristically Solving SVP in Time \(2^{0.3366n}\) and Space \(2^{0.2075n}\)
For completeness let us briefly explain how for the NV-sieve [35], we can in fact process the hash tables sequentially and eliminate the need of storing exponentially many hash tables in memory, for which full details are given in the full version. To illustrate the idea, recall that in the Nguyen-Vidick sieve we are given a list L of size \(2^{0.21n + o(n)}\) of vectors of norm at most R, and we want to build a new list \(L'\) of similar size \(2^{0.21n + o(n)}\) of vectors of norm at most \(\gamma R\) with \(\gamma < 1\). To do this, we look at (almost) all pairs of vectors in L, and see if their difference (sum) is short; if so, we add it to \(L'\). As the probability of finding a short vector is roughly \(2^{-0.21n + o(n)}\) and we have \(2^{0.42n + o(n)}\) pairs of vectors, this will result in enough vectors to continue in the next iterations.
The natural way to apply angular LSH to this algorithm would be to add all vectors in L to t independent hash tables, and to find short vectors to add to \(L'\) we then compute a new vector \(\varvec{v}\)’s hash value for each of these t hash tables, look for potential short vectors \(\varvec{v} \pm \varvec{w}\) by comparing \(\varvec{v}\) with the colliding vectors \(\varvec{w} \in \bigcup _{i=1}^t T_i[h_i(\varvec{v})]\), and process all vectors one by one. This results in similar asymptotic time and space complexities as illustrated above.
The crucial modification that we can make to this algorithm (similar to [8]) is that we process the tables one by one; we first construct the first hash table, add all vectors in L to this hash table, and look for short difference vectors inside each of the buckets of L to add to \(L'\). The cost of building and processing one hash table is of the order \(2^{0.21n + o(n)}\), and the number of vectors found that can be added to \(L'\) is of the order \(2^{0.08n + o(n)}\). By then deleting the hash table from memory and building new hash tables over and over (\(t = 2^{0.13n + o(n)}\) times) we keep building a longer list \(L'\) until finally we will again have found \(2^{0.21n + o(n)}\) short vectors for the next iteration. In this case however we never stored all hash tables in memory at the same time, and the memory increase compared to the NV-sieve is asymptotically negligible. This leads to the following result.
Theorem 2
Note that this choice of parameters balances the costs of computing hashes and comparing vectors; the fact that the blue point in Fig. 1 does not lie on the “Time = Space” line does not mean we can further reduce the time complexity.
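The sequential processing of hash tables described in this section can be sketched as follows. This Python toy is a simplification under stated assumptions rather than the algorithm of the full version: it uses random-hyperplane hashes, only considers differences (not sums) of colliding pairs, and the function name and parameters are hypothetical. The point it illustrates is that each table is built, searched bucket by bucket, and discarded before the next one is created.

```python
import random
from collections import defaultdict
from itertools import combinations


def sequential_table_pass(L, bound, k, t, seed=0):
    """Collect short differences of list vectors, one hash table at a time.

    L: list of vectors (tuples); bound: target norm (gamma * R in the text).
    Only one hash table exists in memory at any moment.
    """
    rng = random.Random(seed)
    n = len(L[0])
    short = set()
    for _ in range(t):
        planes = [[rng.gauss(0.0, 1.0) for _ in range(n)] for _ in range(k)]
        table = defaultdict(list)
        for w in L:
            key = tuple(sum(a * x for a, x in zip(p, w)) >= 0 for p in planes)
            table[key].append(w)
        # Search for short differences inside each bucket of this table only.
        for bucket in table.values():
            for v, w in combinations(bucket, 2):
                diff = tuple(x - y for x, y in zip(v, w))
                norm_sq = sum(d * d for d in diff)
                if 0 < norm_sq <= bound * bound:
                    short.add(diff)
        # 'table' is rebuilt from scratch in the next iteration.
    return short
```

The extra memory is a single table of size proportional to the list, independent of t, at the cost of rehashing the whole list once per table.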
4.4 Reducing the Space Complexity with Probing
Finally, as the above modification only seems to work with the less practical NV-sieve (and not with the GaussSieve), and since for the GaussSieve-based HashSieve the memory requirement increases exponentially, let us briefly sketch how we can reduce the required amount of memory in practice for the (GaussSieve-based) HashSieve using probing [36]. The key observation here is that, as illustrated in Fig. 2, we only check one bucket in each hash table for nearby vectors, leading to t hash buckets in total that are checked for candidate reductions. This seems wasteful, as the hash tables contain more information: we also know for instance which hash buckets are next most likely to contain nearby vectors, which are buckets with very similar hash values. By also probing these buckets in a clever way and checking multiple hash buckets per hash table, we can significantly reduce the number of hash tables t in practice such that in the end we still find as many good vectors. Using \(\ell \) levels of probing (checking all buckets with hash value at Hamming distance at most \(\ell \) to \(h(\varvec{v})\)) we can reduce t by a factor \(O(n^{\ell })\) at the cost of increasing the time complexity by a factor at most \(2^{\ell }\). This does not constitute an exponential improvement, but the polynomial reduction in memory may be worthwhile in practice. More details on probing can be found in the full version.
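One simple way to realize "checking all buckets with hash value at Hamming distance at most \(\ell \)" is to enumerate the bit flips of the query's k-bit hash, nearest buckets first. The Python sketch below does exactly this; the function names are hypothetical and hash values are assumed to be tuples of 0/1 bits.

```python
from itertools import combinations
from math import comb


def probe_sequence(hash_bits, max_dist):
    """Yield every bucket label within Hamming distance max_dist of hash_bits.

    Labels are produced in order of increasing distance, starting with the
    query's own bucket.
    """
    k = len(hash_bits)
    for dist in range(max_dist + 1):
        for flips in combinations(range(k), dist):
            probe = list(hash_bits)
            for i in flips:
                probe[i] = 1 - probe[i]  # flip bit i
            yield tuple(probe)


def probes_per_table(k, max_dist):
    """Number of buckets probed per table: sum of C(k, d) for d <= max_dist."""
    return sum(comb(k, d) for d in range(max_dist + 1))
```

For one level of probing this checks \(k + 1\) buckets per table instead of one, which indicates how the number of tables can be traded against the number of buckets inspected per table.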
5 Practical Results
5.1 Experimental Results in Moderate Dimensions
(a) With the HashSieve, maintaining a list L is no longer needed.
(b) Instead of making a list of candidates, we go through the hash tables one by one, checking if collisions in this table lead to reductions. If a reducing vector is found early on, this may save up to \(t \cdot k\) hash computations.
(c) As \(h_i(-\varvec{v}) = -h_i(\varvec{v})\), the hash of \(-\varvec{v}\) can be computed for free from \(h_i(\varvec{v})\).
(d) Instead of comparing \(\pm \varvec{v}\) to all candidate vectors \(\varvec{w}\), we only compare \(+\varvec{v}\) to the vectors in the bucket \(h_i(\varvec{v})\) and \(-\varvec{v}\) to the vectors in the bucket labeled \(-h_i(\varvec{v})\). This further reduces the number of comparisons by a factor 2 compared to the GaussSieve, where both comparisons are done for each potential reduction.
(e) For choosing vectors \(\varvec{a}_{i,j}\) to use for the hash functions \(h_i\), there is no reason to assume that drawing \(\varvec{a}\) from a specific, sufficiently large random subset of the unit sphere would lead to substantially different results. In particular, using sparse vectors \(\varvec{a}_{i,j}\) makes hash computations significantly cheaper, while retaining the same performance [1, 27]. Our experiments indicated that even if all vectors \(\varvec{a}_{i,j}\) have only two equal non-zero entries, the algorithm still finds the shortest vector in (roughly) the same number of iterations.
(f) We should not store the actual vectors, but only pointers to vectors in each hash table \(T_i\). This means that compared to the GaussSieve, the space complexity roughly increases from \(O(N \cdot n)\) to \(O(N \cdot n + N \cdot t)\) instead of \(O(N \cdot n \cdot t)\), i.e., an asymptotic increase of a factor t/n rather than t.
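Items (c) and (e) can be illustrated together. The Python sketch below assumes each sparse hash vector \(\varvec{a}_{i,j}\) has two entries equal to 1, represented by a hypothetical index pair, so that one hash bit only needs the sign of \(v_a + v_b\). It also reads \(-h_i(\varvec{v})\) as the bitwise complement of the k-bit hash, which is an assumption of this sketch.

```python
def sparse_hash(pairs, v):
    """k-bit hash of v; bit j is 1 iff v[a] + v[b] > 0 for pairs[j] = (a, b)."""
    return tuple(1 if v[a] + v[b] > 0 else 0 for a, b in pairs)


def complement(bits):
    """Bitwise complement of a hash value."""
    return tuple(1 - b for b in bits)


if __name__ == "__main__":
    pairs = [(0, 1), (2, 4), (1, 4)]  # illustrative index pairs
    v = (3, -1, 4, 1, -5)
    minus_v = tuple(-x for x in v)
    print(sparse_hash(pairs, v), sparse_hash(pairs, minus_v))
```

The complement relation holds whenever none of the sums is exactly zero; with integer lattice vectors such ties can occur, so an implementation would need to fix a convention for them.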
Computations. Figure 3b shows the number of inner products computed by the HashSieve for comparing vectors and for computing hashes. We have chosen k and t so that the total time for each of these operations is roughly balanced, and indeed this seems to be the case. The total number of inner products for hashing seems to be a constant factor higher than the total number of inner products computed for comparing vectors, which may also be desirable, as hashing is significantly cheaper than comparing vectors using sparse hash vectors. Tuning the parameters differently may slightly change this ratio.
List Sizes. In the analysis, we assumed that if reductions are missed with a constant probability, then the list size also increases by a constant factor. Figure 3c seems to support this intuition, as indeed the list sizes in the HashSieve seem to be a (small) constant factor larger than in the GaussSieve.
Time Complexities. Figure 3d compares the timings of the GaussSieve and HashSieve on a single core of a Dell Optiplex 780, which has a processor speed of 2.66 GHz. Theoretically, we expect to achieve a speedup of roughly \(2^{0.078n}\) for each list search, and in practice we see that the asymptotic speedup of the HashSieve over the GaussSieve is close to \(2^{0.07n}\) using a least-squares fit.
Note that the coefficients in the least-squares fits for the time complexities of the GaussSieve and HashSieve are higher than theory suggests, which is in fact consistent with previous experiments in low dimensions [15, 19, 29, 30, 33]. This phenomenon seems to be caused purely by the low dimensionality of our experiments. Figure 3d shows that in higher dimensions, the points start to deviate from the straight line, with a better scaling of the time complexity in higher dimensions. High-dimensional experiments of the GaussSieve (\(80 \le n \le 100\)) and the HashSieve (\(86 \le n \le 96\)) demonstrated that these algorithms start following the expected trends of \(2^{0.42n + o(n)}\) (GaussSieve) and \(2^{0.34n + o(n)}\) (HashSieve) as n gets larger [23, 31]. In high dimensions we therefore expect the coefficient 0.3366 to be accurate. For more details, see [31].
Space Complexities. Figure 3e illustrates the experimental space complexities of the tested algorithms for various dimensions. For the GaussSieve, the total space complexity is dominated by the memory required to store the list L. In our experiments we stored each vector coordinate in a register of 4 bytes, and since each vector has n entries, this leads to a total space complexity for the GaussSieve of roughly 4nN bytes. For the HashSieve the asymptotic space complexity is significantly higher, but recall that in our hash tables we only store pointers to vectors, which may also be only 4 bytes each. For the HashSieve, we estimate the total space complexity as \(4 n N + 4 t N \sim 4 t N\) bytes, i.e., roughly a factor \(\frac{t}{n} \approx 2^{0.1290n} / n\) higher than the space complexity of the GaussSieve. Using probing, the memory requirement is further reduced by a significant amount, at the cost of a small increase in the time complexity (Fig. 3d).
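The memory estimates above can be written out as a short Python sketch. It assumes 4 bytes per coordinate and 4 bytes per pointer as in the text; the function names and the example values of n, N and t are hypothetical and chosen only to show the arithmetic.

```python
def gauss_sieve_bytes(n, N, bytes_per_coord=4):
    """Memory for the list L: N vectors with n coordinates each."""
    return bytes_per_coord * n * N

def hash_sieve_bytes(n, N, t, bytes_per_coord=4, bytes_per_pointer=4):
    """Memory for the list L plus t hash tables holding one pointer per vector."""
    return bytes_per_coord * n * N + bytes_per_pointer * t * N

def memory_ratio(n, t):
    """HashSieve memory divided by GaussSieve memory when coordinates and
    pointers have the same size: (n + t) / n = 1 + t / n."""
    return (n + t) / n

# Hypothetical example values, not taken from the experiments.
n, N, t = 80, 1000, 100
print(gauss_sieve_bytes(n, N))    # 4 * 80 * 1000 bytes
print(hash_sieve_bytes(n, N, t))  # 4 * 80 * 1000 + 4 * 100 * 1000 bytes
print(memory_ratio(n, t))
```

For t much larger than n the ratio is dominated by t / n, which is the factor quoted in the text.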
5.2 High-Dimensional Extrapolations
As explained at the start of this section, the experiments in Sect. 5.1 are aimed at verifying the heuristic analysis and at establishing trends which hold regardless of the amount of optimization of the code, the quality of preprocessing of the input basis, the amount of parallelization, etc. However, the linear estimates in Fig. 3 may not be accurate. For instance, the time complexities of the GaussSieve and HashSieve seem to scale better in higher dimensions; the time complexities may well be \(2^{0.415n + o(n)}\) and \(2^{0.337n + o(n)}\) respectively, but the contribution of the o(n) only starts to fade away for large n. To get a better feeling of the actual time complexities in high dimensions, one would have to run these algorithms in higher dimensions. In recent work, Mariano et al. [31] showed that the HashSieve can be parallelized in a similar fashion as the GaussSieve [29]. With better preprocessing and optimized code (but without probing), Mariano et al. were able to solve SVP in dimensions up to 96 in less than one day on one machine using the HashSieve^{3}. Based on experiments in dimensions 86 up to 96, they further estimated the time complexity to lie between \(2^{0.32n - 15}\) and \(2^{0.33n - 16}\), which is close to the theoretical estimate \(2^{0.3366n + o(n)}\). So although the points in Fig. 3d almost seem to lie on a line with a different leading constant, these leading constants should not be taken for granted for high-dimensional extrapolations; the theoretical estimate \(2^{0.3366n + o(n)}\) seems more accurate.
Finally, let us try to estimate the highest practical dimension n in which the HashSieve may be able to solve SVP right now. The current highest dimension that was attacked using the GaussSieve is \(n = 116\), for which 32 GB RAM and about 2 core years were needed [23]. Assuming the theoretical estimates for the GaussSieve (\(2^{0.4150n + o(n)}\)) and HashSieve (\(2^{0.3366n + o(n)}\)) are accurate, and assuming there is a constant overhead of approximately \(2^2\) of the HashSieve compared to the GaussSieve (based on the exponents in Fig. 3d), we might estimate the time complexities of the GaussSieve and HashSieve to be \(G(n) = 2^{0.4150n + C}\) and \(H(n) = 2^{0.3366n + C + 2}\) respectively. To solve SVP in the same dimension \(n = 116\), we therefore expect to use a factor \(G(116) / H(116) \approx 137\) less time using the HashSieve, or five core days on the same machine. With approximately two core years, we may further be able to solve SVP in dimension 138 using the HashSieve, which would place sieving near the very top of the SVP hall of fame [42]. This does not take into account the space complexity though, which at this point may have increased to several TBs. Several levels of probing may significantly reduce the required amount of RAM, but further experiments have to be conducted to see how practical the HashSieve is in high dimensions. As in high dimensions the space requirement also becomes an issue, studying the memory-efficient NV-sieve-based HashSieve (with space complexity \(2^{0.2075n + o(n)}\)) may be an interesting topic for future work.
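The arithmetic behind this extrapolation can be reproduced with a few lines of Python. The sketch assumes exactly the models \(G(n) = 2^{0.4150n + C}\) and \(H(n) = 2^{0.3366n + C + 2}\) stated above; the unknown constant C cancels in every ratio, and the function names are introduced only for this illustration.

```python
GAUSS_EXP = 0.4150   # exponent coefficient assumed for the GaussSieve
HASH_EXP = 0.3366    # exponent coefficient assumed for the HashSieve
OVERHEAD_LOG2 = 2    # assumed constant overhead 2^2 of the HashSieve

def speedup(n):
    """G(n) / H(n); the constant C cancels."""
    return 2 ** ((GAUSS_EXP - HASH_EXP) * n - OVERHEAD_LOG2)

def hash_dimension_for_gauss_budget(n_gauss):
    """Dimension n at which H(n) equals G(n_gauss), i.e. the HashSieve
    dimension reachable with the time the GaussSieve needs in n_gauss."""
    return (GAUSS_EXP * n_gauss - OVERHEAD_LOG2) / HASH_EXP

factor = speedup(116)
core_days = 2 * 365 / factor  # two core years divided by the speedup
reachable = hash_dimension_for_gauss_budget(116)
print(factor, core_days, reachable)
```

Under these exact constants the speedup in dimension 116 is about 137 and the equal-cost dimension comes out just above 137; dimension 138 costs roughly a factor 1.2 more under the same model, which is still on the order of two core years.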
Footnotes
1. A similarity measure D may informally be thought of as a “slightly relaxed” distance metric, which may not satisfy all properties associated to distance metrics.
2. Note that h is strictly not a function and only defines a relation.
3. At the time of writing, Mariano et al.’s highest SVP challenge records obtained using the HashSieve are in dimension 107, using five days on one multicore machine.
Acknowledgments
The author is grateful to Meilof Veeningen and Niels de Vreede for their help and advice with implementations. The author thanks the anonymous reviewers, Daniel J. Bernstein, Marleen Kooiman, Tanja Lange, Artur Mariano, Joop van de Pol, and Benne de Weger for their valuable suggestions and comments. The author further thanks Michele Mosca for funding a research visit to Waterloo to collaborate on lattices and quantum algorithms, and the author thanks Stacey Jeffery, Michele Mosca, Joop van de Pol, and John M. Schanck for valuable discussions there. The author also thanks Memphis Depay for his inspiration.
References
1. Achlioptas, D.: Database-friendly random projections. In: PODS (2001)
2. Aggarwal, D., Dadush, D., Regev, O., Stephens-Davidowitz, N.: Solving the shortest vector problem in \(2^n\) time via discrete Gaussian sampling. In: STOC (2015)
3. Ajtai, M.: Generating hard instances of lattice problems (extended abstract). In: STOC, pp. 99–108 (1996)
4. Ajtai, M.: The shortest vector problem in \(L_2\) is NP-hard for randomized reductions (extended abstract). In: STOC, pp. 10–19 (1998)
5. Ajtai, M., Kumar, R., Sivakumar, D.: A sieve algorithm for the shortest lattice vector problem. In: STOC, pp. 601–610 (2001)
6. Andoni, A., Indyk, P.: Near-optimal hashing algorithms for approximate nearest neighbor in high dimensions. In: FOCS, pp. 459–468 (2006)
7. Becker, A., Gama, N., Joux, A.: A sieve algorithm based on overlattices. In: ANTS, pp. 49–70 (2014)
8. Becker, A., Gama, N., Joux, A.: Speeding-up lattice sieving without increasing the memory, using subquadratic nearest neighbor search. Preprint (2015)
9. Becker, A., Laarhoven, T.: Efficient sieving in (ideal) lattices using cross-polytopic LSH. Preprint (2015)
10. Bernstein, D.J., Buchmann, J., Dahmen, E.: Post-Quantum Cryptography. Springer, Heidelberg (2009)
11. Bos, J.W., Naehrig, M., van de Pol, J.: Sieving for shortest vectors in ideal lattices: a practical perspective. Cryptology ePrint Archive, Report 2014/880 (2014)
12. Charikar, M.S.: Similarity estimation techniques from rounding algorithms. In: STOC, pp. 380–388 (2002)
13. Conway, J.H., Sloane, N.J.A.: Sphere Packings, Lattices and Groups. Springer, New York (1999)
14. Fincke, U., Pohst, M.: Improved methods for calculating vectors of short length in a lattice. Math. Comput. 44(170), 463–471 (1985)
15. Fitzpatrick, R., Bischof, C., Buchmann, J., Dagdelen, Ö., Göpfert, F., Mariano, A., Yang, B.-Y.: Tuning GaussSieve for speed. In: Aranha, D.F., Menezes, A. (eds.) LATINCRYPT 2014. LNCS, vol. 8895, pp. 288–305. Springer, Heidelberg (2015)
16. Gentry, C.: Fully homomorphic encryption using ideal lattices. In: STOC (2009)
17. Hoffstein, J., Pipher, J., Silverman, J.H.: NTRU: a ring-based public key cryptosystem. In: Buhler, J.P. (ed.) ANTS 1998. LNCS, vol. 1423, pp. 267–288. Springer, Heidelberg (1998)
18. Indyk, P., Motwani, R.: Approximate nearest neighbors: towards removing the curse of dimensionality. In: STOC, pp. 604–613 (1998)
19. Ishiguro, T., Kiyomoto, S., Miyake, Y., Takagi, T.: Parallel Gauss Sieve algorithm: solving the SVP challenge over a 128-dimensional ideal lattice. In: Krawczyk, H. (ed.) PKC 2014. LNCS, vol. 8383, pp. 411–428. Springer, Heidelberg (2014)
20. Kannan, R.: Improved algorithms for integer programming and related lattice problems. In: STOC, pp. 193–206 (1983)
21. Khot, S.: Hardness of approximating the shortest vector problem in lattices. In: FOCS, pp. 126–135 (2004)
22. Klein, P.: Finding the closest lattice vector when it’s unusually close. In: SODA, pp. 937–941 (2000)
23. Kleinjung, T.: Private Communication (2014)
24. Laarhoven, T.: Sieving for shortest vectors in lattices using angular locality-sensitive hashing (2015). Full version at http://eprint.iacr.org/2014/744
25. Laarhoven, T., de Weger, B.: Faster sieving for shortest lattice vectors using spherical locality-sensitive hashing. In: LATINCRYPT (2015)
26. Lenstra, A.K., Lenstra, H.W., Lovász, L.: Factoring polynomials with rational coefficients. Math. Ann. 261(4), 515–534 (1982)
27. Li, P., Hastie, T.J., Church, K.W.: Very sparse random projections. In: KDD, pp. 287–296 (2006)
28. Lindner, R., Peikert, C.: Better key sizes (and attacks) for LWE-based encryption. In: Kiayias, A. (ed.) CT-RSA 2011. LNCS, vol. 6558, pp. 319–339. Springer, Heidelberg (2011)
29. Mariano, A., Timnat, S., Bischof, C.: Lock-free GaussSieve for linear speedups in parallel high performance SVP calculation. In: SBAC-PAD (2014)
30. Mariano, A., Dagdelen, Ö., Bischof, C.: A comprehensive empirical comparison of parallel ListSieve and GaussSieve. In: Lopes, L., et al. (eds.) Euro-Par 2014: Parallel Processing Workshops, Part I. LNCS, vol. 8805, pp. 48–59. Springer, Switzerland (2014)
31. Mariano, A., Laarhoven, T., Bischof, C.: Parallel (probable) lock-free HashSieve: a practical sieving algorithm for the SVP. In: ICPP (2015)
32. Micciancio, D., Voulgaris, P.: A deterministic single exponential time algorithm for most lattice problems based on Voronoi cell computations. In: STOC (2010)
33. Micciancio, D., Voulgaris, P.: Faster exponential time algorithms for the shortest vector problem. In: SODA, pp. 1468–1480 (2010)
34. Micciancio, D., Walter, M.: Fast lattice point enumeration with minimal overhead. In: SODA, pp. 276–294 (2015)
35. Nguyen, P.Q., Vidick, T.: Sieve algorithms for the shortest vector problem are practical. J. Math. Crypt. 2(2), 181–207 (2008)
36. Panigrahy, R.: Entropy based nearest neighbor search in high dimensions. In: SODA, pp. 1186–1195 (2006)
37. Plantard, T., Schneider, M.: Ideal lattice challenge. http://latticechallenge.org/ideallattice-challenge/ (2014)
38. Pohst, M.E.: On the computation of lattice vectors of minimal length, successive minima and reduced bases with applications. ACM Bull. 15(1), 37–44 (1981)
39. van de Pol, J., Smart, N.P.: Estimating key sizes for high dimensional lattice-based systems. In: Stam, M. (ed.) IMACC 2013. LNCS, vol. 8308, pp. 290–303. Springer, Heidelberg (2013)
40. Pujol, X., Stehlé, D.: Solving the shortest lattice vector problem in time \(2^{2.465n}\). Cryptology ePrint Archive, Report 2009/605 (2009)
41. Schneider, M.: Sieving for shortest vectors in ideal lattices. In: Youssef, A., Nitaj, A., Hassanien, A.E. (eds.) AFRICACRYPT 2013. LNCS, vol. 7918, pp. 375–391. Springer, Heidelberg (2013)
42. Schneider, M., Gama, N., Baumann, P., Nobach, L.: SVP challenge (2014). http://latticechallenge.org/svp-challenge
43. Schnorr, C.P.: A hierarchy of polynomial time lattice basis reduction algorithms. Theoret. Comput. Sci. 53(2), 201–224 (1987)
44. Schnorr, C.P., Euchner, M.: Lattice basis reduction: improved practical algorithms and solving subset sum problems. Math. Programming 66(2), 181–199 (1994)
45. Wang, X., Liu, M., Tian, C., Bi, J.: Improved Nguyen-Vidick heuristic sieve algorithm for shortest vector problem. In: ASIACCS, pp. 1–9 (2011)
46. Zhang, F., Pan, Y., Hu, G.: A three-level sieve algorithm for the shortest vector problem. In: Lange, T., Lauter, K., Lisoněk, P. (eds.) SAC 2013. LNCS, vol. 8282, pp. 29–47. Springer, Heidelberg (2014)