Deterministic constructions of high-dimensional sets with small dispersion

The dispersion of a point set $P\subset[0,1]^d$ is the volume of the largest box with sides parallel to the coordinate axes that does not intersect $P$. Here, we show a construction of low-dispersion point sets that can be deduced from solutions of certain $k$-restriction problems, which are well known in coding theory. It was observed only recently that, for any $\varepsilon>0$, certain randomized constructions provide point sets with dispersion smaller than $\varepsilon$ and a number of elements growing only logarithmically in $d$. Based on deep results from coding theory, we present explicit, deterministic algorithms to construct such point sets in time that is only polynomial in $d$. Note, however, that the running-time will be super-exponential in $\varepsilon^{-1}$.

Date: January 23, 2019. We would like to express our gratitude to the Erwin Schrödinger International Institute for Mathematics and Physics for its hospitality during the programme on "Tractability of High Dimensional Problems and Discrepancy", where some part of this research was carried out. We also gratefully acknowledge the support of the Oberwolfach Research Institute for Mathematics, where initial discussions were held during the workshop "Perspectives in High-Dimensional Probability and Convexity". We also thank Michael Gnewuch, Daniel Král' and Hemant Tyagi for fruitful discussions. JV was supported by the grant P201/18/00580S of the Grant Agency of the Czech Republic and by the European Regional Development Fund-Project "Center for Advanced Applied Science" (No. CZ.02.1.01/0.0/0.0/16 019/0000778).
Hence, N(ε, d) is the minimal cardinality of a point set P ⊂ [0, 1]^d that has dispersion smaller than ε.
Besides the fact that the above geometric quantities are interesting in their own right, they have also attracted attention in the numerical analysis community, especially when it comes to very high-dimensional applications. The reason is that bounds on the (minimal) dispersion lead to bounds on worst-case errors, and hence on the complexity, of some numerical problems, including optimization and approximation in various settings, see [5,27,30,32,36,37,40]. The situation is similar to that of the much more studied discrepancy, which corresponds to certain numerical integration problems, see e.g. [8,9,12,28,29,31].
Moreover, bounding the dispersion is clearly also related to the problem of finding the largest empty box. In dimension two, this is the Maximum Empty Rectangle Problem, which is one of the oldest problems in computational geometry. For the state of the art and further references we refer to [13,14,15,16,24].
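In dimension two the connection can be made concrete. The following sketch (our own illustration, not taken from the works cited above; function names are hypothetical) computes the dispersion of a finite point set in [0,1]^2 by brute force, using the fact that a maximal empty axis-parallel box has each vertical side supported either by a point coordinate or by the boundary of the unit square; the running time is O(n^3 log n).

```python
def dispersion_2d(points):
    """Brute-force dispersion of a finite point set in [0,1]^2: the area of
    the largest empty axis-parallel box. Every maximal empty box has its
    left and right sides supported by point x-coordinates or by the
    boundary of the unit square, so it suffices to enumerate those."""
    xs = sorted({0.0, 1.0} | {p[0] for p in points})
    best = 0.0
    # choose the left side a and the right side b of the candidate box
    for i, a in enumerate(xs):
        for b in xs[i + 1:]:
            # y-coordinates of points strictly between the two sides
            ys = sorted(p[1] for p in points if a < p[0] < b)
            # largest vertical gap between consecutive points (or boundary)
            gaps = [v - u for u, v in zip([0.0] + ys, ys + [1.0])]
            best = max(best, (b - a) * max(gaps))
    return best
```

For instance, the empty set has dispersion 1, and a single point at the center of the square has dispersion 1/2.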
Regarding bounds on the inverse of the minimal dispersion, we have log_2(d)/(8ε) ≤ N(ε, d) ≤ C^d/ε for some C < ∞ and all ε < 1/8, see [1], where the upper bound is attained by certain digital nets. See also [14,32] for a related upper bound. Although these bounds show the correct dependence on ε, they are rather bad with respect to the dimension d. This gap was narrowed in the past years by several authors, see [21,33,35,39], including the important work of Sosnovec, who proved that the logarithmic dependence on d is optimal. In this respect, the best bound at present is

(1.1) N(ε, d) = O(ε^{-2} log(ε^{-1}) log(d/ε)),

see [39] and Theorem 4.1 below. Note that the logarithmic dependence is special for the cube, as it is known that, for the same problem on the torus, there is a lower bound linear in d, see [38]. The main drawback of these results is that they only show the existence of point sets with small dispersion. The only explicit constructions we are aware of are the above-mentioned digital nets, which lead to a bad d-dependence, and sparse grids, which satisfy the upper bound N(ε, d) ≤ (2d)^{log_2(1/ε)}, see [21]. It is already clear from the number of points that neither leads to a construction of point sets with small dispersion that can be carried out in time polynomial in d. Moreover, as the existence proofs are based on random points on a finite grid, one could use the naive algorithm: try each of the possible configurations, calculate its dispersion (if possible) and output a set that satisfies the requested bound. The running-time of this algorithm is (in the worst case) clearly at least exponential in d.
Remark 1.1. Note that the decision problem whether a given point set has discrepancy smaller than ε is known to be NP-hard, see [18] or [11, Section 3.3], and the same is true for the dispersion if the dimension d is part of the input, cf. [6].
For upper bounds on the cost of constructing points with small discrepancy and further literature, see [10,11,17]. However, all known algorithms for this problem so far have running-time at least exponential in d. We hope that the results of this paper will lead to some progress also for this problem.
Reconsidering the bound (1.1), one may hope that points with small dispersion are constructible also in very high dimensions. Ideally, we would like to have algorithms for the construction of point sets of size N = O(N(ε, d)) with dispersion at most ε > 0, whose computational cost is polynomial in ε^{-1} and d. However, this seems to be out of reach. (Note that writing down the output alone already costs d · N.) Here, we focus on the dependence on d and show, using deep results from the theory of error-correcting codes, that point sets with small dispersion of size ∼ log(d) can be constructed in time that is polynomial in d. Unfortunately, we do not have good control of the dependence on ε^{-1}. It remains an open problem to find fully-polynomial constructions for the dispersion.
Our two main results are derandomized versions of the results from [35] and [39]. The two use different approaches and lead to somewhat different results. The first one, discussed in Section 3, leads to point sets of size O_ε(log d) that can be constructed in time linear in the size of the output, i.e., in time O_ε(d log d), see Algorithm 1 and Theorem 3.3. Here, and in the following, O_ε(log d) means that the implied constant depends on ε in an unspecified way. A second, and much more involved, construction will be given by Algorithm 2 in Section 4. The corresponding result reads as follows.
Theorem 4.4. Let ε ∈ (0, 1/4] and d ≥ 2. Then there is an absolute constant C < ∞ such that Algorithm 2 constructs a set P ⊂ [0, 1]^d with disp(P) ≤ ε and with #P bounded by a quantity that is polynomial in ε^{-1} and logarithmic in d.

Note that, in contrast to Algorithm 1, the point set that is constructed by Algorithm 2 has size that is polynomial in ε^{-1}. However, as the proof shows, its computational cost is much larger.
Our general approach is as follows. We start with a detailed inspection of the random constructions of point sets with small dispersion. This allows us to clearly separate the setting of the construction from its randomized part. It then becomes clear which properties are crucial for each of the approaches and what exactly the role of randomness is. Afterwards, we replace the randomized part by a deterministic one.

Basics from Coding Theory
Our deterministic constructions of point sets with small dispersion will be essentially obtained by certain "derandomization" of recent proofs from [35,39]. As we will rely on rather deep tools from coding theory, we summarize in this section the necessary definitions and results for later use.
2.1. Universal sets. First, we introduce the concept of (n, k)-universal sets, which is also known in coding theory under the name of the t-independent set problem. It has its roots in the testing of logical circuits [34].

Definition 2.1. A set T ⊂ {0, 1}^n is called (n, k)-universal if, for every index set S ⊂ {1, . . . , n} with #S = k, the restriction T|_S = {t|_S : t ∈ T} contains all 2^k binary vectors of length k.

Naturally, one is interested in (randomized and deterministic) constructions of small (n, k)-universal sets. The straightforward randomized construction provides the existence of an (n, k)-universal set of size ⌈k 2^k log(n)⌉. On the other hand, [20] gives a lower bound on the size of an (n, k)-universal set of the order Ω(2^k log(n)). There exist several deterministic constructions of (n, k)-universal sets in the literature (cf. [2,3,25]) and we shall rely on the results given in [26].

Theorem 2.2 (cf. [26, Theorem 6]). There is a deterministic construction of an (n, k)-universal set of size 2^{k+O(log^2(k))} log(n), which can be listed in time linear in the length of the output.
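The straightforward randomized construction mentioned above is easy to emulate for toy parameters. The following sketch (our own illustration; function names are hypothetical) samples ⌈k 2^k log n⌉ random binary vectors and verifies the universality property by exhaustive search over all k-subsets of coordinates, resampling until the check passes; by the union bound each attempt succeeds with positive probability.

```python
import itertools
import math
import random

def is_universal(T, n, k):
    """Check whether T, a collection of vectors in {0,1}^n, is
    (n,k)-universal: every k coordinates, restricted to T, must
    exhibit all 2^k binary patterns."""
    for S in itertools.combinations(range(n), k):
        patterns = {tuple(t[i] for i in S) for t in T}
        if len(patterns) < 2 ** k:
            return False
    return True

def random_universal(n, k, seed=0):
    """Sample ceil(k * 2^k * log n) random binary vectors; by the union
    bound this is (n,k)-universal with positive probability, so we
    simply resample until the exhaustive check passes."""
    rng = random.Random(seed)
    size = math.ceil(k * 2 ** k * math.log(n))
    while True:
        T = [tuple(rng.randint(0, 1) for _ in range(n)) for _ in range(size)]
        if is_universal(T, n, k):
            return T
```

Of course, the exhaustive check takes time of order n^k and is only meant to illustrate the definition; the point of Theorem 2.2 is precisely to avoid such search.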
Although the notion of an (n, k)-universal set is not very flexible and comes from a different area of mathematics, we will see in Section 3 that there is indeed a link to sets with small dispersion. Reusing known results from coding theory will already allow us to obtain our first deterministic construction of a point set with cardinality of order log(d). However, in this approach we have only very limited control of the dependence on ε.
We also need the following natural generalization of (n, k)-universal sets. We could not find this concept in the literature, but we assume that it and the subsequent lemma are known.
Definition 2.3. Let n, k ∈ N and b ≥ 2. A set T ⊂ {0, 1, . . . , b − 1}^n is called (n, k, b)-universal if, for every index set S ⊂ {1, . . . , n} with #S = k, the restriction T|_S contains all b^k patterns from {0, 1, . . . , b − 1}^k.

If b = 2, Definitions 2.1 and 2.3 coincide, i.e., (n, k, 2)-universal sets are just the usual (n, k)-universal sets. We use the following two observations to transfer the known results about (n, k)-universal sets to our setting.

Lemma 2.4. (i) Let T be an (n, k, b + 1)-universal set. Then there is an (n, k, b)-universal set of the same size. (ii) Let m ∈ N and T ⊂ {0, 1}^{mn} be an (mn, mk)-universal set. Then there is an (n, k, 2^m)-universal set of the same size.
Proof. The proof is quite straightforward. To show (i), just replace all occurrences of b among the coordinates of T by zero. For the proof of the second part, it is enough to interpret each block of m consecutive binary coordinates as the binary representation of one symbol from {0, 1, . . . , 2^m − 1}.

The direct random construction yields the existence of an (n, k, b)-universal set of size ⌈k b^k log(ebn/k)⌉. Using Theorem 2.2, we can easily obtain a deterministic construction of an (n, k, b)-universal set of only a slightly larger size.
Theorem 2.5. There is a deterministic construction of an (n, k, 2^m − 1)-universal set of size 2^{mk+O(log^2(mk))} log(n), which can be listed in time linear in the length of the output.
Proof. By Theorem 2.2, there is a construction of an (mn, mk)-universal set with at most 2^{mk+O(log^2(mk))} log(mn) elements. The result then follows from Lemma 2.4.
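The two reductions of Lemma 2.4 are simple enough to state in code. The sketch below (our own illustration, with hypothetical names) implements (ii) by reading each length-m block of bits as one symbol, and (i) by mapping the extra symbol b to zero.

```python
def blocks_to_symbols(T, m, n):
    """Lemma 2.4(ii): read each length-m block of a binary vector of
    length m*n as the binary expansion of one symbol in {0,...,2^m - 1},
    turning an (mn, mk)-universal set into an (n, k, 2^m)-universal one."""
    out = []
    for t in T:
        assert len(t) == m * n
        out.append(tuple(
            int("".join(str(bit) for bit in t[j * m:(j + 1) * m]), 2)
            for j in range(n)))
    return out

def drop_top_symbol(T, b):
    """Lemma 2.4(i): replacing every occurrence of the symbol b by 0
    turns an (n, k, b+1)-universal set into an (n, k, b)-universal one."""
    return [tuple(0 if x == b else x for x in t) for t in T]
```

Indeed, to hit a target pattern z ∈ {0, . . . , 2^m − 1}^k on k coordinates, one expands each symbol of z into m bits supported on the corresponding blocks, which is a pattern on mk binary coordinates and is therefore realized by the (mn, mk)-universal set.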

2.2. k-restriction problems. For the derandomization of the analysis of [39] we need the more flexible notion of so-called k-restriction problems, see [26, Section 2.2]. In a k-restriction problem with parameters (b, k, n, M) we are given a collection C = {C_1, . . . , C_M} of subsets C_j ⊂ {0, 1, . . . , b − 1}^k of admissible patterns, and we look for a set of vectors T ⊂ {0, 1, . . . , b − 1}^n such that, for every S ⊂ {1, . . . , n} with #S = k and every j ∈ {1, . . . , M}, some v ∈ T satisfies C_j on S, i.e., v|_S ∈ C_j. Solutions to these problems will be one of the building blocks of our deterministic construction of sets with small dispersion whose size is polynomial in 1/ε and, still, logarithmic in d. An important parameter of a k-restriction problem is the minimal size of the restriction sets C_j, i.e.,

(2.1) c := c(C) = min_{1 ≤ j ≤ M} #C_j.
Random constructions of sets solving the k-restriction problem with parameters (b, k, n, M) and C = {C_1, . . . , C_M} are based on a simple union bound. Indeed, let 1 ≤ j ≤ M and S ⊂ {1, . . . , n} with #S = k be fixed. The probability that a randomly chosen vector v ∈ {0, 1, . . . , b − 1}^n satisfies C_j on S is at least c/b^k. Consequently, the probability that there is a set S ⊂ {1, . . . , n} with #S = k and an index j ∈ {1, . . . , M} such that none of N independently chosen random vectors satisfies C_j on S is at most

C(n, k) · M · (1 − c/b^k)^N.

This expression is smaller than one if N ≥ (b^k/c)(k log(n) + log(M)). This means that there exist solutions to a k-restriction problem with parameters (b, k, n, M) of size at most ⌈(b^k/c)(k log(n) + log(M))⌉, where c is from (2.1). Theorem 1 of [26] states that there is a deterministic algorithm that outputs such a solution of size equaling the union bound. The main idea of its proof is that the random sampling can be replaced by an extensive search through a k-wise independent probability space with n random variables with values in {0, 1, . . . , b − 1}.
Theorem 2.7 (cf. [26, Theorem 1]). For any k-restriction problem with parameters (b, k, n, M), there is a deterministic algorithm that outputs a solution of size at most ⌈(b^k/c)(k log(n) + log(M))⌉, where c is from (2.1). The time taken to output the collection is polynomial in C(n, k), M, b^k and T, where T is the time complexity of the membership oracle.
Here, the membership oracle is a procedure which, for given v ∈ {0, 1, . . . , b − 1}^n, S ⊂ {1, . . . , n} with #S = k and j ∈ {1, . . . , M}, decides whether the restriction of v to S belongs to C_j. In what follows, it can be executed in time T = O(k).
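For concrete parameters, the union-bound threshold derived above is a one-line computation. The following helpers (our own notation, not from [26]) return the certified size and the corresponding failure-probability estimate, so that one can check directly that the bound drops below one.

```python
import math

def union_bound_size(b, k, n, M, c):
    """Smallest N certified by the union bound: N i.i.d. uniform vectors
    in {0,...,b-1}^n fail to solve the k-restriction problem with
    probability at most C(n,k) * M * (1 - c/b**k)**N; using the
    inequality -log(1-x) >= x, this drops below one as soon as
    N >= (b**k / c) * (k*log(n) + log(M))."""
    return math.ceil((b ** k / c) * (k * math.log(n) + math.log(M)))

def failure_probability(b, k, n, M, c, N):
    """The union-bound estimate for the probability that N random
    vectors do not solve the problem."""
    return math.comb(n, k) * M * (1 - c / b ** k) ** N
```

For instance, with b = 2, k = 2, n = 10, M = 1 and c = 1, the certified size is 19 and the estimated failure probability is indeed below one.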

2.3. Splitters. The last ingredient of our derandomization procedure are splitters. They played a central role in [26] as the basic building blocks of all deterministic constructions given there. Essentially, they allow us to split a large problem into smaller problems, which can then be treated by the extensive search of Theorem 2.7. Similarly to [26] and [4], we will rely on (n, k, k^2)-splitters. By Definition 2.8, A(n, k) is an (n, k, k^2)-splitter if it is a collection of mappings a : {1, . . . , n} → {1, . . . , k^2} such that for every S ⊂ {1, . . . , n} with #S = k, there is an a ∈ A(n, k) which is injective on S.
Splitters can be obtained from error-correcting codes over the alphabet {1, . . . , k^2}. Indeed, identify a code with n codewords of length L over an alphabet of size q = k^2 with the collection of its L coordinate mappings a_i : {1, . . . , n} → {1, . . . , k^2}, where a_i(j) is the i-th symbol of the j-th codeword. If the code has normalized Hamming distance at least δ = 1 − 2/k^2, then any two distinct codewords agree on fewer than L/C(k, 2) coordinates, so for every S ⊂ {1, . . . , n} with #S = k some coordinate separates all C(k, 2) pairs of the corresponding codewords, i.e., some a_i is injective on S. Such explicit codes exist by [3] with L = O(k^4 log n). To see this, note that the rate R of a code as above is defined by R := log_{k^2}(n)/L = log(n)/(L log(k^2)). By [3, eq. (5)], see also [4, Lemma 3], an explicit code with normalized Hamming distance at least δ = 1 − 2/k^2 exists whenever R ≤ γ_0 (1 − H_q(δ)), where γ_0 > 0 is an absolute constant and H_q(x) := −x log_q(x) − (1 − x) log_q(1 − x) + x log_q(q − 1). Simple computations show that this implies R ≥ c/(k^4 log(k^2)) for some c > 0. This, in turn, implies that we can choose L = O(k^4 log n).
The explicit construction of [3] yields a linear code that is based on a two-fold concatenation combining the Wozencraft ensemble, Justesen codes and expander codes, which in turn rely on famous deterministic constructions of expander graphs [23]. This construction is 'uniformly constructive' (see [3]), i.e., it can be carried out in time growing only polynomially in n. Furthermore, the code satisfies [3, eq. (5)] for all δ < 1 − 1/q, where q = k^2 in our case. Note also that the running-time depends only polynomially on k as well, cf. [26, Thm. 3(iv)]. For the details we refer to Sections 3 and 4 of [3].
The following lemma summarizes the discussion above and shows that (n, k, k^2)-splitters of relatively small size can be constructed explicitly in polynomial time.

Lemma 2.9 (cf. [4, Lemma 3]).
There is an explicit (n, k, k^2)-splitter of size O(k^4 log(n)) that can be constructed in time polynomial in n and k.
The (n, k, k^2)-splitters can be used whenever n is "very large" compared to k. Roughly speaking, and in the context of the present paper, we will transform a k-restriction problem of size n to a k-restriction problem of size k^2, which can then be solved using the results of Section 2.2. This will lead to construction algorithms with an apparently optimal dependence of their running time on the original problem size n. This approach was already used to prove [26, Theorem 6], see Theorem 2.5.
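The reduction of splitters to error-correcting codes discussed above is easy to emulate for toy parameters. In the sketch below (our own illustration; it does not use the explicit codes of [3], only a small hand-made code), each coordinate of the code defines one mapping of the splitter, and the splitter property is verified by brute force.

```python
import itertools

def splitter_from_codewords(codewords):
    """Each of the L coordinates of a q-ary code with n codewords defines
    one mapping a: {0,...,n-1} -> alphabet, namely a(j) = the j-th
    codeword's symbol at that coordinate; large minimal distance makes
    some mapping injective on every small index set."""
    L = len(codewords[0])
    return [[w[i] for w in codewords] for i in range(L)]

def is_splitter(mappings, n, k):
    """Brute-force check of the (n, k, k^2)-splitter property: every
    k-element subset S of {0,...,n-1} is mapped injectively by some a."""
    return all(
        any(len({a[s] for s in S}) == k for a in mappings)
        for S in itertools.combinations(range(n), k))
```

For k = 2 the splitter property only requires that every pair of indices is separated by some mapping, i.e., that the codewords are pairwise distinct.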

Derandomization of Sosnovec's proof
First, we consider the construction of Sosnovec [35], which gives a logarithmic dependence of N(ε, d) on d but involves no special control of its dependence on ε^{-1}. His main theorem shows that, for every fixed ε ∈ (0, 1), N(ε, d) ≤ c_ε log(d) with a constant c_ε depending only on ε. We will see that this result can be essentially derandomized using results from coding theory, while losing only a negligible factor. The drawback of this approach is the extremely bad dependence of c_ε on ε.

We sketch the main ideas of Sosnovec's proof. Let B = I_1 × · · · × I_d ∈ Ω_m. The key observation of [35] is that the number of indices j ∈ {1, . . . , d} with M_m ⊄ I_j is bounded from above by m 2^m, a quantity independent of d. To be more specific, if we denote

A(B) := {j ∈ {1, . . . , d} : M_m ⊄ I_j},

then #A(B) ≤ A_m := min(m 2^m, d) for every B ∈ Ω_m. We will refer to A(B) as the set of "active indices" of B. If A(B) is not of the full possible size, we enlarge it by adding any of the other indices to obtain a set with cardinality equal to A_m. Therefore, we can associate to each B ∈ Ω_m (possibly in a non-unique way) a set A with #A = A_m and a vector z ∈ M_m^{A_m} such that any x ∈ M_m^d with x|_A = z lies in B. Vice versa, if we have a point set P = {x_1, . . . , x_N} ⊂ M_m^d such that for every A ⊂ {1, . . . , d} with #A = A_m and every z ∈ M_m^{A_m} there is some x_j ∈ P with x_j|_A = z, then, by what we just said, P intersects every B ∈ Ω_m. Therefore, the dispersion of P cannot be larger than 2^{-m}, i.e., disp(P) ≤ 2^{-m} and hence N(2^{-m}, d) ≤ N.
To simplify the combinatorial part later on, we multiply all coordinates by 2^m, which results in vectors with integer components. This motivates the following definition.

3.3. Derandomization using universal sets. Definition 3.2 closely resembles the concept of (n, k)-universal sets, see Section 2.1. In particular, it is easy to see that every (d, A_m, 2^m − 1)-universal set, after adding 1 to each coordinate, satisfies condition (S) of order m. Therefore, we can use Theorem 2.5 to replace the random arguments of the last section by a deterministic algorithm. We glue all the components together in the form of an algorithm.
The remaining operations can be done in linear time, without enlarging the point set.
As expected, the dependence of the size of P on 2^m ≈ ε^{-1} is rather bad (as it was in [35]), but the dependence on d is indeed only logarithmic.
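The passage from universal sets to point sets described in this section can be illustrated directly. The following sketch (our own, with hypothetical names; the trivial full product set stands in for the output of Theorem 2.5) shifts a (d, A_m, 2^m − 1)-universal set by one, rescales it by 2^{-m}, and checks condition (S) by exhaustive search.

```python
import itertools

def points_from_universal(T, m):
    """Shift a (d, A, 2^m - 1)-universal set over {0,...,2^m - 2} by one
    and rescale by 2^-m, yielding points on the grid
    M_m = {1/2^m, ..., (2^m - 1)/2^m}, as in Section 3."""
    return [tuple((x + 1) / 2 ** m for x in t) for t in T]

def satisfies_condition_S(P, d, A, m):
    """Brute-force check of condition (S) of order m: for every index set
    of size A, the points (assumed to lie in M_m^d) restricted to it must
    realize all (2^m - 1)^A patterns over the grid M_m."""
    n_patterns = (2 ** m - 1) ** A
    for S in itertools.combinations(range(d), A):
        have = {tuple(p[i] for i in S) for p in P}
        if len(have) < n_patterns:
            return False
    return True
```

By the discussion above, any point set passing this check intersects every box in Ω_m and therefore has dispersion at most 2^{-m}.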

Improving the dependence on ε^{-1}
The main aim of [39] was to refine the analysis of [35] and to achieve a better dependence of N(ε, d) on ε, without sacrificing the logarithmic dependence on d. The main theorem of [39] was the following.
This result, too, can be derandomized using results from coding theory. In doing so, we will lose some power of 1/ε in the size of the point set. However, it will still be of order log(d).

4.1. Enhanced analysis of the random construction. The main novelty of [39] was a more careful splitting of Ω_m (see (3.1)) into subgroups. To be more specific (and using the notation of [19]), for s = (s_1, . . . , s_d) ∈ {1, . . . , 2^m − 1}^d and p = (p_1, . . . , p_d) ∈ M_m^d, we denote by Ω_m(s, p) those boxes B = I_1 × · · · × I_d ∈ Ω_m which, for all ℓ = 1, . . . , d, have I_ℓ approximately of length s_ℓ 2^{-m} and with left endpoint close to p_ℓ. We denote by I_m the set of pairs (s, p) for which Ω_m(s, p) is non-empty. It is easy to see that their number is bounded from above by

(4.1) #I_m ≤ C(d, A_m) · 2^{2m A_m} ≤ (2^{m+3} d)^{A_m}.

The aim of [39] was to combine (4.1) with Lemma 4.2 and the union bound. Indeed, the probability that a randomly chosen point z ∈ M_m^d avoids B_m(s, p) is at most 1 − 2^{-m-4}. Therefore, the probability that a set P = {x_1, . . . , x_N} ⊂ M_m^d of N randomly and independently generated points does not intersect B_m(s, p) is at most (1 − 2^{-m-4})^N. By the union bound over all (s, p) ∈ I_m, we get further

P(∃(s, p) ∈ I_m : ∀ℓ ∈ {1, . . . , N} : x_ℓ ∉ B_m(s, p)) ≤ #I_m · (1 − 2^{-m-4})^N.

As B_m(s, p) was defined in (4.2) as the intersection of all boxes from Ω_m(s, p), finding a point x_ℓ ∈ B_m(s, p) means that the same point lies in all boxes of Ω_m(s, p). We conclude that if N is large enough to ensure that #I_m · (1 − 2^{-m-4})^N < 1, i.e., if N > m 2^{2m+4} log(2^{m+3} d), then the randomly generated P = {x_1, . . . , x_N} ⊂ M_m^d intersects every B ∈ Ω_m with positive probability. Hence, there exists P with #P ≤ N such that disp(P) ≤ 2^{-m}. This is essentially the result of [39].
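Numerically, the threshold from the union bound above is easy to evaluate. The helper below (our own notation) returns the certified cardinality and illustrates that the growth in d is only logarithmic.

```python
import math

def points_needed(m, d):
    """Smallest integer N with N > m * 2**(2m+4) * log(2**(m+3) * d); by
    the union bound of this section, N random points of the grid M_m^d
    intersect every box in Omega_m with positive probability, and hence
    N(2**-m, d) <= points_needed(m, d)."""
    return math.floor(m * 2 ** (2 * m + 4) * math.log(2 ** (m + 3) * d)) + 1
```

For fixed m, multiplying d by a large power of ten only adds a constant to the logarithm, so the certified size grows very slowly with the dimension.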

4.2. Connection to k-restriction problems. By what we said above, if a point set P ⊂ [0, 1]^d intersects B_m(s, p) for all (s, p) ∈ I_m, then disp(P) ≤ 2^{-m}. The randomized construction in [39], which we now want to replace by a deterministic one, was restricted in its choice of points to M_m^d. Therefore, we define the corresponding restriction sets C_m(s, p) and collect them into the set system C, see (4.3). We observe that a set T satisfies the condition (S′) of order m if, and only if, the set T − 1 satisfies the k-restriction problem with respect to C.
The parameter M, which is just the cardinality of C, can be estimated from above in a way similar to (4.1), but note that we do not have to choose the subset of active indices anymore. Each C_m(s, p) ∈ C is characterized by s ∈ {0, 1, . . . , 2^m − 1}^{A_m} and p ∈ {1/2^m, . . . , (2^m − 1)/2^m}^{A_m}. Therefore,

(4.4) M ≤ 2^{2m A_m}.

The second important parameter of a k-restriction problem is the minimal size c := c(C) of the restriction sets C_m(s, p), see (2.1). A lower bound on c follows directly from Lemma 4.2 and we obtain

(4.5) c ≥ P(z ∈ B_m(s, p)) · #M_m^{A_m} ≥ 2^{-m-4} (2^m − 1)^{A_m}.

With this choice of parameters, we have c/b^k ≥ 2^{-m-4}.

4.3. A first attempt at derandomization. Using the arguments of the last section, one could apply the construction from Theorem 2.7 directly to solve the corresponding k-restriction problem with parameters (2^m − 1, A_m, d, 2^{2m A_m}), whenever d > 2^m. This leads to a point set P ⊂ M_m^d with disp(P) ≤ 2^{-m} and

#P ≤ 2^{m+4} (A_m log(d) + log(2^{2m A_m})) = O(m^2 2^{2m} log(d)).

Note that this bound matches the union bound from Section 4.1. However, the running-time of the algorithm, as given by Theorem 2.7, involves a factor of order C(d, A_m); here T, the time complexity of the membership oracle, can be assumed to be of the order O(A_m) = O(m 2^m) in this case.

4.4. Derandomization using splitters. We now describe how we can improve the construction of an explicit solution to the desired k-restriction problem with parameters (b, k, n, M) equal to (2^m − 1, A_m, d, 2^{2m A_m}) and the set system C defined by (4.3). We use the approach of [26] to obtain solutions of the k-restriction problem which are 'small' in size and in the running-time of the corresponding algorithm. 'Small' means here that the dependence on the original problem dimension d is as small as possible.
At the heart of the construction are splitters, see Section 2.3. As already indicated there, we use a (d, A_m, A_m^2)-splitter, say A(m, d), to map the original d-dimensional problem to a k-restriction problem in dimension A_m^2. Let T(m) denote a solution of this smaller problem, obtained from Theorem 2.7, and define T* := {τ ∘ a : τ ∈ T(m), a ∈ A(m, d)}. To show that T* is indeed a solution to our k-restriction problem, let S ⊂ {1, . . . , d} with #S = A_m. Then there exists a ∈ A(m, d) such that S′ = a(S) ⊂ {1, 2, . . . , A_m^2} has A_m mutually different elements, i.e., #S′ = A_m. Now, for every C ∈ C, there is some τ ∈ T(m) such that C ∋ τ|_{S′} = (τ ∘ a)|_S. Hence T*, which satisfies #T* = #T(m) · #A(m, d), is a solution to the restriction problem with parameters (2^m − 1, A_m, d, 2^{2m A_m}). We merge all the components together in the form of an algorithm.
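The composition step behind T* can be written down directly. In the sketch below (our own illustration; splitter mappings are represented as 0-indexed lists of length d, and T_small stands for any solution of the A_m^2-dimensional problem), T* is obtained by precomposing each small solution vector with each splitter mapping.

```python
def compose_solution(T_small, splitter_maps):
    """Form T* = {tau o a : tau in T(m), a in A(m, d)}: each splitter
    mapping a lifts a solution vector tau of the small problem to a
    vector of length d, so that #T* = #T(m) * #A(m, d)."""
    return [tuple(tau[a_i] for a_i in a)
            for tau in T_small for a in splitter_maps]
```

The size of the output is exactly the product of the two input sizes, in line with #T* = #T(m) · #A(m, d).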