Greedy energy minimization can count in binary: point charges and the van der Corput sequence

This paper establishes a connection between a problem in Potential Theory and Mathematical Physics, arranging points so as to minimize an energy functional, and a problem in Combinatorics and Number Theory, constructing 'well-distributed' sequences of points on $[0,1)$. Let $f:[0,1] \rightarrow \mathbb{R}$ be (i) symmetric $f(x) = f(1-x)$, (ii) twice differentiable on $(0,1)$, and (iii) such that $f''(x)>0$ for all $x \in (0,1)$. We study the greedy dynamical system, where, given an initial set $\{x_0, \ldots, x_{N-1}\} \subset [0,1)$, the point $x_N$ is obtained as $$ x_{N} = \arg\min_{x \in [0,1)} \sum_{k=0}^{N-1}{f(|x-x_k|)}.$$ We prove that if we start this construction with the single element $x_0=0$, then all arising constructions are permutations of the van der Corput sequence (counting in binary and reflected about the comma): \textit{greedy energy minimization recovers the way we count in binary.} This gives a new construction of the classical van der Corput sequence. The special case $f(x) = 1-\log(2 \sin(\pi x))$ answers a question of Steinerberger. Interestingly, the point sets we derive are also known in a different context as Leja sequences on the unit disk. Moreover, we give a general bound on the discrepancy of any sequence constructed in this way for functions $f$ satisfying an additional assumption.


Introduction and main result
iii) what do these configurations look like? These old questions are far from answered: the minimal energy is known to have an asymptotic expansion, the first few terms are known; see e.g. [3,4,5,13,14]. Relatively little is known about minimal energy configurations, they seem to arrange themselves in a hexagonal pattern which is one instance of the crystallization conjecture [2,19,25]. The problem is notoriously difficult: the special case of N = 5 points was only very recently solved by R. Schwartz [21]. For some special values of N there are constructions that are known; we refer to the seminal work of Cohn-Kumar [9]. Whenever there are topological obstructions to triangular lattice, the defects seem to localize in scars [8]. Needless to say, many of these questions remain interesting and many of the results carry over to the case of a more general domain instead of S 2 and a more general energy functional where f : R → R. The choice f (x) = x −s , Riesz potentials, is among the most popular. The only case where the problem is known to have been widely solved is the case of the one-dimensional torus: for a rather large class of functions the optimal arrangement is known to be equispaced points [6].
Given the difficulty of these questions, one would assume that it is much more difficult to study the dynamical problem, where one starts with a given set of points {x 0 , . . . , x N −1 } ⊂ [0, 1) (or possibly just a single point) and then defines the next element in the sequence in a greedy fashion via (1) x N = arg min i.e. adding the point in the location of the minimum of the energy (with the caveat that should the minimum not be unique, then any choice is admissible). One of the main contributions of our paper is that for a wide class of functions f , this dynamical system can indeed be studied for one-periodic functions, i.e. on the one-dimensional torus T identified with the unit interval. Moreover, it gives rise to a surprisingly rigid dynamical structure.

A Problem in Combinatorics and Number
Theory. Suppose we want to distribute a sequence of points X = (x n ) ∞ n=1 evenly over [0, 1) in the most regular fashion -how would one do it? If we knew in advance that we want to place N points, then we would presumably place them at equidistant intervals. However, what if we wanted the sequence to be distributed regularly at all times (i.e. also at all intermediate stages) ? We now make this notion precise and define the extreme discrepancy of the first N points of X as It is easy to see that D N ≥ 1/N . How small can it be? In particular, is there a sequence X such that D N (X) ≤ cN −1 for all N ∈ N? This question, originally due to van der Corput, was answered by van Aardenne-Ehrenfest who showed that no such result exists; see [15] for details. The problem was finally solved by Schmidt [20] who proved that D N ≥ cN −1 log N for infinitely many N ∈ N; see [12,15,18,20] and references therein for smallest known constants. Sequences with D N (X) ≤ cN −1 log N for a constant c > 0 and all N are called low discrepancy sequences. In particular, this bound matches classical constructions of sequences, one of the most famous of which is the van der Corput sequence ( [27,28] The van der Corput sequence and its various generalisations are known to be very close to optimal with regards to discrepancy; see [11,12,18].

A possible connection.
There are very few known constructions of low discrepancy sequences, most of them are based on structures from Number Theory or Combinatorics. Low discrepancy sequences are important in numerical integration since they ensure smallest possible approximation errors; see [10]. While the theory is very well understood in one dimension, there are to-date two different conjectures of the sharp lower bound on the discrepancy of arbitrary sequences in more dimensions; [10]. In particular, it is not clear whether there exist sequences more regular than anything we can construct so far.
Motivated by this, Steinerberger [23,24] recently proposed to study whether regular sequences could be constructed via dynamical systems of the type outlined in (1). More precisely, suppose we are given {x 0 , . . . , x N −1 } ⊂ [0, 1), then he proposed to construct x N in a greedy manner as (2) x N = arg min and if the minimum is not unique, any choice is admissible. Steinerberger proves that, independently of the initial conditions, such sequences satisfy D N ≤ cN −1/2 log N . Moreover, dropping the technical gap condition in (2), he conjectures: (A) (weak form): the discrepancy of every such sequence satisfies (B) (strong form): and the constant c does not depend on the initial set from which this construction is started. We give affirmative answers to these conjectures in Theorem 2.1 in the special case when we are given only one initial point x 0 . It was conjectured that if we start with x 0 = 0, then the van der Corput sequence is an admissible choice for the greedy selection rule; see Figure 1. The main purpose of our paper is to prove that this is indeed the case and that it holds at a much larger level of generality for dynamical systems of this type; for a precise statement of this result see Theorem 2.1. Moreover, we give a general discrepancy bound in Theorem 2.6 for a subset of the functions considered in Theorem 2.1 and arbitrary initial choices which is slightly worse than Steinerberger's bound for the special case of f (x) = 1 − log(2 sin(π|x|)).  Figure 1. The set M 10 of the first 10 points (black) of the classical van der Corput sequence and the function 9 k=0 1 − log(2 sin(π|x − x k |)). The red points are the candidates for the next point as suggested by the greedy algorithm; note that the x 10 = .0101 2 = 5/16.

1.4.
Outline. We state our main results, Theorem 2.1 and Theorem 2.6, in Section 2; in particular this section also contains in Subsection 2.3 the definition of the sequences used to characterise the output of the greedy algorithm of Steinerberger as well as an outline of our proof strategy. In Section 3 we review properties of the van der Corput sequence. We also recall important generalisations and the method of Faure to calculate the discrepancy of permuted van der Corput sequences. In Section 4 we study the discrepancy of the particular generalised van der Corput sequences used to characterise the output of the algorithm. As a main result we show in Theorem 4.5 resp. in Corollary 4.6 that all such sequences have the same discrepancy. In Section 5 we relate our results to the greedy algorithm. We show in Theorem 5.5 that the classical van der Corput sequence is an admissible output of the algorithm and we finally prove Theorem 2.1. Section 6 contains a proof of Theorem 2.6.

Main results
2.1. Preliminaries. Let S b be the set of all permutations of {0, 1, . . . , b − 1}, i.e. all permutations of the first b non-negative integers. Throughout this paper we sometimes write a concrete permutation σ as tuple, e.g. σ = (0, 2, 4, 1, 3) meaning that 0 is mapped to 0, 1 is mapped to 2, 2 is mapped to 4 and so on; i.e. the number at the i-th position of the tuple is the image of i under σ. Moreover, for tuples of numbers τ 1 , τ 2 of length b 1 , b 2 we write (τ 1 , τ 2 ) for the concatenation of the two tuples; i.e. (τ 1 , τ 2 ) is a tuple of numbers of length b 1 + b 2 . Finally, for permutations (or tuples) of fixed length b, we are sometimes only interested in its initial segment of the first k numbers. In this case we may write σ = (τ k , * ) meaning that the first k elements of the permutation are determined by τ k and the rest of the permutation can be any arrangement of the remaining b − k numbers.

2.2.
Permuted van der Corput sequences. Let a j (n) denote the jth coefficient in the binary representation n = a 0 · 2 0 + a 1 · 2 1 + a 2 · 2 2 + . . . = ∞ j=1 a j 2 j of an integer n, in which 0 ≤ a j (n) ≤ 1 and if 0 ≤ n < 2 m , then a j (n) = 0 for all j ≥ m. The binary radical inverse function is defined as S 2 : N 0 → [0, 1), Then the classical van der Corput sequence is defined as S 2 = (S 2 (n)) n≥0 . Faure [11] generalised the definition of the classical van der Corput sequence in two ways. First, he replaced the binary representation of an integer by its general b-adic representation for a fixed integer base b ≥ 2. This allows for the definition of the b-adic radical inverse function S b , which in turn can be used to define van der Corput sequences in general bases; i.e. S b = (S b (n)) n≥0 . Furthermore, for σ ∈ S b Faure defines the generalised (or permuted) van der Corput sequence S σ b = (S σ b (n)) n≥0 for a fixed base b ≥ 0 via the permuted b-adic radical inverse function Every such sequence is uniformly distributed modulo 1; [11, Propriété 3.1.1].

2.3.
Family of permutations. We inductively define a set of permutations P m ⊂ S 2 m in each basis b m = 2 m . We start with b 1 = 2 and P 1 = {(0, 1)} and we obtain the set P m+1 from P m in the following way: We first multiply each permutation σ ∈ P m with 2 and denote the resulting tuple of numbers as 2σ and the set of all such tuples as 2P m . Next, each 2σ ∈ 2P m gives rise to 2 m new tuples: For each odd a with 1 ≤ a ≤ 2 m+1 we form a new tuple 2σ ⊕ a by adding a to 2σ (addition modulo 2 m+1 ). The set of all such tuples is denoted by 2P m ⊕ a. Finally, the set P m+1 is defined as the set of all permutations (2σ, 2σ ⊕ a) for σ, σ ∈ P m and odd a with 1 ≤ a ≤ 2 m+1 . As examples we construct P 2 and P 3 . First, we have that

Main result and open questions.
We can now state our first main result.
for N > 0 and where every global minimum is admissible if it happens not to be unique. Then for every fixed N > 0 there exist an m > 0 with N ≤ 2 m and a σ ∈ P m such that . Remark 1. Note that our assumptions on f imply that f is a strictly convex function on [0, 1]; but not every strictly convex function necessarily satisfies f (x) > 0 for all x ∈ (0, 1). However, every uniformly convex function, i.e. f such that f (x) ≥ θ > 0, satisfies our assumptions.
Remark 2. The point sets we obtain are closely related to Leja sequences on the unit disk. Let K be a nonempty compact subset of the complex plane. A sequence (a n ) n≥0 of points in K is a Leja sequence for K if the following extremal metric property holds for N ≥ 1, Thus, the (N + 1)st term a N of a Leja sequence must maximise the product of distances to the N previous ones; see [1,16]. There are very few classes of compact sets for which Leja sequences are explicitly known. However, if K is the unit disk then the structure of the corresponding Leja sequences are known and described in [1, Theorem 5 and Corollary 2]. Importantly, the maximum principle implies that these points lie on the boundary of K; i.e. on the unit circle if K is the unit disk.
Note that Leja sequences are defined via an explicit functional which is studied on different compact sets. In our setting, we fix the domain and vary the functionals. It is interesting that our minimising sequences have the same structure as the particular Leja sequences on the unit disk.
Remark 3. We believe Theorem 2.1 is a strong indicator that the dynamical version of the static equilibrium problem in mathematical physics might give rise to interesting structures. In the one-dimensional case, it certainly connects in a very substantial way to structures in number theory. To emphasize this, we explicitly state that: Greedy energy minimization on the one-dimensional torus automatically recovers the way we count in binary.  We suspect that the classical van der Corput sequence and its permutations studied in this paper form in a way a unique (and natural) link between sequences constructed via greedy energy minimisation and sequences obtained from traditional methods in number theory.
2.5. Outline of proof. We prove Theorem 2.1 in 2 steps. In the first step (i.e. in Section 4) we calculate the discrepancy of permuted van der Corput sequences S σ 2 m , σ ∈ P m . We conclude in Corollary 4.6 that all such sequences have the same extreme discrepancy as the classical van der Corput sequence S 2 . Our argument uses the general machinery of Faure for the calculation of the discrepancy of permuted van der Corput sequences which we recall in Section 3 and is based on various symmetries exhibited by the permutations in P m .
In the second step (i.e. in Section 5) we relate our particular family of permuted van der Corput sequences to the greedy algorithm of Steinerberger. We show that every such sequence can be obtained from the algorithm (Lemma 5.8) and that every output of the algorithm can be described by such a sequence (Lemma 5.7). Thus, we obtain a full characterisation of the possible outputs of the greedy algorithm and by the results of the first step we also know the discrepancy of any such output.
This concludes the proof of Theorem 2.1.
2.6. General discrepancy bound. Finally, we conclude with a general discrepancy bound for sequences generated from the greedy algorithm and an arbitrary initial set of values. We assume wlog that f has mean 0 and require two additional assumptions on f . We obtain the discrepancy bound for functions f that are bounded and whose Fourier coefficientsf (k) defined as satisfy a condition on their decay.
Theorem 2.6. Let f : [0, 1] → R be as in Theorem 2.1, assume in addition that f is bounded and normalised to have mean value 0, and that for N ≥ m, where every global minimum is admissible if it happens not to be unique. Then this sequence satisfies D N (X) ≤c N 1/3 , in which the constantc > 0 only depends on the initial set.

Properties of the van der Corput sequence
Faure [11] developed a comprehensive machinery to calculate the discrepancy of permuted van der Corput sequences as defined in Section 2.2. Our proof of Theorem 2.1 makes use of this machinery; i.e. we use Faure's method to show that all permutations σ ∈ P m generate permuted van der Corput sequences with the same discrepancy as the classical van der Corput sequence in base 2. To facilitate our proof we recall the main properties of van der Corput sequences as well as Faure's method in this section. Our main result of this section is Lemma 3.6 in which we relate the classical van der Corput sequence to our family of permutations P m , m ≥ 1.
Let Note that Faure does not divide the discrepancy by N ; i.e. his formulas for the discrepancy (e.g. in [11]) give N D N (X) in our notation.
3.1. Self similarity of van der Corput sequences. The classical van der Corput sequence and its permuted generalisations are self similar as shown in the following two lemmas. Let Proof. We have that We can repeat the calculation from the proof of Lemma 3.1 to obtain the following general version for 3.2. Discrepancy of permuted van der Corput sequences. In the following we introduce Faure's system of basic functions which can be used to calculate the discrepancy of generalised van der Corput sequences. The theory of Faure was applied to the study of various point sets and sequences over the last 40 years and is very well explained and illustrated in the recent survey [12]. The exact formulas for the discrepancy of generalised van der Corput sequences are based on a set of elementary functions which are defined for any permutation σ ∈ S b . Let For an infinite, one-dimensional sequence X we set Then we get from [11,Théorème 1] for N ≥ 1 that Note that the infinite series in these formulas can indeed be computed exactly; for details we refer to [11, Section 3.3.6, Corollaire 1]. The exact formulas can be used for the asymptotic analysis of the discrepancy of generalised van der Corput sequences. By [11, Théorème 2], Furthermore, we introduce the function and rewrite (4) as Note that the local maxima of ψ σ b have arguments of the form x = k/b for k ∈ N with 0 ≤ k ≤ b − 1. As shown in [11,Lemme 4.2.2], the sequence (max x∈[0,1] F σ m (x)/m) m≥1 is decreasing. In particular, [11,Théorème 6] the asymptotic constants for van der Corput sequences that are generated from the identity permutation in base b are calculated: Consequently, the only sequence generated from an identity permutation that improves the result of the classical sequence in base 2 is S 3 ; for more information see the recent survey [18].

3.3.
Symmetries. The Symmetry as well as the Swapping Lemma facilitate the analysis of structurally similar permutations and state that a shift, reflection or reversal of a permutation does not change the discrepancy of the generated sequence. This is not surprising since we interpret [0, 1] as a circle.

Lemma 3.3 (Symmetry Lemma).
Let 0 < a < b be an integer, let σ ∈ S b and let σ , σ ∈ S b be defined as Then it holds for all N that for 0 ≤ k < b and 0 ≤ k < c. Then, Remark 5. If we set σ = τ , then the intrication σ · σ gives a permutation in base b 2 whose ψ-function is the function F σ 2 defined in (5). In this special case the new permutation generates the same sequence as the original permutation.
Hence, every permutation σ m is contained in the set P m which we defined in Section 2.3.

A family of permutations
In this section we study the discrepancy of sequences generated from permutations in P m ⊂ S 2 m . The main result of this section is the observation that all the functions ψ σ 2 m are identical for σ ∈ P m . Thus, using Lemma 3.6, we see that all permutations in P m generate permuted van der Corput sequences with the same asymptotic discrepancy constant (in fact, with the same discrepancy) as the classical van der Corput sequence. We summarise the main observations of this section in Corollary 4.6.
We start with an important technical observation.  To illustrate this definition, observe that the two sets {0, 1, 2, 4, 6} and {3, 5, 7} are 3-inverse. Being m-inverse implies an important equality. We formulate the next lemma in the specialised way in which we will use it later.
To prepare for the proof of the main theorem of this section, we make one more technical observation, which we state again in the special form in which we will use it below. Lemma 4.4. Let σ ∈ P m and let σ := σ • µ 2 m , in which µ 2 m is the swapping permutation. Let τ k denote the first k elements of 2σ ⊕ a and τ 2 m −k denote the first 2 m − k elements of 2σ ⊕ a for odd 1 ≤ a ≤ 2 m+1 and addition modulo 2 m+1 . Then τ k ∪ {0, 2, . . . , 2 m+1 − 2} and τ 2 m −k are (m + 1)-inverse.
Proof. Note that the swapping permutation simply reverses the elements of σ; i.e.
for all 0 ≤ i ≤ 2 m − 1. Hence, the last 2 m − k elements of σ are exactly the first 2 m − k elements of σ .  It is easy to see that both permutations give rise to the same ψ-function. To turn to the induction step, we assume that the assertion is true up to m. In particular this means that all permutations in the set P m generate permuted van der Corput sequences with identical ψ-function.
By Lemma 4.1 all tuples in 2P m have identical ψ-functions for x ∈ [0, 1/2]. By Lemma 3.3 adding a constant modulo 2 m to the permutations in P m does not change the ψ-function. Hence, also all tuples in 2P m ⊕ a have identical ψ-functions for x ∈ [0, 1/2]. Furthermore, by Lemma 3.4 swapping a permutation does not change its ψ-function. Hence, our considerations also apply to all swapped permutations σ • µ 2 m . Now take arbitrary σ,σ ∈ P m and build a permutation (2σ, 2σ ⊕ a) ∈ P m+1 . We already know that the ψ-functions of all such permutations are identical for x ∈ [0, 1/2]. Now letτ k denote the first k elements of 2σ ⊕ a. By Lemma 4.4 we know thatτ k ∪ {0, 2, . . . , 2 m − 2} is (m + 1)-inverse toτ 2 m −k . Hence, we can apply Lemma 4.3 to calculate the value of [. However, we know that the swapped permutationσ • µ 2 m has the same ψ-function on [0, 1/2] as every other permutation in P m . Thus, we obtain the values ψ (2σ,2σ⊕a) 2 m+1 on [1/2, 1] via the reflection of the ψ function on [0, 1/2] of any other tuple in 2P m .
Since we picked an arbitrary permutation in P m+1 we conclude that every permutation in P m+1 has the same ψ-function. In the following we relate the results of Section 3 and 4 to the greedy algorithm and prove Theorem 2.1. Proof. We have that from which we see that g (A/4) = 0. Since f is increasing, we have that g is positive and therefore x = A/4 is a minimum of g.
Moreover, for x ∈ [0, A/2], we have that and since f is increasing, also g is increasing on [0, A/2].
Due to symmetry we can get a copy of g in the interval [A/2, A]. The functional studied by Steinerberger [23,24] is an example of an admissible function; for x ∈ [0, 1]f (x) := 1 − log(2 sin(πx)); see Figure 2 for an illustration.   Proof. We write N = N 1 = k j=1 2 m j in its binary representation such that m k > m k−1 > . . . > m 1 is a decreasing sequence of non-zero integers. According to this representation we split the set M N into disjoint subsets such that there is one subset M j containing 2 m j points for every m j ; in particular we write  in which N i := k j=i 2 m j . Due to the self similarity of the van der Corput sequence (see Lemma 3.1) we get a shifted versionG m j of the function G m j for each set M j ; see Figure 4 and Figure  5. To conclude our proof we show that for each k > j ≥ 1 the set of minima ofG m j is contained in the set of minima of the functionG m j+1 . From Lemma 3.1 we know that and M j+1 = M N j+1 ,N j+2 . Furthermore from Lemma 5.4 we know that the minima ofG m j are at the points whereas the minima ofG m j+1 are at Now it is easy to see that all arguments of minima of the form (7) are also of the form (8).
The set of arguments of global minima of the resulting function H N = N −1 h=0 f h for M N is then the intersection of the set of all arguments of minima of the functionsG m j . In particular, this set contains exactly the 2 m 1 points of the form It follows from the definition of the van der Corput sequence that the (N + 1)st point of the sequence is among them.
Finally, we observe that every sequence generated by a permutation in P m can be realised by the algorithm (when starting with {0}) and that for a fixed 2 m−1 < N ≤ 2 m every output of the algorithm can be found as initial segment of a sequence S σ 2 m with σ ∈ P m . Definition 5.6. We call a set {x 0 , x 1 , . . . , x N −1 } of N points admissible if it can be obtained as output of the algorithm (1) for an admissible function f on [0, 1] when starting with the seed {0}.
Now we prove the lemma by induction on m ≥ 2. We assume the assertion is true for all N ≤ 2 m ; i.e. for every admissible set {x 0 , x 1 , . . . , x N −1 } with N ≤ 2 m there exists a σ ∈ P m with x i = S σ 2 m (i) for all 0 ≤ i ≤ N − 1. In particular, this means that P m contains a permutation encoding every possible sequence of decisions we take when distributing the first 2 m points using the greedy algorithm and starting with x 0 = 0. For N = 2 m we always get exactly one admissible set; i.e. 0, 1 2 m , . . . , To make the induction step from m to m + 1, we first use Lemma 5.4 to see that the algorithm suggests 2 m minima as candidates for x 2 m , i.e. the (N + 1)st point, which are of the form i/2 m + 1/2 m+1 for 0 ≤ i ≤ 2 m − 1. Picking any of these 2 m minima, the important observation is that the function H 2 m +1 = 2 m h=0 f h and H 2 m +1 − H 2 m =H 1 = f 2 m have the same set of arguments of minima. This can be seen as in the proof of Theorem 5.5: We look at the intersection of a nested sequence of sets (the set of arguments of minima for H 2 m contains the set of arguments of minima ofH 1 ) and this intersection does not change, i.e. is still the set of arguments of minima ofH 1 , if we discard the outermost set. We can repeat this argument for any of the following N s with 2 m + 2 ≤ N ≤ 2 m+1 . In particular, we see that once we have picked the point for N = 2 m + 1 we encounter the same tree of decisions as for the first 2 m points, just with a different start value. Now it is easy to see that we have indeed a permutation (2σ, 2σ ⊕ a) in P m+1 for every path through the decision tree of the algorithm. Looking at the construction of P m+1 we see that by our assumption there is a permutation, σ, in P m for every initial choice of first 2 m points. Then we can accommodate for any choice of (N +1)st point via the addition of an odd a modulo 2 m+1 . And finally, once we picked a, i.e. the new starting value a/2 m + 1/2 m+1 for 0 ≤ a ≤ 2 m − 1, we can realise our second (shifted by the constant a) run through the decision tree to determine the remaining 2 m − 1 points via an appropriate permutation σ in P m .
Thus, if the assertion holds for m, then it also holds for m + 1. Proof. We prove this again by induction on N . Assume the set {S σ 2 m (0), S σ 2 m (1), . . . , S σ 2 m (N −1)} is admissible. We need to show that S σ 2 m (N ) is among the minima suggested by the algorithm. This can be shown along the same lines as Theorem 5.5.
Together with Corollary 4.6 this completes the proof of Theorem 2.1.

Proof of Theorem 2.6
Throughout this section f and X are as in the statement of the theorem. Since f has mean value 0 we have that We need two ingredients for the proof of Theorem 2.6. We start with a technical observation.
which can be rewritten as Next we use the definition of the greedy algorithm. We have that LeVeque's inequality now implies the desired result.