The Halton sequence and its discrepancy in the Cantor expansion

We consider variants of the Halton sequence in a generalized numeration system, called the Cantor expansion. We show that it provides a wealth of low-discrepancy sequences by giving an estimate of the (star) discrepancy of the Halton sequence in each bounded Cantor base. The techniques used in our estimation of the discrepancy are adapted from those developed by E.I. Atanassov.


Introduction
Let $\omega = (x_n)_{n=1}^{\infty}$ be a sequence in $[0,1)^s$. A standard problem in numerical analysis is to estimate the integral of a function from its values at finitely many points of the sequence. This is known as the Monte Carlo method when the points $(x_n)_{n=1}^{N}$ are stochastic and as the quasi-Monte Carlo method when they are deterministic. The quality of the estimate is governed by the famous Koksma-Hlawka inequality: for an arbitrary function $f$ on $[0,1]^s$ with bounded variation $V(f)$ in the sense of Hardy and Krause and any finite set of points $(x_n)_{n=1}^{N}$,
\[ \left| \frac{1}{N} \sum_{n=1}^{N} f(x_n) - \int_{[0,1]^s} f(x)\,dx \right| \le V(f)\, D_N^*(\omega), \]
where the (star) discrepancy is
\[ D_N^*(\omega) = \sup_{J} \left| \frac{A(J;N;\omega)}{N} - \lambda_s(J) \right|. \]
Here $A(J;N;\omega) = \#\{1 \le n \le N : x_n \in J\}$ is the counting function, $\lambda_s(J)$ denotes the $s$-dimensional Lebesgue measure of $J$, and the supremum is taken over all rectangles $J = \prod_{i=1}^{s} [0, z_i) \subseteq [0,1)^s$. Note that $\lambda_s(J) = \prod_{i=1}^{s} z_i$. For more details on numerical integration, the reader can consult [3,11] or [14].

Evidently, to estimate $\int_{[0,1]^s} f(x)\,dx$ sufficiently precisely, what is needed is a good bound for $D_N^*(\omega)$. The discrepancy is nothing other than a quantitative measure of uniformity of distribution. In particular, the sequence $\omega$ is uniformly distributed on $[0,1)^s$ if and only if $D_N^*(\omega) \to 0$ as $N \to \infty$. The faster $D_N^*(\omega)$ decays as a function of $N$, the more uniformly distributed the sequence $\omega$ is. One of the fundamental obstructions in this subject is that there is a limit to how well distributed any sequence can be. This is already visible in the elementary inequality $D_N^*(\omega) \ge 1/(2^s N)$ $(N \in \mathbb{N})$, whose proof makes an entertaining exercise. This opens the door to the deep subject of irregularities of distribution, which addresses what limitations there are to the uniformity of distribution of an arbitrary sequence, and to the complementary problem of constructing sequences with discrepancy as small as possible. The latter problem is central to the numerical integration problem described above.
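To make the role of the star discrepancy in the Koksma-Hlawka inequality concrete, the following minimal sketch treats the case s = 1 only. It assumes the standard closed form for the one-dimensional star discrepancy of sorted points, 1/(2N) + max_i |x_(i) - (2i - 1)/(2N)|; the function names, the midpoint test set and the integrand f(x) = x^2 are illustrative choices, not taken from the paper.

```python
def star_discrepancy_1d(points):
    """Exact star discrepancy D*_N of a finite point set in [0, 1)."""
    xs = sorted(points)
    n = len(xs)
    # Closed form for s = 1: 1/(2N) + max_i |x_(i) - (2i - 1)/(2N)|.
    return 1 / (2 * n) + max(
        abs(x - (2 * i - 1) / (2 * n)) for i, x in enumerate(xs, start=1)
    )


def qmc_error(f, exact_integral, points):
    """Absolute error of the equal-weight rule (1/N) * sum f(x_n)."""
    estimate = sum(f(x) for x in points) / len(points)
    return abs(estimate - exact_integral)


# Hypothetical test data: the 8 midpoints (2i - 1)/16 and f(x) = x^2,
# whose variation on [0, 1] is V(f) = 1 and whose integral is 1/3.
midpoints = [(2 * i - 1) / 16 for i in range(1, 9)]
discrepancy = star_discrepancy_1d(midpoints)
error = qmc_error(lambda x: x * x, 1 / 3, midpoints)

# Koksma-Hlawka with V(f) = 1: the error should not exceed D*_N.
print(discrepancy, error, error <= discrepancy)
```

For s > 1 no comparably simple formula is available, which is one reason explicit discrepancy bounds for specific sequences are valuable.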
Perhaps the most famous example of a low-discrepancy sequence is the van der Corput sequence. In 1935, van der Corput [21] introduced a procedure to generate low-discrepancy sequences on $[0,1)$. These sequences are among the best distributed over $[0,1)$: no infinite sequence can have discrepancy of smaller order of magnitude. The technique of van der Corput is based on a very simple idea. Let $b > 1$ be a natural number. Then every nonnegative integer $n$ has a $b$-adic representation of the form
\[ n = \sum_{j=0}^{\infty} n_j b^{j}, \]
where $n_j \in \{0, 1, \ldots, b-1\}$ and only finitely many $n_j$ are nonzero. The radical-inverse function in base $b$ is $\varphi_b(n) = \sum_{j=0}^{\infty} n_j b^{-j-1}$, and the van der Corput sequence in base $b$ is $(\varphi_b(n))_{n=0}^{\infty}$.

In practical applications, a generalization of the van der Corput sequence to higher dimensions is more useful. Such a generalization was proposed by Halton [7] in 1960. Given pairwise coprime integers $b_1, \ldots, b_s$, all greater than 1, the sequence $(\varphi_{b_1}(n), \ldots, \varphi_{b_s}(n))_{n=0}^{\infty}$ is called the Halton sequence in bases $b_1, \ldots, b_s$.
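The classical construction can be sketched in a few lines. The code below assumes the standard radical inverse, which reflects the base-b digits of n about the radix point; the function names are illustrative, and exact rational arithmetic is used so that no rounding enters.

```python
from fractions import Fraction


def radical_inverse(n, b):
    """Base-b radical inverse: digits n_0, n_1, ... of n map to sum n_j / b^(j+1)."""
    x, scale = Fraction(0), Fraction(1, b)
    while n > 0:
        n, digit = divmod(n, b)  # peel off the least significant digit
        x += digit * scale
        scale /= b
    return x


def halton_point(n, bases):
    """n-th point of the classical Halton sequence in the given integer bases."""
    return tuple(radical_inverse(n, b) for b in bases)


# First few van der Corput points in base 2 and one Halton point in bases (2, 3).
print([radical_inverse(n, 2) for n in range(4)])
print(halton_point(5, (2, 3)))
```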
It has long been known that the discrepancy of the first $N$ elements of the Halton sequence in bases $b_1, \ldots, b_s$ can be bounded by
\[ D_N^*(\omega) \le c(b_1, \ldots, b_s)\, \frac{(\log N)^s}{N} + O\!\left( \frac{(\log N)^{s-1}}{N} \right) \tag{1.1} \]
for some constant $c(b_1, \ldots, b_s) > 0$. For example, this was shown in [4,6,7,15] and [18]. It is believed that the order $(\log N)^s / N$ is the best possible for an arbitrary infinite sequence. That this is the case when $s = 1$ was proved by Schmidt [20]. For $s > 1$, the question remains open. We shall call an infinite sequence $\omega$ in $[0,1)^s$ a low-discrepancy sequence if $D_N^*(\omega) = O((\log N)^s / N)$.

The question of how small the constant $c(b_1, \ldots, b_s)$ in (1.1) can be is interesting from both a theoretical and a practical viewpoint. The articles referred to above show that this constant depends very strongly on the dimension $s$. The minimal value for this quantity is obtained if we choose $b_1, \ldots, b_s$ to be the first $s$ prime numbers. But even in this case $c(b_1, \ldots, b_s)$ grows very fast to infinity as $s$ increases. This deficiency was overcome by Atanassov [1], who proved that (1.1) holds with
\[ c(b_1, \ldots, b_s) = \frac{1}{s!} \prod_{i=1}^{s} \frac{b_i - 1}{\log b_i}. \]
This estimate is so strong that, when $b_1, \ldots, b_s$ are the first $s$ prime numbers, $c(b_1, \ldots, b_s) \to 0$ as $s \to \infty$.

In this paper, we introduce the Halton sequence in a generalized numeration system, which is induced by the a-adic integers and is called the Cantor expansion, and we give an estimate of its discrepancy by adapting the techniques developed by Atanassov. It is worth noting that the van der Corput sequence and some one-dimensional low-discrepancy sequences with respect to the Cantor expansion were studied in [2] and [5]. Note also that [8] mentions the Halton sequence in a numeration system more general than the Cantor expansion, called the G-expansion; however, that paper studies the Halton sequence in some fixed non-integer bases and does not touch on the Halton sequence with respect to dynamical bases.
We now summarize the contents of this paper. In Sect. 2, we introduce the concept of a generalized numeral system. Then we define the Halton sequence induced by this generalized system and state our main result on the estimate of discrepancy of the Halton sequence. In Sects. 3 and 4, we collect all preliminary lemmas and prove our main result, respectively. In Sect. 5, we prove an estimate of discrepancy of the van der Corput sequence in a generalized numeration system without the restriction on boundedness of inducing sequences. Then we pose an open problem regarding an extension of our results. Finally, we introduce in Sect. 6 the Hammersley point set induced by the generalized numeration system and show that it provides a wealth of low-discrepancy point sets by giving an estimate of its discrepancy.

A generalized Halton sequence
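Let $b = (b_j)_{j=1}^{\infty}$ be a sequence of natural numbers greater than 1. Then it is clear that every nonnegative integer $n$ has a unique $b$-adic representation of the form
\[ n = \sum_{j=1}^{\infty} n_j\, b_1 \cdots b_{j-1} = n_1 + n_2 b_1 + n_3 b_1 b_2 + \cdots, \]
where $n_j \in \{0, 1, \ldots, b_j - 1\}$ $(j \in \mathbb{N})$ and only finitely many $n_j$ are nonzero. This $b$-adic representation is also called the Cantor expansion of $n$ with respect to the Cantor base $b$. Moreover, every real number $x \in [0,1)$ has a $b$-adic expansion of the form
\[ x = \sum_{j=1}^{\infty} \frac{x_j}{b_1 \cdots b_j}, \qquad x_j \in \{0, 1, \ldots, b_j - 1\}. \]
The $x_j$ can be calculated by the greedy algorithm
\[ x_1 = [b_1 x], \qquad x_j = \bigl[\, b_j \{ b_{j-1} \{ \cdots \{ b_1 x \} \cdots \} \} \,\bigr] \quad (j \ge 2), \]
where $[\alpha]$ and $\{\alpha\}$ denote the integer part and the fractional part of $\alpha$, respectively. The idea of this generalized numeration system stems from the a-adic integers, which form a class of locally compact topological groups and possess a symbolic dynamical structure. For more details on the a-adic integers, see [9, pp. 106-117].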
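Both expansions are easy to compute. The sketch below assumes the Cantor base is passed as an iterable (b_1, b_2, ...), writes n as n_1 + n_2 b_1 + n_3 b_1 b_2 + ..., and obtains the digits of x in [0, 1) by repeatedly multiplying by the next base term and splitting off the integer part. The function names and the periodic base (2, 3, 2, 3, ...) are illustrative assumptions.

```python
from fractions import Fraction
from itertools import cycle, islice


def cantor_digits(n, base):
    """Digits n_1, n_2, ... with n = n_1 + n_2*b_1 + n_3*b_1*b_2 + ..."""
    digits = []
    terms = iter(base)
    while n > 0:
        n, digit = divmod(n, next(terms))  # digit lies in {0, ..., b_j - 1}
        digits.append(digit)
    return digits


def greedy_digits(x, base, count):
    """First `count` digits x_j of x in [0, 1), with x = sum x_j / (b_1...b_j)."""
    digits = []
    for b in islice(base, count):
        x *= b
        digit = int(x)  # integer part; x is nonnegative
        digits.append(digit)
        x -= digit      # keep the fractional part for the next step
    return digits


# Illustrative Cantor base b = (2, 3, 2, 3, ...).
print(cantor_digits(11, cycle([2, 3])))
print(greedy_digits(Fraction(7, 12), cycle([2, 3]), 4))
```

With a constant base (b, b, b, ...) both functions reduce to the usual base-b digit expansions.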
Define the radical-inverse function $\varphi_b : \mathbb{N}_0 \to [0,1)$ by
\[ \varphi_b(n) = \varphi_b\Bigl( \sum_{j=1}^{\infty} n_j\, b_1 \cdots b_{j-1} \Bigr) = \sum_{j=1}^{\infty} \frac{n_j}{b_1 \cdots b_j}. \]
The van der Corput sequence in base $b$ is defined as $(\varphi_b(n))_{n=0}^{\infty}$. This sequence was studied in [2] and [5], where it was proved to be a low-discrepancy sequence under some restrictions on the Cantor base $b$. Furthermore, without any restriction on the Cantor base, the sequence was shown to be uniformly distributed mod 1 in [13] and to be a low-discrepancy sequence in [12].
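A minimal sketch of the radical inverse with respect to a Cantor base follows. It assumes the digit n_j of n (with weight b_1...b_{j-1}) is sent to n_j / (b_1...b_j), and that the Halton sequence in several Cantor bases is formed componentwise, by analogy with the classical case. The function names and the periodic base patterns are illustrative assumptions.

```python
from fractions import Fraction
from itertools import cycle


def cantor_radical_inverse(n, base):
    """Radical inverse of n in the Cantor base given by the iterable `base`."""
    x, denominator = Fraction(0), 1
    terms = iter(base)
    while n > 0:
        b = next(terms)
        n, digit = divmod(n, b)   # next Cantor digit of n
        denominator *= b          # running product b_1 * ... * b_j
        x += Fraction(digit, denominator)
    return x


def cantor_halton_point(n, patterns):
    """n-th Halton point; each base repeats its finite pattern periodically."""
    return tuple(cantor_radical_inverse(n, cycle(p)) for p in patterns)


# Van der Corput points for the illustrative base b = (2, 3, 2, 3, ...).
print([cantor_radical_inverse(n, cycle([2, 3])) for n in range(6)])
# A two-dimensional point with bases (2, 4, 2, 4, ...) and (3, 9, 3, 9, ...).
print(cantor_halton_point(5, [(2, 4), (3, 9)]))
```

Periodic patterns are used only to keep the example finite; they automatically give bounded base sequences.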
The following is our main result and gives an estimate of the discrepancy of the Halton sequence in bases of bounded sequences.

Main Theorem 2.1
Let $b_1 = (b_{1,j})_{j=1}^{\infty}, \ldots, b_s = (b_{s,j})_{j=1}^{\infty}$ be $s$ bounded sequences of natural numbers greater than 1 such that, for all $j_1, j_2 \in \mathbb{N}$ and all $1 \le i_1 < i_2 \le s$, the numbers $b_{i_1,j_1}$ and $b_{i_2,j_2}$ are coprime. Suppose that $\omega$ is the Halton sequence in bases $b_1, \ldots, b_s$. Then, for any $N \ge 1$, we have […]. This shows that the Halton sequence is a low-discrepancy sequence.

Preliminary lemmas
In order to prove Theorem 2.1, we need the following five lemmas. These preliminary results are adapted from [3] whose ideas go back to Atanassov [1].
Lemma 3.1 Let $b_1 = (b_{1,j})_{j=1}^{\infty}, \ldots, b_s = (b_{s,j})_{j=1}^{\infty}$ be $s$ arbitrary sequences of natural numbers greater than 1 such that, for all $j_1, j_2 \in \mathbb{N}$ and all $1 \le i_1 < i_2 \le s$, the numbers $b_{i_1,j_1}$ and $b_{i_2,j_2}$ are coprime. […]

Proof For each $n \in \mathbb{N}_0$, we denote the $b_i$-adic expansion of $n$ by […]. Then the $n$th element $\omega_n$ of the Halton sequence is contained in $J$ if and only if, for all $1 \le i \le s$, […]. This is, however, equivalent to […]. As $b_{1,j_1}, \ldots, b_{s,j_s}$ are pairwise coprime for all $(j_1, \ldots, j_s) \in \mathbb{N}^s$, we obtain from the Chinese Remainder Theorem that every $\prod_{i=1}^{s} b_{i,1} \cdots b_{i,k_i}$ consecutive elements of the Halton sequence contain exactly one element in $J$ or, in other words, […]. Therefore, for every $N \in \mathbb{N}$, we obtain […]. Now we write the interval $J$ as a disjoint union of intervals of the form […], which proves the first assertion. For […]. This was the second assertion of the lemma.
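The Chinese Remainder Theorem step in the proof above can be checked numerically. The sketch below assumes a box whose i-th side is [u_i / (b_{i,1}...b_{i,k_i}), (u_i + 1) / (b_{i,1}...b_{i,k_i})) and counts how many of the points with indices in a window of length equal to the product of the denominators fall into it. The periodic base patterns, the particular box and the function names are illustrative assumptions.

```python
from fractions import Fraction
from itertools import cycle
from math import gcd


def cantor_radical_inverse(n, base):
    """Radical inverse of n in the Cantor base given by the iterable `base`."""
    x, denominator = Fraction(0), 1
    terms = iter(base)
    while n > 0:
        b = next(terms)
        n, digit = divmod(n, b)
        denominator *= b
        x += Fraction(digit, denominator)
    return x


def count_in_box(start, length, patterns, lower, upper):
    """Number of Halton points with index in [start, start + length) inside the box."""
    hits = 0
    for n in range(start, start + length):
        point = [cantor_radical_inverse(n, cycle(p)) for p in patterns]
        if all(lo <= x < hi for x, lo, hi in zip(point, lower, upper)):
            hits += 1
    return hits


# Bases b_1 = (2, 4, 2, 4, ...) and b_2 = (3, 9, 3, 9, ...): all terms coprime.
patterns = [(2, 4), (3, 9)]
assert all(gcd(a, b) == 1 for a in patterns[0] for b in patterns[1])

# Box with k_1 = 2 and k_2 = 1, so the side lengths are 1/(2*4) and 1/3.
lower = (Fraction(3, 8), Fraction(1, 3))
upper = (Fraction(4, 8), Fraction(2, 3))
period = 8 * 3

# Every window of `period` consecutive indices should contain exactly one hit.
counts = [count_in_box(start, period, patterns, lower, upper) for start in range(30)]
print(counts)
```

Membership in such a box is decided by the residue of n modulo each denominator, which is why coprimality gives exactly one hit per window.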
Lemma 3.2 Let $b_1 = (b_{1,j})_{j=1}^{\infty}, \ldots, b_s = (b_{s,j})_{j=1}^{\infty}$ be $s$ arbitrary sequences of natural numbers greater than 1. For each $N \in \mathbb{N}$, let $d(b_1, \ldots, b_s; N)$ be the number of tuples […]

Proof Let $u \subseteq \{1, \ldots, s\}$. By Lemma 3.2, the number of $s$-tuples $(k_1, \ldots, k_s)$ such that […]. Moreover, each of these $s$-tuples contributes at most $\prod_{i \in u} f_i$ to the sum on the left-hand side in the statement of the lemma. From this, invoking the inequality $\frac{1}{|u|!} \le \frac{s^{s-|u|}}{s!}$, we obtain […], and this is the desired result.
Now we need to introduce some notation. Let J ⊆ R s be an interval. Then a signed splitting of J is a collection of not necessarily disjoint intervals J 1 , . . . , J r together with signs ε 1 , . . . , ε r ∈ {−1, 1} such that, for all x ∈ J, we have where (J 1 , . . . , J r ; ε 1 , . . . , ε r ) is a signed splitting of J, and here we use ν( The following result is taken directly from [1] (see also [3] for a detailed proof).
together with the signs ε k 1 ,...,k s = s i=1 sgn(z k i +1,i − z k i ,i ) for 0 ≤ k i ≤ n i and 1 ≤ i ≤ s define a signed splitting of the interval J.
For the proof of Theorem 2.1, we need a digit expansion of reals z ∈ [0, 1) in (b j ) ∞ j=1 -adic base which uses signed digits. The next lemma shows that such an expansion exists.

Proof of the main theorem
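For all $1 \le i \le s$, let $n_i = \lfloor \log N / \log m_i \rfloor + 1$, and for $1 \le k \le n_i$, define the truncations of the expansions […] and let $z_{0,i} = 0$ and $z_{n_i+1,i} = z_i$. According to Lemma 3.4, the collection of intervals
\[ J_{k_1, \ldots, k_s} = \prod_{i=1}^{s} \bigl[ \min(z_{k_i,i}, z_{k_i+1,i}),\ \max(z_{k_i,i}, z_{k_i+1,i}) \bigr) \]
together with the signs $\varepsilon_{k_1, \ldots, k_s} = \prod_{i=1}^{s} \operatorname{sgn}(z_{k_i+1,i} - z_{k_i,i})$ for $0 \le k_i \le n_i$ and $1 \le i \le s$ defines a signed splitting of the interval $J$.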
Since both λ s and A(·; N ; ω) are additive functions on the set of intervals, we obtain that where 1 denotes the sum over all (k 1 , . . . , k s ) such that s i=1 b i,1 · · · b i,k i ≤ N and 2 denotes the remaining part of the above sum.
First we deal with the sum 1 . For any 1 ≤ i ≤ s, the length of the interval [min(z k i ,i , z k i +1,i ), max(z k i ,i , z k i +1,i )) is |a i,k i /b i,1 · · · b i,k i |, and also the limit points of this interval are rationals with denominator b i,1 · · · b i,k i . It is worth keeping in mind that, due to the choice of n i , we must have k i < n i when s i=1 b i,1 · · · b i,k i ≤ N . Accordingly, the intervals J k 1 ,...,k s are of the form as considered in Lemma 3.1 from which we obtain It remains to estimate 2 . To this end, we split the set of s-tuples (k 1 , . . . , For a fixed 1 ≤ l ≤ s − 1 and a fixed l-tuple (k 1 , . . . , k l ) with l i=1 b i,1 · · · b i,k i ≤ N , define r to be the largest integer such that It follows that the tuple (k 1 , . . . , k l , k l+1 , . . . , k s ) is contained in B l , if and only if k l+1 ≥ r.
Therefore, for any 0 ≤ l ≤ s − 1 and fixed k 1 , . . . , it follows that the interval [min(z r,l+1 , z l+1 ), max(z r,l+1 , z l+1 )) is contained in some interval The latter fact implies that k i < n i for all 1 ≤ i ≤ s. Thus, an application of Lemma 3.1 yields that But on the other hand, we also have that N λ s (L) ≤ b l+1,r l i=1 |a i,k i |. Hence, where we have used Lemma 3.3 again. Hence, the result follows.

An open problem
We have seen that the Halton sequence in bases of bounded sequences b 1 , . . . , b s is a low-discrepancy sequence. It is of theoretical interest to ask whether the assumption of boundedness of the base sequences b 1 , . . . , b s can be removed or not.
The following statement shows that we can remove the restriction on the boundedness when s = 1. That is, the van der Corput sequence in base b = (b j ) ∞ j=1 of an arbitrary sequence of natural numbers greater than 1 is a low-discrepancy sequence. Note that the proof is developed from the classical dyadic case in [10, pp. 127-128].
Proposition 5.1 Let $b = (b_j)_{j=1}^{\infty}$ be an arbitrary sequence of natural numbers greater than 1. Suppose that $\omega$ is the van der Corput sequence in base $b$. Then, for any $N \in \mathbb{N}$, we have […].

Note that Proposition 5.1 gives a better estimate of the discrepancy than the case $s = 1$ of Theorem 2.1 and than [2, Théorème 4.5]. To prove this result, we need to introduce some notation and two preliminary lemmas.
A finite sequence $0 \le c_1 < c_2 < \cdots < c_L$ of points from the interval $[0,1)$ is called an […].

The first lemma gives an estimate of the discrepancy of an arithmetic progression. […]

The next simple lemma is useful for estimating the discrepancy of a sequence which can be decomposed into a number of subsequences with small discrepancy. […] Let $\omega$ be a superposition of $\omega_1, \ldots, \omega_K$, that is, a sequence obtained by listing in some order the terms of the $\omega_k$. We set $N = N_1 + \cdots + N_K$, which will be the number of elements of $\omega$. Then we have […].

Proof of Proposition 5.1 We can always represent a given $N \in \mathbb{N}$ by its $b$-adic expansion […]. Partition the interval $[0, N]$ of integers into $k$ subintervals $I_1, \ldots, I_k$ as follows. First, put $I_1 = [0, N_k b_1 \cdots b_{k-1}]$. Then, for each $1 < j \le k$, we define $I_j$ as the interval […]. Note that the idea of splitting up the range $0, 1, \ldots, N$ in this way is due to Niederreiter [17].
An integer n ∈ I j (1 ≤ j ≤ k) can be written in the form In fact, we get all N k− j+1 b 1 · · · b k− j integers in I j if we let n i run through all possible combinations. It now follows that where x j only depends on j, and not on n. If n runs through I j , then We deduce that if the φ b (n) (n ∈ I j ) are ordered according to their magnitude, then we obtain a sequence ω j of N k− j+1 b 1 · · · b k− j elements that is a true arithmetic progression with parameters η j = 1 b 1 ···b k− j+1 . It now follows immediately from Lemma 5.2 that the discrepancy of each ω j , multiplied by the number of elements in ω j , is at most 1. Combining this with Lemma 5.3 and the fact that φ b (0), φ b (1), . . . , φ b (N ) is decomposed into k sequences ω j , we obtain N D * N (ω) ≤ k. It remains to estimate k in terms of N . By (5.1), we have N ≥ b 1 · · · b k−1 ≥ m k−1 , and so we obtain k ≤ (log N / log m) + 1. This completes the proof of Proposition 5.1.
In general, it is likely to be true that the Halton sequence in arbitrary bases of sequences is a low-discrepancy sequence.
Problem 5.4 Let $s > 1$, and let $b_1 = (b_{1,j})_{j=1}^{\infty}, \ldots, b_s = (b_{s,j})_{j=1}^{\infty}$ be $s$ arbitrary sequences of natural numbers greater than 1 such that, for all $j_1, j_2 \in \mathbb{N}$ and all $1 \le i_1 < i_2 \le s$, the numbers $b_{i_1,j_1}$ and $b_{i_2,j_2}$ are coprime. Suppose that $\omega$ is the Halton sequence in bases $b_1, \ldots, b_s$. Then we have […].

If the conjecture is true, then it is natural to ask further whether the constant $c(b_1, \ldots, b_s)$ can be reduced to a form similar to that in Corollary 2.2.
Let b 1 = (b 1, j ) ∞ j=1 , . . . , b s−1 = (b s−1, j ) ∞ j=1 be s − 1 sequences of natural numbers greater than 1 such that, for all j 1 , j 2 ∈ N and all 1 ≤ i 1 < i 2 ≤ s − 1, we have b i 1 , j 1 and b i 2 , j 2 coprime. The Hammersley point set in bases b 1 , . . . , b s−1 consisting of N points in [0, 1) s is defined to be the point set We deduce a discrepancy bound for the Hammersley point set with the help of Theorem 2.1 in combination with the following general result from [18] that goes back to Roth [19].
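As an illustration, the following sketch generates a Hammersley-type point set from Cantor bases. It assumes the usual form of the construction, in which the n-th point is (n/N, φ_{b_1}(n), ..., φ_{b_{s-1}}(n)) for 0 ≤ n < N; the displayed definition is not reproduced here, so this form, the function names and the periodic base patterns are all assumptions made for the example.

```python
from fractions import Fraction
from itertools import cycle


def cantor_radical_inverse(n, base):
    """Radical inverse of n in the Cantor base given by the iterable `base`."""
    x, denominator = Fraction(0), 1
    terms = iter(base)
    while n > 0:
        b = next(terms)
        n, digit = divmod(n, b)
        denominator *= b
        x += Fraction(digit, denominator)
    return x


def hammersley_points(num_points, patterns):
    """Assumed form: (n/N, phi_{b_1}(n), ..., phi_{b_{s-1}}(n)) for 0 <= n < N."""
    points = []
    for n in range(num_points):
        first = Fraction(n, num_points)  # equally spaced first coordinate
        rest = tuple(cantor_radical_inverse(n, cycle(p)) for p in patterns)
        points.append((first,) + rest)
    return points


# Four points in dimension s = 2 with the single Cantor base (2, 3, 2, 3, ...).
for point in hammersley_points(4, [(2, 3)]):
    print(point)
```

Unlike the Halton sequence, this point set depends on N through its first coordinate, so extending it requires recomputing every point.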
A point set P consisting of N points in [0, 1) s is called a low-discrepancy point set if D * N (P) = O((log N ) s−1 /N ). In this sense, the generalized Hammersley point set is a low-discrepancy point set.