Lattice based integration algorithms: Kronecker sequences and rank-1 lattices

We prove upper bounds on the order of convergence of lattice based algorithms for numerical integration in function spaces of dominating mixed smoothness on the unit cube with homogeneous boundary condition. More precisely, we study worst-case integration errors for Besov spaces of dominating mixed smoothness $\mathring{\mathbf{B}}^s_{p,\theta}$, which also comprise the concept of Sobolev spaces of dominating mixed smoothness $\mathring{\mathbf{H}}^s_{p}$ as special cases. The considered algorithms are quasi-Monte Carlo rules with underlying nodes from $T_N(\mathbb{Z}^d) \cap [0,1)^d$, where $T_N$ is a real invertible generator matrix of size $d$. For such rules the worst-case error can be bounded in terms of the Zaremba index of the lattice $\mathbb{X}_N=T_N(\mathbb{Z}^d)$. We apply this result to Kronecker lattices and to rank-1 lattice point sets, which both lead to optimal error bounds up to $\log N$-factors for arbitrary smoothness $s$. The advantage of Kronecker lattices and classical lattice point sets is that the run-time of algorithms generating these point sets is very short.


Introduction
In this paper we study numerical integration of smooth functions on the d-dimensional unit cube which satisfy homogeneous boundary conditions. More precisely, we consider Besov spaces of dominating mixed smoothnessB s p,θ which also comprise the concept of Sobolev spaces of dominating mixed smoothnessH s p as special cases. The exact definition of these spaces requires some preparation and will be given in Section 2.2. For the moment we just note that the parameter s denotes the underlying smoothness of the functions and that for f ∈B s p,θ we have supp(f ) ⊂ [0, 1] d .
We study numerical integration |I where · Bs p,θ denotes the norm inB s p,θ . As integration rules we use what we call general lattice rules which are of the form (1) with an invertible matrix T ∈ R d×d . Note that is a d-dimensional lattice in R d , i.e., a discrete additive subgroup of R d not contained in a proper linear subspace of R d . A specific example of such a rule is Frolov's cubature formula introduced in [6] (see also [25] and the survey paper by Temlyakov [24,Section 4.3]). Here a suitable generating matrix T for the underlying lattice X is found as follows: Define the polynomial p d ∈ R[x], p d (x) = −1 + d j=1 (x − 2j + 1). Then p d has only integer coefficients, it is irreducible over the rationals and has d different real roots, say ξ 1 , . . . , ξ d ∈ R. With these roots define the d × d Vandermonde matrix B by Then the generator of the lattice X used in Frolov's cubature formula is where a > 1 is a shrinking factor. It was shown by Dubinin [5], see also [22] and [26], that Frolov's cubature formula achieves the optimal convergence rate for the worst-case error inB s p,θ , which is wce(Q T ,B s p,θ ) ≍ where N is the number of elements of T (Z d ) that belong to the unit cube [0, 1] d . The lower bound for these spaces was proven in [25]. See also [11] for techniques to transfer such results to spaces without (or periodic) boundary conditions. For the Sobolev(-Hilbert) spacesH s 2 =B s 2,2 the result reads The problem with Frolov's cubature formula is that one needs to determine which points from the shrunk lattice belong to the unit cube [0, 1] d . This is in general a very difficult task, especially when the dimension d is large 1 .
It is the aim of this paper to find other general lattice rules whose lattice points are faster to generate on a computer also for large dimensions d and which also achieve optimal convergence rate for the worst-case error with respect to the smoothness s, up to log N-terms.
We will prove a very general estimate for the worst-case error in terms of the Zaremba index of the lattice X = T (Z d ), see Theorem 14 in Section 3. Then we apply this error estimate to two examples of general lattice rules, to so-called Kronecker lattices (Subsection 4.1) and to rank-1 lattice point sets (Subsection 4.2). In both cases we achieve a convergence rate of order O(N −s ) for the worst-case error up to log N-terms. The major advantage of these constructions is that it is automatically clear which points belong to the unit cube. Hence the proposed integration algorithms (which are in fact QMC rules) can be implemented very efficiently.
Basic notation: Throughout the paper d ∈ N denotes the dimension. By log we denote the dyadic logarithm. For x ∈ R we denote by {x} the fractional part of x and by x the distance of x to the nearest integer, i.e., x := min m∈Z |x − m|.
For ℓ = (ℓ 1 , . . . , ℓ d ) ∈ N d 0 we denote by the operator of partial differentiation with |ℓ| = ℓ 1 + · · · + ℓ d . Furthermore, for ξ ∈ R d let e ξ : R d → C, e ξ (x) := exp(2πi ξ, x ), where ·, · denotes the usual inner product in R d . Before we formulate the main results of this work in more detail we need some preparation, which is presented in Section 2. 1 It is known that N ∼ | det(T −1 )| as the shrinking factor a tends to infinity. This can be quantified as with the usual modification if p = ∞. We will also need L p -spaces on compact domains Ω ⊂ R d instead of the entire R d . We write f Lp(Ω) for the corresponding (restricted)

Basics of Fourier analysis
and the corresponding inverse Fourier transform F −1 f (ξ) = F f (−ξ). Additionally, we define the spaces of continuous functions C(R d ), infinitely differentiable functions C ∞ (R d ) and infinitely differentiable functions with compact support C ∞ 0 (R d ) as well as the Schwartz space S = S(R d ) of all rapidly decaying infinitely differentiable functions on R d , i.e., The space S ′ (R d ), the topological dual of S(R d ), is also referred to as the set of tempered distributions on R d . Indeed, a linear mapping f : for all ϕ ∈ S(R d ). The space S ′ (R d ) is equipped with the weak * -topology. The convolution ϕ * ψ of two square-integrable functions ϕ, ψ is defined via the integral If ϕ, ψ ∈ S(R d ), then ϕ * ψ still belongs to S(R d ). In fact, the convolution operator can be extended to S(R d ) × L 1 , in which case we have ϕ * ψ ∈ S(R d ), and to S( ). It makes sense point-wise and is a C ∞ -function in R d . As usual, the Fourier transform can be extended to

Function spaces
In this section we introduce the function spaces under consideration, namely, the Besov and Sobolev spaces of dominating mixed smoothness. There are several equivalent characterizations of these spaces, see [27]. For our purposes, the most suitable is the characterization by local means (see [27,Theorem 1.23] or [26,Section 4]).
Let us continue with the definition of the Besov spaces B s p,θ = B s p,θ (R d ) defined on the entire R d .
with the usual modification for θ = ∞.  In fact, it is not even necessary that Ψ 0 and Ψ 1 have compact support. However, for the proof of our results this specific choice is crucial.
In this article we are interested in classes of functions which are supported in the unit The next lemma collects some frequently used embeddings between Besov and Sobolev spaces. They will be useful in obtaining our results for Sobolev spaces directly from the results for Besov spaces.
Lemma 5. Let p ∈ (0, ∞], and let s ∈ R. (i) We have the chain of embeddings Proof. For a proof we refer to [17,Chapt. 2]. Note that for (ii) the compact support of the functions is necessary to ensure the corresponding embeddings of the L p spaces.
In the sequel we will always assume that s >

Useful lemmas
Here we collect some lemmas that will be essential for our analysis. For proofs and further literature for these results we refer to [ . Although this equality looks more technical than the original summation formula, this variant is exactly in the form we need and it comes with less assumptions on the involved functions.
In particular, the limit on the right hand side exists.
The lemma above holds for quite general choices of ϕ 0 . However, we choose a certain ϕ 0 (and hence ϕ m ) that is related to the definition of our spaces, i.e., the functions Ψ 0 , Ψ 1 from Section 2.2 are involved. This is because, given Ψ 0 and Ψ 1 , we can construct a suitable decomposition of unity.
We define the d-fold tensorized functions where Λ j , Ψ j , j ∈ N 0 , are defined in Lemma 7. We obtain from Lemma 7 that we can use the functions in Lemma 6. Moreover, by the construction of the tensorized functions and Lemma 7, we know that the support of F Λ m , m ∈ N d 0 , is of the form For this, recall that we choose Ψ 0 , Ψ 1 such that ε = 1, see Remark 1.
As one can see in the lemmas above we are concerned with sums of certain function evaluations of Fourier transforms. Finally, we want to bound such sums by the norm of the involved functions, i.e., we have to control the dependence on the matrix B, see Lemma 6. For this we need the following two lemmas, see [  Then, for 1 ≤ p ≤ ∞, we have Lemma 10. Let B ∈ R d×d be an invertible matrix and Ω ∈ R d be a bounded set. Furthermore, let g ∈ S(R d ) with supp(F g) ⊂ Ω. Then, for 1 ≤ p ≤ ∞, we have

General error bound
In this section we provide a general upper bound on the error of general lattice rules for the integration in the spacesB s p,θ of functions with support inside the unit cube. Here we mean by general lattice rules all algorithms of the form with an invertible matrix T ∈ R d×d . Clearly, due to supp(f ) ⊂ [0, 1] d , we could also sum over all x ∈ T (Z d ) without changing the value of Q T (f ).
For an invertible matrix T the dual lattice of the lattice X = T (Z d ) is defined as Remark 11. In the literature usually only those algorithms of the form (8) are called lattice rule that satisfy T (Z d ) ⊃ Z d , see e.g. [13,20]. Such rules are a priori also suited for the integration of periodic functions, in contrast to the general algorithms of the form (8). We decided to use this nomenclature with the prefix "general" since it seems to be adequate for cubature rules that use function evaluations on a lattice.
We prove the following theorem.
Theorem 12. Let T ∈ R d×d be invertible, X = T (Z d ), X * its dual lattice, I m from (7) and Q T as in (8). Then, for 1 ≤ p, θ ≤ ∞ and s > 1/p, we have Proof. From Lemma 6 with ϕ m from (6) we have where B = T −⊤ . Actually, the outer sum is defined as a certain limit, however, we will see that this sum converges absolutely, which justifies this notation.
By Hölder's inequality as well as Lemma 8 and Lemma 10 we have This shows that, in order to prove bounds on the worst-case error, it is enough to study the numbers |X * ∩ I m \ {0}| for m ∈ N d 0 . Moreover, as the next lemma shows, this quantity can be bounded by the Zaremba index of the lattice X, which is, loosely speaking, the largest ℓ ∈ N such that |X * ∩ I m \ {0}| = 0 for all m with |m| 1 ≤ ℓ. More precisely, we define the Zaremba index of the lattice X by where, for z = (z 1 , . . . , z d ), r(z) := d j=1 z j with z := max{1, |z|}. The connection of the Zaremba index to numerical integration is well-known, see e.g. [13,20] and the references therein, but we repeat the (relatively short) proofs here for convenience of the reader. Lemma 13. Let T ∈ R d×d be invertible, X = T (Z d ), X * its dual lattice and I m from (7). Then, for all m ∈ N d 0 , we have where log denotes the dyadic logarithm.
Proof. Let M := log ρ(X) . For x ∈ I m , we have r(x) ≤ 2 |m| 1 . This shows that, for |m| 1 < M, there is no x ∈ X * in I m , except possibly the origin. This proves the first bound.

Application to specific point sets
In this section we apply the general results from the last section to specific point sets. More precisely, we study Kronecker lattices and rank-1 lattice point sets in dimensions d ≥ 2. By Theorem 14 it is enough to bound the Zaremba index of these lattices. However, since the bounds on the Zaremba index are worse in higher dimensions, we treat the lower-dimensional cases separately.

Kronecker lattices
We study Kronecker lattices which are point sets of the form where α = (α 1 , ..., α d−1 ) ∈ R d−1 . These point sets can be written as where X N (α) = T N (Z d ) and Note that det(T N ) = (−1) d+1 /N and hence d T N = N. Hence, we can use the results from the last section to prove upper bounds on the error of the cubature rule Q T N . Recall that the dual lattice of Remark 15. In view of Remark 9 the sequence of matrices B N , N ≥ 1, from (12) satisfies  Since all lattices that follow will be of this form we use the notation Moreover, we write wce P N (α),B s p,θ := wce Q T N ,B s p,θ . In the following we fix a vector α ∈ R d−1 and just let N grow in (10). Given a "good" α, this makes the point set P N (α) particularly easy to implement, since one needs only Nd arithmetic operations.
It is not surprising, and well known, that bounds on the Zaremba index of lattices X N (α) depend on Diophantine approximation properties of the vector α. More precisely, a lattice has large Zaremba index if the involved numbers α 1 , ..., α d−1 are badly approximable (in a certain sense). This is reflected by the following lemma.
Next assume that |k j | ≥ N for some j = 1, ..., d−1. Then it follows from the definition that r(z) ≥ k j ≥ N.
Finally, assume that k : Clearly, by choosing the right k d ∈ Z, we have by assumption This shows that in any case r(z) ≥ min{N, cN/ψ(N) d−1 } ≥ N/ψ(N) d−1 min{ψ(1) d−1 , c} and thus proves the claim.
We now treat the cases d = 2 and d ≥ 3 separately, since the known results on the existence of (simultaneously) badly approximable numbers differ in these cases.

The case d = 2
We say that a real number α is badly approximable, if there is a positive constant c 0 = c 0 (α) > 0 such that k kα ≥ c 0 > 0 for all integers k ≥ 1.
It is well known that an irrational number α is badly approximable if and only if the sequence a 1 , a 2 , a 3 , . . . of partial quotients in the continued fraction expansion of α = [a 0 ; a 1 , a 2 , a 3 , . . .] is bounded, i.e. there is some M = M(α) > 0 such that a j ≤ M for all integers j ≥ 1, see e.g. [8]. For example for the golden ratio α = (1 + √ 5)/2 we have that the continued fraction coefficients are all 1 and hence of course are also bounded.
It is easily seen from the definition of badly approximable numbers that we can apply Lemma 16 with d = 2 and ψ ≡ 1, which proves that, for the constant c ′ 0 = min(1, c 0 ), ρ 2 (N, α) ≥ c ′ 0 N for all N ∈ N. This implies the following theorem.

The case d ≥ 3
Unfortunately, in dimensions greater two the results are not as satisfactory as for d = 2.
That is, we do not know if vectors α exist that give the optimal order of convergence of the corresponding (general) lattice rule. Assume for the moment that we have a vector α ∈ R d−1 for d ≥ 3 such that In this case we could show that wce(P N (α),B s p,θ ) (log N) (d−1)(1−1/θ) /N s , which is the optimal order of convergence. However, a famous conjecture of Littlewood states that there is no α ∈ R d−1 , d ≥ 3, with this property, see e.g. [1]. See also [2] for a discussion of this Diophantine problem in the context of the discrepancy of (nα)-sequences.
The best we can hope for at the moment for our problem are metrical results. These are based on the following lemma.
Proof. From [3, Lemma 5] we obtain that, under the assumptions of the lemma, we can apply Lemma 16 for almost every α ∈ R d−1 .
This implies the following result.
Theorem 19. Let ψ : N → R + be non-decreasing such that n≥1 1 nψ(n) < ∞. Then, for almost all α ∈ R d−1 and every N ≥ 1 we have For example, for δ > 0 for almost all α ∈ R d−1 we have Remark 20. In dimension d = 3 the metrical result can be slightly improved. It follows from results in [1] that there exist (α 1 , α 2 ) ∈ R 2 such that the assumption of Lemma 16 holds with ψ(N) = (log N) log log N. Hence, for d = 3, the second statement of Theorem 19 holds with δ = 0.
However, if we want a result for concrete α ∈ R d−1 for d ≥ 3, the situation is even worse. Recall that for a real number η, a (d − 1)-tuple α ∈ (R \ Q) d−1 is said to be of approximation type η, if η is the infimum of all numbers σ for which there exists a positive constant c = c(σ, α) such that It is well known that the type η of an irrational vector α is at least one. On the other hand it has been shown by Schmidt [18] that α = (α 1 , . . . , α d−1 ), with real algebraic components for which 1, α 1 , . . . , α d−1 are linearly independent over Q, is of type η = 1. In particular, (e r 1 , . . . , e r d−1 ) with distinct nonzero rationals r 1 , . . . , r d−1 or ( √ p 1 , . . . , √ p d−1 ) with distinct prime numbers p 1 , . . . , p d−1 are of type η = 1.
From (14), Lemma 16 and Theorem 14 we obtain the following result.

Rank-1 lattice point sets
A rank-1 lattice point set is given by the points { n N g} for n = 0, 1, . . . , N − 1, where g = (g 0 , g 1 , . . . , g d−1 ) is a lattice point in Z d and where the fractional part is applied component wise. We restrict ourselves to the case where g 0 = 1 (if N is a prime number, this still covers all possible rank-1 lattice point sets). Then we can write a rank-1 lattice point set as In view of Section 4.1 we see that we replace the possibly irrational point α by the rational point g/N. So, for consistent notation, we should have used, e.g., the denotation P N (g/N) for the point set. However we use P N (g) etc. for simplicity. For the same reasoning we let ρ d (N, g) := ρ(X N (g/N)). The Zaremba index of rank-1 lattice point sets is a well studied quantity and we can use the known results in order to apply them in Theorem 14. Since the statement of Remark 15 holds also in this case, we can ignore the hidden constants in Theorem 14. Again we treat the cases d = 2 and d ≥ 3 separately.

Remark 22.
For d = 2 the construction is again based on the boundedness of the partial quotients of g 1 /N. So, given a badly approximable number α, see Section 4.1.1, one can use its convergents p k /q k , k = 1, 2, ..., to construct the (optimal) sequence of lattices P q k (1, p k ) . To find an analogous construction in higher dimensions is a challenging open problem.

Remark 24.
In particular the result holds for Fibonacci rules, where N = F n and g = F n−1 , the n th and (n − 1) st Fibonacci numbers, respectively. In this case the continued fraction coefficients of g/N = F n−1 /F n are all exactly 1. Fibonacci rules were also used by Temlyakov, see [23] and [24, Section 4.1].
Remark 25. Note that a famous conjecture of Zaremba [29, p. 76] states that for every integer N ≥ 2 one can find a g ∈ {1, . . . , N} with gcd(g, N) = 1 such that the continued fraction coefficients of g/N are bounded by some constant K (in fact, he conjectured that K = 5). Niederreiter [12] established this conjecture for all N of the form 2 m , 3 m or 5 m for m ∈ N. Bourgain and Kontorovich [4] proved Zaremba's conjecture for almost all N with a constant K = 50. Huang [7] improved this result to show Zaremba's conjecture for almost all N with constant K = 5.

The case d ≥ 3
It follows from a result of Zaremba [30] that for every d ≥ 2 and N ≥ 2 there exists a lattice point g ∈ Z d such that In fact, one can choose C d = (d − 1)!/2 d−1 (see also [13,Theorem 5.12]). From this result together with Theorem 14 we obtain the following theorem. However, it remains an open question how to construct, for given d and N, lattice points g ∈ Z d−1 which achieve the lower bound (16). Without loss of generality one can restrict to the search space {g ∈ {1, 2, . . . , N − 1} : gcd(g, N) = 1} d−1 of size ϕ(N) d−1 , where ϕ denotes Euler's totient function. This is too large for a full search already for moderately large N and d ≥ 3. So far one relies on computer search to find good generating vectors, usually based on the fast component-by-component construction [14,15].
Korobov [9] suggested to consider lattice point sets with generating vectors g = (1, g, g 2 , . . . , g d−1 ) in Z d with g ∈ {1, 2, . . . , N − 1} such that gcd(g, N) = 1. The search space for lattice points of this form reduces to ϕ(N). At least for prime powers N and in dimension d = 3 there is an existence result of Larcher and Niederreiter [10] for g = (1, g, g 2 ) with For d > 3 this is open.