1 Introduction

In [4] the diffraction properties of the visible points and the \(k\)th-power-free numbers were studied and it was shown that these sets have positive, pure-point, translation-bounded diffraction spectra with countable, dense support. The interest of this lay in the fact that these sets fail to be Delone sets: they are uniformly discrete (subsets of lattices, in fact) but not relatively dense. The lack of relative denseness means that these sets have arbitrarily large “holes” and hence are not repetitive in the sense of [14]. It is of interest to ask for more precise information about the irregularity of these sets, and Lenz (private communication) has asked what their entropy is.

There are two kinds of entropy commonly associated with arrays of symbols (of which subsets of lattices are a particular case): patch-counting entropy which is defined simply by counting patches and depends only on the adjacency relation between sites, not on any metric of the ambient space; and measure entropy which is defined in terms of the frequency of occurrence of patches in space. The patch-counting entropy is an upper bound for the measure entropy, whatever measure is used. We show that the sets considered here have measure entropy zero (relative to a canonically constructed measure) but positive patch-counting entropy, contrasting with regular model sets [18], for which both entropies are zero [5]. In [4], a model set construction for the visible points and the \(k\)th-power-free numbers was described, with the internal spaces adelic, instead of Euclidean as in more usual cut-and-project sets. In this construction, the boundaries of the windows have positive measure, however, so they are not regular model sets.

In Sect. 2, we define patch-counting and measure entropies, while in Sect. 3 we define the set of \(k\)-free points, whose entropies we investigate, and show that they possess patch frequencies which can be explicitly calculated in terms of infinite products. This is just a mild generalization to the case of lattices other than \(\mathbb Z \) of the results of Mirsky [16] on \(k\)th-power-free integers. To keep the route to our main results as clear as possible we have been content with weak error terms in Sect. 3, but for the record we show in Sect. 8 how error terms like those in [15] carry over to the general case. Section 4 gives some examples of patch frequencies and Sect. 5 completes the calculation of the entropies, with the aid of a key lemma (for the measure entropy case) that gives a small upper bound for the frequencies of the great majority of patches. In Sect. 6, we give a short discussion of the variational principle, which relates the two kinds of entropy. In Sect. 7, we demonstrate how the results in [4] on the diffraction spectra of the \(k\)th-power-free integers and visible lattice points carry over to the general case.

For the special case of square-free numbers (resp., \(k\)th-power-free numbers), some of our results were found independently, using alternative methods from the theory of dynamical systems, by Cellarosi and Sinai [7], Cellarosi and Vinogradov [8] and Sarnak [22]. These references also contain results on the ergodic properties of the underlying invariant measures that go beyond what we cover here.

In the course of the paper, we need to call on a number of standard results in number theory, which for convenience we have collected in an appendix.

Peter A. B. Pleasants gave me (CH) an early draft of this paper in 2006. After his untimely death in 2008, Michael Baake asked me to finish the manuscript. At that time, it already contained the entire calculation of the entropies (Sects. 1–5). Moreover, Peter had planned two further sections, one on improved error terms and one on a model set construction including the sets in question together with an upper bound for the topological entropies that is intrinsic to the corresponding window. While the former is now included (Sect. 8), the latter is still work in progress. Instead, the text now has two additional sections, one on a variational principle (Sect. 6) and one on the diffraction of the sets studied here (Sect. 7).

2 Definitions of Entropy

Let \(X\) be a subset of a lattice \(\Lambda \) in \(\mathbb R ^n\). Given a radius \(\rho >0\) and a point \({\varvec{t}}\in \Lambda \), the \(\rho \)-patch of \(X\) at \(\varvec{t}\) is

$$\begin{aligned} (X-{\varvec{t}})\cap B_\rho ({\varvec{0}}), \end{aligned}$$

the translation to the origin of the part of \(X\) within a distance \(\rho \) of \(\varvec{t}\). We denote by \(\mathcal{A }(\rho )\) the set of all \(\rho \)-patches of \(X\) and by \(N(\rho )=|\mathcal{A }(\rho )|\) the number of distinct \(\rho \)-patches of \(X\). Then the patch-counting entropy of \(X\) is

$$\begin{aligned} h_\mathrm{pc}(X):=\lim _{\rho \rightarrow \infty }\frac{\log _2N(\rho )}{\rho ^nv_n}, \end{aligned}$$
(1)

where \(v_n\) is the volume of an \(n\)-dimensional ball of radius 1, i.e. \(v_n=\pi ^{n/2}/\Gamma (1+\frac{n}{2})\) (so that the denominator is the volume of the open ball \(B_\rho ({\varvec{0}})\)). It can be shown by a subadditivity argument that this limit exists for every \(X\subset \Lambda \). In [5, Theorem 1 and Remark 2] Baake, Lenz and Richard show that, for the dynamical system of coloured Delone sets of finite local complexity, the patch-counting entropy coincides with the topological entropy; see Sect. 6 for more on the natural dynamical system associated with a subset \(X\) of \(\Lambda \) and the \(k\)-free points in particular.
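As a rough illustration of the counting in (1), the following sketch enumerates the distinct \(\rho \)-patches of the squarefree numbers (the case \(\Lambda =\mathbb Z \), \(k=2\) treated later in the text) that are seen at the sites \(t=1,\dots ,\texttt{window}\). The function names and the finite window are assumptions made for the illustration only; a finite window can only give a lower bound for \(N(\rho )\).

```python
def is_squarefree(n: int) -> bool:
    """Return True if n is nonzero and no square d*d with d >= 2 divides it."""
    if n == 0:
        return False
    n = abs(n)
    d = 2
    while d * d <= n:
        if n % (d * d) == 0:
            return False
        d += 1
    return True


def patches_seen(rho: int, window: int) -> set:
    """Distinct rho-patches of the squarefree numbers located at t = 1, ..., window.

    A patch is recorded as the tuple of offsets j with |j| < rho (open ball)
    such that t + j is squarefree. Only integer radii are handled here.
    """
    offsets = range(-rho + 1, rho)
    return {
        tuple(j for j in offsets if is_squarefree(t + j))
        for t in range(1, window + 1)
    }
```

For instance, `len(patches_seen(2, 10_000))` counts the 2-patches met in the first ten thousand sites; dividing its base-2 logarithm by the ball volume \(2\rho \) gives a crude finite-size estimate of the quantity whose limit is taken in (1).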

To describe measure entropy, we must take into account densities of subsets of a lattice. If \(Y\subset \Lambda \), its density \(\delta (Y)\) is defined by

$$\begin{aligned} \delta (Y):=\lim _{R\rightarrow \infty }\frac{|Y\cap B_R(\varvec{0})|}{R^nv_n}, \end{aligned}$$
(2)

when the limit exists; cf. [4] for related ways of defining densities of discrete point sets. In cases where the limit does not exist, we can still define an upper density, \(\bar{\delta }(Y)\) and a lower density, \(\underline{\delta }(Y)\), by replacing the limit in (2) by \(\limsup \) or \(\liminf \). The frequency, \(\nu (\mathcal{P })\), of a \(\rho \)-patch \(\mathcal{P }\) of \(X\) is defined by

$$\begin{aligned} \nu (\mathcal{P }){:}=\delta (\{\varvec{t}\in \Lambda :\text{ the } \rho \text{-patch } \text{ of } X \text{ at } \varvec{t} \text{ is } \mathcal{P }\}), \end{aligned}$$
(3)

when this density exists. In the absence of a well defined density, we can still define an upper frequency, \(\bar{\nu }(\mathcal{P })\) and a lower frequency, \(\underline{\nu }(\mathcal{P })\), by replacing \(\delta \) by \(\bar{\delta }\) or \(\underline{\delta }\). The measure entropy of \(X\), which can be thought of as corresponding to the metric entropy of a dynamical system, is now defined by

$$\begin{aligned} h_\mathrm{meas}(X):=\lim _{\rho \rightarrow \infty } \frac{1}{\rho ^nv_n}\sum _{\mathcal{P }\in \mathcal{A }(\rho )}\!\!\!- \nu (\mathcal{P })\log _2\nu (\mathcal{P }), \end{aligned}$$
(4)

with the convention that \(\nu \log _2\nu =0\) when \(\nu =0\); see Sect. 6 for details. It is defined when every patch of \(X\) has a well defined frequency, in which case a subadditivity argument again shows that the limit exists. Since \(\nu \log _2\nu \) is a convex function of \(\nu \), the sum does not decrease if we replace the \(\nu (\mathcal{P })\)’s by their average value, \(1/N(\rho )\), to make the right side the same as the right side of (1). Hence

$$\begin{aligned} h_\mathrm{meas}(X)\le h_\mathrm{pc}(X). \end{aligned}$$
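Spelled out, and writing \(N=N(\rho )\), the averaging step reads as follows, on the assumption (implicit in the phrase "average value \(1/N(\rho )\)") that the frequencies of the \(\rho \)-patches sum to 1:

$$\begin{aligned} \sum _{\mathcal{P }\in \mathcal{A }(\rho )}-\nu (\mathcal{P })\log _2\nu (\mathcal{P })\le N\cdot \big (-\tfrac{1}{N}\log _2\tfrac{1}{N}\big )=\log _2N(\rho ). \end{aligned}$$

Dividing by \(\rho ^nv_n\) and letting \(\rho \rightarrow \infty \) then compares the right side of (4) with the right side of (1).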

As a simple example where these entropies differ, consider the binary sequence consisting of the binary numbers in order (0, 1, 10, 11, 100, ...) with \(n,n+1\) separated by \(n\) 1’s:

$$\begin{aligned} \underline{0}\underline{1}1\underline{10}11 \underline{11}111\underline{100}1111\underline{101}11111 \underline{110}111111\underline{111}1111111\underline{1000}1111\ldots . \end{aligned}$$

Evidently, there are very few 0’s to contribute variety here. In fact the sequence of 0’s has density zero, and consequently any finite word that is not all 1’s has frequency zero. So \(h_\mathrm{meas}=0\). But since there are \(2^l\) possible words of length \(l\) and every word occurs somewhere, \(h_\mathrm{pc}=1\).
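The two claims about this sequence can be checked on a finite prefix. The sketch below builds the prefix from the rule stated above (numeral \(n\) followed by \(n\) 1's); the function names are chosen here for illustration and are not from the text.

```python
def build_sequence(count: int) -> str:
    """Concatenate the binary numerals of 0..count-1, each numeral n followed by n ones."""
    parts = []
    for n in range(count):
        parts.append(format(n, "b"))  # binary numeral of n, no prefix
        parts.append("1" * n)         # separator of n ones before numeral n + 1
    return "".join(parts)


def zero_fraction(seq: str) -> float:
    """Fraction of symbols in the prefix that are 0."""
    return seq.count("0") / len(seq)


def words_of_length(seq: str, l: int) -> set:
    """All distinct words of length l occurring in the prefix."""
    return {seq[i:i + l] for i in range(len(seq) - l + 1)}
```

Every word \(w\) of length \(l\) occurs inside the numeral of \(2^l+w\), so a prefix built from the first \(2^{l+1}\) numerals already contains all \(2^l\) words, while `zero_fraction` shrinks as the prefix grows.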

In general, \(h_\mathrm{pc}\) is a combinatorial function of the set of finite configurations that occur, while \(h_\mathrm{meas}\) is a geometric function of an infinite configuration and can differ among different infinite configurations built up from the same set of finite ones, with \(h_\mathrm{pc}\) being an upper bound for the possible values it can take. Of the two entropies, \(h_\mathrm{meas}\) would appear to carry more physical significance.

More generally, if we have a pattern formed by labelling the points of \(\Lambda \) with letters from an \(a\)-letter alphabet then we can again define \(\rho \)-patches, and the entropies of the pattern are given by (1) and (4) with 2 replaced by \(a\) as the base of logarithms. A subset of \(\Lambda \) corresponds to a 2-letter labelling indicating whether or not a site is occupied. The reason for the patch volume in the denominator and for the choice of base of logarithms is to normalize so that the integer lattice with random labelling has both entropies 1.

There are various ways in which the definition of measure entropy might be extended to sets \(X\) for which not all patch frequencies exist. A first step would be to replace the sum in (4) by

$$\begin{aligned} \lim _{R\rightarrow \infty }\sum _{\mathcal{P }\in \mathcal{A }(\rho )} -\frac{|L(\mathcal{P })\cap B_R(\varvec{0})|}{R^nv_n} \log _2\big (\frac{|L(\mathcal{P })\cap B_R(\varvec{0})|}{R^nv_n}\big ), \end{aligned}$$

where \(L(\mathcal{P })\) is the set appearing in (3). This delays taking the limit, so that it has a chance of existing even when some individual patch frequencies may fail to exist. We shall not need such extensions here, however, since Theorem 1 below guarantees that, for the sets studied in this paper, all patch frequencies exist.

3 \(k\)-Free Points

As a convenient context for our results, we shall use the set \(V=V(\Lambda ,k)\) of \(k\)-free points of a lattice \(\Lambda \) in \(\mathbb R ^n\). For a point \(\varvec{l}\ne \varvec{0}\) in \(\Lambda \) define its \(k\)-content, \(c_k(\varvec{l})\), to be the largest integer \(c\) such that \(\varvec{l}\in c^k\Lambda \). Then \(c_k(\varvec{l})\) is also the least common multiple of the numbers \(d\) with \(d^{-k}{\varvec{l}}\in \Lambda \), i.e. \(d^{-k}{\varvec{l}}\in \Lambda \) if and only if \(d\mid c_k(\varvec{l})\). For consistency and convenience, we define \(c_k(\varvec{0})=\infty \), with the understanding that \(d\mid \infty \) for any number \(d\). The \(k\)-free points, \(V=V(\Lambda ,k)\), of \(\Lambda \) are the points with \(c_k(\varvec{l})=1\). One can see that \(V\) is non-periodic, i.e. \(V\) has no nonzero translational symmetries. As particular cases we have the visible points of \(\Lambda \) (with \(n\ge 2\) and \(k=1\)), treated in [4], and the \(k\)-free integers (with \(\Lambda =\mathbb Z \)), treated in [4, 15, 16]. The more general context has the advantage of avoiding duplication of near-identical proofs. When \(n=k=1\), \(V\) consists of just the two points of \(\Lambda \) closest to \(\varvec{0}\) on either side, and we exclude this trivial case. Since \(\Lambda \) is a free Abelian group of rank \(n\), its automorphism group, \(\mathrm{Aut }(\Lambda )\), is isomorphic to the matrix group \(\mathrm{GL}(n,\mathbb Z )\). Explicit isomorphisms can be found by taking coordinates with respect to any basis of \(\Lambda \). Since the action of \(\mathrm{GL}(n,\mathbb Z )\) on \(\Lambda \) preserves \(k\)-content, the \(k\)-free points \(V\) are invariant under the action of \(\mathrm{GL}(n,\mathbb Z )\).
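For a concrete handle on the \(k\)-content, here is a minimal sketch for the special case \(\Lambda =\mathbb Z ^n\) with points given by their integer coordinates (by the remark on \(\mathrm{GL}(n,\mathbb Z )\), coordinates with respect to any basis would do). In that case \(\varvec{l}\in c^k\Lambda \) exactly when \(c^k\) divides the greatest common divisor of the coordinates. The function names are illustrative assumptions.

```python
from functools import reduce
from math import gcd


def k_content(point, k: int) -> int:
    """Largest c such that every coordinate of the nonzero integer vector is divisible by c**k."""
    g = reduce(gcd, (abs(x) for x in point))
    if g == 0:
        raise ValueError("the zero vector has k-content infinity by convention")
    c, p = 1, 2
    while p * p <= g:
        e = 0
        while g % p == 0:  # p is prime here: smaller primes were divided out
            g //= p
            e += 1
        c *= p ** (e // k)
        p += 1
    if g > 1:
        # one prime factor left with exponent 1; it contributes only when k = 1
        c *= g ** (1 // k)
    return c


def is_k_free(point, k: int) -> bool:
    """True if the point is a k-free point of Z^n (the zero vector is never k-free)."""
    if not any(point):
        return False
    return k_content(point, k) == 1
```

With \(k=1\) and \(n=2\) this reduces to the familiar test that a point of \(\mathbb Z ^2\) is visible when its coordinates are coprime; with \(n=1\) and \(k=2\) it tests squarefreeness.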

Proposition 1

\(V\) is uniformly discrete, but has arbitrarily large holes. Moreover, for any \(r>0\), there is a set of holes in \(V\) of inradius at least \(r\) whose centres have positive density.

Proof

Since \(V\subset \Lambda \), the uniform discreteness is trivial. Now let \(C=\{\varvec{a}_1,\dots ,\varvec{a}_s\}\) be any finite configuration of points in \(\Lambda \) (e.g., all points in a ball or a cube). Choose \(s\) integers \(m_1,\dots ,m_s>1\) that are pairwise coprime (e.g., the first \(s\) primes). By (39), there is a point \(\varvec{a}\in \Lambda \) with

$$\begin{aligned} \varvec{a}\equiv -\varvec{a}_i\pmod {m_i^k\Lambda } \end{aligned}$$

for \(i=1,\dots ,s\). Now for any \(\varvec{x}\equiv \varvec{a}\pmod {m_1^k\cdots m_s^k\Lambda }\) the configuration \(C+\varvec{x}=\{\varvec{a}_1+\varvec{x},\dots ,\varvec{a}_s+\varvec{x}\}\) is congruent, in the geometric sense, to \(C\) but no point in \(C+\varvec{x}\) is in \(V\), since \(\varvec{a} _i+\varvec{x} \in m_i^k\Lambda \) for \(i=1,\dots ,s\). The points \(\varvec{x}\) have density \(1/((m_1\cdots m_s)^{nk}\det (\Lambda ))>0\) by (38) and (39). \(\square \)
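The construction in the proof can be carried out explicitly. The sketch below does so for \(\Lambda =\mathbb Z \), taking the \(m_i\) to be the first \(s\) primes as suggested in the proof and solving the congruences one at a time; the helper names are hypothetical, and the three-argument `pow` with exponent \(-1\) needs Python 3.8 or later.

```python
def first_primes(s: int) -> list:
    """The first s primes, by trial division against the primes found so far."""
    primes, n = [], 2
    while len(primes) < s:
        if all(n % p for p in primes):
            primes.append(n)
        n += 1
    return primes


def hole_start(offsets, k: int = 2):
    """Return (x, M) such that offsets[i] + x is divisible by p_i**k for each i.

    Here p_i is the i-th prime and M is the product of the moduli p_i**k, so
    every x' congruent to x modulo M works equally well.
    """
    moduli = [p ** k for p in first_primes(len(offsets))]
    x, M = 0, 1
    for a, m in zip(offsets, moduli):
        # choose t with x + t*M = -a (mod m); M and m are coprime
        t = ((-a - x) * pow(M, -1, m)) % m
        x += t * M
        M *= m
    return x, M
```

For example, `hole_start(range(5))` returns a translate at which five consecutive integers are all non-squarefree, together with the period \(M=(2\cdot 3\cdot 5\cdot 7\cdot 11)^2\); the translates \(x+M\mathbb Z \) have density \(1/M\), in line with the density stated in the proof for \(n=1\) and \(\det (\Lambda )=1\).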

For a natural number \(P\), we define \(V_P=V_P(\Lambda ,k)\) to be the set of points \({\varvec{l}}\in \Lambda \setminus \{\varvec{0}\}\) with \((c_k({\varvec{l}}),P)=1\). Clearly \(V_P\) is fully periodic with a lattice of periods that contains \(P^k\Lambda \). The \(V_P\)’s are partially ordered inversely to the divisibility partial order on \(\mathbb N \), that is, \(V_{PQ}\subset V_P\) for all \(P,Q\). In fact, more precisely, \(V_{PQ}=V_P\cap V_Q\). The intersection of all the \(V_P\)’s is \(V\), so if \(P\) is divisible by all primes up to a large bound \(V_P\) can be regarded as a set of “potentially \(k\)-free” points.
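The periodicity of \(V_P\) and the identity \(V_{PQ}=V_P\cap V_Q\) are easy to test numerically. The following sketch assumes \(\Lambda =\mathbb Z \), where \((c_k(l),P)=1\) means that \(p^k\nmid l\) for every prime \(p\mid P\); the function names are illustrative.

```python
def prime_factors(P: int) -> list:
    """Distinct prime factors of a natural number P, by trial division."""
    factors, d = [], 2
    while d * d <= P:
        if P % d == 0:
            factors.append(d)
            while P % d == 0:
                P //= d
        d += 1
    if P > 1:
        factors.append(P)
    return factors


def in_V_P(l: int, P: int, k: int) -> bool:
    """Membership of the integer l in V_P(Z, k): l is nonzero and no p**k with p | P divides l."""
    return l != 0 and all(l % p ** k != 0 for p in prime_factors(P))
```

With \(P=6\) and \(k=2\), for instance, membership is unchanged under \(l\mapsto l+36\), reflecting the period lattice \(P^k\Lambda \).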

For a finite subset \(\mathcal F \) of \(\Lambda \) and a positive integer \(m\), we shall use

$$\begin{aligned} \mathcal F /m\Lambda \end{aligned}$$

to denote the set of cosets of \(m\Lambda \) in \(\Lambda \) that are represented in \(\mathcal F \). We also write

$$\begin{aligned} D(\mathcal F ):=\max _{\varvec{l},\varvec{m}\in \mathcal F }\Vert \varvec{l}-\varvec{m}\Vert \end{aligned}$$

for the diameter of \(\mathcal F \), where \(\Vert \cdot \Vert \) denotes the Euclidean norm on \(\mathbb R ^n\).

Since entropies of sets in \(\mathbb R ^n\) vary under change of scale inversely as the \(n\)th power of the scaling constant, it is sufficient to consider lattices of determinant 1. (For other lattices the formula for the entropy of \(V\) must simply be divided by the determinant of \(\Lambda \).) We fix the following notation for the rest of this paper:

\(\Lambda \) is a lattice of determinant \(1\) in \(\mathbb R ^n\), \(\lambda \) is the length of its shortest nonzero vector, \(k\) is a natural number (with \(k\!\ge \!2\) if \(n\!=\!1\)) and \(V\) is the set of \(k\)-free points in \(\Lambda \).

Also, for subsets \(X,\mathcal{P },\mathcal Q \) of \(\Lambda \), with \(X\) infinite but \(\mathcal{P },\mathcal Q \) finite, we define the locator set

$$\begin{aligned} L(X;\mathcal{P },\mathcal Q ):= \{\varvec{t}\in \Lambda :\mathcal{P }+\varvec{t}\subset X,\ \mathcal Q +\varvec{t}\subset \Lambda \setminus X\} \end{aligned}$$

consisting of those lattice translations that locate \(\mathcal{P }\) totally inside \(X\) and \(\mathcal Q \) totally outside \(X\).

The genesis of our proof of positive, but non-maximal, patch-counting entropy for the visible points is the observation that, of the four corners of any unit square of the integer lattice in the plane, at least one is invisible (because both its coordinates are even) but each of the 15 possibilities for the visibility or not of the corners, when the possibility of their all being visible is excluded, can occur, depending on the position of the square within the lattice. This is the simplest example of the fact that, in general, every \(\rho \)-patch contains an irreducible minimum of points not in \(V\) but for the remaining points in the patch we can arrange that they are visible or not, independently of each other by choosing the position of the patch in the lattice. This leads to an exponentially large number of \(\rho \)-patches, the number of which can be estimated quite accurately.

Our aim with the following lemma is to concentrate most of the necessary inclusion–exclusion arguments into a single result from which ensuing results can be fairly readily derived. For this reason it has several parameters (\(\mathcal{P }\), \(m\), \(\varvec{m}\), \(P\) and \(\varvec{x}\)) and three components to its error term. Until the parameters are further specified, there is no assumption that the error terms are of smaller order than the main term. To keep the proof short we have not made the error terms as small as possible—in Sect. 8 we make use of the technique of [15] to vastly improve the last error term.

Lemma 1

Let \(\mathcal{P }\) be a finite subset of \(\Lambda \), \(m\in \mathbb N \), \(\varvec{m}\in \Lambda \), \(P\) be a natural number coprime to \(m\) and \(\varvec{x}\in \mathbb R ^n\). Then

$$\begin{aligned} |L(V_P;\mathcal{P },\emptyset )\cap (\varvec{m}+m\Lambda )\cap B_R(\varvec{x})| \end{aligned}$$

is estimated by a main term

$$\begin{aligned} \frac{R^nv_n}{m^n}\prod _{p\mid P}\big (1-\frac{|{ \mathcal P }/p^k\Lambda |}{p^{nk}}\big ) \end{aligned}$$
(5)

with error

$$\begin{aligned} O\big (R^{1/k}+R^{n-1}(\min \{\log \log P,\log S\})^{|\mathcal{P }|} +\min \{\tau _{|\mathcal{P }|+1}(P),(S/\lambda )^{|\mathcal{P }|/k}\}\big ), \end{aligned}$$
(6)

where \(S:=R+\Vert \varvec{x}\Vert +\max _{\varvec{p}\in \mathcal{P }}\Vert \varvec{p}\Vert \), \(\tau _r\) is the \(r\)-divisor function in (35), and the \(O\)-constant depends only on \(\Lambda \), \(k\) and \(\mathcal{P }\).

Proof

We may clearly assume that \(P\) is squarefree. For each prime \(p\) the points \(\varvec{t}\) with \(\varvec{t}+\mathcal{P }\subset V_p\) consist of \(p^{nk}-|\mathcal{P }/p^k\Lambda |\) cosets of \(p^k\Lambda \) in \(\Lambda \) (those cosets \(\varvec{t}+p^k\Lambda \) with \((-\varvec{t}+p^k\Lambda )\cap \mathcal{P }=\emptyset \)). Clearly \(|\mathcal{P }/p^k\Lambda |=|\mathcal{P }|\) when \(p^k\lambda >D(\mathcal{P })\). Let \(Q\) be the product of those prime factors \(p\) of \(P\) with \(|\mathcal{P }/p^k\Lambda |<|\mathcal{P }|\). By the Chinese Remainder Theorem (39), \(L(V_Q;\mathcal{P },\emptyset )\cap (\varvec{m}+m\Lambda )\) consists of

$$\begin{aligned} \prod _{p\mid Q}(p^{nk}-|\mathcal{P }/p^k\Lambda |) \end{aligned}$$

cosets of \(mQ^k\Lambda \) in \(\Lambda \). For each such coset \(\varvec{q}+mQ^k\Lambda \) we have

$$\begin{aligned} (\varvec{q}+mQ^k\Lambda )\cap L(V_P;\mathcal{P },\emptyset )=(\varvec{q}+mQ^k\Lambda )\cap L(V_{P/Q};\mathcal{P },\emptyset ). \end{aligned}$$
(7)

Now write \(\mathcal{P }=\{\varvec{p}_1,\ldots ,\varvec{p}_r\}\). Since \(\varvec{p}_i+\varvec{t}\in V_{P/Q}\) if and only if \(c_k(\varvec{p}_i+\varvec{t})\) is coprime to \({P/Q}\), it follows from (31) that for each of these cosets the cardinal of \(L(V_P;\mathcal{P },\emptyset )\cap (\varvec{q}+mQ^k\Lambda )\cap B_R(\varvec{x})\) is

$$\begin{aligned} \mathop {\mathop {\sum }\limits _{\varvec{t}\in \Lambda \cap B_R(\varvec{x})}}\limits _{\varvec{t}-\varvec{q}\in mQ^k\Lambda } \ \prod _{i=1}^r \mathop {\mathop {\sum }\limits _{d\mid P/Q}}\limits _{d\mid c_k(\varvec{p}_i+\varvec{t})} \mu (d). \end{aligned}$$

Reversing the order of summation gives

$$\begin{aligned} \begin{array}[t]{c} \displaystyle \sum _{d_1\mid P/Q} \ \sum _{d_2\mid P/Q}\cdots \sum _{d_r\mid P/Q}\\ d_i^k<S/\lambda \text{ for } \text{ each } i \end{array} \mu (d_1d_2\cdots d_r) \mathop {\mathop {\mathop {\sum }\limits _{\varvec{t}\in \Lambda \cap B_R(\varvec{x})}} \limits _{\varvec{t}\in \varvec{q}+mQ^k\Lambda }}\limits _{\varvec{t}\in -\varvec{p}_i+d_i^k\Lambda }1, \end{aligned}$$
(8)

where replacing \(\mu (d_1)\cdots \mu (d_r)\) by \(\mu (d_1\cdots d_r)\) is justified by the fact that the \(d_i\)’s are pairwise coprime since any common factor of \(c_k(\varvec{p}_i+\varvec{t})\) and \(c_k(\varvec{p}_j+\varvec{t})\) divides \(c_k(\varvec{p}_i-\varvec{p}_j)\), all of whose prime factors divide \(Q\). Writing \(d_1\cdots d_r=d\) and noting that \((mQ,d)=1\), we can apply (38) with \(\Lambda \) replaced by \(m(dQ)^k\Lambda \) to obtain, for the inner sum, the estimate

$$\begin{aligned} \frac{R^nv_n}{m^n(dQ)^{nk}}+O(R^{n-1}/m^{n-1}(dQ)^{(n-1)k})+O(1). \end{aligned}$$
(9)

Substituting this estimate in (8) gives a main term

$$\begin{aligned} \frac{R^nv_n}{m^nQ^{nk}}\prod _{p\mid P/Q}\big (1-\frac{r}{p^{nk}}\big ) \end{aligned}$$

with error term (6). The main term arises by removing the conditions \(d_i^k<S/\lambda \) from the sum of the main term in (9) then using the fact that \(\mu (d)\tau _r(d)\) (where \(\tau _r(d)\) is the number of ways of expressing \(d\) as a product of \(r\) natural numbers) is a multiplicative function, whose value is \(-r\) at primes and 0 at prime powers, to express the extended sum as an Euler product, as in (37). The first error term in (6) comes from the extra terms included in the extended multiple sum, so is

$$\begin{aligned} \le \frac{rR^nv_n}{m^nQ^{nk}}\sum _{d_1^k\ge S/\lambda }\frac{1}{d_1^{nk}}\sum _{d_2=1}^\infty \frac{1}{d_2^{nk}} \cdots \sum _{d_r=1}^\infty \frac{1}{d_r^{nk}}=O\big (R^nS^{-n+(1/k)}\big ), \end{aligned}$$

since each of the \(r-1\) complete sums is \(\le \zeta (2)<2\). The other two error terms account for the sum over the error terms in (9). The logarithms in the middle error term are necessary only in the case \(n=2\), \(k=1\), when the series \(\sum d_i^{-(n-1)k}\) diverge but the partial sums can be estimated by using (36) or the standard estimate for the partial sums of the harmonic series. In all other cases these series converge and the middle error term can be taken as \(O(R^{n-1})\). (When \(n=1\), there is no middle error term, since the first error term in (9) is then the same as the last.)

Finally, summing over the cosets of \(mQ^k\Lambda \) that make up \(L(V_Q;\mathcal{P },\emptyset )\cap (\varvec{m}+m\Lambda )\) gives the main term (5) (since \(|\mathcal{P }/p^k\Lambda |=r\) when \(p\not \mid Q\)) and increases the error term by a factor at most \(Q^k\), which is bounded in terms of \(k\) and \(\mathcal{P }\). \(\square \)

Corollary 1

If \(\rho \) is a positive radius and \(P\) is a natural number divisible by every prime less than \(\log \rho \), then

$$\begin{aligned}&|V_P\cap B_\rho (\varvec{0})|=\frac{\rho ^nv_n}{\zeta (nk)}+o(\rho ^n),\end{aligned}$$
(10)
$$\begin{aligned}&|V\cap B_\rho (\varvec{0})|=\frac{\rho ^nv_n}{\zeta (nk)}+o(\rho ^n),\end{aligned}$$
(11)
$$\begin{aligned}&|(V_P\setminus V)\cap B_\rho (\varvec{0})|=o(\rho ^n) \end{aligned}$$
(12)

and, for any \(\varvec{x}\in \mathbb R ^n\),

$$\begin{aligned} |V_P\cap B_\rho (\varvec{x})|\le \frac{\rho ^nv_n}{\zeta (nk)}+o(\rho ^n), \end{aligned}$$
(13)

where \(\zeta \) is the Riemann \(\zeta \)-function.

Proof

For (10) we use the lemma with \(\mathcal{P }=\{\varvec{0}\}\), \(m=1\), \(\varvec{x}=\varvec{0}\) and \(R=\rho \), then replace the product by \(1/\zeta (nk)\) using (33), with \(N=\log \rho \), and (32). This gives (10) with error term \(O(\rho ^n/\log ^{nk-1}\rho )\).

Clearly \(V_Q\cap B_\rho (\varvec{0})=V\cap B_\rho (\varvec{0})\) when \(Q\) is the product of all primes less than \((\rho /\lambda )^{1/k}\) (where \(\lambda \) is the length of the shortest nonzero vector in \(\Lambda \)), giving (11), and (12) results from subtracting this from (10).

For (13) we use the lemma with \(P\) replaced by \(P^{\prime }\), the product of the primes less than \(\log \rho \), together with (33) and (32), and note that \(V_P\subset V_{P^{\prime }}\). Then

$$\begin{aligned} |V_P\cap B_\rho ({\varvec{x}})|\le |V_{P^{\prime }}\cap B_\rho ({\varvec{x}})|= \frac{\rho ^nv_n}{\zeta (nk)}+O\big (\rho ^n/\log ^{nk-1}\rho \big ), \end{aligned}$$

since \(\log P^{\prime }=O(\log \rho )\) and hence \(\log \log P^{\prime }\) and \(\tau (P^{\prime })\) are both \(O(\rho ^\epsilon )\) by (34) and (35). \(\square \)

We note that (11) tells us that \(V\) has density \(1/\zeta (nk)\), generalizing Propositions 6 and 11 of [4] (though the error terms are not as good as those in [4] and much worse than those in [15, 16]). Also, one might regard (13) as saying that \(V\) has a “uniform upper density” (or that \(\Lambda \setminus V\) has a uniform lower density).
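The density statement can be checked by brute force in the visible-point case \(\Lambda =\mathbb Z ^2\), \(k=1\), where \(1/\zeta (nk)=6/\pi ^2\). The sketch below counts visible points in an open disc and compares the count with the disc area; it is an illustration under these assumptions, not a substitute for the error estimates above.

```python
from math import gcd, pi


def visible_in_ball(rho: float) -> int:
    """Number of visible points (x, y) of Z^2 with x^2 + y^2 < rho^2."""
    r = int(rho)
    return sum(
        1
        for x in range(-r, r + 1)
        for y in range(-r, r + 1)
        if x * x + y * y < rho * rho and gcd(x, y) == 1  # gcd(0, 0) = 0 excludes the origin
    )


def empirical_density(rho: float) -> float:
    """Count of visible points divided by the volume rho^n v_n of the disc (n = 2)."""
    return visible_in_ball(rho) / (pi * rho * rho)
```

For moderately large radii `empirical_density(rho)` should settle near \(6/\pi ^2\approx 0.6079\).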

The following two theorems carry over to \(k\)-free points the results of Mirsky [15, 16] on \(k\)-free numbers ([15] improves the error terms in [16]). A weaker result for squarefree numbers goes back to Pillai [20]. Again, we make no attempt in Theorem 1 to match the error term of [15], postponing this to Sect. 8.

Theorem 1

For any two disjoint finite subsets \(\mathcal{P }\) and \(\mathcal Q \) of \(\Lambda \), \(L(V;\mathcal{P },\mathcal Q )\) has a well defined density given by

$$\begin{aligned} \sum _{\mathcal F \subset \mathcal Q }(-1)^{|\mathcal F |} \prod _p\big (1-\frac{|(\mathcal{P }\cup \mathcal F )/p^k\Lambda |}{p^{nk}}\big ). \end{aligned}$$

Proof

By the inclusion–exclusion principle (30) applied to \(L(V;\mathcal{P },\emptyset )\cap B_R(\varvec{0})\), with \(P_i\) being the property that \(\varvec{q}_i+\varvec{t}\in V\) (where \(\mathcal Q =\{\varvec{q}_1,\varvec{q}_2,\ldots \}\)), we have

$$\begin{aligned} |L(V;\mathcal{P },\mathcal Q )\cap B_R(\varvec{0})|= \sum _{\mathcal F \subset \mathcal Q }(-1)^{|\mathcal F |}| L(V;\mathcal{P }\cup \mathcal F ,\emptyset )\cap B_R(\varvec{0})|. \end{aligned}$$

Now Lemma 1 with \(P\) equal to the product of the primes less than \(\log R\) gives the estimate

$$\begin{aligned} R^nv_n\prod _{p<\log R}\big (1-\frac{|(\mathcal{P }\cup \mathcal F )/p^k\Lambda |}{p^{nk}}\big ) +O(R^{1/k}+R^{n-1+\varepsilon }) \end{aligned}$$

for \(|L(V_P;\mathcal{P }\cup \mathcal F ,\emptyset )\cap B_R(\varvec{0})|\), the proof of (12) of Corollary 1 shows that \(V_P\) can be replaced by \(V\) at the expense of an extra error term \(O(R^n/(\log R)^{nk-1})\), and (33) allows the product to be extended over all primes with a similar extra error term. Altogether, this gives the estimate

$$\begin{aligned} R^nv_n\sum _{\mathcal F \subset \mathcal Q }(-1)^{|\mathcal F |}\prod _{p}\big (1-\frac{|(\mathcal{P }\cup \mathcal F )/p^k\Lambda |}{p^{nk}}\big ) +O(R^n/(\log R)^{nk-1}) \end{aligned}$$

for \(|L(V;\mathcal{P },\mathcal Q )\cap B_R(\varvec{0})|\). \(\square \)

Theorem 2

For disjoint finite subsets \(\mathcal{P }\) and \(\mathcal Q \) of \(\Lambda \), the following statements are equivalent:

(i) \(|\mathcal{P }/p^k\Lambda |<p^{nk}\) for every prime \(p\);

(ii) \(L(V;\mathcal{P },\mathcal Q )\) is non-empty;

(iii) \(L(V;\mathcal{P },\mathcal Q )\) has positive density.

Proof

Clearly (iii) implies (ii) and, almost as clearly, (ii) implies (i), since if \(\mathcal{P }\) contains a complete set of coset representatives for \(p^k\Lambda \) then, for every \(\varvec{t}\in \Lambda \), some point of \(\mathcal{P }+\varvec{t}\) is in \(p^k\Lambda \) (so not in \(V\)).

Now assume (i) holds. For each \(\varvec{q}\in \mathcal Q \) choose a different prime \(p(\varvec{q})>(D(\mathcal{P }\cup \mathcal Q )/\lambda )^{1/k}\) and let \(m\) be the product of the \(p(\varvec{q})\)’s. By the Chinese Remainder Theorem, there is an \(\varvec{m}\in \Lambda \) such that

$$\begin{aligned} \varvec{t}\equiv \varvec{m}{ (\text{ mod } \ m^k\Lambda )}\iff \varvec{t}\equiv -\varvec{q}{ (\text{ mod } \ p(\varvec{q})^k\Lambda )}\quad \forall \varvec{q}\in \mathcal Q . \end{aligned}$$

Then for \(\varvec{t}\equiv \varvec{m}\) (mod \(m^k\Lambda \)) we have \(\mathcal Q +\varvec{t}\subset \Lambda \setminus V\) and \(\mathcal{P }+\varvec{t}\subset V_m\) (the latter using the fact that for every \(\varvec{p}\in \mathcal{P }\) and every prime factor \(p(\varvec{q})\) of \(m\), \(\varvec{q}+\varvec{t}\in p(\varvec{q})^k\Lambda \) and \(\Vert \varvec{p}-\varvec{q}\Vert <p(\varvec{q})^k\lambda \), ensuring that \(\varvec{p}+\varvec{t}\not \in p(\varvec{q})^k\Lambda \)). Now Lemma 1, with \(P\) the product of the primes less than \(\log R\) not dividing \(m\), gives a main term \(CR^n\) with error \(O(\max \{R^{n-1}\log R,R^{1/k}\})\) for the cardinal of a subset of the points \(\varvec{t}\in \Lambda \cap B_R(\varvec{0})\) with \(\mathcal{P }+\varvec{t}\subset V_{mP}\) and \(\mathcal Q +\varvec{t}\subset \Lambda \setminus V\), where the constant \(C\) is positive since the product in (5) has every term positive. By (12) of Corollary 1, the number of these points with \(\mathcal{P }+\varvec{t}\not \subset V\) is \(o(R^n)\). Hence \(L(V;\mathcal{P },\mathcal Q )\) has positive lower density, and so, by Theorem 1, positive density. \(\square \)

An interesting feature of Theorem 2 is that the criterion (i) is independent of \(\mathcal Q \). This means, for example, that

$$\begin{aligned} L(V;\mathcal{P },\emptyset )\ne \emptyset \quad \Rightarrow \quad \delta (L(V;\mathcal{P },\mathcal Q ))>0\quad \forall \mathcal Q \text{ with } \mathcal{P }\cap \mathcal Q =\emptyset , \end{aligned}$$

which tells us, in particular, that every subset of a patch of \(V\) is a patch of \(V\).
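Criterion (i) is a finite check, since only primes with \(p^{nk}\le |\mathcal{P }|\) can possibly be hit in every coset. The sketch below implements it for \(\Lambda =\mathbb Z ^n\) with points given as integer tuples, and counts by exhaustion the subsets of a finite ground set that fail it; the function names are assumptions made for this illustration.

```python
from itertools import combinations, product


def is_admissible(points, k: int, n: int) -> bool:
    """Criterion (i) of Theorem 2 for a finite set of points of Z^n.

    Composite moduli are tested along with primes; this is harmless, because a
    set meeting every coset of m^k Z^n also meets every coset of p^k Z^n for
    each prime p dividing m.
    """
    size = len(points)
    m = 2
    while m ** (n * k) <= size:
        residues = {tuple(c % m ** k for c in pt) for pt in points}
        if len(residues) == m ** (n * k):
            return False
        m += 1
    return True


def count_inadmissible(ground, k: int, n: int) -> int:
    """Number of subsets of the finite set `ground` that fail criterion (i)."""
    return sum(
        1
        for r in range(len(ground) + 1)
        for subset in combinations(ground, r)
        if not is_admissible(subset, k, n)
    )


# Ground sets used in the test cases: five consecutive integers and a 3 x 3 block of Z^2.
SEGMENT = [(x,) for x in range(-2, 3)]
BLOCK = list(product(range(-1, 2), repeat=2))
```

By Theorem 2, the subsets counted by `count_inadmissible` are exactly those that never occur as the occupied part of a patch of \(V\), whatever the complementary set \(\mathcal Q \) is.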

4 Examples

Theorem 1 allows us to calculate the frequencies of \(\rho \)-patches of \(V\) in terms of the products

$$\begin{aligned} \Pi _r(nk):=\prod _{p>r^{1/nk}}\big (1-\frac{r}{p^{nk}}\big ) \end{aligned}$$

for \(r=0,1,\ldots ,|\Lambda \cap B_\rho (\varvec{0})|\). Here, we give two simple examples that both have \(nk=2\) and that have \(|\Lambda \cap B_\rho (\varvec{0})|=3\) and 5, respectively. So we need the products

$$\begin{aligned} \Pi _1(2)&= 1/\zeta (2)=6/\pi ^2=0.6079271\ldots ,\\ \Pi _2(2)&= 0.3226340\ldots \text{(the } \text{ Feller--Tornier } \text{ constant), }\\ \Pi _3(2)&= 0.1254869\ldots ,\quad \Pi _4(2)=0.3785994\ldots ,\quad \Pi _5(2)=0.2733455\ldots , \end{aligned}$$

whose values can be calculated efficiently by the method described in [19].
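As a plain cross-check of these values (and not the method of [19], which is not reproduced here), one can simply truncate the Euler product at a large bound. The tail beyond a bound \(N\) changes the product by a relative amount of order \(r/(N\log N)\), so primes below \(10^6\) give roughly six correct decimals. The function names are illustrative.

```python
def primes_below(limit: int) -> list:
    """All primes p < limit, by the sieve of Eratosthenes (limit >= 2)."""
    sieve = bytearray([1]) * limit
    sieve[0:2] = b"\x00\x00"
    for p in range(2, int(limit ** 0.5) + 1):
        if sieve[p]:
            # strike out the multiples of p starting from p*p
            sieve[p * p::p] = bytearray(len(range(p * p, limit, p)))
    return [i for i, flag in enumerate(sieve) if flag]


def truncated_product(r: int, s: int, primes) -> float:
    """Product of (1 - r / p**s) over the given primes p with p**s > r.

    With s = nk this is the product Pi_r(nk) of the text, cut off at the
    largest prime supplied.
    """
    value = 1.0
    for p in primes:
        if p ** s > r:
            value *= 1.0 - r / p ** s
    return value
```

Calling `truncated_product(r, 2, primes_below(10**6))` for \(r=1,\dots ,5\) should agree with the values listed above to about six decimal places.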

Our first example is to find the frequencies of all 2-patches when \(V\) is the set of squarefree numbers. Here \(\Lambda =\mathbb Z \), \(n=1\), \(k=2\) and \(|\Lambda \cap B_2(\varvec{0})|=3\). Since \(-1,0,1\) are distinct mod \(p^k\), for every \(p\), \(|(\mathcal{P }\cup \mathcal F )/p^k|=|\mathcal{P }\cup \mathcal F |\) and \(\nu (\mathcal{P })\) depends only on \(|\mathcal{P }|\) in this case. Table 1 gives the frequencies of 2-patches of all possible sizes, both in terms of the above products and numerically, and Fig. 1 depicts the patches themselves, with their frequencies. There are three patches each of sizes 1 and 2, and we check that the sum, \(\sum \nu (\mathcal{P })\), of the frequencies of all patches is 1 and that the average patch size, \(\sum \nu (\mathcal{P })|\mathcal{P }|\), is \(3\delta (V)=18/\pi ^2\). The patches of size 2 are the most frequent, as is to be expected since 2 is the closest integer to \(3\delta (V)\): indeed, 59 % of all locations have patches of size 2. The empty patch is by far the rarest, occurring at less than 2 % of locations. The radius \(\rho =2\) is the largest for which every subset of \(\Lambda \cap B_\rho (\varvec{0})\) occurs as a patch of \(V\): of the 32 subsets of \(\Lambda \cap B_3(\varvec{0})\) the 3 that contain 4 or 5 consecutive points do not occur as patches of \(V\).
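The frequencies in this example follow from Theorem 1 by a short computation, sketched below with the numerical values of \(\Pi _1(2)\), \(\Pi _2(2)\), \(\Pi _3(2)\) quoted above as input (and \(\Pi _0=1\) for the empty product). Since \(r\le 3<4\le p^2\) for every prime \(p\), the restricted products \(\Pi _r(2)\) coincide here with the full products over all primes appearing in Theorem 1. The names `PI` and `patch_frequency` are chosen for this illustration; `math.comb` needs Python 3.8 or later.

```python
from math import comb

# Values of Pi_r(2) as quoted in the text; Pi_0 = 1 corresponds to the empty set.
PI = {0: 1.0, 1: 0.6079271, 2: 0.3226340, 3: 0.1254869}


def patch_frequency(size: int, sites: int = 3) -> float:
    """Frequency of one 2-patch of the squarefree numbers with `size` occupied sites.

    Theorem 1 with |P| = size and |Q| = sites - size: each subset F of Q with
    |F| = i contributes (-1)**i * Pi_{size + i}(2), because -1, 0, 1 are
    distinct modulo p^2 for every prime p.
    """
    holes = sites - size
    return sum((-1) ** i * comb(holes, i) * PI[size + i] for i in range(holes + 1))
```

Weighting `patch_frequency(j)` by the number \(\binom{3}{j}\) of patches of size \(j\) recovers the checks mentioned above: the weighted frequencies sum to 1 and the average patch size is \(3\Pi _1(2)\).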

Table 1 Frequencies of the 2-patches of the squarefree numbers
Fig. 1 The 2-patches of the squarefree numbers (\(\Lambda =\mathbb Z \), \(k=2\)) with their frequencies accurate to five decimal places. The black dots are points of \(V\) and the open circles other lattice points. The top row contains the patches with mirror symmetry and the bottom row the two mirror image pairs

Our other example is the \(\sqrt{2}\)-patches of the visible points, \(V\), in \(\mathbb Z ^2\), where \(\Lambda =\mathbb Z ^2\), \(n=2\), \(k=1\) and \(|\Lambda \cap B_{\sqrt{2}}(\varvec{0})|=5\). Figure 2 shows the different patches, up to symmetry, with their frequencies. The four patches in the top row have the full dihedral symmetry \(D_4\); the two in the second row have symmetry \(D_2\), and give rise to another patch on rotation through \(\pi /2\); the remaining six patches have only reflection symmetry, and each gives rise to three others on rotation through \(\pm \pi /2\) and \(\pi \). We can again check that \(\sum \nu (\mathcal{P })=1\) and \(\sum \nu (\mathcal{P })|\mathcal{P }|=5\delta (V)=30/\pi ^2\). This time, however, the frequencies do not depend only on \(|\mathcal{P }|\), and indeed no two patches that are not symmetry related have the same frequency. Of the five patches with \(|\mathcal{P }|=4\), the symmetric one has frequency nearly five times that of each of the other four, and the ratio of the frequencies of two of the patches with \(|\mathcal{P }|=3\) is nearly 30. Surprisingly, one of the patches with \(|\mathcal{P }|=3\) (the commonest patch size) has frequency smaller than that of any patch except the empty one. The empty patch itself occurs at less than 1 in 900 locations. As in the previous example, \(\sqrt{2}\) is the largest radius for which every subset of \(\Lambda \cap B_\rho (\varvec{0})\) is a patch: of the 512 subsets of \(\Lambda \cap B_{\sqrt{3}}(\varvec{0})\), the 135 that contain a representative of every coset of \(2\Lambda \) (for example, all four vertices of a unit lattice square) do not occur as patches of \(V\).

5 Entropy Calculations

Theorem 3

\(h_\mathrm{pc}(V)=1/\zeta (nk)\).

Proof

For each radius \(\rho >0\) let \(P=P(\rho )\) be the product of the primes \(p\) with \(p^{nk}\le |\Lambda \cap B_\rho (\varvec{0})|\). Then \(P\) is divisible by every prime less than \(\log \rho \) when \(\rho \) is large enough.

Fig. 2 The \(\sqrt{2}\)-patches of the visible points of \(\mathbb Z ^2\), up to symmetry, with their frequencies, accurate to five decimal places

The \(\rho \)-patch of \(V\) at any point \(\varvec{t}\in \Lambda \) is a subset of \((V_P-\varvec{t})\cap B_\rho (\varvec{0})\) and by (13) of Corollary 1 the cardinality of this set is at most \(\rho ^nv_n/\zeta (nk)+o(\rho ^n)\). Also there are at most \(P^{nk}\) possibilities for \(V_P-\varvec{t}\) as \(\varvec{t}\) varies, since \(P^k\Lambda \) is the lattice of periods of \(V_P\). So

$$\begin{aligned} \log _2N(\rho )&\le \frac{\rho ^nv_n}{\zeta (nk)}+ o(\rho ^n)+nk\log _2P\nonumber \\&\le \frac{\rho ^nv_n}{\zeta (nk)}+o(\rho ^n)+O(\rho ^{1/k}), \end{aligned}$$
(14)

since \(\log _2P=O(\rho ^{1/k})\), by (34).

To bound \(\log _2N(\rho )\) below we note that every subset \(\mathcal{P }\) of \(V\cap B_\rho (\varvec{0})\) is the \(\rho \)-patch of \(V\) at some point of \(\Lambda \), by Theorem 2 with \(\mathcal Q =\Lambda \cap B_\rho (\varvec{0})\setminus \mathcal{P }\). By (11) of Corollary 1, \(|V\cap B_\rho (\varvec{0})|=\rho ^nv_n/\zeta (nk)+o(\rho ^n)\), so

$$\begin{aligned} \log _2N(\rho )\ge \frac{\rho ^nv_n}{\zeta (nk)}+o(\rho ^n). \end{aligned}$$
(15)

On dividing by \(\rho ^nv_n\) and letting \(\rho \) tend to infinity, (14) and (15) give \(h_\mathrm{pc}(V)=1/\zeta (nk)\). \(\square \)
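For orientation, the value \(1/\zeta (nk)\) is easy to evaluate numerically. The sketch below (helper names `inv_zeta` and `kfree_density` are illustrative, and the sample size is an arbitrary choice) compares a truncated Euler product for \(1/\zeta (s)\) with the observed density of \(k\)-free integers, the case \(\Lambda =\mathbb Z \), \(n=1\).

```python
import math

def primes_up_to(n):
    """Sieve of Eratosthenes."""
    sieve = [True] * (n + 1)
    sieve[0:2] = [False, False]
    for p in range(2, int(n ** 0.5) + 1):
        if sieve[p]:
            for q in range(p * p, n + 1, p):
                sieve[q] = False
    return [p for p in range(2, n + 1) if sieve[p]]

def inv_zeta(s, prime_bound=10_000):
    """Truncated Euler product for 1/zeta(s) = prod_p (1 - p^(-s))."""
    value = 1.0
    for p in primes_up_to(prime_bound):
        value *= 1.0 - p ** (-s)
    return value

def kfree_density(k, limit):
    """Fraction of the integers 1..limit that are divisible by no k-th power d^k, d >= 2."""
    free = [True] * (limit + 1)
    d = 2
    while d ** k <= limit:
        for m in range(d ** k, limit + 1, d ** k):
            free[m] = False
        d += 1
    return sum(free[1:]) / limit

print(inv_zeta(2), 6 / math.pi ** 2)   # squarefree integers, and visible points of Z^2
print(inv_zeta(3))                     # cube-free integers
print(kfree_density(2, 100_000))       # empirical density of squarefree integers
```

The squarefree integers (\(n=1\), \(k=2\)) and the visible points of \(\mathbb Z ^2\) (\(n=2\), \(k=1\)) share the value \(1/\zeta (2)=6/\pi ^2\), since only the product \(nk\) enters.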

To bound the measure entropy we need the following lemma, which enables us to obtain good upper bounds for the frequency of “sparse” patches of \(V\), i.e. patches that contain few points in comparison to their size.

Lemma 2

Let \(\mathcal{P }\) and \(\mathcal Q \) be disjoint finite subsets of \(\Lambda \), let \(Q\) be the product of all primes \(p\) with

$$\begin{aligned} |\mathcal Q /p^k\Lambda |<|\mathcal Q | \end{aligned}$$
(16)

and define

$$\begin{aligned} s:=\min _{\varvec{t}\in L(V;\mathcal{P },\mathcal Q )}|(\mathcal Q +\varvec{t})\cap V_Q|. \end{aligned}$$

Then

$$\begin{aligned} \delta (L(V;\mathcal{P },\mathcal Q ))= O\big (4^{(D(\mathcal Q )/\lambda )^{1/k}nk}/|\mathcal Q |^{s-s/nk}\big ), \end{aligned}$$
(17)

where the \(O\)-constant depends only on \(\Lambda \).

Proof

If \(\varvec{t}\in L(V;\mathcal{P },\mathcal Q )\) then for each \(\varvec{q}\in \mathcal Q \cap (V_Q-\varvec{t})\) there is a prime \(p(\varvec{q})\not \mid Q\) with \(\varvec{q}+\varvec{t}\in p(\varvec{q})^k\Lambda \), and by the definition of \(Q\) these primes are distinct. By (12) of Corollary 1 with \(\rho =R+\max _{\varvec{q}\in \mathcal Q }\Vert \varvec{q}\Vert \) and \(P\) the product of the primes less than \(\log \rho \), the number of points \(\varvec{t}\in L(V;\mathcal{P },\mathcal Q )\cap B_R(\varvec{0})\) for which \(p(\varvec{q})\ge \log \rho \) for some \(\varvec{q}\in \mathcal Q \cap (V_Q-\varvec{t})\) is \(o(\rho ^n)\). The remaining \(\varvec{t}\)’s in \( L(V;\mathcal{P },\mathcal Q )\cap B_R(\varvec{0})\) have \(p(\varvec{q})<\log \rho \) for each \(\varvec{q}\in \mathcal Q \cap (V_Q-\varvec{t})\). For the number of such \(\varvec{t}\) with a given set \(\mathcal Q \cap (V_Q-\varvec{t})=\{\varvec{q}_1,\ldots ,\varvec{q}_t\}\) and a given ordered set of primes \(\{p(\varvec{q}_1),\ldots ,p(\varvec{q}_t)\}\), (38) with \(\Lambda \) replaced by \((p(\varvec{q}_1)\cdots p(\varvec{q}_t))^k\Lambda \), gives the estimate

$$\begin{aligned} \le \frac{CR^n}{(p(\varvec{q}_1)\cdots p(\varvec{q}_t))^{nk}} \end{aligned}$$

when \(R\) is large enough to ensure that \(R^n>(\log \rho )^{nk|\mathcal Q |}\), where the constant \(C\) depends only on \(\Lambda \). The sum of this over all sets of \(t\) primes not dividing \(Q\) is majorized by

$$\begin{aligned} CR^n\big (\sum _{p\not \mid Q}\frac{1}{p^{nk}}\big )^t <CR^n\big (\sum _{m\ge |\mathcal{Q }|^{1/nk}}\frac{1}{m^{nk}}\big )^t <\frac{CR^n}{|\mathcal{Q }|^{s-s/nk}}, \end{aligned}$$

since \(t\ge s\) and the least prime not dividing \(Q\) is \(\ge |\mathcal Q |^{1/nk}\), by (16). There are at most \(Q^{nk}\) possibilities for \(\mathcal Q \cap (V_Q-\varvec{t})\), since \(Q^k\Lambda \) is the lattice of periods of \(V_Q\), so

$$\begin{aligned} |L(V;\mathcal{P },\mathcal Q )\cap B_R(\varvec{0})|&< \frac{Q^{nk}CR^n}{|\mathcal Q |^{s-s/nk}}+o(R^n)\\&< \frac{4^{(D(\mathcal Q )/\lambda )^{1/k}nk}CR^n}{|\mathcal Q |^{s-s/nk}}+o(R^n) \end{aligned}$$

for large \(R\), where the second inequality results from (34) and the fact that \(p^k\le D(\mathcal Q )/\lambda \) for every prime factor \(p\) of \(Q\). The result follows on dividing by \(R^n\) and letting \(R\) tend to infinity (the existence of the limit on the left being guaranteed by Theorem 1). \(\square \)
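The estimate quoted as (34) is not reproduced in this section. The way it is used here (and in the proof of Theorem 3) is consistent with the classical bound \(\prod _{p\le x}p<4^x\), and the following sketch checks that bound numerically under this assumption. The function name and the range tested are illustrative.

```python
def primorial_bound_holds(limit):
    """Check prod_{p <= x} p < 4^x for every integer x with 1 <= x <= limit."""
    primorial = 1
    power_of_four = 1
    for x in range(1, limit + 1):
        power_of_four *= 4
        # Trial division is enough for the small range tested here.
        if x >= 2 and all(x % q for q in range(2, int(x ** 0.5) + 1)):
            primorial *= x
        if primorial >= power_of_four:
            return False
    return True

print(primorial_bound_holds(3000))
```

With \(x=(D(\mathcal Q )/\lambda )^{1/k}\) this gives \(Q^{nk}<4^{(D(\mathcal Q )/\lambda )^{1/k}nk}\), which is the factor appearing in (17).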

Theorem 4

\(h_\mathrm{meas}(V)=0\).

Proof

Given \(\rho >0\) and a \(\rho \)-patch \(\mathcal{P }\) of \(V\), let \(\mathcal Q :=(\Lambda \cap B_\rho (\varvec{0}))\setminus \mathcal{P }\) and, as in Lemma 2, define \(Q=Q(\mathcal{P })\) to be the product of all primes \(p\) with \(|\mathcal Q /p^k\Lambda |<|\mathcal Q |\) and

$$\begin{aligned} s=s(\mathcal{P }):=\min _{\varvec{t}\in L(V;\mathcal{P },\mathcal Q )}|(\mathcal Q +\varvec{t})\cap V_Q|. \end{aligned}$$

By (13) of Corollary 1 and the fact that \(V\subset V_P\) with \(P\) the product of primes less than \(\log \rho \), we have

$$\begin{aligned} |\mathcal Q |> \big (1-\frac{1}{\zeta (nk)}\big )v_n \rho ^n-o(\rho ^n)>\frac{v_n\rho ^n}{2^{nk}} \end{aligned}$$
(18)

for large enough \(\rho \).

Now put \(S=S(\rho ):=\rho ^n/\sqrt{\log _2\rho }\). We shall calculate separately the contributions to the measure entropy \(h_\mathrm{meas}\) of the \(\rho \)-patches \(\mathcal{P }\) of \(V\) with \(s(\mathcal{P })\ge S\) and those with \(s(\mathcal{P })< S\). The former patches have small frequency and the latter are few in number.

For the \(\rho \)-patches with \(s(\mathcal{P })\ge S\), Lemma 2 and (18) give

$$\begin{aligned} -\log _2\nu (\mathcal{P })>\frac{n}{2}S\log _2\rho -O\big (D(\mathcal Q )^{1/k}\big ) =\frac{n}{2}S\log _2\rho -O\big (\rho ^{1/k}\big )> \frac{S\log _2\rho }{3} \end{aligned}$$

for large enough \(\rho \) which, since \(-\log _2\nu \) is decreasing but \(-\nu \log _2\nu \) is increasing for \(\nu \in (0,1/e]\), gives the estimate

$$\begin{aligned} -\nu (\mathcal{P })\log _2\nu (\mathcal{P })=O(2^{-S \log _2\rho /3}S\log _2\rho ). \end{aligned}$$

Since there are at most \(2^{|\Lambda \cap B_\rho (\varvec{0})|}\) \(\rho \)-patches in all, the contribution of the \(\rho \)-patches \(\mathcal{P }\) with \(s(\mathcal{P })\ge S\) to the sum on the right of (4) is

$$\begin{aligned} O\big (2^{|\Lambda \cap B_\rho (\varvec{0})|-(S\log _2\rho )/3}S\log _2\rho \big )= O\big (2^{-\frac{\rho ^n}{4}\sqrt{\log _2\rho }} \rho ^n\sqrt{\log _2\rho }\big ) =o(1). \end{aligned}$$
(19)

Turning to the \(\rho \)-patches with \(s(\mathcal{P })<S\), denote this set of patches by \(\mathcal B \subset \mathcal{A }(\rho )\) and let \(F\) be their combined frequency. The contribution of these patches to the sum on the right of (4) is

$$\begin{aligned} \sum _{\mathcal{P }\in \mathcal B }-\nu (\mathcal{P })\log _2\nu (\mathcal{P }) \end{aligned}$$

which, since \(\nu \log _2\nu \) is a convex function of \(\nu \), does not decrease if we replace the \(\nu (\mathcal{P })\)’s by their average value, \(F/|\mathcal B |\). So this contribution is

$$\begin{aligned} \le F\log _2|\mathcal B |-F\log _2F\le \log _2|\mathcal B |+\frac{\log _2e}{e}. \end{aligned}$$
(20)

To bound \(|\mathcal B |\) we note that if \(\mathcal{P }\in \mathcal B \) then there is a \(\varvec{t}\in L(V;\mathcal{P },\mathcal Q )\) with \(|\mathcal Q \cap (V_Q-\varvec{t})|<S\). Since \(\mathcal{P }\subset V-\varvec{t}\subset V_Q-\varvec{t}\), \(\mathcal Q =(\Lambda \cap B_\rho (\varvec{0}))\setminus \mathcal{P }\), and \(Q^k\Lambda \) is the lattice of periods of \(V_Q\), the sets \(\mathcal{P }\) and \(\mathcal Q \) are completely determined by the subset \(\mathcal Q \cap (V_Q-\varvec{t})\) of \(B_\rho (\varvec{0})\) and by \(\varvec{t}\) modulo \(Q^k\Lambda \). There are \(Q^{nk}\) cosets of \(Q^k\Lambda \) in \(\Lambda \) and the number of subsets of \(\Lambda \cap B_\rho (\varvec{0})\) with fewer than \(S\) members is bounded above by

$$\begin{aligned} \sum _{i=0}^{\lfloor S\rfloor }\big (\begin{array}{c}|\Lambda \cap B_\rho (\varvec{0})|\\ i\end{array}\big ) \le (\lfloor S\rfloor +1)\big (\begin{array}{c}|\Lambda \cap B_\rho (\varvec{0})|\\ \lfloor S\rfloor \end{array}\big ) \le 2S\big (\frac{e|\Lambda \cap B_\rho (\varvec{0})|}{S}\big )^{\!\!S} \end{aligned}$$

for large \(\rho \), by (40). Hence the bound on the right of (20) is majorized by

$$\begin{aligned}&S\log _2(e|\Lambda \cap B_\rho (\varvec{0})|/S)+\log _22eS+nk\log _2Q\nonumber \\&\quad =O\big (\frac{\rho ^n\log _2\log _2\rho }{\sqrt{ \log _2\rho }}\big )+O(\log _2\rho )+O(\rho ^{1/k}) =o(\rho ^n). \end{aligned}$$
(21)

Since the contributions (19) and (21) are both \(o(\rho ^n)\), we obtain \(h_\mathrm{meas}(V)=0\). \(\square \)
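The two entropies can be estimated from data for the squarefree numbers (\(\Lambda =\mathbb Z \), \(k=2\)). The sketch below is an illustration only: it uses windows of consecutive integers in place of \(\rho \)-patches, a finite sample \([1,10^5]\) in place of the limiting frequencies, and window lengths chosen arbitrarily. It prints, per site, the logarithm of the number of distinct windows seen and the Shannon entropy of their empirical distribution.

```python
import math
from collections import Counter

def squarefree_flags(limit):
    """flags[m] = 1 if m is squarefree, 0 otherwise, for 0 <= m <= limit."""
    flags = [1] * (limit + 1)
    flags[0] = 0
    d = 2
    while d * d <= limit:
        for m in range(d * d, limit + 1, d * d):
            flags[m] = 0
        d += 1
    return flags

def window_statistics(flags, length):
    """Number of distinct windows of the given length and their empirical entropy in bits."""
    windows = Counter(tuple(flags[i:i + length])
                      for i in range(1, len(flags) - length + 1))
    total = sum(windows.values())
    entropy = -sum(c / total * math.log2(c / total) for c in windows.values())
    return len(windows), entropy

flags = squarefree_flags(100_000)
for length in (3, 7, 11, 15):
    count, entropy = window_statistics(flags, length)
    print(length, count, round(math.log2(count) / length, 3), round(entropy / length, 3))
```

Theorems 3 and 4 describe the limits of these two per-site quantities (\(6/\pi ^2\) and \(0\), respectively), but the convergence is very slow and a finite sample undercounts rare windows, so the printed values are only indicative.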

Note on patch shapes.   On the general principle of the isotropy of space, we have used spherical patches throughout and measured densities and frequencies through expanding spherical regions; but the results we obtain are independent of the shapes of these patches and regions: all our point-counting estimates stem from (38) which remains valid for an arbitrary expanding region in place of the expanding ball, with main term the volume of the region (using the volume of the fundamental region of the lattice as a unit) and an error term of smaller order provided the boundary of the region has \(n\)-dimensional measure zero. It is not even necessary for the shape of the density-defining regions to be the same as the (also expanding) patch shape.

6 Variational Principle

Endowing the power set \(\{0,1\}^{\Lambda }\) of the lattice \(\Lambda \) with the product topology of the discrete topology on \(\{0,1\}\), it becomes a compact topological space (by Tychonov’s theorem). This topology is in fact generated by the metric \(d\) defined by

$$\begin{aligned} d(X,Y):=\min \big \{1,\inf \{\varepsilon >0\,\mid \, X\cap B_{1/\varepsilon }(\varvec{0})=Y\cap B_{1/\varepsilon }(\varvec{0})\}\big \} \end{aligned}$$

for subsets \(X,Y\) of \(\Lambda \). Then \((\{0,1\}^{\Lambda },\Lambda )\) is a topological dynamical system, i.e. the natural translational action of the group \(\Lambda \) on \(\{0,1\}^{\Lambda }\) is continuous.
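For finite subsets of \(\mathbb Z ^n\) the metric \(d\) has a closed form: if \(r\) is the smallest norm of a point in the symmetric difference of \(X\) and \(Y\), the infimum in the definition equals \(1/r\) (and is \(+\infty \) when \(r=0\)), so \(d(X,Y)=\min \{1,1/r\}\). A minimal sketch, with sets represented as Python sets of integer tuples (an illustrative choice):

```python
import math

def patch_metric(X, Y):
    """d(X, Y) = min{1, 1/r}, r being the least norm of a point in the symmetric difference."""
    difference = set(X) ^ set(Y)
    if not difference:
        return 0.0  # X = Y: every eps > 0 qualifies, so the infimum is 0
    r = min(math.sqrt(sum(c * c for c in pt)) for pt in difference)
    return 1.0 if r <= 1.0 else 1.0 / r

X = {(0, 0), (1, 0), (3, 4)}
Y = {(0, 0), (1, 0)}
print(patch_metric(X, Y))      # the sets first differ at (3, 4), of norm 5
print(patch_metric(X, set()))  # the sets already differ at the origin
```

Two sets are close in this metric precisely when they agree on a large ball around the origin, which is the sense of convergence used for the hull \(\mathbb X (X)\) below.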

Now let \(X\) be a subset of \(\Lambda \). The closure \(\mathbb X (X)\) of the set of lattice translations \(\varvec{t}+X\) (\(\varvec{t}\in \Lambda \)) of \(X\) in \(\{0,1\}^{\Lambda }\) gives rise to the topological dynamical system \((\mathbb X (X),\Lambda )\), i.e. \(\mathbb X (X)\) is a compact topological space on which the action of \(\Lambda \) is continuous; cf. [5] and references therein for details. Denote by \(\mathcal{M }(\mathbb X (X),\Lambda )\) the set of \(\Lambda \)-invariant probability measures on \(\mathbb X (X)\) with respect to the Borel \(\sigma \)-algebra on \(\mathbb X (X)\), i.e. the smallest \(\sigma \)-algebra on \(\mathbb X (X)\) which contains the open subsets of \(\mathbb X (X)\). For a fixed such measure \(\mu \) and a radius \(\rho >0\), let \(h_\rho (\mu )\) be the entropy of \(\mu \) restricted to \(\mathcal A (\rho )\), i.e.

$$\begin{aligned} h_\rho (\mu ):=\sum _{\mathcal{P }\in \mathcal{A }(\rho )}- \mu (C_{\mathcal{P }})\log _2 \mu (C_{\mathcal{P }}), \end{aligned}$$

where \(\mathcal{A }(\rho )\) denotes the set of \(\rho \)-patches of \(X\) and \(C_\mathcal P \) is the set of elements of \(\mathbb X (X)\) whose \(\rho \)-patch at \(\varvec{0}\) is \(\mathcal{P }\), the so-called cylinder set with respect to \(\mathcal{P }\). The metric entropy of \(\mu \) is then given by the limit

$$\begin{aligned} h(\mu ):=\lim _{\rho \rightarrow \infty }\frac{h_\rho (\mu )}{\rho ^nv_n}, \end{aligned}$$

which exists by a subadditivity argument; cf. [6] and also see [10, 13, 25]. As in Sect. 2, replacing the \(\mu (C_\mathcal P )\)’s by their average value, \(1/N(\rho )\), we see that

$$\begin{aligned} h(\mu )\le h_\mathrm{pc}(X)\quad \forall \mu \in \mathcal M (\mathbb X (X),\Lambda ). \end{aligned}$$

Since the topological entropy \(h_\mathrm{top}(\mathbb X (X),\Lambda )\) of \((\mathbb X (X),\Lambda )\) coincides with \(h_\mathrm{pc}(X)\) by [5, Theorem 1 and Remark 2], the variational principle for lattice actions on compact spaces here reads as follows; cf. [6] and [21, Sect. 6], the latter being an extension of the case \(n=1\) from [9, 25]. An elementary proof can be found in [17]. Note that the additional statement follows from the expansiveness of the action of \(\Lambda \) on \(\mathbb X (X)\).

Theorem 5

(Variational principle)

$$\begin{aligned} \sup _{\mu \in \mathcal{M }(\mathbb X (X),\Lambda )} h(\mu )=h_\mathrm{pc}(X). \end{aligned}$$

Moreover, the supremum is achieved at some measure. \(\square \)

In the case of \(V\), \(\mathbb X (V)\) also contains the empty set (cf. Proposition 1) and various other subsets of \(\Lambda \) and thus admits many \(\Lambda \)-invariant probability measures. In fact, we shall now show that \(\mathbb X (V)\) coincides with the set of admissible subsets \(A\) of \(\Lambda \), i.e. subsets \(A\) of \(\Lambda \) having the property that every finite subset \(\mathcal P \) of \(A\) satisfies criterion (i) of Theorem 2; compare [22, Theorem 8(i)]. We denote the set of all admissible subsets of \(\Lambda \) by \(\mathbb A \).

Theorem 6

\(\mathbb X (V)=\mathbb A \).

Proof

Since \(V\in \mathbb A \) (otherwise some point of \(V\) is in \(p^k\Lambda \) for some prime \(p\), a contradiction) and since \(\mathbb A \) is a \(\Lambda \)-invariant and closed subset of \(\{0,1\}^{\Lambda }\), it follows that \(\mathbb A \) contains \(\mathbb X (V)\). For the other inclusion, let \(A\in \mathbb A \). Then, for any \(\rho >0\), Theorem 2 applied to the finite subset \(A_\rho =A\cap B_\rho (\varvec{0})\) of \(A\) implies the existence of a \(t_\rho \in L(V;A_\rho ,\Lambda \cap B_\rho (\varvec{0})\setminus A_\rho )\). The translate \(V-t_\rho \) then agrees with \(A\) on \(B_\rho (\varvec{0})\), so \(d(V-t_\rho ,A)\le 1/\rho \rightarrow 0\) as \(\rho \rightarrow \infty \), and it follows that \(A\in \mathbb X (V)\). \(\square \)

Moreover, one has \(h_\mathrm{pc}(V)=1/\zeta (nk)\) by Theorem 3. Consider the frequency function \(\nu \) from above which gives the frequencies \(\nu (\mathcal{P })\) of occurrence of \(\rho \)-patches \(\mathcal{P }\) of \(V\) in space. The function \(\nu \), regarded as a function on the cylinder sets by setting \(\nu (C_\mathcal{P }):=\nu (\mathcal{P })\), is finitely additive on the cylinder sets with \(\nu (\mathbb X (V))=\sum _{\mathcal{P } \in \mathcal A (\rho )}\nu (C_{{ \mathcal P }})=1\). Since the family of cylinder sets is a (countable) semi-algebra that generates the Borel \(\sigma \)-algebra on \(\mathbb X (V)\), one can use the method from [25, Sect. 0.2] to show that \(\nu \) extends uniquely to a probability measure on \(\mathbb X (V)\). Moreover, this probability measure can be seen to be \(\Lambda \)-invariant. This shows that the measure entropy \(h_\mathrm{meas}(V)\) is indeed a metric entropy of a \(\Lambda \)-invariant probability measure on \(\mathbb X (V)\). Certainly, an explicit characterisation of \(\mathcal{M }(\mathbb X (V),\Lambda )\) together with the corresponding metric entropies (in particular those measures \(\mu \in \mathcal{M }(\mathbb X (V),\Lambda )\) with maximal entropy, i.e. \(h(\mu )=1/\zeta (nk)\)) would be desirable (but not simple).

7 Diffraction Spectrum

In the following, we assume that the reader is acquainted with the mathematics of diffraction as carefully laid out in [4]; see also [2] and references therein for a review. We shall also use the notation and results from that text. In fact, the proofs presented below are straightforward modifications of the corresponding proofs in [4] and are only included for the reader’s convenience. For an alternative derivation of the diffraction spectrum in case of the visible lattice points, see [23, Sect. 5a].

A Dirichlet series we shall encounter below is

$$\begin{aligned} \xi (s):=\sum _{m=1}^{\infty }\frac{\mu (m)\tau (m)}{m^s}= \prod _p\big (1-\frac{2}{p^s}\big )\,, \end{aligned}$$
(22)

which is absolutely convergent for \(\mathfrak R (s)>1\), where \(\tau \) is the ordinary divisor function in (35).
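The identity (22) can be checked numerically. The sketch below (function names and truncation points are illustrative) uses the fact that \(\mu (m)\tau (m)=(-2)^{\omega (m)}\) for squarefree \(m\), where \(\omega (m)\) is the number of prime factors of \(m\), and vanishes otherwise; it compares a partial sum of the Dirichlet series with a truncated Euler product at \(s=2\).

```python
def mu_tau_coefficients(limit):
    """Return (coeff, primes) with coeff[m] = mu(m) * tau(m) for 1 <= m <= limit."""
    coeff = [1] * (limit + 1)
    coeff[0] = 0
    is_prime = [True] * (limit + 1)
    primes = []
    for p in range(2, limit + 1):
        if is_prime[p]:
            primes.append(p)
            for m in range(p, limit + 1, p):
                if m > p:
                    is_prime[m] = False
                coeff[m] *= -2          # one factor (-2) per distinct prime divisor
            for m in range(p * p, limit + 1, p * p):
                coeff[m] = 0            # mu vanishes on non-squarefree m
    return coeff, primes

def xi_series(s, coeff):
    return sum(c / m ** s for m, c in enumerate(coeff) if c)

def xi_product(s, primes):
    value = 1.0
    for p in primes:
        value *= 1.0 - 2.0 / p ** s
    return value

coeff, primes = mu_tau_coefficients(50_000)
print(xi_series(2, coeff), xi_product(2, primes))
```

Both values should be close to \(\xi (2)\approx 0.32263\); the agreement degrades as \(s\) approaches 1, where the series ceases to converge absolutely.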

Theorem 7

The natural autocorrelation of \(V\) exists and is supported on \(\Lambda \), the weight of a point \(\varvec{a}\in \Lambda \) in the autocorrelation of \(V\) being given by

$$\begin{aligned} w(\varvec{a})=\xi (nk)\prod _{p\mid c_k(\varvec{a})}\big (1+\frac{1}{p^{nk}-2}\big )\,, \end{aligned}$$

with error term equal to \(O(R^{-(1-(1/k))^2})\) for \(n=1\) and \(k\ge 2\), \(O(R^{-1/2})\) for \(n=2\) and \(k=1\) and \(O(R^{-1})\) otherwise, where, in any case, the implied constant depends on \(\varvec{a}\) as well as on \(\Lambda \). (For lattices \(\Lambda \) with determinant \(\ne 1\) the weights above must be divided by \(\det (\Lambda )\).)

Proof

Since the cases \(n=1\), \(k\ge 2\) and \(n\ge 2\), \(k=1\) were already treated in [4, Theorems 1, 2 and 4], we may assume that \(n,k\ge 2\). Since \(V-V\subset \Lambda \), the autocorrelation of \(V\) (if it exists) can only be supported on \(\Lambda \). The weight of a point \(\varvec{a}\in \Lambda \) in the autocorrelation of \(V\) is the limit as \(R\rightarrow \infty \) of

$$\begin{aligned} \frac{1}{R^nv_n}\sum _{\varvec{x},\varvec{x}-\varvec{a}\in V\cap B_R(\varvec{0})} 1 \end{aligned}$$
(23)

and, by [4, Lemma 1], the existence of this limit for each \(\varvec{a} \in \Lambda \) is sufficient to ensure the existence of the autocorrelation.

It is convenient to drop the condition \(\varvec{x}-\varvec{a}\in B_R(\varvec{0})\) in (23), which then becomes

$$\begin{aligned} \frac{1}{R^nv_n}\mathop {\mathop {\sum }\limits _{\varvec{x},\varvec{x}-\varvec{a}\in V}} \limits _{\varvec{x}\in B_R(\varvec{0})} 1\,. \end{aligned}$$
(24)

The difference between these sums is \(O(1/R)\) by (38), due to the extra lattice points \(\varvec{x}\) within a constant distance \(\Vert \varvec{a}\Vert \) of the boundary of \(B_R(\varvec{0})\) that are included in the latter. By (31), this can be written as

$$\begin{aligned} \frac{1}{R^nv_n}\sum _{\varvec{x}\in \Lambda \cap B_R(\varvec{0})\setminus \{\varvec{0},\varvec{a}\}}\quad \sum _{l\mid c_k(\varvec{x})}\mu (l)\sum _{m\mid c_k(\varvec{x}-\varvec{a})}\mu (m)\,. \end{aligned}$$

Reversing the order of summation gives

$$\begin{aligned} \frac{1}{R^nv_n} \sum _{1\le l<S^{\frac{1}{k}}}\sum _{1\le m<S^{\frac{1}{k}}}\mu (l)\mu (m)\mathop {\mathop {\mathop {\sum }\limits _{\varvec{x}\in \Lambda \cap B_R(\varvec{0})\setminus \{\varvec{0},\varvec{a}\}}} \limits _{\varvec{x}\in l^k\Lambda }}\limits _{\varvec{x}-\varvec{a}\in m^k\Lambda }1\,, \end{aligned}$$

where \(S:=(R+\Vert \varvec{a}\Vert )/\lambda \). Collecting terms with the same value of \(d=(l,m)\), noting that all \(\varvec{x}\) in the inmost sum belong to \(d^k\Lambda \) and that there is no such \(\varvec{x}\) unless \(\varvec{a}\in d^k\Lambda \), and putting \(l^{\prime }:=l/d\), \(m^{\prime }:=m/d\), \(\varvec{x}^{\prime }:=\varvec{x}/d^k\), \(\varvec{a}^{\prime }:=\varvec{a}/d^k\), we obtain

$$\begin{aligned} \frac{1}{R^nv_n} \sum _{d\mid c_k(\varvec{a})}\sum _{1\le l^{\prime }<S^{\frac{1}{k}}/d}\mathop {\mathop {\sum }\limits _{1\le m^{\prime }<S^{\frac{1}{k}}/d}}\limits _{(l^{\prime },m^{\prime })=1}\mu (l^{\prime }d) \mu (m^{\prime }d)\mathop {\mathop {\mathop {\sum }\limits _{\varvec{x}^{\prime }\in \Lambda \cap B_{R/d^k}(\varvec{0})\setminus \{\varvec{0},\varvec{a}^{\prime }\}}}\limits _{\varvec{x}^{\prime }\in l^{\prime k}\Lambda }}\limits _{\varvec{x}^{\prime }-\varvec{a}^{\prime }\in m^{\prime k}\Lambda }1\,. \end{aligned}$$
(25)

Since \(l^{\prime }\) and \(m^{\prime }\) are bound variables of summation and \(\varvec{x}^{\prime }\) and \(\varvec{a}^{\prime }\) will not be referred to again, we can drop the dashes: from now on \(l\) and \(m\) are the new \(l^{\prime }\) and \(m^{\prime }\) but \(\varvec{a}\) is the original \(\varvec{a}\).

By (38) with \(\Lambda \) replaced by \((lm)^k\Lambda \), the inmost sum is

$$\begin{aligned} v_n\big (\frac{R}{(dlm)^k}\big )^n+O\big (\big (\frac{R}{(dlm)^k}\big )^{n-1}\big )+O(1)\,. \end{aligned}$$

These three terms give a main term and two error terms in (25).

The first error term is majorized by

$$\begin{aligned} O\big (\frac{1}{R}\sum _{1\le l<S^{\frac{1}{k}}}\frac{1}{l^{k (n-1)}}\sum _{1\le m<S^{\frac{1}{k}}}\frac{1}{m^{k(n-1)}}\big )=O(1/R) \end{aligned}$$

since the sums are convergent due to \(k(n-1)\ge 2\).

The second error term is majorized by

$$\begin{aligned} O\big (\frac{S^{2/k}}{R^n}\big )=O\big (\frac{1}{R^{n-2/k}} \big )=O(1/R) \end{aligned}$$

since \(S=O(R)\) and \(k(n-1)\ge 2\). So both error terms are \(O(1/R)\) and thus tend to \(0\) as \(R\rightarrow \infty \).

The main term is

$$\begin{aligned}&\sum _{d\mid c_k(\varvec{a})}\sum _{1\le l<S^{\frac{1}{k}}/d}\mathop {\mathop {\sum }\limits _{1\le m<S^{\frac{1}{k}}/d}}\limits _{(l,m)=1}\frac{\mu (ld)\mu (md)}{(dlm)^{nk}}\\&\quad =\sum _{d\mid c_k(\varvec{a})}\mathop {\mathop {\sum }\limits _{1\le l<S^{\frac{1}{k}}/d}}\limits _{(l,d)=1} \mathop {\mathop {\mathop {\sum }\limits _{1\le m<S^{\frac{1}{k}}/d}}\limits _{(m,d)=1}}\limits _{(l,m)=1}\frac{\mu (ld) \mu (md)}{(dlm)^{nk}}\\&\quad = \sum _{d\mid c_k(\varvec{a})}\frac{\mu ^2(d)}{d^{nk}}\mathop {\mathop {\sum }\limits _{1\le l<S^{\frac{1}{k}}/d}}\limits _{(l,d)=1}\mathop {\mathop {\sum }\limits _{1\le m<S^{\frac{1}{k}}/d}}\limits _{(m,d)=1}\frac{\mu (lm)}{(lm)^{nk}} \end{aligned}$$

since \(\mu \) is multiplicative and \(\mu (lm)=0\) when \((l,m)\ne 1\). Since the last double sum is absolutely convergent, this converges to

$$\begin{aligned} \sum _{d\mid c_k(\varvec{a})}\frac{\mu ^2(d)}{d^{nk}} \mathop {\mathop {\sum }\limits _{r=1}}\limits _{(r,d)=1}^{\infty } \frac{\mu (r)\tau (r)}{r^{nk}} \end{aligned}$$
(26)

as \(R\rightarrow \infty \). The difference between this limit and the partial sum above is \(O(1/R^{n-1/k})\), since the omitted terms have \(l\) or \(m\) at least \(S^{1/k}/d\); as \(n-1/k\ge 1\), it falls within the error estimate \(O(1/R)\).

Using (22), the expression for the limit (26) can be rearranged as

$$\begin{aligned}&\mathop {\mathop {\sum }\limits _{d\mid c_k(\varvec{a})}} \limits _{d \text{ squarefree }} \frac{1}{d^{nk}} \prod _{p\not \mid d}\big (1-\frac{2}{p^{nk}}\big )\\&\quad =\xi (nk)\mathop {\mathop {\sum }\limits _{d\mid c_k(\varvec{a})}} \limits _{d \text{ squarefree }} \frac{1}{d^{nk}}\prod _{p\mid d}\big (1-\frac{2}{p^{nk}}\big )^{-1} \\&\quad =\xi (nk)\prod _{p\mid c_k(\varvec{a})}\big (1+\frac{1}{p^{nk}}\big (1-\frac{2}{p^{nk}} \big )^{-1}\big )\\&\quad =\xi (nk)\prod _{p\mid c_k(\varvec{a})} \big (1+\frac{1}{p^{nk}-2}\big )\,. \end{aligned}$$

This completes the proof. \(\square \)
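The weight formula can be compared with data in the simplest case \(\Lambda =\mathbb Z \), \(k=2\) (squarefree integers, the case treated in [4]), where \(p\mid c_k(a)\) means \(p^2\mid a\). The sketch below is illustrative: the function names, the truncation of the Euler product and the sample size are assumptions, and positive integers are sampled in place of the symmetric interval in (23), which gives the same limit by symmetry.

```python
def xi(s, prime_bound=10_000):
    """Truncated Euler product prod_p (1 - 2/p^s)."""
    sieve = [True] * (prime_bound + 1)
    value = 1.0
    for p in range(2, prime_bound + 1):
        if sieve[p]:
            value *= 1.0 - 2.0 / p ** s
            for q in range(p * p, prime_bound + 1, p):
                sieve[q] = False
    return value

def weight(a):
    """w(a) = xi(2) * prod over primes p with p^2 | a of (1 + 1/(p^2 - 2)), for a >= 1."""
    value = xi(2)
    p = 2
    while p * p <= a:
        if a % (p * p) == 0 and all(p % q for q in range(2, p)):
            value *= 1.0 + 1.0 / (p * p - 2)
        p += 1
    return value

def empirical_weight(a, limit):
    """Fraction of x in 1..limit with both x and x + a squarefree."""
    flags = [True] * (limit + a + 1)
    d = 2
    while d * d <= limit + a:
        for m in range(d * d, limit + a + 1, d * d):
            flags[m] = False
        d += 1
    return sum(1 for x in range(1, limit + 1) if flags[x] and flags[x + a]) / limit

for a in (1, 2, 4, 9, 36):
    print(a, round(weight(a), 4), round(empirical_weight(a, 200_000), 4))
```

Shifts divisible by 4 or 9 show visibly larger weights, reflecting the factor \(1+1/(p^{nk}-2)\) for each prime \(p\) with \(\varvec{a}\in p^k\Lambda \).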

Corollary 2

\(V-V=\Lambda \).

Proof

Trivially, one has \(V-V\subset \Lambda \). From Theorem 7, one gets \(w(\varvec{a})>0\) for all \(\varvec{a}\in \Lambda \) and thus also \(\Lambda \subset V-V\) by [4, Lemma 1]. \(\square \)

The dual or reciprocal lattice \(\Lambda ^*\) of \(\Lambda \) is

$$\begin{aligned} \Lambda ^*:=\{\varvec{y} \in \mathbb R ^n\mid \varvec{y}\cdot \varvec{x}\in \mathbb Z \text{ for } \text{ all } \varvec{x}\in \Lambda \} \end{aligned}$$

By definition, the denominator \(q\) of a point \(\varvec{p}\in \mathbb Q \Lambda ^*\) is the smallest number \(a\in \mathbb N \) with \(a\varvec{p}\in \Lambda ^*\). This is also the greatest common divisor of the numbers \(a\in \mathbb N \) with \(a\varvec{p}\in \Lambda ^*\), i.e. \(a\varvec{p}\in \Lambda ^*\) if and only if \(q\mid a\).

Theorem 8

The diffraction measure \(\widehat{\gamma }\) of the autocorrelation \(\gamma \) of \(V\) exists and is a positive, pure-point, translation-bounded measure which is concentrated on the set of points in \(\mathbb Q \Lambda ^*\) with \((k+1)\)-free denominator and whose intensity at a point with such a denominator \(q\) is given by

$$\begin{aligned} \frac{1}{\zeta ^2(nk)}\prod _{p\mid q}\frac{1}{(p^{nk}-1)^2}\,. \end{aligned}$$
(27)

This measure can also be interpreted as

$$\begin{aligned} \widehat{\gamma }=\xi (nk)\mathop {\mathop {\sum }\limits _{d=1}} \limits _{d \text{ squarefree }}^{\infty }\big (\prod _{p\mid d}\frac{1}{p^{2nk}-2p^{nk}}\big )\omega _{\Lambda ^*/d^k}\,, \end{aligned}$$
(28)

a weak*-convergent sum (in fact, even a \(\Vert \cdot \Vert _{\mathrm{loc }}\)-convergent sum) of Dirac combs. (For lattices \(\Lambda \) with determinant \(\ne 1\) the above formulas must be divided by the square of \(\det (\Lambda )\).)

Proof

Let \(\gamma \) be the autocorrelation of \(V\). As shown in the proof of Theorem 7, one has

$$\begin{aligned} w(\varvec{a})&= \xi (nk)\mathop {\mathop {\mathop {\sum }_{d=1}} \limits _{d \text{ squarefree }}} \limits _{\varvec{a} \in d^k\Lambda }^{\infty } \frac{1}{d^{nk}}\prod _{p\mid d}\big (1-\frac{2}{p^{nk}}\big )^{-1}\,. \end{aligned}$$

So by Theorem 7 and [4, Lemma 1] one obtains

$$\begin{aligned} \gamma&= \xi (nk)\mathop {\mathop {\sum }_{d=1}} \limits _{d \text{ squarefree }}^{\infty } \frac{1}{d^{nk}}\prod _{p\mid d}\big (1-\frac{2}{p^{nk}}\big )^{-1}\omega _{d^k\Lambda }\,. \end{aligned}$$

Since \(\Vert \omega _{d^k\Lambda }\Vert _{\mathrm{loc }}=O(1)\) and the coefficient of \(\omega _{d^k\Lambda }\) is \(O(1/d^{nk})\), this sum of tempered distributions is convergent in the weak*-topology by [4, Lemma 2]. By the Poisson summation formula for lattice Dirac combs [4, Eq. 31], its term-by-term Fourier transform is

$$\begin{aligned} \widehat{\gamma }&= \xi (nk)\mathop {\mathop {\sum }_{d=1}} \limits _{d \text{ squarefree }}^{\infty } \frac{1}{d^{2nk}}\prod _{p\mid d}\big (1-\frac{2}{p^{nk}}\big )^{-1}\omega _{\Lambda ^*/d^k}\\&= \xi (nk)\mathop {\mathop {\sum }_{d=1}} \limits _{d \text{ squarefree }}^{\infty } \big (\prod _{p\mid d}\frac{1}{p^{2nk}-2p^{nk}}\big )\omega _{\Lambda ^*/d^k}\,, \end{aligned}$$

which weak*-converges to the diffraction measure of \(V\), since the Fourier transform operator is weak*-continuous. Since \(\Vert \omega _{\Lambda ^*/d^k}\Vert _{\mathrm{loc }}=O(d^{nk})\) and the coefficient of \(\omega _{\Lambda ^*/d^k}\) is \(O(1/d^{2nk})\), the weak*-sum is a translation-bounded pure-point measure equal to the pointwise sum of its terms by [4, Lemma 2] (see Footnote 2). This establishes the series form (28) for the diffraction spectrum.

The explicit values of the intensities can now be calculated as follows. Let \(\varvec{p}\) be a point in \(\mathbb Q \Lambda ^*\) with denominator \(q\). We can assume that \(q\) is \((k+1)\)-free, since otherwise there is no contribution to (28) at all. The terms in (28) that contribute to the intensity at \(\varvec{p}\) are those with \(d=mq^*\), where \(q^*\) is the squarefree kernel of \(q\) and \(m\in \mathbb N \) is squarefree and coprime to \(q\). Thus the intensity at \(\varvec{p}\) is

$$\begin{aligned}&\quad \xi (nk)\prod _{p\mid q}\frac{1}{p^{2nk}-2p^{nk}}\mathop {\mathop {\mathop {\sum }_{m=1}} \limits _{m \text{ squarefree }}} \limits _{(m,q)=1}^{\infty }\prod _{p\mid m}\frac{1}{p^{2nk}-2p^{nk}}\,. \end{aligned}$$

Using the Euler products in (32) and (22) this simplifies to

$$\begin{aligned}&\xi (nk)\prod _{p\mid q}\frac{1}{p^{2nk}-2p^{nk}}\prod _{p\not \mid q}\big (1+\frac{1}{p^{2nk}-2p^{nk}}\big )\\&\quad =\xi (nk)\prod _{p\mid q}\frac{1}{p^{2nk}}\big (1-\frac{2}{p^{nk}}\big )^{-1}\prod _{p\not \mid q}\big (1-\frac{1}{p^{nk}}\big )^2 \big (1-\frac{2}{p^{nk}}\big )^{-1}\\&\quad =\frac{1}{\zeta ^2(nk)}\prod _{p\mid q} \frac{1}{p^{2nk}}\big (1-\frac{1}{p^{nk}}\big )^{-2}\,, \end{aligned}$$

which agrees with (27). \(\square \)
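As a consistency check on (27), one can sum the intensities over one period cell of \(\Lambda ^*\). From (28), the comb \(\omega _{\Lambda ^*/d^k}\) contributes \(d^{nk}\) points per cell, so the total is \(\xi (nk)\prod _p\big (1+1/(p^{nk}-2)\big )=1/\zeta (nk)\), the density of \(V\). The sketch below tests this for \(\Lambda =\mathbb Z \), \(k=2\), using the standard fact that \([0,1)\) contains \(\varphi (q)\) rationals with denominator exactly \(q\). The function name and the cut-off are illustrative.

```python
import math

def total_intensity(q_max):
    """Sum of phi(q) * intensity(q) over cube-free denominators q <= q_max (n = 1, k = 2)."""
    # Smallest-prime-factor sieve for fast factorisation.
    spf = list(range(q_max + 1))
    for p in range(2, int(q_max ** 0.5) + 1):
        if spf[p] == p:
            for m in range(p * p, q_max + 1, p):
                if spf[m] == m:
                    spf[m] = p
    total = 0.0
    for q in range(1, q_max + 1):
        m, phi, factor, cubefree = q, 1, 1.0, True
        while m > 1:
            p, e = spf[m], 0
            while m % p == 0:
                m //= p
                e += 1
            if e > 2:
                cubefree = False
                break
            phi *= (p - 1) * p ** (e - 1)
            factor /= (p * p - 1) ** 2      # the factor 1/(p^{nk} - 1)^2 from (27)
        if cubefree:
            total += phi * factor
    return total * (6 / math.pi ** 2) ** 2  # prefactor 1/zeta(2)^2

print(total_intensity(20_000), 6 / math.pi ** 2)
```

The partial sums increase to \(1/\zeta (2)\) from below; the slowest-converging part comes from denominators divisible by the square of a large prime.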

One explicitly sees that \(\widehat{\gamma }\) above is fully translation invariant, with lattice of periods \(\Lambda ^*\), in accordance with Theorem 1 of [1]. Moreover, since the action of the group of automorphisms of \(\Lambda ^*\), \(\mathrm{Aut }(\Lambda ^*)\simeq \mathrm{GL}(n,\mathbb Z )\), on \(\mathbb Q \Lambda ^*\) preserves the denominator, \(\widehat{\gamma }\) is \((\Lambda ^*\rtimes \mathrm{Aut }(\Lambda ^*))\)-symmetric. In particular, both \(\widehat{\gamma }\) and the set \(V\) itself are \(\mathrm{GL}(n,\mathbb Z )\)-symmetric.

8 Improving the Error Terms

What has kept the error term large in the argument as we have presented it so far is the last term of (6), with \(|\mathcal{P }|\) in the exponent of \(S\) (in the second component of the minimum). This arose from the \(O(1)\) error term in (9), when (9) was substituted for the inner sum in (8). The \(O(1)\) error term was not even a boundary effect: it was caused solely by lattices whose determinants are much larger than the volume of the region in which points are being counted. The result of this was to put the burden of keeping the last term of (6) small onto \(P\) (which occurs in the first component of the minimum), causing an increase in the error due to the tail of the \(\zeta \)-function product. Mirsky’s idea in [16] and [15] was to show that the terms with some \(d_i\) large contribute a negligible amount to (8) and can be discarded before the substitution of (9) is made. The remaining terms have the individual \(d_i\)’s so well bounded that the second component of the minimum can take over the role of providing a respectable error term, freeing \(P\) to be assigned a much larger value and thus reducing the size of the tail of the \(\zeta \)-function product.

Let \(r\in \mathbb Z ^+\), let \(\varvec{p}_1,\dots ,\varvec{p}_r\in \Lambda \), and let \(m_1,\dots ,m_r\in \mathbb N \). Define the symbol \(E\big (\begin{array}{l}m_1,\dots ,m_{r}\\ \varvec{p}_1,\dots ,\varvec{p}_{r} \end{array}\big )\) as \(1\) or \(0\) according to the system of congruences in \(\varvec{t} \in \Lambda \),

$$\begin{aligned} \varvec{t}+\varvec{p}_i\in m_i\Lambda \quad (1\le i\le r)\,, \end{aligned}$$
(29)

being solvable or not. Further, for a positive real number \(R\) and a point \(\varvec{x}\in \mathbb R ^n\), let \(T\big (\varvec{x};R;\begin{array}{l}m_1,\dots ,m_{r}\\ \varvec{p}_1,\dots ,\varvec{p}_{r}\end{array}\big )\) denote the number of points \(\varvec{t}\in \Lambda \) such that

$$\begin{aligned}&\varvec{t}\in B_R(\varvec{x})\,,\\&\quad \varvec{t}+\varvec{p}_{i}\in m_{i}\Lambda \quad (1\le i\le r)\,. \end{aligned}$$

We denote by \([m_1,\dots ,m_r]\) the least common multiple of \(m_1,\dots ,m_r\). Further, \((m_i,m_j)\) denotes the greatest common divisor of \(m_i\) and \(m_j\). For brevity, let \(c(\varvec{l})\) denote the \(1\)-content of a nonzero point \(\varvec{l}\in \Lambda \).

Lemma 3

The system (29) of congruences is soluble if and only if

$$\begin{aligned} (m_i,m_j)\mid c(\varvec{p}_i-\varvec{p}_j)\quad (1\le i<j\le r)\,. \end{aligned}$$

In the case of solubility, the solutions form precisely one residue class

$$\begin{aligned} \pmod {[m_1,\dots ,m_r]\Lambda }\,. \end{aligned}$$

Proof

This is an immediate consequence of [15, Lemma 1] applied to each coordinate with respect to a basis of \(\Lambda \). \(\square \)
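For \(\Lambda =\mathbb Z \), where \(c(l)=|l|\), Lemma 3 can be verified exhaustively on small moduli. The sketch below is a brute-force check, not the argument of [15]; the function names and the ranges tested are illustrative, and the case \(p_i=p_j\) is treated as imposing no condition.

```python
from functools import reduce
from itertools import product
from math import gcd

def lcm_all(moduli):
    return reduce(lambda a, b: a * b // gcd(a, b), moduli, 1)

def solvable_by_criterion(moduli, shifts):
    """Lemma 3 criterion: gcd(m_i, m_j) divides p_i - p_j for all i < j."""
    r = len(moduli)
    return all((shifts[i] - shifts[j]) % gcd(moduli[i], moduli[j]) == 0
               for i in range(r) for j in range(i + 1, r))

def solutions_mod_lcm(moduli, shifts):
    """All t in [0, lcm) with t + p_i divisible by m_i for every i."""
    period = lcm_all(moduli)
    return [t for t in range(period)
            if all((t + p) % m == 0 for m, p in zip(moduli, shifts))]

def lemma3_holds(max_modulus=6, max_shift=3, r=3):
    for moduli in product(range(1, max_modulus + 1), repeat=r):
        for shifts in product(range(max_shift + 1), repeat=r):
            found = solutions_mod_lcm(moduli, shifts)
            if solvable_by_criterion(moduli, shifts) != (len(found) == 1):
                return False
            if len(found) > 1:
                return False
    return True

print(lemma3_holds())
print(solutions_mod_lcm((4, 6), (1, 3)))  # t = -1 mod 4 and t = -3 mod 6
```

The general lattice case reduces to this one coordinate by coordinate, as in the proof above.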

If \(m_1,\dots ,m_r\) are pairwise coprime, the last result boils down to the Chinese Remainder Theorem (39). In fact, only this special case will be needed in Theorem 9 below. However, since the subsequent lemmas may be of independent interest, we prefer to stick to the general case.

Lemma 4

$$\begin{aligned} T\big (\varvec{x};R;\begin{array}{l} m_1,\dots ,m_{r}\\ \varvec{p}_1,\dots ,\varvec{p}_{r}\end{array}\big )=R^nv_n\,\frac{E\big ( \begin{array}{l} m_1,\dots ,m_{r}\\ \varvec{p}_1,\dots ,\varvec{p}_{r}\end{array}\big )}{[m_1,\dots ,m_{r}]^n}\!+\! O\big (\frac{R^{n-1}}{[m_1,\dots ,m_r]^{n-1}}\big )\!+\!O(1), \end{aligned}$$

where the implied \(O\)-constants depend only on \(\Lambda \).

Proof

This is an immediate consequence of Lemma 3 together with (38) applied to the lattice \([m_1,\dots ,m_r]\Lambda \). \(\square \)

Lemma 5

$$\begin{aligned} \frac{E\big (\begin{array}{l}m_1,\dots ,m_{r}\\ \varvec{p}_1,\dots ,\varvec{p}_{r}\end{array}\big )}{[m_1,\dots ,m_r]}\le \frac{K}{m_1\cdots m_r}\,, \end{aligned}$$

where \(K\) depends only on \(r,\varvec{p}_1,\dots ,\varvec{p}_{r}\).

Proof

This follows along the same lines as Lemma 3 of [15] by employing Lemma 3 instead of [15, Lemma 1]. \(\square \)

For points \(\varvec{p}_1,\dots ,\varvec{p}_r\in \Lambda \) and positive real numbers \(R\) and \(\alpha \), denote by \(L(\varvec{x};R;\varvec{p}_1,\dots ,\varvec{p}_r;\alpha )\) the number of systems \((\varvec{t},a_1,\dots ,a_r)\) of lattice points \(\varvec{t}\in \Lambda \) and numbers \(a_1,\dots ,a_r\in \mathbb N \) such that

$$\begin{aligned}&\varvec{t}\in B_R(\varvec{x})\,,\\&\quad \varvec{t}+\varvec{p}_{i}\in a_{i}^{k}\Lambda \quad (1\le i\le r)\,,\\&\quad a_1\cdots a_r>R^{\alpha }\,. \end{aligned}$$

Lemma 6

$$\begin{aligned} L(\varvec{x};R;\varvec{p}_1,\dots ,\varvec{p}_r;\alpha )=O\big (R^{n-\alpha (nk-1)+\varepsilon } \big )+O\big (R^{n-1+\frac{2}{nk+1}+\varepsilon }\big )\,, \end{aligned}$$

where the implied \(O\)-constants depend only on \(\Lambda ,k,r,\varvec{p}_1,\dots ,\varvec{p}_{r}\).

Proof

The proof is by induction on \(r\). For \(r=1\), we can apply Lemma 4 to obtain

$$\begin{aligned} L(\varvec{x};R;\varvec{p}_1;\alpha )&= \mathop {\mathop {\mathop {\sum }_{\varvec{t}\in B_R(\varvec{x})}} \limits _{\varvec{t}+\varvec{p}_1\in a_1^k \Lambda }} \limits _{a_1>R^{\alpha }} 1\,\,=\,\,\mathop {\mathop {\sum }_{a_1>R^{\alpha }}} \limits _{a_1< ((R+\Vert \varvec{p}_1\Vert )/\lambda )^{1/k}}T\big (\varvec{x};R;\begin{array}{l}a_1^k\\ \varvec{p}_1 \end{array}\big )\\&= \mathop {\mathop {\sum }_{a_1>R^{\alpha }}} \limits _{a_1< ((R+\Vert \varvec{p}_1\Vert )/\lambda )^{1/k}}\big ( \frac{R^nv_n}{a_1^{nk}}+O \big (\frac{R^{n-1}}{a_1^{(n-1)k}}\big )+O(1)\big )\\&= O\big (R^{n-\alpha (nk-1)}\big )+O\big (R^{n-1}\log R\big )+O(R^{1/k})\,, \end{aligned}$$

where there is no middle term when \(n=1\) and the logarithm in the middle term is only needed in the case \(n=2\), \(k=1\), when the corresponding harmonic series \(\sum 1/a_1^{(n-1)k}\) diverges. In all other cases, these series converge, and the middle term can be taken as \(O(R^{n-1})\). Thus the lemma holds for \(r=1\). Assume now that the assertion holds for some \(r\ge 1\). Let \(\beta \) be a positive real parameter to be fixed later. Writing \(a=a_1\cdots a_{r+1}\), for symmetry reasons one has

$$\begin{aligned}&L(\varvec{x};R;\varvec{p}_1,\dots ,\varvec{p}_{r+1};\alpha )=O\Big (\sum _{\substack{\varvec{t}\in B_R(\varvec{x})\\ \varvec{t}+\varvec{p}_1\in a_1^k\Lambda \\ \cdots \\ \varvec{t}+\varvec{p}_{r+1}\in a_{r+1}^k\Lambda \\ a>R^{\alpha }\\ \frac{a}{a_1},\dots ,\frac{a}{a_{r+1}}\le R^{\beta }}} 1\Big )+O\Big (\sum _{\substack{\varvec{t}\in B_R(\varvec{x})\\ \varvec{t}+\varvec{p}_1\in a_1^k\Lambda \\ \cdots \\ \varvec{t}+\varvec{p}_{r+1}\in a_{r+1}^k\Lambda \\ a>R^{\alpha }\\ a_{1}\cdots a_{r}> R^{\beta }}}1\Big )=L_1+L_2\,, \end{aligned}$$

say. Employing Lemmas 4 and 5, one obtains

$$\begin{aligned} L_1&= O\big (\mathop {\mathop {\mathop {\mathop {\mathop {\sum }_{\varvec{t}\in B_R(\varvec{x})}} \limits _{\varvec{t}+\varvec{p}_1\in a_1^k\Lambda }} \limits _{\cdots }} \limits _{\varvec{t}+\varvec{p}_{r+1}\in a_{r+1}^k\Lambda }} \limits _{R^{\alpha }<a\le R^{\beta (r+1)/r}}1\big )=O\big ( \sum _{R^{\alpha }<a\le R^{\beta (r+1)/r}}T\big (\varvec{x};R;\begin{array}{l} a_1^k,\dots ,a_{r+1}^k\\ \varvec{p}_1,\dots ,\varvec{p}_{r+1}\end{array}\big )\big )\\&= O\big (\sum _{R^{\alpha }<a\le R^{\beta (r+1)/r}}\big (R^nv_n\,\frac{E\big (\begin{array}{l}a_1^k, \dots ,a_{r+1}^k\\ \varvec{p}_1,\dots ,\varvec{p}_{r+1}\end{array}\big )}{[a_1^k,\dots ,a_{r+1}^k]^n} +R^{n-1}+1\big )\big )\\&= O\big (R^n\sum _{a>R^{\alpha }}\frac{1}{(a_1\cdots a_{r+1})^ {nk}}\big )+O\big (R^{n-1+\beta (r+1)/r+\varepsilon }\big )\\&= O\big (R^{n-\alpha (nk-1)+\varepsilon }\big )+O\big (R^{n-1+ \beta (r+1)/r+\varepsilon }\big )\,. \end{aligned}$$

With \(\tau \) denoting the ordinary divisor function, one further obtains

$$\begin{aligned} L_2&= O\Big (\sum _{\substack{\varvec{t}\in B_R(\varvec{x})\\ \varvec{t}+\varvec{p}_1\in a_1^k\Lambda \\ \cdots \\ \varvec{t}+\varvec{p}_{r}\in a_{r}^k\Lambda \\ a_1\cdots a_r> R^{\beta }}}\ \sum _{\varvec{t}+\varvec{p}_{r+1}\in a_{r+1}^k\Lambda } 1\Big )\,\,=\,\,O\Big (\sum _{\substack{\varvec{t}\in B_R(\varvec{x})\\ \varvec{t}+\varvec{p}_1\in a_1^k\Lambda \\ \cdots \\ \varvec{t}+\varvec{p}_{r}\in a_{r}^k\Lambda \\ a_1\cdots a_r> R^{\beta }}}\tau \big (\Vert \varvec{t}+\varvec{p}_{r+1}\Vert /\lambda \big )\Big )\\&= O\Big (\sum _{\substack{\varvec{t}\in B_R(\varvec{x})\\ \varvec{t}+\varvec{p}_1\in a_1^k\Lambda \\ \cdots \\ \varvec{t}+\varvec{p}_{r}\in a_{r}^k\Lambda \\ a_1\cdots a_r> R^{\beta }}}R^{\varepsilon }\Big )\,\,=\,\,O\big (R^{\varepsilon }L(\varvec{x};R;\varvec{p}_1,\dots ,\varvec{p}_{r};\beta )\big )\\&= O\big (R^{n-\beta (nk-1)+2\varepsilon }\big )+O\big (R^{n-1+ \frac{2}{nk+1}+2\varepsilon }\big )\,, \end{aligned}$$

by the induction hypothesis. Setting \(\beta :=\frac{r}{rnk+1}\), we obtain

$$\begin{aligned} L(\varvec{x};R;\varvec{p}_1,\dots ,\varvec{p}_{r+1};\alpha )=O\big (R^{n-\alpha (nk-1)+ \varepsilon }\big )+O\big (R^{n-1+\frac{2}{nk+1}+2\varepsilon }\big )\,, \end{aligned}$$

which proves the lemma. \(\square \)
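For the record, the choice of \(\beta \) in the induction step can be verified directly; the following short computation is an added check rather than part of the original argument. It shows that \(\beta =\frac{r}{rnk+1}\) equalises the second exponent in the bound for \(L_1\) with the first exponent in the bound for \(L_2\), and that the common value is dominated by the exponent in the lemma:

$$\begin{aligned} \beta \,\frac{r+1}{r}&=\frac{r+1}{rnk+1}\,,\\ 1-\beta (nk-1)&=\frac{rnk+1-r(nk-1)}{rnk+1}=\frac{r+1}{rnk+1}\,,\\ \frac{2}{nk+1}-\frac{r+1}{rnk+1}&=\frac{(r-1)(nk-1)}{(nk+1)(rnk+1)}\ge 0\quad (r\ge 1)\,. \end{aligned}$$

Hence both \(R^{n-1+\beta (r+1)/r}\) and \(R^{n-\beta (nk-1)}\) equal \(R^{n-1+\frac{r+1}{rnk+1}}\), which is \(O\big (R^{n-1+\frac{2}{nk+1}}\big )\) for \(R\ge 1\).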

We are now in a position to improve the error term of Lemma 1.

Theorem 9

Let \(\mathcal{P }\) be a finite subset of \(\Lambda \), let \(m\in \mathbb N \) and \(\varvec{m}\in \Lambda \), let \(P\) be a natural number coprime to \(m\), and let \(\varvec{x}\in \mathbb R ^n\). Then

$$\begin{aligned} |L(V_P;\mathcal{P },\emptyset )\cap (\varvec{m}+m\Lambda )\cap B_R(\varvec{x})| \end{aligned}$$

is

$$\begin{aligned} \frac{R^nv_n}{m^n}\prod _{p\mid P}\big (1-\frac{|\mathcal P /p^k\Lambda |}{p^{nk}}\big )+O\big (R^{n-1+\frac{2}{nk+1}+\varepsilon }\big )\,, \end{aligned}$$

where the \(O\)-constant depends only on \(\Lambda \), \(k\) and \(\mathcal{P }\).

Proof

We modify the proof of Lemma 1, keeping the notation from that proof. It suffices to show that (8) is

$$\begin{aligned} \frac{R^nv_n}{m^nQ^{nk}}\prod _{p\mid P/Q}\big (1-\frac{r}{p^{nk}}\big )+O\big (R^{n-1+\frac{2}{nk+1}+\varepsilon }\big )\,. \end{aligned}$$

To this end, divide (8) as \(C_1+C_2\), where

$$\begin{aligned} C_1=\begin{array}[t]{c} \displaystyle \sum _{d_1\mid P/Q}\quad \sum _{d_2\mid P/Q}\cdots \sum _{d_r\mid P/Q}\\ d_1\cdots d_r\le R^{\frac{1}{nk}} \end{array} \mu (d_1\cdots d_r) \mathop {\mathop {\mathop {\sum }\limits _{\varvec{t}\in \Lambda \cap B_R(\varvec{x})}} \limits _{\varvec{t}\in \varvec{q}+mQ^k\Lambda }} \limits _{\varvec{t}\in -\varvec{p}_i+d_i^k\Lambda }1, \end{aligned}$$

and \(C_2\) consists of the terms with \(d_1\cdots d_r> R^{\frac{1}{nk}}\). By Lemma 6 with \(\alpha =\frac{1}{nk}\), and since \(\frac{1}{nk}\le \frac{2}{nk+1}\),

$$\begin{aligned} C_2=O\big (R^{n-\frac{nk-1}{nk}+\varepsilon }\big )+O \big (R^{n-1+\frac{2}{nk+1}+\varepsilon }\big )=O \big (R^{n-1+\frac{2}{nk+1}+\varepsilon }\big )\,. \end{aligned}$$

One further obtains

$$\begin{aligned} C_1&= \begin{array}[t]{c} \sum \limits _{d_1\mid P/Q} \ \sum \limits _{d_2\mid P/Q}\cdots \ \sum \limits _{d_r\mid P/Q}\\ d_1\cdots d_r\le R^{\frac{1}{nk}} \end{array} \mu (d_1\cdots d_r) \mathop {\mathop {\mathop {\sum }\limits _{\varvec{t}\in \Lambda \cap B_R(\varvec{x})}} \limits _{\varvec{t}\in \varvec{q}+mQ^k\Lambda }} \limits _{\varvec{t}\in -\varvec{p}_i+d_i^k\Lambda }1\\&= \begin{array}[t]{c} \sum \limits _{d_1\mid P/Q} \ \sum \limits _{d_2\mid P/Q}\cdots \ \sum \limits _{d_r\mid P/Q}\\ d_1\cdots d_r\le R^{\frac{1}{nk}} \end{array} \mu (d_1\cdots d_r)T\big (\varvec{x};R;\begin{array}{l}d_1^k,\dots ,d_{r}^k,mQ^k\\ \varvec{p}_1,\dots ,\varvec{p}_{r},\varvec{q}\end{array}\,\,\big )\,. \end{aligned}$$

Since \(d_1,\dots ,d_r\) are pairwise coprime and since \((mQ,d)=1\), where \(d=d_1\cdots d_r\), Lemma 4 in conjunction with Lemma 3 shows that

$$\begin{aligned} T\big (\varvec{x};R;\begin{array}{l}d_1^k,\dots ,d_{r}^k,mQ^k\\ \varvec{p}_1,\dots ,\varvec{p}_{r},\varvec{q}\end{array}\,\,\big )&= \frac{R^nv_n}{m^n(dQ)^{nk}}+ O\big (\frac{R^{n-1}}{m^{n-1}(dQ)^{(n-1)k}}\big )+O(1)\\&=\frac{R^nv_n}{m^n(dQ)^{nk}}+O(R^{n-1})\,. \end{aligned}$$

Just as in the proof of Lemma 1, substituting this in the above expression for \(C_1\) and removing the condition \(d\le R^{\frac{1}{nk}}\) from the sum over \(R^nv_n/(m^n(dQ)^{nk})\) gives the main term

$$\begin{aligned} \frac{R^nv_n}{m^nQ^{nk}}\prod _{p\mid P/Q}\big (1-\frac{r}{ p^{nk}}\big )\,. \end{aligned}$$

The error from the extra terms included in the extended multiple sum is

$$\begin{aligned} O\big (R^n\sum _{d>R^{\frac{1}{nk}}}\frac{\tau _r(d)}{d^{nk}}\big )=O(R^{n-1+\frac{1}{nk}+\varepsilon })=O \big (R^{n-1+\frac{2}{nk+1}+\varepsilon }\big )\,. \end{aligned}$$

Similarly, the sum over the error term can be seen to be \(O(R^{n-1+\frac{1}{nk}+\varepsilon })\). Altogether, this proves the assertion. \(\square \)

One can now employ Theorem 9 instead of Lemma 1 to see that the error terms in Corollary 1 and Theorem 1 of the form \(O(R^n/(\log R)^{nk-1})\) can indeed be improved to

$$\begin{aligned} O(R^{n-1+\frac{2}{nk+1}+\varepsilon })\,. \end{aligned}$$

More precisely, Theorem 9 allows one to choose \(P\) as large as the product of primes less than \(R^{1/(nk+1)}\) (instead of \(\log R\)) in the modified proofs.
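The main term of Theorem 9 can be checked by direct counting in the simplest setting. The Python sketch below is a hedged illustration under stated assumptions, not the method of the paper: it takes \(\Lambda =\mathbb Z \) (so \(n=1\) and \(v_1=2\)), \(m=1\), reads \(V_P\) as the set of integers divisible by no \(q^k\) with \(q\) a prime divisor of \(P\), and reads \(L(V_P;\mathcal{P },\emptyset )\) as the set of \(t\) with \(t+\mathcal{P }\subset V_P\). The function names `count_patch` and `main_term` are hypothetical.

```python
import math


def count_patch(x, R, primes, patch, k):
    """Count integers t in [x - R, x + R] such that no t + p (p in patch)
    is divisible by q**k for any q in primes."""
    lo, hi = math.ceil(x - R), math.floor(x + R)
    return sum(
        1
        for t in range(lo, hi + 1)
        if all((t + p) % q ** k != 0 for p in patch for q in primes)
    )


def main_term(R, primes, patch, k):
    """2R * prod_q (1 - |patch mod q^k| / q^k): the main term for n = 1, m = 1."""
    density = 1.0
    for q in primes:
        # number of distinct residue classes of the patch modulo q^k
        classes = len({p % q ** k for p in patch})
        density *= 1 - classes / q ** k
    return 2 * R * density


if __name__ == "__main__":
    # patch {0, 1}, k = 2, P = 2 * 3 * 5, interval [0, 899]
    print(count_patch(449.5, 449.5, [2, 3, 5], [0, 1], 2))
    print(main_term(449.5, [2, 3, 5], [0, 1], 2))
```

For a fixed finite \(P\) the admissible \(t\) are periodic modulo \(P^k\), so over a full period the count is given exactly by the Chinese Remainder Theorem; the error term of Theorem 9 only becomes relevant when \(P\) grows with \(R\), as in the modified proofs mentioned above.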