1 Introduction

1.1 Background

A theorem of Glasner from 1979 [11] shows that if \(Y \subset \mathbb {T}= \mathbb {R}/\mathbb {Z}\) is infinite then for each \(\epsilon > 0\) there exists an integer n such that nY is \(\epsilon \)-dense. A more quantitative version was obtained by Berend–Peres [4], which states that there exist constants \(c_1,c_2>0\) such that if \(Y \subset \mathbb {T}\) satisfies \(|Y| > \left( c_1/\epsilon \right) ^{c_2/\epsilon }\) then nY is \(\epsilon \)-dense in \(\mathbb {T}\) for some \(n \in \mathbb {N}\). This was improved significantly in the seminal work of Alon–Peres [1], which provided the optimal lower bound as follows.

Theorem 1.1

(Alon–Peres [1]) For \(\delta >0\) there exists \(\epsilon _{\delta }>0\) such that for all \(0<\epsilon < \epsilon _{\delta }\) and all \(Y \subset \mathbb {T}\) with \(|Y| > \epsilon ^{-2-\delta }\) there exists n such that nY is \(\epsilon \)-dense.

This phenomenon can be extended to other semigroup actions, thus motivating the following definition.

Definition 1.2

Let G be a semigroup acting on a compact metric space X by continuous maps. We say that this action is Glasner if for all infinite \(Y \subset X\) and all \(\epsilon >0\) there exists \(g \in G\) such that gY is \(\epsilon \)-dense. Moreover, we say that it is \(k(\epsilon )\)-uniformly Glasner if for all sufficiently small \(\epsilon >0\) and all \(Y \subset X\) with \(|Y| > k(\epsilon )\) there exists \(g \in G\) such that gY is \(\epsilon \)-dense.
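To make the definition concrete in the original setting of Glasner's theorem (multiplication by n on \(\mathbb {T}\)), the following minimal Python sketch tests \(\epsilon \)-density of a finite subset of the circle and searches for a suitable multiplier. The names `is_eps_dense_circle` and `find_multiplier` and the search bound `n_max` are illustrative assumptions rather than notation from this paper, and exact rational arithmetic is assumed so that boundary cases are decided exactly.

```python
from fractions import Fraction


def is_eps_dense_circle(points, eps):
    """Return True if every point of R/Z is within eps of some given point.

    A finite set is eps-dense in the circle exactly when the largest gap
    between cyclically consecutive points is at most 2 * eps.
    """
    pts = sorted({p % 1 for p in points})
    gaps = [b - a for a, b in zip(pts, pts[1:])]
    gaps.append(1 - pts[-1] + pts[0])  # wrap-around gap
    return max(gaps) <= 2 * eps


def find_multiplier(Y, eps, n_max):
    """Return the least n in 1..n_max with nY eps-dense, or None (hypothetical helper)."""
    for n in range(1, n_max + 1):
        if is_eps_dense_circle([n * y for y in Y], eps):
            return n
    return None


# Ten tightly clustered points: j/1000 for j = 0, ..., 9.
Y = [Fraction(j, 1000) for j in range(10)]
n = find_multiplier(Y, Fraction(1, 20), 200)
```

In this toy example the ten clustered points only spread out enough once n reaches 100, at which point nY is the set of multiples of 1/10.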

For instance, Kelly–Lê [15] used the techniques of Alon–Peres [1] to show that the natural action of the multiplicative semigroup \(M_{d \times d}(\mathbb {Z})\) of \(d \times d\) integer matrices on \(\mathbb {T}^d\) is \(c_d\epsilon ^{-3d^2}\)-uniformly Glasner. This was later improved by Dong in [8] where he showed, using the same techniques of Alon–Peres together with the deep work of Benoist–Quint [3], that the action \({\text {SL}}_d(\mathbb {Z}) \curvearrowright \mathbb {T}^d\) is \(c_{\delta , d} \epsilon ^{-4d - \delta }\)-uniformly Glasner for all \(\delta >0\). Later Dong [9] used a different technique, still based on the work of Benoist–Quint [3], to show that a large class of subgroups of \({\text {SL}}_d(\mathbb {Z})\) have the Glasner property.

Theorem 1.3

(Dong [9]) Let \(d \ge 2\) and let \(G \le {\text {SL}}_d(\mathbb {Z})\) be a subgroup that is Zariski dense in \({\text {SL}}_d(\mathbb {R})\). Then \(G \curvearrowright \mathbb {T}^d\) is Glasner, i.e., if \(Y \subset \mathbb {T}^d\) is infinite and \(\epsilon >0\) then there exists \(g \in G\) such that gY is \(\epsilon \)-dense.

We remark that this result, unlike the aforementioned \(G = {\text {SL}}_d(\mathbb {Z})\) case in [8], does not use the techniques of Alon–Peres and does not establish a uniform Glasner property (Y needs to be infinite). A uniform Glasner property was obtained for the case where G acts irreducibly and is generated by finitely many unipotents in [6].

Theorem 1.4

Let \(d \ge 2 \) and let \(G \le {\text {SL}}_d(\mathbb {Z})\) be a group generated by finitely many unipotent elements such that the representation \(G \curvearrowright \mathbb {R}^d\) is irreducible. Then there exists a constant \(C_G>0\) such that \(G \curvearrowright \mathbb {T}^d\) is \(\epsilon ^{-C_{G}}\)-uniformly Glasner, i.e., if \(Y \subset \mathbb {T}^d\) with \(|Y|>\epsilon ^{-C_G}\) then there exists \(g \in G\) such that gY is \(\epsilon \)-dense.

Examples of such groups include the subgroup of \({\text {SL}}_d(\mathbb {Z})\) preserving a diagonal quadratic form with coefficients \(\pm 1\), not all of the same sign (see [6] or Proposition A.5 in [7] for more details).

1.2 Glasner property for groups with Zariski connected Zariski closures

The first main result of this paper extends Theorem 1.4 by replacing the requirement that G is generated by finitely many unipotent elements by the weaker assumption that G has Zariski connected Zariski closure. It also improves Theorem 1.3 by providing a uniform Glasner property and also not requiring the Zariski closure to be the full \({\text {SL}}_d(\mathbb {R})\).

Theorem A

Let \(G \le {\text {SL}}_d(\mathbb {Z})\) be a finitely generated group with Zariski connected Zariski closure in \({\text {SL}}_d(\mathbb {R})\) such that \(G \curvearrowright \mathbb {R}^d\) is an irreducible representation. Then there exists \(C_{G} >0\) such that \(G \curvearrowright \mathbb {T}^d\) is \(\epsilon ^{-C_{G}}\)-uniformly Glasner; i.e., if \(Y \subset \mathbb {T}^d\) with \(|Y|>\epsilon ^{-C_{G}}\) then there exists \(g \in G\) such that gY is \(\epsilon \)-dense.

1.3 Glasner property for product actions

Let \(G_1 \curvearrowright X_1\) and \(G_2 \curvearrowright X_2\) be two actions on compact metric spaces that have the Glasner property. We consider the product action

$$\begin{aligned} G_1 \times G_2&\curvearrowright X_1 \times X_2 \\ (g_1,g_2) \cdot (x_1, x_2)&= (g_1x_1, g_2x_2). \end{aligned}$$

Clearly, we see that it is not Glasner since a horizontal line \(Y = X_1 \times \{x_2\}\) is infinite and \((g_1,g_2)Y \subset X_1 \times \{g_2 x_2\}\) is another horizontal line, which cannot be \(\epsilon \)-dense for all \(\epsilon >0\). The same obstruction occurs if Y is a finite union of horizontal and vertical lines. It is thus natural to ask whether this is the only obstruction by considering infinite sets \(Y \subset X_1 \times X_2\) such that no two points are on a common vertical or horizontal line.

Question 1.5

(Glasner for product action) Suppose that \(G_1 \curvearrowright X_1\) and \(G_2 \curvearrowright X_2\) are Glasner. Suppose \(Y \subset X_1 \times X_2\) is an infinite set such that both of the projections onto \(X_1\) and \(X_2\) are injective on Y (i.e., if \(y, y' \in Y\) are distinct then \(\pi _{X_1}y \ne \pi _{X_1}y'\) and \(\pi _{X_2}y \ne \pi _{X_2}y'\) where \(\pi _{X_i}:X_1 \times X_2 \rightarrow X_i\) is the projection). Then is it true that for all \(\epsilon >0\) there exists \(g \in G_1 \times G_2\) such that gY is \(\epsilon \)-dense in \(X_1 \times X_2\) ?

We are unable to find any counterexample so far. The main goal of this paper is to answer this question in the affirmative for many of the semigroups of endomorphisms on \(\mathbb {T}^d\) presented above. We first present a special case of one of our main results, which verifies this for the situation of the original Glasner theorem.

Proposition 1.6

Suppose \(Y \subset \mathbb {T}^2\) is infinite and both of the projections onto the \(\mathbb {T}\) factors are injective on Y. Then for all \(\epsilon >0\) there exists \((n,m) \in \mathbb {N}^2\) such that (n, m)Y is \(\epsilon \)-dense in \(\mathbb {T}^2\). In fact, if \(P_1(x),P_2(x) \in \mathbb {Z}[x]\) are polynomials such that no non-trivial linear combination is constant, then for all \(\epsilon >0\) there exists \(n \in \mathbb {N}\) such that \((P_1(n), P_2(n))Y\) is \(\epsilon \)-dense.
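As a finite illustration of the product action \((n,m)(x,y) = (nx, my)\), the sketch below searches for a pair (n, m) that spreads a subset of the diagonal of \(\mathbb {T}^2\) over the whole torus. It uses a sufficient (not necessary) test for \(\epsilon \)-density in the sup metric: if every cell of a grid with side at most \(\epsilon \) contains a point, the set is \(\epsilon \)-dense. The names `covers_all_cells` and `find_pair` and the search bound are assumptions made for this example.

```python
from fractions import Fraction
from itertools import product
from math import ceil, floor


def covers_all_cells(points, eps):
    """Sufficient test for eps-density in T^d with the sup metric.

    Split the torus into K^d half-open cells of side 1/K <= eps; if every
    cell contains a point, every point of the torus is within eps of the set.
    """
    K = ceil(1 / eps)
    d = len(points[0])
    hit = {tuple(floor((t % 1) * K) for t in p) for p in points}
    return len(hit) == K ** d


def find_pair(Y, eps, bound):
    """First (n, m) with 1 <= n, m <= bound passing the cell test, or None."""
    for n, m in product(range(1, bound + 1), repeat=2):
        image = [(n * x, m * y) for x, y in Y]
        if covers_all_cells(image, eps):
            return n, m
    return None


# Sixteen points on the diagonal; both coordinate projections are injective.
Y = [(Fraction(j, 16), Fraction(j, 16)) for j in range(16)]
pair = find_pair(Y, Fraction(1, 4), 4)
```

Here the diagonal points themselves meet only the diagonal cells, while the pair (1, 4) sends them to a set meeting all sixteen cells of the 4 by 4 grid.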

Our next main result demonstrates this phenomenon for the Glasner actions of unipotently generated groups presented in Theorem 1.4.

Theorem B

Let \(G_1 \le {\text {SL}}_{d_1}(\mathbb {Z})\) and \(G_2 \le {\text {SL}}_{d_2}(\mathbb {Z})\) be subgroups generated by finitely many unipotent elements such that \(G_1 \curvearrowright \mathbb {R}^{d_1}\) and \(G_2 \curvearrowright \mathbb {R}^{d_2}\) are irreducible representations, where \(d_1,d_2 \ge 2\) are integers. Then for all \(\epsilon >0\) there exists \(k \in \mathbb {N}\) such that if \(Y \subset \mathbb {T}^{d_1} \times \mathbb {T}^{d_2}\) with \(|Y| \ge k\) satisfies that the projections to \(\mathbb {T}^{d_1}\) and \(\mathbb {T}^{d_2}\) are injective on Y, then there exists \(g \in G_1 \times G_2\) such that gY is \(\epsilon \)-dense in \(\mathbb {T}^{d_1} \times \mathbb {T}^{d_2}\).

In light of Theorem A, it is interesting to ask if the condition that \(G_1,G_2\) are generated by unipotent elements can be replaced with the (weaker) assumption that \(G_1,G_2\) have Zariski connected Zariski closures.

1.4 Non-irreducible actions

In the setting of endomorphisms on \(\mathbb {T}^d\), any product action is another action by endomorphisms. Unfortunately, it is not irreducible, hence Theorems 1.4 and A do not apply. It is thus natural to ask how one can extend these theorems to the non-irreducible case by placing suitable restrictions on the set Y (in a way that is analogous to the setting in Question 1.5). Our next main result achieves this for unipotently generated subgroups.

Theorem C

Let \(G \le {\text {SL}}_d(\mathbb {Z})\) be a group generated by finitely many unipotent elements. Let \({\widetilde{Y}} \subset [0,1)^d\) be infinite such that for all distinct \({\widetilde{y}}, {\widetilde{y}}' \in {\widetilde{Y}}\) we have that \({\widetilde{y}} - {\widetilde{y}}'\) is not contained in any G-invariant proper affine subspace. Then for all \(\epsilon >0\) there exists a constant k such that if \(|{\widetilde{Y}}|>k\) then there exists \(g \in G\) such that gY is \(\epsilon \)-dense in \(\mathbb {T}^d\), where \(Y \subset \mathbb {T}^d\) is the projection of \({\widetilde{Y}}\) onto \(\mathbb {T}^d\).

As before, it is interesting to ask whether this result holds if one replaces the assumption of G being finitely generated by unipotents with the weaker assumption that the Zariski closure of G is Zariski connected.

Proof of Theorem B using Theorem C

Let \(G = G_1 \times G_2\) and let \({\widetilde{Y}} \subset [0,1)^{d_1} \times [0,1)^{d_2}\) be a set of representatives for \(Y \subset \mathbb {T}^{d_1} \times \mathbb {T}^{d_2}\). Let \(a \in ({\widetilde{Y}} - {\widetilde{Y}}) {\setminus } \{0 \}\). Using Theorem C it suffices to show that if \(Ga \subset W+a\) for some subspace \(W \le \mathbb {R}^{d_1} \times \mathbb {R}^{d_2}\) then \(W = \mathbb {R}^{d_1} \times \mathbb {R}^{d_2}\). To see this, write \(a = (a_1,a_2)\) where \(a_i \in \mathbb {R}^{d_i}\) and notice that \(a_1,a_2\) are both non-zero (by assumption). Now for \(g_1,g_1' \in G_1\) we have that

$$\begin{aligned} (g_1' a_1 - g_1 a_1, 0) = (g_1',1)a - (g_1,1)a \in W. \end{aligned}$$

In particular, since \(d_1 \ge 2\) and \(G_1\) acts irreducibly on \(\mathbb {R}^{d_1}\), we may find \(g'_1 \in G_1\) such that \(b_1 = g'_1 a_1 - a_1 \ne 0\) (here we use the assumption that \(a_1 \ne 0\)). Now we have that

$$\begin{aligned} (g_1b_1,0) = (g_1g_1'a_1 - g_1a_1, 0) \in W \quad \text { for all } g_1 \in G_1. \end{aligned}$$

By irreducibility and \(b_1 \ne 0\), this means that for all \(v_1 \in \mathbb {R}^{d_1}\) we have that \((v_1,0) \in W\). Similarly, we may show that \((0,v_2) \in W\) for all \(v_2 \in \mathbb {R}^{d_2}\). Thus \(W = \mathbb {R}^{d_1} \times \mathbb {R}^{d_2}\). \(\square \)

1.5 Glasner property along polynomial sequences

Our technique for proving Theorem C extends the polynomial method used in [6]. Throughout this paper, we let \(\pi _{\mathbb {T}^d}:\mathbb {R}^d \rightarrow \mathbb {T}^d\) denote the quotient map.

Theorem D

Fix \(\epsilon >0\), a positive integer d and let \(A(x) \in M_{d \times d}(\mathbb {Z}[x])\) be a matrix with integer polynomial entries. Then there exists a constant \(k = k(\epsilon , A(x), d)>0\) such that the following is true: Suppose \({\widetilde{Y}} \subset [0,1)^d\) with \(|{\widetilde{Y}}| \ge k\) satisfies the following condition:

$$\begin{aligned} \text {For all } v \in \mathbb {Z}^d\setminus \{ 0 \} \text { and distinct } {\widetilde{y}}, {\widetilde{y}}' \in {\widetilde{Y}} \text { we have that } v \cdot (A(x) - A(0))({\widetilde{y}}-{\widetilde{y}}') \ne 0 \in \mathbb {R}[x]. \end{aligned}$$
(1)

Then, letting \(Y = \pi _{\mathbb {T}^d}({\widetilde{Y}})\), there exists \(n \in \mathbb {Z}\) such that A(n)Y is \(\epsilon \)-dense in \(\mathbb {T}^d\).

Example 1.7

(Proof of Proposition 1.6) Let \(P_1(x), P_2(x) \in \mathbb {Z}[x]\) be polynomials such that no non-trivial linear combination of them is constant and let

$$\begin{aligned}A(x) = \begin{bmatrix} P_1(x) &{} 0 \\ 0 &{} P_2(x) \\ \end{bmatrix} \end{aligned}$$

Now suppose \({\widetilde{Y}} \subset [0,1)^2\) is such that any two distinct \({\widetilde{y}}, {\widetilde{y}}'\) are not on a common vertical or horizontal line. This means that \((a_1, a_2):= {\widetilde{y}} - {\widetilde{y}}'\) satisfies that \(a_1, a_2 \ne 0\). Now the expression (1) in Theorem D is

$$\begin{aligned} a_1v_1P_1(x) + a_2v_2 P_2(x) - a_1v_1P_1(0) - a_2v_2P_2(0) \end{aligned}$$

where \(v = (v_1,v_2) \in \mathbb {Z}^2 \setminus \{(0,0)\}\). Since \((a_1v_1, a_2v_2) \ne (0,0)\), the linear combination \(a_1v_1P_1(x) + a_2v_2 P_2(x)\) is a non-constant polynomial, so this expression is non-zero and Theorem D applies.

We remark that the \(d=1\) case recovers the result of Berend–Peres [4] (that was later improved quantitatively by Alon–Peres [1]) on the Glasner property along polynomial sequences. More precisely, it states that if \(P(x) \in \mathbb {Z}[x]\) is non-constant then for all \(\epsilon >0\) there is a constant \(k = k(\epsilon , P(x))\) such that for subsets \(Y \subset \mathbb {T}\) with \(|Y|>k\) we have that P(n)Y is \(\epsilon \)-dense for some \(n \in \mathbb {Z}\).

Example 1.8

(Diagonal action) Consider now the diagonal action \(\mathbb {N}\curvearrowright \mathbb {T}^2\) given by \(n(x,y) = (nx,ny)\). Clearly this is not Glasner since the diagonal (or any infinite non-dense subgroup of \(\mathbb {T}^2\)) is an infinite invariant set and hence never becomes \(\epsilon \)-dense for small enough \(\epsilon >0\). However, we may still apply Theorem D to obtain natural assumptions on the set \(Y \subset \mathbb {T}^2\) so that Y has \(\epsilon \)-dense images under the diagonal action. First, we let

$$\begin{aligned} A(x) = \begin{bmatrix} x &{} 0 \\ 0 &{} x \\ \end{bmatrix} \end{aligned}$$

The condition says that for any two distinct \({\widetilde{y}}, {\widetilde{y}}' \in {\widetilde{Y}}\), by setting \((a_1, a_2):= {\widetilde{y}} - {\widetilde{y}}'\) we must have

$$\begin{aligned} (a_1v_1 + a_2v_2) x \ne 0 \quad \text {for all } (v_1,v_2) \in \mathbb {Z}^2 \setminus \{ (0,0) \}. \end{aligned}$$

This is equivalent to the statement that \(a_1,a_2 \in \mathbb {R}\) are linearly independent over \(\mathbb {Z}\).

2 Tools

We now gather some useful tools that have mostly been used in previous works [6, 8, 15] that are multidimensional generalizations of the techniques originally introduced by Alon–Peres. We restate them for the convenience of the reader, although one slightly new variation will be needed (see Lemma 2.3) mainly for the purposes of proving Theorem A.

We start with a bound based on [1] that has been extended by the aforementioned works. The following formulation can be found exactly in [6] ([8] only demonstrates and uses the \(r \ge 1\) case).

Proposition 2.1

Fix an integer \(d>0\) and any real number \(r>0\). Then there exists a constant \(C = C(d,r)\) such that the following is true: Given any distinct \(x_1, \ldots , x_k \in \mathbb {T}^d\) let \(h_q\) denote the number of pairs (i, j) with \(1 \le i,j \le k\) such that q is the minimal (if such exists) positive integer such that \(q(x_i - x_j) = 0\). Then

$$\begin{aligned} \sum _{q=2}^{\infty } h_q q^{-r} \le C k^{2 - r/(d+1)}. \end{aligned}$$
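For a finite set of rational points the quantities \(h_q\) can be tabulated directly, which may help in reading the statement. The sketch below is only an illustration under the assumption that points are given as tuples of exact fractions; the helper names `torsion_order`, `order_histogram` and `weighted_sum` are hypothetical and do not come from [6] or [8].

```python
from collections import Counter
from fractions import Fraction
from math import gcd


def torsion_order(x, y):
    """Least positive q with q(x - y) = 0 in T^d, for rational x and y."""
    q = 1
    for a, b in zip(x, y):
        den = ((a - b) % 1).denominator
        q = q * den // gcd(q, den)  # least common multiple
    return q


def order_histogram(points):
    """Map q -> h_q, counting ordered pairs (i, j), including i = j (q = 1)."""
    h = Counter()
    for x in points:
        for y in points:
            h[torsion_order(x, y)] += 1
    return h


def weighted_sum(h, r):
    """The left-hand side: sum of h_q * q^(-r) over q >= 2."""
    return sum(count * q ** (-r) for q, count in h.items() if q >= 2)


points = [(Fraction(0),), (Fraction(1, 2),), (Fraction(1, 3),)]
h = order_histogram(points)
```

For the three points 0, 1/2, 1/3 on the circle, the differences have orders 2, 3 and 6, each arising from two ordered pairs, and the three diagonal pairs give q = 1, which the sum excludes.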

Throughout this paper, we let \(e(t) = e^{2\pi i t}\) and we let

$$\begin{aligned} B(M) = \{ \vec {m} \in \mathbb {Z}^d ~|~ \vec {m} \ne \vec {0} \text { and } \Vert \vec {m}\Vert _{\infty } \le M \} \end{aligned}$$

denote the \(L^{\infty }\) ball of radius M in \(\mathbb {Z}^d\) around \(\vec {0}\) with \(\vec {0}\) removed.

For \(u \in \mathbb {T}^d\) by |u| we will mean the \(\Vert \cdot \Vert _{\infty }\) distance from u to the origin in \(\mathbb {T}^d\), which may precisely be defined as the distance from the origin in \(\mathbb {R}^d\) to the closest point in the lattice \((\pi _{\mathbb {T}^d})^{-1}(u) \subset \mathbb {R}^d\) (this is the metric that we use for \(\mathbb {T}^d\) when defining \(\epsilon \)-dense).

Theorem 2.2

(See Corollary 2 in [2]) Let \(0<\epsilon < \frac{1}{2}\), \(M=\lceil \frac{d}{\epsilon } \rceil \) and \(u_1, \ldots , u_k \in \mathbb {T}^d\) with \(|u_i| > \epsilon \) for all \(i=1, \ldots , k\). Then

$$\begin{aligned} \frac{k}{3} \le \sum _{\vec {m} \in B(M)} \left| \sum _{i=1}^k e(\vec {m} \cdot u_i) \right| . \end{aligned}$$
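The objects in this statement are straightforward to compute for small parameters. The following sketch, with hypothetical helper names `torus_norm`, `ball` and `exp_sum`, evaluates the sup-distance to the origin, enumerates \(B(M)\), and computes the right-hand side of the inequality in floating point; it is meant as a sanity check on a toy configuration, not as a verification of the theorem.

```python
import cmath
import itertools
import math


def torus_norm(u):
    """Sup-norm distance from u (given by real coordinates) to 0 in T^d."""
    return max(min(t % 1, 1 - t % 1) for t in u)


def ball(M, d):
    """Non-zero integer vectors m in Z^d with sup-norm at most M."""
    return [m for m in itertools.product(range(-M, M + 1), repeat=d) if any(m)]


def exp_sum(points, M):
    """Sum over m in B(M) of |sum_i e(m . u_i)|, where e(t) = exp(2 pi i t)."""
    d = len(points[0])
    total = 0.0
    for m in ball(M, d):
        s = sum(
            cmath.exp(2j * math.pi * sum(a * b for a, b in zip(m, u)))
            for u in points
        )
        total += abs(s)
    return total


eps, d = 0.25, 2
M = math.ceil(d / eps)  # M = 8
points = [(0.5, 0.0), (0.5, 0.0), (0.0, 0.5)]  # each has torus norm 0.5 > eps
rhs = exp_sum(points, M)
```

In this configuration the single frequency m = (2, 0) already contributes 3 to the sum, so the bound k/3 = 1 holds with room to spare.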

The following is a more relaxed version of Proposition 2 in [15] which we will need for both Theorems A and C. It is purely finitistic, rather than asymptotic, which will allow us to take averages with respect to random walks rather than just Cesàro averages.

Lemma 2.3

For integers \(d>0\) there exists a constant \(C_1=C_1(d)>0\) such that the following is true. Suppose that \(0<\epsilon <\frac{1}{2}\), \(g \in M_{d \times d}(\mathbb {Z})\) and \(x_1,\ldots , x_k \in \mathbb {T}^d\) satisfy that \(\{g x_1, \ldots , g x_k\}\) is not \(\epsilon \)-dense. Then for \(M = \lceil \frac{d}{\epsilon } \rceil \) we have

$$\begin{aligned} k^2 < C_1 \epsilon ^{-d} \sum _{\vec {m} \in B(M)} \sum _{i,j=1}^k e(\vec {m} \cdot g(x_i - x_j)). \end{aligned}$$

Proof

Not being \(\epsilon \)-dense means that there exists \(\alpha \in \mathbb {T}^d\) such that \(|\alpha - g x_i |> \epsilon \) for all \(i = 1, \ldots , k\). Using Theorem 2.2 with \(u_i = \alpha - g x_i\) and applying Cauchy–Schwarz we get

$$\begin{aligned} \frac{k^2}{9} \le |B(M)| \sum _{\vec {m} \in B(M)} \left| \sum _{i=1}^k e(\vec {m} \cdot (\alpha - g x_i)) \right| ^2. \end{aligned}$$

Now expanding this square and using the estimate \(|B(M)| = (2M+1)^d - 1 = O(2^dd^d\epsilon ^{-d})\) gives the result. \(\square \)

3 Proof of the Glasner property in the case of Zariski connected Zariski closures

Now let \(G \le {\text {SL}}_d(\mathbb {Z})\) be a subgroup with Zariski connected Zariski closure such that the action of G on \(\mathbb {R}^d\) is irreducible and let \(\mu \) be a probability measure on G with finite mean such that \(\mu (\{g \}) >0\) for all \(g \in G\). Our main tool is the following powerful result on the equidistribution of random linear walks on \(\mathbb {T}^d\) that extends the deep work of Bourgain–Furman–Lindenstrauss–Mozes [5].

Theorem 3.1

(See Theorem 1.2 in [12]) There exists a \(\lambda >0\) and a constant \(C>0\) such that for every \(x \in \mathbb {T}^d\) and \(0<t<\frac{1}{2}\), if \(a \in \mathbb {Z}^d {\setminus } \{0\}\) is such that

$$\begin{aligned} |\widehat{\mu ^{*n} *\delta _x}(a)| \ge t \quad \text {and } n \ge C \log \frac{\Vert a\Vert }{t} \end{aligned}$$

then there exists a \(q \in \mathbb {Z}_{>0}\) and \(x' \in \frac{1}{q}\mathbb {Z}^d/\mathbb {Z}^d\) such that

$$\begin{aligned}q < \left( \frac{\Vert a\Vert }{t} \right) ^C \quad \text { and } d(x,x') \le e^{-\lambda n}. \end{aligned}$$

Letting \(n \rightarrow \infty \) and taking contrapositives, we obtain the following simple corollary.

Lemma 3.2

There exists a constant \(C>0\) such that for every \(x \in \mathbb {Q}^d/\mathbb {Z}^d\) of the form \(x = \frac{1}{q}v\), where \(v \in \mathbb {Z}^d\) and \({\text {gcd}}(v,q) = 1\), and every \(a \in \mathbb {Z}^d {\setminus } \{0\}\) we have that

$$\begin{aligned} \limsup _{n \rightarrow \infty } |\widehat{\mu ^{*n} *\delta _x}(a)| \le 2\Vert a \Vert q^{-1/C}. \end{aligned}$$

Furthermore, if \(y \in \mathbb {T}^d\) is irrational then

$$\begin{aligned} \limsup _{n \rightarrow \infty } |\widehat{\mu ^{*n} *\delta _y}(a)| = 0. \end{aligned}$$

Proof

Let \(t = \Vert a \Vert q^{-1/C}\). If \(t \ge \frac{1}{2}\) then the result is clearly true since \(|{\widehat{\nu }} (a)| \le 1\) for any probability measure \(\nu \) on \(\mathbb {T}^d\). On the other hand, if \(0<t < \frac{1}{2}\) then we may apply Theorem 3.1 and proceed by contradiction to show the sharper (by a factor of \(\frac{1}{2}\)) bound

$$\begin{aligned} \limsup _{n \rightarrow \infty } |\widehat{\mu ^{*n} *\delta _x}(a)| \le t. \end{aligned}$$

More precisely, if this bound were to fail then we can find large enough \(n \ge C \log \frac{\Vert a\Vert }{t}\) so that \(|\widehat{\mu ^{*n} *\delta _x}(a)| \ge t\) and thus there exists

$$\begin{aligned} q' < \left( \frac{\Vert a\Vert }{t} \right) ^C = q \end{aligned}$$

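(meaning that \(q' \ne q\)) and \(x' \in \frac{1}{q'}\mathbb {Z}^d/\mathbb {Z}^d\) such that \(d(x,x') \le e^{-\lambda n}\). Since x has reduced denominator q and \(q' < q\), we have \(x \ne x'\), and two distinct rational points with denominators q and \(q'\) are at distance at least \(\frac{1}{qq'}\). For sufficiently large n this contradicts \(d(x,x') \le e^{-\lambda n}\). The irrational case is similar: if the limsup were positive for some irrational y, then for some \(0<t<\frac{1}{2}\) we would have \(|\widehat{\mu ^{*n} *\delta _y}(a)| \ge t\) for infinitely many n, so Theorem 3.1 would place y within \(e^{-\lambda n}\) of a rational point with denominator less than \(\left( \Vert a\Vert /t \right) ^C\) for infinitely many n. As there are only finitely many such rational points, y would coincide with one of them, a contradiction.\(\square \)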

Intuitively, this can be interpreted as saying that an irrational orbit equidistributes to the Haar measure while the orbit of a rational point with large enough denominator almost equidistributes to the Haar measure. We remark that the proof for \(G={\text {SL}}_d(\mathbb {Z})\) given by Dong in [8] instead used an explicit calculation (Ramanujan sum) for this convolution in the rational case and used the work of Benoist–Quint [3] for the irrational case.

Proof of Theorem A

Suppose for contradiction that \(x_1,\ldots , x_k \in \mathbb {T}^d\) are distinct points such that, for every \(g \in G\), the set \(\{ g x_1, \ldots , g x_k \}\) is not \(\epsilon \)-dense in \(\mathbb {T}^d\). Using Lemma 2.3, for \(M=\lceil \frac{d}{\epsilon } \rceil \) we have

$$\begin{aligned} k^2 < C_1 \epsilon ^{-d} \sum _{\vec {m} \in B(M)} \sum _{i,j=1}^k e(\vec {m} \cdot g(x_i - x_j)) \quad \text {for all } g \in G. \end{aligned}$$

Now let \(\mu \) be the probability measure on G as above. Integrating this estimate with respect to the n-fold convolution \(\mu ^{*n}\) we obtain

$$\begin{aligned} k^2&< C_1 \epsilon ^{-d} \sum _{\vec {m} \in B(M)} \int _{G} \sum _{i,j=1}^k e(\vec {m} \cdot g(x_i - x_j)) d(\mu ^{*n})(g) \\ {}&= C_1 \epsilon ^{-d} \sum _{\vec {m} \in B(M)} \sum _{i,j=1}^k \widehat{\mu ^{*n} * \delta _{x_i - x_j}}(\vec {m}). \end{aligned}$$

Now using Lemma 3.2 and letting \(n \rightarrow \infty \), we get that

$$\begin{aligned} k^2 \le C_1 \epsilon ^{-d} \sum _{\vec {m} \in B(M)} \sum _{q=2}^{\infty } h_q \cdot 2 \Vert \vec {m} \Vert q^{-1/C} + C_1\epsilon ^{-d} k |B(M)|, \end{aligned}$$

where \(h_q\) denotes the number of pairs (i, j) such that q is the least positive integer for which \(q(x_i - x_j) = 0\); the last term accounts for the k pairs with \(i = j\), and pairs with \(x_i - x_j\) irrational contribute nothing in the limit. We apply Proposition 2.1 to obtain that

$$\begin{aligned} k^2&\le 2C_1 \epsilon ^{-d} \sum _{\vec {m} \in B(M)} C_2k^{2-\frac{1}{C(d+1)}} \Vert \vec {m} \Vert + C_1\epsilon ^{-d} k |B(M)| \\ {}&\le C \epsilon ^{-3d} k^{2-\frac{1}{C(d+1)}} + C\epsilon ^{-2d}k \end{aligned}$$

for a large enough constant C that depends on d and G. Thus there is a constant \(C_G>0\) such that this inequality fails whenever \(k \ge \epsilon ^{-C_G}\) and \(\epsilon \) is sufficiently small, contradicting the initial assumption that for some distinct \(x_1, \ldots , x_k \in \mathbb {T}^d\) the set \(\{ g x_1, \ldots , g x_k \}\) fails to be \(\epsilon \)-dense in \(\mathbb {T}^d\) for every \(g \in G\). \(\square \)

4 Proof of main polynomial theorem

Lemma 4.1

(GCD bound lemma) Let \(T_0:\mathbb {Z}^d \rightarrow \mathbb {Z}^r\) be a \(\mathbb {Z}\)-linear transformation. Then there exists a constant \(Q=Q(T_0)>0\) and a surjective \(\mathbb {Z}\)-linear map \(R:\mathbb {Z}^d \rightarrow W\) (where \(W \cong \mathbb {Z}^{d'}\) is an abelian group) such that \(T_0 = TR\) for some injective \(\mathbb {Z}\)-linear map \(T:W \rightarrow \mathbb {Z}^r\) and such that for all \(q \in \mathbb {Z}_{> 0}\) we have that

$$\begin{aligned} {\text {gcd}}(Tw,q) \le Q \quad \text {for all } w \in W \text { with } {\text {gcd}}(w,q)=1. \end{aligned}$$

Proof

By the Smith normal form we may write

$$\begin{aligned} T_0 = LDR' \end{aligned}$$

where \(L:\mathbb {Z}^r \rightarrow \mathbb {Z}^r\), \(R':\mathbb {Z}^d \rightarrow \mathbb {Z}^d\) are automorphisms and \(D:\mathbb {Z}^d \rightarrow \mathbb {Z}^r\) is diagonal. This means that \(De_i = D_ie_i'\) where \(e_i \in \mathbb {Z}^d\) and \(e_i' \in \mathbb {Z}^r\) are the i-th standard basis vectors and \(D_i \in \mathbb {Z}\). We also have the divisibility conditions \(D_1|D_2|\cdots | D_d\). Now suppose that k is maximal such that \(D_k \ne 0\) (thus \(D_i = 0\) for all \(i > k\)). We let \(W = \mathbb {Z}\text {-span}\{e_1, \ldots , e_k\}\). We let \(R=P_WR'\) where \(P_W:\mathbb {Z}^d \rightarrow W\) is the orthogonal projection and we let \(T = (LD)|_{W}:W \rightarrow \mathbb {Z}^r\) be the restriction of LD to W. It follows that

$$\begin{aligned} T_0=TR \end{aligned}$$

and that T is injective while R is surjective. Indeed, for \(a \in \mathbb {Z}^d\) we write \(R'a = w+u\) where \(w \in W\) and \(u \in \mathbb {Z}\text {-span}\{e_{k+1}, \ldots , e_d\}\), thus

$$\begin{aligned} DR'a = Dw = DP_WR'a = DRa. \end{aligned}$$

Moreover, we see T is injective since L is an automorphism and \(D|_W\) is injective.

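Now fix \(q \in \mathbb {Z}_{>0}\) and \(w \in W\) such that \({\text {gcd}}(w,q) = 1\). Since each \(D_i\) with \(i \le k\) divides \(D_k\) (which we may take to be positive), we see that \({\text {gcd}}(Dw,q)\) divides \(D_k \, {\text {gcd}}(w,q) = D_k\), so \({\text {gcd}}(Dw,q) \le D_k\). Now since L is an automorphism we have that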

$$\begin{aligned} {\text {gcd}}(LDw,q) = {\text {gcd}}(Dw,q) \le Q \end{aligned}$$

where we set \(Q = D_k\). \(\square \)
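The key inequality \({\text {gcd}}(Dw,q) \le D_k\) can be checked by brute force for a small diagonal matrix. The sketch below assumes the diagonal entries are positive and satisfy the divisibility chain; the function names and the search ranges are illustrative choices, and the check covers only finitely many q and w.

```python
from itertools import product
from math import gcd


def gcd_vec(v, q):
    """gcd of the integer q and all coordinates of the integer vector v."""
    g = q
    for t in v:
        g = gcd(g, t)
    return g


def max_gcd_after_diagonal(D, q_max, w_max):
    """Largest gcd(Dw, q) over 1 <= q <= q_max and |w_i| <= w_max with gcd(w, q) = 1.

    D is a tuple of non-zero diagonal entries with D[0] | D[1] | ... .
    """
    worst = 0
    for q in range(1, q_max + 1):
        for w in product(range(-w_max, w_max + 1), repeat=len(D)):
            if gcd_vec(w, q) == 1:
                Dw = [d * t for d, t in zip(D, w)]
                worst = max(worst, gcd_vec(Dw, q))
    return worst


worst = max_gcd_after_diagonal((2, 6), q_max=36, w_max=6)
```

With D = diag(2, 6) the maximum found is 6, attained for example at w = (0, 1) and q = 6, in agreement with the choice \(Q = D_k\).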

Proof of Theorem D

Suppose \({\widetilde{Y}} = \{x_1, \ldots , x_k\}\) where the \(x_i\) are distinct and suppose that, for every \(n \in \mathbb {Z}\), the set A(n)Y is not \(\epsilon \)-dense in \(\mathbb {T}^d\) (where \(Y = \pi _{\mathbb {T}^d}({\widetilde{Y}})\)). We can then apply Lemma 2.3 to each \(g \in \{A(1), \ldots , A(N)\}\) and average over \(n=1, \ldots , N\) to obtain that

$$\begin{aligned} k^2 \le \frac{C_1}{\epsilon ^d}\sum _{ \vec {m} \in B(M)} \sum _{1 \le i,j \le k} \frac{1}{N} \sum _{n=1}^N e(\vec {m} \cdot A(n) (x_i - x_j)), \end{aligned}$$

where \(M = \lceil \frac{d}{\epsilon } \rceil \). Now for each \(\vec {m} \in B(M)\) we have a linear map \(T_{\vec {m}}:\mathbb {R}^d \rightarrow \mathbb {R}[x]\) given by

$$\begin{aligned} T_{\vec {m}}u = \vec {m} \cdot (A(x) - A(0))u. \end{aligned}$$

Observe that \(T_{\vec {m}}\) maps \(\mathbb {Z}^d\) to \(\mathbb {Z}[x]\) and in fact the image of \(T_{\vec {m}}\) is isomorphic (as an abelian group) to \(\mathbb {Z}^r\) for some \(r \le D\) where D is the degree of A(x). Using the GCD bound lemma above we may write \(T_{\vec {m}} = T'_{\vec {m}}R_{\vec {m}}\) where \(T'_{\vec {m}}:\mathbb {Z}^{d'} \rightarrow \mathbb {Z}^r\) is an injective linear map for some \(d'\le d\) and \(R_{\vec {m}}:\mathbb {Z}^d \rightarrow \mathbb {Z}^{d'}\) is surjective and linear. We may also view these maps as integer matrices and thus as linear maps between Euclidean spaces or between tori. By assumption, we have that \(T_{\vec {m}}(x_i - x_j) \in \mathbb {R}[x]\) is non-zero for distinct i, j. Thus \(R_{\vec {m}}\) must be injective on \({\widetilde{Y}}\), hence \(|{\widetilde{Y}}_{\vec {m}}| = k\) where we define

$$\begin{aligned} {\widetilde{Y}}_{\vec {m}} = R_{\vec {m}} {\widetilde{Y}} \subset \mathbb {R}^{d'}. \end{aligned}$$

Now observe that since there are only finitely many \(\vec {m}\) (we consider \(\epsilon \) as fixed and B(M) is a finite set) there must exist a constant L such that

$$\begin{aligned} R_{\vec {m}}([0,1)^d) \subset [0,L)^{d'} \end{aligned}$$

for all \(\vec {m} \in B(M)\). This means that if we set \(Y_{\vec {m}} = \pi _{\mathbb {T}^{d'}} ({\widetilde{Y}}_{\vec {m}})\) then we must have

$$\begin{aligned} |Y_{\vec {m}}| \ge \frac{|{\widetilde{Y}}_{\vec {m}}|}{L^{d'}} \ge k L^{-d}. \end{aligned}$$

Thus we can rewrite our bound as

$$\begin{aligned} k^2&\le \frac{C_1}{\epsilon ^d}\sum _{ \vec {m} \in B(M)} \sum _{{\widetilde{y}},{\widetilde{y}}' \in {\widetilde{Y}}_{\vec {m}}} \frac{1}{N} \sum _{n=1}^N e\left( (T'_{\vec {m}} ({\widetilde{y}}-{\widetilde{y}}')) (n) + \vec {m} \cdot A(0)({\widetilde{y}}-{\widetilde{y}}') \right) \\&\le \frac{C_1}{\epsilon ^d} L^{2d} \sum _{ \vec {m} \in B(M)} \sum _{y, y' \in Y_{\vec {m}}} \frac{1}{N} \sum _{n=1}^N e\left( (T'_{\vec {m}} (y-y')) (n) + \vec {m} \cdot A(0)(y-y') \right) \end{aligned}$$

where the extra \(L^{2d}\) factor comes from the fact that a pair \(y,y' \in Y_{\vec {m}}\) arises as the projection of at most \(L^d L^d\) pairs \({\widetilde{y}}, {\widetilde{y}}' \in {\widetilde{Y}}_{\vec {m}}\).

Now we consider two cases.

Case 1: \(y-y'\) is not rational, i.e., \(y-y' \notin \mathbb {Q}^{d'}/\mathbb {Z}^{d'}\). We claim that \(T'_{\vec {m}}(y-y')(x) \in \mathbb {T}[x]\) has an irrational coefficient in some non-constant term (the constant term is zero). This follows from basic linear algebra: if A is a matrix with entries in \(\mathbb {Q}\) and with trivial kernel then a solution to \(Ax = u\), with u a rational vector, must be rational. Thus if \(T'_{\vec {m}}(y-y')(x) \in (\mathbb {Q}/\mathbb {Z})[x]\) then \(y-y' \in \mathbb {Q}^{d'}/\mathbb {Z}^{d'}\), a contradiction. It now follows by the polynomial Weyl Equidistribution theorem (Theorem 1.4 in [10]) that

$$\begin{aligned} \lim _{N \rightarrow \infty } \frac{1}{N} \sum _{n=1}^N e\left( (T'_{\vec {m}} (y-y')) (n) + \vec {m} \cdot A(0)(y-y') \right) = 0. \end{aligned}$$

Case 2: \(y-y' \in \mathbb {Q}^{d'}/\mathbb {Z}^{d'}\). We thus write \(y-y' = \frac{w}{q}\) where \(w \in \mathbb {Z}^{d'}\) and \({\text {gcd}}(w, q) = 1\). We now use the GCD bound lemma to see that

$$\begin{aligned} T'_{\vec {m}}(y-y')(n) = \frac{1}{q}\sum _{j=1}^r b_j n^j \end{aligned}$$

where \({\text {gcd}}(b_1, \ldots , b_r, q) \le Q(\vec {m})\) for some constant \(Q(\vec {m})\) as in the GCD bound lemma. Thus we may apply Hua’s bound (see [14] or [13]) to obtain for any \(0< \delta < \frac{1}{D}\) a constant \(C_2 = C_2(D, \delta )\) depending only on D and \(\delta \) such that

$$\begin{aligned}&\left| \lim _{N \rightarrow \infty } \frac{1}{N}\sum _{n=1}^N e\left( (T'_{\vec {m}} (y-y')) (n) + \vec {m} \cdot A(0)(y-y') \right) \right| \\ {}&\quad = \left| \frac{1}{q}\sum _{n=1}^q e\left( \frac{1}{q}\sum _{j=1}^r b_j n^j + \vec {m} \cdot A(0)(y-y') \right) \right| \\ {}&\quad \le C_{2} \left( \frac{Q(\vec {m})}{q} \right) ^{\frac{1}{D} - \delta }. \end{aligned}$$
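The normalised complete exponential sum in the middle line is easy to evaluate numerically, which gives a feel for the decay in q. The sketch below is an illustration only: `complete_sum` is a hypothetical helper, the constant phase \(\vec {m} \cdot A(0)(y-y')\) is omitted because it does not affect the absolute value, and no attempt is made to compute the constant \(C_2\).

```python
import cmath
import math


def complete_sum(coeffs, q):
    """Return |(1/q) * sum_{n=1}^{q} e(P(n)/q)| for P(n) = sum_j b_j n^j.

    coeffs = [b_1, ..., b_r] lists the integer coefficients of n, n^2, ..., n^r;
    e(t) = exp(2 pi i t), and P(n) is reduced modulo q before exponentiating.
    """
    total = 0
    for n in range(1, q + 1):
        value = sum(b * pow(n, j, q) for j, b in enumerate(coeffs, start=1)) % q
        total += cmath.exp(2j * math.pi * value / q)
    return abs(total) / q


# Quadratic case P(n) = n^2 with q = 7: a classical Gauss sum.
quadratic = complete_sum([0, 1], 7)
# Linear case P(n) = n with q = 5: a full sum over the fifth roots of unity.
linear = complete_sum([1], 5)
```

For the quadratic example the value is \(7^{-1/2}\), matching the exponent \(\frac{1}{D}\) with \(D = 2\), and the linear example vanishes.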

Let \(Q = \max _{\vec {m} \in B(M)} Q(\vec {m})\). Also, let \(h_{q,\vec {m}}\) denote the number of pairs \(y,y' \in Y_{\vec {m}}\) such that \(y-y' = \frac{w}{q}\) where \(w \in \mathbb {Z}^{d'}\) and \({\text {gcd}}(w, q) = 1\). In other words, \(h_{q,\vec {m}}\) is the number of pairs \(y,y' \in Y_{\vec {m}}\) such that q is the least positive integer for which \(q(y-y')=0\). Letting \(N \rightarrow \infty \) and combining the two cases above we obtain the bound

$$\begin{aligned} k^2&\le \frac{C_1 L^{2d}}{\epsilon ^d}\sum _{ \vec {m} \in B(M)} \left( \sum _{q=2}^{\infty } h_{q,\vec {m}} C_{2} \left( \frac{Q}{q} \right) ^{\frac{1}{D} - \delta } +k \right) \end{aligned}$$

Now apply Proposition 2.1 to get that

$$\begin{aligned} \sum _{q=2}^{\infty } h_{q,\vec {m}} q^{\delta - \frac{1}{D}} \le C_3 k^{2 - (\frac{1}{D} - \delta )/(d+1)} \end{aligned}$$

for some constant \(C_3 = C_3(d,D)\) depending only on d and D. Thus we have shown that

$$\begin{aligned} k^2 \le Q^{\frac{1}{D} - \delta }C_2 (2M)^d \frac{C_1L^{2d}}{\epsilon ^d}C_3 k^{2 - (\frac{1}{D} - \delta )/(d+1)} + \frac{C_1 L^{2d}}{\epsilon ^d}(2M)^d k. \end{aligned}$$

Observe that as \(\epsilon \), A(x) and d are fixed, we have that M, Q and L are fixed and so for large enough k this inequality must fail. In other words, if |Y| is larger than some function of \(\epsilon \), A(x) and d then there must exist \(n \in \mathbb {Z}_{\ge 0}\) such that A(n)Y is \(\epsilon \)-dense in \(\mathbb {T}^d\). \(\square \)

5 Applications to unipotent subgroups

Lemma 5.1

Let G be a semigroup generated by a finite set U and let

$$\begin{aligned} G_n = \{ u_1 \cdots u_r ~|~ 0 \le r \le n \text { and } u_1, \ldots , u_r \in U\} \end{aligned}$$

be the ball of radius n in the Cayley graph of G. Suppose that G acts on \(\mathbb {R}^d\) by linear maps and \(a \in \mathbb {R}^d\) satisfies that Ga is not contained in any proper affine subspace. Then \(G_d a\) is not contained in any proper affine subspace.

Proof

Let \(H_n\) denote the smallest affine subspace containing \(G_n a\). In other words, \(H_n = W_n + a\) where

$$\begin{aligned} W_n = \mathbb {R}\text {-span}\{ ga - a ~|~ g \in G_n\}. \end{aligned}$$

Clearly \(H_n \subset H_{n+1}\). We claim that if \(H_N = H_{N+1}\) then \(H_n = H_N\) for all \(n \ge N\). First note that if \(u \in U\) is a generator then \(uW_n \subset W_{n+1}\), since for \(g \in G_n\) we have that

$$\begin{aligned} u(ga-a) = uga - ua = (uga - a) - (ua - a) \in W_{n+1} +W_1 \subset W_{n+1}. \end{aligned}$$

Consequently, for \(w \in W_n\) we have

$$\begin{aligned} u(w+a) = uw + ua = uw + (ua - a) + a \in W_{n+1} + W_1 + a = W_{n+1} + a \end{aligned}$$

and thus

$$\begin{aligned} uH_n \subset H_{n+1}. \end{aligned}$$

Thus if \(H_N = H_{N+1}\) then \(uH_N \subset H_{N+1} = H_N\) for all generators u and thus \(gH_N \subset H_N\) for all \(g \in G\). Recall that, by definition, \(H_N\) contains \(G_N a\); thus by G-invariance \(H_N\) contains \(G_n a\) for all \(n \ge N\), meaning that \(H_N\) contains \(H_n\) for all \(n \ge N\). Thus \(H_N = H_n\) for all \(n \ge N\). Consequently, since the dimensions of \(H_0 \subset H_1 \subset \cdots \) strictly increase until the sequence stabilises, the smallest N for which \(H_N = H_{N+1}\) satisfies \(N \le d\). Thus \(H_n = H_d\) for all \(n \ge d\) which means that \(H_d\) contains \(G_n a\) for all \(n \ge d\) and thus \(G a \subset H_d\). By assumption that Ga is not in any proper affine subspace, this means that \(H_d = \mathbb {R}^d\). \(\square \)
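The growth and stabilisation of the affine spans \(H_n\) can be observed on a small example. The sketch below computes \(\dim W_n\) for \(n = 0, 1, \ldots \) with exact rational arithmetic; the function names are hypothetical, and the two elementary unipotent matrices and the vector a = (1, 0) are chosen here purely for illustration.

```python
from fractions import Fraction


def mat_vec(M, v):
    """Apply the matrix M (tuple of rows) to the vector v."""
    return tuple(sum(a * b for a, b in zip(row, v)) for row in M)


def rank(vectors):
    """Rank over Q of a list of integer or rational vectors (Gaussian elimination)."""
    rows = [[Fraction(t) for t in v] for v in vectors]
    r = 0
    ncols = len(rows[0]) if rows else 0
    for c in range(ncols):
        pivot = next((i for i in range(r, len(rows)) if rows[i][c] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(len(rows)):
            if i != r and rows[i][c] != 0:
                f = rows[i][c] / rows[r][c]
                rows[i] = [x - f * y for x, y in zip(rows[i], rows[r])]
        r += 1
    return r


def affine_span_dims(U, a, n_max):
    """Return [dim W_0, ..., dim W_{n_max}], W_n = span{ga - a : g in G_n}."""
    orbit = {tuple(a)}     # G_n a, starting with G_0 a = {a}
    frontier = {tuple(a)}  # images of a under words of length exactly n
    dims = []
    for _ in range(n_max + 1):
        dims.append(rank([tuple(x - y for x, y in zip(p, a)) for p in orbit]))
        frontier = {mat_vec(u, p) for u in U for p in frontier}
        orbit |= frontier
    return dims


U = [((1, 1), (0, 1)), ((1, 0), (1, 1))]
dims = affine_span_dims(U, (1, 0), 3)
```

In this example the dimensions are 0, 1, 2 and then remain at 2, so the affine span already fills \(\mathbb {R}^2\) at radius \(d = 2\), as the lemma predicts.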

Proof of Theorem C

Let \(U = \{u_1, \ldots , u_m\}\) be a finite set of generators for G where each \(u_i\) is a unipotent element and use cyclic notation so that \(u_i = u_{i+jm}\) for all \(i, j \in \mathbb {Z}\). Note that for each fixed i the matrix \(u_i ^n\) has entries that are integer polynomials in n hence

$$\begin{aligned} Q_N(n_1, \ldots , n_N) = \prod _{i=1}^{N} u_i^{n_i} \in M_{d \times d}(\mathbb {Z}[n_1, \ldots , n_N]) \end{aligned}$$

is a matrix with multivariate integer polynomial entries in the variables \(n_1, \ldots , n_N\). Now let \(N = dm\) and use Lemma 5.1 to get that \(\{ Q_N(n_1, \ldots n_N)a ~|~ n_1, \ldots n_N \in \mathbb {Z}\}\) is not contained in any proper affine subspace of \(\mathbb {R}^d\) for all fixed non-zero \(a \in {\widetilde{Y}} - {\widetilde{Y}}\). In other words, for each fixed \(a \in ({\widetilde{Y}} - {\widetilde{Y}})\setminus \{0\}\) if we let \(P_1, \ldots , P_d \in \mathbb {R}[n_1, \ldots , n_N]\) be the polynomials such that

$$\begin{aligned} Q_N(n_1, \ldots , n_N)a = (P_1(n_1, \ldots , n_N), \ldots , P_d(n_1, \ldots , n_N)) \end{aligned}$$

then \(P_1, \ldots , P_d, 1\) are linearly independent over \(\mathbb {R}\). But there exists a large enough \(R \in \mathbb {Z}_{>0}\) (independent of a) such that the substitutions \(n_i \mapsto n^{R^{i-1}}\) induce a map \(\mathbb {Z}[n_1, \ldots , n_N] \rightarrow \mathbb {Z}[n]\) that is injective on the monomials appearing in \(Q_N(n_1, \ldots , n_N)\). Thus \(P_1, \ldots , P_d, 1\) remain linearly independent over \(\mathbb {R}\) after making this substitution, and so \(\{ Q_N(n, n^R, \ldots , n^{R^{N-1}})a ~|~ n \in \mathbb {Z}\}\) is also not contained in any proper affine subspace. The proof is completed by applying Theorem D to the polynomial \(A(x) = Q_N(x, x^R, \ldots , x^{R^{N-1}})\), which is independent of \({\widetilde{Y}}\); thus the lower bound k is uniform (once G is fixed). \(\square \)
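The polynomial dependence of \(u^n\) on n for a unipotent matrix u, used at the start of this proof, comes from the finite binomial expansion \(u^n = \sum _{j=0}^{d-1} \binom{n}{j}(u-I)^j\), valid because \((u-I)^d = 0\). The sketch below checks this expansion against repeated multiplication for \(n \ge 0\); the helper names are hypothetical, and the 3 by 3 matrix is an example chosen here rather than one taken from the paper.

```python
from math import comb


def mat_mul(A, B):
    """Product of two square integer matrices given as lists of rows."""
    size = len(A)
    return [
        [sum(A[i][k] * B[k][j] for k in range(size)) for j in range(size)]
        for i in range(size)
    ]


def identity(d):
    return [[int(i == j) for j in range(d)] for i in range(d)]


def unipotent_power(u, n):
    """u^n for unipotent u and n >= 0 via sum_j C(n, j) (u - I)^j."""
    d = len(u)
    nilpotent = [[u[i][j] - int(i == j) for j in range(d)] for i in range(d)]
    result = [[0] * d for _ in range(d)]
    power = identity(d)  # (u - I)^0
    for j in range(d):
        c = comb(n, j)   # a polynomial in n of degree j
        result = [
            [result[r][s] + c * power[r][s] for s in range(d)] for r in range(d)
        ]
        power = mat_mul(power, nilpotent)
    return result


def naive_power(u, n):
    """u^n by repeated multiplication, for comparison."""
    out = identity(len(u))
    for _ in range(n):
        out = mat_mul(out, u)
    return out


u = [[1, 1, 0], [0, 1, 1], [0, 0, 1]]
u5 = unipotent_power(u, 5)
```

For this u the entries of \(u^n\) are 1, n and \(\binom{n}{2}\), so each entry is a polynomial in n of degree less than d.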