1 Introduction

The aim of this paper is to prove uniform oscillation inequalities and \(\lambda \)-jump inequalities in the context of polynomial ergodic averages and truncated singular operators of the Cotlar type modeled on multi-dimensional subsets of primes. We extend the known results of Trojan [37] for the r-variation seminorm \(V^r\) with \(r>2\) to endpoint cases expressed in terms of the uniform jump and oscillation inequalities. This provides a fuller quantitative description of the pointwise convergence of these averages.

1.1 Statement of results

Let \((X,\mathcal {B},\mu )\) be a \(\sigma \)-finite measure space endowed with a family of invertible commuting and measure preserving transformations \(S_1,\ldots , S_d:X\rightarrow X\). Let \(\Omega \) be a bounded convex open subset of \(\mathbb {R}^k\) such that \(B(0,c_{\Omega }) \subseteq \Omega \subseteq B(0,1)\) for some \(c_{\Omega }\in (0, 1)\), where B(0, u) is the open Euclidean ball in \(\mathbb {R}^k\) with radius \(u>0\) centered at \(0\in \mathbb {R}^k\). For any \(t>0\), we set

$$\begin{aligned} \Omega _t:= \{x\in \mathbb {R}^{k}: t^{-1}x\in \Omega \}. \end{aligned}$$
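
As a quick illustration of this scaling (a sketch that uses only the convexity of \(\Omega \) and the fact that \(0\in \Omega \)), the dilates are nested: for \(0<s\le t\) and \(x\in \Omega _s\), we may write

$$\begin{aligned} t^{-1}x=\frac{s}{t}\,(s^{-1}x)+\Big (1-\frac{s}{t}\Big )\cdot 0\in \Omega , \end{aligned}$$

so that \(\Omega _s\subseteq \Omega _t\). In the model case \(\Omega =B(0,1)\), which is our choice of example rather than an assumption of the paper, one simply gets \(\Omega _t=B(0,t)\).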

We consider a polynomial mapping

$$\begin{aligned} \mathcal {P}=(\mathcal {P}_1,\dots ,\mathcal {P}_{d}):\mathbb {Z}^k\rightarrow \mathbb {Z}^{d} \end{aligned}$$
(1.1)

where each \(\mathcal {P}_j:\mathbb {Z}^k\rightarrow \mathbb {Z}\) is a polynomial in k variables with integer coefficients such that \(\mathcal {P}_j(0)=0\). Let \(k',k''\in \{0,1,\ldots ,k\}\) with \(k=k'+k''\). For \(f\in L^\infty (X,\mu )\), we define the associated ergodic averages by

$$\begin{aligned}{} & {} \mathcal {A}_{t}^{\mathcal {P},k',k''}f(x)\nonumber \\{} & {} \quad :=\frac{1}{\vartheta _\Omega (t)}\sum _{(n,p)\in \mathbb {Z}^{k'}\times (\pm \mathbb {P})^{k''}}f\left( S_1^{\mathcal {P}_1(n,p)}\cdots S_d^{\mathcal {P}_d(n,p)}x\right) \mathbbm {1}_{\Omega _t}(n,p)\left( \prod _{i=1}^{k''}\log |p_i|\right) ,\quad x\in X,\nonumber \\ \end{aligned}$$
(1.2)

where \(\pm \mathbb {P}\) denotes the set of positive and negative prime numbers and

$$\begin{aligned} \vartheta _\Omega (t):=\sum _{(n,p)\in \mathbb {Z}^{k'}\times (\pm \mathbb {P})^{k''}}\mathbbm {1}_{\Omega _t}(n,p)\left( \prod _{i=1}^{k''}\log |p_i|\right) \end{aligned}$$

is the Chebyshev function. We also consider the Cotlar type ergodic averages given by

$$\begin{aligned}{} & {} \mathcal {H}_{t}^{\mathcal {P},k',k''}f(x)\nonumber \\{} & {} \quad :=\sum _{(n,p)\in \mathbb {Z}^{k'}\times (\pm \mathbb {P})^{k''}}f\left( S_1^{\mathcal {P}_1(n,p)}\cdots S_d^{\mathcal {P}_d(n,p)}x\right) K(n,p)\mathbbm {1}_{\Omega _t}(n,p)\left( \prod _{i=1}^{k''}\log |p_i|\right) ,\quad x\in X,\nonumber \\ \end{aligned}$$
(1.3)

where \(K:\mathbb {R}^{k}{\setminus }\{0\} \rightarrow \mathbb {C}\) is a Calderón–Zygmund kernel satisfying the following conditions:

  1. 1.

    The size condition: For every \(x\in \mathbb {R}^k{\setminus }\{0\}\), we have

    $$\begin{aligned} |K(x)| \lesssim |x|^{-k}. \end{aligned}$$
    (1.4)
  2. 2.

    The cancellation condition: For every \(0<r<R<\infty \), we have

    $$\begin{aligned} \int _{\Omega _{R}{\setminus } \Omega _{r}} K(y) \text {d}y = 0. \end{aligned}$$
    (1.5)
  3. 3.

    The Lipschitz continuity condition: For every \(x, y\in \mathbb {R}^k{\setminus }\{0\}\) with \(2|y|\le |x|\), we have

    $$\begin{aligned} |K(x)-K(x+y)| \lesssim |y| |x|^{-(k+1)}. \end{aligned}$$
    (1.6)
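
For orientation, the following special case is a sketch under our own choice of parameters, none of which is imposed by the text: \(d=k=k''=1\), \(k'=0\), \(\mathcal {P}(p)=p\), \(\Omega =(-1,1)\), and \(K(x)=1/x\). With these choices, (1.2) and (1.3) reduce to

$$\begin{aligned} \mathcal {A}_{t}^{\mathcal {P},0,1}f(x)=\frac{1}{\vartheta _\Omega (t)}\sum _{\begin{array}{c} p\in \pm \mathbb {P}\\ |p|<t \end{array}}f(S_1^{p}x)\log |p| \qquad \text { and } \qquad \mathcal {H}_{t}^{\mathcal {P},0,1}f(x)=\sum _{\begin{array}{c} p\in \pm \mathbb {P}\\ |p|<t \end{array}}f(S_1^{p}x)\frac{\log |p|}{p}, \end{aligned}$$

where \(\vartheta _\Omega (t)=2\sum _{p\in \mathbb {P},\, p<t}\log p\). The kernel \(K(x)=1/x\) satisfies (1.4) with equality. It satisfies (1.5) because it is odd and the set \(\Omega _R{\setminus }\Omega _r=\{y\in \mathbb {R}: r\le |y|<R\}\) is symmetric. Finally, it satisfies (1.6) because \(2|y|\le |x|\) gives \(|x+y|\ge |x|/2\), and hence

$$\begin{aligned} \left| \frac{1}{x}-\frac{1}{x+y}\right| =\frac{|y|}{|x|\,|x+y|}\le 2|y|\,|x|^{-2}. \end{aligned}$$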

We recall the definitions of the oscillation seminorm and \(\lambda \)-jump counting function. Let \(\mathbb {I}\subseteq \mathbb {R}\). For an increasing sequence \(I=(I_j: j\in \mathbb {N})\subseteq \mathbb {I}\) and \(N\in \mathbb {N}\cup \{\infty \}\), the truncated oscillation seminorm of a function \(f:\mathbb {I}\rightarrow \mathbb {C}\) is defined by

$$\begin{aligned} O_{I, N}^2( f(t): t\in \mathbb {I}) := \left( \sum _{j=1}^N\sup _{\begin{array}{c} I_j \le t < I_{j+1}\\ t\in \mathbb {I} \end{array}} |f(t)-f(I_j)|^2\right) ^{1/2}. \end{aligned}$$
(1.7)

For any \(\lambda >0\) and \(\mathbb {I}\subseteq \mathbb {R}\), the \(\lambda \)-jump counting function of a function \(f:\mathbb {I}\rightarrow \mathbb {C}\) is defined by

$$\begin{aligned} N_{\lambda }(f(t) : t\in \mathbb {I}):=\sup \{J\in \mathbb {N}\, |\, \exists _{\begin{array}{c} t_{0}<\cdots<t_{J}\\ t_{j}\in \mathbb {I} \end{array}} : \min _{0<j\le J} |f(t_{j})-f(t_{j-1})| \ge \lambda \}. \end{aligned}$$
(1.8)
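
To make the two definitions concrete, consider the following toy example (our own illustration, not taken from the references). Let \(\mathbb {I}=\{1,\ldots ,6\}\) and let f alternate, that is, \(f(t)=0\) for odd t and \(f(t)=1\) for even t. For \(I=(1,3,5)\) and \(N=2\), each of the two blocks \([I_j,I_{j+1})\) contributes a supremum equal to 1, so

$$\begin{aligned} O_{I, 2}^2( f(t): t\in \mathbb {I}) = \left( 1^2+1^2\right) ^{1/2}=\sqrt{2}. \end{aligned}$$

For the jump counting function, the choice \(t_j=j+1\) for \(0\le j\le 5\) exhibits five consecutive jumps of size 1, and no difference of values of f exceeds 1. Hence, with the convention that the supremum of the empty set is 0,

$$\begin{aligned} N_{\lambda }(f(t) : t\in \mathbb {I})= {\left\{ \begin{array}{ll} 5 &{} \text {if } 0<\lambda \le 1,\\ 0 &{} \text {if } \lambda >1, \end{array}\right. } \qquad \text { and } \qquad \sup _{\lambda >0}\lambda N_{\lambda }(f(t) : t\in \mathbb {I})^{1/2}=\sqrt{5}. \end{aligned}$$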

We can now state the main result of this paper.

Theorem 1

Let \(d, k\ge 1\) and let \(\mathcal {P}\) be a polynomial mapping as in  (1.1). Let \(k',k''\in \{0,1,\ldots ,k\}\) with \(k'+k''=k\) and let \(\mathcal {M}_t^{\mathcal {P},k',k''}\) be either \(\mathcal {A}_t^{\mathcal {P},k',k''}\) or \(\mathcal {H}_t^{\mathcal {P},k',k''}\). Then, for any \(p\in (1,\infty )\), there is a constant \(C_{p,d,k,\deg \mathcal {P}}>0\) such that

$$\begin{aligned} \sup _{\lambda>0}\Big \Vert \lambda N_{\lambda }(\mathcal {M}_t^{\mathcal {P},k',k''} f:t>0)^{1/2}\Big \Vert _{L^p(X,\mu )}&\le C_{p,d,k,\deg \mathcal {P}}\Vert f\Vert _{L^p(X,\mu )}, \end{aligned}$$
(1.9)
$$\begin{aligned} \sup _{N\in \mathbb {N}}\sup _{I\in \mathfrak {S}_N(\mathbb {R}_+)}\Big \Vert O_{I,N}^2(\mathcal {M}_t^{\mathcal {P},k',k''} f:t>0)\Big \Vert _{L^p(X,\mu )}&\le C_{p,d,k,\deg \mathcal {P}}\Vert f\Vert _{L^p(X,\mu )}, \end{aligned}$$
(1.10)

for any \(f\in L^p(X,\mu )\). Here, \(\mathfrak {S}_N(\mathbb {R}_+)\) is the set of all strictly increasing sequences in \(\mathbb {R}_+\) of length \(N+1\) (see Sect. 2.2). The constant \(C_{p,d,k,\deg \mathcal {P}}\) is independent of the coefficients of the polynomial mapping \(\mathcal {P}\).

In the proof of the above theorem, we use methods developed in [24, 28, 37] and very recently in [21, 34]. We follow Bourgain’s approach [6]: we use the Calderón transference principle [7], which reduces the problem to the integer shift system (see Sect. 2.3), and then exploit the Hardy–Littlewood circle method to analyze the appropriate Fourier multipliers. The main tools used to handle the estimates for the multiplier operators are: an appropriate generalization of Weyl’s inequality (Proposition 6); the Ionescu–Wainger multiplier theorem (see [12, 28] and [36]) combined with the Rademacher–Menshov inequality (see [24]) and standard multiplier approximations (Lemma 8); and the Magyar–Stein–Wainger sampling principle (see [23] and [26]).

As a consequence of Theorem 1, we can state the following quantitative form of the ergodic theorem concerning the averages \(\mathcal {A}_t^{\mathcal {P},k',k''}\) and \(\mathcal {H}_t^{\mathcal {P},k',k''}\).

Corollary 2

Let \((X,\mathcal {B},\mu )\) be a \(\sigma \)-finite measure space. Let \(d, k\ge 1\) and let \(\mathcal {P}\) be a polynomial mapping as in  (1.1). Let \(k',k''\in \{0,1,\ldots ,k\}\) with \(k'+k''=k\) and let \(\mathcal {M}_t^{\mathcal {P},k',k''}\) be either \(\mathcal {A}_t^{\mathcal {P},k',k''}\) or \(\mathcal {H}_t^{\mathcal {P},k',k''}\). Let \(p\in (1,\infty )\) and \(f\in L^p(X,\mu )\). Then we have:

  1. (i)

    (Mean ergodic theorem) the averages \(\mathcal {M}_t^{\mathcal {P},k',k''}f\) converge in \(L^p(X,\mu )\) norm as \(t\rightarrow \infty \);

  2. (ii)

    (Pointwise ergodic theorem) the averages \(\mathcal {M}_t^{\mathcal {P},k',k''}f\) converge pointwise \(\mu \)-almost everywhere on X as \(t\rightarrow \infty \);

  3. (iii)

    (Maximal ergodic theorem) the following maximal estimate holds:

    $$\begin{aligned} \Bigg \Vert \sup _{t>0}|\mathcal {M}_t^{\mathcal {P},k',k''}f|\Bigg \Vert _{L^p(X, \mu )}\lesssim _{d,k,p, \deg \mathcal P}\Vert f\Vert _{L^p(X, \mu )}; \end{aligned}$$
    (1.11)
  4. (iv)

    (Oscillation ergodic theorem) the following uniform oscillation inequality holds:

    $$\begin{aligned} \sup _{N\in \mathbb {N}}\sup _{I\in \mathfrak S_N(\mathbb {R}_+) }\Bigg \Vert O_{I, N}^2(\mathcal {M}_t^{\mathcal {P},k',k''}f: t>0)\Bigg \Vert _{L^p(X, \mu )}\lesssim _{d,k,p, \deg \mathcal P}\Vert f\Vert _{L^p(X, \mu )}; \end{aligned}$$
    (1.12)
  5. (v)

    (Variational ergodic theorem) for any \(r\in (2,\infty )\), the following r-variational inequality holds (see Sect. 2.2 for the definition of \(V^r\)):

    $$\begin{aligned} \Bigg \Vert V^r(\mathcal {M}_t^{\mathcal {P},k',k''}f: t>0)\Bigg \Vert _{L^p(X, \mu )}\lesssim _{d,k,p, r, \deg \mathcal P}\Vert f\Vert _{L^p(X, \mu )}; \end{aligned}$$
    (1.13)
  6. (vi)

    (Jump ergodic theorem) the following jump inequality holds:

    $$\begin{aligned} \sup _{\lambda>0}\Bigg \Vert \lambda N_{\lambda }(\mathcal {M}_t^{\mathcal {P},k',k''}f:t>0)^{1/2}\Bigg \Vert _{L^p(X,\mu )}\lesssim _{d,k,p, \deg \mathcal P}\Vert f\Vert _{L^p(X, \mu )}. \end{aligned}$$
    (1.14)

The implicit constants in (1.11), (1.12), (1.13), and (1.14) are independent of the coefficients of the polynomial mapping \(\mathcal {P}\).

A few comments are in order.

  1. 1.

    Corollary 2 is the most general quantitative version of the one-parameter ergodic theorem for both averages \(\mathcal {A}_t^{\mathcal {P},k',k''}\) and \(\mathcal {H}_t^{\mathcal {P},k',k''}\) (cf. [22, Theorem 1.20]), which concludes the work of many authors over the last decades; see Sect. 1.2 for details.

  2. 2.

    The mean ergodic theorem (i) easily follows from (ii) and (iii) by Lebesgue’s dominated convergence theorem. Each inequality from (iv), (v), and (vi) individually implies pointwise convergence (ii) and the maximal estimate (iii). The jump inequality (vi) implies the variational ergodic theorem (v) in the full range \(r\in (2,\infty )\). Hence, the inequality (1.14) can be seen as an \(r=2\) endpoint for (1.13).

  3. 3.

    Unfortunately, we do not know at this moment if the oscillation inequality (1.12) is any kind of endpoint for the variational inequality (1.13). A recent result from [21] shows that the oscillation estimates cannot be interpreted as an endpoint in a way similar to how the jump inequalities are. See the discussion in [21] and [22].

  4. 4.

    The oscillation inequality (1.12) for the ergodic averages \(\mathcal {A}_t^{\mathcal {P},k',k''}\) can be seen as a contribution to a problem posed by Rosenblatt and Wierdl [31, Problem 4.12, p. 80] in the early 1990’s about uniform oscillation inequalities for the classical Birkhoff ergodic averages given by

    $$\begin{aligned} \frac{1}{2N+1}\sum _{n=-N}^Nf(S^nx). \end{aligned}$$
    (1.15)

    In 1998, Jones, Kaufman, Rosenblatt, and Wierdl [14] gave an affirmative answer to this problem. The inequality (1.10) provides us with the uniform oscillation inequality for the counterpart of (1.15) along the prime numbers given by

    $$\begin{aligned} \frac{1}{2|\mathbb {P}_N|}\sum _{n=-N}^N{f(S^nx)\mathbbm {1}_{\mathbb {P}}(|n|)}, \end{aligned}$$

    where \(\mathbb {P}_N=\mathbb {P}\cap [1,N]\). Moreover, the inequality (1.10) is much more general than the originally posed problem since it concerns multi-dimensional averages along arbitrary polynomials with integer coefficients.

  5. 5.

    Parts (i), (ii), (iii), and (v) for the standard averages \(\mathcal {A}_t^{\mathcal {P},k',k''}\) with \(k''\ge 1\) in the presented generality were first obtained by Trojan [37]. In the case with \(k''=0\) (excluding the prime numbers from the summation), the first proof of the variational inequality (1.13) in the full range \(r\in (2,\infty )\) was given by Mirek, Stein, and Trojan [24].

  6. 6.

    In the case of the Cotlar ergodic averages \(\mathcal {H}_t^{\mathcal {P},k',k''}\) with \(k''\ge 1\), the ergodic theorems (i), (ii), (iii) and (v) were proven by Trojan [37] under the gradient condition

    $$\begin{aligned} |x|^{k+1}|\nabla K(x)|\lesssim 1, \end{aligned}$$
    (1.16)

    but Trojan’s argument can be adapted with small changes to deal with Calderón–Zygmund kernels which satisfy the more general condition (1.6). In this case, the results for \(k''\ge 1\) seem to be completely new. For \(\mathcal {H}_t^{\mathcal {P},k,0}\), the jump inequality was obtained by Mirek, Stein, and Zorin-Kranich [28], and the oscillation ergodic theorem was obtained by the second author [34].

  7. 7.

    The oscillation inequality (1.12) and the jump inequality (1.14) are completely new results for both types of averages when \(k''\ge 1\) and follow from Theorem 1. When \(k''=0\), the corresponding results for the jump inequalities are known due to the work of Mirek, Stein, and Zorin-Kranich [28]. The uniform oscillation inequality was proven by Mirek, Słomian, and Szarek [21] in the case of the averages \(\mathcal {A}_t^{\mathcal {P},k,0}\) and by the second author [34] in the case of \(\mathcal {H}_t^{\mathcal {P},k,0}\).
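
The implications recorded in the second comment above rest on elementary pointwise facts. The following is a sketch of two of them for a scalar function \(a:(0,\infty )\rightarrow \mathbb {C}\), which stands in for \(t\mapsto \mathcal {M}_t^{\mathcal {P},k',k''}f(x)\) at a fixed \(x\in X\) (the symbol a is our notation). First, for any fixed \(t_0>0\) and any \(r\in [1,\infty )\), we have

$$\begin{aligned} \sup _{t>0}|a(t)|\le |a(t_0)|+\sup _{t>0}|a(t)-a(t_0)|\le |a(t_0)|+V^r(a(t):t>0), \end{aligned}$$

since each difference \(|a(t)-a(t_0)|\) is a one-term instance of the sums defining \(V^r\) (see Sect. 2.2). Second, if \(N_{\lambda }(a(t):t>0)<\infty \) for every \(\lambda >0\), then a(t) converges as \(t\rightarrow \infty \). Indeed, otherwise the Cauchy criterion fails, and one can select \(\varepsilon >0\) and \(t_0<t_1<\cdots \) with

$$\begin{aligned} |a(t_j)-a(t_{j-1})|\ge \varepsilon \quad \text {for all } j\in \mathbb {N}, \end{aligned}$$

which forces \(N_{\varepsilon }(a(t):t>0)=\infty \). This is only the pointwise skeleton of the argument; passing to \(L^p\) norms requires treating the term \(a(t_0)\) separately.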

1.2 Historical background

In 1931, Birkhoff [3] and von Neumann [38] proved that the averages

$$\begin{aligned} M_Nf(x):=\frac{1}{N}\sum _{n=1}^Nf(S^nx) \end{aligned}$$
(1.17)

converge pointwise \(\mu \)-almost everywhere on X and in \(L^p(X,\mu )\) norm respectively for any \(f\in L^p(X,\mu )\), \(p \in [1,\infty )\), as \(N\rightarrow \infty \). In 1955, Cotlar [9] established the pointwise \(\mu \)-almost everywhere convergence on X as \(N\rightarrow \infty \) of the ergodic Hilbert transform given by

$$\begin{aligned} H_Nf(x):=\sum _{1\le |n|\le N} \frac{f(S^nx)}{n} \end{aligned}$$

for any \(f\in L^p(X,\mu )\). In 1968, Calderón [7] made an important observation (now called the Calderón transference principle) that some results in ergodic theory can be easily deduced from known results in harmonic analysis. Namely, the convergence of the Birkhoff averages \(M_N\) can be deduced from the boundedness of the Hardy–Littlewood maximal function, and the convergence of Cotlar’s averages \(H_N\) follows from the boundedness of the maximal function for the truncated discrete Hilbert transform. As we will see below, this observation has had a huge impact on the study of convergence problems in ergodic theory.

At the beginning of the 1980’s, Bellow [2] and independently Furstenberg [11] posed the problem about pointwise convergence of the averages along squares given by

$$\begin{aligned} T_N f(x):=\frac{1}{N}\sum _{n=1}^N{f(S^{n^2}x)}. \end{aligned}$$

Despite its similarity to Birkhoff’s theorem, the problem of pointwise convergence of the \(T_N\) averages has a totally different nature from that of its linear counterpart. In particular, the standard approach is insufficient in this case.

We briefly sketch the classical approach of handling the problem of pointwise convergence. It consists of two steps:

  1. (a)

    Establish \(L^p\)-boundedness for the corresponding maximal function.

  2. (b)

    Find a dense class of functions in \(L^p(X,\mu )\) for which the pointwise convergence holds.

In the case of Birkhoff’s averages \(M_N\), the Calderón transference principle allows one to deduce the estimate

$$\begin{aligned} \Vert \sup _{N\in \mathbb {N}}|M_Nf|\Vert _{L^p(X,\mu )}\lesssim _{p}\Vert f\Vert _{L^p(X,\mu )} \end{aligned}$$

for \(p\in (1,\infty ]\) from the estimate for the discrete Hardy–Littlewood maximal function (and we have a weak-type estimate for \(p=1\)). In turn, estimates for the discrete Hardy–Littlewood maximal function follow easily from those for the continuous one. This establishes the first step (a). For the second step, one can use the idea of Riesz decomposition [30] to analyze the space \({\mathbb {I}}_S\oplus {\mathbb {T}}_S\subseteq L^2(X,\mu )\), where

$$\begin{aligned} { \mathbb {I}}_S&:=\{f\in L^2(X,\mu ): f\circ S =f\}\\&\qquad \text { and } \qquad \\ {\mathbb {T}}_S&:=\{h\circ S-h: h\in L^2(X,\mu )\cap L^{\infty }(X,\mu )\}. \end{aligned}$$

We see that \(M_Nf=f\) for \(f\in {\mathbb {I}}_S\) and, for \(g=h\circ S - h\in {\mathbb {T}}_S\), we have

$$\begin{aligned} M_Ng(x)=\frac{1}{N}\Bigg (h(S^{N+1}x)-h(Sx)\Bigg ) \end{aligned}$$

by telescoping. Since h is bounded, it follows that \(M_Ng\rightarrow 0\) \(\mu \)-almost everywhere as \(N\rightarrow \infty \). This establishes \(\mu \)-almost everywhere pointwise convergence of \(M_N\) on \({\mathbb {I}}_S\oplus {\mathbb {T}}_S\), which is dense in \(L^2(X,\mu )\). Since \(L^2(X,\mu )\cap L^p(X,\mu )\) is dense in \(L^p(X,\mu )\) for every \(p\in [1,\infty )\), this establishes (b).
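
For completeness, the telescoping step can be written out as follows (a routine computation using only the definition (1.17) and \(g=h\circ S-h\)):

$$\begin{aligned} M_Ng(x)&=\frac{1}{N}\sum _{n=1}^N\Big (h(S^{n+1}x)-h(S^{n}x)\Big )=\frac{1}{N}\Big (h(S^{N+1}x)-h(Sx)\Big ),\\ |M_Ng(x)|&\le \frac{2\Vert h\Vert _{L^\infty (X,\mu )}}{N}\xrightarrow [N\rightarrow \infty ]{}0 \end{aligned}$$

for \(\mu \)-almost every \(x\in X\), where the last bound uses that S preserves the measure \(\mu \).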

In the case of the quadratic averages \(T_N\), the matter is more complicated. For the first step, by the Calderón transference principle, it is enough to establish \(\ell ^p\) bounds for the maximal function given by

$$\begin{aligned} \sup _{N\in \mathbb {N}}\frac{1}{N}\sum _{n=1}^N f(x-n^2),\quad f\in \ell ^p(\mathbb {Z}). \end{aligned}$$
(1.18)

The \(\ell ^p\) estimate for the above maximal function does not follow directly from the continuous counterpart and requires completely new methods. However, a more serious problem arises in connection with the second step. Namely, the idea of von Neumann fails in this case because the averages \(T_N g\) do not possess the telescoping property for \(g\in \mathbb {T}_S\).

At the end of the 1980’s, Bourgain established the pointwise convergence of the averages \(T_N\) in a series of groundbreaking articles [4,5,6]. By using the Hardy–Littlewood circle method from analytic number theory, he established \(\ell ^p\)-bounds for the maximal function (1.18), which establishes step (a). He then bypassed the problem of finding the requisite dense class of functions by using the oscillation seminorm (1.7). Bourgain [6] proved that, for any \(\lambda >1\) and any sequence of integers \(I=(I_j:{j\in \mathbb {N}})\) with \(I_{j+1}>2I_j\) for all \(j\in \mathbb {N}\), we have

$$\begin{aligned} \Big \Vert O_{I, N}^2(T_{\lambda ^n}f:n\in \mathbb {N})\Big \Vert _{L^2(X,\mu )} \le C_{I,\lambda }(N)\Vert f\Vert _{L^2(X,\mu )}, \qquad N\in \mathbb {N}, \end{aligned}$$
(1.19)

for any \(f\in L^2(X,\mu )\), where the constant satisfies \(\lim _{N\rightarrow \infty } N^{-1/2}C_{I, \lambda }(N)=0\). The non-uniform inequality (1.19) suffices to establish the pointwise convergence of the averaging operators \(T_Nf\) for any \(f\in L^2(X, \mu )\). In the same series of papers, by similar methods, Bourgain established the pointwise convergence of the averages along primes

$$\begin{aligned} \frac{1}{|\mathbb {P}_N|}\sum _{n=1}^Nf(S^nx)\mathbbm {1}_{\mathbb {P}}(n) \end{aligned}$$

for \(f\in L^p(X,\mu )\) with \(p>\frac{1}{2}(1+\sqrt{3})\). In the same year, Wierdl [40] extended Bourgain’s result to \(p\in (1,\infty )\).

In order to establish the inequality (1.19), Bourgain used the Hardy–Littlewood circle method and r-variation seminorms \(V^r\). The r-variations were introduced by Lépingle [17] in the context of families of bounded martingales. In 1976, he proved that, for all \(r\in (2, \infty )\), \(p\in (1, \infty )\), and any family of bounded martingales \((\mathfrak f_n:X\rightarrow \mathbb {C}:n\in \mathbb {N})\), we have

$$\begin{aligned} \Vert V^r(\mathfrak f_n: n\in \mathbb {N})\Vert _{L^p(X)}\lesssim _{p,r}\sup _{n\in \mathbb {N}}\Vert \mathfrak f_n\Vert _{L^p(X)} \end{aligned}$$

with the implicit constant depending only on p and r. The above inequality is sharp in the sense that it fails for \(r=2\), see [13] for a counterexample.

Bourgain observed that the \(V^r\) seminorm can be used to obtain (1.19). This is because, by Hölder’s inequality, we have

$$\begin{aligned} O_{I,N}^2(T_n f:n\in \mathbb {N})\le N^{1/2-1/r}V^r(T_n f:n\in \mathbb {N}) \end{aligned}$$

for \(r\ge 2\). In order to prove the r-variational inequality for the averages \(T_N\), Bourgain used the \(\lambda \)-jump counting function. It can easily be seen that

$$\begin{aligned} \sup _{\lambda >0}\Vert \lambda N_{\lambda }(T_N f:N\in \mathbb {N})^{1/r}\Vert _{L^p(X,\mu )}\le \Vert V^r(T_N f:N\in \mathbb {N})\Vert _{L^p(X,\mu )} \end{aligned}$$

for every \(r\ge 2\). The above inequality can be reversed in some sense [6]. Namely, for any \(p\in (1,\infty )\) and any \(r\in (2,\infty )\), we have

$$\begin{aligned} \Vert V^r(T_N f:N\in \mathbb {N})\Vert _{L^{p,\infty }(X,\mu )}\lesssim _{p,r} \sup _{\lambda >0}\Vert \lambda N_{\lambda }(T_N f:N\in \mathbb {N})^{1/2}\Vert _{L^{p,\infty }(X,\mu )}. \end{aligned}$$

For more details about oscillation, variation, and jump seminorms, we refer to [16, 21, 22].

The above arguments demonstrate that the problem of proving pointwise convergence can be reduced to proving an appropriate r-variational estimate or jump inequality. However, an intriguing question was the issue of uniformity in the inequality (1.19). Shortly after the groundbreaking work of Bourgain, Lacey [31, Theorem 4.23, p. 95] improved inequality (1.19) showing that, for every \(\lambda >1\), there is a constant \(C_{\lambda }>0\) such that

$$\begin{aligned} \sup _{N\in \mathbb {N}}\sup _{I\in \mathfrak S_N(\mathbb {L}_{\lambda })}\Big \Vert O_{I, N}^2(T_{t}f:t\in \mathbb {L}_{\lambda })\Big \Vert _{L^2(X)} \le C_{\lambda }\Vert f\Vert _{L^2(X)},\quad f\in L^2(X,\mu ), \end{aligned}$$
(1.20)

where \(\mathbb {L}_{\lambda }:=\{\lambda ^n:n\in \mathbb {N}\}\). This result motivated the question about uniform estimates independent of \(\lambda >1\) in (1.20). In the case of Birkhoff’s averages, this question was explicitly formulated in [31, Problem 4.12, p. 80].

In 1998, Jones, Kaufman, Rosenblatt, and Wierdl [14] established the uniform oscillation inequality on \(L^p(X,\mu )\) for the standard Birkhoff averages \(M_N\). Two years later, Campbell, Jones, Reinhold, and Wierdl [8] established the uniform oscillation inequality for the ergodic Hilbert transform. In 2003, Jones, Rosenblatt, and Wierdl [15] proved uniform oscillation inequalities on \(L^p(X,\mu )\) with \(p\in (1,2]\) for the Birkhoff averages over cubes. However, the case of polynomial averages, even one-dimensional, was open until recent works [21, 34], and the case of averages along primes was open until this paper.

In 2015, Mirek and Trojan [19], using the ideas of Bourgain and Wierdl, established \(\mu \)-almost everywhere pointwise convergence of the Cotlar averages along the primes,

$$\begin{aligned} \sum _{p\in (\pm \mathbb {P}_N)} \frac{f(S^px)}{p}\log |p|. \end{aligned}$$

They proved that the corresponding maximal function is bounded on \(L^p(X,\mu )\) with \(p>1\) and showed that the analogue of Bourgain’s non-uniform oscillation inequality (1.19) holds for those averages.

In the same year, Zorin-Kranich [41] established the pointwise convergence of the averages related to the polynomial mapping given by

$$\begin{aligned} \tilde{\mathcal {P}}=(n,n^2,n^3,\ldots ,n^d):\mathbb {Z}\rightarrow \mathbb {Z}^d. \end{aligned}$$

Namely, he proved that, for any \(r>2\) and \( \left| \frac{1}{p}-\frac{1}{2}\right| <\frac{1}{2(d+1)}\), we have the following r-variational estimate

$$\begin{aligned} \left\| V^r(\mathcal {A}_N^{\tilde{\mathcal {P}},1,0}f:N\in \mathbb {N})\right\| _{L^p(X,\mu )}\lesssim _{p,r}\Vert f\Vert _{L^p(X,\mu )}. \end{aligned}$$

As a consequence, the averages \(\mathcal {A}_N^{\tilde{\mathcal {P}},1,0}f\) converge \(\mu \)-almost everywhere for any \(f\in L^p(X,\mu )\).

In 2016, Mirek and Trojan [20] established the pointwise convergence for the averages (1.2) taken over cubes with \(k'=k\), that is

$$\begin{aligned} A_{N,\textrm{cube}}^{\mathcal {P},k,0} f(x):=\frac{1}{N^k}\sum _{y\in [0,N]^k\cap \mathbb {Z}^k}f\left( S_1^{\mathcal {P}_1(y)}S_2^{\mathcal {P}_2(y)}\cdots S_d^{\mathcal {P}_d(y)}x\right) . \end{aligned}$$

There, Mirek and Trojan noted for the first time that the Rademacher–Menshov inequality (2.5) may be used to establish r-variational estimates. For \(p\in (1,\infty )\) and \(r>\max \{p, p/(p-1)\}\), they proved that

$$\begin{aligned} \left\| V^r(A_{N,\textrm{cube}}^{\mathcal {P},k,0} f:N\in \mathbb {N})\right\| _{L^p(X,\mu )}\le C_{p,d,k,\textrm{deg}\mathcal {P}}\Vert f\Vert _{L^p(X,\mu )}. \end{aligned}$$

Unfortunately, the methods introduced by Bourgain had limitations. They work perfectly well for \(L^2\) estimates, but, for \(L^p\) estimates with \(p\ne 2\), difficulties arise concerning the fractions around which major arcs are defined, and these are hard to overcome. However, Ionescu and Wainger [12], in their groundbreaking 2005 work on discrete singular Radon operators, introduced a set of fractions for which the circle method can be applied towards \(L^p\) estimates with \(p\ne 2\).

In 2015, Mirek [18] built a discrete counterpart of the Littlewood–Paley theory using the Ionescu–Wainger multiplier theorem and used it to reprove the main result from [12]. In 2017, Mirek, Stein, and Trojan [24, 25] further exploited these ideas together with the Rademacher–Menshov inequality from [20] to obtain an \(L^p\) estimate for the r-variation seminorm for both \(\mathcal {A}_t^{\mathcal {P},k,0}\) and \(\mathcal {H}_t^{\mathcal {P},k,0}\) associated with convex sets in the full range of parameters. Namely, they showed that

$$\begin{aligned} \left\| V^r(\mathcal {M}_t^{\mathcal {P},k,0}f: t>0)\right\| _{L^p(X, \mu )}\lesssim _{d,k,p, r, \deg \mathcal P}\Vert f\Vert _{L^p(X, \mu )} \end{aligned}$$
(1.21)

for \(p\in (1,\infty )\) and \(r\in (2,\infty )\), where \(\mathcal {M}_t^{\mathcal {P},k,0}\) is either \(\mathcal {A}_t^{\mathcal {P},k,0}\) or \(\mathcal {H}_t^{\mathcal {P},k,0}\). There, the operators \(\mathcal {H}_t^{\mathcal {P},k,0}\) are related to Calderón–Zygmund kernels satisfying the gradient condition (1.16).

In 2019, Trojan [37] proved an \(L^p\) estimate for the r-variation seminorm for both \(\mathcal {A}_t^{\mathcal {P},k',k''}\) and \(\mathcal {H}_t^{\mathcal {P},k',k''}\) with \(k',k''\in \{0,1,\ldots ,k\}\) such that \(k'+k''=k\). Namely, he showed that

$$\begin{aligned} \left\| V^r(\mathcal {M}_t^{\mathcal {P},k',k''}f: t>0)\right\| _{L^p(X, \mu )}\lesssim _{d,k,p, r, \deg \mathcal P}\Vert f\Vert _{L^p(X, \mu )} \end{aligned}$$
(1.22)

for \(p\in (1,\infty )\) and \(r\in (2,\infty )\), where \(\mathcal {M}_t^{\mathcal {P},k',k''}\) is either \(\mathcal {A}_t^{\mathcal {P},k',k''}\) or \(\mathcal {H}_t^{\mathcal {P},k',k''}\). A straightforward consequence of the inequality (1.22) is the \(\mu \)-almost everywhere convergence of the averages \(\mathcal {M}_t^{\mathcal {P},k',k''}f\). Again, the operators \(\mathcal {H}_t^{\mathcal {P},k',k''}\) there are related to Calderón–Zygmund kernels satisfying the gradient condition (1.16).

In 2020, Mirek, Stein, and Zorin-Kranich [28] further refined the methods developed in [24, 25] and proved a uniform \(L^p\) estimate for the \(\lambda \)-jump counting function. They proved that

$$\begin{aligned} \sup _{\lambda>0}\Big \Vert \lambda N_{\lambda }(\mathcal {M}_t^{\mathcal {P},k,0}f:t>0)^{1/2}\Big \Vert _{L^p(X,\mu )}\le C_{p,d,k,\textrm{deg}\mathcal {P}}\Vert f\Vert _{L^p(X,\mu )} \end{aligned}$$
(1.23)

for any \(p\in (1,\infty )\) and any \(f\in L^p(X,\mu )\), where \(\mathcal {M}_t^{\mathcal {P},k,0}f\) is either \(\mathcal {A}_t^{\mathcal {P},k,0}f\) or \(\mathcal {H}_t^{\mathcal {P},k,0}f\). There, the operators \(\mathcal {H}_t^{\mathcal {P},k,0}\) are associated with Calderón–Zygmund kernels satisfying the Hölder continuity condition generalizing (1.6): For some \(\sigma \in (0,1]\) and for every \(x, y\in \mathbb {R}^k{\setminus }\{0\}\) with \(2|y|\le |x|\), we have

$$\begin{aligned} |K(x)-K(x+y)| \lesssim |y|^{\sigma } |x|^{-(k+\sigma )}. \end{aligned}$$
(1.24)
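
The three regularity conditions (1.16), (1.6), and (1.24) are nested. The following sketch records why, assuming for the first step that K is continuously differentiable away from the origin (an assumption implicit in (1.16)). If \(2|y|\le |x|\), then \(|x+\theta y|\ge |x|-|y|\ge |x|/2\) for all \(\theta \in [0,1]\), so (1.16) gives

$$\begin{aligned} |K(x+y)-K(x)|=\left| \int _0^1\nabla K(x+\theta y)\cdot y\, \text {d}\theta \right| \lesssim |y|\sup _{\theta \in [0,1]}|x+\theta y|^{-(k+1)}\le 2^{k+1}|y|\,|x|^{-(k+1)}, \end{aligned}$$

which is (1.6). In turn, since \(|y|/|x|\le 1/2<1\), for any \(\sigma \in (0,1]\) we have

$$\begin{aligned} |y|\,|x|^{-(k+1)}=|x|^{-k}\,\frac{|y|}{|x|}\le |x|^{-k}\left( \frac{|y|}{|x|}\right) ^{\sigma }=|y|^{\sigma }|x|^{-(k+\sigma )}, \end{aligned}$$

so (1.6) implies (1.24) for every such \(\sigma \).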

It is worth noting that the inequality (1.23) implies the r-variation inequality (1.21).

In 2021, the second author in collaboration with Mirek and Szarek [21] established the oscillation inequality

$$\begin{aligned} \sup _{N\in \mathbb {N}}\sup _{I\in \mathfrak {S}_N(\mathbb {R}_+)}\left\| {O_{I,N}^2(\mathcal {A}_t^{\mathcal {P},k,0}f:t>0)}\right\| _{L^p(X,\mu )}\le C_{p,d,k,\textrm{deg}\mathcal {P}}\Vert f\Vert _{L^p(X,\mu )}, \end{aligned}$$
(1.25)

and, recently, the second author [34] proved the counterpart of (1.25) in the case of the operators \(\mathcal {H}_t^{\mathcal {P},k,0}\) related to Calderón–Zygmund kernels satisfying (1.24).

2 Notation and necessary tools

2.1 Basic notation

We denote \(\mathbb {N}:=\{1, 2, \ldots \}\), \(\mathbb {N}_0:=\{0,1,2,\ldots \}\), and \(\mathbb {R}_+:=(0, \infty )\). For \(d\in \mathbb {N}\), the sets \(\mathbb {Z}^d\), \(\mathbb {R}^d\), \(\mathbb {C}^d\), and \(\mathbb {T}^d = (\mathbb {R}/\mathbb {Z})^d \equiv [-1/2, 1/2)^d\) have the standard meanings. For each \(N\in \mathbb {N}\), we set

$$\begin{aligned} \mathbb {N}_N:=\{1,\ldots , N\}. \end{aligned}$$

For any \(x\in \mathbb {R}\), we set

$$\begin{aligned} \lfloor x \rfloor : = \max \{ n \in \mathbb {Z}: n \le x \}. \end{aligned}$$

For \(u\in \mathbb {N}\), we define the set

$$\begin{aligned} 2^{u\mathbb {N}}:=\{2^{un}:n\in \mathbb {N}\}. \end{aligned}$$

For two non-negative numbers A and B, we write \(A \lesssim B\) to indicate that \(A\le CB\) for some \(C>0\) that may change from line to line, and we may write \(\lesssim _{\delta }\) if the implicit constant depends on \(\delta \).

We denote the standard inner product on \(\mathbb {R}^d\) by \( x\cdot \xi \). Moreover, for any \(x\in \mathbb {R}^d\), we denote the \(\ell ^2\)-norm and the maximum norm respectively by

$$\begin{aligned} |x|:=|x|_2:=\sqrt{x\cdot x} \qquad \text { and } \qquad |x|_{\infty }:=\max _{1\le k\le d}|x_k|. \end{aligned}$$

For a multi-index \(\gamma =(\gamma _1,\dots ,\gamma _k)\in \mathbb {N}^k_0\), we abuse the notation to write \(|\gamma |:=\gamma _1+\cdots +\gamma _k\). No confusion should arise since all multi-indices will be denoted by \(\gamma \).

2.2 Seminorms

Let \(\mathbb {I}\subseteq \mathbb {R}\) and \(\lambda > 0\). For \(N\in \mathbb {N}\cup \{\infty \}\), we denote by \(\mathfrak S_N(\mathbb {I})\) the family of all strictly increasing sequences of length \(N+1\) contained in \(\mathbb {I}\). We already defined the oscillation seminorm (1.7) and the \(\lambda \)-jump counting function (1.8) in the introduction. For any \(r\in [1,\infty )\), the r-variation seminorm \(V^r\) of a function \(f:\mathbb {I}\rightarrow \mathbb {C}\) is defined by

$$\begin{aligned} V^r(f(t) : t \in \mathbb {I}): = \sup _{\begin{array}{c} t_0< t_1< \cdots < t_J \\ t_j \in \mathbb {I} \end{array}} \left( \sum _{j = 1}^J |f(t_j) - f(t_{j-1})|^r\right) ^{1/r}. \end{aligned}$$
(2.1)

The r-variational seminorm controls the oscillation seminorm and the \(\lambda \)-jump counting function. Indeed, by Hölder’s inequality, we have

$$\begin{aligned} O_{I,N}^2(f(t) : t \in \mathbb {I}) \le N^{1/2-1/r} V^r(f(t) : t \in \mathbb {I}) \end{aligned}$$
(2.2)

for any \(N\in \mathbb {N}\), \(I\in \mathfrak {S}_N(\mathbb {I})\), and \(r \ge 2\). Moreover, for any \(\lambda >0\), we have

$$\begin{aligned} \lambda N_\lambda (f(t) : t \in \mathbb {I})^{1/r} \le V^r(f(t) : t \in \mathbb {I}). \end{aligned}$$
(2.3)
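
Both inequalities are short computations. The following sketch spells them out, with the numbers \(a_j\) and the auxiliary points \(s_j\) being our notation. For (2.2), set \(a_j:=\sup \{|f(t)-f(I_j)|: t\in \mathbb {I},\ I_j\le t<I_{j+1}\}\). For \(r>2\), Hölder’s inequality with exponents r/2 and \(r/(r-2)\) gives (the case \(r=2\) being trivial)

$$\begin{aligned} \left( \sum _{j=1}^N a_j^2\right) ^{1/2}\le N^{1/2-1/r}\left( \sum _{j=1}^N a_j^r\right) ^{1/r}. \end{aligned}$$

Choosing \(s_j\in [I_j,I_{j+1})\cap \mathbb {I}\) that nearly attain the suprema, the points \(I_1\le s_1<I_2\le s_2<\cdots \) form an increasing sequence in \(\mathbb {I}\), so the sum on the right-hand side is controlled by \(V^r(f(t):t\in \mathbb {I})\). For (2.3), if \(t_0<\cdots <t_J\) satisfy \(|f(t_j)-f(t_{j-1})|\ge \lambda \) for all \(1\le j\le J\), then

$$\begin{aligned} V^r(f(t) : t \in \mathbb {I})\ge \left( \sum _{j = 1}^J |f(t_j) - f(t_{j-1})|^r\right) ^{1/r}\ge \lambda J^{1/r}, \end{aligned}$$

and taking the supremum over all admissible J yields (2.3).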

We adopt notation that handles the oscillation seminorm and the \(\lambda \)-jump counting function simultaneously, both for the sake of brevity and to emphasize the required properties. Let E be either \(\mathbb {R}^d\) or \(\mathbb {Z}^d\) with the usual measure and let \((f_t:t\in \mathbb {I}) \subset L^p(E)\). We write

$$\begin{aligned} \mathcal {S}_E^p(f_t: t \in \mathbb {I}) \end{aligned}$$

to represent either of the following quantities:

$$\begin{aligned} \sup _{N \in \mathbb {N}}\sup _{I \in \mathfrak {S}_N(\mathbb {I})} \Bigg \Vert O_{I,N}^2(f_t(x): t \in \mathbb {I}) \Bigg \Vert _{L^p(E)}\quad \text {or}\quad \sup _{\lambda > 0} \Bigg \Vert \lambda N_\lambda (f_t(x): t \in \mathbb {I})^{1/2} \Bigg \Vert _{L^p(E)}. \end{aligned}$$

Proposition 3

Let \(p\in (1,\infty )\) and \(\mathbb {I}\subseteq \mathbb {R}\). The seminorm \(\mathcal {S}_E^p\) is subadditive up to a positive constant, that is,

$$\begin{aligned} \mathcal {S}_E^p(f_t+g_t:t\in \mathbb {I})\lesssim \mathcal {S}_E^p(f_t:t\in \mathbb {I})+ \mathcal {S}_E^p(g_t:t\in \mathbb {I}), \end{aligned}$$

where the implied constant is independent of \(\mathbb {I}\) and the families \((f_t:t\in \mathbb {I})\) and \((g_t:t\in \mathbb {I})\).

The critical point is that the jump quasi-seminorm admits an equivalent subadditive seminorm, see [28, Corollary 2.11].

Remark 2.4

(Rademacher–Menshov inequality) By inequalities (2.2) and (2.3), we deduce that the Rademacher–Menshov inequality [27, Lemma 2.5, p. 534] holds for \(\mathcal {S}_E^p\). Namely, for any \(k,m\in \mathbb {N}\) with \(k<2^m\) and any sequence of functions \((f_n:n\in \mathbb {N})\subset L^p(E)\), we have

$$\begin{aligned} \mathcal {S}_E^p(f_n:k\le n\le 2^m)\le \sqrt{2}\left\| {\sum _{i=0}^m\left( \sum _{j}{\left| f_{u_{j+1}^i}-f_{u_j^i}\right| ^2}\right) ^{1/2}}\right\| _{L^p(E)}, \end{aligned}$$
(2.5)

where each \([u_j^i,u_{j+1}^i)\) is a dyadic interval contained in \([k, 2^m]\) of the form \([j2^i,(j+1)2^{i})\) for some \(0\le i\le m\) and \(0\le j\le 2^{m-i}-1\).

For more information about the \(\lambda \)-jump counting function and the oscillation and r-variation seminorms, we refer to [6, 16, 22, 26, 33].

2.3 Reductions: Calderón transference and lifting

By the Calderón transference principle [7], we may restrict attention to the model dynamical system of \(\mathbb {Z}^d\) equipped with the counting measure and the shift operators \(S_j:\mathbb {Z}^d\rightarrow \mathbb {Z}^d\) given by \(S_j(x_1,\ldots ,x_d):=(x_1,\ldots ,x_j-1,\ldots ,x_d)\). We denote the corresponding averaging operators by

$$\begin{aligned} A_t^{\mathcal {P},k',k''} f(x) = \frac{1}{\vartheta _\Omega (t)} \sum _{(n,p)\in \mathbb {Z}^{k'}\times (\pm \mathbb {P})^{k''}} f\big (x - \mathcal {P}(n, p)\big )\mathbbm {1}_{{\Omega _t}}(n, p) \left( \prod _{j = 1}^{k''} \log |p_j|\right) \end{aligned}$$

and

$$\begin{aligned} H_t^{\mathcal {P},k',k''} f(x) = \sum _{(n,p)\in \mathbb {Z}^{k'}\times (\pm \mathbb {P})^{k''}} f\big (x - \mathcal {P}(n, p)\big ) K(n, p) \mathbbm {1}_{{\Omega _t}}(n, p) \left( \prod _{j = 1}^{k''} \log |p_j|\right) . \end{aligned}$$

Moreover, by a standard lifting argument, it suffices to prove Theorem 1 for a canonical case of the polynomial mapping \(\mathcal {P}\). Let \(\mathcal {P}\) be a polynomial mapping as in (1.1). We define

$$\begin{aligned} \textrm{deg}\, \mathcal {P}:=\max \{\textrm{deg}\, \mathcal {P}_j: 1\le j\le d\} \end{aligned}$$

and consider the set of multi-indices

$$\begin{aligned} \Gamma :=\Bigg \{\gamma \in \mathbb {N}_0^k{\setminus }\{0\}: 0<|\gamma |\le \textrm{deg}\, \mathcal {P}\Bigg \} \end{aligned}$$

equipped with the lexicographic order. We define the canonical polynomial mapping by

$$\begin{aligned} \mathbb {R}^k\ni x=(x_1,\dots ,x_k)\mapsto \mathcal {Q}(x):=(x^\gamma :\gamma \in \Gamma )\in \mathbb {R}^\Gamma , \end{aligned}$$
(2.6)

where \(x^\gamma =x_1^{\gamma _1}x_2^{\gamma _2}\cdots x_k^{\gamma _k}\). By invoking the lifting procedure described in [25, Lemma 2.2] (see also [35, Section 11]), the following implies Theorem 1.

Theorem 4

Let \(k\in \mathbb {N}\), let \(\Gamma \subset \mathbb {N}^{k} {\setminus } \{0\}\) be a nonempty finite set, and let \(k',k''\in \{0,1,\ldots ,k\}\) with \(k'+k''=k\). Let \(M_t^{k',k''}\) be either \(A_t^{\mathcal {Q},k',k''}\) or \(H_t^{\mathcal {Q},k',k''}\). For any \(p\in (1,\infty )\), there is a constant \(C_{p,k,|\Gamma |}>0\) such that

$$\begin{aligned} \mathcal {S}_{\mathbb {Z}^\Gamma }^p(M_t^{k',k''}f:t>0)\le C_{p,k,|\Gamma |}\Vert f\Vert _{\ell ^p(\mathbb {Z}^\Gamma )}. \end{aligned}$$
(2.7)
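
To illustrate the canonical mapping (2.6), consider the hypothetical case \(k=2\) and \(\textrm{deg}\, \mathcal {P}=2\), chosen only as an example. Listing multi-indices lexicographically with the first coordinate most significant (an assumed convention), we get

$$\begin{aligned} \Gamma =\{(0,1),(0,2),(1,0),(1,1),(2,0)\},\qquad \mathcal {Q}(x_1,x_2)=(x_2,\, x_2^2,\, x_1,\, x_1x_2,\, x_1^2)\in \mathbb {R}^\Gamma . \end{aligned}$$

Every polynomial of degree at most 2 in two variables that vanishes at the origin is a linear combination of the coordinates of \(\mathcal {Q}\). For instance, \(\mathcal {P}_1(x_1,x_2)=x_1^2+3x_1x_2\) corresponds to the coefficient vector \((0,0,0,3,1)\). This linear dependence on \(\mathcal {Q}\) is what the lifting procedure exploits.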

2.4 Fourier transform and Ionescu–Wainger multiplier theorem

Let \(\mathbb {G}=\mathbb {R}^d\) or \(\mathbb {G}=\mathbb {Z}^d\) and let \(\mathbb {G}^*\) denote the dual group of \(\mathbb {G}\). For every \(z\in \mathbb {C}\), we set \(\varvec{e}(z):=e^{2\pi {\varvec{i}} z}\), where \({\varvec{i}}^2=-1\). Let \(\mathcal {F}_{\mathbb {G}}\) denote the Fourier transform on \(\mathbb {G}\) defined for any \(f \in L^1(\mathbb {G})\) by

$$\begin{aligned} \mathcal {F}_{\mathbb {G}} f(\xi ) := \int _{\mathbb {G}} f(x) \varvec{e}(x\cdot \xi ) \textrm{d}\mu (x),\quad \xi \in \mathbb {G}^*, \end{aligned}$$

where \(\mu \) is the usual Haar measure on \(\mathbb {G}\). For any bounded function \(\mathfrak m:\mathbb {G}^*\rightarrow \mathbb {C}\), we define the corresponding Fourier multiplier operator by

$$\begin{aligned} T_{\mathbb {G}}[\mathfrak m]f(x):=\int _{\mathbb {G}^*}\varvec{e}(-\xi \cdot x)\mathfrak m(\xi )\mathcal {F}_{\mathbb {G}}f(\xi )\textrm{d}\xi , \quad x\in \mathbb {G}. \end{aligned}$$
(2.8)

Here, we assume that \(f:\mathbb {G}\rightarrow \mathbb {C}\) is a compactly supported function on \(\mathbb {G}\) (and smooth if \(\mathbb {G}=\mathbb {R}^d\)) or any other function for which (2.8) makes sense.
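
As a consistency check of the sign conventions in (2.8), here is a sketch for \(\mathbb {G}=\mathbb {Z}^d\), \(\mathbb {G}^*=\mathbb {T}^d\). Fix \(y\in \mathbb {Z}^d\) (an auxiliary point) and let \(\mathfrak m(\xi ):=\varvec{e}(\xi \cdot y)\). For finitely supported \(f\), we have

$$\begin{aligned} T_{\mathbb {Z}^d}[\mathfrak m]f(x)&=\int _{\mathbb {T}^d}\varvec{e}(-\xi \cdot x)\,\varvec{e}(\xi \cdot y)\sum _{z\in \mathbb {Z}^d}f(z)\,\varvec{e}(z\cdot \xi )\,\textrm{d}\xi \\&=\sum _{z\in \mathbb {Z}^d}f(z)\int _{\mathbb {T}^d}\varvec{e}\big (\xi \cdot (z+y-x)\big )\,\textrm{d}\xi =f(x-y), \end{aligned}$$

since the inner integral equals 1 if \(z=x-y\) and 0 otherwise. Thus a weighted sum of translations \(f(x-y)\) corresponds to the multiplier given by the same weighted sum of the exponentials \(\varvec{e}(\xi \cdot y)\).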

An indispensable tool in the proof of Theorem 4 is the vector-valued Ionescu–Wainger multiplier theorem from [28, Section 2] with an improvement by Tao [36].

Theorem 5

For every \(\varrho >0\), there exists a family \((P_{\le N})_{N\in \mathbb {N}}\) of subsets of \(\mathbb {N}\) such that:

  1. (i)

    \(\mathbb {N}_N\subseteq P_{\le N}\subseteq \mathbb {N}_{\max \{N, e^{N^{\varrho }}\}}\).

  2. (ii)

    If \(N_1\le N_2\), then \(P_{\le N_1}\subseteq P_{\le N_2}\).

  3. (iii)

    If \(q \in P_{\le N}\), then all factors of q also lie in \(P_{\le N}\).

  4. (iv)

    \(\textrm{lcm}(P_{\le N}) \le 3^N\).

Furthermore, for every \(p \in (1,\infty )\), there exists \(0<C_{p, \varrho , |\Gamma |}<\infty \) such that, for every \(N\in \mathbb {N}\), the following holds:

Let \(0<\varepsilon _N \le e^{-N^{2\varrho }}\) and let \(\textbf{Q}:=[-1/2, 1/2)^\Gamma \) be a unit cube. Let \(\mathfrak {m}:\mathbb {R}^{\Gamma } \rightarrow L(H_0,H_1)\) be a measurable function supported on \(\varepsilon _{N}\textbf{Q}\) taking values in \(L(H_{0},H_{1})\), the space of bounded linear operators between separable Hilbert spaces \(H_{0}\) and \(H_{1}\). Let \(0 \le \textbf{A}_{p} \le \infty \) denote the smallest constant such that

$$\begin{aligned} \Bigg \Vert T_{\mathbb {R}^\Gamma }[\mathfrak {m}]f\Bigg \Vert _{L^{p}(\mathbb {R}^{\Gamma };H_1)} \le \textbf{A}_{p} \Vert f\Vert _{L^{p}(\mathbb {R}^{\Gamma };H_0)} \end{aligned}$$

for every function \(f\in L^2(\mathbb {R}^\Gamma ;H_0)\cap L^{p}(\mathbb {R}^\Gamma ;H_0)\). Then, the multiplier

$$\begin{aligned} \Delta _N(\xi ):=\sum _{b \in \Sigma _{\le N}} \mathfrak {m}(\xi - b), \end{aligned}$$

where \(\Sigma _{\le N}\) is defined by

$$\begin{aligned} \Sigma _{\le N}:= \Bigg \{ \frac{a}{q}\in \mathbb {Q}^\Gamma \cap \mathbb {T}^\Gamma : q \in P_{\le N}\text { and } \textrm{gcd}(a, q)=1\Bigg \}, \end{aligned}$$

satisfies

$$\begin{aligned} \Bigg \Vert T_{\mathbb {Z}^\Gamma }[\Delta _{N}]f\Bigg \Vert _{\ell ^p(\mathbb {Z}^{\Gamma };H_1)} \le C_{p,\varrho ,|\Gamma |} (\log N) \textbf{A}_{p} \Vert f\Vert _{\ell ^p(\mathbb {Z}^{\Gamma };H_0)} \end{aligned}$$
(2.9)

for every \(f\in \ell ^p(\mathbb {Z}^\Gamma ;H_0)\) (cf. [36, Theorem 1.4], which removes the factor of \(\log N\) in inequality (2.9)).

3 Preliminaries

3.1 General results

In this section, we present some general results concerning the behavior of exponential sums. The following proposition enhances Trojan’s variant of Weyl’s inequality [37, Theorem 2]: it allows us to estimate exponential sums involving a possibly non-differentiable function \(\phi \) (cf. [28, Theorem A.1]).

Proposition 6

(Weyl’s inequality) Let \(\alpha >0\), \(k\in \mathbb {N}\), and let \(\Gamma \subset \mathbb {N}^{k} {{\setminus }} \{0\}\) be a nonempty finite set. Let \(\Omega '\subseteq \Omega \subseteq B(0,N)\subset \mathbb {R}^k\) be convex sets and let \(\phi :\Omega \cap \mathbb {Z}^{k}\rightarrow \mathbb {C}\). There is \(\beta _\alpha >0\) such that, for any \(\beta > \beta _\alpha \), if there is a multi-index \(\gamma _0\in \Gamma \) with

$$\begin{aligned} \Big |\xi _{\gamma _0} - \frac{a}{q}\Big | \le \frac{1}{q^2} \end{aligned}$$

for some coprime integers a and q with \(1\le a\le q\) and \((\log N)^\beta \le q\le N^{|\gamma _0|}(\log N)^{-\beta }\), then

$$\begin{aligned}&\left| {\sum _{(n,p)\in \mathbb {Z}^{k'}\times (\pm \mathbb {P})^{k''}}\varvec{e}\big (\xi \cdot \mathcal {Q}(n,p)\big )\phi (n,p) \mathbbm {1}_{\Omega {\setminus }\Omega '}(n,p)}\right| \lesssim N^k\log (N)^{-\alpha }\Vert \phi \Vert _{L^\infty (\Omega {\setminus }\Omega ')} \\&\quad +N^{k} \sup _{\begin{array}{c} |x-y|\le N(\log N)^{-\alpha }\\ x,y\in \Omega {\setminus }\Omega ' \end{array}} |\phi (x)-\phi (y)|. \end{aligned}$$

The implicit constant is independent of the function \(\phi \), the variable \(\xi \), the sets \(\Omega ,\Omega '\), and the numbers a, q, and N.

Proof

We define \(\tilde{\phi }(n,p,A):=\phi (n,p)\mathbbm {1}_{(\Omega {{\setminus }}\Omega ')\cap A}(n,p)\). We partition the cube \([-N,N]^k\) into \(J\lesssim \log (N)^{k\alpha }\) cubes \(Q_j\) with disjoint interiors and side lengths \(C N\log (N)^{-\alpha }\) for some constant \(C>0\). Since \(\mathbbm {1}_{\Omega }(x)\mathbbm {1}_{\Omega {\setminus }\Omega '}(x)=\mathbbm {1}_{\Omega {\setminus }\Omega '}(x)\) for any \(x\in \mathbb {R}^k\), we have

$$\begin{aligned}&\left| {\sum _{(n,p)}\varvec{e}\big (\xi \cdot \mathcal {Q}(n,p)\big )\tilde{\phi }(n,p,\Omega )\mathbbm {1}_{\Omega }(n,p)}\right| \nonumber \\&\quad \lesssim \sum _{j=1}^J\left| {\sum _{(n,p)}\varvec{e}\big (\xi \cdot \mathcal {Q}(n,p)\big )\tilde{\phi }(n,p,Q_j)\mathbbm {1}_{\Omega \cap Q_j}(n,p)}\right| , \end{aligned}$$
(3.1)

where all sums are taken over \((n,p)\in \mathbb {Z}^{k'}\times (\pm \mathbb {P})^{k''}\). Let \((m_j,p_j)\) be a fixed element of \(Q_j\cap \Omega {\setminus }\Omega '\). We estimate the right hand side of (3.1) by

$$\begin{aligned}&\sum _{j=1}^J\left| {\sum _{(n,p)}\varvec{e}\Big (\xi \cdot \mathcal {Q}(n,p)\Big ) \tilde{\phi }(m_j,p_j,Q_j)\mathbbm {1}_{\Omega \cap Q_j}(n,p)}\right| \\&\quad +\sum _{j=1}^J\left| {\sum _{(n,p)}\varvec{e}\big (\xi \cdot \mathcal {Q}(n,p)\big )\Bigg (\tilde{\phi }(m_j,p_j,Q_j) -\tilde{\phi }(n,p,Q_j)\Bigg )\mathbbm {1}_{\Omega \cap Q_j}(n,p)}\right| . \end{aligned}$$

By Trojan’s variant of Weyl’s inequality [37, Theorem 2], the first term is bounded by

$$\begin{aligned} J N^k(\log N)^{-\alpha '}\Vert \phi \Vert _{L^\infty (\Omega {\setminus }\Omega ')}\lesssim N^k(\log N)^{k\alpha -\alpha '}\Vert \phi \Vert _{L^\infty (\Omega {\setminus }\Omega ')} \end{aligned}$$
(3.2)

for any \(\alpha '>0\). Since \(\mathbbm {1}_{\Omega \cap Q_j}(n,p)=\mathbbm {1}_{\Omega '\cap Q_j}(n,p) +\mathbbm {1}_{(\Omega {\setminus }\Omega ')\cap Q_j}(n,p)\), the second term is bounded by

$$\begin{aligned} N^k(\log N)^{k\alpha -\alpha '}\Vert \phi \Vert _{L^\infty (\Omega {\setminus }\Omega ')} +N^k\sup _{\begin{array}{c} |x-y|\le N(\log N)^{-\alpha }\\ x,y\in \Omega {\setminus }\Omega ' \end{array}} |\phi (x)-\phi (y)|. \end{aligned}$$
(3.3)

Choosing \(\alpha '\ge (k+1)\alpha \) in (3.2) and (3.3), so that \(k\alpha -\alpha '\le -\alpha \), yields the claim. \(\square \)

The next result is a generalization of [37, Proposition 4.1] and [37, Proposition 4.2] in the spirit of [28, Proposition 4.18]. For \(q\in \mathbb {N}\) and \(a\in \mathbb {N}_q^\Gamma \) with \(\textrm{gcd}(a,q)=1\), the Gaussian sum related to the polynomial mapping \(\mathcal {Q}\) is given by

$$\begin{aligned} G(a/q) := \frac{1}{q^{k'}} \frac{1}{\varphi (q)^{k''}} \sum _{x \in \mathbb {N}^{k'}_q} \sum _{y \in A_q^{k''}} \varvec{e}\big ((a/q)\cdot \mathcal {Q}(x, y)\big ), \end{aligned}$$
(3.4)

where \(A_q:=\{a\in \mathbb {N}_q:\textrm{gcd}(a,q)=1\}\) and \(\varphi \) is Euler’s totient function. There is \(\delta > 0\) such that

$$\begin{aligned} \Bigg |G(a/q) \Bigg | \lesssim q^{-\delta }, \end{aligned}$$
(3.5)

according to [37, Theorem 3].
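
Two classical special cases, included only as illustration and not used later, show what (3.5) looks like; the parameter choices are hypothetical and \(\mu \) denotes the Möbius function. If \(k=k'=1\), \(\Gamma =\{2\}\), and q is odd, then (3.4) is a normalized quadratic Gauss sum, and if \(k=k''=1\), \(\Gamma =\{1\}\), then (3.4) is a normalized Ramanujan sum:

$$\begin{aligned} \bigg |\frac{1}{q}\sum _{x\in \mathbb {N}_q}\varvec{e}\big (ax^2/q\big )\bigg |=q^{-1/2} \qquad \text {and}\qquad \frac{1}{\varphi (q)}\sum _{y\in A_q}\varvec{e}\big (ay/q\big )=\frac{\mu (q)}{\varphi (q)}, \end{aligned}$$

whenever \(\textrm{gcd}(a,q)=1\). In the first case (3.5) holds with \(\delta =1/2\). In the second case it holds with any \(\delta <1\), since \(\varphi (q)\gtrsim _\varepsilon q^{1-\varepsilon }\) for every \(\varepsilon >0\).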

Lemma 7

Let \(N\in \mathbb {N}\) and let \(\Omega \subseteq B(0,N)\subset \mathbb {R}^k\) be a convex set or a Boolean combination of finitely many convex sets. Let \(\mathcal {K}:\mathbb {R}^k\rightarrow \mathbb {C}\) be a continuous function supported in \(\Omega \). Then, for each \(\beta >0\), there is a constant \(c = c_{\beta }>0\) such that, for any \(q \in \mathbb {N}\) with \(1 \le q \le (\log N)^{\beta }\), \(a \in A_q\), and \(\xi = a/q + \theta \in \mathbb {R}^\Gamma \), we have

$$\begin{aligned}&\left| \sum _{(n,p)\in \mathbb {Z}^{k'}\times (\pm \mathbb {P})^{k''}}\varvec{e}\big (\xi \cdot \mathcal {Q}(n,p)\big )\mathcal {K}(n,p)\left( \prod _{i=1}^{k''}\log |p_i|\right) -G(a/q)\int _{\Omega }\varvec{e}\big ((\xi -a/q)\cdot \mathcal {Q}(t)\big )\mathcal {K}(t)\textrm{d }t\right| \\&\quad \lesssim \Bigg [N^{k-1}\Vert \mathcal {K}\Vert _{L^{\infty }(\Omega )} \big (1+\sum _{\gamma \in \Gamma } |\theta _\gamma | N^{|\gamma |} \big )+ N^k\sup _{\begin{array}{c} x,y \in \Omega \\ |x-y| \le q\sqrt{k} \end{array}} |\mathcal {K}(x) - \mathcal {K}(y)|\Bigg ]N\exp \Bigg (-c\sqrt{\log N}\Bigg ). \end{aligned}$$

The implied constant is independent of \(N,a,q,\xi \) and the kernel \(\mathcal {K}\).

Proof

The case when \(k=k'\) was proven in [28, Proposition 4.18], so we assume that \(k>k'\). Observe that, for a prime number p, \(p \mid q\) if and only if \((p \bmod q, q) > 1\). Hence, for each \(s \in \{1, \ldots , k''\}\), we have

$$\begin{aligned} \begin{aligned}&\left| \sum _{n \in \mathbb {N}_0^{k'}} \sum _{\begin{array}{c} r'' \in \mathbb {N}_q^{k''}\\ (r_s'', q) > 1 \end{array}} \sum _{\begin{array}{c} p \in \mathbb {P}^{k''} \\ p \equiv r'' \bmod q \end{array}} \varvec{e}\big (\xi \cdot \mathcal {Q}(n, p)\big ) \mathcal {K}(n, p) \bigg (\prod _{j=1}^{k''} \log p_j\bigg ) \right| \\&\quad \lesssim N^{k-1} \Vert \mathcal {K}\Vert _{L^{\infty }(\Omega )} \sum _{p \mid q}\log p \lesssim N^{k-1} \Vert \mathcal {K}\Vert _{L^{\infty }(\Omega )}\log q \lesssim N^{k-1} \Vert \mathcal {K}\Vert _{L^{\infty }(\Omega )}\log \log (N). \end{aligned} \end{aligned}$$
(3.6)

To simplify the notation, for \((x,y) \in \mathbb {R}^k {\setminus } \{0\}\), we set \(F(x,y):= \varvec{e}(\theta \cdot \mathcal {Q}(x, y))\mathcal {K}(x, y).\) For \((n, p) \in \mathbb {N}^{k'} \times \mathbb {P}^{k''}\) with \(n \equiv r' \bmod q\) and \(p \equiv r'' \bmod q\), we have

$$\begin{aligned} \begin{aligned} \xi _\gamma n^{\gamma '} p^{\gamma ''} \equiv \frac{a_\gamma }{q} n^{\gamma '} p^{\gamma ''} + \theta _\gamma n^{\gamma '} p^{\gamma ''} \equiv \frac{a_\gamma }{q} (r')^{\gamma '} (r'')^{\gamma ''} + \theta _\gamma n^{\gamma '} p^{\gamma ''} \pmod 1. \end{aligned} \end{aligned}$$

Therefore, we have \(\varvec{e}(\xi \cdot \mathcal {Q}(n, p)) = \varvec{e}((a/q) \cdot \mathcal {Q}(r', r''))\varvec{e}(\theta \cdot \mathcal {Q}(n, p)),\) and hence

$$\begin{aligned} \begin{aligned}&\sum _{n \in \mathbb {N}_0^{k'}} \sum _{p \in \mathbb {P}^{k''}} \varvec{e}\big (\xi \cdot \mathcal {Q}(n, p)\big ) \mathcal {K}(n, p) \left( \prod _{j = 1}^{k''} \log p_j\right) \\&\quad = \sum _{r' \in \mathbb {N}^{k'}_q} \sum _{r'' \in A_q^{k''}} \varvec{e}\big ((a/q) \cdot \mathcal {Q}(r', r'')\big ) \sum _{\begin{array}{c} n \in \mathbb {N}_0^{k'} \\ n \equiv r' \bmod q \end{array}} \sum _{\begin{array}{c} p \in \mathbb {P}^{k''} \\ p \equiv r'' \bmod q \end{array}} F(n,p) \left( \prod _{j = 1}^{k''} \log p_j\right) \\&\qquad +\mathcal {O}\Bigg (N^{k-1}\Vert \mathcal {K}\Vert _{L^{\infty }(\Omega )} \log \log N\Bigg ), \end{aligned} \end{aligned}$$
(3.7)

where the error term is the cost of summing \(r''\) over \(A_q^{k''}\) instead of \(\mathbb {N}_q^{k''}\). Fix \(u \in \mathbb {N}^{k'}\), \(\tilde{p} \in \mathbb {P}^{k''-1}\), and \(r_1'' \in A_q\). Then \(\left\{ v \in \mathbb {N}: (u, v, \tilde{p}) \in \Omega \right\} = \left\{ V_0+1, \ldots , V_1\right\} \) for some \(0 \le V_0 \le V_1 \le N\). By partial summation, we obtain

$$\begin{aligned} \begin{aligned}&\sum _{\begin{array}{c} p_1 \in \mathbb {P}_{V_1} {\setminus } \mathbb {P}_{V_0} \\ p_1 \equiv r_1'' \bmod q \end{array}} F(u,p_1, \tilde{p}) \log p_1 = \sum _{\begin{array}{c} v_1 = V_0+1 \\ v_1 \equiv r_1'' \bmod q \end{array}}^{V_1} F(u,v_1, \tilde{p}) \mathbbm {1}_{{\mathbb {P}}}(v_1) \log v_1 \\&\quad = \vartheta (V_1; q, r''_1) F(u,V_1,\tilde{p}) - \vartheta (V_0; q, r''_1) F(u,V_0+1,\tilde{p}) \\&\qquad - \sum _{v_1 = V_0+1}^{V_1-1} \vartheta (v_1; q, r_1'') [F(u,v_1+1,\tilde{p}) - F(u,v_1,\tilde{p})] , \end{aligned} \end{aligned}$$
(3.8)

where, for \(x \ge 1\), we have set

$$\begin{aligned} \vartheta (x; q, r):= \sum _{\begin{array}{c} p \in \mathbb {P}_x \\ p \equiv r \bmod q \end{array}} \log p. \end{aligned}$$
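
The identity (3.8) is a summation by parts. As a sketch, abbreviate \(F(v):=F(u,v,\tilde{p})\) and \(\vartheta (v):=\vartheta (v;q,r_1'')\), with \(\vartheta (0):=0\); these abbreviations are used only here. For integers \(v\ge 1\), the weight \(\mathbbm {1}_{\mathbb {P}}(v)\log v\) restricted to \(v\equiv r_1''\bmod q\) equals \(\vartheta (v)-\vartheta (v-1)\). Hence

$$\begin{aligned} \sum _{v=V_0+1}^{V_1}F(v)\big (\vartheta (v)-\vartheta (v-1)\big )&=\sum _{v=V_0+1}^{V_1}F(v)\vartheta (v)-\sum _{v=V_0}^{V_1-1}F(v+1)\vartheta (v)\\&=\vartheta (V_1)F(V_1)-\vartheta (V_0)F(V_0+1)-\sum _{v=V_0+1}^{V_1-1}\vartheta (v)\big [F(v+1)-F(v)\big ]. \end{aligned}$$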

Similarly, we have

$$\begin{aligned}&\sum _{v_1 = V_0 + 1}^{V_1} F(u,v_1, \tilde{p}) = V_1 F(u,V_1,\tilde{p}) - V_0 F(u,V_0+1,\tilde{p}) \\&\quad - \sum _{v_1 = V_0+1}^{V_1-1} v_1 [F(u,v_1+1,\tilde{p}) - F(u,v_1,\tilde{p})]. \end{aligned}$$
(3.9)

Furthermore, in view of the Siegel–Walfisz theorem ([32, 39], see also [29, Corollary 11.21]), there are \(C, c' > 0\) such that, for all \(x \ge 1\), \((r, q) = 1\), and \(1 \le q \le (\log x)^{\beta '}\),

$$\begin{aligned} \bigg |\vartheta (x; q, r) - \frac{x}{\varphi (q)} \bigg | \le C x \exp \Bigg (-c' \sqrt{\log x}\Bigg ). \end{aligned}$$
(3.10)

Hence, by (3.8), (3.9), and (3.10), we obtain

$$\begin{aligned}&\left| \sum _{\begin{array}{c} p_1 \in \mathbb {P}_{V_1}{\setminus }\mathbb {P}_{V_0} \\ p_1 \equiv r_1'' \bmod q \end{array}} F(u, p_1, \tilde{p}) \log p_1 - \frac{1}{\varphi (q)} \sum _{v_1 = V_0 +1}^{V_1} F(u, v_1, \tilde{p}) \right| \\&\qquad \lesssim \Vert \mathcal {K}\Vert _{L^{\infty } (\Omega )} \bigg |\vartheta (V_1; q, r''_1) - \frac{V_1}{\varphi (q)}\bigg | + \Vert \mathcal {K}\Vert _{L^{\infty } (\Omega )} \bigg |\vartheta (V_0; q, r''_1) - \frac{V_0}{\varphi (q)}\bigg |\\&\quad \qquad + \left[ \Vert \mathcal {K}\Vert _{L^\infty (\Omega )}\sum _{\gamma \in \Gamma } |\theta _\gamma | N^{|\gamma |-1} + \sup _{\begin{array}{c} x,y \in \Omega \\ |x-y| \le 1 \end{array}} |\mathcal {K}(x) - \mathcal {K}(y)|\right] \sum _{v_1 = V_0+1}^{V_1-1} \bigg |\vartheta (v_1; q, r''_1) - \frac{v_1}{\varphi (q)} \bigg | \\&\qquad \lesssim \left[ \Vert \mathcal {K}\Vert _{L^{\infty }(\Omega )} \Bigg (1 +\sum _{\gamma \in \Gamma } |\theta _\gamma | N^{|\gamma |} \Bigg ) + N\sup _{\begin{array}{c} x,y \in \Omega \\ |x-y| \le 1 \end{array}} |\mathcal {K}(x) - \mathcal {K}(y)|\right] N\exp \Bigg (-c'\sqrt{\log N}\Bigg ). \end{aligned}$$

Similar arguments applied to the sums over \(p_2, \ldots , p_{k''}\) give

$$\begin{aligned} \begin{aligned}&\left| \sum _{\begin{array}{c} u \in \mathbb {N}_0^{k'} \\ u \equiv r' \bmod q \end{array}} \sum _{\begin{array}{c} p \in \mathbb {P}^{k''} \\ p \equiv r'' \bmod q \end{array}} F(u,p) \bigg (\prod _{j = 1}^{k''} \log p_j \bigg ) - \frac{1}{\varphi (q)^{k''}} \sum _{u \in \mathbb {N}_0^{k'}} \sum _{v \in \mathbb {N}^{k''}} F(qu+r',v) \right| \\&\quad \lesssim \left[ N^{k-1}\Vert \mathcal {K}\Vert _{L^{\infty }(\Omega )} \Bigg (1 + \sum _{\gamma \in \Gamma } |\theta _\gamma | N^{|\gamma |} \Bigg ) + N^k\sup _{\begin{array}{c} x,y \in \Omega \\ |x-y| \le 1 \end{array}} |\mathcal {K}(x) - \mathcal {K}(y)|\right] N\exp \Bigg (-c'\sqrt{\log N}\Bigg ). \end{aligned} \end{aligned}$$
(3.11)

Let \(\Omega _+:= \Omega \cap [0,\infty )^{k'} \times [1,\infty )^{k''}\). We can estimate the sum by an integral by writing

$$\begin{aligned} \begin{aligned}&\left| q^{k'}\sum _{u \in \mathbb {N}_0^{k'}}\sum _{v \in \mathbb {N}^{k''}} F(qu+r',v) - \iint _{\Omega _+} F(s,t) \text {d}s \text {d}t \right| \\&\quad = \left| q^{k'}\sum _{u \in \mathbb {N}_0^{k'}}\sum _{v \in \mathbb {N}^{k''}} F(qu+r',v) - \sum _{u \in \mathbb {N}_0^{k'}} \sum _{v \in \mathbb {N}^{k''}} \int _{qu+[0,q)^{k'}}\int _{v+[0,1)^{k''}} F(s,t) \text {d}s \text {d}t \right| \\&\quad \le \sum _{u \in \mathbb {N}_0^{k'}}\sum _{v \in \mathbb {N}^{k''}} \int _{[0,q)^{k'}}\int _{[0,1)^{k''}} |F(qu+r',v) - F(qu+s,v+t)| \text {d}s \text {d}t. \end{aligned} \end{aligned}$$
(3.12)

We use three estimates to control this:

$$\begin{aligned}&|\varvec{e}\big (\theta \cdot \mathcal {Q}(qu+r',v)\big ) - \varvec{e}\big (\theta \cdot \mathcal {Q}(qu+s,v+t)\big )| \lesssim \sum _{\gamma \in \Gamma } q|\theta _\gamma |N^{|\gamma |-1},\\&\quad |\mathcal {K}(qu+r',v)-\mathcal {K}(qu+s,v+t)| \lesssim \sup _{\begin{array}{c} x,y\in \Omega \\ |x-y|\le q\sqrt{k} \end{array}} |\mathcal {K}(x) - \mathcal {K}(y)|,\\&\quad \sum _{u \in \mathbb {N}_0^{k'}} \sum _{v \in \mathbb {N}^{k''}} |\mathbbm {1}_{{\Omega }}(qu+r',v) - \mathbbm {1}_{{\Omega }}(qu+s,v+t)| \lesssim (N/q)^{k-1}, \end{aligned}$$

where the last inequality is a consequence of [28, Proposition 4.16], which gives that the number of lattice points in \(\Omega \) at a distance \(<q\) from the boundary of \(\Omega \) is \(\mathcal {O}(q N^{k-1})\). We therefore get a bound for (3.12) of the form

$$\begin{aligned} \mathcal {O}\left( qN^{k-1}\Vert \mathcal {K}\Vert _{L^{\infty }(\Omega )}\bigg [1 + \sum _{\gamma \in \Gamma } |\theta _\gamma |N^{|\gamma |} \bigg ] + N^k\sup _{\begin{array}{c} x,y\in \Omega \\ |x-y|\le q\sqrt{k} \end{array}} |\mathcal {K}(x) - \mathcal {K}(y)|\right) . \end{aligned}$$

Applying this in (3.11) and combining the error terms appropriately gives

$$\begin{aligned}&\left| \sum _{\begin{array}{c} u \in \mathbb {N}_0^{k'} \\ u \equiv r' \bmod q \end{array}} \sum _{\begin{array}{c} p \in \mathbb {P}^{k''} \\ p \equiv r'' \bmod q \end{array}} F(u,p) \bigg (\prod _{j = 1}^{k''} \log p_j \bigg ) - \frac{1}{q^{k'}} \frac{1}{\varphi (q)^{k''}} \iint _{\Omega _+} F(s,t) \text {d}s \text {d}t \right| \\&\quad \lesssim \left[ N^{k-1}\Vert \mathcal {K}\Vert _{L^{\infty }(\Omega )} \bigg (1 +\sum _{\gamma \in \Gamma } |\theta _\gamma | N^{|\gamma |} \bigg ) +N^k\sup _{\begin{array}{c} x,y \in \Omega \\ |x-y| \le q\sqrt{k} \end{array}} |\mathcal {K}(x) - \mathcal {K}(y)|\right] N\exp \left( -c'\sqrt{\log N}\right) . \end{aligned}$$

Applying this in (3.7) by summing in \(r'\) and \(r''\) together with (3.6) gives

$$\begin{aligned} \begin{aligned}&\left| \sum _{(n,p)\in \mathbb {N}_0^{k'}\times \mathbb {P}^{k''}}\varvec{e}\Big (\xi \cdot \mathcal {Q}(n,p)\Big )\mathcal {K}(n,p) \mathbbm {1}_{\Omega }(n,p)\Bigg (\prod _{i=1}^{k''}\log |p_i|\Bigg )\right. \\&\quad \left. -G(a/q)\int _{\Omega _+} \varvec{e}\Big ((\xi -a/q)\cdot \mathcal {Q}(t)\Big )\mathcal {K}(t)\textrm{d }t\right| \\&\quad \lesssim \left[ N^{k-1}\Vert \mathcal {K}\Vert _{L^{\infty }(\Omega )} \big (1 +\sum _{\gamma \in \Gamma } |\theta _\gamma | N^{|\gamma |} \big ) + N^k\sup _{\begin{array}{c} x,y \in \Omega \\ |x-y| \le q\sqrt{k} \end{array}} |\mathcal {K}(x) - \mathcal {K}(y)|\right] N\exp \Big (-c\sqrt{\log N}\Big ) \end{aligned} \end{aligned}$$
(3.13)

for any \(c < c'\). In simplifying to get the error term above, note that \(q^{k'} \varphi (q)^{k''} \le q^{k} \le (\log N)^{\beta k}\) and

$$\begin{aligned} (\log N)^{\beta k}\exp \left( -c'\sqrt{\log N}\right) \lesssim \exp \left( -c\sqrt{\log N}\right) . \end{aligned}$$
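
A brief sketch of why this absorption works, with the auxiliary variable \(x:=\sqrt{\log N}\) and \(N\ge 3\): since \(\log N=x^2\), we have \((\log N)^{\beta k}=\exp (2\beta k\log x)\), and the logarithm grows more slowly than any positive multiple of x, so \(2\beta k\log x\le (c'-c)x+C_0\) for some constant \(C_0=C_0(\beta ,k,c'-c)\) and all \(x\ge 1\). Therefore

$$\begin{aligned} (\log N)^{\beta k}\exp \left( -c'\sqrt{\log N}\right) =\exp \big (2\beta k\log x-c'x\big )\le e^{C_0}\exp \left( -c\sqrt{\log N}\right) . \end{aligned}$$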

Finally, we can enlarge the range of integration in (3.13) from \(\Omega _+\) to \(\Omega \cap [0,\infty )^{k}\), because

$$\begin{aligned} G(a/q)\int _{\Omega \cap [0,\infty )^{k'}\times [0,1)^{k''}}\varvec{e}\big ((\xi -a/q)\cdot \mathcal {Q}(t)\big )\mathcal {K}(t)\textrm{d}t \end{aligned}$$

is bounded by \(N^{k'}\Vert \mathcal {K}\Vert _{L^\infty (\Omega )}\le N^{k-1}\Vert \mathcal {K}\Vert _{L^\infty (\Omega )}\).

We can repeat the entire proof replacing \(\mathbb {N}_0\) with \(-\mathbb {N}_0\) and/or \(\mathbb {P}\) with \(-\mathbb {P}\) in all the \(2^k\) many possible combinations thereof in \(\mathbb {N}_0^{k'} \times \mathbb {P}^{k''}\). Then, collecting all of the error terms yields the claim. \(\square \)

3.2 Multipliers for the averaging operators

For a function \(f:\mathbb {Z}^\Gamma \rightarrow \mathbb {C}\) with finite support, we have

$$\begin{aligned} A_t^{\mathcal {Q},k',k''}f(x) = T_{\mathbb {Z}^\Gamma }[\mathfrak {m}_{t}]f(x) \quad \text {and} \quad H_t^{\mathcal {Q},k',k''}f(x) = T_{\mathbb {Z}^\Gamma }[\mathfrak {n}_{t}]f(x) \end{aligned}$$

for the discrete Fourier multipliers

$$\begin{aligned} \mathfrak {m}_{t}(\xi ):=\frac{1}{\vartheta _\Omega (t)}\sum _{(n,p)\in \mathbb {Z}^{k'}\times (\pm \mathbb {P})^{k''}} \varvec{e}\big (\xi \cdot \mathcal {Q}(n,p)\big )\mathbbm {1}_{\Omega _t}(n,p)\left( \prod _{i=1}^{k''}\log |p_i|\right) ,\quad \xi \in \mathbb {T}^\Gamma , \end{aligned}$$

and

$$\begin{aligned} \mathfrak {n}_t(\xi ):=\sum _{(n,p)\in \mathbb {Z}^{k'}\times (\pm \mathbb {P})^{k''}}\varvec{e}\big (\xi \cdot \mathcal {Q}(n,p)\big )K(n,p) \mathbbm {1}_{\Omega _t}(n,p)\left( \prod _{i=1}^{k''}\log |p_i|\right) ,\quad \xi \in \mathbb {T}^\Gamma . \end{aligned}$$

Their continuous counterparts are given by

$$\begin{aligned} \Phi _t(\xi ):=\frac{1}{|\Omega _t|}\int _{\Omega _t}\varvec{e}\big (\xi \cdot \mathcal {Q}(t)\big )\textrm{d} t \quad \text {and}\quad \Psi _t(\xi ):=\mathrm{p.v.}\int _{\Omega _t}\varvec{e}\big (\xi \cdot \mathcal {Q}(t)\big )K(t)\textrm{d} t \end{aligned}$$

respectively. To present a unified approach, we write \(M_t^{k',k''}\), \(\mathfrak {y}_t\), and \(\Theta _t\) to represent either \(A_t^{\mathcal {Q},k',k''}\), \(\mathfrak {m}_t\), and \(\Phi _t\) or \(H_t^{\mathcal {Q},k',k''}\), \(\mathfrak {n}_t\), and \(\Psi _t\) respectively. We now present the key properties of our multiplier operators that will be used in the proof of Theorem 4. Let \(N_n:=\lfloor 2^{n^\tau } \rfloor \) for \(n\in \mathbb {N}\) and some \(\tau \in (0,1]\) adjusted later.

  1. Property 1.

    For each \(\alpha > 0\), there is \(\beta _{\alpha } > 0\) such that, for any \(\beta > \beta _{\alpha }\) and \(n \in \mathbb {N}\), if there is a multi-index \(\gamma _0 \in \Gamma \) with

    $$\begin{aligned} \bigg |\xi _{\gamma _0} - \frac{a}{q} \bigg | \le \frac{1}{q^2} \end{aligned}$$

    for some coprime integers a and q with \(1 \le a \le q\) and \((\log N_n)^\beta \le q \le N_n^{|\gamma _0|} (\log N_n)^{-\beta }\), then

    $$\begin{aligned} |(\mathfrak {y}_{N_n} - \mathfrak {y}_{N_{n-1}})(\xi )| \lesssim (\log N_n)^{-\alpha }. \end{aligned}$$

    This follows from Proposition 6 with \(\phi (x)\equiv (\vartheta _\Omega (N_n))^{-1}\) for the \(\mathfrak {y}_t = \mathfrak {m}_{t}\) case and with \(\phi (x)=K(x)\) for the \(\mathfrak {y}_t = \mathfrak {n}_{t}\) case, noting the size condition (1.4) and the continuity condition (1.6).

  2. Property 2.

    Let A be the \(|\Gamma | \times |\Gamma |\) diagonal matrix with

    $$\begin{aligned} (A v)_\gamma = |\gamma | v_\gamma . \end{aligned}$$
    (3.14)

    For any \(t>0\), we set \(t^A v: = \big (t^{|\gamma |} v_\gamma : \gamma \in \Gamma \big ).\) Then

    $$\begin{aligned} \big |\Theta _{N_n}(\xi ) - \Theta _{N_{n-1}}(\xi )\big | \lesssim \min \big \{|N_n^A\xi |_\infty , |N_n^A \xi |_\infty ^{-1/|\Gamma |}\big \},\quad \text {for each }n\in \mathbb {N}. \end{aligned}$$

    In the \(\Theta _t = \Phi _t\) case, this follows from the mean value theorem and the standard van der Corput lemma. In the \(\Theta _t = \Psi _t\) case, this follows from the cancellation condition (1.5) and [27, Proposition B.2] (see [27, p. 21] for details).

  3. Property 3.

    For each \(\alpha > 0\), \(n \in \mathbb {N}\), and \(\xi \in \mathbb {T}^\Gamma \) satisfying

    $$\begin{aligned} \bigg |\xi _\gamma - \frac{a_\gamma }{q} \bigg | \le N_n^{-|\gamma |} L\qquad \text {for all }\gamma \in \Gamma \end{aligned}$$

    with \(1 \le q \le L\), \(a\in A_q^\Gamma \), and \(1 \le L \le \exp \Big (c\sqrt{\log {N_n}}\Big ) (\log N_n)^{-\alpha }\), we have

    $$\begin{aligned} \mathfrak {y}_{N_n}(\xi ) - \mathfrak {y}_{N_{n-1}}(\xi ) = G(a/q) \big (\Theta _{N_n}(\xi - a/q) - \Theta _{N_{n-1}}(\xi - a/q)\big ) + \mathcal {O}\big ((\log N_n)^{-\alpha }\big ), \end{aligned}$$

    for some constant \(c>0\) which is independent of \(n, \xi , a\) and q. In the \(\mathfrak {y}_t = \mathfrak {m}_t\), \(\Theta _t = \Phi _t\) case, this is [37, Property 6]. In the \(\mathfrak {y}_t = \mathfrak {n}_t\), \(\Theta _t = \Psi _t\) case, this follows from Property 1 alongside Lemma 7 with \(\Omega :=\Omega _{N_n}{\setminus }\Omega _{N_{n-1}}\) and \(\mathcal {K}(n,p):=K(n,p)\mathbbm {1}_{{\Omega }}\), noting the size condition (1.4) and the continuity condition (1.6). For details see [37, Lemmas 3 and 5].

3.3 Parameters discussion

Let \(p\in (1,\infty )\) be fixed and let \(\chi \in (0,1/10)\). Fix \(\tau \) with \(0< \tau < 1-\min (2,p)^{-1}\) and let \(N_n:=\lfloor 2^{n^\tau } \rfloor \) for \(n\in \mathbb {N}\). If \(p \in (1,2)\), fix \(p_0\) such that \(1< p_0 < p\). If instead \(p \in (2,\infty )\), fix \(p_0 > p\). If \(p=2\), the discussion is moot since all the interpolation arguments in the article become unnecessary. We choose \(\rho \) with

$$\begin{aligned} \rho > \frac{1}{\tau }\frac{pp_0-2p}{2p_0 - 2p} \end{aligned}$$

so that interpolation of the estimates

$$\begin{aligned} \Vert T\Vert _{\ell ^2} \lesssim n^{-\rho \tau } \quad \text {and} \quad \Vert T\Vert _{\ell ^{p_0}}\lesssim 1 \end{aligned}$$

yields

$$\begin{aligned} \Vert T\Vert _{\ell ^p} \lesssim n^{-(1+\varepsilon )} \text { for some } \varepsilon > 0. \end{aligned}$$
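
The threshold for \(\rho \) comes from the following computation, given here as a sketch with an auxiliary interpolation parameter \(\theta \). Choose \(\theta \in (0,1)\) with \(1/p=\theta /2+(1-\theta )/p_0\), that is,

$$\begin{aligned} \theta =\frac{1/p-1/p_0}{1/2-1/p_0}=\frac{2(p_0-p)}{p(p_0-2)}. \end{aligned}$$

Interpolating the two estimates above gives \(\Vert T\Vert _{\ell ^p}\lesssim n^{-\rho \tau \theta }\), and

$$\begin{aligned} \rho \tau \theta>1 \quad \Longleftrightarrow \quad \rho >\frac{1}{\tau \theta }=\frac{1}{\tau }\frac{pp_0-2p}{2p_0-2p}. \end{aligned}$$

One checks that \(\theta \in (0,1)\) both for \(2<p<p_0\) and for \(1<p_0<p<2\).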

Property 1 gives us a corresponding \(\beta _\rho \). We fix a choice of \(\beta > \beta _\rho \) and then fix a choice of \(u\in \mathbb {N}\) with \(u>|\Gamma |\beta \). We also have the value of \(\delta \) coming from the Gaussian sum estimate (3.5). With these fixed, we choose the value of \(\varrho \) in Theorem 5 to be

$$\begin{aligned} \varrho := \min \bigg (\frac{\chi }{10 u}, \frac{\delta }{8\tau }\bigg ). \end{aligned}$$

4 Proof of Theorem 4

By the monotone convergence theorem and standard density arguments it is enough to prove that

$$\begin{aligned} \mathcal {S}_{\mathbb {Z}^\Gamma }^p(M_t^{k',k''}f: t\in \mathbb {I}) \lesssim _{p,k,|\Gamma |} \Vert f\Vert _{\ell ^p(\mathbb {Z}^\Gamma )} \end{aligned}$$

holds for every finite subset \(\mathbb {I}\subset \mathbb {R}_+\) with the implicit constant independent of the set \(\mathbb {I}\). We start by splitting (cf. [16, Lemma 1.3]) into long oscillations/jumps and short variations along the subexponential sequence \(N_n\):

$$\begin{aligned}&\mathcal {S}_{\mathbb {Z}^\Gamma }^p\big (M_t^{k',k''}f: t\in \mathbb {I}\big )\lesssim \mathcal {S}_{\mathbb {Z}^\Gamma }^p(T_{\mathbb {Z}^\Gamma }[\mathfrak {y}_{N_n}]f: n \in \mathbb {N}_0) \\&\quad + \bigg \Vert \Big (\sum _{n\in \mathbb {N}_0} V^2\big (M_t^{k',k''}f: t \in [N_{n}, N_{n+1})\cap \mathbb {I}\big )^2 \Big )^{1/2} \bigg \Vert _{\ell ^p(\mathbb {Z}^\Gamma )}. \end{aligned}$$

4.1 Short variations

By using the arguments from [28, Section 3.1], the estimate for the short variations will follow from the estimate

$$\begin{aligned} \big \Vert V^1(M_t^{k',k''}f:t\in [N_{n}, N_{n+1})\cap \mathbb {I})\big \Vert _{\ell ^1(\mathbb {Z}^\Gamma )}\lesssim n^{\tau -1}\Vert f\Vert _{\ell ^1(\mathbb {Z}^\Gamma )}. \end{aligned}$$
(4.1)

Let \(t_0<t_1<\cdots <t_{J(n)}\) be the elements of \([N_n,N_{n+1})\cap \mathbb {I}\) listed in increasing order. Since this set is finite, it is easy to see that

$$\begin{aligned} \left\| V^1(M_t^{k',k''}f:t\in [N_{n}, N_{n+1})\cap \mathbb {I})\right\| _{\ell ^1(\mathbb {Z}^\Gamma )} \le \left\| \sum _{j=1}^{J(n)}\big |M_{t_j}^{k',k''}f-M_{t_{j-1}}^{k',k''}f\big |\right\| _{\ell ^1(\mathbb {Z}^\Gamma )} \end{aligned}$$

for any \(n\in \mathbb {N}_0.\) Moreover, we have

$$\begin{aligned} \Big \Vert \sum _{j=1}^{J(n)}\big |M_{t_j}^{k',k''}f-M_{t_{j-1}}^{k',k''}f\big |\Big \Vert _{\ell ^1(\mathbb {Z}^\Gamma )} \lesssim 2^{-kn^\tau }\big (\vartheta _\Omega (N_{n+1})-\vartheta _\Omega (N_n)\big )\Vert f\Vert _{\ell ^1(\mathbb {Z}^\Gamma )}. \end{aligned}$$
(4.2)

This follows from the monotonicity of the sets \(\Omega _t\) and having \(\vartheta _\Omega (t)\approx t^k\) by the prime number theorem in the \(M_t^{k',k''} = A_t^{\mathcal {Q}, k',k''}\) case or the size condition (1.4) in the \(M_t^{k',k''} = H_t^{\mathcal {Q}, k',k''}\) case. By [37, Eq. 4.10], the right hand side of (4.2) is bounded by \(n^{\tau -1}\Vert f\Vert _{\ell ^1(\mathbb {Z}^\Gamma )}\), proving (4.1).
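
Heuristically, the gain \(n^{\tau -1}\) can be seen as follows. This sketch ignores the integer parts in \(N_n\) and the error terms in \(\vartheta _\Omega (t)\approx t^k\), which are handled rigorously in [37, Eq. 4.10]. By the mean value theorem and \(\tau \le 1\), we have \((n+1)^\tau -n^\tau \le \tau n^{\tau -1}\le 1\), so

$$\begin{aligned} 2^{-kn^\tau }\big (2^{k(n+1)^\tau }-2^{kn^\tau }\big )=2^{k((n+1)^\tau -n^\tau )}-1\le 2^{k\tau n^{\tau -1}}-1\lesssim _k n^{\tau -1}, \end{aligned}$$

where the last step uses the convexity bound \(2^y-1\le (2^k-1)y/k\) for \(0\le y\le k\).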

4.2 Long oscillations/jumps and the circle method

Let \(\eta :\mathbb {R}^\Gamma \rightarrow [0,1]\) be a smooth function with

$$\begin{aligned} \eta (x) = {\left\{ \begin{array}{ll} 1 &{} \text {if } |x|_\infty \le \tfrac{1}{32 |\Gamma |}, \\ 0 &{} \text {if } |x|_\infty \ge \tfrac{1}{16 |\Gamma |}. \end{array}\right. } \end{aligned}$$

For \(N\in \mathbb {R}_+\), we define the scaling notation

$$\begin{aligned} \eta _N(\xi ):= \eta \big (2^{N \cdot A- N^{\chi }\cdot \textrm{Id}}\xi \big ) \end{aligned}$$

where A is the matrix given in (3.14) and \(\textrm{Id}\) is the \(|\Gamma | \times |\Gamma |\) identity matrix. For dyadic integers \(s \in 2^{u\mathbb {N}}\), we define the annuli sets of fractions by

$$\begin{aligned} \Sigma _s := {\left\{ \begin{array}{ll} \Sigma _{\le s} &{} \text { if } s=2^u, \\ \Sigma _{\le s} {\setminus } \Sigma _{\le s/2^u} &{} \text { if } s>2^u, \end{array}\right. } \end{aligned}$$
(4.3)

where the \(\Sigma _{\le \cdot }\) are the sets of Ionescu–Wainger fractions as in Theorem 5. For \(t\ge 2^u\), we set \(F(t):=\max \{s \in 2^{u\mathbb {N}}: s \le t\}\). We define

$$\begin{aligned} \Xi _{\le j^{\tau u}}(\xi ):=\sum _{a/q \in \Sigma _{\le F(j^{\tau u})}} \eta _{j^{\tau }}(\xi - a/q) \end{aligned}$$

and, for \(s \in 2^{u\mathbb {N}}\), we define the annuli functions

$$\begin{aligned} \Xi _j^s(\xi ):= \sum _{a/q \in \Sigma _{s}} \eta _{j^{\tau }}(\xi - a/q). \end{aligned}$$
(4.4)

By (4.3), we have the telescoping property

$$\begin{aligned} \Xi _{\le j^{\tau u}}=\sum _{\begin{array}{c} s \in 2^{u\mathbb {N}} \\ s \le j^{\tau u} \end{array}} \Xi _j^s. \end{aligned}$$
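The telescoping property can be checked as follows. This is a sketch that assumes the sets \(\Sigma _{\le s}\) are nested, that is \(\Sigma _{\le s/2^u}\subseteq \Sigma _{\le s}\), which we take to be part of Theorem 5, and that \(j^{\tau u}\ge 2^u\). Writing \(F(j^{\tau u})=2^{um_0}\) with \(m_0\ge 1\), the definition (4.3) gives the disjoint decomposition

$$\begin{aligned} \Sigma _{\le 2^{um_0}} = \Sigma _{\le 2^{u}}\cup \bigcup _{m=2}^{m_0}\big (\Sigma _{\le 2^{um}}{\setminus }\Sigma _{\le 2^{u(m-1)}}\big ) = \bigcup _{\begin{array}{c} s \in 2^{u\mathbb {N}} \\ s \le j^{\tau u} \end{array}} \Sigma _s. \end{aligned}$$

Summing \(\eta _{j^{\tau }}(\xi - a/q)\) over \(a/q\) in both sides of this decomposition yields the stated identity.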

Note that \(\eta _{j^\tau }(\xi )\) satisfies the hypothesis about the support for \(\mathfrak {m}\) in Theorem 5 since \(\frac{1}{8 |\Gamma |}2^{-j^{\tau }+j^{\tau \chi }} \le e^{-j^{2 \tau u \varrho }}\) provided that \(\varrho \le \chi /(10 u)\). Using the \(\Xi _{\le j^{\tau u}}\) functions, we bound the long oscillations/jumps by

$$\begin{aligned}&\mathcal {S}_{\mathbb {Z}^\Gamma }^p\left( \sum _{j=1}^n T_{\mathbb {Z}^\Gamma }[(\mathfrak {y}_{N_j} - \mathfrak {y}_{N_{j-1}})\Xi _{\le j^{\tau u}}]f : n \in \mathbb {N}\right) \\&\quad + \mathcal {S}_{\mathbb {Z}^\Gamma }^p\left( \sum _{j=1}^n T_{\mathbb {Z}^\Gamma }[(\mathfrak {y}_{N_j} - \mathfrak {y}_{N_{j-1}})(1-\Xi _{\le j^{\tau u}})]f : n \in \mathbb {N}\right) . \end{aligned}$$

These terms correspond to major and minor arcs respectively.

4.3 Minor arcs

Since \(V^1\) controls the oscillation/jump seminorms, we have

$$\begin{aligned}&\mathcal {S}_{\mathbb {Z}^\Gamma }^p\left( \sum _{j=1}^n T_{\mathbb {Z}^\Gamma }[(\mathfrak {y}_{N_j} - \mathfrak {y}_{N_{j-1}})(1-\Xi _{\le j^{\tau u}})]f : n \in \mathbb {N}\right) \\&\quad \le \sum _{n=1}^\infty \big \Vert T_{\mathbb {Z}^\Gamma }[(\mathfrak {y}_{N_n} - \mathfrak {y}_{N_{n-1}}) (1-\Xi _{\le n^{\tau u}})]f \big \Vert _{\ell ^p(\mathbb {Z}^\Gamma )}. \end{aligned}$$

It then suffices to show that

$$\begin{aligned} \big \Vert T_{\mathbb {Z}^\Gamma }[(\mathfrak {y}_{N_n} - \mathfrak {y}_{N_{n-1}})(1-\Xi _{\le n^{\tau u}})]f \big \Vert _{\ell ^{p}(\mathbb {Z}^\Gamma )} \lesssim n^{-(1+\varepsilon )}\Vert f\Vert _{\ell ^p(\mathbb {Z}^\Gamma )} \end{aligned}$$

for some \(\varepsilon > 0\). This uses Property 1 and follows from the proof of [37, Eqs. (5.8), (5.9)] with only small changes due to our differing scaling in the definition of \(\eta _N(\xi )\). We omit the details.

4.4 Introduction to major arcs

Using the annuli multipliers (4.4) and Proposition (3), we bound the major arcs term by

$$\begin{aligned}&\mathcal {S}_{\mathbb {Z}^\Gamma }^p\left( \sum _{j=1}^n \sum _{\begin{array}{c} s \in 2^{u\mathbb {N}} \\ s\le j^{\tau u} \end{array}} T_{\mathbb {Z}^\Gamma }[(\mathfrak {y}_{N_j} - \mathfrak {y}_{N_{j-1}})\Xi _j^s]f : n \in \mathbb {N}\right) \\&\quad \le \sum _{s \in 2^{u \mathbb {N}}} \mathcal {S}_{\mathbb {Z}^\Gamma }^p\left( \sum _{\begin{array}{c} 1 \le j \le n \\ j \ge s^{1/(\tau u)} \end{array}} T_{\mathbb {Z}^\Gamma }[(\mathfrak {y}_{N_j} - \mathfrak {y}_{N_{j-1}})\Xi _j^s]f : n \ge s^{1/(\tau u)} \right) . \end{aligned}$$
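The index sets in the second line arise from interchanging the order of summation. As a sketch, with the shorthand \(a_{j,s}:=T_{\mathbb {Z}^\Gamma }[(\mathfrak {y}_{N_j} - \mathfrak {y}_{N_{j-1}})\Xi _j^s]f\) (notation introduced here only for illustration), the condition \(s\le j^{\tau u}\) is equivalent to \(j\ge s^{1/(\tau u)}\), so for each fixed \(n\) we have

$$\begin{aligned} \sum _{j=1}^n \sum _{\begin{array}{c} s \in 2^{u\mathbb {N}} \\ s\le j^{\tau u} \end{array}} a_{j,s} = \sum _{s \in 2^{u\mathbb {N}}} \sum _{\begin{array}{c} 1 \le j \le n \\ j \ge s^{1/(\tau u)} \end{array}} a_{j,s}, \end{aligned}$$

where the inner sum on the right is empty whenever \(n<s^{1/(\tau u)}\). This only accounts for the rearrangement; moving the sum over \(s\) outside the seminorm is the step that relies on the cited proposition, and we do not reproduce it here.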

It then suffices to show for large \(s\in 2^{u\mathbb {N}}\) that

$$\begin{aligned} \mathcal {S}_{\mathbb {Z}^\Gamma }^p\left( \sum _{\begin{array}{c} 1 \le j \le n \\ j \ge s^{1/(\tau u)} \end{array}} T_{\mathbb {Z}^\Gamma }[(\mathfrak {y}_{N_j} - \mathfrak {y}_{N_{j-1}})\Xi _j^s]f : n \ge s^{1/(\tau u)}\right) \lesssim s^{-\varepsilon } \Vert f\Vert _{\ell ^p(\mathbb {Z}^\Gamma )} \end{aligned}$$
(4.5)

for some \(\varepsilon > 0\) since \(\sum _{s \in 2^{u\mathbb {N}}} s^{-\varepsilon } < \infty \). Let \(\kappa _s:=\lfloor s^{2\varrho }\rfloor \). By splitting the left-hand side of (4.5) at \(n^\tau \approx 2^{\kappa _s}\) into small and large scales, it suffices to prove that

$$\begin{aligned} \mathcal {S}_{\mathbb {Z}^\Gamma }^p\left( \sum _{\begin{array}{c} 1 \le j \le n \\ j \ge s^{1/(\tau u)} \end{array}} T_{\mathbb {Z}^\Gamma }[(\mathfrak {y}_{N_j} - \mathfrak {y}_{N_{j-1}})\Xi _j^s]f : n^\tau \in [s^{1/u}, 2^{\kappa _s+1}] \right) \lesssim s^{-\varepsilon }\Vert f\Vert _{\ell ^p(\mathbb {Z}^\Gamma )} \end{aligned}$$
(4.6)

and

$$\begin{aligned} \mathcal {S}_{\mathbb {Z}^\Gamma }^p\left( \sum _{\begin{array}{c} 1 \le j \le n \\ j \ge 2^{\kappa _s/\tau } \end{array}} T_{\mathbb {Z}^\Gamma }[(\mathfrak {y}_{N_j} - \mathfrak {y}_{N_{j-1}})\Xi _j^s]f : n^\tau > 2^{\kappa _s} \right) \lesssim s^{-\varepsilon } \Vert f\Vert _{\ell ^p(\mathbb {Z}^\Gamma )}. \end{aligned}$$
(4.7)
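For completeness, we record the elementary summability that underlies the reduction to (4.5). Writing \(s=2^{um}\) with \(m\in \mathbb {N}\) (assuming here that \(\mathbb {N}=\{1,2,\ldots \}\) and \(u>0\)), the sum over \(s\) is a geometric series:

$$\begin{aligned} \sum _{s \in 2^{u\mathbb {N}}} s^{-\varepsilon } = \sum _{m=1}^\infty 2^{-u\varepsilon m} = \frac{2^{-u\varepsilon }}{1-2^{-u\varepsilon }} < \infty . \end{aligned}$$

In particular, the resulting constant depends only on \(u\) and \(\varepsilon \).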

For the small scales (4.6), we will use the Rademacher–Menshov inequality (2.5) and Theorem 5. For the large scales (4.7), we will use the Magyar–Stein–Wainger sampling principle from [23, Proposition 2.1] and its counterpart for the jump inequality from [26, Theorem 1.7]. We first establish an approximation lemma to replace our discrete multipliers with continuous counterparts. Let

$$\begin{aligned} v_j^s(\xi ):=\sum _{a/q \in \Sigma _s} G(a/q)\big (\Theta _{N_j} - \Theta _{N_{j-1}}\big )(\xi - a/q)\eta _{j^\tau }(\xi -a/q) \end{aligned}$$
(4.8)

and

$$\begin{aligned} \Lambda _j^s(\xi ):= \sum _{a/q \in \Sigma _s} \big (\Theta _{N_j} - \Theta _{N_{j-1}}\big )(\xi - a/q)\eta _{j^\tau }(\xi -a/q). \end{aligned}$$
(4.9)

Lemma 8

Let \(M \in \mathbb {N}\), \(\alpha ' > 0\), and \(S_M:=\lfloor 2^{M^\tau - 3\,M^{\tau \chi }}\rfloor \). For \(j\in \mathbb {N}\) with \(s^{1/(\tau u)} \le j\) and \(M \le j \le 2M\), we have

$$\begin{aligned} \Vert (\mathfrak {y}_{N_j}-\mathfrak {y}_{N_{j-1}})\Xi _j^s - v_j^s\Vert _{\ell ^\infty (\mathbb {T}^\Gamma )} \lesssim j^{-\alpha ' \tau } \end{aligned}$$
(4.10)

and

$$\begin{aligned} \Vert (\mathfrak {y}_{N_j}-\mathfrak {y}_{N_{j-1}})\Xi _j^s - \Lambda _j^s \mathfrak {m}_{S_M}\Vert _{\ell ^\infty (\mathbb {T}^\Gamma )} \lesssim j^{-\alpha ' \tau }. \end{aligned}$$
(4.11)

Proof

For (4.10), since the \(\eta _{j^{\tau }}(\xi - a/q)\) bump functions in the definitions of \(\Xi _j^s\) and \(v_j^s\) have disjoint supports for distinct fractions a/q, it suffices to prove for a fixed \(a/q \in \Sigma _s\) that

$$\begin{aligned} \big \Vert \eta _{j^{\tau }}(\xi - a/q)\big [\big (\mathfrak {y}_{N_j}-\mathfrak {y}_{N_{j-1}}\big )(\xi ) - G(a/q)\big (\Theta _{N_j} - \Theta _{N_{j-1}}\big )(\xi - a/q)\big ]\big \Vert _{\ell ^\infty (\mathbb {T}^\Gamma )} \lesssim j^{-\alpha ' \tau } \end{aligned}$$

with the implied constant independent of the choice of a/q. Using the definition of \(\Sigma _s\), property (i) from Theorem 5, \(s \le j^{\tau u}\), and \(\varrho \le \chi /(10 u)\), we have \(q \le e^{s^{\varrho }} \le e^{j^{\tau u \varrho }}\le 2^{j^{\tau \chi }} =: L_1.\) On the support of \(\eta _{j^\tau }(\xi - a/q)\), we have \(|\xi _\gamma - a_\gamma /q| \lesssim N_j^{-|\gamma |} L_1\) for all \(\gamma \in \Gamma \). Moreover, we have

$$\begin{aligned} L_1 = 2^{j^{\tau \chi }} \le 2^{j^{\tau /2}}j^{-\tau \alpha '}\lesssim \exp \left( \sqrt{\log N_j}\right) (\log N_j)^{-\alpha '}. \end{aligned}$$

The estimate (4.10) then follows from Property 3 with \(\alpha = \alpha '\) and \(L = L_1\).
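The chain of inequalities defining \(L_1\) can be unpacked as follows; this is a sketch in which \(\log \) denotes the natural logarithm and the last step is read as holding for sufficiently large \(j\). Since \(s\le j^{\tau u}\), \(j\ge 1\), and \(\tau u\varrho \le \tau \chi /10\), we have

$$\begin{aligned} e^{s^{\varrho }} \le e^{j^{\tau u \varrho }} \le e^{j^{\tau \chi /10}} \le 2^{j^{\tau \chi }}, \end{aligned}$$

where the last inequality is equivalent to \(j^{\tau \chi /10}\le (\log 2)\, j^{\tau \chi }\), that is, to \(j^{9\tau \chi /10}\ge 1/\log 2\). For the finitely many smaller values of \(j\), the bound holds up to a multiplicative constant, which is harmless for (4.10).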

For (4.11), we use (4.10) and are reduced to showing that

$$\begin{aligned} \Vert v_j^s - \Lambda _j^s \mathfrak {m}_{S_M}\Vert _{\ell ^\infty (\mathbb {T}^\Gamma )} \lesssim j^{-\alpha ' \tau }. \end{aligned}$$

Fixing \(\xi \) in the support of \(\eta _{j^\tau }(\xi - a/q)\), we have

$$\begin{aligned}&|\xi _\gamma - a_\gamma /q|\le 2^{-M^\tau |\gamma |} 2^{(2M)^{\tau \chi }} \le 2^{-M^\tau |\gamma |} 2^{2M^{\tau \chi }}\\&\quad = 2^{-M^\tau |\gamma |} 2^{3M^{\tau \chi }|\gamma |} 2^{-3M^{\tau \chi }|\gamma |} 2^{2M^{\tau \chi }}\le S_M^{-|\gamma |} 2^{-(j/2)^{\tau \chi }} \end{aligned}$$

for all \(\gamma \in \Gamma \). By the triangle inequality, we have

$$\begin{aligned}{} & {} |G(a/q) - \mathfrak {m}_{S_M}(\xi )| \le |G(a/q) - G(a/q) \Phi _{S_M}(\xi - a/q)| \\{} & {} \quad + |G(a/q) \Phi _{S_M}(\xi - a/q) - \mathfrak {m}_{S_M}(\xi )|. \end{aligned}$$

For the first term, we use the estimate (3.5), the mean value theorem, and Property 2 to obtain

$$\begin{aligned} |G(a/q) - G(a/q) \Phi _{S_M}(\xi - a/q)| \le q^{-\delta } \big |S_M^A (\xi - a/q)\big |_{\infty } \le q^{-\delta } 2^{-(j/2)^{\tau \chi }} \lesssim j^{-\alpha ' \tau }. \end{aligned}$$
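The middle inequality in the last display uses the bound \(|\xi _\gamma - a_\gamma /q|\le S_M^{-|\gamma |} 2^{-(j/2)^{\tau \chi }}\) obtained above. As a sketch, assuming that \(A\) is the diagonal matrix for which \(S_M^A\xi =(S_M^{|\gamma |}\xi _\gamma )_{\gamma \in \Gamma }\) (our reading of (3.14)), we have

$$\begin{aligned} \big |S_M^A (\xi - a/q)\big |_{\infty } = \max _{\gamma \in \Gamma } S_M^{|\gamma |}\,|\xi _\gamma - a_\gamma /q| \le 2^{-(j/2)^{\tau \chi }}. \end{aligned}$$

The final step then follows because \(q^{-\delta }\le 1\) and \(j^{\alpha ' \tau }2^{-(j/2)^{\tau \chi }}\) is bounded uniformly in \(j\): exponential decay in a positive power of \(j\) dominates any polynomial growth.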

For the second term, we use that

$$\begin{aligned} |\xi _\gamma - a_\gamma /q| \le S_M^{-|\gamma |} 2^{-M^{\tau \chi }} \le S_M^{-|\gamma |} 2^{(2M)^{\tau \chi }} =: S_M^{-|\gamma |} L_2 \end{aligned}$$

and \(q \le 2^{j^{\tau \chi }} \le 2^{(2\,M)^{\tau \chi }} = L_2 \le \exp \left( \sqrt{ \log S_M}\right) (\log S_M)^{-\alpha ' / \chi }.\) Hence, we may apply [37, Lemma 3] with \(\alpha = \alpha '/\chi \), \(N = S_M\), and \(L = L_2\) to obtain

$$\begin{aligned}&|G(a/q) \Phi _{S_M}(\xi - a/q) - \mathfrak {m}_{S_M}(\xi )| \lesssim \log (S_M)^{-\alpha '/\chi } \\&\quad \lesssim (M^\tau - 3M^{\tau \chi })^{-\alpha '/\chi } \lesssim (2M)^{-\alpha ' \tau } \lesssim j^{-\alpha ' \tau }. \end{aligned}$$

This completes the proof of (4.11). \(\square \)

4.5 Small scales

Using that \(V^2\) dominates oscillations/jumps, splitting \([s^{1/u},2^{\kappa _s+1}]\) into dyadic intervals, and preparing via the triangle inequality to use (4.11), we bound the left-hand side of (4.6) by

$$\begin{aligned}&\overbrace{\sum _{M \in 2^{\mathbb {N}} \cap [s^{1/u},2^{\kappa _s}]} \Big \Vert V^2 \Big (\sum _{\begin{array}{c} 1 \le j \le n \\ j \ge s^{1/(\tau u)} \end{array}} T_{\mathbb {Z}^\Gamma } [\Lambda _j^s \mathfrak {m}_{S_M}]f : n^\tau \in [M,2M] \Big )\Big \Vert _{\ell ^p(\mathbb {Z}^\Gamma )}}^{\text {Main Term 1}} \\&\quad +\underbrace{\sum _{M \in 2^{\mathbb {N}} \cap [s^{1/u},2^{\kappa _s}]} \Big \Vert V^2\Big (\sum _{\begin{array}{c} 1 \le j \le n \\ j \ge s^{1/(\tau u)} \end{array}} T_{\mathbb {Z}^\Gamma }[(\mathfrak {y}_{N_j} - \mathfrak {y}_{N_{j-1}})\Xi _j^s - \Lambda _j^s \mathfrak {m}_{S_M}]f : n^\tau \in [M,2M] \Big )\Big \Vert _{\ell ^p(\mathbb {Z}^\Gamma )}}_{\text {Error Term 1}}. \end{aligned}$$

For Error Term 1, it will suffice to show that

$$\begin{aligned} \big \Vert T_{\mathbb {Z}^\Gamma }[(\mathfrak {y}_{N_n} - \mathfrak {y}_{N_{n-1}})\Xi _n^s - \Lambda _n^s \mathfrak {m}_{S_M}]f\big \Vert _{\ell ^p(\mathbb {Z}^\Gamma )} \lesssim n^{-(1+\varepsilon ')} \Vert f\Vert _{\ell ^p(\mathbb {Z}^\Gamma )} \end{aligned}$$
(4.12)

for some \(\varepsilon ' > 0\) since we would then bound it by

$$\begin{aligned} \sum _{n \ge s^{1/(\tau u)}} n^{-(1+\varepsilon ')} \Vert f\Vert _{\ell ^p(\mathbb {Z}^\Gamma )} \lesssim s^{-\varepsilon '/(\tau u)} \Vert f\Vert _{\ell ^p(\mathbb {Z}^\Gamma )} \lesssim s^{-\varepsilon }\Vert f\Vert _{\ell ^p(\mathbb {Z}^\Gamma )} \end{aligned}$$

using that \(V^1\) dominates \(V^2\). We note by Theorem 5 that

$$\begin{aligned} \big \Vert T_{\mathbb {Z}^\Gamma }[(\mathfrak {y}_{N_n} - \mathfrak {y}_{N_{n-1}})\Xi _n^s - \Lambda _n^s \mathfrak {m}_{S_M}]f \big \Vert _{\ell ^{p_0}(\mathbb {Z}^\Gamma )} \lesssim \Vert f\Vert _{\ell ^{p_0}(\mathbb {Z}^\Gamma )} \end{aligned}$$

and, by (4.11) with \(\alpha ' = \rho \), that

$$\begin{aligned} \big \Vert T_{\mathbb {Z}^\Gamma }[(\mathfrak {y}_{N_n} - \mathfrak {y}_{N_{n-1}})\Xi _n^s - \Lambda _n^s \mathfrak {m}_{S_M}]f \big \Vert _{\ell ^2(\mathbb {Z}^\Gamma )} \lesssim n^{-\rho \tau } \Vert f\Vert _{\ell ^2(\mathbb {Z}^\Gamma )}. \end{aligned}$$

Interpolation of the above inequalities yields (4.12).
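The interpolation step can be sketched as follows, under the assumptions (ours) that \(p\) lies between 2 and \(p_0\), so that \(1/p=\theta /2+(1-\theta )/p_0\) for some \(\theta \in (0,1]\), and that the Riesz–Thorin theorem is applied to the linear operator \(T_n:=T_{\mathbb {Z}^\Gamma }[(\mathfrak {y}_{N_n} - \mathfrak {y}_{N_{n-1}})\Xi _n^s - \Lambda _n^s \mathfrak {m}_{S_M}]\), where \(T_n\) is shorthand introduced only for this remark:

$$\begin{aligned} \Vert T_n\Vert _{\ell ^p\rightarrow \ell ^p} \le \Vert T_n\Vert _{\ell ^2\rightarrow \ell ^2}^{\theta }\,\Vert T_n\Vert _{\ell ^{p_0}\rightarrow \ell ^{p_0}}^{1-\theta } \lesssim n^{-\rho \tau \theta }. \end{aligned}$$

Thus (4.12) holds with \(\varepsilon '=\rho \tau \theta -1\), which is positive as soon as \(\rho >1/(\tau \theta )\); we read the choice of \(\rho \) as being made so that this is the case.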

For Main Term 1, we apply the Rademacher–Menshov inequality (2.5) to bound it by

$$\begin{aligned} \sum _{M \in 2^{\mathbb {N}} \cap [s^{1/u},2^{\kappa _s}]} \sum _{i = 0}^{\log _2(2M)} \left\| \left( \sum _j \Big | \sum _{k \in I_{i,j}^M} T_{\mathbb {Z}^\Gamma }[\Lambda _k^s \mathfrak {m}_{S_M}]f \Big |^2\right) ^{1/2}\right\| _{\ell ^p(\mathbb {Z}^\Gamma )}, \end{aligned}$$

where j is taken over \(j \ge 0\) such that \(I_{i,j}^M:= [j2^i,(j+1)2^i] \cap [M^{1/\tau }, (2\,M)^{1/\tau }] \ne \emptyset \). Let \(\tilde{\eta }_{N}(\xi ):=\eta _N(\xi /2)\). Then \(\tilde{\eta }_{N} \eta _{k^\tau } = \eta _{k^\tau }\) for \(k^\tau \ge N\) due to the nesting supports. This lets us write

$$\begin{aligned} \Lambda _k^s \mathfrak {m}_{S_M} = \Lambda _k^s \mathfrak {m}_{S_M} \sum _{a/q \in \Sigma _s} \tilde{\eta }_M(\xi - a/q) =:\Lambda _k^s \mathfrak {m}_{S_M}\tilde{\Xi }_{M^{1/\tau }}^s \end{aligned}$$

for \(k \in I_{i,j}^M\) since then \(k \ge M^{1/\tau }\). We have for any \(p\in (1,\infty )\) that

$$\begin{aligned} \left\| \left( \sum _j \Big | \sum _{k \in I_{i,j}^M} T_{\mathbb {Z}^\Gamma }[\Lambda _k^s]g \Big |^2\right) ^{1/2}\right\| _{\ell ^{p}(\mathbb {Z}^\Gamma )} \lesssim \Vert g\Vert _{\ell ^{p}(\mathbb {Z}^\Gamma )} \end{aligned}$$

since, by Theorem 5, the above estimate is a consequence of its continuous counterpart

$$\begin{aligned} \left\| \left( \sum _{j}\big |\sum _{k\in I_{i,j}^M}T_{\mathbb {R}^\Gamma }\big [(\Theta _{N_k}-\Theta _{N_{k-1}})\eta _{k^\tau }\big ]f\big |^2\right) ^{1/2}\right\| _{L^{p}(\mathbb {R}^\Gamma )} \lesssim \Vert f\Vert _{L^{p}(\mathbb {R}^\Gamma )}. \end{aligned}$$

The above square function estimate follows by appealing to Property 2 and arguments from Littlewood–Paley theory. We refer to [27] for more details, see also [28, Theorem 4.3, p. 42]. Thus,

$$\begin{aligned} \left\| \left( \sum _j \Big | \sum _{k \in I_{i,j}^M} T_{\mathbb {Z}^\Gamma }[\Lambda _k^s \mathfrak {m}_{S_M}]f \Big |^2\right) ^{1/2}\right\| _{\ell ^{p_0}(\mathbb {Z}^\Gamma )} \lesssim \big \Vert T_{\mathbb {Z}^\Gamma }[\mathfrak {m}_{S_M}]f\big \Vert _{\ell ^{p_0}(\mathbb {Z}^\Gamma )} \lesssim \Vert f\Vert _{\ell ^{p_0}(\mathbb {Z}^\Gamma )} \end{aligned}$$
(4.13)

using the uniform \(\ell ^{p}\)-boundedness of the averaging operators. We next obtain an improved bound on \(\ell ^2\). To this end, we show that

$$\begin{aligned} \big \Vert \mathfrak {m}_{S_M} \tilde{\Xi }_{M^{1/\tau }}^s\big \Vert _{\ell ^{\infty }(\mathbb {T}^\Gamma )}\lesssim s^{-\delta } \end{aligned}$$

for \(M \in 2^\mathbb {N}\cap [s^{1/u},2^{\kappa _s}]\). Since the bump functions in the sum have disjoint supports, it suffices to prove for a fixed \(a/q \in \Sigma _s\) that

$$\begin{aligned} \Vert \mathfrak {m}_{S_M}(\xi ) \tilde{\eta }_M(\xi - a/q) \Vert _{\ell ^\infty (\mathbb {T}^\Gamma )} \lesssim s^{-\delta } \end{aligned}$$

with the implied constant independent of the choice of a/q. On the support of \(\tilde{\eta }_M(\xi - a/q)\), we have

$$\begin{aligned} |\xi _\gamma - a_\gamma /q| \le \frac{1}{8|\Gamma |} 2^{-M|\gamma |} 2^{M^{\chi }} \le 2^{-M^{\tau }|\gamma |} 2^{2M^{\tau \chi }}. \end{aligned}$$

We follow the same arguments as in the proof of (4.11), choosing \(\alpha ' = \delta u /\tau \), to show that

$$\begin{aligned} |G(a/q) - \mathfrak {m}_{S_M}(\xi )| \lesssim M^{-\delta u} \le s^{-\delta }. \end{aligned}$$

For any \(\xi \in \mathbb {T}^\Gamma \) and \(a/q \in \Sigma _s\), we have

$$\begin{aligned} \big |\mathfrak {m}_{S_M}(\xi ) \tilde{\Xi }_{M^{1/\tau }}^s(\xi ) \big | \le |\mathfrak {m}_{S_M}(\xi )| \le |\mathfrak {m}_{S_M}(\xi ) - G(a/q)| + |G(a/q)| \lesssim s^{-\delta } \end{aligned}$$

using that \(|G(a/q)| \lesssim q^{-\delta } \lesssim s^{-\delta }\) since \(q \ge s/2^u\) by the construction of \(\Sigma _s\). Hence,

$$\begin{aligned}{} & {} \left\| \left( \sum _j \Big | \sum _{k \in I_{i,j}^M} T_{\mathbb {Z}^\Gamma }[\Lambda _k^s \mathfrak {m}_{S_M}]f \Big |^2\right) ^{1/2}\right\| _{\ell ^2(\mathbb {Z}^\Gamma )}\nonumber \\{} & {} \quad \lesssim \big \Vert T_{\mathbb {Z}^\Gamma }[\mathfrak {m}_{S_M}\tilde{\Xi }_{M^{1/\tau }}^s]f\big \Vert _{\ell ^{2}(\mathbb {Z}^\Gamma )} \lesssim s^{-\delta }\Vert f\Vert _{\ell ^{2}(\mathbb {Z}^\Gamma )}. \end{aligned}$$
(4.14)

Interpolation of (4.13) with (4.14) then gives that

$$\begin{aligned} \left\| \left( \sum _j \Big | \sum _{k \in I_{i,j}^M} T_{\mathbb {Z}^\Gamma }[\Lambda _k^s \mathfrak {m}_{S_M}]f \Big |^2\right) ^{1/2}\right\| _{\ell ^p(\mathbb {Z}^\Gamma )}\lesssim s^{-8\varrho }\Vert f\Vert _{\ell ^p(\mathbb {Z}^\Gamma )} \end{aligned}$$

since \(8\varrho \le \delta /(\rho \tau )\). Thus, we may dominate Main Term 1 by

$$\begin{aligned} \sum _{M \in 2^{\mathbb {N}} \cap [s^{1/u},2^{\kappa _s}]} \sum _{i = 0}^{\log _2(2M)} s^{-8\varrho }\Vert f\Vert _{\ell ^p(\mathbb {Z}^\Gamma )}\lesssim \kappa _s^2 s^{-8\varrho } \Vert f\Vert _{\ell ^p(\mathbb {Z}^\Gamma )} \lesssim s^{-4\varrho } \Vert f\Vert _{\ell ^p(\mathbb {Z}^\Gamma )} \end{aligned}$$

since \(\kappa _s \le s^{2\varrho }\), concluding the proof of (4.6).
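The factor \(\kappa _s^2\) comes from counting the terms of the double sum. As a sketch, assuming \(\kappa _s\ge 1\): there are at most \(\kappa _s+1\) dyadic values \(M\le 2^{\kappa _s}\), and for each of them the index \(i\) takes at most \(\log _2(2M)+1\le \kappa _s+2\) values, so that

$$\begin{aligned} \sum _{M \in 2^{\mathbb {N}} \cap [s^{1/u},2^{\kappa _s}]} \sum _{i = 0}^{\log _2(2M)} 1 \le (\kappa _s+1)(\kappa _s+2) \le 6\kappa _s^2 \le 6 s^{4\varrho }. \end{aligned}$$

Multiplying by \(s^{-8\varrho }\) gives the bound \(s^{-4\varrho }\), so that (4.6) holds with \(\varepsilon =4\varrho \).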

4.6 Large scales

Since \(V^1\) dominates \(\mathcal {S}_{\mathbb {Z}^\Gamma }^p\), we may bound the left-hand side of (4.7) by

$$\begin{aligned} \overbrace{\mathcal {S}_{\mathbb {Z}^\Gamma }^p\Big (\sum _{\begin{array}{c} 1 \le j \le n \\ j \ge 2^{\kappa _s/\tau } \end{array}} T_{\mathbb {Z}^\Gamma }[v_j^s]f : n^\tau > 2^{\kappa _s} \Big )}^{\text {Main Term 2}} + \overbrace{\sum _{n \ge 2^{\kappa _s/\tau }} \big \Vert T_{\mathbb {Z}^\Gamma }[(\mathfrak {y}_{N_n} - \mathfrak {y}_{N_{n-1}})\Xi _n^s - v_n^s]f \big \Vert _{\ell ^p(\mathbb {Z}^\Gamma )}}^{\text {Error Term 2}}. \end{aligned}$$

For Error Term 2, it will suffice to show that

$$\begin{aligned} \big \Vert T_{\mathbb {Z}^\Gamma }[(\mathfrak {y}_{N_n} - \mathfrak {y}_{N_{n-1}})\Xi _n^s - v_n^s]f \big \Vert _{\ell ^p(\mathbb {Z}^\Gamma )} \lesssim e^{(|\Gamma |+1)s^\varrho } n ^{-(1+\varepsilon ')} \Vert f\Vert _{\ell ^p(\mathbb {Z}^\Gamma )} \end{aligned}$$
(4.15)

for some \(\varepsilon ' > 0\) since we would then bound it by

$$\begin{aligned} e^{(|\Gamma |+1)s^\varrho } \sum _{n \ge 2^{\kappa _s/\tau }} n^{-(1+\varepsilon ')} \Vert f\Vert _{\ell ^p(\mathbb {Z}^\Gamma )} \lesssim e^{(|\Gamma |+1)s^\varrho } 2^{-s^{2\varrho }\varepsilon '/\tau } \Vert f\Vert _{\ell ^p(\mathbb {Z}^\Gamma )} \lesssim s^{-\varepsilon } \Vert f\Vert _{\ell ^p(\mathbb {Z}^\Gamma )}. \end{aligned}$$
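The tail sum in the last display can be estimated by comparison with an integral. As a sketch, for any real \(A\ge 1\) (the letter \(A\) is used here only as a placeholder for the lower summation limit and is unrelated to the matrix in (3.14)), we have

$$\begin{aligned} \sum _{n \ge A} n^{-(1+\varepsilon ')} \le A^{-(1+\varepsilon ')} + \int _A^\infty x^{-(1+\varepsilon ')}\,\textrm{d}x = A^{-(1+\varepsilon ')} + \frac{A^{-\varepsilon '}}{\varepsilon '} \lesssim A^{-\varepsilon '}. \end{aligned}$$

Taking \(A=2^{\kappa _s/\tau }\) gives the factor \(2^{-\kappa _s\varepsilon '/\tau }\). Reading \(\kappa _s\) as comparable to \(s^{2\varrho }\), as the display above does, the product \(e^{(|\Gamma |+1)s^\varrho }2^{-s^{2\varrho }\varepsilon '/\tau }\) has exponent of order \(-s^{2\varrho }\) for large \(s\) and therefore decays faster than any negative power of \(s\).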

We have

$$\begin{aligned} \big \Vert T_{\mathbb {Z}^\Gamma }[(\mathfrak {y}_{N_n} - \mathfrak {y}_{N_{n-1}})\Xi _n^s - v_n^s]f \big \Vert _{\ell ^2(\mathbb {Z}^\Gamma )} \lesssim n^{-\rho \tau } \Vert f\Vert _{\ell ^2(\mathbb {Z}^\Gamma )} \end{aligned}$$

by (4.10) with \(\alpha ' = \rho \). We also have

$$\begin{aligned}\big \Vert T_{\mathbb {Z}^\Gamma }[(\mathfrak {y}_{N_n} - \mathfrak {y}_{N_{n-1}})\Xi _n^s - v_n^s]f \big \Vert _{\ell ^{p_0}(\mathbb {Z}^\Gamma )} \lesssim e^{(|\Gamma |+1)s^\varrho } \Vert f\Vert _{\ell ^{p_0}(\mathbb {Z}^\Gamma )} \end{aligned}$$

simply by the triangle inequality and property (i) from Theorem 5. Consequently (4.15) follows by interpolation.

For Main Term 2, we define

$$\begin{aligned} w^s(\xi ):= \sum _{a/q \in \Sigma _s} G(a/q)\tilde{\eta }_{2^{\kappa _s}}(\xi - a/q), \quad \Pi ^s(\xi ):=\sum _{a/q \in \Sigma _s} \tilde{\eta }_{2^{\kappa _s}}(\xi - a/q), \end{aligned}$$

and

$$\begin{aligned} \omega _n^s(\xi ):= \sum _{2^{\kappa _s/\tau } \le j \le n} (\Theta _{N_j} - \Theta _{N_{j-1}})(\xi )\eta _{j^\tau }(\xi ). \end{aligned}$$

Let \(Q_s:=\textrm{lcm}(q: a/q \in \Sigma _s).\) By property (iv) from Theorem 5, we have \(Q_s \le 3^s\). The function \(\omega _n^s\) is supported on \([-\frac{1}{4Q_s}, \frac{1}{4Q_s}]^\Gamma \) for large \(s\in 2^{u\mathbb {N}}\) since, on the support of \(\eta _{2^{\kappa _s}}\), we have \(|\xi _\gamma | \le 2^{-2^{\kappa _s}+2^{\kappa _s\chi }} \le (4Q_s)^{-1}\) for all \(\gamma \in \Gamma \) and large s. We also have

$$\begin{aligned} \sum _{2^{\kappa _s/\tau } \le j \le n} v_j^s(\xi ) = w^s(\xi )\sum _{b \in \mathbb {Z}^\Gamma } \omega _n^s(\xi - b/Q_s). \end{aligned}$$

Therefore, it suffices to prove

$$\begin{aligned} \mathcal {S}_{\mathbb {Z}^\Gamma }^p\left( T_{\mathbb {Z}^\Gamma }\left[ \sum _{b \in \mathbb {Z}^\Gamma } \omega _n^s(\cdot - b/Q_s) \right] f : n^\tau > 2^{\kappa _s} \right) \lesssim \Vert f\Vert _{\ell ^p(\mathbb {Z}^\Gamma )} \end{aligned}$$
(4.16)

and

$$\begin{aligned} \big \Vert T_{\mathbb {Z}^\Gamma }[w^s]f\Vert _{\ell ^p(\mathbb {Z}^\Gamma )} \lesssim s^{-\varepsilon } \Vert f\Vert _{\ell ^p(\mathbb {Z}^\Gamma )} \end{aligned}$$
(4.17)

for some \(\varepsilon > 0\).
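To see why (4.16) and (4.17) together control Main Term 2, we sketch the reduction, using only that Fourier multiplier operators compose and writing \(g:=T_{\mathbb {Z}^\Gamma }[w^s]f\) (a name introduced here for illustration). By the factorization of \(\sum _j v_j^s\) displayed before (4.16),

$$\begin{aligned} \sum _{2^{\kappa _s/\tau } \le j \le n} T_{\mathbb {Z}^\Gamma }[v_j^s]f = T_{\mathbb {Z}^\Gamma }\left[ \sum _{b \in \mathbb {Z}^\Gamma } \omega _n^s(\cdot - b/Q_s) \right] g. \end{aligned}$$

Applying (4.16) to \(g\) and then (4.17) to \(f\) bounds Main Term 2 by a constant multiple of \(\Vert g\Vert _{\ell ^p(\mathbb {Z}^\Gamma )}\lesssim s^{-\varepsilon }\Vert f\Vert _{\ell ^p(\mathbb {Z}^\Gamma )}\).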

By the Magyar–Stein–Wainger sampling principle [23, Proposition 2.1] for the oscillation seminorm or the sampling principle for the jumps [26, Theorem 1.7], (4.16) follows from

$$\begin{aligned} \mathcal {S}_{\mathbb {R}^\Gamma }^p(T_{\mathbb {R}^\Gamma }[\omega _n^s]f : n^\tau > 2^{\kappa _s}) \lesssim \Vert f\Vert _{L^p(\mathbb {R}^\Gamma )}. \end{aligned}$$
(4.18)

To prove (4.18), we use that the \(\omega _n^s\) functions are almost telescoping. We define

$$\begin{aligned} \Delta _n^s(\xi ):= \sum _{2^{\kappa _s/\tau } \le j \le n} (\Theta _{N_j} - \Theta _{N_{j-1}})(\xi ) = (\Theta _{N_n} - \Theta _{N_{2^{\kappa _s/\tau } - 1}})(\xi ). \end{aligned}$$

Then (4.18) follows from

$$\begin{aligned} \mathcal {S}_{\mathbb {R}^\Gamma }^p(T_{\mathbb {R}^\Gamma }[\Delta _n^s]f : n^\tau > 2^{\kappa _s})\lesssim \Vert f\Vert _{L^p(\mathbb {R}^\Gamma )} \end{aligned}$$
(4.19)

since the error term is bounded by

$$\begin{aligned} \sum _{n > 2^{\kappa _s/\tau }} \big \Vert T_{\mathbb {R}^\Gamma }[(\Theta _{N_n} - \Theta _{N_{n-1}})(\eta _{n^\tau }-1)]f\big \Vert _{L^p(\mathbb {R}^\Gamma )}\lesssim \Vert f\Vert _{L^p(\mathbb {R}^\Gamma )} \end{aligned}$$

using Property 2 and interpolation.
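The error term arises from the difference between \(\omega _n^s\) and its telescoping model. Directly from the definitions, we have

$$\begin{aligned} \omega _n^s(\xi ) - \Delta _n^s(\xi ) = \sum _{2^{\kappa _s/\tau } \le j \le n} (\Theta _{N_j} - \Theta _{N_{j-1}})(\xi )\big (\eta _{j^\tau }(\xi )-1\big ). \end{aligned}$$

Since \(V^1\) controls the oscillation/jump seminorms, the contribution of this difference is at most the sum over \(j\) of the \(L^p\) norms of the individual terms, which is the quantity displayed above.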

Finally, by the translation invariance of \(\mathcal {S}_{\mathbb {R}^\Gamma }^p\), (4.19) follows from

$$\begin{aligned} \mathcal {S}_{\mathbb {R}^\Gamma }^p(T_{\mathbb {R}^\Gamma }[\Theta _t]f : t > 0) \lesssim \Vert f\Vert _{L^p(\mathbb {R}^\Gamma )}. \end{aligned}$$
(4.20)

For the jump inequality, the estimate (4.20) was proven in [27, Theorem 1.22, Theorem 1.30] for both \(\Phi _t\) and \(\Psi _t\). For the oscillation inequality, (4.20) was proven in [21, Eq. 3.38] for \(\Phi _t\) and in [34, Theorem 1.9] for \(\Psi _t\). This concludes the proof of (4.16).

For (4.17), we note by (3.5) that

$$\begin{aligned} \big \Vert T_{\mathbb {Z}^\Gamma }[w^s]f\Vert _{\ell ^2(\mathbb {Z}^\Gamma )} \lesssim s^{-\delta } \Vert f\Vert _{\ell ^2(\mathbb {Z}^\Gamma )}. \end{aligned}$$
(4.21)

On \(\ell ^{p_0}\), we start by splitting

$$\begin{aligned} w^s = \Pi ^s\mathfrak {m}_{J_s} + (w^s - \Pi ^s\mathfrak {m}_{J_s}), \end{aligned}$$

where \(J_s = \lfloor 2^{2^{\kappa _s} - 3 \cdot 2^{\kappa _s \chi }} \rfloor \). By Theorem 5, we have

$$\begin{aligned} \big \Vert T_{\mathbb {Z}^\Gamma }[\Pi ^s\mathfrak {m}_{J_s}]f\Vert _{\ell ^{p_0}(\mathbb {Z}^\Gamma )} \lesssim \Vert f\Vert _{\ell ^{p_0}(\mathbb {Z}^\Gamma )}. \end{aligned}$$
(4.22)

Let \(p_{00} \in (1,\infty )\). Then

$$\begin{aligned} \big \Vert T_{\mathbb {Z}^\Gamma }[w^s - \Pi ^s\mathfrak {m}_{J_s}]f\Vert _{\ell ^{p_{00}}(\mathbb {Z}^\Gamma )} \lesssim e^{(|\Gamma |+1)s^\varrho } \Vert f\Vert _{\ell ^{p_{00}}(\mathbb {Z}^\Gamma )} \end{aligned}$$
(4.23)

by property (i) from Theorem 5. Therefore, it suffices to show that

$$\begin{aligned} \big \Vert T_{\mathbb {Z}^\Gamma }[w^s - \Pi ^s\mathfrak {m}_{J_s}]f\Vert _{\ell ^2(\mathbb {Z}^\Gamma )} \lesssim 2^{-\chi s^{2\varrho }} \Vert f\Vert _{\ell ^2(\mathbb {Z}^\Gamma )} \end{aligned}$$
(4.24)

since interpolating (4.24) with (4.23) for an appropriate choice of \(p_{00}\) gives

$$\begin{aligned} \big \Vert T_{\mathbb {Z}^\Gamma }[w^s - \Pi ^s\mathfrak {m}_{J_s}]f\Vert _{\ell ^{p_0}(\mathbb {Z}^\Gamma )} \lesssim \Vert f\Vert _{\ell ^{p_0}(\mathbb {Z}^\Gamma )}, \end{aligned}$$

combining this with (4.22) gives

$$\begin{aligned} \big \Vert T_{\mathbb {Z}^\Gamma }[w^s]f\Vert _{\ell ^{p_0}(\mathbb {Z}^\Gamma )} \lesssim \Vert f\Vert _{\ell ^{p_0}(\mathbb {Z}^\Gamma )}, \end{aligned}$$

and interpolating the above inequality with (4.21) completes the proof of (4.17) and, thereby, that of (4.7). The proof of (4.24) proceeds similarly to the proof of (4.11). For \(\xi \) in the support of \(\tilde{\eta }_{2^{\kappa _s}}(\xi - a/q)\), we have

$$\begin{aligned} |\xi _\gamma - a_\gamma /q| \le \frac{1}{16|\Gamma |} 2^{-2^{\kappa _s} |\gamma |} 2^{2^{\kappa _s \chi }} \le 2^{-2^{\kappa _s} |\gamma |} 2^{2^{\kappa _s \chi }}\le J_s^{-|\gamma |} 2^{-2^{\kappa _s \chi }} \end{aligned}$$

for all \(\gamma \in \Gamma \). By the triangle inequality, we have

$$\begin{aligned}{} & {} |G(a/q) - \mathfrak {m}_{J_s}(\xi )| \le |G(a/q) - G(a/q) \Phi _{J_s}(\xi - a/q)| \\{} & {} \quad + |G(a/q) \Phi _{J_s}(\xi - a/q) - \mathfrak {m}_{J_s}(\xi )|. \end{aligned}$$

For the first term, we use (3.5) and the mean value theorem to write

$$\begin{aligned} |G(a/q) - G(a/q) \Phi _{J_s}(\xi - a/q)| \le q^{-\delta } \big |J_s^A (\xi - a/q)\big |_{\infty } \le q^{-\delta } 2^{-2^{\kappa _s \chi }} \lesssim 2^{-\chi s^{2\varrho }}. \end{aligned}$$

For the second term, we use that

$$\begin{aligned} |\xi _\gamma - a_\gamma /q| \le J_s^{-|\gamma |} 2^{-2^{\kappa _s \chi }} \le J_s^{-|\gamma |} e^{s^{\varrho }} =: J_s^{-|\gamma |} L_3 \end{aligned}$$

and \(q \le L_3 \le \exp \Big (\sqrt{ \log J_s}\Big ) (\log J_s)^{-1}\) to apply [37, Lemma 3] with \(\alpha = 1\), \(N = J_s\), and \(L = L_3\). This gives

$$\begin{aligned} |G(a/q) \Phi _{J_s}(\xi - a/q) - \mathfrak {m}_{J_s}(\xi )| \lesssim \log (J_s)^{-1} \lesssim (2^{\kappa _s} - 3 \cdot 2^{\kappa _s \chi })^{-1} \lesssim 2^{-\chi \kappa _s} \lesssim 2^{-\chi s^{2\varrho }}, \end{aligned}$$

completing the proof.
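We briefly sketch why the interpolation of (4.23) with (4.24) produces a bound that is uniform in \(s\). Assume (this is our reading of "an appropriate choice of \(p_{00}\)") that \(p_0\) lies strictly between 2 and \(p_{00}\), so that \(1/p_0=\theta /2+(1-\theta )/p_{00}\) for some \(\theta \in (0,1)\). Then the Riesz–Thorin theorem gives

$$\begin{aligned} \big \Vert T_{\mathbb {Z}^\Gamma }[w^s - \Pi ^s\mathfrak {m}_{J_s}]\big \Vert _{\ell ^{p_0}\rightarrow \ell ^{p_0}} \lesssim 2^{-\theta \chi s^{2\varrho }}\, e^{(1-\theta )(|\Gamma |+1)s^\varrho } \lesssim 1, \end{aligned}$$

since the exponent \(-\theta \chi (\log 2) s^{2\varrho }+(1-\theta )(|\Gamma |+1)s^{\varrho }\) is a quadratic polynomial in \(s^\varrho \) with negative leading coefficient and is therefore bounded above uniformly in \(s\ge 1\).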

5 Remarks

As a simple consequence of our results, we can prove the convergence of Wiener–Wintner type averages. This result may be known, but we have not found it in the literature in the generality presented here.

Let \((X,\mu )\) be a measure space endowed with a measure preserving transformation \(T:X\rightarrow X\) and let

$$\begin{aligned} R(x)=a_m x^m+a_{m-1}x^{m-1}+\cdots +a_1x+a_0,\quad x\in \mathbb {R}, \end{aligned}$$

be a polynomial with real coefficients. Moreover, let \(P:\mathbb {Z}\rightarrow \mathbb {Z}\) be a polynomial with integer coefficients such that \(P(0)=0\). For \(p\in (1,\infty )\), the Wiener–Wintner type averages

$$\begin{aligned} \frac{1}{2N+1}\sum _{n=-N}^N f\left( T^{P(n)}x\right) \varvec{e}\big (R(n)\big ) \end{aligned}$$
(5.1)

converge \(\mu \)-almost everywhere for any \(f\in L^p(X,\mu )\). According to Assani [1, p. 179], the convergence of the averages (5.1) in the case when \(\deg P\ge 2\) is known only for \(R\equiv 0\). However, in [10, Theorem 1.9], the authors have established the convergence in the case when \(P(n)=n^2\) and \(R(n)=\theta n\) for any \(\theta \in \mathbb {R}.\)

Let us show how to deduce the convergence of the averages (5.1) from Corollary 2. Clearly, we may assume that \(R(0)=0\). We consider the measure space \((Y,\nu )\) where \(Y:=X\times \mathbb {T}\), \(\nu :=\mu \times \lambda \), and \(\lambda \) is the normalized Lebesgue measure on \(\mathbb {T}\). We equip the space \((Y,\nu )\) with the family of measure preserving commuting transformations \(S_1, S_2,\ldots ,S_m,S_{m+1}:X\times \mathbb {T}\rightarrow X\times \mathbb {T}\) where, for \(j=1,\ldots ,m\), we put \(S_j:=\textrm{Id}\times D_j\) with

$$\begin{aligned} D_j(\xi ):=\varvec{e}(a_j)\xi \end{aligned}$$

being a rotation on \(\mathbb {T}\), and \(S_{m+1}:=T\times \textrm{Id}.\) We consider the following polynomial mapping

$$\begin{aligned} \mathcal {P}(n):=(n^1,n^2,\ldots ,n^m,P(n)):\mathbb {Z}\rightarrow \mathbb {Z}^{m+1}. \end{aligned}$$

By Corollary 2, we know that the averages

$$\begin{aligned} \frac{1}{2N+1}\sum _{n=-N}^N h\Big (S_1^{n^1}\cdots S_m^{n^m}S_{m+1}^{P(n)}y\Big ),\quad y\in X\times \mathbb {T}\end{aligned}$$
(5.2)

converge \(\nu \)-almost everywhere for any \(h\in L^p(Y,\nu )\). If, for \(f\in L^p(X,\mu )\), we consider the function \(h(y):=f(x)\xi \) for \(y=(x,\xi )\in X\times \mathbb {T}\), then we see that the convergence of the averages (5.1) follows from the convergence of the averages (5.2).
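The deduction can be made explicit by a direct computation. In this sketch we identify \(\mathbb {T}\) with the unit circle in \(\mathbb {C}\), so that \(|\xi |=1\), and we write \(y=(x,\xi )\) and \(R(n)=a_1n+\cdots +a_mn^m\), recalling that \(R(0)=0\). Since \(S_j^{n^j}y=(x,\varvec{e}(a_jn^j)\xi )\) for \(j=1,\ldots ,m\) and \(S_{m+1}^{P(n)}y=(T^{P(n)}x,\xi )\), we obtain

$$\begin{aligned} h\Big (S_1^{n^1}\cdots S_m^{n^m}S_{m+1}^{P(n)}y\Big ) = f\left( T^{P(n)}x\right) \varvec{e}\big (a_1n+\cdots +a_mn^m\big )\xi = f\left( T^{P(n)}x\right) \varvec{e}\big (R(n)\big )\xi . \end{aligned}$$

Hence the average (5.2) at \(y=(x,\xi )\) equals \(\xi \) times the average (5.1) at \(x\). Because \(|\xi |=1\), the two converge simultaneously, and by Fubini's theorem \(\nu \)-almost everywhere convergence of (5.2) yields convergence of (5.1) for \(\mu \)-almost every \(x\).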

The procedure described above can be extended to obtain that, for \(\mathcal {P}\) being a polynomial mapping of the form (1.1) and \(\mathcal {R}:\mathbb {R}^k\rightarrow \mathbb {R}\) being a polynomial with real coefficients, the averages

$$\begin{aligned}{} & {} \mathcal {A}_{t}^{\mathcal {P},\mathcal {R},k',k''}f(x)\\{} & {} \quad :=\frac{1}{\vartheta _\Omega (t)}\sum _{(n,p)\in \mathbb {Z}^{k'} \times (\pm \mathbb {P})^{k''}}f\left( S_1^{\mathcal {P}_1(n,p)}\cdots S_d^{\mathcal {P}_d(n,p)}x\right) \mathbbm {1}_{\Omega _t}(n,p) {{\varvec{e}}}\big (\mathcal {R}(n,p)\big )\left( \prod _{i=1}^{k''}\log |p_i|\right) \end{aligned}$$

and

$$\begin{aligned}{} & {} \mathcal {H}_t^{\mathcal {P},\mathcal {R},k',k''}f(x)\\{} & {} \quad :=\sum _{(n,p)\in \mathbb {Z}^{k'}\times (\pm \mathbb {P})^{k''}}f\left( S_1^{\mathcal {P}_1(n,p)}\cdots S_d^{\mathcal {P}_d(n,p)}x\right) K(n,p)\mathbbm {1}_{\Omega _t}(n,p){{\varvec{e}}}\big (\mathcal {R}(n,p)\big )\left( \prod _{i=1}^{k''}\log |p_i|\right) \end{aligned}$$

converge \(\mu \)-almost everywhere for any \(f\in L^p(X,\mu )\) with \(p\in (1,\infty )\). Moreover, we can deduce that the analogue of Corollary 2 holds for \(\mathcal {A}_{t}^{\mathcal {P},\mathcal {R},k',k''}\) and \(\mathcal {H}_t^{\mathcal {P},\mathcal {R},k',k''}\).

Unfortunately, we are not able to prove the Wiener–Wintner theorem for the averages \(\mathcal {A}_{t}^{\mathcal {P},\mathcal {R},k',k''}\) and \(\mathcal {H}_t^{\mathcal {P},\mathcal {R},k',k''}\). In our case, that would mean showing that, for any \(M\in \mathbb {N}\), there is a subset of X of full measure on which the convergence holds regardless of the choice of polynomial \(\mathcal {R}\) with \(\deg \mathcal {R}\le M\).

It is an interesting question whether the Wiener–Wintner theorem can be deduced from the inequality

$$\begin{aligned} \sup _{N\in \mathbb {N}}\sup _{I\in \mathfrak {S}_N(\mathbb {R}_+)}\Big \Vert O_{I,N}^2(\mathcal {A}_t^{\mathcal {P},\mathcal {R},k',k''} f:t>0)\Big \Vert _{L^p(X,\mu )}\le C_{p,d,k,\deg \mathcal {P},\deg \mathcal {R}}\Vert f\Vert _{L^p(X,\mu )}. \end{aligned}$$
(5.3)

This question is motivated by the fact that the constant in (5.3) depends only on the degree of \(\mathcal {R}\) and not its coefficients. We hope to investigate this problem in the near future.