1 Introduction

Fix the degree \(k\ge 2\) and dimension \(d\), positive integers such that \(d> k\). As in [14], we define the arithmetic \(k\)-sphere of radius \(r\) in \(d\) dimensions as

$$\begin{aligned} S^{k,d}_{r}:= \left\{ m\in \mathbb {Z} ^d{:}\,\sum _{i=1}^{d} |m_i|^k=r^k\right\} . \end{aligned}$$
(1)

These are the integer points on the families of varieties arising in Waring’s problem. The \(k\)-sphere of radius \(r\) contains \(N_{k,d}(r) = \# S^{k,d}_{r}\) integer points; \(S^{k,d}_{r}\) can be non-empty only when \(r^k\in \mathbb {N} \). We denote the set of positive radii \(r\) such that \(S^{k,d}_{r}\) is not empty by \(\mathcal {R}_{k,d}\). For a function \(f{:}\,\mathbb {Z} ^d\rightarrow \mathbb {C} \) and \(r\in \mathcal {R}_{k,d}\), we introduce the \(k\)-spherical averages,

$$\begin{aligned} A_{r}f(x) = \frac{1}{N_{k,d}(r)} \sum _{y \in S^{k,d}_{r}} f(x-y) \end{aligned}$$
(2)

for each \(x \in \mathbb {Z} ^d\). And for a subsequence \(\mathcal {R}\subset \mathcal {R}_{k,d}\), we introduce the maximal function over \(\mathcal {R}\) defined pointwise by

$$\begin{aligned} M_\mathcal {R}f= \sup _{r\in \mathcal {R}} \left|A_{r}f \right|. \end{aligned}$$
(3)
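As a concrete illustration of definitions (1) to (3), here is a hand-worked instance chosen only for illustration (it is not taken from [14]). Take \(k=3\), \(d=4\) and \(r=2^{1/3}\), so that \(r^k=2\). Since \(|m_i|^3\in \{0,1,8,\dots \}\), a solution must have exactly two coordinates equal to \(\pm 1\) and the remaining two equal to 0. Writing \(e_1,\dots ,e_4\) for the standard basis vectors of \(\mathbb {Z} ^4\) (notation assumed for this example),

$$\begin{aligned} S^{3,4}_{2^{1/3}}&= \left\{ \epsilon _ie_i+\epsilon _je_j{:}\,1\le i<j\le 4,\ \epsilon _i,\epsilon _j\in \{\pm 1\}\right\} , \\ N_{3,4}(2^{1/3})&= \binom{4}{2}\cdot 2^2 = 24, \\ A_{2^{1/3}}f(x)&= \frac{1}{24} \sum _{1\le i<j\le 4}\ \sum _{\epsilon _i,\epsilon _j\in \{\pm 1\}} f(x-\epsilon _ie_i-\epsilon _je_j). \end{aligned}$$

In particular \(2^{1/3}\in \mathcal {R}_{3,4}\) even though it is not an integer, which is why the radii are indexed by the condition \(r^k\in \mathbb {N} \).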

These maximal functions are the arithmetic analogues of continuous maximal functions over \(k\)-spheres in Euclidean space. In the continuous setting, maximal functions associated to compact convex hypersurfaces are bounded on a range of \(L^p(\mathbb {R} ^d)\) spaces depending on the geometry of the hypersurface and dimensional properties of the collection of radii; see [2, 4, 9, 21, 23, 24, 25]. In particular, we were motivated by the results of [6, 7, 10, 26].

1.1 Previous results

Motivated by this Euclidean phenomenon, in [17], Magyar initiated the study of arithmetic \(k\)-spherical maximal functions and proved that the dyadic versions of the maximal operator in (3) are uniformly bounded for a range of \(\ell ^p(\mathbb {Z} ^d)\) spaces depending on the degree and dimension. Subsequently, Magyar et al. [20] extended Magyar’s results to the full maximal function for degree \(k= 2\), proving the sharp range of estimates up to the endpoint. The best result to date is due to Ionescu [16], who refined Magyar–Stein–Wainger’s result to a restricted weak-type result at the endpoint \(p=\frac{d}{d-2}\) by using ideas of Bourgain from [3].

Ionescu

If \(k=2\) and \(d\ge 5\), then \(M_{\mathcal {R}_{k,d}}\) is bounded from \(\ell ^{\frac{d}{d-2},1}_{rest}(\mathbb {Z} ^d)\) to \(\ell ^{\frac{d}{d-2},\infty }(\mathbb {Z} ^d)\).

These results are sharp, except possibly for removing the restricted assumption in Ionescu’s result.

In another direction, Magyar [18] extended the results of Magyar–Stein–Wainger to positive definite, nondegenerate, homogeneous integral forms, proving boundedness of the corresponding maximal operator on \(\ell ^2(\mathbb {Z} ^d)\) and pointwise convergence of their ergodic averages when \(d>(k-1)2^k\); this includes the family of \(k\)-spheres considered here. The motivating problem of this paper is to sharpen Magyar’s results to the full range of \(\ell ^p(\mathbb {Z} ^d)\)-spaces on which \(M_{\mathcal {R}_{k,d}}\) is bounded. Previously, the author improved Magyar’s results for the family of \(k\)-spheres in [14].

Theorem 0

If \(k\ge 3\) and \(d>2k^2(k-1)\), then \(M_{\mathcal {R}_{k,d}}\) is bounded on \(\ell ^p(\mathbb {Z} ^d)\) for \(p>\frac{d}{d- k^2(k-1)}\).

In this paper, we will refine Theorem 0 to a restricted weak-type endpoint result and deduce further bounds for thin sequences.

1.2 Summary of results

For each degree \(k\ge 3\), there is much room to improve Theorem 0 in the range of p and the range of dimension \(d\). The expected range of dimensions is difficult to describe; we refer the reader to Conjectures 1 and 2 of [14]. The range of p is easier to predict in sufficiently high dimensions when \(N_{k,d}(r) \eqsim r^{d-k}\) for all sufficiently large \(r\in \mathcal {R}_{k,d}\). By testing the maximal operator on a Dirac delta function, we see that the maximal operator over \(\mathcal {R}_{k,d}\), \(M_{\mathcal {R}_{k,d}}\), fails to be bounded on \(\ell ^p(\mathbb {Z} ^d)\) for \(p \le \frac{d}{d-k}\). Thus, we predict that \(M_{\mathcal {R}_{k,d}}\) is bounded on \(\ell ^p(\mathbb {Z} ^d)\) for \(p>\frac{d}{d-k}\). Finding the sharp ranges of dimension and \(\ell ^p(\mathbb {Z} ^d)\) is still out of reach of current technology.
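The Dirac delta test mentioned above can be sketched as follows, under the standing assumption of this paragraph that \(N_{k,d}(r) \eqsim r^{d-k}\) for all large \(r\in \mathcal {R}_{k,d}\). The symbols \(\delta _0\) (the indicator function of the origin) and \(|x|_k := (\sum _{i=1}^{d} |x_i|^k)^{1/k}\) are notation introduced only for this sketch. Every \(x\in \mathbb {Z} ^d\setminus \{0\}\) lies on exactly one \(k\)-sphere, namely the one of radius \(|x|_k\), so

$$\begin{aligned} M_{\mathcal {R}_{k,d}}\delta _0(x)&= \frac{1}{N_{k,d}(|x|_k)} \eqsim |x|_k^{-(d-k)} \quad \text {for } |x|_k \text { large}, \\ \left\| M_{\mathcal {R}_{k,d}}\delta _0 \right\| _{\ell ^{p}(\mathbb {Z} ^d)}^p&\gtrsim \sum _{|x|_k \text { large}} |x|_k^{-p(d-k)} = \infty \quad \text {whenever } p(d-k)\le d, \end{aligned}$$

while \(\left\| \delta _0 \right\| _{\ell ^{p}(\mathbb {Z} ^d)}=1\). The divergence follows by comparing the sum with \(\int _{|x|\ge 1} |x|^{-p(d-k)}\,dx\), and the condition \(p(d-k)\le d\) is the same as \(p\le \frac{d}{d-k}\).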

In this paper we improve the range of \(\ell ^p(\mathbb {Z} ^d)\)-boundedness for \(M_\mathcal {R}\), the \(k\)-spherical maximal function over a subset \(\mathcal {R}\subset \mathcal {R}_{k,d}\) when the ‘size of \(\mathcal {R}\)’ is small. Our results are phrased in terms of two hypotheses: \(H_{k}\left( \theta \right) \) and \(MVH_{k}\left( d \right) \) which we describe in this section. Throughout the paper assume that \(\theta \in (0,1)\). Our first hypothesis is a bound on the supremum of exponential sums near rational points and is the following.

Sup Hypothesis

\(H_{k}\left( \theta \right) \) For all \(N \in \mathbb {N} \), suppose that there exist relatively prime integers \( 1 \le a< q\) with \(N \le q \le N^{k-1}\) such that \(\left|t-a/q \right| \le q^{-2}\). Then

$$\begin{aligned} \sum _{n=1}^N e(t n^k+ \xi n) \lesssim N \left( q^{-1} + N^{-1} + qN^{-k}\right) ^\theta \end{aligned}$$

with implicit constants independent of \(\xi \), \(q\) and N.
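In the stated range of \(q\) the right-hand side simplifies. The following quick check uses only \(N \le q \le N^{k-1}\) and is meant as a reading aid rather than as part of the hypothesis:

$$\begin{aligned} q^{-1}&\le N^{-1}, \qquad qN^{-k} \le N^{k-1}N^{-k} = N^{-1}, \\ N \left( q^{-1} + N^{-1} + qN^{-k}\right) ^\theta&\le N \left( 3N^{-1}\right) ^\theta \lesssim N^{1-\theta }. \end{aligned}$$

So, read this way, \(H_{k}\left( \theta \right) \) asserts a power saving of \(N^{-\theta }\) over the trivial bound \(N\), uniformly in \(\xi \).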

This hypothesis differs slightly from the hypothesis in [14]. There, the hypothesis includes a logarithmic loss, essentially replacing our \(H_{k}\left( \theta \right) \) by \(H_{k}\left( \theta -\epsilon \right) \) for all \(\epsilon >0\). Removing this loss allows us to strengthen our results abstractly. In practice, the exponential sum bounds that we can plug into our hypothesis come with a log-loss, so that we can only recover the same bounds for the full maximal function as in [14]. For instance, the recent resolution of the Vinogradov mean value theorems by Wooley [30] and Bourgain et al. [1] and Theorem 5.2 of [27] show that \(H_{k}\left( \theta \right) \) is true for all \(\theta \in (0,1/k[k-1])\). For comparison, when Theorem 0 was proved \(H_{k}\left( \theta \right) \) was known to be true for all \(\theta \in (0,1/2k[k-1])\).

In contrast, our method allows us to obtain new results for lacunary maximal functions for \(k\ge 3\), and more generally for maximal functions over thin subsequences of radii.

Theorem 1

Assume that the degree \(k\ge 3\) and \(H_{k}\left( \theta \right) \) is true for some \(0< \theta < 1/2\). If \(\mathcal {R}\subset \mathcal {R}_{k,d}\) is a subsequence with density-parameter at most \(\delta \), then the associated maximal function \(M_\mathcal {R}\) is bounded from \(\ell ^{p,1}(\mathbb {Z} ^d)\) to \(\ell ^{p,\infty }(\mathbb {Z} ^d) \) for \( p = \max \left\{ \frac{d}{d-k}, 1 + \frac{\delta k}{2(d-k[k+2])+\delta k}, 1+\frac{\delta }{2(d\theta - k)+\delta } \right\} \) and \( d> \max {\{ k(k+2), k/\theta \}}\).

See Sect. 4 for the definition of density-parameter. The following corollary obtains an improved range of \(\ell ^p(\mathbb {Z} ^d)\) spaces for the maximal function over a subsequence below a critical density. In particular, it holds for lacunary sequences, which we define as infinite subsequences \(\mathcal {R}\subset \mathcal {R}_{k,d}\) with density 0.

Corollary 1

Assume that \(H_{k}\left( \theta \right) \) is true for some \(0< \theta < 1/2\). If \(\mathcal {R}\subset \mathcal {R}_{k,d}\) is a subsequence with density-parameter at most \( \delta \le \min \{ \frac{2 (d-k[k+2])}{d-2k}, \frac{2 k(d\theta - k)}{d-2k} \} \), then the associated maximal function, \(M_\mathcal {R}\) is restricted weak-type \(\left( \frac{d}{d-k},\frac{d}{d-k} \right) \) for \( d> \max {\{ k(k+2), k/\theta \}}\) .
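The thresholds on \(\delta \) in Corollary 1 are precisely the conditions under which the first term in the maximum defining \(p\) in Theorem 1 dominates the other two. A sketch of the arithmetic, using \(\frac{d}{d-k} = 1+\frac{k}{d-k}\) and the fact that all denominators are positive when \(d> \max {\{ k(k+2), k/\theta \}}\) (so in particular \(d>2k\)):

$$\begin{aligned} \frac{\delta k}{2(d-k[k+2])+\delta k} \le \frac{k}{d-k}&\iff \delta (d-k) \le 2(d-k[k+2])+\delta k \iff \delta \le \frac{2 (d-k[k+2])}{d-2k}, \\ \frac{\delta }{2(d\theta - k)+\delta } \le \frac{k}{d-k}&\iff \delta (d-k) \le 2k(d\theta - k)+\delta k \iff \delta \le \frac{2 k(d\theta - k)}{d-2k}. \end{aligned}$$

When both inequalities hold, the exponent in Theorem 1 reduces to \(p=\frac{d}{d-k}\), which is the endpoint appearing in Corollary 1.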

Remark 1

For any degree \(k\ge 2\) it was conjectured that the maximal function \(M_\mathcal {R}\) is bounded on \(\ell ^p(\mathbb {Z} ^d)\) for all \(1<p \le \infty \), and possibly weak-type (1,1), for any lacunary subsequence \(\mathcal {R}\subset \mathcal {R}_{k,d}\). This was disproven by Zienkiewicz (personal communication), who demonstrated that there are arbitrarily thin infinite sequences \(\mathcal {R}\subset \mathcal {R}_{k,d}\) such that when \(k= 2\), \(M_\mathcal {R}\) fails to be bounded on \(\ell ^p(\mathbb {Z} ^d)\) for \(p < \frac{d}{d-1}\). These examples extend to higher degrees \(k\ge 2\). On the other hand, when \(k=2\) the author recently showed in [15] that the maximal function \(M_\mathcal {R}\) over the super-lacunary sequence \(\mathcal {R}:= \left\{ r_j > 0{:}\,r_j^2 = 1+\prod _{i=1}^{h(j)} \mathfrak {p}_i\right\} \), where \(h(j) := 2^{j^v}\) for some \(v>1\) and \(\mathfrak {p}_i\) is the \(i^{th}\) prime, is bounded on \(\ell ^{\frac{d}{d-2}}\).

The proof of Theorem 1 follows Ionescu’s argument in [16], which itself is based on Bourgain’s method for the continuous analogues of our averages in [3]. Bourgain’s strategy is to decompose our operator into two sublinear pieces, one with a good bound on \(\ell ^2\) and the other with a bad bound on \(\ell ^1\), and arbitrage these for an improvement in the middle. Ionescu decomposes the spherical operators into five pieces and optimizes their bounds altogether. Our argument is a variant of Ionescu’s: it partitions his argument into two steps. The first step, treated in Sect. 3, uses Bourgain’s strategy to prove the restricted weak-type bound for the main term at the endpoint \(p=\frac{d}{d-k}\). The second step, treated in Sect. 4, combines the result of the first step with Bourgain’s strategy applied to the error term. Our use of the union bound appears to be novel in our variant of Bourgain’s method. Despite its simplicity, this application of the union bound is efficient: it allows us to deduce our results for thin subsequences, and it appears to have been overlooked in the literature on continuous lacunary spherical averages. In particular, one can give simpler proofs than those of Calderón and Coifman–Weiss that the lacunary spherical maximal function maps \(L^p(\mathbb {R} ^d)\) to itself for \(1 < p \le \infty \).

The main novelty in our paper is in the Approximation Formula. By exploiting Vinogradov’s mean value theorems and conjectures, we obtain a new Approximation Formula which controls the error term associated to an individual average as opposed to a dyadic range as in the Approximation Lemma in [14]. Again, using Bourgain’s method à la Ionescu and the union bound, our next results improve the range of boundedness for lacunary \(k\)-spherical maximal functions in terms of dimension by a factor of the degree \(k\). As in Theorem 1, we would like our theorems to isolate the precise arithmetic input that is needed. With this in mind, our next results are stated in terms of a mean value hypothesis \(MVH_{k}\left( d \right) \) motivated by Waring’s problem.

Mean Value Hypothesis

\(MVH_{k}\left( d \right) \) For a fixed degree \(k\ge 3\), we will say that \(MVH_{k}\left( d \right) \) is true for some dimension \(d\in \mathbb {N} \) with \(d> k\) if

$$\begin{aligned} \# \left\{ m_1, \dots , m_d; n_1, \dots , n_d\in [r]{:}\,\sum _{i=1}^d|n_i|^k= \sum _{i=1}^d|m_i|^k\right\} \lesssim r^{2d-k} \end{aligned}$$
(4)

for all \(r\in \mathcal {R}_{k,d}\) as \(r\rightarrow \infty \) where the implicit constants are assumed to be independent of \(r\in \mathcal {R}_{k,d}\).

Our next theorem exploits our Mean Value Hypothesis.

Theorem 2

Assume that \(MVH_{k}\left( s \right) \) is true for some \(s> k\) and that \(H_{k}\left( \theta \right) \) is true for some \(\theta \in (0,1/2)\). If \(\mathcal {R}\) is a subsequence of \(\mathcal {R}_{k,d}\) with density-parameter at most \(\delta \le \theta [d-2s]\), then \(M_\mathcal {R}\) is bounded from \(\ell ^{p,1}(\mathbb {Z} ^d)\) to \(\ell ^{p,\infty }(\mathbb {Z} ^d)\) for \( p = \max \{ \frac{d}{d-k}, 1 + \frac{\delta k}{2(d-k[k+2])+\delta k}, 1+\frac{\delta }{2\theta [d-2s] - \delta } \} \) and \(d> \max \{2s, k(k+2)\}\).

1.3 Outline of the paper

The structure of the paper is as follows. In Sect. 2 we recall the (Dyadic) Approximation Formula from [14], which decomposes our averages into a main term and an error term. We improve on the bounds for the error term of a single average. This improvement of the error term will be used in Sect. 4. In Sect. 3, we prove Theorem 3, which says that the maximal function for the main term is restricted weak-type at the endpoint \(\frac{d}{d-k}\). The proof is very similar to that in [16]; as such, we assume the reader’s familiarity with the Magyar–Stein–Wainger transference principle and Bourgain’s lemma in [16]. In Sect. 4, we prove Theorems 1 and 2. We conclude the paper with Sect. 5, wherein we give explicit estimates by connecting our hypotheses to Waring’s problem and the recent resolution of the Vinogradov mean value conjectures by Bourgain–Demeter–Guth and Wooley [1, 30].

1.4 Notations

We use the same notations outlined in the previous paper [14]. We recall these here for the reader’s convenience. Here and throughout, \(e \left( {t} \right) \) will denote the character \(e^{2 \pi i t}\) for \(t \in \mathbb {R} , \mathbb {Z} /q\mathbb {Z} \) or \(\mathbb {T} \). The torus \(\mathbb {T} ^d:= (\mathbb {R} /\mathbb {Z} )^d\) is identified with the cube \([-1/2,1/2]^d\subset \mathbb {R} ^d\). For two functions \(f\) and \(g\), we write \(f \lesssim g\) if \(\left|f(x) \right| \le C \left|g(x) \right|\) for some constant \(C>0\). We say that \(f\) and \(g\) are comparable, written \(f \eqsim g\), if \(f \lesssim g\) and \(g \lesssim f\). All implicit constants throughout the paper may depend on dimension \(d\) and degree \(k\). We will often identify \(\mathbb {Z} /q\mathbb {Z} \) with the set \(\left\{ 1, \dots , q \right\} \), and \((\mathbb {Z} /q\mathbb {Z} )^\times \), the group of units in \(\mathbb {Z} /q\mathbb {Z} \), will also be regarded as a subset of \(\left\{ 1, \dots , q \right\} \). For a set X, we denote its indicator function by \(\mathbf {1}_{X}\).

There are three Fourier transforms floating around. To distinguish these, if \(f: \mathbb {R} ^d\rightarrow \mathbb {C} \), then we define its Fourier transform by \(\widetilde{f}(\xi ) := \int _{\mathbb {R} ^d} f(x) e(x \cdot \xi ) dx\) for \(\xi \in \mathbb {R} ^d\); if \(f: \mathbb {T} ^d\rightarrow \mathbb {C} \), then we define its Fourier transform by \(\widehat{f}(m) := \int _{\mathbb {T} ^d} f(x) e(-m\cdot x) dx\) for \(m\in \mathbb {Z} ^d\); and if \(f:\mathbb {Z} ^d\rightarrow \mathbb {C} \), then we define its Fourier transform by \(\widehat{f}(\xi ) := \sum _{m\in \mathbb {Z} ^d} f(m) e(m \cdot \xi )\) for \(\xi \in \mathbb {T} ^d\).

2 The Approximation Formulas

A crucial insight of Magyar–Stein–Wainger’s proof of the boundedness of the discrete spherical maximal function in [20] is their approximation formula. Magyar generalized their approximation formula in [18] for a class of forms including the \(k\)-spheres here. The author sharpened Magyar’s result for \(k\)-spheres in [14] using a variant of Hypothesis \(H_{k}\left( \theta \right) \) that included a log-loss. In this section we summarize the decomposition of the \(k\)-spherical measure and its bounds from the circle method approximation. We will begin by recalling the (Dyadic) Approximation Formula from [14] which will be used in Theorem 1. Subsequently we improve the Dyadic Approximation Formula for a single average in the Single Approximation Formula below. Both Approximation Formulas rely on bounds for exponential sums and oscillatory integrals. The Dyadic Approximation Formula makes use of our Sup Hypothesis \(H_{k}\left( \theta \right) \) while the Single Approximation Formula additionally makes use of our Mean Value Hypothesis \(MVH_{k}\left( d \right) \). The necessary bounds for the Fourier transform of the continuous \(k\)-spherical surface measures are significantly better than the analogous bounds for exponential sums. Since these bounds are implicit in the Approximation Formula, we do not recall them here; instead we refer the vigilant reader to Sect. 3 of [14].

Throughout the entire paper we assume that the dimension \(d\) is sufficiently large so that \( N_{k,d}(r) \eqsim r^{d-k} \) and renormalize the averages \(A_{r}\) as

$$\begin{aligned} A_{r}f(m) = \frac{1}{c_{d, k} \cdot r^{d-k}} \sum _{n \in S^{k,d}_{r}} f(m-n) \end{aligned}$$

where \(c_{d, k} := \frac{\Gamma (1+1/k)^d}{\Gamma (d/k)}\) is the volume of the Gelfand–Leray form on \(\mathcal {S}_{r}^{k,d}\). For \(\xi \in \mathbb {T} ^d\), the multiplier \(\widehat{A_{r}}(\xi )\) for the convolution operator \(A_{r}\) is given by \((c_{d, k} \cdot r^{d-k})^{-1} \cdot a_{r}(\xi )\) where

$$\begin{aligned} a_{r}(\xi ) := \sum _{m\in S^{k,d}_{r}} e \left( {m\cdot \xi } \right) . \end{aligned}$$

Note that \(a_{r}(\xi )\) is the Fourier transform of the characteristic function of the set of integer points on the \(k\)-sphere \(S^{k,d}_{r}\). For \(r\in \mathcal {R}_{k,d}\), we have

$$\begin{aligned} a_{r}(\xi )&= \int _{\mathbb {T} } \sum _{||m||_\infty \le r} e \left( {(\left|m \right|^k-r^k)t + m\cdot \xi } \right) \, dt\\&= \int _{\mathbb {T} } e \left( {-r^kt} \right) \prod _{i=1}^{d} \left( \sum _{|m_i| \le r} e \left( {\left|m_i \right|^kt + m_i \xi _i} \right) \right) \, dt \end{aligned}$$

where the first sum is over integer points in a cube of side-length \(2r\) centered at the origin and the second line follows from the tensor product nature of the exponential sum in the first line (see Footnote 1).

The torus \(\mathbb {T} \), commonly identified with the interval [0, 1] via the character \(e(x) := e^{2\pi i x}\), decomposes into a disjoint union of major arcs \(\mathfrak {M}\) and minor arcs \(\mathfrak {m}\), commonly identified as collections of intervals in [0, 1]. This decomposes \(a_{r}\):

$$\begin{aligned} a_{r}(\xi ) = a^{Major}_{r}(\xi ) + a^{minor}_{r}(\xi ) \end{aligned}$$

where

$$\begin{aligned}&a^{Major}_{r}(\xi ) := \int _{\mathfrak {M}} \sum _{||m||_\infty \le r} e \left( {(\left|m \right|^k-r^k)t + m\cdot \xi } \right) \, dt\\&a^{minor }_{r}(\xi ) := \int _{\mathfrak {m}} \sum _{||m||_\infty \le r} e \left( {(\left|m \right|^k-r^k)t + m\cdot \xi } \right) \, dt. \end{aligned}$$

Let \(A_{r}^{Major}\) and \(A_{r}^{minor}\) denote their respective normalized convolution operators. These multipliers are normalized so that

$$\begin{aligned}&\widehat{A_{r}^{Major}} = (c_{d, k} \cdot r^{d-k})^{-1} \cdot a^{Major}_{r}\\&\widehat{A_{r}^{minor}} = (c_{d, k} \cdot r^{d-k})^{-1} \cdot a^{minor}_{r}. \end{aligned}$$

The multiplier corresponding to the major arcs, \(a^{Major}_{r}\), is then approximated by \(r^{d-k} \cdot \widehat{C_r}\) (see Footnote 2). Altogether, the averages decompose as

$$\begin{aligned} A_{r}= C_{r}+ (A_{r}^{Major} - C_{r}) + A_{r}^{minor} \end{aligned}$$

for each \(r\in \mathcal {R}_{k,d}\).

Our main focus of this section is the error term:

$$\begin{aligned} {E}_{r}:= A_{r}- C_{r}= (A_{r}^{Major} - C_{r}) + A_{r}^{minor} \end{aligned}$$

which naturally decomposes into two pieces: \(A_{r}^{Major} - C_{r}\) and \(A_{r}^{minor}\). The Dyadic Major Arc Approximation Lemma (Section 7 of [14]) reveals that we have the following bounds for the major arc piece of our error term:

$$\begin{aligned} \left\| \sup _{R \le r < 2R} \left| A_{r}^{Major}f- C_{r}f \right| \right\| _{\ell ^{2}(\mathbb {Z} ^d)} \lesssim R^{k+2-\frac{d}{k}} \left\| f \right\| _{\ell ^{2}(\mathbb {Z} ^d)} \end{aligned}$$
(5)

Remark 2

The Dyadic Major Arc Approximation Lemma of Sect. 7 of [14] is actually stated with a log-loss; that is, \(\lesssim \) is replaced with an \(\epsilon \)-loss in (5). However, this may be simply removed by replacing Hua’s bound for Gauss sums in the proofs of Lemmas 7.1 and 7.2 in [14] with Steckin’s estimate. See Sect. 3 for an example of this. We do not go into further details here.

We do not improve (5) in this paper as this is not our goal here. Instead our aim is to understand and improve bounds for the minor arc piece \(A_{r}^{minor}\).

2.1 The Approximation Formula: dyadic version

The analysis for the minor arc piece of our error term, \(A_{r}^{minor}\), relies on \( H_{k}\left( \theta \right) \) and proceeds along a different argument from that for our major arc piece. We handle the minor arc error term by Lemma 6.2 from [14]: if \(H_{k}\left( \theta \right) \) is true for some \(\theta \in (0,1) \), then

$$\begin{aligned} \left\| \sup _{R \le r < 2R} \left|A_{r}^{minor}f \right| \right\| _{\ell ^{2}(\mathbb {Z} ^d)} \lesssim R^{k- d\theta } \left\| f \right\| _{\ell ^{2}(\mathbb {Z} ^d)} \end{aligned}$$
(6)

Combining (5) and (6), we obtain the Approximation Formula in [14]:

The Approximation Formula

(dyadic version) If hypothesis \(H_{k}\left( \theta \right) \) is true for some \(\theta \in (0,1)\), then for \(d>\max {\{ k(k+2), k/\theta \}}\) and \(\xi \in \mathbb {T} ^d\),

$$\begin{aligned} \widehat{\sigma _r}(\xi ) = \widehat{C_r}(\xi ) + \widehat{{E}_{r}}(\xi ) \end{aligned}$$
(7)

The error term \(\widehat{{E}_{r}}\) is a multiplier term with convolution operator \({E}_{r}\) satisfying

$$\begin{aligned} \left\| \sup _{R \le r < 2R} |{E}_{r}f| \right\| _{\ell ^{2}(\mathbb {Z} ^d)} \lesssim R^{-\nu } \left\| f \right\| _{\ell ^{2}(\mathbb {Z} ^d)} \end{aligned}$$
(8)

where \( \nu := \min \{ d\theta - k, \frac{d}{k}-(k+2) \} > 0.\)

Our goal in this section is to improve (6) for a single average. This will allow us to improve our range of boundedness for maximal operators over sparse sequences.

2.2 The Approximation Formula for a single average

Our adaptation of the Approximation Formula in [14] for a single average is motivated by Hua’s lemma and the Vinogradov mean value theorems which underlie many advances in Waring’s problem. Recall our Mean Value Hypothesis \(MVH_{k}\left( d \right) \) from the introduction. With this in mind, we now have the following single average \(\ell ^2\) inequality. The reader may wish to compare this to the Main \(\ell ^2\) inequality of [14] which is from the proof of (6.4) on page 204 of [20] or the bottom of page 936 in [18].

Lemma 2.1

(Single average \(\ell ^2\) inequality) If \(T_r\) is an operator with multiplier

$$\begin{aligned} \widehat{T_r}(\xi ) = \beta _r(\xi ) := \int _I \alpha (t,\xi ) e \left( {-t r^k} \right) \; dt \end{aligned}$$

where \( \alpha (t,\xi ){:}\,I \times \mathbb {T} ^d\rightarrow \mathbb {C} \) and I is an interval in [0, 1], then for any \(1 \le p \le \infty \)

$$\begin{aligned} \left\| T_r f \right\| _{\ell ^{2}(\mathbb {Z} ^d)} \le \sup _{\xi \in \mathbb {T} ^d} \left( \int _I \left|\alpha (t,\xi ) \right|^p \; dt \right) ^{1/p} \left\| f \right\| _{\ell ^{2}(\mathbb {Z} ^d)} \end{aligned}$$
(9)

with the standard modification when \(p=\infty \).

Proof

By Plancherel’s theorem, we have the familiar bound

$$\begin{aligned} \left\| T_r f \right\| _{\ell ^{2}(\mathbb {Z} ^d)} \le \sup _{\xi \in \mathbb {T} ^d} \left|\beta _r(\xi ) \right| \cdot \left\| f \right\| _{\ell ^{2}(\mathbb {Z} ^d)} . \end{aligned}$$

Applying Hölder’s inequality, we bound the first factor: for each \(\xi \in \mathbb {T} ^d\),

$$\begin{aligned} \left|\beta _r(\xi ) \right| \le \int _I \left|\alpha (t,\xi ) \right| \; dt \le |I|^{1/p'} \left( \int _I \left|\alpha (t,\xi ) \right|^p \; dt \right) ^{1/p} \le \left( \int _I \left|\alpha (t,\xi ) \right|^p \; dt \right) ^{1/p} \end{aligned}$$

since \(|I| \le 1\). \(\square \)

The bound (9) works best for \(p=1\) when we exploit the tensor product nature of our exponential sums. The following lemma allows us to ignore the possible cancellation arising from the linear phases. The method here is common in number theory and proceeds by using Plancherel’s theorem to rewrite the \(L^p\)-norm as the number of solutions to a system of equations. We note that Hu and Li [11,12,13] recently used this method to study related discrete restriction problems. Since the proof of the following lemma is a standard technique in the circle method, we refer the reader to Chapter 5, Section 5.1 of [27], in particular inequality (5.4), or Chapter 4, Section  2 of [19] for proofs.

We define the following notation for the remainder of this section: if \(\xi \in \mathbb {T} \) and \(t \in I\) for some interval I, let

$$\begin{aligned} \alpha _r(t,\xi ) := \sum _{|m| \le r} e \left( {\left|m \right|^kt + m\xi } \right) . \end{aligned}$$

Lemma 2.2

For \(s\in \mathbb {N} \) and any \(\xi _i \in \mathbb {T} \) with \(i=1, \dots , 2s\),

$$\begin{aligned} {\int _0^1 \prod _{i=1}^{2s} \left|\alpha _r(t,\xi _i) \right| \; dt} \le {\int _0^1 \left|\alpha _r(t,0) \right|^{2s} \; dt}. \end{aligned}$$
(10)

We can now state and prove our Approximation Formula for a single average.

The Approximation Formula

(single average version) If hypothesis \(H_{k}\left( \theta \right) \) is true for some \(\theta \in (0,1)\) and \(MVH_{k}\left( s \right) \) is true for some dimension \(s\in \mathbb {N} \), then for \(d>\max {\{ 2s, k(k+2)\}}\) and \(\xi \in \mathbb {T} ^d\),

$$\begin{aligned} \widehat{\sigma _r}(\xi ) = \widehat{C_r}(\xi ) + \widehat{{E}_{r}}(\xi ) . \end{aligned}$$
(11)

The error term \(\widehat{{E}_{r}}\) is a multiplier with convolution operator \({E}_{r}\) satisfying

$$\begin{aligned} \left\| {E}_{r}f \right\| _{\ell ^{2}(\mathbb {Z} ^d)} \lesssim r^{-\omega } \left\| f \right\| _{\ell ^{2}(\mathbb {Z} ^d)} \end{aligned}$$
(12)

where \( \omega := \min \{ (d-2s)\theta , \frac{d}{k}-(k+2) \} .\)

Remark 3

The Single Approximation Formula yields a power savings of \((d-2s)\theta \) which is weaker than that of \(d\theta - k\), but true for a larger range of dimensions. The catch is that the Single Approximation Formula only allows us to control a single average at a time rather than a dyadic range. Fortunately, this is useful for sparser maximal functions, such as lacunary maximal functions.

Proof

of the Single Approximation Formula. By (5), we only need to study the minor arcs and improve (6) when \(d>2s\). We do so for an individual average by considering \(\left\| A_{r}^{minor}f \right\| _{\ell ^{2}(\mathbb {Z} ^d)}\) for each \(r\in \mathcal {R}_{k,d}\). Plancherel’s theorem reduces our goal to a uniform exponential sum estimate of \( a^{minor}_{r}(\xi ) \) for \(\xi = (\xi _1, \dots , \xi _d) \in \mathbb {T} ^d\). The following is a typical approach for bounding minor arcs in Waring’s problem; see for instance Chapter 3 in [8].

We take \(p=1\) in Lemma 2.1 in order to exploit the tensor product nature of the exponential sums. Since \(d> 2s\),

$$\begin{aligned} \left\| a^{minor}_{r} \right\| _{\ell ^{2}(\mathbb {Z} ^d)}&\le \sup _{\xi \in \mathbb {T} ^d} \left\{ \left( \int _I \left|\prod _{i=1}^d\alpha _r(t,\xi _i) \right| \; dt \right) \right\} \\&\le \sup _{\xi \in \mathbb {T} ^d} \left\{ \left[ \sup _{t \in I} \left\{ \prod _{i=2s+1}^{d} \left|\alpha _r(t,\xi _i) \right| \right\} \right] \cdot \left( \int _I \prod _{i=1}^{2s} \left|\alpha _r(t,\xi _i) \right| \; dt \right) \right\} \\&\le \sup _{\xi \in \mathbb {T} ^d} \left\{ \left[ \sup _{t \in I} \left\{ \prod _{i=2s+1}^{d} \left|\alpha _r(t,\xi _i) \right| \right\} \right] \cdot \left( \int _0^1 \prod _{i=1}^{2s} \left|\alpha _r(t,\xi _i) \right| \; dt \right) \right\} \\&\le \sup _{\xi \in \mathbb {T} ^d} \left\{ \left[ \sup _{t \in I} \left\{ \prod _{i=2s+1}^{d} \left|\alpha _r(t,\xi _i) \right| \right\} \right] \cdot \left( \int _0^1 \prod _{i=1}^{2s} \left|\alpha _r(t,0) \right| \; dt \right) \right\} \end{aligned}$$

where the last line above follows from Lemma 2.2.

Plancherel’s Theorem implies that for any \(s \in \mathbb {N} \)

$$\begin{aligned} \int _0^1 \left|\alpha _r(t,0) \right|^{2s} \; dt = \# \left\{ m_1, \dots , m_s; n_1, \dots , n_s\in [r]{:}\,\sum _{i=1}^s|n_i|^k= \sum _{i=1}^s|m_i|^k\right\} . \end{aligned}$$
(13)
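Identity (13) is the usual orthogonality computation. A sketch: write \(\left|\alpha _r(t,0) \right|^{2s} = \alpha _r(t,0)^s\, \overline{\alpha _r(t,0)}^{\,s}\), expand both powers, and use that \(\int _0^1 e(nt)\, dt\) equals 1 if \(n=0\) and 0 for every nonzero integer \(n\). Here the variables \(m_i, n_i\) are understood to range over the same index set as the sum defining \(\alpha _r(t,0)\):

$$\begin{aligned} \int _0^1 \left|\alpha _r(t,0) \right|^{2s} \; dt&= \sum _{m_1, \dots , m_s} \sum _{n_1, \dots , n_s} \int _0^1 e \left( {t \left[ \sum _{i=1}^s|m_i|^k- \sum _{i=1}^s|n_i|^k\right] } \right) \; dt \\&= \# \left\{ m_1, \dots , m_s; n_1, \dots , n_s{:}\,\sum _{i=1}^s|m_i|^k= \sum _{i=1}^s|n_i|^k\right\} . \end{aligned}$$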

Our mean value hypothesis implies by (13) that

$$\begin{aligned} \int _0^1 \left|\alpha _r(t,0) \right|^{2s} \; dt \lesssim r^{2s-k} . \end{aligned}$$

Therefore,

$$\begin{aligned} \left\| A_r^{minor} \right\| _{\ell ^{2}(\mathbb {Z} ^d)} \lesssim r^{k-d} \cdot r^{2s-k} \cdot r^{(d-2s)(1-\theta )} = r^{-(d-2s)\theta } . \end{aligned}$$
(14)
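The exponents in (14) can be tracked as follows. This is a sketch under an assumption that is not spelled out above, namely that the minor arcs are chosen so that \(H_{k}\left( \theta \right) \) yields \(\sup _{t \in I} \left|\alpha _r(t,\xi _i) \right| \lesssim r^{1-\theta }\) uniformly in \(\xi _i\). The three factors are the normalization \((c_{d, k} \cdot r^{d-k})^{-1}\), the mean value bound for \(2s\) of the coordinates, and the sup bound for the remaining \(d-2s\) coordinates:

$$\begin{aligned} r^{k-d} \cdot r^{2s-k} \cdot \left( r^{1-\theta }\right) ^{d-2s} = r^{(k-d)+(2s-k)+(d-2s)-(d-2s)\theta } = r^{-(d-2s)\theta } . \end{aligned}$$

The first three terms in the exponent cancel, so the only surviving contribution is the power saving \(\theta \) collected from each of the \(d-2s\) coordinates handled by the sup bound.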

\(\square \)

3 The main term is restricted weak-type at the endpoint for \(d> 2 k\)

We delve further into the Approximation Formula by refining estimates for the main term. The multiplier for the main term decomposes into a sum of operators

$$\begin{aligned} \widehat{C_r}= \sum _{q=1}^\infty \sum _{a\in (\mathbb {Z} /q\mathbb {Z} )^\times } e \left( { \frac{ar^k}{q} } \right) \widehat{C_{r}^{a/q}}\end{aligned}$$
(15)

where we have the multipliers

$$\begin{aligned} \widehat{C_{r}^{a/q}}(\xi ) := \sum _{m\in \mathbb {Z} ^d} G(a,q,m) \Psi (q\xi - m) \widetilde{d\sigma _r}(\xi - m/q) \end{aligned}$$
(16)

and

  • \(\Psi \) is a smooth bump function supported in \([-1/4, 1/4]^d\) and equal to 1 in \([-1/8, 1/8]^d\),

  • for \(m\in \mathbb {Z} ^d, q\in \mathbb {N} \) and \(a\in (\mathbb {Z} /q\mathbb {Z} )^\times \)

    $$\begin{aligned} G(a,q,m) := q^{-d} \prod _{i=1}^d\left[ \sum _{b_i \in \mathbb {Z} /q\mathbb {Z} } e \left( {\frac{a b_i^k+ b_i \cdot m_i}{q}} \right) \right] \end{aligned}$$

    is a normalized Gauss sum,

  • \(d\sigma _r\) is the Gelfand–Leray form on \(\mathcal {S}_{r}^{k,d}:= \{x \in \mathbb {R} ^d{:}\,\sum _{i=1}^{d} |x_i|^k=r^k\}\) normalized to be a probability measure whose \(\mathbb {R} ^d\)-Fourier transform is denoted \( \widetilde{d\sigma _r} \) .

Remark 4

The Approximation Formula generalizes the asymptotic formula in Waring’s problem. As such, the main term \(C_{r}\) connects analysis on \(\mathbb {Z} ^d\) with the analysis of \((\mathbb {Z} /q\mathbb {Z} )^d\) and \(\mathbb {R} ^d\). In particular, we will compare \(S^{k,d}_{r}:= \{m\in \mathbb {Z} ^d{:}\,\sum _{i=1}^{d} |m_i|^k=r^k\}\) with its projections mod \(q\) and its embedding in \(\mathcal {S}_{r}^{k,d}:= \{x \in \mathbb {R} ^d{:}\,\sum _{i=1}^{d} |x_i|^k=r^k\}\) through their respective measures. The measure for \(S^{k,d}_{r}\) is the probability measure \(\sigma _r= \frac{1}{N_{k,d}(r)} \mathbf {1}_{S^{k,d}_{r}}\) where \(\mathbf {1}_{S^{k,d}_{r}}\) is the characteristic function of \(S^{k,d}_{r}\), and the measure for \(\mathcal {S}_{r}^{k,d}\) is \(d\sigma _r\) given by the Gelfand–Leray form on \(\mathcal {S}_{r}^{k,d}\). More precisely, we will approximate the \(\mathbb {Z} ^d\)-Fourier transform of the measure \(\sigma _r\) by the \(\mathbb {R} ^d\)-Fourier transform of \(d\sigma _r\) and its projected measure in \((\mathbb {Z} /q\mathbb {Z} )^d\) which are given by the normalized Gauss sums \( G(a,q,m) \).

We have the following lemma for the maximal function of the operators \(C_{r}\). The proof relies on Hua’s bound for Gauss sums, the Bruna–Nagel–Wainger bounds for Fourier transforms of surface measures and Rubio de Francia’s maximal theorem for surface measures. For its proof, see the proof of Lemma 8.1 in [14].

Lemma 3.1

If \(d> 2k+1\) and \(p > \frac{d}{d-k}\), then

$$\begin{aligned} \left\| \sup _{r\in \mathcal {R}_{k,d}} \left|C_{r}f \right| \right\| _{\ell ^{p}(\mathbb {Z} ^d)} \lesssim \left\| f \right\| _{\ell ^{p}(\mathbb {Z} ^d)} . \end{aligned}$$
(17)

The purpose of this section is to refine Lemma 3.1 to a restricted weak-type bound at the endpoint when the dimension is sufficiently large.

Theorem 3

Assume that the degree \(k\ge 3\). If \(d\ge 2 k+1\), then the maximal function \(\sup _{r\in \mathcal {R}_{k,d}} \left|C_{r}f \right|\) is restricted weak-type \(\left( \frac{d}{d-k}, \frac{d}{d-k}\right) \).

Our approach is the same as [16]. In particular, Theorem 3 is easily deduced from the following decomposition lemma.

Lemma 3.2

(Decomposition lemma for the main term) For any fixed \(Q\in \mathbb {N} \) we can decompose each \(k\)-spherical average of \(r\in \mathcal {R}_{k,d}\) into the sum of 2 linear operators:

$$\begin{aligned} C_{r}f(x) = C_{r}^{high} f(x) + C_{r}^{low} f(x). \end{aligned}$$
(18)

such that if \(d>2k\), then the following bounds are satisfied:

$$\begin{aligned}&\left\| \sup _{r\in \mathcal {R}_{k,d}} \left|C_{r}^{high} f \right| \right\| _{\ell ^{2}(\mathbb {Z} ^d)} \lesssim Q^{2-\frac{d}{k}} \left\| f \right\| _{\ell ^{2}(\mathbb {Z} ^d)}&\text {(High frequency estimate)} \end{aligned}$$
(19)
$$\begin{aligned}&\left\| \sup _{r\in \mathcal {R}_{k,d}} \left|C_{r}^{low} f \right| \right\| _{\ell ^{1,\infty }(\mathbb {Z} ^d)} \lesssim Q^2 \left\| f \right\| _{\ell ^{1}(\mathbb {Z} ^d)}&\text {(Low frequency estimate)}. \end{aligned}$$
(20)

First we deduce Theorem 3 from Lemma 3.2. Afterwards, we prove Lemma 3.2.

3.1 Deduction of Theorem 3 from Lemma 3.2

Since Theorem 3 asserts a restricted weak-type bound, we need to show that

$$\begin{aligned} \sup _{\alpha> 0} \alpha ^{\frac{d}{d-k}} \left|\left\{ \sup _{r\in \mathcal {R}_{k,d}} \left|C_{r}f \right|> \alpha \right\} \right| \lesssim \left\| f \right\| _{\ell ^{\frac{d}{d-k}}(\mathbb {Z} ^d)}^{\frac{d}{d-k}} \end{aligned}$$

when \(f = \mathbf {1}_{F}\) is the characteristic function of a subset \(F\) of \(\mathbb {Z} ^d\). To get this bound, we will choose \(Q\) depending on the altitude \(\alpha >0\) through an auxiliary parameter \(R>0\). Suppose for a moment that we can choose our parameters so that

$$\begin{aligned} Q^2&\lesssim R^{k} \end{aligned}$$
(21)
$$\begin{aligned} Q^{2-\frac{d}{k}}&\lesssim R^{k- d/2}. \end{aligned}$$
(22)

Then Lemma 3.2 gives the bounds

$$\begin{aligned}&\left\| \sup _{r\in \mathcal {R}_{k,d}} \left|C_{r}^{low} f \right| \right\| _{\ell ^{1,\infty }(\mathbb {Z} ^d)} \lesssim R^{k} \left\| f \right\| _{\ell ^{1}(\mathbb {Z} ^d)} \end{aligned}$$
(23)
$$\begin{aligned}&\left\| \sup _{r\in \mathcal {R}_{k,d}} \left|C_{r}^{high} f \right| \right\| _{\ell ^{2}(\mathbb {Z} ^d)} \lesssim R^{k- d/2} \left\| f \right\| _{\ell ^{2}(\mathbb {Z} ^d)}. \end{aligned}$$
(24)

Applying (23) and (24) to \(\mathbf {1}_{F}\), the characteristic function of a set \(F\subset \mathbb {Z} ^d\), and using Chebyshev’s inequality for the \(\ell ^2\) bound, we find

$$\begin{aligned} \left|\left\{ \sup _{r\in \mathcal {R}_{k,d}} \left|C_{r} \mathbf {1}_{F} \right|> 2 \alpha \right\} \right|&\le \left|\left\{ \sup _{r\in \mathcal {R}_{k,d}} \left|C_{r}^{low} \mathbf {1}_{F} \right|> \alpha \right\} \right| + \left|\left\{ \sup _{r\in \mathcal {R}_{k,d}} \left|C_{r}^{high} \mathbf {1}_{F} \right| > \alpha \right\} \right| \\&\lesssim \alpha ^{-1} R^{k} \left|F \right| + \alpha ^{-2} R^{2(k- d/2)} \left|F \right| \\&= \left( \alpha ^{-1} R^{k} + \alpha ^{-2} R^{2k- d} \right) \left|F \right|. \end{aligned}$$

We balance the two terms on the right hand side by choosing \(R= \alpha ^{\frac{1}{k- d}}\) so that \(\alpha ^{-1} R^{k} \eqsim \alpha ^{-2} R^{2k- d}\). Plugging this into the above, we find that

$$\begin{aligned} \left|\left\{ \sup _{r\in \mathcal {R}_{k,d}} \left|C_{r} \mathbf {1}_{F} \right| > 2 \alpha \right\} \right| \lesssim \alpha ^{-1-\frac{k}{d- k}} \left|F \right| = \alpha ^{-\frac{d}{d- k}} \left|F \right| \end{aligned}$$

which is the weak-type bound we seek.

It is easy to verify that (21) and (22) hold for \(Q\eqsim R^{k/2}\) provided that \(d> 2 k\). \(\square \)
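The verification can be made explicit. As a sketch, take \(Q = R^{k/2}\) exactly, ignoring the rounding needed to make \(Q\) a natural number (which, for \(R\ge 1\), only affects the implicit constants). Then

$$\begin{aligned} Q^{2} = R^{k} \qquad \text {and} \qquad Q^{2-\frac{d}{k}} = R^{\frac{k}{2}\left( 2-\frac{d}{k} \right) } = R^{k-\frac{d}{2}} , \end{aligned}$$

so (21) and (22) hold with equality for this choice, while the hypothesis \(d>2k\) is what allows Lemma 3.2 to be applied.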

3.2 Proof of the decomposition lemma for the main term (Lemma 3.2)

In this section we outline the proof of Lemma 3.2. The details are similar to the proofs of estimates (2.9) and (2.10) in [16], with the modifications needed for higher degrees made as in [14], which adapts [20]. One important point is that we use Stečkin’s estimate (25) for the Gauss sums \(G(a,q,m)\) rather than Hua’s estimate (see Hua’s bound in [14]); otherwise, we cannot reach the endpoint \(p=\frac{d}{d-k}\). We recall Stečkin’s estimate now.

Stečkin’s estimate [22] If \((a, q)=1\), then

$$\begin{aligned} |G(a,q,m)| \lesssim q^{-\frac{d}{k}} \end{aligned}$$
(25)

uniformly for \(m\in \mathbb {Z} ^d\).

For the operators \(C_{r}^{a/q}\), we have high frequency in two aspects: the modulus \(q\) and the continuous aspect. We decompose \(C_{r}^{a/q}\) into continuous-high and continuous-low frequency multipliers. Let \(0<\Delta <1\), and for each \(r\in \mathcal {R}_{k,d}\) define the multipliers

(26)
(27)

Here for \(t>0\). Fix \(Q>1\). Let

$$\begin{aligned} C_{r}^{low}&= \sum _{q< Q} \sum _{a\in (\mathbb {Z} /q\mathbb {Z} )^\times } C_{r}^{a/q, low}\end{aligned}$$
(28)
$$\begin{aligned} C_{r}^{high}&= \left( \sum _{q< Q} \sum _{a\in (\mathbb {Z} /q\mathbb {Z} )^\times } C_{r}^{a/q, high} \right) + \left( \sum _{q\ge Q} \sum _{a\in (\mathbb {Z} /q\mathbb {Z} )^\times } C_{r}^{a/q} \right) . \end{aligned}$$
(29)

so that we have \(C_{r}= C_{r}^{low}+ C_{r}^{high}\). At the moment, our decomposition depends on \(\Delta \) and \(Q\), but we will soon choose \(\Delta = Q^{-1}\).

Replacing Hua’s bound by Stečkin’s estimate (25) in the proof of Lemma 7.3 in [14], we obtain the following improvement of that lemma.

Lemma 3.3

If \(d> \frac{k}{2}+1\), then for all moduli \(q\in \mathbb {N} \) and \(a\in (\mathbb {Z} /q\mathbb {Z} )^\times \), we have

$$\begin{aligned} \left\| \sup _{r\in \mathcal {R}_{k,d}} \left|C_{r}^{a/q}f \right| \right\| _{\ell ^{2}(\mathbb {Z} ^d)} \lesssim q^{-\frac{d}{k}} \left\| f \right\| _{\ell ^{2}(\mathbb {Z} ^d)} . \end{aligned}$$
(30)

We will apply Lemma 3.3 for all large moduli, \(q\ge Q\), but this lemma is insufficient for small moduli, \(q< Q\). Our next lemma obtains a good \(\ell ^2\) bound for the continuous-high frequency multipliers. As in [16], we use the Magyar–Stein–Wainger transference principle in [20] and Lemma 3 from [5] to show that the maximal function for the high frequency part \(C_{r}^{a/q, high}\) has good \(\ell ^2\) estimates, thanks to the decay of the Fourier transform \(\widetilde{d\sigma _r}\) of the (continuous) surface measure.

Lemma 3.4

If \(d>\frac{k}{2}+1\) and \(0<\Delta <1\), then

$$\begin{aligned} \left\| \sup _{r\in \mathcal {R}_{k,d}} \left|C_{r}^{a/q, high}f \right| \right\| _{\ell ^{2}(\mathbb {Z} ^d)} \lesssim {q}^{-\frac{d}{k}} \cdot \left( q\Delta \right) ^{\frac{d-1}{k}-\frac{1}{2}} \left\| f \right\| _{\ell ^{2}(\mathbb {Z} ^d)} . \end{aligned}$$
(31)

Proof

We apply the Magyar–Stein–Wainger separation trick to separate out the arithmetic and analytic parts: \( C_{r}^{a/q, high}= T^{a/q}_r\circ S^{a/q}\), defined by the multipliers

(32)
(33)

Note that \(S^{a/q}\) does not depend on the radius \(r\in \mathcal {R}_{k,d}\); this implies

(34)

For the arithmetic part, Stečkin’s estimate (25) and Proposition 2.2 of [20] imply that

(35)

For the analytic part, we apply the Magyar–Stein–Wainger transference principle (Proposition 2.1 and Corollary 2.1 in [20]). Define the operator \(U_r\) by the multiplier

which is considered as a multiplier on \(\mathbb {R} ^d\). Then the Magyar–Stein–Wainger transference principle implies

for \(1 \le p < \infty \) with an implicit constant independent of \(q\).

We normalize \(\widetilde{U_r}\) so that it does not depend on \(r\) by observing . We are now in a position to apply Lemma 3 from [5]. Analogously to its use in [16], this tells us that

(36)

where \(\alpha _j := \sup _{|\xi | \eqsim 2^j} |\widetilde{U_1}(\xi )|\) and \(\beta _j := \sup _{|\xi | \eqsim 2^j} |\xi \cdot \nabla \widetilde{U_1}(\xi )|\). It remains to estimate \(\alpha _j\) and \(\beta _j\) for \(j \in \mathbb {Z} \). The support condition implies that \(\alpha _j\) and \(\beta _j\) vanish for \(2^j \le \left( 8 q\Delta \right) ^{-1}\), so we only need to consider \(j\) such that \(\left( 8 q\Delta \right) ^{-1} \le 2^{j}\). For such \(j\), the Bruna–Nagel–Wainger bounds in [2] (see also (1) and (2) in Section 3 of [14]) yield

$$\begin{aligned} \alpha _j \lesssim \left( 1+2^j \right) ^{\frac{1-d}{k}} \; \; \text { and } \; \; \beta _j \lesssim 2^j \cdot \left( 1+2^j \right) ^{\frac{1-d}{k}} . \end{aligned}$$

Applying these bounds, we sum over \(\left( 8 q\Delta \right) ^{-1} \le 2^j\) to conclude the lemma. \(\square \)
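To see where the exponent in (31) comes from, here is a sketch of the final summation. It assumes that (36) bounds the \(\ell ^2\) norm of the maximal function by a constant multiple of \(\sum _j \alpha _j^{1/2}(\alpha _j+\beta _j)^{1/2}\), which is the usual form of Lemma 3 from [5], and it is carried out in the regime \(q\Delta \le 1\), where only indices with \(2^j \gtrsim 1\) contribute:

$$\begin{aligned} \sum _{2^j \ge (8 q\Delta )^{-1}} \alpha _j^{\frac{1}{2}} \left( \alpha _j+\beta _j \right) ^{\frac{1}{2}} \lesssim \sum _{2^j \ge (8 q\Delta )^{-1}} 2^{j\left( \frac{1}{2}-\frac{d-1}{k} \right) } \lesssim \left( q\Delta \right) ^{\frac{d-1}{k}-\frac{1}{2}} . \end{aligned}$$

The geometric series converges precisely because \(d>\frac{k}{2}+1\); combining it with the factor \(q^{-\frac{d}{k}}\) contributed by Stečkin’s estimate (25) gives the right hand side of (31).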

Each of the remaining low frequency parts \(C_{r}^{a/q, low}\) is comparable to a discrete Hardy–Littlewood average, and thus its maximal function is controlled by the Hardy–Littlewood maximal function with a bound that depends on the modulus \(q\).

Proposition 3.1

Let \(M_*\) be the discrete Hardy–Littlewood maximal function over cubes. If \(d\ge 2\), \(k\ge 2\) and \(0<\Delta <1\), then

$$\begin{aligned} {\sup _{r\in \mathcal {R}_{k,d}} \left|C_{r}^{a/q, low}f(x) \right|} \lesssim \left( q\Delta \right) ^{-1} M_*f(x) \end{aligned}$$
(37)

for all \(x \in \mathbb {Z} ^d\).

Proof

Note that for \(r\Delta \ge 1\),

is a smooth function supported in \(\left[ -1/(8 \Delta r), 1/(8 \Delta r) \right] ^{d}\). This implies

for \(r\Delta \ge 1\).

A straightforward computation, see page 1415 of [16], reveals that the kernel \(K_{r}^{a/q, low}\) of \(C_{r}^{a/q, low}\) is

(38)

for each \(x \in \mathbb {Z} ^d\). A standard argument shows

(39)

for any \(t>0\) and any \(N \in \mathbb {N} \). Taking \(t = 2 q\Delta \) and \(N = d+1\), (38) and (39) imply

$$\begin{aligned} \left|K_{r}^{a/q, low}(x) \right| \lesssim r^{-d} \left( q\Delta \right) ^{-1} (1+\left|x/r \right|)^{-d-1} . \end{aligned}$$

This is an approximation to the identity which implies that

$$\begin{aligned} \sup _{r> 0} \left|C_{r}^{a/q, low}f \right| \lesssim \left( q\Delta \right) ^{-1} \cdot M_*f. \end{aligned}$$

\(\square \)

We are ready for the proof of Lemma 3.2.

Proof of Lemma 3.2

We choose \(\Delta = Q^{-1}\). By (29), (30) and (31), we see that if \(d>2k\), so that Lemmas 3.3 and 3.4 apply and the sum over \(q\ge Q\) below converges, then

$$\begin{aligned} \left\| \sup _{r\in \mathcal {R}_{k,d}} \left|C_{r}^{high}f \right| \right\| _{\ell ^{2}(\mathbb {Z} ^d)}&\lesssim \left( \sum _{q< Q} \sum _{a\in (\mathbb {Z} /q\mathbb {Z} )^\times } \left\| \sup _{r\in \mathcal {R}_{k,d}} \left|C_{r}^{a/q, high}f \right| \right\| _{\ell ^{2}(\mathbb {Z} ^d)} \right) \\&\quad + \left( \sum _{q\ge Q} \sum _{a\in (\mathbb {Z} /q\mathbb {Z} )^\times } \left\| \sup _{r\in \mathcal {R}_{k,d}} \left|C_{r}^{a/q}f \right| \right\| _{\ell ^{2}(\mathbb {Z} ^d)} \right) \\&\lesssim \left( \sum _{q< Q} \sum _{a\in (\mathbb {Z} /q\mathbb {Z} )^\times } {q}^{-\frac{d}{k}} \cdot \left( q\Delta \right) ^{\frac{d-1}{k}-\frac{1}{2}} + \sum _{q\ge Q} \sum _{a\in (\mathbb {Z} /q\mathbb {Z} )^\times } q^{-\frac{d}{k}} \right) \left\| f \right\| _{\ell ^{2}(\mathbb {Z} ^d)}\\&\lesssim {Q}^{2-\frac{d}{k}} \left\| f \right\| _{\ell ^{2}(\mathbb {Z} ^d)}. \end{aligned}$$

This is (19).

Similarly, (37) shows that

$$\begin{aligned} \sup _{r\in \mathcal {R}_{k,d}} \left|C_{r}^{low}f \right|&\le \sum _{q< Q} \sum _{a\in (\mathbb {Z} /q\mathbb {Z} )^\times } \sup _{r\in \mathcal {R}_{k,d}} \left|C_{r}^{a/q, low}f \right|\\&\lesssim \sum _{q< Q} \sum _{a\in (\mathbb {Z} /q\mathbb {Z} )^\times } \left( q\Delta \right) ^{-1} M_*f\\&\le {Q}^2 M_*f. \end{aligned}$$

Therefore the Hardy–Littlewood maximal theorem implies that

$$\begin{aligned} \left\| \sup _{r\in \mathcal {R}_{k,d}} \left|C_{r}^{low}f \right| \right\| _{\ell ^{1, \infty }(\mathbb {Z} ^d)} \lesssim {Q}^2 \left\| M_*f \right\| _{\ell ^{1, \infty }(\mathbb {Z} ^d)} \lesssim {Q}^2 \left\| f \right\| _{\ell ^{1}(\mathbb {Z} ^d)} . \end{aligned}$$

This is (20). \(\square \)
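For the reader’s convenience, the sums over moduli in the proof above can be evaluated as follows. This is a sketch that uses only the trivial bound \(\# (\mathbb {Z} /q\mathbb {Z} )^\times \le q\), the choice \(\Delta = Q^{-1}\), and the assumptions \(k\ge 2\) and \(d>2k\):

$$\begin{aligned} \sum _{q< Q} \sum _{a\in (\mathbb {Z} /q\mathbb {Z} )^\times } q^{-\frac{d}{k}} \left( \frac{q}{Q} \right) ^{\frac{d-1}{k}-\frac{1}{2}}&\le Q^{\frac{1}{2}-\frac{d-1}{k}} \sum _{q< Q} q^{\frac{1}{2}-\frac{1}{k}} \lesssim Q^{\frac{1}{2}-\frac{d-1}{k}} \cdot Q^{\frac{3}{2}-\frac{1}{k}} = Q^{2-\frac{d}{k}} , \\ \sum _{q\ge Q} \sum _{a\in (\mathbb {Z} /q\mathbb {Z} )^\times } q^{-\frac{d}{k}}&\le \sum _{q\ge Q} q^{1-\frac{d}{k}} \lesssim Q^{2-\frac{d}{k}} , \\ \sum _{q< Q} \sum _{a\in (\mathbb {Z} /q\mathbb {Z} )^\times } \left( \frac{q}{Q} \right) ^{-1}&\le Q \sum _{q< Q} 1 \le Q^{2} . \end{aligned}$$

The first two lines give (19), with the tail sum converging because \(\frac{d}{k}>2\), and the third line gives the factor \(Q^2\) in (20).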

4 Proofs of Theorems 1 and 2

Having proved Theorem 3, we turn our attention to Theorem 1. The proof of Theorem 1 will be similar to the proof of Theorem 3, but simpler. The simplicity is due in part to our notion of density-parameter and our use of the union bound in Proposition 4.1.

Definition 1

A subsequence \(\mathcal {R}\) in \(\mathcal {R}_{k,d}\) has density-parameter at most \(\delta \) if

$$\begin{aligned} \#\left\{ r\in \mathcal {R}{:}\,r\le R\right\} \lesssim _\delta R^{\delta } \end{aligned}$$
(40)

as \(R\rightarrow \infty \).

For instance, a lacunary subsequence has density-parameter at most \(\epsilon \) for all \(\epsilon >0\), while the full sequence \(\mathcal {R}_{k,d}\) has density-parameter at most \(k\) (when the dimension is sufficiently large with respect to the degree).
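Both examples can be checked directly. As an illustration (the particular lacunary sequence \(\{2^j\}_{j\ge 0}\) is our choice of example), note that \(2^j\in \mathcal {R}_{k,d}\) since \((2^j,0,\dots ,0)\in S^{k,d}_{2^j}\), and that for \(R\ge 1\),

$$\begin{aligned} \#\left\{ j\ge 0{:}\,2^j\le R\right\} \le 1+\log _2 R\lesssim _\epsilon R^{\epsilon } \qquad \text {and} \qquad \#\left\{ r\in \mathcal {R}_{k,d}{:}\,r\le R\right\} \le \#\left\{ n\in \mathbb {N} {:}\,n\le R^{k}\right\} \le R^{k} , \end{aligned}$$

where the second bound uses only that \(r^k\in \mathbb {N} \) for every \(r\in \mathcal {R}_{k,d}\).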

Remark 5

We mention that our density-parameter is a discrete version of the Minkowski dimension of the set of radii considered in [10, 26]. We do not explore the relationship between these quantities in this paper.

First, we split our operator into narrow and wide averages: for any \(R>0\),

$$\begin{aligned} M_\mathcal {R}f\le \sup _{r\le R} \left|A_{r}f \right| + \sup _{r> R} \left|A_{r}f \right|. \end{aligned}$$

Since each average decomposes into a main term and an error term: \(A_{r}= C_{r}+ {E}_{r}\), we further decompose the wide averages into:

$$\begin{aligned} \sup _{r> R} \left|A_{r}f \right| \le \sup _{r> R} \left|C_{r}f \right| + \sup _{r> R} \left|{E}_{r}f \right| . \end{aligned}$$

Altogether, we have the following decomposition lemma.

Lemma 4.1

For any subsequence \(\mathcal {R}\) of \(\mathcal {R}_{k,d}\), we can bound the \(k\)-spherical maximal operator over \(\mathcal {R}\) by

$$\begin{aligned} M_\mathcal {R}f\le \sup _{r\le R} \left|A_{r}f \right| + \sup _{r> R} \left|C_{r}f \right| + \sup _{r> R} \left|{E}_{r}f \right| \end{aligned}$$
(41)

where the suprema are understood to range only over radii \(r\in \mathcal {R}\).

The narrow averages are handled by the union bound.

Proposition 4.1

If \(\mathcal {R}\subseteq \mathcal {R}_{k,d}\) has density-parameter at most \(\delta \), then

$$\begin{aligned} \left\| \sup _{r\le R} \left|A_{r}f \right| \right\| _{\ell ^{1}(\mathbb {Z} ^d)} \lesssim {R}^{\delta } \left\| f \right\| _{\ell ^{1}(\mathbb {Z} ^d)} \end{aligned}$$
(42)

Proof of Proposition 4.1

Bound the supremum by a sum, use Minkowski’s inequality to move the \(\ell ^1(\mathbb {Z} ^d)\)-norm inside the sum, and then use the definition of density-parameter together with the fact that \(A_{r}\) has norm 1 on \(\ell ^1(\mathbb {Z} ^d)\) for every \(r\in \mathcal {R}_{k,d}\). \(\square \)
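Written out, the argument is the following chain of inequalities, in which the suprema and sums run over radii \(r\in \mathcal {R}\) only:

$$\begin{aligned} \left\| \sup _{r\le R} \left|A_{r}f \right| \right\| _{\ell ^{1}(\mathbb {Z} ^d)} \le \sum _{r\le R} \left\| A_{r}f \right\| _{\ell ^{1}(\mathbb {Z} ^d)} \le \#\left\{ r\in \mathcal {R}{:}\,r\le R\right\} \cdot \left\| f \right\| _{\ell ^{1}(\mathbb {Z} ^d)} \lesssim R^{\delta } \left\| f \right\| _{\ell ^{1}(\mathbb {Z} ^d)} . \end{aligned}$$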

Applying the Approximation Formula and summing (12) over a geometric series, we obtain the following \(\ell ^2(\mathbb {Z} ^d)\)-bound for the error term.

Proposition 4.2

Let \(k\ge 3\) and \(\mathcal {R}\subset \mathcal {R}_{k,d}\). If the dimension \(d>\max {\{ k(k+2), k/\theta \}}\), then

$$\begin{aligned} \left\| \sup _{r> R} \left|{E}_{r}f \right| \right\| _{\ell ^{2}(\mathbb {Z} ^d)} \lesssim {R}^{-\nu } \left\| f \right\| _{\ell ^{2}(\mathbb {Z} ^d)} \end{aligned}$$
(43)

where \( \nu := \min \{ d\theta - k, \frac{d}{k}-(k+2) \} > 0.\)

We are now ready for the proof of Theorem 1.

Proof of Theorem 1

Again, take \(\mathbf {1}_{F}\) to be the characteristic function of a subset \(F\) of \(\mathbb {Z} ^d\). We will prove the restricted weak-type bound for \( p = \max \left\{ \frac{d}{d-k}, \frac{2\nu +2\delta }{2\nu +\delta } \right\} . \) For any altitude \(\alpha >0\) and any compactly supported function \(f: \mathbb {Z} ^d\rightarrow \mathbb {C} \), we have

$$\begin{aligned} \#{\left\{ \left|M_\mathcal {R}f \right|> \alpha \right\} } \le&\#{\left\{ \sup _{r\le R} \left|A_{r}f \right|> \alpha /3 \right\} } + \#{\left\{ {\sup _{r> R} \left|{E}_{r}f \right|}> \alpha /3 \right\} }\\&+ \#{\left\{ {\sup _{r> 0} \left|C_{r}f \right|} > \alpha /3 \right\} }. \end{aligned}$$

Since we are aiming for a restricted weak-type bound, we may assume that \( 0 < \alpha \le 1\). Balancing (42) and (43), we choose \( R= \alpha ^{-\frac{1}{2\nu +\delta }} \) to find that

$$\begin{aligned} \#{\left\{ \left|M_\mathcal {R}\mathbf {1}_{F} \right|> \alpha \right\} }&\lesssim \alpha ^{-(1+\frac{\delta }{2\nu +\delta })} \left( \left\| \mathbf {1}_{F} \right\| _{\ell ^{1}(\mathbb {Z} ^d)} + \left\| \mathbf {1}_{F} \right\| _{\ell ^{2}(\mathbb {Z} ^d)}^2 \right) \\&\quad + \#{\left\{ {\sup _{r> 0} \left|C_{r}\mathbf {1}_{F} \right|} > \alpha /3 \right\} }\\&\lesssim \alpha ^{-(1+\frac{\delta }{2\nu +\delta })} \left( \left\| \mathbf {1}_{F} \right\| _{\ell ^{1}(\mathbb {Z} ^d)} + \left\| \mathbf {1}_{F} \right\| _{\ell ^{2}(\mathbb {Z} ^d)}^2 \right) \\&\quad + \alpha ^{-\frac{d}{d-k}} \left\| \mathbf {1}_{F} \right\| _{\ell ^{\frac{d}{d-k}}(\mathbb {Z} ^d)}^{\frac{d}{d-k}}\\&\lesssim \alpha ^{-(1+\frac{\delta }{2\nu +\delta })} \#{F} + \alpha ^{-\frac{d}{d-k}} \#{F} \end{aligned}$$

where the second inequality follows from applying Theorem 3. Since \(\alpha \in (0,1]\), we see that the summand corresponding to the larger of the two exponents dominates the other summand. \(\square \)
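The choice of \(R\) in the proof can be traced as follows. This is a sketch in which Chebyshev’s inequality is applied to (42) and (43) with \(f=\mathbf {1}_{F}\), using \(\left\| \mathbf {1}_{F} \right\| _{\ell ^{1}(\mathbb {Z} ^d)} = \left\| \mathbf {1}_{F} \right\| _{\ell ^{2}(\mathbb {Z} ^d)}^2 = \#F\):

$$\begin{aligned} \#{\left\{ \sup _{r\le R} \left|A_{r}\mathbf {1}_{F} \right|> \alpha /3 \right\} } + \#{\left\{ \sup _{r> R} \left|{E}_{r}\mathbf {1}_{F} \right| > \alpha /3 \right\} } \lesssim \left( \alpha ^{-1} R^{\delta } + \alpha ^{-2} R^{-2\nu } \right) \#F . \end{aligned}$$

The two terms agree when \(R^{2\nu +\delta } = \alpha ^{-1}\), that is, when \(R= \alpha ^{-\frac{1}{2\nu +\delta }}\), in which case both equal \(\alpha ^{-(1+\frac{\delta }{2\nu +\delta })} \#F\).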

The proof of Theorem 2 is identical to the proof of Theorem 1 upon replacing Proposition 4.2 with the following improvement.

Proposition 4.3

Let \(k\ge 3\) and assume that \(MVH_{k}\left( s \right) \) is true for some \(s>k\) and \(H_{k}\left( \theta \right) \) is true for some \(\theta \in (0,1)\). If \(\mathcal {R}\subset \mathcal {R}_{k,d}\) has density-parameter at most \( \delta \in [0,(d-2s)\theta ) \) and the dimension satisfies \(d> \max {\{ 2 s, k(k+2) \}}\), then

$$\begin{aligned} \left\| \sup _{r> R} \left|{E}_{r}f \right| \right\| _{\ell ^{2}(\mathbb {Z} ^d)} \lesssim {R}^{-\omega } \left\| f \right\| _{\ell ^{2}(\mathbb {Z} ^d)} \end{aligned}$$
(44)

where \( \omega := \min \{ (d-2s)\theta -\delta , \frac{d}{k}-(k+2)\} \) is positive.

Proof

Suppose that \(\mathcal {R}\) has density-parameter at most \(\delta \). On each dyadic scale \(\mathcal {R}\cap [R,2R)\), apply the union bound, (5) and (14) to conclude that

$$\begin{aligned} \left\| \sup _{r\in \mathcal {R}\cap [R, 2R)} \left|{E}_{r}f \right| \right\| _{\ell ^{2}(\mathbb {Z} ^d)} \lesssim \left( R^{\delta -(d-2s)\theta } + R^{k+2-\frac{d}{k}} \right) \left\| f \right\| _{\ell ^{2}(\mathbb {Z} ^d)} \eqsim R^{-\omega } \left\| f \right\| _{\ell ^{2}(\mathbb {Z} ^d)} . \end{aligned}$$

Sum over dyadic scales to conclude the proof. \(\square \)
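The summation over dyadic scales is a geometric series. As a sketch, splitting \(\{r>R\}\) into the blocks \([2^jR, 2^{j+1}R)\) for \(j\ge 0\) and using that \(\omega >0\),

$$\begin{aligned} \left\| \sup _{r> R} \left|{E}_{r}f \right| \right\| _{\ell ^{2}(\mathbb {Z} ^d)} \le \sum _{j\ge 0} \left\| \sup _{r\in \mathcal {R}\cap [2^jR, 2^{j+1}R)} \left|{E}_{r}f \right| \right\| _{\ell ^{2}(\mathbb {Z} ^d)} \lesssim \sum _{j\ge 0} \left( 2^jR \right) ^{-\omega } \left\| f \right\| _{\ell ^{2}(\mathbb {Z} ^d)} \lesssim _\omega R^{-\omega } \left\| f \right\| _{\ell ^{2}(\mathbb {Z} ^d)} . \end{aligned}$$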

5 Explicit ranges in Theorems 1 and 2, and the connection to the Vinogradov mean value conjectures

We connect our hypotheses with Vinogradov’s mean value conjectures, on which there has been exciting recent progress; see [28, 29, 31]. In fact, while this paper was under review, Vinogradov’s mean value conjectures were proved by Wooley in the cubic case and by Bourgain–Demeter–Guth for higher degrees; see [1, 30].

This connection is well known in the circle method literature. For \(s, k, N \in \mathbb {N} \) define the Vinogradov mean value

$$\begin{aligned} J_{s, k}(N)&:= \int _{\mathbb {T} ^k} \left|\sum _{n=1}^N e \left( {\xi _1 n + \xi _2 n^2 + \dots + \xi _kn^k} \right) \right|^{2s} \; d\xi \end{aligned}$$
(45)
$$\begin{aligned}&=\# \left\{ m,n \in [N]^{s}{:}\,\sum _{i=1}^sn_i^\ell = \sum _{i=1}^sm_i^\ell \text { for } \ell = 1, \dots , k\right\} \end{aligned}$$
(46)

where equality holds by Plancherel’s theorem. Equation (5.37) on p. 69 of [27] connects our mean value hypothesis with Vinogradov’s mean value conjectures by

$$\begin{aligned} \int _0^1 \left|\alpha _r(t,0) \right|^{2s} \; dt \lesssim r^{\frac{k(k-1)}{2}} \cdot J_{s, k}(r) . \end{aligned}$$
(47)

By Theorem 1.1 of [30] and Theorem 1.1 of [1]

$$\begin{aligned} J_{s, k}(N) \lesssim _{s, k, \epsilon } N^{s+\epsilon } + N^{2s-\frac{k(k+1)}{2}+\epsilon } \end{aligned}$$
(48)

as \(N \rightarrow \infty \) for each (fixed) \(s,k\in \mathbb {N} \) and all \(\epsilon >0\) where the implicit constant may depend on \(\epsilon \), but not on N. For our purposes, we choose the moment \(s= \frac{k(k+1)}{2}\) which balances the two summands in (48). Plugging this into (47) we find that if \(d> k(k+1)\) and hypothesis \(H_{k}\left( \theta \right) \) is true, then for all \(\epsilon >0\),

$$\begin{aligned} \left\| A_r^{minor} \right\| _{\ell ^{2}(\mathbb {Z} ^d)} \lesssim _{\epsilon } r^{\epsilon -\theta (d-k[k+1])} . \end{aligned}$$
(49)
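The choice of the moment \(s\) and its effect on (47) can be made explicit. As a sketch of the arithmetic, the two exponents in (48) coincide exactly when

$$\begin{aligned} s = 2s-\frac{k(k+1)}{2} \quad \Longleftrightarrow \quad s=\frac{k(k+1)}{2} , \end{aligned}$$

and for this value of \(s\), (48) and (47) give

$$\begin{aligned} J_{s, k}(N) \lesssim _{k,\epsilon } N^{\frac{k(k+1)}{2}+\epsilon } \qquad \text {and} \qquad \int _0^1 \left|\alpha _r(t,0) \right|^{2s} \; dt \lesssim _{k,\epsilon } r^{\frac{k(k-1)}{2}+\frac{k(k+1)}{2}+\epsilon } = r^{k^2+\epsilon } . \end{aligned}$$

With this choice \(2s=k(k+1)\), which is the quantity subtracted from \(d\) in the exponent of (49).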

Following the proof of Theorem 2 we deduce the following theorem.

Theorem 4

Let \(d> k(k+2)\) and assume that \(H_{k}\left( \theta \right) \) is true for some \(\theta \in (0,1/2)\). If \( \mathcal {R}\) is a subsequence of \(\mathcal {R}_{k,d}\) with density-parameter at most \( 0 \le \delta < \theta [d-k(k+1)] \), then \(M_\mathcal {R}\) is bounded from \(\ell ^{p,1}(\mathbb {Z} ^d)\) to \(\ell ^{p,\infty }(\mathbb {Z} ^d)\) for \( p := \max \{ \frac{d}{d-k}, 1+\frac{\delta k}{2(d-k[k+2])+\delta k}, 1+\frac{\delta }{\delta -\epsilon + 2\theta [d-k(k+1)]} \} \) where \(0< \epsilon < \delta /2\).

Recall from the introduction that \(H_{k}\left( \theta \right) \) is true for all \(\theta \in (0,1/k[k-1])\); thus we have the following corollary which says that for sufficiently thin subsequences of \(\mathcal {R}_{k,d}\) the dimensional constraint for its maximal function is reduced from a cubic dependence on the degree in Theorem 1 to a quadratic one.

Corollary 2

Let \(k\ge 3 \), \(d> k(k+2)\) and \(p = \frac{d}{d-k}.\) If \( \mathcal {R}\) is a subsequence of \(\mathcal {R}_{k,d}\) with density-parameter at most \( 0 \le \delta < 2 k\theta \left( 1-\frac{k[k+1]}{d} \right) \), then \( M_\mathcal {R}\) is bounded from \(\ell ^{p,1}(\mathbb {Z} ^d)\) to \(\ell ^{p,\infty }(\mathbb {Z} ^d)\).

Remark 6

Observe that, in contrast to Theorem 2, the \(\epsilon \)-loss in (49) does not allow us to capture the endpoint \(\delta = \theta [d-k(k+1)]\) for our density-parameter in Theorem 4.