1 Introduction

Exponential sums are a ubiquitous tool throughout analytic number theory, and have been studied in their own right at least since the 1920s. When \(\mathbf {k}= (k_1, \ldots , k_t)\) is a tuple of pairwise distinct natural numbers and P is a large positive integer, the Weyl sum of multidegree \(\mathbf {k}\) is given by

$$\begin{aligned} f_{\mathbf {k}}\left( {\alpha }_{k_1}, \ldots , {\alpha }_{k_t}\right) = \sum _{n=1}^P e\left( {\alpha }_{k_1} n^{k_1} + \cdots + {\alpha }_{k_t} n^{k_t}\right) . \end{aligned}$$
(1.1)

Such sums regularly feature in applications of the Hardy-Littlewood circle method in connection with diophantine systems of the shape

$$\begin{aligned} x_1^{k_j} + \cdots + x_s^{k_j} = y_1^{k_j}+ \cdots + y_s^{k_j} \quad (1 \le j \le t). \end{aligned}$$
(1.2)

Whilst the theory of systems of the kind (1.2) has recently seen significant advances in the work of Wooley [22, 23] and Bourgain, Demeter and Guth [3] on Vinogradov’s mean value theorem, our grasp of the cases involving lacunary degrees remains insufficient. One of the simplest such systems is that corresponding to \(\mathbf {k}= (1,3)\), a variant of which is given by

$$\begin{aligned} x_1^{3} + \cdots + x_s^{3} =x_1 + \cdots + x_s = 0. \end{aligned}$$
(1.3)

Although recent progress on this system has been achieved by Brüdern and Robert [5] and Wooley [20], a full understanding of the system (1.3) remains tantalisingly out of reach. In both papers, the authors apply the circle method in order to derive asymptotic formulæ for the number of solutions of such systems, and they succeed in doing so as soon as \(s \ge 10\). On the basis of standard heuristics, one would expect to be able to extend the range in which formulæ of this kind are valid to at least \(s \ge 9\), but unfortunately we lack a sufficiently precise understanding of the underlying Weyl sum \(f_{1,3}({\varvec{\alpha }})\) to achieve such a bound. A similar phenomenon occurs in forthcoming work of Hughes and Wooley [11], which deals with moments of a weighted version of \(f_{1,3}({\varvec{\alpha }})\). Trying to make some headway towards a better understanding of these exponential sums is the main motivation behind the paper at hand.

The main motif underpinning the Hardy-Littlewood method is that sums of the shape (1.1) should be small unless all components of the coefficient vector \({\varvec{\alpha }}\) lie in the vicinity of fractions with a small denominator, in which case they can be well approximated by certain generating functions that are easier to handle and encode the adelic information inherent in the associated system (1.2). To make this notion precise, we introduce some notation. Suppose that the entries of \({\varvec{\alpha }}\) have a rational approximation of the shape

$$\begin{aligned} {\alpha }_{k_j} = a_{k_j}/q + \beta _{k_j} \quad (1 \le j \le t) \end{aligned}$$
(1.4)

with a common denominator q satisfying \(\gcd (q, a_{k_1}, \ldots , a_{k_t})=1\), and define

$$\begin{aligned} S_{\mathbf {k}}(q;\mathbf {a}) = \sum _{x=1}^q e\left( q^{-1}\sum _{j=1}^t a_{k_j} x^{k_j}\right) \quad \text {and} \quad I_{\mathbf {k}}({\varvec{\beta }}) = \int _0^P e \left( \sum _{j=1}^t {\beta }_{k_j} x^{k_j}\right) \mathrm {d}x. \end{aligned}$$

In this notation, we anticipate that \(q^{-1}S_{\mathbf {k}}(q; \mathbf {a})I_{\mathbf {k}}({\varvec{\beta }})\) should be a good approximation to \(f_{\mathbf {k}}({\varvec{\alpha }})\), and we denote the difference by

$$\begin{aligned} {\varDelta }_{\mathbf {k}}(q, \mathbf {a}; {\varvec{\beta }}) = f_{\mathbf {k}}(\mathbf {a}/q + {\varvec{\beta }}) - q^{-1} S_{\mathbf {k}}(q; \mathbf {a}) I_{\mathbf {k}}({\varvec{\beta }}). \end{aligned}$$
(1.5)
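For orientation, the objects entering (1.5) can be evaluated numerically for small parameters. The following Python sketch is purely illustrative: the function names, the list-based interface and the midpoint rule used for the integral are our own assumptions and play no role in the arguments of this paper.

```python
import cmath
import math


def e(x):
    """e(x) = exp(2*pi*i*x), the standard additive character."""
    return cmath.exp(2j * math.pi * x)


def weyl_sum(alphas, ks, P):
    """f_k(alpha) as in (1.1): sum over 1 <= n <= P of e(sum_j alpha_j n^{k_j})."""
    return sum(e(sum(al * n**k for al, k in zip(alphas, ks)))
               for n in range(1, P + 1))


def complete_sum(q, a, ks):
    """S_k(q; a): sum over x mod q, with the numerator reduced mod q for accuracy."""
    return sum(e((sum(aj * pow(x, k, q) for aj, k in zip(a, ks)) % q) / q)
               for x in range(1, q + 1))


def integral(betas, ks, P, steps=20000):
    """I_k(beta), approximated by the composite midpoint rule (illustrative choice)."""
    h = P / steps
    return h * sum(e(sum(b * ((i + 0.5) * h)**k for b, k in zip(betas, ks)))
                   for i in range(steps))


def delta(q, a, betas, ks, P):
    """The difference (1.5) between the Weyl sum and its expected approximation."""
    alphas = [aj / q + b for aj, b in zip(a, betas)]
    main = complete_sum(q, a, ks) * integral(betas, ks, P) / q
    return weyl_sum(alphas, ks, P) - main
```

For instance, when \(q=3\), \(\mathbf {a}=(1,1)\), \({\varvec{\beta }}=\mathbf {0}\) and \(P=30\), the summand in (1.1) is periodic modulo 3, so the difference computed by `delta` vanishes up to rounding.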

There is a considerable body of work related to Weyl sums of the type (1.1) and their approximations (1.5). When \(t=1\) so that \(\mathbf {k}= k\), it is known from [17, Theorem 4.1] that

$$\begin{aligned} {\varDelta }_k(q,a;{\beta }) \ll q^{1/2+\varepsilon }(1+P^k |{\beta }|)^{1/2}, \end{aligned}$$
(1.6)

and Daemen [7, Theorem 2] and Brüdern and Daemen [4, Theorem 1] showed that this bound is sharp up to at most a factor of \(q^\varepsilon \). For general multidegrees \(\mathbf {k}\) we have the weaker bound

$$\begin{aligned} {\varDelta }_{\mathbf {k}}(q, \mathbf {a}; {\varvec{\beta }}) \ll q \left( 1+ \sum _{j=1}^t P^{k_j} |{\beta }_{k_j}| \right) \end{aligned}$$

from [17, Theorem 7.2]. This bound has been improved for \(\mathbf {k}= (1,k)\) by Brüdern and Robert [5, Theorem 3], who obtained the estimate

$$\begin{aligned} {\varDelta }_{1,k}\left( q, \mathbf {a}; {\varvec{\beta }}\right) \ll q^{1-1/k+\varepsilon }\left( 1+P^k |{\beta }_k|\right) ^{1/2}. \end{aligned}$$
(1.7)

Whilst their result holds for all \(k \ge 2\), an epsilon-free version is available in the quadratic case due to the last author [18, Theorem 8]. This invites the question of how optimal the bound in (1.7) is.

The primary objective of this memoir is to examine the exponential sum \(f_{1,k}({\alpha }_1, {\alpha }_k)\) and its associated error term \({\varDelta }_{1,k}(q,\mathbf {a};{\varvec{\beta }})\) more closely. Our main result is the following.

Theorem 1.1

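Let \(k \ge 2\). Assume (1.4) with \((q, a_k)=1\) and \(|{\beta }_1| \le (2q)^{-1}\). Then

$$\begin{aligned} f_{1,k}\left( {\alpha }_1, {\alpha }_k\right) = q^{-1} {\sum _{d|q}}^{\dagger } S_{1,k}\left( q; d\left[ \left[ a_1/d \right] \right] , a_k\right) I_{1,k}\left( {\alpha }_1 - \frac{d\left[ \left[ a_1/d \right] \right] }{q}, {\beta }_k\right) + O\left( q^{1/2+\varepsilon }\left( 1 + |{\beta }_k|P^k\right) ^{1/2}\log P\right) , \end{aligned}$$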

where \(\left[ \left[ x \right] \right] \) denotes a closest integer to x, and the notation \(\sum ^\dagger \) indicates that the sum runs over all distinct values of \(d\left[ \left[ a_1/d \right] \right] \) satisfying \((\left[ \left[ a_1/d \right] \right] , q/d)=1\).

If, moreover, we have

$$\begin{aligned} |{\beta }_k| \le (4kqP^{k-1})^{-1}, \end{aligned}$$
(1.8)

then the error term in the above asymptotic may be replaced by \(O(q^{1/2 +\varepsilon })\).

Thus, by extracting additional main terms, we are able to obtain an error term of the same quality as in (1.6), which is essentially optimal. We note that the factor \(\log P\) in the error term can be eliminated by means of a more careful analysis. Observe also that the coprimality condition \((\left[ \left[ a_1/d \right] \right] , q/d)=1\) implies that the fractions \(\left[ \left[ a_1/d\right] \right] / (q/d)\) are reduced and therefore pairwise distinct.

In the case when \(k=3\), it follows from Dirichlet’s approximation theorem that every \({\alpha }\in [0,1]\) has an approximation \({\alpha }=a/q + {\beta }\) with \(q(1+P^3 |{\beta }|) \le 2 P^{3/2}\). Thus, in the cubic case we obtain the following.

Corollary 1.1

Assume (1.4) with \(q(1+P^3 |{\beta }_3|) \le 2 P^{3/2}\), where \((q, a_3)=1\) and \(|{\beta }_1| \le (2q)^{-1}\).

  1. (a)

    We have

  2. (b)

    Moreover, we have the bound

    $$\begin{aligned} f_{1,3}\left( {\alpha }_1, {\alpha }_3\right) \ll \frac{P^{1+\varepsilon }}{\left( q+q|{\beta }_3|P^3\right) ^{1/3}} + P^{3/4+\varepsilon } \end{aligned}$$

    uniformly in \({\alpha }_1\).

For general degree k, an analogous chain of reasoning would replace the error term \(O(P^{3/4+\varepsilon })\) by \(O(P^{k/4+\varepsilon })\), which is trivial when \(k \ge 4\). Thus, our result in Theorem 1.1 is strongest in the cubic case, and for higher degrees should be viewed as a bound for the major arcs only.

When \(d=(a_1,q)\) we have \(d \left[ \left[ a_1/d \right] \right] = a_1\); this is the leading term in Theorem 1.1 and corresponds to the approximation in (1.5). We can thus rephrase the conclusion of Theorem 1.1 in the form

(1.9)

When \(d \ne (a_1,q)\) the fractions \(a_1/d\) are all non-integral. In particular, when \(a_1/d\) is half an odd integer, there are two choices for \(\left[ \left[ a_1/d \right] \right] \) and both may occur in the sum. Note further that the sum in (1.9) is empty if and only if \(a_1\) is a multiple of q. When \(q \not \mid a_1\) we have \((a_1,q) \ne q\), and thus it will always contain at least the term \(S_{1,k}(q;0,a_k) I_{1,k}({\alpha }_1, {\beta }_k)\) corresponding to \(d=q\), and in many cases this is the only one. For instance, it is not hard to see that when \(a_1= 1\) and \(q > 1\) is odd, the sum in Theorem 1.1 contains precisely the terms corresponding to \(d=1\) and \(d=q\), and so in this case the asymptotic formula reads

$$\begin{aligned} f_{1,k}\left( {\alpha }_1, {\alpha }_k\right)&= q^{-1} \left( S_{1,k}\left( q;1, a_k\right) I_{1,k}\left( {\beta }_1, {\beta }_k\right) + S_{1,k}(q;0, a_k) I_{1,k}\left( {\alpha }_1,{\beta }_k\right) \right) \\&\quad + O\left( q^{1/2+\varepsilon }\left( 1 + |{\beta }_k|P^k\right) ^{1/2}\log P\right) . \end{aligned}$$
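The set of secondary terms can be enumerated mechanically. The following Python sketch lists the distinct values \(d\left[ \left[ a_1/d \right] \right] \) with \(d|q\) that satisfy the coprimality condition \((\left[ \left[ a_1/d \right] \right] , q/d)=1\); the function names are hypothetical, and the sketch reflects our reading of the summation condition rather than any code used in the proofs.

```python
from math import gcd


def closest_integers(a, d):
    """All integers closest to a/d (two of them when a/d is half an odd integer)."""
    lo, r = divmod(a, d)  # a = lo*d + r with 0 <= r < d
    if 2 * r < d:
        return [lo]
    if 2 * r > d:
        return [lo + 1]
    return [lo, lo + 1]


def dagger_values(a1, q):
    """Distinct values d*[[a1/d]] with d | q and gcd([[a1/d]], q/d) = 1."""
    values = set()
    for d in range(1, q + 1):
        if q % d:
            continue  # only divisors of q contribute
        for m in closest_integers(a1, d):
            if gcd(m, q // d) == 1:
                values.add(d * m)
    return values
```

With \(a_1=1\) and \(q=15\) the function returns the two values 1 and 0, corresponding to \(d=1\) and \(d=q\), whereas for \(a_1=1\) and \(q=2\) the half-integer \(a_1/2\) contributes both 0 and 2 in addition to the leading value 1.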

This behaviour, which occurs in many generic situations, indicates that we cannot expect the secondary terms to be subject to any significant cancellation. In fact, we have the following result on the size of the error term.

Theorem 1.2

Suppose that \(a_k\) and q are coprime integers satisfying \(|{\alpha }_k - a_k/q| < q^{-2}\), and assume that (1.4) holds with \(|{\beta }_1| < (2q)^{-1}\).

  1. (a)

    We have the upper bound

  2. (b)

    Suppose now that \({\beta }_k = 0\) and \(a_1=1\) with \(q=p^k\), where p is an odd prime. Then, whenever \(\Vert {\alpha }_1 P \Vert \ge \delta \) for some suitable real number \({\delta }>0\), we have the lower bound

    $$\begin{aligned} |{\varDelta }_{1,k}\left( q, 1, a_k; {\beta }_1,0\right) | \ge \frac{4{\delta }}{3\pi } q^{1-1/k}+ O\left( q^{1/2+\varepsilon } \log P\right) . \end{aligned}$$

Thus Theorem 1.2 shows that the bound (1.7) of Brüdern and Robert is sharp at least when \({\beta }_k\) is small and q is a perfect k-th power of a square-free number. Here, the term of size \(q^{1-1/k} = p^{k-1}\) arises from the exponential sum \(S_{1,k}(q;0,a_k)\) via Lemma 4.4 in [17], and as noted above, unless \(q|a_1\) this term will always appear in the sum in (1.9). Thus, large values of \({\varDelta }_{1,k}(q,\mathbf {a};{\varvec{\beta }})\) cannot be considered an exceptional occurrence when \({\beta }_k\) is small.

As a consequence of Theorem 1.1, we are able to make progress on a problem concerning the fractal dimension of solutions of certain partial differential equations. The motivation for this problem goes back to optical experiments by Talbot [15] in the 1830s concerning the diffraction of light passing through a grating. Berry [2] later initiated the theoretical investigation of the problem and has in particular made predictions regarding the fractal dimension of the diffraction pattern along certain slices in space. The reader is referred to Chapter 2 of [9] and the introduction of [8] for an introduction to the general topic as well as a more thorough history of this particular problem.

In this paper, we shall focus in particular on the family of partial differential equations given by

$$\begin{aligned} i \partial _t q(t,x) - i^k \partial ^{(k)}_x q(t,x) = 0 \quad \left( k \in \mathbb {N}\right) , \end{aligned}$$
(1.10)

where \(t \in \mathbb {R}\) and \(x \in \mathbb {R}/ 2 \pi \mathbb {Z}\). When \(k=2\), the reader will recognize (1.10) as the linear Schrödinger equation, while the case \(k=3\) corresponds to the linear part of the Korteweg-de Vries (KdV) equation, also known as Airy’s equation. For any natural number k, given initial data \(g_k \in L^2(\mathbb {R}/2 \pi \mathbb {Z})\), the evolution of \(g_k\) under (1.10) is given by

$$\begin{aligned} q_k(t,x) = \sum _{n \in \mathbb {Z}} \hat{g}_k(n) e^{i t n^k + i x n}. \end{aligned}$$

Clearly, \(q_k\) is periodic in both t and x with period \(2 \pi \).

We are interested in the restriction of \(q_k\) to linear subsets of \((\mathbb {R}/ 2 \pi \mathbb {Z}) \times (\mathbb {R}/ 2 \pi \mathbb {Z})\). Given \(c \in \mathbb {R}\) and \(r \in \mathbb {Q}\setminus \{0\}\), as well as initial data \(g_k\), let

$$\begin{aligned} q_{k;r,c}(x) = \sum _{n \in \mathbb {Z}} \hat{g}_k(n) e^{i (c-rx)n^k + i x n} \end{aligned}$$

denote the restriction of \(q_k\) to the oblique line \(t+ rx=c\). Recall that the fractal (also known as upper Minkowski or upper box-counting) dimension of a bounded set E is given by

$$\begin{aligned} \overline{\text {dim}}(E)=\limsup _{\varepsilon \rightarrow 0}\frac{\log \left( {\mathcal N}(E,\varepsilon )\right) }{\log \left( 1/\varepsilon \right) }, \end{aligned}$$

where \({\mathcal N}(E,\varepsilon )\) is the minimum number of \(\varepsilon \)-balls required to cover E. Assuming that \(g_k\) is a suitably well-behaved function, we would like to know the fractal dimension of the real and imaginary parts of the graph of \(q_{k;r,c}\) for a typical c. Note here that it is possible for either the real or the imaginary part of the graph to vanish, so we are really interested in the size of the larger of the two. The simplest non-constant choices for \(g_k\) are step functions, and in such situations, we see that in order to make progress, it is imperative to understand the distribution of large values of exponential sums. As it is convenient to work with dyadic sums in this context, we modify our definition (1.1) by writing

$$\begin{aligned} f_{\mathbf {k}}\left( {\alpha }_{k_1}, \ldots , {\alpha }_{k_t}; Q\right) = \sum _{Q < n \le 2 Q} e\left( {\alpha }_{k_1} n^{k_1} + \cdots + {\alpha }_{k_t} n^{k_t}\right) , \end{aligned}$$
(1.11)

where Q is a positive number. Let \({\varTheta }_k\) denote the set of all \({\theta }\in \mathbb {R}\) such that for almost all \({\gamma }\in [0,1)\) one has

$$\begin{aligned} \sup _{Q \ge 1} Q^{-{\theta }} \sup _{{\alpha }\in [0,1)} |f_{1,k}({\alpha }, {\alpha }+{\gamma }; Q)| \ll _{{\theta },{\gamma }} 1, \end{aligned}$$
(1.12)

and set \({\theta }_k = \inf {\varTheta }_k\). The size of \(\theta _k\) and related quantities has recently been studied by Chen and Shparlinski [6], building on work by Wooley [21]. Clearly, one sees that

$$\begin{aligned} \lfloor 2 Q \rfloor - \lfloor Q \rfloor = \int _0^1 |f_{1,k}\left( {\alpha }, {\alpha }+{\gamma }; Q\right) |^2 \mathrm {d}{\alpha }\le \left( \sup _{{\alpha }\in [0,1)}|f_{1,k}({\alpha }, {\alpha }+{\gamma }; Q)| \right) ^2\le Q^2, \end{aligned}$$

whence we have the trivial bounds

$$\begin{aligned} 1/2 \le {\theta }_k \le 1 \end{aligned}$$
(1.13)

valid for all \(k \ge 2\) and for the entire range \({\gamma }\in [0,1)\). Moreover, we have the trivial bound \(\sup _{{\alpha }, {\gamma }}|f_{1,k}({\alpha }, {\alpha }+{\gamma }; Q)|\asymp Q\), and it is known (see e.g. [6, Corollary 2.2]) that for independent variables \({\gamma }\), \({\alpha }\) we have \(|f_{1,k}({\alpha }, {\alpha }+{\gamma }; Q)| \ll Q^{1/2 + \varepsilon }\) almost everywhere. It turns out that in our case where only one of the variables is restricted to lie in the complement of a thin set while the other one ranges freely, the bound is appreciably larger.
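The mean square identity underlying the lower bound in (1.13) is easily confirmed numerically. In the following Python sketch (an illustration under our own naming, not part of the argument), we use that, as a function of \({\alpha }\), the sum \(f_{1,k}({\alpha }, {\alpha }+{\gamma }; Q)\) is a trigonometric polynomial with the pairwise distinct integer frequencies \(n+n^k\), so that an equally spaced Riemann sum with sufficiently many nodes reproduces the integral up to rounding.

```python
import cmath
import math


def dyadic_sum(alpha, gamma, k, Q):
    """f_{1,k}(alpha, alpha + gamma; Q) as in (1.11): sum over Q < n <= 2Q."""
    lo, hi = math.floor(Q), math.floor(2 * Q)
    return sum(cmath.exp(2j * math.pi * (alpha * n + (alpha + gamma) * n**k))
               for n in range(lo + 1, hi + 1))


def mean_square(gamma, k, Q):
    """Integral over alpha in [0,1) of |f|^2 via M equally spaced nodes.

    M exceeds every difference of the frequencies n + n^k, so only the
    diagonal terms survive the averaging over the nodes j/M.
    """
    top = math.floor(2 * Q)
    M = 2 * (top + top**k)
    return sum(abs(dyadic_sum(j / M, gamma, k, Q))**2 for j in range(M)) / M
```

For example, `mean_square(0.3, 2, 3)` agrees with \(\lfloor 2Q \rfloor - \lfloor Q \rfloor = 3\) up to rounding, independently of the choice of \({\gamma }\).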

Theorem 1.3

We have \({\theta }_2 = {\theta }_3 = 3/4\).

Theorem 1.3 may be a bit surprising as one naively expects square root cancellation in exponential sums. As will transpire from the proof, it turns out that for almost every \(\gamma \) in (1.12) the supremum is obtained for a special choice of \(\alpha \) on what can be considered a major arc. One might speculate that \({\theta }_k = 3/4\) for all \(k\ge 2\). Indeed, one might hope to adapt the proof of Theorem 1.3 above to show that for almost all \({\gamma }\) this gives the correct extremal value on a suitable set of major arcs and that for almost all \({\gamma }\) the sum is smaller on the corresponding minor arcs. This latter speculation would be consistent with the main result of the last author and Wooley [19].

With the help of Theorem 1.3, we can address our motivating problem.

Theorem 1.4

For \(k=2,3\) let \(g_k\) be a step function, and fix \(r \in \mathbb {Q}\setminus \{0\}\). Set \({\alpha }_2 = 1/8\) and \({\alpha }_3 = 1/12\). Then, for almost every \(c \in \mathbb {R}\), the function \(q_{k;r,c}\) satisfies the Hölder condition \(C^{{\alpha }}\) for every \({\alpha }<{\alpha }_k\). In particular, the fractal dimension of the graph of the real and imaginary parts of \(q_{k;r,c}\) is at most \(2-{\alpha }_k\).

This improves the values of \({\alpha }_2 = 1/10\) and \({\alpha }_3 = 1/27\) of the fourth author with Erdoğan [8, Theorem 1.1]. The proof is identical to that of [8, Corollary 3.5], but inputs our Theorem 1.3 instead of [8, Proposition 3.3]. Moreover, the results of Theorem 1.4 can be transferred to non-linear partial differential equations, in particular the non-linear Schrödinger and KdV equations, by the same methods as Theorems 1.2 and 1.3 are derived from Theorem 1.1 in [8].

Note that Theorem 1.1 in [8] also gives a lower bound for the fractal dimensions in question. Specifically, the authors show that the graph of at least one of the real and imaginary parts of \(q_{k;r,c}\) has fractal dimension of at least \(2-1/(2k)\), and this is sharp at least in the Schrödinger case \(k=2\). Moreover, they remark that, if it were true that \({\theta }_k = 1/2\), their argument could be adapted to show that this lower bound reflects the actual value. Our Theorem 1.3 rules out this approach at least for the cases \(k=2\) and \(k=3\). Meanwhile, if our speculation that \({\theta }_k = 3/4\) could be substantiated for all k, it would imply that the maximum dimension of the respective graphs of the real and imaginary parts would lie in the range \([2-\frac{1}{2k}, 2-\frac{1}{4k}]\). It is worth noting that Lemma 2 of [12] along with [8] imply that for special combinations of initial data and oblique lines the fractal dimension in Theorem 1.4 for \(k=2\) is precisely 7/4 (see [8, Footnote 3] for more details).

Notation Throughout the paper, we make use of the following conventions. All statements involving the letter \(\varepsilon \) are claimed to be true for all (sufficiently small) \(\varepsilon >0\). Thus, the precise ‘value’ of \(\varepsilon \) is allowed to change from one line to the next. Moreover, P always denotes a large positive number. We use the Vinogradov and Bachmann–Landau notations liberally, and here the implied constants are allowed to depend on k and \(\varepsilon \), but never on P, Q or \({\varvec{\alpha }}\).

2 Preliminary lemmata

In this section, we briefly collect some technical lemmata that will be of use in our arguments later. All of these results pertain to the case \(\mathbf {k}= (1,k)\), and in order to avoid clutter, we will in our arguments below drop the multidegree (1, k) in our notation. Throughout, Q denotes a positive number. For easier reference, we begin by stating a few results from the literature.

Lemma 2.1

Let \(a_1, a_k \in \mathbb {Z}\) and \(q \in \mathbb {N}\), and suppose that \((a_k, q)=1\).

  1. (a)

    Uniformly in \(a_1\), we have \(S(q; a_1, a_k) \ll q^{1-1/k+\varepsilon }\).

  2. (b)

    Moreover, \(S(q; a_1, a_k) \ll q^{1/2+\varepsilon }(q, a_1)\).

Proof

These are Theorem 7.1 and Lemma 4.1 in [17], respectively. \(\square \)

We also record an elementary average bound for exponential sums.

Lemma 2.2

For any positive integer q and any integer \(a_k\) we have

$$\begin{aligned} \sum _{b=1}^q |S(q; b, a_k)| \le q^{3/2}. \end{aligned}$$

Proof

By the Cauchy-Schwarz inequality we see that

$$\begin{aligned} \sum _{b=1}^q |S\left( q; b, a_k\right) | \le q^{1/2} \left( \sum _{b=1}^q |S(q; b, a_k)|^2\right) ^{1/2}, \end{aligned}$$

and expanding the square yields

$$\begin{aligned} \sum _{b=1}^q |S\left( q; b, a_k\right) |^2 = \sum _{b=1}^q \sum _{x,y=1}^q e\left( \frac{b (x-y)+ a_k \left( x^k-y^k\right) }{q}\right) = q \sum _{x=1}^q 1 = q^2. \end{aligned}$$

This completes the proof. \(\square \)
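Both steps of this proof can be checked numerically for small moduli. The Python sketch below is an illustration only, with hypothetical function names; it computes the first and second moments of \(|S(q; b, a_k)|\) over \(b\) by brute force.

```python
import cmath
import math


def S(q, b, a, k):
    """Complete sum S(q; b, a) = sum over x mod q of e((b x + a x^k)/q)."""
    return sum(cmath.exp(2j * math.pi * ((b * x + a * pow(x, k, q)) % q) / q)
               for x in range(1, q + 1))


def second_moment(q, a, k):
    """Sum over b mod q of |S(q; b, a)|^2, which the proof shows equals q^2."""
    return sum(abs(S(q, b, a, k))**2 for b in range(1, q + 1))


def first_moment(q, a, k):
    """Sum over b mod q of |S(q; b, a)|, bounded by q^(3/2) in Lemma 2.2."""
    return sum(abs(S(q, b, a, k)) for b in range(1, q + 1))
```

Note that no coprimality condition on \(a_k\) is needed, in accordance with the statement of the lemma.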

The next result is a direct consequence of [17, Lemma 4.2].

Lemma 2.3

Suppose that \(\phi \) is a twice continuously differentiable function on an interval I and let \(H\ge 2\) be a number such that \(|\phi '(x)| \le H\) for all \(x \in I\). Suppose further that \(\phi ''\) has at most finitely many zeros in the interval I. Then

$$\begin{aligned} \sum _{n \in I \cap \mathbb {Z}} e(\phi (n)) = \sum _{|h|\le H} \int _I e(\phi (x) - hx) \mathrm {d}x + O(\log H). \end{aligned}$$

Proof

This is immediate upon partitioning I into subintervals on which \(\phi '\) is monotonic, and then applying Lemma 4.2 of [17] on each of these finitely many intervals. \(\square \)

We continue with bounds on oscillating integrals. For a measurable subset \({\mathcal A}\) we define

$$\begin{aligned} I\left( {\beta }_1, {\beta }_k; {\mathcal A}\right) = \int _{{\mathcal A}} e\left( {\beta }_1 x + {\beta }_k x^k\right) \hbox {d}x. \end{aligned}$$

We then have the following bounds for \(I({\beta }_1, {\beta }_k; {\mathcal A})\).

Lemma 2.4

Let \(k \ge 2\) and suppose that \({\mathcal A}\) is a finite union of pairwise disjoint intervals.

  1. (a)

    Let \(\tau >0\) be a parameter satisfying \(|{\beta }_1 + k{\beta }_k x^{k-1}| \ge \tau \) for all \(x \in {\mathcal A}\). Then

    $$\begin{aligned} I\left( {\beta }_1, {\beta }_k; {\mathcal A}\right) \ll \tau ^{-1}. \end{aligned}$$
  2. (b)

    Assume that \({\mathcal A}\subseteq [Q,2 Q]\) for some \(Q>0\). Then, whenever \({\beta }_k \ne 0\) we have

    $$\begin{aligned} I\left( \beta _1, \beta _k; {\mathcal A}\right) \ll \left( |\beta _k| Q^{k-2}\right) ^{-1/2}. \end{aligned}$$

Proof

These bounds are Lemmata 4.2 and 4.4 in [16], respectively, applied to the function \(F(x) = {\beta }_1 x + {\beta }_k x^k\). \(\square \)

Lemma 2.5

Assume that \(k \ge 2\) and \({\beta }_1 \ne 0\). Suppose further that \({\mathcal A}\) is a union of finitely many pairwise disjoint intervals contained inside [Q, 2Q] for some \(Q>0\). Then we have the bound

$$\begin{aligned} I\left( {\beta }_1, {\beta }_k; {\mathcal A}\right) \ll |{\beta }_1|^{-1} \left( 1 + Q^k |{\beta }_k|\right) ^{1/2}. \end{aligned}$$

Proof

Suppose first that the relation

$$\begin{aligned} |{\beta }_1 + k {\beta }_k x^{k-1}| \ge \textstyle {\frac{1}{2}} |{\beta }_1| \end{aligned}$$
(2.1)

holds for all \(x \in {\mathcal A}\). Then we see from Lemma 2.4(a) that

$$\begin{aligned} I({\beta }_1, {\beta }_k; {\mathcal A}) \ll |{\beta }_1|^{-1}, \end{aligned}$$
(2.2)

which is sufficient to prove the lemma in this case. We may thus concentrate on the opposite case where the inequality (2.1) is violated for some \(x \in {\mathcal A}\). It follows from the triangle inequality that any such x must satisfy the inequalities \(\frac{1}{2}|{\beta }_1| \le k |{\beta }_k| x^{k-1} \le \frac{3}{2}|{\beta }_1|\). Since \(Q \le x \le 2 Q\), this can happen only if

$$\begin{aligned} |{\beta }_1| \asymp Q^{k-1}|{\beta }_k| \end{aligned}$$
(2.3)

and in particular only when \({\beta }_k \ne 0\). We can thus deploy Lemma 2.4(b) and obtain

$$\begin{aligned} I\left( {\beta }_1, {\beta }_k; {\mathcal A}\right)&\ll \left( Q^{k-2}|{\beta }_k|\right) ^{-1/2} \nonumber \\&\ll \left( |{\beta }_k| Q^{k-1}\right) ^{-1}\left( |{\beta }_k|Q^k\right) ^{1/2} \ll |{\beta }_1|^{-1} \left( |{\beta }_k|Q^k\right) ^{1/2}, \end{aligned}$$
(2.4)

where in the last step we used (2.3) again. The full statement now follows upon combining (2.2) and (2.4). \(\square \)

3 Proof of Theorem 1.1

For the proof of our first main result it is convenient to work over dyadic ranges. Recalling our notation (1.11), we make the analogous definition

$$\begin{aligned} I\left( {\beta }_1, {\beta }_k; Q\right) = I_{1,k}\left( {\beta }_1, {\beta }_k;Q\right) = \int _Q^{2 Q} e\left( {\beta }_1 x + {\beta }_k x^k\right) \mathrm {d}x. \end{aligned}$$

Thus, if we can show that

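$$\begin{aligned} f\left( {\alpha }_1, {\alpha }_k; Q\right) = q^{-1} {\sum _{d|q}}^{\dagger } S\left( q; d\left[ \left[ a_1/d \right] \right] , a_k\right) I\left( {\alpha }_1 - \frac{d\left[ \left[ a_1/d \right] \right] }{q}, {\beta }_k; Q\right) + O\left( q^{1/2+\varepsilon }\left( 1 + |{\beta }_k|Q^k\right) ^{1/2}\right) \end{aligned}$$
(3.1)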

for any \(Q \ge 1/2\), the conclusion of Theorem 1.1 will follow upon dyadic summation, as

$$\begin{aligned} f\left( {\alpha }_1, {\alpha }_k\right) = \sum _{i=1}^{\lceil \log P / \log 2 \rceil } f \left( {\alpha }_1, {\alpha }_k; 2^{-i}P\right) . \end{aligned}$$

The initial stages of our argument follow along the lines of the proof of [5, Theorem 3], which in turn is an adaptation of the argument found in [17, pp. 43–44].

Lemma 3.1

Assume (1.4) with \((a_1,q)=1\), and set

$$\begin{aligned} H=2^{k-1}q\left( 1+kQ^{k-1}|{\beta }_k|\right) . \end{aligned}$$

Then

$$\begin{aligned} f\left( {\alpha }_1, {\alpha }_k; Q\right) = q^{-1} \sum _{|h| \le H} S\left( q; a_1+h, a_k\right) I\left( {\beta }_1-h/q, {\beta }_k;Q\right) + O\left( q^{1/2}\log H\right) . \end{aligned}$$

Proof

By sorting the terms of \(f({\alpha }_1, {\alpha }_k;Q)\) into congruence classes modulo q and encoding the congruence condition in an exponential sum, we find that

$$\begin{aligned} f\left( {\alpha }_1, {\alpha }_k;Q\right)&= \sum _{r=1}^q e \left( \frac{a_1 r + a_k r^k}{q}\right) \sum _{\begin{array}{c} Q< n \le 2 Q \\ n \equiv r \;(\mathrm {mod}\;{q}) \end{array}} e\left( {\beta }_1 n + {\beta }_k n^k\right) \nonumber \\&= \frac{1}{q} \sum _{-q/2 < b \le q/2} S\left( q; a_1+b, a_k\right) f\left( {\beta }_1 - b/q,{\beta }_k;Q\right) . \end{aligned}$$
(3.2)

We treat the sum \(f({\beta }_1 - b/q, {\beta }_k;Q)\) by Lemma 2.3, where we take

$$\begin{aligned} \phi (x) = ({\beta }_1 - b/q)x + {\beta }_k x^k \end{aligned}$$

and set

$$\begin{aligned} H_1 = 2^{k-1}(1+kQ^{k-1}|{\beta }_k|) - 1/2. \end{aligned}$$

Then we have \(|\phi '(x)| \le H_1\) for all \(x \le 2 Q\) and thus Lemma 2.3 yields

$$\begin{aligned} f\left( {\beta }_1 - b/q, {\beta }_k;Q\right) = \sum _{|j| \le H_1} I\left( {\beta }_1-b/q-j, {\beta }_k;Q\right) + O\left( \log H_1\right) . \end{aligned}$$

Using this within (3.2) and applying Lemma 2.2 in the error term yields

$$\begin{aligned} f\left( {\alpha }_1, {\alpha }_k;Q\right)&= \frac{1}{q} \sum _{-q/2 < b \le q/2} \sum _{|j| \le H_1} S\left( q; a_1+ b+jq, a_k\right) I\left( {\beta }_1 - (b+jq)/q, {\beta }_k;Q\right) \\&\quad + O\left( q^{1/2}\log H_1\right) . \end{aligned}$$

The proof is complete upon making the change of variables \(b+qj=h\), noting that under the summation conditions this is in fact a bijection onto the set of integers h satisfying \(-H < h \le H\) where \(H= q(H_1+1/2)\). \(\square \)
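The identity (3.2), on which the above proof rests, is exact and can therefore be tested directly. The following Python sketch compares both sides for small parameters; it is meant as an illustration only, the function names are hypothetical, and Q is assumed to be a positive integer for simplicity.

```python
import cmath
import math


def e(x):
    """e(x) = exp(2*pi*i*x)."""
    return cmath.exp(2j * math.pi * x)


def f(alpha1, alphak, k, Q):
    """Dyadic Weyl sum (1.11) of multidegree (1, k), for integer Q."""
    return sum(e(alpha1 * n + alphak * n**k) for n in range(Q + 1, 2 * Q + 1))


def S(q, b, a, k):
    """Complete sum S(q; b, a), with the numerator reduced mod q."""
    return sum(e(((b * x + a * pow(x, k, q)) % q) / q) for x in range(1, q + 1))


def rhs_of_3_2(q, a1, ak, beta1, betak, k, Q):
    """Right-hand side of (3.2), summing over the integers -q/2 < b <= q/2."""
    bs = range(-((q - 1) // 2), q // 2 + 1)
    return sum(S(q, a1 + b, ak, k) * f(beta1 - b / q, betak, k, Q)
               for b in bs) / q
```

Here the left-hand side of (3.2) is obtained as `f(a1 / q + beta1, ak / q + betak, k, Q)`, and the two quantities agree up to rounding errors.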

We now distinguish two cases according to which term in H is larger. Suppose first that \(k|{\beta }_k|Q^{k-1}>1\), so that

$$\begin{aligned} 1 \ll H/q \ll |{\beta }_k|Q^{k-1}. \end{aligned}$$
(3.3)

In such a situation, we discern from Lemma 2.4(b) and Lemma 2.2 that

$$\begin{aligned} \sum _{|h| \le H} S(q; a_1+h, a_k) I({\beta }_1 -h/q, {\beta }_k;Q)&\ll (Q^{k-2}|{\beta }_k|)^{-1/2} \left( \frac{H}{q}+1\right) \sum _{a=1}^q |S(q; a, a_k)| \\&\ll q^{3/2}Q^{k/2}|{\beta }_k|^{1/2}, \end{aligned}$$

where in the last step we used (3.3). Thus, in this situation, we find that

$$\begin{aligned} f\left( {\alpha }_1, {\alpha }_k;Q\right) \ll q^{1/2}\left( Q^{k/2}|{\beta }_k|^{1/2} + \log H\right) \ll q^{1/2+\varepsilon }Q^{k/2}|{\beta }_k|^{1/2}, \end{aligned}$$
(3.4)

which is satisfactory for the purposes of Theorem 1.1.

It remains to study the behaviour of \(f({\alpha }_1, {\alpha }_k;Q)\) when \(k |{\beta }_k| Q^{k-1} \le 1\), or in other words,

$$\begin{aligned} H \ll q. \end{aligned}$$
(3.5)

Set \(d = (a_1+h, q)\) and \(e =(a_1+h)/d\). In this notation, we have \(h=de-a_1\) and \((e, q/d)=1\), and the conclusion of Lemma 3.1 reads

$$\begin{aligned}&f({\alpha }_1, {\alpha }_k;Q) \nonumber \\&\quad = q^{-1} \sum _{d|q} \sum _{\begin{array}{c} e \in \mathbb {Z}\\ |de-a_1| \le H\\ (e,q/d)=1 \end{array}} S\left( q; de, a_k\right) I\left( {\beta }_1 - \frac{de-a_1}{q}, {\beta }_k; Q\right) + O\left( q^{1/2 + \varepsilon }\right) . \end{aligned}$$
(3.6)

We expect the sum on the right hand side of (3.6) to be dominated by the terms corresponding to small values of h. In particular, whenever \(|h| \le d/2\) we have that \(|e-a_1/d| \le 1/2\) and hence \(e=\left[ \left[ a_1/d \right] \right] \). These terms will form our main term. Write

$$\begin{aligned} E\left( q, \mathbf {a}; {\varvec{\beta }}\right) = q^{-1} \sum _{d|q} \sum _{\begin{array}{c} e \in \mathbb {Z}\\ d/2 < |de-a_1| \le H \end{array}} \left| S(q; de, a_k) I\left( {\alpha }_1 - \frac{de}{q}, {\beta }_k;Q\right) \right| \end{aligned}$$

for the sum over all the remaining terms where \(|h| > d/2\). Then (3.6) may be rephrased as

(3.7)

Thus, it suffices to bound \(E(q, \mathbf {a}; {\varvec{\beta }})\). By Lemma 2.5 we see that

$$\begin{aligned} E\left( q, \mathbf {a}; {\varvec{\beta }}\right) \ll q^{-1}\left( 1+Q^k|{\beta }_k|\right) ^{1/2} \sum _{d|q}\sum _{\begin{array}{c} e \in \mathbb {Z}\\ d/2 < |de-a_1| \le H \end{array}} \frac{|S\left( q; de, a_k\right) |}{|{\alpha }_1-de/q|}. \end{aligned}$$
(3.8)

Now, the condition \(de \ne a_1\) together with our bound \(|{\beta }_1| \le (2q)^{-1}\) implies that

$$\begin{aligned} \left| {\alpha }_1 - \frac{de}{q}\right| \ge \frac{|a_1-de|}{q} - |{\beta }_1| \ge \frac{|a_1-de|}{q} - \frac{1}{2q} \ge \frac{|a_1-de|}{2q}. \end{aligned}$$
(3.9)

Using this within (3.8) and applying Lemma 2.1(b) yields the bound

$$\begin{aligned} E\left( q, \mathbf {a}; {\varvec{\beta }}\right)&\ll \left( 1 + Q^k|{\beta }_k|\right) ^{1/2} \sum _{d|q}\sum _{\begin{array}{c} e \in \mathbb {Z}\\ d/2< |de-a_1| \le H \end{array}}\frac{|S\left( q; de, a_k\right) | }{|a_1-de|} \nonumber \\&\ll q^{1/2+\varepsilon }\left( 1 + Q^k|{\beta }_k|\right) ^{1/2} \sum _{d|q}d \sum _{\begin{array}{c} e \in \mathbb {Z}\\ d/2 < |de-a_1| \le H \end{array}}|a_1-de|^{-1} \nonumber \\&\ll q^{1/2+\varepsilon }\left( 1 + Q^k|{\beta }_k|\right) ^{1/2}, \end{aligned}$$
(3.10)

where in the last step we used (3.5) together with standard bounds for the divisor function. The proof of (3.1) under the assumption (3.5) is now complete upon inserting (3.10) into (3.7), and the unconditional statement follows upon combining this with the bound (3.4).

The second statement of Theorem 1.1 is proved in a similar manner, and we only briefly detail the changes that need to be effected. Here, we do not need to consider a dyadic dissection of the interval, so all of our arguments will involve the exponential sum \(f({\alpha }_1, {\alpha }_k)\) and integral \(I({\beta }_1, {\beta }_k)\) instead of their dyadic analogues \(f({\alpha }_1, {\alpha }_k;Q)\) and \(I({\beta }_1, {\beta }_k;Q)\), and will have P instead of Q. We now observe that in the proof of Lemma 3.1, the condition (1.8) implies that

$$\begin{aligned} |\phi '(x)| = |{\beta }_1 - b/q + k {\beta }_k x^{k-1}| \le \frac{1}{2q} + \frac{1}{2} + \frac{1}{4q} \le 2. \end{aligned}$$

We may thus take \(H_1=2\). The argument then proceeds as above, with the difference that we may skip the discussion of the case (3.3) and can continue directly with the hypothesis (3.5). From this point we arrive, mutatis mutandis, at (3.7). In order to bound the error term \(E(q, \mathbf {a}; {\varvec{\beta }})\) we now note that (3.9) and (1.8) combine to show that, for \(1 \le x \le P\), one has

$$\begin{aligned} \left| {\alpha }_1 - \frac{de}{q} + k {\beta }_k x^{k-1}\right| \ge \left| {\alpha }_1 - \frac{de}{q}\right| - \frac{1}{4q} \ge \frac{|a_1-de|}{4q}. \end{aligned}$$

It thus follows from Lemma 2.4(a) that (3.8) can be replaced by

$$\begin{aligned} E\left( q, \mathbf {a}; {\varvec{\beta }}\right) \ll \sum _{d|q}\sum _{\begin{array}{c} e \in \mathbb {Z}\\ d/2 < |de-a_1| \le H \end{array}} \frac{|S\left( q; de, a_k\right) |}{|a_1-de|}, \end{aligned}$$

and the desired bound \(E(q, \mathbf {a}; {\varvec{\beta }}) \ll q^{1/2+\varepsilon }\) follows as above. This completes the proof of Theorem 1.1. Moreover, Corollary 1.1(a) is now immediate, and part (b) follows easily upon applying Theorem 7.3 of [17] and Lemma 2.1(a) within Theorem 1.1.

We now turn to the proof of Theorem 1.2. When \(d \ne (a_1,q)\), the fractions \(a_1/q\) and \(d\left[ \left[ a_1/d \right] \right] /q\) are distinct, and thus the latter one corresponds to a non-optimal rational approximation to \({\alpha }_1\). In particular, we have \(|{\alpha }_1 - q^{-1}d\left[ \left[ a_1/d \right] \right] | \ge 1/(2q)\). Thus, we can apply Lemma 2.5 within the sum (1.9) much as above, and the statement of Theorem 1.2(a) follows immediately.

For the second statement of the theorem, we begin by observing that the hypotheses imply that the sum in Theorem 1.1 has exactly two main terms, corresponding to the values \(d=1\) and \(d=q\), respectively. We will focus on the latter. Note that when \({\beta }_k = 0\), we can explicitly compute

$$\begin{aligned} I\left( {\alpha }_1, 0\right) = \int _0^P e\left( {\alpha }_1 x\right) \mathrm {d}x = \frac{e\left( {\alpha }_1 P\right) - 1}{2 \pi i {\alpha }_1} = e\left( \frac{{\alpha }_1 P}{2}\right) \frac{\sin \left( \pi {\alpha }_1 P\right) }{\pi {\alpha }_1}. \end{aligned}$$

When \(\Vert {\alpha }_1 P \Vert > {\delta }\), we have \(2{\delta }\le |\sin (\pi {\alpha }_1 P)| \le 1\), and thus

$$\begin{aligned} \frac{4q{\delta }}{3\pi } \le \frac{2{\delta }}{\pi |{\alpha }_1|} \le |I({\alpha }_1, 0)| \le \frac{1}{\pi |{\alpha }_1|} \le \frac{2q}{\pi } \end{aligned}$$

under our assumption that \(1/(2q) \le {\alpha }_1 \le 3/(2q)\). Thus, the main term corresponding to \(d=q\) is given by

$$\begin{aligned} q^{-1} |S\left( q; 0, a_k\right) I\left( {\alpha }_1, 0\right) | \ge \frac{4 {\delta }}{3 \pi }|S\left( q; 0, a_k\right) |. \end{aligned}$$

Finally, we note that when \(q=p^k\), Lemma 4.4 in [17] shows that \(|S(q; 0,a_k)| = q^{1-1/k}\), which implies the result.
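The closed form for \(I({\alpha }_1,0)\) and the two-sided bound derived from it can be checked numerically. The following Python sketch is purely illustrative: the function names and the sample values of q, P and \({\delta }\) are assumptions made here and play no role in the argument.

```python
import cmath
import math

def e(x):
    """The standard abbreviation e(x) = exp(2*pi*i*x)."""
    return cmath.exp(2j * math.pi * x)

def I_closed(alpha1, P):
    """Closed form e(alpha1*P/2) * sin(pi*alpha1*P) / (pi*alpha1)."""
    return e(alpha1 * P / 2) * math.sin(math.pi * alpha1 * P) / (math.pi * alpha1)

def I_midpoint(alpha1, P, steps=20000):
    """Midpoint-rule approximation to the integral of e(alpha1*x) over [0, P]."""
    h = P / steps
    return h * sum(e(alpha1 * (j + 0.5) * h) for j in range(steps))

# Illustrative values (assumptions): q = 27, P = 1000, alpha1 = 1/q in [1/(2q), 3/(2q)].
q, P = 27, 1000
alpha1 = 1.0 / q
dist = abs(alpha1 * P - round(alpha1 * P))   # ||alpha1 * P||
delta = 0.9 * dist                           # ensures ||alpha1 * P|| > delta
lower = 4 * q * delta / (3 * math.pi)        # left end of the chain of inequalities
upper = 2 * q / math.pi                      # right end of the chain of inequalities
value = abs(I_closed(alpha1, P))
print(lower, value, upper)
```

Other admissible values of \({\alpha }_1\) can be substituted, provided that \({\alpha }_1 P\) is not an integer.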

4 Additional lemmata for the proof of Theorem 1.3

For the proof of our second main result, we need some more detailed information about cubic complete exponential sums.

Lemma 4.1

Let \(q \in \mathbb {N}\) and set

$$\begin{aligned} q_2 = \prod _{\begin{array}{c} p^t\Vert q\\ t=1\text { or }2 \end{array}} p^t,\quad q_3=\prod _{\begin{array}{c} p^t\Vert q\\ t\ge 3 \end{array}} p^t, \quad \kappa (q) =q_2^{1/2}q_3^{1/3}. \end{aligned}$$

Furthermore, suppose that \((q,a)=1\). Then

$$\begin{aligned} S_{1,3}\left( q;b,a\right) \ll q^{1+\varepsilon }\kappa (q)^{-1}. \end{aligned}$$

Proof

Suppose that \(q=rs\) with \((r,s)=1\), and write \(a=a_2r+a_1s\) and \(b=b_2r+b_1s\). Then it follows by standard arguments that

$$\begin{aligned} S_{1,3}\left( q;b,a\right) =S_{1,3}\left( r;b_1,a_1\right) S_{1,3}\left( s;b_2,a_2\right) . \end{aligned}$$

We are therefore free to restrict our focus to prime power moduli. By Lemma 2.1(a) when \((q_3,a)=1\) we have

$$\begin{aligned} S_{1,3}\left( q_3;b,a\right) \ll q_3^{2/3+\varepsilon }. \end{aligned}$$

Thus it suffices to show that whenever \((a,p)=1\) and \(t=1\) or 2 we have

$$\begin{aligned} S_{1,3}\left( p^t;b,a\right) \ll p^{t/2}. \end{aligned}$$

When \(t=1\) this follows from Corollary II.2F of Schmidt [13]. Suppose \(t=2\). Then when \(p=2\) or 3 this bound is trivial, so we can suppose that \(p>3\). Thus

$$\begin{aligned} S_{1,3}(p^2;b,a)&= \sum _{v=0}^{p-1} \sum _{u=1}^p e\left( \frac{b(pv+u)+a(pv+u)^3}{p^2} \right) \\&= \sum _{u=1}^p e\left( \frac{bu+au^3}{p^2} \right) \sum _{v=0}^{p-1} e\left( \frac{b+3au^2}{p}v \right) \\&= p\sum _{\begin{array}{c} u=1\\ b+3au^2\equiv 0 \;(\mathrm {mod}\;{p}) \end{array}}^p e\left( \frac{bu+au^3}{p^2} \right) . \end{aligned}$$

The congruence \(b+3au^2\equiv 0 \;(\mathrm {mod}\;{p})\) has at most two solutions, and it follows that \(|S_{1,3}(p^2;b,a)| \le 2p\). The claim of the lemma follows upon collecting our results. \(\square \)
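The reduction used for the case \(t=2\) lends itself to a brute-force check. The sketch below, in which all function names are our own illustrative choices, compares the complete sum modulo \(p^2\) with the sum over the critical residues u and tests the bound 2p for small primes \(p>3\).

```python
import cmath
import math

def S13(q, b, a):
    """Complete sum S_{1,3}(q; b, a) = sum over m = 1..q of e((b*m + a*m^3)/q)."""
    return sum(cmath.exp(2j * math.pi * ((b * m + a * m**3) % q) / q)
               for m in range(1, q + 1))

def S13_reduced(p, b, a):
    """p times the sum over u in 1..p with b + 3*a*u^2 = 0 (mod p)."""
    q = p * p
    return p * sum(cmath.exp(2j * math.pi * ((b * u + a * u**3) % q) / q)
                   for u in range(1, p + 1)
                   if (b + 3 * a * u * u) % p == 0)

def check_prime_square(p):
    """Test the reduction and |S| <= 2p for all b and all a coprime to p (p > 3)."""
    q = p * p
    for a in range(1, q):
        if a % p == 0:
            continue
        for b in range(q):
            full = S13(q, b, a)
            if abs(full - S13_reduced(p, b, a)) > 1e-6:
                return False
            if abs(full) > 2 * p + 1e-6:
                return False
    return True

print(check_prime_square(5), check_prime_square(7))
```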

Lemma 4.2

Let \(q\in \mathbb N\) be odd and \(c\in \mathbb Z\) with \((q,c)=1\). Then there exists \(a\in \mathbb Z\) such that \((q,a)=1\) and for every \(\varepsilon >0\) we have

$$\begin{aligned} |S_{1,3}(q;a-c,a)|\gg q^{1/2-\varepsilon }. \end{aligned}$$

Proof

Before embarking on the argument, we note that the lemma is trivial for \(q=1\), and hence we may assume \(q>1\) and consequently \(c \ne 0\). Again we use the multiplicative property of the Gauss sum as described at the start of the proof of the previous lemma. Thus it suffices to establish the lemma for prime powers.

Let p be an odd prime, \(t\in \mathbb N\) and \(c\in \mathbb Z\) be such that \(p\not \mid c\). The lemma follows if we are able to show that there is an absolute constant \(\xi >0\) having the property that

$$\begin{aligned} |S_{1,3}(p^t;a-c,a)|\ge \xi p^{t/2} \end{aligned}$$

for all odd prime powers \(p^t\) and a suitable \(a=a(p^t,c)\) coprime to p.

First we deal with the case \(t=1\). Clearly, the desired statement follows if we can show that

$$\begin{aligned} \sum _{a=1}^{p-1}|S_{1,3}(p;a-c,a)|^2\ge \xi p^2. \end{aligned}$$
(4.1)

In general, we have

$$\begin{aligned} \sum _{a=1}^{p-1} |S_{1,3}\left( p;a-c,a\right) |^2 = p \sum _{\begin{array}{c} m,n=1\\ n+n^3\equiv m+m^3\;(\mathrm {mod}\;{p}) \end{array}}^{p} e\left( \frac{c(n-m)}{p}\right) - \left| \sum _{m=1}^{p} e\left( \frac{cm}{p}\right) \right| ^2. \end{aligned}$$

Clearly, the second sum vanishes. Hence

$$\begin{aligned} \sum _{a=1}^{p-1} |S_{1,3}\left( p;a-c,a\right) |^2&= p \sum _{m=1}^{p} \sum _{\begin{array}{c} n=1\\ n+n^3\equiv m+m^3\;(\mathrm {mod}\;{p}) \end{array}}^{p} e\left( \frac{c(n-m)}{p}\right) \nonumber \\&=p^2 + p\sum _{m=1}^p \sum _{\begin{array}{c} h=1\\ 3m^2+3hm+h^2+1\equiv 0\;(\mathrm {mod}\;{p}) \end{array}}^{p-1} e\left( \frac{ch}{p} \right) . \end{aligned}$$
(4.2)

When \(p=3\) the congruence in the inner sum becomes \(h^2\equiv -1\pmod 3\) which is insoluble, so that sum vanishes, and (4.2) reads

$$\begin{aligned} \sum _{a=1}^{2} |S_{1,3}(3;a-c,a)|^2= 3^2. \end{aligned}$$

When \(p>3\), on the other hand, we use that

$$\begin{aligned} 12\left( 3m^2+3hm+h^2+1\right) = (6m+3h)^2+3h^2+12, \end{aligned}$$

whence we obtain

$$\begin{aligned} p\sum _{m=1}^p \sum _{\begin{array}{c} h=1\\ 3m^2+3hm+h^2+1\equiv 0\;(\mathrm {mod}\;{p}) \end{array}}^{p-1} e\left( \frac{ch}{p} \right) = p\sum _{h=1}^{p-1} e\left( \frac{ch}{p} \right) \left( 1+\left( \frac{-3h^2-12}{p} \right) _L\right) , \end{aligned}$$

where \(\big (\frac{a}{p}\big )_L\) denotes the Legendre symbol. Thus, upon extending the sum on the right hand side to include the term \(h=p\) also, while noting that \(-3p^2-12 \equiv 2^2 (-3) \;(\mathrm {mod}\;{p})\), we obtain

$$\begin{aligned} \sum _{a=1}^{p-1} |S_{1,3}(p;a-c,a)|^2&= p^2 - p\left( 1+\left( \frac{-3}{p}\right) _L \right) \nonumber \\&\quad + p\sum _{h=1}^{p} e\left( \frac{ch}{p} \right) \left( \frac{-3h^2-12}{p} \right) _L. \end{aligned}$$
(4.3)

At this point, we see from Theorem II.2G in [13] that

$$\begin{aligned} \left| \sum _{h=1}^{p} e\left( \frac{ch}{p} \right) \left( \frac{-3h^2-12}{p} \right) _L \right| \le 2 p^{1/2}, \end{aligned}$$
(4.4)

whence we find that

$$\begin{aligned} \sum _{a=1}^{p-1} |S_{1,3}(p;a-c,a)|^2\ge p^2-2p-2p^{3/2}. \end{aligned}$$

When \(p>7\), this is already sufficient for (4.1), so it remains to analyse the cases when \(p=5\) and \(p=7\).

Let now \(p=5\). Since \(\left( \frac{-3}{5}\right) _L=-1\), the desired bound (4.1) follows with \(\xi = 1 - 2/\sqrt{5}\) upon deploying (4.4) within the expression in (4.3). Finally, consider the case \(p=7\). Since \(\left( \frac{-3}{7}\right) _L=1\), we have in (4.3) that

$$\begin{aligned} \sum _{a=1}^{6} |S_{1,3}(7;a-c,a)|^2 = 42 + 7\sum _{h=1}^{6} e\left( \frac{ch}{7}\right) \left( \frac{-3h^2-12}{7} \right) _L, \end{aligned}$$

where we observed that the summand corresponding to \(h=7\) is 1. The remaining sum over h is

$$\begin{aligned} -e\left( \frac{c}{7}\right) +e\left( \frac{2c}{7}\right) -e\left( \frac{3c}{7} \right) -e\left( \frac{4c}{7} \right) +e\left( \frac{5c}{7}\right) -e\left( \frac{6c}{7} \right) , \end{aligned}$$

which has absolute value smaller than 6 for all \(c \in \{1, \ldots , 6\}\). It follows that (4.1) holds in this case also. Thus whenever \(t=1\) there is at least one a satisfying the necessary requirements.
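The identity (4.3) can be tested directly for small primes. The following sketch is illustrative only: the helper names are assumptions, and the Legendre symbol is computed by Euler's criterion.

```python
import cmath
import math

def e(x):
    """e(x) = exp(2*pi*i*x)."""
    return cmath.exp(2j * math.pi * x)

def legendre(n, p):
    """Legendre symbol (n/p) for an odd prime p; the value is 0 when p divides n."""
    r = pow(n % p, (p - 1) // 2, p)
    return -1 if r == p - 1 else r

def second_moment(p, c):
    """Left-hand side of (4.3): sum over a = 1..p-1 of |S_{1,3}(p; a-c, a)|^2."""
    total = 0.0
    for a in range(1, p):
        s = 0
        for n in range(1, p + 1):
            residue = ((a - c) * n + a * n**3) % p
            s += e(residue / p)
        total += abs(s) ** 2
    return total

def right_hand_side(p, c):
    """Right-hand side of (4.3), for primes p > 3 and c not divisible by p."""
    char_sum = sum(e(((c * h) % p) / p) * legendre(-3 * h * h - 12, p)
                   for h in range(1, p + 1))
    # The character sum is real, since the Legendre factor is even in h.
    return p * p - p * (1 + legendre(-3, p)) + p * char_sum.real

for p in (5, 7, 11, 13):
    print(p, [round(second_moment(p, c) - right_hand_side(p, c), 9) for c in range(1, p)])
```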

Now consider the case \(t\ge 2\). Suppose first that \(p>3\). Write \(t=3v+u\) with \(v\ge 0\) and \(1\le u\le 3\). When \(u\not =1\) choose \(a=c\). Then by iteratively applying Lemma 4.4 of [17] we find that

$$\begin{aligned} S_{1,3}(p^t;a-c,a) = S_{1,3}(p^t;0,a)= p^{2v+u-1}\ge p^{t/2}, \end{aligned}$$

since in the notation of that lemma and (2.25) ibidem we have \(l=t>1=\gamma \).

When \(u=1\), we may now assume that \(v\ge 1\). Choose a so that \(a\equiv c+p^{2v}(a'-c)\pmod {p^t}\) where \(a'\) is at our disposal. Put \(m=p^{3v}x+y\). Then our sum is

$$\begin{aligned} S_{1,3}(p^{3v+1};p^{2v}(a'-c),a) = \sum _{y=1}^{p^{3v}} \sum _{x=0}^{p-1} e\left( \frac{3ay^2x}{p} + \frac{(a'-c)y}{p^{v+1}} + \frac{ay^3}{p^{3v+1}}\right) . \end{aligned}$$

The sum over x is 0 unless p|y in which case it sums to p, and thus the above is

$$\begin{aligned} p\sum _{z=1}^{p^{3v-1}} e\left( \frac{(a'-c)z}{p^{v}} + \frac{az^3}{p^{3(v-1)+1}} \right) = p^2S_{1,3}(p^{3(v-1)+1}; p^{2v-2}(a'-c), a). \end{aligned}$$

Iterating this argument gives

$$\begin{aligned} S_{1,3}(p^t;a-c,a) = p^{2v}S_{1,3}(p;a'-c,a). \end{aligned}$$

Now we choose \(a'\) in accordance with the case \(t=1\) above. Thus

$$\begin{aligned} |S_{1,3}(p^t;a-c,a)|\ge \xi p^{2v+1/2}\ge \xi p^{t/2}. \end{aligned}$$

When \(p=3\) we can apply a slightly modified argument. Now in the notation (2.25) of [17] we have \(\gamma =2\). When \(t=3v+u\) with \(u=2\) or 3 we again take \(a=c\) and obtain, by Lemma 4.4 ibidem,

$$\begin{aligned} S_{1,3}\left( 3^t;a-c,a\right) = 3^{2v}S_{1,3}\left( 3^u;0,a\right) \end{aligned}$$

and this is \(3^{2v+2}\) when \(u=3\). When \(u=2\), we have instead

$$\begin{aligned} S_{1,3}\left( 3^u;0,a\right) = 3+6\cos \left( 2 \pi a/9\right) . \end{aligned}$$

Since a is not divisible by 3, the cosine cannot be \(-\frac{1}{2}\). Thus

$$\begin{aligned} |S_{1,3}(3^t;a-c,a)|\ge \xi 3^{t/2}. \end{aligned}$$

When \(u=1\) we follow the recipe for general p and obtain

$$\begin{aligned} S_{1,3}(3^t;a-c,a) = 3^{2v}S_{1,3}(3;a'-c,a) \end{aligned}$$

and again appeal to the case \(t=1\). \(\square \)
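Two of the explicit evaluations in the proof can be checked by direct summation: the value of \(S_{1,3}(9;0,a)\), and, for \(p>3\) and \(v=1\), the lifting step \(S_{1,3}(p^4;p^2b,a)=p^2S_{1,3}(p;b,a)\) with \(p\not \mid a\). The sketch below is illustrative; the function names and the sampled parameters are assumptions, and only the primes 5 and 7 are tested.

```python
import cmath
import math

def S13(q, b, a):
    """Complete sum S_{1,3}(q; b, a) = sum over m = 1..q of e((b*m + a*m^3)/q)."""
    return sum(cmath.exp(2j * math.pi * ((b * m + a * m**3) % q) / q)
               for m in range(1, q + 1))

def cosine_formula_holds(a):
    """Compare S_{1,3}(9; 0, a) with 3 + 6*cos(2*pi*a/9) for a not divisible by 3."""
    return abs(S13(9, 0, a) - (3 + 6 * math.cos(2 * math.pi * a / 9))) < 1e-9

def lifting_holds(p, b, a):
    """Compare S_{1,3}(p^4; p^2*b, a) with p^2 * S_{1,3}(p; b, a) for p > 3, p not dividing a."""
    return abs(S13(p**4, p * p * b, a) - p * p * S13(p, b, a)) < 1e-6

print(all(cosine_formula_holds(a) for a in (1, 2, 4, 5, 7, 8)))
print(all(lifting_holds(5, b, a) for b in range(5) for a in (1, 2, 3, 4)))
```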

5 Prolegomena to the proof of Theorem 1.3

Before proceeding to the various parts of the proof of Theorem 1.3, it is useful to review some measure theoretic aspects of approximation of real numbers by rational numbers. In view of the periodicity of our functions, we concentrate on the interval [0, 1].

As is well known, Dirichlet’s approximation theorem states that every real number \(\gamma \) has the property that there are arbitrarily large \(q\in \mathbb N\) and \(c\in \mathbb Z\) such that \((q,c)=1\) and \(|\gamma -c/q|\le q^{-2}\). Moreover, it follows from Khinchine’s theorem (see e.g. Theorem III.3A in [14]) that for almost every such \({\gamma }\) there is a positive number \(C(\gamma )\) such that whenever \((q,c)=1\) we have

$$\begin{aligned} \frac{C(\gamma )}{q^{2}(\log 2q)^{2}}\le \left| \gamma - \frac{c}{q}\right| . \end{aligned}$$
(5.1)

In particular there is a subset \(\varGamma \) of \((\mathbb R\setminus \mathbb Q)\cap [0,1]\) having these properties and with \(\mathrm{meas} \varGamma =1\).

For the upper bound when \(k=3\) we need to refine this further. Let \(\varGamma _0\) denote the subset of \(\varGamma \) with the property that for every \(\delta >0\) and \(\gamma \in \varGamma _0\) there are, in the notation of Lemma 4.1, at most a finite number of q and c with

$$\begin{aligned} \left| \gamma -\frac{c}{q} \right| \le q_2^{-2-\delta }q_3^{-4/3-\delta }. \end{aligned}$$
(5.2)

Lemma 5.1

The set \(\varGamma _{0}\) has full measure in [0, 1].

Proof

Let \(\varUpsilon _0=\bigcup _{\delta >0}\varUpsilon (\delta )\) where \(\varUpsilon (\delta )\) denotes the set of \(\gamma \in [0,1]\) having the property that (5.2) holds for infinitely many pairs \((q,c)\). Let further \(N\in \mathbb N\). Then

$$\begin{aligned} \varUpsilon (\delta ) \subseteq \bigcup _{q>N} \bigcup _{0\le c\le q} \left\{ \gamma : \left| \gamma -\frac{c}{q} \right| \le q_2^{-2-\delta }q_3^{-4/3-\delta } \right\} . \end{aligned}$$

Hence

$$\begin{aligned} \mathrm {meas} \varUpsilon (\delta ) \le \sum _{q_2q_3>N}4q_2^{-1-\delta }q_3^{-1/3-\delta }. \end{aligned}$$

We have

$$\begin{aligned} \sum _{q_3\le X}1 \le \sum _{\begin{array}{c} r^3l\le X\\ l|r^2 \end{array}} 1 \le \sum _{r\le X^{1/3}} d(r^2) \ll X^{1/3+\varepsilon }. \end{aligned}$$

Therefore for any \(Y>0\) we have

$$\begin{aligned} \sum _{q_3>Y} q_3^{-1/3-\delta } = \sum _{k=0}^{\infty } \sum _{2^kY<q_3\le 2^{k+1}Y} q_3^{-1/3-\delta } \ll Y^{-\delta /2}, \end{aligned}$$

and so

$$\begin{aligned} \sum _{q_2q_3>N} 4q_2^{-1-\delta }q_3^{-1/3-\delta }&\ll \sum _{q_2\le N} q_2^{-1-\delta } \sum _{q_3>N/q_2} q_3^{-1/3-\delta } + \sum _{q_2> N} 4q_2^{-1-\delta } \\&\ll \sum _{q_2\le N} q_2^{-1-{\delta }/2}N^{-\delta /2} + N^{-\delta }. \end{aligned}$$

Thus

$$\begin{aligned} \mathrm {meas} \varUpsilon (\delta ) \ll N^{-\delta /2} \end{aligned}$$

and this holds for every \(N\in \mathbb N\). \(\square \)
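The counting step for the cube-full parts \(q_3\) can be illustrated numerically. In the sketch below, the function names are our own and the cut-offs X are arbitrary illustrative choices. It counts the integers up to X all of whose prime factors occur to at least the third power and compares this count with the majorant \(\sum _{r\le X^{1/3}} d(r^2)\) from the proof.

```python
def is_cube_full(n):
    """True if every prime dividing n does so to at least the third power (n = 1 included)."""
    p = 2
    while p * p <= n:
        if n % p == 0:
            t = 0
            while n % p == 0:
                n //= p
                t += 1
            if t < 3:
                return False
        p += 1
    # Any remaining factor n > 1 is a prime occurring to the first power only.
    return n == 1

def count_cube_full(X):
    """Number of admissible values of q_3 not exceeding X."""
    return sum(1 for n in range(1, X + 1) if is_cube_full(n))

def divisor_count(n):
    """The divisor function d(n), by trial division."""
    return sum(1 for d in range(1, n + 1) if n % d == 0)

def divisor_majorant(X):
    """Sum of d(r^2) over all r with r^3 <= X."""
    total, r = 0, 1
    while r**3 <= X:
        total += divisor_count(r * r)
        r += 1
    return total

for X in (100, 1000, 5000):
    print(X, count_cube_full(X), divisor_majorant(X))
```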

6 Theorem 1.3: the upper bound when \(k=2\)

Let \(\alpha \in \mathbb R\) and \({\gamma }\in {\varGamma }\). By Dirichlet’s theorem on diophantine approximation we can choose \(a_2\), q with \((a_2,q)=1\), \(q\le Q\) and

$$\begin{aligned} \left| \alpha +\gamma -\frac{a_2}{q}\right| \le \frac{1}{q Q}. \end{aligned}$$

Then choose \(a_1\) so that

$$\begin{aligned} \left| \alpha -\frac{a_1}{q}\right| \le \frac{1}{2q}. \end{aligned}$$

Set \({\beta }_1 = {\alpha }-a_1/q\) and \({\beta }_2 = {\alpha }+{\gamma }-a_2/q\). Hence, by Theorem 8 of [18], we have

$$\begin{aligned} f_{1,2}\left( \alpha ,\alpha +\gamma ;Q\right) = q^{-1}S_{1,2}\left( q;a_1,a_2\right) I_{1,2}\left( \beta _1,\beta _2;Q\right) + O\left( Q^{1/2} \right) . \end{aligned}$$

Since \((a_2,q)=1\) we have \(|S_{1,2}(q;a_1,a_2)| \ll \sqrt{q}\) by classical bounds on the Gauss sum. Thus

$$\begin{aligned} f_{1,2}\left( \alpha ,\alpha +\gamma ;Q\right) \ll |I_{1,2}\left( \beta _1, \beta _2;Q\right) | q^{-1/2} + Q^{1/2}. \end{aligned}$$

Therefore, by Theorem 7.1 in [17] we have the bound

$$\begin{aligned} f_{1,2}\left( \alpha ,\alpha +\gamma ;Q\right) \ll \frac{Q}{\left( q+q Q|\beta _1|+q Q^2|\beta _2|\right) ^{1/2}} + Q^{1/2}. \end{aligned}$$
(6.1)

We may assume that

$$\begin{aligned} q\le \left( \textstyle {\frac{1}{2}}C\left( \varepsilon ,\gamma \right) Q\right) ^{1/\left( 2+\varepsilon \right) }, \end{aligned}$$
(6.2)

for otherwise in the denominator in (6.1) we have

$$\begin{aligned} q+q Q|\beta _1|+q Q^2|\beta _2| \ge \left( \textstyle {\frac{1}{2}}C\left( \varepsilon ,\gamma \right) Q\right) ^{1/(2+\varepsilon )} \end{aligned}$$

trivially. Since \({\gamma }\in {\varGamma }\), we infer from (5.1) via (6.2) that

$$\begin{aligned} |\beta _1|\ge \left| \gamma -\frac{a_2-a_1}{q}\right| -|\beta _2| \ge \frac{C(\varepsilon ,\gamma )}{q^{2+\varepsilon }} - \frac{1}{q Q}\ge \frac{1}{2}C\!(\varepsilon ,\gamma ) q^{-2-\varepsilon }, \end{aligned}$$

provided that \(Q\ge 2/C(\varepsilon ,\gamma )\), which we may certainly assume. Thus, we find

$$\begin{aligned} q+q Q|\beta _1| \ge q+{\textstyle \frac{1}{2}}Q C\!\left( \varepsilon , {\gamma }\right) q^{-1-\varepsilon } \ge \left( {\frac{1}{2}}C\!\left( \varepsilon ,\gamma \right) Q\right) ^{1/(2+\varepsilon )} \end{aligned}$$

in this case as well, and (6.1) becomes

$$\begin{aligned} f_{1,2}\left( \alpha ,\alpha +\gamma ;Q\right) \ll _{\varepsilon ,\gamma } Q^{1-1/(4+2\varepsilon )}. \end{aligned}$$

We conclude that if \(\theta >\frac{3}{4}\), then

$$\begin{aligned} Q^{-\theta } \sup _{\alpha \in [0,1)} |f_{1,2}\left( \alpha ,\alpha +\gamma ;Q\right) | \ll _{\theta ,\gamma } 1 \end{aligned}$$

as required.

7 Theorem 1.3: the upper bound when \(k=3\)

This follows the pattern set in §6. Again we use (5.1), but now we suppose that \(\gamma \in \varGamma _0\), and hence given \(\gamma \) we may suppose that for any fixed \(\delta >0\) the inequality (5.2) holds for at most a finite number of q and c. It will be convenient to also suppose that \(\delta \) is sufficiently small. For such a \(\gamma \) we show that for arbitrarily large Q we have

$$\begin{aligned} f_{1,3}(\alpha ,\alpha +\gamma ;Q) \ll _{\delta ,\gamma } Q^{3/4+\delta } \end{aligned}$$
(7.1)

uniformly for \(\alpha \in \mathbb R\).

Let \(\alpha \in \mathbb R\). By Dirichlet’s theorem on diophantine approximation we can choose \(a_3\), q with

$$\begin{aligned} (a_3,q)= 1, \quad q\le Q^{3/2} \quad \text {and} \quad \left| \alpha +\gamma -\frac{a_3}{q}\right| \le \frac{1}{q Q^{3/2}}. \end{aligned}$$
(7.2)

Then choose \(a_1\) so that

$$\begin{aligned} -\frac{1}{2q}<\alpha -\frac{a_1}{q} \le \frac{1}{2q}, \end{aligned}$$

and let \(\beta _3=\alpha +\gamma -a_3/q\) and \(\beta _1=\alpha -a_1/q\). By Corollary 1.1(a) we have

(7.3)

Since \((a_3,q)=1\), by Theorem 7.3 of [17] together with Lemma 4.1 we have

$$\begin{aligned}&q^{-1}S_{1,3}(q;d\left[ \left[ a_1/d\right] \right] ,a_3)I_{1,3}\left( \alpha -\frac{d\left[ \left[ a_1/d \right] \right] }{q},\beta _3; Q\right) \\&\quad \ll Q^{1+\varepsilon } \kappa (q)^{-1} \left( 1+Q\left| \alpha -\frac{d\left[ \left[ a_1/d \right] \right] }{q}\right| +Q^3|\beta _3|\right) ^{-1/3}. \end{aligned}$$

It follows that the contribution arising from those d|q for which

$$\begin{aligned} \kappa (q)^3\left( 1+Q\left| \alpha -\frac{d\left[ \left[ a_1/d \right] \right] }{q}\right| +Q^3|\beta _3|\right) \ge Q^{3/4-2\delta } \end{aligned}$$
(7.4)

is satisfactory in view of (7.1).

Thus it remains to deal with any possible terms for which (7.4) fails to hold. Suppose that \(d \ne (a_1,q)\) for some d violating (7.4), so that

$$\begin{aligned} \left| \alpha -\frac{d\left[ \left[ a_1/d \right] \right] }{q}\right| \ge \frac{1}{2q}. \end{aligned}$$

In that case we would have

$$\begin{aligned} q_2^{1/2}Q = q_2^{3/2}q_3q^{-1}Q \le 2\kappa (q)^3 Q\left| \alpha -\frac{d\left[ \left[ a_1/d \right] \right] }{q}\right| \le 2 Q^{3/4-2\delta }, \end{aligned}$$

which is impossible for Q large. Hence the only term in (7.3) that could possibly violate (7.4) is the one corresponding to \(d=(q, a_1)\), for which

$$\begin{aligned} \alpha -\frac{d\left[ \left[ a_1/d \right] \right] }{q}= {\alpha }- \frac{a_1}{q}=\beta _1. \end{aligned}$$

For this term the negation of (7.4) reads

$$\begin{aligned} \kappa (q)^3\left( 1+Q|\beta _1|+Q^3|\beta _3|\right) < Q^{3/4-2\delta }, \end{aligned}$$
(7.5)

and we observe that in this instance

$$\begin{aligned} q \le q_2^{3/2}q_3 = \kappa (q)^3 \le Q^{3/4-2\delta }. \end{aligned}$$
(7.6)

We assume there is such a term and show that it contradicts the assumption \(\gamma \in \varGamma _0\). Since \({\varGamma }_0 \subseteq {\varGamma }\), we see from (5.1) and (7.2) that

$$\begin{aligned} \frac{C(\gamma )}{q^{2}(\log 2q)^2} <\left| \gamma -\frac{a_3-a_1}{q}\right| \le |\beta _1|+|\beta _3|\le |\beta _1| + q^{-1}Q^{-3/2}. \end{aligned}$$

Since by (7.6) we may suppose that Q is large enough so that

$$\begin{aligned} q^{-1}Q^{-3/2}\le \frac{C(\gamma )}{2q^{2}(\log 2q)^2}\le \frac{1}{2}\left| \gamma -\frac{a_3-a_1}{q}\right| , \end{aligned}$$

it follows from our assumption (7.5) that

$$\begin{aligned} \left| \gamma -\frac{a_3-a_1}{q}\right| \le 2|\beta _1|\le 2\kappa (q)^{-3}Q^{-1/4-2\delta }. \end{aligned}$$
(7.7)

We also have

$$\begin{aligned} \frac{C(\gamma )}{q^{2}(\log 2q)^2}\le 2\kappa (q)^{-3}Q^{-1/4-2\delta } \end{aligned}$$

and therefore

$$\begin{aligned} C(\gamma )Q^{1/4+2\delta }\le 2q_2^{1/2}q_3(\log 2q)^2 \le 2q(\log 2q)^2. \end{aligned}$$

Thus, as Q can be taken large enough so that \((\log 2q)^2 < \frac{1}{2}C(\gamma )Q^{2\delta }\), we have \(q>Q^{1/4}\). By (7.6) we have

$$\begin{aligned} Q^{-1/4-2\delta }&= \left( Q^{-3/4+2\delta } \right) ^{1/3+\frac{32\delta }{9-24\delta }}\\&\le \left( q_2^{-3/2}q_3^{-1} \right) ^{1/3+\frac{32\delta }{9-24\delta }}\\&\le q_2^{-1/2-2\delta }q_3^{-1/3-2\delta }\\&\le \textstyle {\frac{1}{2}} q_2^{-1/2-\delta }q_3^{-1/3-\delta }. \end{aligned}$$

Hence, by (7.7) we find that

$$\begin{aligned} \left| \gamma -\frac{a_3-a_1}{q}\right| \le q_2^{-2-\delta }q_3^{-4/3-\delta }. \end{aligned}$$

However, by the definition of \(\varGamma _0\) in Sect. 5 this is expressly excluded for large q, and so this establishes as promised that (7.5) is impossible. This completes the proof of (7.1), which gives the conclusion \(\theta _3\le \frac{3}{4}\).
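The exponent manipulation in the chain of inequalities preceding (7.7) can be confirmed in exact rational arithmetic. The following sketch is illustrative; the function names and the sample values of \(\delta \) are assumptions.

```python
from fractions import Fraction

def exponent_identity_holds(delta):
    """Check (3/4 - 2d) * (1/3 + 32d/(9 - 24d)) == 1/4 + 2d exactly."""
    w = Fraction(1, 3) + 32 * delta / (9 - 24 * delta)
    return (Fraction(3, 4) - 2 * delta) * w == Fraction(1, 4) + 2 * delta

def exponents_dominate(delta):
    """Check that the exponents of q_2 and q_3 are at least 1/2 + 2d and 1/3 + 2d."""
    w = Fraction(1, 3) + 32 * delta / (9 - 24 * delta)
    return (Fraction(3, 2) * w >= Fraction(1, 2) + 2 * delta
            and w >= Fraction(1, 3) + 2 * delta)

# Sample values of delta (assumptions); any rational 0 < delta < 3/8 may be used.
samples = [Fraction(1, 100), Fraction(1, 50), Fraction(1, 10)]
print([exponent_identity_holds(d) and exponents_dominate(d) for d in samples])
```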

8 Theorem 1.3: the lower bound

Let \(\delta >0\) be sufficiently small, and let \(k = 2\) or 3. We show that for all \(\gamma \in \mathbb R\setminus \mathbb {Q}\) there are arbitrarily large Q such that

$$\begin{aligned} \sup _{\alpha \in [0,1)} \left| \sum _{Q<n\le 2 Q} e\left( \alpha n + \left( \alpha +\gamma \right) n^k\right) \right| \gg Q^{3/4-\delta }. \end{aligned}$$

The continued fraction algorithm for \(\gamma \) gives q and c with q arbitrarily large, \((q,c)=1\), and

$$\begin{aligned} |\gamma -c/q|\le q^{-2}. \end{aligned}$$

Note that any two successive convergents c/q and \(c'/q'\) of the continued fraction satisfy \(cq'-c'q=\pm 1\) and so \((q,q')=1\). Thus at least one of q and \(q'\) is odd. For an arbitrary convergent with odd denominator q and a fixed small parameter \(\delta >0\) set

$$\begin{aligned} Q=q^{2/(1+2\delta )}. \end{aligned}$$
(8.1)
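The properties of convergents used above can be illustrated with a concrete irrational. The sketch below takes \(\gamma =\sqrt{2}=[1;2,2,2,\ldots ]\) as an assumed example and uses function names of our own choosing. It generates convergents by the standard recurrence and evaluates the choice (8.1) for the odd denominators.

```python
import math

def convergents(partial_quotients):
    """Convergents c/q of the continued fraction [a0; a1, a2, ...] as (c, q) pairs."""
    c_prev, q_prev = 1, 0
    c, q = partial_quotients[0], 1
    out = [(c, q)]
    for a in partial_quotients[1:]:
        c, c_prev = a * c + c_prev, c
        q, q_prev = a * q + q_prev, q
        out.append((c, q))
    return out

def Q_from_q(q, delta):
    """The choice (8.1): Q = q^(2/(1 + 2*delta))."""
    return q ** (2 / (1 + 2 * delta))

# Illustrative example (assumption): gamma = sqrt(2) = [1; 2, 2, 2, ...].
gamma = math.sqrt(2)
conv = convergents([1] + [2] * 12)
odd_pairs = [(c, q) for c, q in conv if q % 2 == 1]

for c, q in odd_pairs:
    print(c, q, abs(gamma - c / q) * q * q, Q_from_q(q, 0.1))
```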

Let \(a_k\) be any integer with \((q,a_k)=1\), and if \(k=3\) assume additionally that \(a_3\) is such that \(|S_{1,3}(q; a_3-c,a_3)| \gg q^{1/2 - {\delta }}\). The existence of such an \(a_3\) is guaranteed by Lemma 4.2. Take further \(\alpha = -\gamma +a_k/q\) and \(a_1=a_k-c\), and define \(\alpha _k = \alpha +\gamma \), \(\alpha _1=\alpha \) and \(\beta _j=\alpha _j-a_j/q\) for \(j = 1, k\). Thus

$$\begin{aligned} \beta _k=\alpha +\gamma -\frac{a_k}{q}=0 \quad \text {and} \quad \beta _1=-\gamma + \frac{c}{q}, \end{aligned}$$

and so in particular

$$\begin{aligned} |\beta _1|\le q^{-2}\le Q^{-1-2\delta }. \end{aligned}$$

We have

$$\begin{aligned} f_{1,k}\left( \alpha ,\alpha +\gamma ; Q\right) = q^{-1} S_{1,k}\left( q;a_1,a_k\right) I_{1,k}\left( -{\gamma }+c/q,0; Q\right) + O\left( Q^{1/3 + {\delta }}\right) . \end{aligned}$$
(8.2)

When \(k=2\), this is Theorem 8 in [18], and for \(k=3\) it is a consequence of the second statement of Theorem 1.1. Indeed, if \(d \ne (a_1, q)\) we have \(d \left[ \left[ a_1/d\right] \right] \ne a_1\), and thus it follows that \(|\alpha -d\left[ \left[ a_1/d \right] \right] /q|>1/(2q)\) and therefore

$$\begin{aligned} \left| I_{1,3}\left( \alpha -\frac{d\left[ \left[ a_1/d \right] \right] }{q},0;Q\right) \right| \ll q \end{aligned}$$

by Lemma 2.4(a). Moreover, by Lemma 2.1(a) we have

$$\begin{aligned} S_{1,3}\left( q;d\left[ \left[ a_1/d\right] \right] ,a_3\right) \ll q^{2/3+\varepsilon }. \end{aligned}$$

Hence we see from (8.1) that

and consequently (3.1) implies that

as claimed.

Note that

$$\begin{aligned} I_{1,k}\left( \beta _1,0; Q\right) = \int _{Q}^{2 Q} e\left( \beta _1x\right) \mathrm {d}x = Q e\left( 3 Q\beta _1/2\right) \frac{\sin \left( \pi \beta _1Q\right) }{\pi \beta _1 Q} \gg Q, \end{aligned}$$
(8.3)

where in the last step we used that \(Q|\beta _1| =Q|\gamma -c/q |\le Q^{-2\delta }\) and thus

$$\begin{aligned} \frac{\sin \left( \pi \beta _1 Q\right) }{\pi \beta _1 Q} \gg 1. \end{aligned}$$

Moreover, since q is odd and \((a_k,q)=1\), we have

$$\begin{aligned} q^{-1}|S_{1,k}(q;a_1,a_k)| \gg q^{-1/2-{\delta }} \gg Q^{-1/4-2{\delta }}. \end{aligned}$$
(8.4)

This follows from the classical bound for quadratic Gauss sums in the case \(k=2\), and is a consequence of our choice of \(a_3\) via Lemma 4.2 when \(k=3\). Hence upon combining (8.2), (8.3) and (8.4), we see that

$$\begin{aligned} |f_{1,k}\left( \alpha ,\alpha +\gamma ; Q\right) | \gg Q^{3/4-2\delta }, \end{aligned}$$

and the theorem follows.

9 Concluding remarks

In conclusion, we make some remarks about what might be the real truth with regard to estimates for Weyl sums on suitable sets of minor arcs. Consider sums of the kind (1.1) with multidegree \(\mathbf {k}= (1, \ldots , k)\), where \({\varvec{\alpha }}\) is ‘sufficiently random’. Then we might guess that in this circumstance \(f_{\mathbf {k}}({\varvec{\alpha }})\) behaves like a sum of independent random unimodular variables and so the central limit theorem suggests that

$$\begin{aligned} f_{\mathbf {k}}\left( \varvec{\alpha }\right) \ll P^{1/2+\varepsilon }. \end{aligned}$$
(9.1)

In fact, the received wisdom states that for (9.1) to be true, it should suffice that \(\varvec{\alpha }\) is such that for any \(a_1,\ldots ,a_k,q\) satisfying (1.4) with \((q,a_1,\ldots ,a_k)=1\) we have

$$\begin{aligned} q+qP|\alpha _1-a_1/q|+\cdots + qP^k|\alpha _k-a_k/q|>P. \end{aligned}$$
(9.2)

In the one-dimensional case, \(\mathbf {k}= k\), the last author and Wooley [19] show that there is a connection between the size of \(f_k({\alpha })\) and the number of solutions of

$$\begin{aligned} x_1^k+\cdots +x_b^k=y_1^k+\cdots +y_b^k \end{aligned}$$

with \(0<x_j,y_j\le P\), and that the values of \(|f_k({\alpha })|\) will even have a normal distribution on the minor arcs. For \(k=2\) we have more precise bounds (see [18, Theorem 7]) since we can cover all of \([0,1]^2\) by choosing \(q,a_2\) with \((q,a_2)=1\), \(|\alpha _2-a_2/q| \le 1/(5qP)\), \(q\le 5P\) and then taking \(a_1\) with \(|\alpha _1-a_1/q|\le 1/(2q)\). Thus the main interest lies in the case \(k\ge 3\). In order to make a proper comparison with the current state of play we first review what is known.

Take \(\mathbf {k}= (1, \ldots , k)\). The Vinogradov mean value theorem (Bourgain, Demeter, Guth [3] and Wooley [22, 23]) combined with Theorem 5.2 of Vaughan [17] give

$$\begin{aligned} f_{\mathbf {k}}({\varvec{\alpha }}) \ll P^{1+\varepsilon } \left( \frac{1}{q_j}+\frac{1}{P}+\frac{q_j}{P^j} \right) ^{\frac{1}{k(k-1)}} \end{aligned}$$
(9.3)

when for some \(j \ge 2\) there are coprime \(q_j\) and \(a_j\) such that \(|\alpha _j-a_j/q_j|\le q_j^{-2}\). At least when \(q_j+P^j|\alpha _j-a_j| >P^2\) for every j and choice of \(q_j\), \(a_j\), this is the best that we know when \(k\ge 6\), and is perhaps also the best that we know when \(3\le k\le 5\) and we need to approximate some (presumably non-zero) \(\alpha _j\) with \(2\le j\le k-1\). When we have q, \(a_k\) with \((q,a_k)=1\) and \(|\alpha _k-a_k/q| \le q^{-2}\), Weyl’s inequality [17, Lemma 2.4] gives

$$\begin{aligned} f_{\mathbf {k}}(\varvec{\alpha }) \ll P^{1+\varepsilon } \left( \frac{1}{q}+\frac{1}{P}+\frac{q}{P^k}\right) ^{2^{1-k}} \end{aligned}$$
(9.4)

and this is superior to (9.3) when \(3\le k\le 5\).

There is an underlying problem when dealing with a sum as general as (1.1). It seems that one ought to consider a general rational approximation to \(\varvec{\alpha }\) of the kind \(|q\alpha _j-a_j|\le Q_j^{-1}\) with \((q,a_1,\ldots ,a_k)=1\), but then to cover the whole of \([0,1]^k\) one needs that q can be as large as \(cQ_1\cdots Q_k\). This means that either q might be much larger than \(P^k\) or the intervals about each rational are too large to be able to deduce anything useful. The alternative, as in the statement of (9.3) above, is to deal with one j at a time. If \(q_j>P^{\theta _j}\) for some \(\theta _j\le 1\) then we are done and that leaves the situation when \(q_j\le P^{\theta _j}\) for every j. Now one can take \(q= \mathrm {lcm}(q_1,\ldots ,q_k)\) and the numerators of the rational approximation to \(\varvec{\alpha }\) become \((a_1q/q_1,\ldots ,a_kq/q_k)\). However, the likelihood is that any useful bound will still need q fairly constrained in terms of P and so the \(\theta _j\) will have to be rather small. An example of this process is given in the proof of [17, Theorem 7.4]. Some aspects of methods to overcome this are described in Chapter 5 of Baker [1].

Whilst our result of Theorem 1.3 does not strictly contradict such heuristics as detailed in the opening paragraph of this section, it does raise the question of whether these heuristics might not be too naive in some cases. When \(\{p_1, \ldots , p_t\}\) is a set of polynomials with a non-vanishing Wronskian, consider the associated exponential sum

$$\begin{aligned} f_{\mathbf {p}}\left( {\alpha }_1, \ldots , {\alpha }_t\right) = \sum _{1 \le n \le P} e\left( {\alpha }_1 p_1(n) + \cdots + {\alpha }_t p_t(n)\right) . \end{aligned}$$
(9.5)

We know from standard bounds that \(\sup _{{\varvec{\alpha }}} |f_{\mathbf {p}}({\varvec{\alpha }})| = f_{\mathbf {p}}(\varvec{0}) = \lfloor P \rfloor \), whilst on the other hand it follows from [6, Corollary 2.2] that whenever the polynomials \(p_j\) have a non-vanishing Wronskian, the bound (9.1) holds for a set of \({\varvec{\alpha }}\in [0,1]^t\) of full measure. Meanwhile, the inequality analogous to (9.2) is not sufficient for the bound (9.1) to hold. This is evidenced in our Theorem 1.3 with the choices \(p_1(n)=n^k\) and \(p_2(n)=n^k+n\), where \(k=2\) or \(k=3\). Here, it transpires from our arguments that even if \({\alpha }_1\) lacks a good rational approximation, for certain choices of \({\alpha }_2\) the contributions of the degree k part of the polynomial more or less cancel out, leading to less random behaviour. It is not clear to the authors whether this behaviour is particular to the presence and influence of the linear term in \(p_2\) or whether there is some underlying phenomenon at work. In the latter scenario, one might now be inclined to guess that if only r of the t coefficients are restricted to lie in a set of full measure whilst the other \(t-r\) coefficients are allowed to range over the entire unit interval, the ensuing bound would interpolate between the two extremes. Thus, one might speculate that when the polynomials \(p_j\) are all of the same degree one has the bound

$$\begin{aligned} \sup _{{\alpha }_{r+1}, \ldots , {\alpha }_t} |f_{\mathbf {p}}({\varvec{\alpha }})| \ll P^{1-r/(2t) + \varepsilon } \quad \text {for almost all }\, {\alpha }_{1}, \ldots , {\alpha }_r, \end{aligned}$$

and that this might in some cases even be sharp (up to epsilon) for a sequence of values P tending to infinity. The exponent here is \(1-r/(2t) = (r/2 + (t-r))/t\) and interpolates between r contributions of 1/2 and \(t-r\) contributions of 1. Whilst this is compatible with the bounds of our Theorem 1.3, there is still hope that stronger bounds may be available if the polynomials in question differ by more than a linear term.

Our understanding is better in the one-dimensional case. On page 43 of Vaughan [17] it was stated (we have changed the notation to be consistent with this memoir) that when \((q,a)=1\) and \(\beta =\alpha -a/q\) it would be very interesting to decide whether the relation

$$\begin{aligned} f_k(\alpha ) = q^{-1}S_k(q,a) I_k({\beta }) + O\left( (q+qP^k|\beta |)^{\theta }\right) \end{aligned}$$

holds with an exponent \(\theta \) smaller than 1/2, and it was even speculated that \(\theta \) might be as small as 1/k. This was shown to be false by Daemen [7] and Brüdern and Daemen [4]. One other result has come to our attention. Heath-Brown [10] has shown on the assumption of the abc conjecture that if \(\alpha \) is a quadratic irrational then

$$\begin{aligned} \sum _{n\le P} e(\alpha n^3) \ll P^{\frac{5}{7}+\varepsilon }. \end{aligned}$$

It may be worthwhile to note that quadratic irrationals are badly approximable numbers so that Heath-Brown’s result can be viewed as applying to an ‘extreme minor arc’ situation.

Finally, we briefly outline an argument, versions of which in other contexts are quite well known, that shows that one cannot expect to bound the exponential sum \(f_k(\alpha )\) by anything smaller than \(P^{1/2}\) on the minor arcs. Let P be large and choose \(R=P^{1+\phi }\) and \(Q=P^{k-1-\psi }\), where \(\phi \) and \(\psi \) are positive numbers at our disposal and \(\phi <\psi \) so that \(RQ<P^{k-\delta }\) for some substantial \(\delta >0\). There are various wrinkles that could be introduced to enable a quite large \(\delta \).

Let \(\mathfrak M\) denote the union of the intervals \([a/q-1/(qQ),a/q+1/(qQ)]\) with \(1\le a\le q\le R\) and \((q,a)=1\) and let \(\mathfrak m = (1/Q,1+1/Q]\setminus \mathfrak M\). The total measure of \(\mathfrak M\) is \(\le 2RQ^{-1}\) and so

$$\begin{aligned} \int _{\mathfrak M} |f_k(\alpha )|^2 \mathrm {d}\alpha \ll P^2RQ^{-1} \ll P^{4+\phi +\psi -k}. \end{aligned}$$

When \(k\ge 4\) this is \(\ll P^{1-\delta }\) for some \(\delta >0\) as long as \(\phi +\psi <1-\delta \). When \(k=3\) this argument can be refined by using the approximations for \(f_k({\alpha })\) given by (4.13), Theorem 4.2 and the integral version of Lemma 2.8 of Vaughan [17]. Then we obtain the bound

$$\begin{aligned} \int _{\mathfrak M} |f_k(\alpha )|^2 \mathrm {d}\alpha&\ll \sum _{q\le R} q \int _0^{1/(qQ)} \left( \frac{P^2}{\left( q+qP^3\beta \right) ^{2/3}} + q^{\varepsilon }(q+qP^3\beta )\right) \mathrm {d}\beta \\&\ll P^{\frac{1}{3}+\frac{4\phi }{3}} + P^{\frac{1}{3}+\phi +\frac{\psi }{3}} + P^{2\phi +\psi +\varepsilon } + P^{\phi +2\psi +\varepsilon } \ll P^{1-\delta } \end{aligned}$$

whenever \(\phi +\psi <\frac{1}{2}-\delta \). Hence by Parseval’s identity, for all \(k\ge 3\) we have

$$\begin{aligned} \int _{\mathfrak m} |f_k(\alpha )|^2 \mathrm {d}\alpha = P+ O(P^{1-\delta }), \end{aligned}$$
(9.6)

which gives the desired conclusion. Similar arguments may easily be implemented for multidimensional exponential sums, and it is also quite feasible to consider higher moments than the second.
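The instance of Parseval's identity underlying (9.6), namely \(\int _0^1 |f_k(\alpha )|^2 \mathrm {d}\alpha = P\), can be observed numerically. The sketch below is illustrative: the function names and the values of P, k and N are assumptions. It averages \(|f_k(j/N)|^2\) over the N equally spaced points \(j/N\). This average equals the integral exactly once \(N\ge P^k\), since distinct k-th powers up to \(P^k\) are then incongruent modulo N.

```python
import cmath
import math

def weyl_sum(alpha, P, k):
    """f_k(alpha) = sum over n = 1..P of e(alpha * n^k)."""
    return sum(cmath.exp(2j * math.pi * alpha * n**k) for n in range(1, P + 1))

def mean_square(P, k, N):
    """Average of |f_k(j/N)|^2 over j = 0..N-1."""
    return sum(abs(weyl_sum(j / N, P, k)) ** 2 for j in range(N)) / N

# Illustrative parameters (assumptions): P = 8, k = 3, N = 513 > P^k = 512.
P, k, N = 8, 3, 513
print(mean_square(P, k, N))
```

This only illustrates the total mean square; it does not separate the contributions of \(\mathfrak M\) and \(\mathfrak m\).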