On generating functions in additive number theory, II: lower-order terms and applications to PDEs

We obtain asymptotics for sums of the form $$\sum_{n=1}^{P} e\left(\alpha_k n^k + \alpha_1 n\right),$$ involving lower order main terms. As an application, we show that for almost all $\alpha_2 \in [0,1)$ one has $$\sup_{\alpha_1 \in [0,1)} \Big| \sum_{1 \le n \le P} e\left(\alpha_1\left(n^3+n\right) + \alpha_2 n^3\right) \Big| \ll P^{3/4+\varepsilon},$$ and that in a suitable sense this is best possible. This allows us to improve bounds for the fractal dimension of solutions to the Schrödinger and Airy equations.


Introduction
Exponential sums are a ubiquitous tool throughout analytic number theory, and have been studied in their own right at least since the 1920s. When $\mathbf{k} = (k_1, \ldots, k_t)$ is a tuple of pairwise distinct natural numbers and $P$ is a large positive integer, the Weyl sum of multidegree $\mathbf{k}$ is given by $$f_{\mathbf{k}}(\alpha_{k_1}, \ldots, \alpha_{k_t}) = \sum_{n=1}^{P} e\left(\alpha_{k_1} n^{k_1} + \cdots + \alpha_{k_t} n^{k_t}\right). \qquad (1.1)$$
Communicated by Kannan Soundararajan.
Such sums regularly feature in applications of the Hardy-Littlewood circle method in connection with diophantine systems of the shape $$x_1^{k_j} + \cdots + x_s^{k_j} = y_1^{k_j} + \cdots + y_s^{k_j} \quad (1 \le j \le t). \qquad (1.2)$$ Whilst the theory of systems of the kind (1.2) has recently seen significant advances in the work of Wooley [22,23] and Bourgain, Demeter and Guth [3] on Vinogradov's mean value theorem, our grasp of the cases involving lacunary degrees remains insufficient. One of the simplest such systems is that corresponding to $\mathbf{k} = (1, 3)$, a variant of which is given by (1.3). Although recent progress on this system has been achieved by Brüdern and Robert [5] and Wooley [20], a full understanding of the system (1.3) remains tantalisingly out of reach. In both papers, the authors apply the circle method in order to derive asymptotic formulae for the number of solutions of such systems, and they succeed in doing so as soon as $s \ge 10$. On the basis of standard heuristics, one would expect to be able to extend the range in which formulae of this kind are valid to at least $s \ge 9$, but unfortunately we lack a sufficiently precise understanding of the underlying Weyl sum $f_{1,3}(\boldsymbol{\alpha})$ to achieve such a bound. A similar phenomenon occurs in forthcoming work of Hughes and Wooley [11], which deals with moments of a weighted version of $f_{1,3}(\boldsymbol{\alpha})$. Trying to make some headway towards a better understanding of these exponential sums is the main motivation behind the paper at hand.
The main motif underpinning the Hardy-Littlewood method is that sums of the shape (1.1) should be small unless all components of the coefficient vector $\boldsymbol{\alpha}$ lie in the vicinity of fractions with a small denominator, in which case they can be well approximated by certain generating functions that are easier to handle and encode the adelic information inherent in the associated system (1.2). To make this notion precise, we introduce some notation.
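Suppose that the entries of $\boldsymbol{\alpha}$ have a rational approximation of the shape $$\alpha_{k_j} = a_{k_j}/q + \beta_{k_j} \quad (1 \le j \le t) \qquad (1.4)$$ with a common denominator $q$ satisfying $\gcd(q, a_{k_1}, \ldots, a_{k_t}) = 1$, and define $$S_{\mathbf{k}}(q; \mathbf{a}) = \sum_{x=1}^{q} e\Big(\frac{1}{q}\sum_{j=1}^{t} a_{k_j} x^{k_j}\Big) \quad \text{and} \quad I_{\mathbf{k}}(\boldsymbol{\beta}) = \int_0^P e\Big(\sum_{j=1}^{t} \beta_{k_j} x^{k_j}\Big)\, dx.$$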
In this notation, we anticipate that q −1 S k (q; a)I k (β) should be a good approximation to f k (α), and we denote the difference by Δ k (q, a; β) = f k (a/q + β) − q −1 S k (q; a)I k (β). (1.5) There is a considerable body of work related to Weyl sums of the type (1.1) and their approximations (1.5). When t = 1 so that k = k, it is known from [17,Theorem 4.1] that Δ k (q, a; β) q 1/2+ε (1 + P k |β|) 1/2 , (1.6) and Daemen [7,Theorem 2] and Brüdern and Daemen [4,Theorem 1] showed that this bound is sharp up to at most a factor of q ε . For general multidegrees k we have the weaker bound from [17,Theorem 7.2]. This bound has been improved for k = (1, k) by Brüdern and Robert [5,Theorem 3], who obtained the estimate Δ 1,k (q, a; β) q 1−1/k+ε 1 + P k |β k | Whilst their result holds for all k ≥ 2, an epsilon-free version is available in the quadratic case due to the last author [18,Theorem 8]. This invites the question of how optimal the bound in (1.7) is. The primary objective of this memoir is to examine the exponential sum f 1,k (α 1 , α k ) and its associated error term Δ 1,k (q, a; β) more closely. Our main result is the following.
then the error term in the above asymptotic may be replaced by O(q 1/2+ε ).
Thus, by extracting additional main terms, we are able to obtain an error term of the same quality as in (1.6), which is essentially optimal. We note that the factor log P in the error term can be eliminated by means of a more careful analysis. Observe also that the coprimality condition ( In the case when k = 3, it follows from Dirichlet's approximation theorem that every α ∈ [0, 1] has an approximation α = a/q + β with q(1 + P 3 |β|) ≤ 2P 3/2 . Thus, in the cubic case we obtain the following.
(b) Moreover, we have the bound For general degree k, an analogous chain of reasoning would replace the error term  When q a 1 we have (a 1 , q) = q, and thus it will always contain at least the term S 1,k (q; 0, a k )I 1,k (α 1 , β k ) corresponding to d = q, and in many cases this is the only one. For instance, it is not hard to see that when a 1 = 1 and q > 1 is odd, the sum in Theorem 1.1 contains precisely the terms corresponding to d = 1 and d = q, and so in this case the asymptotic formula reads This behaviour, which occurs in many generic situations, indicates that we cannot expect the secondary terms to be subject to any significant cancellation. In fact, we have the following result on the size of the error term.
(a) We have the upper bound (b) Suppose now that β k = 0 and a 1 = 1 with q = p k , where p is an odd prime. Then, whenever α 1 P ≥ δ for some suitable real number δ > 0, we have the lower bound Thus Theorem 1.2 shows that the bound (1.7) of Brüdern and Robert is sharp at least when β k is small and q is a perfect k-th power of a square-free number. Here, the term of size q 1−1/k = p k−1 arises from the exponential sum S 1,k (q; 0, a k ) via Lemma 4.4 in [17], and as noted above, unless q|a 1 this term will always appear in the sum in (1.9). Thus, large values of Δ 1,k (q, a; β) cannot be considered an exceptional occurrence when β k is small. As a consequence of Theorem 1.1, we are able to make progress on a problem concerning the fractal dimension of solutions of certain partial differential equations. The motivation for this problem goes back to optical experiments by Talbot [15] in the 1830s concerning the diffraction of light passing through a grating. Berry [2] later initiated the theoretical investigation of the problem and has in particular made predictions regarding the fractal dimension of the diffraction pattern along certain slices in space. The reader is referred to Chapter 2 of [9] and the introduction of [8] for an introduction to the general topic as well as a more thorough history of this particular problem.
In this paper, we shall focus in particular on the family of partial differential equations given by (1.10), where $t \in \mathbb{R}$ and $x \in \mathbb{R}/2\pi\mathbb{Z}$. When $k = 2$, the reader will recognize (1.10) as the linear Schrödinger equation, while the case $k = 3$ corresponds to the linear part of the Korteweg-de Vries (KdV) equation, also known as Airy's equation. For any natural number $k$, given initial data $g_k \in L^2(\mathbb{R}/2\pi\mathbb{Z})$, the evolution of $g_k$ under (1.10) is given by $$q_k(t, x) = \sum_{n \in \mathbb{Z}} \hat{g}_k(n)\, e^{itn^k + ixn}.$$ Clearly, $q_k$ is periodic in both $t$ and $x$ with period $2\pi$. We are interested in the restriction of $q_k$ to linear subsets of $(\mathbb{R}/2\pi\mathbb{Z}) \times (\mathbb{R}/2\pi\mathbb{Z})$. Given $c \in \mathbb{R}$ and $r \in \mathbb{Q} \setminus \{0\}$, as well as initial data $g_k$, let $$q_{k;r,c}(x) = \sum_{n \in \mathbb{Z}} \hat{g}_k(n)\, e^{i(c - rx)n^k + ixn}$$ denote the restriction of $q_k$ to the oblique line $t + rx = c$. Recall that the fractal (also known as upper Minkowski or upper box-counting) dimension of a bounded set $E$ is given by $$\limsup_{\varepsilon \to 0} \frac{\log N(E, \varepsilon)}{\log(1/\varepsilon)},$$ where $N(E, \varepsilon)$ is the minimum number of $\varepsilon$-balls required to cover $E$. Assuming that $g_k$ is a suitably well-behaved function, we would like to know the fractal dimension of the real and imaginary parts of the graph of $q_{k;r,c}$ for a typical $c$. Note here that it is possible for either the real or the imaginary part of the graph to vanish, so we are really interested in the size of the larger of the two. The simplest non-constant choices for $g_k$ are step functions, and in such situations, we see that in order to make progress, it is imperative to understand the distribution of large values of exponential sums. As it is convenient to work with dyadic sums in this context, we modify our definition (1.1) by writing $$f_{\mathbf{k}}(\alpha_{k_1}, \ldots, \alpha_{k_t}; Q) = \sum_{Q < n \le 2Q} e\left(\alpha_{k_1} n^{k_1} + \cdots + \alpha_{k_t} n^{k_t}\right), \qquad (1.11)$$ where $Q$ is a positive number. Let $\Theta_k$ denote the set of all $\theta \in \mathbb{R}$ such that for almost all $\gamma \in [0, 1)$ one has $$\sup_{\alpha \in [0,1)} |f_{1,k}(\alpha, \alpha + \gamma; Q)| \ll Q^{\theta}, \qquad (1.12)$$ and set $\theta_k = \inf \Theta_k$. The size of $\theta_k$ and related quantities has recently been studied by Chen and Shparlinski [6], building on work by Wooley [21].
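As a purely numerical illustration of the dyadic sum (1.11) in the case of multidegree (1, k), the following Python sketch evaluates f_{1,k}(α, α + γ; Q) in floating point and takes the maximum over a finite grid of α. The function names and the grid size are illustrative assumptions and do not come from the paper. A grid maximum only bounds the true supremum over α from below, and the floating-point phases lose accuracy once γ n^k becomes very large, so this is an experiment rather than evidence for any particular exponent.

```python
import cmath
import math


def e(x: float) -> complex:
    """The additive character e(x) = exp(2*pi*i*x)."""
    return cmath.exp(2j * math.pi * x)


def dyadic_weyl_sum(alpha1: float, alphak: float, k: int, Q: float) -> complex:
    """Sum of e(alpha1*n + alphak*n**k) over integers n with Q < n <= 2Q."""
    total = 0j
    for n in range(math.floor(Q) + 1, math.floor(2 * Q) + 1):
        total += e(alpha1 * n + alphak * n**k)
    return total


def grid_sup(gamma: float, k: int, Q: float, grid: int = 1000) -> float:
    """Maximum of |f_{1,k}(alpha, alpha + gamma; Q)| over alpha = j/grid.

    Hypothetical helper: this is a lower bound for the supremum over
    all alpha in [0, 1), not the supremum itself.
    """
    return max(
        abs(dyadic_weyl_sum(j / grid, j / grid + gamma, k, Q))
        for j in range(grid)
    )
```

For instance, `grid_sup(math.sqrt(2) - 1, 3, 64)` can be compared with `64 ** 0.75` for a rough feel of the sizes involved, keeping the caveats above in mind.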
Clearly, one sees that whence we have the trivial bounds valid for all $k \ge 2$ and for the entire range $\gamma \in [0, 1)$. Moreover, we have the trivial bound $\sup_{\alpha,\gamma} |f_{1,k}(\alpha, \alpha + \gamma; Q)| \ll Q$, and it is known (see e.g. [6, Corollary 2.2]) that for independent variables $\gamma$, $\alpha$ we have $|f_{1,k}(\alpha, \alpha + \gamma; Q)| \ll Q^{1/2+\varepsilon}$ almost everywhere. It turns out that in our case, where only one of the variables is restricted to lie in the complement of a thin set while the other one ranges freely, the bound is appreciably larger.
Theorem 1.3 We have $\theta_2 = \theta_3 = 3/4$.
Theorem 1.3 may be a bit surprising as one naively expects square root cancellation in exponential sums. As will transpire from the proof, it turns out that for almost every $\gamma$ in (1.12) the supremum is obtained for a special choice of $\alpha$ on what can be considered a major arc. One might speculate that $\theta_k = 3/4$ for all $k \ge 2$. Indeed, one might hope to adapt the proof of Theorem 1.3 above to show that for almost all $\gamma$ this gives the correct extremal value on a suitable set of major arcs and that for almost all $\gamma$ the sum is smaller on the corresponding minor arcs. This latter speculation would be consistent with the main result of the last author and Wooley [19].
With the help of Theorem 1.3, we can address our motivating problem. can be transferred to non-linear partial differential equations, in particular the nonlinear Schrödinger and KdV equations, by the same methods as Theorems 1.2 and 1.3 are derived from Theorem 1.1 in [8].
Note that Theorem 1.1 in [8] also gives a lower bound for the fractal dimensions in question. Specifically, the authors show that the graph of at least one of the real and imaginary parts of q k;r ,c has fractal dimension of at least 2 − 1/(2k), and this is sharp at least in the Schrödinger case k = 2. Moreover, they remark that, if it were true that θ k = 1/2, their argument could be adapted to show that this lower bound reflects the actual value. Our Theorem 1.3 rules out this approach at least for the cases k = 2 and k = 3. Meanwhile, if our speculation that θ k = 3/4 could be substantiated for all k, it would imply that the maximum dimension of the respective graphs of the real and imaginary parts would lie in the range It is worth noting that Lemma 2 of [12] along with [8] imply that for special combinations of initial data and oblique lines the fractal dimension in Theorem 1.4 for k = 2 is precisely 7/4 (see [8,Footnote 3] for more details). Notation Throughout the paper, we make use of the following conventions. All statements involving the letter ε are claimed to be true for all (sufficiently small) ε > 0.
Thus, the precise 'value' of ε is allowed to change from one line to the next. Moreover, P always denotes a large positive number. We use the Vinogradov and Bachmann-Landau notations liberally, and here the implied constants are allowed to depend on k and ε, but never on P, Q or α.

Preliminary lemmata
In this section, we briefly collect some technical lemmata that will be of use in our arguments later. All of these results pertain to the case k = (1, k), and in order to avoid clutter, we will in our arguments below drop the multidegree (1, k) in our notation. Throughout, Q denotes a positive number. For easier reference, we begin by stating a few results from the literature.

Lemma 2.1
Let a 1 , a k ∈ Z and q ∈ N, and suppose that (a k , q) = 1.
(a) Uniformly in a 1 , we have S(q; a 1 , a k ) Proof These are Theorem 7.1 and Lemma 4.1 in [17], respectively.
We also record an elementary average bound for exponential sums.

Lemma 2.2 For any positive integer q and any integer a k we have
Proof By the Cauchy-Schwarz inequality we see that , and expanding the square yields This completes the proof.
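The step of expanding the square can be checked numerically. The sketch below assumes, as the Cauchy-Schwarz argument suggests, that the quantity being bounded is the sum of |S(q; b, a_k)| over all residues b modulo q; the display of Lemma 2.2 is not reproduced above, so this reading is an assumption. Under it, orthogonality of additive characters gives that the sum of |S(q; b, a_k)|^2 over b equals q^2 exactly, and Cauchy-Schwarz then bounds the first moment by q^{3/2}. The function names are illustrative.

```python
import cmath
import math


def complete_sum(q: int, a1: int, ak: int, k: int) -> complex:
    """S(q; a1, ak) = sum over x = 1..q of e((a1*x + ak*x**k)/q)."""
    total = 0j
    for x in range(1, q + 1):
        # Reduce the numerator modulo q first to keep the phase accurate.
        phase = (a1 * x + ak * pow(x, k, q)) % q
        total += cmath.exp(2j * math.pi * phase / q)
    return total


def moments(q: int, ak: int, k: int) -> tuple[float, float]:
    """Return (sum_b |S(q; b, ak)|, sum_b |S(q; b, ak)|**2) over b = 1..q."""
    values = [abs(complete_sum(q, b, ak, k)) for b in range(1, q + 1)]
    return sum(values), sum(v * v for v in values)
```

Calling `moments(35, 2, 3)` returns a second moment that agrees with 35^2 up to rounding, and a first moment that does not exceed 35^{3/2}.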
Proof This is immediate upon partitioning I into subintervals on which φ is monotonic, and then applying Lemma 4.2 of [17] on each of these finitely many intervals.
We continue with bounds on oscillating integrals. For a measurable subset A we define We then have the following bounds for I (β 1 , β k ; A).

Lemma 2.4 Let k ≥ 2 and suppose that A is a finite union of pairwise disjoint intervals.
(a) Let τ > 0 be a parameter satisfying Proof These bounds are Lemmata 4.2 and 4.4 in [16], respectively, applied to the function Lemma 2.5 Assume that k ≥ 2 and β 1 = 0. Suppose further that A is a union of finitely many pairwise disjoint intervals contained inside [Q, 2Q] for some Q > 0.
Then we have the bound Proof Suppose first that the relation holds for all x ∈ A. Then we see from Lemma 2.4(a) that which is sufficient to prove the lemma in this case. We may thus concentrate on the opposite case where the inequality (2.1) is violated for some x ∈ A. It follows from the triangle inequality that any such x must satisfy the inequalities 1 2 |β 1 | ≤ k|β k |x k−1 ≤ and in particular only when β k = 0. We can thus deploy Lemma 2.4(b) and obtain where in the last step we used (2.3) again. The full statement now follows upon combining (2.2) and (2.4).

Proof of Theorem 1.1
For the proof of our first main result it is convenient to work over dyadic ranges. Recalling our notation (1.11), we make the analogous definition Thus, if we can show that for any Q ≥ 1/2, the conclusion of Theorem 1.1 will follow upon dyadic summation, as The initial stages of our argument follow along the lines of the proof of [5, Theorem 3], which in turn is an adaptation of the argument found in [17, pp. 43-44].
Proof By sorting the terms of f (α 1 , α k ; Q) into congruence classes modulo q and encoding the congruence condition in an exponential sum, we find that We treat the sum f ( and set Then we have |φ (x)| ≤ H 1 for all x ≤ 2Q and thus Lemma 2.3 yields Using this within (3.2) and applying Lemma 2.2 in the error term yields The proof is complete upon making the change of variables b + q j = h, noting that under the summation conditions this is in fact a bijection into the set of integers h We now distinguish two cases according to which term in H is larger. Suppose first that k|β k |Q k−1 > 1, so that In such a situation, we discern from Lemma 2.4(b) and Lemma 2.2 that |h|≤H S(q; a 1 + h, a k )I ( where in the last step we used (3.3). Thus, in this situation, we find that We expect the sum on the right hand side of (3. Thus, it suffices to bound E(q, a; β). By Lemma 2.5 we see that Now, the condition de = a 1 together with our bound |β 1 | ≤ (2q) −1 implies that Using this within (3.8) and applying Lemma 2.1(b) yields the bound where in the last step we used (3.5) together with standard bounds for the divisor function. The proof of (3.1) under the assumption (3.5) is now complete upon inserting (3.10) into (3.7), and the unconditional statement follows upon combining this with the bound (3.4). The second statement of Theorem 1.1 is proved in a similar manner, and we only briefly detail the changes that need to be effected. Here, we do not need to consider a dyadic dissection of the interval, so all of our arguments will involve the exponential sum f (α 1 , α k ) and integral I (β 1 , β k ) instead of their dyadic analogues f (α 1 , α k ; Q) and I (β 1 , β k ; Q), and will have P instead of Q. We now observe that in the proof of Lemma 3.1, the condition (1.8) implies that We may thus take H 1 = 2. The argument then proceeds as above, with the difference that we may skip the discussion of the case (3.3) and can continue directly with the hypothesis (3.5). 
From this point we arrive, mutatis mutandis, at (3.7). In order to bound the error term E(q, a; β) we now note that (3.9) and (1.8) combine to show that, for 1 ≤ x ≤ P, one has  = (a 1 , q), the fractions a 1 /q and d [[a 1 /d]] /q are distinct, and thus the latter one corresponds to a non-optimal rational approximation to α 1 . In particular, we have |α 1 Thus, we can apply Lemma 2.5 within the sum (1.9) much as above, and the statement of Theorem 1.2(a) follows immediately.
For the second statement of the theorem, we begin by observing that the hypotheses imply that the sum in Theorem 1.1 has exactly two main terms, corresponding to the values d = 1 and d = q, respectively. We will focus on the latter. Note that when β k = 0, we can explicitly compute When α 1 P > δ, we have 2δ ≤ | sin(π α 1 P)| ≤ 1, and thus under our assumption that 1/(2q) ≤ α 1 ≤ 3/(2q). Thus, the main term corresponding to d = q is given by Finally, we note that when q = p k , Lemma 4.4 in [17] shows that |S(q; 0, a k )| = q 1−1/k , which implies the result.
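When β_k = 0, the integral attached to the main term with d = q reduces to the integral of e(α_1 x) over [0, P]. This has the elementary closed form (e(α_1 P) − 1)/(2πi α_1), whose absolute value is |sin(π α_1 P)|/(π |α_1|). The following sketch compares that closed form with a direct numerical approximation; the function names are illustrative and the midpoint rule is merely one convenient quadrature, not a method taken from the paper.

```python
import cmath
import math


def linear_integral(alpha1: float, P: float) -> complex:
    """Closed form of the integral of e(alpha1*x) over [0, P], alpha1 != 0."""
    return (cmath.exp(2j * math.pi * alpha1 * P) - 1) / (2j * math.pi * alpha1)


def linear_integral_midpoint(alpha1: float, P: float, steps: int = 20000) -> complex:
    """Midpoint-rule approximation of the same integral (sanity check only)."""
    h = P / steps
    return h * sum(
        cmath.exp(2j * math.pi * alpha1 * (j + 0.5) * h) for j in range(steps)
    )


def sine_formula(alpha1: float, P: float) -> float:
    """|sin(pi*alpha1*P)| / (pi*|alpha1|), the modulus of the closed form."""
    return abs(math.sin(math.pi * alpha1 * P)) / (math.pi * abs(alpha1))
```

In particular, the modulus is bounded below by a constant multiple of 1/|α_1| as soon as |sin(π α_1 P)| is bounded away from zero, which is how the lower bound for this main term arises.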

Additional lemmata for the proof of Theorem 1.3
For the proof of our second main result, we need some more detailed information about cubic complete exponential sums.

Lemma 4.1 Let q ∈ N and set
Furthermore, suppose that (q, a) = 1. Then Proof Suppose that q = rs with (r , s) = 1, and write a = a 2 r +a 1 s and b = b 2 r +b 1 s. Then it follows by standard arguments that We are therefore free to restrict our focus to prime power moduli. By Lemma 2.1(a) when (q 3 , a) = 1 we have Thus it suffices to show that whenever (a, p) = 1 and t = 1 or 2 we have When t = 1 this follows from Corollary II.2F of Schmidt [13]. Suppose t = 2. Then when p = 2 or 3 this bound is trivial, so we can suppose that p > 3. Thus The congruence b + 3au 2 ≡ 0 (mod p) has at most two solutions, and it follows that |S 1,3 ( p 2 ; b, a)| ≤ 2 p. The claim of the lemma follows upon collecting our results.

Lemma 4.2
Let $q \in \mathbb{N}$ be odd and $c \in \mathbb{Z}$ with $(q, c) = 1$. Then there exists $a \in \mathbb{Z}$ such that $(q, a) = 1$ and for every $\varepsilon > 0$ we have
Proof Before embarking on the argument, we note that the lemma is trivial for $q = 1$, and hence we may assume $q > 1$ and consequently $c \ne 0$. Again we use the multiplicative property of the Gauss sum as described at the start of the proof of the previous lemma. Thus it suffices to establish the lemma for prime powers. Let $p$ be an odd prime, $t \in \mathbb{N}$ and $c \in \mathbb{Z}$ be such that $p \nmid c$. The lemma follows if we are able to show that there is an absolute constant $\xi > 0$ having the property that When $p > 3$, on the other hand, we use that $$12\left(3m^2 + 3hm + h^2 + 1\right) = (6m + 3h)^2 + 3h^2 + 12,$$ whence we obtain where $\left(\frac{a}{p}\right)_L$ denotes the Legendre symbol. Thus, upon extending the sum on the right hand side to include the term $h = p$ also, while noting that $-3p^2 - 12 \equiv 2^2(-3) \pmod{p}$, we obtain whence we find that When $p > 7$, this is already sufficient for (4.1), so it remains to analyse the cases when $p = 5$ and $p = 7$. Let now $p = 5$. Since which has absolute value smaller than 6 for all $c \in \{1, \ldots, 6\}$. It follows that (4.1) holds in this case also. Thus whenever $t = 1$ there is at least one $a$ satisfying the necessary requirements. Now consider the case $t \ge 2$. Suppose first that $p > 3$. Write $t = 3v + u$ with $v \ge 0$ and $1 \le u \le 3$. When $u \ne 1$ choose $a = c$. Then by iteratively applying Lemma 4.4 of [17] we find that since in the notation of that lemma and (2.25) ibidem we have $l = t > 1 = \gamma$.
When u = 1, we may now assume that v ≥ 1. Choose a so that a ≡ c + p 2v (a − c) (mod p t ) where a is at our disposal. Put m = p 3v x + y. Then our sum is The sum over x is 0 unless p|y in which case it sums to p, and thus the above is Iterating this argument gives 3 ( p t ; a − c, a) = p 2v S 1,3 ( p; a − c, a). Now we choose a in accordance with the case t = 1 above. Thus When p = 3 we can apply a slightly modified argument. Now in the notation (2.25) of [17] we have γ = 2. When t = 3v + u with u = 2 or 3 we again take a = c and obtain, by Lemma 4.4 ibidem, S 1,3 3 t ; a − c, a = 3 2v S 1,3 3 u ; 0, a and this is 3 2v+2 when u = 3. When u = 2, we have instead S 1,3 3 u ; 0, a = 3 + 6 cos (2πa/9) .
Since a is not divisible by 3, the cosine cannot be − 1 2 . Thus When u = 1 we follow the recipe for general p and obtain and again appeal to the case t = 1.
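The two explicit evaluations used for p = 3 are easy to confirm by direct computation. The sketch below sums e(a m^3/q) over a full residue system and checks that the result equals 3 + 6 cos(2πa/9) for q = 9 and equals 9 for q = 27 whenever 3 does not divide a. The helper name `cubic_sum` is an illustrative choice; it computes S_{1,3}(q; 0, a) straight from the definition and does not reproduce the argument via Lemma 4.4 of [17].

```python
import cmath
import math


def cubic_sum(q: int, a: int) -> complex:
    """S_{1,3}(q; 0, a) = sum over m = 1..q of e(a*m**3/q)."""
    total = 0j
    for m in range(1, q + 1):
        # Reduce a*m^3 modulo q before dividing to keep the phase accurate.
        phase = (a * pow(m, 3, q)) % q
        total += cmath.exp(2j * math.pi * phase / q)
    return total


def predicted_mod_nine(a: int) -> float:
    """The closed form 3 + 6*cos(2*pi*a/9) quoted for the modulus 9."""
    return 3 + 6 * math.cos(2 * math.pi * a / 9)
```

The closed form arises because the cubes modulo 9 take the value 0 three times, 1 three times and −1 three times; it vanishes only when cos(2πa/9) = −1/2, which forces 3 | a.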

Prolegomena to the proof of Theorem 1.3
Before proceeding to the various parts of the proof of Theorem 1.3, it is useful to review some measure theoretic aspects of approximation of real numbers by rational numbers. In view of the periodicity of our functions, we concentrate on the interval [0, 1]. As is well known, Dirichlet's approximation theorem states that every real number γ has the property that there are arbitrarily large q ∈ N and c ∈ Z such that (q, c) = 1 and |γ − c/q| ≤ q −2 . Moreover, it follows from Khinchine's theorem (see e.g. Theorem III.3A in [14]) that for almost every such γ there is a positive number C(γ ) such that whenever (q, c) = 1 we have In particular there is a subset Γ of (R \ Q) ∩ [0, 1] having these properties and with measΓ = 1.
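The approximations c/q with (q, c) = 1 and |γ − c/q| ≤ q^{-2} can be produced explicitly by the continued fraction algorithm. The following sketch computes the convergents of a rational number in exact arithmetic; an irrational γ would have to be replaced by a rational stand-in of sufficient precision, which is an assumption of this illustration. The function name and the `max_terms` parameter are illustrative.

```python
from fractions import Fraction
from math import floor, gcd


def convergents(gamma: Fraction, max_terms: int = 20) -> list[tuple[int, int]]:
    """Return the first continued-fraction convergents (c, q) of gamma."""
    result = []
    c_prev, c_curr = 0, 1  # numerators h_{-2}, h_{-1}
    q_prev, q_curr = 1, 0  # denominators k_{-2}, k_{-1}
    x = Fraction(gamma)
    for _ in range(max_terms):
        a = floor(x)  # next partial quotient
        c_prev, c_curr = c_curr, a * c_curr + c_prev
        q_prev, q_curr = q_curr, a * q_curr + q_prev
        result.append((c_curr, q_curr))
        remainder = x - a
        if remainder == 0:  # gamma is rational and the expansion has ended
            break
        x = 1 / remainder
    return result
```

Each convergent is in lowest terms and satisfies |γ − c/q| ≤ 1/q^2, and successive convergents c/q and c'/q' satisfy cq' − c'q = ±1, so that at least one of two consecutive denominators is odd.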
For the upper bound when k = 3 we need to refine this further. Let Γ 0 denote the subset of Γ with the property that for every δ > 0 and γ ∈ Γ 0 there are, in the notation of Lemma 4.1, at most a finite number of q and c with where Υ (δ) denotes the set of γ ∈ [0, 1] having the property that (5.2) holds for infinitely many pairs (q, c). Let further N ∈ N. Then We have Therefore for any Y > 0 we have and so Thus measΥ (δ) N −δ/2 and this holds for every N ∈ N.

Theorem 1.3: the upper bound when k = 2
Let α ∈ R and γ ∈ Γ . By Dirichlet's theorem on diophantine approximation we can choose a 2 , q with (a 2 , q) = 1, q ≤ Q and Then choose a 1 so that Set β 1 = α − a 1 /q and β 2 = α + γ − a 2 /q. Hence, by Theorem 8 of [18], we have Since (a 2 , q) = 1 we have |S 1,2 (q; a 1 , a 2 )| √ q by classical bounds on the Gauss sum. Thus Therefore, by Theorem 7.1 in [17] we have the bound We may assume that for otherwise in the denominator in (6.1) we have trivially. Since γ ∈ Γ , we infer from (5.1) via (6.2) that provided that Q ≥ 2/C(ε, γ ), which we may certainly assume. Thus, we find in this case as well, and (6.1) becomes We conclude that if θ > 3 4 , then as required.

Theorem 1.3: the upper bound when k = 3
This follows the pattern set in §6. Again we use (5.1), but now we suppose that γ ∈ Γ 0 , and hence given γ we may suppose that for any fixed δ > 0 the inequality (5.2) holds for at most a finite number of q and c. It will be convenient to also suppose that δ is sufficiently small. For such a γ we show that for arbitrarily large Q we have uniformly for α ∈ R. Let α ∈ R. By Dirichlet's theorem on diophantine approximation we can choose a 3 , q with Then choose a 1 so that and let β 3 = α + γ − a 3 /q and β 1 = α − a 1 /q. By Corollary 1.1(a) we have It follows that the contribution arising from those d|q for which κ(q) 3 is satisfactory in view of (7.1).
Thus it remains to deal with any possible terms for which (7.4) fails to hold. Suppose that d = (a 1 , q) for some d violating (7.4), so that In that case we would have which is impossible for Q large. Hence the only term in (7.3) that could possibly violate (7.4) is the one corresponding to d = (q, a 1 ), for which For this term the negation of (7.4) reads κ(q) 3 and we observe that in this instance We assume there is such a term and show that it contradicts the assumption γ ∈ Γ 0 . Since Γ 0 ⊆ Γ , we see from (5.1) and (7.2) that Since by (7.6) we may suppose that Q is large enough so that it follows from our assumption (7.5) that We also have and therefore Thus, as Q can be taken large enough so that (log 2q) 2 < 1 2 C(γ )Q 2δ , we have q > Q 1/4 . By (7.6) we have Hence, by (7.7) we find that However, by the definition of Γ 0 in Sect. 5 this is expressly excluded for large q, and so this establishes as promised that (7.5) is impossible. This completes the proof of (7.1), which gives the conclusion θ 3 ≤ 3 4 .

Theorem 1.3: the lower bound
Let δ > 0 be sufficiently small, and let k = 2 or 3. We show that for all γ ∈ R \ Q there are arbitrarily large Q such that The continued fraction algorithm for γ gives q and c with q arbitrarily large, (q, c) = 1, and Note that any two successive convergents c/q and c /q of the continued fraction satisfy cq − c q = ±1 and so (q, q ) = 1. Thus either q or q is odd. For an arbitrary odd convergent q and a fixed small parameter δ > 0 set Let a k be any integer with (q, a k ) = 1, and if k = 3 assume additionally that a 3 is such that S 1,3 (q; a 3 − c, a 3 ) q 1/2−δ . The existence of such an a 3 is guaranteed by Lemma 4.2. Take further α = −γ + a k /q and a 1 = a k − c, and define α k = α + γ , α 1 = α and β j = α j − a j /q for j = 1, k. Thus and so in particular We have When k = 2, this is Theorem 8 in [18], and for k = 3 it is a consequence of the second statement of Theorem guess that in this circumstance f k (α) behaves like a sum of independent random unimodular variables and so the central limit theorem suggests that In fact, the received wisdom states that for (9.1) to be true, it should suffice that α is such that for any a 1 , . . . , a k , q satisfying (1.4) with (q, a 1 , . . . , a k ) = 1 we have q + q P|α 1 − a 1 /q| + · · · + q P k |α k − a k /q| > P. (9.2) In the one-dimensional case, k = k, the last author and Wooley [19] show that there is a connection between the size of f k (α) and the number of solutions of with 0 < x j , y j ≤ P, and that the values of | f k (α)| will even have a normal distribution on the minor arcs. For k = 2 we have more precise bounds (see [18,Theorem 7)]) since we can cover all of [0, 1] 2 by choosing q, a 2 with (q, a 2 ) = 1, |α 2 −a 2 /q| ≤ 1/(5q P), q ≤ 5P and then taking a 1 with |α 1 − a 1 /q| ≤ 1/(2q). Thus the main interest lies in the case k ≥ 3. In order to make a proper comparison with the current state of play we first review what is known. Take k = (1, . . . , k). 
The Vinogradov mean value theorem (Bourgain, Demeter, Guth [3] and Wooley [22,23]) combined with Theorem 5.2 of Vaughan [17] give when for some j ≥ 2 there are coprime q j and a j such that |α j − a j /q j | ≤ q −2 j . At least when q j + P j |α j − a j | > P 2 for every j and choice of q j , a j , this is the best that we know when k ≥ 6, and is perhaps also the best that we know when 3 ≤ k ≤ 5 and we need to approximate some (presumably non-zero) α j with 2 ≤ j ≤ k − 1. When we have q, a k with (q, a k ) = 1 and |α k − a k /q| ≤ q −2 , Weyl's inequality [17,Lemma 2.4] gives and this is superior to (9.3) when 3 ≤ k ≤ 5.
There is an underlying problem when dealing with a sum as general as (1.1). It seems that one ought to consider a general rational approximation to $\boldsymbol{\alpha}$ of the kind $|q\alpha_j - a_j| \le Q_j^{-1}$ with $(q, a_1, \ldots, a_k) = 1$, but then to cover the whole of $[0, 1]^k$ one needs that $q$ can be as large as $cQ_1 \cdots Q_k$. This means that either $q$ might be much larger than $P^k$ or the intervals about each rational are too large to be able to deduce anything useful. The alternative, as in the statement of (9.3) above, is to deal with one $j$ at a time. If $q_j > P^{\theta_j}$ for some $\theta_j \le 1$ then we are done, and that leaves the situation when $q_j \le P^{\theta_j}$ for every $j$. Now one can take $q = \operatorname{lcm}(q_1, \ldots, q_k)$, and the numerators of the rational approximation to $\boldsymbol{\alpha}$ become $(a_1 q/q_1, \ldots, a_k q/q_k)$. However, the likelihood is that any useful bound will still need $q$ fairly constrained in terms of $P$, and so the $\theta_j$ will have to be rather small. An example of this process is given in the proof of [17, Theorem 7.4]. Some aspects of methods to overcome this are described in Chapter 5 of Baker [1].
Whilst our result of Theorem 1.3 does not strictly contradict such heuristics as detailed in the opening paragraph of this section, it does raise the question of whether these heuristics might not be too naive in some cases. When $\{p_1, \ldots, p_t\}$ is a set of polynomials with a non-vanishing Wronskian, consider the associated exponential sum $$f_{\mathbf{p}}(\boldsymbol{\alpha}) = \sum_{1 \le n \le P} e\left(\alpha_1 p_1(n) + \cdots + \alpha_t p_t(n)\right). \qquad (9.5)$$ We know from standard bounds that $\sup_{\boldsymbol{\alpha}} |f_{\mathbf{p}}(\boldsymbol{\alpha})| = f_{\mathbf{p}}(\mathbf{0}) = P$, whilst on the other hand it follows from [6, Corollary 2.2] that whenever the polynomials $p_j$ have a non-vanishing Wronskian, the bound (9.1) holds for a set of $\boldsymbol{\alpha} \in [0, 1]^t$ of full measure. Meanwhile, the analogous inequality to (9.2) is not sufficient for the bound (9.1) to hold. This is evidenced in our Theorem 1.3 with the choices $p_1(n) = n^k$ and $p_2(n) = n^k + n$, where $k = 2$ or $k = 3$. Here, it transpires from our arguments that even if $\alpha_1$ lacks a good rational approximation, for certain choices of $\alpha_2$ the contributions of the degree $k$ part of the polynomial more or less cancel out, leading to a less random behaviour. It is not clear to the authors whether this behaviour is particular to the presence and influence of the linear term in $p_2$ or whether there is some underlying phenomenon at work. In the latter scenario, one might now be inclined to guess that if only $r$ of the $t$ coefficients are restricted to lie in a set of full measure whilst the other $t - r$ coefficients are allowed to range over the entire unit interval, then the ensuing bound would interpolate between the two extremes. Thus, one might speculate that when the polynomials $p_j$ are all of the same degree one has the bound $\sup_{\alpha_{r+1}, \ldots, \alpha_t} |f_{\mathbf{p}}(\boldsymbol{\alpha})| \ll P^{1 - r/(2t) + \varepsilon}$ for almost all $\alpha_1, \ldots, \alpha_r$, and that this might in some cases even be sharp (up to epsilon) for a sequence of values $P$ tending to infinity. The exponent here is $1 - r/(2t) = (r/2 + (t - r))/t$ and interpolates between $r$ contributions of $1/2$ and $t - r$ contributions of $1$.
Whilst this is compatible with the bounds of our Theorem 1.3, there is still hope that stronger bounds may be available if the polynomials in question differ by more than a linear term.
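The arithmetic behind the speculative exponent can be spelled out in a few lines. In the sketch below, r counts the coefficients restricted to a set of full measure and t − r counts those ranging over the whole unit interval, following the prose description; the function names are illustrative. Exact rational arithmetic confirms that 1 − r/(2t) equals the average (r/2 + (t − r))/t, that the extremes r = t and r = 0 give 1/2 and 1, and that the configuration of Theorem 1.3 (t = 2, r = 1) gives 3/4.

```python
from fractions import Fraction


def speculative_exponent(r: int, t: int) -> Fraction:
    """The exponent 1 - r/(2t), with r restricted coefficients out of t."""
    return 1 - Fraction(r, 2 * t)


def averaged_exponent(r: int, t: int) -> Fraction:
    """Average of r contributions of 1/2 and t - r contributions of 1."""
    return (Fraction(r, 2) + (t - r)) / t
```

This only illustrates the interpolation; it says nothing about whether the speculated bound actually holds.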
Our understanding is better in the one-dimensional case. On page 43 of Vaughan [17] it was stated (we have changed the notation to be consistent with this memoir) that when $(q, a) = 1$ and $\beta = \alpha - a/q$ it would be very interesting to decide whether the relation $$f_k(\alpha) = q^{-1} S_k(q, a) I_k(\beta) + O\left((q + qP^k|\beta|)^{\theta}\right)$$ holds with an exponent $\theta$ smaller than $1/2$, and it was even speculated that $\theta$ might be as small as $1/k$. This was shown to be false by Daemen [7] and Brüdern and Daemen [4]. One other result has come to our attention. Heath-Brown [10] has shown on the assumption of the abc conjecture that if $\alpha$ is a quadratic irrational then $$\sum_{n \le P} e(\alpha n^3) \ll P^{5/7 + \varepsilon}.$$
It may be worthwhile to note that quadratic irrationals are badly approximable numbers so that Heath-Brown's result can be viewed as applying to an 'extreme minor arc' situation. Finally, we briefly outline an argument, versions of which in other contexts are quite well known, that shows that one cannot expect to bound the exponential sum f k (α) by anything smaller than P 1/2 on the minor arcs. Let P be large and choose R = P 1+φ and Q = P k−1−ψ , where φ and ψ are positive numbers at our disposal and φ < ψ so that R Q < P k−δ for some substantial δ > 0. There are various wrinkles that could be introduced to enable a quite large δ.