Variants of the Selberg sieve, and bounded intervals containing many primes

For any $m \geq 1$, let $H_m$ denote the quantity $\liminf_{n \to \infty} (p_{n+m}-p_n)$. A celebrated recent result of Zhang showed the finiteness of $H_1$, with the explicit bound $H_1 \leq 70000000$. This was then improved by us (the Polymath8 project) to $H_1 \leq 4680$, and then by Maynard to $H_1 \leq 600$, who also established for the first time a finiteness result for $H_m$ for $m \geq 2$, and specifically that $H_m \ll m^3 e^{4m}$. If one also assumes the Elliott-Halberstam conjecture, Maynard obtained the bound $H_1 \leq 12$, improving upon the previous bound $H_1 \leq 16$ of Goldston, Pintz, and Y{\i}ld{\i}r{\i}m, as well as the bound $H_m \ll m^3 e^{2m}$. In this paper, we extend the methods of Maynard by generalizing the Selberg sieve further, and by performing more extensive numerical calculations. As a consequence, we can obtain the bound $H_1 \leq 246$ unconditionally, and $H_1 \leq 6$ under the assumption of the generalized Elliott-Halberstam conjecture. Indeed, under the latter conjecture we show the stronger statement that for any admissible triple $(h_1,h_2,h_3)$, there are infinitely many $n$ for which at least two of $n+h_1,n+h_2,n+h_3$ are prime. We modify the "parity problem" argument of Selberg to show that this result is the best possible that one can obtain from purely sieve-theoretic considerations. For larger $m$, we use the distributional results obtained previously by our project to obtain the unconditional asymptotic bound $H_m \ll m e^{(4-\frac{24}{181})m}$, or $H_m \ll m e^{2m}$ under the assumption of the Elliott-Halberstam conjecture. We also obtain explicit upper bounds for $H_m$ when $m=2,3,4,5$.


Introduction
For any natural number $m$, let $H_m$ denote the quantity $H_m := \liminf_{n \to \infty} (p_{n+m} - p_n)$, where $p_n$ denotes the $n$th prime. The twin prime conjecture asserts that $H_1 = 2$; more generally, the Hardy-Littlewood prime tuples conjecture [30] implies that $H_m = H(m+1)$ for all $m \geq 1$, where $H(k)$ is the diameter of the narrowest admissible $k$-tuple (see Section 3 for a definition of this term). Asymptotically, one has the bounds $(\frac{1}{2} + o(1)) k \log k \leq H(k) \leq (1 + o(1)) k \log k$ as $k \to \infty$ (see Theorem 3.3 below); thus the prime tuples conjecture implies that $H_m$ is comparable to $m \log m$ as $m \to \infty$. Until very recently, it was not known if any of the $H_m$ were finite, even in the easiest case $m = 1$. In the breakthrough work of Goldston, Pintz, and Yıldırım [25], several results in this direction were established, including the following conditional result assuming the Elliott-Halberstam conjecture $\mathrm{EH}[\vartheta]$ (see Claim  Furthermore, it was shown in [25] that any result of the form $\mathrm{EH}[\frac{1}{2} + 2\varpi]$ for some fixed $0 < \varpi < 1/4$ would imply an explicit finite upper bound on $H_1$ (with this bound equal to $16$ for $\varpi > 0.229855$). Unfortunately, the only results of the type $\mathrm{EH}[\vartheta]$ that are known come from the Bombieri-Vinogradov theorem (Theorem 2.3), which only establishes $\mathrm{EH}[\vartheta]$ for $0 < \vartheta < 1/2$.
Zhang's argument followed the general strategy from [25] on finding small gaps between primes, with the major new ingredient being a proof of a weaker version of $\mathrm{EH}[\frac{1}{2} + 2\varpi]$, which we call $\mathrm{MPZ}[\varpi, \delta]$; see Claim 2.4 below. It was quickly realized that Zhang's numerical bound on $H_1$ could be improved. By optimizing many of the components in Zhang's argument, we were able [52,53] to improve Zhang's bound to $H_1 \leq 4680$.
Very shortly afterwards, a further breakthrough was obtained by Maynard [38] (with related work obtained independently in unpublished work of Tao), who developed a more flexible "multidimensional" version of the Selberg sieve to obtain stronger bounds on $H_m$. This argument worked without using any equidistribution results on primes beyond the Bombieri-Vinogradov theorem, and amongst other things was able to establish finiteness of $H_m$ for all $m$, not just for $m = 1$. More precisely, Maynard established the following results. For a survey of these recent developments, see [29]. In this paper, we refine Maynard's methods to obtain the following further improvements. In Section 3 we will describe the key propositions that will be combined together to prove the various components of Theorem 1.4. As with Theorem 1.1, the results in (vii)-(xiii) do not require $\mathrm{EH}[\vartheta]$ or $\mathrm{GEH}[\vartheta]$ for all $0 < \vartheta < 1$, but only for a single explicitly computable $\vartheta$ that is sufficiently close to $1$.
Of these results, the bound in (xii) is perhaps the most interesting, as the parity problem [57] prohibits one from achieving any better bound on $H_1$ than $6$ from purely sieve-theoretic methods; we review this obstruction in Section 8. If one only assumes the Elliott-Halberstam conjecture $\mathrm{EH}[\vartheta]$ instead of its generalization $\mathrm{GEH}[\vartheta]$, we were unable to improve upon Maynard's bound $H_1 \leq 12$; however the parity obstruction does not exclude the possibility that one could achieve (xii) just assuming $\mathrm{EH}[\vartheta]$ rather than $\mathrm{GEH}[\vartheta]$, by some further refinement of the sieve-theoretic arguments (e.g. by finding a way to establish Theorem 3.6(ii) below using only $\mathrm{EH}[\vartheta]$ instead of $\mathrm{GEH}[\vartheta]$).
The bounds (ii)-(vi) rely on the equidistribution results on primes established in our previous paper [52]. However, the bound (i) uses only the Bombieri-Vinogradov theorem, and the remaining bounds (vii)-(xiii) of course use either the Elliott-Halberstam conjecture or a generalization thereof.
A variant of the proof of Theorem 1.4(xii), which we give in Section 9, also gives the following conditional "near miss" to (a disjunction of) the twin prime conjecture and the even Goldbach conjecture:
Theorem 1.5 (Disjunction) Assume the generalized Elliott-Halberstam conjecture $\mathrm{GEH}[\vartheta]$ for all $0 < \vartheta < 1$. Then at least one of the following statements is true:
(a) (Twin prime conjecture) $H_1 = 2$.
(b) (near-miss to even Goldbach conjecture) If $n$ is a sufficiently large multiple of six, then at least one of $n$ and $n-2$ is expressible as the sum of two primes. Similarly with $n-2$ replaced by $n+2$.
(In particular, every sufficiently large even number lies within $2$ of the sum of two primes.) We remark that a disjunction in a similar spirit was obtained in [45], which established (prior to the appearance of Theorem 1.2) that either $H_1$ was finite, or that every interval $[x, x + x^\varepsilon]$ contained the sum of two primes if $x$ was sufficiently large depending on $\varepsilon > 0$.
There are two main technical innovations in this paper. The first is a further generalization of the multidimensional Selberg sieve introduced by Maynard and Tao, in which the support of a certain cutoff function $F$ is permitted to extend into a larger domain than was previously permitted (particularly under the assumption of the generalized Elliott-Halberstam conjecture). As in [38], this largely reduces the task of bounding $H_m$ to that of efficiently solving a certain multidimensional variational problem involving the cutoff function $F$. Our second main technical innovation is to obtain efficient numerical methods for solving this variational problem for small values of the dimension $k$, as well as sharpened asymptotics in the case of large values of $k$.

Organization of the paper
The paper is organized as follows. After some notational preliminaries, we recall in Section 2 the known (or conjectured) distributional estimates on primes in arithmetic progressions that we will need to prove Theorem 1.4. Then, in Section 3, we give the key propositions that will be combined together to establish this theorem. One of these propositions, Lemma 3.4, is an easy application of the pigeonhole principle. Two further propositions, Theorem 3.5 and Theorem 3.6, use the prime distribution results from Section 2 to give asymptotics for certain sums involving sieve weights and the von Mangoldt function; they are established in Section 4. Theorems 3.8, 3.10, 3.12, 3.14 use the asymptotics established in Theorems 3.5, 3.6, in combination with Lemma 3.4, to give various criteria for bounding $H_m$, which all involve finding sufficiently strong candidates for a variety of multidimensional variational problems; these theorems are proven in Section 5. These variational problems are analysed in the asymptotic regime of large $k$ in Section 6, and for small and medium $k$ in Section 7, with the results collected in Theorems 3.9, 3.11, 3.13, 3.15. Combining these results with the previous propositions gives Theorem 3.2, which, when combined with the bounds on narrow admissible tuples in Theorem 3.3 that are established in Section 10, will give Theorem 1.4. (See also Table 1 for some more details of the logical dependencies between the key propositions.) Finally, in Section 8 we modify an argument of Selberg to show that the bound $H_1 \leq 6$ may not be improved using purely sieve-theoretic methods, and in Section 9 we establish Theorem 1.5 and make some miscellaneous remarks.

Notation
The notation used here closely follows the notation in our previous paper [52].
We use $|E|$ to denote the cardinality of a finite set $E$, and $1_E$ to denote the indicator function of a set $E$, thus $1_E(n) = 1$ when $n \in E$ and $1_E(n) = 0$ otherwise. In a similar spirit, if $E$ is a statement, we write $1_E = 1$ when $E$ is true and $1_E = 0$ otherwise.
All sums and products will be over the natural numbers $\mathbb{N} := \{1, 2, 3, \dots\}$ unless otherwise specified, with the exceptions of sums and products over the variable $p$, which will be understood to be over primes.
The following important asymptotic notation will be in use throughout the paper:
Definition 1.6 (Asymptotic notation) We use $x$ to denote a large real parameter, which one should think of as going off to infinity; in particular, we will implicitly assume that it is larger than any specified fixed constant. Some mathematical objects will be independent of $x$ and referred to as fixed; but unless otherwise specified we allow all mathematical objects under consideration to depend on $x$ (or to vary within a range that depends on $x$, e.g. the summation parameter $n$ in the sum $\sum_{x \leq n \leq 2x} f(n)$). If $X$ and $Y$ are two quantities depending on $x$, we say that $X = O(Y)$ or $X \ll Y$ if one has $|X| \leq CY$ for some fixed $C$ (which we refer to as the implied constant), and $X = o(Y)$ if one has $|X| \leq c(x) Y$ for some function $c(x)$ of $x$ (and of any fixed parameters present) that goes to zero as $x \to \infty$ (for each choice of fixed parameters). We use $X \lessapprox Y$ to denote the estimate $X \leq x^{o(1)} Y$, $X \sim Y$ to denote the estimate $Y \ll X \ll Y$, and $X \approx Y$ to denote the estimate $Y \lessapprox X \lessapprox Y$. Finally, we say that a quantity $n$ is of polynomial size if one has $n = O(x^{O(1)})$.
If asymptotic notation such as $O()$ or $\lessapprox$ appears on the left-hand side of a statement, this means that the assertion holds true for any specific interpretation of that notation. For instance, the assertion $\sum_{n = O(N)} |\alpha(n)| \lessapprox N$ means that for each fixed constant $C > 0$, one has $\sum_{|n| \leq CN} |\alpha(n)| \lessapprox N$.
If $q$ and $a$ are integers, we write $a|q$ if $a$ divides $q$. If $q$ is a natural number and $a \in \mathbb{Z}$, we use $a\ (q)$ to denote the residue class $a\ (q) := \{a + nq : n \in \mathbb{Z}\}$ and let $\mathbb{Z}/q\mathbb{Z}$ denote the ring of all such residue classes $a\ (q)$. The notation $b = a\ (q)$ is synonymous to $b \in a\ (q)$. We use $(a, q)$ to denote the greatest common divisor of $a$ and $q$, and $[a, q]$ to denote the least common multiple. [1] We also let $(\mathbb{Z}/q\mathbb{Z})^\times := \{a\ (q) : (a, q) = 1\}$ denote the primitive residue classes of $\mathbb{Z}/q\mathbb{Z}$.
We use the following standard arithmetic functions:
(i) $\varphi(q) := |(\mathbb{Z}/q\mathbb{Z})^\times|$ denotes the Euler totient function of $q$.
(ii) $\tau(q) := \sum_{d|q} 1$ denotes the divisor function of $q$.
(iii) $\Lambda(q)$ denotes the von Mangoldt function of $q$, thus $\Lambda(q) = \log p$ if $q$ is a power of a prime $p$, and $\Lambda(q) = 0$ otherwise.
(iv) $\theta(q)$ is defined to equal $\log q$ when $q$ is a prime, and $\theta(q) = 0$ otherwise.
(v) $\mu(q)$ denotes the Möbius function of $q$, thus $\mu(q) = (-1)^k$ if $q$ is the product of $k$ distinct primes for some $k \geq 0$, and $\mu(q) = 0$ otherwise.
(vi) $\Omega(q)$ denotes the number of prime factors of $q$ (counting multiplicity).
We recall the elementary divisor bound $\tau(n) \lessapprox 1$ (1) whenever $n \ll x^{O(1)}$, as well as the related estimate for any fixed $C > 0$; see e.g. [52, Lemma 1.5].

Distribution estimates on arithmetic functions
As mentioned in the introduction, a key ingredient in the Goldston-Pintz-Yıldırım approach to small gaps between primes comes from distributional estimates on the primes, or more precisely on the von Mangoldt function $\Lambda$, which serves as a proxy for the primes. In this work, we will also need to consider distributional estimates on more general arithmetic functions, although we will not prove any new such estimates in this paper, relying instead on estimates that are already in the literature. More precisely, we will need averaged information on the following quantity:
Definition 2.1 (Discrepancy) For any function $\alpha : \mathbb{N} \to \mathbb{C}$ with finite support (that is, $\alpha$ is non-zero only on a finite set) and any primitive residue class $a\ (q)$, we define the (signed) discrepancy $\Delta(\alpha; a\ (q))$ to be the quantity $\Delta(\alpha; a\ (q)) := \sum_{n = a\ (q)} \alpha(n) - \frac{1}{\varphi(q)} \sum_{(n,q)=1} \alpha(n)$.
[1] When $a, b$ are real numbers, we will also need to use $(a, b)$ and $[a, b]$ to denote the open and closed intervals respectively with endpoints $a, b$. Unfortunately, this notation conflicts with the notation given above, but it should be clear from the context which notation is in use.
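To make the discrepancy concrete, here is a small Python sketch that evaluates it exactly for a finitely supported $\alpha$ stored as a dictionary. The function name `discrepancy` and the dictionary representation are assumptions made for illustration only.

```python
from fractions import Fraction
from math import gcd

def discrepancy(alpha, a, q):
    """Signed discrepancy of alpha (dict n -> value) in the primitive class a (q)."""
    if gcd(a, q) != 1:
        raise ValueError("the residue class a (q) must be primitive")
    # phi(q), counted directly from the definition.
    phi_q = sum(1 for r in range(1, q + 1) if gcd(r, q) == 1)
    # Sum of alpha over the residue class a (q).
    in_class = sum((Fraction(v) for n, v in alpha.items() if (n - a) % q == 0),
                   Fraction(0))
    # Sum of alpha over all n coprime to q.
    coprime = sum((Fraction(v) for n, v in alpha.items() if gcd(n, q) == 1),
                  Fraction(0))
    return in_class - coprime / phi_q

# Indicator function of the primes up to 30.
primes_to_30 = {p: 1 for p in (2, 3, 5, 7, 11, 13, 17, 19, 23, 29)}
d1 = discrepancy(primes_to_30, 1, 4)
d3 = discrepancy(primes_to_30, 3, 4)
```

Here four of the nine odd primes up to $30$ lie in the class $1\ (4)$ and five lie in $3\ (4)$, so the two discrepancies are $-\frac{1}{2}$ and $+\frac{1}{2}$; the discrepancies over all primitive classes of a fixed modulus always sum to zero.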
In [25] it was shown that any estimate of the form $\mathrm{EH}[\vartheta]$ with some fixed $\vartheta > 1/2$ would imply the finiteness of $H_1$. While such an estimate remains unproven, it was observed by Motohashi-Pintz [43] and by Zhang [65] that a certain weakened version of $\mathrm{EH}[\vartheta]$ would still suffice for this purpose. More precisely (and following the notation of our previous paper [52]), let $\varpi, \delta > 0$ be fixed, and let $\mathrm{MPZ}[\varpi, \delta]$ be the following claim:
Claim 2.4 (Motohashi-Pintz-Zhang estimate, $\mathrm{MPZ}[\varpi, \delta]$) Let $I \subset [1, x^\delta]$ and $Q \lessapprox x^{1/2 + 2\varpi}$. Let $P_I$ denote the product of all the primes in $I$, and let $S_I$ denote the square-free natural numbers whose prime factors lie in $I$. If the residue class $a\ (P_I)$ is primitive (and is allowed to depend on $x$), and $A \geq 1$ is fixed, then where the implied constant depends only on the fixed quantities $(A, \varpi, \delta)$, but not on $a$.
In fact, a stronger result was established in [52], in which the moduli $q$ were assumed to be densely divisible rather than smooth, but we will not exploit such improvements here. For our application, the most important thing is to get $\varpi$ as large as possible; in particular, Theorem 2.5 allows one to get $\varpi$ arbitrarily close to $\frac{7}{600} \approx 0.01167$. In this paper, we will also study the following generalization of the Elliott-Halberstam conjecture for a fixed choice of $0 < \vartheta < 1$:
Claim 2.6 (Generalized Elliott-Halberstam conjecture, $\mathrm{GEH}[\vartheta]$) Let $\varepsilon > 0$ and $A \geq 1$ be fixed. Let $N, M$ be quantities such that $x^\varepsilon \lessapprox N \lessapprox x^{1-\varepsilon}$ and $x^\varepsilon \lessapprox M \lessapprox x^{1-\varepsilon}$ with $NM \sim x$, and let $\alpha, \beta : \mathbb{N} \to \mathbb{R}$ be sequences supported on $[N, 2N]$ and $[M, 2M]$ respectively, such that one has the pointwise bounds for all natural numbers $n, m$. Suppose also that $\beta$ obeys the Siegel-Walfisz type bound $|\Delta(\beta 1_{(\cdot, r) = 1}; a\ (q))| \ll \tau(qr)^{O(1)} M \log^{-A} x$ (7) for any $q, r \geq 1$, any fixed $A$, and any primitive residue class $a\ (q)$. Then for any $Q \lessapprox x^\vartheta$, we have In [7, Conjecture 1] it was essentially conjectured [2] that $\mathrm{GEH}[\vartheta]$ was true for all $0 < \vartheta < 1$. This is stronger than the Elliott-Halberstam conjecture:
Proposition 2.7 For any fixed $0 < \vartheta < 1$, $\mathrm{GEH}[\vartheta]$ implies $\mathrm{EH}[\vartheta]$.
One could similarly describe a generalization of the Motohashi-Pintz-Zhang estimate $\mathrm{MPZ}[\varpi, \delta]$, but unfortunately the arguments in [65] or Theorem 2.5 do not extend to this setting unless one is in the "Type I/Type II" case in which $N, M$ are constrained to be somewhat close to $x^{1/2}$, or if one has "Type III" structure to the convolution $\alpha \star \beta$, in the sense that it can be refactored as a convolution involving several "smooth" sequences. In any event, our analysis would not be able to make much use of such incremental improvements to $\mathrm{GEH}[\vartheta]$, as we only use this hypothesis effectively in the case when $\vartheta$ is very close to $1$. In particular, we will not directly use Theorem 2.8 in this paper.
[2] Actually, there are some differences between [7, Conjecture 1] and the claim here. Firstly, we need an estimate that is uniform for all $a$, whereas in [7] only the case of a fixed modulus $a$ was asserted. On the other hand, $\alpha, \beta$ were assumed to be controlled in $\ell^2$ instead of via the pointwise bounds (6), and $Q$ was allowed to be as large as $x \log^{-C} x$ for some fixed $C$ (although, in view of the negative results in [19], [20], this latter strengthening may be too ambitious).
[3] One could also use the Heath-Brown identity [31] here if desired.

Outline of the key ingredients
In this section we describe the key subtheorems used in the proof of Theorem 1.4, with the proofs of these subtheorems mostly being deferred to later sections.
We begin with a weak version of the Dickson-Hardy-Littlewood prime tuples conjecture [30], which (following Pintz [46]) we refer to as $\mathrm{DHL}[k, j]$. Recall that for any $k \in \mathbb{N}$, an admissible $k$-tuple is a tuple $\mathcal{H} = (h_1, \dots, h_k)$ of $k$ increasing integers $h_1 < \dots < h_k$ which avoids at least one residue class $a_p\ (p) := \{a_p + np : n \in \mathbb{Z}\}$ for every $p$. For instance, $(0, 2, 6)$ is an admissible $3$-tuple, but $(0, 2, 4)$ is not.
For any $k \geq j \geq 2$, we let $\mathrm{DHL}[k, j]$ denote the following claim:
Claim 3.1 (Weak Dickson-Hardy-Littlewood conjecture, $\mathrm{DHL}[k, j]$) For any admissible $k$-tuple $\mathcal{H} = (h_1, \dots, h_k)$ there exist infinitely many translates $n + \mathcal{H} = (n + h_1, \dots, n + h_k)$ of $\mathcal{H}$ which contain at least $j$ primes.
The full Dickson-Hardy-Littlewood conjecture is then the assertion that $\mathrm{DHL}[k, k]$ holds for all $k \geq 2$. In our analysis we will focus on the case when $j$ is much smaller than $k$; in fact $j$ will be of the order of $\log k$.
For any $k$, let $H(k)$ denote the minimal diameter $h_k - h_1$ of an admissible $k$-tuple; thus for instance $H(3) = 6$. It is clear that for any natural numbers $m \geq 1$ and $k \geq m + 1$, the claim $\mathrm{DHL}[k, m+1]$ implies that $H_m \leq H(k)$ (and the claim $\mathrm{DHL}[k, k]$ would imply that $H_{k-1} = H(k)$). We will therefore deduce Theorem 1.4 from a number of claims of the form $\mathrm{DHL}[k, j]$. More precisely, we have: (i) $H(50) = 246$. (vi), (xi) In the asymptotic limit $k \to \infty$, one has $H(k) \leq k \log k + k \log\log k - k + o(k)$, with the bounds on the decay rate $o(k)$ being effective.
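The admissibility condition can be tested mechanically: a $k$-tuple can only cover every residue class modulo $p$ when $p \leq k$, so it suffices to inspect those primes. The following Python sketch does this; the function name `is_admissible` is an illustrative choice, and the code does not check that the entries are increasing.

```python
def is_admissible(H):
    """Return True if the tuple H misses at least one residue class mod every prime."""
    k = len(H)
    for p in range(2, k + 1):
        # Trial-division primality test; only primes p <= k can be fully covered.
        if all(p % d for d in range(2, int(p ** 0.5) + 1)):
            if len({h % p for h in H}) == p:
                return False  # H occupies every residue class mod p
    return True

admissible_example = is_admissible((0, 2, 6))      # the admissible 3-tuple from the text
inadmissible_example = is_admissible((0, 2, 4))    # covers all classes mod 3
```

The tuple $(0, 2, 4)$ fails because its entries are congruent to $0, 2, 1$ modulo $3$, so one of $n, n+2, n+4$ is always divisible by $3$.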
We prove Theorem 3.3 in Section 10. In the opposite direction, an application of the Brun-Titchmarsh theorem gives $H(k) \geq (\frac{1}{2} + o(1)) k \log k$ as $k \to \infty$; see [53, §3.9] for this bound, as well as some slight refinements.
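For very small $k$ the quantity $H(k)$ can be found by exhaustive search, which gives a quick sanity check of the value $H(3) = 6$ quoted above. The sketch below is a naive brute force (the names `is_admissible` and `narrowest_admissible` are illustrative), and it is far too slow to reach values such as $k = 50$, which require the more refined methods of Section 10.

```python
from itertools import combinations

def is_admissible(H):
    """Return True if H misses at least one residue class mod every prime p <= len(H)."""
    for p in range(2, len(H) + 1):
        if all(p % d for d in range(2, int(p ** 0.5) + 1)):
            if len({h % p for h in H}) == p:
                return False
    return True

def narrowest_admissible(k):
    """Return (H(k), tuple) for a narrowest admissible k-tuple, k >= 2, normalized to start at 0."""
    d = k - 1  # the diameter of k distinct integers is at least k - 1
    while True:
        # Fix the endpoints 0 and d and try every choice of the k - 2 interior points.
        for middle in combinations(range(1, d), k - 2):
            H = (0, *middle, d)
            if is_admissible(H):
                return d, H
        d += 1

diameter_3, tuple_3 = narrowest_admissible(3)
```

Normalizing the tuple to start at $0$ loses nothing, since admissibility and diameter are both invariant under translation.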
The proof of Theorem 3.2 follows the Goldston-Pintz-Yıldırım strategy that was also used in all previous progress on this problem (e.g. [25], [43], [65], [52], [38]), namely that of constructing a sieve function adapted to an admissible $k$-tuple with good properties. More precisely, we set $w := \log\log\log x$ and $W := \prod_{p \leq w} p$, and observe the crude bound We have the following simple "pigeonhole principle" criterion for $\mathrm{DHL}[k, m+1]$ (cf. [52, Lemma 4.1], though the normalization here is slightly different):
Lemma 3.4 (Criterion for DHL) Let $k \geq 2$ and $m \geq 1$ be fixed integers, and define the normalization constant Suppose that for each fixed admissible $k$-tuple $(h_1, \dots, h_k)$ and each residue class $b\ (W)$ such that $b + h_i$ is coprime to $W$ for all $i = 1, \dots, k$, one can find a non-negative weight function $\nu : \mathbb{N} \to \mathbb{R}^+$ and fixed quantities $\alpha > 0$ and $\beta_1, \dots, \beta_k \geq 0$, such that one has the asymptotic upper bound the asymptotic lower bound for all $i = 1, \dots, k$, and the key inequality Then $\mathrm{DHL}[k, m+1]$ holds.
Proof Let $(h_1, \dots, h_k)$ be a fixed admissible $k$-tuple. Since it is admissible, there is at least one residue class $b\ (W)$ such that $(b + h_i, W) = 1$ for all $h_i \in \mathcal{H}$. For an arithmetic function $\nu$ as in the lemma, we consider the quantity Combining (13) and (14), we obtain the lower bound From (12) and the crucial condition (15), it follows that $N > 0$ if $x$ is sufficiently large. On the other hand, the sum can be positive only if $n + h_i$ is prime for at least $m + 1$ indices $i = 1, \dots, k$. We conclude that, for all sufficiently large $x$, there exists some integer $n \in [x, 2x]$ such that $n + h_i$ is prime for at least $m + 1$ values of $i = 1, \dots, k$.
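The first step of this proof, the existence of a residue class $b\ (W)$ with every $b + h_i$ coprime to $W$, can be checked by brute force when $w$ is small. In the sketch below $w$ is treated as a free small parameter rather than as $\log\log\log x$, and the helper names `primorial` and `good_residues` are hypothetical.

```python
from math import gcd

def primorial(w):
    """Product W of all primes p <= w (W = 1 when w < 2)."""
    W = 1
    for p in range(2, int(w) + 1):
        if all(p % d for d in range(2, int(p ** 0.5) + 1)):
            W *= p
    return W

def good_residues(H, w):
    """Return W and all b in [0, W) such that b + h is coprime to W for every h in H."""
    W = primorial(w)
    return W, [b for b in range(W) if all(gcd(b + h, W) == 1 for h in H)]

W, residues = good_residues((0, 2, 6), 5)      # admissible tuple, W = 2 * 3 * 5
_, no_residues = good_residues((0, 2, 4), 3)   # inadmissible tuple, W = 2 * 3
```

For the admissible tuple $(0, 2, 6)$ and $W = 30$ the search finds the classes $11\ (30)$ and $17\ (30)$, whereas for the inadmissible tuple $(0, 2, 4)$ no class survives, because one of $b, b+2, b+4$ is always divisible by $3$.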
The objective is then to construct non-negative weights $\nu$ whose associated ratio $\frac{\beta_1 + \dots + \beta_k}{\alpha}$ has provable lower bounds that are as large as possible. Our sieve majorants will be a variant of the multidimensional Selberg sieves used in [38]. As with all Selberg sieves, the $\nu$ are constructed as the square of certain (signed) divisor sums. The divisor sums we will use will be finite linear combinations of products of "one-dimensional" divisor sums. More precisely, for any fixed smooth compactly supported function $F : [0, +\infty) \to \mathbb{R}$, define the divisor sum $\lambda_F : \mathbb{Z} \to \mathbb{R}$ by the formula where $\log_x$ denotes the base $x$ logarithm $\log_x n := \frac{\log n}{\log x}$.
One should think of $\lambda_F$ as a smoothed out version of the indicator function to numbers $n$ which are "almost prime" in the sense that they have no prime factors less than $x^\varepsilon$ for some small fixed $\varepsilon > 0$; see Proposition 4.2 for a more rigorous version of this heuristic. The functions $\nu$ we will use will take the form for some fixed natural number $J$, fixed coefficients $c_1, \dots, c_J \in \mathbb{R}$ and fixed smooth compactly supported functions $F_{j,i} : [0, +\infty) \to \mathbb{R}$ with $j = 1, \dots, J$ and $i = 1, \dots, k$. (One can of course absorb the constant $c_j$ into one of the $F_{j,i}$ if one wishes.) Informally, $\nu$ is a smooth restriction to those $n$ for which $n + h_1, \dots, n + h_k$ are all almost prime. Clearly, $\nu$ is a (positive-definite) fixed linear combination of functions of the form for various fixed smooth functions $F_1, \dots, F_k, G_1, \dots, G_k : [0, +\infty) \to \mathbb{R}$. The sum appearing in (13) can thus be decomposed into fixed linear combinations of sums of the form Also, if $F$ is supported on $[0, 1]$, then from (16) we clearly have when $n \geq x$ is prime, and so the sum appearing in (14) can be similarly decomposed in this case into fixed linear combinations of sums of the form To estimate the sums (21), we use the following asymptotic, proven in Section 4. For each compactly supported $F : [0, +\infty) \to \mathbb{R}$, let $S(F) := \sup\{x \geq 0 : F(x) \neq 0\}$ (22) denote the upper range of the support of $F$ (with the convention that $S(0) = 0$).
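The "almost prime" heuristic can be explored numerically. The sketch below assumes that the divisor sum takes the standard Selberg form $\lambda_F(n) = \sum_{d | n} \mu(d) F(\log_x d)$; this form, the helper names `mobius`, `lambda_F` and `F_example`, and the merely piecewise smooth cutoff $F(t) = \max(1 - t, 0)^2$ are all assumptions made for illustration.

```python
from math import log

def mobius(d):
    """Moebius function by trial division."""
    result, p = 1, 2
    while p * p <= d:
        if d % p == 0:
            d //= p
            if d % p == 0:
                return 0  # d has a square factor
            result = -result
        p += 1
    return -result if d > 1 else result

def lambda_F(n, F, x):
    """Assumed form of the divisor sum: sum over d | n of mu(d) * F(log d / log x)."""
    return sum(mobius(d) * F(log(d) / log(x))
               for d in range(1, n + 1) if n % d == 0)

def F_example(t):
    """Illustrative cutoff supported on [0, 1]; not smooth at t = 1."""
    return (1.0 - t) ** 2 if 0.0 <= t <= 1.0 else 0.0

at_prime = lambda_F(101, F_example, 100)    # n >= x prime: only d = 1 contributes
at_composite = lambda_F(6, F_example, 100)  # small prime factors: heavy cancellation
```

Under this assumed form, a prime $n \geq x$ has only the divisors $1$ and $n$, and $F(\log_x n) = 0$ when $F$ is supported on $[0, 1]$, so $\lambda_F(n) = F(0)$; a number such as $6$ with small prime factors instead produces a value close to zero.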
Theorem 3.5 (Asymptotic for prime sums) Let $k \geq 2$ be fixed, let $(h_1, \dots, h_k)$ be a fixed admissible $k$-tuple, and let $b\ (W)$ be such that $b + h_i$ is coprime to $W$ for each $i = 1, \dots, k$. Let $1 \leq i_0 \leq k$ be fixed, and for each $1 \leq i \leq k$ distinct from $i_0$, let $F_i, G_i : [0, +\infty) \to \mathbb{R}$ be fixed smooth compactly supported functions. Assume one of the following hypotheses:
(i) (Elliott-Halberstam) There exists a fixed $0 < \vartheta < 1$ such that $\mathrm{EH}[\vartheta]$ holds, and such that $\sum_{1 \leq i \leq k; i \neq i_0} (S(F_i) + S(G_i)) < \vartheta$ (23).
(ii) (Motohashi-Pintz-Zhang) There exists fixed $0 \leq \varpi < 1/4$ and $\delta > 0$ such that $\mathrm{MPZ}[\varpi, \delta]$ holds, and such that $\sum_{1 \leq i \leq k; i \neq i_0} (S(F_i) + S(G_i)) < \frac{1}{2} + 2\varpi$ (24) and $\max_{1 \leq i \leq k; i \neq i_0} \{S(F_i), S(G_i)\} < \delta$ (25).
Then we have where $B$ is given by (12) and Here of course $F'$ denotes the derivative of $F$.
To estimate the sums (19), we use the following asymptotic, also proven in Section 4.
Theorem 3.6 (Asymptotic for non-prime sums) Let $k \geq 1$ be fixed, let $(h_1, \dots, h_k)$ be a fixed admissible $k$-tuple, and let $b\ (W)$ be such that $b + h_i$ is coprime to $W$ for each $i = 1, \dots, k$. For each fixed $1 \leq i \leq k$, let $F_i, G_i : [0, +\infty) \to \mathbb{R}$ be fixed smooth compactly supported functions. Assume one of the following hypotheses:
(i) (Trivial case) One has
(ii) (Generalized Elliott-Halberstam) There exists a fixed $0 < \vartheta < 1$ and $i_0 \in \{1, \dots, k\}$ such that $\mathrm{GEH}[\vartheta]$ holds, and $\sum_{1 \leq i \leq k; i \neq i_0} (S(F_i) + S(G_i)) < \vartheta$.
Then we have where $B$ is given by (12) and $c :=$
A key point in (ii) is that no upper bound on $S(F_{i_0})$ or $S(G_{i_0})$ is required (although, as we will see in Section 4.5, the result is a little easier to prove when one has $S(F_{i_0}) + S(G_{i_0}) < 1$). This flexibility in the $F_{i_0}, G_{i_0}$ functions will be particularly crucial to obtain part (xii) of Theorem 3.2 and Theorem 1.4.
Remark 3.7 Theorems 3.5, 3.6 can be viewed as probabilistic assertions of the following form: if $n$ is chosen uniformly at random from the set $\{x \leq n \leq 2x : n = b\ (W)\}$, then the random variables $\theta(n + h_i)$ and $\lambda_{F_j}(n + h_j) \lambda_{G_j}(n + h_j)$ for $i, j = 1, \dots, k$ have mean $(1 + o(1)) \frac{W}{\varphi(W)}$ and $(\int_0^1 F_j'(t) G_j'(t)\, dt + o(1)) B^{-1}$ respectively, and furthermore these random variables enjoy a limited amount of independence, except for the fact (as can be seen from (20)) that $\theta(n + h_i)$ and $\lambda_{F_i}(n + h_i) \lambda_{G_i}(n + h_i)$ are highly correlated. Note though that we do not have asymptotics for any sum which involves two or more factors of $\theta$, as such estimates are of a difficulty at least as great as that of the twin prime conjecture (which is equivalent to the divergence of the sum $\sum_n \theta(n) \theta(n + 2)$).
Theorem 3.8 (Sieving on the standard simplex) Let $k \geq 2$ and $m \geq 1$ be fixed integers. For any fixed compactly supported square-integrable function $F : [0, +\infty)^k \to \mathbb{R}$, define the functionals and $J_i(F) := \int_{[0, +\infty)^{k-1}} \left( \int_0^\infty F(t_1, \dots, t_k)\, dt_i \right)^2 dt_1 \dots dt_{i-1}\, dt_{i+1} \dots dt_k$ for $i = 1, \dots, k$, and let $M_k$ be the supremum over all square-integrable functions $F$ that are supported on the simplex $\mathcal{R}_k := \{(t_1, \dots, t_k) \in [0, +\infty)^k : t_1 + \dots + t_k \leq 1\}$ and are not identically zero (up to almost everywhere equivalence, of course). Suppose that there is a fixed $0 < \vartheta < 1$ such that $\mathrm{EH}[\vartheta]$ holds, and such that Then $\mathrm{DHL}[k, m+1]$ holds.
Parts (vii)-(xi) of Theorem 3.2 (and hence Theorem 1.4) are then immediate from the following results, proven in Sections 6, 7, and ordered by increasing value of $k$: (xi) One has $M_k \geq \log k - C$ for all $k \geq C$, where $C$ is an absolute (and effective) constant.
For sake of comparison, in [38, Proposition 4.3] it was shown that $M_5 > 2$, $M_{105} > 4$, and $M_k \geq \log k - 2 \log\log k - 2$ for all sufficiently large $k$. As remarked in that paper, the sieves used on the bounded gap problem prior to the work in [38] would essentially correspond, in this notation, to the choice of functions $F$ of the special form $F(t_1, \dots, t_k) := f(t_1 + \dots + t_k)$, which severely limits the size of the ratio in (33) (in particular, the analogue of $M_k$ in this special case cannot exceed $4$, as shown in [59]).
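To get a feel for the variational problem, one can estimate the ratio numerically for a specific cutoff. The sketch below assumes the standard functionals $I(F) = \int F^2$ and $J_i(F) = \int (\int F\, dt_i)^2$ and the ratio $\sum_i J_i(F) / I(F)$, restricted to $k = 2$ and to the simplex $t_1 + t_2 \leq 1$; these definitions, the midpoint rule, and the name `simplex_ratio_k2` are assumptions made for illustration, not the numerical method used in Section 7.

```python
def simplex_ratio_k2(F, n=200):
    """Midpoint-rule estimate of (J_1(F) + J_2(F)) / I(F) for k = 2."""
    h = 1.0 / n
    pts = [(i + 0.5) * h for i in range(n)]

    def Fs(t1, t2):
        # Restrict F to the simplex t1 + t2 <= 1.
        return F(t1, t2) if t1 + t2 <= 1.0 else 0.0

    # I(F): integral of F^2 over the simplex.
    I = sum(Fs(a, b) ** 2 for a in pts for b in pts) * h * h
    # J_1(F): integrate out t1, square, then integrate over t2.
    J1 = sum((sum(Fs(a, b) for a in pts) * h) ** 2 for b in pts) * h
    # J_2(F): integrate out t2, square, then integrate over t1.
    J2 = sum((sum(Fs(a, b) for b in pts) * h) ** 2 for a in pts) * h
    return (J1 + J2) / I

ratio_constant = simplex_ratio_k2(lambda t1, t2: 1.0)
```

For the constant function on the simplex, a direct calculation gives $I = \frac{1}{2}$ and $J_1 = J_2 = \frac{1}{3}$, so the ratio is $\frac{4}{3}$, and the numerical estimate agrees with this up to discretization error. Any such candidate yields only a lower bound for the supremum under these assumed definitions.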
In the converse direction, in Corollary 6.4 we will also show the upper bound $M_k \leq \frac{k}{k-1} \log k$ for all $k \geq 2$, which shows in particular that the bounds in (vii) and (xi) of the above theorem cannot be significantly improved. We remark that Theorem 3.9(vii) and the Bombieri-Vinogradov theorem also give a weaker version $\mathrm{DHL}[54, 2]$ of Theorem 3.2(i).
We also have a variant of Theorem 3.8 which can accept inputs of the form $\mathrm{MPZ}[\varpi, \delta]$: be defined as in (33), but where the supremum now ranges over all square-integrable functions $F$ supported in the truncated simplex $\{(t_1, \dots, t_k) \in [0, \alpha]^k : t_1 + \dots + t_k \leq 1\}$ (34) and are not identically zero. If then $\mathrm{DHL}[k, m+1]$ holds.
In Section 6 we will establish the following variant of Theorem 3.9, which when combined with Theorem 2.5, allows one to use Theorem 3.10 to establish parts (ii)-(vi) of Theorem 3.2 (and hence Theorem 1.4): The implication is clear for (ii)-(v). For (vi), observe that from Theorem 3.11(vi), Theorem 2.5, and Theorem 3.10, we see that $\mathrm{DHL}[k, m+1]$ holds whenever $k$ is sufficiently large and $m \leq (\log k - C) \left( \frac{1}{4} + \frac{7}{600} - \frac{C}{\log k} \right)$ which is in particular implied by for some absolute constant $C'$, giving Theorem 3.2(vi).
Now we give a more flexible variant of Theorem 3.8, in which the support of $F$ is enlarged, at the cost of reducing the range of integration of the $J_i$.
Theorem 3.12 (Sieving on an epsilon-enlarged simplex) Let $k \geq 2$ and $m \geq 1$ be fixed integers, and let $0 < \varepsilon < 1$ be fixed also. For any fixed compactly supported square-integrable function $F : [0, +\infty)^k \to \mathbb{R}$, define the functionals for $i = 1, \dots, k$, and let $M_{k,\varepsilon}$ be the supremum $M_{k,\varepsilon} := \sup$ over all square-integrable functions $F$ that are supported on the simplex $(1 + \varepsilon) \cdot \mathcal{R}_k = \{(t_1, \dots, t_k) \in [0, +\infty)^k : t_1 + \dots + t_k \leq 1 + \varepsilon\}$ and are not identically zero. Suppose that there is a fixed $0 < \vartheta < 1$, such that one of the following two hypotheses holds:
(i) $\mathrm{EH}[\vartheta]$ holds, and $1 + \varepsilon < \frac{1}{\vartheta}$.
(ii) $\mathrm{GEH}[\vartheta]$ holds, and $\varepsilon < \frac{1}{k-1}$.
If $M_{k,\varepsilon} > \frac{2m}{\vartheta}$ then $\mathrm{DHL}[k, m+1]$ holds.
We prove this theorem in Section 5.3. We remark that due to the continuity of $M_{k,\varepsilon}$ in $\varepsilon$, the strict inequalities in (i), (ii) of this theorem may be replaced by non-strict inequalities. Parts (i), (xiii) of Theorem 3.2, and a weaker version $\mathrm{DHL}[4, 2]$ of part (xii), then follow from Theorem 2.3 and the following computations, proven in Sections 7.2, 7.3: We remark that computations in the proof of Theorem 3.13(xii$'$) are simple enough that the bound may be checked by hand, without use of a computer. The computations used to establish the full strength of Theorem 3.2(xii) are however significantly more complicated.
In fact, we may enlarge the support of $F$ further. We give a version corresponding to part (ii) of Theorem 3.12; there is also a version corresponding to part (i), but we will not give it here as we will not have any use for it.
Theorem 3.14 (Going beyond the epsilon enlargement) Let $k \geq 2$ and $m \geq 1$ be fixed integers, let $0 < \vartheta < 1$ be a fixed quantity such that $\mathrm{GEH}[\vartheta]$ holds, and let $0 < \varepsilon < \frac{1}{k-1}$ be fixed also. Suppose that there is a fixed non-zero square-integrable function $F : [0, +\infty)^k \to \mathbb{R}$ supported in $\frac{k}{k-1} \cdot \mathcal{R}_k$, such that for $i = 1, \dots, k$ one has the vanishing marginal condition whenever $t_1, \dots, t_{i-1}, t_{i+1}, \dots, t_k \geq 0$ are such that Suppose that we also have the inequality Then $\mathrm{DHL}[k, m+1]$ holds.
This theorem is proven in Section 5.4. Theorem 3.2(xii) is then an immediate consequence of Theorem 3.14 and the following numerical fact, established in Section 7.4.
Theorem 3.15 (A piecewise polynomial cutoff) Set $\varepsilon := \frac{1}{4}$. Then there exists a piecewise polynomial function $F : [0, +\infty)^3 \to \mathbb{R}$ supported on the simplex $\frac{3}{2} \cdot \mathcal{R}_3 = \{(t_1, t_2, t_3) \in [0, +\infty)^3 : t_1 + t_2 + t_3 \leq \frac{3}{2}\}$ and symmetric in the $t_1, t_2, t_3$ variables, such that $F$ is not identically zero and obeys the vanishing marginal condition whenever $t_1, t_2 \geq 0$ with $t_1 + t_2 > 1 + \varepsilon$, and such that
There are several other ways to combine Theorems 3.5, 3.6 with equidistribution theorems on the primes to obtain results of the form $\mathrm{DHL}[k, m+1]$, but all of our attempts to do so either did not improve the numerology, or else were numerically infeasible to implement.

Multidimensional Selberg sieves
In this section we prove Theorems 3.5 and 3.6. A key asymptotic used in both theorems is the following:
Lemma 4.1 (Asymptotic) Let $k \geq 1$ be a fixed integer, and let $N$ be a natural number coprime to $W$ with $\log N = O(\log^{O(1)} x)$. Let $F_1, \dots, F_k, G_1, \dots, G_k : [0, +\infty) \to \mathbb{R}$ be fixed smooth compactly supported functions. Then where $B$ was defined in (12), and $c :=$ The same claim holds if the denominators $[d_j, d_j']$ are replaced by $\varphi([d_j, d_j'])$.
Such asymptotics are standard in the literature; see e.g. [27] for some similar computations. In older literature, it is common to establish these asymptotics via contour integration (e.g. via Perron's formula), but we will use the Fourier-analytic approach here. Of course, both approaches ultimately use the same input, namely the simple pole of the Riemann zeta function at $s = 1$.
Proof We begin with the first claim. For $j = 1, \dots, k$, the functions $t \mapsto e^t F_j(t)$, $t \mapsto e^t G_j(t)$ may be extended to smooth compactly supported functions on all of $\mathbb{R}$, and so we have Fourier expansions and $e^t G_j(t) =$ for some fixed functions $f_j, g_j : \mathbb{R} \to \mathbb{C}$ that are smooth and rapidly decreasing in the sense that $f_j(\xi), g_j(\xi) = O((1 + |\xi|)^{-A})$ for any fixed $A > 0$ and all $\xi \in \mathbb{R}$ (here the implied constant is independent of $\xi$ and depends only on $A$).
We may thus write Therefore, if we substitute the Fourier expansions into the left-hand side of (36), the resulting expression is absolutely convergent. Thus we can apply Fubini's theorem, and the left-hand side of (36) can thus be rewritten as This latter expression factorizes as an Euler product where the local factors $K_p$ are given by We can estimate each Euler factor as Since we have where the modified zeta function $\zeta_{WN}$ is defined by the formula for $\mathrm{Re}(s) > 1$. For $\mathrm{Re}(s) \geq 1 + \frac{1}{\log x}$ we have the crude bounds where the first inequality comes from comparing the factors in the Euler product. Thus $K(\xi_1, \dots, \xi_k, \xi_1', \dots, \xi_k') = O(\log^{3k} x)$.
Combining this with the rapid decrease of $f_j, g_j$, we see that the contribution to (38) outside of the cube $\{\max(|\xi_1|, \dots, |\xi_k|, |\xi_1'|, \dots, |\xi_k'|) \leq \sqrt{\log x}\}$ (say) is negligible. Thus it will suffice to show that When $|\xi_j| \leq \sqrt{\log x}$, we see from the simple pole of the Riemann zeta function $\zeta(s) = \prod_p (1 - \frac{1}{p^s})^{-1}$ at $s = 1$ that For $-\sqrt{\log x} \leq \xi_j \leq \sqrt{\log x}$, we see that
Since $\log(WN) \ll \log^{O(1)} x$, this gives since the sum is maximized when $WN$ is composed only of primes $p \ll \log^{O(1)} x$. Thus Similarly with $1 + i\xi_j$ replaced by $1 + i\xi_j'$ or $2 + i\xi_j + i\xi_j'$. We conclude that Therefore it will suffice to show that since the errors caused by the $1 + o(1)$ multiplicative factor in (41) or the truncation $|\xi_j|, |\xi_j'| \leq \sqrt{\log x}$ can be seen to be negligible using the rapid decay of $f_j, g_j$. By Fubini's theorem, it suffices to show that for each $j = 1, \dots, k$. But from dividing (37) by $e^t$ and differentiating under the integral sign, we have and the claim then follows from Fubini's theorem. Finally, suppose that we replace the denominators $[d_j, d_j']$ with $\varphi([d_j, d_j'])$. An inspection of the above argument shows that the only change that occurs is that the $\frac{1}{p}$ term in (39) is replaced by $\frac{1}{p-1}$; but this modification may be absorbed into the $1 + O(\frac{1}{p^2})$ factor in (40), and the rest of the argument continues as before.

The trivial case
We can now prove the easiest case of the two theorems, namely case (i) of Theorem 3.6; a closely related estimate also appears in [38, Lemma 6.2]. We may assume that x is sufficiently large depending on all fixed quantities. By (16), the left-hand side of (29) may be expanded as where By hypothesis, b`h i is coprime to W for all i " 1, . . . , k, and |h i´hj | ă w for all distinct i, j. Thus, Spd 1 , . . . , d k , d 1 1 , . . . , d 1 k q vanishes unless the rd i , d 1 i s are coprime to each other and to W . In this case, Spd 1 , . . . , d k , d 1 1 , . . . , d 1 k q is summing the constant function 1 over an arithmetic progression in rx, 2xs of spacing W rd 1 , d 1 1 s . . . rd k , d 1 k s, and so By Lemma 4.1, the contribution of the main term (29) is pc`op1qqB´k x W ; note that the restriction of the integrals in (30) to r0, 1s instead of r0,`8q is harmless since SpF i q, SpG i q ă 1 for all i. Meanwhile, the contribution of the Op1q error is then bounded by By the hypothesis in Theorem 3.6(i), we see that for d 1 , . . . , d k , d 1 1 , . . . , d 1 k contributing a non-zero term here, one has for some fixed ε ą 0. From the divisor bound (1) we see that each choice of rd 1 , d 1 We conclude that the net contribution of the Op1q error to (29) is Î x 1´ε , and the claim follows.

The Elliott-Halberstam case
Now we show case (i) of Theorem 3.5. For sake of notation we take i 0 " k, as the other cases are similar. We use (16) to rewrite the left-hand side of (26) as whereS n"b pW q n`hi"0 prdi,d 1 i sq @i"1,...,k´1 θpn`h k q.
As in the previous case,Spd 1 , . . . , d k´1 , d 1 1 , . . . , d 1 k´1 q vanishes unless the rd i , d 1 i s are coprime to each other and to W , and so the summand in (43) vanishes unless the modulus q W,d1,...,d 1 k´1 defined by is squarefree. In that case, we may use the Chinese remainder theorem to concatenate the congruence conditions on n into a single primitive congruence condition , and conclude using (3) that From the prime number theorem we have and this expression is clearly independent of d 1 , . . . , d 1 k´1 . Thus by Lemma 4.1, the contribution of the main term in (45) to (43) is pc`op1qqB 1´k x ϕpW q . By (11) and (12), it thus suffices to show that for any where a " a W,d1,...,d 1 k´1 and q " q W,d1,...,d 1 k´1 . For future reference we note that we may restrict the summation here to those d 1 , . . . , d 1 k´1 for which q W,d1,...,d 1 k´1 is square-free. From the hypotheses of Theorem 3.5(i), we have whenever the summand in (43) is non-zero, and each choice q of q W,d1,...,d 1 k´1 is associated to Opτ pqq Op1q q choices of d 1 , . . . , d k´1 , d 1 1 , . . . , d 1 k´1 . Thus this contribution is Using the crude bound for any fixed C ą 0. By the Cauchy-Schwarz inequality it suffices to show that for any fixed A ą 0. However, since θ only differs from Λ on powers p j of primes with j ą 1, it is not difficult to show that so the net error in replacing θ here by Λ is Î x 1´p1´ϑq{2 , which is certainly acceptable. The claim now follows from the hypothesis EHrϑs, thanks to Claim 2.2.

The Motohashi-Pintz-Zhang case
Now we show case (ii) of Theorem 3.5. We repeat the arguments from Section 4.2, with the only difference being in the derivation of (46). As observed previously, we may restrict q W,d1,...,d 1 k´1 to be squarefree. From the hypotheses in Theorem 3.5(ii), we also see that and that all the prime factors of q W,d1,...,d 1 k´1 are at most x δ . Thus, if we set I :" r1, x δ s, we see (using the notation from Claim 2.4) that q W,d1,...,d 1 k´1 lies in S I , and is thus a factor of P I . If we then let A Ă Z{P I Z denote all the primitive residue classes a pP I q with the property that a " b pW q, and such that for each prime w ă p ď x δ , one has a`h i " 0 ppq for some i " 1, . . . , k, then we see that Note from the Chinese remainder theorem that for any given q, if one lets a range uniformly in A, then a pqq is uniformly distributed among Opτ pqq Op1q q different moduli. Thus we have and so it suffices to show that for any fixed A ą 0. We see it suffices to show that for any given a P A. But this follows from the hypothesis M P Zr , δs by repeating the arguments of Section 4.2.

Crude estimates on divisor sums
To proceed further, we will need some additional information on the divisor sums $\lambda_F$ (defined in (16)), namely that these sums are concentrated on "almost primes"; results of this type have also appeared in [44].
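For orientation, a divisor sum of this kind can be evaluated directly for small inputs. The sketch below assumes that the definition in (16) has the form $\lambda_F(n) = \sum_{d \mid n} \mu(d) F(\log_x d)$; that form, the helper names, and the particular cutoff `bump` (supported on $[0, 1/2)$) are assumptions made purely for illustration.

```python
import math

def mobius(d: int) -> int:
    """Moebius function by trial division."""
    result, p = 1, 2
    while p * p <= d:
        if d % p == 0:
            d //= p
            if d % p == 0:
                return 0          # d has a square factor
            result = -result
        p += 1
    if d > 1:
        result = -result
    return result

def divisor_sum(n: int, x: float, F) -> float:
    """Assumed form of (16): sum over d | n of mu(d) * F(log d / log x)."""
    total = 0.0
    for d in range(1, n + 1):
        if n % d == 0:
            total += mobius(d) * F(math.log(d) / math.log(x))
    return total

def bump(t: float) -> float:
    """Hypothetical smooth cutoff supported on [0, 1/2)."""
    return math.exp(-1.0 / (1.0 - 4.0 * t * t)) if 0 <= t < 0.5 else 0.0
```

With $x = 100$, any $n$ whose prime factors all exceed $x^{1/2} = 10$ (for instance $n = 101$ or $n = 11 \cdot 13$) only sees the divisor $d = 1$, so the sum collapses to $F(0)$; this is the mechanism behind the boundedness of $\lambda_F$ on almost primes.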
Proposition 4.2 (Almost primality) Let k ě 1 be fixed, let ph 1 , . . . , h k q be a fixed admissible k-tuple, and let b pW q be such that b`h i is coprime to W for each i " 1, . . . , k. Let F 1 , . . . , F k : r0,`8q Ñ R be fixed smooth compactly supported functions, and let m 1 , . . . , m k ě 0 and a 1 , . . . , a k ě 1 be fixed natural numbers. Then Furthermore, if 1 ď j 0 ď k is fixed and p 0 is a prime with p 0 ď x 1 10k , then we have the variant As a consequence, we have for any ε ą 0, where ppnq denotes the least prime factor of n.
The exponent $\frac{1}{10k}$ can certainly be improved here, but for our purposes any fixed positive exponent depending only on $k$ will suffice.
Proof The strategy is to estimate the alternating divisor sums $\lambda_{F_j}(n+h_j)$ by non-negative expressions involving prime factors of $n+h_j$, which can then be bounded combinatorially using standard tools.
We first prove (47). As in the proof of Proposition 4.1, we can use Fourier expansion to write for some rapidly decreasing f j : R Ñ C and all natural numbers d. Thus which factorizes using Euler products as The function s Þ Ñ p´s log x has a magnitude of Op1q and a derivative of Oplog x pq when Repsq ą 1, and thus 1´1 p From the rapid decrease of f j and the triangle inequality, we conclude that for any fixed a j , A. However, we have and so Making the change of variables σ :" 1`|ξ 1 |`¨¨¨`|ξ aj |, we obtain for any fixed A ą 0. In view of this bound and the Fubini-Tonelli theorem, it suffices to show that for all σ 1 , . . . , σ k ě 1. By setting σ :" σ 1`¨¨¨`σk , it suffices to show that for any σ ě 1.
To proceed further, we factorize n`h j as a product n`h j " p 1 . . . p r of primes p 1 ď¨¨¨ď p r in increasing order, and then write where d j :" p 1 . . . p ij and i j is the largest index for which p 1 . . . p ij ă x 1 10k , and m j :" p ij`1 . . . p r . By construction, we see that 0 ď i j ă r, d j ď x 1 10k . Also, we have Since n ď 2x, this implies that where we recall that Ωpd j q " i j denotes the number of prime factors of d j , counting multiplicity. We also see that where ppnq denotes the least prime factor of n. Finally, we have that and we see the d 1 , . . . , d k , W are coprime. We may thus estimate the left-hand side of (50) by where the outer sum ř˚i s over d 1 , . . . , d k ď x 1 10k with d 1 , . . . , d k , W coprime, and the inner sum ř˚i s over x ď n ď 2x with n " b pW q and n`h j " 0 pd j q for each j, with pp n`hj dj q ě R for each j. We bound the inner sum ř˚˚1 using a Selberg sieve upper bound. Let G be a smooth function supported on r0, 1s with Gp0q " 1, and let d " hj dj q ě R, and non-negative otherwise. The right hand side may be expanded as As in Section 4.1, the inner sum vanishes unless the e i e 1 i are coprime to each other and dW , in which case it is The Op1q term contributes Î R k Î x 1{10 , which is negligible. By Lemma 4.1, if Ωpdq ! log 1{2 x then the main term contributes !´d ϕpdq¯k We see that this final bound applies trivially if Ωpdq " log 1{2 x. The bound (50) thus reduces to Ignoring the coprimality conditions on the d j for an upper bound, we see this is bounded by But from Mertens' theorem we have and the claim (47) follows.
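The factorisation of $n+h_j$ into a small part $d_j$ and a rough part $m_j$ used above can be sketched as follows. This is only an illustration of the construction: the function names are ours, and `bound` plays the role of $x^{\frac{1}{10k}}$.

```python
def prime_factors(n: int) -> list[int]:
    """Prime factors of n in increasing order, with multiplicity."""
    factors, p = [], 2
    while p * p <= n:
        while n % p == 0:
            factors.append(p)
            n //= p
        p += 1
    if n > 1:
        factors.append(n)
    return factors

def split_small_part(n: int, bound: float) -> tuple[int, int]:
    """Return (d, m) with n = d * m, where d = p_1 ... p_i is the longest
    prefix product of the sorted prime factors that stays below `bound`."""
    d = 1
    for p in prime_factors(n):
        if d * p >= bound:
            break
        d *= p
    return d, n // d
```

For example, $n = 2 \cdot 3 \cdot 5 \cdot 7 \cdot 11 \cdot 13$ with `bound = 100` gives $d = 30$ and $m = 1001$; the least prime factor of $m$ is $7$, and $7d \ge 100$, which is the inequality exploited in the text.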
The proof of (48) is a minor modification of the argument above used to prove (47). Namely, the variable $d_{j_0}$ is now replaced by $[d_{j_0}, p_0] < x^{1/5k}$, which upon factoring out $p_0$ has the effect of multiplying the upper bound for (51) by $O\left(\frac{\sigma \log_x p_0}{p_0}\right)$ (at the negligible cost of deleting the prime $p_0$ from the sum $\sum_{p \le x}$), giving the claim; we omit the details. Finally, (49) follows immediately from (47) when $\varepsilon > \frac{1}{10k}$, and from (48) and Mertens' theorem when $\varepsilon \le \frac{1}{10k}$.
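Mertens' theorem, in the form $\sum_{p \le x} \frac{1}{p} = \log\log x + M + o(1)$, is easy to sanity-check numerically. The sketch below is illustrative only; the helper names are ours and the value of the Meissel-Mertens constant $M$ is hard-coded to ten decimal places.

```python
import math

def primes_up_to(n: int) -> list[int]:
    """Sieve of Eratosthenes."""
    sieve = bytearray([1]) * (n + 1)
    sieve[0:2] = b"\x00\x00"
    for p in range(2, math.isqrt(n) + 1):
        if sieve[p]:
            sieve[p * p :: p] = bytearray(len(range(p * p, n + 1, p)))
    return [i for i, flag in enumerate(sieve) if flag]

MERTENS_CONSTANT = 0.2614972128  # Meissel-Mertens constant M

def mertens_gap(x: int) -> float:
    """sum_{p <= x} 1/p minus (log log x + M)."""
    s = sum(1.0 / p for p in primes_up_to(x))
    return s - (math.log(math.log(x)) + MERTENS_CONSTANT)

if __name__ == "__main__":
    for x in (10**3, 10**4, 10**5):
        print(x, mertens_gap(x))
```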

Remark 4.3
As in [44], one can use Proposition 4.2, together with the observation that the quantity $\lambda_F(n)$ is bounded whenever $n = O(x)$ and $p(n) \ge x^{\varepsilon}$, to conclude that whenever the hypotheses of Lemma 3.4 are obeyed for some $\nu$ of the form (18), then there exists a fixed $\varepsilon > 0$ such that for all sufficiently large $x$, there are $\gg \frac{x}{\log^k x}$ elements $n$ of $[x, 2x]$ such that $n+h_1, \dots, n+h_k$ have no prime factor less than $x^{\varepsilon}$, and that at least $m$ of the $n+h_1, \dots, n+h_k$ are prime.

The generalized Elliott-Halberstam case
Now we show case (ii) of Theorem 3.6. For sake of notation we shall take i 0 " k, as the other cases are similar; thus we have The basic idea is to view the sum (29) as a variant of (26), with the role of the function θ now being played by the product divisor sum λ F k λ G k , and to repeat the arguments in Section 4.2. To do this we rely on Proposition 4.2 to restrict n`h i to the almost primes.
We turn to the details. Let ε ą 0 be an arbitrary fixed quantity. From (49) and Cauchy-Schwarz one has ith the implied constant uniform in ε, so by the triangle inequality and a limiting argument as ε Ñ 0 it suffices to show that where c ε is a quantity depending on ε but not on x, such that We use (16) to expand out λ Fi , λ Gi for i " 1, . . . , k´1, but not for i " k, so that the left-hand side of (29) becomes ÿ d1,...,d k´1 ,d 1 1 ,...,d 1 where As before, the summand in (54) vanishes unless the modulus [4] q W,d1,...,d 1 k´1 defined in (44) is squarefree, in which case we have the analogue of (45). Here we have put q " q W,d1,...,d 1 k´1 and a " a W,d1,...,d 1 k´1 for convenience. We thus split where, To show (53), it thus suffices to show the main term estimate the first error term estimate and the second error term estimate for any fixed A ą 0.
We begin with (61). Observe that if ppnq ą x ε , then the only way that pn, q W,d1,...,d 1 k´1 q can exceed 1 is if there is a prime x ε ă p ! x which divides both n and one of d 1 , . . . , d 1 k´1 ; in particular, this case can only occur when k ą 1. For sake of notation we will just consider the contribution when there is a prime that divides n and d 1 , as the other 2k´3 cases are similar. By (57), this contribution to Σ 2 can then be crudely bounded (using (1)) by as required, where we have made the change of variables e i :" rd i , d 1 i s, using the divisor bound to control the multiplicity. Now we show (62). From the hypothesis (28) We see the product in (59) is Op1q. Thus by (58), we may bound Σ 3 by From (2) we easily obtain the bound so by Cauchy-Schwarz it suffices to show that for any fixed A ą 0. If we had the additional hypothesis SpF k q`SpG k q ă 1, then this would follow easily from the hypothesis GEHrϑs thanks to Claim 2.6, since one can write But even in the absence of the hypothesis SpF k q`SpG k q ă 1, we can still invoke GEHrϑs after appealing to the fundamental theorem of arithmetic. Indeed, if n P rx`h k , 2x`h k s with pp¨q ą ε, then we have n " p 1 . . . p r for some primes x ε ă p 1 ď¨¨¨ď p r ď 2x`h k , which forces r ď 1 ε`1 . If we then partition rx ε , 2x`h k s by Oplog A`1 xq intervals I 1 , . . . , I m , with each I j contained in an interval of the form rN, p1`log´A xqN s, then we have p i P I ji for some 1 ď j 1 ď¨¨¨ď j r ď m, with the product interval I j1¨¨¨¨¨Ijr intersecting rx`h k , 2x`h k s. For fixed r, there are Oplog Ar`r xq such tuples pj 1 , . . . , j r q, and a simple application of the prime number theorem with classical error term (and crude estimates on the discrepancy ∆) shows that each tuple contributes Opx log´A r`Op1q xq to (63) (here, and for the rest of this section, implied constants will be independent of A unless stated otherwise). In particular, the Oplog Apr´1q xq tuples pj 1 , . . . 
, j r q with one repeated j i , or for which the interval I j1¨¨¨¨¨Ijr meets the boundary of rx`h k , 2x`h k s, contribute a total of Oplog´A`O p1q xq. This is an acceptable error to (63), and so these tuples may be removed. Thus it suffices to show that ..,jr is the set of all products p 1 . . . p r with p i P I ji for i " 1, . . . , r, and where we allow implied constants in the ! notation to depend on ε. But for n in A j1,...,jr , the 2 r factors of n are just the products of subsets of tp 1 , . . . , p r u, and from the smoothness of F k , G k we see that λ F k pnq is equal to some bounded constant (depending on j 1 , . . . , j r , but independent of p 1 , . . . , p r ), plus an error of Oplog´A xq. As before, the contribution of this error is Oplog´A pr`1q`Op1q xq, so it suffices to show that But one can write 1 Aj 1 ,...,jr as a convolution 1 Aj 1 ‹¨¨¨‹ 1 Aj r , where A ji denotes the primes in I ji ; assigning A jr (for instance) to be β and the remaining portion of the convolution to be α, the claim now follows from the hypothesis GEHrϑs, thanks to the Siegel-Walfisz theorem (see e.g. [58,Satz 4] or [34,Th. 5.29]).
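The partition of $[x^{\varepsilon}, 2x+h_k]$ into short multiplicative intervals used in this argument can be made concrete. The sketch below builds consecutive intervals $[N, (1+\log^{-A} x)N]$ and counts them; the function names and the sample parameters are illustrative assumptions, and the count is only compared against $\log^{A+1} x$ for one choice of $x$.

```python
import math

def geometric_partition(lo: float, hi: float, ratio: float) -> list[tuple[float, float]]:
    """Cover [lo, hi] by consecutive intervals [N, ratio * N]; the last is clipped at hi."""
    intervals, left = [], lo
    while left < hi:
        right = min(left * ratio, hi)
        intervals.append((left, right))
        left = right
    return intervals

def partition_for(x: float, eps: float, A: int, h: int = 0) -> list[tuple[float, float]]:
    """Intervals I_1, ..., I_m covering [x^eps, 2x + h] with ratio 1 + log^{-A} x."""
    ratio = 1.0 + math.log(x) ** (-A)
    return geometric_partition(x ** eps, 2 * x + h, ratio)

if __name__ == "__main__":
    x, eps, A = 1e6, 0.1, 2
    cells = partition_for(x, eps, A)
    print(len(cells), "intervals; log^(A+1) x =", math.log(x) ** (A + 1))
```

The number of intervals is about $(1-\varepsilon)\log x \cdot \log^{A} x$, which is where the $O(\log^{A+1} x)$ count comes from.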
Finally, we show (60). By Lemma 4.1 we have (note that F i , G i are supported on r0, 1s by hypothesis), so by (56) it suffices to show that where c 2 ε is a quantity depending on ε but not on x such that lim εÑ0 c 2 ε " In the case SpF k q`SpG k q ă 1, this would follow easily from (the k " 1 case of) Theorem 3.6(i) and Proposition 4.2. In the general case, we may appeal once more to the fundamental theorem of arithmetic. As before, we may factor n " p 1 . . . p r for some x ε ď p 1 ď¨¨¨ď p r ď 2x`h k and r ď 1 ε`1 . The contribution of those n with a repeated prime factor p i " p i`1 can easily be shown to be Î x 1´ε in the same manner we dealt with Σ 2 , so we may restrict attention to the square-free n, for which the p i are strictly increasing. In that case, one can write where B phq F pxq :" F px`hq´F pxq. On the other hand, a standard application of Mertens' theorem and the prime number theorem (and an induction on r) shows that for any fixed r ě 1 and any fixed continuous function f : where c f is the quantity Putting all this together, we see that we obtain an asymptotic (64) with Comparing (64) with the first part of Proposition 4.2 we see that c 2 ε " Op1q uniformly in ε; subtracting two instances of (64) and comparing with the last part of Proposition 4.2 we see that |c 2 ε1´c 2 ε2 | ! ε 1`ε2 for any ε 1 , ε 2 ą 0. We conclude that c 2 ε converges to a limit as ε Ñ 0 for any F, G. This implies the absolute convergence indeed, by the Cauchy-Schwarz inequality it suffices to establish this for F " G, at which point we may remove the absolute value signs and use the boundedness of c 2 ε . By the dominated convergence theorem, it therefore suffices to establish the identity It will suffice to show the identity for any smooth F : r0,`8q Ñ R, since (66) follows by replacing F with F k`Gk and F k´Gk and then subtracting.
At this point we use the following identity: For any positive reals t 1 , . . . , t r with r ě 1, we have Thus, for instance, when r " 2 we have 1 Proof If the right-hand side of (68) is denoted f r pt 1 , . . . , t r q, then one easily verifies the identity for any r ą 1; but the left-hand side of (68) also obeys this identity, and the claim then follows from induction.
From this lemma and symmetrisation, we may rewrite the left-hand side of (67) as Let I a pF q :" and J a pF q :" pB paq F p0qq 2 .
One can then rewrite (67) as the identity where K a,r pF q :" To prove this, we first observe the identity for any a ą 0; indeed, we have and the claim follows. Iterating this identity k times, we see that In particular, dropping the L a,k pF q term and sending k Ñ 8 yields the lower bound On the other hand, we can expand L a,k pF q as Writing s :" t 1`¨¨¨`tk , we obtain the upper bound L a,k pF q ď ż s,tě0:s`tďa where F t pxq :" F px`tq. Summing this and using (71) and the monotone convergence theorem, we conclude that

Reduction to a variational problem
Now that we have proven Theorems 3.5 and 3.6, we can establish Theorems 3.8, 3.10, 3.12, and 3.14. The main technical difficulty is to take the multidimensional measurable functions $F$ appearing in these theorems and approximate them by tensor products of smooth functions, for which Theorems 3.5 and 3.6 may be applied.

Proof of Theorem 3.8
We now prove Theorem 3.8. Let k, m, ϑ obey the hypotheses of that theorem, thus we may find a fixed square-integrable function F : r0,`8q k Ñ R supported on the simplex R k :" tpt 1 , . . . , t k q P r0,`8q k : t 1`¨¨¨`tk ď 1u and not identically zero and with We now perform a number of technical steps to further improve the structure of F . Our arguments here will be somewhat convoluted, and are not the most efficient way to prove Theorem 3.8 (which in any event was already established in [38]), but they will motivate the similar arguments given below to prove the more difficult results in Theorems 3.10, 3.12, 3.14. In particular, we will use regularisation techniques which are compatible with the vanishing marginal condition (35) that is a key hypothesis in Theorem 3.14.
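We first need to rescale and retreat a little bit from the slanted boundary of the simplex $\mathcal{R}_k$. Let $\delta_1 > 0$ be a sufficiently small fixed quantity, and let $F_1 : [0,+\infty)^k \to \mathbb{R}$ be the rescaled function $F_1(t_1,\dots,t_k) := F\left(\frac{t_1}{\vartheta/2-\delta_1}, \dots, \frac{t_k}{\vartheta/2-\delta_1}\right)$. Thus $F_1$ is a fixed square-integrable measurable function supported on the rescaled simplex $(\vartheta/2-\delta_1) \cdot \mathcal{R}_k = \{(t_1,\dots,t_k) \in [0,+\infty)^k : t_1+\dots+t_k \le \vartheta/2-\delta_1\}$.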
From (72), we see that if δ 1 is small enough, then F 1 is not identically zero and Let δ 1 and F 1 be as above. Next, let δ 2 ą 0 be a sufficiently small fixed quantity (smaller than δ 1 ), and write F 2 : r0,`8q k Ñ R to be the shifted function, defined by setting F 2 pt 1 , . . . , t k q :" F 1 pt 1´δ2 , . . . , t k´δ2 q when t 1 , . . . , t k ě δ 2 , and F 2 pt 1 , . . . , t k q " 0 otherwise. As F 1 was square-integrable, compactly supported, and not identically zero, and because spatial translation is continuous in the strong operator topology on L 2 , it is easy to see that we will have F 2 not identically zero and that for δ 2 small enough (after restricting F 2 back to r0,`8q k , of course). For δ 2 small enough, this function will be supported on the region tpt 1 , . . . , t k q P R k : t 1¨¨¨`tk ď ϑ{2´δ 2 ; t 1 , . . . , t k ě δ 2 u, thus F 2 stays away from all the boundary faces of R k . By convolving F 2 with a smooth approximation to the identity that is supported sufficiently close to the origin, one may then find a smooth function F 3 : r0,`8q k Ñ R, supported on tpt 1 , . . . , t k q P R k : t 1¨¨¨`tk ď ϑ{2´δ 2 {2; t 1 , . . . , t k ě δ 2 {2u, which is not identically zero, and such that We extend F 3 by zero to all of R k , and then define the function f 3 : R k Ñ R by thus f 3 is smooth, not identically zero and supported on the region tpt 1 , . . . , t k q P R k : From the fundamental theorem of calculus we have and so IpF 3 q "Ĩpf 3 q and J i pF 3 q "J i pf 3 q for i " 1, . . . , k, wherẽ and In particular, Now we approximate f 3 by linear combinations of tensor products. By the Stone-Weierstrass theorem, we may express f 3 as the uniform limit of functions of the form where c 1 , . . . , c J are real scalars, and f i,j : R Ñ R are smooth compactly supported functions. Since f 3 is supported in (76), we can ensure that all the components f 1,j pt 1 q . . . f k,j pt k q are supported in the slightly larger region tpt 1 , . . . 
, t k q P R k : Observe that if one convolves a function of the form (81) with a smooth approximation to the identity which is of tensor product form pt 1 , . . . , t k q Þ Ñ ϕ 1 pt 1 q . . . ϕ 1 pt k q, one obtains another function of this form. Such a convolution converts a uniformly convergent sequence of functions to a uniformly smoothly convergent sequence of functions (that is to say, all derivatives of the functions converge uniformly). From this, we conclude that f 3 can be expressed as the smooth limit of functions of the form (81), with each component f 1,j pt 1 q . . . f k,j pt k q supported in the region tpt 1 , . . . , t k q P R k : Thus, we may find such a linear combination with J, c j , f i,j fixed and f 4 not identically zero, with Furthermore, by construction we have for all j " 1, . . . , J, where Spq was defined in (22). Now we construct the sieve weight ν : N Ñ R by the formula where the divisor sums λ f were defined in (16). Clearly ν is non-negative. Expanding out the square and using Theorem 3.6(i) and (84), we see that By (20), one has λ f k,j pn`h k q " f k,j p0q whenever n gives a non-zero contribution to the above sum. Expanding out the square in (85) again and using Theorem 3.5(i) and (84) (and the hypothesis EHrϑs), we thus see that which factorizes using (82), (79) as More generally, we see that for i " 1, . . . , k, with β i :"J i pf 4 q. Applying Lemma 3.4 and (75), we obtain DHLrk, m`1s as required.

Proof of Theorem 3.10
Now we prove Theorem 3.10, which uses a very similar argument to that of the previous section. Let k, m, , δ, F be as in Theorem 3.10. By performing the same rescaling as in the previous section (but with 1{2`2 playing the role of ϑ), we see that we can find a fixed square-integrable measurable function F 1 supported on the rescaled truncated simplex for some sufficiently small fixed δ 1 ą 0, such that (73) holds. By repeating the arguments of the previous section we may eventually arrive at a smooth function f 4 : R k Ñ R of the form (82), which is not identically zero and obeys (83), and such that each component f 1,j pt 1 q . . . f k,j pt k q is supported in the region tpt 1 , . . . , t k q P R k : for some sufficiently small δ 2 ą 0. In particular, one has Spf 1,j q`¨¨¨`Spf k,j q ă 1 4` ď 1 2 and Spf 1,j q, . . . , Spf k,j q ă δ for all j " 1, . . . , J. If we then define ν by (85) as before, and repeat all of the above arguments (but use Theorem 3.5(ii) and MPZr , δs in place of Theorem 3.5(i) and EHrϑs), we obtain the claim; we leave the details to the interested reader.

Proof of Theorem 3.12
Now we prove Theorem 3.12. Let k, m, ε, ϑ be as in that theorem. Then one may find a square-integrable function F : r0,`8q k Ñ R supported on p1`εq¨R k which is not identically zero, and with By truncating and rescaling as in Section 5.1, we may find a fixed bounded measurable function F 1 : r0,`8q k Ñ R on the simplex p1`εqp ϑ 2´δ 1 q¨R k such that By repeating the arguments in Section 5.1, we may eventually arrive at a smooth function f 4 : R k Ñ R of the form (82), which is not identically zero and obeys and such that each component f 1,j pt 1 q . . . f k,j pt k q is supported in the region # pt 1 , . . . , t k q P R k : for some sufficiently small δ 2 ą 0. In particular, we have Spf 1,j q`¨¨¨`Spf k,j q ď p1`εq ϑ 2´δ for all 1 ď j ď J. Let δ 3 ą 0 be a sufficiently small fixed quantity (smaller than δ 1 or δ 2 ). By a smooth partitioning, we may assume that all of the f i,j are supported in intervals of length at most δ 3 , while keeping the sum J ÿ j"1 |c j ||f 1,j pt 1 q| . . . |f k,j pt k q| bounded uniformly in t 1 , . . . , t k and in δ 3 . Now let ν be as in (85), and consider the expression This expression expands as a linear combination of the expressions for various 1 ď j, j 1 ď J. We claim that this sum is equal tõ To see this, we divide into two cases. First suppose that hypothesis (i) from Theorem 3.12 holds. Then from (87) we have k ÿ i"1 pSpf i,j q`Spf i,j 1 qq ă p1`εqϑ ă 1 and the claim follows from Theorem 3.6(i). Now suppose instead that hypothesis (ii) from Theorem 3.12 holds, then from (87) one has and so from the pigeonhole principle we have The claim now follows from Theorem 3.6(ii).
Putting this together as in Section 5.1, we conclude that where α :"Ĩpf 4 q.
Now we consider the sum From Proposition 2.7 we see that we have EHrϑs as a consequence of the hypotheses of Theorem 3.12. However, this combined with Theorem 3.5 is not strong enough to obtain an asymptotic for the sum (89), as there is an epsilon loss in (87). But observe that Lemma 3.4 only requires a lower bound on the sum (89), rather than an asymptotic. To obtain this lower bound, we partition t1, . . . , Ju into J 1 Y J 2 , where J 1 consists of those indices j P t1, . . . , Ju with and J 2 is the complement. From the elementary inequality we obtain the pointwise lower bound The point of performing this lower bound is that if j P J 1 Y J 2 and j 1 P J 1 , then from (87), (90) one has k´1 ÿ i"1 pSpf i,j q`Spf i,j 1 qq ă ϑ which makes Theorem 3.5(i) available for use. Indeed, for any j P t1, . . . , Ju and i " 1, . . . , k, we have from (87) that and so by (20) we have for x ď n ď 2x. If we then apply Theorem 3.5(i) and the hypothesis EHrϑs, we obtain the lower bound which we can rearrange as A similar argument gives If we choose δ 3 small enough, then the claim DHLrk, m`1s now follows from Lemma 3.4 and (86).

Proof of Theorem 3.14
Finally, we prove Theorem 3.14. Let k, m, ε, F be as in that theorem. By rescaling as in previous sections, we may find a square-integrable function F 1 : r0,`8q k Ñ R supported on p k k´1 ϑ 2´δ 1 q¨R k for some sufficiently small fixed δ 1 ą 0, which is not identically zero, which obeys the bound and also obeys the vanishing marginal condition (35) whenever t 1 , . . . , t i´1 , t i`1 , . . . , t k ě 0 are such that t 1`¨¨¨`ti´1`ti`1`¨¨¨`tk ą p1`εq ϑ 2´δ 1 .
As before, we pass from F 1 to F 2 by a spatial translation, and from F 2 to F 3 by a regularisation; crucially, we note that both of these operations interact well with the vanishing marginal condition (35), with the end product being that we obtain a smooth function F 3 : r0,`8q k Ñ R, supported on the region for some sufficiently small δ 2 ą 0, which is not identically zero, obeying the bound and also obeying the vanishing marginal condition (35) whenever t 1 , . . . , t i´1 , t i`1 , . . . , t k ě 0 are such that t 1`¨¨¨`ti´1`ti`1`¨¨¨`tk ą p1`εq ϑ 2´δ 2 2 .
As before, we now define the function f 3 : R k Ñ R by f 3 pt 1 , . . . , t k q :" ż s1ět1,...,s k ět k F 3 ps 1 , . . . , s k q ds 1 . . . ds k , thus f 3 is smooth, not identically zero and supported on the region # pt 1 , . . . , t k q P R k : Furthermore, from the vanishing marginal condition we see that we also have f 3 pt 1 , . . . , t k q " 0 whenever we have some 1 ď i ď k for which t i ď δ 2 {2 and From the fundamental theorem of calculus as before, we have Using the Stone-Weierstrass theorem as before, we can then find a function f 4 of the form where c 1 , . . . , c J are real scalars, and f i,j : R Ñ R are smooth functions supported on intervals of length at most δ 3 ą 0 for some sufficiently small δ 3 ą 0, with each component f 1,j pt 1 q . . . f k,j pt k q supported in the region # pt 1 , . . . , t k q P R k : + and avoiding the regions * for each i " 1, . . . , k, and such that In particular, for any j " 1, . . . , J we have Spf 1,j q`¨¨¨`Spf k,j q ă k k´1 and for any i " 1, . . . , k with f k,i not vanishing at zero, we have Spf 1,j q`¨¨¨`Spf k,i´1 q`Spf k,i`1 q`¨¨¨`Spf k,j q ă p1`εq ϑ 2 .
Let ν be defined by (85). From (93), the hypothesis GEHrϑs, and the argument from the previous section used to prove Theorem 3.12(ii), we have Similarly, from (94) (and the upper bound Spf i,j q ă 1 from (93)), the hypothesis EHrϑs (which is available by Proposition 2.7), and the argument from the previous section we have Setting δ 3 small enough, the claim DHLrk, m`1s now follows from Lemma 3.4.

Asymptotic analysis
We now establish upper and lower bounds on the quantity $M_k$ defined in (33), as well as for the related quantities appearing in Theorem 3.10.
To obtain an upper bound on $M_k$, we use the following consequence of the Cauchy-Schwarz inequality.
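Lemma 6.1 (Cauchy-Schwarz) Let $k \ge 2$, and suppose that there exist positive measurable functions $G_i : \mathcal{R}_k \to (0,+\infty)$ for $i = 1, \dots, k$ such that $\int_0^\infty G_i(t_1,\dots,t_k)\,dt_i \le 1$ for all $t_1, \dots, t_{i-1}, t_{i+1}, \dots, t_k \ge 0$, where we extend $G_i$ by zero to all of $[0,+\infty)^k$. Then we have $M_k \le \operatorname{ess\,sup}_{(t_1,\dots,t_k) \in \mathcal{R}_k} \sum_{i=1}^k \frac{1}{G_i(t_1,\dots,t_k)}$. Here ess sup refers to essential supremum (thus, we may ignore a subset of $\mathcal{R}_k$ of measure zero in the supremum).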
As a corollary, we can compute $M_k$ exactly if we can locate a positive eigenfunction: Corollary 6.2 Let $k \ge 2$, and suppose that there exists a positive function $F : \mathcal{R}_k \to (0,+\infty)$ obeying the eigenfunction equation $\lambda F(t_1,\dots,t_k) = \sum_{i=1}^k \int_0^\infty F(t_1,\dots,t_{i-1},t_i',t_{i+1},\dots,t_k)\,dt_i'$ (97) for some $\lambda > 0$ and all $(t_1,\dots,t_k) \in \mathcal{R}_k$, where we extend $F$ by zero to all of $[0,+\infty)^k$. Then $\lambda = M_k$.
Proof On the one hand, if we integrate (97) against $F$ and use (31), (32) we see that $\lambda I(F) = \sum_{i=1}^k J_i(F)$, and thus by (33) we see that $M_k \ge \lambda$. On the other hand, if we apply Lemma 6.1 with $G_i(t_1,\dots,t_k) := \frac{F(t_1,\dots,t_k)}{\int_0^\infty F(t_1,\dots,t_{i-1},t_i',t_{i+1},\dots,t_k)\,dt_i'}$ we see that $M_k \le \lambda$, and the claim follows.
This allows for an exact calculation of $M_2$: where the Lambert $W$-function $W(x)$ is defined for positive $x$ as the unique positive solution to $x = W(x) e^{W(x)}$.
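The Lambert $W$-function defined above is straightforward to evaluate. The following is a minimal sketch using Newton's method on $w e^{w} - x = 0$; the function name, the starting guess, and the tolerance are our own choices. The final line evaluates $1/(1 - W(1/e))$, which is our reading of the constant behind Corollary 6.3; that closed form should be treated as an assumption of this sketch.

```python
import math

def lambert_w(x: float, tol: float = 1e-14) -> float:
    """Unique positive solution w of x = w * exp(w), for x > 0 (Newton's method)."""
    if x <= 0:
        raise ValueError("x must be positive")
    w = math.log1p(x)  # starting guess; (1+x)log(1+x) >= x, so it lies at or above the root
    for _ in range(100):
        ew = math.exp(w)
        step = (w * ew - x) / (ew * (w + 1.0))
        w -= step
        if abs(step) <= tol * max(1.0, abs(w)):
            break
    return w

if __name__ == "__main__":
    w = lambert_w(1 / math.e)
    print("W(1/e)         =", w)
    print("1/(1 - W(1/e)) =", 1 / (1 - w))  # assumed closed form, compare with 2 log 2
```

Since $w \mapsto w e^{w}$ is convex and increasing for $w > 0$, Newton's iteration started to the right of the root decreases monotonically to it.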
Now if we define the function $f : [0, 1] \to [0,+\infty)$ by the formula then a further brief calculation shows that for any $0 \le x \le 1$, and hence by (98) that $\int_0^{1-x} f(y)\,dy = (\lambda - 1 + x) f(x)$.
If we then define the function $F : \mathcal{R}_2 \to (0,+\infty)$ by $F(x, y) := f(x) + f(y)$, we conclude that $\int_0^{1-y} F(x', y)\,dx' + \int_0^{1-x} F(x, y')\,dy' = \lambda F(x, y)$ for all $(x, y) \in \mathcal{R}_2$, and the claim now follows from Corollary 6.2.
We conjecture that a positive eigenfunction for $M_k$ exists for all $k \ge 2$, not just for $k = 2$; however, we were unable to produce any such eigenfunctions for $k > 2$. Nevertheless, Lemma 6.1 still gives us a general upper bound: Corollary 6.4 We have $M_k \le \frac{k}{k-1} \log k$ for any $k \ge 2$.
Thus for instance one has $M_2 \le 2 \log 2 = 1.38629\ldots$, which compares well with Corollary 6.3. On the other hand, Corollary 6.4 also gives $M_4 \le \frac{4}{3} \log 4 = 1.84839\ldots < 2$, so that one cannot hope to establish $\mathrm{DHL}[4, 2]$ (or $\mathrm{DHL}[3, 2]$) solely through Theorem 3.8 even when assuming GEH, and must rely instead on more sophisticated criteria for $\mathrm{DHL}[k, m]$ such as Theorem 3.12 or Theorem 3.14.
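The numerical consequences of Corollary 6.4 quoted here are quick to reproduce. The sketch below simply tabulates $\frac{k}{k-1}\log k$ for small $k$; the function name is ours.

```python
import math

def corollary_6_4_bound(k: int) -> float:
    """Upper bound k/(k-1) * log k on M_k, for k >= 2."""
    if k < 2:
        raise ValueError("k must be at least 2")
    return k / (k - 1) * math.log(k)

if __name__ == "__main__":
    for k in range(2, 7):
        print(k, round(corollary_6_4_bound(k), 5))
```

The bound stays below $2$ for $k = 2, 3, 4$ and first exceeds $2$ at $k = 5$, so the upper bound alone rules out the cases $k \le 4$ discussed above and says nothing for $k \ge 5$.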
Proof If we set $G_i : \mathcal{R}_k \to (0,+\infty)$ for $i = 1, \dots, k$ to be the functions $G_i(t_1,\dots,t_k) := \frac{k-1}{\log k} \cdot \frac{1}{1 - t_1 - \dots - t_k + k t_i}$, then direct calculation shows that $\int_0^\infty G_i(t_1,\dots,t_k)\,dt_i \le 1$ for all $t_1, \dots, t_{i-1}, t_{i+1}, \dots, t_k \ge 0$, where we extend $G_i$ by zero to all of $[0,+\infty)^k$. On the other hand, we have $\sum_{i=1}^k \frac{1}{G_i(t_1,\dots,t_k)} = \frac{k}{k-1} \log k$ for all $(t_1,\dots,t_k) \in \mathcal{R}_k$. The claim now follows from Lemma 6.1.
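Both "direct calculations" in this proof can be checked numerically. The sketch below uses a midpoint rule for the integral of $G_i$ in the variable $t_i$ over the slice of the simplex, and evaluates $\sum_i 1/G_i$ at a sample point; the function names, the sample point, and the step count are illustrative assumptions.

```python
import math

def weight(i: int, t: list[float], k: int) -> float:
    """G_i(t) = (k-1)/log k * 1/(1 - t_1 - ... - t_k + k t_i) on the simplex."""
    return (k - 1) / math.log(k) / (1.0 - sum(t) + k * t[i])

def integral_over_ti(i: int, t: list[float], k: int, steps: int = 20_000) -> float:
    """Midpoint rule for the integral of G_i in t_i over [0, 1 - (sum of the other t_j)]."""
    rest = sum(t) - t[i]
    h = (1.0 - rest) / steps
    total = 0.0
    for j in range(steps):
        u = list(t)
        u[i] = (j + 0.5) * h
        total += weight(i, u, k)
    return total * h

def reciprocal_sum(t: list[float], k: int) -> float:
    """Sum over i of 1 / G_i(t)."""
    return sum(1.0 / weight(i, t, k) for i in range(k))
```

For $k = 4$ and other coordinates summing to $0.6$, the integral evaluates to $1$ up to quadrature error, and the reciprocal sum equals $\frac{4}{3}\log 4$ at every point of the simplex.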
The upper bound arguments for M k can be extended to other quantities such as M k,ε , although the bounds do not appear to be as sharp in that case. For instance, we have the following variant of Lemma 6.4, which shows that the improvement in constants when moving from M k to M k,ε is asymptotically modest: Proposition 6.5 For any k ě 2 and 0 ď ε ă 1 we have logp2k´1q.
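By Cauchy-Schwarz, we conclude that
$$\left( \int_0^\infty F(t_1,\dots,t_k)\, dt_i \right)^2 \leq \frac{\log(2k-1)}{k-1} \int_0^\infty (1 - t_1 - \dots - t_k + k t_i) F(t_1,\dots,t_k)^2\, dt_i.$$
Integrating in $t_1,\dots,t_{i-1},t_{i+1},\dots,t_k$ and summing in $i$, we obtain the claim.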
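Remark 6.6 The same argument, using the weight $1 + a(-t_1 - \dots - t_k + k t_i)$, gives the more general inequality
$$M_{k,\varepsilon} \leq \frac{k}{a(k-1)} \log\left( k + \frac{(a(1+\varepsilon)-1)(k-1)}{1 - a(1-\varepsilon)} \right)$$
whenever $\frac{1}{1+\varepsilon} < a < \frac{1}{1-\varepsilon}$; the case $a = 1$ is Proposition 6.5, and the limiting case $a = \frac{1}{1+\varepsilon}$ recovers Corollary 6.4 when one sends $\varepsilon$ to zero.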
One can also adapt the computations in Corollary 6.3 to obtain exact expressions for M 2,ε , although the calculations are rather lengthy and will only be summarized here. For fixed 0 ă ε ă 1, the eigenfunctions F one seeks should take the form F px, yq :" f pxq`f pyq for x, y ě 0 and x`y ď 1`ε, where f pxq :" 1 xď1´ε In the regime 0 ă ε ă 1{3, one can calculate that f will (up to scalar multiples) take the form and λ is the largest root of the equation 2λ´1´ε .
In both cases, a variant of Corollary 6.2 can be used to show that M 2,ε will be equal to λ; thus for instance for 1{3 ď ε ă 1. In particular, M 2,ε increases to 2 in the limit ε Ñ 1; the lower bound lim inf εÑ1 M 2,ε ě 2 can also be established by testing with the function F px, yq :" 1 xďδ,yď1`ε´δ`1yďδ,xď1`ε´δ for some sufficiently small δ ą 0. Now we turn to lower bounds on M k , which are of more relevance for the purpose of establishing results such as Theorem 3.9. If one restricts attention to those functions F : R k Ñ R of the special form F pt 1 , . . . , t k q " f pt 1`¨¨¨`tk q for some function f : r0, 1s Ñ R then the resulting variational problem has been optimized in previous works [14], [53] (and originally in unpublished work of Conrey), giving rise to the lower bound where j k´2 is the first positive zero of the Bessel function J k´2 . This lower bound is reasonably strong for small k; for instance, when k " 2 it shows that ă 4 for all k (see [59]), so this lower bound cannot be used to force M k to be larger than 4.
In [38] the lower bound
$$M_k \geq \log k - 2 \log\log k - 2 \quad (99)$$
was established for all sufficiently large $k$. In fact, the arguments in [38] can be used to show this bound for all $k \geq 200$ (for $k < 200$, the right-hand side of (99) is either negative or undefined). Indeed, if we use the bound [38, (7.19)] with $A$ chosen so that $A^2 e^A = k$, then $3 < A < \log k$ when $k \geq 200$, hence $e^A = k/A^2 > k/\log^2 k$ and so $A \geq \log k - 2 \log\log k$. By using the bounds $\frac{A}{e^A - 1} < \frac{1}{6}$ (since $A > 3$) and $e^A/k = 1/A^2 < 1/9$, we see that the right-hand side of [38, (8.17)] exceeds $A - \frac{1}{(1 - 1/6 - 1/9)^2} \geq A - 2$, which gives (99).
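The elementary inequalities used in this argument can be checked numerically. The sketch below solves $A^2 e^A = k$ by bisection and tests each inequality quoted above for a few values of $k \geq 200$; the function name `solve_A` and the sample values of $k$ are our own choices, and the check is only illustrative, not a substitute for the proof.

```python
import math

def solve_A(k):
    """Solve A^2 * exp(A) = k for A > 0 by bisection (the left side is increasing in A)."""
    lo, hi = 0.0, max(1.0, math.log(k))  # hi^2 * exp(hi) >= k for this choice of hi
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if mid * mid * math.exp(mid) < k:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def check(k):
    """Return the truth values of the inequalities used to derive (99)."""
    A = solve_A(k)
    L = math.log(k)
    return (
        3 < A < L,
        A >= L - 2 * math.log(L),
        A / (math.exp(A) - 1) < 1 / 6,
        1 / A ** 2 < 1 / 9,
        1 / (1 - 1 / 6 - 1 / 9) ** 2 <= 2,
    )

for k in (200, 1000, 10 ** 6):
    print(k, round(solve_A(k), 4), all(check(k)))
```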
We will remove the log log k term in (99) via the following explicit estimate.
Assume the inequalities kµ ď 1´τ (104) kµ ă 1´T (105) Then one has where Z, Z 3 , W, X, V, U are the explicitly computable quantities r (108) Of course, since M whenever F : r0,`8q k Ñ R is square-integrable and supported on r0, T s k X R k . By rescaling, we conclude that k ÿ i"1 J i pF q ď rM rT s k IpF q whenever r ą 0 and F : r0,`8q k Ñ R is square-integrable and supported on r0, rT s k X r¨R k . We apply this inequality with the function F pt 1 , . . . , t k q :" 1 t1`¨¨¨`t k ďr gpt 1 q . . . gpt k q where r ą 1 is a parameter which we will eventually average over, and g is extended by zero to r0,`8q. We thus have We can interpret this probabilistically as IpF q " m k 2 PpX 1`¨¨¨`Xk ď rq where X 1 , . . . , X k are independent random variables taking values in r0, T s with probability distribution 1 m2 gptq 2 dt. In a similar fashion, we have where we adopt the convention that ş ra,bs vanishes when b ă a. In probabilistic language, we thus have where we adopt the convention that the expectation operator E applies to the entire expression to the right of that operator unless explicitly restricted by parentheses. Also by symmetry we see that J i pF q " J k pF q for all i " 1, . . . , k. Putting all this together, we conclude that where we have used symmetry to get the third equality. We conclude that Combining this with (114), we conclude that Splitting into regions where s, t are less than T or greater than T , and noting that gpsq vanishes for s ą T , we conclude that We average this from r " 1 to r " 1`τ , to conclude that Thus to prove (107), it suffices (by (106)) to establish the bounds 1 τ for all 1 ă r ď 1`τ , and 1 τ We begin with (117). Since 1 τ it suffices to show that But, from (102), (103), we see that each X i has mean µ and variance σ 2 , so S k has mean kµ and variance kσ 2 . The claim now follows from Chebyshev's inequality and (104).
Now we show (118). The quantity Y 1 prq is vanishing unless r´S k´1 ě T . Using the crude bound hpsq ď 1 pk´1qs from (115), we see that T where log`pxq :" maxplog x, 0q. We conclude that We can rewrite this as By (115), we have Also, from the elementary bound log`px`yq ď log`x`logp1`yq for any x, y ě 0, we see that We conclude that T˙u sing the elementary bound logp1`yq ď y. Symmetrizing in the X 1 , . . . , X k , we conclude that where Z 1 prq :" Er log`r´S k T Z 2 prq :" Epr´S k q1 S k ďr S k kT and Z 3 was defined in (109). For the minor error term Z 2 , we use the crude bound pr´S k q1 S k ďr S k ď r 2 4 , so For Z 1 , we upper bound log`x by a quadratic expression in x. More precisely, we observe the inequality log`x ď px´2a log a´aq 2 4a 2 log a for any a ą 1 and x P R, since the left-hand side is concave in x for x ě 1, while the right-hand side is convex in x, non-negative, and tangent to the left-hand side at x " a. We conclude that On the other hand, from (102), (103), we see that each X i has mean µ and variance σ 2 , so S k has mean kµ and variance kσ 2 . We conclude that Z 1 prq ď r pr´kµ´2aT log a´aT q 2`k σ 2 4a 2 T 2 log a for any a ą 1.
From (116), (115) we conclude that To prove (119), it thus suffices (after making the change of variables r " 1`uτ ) to show that We will exploit the averaging in u to deal with the singular nature of the factor 1 r´S k´1`p k´1qs . By Fubini's theorem, the left-hand side of (123) may be written as where Qpuq is the random variable Qpuq :" p1`uτ´S k´1´c q 2 ż r0,minp1`uτ´S k´1 ,T qs gpsq 2 1`uτ´S k´1`p k´1qs ds.
In this regime we may bound p1`uτ´S k´1´c q 2 ď c 2 , so this contribution to (123) may be bounded by du˙ds.
Observe on making the change of variables v :" 1`uτ´S k´1`p k´1qs that and so this contribution to (123) is bounded by W X, where W, X are defined in (110), (111). Now we consider the contribution to (123) when [5] 1`uτ´S k´1 ą 2c.
In this regime we bound and so this portion of ş 1 0 Z 4 r1`uτ s du may be bounded by where V, U are defined in (112), (113). The proof of the theorem is now complete.
We can now perform an asymptotic analysis in the limit $k \to \infty$ to establish Theorem 3.9(xi) and Theorem 3.11(vi). For $k$ sufficiently large, we select the parameters
$$c := \frac{1}{\log k} + \frac{\alpha}{\log^2 k}, \qquad T := \frac{\beta}{\log k}, \qquad \tau := \frac{\gamma}{\log k}$$

[5] One could obtain a small improvement to the bounds here by replacing the threshold $2c$ with a parameter to be optimized over.

The case of small and medium dimension
In this section we establish lower bounds for $M_k$ (and related quantities, such as $M_{k,\varepsilon}$) both for small values of $k$ (in particular, $k = 3$ and $k = 4$) and medium values of $k$ (in particular, $k = 50$ and $k = 54$). Specifically, we will establish Theorem 3.9(vii), Theorem 3.13, and Theorem 3.15.

Bounding M k for medium k
We begin with the problem of lower bounding $M_k$. We first formalize an observation [6] of Maynard [38] that one may restrict without loss of generality to symmetric functions:

Lemma 7.1 One has
$$M_k = \sup \frac{k J_1(F)}{I(F)}$$
where $F$ ranges over symmetric square-integrable functions on $\mathcal{R}_k$ that are not identically zero.
Proof Firstly, observe that if one replaces a square-integrable function $F: [0,+\infty)^k \to \mathbb{R}$ with its absolute value $|F|$, then $I(|F|) = I(F)$ and $J_i(|F|) \geq J_i(F)$. Thus one may restrict the supremum in (33) to non-negative functions without loss of generality. We may thus find a sequence $F_n$ of square-integrable non-negative functions on $\mathcal{R}_k$, normalized so that $I(F_n) = 1$, and such that $\sum_{i=1}^k J_i(F_n) \to M_k$ as $n \to \infty$. Now let
$$\overline{F_n}(t_1,\dots,t_k) := \frac{1}{k!} \sum_{\sigma \in S_k} F_n(t_{\sigma(1)},\dots,t_{\sigma(k)})$$
be the symmetrization of $F_n$. Since the $F_n$ are non-negative with $I(F_n) = 1$, we see that
$$I(\overline{F_n}) \geq I\left( \frac{1}{k!} F_n \right) = \frac{1}{(k!)^2}$$
and so $I(\overline{F_n})$ is bounded away from zero. Also, from (33), we know that the quadratic form
$$Q(F) := M_k I(F) - \sum_{i=1}^k J_i(F)$$
is positive semi-definite and is also invariant with respect to symmetries, and so from the triangle inequality for inner product spaces we conclude that $Q(\overline{F_n}) \leq Q(F_n)$.

By construction, $Q(F_n)$ goes to zero as $n \to \infty$, and thus $Q(\overline{F_n})$ also goes to zero. We conclude that
$$\frac{k J_1(\overline{F_n})}{I(\overline{F_n})} = \frac{\sum_{i=1}^k J_i(\overline{F_n})}{I(\overline{F_n})} \to M_k$$
as $n \to \infty$, and so
$$M_k \leq \sup \frac{k J_1(F)}{I(F)}.$$
The reverse inequality is immediate from (33), and the claim follows.

[6] The arguments in [38] are rigorous under the assumption of a positive eigenfunction as in Corollary 6.2, but the existence of such an eigenfunction remains open for $k \geq 3$.
To establish a lower bound of the form M k ą C for some C ą 0, one thus seeks to locate a symmetric function F : r0,`8q k Ñ R supported on R k such that To do this numerically, we follow [38] (see also [25] for some related ideas) and can restrict attention to functions F that are linear combinations  In order to facilitate computations, it is natural to work with bases b 1 , . . . , b n of symmetric polynomials. We have the following basic integration identity: Lemma 7.2 (Beta function identity) For any non-negative a, a 1 , . . . , a k , we have where Γpsq :" ş 8 0 t s´1 e´t dt is the Gamma function. In particular, if a 1 , . . . , a k are natural numbers, then ż we see that to establish the lemma it suffices to do so in the case a " 0. If we write then by homogeneity we have for any r ą 0, and hence on integrating r from 0 to 1 we conclude that On the other hand, if we multiply by e´r and integrate r from 0 to 8, we obtain instead Using the definition of the Gamma function, this becomes Γpa 1`¨¨¨`ak`k qX " Γpa 1`1 q . . . Γpa k`1 q and the claim follows.
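For natural number exponents, the Beta function identity can be evaluated in exact rational arithmetic. The sketch below assumes the identity in its classical Dirichlet-integral form,
$$\int_{\mathcal{R}_k} (1 - t_1 - \dots - t_k)^a t_1^{a_1} \dots t_k^{a_k}\, dt_1 \dots dt_k = \frac{a!\, a_1! \dots a_k!}{(a + a_1 + \dots + a_k + k)!},$$
which we take to be the content of Lemma 7.2 in that case; the function name `simplex_integral` is ours.

```python
from fractions import Fraction
from math import factorial

def simplex_integral(a, exps):
    """Exact integral of (1 - t_1 - ... - t_k)^a * t_1^{a_1} ... t_k^{a_k} over the simplex
    {t_i >= 0, t_1 + ... + t_k <= 1}, for natural numbers a and exps = [a_1, ..., a_k]."""
    k = len(exps)
    numerator = factorial(a)
    for e in exps:
        numerator *= factorial(e)
    return Fraction(numerator, factorial(a + sum(exps) + k))

# Volume of the simplex in three dimensions.
print(simplex_integral(0, [0, 0, 0]))
# One-dimensional case: the integral of (1-t)^2 * t over [0, 1].
print(simplex_integral(2, [1]))
```

In one dimension the formula reduces to the usual Beta integral, which gives a quick consistency check.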
Define a signature to be a non-increasing sequence $\alpha = (\alpha_1,\alpha_2,\dots,\alpha_k)$ of natural numbers; for brevity we omit zeroes, thus for instance if $k = 6$, then $(2,2,1,1,0,0)$ will be abbreviated as $(2,2,1,1)$. The number of non-zero elements of $\alpha$ will be called the length of the signature $\alpha$, and as usual the degree of $\alpha$ will be $\alpha_1 + \dots + \alpha_k$. For each signature $\alpha$, we then define the symmetric polynomials $P_\alpha = P^{(k)}_\alpha$ by the formula
$$P_\alpha(t_1,\dots,t_k) := \sum_{a:\, s(a) = \alpha} t_1^{a_1} \dots t_k^{a_k}$$
where the summation is over all tuples $a = (a_1,\dots,a_k)$ whose non-increasing rearrangement $s(a)$ is equal to $\alpha$. Thus for instance
$$P_{(1)}(t_1,\dots,t_k) = t_1 + \dots + t_k$$
$$P_{(2)}(t_1,\dots,t_k) = t_1^2 + \dots + t_k^2$$
$$P_{(1,1)}(t_1,\dots,t_k) = \sum_{1 \leq i < j \leq k} t_i t_j$$
$$P_{(2,1)}(t_1,\dots,t_k) = \sum_{1 \leq i < j \leq k} t_i^2 t_j + t_i t_j^2$$
and so forth. Clearly, the $P_\alpha$ form a linear basis for the symmetric polynomials of $t_1,\dots,t_k$. Observe that if $\alpha = (\alpha',1)$ is a signature containing 1, then one can express $P_\alpha$ as $P_{(1)} P_{\alpha'}$ minus a linear combination of polynomials $P_\beta$ with the length of $\beta$ less than that of $\alpha$. This implies that the functions $P_{(1)}^a P_\alpha$, with $a \geq 0$ and $\alpha$ avoiding 1, are also a basis for the symmetric polynomials. Equivalently, the functions $(1 - P_{(1)})^a P_\alpha$ with $a \geq 0$ and $\alpha$ avoiding 1 form a basis.
After extensive experimentation, we have discovered that a good basis $b_1,\dots,b_n$ to use for the above problem comes by setting the $b_i$ to be all the symmetric polynomials of the form $(1 - P_{(1)})^a P_\alpha$, where $a \geq 0$ and $\alpha$ consists entirely of even numbers, whose total degree $a + \alpha_1 + \dots + \alpha_k$ is less than or equal to some chosen threshold $d$. For such functions, the coefficients of $M_1$, $M_2$ can be computed exactly using Lemma 7.2.
More explicitly, first we quickly compute a look-up table for the structure constants $c_{\alpha,\beta,\gamma} \in \mathbb{Z}$ derived from simple products of the form
$$P_\alpha P_\beta = \sum_\gamma c_{\alpha,\beta,\gamma} P_\gamma$$
where $\deg(\alpha) + \deg(\beta) \leq d$. Using this look-up table we rewrite the integrands of the entries of the matrices in (126) and (127) as integer linear combinations of nearly "pure" monomials of the form $(1 - P_{(1)})^a t_1^{a_1} \dots t_k^{a_k}$. We then calculate the entries of $M_1$ and $M_2$, as exact rational numbers, using Lemma 7.2.
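To make the signatures and structure constants concrete, here is a small brute-force sketch that builds $P_\alpha$ as a dictionary of monomials and reads off the coefficients $c_{\alpha,\beta,\gamma}$ from a product. It is only an illustration for tiny $k$; the function names are ours, and the look-up table used in the actual computations is presumably built far more efficiently.

```python
from itertools import permutations

def monomial_symmetric(alpha, k):
    """P_alpha in k variables, stored as {exponent tuple: coefficient}."""
    padded = tuple(alpha) + (0,) * (k - len(alpha))
    # Each distinct rearrangement of the padded signature contributes one monomial.
    return {a: 1 for a in set(permutations(padded))}

def multiply(p, q):
    """Product of two polynomials stored as {exponent tuple: coefficient}."""
    out = {}
    for a, ca in p.items():
        for b, cb in q.items():
            e = tuple(x + y for x, y in zip(a, b))
            out[e] = out.get(e, 0) + ca * cb
    return out

def expand_in_basis(p):
    """Coefficients of a symmetric polynomial in the P_gamma basis, keyed by signature.
    The coefficient of P_gamma is the coefficient of the monomial with sorted exponents."""
    result = {}
    for a, c in p.items():
        if all(a[i] >= a[i + 1] for i in range(len(a) - 1)):
            result[tuple(x for x in a if x > 0)] = c
    return result

k = 3
p1 = monomial_symmetric((1,), k)
p11 = monomial_symmetric((1, 1), k)
print(expand_in_basis(multiply(p1, p1)))   # structure constants of P_(1) * P_(1)
print(expand_in_basis(multiply(p1, p11)))  # structure constants of P_(1) * P_(1,1)
```

For $k = 3$ the second product shows that $P_{(1)} P_{(1,1)}$ contains $P_{(1,1,1)}$ with coefficient 3, so in this example $P_{(1,1,1)}$ is recovered from $P_{(1)} P_{(1,1)}$ and the shorter signature $(2,1)$ only after dividing by that constant.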
We next run a generalized eigenvector routine on (real approximations to) $M_1$ and $M_2$ to find a vector $a'$ which nearly maximizes the quantity $C$ in (125). Taking a rational approximation $a$ to $a'$, we then do the quick (and exact) arithmetic to verify that (125) holds for some constant $C > 4$. This generalized eigenvector routine is time-intensive when the sizes of $M_1$ and $M_2$ are large (say, bigger than $1500 \times 1500$), and in practice is the most computationally intensive step of our calculation. When one does not care about an exact arithmetic proof that $C > 4$, instead one can run a test for positive-definiteness for the matrix $C M_1 - M_2$, which is usually much faster and less RAM intensive.
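The final exact-arithmetic step can be illustrated in a toy case. The sketch below takes $k = 2$ and the basis $b_i = (1 - t_1 - t_2)^i$, for which the entries of both matrices have simple closed forms, and certifies a lower bound on $M_2$ from a hand-picked rational vector. The basis, the vector $a = (1, 5/4)$ and all function names are our assumptions for illustration; this is not the configuration used for $k = 54$.

```python
from fractions import Fraction
from math import factorial

K = 2  # dimension; basis b_i = (1 - t_1 - t_2)^i for i = 0, ..., n-1

def m1_entry(i, j):
    """Integral of b_i * b_j over the simplex R_2, namely (i+j)! / (i+j+2)!."""
    return Fraction(factorial(i + j), factorial(i + j + 2))

def m2_entry(i, j):
    """K times the integral over t_2 of the product of the t_1-marginals of b_i and b_j.
    The t_1-marginal of b_i is (1 - t_2)^(i+1) / (i+1)."""
    return Fraction(K, (i + 1) * (j + 1) * (i + j + 3))

def quadratic_form(entry, a):
    """Evaluate a^T M a exactly, where M has entries entry(i, j)."""
    n = len(a)
    return sum(a[i] * a[j] * entry(i, j) for i in range(n) for j in range(n))

def certified_ratio(a):
    """Exact value of k*J_1(F)/I(F) for F = sum_i a_i b_i; a rigorous lower bound for M_2."""
    return quadratic_form(m2_entry, a) / quadratic_form(m1_entry, a)

a = [Fraction(1), Fraction(5, 4)]  # rational approximation to the top generalized eigenvector
ratio = certified_ratio(a)
print(ratio, float(ratio))
```

With this choice the ratio is the exact rational number $278/201 > 1.383$, so the two-element basis already comes close to the upper bound $2 \log 2$ from Corollary 6.4.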
Using this method, we were able to demonstrate $M_{54} > 4.00238$, thus establishing Theorem 3.9(vii). We took $d = 23$ and imposed the restriction on signatures $\alpha$ that they be composed only of even numbers. It is likely that $d = 22$ would suffice in the absence of this restriction on signatures, but we found that the gain in $M_{54}$ from lifting this restriction is typically only in the region of 0.005, whereas the execution time is increased by a large factor. We do not have a good understanding of why this particular restriction on signatures is so inexpensive in terms of the trade-off between the accuracy of $M$-values and computational complexity. The total run-time for this computation was under one hour.
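We now describe a second choice for the basis elements $b_1,\dots,b_n$, which uses the Krylov subspace method; it gives faster and more efficient numerical results than the previous basis, but does not seem to extend as well to more complicated variational problems such as $M_{k,\varepsilon}$. We introduce the linear operator $L: L^2(\mathcal{R}_k) \to L^2(\mathcal{R}_k)$ defined by
$$Lf(t_1,\dots,t_k) := \sum_{i=1}^k \int_0^{1 - \sum_{j \neq i} t_j} f(t_1,\dots,t_{i-1},t'_i,t_{i+1},\dots,t_k)\, dt'_i.$$
This is a self-adjoint and positive semi-definite operator on $L^2(\mathcal{R}_k)$. For symmetric $b_1,\dots,b_n \in L^2(\mathcal{R}_k)$, one can then write
$$M_1 = (\langle b_i, b_j \rangle)_{1 \leq i,j \leq n}, \qquad M_2 = (\langle L b_i, b_j \rangle)_{1 \leq i,j \leq n}.$$
If we then choose
$$b_i := L^{i-1} 1$$
where 1 is the unit constant function on $\mathcal{R}_k$, then the matrices $M_1$, $M_2$ take the Hankel form
$$M_1 = (\langle L^{i+j-2} 1, 1 \rangle)_{1 \leq i,j \leq n}, \qquad M_2 = (\langle L^{i+j-1} 1, 1 \rangle)_{1 \leq i,j \leq n}$$
and so can be computed entirely in terms of the $2n$ numbers $\langle L^i 1, 1 \rangle$ for $i = 0,\dots,2n-1$. The operator $L$ maps symmetric polynomials to symmetric polynomials; for instance, one has
$$L1 = k - (k-1) P_{(1)}$$
and so forth. From this and Lemma 7.2, the quantities $\langle L^i 1, 1 \rangle$ are explicitly computable rational numbers; for instance, one can calculate
$$\langle 1, 1 \rangle = \frac{1}{k!}, \quad \langle L1, 1 \rangle = \frac{2k}{(k+1)!}, \quad \langle L^2 1, 1 \rangle = \frac{k(5k+1)}{(k+2)!}, \quad \langle L^3 1, 1 \rangle = \frac{2k^2(7k+5)}{(k+3)!}$$
and so forth. With Maple, we were able to compute $\langle L^i 1, 1 \rangle$ for $i \leq 50$ and $k \leq 100$, leading to lower bounds on $M_k$ for these values of $k$, a selection of which are given in Table 3.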
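As an illustration of the Krylov approach in its smallest nontrivial case $n = 2$, the following sketch forms the $2 \times 2$ Hankel matrices from the first four moments and solves the generalized eigenvalue problem by the quadratic formula. The moment formulas assumed here are $\langle 1,1 \rangle = 1/k!$, $\langle L1,1 \rangle = 2k/(k+1)!$, $\langle L^2 1,1 \rangle = k(5k+1)/(k+2)!$ and $\langle L^3 1,1 \rangle = 2k^2(7k+5)/(k+3)!$; the function names are ours, and the real computations use many more moments.

```python
from fractions import Fraction
from math import factorial, sqrt

def moments(k):
    """<L^i 1, 1> for i = 0, 1, 2, 3 on the simplex R_k (assumed closed forms)."""
    return [
        Fraction(1, factorial(k)),
        Fraction(2 * k, factorial(k + 1)),
        Fraction(k * (5 * k + 1), factorial(k + 2)),
        Fraction(2 * k * k * (7 * k + 5), factorial(k + 3)),
    ]

def krylov_bound(k):
    """Largest root lam of det(M_2 - lam * M_1) = 0 for the n = 2 Hankel matrices
    M_1 = [[m0, m1], [m1, m2]] and M_2 = [[m1, m2], [m2, m3]]."""
    m0, m1, m2, m3 = moments(k)
    qa = m0 * m2 - m1 * m1          # coefficient of lam^2 (the Gram determinant of M_1)
    qb = m1 * m2 - m0 * m3          # coefficient of lam
    qc = m1 * m3 - m2 * m2          # constant term
    center = -qb / (2 * qa)         # exact rationals; convert to float only at the end
    radius_sq = (qb * qb - 4 * qa * qc) / (4 * qa * qa)
    return float(center) + sqrt(float(radius_sq))

for k in (2, 3, 4):
    print(k, krylov_bound(k))
```

Since the largest generalized eigenvalue is the maximum of the Rayleigh quotient over the span of $1$ and $L1$, each printed value is (up to floating-point rounding in the last step) a lower bound for $M_k$.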
Bounding $M_{k,\varepsilon}$ for medium $k$

When bounding $M_{k,\varepsilon}$, we have not been able to implement the Krylov method, because the analogue of $L^i 1$ in this context is piecewise polynomial instead of polynomial, and we were only able to compute it explicitly for very small values of $i$, such as $i = 1, 2, 3$, which are insufficient for good numerics.
Thus, we rely on the previously discussed approach, in which symmetric polynomials are used for the basis functions. Instead of computing integrals over the region $\mathcal{R}_k$ we pass to the regions $(1 \pm \varepsilon) \cdot \mathcal{R}_k$. In order to apply Lemma 7.2 over these regions, this necessitates working with a slightly different basis of polynomials. We chose to work with those polynomials of the form $(1 + \varepsilon - P_{(1)})^a P_\alpha$, where $\alpha$ is a signature with no 1's. Over the region $(1+\varepsilon) \cdot \mathcal{R}_k$, a single change of variables converts the needed integrals into those of the form in Lemma 7.2, and we can then compute the entries of $M_1$.
On the other hand, over the region $(1-\varepsilon) \cdot \mathcal{R}_k$ we instead want to work with polynomials of the form $(1 - \varepsilon - P_{(1)})^a P_\alpha$. Since $(1 + \varepsilon - P_{(1)})^a = (2\varepsilon + (1 - \varepsilon - P_{(1)}))^a$, an expansion using the binomial theorem allows us to convert from our given basis to polynomials of the needed form.
With these modifications, and calculating as in the previous section, we find that $M_{50,1/25} > 4.00124$ if $d = 25$ and $M_{50,1/25} > 4.0043$ if $d = 27$, thus establishing Theorem 3.13(i). As before, we found it optimal to restrict signatures to contain only even entries, which greatly reduced execution time while only reducing $M$ by a few thousandths.
One surprising additional computational difficulty introduced by allowing $\varepsilon > 0$ is that the "complexity" of $\varepsilon$ as a rational number affects the run-time of the calculations. We found that choosing $\varepsilon = 1/m$ (where $m \in \mathbb{Z}$ has only small prime factors) reduces this effect.
A similar argument gives $M_{51,1/50} > 4.00156$, thus establishing Theorem 3.13(xiii). In this case our polynomials were of maximum degree $d = 22$.
Code and data for these calculations may be found at www.dropbox.com/sh/0xb4xrsx4qmua7u/WOhuo2Gx7f/Polymath8b.
By making the change of variables s " t 1`t2`t3`t4 we see that p1`εq 5 15`p 1`εq 4 and similarly by making the change of variables u " t 1`t2`t3 and so (128) follows.
Thus it is possible to establish Theorem 3.13(xii′) using a cutoff function $F'$ that is also supported in the unit cube $[0,1]^4$. This allows for a slight simplification to the proof of $\mathrm{DHL}[4,2]$ assuming GEH, as one can add the additional hypothesis $S(F_{i_0}) + S(G_{i_0}) < 1$ to Theorem 3.6(ii) in that case.
Remark 7.4 By optimising in $\varepsilon$ and taking $F$ to be a symmetric polynomial of degree higher than 1, one can get slightly better lower bounds for $M_{4,\varepsilon}$; for instance setting $\varepsilon = 5/21$ and choosing $F$ to be a cubic polynomial, we were able to obtain the bound $M_{4,\varepsilon} \geq 2.05411$. On the other hand, the best lower bound for $M_{3,\varepsilon}$ that we were able to obtain was 1.91726 (taking $\varepsilon = 56/113$ and optimizing over cubic polynomials). Again, see www.dropbox.com/sh/0xb4xrsx4qmua7u/WOhuo2Gx7f/Polymath8b for the relevant code and data.

Three-dimensional cutoffs
In this section we establish Theorem 3.15. We relabel the variables pt 1 , t 2 , t 3 q as px, y, zq, thus our task is to locate a piecewise polynomial function F : r0,`8q 3 Ñ R supported on the simplex R :" " px, y, zq P r0,`8q 3 : x`y`z ď 3 2 * and symmetric in the x, y, z variables, obeying the vanishing marginal condition whenever x, y ě 0 with x`y ą 1`ε, and such that where and IpF q :" ż R F px, y, zq 2 dxdydz (132) and ε :" 1{4.
Our strategy will be as follows. We will decompose the simplex $R$ (up to null sets) into a carefully selected set of disjoint open polyhedra $P_1,\dots,P_m$ (in fact $m$ will be 60), and on each $P_i$ we will take $F(x,y,z)$ to be a low degree polynomial $F_i(x,y,z)$ (indeed, the degree will never exceed 3). The left and right-hand sides of (130) become quadratic functions in the coefficients of the $F_i$. Meanwhile, the requirement of symmetry, as well as the marginal requirement (129), imposes some linear constraints on these coefficients. In principle, this creates a finite-dimensional quadratic program, which one can try to solve numerically. However, to make this strategy practical, one needs to keep the number of linear constraints imposed on the coefficients fairly small, as compared with the total number of coefficients. To achieve this, the following properties of the polytopes $P_i$ are desirable:
• (Symmetry) If $P_i$ is a polytope in the partition, then every reflection of $P_i$ formed by permuting the $x, y, z$ coordinates should also lie in the partition.
• (Graph structure) Each polytope $P_i$ should be of the form
$$P_i = \{ (x,y,z) : (x,y) \in Q_i;\ a_i(x,y) < z < b_i(x,y) \} \quad (133)$$
where $a_i(x,y)$, $b_i(x,y)$ are linear forms and $Q_i$ is a polygon.
• (Epsilon splitting) Each $Q_i$ is contained in one of the regions $\{(x,y) : x+y < 1-\varepsilon\}$, $\{(x,y) : 1-\varepsilon < x+y < 1+\varepsilon\}$, or $\{(x,y) : 1+\varepsilon < x+y < 3/2\}$.
Observe that the vanishing marginal condition (129) now takes the form
$$\sum_{i:\, (x,y) \in Q_i} \int_{a_i(x,y)}^{b_i(x,y)} F_i(x,y,z)\, dz = 0 \quad (134)$$
for every $x, y > 0$ with $x + y > 1 + \varepsilon$. If the set $\{i : (x,y) \in Q_i\}$ is fixed, then the left-hand side of (134) is a polynomial in $x, y$ whose coefficients depend linearly on the coefficients of the $F_i$, and thus (134) imposes a set of linear conditions on these coefficients for each possible set $\{i : (x,y) \in Q_i\}$ with $x + y > 1 + \varepsilon$.
Now we describe the partition we will use.
This partition can in fact be used for all $\varepsilon$ in the interval $[1/4, 1/3]$, but the endpoint $\varepsilon = 1/4$ has some simplifications which allowed for reasonably good numerical results. To obtain the symmetry property, it is natural to split $R$ (modulo null sets) into six polyhedra $R_{xyz}, R_{xzy}, R_{yxz}, R_{yzx}, R_{zxy}, R_{zyx}$, where
$$R_{xyz} := \{(x,y,z) \in R : x+y < y+z < z+x\} = \{(x,y,z) : 0 < y < x < z;\ x+y+z \leq 3/2\}$$
and the other polyhedra are obtained by permuting the indices $x, y, z$, thus for instance
$$R_{yxz} := \{(x,y,z) \in R : y+x < x+z < z+y\} = \{(x,y,z) : 0 < x < y < z;\ y+x+z \leq 3/2\}.$$
To obtain the epsilon splitting property, we decompose $R_{xyz}$ (modulo null sets) into eight subpolytopes
$$A_{xyz} = \{(x,y,z) \in R : x+y < y+z < z+x < 1-\varepsilon\},$$
$$B_{xyz} = \{(x,y,z) \in R : x+y < y+z < 1-\varepsilon < z+x < 1+\varepsilon\},$$
$$C_{xyz} = \{(x,y,z) \in R : x+y < 1-\varepsilon < y+z < z+x < 1+\varepsilon\},$$
$$D_{xyz} = \{(x,y,z) \in R : 1-\varepsilon < x+y < y+z < z+x < 1+\varepsilon\},$$
$$E_{xyz} = \{(x,y,z) \in R : x+y < y+z < 1-\varepsilon < 1+\varepsilon < z+x\},$$
$$F_{xyz} = \{(x,y,z) \in R : x+y < 1-\varepsilon < y+z < 1+\varepsilon < z+x\},$$
$$G_{xyz} = \{(x,y,z) \in R : x+y < 1-\varepsilon < 1+\varepsilon < y+z < z+x\},$$
$$H_{xyz} = \{(x,y,z) \in R : 1-\varepsilon < x+y < y+z < 1+\varepsilon < z+x\};$$
the other five polytopes $R_{xzy}, R_{yxz}, R_{yzx}, R_{zxy}, R_{zyx}$ are decomposed similarly, leading to a partition of $R$ into $6 \times 8 = 48$ polytopes. This is almost the partition we will use; however there is a technical difficulty arising from the fact that some of the permutations of $F_{xyz}$ do not obey the graph structure property. So we will split $F_{xyz}$ further, into three pieces: the piece
$$S_{xyz} = \{(x,y,z) \in F_{xyz} : z < 1/2 + \varepsilon\},$$
together with two further pieces $T_{xyz}$ and $U_{xyz}$. Thus $R_{xyz}$ is now partitioned into ten polytopes $A_{xyz}, B_{xyz}, C_{xyz}, D_{xyz}, E_{xyz}, S_{xyz}, T_{xyz}, U_{xyz}, G_{xyz}, H_{xyz}$, and similarly for permutations of $R_{xyz}$, leading to a decomposition of $R$ into $6 \times 10 = 60$ polytopes.
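One can check empirically that the eight pieces $A_{xyz},\dots,H_{xyz}$ exhaust $R_{xyz}$ up to null sets. The sketch below classifies a point by comparing each of the sums $x+y$, $y+z$, $z+x$ with the thresholds $1-\varepsilon$ and $1+\varepsilon$, then samples random points of $R_{xyz}$; the function name `classify`, the sampling scheme, and the sample size are our choices. The two orderings not in the table would force $x+y+z > 3/2$, so the dictionary lookup should never fail.

```python
import random

EPS = 0.25

def classify(x, y, z, eps=EPS):
    """Label of the piece of R_xyz containing (x, y, z), assuming 0 <= y <= x <= z."""
    assert 0 <= y <= x <= z and x + y + z <= 1.5

    def level(s):
        # 0: below 1 - eps, 1: between 1 - eps and 1 + eps, 2: above 1 + eps
        return 0 if s < 1 - eps else (1 if s < 1 + eps else 2)

    labels = {
        (0, 0, 0): "A", (0, 0, 1): "B", (0, 1, 1): "C", (1, 1, 1): "D",
        (0, 0, 2): "E", (0, 1, 2): "F", (0, 2, 2): "G", (1, 1, 2): "H",
    }
    return labels[(level(x + y), level(y + z), level(z + x))]

random.seed(0)
counts = {}
while sum(counts.values()) < 10000:
    # Sorting three uniform samples gives y <= x <= z; reject points outside the simplex.
    y, x, z = sorted(random.uniform(0.0, 1.5) for _ in range(3))
    if x + y + z <= 1.5:
        label = classify(x, y, z)
        counts[label] = counts.get(label, 0) + 1
print(sorted(counts.items()))
```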
A symmetric piecewise polynomial function $F$ supported on $R$ can now be described (almost everywhere) by specifying a polynomial function $F\restriction_P : P \to \mathbb{R}$ for the ten polytopes $P = A_{xyz}, B_{xyz}, C_{xyz}, D_{xyz}, E_{xyz}, S_{xyz}, T_{xyz}, U_{xyz}, G_{xyz}, H_{xyz}$, and then extending by symmetry, thus for instance
$$F\restriction_{A_{yzx}}(x,y,z) = F\restriction_{A_{xyz}}(z,x,y).$$
As discussed earlier, the expressions $I(F)$, $J(F)$ can now be written as quadratic forms in the coefficients of the $F\restriction_P$, and the vanishing marginal condition (129) imposes some linear constraints on these coefficients.
Observe that the polytope $D_{xyz}$ and all of its permutations make no contribution to either the functional $J(F)$ or to the marginal condition (129), and give a non-negative contribution to $I(F)$. Thus without loss of generality we may assume that
$$F\restriction_{D_{xyz}} = 0.$$
However, the other nine polytopes $A_{xyz}, B_{xyz}, C_{xyz}, E_{xyz}, S_{xyz}, T_{xyz}, U_{xyz}, G_{xyz}, H_{xyz}$ have at least one permutation which gives a non-trivial contribution to either $J(F)$ or to (129), and cannot be easily eliminated.
The region of integration meets the polytopes A xyz , A yzx , A zyx , B xyz , B zyx , C xyz , E xyz , E zyx , S xyz , T xyz , U xyz , and G xyz .
Projecting these polytopes to the px, yq-plane, we have the diagram: This diagram is drawn to scale in the case when ε " 1{4, otherwise there is a separation between the J 5 and J 7 regions. For each of these eight regions there are eight corresponding integrals J 1 , J 2 , . . . , J 8 , thus J " 2pJ 1`¨¨¨`J8 q.

Next comes
Third is the piece We now have dealt with all integrals involving A xyz , and all remaining integrals pass through B zyx . Continuing, we have

Another component is
The most complicated piece is Here we use´ş x"2ε ş 1´ε´x y"x´2ε¯f px, yq dydx as an abbreviation for ż 2ε x"1{2 ż 1´ε´x y"x´2ε f px, yq dydx.

The parity problem
In this section we argue why the "parity barrier" of Selberg [57] prohibits sieve-theoretic methods, such as the ones in this paper, from obtaining any bound on $H_1$ that is stronger than $H_1 \leq 6$, even on the assumption of strong distributional conjectures such as the generalized Elliott-Halberstam conjecture $\mathrm{GEH}[\vartheta]$, and even if one uses sieves other than the Selberg sieve. Our discussion will be somewhat informal and heuristic in nature.
We begin by briefly recalling how the bound $H_1 \leq 6$ on GEH (i.e., Theorem 1.4(xii)) was proven. This was deduced from the claim $\mathrm{DHL}[3,2]$, or more specifically from the claim that the set
$$A := \{n \in \mathbb{N} : \text{at least two of } n, n+2, n+6 \text{ are prime}\} \quad (141)$$
was infinite.
To do this, we (implicitly) established a lower bound for all h P t0, 2, 6u and various residue classes a pqq with q ď x 1´ε and arithmetic functions f , such as the constant function f " 1, the von Mangoldt function f " Λ, or Dirichlet convolutions f " α ‹ β of the type considered in Claim 2.6. (In the presentation of this argument in previous sections, the shift by h was eliminated using the change of variables n 1 " n`h, but for the current discussion it is important that we do not use this shift.) One also required good asymptotic control on the main terms ÿ xďnď2x:pn`h,qq"1 f pn`hq.
Once one eliminates the shift by h, an inspection of these arguments reveals that they would be equally valid if one inserted a further non-negative weight ω : N Ñ R`in the summation over n. More precisely, the above sieve-theoretic argument would also deduce the lower bound that were of the same form as in the unweighted case ω " 1. Now suppose for instance that one was trying to prove the bound H 1 ď 4. A natural way to proceed here would be to replace the set A in (141) with the smaller set A 1 :" tn P N : n, n`2 are both primeu Y tn P N : n`2, n`6 are both primeu (146) and hope to establish a bound of the form ÿ n νpnq1 A 1 pnq ą 0 for a well-chosen function ν : N Ñ R`supported on rx, 2xs, by deriving this bound from suitable (averaged) upper bounds on the discrepancies (142) and control on the main terms (143). If the arguments were sieve-theoretic in nature, then (as in the H 1 ď 6 case), one could then also deduce the lower bound for any non-negative weight ω : N Ñ R`, provided that one had the same control on the weighted discrepancies (144) and weighted main terms (145) that one did on (142), (143).
We apply this observation to the weight
$$\omega(n) := (1 - \lambda(n)\lambda(n+2))(1 - \lambda(n+2)\lambda(n+6)) = 1 - \lambda(n)\lambda(n+2) - \lambda(n+2)\lambda(n+6) + \lambda(n)\lambda(n+6)$$
where $\lambda(n) := (-1)^{\Omega(n)}$ is the Liouville function. Observe that $\omega$ vanishes for any $n \in A'$, and hence
$$\sum_n \nu(n) 1_{A'}(n) \omega(n) = 0 \quad (148)$$
for any $\nu$. On the other hand, the "Möbius randomness law" (see e.g. [34]) predicts a significant amount of cancellation for any non-trivial sum involving the Möbius function $\mu$, or the closely related Liouville function $\lambda$. For instance, the expression
$$\sum_{x \leq n \leq 2x:\ n = a\ (q)} \lambda(n+h)$$
is expected to be very small (of size [7] $O(\frac{x}{q} \log^{-A} x)$ for any fixed $A$) for any residue class $a\ (q)$ with $q \leq x^{1-\varepsilon}$, and any $h \in \{0, 2, 6\}$; similarly for more complicated expressions such as
$$\sum_{x \leq n \leq 2x:\ n = a\ (q)} \lambda(n+2)\lambda(n+6)$$
or
$$\sum_{x \leq n \leq 2x:\ n = a\ (q)} \Lambda(n)\lambda(n+2)\lambda(n+6)$$
or more generally
$$\sum_{x \leq n \leq 2x:\ n = a\ (q)} f(n)\lambda(n+2)\lambda(n+6)$$
where $f$ is a Dirichlet convolution $\alpha \star \beta$ of the form considered in Claim 2.6. Similarly for expressions such as
$$\sum_{x \leq n \leq 2x:\ n = a\ (q)} f(n)\lambda(n)\lambda(n+2);$$
note from the complete multiplicativity of $\lambda$ that $(\alpha \star \beta)\lambda = (\alpha\lambda) \star (\beta\lambda)$, so if $f$ is of the form in Claim 2.6, then $f\lambda$ is also. In view of these observations (and similar observations arising from permutations of $\{0, 2, 6\}$), we conclude (heuristically, at least) that all the bounds that are believed to hold for (142), (143) should also hold (up to minor changes in the implied constants) for (144), (145). Thus, if the bound $H_1 \leq 4$ could be proven in a sieve-theoretic fashion, one should be able to conclude the bound (147), which is in direct contradiction to (148).

[7] Indeed, one might be even more ambitious and conjecture a square-root cancellation $\ll \sqrt{x/q}$ for such sums (see [40] for some similar conjectures), although such stronger cancellations generally do not play an essential role in sieve-theoretic computations.
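The key property of the weight, namely that $\omega(n) := (1 - \lambda(n)\lambda(n+2))(1 - \lambda(n+2)\lambda(n+6))$ vanishes whenever $n, n+2$ are both prime or $n+2, n+6$ are both prime, is easy to confirm by brute force on an initial range. The sketch below uses trial division for $\Omega(n)$; all function names and the range $n < 5000$ are our choices, and a finite check of this kind is of course only a sanity test.

```python
def big_omega(n):
    """Number of prime factors of n, counted with multiplicity."""
    count, p = 0, 2
    while p * p <= n:
        while n % p == 0:
            n //= p
            count += 1
        p += 1
    return count + (1 if n > 1 else 0)

def liouville(n):
    """Liouville function lambda(n) = (-1)^Omega(n)."""
    return -1 if big_omega(n) % 2 else 1

def is_prime(n):
    return n > 1 and big_omega(n) == 1

def omega_weight(n):
    """The weight (1 - lambda(n)lambda(n+2)) * (1 - lambda(n+2)lambda(n+6))."""
    return (1 - liouville(n) * liouville(n + 2)) * (1 - liouville(n + 2) * liouville(n + 6))

def in_A_prime(n):
    """Membership in the set A' of (146)."""
    return (is_prime(n) and is_prime(n + 2)) or (is_prime(n + 2) and is_prime(n + 6))

members = [n for n in range(1, 5000) if in_A_prime(n)]
print(len(members), all(omega_weight(n) == 0 for n in members))
print(sorted({omega_weight(n) for n in range(1, 5000)}))
```

Each factor of the weight is either 0 or 2, so $\omega$ takes only the values 0 and 4; it is non-negative but is supported away from $A'$, which is what makes it usable in the argument above.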

Remark 8.1 Similar arguments work for any set of the form
$$A_H := \{n \in \mathbb{N} : \exists\, n \leq p_1 < p_2 \leq n+H;\ p_1, p_2 \text{ both prime},\ p_2 - p_1 \leq 4\}$$
and any fixed $H > 0$, to prohibit any non-trivial lower bound on $\sum_n \nu(n) 1_{A_H}(n)$ from sieve-theoretic methods. Indeed, one uses the weight
$$\omega(n) := \prod_{0 \leq i < i' \leq H;\ (n+i,3) = (n+i',3) = 1;\ i'-i \leq 4} (1 - \lambda(n+i)\lambda(n+i'));$$
we leave the details to the interested reader. This seems to block any attempt to use any argument based only on the distribution of the prime numbers and related expressions in arithmetic progressions to prove $H_1 \leq 4$.
The same arguments of course also prohibit a sieve-theoretic proof of the twin prime conjecture $H_1 = 2$. In this case one can use the simpler weight $\omega(n) = 1 - \lambda(n)\lambda(n+2)$ to rule out such a proof, and the argument is essentially due to Selberg [57]. Of course, the parity barrier could be circumvented if one were able to introduce stronger sieve-theoretic axioms than the "linear" axioms currently available (which only control sums of the form (142) or (143)). For instance, if one were able to obtain non-trivial bounds for "bilinear" expressions such as $\sum_n \Lambda(n)\Lambda(n+2)$, this would soon lead to a proof of the twin prime conjecture. Unfortunately, we do not know of any plausible way to control such bilinear expressions. (Note however that there are some other situations in which bilinear sieve axioms may be established, for instance in the argument of Friedlander and Iwaniec [21] establishing an infinitude of primes of the form $a^2 + b^4$.)

Additional remarks
The proof of Theorem 3.2(xii) may be modified to establish the following variant: Proposition 9.1 Assume the generalized Elliott-Halberstam conjecture GEHrϑs for all 0 ă ϑ ă 1.
Let $0 < \varepsilon < 1/2$ be fixed. Then if $x$ is a sufficiently large multiple of 6, there exists a natural number $n$ with $\varepsilon x \leq n \leq (1-\varepsilon)x$ such that at least two of $n$, $n-2$, $x-n$ are prime. Similarly if $n-2$ is replaced by $n+2$.
Note that if at least two of $n$, $n-2$, $x-n$ are prime, then either $n-2$, $n$ are twin primes, or else at least one of $x$, $x-2$ is expressible as the sum of two primes, and Theorem 1.5 easily follows.
Proof (Sketch) We just discuss the case of n´2, as the n`2 case is similar. Observe from the Chinese remainder theorem (and the hypothesis that x is divisible by 6) that one can find a residue class b pW q such that b, b´2, x´b are all coprime to W (in particular, one has b " 1 p6q). By a routine modification of the proof of Lemma 3.4, it suffices to find a non-negative weight function ν : N Ñ Rà nd fixed quantities α ą 0 and β 1 , β 2 , β 3 ě 0, such that one has the asymptotic upper bound the asymptotic lower bounds n"b pW q νpnqθpx´nq ě Spβ 3´o p1qqB 1´k p1´2εqx ϕpW q and the inequality where S is the singular series S :" ź p|xpx´2q;pąw p p´1 .
We select ν to be of the form for various fixed coefficients c 1 , . . . , c J P R and fixed smooth compactly supported functions F j,i : r0,`8q Ñ R with j " 1, . . . , J and i " 1, . . . , 3. It is then routine [8] to verify that analogues of Theorem 3.5 and Theorem 3.6 hold for the various components of ν, with the role of x in the righthand side replaced by p1´2εqx, and the claim then follows by a suitable modification of Theorem 3.14, taking advantage of the function F constructed in Theorem 3.15. [8] One new technical difficulty here is that some of the various moduli rdj , d 1 j s arising in these arguments are not required to be coprime at primes p ą w dividing x or x´2; this requires some modification to Lemma 4.1 that ultimately leads to the appearance of the singular series S. However, these modifications are quite standard, and we do not give the details here.
It is likely that the bounds in Theorem 1.4 can be improved further by refining the sieve-theoretic methods employed in this paper, with the exception of part (xii) for which the parity problem prevents further improvement, as discussed in Section 8. We list some possible avenues to such improvements as follows: 1 In Theorem 3.13, the bound M k,ε ą 4 was obtained for some ε ą 0 and k " 50. It is possible that k could be lowered slightly, for instance to k " 49, by further numerical computations, but we were only barely able to establish the k " 50 bound after two weeks of computation. However, there may be a more efficient way to solve the required variational problem (e.g. by selecting a more efficient basis than the symmetric monomial basis) that would allow one to advance in this direction; this would improve the bound H 1 ď 246 slightly. Extrapolation of existing numerics also raises the possibility that M 53 exceeds 4, in which case the bound of 270 in Theorem 1.4(vii) could be lowered to 264. 2 To reduce k (and thus H 1 ) further, one could try to solve another variational problem, such as the one arising in Theorem 3.10 or in Theorem 3.14, rather than trying to lower bound M k or M k,ε . It is also possible to use the more complicated versions of MPZr , δs established in [52] (in which the modulus q is assumed to be densely divisible rather than smooth) to replace the truncated simplex appearing in Theorem 3.10 with a more complicated region (such regions also appear implicitly in [52, §4.5]). However, in the medium-dimensional setting k « 50, we were not able to accurately and rapidly evaluate the various integrals associated to these variational problems when applied to a suitable basis of functions. 
One key difficulty here is that whereas polynomials appear to be an adequate choice of basis for the $M_k$, an analysis of the Euler-Lagrange equation reveals that one should use piecewise polynomial basis functions instead for more complicated variational problems such as the $M_{k,\varepsilon}$ problem (as was done in the three-dimensional case in Section 7.4), and these are difficult to work with in medium dimensions. From our experience with the low $k$ problems, it looks like one should allow these piecewise polynomials to have relatively high degree on some polytopes, low degree on other polytopes, and vanish completely on yet further polytopes [9], but we do not have a systematic understanding of what the optimal placement of degrees should be.
3. In Theorem 3.14, the function $F$ was required to be supported in the simplex $\frac{k}{k-1} \cdot R_k$. However, one can consider functions $F$ supported in other regions $R$, subject to the constraint that all elements of the sumset $R + R$ lie in a region treatable by one of the cases of Theorem 3.6. This could potentially lead to other optimization problems that lead to superior numerology, although again it appears difficult to perform efficient numerics for such problems in the medium $k$ regime $k \approx 50$. One possibility would be to adopt a "free boundary" perspective, in which the support of $F$ is not fixed in advance, but is allowed to evolve by some iterative numerical scheme.
4. To improve the bounds on $H_m$ for $m = 2, 3, 4, 5$, one could seek a better lower bound on $M_k$ than the one provided by Theorem 6.7; one could also try to lower bound more complicated quantities such as $M_{k,\varepsilon}$.
5. One could attempt to improve the range of $\varpi, \delta$ for which estimates of the form $\mathrm{MPZ}[\varpi, \delta]$ are known to hold, which would improve the results of Theorem 1.4(ii)-(vi).
For instance, we believe that the condition $600\varpi + 180\delta < 7$ in Theorem 2.5 could be improved slightly to $1080\varpi + 330\delta < 13$ by refining the arguments in [52], but this requires a hypothesis of square root cancellation in a certain four-dimensional exponential sum over finite fields, which we have thus far been unable to establish rigorously. Another direction to pursue would be to improve the $\delta$ parameter, or to otherwise relax the requirement of smoothness in the moduli, in order to reduce the need to pass to a truncation of the simplex $R_k$, which is the primary reason why the $m = 1$ results are currently unable to use the existing estimates of the form $\mathrm{MPZ}[\varpi, \delta]$. Another speculative possibility is to seek $\mathrm{MPZ}[\varpi, \delta]$ type estimates which only control distribution for a positive proportion of smooth moduli, rather than for all moduli, and then to design a sieve $\nu$ adapted to just that proportion of moduli (cf. [17]). Finally, there may be a way to combine the arguments currently used to prove $\mathrm{MPZ}[\varpi, \delta]$ with the automorphic forms (or "Kloostermania") methods used to prove nontrivial equidistribution results with respect to a fixed modulus, although we do not have any ideas on how to actually achieve such a combination.
6. It is also possible that one could tighten the argument in Lemma 3.4, for instance by establishing a non-trivial lower bound on the portion of the sum $\sum_n \nu(n)$ when $n+h_1, \dots, n+h_k$ are all composite, or a sufficiently strong upper bound on the pair correlations $\sum_n \theta(n+h_i)\theta(n+h_j)$ (see [2, §6] for a recent implementation of this latter idea). However, our preliminary attempts to exploit these adjustments suggested that the gain from the former idea would be exponentially small in $k$, whereas the gain from the latter would also be very slight (perhaps reducing $k$ by $O(1)$ in large $k$ regimes, e.g. $k \geq 5000$).
[9] In particular, the optimal choice $F$ for $M_{k,\varepsilon}$ should vanish on the polytope $\{(t_1, \dots, t_k) \in (1+\varepsilon) \cdot R_k : \dots\}$
7. All of our sieves used are essentially of Selberg type, being the square of a divisor sum. We have experimented with a number of non-Selberg type sieves (for instance trying to exploit the obvious positivity of $1 - \sum_{p \leq x : p | n} \frac{\log p}{\log x}$ when $n \leq x$); however, none of these variants offered a numerical improvement over the Selberg sieve. Indeed it appears that after optimizing the cutoff function $F$, the Selberg sieve is in some sense a "local maximum" in the space of non-negative sieve functions, and one would need a radically different sieve to obtain numerically superior results.
8. Our numerical bounds for the diameter $H(k)$ of the narrowest admissible $k$-tuple are known to be exact for $k \leq 342$, but there is scope for some slight improvement for larger values of $k$, which would lead to some improvements in the bounds on $H_m$ for $m = 2, 3, 4, 5$. However, we believe that our bounds on $H_m$ are already fairly close to optimal (e.g. within 10%), so there is only a limited amount of gain to be obtained solely from this component of the argument.

Narrow admissible tuples
In this section we outline the methods used to obtain the numerical bounds on $H(k)$ given by Theorem 3.3, which are reproduced below: … ˚(x) = k is precisely $H(k+1)$. Table 1 of [10] lists these largest $x$ values for $2 \leq k \leq 170$, and we find that $H(50) = 246$, $H(51) = 252$, and $H(54) = 270$. Admissible tuples that realize these bounds are shown in Figures 1, 2 and 3.

10.2 $H(k)$ bounds for mid-range $k$
As previously noted, exact values for $H(k)$ are known only for $k \leq 342$. The upper bounds on $H(k)$ for the five cases (5)-(9) were obtained by constructing admissible $k$-tuples using techniques developed during the first part of the Polymath8 project. These are described in detail in Section 3 of [53], but for the sake of completeness we summarize the most relevant methods here.

Fast admissibility testing
A key component of all our constructions is the ability to efficiently determine whether a given $k$-tuple $\mathcal{H} = (h_1, \dots, h_k)$ is admissible. We say that $\mathcal{H}$ is admissible modulo $p$ if its elements do not form a complete set of residues modulo $p$. Any $k$-tuple $\mathcal{H}$ is automatically admissible modulo all primes $p > k$, since a $k$-tuple cannot occupy more than $k$ residue classes; thus we only need to test admissibility modulo primes $p \leq k$.
A simple way to test admissibility modulo $p$ is to enumerate the elements of $\mathcal{H}$ modulo $p$ and keep track of which residue classes have been encountered in a table with $p$ boolean-valued entries. Assuming the elements of $\mathcal{H}$ have absolute value bounded by $O(k \log k)$ (true of all the tuples we consider), this approach yields a total bit-complexity of $O((k^2/\log k)\, M(\log k))$, where $M(n)$ denotes the complexity of multiplying two $n$-bit integers, which, up to a constant factor, also bounds the complexity of division with remainder. Applying the Schönhage-Strassen bound $M(n) = O(n \log n \log\log n)$ from [56], this is $O(k^2 \log\log k \log\log\log k)$, essentially quadratic in $k$.
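The table-based test can be sketched in a few lines of Python. This is a minimal illustration, not the implementation used for the computations in this paper; the function names `primes_up_to`, `occupies_all_classes` and `is_admissible` are hypothetical. The sketch tests every prime $p \leq k$, which also covers the case where $k$ itself is prime.

```python
from math import isqrt


def primes_up_to(n):
    """Return all primes <= n using a basic sieve of Eratosthenes."""
    if n < 2:
        return []
    flags = bytearray([1]) * (n + 1)
    flags[0] = flags[1] = 0
    for p in range(2, isqrt(n) + 1):
        if flags[p]:
            # Clear p*p, p*p + p, ... up to n.
            flags[p * p :: p] = bytearray(len(range(p * p, n + 1, p)))
    return [i for i, f in enumerate(flags) if f]


def occupies_all_classes(H, p):
    """True if the elements of H hit every residue class modulo p."""
    seen = [False] * p  # table with p boolean-valued entries
    for h in H:
        seen[h % p] = True  # Python's % is non-negative for p > 0
    return all(seen)


def is_admissible(H):
    """True if H is admissible modulo every prime p <= len(H)."""
    return not any(occupies_all_classes(H, p) for p in primes_up_to(len(H)))
```

For example, `is_admissible((0, 2, 6))` holds, while `(0, 2, 4)` fails at $p = 3$.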
This approach can be improved by observing that for most of the primes $p < k$ there are likely to be many unoccupied residue classes modulo $p$. In order to verify admissibility at $p$ it is enough to find one of them, and we typically do not need to check them all in order to do so. Using a heuristic model that assumes the elements of $\mathcal{H}$ are approximately equidistributed modulo $p$, one can determine a bound $m < p$ such that $k$ random elements of $\mathbb{Z}/p\mathbb{Z}$ are unlikely to occupy all of the residue classes in $[0, m]$. We represent the $k$-tuple $\mathcal{H}$ as a boolean vector $B = (b_0, \dots, b_{h_k - h_1})$ in which $b_i = 1$ if and only if $h_1 + i \in \mathcal{H}$; whether a given residue class modulo $p$ is occupied can then be read off from the entries of $B$ whose indices lie in that class. The key point is that when $p < k$ is large, say $p > (1+\epsilon)k/\log k$, we can choose $m$ so that we only need to examine a small subset of the entries in $B$. Indeed, for primes $p > k/c$ (for any constant $c$), we can take $m = O(1)$ and only need to examine $O(\log k)$ elements of $B$ (assuming its total size is $O(k \log k)$, which applies to all the tuples we consider here).
Figure 3. Admissible 54-tuple of diameter 270: 0, 4, 10, 18, 24, 28, 30, 40, 54, 58, 60, 70, 72, 82, 84, 88, 94, 102, 108, 112, 114, 118, 124, 130, 132, 138, 142, 150, 154, 160, 168, 172, 174, 180, 184, 192, 198, 202, 208, 214, 220, 222, 228, 234, 238, 240, 244, 250, 252, 258, 262, 264, 268, 270.
Of course it may happen that $\mathcal{H}$ occupies every residue class in $[0, m]$ modulo $p$. In this case we revert to our original approach of enumerating the elements of $\mathcal{H}$ modulo $p$, but we expect this to happen for only a small proportion of the primes $p < k$. Heuristically, this reduces the complexity of admissibility testing by a factor of $O(\log k)$, making it sub-quadratic. In practice we find this approach to be much more efficient than the straightforward method when $k$ is large. See [52, §3.1] for further details.
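The bitmap-based shortcut with its fallback might look as follows in Python. This is a sketch under stated assumptions: the names are hypothetical, residue classes are taken relative to the smallest element of the tuple (admissibility is translation invariant), and the fixed parameter `scan_bound` is a stand-in for the bound $m$ that the text derives from a heuristic model.

```python
from math import isqrt


def primes_up_to(n):
    """Return all primes <= n using a basic sieve of Eratosthenes."""
    if n < 2:
        return []
    flags = bytearray([1]) * (n + 1)
    flags[0] = flags[1] = 0
    for p in range(2, isqrt(n) + 1):
        if flags[p]:
            flags[p * p :: p] = bytearray(len(range(p * p, n + 1, p)))
    return [i for i, f in enumerate(flags) if f]


def is_admissible_fast(H, scan_bound=3):
    """Admissibility test that first looks for a free class among 0..scan_bound.

    scan_bound is an assumed constant, not the heuristically tuned bound m.
    """
    H = sorted(H)
    base = H[0]
    # bitmap[i] == 1 exactly when base + i belongs to the tuple.
    bitmap = bytearray(H[-1] - base + 1)
    for h in H:
        bitmap[h - base] = 1
    for p in primes_up_to(len(H)):
        m = min(scan_bound, p - 1)
        # Class r is unoccupied when b_r, b_{r+p}, b_{r+2p}, ... are all zero.
        if any(not any(bitmap[r::p]) for r in range(m + 1)):
            continue  # found a free class, so H is admissible modulo p
        # Fallback: classes 0..m are all occupied; enumerate H modulo p.
        seen = bytearray(p)
        for h in H:
            seen[(h - base) % p] = 1
        if all(seen):
            return False
    return True
```

Each scan of a single class touches roughly $(h_k - h_1)/p$ entries of the bitmap, which is where the saving for large $p$ comes from.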

Sieving methods
Our techniques for constructing admissible $k$-tuples all involve sieving an integer interval $[s, t]$ of residue classes modulo primes $p < k$ and then selecting an admissible $k$-tuple from the survivors. There are various approaches one can take, depending on the choice of interval and the residue classes to sieve. We list four of these below, starting with the classical sieve of Eratosthenes and proceeding to more modern variations.
with $m$ as small as possible. If we sieve the residue class $0 \pmod p$ for all primes $p \leq k$ we have $m = \pi(k)$ and $p_{m+1} > k$. In this case no admissibility testing is required, since the residue class $0 \pmod p$ is unoccupied for all $p \leq k$. Applying the Prime Number Theorem in the forms
$$p_k = k \log k + k \log\log k - k + O\left(\frac{k \log\log k}{\log k}\right), \qquad \pi(x) = \frac{x}{\log x} + O\left(\frac{x}{\log^2 x}\right),$$

this construction yields the upper bound
$$H(k) \leq k \log k + k \log\log k - k + o(k).$$
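One way to check this bound is sketched below. It assumes, as the condition $p_{m+1} > k$ suggests but the surviving text does not state outright, that the tuple in question consists of the $k$ consecutive primes $p_{m+1}, \dots, p_{m+k}$ with $m = \pi(k)$, so that its diameter is $p_{m+k} - p_{m+1}$. The symbol $n$ is introduced here only for this calculation.

```latex
\begin{align*}
m &= \pi(k) = \frac{k}{\log k} + O\!\left(\frac{k}{\log^2 k}\right), \qquad n := m + k, \\
n \log n &= k \log k + m \log k + O(m) = k \log k + k + o(k), \\
n \log\log n &= k \log\log k + o(k), \\
p_{m+k} &= n \log n + n \log\log n - n + o(n) = k \log k + k \log\log k + o(k), \\
p_{m+k} - p_{m+1} &< p_{m+k} - k = k \log k + k \log\log k - k + o(k).
\end{align*}
```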
As an optimization, rather than sieving modulo every prime $p \leq k$ we instead sieve modulo increasing primes $p$ and stop as soon as the first $k$ survivors form an admissible tuple. This will typically happen for some $p_m < k$.
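The early-stopping variant can be illustrated with the following Python sketch. It assumes the sieved interval is $[2, x]$ (the sentence defining the interval is not available here), and the starting value of `x` and all function names are hypothetical choices made for the example.

```python
from math import isqrt, log


def primes_up_to(n):
    """Return all primes <= n using a basic sieve of Eratosthenes."""
    if n < 2:
        return []
    flags = bytearray([1]) * (n + 1)
    flags[0] = flags[1] = 0
    for p in range(2, isqrt(n) + 1):
        if flags[p]:
            flags[p * p :: p] = bytearray(len(range(p * p, n + 1, p)))
    return [i for i, f in enumerate(flags) if f]


def is_admissible(H):
    """True if H misses at least one residue class modulo each prime p <= len(H)."""
    return all(len({h % p for h in H}) < p for p in primes_up_to(len(H)))


def eratosthenes_tuple(k):
    """Sieve 0 (mod p) from [2, x] for increasing p; stop at the first admissible k-tuple."""
    if k < 2:
        raise ValueError("this sketch expects k >= 2")
    x = max(30, int(4 * k * log(k + 1)))  # assumed starting size of the interval
    while True:
        survivors = list(range(2, x + 1))
        for p in primes_up_to(k):
            survivors = [n for n in survivors if n % p != 0]
            if len(survivors) < k:
                break  # interval too short; enlarge it and start again
            if is_admissible(survivors[:k]):
                return survivors[:k]
        x *= 2
```

For $k = 5$ this stops after sieving modulo 2 and 3 only, returning a tuple of diameter 12.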
• Shifted Schinzel sieve. As noted by Schinzel in [55], in the Hensley-Richards sieve it is slightly better to sieve $1 \pmod 2$ rather than $0 \pmod 2$; this leaves unsieved powers of 2 near the center of the interval $[-x/2, x/2]$ that would otherwise be removed (more generally, one can sieve $1 \pmod p$ for many small primes $p$, but we did not). Additionally, we find that shifting the interval $[-x/2, x/2]$ can yield significant improvements (one can also view this as changing the choices of residue classes). This leads to the following approach: we sieve an interval $[s, s+x]$ of odd integers and multiples of odd primes $p \leq p_m$, where $x$ is large enough to ensure at least $k$ survivors, and $m$ is large enough to ensure that the survivors form an admissible tuple, with $x$ and $m$ minimal subject to these constraints. A tuple of exactly $k$ survivors is then chosen to minimize the diameter. By varying $s$ and comparing the results, we can choose a starting point $s \in [-x/2, x/2]$ that yields the smallest final diameter. For large $k$ we typically find $s \approx k$ is optimal, as opposed to $s \approx -(k/2)\log k$ in the Hensley-Richards sieve.
• Shifted greedy sieve. As a further optimization, we can allow greater freedom in the choice of residue class to sieve. We begin as in the shifted Schinzel sieve, but for primes $p \leq p_m$ that exceed $2\sqrt{k \log k}$, rather than sieving $0 \pmod p$ we choose a minimally occupied residue class $a \pmod p$. As above we sieve the interval $[s, s+x]$ for varying values of $s \in [-x/2, x/2]$ and select the best result, but unlike the shifted Schinzel sieve, for large $k$ we typically choose $s \approx -(k/\log k - k)/2$. We remark that while one might suppose that it would be better to choose a minimally occupied residue class at all primes, not just the larger ones, we find that this is generally not the case. Fixing a structured choice of residue classes for the small primes avoids the erratic behavior that can result from making greedy choices too soon (see [28, Fig. 1] for an illustration of this).
Table 4 lists the bounds obtained by applying each of these techniques (in the online version of this paper, each table entry includes a link to the constructed tuple). To the admissible tuples obtained using the shifted greedy sieve we additionally applied various local optimizations that are detailed in [52, §3.6]. As can be seen in the table, the additional improvement due to these local optimizations is quite small compared to that gained by using better sieving algorithms, especially when $k$ is large. Table 4 also lists the value $\lfloor k \log k + k \rfloor$ that we conjecture as an upper bound on $H(k)$ for all sufficiently large $k$.
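The shifted Schinzel sieve and the greedy choice of residue class can be illustrated by the simplified Python sketch below. It is not the implementation used to produce Table 4. For a given shift $s$ it takes the first $k$ survivors to the right of $s$ (so $x$ is implicitly minimal) and increases $m$ until they are admissible; all function names are hypothetical, and `least_occupied_class` only shows the greedy selection step, breaking ties in favour of the smallest class (an assumption).

```python
from math import isqrt


def primes_up_to(n):
    """Return all primes <= n using a basic sieve of Eratosthenes."""
    if n < 2:
        return []
    flags = bytearray([1]) * (n + 1)
    flags[0] = flags[1] = 0
    for p in range(2, isqrt(n) + 1):
        if flags[p]:
            flags[p * p :: p] = bytearray(len(range(p * p, n + 1, p)))
    return [i for i, f in enumerate(flags) if f]


def is_admissible(H):
    """True if H misses at least one residue class modulo each prime p <= len(H)."""
    return all(len({h % p for h in H}) < p for p in primes_up_to(len(H)))


def shifted_schinzel_tuple(k, s):
    """First k survivors >= s after sieving 1 (mod 2) and 0 (mod p) for the first m odd primes."""
    odd_primes = primes_up_to(k)[1:]
    for m in range(len(odd_primes) + 1):
        sieve_primes = odd_primes[:m]
        survivors, n = [], s
        while len(survivors) < k:
            if n % 2 == 0 and all(n % p for p in sieve_primes):
                survivors.append(n)
            n += 1
        if is_admissible(survivors):
            return survivors
    raise AssertionError("unreachable: sieving every prime <= k gives an admissible tuple")


def best_shift(k, shifts):
    """Try each shift s and keep the tuple of smallest diameter."""
    return min((shifted_schinzel_tuple(k, s) for s in shifts),
               key=lambda t: t[-1] - t[0])


def least_occupied_class(survivors, p):
    """Greedy step: a residue class modulo p containing the fewest survivors."""
    counts = [0] * p
    for n in survivors:
        counts[n % p] += 1
    return min(range(p), key=counts.__getitem__)
```

For instance, `shifted_schinzel_tuple(3, 0)` returns `[2, 4, 8]`, found after sieving only $1 \pmod 2$ and $0 \pmod 3$.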

10.3 $H(k)$ bounds for large $k$
The upper bounds on $H(k)$ for the last two cases (10) and (11) were obtained using modified versions of the techniques described above that are better suited to handling very large values of $k$. These entail three types of optimizations that are summarized in the subsections below.

Improved time complexity
As noted above, the complexity of admissibility testing is quasi-quadratic in $k$. Each of the techniques listed in §10.2 involves optimizing over a parameter space whose size is at least quasi-linear in $k$, leading to an overall quasi-cubic time complexity for constructing a narrow admissible $k$-tuple; this makes it impractical to handle $k > 10^9$. We can reduce this complexity in a number of ways. First, we can combine parameter optimization and admissibility testing. In both the sieve of Eratosthenes and Hensley-Richards sieves, taking $m = k$ guarantees an admissible $k$-tuple. For $m < k$, if the corresponding $k$-tuple is inadmissible, it is typically because it is inadmissible modulo the smallest prime $p_{m+1}$ that appears in the tuple. This suggests a heuristic approach in which we start with $m = k$, and then iteratively reduce $m$, testing the admissibility of each $k$-tuple modulo $p_{m+1}$ as we go, until we can proceed no further. We then verify that the last $k$-tuple that was admissible modulo $p_{m+1}$ is also admissible modulo all primes $p > p_{m+1}$ (we know it is admissible at all primes $p \leq p_m$ because we have sieved a residue class for each of these primes). We expect this to be the case, but if not we can increase $m$ as required. Heuristically this yields a quasi-quadratic running time, and in practice it takes less time to find the minimal $m$ than it does to verify the admissibility of the resulting $k$-tuple.
Second, we can avoid a complete search of the parameter space. In the case of the shifted Schinzel sieve, for example, we find empirically that taking $s = k$ typically yields an admissible $k$-tuple whose diameter is not much larger than that achieved by an optimal choice of $s$; we can then simply focus on optimizing $m$ using the strategy described above. Similar comments apply to the shifted greedy sieve.

Improved space complexity
We expect a narrow admissible $k$-tuple to have diameter $d = (1+o(1))k \log k$. Whether we encode this tuple as a sequence of $k$ integers, or as a bitmap of $d+1$ bits, as in the fast admissibility testing algorithm, we will need approximately $k \log k$ bits. For $k > 10^9$ this may be too large to conveniently fit in memory. We can reduce the space to $O(k \log\log k)$ bits by encoding the $k$-tuple as a sequence of $k-1$ gaps; the average gap between consecutive entries has size $\log k$ and can be encoded in $O(\log\log k)$ bits. In practical terms, for the sequences we constructed almost all gaps can be encoded using a single 8-bit byte for each gap.
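A byte-per-gap encoding of this kind can be sketched as follows. The escape scheme for gaps that do not fit in one byte (a marker byte `0xFF` followed by an 8-byte little-endian integer) is an assumption made for this example, as are the function names; the text does not specify how exceptional gaps were stored.

```python
ESCAPE = 0xFF  # hypothetical marker for a gap too large for a single byte


def encode_gaps(H):
    """Encode a non-empty tuple as (first element, bytes holding the k-1 gaps)."""
    H = sorted(H)
    out = bytearray()
    for prev, cur in zip(H, H[1:]):
        gap = cur - prev
        if gap < ESCAPE:
            out.append(gap)  # common case: one byte per gap
        else:
            out.append(ESCAPE)
            out += gap.to_bytes(8, "little")  # rare case: escaped 8-byte gap
    return H[0], bytes(out)


def decode_gaps(first, data):
    """Rebuild the sorted tuple from its first element and the encoded gaps."""
    H = [first]
    i = 0
    while i < len(data):
        gap = data[i]
        i += 1
        if gap == ESCAPE:
            gap = int.from_bytes(data[i:i + 8], "little")
            i += 8
        H.append(H[-1] + gap)
    return H
```

A tuple whose gaps are all below 255 therefore occupies exactly $k-1$ bytes in addition to its first element.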
One can further reduce space by partitioning the sieving interval into windows. For the construction of our largest tuples, we used windows of size $O(\sqrt{d})$ and converted to a gap-sequence representation only after sieving at all primes up to an $O(\sqrt{d})$ bound.

Parallelization
With the exception of the greedy sieve, all the techniques described above are easily parallelized. The greedy sieve is more difficult to parallelize because the choice of a minimally occupied residue class modulo $p$ depends on the set of survivors obtained after sieving modulo primes less than $p$. To address this issue we modified the greedy approach to work with batches of consecutive primes of size $n$, where $n$ is a multiple of the number of parallel threads of execution. After sieving fixed residue classes modulo all small primes $p < 2\sqrt{k \log k}$, we determine minimally occupied residue classes for the next $n$ primes in parallel, sieve these residue classes, and then proceed to the next batch of $n$ primes.
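The batching idea can be sketched in Python with a thread pool. This is only an illustration of the control flow: the names are hypothetical, ties are broken in favour of the smallest class (an assumption), and CPython threads would not by themselves give the speed-up of a native multi-threaded implementation. Within a batch, every prime sees the same snapshot of survivors, which is what makes the per-prime work independent.

```python
from concurrent.futures import ThreadPoolExecutor
from functools import partial


def least_occupied_class(survivors, p):
    """A residue class modulo p containing the fewest survivors."""
    counts = [0] * p
    for n in survivors:
        counts[n % p] += 1
    return min(range(p), key=counts.__getitem__)


def batched_greedy_sieve(survivors, primes, batch_size, workers=4):
    """Greedy sieve in which each batch of primes picks its classes in parallel."""
    survivors = list(survivors)
    chosen = []
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for start in range(0, len(primes), batch_size):
            batch = primes[start:start + batch_size]
            # All primes in the batch are scored against the same snapshot.
            pick = partial(least_occupied_class, survivors)
            classes = list(pool.map(pick, batch))
            # Only after the whole batch is scored do we remove the classes.
            for p, a in zip(batch, classes):
                survivors = [n for n in survivors if n % p != a]
                chosen.append((p, a))
    return survivors, chosen
```

With `batch_size=1` this reduces to the sequential greedy sieve, since each prime then sees the survivors left by all earlier primes.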
In addition to the techniques described above, we also considered a modified Schinzel sieve in which we check admissibility modulo each successive prime $p$ before sieving multiples of $p$, in order to verify that sieving modulo $p$ is actually necessary. For values of $p$ close to but slightly less than $p_m$ it will often be the case that the set of survivors is already admissible modulo $p$, even though it does contain multiples of $p$ (because some other residue class is unoccupied). As with the greedy sieve, when using this approach we sieve residue classes in batches of size $n$ to facilitate parallelization.
Table 5 lists the bounds obtained for the two largest values of $k$. For $k = 75\,845\,707$ the best results were obtained with a shifted greedy sieve that was modified for parallel execution as described above, using the fixed shift parameter $s = -(k \log k - k)/2$. A list of the sieved residue classes is available at math.mit.edu/~drew/greedy_75845707_1431556072.txt. This file contains values of $k$, $s$, $d$, and $m$, along with a list of prime indices $n_i > m$ and residue classes $r_i$ such that sieving the interval $[s, s+d]$ of odd integers, multiples of $p_n$ for $1 < n \leq m$, and $r_i$ modulo $p_{n_i}$ yields an admissible $k$-tuple.

Results for large k
For $k = 3\,473\,955\,908$ we did not attempt any form of greedy sieving due to practical limits on the time and computational resources available. The best results were obtained using a modified Schinzel sieve that avoids unnecessary sieving, as described above, using the fixed shift parameter $s = k_0$. A list of the sieved residue classes is available at math.mit.edu/~drew/schinzel_3473955908_80550202480.txt. This file contains values of $k$, $s$, $d$, and $m$, along with a list of prime indices $n_i > m$ such that sieving the interval $[s, s+d]$ of odd integers, multiples of $p_n$ for $1 < n \leq m$, and multiples of $p_{n_i}$ yields an admissible $k$-tuple.
Source code for our implementation is available at math.mit.edu/~drew/ompadm_v0.5.tar; this code can be used to verify the admissibility of both the tuples listed above.