1 Introduction

1.1 Results

Our goal is to improve the following classic result of Birch.

Theorem 1.1

(Birch [3]) Let \(d\ge 2\) and let \(F_1(\mathbf {x}),\ldots , F_R(\mathbf {x})\) be homogeneous forms of degree d, with integer coefficients in n variables \(x_1,\ldots ,x_n\). Let \(\mathscr {B}\) be a box in \(\mathbb {R}^n\), contained in the box \([-1,1]^n\) and having sides of length at most 1 which are parallel to the coordinate axes. For each \( P\ge 1\), write

$$\begin{aligned} N_{F_1,\ldots ,F_R}(P) = \# \left\{ \mathbf {x}\in \mathbb {Z}^n :\mathbf {x}/P\in \mathscr {B},\, F_1(\mathbf {x})=0, \ldots ,F_R(\mathbf {x})=0\right\} . \end{aligned}$$

Let W be the projective variety cut out in \(\mathbb {P}_\mathbb {Q}^{n-1}\) by the condition that the \(R\times n\) Jacobian matrix \(\left( \partial F_i(\mathbf {x})/ \partial x_j\right) _{ij}\) has rank less than R. If

$$\begin{aligned} n-1-\dim W > (d-1)2^{d-1}R(R+1), \end{aligned}$$
(1.1)

then for all \(P\ge 1\), some \(\mathfrak {I}\ge 0\) depending only on the \(F_i\) and \(\mathscr {B}\) and some \(\mathfrak {S}\ge 0\) depending only on the \(F_i\), we have

$$\begin{aligned} N_{F_1,\ldots ,F_R}(P) = \mathfrak {I}\mathfrak {S}P^{n-dR} + O\left( P^{n-dR-\delta }\right) \end{aligned}$$
(1.2)

where the implicit constant depends only on the forms \(F_i\) and \(\delta \) is a positive constant depending only on d and R. If the variety \(V(F_1,\ldots ,F_R)\) cut out in \(\mathbb {P}_\mathbb {Q}^{n-1}\) by the forms \(F_i\) has dimension \(n-1-R\) and a smooth point over \(\mathbb {Q}_p\) for each prime p then \(\mathfrak {S}>0\), and if it has dimension \(n-1-R\) and a smooth real point whose homogeneous co-ordinates lie in \(\mathscr {B}\) then \(\mathfrak {I}>0\).

Here \(\mathfrak {I},\) \(\mathfrak {S}\) are the usual singular integral and series; see (2.35) and (2.25) below.

We focus in particular on weakening the hypothesis (1.1) on the number of variables, when the number of forms R is greater than one. Previous improvements of this type have required \(R=1\) or 2. Our first result, proved in Sect. 4, is as follows:

Theorem 1.2

When \(d=2\), we may replace (1.1) with the condition

$$\begin{aligned} n-\sigma _\mathbb {R}> 8R, \end{aligned}$$
(1.3)

where \(\sigma _\mathbb {R}\) is the element of \(\left\{ 0,\ldots ,n\right\} \) defined by

$$\begin{aligned} \sigma _\mathbb {R}= 1+\max _{\mathbf {\beta }\in \mathbb {R}^R{\setminus }\left\{ \mathbf {0}\right\} } \dim {\text {Sing}}V(\mathbf {\beta }\cdot \mathbf {F}), \end{aligned}$$
(1.4)

and \( V(\mathbf {\beta }\cdot \mathbf {F})\) is the hypersurface cut out in \(\mathbb {P}_\mathbb {R}^{n-1}\) by \(\beta _1F_1+\cdots +\beta _RF_R=0\).

Note that (1.3) is equivalent to

$$\begin{aligned} \min _{\mathbf {\beta }\in \mathbb {R}^R{\setminus }\left\{ \mathbf {0}\right\} } {\text {rank}}(\mathbf {\beta }\cdot \mathbf {F}) > 8R, \end{aligned}$$
(1.5)

where \({\text {rank}}(\mathbf {\beta }\cdot \mathbf {F})\) is the rank of the matrix of the quadratic form \( \beta _1F_1+\cdots +\beta _RF_R\). The hypothesis (1.3) is strictly weaker than the case \(d=2\) of the condition (1.1) as soon as \(R\ge 4\). Indeed we have \({\text {Sing}}V(\mathbf {\beta }\cdot \mathbf {F})\subset W\) whenever \(\mathbf {\beta }\in \mathbb {R}^R{\setminus }\left\{ \mathbf {0}\right\} \), and so

$$\begin{aligned} \sigma _\mathbb {R}\le 1+\dim W. \end{aligned}$$

Thus (1.3) is weaker than (1.1) whenever \(8R<2R(R+1)\) holds, that is for \(R\ge 4\).
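As a quick arithmetic check of this comparison, the following Python sketch tabulates the right-hand sides of (1.1) with \(d=2\) and of (1.3). The function names are illustrative only and do not come from the paper. Comparing right-hand sides suffices because \(\sigma _\mathbb {R}\le 1+\dim W\), so the left-hand side of (1.3) is at least that of (1.1).

```python
def birch_threshold_d2(R: int) -> int:
    """Right-hand side of (1.1) when d = 2: (d-1) * 2^(d-1) * R * (R+1)."""
    return 2 * R * (R + 1)


def new_threshold_d2(R: int) -> int:
    """Right-hand side of (1.3)."""
    return 8 * R


def new_condition_is_weaker(R: int) -> bool:
    # (1.3) asks for strictly less than (1.1) exactly when 8R < 2R(R+1).
    return new_threshold_d2(R) < birch_threshold_d2(R)


for R in range(1, 7):
    print(R, birch_threshold_d2(R), new_threshold_d2(R), new_condition_is_weaker(R))
```

For \(R=3\) the two thresholds coincide, which is why the strict improvement starts at \(R=4\).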

To obtain the result described in the abstract we can simplify (1.3) with the following lemma, proved at the end of Sect. 4.

Lemma 1.1

Let \(d\ge 2\) and let \(F_1,\ldots ,F_R\) and W be as in Theorem 1.1. If \(V(F_1,\ldots ,F_R)\) is smooth with dimension \(n-1-R\), then we have

$$\begin{aligned} \sigma _\mathbb {R}\le 1+\dim W \le R-1. \end{aligned}$$
(1.6)

If \(V(F_1,\ldots ,F_R)\) is a smooth complete intersection and \(n\ge 9R\) then Theorem 1.2 and Lemma 1.1 imply that the asymptotic formula (1.2) holds. This in turn implies that \(V(F_1,\ldots ,F_R)\) satisfies the Hasse principle, by the last part of Theorem 1.1. As is usual with the circle method one also obtains weak approximation for \(V(F_1,\ldots ,F_R)\) in this case; see the comments after the proof of Theorem 1.2 in Sect. 4.

The “square-root cancellation” heuristic discussed around formula (1.12) in Browning and Heath-Brown [7] suggests that the condition \(n > 4R\) should suffice in place of the \(n\ge 9R\) in the previous paragraph. So (1.3) brings us within a constant factor of square-root cancellation as R grows, while (1.1) misses by a factor of O(R).

We deduce Theorem 1.2 from the following more general result, proved in Sect. 4.

Definition 1.1

For each \(k \in \mathbb {N}{\setminus }\left\{ \mathbf {0}\right\} \) and \(\mathbf {t}\in \mathbb {R}^k\) we write \(\left||\mathbf {t}\right||_\infty = \max _i \left|t_i\right|\) for the supremum norm. Let \(f(\mathbf {x})\) be any polynomial of degree \(d\ge 2\) with real coefficients in n variables \(x_1,\ldots ,x_n\). For \(i= 1,\ldots , n\) we define

$$\begin{aligned} m^{( f )}_i \left( \mathbf {x}^{(1)},\ldots ,\mathbf {x}^{(d-1)} \right) = \sum _{j_1,\ldots ,j_{d-1}=1}^n x^{(1)}_{j_1} \cdots x^{(d-1)}_{j_{d-1}} \frac{\partial ^{d} f(\mathbf {x})}{\partial x_{j_1} \cdots \partial x_{j_{d-1}} \partial x_i}, \end{aligned}$$

where we write \(\mathbf {x}^{(j)}\) for a vector of n variables \((x^{(j)}_1,\ldots ,x^{(j)}_n)^T\). This defines an n-tuple of multilinear forms

$$\begin{aligned} \mathbf {m}^{( f )} \left( \mathbf {x}^{(1)},\ldots ,\mathbf {x}^{(d-1)} \right) \in \mathbb {R}[\mathbf {x}^{(1)},\ldots ,\mathbf {x}^{(d-1)}]^n. \end{aligned}$$

Finally, for each \(B \ge 1\) we put \(N^{{\text {aux}}}_{f} \left( B \right) \) for the number of \((d-1)\)-tuples of integer n-vectors \(\mathbf {x}^{(1)}, \ldots , \mathbf {x}^{(d-1)}\) with

$$\begin{aligned}&\left||\mathbf {x}^{(1)}\right||_\infty ,\ldots ,\left||\mathbf {x}^{(d-1)}\right||_\infty \le B,\nonumber \\&\left||\mathbf {m}^{( f )} \left( \mathbf {x}^{(1)}, \ldots , \mathbf {x}^{(d-1)} \right) \right||_\infty < \left|| f^{[d]} \right||_\infty B^{d-2} \end{aligned}$$
(1.7)

where we let \(\left||f^{[d]}\right||_\infty = \frac{1}{d!} \max _{\mathbf {j}\in \left\{ 1,\ldots ,n\right\} ^d} \left| \frac{\partial ^d f(\mathbf {x})}{\partial x_{j_1}\cdots \partial x_{j_d}} \right| \).
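To make Definition 1.1 concrete, here is a minimal brute-force sketch in Python for the case \(d=2\), where the \((d-1)\)-tuple is a single vector \(\mathbf {x}\), the forms \(m^{(f)}_i\) are the entries of \(H\mathbf {x}\) with H the Hessian matrix of f, and \(B^{d-2}=1\). The helper name `n_aux_quadratic` and the choice to pass the Hessian as a list of integer rows are assumptions made for illustration; this is not an algorithm used in the paper.

```python
from fractions import Fraction
from itertools import product


def n_aux_quadratic(hessian, B):
    """Count integer x with ||x||_inf <= B and ||H x||_inf < ||f^[2]||_inf.

    `hessian` is the (constant) matrix of second partial derivatives of a
    quadratic polynomial f, given as a list of rows with integer entries.
    """
    n = len(hessian)
    # ||f^[2]||_inf = (1/2!) * max |second partial derivative of f|
    norm_f = Fraction(max(abs(h) for row in hessian for h in row), 2)
    count = 0
    for x in product(range(-B, B + 1), repeat=n):
        m = [sum(hessian[i][j] * x[j] for j in range(n)) for i in range(n)]
        if max(abs(v) for v in m) < norm_f:
            count += 1
    return count


full_rank = [[2, 0], [0, 2]]  # f = x1^2 + x2^2
rank_one = [[2, 0], [0, 0]]   # f = x1^2, regarded as a form in n = 2 variables
print(n_aux_quadratic(full_rank, 5))
print(n_aux_quadratic(rank_one, 5))
```

In the first example only \(\mathbf {x}=\mathbf {0}\) is counted, while in the second \(x_2\) is unconstrained and the count grows linearly in B. This is the kind of dependence on the rank that underlies condition (1.5).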

Theorem 1.3

Let the forms \(F_i\) and the counting function \( N_{F_1,\ldots ,F_R}(P)\) be as in Theorem 1.1, and let \(N^{{\text {aux}}}_{f}(B)\) be as in Definition 1.1. Suppose that the \(F_i\) are linearly independent and that

$$\begin{aligned} N^{{\text {aux}}}_{\mathbf {\beta }\cdot \mathbf {F}}(B) \le C_0 B^{(d-1)n-2^d{\mathscr {C}}} \end{aligned}$$
(1.8)

for some \(C_0 \ge 1\), \({\mathscr {C}}> dR\) and all \(\mathbf {\beta }\in \mathbb {R}^R\) and \(B\ge 1\), where we have written \(\mathbf {\beta }\cdot \mathbf {F}\) for \( \beta _1F_1+\cdots +\beta _RF_R\). Then for all \(P\ge 1\) we have

$$\begin{aligned} N_{F_1,\ldots ,F_R}(P) = \mathfrak {I}\mathfrak {S}P^{n-dR} + O\left( P^{n-dR-\delta }\right) , \end{aligned}$$

where the implicit constant depends at most on \(C_0\), \({\mathscr {C}}\) and the \(F_i\), and \(\delta \) is a positive constant depending at most on \({\mathscr {C}}\), d and R. Here \(\mathfrak {I}\) and \(\mathfrak {S}\) are as in Theorem 1.1.

One trivially has

$$\begin{aligned} B^{(d-2)n} \ll _{d,n} N^{{\text {aux}}}_{\mathbf {\beta }\cdot \mathbf {F}}(B)\ll _{d,n}B^{(d-1)n}. \end{aligned}$$

So (1.8) requires us to save a factor of \(B^{2^d{\mathscr {C}}}\) over the trivial upper bound, while the largest saving possible is of size \(O(B^n)\). It follows that we must have \(n>d2^{d}R\) in order for both (1.8) and \({\mathscr {C}}>dR\) to hold.

Counting functions similar to \(N^{{\text {aux}}}_{\mathbf {\beta }\cdot \mathbf {F}}(B)\) play a similar role in some other applications of the circle method, with the equations

$$\begin{aligned} {\mathbf {m}^{( f )} \left( \mathbf {x}^{(1)}, \ldots , \mathbf {x}^{(d-1)} \right) } = \mathbf {0} \end{aligned}$$
(1.9)

in place of the inequality (1.7). The quantities \(M(a_1,\ldots ,a_r;H)\) from formula (9) of Dietmann [14], and \({\mathcal M}_f(P)\) from Lemma 2 of Schindler [32] are both of this type. In this setting one needs to save a factor of size \(B^{O(R^2)}\) over the trivial bound.

In forthcoming work we bound the function \(N^{{\text {aux}}}_{\mathbf {\beta }\cdot \mathbf {F}}(B)\) for degrees higher than 2, with the goal of handling systems \(F_i\) in roughly \(d2^{d}R\) variables. We will approach this problem variously by using elementary methods, by generalising the argument used in Lemma 3 of Davenport [12] to treat the equations (1.9), and by applying the circle method iteratively to the inequalities (1.7). We will also combine the ideas used here with the variant of the circle method due to Freeman [15] to give a version of Theorem 1.3 for systems of forms \(F_i\) with real coefficients.

1.2 Related work

Theorem 1 of Müller [29] gives a result with exactly the same number of variables as Theorem 1.2, but for quadratic inequalities with real coefficients rather than quadratic equations with rational coefficients. It is in turn founded on work of Bentkus and Götze [1, 2] concerning a single quadratic inequality. The method of proof is related to ours, see Sects. 2.1 and 3.1 below.

If \(d=2\), the forms \(F_i\) are diagonal and the variety \(V(F_1,\ldots ,F_R)\) is smooth, then the conclusions of Theorem 1.1 hold whenever \(n > 4R\). That is, we have the “square-root cancellation” situation described at the end of Sect. 1.1. This follows by standard methods from a variant of Hua’s lemma due to Cook [11].

When \(d=2\) Dietmann [13], improving work of Schmidt [33], gives conditions similar to (1.3) under which the asymptotic formula (1.2) holds and the constant \(\mathfrak {S}\) is positive. In particular it is sufficient that either \( \min _{\mathbf {\beta }\in \mathbb {C}^R{\setminus }\left\{ \mathbf {0}\right\} } {\text {rank}}(\mathbf {\beta }\cdot \mathbf {F})>2R^2+3R, \) or that \(\min _{\mathbf {a}\in \mathbb {Q}^R{\setminus }\left\{ \mathbf {0}\right\} } {\text {rank}}(\mathbf {a}\cdot \mathbf {F}) >2R^3 +\tau (R)R\), where \(\tau (R) =2\) if R is odd and 0 otherwise. He also shows that if \(d=2\), the variety \(V(F_1,\ldots ,F_R)\) has a smooth real point and \( \min _{\mathbf {a}\in \mathbb {Q}^R{\setminus }\left\{ \mathbf {0}\right\} } {\text {rank}}(\mathbf {a}\cdot \mathbf {F}) >2R^3-2R \) then \(V(F_1,\ldots ,F_R)\) has a rational point.

Munshi [30] proves the asymptotic formula (1.2) when \(d=2\), \(n=11\) and \(V(F_1,F_2)\) is smooth. By contrast, using Theorem 1.1 together with (1.6) would require \(n \ge 14\). When \(d=2\) and \(R=1\) we have a single quadratic form F. Heath-Brown [18] then proves such an asymptotic formula whenever V(F) is smooth and \(n \ge 3\).

If F is a cubic form, Hooley [20] shows that when \(n=8\), the variety V(F) is smooth, and \(\mathscr {B}\) is a sufficiently small box centred at a point where the Hessian determinant of F is nonzero, then we have a smoothly weighted asymptotic formula analogous to (1.2). This result is conditional on a Riemann hypothesis for a certain modified Hasse-Weil L-function. For \(n=9\) he proves a similar result without any such assumption [19], with an error term \(O(P^{n-3}(\log P)^{-\delta })\) instead of the \(O(P^{n-3-\delta })\) in (1.2). In this setting Theorem 1.1 requires \(n \ge 17\).

In the case of a single quartic form F such that V(F) is smooth, Hanselmann [17] gives the condition \(n\ge 40\) in place of the \(n \ge 49\) required to apply Theorem 1.1. Very recent work of Marmon and Vishe [25] yields \(n\ge 30\), with \(n\ge 29\) expected in the sequel.
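The variable counts quoted above for Theorem 1.1 (14, 17 and 49) can be recovered from (1.1) by a short calculation, sketched here in Python. The function name is hypothetical. The inputs for \(\dim W\) are assumptions taken from the text: \(\dim W=-1\) for a smooth hypersurface, using the convention \(\dim \emptyset =-1\), and \(\dim W\le R-2\) for a smooth complete intersection, by (1.6).

```python
def birch_min_variables(d: int, R: int, dim_W: int) -> int:
    """Smallest n with n - 1 - dim W > (d-1) * 2^(d-1) * R * (R+1), as in (1.1)."""
    rhs = (d - 1) * 2 ** (d - 1) * R * (R + 1)
    # n - 1 - dim_W > rhs  is equivalent to  n >= rhs + dim_W + 2 for integers.
    return rhs + dim_W + 2


# Two quadrics, smooth intersection: worst case dim W = R - 2 = 0 by (1.6).
print(birch_min_variables(d=2, R=2, dim_W=0))
# A single smooth cubic or quartic hypersurface: W is empty, so dim W = -1.
print(birch_min_variables(d=3, R=1, dim_W=-1))
print(birch_min_variables(d=4, R=1, dim_W=-1))
```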

When \(d\ge 5\) and \(R=1\), a sharper condition than (1.1) is available by work of Browning and Prendiville [8]. For \(d\le 10\) and a smooth hypersurface V(F) this is essentially a reduction of one quarter in the number of variables required.

Dietmann [14] and Schindler [32] show that the condition (1.1) may be replaced with \(n-\sigma _\mathbb {Z}>(d-1)2^{d-1}R(R+1)\), where we define

$$\begin{aligned} \sigma _\mathbb {Z}= 1+ \max _{\mathbf {a}\in \mathbb {Z}^R{\setminus }\left\{ \mathbf {0} \right\} } \dim {\text {Sing}}V(\mathbf {a}\cdot \mathbf {f}^{[d]}). \end{aligned}$$
(1.10)

Note that the maximum here is over integer points, and so \(\sigma _\mathbb {Z}< \sigma _\mathbb {R}\) may hold.

Birch’s work [3] is generalised to systems of forms with differing degrees by Browning and Heath-Brown [7] over \(\mathbb {Q}\) and by Frei and Madritsch [16] over number fields. It is extended to linear spaces of solutions by Brandes [5, 6]. Versions of the result for function fields are due to Lee [22] and to Browning and Vishe [9]. A version for bihomogeneous forms is due to Schindler [31], and Mignot [26, 27] further develops these methods for certain trilinear forms and for hypersurfaces in toric varieties. Liu [23] proves existence of solutions in prime numbers to a quadratic equation in 10 or more variables. Asymptotic formulae for systems of equations of the same degree with prime values of the variables are considered by Cook and Magyar [10] and by Xiao and Yamagishi [34]. Magyar and Titichetrakun [24] extend these results to values of the variables with a bounded number of prime factors, while Yamagishi [35] treats systems of equations with differing degrees and prime variables. It is natural to ask whether similar generalisations exist for Theorem 1.2.

1.3 Notation

Parts of our work apply to polynomials with general real coefficients. Therefore we let \(f_1(\mathbf {x}),\ldots ,f_R(\mathbf {x})\) be polynomials with real coefficients, of degree \(d \ge 2\) in n variables \(x_1,\ldots ,x_n\), and we write \(f^{[d]}_1(\mathbf {x}), \ldots , f^{[d]}_R(\mathbf {x})\) for the degree d parts.

Implicit constants in \(\ll \) and big-O notation are always permitted to depend on the polynomials \(f_i\), and hence on d, n and R. We use scalar product notation to indicate linear combinations, so that for example \(\mathbf {\alpha }\cdot \mathbf {f}=\sum _{i=1}^R \alpha _i f_i\). Throughout, \(\left||\mathbf {t}\right||_\infty \), \(\left||f\right||_\infty \), \(\mathbf {m}^{(f)}\) and \(N^{{\text {aux}}}_{f} \left( B \right) \) are as in Definition 1.1. We do not require algebraic varieties to be irreducible, and we use the convention that \(\dim \emptyset = -1\).

By an admissible box we mean a box in \(\mathbb {R}^n\) contained in the box \([-1,1]^n\), and having sides of length at most 1 which are parallel to the coordinate axes. We let \(\mathscr {B}\) be an admissible box. For each \(\mathbf {\alpha }\in \mathbb {R}^R\) and \(P\ge 1\), we define the exponential sum

$$\begin{aligned} S\left( \mathbf {\alpha }; P\right) = \sum \limits _{\mathbf {x}\in \mathop {\mathbb {Z}^n}\limits _{\mathbf {x}/P \in \mathscr {B}}} e( \mathbf {\alpha }\cdot \mathbf {f}(\mathbf {x}) ) \end{aligned}$$
(1.11)

where \(e(t) = e^{2\pi i t}\). This depends implicitly on \(\mathscr {B}\) and the \(f_i\). We often write the expression \(\max \left\{ P^{-d} \left||\mathbf {\beta }\right||_\infty ^{-1}, \left||\mathbf {\beta }\right||_\infty ^{\frac{1}{d-1}}\right\} \), and if \(\mathbf {\beta }=\mathbf {0}\) this quantity is defined to be \(+\infty \).

1.4 Structure of this paper

In Sect. 2 we apply the circle method to a system of degree d polynomials with integer coefficients, assuming a certain hypothesis (2.1) on \(S\left( \mathbf {\alpha }; P\right) \). In Sect. 3 we prove this hypothesis on \({S\left( \mathbf {\alpha }; P\right) }\) for polynomials with real coefficients, assuming that the bound (1.8) above holds. We then prove Theorems 1.2 and 1.3 in Sect. 4.

2 The circle method

In this section we apply the circle method, assuming that the bound

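$$\begin{aligned} \min \left\{ \left|\frac{S\left( \mathbf {\alpha }; P\right) }{P^{n+\varepsilon }}\right| ,\, \left|\frac{S\left( \mathbf {\alpha }+\mathbf {\beta }; P\right) }{P^{n+\varepsilon }}\right| \right\} \le C \max \left\{ P^{-d}\left||\mathbf {\beta }\right||_\infty ^{-1},\, \left||\mathbf {\beta }\right||_\infty ^{\frac{1}{d-1}} \right\} ^{{\mathscr {C}}} \end{aligned}$$
(2.1)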

holds for all \(\mathbf {\alpha },\mathbf {\beta }\in \mathbb {R}^R\), \(P \ge 1\), some \({\mathscr {C}}>dR\), \(C \ge 1\) and some small \(\varepsilon >0\). In particular we will show that (2.1) implies that the set of points \(\mathbf {\alpha }\) in \(\mathbb {R}^R\) where \(\left|S\left( \mathbf {\alpha }; P\right) \right|\) is large has small measure. Our goal is the result below, which will be proved in Sect. 2.5.

Proposition 2.1

Assume that the polynomials \(f_i\) have integer coefficients, and that the leading forms \(f^{[d]}_i(\mathbf {x})\) are linearly independent. Write

$$\begin{aligned} N_{f_1,\ldots ,f_R}(P) = \# \left\{ \mathbf {x}\in \mathbb {Z}^n :\mathbf {x}/P\in \mathscr {B},\, f_1(\mathbf {x})=\cdots = f_R(\mathbf {x})=0 \right\} . \end{aligned}$$
(2.2)

Suppose we are given \({\mathscr {C}}>dR\), \(C \ge 1\) and \(\varepsilon >0\) such that the bound (2.1) holds for all \(\mathbf {\alpha },\mathbf {\beta }\in \mathbb {R}^R\), all \(P \ge 1\) and all admissible boxes \(\mathscr {B}\). If \(\varepsilon \) is sufficiently small in terms of \({\mathscr {C}}\), d and R, then we have

$$\begin{aligned} N_{f_1,\ldots ,f_R}(P) = \mathfrak {I}\mathfrak {S}P^{n-dR} + O_{C,f_1,\ldots ,f_R}\left( P^{n-dR-\delta }\right) \end{aligned}$$

for all \(P\ge 1\), all admissible boxes \(\mathscr {B}\), and some \(\delta >0\) depending only on \({\mathscr {C}}\), d, R. Here \(\mathfrak {I},\) \(\mathfrak {S}\) are the usual singular integral and series given by (2.35) and (2.25) below.

We comment on the role of (2.1). If the \(f_i\) have integer coefficients, then we have

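$$\begin{aligned} N_{f_1,\ldots ,f_R}(P) = \int _{[0,1]^R} S\left( \mathbf {\alpha }; P\right) \,\mathrm {d}\mathbf {\alpha }. \end{aligned}$$
(2.3)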

If both \(S\left( \mathbf {\alpha }; P\right) \) and \(S\left( \mathbf {\alpha }+\mathbf {\beta }; P\right) \) are large then (2.1) implies that one of the terms \(P^{-d}\left||\mathbf {\beta }\right||_\infty ^{-1}\) or \(\left||\mathbf {\beta }\right||_\infty ^{\frac{1}{d-1}}\) must be large. In particular, the points \(\mathbf {\alpha }\) and \(\mathbf {\alpha }+\mathbf {\beta }\) must either be very close or somewhat far apart. In this sense (2.1) is a “repulsion principle” for the sum \(S\left( \mathbf {\alpha }; P\right) \). We can use this fact to bound the measure of the set where \(S\left( \mathbf {\alpha }; P\right) \) is large, and this will enable us to reduce (2.3) to an integral over major arcs.

To see the source of the condition \({\mathscr {C}}>dR\) in Proposition 2.1, consider the case

$$\begin{aligned} \left|S\left( \mathbf {\alpha }; P\right) \right| = \left|S\left( \mathbf {\alpha }+\mathbf {\beta }; P\right) \right| = CP^{n-{\mathscr {C}}+\varepsilon }. \end{aligned}$$
(2.4)

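In general we always have

$$\begin{aligned} \max \left\{ P^{-d}\left||\mathbf {\beta }\right||_\infty ^{-1},\, \left||\mathbf {\beta }\right||_\infty ^{\frac{1}{d-1}} \right\} \ge P^{-1}, \end{aligned}$$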

with equality when \(\left||\mathbf {\beta }\right||_\infty = P^{1-d}\) holds. So in the case (2.4), the assumption (2.1) is trivial. In other words (2.1) might still be satisfied even if the function \(S\left( \mathbf {\alpha }; P\right) \) had absolute value \(P^{n-{\mathscr {C}}+\varepsilon }\) at every point \(\mathbf {\alpha }\) in real R-space. This will lead to an error term of size at least \(P^{n-{\mathscr {C}}+\varepsilon }\) in evaluating the integral (2.3). Hence we require \({\mathscr {C}}>dR\) in the proposition above in order for the error term to be smaller than the main term.
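The size of the quantity \(\max \left\{ P^{-d}\left||\mathbf {\beta }\right||_\infty ^{-1}, \left||\mathbf {\beta }\right||_\infty ^{\frac{1}{d-1}}\right\} \) near the balance point \(\left||\mathbf {\beta }\right||_\infty = P^{1-d}\) can be checked numerically. The Python sketch below does this on a finite grid, for illustrative values of P and d chosen here rather than taken from the paper. It is a sanity check, not a proof.

```python
def repulsion_scale(P: float, d: int, b: float) -> float:
    """max{P^-d / b, b^(1/(d-1))} for b = ||beta||_inf > 0."""
    return max(P ** (-d) / b, b ** (1.0 / (d - 1)))


P, d = 10.0, 3
b_star = P ** (1 - d)  # the point where the two terms balance
# Geometric grid of values of b around b_star, from 2^-10 b_star to 2^10 b_star.
grid = [b_star * 2.0 ** (j / 4) for j in range(-40, 41)]
smallest = min(repulsion_scale(P, d, b) for b in grid)
print(smallest, repulsion_scale(P, d, b_star), 1 / P)
```

The first term is decreasing in \(\left||\mathbf {\beta }\right||_\infty \) and the second is increasing, so the maximum of the two is smallest where they agree.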

2.1 Mean values from bounds of the form (2.1)

We show that the bound (2.1) implies upper bounds for the integral of the function \(S\left( \mathbf {\alpha }; P\right) \) over any bounded measurable set. Müller [29] and Bentkus and Götze [1, 2] previously used similar ideas to treat quadratic forms with real coefficients.

We begin with a technical lemma.

Lemma 2.1

Let \(r_1:(0,\infty )\rightarrow (0,\infty )\) be a strictly decreasing bijection, and let \(r_2:(0,\infty )\rightarrow (0,\infty )\) be a strictly increasing bijection. Write \(r_1^{-1}\) and \(r_2^{-1}\) for the inverses of these maps. Let \(\nu >0\) and let \(E_0\) be a hypercube in \(\mathbb {R}^R\) whose sides are of length \(\nu \) and parallel to the coordinate axes. Let E be a measurable subset of \(E_0\) and let \(\varphi :E\rightarrow [0,\infty )\) be a measurable function.

Suppose that for all \(\mathbf {\alpha },\mathbf {\beta }\in \mathbb {R}^R\) such that \(\mathbf {\alpha }\in E\) and \(\mathbf {\alpha }+\mathbf {\beta }\in E\), we have

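$$\begin{aligned} \min \left\{ \varphi (\mathbf {\alpha }),\, \varphi (\mathbf {\alpha }+\mathbf {\beta })\right\} \le \max \left\{ r_1^{-1}\left( \left||\mathbf {\beta }\right||_\infty \right) ,\, r_2^{-1}\left( \left||\mathbf {\beta }\right||_\infty \right) \right\} . \end{aligned}$$
(2.5)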

Then, for any integers k and \(\ell \) with \(k<\ell \), we have

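$$\begin{aligned} \int _{E} \varphi (\mathbf {\alpha })\,\mathrm {d}\mathbf {\alpha } \ll _R \nu ^R 2^k + \sum _{i=k}^{\ell -1} 2^i \left( \frac{\nu \, r_1(2^i)}{\min \left\{ r_2(2^i),\nu \right\} }\right) ^R + \left( \frac{\nu \, r_1(2^\ell )}{\min \left\{ r_2(2^\ell ),\nu \right\} }\right) ^R \sup _{\mathbf {\alpha }\in E} \varphi (\mathbf {\alpha }), \end{aligned}$$
(2.6)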

where the implicit constant depends only on R.

Note that if we choose

$$\begin{aligned} \varphi (\mathbf {\alpha })=\left|S\left( \mathbf {\alpha }; P\right) \right| / CP^{n+\varepsilon }, \qquad r_1(t) = P^{-d}t^{-1/{\mathscr {C}}}, \qquad r_2(t) = t^{(d-1)/{\mathscr {C}}}, \end{aligned}$$

then the hypotheses (2.1) and (2.5) become identical. This will enable us to apply Lemma 2.1 to bound the integral \(\int _{\mathfrak {m}_{P,d,\Delta }}S\left( \mathbf {\alpha }; P\right) \,\mathrm {d}\mathbf {\alpha }\), where \(\mathfrak {m}_{P,d,\Delta }\) is a set of minor arcs on which \(S\left( \mathbf {\alpha }; P\right) \) is somewhat small.

Proof

The strategy of proof is as follows. We deduce from (2.5) that if both \(\varphi (\mathbf {\alpha })\ge t\) and \(\varphi (\mathbf {\alpha }+\mathbf {\beta })\ge t\) hold, then either \(\left||\mathbf {\beta }\right||_\infty \le r_1(t)\) or \(\left||\mathbf {\beta }\right||_\infty \ge r_2(t)\) must hold. From this we will show that the set of points \(\mathbf {\alpha }\) satisfying the bound \(\varphi (\mathbf {\alpha })\ge t\) can be covered by a collection of hypercubes of side \(2 r_1(t)\), each of which is separated from the others by a gap of size \(\tfrac{1}{2}r_2(t)\). The lemma will follow upon bounding the total Lebesgue measure of this collection of hypercubes.

For each \(t>0\) we set

$$\begin{aligned} D\left( t\right) = \left\{ \mathbf {\alpha }\in E:\varphi (\mathbf {\alpha })\ge t\right\} . \end{aligned}$$
(2.7)

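Observe that if \(\mathbf {\alpha }\) and \(\mathbf {\alpha }+\mathbf {\beta }\) both belong to \(D\left( t\right) \), then (2.5) implies that

$$\begin{aligned} t \le \max \left\{ r_1^{-1}\left( \left||\mathbf {\beta }\right||_\infty \right) ,\, r_2^{-1}\left( \left||\mathbf {\beta }\right||_\infty \right) \right\} , \end{aligned}$$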

from which it follows that either \(\left||\mathbf {\beta }\right||_\infty \le r_1(t)\) or \(\left||\mathbf {\beta }\right||_\infty \ge r_2(t)\) must hold.

Let \(\mathfrak {b}\) be any hypercube in \(\mathbb {R}^R\) whose sides are of length \(\frac{1}{2}r_2(t)\) and parallel to the coordinate axes. We claim that \(\mathfrak {b}\cap D\left( t\right) \) is contained in a hypercube \(\mathfrak {B}\) whose sides are of length \(2r_1(t)\). To see this let \(\mathbf {\alpha }\) be any fixed vector lying in \(\mathfrak {b}\cap D\left( t\right) \), and set

$$\begin{aligned} \mathfrak {B}= \left\{ \mathbf {\alpha }+\mathbf {\beta }:\mathbf {\beta }\in \mathbb {R}^R, \left||\mathbf {\beta }\right||_\infty \le r_1(t)\right\} . \end{aligned}$$

If \(\mathbf {\alpha }+\mathbf {\beta }\) belongs to \(\mathfrak {b}\cap D\left( t\right) \), then by definition of \(\mathfrak {b}\) the bound \(\left||\mathbf {\beta }\right||_\infty \le \frac{1}{2}r_2(t)\) must hold. In particular \(\left||\mathbf {\beta }\right||_\infty <r_2(t)\), so by the comments after (2.7), the bound \(\left||\mathbf {\beta }\right||_\infty \le r_1(t)\) must hold. This shows that \(\mathbf {\alpha }+\mathbf {\beta } \in \mathfrak {B}\), and hence that \(\mathfrak {b}\cap D\left( t\right) \subset \mathfrak {B}\), as claimed. In particular the Lebesgue measure of \(\mathfrak {b}\cap D\left( t\right) \) is at most \( (2r_1(t))^R. \)

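The set \(D(t)\) is contained in \(E_0\), a hypercube of side \(\nu \). So in order to cover the set \(D(t)\) with boxes \(\mathfrak {b}\) of side \(\tfrac{1}{2}r_2(t)\) one needs at most

$$\begin{aligned} \left( 1+\frac{2\nu }{r_2(t)}\right) ^R \ll _R \frac{\nu ^R}{\min \left\{ r_2(t),\nu \right\} ^R} \end{aligned}$$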

boxes. Summing over all the boxes \(\mathfrak {b}\), it follows that

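$$\begin{aligned} L\left( t\right) \ll _R \nu ^R \left( \frac{r_1(t)}{\min \left\{ r_2(t),\nu \right\} }\right) ^R, \end{aligned}$$
(2.8)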

where we write \(L\left( t\right) \) for the Lebesgue measure of \(D\left( t\right) \). So we have

$$\begin{aligned} \int _{E} \varphi (\mathbf {\alpha })\,\mathrm {d}\mathbf {\alpha } ={}&\int \limits _{ E {\setminus } D\left( 2^k\right) } \varphi (\mathbf {\alpha }) \,\mathrm {d}\mathbf {\alpha } + \sum _{i=k}^{\ell -1}\, \int \limits _{ E \cap \left( D\left( 2^{i}\right) {\setminus } D\left( 2^{i+1}\right) \right) } \varphi (\mathbf {\alpha }) \,\mathrm {d}\mathbf {\alpha } \nonumber \\&+\,\int \limits _{E\cap D\left( 2^\ell \right) } \varphi (\mathbf {\alpha }) \,\mathrm {d}\mathbf {\alpha }\\ {}\le {}&\nu ^R 2^k + \sum _{i=k}^{\ell -1}2^{i+1} L\left( 2^i\right) + L\left( 2^\ell \right) \sup _{\mathbf {\alpha }\in E} \varphi (\mathbf {\alpha }). \end{aligned}$$

With (2.8) this yields (2.6). \(\square \)

We now apply Lemma 2.1 to deduce mean values from bounds of the form (2.1). The following result is stated in greater generality than is strictly required here, to facilitate future applications to forms with real coefficients.

Lemma 2.2

Let T be a complex-valued measurable function on \(\mathbb {R}^R\). Let \(E_0\) be a hypercube in \(\mathbb {R}^R\) whose sides are of length \(\nu \) and parallel to the coordinate axes, and let E be a measurable subset of \(E_0\). Suppose that the inequality

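$$\begin{aligned} \min \left\{ \left|\frac{T\left( \mathbf {\alpha }\right) }{P^{n}}\right| ,\, \left|\frac{T\left( \mathbf {\alpha }+\mathbf {\beta }\right) }{P^{n}}\right| \right\} \le \max \left\{ P^{-d}\left||\mathbf {\beta }\right||_\infty ^{-1},\, \left||\mathbf {\beta }\right||_\infty ^{\frac{1}{d-1}} \right\} ^{{\mathscr {C}}} \end{aligned}$$
(2.9)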

holds for some \(P\ge 1\) and \({\mathscr {C}}>0\) and all \(\mathbf {\alpha }, \mathbf {\beta } \in \mathbb {R}^R\). Suppose further that

$$\begin{aligned} \sup _{\mathbf {\alpha }\in E} \left|T\left( \mathbf {\alpha }\right) \right| \le P^{n-\delta } \end{aligned}$$
(2.10)

for some \(\delta \ge 0\). Then we have

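$$\begin{aligned} \int _{E} \left|T\left( \mathbf {\alpha }\right) \right| \,\mathrm {d}\mathbf {\alpha } \ll _{{\mathscr {C}},d,R} {\left\{ \begin{array}{ll} \nu ^R P^{n-{\mathscr {C}}} + P^{n-{\mathscr {C}}-(d-1)R} &{}\quad \text {if } {\mathscr {C}}< R\\ \nu ^R P^{n-{\mathscr {C}}} + P^{n-dR}\log P &{}\quad \text {if } {\mathscr {C}}= R\\ \nu ^R P^{n-{\mathscr {C}}} + P^{n-dR-\delta (1-R/{\mathscr {C}})} &{}\quad \text {if } R< {\mathscr {C}}< dR\\ \nu ^R P^{n-{\mathscr {C}}}\log P + P^{n-dR-\delta (1-R/{\mathscr {C}})} &{}\quad \text {if } {\mathscr {C}}= dR \\ \nu ^R P^{n-dR-\delta (1-dR/{\mathscr {C}})} + P^{n-dR-\delta (1-R/{\mathscr {C}})} &{}\quad \text {if } {\mathscr {C}}> dR. \end{array}\right. } \end{aligned}$$
(2.11)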

Later we will take \(T\left( \mathbf {\alpha }\right) = C^{-1} P^{-\varepsilon } S\left( \mathbf {\alpha }; P\right) \) where C is as in Proposition 2.1. We will take E to be a set of minor arcs \(\mathfrak {m}_{P,d,\Delta }\), and we will interpret the integral \(\int _{\mathfrak {m}_{P,d,\Delta }} S\left( \mathbf {\alpha }; P\right) \,\mathrm {d}\mathbf {\alpha }\) as an error term, which will need to be smaller than a main term of size around \(P^{n-dR}\). As a result, only the case \({\mathscr {C}}>dR\) of the bound (2.11) will be satisfactory for the present application.

Proof

We apply Lemma 2.1 with

$$\begin{aligned} \varphi (\mathbf {\alpha }) = \frac{\left|T(\mathbf {\alpha })\right|}{P^n}, \quad r_1(t) = P^{-d}t^{-1/{\mathscr {C}}}, \quad r_2(t) = t^{(d-1)/{\mathscr {C}}}, \end{aligned}$$
(2.12)

noting that the bound (2.5) then follows from (2.9).

It remains to choose the parameters k and \(\ell \) from (2.6). We will choose these so that the right-hand side of (2.6) is dominated by the sum \(\sum _{i=k}^{\ell -1}\), rather than either of the other two terms. More precisely, take

$$\begin{aligned} k = \left\lfloor \log _2 P^{-{\mathscr {C}}}\right\rfloor , \qquad \ell = \left\lceil \log _2 P^{-\delta } \right\rceil , \end{aligned}$$
(2.13)

observing that

$$\begin{aligned} \tfrac{1}{2}P^{-{\mathscr {C}}}< 2^k \le P^{-{\mathscr {C}}}, \quad P^{-\delta } \le 2^\ell < 2 P^{-\delta }. \end{aligned}$$
(2.14)
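The choice (2.13) and the inequalities (2.14) are easy to confirm numerically. The following Python sketch does so for sample parameter values; the function names and the sample values are assumptions made for illustration, and floating-point logarithms are used, so it should be read as a sanity check only.

```python
import math


def dyadic_range(P: float, script_C: float, delta: float):
    """Return (k, l) as in (2.13): k = floor(log2 P^-C), l = ceil(log2 P^-delta)."""
    k = math.floor(-script_C * math.log2(P))
    l = math.ceil(-delta * math.log2(P))
    return k, l


def satisfies_2_14(P: float, script_C: float, delta: float) -> bool:
    """Check (1/2) P^-C < 2^k <= P^-C and P^-delta <= 2^l < 2 P^-delta."""
    k, l = dyadic_range(P, script_C, delta)
    return (0.5 * P ** -script_C < 2.0 ** k <= P ** -script_C
            and P ** -delta <= 2.0 ** l < 2 * P ** -delta)


k, l = dyadic_range(10.0, 5.0, 1.0)
print(k, l, satisfies_2_14(10.0, 5.0, 1.0))
```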

We may assume that \({\mathscr {C}}>\delta \), for otherwise the bound \(\int _E \left|T(\mathbf {\alpha })\right| \,\mathrm {d}\mathbf {\alpha } \le \nu ^R P^{n-\delta }\), which follows from (2.10), is stronger than any of the bounds listed in (2.11). We then have \(k < \ell \) and so this choice of \(k, \ell \) is admissible in Lemma 2.1. Hence (2.6) holds, and substituting in our choices (2.12) for the parameters yields

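$$\begin{aligned} \int _{E} \frac{\left|T\left( \mathbf {\alpha }\right) \right|}{P^n} \,\mathrm {d}\mathbf {\alpha } \ll _R \nu ^R 2^k&+ \sum _{i=k}^{\ell -1} 2^i \left( \frac{\nu P^{-d}2^{-i/{\mathscr {C}}}}{\min \left\{ 2^{i(d-1)/{\mathscr {C}}},\nu \right\} }\right) ^R \nonumber \\&+ \left( \frac{\nu P^{-d}2^{-\ell /{\mathscr {C}}}}{\min \left\{ 2^{\ell (d-1)/{\mathscr {C}}},\nu \right\} }\right) ^R \sup _{\mathbf {\alpha }\in E} \frac{\left|T\left( \mathbf {\alpha }\right) \right|}{P^n}. \end{aligned}$$
(2.15)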

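By (2.10) and (2.14) we have \(\sup _{\mathbf {\alpha }\in E} \frac{\left|T\left( \mathbf {\alpha }\right) \right|}{P^n}\le 2^\ell \), and so we may extend the sum in (2.15) from \(\sum _{i=k}^{\ell -1}\) to \(\sum _{i=k}^\ell \) to obtain

$$\begin{aligned} \int _{E} \frac{\left|T\left( \mathbf {\alpha }\right) \right|}{P^n} \,\mathrm {d}\mathbf {\alpha } \ll _R \nu ^R 2^k + \sum _{i=k}^{\ell } 2^i \left( \frac{\nu P^{-d}2^{-i/{\mathscr {C}}}}{\min \left\{ 2^{i(d-1)/{\mathscr {C}}},\nu \right\} }\right) ^R. \end{aligned}$$

Since

$$\begin{aligned} \min \left\{ 2^{i(d-1)/{\mathscr {C}}},\nu \right\} ^{-R} \le 2^{-i(d-1)R/{\mathscr {C}}} + \nu ^{-R}, \end{aligned}$$

we deduce that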

$$\begin{aligned} \int _{E} \frac{\left|T\left( \mathbf {\alpha }\right) \right|}{P^n} \,\mathrm {d}\mathbf {\alpha } \ll _R \nu ^R2^k + \sum _{i=k}^{\ell } \nu ^R P^{-dR}2^{i(1-dR/{\mathscr {C}})} + \sum _{i=k}^{\ell } P^{-dR} 2^{i(1-R/{\mathscr {C}})}. \end{aligned}$$
(2.16)

Note that

$$\begin{aligned} \sum _{i=k}^{\ell } 2^{i(1-dR/{\mathscr {C}})} \ll _{{\mathscr {C}},d,R} {\left\{ \begin{array}{ll} 2^{k\left( 1-dR/{\mathscr {C}}\right) } &{}\quad \text {if }{\mathscr {C}}<dR \\ \ell -k &{}\quad \text {if }{\mathscr {C}}= dR\\ 2^{\ell \left( 1-dR/{\mathscr {C}}\right) } &{}\quad \text {if }{\mathscr {C}}>dR. \end{array}\right. } \end{aligned}$$

Recall from (2.14) that we have \(2^k \ge \tfrac{1}{2} P^{-{\mathscr {C}}}\) and \(2^\ell \le 2 P^{-\delta }\), and observe that by (2.13) the bound \(\ell -k \le 2+{\mathscr {C}}\log _2 P\) holds. It follows that

$$\begin{aligned} \sum _{i=k}^{\ell } 2^{i(1-dR/{\mathscr {C}})} \ll _{{\mathscr {C}},d,R} {\left\{ \begin{array}{ll} P^{dR-{\mathscr {C}}} &{}\quad \text {if }{\mathscr {C}}<dR\\ \log P &{}\quad \text {if }{\mathscr {C}}= dR\\ P^{-\delta \left( 1-dR/{\mathscr {C}}\right) } &{}\quad \text {if }{\mathscr {C}}>dR, \end{array}\right. } \end{aligned}$$

and reasoning similarly for \(\sum _{i=k}^{\ell } 2^{i(1-R/{\mathscr {C}})}\), we deduce from (2.16) that

$$\begin{aligned} \int _{E} \frac{\left|T\left( \mathbf {\alpha }\right) \right|}{P^n}\,\mathrm {d}\mathbf {\alpha } \ll {\left\{ \begin{array}{ll} \nu ^R2^k+ \nu ^R P^{-{\mathscr {C}}} + P^{-{\mathscr {C}}-(d-1)R} &{}\quad \text {if } {\mathscr {C}}< R\\ \nu ^R2^k + \nu ^R P^{-{\mathscr {C}}} + P^{-dR}\log P &{}\quad \text {if } {\mathscr {C}}= R\\ \nu ^R2^k+\nu ^R P^{-{\mathscr {C}}} + P^{-dR-\delta (1-R/{\mathscr {C}})} &{}\quad \text {if } R< {\mathscr {C}}< dR\\ \nu ^R2^k+ \nu ^R P^{-{\mathscr {C}}}\log P + P^{-dR-\delta (1-R/{\mathscr {C}})} &{}\quad \text {if } {\mathscr {C}}= dR \\ \nu ^R2^k+ \nu ^R P^{-dR-\delta (1-dR/{\mathscr {C}})} + P^{-dR-\delta (1-R/{\mathscr {C}})} &{}\quad \text {if } {\mathscr {C}}> dR, \end{array}\right. } \end{aligned}$$

with an implicit constant depending only on \({\mathscr {C}},d,\) and R. One final application of the bound \(2^k \le P^{-{\mathscr {C}}}\) from (2.14) completes the proof of (2.11). \(\square \)

2.2 Notation for the circle method

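We split the domain \([0,1]^R\) into two regions. Let \(\Delta >0\) and set

$$\begin{aligned} \mathfrak {M}_{P,d,\Delta } = \bigcup _{\begin{array}{c} q\in \mathbb {N}\\ q\le P^\Delta \end{array}} \bigcup _{\begin{array}{c} 0\le a_1,\ldots ,a_R\le q \\ (a_1,\ldots ,a_R,q) =1 \end{array}} \left\{ \mathbf {\alpha }\in [0,1]^R :\left||\mathbf {\alpha }-\tfrac{\mathbf {a}}{q}\right||_\infty \le P^{\Delta -d}\right\} , \qquad \mathfrak {m}_{P,d,\Delta } = [0,1]^R{\setminus }\mathfrak {M}_{P,d,\Delta }. \end{aligned}$$
(2.17)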

We give local analogues of \(S\left( \mathbf {\alpha }; P\right) \) and of the integral \(\int _{\mathfrak {M}_{P,d,\Delta }}S\left( \mathbf {\alpha }; P\right) \,\mathrm {d}\mathbf {\alpha }\). We set

$$\begin{aligned} S_{q}(\mathbf {a}) = q^{-n} \sum _{\mathbf {y}\in \left\{ 1,\ldots ,q\right\} ^n} e\left( \tfrac{\mathbf {a}}{q}\cdot \mathbf {f}(\mathbf {y}) \right) \end{aligned}$$

for each \(q \in \mathbb {N}\) and \(\mathbf {a}\in \mathbb {Z}^R\), and we put

$$\begin{aligned} \mathfrak {S}(P) = \sum _{q\le P^\Delta } \sum _{\begin{array}{c} \mathbf {a}\in \left\{ 1,\ldots ,q\right\} ^R \\ (a_1,\ldots ,a_R,q) =1 \end{array}} S_{q}(\mathbf {a}). \end{aligned}$$
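For small q the normalised sums \(S_{q}(\mathbf {a})\) can be evaluated directly from the definition. The Python sketch below does this by brute force for polynomials with integer coefficients. The function names and the representation of the \(f_i\) as Python callables are assumptions made for illustration and play no role in the argument.

```python
import cmath
from itertools import product


def e(t: float) -> complex:
    """The additive character e(t) = exp(2 pi i t)."""
    return cmath.exp(2j * cmath.pi * t)


def S_q(a, q, polys, n):
    """q^-n times the sum over y in {1,...,q}^n of e((a/q) . f(y)).

    `polys` is a list of R integer-valued callables in n variables and
    `a` is a tuple of R integers.
    """
    total = 0
    for y in product(range(1, q + 1), repeat=n):
        phase = sum(a_i * f(*y) for a_i, f in zip(a, polys))
        # Reduce modulo q first; this does not change e(phase / q).
        total += e((phase % q) / q)
    return total / q ** n


square = [lambda x: x * x]  # R = 1, n = 1, f(x) = x^2
value = S_q((1,), 3, square, 1)
print(value, abs(value))
```

In this example the sum is a quadratic Gauss sum modulo 3, and its absolute value is \(3^{-1/2}\), a one-variable instance of square-root cancellation.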

For each \(\mathbf {\gamma }\in \mathbb {R}^R\), set

$$\begin{aligned} S_{\infty }(\mathbf {\gamma }) = \int _{\mathscr {B}} e\left( \mathbf {\gamma }\cdot \mathbf {f}^{[d]}(\mathbf {t}) \right) \,\mathrm {d}\mathbf {t}, \end{aligned}$$

and let

$$\begin{aligned} \mathfrak {I}(P) = \int \limits _{\begin{array}{c} \mathbf {\alpha }\in \mathbb {R}^R \\ \left||\mathbf {\alpha }\right||_\infty \le P^{\Delta -d} \end{array}} P^n S_{\infty }(P^d\mathbf {\alpha }) \,\mathrm {d}\mathbf {\alpha }. \end{aligned}$$

Finally we define a quantity \(\delta _0\) which in some sense measures the extent to which the system \(f_i\) is singular. Let \(\sigma _\mathbb {Z}\in \left\{ 0,\ldots ,n \right\} \) be as in (1.10), and let

$$\begin{aligned} \delta _0 = \frac{n-\sigma _\mathbb {Z}}{(d-1)2^{d-1}R}. \end{aligned}$$

2.3 The minor arcs

On the minor arcs \(\mathfrak {m}_{P,d,\Delta }\) we have the following bound, compare (2.10) in Lemma 2.2.

Lemma 2.3

(Dietmann [14], Schindler [32]) Suppose that the polynomials \(f_i\) have integer coefficients. Let \(\Delta \), \(\mathfrak {m}_{P,d,\Delta }\) and \(\delta _0\) be as in Sect. 2.2, and let \(\varepsilon >0\). Let the sum \(S\left( \mathbf {\alpha }; P\right) \) be as in (1.11). Then we have

$$\begin{aligned} \sup _{\mathbf {\alpha }\in \mathfrak {m}_{P,d,\Delta }} \left|S\left( \mathbf {\alpha }; P\right) \right| \ll _{\varepsilon } P^{n-\Delta \delta _0 + \varepsilon } \end{aligned}$$
(2.18)

where the implicit constant depends only on d, n, R, and \(\varepsilon \). The constant \(\delta _0\) satisfies \(\delta _0 \ge \frac{1}{(d-1)2^{d-1}R}\) whenever the forms \(f^{[d]}_i\) are linearly independent.

Proof

The bound (2.18) follows either from Lemma 4 in Dietmann [14], or from Lemma 2.2 in Schindler [32], by setting the parameter \(\theta \) in either author’s work to be

$$\begin{aligned} \theta = \frac{\Delta -\varepsilon }{(d-1)R}, \end{aligned}$$

and taking \(P\gg _{\varepsilon } 1\) sufficiently large. Provided the forms \(f^{[d]}_i\) are linearly independent, the variety \(V(\mathbf {a}\cdot \mathbf {f}^{[d]})\) is a proper subvariety of \(\mathbb {P}_\mathbb {Q}^{n-1}\) for each \(\mathbf {a}\in \mathbb {Z}^R{\setminus }\left\{ \mathbf {0}\right\} \), and so \(\sigma _\mathbb {Z}\le n-1\) holds, by (1.10). This implies that \(\delta _0 \ge \frac{1}{(d-1)2^{d-1}R}\), as claimed.\(\square \)

2.4 The major arcs

In this section we estimate \(\int _{\mathfrak {M}_{P,d,\Delta }}S\left( \mathbf {\alpha }; P\right) \,\mathrm {d}\mathbf {\alpha }\), the integral over the major arcs.

Lemma 2.4

Suppose that the polynomials \(f_i\) have integer coefficients. Let \(\Delta \), \(\mathfrak {M}_{P,d,\Delta }\), \(S_{\infty }(\mathbf {\gamma })\), \(S_{q}(\mathbf {a})\), \(\mathfrak {S}(P)\) and \(\mathfrak {I}(P)\) be as in Sect. 2.2. Then for all \(\mathbf {a}\in \mathbb {Z}^R\) and all \(q\in \mathbb {N}\) such that \(q \le P\), we have

$$\begin{aligned} S\big ( \tfrac{\mathbf {a}}{q}+\mathbf {\alpha }; P\big ) = P^nS_{q}(\mathbf {a}) S_{\infty }(P^d\mathbf {\alpha }) + O\left( qP^{n-1}\left( 1+P^{d}\left||\mathbf {\alpha }\right||_\infty \right) \right) , \end{aligned}$$
(2.19)

and it follows that

$$\begin{aligned} \int _{\mathfrak {M}_{P,d,\Delta }} S\left( \mathbf {\alpha }; P\right) \,\mathrm {d}\mathbf {\alpha } = \mathfrak {S}(P)\mathfrak {I}(P) + O\big ( P^{n-dR+(2R+3)\Delta -1}\big ) . \end{aligned}$$
(2.20)

Proof

To show (2.19) we follow the proof of Lemma 5.1 in Birch [3]. First observe that \(\mathbf {\alpha }\cdot \mathbf {f}(\mathbf {x}) = \mathbf {\alpha }\cdot \mathbf {f}^{[d]}(\mathbf {x}) +O(\left||\mathbf {x}\right||_\infty ^{d-1}\left||\mathbf {\alpha }\right||_\infty )\), and so

(2.21)

If \(\psi \) is any differentiable complex-valued function on \(\mathbb {R}^n\), then we have

$$\begin{aligned} \psi (\mathbf {x}) = q^{-n}\int _{\begin{array}{c} \mathbf {u}\in \mathbb {R}^n\\ \left||\mathbf {u}\right||_\infty \le q/2 \end{array}} \psi (\mathbf {x}+\mathbf {u}) \,\mathrm {d}\mathbf {u} +O_n\Big ( q \max _{\begin{array}{c} \mathbf {u}\in \mathbb {R}^n\\ \left||\mathbf {u}\right||_\infty \le q/2 \end{array}} \left||\mathbf {\nabla }_{\mathbf {u}} \psi (\mathbf {x}+\mathbf {u})\right||_\infty \Big ) . \end{aligned}$$

Setting \(\psi (\mathbf {x}) = e(\mathbf {\alpha }\cdot \mathbf {f}^{[d]}(\mathbf {x})) \), we deduce that

where the term \(q^{1-n}P^{n-1}\) allows for errors in approximating the boundary of the box \(\mathscr {B}\). Substituting into (2.21) shows that

$$\begin{aligned} S\left( \tfrac{\mathbf {a}}{q}+\mathbf {\alpha }; P\right) = S_{q}(\mathbf {a}) \int _{\begin{array}{c} \mathbf {v}\in \mathbb {R}^n\\ \mathbf {v}/P \in \mathscr {B} \end{array}} e(\mathbf {\alpha }\cdot \mathbf {f}^{[d]}(\mathbf {v})) \,\mathrm {d}\mathbf {v} +O(qP^{n-1}(1+P^d\left||\mathbf {\alpha }\right||_\infty )). \end{aligned}$$

To complete the proof of (2.19) it suffices to set \(\mathbf {v}=P\mathbf {t}\) and use the definition of \(S_{\infty }(\mathbf {\gamma })\) from Sect. 2.2. Now (2.20) follows from (2.19) by the definition (2.17) of \(\mathfrak {M}_{P,d,\Delta }\).\(\square \)
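
The change of variables in the last step can be written out as follows. This is only a worked check: it substitutes \(\mathbf {v}=P\mathbf {t}\) in the integral of the preceding display and uses that each \(f^{[d]}_i\) is homogeneous of degree d.

$$\begin{aligned} \int _{\begin{array}{c} \mathbf {v}\in \mathbb {R}^n\\ \mathbf {v}/P \in \mathscr {B} \end{array}} e(\mathbf {\alpha }\cdot \mathbf {f}^{[d]}(\mathbf {v})) \,\mathrm {d}\mathbf {v}&= P^n \int _{\mathscr {B}} e(\mathbf {\alpha }\cdot \mathbf {f}^{[d]}(P\mathbf {t})) \,\mathrm {d}\mathbf {t} \\&= P^n \int _{\mathscr {B}} e(P^d\mathbf {\alpha }\cdot \mathbf {f}^{[d]}(\mathbf {t})) \,\mathrm {d}\mathbf {t} = P^n S_{\infty }(P^d\mathbf {\alpha }). \end{aligned}$$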

We remark that in the case when \(\mathbf {a}=\mathbf {0}\) and \(q=1\), the proof of (2.19) is valid whether or not the polynomials \(f_i\) have integer coefficients. That is, we always have

$$\begin{aligned} S\left( \mathbf {\alpha }; P\right) = P^nS_{\infty }(P^d\mathbf {\alpha }) + O\left( P^{n-1}\left( 1+P^{d}\left||\mathbf {\alpha }\right||_\infty \right) \right) \end{aligned}$$
(2.22)

for any \(f_i\) with real coefficients. Next we treat the quantity \(\mathfrak {S}(P)\) from (2.20).

Lemma 2.5

Let the polynomials \(f_i\) have integer coefficients, let the box \(\mathscr {B}\) from Sect. 1.3 be \([0,1]^n\), and let \(S_{q}(\mathbf {a})\) be as in Sect. 2.2. Suppose we are given \(\varepsilon \ge 0\) and \(C\ge 1\), such that for all \(\mathbf {\alpha },\mathbf {\beta }\in \mathbb {R}^R\) and all \(P\ge 1\) the bound (2.1) holds. Then:

  1. (i)

    There is \(\varepsilon '\ge 0\) such that \(\varepsilon ' = O_{{\mathscr {C}}}(\varepsilon )\) and

    (2.23)

    for all \(\mathbf {a}\in \left\{ 1,\ldots ,q\right\} ^R\) and \(\mathbf {a}'\in \left\{ 1,\ldots ,q'\right\} ^R\) such that \(\frac{\mathbf {a}'}{q'}\ne \frac{\mathbf {a}}{q}\).

  2. (ii)

    If \({\mathscr {C}}>\varepsilon '\), then for all \(t>0\) and \(q_0\in \mathbb {N}\) we have

    $$\begin{aligned} \#\left\{ \tfrac{\mathbf {a}}{q}\in \mathbb {Q}^R\cap [0, 1)^R:q \le q_0, \left|S_{q}(\mathbf {a})\right|\ge t \right\} \ll _{C} (q_0^{-\varepsilon } t)^{-\frac{(d-1)R}{{\mathscr {C}}-\varepsilon '} }, \end{aligned}$$

    where it is understood that the fractions \(\tfrac{\mathbf {a}}{q}\) are in lowest terms.

  3. (iii)

    Let \(\delta _0\) be as in Sect. 2.2 and let \(\varepsilon ''>0\). For all \(q \in \mathbb {N}\) and all \(\mathbf {a} \in \mathbb {Z}^R\) such that \((a_1,\ldots ,a_R,q)=1\), we have

    $$\begin{aligned} \left|S_{q}(\mathbf {a})\right| \ll _{\varepsilon ''} q^{-\delta _0+\varepsilon ''}. \end{aligned}$$
  4. (iv)

    Let \(\Delta \) and \(\mathfrak {S}(P)\) be as in Sect. 2.2. Suppose that \(\varepsilon \) is sufficiently small in terms of \({\mathscr {C}}\), d and R. Provided the inequality \({\mathscr {C}}>(d-1)R\) holds and the forms \(f^{[d]}_i\) are linearly independent, we have

    $$\begin{aligned} \mathfrak {S}(P)-\mathfrak {S}\ll _{C,{\mathscr {C}}} P^{-\Delta \delta _1} \end{aligned}$$
    (2.24)

    for some \(\mathfrak {S}\in \mathbb {C}\) and some \(\delta _1 >0\) depending at most on \({\mathscr {C}}\), d and R. We have

    $$\begin{aligned} \mathfrak {S}= \prod _p \lim _{k\rightarrow \infty } \tfrac{1}{p^{k(n-R)}} \# \big \{ \mathbf {b} \in \left\{ 1,2,\ldots ,p^k\right\} ^n :\\ f_1(\mathbf {b}) \equiv 0, \ldots , f_R(\mathbf {b}) \equiv 0 \mod p^k\big \} \end{aligned}$$
    (2.25)

    where the product is over primes p and converges absolutely.

Proof of part (i)

Provided P is sufficiently large, Lemma 2.4 will allow us to approximate the sum \(S_{q}(\mathbf {a})\) by a multiple of \(S\big ( \mathbf {a}/q;P\big ) \). This will enable us to transform (2.1) into the bound (2.23). Let \(P\ge 1\) be a parameter, to be chosen later. Then (2.1) gives

(2.26)

The equality \(S_{\infty }(\mathbf {0})=1\) holds, and so (2.19) implies that

$$\begin{aligned} \frac{ S\big ( \frac{\mathbf {a}}{q};P\big ) }{P^{n}} = S_{q}( \mathbf {a} ) + O\left( qP^{-1}\right) ,\quad \frac{ S\big ( \frac{\mathbf {a}'}{q'};P\big ) }{P^{n}} = S_{q'}( \mathbf {a}' ) + O\left( q'P^{-1}\right) .\qquad \end{aligned}$$
(2.27)

Together (2.26) and (2.27) yield

(2.28)

Observe that for P sufficiently large the term \(C P^\varepsilon \left||\frac{\mathbf {a}'}{q'}-\frac{\mathbf {a}}{q}\right||_\infty ^{{\mathscr {C}}/(d-1)}\) dominates the right-hand side of (2.28). We claim this is the case for

$$\begin{aligned} P = (q'+q)\left||\tfrac{\mathbf {a}'}{q'}-\tfrac{\mathbf {a}}{q}\right||_\infty ^{-\frac{1+{\mathscr {C}}}{d-1}}. \end{aligned}$$
(2.29)

Indeed, since \(\left||\frac{\mathbf {a}'}{q'}-\frac{\mathbf {a}}{q}\right||_\infty \le 1\), it follows from (2.29) and (2.28) that

which proves the result.\(\square \)

Proof of part (ii)

If \(\varepsilon '< {\mathscr {C}}\) is small, then by part (i), the points in the set

$$\begin{aligned} \left\{ \tfrac{\mathbf {a}}{q}\in \mathbb {Q}^R\cap [0, 1)^R:q \le q_0, \left|S_{q}(\mathbf {a})\right|\ge t \right\} \end{aligned}$$

are separated by gaps of size

$$\begin{aligned} \left||\tfrac{\mathbf {a}'}{q'}-\tfrac{\mathbf {a}}{q}\right||_\infty \gg _{C} (q_0^{-\varepsilon } t)^{\frac{d-1}{{\mathscr {C}}-\varepsilon '}}. \end{aligned}$$

At most \(O_{C}\big ( (q_0^{-\varepsilon } t)^{-\frac{(d-1)R}{{\mathscr {C}}-\varepsilon '}}\big ) \) such points fit in the box \([0,1)^R\), proving the claim.\(\square \)
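
The counting step can be made explicit as follows. This is a sketch, in which g denotes a lower bound for the gaps in the sup norm, assumed to satisfy \(0<g\le 1\), and \(N=\left\lceil 2/g\right\rceil \) is an auxiliary integer introduced here only for illustration. Partition \([0,1)^R\) into \(N^R\) half-open cubes of side \(1/N\le g/2\). Two distinct points in the same cube are at sup-norm distance less than \(1/N<g\), so each cube contains at most one point of the set, and hence

$$\begin{aligned} \#\left\{ \text {points}\right\} \le N^R = \left\lceil 2/g\right\rceil ^R \le \left( 3/g\right) ^R = 3^R g^{-R}. \end{aligned}$$

Taking g to be a suitable constant multiple of \((q_0^{-\varepsilon } t)^{\frac{d-1}{{\mathscr {C}}-\varepsilon '}}\) then produces the exponent \(-\frac{(d-1)R}{{\mathscr {C}}-\varepsilon '}\). If the gaps exceed 1 the set has at most one point and there is nothing to prove.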

Proof of part (iii)

This follows from Lemma 2.3 by an argument which is now standard, see the proof of Lemma 5.4 in Birch [3]. \(\square \)

Proof of part (iv)

In this part of the proof, whenever we write \(\mathbf {a}/q\) it is understood that \(\mathbf {a}\in \mathbb {Z}^R\) and \(q\in \mathbb {N}\) with \((a_1,\ldots ,a_R,q) =1\). We will show below that

(2.30)

for all \(Q \ge 1\), and some \(\delta _1>0\) depending only on \({\mathscr {C}}\), d and R. Since

this proves (2.24) with

(2.31)

where this sum is absolutely convergent. Then (2.25) follows as in §7 of Birch [3].

We prove (2.30). Let \(\ell \in \mathbb {Z}\). We have

(2.32)

Now parts (ii) and (iii) show that

$$\begin{aligned} \#\left\{ \tfrac{\mathbf {a}}{q}\in \mathbb {Q}^R\cap [0, 1)^R:q \le 2Q, \left|S_{q}(\mathbf {a})\right|\ge t \right\} \ll _{C} (Q^{-\varepsilon } t)^{-\frac{(d-1)R}{ {\mathscr {C}}-\varepsilon '}} \end{aligned}$$

and that

$$\begin{aligned} \sup _{q>Q} \left|S_{q}(\mathbf {a})\right| \ll Q^{-\delta _0/2}. \end{aligned}$$

Substituting these bounds into (2.32) gives

$$\begin{aligned} s(Q) \ll _{C} Q^{O_{{\mathscr {C}}}(\varepsilon )-\delta _0/2} 2^{\ell \frac{(d-1)R}{{\mathscr {C}}-\varepsilon '}} + Q^{O_{{\mathscr {C}}}(\varepsilon )} \sum _{i=\ell }^\infty 2^{(i+1) \big ( \frac{(d-1)R}{{\mathscr {C}}-\varepsilon '}\big ) -i}. \end{aligned}$$

We have \({\mathscr {C}}>(d-1)R\) and we have assumed that \(\varepsilon '\) is small in terms of \({\mathscr {C}}\), d and R, so we may assume that the bound \({\mathscr {C}}>(d-1)R+\varepsilon '\) holds. So we may sum the geometric progression to find that

$$\begin{aligned} s(Q)\ll _{C,{\mathscr {C}}} Q^{O_{{\mathscr {C}}}(\varepsilon )} 2^{\ell \frac{(d-1)R}{{\mathscr {C}}-\varepsilon '}} \big ( Q^{-\delta _0/2} +2^{-\ell } \big ) . \end{aligned}$$

Picking \( \ell = \left\lfloor \log _2 Q^{\delta _0/2}\right\rfloor \) shows that

$$\begin{aligned} s(Q) \ll _{C,{\mathscr {C}}} Q^{-\delta _0\frac{{\mathscr {C}}-(d-1)R}{2{\mathscr {C}}}+O_{{\mathscr {C}}}(\varepsilon )}. \end{aligned}$$

The forms \(f^{[d]}_i\) are linearly independent, so \(\delta _0\ge \frac{1}{(d-1)2^{d-1}R}\), by Lemma 2.3. As \(\varepsilon \) is small in terms of \({\mathscr {C}}\), d and R it follows that \(s(Q)\ll _{C,{\mathscr {C}}} Q^{-\delta _1}\) for some \(\delta _1>0\) depending only on \({\mathscr {C}}\), d and R. This proves (2.30).\(\square \)
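
The exponent arithmetic behind the choice of \(\ell \) can be checked as follows. This is a sketch which writes \(\kappa = \frac{(d-1)R}{{\mathscr {C}}-\varepsilon '}\) as shorthand (a symbol introduced here only for this check) and assumes \(Q\ge 1\), so that \(\ell \ge 0\). With \(\ell = \left\lfloor \log _2 Q^{\delta _0/2}\right\rfloor \) we have \(2^{\ell }\le Q^{\delta _0/2}<2^{\ell +1}\), and therefore

$$\begin{aligned} 2^{\ell \kappa }\big ( Q^{-\delta _0/2}+2^{-\ell }\big ) \le Q^{\kappa \delta _0/2}\big ( Q^{-\delta _0/2}+2Q^{-\delta _0/2}\big ) = 3\,Q^{-\frac{\delta _0}{2}(1-\kappa )}, \qquad 1-\kappa = \frac{{\mathscr {C}}-\varepsilon '-(d-1)R}{{\mathscr {C}}-\varepsilon '}. \end{aligned}$$

The exponent of Q is negative exactly when \({\mathscr {C}}>(d-1)R+\varepsilon '\), which is the condition assumed above.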

We estimate the integral \(\mathfrak {I}(P)\) from (2.20).

Lemma 2.6

Let \(S_{\infty }(\mathbf {\gamma })\), \(\Delta \) and \(\mathfrak {I}(P)\) be as in Sect. 2.2.

  1. (i)

    Suppose that the bound (2.1) holds for some \(C\ge 1\), \({\mathscr {C}}>0\) and \(\varepsilon \ge 0\) and all \(\mathbf {\alpha },\mathbf {\beta }\in \mathbb {R}^R\) and \(P\ge 1\). Then for all \(\mathbf {\gamma }\in \mathbb {R}^R\) we have

    $$\begin{aligned} S_{\infty }(\mathbf {\gamma }) \ll _{C} \left||\mathbf {\gamma }\right||_\infty ^{-{\mathscr {C}}+\varepsilon '}, \end{aligned}$$
    (2.33)

    for some \(\varepsilon ' \ge 0\) such that \(\varepsilon ' = O_{{\mathscr {C}}}(\varepsilon )\).

  2. (ii)

    If the conclusion of part (i) holds and \({\mathscr {C}}-\varepsilon '>R\), then there exists \(\mathfrak {I}\in \mathbb {C}\) such that for all \(P\ge 1\) we have

    $$\begin{aligned} \tfrac{1}{P^{n-dR}}\mathfrak {I}(P) - \mathfrak {I}\ll _{{\mathscr {C}},C,\varepsilon '} P^{-\Delta \left( {\mathscr {C}}-\varepsilon '-R\right) }. \end{aligned}$$
    (2.34)

    Furthermore we have

    $$\begin{aligned} \mathfrak {I}= \lim _{P \rightarrow \infty }\tfrac{1}{P^{n-dR}} \lambda \big \{ \mathbf {t}\in \mathbb {R}^n :\tfrac{1}{P}\mathbf {t} \in \mathscr {B}, \left|f^{[d]}_1(\mathbf {t})\right| \le \tfrac{1}{2} ,\ldots , \left|f^{[d]}_R(\mathbf {t})\right| \le \tfrac{1}{2} \big \} \end{aligned}$$
    (2.35)

    where \(\lambda \left\{ \,\cdot \,\right\} \) denotes the Lebesgue measure.

Proof of part (i)

First, for all \(\mathbf {\beta }\in \mathbb {R}^R\) we have \(\left|S\left( \mathbf {\beta }; P\right) \right|\le S\left( \mathbf {0}; P\right) \), from the definition (1.11). Consequently, taking \(\mathbf {\alpha } = \mathbf {0}\), \(\mathbf {\beta }= P^{-d}\mathbf {\gamma } \) in our hypothesis (2.1) shows that

Together with the case \(\mathbf {\alpha }= P^{-d}\mathbf {\gamma } \) of the bound (2.22), this yields

(2.36)

If we have \(\left||\mathbf {\gamma }\right||_\infty \le 1\), then we set \(P=1\) and (i) follows at once. Otherwise we put \(P = \max \left\{ 1, \left||\mathbf {\gamma }\right||_\infty ^{1+{\mathscr {C}}}\right\} \), and the result follows since (2.36) then implies

\(\square \)

Proof of part (ii)

If the inequality \({\mathscr {C}}-\varepsilon '>R \) holds, then by (2.33) we have

$$\begin{aligned} \bigg ( \int \limits _{\mathbf {\gamma }\in \mathbb {R}^R } P^{n-dR} S_{\infty }(\mathbf {\gamma }) \,\mathrm {d}\mathbf {\gamma }\bigg ) - \mathfrak {I}(P)&= \int \limits _{\begin{array}{c} \mathbf {\gamma }\in \mathbb {R}^R \\ \left||\mathbf {\gamma }\right||_\infty > P^{\Delta } \end{array}} P^{n-dR} S_{\infty }(\mathbf {\gamma }) \,\mathrm {d}\mathbf {\gamma } \\&\ll _{{\mathscr {C}},C,\varepsilon '} P^{n-dR-\Delta \left( {\mathscr {C}}-\varepsilon '-R\right) } , \end{aligned}$$

where the integrals converge absolutely. This proves (2.34) with

$$\begin{aligned} \mathfrak {I}= \int \limits _{\mathbf {\gamma }\in \mathbb {R}^R } S_{\infty }(\mathbf {\gamma }) \,\mathrm {d}\mathbf {\gamma }. \end{aligned}$$
(2.37)
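
The tail estimate used above follows from a dyadic decomposition. The following is a sketch, with \(T=P^{\Delta }\ge 1\) and \(\rho ={\mathscr {C}}-\varepsilon '>R\) as shorthand introduced here only for this check. The set \(\left||\mathbf {\gamma }\right||_\infty \le 2^{j+1}T\) has Lebesgue measure \((2^{j+2}T)^R\), so

$$\begin{aligned} \int \limits _{\left||\mathbf {\gamma }\right||_\infty > T} \left||\mathbf {\gamma }\right||_\infty ^{-\rho } \,\mathrm {d}\mathbf {\gamma }&= \sum _{j=0}^\infty \ \int \limits _{2^jT< \left||\mathbf {\gamma }\right||_\infty \le 2^{j+1}T} \left||\mathbf {\gamma }\right||_\infty ^{-\rho } \,\mathrm {d}\mathbf {\gamma } \le \sum _{j=0}^\infty (2^{j+2}T)^R (2^jT)^{-\rho } \\&= 4^R T^{-(\rho -R)} \sum _{j=0}^\infty 2^{-j(\rho -R)} = \frac{4^R}{1-2^{-(\rho -R)}}\, T^{-(\rho -R)}. \end{aligned}$$

Combined with (2.33) and multiplied by \(P^{n-dR}\), this gives a bound of the shape \(P^{n-dR-\Delta ({\mathscr {C}}-\varepsilon '-R)}\).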

It remains to prove (2.35). Let \(\chi \) be the indicator function of the box \([-\tfrac{1}{2},\tfrac{1}{2}]^R\). We must evaluate the limit

$$\begin{aligned} \lim _{P \rightarrow \infty } \tfrac{1}{P^{n-dR}} \lambda \big \{ \mathbf {t}\in \mathbb {R}^n :\tfrac{1}{P}\mathbf {t} \in \mathscr {B}, \left|f^{[d]}_1(\mathbf {t})\right| \le \tfrac{1}{2} ,\ldots , \left|f^{[d]}_R(\mathbf {t})\right| \le \tfrac{1}{2} \big \} \\ = \lim _{P\rightarrow \infty } \tfrac{1}{P^{n-dR}} \int _{\begin{array}{c} \mathbf {t}\in \mathbb {R}^n \\ \mathbf {t}/P \in \mathscr {B} \end{array}} \chi \big ( f^{[d]}_1(\mathbf {t}) ,\ldots , f^{[d]}_R(\mathbf {t}) \big ) \,\mathrm {d}\mathbf {t}. \end{aligned}$$
(2.38)

Let \(\varphi \) be any infinitely differentiable, compactly supported function on \(\mathbb {R}^R\), taking values in \([0,1]\). We evaluate \(\frac{1}{P^{n-dR}}\int _{\mathbf {t}/P\in \mathscr {B}} \varphi ( f^{[d]}_1(\mathbf {t}) ,\ldots , f^{[d]}_R(\mathbf {t}) )\,\mathrm {d}\mathbf {t}\), which we think of as a smoothed version of (2.38). Fourier inversion gives

$$\begin{aligned} \int _{\begin{array}{c} \mathbf {t}\in \mathbb {R}^n \\ \mathbf {t}/P \in \mathscr {B} \end{array}} \varphi \big ( f^{[d]}_1(\mathbf {t}) ,\ldots , f^{[d]}_R(\mathbf {t}) \big ) \,\mathrm {d}\mathbf {t}&= \int _{\begin{array}{c} \mathbf {t}\in \mathbb {R}^n \\ \mathbf {t}/P \in \mathscr {B} \end{array}} \int _{\mathbb {R}^R} \hat{\varphi }(\mathbf {\alpha }) e\left( \mathbf {\alpha }\cdot \mathbf {f}^{[d]}(\mathbf {t}) \right) \,\mathrm {d}\mathbf {\alpha } \,\mathrm {d}\mathbf {t} \nonumber \\&= \int _{\mathbb {R}^R} \hat{\varphi }(\mathbf {\alpha }) \int _{\begin{array}{c} \mathbf {t}\in \mathbb {R}^n \\ \mathbf {t}/P \in \mathscr {B} \end{array}} e\left( \mathbf {\alpha }\cdot \mathbf {f}^{[d]}(\mathbf {t}) \right) \,\mathrm {d}\mathbf {t} \,\mathrm {d}\mathbf {\alpha } \nonumber \\&= \int _{\mathbb {R}^R} \hat{\varphi }(\mathbf {\alpha }) P^n S_{\infty }(P^d\mathbf {\alpha }) \,\mathrm {d}\mathbf {\alpha } \end{aligned}$$
(2.39)

where \(\hat{\varphi }(\mathbf {\alpha })\) is the Fourier transform \( \int _{\mathbb {R}^R} \varphi (\mathbf {\gamma }) e\left( - \mathbf {\alpha }\cdot \mathbf {\gamma } \right) \,\mathrm {d}\mathbf {\gamma }\).

Since \( {\mathscr {C}}-\varepsilon ' >R\) holds by assumption, it follows from (2.33) that the function \(S_\infty \) is Lebesgue integrable. Hence (2.37) implies

$$\begin{aligned} \hat{\varphi }(\mathbf {0}) \mathfrak {I}\nonumber&= \int _{\mathbb {R}^R} \hat{\varphi }(\mathbf {0}) S_{\infty }( \mathbf {\gamma }) \,\mathrm {d}\mathbf {\gamma } \nonumber \\&= \lim _{P\rightarrow \infty } \int _{\mathbb {R}^R} \hat{\varphi }(P^{-d}\mathbf {\gamma }) S_{\infty }( \mathbf {\gamma }) \,\mathrm {d}\mathbf {\gamma } \nonumber \\&= \lim _{P\rightarrow \infty } P^{dR} \int _{\mathbb {R}^R} \hat{\varphi }(\mathbf {\alpha }) S_{\infty }(P^d \mathbf {\alpha }) \,\mathrm {d}\mathbf {\alpha }. \end{aligned}$$
(2.40)

Together (2.39) and (2.40) show that for any infinitely differentiable, compactly supported \(\varphi \) taking values in \([0,1]\), we have

$$\begin{aligned} \lim _{P\rightarrow \infty } \tfrac{1}{P^{n-dR}} \int _{\begin{array}{c} \mathbf {t}\in \mathbb {R}^n \\ \mathbf {t}/P \in \mathscr {B} \end{array}} \varphi \left( f^{[d]}_1(\mathbf {t}) ,\ldots , f^{[d]}_R(\mathbf {t}) \right) \,\mathrm {d}\mathbf {t} = \hat{\varphi }(\mathbf {0}) \mathfrak {I}. \end{aligned}$$
(2.41)

With \(\chi \) as in (2.38), choose \(\varphi \) such that \(\varphi (\mathbf {\gamma })\le \chi (\mathbf {\gamma })\) for all \(\mathbf {\gamma }\in \mathbb {R}^R\). Then by (2.38) and (2.41) we have

$$\begin{aligned} \liminf _{P \rightarrow \infty }\tfrac{1}{P^{n-dR}} \lambda \big \{ \mathbf {t}\in \mathbb {R}^n :\tfrac{1}{P}\mathbf {t} \in \mathscr {B}, \left|f^{[d]}_1(\mathbf {t})\right| \le \tfrac{1}{2} ,\ldots , \left|f^{[d]}_R(\mathbf {t})\right| \le \tfrac{1}{2} \big \} \ge \hat{\varphi }(\mathbf {0}) \mathfrak {I}. \end{aligned}$$

Letting \(\varphi \rightarrow \chi \) almost everywhere gives \( \hat{\varphi }(\mathbf {0}) \rightarrow 1\), so \(\mathfrak {I}\) is a lower bound for the limit inferior in (2.38). Repeating the argument with \(\varphi (\mathbf {\gamma })\ge \chi (\mathbf {\gamma })\) instead of \(\varphi (\mathbf {\gamma })\le \chi (\mathbf {\gamma })\) shows that \(\mathfrak {I}\) is also an upper bound for the corresponding limit superior, so the limit exists and is equal to \(\mathfrak {I}\).\(\square \)

2.5 The proof of Proposition 2.1

In this section we deduce Proposition 2.1 from Lemmas 2.2–2.6.

Proof of Proposition 2.1

Let \(P\ge 1\) and \(\Delta =\frac{1}{4R+6}\). By (2.3) we have

$$\begin{aligned} N_{f_1,\ldots ,f_R}(P) = \int _{\mathfrak {M}_{P,d,\Delta }} S\left( \mathbf {\alpha }; P\right) \,\mathrm {d}\mathbf {\alpha } +\int _{\mathfrak {m}_{P,d,\Delta }} S\left( \mathbf {\alpha }; P\right) \,\mathrm {d}\mathbf {\alpha }, \end{aligned}$$

where \(\mathfrak {M}_{P,d,\Delta }\), \(\mathfrak {m}_{P,d,\Delta }\) are as in Sect. 2.2. We apply Lemma 2.2 with

With these choices for T, \(E_0\), E and \(\delta \) we see that (2.9) follows from (2.1). Lemma 2.3 shows that \(\sup _{\mathbf {\alpha }\in \mathfrak {m}_{P,d,\Delta }}CT(\mathbf {\alpha }) \ll _\varepsilon P^{n-\delta }\), and after increasing C if necessary this gives us (2.10). This verifies the hypotheses of Lemma 2.2. Since we have \({\mathscr {C}}>dR\) by assumption, (2.11) gives

$$\begin{aligned} \int _{\mathfrak {m}_{P,d,\Delta }}S\left( \mathbf {\alpha }; P\right) \,\mathrm {d}\mathbf {\alpha } \ll _{C,{\mathscr {C}}} P^{n-dR-\Delta \delta _0\left( 1-\frac{dR}{{\mathscr {C}}}\right) +\varepsilon }. \end{aligned}$$
(2.42)

For the major arcs, since \(\Delta =\frac{1}{4R+6}\) we have by Lemma 2.4 that

$$\begin{aligned} \int _{\mathfrak {M}_{P,d,\Delta }} S\left( \mathbf {\alpha }; P\right) \,\mathrm {d}\mathbf {\alpha } = \mathfrak {S}(P)\mathfrak {I}(P) + O\big ( P^{n-dR-\frac{1}{2}}\big ) , \end{aligned}$$
(2.43)

where \(\mathfrak {S}(P)\), \(\mathfrak {I}(P)\) are as in Sect. 2.2. Since \({\mathscr {C}}>dR\) holds, the \(f_i(\mathbf {x})\) are linearly independent, and \(\varepsilon \) is small in terms of \({\mathscr {C}}\), d and R, both of Lemmas 2.5 and 2.6 apply. In particular (2.24) and (2.34) show that

$$\begin{aligned} \mathfrak {S}(P)\mathfrak {I}(P) = \mathfrak {S}\mathfrak {I}P^{n-dR} +O_{{\mathscr {C}},C} \big ( P^{n-dR-\Delta ({\mathscr {C}}-R)/2} \big ) +O_{{\mathscr {C}},C} \big ( P^{n-dR-\Delta \delta _1} \big ) \end{aligned}$$
(2.44)

where \(\delta _1 >0\) depends at most on \({\mathscr {C}}\), d and R. By (2.42), (2.43), and (2.44), the result holds.\(\square \)
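
The exponent in (2.43) can be checked directly against the error term in (2.20). With \(\Delta =\frac{1}{4R+6}\) we have

$$\begin{aligned} (2R+3)\Delta -1 = \frac{2R+3}{4R+6}-1 = \tfrac{1}{2}-1 = -\tfrac{1}{2}, \qquad \text {so}\qquad P^{n-dR+(2R+3)\Delta -1} = P^{n-dR-\frac{1}{2}}. \end{aligned}$$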

3 The auxiliary inequality

In this section we verify the hypothesis (2.1), assuming a bound on the number of solutions to the auxiliary inequality from Definition 1.1. The goal is the following result, proved at the end of Sect. 3.2.

Proposition 3.1

Let \(N^{{\text {aux}}}_{f}\left( B \right) \), \(\left||f\right||_\infty \) be as in Definition 1.1. Suppose that we are given \(C_0\ge 1\) and \({\mathscr {C}}>0\) such that for all \(\mathbf {\beta }\in \mathbb {R}^R\) and \(B\ge 1\) we have

$$\begin{aligned} N^{{\text {aux}}}_{\mathbf {\beta }\cdot \mathbf {f}}\left( B \right) \le C_0 B^{(d-1)n-2^{d}{\mathscr {C}}}. \end{aligned}$$
(3.1)

Further let \(M> \mu >0\) such that for all \(\mathbf {\beta }\in \mathbb {R}^R\) we have

$$\begin{aligned} \mu \left|| \mathbf {\beta } \right||_\infty \le \left|| \mathbf {\beta }\cdot \mathbf {f}^{[d]}\right||_\infty \le M \left|| \mathbf {\beta } \right||_\infty , \end{aligned}$$
(3.2)

noting that some such \(M, \mu \) exist whenever the forms \(f^{[d]}_i\) are linearly independent. Let \(\varepsilon >0\). Then there exists \(C\ge 1\), depending only on \(C_0,d,n,\mu ,M\) and \(\varepsilon \), such that the bound (2.1) holds for all \(P\ge 1 \) and all \(\mathbf {\alpha },\mathbf {\beta }\in \mathbb {R}^R\).

3.1 Weyl differencing

We prove (2.1) using the following estimate, which combines work of Birch [3, Lemma 2.4] and Bentkus and Götze [1, Theorem 5.1].

Definition 3.1

Let f, \(\mathbf {m}^{(f)} \left( \mathbf {x}^{(1)}, \ldots , \mathbf {x}^{(d-1)} \right) \) be as in Definition 1.1. Given \(B \ge 1\) and \(\delta >0\), we let \(U_{f} \left( B,\delta \right) \) be the number of \((d-1)\)-tuples of integer n-vectors \(\mathbf {x}^{(1)},\ldots ,\mathbf {x}^{(d-1)}\) such that

$$\begin{aligned} \left||\mathbf {x}^{(1)}\right||_\infty ,\ldots ,\left||\mathbf {x}^{(d-1)}\right||_\infty \le B, \qquad \min _{\mathbf {v}\in \mathbb {Z}^n} \left||\mathbf {v}-\mathbf {m}^{( f )} \left( \mathbf {x}^{(1)}, \ldots , \mathbf {x}^{(d-1)} \right) \right||_\infty < \delta . \end{aligned}$$

Lemma 3.1

Let \(U_{f} \left( B,\delta \right) \) be as in Definition 3.1. For all \(\varepsilon >0\), \(\mathbf {\alpha }, \mathbf {\beta }\in \mathbb {R}^R\) and \(\theta \in (0,1]\), we have

(3.3)

where the implicit constant depends only on \(d,n,\varepsilon \).

Proof of Lemma 3.1

Observe that (3.3) will follow if we can prove that

First we use an idea from the proof of Theorem 5.1 in Bentkus and Götze [1], also found in Lemma 2.2 of Müller [28], to eliminate \(\mathbf {\alpha }\). We have

for some real polynomials \(g_{\mathbf {\alpha },\mathbf {\beta },\mathbf {z}}(\mathbf {x})\) of degree at most \(d-1\) in \(\mathbf {x}\) and some boxes \(\mathscr {B}_{\mathbf {z}}\subset \mathscr {B}\). Now by the special case of Cauchy’s inequality \( \left|\sum _{i\in {\mathcal I}} \lambda _i\right|^2 \le \left( \#{\mathcal I}\right) \cdot \sum _{i\in {\mathcal I}} \left|\lambda _i\right|^2 \), we have

(3.4)

Bentkus and Götze used the double large sieve of Bombieri and Iwaniec [4] to bound the inner sum in (3.4) in the case when \(d=2\). We extend the argument to higher d by employing Lemma 2.4 of Birch [3], which states that

$$\begin{aligned} \left|S\left( \mathbf {\alpha }; P\right) \right|^{2^{d-1}} \ll _{d,n,\varepsilon } P^{2^{d-1}n-(d-1)n\theta +\varepsilon } U_{\mathbf {\alpha }\cdot \mathbf {f}} \left( P^\theta , P^{(d-1)\theta -d} \right) . \end{aligned}$$

The innermost sum in (3.4) has the same form as \(S\left( \mathbf {\alpha }; P\right) \), with \(\mathscr {B}_{\mathbf {z}}\) in place of \(\mathscr {B}\) and \(\mathbf {\beta }\cdot \mathbf {f}^{[d]}(\mathbf {x}) + g_{\mathbf {\alpha },\mathbf {\beta },\mathbf {z}}(\mathbf {x}) \) in place of \(\mathbf {\alpha }\cdot \mathbf {f}\) as the underlying polynomial. The degree of \( g_{\mathbf {\alpha },\mathbf {\beta },\mathbf {z}}\) is at most \(d-1\), so \(\mathbf {\beta }\cdot \mathbf {f}^{[d]}(\mathbf {x})\) is the leading part of this polynomial. So applying Birch’s result to the innermost sum in (3.4) shows

as \(U_{f}\) depends only on the degree d part of f. With (3.4) this proves the result.\(\square \)

3.2 Proof of Proposition 3.1

Proof of Proposition 3.1

Let us first suppose that for some \(\theta >0\) we have

$$\begin{aligned} N^{{\text {aux}}}_{\mathbf {\beta }\cdot \mathbf {f}}\left( P^\theta \right) < U_{\mathbf {\beta }\cdot \mathbf {f}} \left( P^\theta , P^{(d-1)\theta -d} \right) . \end{aligned}$$
(3.5)

Then there must be a \((d-1)\)-tuple of vectors \(\mathbf {x}^{(1)},\ldots ,\mathbf {x}^{(d-1)} \in \mathbb {Z}^n\) which is included in the count \(U_{\mathbf {\beta }\cdot \mathbf {f}} \left( P^\theta , P^{(d-1)\theta -d} \right) \) but not in \(N^{{\text {aux}}}_{\mathbf {\beta }\cdot \mathbf {f}}\left( P^\theta \right) \).

Since the \((d-1)\)-tuple \((\mathbf {x}^{(1)},\ldots ,\mathbf {x}^{(d-1)})\) is counted by \(U_{\mathbf {\beta }\cdot \mathbf {f}} \left( P^\theta , P^{(d-1)\theta -d} \right) \), the inequality \(\left||\mathbf {x}^{(i)}\right||_\infty \le P^\theta \) holds for each \(i=1,\ldots ,d-1\) and we have the bound

$$\begin{aligned} \left||\mathbf {v}-\mathbf {m}^{( \mathbf {\beta } \cdot \mathbf {f})} \left( \mathbf {x}^{(1)}, \ldots , \mathbf {x}^{(d-1)} \right) \right||_\infty&< P^{(d-1)\theta -d}, \end{aligned}$$
(3.6)

for some \(\mathbf {v}\in \mathbb {Z}^n\). Since this \((d-1)\)-tuple \((\mathbf {x}^{(1)},\ldots ,\mathbf {x}^{(d-1)})\) is not counted by \(N^{{\text {aux}}}_{\mathbf {\beta }\cdot \mathbf {f}}\left( P^\theta \right) \), we must also have

$$\begin{aligned} \left|| \mathbf {m}^{( \mathbf {\beta } \cdot \mathbf {f})} \left( \mathbf {x}^{(1)}, \ldots , \mathbf {x}^{(d-1)} \right) \right||_\infty&\ge \left||\mathbf {\beta }\cdot \mathbf {f}^{[d]}\right||_\infty P^{(d-2)\theta }. \end{aligned}$$
(3.7)

We use (3.6) and (3.7) to relate \(P^\theta \) and \(\left||\mathbf {\beta }\right||_\infty \). It follows from (3.6) that either

$$\begin{aligned}&\left||\mathbf {m}^{( \mathbf {\beta } \cdot \mathbf {f})} \left( \mathbf {x}^{(1)}, \ldots , \mathbf {x}^{(d-1)} \right) \right||_\infty < P^{(d-1)\theta -d} \end{aligned}$$
(3.8)

or

$$\begin{aligned} \left||\mathbf {m}^{( \mathbf {\beta } \cdot \mathbf {f})} \left( \mathbf {x}^{(1)}, \ldots , \mathbf {x}^{(d-1)} \right) \right||_\infty \ge \frac{1}{2}. \end{aligned}$$
(3.9)
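
The dichotomy can be justified as follows. This sketch assumes that \(P^{(d-1)\theta -d}\le \tfrac{1}{2}\), which holds for instance whenever \(\theta \le 1\) and \(P\ge 2\); it writes \(\mathbf {m}\) as shorthand for \(\mathbf {m}^{( \mathbf {\beta } \cdot \mathbf {f})} ( \mathbf {x}^{(1)}, \ldots , \mathbf {x}^{(d-1)} )\) and takes \(\mathbf {v}\in \mathbb {Z}^n\) as in (3.6). If \(\mathbf {v}=\mathbf {0}\), then (3.6) is exactly (3.8). If \(\mathbf {v}\ne \mathbf {0}\), then \(\left||\mathbf {v}\right||_\infty \ge 1\) and the triangle inequality gives

$$\begin{aligned} \left||\mathbf {m}\right||_\infty \ge \left||\mathbf {v}\right||_\infty - \left||\mathbf {v}-\mathbf {m}\right||_\infty > 1 - P^{(d-1)\theta -d} \ge \tfrac{1}{2}, \end{aligned}$$

which is (3.9).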

When (3.8) holds, then (3.7) implies

$$\begin{aligned} \left||\mathbf {\beta }\cdot \mathbf {f}^{[d]}\right||_\infty < \frac{P^{(d-1)\theta -d}}{P^{(d-2)\theta }} = P^{\theta -d}. \end{aligned}$$
(3.10)

When on the other hand (3.9) holds, then the bound \(\left||\mathbf {x}^{(i)}\right||_\infty \le P^\theta \) implies

$$\begin{aligned} \left|| \mathbf {m}^{( \mathbf {\beta } \cdot \mathbf {f})} \left( \mathbf {x}^{(1)}, \ldots , \mathbf {x}^{(d-1)} \right) \right||_\infty \ll \left||\mathbf {\beta }\cdot \mathbf {f}^{[d]}\right||_\infty P^{(d-1)\theta }, \end{aligned}$$

and it follows by (3.9) that

$$\begin{aligned} \left||\mathbf {\beta }\cdot \mathbf {f}^{[d]}\right||_\infty \gg P^{-(d-1)\theta }. \end{aligned}$$
(3.11)

Either (3.10) or (3.11) holds. So by rearranging and applying (3.2) we infer

(3.12)

We have shown that (3.5) implies (3.12). Now Lemma 3.1 shows that for \(\theta \in (0,1]\) we have

and together with our assumption (3.1) this implies that (3.5) will hold provided that \(\theta \in (0,1]\) and that

(3.13)

for some \(C_1 \ge 1\) depending only on \(C_0,d,n\) and \(\varepsilon \). Define \(\theta \) by

(3.14)

so that equality holds in (3.13). We consider three cases.

The first case is when \(\theta \le 0\) holds. We can rule this out. If \(\theta \le 0\) then (3.14) gives

(3.15)

To prove (2.1), we can assume without loss of generality that \(P \gg _{\varepsilon } 1\) holds. But then (3.15) is false, since \(\left|S\left( \mathbf {\alpha }; P\right) \right|\le (P+1)^n\) by the definition (1.11).

The second case is when \(0 < \theta \le 1\) holds. Our choice (3.14) for the parameter \(\theta \) then ensures that (3.13) holds. We saw above that when \(\theta \in (0,1]\), the bound (3.13) implies the inequality (3.5). We also saw that (3.5) leads to the estimate (3.12). This estimate (3.12) implies the conclusion (2.1) of the proposition upon substituting in the value of \(\theta \) from (3.14) and choosing C to satisfy the bound \(C \gg _{\mu ,M} C_1^{1/2^d}\).

The third and last case is when \(\theta > 1\) holds. In this case we have by (3.14) that

(3.16)

Now for any \(t>0\) we have \(\max \left\{ P^{-d} t^{-1},\,t^{\frac{1}{d-1}}\right\} \ge P^{-1}\), and hence

So (2.1) follows from (3.16) on choosing C such that \(C\ge C_1^{1/2^d}\) holds.\(\square \)
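
The elementary inequality \(\max \left\{ P^{-d} t^{-1},\,t^{\frac{1}{d-1}}\right\} \ge P^{-1}\) used in the third case can be verified by splitting at \(t=P^{-(d-1)}\). This is a routine check, valid for \(d\ge 2\), \(P\ge 1\) and \(t>0\):

$$\begin{aligned} t\ge P^{-(d-1)}&\implies t^{\frac{1}{d-1}} \ge P^{-1}, \\ t< P^{-(d-1)}&\implies P^{-d}t^{-1} > P^{-d}P^{d-1} = P^{-1}. \end{aligned}$$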

4 The proof of Theorems 1.2 and 1.3

Proof of Theorem 1.3

Let \({N_{F_1,\ldots ,F_R}}(P)\) be as in (2.2). Set \(f_i = F_i\), and apply Propositions 2.1 and 3.1. This shows that

$$\begin{aligned} N_{f_1,\ldots ,f_R}(P) = \mathfrak {S}\mathfrak {I}P^{n-dR} +O_{C,f_1,\ldots ,f_R}(P^{n-dR-\delta }), \end{aligned}$$
(4.1)

where \(\delta = \delta ({\mathscr {C}},d,R)\) is positive. It remains to prove that \(\mathfrak {I}\) and \(\mathfrak {S}\) are positive under the conditions given in the theorem. Note that since \(V(F_1,\ldots ,F_R)\) has dimension \(n-1-R\), a smooth point corresponds to a solution of the equations

$$\begin{aligned} F_1(\mathbf {x}) = 0,\ldots ,F_R(\mathbf {x})=0 \end{aligned}$$
(4.2)

at which the \(R\times n\) Jacobian matrix \( \left( \partial F_i(\mathbf {x})/\partial x_j\right) _{ij}\) has full rank.

Let \(\mathbf {x}=\mathbf {r}\) be a real solution to (4.2) at which the matrix \( \left( \partial F_i(\mathbf {x})/\partial x_j\right) _{ij}\) has full rank, and for which \(\mathbf {r}\in \mathscr {B}\). Applying the Implicit Function Theorem to the equations (4.2) at the point \(\mathbf {r}\), we find an open set \(U\subset \mathscr {B}\) on which the solutions to (4.2) form an \((n-R)\) dimensional real manifold. Considering a small neighbourhood of this manifold shows that for all \(\varepsilon \in (0,1]\) we have

$$\begin{aligned} \lambda \big \{ \mathbf {s} \in U :\left|F_1(\mathbf {s})\right| \le \varepsilon , \ldots , \left|F_R(\mathbf {s})\right| \le \varepsilon \big \} \gg _{F_1,\ldots ,F_R} \varepsilon ^R \end{aligned}$$

where \(\lambda \) is the Lebesgue measure. Letting \(\mathbf {t} = P\mathbf {s}\) and \(\varepsilon = \tfrac{1}{2}P^{-d}\), we see that

$$\begin{aligned} \lambda \big \{ \mathbf {t}\in \mathbb {R}^n:\mathbf {t}/P \in U, \left|F_1(\mathbf {t})\right| \le \tfrac{1}{2}, \ldots , \left|F_R(\mathbf {t})\right| \le \tfrac{1}{2}\big \} \gg _{F_1,\ldots ,F_R} P^{n-dR}, \end{aligned}$$

and (2.35) from Lemma 2.6 then shows that \(\mathfrak {I}\) is positive.
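
The rescaling step can be written out explicitly. The following check uses only that each \(F_i\) is homogeneous of degree d, so that \(F_i(P\mathbf {s})=P^dF_i(\mathbf {s})\), together with the lower bound \(\gg \varepsilon ^R\) stated above applied with \(\varepsilon =\tfrac{1}{2}P^{-d}\):

$$\begin{aligned} \lambda \big \{ \mathbf {t}\in \mathbb {R}^n:\mathbf {t}/P \in U, \left|F_i(\mathbf {t})\right| \le \tfrac{1}{2} \text { for all } i\big \}&= P^n \lambda \big \{ \mathbf {s}\in U: \left|F_i(\mathbf {s})\right| \le \tfrac{1}{2}P^{-d} \text { for all } i\big \} \\&\gg _{F_1,\ldots ,F_R} P^n \left( \tfrac{1}{2}P^{-d}\right) ^R = 2^{-R}P^{n-dR}. \end{aligned}$$

Since \(U\subset \mathscr {B}\), the set measured in (2.35) contains the set on the left-hand side.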

To show that \(\mathfrak {S}\) is positive under the conditions given in the theorem we use a variant of Hensel’s Lemma. Let p be a prime and let \(\mathbf {a}\in \mathbb {Z}_p^n\). Suppose that \(\mathbf {x}=\mathbf {a}\) is a solution to the system \(f_i(\mathbf {x})=\mathbf {0}\) for which the Jacobian matrix \(\left( \partial f_i(\mathbf {x})/\partial x_j\right) _{ij}\) is nonsingular. After permuting the variables \(x_i\) if necessary, we can assume that the submatrix \(M(\mathbf {x})\) consisting of the last R columns of \(\left( \partial f_i(\mathbf {x})/\partial x_j\right) _{ij}\) is nonsingular at \(\mathbf {x}=\mathbf {a}\).

The so-called valuation theoretic Implicit Function Theorem then applies to the polynomials \(f_i\) with the common zero \(\mathbf {a}\) over the valued field \(\mathbb {Q}_p\). This is essentially a version of Hensel’s Lemma; see Kuhlmann [21, Theorem 25]. If we write \(\left|\det M(\mathbf {a})\right|_p = p^{-\alpha }\), the theorem states that for all p-adic numbers \(a'_1,\ldots ,a'_{n-R}\in \mathbb {Q}_p\) with \(\left|a'_i-a_i\right|_p < p^{-2\alpha }\), there are unique p-adic numbers \(a'_{n-R+1},\ldots ,a'_{n}\in \mathbb {Q}_p\) with \(\left|a'_i-a_i\right|_p < p^{-\alpha }\) such that each \(f_i(\mathbf {a}')=0\).

Now let \(a'_1,\ldots ,a'_{n-R}\) be p-adic integers satisfying \(a'_i \equiv a_i\) modulo \(p^{2\alpha +1}\). For each integer \(k\ge 2\alpha +1\) there are \(p^{(k-2\alpha -1)(n-R)}\) choices for the tuple \((a'_1,\ldots ,a'_{n-R})\) which are distinct modulo \(p^k\), and by the theorem above each one extends to a vector of p-adic integers \(\mathbf {a}'\) satisfying \(\mathbf {f}(\mathbf {a}')=\mathbf {0}\).

If this holds for each prime p, then \(\mathfrak {S}\) is positive. Indeed, reducing the vectors \(\mathbf {a}'\) modulo \(p^k\) then gives \(\gg _{\mathbf {f},p} p^{k(n-R)}\) distinct vectors \(\mathbf {b}\in \left\{ 1,\ldots ,p^k\right\} ^n\) satisfying the system of congruences \(f_i(\mathbf {b})\equiv 0\) modulo \(p^k\). The equality (2.25) then shows that \(\mathfrak {S}>0\). \(\square \)
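
As a numerical sanity check of the counting step, the following Python sketch counts the solutions of a system of congruences modulo \(p^k\) by brute force and compares the count with \(p^{k(n-R)}\). The form \(x^2+y^2-z^2\), the prime \(p=3\) and the helper names are illustrative assumptions and are not taken from the argument above. Residues are taken in \(\{0,\ldots ,p^k-1\}\) rather than \(\{1,\ldots ,p^k\}\), which gives the same count.

```python
from itertools import product


def count_solutions_mod(polys, n, p, k):
    """Count b in (Z/p^k)^n with f(b) = 0 (mod p^k) for every f in polys."""
    q = p ** k
    return sum(
        1
        for b in product(range(q), repeat=n)
        if all(f(*b) % q == 0 for f in polys)
    )


def cone(x, y, z):
    # Hypothetical example form (an assumption for illustration): x^2 + y^2 - z^2.
    return x * x + y * y - z * z


if __name__ == "__main__":
    n, R, p = 3, 1, 3
    for k in (1, 2, 3):
        count = count_solutions_mod([cone], n, p, k)
        # Ratio of the count to p^{k(n-R)}, the order of growth in the proof.
        print(k, count, count / p ** (k * (n - R)))
```

For this example the printed ratio stays bounded away from zero as k grows, which is the behaviour that the lower bound \(\gg _{\mathbf {f},p} p^{k(n-R)}\) describes.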

Proof of Theorem 1.2

We let \({\mathscr {C}}= \frac{n-R+1}{4}\), and apply Theorem 1.3 to the system of forms \(F_i\). The result will follow if we can show that (1.8) holds, which is to say that

$$\begin{aligned} N^{{\text {aux}}}_{\mathbf {\beta }\cdot \mathbf {F}} (B) \ll B^{\sigma _\mathbb {R}} \end{aligned}$$
(4.3)

for all \(\mathbf {\beta }\in \mathbb {R}^R\) and all \(B\ge 1\). Here the quantity \(\sigma _\mathbb {R}\) is defined by (1.4).

For each \(\mathbf {\beta }\in \mathbb {R}^R\), let the matrix of the quadratic form \(\mathbf {\beta }\cdot \mathbf {F}\) be \(M\left( \mathbf {\beta }\right) \). That is, \(M\left( \mathbf {\beta }\right) \) is the unique real \(n\times n\) symmetric matrix with

$$\begin{aligned} \mathbf {\beta }\cdot \mathbf {F}(\mathbf {x}) = \mathbf {x}^T M\left( \mathbf {\beta }\right) \mathbf {x}. \end{aligned}$$

Then we have

$$\begin{aligned} \mathbf {m}^{(\mathbf {\beta }\cdot \mathbf {F})} \left( \mathbf {u} \right) = 2M\left( \mathbf {\beta }\right) \mathbf {u}, \end{aligned}$$

so \(N^{{\text {aux}}}_{\mathbf {\beta }\cdot \mathbf {F}}\left( B\right) \) counts vectors \(\mathbf {u}\in \mathbb {Z}^n\) satisfying

$$\begin{aligned} \left||\mathbf {u}\right||_\infty \le B, \qquad \left||M\left( \mathbf {\beta }\right) \mathbf {u}\right||_\infty \le \tfrac{1}{2}\left||\mathbf {\beta }\cdot \mathbf {F}\right||_\infty . \end{aligned}$$

These vectors \(\mathbf {u}\) are all contained in the box \(\left||\mathbf {u}\right||_\infty \le B\), and in the ellipsoid

$$\begin{aligned} E(\mathbf {\beta }) = \left\{ \mathbf {t}\in \mathbb {R}^n :\mathbf {t}^T { M\left( \mathbf {\beta }\right) ^T M\left( \mathbf {\beta }\right) } \mathbf {t} < n\cdot \left||\mathbf {\beta }\cdot \mathbf {F}\right||_\infty ^2 \right\} . \end{aligned}$$

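The ellipsoid has principal radii \(\left|\lambda \right|^{-1}\sqrt{n}\left||\mathbf {\beta }\cdot \mathbf {F}\right||_\infty \) where \(\lambda \) runs over the eigenvalues of the real symmetric matrix \(M\left( \mathbf {\beta }\right) \), counted with multiplicity. Hence

$$\begin{aligned} N^{{\text {aux}}}_{\mathbf {\beta }\cdot \mathbf {F}} (B) \ll _n \prod _{\lambda } \min \left\{ \left|\lambda \right|^{-1}\left||\mathbf {\beta }\cdot \mathbf {F}\right||_\infty +1,\, B \right\} , \end{aligned}$$

where \(\lambda \) is as before and a factor with \(\lambda =0\) is interpreted as B. So to prove (4.3) it suffices that at least \(n-\sigma _\mathbb {R}\) of the \(\lambda \) satisfy \(\left|\lambda \right| \gg \left||\mathbf {\beta }\cdot \mathbf {F}\right||_\infty \).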

Suppose for a contradiction that this is false. Then there exists a sequence \(\mathbf {\beta }^{(i)}\in \mathbb {R}^R\) such that at least \(\sigma _\mathbb {R}+1\) of the eigenvalues of \(M\left( \mathbf {\beta }^{(i)}\right) \) satisfy \(\lambda = o(\left||\mathbf {\beta }^{(i)}\cdot \mathbf {F}\right||_\infty )\) as \(i\rightarrow \infty \). By passing to a subsequence, we can assume \(\mathbf {\beta }^{(i)}/\left||\mathbf {\beta }^{(i)}\right||_\infty \rightarrow \mathbf {\beta }\) for some nonzero \(\mathbf {\beta }\in \mathbb {R}^R\), and then at least \(\sigma _\mathbb {R}+1\) of the eigenvalues of \(M\left( \mathbf {\beta }\right) \) must be zero. In other words,

$$\begin{aligned} \dim {\text {Sing}}V(\mathbf {\beta }\cdot \mathbf {F}) \ge \sigma _\mathbb {R}. \end{aligned}$$

But this contradicts the definition (1.4). So (4.3) holds as claimed. \(\square \)
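
The counting problem in the proof above can be explored numerically. The sketch below counts, by brute force, the integer vectors \(\mathbf {u}\) with \(\left||\mathbf {u}\right||_\infty \le B\) and \(\left||M\mathbf {u}\right||_\infty \le \tfrac{1}{2}\left||M\right||\) for a fixed integer symmetric matrix M standing in for \(M\left( \mathbf {\beta }\right) \). It assumes that \(\left||\mathbf {\beta }\cdot \mathbf {F}\right||_\infty \) may be taken to be the largest absolute entry of \(M\left( \mathbf {\beta }\right) \). This normalisation, the example matrices and the function name are assumptions made for illustration only.

```python
from itertools import product


def aux_count(M, B):
    """Count u in Z^n with |u|_inf <= B and |M u|_inf <= (1/2) * max|M_ij|.

    M is a square integer matrix given as a list of rows. The norm of the
    quadratic form is taken to be max|M_ij| (an assumption, see the text).
    """
    n = len(M)
    norm = max(abs(entry) for row in M for entry in row)
    count = 0
    for u in product(range(-B, B + 1), repeat=n):
        Mu = [sum(M[i][j] * u[j] for j in range(n)) for i in range(n)]
        # Compare 2|(Mu)_i| with the norm to stay in integer arithmetic.
        if all(2 * abs(c) <= norm for c in Mu):
            count += 1
    return count


# Hypothetical examples: one zero eigenvalue, and no zero eigenvalue.
RANK_TWO = [[1, 0, 0], [0, 1, 0], [0, 0, 0]]
NONSINGULAR = [[1, 0, 0], [0, 1, 0], [0, 0, -1]]

if __name__ == "__main__":
    for B in (5, 10, 20):
        print(B, aux_count(RANK_TWO, B), aux_count(NONSINGULAR, B))
```

In the first example M has exactly one zero eigenvalue and the count is \(2B+1\), growing linearly in B. In the second M is nonsingular and only \(\mathbf {u}=\mathbf {0}\) is counted. This matches the principle used in the proof: each eigenvalue of size \(\gg \left||\mathbf {\beta }\cdot \mathbf {F}\right||_\infty \) contributes a bounded factor, and each remaining eigenvalue contributes a factor of at most B.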

As alluded to after Lemma 1.1, the argument used to prove Theorems 1.2 and 1.3 also yields weak approximation for \(V(F_1,\ldots ,F_R)\) if that variety is smooth. It suffices to show that if the system \(F_i(q\mathbf {x}-\mathbf {a})=0\) has solutions in the p-adic integers for each p, then it has integral solutions \(\mathbf {x}\) with \(\frac{\mathbf {x}}{\left||\mathbf {x}\right||_\infty }\) arbitrarily close to \(\frac{\mathbf {r}}{\left||\mathbf {r}\right||_\infty }\), for any fixed real solution \(\mathbf {r}\) to the system \(F_i(\mathbf {r})=0\). For this one can let \(\mathscr {B}\) be a sufficiently small box containing \(\frac{\mathbf {r}}{\left||\mathbf {r}\right||_\infty }\), and repeat the proof of Theorems 1.2 and 1.3 with the choice \({f}_i(\mathbf {x}) = F_i(q\mathbf {x}-\mathbf {a})\) instead of \(f_i=F_i\) at the start of the proof of Theorem 1.3. Since \(N^{{\text {aux}}}_{\mathbf {\beta }\cdot \mathbf {f}}(B)=N^{{\text {aux}}}_{\mathbf {\beta }\cdot \mathbf {F}}(B)\) we obtain (4.1) as before. Recalling that any real or p-adic point of \(V(F_1,\ldots ,F_R)\) must be smooth, the argument to prove that \(\mathfrak {I},\mathfrak {S}\) are positive goes through and we obtain the existence of an integral solution of the required kind.

Proof of Lemma 1.1

We prove the first inequality in (1.6). Let \(\mathbf {\beta }\in \mathbb {R}^R{\setminus }\left\{ \mathbf {0}\right\} \) be such that

$$\begin{aligned} \sigma _\mathbb {R}=1+\dim {\text {Sing}}V(\mathbf {\beta }\cdot \mathbf {F}). \end{aligned}$$

Without loss of generality we may suppose that \(\beta _R \) is nonzero. Then we have

$$\begin{aligned} V(F_1,\ldots ,F_R) = V(F_1,\ldots ,F_{R-1},\mathbf {\beta }\cdot \mathbf {F}). \end{aligned}$$

Since \(V(F_1,\ldots ,F_{R})\) has dimension \(n-1-R\), every point of this variety at which the Jacobian matrix has rank less than R is singular, and it follows that

$$\begin{aligned} V(F_1,\ldots ,F_{R-1}) \cap {\text {Sing}}V(\mathbf {\beta }\cdot \mathbf {F}) \subset {\text {Sing}}V(F_1,\ldots ,F_R) \end{aligned}$$

and so \( V(F_1,\ldots ,F_{R-1}) \cap {\text {Sing}}V(\mathbf {\beta }\cdot \mathbf {F}) =\emptyset , \) as \(V(F_1,\ldots ,F_{R})\) is smooth. Now \(V(F_1,\ldots ,F_{R-1})\) has dimension at least \(n-R\) and \({\text {Sing}}V(\mathbf {\beta }\cdot \mathbf {F})\) is a linear subspace of \(\mathbb {P}^{n-1}\), so the two could not be disjoint if the latter had dimension \(R-1\) or more. It follows that \(\dim {\text {Sing}}V(\mathbf {\beta }\cdot \mathbf {F})\le R-2\), that is \(\sigma _\mathbb {R}\le R-1\), which proves the first inequality in (1.6).

The second inequality in (1.6) follows from the work of Browning and Heath-Brown [7]. In those authors’ formula (1.3), set

$$\begin{aligned} D=2, \quad r_1=0, \quad r_2=R, \quad F_{i,2} =F_i. \end{aligned}$$

Now the \(R\times n\) Jacobian matrix \( \left( \partial F_i(\mathbf {x})/\partial x_j\right) _{ij}\) has full rank at every nonzero solution \(\mathbf {x}\in \bar{\mathbb {Q}}^n\) to \(F_1(\mathbf {x})=\cdots =F_R(\mathbf {x})=0\), because \(V(F_1,\ldots ,F_R)\) is smooth of dimension \(n-1-R\). This makes \(F_{i,j}\) a “nonsingular system” in the sense of Browning and Heath-Brown, as defined in their formula (1.7). The next step is to replace \(F_{i,d}\) with an “equivalent optimal system”. The comments after formula (1.7) of those authors show that in our case this means replacing \(F_i\) with \(\sum _j A_{ij}F_j\), where A is an invertible linear transformation. In particular this preserves \(V(F_1,\ldots ,F_R)\) and W. Now their formulae (1.4) and (1.8) show that \( B_2 \le R-1 \), where \( B_2 = 1+ \dim (W). \) This proves (1.6). \(\square \)
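
The bound \(\sigma _\mathbb {R}\le R-1\) can be checked by hand on a diagonal pencil of two quadrics, and the Python sketch below does this. It reads \(\sigma _\mathbb {R}\) as the largest nullity of \(M\left( \mathbf {\beta }\right) \) over nonzero real \(\mathbf {\beta }\), which is how the quantity enters the proof of Theorem 1.2 above; this reading of (1.4) is an assumption here. The forms \(F_1=\sum _i x_i^2\) and \(F_2=\sum _i a_i x_i^2\), the coefficient lists and the function names are likewise hypothetical choices made for illustration. For these forms \(M\left( \mathbf {\beta }\right) \) is the diagonal matrix with entries \(\beta _1+\beta _2 a_i\), and \(V(F_1,F_2)\) is smooth when the \(a_i\) are distinct.

```python
def pencil_nullity(a, beta):
    """Nullity of M(beta) = diag(beta1 + beta2 * a_i) for the diagonal pencil."""
    b1, b2 = beta
    return sum(1 for ai in a if b1 + b2 * ai == 0)


def max_nullity(a):
    """Largest nullity of M(beta) over nonzero real beta.

    The nullity is unchanged when beta is rescaled, so it is enough to test
    beta = (1, 0) and beta = (-a_i, 1); every other beta gives nullity 0.
    """
    candidates = [(1, 0)] + [(-ai, 1) for ai in a]
    return max(pencil_nullity(a, beta) for beta in candidates)


if __name__ == "__main__":
    R = 2
    distinct = [1, 2, 3, 4, 5]   # distinct a_i: smooth intersection
    repeated = [1, 1, 2, 3, 4]   # repeated a_i: singular intersection
    print(max_nullity(distinct), "<=", R - 1)
    print(max_nullity(repeated))
```

With distinct coefficients the largest nullity is 1, which equals \(R-1\) for \(R=2\), so the first inequality in (1.6) is attained in this example. Repeating a coefficient makes \(V(F_1,F_2)\) singular and raises the largest nullity to 2.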