1 Introduction and results

The Erdős Discrepancy Problem (EDP), formulated in [4] (see also [8, 19] for related questions), states that, given any sequence \(f: \mathbb {N} \rightarrow \{-1,+1\}\), the discrepancy of f on homogeneous arithmetic progressions satisfies

$$\begin{aligned} \sup _{d, N \ge 1} \Big |\sum _{n \le N} f(dn)\Big | = \infty . \end{aligned}$$
(1)

This was eventually settled affirmatively in a groundbreaking paper of Tao [17] in 2015.

The special case where f is completely multiplicative (that is, \(f(mn)=f(m)f(n)\) for all \(m,n\in \mathbb {N}\)) was already highlighted by Erdős as the key special case; in this case, the formulation simplifies to

$$\begin{aligned} \sup _{N \ge 1} \Big |\sum _{n \le N} f(n)\Big | = \infty ,\quad f:\mathbb {N}\rightarrow \{-1,+1\}\,\,\text {completely multiplicative}. \end{aligned}$$
(2)

The Polymath 5 online collaboration project [15] devoted to the Erdős discrepancy problem was indeed able to reduce the proof of (1) to (an averaged version of) the completely multiplicative case (2), with f now taking values on the unit circle \(S^1:=\{z\in \mathbb {C}:\,\, |z|=1\}\) of the complex plane. Tao established in [17] this case of completely multiplicative functions, and hence the whole conjecture (1), making crucial use of his proof [18] of the logarithmic two-point Elliott conjecture on correlations of multiplicative functions. A further reason to concentrate on the discrepancy of completely multiplicative sequences is that such sequences or small perturbations thereof are speculated to have minimal growth rate for the discrepancy among all sequences, as discussed below.

In this paper we shall consider corresponding discrepancy problems in function fields. Let q be a fixed prime power and let \(\mathcal {M}\) denote the set of monic polynomials in \(\mathbb {F}_q[t]\); this set \(\mathcal {M}\) is an analogue of the positive integers. For elements of \(\mathcal {M}\) we have a unique factorization into products of irreducible monic polynomials (prime polynomials). Let \(\text {deg}(G)\) denote the degree of \(G\in \mathbb {F}_q[t].\) For completely multiplicative functions \(f: \mathcal {M} \rightarrow \{-1,+1\}\) (that is, functions that satisfy \(f(G_1G_2)=f(G_1)f(G_2)\) for all \(G_1,G_2\in \mathcal {M}\)), it is known (see e.g. [5]) that the partial sums

$$\begin{aligned}\sigma _f(n):=\sum _{\begin{array}{c} G \in \mathcal {M} \\ \text {deg}(G) \le n \end{array}}f(G)\end{aligned}$$

behave rather differently from their number field counterparts. In particular, in the Polymath 5 project [16] it was observed that if we define the long sum discrepancy

$$\begin{aligned} \mathcal {D}_f:= \sup _{\begin{array}{c} D\in \mathcal {M} \\ N \ge 1 \end{array}} \Big |\sum _{\begin{array}{c} G \in \mathcal {M} \\ \text {deg}(G) \le N \end{array}} f(DG)\Big |, \end{aligned}$$
(3)

then the Erdős discrepancy question for \(\mathcal {D}_f\) has a negative answer, in the sense that there exists even a completely multiplicative \(f:\mathcal {M}\rightarrow \{-1,+1\}\) such that \(\mathcal {D}_f<\infty .\) In fact, without much additional difficulty we can prove the following.

Proposition 1.1

There are uncountably many completely multiplicative functions \(f: \mathcal {M} \rightarrow \{-1,+1\}\) for which \(\mathcal {D}_f < \infty \).

One of the main goals of the present paper is to characterize the boundedness of partial sums of completely multiplicative functions in function fields, discovering along the way difficulties and features that are not present in the integer setting. We apply this to demonstrate that there are natural formulations of the Erdős discrepancy problem in function fields that in contrast have an affirmative answer for completely multiplicative sequences (see Theorem 1.5). Another consequence of our work is further evidence towards the widely-believed conjecture over \(\mathbb {Z}\) that the functions whose discrepancies are of slowest possible growth are “modified” characters (Conjecture 1.4). See Theorems 1.3 and 1.8 for a precise statement (and Definition 1.6 for the notion of modified characters).

1.1 Extremizers for the short sum discrepancy

The main reason why \(\mathcal {D}_f\) is not suitably well-behaved in function fields is that long intervals are too coarse to witness discrepancies in a given sequence. More precisely, an interval \(\mathcal {M}_{\le N}:= \{G\in \mathcal {M}:\,\, \text {deg}(G)\le N\}\) contains too few other intervals \(\mathcal {M}_{\le n}\); there are only \(N+1\) of them, whereas the interval \(\mathcal {M}_{\le N}\) has size (i.e. number of elements) of order \(\asymp q^N\). In contrast, the interval [1, N] in \(\mathbb {N}\) contains N intervals of the form [1, n] with \(n \in \mathbb {N}\). At the same time, Tao’s proof of the Erdős discrepancy problem makes full use of the fact that an interval in \(\mathbb {N}\) has many different subintervals, by showing in fact the unboundedness of the quantity

$$\begin{aligned} \frac{1}{\log N}\sum _{m\le N}\frac{1}{m}\Big |\sum _{|n-m|<H}f(n)\Big |^2, \end{aligned}$$

where \(H=H(N)\) is slowly growing. This suggests that it is natural to look at the corresponding short sum discrepancy over function fields:

$$\begin{aligned} \mathcal {S}_f := \limsup _{H \rightarrow \infty } \limsup _{N \rightarrow \infty } \sup _{\begin{array}{c} G_0 \in \mathcal {M} \\ \text {deg}(G_0) = N \\ D\in \mathcal {M} \end{array}} \Big |\sum _{\begin{array}{c} G \in \mathcal {M} \\ \text {deg}(G-G_0) < H \end{array}} f(DG)\Big |, \end{aligned}$$

which is now taken over the family of short intervals

$$\begin{aligned} I_H(G_0):=\{G\in \mathcal {M}:\,\, \text {deg}(G-G_0)<H\}. \end{aligned}$$

These short intervals are much more numerous than the corresponding long intervals and thus provide a much more refined scale to measure the fluctuations of the partial sums; there are \(\asymp q^N\) of them inside the set of polynomials of degree at most N.

Note that over the integers the short sum discrepancy is bounded from above in terms of the long sum discrepancy: since the integers are linearly ordered, we get by the triangle inequality that

$$\begin{aligned} \limsup _{H\rightarrow \infty }\limsup _{N \rightarrow \infty } \Big |\sum _{|n-N|\le H} f(dn)\Big | \le 2 \limsup _{N \rightarrow \infty } \Big |\sum _{n \le N} f(dn)\Big |. \end{aligned}$$

Thus, one presumes that the behavior of the short sum discrepancy \(\mathcal {S}_f\) is rather similar to that of Erdős discrepancy in the integers. Indeed, we show that \(\mathcal {S}_f=\infty \) for “nearly all” completely multiplicative functions \(f:\mathcal {M}\rightarrow \{-1,+1\}\), but, in contrast to the integer case, it turns out that there are also a few exceptional functions. Our next theorem gives a complete classification of the cases where \(\mathcal {S}_f\) is bounded for a completely multiplicative function \(f:\mathcal {M}\rightarrow \{-1,+1\}\) (see Definition 2.1 below for the definition of a short interval character and its length).

Corollary 1.2

(Short sum discrepancy is bounded only for modified characters of prime power modulus) Let \(f: \mathcal {M} \rightarrow \{-1,+1\}\) be completely multiplicative. Then \(\mathcal {S}_f < \infty \) if and only if there is a prime power \(P^k\in \mathcal {M}\), a primitive Dirichlet character \(\chi \) modulo \(P^k\), a short interval character \(\xi \), and an integer \(j\in \{0,1\}\) such that \(f(P') = \chi (P')\xi (P')(-1)^{j \text {deg}(P')}\) for all primes \(P' \ne P\). Moreover, if q is odd, we have \(\xi \equiv 1\).

This result is a corollary of the following more general theorem that applies to completely multiplicative functions taking values on the unit circle.

Theorem 1.3

Let \(f: \mathcal {M} \rightarrow S^1\) be a completely multiplicative function. Then \(\mathcal {S}_f < \infty \) if and only if there is a prime power \(P^k\in \mathcal {M}\), a primitive Dirichlet character \(\chi \) modulo \(P^k\), a short interval character \(\xi \), and a real number \(\theta \in [0,1]\) such that \(f(P') = \chi (P')\xi (P')e^{2\pi i \theta \text {deg}(P')}\) for all primes \(P' \ne P\).

Corollary 1.2 and Theorem 1.3 are closely related to the following conjecture on the growth of the partial sums of multiplicative functions on the integers (see [17, Section 1] and [10, Section 1] for some discussion).

Conjecture 1.4

(Extremality of partial sums of modified characters) Let \(f:\mathbb {N}\rightarrow S^1\) be completely multiplicative. Then

$$\begin{aligned}\Big |\sum _{n\le x}f(n)\Big |\ll \log x\end{aligned}$$

if and only if there exists a non-principal Dirichlet character \(\chi \) modulo a prime power \(p^k\) such that \(f(p')=\chi (p')\) for all \(p' \ne p.\) Conversely, for such f there exists a subsequence \(N_k\rightarrow \infty ,\) such that

$$\begin{aligned} \Big |\sum _{n\le N_k}f(n)\Big |\gg \log N_k. \end{aligned}$$
(4)

Little is known towards this conjecture (which contains the Erdős discrepancy problem as a special case), apart from the case where f differs from a Dirichlet character (not necessarily of prime power modulus) at only finitely many primes, which was handled in [10, Corollary 1.6]. This case contains (4), which was first handled in [1], and is the easy part of the conjecture. In fact, the best currently known growth rate for partial sums of length x of a completely multiplicative function \(f:\mathbb {N}\rightarrow S^1\) is of the form \(\Omega ( (\log \log x)^{c})\), for some explicit \(c > 0\) [13, Theorem 4.1.1], [7, Section 9.4].

Both Corollary 1.2 and Conjecture 1.4 manifest the same general phenomenon: the smallest possible discrepancy over \(\mathbb {Z}\) and over \(\mathbb {F}_q[t]\) (for short sums) is attained by “modified” characters to prime power moduli. Over \(\mathbb {F}_q[t]\) with q even, we have an interesting low characteristic phenomenon that the set of characters with bounded discrepancy is somewhat larger than in the case of q odd; this eventually stems from Theorem 2.2 below.

Notice that while over \(\mathbb {Z}\) the smallest possible partial sums are believed to be of the order \(\asymp \log x\), over \(\mathbb {F}_q[t]\) they are O(1). In order to explain this feature, we recall that for the Borwein–Choi–Coons example [1] given by the “modified” character

$$\begin{aligned} f_3(p):={\left\{ \begin{array}{ll}\chi _3(p),\quad &{}p\ne 3\\ 1,&{}p=3, \end{array}\right. } \end{aligned}$$

the partial sums satisfy

$$\begin{aligned} \sum _{n\le x}f_3(n)=\sum _{k\le \log x/\log 3}f_3(3)^k\sum _{m\le \lfloor x/3^k\rfloor }\chi _3(m)\ll \log x, \end{aligned}$$

since the innermost sum is bounded. On the other hand, to construct a sequence \(x_k\) on which the partial sums grow with rate \(\gg \log x_k\) one exploits the fact that the intervals [1, M] contain a different number of residue classes modulo 3, depending on the value of \(M\pmod {3}\). This is no longer true over \(\mathbb {F}_q[t]\), as all short intervals \(I_H(G_0)\) contain the same number of residue classes to any modulus Q as soon as \(H \ge \text {deg}(Q)\). This results in a “logarithmic” drop as far as quantitative statements are concerned. We shall revisit this further in the following subsections.
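For illustration, the two facts about \(f_3\) used above can be checked numerically. The following Python sketch is not part of the argument, and the helper names `f3` and `ones_in_base3` are ours. It computes the partial sums directly and compares them with the number of digits 1 in the base 3 expansion of x, which is what the displayed decomposition yields, since \(\sum _{m\le M}\chi _3(m)\) equals 1 if \(M\equiv 1 \pmod {3}\) and 0 otherwise.

```python
def f3(n):
    """Completely multiplicative f_3: f_3(3) = 1 and f_3(p) = chi_3(p) for p != 3."""
    while n % 3 == 0:  # strip powers of 3, each contributing f_3(3) = 1
        n //= 3
    return 1 if n % 3 == 1 else -1  # chi_3 on the residues 1 and 2

def ones_in_base3(n):
    """Number of digits equal to 1 in the base 3 expansion of n."""
    count = 0
    while n:
        count += int(n % 3 == 1)
        n //= 3
    return count

# Partial sums of f_3 up to x, for all x < 3^6.
partial, sums = 0, {}
for n in range(1, 3**6):
    partial += f3(n)
    sums[n] = partial

# Along x_k = (3^k - 1)/2, whose base 3 expansion is k digits 1, the sum is k.
repunit_sums = [sums[(3**k - 1) // 2] for k in range(1, 7)]
```

In this range the partial sum at x agrees with `ones_in_base3(x)`, so it is at most the number of base 3 digits of x, and along \(x_k=(3^k-1)/2\) it equals k, which is of order \(\log x_k\).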

1.2 Discrepancy with the lexicographic ordering

To rectify the aforementioned difference between the settings of function fields and integers from the previous subsection, a natural approach involves replacing the partial ordering of \(\mathbb {F}_q[t]\) employed in constructing the sets \(\{G\in \mathbb {F}_q[t]: \text {deg}(G)\le N\}\) used to define \(\mathcal {D}_f\) with a lexicographic ordering of \(\mathbb {F}_q[t]\). This is (a generalization of) an ordering that has been used in the influential work of Liu and Wooley [12] on Waring’s problem over \(\mathbb {F}_q[t]\). It arises by associating a base q integer expansion to each polynomial in \(\mathbb {F}_q[t]\). In order to define this ordering, we must first impose an ordering on \(\mathbb {F}_q\). Let \(a_0, a_1, \ldots , a_{q-1}\) be an arbitrary ordering of \(\mathbb {F}_q\), and define the size \(\langle a\rangle \in \{0,1,\ldots ,q-1\}\) of \(a \in \mathbb {F}_q\) as

$$\begin{aligned} \langle a \rangle =k\quad \text {if}\quad a=a_k. \end{aligned}$$

Then we extend \(\langle \cdot \rangle \) to \(\mathbb {F}_{q}[t]\) by defining

$$\begin{aligned} \langle b_Nt^{N}+\cdots +b_1t+b_0\rangle =\langle b_N\rangle q^N+\cdots + \langle b_1\rangle q+\langle b_0\rangle . \end{aligned}$$

Implicit in this definition is the requirement that \(\langle 0 \rangle = 0\), and we assume that our ordering always satisfies this property. It is clear that the map \(\langle \cdot \rangle \) is a bijection from \(\mathbb {F}_q[t]\) to \(\mathbb {N}\cup \{0\}\). Thus it defines a total order \(\prec \) on \(\mathbb {F}_{q}[t]\) by setting \(A\prec B\) if \(\langle A\rangle <\langle B\rangle \), which is a lexicographic order on \(\mathbb {F}_q[t]\) (and thus a natural one to use).
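As a concrete illustration of the map \(\langle \cdot \rangle \), here is a minimal Python sketch under the simplifying assumption that q is prime, so that \(\mathbb {F}_q\) can be identified with \(\{0,1,\ldots ,q-1\}\). The function name `size` and its `order` argument (the list \(a_0,\ldots ,a_{q-1}\)) are our own illustrative choices.

```python
from itertools import product

def size(coeffs, q, order=None):
    """Return <G> for G = b_N t^N + ... + b_0, given coeffs = [b_0, ..., b_N].

    Assumes q is prime, so that F_q = {0, ..., q-1}. `order` lists F_q as
    a_0, ..., a_{q-1} and must start with 0, so that <0> = 0.
    """
    if order is None:
        order = list(range(q))  # illustrative default: a_k = k
    if order[0] != 0 or sorted(order) != list(range(q)):
        raise ValueError("order must be a permutation of F_q starting with 0")
    rank = {a: k for k, a in enumerate(order)}  # <a_k> = k
    return sum(rank[b] * q**i for i, b in enumerate(coeffs))

# Polynomials of degree < 3 over F_3, with F_3 ordered as 0, 2, 1.
values = sorted(
    size(list(c), 3, order=[0, 2, 1]) for c in product(range(3), repeat=3)
)
```

Here `values` runs through 0, 1, ..., 26 without repetition, reflecting the bijectivity of \(\langle \cdot \rangle \) on polynomials of bounded degree.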

We remark that Liu and Wooley confined themselves to the lexicographic order (denoted \(\langle \cdot \rangle _{\xi }\)) that arises from ordering \(\mathbb {F}_q\) as \(0,\xi ^0,\ldots , \xi ^{q-2}\), where \(\xi \) is a fixed generator of \(\mathbb {F}_q^{\times }\). Our results apply to this ordering as well as to any other ordering \(\langle \cdot \rangle \) with \(\langle 0 \rangle = 0\).

Answering a question of Liu and Wooley, we are able to show that when one uses the ordering given by \(\langle \cdot \rangle \), the Erdős discrepancy conjecture holds for all completely multiplicative sequences. Thus, defining the lexicographic discrepancy as

$$\begin{aligned} \mathcal {L}_f:=\sup _{\begin{array}{c} D \in \mathcal {M} \\ N\ge 1 \end{array}}\Big |\sum _{\begin{array}{c} G\in \mathcal {M}\\ \langle G \rangle \le N \end{array}}f(DG)\Big |, \end{aligned}$$

we will prove the following.

Theorem 1.5

(Lexicographic discrepancy of completely multiplicative sequences is always infinite) For any completely multiplicative sequence \(f:\mathcal {M}\rightarrow S^1\), we have \(\mathcal {L}_f=\infty \).

Thus, for any completely multiplicative function \(f:\mathcal {M}\rightarrow S^1,\) the lexicographically ordered partial sums of f satisfy

$$\begin{aligned}\sup _{N\ge 1}\Big |\sum _{\begin{array}{c} G\in \mathcal {M}\\ \langle G \rangle \le N \end{array}}f(G)\Big |=\infty .\end{aligned}$$

We conclude this subsection by mentioning that the lexicographic ordering appears to be a natural ordering also for several other classical problems over \(\mathbb {F}_q[t]\) (in particular, for those where partial summation plays a role).

1.3 Long sum discrepancy

Having formulated our results for \(\mathcal {L}_f\) and \(\mathcal {S}_f\), we revisit the long sum discrepancy \(\mathcal {D}_f\) to study what can be said about its boundedness. We provide a classification of all modified characters that have bounded \(\mathcal {D}_f\); these functions are defined as follows.

Definition 1.6

(Modified characters) We call a function \(f:\mathcal {M}\rightarrow S^1\) a modified character if f is completely multiplicative and for some primitive Dirichlet character \(\chi \pmod {Q}\) of some modulus \(Q\in \mathcal {M}\) we have \(f(P)=\chi (P)\) for all primes \(P\not \mid Q\), and otherwise \(f(P) \in S^1\) for all \(P\mid Q\). We also define modified characters on \(\mathbb {N}\) analogously.

In the integer setting, the class of modified characters contains the class of functions for which Borwein, Choi and Coons [1] showed unboundedness of discrepancy. Indeed, they considered completely multiplicative functions \(f:\mathbb {N}\rightarrow \{-1,+1\}\) such that for some prime p we have \(f(p')=\chi _{p}(p')\) for all \(p'\ne p\), where \(\chi _p\) is the Legendre symbol \(\pmod {p}\), analogizing the function \(f_3\) constructed above. We note that such functions are significantly easier to work with, since the value of \(\sum _{n\le N}f(n)\) is easy to compute given the base p expansion of N. As soon as one studies modified characters to composite moduli, matters are more complicated and a direct computation of the partial sums (both in the integer case and in the function field case) appears very difficult, requiring control of the digital expansion of N in (at least) two different bases simultaneously.

We prove the following characterization for the discrepancy of modified characters, where \(\omega (Q)\) stands for the number of distinct prime divisors of a polynomial \(Q\in \mathbb {F}_q[t]\) and \(v_2(n)\) is the 2-adic valuation of n.

Corollary 1.7

Let \(f: \mathcal {M} \rightarrow \{-1,+1\}\) be a modified character associated with a primitive character of modulus Q. Then \(\mathcal {D}_f<\infty \) if and only if one of the following holds:

  (i) \(\omega (Q)=1\).

  (ii) \(\omega (Q) = 2\), and (up to permutation) the primes \(P_1,P_2\) dividing Q satisfy:

    • \(f(P_1) =-1,\) \(f(P_2)=1\), and

    • \(v_2(\text {deg}(P_1)) \ge v_2(\text {deg}(P_2))\).

  (iii) \(\omega (Q) = 3\), and (up to permutation) the primes \(P_1,P_2,P_3\) dividing Q satisfy:

    • \(f(P_1) = f(P_2) = -1\) and \(f(P_3) = 1\),

    • \(v_2(\text {deg}(P_1)) \ne v_2(\text {deg}(P_2))\), and

    • \(v_2(\text {deg}(P_j)) \ge v_2(\text {deg}(P_3))\) for \(j = 1,2\).

We also give a complete characterization in the case where the modified character is complex-valued; here the statement perhaps surprisingly depends on whether or not a certain polynomial associated to f has multiple roots.

Theorem 1.8

Let \(f: \mathcal {M} \rightarrow S^1\) be a modified character associated to a primitive character of modulus Q with \(\text {deg}(Q)\ge 1\). Define the polynomial \(p(z):= \prod _{P|Q} (z^{\text {deg}(P)}-\overline{f(P)})\).

a) If all the zeros of p have multiplicity 1, then

$$\begin{aligned} \Big |\sum _{\begin{array}{c} G\in \mathcal {M}\\ \text {deg}(G)\le N \end{array}} f(G) \Big | \ll _{Q} 1. \end{aligned}$$

b) If \(b\ge 2\) is the highest multiplicity of a zero of p, then there is an increasing sequence \(\{N_k\}_{k \ge 1}\) such that

$$\begin{aligned} \Big |\sum _{\begin{array}{c} G \in \mathcal {M}\\ \text {deg}(G)\le N_k \end{array}} f(G)\Big | \asymp _Q N_k^{b-1}. \end{aligned}$$

It is a natural question to ask for a classification of all \(\pm 1\)-valued multiplicative functions on \(\mathcal {M}\) with \(\mathcal {D}_f<\infty \). Proposition 1.1 and Corollary 1.7 imply that in the case of general functions this appears all but impossible, whereas for the natural class of modified characters we can give a complete characterization.

Theorem 1.8 is also related to Conjecture 1.4. Namely, one expects that for a completely multiplicative function \(f:\mathbb {N}\rightarrow S^1\) there is an increasing sequence \(\{N_k\}_{k\ge 1}\) such that

$$\begin{aligned} \Big |\sum _{n\le N_k}f(n)\Big |\gg \log N_k. \end{aligned}$$
(5)

As mentioned, from [1] it follows that (5) holds whenever f is a modified character (with prime power modulus). Theorem 1.8 verifies an analogous statement over function fields, namely that when \(f: \mathcal {M} \rightarrow S^1\) is a modified character for which \(\mathcal {D}_f = \infty \), we have

$$\begin{aligned} \Big |\sum _{\begin{array}{c} G\in \mathcal {M}\\ \text {deg}(G)\le N_k \end{array}}f(G)\Big |\gg \log (q^{N_k})\gg N_k. \end{aligned}$$

Theorem 1.8 also reveals an interesting phenomenon about the spectrum of different growth rates of discrepancy (and once again confirms the “logarithmic” drop). It shows that the discrepancy of a modified character on \(\mathbb {F}_q[t]\) always grows like \(N^d\) for some number \(d\in \mathbb {N}\cup \{0\}\); there are no other possible growth rates. It would be interesting to say something about the spectrum of discrepancies for general multiplicative functions on \(\mathbb {F}_q[t]\) (or even on \(\mathbb {Z}\)); however, this seems extremely difficult since in the non-pretentious case the known lower bound on the growth of the discrepancy is very weak (see Sect. 2 for relevant definitions).

2 Strategy of proofs

As in Tao’s resolution of the Erdős discrepancy problem [17], our proofs naturally split into two main parts: the case of non-pretentious multiplicative functions and the case of pretentious multiplicative functions (see Fig. 1). At various points we are forced to significantly deviate from the treatment in the number field case.

By pretentious functions we mean multiplicative functions \(f:\mathcal {M}\rightarrow S^1\) such that for some character \(\widetilde{\chi }:\mathcal {M}\rightarrow \mathbb {C}\) of bounded conductor the pretentious distance between f and \(\widetilde{\chi }\) is bounded (the Granville–Soundararajan pretentious distance can be generalized to the function field setting; see (7) below). In the integer setting, the relevant characters would be of the form \(\widetilde{\chi }(n)=\chi (n)n^{it}\), that is, a Dirichlet character \(\chi \) times an Archimedean character \(n\mapsto n^{it}\), for some \(t \in \mathbb {R}\). A key technical point in this paper is that these characters are not sufficient for understanding the behavior of the short sum discrepancy. We thus need a larger set of characters, introduced in the following definition.

Definition 2.1

(Characters in function fields) A multiplicative function \(\chi :\mathcal {M}\rightarrow \mathbb {C}\) which is not identically zero is called a Dirichlet character of modulus \(Q\in \mathcal {M}\) if \(\chi (M+Q)=\chi (M)\) for all \(M\in \mathcal {M}\) and \(\chi (M)=0\) whenever \((Q,M)\ne 1\). We say that \(\chi \pmod {Q}\) is primitive if there is no divisor \(Q'\mid Q\), \(\text {deg}(Q')<\text {deg}(Q)\) such that for some Dirichlet character \(\chi '\pmod {Q'}\) we have \(\chi (M)=\chi '(M)\) whenever \((M,Q)=1\). We say that \(\chi \pmod {Q}\) is principal if \(\chi (M)=1\) whenever \((M,Q)=1\).

A function \(e_{\theta }:\mathcal {M}\rightarrow S^1\) of the form \(e_{\theta }(M):=e(\theta \text {deg}(M))\) for \(\theta \in [0,1]\) is called an Archimedean character.

A multiplicative function \(\xi :\mathcal {M}\rightarrow \mathbb {C}\) which is not identically zero is called a short interval character if there exists \(\nu \ge 0\) such that \(\xi (A)=\xi (B)\) whenever the \(\nu +1\) highest degree coefficients of A and B agree. The smallest such \(\nu \) is called the length \(\text {len}(\xi )\) of \(\xi \).

All of the characters above are multiplicative. The Archimedean characters play much the same role as the characters \(n\mapsto n^{it}\) on \(\mathbb {N}\). The notion of short interval characters was introduced by Hayes [6], and it has no integer analogue.
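To make the last two notions in Definition 2.1 concrete, the following Python sketch works over \(\mathbb {F}_2\), an assumption made purely for convenience: polynomials are encoded as bit masks, and every nonzero polynomial is monic. The function `xi` is our own illustrative example of a short interval character depending only on the two leading coefficients (with the convention, assumed here, that the second coefficient of the constant polynomial 1 is 0), and `e_theta` is the Archimedean character.

```python
import cmath

def clmul(a, b):
    """Product in F_2[t]; polynomials are bit masks (bit i = coefficient of t^i)."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        b >>= 1
    return result

def deg(a):
    """Degree of a nonzero polynomial over F_2."""
    return a.bit_length() - 1

def xi(a):
    """(-1)^c, where c is the coefficient just below the leading term (c = 0 for a = 1)."""
    d = deg(a)
    c = (a >> (d - 1)) & 1 if d >= 1 else 0
    return -1 if c else 1

def e_theta(a, theta):
    """Archimedean character e(theta * deg(a))."""
    return cmath.exp(2j * cmath.pi * theta * deg(a))

polys = range(1, 2**5)  # all monic polynomials of degree <= 4 over F_2
xi_multiplicative = all(
    xi(clmul(a, b)) == xi(a) * xi(b) for a in polys for b in polys
)
e_multiplicative = all(
    abs(e_theta(clmul(a, b), 0.3) - e_theta(a, 0.3) * e_theta(b, 0.3)) < 1e-12
    for a in polys for b in polys
)
```

Multiplicativity of `xi` reflects the fact that the second-highest coefficient of a product of monic polynomials is the sum of the second-highest coefficients of the factors. Note that this `xi` is real-valued and non-trivial, which is compatible with Corollary 1.2 because q = 2 is even.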

2.1 The non-pretentious case

In the non-pretentious case, the main ingredient that we need is a function field version of Tao’s result on two-point logarithmic correlations of multiplicative functions. This was established by the authors in [11].

Theorem 2.2

(Two-point logarithmic Elliott conjecture in function fields, [11]) Let \(B \in \mathbb {F}_q[t] \backslash \{0\}\) be fixed. Let \(f_1,f_2: \mathbb {F}_q[t] \rightarrow \mathbb {U}\) be multiplicative. Let N be large, and assume that \(f_1\) satisfies the non-pretentiousness condition

$$\begin{aligned} \min _{\begin{array}{c} M \in \mathcal {M} \\ \text {deg}(M) \le W \end{array}} \min _{\psi \pmod {M}} \min _{\begin{array}{c} \xi \,\, \text {short} \\ \text {len}(\xi ) \le N \end{array}}\min _{\theta \in [0,1]} \sum _{\begin{array}{c} P \in \mathcal {P} \\ \text {deg}(P) \le N \end{array}} \frac{1-\text {Re}(f_1(P)\overline{\psi }(P)\overline{\xi }(P)e_{\theta }(P))}{q^{\text {deg}(P)}} \xrightarrow {N\rightarrow \infty }\infty \end{aligned}$$

for every fixed \(W\ge 1\). Then

$$\begin{aligned} \frac{1}{N}\sum _{\begin{array}{c} G \in \mathcal {M}\\ \text {deg}(G) \le N \end{array}} q^{-\text {deg}(G)} f_1(G)f_2(G+B) =o(1). \end{aligned}$$

Moreover, if \(f_1\) is real-valued and q is odd, then the same conclusion follows provided only that

$$\begin{aligned} \min _{\begin{array}{c} M \in \mathcal {M} \\ \text {deg}(M) \le W \end{array}} \min _{\psi \pmod {M}} \min _{\theta \in \{0,1/2\}} \sum _{\begin{array}{c} P \in \mathcal {P} \\ \text {deg}(P) \le N \end{array}} \frac{1-\text {Re}(f_1(P)\overline{\psi }(P)e_{\theta }(P))}{q^{\text {deg}(P)}}\xrightarrow {N\rightarrow \infty }\infty . \end{aligned}$$

Proof

This is [11, Theorem 1.5]. \(\square \)

2.2 The pretentious case

Assuming for the sake of contradiction that a completely multiplicative f has finite short sum discrepancy \(\mathcal {S}_f\), Theorem 2.2 can be used as in [17] to achieve the crucial reduction to the case in which f pretends to be a twisted character \(G \mapsto \chi \xi e_{\theta }(G)\), where \(\chi \) is a primitive Dirichlet character of bounded conductor, \(\xi \) is a short interval character and \(\theta \in [0,1]\). At this point, after removing the twist \(\xi e_{\theta }\) (which is essentially a triviality) we significantly deviate from Tao’s analysis in [17].

The first step of this different argument is a technique that allows us to pass from the pretentiousness condition

$$\begin{aligned} \sum _{P \in \mathcal {P}} \frac{1-\text {Re}(f(P)\overline{\chi }(P))}{q^{\text {deg}(P)}} < \infty \end{aligned}$$

to the far more restrictive hypothesis

$$\begin{aligned} |\{P \in \mathcal {P}: f(P) \ne \chi (P)\}| < \infty . \end{aligned}$$
(6)

To accomplish this reduction we use a technical device, which we call the “rotation trick.” This general trick played a crucial role in the recent work [10] on multiplicative functions over the integers.

After this reduction to f satisfying (6), it remains to treat the case where f is a modified character (see Definition 1.6 above). At this point it should be noted that the corresponding argument in [17] (see Section 4, from (4.8) onwards) is insufficient, due essentially to the fact that for any \(H > \text {deg}(Q)\), orthogonality implies

$$\begin{aligned} \sum _{\text {deg}(M) < H} \chi (M) = 0, \end{aligned}$$

for any non-principal Dirichlet character modulo Q (and, as discussed above, the analysis of Borwein–Choi–Coons from [1] runs into serious difficulties in the case where Q has several prime factors). However, a more elaborate argument using Ramanujan sums (see Proposition 4.7 below) does permit one to show that for suitable large choices of H, and for N large in terms of H, one does have

$$\begin{aligned} \max _{\begin{array}{c} G_0 \in \mathcal {M}\\ \text {deg}(G_0) = N \end{array}} \left| \sum _{\begin{array}{c} G \in \mathcal {M} \\ \text {deg}(G-G_0) < H \end{array}} f(G)\right| \gg _Q H^{(\omega (Q)-1)/2}, \end{aligned}$$

where \(\omega (Q)\) denotes the number of distinct prime factors of the modulus Q of the modified character corresponding to f. The logarithmic power growth rate is consistent with the discrete pattern witnessed in Theorem 1.8. The above enables us to show that if the number of primes at which f and \(\chi \) differ exceeds 1 then \(\mathcal {S}_f = \infty \). The remaining case in which \(\omega (Q) = 1\) can be analyzed directly (because the modulus is a power of a single prime), and this is accomplished at the end of Sect. 4, leading to the proof of Theorem 1.3.

2.3 The lexicographic discrepancy result

It is crucial to remark that the collection of lexicographic intervals is a refinement of the collection of short intervals, in the sense that if \(\text {deg}(G_0) = N\) and \(H < N\) and \(\widetilde{G_0}\) is the unique element of \(I_H(G_0)\) divisible by \(t^{H}\), then we can express

$$\begin{aligned} I_H(G_0) = I_H(\widetilde{G_0})=\{G \in \mathcal {M}: \langle \widetilde{G_0} \rangle \le \langle G\rangle < \langle {\widetilde{G_0}}\rangle + q^H\}. \end{aligned}$$

Thus, in view of our short interval result (Theorem 1.3), a completely multiplicative function \(f:\mathcal {M}\rightarrow S^1\) that is uniformly bounded on lexicographic intervals must be a modified character to prime power modulus. Our main obstacle is thus to rule out uniform boundedness of lexicographic partial sums for this class of functions, for which the analysis in short intervals is not sufficient.

To accomplish this, we fully exploit the “digital” structure of the lexicographic ordering to obtain a recursive relation for partial sums over \(\langle G \rangle \le N\) at a carefully chosen sequence of scales N. The construction is somewhat complicated, and we relegate further explanation to the proof of Proposition 5.1.
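The identification of short intervals with lexicographic intervals stated at the beginning of this subsection can be verified directly in a small case. The sketch below assumes q = 2 with \(\mathbb {F}_2\) ordered as 0, 1, so that \(\langle G\rangle \) is simply the integer whose binary digits are the coefficients of G; the function names are ours.

```python
def short_interval(g0, h):
    """I_H(G_0) over F_2: all G with deg(G - G_0) < H (bit-mask encoding)."""
    n = g0.bit_length() - 1  # deg(G_0)
    # Subtraction in F_2[t] is XOR; deg(G - G_0) < H means (g ^ g0) < 2^H.
    return {g for g in range(2**n, 2 ** (n + 1)) if (g ^ g0) < 2**h}

def lex_interval(g0, h):
    """{G : <G0~> <= <G> < <G0~> + q^H} for q = 2, with G0~ divisible by t^H."""
    base = (g0 >> h) << h  # clear the H lowest coefficients of G_0
    return set(range(base, base + 2**h))

g0, h = 0b1011011, 3  # G_0 = t^6 + t^4 + t^3 + t + 1 and H = 3 < deg(G_0)
same = short_interval(g0, h) == lex_interval(g0, h)
```

Both sets consist of the \(2^H\) polynomials sharing all coefficients of \(G_0\) in degrees at least H, which is why they coincide.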

The proof strategy of Theorems 1.3 and 1.5 is visualized in the following diagram.

Fig. 1. A diagram describing the different steps of the proofs of Theorems 1.3 and 1.5

2.4 The long sum discrepancy result

The proof of Theorem 1.8 proceeds differently than the proofs of Theorems 1.3 and 1.5, using a generating function argument and GRH in function fields, and can be read independently.

2.5 Structure of the paper

The paper is organized as follows. In Sect. 4, we prove the short interval discrepancy theorem (Theorem 1.3). In Sect. 5, we establish the discrepancy theorem for lexicographic discrepancy (Theorem 1.5). Lastly, Sect. 6 concerns the long sum discrepancy result, Theorem 1.8.

3 Notation

Throughout the paper, p is the characteristic of \(\mathbb {F}_q\), and \(q = p^k\) for some \(k \ge 1\).

We denote by \(\mathcal {M}\) the space of monic polynomials in \(\mathbb {F}_q[t]\) (we omit the q-dependence in \(\mathcal {M}\); thus, whenever \(\mathcal {M}\) appears it is understood that the base field has size q), and by \(\mathcal {P}\) we denote the space of monic irreducible (prime) polynomials in \(\mathbb {F}_q[t]\). For \(N \in \mathbb {N}\), we write \(\mathcal {M}_N\), \(\mathcal {M}_{\le N}\) and \(\mathcal {M}_{< N}\) to denote, respectively, the set of monic polynomials of degree exactly N, less than or equal to N and strictly less than N. Analogously, we define \(\mathcal {P}_N\), \(\mathcal {P}_{\le N}\) and \(\mathcal {P}_{<N}\) to be the corresponding sets of monic irreducible polynomials. We denote the degree of \(M \in \mathbb {F}_q[t]\) by \(\text {deg}(M)\).

Given two polynomials \(F,G\in \mathbb {F}_q[t]\), not both zero, we define their greatest common divisor \((F,G)\) as the unique monic polynomial \(D\in \mathcal {M}\) such that \(D\mid F, D\mid G\) and such that for any \(D'\in \mathcal {M}\) satisfying \(D'\mid F, D'\mid G\) we have \(D'\mid D\). The least common multiple \([F,G]\) of F and G is in turn defined by \([F,G]:= FG/(F,G)\).

Typically, G will be used to denote an element of \(\mathcal {M}\), whereas R or P denotes an element of \(\mathcal {P}\) and M denotes an element of \(\mathbb {F}_q[t]\), monic or otherwise.

Given a polynomial \(G_0 \in \mathcal {M}\) and a parameter \(H \ge 1\), we write

$$\begin{aligned} I_H(G_0):= \{G \in \mathcal {M}: \text {deg}(G-G_0) < H\} \end{aligned}$$

to denote the short interval centred at \(G_0\) of size H.

As usual, given \(t \in \mathbb {R}\) we write \(e(t):= e^{2\pi i t}\). Given a parameter \(\theta \in [0,1]\) and a polynomial \(G \in \mathbb {F}_q[t]\), we also write \(e_{\theta }(G):= e(\theta \text {deg}(G))\). Finally, given a rational function \(\alpha = F/G \in \mathbb {F}_q(t)\) with Laurent series expansion \(\alpha = \sum _{k = -N}^{N'} a_k(\alpha ) t^{k}\), we define \(e_{\mathbb {F}}(\alpha ):= e((\text {tr}_{\mathbb {F}_q/\mathbb {F}_p} a_{-1}(\alpha ))/p)\), where \(\text {tr}_{\mathbb {F}_q/\mathbb {F}_p}\) denotes the usual field trace.

Throughout the paper, we write \(\mathbb {U}:= \{z \in \mathbb {C}: |z| \le 1\}\) and \(S^1:= \{z \in \mathbb {U}: |z| = 1\}\). Given sequences \(f,g: \mathcal {M} \rightarrow \mathbb {U}\), we define the pretentious distance between them by

$$\begin{aligned} \mathbb {D}(f,g;N) := \left( \sum _{P \in \mathcal {P}_{\le N}} q^{-\text {deg}(P)}(1-\text {Re}(f(P)\overline{g}(P)))\right) ^{1/2}, \end{aligned}$$
(7)

and also set

$$\begin{aligned} \mathcal {D}_f(N):= \min _{\theta \in [0,1]} \mathbb {D}(f, e_{\theta };N)^2. \end{aligned}$$

We frequently use the pretentious triangle inequalities (see e.g. [9, Section 2]): for any functions \(f_1,f_2,f_3,f_4:\mathcal {M}\rightarrow \mathbb {U}\), we have

$$\begin{aligned} \mathbb {D}(f_1,f_3;N)\le \mathbb {D}(f_1,f_2;N)+\mathbb {D}(f_2,f_3;N) \end{aligned}$$
(8)

and

$$\begin{aligned} \mathbb {D}(f_1f_2,f_3f_4;N)\le \mathbb {D}(f_1,f_3;N)+\mathbb {D}(f_2,f_4;N). \end{aligned}$$
(9)

For \(f: \mathcal {M} \rightarrow \mathbb {U}\) a 1-bounded multiplicative function, we define the Dirichlet series corresponding to f by

$$\begin{aligned} L(s,f) := \sum _{G \in \mathcal {M}} f(G)q^{-\text {deg}(G)s} = \prod _{P \in \mathcal {P}} \sum _{k \ge 0} f(P^k)q^{-k\text {deg}(P)s}, \end{aligned}$$
(10)

for \(\text {Re}(s)>1\); in this region both expressions converge absolutely.

We will sometimes write \(\mu _k\) to denote the set of kth roots of unity, where \(k \in \mathbb {N}\).

The functions \(\Lambda \), \(\omega \), \(\lambda \), \(\mu \), \(\phi \), \(\text {rad}\) and \(\nu _P\), defined on \(\mathcal {M}\), are the analogues of the corresponding arithmetic functions in the number field setting. Thus

  • \(\Lambda (G)=\text {deg}(P)\) if \(G=P^k\) for some \(k\ge 1\) and \(P\in \mathcal {P}\) and \(\Lambda (G)=0\) otherwise.

  • \(\omega (G)\) is the number of distinct irreducible divisors of G.

  • \(\lambda :\mathcal {M}\rightarrow \{-1,+1\}\) is the completely multiplicative function with \(\lambda (P)=-1\) for all \(P\in \mathcal {P}\).

  • \(\mu :\mathcal {M}\rightarrow \{-1,0,+1\}\) is given by \(\mu (G)=(-1)^{\omega (G)}\) for G not divisible by \(P^2\) for any \(P\in \mathcal {P}\), and \(\mu (G)=0\) otherwise.

  • \(\phi (G)\) is the size of the finite multiplicative group \((\mathbb {F}_q[t]/G\mathbb {F}_q[t])^{\times }\).

  • \(\text {rad}(G)=1\) if \(G=1\) and \(\text {rad}(G)=P_1\cdots P_k\) if \(P_1,\ldots , P_k\) are the distinct irreducible factors of G.

  • \(\nu _P(G)\), for \(P\in \mathcal {P}\), is the largest integer k such that \(P^k\mid G\).
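The definitions in the list above can be made concrete with a short sketch over \(\mathbb {F}_2[t]\). Everything here is an illustrative assumption: the bitmask encoding, the restriction to \(q=2\), the trial-division factorisation and the function names are chosen for the example and do not come from the paper. The formula used for \(\phi \) is the standard one, \(\phi (G) = \prod _{P^e \Vert G} (q^{\text {deg}(P)}-1)q^{\text {deg}(P)(e-1)}\).

```python
# Polynomials over F_2 as ints: bit i is the coefficient of t^i (so q = 2 here).
Q_FIELD = 2

def deg(g):
    return g.bit_length() - 1

def polymod(a, b):
    """Remainder of a divided by b in F_2[t]."""
    while deg(a) >= deg(b):
        a ^= b << (deg(a) - deg(b))
    return a

def polydiv(a, b):
    """Quotient of a divided by b in F_2[t]."""
    quo = 0
    while deg(a) >= deg(b):
        shift = deg(a) - deg(b)
        quo |= 1 << shift
        a ^= b << shift
    return quo

def polymul(a, b):
    """Product of a and b in F_2[t]."""
    out = 0
    while b:
        if b & 1:
            out ^= a
        a, b = a << 1, b >> 1
    return out

def factor(g):
    """Return {P: nu_P(G)} by trial division over candidates of increasing degree."""
    factors, p = {}, 2                      # p = 2 encodes the polynomial t
    while deg(g) >= 1:
        if 2 * deg(p) > deg(g):             # no smaller factor is left: g is prime
            factors[g] = factors.get(g, 0) + 1
            break
        while polymod(g, p) == 0:
            factors[p] = factors.get(p, 0) + 1
            g = polydiv(g, p)
        p += 1
    return factors

def omega(g):
    return len(factor(g))

def liouville(g):
    return (-1) ** sum(factor(g).values())

def mobius(g):
    f = factor(g)
    return 0 if any(e >= 2 for e in f.values()) else (-1) ** len(f)

def von_mangoldt(g):
    f = factor(g)
    return deg(next(iter(f))) if len(f) == 1 else 0

def phi(g):
    out = 1
    for p, e in factor(g).items():
        out *= (Q_FIELD ** deg(p) - 1) * Q_FIELD ** (deg(p) * (e - 1))
    return out

def rad(g):
    out = 1
    for p in factor(g):
        out = polymul(out, p)
    return out

G = 0b100100   # t^5 + t^2 = t^2 (t + 1)(t^2 + t + 1)
```

For this choice of G the sketch gives \(\omega (G)=3\), \(\lambda (G)=1\), \(\mu (G)=0\), \(\phi (G)=6\) and \(\text {rad}(G)=t^4+t\).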

Recall also Definition 2.1 for the definitions of the various types of characters used in this paper.

Throughout this paper, the cardinality q of the underlying finite field \(\mathbb {F}_q\) is fixed. For the sake of convenience we have chosen to omit mention of dependencies on q of implicit constants in our estimates. In particular, the implicit constants in any estimate may depend on q throughout this paper.

4 The short sum discrepancy

The proof of Theorem 1.3 will be achieved through a series of reductions, starting with a reduction to the case of functions that pretend to be characters.

4.1 Reduction to the pretentious case

In this subsection, we will show that the short sum discrepancy \(\mathcal {S}_f\) of f is infinite whenever f is non-pretentious in a suitable sense. Recall the notions of characters and pretentiousness in this context from Definition 2.1 and the notation section, respectively.

Proposition 4.1

Let \(f: \mathcal {M} \rightarrow S^1\) be a completely multiplicative function, and let \(C > 0\). Assume that

$$\begin{aligned} \mathcal {S}_f = \limsup _{H\rightarrow \infty } \limsup _{N \rightarrow \infty } \max _{G_0 \in \mathcal {M}_{N}} \left| \sum _{G \in I_H(G_0)} f(G)\right| \le C. \end{aligned}$$
(11)

Then there is a primitive Dirichlet character \(\chi \) with \(\text {cond}(\chi ) \ll _C 1\), a short interval character \(\xi \) of length \(\ll _C 1\) and a real number \(\theta \in [0,1]\) such that \(\mathbb {D}(f,\chi \xi e_{\theta };N) \ll _C 1\) for all \(N\ge 1\).

Moreover, in the case that \(f: \mathcal {M} \rightarrow \{-1,+1\}\) we may conclude, in fact, that there is a real primitive character \(\chi \), a real short interval character \(\xi \) and \(\theta \in \{0,1/2\}\) such that \(\mathbb {D}(f,\chi \xi e_{\theta };N) \ll _C 1\) for all \(N\ge 1\). If q is additionally odd, we can also say that \(\xi \equiv 1\).

Proof

Let \(H=H(C)\) be an integer large enough in terms of C. We may assume that N is large enough in terms of H, so that in particular \(1 \le H < \sqrt{\log N}/10\). We may bound \(\mathcal {S}_f^2\) from below by a (logarithmically-weighted) \(L^2\)-average of sums over intervals \(I_H(G_0)\) with \(G_0 \in \mathcal {M}_{\le N}\) to deduce that

$$\begin{aligned} \frac{1}{N+1} \sum _{G_0 \in \mathcal {M}_{\le N}\setminus \mathcal {M}_{\le \sqrt{\log N}}} q^{-\text {deg}(G_0)}\left| \sum _{G \in I_H(G_0)} f(G)\right| ^2 \le (\mathcal {S}_f+1)^2\ll _C 1. \end{aligned}$$

We expand the square, exchange orders of summation and separate the terms according to \(\text {deg}(G_0)=m\). We obtain

$$\begin{aligned} \frac{1}{N+1}\sum _{\sqrt{\log N}< m \le N} q^{-m} \sum _{G_0 \in \mathcal {M}_m} \sum _{\text {deg}(M_1),\text {deg}(M_2) < H} f(G_0+M_1)\overline{f(G_0+M_2)} \ll _C 1. \end{aligned}$$
(12)

For each \(m \ge H\) and \(M_1\in \mathbb {F}_q[t]\) of degree \(<H\), we have \(\mathcal {M}_m + M_1 = \mathcal {M}_m\). Thus, making on the left-hand side of (12) the change of variables \(G:= G_0 + M_1\) and \(M:= M_2-M_1\) (so that \(\text {deg}(M) < H\)), and then bounding the contribution from the summands with \(m < \sqrt{\log N}\) trivially as \(O(q^{2H}\sqrt{\log N})\), we reach

$$\begin{aligned} \frac{|\mathcal {M}_{<H}|}{N+1} \sum _{m \le N} q^{-m} \sum _{G \in \mathcal {M}_m} \sum _{\text {deg}(M) < H} f(G)\overline{f(G+M)} \ll _C 1 + q^{2H}\sqrt{\log N}/N. \end{aligned}$$

If we isolate the choice \(M = 0\) from the remaining shifts, we deduce that

$$\begin{aligned} \left| \sum _{\begin{array}{c} \text {deg}(M) < H \\ M \ne 0 \end{array}}\frac{1}{N+1} \sum _{m \le N} q^{-m} \sum _{G \in \mathcal {M}_m} f(G)\overline{f(G+M)}\right| \gg _C 1 - O_C(\sqrt{\log N}/N). \end{aligned}$$

By the assumption that N is large in terms of H, the triangle inequality and the pigeonhole principle then imply that for some \(M \ne 0\) with \(\text {deg}(M) < H\) we actually have

$$\begin{aligned} \left| \frac{1}{N}\sum _{m \le N} q^{-m} \sum _{G \in \mathcal {M}_m} f(G)\overline{f(G+M)}\right| \gg _C q^{-H}\gg _C 1. \end{aligned}$$

By the first statement in Theorem 2.2, we conclude that there exists a primitive Dirichlet character \(\chi _N\) with \(\text {cond}(\chi _N) = O_C(1)\), a primitive short interval character \(\xi _N\) of length \(\le N\) and a point \(\theta _N \in [0,1]\) such that \(\mathbb {D}(f,\chi _N \xi _N e_{\theta _N};N) \ll _C 1\). Note that the set of Dirichlet characters of conductor at most \(O_C(1)\) is bounded in size (in terms of C) and that [0, 1] is compact. There is thus an infinite increasing sequence \(\{N_j\}_j\) of positive integers, a primitive character \(\chi \) of conductor \(\ll _C 1\) and a \(\theta \in [0,1]\) for which \(\theta _{N_j} \rightarrow \theta \) as \(j \rightarrow \infty \), such that

$$\begin{aligned} \limsup _{j \rightarrow \infty } \mathbb {D}(f,\chi \xi _j e_{\theta _j}; N_j) < \infty , \end{aligned}$$

where by an abuse of notation we have written \(\xi _j:= \xi _{N_j}\) and \(\theta _j:= \theta _{N_j}\), for convenience.

Since \(N_{j+1} > N_j\), it follows from the triangle inequality (8) that

$$\begin{aligned} \mathbb {D}(\xi _j e_{\theta _j}, \xi _{j+1} e_{\theta _{j+1}};N_{j}) \le \mathbb {D}(f,\chi \xi _j e_{\theta _j}; N_j) + \mathbb {D}(f,\chi \xi _{j+1} e_{\theta _{j+1}}; N_{j+1}) \ll _C 1 \end{aligned}$$
(13)

uniformly in \(j\ge 1\). Now, suppose \(\xi _j \overline{\xi }_{j+1}\) is nontrivial. Note that \(\xi _j\overline{\xi _{j+1}}\) is a short interval character of length \(\le \max \{\text {len}(\xi _j),\text {len}(\xi _{j+1})\} \le N_j\). But then by (7) and [11, Lemma 3.2],

$$\begin{aligned} \mathbb {D}(\xi _je_{\theta _j},\xi _{j+1}e_{\theta _{j+1}};N_j)^2&= \log N_j - \text {Re}\left( \sum _{P \in \mathcal {P}_{\le N_j}} \xi _j\overline{\xi }_{j+1}(P)e_{\theta _j-\theta _{j+1}}(P) q^{-\text {deg}(P)}\right) + O(1) \end{aligned}$$
(14)
$$\begin{aligned}&\ge (1-o(1)) \log N_j. \end{aligned}$$
(15)

This contradicts (13). It must follow that \(\xi _j = \xi _{j+1}\) for all j sufficiently large (in terms of C). In particular, it follows that there is a \(j_0 \ll _C 1\) such that \(\xi _j = \xi _{j_0}\) for all \(j \ge j_0\). Setting \(\xi := \xi _{j_0}\), which is a short interval character of length \(\ll _C 1\), we deduce that \(\mathbb {D}(f,\chi \xi e_{\theta _j};N_j) \ll _C 1\) for all \(j\ge 1\), and therefore by the triangle inequality (8) it follows that

$$\begin{aligned} \mathbb {D}(e_{\theta _j},e_{\theta _{j+k}};N_j) \le \mathbb {D}(f,\chi \xi e_{\theta _j};N_j) + \mathbb {D}(f,\chi \xi e_{\theta _{j+k}};N_{j+k}) \ll _C 1, \end{aligned}$$

uniformly in \(j,k \ge 1\). Since the expression on the left-hand side is continuous in \(\theta _{j+k}\), taking \(k \rightarrow \infty \) we deduce that

$$\begin{aligned} \mathbb {D}(e_{\theta _j},e_{\theta };N_j) \ll _C 1, \end{aligned}$$

uniformly in j, and hence for all \(j\ge 1\) we have

$$\begin{aligned} \mathbb {D}(f,\chi \xi e_{\theta };N_j)&\le \mathbb {D}(f,\chi \xi e_{\theta _j}; N_j) + \mathbb {D}(\chi \xi e_{\theta },\chi \xi e_{\theta _j};N_j)\\&=\mathbb {D}(f,\chi \xi e_{\theta _j}; N_j) + \mathbb {D}(e_{\theta }, e_{\theta _j};N_j)+O_C(1)\ll _C 1, \end{aligned}$$

where we used the fact that \(|\chi \xi (P)|=1\) for all \(P\in \mathcal {P}\) of degree \(\gg _C 1\).

Since \(\mathbb {D}(f,\chi \xi e_{\theta }; N_j) \le \mathbb {D}(f,\chi \xi e_{\theta };N) \le \mathbb {D}(f,\chi \xi e_{\theta }; N_{j+1})\) whenever \(N_j \le N < N_{j+1}\), we deduce that

$$\begin{aligned} \mathbb {D}(f,\chi \xi e_{\theta };N) \ll _C 1, \end{aligned}$$
(16)

uniformly in N. This completes the proof of the first claim.

To prove the second claim where \(f:\mathcal {M}\rightarrow \{-1,+1\}\), we take the conclusion (16) and apply the triangle inequality (9) to deduce that

$$\begin{aligned} \mathbb {D}(1,\chi ^2\xi ^2e_{2\theta };N)\le 2\mathbb {D}(f,\chi \xi e_{\theta };N) \ll _C 1. \end{aligned}$$
(17)

Arguing similarly as in (14), this implies that \(\chi ^2\) is principal and \(\xi ^2\equiv 1\). Furthermore, in this case, by (7) we have

$$\begin{aligned} \begin{aligned} \mathbb {D}(1,\chi ^2e_{2\theta };N)^2&= \mathbb {D}(1,e_{2\theta };N)^2+O_C(1)\\&=\log N - \sum _{n \le N} \frac{\cos (4\pi \theta n)}{n} + O_C(1). \end{aligned} \end{aligned}$$
(18)

If \(2\theta \not \equiv 0\pmod {1}\), there exist \(\eta >0\) and \(B>0\) such that every interval of length B contains an integer n for which \(\cos (4\pi \theta n)\le 1-\eta \). Inserting this into (18) shows that \(\mathbb {D}(1,\chi ^2e_{2\theta };N)^2 \gg \log N\), which contradicts (17) once N is large. We conclude that we must have \(2\theta \equiv 0 \pmod {1}\).
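The dichotomy in this step can be checked numerically. The sketch below evaluates \(\mathbb {D}(1,e_{\varphi };N)^2\) with \(\varphi = 2\theta \) exactly, grouping primes by degree and using Gauss's formula for the number of monic irreducibles of degree n over \(\mathbb {F}_q\), and compares it with the main term \(\log N - \sum _{n\le N}\cos (2\pi \varphi n)/n\) from (18). The function names and the sample values \(q=3\), \(N=200\) are assumptions made for this example.

```python
import math

def mobius_int(n):
    """Classical Moebius function on positive integers."""
    result, p = 1, 2
    while p * p <= n:
        if n % p == 0:
            n //= p
            if n % p == 0:
                return 0
            result = -result
        p += 1
    return -result if n > 1 else result

def prime_poly_count(q, n):
    """Number of monic irreducible polynomials of degree n over F_q (Gauss's formula)."""
    total = sum(mobius_int(d) * q ** (n // d) for d in range(1, n + 1) if n % d == 0)
    return total // n

def dist_sq(q, phi, N):
    """D(1, e_phi; N)^2 = sum_{n <= N} pi_q(n) q^{-n} (1 - cos(2 pi phi n))."""
    return sum(prime_poly_count(q, n) / q ** n * (1 - math.cos(2 * math.pi * phi * n))
               for n in range(1, N + 1))

def main_term(phi, N):
    """log N - sum_{n <= N} cos(2 pi phi n) / n, the main term in (18) with phi = 2 theta."""
    return math.log(N) - sum(math.cos(2 * math.pi * phi * n) / n for n in range(1, N + 1))

bounded = dist_sq(3, 1.0, 200)      # theta = 1/2, so 2 theta = 0 (mod 1)
growing = dist_sq(3, 0.5, 200)      # theta = 1/4, so 2 theta = 1/2 (mod 1)
```

When \(2\theta \equiv 0 \pmod {1}\) the distance vanishes up to rounding, whereas for \(\theta = 1/4\) it grows like \(\log N\), in line with the argument above.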

Lastly, if q is odd, then by the statement in Theorem 2.2 about real-valued f, we also have \(\xi \equiv 1\) (in fact, there are no nontrivial real-valued short interval characters then). The second claim thus follows. \(\square \)

4.2 Reduction to modified characters

We have demonstrated that in order to characterize those completely multiplicative functions f with bounded short sum discrepancy \(\mathcal {S}_f\), it suffices to treat functions that are pretentious to a twisted character. By means of the following proposition, however, we can in fact restrict ourselves to functions that differ from a twisted character at only finitely many irreducibles. The proof of the proposition utilizes what we call a “rotation trick”; see [10] for applications of the same idea in the integer setting.

Proposition 4.2

Let \(f: \mathcal {M} \rightarrow S^1\) be completely multiplicative. Suppose there exist \(Q \in \mathcal {M}\) and a primitive Dirichlet character \(\chi \) modulo Q, a primitive short interval character \(\xi \) of length \(\nu \ge 0\) and \(\theta \in [0,1]\) such that \(\mathbb {D}(f,\chi \xi e_{\theta };\infty ) < \infty \). Let

$$\begin{aligned} S:= \{P \in \mathcal {P}: f(P) \ne \chi (P)\xi (P)e_{\theta }(P)\}. \end{aligned}$$

If \(|S| = \infty \) then \(\mathcal {S}_f=\infty \).

Proof

Let \(f:\mathcal {M}\rightarrow S^1\) be completely multiplicative with \(|S| = \infty \), and assume for the sake of contradiction that \(\mathcal {S}_f < \infty \). We will prove that

$$\begin{aligned} \limsup _{H\rightarrow \infty }\limsup _{N \rightarrow \infty } \max _{G_0 \in \mathcal {M}_N} \left| \sum _{G \in I_H(G_0)} f(G)\right| = \infty . \end{aligned}$$
(19)

For N large enough in terms of H, both \(e_{\theta }\) and \(\xi \) are constant on any short interval \(I_H(G_0)\) with \(G_0\in \mathcal {M}_N\), so we may replace f by \(f\overline{\xi }e_{-\theta }\) in (19) and (still calling this new function f for convenience) we may assume that \(\mathbb {D}(f,\chi ;\infty )<\infty \) and that \(S:=\{P\in \mathcal {P}:\,\, f(P)\ne \chi (P)\}\) is infinite.

Let \(1 \ll H \ll n \ll N\) be parameters, each of which is large enough in terms of the parameters to the left of it. Since \(\mathbb {D}(f,\chi ;\infty )<\infty \), we can impose the condition

$$\begin{aligned} \mathbb {D}(f,\chi ; n,\infty )^2&:= \sum _{\begin{array}{c} P \in \mathcal {P} \\ \text {deg}(P) > n \end{array}} \frac{1-\text {Re}(f(P)\overline{\chi }(P))}{q^{\text {deg}(P)}} \le 1/q^{3H}. \end{aligned}$$
(20)

Since \(|S|=\infty \), there exists a function \(F:\mathbb {N}\rightarrow \mathbb {R}_{\ge 0}\), depending only on H, such that there are \(> q^{2\,H}\) primes \(P \in S\) with \(\text {deg}(P) \in (n,F(n)]\).

For each \(M \in \mathbb {F}_q[t]\) of degree \(<H\) pick some \(P_M \in S\) such that the \(P_M\) are all distinct and such that \(\text {deg}(P_M) \in (n,F(n)]\), and let \(k_M\ll _{H,n}1\) be a positive integer to be chosen later. We set

$$\begin{aligned} \Gamma := \prod _{G \in \mathcal {M}_{\le n}} G^2 \cdot \prod _{\text {deg}(M) < H} P_M^{k_M}; \end{aligned}$$

note that

$$\begin{aligned} \text {deg}(\Gamma )&\le 2\sum _{G \in \mathcal {M}_{\le n}} \text {deg}(G) + \sum _{\text {deg}(M)< H} k_M \text {deg}(P_M)\\&\le 8nq^n+ \left( \max _{\text {deg}(M) < H} k_M\right) q^HF(n) \\&\le (\log N)/(50 \log q), \end{aligned}$$

if H is large enough and N is large enough in terms of n and H.

By the Chinese remainder theorem, we can choose \(R\in \mathcal {M}_{2\text {deg}(\Gamma )}\) such that

$$\begin{aligned} R&\equiv 0 \left( \mod \prod _{G \in \mathcal {M}_{\le n}} G^2\right) \\ R&\equiv - M+P_M^{k_M} \left( \mod P_M^{k_M+1}\right) \end{aligned}$$

for all \(M\in \mathbb {F}_q[t]\) with \(\text {deg}(M) < H\).

Note that if \(M\ne M'\) are in \(\mathbb {F}_q[t]\) and \(\text {deg}(M),\text {deg}(M')<H\), then \(P_{M'} \not \mid (R+M)\), since otherwise \(P_{M'}\mid (M'-M)\) but \(\text {deg}(P_{M'})>H\). Therefore,

$$\begin{aligned} (R+M,\Gamma ) = (M,\Gamma ) \cdot P_M^{k_M} = \widetilde{M}P_M^{k_M}, \end{aligned}$$
(21)

where, setting a to be the leading coefficient of M, we put \(\widetilde{M}(t):=M(t)/a\).

We consider the double sum

$$\begin{aligned} \Sigma := q^{-N}\sum _{\text {deg}(M) < H} \sum _{G \in \mathcal {M}_N} f(G \Gamma + R + M). \end{aligned}$$

Since \(\mathcal {S}_f < \infty \), swapping the orders of summation, summing in M and applying the triangle inequality, we see that

$$\begin{aligned} |\Sigma | \le q^{-N} \sum _{G \in \mathcal {M}_N} \left| \sum _{\text {deg}(M)< H} f(G\Gamma + R + M)\right| \le \mathcal {S}_f+1 < \infty . \end{aligned}$$

Now fix \(M\in \mathcal {M}_{<H}\) for the moment. Set

$$\begin{aligned} d_M:= a(R+M,\Gamma )=MP_M^{k_M},\quad \Gamma _M:= \Gamma /d_M. \end{aligned}$$

Factoring out primes in common with \(\Gamma \) and noting that \(\text {deg}(R+M)<N\), we have

$$\begin{aligned} \sum _{G \in \mathcal {M}_N} f(G\Gamma + R+M)&= f(d_M) \sum _{G \in \mathcal {M}_N} f\left( G\Gamma _M + \frac{R+M}{d_M}\right) \\&= f(d_M) \sum _{G' \in \mathcal {M}_{N+\text {deg}(\Gamma _M)}} f(G') 1_{G' \equiv (R+M)/d_M \pmod {\Gamma _M}}. \end{aligned}$$

Using orthogonality of Dirichlet characters modulo \(\Gamma _M\), the above expression equals

$$\begin{aligned} \frac{f(d_M)}{\phi (\Gamma _M)} \sum _{\psi \pmod {\Gamma _M}} \psi ((R+M)/d_M) \sum _{G' \in \mathcal {M}_{N+\text {deg}(\Gamma _M)}} f(G')\overline{\psi }(G'). \end{aligned}$$

Choosing n large enough in terms of H, we can guarantee that \(Q|\Gamma _M\), regardless of M. Thus, there is a character \(\chi '\) modulo \(\Gamma _M\) that is induced by \(\chi \). If \(\psi \ne \chi '\) then, provided N is large enough in terms of n, [11, Corollary 3.7] yields

$$\begin{aligned} \max _{\text {deg}(M) < H}\, \max _{\begin{array}{c} \psi \pmod {\Gamma _M} \\ \psi \ne \chi ' \end{array}} \left| \sum _{G' \in \mathcal {M}_{N+\text {deg}(\Gamma _M)}} f(G') \overline{\psi }(G')\right| \ll q^{N+\text {deg}(\Gamma )} N^{-1/4 + o(1)} \ll q^N/N^{1/5}, \end{aligned}$$

since \(\text {deg}(\Gamma ) \le \frac{\log N}{50\log q}\). Thus,

$$\begin{aligned} \begin{aligned}&q^{-N} \sum _{G \in \mathcal {M}_N} f(G\Gamma +R+M)\\&\quad = f(d_M)\chi '\left( \frac{R+M}{d_M}\right) \frac{q^{\text {deg}(\Gamma _M)}}{\phi (\Gamma _M)} q^{-N-\text {deg}(\Gamma _M)} \sum _{G' \in \mathcal {M}_{N+\text {deg}(\Gamma _M)}} f(G')\overline{\chi '}(G') + O(N^{-1/5}). \end{aligned} \end{aligned}$$
(22)

Observe that using (21), \(R/M\equiv 0\pmod {Q}\) and \((R/M+1,\Gamma _M) = (P_M^{k_M},\Gamma _M) = 1\), we have

$$\begin{aligned} \chi '\left( \frac{R+M}{d_M}\right) =\chi '\left( \frac{R/M+1}{P_M^{k_M}}\right) =\overline{\chi }(P_M)^{k_M}. \end{aligned}$$
(23)

Applying Delange’s theorem over function fields to \(f\overline{\chi '}\) (see [9, Theorem 1.4.1]), and recalling that \(\chi '(P)=0\) if \(P\mid \Gamma _M\), we see that

$$\begin{aligned}&\sum _{G' \in \mathcal {M}_{N+\text {deg}(\Gamma _M)}} f(G') \overline{\chi '}(G') \nonumber \\&\quad = q^{N+\text {deg}(\Gamma _M)} \Bigg (\frac{\phi (\Gamma _M)}{q^{\text {deg}(\Gamma _M)}} \prod _{\begin{array}{c} P \in \mathcal {P} \\ \text {deg}(P)> n \\ P \ne P_{M'} \, \forall M':\, \text {deg}(M') < H, M' \ne M \end{array}} \left( 1-q^{-\text {deg}(P)}\right) \left( 1-f\overline{\chi '}(P)q^{-\text {deg}(P)}\right) ^{-1}\nonumber \\&\quad \quad + O\Bigg (\mathbb {D}\Big (f,\chi ; \frac{\log N}{2\log q},\infty \Big ) + N^{-1/2}\Bigg )\Bigg ) \nonumber \\&\quad = \frac{\phi (\Gamma _M)}{q^{\text {deg}(\Gamma _M)}} q^{N+\text {deg}(\Gamma _M)}\Bigg (1+O\Bigg (\sum _{\begin{array}{c} P\in \mathcal {P}\\ \text {deg}(P)>n \end{array}}\frac{1-\text {Re}(f(P)\overline{\chi '}(P))}{q^{\text {deg}(P)}} + q^{-3H/2} + N^{-1/2}\Bigg )\Bigg ) \nonumber \\&\quad = \frac{\phi (\Gamma _M)}{q^{\text {deg}(\Gamma _M)}} q^{N+\text {deg}(\Gamma _M)}\left( 1+O(q^{-3H/2}+ N^{-1/2})\right) \end{aligned}$$
(24)

by (20), provided that n is large enough in terms of H. Since the above can be done uniformly over all \(\text {deg}(M) < H\), we deduce upon inserting (24), (23) and (21) into (22) that when N is large enough relative to H,

$$\begin{aligned} \mathcal {S}_f+1&\ge \Sigma = q^{-N}\sum _{\text {deg}(M)< H} \sum _{G \in \mathcal {M}_N} f(G\Gamma + R + M) = \sum _{\text {deg}(M)< H} f(d_M) \overline{\chi }(P_M)^{k_M} + o(1)\\&= \sum _{\begin{array}{c} \text {deg}(M)< H \end{array}} f\overline{\chi }(P_M)^{k_M} \prod _{\text {deg}(P) \le n} f(P)^{\nu _P(M)}+o(1)\\&= \sum _{\text {deg}(M) < H} f\overline{\chi }(P_M)^{k_M} f(M) + o(1). \end{aligned}$$

We now show that there is a choice of the multiplicities \(k_M\) that makes

$$\begin{aligned} \left| \sum _{\text {deg}(M) < H} f\overline{\chi }(P_M)^{k_M} f(M)\right| \ge q^H/10, \end{aligned}$$

say, which will provide the desired contradiction for H large enough. This follows from the following lemma.

Lemma 4.3

Let \(m\ge 1\), let \(w_1,\ldots , w_m\in S^1\), and let \(\zeta _1,\ldots , \zeta _m\in S^1\setminus \{1\}\). Then there exist \(k_j\in \mathbb {N}\) such that

$$\begin{aligned} \left| \sum _{j\le m}\zeta _j^{k_j}w_j\right| \ge m/7. \end{aligned}$$
(25)

Proof

By the pigeonhole principle, there exists a closed arc I of the unit circle \(S^1\) of length \(2\pi /3\) that contains \(\ge m/3\) of the complex numbers \(w_j\). Let J be the set of j for which \(w_j\in I\). Form a semicircle \(\mathcal {C}\subset S^1\) such that \(I\subset \mathcal {C}\) and such that the midpoint of I is the midpoint of the arc of \(\mathcal {C}\).

Now, for every \(j\in J\), pick \(k_j\) such that \(|\zeta _j^{k_j}-1|\le 1/(100\,m)\). For every \(j\in \{1,\ldots , m\}{\setminus } J\), pick \(k_j\) such that \(\zeta _j^{k_j}w_j\in \mathcal {C}\); this is clearly always possible since \(\{\zeta _j^{k}:\,\, k\in \mathbb {N}\}\) intersects any semicircle. Let \(\alpha \in S^1\) be such that the half-plane determined by \(\mathcal {C}\) is \(\{z\in \mathbb {C}:\,\, \text {Re}(\alpha z)\ge 0\}\). Note that \(\text {Re}(\alpha z)\ge \frac{1}{2}\) whenever \(z\in I\). Thus

$$\begin{aligned} \left| \sum _{j\le m}\zeta _j^{k_j}w_j\right|&\ge \text {Re}\left( \alpha \sum _{j\le m}\zeta _j^{k_j}w_j\right) \\&\ge \text {Re}\left( \alpha \sum _{j\in J}\zeta _j^{k_j}w_j\right) \\&\ge \frac{m}{3}\cdot \left( \frac{1}{2}-\frac{1}{100}\right) \\&\ge \frac{m}{7}, \end{aligned}$$

which proves the claim. \(\square \)
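The selection of exponents in this proof is constructive, and the following sketch follows it step by step. It is a floating-point illustration under several assumptions: the function name `choose_exponents`, the search cap `max_k`, the use of three fixed arcs for the pigeonhole step and the small tolerance in the half-plane test are all choices made for this example.

```python
import cmath
import math

def choose_exponents(w, zeta, max_k=100_000):
    """Pick k_j >= 1 as in the proof; raises StopIteration if no k <= max_k works."""
    m = len(w)
    third = 2 * math.pi / 3

    def arc_index(z):
        # Index of the arc [2*pi*i/3, 2*pi*(i+1)/3) containing z.
        return int((cmath.phase(z) % (2 * math.pi)) // third) % 3

    # Pigeonhole: one of the three arcs contains at least m/3 of the w_j.
    counts = [0, 0, 0]
    for z in w:
        counts[arc_index(z)] += 1
    best = counts.index(max(counts))
    # alpha rotates the midpoint of the chosen arc I to the point 1.
    alpha = cmath.exp(-1j * third * (best + 0.5))

    exponents = []
    for wj, zj in zip(w, zeta):
        if arc_index(wj) == best:
            # j in J: keep zeta_j^k w_j essentially equal to w_j.
            ok = lambda k: abs(zj ** k - 1) <= 1 / (100 * m)
        else:
            # j not in J: move zeta_j^k w_j into the closed half-plane Re(alpha z) >= 0.
            ok = lambda k: (alpha * zj ** k * wj).real >= -1e-12
        exponents.append(next(k for k in range(1, max_k + 1) if ok(k)))
    return exponents

# Sample data (arbitrary points on the unit circle, with zeta_j != 1).
w = [cmath.exp(1j * (0.4 + 1.1 * j)) for j in range(6)]
zeta = [cmath.exp(1j * (1 + j)) for j in range(6)]
ks = choose_exponents(w, zeta)
total = abs(sum(z ** k * wj for z, k, wj in zip(zeta, ks, w)))
```

For the sample data the resulting sum has absolute value at least \(m/7\) with \(m = 6\), as (25) predicts.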

Taking \(\zeta _M=f\overline{\chi }(P_M)\ne 1\) and \(w_M=f(M)\) in the lemma, a choice of multiplicities \(k_M\ll _{n,H}1\) can be made, and the claim follows. \(\square \)

4.3 The case of modified characters

It now remains to consider functions that differ at only finitely many primes from a non-principal Dirichlet character. Indeed, as was noted in the proof of Proposition 4.2, if

$$\begin{aligned} \limsup _{H\rightarrow \infty }\limsup _{N \rightarrow \infty } \max _{G_0 \in \mathcal {M}_{N}} \left| \sum _{G \in I_H(G_0)} f(G)\right| = \infty , \end{aligned}$$

holds for a function f, then it also holds for the function \(f\overline{\xi }e_{-\theta }\), so we may assume by Proposition 4.2 that \(|\{P\in \mathcal {P}:\,\, f(P)\ne \chi (P)\}|<\infty \).

This is precisely the case of modified characters (see Definition 1.6 above).

Remark 4.4

Note that if f differs from a non-principal Dirichlet character \(\chi '\) modulo \(Q\) at only finitely many primes, forming a set S, say, then, setting \(\chi := \chi ' \chi _0^{(Q_S)}\), where \(Q_S:= \prod _{P\in S} P\) and \(\chi _0^{(Q_S)}\) denotes the principal character modulo \(Q_S\), we see that f is a modified character modulo \([Q,Q_S]\).

4.3.1 Modified characters with at least two prime factors

The last major ingredient that we require before proceeding to the proof of Theorem 1.3 involves showing that modified characters have unbounded short sum discrepancy, provided the modulus has at least two distinct prime factors. We start with a lemma that will be used subsequently.

Lemma 4.5

For a Dirichlet character \(\chi \pmod {Q}\) with \(Q\in \mathcal {M}\) we define the Gauss sum

$$\begin{aligned} \tau (\chi )=\sum _{A\pmod {Q}}\chi (A)e_{\mathbb {F}}\left( \frac{A}{Q}\right) . \end{aligned}$$

Then \(\tau (\chi )\ne 0\) whenever \(\chi \pmod {Q}\) is primitive and non-principal.

Proof

If \(\chi \pmod {Q}\) is primitive and non-principal, the same argument as in the integer case (see [2, Section 2]) shows that \(|\tau (\chi )|=q^{\text {deg}(Q)/2}\), so the claim follows. \(\square \)

We will also need the following formula for the Gauss sums

$$\begin{aligned} \tau (\chi ,B):= \sum _{A \pmod {Q}} \chi (A) e_{\mathbb {F}}\left( \frac{AB}{Q}\right) , \end{aligned}$$

particularly when \(\chi \) is imprimitive.

Lemma 4.6

Let \(Q = Q_1Q_2 \in \mathcal {M}\), where \((Q_1,Q_2) = 1\) and \(Q_2\) is squarefree. Let \(\chi \) be a character modulo Q, induced by a primitive character \(\chi ^{*}\) modulo \(Q_1\). Then for any non-zero \(B \in \mathbb {F}_q[t]\),

$$\begin{aligned} \tau (\chi ,B) = \tau (\chi ^{*}) \chi ^{*}(Q_2)\overline{\chi ^{*}}(B)\phi ((Q_2,B))\mu (Q_2/(Q_2,B))1_{(Q,B)|Q_2}. \end{aligned}$$

Proof

Following the proof of [14, Lemma 5.4] in the function field setting, we find that

$$\begin{aligned} \tau (\chi ,B) = \tau (\chi ^{*})\overline{\chi ^{*}}(B/(Q,B)) \frac{\phi (Q)}{\phi (Q/(Q,B))} \mu (Q_2/(Q,B)) \chi ^{*}(Q_2/(Q,B)) \end{aligned}$$

if \((Q,B)|Q_2\), and \(\tau (\chi ,B)=0\) otherwise. We focus on the former case. Since \((Q_1,Q_2) = 1\) and \(Q_2\) is squarefree, B is coprime to \(Q_1 = \text {cond}(\chi )\), \((Q,B) = (Q_2,B)\) and we can simplify the character factors to give \(\chi ^{*}(Q_2)\overline{\chi ^{*}}(B)\). Furthermore, we have

$$\begin{aligned} \frac{\phi (Q)}{\phi (Q/(Q,B))} = \frac{\phi (Q_2)}{\phi (Q_2/(Q_2,B))} = \phi ((Q_2,B)), \end{aligned}$$

which implies the claim. \(\square \)

Now the result about modified characters modulo \(Q\ne P^r\) follows in a strong form from the following result.

Proposition 4.7

Let \(f: \mathcal {M} \rightarrow S^1\) be a modified character modulo \(Q\in \mathcal {M}\), associated with a non-principal character \(\chi \), induced by a primitive character \(\chi ^{*}\) modulo \(Q^{*}\). Assume moreover that \(Q/Q^{*}\) is squarefree and coprime to \(Q^{*}\).

Let \(N \ge 1\) be large. Then for any \(1 \le T \le N/(10(\text {deg}(Q))^{\omega (Q)+1})\) there is a choice of \(H\in [T, T (\text {deg}(Q))^{\omega (Q)}]\) such that

$$\begin{aligned} \max _{G_0 \in \mathcal {M}_{\le N}} \left| \sum _{G \in I_H(G_0)} f(G)\right| \gg _Q H^{\frac{1}{2}(\omega (Q)-1)}. \end{aligned}$$

Remark 4.8

As we shall see, the assumption that \(Q/Q^{*}\) be squarefree and coprime to \(Q^{*}\) is satisfied in our application.

The proof is based on a careful analysis of Ramanujan sums. For \(G\in \mathcal {M},H \in \mathbb {F}_q[t]\), the Ramanujan sum \(c_G(H)\) is defined by

$$\begin{aligned} c_G(H):= \mathop {{\mathop {\sum }\nolimits ^{*}}}\limits _{A \pmod {G}} e_{\mathbb {F}}\left( \frac{AH}{G}\right) , \end{aligned}$$

where \(*\) in the sum denotes summation over invertible residue classes. Ramanujan sums satisfy the relation

$$\begin{aligned} \sum _{D|G} c_D(H) = q^{\text {deg}(G)} 1_{G|H}, \end{aligned}$$
(26)

so that by Möbius inversion we get

$$\begin{aligned} c_G(H) = \sum _{E|G} \mu (G/E) q^{\text {deg}(E)} 1_{E|H}. \end{aligned}$$
(27)
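The passage between (26) and (27) can be sanity-checked by brute force. The sketch below takes (27) as the definition of \(c_G(H)\) over \(\mathbb {F}_2[t]\) and verifies (26) for all small G and H; it does not evaluate the exponential sum itself. The bitmask encoding, the restriction to \(q = 2\) and the helper names are assumptions made for this example.

```python
# Polynomials over F_2 as ints: bit i is the coefficient of t^i (so q = 2).

def deg(g):
    return g.bit_length() - 1

def polymod(a, b):
    """Remainder of a divided by b in F_2[t]."""
    while deg(a) >= deg(b):
        a ^= b << (deg(a) - deg(b))
    return a

def polydiv(a, b):
    """Quotient of a divided by b in F_2[t]."""
    quo = 0
    while deg(a) >= deg(b):
        shift = deg(a) - deg(b)
        quo |= 1 << shift
        a ^= b << shift
    return quo

def divisors(g):
    """All (monic) divisors of g; each has integer encoding at most g."""
    return [d for d in range(1, g + 1) if polymod(g, d) == 0]

def mobius(g):
    """Moebius function on F_2[t] via trial division."""
    sign, p = 1, 2
    while deg(g) >= 1:
        if 2 * deg(p) > deg(g):      # what is left of g is a single prime
            return -sign
        if polymod(g, p) == 0:
            g = polydiv(g, p)
            if polymod(g, p) == 0:   # p^2 divided the original g
                return 0
            sign = -sign
        p += 1
    return sign

def ramanujan(g, h):
    """c_G(H) computed from (27): sum over E | G with E | H of mu(G/E) q^{deg E}."""
    return sum(mobius(polydiv(g, e)) * 2 ** deg(e)
               for e in divisors(g) if polymod(h, e) == 0)

def relation_26_holds(g, h):
    """Check sum_{D | G} c_D(H) == q^{deg G} * 1_{G | H}."""
    lhs = sum(ramanujan(d, h) for d in divisors(g))
    rhs = 2 ** deg(g) if polymod(h, g) == 0 else 0
    return lhs == rhs
```

Taking \(H = 0\) in `ramanujan` recovers \(c_G(0) = \phi (G)\), for instance the value 3 for \(G = t^2+t+1\).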

Lemma 4.9

Let \(Q \in \mathcal {M}\), \(\text {deg}(Q)\ge 1\), and let \(n \in \mathbb {Z}\). Then

$$\begin{aligned} \sum _{\text {deg}(M) < n} c_Q(M) = {\left\{ \begin{array}{ll} \phi (Q) &{}\text { if }n \le 0 \\ 0 &{}\text { if }n \ge 1.\end{array}\right. } \end{aligned}$$

Proof

Let S(n) denote the sum on the left-hand side. Then

$$\begin{aligned} S(n) = \sum _{\text {deg}(M)< n} c_Q(M) = c_Q(0) + \sum _{0 \le m < n} \sum _{\text {deg}(M) = m} c_Q(M), \end{aligned}$$

where the sum on the right-hand side is interpreted as empty unless \(n \ge 1\). If \(n \le 0\), we are done since \(c_Q(0) = \phi (Q)\), so suppose \(n \ge 1\).

Expanding \(c_Q(M)\) using (27), we get

$$\begin{aligned} S(n)&= c_Q(0) + \sum _{E|Q} \mu (Q/E)q^{\text {deg}(E)} \sum _{0 \le m< n} \sum _{\begin{array}{c} \text {deg}(M) = m \\ E|M \end{array}} 1 \\&= c_Q(0) + \sum _{E|Q} \mu (Q/E) q^{\text {deg}(E)} \sum _{0 \le m< n} (q-1)q^{m-\text {deg}(E)} 1_{m \ge \text {deg}(E)} \\&= c_Q(0) + (q-1) \sum _{E|Q} \mu (Q/E) \sum _{\text {deg}(E) \le m < n} q^m. \end{aligned}$$

Summing the geometric series, using \(\sum _{E|Q}\mu (Q/E) = 0\) for \(\text {deg}(Q)\ge 1\) and (27), we get

$$\begin{aligned} S(n)&= c_Q(0) + \sum _{E|Q}\mu (Q/E) (q^{n} - q^{\text {deg}(E)}) = c_Q(0) - \sum _{E|Q}\mu (Q/E) q^{\text {deg}(E)}\\ {}&= c_Q(0) - c_Q(0) = 0. \end{aligned}$$

This completes the proof of the claim. \(\square \)

4.3.2 Proof of Proposition 4.7

Since \(\chi \pmod {Q}\) is non-principal, we have \(\text {deg}(Q) \ge 1\). Write \(Q = P_1^{r_1}\cdots P_k^{r_k}\) where the \(P_j\) are all distinct, and set \(d_j:= \text {deg}(P_j)\) for all j. Suppose \(f:\mathcal {M} \rightarrow S^1\) is completely multiplicative, with \(f(P) = \chi (P)\) for all \(P \ne P_j\). We put \(H = Tr_1d_1\cdots r_kd_k\) and observe that we have the inequalities

$$\begin{aligned} T \le H \le T (\max _{1 \le j \le k} r_jd_j)^{\omega (Q)} \le T (\text {deg}(Q))^{\omega (Q)}, \end{aligned}$$

as required.

Let \(N > 10T(\text {deg}(Q))^{\omega (Q)+1} \ge 10H\text {deg}(Q)\). Then

$$\begin{aligned} \max _{G_0 \in \mathcal {M}_{\le N}} \left| \sum _{G \in I_{H}(G_0)} f(G)\right| ^2&\ge q^{-N} \sum _{G_0 \in \mathcal {M}_N} \left| \sum _{\text {deg}(M)< H} f(G_0+M) \right| ^2 \nonumber \\&= q^{-N} \sum _{G_0 \in \mathcal {M}_N} \left| \sum _{\text {rad}(D)|Q} f(D) \sum _{\text {deg}(M) < H} \chi \left( \frac{G_0+M}{D}\right) 1_{D|(G_0+M)} \right| ^2 \nonumber \\&=: \mathcal {T}. \end{aligned}$$
(28)

We will show the following lower bound for \(\mathcal {T}\).

Lemma 4.10

Assume the hypotheses of Proposition 4.7, and write \(Q_S:= Q/Q^{*}\). Then

$$\begin{aligned} \mathcal {T} \ge \frac{\phi (Q)}{q^{\text {deg}(Q)}} \left( \frac{\phi (Q_S)}{q^{\text {deg}(Q_S)}}\right) ^2\prod _{P|Q_S}\left| 1-f\overline{\chi ^{*}}(P)q^{-\text {deg}(P)}\right| ^{-2} \left( q^H \sum _{\begin{array}{c} \text {rad}(D)|Q \\ \text {deg}(D)\ge H \end{array}} q^{-\text {deg}(D)}\right) + o_{N \rightarrow \infty }(1). \end{aligned}$$

Proof

(Deduction of Proposition 4.7 assuming Lemma 4.10) Note that the product

$$\begin{aligned} \prod _{P|Q_S}|1-f\overline{\chi ^{*}}(P)q^{-\text {deg}(P)}|^{-2} \end{aligned}$$

is non-vanishing, and therefore \(\gg _Q 1\). From Lemma 4.10, we thus obtain

$$\begin{aligned} \mathcal {T}&\gg _Q q^H\sum _{\begin{array}{c} \text {rad}(D) |Q \\ \text {deg}(D) \ge H \end{array}} q^{-\text {deg}(D)} + o_{N \rightarrow \infty }(1) \ge \sum _{\begin{array}{c} \text {rad}(D)|Q \\ \text {deg}(D) = H \end{array}} 1 + o_{N \rightarrow \infty }(1) \\&= |\{\varvec{\alpha } \in \mathbb {N}_0^k : \alpha _1r_1d_1 + \cdots + \alpha _kr_kd_k = H\}| + o_{N \rightarrow \infty }(1), \end{aligned}$$

since \(Q = P_1^{r_1}\cdots P_k^{r_k}\) with \(d_j = \text {deg}(P_j)\), for each j. As \(H = Tr_1d_1 \cdots r_k d_k\),

$$\begin{aligned}&|\{\varvec{\alpha } \in \mathbb {N}_0^k : \alpha _1r_1d_1 + \cdots + \alpha _kr_kd_k = H\}| \\&\quad \ge |\{\varvec{\alpha } \in \mathbb {N}_0^k : \alpha _1r_1d_1+\cdots +\alpha _kr_kd_k = Tr_1d_1\cdots r_kd_k \text { and } \prod _{\begin{array}{c} 1 \le i \le k \\ i \ne j \end{array}} r_id_i |\alpha _j \text { for all } 1\le j \le k\}| \\&\quad = |\{\varvec{\beta } \in \mathbb {N}_0^k : \beta _1+ \cdots + \beta _k = T\}| = \left( {\begin{array}{c}T+k-1\\ k-1\end{array}}\right) \gg _k T^{k-1} \gg _Q H^{k-1}. \end{aligned}$$

In particular, we obtain \(\mathcal {T} \gg _Q H^{\omega (Q)-1}\). It thus follows from (28) that

$$\begin{aligned} \max _{G_0 \in \mathcal {M}_{\le N}} \left| \sum _{G \in I_{H}(G_0)} f(G)\right| ^2 \ge \mathcal {T} \gg _Q H^{\omega (Q)-1}, \end{aligned}$$

as required. \(\square \)
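The counting step in this deduction can be tested on small cases. The sketch below counts the solutions of \(\alpha _1c_1+\cdots +\alpha _kc_k = H\) with \(H = Tc_1\cdots c_k\), where \(c_j\) plays the role of \(r_jd_j\), and compares the count with \(\left( {\begin{array}{c}T+k-1\\ k-1\end{array}}\right) \). The function names and the sample weights are assumptions made for this example.

```python
from math import comb, prod

def count_solutions(weights, target):
    """Number of alpha in N_0^k with sum(alpha_j * weights[j]) == target."""
    ways = [1] + [0] * target
    for c in weights:                      # standard coin-change recursion
        for s in range(c, target + 1):
            ways[s] += ways[s - c]
    return ways[target]

def compare(weights, T):
    """Return (exact count for H = T * prod(weights), binomial lower bound)."""
    H = T * prod(weights)
    return count_solutions(weights, H), comb(T + len(weights) - 1, len(weights) - 1)

exact, lower = compare([2, 4], 1)          # k = 2, H = 8: solutions of 2a + 4b = 8
```

In this sample the exact count is 3 and the lower bound is 2; the bound counts only those \(\varvec{\alpha }\) with \(\prod _{i\ne j} c_i \mid \alpha _j\) for every j.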

It therefore remains to prove Lemma 4.10.

Proof of Lemma 4.10

The proof of the lemma is a technical computation, but can be divided into several steps.

Step 1: Reduction to a sum over a hyperplane \(\pmod {Q}\). Note that if \(\text {deg}(D) > N\) then the only solution to \(D|(G_0+M)\) requires \(G_0 + M = 0\), which is impossible since \(\text {deg}(M) < H < N = \text {deg}(G_0)\). Thus we may additionally assume that \(\text {deg}(D) \le N\) in the inner sum defining \(\mathcal {T}\). Splitting into residue classes modulo Q and then expanding the square (and making the change of variables \(M \mapsto -M\) for later convenience), we have

$$\begin{aligned} \mathcal {T}&= q^{-N} \sum _{G_0 \in \mathcal {M}_N} \left| \sum _{\begin{array}{c} \text {rad}(D)|Q \\ \text {deg}(D) \le N \end{array}} f(D) \sum _{\text {deg}(M)< H} \chi \left( \frac{G_0-M}{D}\right) 1_{D|(G_0-M)}\right| ^2 \\&= q^{-N} \sum _{G_0 \in \mathcal {M}_N} \left| \,\,\mathop {{\mathop {\sum }\nolimits ^{*}}}\limits _{A \pmod {Q}} \chi (A) \sum _{\begin{array}{c} \text {rad}(D)|Q \\ \text {deg}(D) \le N \end{array}} f(D) \sum _{\text {deg}(M)< H} 1_{D|(G_0-M)}1_{(G_0-M)/D \equiv A \pmod {Q}} \right| ^2 \\&= q^{-N} \mathop {{\mathop {\sum }\nolimits ^{*}}}\limits _{A_1,A_2 \pmod {Q}} \chi (A_1) \overline{\chi }(A_2) \sum _{\begin{array}{c} \text {rad}(D_j)|Q \\ \text {deg}(D_j) \le N \\ j = 1,2 \end{array}} f(D_1)\overline{f}(D_2) \\&\cdot \sum _{\begin{array}{c} \text {deg}(M_j) < H \\ j = 1,2 \end{array}} |\{G_0 \in \mathcal {M}_N : G_0 \equiv M_j \pmod {D_j} , (G_0-M_j)/D_j \equiv A_j \pmod {Q}, j = 1,2\}|. \end{aligned}$$

Fix momentarily \(D_1,D_2\) with \(\text {rad}(D_j) |Q\) for \(j = 1,2\), and set \(D:= (D_1,D_2)\) and \(\widetilde{D}_j:= D_j/D\) for \(j = 1,2\). Note that the pair of congruences \(G_0 \equiv M_j \pmod {D_j}\) for \(j = 1,2\) is solvable if and only if \(D|(M_2-M_1)\), and provided \(\text {deg}([D_1,D_2]) \le N\) the general solution has the form

$$\begin{aligned} G_0&= R[D_1,D_2] + \frac{M_1L_2D_2 + M_2L_1D_1}{D}\\&= R[D_1,D_2] + M_1 + \frac{M_2-M_1}{D}L_1D_1 = R[D_1,D_2] + M_2 - \frac{M_2-M_1}{D}L_2D_2, \end{aligned}$$

where \(L_1,L_2\) are reduced residue classes modulo \([D_1,D_2]\) that satisfy \(L_1D_1 + L_2D_2 = D\). Thus, provided that \(\text {deg}([D_1,D_2]) \le N\), we have

$$\begin{aligned}&|\{G_0 \in \mathcal {M}_N : G_0 \equiv M_j \pmod {D_j}, (G_0 - M_j)/D_j \equiv A_j \pmod {Q}, j = 1,2\}| \\&= \left| \left\{ R \in \mathcal {M}_{N-\text {deg}([D_1,D_2])} : {\left\{ \begin{array}{ll} R\widetilde{D}_2 + L_1(M_2-M_1)/D &{}\equiv A_1 \pmod {Q} \\ R\widetilde{D}_1 - L_2(M_2-M_1)/D &{}\equiv A_2 \pmod {Q} \end{array}\right. }\right\} \right| . \end{aligned}$$
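The solvability criterion \(D|(M_2-M_1)\) and the particular solution \(M_1 + \frac{M_2-M_1}{D}L_1D_1\) used above can be illustrated with an integer analogue. The sketch below works over \(\mathbb {Z}\) rather than \(\mathbb {F}_q[t]\), which is an assumption made purely for ease of experimentation; `bezout` and `solve_pair` are hypothetical helper names.

```python
def bezout(a, b):
    """Return (g, x, y) with a*x + b*y == g == gcd(a, b), for a, b >= 0."""
    if b == 0:
        return a, 1, 0
    g, x, y = bezout(b, a % b)
    return g, y, x - (a // b) * y


def solve_pair(m1, d1, m2, d2):
    """One solution x of x = m1 (mod d1), x = m2 (mod d2), or None.

    Integer analogue of G_0 = M_1 + ((M_2 - M_1)/D) * L_1 * D_1,
    where L_1*D_1 + L_2*D_2 = D = gcd(D_1, D_2).
    """
    d, l1, _l2 = bezout(d1, d2)
    if (m2 - m1) % d != 0:
        return None  # the pair of congruences is not solvable
    return m1 + (m2 - m1) // d * l1 * d1


if __name__ == "__main__":
    for d1 in range(1, 13):
        for d2 in range(1, 13):
            for m1 in range(d1):
                for m2 in range(d2):
                    x = solve_pair(m1, d1, m2, d2)
                    solvable = any(
                        n % d1 == m1 and n % d2 == m2 for n in range(d1 * d2)
                    )
                    assert (x is not None) == solvable
                    if x is not None:
                        assert x % d1 == m1 and x % d2 == m2
    print("solvability criterion and particular solution confirmed")
```

Any two solutions differ by a multiple of the least common multiple, which corresponds to the free parameter \(R[D_1,D_2]\) in the text.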

We note that even the condition \(\text {deg}([D_1,D_2]Q) < N\) may be assumed in what follows, since the contribution to \(\mathcal {T}\) from those \(D_1,D_2\) that lack this is

$$\begin{aligned}&\ll _Q q^{2H-N} \max _{\text {deg}(M_2) < H} \sum _{\begin{array}{c} \text {rad}(D_j) | Q \\ \text {deg}(D_1) \le \text {deg}(D_2) \le N \\ \text {deg}([D_1,D_2]Q) \ge N \end{array}} |\{G_0 \in \mathcal {M}_N : G_0 \equiv M_2 \pmod {D_2}\}| \\&\ll q^{2H} \sum _{\begin{array}{c} \text {rad}(D_2)|Q \\ \text {deg}(D_2) \ge (N-\text {deg}(Q))/2 \end{array}} q^{-\text {deg}(D_2)} \sum _{\begin{array}{c} \text {rad}(D_1)\mid Q\\ \text {deg}(D_1)\le N \end{array}}1\\&\ll _Q q^{2H-N/2} N^{O_Q(1)} = o_{N \rightarrow \infty }(1), \end{aligned}$$

as \(2H \le 2H\text {deg}(Q) < N/5\).

Earlier we had deduced that \(D|(M_2-M_1)\). Making the change of variables \(M'D = M_2-M_1\), we get

$$\begin{aligned} \mathcal {T}&= q^{H-N} \sum _{\begin{array}{c} \text {rad}(D_j)|Q \\ \text {deg}(Q[D_1,D_2])< N \end{array}} f(D_1)\overline{f}(D_2)\,\, \mathop {{\mathop {\sum }\nolimits ^{*}}}\limits _{A_1,A_2 \pmod {Q}} \chi (A_1)\overline{\chi }(A_2) \\&\cdot \sum _{\text {deg}(M') < H-\text {deg}(D)} \left| \left\{ R \in \mathcal {M}_{N-\text {deg}([D_1,D_2])} : {\left\{ \begin{array}{ll} R\widetilde{D}_2 + M'L_1 &{}\equiv A_1 \pmod {Q} \\ R\widetilde{D}_1 - M'L_2 &{}\equiv A_2 \pmod {Q} \end{array}\right. } \right\} \right| + o_{N \rightarrow \infty }(1); \end{aligned}$$

note that if \(\text {deg}(D) \ge H\) the summation contains the choice \(M'= 0\) alone. It is easy to verify that the system of congruences

$$\begin{aligned} {\left\{ \begin{array}{ll} R\widetilde{D}_2 + M'L_1 &{}\equiv A_1 \pmod {Q} \\ R\widetilde{D}_1-M'L_2 &{}\equiv A_2 \pmod {Q} \end{array}\right. } \end{aligned}$$

is solvable if, and only if,

$$\begin{aligned} {\left\{ \begin{array}{ll} M' &{}\equiv A_1\widetilde{D}_1 - A_2\widetilde{D}_2 \pmod {Q} \\ R &{}\equiv L_1A_2 + L_2A_1 \pmod {Q}. \end{array}\right. } \end{aligned}$$
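The equivalence above follows from \(L_1\widetilde{D}_1 + L_2\widetilde{D}_2 = 1\), which makes the determinant of the linear system a unit modulo Q. As a hedged illustration, the sketch below checks the integer analogue by brute force; the moduli and the tuples in `CASES` are arbitrary test data chosen for this example, not values from the paper.

```python
def solve_system(q, dt1, dt2, l1, l2, a1, a2):
    """All (R, M') mod q with R*dt2 + M'*l1 = a1 and R*dt1 - M'*l2 = a2 (mod q)."""
    return [
        (r, m)
        for r in range(q)
        for m in range(q)
        if (r * dt2 + m * l1 - a1) % q == 0 and (r * dt1 - m * l2 - a2) % q == 0
    ]


def predicted(q, dt1, dt2, l1, l2, a1, a2):
    """Solution claimed in the text: R = L1*A2 + L2*A1, M' = A1*Dt1 - A2*Dt2."""
    return ((l1 * a2 + l2 * a1) % q, (a1 * dt1 - a2 * dt2) % q)


# Tuples (Dt1, Dt2, L1, L2) with L1*Dt1 + L2*Dt2 == 1 (illustrative data only).
CASES = [(4, 9, -2, 1), (2, 3, -1, 1), (8, 3, -1, 3), (1, 1, 1, 0)]


def check(q):
    """True if, for every case and every (A1, A2), the system has exactly
    the predicted solution modulo q."""
    for dt1, dt2, l1, l2 in CASES:
        assert l1 * dt1 + l2 * dt2 == 1
        for a1 in range(q):
            for a2 in range(q):
                expected = [predicted(q, dt1, dt2, l1, l2, a1, a2)]
                if solve_system(q, dt1, dt2, l1, l2, a1, a2) != expected:
                    return False
    return True


if __name__ == "__main__":
    print(check(6), check(12))
```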

Therefore, \(\mathcal {T}\) is, up to \(o_{N\rightarrow \infty }(1)\) error, equal to

$$\begin{aligned}&q^{H-N} \sum _{\begin{array}{c} \text {rad}(D_j)|Q \\ \text {deg}(Q[D_1,D_2])< N \end{array}} f(D_1)\overline{f}(D_2)\,\, \mathop {{\mathop {\sum }\nolimits ^{*}}}\limits _{\begin{array}{c} A_1\pmod {Q} \\ A_2 \pmod {Q} \end{array}} \chi (A_1)\overline{\chi }(A_2) \sum _{\begin{array}{c} \text {deg}(M')< H-\text {deg}(D) \\ M' \equiv A_1\widetilde{D}_1-A_2\widetilde{D}_2 \pmod {Q} \end{array}}\\&\quad \quad \sum _{\begin{array}{c} R \in \mathcal {M}_{N-\text {deg}([D_1,D_2])} \\ R \equiv A_1 L_2 + A_2L_1 \pmod {Q} \end{array}} 1\\&\quad = q^{H} \sum _{\begin{array}{c} \text {rad}(D_j)|Q \\ \text {deg}(Q[D_1,D_2])< N \end{array}} \frac{f(D_1)\overline{f}(D_2)}{q^{\text {deg}(Q[D_1,D_2])}}\,\, \mathop {{\mathop {\sum }\nolimits ^{*}}}\limits _{\begin{array}{c} A_1\pmod {Q} \\ A_2 \pmod {Q} \end{array}} \chi (A_1)\overline{\chi }(A_2) \sum _{\begin{array}{c} \text {deg}(M') < H - \text {deg}(D) \\ M' \equiv A_1\widetilde{D}_1-A_2\widetilde{D}_2 \pmod {Q} \end{array}} 1 . \end{aligned}$$

Changing variables as \(D_1 = D\widetilde{D}_1\) and \(D_2 = D\widetilde{D}_2\), and reinstating triples \(D,\widetilde{D}_1,\widetilde{D}_2\) with \(\text {deg}(Q[D_1,D_2]) = \text {deg}(QD\widetilde{D}_1\widetilde{D}_2) \ge N\), this is equal to

$$\begin{aligned}&= q^{H} \sum _{\text {rad}(D)|Q} q^{-\text {deg}(D)} \sum _{\begin{array}{c} \text {rad}(\widetilde{D}_j)|Q \\ (\widetilde{D}_1,\widetilde{D}_2) = 1 \end{array}} \frac{f(\widetilde{D}_1)\overline{f}(\widetilde{D}_2)}{q^{\text {deg}(Q\widetilde{D}_1\widetilde{D}_2)}} \mathop {{\mathop {\sum }\nolimits ^{*}}}\limits _{\begin{array}{c} A_1\pmod {Q} \\ A_2 \pmod {Q} \end{array}} \chi (A_1)\overline{\chi }(A_2) \sum _{\begin{array}{c} \text {deg}(M') < H - \text {deg}(D) \\ M' \equiv A_1\widetilde{D}_1-A_2\widetilde{D}_2 \pmod {Q} \end{array}} 1 \\&+ O(q^{2H-N/3+o_Q(1)}), \end{aligned}$$

the error term being \(o_{N \rightarrow \infty }(1)\) since \(N > 7H\).

Step 2: Decoupling \(\widetilde{D}_1\) and \(\widetilde{D}_2\) via Ramanujan sums. By (26) and the fact that \(\text {rad}(\widetilde{D_j})\mid Q\), inserting additive characters \(\pmod {Q}\) to detect the condition \(M' \equiv A_1 \widetilde{D}_1 - A_2 \widetilde{D}_2 \pmod {Q}\) yields

$$\begin{aligned}&\mathop {{\mathop {\sum }\nolimits ^{*}}}\limits _{A_1,A_2 \pmod {Q}} \chi (A_1)\overline{\chi }(A_2) \sum _{\begin{array}{c} \text {deg}(M')< H-\text {deg}(D) \\ M' \equiv A_1\widetilde{D}_1-A_2\widetilde{D}_2 \pmod {Q} \end{array}} 1 \\&= q^{-\text {deg}(Q)} \sum _{C \pmod {Q}}\,\, \sum _{\text {deg}(M') < H-\text {deg}(D)} e_{\mathbb {F}}\left( \frac{CM'}{Q}\right) \tau (\chi ,-C\widetilde{D}_1) \overline{\tau }(\chi ,-C\widetilde{D}_2). \end{aligned}$$

Write \(Q = Q^{*} Q_S\), where \(Q^{*}\) is the conductor of \(\chi \); by assumption, we have \((Q^{*},Q_S) = 1\) and \(Q_S\) squarefree. By Lemma 4.6, for each \(j = 1,2\) we have

$$\begin{aligned} \tau (\chi ,-C\widetilde{D}_j) = \tau (\chi ^{*}) \phi ((Q_S,C\widetilde{D}_j)) \mu (Q_S/(Q_S,C\widetilde{D}_j)) \chi ^{*}(-Q_S\overline{C\widetilde{D}_j})1_{(C\widetilde{D}_j,Q^{*}) = 1}. \end{aligned}$$

We insert these expressions into the above, using \(|\tau (\chi ^{*})|^2 = q^{\text {deg}(Q^{*})}\). Removing the condition \((\widetilde{D}_1,\widetilde{D}_2) = 1\) by Möbius inversion and splitting the products \(C\widetilde{D}_j\) according to \((C\widetilde{D}_j,Q_S)\), we obtain

$$\begin{aligned} \mathcal {T}&= q^{H-\text {deg}(QQ_S)}\sum _{\text {rad}(D)|Q} q^{-\text {deg}(D)} \sum _{E_1,E_2|Q_S} \phi (E_1)\mu \left( \frac{Q_S}{E_1}\right) \phi (E_2)\mu \left( \frac{Q_S}{E_2}\right) \\&\cdot \sum _{\begin{array}{c} \text {rad}(\widetilde{D}_j)|Q_S \\ (\widetilde{D}_1,\widetilde{D}_2) = 1 \end{array}} \frac{f\overline{\chi ^{*}}(\widetilde{D}_1)\overline{f}\chi ^{*}(\widetilde{D}_2)}{q^{\text {deg}(\widetilde{D}_1\widetilde{D}_2)}}\sum _{\text {deg}(M')< H- \text {deg}(D)}\,\, \sum _{\begin{array}{c} C \pmod {Q} \\ (C,Q^{*}) = 1 \\ E_j = (Q_S,C\widetilde{D}_j), j = 1,2 \end{array}} e_{\mathbb {F}}(CM'/Q) + o_{N \rightarrow \infty }(1) \\&= q^{H-\text {deg}(QQ_S)}\sum _{\text {rad}(D)|Q} q^{-\text {deg}(D)} \sum _{E_j|Q_S} \phi (E_1)\mu \left( \frac{Q_S}{E_1}\right) \phi (E_2)\mu \left( \frac{Q_S}{E_2}\right) \sum _{L|Q_S} \frac{\mu (L)}{q^{2\text {deg}(L)}} \\&\cdot \sum _{\begin{array}{c} \text {rad}(\widetilde{D}_j)|Q_S \end{array}} \frac{f\overline{\chi ^{*}}(\widetilde{D}_1)\overline{f}\chi ^{*}(\widetilde{D}_2)}{q^{\text {deg}(\widetilde{D}_1\widetilde{D}_2)}}\sum _{\text {deg}(M') < H- \text {deg}(D)}\,\, \sum _{\begin{array}{c} C \pmod {Q} \\ (C,Q^{*}) = 1 \\ E_j = (Q_S,CL\widetilde{D}_j), j = 1,2 \end{array}} e_{\mathbb {F}}(CM'/Q) + o_{N \rightarrow \infty }(1). \end{aligned}$$

We next define \(F:= (C,Q_S)\) for each C modulo Q. We also decompose \(\widetilde{D}_j = D_j'D_j''\), where \(\text {rad}(D_j')|[F,L]\) and \(\text {rad}(D_j'')|Q_S/[F,L]\), so that \(E_j = \text {rad}(D_j'')[F,L]\) for each \(j = 1,2\). This leads to the expression

$$\begin{aligned} \mathcal {T}&= q^{H-\text {deg}(QQ_S)} \sum _{\text {rad}(D)|Q}q^{-\text {deg}(D)} \sum _{F | Q_S} \sum _{L|Q_S} \frac{\mu (L)}{q^{2\text {deg}(L)}} \phi ([F,L])^2 \sum _{\begin{array}{c} \text {rad}(D_j')|[F,L] \\ j = 1,2 \end{array}} \frac{f\overline{\chi ^{*}}(D_1')\overline{f}\chi ^{*}(D_2')}{q^{\text {deg}(D_1'D_2')}} \\&\cdot \sum _{\begin{array}{c} \text {rad}(D_j'')|Q_S/[F,L] \\ j = 1,2 \end{array}} \mu \left( \frac{(Q_S/[F,L])}{\text {rad}(D_1'')}\right) \phi (\text {rad}(D_1''))\mu \left( \frac{(Q_S/[F,L])}{\text {rad}(D_2'')}\right) \phi (\text {rad}(D_2''))\frac{f\overline{\chi ^{*}}(D_1'')\overline{f}\chi ^{*}(D_2'')}{q^{\text {deg}(D_1''D_2'')}} \\&\cdot \sum _{\text {deg}(M') < H-\text {deg}(D)}\,\, \sum _{\begin{array}{c} C \pmod {Q} \\ (C,Q^{*}) = 1 \\ F = (C,Q_S) \end{array}} e_{\mathbb {F}}\left( \frac{CM'}{Q^{*}Q_S}\right) . \end{aligned}$$

Replacing C by \(\widetilde{C}:= C/F\) in the innermost sum, and noting that \((C/F,Q/F) = 1\) in that case, it follows from Lemma 4.9 (and \(\text {deg}(Q/F) \ge \text {deg}(Q^{*}) \ge 1\) since \(\chi \) is non-principal) that

$$\begin{aligned} \sum _{\text {deg}(M')< H-\text {deg}(D)}\,\, \sum _{\begin{array}{c} C \pmod {Q} \\ (C,Q^{*}) = 1 \\ F = (C,Q_S) \end{array}} e_{\mathbb {F}}(CM'/Q)&= \sum _{\text {deg}(M')< H-\text {deg}(D)}\,\,\, \mathop {{\mathop {\sum }\nolimits ^{*}}}\limits _{\begin{array}{c} \widetilde{C} \pmod {Q/F} \end{array}}e_{\mathbb {F}}(\widetilde{C}M'/(Q/F)) \\&= \sum _{\text {deg}(M') < H-\text {deg}(D)} c_{Q/F}(M') = \phi (Q/F)1_{\text {deg}(D) \ge H}, \end{aligned}$$

for each \(F|Q_S\). Inserting this into the expression for \(\mathcal {T}\) then gives

$$\begin{aligned} \mathcal {T}&= \left( q^H\sum _{\begin{array}{c} \text {rad}(D)|Q \\ \text {deg}(D) \ge H \end{array}} q^{-\text {deg}(D)}\right) \cdot \frac{\phi (Q^{*})}{q^{\text {deg}(QQ_S)}} \sum _{F|Q_S} \phi \left( \frac{Q_S}{F}\right) \sum _{L|Q_S} \frac{\mu (L)\phi ([F,L])^2 }{q^{2\text {deg}(L)}}\\&\cdot \sum _{\begin{array}{c} \text {rad}(D_j')|[F,L] \\ j =1,2 \end{array}} \frac{f\overline{\chi ^{*}}(D_1')\overline{f}\chi ^{*}(D_2')}{q^{\text {deg}(D_1'D_2')}} \sum _{\begin{array}{c} \text {rad}(D_j'')|\frac{Q_S}{[F,L]} \\ j = 1,2 \end{array}} \prod _{j=1}^2 \mu \left( \frac{\frac{Q_S}{[F,L]}}{\text {rad}(D_j'')}\right) \phi (\text {rad}(D_j''))\cdot \frac{f\overline{\chi ^{*}}(D_1'')\overline{f}\chi ^{*}(D_2'')}{q^{\text {deg}(D_1''D_2'')}}\\&+ o_{N \rightarrow \infty }(1). \end{aligned}$$

Step 3: Concluding the proof. Finally, we make one last change of variable \(G:= [F,L]\). For each \(G|Q_S\), we have (using the squarefreeness of \(Q_S\) repeatedly)

$$\begin{aligned} \sum _{\begin{array}{c} F,L|Q_S \\ {[}F,L] = G \end{array}} \phi (Q_S/F) \frac{\mu (L)}{q^{2\text {deg}(L)}}&= \sum _{R|G} \frac{\mu (R)}{q^{2\text {deg}(R)}} \sum _{L'F' = G/R} \phi \left( \frac{Q_S/R}{F'}\right) \frac{\mu (L')}{q^{2\text {deg}(L')}} \\&= \phi \left( \frac{Q_S}{G}\right) \sum _{R|G} \frac{\mu (R)}{q^{2\text {deg}(R)}} \sum _{F'L' = G/R} \phi \left( \frac{G/R}{F'}\right) \frac{\mu (L')}{q^{2\text {deg}(L')}}\\&= \phi \left( \frac{Q_S}{G}\right) \sum _{R|G} \frac{\mu (R)}{q^{2\text {deg}(R)}} \sum _{L'|G/R} \frac{\phi (L')\mu (L')}{q^{2\text {deg}(L')}} \\&= \phi \left( \frac{Q_S}{G}\right) \sum _{R|G} \frac{\mu (R)}{q^{2\text {deg}(R)}} \prod _{P|G/R} \left( 1{-}q^{-\text {deg}(P)}(1{-}q^{-\text {deg}(P)})\right) \\&= \phi \left( \frac{Q_S}{G}\right) \prod _{P|G} \left( 1-q^{-\text {deg}(P)}\right) \\&= q^{-\text {deg}(G)}\phi (Q_S). \end{aligned}$$

Applying the change of variables and inserting the above identity into the previous expression for \(\mathcal {T}\), we obtain

$$\begin{aligned} \mathcal {T}&= \left( q^H \sum _{\begin{array}{c} \text {rad}(D)|Q \\ \text {deg}(D) \ge H \end{array}} q^{-\text {deg}(D)}\right) \frac{\phi (Q^{*})\phi (Q_S)}{q^{\text {deg}(QQ_S)}} \sum _{G|Q_S} \frac{\phi (G)^2}{q^{\text {deg}(G)}} \sum _{\begin{array}{c} \text {rad}(D_j')|G \\ j = 1,2 \end{array}} \frac{f\overline{\chi ^{*}}(D_1')\overline{f}\chi ^{*}(D_2')}{q^{\text {deg}(D_1'D_2')}} \\&\cdot \sum _{\begin{array}{c} \text {rad}(D_j'')|Q_S/G \\ j = 1,2 \end{array}} \mu \left( \frac{Q_S/G}{\text {rad}(D_1'')}\right) \phi (\text {rad}(D_1''))\mu \left( \frac{Q_S/G}{\text {rad}(D_2'')}\right) \phi (\text {rad}(D_2''))\frac{f\overline{\chi ^{*}}(D_1'')\overline{f}\chi ^{*}(D_2'')}{q^{\text {deg}(D_1''D_2'')}} + o_{N \rightarrow \infty }(1)\\&= \frac{\phi (Q)}{q^{\text {deg}(QQ_S)}}\sum _{G|Q_S} \frac{\phi (G)^2}{q^{\text {deg}(G)}}\\&\left| \sum _{\text {rad}(D')|G} \frac{f\overline{\chi ^{*}}(D')}{q^{\text {deg}(D')}}\right| ^2 \left| \sum _{\text {rad}(D'')|Q_S/G} \mu \left( \frac{Q_S/G}{\text {rad}(D'')}\right) \phi (\text {rad}(D'')) \frac{f\overline{\chi ^{*}}(D'')}{q^{\text {deg}(D'')}}\right| ^2 \\&\cdot \left( q^{H} \sum _{\begin{array}{c} \text {rad}(D)|Q \\ \text {deg}(D) \ge H \end{array}} q^{-\text {deg}(D)}\right) + o_{N \rightarrow \infty }(1) \\&\ge \frac{\phi (Q)}{q^{\text {deg}(Q)}} \left( \frac{\phi (Q_S)}{q^{\text {deg}(Q_S)}}\right) ^2\prod _{P|Q_S}\left| 1-f\overline{\chi ^{*}}(P)q^{-\text {deg}(P)}\right| ^{-2} \cdot \left( q^H\sum _{\begin{array}{c} \text {rad}(D) |Q \\ \text {deg}(D) \ge H \end{array}} q^{-\text {deg}(D)}\right) + o_{N \rightarrow \infty }(1), \end{aligned}$$

where in the last step we used positivity to bound the sum over G from below by the term at \(G = Q_S\), and the factorization

$$\begin{aligned} \sum _{\text {rad}(D')|Q_S} \frac{f\overline{\chi ^{*}}(D')}{q^{\text {deg}(D')}} = \prod _{P|Q_S} \left( 1-f\overline{\chi ^{*}}(P)q^{-\text {deg}(P)}\right) ^{-1}. \end{aligned}$$

This completes the proof. \(\square \)
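The divisor-sum identity from Step 3, namely that the sum over \(F,L|Q_S\) with \([F,L]=G\) of \(\phi (Q_S/F)\mu (L)q^{-2\text {deg}(L)}\) equals \(q^{-\text {deg}(G)}\phi (Q_S)\), depends only on the norms \(q^{\text {deg}(P)}\) of the primes dividing \(Q_S\). The sketch below checks it in exact rational arithmetic; encoding a squarefree divisor as a set of indices into a tuple of prime norms is an assumption of this illustration, and the sample norms are those of primes of degrees 1, 1, 2, 3 over \(\mathbb {F}_2\).

```python
from fractions import Fraction
from itertools import combinations
from math import prod


def subsets(s):
    """Yield all subsets of s as frozensets (these model the divisors of s)."""
    items = tuple(s)
    for size in range(len(items) + 1):
        for c in combinations(items, size):
            yield frozenset(c)


def phi(divisor, norms):
    """Euler phi of a squarefree divisor: product of (norm - 1)."""
    return prod(norms[i] - 1 for i in divisor)


def norm(divisor, norms):
    """q^deg of a squarefree divisor: product of the prime norms."""
    return prod(norms[i] for i in divisor)


def lhs(G, QS, norms):
    """Sum over F, L | Q_S with lcm [F, L] = G of phi(Q_S/F) mu(L) / norm(L)^2."""
    total = Fraction(0)
    for F in subsets(G):
        for L in subsets(G):
            if F | L == G:  # set union models the least common multiple
                mu = (-1) ** len(L)
                total += Fraction(mu * phi(QS - F, norms), norm(L, norms) ** 2)
    return total


def rhs(G, QS, norms):
    """phi(Q_S) / norm(G)."""
    return Fraction(phi(QS, norms), norm(G, norms))


if __name__ == "__main__":
    norms = (2, 2, 4, 8)
    QS = frozenset(range(len(norms)))
    assert all(lhs(G, QS, norms) == rhs(G, QS, norms) for G in subsets(QS))
    print("identity holds for every G | Q_S")
```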

4.3.3 Modified characters to prime power modulus

Proof of Theorem 1.3

(\(\Rightarrow \)) Suppose \(f: \mathcal {M} \rightarrow S^1\) is a completely multiplicative function for which \(\mathcal {S}_f < \infty \). By Proposition 4.1, there is a primitive Dirichlet character \(\chi \) modulo \(Q'\), a primitive short interval character \(\xi \) of length \(\nu \ge 0\) and \(\theta \in [0,1]\) such that \(\mathbb {D}(f,\chi \xi e_{\theta };\infty ) < \infty \).

We start with the case \(Q'=1\). Let N be large and \(1\le H\le N-\nu -1\). Set \(f_1(G):= fe_{-\theta }\overline{\xi }(G)\) for each \(G \in \mathcal {M}\), so that \(\mathbb {D}(f_1,1;\infty ) < \infty \). Further, note that \(\xi e_{\theta }\) is constant on intervals \(I_H(G_0)\) for all \(G_0 \in \mathcal {M}_N\). We thus obtain

$$\begin{aligned}\begin{aligned} \max _{G_0 \in \mathcal {M}_N} \left| \sum _{\text {deg}(M)< H} f(G_0+M)\right|&= \max _{G_0 \in \mathcal {M}_N} \left| \sum _{\text {deg}(M)< H} f_1(G_0+M)\right| \\ {}&\ge q^{-N} \sum _{G_0 \in \mathcal {M}_N}\left| \sum _{\text {deg}(M)< H} f_1(G_0+M)\right| \\&\ge q^{-N} \left| \sum _{G_0 \in \mathcal {M}_N}\sum _{\text {deg}(M) < H} f_1(G_0+M)\right| \\&=q^{H-N}\left| \sum _{G \in \mathcal {M}_N} f_1(G)\right| , \end{aligned} \end{aligned}$$

where we used the triangle inequality and the fact that \(M + \mathcal {M}_N = \mathcal {M}_N\) for all \(\text {deg}(M) < H\). We now apply Delange’s theorem in function fields (see [9, Theorem 1.4.1]) to \(f_1\), which gives that

$$\begin{aligned} q^{H-N}\left| \sum _{G \in \mathcal {M}_N} f_1(G)\right| = (c+o_{N \rightarrow \infty }(1))q^H, \end{aligned}$$

where, since \(f_1\) is 1-pretentious, we have

$$\begin{aligned} c=\prod _{P\in \mathcal {P}}(1-q^{-\text {deg}(P)})(1-f_1(P)q^{-\text {deg}(P)})^{-1}\ne 0. \end{aligned}$$

Since \(q^H\) is unbounded in H, it follows directly that \(\mathcal {S}_f = \infty \), a contradiction.

We are left with the case \(Q'\ne 1\), so \(\text {deg}(Q')\ge 1\). We apply Proposition 4.2 to f to deduce that

$$\begin{aligned} S:= \{P: f(P) \ne \chi (P)\xi (P)e_{\theta }(P)\} \end{aligned}$$

is finite. Put \(Q:= [Q',\prod _{P \in S} P] = Q'Q''\), where \(Q''\) is squarefree and coprime to \(Q'\).

Then \(f\overline{\xi }e_{-\theta }\) is a modified character modulo Q (as per Remark 4.4), and \(\chi \) is non-principal with conductor \(Q'\). By Proposition 4.7 (applied with T being a large constant, so that H is small compared to N and \(\xi e_{\theta }\) is constant on \(I_H(G_0)\) for any \(G_0 \in \mathcal {M}_N\)) we find that \(\omega (Q) = 1\). Thus, \(Q = Q'\) and \(\omega (Q') = 1\), so \(\chi \) is a primitive Dirichlet character modulo a prime power.

To conclude, we thus have \(f(G) = \widetilde{\chi }(G) \xi (G) e_{\theta }(G)\) for all G, where \(\chi \) is a primitive Dirichlet character modulo \(Q = P^r\) for some \(r \ge 1\) and some prime P, \(\widetilde{\chi }\) is a modified character corresponding to \(\chi \), and \(\xi \) has bounded length.

(\(\Leftarrow \)) Conversely, let \(f(G) = \widetilde{\chi }(G) \xi (G) e_{\theta }(G)\) for all G, where \(\chi \) is a primitive Dirichlet character modulo \(Q = P^r\) for some \(r \ge 1\) and some prime P, \(\widetilde{\chi }\) is a modified character corresponding to \(\chi \), and \(\xi \) has length \(\nu \). Denote \(f_1(G):=\widetilde{\chi }(G)\). As we noted before, \(\xi e_{\theta }\) is constant on \(I_H(G_0)\) for any \(G_0 \in \mathcal {M}_N\) and \(H\le N-\nu -1\), so \(\mathcal {S}_{f}=\mathcal {S}_{\widetilde{\chi }}\). Thus it suffices to show that \(\mathcal {S}_{\widetilde{\chi }}<\infty \).

Let \(H \ge 1\) and suppose \(N \ge H\). For any \(G_0 \in \mathcal {M}_{N}\),

$$\begin{aligned} \sum _{G \in I_H(G_0)} \widetilde{\chi }(G)&= \sum _{k \ge 0} f(P)^{k} \sum _{\text {deg}(M)< H} \chi ((G_0-M)/P^k) \nonumber \\&= \mathop {{\mathop {\sum }\nolimits ^{*}}}\limits _{A \pmod {P^r}} \chi (A) \sum _{k \ge 0} f(P)^k \sum _{\begin{array}{c} \text {deg}(M) < H \\ M \equiv G_0 \pmod {P^k} \\ (G_0-M)/P^k \equiv A \pmod {P^r} \end{array}} 1. \end{aligned}$$
(29)

Consider first the contribution from \(k<H/\text {deg}(P)-r\). Making the change of variables \(M=B_k(G_0)+P^kM'\) in the inner sum over M, where \(B_k(G_0)\) is the residue class of \(G_0\) mod \(P^k\), we see that

$$\begin{aligned} \sum _{\begin{array}{c} \text {deg}(M)< H \\ M \equiv G_0 \pmod {P^k} \\ (G_0-M)/P^k \equiv A \pmod {P^r} \end{array}} 1&= \sum _{\begin{array}{c} \text {deg}(M')<H-k\text {deg}(P)\\ M'\equiv A\pmod {P^r} \end{array}}1=\sum _{\begin{array}{c} \text {deg}(M')<H-k\text {deg}(P)\\ M'\equiv 0\pmod {P^r} \end{array}}1, \end{aligned}$$

which is independent of A. Thus, by orthogonality these values of k contribute nothing to (29).

In the range \(k > \frac{H}{\text {deg}(P)}\), there is at most one polynomial M that contributes for at most one such value of k (and in this case, M must represent the projection of \(G_0\) to \(\text {span}_{\mathbb {F}_q}\{1,\ldots ,t^H\}\)). This results in an O(1) term.

It follows that

$$\begin{aligned} \sum _{G \in I_H(G_0)} \widetilde{\chi }(G) = \sum _{H/\text {deg}(P)-r \le k \le H/\text {deg}(P)} f(P)^k \sum _{\begin{array}{c} \text {deg}(M) < H \\ M \equiv G_0 \pmod {P^k} \end{array}} \chi ((G_0-M)/P^k) + O_{r}(1), \end{aligned}$$

and estimating each term by the triangle inequality this is \(\ll 1\), uniformly over H. It follows that

$$\begin{aligned} \limsup _{N \rightarrow \infty } \max _{G_0 \in \mathcal {M}_N} \left| \sum _{G \in I_H(G_0)} \widetilde{\chi }(G)\right| \ll 1 \end{aligned}$$

uniformly over H, and hence \(\mathcal {S}_{\widetilde{\chi }} < \infty \), as claimed. \(\square \)

Proof of Corollary 1.2

The proof of Corollary 1.2 is identical to that of Theorem 1.3, save that by the second conclusion in Proposition 4.1 we may assume that \(\xi \) is quadratic for general q and trivial if q is odd, while \(\chi \) is real and \(\theta \in \{0,1/2\}\). \(\square \)

5 The lexicographic discrepancy

We fix once and for all a lexicographic ordering \(\langle \cdot \rangle \) of \(\mathbb {F}_q[t]\) (recalling the necessary property that \(\langle 0 \rangle = 0\)). Suppose \(f: \mathcal {M} \rightarrow S^1\) is a completely multiplicative function, such that

$$\begin{aligned} \sup _{N \ge 1} \left| \sum _{\begin{array}{c} G \in \mathcal {M} \\ \langle G \rangle< N \end{array}} f(G)\right| < \infty . \end{aligned}$$

We remark that on taking \(N = q^n\) for \(n \ge 1\), this shows that \(\mathcal {D}_g < \infty \). Taking \(N=\langle G_0\rangle \) for any \(G_0\in \mathcal {M}_n\), we see that

$$\begin{aligned} \sup _{n \ge 1} \sup _{G_0 \in \mathcal {M}_{n}} \left| \sum _{\begin{array}{c} G \in \mathcal {M} \\ \langle G \rangle \le \langle G_0 \rangle \end{array}} f(G)\right| < \infty . \end{aligned}$$

By the triangle inequality, it also follows that for any \(h\ge 1\),

$$\begin{aligned} \sup _{n \ge 1} \sup _{G_0 \in \mathcal {M}_n}\left| \sum _{\begin{array}{c} G \in \mathcal {M} \\ \langle G_0\rangle \le \langle G \rangle< \langle G_0 \rangle +q^h \end{array}} f(G) \right| < \infty . \end{aligned}$$

But, as pointed out in Sect. 2, the sum of f over the short interval \(I_h(G_0)\) coincides with the sum inside the absolute value whenever \(n \ge h\) and \(t^{h-1}|G_0\). Thus, we deduce that \(\mathcal {S}_f < \infty \). By Theorem 1.3, we may conclude that \(f = \widetilde{\chi }_{\alpha } \xi e_{\theta }\), where \(\xi \) is a short interval character of bounded length, \(\theta \in [0,1]\) and \(\widetilde{\chi }_{\alpha }\) is a primitive modified character with prime power modulus \(P^r\), such that

$$\begin{aligned} \widetilde{\chi }_{\alpha }(P) = e(\alpha ), \end{aligned}$$
(30)

for some \(\alpha \in \mathbb {R}/\mathbb {Z}\). We will use this notation in the sequel.

We have thus reduced our task to showing the following. In the sequel we write \(\widetilde{\chi } = \widetilde{\chi }_{\alpha }\) for ease of notation.

Proposition 5.1

Let \(g:\mathcal {M}\rightarrow S^1\) be of the form \(g=\widetilde{\chi }\xi e_{\theta }\) with \(\widetilde{\chi }\) a primitive modified character associated to a prime power modulus, \(\xi \) a short interval character, and \(\theta \in \mathbb {R}\). Then we have

$$\begin{aligned} \sup _{N\ge 1}\left| \sum _{\begin{array}{c} G\in \mathcal {M}\\ \langle G\rangle < N \end{array}}g(G)\right| =\infty . \end{aligned}$$
(31)

Assume for the sake of contradiction that (31) fails. Before proceeding to the proof of Proposition 5.1 let us make some observations.

Firstly, the function g may be extended naturally to all of \(\mathbb {F}_q[t]\) by the formula

$$\begin{aligned} g(G)=g(P)^{v_P(G)}\chi (G/P^{v_P(G)})\xi (G)e(\theta \text {deg}(G)), \end{aligned}$$

since \(\xi \) and \(\chi \) are both defined on all of \(\mathbb {F}_q[t]\).

Secondly, we may assume that \(\sum _{G\in \mathcal {M}_{n}}g(G)\) is bounded, as otherwise

$$\begin{aligned} \sum _{\begin{array}{c} G\in \mathcal {M}\\ \langle G\rangle< q^{n+1} \end{array}}g(G)-\sum _{\begin{array}{c} G\in \mathcal {M}\\ \langle G\rangle < q^{n} \end{array}}g(G)=\sum _{G\in \mathcal {M}_{n}}g(G) \end{aligned}$$

is unbounded, implying that the claim (31) holds.

The next lemma will allow us to study more precisely the behaviour of long interval sums of modified characters, which will be crucial in the proof of Proposition 5.1.

Lemma 5.2

Let \(f: \mathcal {M} \rightarrow S^1\) be a fixed completely multiplicative function. Suppose there exist \(\theta \in [0,1]\), a short interval character \(\xi \) of length \(\nu \ge 0\), and a non-principal Dirichlet character \(\chi \) modulo \(P^r\), where \(P\in \mathcal {P}\) and \(r \ge 1\), such that \(f(P') = \chi (P')\xi (P')e_{\theta }(P')\) for all \(P' \ne P\).

  (1) For any \(H \ge 1\),

    $$\begin{aligned} \sum _{M \in \mathcal {M}_{< H}} f(M) = \sum _{M' \in \mathcal {M}_{< \nu + r\text {deg}(P)}} \chi \xi (M')e_{\theta }(M') \sum _{0 \le k < (H-\text {deg}(M'))/\text {deg}(P)} f(P)^k. \end{aligned}$$

  (2) If \(f(P)\) is a dth root of unity with \(d\ge 1\), then \(H \mapsto \sum _{M \in \mathcal {M}_{< H}} f(M)\) is \(d\cdot \text {deg}(P)\)-periodic.

Proof

(1) We have

$$\begin{aligned} \sum _{M \in \mathcal {M}_{< H}} f(M)&= \sum _{m< H} \sum _{M \in \mathcal {M}_m} f(M) \\&= \sum _{m< H} e(\theta m)\sum _{0 \le k \le m/\text {deg}(P)} (fe_{-\theta })(P)^k \sum _{\begin{array}{c} M \in \mathcal {M}_m \\ M \equiv 0 \pmod {P^k} \end{array}} \chi \xi (M/P^k) \\&= \sum _{m < H} e(\theta m) \sum _{0 \le k \le m/\text {deg}(P)} (fe_{-\theta })(P)^k \sum _{M' \in \mathcal {M}_{m-k\text {deg}(P)}} \chi \xi (M'). \end{aligned}$$

Swapping the order of summation, this equals

$$\begin{aligned}&\sum _{0 \le k< H/\text {deg}(P)} f(P)^k \sum _{k \text {deg}(P) \le m< H} e(\theta (m-k\text {deg}(P))) \sum _{M' \in \mathcal {M}_{m-k\text {deg}(P)}} \chi \xi (M') \\&\quad = \sum _{0 \le k< H/\text {deg}(P)} f(P)^k \sum _{0 \le j< H-k\text {deg}(P)} e(j\theta ) \sum _{M' \in \mathcal {M}_j}\chi \xi (M') \\&\quad = \sum _{0 \le j< H} e(j\theta ) \left( \sum _{M' \in \mathcal {M}_j} \chi \xi (M')\right) \sum _{0 \le k < (H-j)/\text {deg}(P)} f(P)^k. \end{aligned}$$

If \(j \ge \nu + r\text {deg}(P)\) then the sum over \(M'\) is 0, as is seen by partitioning \(\mathcal {M}_j\) into short intervals of the form \(I_{j-\nu }(G)\) and using the orthogonality of Dirichlet characters. Thus, the above simplifies to

$$\begin{aligned}&\sum _{0 \le j< \nu + r\text {deg}(P)} e(j\theta ) \sum _{M' \in \mathcal {M}_j} \chi \xi (M') \sum _{0 \le k< (H-j)/\text {deg}(P)} f(P)^k\nonumber \\&\quad = \sum _{M' \in \mathcal {M}_{< \nu + r\text {deg}(P)}} \chi \xi (M')e_{\theta }(M') \sum _{0 \le k < (H-\text {deg}(M'))/\text {deg}(P)} f(P)^k. \end{aligned}$$
(32)

This proves the first claim.

(2) If \(f(P)\ne 1\), this follows immediately from (1), since \(F(n):=\sum _{0\le k\le n}f(P)^k\) is d-periodic, because \(\sum _{0\le k<d}f(P)^k=(f(P)^d-1)/(f(P)-1)=0\). If instead \(f(P)=1\), then the claim follows by noting that \(F(n+1)=F(n)+1\) and using (1) and the orthogonality relations for \(\chi \xi \) [3, Exercise 5.1.2]. \(\square \)
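The periodicity of the partial geometric sums \(F(n)\) used in part (2) is easy to observe numerically. The following sketch uses floating-point complex arithmetic with a tolerance, which is an assumption of the illustration; `root_of_unity` and `is_d_periodic` are hypothetical helper names.

```python
import cmath


def F(z, n):
    """Partial geometric sum F(n) = sum_{0 <= k <= n} z^k."""
    return sum(z ** k for k in range(n + 1))


def root_of_unity(a, d):
    """The d-th root of unity e(a/d) = exp(2*pi*i*a/d)."""
    return cmath.exp(2 * cmath.pi * 1j * a / d)


def is_d_periodic(z, d, n_max=30, tol=1e-9):
    """Check F(n + d) == F(n) for 0 <= n < n_max, up to tolerance tol."""
    return all(abs(F(z, n + d) - F(z, n)) < tol for n in range(n_max))


if __name__ == "__main__":
    # Every d-th root of unity other than 1 gives a d-periodic F.
    for d in range(2, 9):
        for a in range(1, d):
            assert is_d_periodic(root_of_unity(a, d), d)
    # For z = 1 the sums grow linearly instead: F(n + 1) = F(n) + 1.
    assert not is_d_periodic(1, 5)
    print("F is d-periodic exactly when the d-th root of unity differs from 1")
```

The case \(f(P)=1\) therefore needs the separate argument given in the proof, where the cancellation comes from the character sums rather than from \(F\).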

Let us now introduce some notation. Denote the partial sums of a function \(f: \mathbb {F}_q[t] \rightarrow \mathbb {C}\) in the lexicographic ordering over monic and non-monic polynomials by

$$\begin{aligned} S_N^{\mathcal {M}}(f):=\sum _{\begin{array}{c} G\in \mathcal {M}\\ \langle G\rangle< N \end{array}}f(G) \quad \text {and}\quad S_N(f):=\sum _{\begin{array}{c} G\in \mathbb {F}_q[t]\\ \langle G\rangle < N \end{array}}f(G). \end{aligned}$$

Similarly denote the partial sums arranged according to degree over monic and non-monic polynomials by

$$\begin{aligned} \Sigma _{N}^{\mathcal {M}}(f):=\sum _{\begin{array}{c} G\in \mathcal {M}\\ \text {deg}(G)< N \end{array}}f(G) \quad \text {and}\quad \Sigma _{N}(f):=\sum _{\begin{array}{c} G\in \mathbb {F}_q[t]\\ \text {deg}(G)< N \end{array}}f(G). \end{aligned}$$

We can express the sum \(\Sigma _{n}(\widetilde{\chi })\) in terms of the corresponding monic sum \(\Sigma _{n}^{\mathcal {M}}(\widetilde{\chi })\) as follows. Since every non-zero polynomial in \(\mathbb {F}_q[t]\) can be uniquely written as cG where \(G\in \mathcal {M}\) and \(c\in \mathbb {F}_q^{\times }\), for all \(n\ge 1\) we have

$$\begin{aligned} \Sigma _{n}(\widetilde{\chi })= \Sigma _{n}^{\mathcal {M}}(\widetilde{\chi })\sum _{c\in \mathbb {F}_q^{\times }}\widetilde{\chi }(c). \end{aligned}$$
(33)

If \(\zeta \) is any generator of \(\mathbb {F}_q^{\times }\), then

$$\begin{aligned} \sum _{c\in \mathbb {F}_q^{\times }}\widetilde{\chi }(c)=\sum _{0\le j\le q-2}\widetilde{\chi }(\zeta )^j=(q-1)1_{\widetilde{\chi }(\zeta )=1}=:c_q, \end{aligned}$$
(34)

where we used the fact that \(x^{q-1}=1\) for all \(x\in \mathbb {F}_q^{\times }\).
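Formula (34) is the orthogonality relation for a multiplicative character of the cyclic group \(\mathbb {F}_q^{\times }\). As a hedged illustration, the sketch below checks it for prime q = p only, where \(\mathbb {F}_p^{\times }\) can be modelled by the integers modulo p; the characters \(c \mapsto e(a\log _{\zeta }(c)/(p-1))\) and the helper names are assumptions of this example.

```python
import cmath


def discrete_logs(p):
    """Return (g, log) with g a generator of F_p^x and log[c] = j when c = g^j."""
    for g in range(1, p):
        log = {pow(g, j, p): j for j in range(p - 1)}
        if len(log) == p - 1:  # g hits every nonzero residue, so it generates
            return g, log
    raise ValueError("p must be prime")


def character_sum(p, a):
    """Sum over c in F_p^x of psi_a(c), where psi_a(g^j) = e(a*j/(p-1))."""
    _g, log = discrete_logs(p)
    return sum(
        cmath.exp(2 * cmath.pi * 1j * a * log[c] / (p - 1)) for c in range(1, p)
    )


if __name__ == "__main__":
    for p in (3, 5, 7, 11):
        for a in range(p - 1):
            # (34): the sum is p - 1 if the character is trivial, else 0.
            expected = p - 1 if a == 0 else 0
            assert abs(character_sum(p, a) - expected) < 1e-9
    print("character sums over F_p^x match (q - 1) * 1_{trivial}")
```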

Our proof of Proposition 5.1 distinguishes the case \(P = t\) from \(P \ne t\). For the case \(P=t\) we begin with the following lemma.

Lemma 5.3

Suppose \(g = \xi e_{\theta } \widetilde{\chi }\), where \(\widetilde{\chi }\) is a modified non-principal character modulo \(t^r\), such that \(S_N^{\mathcal {M}}(g) = O(1)\) uniformly over all \(N \ge 1\). Then \(\widetilde{\chi }(t) = 1\).

Proof

We observe that if \(N,m \ge 1\) and \(M < q^m\) then, subject to \(M \equiv N \equiv 0 \pmod {q^r}\) we have

$$\begin{aligned} S_{q^mN + M}(\widetilde{\chi }) = \widetilde{\chi }(t)^{m}S_N(\widetilde{\chi }) + S_M(\widetilde{\chi }). \end{aligned}$$
(35)

To see this, we first decompose

$$\begin{aligned} S_{q^mN + M}(\widetilde{\chi }) = \sum _{j \ge 0} \widetilde{\chi }(t)^j \sum _{\left\langle G' \right\rangle \le q^{m-j}N + M/q^j} \chi (G'). \end{aligned}$$

Next, we remark that if \(a \ge 0\) is such that \(q^a \le A < q^{a+1}\) then

$$\begin{aligned} \sum _{\left\langle G \right\rangle \le q^r A} \chi (G)&= \sum _{\text {deg}(G) < a + r} \chi (G) + \sum _{q^{r+a} \le \langle G \rangle \le q^r A} \chi (G) = \sum _{\left\langle M \right\rangle \le (A-q^a)q^r} \chi (t^{r+a}+ M)\\&= \sum _{\left\langle M \right\rangle \le (A-q^a)q^r} \chi (M), \end{aligned}$$

and so by induction we obtain, for each \(j \ge 0\),

$$\begin{aligned} \sum _{\left\langle G' \right\rangle \le q^{m-j}N + M/q^j} \chi (G') = \sum _{\left\langle G \right\rangle \le R_j(M,N)} \chi (G), \end{aligned}$$

where \(R_j(M,N) \in \{0,1,\ldots ,q^r-1\}\) satisfies \(R_j(M,N) \equiv \left\lfloor \frac{q^{m} N + M}{q^j} \right\rfloor \pmod {q^r}\). Now, if \(m > j\) and \(q^r|N\) we have \(R_j(M,N) \equiv \left\lfloor M/q^j \right\rfloor \pmod {q^r}\), and so

$$\begin{aligned} \sum _{0 \le j < m} \widetilde{\chi }(t)^j \sum _{\left\langle G \right\rangle \le M/q^j} \chi (G) = \sum _{\left\langle G \right\rangle \le M} \widetilde{\chi }(G) = S_M(\widetilde{\chi }). \end{aligned}$$

Next, suppose \(j \ge m\). In this case, \(M/q^j < 1\) and \(\left\lfloor \frac{q^mN + M}{q^j} \right\rfloor = \left\lfloor q^{m-j}N\right\rfloor \), since if the floor were one larger this would mean that

$$\begin{aligned} 1> \{N/q^{j-m}\}> 1 - \frac{M}{q^j} > 1-1/q^{j-m}, \end{aligned}$$

which is impossible. Thus, we have

$$\begin{aligned} \sum _{j \ge m} \widetilde{\chi }(t)^j \sum _{\left\langle G' \right\rangle \le q^{m-j} N + M/q^j} \chi (G')&= \sum _{j \ge m} \widetilde{\chi }(t)^j \sum _{\left\langle G \right\rangle \le q^{m-j}N} \chi (G) = \widetilde{\chi }(t)^m \sum _{l \ge 0} \sum _{\begin{array}{c} \left\langle G\right\rangle \le N \\ t^l || G \end{array}} \widetilde{\chi }(G)\\&= \widetilde{\chi }(t)^m S_N(\widetilde{\chi }), \end{aligned}$$

and (35) follows.

Now, we iterate (35) as follows. Assume there is \(A \equiv 0 \pmod {q^r}\) such that \(S_A(\widetilde{\chi }) \ne 0\), and let \(K \ge \nu + 1\) be chosen so that \(q^K > A\) (with \(\nu \) the length of \(\xi \)). For \(J \ge 1,\) let \(\{m_j\}_{j \le J}\) be an increasing sequence of integers for which \(|e(m_jK\alpha )-1| < 1/100\) for each j. Setting \(B:= A(1 + q^{m_1K} + \cdots + q^{m_JK})\), we obtain

$$\begin{aligned} S_B(\widetilde{\chi }) = S_A(\widetilde{\chi }) + e(m_1K\alpha ) S_{(B-A)/q^{m_1K}}(\widetilde{\chi }) = S_A(\widetilde{\chi })\left( 1 + e(m_1K\alpha ) + \ldots + e(m_JK\alpha )\right) . \end{aligned}$$

It follows that if \(S_A(\widetilde{\chi }) \ne 0\) then \(|S_B(\widetilde{\chi })| \gg J\). We then have

$$\begin{aligned} \sum _{\begin{array}{c} \left\langle G \right\rangle< B \\ G \in \mathcal {M} \end{array}} g(G) = \sum _{\begin{array}{c} \left\langle G \right\rangle < q^{m_JK} \\ G \in \mathcal {M} \end{array}} g(G) + e_{\theta }\xi (t)^{m_J} \left( S_B(\widetilde{\chi }) - S_{q^{m_JK}}(\widetilde{\chi })\right) , \end{aligned}$$

and as the left-most two terms are both bounded we obtain that \(|S_{q^{m_JK}}(\widetilde{\chi })| \gg J\). But as \(m_JK >r\) can be assured when J is sufficiently large, (35) (with \(N = q^r\) and \(M = 0\)) implies that \(|S_{q^r}(\widetilde{\chi })| \gg J\), which is an obvious contradiction as \(J \rightarrow \infty \).

Suppose instead that \(S_{q^r N}(\widetilde{\chi }) = 0\) for all \(N \ge 1\). In this case,

$$\begin{aligned} S_{q^r(N+1)}(\widetilde{\chi }) - S_{q^rN}(\widetilde{\chi }) = 0 \end{aligned}$$

for all \(N\ge 1\). Specializing \(N_1 = q^{M_1}\) and \(N_2 = q^{M_2}\), where \(M_1,M_2 \ge r\), we obtain in both cases that

$$\begin{aligned} 0&= S_{q^r(N_j+1)}(\widetilde{\chi }) - S_{q^rN_j}(\widetilde{\chi })=\sum _{\left\langle G \right\rangle < q^r} \widetilde{\chi }(t^{M_j + r} + G) \\&= \widetilde{\chi }(t)^{M_j+r} + \sum _{0 \le l< r} \widetilde{\chi }(t)^l \sum _{\left\langle G \right\rangle < q^{r-l}} \chi (G), \end{aligned}$$

the double sum on the right-hand side being independent of \(j = 1,2\). It follows from this that \(\widetilde{\chi }(t)^{M_1} = \widetilde{\chi }(t)^{M_2}\), so choosing e.g., \(M_2 = M_1 + 1\) yields the claim \(\widetilde{\chi }(t)=1\) in this case. \(\square \)

Proof of Proposition 5.1 when \(P = t\)

Let \(\{n_j\}_{1 \le j \le k}\) be an increasing sequence of integers satisfying \(n_{j+1} > n_{j} + \nu + r\) for each \(1 \le j \le k-1\). Define

$$\begin{aligned} N_j:= \langle 1\rangle \sum _{1 \le i \le j} q^{n_i} \end{aligned}$$

for each \(1 \le j \le k\); since \(\langle \cdot \rangle \) is a bijection on \(\mathbb {F}_q\) and \(\langle 0 \rangle = 0\) we note that \(N_j > 0\) for each \(j \ge 1\). Note also that if G is monic and satisfies \(\langle G \rangle < N_j\) then either \(\text {deg}(G) < n_j\) or else \(G = t^{n_j} + M\), where \(\langle M \rangle < N_{j-1}\). In the latter case, since \(n_j > n_{j-1} + \nu \) we have \(\xi e_{\theta }(t^{n_j}+M) = \xi e_{\theta }(t)^{n_j}\) whenever \(\langle M \rangle < N_{j-1}\). Furthermore, if \(G \ne 0\) then \(\nu _t(t^{n_j} + G) = \nu _t(G)\) and thus by our choice of \(n_j\) we have \(\widetilde{\chi }(t^{n_j} + G) = \widetilde{\chi }(G)\). Lemma 5.3 shows that we may assume \(\widetilde{\chi }(t) = 1\) and we thus obtain

$$\begin{aligned} \sum _{\begin{array}{c} \left\langle G \right\rangle< N_k \\ G \in \mathcal {M} \end{array}} g(G)&= \sum _{G \in \mathcal {M}_{<n_k}} g(G) + \xi e_{\theta }(t)^{n_k} \left( \sum _{0< \left\langle G \right\rangle < N_{k-1}} \widetilde{\chi }(t^{n_k} + G) + 1\right) \nonumber \\&= \xi e_{\theta }(t)^{n_k} S_{N_{k-1}}(\widetilde{\chi }) + O(1). \end{aligned}$$
(36)

We similarly have for \(1 \le m \le k-1\) that

$$\begin{aligned} S_{N_m}(\widetilde{\chi }) = \sum _{\begin{array}{c} \text {deg}(G)< N_m \end{array}} \widetilde{\chi }(G) + 1 + \sum _{0< \left\langle G \right\rangle < N_{m-1}} \widetilde{\chi }(t^{n_m} + G)&= \Sigma _{N_m}(\widetilde{\chi }) + 1 + S_{N_{m-1}}(\widetilde{\chi }), \end{aligned}$$

and on iterating this we get

$$\begin{aligned} S_{N_m}(\widetilde{\chi }) = \sum _{1 \le j \le m}\left( \Sigma _{N_j}(\widetilde{\chi }) + 1\right) + O(1). \end{aligned}$$
(37)

We now deduce from (36), (37) that

$$\begin{aligned} \left| \sum _{1 \le j \le k-1} (\Sigma _{N_j}(\widetilde{\chi }) + 1) \right| = O(1). \end{aligned}$$

On the other hand, we have \(\Sigma _{N_j}(\widetilde{\chi }) = c_q \sum _{G \in \mathcal {M}_{<N_j}} \widetilde{\chi }(G)\) (where \(c_q\) is given by (34)), and by Lemma 5.2(2) the map \(n \mapsto \sum _{G \in \mathcal {M}_{<n}} \widetilde{\chi }(G)\) is constant (since \(\text {deg}(t) = 1\) and \(\widetilde{\chi }(t) = 1\)). Thus, we in fact obtain that

$$\begin{aligned} (k-1)\left| c_q \sum _{G \in \mathcal {M}_{<n}} \widetilde{\chi }(G) + 1\right| = O(1), \end{aligned}$$

for any \(n \ge 1\). Taking \(n = 1\), we see that \(c_q \sum _{G \in \mathcal {M}_{<1}}\widetilde{\chi }(G) = c_q \chi (1) \ne -1\) in any case. We obtain the contradiction \(k \ll 1\), and the claim is proved. \(\square \)

We will split the remaining case \(P \ne t\) into two subcases depending on whether \(\alpha \in \mathbb {Q}\) or not (recall from (30) that \(e(\alpha )=\widetilde{\chi }(P)\)). Our argument in both subcases has a common setup that we introduce presently.

Pick a sequence \((m_k)_{k\ge 1}\) such that \(m_k - m_{k-1} \ge 10 \nu \), and let \(a \ge 1\) be an integer to be chosen later, which is bounded in terms of \(\alpha \) and \(P^r\). Let \((n_k)_{k\ge 1}=(m_k \phi (P^r)+a)_{k\ge 1}\). We assume furthermore that \(m_k\) is chosen so that \(m_k \ge 2n_{k-1}\), so e.g., \(m_k/m_{k-1} \ge 10 \nu \phi (P^r)\) is sufficient. As in the case \(P = t\), define a sequence \((N_k)_{k\ge 1}\) by

$$\begin{aligned} N_k= \langle 1\rangle \sum _{j=1}^k q^{n_j}. \end{aligned}$$
(38)

Note that, by Euler’s theorem over \(\mathbb {F}_q[t]\), we have

$$\begin{aligned} t^{\phi (P^r)}\equiv 1\pmod {P^r}. \end{aligned}$$

This means that

$$\begin{aligned} t^{\phi (P^r)}= 1 + P^v G_0 \end{aligned}$$
(39)

where \(v \ge r\) and \(G_0\in \mathcal {M}\) is coprime to P. If \(m_k\) is chosen to be a power of q then by the binomial formula,

$$\begin{aligned} t^{n_k-a} = (1+P^vG_0)^{m_k} = 1 + (P^vG_0)^{m_k}. \end{aligned}$$
(40)
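The identities (39) and (40) can be checked by machine in a small case. The following is a minimal sketch, assuming \(q = 2\) and encoding a polynomial in \(\mathbb {F}_2[t]\) as a Python integer whose ith bit is the coefficient of \(t^i\); the helper names (`pmul`, `pdivmod`, `ppow`, `euler_decomposition`, `frobenius_holds`) are hypothetical and chosen only for this illustration.

```python
# Illustration only: arithmetic in F_2[t], with bit i of an int holding the
# coefficient of t^i.  The case q = 2 is an assumption made for simplicity.

def pmul(a, b):
    """Carry-less product of two polynomials in F_2[t]."""
    out = 0
    while b:
        if b & 1:
            out ^= a
        a <<= 1
        b >>= 1
    return out

def pdivmod(a, b):
    """Quotient and remainder of a by a nonzero b in F_2[t]."""
    quo = 0
    db = b.bit_length()
    while a.bit_length() >= db:
        shift = a.bit_length() - db
        quo ^= 1 << shift
        a ^= b << shift
    return quo, a

def ppow(a, e):
    """a^e in F_2[t] by repeated multiplication."""
    out = 1
    for _ in range(e):
        out = pmul(out, a)
    return out

def euler_decomposition(P, r):
    """Return (phi, v, G0) with t^phi = 1 + P^v * G0 and P not dividing G0.

    P is assumed irreducible and different from t; phi = phi(P^r).
    """
    norm = 1 << (P.bit_length() - 1)      # |P| = 2^deg(P)
    phi = norm ** r - norm ** (r - 1)     # Euler totient of P^r
    rest = (1 << phi) ^ 1                 # t^phi + 1, which equals t^phi - 1 over F_2
    v = 0
    while True:
        quo, rem = pdivmod(rest, P)
        if rem:
            break
        rest, v = quo, v + 1
    return phi, v, rest

def frobenius_holds(H, m):
    """Check whether (1 + H)^m == 1 + H^m in F_2[t]."""
    return ppow(1 ^ H, m) == 1 ^ ppow(H, m)
```

For instance, with \(P = t^2+t+1\) (encoded as `0b111`) and \(r = 2\), the function returns \(v = 4\), so the exponent v in (39) can strictly exceed r. The check `frobenius_holds` succeeds when m is a power of 2 and may fail otherwise, which is why \(m_k\) is taken to be a power of q.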

Since, by assumption, \(m_k \ge 2n_{k-1}\) we obtain that

$$\begin{aligned} t^{n_k} \equiv t^a \pmod {P^{2rn_{k-1}}}. \end{aligned}$$
(41)

The fact that \(t^{n_k-a}-1\) is highly divisible by P will be used crucially in the sequel. We now split our sum, similarly as in (36), as

$$\begin{aligned} S_{N_k}^{\mathcal {M}}(g)&=\sum _{\begin{array}{c} G\in \mathcal {M}\\ \langle G\rangle< q^{n_k} \end{array}}g(G)+\sum _{\begin{array}{c} G\in \mathbb {F}_q[t]\\ 0\le \langle G\rangle< N_{k-1} \end{array}}g(t^{n_k}+G)\\&=\Sigma _{n_k}^{\mathcal {M}}(g)+\sum _{\begin{array}{c} G\in \mathbb {F}_q[t]\\ 0\le \langle G\rangle < N_{k-1} \end{array}}g(t^{n_k}+G). \end{aligned}$$

Note that by (41) and the fact that \(n_k>n_{k-1}+\nu \), we have

$$\begin{aligned} g(t^{n_k}+G)=e(\theta n_k)\xi (t)^{n_k}\widetilde{\chi }(t^{a}+G) \end{aligned}$$

for all \(G\in \mathcal {M}_{\le n_{k-1}}\setminus \{-t^a\}\). Also note that the conditions \(\langle G+t^a\rangle < n\) and \(\langle G\rangle < n\) are equivalent whenever \(q^{a+1}\mid n\), and \(q^{a+1}\mid N_j\) for all \(j\ge 1\). Hence, we obtain

$$\begin{aligned} \begin{aligned} S_{N_k}^{\mathcal {M}}(g)&=\Sigma _{n_k}^{\mathcal {M}}(g)+e(\theta n_k)\xi (t)^{n_k}S_{N_{k-1}}(\widetilde{\chi })+g(t^{n_k}-t^a) \end{aligned} \end{aligned}$$
(42)

Similarly, for all \(k\ge 1\), we have

$$\begin{aligned} \begin{aligned} S_{N_{k}}(\widetilde{\chi })&=\Sigma _{n_{k}}(\widetilde{\chi })+\sum _{\begin{array}{c} G\in \mathbb {F}_q[t]\\ 0 \le \langle G\rangle < N_{k-1} \end{array}}\widetilde{\chi }(t^{n_k}+G)\\&= \Sigma _{n_{k}}(\widetilde{\chi })+S_{N_{k-1}}(\widetilde{\chi })+\widetilde{\chi }(t^{n_{k}}-t^a), \end{aligned} \end{aligned}$$
(43)

where \(N_0:=0\). Iterating (43) and substituting into (42) produces

$$\begin{aligned} \begin{aligned} S_{N_k}^{\mathcal {M}}(g)&=e(\theta n_k)\xi (t)^{n_k}\left( \sum _{j=1}^{k-1}\Sigma _{n_{j}}(\widetilde{\chi })+\sum _{j=1}^{k}\widetilde{\chi }(t^{n_j}-t^a)\right) +\Sigma _{n_k}^{\mathcal {M}}(g)+O(1)\\&=e(\theta n_k)\xi (t)^{n_k}\left( \sum _{j=1}^{k-1}\Sigma _{n_{j}}(\widetilde{\chi })+\sum _{j=1}^{k}\widetilde{\chi }(t^{n_j}-t^a)\right) +O(1), \end{aligned} \end{aligned}$$
(44)

where we used the assumption \(\Sigma _{n_k}^{\mathcal {M}}(g)= O(1)\). This leads to

$$\begin{aligned} \sum _{j=1}^{k-1}\Sigma _{n_{j}}(\widetilde{\chi })+\sum _{j=1}^{k}\widetilde{\chi }(t^{n _j}-t^a)=O(1). \end{aligned}$$
(45)

At this point, we may distinguish between the remaining two cases.

Proof of Proposition 5.1 when \(P \not =t\)

As mentioned, the proof splits into two subcases.

5.1 Case 1: \(P \ne t\), \(\alpha \notin \mathbb {Q}\)

Let \(G_0\) and \(v \ge r\) be as in (39). Let \(d:= \text {ord}(\chi (G_0))\) and let \(\beta \) be a limit point of the sequence \(\{vq^{An}\alpha \pmod {1}\}_{n \ge 1}\), where \(A = 20C\nu r \text {deg}(P)\) and \(C \ge 1\) is a large integer depending only on \(\alpha \) to be chosen below. By the pigeonhole principle we may select \(\ell _1< \cdots <\ell _k\) sufficiently large in terms of \(\alpha \) and \(\text {deg}(P)\) such that

$$\begin{aligned} \Vert vq^{A\ell _j}\alpha - \beta \Vert < \tfrac{1}{100}, \end{aligned}$$

and so that \(q^{A\ell _j} \equiv c_0 \pmod {d}\) for all \(1 \le j \le k\) and some \(1 \le c_0 \le d\). We now set \(m_j = q^{A \ell _j}\), and \(a = \gamma \text {deg}(P)\), where \(1 \le \gamma = \gamma (\alpha ) \le C\) is an integer to be chosen later. With this choice, we have \(n_j = \phi (P^r) q^{A\ell _j} + \gamma \text {deg}(P)\), and for suitably large \(\ell _1\) we have \(n_1 \ge 10 C \text {deg}(P) + a \ge 2a\). For \(k \ge 1\) we may verify the required inequalities \(n_k-n_{k-1} > \nu +r\) and

$$\begin{aligned} m_k \ge 10\phi (P^r) m_{k-1} = 10 (n_{k-1}-a) \ge 5 n_{k-1} \text { for all } k \ge 1. \end{aligned}$$

By Lemma 5.2(1), for any \(j\ge 1\) we have

$$\begin{aligned} \Sigma _{n_j}^{\mathcal {M}}(\widetilde{\chi })&= \sum _{\begin{array}{c} M' \in \mathcal {M} \\ \text {deg}(M')< \nu + r\text {deg}(P) \end{array}} \chi (M') \sum _{0 \le \ell< (n_j-\text {deg}(M'))/\text {deg}(P)} e(\alpha \ell ) \\&= \frac{1}{1-e(\alpha )} \sum _{\begin{array}{c} M' \in \mathcal {M} \\ \text {deg}(M') < \nu + r\text {deg}(P) \end{array}} \chi (M') \left( 1- e\left( \alpha \left( 1 + \left\lfloor \frac{n_j-\text {deg}(M')}{\text {deg}(P)}\right\rfloor \right) \right) \right) . \end{aligned}$$

As the sum over \(M'\) (without the bracketed expression) vanishes, we may ignore the term 1 in the brackets. Recalling (33), (34) and that \(\text {deg}(P)|a\), we can rewrite \(\Sigma _{n_j}(\widetilde{\chi })\) as

$$\begin{aligned} \Sigma _{n_j}(\widetilde{\chi }) = \frac{- e(\alpha (1+\gamma ))c_q}{1-e(\alpha )} \sum _{\begin{array}{c} M' \in \mathcal {M}\\ \text {deg}(M') < \nu + r\text {deg}(P) \end{array}} \chi (M') e\left( \left\lfloor \frac{m_j\phi (P^r) - \text {deg}(M')}{\text {deg}(P)}\right\rfloor \alpha \right) . \end{aligned}$$

Summing over j, we obtain

$$\begin{aligned}&\sum _{1 \le j \le k-1} \Sigma _{n_j}(\widetilde{\chi }) \\&\qquad = e(\gamma \alpha ) \cdot \Bigg ( -\frac{c_qe(\alpha )}{1-e(\alpha )} \sum _{\begin{array}{c} M' \in \mathcal {M} \\ \text {deg}(M') < \nu + r\text {deg}(P) \end{array}} \chi (M') \sum _{1 \le j \le k-1} e\left( \left\lfloor \frac{m_j\phi (P^r) -\text {deg}(M')}{\text {deg}(P)}\right\rfloor \alpha \right) \Bigg ) \\&\qquad =: e(\gamma \alpha ) \mathcal {S}(\alpha ). \end{aligned}$$

Note that \(\mathcal {S}(\alpha )\) is independent of \(\gamma \). Splitting off \(\widetilde{\chi }(t)^a\) in (45), that expression becomes

$$\begin{aligned} e(\gamma \alpha ) \mathcal {S}(\alpha ) + \chi (t^{\text {deg}(P)})^{\gamma } \sum _{j=1}^{k}\widetilde{\chi }(t^{n _j-a}-1)=O(1). \end{aligned}$$

Now, since \(\alpha \notin \mathbb {Q}\) and \(\chi (t^{\text {deg}(P)})\) is a root of unity of order d, it follows that (taking \(C = C(\alpha )\) large enough) an integer \(\gamma = \gamma (\alpha ) \in [1,C]\) can be chosen so that

$$\begin{aligned} |\text {arg}(e(\gamma \alpha ) \mathcal {S}(\alpha )) - \text {arg}(\chi (t^{\text {deg}(P)})^{\gamma } \sum _{j=1}^{k}\widetilde{\chi }(t^{n _j-a}-1))| \in \left( -\tfrac{1}{100}, \tfrac{1}{100}\right) . \end{aligned}$$

Hence, (45) in fact implies that

$$\begin{aligned} \chi (t^{\text {deg}(P)})^{\gamma }\sum _{j=1}^{k}\widetilde{\chi }(t^{n _j-a}-1) = O(1). \end{aligned}$$
(46)

Now by construction, for each \(1 \le j \le k\),

$$\begin{aligned} \widetilde{\chi }(t^{n_j-a}-1) = \widetilde{\chi }(P)^{vq^{A\ell _j}}\chi (G_0)^{q^{A\ell _j}} = e(vq^{A\ell _j}\alpha ) \chi (G_0)^{c_0}, \end{aligned}$$

and by choice of \(\ell _j\) we have that \(|e(vq^{A\ell _j}\alpha )-e(\beta )| \le \frac{2\pi }{100} < \tfrac{1}{10}\). It follows that

$$\begin{aligned} \left| \sum _{j = 1}^k \widetilde{\chi }(t^{n_j-a}-1)\right| = \left| \chi (G_0)^{c_0} e(\beta ) k + \sum _{j = 1}^k \chi (G_0)^{c_0}(e(vq^{A\ell _j}\alpha )-e(\beta ))\right| \ge \tfrac{9 k}{10}, \end{aligned}$$

which contradicts (46) for k sufficiently large. This completes the proof in Case 1.
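The lower bound \(\tfrac{9k}{10}\) is an instance of the triangle inequality: k unit vectors that all lie within \(\tfrac{1}{10}\) of a common unit vector cannot cancel. A minimal numerical sketch follows, assuming floating-point arithmetic and hypothetical names `e` and `clustered_sum`; the offsets play the role of \(vq^{A\ell _j}\alpha - \beta \) and are assumed to be at most \(\tfrac{1}{100}\) in absolute value.

```python
import cmath
import math

def e(x):
    """e(x) = exp(2*pi*i*x), as in the text."""
    return cmath.exp(2j * math.pi * x)

def clustered_sum(beta, offsets, unit=1 + 0j):
    """Sum of unit * e(beta + d) over the given offsets d.

    'unit' stands in for the fixed unimodular factor chi(G_0)^{c_0};
    each offset is assumed to satisfy |d| < 1/100.
    """
    return sum(unit * e(beta + d) for d in offsets)

# Hypothetical sample data: 50 offsets, all of absolute value below 1/100.
sample_offsets = [((-1) ** j) * 0.0099 * (j % 7) / 6 for j in range(50)]
sample_total = clustered_sum(0.3, sample_offsets, unit=e(0.17))
```

Since \(|e(x)-e(\beta )| = 2|\sin (\pi (x-\beta ))|\), each term differs from \(\chi (G_0)^{c_0}e(\beta )\) by at most \(2\pi /100\), so `abs(sample_total)` is at least \(\tfrac{9}{10}\cdot 50\) for any admissible offsets.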

5.1.1 Case 2. \(P \ne t\) and \(\alpha \in \mathbb {Q}\)

In this case, we may find \(b \ge 1\) such that \(\widetilde{\chi }(P) = e(\alpha )\) is a bth root of unity. We select \(m_k = q^{A\ell _k}\), where \(A = 2\nu r\text {deg}(P)\) and \(\ell _k\) is chosen so that \(\ell _k-\ell _{k-1} \ge 10\) and also so that \(\widetilde{\chi }(t^{\phi (P^r)q^{A\ell _k}}-1) = g_0\) for all \(k \ge 1\), where \(g_0 \in S^1\) (for this, it suffices for \(q^{A\ell _k}\) to be constant modulo \([b,d]\), where as above \(d = \text {ord}(\chi (G_0))\)). We also pick \(a \in [1,b \text {deg}(P)]\).

By (40) we have

$$\begin{aligned} \widetilde{\chi }(t^{n_k}-t^a)=\widetilde{\chi }(t)^a\widetilde{\chi }(t^{n_k-a}-1)=\chi (t)^ag_0, \end{aligned}$$
(47)

by the definition of \(g_0\).

We combine (44), (33) and (34), using the fact (following from Lemma 5.2(2) and the assumption \(\widetilde{\chi }(P)^b=1\)) that \(n\mapsto \Sigma ^{\mathcal {M}}_n(\widetilde{\chi })\) is \(b\text {deg}(P)\)-periodic and \(n_j\equiv a\pmod {b\text {deg}(P)}\). We then see that

$$\begin{aligned} \sum _{j=1}^{k-1}\Sigma _{n_{j}}(\widetilde{\chi })+\sum _{j=1}^{k}\widetilde{\chi }(t^{n _j}-t^a)&=k\left( c_q\Sigma _{a}^{\mathcal {M}}(\widetilde{\chi })+g_0\chi (t)^a\right) +O(1). \end{aligned}$$
(48)

Now, as the left-hand side of (48) is O(1), we must have

$$\begin{aligned} c_q\Sigma _{a}^{\mathcal {M}}(\widetilde{\chi })+g_0\chi (t)^a=0 \end{aligned}$$
(49)

for all \(a\in [1,b\text {deg}(P)]\). We have \(c_q=0\) or \(c_q=q-1\), and the first of these is clearly impossible, since \(|g_0\widetilde{\chi }(t)^a|=1\). Hence, (49) becomes

$$\begin{aligned} (q-1)\Sigma _{a}^{\mathcal {M}}(\widetilde{\chi })+g_0\chi (t)^a=0 \end{aligned}$$
(50)

Since the sequence \(n\mapsto \Sigma _{n}^{\mathcal {M}}(\widetilde{\chi })\) is \(b\text {deg}(P)\)-periodic, and \(n\mapsto \chi (t)^n\) is also \(b\text {deg}(P)\)-periodic, we deduce that

$$\begin{aligned} (q-1)\Sigma _{n}^{\mathcal {M}}(\widetilde{\chi })+g_0\chi (t)^n=0 \end{aligned}$$
(51)

for all \(n\ge 0\).

From (51) we see that, for \(\text {Re}(s) > 1\), we have

$$\begin{aligned} \sum _{G\in \mathcal {M}}\widetilde{\chi }(G)q^{-s\text {deg}(G)}=\sum _{n\ge 0}\Sigma _{n}^{\mathcal {M}}(\widetilde{\chi })q^{-sn}=-\frac{g_0(q-1)^{-1}}{1-\chi (t)q^{-s}}. \end{aligned}$$

But on the other hand in the same region of s by (10) we have

$$\begin{aligned} \sum _{G\in \mathcal {M}}\widetilde{\chi }(G)q^{-s\text {deg}(G)}=\prod _{R\in \mathcal {P}}\sum _{k\ge 0}\widetilde{\chi }(R)^kq^{-sk\text {deg}(R)}=\frac{L(s,\chi )}{1-\widetilde{\chi }(P)q^{-s\text {deg}(P^r)}}. \end{aligned}$$

Comparing these, we see that

$$\begin{aligned} L(s,\chi )=-\frac{g_0(q-1)^{-1}(1-\widetilde{\chi }(P)q^{-s\text {deg}(P^r)})}{1-\chi (t)q^{-s}}; \end{aligned}$$
(52)

initially this holds for \(\text {Re}(s) > 1\), but by analytic continuation we in fact have this for all s. In particular, if \(\text {deg}(P^r)\ge 2\), this implies that \(L(s,\chi )\) has a root other than \(s = 0\) off the critical line \(\text {Re}(s) = 1/2\). But by GRH over function fields this is not possible. Hence, \(\text {deg}(P^r) = r\text {deg}(P) =1\), and \(r = 1\). Then, P being monic and coprime to t implies that \(P=t+c\) for some \(c\in \mathbb {F}_q\setminus \{0\}\). As \(c_q \ne 0\) it follows that \(\chi \) is 1 on \(\mathbb {F}_q^{\times }\), and therefore \(\chi (t) = \chi (-c) = 1\). Since \(L(s,\chi )\) is analytic, it follows that \(\widetilde{\chi }(P) = 1\) as well, but then \(L(s,\chi ) = -g_0(q-1)^{-1} \ne 0\) for all s. On the other hand, since \(\text {deg}(P) = 1\) we obtain \(\sum _{M \in \mathcal {M}_n} \chi (M) = 0\) for all \(n \ge 1\), and thus

$$\begin{aligned} L(s,\chi ) = \sum _{M \in \mathcal {M}} \chi (M)q^{-s\text {deg}(M)} = 1. \end{aligned}$$

Comparing with (52), we obtain \((q-1)^{-1}=-g_0\), so that as \(g_0\in S^1\) we must have \(q=2\) and \(g_0=-1\). Hence, \(\widetilde{\chi }\) must be a generalized character \(\pmod {t+1}\), and additionally \(\widetilde{\chi }(t+1)=1\). But now if \(G\in \mathbb {F}_2[t]\) is any monic polynomial, then by changing bases we can write

$$\begin{aligned} G = \sum _{0 \le j \le r} a_j(t+1)^j, \end{aligned}$$

with \(a_j \in \{0,1\}\). If \(j_0\) is the minimal index for which \(a_j \ne 0\) then we immediately find that \(\widetilde{\chi }(G) = \widetilde{\chi }(a_{j_0}) = 1\). Hence \(\widetilde{\chi } \equiv 1\). But this contradicts the assumption \(|\Sigma _a^{\mathcal {M}}(\widetilde{\chi })| = |-g_0\widetilde{\chi }(t)^a| = 1\) for all \(a \ge 1\), since

$$\begin{aligned} \Sigma _1^{\mathcal {M}}(\widetilde{\chi }) = 1+\widetilde{\chi }(t) + \widetilde{\chi }(t+1) = 3. \end{aligned}$$

This completes the analysis of Case 2 and the proof of Proposition 5.1 in this case. \(\square \)

Remark 5.4

We remark that the proof of Proposition 5.1 gives at best a growth rate of \(\gg \log \log N\) for the lexicographic discrepancy \(S_{N}^{\mathcal {M}}(g)\). This is because the sequence \((n_k)_k\) must satisfy \(n_k\ge 2n_{k-1}\) so that by (38) we have \(k\ll \log \log N_k\).

The proof of Theorem 1.5 is now complete.

6 The long sum discrepancy

We will next prove our characterization result for unboundedness of the long sum discrepancy (Theorem 1.8), as well as Proposition 1.1 that complements it. We begin the proof of Theorem 1.8 with the following simple observation.

Lemma 6.1

Let \(f: \mathcal {M} \rightarrow \mathbb {U}\) be a modified character associated with a non-principal Dirichlet character \(\chi \) modulo \(Q\in \mathcal {M}\), and let \(N > \text {deg}(Q)\) be large. Then

$$\begin{aligned} \sum _{G \in \mathcal {M}_{\le N}} f(G) = \sum _{0 \le m < \text {deg}(Q)} \left( \sum _{\begin{array}{c} A \pmod {Q} \\ A \in \mathcal {M}_{\le m} \end{array}} \chi (A)\right) \sum _{\begin{array}{c} \text {rad}(D)|Q \\ \text {deg}(D) = N-m \end{array}} f(D). \end{aligned}$$

Proof

We split the sum on the left-hand side according to the common factors of G with Q to obtain

$$\begin{aligned} \sum _{G \in \mathcal {M}_{\le N}} f(G)&= \sum _{\text {rad}(D)|Q} f(D)\sum _{G' \in \mathcal {M}_{\le N-\text {deg}(D)}} \chi (G') \\&= \sum _{A\pmod {Q}} \chi (A) \sum _{\text {rad}(D)|Q} f(D) \sum _{\begin{array}{c} G' \in \mathcal {M}_{\le N-\text {deg}(D)} \\ G' \equiv A \pmod {Q} \end{array}} 1. \end{aligned}$$

We separate the contribution with \(\text {deg}(D) \le N-\text {deg}(Q)\) from its complement. Observe that when \(\text {deg}(D) \le N-\text {deg}(Q)\), the inner sum above is independent of A. Thus, orthogonality implies that its contribution is 0. On the other hand, if \(\text {deg}(D) > N-\text {deg}(Q)\) and \(G' \in \mathcal {M}_{\le N-\text {deg}(D)}\) with \(G' \equiv A \pmod {Q}\) then \(G' = A\). It follows that

$$\begin{aligned} \sum _{G \in \mathcal {M}_{\le N}} f(G) = \sum _{\begin{array}{c} \text {rad}(D)|Q \\ N-\text {deg}(Q) < \text {deg}(D) \le N \end{array}} f(D) \sum _{\begin{array}{c} A \pmod {Q} \\ A \in \mathcal {M}_{\le N-\text {deg}(D)} \end{array}} \chi (A). \end{aligned}$$

Splitting the sum according to the size of \(\text {deg}(D)\), we get

$$\begin{aligned} \sum _{G \in \mathcal {M}_{\le N}} f(G) = \sum _{0 \le m < \text {deg}(Q)} \left( \sum _{\begin{array}{c} A \pmod {Q} \\ A \in \mathcal {M}_{\le m} \end{array}} \chi (A)\right) \sum _{\begin{array}{c} \text {rad}(D)|Q \\ \text {deg}(D) = N-m \end{array}} f(D), \end{aligned}$$

as claimed. \(\square \)

Proof of Theorem 1.8

Let \(N > \text {deg}(Q)\) be large. By the residue theorem, we have

$$\begin{aligned} \sum _{\begin{array}{c} \text {rad}(D)|Q \\ \text {deg}(D) = N-m \end{array}} f(D)&= \frac{1}{2\pi i} \int _{|z| = r} \left( \sum _{\text {rad}(D)|Q} f(D)z^{\text {deg}(D)}\right) \frac{dz}{z^{N-m+1}}\\&= \frac{1}{2\pi i } \int _{|z| = r}\prod _{P|Q} \left( 1-f(P)z^{\text {deg}(P)}\right) ^{-1} z^m \frac{dz}{z^{N+1}}, \end{aligned}$$

for any \(r \in (0,1)\). Using Lemma 6.1 together with this expression for each \(m \le \text {deg}(Q)-1\), we have

$$\begin{aligned} \sum _{G \in \mathcal {M}_{\le N}} f(G)&= \sum _{0 \le m< \text {deg}(Q)} \left( \sum _{\begin{array}{c} A \pmod {Q} \\ A \in \mathcal {M}_{\le m} \end{array}} \chi (A)\right) \sum _{\begin{array}{c} \text {rad}(D)|Q \\ \text {deg}(D) = N-m \end{array}} f(D) \\&= \frac{1}{2\pi i } \int _{|z| = r} \prod _{P|Q}\left( 1-f(P)z^{\text {deg}(P)}\right) ^{-1} \left( \sum _{A \in \mathcal {M}_{<\text {deg}(Q)}} \chi (A) \sum _{\text {deg}(A) \le m \le \text {deg}(Q)-1} z^m \right) \frac{dz}{z^{N+1}} \\&= \frac{1}{2\pi i} \int _{|z| = r} \prod _{P|Q}\left( 1-f(P)z^{\text {deg}(P)}\right) ^{-1} (1-z)^{-1}\sum _{A \in \mathcal {M}_{<\text {deg}(Q)}} \chi (A) z^{\text {deg}(A)} (1-z^{\text {deg}(Q)-\text {deg}(A)}) \frac{dz}{z^{N+1}} , \end{aligned}$$

where we used the geometric sum formula in the last step. By the orthogonality of characters, we have \(\sum _{A\in \mathcal {M}_{<\text {deg}(Q)}}\chi (A)z^{\text {deg}(Q)}=0\). Thus, if \(z = q^{-s}\) for some \(s \in \mathbb {C}\) then by writing \(\mathcal {L}(z,\chi ):= L(s,\chi )/(1-z)\) (see (10) for the definition of \(L(s,\chi )\)), the previous expression simplifies to

$$\begin{aligned} \frac{1}{2\pi i} \int _{|z| = r} \mathcal {L}(z,\chi ) \prod _{P|Q} \left( 1-f(P)z^{\text {deg}(P)}\right) ^{-1} \frac{dz}{z^{N+1}}, \end{aligned}$$
(53)

using orthogonality in the last step.

Now let \(\lambda _1,\ldots ,\lambda _J\) be the collection of distinct roots of \(\prod _{P|Q}(1-f(P)z^{\text {deg}(P)})\), with respective multiplicities satisfying \(b_1\le \ldots \le b_J\); note that \(\lambda _j \in S^1\) for all j. A partial fraction decomposition of the reciprocal of this polynomial yields coefficients \(\{a_{j,l}\}_{1 \le l \le b_j, 1 \le j \le J}\) such that

$$\begin{aligned} \prod _{P|Q} (1-f(P)z^{\text {deg}(P)})^{-1} = \prod _{1 \le j \le J} (1-\lambda _j z)^{-b_j} = \sum _{1 \le j \le J} \sum _{1 \le l \le b_j} \frac{a_{j,l}}{(1-\lambda _jz)^l}. \end{aligned}$$

Noting that for each pair \((j,l)\) we have the formal power series expansion

$$\begin{aligned} (1-\lambda _j z)^{-l} = \sum _{k \ge 0} \left( {\begin{array}{c}l-1+k\\ k\end{array}}\right) \lambda _j^k z^k, \end{aligned}$$

we see that

$$\begin{aligned} \sum _{G \in \mathcal {M}_{\le N}} f(G)&= \sum _{k \ge 0} \sum _{1 \le j \le J} \sum _{1 \le l \le b_j} a_{j,l}\left( {\begin{array}{c}l-1+k\\ k\end{array}}\right) \lambda _j^k \left( \frac{1}{2\pi i } \int _{|z| = r} \mathcal {L}(z,\chi ) z^k \frac{dz}{z^{N+1}}\right) \nonumber \\&= \sum _{1 \le j \le J} \sum _{1 \le l \le b_j} a_{j,l} \sum _{N-\text {deg}(Q) < k \le N} \left( [z^{N-k}] \mathcal {L}(z,\chi )\right) \left( {\begin{array}{c}l-1+k\\ k\end{array}}\right) \lambda _j^k, \end{aligned}$$
(54)

where, given a formal power series F(z) in z we write \([z^m]F(z)\) to denote the mth coefficient of F, for \(m \in \mathbb {N}\cup \{0\}\). We shall use this last expression to prove both parts of the theorem, beginning with part b).
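The power series expansion of \((1-\lambda _j z)^{-l}\) used above is a formal identity, so it can be verified with exact integer arithmetic. The sketch below does this by multiplying the claimed series by \((1-\lambda z)^l\) and checking that the product is 1 up to the truncation order. It assumes an integer value of \(\lambda \) purely for exactness (in the proof \(\lambda _j\) lies on the unit circle), and the function names are hypothetical.

```python
from math import comb

def neg_binomial_coeffs(lam, l, terms):
    """First 'terms' coefficients of the claimed expansion of (1 - lam*z)^(-l)."""
    return [comb(l - 1 + k, k) * lam ** k for k in range(terms)]

def binomial_power_coeffs(lam, l, terms):
    """First 'terms' coefficients of the polynomial (1 - lam*z)^l."""
    return [comb(l, k) * (-lam) ** k for k in range(terms)]

def mul_trunc(a, b, terms):
    """Product of two coefficient lists, truncated to degree < terms."""
    out = [0] * terms
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            if i + j >= terms:
                break
            out[i + j] += ai * bj
    return out

def expansion_is_inverse(lam, l, terms):
    """True if the claimed series times (1 - lam*z)^l equals 1 modulo z^terms."""
    product = mul_trunc(neg_binomial_coeffs(lam, l, terms),
                        binomial_power_coeffs(lam, l, terms), terms)
    return product == [1] + [0] * (terms - 1)
```

For fixed l the coefficient \(\left( {\begin{array}{c}l-1+k\\ k\end{array}}\right) \) is a polynomial in k of degree \(l-1\), which is the source of the power \(N^{b-1}\) in part b).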

Part b). By hypothesis, \(b = b_J\ge 2\). Let \(1 \le i \le J\) be the minimal index for which \(b_i = b_{i+1} = \cdots = b_J\le \text {deg}(Q)\). As \(\left( {\begin{array}{c}l-1+k\\ k\end{array}}\right) = \frac{N^{l-1}}{(l-1)!} + O_{\text {deg}(Q)}(N^{l-2})\) for any \(l \ge 2\) and \(k \in (N-\text {deg}(Q),N]\), we get

$$\begin{aligned}&\sum _{1 \le j \le J} \sum _{1 \le l \le b_j} a_{j,l} \sum _{N-\text {deg}(Q)< k \le N} \left( [z^{N-k}]\mathcal {L}(z,\chi )\right) \left( {\begin{array}{c}l-1+k\\ k\end{array}}\right) \lambda _j^k \\&= \sum _{i \le j \le J} a_{j,b} \sum _{N-\text {deg}(Q)< k \le N} \left( {\begin{array}{c}b-1+k\\ k\end{array}}\right) \left( [z^{N-k}]\mathcal {L}(z,\chi )\right) \lambda _j^k + O_{\text {deg}(Q)}(N^{b-2}) \\&= \frac{N^{b-1}}{(b-1)!} \sum _{i \le j \le J} a_{j,b} \sum _{0 \le m <\text {deg}(Q)} \left( [z^m]\mathcal {L}(z,\chi )\right) \lambda _j^{N-m} + O_{\text {deg}(Q)}(N^{b-2}) \\&= \frac{N^{b-1}}{(b-1)!} \sum _{i \le j \le J} a_{j,b} \lambda _j^N \mathcal {L}(\overline{\lambda _j},\chi ) + O_{\text {deg}(Q)}(N^{b-2}), \end{aligned}$$

where in the last step we made the change of variables \(m=N-k\), which leads to the power series in the penultimate line simplifying to \(\lambda _j^{N}\mathcal {L}(\overline{\lambda _j},\chi )\).

We note that \(\mathcal {L}(\overline{\lambda _j},\chi ) \ne 0\) for all j because by GRH [3, Thm. 5.5 and Ex. 5.2.2] we know that \(\mathcal {L}(z,\chi )\) has no zeros off the circle \(|z| = q^{-1/2}\), aside from a simple zero at \(z = 1\) (which has been cancelled in the definition of \(\mathcal {L}(z,\chi )\)). Moreover, \(a_{j,b} \ne 0\) for all \(i \le j \le J\) as well, otherwise the maximal power of \((1-\lambda _jz)^{-1}\) in the partial fraction decomposition would be strictly smaller than b. Finally, applying Dirichlet’s theorem we can find a sequence \(\{N_r\}_r\) such that \(\max _{i \le m \le J} |\lambda _m^{N_r}-1| \le \varepsilon \) for any specific choice of \(\varepsilon > 0\) (chosen small relative to Q and J). It follows that for all \(l \in \{0,\ldots ,J-i-1\}\) we have

$$\begin{aligned} \sum _{i \le j \le J} a_{j,b} \mathcal {L}(\overline{\lambda _j},\chi ) \lambda _j^{N_r + l} = \sum _{i \le j \le J} a_{j,b} \mathcal {L}(\overline{\lambda _j},\chi ) \lambda _j^l + O_J(\varepsilon ), \end{aligned}$$
(55)

and thanks to the invertibility of the Vandermonde matrix generated by \(\lambda _i,\ldots ,\lambda _J\) (which are distinct by assumption) the expression (55) is \(\ne 0\) for at least one l and some \(\varepsilon >0\) sufficiently small. This implies then that

$$\begin{aligned} \max _{N_r \le N \le N_r+J} \left| \sum _{G \in \mathcal {M}_{\le N}} f(G)\right| \asymp _{Q} N_r^{b-1}, \end{aligned}$$

as \(r \rightarrow \infty \). This completes the proof of part b).
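The simultaneous recurrence \(\max _m |\lambda _m^{N_r}-1| \le \varepsilon \) supplied by Dirichlet's theorem can be illustrated by a brute-force search. The following is only a sketch: it assumes \(\lambda _m = e(\theta _m)\) with the angles \(\theta _m\) given as exact fractions (so that the reduction modulo 1 is exact), and the names `dist_to_one` and `recurrence_times` are hypothetical.

```python
from fractions import Fraction
import math

def dist_to_one(theta, N):
    """|e(N*theta) - 1| = 2*|sin(pi*N*theta)|, with N*theta reduced modulo 1."""
    x = (N * theta) % 1
    return 2 * abs(math.sin(math.pi * float(x)))

def recurrence_times(thetas, eps, limit):
    """All N in [1, limit] with max_m |e(N*theta_m) - 1| <= eps."""
    return [N for N in range(1, limit + 1)
            if max(dist_to_one(theta, N) for theta in thetas) <= eps]
```

For roots of unity of orders 3 and 4, the admissible N are exactly the multiples of 12. For general points on the unit circle the search still terminates with infinitely many admissible N as the limit grows, which is the content of Dirichlet's theorem, but the brute-force search gives no control on their size.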

Part a). From (54), we have

$$\begin{aligned} \sum _{G \in \mathcal {M}_{\le N}} f(G) = \sum _{1 \le j \le J} \sum _{1 \le l \le b_j} a_{j,l} \sum _{N-\text {deg}(Q) < k \le N} ([z^{N-k}]\mathcal {L}(z,\chi )) \left( {\begin{array}{c}l-1+k\\ k\end{array}}\right) \lambda _j^k. \end{aligned}$$

Since \(b_j \le b_J\) and \(b_J = 1\) by hypothesis, we have \(b_j = 1\) for all j. As above, we obtain

$$\begin{aligned} \sum _{G \in \mathcal {M}_{\le N}} f(G) = \sum _{1 \le j \le J} a_{j,1} \sum _{N - \text {deg}(Q)<k\le N} ([z^{N-k}] \mathcal {L}(z,\chi )) \lambda _j^k = \sum _{1 \le j \le J} a_{j,1} \lambda _j^N\mathcal {L}(\overline{\lambda _j},\chi ). \end{aligned}$$
(56)

Note that \(\lambda _j \in S^1\) for all j, and \(\mathcal {L}(z,\chi )\) is holomorphic and thus bounded on \(S^1\) (in terms solely of the conductor Q). Furthermore, \(a_{j,1}\) depends only on Q. It follows that the sum here is \(O_{Q}(1)\). This completes the proof. \(\square \)
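The dichotomy between parts a) and b) is already visible in the coefficients of \(\prod _{P|Q}(1-f(P)z^{\text {deg}(P)})^{-1}\), which count \(\sum f(D)\) over D with \(\text {rad}(D)|Q\) of a given degree. The sketch below computes these coefficients for hypothetical input pairs \((f(P), \text {deg}(P))\) with \(f(P) \in \{-1,+1\}\); it ignores the factor \(\mathcal {L}(z,\chi )\) and is meant only to illustrate the mechanism, not the full statement of Theorem 1.8.

```python
def reciprocal_product_coeffs(factors, terms):
    """Coefficients of prod (1 - eps*z^d)^(-1) over (eps, d) in factors, up to z^(terms-1).

    Dividing a series A(z) by (1 - eps*z^d) gives B(z) with
    B[n] = A[n] + eps*B[n-d], which is applied in place below.
    """
    coeffs = [1] + [0] * (terms - 1)
    for eps, d in factors:
        for n in range(d, terms):
            coeffs[n] += eps * coeffs[n - d]
    return coeffs

# Repeated root at z = 1 (f(P_1) = f(P_2) = 1, degrees 1 and 2): linear growth.
growing = reciprocal_product_coeffs([(1, 1), (1, 2)], 20)
# Simple roots only (f(P_1) = 1 of degree 1, f(P_2) = -1 of degree 2): bounded.
bounded = reciprocal_product_coeffs([(1, 1), (-1, 2)], 20)
```

In the first case the nth coefficient is \(\lfloor n/2\rfloor + 1\), matching the growth \(N^{b-1}\) with \(b = 2\); in the second the coefficients are periodic with values in \(\{0,1\}\).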

This gives the following list of corollaries, which includes Corollary 1.7.

Corollary 6.2

Let \(f: \mathcal {M} \rightarrow S^1\) be a modified character associated with a non-principal character of modulus Q.

  1. a)

    If \(Q = P^k\) is a prime power then \(\mathcal {D}_{f} < \infty \).

  2. b)

    If \(\omega (Q) \ge 2\) and there exist prime divisors \(P_1,P_2\) of Q satisfying \(\text {deg}(P_1) = \text {deg}(P_2)\) and \(f(P_1) = f(P_2)\) then \(\mathcal {D}_f = \infty \).

  3. c)

    If \(\omega (Q) \ge 2\) and there exist prime divisors \(P_1,P_2\) of Q satisfying \(f(P_1) = f(P_2) = 1\) then \(\mathcal {D}_f = \infty \).

  4. d)

    Suppose f takes values in \(\{-1,+1\}\).

    1. i)

      If \(\omega (Q) \ge 4\) then \(\mathcal {D}_f = \infty \).

    2. ii)

      If \(\omega (Q) = 3\) then \(\mathcal {D}_f < \infty \) if and only if (up to permutation) the primes \(P_1,P_2,P_3\) dividing Q satisfy \(f(P_1) = f(P_2) = -1\), \(f(P_3) = 1\), and \(v_2(\text {deg}(P_1)) \ne v_2(\text {deg}(P_2))\) and \(v_2(\text {deg}(P_j)) \ge v_2(\text {deg}(P_3))\) for \(j = 1,2\).

    3. iii)

      If \(\omega (Q) = 2\) then \(\mathcal {D}_f < \infty \) if and only if (up to permutation) the primes \(P_1,P_2\) dividing Q satisfy \(f(P_1) =-1,\) \(f(P_2)=1\), and \(v_2(\text {deg}(P_1)) \ge v_2(\text {deg}(P_2))\).

Proof

a) Since the zeros of the equation \(z^m = a\) (with \(m = \text {deg}(P)\) and \(a = \overline{f(P)}\)) are all distinct, Theorem 1.8 a) implies that the discrepancy is bounded.

b) Since the equations \(z^{\text {deg}(P_j)}f(P_j) = 1\) are identical for \(j = 1,2\), they have the same roots, so the claim follows from Theorem 1.8 b).

c) This follows from Theorem 1.8 b), as \(z^{\text {deg}(P_1)} = 1\) and \(z^{\text {deg}(P_2)} = 1\) must share the common root \(z = 1\).

d) i) Let \(f:\mathcal {M}\rightarrow \{-1,+1\}\), and let \(P_1,P_2,P_3,P_4\) be distinct prime divisors of Q. At least two prime divisors \(P_1,P_2\) are such that \(f(P_1) = f(P_2)\). If the parities of \(\text {deg}(P_1)\) and \(\text {deg}(P_2)\) are the same then as in the proof of b) the equations \(z^{\text {deg}(P_1)} = f(P_1)\) and \(z^{\text {deg}(P_2)} = f(P_2)\) will share a common root. Now by c), if the common value of \(f(P_1)\) and \(f(P_2)\) is 1 then this is true regardless of these parities. Thus, we may assume that the value 1 occurs at most once among the values \(f(P_j)\), for the 4 prime factors of Q. But then at least 3 of the primes \(P_j\) are such that \(f(P_j) = -1\), and among their degrees at least two have the same parity. Thus, we may conclude that \(\prod _{P|Q} (1-f(P)z^{\text {deg}(P)})\) has a multiple root, and the first claim follows from Theorem 1.8 b).

ii) Now let \(P_1,P_2,P_3\) be the prime divisors of Q. The argument in i) shows that if at least two of the \(f(P_i)\) are equal to 1, or all of the \(f(P_i)\) are equal to \(-1\), then the discrepancy is unbounded. We are left with the case where exactly two of the \(f(P_i)\) are \(-1\); say \(f(P_1)=f(P_2)=-1\) and \(f(P_3)=1\). One easily sees that

$$\begin{aligned} \begin{aligned}&\{z\in \mathbb {C}:\,\, z^m= -1\}\cap \{z\in \mathbb {C}:\,\, z^n= -1\}\ne \emptyset \quad \text {if and only if}\quad v_2(m)=v_2(n),\\&\{z\in \mathbb {C}:\,\, z^m= -1\}\cap \{z\in \mathbb {C}:\,\, z^n= 1\}\ne \emptyset \quad \text {if and only if}\quad v_2(m)<v_2(n). \end{aligned} \end{aligned}$$
(57)

Applying this with \(m,n\in \{\text {deg}(P_1),\text {deg}(P_2),\text {deg}(P_3)\}\) yields the claim.
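The two equivalences in (57) can be confirmed by exact computation for small exponents. In the sketch below a root of unity \(e(x)\) is represented by the fraction \(x \in [0,1)\), so that set intersection is exact; the helper names `v2` and `root_args` are hypothetical.

```python
from fractions import Fraction

def v2(n):
    """2-adic valuation of a positive integer n."""
    v = 0
    while n % 2 == 0:
        n //= 2
        v += 1
    return v

def root_args(n, sign):
    """Arguments x in [0, 1) of the solutions z = e(x) of z^n = sign, sign in {+1, -1}."""
    if sign == 1:
        return {Fraction(a, n) for a in range(n)}
    return {Fraction(2 * a + 1, 2 * n) for a in range(n)}

def check_57(limit):
    """Verify both lines of (57) for all 1 <= m, n <= limit."""
    for m in range(1, limit + 1):
        for n in range(1, limit + 1):
            both_minus = bool(root_args(m, -1) & root_args(n, -1))
            minus_plus = bool(root_args(m, -1) & root_args(n, 1))
            if both_minus != (v2(m) == v2(n)):
                return False
            if minus_plus != (v2(m) < v2(n)):
                return False
    return True
```

For example, \(z^2=-1\) and \(z^6=-1\) share the roots \(\pm i\) since \(v_2(2)=v_2(6)=1\), whereas \(z^2=-1\) and \(z^2=1\) have no common root.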

iii) The proof of the case \(\omega (Q) = 2\) is almost identical to that of case \(\omega (Q)=3\); again one makes use of (57). \(\square \)

Proof of Proposition 1.1

This follows by generalizing the Polymath 5 example in [16] of a completely multiplicative function having bounded long sum discrepancy.

For \(d \ge 1\) define the quantities

$$\begin{aligned} \alpha _d=\sum _{G\in \mathcal {M}_d}\Lambda (G)f(G),\quad \beta _d=\sum _{G\in \mathcal {M}_d}f(G). \end{aligned}$$

Using \(\text {deg}(G)=\sum _{D\mid G}\Lambda (D)\) and the complete multiplicativity of f, we obtain the recursion

$$\begin{aligned} d\beta _d=\sum _{i=1}^d \alpha _i\beta _{d-i}. \end{aligned}$$
(58)
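The recursion (58) holds for every completely multiplicative f, and it can be tested by direct enumeration in a small field. The following sketch assumes \(q = 2\), encodes polynomials in \(\mathbb {F}_2[t]\) as integers (bit i is the coefficient of \(t^i\)), and factors by trial division; the names `factor`, `alpha_beta` and `recursion_holds`, as well as the sample choices of f on irreducibles, are hypothetical.

```python
def pdivmod(a, b):
    """Quotient and remainder of a by a nonzero b in F_2[t]."""
    quo = 0
    db = b.bit_length()
    while a.bit_length() >= db:
        shift = a.bit_length() - db
        quo ^= 1 << shift
        a ^= b << shift
    return quo, a

def factor(g):
    """Factor a nonzero g in F_2[t] into irreducibles as {irreducible: exponent}."""
    out = {}
    p = 2                                    # the polynomial t
    while g.bit_length() > 1:                # while deg(g) >= 1
        if 2 * (p.bit_length() - 1) > g.bit_length() - 1:
            out[g] = out.get(g, 0) + 1       # no factor of degree <= deg(g)/2 remains
            break
        quo, rem = pdivmod(g, p)
        if rem == 0:
            out[p] = out.get(p, 0) + 1
            g = quo
        else:
            p += 1
    return out

def alpha_beta(f_prime, max_deg):
    """Return the lists (alpha_d), (beta_d) for 0 <= d <= max_deg, with beta_0 = 1.

    f_prime(P) gives the value of f at an irreducible P; f is extended
    completely multiplicatively.
    """
    alpha = [0] * (max_deg + 1)
    beta = [0] * (max_deg + 1)
    beta[0] = 1
    for d in range(1, max_deg + 1):
        for g in range(1 << d, 1 << (d + 1)):    # all polynomials of degree d
            fac = factor(g)
            value = 1
            for p, exp in fac.items():
                value *= f_prime(p) ** exp
            beta[d] += value
            if len(fac) == 1:                    # g is a prime power, Lambda(g) = deg(p)
                (p, _), = fac.items()
                alpha[d] += (p.bit_length() - 1) * value
    return alpha, beta

def recursion_holds(alpha, beta):
    """Check d*beta_d == sum_{i=1}^{d} alpha_i*beta_{d-i} for all d >= 1."""
    return all(d * beta[d] == sum(alpha[i] * beta[d - i] for i in range(1, d + 1))
               for d in range(1, len(beta)))
```

Taking `f_prime = lambda P: -1` gives the analogue of the Liouville function, for which \(\beta _d\) begins \(1,-2,2,-4,4\); any other assignment of signs to the irreducibles satisfies the recursion as well.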

It was shown by Polymath 5 [16] that there exist a constant C and a completely multiplicative function \(f:\mathcal {M}\rightarrow \{-1,+1\}\) for which the corresponding \(\alpha _i\) satisfy \(|\alpha _i|<q^i\) for all \(i\ge C\) and for which \(0\le \sum _{0\le i\le d}\beta _i\le C\) for all \(d\ge 1\) (Polymath 5 stated their result in the form that if the size q of the field is large enough, then \(\sum _{0\le i\le d}\beta _i\in \{0,1\}\) for all d, but the same proof gives the claim above for all q). Since the \(\beta _i\) are completely determined by the \(\alpha _i\), this then means that \(\mathcal {D}_g\le C\) for any completely multiplicative g that produces the same sequence of \(\alpha _i\). From this we deduce that there are uncountably many choices of g: for each subset S of \(\mathbb {N}\cap [C+1,\infty )\), we may form a new completely multiplicative function \(f_S\) which is obtained from f by choosing for each \(d\in S\) two irreducibles \(P_{1,d}, P_{2,d}\) of degree d with \(f(P_{1,d})=-f(P_{2,d})\), putting \(f_S(P_{j,d}) = -f(P_{j,d})\) for \(j = 1,2\), and setting \(f_S(P)=f(P)\) at all other irreducibles P. The new function \(f_S\) has the same sequence of \(\alpha _i\) associated with it as f does, so it too has discrepancy bounded by C. \(\square \)