Abstract
With the aim of treating the local behaviour of additive functions, we develop analogues of the Matomäki–Radziwiłł theorem that allow us to approximate the average of a general additive function over a typical short interval in terms of a corresponding long average. As part of this treatment, we use a variant of the Matomäki–Radziwiłł theorem for divisor-bounded multiplicative functions recently proven in Mangerel (Divisor-bounded multiplicative functions in short intervals. arXiv:2108.11401). We consider two sets of applications of these methods. Our first application shows that for an additive function \(g: \mathbb {N} \rightarrow \mathbb {C}\), any non-trivial savings in the size of the average gap \(|g(n)-g(n-1) |\) implies that \(g\) must have a small first centred moment, i.e. the discrepancy of \(g(n)\) from its mean is small on average. We also obtain a variant of such a result for the second moment of the gaps. This complements results of Elliott and of Hildebrand. As a second application, we make partial progress on an old question of Erdős relating to characterizing constant multiples of \(\log n\) as the only almost everywhere increasing additive functions. We show that if an additive function is almost everywhere non-decreasing then it is almost everywhere well approximated by a constant times a logarithm. We also show that if the set \(\{n \in \mathbb {N} : g(n) < g(n-1)\}\) is sufficiently sparse, and if \(g\) is not extremely large too often on the primes (in a precise sense), then \(g\) is identically equal to a constant times a logarithm.
1 Introduction
An arithmetic function \(g: \mathbb {N} \rightarrow \mathbb {C}\) is called additive if, whenever \(n,m \in \mathbb {N}\) are coprime, \(g(nm) = g(n)+g(m)\); it is said to be completely additive if the coprimality condition on n, m can be ignored. Additive functions are objects of classical study in analytic and probabilistic number theory, their study being enriched by a close relationship with the probabilistic theory of random walks.
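As a concrete illustration of these definitions (not part of the original text; all helper names are ours), the following Python sketch verifies the additivity of \(\omega \) and the complete additivity of \(\Omega \) on small examples:

```python
from math import gcd

def factorize(n):
    """Prime factorization of n >= 2 as a dict {p: k} with n = prod of p**k."""
    factors = {}
    d = 2
    while d * d <= n:
        while n % d == 0:
            factors[d] = factors.get(d, 0) + 1
            n //= d
        d += 1
    if n > 1:
        factors[n] = factors.get(n, 0) + 1
    return factors

def big_omega(n):
    """Omega(n): prime factors with multiplicity; completely additive."""
    return sum(factorize(n).values())

def little_omega(n):
    """omega(n): distinct prime factors; additive but not completely additive."""
    return len(factorize(n))

# Complete additivity needs no coprimality assumption:
assert big_omega(12 * 18) == big_omega(12) + big_omega(18)
# Additivity of omega holds for coprime arguments only:
assert gcd(8, 9) == 1 and little_omega(8 * 9) == little_omega(8) + little_omega(9)
assert little_omega(2 * 2) != little_omega(2) + little_omega(2)
```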
Much is understood about the global behaviour of general additive functions. For instance, the orders of magnitude of all of the centred moments
have been computed by Hildebrand [1]. When \(k = 2\), the slightly weaker but generally sharp Turán–Kubilius inequality (see Lemma 3.2) gives an upper bound, uniform in g, of the form
where we have denoted by \(B_g(X)^2\) the approximate variance defined via
$$\begin{aligned} B_g(X)^2 := \sum _{p^k \le X} \frac{|g(p^k) |^2}{p^k}. \end{aligned}$$
When g is real-valued one can determine necessary and sufficient conditions according to which the distribution functions \(F_X(z) := \frac{1}{X}|\{n \le X : g(n) \le z\} |\) converge to a distribution function F as \(X \rightarrow \infty \); this is the content of the Erdős–Wintner theorem [2]. Under certain conditions the corresponding distribution functions (with suitable normalizations) converge to a Gaussian, a fundamental result of Erdős and Kac [3].
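The Erdős–Kac theorem can be illustrated numerically for \(g = \omega \). The sketch below (ours, for illustration only; convergence of the empirical mean to \(\log \log N\) plus a constant is very slow) sieves \(\omega (n)\) up to \(N\):

```python
import math

def omega_sieve(N):
    """Compute omega(n) (number of distinct prime factors) for all n <= N."""
    w = [0] * (N + 1)
    for p in range(2, N + 1):
        if w[p] == 0:  # no smaller prime divided p, so p is prime
            for m in range(p, N + 1, p):
                w[m] += 1
    return w

N = 10**5
w = omega_sieve(N)
mean = sum(w[2:]) / (N - 1)
# Erdős–Kac predicts (omega(n) - loglog n)/sqrt(loglog n) is asymptotically
# standard normal; here loglog(10^5) is about 2.44, and the empirical mean
# of omega(n) exceeds it by roughly a bounded constant.
```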
Much less is understood regarding the local behaviour of additive functions i.e. the simultaneous behaviour of g at neighbouring integers. Questions of interest from this perspective include
(i) the distribution of \(\{g(n)\}_n\) in typical short intervals \([x,x+H]\), where \(x \in [X,2X]\) and \(H = H(X)\) grows slowly,
(ii) the distribution of the sequence of gaps \(|g(n)-g(n-1) |\) between consecutive values, and
(iii) the distribution of tuples \((g(n+1),\ldots ,g(n+k))\), for \(k \ge 2\).
Pervasive within this scope are questions surrounding the characterization of those additive functions g whose local behaviour is rigid in some sense; such questions are discussed in Sect. 1.2.
The purpose of this paper is to consider questions of a local nature about general additive functions.
1.1 Matomäki–Radziwiłł type theorems for additive functions
The study of additive functions is intimately connected with that of multiplicative functions i.e. arithmetic functions \(f: \mathbb {N} \rightarrow \mathbb {C}\) such that \(f(nm) = f(n)f(m)\) whenever \((n,m) = 1\). The mean-value theory of bounded multiplicative functions, which provides tools for the analysis of the global behaviour of multiplicative functions, was developed in the ’60s and ’70s in the seminal works of Wirsing [4] and Halász [5].
In contrast, the study of the local behaviour of multiplicative functions has long been the source of intractable problems. An important example of this is Chowla’s conjecture [6]. This conjecture states, among other things, that for any \(k \ge 2\) and any tuple \(\varvec{\epsilon } \in \{-1,+1\}^k\), the set
has \((2^{-k}+o(1)) X\) elements, where \(\lambda \) is the Liouville function. In other terms, the sequence of tuples \((\lambda (n+1),\ldots ,\lambda (n+k))\) equidistributes among the tuples of signs in \(\{-1,+1\}^k\). The depth of this conjecture is revealed upon observing that when \(k = 1\), this corresponds to the statement that \(\lambda (n)\) takes the values \(+1\) and \(-1\) with asymptotically equal probability 1/2. This was shown by Landau [7] to be equivalent to the prime number theorem.
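The predicted equidistribution can be probed numerically (such computations, of course, prove nothing); a small sketch, with all names ours:

```python
from collections import Counter

def liouville(n):
    """lambda(n) = (-1)^Omega(n), with Omega counted with multiplicity."""
    count = 0
    d = 2
    while d * d <= n:
        while n % d == 0:
            count += 1
            n //= d
        d += 1
    if n > 1:
        count += 1
    return -1 if count % 2 else 1

N = 10**4
lam = [0] + [liouville(n) for n in range(1, N + 2)]  # lam[n] = lambda(n)
# Count sign patterns (lambda(n), lambda(n+1)); the k = 2 case of Chowla's
# conjecture predicts each of the four patterns occurs with frequency 1/4.
patterns = Counter((lam[n], lam[n + 1]) for n in range(1, N + 1))
assert sum(patterns.values()) == N and len(patterns) == 4
```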
Problems of this type have recently garnered significant interest, thanks to the celebrated theorems of Matomäki and Radziwiłł [8]. Broadly speaking, their results show that averages of a bounded multiplicative function in typical short intervals are well approximated by a corresponding long average. In a strong sense, this suggests that the local behaviour of many multiplicative functions is determined by their global behaviour. The simplest version of their theorems to state is as follows.
Theorem
(Matomäki–Radziwiłł [8]) Let \(f: \mathbb {N} \rightarrow [-1,1]\) be multiplicative. Let \(10 \le h \le X/100\). Then
This result, its natural extensions to complex-valued functions [9], and further improvements, extensions and variants (e.g. [10]) have had profound impacts not only in analytic number theory, but equally in combinatorics and dynamics. For instance, Tao [11] used this result to develop technology in order to obtain estimates for the logarithmically-averaged binary correlation sums
This was essential in his resolution of the Erdős discrepancy problem [12], and also enabled him to obtain a logarithmic density analogue of the case \(k = 2\) of Chowla's conjecture. It has also been pivotal in the various developments towards Sarnak's conjecture on the disjointness of the Liouville function from zero entropy dynamical systems (see [13] for a survey).
Our first main result establishes an \(\ell ^1\)-averaged comparison theorem for short and long averages of additive functions, inspired by the theorem of Matomäki and Radziwiłł.
Theorem 1.1
Let \(g: \mathbb {N} \rightarrow \mathbb {C}\) be an additive function. Let \(10 \le h \le X/100\) be an integer. Then
Remark 1.2
Theorem 1.1 should be compared to the “trivial bound” arising from applying the triangle inequality, the Cauchy–Schwarz inequality and (1) (which is valid for dyadic long averages as well) to obtain
In contrast, Theorem 1.1 gives the non-trivial bound \(o(B_g(X))\) whenever \(h = h(X) \rightarrow \infty \) as \(X \rightarrow \infty \).
To get a more precise additive function analogue of the Matomäki–Radziwiłł theorem, one would hope to obtain a mean square (or \(\ell ^2\)) version of Theorem 1.1. We are limited in this matter by the possibility of very large values of g. Specifically, if \(\vert g(p) \vert /B_g(X)\) can get very large for many primes \(p \le X\), it is possible for the \(\ell ^2\) average to be dominated by a sparse set (i.e. the multiples of these p), wherein the discrepancy between the long and short sums is not small. We will thus work with a specific collection of additive functions in order to preclude such pathological behaviour.
To describe this collection we introduce the following notations. Given \(\varepsilon > 0\) and an additive function g, we define
Roughly speaking, \(F_g(\varepsilon )\) measures the contribution to \(B_g(X)^2\) from prime values g(p) of very large absolute value.
Clearly, \(0 \le F_g(\varepsilon ) \le 1\) for all \(\varepsilon > 0\) and all additive functions g. We will concern ourselves with functions g such that \(F_g(\varepsilon ) \rightarrow 0\) as \(\varepsilon \rightarrow 0^+\), a condition that is satisfied by many additive functions. When g is bounded on the primes, e.g. when \(g(n) = \Omega (n)\), the number of prime factors of n counted with multiplicity, it is clear that \(F_g(\varepsilon ) = 0\) whenever \(\varepsilon \) is sufficiently small. For a different example, taking \(g = c\log \) for some \(c \in \mathbb {C}\) we find \(B_g(X) \sim \frac{|c |}{\sqrt{2}}\log X\), so that \(|g(p) |\le (\sqrt{2} + o(1)) B_g(X)\) for all primes p and hence \(F_g(\varepsilon ) = 0\) for all \(\varepsilon < 1/2\), say.
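The asymptotic \(B_{c\log }(X) \sim \frac{|c |}{\sqrt{2}}\log X\) quoted above can be sketched as follows, assuming the standard definition \(B_g(X)^2 = \sum _{p^k \le X} |g(p^k) |^2 p^{-k}\):
$$\begin{aligned} B_{c\log }(X)^2 = \sum _{p^k \le X} \frac{|c |^2 (\log p^k)^2}{p^k} = |c |^2 \sum _{p \le X} \frac{(\log p)^2}{p} + O(|c |^2) \sim \frac{|c |^2}{2} (\log X)^2, \end{aligned}$$
since partial summation applied to Mertens' estimate \(\sum _{p \le t} \frac{\log p}{p} = \log t + O(1)\) gives \(\sum _{p \le X} \frac{(\log p)^2}{p} \sim \frac{1}{2}(\log X)^2\), while the prime powers \(p^k\) with \(k \ge 2\) contribute \(\sum _p \sum _{k \ge 2} k^2 (\log p)^2 p^{-k} \ll \sum _p (\log p)^2 p^{-2} \ll 1\).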
Definition 1.3
We define the collection \(\mathcal {A}\) to be the set of those additive functions \(g : \mathbb {N} \rightarrow \mathbb {C}\) such that
(a) \(B_g(X) \rightarrow \infty \), and
(b) \(B_g(X)\) is dominated by the prime values \(|g(p) |\), in the sense that
$$\begin{aligned} \limsup _{X \rightarrow \infty } \frac{1}{B_g(X)^2} \sum _{\begin{array}{c} p^k \le X \\ k \ge 2 \end{array}} \frac{|g(p^k) |^2}{p^k} = 0. \end{aligned}$$
We shall see below (see Lemma 3.6(a)) that \(\mathcal {A}\) contains all completely additive and all strongly additive functions g with \(B_g(X) \rightarrow \infty \). Within \(\mathcal {A}\) we define
Thus, among other examples, \(\Omega (n), \omega (n) := \sum _{p\mid n} 1\) and, for any \(c \in \mathbb {C}\), \(c\log \) all belong to \(\mathcal {A}_s\). We show in general that whenever \(g \in \mathcal {A}_s\), we may obtain an \(\ell ^2\) analogue of Theorem 1.1.
Theorem 1.4
Let \(g: \mathbb {N} \rightarrow \mathbb {C}\) be an additive function in \(\mathcal {A}_s\). Let \(10 \le h \le X/100\) be an integer with \(h = h(X) \rightarrow \infty \). Then
Our proof of Theorem 1.4 relies on a variant of the Matomäki–Radziwiłł theorem that applies to a large collection of divisor-bounded multiplicative functions, proven in the recent paper [14]. See Theorem 5.3 for a statement relevant to the current circumstances.
Remark 1.5
The rate of decay in this result depends implicitly on the rate at which \(F_g(\varepsilon ) \rightarrow 0\) as \(\varepsilon \rightarrow 0^+\), and on the size of the contribution to \(B_g(X)\) from the prime power values of g. We have therefore chosen to state the theorem in this qualitative form for the sake of simplicity.
It deserves mention that the application of the Matomäki–Radziwiłł method used in this paper to the study of specific additive functions is not entirely new. Goudout [15, 16] applied this technique to derive distributional information about \(\omega (n)\) in typical short intervals; for example, he proved in [15] that the Erdős–Kac theorem holds in short intervals \((x-h,x]\) for almost all \(x \in [X/2,X]\), as long as \(h = h(X) \rightarrow \infty \). The specific novelty of Theorems 1.1 and 1.4 lies in their generality, and it is this aspect which will be used in the applications to follow.
1.2 Applications: gaps and rigidity problems for additive functions
Given \(c \in \mathbb {C}\), the arithmetic function \(n \mapsto c \log n\) is completely additive. In contrast to a typical additive function g, whose values g(n) depend on the prime factorization of n, which might vary wildly from one integer to the next, \(c\log \) varies slowly and smoothly, with very small gaps
$$\begin{aligned} |c\log n - c\log (n-1) |\ll \frac{|c |}{n}. \end{aligned}$$
In the seminal paper [17], Erdős studied various characterization problems for real- and complex-valued additive functions relating to their local behaviour, and in so doing found several characterizations of the logarithm as an additive function. Among a number of results, he showed that if either
(a) \(g(n+1) \ge g(n)\) for all \(n \in \mathbb {N}\), or
(b) \(g(n+1)-g(n) = o(1)\) as \(n \rightarrow \infty \),
then there exists \(c \in \mathbb {R}\) such that \(g(n) = c\log n\) for all \(n \ge 1\).
Moreover, Erdős and later authors posited that these hypotheses could be relaxed. Kátai [18] and independently Wirsing [19] weakened assumption (b), and proved the above result under the averaged assumption
$$\begin{aligned} \frac{1}{X}\sum _{n \le X} |g(n+1)-g(n) |= o(1) \quad \text {as } X \rightarrow \infty . \end{aligned}$$
Hildebrand [20] showed the stronger conjecture of Erdős that if \(g(n_k+1)-g(n_k) \rightarrow 0\) on a set \(\{n_k\}_k\) of density 1 then \(g = c \log \); this, of course, is an almost sure version of (b).
In a different direction, Wirsing [21] showed that for completely additive functions g, (b) may be weakened to \(g(n+1)-g(n) = o(\log n)\) as \(n \rightarrow \infty \), and this is best possible.
A number of these results were strengthened and generalized by Elliott [22, Ch. 11], in particular to handle functions g with small gaps \(|g(an+b)-g(An+B) |\), for independent linear forms \(n \mapsto an+b\) and \(n \mapsto An+B\) (i.e. such that \(aB - Ab \ne 0\)).
Characterization problems of these kinds for both additive and multiplicative functions have continued to garner interest more recently. In [23], Klurman proved a long-standing conjecture of Kátai, showing that if a unimodular multiplicative function \(f: \mathbb {N} \rightarrow S^1\) has gaps satisfying \(|f(n+1)-f(n) |\rightarrow 0\) on average then there is a \(t \in \mathbb {R}\) such that \(f(n) = n^{it}\) for all n. In a later work, Klurman and the author [24] proved a conjecture of Chudakov from the ’50s characterizing completely multiplicative functions having uniformly bounded partial sums. See Kátai’s survey paper [25] for numerous prior works in this direction for both additive and multiplicative functions.
While these multiplicative results have consequences for additive functions, they are typically limited by the fact that if g is a real-valued additive function then the multiplicative function \(e^{2\pi i g}\) is only sensitive to the values \(g(n) \pmod {1}\). In particular, considerations about e.g. the monotone behaviour of g cannot be directly addressed by appealing to corresponding results for multiplicative functions.
1.2.1 Erdős’ conjecture for almost everywhere monotone additive functions
One still-open problem stated in [17] concerns the almost sure variant of problem (a) above. For convenience, given an additive function \(g: \mathbb {N} \rightarrow \mathbb {R}\) we set \(g(0) := 0\) and define the set of decrease of g:
$$\begin{aligned} \mathcal {B} := \{n \in \mathbb {N} : g(n) < g(n-1)\}. \end{aligned}$$
Conjecture 1.6
[17] Let \(g: \mathbb {N} \rightarrow \mathbb {R}\) be an additive function such that
$$\begin{aligned} |\mathcal {B}(X) |:= |\mathcal {B} \cap [1,X] |= o(X) \quad \text {as } X \rightarrow \infty . \end{aligned}$$
Then there exists \(c \in \mathbb {R}\) such that \(g(n) = c\log n\) for all \(n \in \mathbb {N}\).
Thus, if g is non-decreasing except on a set of integers of natural density 0 then it is conjectured that g must be a constant times a logarithm.
Condition (3) is necessary, as for any \(\varepsilon > 0\) one can construct a function g, not a constant multiple of \(\log n\), which is monotone except on a set of density at most \(\varepsilon \). Indeed, picking a prime \(p_0 > 1/\varepsilon \) and defining \(g = g_{p_0}\) to be the completely additive function defined at primes by
one finds that \(g_{p_0}(n) = \log n\) if and only if \(p_0 \not \mid n\), and that \(\mathcal {B} = \{mp_0 + 1: m \in \mathbb {N}\}\). It is easily checked that the density \(d\mathcal {B}\) of \(\mathcal {B}\) satisfies \(0< d\mathcal {B} = 1/p_0 < \varepsilon \).
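The construction above is easy to check numerically. The displayed definition of \(g_{p_0}\) at \(p_0\) is not reproduced in this excerpt, so the sketch below assumes \(g_{p_0}(p) = \log p\) for \(p \ne p_0\) and \(g_{p_0}(p_0) = 2\log p_0\) (any value exceeding \(\log p_0\) yields the stated behaviour):

```python
import math

P0 = 7  # the chosen prime p_0 > 1/epsilon

def v_p(n, p):
    """p-adic valuation of n."""
    k = 0
    while n % p == 0:
        n //= p
        k += 1
    return k

def g(n):
    # Completely additive, with g(p) = log p for p != P0 and (our assumption,
    # since the displayed definition is not reproduced here) g(P0) = 2*log(P0);
    # hence g(n) = log n + v_{P0}(n) * log(P0).
    return math.log(n) + v_p(n, P0) * math.log(P0)

N = 10**4
# g(n) = log n exactly when P0 does not divide n:
assert all((abs(g(n) - math.log(n)) < 1e-12) == (n % P0 != 0) for n in range(1, N))
# The set of decrease {n : g(n) < g(n-1)} is {m*P0 + 1 : m >= 1}, of density 1/P0:
decrease = {n for n in range(2, N) if g(n) < g(n - 1)}
assert decrease == {m * P0 + 1 for m in range(1, (N - 2) // P0 + 1)}
```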
As a consequence of our results on short interval averages of additive functions, we will prove the following partial result towards Erdős’ conjecture.
Corollary 1.7
Let \(g: \mathbb {N} \rightarrow \mathbb {R}\) be a completely additive function that satisfies
Assume furthermore that there is a \(\delta > 0\) such that
Then there is a constant \(c \in \mathbb {R}\) such that \(g(n) = c\log n\) for all \(n \in \mathbb {N}\).
The above corollary reflects the fact that the main difficulties involved in fully resolving Conjecture 1.6 are
(i) the possible lack of sparseness of \(\mathcal {B}\) beyond \(|\mathcal {B}(X) |= o(X)\), and
(ii) the possibility of very large values \(|g(p) |\).
More generally, we show that any function \(g \in \mathcal {A}_s\) that satisfies \(|\mathcal {B}(X) |= o(X)\) is close to a constant multiple of a logarithm at prime powers.
Theorem 1.8
Let \(g:\mathbb {N} \rightarrow \mathbb {R}\) be an additive function belonging to \(\mathcal {A}_s\), and suppose \(|\mathcal {B}(X) |= o(X)\). Let \(X \ge 10\) be large. Then there is \(\lambda = \lambda (X)\) with \(|\lambda (X) |\ll B_g(X)/\log X\) such that
Moreover, \(\lambda \) is slowly varying as a function of X in the sense that for every fixed \(0 < u \le 1\),
Finally, using a result of Elliott [26], we will prove the following approximate version of Erdős’ conjecture under weaker conditions than in Corollary 1.7.
Theorem 1.9
Let \(g: \mathbb {N} \rightarrow \mathbb {R}\) be an additive function, such that \(|\mathcal {B}(X) |= o(X)\). Then there are parameters \(\lambda = \lambda (X)\) and \(\eta = \eta (X)\) such that for all but o(X) integers \(n \le X\),
The functions \(\lambda ,\eta \) are slowly varying in the sense that for any \(u \in (0,1)\) fixed,
Remark 1.10
Note that if we knew (5) held for all three of \(n,m,nm \in [1,X]\) then we could deduce that
and thus that \(\eta = o(B_g(X))\). As such, (5) would be valid with \(\eta \equiv 0\). Unfortunately, we are not able to confirm this unconditionally.
1.2.2 On Elliott’s property of gaps
Gap statistics provide an important example of local properties of a sequence. Obviously, an additive function g whose values g(n) are globally close to g's mean value must have small gaps \(|g(n+1) - g(n) |\). Conversely, it was observed by Elliott that the growth of the gaps between consecutive values of g also controls the typical discrepancy of g(n) from its mean.
More precisely, given an additive function \(g: \mathbb {N} \rightarrow \mathbb {C}\) and \(X \ge 2\), define
$$\begin{aligned} A_g(X) := \sum _{p^k \le X} \frac{g(p^k)}{p^k}\left( 1-\frac{1}{p}\right) . \end{aligned}$$
It is well known (see e.g. Lemma 3.1) that as \(X \rightarrow \infty \), \(A_g(X)\) is the asymptotic mean value of \(\{g(n)\}_{n \le X}\). Elliott showed the following estimate relating the average deviations \(\vert g(n)-A_g(X) \vert \) to the average gaps \(\vert g(n)-g(n-1) \vert \).
Theorem
[22, Thm. 10.1] There is an absolute constant \(c > 0\) such that for any additive function \(g: \mathbb {N} \rightarrow \mathbb {C}\) one has
Elliott’s result shows that if g has exceedingly small gaps on average, even at scales that grow polynomially in X, then g must globally be very close to its mean.
The drawback of this result is that it is in principle possible for the upper bound to be trivial even if the gaps \(|g(n) - g(n-1) |\), \(n \le X\), are \(o(B_g(X))\) on average, as long as the average savings over \(n \le X^c\) is not large enough to offset the difference in size between \(B_g(X)\) and \(B_g(X^c)\).
In Sect. 6, we obtain two results that complement Elliott’s. The first shows that for any additive function g, any savings in the \(\ell ^1\)-averaged moment of \(|g(n)-g(n-1) |\) provides a savings over the trivial bound for the first centred moment. The second, which holds whenever \(g \in \mathcal {A}_s\), gives the same type of information as the first but in an \(\ell ^2\) sense.
Theorem 1.11
Let \(g: \mathbb {N} \rightarrow \mathbb {C}\) be an additive function.
(a) The following are equivalent:
$$\begin{aligned}&\frac{1}{X} \sum _{n \le X} |g(n)-g(n-1) |= o(B_g(X)),&\frac{1}{X} \sum _{n\le X} |g(n) - A_g(X) |= o(B_g(X)). \end{aligned}$$
(b) Assume furthermore that \(g \in \mathcal {A}_s\). Then the following are equivalent:
$$\begin{aligned}&\frac{1}{X} \sum _{n \le X} |g(n)-g(n-1) |^2 = o(B_g(X)^2), \\&\frac{1}{X} \sum _{n \le X} |g(n) - A_g(X) |^2 = o(B_g(X)^2). \end{aligned}$$
See Proposition 6.1, where an explicit dependence between the rates of decay of the gap average and the first centred moment in Theorem 1.11(a) is given as a consequence of Theorem 1.1.
As a corollary of Theorem 1.11(b) and a second moment estimate of Ruzsa (see Lemma 3.3), we will deduce the following.
Corollary 1.12
Let \(g \in \mathcal {A}_s\) be an additive function. Assume that
Then there is a function \(\lambda = \lambda (X)\) such that as \(X \rightarrow \infty \),
Remark 1.13
Even in the weak sense of Theorem 1.11 and even when g takes bounded values at primes, it can be seen that having small gaps on average is a very special property. As a simple example, \(g = \omega \), for which \(B_{\omega }(X)^2 \sim \log \log X\), satisfies
since by a bivariate version of the Erdős–Kac theorem (see e.g. [28]) one can find a positive proportion of integers \(n \in [X/2,X]\) such that, simultaneously,
In fact, as Corollary 1.12 shows, if \(g \in \mathcal {A}_s\) has a small \(\ell ^2\) average gap then g must behave like \(\lambda (X) \log \) on average over prime powers \(p^k \le X\).
2 Proof ideas
In this section, we will explain the principal ideas that inform the proofs of our main theorems.
2.1 On the Matomäki–Radziwiłł type theorems
In Theorems 1.1 and 1.4, our objective is to estimate the averaged deviations
where \(k \in \{1,2\}\), and \(10 \le h \le X/10\) with \(h \in \mathbb {Z}\). Though our result applies to any complex-valued additive function g, by considering first \(\text {Re}(g)\) and \(\text {Im}(g)\) separately it is always possible to restrict to \(g(n) \in \mathbb {R}\) for all n, which we shall assume henceforth.
The key idea underlying the results for both \(k = 1,2\) involves the fact that for \(n \in \mathbb {N}\) and \(z \in \mathbb {C} \backslash \{0\}\) the function \(n \mapsto z^{g(n)}\) is multiplicative in the n-aspect and analytic in the z-aspect. In the case of Theorem 1.1, for \(t \in \mathbb {R}\) the corresponding function \(G_t(n) := e^{2\pi i t g(n)}\) takes values on the unit circle \(S^1\). Moreover, by replacing \(G_t(n)\) by its constant (in n) multiple \({\tilde{G}}_t(n) := e^{2\pi i t(g(n)-A_g(X))}\) (see (6) for the definition of \(A_g\)), we see that for \(r = 1,2\),
Taylor expanding \({\tilde{G}}_t(m) = G_t(m)e^{-2\pi i t A_g(X)}\) to second order around \(t = 0\) for each m leads to
As the sum of \(g(m)-A_g(X)\) over a medium-length interval \((n-h^{*},n]\), where \(h^{*} = X/(\log X)^c\) for a small constant \(c > 0\), is well approximated by the sum over [X/2, X] (see Lemma 4.1), it suffices to compare the short averages over \((n-h,n]\) to those over \((n-h^{*},n]\). Using the Turán–Kubilius inequality to treat the integral error term in (8), the above allows us to approximate, for t close to 0, the average in (7) with \(k = 1\) by the corresponding average
where now our summands are, crucially, values of a bounded multiplicative function. After passing to the mean square by the Cauchy–Schwarz inequality, we may estimate these averages using the work of Matomäki and Radziwiłł [10] (and their joint work with Tao [8]), along with some additional ideas from pretentious number theory relating to the possible correlations of \(G_t(n)\) with the so-called Archimedean characters \(n^{i\lambda }\) for \(\lambda \in \mathbb {R}\).
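The second-order expansion invoked above can be made explicit. Writing \(w_m := g(m)-A_g(X)\), Taylor's formula with integral remainder gives (a sketch of the shape of (8), whose exact form is not reproduced in this excerpt):
$$\begin{aligned} {\tilde{G}}_t(m) = e^{2\pi i t w_m} = 1 + 2\pi i t\, w_m - 4\pi ^2 w_m^2 \int _0^t (t-u)\, e^{2\pi i u w_m}\, du. \end{aligned}$$
The remainder is \(\ll t^2 w_m^2\), so after averaging over m the Turán–Kubilius inequality bounds its contribution by \(O(t^2 B_g(X)^2)\); squaring it first, as the case \(k = 2\) would require, leads instead to a fourth moment of \(w_m\).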
The above strategy fails to work in the case \(k = 2\) for the important reason that the integral error term in (8), when squared and then averaged over n, cannot be controlled by an \(\ell ^2\) moment of \(g(n)-A_g(X)\), but rather only by an \(\ell ^4\) moment. This can be far larger than \(B_g(X)^4\), especially if g takes irregularly large values on prime powers.
In place of the Taylor approximation argument given above, we instead use Cauchy's integral formula to obtain an expression for short averages of g without an error term, namely, for \(\rho \in (0,1)\),
Though this manoeuvre eliminates the problematic error term while still bringing multiplicative functions into play, it introduces a different issue: the path of integration intersects the region \(\vert z \vert > 1\), and any point in that region yields a function \(n\mapsto z^{g(n)}\) that is unbounded whenever g takes unbounded, positive values, say.
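The contour formula itself is not reproduced in this excerpt, but an identity of the following shape underlies the step: since \(z \mapsto z^w\) is analytic near \(z = 1\) with \(z^w = 1 + w(z-1) + O_w(|z-1 |^2)\), Cauchy's integral formula gives, for any \(w \in \mathbb {R}\) and \(\rho \in (0,1)\),
$$\begin{aligned} w = \frac{1}{2\pi i} \oint _{|z-1 |= \rho } \frac{z^w}{(z-1)^2}\, dz. \end{aligned}$$
Taking \(w = g(n)\) and averaging over a short interval expresses short sums of g in terms of the functions \(n \mapsto z^{g(n)}\), which are multiplicative in n; note that the contour \(|z-1 |= \rho \) indeed contains points with \(|z |> 1\).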
While this issue prevents us from obtaining an \(\ell ^2\) result for arbitrary additive functions g, we may still succeed if we impose restrictions on the growth of g. Indeed, as shown in [14], the work of Matomäki and Radziwiłł can be generalized to cover certain collections of unbounded multiplicative functions of controlled growth. This includes most natural multiplicative functions f that are uniformly bounded on the primes and not too large on average at prime powers. Assuming the hypothesis \(g \in \mathcal {A}_s\) and modifying g on a small set of prime powers, it can be shown that the resulting multiplicative function \(z^{g(n)/B_g(X)}\) satisfies the necessary hypotheses for the generalization of the Matomäki–Radziwiłł theorem in [14] to be applicable, which is crucial to the proof of Theorem 1.4.
2.2 On gaps between consecutive values of additive functions
Theorem 1.11 establishes that for suitable additive functions g, having a small kth moment of gaps is equivalent to having a small kth centred moment, for \(k \in \{1,2\}\). Since the proof follows similar lines in each of the cases \(k = 1,2\), we will confine ourselves mainly to explaining the case \(k = 1\) here.
By the triangle inequality,
$$\begin{aligned} |g(n)-g(n-1) |\le |g(n)-A_g(X) |+ |g(n-1)-A_g(X) |, \end{aligned}$$
which implies that if the first centred moment is \(o(B_g(X))\) then the average gap is also \(o(B_g(X))\).
The converse is more delicate. The main idea here is to note that if \(h = h(X)\) is slowly growing then, as \(h < n \le X\) varies, the average gap \(\vert g(n)-g(n-1) \vert \) controls the size of typical differences between g(n) and its length h averages:
$$\begin{aligned} \Big |g(n) - \frac{1}{h}\sum _{n-h < m \le n} g(m)\Big |\le \frac{1}{h}\sum _{n-h< m \le n} \sum _{m < r \le n} |g(r)-g(r-1) |\le \sum _{n-h < m \le n} |g(m)-g(m-1) |. \end{aligned}$$
Thus, if we assume that g has gaps \(\vert g(n)-g(n-1) \vert \) of size \(o(B_g(X))\) on average, then (by selecting h growing sufficiently slowly) the left-hand side of (9) will also typically be small. Now, Theorem 1.1 allows us to conclude that for almost all \(n \in [X,2X]\),
$$\begin{aligned} \frac{1}{h}\sum _{n-h < m \le n} g(m) = A_g(X) + o(B_g(X)), \end{aligned}$$
and in this way we deduce that \(\vert g(n)-A_g(X) \vert \) is also \(o(B_g(X))\) on average.
The corresponding result comparing the second moments is analogous, but relies on our Theorem 1.4 instead of Theorem 1.1. For this reason, we must assume that \(g \in \mathcal {A}_s\) in Theorem 1.11(b).
2.3 On the Erdős monotonicity problem
Our application to Erdős' problem, Conjecture 1.6, was the original motivation for this paper. The connection between our short interval average results and this conjecture arose from the observation that if g is a real-valued additive function that is non-decreasing outside of a set \(\mathcal {B}\) of density 0, then the average of the gaps of \(\{g(n)\}_n\) is nearly a telescoping sum, that is,
$$\begin{aligned} \frac{1}{X}\sum _{n \le X} |g(n)-g(n-1) |= \frac{g(\lfloor X \rfloor )}{X} + \frac{2}{X}\sum _{n \in \mathcal {B}(X)} \big (g(n-1)-g(n)\big ), \end{aligned}$$
since \(g(0) = 0\) by definition. It can be shown (see Lemma 3.5) that \(\vert g(\lfloor X \rfloor ) \vert /X = o(B_g(X))\); via the Cauchy–Schwarz inequality, the sparseness of \(\mathcal {B}\) results in the second expression also being \(o(B_g(X))\). By Theorem 1.11(a), which, as just discussed, is a consequence of Theorem 1.1, the first centred moment thus also satisfies \(\tfrac{1}{X}\sum _{n \le X} \vert g(n)-A_g(X) \vert = o(B_g(X))\).
A classical second moment estimate of Ruzsa (see Lemma 3.3) shows that if, instead, we could obtain savings over \(O(B_g(X)^2)\) for the second centred moment \(\tfrac{1}{X}\sum _{n \le X} \vert g(n)-A_g(X) \vert ^2\), then we could conclude the existence of a slowly-varying function \(\lambda = \lambda (X)\) such that \(g_{\lambda } = g-\lambda \log \) takes smaller values on average over prime powers than g does. That is, \(\lambda \log n\) approximates g(n) in a precise sense. Achieving such savings in the second centred moment is the objective of the proof of Theorem 1.8.
In analogy to the treatment of the first moment of the gaps in (10), the bulk of the work towards Theorem 1.8 involves obtaining savings over \(B_g(X)^2\) on sparsely-supported \(\ell ^2\) sums of the shape
where \(\mathcal {S} \subset \mathbb {N}\) and \(\mathcal {S}(X) := \mathcal {S} \cap [1,X]\) satisfies \(\vert \mathcal {S}(X)\vert = o(X)\), as \(X \rightarrow \infty \). Having no recourse to Hölder’s inequality for savings in \(\ell ^2\), we instead use the large sieve (see Proposition 7.1), together with some ideas due to Elliott, to show that either this sparse average is \(o(B_g(X)^2)\), or else \(\mathcal {S}\) contains many multiples of a sparse set of primes p where \(\vert g(p) \vert \) is extremely large (in a precise sense). As \(g \in \mathcal {A}_s\), this latter set is provably empty, and consequently we obtain the required savings. It would be interesting to understand whether a similar conclusion could be obtained under weaker conditions on g.
The slow variation of \(\lambda \) i.e. \(\lambda (X^u) = (1+o(1)) \lambda (X)\) for fixed \(0 < u \le 1\) is a key property that we exploit in the proof of Corollary 1.7. Though we do not need to directly invoke the general theory of slowly-varying functions due to Karamata (see e.g. [29, Ch. 1]), his representation theorem informs our proof that \(\lambda \) is slowly growing in X i.e. \(\lambda (X) \in [(\log X)^{-\varepsilon }, (\log X)^{\varepsilon }]\) for any \(\varepsilon > 0\) (see Lemma 8.6). Given that, provably, \(B_g(X) \asymp \lambda (X) \log X\) here, we find that \(B_g(X) = (\log X)^{1+o(1)}\). For reference, as noted above we have \(B_g(X) \sim \frac{\vert c \vert }{\sqrt{2}} \log X\) whenever \(g = c \log \).
Corollary 1.7 follows readily from this conclusion, since if \(\vert \mathcal {B}(X) \vert \ll X/(\log X)^{2+\delta }\) for some \(\delta > 0\), then by Cauchy–Schwarz we have
$$\begin{aligned} \frac{1}{X}\sum _{n \in \mathcal {B}(X)} |g(n)-g(n-1) |\le \frac{|\mathcal {B}(X) |^{1/2}}{X}\Big (\sum _{n \le X} |g(n)-g(n-1) |^2\Big )^{1/2} \ll B_g(X)\sqrt{\frac{|\mathcal {B}(X) |}{X}} \ll \frac{B_g(X)}{(\log X)^{1+\delta /2}} = o(1), \end{aligned}$$
the second moment of the gaps being \(\ll X B_g(X)^2\) by the Turán–Kubilius inequality, and \(B_g(X) = (\log X)^{1+o(1)}\) as noted above.
Since \(g(\lfloor X \rfloor )/X = o(B_g(X)/(\log X)^2) = o(1)\), the right-hand side in (10) is thus o(1), and so the Kátai–Wirsing theorem mentioned in the introduction (see also Theorem 3.7 for a statement) implies that \(g = c\log \) exactly, for some \(c \in \mathbb {R}\). Without this additional sparseness assumption on \(\mathcal {B}\), however, it is not clear how to proceed further. It would be interesting to obtain the bound \(\tfrac{1}{X}\sum _{n \le X} \vert g(n)-g(n-1) \vert = o(1)\), even assuming \(g \in \mathcal {A}_s\), under weaker hypotheses on the rate of decay of \(\vert \mathcal {B}(X) \vert /X\), or perhaps assuming to begin with that \(B_g(X) = (\log X)^{1+o(1)}\).
3 Auxiliary lemmas
In this section, we record several results that will be used repeatedly in the sequel. For the convenience of the reader, we recall that for an additive function \(g: \mathbb {N} \rightarrow \mathbb {C}\) and \(X \ge 2\),
$$\begin{aligned} A_g(X) := \sum _{p^k \le X} \frac{g(p^k)}{p^k}\left( 1-\frac{1}{p}\right) , \qquad B_g(X)^2 := \sum _{p^k \le X} \frac{|g(p^k) |^2}{p^k}. \end{aligned}$$
Lemma 3.1
Let \(g: \mathbb {N} \rightarrow \mathbb {C}\) be additive. Then for any \(Y \ge 3\),
Proof
As \(g(n) = \sum _{p^k \Vert n} g(p^k)\), we have
using the Cauchy–Schwarz inequality and Chebyshev’s estimate \(\pi (Y) \ll Y/\log Y\) in the last two steps. \(\square \)
Lemma 3.2
(Turán–Kubilius Inequality) Let \(X \ge 3\). Uniformly over all additive functions \(g: \mathbb {N} \rightarrow \mathbb {C}\),
Proof
This is e.g. [22, Lem. 1.5] (taking \(\sigma = 0\)). \(\square \)
The following estimate due to Ruzsa, which sharpens the Turán–Kubilius inequality, gives an order of magnitude estimate for the second centred moment of a general additive function.
Lemma 3.3
[30] Let \(g: \mathbb {N} \rightarrow \mathbb {C}\) be an additive function. Then
where for \(\lambda \in \mathbb {R}\) and \(n \in \mathbb {N}\) we set \(g_{\lambda }(n) := g(n)-\lambda \log n\), and \(\lambda _0 = \lambda _0(X)\) is given by
Lemma 3.4
Let \(g: \mathbb {N} \rightarrow \mathbb {C}\) be additive, and let \(z \ge y \ge 2\). Then
In particular, if \(y \in (z/2,z]\) then
Proof
By Mertens’ theorem,
The second claim follows immediately from this. \(\square \)
Lemma 3.5
Let \(X\ge 3\) and let \(n \in (X/2,X]\). Then \(\frac{|g(n) |}{n} \ll \frac{B_g(X)\log X}{\sqrt{X}}\).
Proof
Observe that whenever \(p^k \le X\) we have \(\vert g(p^k) \vert /p^{k/2} \le B_g(X)\). It follows from the triangle inequality and the bound \(\omega (n) \ll \log n\) for all \(n \ge 2\) that
as claimed. \(\square \)
Working within the collection \(\mathcal {A}\) (see Definition 1.3), the following properties will be useful.
Lemma 3.6
(a) Let \(g: \mathbb {N} \rightarrow \mathbb {C}\) be an additive function satisfying \(B_g(X) \rightarrow \infty \). If g is either completely or strongly additive then \(g \in \mathcal {A}\).
(b) Let \(g \in \mathcal {A}\). Then there is a strongly additive function \(g^{*}\) such that \(g(p) = g^{*}(p)\) for all primes p, and \(B_{g-g^{*}}(X) = o(B_g(X))\) as \(X \rightarrow \infty \).
Proof
(a) Let g be either strongly or completely additive. We put \(\theta _g := 1\) if g is completely additive, and \(\theta _g := 0\) otherwise. Then \(g(p^k) = k^{\theta _g}g(p)\) for any prime power \(p^k\), and thus
Since \(B_g(X) \rightarrow \infty \), choosing \(M = M(X)\) tending to infinity arbitrarily slowly we see that
It follows that \(g \in \mathcal {A}\), as required.
(b) We define \(g^{*}\) to be an additive function defined by \(g^{*}(p^k) := g(p)\) for all primes p and \(k \ge 1\). Thus, \(g^{*}\) is strongly additive. Moreover, if \((g-g^{*})(p^k) \ne 0\) then \(k \ge 2\), for any p. By assumption and part (a), \(g,g^{*} \in \mathcal {A}\), and thus
for both \(h = g\) and \(h = g^{*}\). By the Cauchy–Schwarz inequality,
as required. \(\square \)
Finally, we record the characterization result of Kátai and Wirsing, mentioned in the introduction.
Theorem 3.7
[18, 21] Let \(g: \mathbb {N} \rightarrow \mathbb {C}\) be an additive function such that as \(X \rightarrow \infty \),
Then there is \(c \in \mathbb {C}\) such that \(g(n) = c\log n\) for all \(n \in \mathbb {N}\).
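The dichotomy in Theorem 3.7 can be observed numerically: for \(g = \log \) the average consecutive gap tends to 0 (it telescopes to \((\log X)/X\)), while for the additive function \(g = \omega \) it stays bounded away from 0. This is an illustration only; the scale \(X = 10^5\) is arbitrary.

```python
import math

X = 10**5
omega = [0] * (X + 1)
for p in range(2, X + 1):
    if omega[p] == 0:               # p is prime
        for m in range(p, X + 1, p):
            omega[m] += 1

# average consecutive gaps for the two additive functions
gap_log = sum(math.log(n) - math.log(n - 1) for n in range(2, X + 1)) / X
gap_omega = sum(abs(omega[n] - omega[n - 1]) for n in range(2, X + 1)) / X
```

The first average is about \(10^{-4}\); the second is of order 1, so \(\omega \) fails the Kátai–Wirsing criterion, consistent with the fact that it is not a constant multiple of a logarithm.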
4 The Matomäki–Radziwiłł theorem for additive functions: \(\ell ^1\) variant
In this section, we prove Theorem 1.1.
We begin with the following simple observation, amounting to the fact that the mean value of an additive function changes little when passing from a long interval of length \(\asymp X\) to a medium-sized one of length \(X/(\log X)^{c}\), for \(c > 0\) sufficiently small.
Lemma 4.1
Let \(g: \mathbb {N} \rightarrow \mathbb {C}\) be additive and let X be large. Let \(X/2 < x \le X\), and let \(X/(\log X)^{1/3} \le h \le X/3\). Then
Proof
Applying Lemma 3.1 with \(Y = X/2,X,x-h\) and x, we obtain
Since \(h \ge X/(\log X)^{1/3}\), the error term in the second line is \(\ll \frac{B_g(X)}{(\log X)^{1/6}}\). By Lemma 3.4,
for the main term in the first equation, and also
so that by a second application of Lemma 3.4,
Combining these estimates, we may conclude that
as claimed. \(\square \)
In light of the above lemma, to prove Theorem 1.1 it suffices to prove the following: if \(h' = X/(\log X)^{1/3}\) and \(10 \le h \le h'\) then
Splitting \(g = \text {Re}(g) + i \text {Im}(g)\), and noting that both \(\text {Re}(g)\) and \(\text {Im}(g)\) are real-valued additive functions, we may assume that g is itself real-valued, after which the general case will follow by the triangle inequality.
Let \(10 \le h \le X/3\), with X large. Following [8], fix \(\eta \in (0,1/12)\), parameters \(Q_1 = h\), \(P_1 = (\log h)^{40/\eta }\), and define further parameters \(P_j,Q_j\) by
for all \(j \le J\), where J is chosen maximally subject to \(Q_J \le \exp (\sqrt{\log X})\). We then define
where for any set \(S \subset \mathbb {N}\) we write \(\omega _S(n) := \sum _{p \mid n} 1_S(p)\).
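For a concrete feel for the weight \(\omega _S\), the following sketch verifies the exact counting identity \(\sum _{n \le X} \omega _S(n) = \sum _{p \in S} \lfloor X/p \rfloor \), so that the average of \(\omega _S\) matches the Mertens sum \(\sum _{p \in S} 1/p\) up to \(O(|S |/X)\). The sample interval \(S = \{p \text { prime}: 20 < p \le 2000\}\) is arbitrary and is not the paper's \((P_j,Q_j]\).

```python
X, P, Q = 10**5, 20, 2000
sieve = bytearray([1]) * (Q + 1)
sieve[0] = sieve[1] = 0
for p in range(2, int(Q**0.5) + 1):
    if sieve[p]:
        sieve[p * p :: p] = bytearray(len(range(p * p, Q + 1, p)))
S = [p for p in range(P + 1, Q + 1) if sieve[p]]

# omega_S(n) = #{p in S : p | n}, accumulated by sieving
counts = [0] * (X + 1)
for p in S:
    for m in range(p, X + 1, p):
        counts[m] += 1

mean_direct = sum(counts[1:]) / X
mertens_sum = sum(1.0 / p for p in S)
```

The two quantities agree to within \(|S |/X \approx 0.003\), which is the rounding loss in \(\lfloor X/p \rfloor \).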
The following key step in the proof of Theorem 1.1 allows us to pass from comparing averages of the additive g to averages of a corresponding multiplicative function, supported on \(\mathcal {S}\).
Lemma 4.2
Let \(g: \mathbb {N} \rightarrow \mathbb {R}\) be an additive function. Let \(10 \le h \le h'\), where \(h' := \frac{X}{(\log X)^{1/3}}\) and \(h \in \mathbb {Z}\), and \((\log X)^{-1/6}< t < 1\). Then
where \({\tilde{g}}(n;X) := B_g(X)^{-1}(g(n) - A_g(X))\) for all \(n \in \mathbb {N}\).
Proof
In view of Lemma 3.5, at the cost of an error term of size \(\max _{X/2<n \le X} |g(n) |/h' \ll B_g(X)X^{-1/4}\), we may assume that both \(h,h' \in \mathbb {Z}\) (else replace \(h'\) by \(\left\lfloor h'\right\rfloor \)). Given \(u \in [0,1]\), \(x \in [X/2,X] \cap \mathbb {Z}\) and an integer \(1 \le H \le h'\), define
which is clearly an analytic function of u. Fix \(x \in [X/2,X] \cap \mathbb {Z}\), and observe that \(S_{h'}(0;x) = 1 = S_h(0;x)\). By Taylor expansion in t,
wherein we have
By inserting the expression (13) into (11), rearranging the latter and then taking absolute values and averaging over \(x \in [X/2,X] \cap \mathbb {Z}\), we find
Since g is real-valued by assumption, \(|e(u{\tilde{g}}(n;X)) |= 1\) for all n. Thus, applying the triangle inequality and Lemma 3.2, we may bound the last expression above by
We now split
with \(H\in \{h,h'\}\). By the triangle inequality, we have
Since \(P_J \le \exp (\sqrt{\log X})\), the union bound and the fundamental lemma of the sieve (see [31, Remark after Lem. 6.3]) yield
We thus find by the triangle inequality that
Finally, if \(n \in [X/2,X] \cap \mathbb {Z}\) and \(x \in [n,n+1)\) then \(S_H^{(\mathcal {S})}(t;x) = S_H^{(\mathcal {S})}(t; n) + O(1/H)\), and thus
Combined with the preceding estimates, we obtain
which implies the claim. \(\square \)
Define the multiplicative function
In light of Lemma 4.2, the proof of Theorem 1.1 essentially boils down to the following comparison result for short- and medium-length interval averages of \(G_{t,X}\).
Lemma 4.3
Let \(g: \mathbb {N} \rightarrow \mathbb {R}\) be an additive function. Let \(X \ge 3\) be large, \((\log X)^{-1/6} < t \le 1/100\) be small and let \(10 \le h_1 \le h_2\) where \(h_2 = X/(\log X)^{1/3}\). Then
To prove Lemma 4.3 we will appeal to some ideas from pretentious analytic number theory. Let \(\mathbb {U} := \{z \in \mathbb {C} : |z |\le 1\}\). In what follows, given multiplicative functions \(f,g: \mathbb {N} \rightarrow \mathbb {U}\) and parameters \(1 \le T \le X\), we introduce the pretentious distance of Granville and Soundararajan:
For multiplicative functions f, g, h taking values in \(\mathbb {U}\), it is well known (see e.g. [32, Lem 3.1]) that \(\mathbb {D}\) satisfies the triangle inequality
For each \(t \in [0,1]\), select \(\lambda _{t,X} \in [-X,X]\) such that \(M_{G_{t,X}}(X;X) = \mathbb {D}(G_{t,X},n^{i\lambda _{t,X}}; X)^2\) (if there are multiple such minimizers, pick any one of them).
Lemma 4.4
Let \(0 < t \le 1/100\) be sufficiently small. Then either
(i) \(M_{G_{t,X}}(X;X) \ge \frac{1}{25}\log \log X\), or else

(ii) \(|\lambda _{t,X}|= O(1)\).
Proof
Assume (i) fails. Then by assumption, \(\mathbb {D}(G_{t,X}, n^{i\lambda _{t,X}};X)^2 \le \frac{1}{25} \log \log X\). We claim that there is also \({\tilde{\lambda }}_{t,X} = O(1)\) such that
To see that this is sufficient to prove (ii), we apply (17) to obtain
for large enough X. Now, if \(|\lambda _{t,X} - {\tilde{\lambda }}_{t,X}|\ge 100\) then as \(|\lambda _{t,X}|, |{\tilde{\lambda }}_{t,X}|\le X\) the Vinogradov–Korobov zero-free region for \(\zeta \) (see e.g. [9, (1.12)]) gives
which is a contradiction. It follows that
as required.
It thus remains to prove that (18) holds. By Lemma 3.2, we obtain
It follows from Taylor expansion that
On the other hand, by Halász’ theorem in the form of Granville and Soundararajan [33, Thm. 1],
where \(1 \le U \le \log X\) is a parameter of our choice. If U is a suitably large absolute constant and t is sufficiently small in an absolute sense, we obtain \(M_{G_{t,X}}(X; U) \ll 1\), and therefore there is a \({\tilde{\lambda }}_{t,X} \in [-U,U]\) (thus of size O(1)) such that
as claimed. \(\square \)
Proof of Lemma 4.3
Set \(\varepsilon = (\log X)^{-1/100}\). If \(M_{G_{t,X}}(X;X) \ge 4\log (1/\varepsilon )\) then by the triangle inequality, Cauchy–Schwarz and [9, Theorem A.2], the LHS of (15) is
Next, assume that \(M_{G_{t,X}}(X;X) < 4\log (1/\varepsilon )\). For \(\lambda \in \mathbb {R}\) and \(h \ge 1\) define
By Lemma 4.4 we have \(\lambda _{t,X} = O(1)\), so that with \(h \in \{h_1,h_2\}\),
and thus for each \(x \in [X/2,X]\) and \(j = 1,2\),
Reinstating the \(n \notin \mathcal {S}\) and using the arguments surrounding (14), we also note that
Adding and subtracting the expression on the LHS of (19) inside the absolute values bars in (15), we obtain the upper bound
where we have set
If \(j = 2\) then as just noted we also have
and so by Cauchy–Schwarz and [34, Theorem 1.6] (taking \(Q = 1\) and \(\varepsilon = (\log X)^{-1/200}\) there), we have
When \(h_1 > \sqrt{X}\), the same argument yields the bound \(\mathcal {T}_1 \ll (\log X)^{-1/400} + \frac{\log \log h_1}{\log h_1}\).
Thus, assume that \(10 \le h_1 \le \sqrt{X}\). Combining Cauchy–Schwarz with [10, Theorem 9.2(ii)] (taking \(\delta = (\log h_1)^{1/3}P_1^{-1/6+\eta }\), \(\nu _1 = 1/20\) and \(\nu _2 = 1/12\), there), we then get
Combining these estimates, we obtain that the LHS of (15) is
as claimed. \(\square \)
Proof of Theorem 1.1
Set \(h_1 := h\) and \(h_2 := X/(\log X)^{1/3}\). As mentioned, we may assume that g is real-valued (otherwise the result follows for complex-valued g by applying the theorem to \(\text {Re}(g)\) and \(\text {Im}(g)\) and applying the triangle inequality). By Lemma 4.1, we have
By Lemma 4.2, the latter is
Observe next that for any \(x \in [X/2,X]\) and \(t \in (0,1)\),
Taking \(t := \max \big \{\sqrt{\frac{\log \log h_1}{\log h_1}}, (\log X)^{-1/800}\big \}\), Theorem 1.1 now follows on combining this last expression with Lemma 4.3 and inserting the resulting bound into (21). \(\square \)
5 The Matomäki–Radziwiłł theorem for additive functions: \(\ell ^2\) variant
In this section, we will prove Theorem 1.4.
Let \(g \in \mathcal {A}_s\), so that \(B_g(X) \rightarrow \infty \), and the conditions
both hold. We seek to show that
whenever \(10 \le h \le X/10\) is an integer that satisfies \(h = h(X) \rightarrow \infty \) as \(X \rightarrow \infty \).
We begin by making the following convenient reduction.
Lemma 5.1
Suppose that Theorem 1.4 holds for any non-negative, strongly additive function \(g \in \mathcal {A}_s\). Then Theorem 1.4 holds for any \(g \in \mathcal {A}_s\).
Proof
By splitting \(g = \text {Re}(g) + i\text {Im}(g)\), and separately decomposing
where, for an additive function h we define the non-negative additive functions \(h^{\pm }\) on prime powers via
the Cauchy–Schwarz inequality implies that if (24) holds for non-negative g satisfying (22) and (23) then it holds for all additive g satisfying those conditions.
Therefore, we may assume that g is non-negative. Now, by Lemma 3.6, we can find a strongly additive function \(g^{*}\), satisfying \(g(p) = g^{*}(p)\) for all p, such that upon setting \(G := g-g^{*}\) we have \(B_{G}(X) = o(B_g(X))\). If we write, for a non-negative additive function h,
then we see that when X is large enough,
Taking limsups as \(X \rightarrow \infty \) in these inequalities, it follows that \(g^{*}\) satisfies (22) whenever g does; that \(g^{*}\) also satisfies (23) is an immediate consequence of Lemma 3.6. Moreover, we see by the Cauchy–Schwarz inequality and Lemma 3.2 that
Using the estimate \(\frac{2}{X}\sum _{X/2 < n \le X} G(n) = A_G(X) + o(B_g(X))\) by Lemmas 3.1 and 3.4 (as in the proof of Lemma 4.1), we see that
so that if (24) holds for strongly additive \(g^{*} \in \mathcal {A}_s\) then it also holds for all \(g \in \mathcal {A}_s\). This completes the proof. \(\square \)
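The decomposition into non-negative additive parts used in the proof above can be sketched as follows: a real-valued additive h splits as \(h = h^{+} - h^{-}\) with \(h^{\pm }(p^k) := \max (\pm h(p^k),0)\), both non-negative and additive. The prime-power data \(h(p^k) = \log p^k - 1\) below is arbitrary sample input, not from the paper.

```python
import math

def factor(n):
    """Trial division; returns the list of (p, k) with p^k exactly dividing n."""
    out = []
    p = 2
    while p * p <= n:
        if n % p == 0:
            k = 0
            while n % p == 0:
                n //= p
                k += 1
            out.append((p, k))
        p += 1
    if n > 1:
        out.append((n, 1))
    return out

def h(pk):                           # sample values at prime powers
    return math.log(pk) - 1.0

def h_plus(pk):
    return max(h(pk), 0.0)

def h_minus(pk):
    return max(-h(pk), 0.0)

def additive(vals, n):               # extend prime-power data to an additive function
    return sum(vals(p ** k) for p, k in factor(n))

check = [(additive(h, n), additive(h_plus, n) - additive(h_minus, n))
         for n in range(2, 300)]
```

The identity \(h = h^{+} - h^{-}\) holds at every n because it holds at every prime power and all three functions are additive.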
Until further notice we may thus assume that g is non-negative and strongly additive. For a fixed small parameter \(\varepsilon > 0\), let \(\delta \in (0,1/100)\) be chosen such that \(F_g(\delta ) < \varepsilon \). Let X be a scale chosen sufficiently large so that
With these data, define
We decompose g as
where \(g_{\mathcal {C}}\) and \(g_{\mathcal {P} \backslash \mathcal {C}}\) are strongly additive functions defined at primes by
We will consider the mean-squared errors
for \(\mathcal {A} \in \{\mathcal {C},\mathcal {P}\backslash \mathcal {C}\}\), separately. The fact that \(g \in \mathcal {A}_s\) means, in particular, that \(g_{\mathcal {P} \backslash \mathcal {C}}\) contributes little to \(\Delta _g(X,h)\).
Lemma 5.2
Let \(g \in \mathcal {A}_s\) be a non-negative, strongly additive function. Assume that X and \(\delta \) are chosen such that (25) and (26) both hold. Then we have
where \(\mathcal {C} = \mathcal {C}(X,\delta )\).
Proof
Arguing as in the proof of Lemma 5.1, we obtain from (25) and (26) that
Thus, by the Cauchy–Schwarz inequality, we obtain
as claimed. \(\square \)
In analogy to the work of the previous section, we will reduce the estimation of \(\Delta _{g_{\mathcal {C}}}(X)\) to that of the variance of short- and long-interval averages of certain multiplicative functions determined by \(g_{\mathcal {C}}\). These are defined as follows.
Fix \(r \in (0,\delta ^{2}]\). Given \(z \in \mathbb {C}\) satisfying \(|z-1 |= r\), define
Since \(g_{\mathcal {C}}\) is strongly additive and satisfies \(0 \le g_{\mathcal {C}}(p) \le \delta ^{-1} B_g(X)\) for all \(p \le X\), we have
for all \(p^k \le X\), and thus also
where d(n) is the divisor function. Furthermore, as \(\delta \in (0,1/100)\), for any \(2 \le u \le v \le X\) we get
Our treatment of short sums of \(g_{\mathcal {C}}\) will entail an analysis of corresponding short sums of \(F_z\), for z lying in a small neighbourhood of 1. In preparation to apply a relevant result from the recent paper [14], we introduce some further notation. Given a multiplicative function \(f: \mathbb {N} \rightarrow \mathbb {C}\) set
We also define the following variant of the pretentious distance:
We let \(t_0 = t_0(f,X)\) denote a real number \(t \in [-X,X]\) that minimizes \(t \mapsto \rho (f,n^{it};X)^2\).
Theorem 5.3
[14, Thm. 2.1] Let \(0 < A \le 2\), and let X be large. Let \(f: \mathbb {N} \rightarrow \mathbb {C}\) be a multiplicative function that satisfies
(i) \(|f(n) |\le d(n)\) for all \(n \le X\), and in particular \(|f(p) |\le 2\) for all \(p \le X\);

(ii) for any \(2 \le u \le v \le X\),
$$\begin{aligned} \sum _{u< p \le v} \frac{|f(p) |}{p} \ge A \sum _{u < p \le v} \frac{1}{p} - O\left( \frac{1}{\log u}\right) . \end{aligned}$$
Let \(10 \le h_0 \le X/(10H(f;X))\), and put \(h_1 := h_0 H(f;X)\) and \(t_0 = t_0(f,X)\). Then there are constants \(c_1,c_2 \in (0,1/3)\), depending only on A, such that if \(X/(\log X)^{c_1} < h_2 \le X\),
Let \(c_1 \in (0,1/3)\) be the constant from Theorem 5.3, applied with \(A = 0.98\). By Lemma 4.1, if \(h_2 = \left\lceil X/(\log X)^{c_1}\right\rceil \) then for any \(x \in (X/2,X]\)
In view of Lemma 5.2, in order to prove Theorem 1.4 it suffices to show that as \(X \rightarrow \infty \),
where \(h_1 = h\) and \(h_2 = \left\lceil X/(\log X)^{c_1} \right\rceil \).
Using Theorem 5.3, we will prove the following.
Corollary 5.4
Let \(10 \le h_1 \le X/10\) be an integer and \(h_2 := \lceil X/(\log X)^{c_1}\rceil \) as above. Then there is a constant \(\gamma > 0\) such that
The conditions (i) and (ii) of Theorem 5.3 were verified for \(f = F_z\) in (27), (28) and (29) (with \(A = 0.98\)); it remains to gather the required information about \(t_0(F_z,X)\), \(H(F_z;X)\) and the size of the Euler product \(\mathcal {P}_{F_z}(X)\). This is provided by the following lemma.
Lemma 5.5
Fix \(r \in (0,\delta ^2]\) and let \(z \in \mathbb {C}\) satisfy \(|z-1 |= r\). Then
(a) \(t_0(F_z,X) \ll 1/\log X\),

(b) \(H(F_z;X) \asymp 1\), and

(c) \(\mathcal {P}_{F_z}(X)^2 \ll \prod _{p \le X} \left( 1+\frac{|F_z(p) |^2 -1}{p}\right) \).
Proof
(a) Applying [14, (7)] with \(A = 0.98\), \(B = 2\) and \(C = 1\) (which is a straightforward consequence of [10, Lem. 5.1(i)]), we see that if \((\log X) |t_0(F_z;X) |\ge D\) for a suitably large constant \(D > 0\) then by minimality of \(t_0\),
say, where \(\sigma > 0\) is an absolute constant.
To obtain a contradiction, observe next that for any \(z = re(\theta )\) with \(\theta \in [0,1]\), we have already shown that \(\vert F_z(p) \vert \le 2\) for \(\delta \in (0,1/100)\). Thus, writing \(F_z(p) = \vert F_z(p) \vert e(\theta g_{\mathcal {C}}(p)/B_g(X))\) and applying the inequality \(0 \le 1-\cos x \le x^2/2\) for all \(x \ge 0\), we find
This contradiction implies that \(\vert t_0(F_z;X) \vert \log X \le D\) for some constant D, and the claim follows.
(b) By Taylor expansion, \(|z |^{g_{\mathcal {C}}(p)/B_g(X)} = 1 + O(\delta g(p)/B_g(X))\), and thus
The corresponding lower bound is trivial from the definition of \(H(F_z;X)\).
(c) Since \(\vert F_z(p) \vert \le 2\) for all \(p \le X\), we have the upper bounds
the latter of which arises from \((\vert F_z(p) \vert - 1)^2 \ge 0\) for all p. \(\square \)
Lemma 5.6
Let g be non-negative and strongly additive. Let \(r \in (0,\delta ^2]\), and set \(h_1 = h\) and \(h_2 = \lceil X/(\log X)^{c_1}\rceil \) as above. Then there is \(z_0 \in \mathbb {C}\) with \(|z_0-1 |= r\) such that as \(X \rightarrow \infty \),
where \(t_0 = t_0(F_{z_0},X)\) and \(I(x; t,h) := \frac{1}{h} \int _{x-h}^x u^{it} \mathrm{d}u\) as in the previous section.
Proof
For each \(n \in (X/2,X]\), \(z \in \mathbb {C}\) and \(j = 1,2\), define the maps
which are analytic in z. Note that
Recall that \(h_1,h_2 \in \mathbb {Z}\). Thus, by Cauchy’s integral formula we have
By Cauchy–Schwarz and the definition of \(F_{z}\), we obtain
for some \(z_0 \in \mathbb {C}\) with \(|z_0-1 |= r\). To complete the proof, note that by Taylor expansion and Lemma 5.5(a),
and also
uniformly in \(n-h_2 < m \le n\). It follows that
The error term is, by Shiu’s theorem [35, Thm. 1] and the Cauchy–Schwarz inequality,
which suffices to prove the claim. \(\square \)
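The contour-extraction step above rests on Cauchy's integral formula: for \(\varphi \) analytic near 1, \(\varphi '(1) = \frac{1}{2\pi i}\oint _{|z-1 |=r} \varphi (z)(z-1)^{-2}\,\mathrm {d}z\). A minimal numerical sketch, with \(\varphi (z) = z^{a}\) standing in for \(z^{g_{\mathcal {C}}(n)/B_g(X)}\) (the exponent \(a = 0.37\), radius and number of sample points are all arbitrary):

```python
import cmath
import math

a, r, N = 0.37, 0.01, 2000

def phi(z):
    return z ** a                    # analytic near z = 1 (principal branch)

# Parametrizing z = 1 + r e^{i theta} turns the contour integral into the
# periodic average (1/N) * sum phi(z_k)/(z_k - 1), which converges spectrally.
deriv = sum(
    phi(1 + r * cmath.exp(2 * math.pi * 1j * k / N))
    / (r * cmath.exp(2 * math.pi * 1j * k / N))
    for k in range(N)
) / N
```

The quadrature recovers \(\varphi '(1) = a\) to essentially machine precision, which is the mechanism by which Lemma 5.6 isolates the linear term in z from averages of \(F_z\).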
We are now in a position to apply Theorem 5.3 in order to prove Corollary 5.4.
Proof of Corollary 5.4
Let \(z_0\) be chosen as in Lemma 5.6. Since \(h_1,h_2 \in \mathbb {Z}\) we may replace the discrete average in Lemma 5.6 by an integral average at the cost of an error term of size
again by Shiu’s bound [35, Thm. 1]. Using the data from Lemma 5.5, Theorem 5.3 therefore yields
As g is strongly additive,
the bound in the error term arising from the Cauchy–Schwarz inequality. Now put \(\rho := \log |z_0 |\in (-10\delta ^2,10\delta ^2)\), say. Using the estimates \(\log (1+x) = x + O(x^2)\) and \(|e^x-1-x |\le \tfrac{1}{2}|x |^2\) for \(|x |\le 1/2\), the factors in (32) can be estimated as
The claimed bound now follows with any \(0< \gamma < 0.98 c_2\) (changing the implicit constant as needed). \(\square \)
Proof of Theorem 1.4
Let \(g \in \mathcal {A}_s\). By Lemma 5.1 we may assume that g is non-negative and strongly additive. Let \(\varepsilon > 0\) and pick \(\delta >0\) and \(X_0 = X_0(\delta )\) such that if \(X \ge X_0\) then (25) and (26) both hold, and also define \(\mathcal {C}\) as above. Set also \(h_1 = h\) and \(h_2 := \left\lceil \frac{X}{(\log X)^{c_1}}\right\rceil \) as above. Combining Lemma 5.2 and (30), we have
Applying Corollary 5.4 in this estimate, we find that there is a \(\gamma \in (0,1/6)\) for which
Selecting \(h \ge \exp \left( \delta ^{-5} \varepsilon ^{-2}\log (1/(\delta \varepsilon ))\right) \), picking \(X_0\) larger if necessary, we deduce that \(\Delta _g(X,h) \ll \varepsilon B_g(X)\). Since \(\varepsilon \) was arbitrary, we deduce that \(\Delta _g(X,h) = o(B_g(X))\), as claimed. \(\square \)
6 Gaps and moments
In this section, we will prove Theorem 1.11.
6.1 Small gaps and small first moments are equivalent: proof of Theorem 1.11(a)
We start by proving the following quantitative \(\ell ^1\) gap result.
Proposition 6.1
Let \(0< \varepsilon < 1/3\) and let X be large. Let \(g: \mathbb {N} \rightarrow \mathbb {C}\) be an additive function. Assume that
for all \(X/\log X < Y \le X\). Then we have
Proof
Let \(h = \left\lfloor \min \{X/(2\log X), \varepsilon ^{-1/2}\} \right\rfloor \), and let \(X/\log X < Y \le X\). By the triangle inequality and (33), for any \(1 \le m \le h\) we obtain
Averaging over \(1 \le m \le h\) and then applying the triangle inequality once again, we obtain
Applying Theorem 1.1, we deduce from this that
Now, for each such Y, Lemmas 3.1 and 3.4 combine to yield
and thus combining this estimate into the previous expression and summing over all dyadic subintervals of \([X/\log X,X]\), we obtain
Applying Lemma 3.2 and the Cauchy–Schwarz inequality on \([1,X/\log X]\), we obtain
The latter two estimates together imply the claim. \(\square \)
Proof of Theorem 1.11(a)
By the triangle inequality, we see that if
then, consequently,
The converse implication follows immediately from Proposition 6.1. \(\square \)
6.2 A gap theorem for the second moment
In parallel to the results of the previous subsection, we will apply Theorem 1.4 to prove the following result.
Proposition 6.2
Let \(g \in \mathcal {A}_s\). Then for any integer \(10 \le h \le \tfrac{X}{10\log X}\) we have
Proof
Given our assumptions about g, we may apply Theorem 1.4 to obtain
for any \(X/\log X < Y \le X\). Applying Lemmas 3.1 and 3.4, we deduce that
By telescoping,
Squaring both sides and applying Cauchy–Schwarz, we obtain
Combined with the previous estimates, we obtain
By Lemma 3.4, for each \(X/\log X < Y \le X\) we have
so that, summing over dyadic subintervals of \([X/\log X,X]\) and noting that by our assumption \(h \le X/(10\log X)\) at most two dyadic intervals contain any point of
we find
Applying Lemma 3.2 trivially to the segment \([1,X/\log X]\) gives
so combining these two estimates implies the claim. \(\square \)
Proof of Theorem 1.11(b)
To obtain the theorem, we note first the trivial estimate
so that if the RHS is \(o(B_g(X)^2)\) then so is the LHS.
Conversely, suppose that
for some function \(\xi (Y) \rightarrow 0\) as \(Y \rightarrow \infty \). Set \(h := \lfloor \xi (X)^{-1/3}\rfloor .\) By Proposition 6.2,
as \(X \rightarrow \infty \), as required. \(\square \)
Proof of Corollary 1.12
Let \(g \in \mathcal {A}_s\) and suppose that
By Theorem 1.11(b), we obtain that
Now, by Lemma 3.3, this implies that there is \(\lambda = \lambda (X) \in \mathbb {R}\) such that
where \(g_{\lambda } = g-\lambda \log \). Since the left-hand side is \(\ge B_{g_{\lambda }}(X)^2\), the claim follows immediately. \(\square \)
7 Erdős’ almost everywhere monotonicity problem
Let \(g: \mathbb {N} \rightarrow \mathbb {R}\) be additive. For convenience, set \(g(0) := 0\), and recall the definitions
In this and the following section, we will study functions g such that \(|\mathcal {B}(X) |= o(X)\).
7.1 The second moment along sparse subsets
To prove Theorem 1.8 we will eventually need control over a sparsely-supported sum such as
with the objective of obtaining savings over the trivial bound \(O(B_g(X)^2)\) from Lemma 3.2. The purpose of this subsection is to determine sufficient conditions in order to achieve a non-trivial estimate of this kind.
Given a set of positive integers \(\mathcal {S}\), a positive real number \(X \ge 1\) and a prime power \(p^k \le X\), write \(\mathcal {S}(X) := \mathcal {S} \cap [1,X]\) and \(\mathcal {S}_{p^k}(X) := \{n \in \mathcal {S}(X) : p^k \mid n\}\).
Proposition 7.1
Let \(g: \mathbb {N} \rightarrow \mathbb {C}\) be an additive function belonging to \(\mathcal {A}\). Let \(\mathcal {S}\) be a set of integers with \(|\mathcal {S}(X) |= o(X)\), and let \(\varepsilon \in (0,1)\) satisfy the conditions
Then the following bound holds:
Moreover, we have
Remark 7.2
Proposition 7.1 states that if the bulk of the contribution to the variance of g(n) occurs along a sparse subset \(\mathcal {S}(X) \subseteq [1,X]\) then \(B_g(X)\) is dominated by primes p of which \(\mathcal {S}(X)\) has many multiples \(\le X\). For sufficiently small primes p this is ruled out by the sparseness of \(\mathcal {S}(X)\), but it may occur for large enough primes.
Our proof will proceed by applying variants of the large sieve and Turán–Kubilius inequalities. The first of these is due to Elliott.
Lemma 7.3
(Elliott’s Dual Turán–Kubilius Inequality) Let \(\{a(n)\}_n \subset \mathbb {C}\) be a sequence and let \(X \ge 2\). Then
Proof
This is [22, Lemma 5.2] (taking \(\sigma = 0\) there). \(\square \)
A variant of the latter result, for divisibility by products of two large primes, is as follows.
Lemma 7.4
(Variant of Dual Turán–Kubilius) Let \(\{a(n)\}_n \subset \mathbb {C}\). Then
Proof
Incorporating the factor pq into the square, we see that the claimed estimate amounts to an \(\ell ^2 \rightarrow \ell ^2\) operator norm bound for the matrix with entries
Thus, by the duality principle [31, Sec. 7.1] it suffices to show that for any sequence \(\{b(p,q)\}_{p,q \text { prime}} \subset \mathbb {C}\) we have
Expanding the square on the LHS and swapping orders of summation, we obtain
Fix the quadruple \((p_1,q_1,p_2,q_2)\) for the moment, and consider the inner sum over \(n \le X\). If \((p_1q_1,p_2q_2) = 1\) then as \(p_1q_1p_2q_2 > X\) the sum is
If \((p_1q_1, p_2q_2) = p_1\) (so that \(q_1 \ne q_2)\), say, then the sum is
By symmetry, the analogous result holds if \((p_1q_1,p_2q_2) = q_1\). Finally, if \(p_1q_1 = p_2q_2\) then similarly the bound is \(\ll p_1q_1 X\). We thus obtain from these cases that the LHS in (35) is bounded above by
By the AM–GM inequality we simply have \(2|b(p,q)||b(p',q')|\le |b(p,q)|^2 + |b(p',q')|^2\) for any pairs of primes p, q and \(p',q'\), so invoking Mertens’ theorem and symmetry the above expressions are
and the claim follows. \(\square \)
Proof of Proposition 7.1
Let \(g \in \mathcal {A}\), and let \(g^{*}\) be the strongly additive function equal to g at primes, provided by Lemma 3.6. Applying Lemma 3.2, then following the proof of Lemma 3.6, we find
by assumption. It follows that
so replacing g by \(g^{*}\), we may assume in what follows that g is strongly additive.
Fix \(z = X^{1/4}\) and split \(g = g_{\le z} + g_{>z}\), where \(g_{\le z}\) is the strongly additive function defined by \(g_{\le z}(p^k) := g(p)1_{p \le z}\) at primes powers \(p^k\). By Cauchy–Schwarz, we seek to estimate
We begin with the first expression. Writing
for each \(n \le X\) and expanding the square, the first sum in (36) is
Consider the off-diagonal term O. Observe that for any two distinct primes p and q, the Chinese remainder theorem implies that
where the asterisked sum is over reduced residues modulo pq. Note that for any two distinct products \(p_1q_1\) and \(p_2q_2\) the gap between fractions with these denominators satisfies
and the number of pairs yielding the same product pq is \(\le 2\). Using this expression in O, applying the Cauchy–Schwarz inequality twice followed by the large sieve inequality [31, Lem. 7.11], we obtain
Expanding \(g_{>z}(n) - A_{g>z}(X)\) in a similar way as in (37), then inserting this and the previous estimate into (36) we obtain the upper bound,
Denote by T the expression in (38), so that
say. We split the pairs of primes \(X^{1/4} < p,q \le X\), \(p \ne q\) in the support of T as follows. Given a squarefree integer d, call \(E_d(\varepsilon )\) the condition \(\frac{d}{X}|\mathcal {S}_d(X)|\le \varepsilon \), and let \(L_d(\varepsilon )\) be the converse condition \(\frac{d}{X}|\mathcal {S}_d(X)|> \varepsilon \). If, simultaneously, the three conditions \(E_{pq}(\varepsilon ),E_p(\varepsilon )\) and \(E_q(\varepsilon )\) all hold, then as \(|\mathcal {S}(X)|/X < \varepsilon \) we have \(T_{p,q}(X) \ll \varepsilon \); otherwise, we trivially have \(T_{p,q}(X) \ll 1\). We thus find by the Cauchy–Schwarz inequality that
Now suppose \(L_d(\varepsilon )\) holds for some \(d\ge 2\). As \(\varepsilon > 2 \frac{|\mathcal {S}(X)|}{X}\) we have
Using this with \(d = p\) for each \(X^{1/4} < p \le X\), and applying Lemma 7.3, we get
in passing, this establishes (34). Similarly, by Lemma 7.4 we get
Combining these estimates in (39) shows that
Thus, putting \(\delta (X) := \varepsilon + \varepsilon ^{-1} \left( \frac{|\mathcal {S}(X)|}{X}\right) ^{1/2}\), we finally conclude that
and the claim follows. \(\square \)
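The multiplicity claim used in the proof (each product pq of two distinct primes arises from at most two ordered pairs, by unique factorization) can be sanity-checked numerically over a small range; the range \(N = 10^4\) is arbitrary.

```python
from collections import Counter

N = 10**4
sieve = bytearray([1]) * (N + 1)
sieve[0] = sieve[1] = 0
for p in range(2, int(N**0.5) + 1):
    if sieve[p]:
        sieve[p * p :: p] = bytearray(len(range(p * p, N + 1, p)))
primes = [p for p in range(2, N + 1) if sieve[p]]

# count ordered representations n = p*q with p, q distinct primes
reps = Counter()
for p in primes:
    for q in primes:
        if p * q > N:
            break
        if p != q:
            reps[p * q] += 1
max_mult = max(reps.values())
```

Every semiprime with distinct factors is hit exactly twice, by (p, q) and (q, p), and never more.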
Corollary 7.5
Let \(g: \mathbb {N} \rightarrow \mathbb {R}\) be an additive function in \(\mathcal {A}_s\). Let \(\mathcal {S} \subset \mathbb {N}\) be a set with \(|\mathcal {S}(X)|= o(X)\). Then for any fixed \(j \in \mathbb {Z}\),
Proof
By appealing to Proposition 7.1, we will show that for any \(\varepsilon > 0\) there is \(X_0(\varepsilon )\) such that if \(X \ge X_0(\varepsilon )\) then
for any fixed \(j \in \mathbb {Z}\).
First, note that as \(B_g(X) \rightarrow \infty \), if \(X \ge X_0(\varepsilon )\) then
Taking X larger if necessary, we may combine this with (23) to deduce that
Next, for any fixed \(j \in \mathbb {Z}\) we have \(|(\mathcal {S}-j)(X)|\le |\mathcal {S}(X)|+ |j |= o(X)\) as \(X \rightarrow \infty \), and so for X taken even larger if needed we have \(\vert (\mathcal {S}-j)(X) \vert /X< \varepsilon ^4 < \varepsilon /2\), for \(\varepsilon \) sufficiently small. We claim that
for sufficiently large X. Assuming this, Proposition 7.1 will imply that
as required.
We may split the sum in (40) according to whether or not \(|g(p) |> \delta ^{-1}B_g(X)\), where \(\delta = \delta (\varepsilon ) > 0\) is to be chosen. In light of (34), we obtain
so that this is \(\ll \varepsilon B_g(X)^2\) if \(X \ge X_0(\varepsilon )\).
On the other hand, by our assumption (22),
provided \(X \ge X_0(\delta )\). For \(\delta = \delta (\varepsilon )\) sufficiently small we can make this \(\ll \varepsilon B_g(X)^2\) whenever \(X\ge X_0(\varepsilon )\) (with \(X_0(\varepsilon )\) taken larger if necessary). The claim now follows. \(\square \)
We are now able to prove the first part of Theorem 1.8, namely that there is a parameter \(\lambda = \lambda (X) \ll \tfrac{B_g(X)}{\log X}\) such that
The proof of the slow variation condition \(\lambda (X^u) = \lambda (X) + o(\tfrac{B_g(X)}{\log X})\), for \(0 < u \le 1\) fixed, is postponed to the next section.
Proof of Theorem 1.8: Part I
In light of Lemma 3.3, we begin by showing that
By Lemma 3.4, when \(X/\log X < Y \le X\) we have
so that upon dyadically decomposing the range \([X/\log X,X]\) and applying Lemma 3.2 to the range \([1,X/\log X]\) in the sum on the LHS of (41), we get
It thus suffices to show that, uniformly over \(1 \le 2^j \le 2\log X\),
Fix \(1 \le 2^k \le 2\log X\), set \(Y_k := X/2^k\) and introduce a parameter \(1 \le R \le (\log X)^{1/2}\), which will eventually be chosen to grow slowly as a function of X. Let
We observe that if \(n \in \mathcal {G}_R(Y_k)\) then we have
We divide \(\mathcal {G}_R(Y_k)\) further into the sets
Suppose \(n \in \mathcal {G}_R^+(Y_k)\). Since
we deduce from the monotonicity of the map \(y \mapsto y^2\) for \(y \ge 0\) that (shifting \(n \mapsto n+R =: n'\))
where the error term comes from replacing \(A_g(Y_k)\) by the sum over \([Y_k/2,Y_k]\) by applying Lemma 3.4. Similarly, if \(n \in \mathcal {G}_R^-(Y_k)\) then
and so by the same argument we obtain
The above sums cover all elements of \(\mathcal {G}_R(Y_k)\) besides those in \([Y_k/2,Y_k/2 + R) \cup (Y_k-R,Y_k]\). To deal with these, we define
for \(Z \ge 1\). We see that \(|\mathcal {S}(Z)|\ll R \log Z = o(Z)\), and \(\mathcal {S}\) contains \([Y_k/2,Y_k/2 + R] \cup [Y_k-R,Y_k]\) for each k. By Corollary 7.5 (taking \(j = 0\) there), we thus obtain
uniformly over all \(X/\log X < Y_k \le X\), provided X is large enough in terms of R. Combining the foregoing estimates and using positivity, we find that
By Theorem 1.4, this gives \(o_{R \rightarrow \infty }(B_g(X)^2)\), uniformly over \(X/\log X < Y_k \le X\).
It remains to estimate the contribution from \(n \in \mathcal {B}_R(Y_k)\). By the union bound, we have
By Corollary 7.5, the above expression is \(o(B_g(X))\), again provided X is sufficiently large in terms of R.
To conclude, for any \(\varepsilon > 0\) we can find R large enough in terms of \(\varepsilon \) and \(X_0\) sufficiently large in terms of \(\varepsilon \) and R such that if \(X \ge X_0\) then
uniformly in \(X/\log X < Y_k = X/2^k \le X\), and (41) follows.
Now, applying Lemma 3.3, we deduce that
where we recall that \(g_{\lambda _0}(n) = g(n)-\lambda _0 \log n\) for all \(n \ge 1\), and
Note that by Cauchy–Schwarz and the prime number theorem,
Thus, \(\vert \lambda _0(X) \vert \ll B_g(X)/\log X\), and \(B_{g_{\lambda _0}}(X)^2 = o(B_g(X)^2)\), as wanted. We will verify that \(\lambda _0(X)\) is slowly varying in the next section (immediately following the proof of Proposition 8.1). \(\square \)
8 Rigidity properties for almost everywhere monotone functions
We continue to assume that g is almost everywhere monotone in the sense of the previous section. Theorem 1.8 asserts that an additive function \(g \in \mathcal {A}\) is well approximated by a constant times a logarithm, provided g(p) is not frequently much larger than \(B_g(X)\) for \(p \le X\). In this section, we will complete the proof of this theorem, along with those of Corollary 1.7 and Theorem 1.9, all of which are consequences of the almost everywhere monotonicity property. A key input in this direction is Proposition 8.1, which is a structure theorem for the asymptotic mean value \(A_g(X)\).
8.1 The structure of \(A_g(X)\)
The first main result of this section is the following.
Proposition 8.1
Let \(g:\mathbb {N} \rightarrow \mathbb {R}\) be an additive function satisfying \(B_g(X) \rightarrow \infty \) as \(X \rightarrow \infty \). Assume that \(\mathcal {B} := \{n \in \mathbb {N} : g(n) < g(n-1)\}\) satisfies \(\vert \mathcal {B}(X) \vert := \vert \mathcal {B} \cap [1,X] \vert = o(X)\) as \(X \rightarrow \infty \). Then for each X sufficiently large there is \(\lambda = \lambda (X) \in \mathbb {R}\) such that for any \(\frac{\log \log X}{\sqrt{\log X}} < \delta \le 1/4\),
and also
Furthermore, \(A_g(X)\) and \(\lambda (X)\) satisfy the following properties:
(i) \(\lambda (X) \ll B_g(X)/\log X\),
(ii) for X sufficiently large and any \(X^{\delta } < t_1 \le t_2 \le X\),
$$\begin{aligned} A_g(t_2) = A_g(t_1) + \lambda (X)\log (t_2/t_1) + o\left( (\log (1/\delta ))^{1/2}B_g(X)\right) , \end{aligned}$$
(iii) for every \(u \in (\delta ,1]\) we have
$$\begin{aligned} \lambda (X) = \lambda (X^u) + o\left( (\log (1/\delta ))^{1/2} \delta ^{-1} \frac{B_g(X)}{\log X}\right) . \end{aligned}$$
Remark 8.2
It would be desirable to determine \(A_g(t)\) directly as a function of t in some range, say \(X^{\delta } < t \le X\). Proposition 8.1 provides the approximation \(A_g(t) = A_g(X^{\delta }) + (1-\delta u) \lambda (X) \log t + o(B_g(X))\), where \(u := \log X/\log t\), but this still contains a reference to a second value \(A_g(X^{\delta })\). We might iterate this argument to obtain (using the slow variation of \(\lambda \)) a further approximation in terms of \(A_g(X^{\delta ^2})\), \(A_g(X^{\delta ^3})\), and so forth, but without further data about g (say, \(A_g(X^{1/1000}) = o(B_g(X))\)) it is not obvious that this argument yields an asymptotic formula for \(A_g(t)\) alone.
To prove this proposition we will require a few lemmas.
Lemma 8.3
Let \(g: \mathbb {N} \rightarrow \mathbb {C}\) be an additive function satisfying \(B_g(X) \rightarrow \infty \) as \(X \rightarrow \infty \). Let \(\alpha \in (1,2)\) and \(\frac{\log \log X}{\sqrt{\log X}} < \delta \le 1/4\). Then
Proof
We will estimate the quantity
in two different ways.
First, we will obtain a lower bound for \({\mathscr {M}}\) as follows. Given \(2 \le Y \le X\) observe that for any fixed prime p
where the last equality arises from Lemma 3.1.
Using this estimate with \(Y =X/p^k\) for \(p^k \le X\), we obtain
noting that the third error term is 0 unless \(p^k \le X/2\). Similarly, again by Lemma 3.1 we have
We thus deduce that
where we have set
As in the proof of Lemma 3.1,
Next, we may upper bound \(\mathcal {R}_2(X)\) as
To treat \(\mathcal {R}_1(X)\), we use \(|g(p^j)|/p^j \le B_g(p^j)p^{-j/2}\) to get
Furthermore, we have
Since \(B_g(X) \gg 1\) these last two bounds combine to give
Finally, to bound \(\mathcal {R}_4(X)\) we use Lemma 3.4 to obtain
uniformly over \(p^k \le X\), and thus using (47) we find
Combining the estimates for \(\mathcal {R}_j(X)\), \(1 \le j \le 4\), in view of the range of \(\delta \) we finally obtain
in (45). Thus,
Next, we execute the second estimation of \({\mathscr {M}}\). If \(p^k \in (X^{\delta },X]\) define
Set \(g'(n) := g(n) -A_g(X)\) for \(n \le X\), and note that
Thus, as \( \vert A_g(X) \vert \le B_g(X)\sqrt{\log \log X}\) by Lemma 3.4, we find
Recall that \(\alpha \in (1,2)\). Let us now partition the set of prime powers \(X^{\delta } < p^k \le X\) into the sets
Note that by Mertens’ theorem,
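For reference, the estimate from Mertens' second theorem relevant here is the standard bound (a known fact, independent of g)
$$\begin{aligned} \sum _{p \le x} \frac{1}{p} = \log \log x + M + O\left( \frac{1}{\log x}\right) , \end{aligned}$$
where M is the Meissel–Mertens constant; in particular, \(\sum _{X^{\delta } < p \le X} p^{-1} = \log (1/\delta ) + O(1/(\delta \log X))\).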
Using this and Hölder’s inequality, we obtain
By Theorem 3.1 of [36] this is bounded by
Combining this with (45) and (48) completes the proof of the lemma. \(\square \)
Next, we show that, in an \(\ell ^1\) sense, \(g(p^k)\) is well approximated by \(\lambda \log p^k\) on average over the prime powers \(X^{\delta } < p^k \le X\), for some function \(\lambda = \lambda (X)\). This will be the \(\lambda \) that appears in the statement of Proposition 8.1.
Lemma 8.4
There is a parameter \(\lambda =\lambda (X) \in \mathbb {R}\) such that the following holds. For any \(\alpha \in (1,2)\),
Proof
By [1, Théorème 1], there are \(\lambda = \lambda (X)\) and \(c = c(X)\) (both depending on g but independent of \(\alpha \)) such that
Here, writing \(g_{\lambda }(n) = g(n)-\lambda \log (n)\), we have set
Since \(g_{\lambda }(p^k) = g_{\lambda ,c}'(p^k) + g_{\lambda ,c}''(p^k)\) for each \(p^k\), by Hölder’s inequality and (49) once again,
The result now follows upon combining this last estimate with (50) and using positivity. \(\square \)
To make use of the previous two lemmas we establish the following upper bound for moments of order \(\alpha \in [1,2)\) that crucially uses the almost-everywhere monotonicity property of g.
Lemma 8.5
Assume that \(\mathcal {B} := \{n \in \mathbb {N} : g(n) < g(n-1)\}\) satisfies \(\vert \mathcal {B}(X) \vert = o(X)\), where \(\mathcal {B}(X) = \mathcal {B} \cap [1,X]\). Then for any \(\alpha \in [1,2)\),
where \(r(X) := \max _{\tfrac{X}{\log X} < Y \le X} \left( \left( \frac{|\mathcal {B}(Y) |}{Y}\right) ^{1/2} + \frac{\log Y}{\sqrt{Y}}\right) \).
Proof
By Hölder’s inequality, for any \(\alpha \in (1,2)\) we have
an inequality that is vacuously also true when \(\alpha = 1\). Applying Lemma 3.2 to the second bracketed expression, we obtain the upper bound
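For the reader's convenience, the Hölder step here is presumably the standard interpolation of the \(\alpha \)-th moment between the first and second moments: writing \(|h |^{\alpha } = |h |^{2-\alpha } \cdot (|h |^2)^{\alpha -1}\) and applying Hölder's inequality with the conjugate exponents \(\frac{1}{2-\alpha }\) and \(\frac{1}{\alpha -1}\) gives, for any \(h: \mathbb {N} \rightarrow \mathbb {C}\),
$$\begin{aligned} \frac{1}{X}\sum _{n \le X} |h(n) |^{\alpha } \le \left( \frac{1}{X}\sum _{n \le X} |h(n) |\right) ^{2-\alpha }\left( \frac{1}{X}\sum _{n \le X} |h(n) |^2\right) ^{\alpha -1}. \end{aligned}$$
At \(\alpha = 1\) the second factor disappears and the inequality becomes an identity, which is why that case is vacuous.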
Next, we show that for all \(X/\log X < Y \le X\) we get
This will imply the claim of the lemma, since by Proposition 6.1 the latter bound gives
To prove (51), we note that for all \(1 \le n \le Y\),
It follows from this and telescoping that
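Concretely, since \(g(1) = 0\) for any additive function and \(g(n) \ge g(n-1)\) precisely when \(n \notin \mathcal {B}\), the telescoping identity \(\sum _{2 \le n \le Y}(g(n)-g(n-1)) = g(\left\lfloor Y \right\rfloor )\), combined with \(|t |= t + 2\max (-t,0)\), presumably takes the form
$$\begin{aligned} \sum _{2 \le n \le Y} |g(n)-g(n-1) |= g(\left\lfloor Y \right\rfloor ) + 2\sum _{n \in \mathcal {B}(Y)} \left( g(n-1)-g(n)\right) , \end{aligned}$$
so that the left-hand side is controlled by \(g(\left\lfloor Y \right\rfloor )\) together with a sum over the sparse set \(\mathcal {B}(Y)\).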
By Lemma 3.5, \(g(\left\lfloor Y \right\rfloor )/Y \ll B_g(Y)(\log Y)Y^{-1/2}\). Owing to Lemma 3.2 and the triangle and Cauchy–Schwarz inequalities, we also obtain
This implies (51), and completes the proof of the lemma. \(\square \)
Proof of Proposition 8.1
We begin with the proof of (44). Set \(\alpha = 3/2\), say. Combining Lemmas 8.4 and 8.5, there is a \(\lambda = \lambda (X) \in \mathbb {R}\) such that
We use this to obtain (43). Indeed, this time combining Lemmas 8.3 and 8.5, we get
Combining this with (52) and applying the triangle inequality in the form
for each \(X^{\delta } < p^k \le X\), we quickly deduce (43).
Next, we proceed to the proofs of properties (i)–(iii).
(i) By the triangle inequality and positivity, we obtain
By Mertens’ theorem,
and by the Cauchy–Schwarz inequality we have, for \(X^{1/4} \le p \le X^{1/2}\),
Using the above, (43), the prime number theorem and partial summation in (53), we find that
and (i) follows immediately.
(ii) We observe, using (i) and (44) that if \(X^{\delta } < t_1 \le t_2 \le X\),
where in the penultimate line the second error term is estimated similarly to (47). This proves the required estimate.
(iii) Applying (ii) with \((t_1,t_2) = (X^y,X^z)\), where, in sequence, \((y,z) = (u,1)\), \((y,z) = (uv,1)\) and \((y,z) = (uv,u)\) for any \(v \in (\delta /u,1/2]\) and \(u \in (\delta ,1]\), we get
We subtract the second equation from the first and combine the result with the third equation. Using \(B_g(X^u) \le B_g(X)\) we conclude that
Since \(1-v \ge 1/2\) and \(u > \delta \), the claim follows immediately upon rearranging (with a potentially larger implicit constant in the error term). \(\square \)
Proof of Theorem 1.8: Part II
The work at the end of Sect. 7 implies that
where \(\lambda _0\) is as in (42). Now, by Proposition 8.1 we have
where \(\lambda (X)\) satisfies
Thus, by (54) and (55), Cauchy–Schwarz and Mertens’ theorem, whenever \(Y = X^u\) with \(0 < u \le 1\) fixed we have
We thus deduce that \(\lambda (X^u) = \lambda _0(X^u) + o(B_g(X)/\log X)\) for all \(0 < u \le 1\) fixed, and therefore also that
and the second claim of Theorem 1.8 is proved. \(\square \)
8.2 Growth of \(B_g(X)\) and the proof of Corollary 1.7
In this subsection we prove Corollary 1.7. The key step will be to show that if there is a \(\lambda (X)\) such that \(B_{g_{\lambda }}(X) = o(B_g(X))\) (which follows from Theorem 1.8) then \(B_g(X)\) grows roughly like \(\log X\).
We begin by showing that this is the case assuming in addition that \(\lambda (X)\) is fairly large (this assumption is subsequently removed in Lemma 8.7).
Until further notice, assume that \(B_g(X) \gg 1\) for all sufficiently large X.
Lemma 8.6
Assume that there is a \(C > 0\) such that \(\lambda (X) \ge C B_g(X)/\log X\) for all X sufficiently large in the conclusion of Proposition 8.1. Then for any \(\varepsilon > 0\), \((\log X)^{1-\varepsilon } \ll _{\varepsilon } B_g(X) \ll _{\varepsilon } (\log X)^{1+\varepsilon }\).
Proof
By Proposition 8.1 and our assumption \(\lambda (X) \gg B_g(X)/\log X\),
whenever \(0 < u \le 1\) is fixed. This implies in particular that \(\lambda (X) \ll \lambda (X^u)\). Setting \(Y := X^u\) and \(v := 1/u \ge 1\), we see also that
Thus, (58) holds for all fixed \(u \ge 1\) as well, and hence for all fixed \(u > 0\). We deduce that for each fixed \(u > 0\) and \(\varepsilon > 0\) there is \(X_0(\varepsilon ,u)\) such that if \(X\ge X_0(\varepsilon ,u)\),
Set \(u = 1/2\), put \(X_0 = X_0(\varepsilon ,1/2)\) and for each \(k \ge 1\) define \(X_k := X_0^{2^k}\). Let K be large. Then
As \(K \le 2\log \log X_K\) for large enough K and \(X_0\), we find
Thus, we have \(B_g(X_K) \ll |\lambda (X_K) |(\log X_K) \ll _{\varepsilon } (\log X_K)^{1+4\varepsilon }\) by assumption, and by Proposition 8.1(i) we have \(B_g(X_K) \gg |\lambda (X_K)|\log X_K \gg _{\varepsilon } (\log X_K)^{1-4\varepsilon }\).
Since \(\log X_K \asymp \log X_{K+1}\), by monotonicity of \(B_g\) we also have
for any \(X_K< X < X_{K+1}\). Similarly, we obtain \(B_g(X) \gg _{\varepsilon } (\log X)^{1-4\varepsilon }\) on the same interval. Since \(\varepsilon > 0\) was arbitrary, the claim now follows. \(\square \)
Lemma 8.7
Assume \(B_{g_{\nu }}(X) = o(B_g(X))\) for some \(\nu = \nu (X)\) that satisfies \(|\nu |\ll B_g(X)/\log X\). Then for any \(\varepsilon > 0\), \((\log X)^{1-\varepsilon } \ll _{\varepsilon } B_g(X) \ll _{\varepsilon } (\log X)^{1+\varepsilon }\).
Proof
By Cauchy–Schwarz, we have
It follows that \(|\nu (X)|\ge \frac{1}{4}B_g(X)/\log X\) when X is sufficiently large. The conclusion follows from Lemma 8.6, provided we can show that \(\lambda (X) = \nu (X) + o(B_g(X)/\log X)\) for all large X, where \(\lambda (X)\) is the function from the conclusion of Proposition 8.1. But this can be verified by the same argument as that which leads to (57), so the claim follows. \(\square \)
Proof of Corollary 1.7
Suppose \(g: \mathbb {N} \rightarrow \mathbb {R}\) is a completely additive function that satisfies
Suppose first that \(B_g(X) \rightarrow \infty \), so that \(g \in \mathcal {A}_s\). By Theorem 1.8 there is a parameter \(\lambda _0(X)\) with \(|\lambda _0(X)|\ll B_g(X)/\log X\), such that \(B_{g_{\lambda _0}}(X) = o(B_g(X))\), as \(X \rightarrow \infty \). By Lemma 8.7, we deduce that \(B_g(X) \ll _{\varepsilon } (\log X)^{1+\varepsilon }\). Now, applying (51), we obtain
By Theorem 3.7, we deduce that there is a constant \(c\in \mathbb {R}\) such that \(g(n) = c\log n\) for all n, as required.
If, instead, \(B_g(X) \ll 1\) then we again deduce (even if \(g \notin \mathcal {A}_s\)) from (51) that
and so the claim follows (necessarily with \(c = 0\)) by Theorem 3.7. \(\square \)
8.3 Proof of Theorem 1.9
To prove Theorem 1.9 we will appeal to the following result due to Elliott, which is useful in light of Proposition 8.1.
Theorem
[26, Thm. 6] Let \(0< a < b \le 1\). Let \(g: \mathbb {N} \rightarrow \mathbb {C}\) be an additive function, and for \(y \ge 10\) define
Then for all \(\varepsilon , B > 0\) there exist \(X_0 = X_0(a,b,\varepsilon ,B)\) and \(c > 0\) such that if \(X \ge X_0\) then, uniformly over \(X^{\varepsilon } < t \le X\),
where \(G,\eta \) are measurable functions and
Corollary 8.8
Let \(\delta \in (0,1/2)\). Suppose \(g: \mathbb {N} \rightarrow \mathbb {R}\) is an additive function such that \(|\mathcal {B}(X) |= o(X)\). Then, uniformly over all \(X^{\delta } \le t \le X\) we have
where \(\lambda (X)\) and \(\eta (X)\) are measurable functions such that for each fixed \(0 < u \le 1\),
Proof
By combining Lemmas 8.3 and 8.5, we have
for any fixed \(\delta > 0\). Applying Elliott’s theorem with \(a = \varepsilon = \delta \), \(b = 1\), \(B = 1\), we have
using Lemma 3.4 to treat the second error term, and the bound \(|g(p^k) |p^{-k/2} \le B_g(X)\) for all \(p^k \le X\) in the third. We thus deduce the existence of G(X) such that
For \(X^{\delta } < t_1 \le t_2 \le X\),
so we have removed the term \(\eta (X)\). Now, by Proposition 8.1 we also know that in the same range,
Applying this with \(t_1 = X^{1/2}\), \(t_2 = X\), we deduce readily that
and hence, from (59), that for all \(X^{\delta } < t \le X\),
The slow variation of \(\lambda (X)\) is a consequence of Proposition 8.1(iii). To obtain the corresponding property for \(\eta \) we evaluate \(A_g(X^u)\) in (59), once as written and once with X replaced by \(X^u\), obtaining
from which it also follows, using the slow variation of \(\lambda \), that
for each fixed \(\delta \le u \le 1\), as required. \(\square \)
Proof of Theorem 1.9
By Lemma 8.5 (with \(\alpha = 1\)),
so that for all but o(X) integers \(n \le X\) we have
By Corollary 8.8 and Proposition 8.1(i), we deduce that
for all but o(X) integers \(X/\log X < n \le X\), and thus for all but o(X) integers \(n \le X\), proving the claim of Theorem 1.9. \(\square \)
Data Availability
Data sharing not applicable to this article as no datasets were generated or analysed during the current study.
Notes
The Liouville function is the multiplicative function defined as \(\lambda (n) := (-1)^{\Omega (n)}\), where \(\Omega (n)\) is the number of prime factors of n, counted with multiplicity.
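As a quick illustration of this definition (a sketch added for clarity, not part of the original text), the Liouville function can be computed by trial division:

```python
def big_omega(n):
    """Omega(n): the number of prime factors of n, counted with multiplicity."""
    count, d = 0, 2
    while d * d <= n:
        while n % d == 0:
            n //= d
            count += 1
        d += 1
    if n > 1:  # whatever remains after trial division is itself prime
        count += 1
    return count

def liouville(n):
    """The Liouville function lambda(n) = (-1)^Omega(n)."""
    return (-1) ** big_omega(n)

print([liouville(n) for n in range(1, 11)])  # [1, -1, -1, 1, -1, 1, -1, -1, 1, 1]
```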
The requirement that h be an integer is possibly unnecessary, but assuming it allows us to avoid certain pathologies associated with functions g taking very large values.
By Chebyshev’s inequality, \(\sum _{\begin{array}{c} p \le X \\ |g(p) |> \varepsilon ^{-1}B_g(X) \end{array}} p^{-1} \ll \varepsilon ^2\), and thus the proportion of integers divisible by a prime p with \(\vert g(p)\vert > \varepsilon ^{-1}B_g(X)\) is sparse, namely of size \(O(\varepsilon ^2 X)\). Nevertheless, if \(F_g(\varepsilon ) \gg 1\) for all \(\varepsilon > 0\) the values \(g(n)^2\) at multiples of such primes can have an outsized influence on the second moment.
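In detail, assuming the standard normalization \(B_g(X)^2 = \sum _{p^k \le X} |g(p^k) |^2 p^{-k}\) (as in the Turán–Kubilius inequality), each prime in the sum satisfies \(1 < \varepsilon ^2 g(p)^2/B_g(X)^2\), whence
$$\begin{aligned} \sum _{\begin{array}{c} p \le X \\ |g(p) |> \varepsilon ^{-1}B_g(X) \end{array}} \frac{1}{p} \le \frac{\varepsilon ^2}{B_g(X)^2}\sum _{p \le X} \frac{g(p)^2}{p} \le \varepsilon ^2. \end{aligned}$$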
By a strongly additive function we mean an additive function g such that \(g(p^k) = g(p)\) for all primes p and all \(k \ge 1\).
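To illustrate, \(\omega (n)\), the number of distinct prime factors of n, is the prototypical strongly additive function; the following minimal Python check (an illustration, not from the original text) verifies both defining properties:

```python
def distinct_prime_factors(n):
    """Return the set of distinct prime factors of n."""
    ps, d = set(), 2
    while d * d <= n:
        if n % d == 0:
            ps.add(d)
            while n % d == 0:
                n //= d
        d += 1
    if n > 1:
        ps.add(n)
    return ps

def omega(n):
    """omega(n): the number of distinct prime factors of n (strongly additive)."""
    return len(distinct_prime_factors(n))

# Strong additivity: omega(p^k) = omega(p) for every prime p and every k >= 1.
assert all(omega(p ** k) == omega(p) == 1 for p in (2, 3, 5, 7) for k in (1, 2, 3))
# Additivity on coprime arguments: omega(m * n) = omega(m) + omega(n) when gcd(m, n) = 1.
assert omega(12 * 35) == omega(12) + omega(35)
```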
Given a sequence \(\mathcal {C} \subset \mathbb {N}\), the (natural) density of \(\mathcal {C}\), if it exists, is the limit \(d \mathcal {C} := \lim _{X \rightarrow \infty } \frac{\vert \mathcal {C} \cap [1,X] \vert }{X}\).
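A numerical illustration of this definition (a hypothetical example, not drawn from the text): the multiples of 3 have natural density 1/3, and the empirical counts converge accordingly.

```python
def empirical_density(pred, X):
    """Compute |C ∩ [1, X]| / X for the set C = {n : pred(n)}."""
    return sum(1 for n in range(1, X + 1) if pred(n)) / X

# The set of multiples of 3 has natural density 1/3:
d = empirical_density(lambda n: n % 3 == 0, 10 ** 5)
print(abs(d - 1 / 3) < 1e-3)  # True
```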
Hildebrand [27] showed that any \(c > 4\) is admissible.
We may always choose a suitable branch of logarithm to ensure that this is well defined.
Strictly speaking, we actually work with \(z^{g(n)/B_g(X)}\), but for the convenience of exposition we omit the normalization by \(B_g(X)\) in the exponent here.
This would still be true if we replaced the circle \(\vert z-1 \vert = \rho \) by any other path containing 1 in its interior component.
References
Hildebrand, A.: Sur les moments d’une fonction additive. Ann. de l’Institut Fourier 33(3), 1–22 (1983)
Erdős, P., Wintner, A.: Additive arithmetical functions and statistical independence. Am. J. Math. 61, 713–721 (1939)
Erdős, P., Kac, M.: The Gaussian law of errors in the theory of additive number theoretic functions. Am. J. Math. 62, 738–742 (1940)
Wirsing, E.: Das asymptotische Verhalten von Summen über multiplikative Funktionen II. Acta Math. Acad. Sci. Hung. 18, 411–467 (1967)
Halász, G.: Über die Mittelwerte multiplikativer zahlentheoretischer Funktionen. Acta Math. Acad. Sci. Hung. 19, 365–403 (1968)
Chowla, S.: The Riemann Hypothesis and Hilbert’s Tenth Problem. Gordon and Breach Science Publishers, New York (1965)
Landau, E.: Über den Zusammenhang einiger neuer Sätze der analytischen Zahlentheorie. Sitzungsber. Kaiserl. Akad. Wiss. Wien, Math.-Naturw. Klasse 115, 589–632 (1906)
Matomäki, K., Radziwiłł, M.: Multiplicative functions in short intervals. Ann. Math. (2) 183(3), 1015–1056 (2016)
Matomäki, K., Radziwiłł, M., Tao, T.: An averaged form of Chowla’s conjecture. Algebra Number Theory 9(9), 2167–2196 (2015)
Matomäki, K., Radziwiłł, M.: Multiplicative functions in short intervals II. arXiv: 2007.04290 [math.NT]
Tao, T.: The logarithmically averaged Chowla and Elliott conjectures for two-point correlations. Forum Math. Pi 4, 8–36 (2016)
Tao, T.: The Erdős discrepancy problem. Discrete Anal. 1, 29 (2016)
Kułaga-Przymus, J., Lemańczyk, M.: Sarnak’s conjecture from the ergodic theory point of view. arXiv: 2009.04757 [math.DS]
Mangerel, A.P.: Divisor-bounded multiplicative functions in short intervals. arXiv: 2108.11401 [math.NT]
Goudout, E.: Théorème d'Erdős–Kac dans presque tous les petits intervalles. Acta Arith. 182(2), 101–116 (2018)
Goudout, E.: Lois locales de la fonction \(\omega \) dans presque tous les petits intervalles. Proc. Lond. Math. Soc. 115(3), 599–637 (2017)
Erdős, P.: On the distribution function of additive functions. Ann. Math. 47(2), 1–20 (1946)
Kátai, I.: On a problem of P. Erdős. J. Number Theory 2(1), 1–6 (1970)
Wirsing, E.: Characterization of the logarithm as an additive function. Am. Math. Soc. Proc. Symp. Pure Math. 20, 375–381 (1971)
Hildebrand, A.: An Erdős–Wintner theorem for differences of additive functions. Trans. Am. Math. Soc. 310(1), 257–276 (1988)
Wirsing, E.: A characterization of \(\log n\) as an additive arithmetic function. Symp. Math. dell'Istituto Nazionale di Alta Matematica, Roma IV, 45–57 (1970)
Elliott, P.D.T.A.: Arithmetic Functions and Integer Products, vol. 272. Springer, Berlin (1985)
Klurman, O.: Correlations of multiplicative functions and applications. Compos. Math. 153(8), 1622–1657 (2017). https://doi.org/10.1112/S0010437X17007163
Klurman, O., Mangerel, A.P.: Rigidity theorems for multiplicative functions. Math. Ann. 372(1–2), 651–697 (2018). https://doi.org/10.1007/s00208-018-1724-6
Kátai, I.: Continuous homomorphisms as arithmetical functions, and sets of uniqueness. In: Trends in Mathematics, pp. 183–200. Birkhäuser, Basel (2000)
Elliott, P.D.T.A.: Functional analysis of additive arithmetic functions. Bull. Am. Math. Soc. 16(2), 179–223 (1987)
Hildebrand, A.: Additive functions at consecutive integers. J. Lond. Math. Soc. 35(2), 217–232 (1987)
Mangerel, A.P.: On the bivariate Erdős–Kac theorem and correlations of the Möbius function. Math. Proc. Camb. Philos. Soc. 169(3), 547–605 (2020)
Bingham, N.H., Goldie, C.M., Teugels, J.L.: Regular Variation. Cambridge University Press, Cambridge (1987)
Ruzsa, I.Z.: On the variance of additive functions. In: Erdős, P., Alpár, L., Halász, G., Sárközy, A. (eds.) Studies in Pure Mathematics: To the Memory of Paul Turán, pp. 576–586. Springer Basel AG, Basel
Iwaniec, H., Kowalski, E.: Analytic Number Theory, vol. 53, p. 615. American Mathematical Society Colloquium Publications. American Mathematical Society, Providence (2004)
Granville, A., Soundararajan, K.: Large character sums: pretentious characters and the Pólya-Vinogradov theorem. J. Am. Math. Soc 20(2), 357–384 (2007)
Granville, A., Soundararajan, K.: Decay of mean values of multiplicative functions. Can. J. Math. 55(6), 1191–1230 (2003)
Klurman, O., Mangerel, A.P., Teräväinen, J.: Multiplicative functions in short arithmetic progressions. arXiv:1909.12280 [math.NT]
Shiu, P.: A Brun-Titchmarsh theorem for multiplicative functions. J. Reine Angew. Math. 313, 161–170 (1980)
Elliott, P.D.T.A.: Duality in Analytic Number Theory. Cambridge Tracts in Mathematics, vol. 122. Cambridge University Press, Cambridge (1997)
Acknowledgements
The author warmly thanks Oleksiy Klurman and Aled Walker for helpful suggestions about improving the exposition of the paper, as well as for their encouragement. He is also greatly indebted to the anonymous referee for a very careful reading of the paper and for providing many useful comments. Most of this paper was written while the author held a Junior Fellowship at the Mittag-Leffler institute for mathematical research during the Winter of 2021. He would like to thank the institute for its support.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
During the writing of this paper the author was supported by a CRM-ISM fellowship from the Centre de Recherches Mathématiques, as well as a Junior Fellowship from the Mittag-Leffler Institute.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Mangerel, A.P. Additive functions in short intervals, gaps and a conjecture of Erdős. Ramanujan J 59, 1023–1090 (2022). https://doi.org/10.1007/s11139-022-00623-y