1 Introduction

1.1 The problem

This paper grew out of an empirical observation by one of us (Z.T.): there are large graphs whose vertex degrees are consecutive members of the sequence of prime gaps. Moreover, such graphs can be generated recursively by the so-called degree preserving growth process [16]. To turn the observation into precise mathematical statements, we introduce the following definition.

Definition

Let \(p_n\) denote the n-th prime number, and let \(p_0=1\). We call a simple graph on \(n\geqslant 2\) vertices a prime gap graph if its vertex degrees are \(p_1-p_0,\dotsc ,p_n-p_{n-1}\).

Conjecture 1.1

(Toroczkai, 2016) For every \(n\geqslant 2\), there exists a prime gap graph on n vertices.

Conjecture 1.2

(Toroczkai, 2016) In every prime gap graph on n vertices, there exist \((p_{n+1}-p_n)/2\) independent edges.

In fact, as will become clear in the next subsection, Conjecture 1.2 implies Conjecture 1.1. By combining techniques from analytic number theory and matching theory, we are able to almost fully settle these conjectures.

Theorem 1.1

Conjectures 1.1 and 1.2 are true for every sufficiently large n. Assuming the Riemann hypothesis, they are true for every \(n\geqslant 2\).

The main input from analytic number theory is an upper bound on the sum of large prime gaps. The main input from matching theory is Vizing’s theorem on edge colorings. See Subsection 1.3 for more details. In the next subsection, we discuss the background and motivation for Theorem 1.1.

1.2 Broader context

Networks are powerful, graph-based representations used in the study of complex systems. They appear in systems ranging from elementary particle interactions, through nucleosynthesis, chemistry, biology (gene interactions, protein interactions, metabolism, physiology), social sciences (human interactions), infrastructures (transportation, power grid, etc.), ecology (food webs) and climate, to the organization of visible and dark matter in the universe. In this paper, we report on a novel family of networks, this time arising in number theory.

In this paper all graphs are simple: there are no parallel edges or loops. The most common characteristic of a graph is its degree sequence: we equip each of the n vertices with a unique label from \(\{1,\dotsc ,n\}\), and an integer vector \(\textbf{D}=(d_1,\dotsc ,d_n)\) lists the degrees of the corresponding labeled vertices, that is, the number of edges incident to each vertex. In chemistry, and in old-fashioned graph theory, this is called valency.

The inverse problem is the following: we are given a sequence \(\textbf{D}\) of nonnegative integers, and we want to know whether there exists a graph with this ensemble as a degree sequence. When the answer is affirmative, then we call the sequence \(\textbf{D}\) graphic. Clearly, if the sequence is graphic, then the sum of its members must be even. However, it is not self-evident whether a given sequence is graphic. The most well-known characterization of graphic degree sequences is the following theorem:

Theorem 1.2

(Erdős–Gallai [7]) Let \(d_1\geqslant \cdots \geqslant d_n\geqslant 0\) be integers. Then the sequence \((d_1,\dotsc ,d_n)\) is graphic if and only if \(d_1+\cdots +d_n\) is even and for every \(k\in \{1,\dotsc ,n\}\) we have

$$\begin{aligned} \sum _{\ell =1}^k d_\ell \leqslant k(k-1) + \sum _{\ell =k+1}^n \min (k,d_\ell )\;. \end{aligned}$$
(1)
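As an illustration, the criterion of Theorem 1.2 can be checked directly. The following is a minimal Python sketch; the function name `is_graphic` is ours and purely illustrative. It sorts the input in non-increasing order and tests the parity condition together with the inequalities (1).

```python
def is_graphic(degrees):
    """Erdős–Gallai test: True iff `degrees` is the degree sequence of a simple graph."""
    d = sorted(degrees, reverse=True)          # d_1 >= ... >= d_n
    n = len(d)
    if any(x < 0 for x in d) or sum(d) % 2 == 1:
        return False
    for k in range(1, n + 1):
        lhs = sum(d[:k])                                      # d_1 + ... + d_k
        rhs = k * (k - 1) + sum(min(k, x) for x in d[k:])     # right-hand side of (1)
        if lhs > rhs:
            return False
    return True


if __name__ == "__main__":
    print(is_graphic([1, 1, 2, 2, 4, 2]))   # first six prime gaps, with the prefix 1
    print(is_graphic([3, 3, 1, 1]))         # even sum, but (1) fails for k = 2
```

The sketch recomputes the partial sums for every k, so it runs in quadratic time; this is sufficient for small experiments.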

It is well-understood that there are exponentially many different realizations for almost every graphic degree sequence. At the same time, the number of all graphic degree sequences is infinitesimal compared to the number of integer partitions of the sum of the degrees. More precisely, let 2m denote the sum of the degrees (as usual), so that the degree sequence is a partition of 2m. By a difficult result of Pittel [23], as m tends to infinity, the probability of a random partition of 2m being graphic is zero in the limit.

Theorem 1.2 gives us a means to decide whether the degree sequence is graphic. It is, however, an entirely different problem to actually construct a realization of a graphic degree sequence. The simplest way to do that is via the Havel–Hakimi algorithm, which, in turn, is based on the following observation:

Theorem 1.3

(Havel [12] and Hakimi [11]) Let \(d_1 > 0\) and \(d_2 \geqslant \cdots \geqslant d_n \geqslant 0\) be integers. Then \((d_1,\dotsc ,d_n)\) is graphic if and only if \((d_2-1,\dotsc ,d_{d_1+1}-1,d_{d_1+2},\dotsc ,d_n)\) is graphic.
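A realization can be built by applying Theorem 1.3 repeatedly. The sketch below uses one common variant, in which the vertex of largest residual degree is processed at each step; the function name `havel_hakimi` and the 0-based vertex labels are our own illustrative choices.

```python
def havel_hakimi(degrees):
    """Return a set of edges (u, v), u < v, realizing `degrees`, or None if not graphic."""
    if any(d < 0 for d in degrees):
        return None
    remaining = [[d, v] for v, d in enumerate(degrees)]   # [residual degree, vertex]
    edges = set()
    while True:
        remaining.sort(reverse=True)       # largest residual degree first
        d, v = remaining[0]
        if d == 0:                         # all residual degrees are zero
            return edges
        if d > len(remaining) - 1:         # not enough other vertices
            return None
        for other in remaining[1:d + 1]:   # the d next-largest vertices
            if other[0] == 0:
                return None                # would need a vertex with no capacity left
            other[0] -= 1
            edges.add((min(v, other[1]), max(v, other[1])))
        remaining[0][0] = 0                # vertex v is now saturated


if __name__ == "__main__":
    print(sorted(havel_hakimi([1, 1, 2, 2, 4, 2])))
```

A saturated vertex is never chosen as a neighbor again, so the output contains no repeated edges.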

Assume we are given a long sequence of natural numbers \(\textbf{D}\) such that every initial segment \(\textbf{D}^n\) formed by the first n elements of \(\textbf{D}\) is graphic (we restrict this notion to \(n\geqslant 2\)). This means, in particular, that \(d_n<n\) for \(n\geqslant 2\). We would like to construct a realization \(G_n\) for any \(n\geqslant 2\). Clearly, it can be done by separate applications of the Havel–Hakimi algorithm for every single \(n\geqslant 2\). But this is rather uneconomical: in principle, for each new segment we have to restart the algorithm from scratch. Instead, we want to find a graph growth dynamics (GGD) such that \(G_{n+1}\) can be obtained from \(G_n\) rapidly.

There are only a few GGDs in the network science literature, as most graph construction models are based on static algorithms. Arguably, the most popular GGD is the preferential attachment algorithm of scale-free networks. However, in this and other GGDs, typically, some of the degrees of the vertices in \(G_{n+1}\) are bigger than in the degree sequence of \(G_n\), and thus, they are unsuitable for our purposes.

Recently, a new network growth dynamics has been introduced: the degree-preserving network growth (DPG) model family (see [8] or [16]). The DPG-process can be described as follows: let G be a simple graph with degree sequence \(\textbf{D}\). In what follows, by a matching we mean a set of independent edges in the graph, that is, a set of pairwise non-adjacent edges. In a general step, a new vertex u joins G by removing a \(\nu \)-element matching of G followed by connecting u to the vertices incident to the \(\nu \) removed edges. The degree of the newly inserted vertex is \(d=2\nu \). This step does not join two vertices of G that are non-adjacent; furthermore, the degrees of the vertices in G do not change. The degree sequence of the newly generated graph is \(\textbf{D}\circ d\), that is, d is concatenated to the end of \(\textbf{D}\). This graph operation is called a degree-preserving step (DP-step), and the DPG-process repeats such steps iteratively.
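A single DP-step is easy to express in code. The following Python sketch assumes vertices labeled 0, ..., n-1 and edges stored as pairs (u, v) with u < v; the names `dp_step` and `degree_sequence` are hypothetical helpers introduced only for this illustration.

```python
def dp_step(edges, n, matching):
    """Apply one degree-preserving step to a simple graph on vertices 0..n-1.

    `edges` and `matching` are collections of pairs (u, v) with u < v.
    Returns the edge set of the new graph on vertices 0..n.
    """
    matching = set(matching)
    if not matching <= set(edges):
        raise ValueError("matching must consist of edges of the graph")
    endpoints = [v for e in matching for v in e]
    if len(endpoints) != len(set(endpoints)):
        raise ValueError("matching edges must be pairwise non-adjacent")
    new_vertex = n
    new_edges = set(edges) - matching                        # remove the matching
    new_edges.update((v, new_vertex) for v in endpoints)     # join endpoints to u
    return new_edges


def degree_sequence(edges, n):
    """Degrees of vertices 0..n-1 in the graph with the given edge set."""
    deg = [0] * n
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    return deg


if __name__ == "__main__":
    cycle = {(0, 1), (1, 2), (2, 3), (0, 3)}                 # 4-cycle, degrees (2,2,2,2)
    grown = dp_step(cycle, 4, {(0, 1), (2, 3)})
    print(degree_sequence(grown, 5))                          # old degrees kept, new degree 4
```

In the example, a 2-element matching of the 4-cycle is removed, so the new vertex has degree 4 while the four old vertices keep degree 2.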

Returning to our long sequence \(\textbf{D}\) of natural numbers: if, for each step n, we can find a matching of size \(d_{n+1}/2\) in \(G_n\), then the application of a DP-step provides a realization \(G_{n+1}\) of \(\textbf{D}^{n+1}\). In this case, we say that the pair \((G_n,d_{n+1})\) is DPG-graphic. One may ask why a suitable matching should exist in \(G_n\). Actually, this is not inconceivable, as the following theorem shows:

Theorem 1.4

(Theorem 2.5 in [18]) Given a graphic sequence \(\textbf{D}\) of length n and an even integer \(2\leqslant d \leqslant n\), the sequence \(\textbf{D}\circ d\) is graphic if and only if \(\textbf{D}\) has a realization with a matching of size d/2.

Since every initial segment \(\textbf{D}^n\) is graphic, for each one there exists a “special” realization \(G'_n\) with the requested matching. However, it is not automatic that \(G'_n = G_n\). A natural way to deal with this problem is to add the condition that every realization of \(\textbf{D}^n\) has a matching of size \(d_{n+1}/2\).

We stress that it is not easy to find an infinite, naturally occurring sequence \(\textbf{D}\) whose initial segments are all graphic, or at least graphic beyond a certain point. As a matter of fact, until now we have only known one such example: when all elements in the sequence are equal. Such a GGD describes an ever growing regular graph sequence.

In this paper, we describe for the first time a nontrivial, naturally arising infinite sequence whose initial segments are all graphic. Furthermore, we show that any realization of the initial segments is admissible for the DPG-algorithm. This sequence is the sequence of prime gaps with a prefix 1:

$$\begin{aligned} \textbf{PD}:=(p_1-p_0,p_2-p_1,\dotsc )=(1,1,2,2,4,2,\dotsc ) \end{aligned}$$

The prefix 1 was included to guarantee that the sum of the initial segments is even. The figure below is an illustration of the DPG-process on prime gap graphs. The independent edges used by the DP-steps are red zigzags.

figure a
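The DPG-process on prime gap graphs can also be simulated for small n. The Python sketch below starts from a single edge, which realizes (1, 1), and repeatedly applies DP-steps. The matching is found by a naive backtracking search that is exponential in the worst case, so the sketch is meant for small n only; all function names are illustrative assumptions.

```python
def first_primes(count):
    """Return the first `count` primes by trial division."""
    primes, candidate = [], 2
    while len(primes) < count:
        if all(candidate % p for p in primes if p * p <= candidate):
            primes.append(candidate)
        candidate += 1
    return primes


def find_matching(edges, size):
    """Backtracking search for `size` pairwise disjoint edges; a list, or None."""
    edge_list = sorted(edges)

    def extend(start, used, chosen):
        if len(chosen) == size:
            return chosen
        for i in range(start, len(edge_list)):
            u, v = edge_list[i]
            if u not in used and v not in used:
                result = extend(i + 1, used | {u, v}, chosen + [(u, v)])
                if result is not None:
                    return result
        return None

    return extend(0, frozenset(), [])


def grow_prime_gap_graph(n):
    """Edges of a prime gap graph on vertices 0..n-1 (n >= 2), built by DP-steps."""
    p = [1] + first_primes(n)            # p[0] = 1, p[k] = k-th prime
    edges = {(0, 1)}                     # realization of the degree sequence (1, 1)
    for m in range(2, n):                # grow from m to m + 1 vertices
        d = p[m + 1] - p[m]              # degree of the new vertex (an even prime gap)
        matching = find_matching(edges, d // 2)
        if matching is None:
            raise RuntimeError("no matching of the required size was found")
        edges -= set(matching)
        edges |= {(v, m) for e in matching for v in e}   # new vertex has label m
    return edges


if __name__ == "__main__":
    print(sorted(grow_prime_gap_graph(8)))
```

Vertex i (0-based) of the resulting graph has degree \(p_{i+1}-p_i\), because each DP-step leaves all earlier degrees unchanged.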

The proof that the DPG-process creates an infinite sequence of prime gap graphs incorporates two main ingredients. The first one is a symmetric inequality, which implies the Erdős–Gallai inequalities, and thus provides a practical sufficient condition for the graphicality of the underlying degree sequence. Moreover, another new inequality implies the DPG-graphicality of a long sequence of natural numbers. The second ingredient combines classical \(L^2\) and \(L^\infty \) bounds for prime gaps.

Theorem 1.5

If n is sufficiently large, then the initial segment \(\textbf{PD}^n\) is graphic, and for any realization \(G_n\) of \(\textbf{PD}^n\), the pair \((G_n, p_{n+1}-p_n)\) is DPG-graphic.

Ultimately, the proof relies on the rarity of zeros possibly violating the Riemann hypothesis. In principle, it allows one to deduce an effective constant beyond which Theorem 1.5 holds true, but this constant far exceeds the capabilities of computers. However, assuming the Riemann hypothesis, we can reduce the constant significantly and prove the result for all \(n\geqslant 2\). Here the numerics are quite delicate, and for efficiency we depart from the symmetric treatment alluded to above. Instead, we go back to first principles and examine the contribution of large prime gaps more directly. We still need to rely on computational results, but they can either be obtained with very simple computer programs, or found in the literature (e.g. prime gap records until \(2\cdot 10^{18}\)).

Theorem 1.6

Assume the Riemann hypothesis. Then, for any \(n\geqslant 2\), the initial segment \(\textbf{PD}^n\) is graphic, and for any realization \(G_n\) of \(\textbf{PD}^n\), the pair \((G_n, p_{n+1}-p_n)\) is DPG-graphic.

Note that Theorems 1.5 and 1.6 are a reformulation of Theorem 1.1 in the terminology of combinatorics and network theory. They rely on the core theorems presented in the next subsection, which are of independent interest.

1.3 New results

Our first result provides, via two symmetric inequalities, sufficient conditions for a given sequence to be graphic and for every realization of the sequence to contain a matching of a given size.

Theorem 1.7

Let \(\textbf{D}=(d_1,\dotsc ,d_n)\) be a sequence of positive integers such that \({\Vert \textbf{D}\Vert }_1=\sum _{\ell =1}^n d_\ell \) is even. Let \(1<p\leqslant \infty \) be a parameter.

Part (a). Assume that the following \(L^p\)-norm bound holds:

$$\begin{aligned} {\Vert 2+\textbf{D}\Vert }_p\leqslant n^{\frac{1}{2}+\frac{1}{2p}}. \end{aligned}$$
(2)

Then there is a simple graph G with degree sequence \(\textbf{D}\).

Part (b). Let G be any simple graph with degree sequence \(\textbf{D}\). Assume that \(d\geqslant 2\) is an even integer satisfying

$$\begin{aligned} 4d^{1-\frac{1}{p}}{\Vert \textbf{D}\Vert }_p\leqslant {\Vert \textbf{D}\Vert }_1. \end{aligned}$$
(3)

Then the pair \((G,d)\) is DPG-graphic, and consequently \(\textbf{D}\circ d\) is graphic.
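Conditions (2) and (3) are simple to evaluate numerically. The following Python sketch does so in floating-point arithmetic; the helper names are ours, and the parity assumption on \({\Vert \textbf{D}\Vert }_1\) is not checked by the code.

```python
import math


def lp_norm(values, p):
    """The L^p norm of a finite sequence, with p = math.inf allowed."""
    if p == math.inf:
        return max(abs(v) for v in values)
    return sum(abs(v) ** p for v in values) ** (1 / p)


def satisfies_condition_2(D, p):
    """Condition (2): ||2 + D||_p <= n^(1/2 + 1/(2p))."""
    n = len(D)
    return lp_norm([2 + d for d in D], p) <= n ** (0.5 + 0.5 / p)


def satisfies_condition_3(D, d, p):
    """Condition (3): 4 d^(1 - 1/p) ||D||_p <= ||D||_1."""
    return 4 * d ** (1 - 1 / p) * lp_norm(D, p) <= lp_norm(D, 1)


if __name__ == "__main__":
    D = [1] * 100                       # a perfect matching on 100 vertices
    print(satisfies_condition_2(D, 2))
    print(satisfies_condition_3(D, 4, 2), satisfies_condition_3(D, 8, 2))
```

For the constant sequence of 100 ones and p = 2, condition (3) holds for d = 4 but not for d = 8, which shows that the criterion is sufficient rather than necessary: the only realization is a perfect matching, which certainly contains 4 independent edges.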

Our second result makes explicit a theorem of Selberg [25, Th. 2].

Theorem 1.8

Assume the Riemann hypothesis. Then, for any \(x\geqslant 2\) and \(N>0\), we have

$$\begin{aligned} \sum _{\begin{array}{c} x\leqslant p_\ell \leqslant 2x\\ p_{\ell +1}-p_\ell \geqslant N \end{array}} (p_{\ell +1}-p_\ell ) < \frac{163x\log ^2 x}{N}. \end{aligned}$$
(4)

Remark

The example \(x=2\) and \(N=2\) shows that this result would become false if we replaced the constant 163 by 4. On the other hand, Cramér’s model predicts that it can be replaced by o(1) for \(x\rightarrow \infty \) (cf. [9, §1.1]).
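The example in the remark can be reproduced with a few lines of Python. In the sketch below, `large_gap_sum` computes the left-hand side of (4) and `bound` the right-hand side with an adjustable constant; both names are illustrative. The sieve limit relies on Bertrand's postulate to make sure that the successor of every prime up to 2x is included.

```python
import math


def primes_up_to(limit):
    """Sieve of Eratosthenes returning all primes <= limit."""
    flags = [True] * (limit + 1)
    flags[0:2] = [False, False]
    for i in range(2, math.isqrt(limit) + 1):
        if flags[i]:
            for j in range(i * i, limit + 1, i):
                flags[j] = False
    return [i for i, is_prime in enumerate(flags) if is_prime]


def large_gap_sum(x, N):
    """Left-hand side of (4): sum of gaps >= N following primes p with x <= p <= 2x."""
    primes = primes_up_to(4 * math.ceil(x) + 10)   # contains the prime following 2x
    return sum(q - p for p, q in zip(primes, primes[1:])
               if x <= p <= 2 * x and q - p >= N)


def bound(x, N, constant=163):
    """Right-hand side of (4) with an adjustable leading constant."""
    return constant * x * math.log(x) ** 2 / N


if __name__ == "__main__":
    lhs = large_gap_sum(2, 2)            # only the gap 5 - 3 = 2 contributes
    print(lhs, bound(2, 2, 4), bound(2, 2, 163))
```

For x = 2 and N = 2 the only contributing gap is 5 - 3, so the left-hand side equals 2, while the right-hand side with constant 4 is \(4\log ^2 2\), which is slightly below 2.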

In order to achieve the good numeric constant 163, we estimate carefully (among other things) the error term in the truncated von Mangoldt formula for the Chebyshev psi function. This result, stated below, makes explicit a theorem of Goldston [10], and simultaneously extends and sharpens a theorem of Dudek [6, Th. 1.3] in the special case relevant for us. Here and later the notation \(A=O^*(B)\) stands for \(|A|\leqslant B\).

Theorem 1.9

For any \(z>x>10^{18}\) we have

$$\begin{aligned} \psi (x) = x-\sum _{|\Im \rho |<z}\frac{x^{\rho }}{\rho } + O^*\bigl (5\log x\log \log x\bigr ), \end{aligned}$$

where the sum is over the nontrivial zeros of the Riemann zeta function (counted with multiplicity).

2 Preliminary results

2.1 An application of Vizing’s theorem

Theorem 2.1

(Vizing [27]) A simple graph with maximal degree \(\Delta \) admits a proper edge coloring with \(\Delta +1\) colors.

Lemma 2.1

Let G be a simple graph on n vertices with degrees \(d_1,\dotsc ,d_n\). Let \(\delta \geqslant 1\) be an integer, and let \(d\geqslant 2\) be an even integer satisfying

$$\begin{aligned} \delta d\leqslant \sum _{d_\ell <\delta }d_\ell -\sum _{d_\ell \geqslant \delta }d_\ell . \end{aligned}$$
(5)

Then G has a matching of size d/2.

Proof

Let us delete all vertices of degree at least \(\delta \) (and the incident edges) from G. The remaining subgraph H has maximal degree less than \(\delta \), and number of edges at least

$$\begin{aligned} \frac{1}{2}\sum _{\ell =1}^n d_\ell -\sum _{d_\ell \geqslant \delta }d_\ell = \frac{1}{2}\left( \sum _{d_\ell <\delta }d_\ell -\sum _{d_\ell \geqslant \delta }d_\ell \right) \geqslant \frac{\delta d}{2} \end{aligned}$$
(6)

by (5). It follows from Theorem 2.1 that the edge set of H can be partitioned into \(\delta \) matchings, and then (6) shows that the largest matching in this decomposition must be of size at least d/2. Since H is a subgraph of G, the proof is complete. \(\square \)
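Condition (5) depends on the auxiliary parameter \(\delta \), and one can simply search for a suitable value. The Python sketch below returns the smallest admissible \(\delta \); the name `matching_certificate` is hypothetical. It suffices to try \(\delta \leqslant 1+\max _\ell d_\ell \), since beyond that point the right-hand side of (5) no longer changes while the left-hand side grows.

```python
def matching_certificate(degrees, d):
    """Return the smallest integer delta >= 1 satisfying (5), or None if none exists."""
    for delta in range(1, max(degrees) + 2):
        low = sum(x for x in degrees if x < delta)     # degrees below delta
        high = sum(x for x in degrees if x >= delta)   # degrees at least delta
        if delta * d <= low - high:
            return delta
    return None


if __name__ == "__main__":
    print(matching_certificate([1] * 20, 2))   # ten disjoint edges
    print(matching_certificate([3] * 4, 4))    # complete graph on 4 vertices
```

A return value of None only means that Lemma 2.1 does not apply: the complete graph on 4 vertices has a matching of size 2, although (5) fails for d = 4 and every \(\delta \).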

2.2 Preliminaries about \(\Gamma (z)\) and \(\zeta (s)\)

Lemma 2.2

Assume that \(\Re z>0\). Then

$$\begin{aligned} \Re \frac{\Gamma '(z)}{\Gamma (z)}+\frac{\Gamma '(\Re z)}{\Gamma (\Re z)}< \log |z|+\log \Re z-\Re \frac{1}{2z}-\frac{1}{2\Re z}. \end{aligned}$$

Proof

With the help of the well-known integral representation (cf. [28, §12.31])

$$\begin{aligned} \frac{\Gamma '(z)}{\Gamma (z)}=\log z-\frac{1}{2z} -\int _0^\infty \left( \frac{1}{2}-\frac{1}{t}+\frac{1}{e^t-1}\right) e^{-tz}\,dt,\qquad \Re z >0, \end{aligned}$$
(7)

the statement becomes

$$\begin{aligned} \int _0^\infty \left( \frac{1}{2}-\frac{1}{t}+\frac{1}{e^t-1}\right) \left( \Re e^{-tz}+e^{-t\Re z}\right) \,dt>0. \end{aligned}$$

This is clear, because the integrand is non-negative with a discrete set of zeros (there are no zeros when z is real). \(\square \)

Lemma 2.3

(Delange [5]) For any \(\sigma >1\) and \(t\in \mathbb {R}\) we have

$$\begin{aligned} \left| \frac{\zeta '(\sigma +it)}{\zeta (\sigma +it)}\right| <\frac{1}{\sigma -1}-\frac{1}{2\sigma ^2}. \end{aligned}$$

Lemma 2.4

(Dudek [6]) Let \(\sigma \leqslant -1\) and \(t\in \mathbb {R}\). Assume that either \(\sigma \in 1+2\mathbb {Z}\) or \(|t|\geqslant 1\). Then

$$\begin{aligned} \left| \frac{\zeta '(\sigma +it)}{\zeta (\sigma +it)}\right| <9+\log |\sigma +it|. \end{aligned}$$

Proof

This is a variant of [6, Lem. 2.3], and can be proved in the same way. We note a small oversight in [6, p. 183]: instead of assuming that \(U\geqslant 2\) is an even integer, one should assume that \(U\geqslant 1\) is an odd integer, just as in [4, §17]. \(\square \)

Lemma 2.5

(Dudek [6]) Assume that \(z>100\). Then there exists \(T\in (z-2,z)\) such that

$$\begin{aligned} \left| \frac{\zeta '(\sigma +iT)}{\zeta (\sigma +iT)}\right| <\log ^2 z + 20\log z,\qquad \sigma >-1. \end{aligned}$$
(8)

Proof

The statement follows from [6, Lem. 2.8]. \(\square \)

Definition

For \(T>0\), we denote by N(T) the number of zeros of \(\zeta (s)\) with imaginary part in (0, T), counted with multiplicity.

Lemma 2.6

For any \(T\geqslant \Delta +2\pi>\Delta >0\) we have

$$\begin{aligned} N(T)-N(T-\Delta )<\left( \frac{\Delta }{2\pi }+0.56\right) \log T. \end{aligned}$$

Proof

By [2, Cor. 1], we have

$$\begin{aligned} N(T)-N(T-\Delta )=\frac{1}{2\pi }\int _{T-\Delta }^T \log \frac{t}{2\pi }\, dt +O^*(0.56\log T). \end{aligned}$$

The integrand is less than \(\log T\), and the result follows. \(\square \)

2.3 Preliminaries about prime gaps

Theorem 2.2

(Ingham [15, Th. 4]) Let \(\varepsilon >0\). For any \(x\geqslant x_0(\varepsilon )\), there is a prime number in \([x,x+x^{5/8+\varepsilon }]\).

Remark

The exponent \(5/8+\varepsilon \) was improved multiple times, the current record 21/40 being due to Baker–Harman–Pintz [1]. We have emphasized the classical result of Ingham [15] as it suffices for our purposes.

Theorem 2.3

(Carneiro–Milinovich–Soundararajan [3, Th. 1.5]) Assume the Riemann hypothesis. Then, for any \(x\geqslant 4\), there is a prime number in \([x,x+\frac{22}{25}\sqrt{x}\log x]\).

In a restricted range, we have a stronger unconditional result thanks to explicit calculations.

Lemma 2.7

For any \(x\in [117,10^{18}]\), there is a prime number in \([x,x+\sqrt{x}]\).

Proof

Assume that the conclusion fails for some \(x\in [117,10^{18}]\). Then there is a unique prime number \(p_\ell \) such that \(p_\ell<x<x+\sqrt{x}<p_{\ell +1}\). In particular, \(\ell \geqslant 31\) and \(p_\ell<x<{(p_{\ell +1}-p_\ell )}^2\). Hence the computations of Oliveira e Silva, Herzog, and Pardi [21, Table 8] show that the initial upper bound \({10}^{18}\) for \(p_\ell \) successively improves to: \(1442^2\), \(148^2\), \(52^2\), \(34^2\), \(22^2\), \(14^2\). This means that \(31\leqslant \ell \leqslant 44\), but then \(x<{(p_{\ell +1}-p_\ell )}^2\leqslant 10^2\) is a contradiction. \(\square \)
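The statement of Lemma 2.7 can be tested on a small initial range. For consecutive primes \(p<q\), some \(x\in (p,q)\) violates the conclusion exactly when \(p+\sqrt{p}<q\), that is, when \((q-p)^2>p\). The following Python sketch lists such primes p below a given limit; the function names are illustrative, and the computation is of course no substitute for the tables of [21].

```python
def primes_up_to(limit):
    """Sieve of Eratosthenes returning all primes <= limit."""
    sieve = bytearray([1]) * (limit + 1)
    sieve[0:2] = b"\x00\x00"
    for i in range(2, int(limit ** 0.5) + 1):
        if sieve[i]:
            sieve[i * i :: i] = bytearray(len(range(i * i, limit + 1, i)))
    return [i for i, flag in enumerate(sieve) if flag]


def sqrt_gap_violations(limit, start=127):
    """Primes p >= start (with successor q <= limit) such that (q - p)^2 > p."""
    primes = primes_up_to(limit)
    return [p for p, q in zip(primes, primes[1:])
            if p >= start and (q - p) ** 2 > p]


if __name__ == "__main__":
    print(sqrt_gap_violations(10 ** 5))            # primes from 127 on
    print(sqrt_gap_violations(10 ** 5, start=2))   # including the small primes
```

The second call shows the small exceptional primes, the largest of which is 113 (followed by 127). This explains the lower endpoint in the lemma, since \(117+\sqrt{117}>127\).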

Remark

The conclusion of Lemma 2.7 is likely true for all \(x\geqslant 117\). However, this statement is not known to follow from the Riemann hypothesis, and it is stronger than Oppermann’s conjecture (which itself implies Legendre’s conjecture, Andrica’s conjecture, and Brocard’s conjecture).

Theorem 2.4

(Heath-Brown [13]) For any \(x\geqslant 2\) we have

$$\begin{aligned} \sum _{p_\ell \leqslant x}(p_{\ell +1}-p_\ell )^2\ll x^{4/3}(\log x)^{10000}. \end{aligned}$$

Remark

The exponent 4/3 was improved to \(23/18+\varepsilon \) by Heath-Brown [14], and to \(5/4+\varepsilon \) independently by Peck [22] and Maynard [17]. The current record \(123/100+\varepsilon \) is due to Stadlmann [26]. We have emphasized the original breakthrough of Heath-Brown [13] as it suffices for our purposes.

3 Proof of the main theorem

In this section, we first prove Theorem 1.5 assuming Theorem 1.7, and then we prove Theorem 1.6 assuming Theorem 1.8. In other words, we deduce Theorem 1.1 from Theorems 1.7 and 1.8.

3.1 Proof of Theorem 1.5

Let n be sufficiently large. We shall verify the conditions of Theorem 1.7 for

$$\begin{aligned} p:=2,\qquad d_\ell :=p_\ell -p_{\ell -1},\qquad d:=p_{n+1}-p_n. \end{aligned}$$

Clearly, \({\Vert \textbf{D}\Vert }_1=p_n-1\) is even. Condition (2) reads

$$\begin{aligned} \sum _{\ell =1}^n (2+p_\ell -p_{\ell -1})^2\leqslant n^{3/2}, \end{aligned}$$
(9)

while condition (3) reads

$$\begin{aligned} 16(p_{n+1}-p_n)\sum _{\ell =1}^n (p_\ell -p_{\ell -1})^2\leqslant (p_n-1)^2. \end{aligned}$$
(10)

Now (9) and (10) follow from Theorems 2.2 and 2.4, hence we are done:

$$\begin{aligned}{} & {} \sum _{\ell =1}^n (2+p_\ell -p_{\ell -1})^2 \leqslant 9\sum _{\ell =1}^n (p_\ell -p_{\ell -1})^2\leqslant p_n^{4/3+o(1)}=n^{4/3+o(1)},\\ {}{} & {} 16(p_{n+1}-p_n)\sum _{\ell =1}^n (p_\ell -p_{\ell -1})^2 \leqslant p_n^{5/8+o(1)}p_n^{4/3+o(1)}=p_n^{47/24+o(1)}. \end{aligned}$$

3.2 Proof of Theorem 1.6

Assume the Riemann hypothesis, and let G be a prime gap graph on n vertices. It suffices to show that G has \((p_{n+1}-p_n)/2\) independent edges (cf. Conjecture 1.2), because then a straightforward induction argument based on Theorem 1.4 shows that every initial segment of \(\textbf{PD}\) is graphic (cf. Conjecture 1.1). The statement is clear for \(2\leqslant n\leqslant 4\), hence we shall restrict to \(n\geqslant 5\).

By Lemma 2.1, it suffices to exhibit an integer \(N\geqslant 1\) satisfying

$$\begin{aligned} N(p_{n+1}-p_n)+2\!\!\!\!\!\!\sum _{\begin{array}{c} 1\leqslant \ell \leqslant n\\ p_\ell -p_{\ell -1}\geqslant N \end{array}}(p_\ell -p_{\ell -1})< p_n. \end{aligned}$$
(11)

For \(p_n<10^{18}\) we take

$$\begin{aligned} N:=\max _{1\leqslant \ell \leqslant n}(1+p_\ell -p_{\ell -1}), \end{aligned}$$

so that (11) simplifies to \(N(p_{n+1}-p_n)<p_n\). In fact the proof of Lemma 2.1 reveals that the last condition can be relaxed to

$$\begin{aligned} \frac{p_{n+1}-p_n}{2}\leqslant \left\lceil \frac{p_n-1}{2 N}\right\rceil , \end{aligned}$$
(12)

which works better for very small \(n\geqslant 5\). For \(5\leqslant n\leqslant 44\) the condition (12) can be checked by a simple computer program (or by hand). For \(n\geqslant 45\) and \(p_n<10^{18}\) we verify (12) as follows. Let k be the unique positive integer satisfying \((k-1)^2<p_n<k^2\). Note that \(k\geqslant 15\), because \(p_n\geqslant p_{45}=197\). From Lemma 2.7 it follows that \(p_{n+1}-p_n\leqslant k-1\) and

$$\begin{aligned} N=\max \left( 15,\max _{32\leqslant \ell \leqslant n}(1+p_\ell -p_{\ell -1})\right) \leqslant k, \end{aligned}$$

hence also that

$$\begin{aligned} \frac{p_n-1}{2N}>\frac{k^2-2k}{2N}\geqslant \frac{k^2-2k}{2k}=\frac{k}{2}-1. \end{aligned}$$

Therefore, (12) is clear by

$$\begin{aligned} \frac{p_{n+1}-p_n}{2}\leqslant \frac{k-1}{2}\leqslant \left\lceil \frac{p_n-1}{2N}\right\rceil . \end{aligned}$$
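The finite verification of (12) for \(5\leqslant n\leqslant 44\) mentioned above can be carried out, for instance, with the following short Python program. It is a sketch with illustrative function names, and it uses exact integer arithmetic for the ceiling.

```python
def first_primes(count):
    """Return the first `count` primes by trial division."""
    primes, candidate = [], 2
    while len(primes) < count:
        if all(candidate % p for p in primes if p * p <= candidate):
            primes.append(candidate)
        candidate += 1
    return primes


def condition_12(n, p):
    """Check (12) for a given n >= 2; here p[0] = 1 and p[k] is the k-th prime."""
    N = max(1 + p[l] - p[l - 1] for l in range(1, n + 1))
    d = p[n + 1] - p[n]                         # even, because n >= 2
    ceiling = -(-(p[n] - 1) // (2 * N))         # integer ceiling of (p_n - 1) / (2N)
    return d // 2 <= ceiling


if __name__ == "__main__":
    p = [1] + first_primes(46)
    print(all(condition_12(n, p) for n in range(5, 45)))
    print(condition_12(4, p))                   # the case n = 4 is handled separately
```

The condition fails for n = 4, which is consistent with the cases \(2\leqslant n\leqslant 4\) being treated separately in the proof.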

For \(p_n>10^{18}\) we take

$$\begin{aligned} N:=\left\lceil \frac{\sqrt{p_n}}{3\log p_n}\right\rceil . \end{aligned}$$

Then, by Theorems 2.3 and 1.8, we have

$$\begin{aligned} N(p_{n+1}-p_n)<\frac{p_n}{3}\qquad \text {and}\qquad \sum _{\begin{array}{c} 1\leqslant \ell \leqslant n\\ p_\ell -p_{\ell -1}\geqslant N \end{array}}(p_\ell -p_{\ell -1})<489\sqrt{p_n}\log ^3 p_n<\frac{p_n}{3}. \end{aligned}$$

From here the bound (11) is immediate, hence we are done.

4 A symmetric criterion for graphicality

In this section, we prove Theorem 1.7.

Part (a). By symmetry, we can assume that \(d_1\geqslant \cdots \geqslant d_n\). By Theorem 1.2, it suffices to check that for any \(1\leqslant k\leqslant n\),

$$\begin{aligned} \sum _{\ell =1}^k d_\ell \leqslant k(k-1) + \sum _{\ell =k+1}^n \min (k,d_\ell ). \end{aligned}$$

Since \(d_\ell \geqslant 1\) for any \(1\leqslant \ell \leqslant n\), it suffices to prove that

$$\begin{aligned} \sum _{\ell =1}^k d_\ell \leqslant k(k-1) + (n-k), \end{aligned}$$

which is equivalent to

$$\begin{aligned} \sum _{\ell =1}^k (2+d_\ell ) \leqslant k^2+n. \end{aligned}$$

This last condition follows from (2) and Hölder’s inequality, hence we are done:

$$\begin{aligned} {\Vert 2+\textbf{D}^k\Vert }_1 \leqslant k^{1-\frac{1}{p}}{\Vert 2+\textbf{D}^k\Vert }_p\leqslant k^{1-\frac{1}{p}}{\Vert 2+\textbf{D}\Vert }_p \leqslant k^{1-\frac{1}{p}} n^{\frac{1}{2}+\frac{1}{2p}}\leqslant \max (k^2,n). \end{aligned}$$

In the last step, we used that both \(k^2\) and n are upper bounded by \(\max (k^2,n)\).

Part (b). By Theorem 1.4 and Lemma 2.1, it suffices to verify that (5) holds for some integer \(\delta \geqslant 1\). If \(p=\infty \), then (3) says that \(4d{\Vert \textbf{D}\Vert }_\infty \leqslant {\Vert \textbf{D}\Vert }_1\), hence (5) holds for \(\delta :=1+{\Vert \textbf{D}\Vert }_\infty \). So let us focus on the case \(1<p<\infty \). For any integer \(\delta \geqslant 1\), we have

$$\begin{aligned} {\Vert \textbf{D}\Vert }_p^p\geqslant \sum _{d_\ell \geqslant \delta }d_\ell ^p\geqslant \delta ^{p-1}\sum _{d_\ell \geqslant \delta }d_\ell , \end{aligned}$$

hence also

$$\begin{aligned} \sum _{d_\ell <\delta }d_\ell -\sum _{d_\ell \geqslant \delta }d_\ell \geqslant {\Vert \textbf{D}\Vert }_1-2\delta ^{1-p}{\Vert \textbf{D}\Vert }_p^p. \end{aligned}$$

So for the validity of (5), it suffices that

$$\begin{aligned} \delta ^{1-p}{\Vert \textbf{D}\Vert }_p^p\leqslant \frac{1}{4}{\Vert \textbf{D}\Vert }_1\qquad \text {and}\qquad \delta d\leqslant \frac{1}{2}{\Vert \textbf{D}\Vert }_1. \end{aligned}$$

In other words, it suffices to find an integer \(\delta \) satisfying

$$\begin{aligned} \left( \frac{4{\Vert \textbf{D}\Vert }_p^p}{{\Vert \textbf{D}\Vert }_1}\right) ^\frac{1}{p-1}\leqslant \delta \leqslant \frac{1}{2d}{\Vert \textbf{D}\Vert }_1. \end{aligned}$$

The left-hand side exceeds 1, hence \(\delta \) exists as long as

$$\begin{aligned} 2\left( \frac{4{\Vert \textbf{D}\Vert }_p^p}{{\Vert \textbf{D}\Vert }_1}\right) ^\frac{1}{p-1}\leqslant \frac{1}{2d}{\Vert \textbf{D}\Vert }_1. \end{aligned}$$

This is equivalent to condition (3), hence the proof of Theorem 1.7 is complete.
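For the reader's convenience, the equivalence with condition (3) used in the last step can be spelled out as follows. Multiplying both sides by 2d, raising to the power \(p-1\), and finally taking p-th roots, we get

$$\begin{aligned} 2\left( \frac{4{\Vert \textbf{D}\Vert }_p^p}{{\Vert \textbf{D}\Vert }_1}\right) ^\frac{1}{p-1}\leqslant \frac{1}{2d}{\Vert \textbf{D}\Vert }_1&\iff (4d)^{p-1}\cdot 4{\Vert \textbf{D}\Vert }_p^p\leqslant {\Vert \textbf{D}\Vert }_1^p\\&\iff 4^p d^{p-1}{\Vert \textbf{D}\Vert }_p^p\leqslant {\Vert \textbf{D}\Vert }_1^p \iff 4d^{1-\frac{1}{p}}{\Vert \textbf{D}\Vert }_p\leqslant {\Vert \textbf{D}\Vert }_1. \end{aligned}$$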

5 The sum of large prime gaps

In this section, we prove Theorem 1.8 assuming Theorem 1.9. Throughout, we assume the Riemann hypothesis.

First we eliminate some simple cases. Let \(N^*\) denote the largest prime gap \(p_{\ell +1}-p_\ell \) occurring in (4). Then we can clearly assume that

$$\begin{aligned} N\leqslant N^*\qquad \text {and}\qquad 2x+N^*>\frac{163x\log ^2 x}{N}, \end{aligned}$$
(13)

hence also that \(N^*(2x+N^*)>163x\log ^2 x\). From Theorem 2.3 it follows that \(N^*<3\sqrt{x}\log x\), so our previous inequality yields \(N^*>77\log ^2 x\). By [21, Table 8], this forces \(x>10^{18}\). Indeed, for \(x\in [2,10^3]\) we have \(N^*\leqslant 34\), while for \(x\in [10^3,10^{18}]\) we have \(N^*\leqslant 1476\). On the other hand, for \(x>10^{18}\) we get from Theorem 2.3 that \(N^*<\frac{4}{3}\sqrt{x}\log x<0.001 x\), hence by (13) also that

$$\begin{aligned} 81\log ^2 x<N<\frac{4}{3}\sqrt{x}\log x. \end{aligned}$$
(14)

From now on we assume both \(x>10^{18}\) and (14). Following Heath-Brown [13], we write \(N=4\delta x\) with

$$\begin{aligned} \frac{81\log ^2 x}{4x}<\delta <\frac{\log x}{3\sqrt{x}}, \end{aligned}$$

and we set out to estimate the square mean of

$$\begin{aligned} E(y,\delta ) := \psi (y+\delta y)-\psi (y)-\delta y,\qquad x\leqslant y\leqslant 2x. \end{aligned}$$

It follows from Theorem 1.9 and the crude bound \(y+\delta y<3x\) that

$$\begin{aligned} |E(y,\delta )|<\left| \sum _{|\Im \rho |<3x} y^{\rho }C(\rho )\right| + \log ^2 x, \end{aligned}$$

where

$$\begin{aligned} C(\rho ):=\frac{1-(1+\delta )^{\rho }}{\rho }. \end{aligned}$$

As a result,

$$\begin{aligned} \int _x^{2x} |E(y,\delta )|^2\, dy< 2\int _x^{2x}\left| \sum _{|\Im \rho |<3x} y^{\rho }C(\rho )\right| ^2\,dy+2x\log ^4 x. \end{aligned}$$
(15)

Lemma 5.1

We have

$$\begin{aligned} |C(\rho )|<\min \left( \delta ,\frac{\sqrt{4+4\delta }}{|\rho |}\right) . \end{aligned}$$
(16)

Proof

The bound \(|C(\rho )|<\delta \) is a consequence of the integral representation

$$\begin{aligned} C(\rho )=\int _{1+\delta }^1 x^{\rho -1}\,dx \end{aligned}$$

and the triangle inequality for complex-valued Riemann integrals. In addition, the triangle inequality for complex numbers yields by the definition of \(C(\rho )\) that

$$\begin{aligned} |C(\rho )|\leqslant \frac{1+\sqrt{1+\delta }}{|\rho |}<\frac{\sqrt{4+4\delta }}{|\rho |}. \end{aligned}$$

\(\square \)
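The bound (16) can be sanity-checked numerically at the first few zeros on the critical line. The Python sketch below uses the well-known approximate ordinates 14.1347, 21.0220 and 25.0109 of the first three zeros; the function names are illustrative, and floating-point evaluation is adequate only for moderate values of \(\delta \).

```python
import math

# Approximate ordinates of the first three nontrivial zeros of the zeta function.
ZERO_ORDINATES = (14.134725141734693, 21.022039638771554, 25.010857580145688)


def C(rho, delta):
    """The quantity C(rho) = (1 - (1 + delta)^rho) / rho."""
    return (1 - (1 + delta) ** rho) / rho


def lemma_5_1_bound(rho, delta):
    """Right-hand side of (16)."""
    return min(delta, math.sqrt(4 + 4 * delta) / abs(rho))


if __name__ == "__main__":
    for gamma in ZERO_ORDINATES:
        rho = complex(0.5, gamma)
        for delta in (1e-3, 0.1, 1.0, 10.0):
            print(gamma, delta, abs(C(rho, delta)), lemma_5_1_bound(rho, delta))
```

For small \(\delta \) the first term of the minimum is the relevant one, while for large \(\delta \) the second term takes over.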

We estimate the integral on the right-hand side of (15) as in the proof of [24, Lem. 5]:

$$\begin{aligned}{} & {} \int _x^{2x}\left| \sum _{|\Im \rho |<3x} y^{\rho }C(\rho )\right| ^2\,dy \leqslant \int _1^2\int _{xv/2}^{2xv}\left| \sum _{|\Im \rho |<3x} y^{\rho }C(\rho )\right| ^2\,dy\,dv\\{} & {} \quad =\sum _{|\Im \rho |,|\Im \rho '|<3x}x^{2+\rho -\rho '}C(\rho )\overline{C(\rho ')}\cdot \frac{2^{2+\rho -\rho '}-2^{-2-\rho +\rho '}}{2+\rho -\rho '}\cdot \frac{2^{3+\rho -\rho '}-1}{3+\rho -\rho '}\\{} & {} \quad \leqslant x^2\sum _{|\Im \rho |,|\Im \rho '|<3x}|C(\rho )C(\rho ')| \left| \frac{2^2+2^{-2}}{2+\rho -\rho '}\right| \left| \frac{2^3+1}{3+\rho -\rho '}\right| \\{} & {} \quad \leqslant x^2\sum _{|\Im \rho |,|\Im \rho '|<3x}|C(\rho )|^2 \left| \frac{2^2+2^{-2}}{2+\rho -\rho '}\right| \left| \frac{2^3+1}{3+\rho -\rho '}\right| . \end{aligned}$$

Here the contribution of \(\Im \rho <0\) is the same as the contribution of \(\Im \rho >0\). Therefore, applying Lemma 5.1 along with the elementary inequality

$$\begin{aligned} |2+it|\cdot |3+it|\geqslant 6+t^2,\qquad t\in \mathbb {R}, \end{aligned}$$

we arrive at

$$\begin{aligned} \int _x^{2x}\left| \sum _{|\Im \rho |<3x} y^{\rho }C(\rho )\right| ^2\,dy< \frac{153}{2} x^2\sum _{0<\Im \rho<3x}\min \left( \delta ^2,\frac{4+4\delta }{|\rho |^2}\right) \sum _{|\Im \rho '|<3x}\frac{1}{6+|\rho -\rho '|^2}. \end{aligned}$$
(17)

We need to estimate the inner sum in (17). The idea is to drop the condition \(|\Im \rho '|<3x\), and consider the full convergent series

$$\begin{aligned} \sum _{\rho '}\frac{1}{6+|\rho -\rho '|^2} =\frac{1}{\sqrt{6}}\sum _{\rho '}\Re \frac{1}{\sqrt{6}+\rho -\rho '} =\frac{1}{\sqrt{6}}\Re \frac{\xi '(\sqrt{6}+\rho )}{\xi (\sqrt{6}+\rho )}. \end{aligned}$$

The first equality follows from the Riemann hypothesis, while the second equality follows from [20, Cor. 10.14]. Let us write \(s:=\sqrt{6}+\rho \) for simplicity. Then, as in the proof of [20, Cor. 10.14], we have

$$\begin{aligned} \frac{\xi '(s)}{\xi (s)}=-\frac{1}{2}\log \pi +\frac{1}{s-1}+\frac{\zeta '(s)}{\zeta (s)}+ \frac{1}{2}\frac{\Gamma '(s/2+1)}{\Gamma (s/2+1)}. \end{aligned}$$

We take the real part of both sides, and apply Lemma 2.2:

$$\begin{aligned} \Re \frac{\xi '(s)}{\xi (s)}<&-\frac{1}{2}\log \pi +\Re \frac{1}{s-1}-\frac{\zeta '(\Re s)}{\zeta (\Re s)} +\frac{1}{2}\Re \frac{\Gamma '(s/2+1)}{\Gamma (s/2+1)}\\ <&-\frac{1}{2}\log \pi +\Re \frac{1}{s-1}-\Re \frac{1}{2s+4}-\frac{\zeta '(\Re s)}{\zeta (\Re s)}-\frac{1}{2\Re s+4}\\&-\frac{1}{2}\frac{\Gamma '(\Re s/2+1)}{\Gamma (\Re s/2+1)} +\frac{1}{2}\log (\Re s/2+1)+\frac{1}{2}\log |s/2+1|. \end{aligned}$$

Using that \(\Re s\!=\!\sqrt{6}+1/2\) and \(\Im s>14\), it is straightforward to check that \(\Re \frac{1}{s-1}<\Re \frac{1}{2s+4}\), hence in fact

$$\begin{aligned} \Re \frac{\xi '(s)}{\xi (s)}<0.181-\frac{1}{2}\log \pi +\frac{1}{2}\log |s/2+1| <\frac{1}{4}+\frac{1}{2}\log \frac{\Im \rho }{2\pi }. \end{aligned}$$

To sum up, we have proved that

$$\begin{aligned} \sum _{\rho '}\frac{1}{6+|\rho -\rho '|^2}<\frac{1}{2\sqrt{6}}\left( \frac{1}{2}+\log \frac{\Im \rho }{2\pi }\right) . \end{aligned}$$

Going back to (17), we conclude that

$$\begin{aligned} \int _x^{2x}\left| \sum _{|\Im \rho |<3x} y^{\rho }C(\rho )\right| ^2\,dy<15.616 x^2\sum _{0<\Im \rho <3x}\min \left( \delta ^2,\frac{4}{|\rho |^2}\right) \left( \frac{1}{2}+\log \frac{\Im \rho }{2\pi }\right) . \end{aligned}$$

In the sum on the right-hand side, we drop the condition \(\Im \rho <3x\) and replace \(|\rho |\) by \(\Im \rho \). By [2, Cor. 1, Lem. 5–6], the resulting bigger sum can be estimated as follows:

$$\begin{aligned}&<\delta ^2N(2/\delta )\left( \frac{1}{2}+\log \frac{1}{\delta \pi }\right) + \sum _{\Im \rho \geqslant 2/\delta }\left( \frac{2}{(\Im \rho )^2}+\frac{4}{(\Im \rho )^2}\log \frac{\Im \rho }{2\pi }\right) \\&<\frac{\delta }{\pi }\left( \log \frac{1}{\delta \pi }\right) \left( \frac{1}{2}+\log \frac{1}{\delta \pi }\right) + \frac{\delta }{2\pi }\log \frac{2}{\delta }+ \frac{\delta }{\pi }\left( \log ^2\frac{2}{\delta }-\log \frac{2}{\delta }\right) \\&<\frac{\delta }{\pi }\left( \log ^2\frac{1}{\delta \pi }+\log ^2\frac{2}{\delta }\right) <\frac{2\delta }{\pi }\log ^2 x. \end{aligned}$$

In the end, we get

$$\begin{aligned} \int _x^{2x}\left| \sum _{|\Im \rho |<3x} y^{\rho }C(\rho )\right| ^2\,dy <9.942\delta x^2\log ^2 x. \end{aligned}$$

Plugging this bound into (15), we conclude that

$$\begin{aligned} \int _x^{2x} |E(y,\delta )|^2\, dy< 19.884\delta x^2\log ^2 x + 2x\log ^4 x < 19.983\delta x^2\log ^2 x. \end{aligned}$$
(18)

Assume now that the prime \(p_\ell \in [x,2x]\) satisfies \(p_{\ell +1}-p_\ell \geqslant N\). There is at most one \(p_\ell \) such that \((p_\ell +p_{\ell +1})/2>2x\), so assume also that \((p_\ell +p_{\ell +1})/2\leqslant 2x\). Then, for any

$$\begin{aligned} y\in (p_\ell ,(p_\ell +p_{\ell +1})/2)\subset (x,2x), \end{aligned}$$

the interval

$$\begin{aligned}{}[y,y+\delta y]\subset [y,y+N/2]\subset (p_\ell ,p_{\ell +1}) \end{aligned}$$

is free of primes, hence counting the possible higher prime powers in this interval, we get

$$\begin{aligned} \psi (y+\delta y)-\psi (y)\leqslant (1+\delta \sqrt{y}/2)\log _2(y+\delta y)<(2+\delta \sqrt{y})\log x<0.003\delta y. \end{aligned}$$

That is, \(|E(y,\delta )|>0.997\delta y\) holds on \((p_\ell ,(p_\ell +p_{\ell +1})/2)\). Squaring and integrating, we get

$$\begin{aligned} \int _{p_\ell }^{(p_\ell +p_{\ell +1})/2}|E(y,\delta )|^2\,dy > 0.497\delta ^2 x^2(p_{\ell +1}-p_\ell ). \end{aligned}$$

Summing over all such primes \(p_\ell \), and using (18) as well as Theorem 2.3 for the possible single exceptional \(p_\ell \), we obtain (4):

$$\begin{aligned} \sum _{\begin{array}{c} x\leqslant p_\ell \leqslant 2x\\ p_{\ell +1}-p_\ell \geqslant N \end{array}} (p_{\ell +1}-p_\ell )< \frac{4}{3}\sqrt{x}\log x+\frac{1}{0.497\delta ^2 x^2}\int _x^{2x} |E(y,\delta )|^2\,dy < \frac{163x\log ^2 x}{N}. \end{aligned}$$
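The numerical constants in the last steps can be re-derived mechanically. The sketch below is a check under one assumption of ours: the value of \(\delta\) is fixed outside this excerpt, and we take \(\delta =N/(4x)\), the largest value compatible with the inclusion \([y,y+\delta y]\subset [y,y+N/2]\) for \(y<2x\). All variable names are hypothetical.

```python
import math

# Constants quoted in the text.
c_zero_sum = 15.616   # factor in front of the sum over zeros
c_integral = 9.942    # constant after inserting the bound (2*delta/pi)*log^2 x
c_18 = 19.983         # constant in (18)

# 15.616 * (2/pi) should not exceed 9.942, and 2 * 9.942 gives 19.884.
step_zero_sum = c_zero_sum * 2 / math.pi
step_doubling = 2 * c_integral

# |E(y,delta)| > 0.997*delta*y with y > x, integrated over an interval of
# length (p_{l+1} - p_l)/2, yields the factor 0.997^2 / 2.
step_lower_bound = 0.997 ** 2 / 2

# ASSUMPTION (ours): delta = N/(4x). Then
#   (1/(0.497*delta^2*x^2)) * 19.983*delta*x^2*log^2 x = main_const * x*log^2 x / N.
main_const = 4 * c_18 / 0.497

print(step_zero_sum, step_doubling, step_lower_bound, main_const)
```

Under this assumption the integral term contributes a constant below 161, so the stated constant 163 leaves room for the term \(\frac{4}{3}\sqrt{x}\log x\).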

6 The error term in the truncated von Mangoldt formula

In this section, we prove Theorem 1.9. We follow Davenport [4, §17] and Goldston [10] with appropriate modifications.

We assume first that \(x\not \in \mathbb {Z}\). We choose \(T\in (z-2,z)\) according to Lemma 2.5, and we also fix \(c:=1+1/\log x\). We record the following approximation to the characteristic function of \((1,\infty )\):

$$\begin{aligned} \textbf{1}_{y>1}=\frac{1}{2\pi i}\int _{c-iT}^{c+iT}\frac{y^s}{s}\,ds+ O^*\left( y^c\min \left( 0.501,\frac{1}{\pi T|\log y|}\right) \right) ,\qquad y\in (0,1)\cup (1,\infty ). \end{aligned}$$

This formula follows by making explicit the calculation on [4, pp. 105–106]. The constant 0.501 follows by observing that the line \(\Re s=c\) divides the circle \(|s|=|c+iT|\) into two almost equal arcs, each of length less than \(1.001\pi |c+iT|\). The constant \(1/\pi \) arises as twice the absolute value of the leading factor \(1/(2\pi i)\). Applying the formula for \(y=x/n\), multiplying by \(\Lambda (n)\), and summing over \(n\geqslant 1\), we get

$$\begin{aligned} \psi (x)= \frac{1}{2\pi i}\int _{c-iT}^{c+iT} \left( -\frac{\zeta '(s)}{\zeta (s)}\right) \frac{x^s}{s}\, ds +O^*\left( \sum _{n=1}^\infty \left( \frac{x}{n}\right) ^c\Lambda (n) \min \left( 0.501,\frac{1}{\pi T\left| \log \frac{x}{n}\right| }\right) \right) . \end{aligned}$$
(19)
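The constant 0.501 in the approximation to \(\textbf{1}_{y>1}\) can be traced back to elementary geometry. The sketch below is an illustration under assumptions of ours: the sample values \(x=T=10^{18}\) are chosen only to reflect the size of the parameters, and the function name is hypothetical. The chord \(\Re s=c\) meets the circle \(|s|=|c+iT|\) at \(c\pm iT\), and the longer of the two arcs subtends the angle \(\pi +2\arctan (c/T)\) at the origin.

```python
import math


def longer_arc_ratio(c, T):
    """Length of the longer arc of |s| = |c+iT| cut off by Re s = c,
    divided by pi*|c+iT|."""
    return (math.pi + 2 * math.atan(c / T)) / math.pi


# ASSUMPTION (ours): sample parameters of the size occurring in the proof.
x = 1e18
c = 1 + 1 / math.log(x)
T = 1e18

ratio = longer_arc_ratio(c, T)

# On the relevant arc |y^s/s| <= y^c/|c+iT|, so the arc integral divided by
# 2*pi is at most (ratio/2) * y^c.
perron_const = ratio / 2

# Smallest T/c for which the arc length stays below 1.001*pi*|c+iT|.
min_T_over_c = 1 / math.tan(0.0005 * math.pi)

print(ratio, perron_const, min_T_over_c)
```

The bound on the arc length already holds once \(T/c\) exceeds a few hundred, so it is far from tight in the present range.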

We shall abbreviate the integrand in (19) by f(s), and estimate the error term by cutting the n-sum into four parts. Throughout, we keep in mind that \(x/T<x/(x-2)\), which holds because \(T>z-2\) and \(z>x\). As a preparation, we record the elementary inequalities

$$\begin{aligned} \left( \frac{x}{n}\right) ^c\Lambda (n)\leqslant & {} \left( \frac{x}{n}\right) ^c\log n =\frac{x}{n}\cdot \frac{e\log n}{e^{\log n/\log x}}\leqslant \frac{x}{n}\log x, \end{aligned}$$
(20)
$$\begin{aligned} \left| \log \frac{x}{y}\right| =\int _{\min (x,y)}^{\max (x,y)}\frac{du}{u} \geqslant \frac{|x-y|}{\max (x,y)},\qquad y>0. \end{aligned}$$
(21)
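Both elementary inequalities are easy to test numerically. The sketch below uses a small sample value of \(x\) chosen by us for illustration, and replaces \(\Lambda (n)\) by its upper bound \(\log n\), as in (20). Since (20) becomes an equality in the limit \(n\rightarrow x\), the comparison allows a tiny relative tolerance for floating-point rounding; the helper names are hypothetical.

```python
import math


def holds_20(x, n, rel_tol=1e-12):
    """Check (x/n)^c * log n <= (x/n) * log x with c = 1 + 1/log x.

    log n is used as an upper bound for Lambda(n)."""
    c = 1 + 1 / math.log(x)
    lhs = (x / n) ** c * math.log(n)
    rhs = (x / n) * math.log(x)
    return lhs <= rhs * (1 + rel_tol)


def holds_21(x, y):
    """Check |log(x/y)| >= |x-y| / max(x,y) for y > 0."""
    return abs(math.log(x / y)) >= abs(x - y) / max(x, y)


# ASSUMPTION (ours): a small non-integer sample value of x.
x = 1e6 + 0.5
ns = list(range(2, 5000)) + [10**6, 10**6 + 1, 10**9, 10**12]
ys = [0.5, 1.0, 10.0, x / 2, x - 1, x + 1, 2 * x, 1e12]

print(all(holds_20(x, n) for n in ns), all(holds_21(x, y) for y in ys))
```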

We also observe that the function

$$\begin{aligned} y\mapsto \frac{x-y}{y\log \frac{x}{y}},\qquad y>0, \end{aligned}$$

is positive and decreasing (the function has a removable discontinuity at \(y=x\)). Indeed, writing \(v:=\log \frac{x}{y}\), the claim is that \(v\mapsto (e^v-1)/v\) is positive and increasing on \(\mathbb {R}\), which in turn follows from the fact that the exponential function is increasing and convex.
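The monotonicity claim can be illustrated on a grid. This is a minimal numerical sketch, with the function name `phi` and the grid chosen by us; it does not replace the convexity argument above.

```python
import math


def phi(v):
    """(e^v - 1)/v, extended by its limit 1 at v = 0."""
    return 1.0 if v == 0 else math.expm1(v) / v


# Grid on [-20, 20] with step 0.1; v = log(x/y) runs over all reals as y > 0 varies.
grid = [k / 10 for k in range(-200, 201)]
values = [phi(v) for v in grid]

positive = all(val > 0 for val in values)
increasing = all(lo < hi for lo, hi in zip(values, values[1:]))

print(positive, increasing)
```

Because \(v=\log \frac{x}{y}\) decreases as \(y\) increases, an increasing `phi` corresponds to a decreasing function of \(y\).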

First we consider the n’s satisfying \(1\leqslant |x-n|\leqslant \log x\). By (21) and the subsequent observation, in this range we have

$$\begin{aligned} 0<\frac{x-n}{n\log \frac{x}{n}}\leqslant \frac{\log x}{(x-\log x)\log \frac{x}{x-\log x}}\leqslant \frac{x}{x-\log x}<1.001, \end{aligned}$$

hence by (20) also

$$\begin{aligned} \left( \frac{x}{n}\right) ^c\frac{\Lambda (n)}{\left| \log \frac{x}{n}\right| } \leqslant \frac{x\log x}{n\left| \log \frac{x}{n}\right| } < 1.001\frac{x\log x}{|x-n|}. \end{aligned}$$
(22)

So the corresponding n-subsum within (19) is at most

$$\begin{aligned} 1.001\frac{x\log x}{\pi T}\sum _{\begin{array}{c} 1\leqslant |x-n|\leqslant \log x\\ \Lambda (n)\ne 0 \end{array}}\frac{1}{|x-n|}<0.638\log x\sum _{1\leqslant k\leqslant \log x}\frac{1}{k}<A(x), \end{aligned}$$
(23)

where

$$\begin{aligned} A(x):=0.638\log x\cdot (\log \log x+3/5). \end{aligned}$$
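The two numerical ingredients of (23) can be checked as follows. This sketch reflects our reading of the step: every integer \(k\geqslant 1\) arises as \(\lfloor |x-n|\rfloor \) for at most two values of n (one on each side of x), which accounts for the factor 2, and the harmonic sum up to \(m=\lfloor \log x\rfloor \geqslant 41\) is compared with \(\log m+3/5\leqslant \log \log x+3/5\). The constant `X_OVER_T` is a crude upper bound of ours for \(x/T\) when \(x>10^{18}\).

```python
import math

# ASSUMPTION (ours): x/T < x/(x-2) < 1 + 1e-15 for x > 10^18.
X_OVER_T = 1 + 1e-15

# Prefactor of the k-sum: 1.001/pi from (22), times 2 for the two sides of x.
prefactor = 2 * 1.001 / math.pi * X_OVER_T


def harmonic(m):
    """Partial sum 1 + 1/2 + ... + 1/m."""
    return math.fsum(1 / k for k in range(1, m + 1))


def harmonic_ok(m):
    """Check H_m < log m + 3/5; for m = floor(log x) this implies
    H_m < log log x + 3/5."""
    return harmonic(m) < math.log(m) + 0.6


m_min = math.floor(math.log(1e18))  # smallest relevant m

print(prefactor, m_min, all(harmonic_ok(m) for m in range(m_min, 2000)))
```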

Second, we consider the n’s satisfying \(\log x<|x-n|\leqslant x/5\). In this range we have the following variant of (22) proved in the same way:

$$\begin{aligned} \left( \frac{x}{n}\right) ^c\frac{\Lambda (n)}{\left| \log \frac{x}{n}\right| } \leqslant \frac{x\log x}{n\left| \log \frac{x}{n}\right| }\leqslant \frac{1}{4\log \frac{5}{4}}\cdot \frac{x\log x}{|x-n|}. \end{aligned}$$

So the corresponding n-subsum within (19) is at most

$$\begin{aligned} \frac{1}{4\log \frac{5}{4}}\cdot \frac{x\log x}{\pi T}\sum _{\begin{array}{c} \log x<|x-n|\leqslant x\\ \Lambda (n)\ne 0 \end{array}}\frac{1}{|x-n|}, \end{aligned}$$
(24)

where we relaxed the summation condition for convenience. If p(u) denotes the number of prime powers in \([x-u,x+u]\), then the last sum can be written as

$$\begin{aligned} \sum _{\begin{array}{c} \log x<|x-n|\leqslant x\\ \Lambda (n)\ne 0 \end{array}}\frac{1}{|x-n|}=\int _{\log x}^{x}\frac{dp(u)}{u} =\left[ \frac{p(u)}{u}\right] _{\log x}^{x}+\int _{\log x}^{x}\frac{p(u)}{u^2}\,du. \end{aligned}$$
(25)

Since \(0\leqslant p(u)\leqslant 2u+1\), the first term on the right-hand side is less than 2.001. By the Brun–Titchmarsh inequality in the form given by Montgomery and Vaughan [19, Th. 2], we also see that

$$\begin{aligned} p(u)<\frac{4u}{\log u}+\log _2(x)\left( 1+\frac{2u}{\sqrt{2x}}\right) ,\qquad u\in [2,x]. \end{aligned}$$

Indeed, on the right-hand side, the first term upper bounds the number of primes in \([x-u,x+u]\), while the second term upper bounds the number of higher prime powers in \([x-u,x+u]\). In particular,

$$\begin{aligned} \int _{\log x}^{x}\frac{p(u)}{u^2}\,du< \int _{\log x}^{x}\left( \frac{4.001}{u\log u}+\frac{\log _2 x}{u^2}\right) \,du <4.001\log \log x-3.818. \end{aligned}$$
(26)

We infer from (24)–(26) that the n’s satisfying \(\log x<|x-n|\leqslant x/5\) contribute to (19) less than

$$\begin{aligned} B(x):=1.427\log x\cdot (\log \log x-2/5). \end{aligned}$$
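The constants in (26) and in the definition of \(B(x)\) can be verified numerically. The sketch below works with \(L=\log x\) as the variable, so that very large x cause no overflow; the function names and the bound `X_OVER_T` for \(x/T\) are ours. It uses \(\int _{\log x}^{x}\frac{du}{u\log u}=\log \log x-\log \log \log x\) and \(\int _{\log x}^{x}\frac{\log _2 x}{u^2}\,du<\frac{1}{\log 2}\).

```python
import math

L0 = math.log(1e18)      # log x at x = 10^18
X_OVER_T = 1 + 1e-15     # ASSUMPTION (ours): crude upper bound for x/T


def const_26(L):
    """Constant term in the bound for the integral in (26): the integral is
    less than 4.001*log(L) + const_26(L), where L = log x."""
    return -4.001 * math.log(math.log(L)) + 1 / math.log(2)


# Prefactor of the sum in (24), divided by log x.
k = 1 / (4 * math.log(5 / 4)) / math.pi * X_OVER_T


def b_derived(L):
    """Bound from (24)-(26), divided by log x."""
    return k * (2.001 + 4.001 * math.log(L) - 3.818)


def b_claimed(L):
    """B(x)/log x as defined in the text."""
    return 1.427 * (math.log(L) - 2 / 5)


grid = [L0 * 1.5 ** j for j in range(40)]
print(const_26(L0), all(b_derived(L) < b_claimed(L) for L in grid))
```

The value of `const_26` at \(x=10^{18}\) is only slightly below \(-3.818\), so this constant is essentially optimal at the left endpoint of the range.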

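Third, we consider the n’s satisfying \(|x-n|>x/5\). In this range we have \(|\log (x/n)|>\log (6/5)\) (this is immediate for \(n>6x/5\), while for \(n<4x/5\) we even have \(\log (x/n)>\log (5/4)\)), hence by Lemma 2.3 the corresponding n-subsum is at most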

$$\begin{aligned} \sum _{|x-n|>x/5}\Lambda (n)\left( \frac{x}{n}\right) ^c \frac{1}{\pi T\log (6/5)}< \frac{ex}{\pi T\log (6/5)}\left| \frac{\zeta '(c)}{\zeta (c)}\right| < 4.746\log x. \end{aligned}$$

Finally, there are two n’s satisfying \(|x-n|<1\) (namely \(\lfloor x\rfloor \) and \(\lceil x\rceil \), since \(x\not \in \mathbb {Z}\)), and by (20) their contribution to (19) is less than

$$\begin{aligned} 0.501\left( \frac{x}{x-1}+\frac{x}{x}\right) \log x < 1.003 \log x. \end{aligned}$$

Collecting everything, we arrive at (with room to spare)

$$\begin{aligned} \psi (x) = \frac{1}{2\pi i}\int _{c-iT}^{c+iT} f +O^*(A(x)+B(x)+6\log x). \end{aligned}$$
(27)

We note here that \(A(x)+B(x)\) is less than \(2.1\log x\log \log x\).
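The bookkeeping behind (27) and the remark on \(A(x)+B(x)\) can be re-checked as follows. This is a sketch with names of ours; it assumes that the bound on \(|\zeta '(c)/\zeta (c)|\) used in the third range is the standard one, \(1/(c-1)=\log x\), and it uses the crude bound `X_RATIO` for both \(x/T\) and \(x/(x-1)\) when \(x>10^{18}\).

```python
import math

X_RATIO = 1 + 1e-15  # ASSUMPTION (ours): bounds x/T and x/(x-1) for x > 10^18

# Third range: e*x/(pi*T*log(6/5)) * |zeta'/zeta(c)|, with |zeta'/zeta(c)| < log x.
third_range = math.e / (math.pi * math.log(6 / 5)) * X_RATIO

# Fourth range: 0.501 * (x/(x-1) + 1) * log x.
fourth_range = 0.501 * (X_RATIO + 1)


def a_plus_b_over_L(LL):
    """(A(x) + B(x)) / log x as a function of LL = log log x."""
    return 0.638 * (LL + 3 / 5) + 1.427 * (LL - 2 / 5)


LL0 = math.log(math.log(1e18))
print(third_range, fourth_range, a_plus_b_over_L(LL0) / LL0)
```

The two ranges together contribute less than \(6\log x\), in line with the phrase "with room to spare".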

On the other hand, the residue theorem combined with (8) and Lemma 2.4 shows that

$$\begin{aligned}&\frac{1}{2\pi i}\left( \int _{c-iT}^{c+iT}f+\int _{c+iT}^{-\infty +iT}f+\int _{-\infty -iT}^{c-iT}f\right) \\&\quad =x-\sum _{|\Im \rho |<T}\frac{x^\rho }{\rho }-\log (2\pi )-\frac{1}{2}\log (1-x^{-2}), \end{aligned}$$

where each integral is over a directed line segment or half-line. We estimate the second integral with the help of (8) and Lemma 2.4:

$$\begin{aligned} \left| \int _{c+iT}^{-\infty +iT}f\right|<\frac{\log ^2 z + 20\log z}{z-2}\int _{-\infty }^{c}x^\sigma \,d\sigma <(\log x+20)\frac{ex}{x-2}. \end{aligned}$$

The third integral obeys the same bound, hence we infer that

$$\begin{aligned} \frac{1}{2\pi i}\int _{c-iT}^{c+iT}f=x-\sum _{|\Im \rho |<T}\frac{x^\rho }{\rho }+O^*(\log x+20). \end{aligned}$$
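The error term \(O^*(\log x+20)\) absorbs both horizontal integrals and the two constant terms of the residue formula. The sketch below, with names of ours and \(L=\log x\) as the variable, makes this explicit: the two integrals contribute at most \(\frac{e}{\pi }\cdot \frac{x}{x-2}(\log x+20)\) after division by \(2\pi \), and \(\left| \log (2\pi )+\frac{1}{2}\log (1-x^{-2})\right| <\log (2\pi )\).

```python
import math

L0 = math.log(1e18)   # log x at x = 10^18
X_RATIO = 1 + 1e-15   # ASSUMPTION (ours): crude upper bound for x/(x-2)


def contour_error(L):
    """Upper bound for the horizontal integrals (divided by 2*pi) plus the
    constant terms of the residue formula, with L = log x."""
    two_integrals = 2 * (L + 20) * math.e * X_RATIO / (2 * math.pi)
    constants = math.log(2 * math.pi)
    return two_integrals + constants


grid = [L0 * 1.5 ** j for j in range(40)]
print(contour_error(L0), L0 + 20)
```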

By Lemma 2.6, we can extend the \(\rho \)-sum to \(|\Im \rho |<z\) at the cost of an error of \(O^*(2\log x)\). Therefore, going back to (27), we conclude for \(x\not \in \mathbb {Z}\) that

$$\begin{aligned} \psi (x)=x-\sum _{|\Im \rho |<z}\frac{x^\rho }{\rho }+O^*(A(x)+B(x)+9\log x+20). \end{aligned}$$

Finally, if x is an integer, then we make use of the following simple observation. For a fixed \(z>10^{18}\), the \(\rho \)-sum on the right-hand side is continuous in \(x\in (10^{18},z)\), while the left-hand side equals \(\psi (x-)+\Lambda (x)\). Since \(0\leqslant \Lambda (x)\leqslant \log x\), letting non-integer arguments tend to x from the left shows that the previous formula is valid at x with an extra error term of \(O^*(\log x)\). We finish the proof of Theorem 1.9 by noting that

$$\begin{aligned} A(x)+B(x)+10\log x+20<5\log x\log \log x,\qquad x>10^{18}. \end{aligned}$$
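The final inequality can be checked numerically. The sketch below, with function names of ours, again uses \(L=\log x\) as the variable and tests the inequality at \(x=10^{18}\) and on a grid of larger values. Beyond the grid, note that the difference of the two sides equals \(L(2.935\log L-9.812)-20\), which is increasing in L once the bracket is positive.

```python
import math


def A(L):
    """A(x) with L = log x."""
    return 0.638 * L * (math.log(L) + 3 / 5)


def B(L):
    """B(x) with L = log x."""
    return 1.427 * L * (math.log(L) - 2 / 5)


def total_error(L):
    """A(x) + B(x) + 10 log x + 20."""
    return A(L) + B(L) + 10 * L + 20


def target(L):
    """5 log x log log x."""
    return 5 * L * math.log(L)


L0 = math.log(1e18)
grid = [L0 * 1.5 ** j for j in range(40)]

print(total_error(L0), target(L0))
print(all(total_error(L) < target(L) for L in grid))
```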