1 Introduction

The problem of determining the cover time of a graph is a central one in combinatorics and probability [3,4,5, 18, 19, 21, 26]. In recent years, the cover time of random graphs has been extensively studied [1, 14, 16, 17, 20]. All these works consider undirected graphs, with the notable exception of the paper [17] by Cooper and Frieze, where the authors compute the cover time of directed Erdős-Rényi random graphs in the regime of strong connectivity, that is, with a logarithmically diverging average degree. The main difficulty in the directed case is that, in contrast with the undirected case, the graph’s stationary distribution is an unknown random variable.

In this paper we address the problem of determining the cover time of sparse random digraphs with bounded degrees. More specifically, we consider random digraphs G with given in- and out-degree sequences, generated via the configuration model. For the sake of this introductory discussion let us look at the special case where all vertices have either in-degree 2 and out-degree 3, or in-degree 3 and out-degree 2, with the two types evenly represented in the vertex set V(G). We refer to this as the (2, 3)(3, 2) case. With high probability G is strongly connected and we may ask how long the random walk on G takes to cover all the nodes. The expectation of this quantity, maximized over the initial point of the walk, defines \(T_\mathrm{cov}(G)\), the cover time of G. We will show that with high probability, as the number of vertices n tends to infinity, one has

$$\begin{aligned} T_\mathrm{cov}(G)\asymp n \log ^\gamma (n) \end{aligned}$$
(1.1)

where \(\gamma = \frac{\log 3}{\log 2}\approx 1.58\), and \(a_n\asymp b_n\) stands for \(C^{-1}\le a_n/b_n \le C\) for some constant \(C>0\). The constant \(\gamma \) can be understood in connection with the statistics of the extremal values of the stationary distribution \(\pi \) of G. Indeed, following the theory developed by Cooper and Frieze, if the graphs satisfy suitable requirements, then the problem of determining the cover time can be reformulated in terms of the control of the minimal values of \(\pi \). In particular, we will see that the hitting time of a vertex \(x\in V(G)\) effectively behaves as an exponential random variable with parameter \(\pi (x)\), and that to some extent these random variables are weakly dependent. This supports the heuristic picture that represents the cover time as the expected value of the maximum of n independent exponential random variables, with parameters \(\pi (x)\), \(x\in V(G)\). Controlling the stationary distribution is however a rather challenging task, especially if the digraphs have bounded degrees.
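To illustrate this heuristic numerically (nothing below enters the proofs, and the choice of \(\pi \) is a toy example), one may compare the cover time with the expected maximum of independent exponentials:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy illustration of the heuristic: if the hitting times behaved like
# independent exponentials with parameters pi(x), the cover time would be
# comparable to the expected maximum of these exponentials.
n = 10_000
pi = rng.dirichlet(np.ones(n))                 # toy stationary distribution
samples = rng.exponential(1.0 / pi, size=(200, n))
heuristic_cover_time = samples.max(axis=1).mean()
print(heuristic_cover_time)                    # dominated by the smallest values of pi
```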

Recently, Bordenave, Caputo and Salez [9] analyzed the mixing time of sparse random digraphs with given degree sequences and their work provides some important information on the distribution of the values of \(\pi \). In particular, in the (2, 3)(3, 2) case, the empirical distribution of the values \(\{n\pi (x),\, x\in V(G)\}\) converges as \(n\rightarrow \infty \) to the probability law \(\mu \) on \([0,\infty )\) of the random variable X given by

$$\begin{aligned} X=\tfrac{2}{5}\sum _{k=1}^N Z_k\,, \end{aligned}$$
(1.2)

where N is the random variable with \(N=2\) with probability \(\frac{1}{2}\) and \(N=3\) with probability \(\frac{1}{2}\), and the \(Z_k\) are independent and identically distributed mean-one random variables uniquely determined by the recursive distributional equation

$$\begin{aligned} Z_1 {\mathop {=}\limits ^{d}} \tfrac{1}{M}\sum _{k=1}^{5-M}Z_k, \end{aligned}$$
(1.3)

where M is the random variable with \(M=2\) with probability 2/5 and \(M=3\) with probability 3/5, independent of the \(Z_k\)’s, and \({\mathop {=}\limits ^{d}}\) denotes equality in distribution.
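A quick way to get a feeling for the law \(\mu \) is to iterate (1.3) by population dynamics and then form X as in (1.2). The following sketch does this in the (2, 3)(3, 2) case; the population size and the number of iterations are arbitrary illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(0)
POP = 100_000                     # population size (illustrative choice)
Z = np.ones(POP)                  # start from the mean-one constant

# Iterate the recursive distributional equation (1.3):
# Z =_d (1/M) * sum_{k=1}^{5-M} Z_k, with P(M=2)=2/5, P(M=3)=3/5.
for _ in range(50):
    M = rng.choice([2, 3], size=POP, p=[2 / 5, 3 / 5])
    samples = Z[rng.integers(0, POP, size=(POP, 3))]
    samples[M == 3, 2] = 0.0      # when M=3 only 5-M=2 summands appear
    Z = samples.sum(axis=1) / M

# Form X as in (1.2): N is the in-degree of a uniform vertex, P(N=2)=P(N=3)=1/2.
N = rng.choice([2, 3], size=POP, p=[1 / 2, 1 / 2])
S = Z[rng.integers(0, POP, size=(POP, 3))]
S[N == 2, 2] = 0.0
X = (2 / 5) * S.sum(axis=1)
print(X.mean())                   # close to 1, since E[X] = 1
```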

This gives convergence of the distribution of the bulk values of \(\pi \), that is of the values of \(\pi \) on the scale 1/n. What enters in the cover time analysis are however the extremal values, notably the minimal ones, and thus what is needed is a local convergence result towards the left tail of \(\mu \), which cannot be extracted from the analysis in [9]. To obtain a heuristic guess of the size of the minimal values of \(\pi \) at large but finite n one may pretend that the values of \(n\pi \) are n i.i.d. samples from \(\mu \). This would imply that \(\pi _\mathrm{min}\), the minimal value of \(\pi \), is such that \(n\pi _\mathrm{min}\sim \varepsilon (n)\), where \(\varepsilon (n)\) is a sequence for which \(n\mu ([0,\varepsilon (n)])\sim 1\), and \( \mu ([0,x])\) denotes the mass given by \(\mu \) to the interval [0, x].

Recursive distributional equations of the form (1.3) have been extensively studied, and much is known about the law of random variables satisfying this type of stability and self-similarity; see, e.g., [6, 23,24,25, 27] and references therein. In particular, the left and right tail asymptotics of the law \(\mu \) defined by (1.2) and (1.3) can be obtained as special cases of these results. For instance, it is shown in Liu [25, Theorem 2.5] that the left tail of \(\mu \) must have the form

$$\begin{aligned} \log \mu ([0,x])\asymp - x^{-\alpha }\,,\qquad x\rightarrow 0^+, \end{aligned}$$

where \(\alpha =1/(\gamma -1)\), with the coefficient \(\gamma \) taking the value \(\gamma = \frac{\log 3}{\log 2}\) in the (2, 3)(3, 2) case. Thus, returning to our heuristic reasoning, one has that the minimal value of \(\pi \) should satisfy

$$\begin{aligned} n\pi _\mathrm{min}\asymp \log ^{1-\gamma }(n). \end{aligned}$$
(1.4)
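In more detail, the heuristic computation behind (1.4) is the following: inserting the tail behaviour \(\log \mu ([0,x])\asymp - x^{-\alpha }\) into the requirement \(n\,\mu ([0,\varepsilon (n)])\sim 1\) gives

$$\begin{aligned} \varepsilon (n)^{-\alpha }\asymp \log n \quad \Longrightarrow \quad \varepsilon (n)\asymp \log ^{-1/\alpha }(n)=\log ^{1-\gamma }(n), \end{aligned}$$

since \(1/\alpha =\gamma -1\), which is (1.4).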

Moreover, this argument also suggests that with high probability there should be at least \(n^\beta \) vertices \(x\in V(G)\), for some constant \(\beta >0\), such that \(n\pi (x)\) is as small as \(O(\log ^{1-\gamma }( n))\).

A similar heuristic argument, this time based on the analysis of the right tail of \(\mu \), see [23, 24] (in particular, see the first display in [24, p. 271]), predicts that \(\pi _\mathrm{max}\), the maximal value of \(\pi \), should satisfy

$$\begin{aligned} n\pi _\mathrm{max}\asymp \log ^{1-\kappa }(n), \end{aligned}$$
(1.5)

where \(\kappa \) takes the value \(\kappa = \frac{\log 2}{\log 3}\approx 0.63\) in the (2, 3)(3, 2) case.

Our main results below will confirm these heuristic predictions. The proof involves the analysis of the statistics of the in-neighbourhoods of a node. Roughly speaking, it will be seen that the smallest values of \(\pi \) are achieved at vertices \(x\in V(G)\) whose in-neighbourhood at distance \(\log _2\log n\) is a directed tree composed entirely of vertices with in-degree 2 and out-degree 3, while the maximal values of \(\pi \) are achieved at \(x\in V(G)\) whose in-neighbourhood at distance \(\log _3\log n\) is a directed tree composed entirely of vertices with in-degree 3 and out-degree 2. Once the results (1.4) and (1.5) are established, the cover time asymptotic (1.1) will follow from an appropriate implementation of the Cooper-Frieze approach.
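A back-of-the-envelope version of this statement, in the (2, 3)(3, 2) case, reads as follows (this is only a heuristic; the actual argument occupies Section 3). If the in-neighbourhood of x up to depth \(h=\log _2\log n\) is a directed tree whose vertices all have in-degree 2 and out-degree 3, then it has roughly \(2^h=\log n\) vertices at depth h, each contributing a path of weight \(3^{-h}\) to the local approximation of \(\pi (x)\) discussed in Section 3, so that

$$\begin{aligned} n\pi (x)\asymp \frac{n}{m}\,2^{h}\,3^{-h}=\frac{n}{m}\Big (\frac{2}{3}\Big )^{\log _2\log n}\asymp \log ^{1-\gamma }(n), \end{aligned}$$

in agreement with (1.4); the analogous computation with in-degree 3, out-degree 2 and \(h=\log _3\log n\) gives (1.5).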

We conclude this preliminary discussion by comparing our estimates (1.4) and (1.5) with related results for different random graph models. The asymptotic of extremal values of \(\pi \) has been determined in [17] for directed Erdős-Rényi random graphs with logarithmically diverging average degree. There, the authors show that \(n\pi _\mathrm{min}\) and \(n\pi _\mathrm{max}\) are essentially of order 1, which can be interpreted as a concentration property enforced by the divergence of the degrees. On the other hand, for uniformly random out-regular digraphs, that is with constant out-degrees but random in-degrees, the recent paper [2] shows that the stationary distribution restricted to the strongly connected component satisfies \(n\pi _\mathrm{min}=n^{-\eta +o(1)}\), where \(\eta \) is a computable constant, and \(n\pi _\mathrm{max}=n^{o(1)}\). Indeed, in this model, in contrast with our setting, one can have in-neighborhoods made of long and thin filaments, which determine a power-law deviation from uniformity.

We now turn to a more systematic exposition of our results.

1.1 Model and statement of results

Set \([n]=\{1,\dots ,n\}\), and for each integer n, fix two sequences \(\mathbf{d}^+=(d_x^+)_{x\in [n]}\) and \(\mathbf{d}^-=(d_x^-)_{x\in [n]}\) of positive integers such that

$$\begin{aligned} m=\sum _{x=1}^nd_x^+=\sum _{x=1}^nd_x^-. \end{aligned}$$
(1.6)

The directed configuration model DCM(\(\mathbf{d}^\pm \)) is the distribution of the random digraph G with vertex set \(V(G)=[n]\) obtained by the following procedure: 1) equip each node x with \(d_x^+\) tails and \(d_x^-\) heads; 2) pick uniformly at random one of the m! bijective maps from the set of all tails into the set of all heads, call it \(\omega \); 3) for all \(x,y\in [n]\), add a directed edge (x, y) every time a tail from x is mapped into a head from y through \(\omega \). The resulting digraph G may have self-loops and multiple edges; however, it is classical that, by conditioning on the event that there are no multiple edges and no self-loops, G has the uniform distribution among simple digraphs with in-degree sequence \(\mathbf{d}^-\) and out-degree sequence \(\mathbf{d}^+\).
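The procedure above is straightforward to implement. Here is a minimal sketch (the function and variable names are ours, chosen for illustration only):

```python
import random
from collections import Counter

def sample_dcm(d_minus, d_plus, seed=None):
    """Sample DCM(d^±): equip vertex x with d_plus[x] tails and d_minus[x] heads,
    and match tails to heads through a uniform bijection. Returns a Counter of
    directed edges (x, y) with multiplicities; self-loops and multiple edges
    are allowed, as in the text."""
    assert sum(d_minus) == sum(d_plus), "condition (1.6) must hold"
    rng = random.Random(seed)
    tails = [x for x, d in enumerate(d_plus) for _ in range(d)]
    heads = [y for y, d in enumerate(d_minus) for _ in range(d)]
    rng.shuffle(tails)                 # a uniform shuffle realizes the bijection omega
    return Counter(zip(tails, heads))

# (2,3)(3,2) example on n = 6 vertices: half of type (2,3), half of type (3,2).
d_minus = [2, 2, 2, 3, 3, 3]
d_plus  = [3, 3, 3, 2, 2, 2]
G = sample_dcm(d_minus, d_plus, seed=1)
```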

Structural properties of digraphs obtained in this way have been studied in [13]. Here we consider the sparse case corresponding to bounded degree sequences and, in order to avoid issues with irreducibility, we shall assume that all degrees are at least 2. Thus, from now on it will always be assumed that

$$\begin{aligned} \delta _\pm =\min _{x\in [n]}d_x^\pm \ge 2\,,\quad \Delta _\pm =\max _{x\in [n]}d_x^\pm =O(1). \end{aligned}$$
(1.7)

Under the first assumption it is known that DCM(\(\mathbf{d}^\pm \)) is strongly connected with high probability. Under the second assumption, it is known that DCM(\(\mathbf{d}^\pm \)) has a uniformly (in n) positive probability of having no self-loops nor multiple edges. In particular, any property that holds with high probability for DCM(\(\mathbf{d}^\pm \)) will also hold with high probability for a uniformly random simple digraph with degrees given by \(\mathbf{d}^-\) and \(\mathbf{d}^+\) respectively. Here and throughout the rest of the paper we say that a property holds with high probability (w.h.p. for short) if the probability of the corresponding event converges to 1 as \(n\rightarrow \infty \).

The (directed) distance d(x, y) from x to y is the minimal number of edges that need to be traversed to reach y from x. The diameter is the maximal distance between two distinct vertices, i.e.

$$\begin{aligned} \mathrm{diam}(G)=\max _{x\not =y}d(x,y). \end{aligned}$$
(1.8)

We begin by showing that the diameter \(\mathrm{diam}(G)\) concentrates around the value \(c\log n\) within a \(O(\log \log n)\) window, where c is given by \(c=1/\log \nu \) and \(\nu \) is the parameter defined by

$$\begin{aligned} \nu =\frac{1}{m}\sum _{y=1}^nd_y^{-}d_y^{+}. \end{aligned}$$
(1.9)

Theorem 1.1

Set \(\mathrm{d}_\star =\log _\nu n\). There exists \(\varepsilon _n=O\left( \frac{\log \log (n)}{\log (n)}\right) \) such that

$$\begin{aligned} \mathbb {P}\left( (1-\varepsilon _n)\,\mathrm{d}_\star \le \mathrm{diam}(G)\le (1+\varepsilon _n)\,\mathrm{d}_\star \right) =1-o(1). \end{aligned}$$
(1.10)

Moreover, for any \(x,y\in [n]\)

$$\begin{aligned} \mathbb {P}\left( (1-\varepsilon _n)\,\mathrm{d}_\star \le d(x,y)\le (1+\varepsilon _n)\,\mathrm{d}_\star \right) =1-o(1). \end{aligned}$$
(1.11)

The proof of Theorem 1.1 is a directed version of a classical argument for undirected graphs [8]. It requires controlling the size of in- and out-neighborhoods of a node, which in turn follows ideas from [2] and [9]. The value \(\mathrm{d}_\star =\log _\nu n\) can be interpreted as follows: both the in- and the out-neighborhood of a node are tree-like with average branching given by \(\nu \), so that their boundary at depth h has typically size \(\nu ^h\), see Lemma 2.3; if the in-neighborhood of y and the out-neighborhood of x are exposed up to depth h, then the expected number of head-tail matchings between the two boundaries is of order \(\nu ^{2h}/m\), so that the value \(h=\frac{1}{2}\log _\nu (n)\) is critical for the formation of an arc connecting the two neighborhoods.

In particular, Theorem 1.1 shows that w.h.p. the digraph is strongly connected, so there exists a unique stationary distribution \(\pi \) characterized by the equation

$$\begin{aligned} \pi (x)=\sum _{y=1}^n\pi (y)P(y,x)\,,\qquad x\in [n], \end{aligned}$$
(1.12)

with the normalization \(\sum _{x=1}^n\pi (x)=1\). Here P is the transition matrix of the simple random walk on G, namely

$$\begin{aligned} P(y,x)=\frac{m(y,x)}{d^+_y}, \end{aligned}$$
(1.13)

and we write m(y, x) for the multiplicity of the edge (y, x) in the digraph G. If the sequences \(\mathbf{d}^\pm \) are such that \(d_x^+=d_x^-\) for all \(x\in [n]\), then the stationary distribution is given by

$$\begin{aligned} \pi (x)= \frac{d^\pm _x}{m}. \end{aligned}$$
(1.14)

The digraph is called Eulerian in this case. In all other cases the stationary distribution is a nontrivial random variable. To discuss our results on the extremal values of \(\pi \) it is convenient to introduce the following notation.
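As a sanity check on (1.12)–(1.14), the stationary distribution of a small sample of the model can be computed numerically. The sketch below uses lazy power iteration (the laziness only rules out periodicity and does not change \(\pi \)); the digraph is represented as a dictionary of edge multiplicities, as returned by the sampler sketched above.

```python
import numpy as np

def stationary_distribution(edge_counts, n, tol=1e-12, max_iter=100_000):
    """Solve (1.12) for the transition matrix (1.13), given a dict
    {(x, y): multiplicity} of directed edges on the vertex set 0..n-1.
    Assumes the digraph is strongly connected."""
    P = np.zeros((n, n))
    for (x, y), mult in edge_counts.items():
        P[x, y] += mult
    P /= P.sum(axis=1, keepdims=True)       # row x: P(x, y) = m(x, y)/d_x^+
    P_lazy = 0.5 * (P + np.eye(n))          # same pi, guaranteed aperiodic
    pi = np.full(n, 1.0 / n)
    for _ in range(max_iter):
        new = pi @ P_lazy
        if np.abs(new - pi).sum() < tol:
            break
        pi = new
    return pi
```

In the Eulerian case \(d_x^-=d_x^+\) the output agrees with (1.14) up to numerical error.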

Definition 1.2

We say that a vertex \(x\in [n]\) is of type (i, j), and write \(x\in \mathcal V_{i,j}\), if \((d^-_x,d^+_x)=(i,j)\). We call \(\mathcal C=\mathcal C(\mathbf{d}^\pm )\) the set of all types that are present in the double sequence \(\mathbf{d}^\pm \), that is \(\mathcal C=\{(i,j):\, |\mathcal V_{i,j}|>0\}.\) The assumption (1.7) implies that the number of distinct types is bounded by a fixed constant C independent of n, that is \(|\mathcal C|\le C\). We say that the type (i, j) has linear size if

$$\begin{aligned} \liminf _{n\rightarrow \infty }\frac{|\mathcal V_{i,j}|}{n}>0. \end{aligned}$$
(1.15)

We call \(\mathcal L\subset \mathcal C\) the set of types with linear size, and define the parameters

$$\begin{aligned} \gamma _0:=\frac{\log \Delta _+}{\log \delta _-},\quad \gamma _1:=\max _{(k,\ell )\in \mathcal L}\frac{\log \ell }{\log k}, \quad \kappa _1:=\min _{(k,\ell )\in \mathcal L}\frac{\log \ell }{\log k}, \quad \kappa _0:=\frac{\log \delta _+}{\log \Delta _-}. \end{aligned}$$
(1.16)
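For instance, in the (2, 3)(3, 2) case of the introduction one has \(\mathcal C=\mathcal L=\{(2,3),(3,2)\}\), \(\delta _-=\delta _+=2\) and \(\Delta _-=\Delta _+=3\), so that

$$\begin{aligned} \gamma _0=\gamma _1=\frac{\log 3}{\log 2}\,,\qquad \kappa _0=\kappa _1=\frac{\log 2}{\log 3}. \end{aligned}$$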

Theorem 1.3

Set \(\pi _\mathrm{min}=\min _{x\in [n]}\pi (x)\). There exists a constant \(C>0\) such that

$$\begin{aligned} \mathbb {P}\left( C^{-1}\log ^{1-\gamma _0}(n)\le n\pi _\mathrm{min}\le C\,\log ^{1-\gamma _1}(n) \right) =1-o(1). \end{aligned}$$
(1.17)

Moreover, there exists \(\beta >0\) such that

$$\begin{aligned} \mathbb {P}\Big (\exists S\subset [n],\,|S|\ge n^\beta \,,\; n \max _{y\in S}\pi (y)\le C\log ^{1-\gamma _{1}}(n) \Big )=1-o(1). \end{aligned}$$
(1.18)

Notice that \(\gamma _0\ge \gamma _1\ge 1\). If the sequences \(\mathbf{d}^\pm \) are such that \((\delta _-,\Delta _+)\in \mathcal L\), then \(\gamma _0=\gamma _1=:\gamma \). This immediately implies the following

Corollary 1.4

If the sequences \(\mathbf{d}^\pm \) are such that \((\delta _-,\Delta _+)\in \mathcal L\), then

$$\begin{aligned} \pi _\mathrm{min}\asymp \frac{1}{n}\log ^{1-\gamma }(n)\,\qquad \mathrm{w.h.p.} \end{aligned}$$
(1.19)

Remark 1.5

If \((\delta _-,\Delta _+)\notin \mathcal L\), then the estimate (1.17) can be strengthened by replacing \(\gamma _0\) with \(\gamma '_0\) where

$$\begin{aligned} \gamma '_0:=\frac{\log \Delta '_+}{\log \delta '_-}\,,\qquad \Delta '_+ := \max \{\ell :\; (k,\ell )\in \mathcal L_0\} \,,\quad \delta '_- := \min \{k:\; (k,\ell )\in \mathcal L_0\}, \end{aligned}$$
(1.20)

and \(\mathcal L_0\subset \mathcal C\) is defined as the set of \((k,\ell )\in \mathcal C\) such that

$$\begin{aligned} \limsup _{n\rightarrow \infty }\frac{|\mathcal V_{k,\ell }|}{n^{1-a}}=+\infty \,,\qquad \forall a>0. \end{aligned}$$
(1.21)

We refer to Remark 3.9 below for additional details on this improvement.

Concerning the maximal values of \(\pi \) we establish the following estimates.

Theorem 1.6

Set \(\pi _\mathrm{max}=\max _{x\in [n]}\pi (x)\). There exists a constant \(C>0\) such that

$$\begin{aligned} \mathbb {P}\left( C^{-1}\log ^{1-\kappa _1}(n)\le n\pi _\mathrm{max}\le C\,\log ^{1-\kappa _0}(n) \right) =1-o(1). \end{aligned}$$
(1.22)

Moreover, there exists \(\beta >0\) such that

$$\begin{aligned} \mathbb {P}\Big (\exists S\subset [n],\,|S|\ge n^\beta \,,\; n \min _{y\in S}\pi (y)\ge C^{-1}\log ^{1-\kappa _{1}}(n) \Big )=1-o(1). \end{aligned}$$
(1.23)

Notice that \(\kappa _0\le \kappa _1\le 1\). If the sequences \(\mathbf{d}^\pm \) are such that \((\Delta _-,\delta _+)\in \mathcal L\), then \(\kappa _0=\kappa _1=:\kappa \), and we obtain the following

Corollary 1.7

If \((\Delta _-,\delta _+)\in \mathcal L\), then

$$\begin{aligned} \pi _\mathrm{max}\asymp \frac{1}{n}\log ^{1-\kappa }(n)\,\qquad \mathrm{w.h.p.} \end{aligned}$$
(1.24)

Remark 1.8

In analogy with Remark 1.5, if \((\Delta _-,\delta _+)\notin \mathcal L\), then (1.22) can be improved by replacing \(\kappa _0\) with \(\kappa '_0\), where

$$\begin{aligned} \kappa '_0:=\frac{\log \delta '_+}{\log \Delta '_-}, \quad \delta '_+ := \min \{\ell :\; (k,\ell )\in \mathcal L_0\},\quad \Delta '_- := \max \{k:\; (k,\ell )\in \mathcal L_0\}, \end{aligned}$$
(1.25)

and \(\mathcal L_0\subset \mathcal C\) is the set defined in (1.21).

We turn to a description of our results concerning the cover time. A central role here will be played by both statements in Theorem 1.3. On the other hand the information on \(\pi _\mathrm{max}\) derived in Theorem 1.6 will not be essential, and the statement (1.23) will never be used in our proof.

Let \(X_t\), \(t=0,1,2,\dots \), denote the simple random walk on the digraph G, that is the Markov chain with transition matrix P defined in (1.13). Consider the hitting times

$$\begin{aligned} H_y=\inf \{ t\ge 0:\, X_t=y \},\quad \tau _\mathrm{cov}=\max _{y\in [n]}H_y. \end{aligned}$$
(1.26)

The cover time \(T_\mathrm{cov}=T_\mathrm{cov}(G)\) is defined by

$$\begin{aligned} T_\mathrm{cov}=\max _{x\in [n]}\,\mathbf{E}_x[\tau _\mathrm{cov}], \end{aligned}$$
(1.27)

where \(\mathbf{E}_x\) denotes the expectation with respect to the law of the random walk \((X_t)\) with initial point \(X_0=x\) in a fixed realization of the digraph G. Let \(\gamma _0,\gamma _1\) be as in Definition 1.2.
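To make the definitions (1.26)–(1.27) concrete, here is a naive Monte Carlo sketch of \(T_\mathrm{cov}\) (purely illustrative and feasible only on small instances; the function name and parameters are ours):

```python
import random

def cover_time_estimate(adj, n, trials=20, seed=0):
    """Estimate T_cov in (1.27): for each starting vertex x, average the
    covering time tau_cov of (1.26) over a few independent walks, then take
    the maximum over x. adj[x] lists the out-neighbours of x, with repetitions
    encoding multiple edges."""
    rng = random.Random(seed)
    worst = 0.0
    for x in range(n):
        total = 0
        for _ in range(trials):
            seen, current, t = {x}, x, 0
            while len(seen) < n:
                current = rng.choice(adj[current])
                t += 1
                seen.add(current)
            total += t
        worst = max(worst, total / trials)
    return worst
```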

Theorem 1.9

There exists a constant \(C>0\) such that

$$\begin{aligned} \mathbb {P}\left( C^{-1}n\log ^{\gamma _1}(n)\le T_\mathrm{cov}\le C\,n\log ^{\gamma _0}(n) \right) =1-o(1). \end{aligned}$$
(1.28)

Corollary 1.10

For sequences \(\mathbf{d}^\pm \) such that \((\delta _-,\Delta _+)\in \mathcal L\) one has \(\gamma _0=\gamma _1=\gamma \) and

$$\begin{aligned} T_\mathrm{cov}\asymp n\log ^{\gamma }(n), \quad \mathrm{w.h.p.} \end{aligned}$$
(1.29)

Remark 1.11

As in Remark 1.5, if \((\delta _-,\Delta _+)\notin \mathcal L\), then Theorem 1.9 can be strengthened by replacing \(\gamma _0\) with the constant \(\gamma _0'\) defined in (1.20).

Finally, we observe that when the sequences \(\mathbf{d}^\pm \) are Eulerian, that is \(d_x^+=d_x^-\) for all \(x\in [n]\), the estimates in Theorem 1.9 can be refined considerably, and one obtains results that are at the same level of precision as those already established in the case of random undirected graphs [1].

Theorem 1.12

Suppose \(d_x^-=d_x^+=d_x\) for every \(x\in [n]\). Call \(\mathcal V_{d}\) the set of vertices of degree d, and write \(\bar{d}=m/n\) for the average degree. Assume

$$\begin{aligned} |\mathcal V_{d}|= n^{\alpha _d+o(1)} \end{aligned}$$
(1.30)

for some constants \(\alpha _d\in [0,1]\), for each type d. Then,

$$\begin{aligned} T_\mathrm{cov}=(\beta +o(1)) \,n\log n, \quad \mathrm{w.h.p.} \end{aligned}$$
(1.31)

where \(\beta :=\bar{d}\,\max _{d}\frac{\alpha _d}{d}\).

In particular, if all present types have linear size then \(\alpha _d\in \{0,1\}\) for all d and (1.31) holds with \(\beta =\bar{d}/\delta \), where \(\delta =\delta _\pm \) is the minimum degree. In any case it is not difficult to see that \(\beta \ge 1\), since \(\bar{d}\) is determined only by types with linear size. For some general bounds on cover times of Eulerian graphs we refer to [7].
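For example, if half of the vertices have degree \(d_x=2\) and the other half have degree \(d_x=3\) (so that both types have linear size), then \(\bar{d}=5/2\), \(\delta =2\), and (1.31) reads

$$\begin{aligned} T_\mathrm{cov}=\left( \tfrac{5}{4}+o(1)\right) \,n\log n, \quad \mathrm{w.h.p.} \end{aligned}$$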

The rest of the paper is divided into three sections. The first is a collection of preliminary structural facts about the directed configuration model. It also includes the proof of Theorem 1.1. The second section is the core of the paper. There we establish Theorem 1.3 and Theorem 1.6. The last section contains the proof of the cover time results Theorem 1.9 and Theorem 1.12.

2 Neighborhoods and diameter

We start by recalling some simple facts about the directed configuration model.

2.1 Sequential generation

Each vertex x has \(d^-_x\) labeled heads and \(d^+_x\) labeled tails, and we call \(E_x^-\) and \(E_x^+\) the sets of heads and tails at x respectively. The uniform bijection \(\omega \) between heads \(E^-=\cup _{x\in [n]}E_x^-\) and tails \(E^+=\cup _{x\in [n]}E_x^+\), viewed as a matching, can be sampled by iterating the following steps until there are no unmatched heads left:

  1) pick an unmatched head \(f\in E^-\) according to some priority rule;

  2) pick an unmatched tail \(e\in E^+\) uniformly at random;

  3) match f with e, i.e. set \(\omega (f)=e\), and call ef the resulting edge.

This gives the desired uniform distribution over matchings \(\omega :E^-\mapsto E^+\), regardless of the priority rule chosen at step 1. The digraph G is obtained by adding a directed edge (x, y) whenever \(f\in E_y^-\) and \(e\in E_x^+\) in step 3 above.

2.2 In-neighborhoods and out-neighborhoods

We will use the notation

$$\begin{aligned} \delta =\min \{\delta _-,\delta _+\}, \quad \Delta =\max \{\Delta _-,\Delta _+\}. \end{aligned}$$
(2.1)

For any \(h\in {\mathbb N} \), the h-in-neighborhood of a vertex y, denoted \(\mathcal B^-_{h}(y)\), is the digraph defined as the union of all directed paths of length \(\ell \le h\) in G which terminate at vertex y. In the sequel a path is always understood as a sequence of directed edges \((e_1f_1,\dots ,e_kf_k)\) such that \(v_{f_i}=v_{e_{i+1}}\) for all \(i=1,\dots ,k-1\), and we use the notation \(v_e\) (resp. \(v_f\)) for the vertex x such that \(e\in E_x^+\) (resp. \(f\in E_x^-\)).

To generate the random variable \(\mathcal B^-_{h}(y)\), we use the following breadth-first procedure. Start at vertex y and run the sequence of steps described above, by giving priority to those unmatched heads which have minimal distance to vertex y, until this minimal distance exceeds h, at which point the process stops. Similarly, for any \(h\in {\mathbb N} \), the h-out-neighborhood of a vertex x, denoted \(\mathcal B^+_{h}(x)\) is defined as the subgraph induced by the set of directed paths of length \(\ell \le h\) which start at vertex x. To generate the random variable \(\mathcal B^+_{h}(x)\), we use the same breadth-first procedure described above except that we invert the role of heads and tails. With slight abuse of notation we sometimes write \(\mathcal B^\pm _{h}(x)\) for the vertex set of \(\mathcal B^\pm _{h}(x)\). We also warn the reader that to simplify the notation we often avoid taking explicitly the integer part of the various parameters entering our proofs. In particular, whenever we write \(\mathcal B_h^\pm (x)\) it is always understood that \(h\in {\mathbb N} \).
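The breadth-first generation just described can be sketched in a few lines. The following simplified implementation (our own naming, with no attempt to classify collisions as in the proofs below) matches heads level by level and only counts how many collisions occur; the out-neighborhood case is obtained by exchanging the roles of heads and tails.

```python
import random

def generate_in_neighborhood(d_minus, d_plus, y, h, seed=None):
    """Breadth-first sequential generation of B^-_h(y): heads at distance < h
    from y are matched, in order of increasing distance, to a tail chosen
    uniformly among the unmatched tails. Returns the exposed directed edges
    (x, z) and the number of collisions. Assumes h is small enough that
    unmatched tails remain throughout (as with h <= hbar in the text)."""
    rng = random.Random(seed)
    n = len(d_minus)
    tails = [x for x in range(n) for _ in range(d_plus[x])]   # unmatched tails
    exposed, frontier = {y}, [y]
    edges, collisions = [], 0
    for _ in range(h):
        new_frontier = []
        for z in frontier:
            for _ in range(d_minus[z]):          # match every head of z
                j = rng.randrange(len(tails))    # uniform unmatched tail
                tails[j], tails[-1] = tails[-1], tails[j]
                x = tails.pop()
                edges.append((x, z))
                if x in exposed:
                    collisions += 1
                else:
                    exposed.add(x)
                    new_frontier.append(x)
        frontier = new_frontier
    return edges, collisions
```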

During the generation process of the in-neighborhood, say that a collision occurs whenever a tail gets chosen, whose end-point x was already exposed, in the sense that some tail in \(E^+_x\) or head in \(E^-_x\) had already been matched. Since less than 2k vertices are exposed when the \(k^{\text {th}}\) tail gets matched, less than \(2\Delta k\) of the \(m-k+1\) possible choices can result in a collision. Thus, the conditional chance that the \(k^{\text {th}}\) step causes a collision, given the past, is less than \(p_k=\frac{2\Delta k}{m-k+1}\). It follows that the number \(Z_k\) of collisions caused by the first k arcs is stochastically dominated by the binomial random variable Bin\(\left( k,p_k\right) \). In particular,

$$\begin{aligned} \mathbb {P}\left( Z_k\ge \ell \right) \le \frac{k^\ell p_k^\ell }{\ell !}\,,\qquad \ell \in {\mathbb N} . \end{aligned}$$
(2.2)

Notice that as long as no collision occurs, the resulting digraph is a directed tree. The same applies to out-neighborhoods simply by inverting the role of heads and tails.

For any digraph G, define the tree excess of G as

$$\begin{aligned} {\textsc {tx}}(G)=1+|E|-|V|, \end{aligned}$$
(2.3)

where E is the set of directed edges and V is the set of vertices of G. In particular, \({\textsc {tx}}(\mathcal B^\pm _{h}(x))=0\) iff \(\mathcal B^\pm _{h}(x)\) is a directed tree, and \({\textsc {tx}}(\mathcal B^\pm _{h}(x))\le 1\) iff there is at most one collision during the generation of the neighborhood \(\mathcal B^\pm _{h}(x)\). Define the events

$$\begin{aligned} \mathcal G_x(h)=\left\{ {\textsc {tx}}(\mathcal B_h^-(x))\le 1 \;\text {and}\;{\textsc {tx}}(\mathcal B_h^+(x))\le 1\right\} ,\quad \mathcal G(h)=\cap _{x\in [n]}\mathcal G_x(h). \end{aligned}$$
(2.4)

Set also

$$\begin{aligned} \hslash = \frac{1}{5}\log _\Delta (n). \end{aligned}$$
(2.5)

Proposition 2.1

There exists \(\chi >0\) such that \(\mathbb {P}\left( \mathcal G_x(\hslash )\right) =1-O(n^{-1-\chi })\) for any \(x\in [n]\). In particular,

$$\begin{aligned} \mathbb {P}\left( \mathcal G(\hslash )\right) =1-O(n^{-\chi }). \end{aligned}$$
(2.6)

Proof

During the generation of \(\mathcal B_h^-(x)\) one creates at most \(\Delta ^h\) edges. It follows from (2.2) with \(\ell =2\) that the probability of the complement of \(\mathcal G_x(\hslash )\) is \(O(n^{-1-\chi })\) for all \(x\in [n]\) for some absolute constant \(\chi >0\):

$$\begin{aligned} \mathbb {P}\left( \mathcal G_x(\hslash )\right) =1-O(n^{-1-\chi }). \end{aligned}$$
(2.7)
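For concreteness, the arithmetic behind (2.7) is as follows (any fixed \(\chi \le 1/5\) works): the number of matchings is \(k\le \Delta ^{\hslash }=n^{1/5}\), hence

$$\begin{aligned} p_k=\frac{2\Delta k}{m-k+1}=O\big (n^{-4/5}\big ),\qquad \mathbb {P}\left( Z_k\ge 2\right) \le \frac{k^2p_k^2}{2}=O\big (n^{2/5-8/5}\big )=O\big (n^{-6/5}\big ), \end{aligned}$$

and the same bound holds for the generation of \(\mathcal B_\hslash ^+(x)\), so that a union of the two events gives (2.7).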

The conclusion follows from the union bound. \(\square \)

We will need to control the size of the boundary of our neighborhoods. To this end, we introduce the notation \(\partial \mathcal B_t^-(y)\) for the set of vertices \(x\in [n]\) such that \(d(x,y)=t\). Similarly, \(\partial \mathcal B_t^+(x)\) is the set of vertices \(y\in [n]\) such that \(d(x,y)=t\). Clearly, \(|\partial \mathcal B_t^\pm (y)|\le \Delta ^{t}\) for any \(y\in [n]\) and \(t\in {\mathbb N} \).

Lemma 2.2

There exists \(\chi >0\) such that for all \(y\in [n]\),

$$\begin{aligned} \mathbb {P}\left( |\partial \mathcal B_h^\pm (y)|\ge \tfrac{1}{2}{\delta _\pm ^h},\,\forall h\in [1,\hslash ]\right) =1-O(n^{-1-\chi }), \end{aligned}$$
(2.8)

where \(\delta _{\pm }\) is defined as in (1.7).

Proof

By symmetry we may restrict to the case of in-neighborhoods. By (2.7) it is sufficient to show that \(|\partial \mathcal B_h^-(y)|\ge \tfrac{1}{2}{\delta _-^h}\), for all \(h\in [1,\hslash ]\), if \(\mathcal G_y(\hslash )\) holds. If the tree excess of the h-in-neighborhood \(\mathcal B_h^-(y)\) is at most 1 then there is at most one collision in the generation of \(\mathcal B_h^-(y)\). This collision can be of two types:

  (1) there exists some \(1\le t\le h\) and a \(v\in \partial \mathcal B^-_{t}(y)\) s.t. v has two out-neighbors \(w,w'\in \partial \mathcal B^-_{t-1}(y)\);

  (2) there exists some \(0\le t\le h\) and a \(v\in \partial \mathcal B^-_{t}(y)\) s.t. v has an in-neighbor w in \(\mathcal B^-_{t}(y)\).

The first case can be further divided into two cases: a) \(w=w'\), and b) \(w\ne w'\); see Fig. 1.

Fig. 1: The light-coloured arrow represents a collision of type (1a) (left) and a collision of type (1b) (right)

In case 1a) we note that the \((h-t)\)-in-neighborhood of v must be a directed tree with at least \(\delta _-^{h-t}\) elements on its boundary and with no intersection with the \((h-t)\)-in-neighborhoods of other \(v'\in \partial \mathcal B^-_{t}(y)\). Moreover, \(\mathcal B^-_{t-1}(y)\) must be a directed tree with \(|\partial \mathcal B^-_{t-1}(y)|\ge \delta _-^{t-1}\), and all elements of \( \partial \mathcal B^-_{t-1}(y)\) except one have disjoint \((h-t+1)\)-in-neighborhoods with at least \(\delta _-^{h-t+1}\) elements on their boundary. Therefore

$$\begin{aligned} |\partial \mathcal B^-_{h}(y)|\ge (\delta _-^{t-1}-1) \delta _-^{h-t+1} + (\delta _--1)\delta _-^{h-t}\ge \frac{1}{2}\delta _-^h. \end{aligned}$$

In case 1b) one has that \(t\ge 2\), \(\mathcal B^-_{t-1}(y)\) is a directed tree with \(|\partial \mathcal B^-_{t-1}(y)|\ge \delta _-^{t-1}\), and for all \(z\in \partial \mathcal B^-_{t}(y)\), the \((h-t)\)-in-neighborhoods of z are disjoint directed trees with at least \(\delta _-^{h-t}\) elements on their boundary. Since \(|\partial \mathcal B^-_{t}(y)|\ge \delta _-^{t}-1\) it follows that

$$\begin{aligned} |\partial \mathcal B^-_{h}(y)|\ge (\delta _-^t-1)\delta _-^{h-t}\ge \frac{1}{2}\delta _-^h. \end{aligned}$$

Collisions of type 2 can be further divided into two types: a) \(w\in \partial \mathcal B^-_{s}(y)\) with \(s< t\) and there is no path from v to w of length \(t-s\), or \(w\in \partial \mathcal B^-_{t}(y)\) and \(w\ne v\), and b) \(w\in \partial \mathcal B^-_{s}(y)\) with \(s< t\) and there is a path from v to w of length \(t-s\), or \(w=v\). Note that in contrast with collisions of type 2a), a collision of type 2b) creates a directed cycle within \(\mathcal B^-_{t}(y)\); see Figs. 2 and 3.

Fig. 2: Two examples of collision of type (2a)

Fig. 3: Two examples of collision of type (2b)

We remark that in either case 2a) or case 2b), \(\partial \mathcal B^-_{t}(y)\) has at least \(\delta _-^{t}\) elements, and the vertex \(v\in \partial \mathcal B^-_{t}(y)\) has at least \(\delta _--1\) in-neighbors whose \((h-t-1)\)-in-neighborhoods are disjoint directed trees. All other \(v'\in \partial \mathcal B^-_{t}(y)\) have \((h-t)\)-in-neighborhoods that are disjoint directed trees. Therefore, in case 2):

$$\begin{aligned} |\partial \mathcal B^-_{h}(y)|\ge (\delta _-^{t}-1) \delta _-^{h-t} + (\delta _--1)\delta _-^{h-t-1}\ge \frac{1}{2}\delta _-^h. \end{aligned}$$

\(\square \)

We shall need a more precise control of the size of \(\partial \mathcal B_h^\pm (y)\), and for values of h that are larger than \(\hslash \). Recall the definition (1.9) of the parameter \(\nu \). We use the following notation in the sequel:

$$\begin{aligned} \ell _0=4\log _\delta \log (n),\qquad h_\eta =(1-\eta )\log _\nu (n). \end{aligned}$$
(2.9)

Lemma 2.3

For every \(\eta \in (0,1)\), there exist constants \(c_1,c_2>0,\chi >0\) such that for all \(y\in [n]\),

$$\begin{aligned} \mathbb {P}\left( \nu ^{h}\log ^{-c_1}(n)\le |\partial \mathcal B_h^\pm (y)|\le \nu ^{h}\log ^{c_2}(n)\,,\;\forall h\in \left[ \ell _0, h_\eta \right] \right) =1-O(n^{-1-\chi }). \end{aligned}$$
(2.10)

Proof

We run the proof for the in-neighborhood only, since the case of the out-neighborhood is obtained in the same way. We generate \(\mathcal B^-_h(y)\), \(h \in \left[ \ell _0, h_\eta \right] \), sequentially in a breadth-first fashion. After the depth j neighborhood \(\mathcal B^-_j(y)\) has been sampled, we call \( \mathcal F_{j}\) the set of all heads attached to vertices in \(\partial \mathcal B^-_j(y)\). Set

$$\begin{aligned} u= \log ^{-7/8}(n). \end{aligned}$$

For any \( h\ge \ell _0\) define

$$\begin{aligned} \kappa _h:=[\nu (1-u)]^{h-\ell _0}\log ^{7/2}(n),\quad \widehat{\kappa }_h:=[\nu (1+u)]^{h-\ell _0}\Delta ^{\ell _0}. \end{aligned}$$
(2.11)

We are going to prove

$$\begin{aligned} \mathbb {P}\left( \kappa _h\le |\mathcal F_{h}|\le \widehat{\kappa }_h,\;\forall h\in \left[ \ell _0, h_\eta \right] \right) =1-O(n^{-1-\chi }). \end{aligned}$$
(2.12)

Notice that, choosing suitable constants \(c_1,c_2>0\), (2.10) is a consequence of (2.12).

Consider the events

$$\begin{aligned} A_j=\left\{ |\mathcal F_i|\in \left[ \kappa _i,\widehat{\kappa }_i \right] ,\; \forall i\in [\ell _0,j] \right\} . \end{aligned}$$
(2.13)

Thus, we need to prove \(\mathbb {P}(A_h)=1-O(n^{-1-\chi })\), for \(h=h_\eta \). From Lemma 2.2 and the choice of \(\ell _0\), it follows that

$$\begin{aligned} \mathbb {P}(A_{\ell _0})=1-O(n^{-1-\chi }). \end{aligned}$$
(2.14)

For \(h>\ell _0\) we write

$$\begin{aligned} \mathbb {P}(A_h)= \mathbb {P}(A_{\ell _0})\prod _{j=\ell _0+1}^{h}\mathbb {P}(A_j|A_{j-1}). \end{aligned}$$
(2.15)

To estimate \(\mathbb {P}(A_j|A_{j-1})\), note that \(A_{j-1}\) depends only on the in-neighborhood \(\mathcal B^-_{j-1}(y)\), so if \(\sigma _{j-1}\) denotes a realization of \(\mathcal B^-_{j-1}(y)\) with a slight abuse of notation we write \(\sigma _{j-1}\in A_{j-1}\) if \(A_{j-1}\) occurs for this given \(\sigma _{j-1}\). Then

$$\begin{aligned} \mathbb {P}(A_j|A_{j-1})=\frac{\sum _{\sigma _{j-1}}\mathbb {P}(\sigma _{j-1})\mathbb {P}(A_j|\sigma _{j-1})1_{\sigma _{j-1}\in A_{j-1}}}{\mathbb {P}(A_{j-1})}. \end{aligned}$$
(2.16)

Therefore, to prove a lower bound on \(\mathbb {P}(A_j|A_{j-1})\) it is sufficient to prove a lower bound on \(\mathbb {P}(A_j|\sigma _{j-1})\) that is uniform over all \(\sigma _{j-1}\in A_{j-1}\).

Suppose we have generated the neighborhood \(\sigma _{j-1}\) up to depth \(j-1\), for a \(\sigma _{j-1}\in A_{j-1}\). In some arbitrary order we now generate the matchings of all heads \(f\in \mathcal F_{j-1}\). For each \(f\in \mathcal F_{j-1}\) we define the random variable \(X_f^{(j)}\), which equals the in-degree \(d_z^-\) of the vertex z whose tail is matched to f if z was not yet exposed, and equals zero otherwise. In this way

$$\begin{aligned} |\mathcal F_j|= \sum _{f\in \mathcal F_{j-1}}X_f^{(j)}. \end{aligned}$$
(2.17)

Therefore,

$$\begin{aligned} \mathbb {P}(A_{j}|\sigma _{j-1})&=\mathbb {P}\left( \nu (1-u)\kappa _{j-1}\le |\mathcal F_j|\le \nu (1+u)\widehat{\kappa }_{j-1}\, |\, \sigma _{j-1}\right) \nonumber \\&= 1-\mathbb {P}\Big (\sum _{f\in \mathcal F_{j-1}}X_f^{(j)}<\nu (1-u)\kappa _{j-1}\, |\, \sigma _{j-1}\Big )\nonumber \\&\quad -\mathbb {P}\Big (\sum _{f\in \mathcal F_{j-1}}X_f^{(j)}> \nu (1+u)\widehat{\kappa }_{j-1}\, |\, \sigma _{j-1}\Big ). \end{aligned}$$
(2.18)

To sample the variables \(X_f^{(j)}\), at each step we pick a tail uniformly at random among all unmatched tails and evaluate the in-degree of its end point if it is not yet exposed. Since \(\sigma _{j-1}\in A_{j-1}\), at any such step the number of exposed vertices is at most \(K=O(n^{1-\eta /2})\). In particular, for any \(f\in \mathcal F_{j-1}\), any \(d\in [\delta ,\Delta ]\), and any \(\sigma _{j-1}\in A_{j-1}\):

$$\begin{aligned} \mathbb {P}\left( X_f^{(j)}= d \, |\, \sigma _{j-1}\right) \ge \frac{\left[ \left( \sum _{k=1}^{n} d^+_{k}\mathbf {1}_{d^-_{k}=d}\right) -\Delta K\right] _+}{m}=:p(d), \end{aligned}$$

where \([\cdot ]_+\) denotes the positive part. This shows that \(X_f^{(j)}\) stochastically dominates the random variable \(Y^{(j)}\) and is stochastically dominated by the random variable \(\widehat{Y}^{(j)}\), where \(Y^{(j)}\) and \(\widehat{Y}^{(j)}\) are defined by

$$\begin{aligned}&\forall d\in [\delta ,\Delta ],\quad \mathbb {P}(Y^{(j)}=d)=\mathbb {P}(\widehat{Y}^{(j)}=d)= p(d)\\&\mathbb {P}\left( \widehat{Y}^{(j)}= \Delta +1 \right) =\mathbb {P}\left( Y^{(j)}= 0 \right) =1-\sum _{d=\delta }^{\Delta }p(d) . \end{aligned}$$

Notice that

$$\begin{aligned} \mathbb {E}\left( Y^{(j)}\right) =\sum _{d=\delta }^{\Delta }dp(d)\ge \nu - \frac{\Delta ^2K}{m} = \nu - O(n^{-\eta /2}). \end{aligned}$$
(2.19)

Similarly,

$$\begin{aligned} \mathbb {E}\left( \widehat{Y}^{(j)}\right) \le \nu + \frac{\Delta ^2K}{m} = \nu + O(n^{-\eta /2}). \end{aligned}$$
(2.20)

Moreover, letting \(Y_i^{(j)}\) and \(\widehat{Y}_i^{(j)}\) denote i.i.d. copies of the random variables \(Y^{(j)}\) and \(\widehat{Y}^{(j)}\) respectively, since \(\sigma _{j-1}\in A_{j-1}\), the sum in (2.17) stochastically dominates \(\sum _{i=1}^{\kappa _{j-1}}Y_i^{(j)}\), and is stochastically dominated by \(\sum _{i=1}^{\widehat{\kappa }_{j-1}}\widehat{Y}_i^{(j)}\). Therefore, \(\sum _{f\in \mathcal F_{j-1}}X_f^{(j)}<\nu (1-u)\kappa _{j-1}\) implies that

$$\begin{aligned} \sum _{i=1}^{\kappa _{j-1}}\left[ Y_i^{(j)}-\mathbb {E}\left( Y^{(j)}\right) \right] \le -\frac{1}{2}\,u\kappa _{j-1}, \end{aligned}$$
(2.21)

if n is large enough. Similarly, \(\sum _{f\in \mathcal F_{j-1}}X_f^{(j)}>\nu (1+u)\widehat{\kappa }_{j-1}\) implies that

$$\begin{aligned} \sum _{i=1}^{\widehat{\kappa }_{j-1}}\left[ \widehat{Y}_i^{(j)}-\mathbb {E}\left( \widehat{Y}^{(j)}\right) \right] \ge \frac{1}{2}\,u\widehat{\kappa }_{j-1}. \end{aligned}$$
(2.22)

An application of Hoeffding’s inequality shows that the probabilities of the events (2.21) and (2.22) are bounded by \(e^{-c u^2 \kappa _{j-1}}\) and \(e^{-c u^2 \widehat{\kappa }_{j-1}}\) respectively, for some absolute constant \(c>0\). Hence, from (2.18) we conclude that for some constant \(c>0\):

$$\begin{aligned} \mathbb {P}(A_{j}|\sigma _{j-1})\ge 1- e^{-c u^2 \kappa _{j-1}} - e^{-c u^2 \widehat{\kappa }_{j-1}}. \end{aligned}$$

Therefore, using \(u^2 \widehat{\kappa }_{j-1}\ge u^2\kappa _{j-1}\ge u^2\kappa _{\ell _0} \ge \log ^{3/2}(n)\),

$$\begin{aligned} \mathbb {P}(A_{j}|\sigma _{j-1})=1-O(n^{-3}), \end{aligned}$$
(2.23)

uniformly in \(j\in [\ell _0,h_\eta ]\) and \(\sigma _{j-1}\in A_{j-1}\). By (2.16) the same bound applies to \(\mathbb {P}(A_{j}|A_{j-1})\) and going back to (2.15), for \(h=h_\eta \) we have obtained

$$\begin{aligned} \mathbb {P}(A_{h})=1-O(n^{-1-\chi }). \end{aligned}$$

\(\square \)

We shall also need the following refinement of Lemma 2.3. Define the events

$$\begin{aligned} F_y^\pm =F_y^\pm (\eta ,c_1,c_2)=\left\{ \nu ^{h}\log ^{-c_1}(n)\le |\partial \mathcal B_{h}^\pm (y)|\le \nu ^{h}\log ^{c_2}(n)\,,\;\forall h\in \left[ \ell _0, h_\eta \right] \right\} . \end{aligned}$$
(2.24)

Lemma 2.3 states that

$$\begin{aligned} \mathbb {P}\left( (F_y^\pm )^c\right) =O(n^{-1-\chi }). \end{aligned}$$

Let \(\mathcal G(\hslash )\) be the event from Proposition 2.1.

Lemma 2.4

For every \(\eta \in (0,1)\), there exist constants \(c_1,c_2>0,\chi >0\) such that for all \(y\in [n]\),

$$\begin{aligned} \mathbb {P}\left( (F_y^\pm )^c; \mathcal G(\hslash )\right) =O(n^{-2-\chi }). \end{aligned}$$
(2.25)

Proof

By symmetry we may prove the inequality for the event \(F_y^-\) only. Consider the set \(\mathcal D^-_y\) of all possible 2-in-neighborhoods of y compatible with the event \(\mathcal G(\hslash )\), that is the set of labeled digraphs D such that

$$\begin{aligned} \mathbb {P}(\mathcal B^-_2(y)=D\,;\,\mathcal G(\hslash ))>0. \end{aligned}$$
(2.26)

Then

$$\begin{aligned} \mathbb {P}\left( (F_y^-)^c; \mathcal G(\hslash )\right) \le \sup _{D\in \mathcal D^-_y}\mathbb {P}\left( (F_y^-)^c\, |\, \mathcal B^-_2(y)=D \right) . \end{aligned}$$
(2.27)

Thus it is sufficient to prove that

$$\begin{aligned} \mathbb {P}\left( (F_y^-)^c\, |\, \mathcal B^-_2(y)=D \right) =O(n^{-2-\chi }), \end{aligned}$$
(2.28)

uniformly in \(D\in \mathcal D^-_y\). To this end, we may repeat exactly the same argument as in the proof of Lemma 2.3, with the difference that now we condition from the start on the event \( \mathcal B^-_2(y)=D\) for a fixed \(D\in \mathcal D^-_y\). The key observation is that (2.14) can be strengthened to \( O(n^{-2-\chi })\) if we condition on \(\mathcal B^-_2(y)=D\). That is, for some \(\chi >0\), uniformly in \(D\in \mathcal D^-_y\),

$$\begin{aligned} \mathbb {P}\left( A_{\ell _0}\, |\, \mathcal B^-_2(y)=D \right) =1-O(n^{-2-\chi }), \end{aligned}$$
(2.29)

To prove (2.29) notice that if the 2-in-neighborhood of y is given by \(\mathcal B^-_2(y)=D\in \mathcal D_y\) then the set \(\mathcal F_{2}^-(y)\) has at least 4 elements. Therefore, taking a sufficiently large constant C, for the event \(|\mathcal F_{\ell _0}^-(y)|\ge \delta ^{\ell _0}/C\) to fail it is necessary to have at least 3 collisions in the generation of \(\mathcal B^-_t(y)\), \(t\in \{3,\dots ,\ell _0\}\). From the estimate (2.2) the probability of this event is bounded by \(p_k^3k^3\) with \(k=\Delta ^{\ell _0}\), which implies (2.29) if \(\chi \in (0,1)\). Once (2.29) is established, the rest of the proof is a repetition of the argument in (2.15)-(2.23). \(\square \)

2.3 Upper bound on the diameter

The upper bound in Theorem 1.1 is reformulated as follows.

Lemma 2.5

There exist constants \(C,\chi >0\) such that if \(\varepsilon _n=\frac{C\log \log (n)}{\log (n)}\),

$$\begin{aligned} \mathbb {P}\left( \mathrm{diam}(G)>(1+\varepsilon _n)\,\mathrm{d}_\star \right) =O(n^{-\chi }). \end{aligned}$$
(2.30)

Proof

From Proposition 2.1 we may restrict to the event \(\mathcal G(\hslash )\). From the union bound

$$\begin{aligned} \mathbb {P}\left( \mathrm{diam}(G)>(1+\varepsilon _n)\,\mathrm{d}_\star ;\mathcal G(\hslash )\right) \le \sum _{x,y\in [n]} \mathbb {P}\left( d(x,y)>(1+\varepsilon _n)\mathrm{d}_\star ;\mathcal G(\hslash )\right) . \end{aligned}$$
(2.31)

From Lemma 2.4, for all \(x,y\in [n]\)

$$\begin{aligned} \mathbb {P}\left( d(x,y)>(1+\varepsilon _n)\mathrm{d}_\star ;\mathcal G(\hslash )\right) =\mathbb {P}\left( d(x,y)>(1+\varepsilon _n)\mathrm{d}_\star ;F_x^+\cap F_y^-\right) + O(n^{-2-\chi }). \end{aligned}$$
(2.32)

Fix

$$\begin{aligned} k=\frac{1+\varepsilon _n}{2}\,\log _\nu n. \end{aligned}$$

Let us use sequential generation to sample first \(\mathcal B^+_k(x)\) and then \(\mathcal B^-_{k-1}(y)\). Call \(\sigma \) a realization of these two neighborhoods. Consider the event

$$\begin{aligned} U_{x,y}=\{|\partial \mathcal B_k^+(x)|\ge \nu ^k\log ^{-c_1}(n)\,;\;|\partial \mathcal B_{k-1}^-(y)|\ge \nu ^{k-1}\log ^{-c_1}(n)\}. \end{aligned}$$

Clearly, \( F_x^+\cap F_y^-\subset U_{x,y}\). Moreover \(U_{x,y}\) depends only on \(\sigma \). Note also that \(\{d(x,y)>(1+\varepsilon _n)\mathrm{d}_\star \}\subset E_{x,y}\), where we define the event

$$\begin{aligned} E_{x,y}=\{\text {There is no path of length } \le 2k-1\text { from } x\text { to } y \}. \end{aligned}$$
(2.33)

The event \(E_{x,y}\) depends only on \(\sigma \). We say that \(\sigma \in U_{x,y}\cap E_{x,y}\) if \(\sigma \) is such that both \(E_{x,y}\) and \(U_{x,y}\) occur. Thus, we write

$$\begin{aligned} \mathbb {P}\left( d(x,y)>(1+\varepsilon _n)\mathrm{d}_\star ;F_x^+\cap F_y^-\right)&\le \mathbb {P}\left( d(x,y)>(1+\varepsilon _n)\mathrm{d}_\star ; U_{x,y}\cap E_{x,y}\right) \nonumber \\&\le \sup _{\sigma \in U_{x,y}\cap E_{x,y}}\mathbb {P}\left( d(x,y)>(1+\varepsilon _n)\mathrm{d}_\star \, |\, \sigma \right) . \end{aligned}$$
(2.34)

Fix a realization \(\sigma \in U_{x,y}\cap E_{x,y}\). The event \(E_{x,y}\) implies that all vertices on \(\partial \mathcal B_{k-1}^-(y)\) have all their heads unmatched and the same holds for all the tails of vertices in \(\partial \mathcal B_k^+(x)\). Call \(\mathcal F_{k-1}\) the heads attached to vertices in \(\partial \mathcal B_{k-1}^-(y)\) and \(\mathcal E_{k}\) the tails attached to vertices in \(\partial \mathcal B_k^+(x)\). The event \(d(x,y)>(1+\varepsilon _n)\mathrm{d}_\star \) implies that there are no matchings between \(\mathcal F_{k-1}\) and \(\mathcal E_k\). The probability of this event is dominated by

$$\begin{aligned} \left( 1-\frac{|\mathcal E_{k}|}{m}\right) ^{|\mathcal F_{k-1}|}\le \left( 1-n^{-\frac{1}{2}+\frac{\varepsilon _n}{4}}\right) ^{n^{\frac{1}{2}+\frac{\varepsilon _n}{4}}}\le \exp {(-n^{{\varepsilon _n}/2})}\,, \end{aligned}$$

if n is large enough and \(\varepsilon _n=C\log \log n/\log n\) with C large enough. Therefore, uniformly in \(\sigma \in U_{x,y}\cap E_{x,y}\),

$$\begin{aligned} \mathbb {P}\left( d(x,y)>(1+\varepsilon _n)\mathrm{d}_\star \, |\, \sigma \right) \le \exp {(-n^{{\varepsilon _n}/2})}=O(n^{-2-\chi }). \end{aligned}$$

Inserting this in (2.31)–(2.32) completes the proof. \(\square \)

2.4 Lower bound on the diameter

We prove the following lower bound on the diameter. Note that Lemma 2.5 and Lemma 2.6 imply Theorem 1.1.

Lemma 2.6

There exists \(C>0\) such that taking \(\varepsilon _n=\frac{C\log \log (n)}{\log (n)}\), for any \(x,y\in [n]\),

$$\begin{aligned} \mathbb {P}\left( d(x,y)\le (1-\varepsilon _n)\mathrm{d}_\star \right) =o(1). \end{aligned}$$
(2.35)

Proof

Define

$$\begin{aligned} \ell =\frac{1-\varepsilon _n}{2}\,\log _\nu n. \end{aligned}$$

We start by sampling the out-neighborhood of x up to distance \(\ell \). Consider the event

$$\begin{aligned} J_x=\left\{ | \mathcal B^+_\ell (x)| \le n^{\frac{1-\varepsilon _n}{2}} \log ^{c_2}(n) \right\} . \end{aligned}$$

From Lemma 2.3, \(\mathbb {P}(J_x)=1-O(n^{-1-\chi })\) for suitable constants \(c_2,\chi >0\), and therefore

$$\begin{aligned} \mathbb {P}(y\in \mathcal B^+_\ell (x))= \mathbb {P}(y\in \mathcal B^+_\ell (x); J_x)+O(n^{-1-\chi }). \end{aligned}$$
(2.36)

If \(J_x\) holds, in the generation of \(\mathcal B^+_\ell (x)\) there are at most \(K:=n^{\frac{1-\varepsilon _n}{2}} \log ^{c_2}(n) \) attempts to include y in \(\mathcal B^+_\ell (x)\), each with probability at most \(d^-_y/(m-K) \le 2\Delta /m\) of success, so that

$$\begin{aligned} \mathbb {P}(y\in \mathcal B^+_\ell (x); J_x)\le \frac{2\Delta }{m}\,K = O(n^{-\frac{1}{2}}) . \end{aligned}$$
(2.37)

Once the out-neighborhood \(\mathcal B^+_\ell (x)\) has been generated, if \(y\notin \mathcal B^+_\ell (x)\), we generate the in-neighborhood \(\mathcal B^-_\ell (y)\). If \(d(x,y)\le (1-\varepsilon _n)\mathrm{d}_\star \) then there must be a collision with \(\partial \mathcal B^+_\ell (x)\), and

$$\begin{aligned} \mathbb {P}(d(x,y)\le (1-\varepsilon _n)\mathrm{d}_\star \,;\, y\notin \mathcal B^+_\ell (x))= \mathbb {P}(y\notin \mathcal B^+_\ell (x)\,;\, \mathcal B^{-}_\ell (y)\cap \partial \mathcal B^{+}_\ell (x)\ne \emptyset ). \end{aligned}$$
(2.38)

Consider the event

$$\begin{aligned} J_y=\left\{ | \mathcal B^-_\ell (y)| <n^{\frac{1-\varepsilon _n}{2}}\log ^{c_2}(n) \right\} . \end{aligned}$$

From Lemma 2.3 it follows that \(\mathbb {P}(J_y)=1-O(n^{-1-\chi })\) for suitable constants \(c_2,\chi >0\). If \(J_x\) and \(J_y\) hold, in the generation of \(\mathcal B^{-}_\ell (y)\) there are at most \(K=n^{\frac{1-\varepsilon _n}{2}}\log ^{c_2}(n)\) attempts to collide with \(\partial \mathcal B^{+}_\ell (x)\), each with success probability at most \(\Delta K/m\), and therefore

$$\begin{aligned} \mathbb {P}(y\notin \mathcal B^+_\ell (x)\,;\, \mathcal B^{-}_\ell (y)\cap \partial \mathcal B^{+}_\ell (x)\ne \emptyset ) \le \frac{\Delta K^2}{m} = O(n^{-\varepsilon _n/2})=o(1), \end{aligned}$$
(2.39)

where we take the constant C in the definition of \(\varepsilon _n\) sufficiently large. In conclusion,

$$\begin{aligned} \mathbb {P}\left( d(x,y)\le (1-\varepsilon _n)\mathrm{d}_\star \right) \le \mathbb {P}\left( y\in \mathcal B^+_\ell (x)\right) + \mathbb {P}\left( d(x,y)\le (1-\varepsilon _n)\mathrm{d}_\star \,;\, y\notin \mathcal B^+_\ell (x)\right) , \end{aligned}$$

and the inequalities (2.36)–(2.39) end the proof. \(\square \)

3 Stationary distribution

We start by recalling some key facts established in [9].

3.1 Convergence to stationarity

Let \(P^t(x,\cdot )\) denote the distribution after t steps of the random walk started at x. The total variation distance between two probabilities \(\mu ,\nu \) on [n] is defined as

$$\begin{aligned} \Vert \mu -\nu \Vert _\texttt {TV}= \frac{1}{2}\sum _{x\in [n]}|\mu (x)-\nu (x)|. \end{aligned}$$

Let the entropy H and the associated entropic time \(T_\mathrm{ENT} \) be defined by

$$\begin{aligned} H=\sum _{x\in V}\frac{d_x^-}{m}\,\log d^+_x, \;\quad T_\mathrm{ENT} = \frac{\log n}{H}. \end{aligned}$$
(3.1)

Note that under our assumptions on \(\mathbf{d}^\pm \), the deterministic quantities \(H,T_\mathrm{ENT} \) satisfy \(H=\Theta (1)\) and \(T_\mathrm{ENT} = \Theta (\log n)\). Theorem 1 of [9] states that for all \(s>0\) with \(s\ne 1\),

$$\begin{aligned} \max _{x\in [n]}\left| \Vert P^{sT_\mathrm{ENT} }(x,\cdot )-\pi \Vert _\texttt {TV}- \vartheta (s) \right| \overset{\mathbb {P}}{\longrightarrow }0\,, \end{aligned}$$
(3.2)

where \(\vartheta \) denotes the step function \(\vartheta (s)=1\) if \(s<1\) and \(\vartheta (s)=0\) if \(s>1\), and we use the notation \(\overset{\mathbb {P}}{\longrightarrow }\) for convergence in probability as \(n\rightarrow \infty \). In words, convergence to stationarity for the random walk on the directed configuration model displays with high probability a cutoff phenomenon, uniformly in the starting point, with mixing time given by the entropic time \(T_\mathrm{ENT} \). By concavity of \(x\mapsto \log (x)\), the mixing time \(T_\mathrm{ENT} =\frac{\log n}{H}\) is always at least as large as the diameter value \(\mathrm{d}_\star =\frac{\log n}{\log \nu }\) from Theorem 1.1,

$$\begin{aligned} H= \sum _{x=1}^n\frac{d_x^-}{m}\,\log d^+_x \le \log \left( \sum _{x=1}^n\frac{d_x^-}{m}\,d^+_x\right) = \log \nu , \end{aligned}$$
(3.3)

with equality if and only if the sequence is out-regular, that is \(d_x^+\equiv d\). Thus, the analysis of convergence to stationarity requires investigating the graph on a length scale that may well exceed the diameter. Considering all possible paths on this length scale is not practical, and we shall rely on a powerful construction of [9] that allows one to restrict to a subset of paths with a tree structure, see Sect. 3.3.1 below for the details.
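As a concrete illustration of (3.3), in the (2, 3)(3, 2) case of the introduction one has \(m=\tfrac{5}{2}n\) and \(\nu =\tfrac{12}{5}\), so that

$$\begin{aligned} H=\tfrac{2}{5}\log 3+\tfrac{3}{5}\log 2\approx 0.855\,<\,\log \nu =\log \tfrac{12}{5}\approx 0.875, \end{aligned}$$

and the mixing time exceeds the typical distance only by the constant factor \(\log \nu /H\approx 1.02\).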

3.2 The local approximation

A consequence of the arguments of [9] is that the unknown stationary distribution at a node y admits an approximation in terms of the in-neighborhood of y at a distance that is much smaller than the mixing time. More precisely, it follows from [9, Theorem 3] that for any sequence \(t_n\rightarrow \infty \)

$$\begin{aligned} \Vert \pi -\mu _\mathrm{in}P^{t_n}\Vert _\texttt {TV}\overset{\mathbb {P}}{\longrightarrow }0, \end{aligned}$$
(3.4)

where we use the notation \(\mu _\mathrm{in}\) for the in-degree distribution

$$\begin{aligned} \mu _\mathrm{in}(x)=\frac{d_x^-}{m}, \end{aligned}$$
(3.5)

and for any probability \(\mu \) on [n], \(\mu P^t\) is the distribution

$$\begin{aligned} \mu P^t(y) = \sum _{x\in [n]}\mu (x)P^t(x,y),\quad y\in [n]. \end{aligned}$$

We refer to [11, Lemma 1] for a stronger statement than (3.4) where \(\mu _\mathrm{in}\) is replaced by any sufficiently widespread probability on [n]. While these facts are very useful to study the typical values of \(\pi \), they give very poor information on its extremal values \(\pi _\mathrm{min}\) and \(\pi _\mathrm{max}\), and to prove Theorem 1.3 and Theorem 1.6 we need a stronger control of the local approximation of the stationary distribution.

A key role in our analysis is played by the quantity \(\Gamma _h(y)\) defined as follows. Consider the set \(\partial \mathcal B^-_h(y)\) of all vertices \(z\in [n]\) such that \(d(z,y)=h\), and define

$$\begin{aligned} \Gamma _{h}(y):=\sum _{z\in \partial \mathcal B^-_h(y)}d_z^- \,P^h(z,y). \end{aligned}$$
(3.6)

The definitions (3.6) and (1.13) are such that for any \(y\in [n]\) and \(h\in {\mathbb N} \)

$$\begin{aligned} \Gamma _{h}(y)\le m\,\mu _\mathrm{in}P^h(y), \end{aligned}$$
(3.7)

where \(\mu _\mathrm{in}\) is defined in (3.5). Indeed, the sum defining \(m\,\mu _\mathrm{in}P^h(y)\) runs over all vertices z with \(P^h(z,y)>0\), while \(\Gamma _{h}(y)\) retains only those at distance exactly h from y. If \(\mathcal B^-_h(y)\) is a tree, then (3.7) is an equality. In any case, \(\Gamma _{h}(y)\) satisfies the following rough inequalities.

Lemma 3.1

With high probability, for all \(y\in [n]\), for all \(h\in [1,\hslash ]\):

$$\begin{aligned} \left( \frac{\delta _-}{\Delta _+}\right) ^h\le \Gamma _{h}(y)\le 2\Delta _-\left( \frac{\Delta _-}{\delta _+}\right) ^h. \end{aligned}$$
(3.8)

Proof

From Proposition 2.1 we may assume that the event \(\mathcal G(\hslash )\) holds. From Lemma 2.2 we know that \(\frac{1}{2}\delta _-^h \le |\partial \mathcal B^-_h(y)|\le \Delta _-^h\). Thus it suffices to show that for any \(z\in \partial \mathcal B^-_h(y)\), \(h\in [1,\hslash ]\):

$$\begin{aligned} \Delta _+^{-h}\le P^h(z,y)\le 2\delta _+^{-h}. \end{aligned}$$
(3.9)

The bounds in (3.9) follow from the observation that any path of length h from z to y has weight at least \(\Delta _+^{-h}\) and at most \(\delta _+^{-h}\), and that there is at least one and at most two such paths if \(z\in \partial \mathcal B^-_h(y)\) and \(\mathcal G(\hslash )\) holds. The latter fact can be seen with the same argument used in the proof of Lemma 2.2. With reference to that proof: in case 1) there are at most two paths from z to y, see Fig. 1; in case 2) there is only one path from z to y; see Figs. 2 and 3. \(\square \)

Roughly speaking, in what follows the extremal values of \(\pi \) will be controlled by approximating \(\pi (y)\) in terms of \(\Gamma _h(y)\) for values of h of order \(\log \log n\), for every node y. The next two results allow us to control \(\Gamma _h(y)\) in terms of \(\Gamma _{h_0}(y)\) for all \(h\in \left[ h_0,\hslash \right] \) where \(h_0\) is of order \(\log \log n\).

Lemma 3.2

There exist constants \(c>0\) and \(C>0\) such that:

$$\begin{aligned} \mathbb {P}\left( \forall y\in [n],\,\forall h\in \left[ h_0,\hslash \right] ,\,\Gamma _{h}(y)\ge c\log ^{1-\gamma _0}(n)\right) =1-o(1), \end{aligned}$$
(3.10)

where \(\gamma _0\) is the constant from Theorem 1.3 and \( h_0:=\log _{\delta _-}\log (n) + C\).

Proof

From Lemma 2.2 we may assume that \( |\partial \mathcal B^-_{h_0}(y)|\ge \frac{1}{2}\delta _-^{h_0}=:R\) for all \(y\in [n]\), where \(h_0\) is as in the statement above with C to be fixed later. Once we have the in-neighborhood \(\mathcal B^{-}_{h_0}(y)\) we proceed with the generation of the \((h-h_0)\)-in-neighborhoods of all \(z\in \partial \mathcal B^-_{h_0}(y)\). Consider the first R elements of \(\partial \mathcal B^-_{h_0}(y)\), and order them as \((z_1,\dots ,z_R)\) in some arbitrary way. We sample sequentially \( \mathcal B^{-}_{h-h_0}(z_1)\), then \( \mathcal B^{-}_{h-h_0}(z_2)\), and so on. We want to couple the random variables \(Z_i:= \mathcal B^{-}_{h-h_0}(z_i)\), \(i=1,\dots ,R\), with a sequence of independent rooted directed random trees \(W_i\), \(i=1,\dots ,R\), defined as follows. The tree \(W_i\) is defined as the first \(h-h_0\) generations of the marked random tree \(\mathcal T_i\) produced by the following instructions:

  • the root is given the mark \(z_i\);

  • every vertex with mark j has \(d^-_j\) children, each of which is given independently the mark \(k\in [n]\) with probability \(d^+_k/m\).

Consider the generation of the i-th variable \(Z_i\). This is achieved by the breadth-first sequential procedure, where at each step a head is matched with a tail chosen uniformly at random from all unmatched tails; see Sect. 2. If instead we pick the tail uniformly at random from all possible tails, then we need to reject the outcome if the chosen tail belongs to the set of tails that have been already matched. Since the total number of tails matched at any step of this generation is at most \(K:=\Delta ^\hslash =O(n^{1/5})\), it follows that the probability of a rejection is bounded by \(p:=K/m = O(n^{-4/5})\). Let us now consider the event of a collision, that is when the chosen tail belongs to a vertex that has already been exposed during the previous steps, including the generation of \(\mathcal B^{-}_{h_0}(y)\) and of the \(Z_j\), \(j\le i\). Notice that the total number of exposed vertices is at most K and therefore the probability of a collision is bounded by \(p'=\Delta K/m = O(n^{-4/5})\). Since the generation of \(Z_i\) requires at most K matchings, we see that conditionally on the past, a \(Z_i\) with no rejections and no collisions is created with probability uniformly bounded from below by \(1-q\), where \(q=O(n^{-3/5})\). We say that \(Z_i\) is bad if its generation produced a rejection or a collision. Once the \(Z_i\)’s have been sampled we define a set \(\mathcal I\) such that \(i\in \mathcal I\) if and only if either \(Z_i\) is bad or there is a bad \(Z_j\) such that the generation of \(Z_j\) produced a collision with a vertex from \(Z_i\). With this notation, \(W_i=Z_i\) for all \(i\notin \mathcal I\) and

$$\begin{aligned} \Gamma _{h}(y)\ge \Delta _+^{-h_0}\sum _{i\notin \mathcal I} \Gamma _{h-h_0}(z_i). \end{aligned}$$
(3.11)

The above construction shows that the cardinality of the set \(\mathcal I\) is stochastically dominated by twice the binomial \(\mathrm{Bin}(R,q)\). Therefore,

$$\begin{aligned} {\mathbb P} (|\mathcal I|\ge 10)\le {\mathbb P} (\mathrm{Bin}(R,q)\ge 5) \le (Rq)^5 = o(n^{-2}). \end{aligned}$$
(3.12)

On the other hand, notice that for all \(i\notin \mathcal I\):

$$\begin{aligned} \Gamma _{h-h_0}(z_i) = M^{i}_{h-h_0}, \end{aligned}$$
(3.13)

where \(M^{i}_t\), \(t\in {\mathbb N} \), is defined as follows. Let \(\mathcal T_{t,i}\) denote the set of vertices forming generation t of the tree \(\mathcal T_i\) rooted at \(z_i\), and for \(x\in \mathcal T_{t,i}\), write

$$\begin{aligned} \mathbf {w}(x):=\mathbf {w}\left( x\mapsto z_i; \mathcal T_i\right) = \prod _{u=1}^t\frac{1}{d^+_{x_u}}, \end{aligned}$$
(3.14)

for the weight of the path \((x_t=x,x_{t-1},\dots ,x_1,x_0=z_i)\) from x to \(z_i\) along \(\mathcal T_i\). Then \(M^{i}_t \) is defined by

$$\begin{aligned} M^{i}_t = \sum _{x\in \mathcal T_{t,i}}d_x^-\mathbf {w}(x), \quad M^{i}_0 = d_{z_i}^-. \end{aligned}$$
(3.15)

It is not hard to check (see e.g. [11, Proposition 4]) that for fixed n, \((M^{i}_t)_{t\ge 0}\) is a martingale with

$$\begin{aligned} {\mathbb E} [M^{i}_t]=M^{i}_0 = d_{z_i}^-. \end{aligned}$$
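
For the reader's convenience, here is the one-step computation behind this fact (a direct check, not needed in the sequel): conditionally on \(\mathcal T_{t,i}\), each \(x\in \mathcal T_{t,i}\) has \(d^-_x\) children, every child c satisfies \(\mathbf {w}(c)=\mathbf {w}(x)/d^+_c\), and its mark equals k with probability \(d^+_k/m\), so that

$$\begin{aligned} {\mathbb E} \big [M^{i}_{t+1}\, \big |\, \mathcal T_{t,i}\big ]=\sum _{x\in \mathcal T_{t,i}}d^-_x\,\mathbf {w}(x)\sum _{k\in [n]}\frac{d^+_k}{m}\,\frac{d^-_k}{d^+_k}=\sum _{x\in \mathcal T_{t,i}}d^-_x\,\mathbf {w}(x)=M^{i}_t, \end{aligned}$$

since \(\sum _{k\in [n]}d^-_k=m\).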

In particular, by truncating at a sufficiently large constant \(C_1>0\) one has \( M^{i}_{h-h_0} \ge X_i\), where

$$\begin{aligned} X_i:=\min \{M^{i}_{h-h_0},C_1\} \end{aligned}$$

are independent random variables with \(0\le X_i\le C_1\) and \({\mathbb E} [X_i]\ge 1\) for all i. Therefore, Hoeffding’s inequality gives, for any \(k\in {\mathbb N} \):

$$\begin{aligned} \mathbb {P}\left( \sum _{i=1}^k M^{i}_{h-h_0} \le k/2\right)&\le e^{-c_1 k}, \end{aligned}$$
(3.16)

where \(c_1>0\) is a suitable constant.
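
For completeness, the constant \(c_1\) can be made explicit: since the \(X_i\) are independent, take values in \([0,C_1]\), and satisfy \({\mathbb E} [X_i]\ge 1\), Hoeffding's inequality for the lower tail yields

$$\begin{aligned} \mathbb {P}\Big (\sum _{i=1}^k M^{i}_{h-h_0}\le k/2\Big )\le \mathbb {P}\Big (\sum _{i=1}^k \big (X_i-{\mathbb E} [X_i]\big )\le -k/2\Big )\le \exp \Big (-\frac{2(k/2)^2}{kC_1^2}\Big )=e^{-k/(2C_1^2)}, \end{aligned}$$

so that (3.16) holds with \(c_1=1/(2C_1^2)\).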

Divide the integers \(\{1,\dots ,R\}\) into 10 disjoint intervals \(I_1,\dots ,I_{10}\), each containing R/10 elements. If \(|\mathcal I|<10\) then there must be one of the intervals, say \(I_{j_*}\), such that \(I_{j_*}\cap \mathcal I=\emptyset \). It follows that if \(|\mathcal I|<10\), then

$$\begin{aligned} \sum _{i\notin \mathcal I} \Gamma _{h-h_0}(z_i)\ge \sum _{i\in I_{j_*}} M^{i}_{h-h_0} \ge \min _{\ell =1,\dots ,10} \sum _{i\in I_\ell } M^{i}_{h-h_0}. \end{aligned}$$
(3.17)

Using (3.12), (3.16), and (3.17) we conclude that, for a suitable constant \(c_2>0\):

$$\begin{aligned} \mathbb {P}\left( \sum _{i\notin \mathcal I} \Gamma _{h-h_0}(z_i) \le c_2 R\right)&\le \mathbb {P}\left( \min _{\ell =1,\dots ,10} \sum _{i\in I_\ell } M^{i}_{h-h_0} \le c_2 R\right) + {\mathbb P} (|\mathcal I|\ge 10)\nonumber \\&\le 10 \exp {\left( -c_1 R/10\right) }+ o(n^{-2}). \end{aligned}$$
(3.18)

Since \(R=\frac{1}{2} \delta _-^{h_0}=\frac{1}{2}{\delta _-^C}\log n\), the probability in (3.18) is \(o(n^{-2})\) if C is large enough. From (3.11), on the event \(\sum _{i\notin \mathcal I} \Gamma _{h-h_0}(z_i) > c_2R\) one has

$$\begin{aligned} \Gamma _{h}(y)\ge \tfrac{1}{2}c_2\delta _-^{h_0}\Delta _+^{-h_0} = c \,\log ^{1-\gamma _0}(n), \end{aligned}$$
(3.19)

where \(c=\tfrac{1}{2}c_2(\delta _-/\Delta _+)^{C}\). Thus the event (3.19) has probability \(1-o(n^{-2})\), and the desired conclusion follows by taking a union bound over \(y\in [n]\) and \(h\in [h_0,\hslash ]\). \(\square \)

Lemma 3.3

There exists a constant \(K>0\) such that for all \(\varepsilon >0\), with high probability:

$$\begin{aligned} \max _{y\in [n]}\max _{h\in [h_1,\hslash ]}\Big |\frac{\Gamma _h(y)}{\Gamma _{h_1}(y)}-1 \Big |\le \varepsilon , \end{aligned}$$
(3.20)

where \(h_1:=K\log \log (n)\).

Proof

For any \(h\ge h_1\), let \(\sigma _h\) denote a realization of the in-neighborhood \(\mathcal B^-_{h}(y)\), obtained with the usual breadth-first sequential generation. From Proposition 2.1 we may assume that the tree excess of \(\mathcal B^-_{h}(y)\) is at most 1, as long as \(h\le \hslash \). Call \(\mathcal E_{tot,h},\mathcal F_{tot,h}\) the sets of unmatched tails and unmatched heads, respectively, after the generation of \(\sigma _h\). Let also \(\mathcal E_h\subset \mathcal E_{tot,h}\) denote the set of unmatched tails belonging to vertices not yet exposed, and let \(\mathcal F_h\) be the subset of heads attached to \(\partial \mathcal B^-_h(y)\). By construction, all heads attached to \(\partial \mathcal B^-_h(y)\) must be unmatched at this stage, so that \(\mathcal F_h\subset \mathcal F_{tot,h}\). Moreover,

$$\begin{aligned} \Gamma _h(y)=\sum _{f\in \mathcal F_h}P^h(v_f,y), \end{aligned}$$
(3.21)

where \(v_f\) denotes the vertex to which the head f belongs. To compute \(\Gamma _{h+1}\) given \(\sigma _h\) we let \(\omega :\mathcal E_{tot,h}\mapsto \mathcal F_{tot,h}\) denote a uniform random matching of \(\mathcal E_{tot,h}\) and \(\mathcal F_{tot,h}\), and notice that a vertex z is in \(\partial \mathcal B^-_{h+1}(y)\) if and only if z is revealed by matching one of the heads \(f\in \mathcal F_h\) with one of the tails \(e\in \mathcal E_h\). Therefore,

$$\begin{aligned} \Gamma _{h+1}(y)&=\sum _{e\in \mathcal E_h}\frac{d^-_e}{d^+_e}\sum _{f\in \mathcal F_{h}} P^h(v_f,y)\mathbf {1}_{\omega (e)=f}\nonumber \\&=\sum _{e\in \mathcal E_{tot,h}}c(e,\omega (e)), \end{aligned}$$
(3.22)

where we use the notation \(d^\pm _e\) for the degrees of the vertex to which the tail e belongs, and the function c is defined by

$$\begin{aligned} c(e,f)=\frac{d^-_e}{d^+_e}P^h(v_f,y)\mathbf {1}_{e\in \mathcal E_h,f\in \mathcal F_h}. \end{aligned}$$
(3.23)

Since \(\sigma _h\) is such that \({\textsc {tx}}(\mathcal B^-_{h}(y))\le 1\), we may estimate \(P^h(v_f,y)\) as in (3.9), so that

$$\begin{aligned} \Vert c\Vert _\infty = \max _{e,f}c(e,f)\le 2\Delta \,\delta ^{-h-1}. \end{aligned}$$
(3.24)

We now use a version of Bernstein’s inequality proved by Chatterjee ([12, Proposition 1.1]) which applies to any function of a uniform random matching of the form (3.22). It follows that for any fixed \(\sigma _h\), for any \(s>0\):

$$\begin{aligned}&\mathbb {P}\left( |\Gamma _{h+1}(y)-\mathbb {E}\left[ \Gamma _{h+1}(y)\, |\, \sigma _h\right] |\ge s\, |\, \sigma _h\right) \nonumber \\&\quad \le 2\exp \left( -\frac{s^2}{2\left\| c \right\| _{\infty }(2\mathbb {E}\left[ \Gamma _{h+1}(y)\, |\, \sigma _h\right] +s)} \right) . \end{aligned}$$
(3.25)

Taking \(s=a\mathbb {E}\left[ \Gamma _{h+1}(y)\, |\, \sigma _h\right] \), \(a\in (0,1)\), one has

$$\begin{aligned} \mathbb {P}\left( |\Gamma _{h+1}(y)-\mathbb {E}\left[ \Gamma _{h+1}(y)\, |\, \sigma _h\right] |\ge s\, |\, \sigma _h\right) \le 2\exp \left( -\frac{a^2\mathbb {E}\left[ \Gamma _{h+1}(y)\, |\, \sigma _h\right] }{6\left\| c \right\| _{\infty }} \right) .\nonumber \\ \end{aligned}$$
(3.26)

Since the probability of the event \(\omega (e)=f\) conditioned on \(\sigma _h\) is \(\frac{1}{|\mathcal E_{tot,h}|}=\frac{1}{m}(1+O(\Delta ^h/m))\), we have

$$\begin{aligned} \mathbb {E}\left[ \Gamma _{h+1}(y)\, |\, \sigma _h\right]&= \frac{1}{|\mathcal E_{tot,h}|}\sum _{e\in \mathcal E_h}\frac{d^-_e}{d^+_e}\Gamma _h(y) \nonumber \\&= \frac{1}{m}\left( 1+O(\Delta ^h/m)\right) \left( m-\sum _{e\notin \mathcal E_h}\frac{d^-_e}{d^+_e}\right) \Gamma _h(y) \nonumber \\&= \left( 1+O(\Delta ^h/m)\right) \Gamma _h(y)= \left( 1+O(n^{-1/2})\right) \Gamma _h(y), \end{aligned}$$
(3.27)

for all \(h\in [h_1,\hslash ]\), where we use the fact that the sum over all tails e (matched or unmatched) of \(d^-_e/d^+_e\) equals m. In particular, from Lemma 3.2 it follows that for some constant \(c>0\):

$$\begin{aligned} \mathbb {E}\left[ \Gamma _{h+1}(y)\, |\, \sigma _h\right] \ge c\log ^{-\gamma _0 +1}(n), \end{aligned}$$
(3.28)

and therefore, using (3.24), one finds

$$\begin{aligned} \Vert c\Vert _\infty ^{-1}\mathbb {E}\left[ \Gamma _{h+1}(y)\, |\, \sigma _h\right] \ge \log ^6(n), \end{aligned}$$
(3.29)

for all \(h\ge h_1\), if the constant K in the definition of \(h_1\) is large enough. From (3.26), (3.27) and (3.29) it follows that, letting

$$\begin{aligned} \mathcal A:=\left\{ |\Gamma _{h+1}(y)-\Gamma _{h}(y)|\le a \Gamma _{h}(y)\,, \;\forall h\in [h_1,\hslash ]\right\} , \end{aligned}$$

with \(a:=\log ^{-2}(n)\), then

$$\begin{aligned} \mathbb {P}\left( \mathcal A\right) =1-o(1). \end{aligned}$$
(3.30)

Moreover, on the event \(\mathcal A\), for all \(h\in [h_1,\hslash ]\):

$$\begin{aligned} |\Gamma _{h}(y)-\Gamma _{h_1}(y)|\le \sum _{j=h_1}^{h-1}\left| \Gamma _{j+1}(y)-\Gamma _{j}(y)\right| \le a\sum _{j=h_1}^{h-1}\Gamma _{j}(y)\le a\,\hslash \,(1+a)^{\hslash }\,\Gamma _{h_1}(y)\le \varepsilon \Gamma _{h_1}(y), \end{aligned}$$

where we used \(\Gamma _{j}(y)\le (1+a)^{j-h_1}\Gamma _{h_1}(y)\) on \(\mathcal A\), and \(a\hslash (1+a)^{\hslash }=O(\log ^{-1}(n))\le \varepsilon \) for all n large enough.

\(\square \)

3.3 Lower bound on \(\pi _{\min }\)

If for some \(t\in {\mathbb N} \) and \(a>0\) one has \(P^t(x,y)\ge a\) for all \(x,y\in [n]\), then

$$\begin{aligned} \pi (z) = \sum _{x=1}^n \pi (x)P^t(x,z)\ge a, \end{aligned}$$
(3.31)

and therefore \(\pi _\mathrm{min}\ge a\). We will prove the lower bound on \(P^t(x,y)\) by choosing t of the form \(t=(1+\varepsilon )T_\mathrm{ENT} \), for some small enough \(\varepsilon >0\); see (3.1) for the definition of \(T_\mathrm{ENT} \). More precisely, fix a constant \(\eta >0\), set \(\eta '= 3\eta \frac{H}{\log \delta }\), and define

$$\begin{aligned} t_\star = h_x+h_y+1\,,\quad h_x=(1-\eta )T_\mathrm{ENT} \,,\quad h_y=\eta 'T_\mathrm{ENT} . \end{aligned}$$
(3.32)

Note that \(\eta '\ge 3\eta \) and thus \(t_\star =t_\star (\eta )\ge (1+2\eta )T_\mathrm{ENT} \).

Lemma 3.4

There exists \(\eta _0>0\) such that for all \(\eta \in (0,\eta _0)\):

$$\begin{aligned} \mathbb {P}\left( \forall x,y\in [n],\,\, P^{t_\star +1}(x,y)\ge \tfrac{c}{n}\,\Gamma _{h_y}(y) \right) =1-o(1), \end{aligned}$$
(3.33)

for some constant \(c=c(\eta ,\Delta )>0\).

From (3.31) and Lemma 3.4 it follows that w.h.p. for all y

$$\begin{aligned} \pi (y)\ge \tfrac{c}{n}\,\Gamma _{h_y}(y) . \end{aligned}$$
(3.34)

Lemma 3.2 thus implies, for some new constant \(c>0\)

$$\begin{aligned} \mathbb {P}\left( \pi _\mathrm{min}\ge \tfrac{c}{n}\log ^{1-\gamma _0}(n)\right) =1-o(1), \end{aligned}$$
(3.35)

which settles the lower bound in Theorem 1.3.

To prove Lemma 3.4 we will restrict to a subset of nice paths from x to y. This will allow us to obtain a concentration result for the probability to reach y from x in \(t_\star \) steps.

3.3.1 A concentration result for nice paths

The definition of the nice paths follows a construction introduced in [9], which we now recall. In contrast with [9] however, here we need a lower bound on \(P^{t_\star }(x,y)\) and thus the argument is somewhat different.

Fix \(h_x\) as in (3.32). Following [9, Section 6.2] and [10, Section 4.1], we introduce the rooted directed tree \(\mathcal T(x)\), namely the subgraph of the \(h_x\)-out-neighborhood of x obtained by the following process: initially all tails and heads are unmatched and \(\mathcal T(x)\) is identified with its root, x; throughout the process, we let \(\partial _+\mathcal T(x)\) (resp. \(\partial _-\mathcal T(x)\)) denote the set of unmatched tails (resp. heads) whose endpoint belongs to \(\mathcal T(x)\); the height \(\mathbf {h}(e)\) of a tail \(e\in \partial _+\mathcal T(x)\) is defined as 1 plus the number of edges in the unique path in \(\mathcal T(x)\) from x to the endpoint of e; the weight of \(e\in \partial _+\mathcal T(x)\) is defined as

$$\begin{aligned} \mathbf {w}^+(e) = \prod _{i=0}^{\mathbf {h}(e)-1}\frac{1}{d_{x_i}^+}\,, \end{aligned}$$
(3.36)

where \((x=x_0,x_1,\dots ,x_{\mathbf {h}(e)-1})\) denotes the path in \(\mathcal T(x)\) from x to the endpoint of e. Hence, at the beginning \(\partial _+\mathcal T(x)\) is the set of tails at x, which all have height 1 and weight \(1/d_{x}^+\). We then iterate the following steps:

  • A tail \(e\in \partial _+\mathcal T(x)\) is selected with maximal weight among all \(e\in \partial _+\mathcal T(x)\) with \(\mathbf {h}(e) \le h_x-1\) and \(\mathbf {w}^+(e) \ge \mathbf {w}_{\min }:=n^{-1+\eta ^2}\) (using an arbitrary ordering of the tails to break ties);

  • e is matched to a uniformly chosen unmatched head f, forming the edge ef;

  • If f was not in \(\partial _-\mathcal T(x)\), then its endpoint and the edge ef are added to \(\mathcal T(x)\).

The process stops when there are no tails \(e\in \partial _+\mathcal T(x)\) with height \(\mathbf {h}(e) \le h_x-1\) and weight \(\mathbf {w}^+(e)\ge \mathbf {w}_{\min }\). The third item above guarantees that \(\mathcal T(x)\) remains a directed tree at each step. The final value of \(\mathcal T(x)\) represents the desired directed tree. Notice that this construction applies only to the out-neighborhood of a vertex; a different procedure will be used for the in-neighborhood of a vertex (see the text preceding (3.38) below).
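
Purely as an illustration of the selection rule above (the actual generation is the one just described, and the names below are ours), the construction of \(\mathcal T(x)\) can be phrased as a greedy, priority-queue procedure in which matching a tail to a uniformly chosen unmatched head is abstracted by a callback `match_tail`:

```python
import heapq

def explore_out_tree(x, h_x, w_min, d_plus, match_tail):
    """Sketch of the generation of T(x): repeatedly select the boundary tail of
    maximal weight among those of height <= h_x - 1 and weight >= w_min, match it
    to a uniform unmatched head (abstracted by `match_tail`), and add the revealed
    endpoint to the tree only if it is not already there."""
    in_tree = {x}
    # boundary tails stored as (-weight, height, owner vertex);
    # the tails of the root have height 1 and weight 1 / d^+_x
    heap = [(-1.0 / d_plus[x], 1, x)] * d_plus[x]
    heapq.heapify(heap)
    while heap:
        neg_w, h, v = heapq.heappop(heap)
        w = -neg_w
        if h > h_x - 1 or w < w_min:
            continue                      # such a tail is never selected
        u = match_tail(v)                 # owner of the uniformly chosen head
        if u in in_tree:
            continue                      # edge revealed but not added to T(x)
        in_tree.add(u)
        for _ in range(d_plus[u]):        # the tails of the new vertex u
            heapq.heappush(heap, (-w / d_plus[u], h + 1, u))
    return in_tree
```

The priority queue simply reproduces the rule "select a tail of maximal weight among the admissible ones"; tails violating the height or weight constraint are discarded when popped, since they would never be selected.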

After the generation of the tree \(\mathcal T(x)\) a total number \(\kappa \) of edges has been revealed, some of which may not belong to \(\mathcal T(x)\). As in [10, Lemma 7], it is not difficult to see that when exploring the out-neighborhood of x in this way the random variable \(\kappa \) is deterministically bounded as

$$\begin{aligned} \kappa \le n^{1-\frac{\eta ^2}{2}}. \end{aligned}$$
(3.37)

At this stage, let us call \(\mathcal E^*(x)\) the set of unmatched tails \(e\in \partial _+\mathcal T(x)\) such that \(\mathbf {h}(e)=h_x\).

Definition 3.5

A path \(\mathbf{p}=(x_0=x,x_1,\dots ,x_{t_\star }=y)\) of length \(t_\star \) starting at x and ending at y is called nice if it satisfies:

  (1) The first \(h_x\) steps of \(\mathbf{p}\) are contained in \(\mathcal T(x)\), and satisfy

    $$\begin{aligned} \prod _{i=0}^{h_x}\frac{1}{d_{x_i}^+}\le n^{2\eta -1}; \end{aligned}$$

  (2) \(x_{h_x+1}\in \partial \mathcal B^-_{h_y}(y)\).

We recall that \(h_y\) is defined as in (3.32), and refer to Sect. 2.2 for the definitions of \(\mathcal B^-_{h_y}(y)\) and \(\partial \mathcal B^-_{h_y}(y)\). To obtain a useful expression for the probability of going from x to y along a nice path, we need to generate \(\mathcal B^-_{h_y}(y)\). To this end, assume that \(\kappa \) edges in the \(h_x\)-out-neighborhood of x have been already sampled according to the procedure described above, and then sample \(\mathcal B^-_{h_y}(y)\) according to the sequential generation described in Sect. 2. Some of the matchings producing \(\mathcal B^-_{h_y}(y)\) may have already been revealed during the previous stage. In any case, this second stage creates an additional random number \(\tau \) of edges, satisfying the crude bound \(\tau \le \Delta ^{h_y+1}\). We call \(\mathcal F_\mathrm{tot}\) the set of unmatched heads, and \(\mathcal E_\mathrm{tot}\) the set of unmatched tails after the sampling of these \(\kappa +\tau \) edges. Consider the set \(\mathcal F^0:= \mathcal F_{h_y}\cap \mathcal F_\mathrm{tot}\), where \(\mathcal F_{h_y}\) denotes the set of all heads (matched or unmatched) attached to vertices in \(\partial \mathcal B^{-}_{h_y}(y)\). Moreover, call \(\mathcal E^0:= \mathcal E^*(x)\cap \mathcal E_\mathrm{tot}\) the subset of unmatched tails which are attached to vertices at height \(h_x\) in \(\mathcal T(x)\). Finally, complete the generation of the digraph by matching the \(m-\kappa -\tau \) unmatched tails \(\mathcal E_\mathrm{tot}\) to the \(m-\kappa -\tau \) unmatched heads \(\mathcal F_\mathrm{tot}\) using a uniformly random bijection \(\omega :\mathcal E_\mathrm{tot}\mapsto \mathcal F_\mathrm{tot}\). For any \(f\in \mathcal F_{h_y}\) we introduce the notation

$$\begin{aligned} \mathbf {w}^-(f):=P^{h_y}(v_f,y), \end{aligned}$$
(3.38)

where \(v_f\) denotes the vertex \(v\in \partial \mathcal B^{-}_{h_y}(y)\) such that \(f\in E_v^-\). With the notation introduced above, the probability to go from x to y in \(t_\star \) steps following a nice path can now be written as

$$\begin{aligned} P_{0,t_\star }(x,y):=\sum _{e\in \mathcal E^0}\sum _{f\in \mathcal F^0}\mathbf {w}^+(e)\mathbf {w}^-(f) \mathbf {1}_{\omega (e)=f}\mathbf {1}_{\mathbf {w}^+(e)\le n^{2\eta -1}}. \end{aligned}$$
(3.39)
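
As a minimal illustration (with hypothetical names: `omega`, `w_plus`, `w_minus` stand for \(\omega \), \(\mathbf {w}^+\), \(\mathbf {w}^-\) stored as dictionaries), the quantity (3.39) is just a sum over the tails of \(\mathcal E^0\) matched into \(\mathcal F^0\):

```python
def nice_path_probability(omega, w_plus, w_minus, E0, F0, threshold):
    """Evaluate the sum (3.39): for every tail e in E^0 matched by omega to a head
    in F^0, add w^+(e) * w^-(omega(e)), keeping only tails whose weight does not
    exceed the threshold n^(2*eta - 1)."""
    total = 0.0
    for e in E0:
        f = omega[e]
        if f in F0 and w_plus[e] <= threshold:
            total += w_plus[e] * w_minus[f]
    return total
```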

Note that, conditionally on the construction of the first \(\kappa +\tau \) edges described above, each Bernoulli random variable \( \mathbf {1}_{\omega (e)=f}\) appearing in the above sum has probability of success at least 1/m. In particular, if \(\sigma \) denotes a fixed realization of the \(\kappa +\tau \) edges, then

$$\begin{aligned} \mathbb {E}\left[ P_{0,t_\star }(x,y)\, |\, \sigma \right] \ge \frac{1}{m}\,A_{x,y}(\sigma )B_{x,y}(\sigma )\,, \end{aligned}$$
(3.40)

where

$$\begin{aligned} A_{x,y}(\sigma ):=\sum _{e\in \mathcal E^0} \mathbf {1}_{\mathbf {w}^+(e)\le n^{2\eta -1}}\mathbf {w}^+(e)\,,\quad B_{x,y}(\sigma ):=\sum _{f\in \mathcal F^0}\mathbf {w}^-(f). \end{aligned}$$
(3.41)

Moreover, the probability of \(\omega (e)=f\) for any fixed \(e\in \mathcal E^0,f\in \mathcal F^0\) is at most \(1/(m-\kappa -\tau )\), so that

$$\begin{aligned} \mathbb {E}\left[ P_{0,t_\star }(x,y)\, |\, \sigma \right] \le \frac{(1+o(1))}{m}\,A_{x,y}(\sigma )B_{x,y}(\sigma )\le \frac{(1+o(1))}{m}\,\Gamma _{h_y}(y), \end{aligned}$$
(3.42)

where we use \(A_{x,y}\le 1\) and \(B_{x,y}\le \Gamma _{h_y}(y)\). Recall the definition of the tree excess, \({\textsc {tx}}\), in (2.3) and consider the event

$$\begin{aligned} \mathcal Y_{x,y}=\Big \{\sigma :\; A_{x,y}(\sigma ) \ge \tfrac{1}{2}\,,\; B_{x,y}(\sigma ) \ge \log ^{-\gamma _0}(n)\,, \,{\textsc {tx}}(\mathcal B_{h_y}^-(y))\le 1 \Big \}, \end{aligned}$$
(3.43)

where the exponent \(-\gamma _0\) is chosen for convenience only and any exponent \(-c\) with \(c>\gamma _0-1\) would be as good.

Lemma 3.6

There exists \(\eta _0>0\) such that for all \(\eta \in (0,\eta _0)\), for any \(\sigma \in \mathcal Y_{x,y}\), any \(a\in (0,1)\):

$$\begin{aligned} \mathbb {P}\left( |P_{0,t_\star }(x,y)-\mathbb {E}\left[ P_{0,t_\star }(x,y)\, |\, \sigma \right] |\ge a\mathbb {E}\left[ P_{0,t_\star }(x,y)\, |\, \sigma \right] \, |\, \sigma \right) \le 2\exp \left( -a^2n^{\eta /2}\right) \end{aligned}$$
(3.44)

Proof

Conditioned on \(\sigma \), \(P_{0,t_\star }(x,y)\) is a function of the uniform random permutation \(\omega :\mathcal E_\mathrm{tot}\mapsto \mathcal F_\mathrm{tot}\),

$$\begin{aligned} P_{0,t_\star }(x,y) =\sum _{e\in \mathcal E_\mathrm{tot}} c(e,\omega (e))\,,\quad c(e,f)=\mathbf {w}^+(e)\mathbf {w}^-(f)\mathbf {1}_{\mathbf {w}^+(e)\le n^{2\eta -1}}\mathbf {1}_{e\in \mathcal E^0,f\in \mathcal F^0}. \end{aligned}$$
(3.45)

Since we are assuming \({\textsc {tx}}(\mathcal B_{h_y}^-(y))\le 1\), we can use (3.9) to estimate \(\mathbf {w}^-(f)\le 2 \delta ^{-h_y}=n^{-3\eta }\) for any \(f\in \mathcal F^0\). Therefore

$$\begin{aligned} \Vert c \Vert _{\infty }=\max _{e,f}c(e,f)\le 2n^{-1-\eta }. \end{aligned}$$
(3.46)

As in Lemma 3.3, and as in [9], we use Chatterjee’s concentration inequality for uniform random matchings [12, Proposition 1.1] to obtain for any \(s>0\):

$$\begin{aligned}&\mathbb {P}\left( |P_{0,t_\star }(x,y)-\mathbb {E}\left[ P_{0,t_\star }(x,y)\, |\, \sigma \right] | \ge s\, |\, \sigma \right) \nonumber \\&\quad \le 2\exp \left( -\frac{s^2}{2\left\| c \right\| _{\infty }(2\mathbb {E}\left[ P_{0,t_\star }(x,y)\, |\, \sigma \right] +s)} \right) . \end{aligned}$$
(3.47)

Taking \(s=a\mathbb {E}\left[ P_{0,t_\star }(x,y)\, |\, \sigma \right] \), \(a\in (0,1)\), one has

$$\begin{aligned} \mathbb {P}\left( |P_{0,t_\star }(x,y)-\mathbb {E}\left[ P_{0,t_\star }(x,y)\, |\, \sigma \right] |\ge s\, |\, \sigma \right) \le 2\exp \left( -\frac{a^2\mathbb {E}\left[ P_{0,t_\star }(x,y)\, |\, \sigma \right] }{6\left\| c \right\| _{\infty }} \right) . \end{aligned}$$
(3.48)

Using (3.40), (3.43), and (3.46), for \(\sigma \in \mathcal Y_{x,y}\) one has \(\mathbb {E}\left[ P_{0,t_\star }(x,y)\, |\, \sigma \right] \ge \tfrac{1}{2m}\log ^{-\gamma _0}(n)\) and \(\left\| c \right\| _{\infty }\le 2n^{-1-\eta }\), so that the exponent in (3.48) is at least \(\tfrac{a^2}{24}\,\tfrac{n}{m}\,n^{\eta }\log ^{-\gamma _0}(n)\ge a^2n^{\eta /2}\) for all n large enough. Hence (3.44) holds for all \(\sigma \in \mathcal Y_{x,y}\). \(\square \)

3.3.2 Proof of Lemma 3.4

Let \(V_*\) denote the set of all \(z\in [n]\) such that \(\mathcal B_\hslash ^+(z)\) is a directed tree. It is an immediate consequence of Proposition 2.1 that with high probability, for all \(x\in [n]\):

$$\begin{aligned} P(x,V_*)=\sum _{z\in V_*}P(x,z) \ge \tfrac{1}{2}. \end{aligned}$$
(3.49)

In fact, as observed in [9, Proposition 6], one can show that with high probability \(P^\ell (x,V_*)\ge 1-2^{-\ell }\) for any fixed \(\ell \in {\mathbb N} \). Therefore,

$$\begin{aligned} P^{t_\star +1}(x,y)\ge \tfrac{1}{2}\min _{x'\in V_*}P^{t_\star }(x',y). \end{aligned}$$
(3.50)

Since \(P^{t_\star }(x,y)\ge P_{0,t_\star }(x,y)\) it is sufficient to prove

$$\begin{aligned} \mathbb {P}\left( \forall x\in V_*, \forall y\in [n],\,\, P_{0,t_\star }(x,y)\ge \tfrac{c}{n}\,\Gamma _{h_y}(y) \right) =1-o(1), \end{aligned}$$
(3.51)

for some constant \(c=c(\eta ,\Delta )>0\). The proof of (3.51) is based on Lemma 3.6 and the following two lemmas, which allow us to make sure the events \(\mathcal Y_{x,y}\) in Lemma 3.6 have large probability.

Lemma 3.7

The event \(\mathcal A_1= \{\forall x\in V_*, \forall y\in [n]: A_{x,y} \ge \tfrac{1}{2}\}\) has probability

$$\begin{aligned} \mathbb {P}\left( \mathcal A_1 \right) =1-o(1)\,. \end{aligned}$$

Proof

Let us first note that the event \(\widehat{\mathcal A}_1= \{\forall x\in V_*: \sum _{e\in \mathcal E^*(x)}\mathbf {w}^+(e)\mathbf {1}_{\mathbf {w}^+(e)\le n^{2\eta -1}}\ge 0.9\}\) satisfies

$$\begin{aligned} \mathbb {P}\left( \widehat{\mathcal A}_1 \right) =1-o(1). \end{aligned}$$

Indeed, this fact is a consequence of [9, 10], which established that for any \(\varepsilon >0\), with high probability

$$\begin{aligned} \min _{x\in V_*}\sum _{e\in \mathcal E^*(x)}\mathbf {w}^+(e)\mathbf {1}_{\mathbf {w}^+(e)\le n^{2\eta -1}}\ge 1-\varepsilon , \end{aligned}$$
(3.52)

see e.g. [10, Theorem 4 and Lemma 11]. Thus, it remains to show that replacing \(\mathcal E^*(x)\) with \(\mathcal E^0\) does not alter much the sum. Suppose the \(\kappa \) edges generating \(\mathcal T(x)\) have been revealed and then sample the \(\tau \) edges generating the neighborhood \(\mathcal B_{h_y}^-(y)\). Let K denote the number of collisions between \(\mathcal T(x)\) and \(\mathcal B_{h_y}^-(y)\). There are at most \(N:=\Delta ^{h_y}=n^{3\eta \log \Delta /\log \delta }\) attempts each with success probability at most \(p:=\kappa /(m-\kappa )\). Thus K is stochastically dominated by a binomial \(\mathrm{Bin}(N,p)\), and therefore by Hoeffding’s inequality

$$\begin{aligned} \mathbb {P}(K>Np+N)\le \exp {\left( -2N\right) } \le \exp {\left( -n^{3\eta }\right) }. \end{aligned}$$

Thus by a union bound we may assume that all pairs x, y are such that the corresponding collision count K satisfies \(K\le Np+N\le 2N\). Therefore, on the event \(\widehat{\mathcal A}_1\)

$$\begin{aligned} \sum _{e\in \mathcal E^0}\mathbf {w}^+(e)\mathbf {1}_{\mathbf {w}^+(e)\le n^{2\eta -1}} \ge 0.9- 2N \,n^{2\eta -1} \ge \frac{1}{2}, \end{aligned}$$

if \(\eta \) is small enough. \(\square \)

Lemma 3.8

Fix a constant \(c>0\) and consider the event \(\mathcal A_2= \{\forall x,y\in [n]: B_{x,y} \ge c\,\Gamma _{h_y}(y)\}\). If \(c>0\) is small enough

$$\begin{aligned} \mathbb {P}\left( \mathcal A_2 \right) =1-o(1)\,. \end{aligned}$$

Proof

By definition, \(\sum _{f\in \mathcal F_{h_y}}\mathbf {w}^-(f)=\Gamma _{h_y}(y)\). Thus, we need to show that restricting the sum from \(\mathcal F_{h_y}\) to \(\mathcal F^0\) leaves \(B_{x,y}\) comparable to \(\Gamma _{h_y}(y)\). For any constant \(T>0\), for each \(z\in \partial \mathcal B_{h_y-T}^-(y)\), let \(V_z\) denote the set of \(w\in \partial \mathcal B_{h_y}^-(y)\) such that \(d(w,z)=T\). Notice that if the event \(\mathcal G(\hslash )\) from Proposition 2.1 holds then for each \(z\in \partial \mathcal B_{h_y-T}^-(y)\) one has \(|V_z|\ge \frac{1}{2}\delta ^T\). Consider the generation of the \(\kappa +\tau \) edges as above, and call a vertex \(z\in \partial \mathcal B_{h_y-T}^-(y)\) bad if all heads attached to \(V_z\) are matched, or equivalently if none of these heads is in \(\mathcal F_\mathrm{tot}\). Given a \(z\in \partial \mathcal B_{h_y-T}^-(y)\), we want to estimate the probability that it is bad. To this end, we use the same construction given in Sect. 3.3.1 but this time we first generate the in-neighborhood \(\mathcal B_{h_y}^-(y)\) and then the tree \(\mathcal T(x)\). Let K denote the number of collisions between \(\mathcal T(x)\) and the set \(V_z\). Notice that \(|V_z|\le \Delta ^T\) and that \(|\mathcal T(x)|\le n^{1-\eta ^2/2}\), so that K is stochastically dominated by the binomial \(\mathrm{Bin}(N,p)\) where \(N=n^{1-\eta ^2/2}\) and \(p=\Delta ^{T+1}/n\). Therefore,

$$\begin{aligned} \mathbb {P}\left( K>\tfrac{1}{2}\delta ^T\right) \le (Np)^{\frac{1}{2}\delta ^T}\le \left( \Delta ^{T+1}n^{-\eta ^2/2}\right) ^{\frac{1}{2}\delta ^T}. \end{aligned}$$

Since \(|V_z|\ge \frac{1}{2}\delta ^T\), if z is bad then \(K>\frac{1}{2}\delta ^T\) and thus the probability of the event that z is bad is at most \(O(n^{-\delta ^T\eta ^2/4})\). The probability that there exists a bad \(z\in \partial \mathcal B_{h_y-T}^-(y)\) is then bounded by \(O(\Delta ^{h_y}n^{-\delta ^T\eta ^2/4})\). In conclusion, if \(T=T(\eta )\) is a large enough constant, we can ensure that for any \(y\in [n]\) the probability that there exists a bad \(z\in \partial \mathcal B_{h_y-T}^-(y)\) is \(o(n^{-2})\), and therefore, by a union bound, with high probability there are no bad \(z\in \partial \mathcal B_{h_y-T}^-(y)\), for all \(x,y\in [n]\). On this event, for all z we may select one vertex \(w\in V_z\) with at least one head \(f\in \mathcal F^0\) attached to it. Notice that \(\mathbf {w}^-(f)\ge \Delta ^{-T}P^{h_y-T}(z,y)\), since the walk started at w can follow the length-T path from w to z and then reach y in the remaining \(h_y-T\) steps. Therefore, assuming that there are no bad \(z\in \partial \mathcal B_{h_y-T}^-(y)\):

$$\begin{aligned} B_{x,y}(\sigma )&=\sum _{f\in \mathcal F^0}\mathbf {w}^-(f) \\&\ge \Delta ^{-T}\sum _{z\in \partial \mathcal B_{h_y-T}^-(y)}P^{h_y-T}(z,y) \ge \Delta ^{-T-1}\Gamma _{h_y-T}(y). \end{aligned}$$

From Lemma 3.3 we may finish with the estimate \(\Gamma _{h_y-T}(y)\ge \frac{1}{2} \Gamma _{h_y}(y)\). \(\square \)

We can now conclude the proof of (3.51). Consider the event

$$\begin{aligned} \mathcal A=\mathcal A_1\cap \mathcal A_2\cap \mathcal G(\hslash )\cap \mathcal R, \end{aligned}$$
(3.53)

where \(\mathcal R\) denotes the event from Lemma 3.2 and \(\mathcal G(\hslash )\) is given by (2.4) and (2.5). For any \(s>0\),

$$\begin{aligned}&\mathbb {P}\left( \forall x,y\in [n],\,\, P_{0,t_\star }(x,y) \ge \tfrac{s}{n}\,\Gamma _{h_y}(y) \right) \nonumber \\&\quad \ge \mathbb {P}(\mathcal A) - \sum _{x,y\in [n]}\mathbb {P}\left( P_{0,t_\star }(x,y)<\tfrac{s}{n}\,\Gamma _{h_y}(y); \mathcal A\right) , \end{aligned}$$
(3.54)

where the semicolon represents intersection of events. From Lemma 3.2, Lemma 3.7, Lemma 3.8, and Proposition 2.1 it follows that \(\mathbb {P}(\mathcal A) =1-o(1)\). Let \(\mathcal W_{x,y}\) denote the event

$$\begin{aligned} \mathbb {E}\left[ P_{0,t_\star }(x,y)\, |\, \sigma \right] \ge \tfrac{c}{2m}\,\Gamma _{h_y}(y), \end{aligned}$$
(3.55)

where c is the constant from Lemma 3.8. By definition of the events involved

$$\begin{aligned} \mathcal A\subset \mathcal W_{x,y}\cap \mathcal Y_{x,y}, \end{aligned}$$

for all x, y and for all n large enough. Therefore,

$$\begin{aligned} \mathbb {P}\left( P_{0,t_\star }(x,y)< \tfrac{s}{n}\,\Gamma _{h_y}(y); \mathcal A\right) \le \sup _{\sigma \in \mathcal W_{x,y}\cap \mathcal Y_{x,y}} \mathbb {P}\left( P_{0,t_\star }(x,y)< \tfrac{s}{n}\,\Gamma _{h_y}(y)\, |\, \sigma \right) . \end{aligned}$$
(3.56)

Taking \(s=c/(4\Delta )\) and using (3.55), we see that for every \(\sigma \in \mathcal W_{x,y}\), \(P_{0,t_\star }(x,y)< \frac{s}{n}\,\Gamma _{h_y}(y)\) implies:

$$\begin{aligned} |P_{0,t_\star }(x,y)-\mathbb {E}\left[ P_{0,t_\star }(x,y)\, |\, \sigma \right] |\ge \frac{1}{2}\,\mathbb {E}\left[ P_{0,t_\star }(x,y)\, |\, \sigma \right] , \end{aligned}$$

and therefore from Lemma 3.6

$$\begin{aligned} \sup _{\sigma \in \mathcal W_{x,y}\cap \mathcal Y_{x,y}} \mathbb {P}\left( P_{0,t_\star }(x,y)< \tfrac{s}{n}\,\Gamma _{h_y}(y)\, |\, \sigma \right) =o(n^{-2}). \end{aligned}$$
(3.57)

The bounds (3.54) and (3.57) complete the proof of (3.51), and therefore the proof of Lemma 3.4. \(\square \)

Remark 3.9

Let us show that if the type \((\delta _-,\Delta _+)\) is not in the set of linear types \(\mathcal L\) one can improve the lower bound on \(\pi _\mathrm{min}\) as mentioned in Remark 1.5. The proof given above shows that it is sufficient to replace \(\gamma _0\) by \(\gamma '_0\) in Lemma 3.2, where \(\gamma '_0\) is defined by (1.20). To this end, for any \(\varepsilon >0\), let \(\mathcal L_\varepsilon \) denote the set of types \((k,\ell )\in \mathcal C\) such that

$$\begin{aligned} \limsup _{n\rightarrow \infty }\frac{|\mathcal V_{k,\ell }|}{n^{1-\varepsilon }}=+\infty \,, \end{aligned}$$
(3.58)

where \(\mathcal V_{k,\ell }\) denotes the set of vertices of type \((k,\ell )\), and define

$$\begin{aligned} \gamma '_\varepsilon :=\frac{\log \Delta '_{\varepsilon ,+}}{\log \delta '_{\varepsilon ,-}},\quad \Delta '_{\varepsilon ,+} := \max \{\ell :\; (k,\ell )\in \mathcal L_\varepsilon \} \,,\quad \delta '_{\varepsilon ,-} := \min \{k:\; (k,\ell )\in \mathcal L_\varepsilon \}. \end{aligned}$$
(3.59)

The main observation is that if \((k,\ell )\notin \mathcal L_\varepsilon \), then w.h.p. there is only a bounded number of vertices of type \((k,\ell )\) in each of the in-neighborhoods \(\mathcal B^-_{h_0}(y)\), \(y\in [n]\), for any \(h_0=O(\log \log n)\). Indeed, for a fixed \(y\in [n]\) the number of \(v\in \mathcal V_{k,\ell }\cap \mathcal B^-_{h_0}(y) \) is stochastically dominated by the binomial \(\mathrm{Bin}\left( \Delta ^{h_0},n^{-\varepsilon /2} \right) \), and therefore if \(K=K(\varepsilon )\) is a sufficiently large constant then the probability of having more than K such vertices is bounded by \((\Delta ^{h_0}n^{-\varepsilon /2})^K=o(n^{-1})\). Taking a union bound over \(y\in [n]\) shows that w.h.p. all \(\mathcal B^-_{h_0}(y)\), \(y\in [n]\), have at most K vertices of type \((k,\ell )\). Then we may repeat the argument of Lemma 3.2 with this constraint, to obtain that for all \(\varepsilon >0\), w.h.p. \(\Gamma _{h_y}(y)\ge c(\varepsilon ) \log ^{1-\gamma '_\varepsilon }(n)\). Since the number of types is finite, one concludes that if \(\varepsilon \) is small enough then \(\gamma '_0=\gamma '_\varepsilon \) and the desired conclusion follows.

3.4 Upper bound on \(\pi _{\min }\)

In this section we prove the upper bound for \(\pi _\mathrm{min}\) in Theorem 1.3 by establishing the estimate in (1.18). We first show that we can replace \(\pi (y)\) in (1.18) by a more convenient quantity. Define the distances

$$\begin{aligned} d(s)=\max _{x\in [n]}\Vert P^{s}(x,\cdot )-\pi \Vert _\texttt {TV},\quad \bar{d}(s) = \max _{x,y\in [n]}\Vert P^{s}(x,\cdot )-P^{s}(y,\cdot )\Vert _\texttt {TV}. \end{aligned}$$
(3.60)

It is standard that, for all \(k,s\in {\mathbb N} \),

$$\begin{aligned} d(ks)\le \bar{d}(ks) \le \bar{d}(s)^k \le 2^kd(s)^k, \end{aligned}$$
(3.61)

see e.g. [22]. In particular, defining

$$\begin{aligned} \lambda _t(y)=\frac{1}{n}\sum _{x\in [n]}P^t(x,y)\,, \end{aligned}$$
(3.62)

for any \(k\in {\mathbb N} \), setting \(t=2kT_\mathrm{ENT} \), one has

$$\begin{aligned} \max _{y\in [n]}|\lambda _{t}(y)- \pi (y)|\le d(2kT_\mathrm{ENT} )\le 2^{k}d(2T_\mathrm{ENT} )^k. \end{aligned}$$
(3.63)

From (3.2) we know that \(d(2T_\mathrm{ENT} )\rightarrow 0\) in probability, or equivalently that for all fixed \(\varepsilon \in (0,1)\) we have \(d(2T_\mathrm{ENT} )\le \varepsilon \) w.h.p. In particular we can choose \(\varepsilon =\frac{1}{2e}\), so that w.h.p. the right hand side above is at most \(e^{-k}\). If \(k=\Theta (\log ^2(n))\) we can safely replace \(\pi (y)\) with \(\lambda _t(y)\) in (1.18). Thus, it suffices to prove the following statement.

Lemma 3.10

For some constants \(\beta >0\), \(C>0\), and for any \(t=t_n=\Theta (\log ^3(n))\):

$$\begin{aligned} \mathbb {P}\Big (\exists S\subset [n],\,|S|\ge n^\beta \,,\; n \max _{y\in S}\lambda _{t}(y)\le C\,\log ^{1-\gamma _1}(n) \Big )=1-o(1). \end{aligned}$$
(3.64)

Proof

Let \((\delta _*,\Delta _*)\in \mathcal L\) denote a type realizing the maximum in the definition of \(\gamma _1\); see (1.16). Let \(V_*=\mathcal V_{\delta _*,\Delta _*}\) denote the set of vertices of this type, and let \(\alpha _*\in (0,1)\) be a constant such that \(|V_*|\ge \alpha _* n\), for all n large enough. Let us fix a constant \(\beta _1\in (0,\tfrac{1}{4})\). This will be related to the constant \(\beta \), but we shall not look for the optimal exponent \(\beta \) in the statement (3.64). Consider the first \(N_1:=n^{\beta _1}\) vertices in the set \(V_*\), and call them \(y_1,\dots ,y_{N_1}\). Next, generate sequentially the in-neighborhoods \(\mathcal B^-_{h_0}(y_i)\), \(i=1,\dots ,N_1\), where

$$\begin{aligned} h_0=\log _{\delta _*}\log n - C_0, \end{aligned}$$
(3.65)

for some constant \(C_0\) to be fixed later. As in the proof of Lemma 3.2 we couple the \(\mathcal B^-_{h_0}(y_i)\) with independent random trees \(Y_i\) rooted at \(y_i\). For each \(\mathcal B^-_{h_0}(y_i)\) the probability of failing to equal \(Y_i\), conditionally on the previous generations, is uniformly bounded above by \(p:=N_1\Delta ^{2h_0}/m\). Let \(\mathcal A\) denote the event that all \(\mathcal B^-_{h_0}(y_i)\) are successfully coupled to the \(Y_i\)’s and that they have no intersections. Therefore,

$$\begin{aligned} \mathbb {P}(\mathcal A)\ge 1-O(N_1p)\ge 1-O(n^{3\beta _1-1})=1-o(1). \end{aligned}$$
(3.66)

Consider now a single random tree \(Y_1\). We say that \(Y_1\) is unlucky if all labels of the vertices in the tree are of type \((\delta _*,\Delta _*)\). The probability that \(Y_1\) is unlucky is at least

$$\begin{aligned} q=\left( \frac{\alpha _* n \Delta _*}{m}\right) ^{\delta _*^{h_0}}\ge n^{-\eta }, \end{aligned}$$

where \(\eta =\delta _*^{-C_0}\log (\Delta /2\alpha _*)\) if \(C_0\) is the constant in (3.65). We choose \(C_0\) so large that \(0<\eta \le \beta _1/4\). Call \(S_1\) the set of \(y_i\), \(i=1,\dots ,N_1\), such that \(Y_i\) is unlucky. Since the \(Y_i\) are i.i.d., the probability that \(|S_1|<n^{\beta _1/2}\) is bounded by the probability that \(\mathrm{Bin}(N_1,q)< n^{\beta _1/2}\), which by Hoeffding’s inequality is at most

$$\begin{aligned} \exp {\left( -n^{\beta _1/3}\right) } \end{aligned}$$
(3.67)
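
For concreteness, this is the standard lower-tail computation: since \(N_1q\ge n^{\beta _1}\,n^{-\beta _1/4}=n^{3\beta _1/4}\gg n^{\beta _1/2}\), Hoeffding's inequality gives

$$\begin{aligned} {\mathbb P} \big (\mathrm{Bin}(N_1,q)< n^{\beta _1/2}\big )\le \exp \Big (-\frac{2(N_1q-n^{\beta _1/2})^2}{N_1}\Big )\le \exp \big (-\tfrac{1}{2}n^{\beta _1/2}\big )\le \exp \big (-n^{\beta _1/3}\big ), \end{aligned}$$

for all n large enough.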

Fix a realization \(\sigma \) of the in-neighborhoods \(\mathcal B^-_{h_0}(y_i)\), \(i=1,\dots ,N_1\). Say that \(y_i\) is unlucky if all vertices in \(\mathcal B^-_{h_0}(y_i)\) are of type \((\delta _*,\Delta _*)\). Thanks to (3.66) we may assume that \(\sigma \in \mathcal A\), i.e. \( \mathcal B^-_{h_0}(y_i)=Y_i\) for all i so that the set of unlucky \(y_i\) coincides with \(S_1\), and thanks to (3.67) we may also assume that \(\sigma \) is such that \(|S_1|\ge \bar{N}:=n^{\beta _1/2}\). We call \(\mathcal A'\subset \mathcal A\) the set of all \(\sigma \in \mathcal A\) satisfying the latter requirement. Let \(\bar{S}\) denote the first \(\bar{N}\) elements in \(S_1\). We are going to show that uniformly in \(\sigma \in \mathcal A'\), for a sufficiently large constant \(C>0\), any \(t=\Theta (\log ^3(n))\),

$$\begin{aligned} \mathbb {P}\Big (\sum _{y\in \bar{S}}\lambda _t(y) > \tfrac{C\bar{N}}{2n}\log ^{1-\gamma _1}(n) \,\Big |\, \sigma \Big ) =o(1). \end{aligned}$$
(3.68)

Notice that (3.68) says that, conditionally on a fixed \(\sigma \in \mathcal A'\), with high probability

$$\begin{aligned} \sum _{y\in \bar{S}}\lambda _t(y) \le \tfrac{C\bar{N}}{2n}\log ^{1-\gamma _1}(n), \end{aligned}$$

which implies that there are at most \(\bar{N}/2\) vertices \(y\in \bar{S}\) with the property that \( \lambda _t(y) > \frac{C}{n}\log ^{1-\gamma _1}(n)\). Summarizing, the above arguments and (3.68) allow one to conclude the unconditional statement that with high probability there are at least \(\frac{1}{2}n^{\beta _1/2}\) vertices \(y\in [n]\) such that

$$\begin{aligned} \lambda _t(y) \le \tfrac{C}{n}\log ^{1-\gamma _1}(n), \end{aligned}$$

which implies the desired claim (3.64), taking e.g. \(\beta =\beta _1/3\).

To prove (3.68), consider the sum

$$\begin{aligned} \mathcal X= \sum _{y\in \bar{S}}\lambda _t(y). \end{aligned}$$
(3.69)

We first establish that, uniformly in \(\sigma \in \mathcal A'\), for any \(t=\Theta (\log ^3(n))\),

$$\begin{aligned} \mathbb {E}\left( \mathcal X\, |\, \sigma \right) =(1+o(1)) \frac{\delta _*}{m} \bar{N} \Delta _*^{-h_0}\delta _*^{h_0}. \end{aligned}$$
(3.70)

If y is unlucky then \(P^{h_0}(z,y)=\Delta _*^{-h_0}\) for any \(z\in \partial \mathcal B^-_{h_0}(y)\). Hence, for any \(y\in \bar{S}\):

$$\begin{aligned} \lambda _t(y) = \frac{\Delta _*^{-h_0}}{n}\sum _{x\in [n]}\sum _{z\in \partial \mathcal B^-_{h_0}(y)}P^{t-h_0}(x,z) = \Delta _*^{-h_0} \sum _{z\in \partial \mathcal B^-_{h_0}(y)}\lambda _{t-h_0}(z). \end{aligned}$$

Since \(|\partial \mathcal B^-_{h_0}(y)|=\delta _*^{h_0}\), and since all \(z\in \partial \mathcal B^-_{h_0}(y)\) have the same in-degree \(d_z^-=\delta _*\), using symmetry the proof of (3.70) is reduced to showing that for any \(z\in \partial \mathcal B^-_{h_0}(y)\), \(t=\Theta (\log ^3n)\),

$$\begin{aligned} \mathbb {E}\left( \lambda _t(z) \, |\, \sigma \right) =(1+o(1)) \frac{d_z^-}{m}. \end{aligned}$$
(3.71)

To compute the expected value in (3.71) we use the so-called annealed process. Namely, observe that

$$\begin{aligned} \mathbb {E}\left( \lambda _t(z) \, |\, \sigma \right) =\frac{1}{n}\sum _{x\in [n]} \mathbb {E}\left( P^t(x,z)\, |\, \sigma \right) =\frac{1}{n}\sum _{x\in [n]}\mathbb {P}^{a,\sigma }_x\left( X_t=z\right) , \end{aligned}$$
(3.72)

where \(X_t\) is the annealed walk with initial environment \(\sigma \) and initial position x, and \(\mathbb {P}^{a,\sigma }_x\) denotes its law. This process can be described as follows. At time 0 the environment consists of the edges from \(\sigma \) alone, and \(X_0=x\). At every step, given the current environment and position, the walker picks a uniformly random tail e at its current position; if e is still unmatched, then the walker picks a uniformly random unmatched head f, the edge ef is added to the environment, and the position is moved to the vertex of f; if instead e is already matched, then the position is moved to the vertex of the head to which e was matched. Let us show that uniformly in \(x\ne z\in \partial \mathcal B^-_{h_0}(y)\), uniformly in \(\sigma \in \mathcal A'\):

$$\begin{aligned} \mathbb {P}^{a,\sigma }_x\left( X_{t}=z\right) =(1+o(1))\frac{d_z^-}{m}. \end{aligned}$$
(3.73)
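
Before proving (3.73), we record a purely illustrative sketch of the one-step rule of the annealed walk just described (the data structures and names below are ours and are not part of the argument):

```python
import random

def annealed_step(v, tails, heads, matched, unmatched_heads):
    """One step of the annealed walk from vertex v.
    tails[v]        : list of tail identifiers attached to v
    heads[f]        : vertex owning head f
    matched         : dict tail -> head (the growing environment), updated in place
    unmatched_heads : set of currently unmatched heads, updated in place"""
    e = random.choice(tails[v])                  # uniform tail at the current position
    if e in matched:                             # tail already matched earlier:
        return heads[matched[e]]                 # move along the existing edge
    f = random.choice(tuple(unmatched_heads))    # fresh uniform unmatched head
    matched[e] = f                               # the edge ef joins the environment
    unmatched_heads.remove(f)
    return heads[f]                              # move to the owner of f
```

Averaging, over many independent runs, the indicator that a length-t walk started from a uniform vertex ends at z would estimate \({\mathbb E} \left( \lambda _t(z) \, |\, \sigma \right) \), which is the quantity computed in (3.71).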

Say that a collision occurs if the walk, by using a freshly matched edge, lands on a vertex that was already visited. Recall that we fixed \(t=O(\log ^3(n))\). At each time step the probability of a collision is at most O(t/m), and therefore the probability of more than one collision in the first t steps is at most \(O(t^4/m^2)=o(m^{-1})\). Thus we may assume that there is at most one cycle in the path of the walk up to time t. There are two cases to consider: 1) there is no cycle in the path up to time t, or there is one cycle that does not pass through the vertex z; 2) there is a cycle and it passes through z. In case 1), since \(X_{t}=z\) the walker must necessarily pick one of the heads of z at the very last step. Since all heads of z are unmatched by construction, and since the total number of unmatched heads at that time is at least \(m-n^{\beta _1}\Delta ^{h_0}-t= (1-o(1))m\), this event has probability \((1+o(1))d_z^-/m\). In case 2), since \(x\ne z\), we argue that in order to have a cycle that passes through z, the walk has to visit z at some time before t, which is an event of probability O(t/m), and then must hit back the previous part of the path, which is an event of probability \(O(t^2/m)\). This shows that we can upper bound the probability of scenario 2) by \(O(t^3/m^2)=o(m^{-1})\). This concludes the proof of (3.73). Next, observe that if \(x=z\), then the previous argument gives \(\mathbb {P}^{a,\sigma }_z\left( X_{t}=z\right) =O(t/m)\), which is a bound on the probability that the walk returns to z at some point within time t. In conclusion, (3.72) and (3.73) imply (3.71), which establishes (3.70).

Let us now show that

$$\begin{aligned} \mathbb {E}\left( \mathcal X^2 \, |\, \sigma \right) =(1+o(1))\mathbb {E}\left( \mathcal X\, |\, \sigma \right) ^2. \end{aligned}$$
(3.74)

Once we have (3.74) we can conclude (3.68) by using Chebyshev’s inequality together with (3.70) and the fact that \(\delta _*^{h_0}\Delta _*^{-h_0} \le C_2\log ^{1-\gamma _1}(n)\) for some constant \(C_2>0\). We write

$$\begin{aligned} \mathbb {E}\left( \mathcal X^2 \, |\, \sigma \right) =\sum _{y,y'\in \bar{S}} \Delta _*^{-2h_0} \frac{1}{n^2}\sum _{x,x'\in [n]} \sum _{z\in \partial \mathcal B^-_{h_0}(y)}\sum _{z'\in \partial \mathcal B^-_{h_0}(y')}{\mathbb P} ^{a,\sigma }_{x,x'}(X_{t-h_0}=z,X'_{t-h_0}=z'), \end{aligned}$$
(3.75)

where \({\mathbb P} ^{a,\sigma }_{x,x'}\) is the law of two trajectories \((X_s,X'_s)\), \(s=0,\dots ,t\), that can be sampled as follows. Let X be sampled up to time t according to the previously described annealed measure \({\mathbb P} _x^{a,\sigma }\), call \(\sigma '\) the environment obtained by adding to \(\sigma \) all the edges discovered during the sampling of X and then sample \(X'\) up to time t independently, according to \( {\mathbb P} _{x'}^{a,\sigma '}\).

Let also \({\mathbb P} ^{a,\sigma }_\mathrm{u}\) be defined by

$$\begin{aligned} {\mathbb P} ^{a,\sigma }_\mathrm{u}=\frac{1}{n^2}\sum _{x,x'\in [n]}{\mathbb P} ^{a,\sigma }_{x,x'}. \end{aligned}$$

Thus, under \({\mathbb P} ^{a,\sigma }_\mathrm{u}\) the two trajectories have independent uniformly distributed starting points \(x,x'\). With this notation we write

$$\begin{aligned} \mathbb {E}\left( \mathcal X^2 \, |\, \sigma \right) = \sum _{y,y'\in \bar{S}} \Delta _*^{-2h_0} \sum _{z\in \partial \mathcal B^-_{h_0}(y)}\sum _{z'\in \partial \mathcal B^-_{h_0}(y')}{\mathbb P} ^{a,\sigma }_\mathrm{u}(X_{t-h_0}=z,X'_{t-h_0}=z'). \end{aligned}$$
(3.76)

Let us show that if \(z\ne z'\), \(t=\Theta (\log ^3(n))\):

$$\begin{aligned} {\mathbb P} ^{a,\sigma }_\mathrm{u}(X_t=z,X'_t=z') = (1+o(1))\frac{d_z^-d_{z'}^-}{m^2}. \end{aligned}$$
(3.77)

Indeed, let A be the event that the first trajectory hits z at time t and visits \(z'\) at some time before that. Then, reasoning as in (3.73), the event A has probability \(O(t/m^2)\). Given any realization X of the first trajectory satisfying this event, the probability of \(X'_t=z'\) is at most the probability of colliding with the trajectory X within time t, which is O(t/m). On the other hand, if the first trajectory hits z at time t and does not visit \(z'\) at any time before that, then the conditional probability of \(X'_t=z'\) is, as in (3.73), given by \((1+o(1))d_{z'}^-/m\). This proves (3.77) when \(z\ne z'\).

If \(z=z'\), \(t=\Theta (\log ^3(n))\), let us show that

$$\begin{aligned} {\mathbb P} ^{a,\sigma }_\mathrm{u}(X_t=z,X'_t=z) = O(1/m^2). \end{aligned}$$
(3.78)

Consider the event A that the first trajectory X has at most one collision. The complementary event \(A^c\) has probability at most \(O(t^4/m^2)\). If \(A^c\) occurs, then the conditional probability of \(X'_t=z\) is at most the probability that \(X'\) collides with the first trajectory at some time \(s\le t\), that is O(t/m). Hence,

$$\begin{aligned} {\mathbb P} ^{a,\sigma }_\mathrm{u}(X_t=z,X'_t=z; A^c) = O(t^5/m^3)=O(1/m^2). \end{aligned}$$
(3.79)

To prove (3.78), notice that to realize \(X'_t=z\) there must be a time \(s=0,\dots ,t\) such that \(X'\) collides with the first trajectory X at time s, then \(X'\) stays in the digraph \(D_1\) defined by the first trajectory for the remaining \(t-s\) units of time, and \(X'\) hits z at time t. On the event A the probability of spending h units of time in \(D_1\) is at most \(2\delta ^{-h}\), and for any \(h\in [0,t]\) there are at most \(h+1\) points x which have a path of length h from x to z in \(D_1\). Therefore

$$\begin{aligned} {\mathbb P} ^{a,\sigma }_\mathrm{u}(X_t=z,X'_t=z; A) \le (1+o(1))\frac{d_z^-}{m}\sum _{h=0}^t\frac{2(h+1)}{m}\,2\delta ^{-h}= O(1/m^2). \end{aligned}$$
(3.80)

Hence, (3.78) follows from (3.79) and (3.80).

In conclusion, using (3.77) and (3.78) in (3.76), and recalling (3.70), we have obtained (3.74). \(\square \)

3.5 Upper bound on \(\pi _{\max }\)

As in Sect. 3.4 we start by replacing \(\pi (y)\) with \(\lambda _t(y)=\frac{1}{n} \sum _x P^t(x,y)\). In (3.63) we have seen that if \(t=2kT_\mathrm{ENT} \), then w.h.p.

$$\begin{aligned} \max _{y\in [n]}|\lambda _{t}(y)- \pi (y)|\le e^{-k}. \end{aligned}$$
(3.81)

Thus, using a union bound over \(y\in [n]\), the upper bound in Theorem 1.6 follows from the next statement.

Lemma 3.11

There exists \(C>0\) such that for any \(t=t_n=\Theta (\log ^3(n))\), uniformly in \(y\in [n]\)

$$\begin{aligned} \mathbb {P}\left( \lambda _{t}(y)\ge \tfrac{C}{n}\log ^{1-\kappa _0}(n)\right) = o(n^{-1}). \end{aligned}$$
(3.82)

Proof

Fix

$$\begin{aligned} h_0=\log _{\Delta _-}\log n, \end{aligned}$$

and call \(\sigma \) a realization of the in-neighborhood \(\mathcal B_{h_0}^-(y)\). Clearly,

$$\begin{aligned} \lambda _{t+h_0}(y)=\sum _{z\in \mathcal B^-_{h_0}(y)}\lambda _t(z)P^{h_0}(z,y). \end{aligned}$$

From (3.9), under the event \(\mathcal G_y(\hslash )\) from Proposition 2.1, we have \(P^{h_0}(z,y)\le 2\delta _+^{-h_0}=2\log ^{-\kappa _0}(n)\) for every \(z\in \mathcal B^-_{h_0}(y)\). Define

$$\begin{aligned} \mathcal X:=\sum _{z\in \mathcal B^-_{h_0}(y)}\lambda _t(z)=\lambda _t(\mathcal B^-_{h_0}(y)). \end{aligned}$$

Note that \(\mathcal X=\mathcal X^y\) depends on the fixed vertex y. Since on the event \(\mathcal G_y(\hslash )\) one has \(\lambda _{t+h_0}(y)\le 2\log ^{-\kappa _0}(n)\,\mathcal X\), it is sufficient to prove that for some constant C, uniformly in \(\sigma \) and \(y\in [n]\):

$$\begin{aligned} \mathbb {P}\left( \mathcal X>\tfrac{C}{n}\,\log n \,;\;\mathcal G_y(\hslash ) \,\, |\, \, \sigma \right) =o(n^{-1}), \end{aligned}$$
(3.83)

where \(\mathcal G_y(\hslash )\) is defined in (2.4), (2.5). By Markov’s inequality, for any \(K\in {\mathbb N} \) and any constant \(C>0\):

$$\begin{aligned} \mathbb {P}\left( \mathcal X>\tfrac{C}{n}\,\log (n);\mathcal G_y(\hslash ) \, |\, \sigma \right) \le \frac{\mathbb {E}\left[ \mathcal X^K;\mathcal G_y(\hslash )\, |\, \sigma \right] }{\left( \tfrac{C}{n}\log n\right) ^K}. \end{aligned}$$
(3.84)

We fix \(K=\log n\), and claim that there exists an absolute constant \(C_1>0\) such that

$$\begin{aligned} \mathbb {E}\left[ \mathcal X^K;\mathcal G_y(\hslash )\, |\, \sigma \right] \le \left( \tfrac{C_1}{n}\log n\right) ^K. \end{aligned}$$
(3.85)

The desired estimate (3.83) follows from (3.85) and (3.84) by taking C large enough.

We compute the K-th moment \(\mathbb {E}\left[ \mathcal X^K;\mathcal G_y(\hslash )\, |\, \sigma \right] \) by using a version of the annealed process from (3.75), which we now explain; this time we have K trajectories instead of 2:

$$\begin{aligned}&\mathbb {E}\left[ \mathcal X^K;\mathcal G_y(\hslash )\, |\, \sigma \right] = \frac{1}{n^K}\sum _{x_1,\dots ,x_K}\mathbb {E}\left[ P^t(x_1,\mathcal B^-_{h_0}(y))\cdots P^t(x_K,\mathcal B^-_{h_0}(y))\,;\,\mathcal G_y(\hslash )\, |\, \sigma \right] \nonumber \\&\quad = \frac{1}{n^K}\sum _{x_1,\dots ,x_K} \mathbb {P}^{a,\sigma }_{x_1,\dots ,x_K}\left( X_t^{(1)}\in \mathcal B^-_{h_0}(y),\dots ,X_t^{(K)}\in \mathcal B^-_{h_0}(y)\,;\, \mathcal G_y(\hslash ) \right) , \end{aligned}$$
(3.86)

where \(X^{(j)}:=\{X_s^{(j)}, s\in [0,t]\}\), \(j=1,\dots ,K\) denote K annealed walks each with initial point \(x_j\), and \(\mathbb {P}^{a,\sigma }_{x_1,\dots ,x_K}\) denotes the joint law of the trajectories \(X^{(j)}\), \(j=1,\dots ,K\), and the environment, defined as follows. Start with the environment \(\sigma \), and then run the first random walk \(X^{(1)}\) up to time t as described after (3.72). After that, run the walk \(X^{(2)}\) up to time t with initial environment given by the union of edges from \(\sigma \) and the first trajectory, as described in (3.75). Proceed recursively until all trajectories up to time t have been sampled. This produces a new environment, namely the digraph given by the union of \(\sigma \) and all the K trajectories. At this stage there are still many unmatched heads and tails, and we complete the environment by using a uniformly random matching of the unmatched heads and tails. This defines the coupling \(\mathbb {P}^{a,\sigma }_{x_1,\dots ,x_K}\) between the environment (the digraph G) and K independent walks in that environment. To ensure the validity of the expression (3.86) it suffices to note that conditionally on the realization of the full digraph G, under \(\mathbb {P}^{a,\sigma }_{x_1,\dots ,x_K}\) the variables \(X_t^{(1)},\dots ,X_t^{(K)}\) are independent random walks on G with length t.
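
Continuing the illustrative sketch given after (3.73) (again, not part of the argument), the K trajectories can be sampled sequentially on one shared, growing environment; `annealed_step` is the hypothetical one-step routine from that sketch and all names are ours:

```python
def annealed_trajectories(K, t, start_vertices, tails, heads, matched, unmatched_heads):
    """Sample K annealed walks of length t one after the other: edges revealed by
    earlier walks stay in `matched` and are reused by later walks, exactly as in
    the sequential construction described above."""
    endpoints = []
    for j in range(K):
        v = start_vertices[j]
        for _ in range(t):
            v = annealed_step(v, tails, heads, matched, unmatched_heads)
        endpoints.append(v)
    return endpoints
```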

It is convenient to introduce the notation

$$\begin{aligned} \mathbb {P}^{a,\sigma }_\mathrm{u}= \frac{1}{n^K}\sum _{x_1,\dots ,x_K} \mathbb {P}^{a,\sigma }_{x_1,\dots ,x_K}, \end{aligned}$$

for the annealed law of the K trajectories such that independently each trajectory starts at a uniformly random point \(X_0^{(j)}=x_j\). Let \(D_0=\sigma \) and let \(D_\ell \), for \(\ell =1,\dots ,K\), denote the digraph defined by the union of \(\sigma =\mathcal B^-_{h_0}(y)\) with the first \(\ell \) paths

$$\begin{aligned} \{X_s^{(j)}, 0\le s\le t\}, \quad j=1,\dots ,\ell . \end{aligned}$$

Call \(D_\ell (\hslash )\) the subgraph of \(D_\ell \) consisting of all directed paths in \(D_\ell \) ending at y with length at most \(\hslash \). We define \(\mathcal G^\ell _y(\hslash )\) as the event \({\textsc {tx}}(D_\ell (\hslash ))\le 1\). Notice that if the final environment has to satisfy \(\mathcal G_y(\hslash )\), then necessarily for every \(\ell \) the digraph \(D_\ell \) must satisfy \(\mathcal G^\ell _y(\hslash )\). Therefore,

$$\begin{aligned} \mathbb {E}\left[ \mathcal X^K;\mathcal G_y(\hslash )\, |\, \sigma \right] \le \mathbb {P}^{a,\sigma }_\mathrm{u} \left( X_t^{(1)}\in \mathcal B^-_{h_0}(y),\dots ,X_t^{(K)}\in \mathcal B^-_{h_0}(y)\,;\, \mathcal G^K_y(\hslash )\right) . \end{aligned}$$
(3.87)

Define

$$\begin{aligned} \mathcal W_\ell = \sum _{x\in V(D_\ell )}[d_x^-(D_\ell )-1]_+, \end{aligned}$$
(3.88)

where \(V(D_\ell )\) denotes the vertex set of \(D_\ell \) and \(d_x^-(D_\ell )\) is the in-degree of x in the digraph \(D_\ell \). Define also the \((\ell ,s)\) cluster \(\mathcal C_{\ell }^s\) as the digraph given by the union of \(D_{\ell -1}\) and the truncated path \(\{X_u^{(\ell )}, 0\le u\le s\}\) with \(s\le t\). We say that the \(\ell \)-th trajectory \(X^{(\ell )}\) has a collision at time \(s\ge 1\) if the edge \((X^{(\ell )}_{s-1},X^{(\ell )}_s)\notin \mathcal C_{\ell }^{s-1}\) and \(X^{(\ell )}_s\in \mathcal C_{\ell }^{s-1}\). We say that a collision occurs at time zero if \(X^{(\ell )}_0\in D_{\ell -1}\). Notice that at least

$$\begin{aligned} \sum _{x\notin \mathcal B^-_{h_0}(y)} [d_x^-(D_\ell )-1]_+ \end{aligned}$$

collisions must have occurred after the generation of the first \(\ell \) trajectories.

Let \(\mathcal Q_\ell \) denote the total number of collisions after the generation of the first \(\ell \) trajectories. Since \(|\mathcal B^-_{h_0}(y)|\le \Delta \log n\) one must have

$$\begin{aligned} \mathcal W_\ell \le \Delta \log n + \mathcal Q_\ell . \end{aligned}$$
(3.89)

Notice that the probability of a collision at any given time by any given trajectory is bounded above by \(p:=2\Delta (Kt + \Delta _-^{h_0})/m=O(\log ^4(n)/n)\) and therefore \(\mathcal Q_\ell \) is stochastically dominated by the binomial \(\mathrm{Bin}(Kt,p)\). In particular, for any \(k\in {\mathbb N} \):

$$\begin{aligned} \mathbb {P}\left( \mathcal Q_K\ge k \right) \le (Kt p)^k \le C_2^k\frac{\log ^{8k}(n)}{n^k}, \end{aligned}$$
(3.90)

for some constant \(C_2>0\). If \(A>0\) is a large enough constant, then

$$\begin{aligned} \mathbb {P}\left( \mathcal Q_K\ge A\log n \right) \le e^{-\tfrac{A}{2} \log ^2(n)}. \end{aligned}$$
(3.91)
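
Indeed, taking \(k=A\log n\) in (3.90) gives

$$\begin{aligned} \mathbb {P}\left( \mathcal Q_K\ge A\log n \right) \le \exp \Big (A\log n\,\big (\log C_2+8\log \log n-\log n\big )\Big )\le e^{-\tfrac{A}{2} \log ^2(n)}, \end{aligned}$$

for all n large enough.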

If \(A\ge 2\) then (3.91) is smaller than the right hand side of (3.85) with e.g. \(C_1=1\), and therefore from now on we may restrict to proving the upper bound

$$\begin{aligned} \mathbb {P}^{a,\sigma }_\mathrm{u} \left( X_t^{(1)}\in \mathcal B^-_{h_0}(y),\dots ,X_t^{(K)}\in \mathcal B^-_{h_0}(y)\,;\,\mathcal Q_K\le A\log n \,;\,\mathcal G^K_y(\hslash )\right) \le \left( \tfrac{C_1}{n}\log n\right) ^K, \end{aligned}$$
(3.92)

for some constant \(C_1=C_1(A)>0\). To prove (3.92), define the events

$$\begin{aligned} B_\ell =\{X_t^{(1)}\in \mathcal B^-_{h_0}(y),\dots ,X_t^{(\ell )}\in \mathcal B^-_{h_0}(y)\,;\,\mathcal Q_\ell \le A\log n \,;\,\mathcal G^\ell _y(\hslash )\}, \end{aligned}$$
(3.93)

for \(\ell =1,\dots ,K\). Since \(B_{\ell +1}\subset B_{\ell }\), the left hand side in (3.92) is equal to

$$\begin{aligned} \mathbb {P}^{a,\sigma }_\mathrm{u} \left( B_1\right) \prod _{\ell =2}^{K}\mathbb {P}^{a,\sigma }_\mathrm{u}\left( B_\ell \, |\, B_{\ell -1} \right) \end{aligned}$$
(3.94)

Thus, it is sufficient to show that for some constant \(C_1\):

$$\begin{aligned} \mathbb {P}^{a,\sigma }_\mathrm{u}\left( B_\ell \, |\, B_{\ell -1} \right) \le \tfrac{C_1}{n}\log n\,, \end{aligned}$$
(3.95)

for all \(\ell =1,\dots , K\), where it is understood that \(\mathbb {P}^{a,\sigma }_\mathrm{u}\left( B_1\, |\, B_{0} \right) =\mathbb {P}^{a,\sigma }_\mathrm{u}\left( B_1 \right) .\)

Let us partition the event \(\{X_t^{(\ell )}\in \mathcal B_{h_0}^-(y) \}\) by specifying the last time in which the walk \(X^{(\ell )}\) enters the neighborhood \(\mathcal B_{h_0}^-(y)\). Unless the walk starts in \(\mathcal B_{h_0}^-(y)\), at that time it must enter from \(\partial \mathcal B^-_{h_0}(y)\). Since the tree excess of \(\mathcal B_{h_0}^-(y)\) is at most 1, once the walker is in \(\mathcal B_{h_0}^-(y)\), we can bound the chance that it remains in \(\mathcal B_{h_0}^-(y)\) for k steps by \(2\delta _+^{-k}\). Therefore,

$$\begin{aligned} \mathbb {P}^{a,\sigma }_\mathrm{u}\left( B_\ell \, |\, B_{\ell -1} \right)&\le \mathbb {P}^{a,\sigma }_\mathrm{u}\left( X^{(\ell )}_{t}\in \mathcal B^-_{h_0}(y)\, |\, B_{\ell -1} \right) \\&\le 2\delta _+^{-t}\mathbb {P}^{a,\sigma }_\mathrm{u}\left( X^{(\ell )}_{0}\in \mathcal B^-_{h_0}(y)\, |\, B_{\ell -1} \right) \\&\quad + \sum _{j=1}^{t} 2\delta _+^{-(t-j)} \mathbb {P}^{a,\sigma }_\mathrm{u}\left( X^{(\ell )}_{j}\in \partial \mathcal B^-_{h_0}(y)\, |\, B_{\ell -1} \right) \\&\le 2t\delta _+^{-t/2}+\sum _{j=t/2+1}^t2\delta ^{-(t-j)}\mathbb {P}^{a,\sigma }_\mathrm{u}\left( X^{(\ell )}_{j}\in \partial \mathcal B^-_{h_0}(y)\, |\, B_{\ell -1} \right) \end{aligned}$$

Since \(t=\Theta (\log ^3(n))\), it is enough to show

$$\begin{aligned} \mathbb {P}^{a,\sigma }_\mathrm{u}\left( X^{(\ell )}_{j}\in \partial \mathcal B^-_{h_0}(y)\, |\, B_{\ell -1} \right) \le \tfrac{C_1}{n}\log n, \end{aligned}$$
(3.96)

uniformly in \(j\in (t/2,t)\) and \(1\le \ell \le K\).

Let \(\mathcal H^{\ell }_0\) denote the event that the \(\ell \)-th walk makes its first visit to the digraph \(D_{\ell -1}\) at the very last time j, when it enters \(\partial \mathcal B^-_{h_0}(y)\). Uniformly in the trajectories of the first \(\ell -1\) walks, at any time there are at most \(\Delta _-| \partial \mathcal B^-_{h_0}(y)|\le \Delta _-^{h_0+1}=\Delta _-\log n\) unmatched heads attached to \(\partial \mathcal B^-_{h_0}(y)\), and therefore

$$\begin{aligned} \mathbb {P}^{a,\sigma }_\mathrm{u}\left( X^{(\ell )}_{j}\in \partial \mathcal B^-_{h_0}(y)\,;\, \mathcal H^{\ell }_0\, |\, B_{\ell -1} \right) =O(| \partial \mathcal B^-_{h_0}(y)|/m)\le \tfrac{C_1}{n}\log n. \end{aligned}$$
(3.97)

Let \(\mathcal H^{\ell }_2\) denote the event that the \(\ell \)-th walk makes a first visit to \(D_{\ell -1}\) at some time \(s_1<j\), then at some time \(s_2>s_1\) it exits \(D_{\ell -1}\), and then at a later time \(s_3\le j\) enters again the digraph \(D_{\ell -1}\). Since each time the walk is outside \(D_{\ell -1}\) the probability of entering \(D_{\ell -1}\) at the next step is O(Kt/m), it follows that

$$\begin{aligned} \mathbb {P}^{a,\sigma }_\mathrm{u}\left( X^{(\ell )}_{j}\in \partial \mathcal B^-_{h_0}(y)\,;\, \mathcal H^{\ell }_2\, |\, B_{\ell -1} \right) =O(K^2t^4/m^2)\le \tfrac{C_1}{n}\log n. \end{aligned}$$
(3.98)

It remains to consider the case where the \(\ell \)-th walk enters only once the digraph \(D_{\ell -1}\) at some time \(s\le j-1\), and then stays in \(D_{\ell -1}\) for the remaining \(j-s\) units of time. Calling \(\mathcal H^\ell _{1,s}\) this event, and summing over all possible values of s, we need to show that

$$\begin{aligned} \sum _{s=0}^{j-1} \mathbb {P}^{a,\sigma }_\mathrm{u}\left( X^{(\ell )}_{j}\in \partial \mathcal B^-_{h_0}(y)\,;\, \mathcal H^{\ell }_{1,s}\, |\, B_{\ell -1} \right) \le \tfrac{C_1}{n}\log n. \end{aligned}$$
(3.99)

We divide the sum in two parts: \(s\in [0, j-\hslash +h_0]\) and \(s\in (j-\hslash +h_0,j)\). For the first part, note that the walk must spend at least \(\hslash -h_0\ge \hslash /2\) units of time in \(D_{\ell -1}(\hslash )\), which has probability at most \(2\delta _+^{-\hslash /2}=O(n^{-\varepsilon })\) for some constant \(\varepsilon >0\), because of the condition \(\mathcal G^{\ell -1}_y(\hslash )\) included in the event \(B_{\ell -1}\). Since the probability of hitting \(D_{\ell -1}\) at time s is O(Kt/m) we obtain

$$\begin{aligned} \sum _{s=0}^{j-\hslash +h_0} \mathbb {P}^{a,\sigma }_\mathrm{u}\left( X^{(\ell )}_{j}\in \partial \mathcal B^-_{h_0}(y)\,;\, \mathcal H^{\ell }_{1,s}\, |\, B_{\ell -1} \right) =O(Kt^2n^{-\varepsilon }/m)\le \tfrac{C_1}{n}\log n. \end{aligned}$$
(3.100)

To estimate the sum over \(s\in (j-\hslash +h_0,j)\), notice that the walk has to enter \(D_{\ell -1}\) by hitting a point \(z\in D_{\ell -1}\) at time s such that there exists a path of length \(h=j-s\) from z to \(\partial \mathcal B^-_{h_0}(y)\) within the digraph \(D_{\ell -1}\). Call \(L_{h}\) the set of such points in \(D_{\ell -1}\). Hitting this set at any given time s coming from outside the digraph \(D_{\ell -1}\) has probability at most \(2\Delta |L_h|/m\), and the path followed once it has entered \(D_{\ell -1}\) is necessarily in \(D_{\ell -1}(\hslash )\) and therefore has weight at most \(2\delta _+^{-h}\). Then,

$$\begin{aligned} \sum _{s=j-\hslash +h_0+1}^{j-1} \mathbb {P}^{a,\sigma }_\mathrm{u}\left( X^{(\ell )}_{j}\in \partial \mathcal B^-_{h_0}(y)\,;\, \mathcal H^{\ell }_{1,s}\, |\, B_{\ell -1} \right) \le \sum _{h=1}^{\hslash -h_0-1}\frac{2\Delta |L_{h}|}{m} 2\delta _+^{-h}, \end{aligned}$$
(3.101)

Let \(A_h\subset L_h\) denote the set of points exactly at distance h from \(\partial \mathcal B^-_{h_0}(y)\) in \(D_{\ell -1}\). We have

$$\begin{aligned} |A_h|&\le \sum _{x\in A_{h-1}} d_x^-(D_{\ell -1}) \\&\le |A_{h-1}| + \sum _{x\in A_{h-1}} [d_x^-(D_{\ell -1})- 1]_+ \\&\le |A_{h-2}| + \sum _{x\in A_{h-1}\cup A_{h-2}} [d_x^-(D_{\ell -1}) - 1]_+ \\&\le \dots \le |A_0|+\sum _{x\in A_{0}\cup \dots \cup A_{h-1}} [d_x^-(D_{\ell -1}) - 1]_+ \\&\le |\partial \mathcal B^-_{h_0}(y)| + \mathcal W_{\ell -1}. \end{aligned}$$

Since \(h\le \hslash =O( \log n)\) and \(|\partial \mathcal B^-_{h_0}(y)|\le \log n\), using (3.89) we obtain

$$\begin{aligned} |A_h|\le C_2\log n + \mathcal Q_{\ell -1}. \end{aligned}$$
(3.102)

On the event \(B_{\ell -1}\) we know that \(\mathcal Q_{\ell -1}\le A\log n\), and therefore \(|A_h|\le C_3\log n\) for some absolute constant \(C_3>0\). In conclusion, for all \(h\in (0,\hslash -h_0)\)

$$\begin{aligned} |L_h|\le \sum _{i=0}^{h}|A_i|\le C_3h\log n. \end{aligned}$$
(3.103)

Inserting this estimate in (3.101),

$$\begin{aligned} \sum _{s=j-\hslash +h_0+1}^{j-1} \mathbb {P}^{a,\sigma }_\mathrm{u}\left( X^{(\ell )}_{j}\in \partial \mathcal B^-_{h_0}(y)\,;\, \mathcal H^{\ell }_{1,s}\, |\, B_{\ell -1} \right) \le \tfrac{C_4}{n} \log n. \end{aligned}$$
(3.104)
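
In more detail, plugging (3.103) into (3.101) and extending the sum over h to infinity yields

$$\begin{aligned} \sum _{h=1}^{\hslash -h_0-1}\frac{2\Delta |L_{h}|}{m}\, 2\delta _+^{-h}\le \frac{4\Delta C_3\log n}{m}\sum _{h=1}^{\infty }h\,\delta _+^{-h}=O\Big (\frac{\log n}{m}\Big )=O\Big (\frac{\log n}{n}\Big ), \end{aligned}$$

where we used that \(\sum _{h\ge 1}h\,\delta _+^{-h}\) is a finite constant (recall that in our setting \(\delta _+\ge 2\)) and that \(m=\Theta (n)\) since the degrees are bounded. This is the content of (3.104).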

Combining (3.100) and (3.104) we have proved (3.99) for a suitable constant \(C_1\). \(\square \)

3.6 Lower bound on \(\pi _{\max }\)

Lemma 3.12

There exist constants \(\varepsilon ,c>0\) such that

$$\begin{aligned} \mathbb {P}\Big (\exists S\subset [n],\,|S|\ge n^\varepsilon \,,\; n \min _{y\in S}\pi (y)\ge c\log ^{1-\kappa _1}(n) \Big )=1-o(1). \end{aligned}$$
(3.105)

Proof

We argue as in the first part of the proof of Lemma 3.10. Namely, let \((\Delta _*,\delta _*)\in \mathcal L\) denote the type realizing the minimum in the definition of \(\kappa _1\); see (1.16). Let \(V_*=\mathcal V_{\Delta _*,\delta _*}\) denote the set of vertices of this type, and let \(\alpha _*\in (0,1)\) be a constant such that \(|V_*|\ge \alpha _* n\), for all n large enough. Fix a constant \(\beta _1\in (0,\tfrac{1}{4})\) and call \(y_1,\dots ,y_{N_1}\) the first \(N_1:=n^{\beta _1}\) vertices in the set \(V_*\). Then sample the in-neighborhoods \(\mathcal B^-_{h_0}(y_i)\) where

$$\begin{aligned} h_0=\log _{\Delta _*}\log n - C, \end{aligned}$$
(3.106)

and call \(\sigma \) a realization of all these neighborhoods. As in the proof of Lemma 3.10, we may assume that all \(\mathcal B^-_{h_0}(y_i)\) are successfully coupled with i.i.d. random trees \(Y_i\). Next, call a vertex \(y_i\) lucky if \(\mathcal B^-_{h_0}(y_i)\) has all its vertices of type \((\Delta _*,\delta _*)\). Then, if C in (3.106) is large enough, we may assume that at least \(n^{\beta _1/2}\) vertices \(y_i\) are lucky; see (3.67). As before, we call \(\mathcal A'\) the set of \(\sigma \) realizing these constraints. Given a realization \(\sigma \in \mathcal A'\) and some \(\varepsilon \in (0,\beta _1/2)\), we fix the first \(n^\varepsilon \) lucky vertices \(y_{*,i}\), \(i=1,\dots ,n^\varepsilon \). Since \({\mathbb P} (\mathcal A')=1-o(1)\), letting \(S=\{y_{*,i}, i=1,\dots ,n^\varepsilon \}\), it is sufficient to prove that for some constant \(c>0\)

$$\begin{aligned} \max _{\sigma \in \mathcal A'}\,\mathbb {P}\left( \min _{i=1,\dots ,n^\varepsilon }n \pi (y_{*,i})< c\log ^{1-\kappa _1}(n)\, |\, \sigma \right) =o(1). \end{aligned}$$
(3.107)
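
Heuristically, the abundance of lucky vertices can be quantified as follows (the rigorous input behind the claim above is (3.67)); we only use the assumption that, in the tree coupling, each newly revealed vertex has type \((\Delta _*,\delta _*)\) with probability bounded below by some constant \(q_*\in (0,1)\), which is plausible since \(|V_*|\ge \alpha _*n\) and the degrees are bounded. On the lucky event the ball \(\mathcal B^-_{h_0}(y_i)\) contains at most \(2\Delta _*^{h_0}=2\Delta _*^{-C}\log n\) vertices, so that

$$\begin{aligned} \mathbb {P}\left( y_i \text { is lucky}\right) \ge q_*^{\,2\Delta _*^{-C}\log n}=n^{-c(C)}, \qquad c(C):=2\Delta _*^{-C}\log (1/q_*), \end{aligned}$$

and \(c(C)\le \beta _1/4\) once C is large enough. Since the luckiness events are independent under the i.i.d.-tree coupling, the expected number of lucky vertices among \(y_1,\dots ,y_{N_1}\) is then at least \(n^{3\beta _1/4}\), and a standard concentration bound produces at least \(n^{\beta _1/2}\) lucky vertices with high probability.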

To prove (3.107) we first observe that by (3.34) and Lemma 3.3 it is sufficient to prove the same estimate with \(n \pi (y_{*,i})\) replaced by \(\Gamma _{h_1}(y_{*,i})\), where \(h_1=K\log \log n\) for some large but fixed constant K. Therefore, by using symmetry and a union bound it suffices to show

$$\begin{aligned} \max _{\sigma \in \mathcal A'}\,\mathbb {P}\left( \Gamma _{h_1}(y_{*})< c\log ^{1-\kappa _1}(n)\, |\, \sigma \right) \le n^{-2\varepsilon }, \end{aligned}$$
(3.108)

where \(y_*=y_{*,1}\) is the first lucky vertex. By the definition of a lucky vertex, \(\partial \mathcal B^-_{h_0}(y_*)\) has exactly \(\Delta _*^{h_0}\) elements. For each \(z\in \partial \mathcal B^-_{h_0}(y_*)\) we sample the in-neighborhood \(\mathcal B^-_{h_1-h_0}(z)\). The same argument as in the proof of Lemma 3.2 shows that the probability that all these neighborhoods are successfully coupled to i.i.d. random directed trees is at least \(1- O(\Delta ^{2h_1}/n)\). On this event we have

$$\begin{aligned} \Gamma _{h_1}(y_*)=\delta _*^{-h_0}\sum _{i=1}^{\Delta _*^{h_0}} X_i, \end{aligned}$$
(3.109)

where \(X_i=M^i_{h_1-h_0}\) is defined by (3.15). Then (3.16) shows that

$$\begin{aligned} \mathbb {P}\left( \Gamma _{h_1}(y_*)<\tfrac{1}{2}\Delta _*^{h_0}\delta _*^{-h_0}\right) \le \exp {\left( -c_1\Delta _*^{h_0}\right) }, \end{aligned}$$
(3.110)

for some constant \(c_1>0\). Since \(\Delta _*^{h_0}=\Delta _*^{-C}\log n\) and \(\Delta _*^{h_0}\delta _*^{-h_0}=(\delta _*/\Delta _*)^C\log ^{1-\kappa _1}(n)\), this shows that

$$\begin{aligned} \max _{\sigma \in \mathcal A'}\,\mathbb {P}\left( \Gamma _{h_1}(y_*)< c_2\log ^{1-\kappa _1}(n)\, |\, \sigma \right) \le n^{-2\varepsilon }, \end{aligned}$$
(3.111)

for some new constant \(c_2>0\) and for \(\varepsilon =c_1\Delta _*^{-C}/4\). This ends the proof of (3.108). \(\square \)

4 Bounds on the cover time

In this section we show how the control of the extremal values of the stationary distribution obtained in previous sections can be turned into the bounds on the cover time presented in Theorem 1.9. To this end we exploit the full strength of the strategy developed by Cooper and Frieze [14,15,16,17].

4.1 The key lemma

Given a digraph G, write \(X_t\) for the position of the random walk at time t and write \(\mathbf{P}_x\) for the law of \(\{X_t, t\ge 0\}\) with initial value \(X_0=x\). In particular, \(\mathbf{P}_x(X_t=y )=P^t(x,y)\) denotes the transition probability. Fix a time \(T>0\) and define the event that the walk does not visit y in the time interval \([T,t]\), for \(t>T\):

$$\begin{aligned} \mathcal A^T_y(t)=\{X_s\not =y,\,\forall s\in [T,t] \}. \end{aligned}$$
(4.1)

Moreover, define the generating function

$$\begin{aligned} R_y^T(z)=\sum _{t=0}^{T} z^t\,\mathbf{P}_y(X_t=y ), \quad z\in {\mathbb C} . \end{aligned}$$
(4.2)

Thus, \(R_y^T(1)\ge 1\), and it equals the expected number of visits to y up to time T (including the visit at time 0), for the walk started at y. The following statement is proved in [15], see also [17, Lemma 3].

Lemma 4.1

Assume that \(G=G_n\) is a sequence of digraphs with vertex set [n] and stationary distribution \(\pi =\pi _n\), and let \(T=T_n\) be a sequence of times such that

  1. (i)

    \(\max _{x,y\in [n]}|P^T(x,y)-\pi (y)|\le n^{-3}\).

  2. (ii)

    \(T^2\pi _\mathrm{max}=o(1)\) and \(T\pi _\mathrm{min}\ge n^{-2}\).

Suppose that \(y\in [n]\) satisfies:

  1. (iii)

    There exist \(K,\psi >0\) independent of n such that

    $$\begin{aligned} \min _{|z|\le 1+\frac{1}{K T}}|R^T_y(z)|\ge \psi . \end{aligned}$$

Then there exist \(\xi _1,\xi _2=O(T\pi _\mathrm{max})\) such that for all \(t\ge T\):

$$\begin{aligned} \max _{x\in [n]}\left| \mathbf{P}_x\left( \mathcal A^T_y(t)\right) -\frac{1+\xi _1}{(1+p_y)^{t+1}}\right| \le e^{-\frac{t}{2K T}}, \end{aligned}$$
(4.3)

where

$$\begin{aligned} p_y=(1+\xi _2)\frac{\pi (y)}{R_y^T(1)}. \end{aligned}$$
(4.4)

We want to apply the above lemma to digraphs from our configuration model. Thus, our first task is to make sure that the assumptions of Lemma 4.1 are satisfied. From now on we fix the sequence \(T=T_n\) as

$$\begin{aligned} T = \log ^3(n). \end{aligned}$$
(4.5)

From (3.2) and the argument in (3.61) it follows that item (i) of Lemma 4.1 is satisfied with high probability. Moreover, Theorem 1.3 and Theorem 1.6 imply that item (ii) of Lemma 4.1 is also satisfied with high probability. Next, following [16], we define a class of vertices \(y\in [n]\) which satisfy item (iii) of Lemma 4.1. We use the convenient notation

$$\begin{aligned} \vartheta = \log \log \log (n). \end{aligned}$$
(4.6)

Definition 4.2

We call a small cycle any collection of \(\ell \le 3\vartheta \) edges whose undirected projection forms a simple undirected cycle of length \(\ell \). We say that \(v\in [n]\) is locally tree-like (LTL) if its in- and out-neighborhoods up to depth \(\vartheta \) are both directed trees and they intersect only at v. We denote by \(V_1\) the set of LTL vertices, and write \(V_2=[n]\setminus V_1\) for the complementary set.

The next proposition can be proved as in [16, Section 3]. Recall the definition of \(\Delta \) in (2.1).

Proposition 4.3

The following holds with high probability:

  1. (1)

    The number of small cycles is at most \( \Delta ^{9\vartheta }\).

  2. (2)

    The number of vertices which are not LTL satisfies \(|V_2|\le \Delta ^{15\vartheta }\).

  3. (3)

    There are no two small cycles which are less than \(9\vartheta \) undirected steps away from each other.

Proposition 4.4

With high probability, uniformly in \(y\in V_1\):

$$\begin{aligned} R_y^T(1)=1+O(2^{-\vartheta }). \end{aligned}$$
(4.7)

Moreover, there exist constants \(K,\psi >0\) such that with high probability, every \(y\in V_1\) satisfies item (iii) of Lemma 4.1. In particular, (4.3) holds uniformly in \(y\in V_1\).

Proof

We first prove (4.7). Fix \(y\in V_1\) and consider the neighborhoods \(\mathcal B_{\vartheta }^\pm (y)\) and \(\mathcal B_{\hslash }^-(y)\). By Proposition 2.1 we may assume that \(\mathcal B_{\hslash }^-(y)\) and \(\mathcal B_{\vartheta }^+(y)\) are both directed trees except for at most one extra edge. By the assumption \(y\in V_1\) we know that \(\mathcal B_{\vartheta }^-(y), \mathcal B_{\vartheta }^+(y)\) are both directed trees intersecting only at y, so that the extra edge in \(\mathcal B_{\hslash }^-(y)\cup \mathcal B_{\vartheta }^+(y)\) cannot be in \(\mathcal B_{\vartheta }^-(y)\cup \mathcal B_{\vartheta }^+(y)\). Thus, only the following cases need to be considered:

  1. (1)

    There is no extra edge in \(\mathcal B_{\hslash }^-(y)\cup \mathcal B_{\vartheta }^+(y)\);

  2. (2)

The extra edge connects \(\mathcal B_{\hslash }^-(y)\setminus \mathcal B_{\vartheta }^-(y)\) to itself;

  3. (3)

    The extra edge connects \(\mathcal B_{\vartheta }^-(y)\) to \( \mathcal B_{\hslash }^-(y)\setminus \mathcal B_{\vartheta }^-(y)\);

  4. (4)

    The extra edge connects \(\mathcal B_{\vartheta }^+(y)\) to \( \mathcal B_{\hslash }^-(y)\setminus \mathcal B_{\vartheta }^-(y)\).

In all cases but the last, if a walk started at y returns to y at time \(t>0\) then it must exit \(\partial \mathcal B_{\vartheta }^+(y)\) and enter \(\partial \mathcal B_{\hslash }^-(y)\), and from any vertex of \(\partial \mathcal B_{\hslash }^-(y)\) the probability of reaching y before exiting \(\mathcal B_{\hslash }^-(y)\) is at most \(2\delta ^{-\hslash }\). Therefore, in these cases the number of visits to y up to T is stochastically dominated by \(1+\mathrm{Bin}(T,2\delta ^{-\hslash })\) and

$$\begin{aligned} 1\le R_y^T(1) \le 1 + 2T\delta ^{-\hslash } = 1 + O(n^{-a}), \end{aligned}$$

for some \(a>0\). In the last case instead it is possible for the walk to jump from \(\mathcal B_{\vartheta }^+(y)\) to \( \mathcal B_{\hslash }^-(y)\setminus \mathcal B_{\vartheta }^-(y)\). Let \(E_k\) denote the event that the walk visits y exactly k times in the interval [1, T]. Let B denote the event that the walk visits y exactly \(\vartheta \) units of time after its first visit to \(\partial \mathcal B_{\vartheta }^-(y)\). Then \(\mathbf{P}_y(B)\le \delta ^{-\vartheta }\). On the complementary event \(B^c\) the walk must enter \(\partial \mathcal B_{\hslash }^-(y)\) before visiting y, and each time it visits \(\partial \mathcal B_{\hslash }^-(y)\) it has probability at most \(2\delta ^{-\hslash }\) to visit y before the next visit to \(\partial \mathcal B_{\hslash }^-(y)\). Since the number of attempts is at most T one finds

$$\begin{aligned} \mathbf{P}_y(E_1) \le \mathbf{P}_y(B) + \mathbf{P}_y(E_1,B^c)\le \delta ^{-\vartheta } + 2T\delta ^{-\hslash } \le 2\delta ^{-\vartheta }. \end{aligned}$$

By the strong Markov property,

$$\begin{aligned} \mathbf{P}_y(E_k) \le \mathbf{P}_y(E_1)^k. \end{aligned}$$

Therefore

$$\begin{aligned} R_y^T(1) = 1 + \sum _{k=1}^\infty k \mathbf{P}_y(E_k) =1+O(\delta ^{-\vartheta }). \end{aligned}$$
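
The last equality follows from the geometric series: writing \(q:=\mathbf{P}_y(E_1)\le 2\delta ^{-\vartheta }=o(1)\),

$$\begin{aligned} \sum _{k=1}^\infty k\, \mathbf{P}_y(E_k)\le \sum _{k=1}^\infty k\, q^k=\frac{q}{(1-q)^2}=O(\delta ^{-\vartheta }). \end{aligned}$$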

To see that \(y\in V_1\) satisfies item (iii) of Lemma 4.1, take \(z\in {\mathbb C} \) with \(|z|\le 1+ 1/(KT)\) and write

$$\begin{aligned} |R^T_y(z)|&\ge 1-\sum _{t=1}^T \mathbf{P}_y(X_t=y)|z|^t \ge 1- e^{1/K}(R^T_y(1)-1)= 1 - O(\delta ^{-\vartheta }). \end{aligned}$$
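
Here the second inequality uses that, for \(|z|\le 1+1/(KT)\) and \(t\le T\),

$$\begin{aligned} |z|^t\le \Big (1+\frac{1}{KT}\Big )^{T}\le e^{1/K}. \end{aligned}$$

In particular, \(|R^T_y(z)|\ge \tfrac{1}{2}\) for all n large enough, uniformly in \(y\in V_1\) and \(|z|\le 1+1/(KT)\), so that item (iii) of Lemma 4.1 holds with, say, \(\psi =\tfrac{1}{2}\) and any fixed constant \(K>0\).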

\(\square \)

4.2 Upper bound on the cover time

We prove the following estimate relating the cover time to \(\pi _\mathrm{min}\). Combined with Theorem 1.3, it implies the upper bound on the cover time in Theorem 1.9.

Lemma 4.5

For any constant \(\varepsilon >0\), with high probability

$$\begin{aligned} \max _{x\in [n]}\,\mathbf{E}_x\left( \tau _\mathrm{cov}\right) \le (1+\varepsilon )\frac{\log n}{\pi _\mathrm{min}}. \end{aligned}$$
(4.8)

Proof

Let \(U_s\) denote the set of vertices that are not visited in the time interval [0, s]. By Markov’s inequality, for all \(t_*\ge T\):

$$\begin{aligned} \mathbf{E}_x[\tau _\mathrm{cov}]&=\sum _{s\ge 0}\mathbf{P}_x(\tau _\mathrm{cov}>s)=\sum _{s\ge 0}\mathbf{P}_x(U_s\ne \emptyset )\nonumber \\&\le t_* + \sum _{s\ge t_*}\mathbf{E}_x\left[ |U_s|\right] =t_*+\sum _{s\ge t_*}\sum _{y\in [n]}\mathbf{P}_x(y\in U_s)\nonumber \\&\le t_*+\sum _{s\ge t_*}\sum _{y\in [n]}\mathbf{P}_x(\mathcal A^T_y(s)). \end{aligned}$$
(4.9)

Choose

$$\begin{aligned} t_*:= \frac{(1+\varepsilon )\log n}{\pi _\mathrm{min}}, \end{aligned}$$

for \(\varepsilon >0\) fixed. It is sufficient to prove that the last term in (4.9) is \(o(t_*)\) uniformly in \(x\in [n]\).

From Proposition 4.4 we can estimate

$$\begin{aligned} \mathbf{P}_x(\mathcal A^T_y(s))= \frac{(1+\xi ')}{(1+\bar{p}_y)^{s+1}}\,, \end{aligned}$$
(4.10)

where \(\bar{p}_y:= (1+\xi )\pi (y)\) with \(\xi ,\xi '=O(T\pi _\mathrm{max})+O(\delta ^{-\vartheta })=o(1)\) uniformly in \(x\in [n],y\in V_1\). Therefore,

$$\begin{aligned} \sum _{s\ge t_*}\sum _{y\in V_1}\mathbf{P}_x(\mathcal A^T_y(s))=(1+o(1))\sum _{y\in V_1} \frac{1}{\bar{p}_y(1+\bar{p}_y)^{t_*}}. \end{aligned}$$
(4.11)

Using \(\pi (y)\ge \pi _\mathrm{min}\), (4.11) is bounded by

$$\begin{aligned} \frac{(1+o(1))n}{\bar{p}_y(1+\bar{p}_y)^{t_*}}&\le \frac{2n}{\pi _\mathrm{min}}\exp {\left( -\pi _\mathrm{min}t_*(1+o(1))\right) } \le \frac{1}{\pi _\mathrm{min}}=o(t_*), \end{aligned}$$

for all fixed \(\varepsilon >0\) in the definition of \(t_*\).
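
Explicitly, with \(t_*=(1+\varepsilon )\log n/\pi _\mathrm{min}\),

$$\begin{aligned} \exp {\left( -\pi _\mathrm{min}t_*(1+o(1))\right) }=n^{-(1+\varepsilon )(1+o(1))}\le \frac{1}{2n} \end{aligned}$$

for all n large enough, so the quantity above is indeed at most \(1/\pi _\mathrm{min}\), which is \(o(t_*)\) since \(t_*\pi _\mathrm{min}=(1+\varepsilon )\log n\rightarrow \infty \).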

It remains to control the contribution of \(y\in V_2\) to the sum in (4.9). From Proposition 4.3 we may assume that \(|V_2|=O(\Delta ^{15\vartheta })\). In particular, it is sufficient to show that with high probability uniformly in \(x\in [n]\) and \(y\in V_2\):

$$\begin{aligned} \sum _{s\ge t_*}\mathbf{P}_x(\mathcal A^T_y(s)) = o(t_*\Delta ^{-15\vartheta }). \end{aligned}$$
(4.12)

To prove (4.12), fix \(y\in V_2\) and notice that by Proposition 4.3 (3), we may assume that there exists \(u\in V_1\) such that \(d(u,y)<10\vartheta \). Setting \(t_0:=4/\pi _\mathrm{min}\) and \(t_1=t_0+10\vartheta \), we have

$$\begin{aligned} \mathbf{P}_x(\mathcal A^T_y(t_1)^c)&=\mathbf{P}_x( y\in \{X_T,X_{T+1},\dots ,X_{t_1}\})\\&\ge \mathbf{P}_x( u\in \{X_T,X_{T+1},\dots ,X_{t_0}\})\mathbf{P}_u(y\in \{X_1,\dots ,X_{10\vartheta }\})\\&\ge \left( 1-\mathbf{P}_x(\mathcal A^T_u(t_0))\right) \Delta ^{-10\vartheta }. \end{aligned}$$

Since \(u\in V_1\), as in (4.10), for n large enough,

$$\begin{aligned} \mathbf{P}_x(\mathcal A^T_u(t_0))\le \frac{2}{(1+\bar{p}_u)^{t_0+1}}\le \frac{1}{2}. \end{aligned}$$
(4.13)

Setting \(\gamma :=\frac{1}{2}\Delta ^{-10\vartheta }\), we have shown that \(\mathbf{P}_x(\mathcal A^T_y(t_1)^c)\ge \gamma \). Since this bound is uniform over x, the Markov property implies, for all \( k\in {\mathbb N} \),

$$\begin{aligned} \mathbf{P}_x(\mathcal A^T_y(s))\le (1-\gamma )^k,\,\,s>k(T+t_1). \end{aligned}$$
(4.14)

Therefore,

$$\begin{aligned} \sum _{s\ge t_*}\mathbf{P}_x(\mathcal A^T_y(s))&\le \sum _{s\ge t_*}(1-\gamma )^{\lfloor s/(T+t_1)\rfloor } \le \sum _{s\ge t_*}(1-\gamma )^{s/2t_1} \\&\le \frac{\exp {\left( -\gamma t_*/2t_1\right) }}{1-\exp {\left( -\gamma /2t_1\right) }}= O(t_1/\gamma ) = o(t_*\Delta ^{-15\vartheta }). \end{aligned}$$
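
The last two estimates are elementary: since \(\gamma /2t_1=o(1)\), we have \(1-\exp {\left( -\gamma /2t_1\right) }\ge \gamma /4t_1\), so the ratio is \(O(t_1/\gamma )\); moreover, recalling that \(\vartheta =\log \log \log (n)\), \(t_1=\Theta (1/\pi _\mathrm{min})\) and \(\gamma ^{-1}=2\Delta ^{10\vartheta }\),

$$\begin{aligned} \frac{t_1}{\gamma }=O\Big (\frac{\Delta ^{10\vartheta }}{\pi _\mathrm{min}}\Big ), \qquad t_*\Delta ^{-15\vartheta }=\frac{(1+\varepsilon )\log n}{\pi _\mathrm{min}}\,\Delta ^{-15\vartheta }, \qquad \Delta ^{25\vartheta }=(\log \log n)^{25\log \Delta }=o(\log n), \end{aligned}$$

which gives \(t_1/\gamma =o(t_*\Delta ^{-15\vartheta })\).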

\(\square \)

4.3 Lower bound on the cover time

We prove the following stronger statement.

Lemma 4.6

For some constant \(c>0\), with high probability

$$\begin{aligned} \min _{x\in [n]}\,\mathbf{P}_x\left( \tau _\mathrm{cov} \ge c\, n\log ^{\gamma _1} n\right) =1-o(1). \end{aligned}$$
(4.15)

Clearly, this implies the lower bound on \(T_\mathrm{cov}=\max _{x\in [n]}\mathbf{E}_x\left( \tau _\mathrm{cov}\right) \) in Theorem 1.9. The proof of Lemma 4.6 is based on the second moment method as in [17]. If \(W\subset [n]\) is a set of vertices, let \(W_t\) be the set

$$\begin{aligned} W_t=\{y\in W:\,y \text { is not visited in } [0,t]\} \end{aligned}$$
(4.16)

Then

$$\begin{aligned} \mathbf{P}_x\left( \tau _\mathrm{cov}>t\right) \ge \mathbf{P}_x\left( |W_t|>0\right) \ge \frac{\mathbf{E}_x\left[ |W_t|\right] ^2}{\mathbf{E}_x\left[ |W_t|^2\right] }. \end{aligned}$$
(4.17)
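
The second inequality in (4.17) is the usual second moment bound: since \(|W_t|=|W_t|\mathbf {1}_{\{|W_t|>0\}}\), the Cauchy–Schwarz inequality gives

$$\begin{aligned} \mathbf{E}_x\left[ |W_t|\right] ^2\le \mathbf{E}_x\left[ |W_t|^2\right] \,\mathbf{P}_x\left( |W_t|>0\right) . \end{aligned}$$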

Therefore, Lemma 4.6 is a consequence of the following estimate.

Lemma 4.7

For some constant \(c>0\), with high probability there exists a nonempty set \(W\subset [n]\) such that

$$\begin{aligned} \max _{x\in [n]}\,\frac{\mathbf{E}_x\left[ |W_{t}|^2\right] }{\mathbf{E}_x\left[ |W_t|\right] ^2}=1+o(1),\quad t=c \,n\log ^{\gamma _1} n. \end{aligned}$$
(4.18)

We start the proof of Lemma 4.7 by exhibiting a candidate for the set W.

Proposition 4.8

For any constant \(K>0\), with high probability there exists a set W such that

  1. (1)

    \(W\subset V_1\), where \(V_1\) is the LTL set from Definition 4.2, and \(|W|\ge n^\alpha \) for some constant \(\alpha >0\).

  2. (2)

    For some constant \(C>0\), for all \(y\in W\),

    $$\begin{aligned} \pi (y)\le \tfrac{C}{n}\,\log ^{1-\gamma _1}(n). \end{aligned}$$
    (4.19)
  3. (3)

    For all \(x,y\in W\):

    $$\begin{aligned} \left| \pi (x)-\pi (y)\right| \le \pi _\mathrm{min}\log ^{-K}(n). \end{aligned}$$
    (4.20)
  4. (4)

    For all \(x,y\in W\): \(\min \{d(x,y), d(y,x)\}>2\vartheta \).

Proof

From Theorem 1.3 we know that w.h.p. there exists a set \(S\subset [n]\) with \(|S|>n^\beta \) such that (4.19) holds. Moreover, a minor modification of the proof of Lemma 3.10 shows that we may also assume that \(S\subset V_1\) and that \(\min \{d(x,y), d(y,x)\}>2\vartheta \) for every \(x,y\in S\). Indeed, it suffices to generate the out-neighborhoods \(\mathcal B^+_\vartheta (y_i)\) for every \(i=1,\dots ,N_1\) and the argument for (3.66) shows that these are disjoint trees with high probability. To conclude, we observe that there is a \(W\subset S\) such that \(|W|>n^{\beta /2}\) and such that (4.20) holds. Indeed, using \(\pi _\mathrm{min}\ge n^{-1}\log ^{-K_1}(n)\) for some constant \(K_1\), for any constant \(K>0\) we may partition the interval

$$\begin{aligned}{}[n^{-1}\log ^{-K_1}(n),Cn^{-1}\log ^{1-\gamma _1}(n)] \end{aligned}$$

into \(\log ^{2K}(n)\) intervals of equal length; at least one of them must contain \(n^{\beta } \log ^{-2K}(n)\ge n^{\beta /2}\) elements of S, and these elements satisfy (4.20) if K is sufficiently large (for smaller values of K the bound (4.20) is weaker and follows a fortiori). \(\square \)

Proof of Lemma 4.7

Consider the first moment \(\mathbf{E}_x\left[ |W_t|\right] \), where W is the set from Proposition 4.8 and t is fixed as \(t=c\,n\log ^{\gamma _1}(n)\). For \(y\in W\subset V_1\) we use Lemma 4.1 and Proposition 4.4. As in (4.10) we have

$$\begin{aligned} \mathbf{P}_x(\mathcal A^T_y(t))=(1+o(1))(1+\bar{p}_y)^{-(t+1)}, \end{aligned}$$
(4.21)

where \(\bar{p}_y= (1+o(1))\pi (y)\le p_W:=2C\,n^{-1}\log ^{1-\gamma _1}(n)\), with C as in (4.19). Therefore,

$$\begin{aligned} \mathbf{E}_x\left[ |W_t|\right]&=\sum _{y\in W} \mathbf{P}_x\left( y \text { not visited in } [0,t] \right) \\&\ge -T+\sum _{y\in W}\mathbb {P}(\mathcal A^T_y(t)) \ge -T+(1+o(1))|W|(1+p_W)^{-t}. \end{aligned}$$

Taking the constant c in the definition of t sufficiently small, one has \(p_W t\le \tfrac{\alpha }{2} \log n\) and therefore

$$\begin{aligned} \mathbf{E}_x\left[ |W_t|\right] \ge -T+(1+o(1))|W|n^{-\alpha /2} \ge \tfrac{1}{2}\,n^{\alpha /2}, \end{aligned}$$
(4.22)

where we use \(T=\log ^3(n)\) and \(|W|\ge n^\alpha \). In particular, (4.22) shows that

$$\begin{aligned} \sum _{y\in W}\mathbb {P}(\mathcal A^T_y(t)) =(1+o(1))\mathbf{E}_x\left[ |W_t|\right] . \end{aligned}$$
(4.23)
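
The smallness condition on c used above is explicit: with \(p_W=2C\,n^{-1}\log ^{1-\gamma _1}(n)\) and \(t=c\,n\log ^{\gamma _1}(n)\),

$$\begin{aligned} p_W\,t=2Cc\log n, \qquad (1+p_W)^{-t}\ge e^{-p_Wt}\ge n^{-\alpha /2} \quad \text {as soon as } c\le \tfrac{\alpha }{4C}. \end{aligned}$$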

Concerning the second moment \(\mathbf{E}_x\left[ |W_t|^2\right] \), we have

$$\begin{aligned} \mathbf{E}_x\left[ |W_t|^2\right]&=\sum _{y,y'\in W} \mathbf{P}_x\left( y \text { and } y' \text { not visited in } [0,t] \right) \\&\le \sum _{y,y'\in W}\mathbf{P}_x\left( \mathcal A^T_y(t)\cap \mathcal A^T_{y'}(t)\right) . \end{aligned}$$

From this and (4.23), and since the diagonal terms \(y=y'\) contribute at most \(\mathbf{E}_x\left[ |W_t|\right] =o\left( \mathbf{E}_x\left[ |W_t|\right] ^2\right) \) by (4.22), the proof of Lemma 4.7 is completed by showing, uniformly in \(x\in [n]\) and distinct \(y,y'\in W\):

$$\begin{aligned} \mathbf{P}_x\left( \mathcal A^T_y(t)\cap \mathcal A^T_{y'}(t)\right) = (1+o(1))\mathbf{P}_x\left( \mathcal A^T_y(t)\right) \mathbf{P}_x\left( \mathcal A^T_{y'}(t)\right) . \end{aligned}$$
(4.24)

We follow the idea of [17]. Let \(G^*\) denote the digraph obtained from our digraph G by merging the two vertices \(y,y'\) into the single vertex \(y_*=\{y,y'\}\). Notice that \(y_*\) is LTL in the graph \(G^*\) in the sense of Definition 4.2. Moreover, \(G^*\) has the law of a directed configuration model with the same degree sequence as G except that at \(y_*\) it has \(d_{y_*}^\pm =d^\pm _y+d^\pm _{y'}\). It follows that we may apply Lemma 4.1 and Proposition 4.4. Therefore, if \(\mathbf{P}^*_x\) denotes the law of the random walk on \(G^*\) started at x, as in (4.21) we have

$$\begin{aligned} \mathbf{P}^*_x(\mathcal A^T_{y_*}(t))=(1+o(1))(1+\bar{p}_{y_*})^{-t}, \end{aligned}$$
(4.25)

uniformly in \(x\in [n],y,y'\in W\), where \(\bar{p}_{y_*}=(1+o(1))\pi ^*(y_*)\), and \(\pi ^*\) is the stationary distribution of \(G^*\). In Lemma 4.9 below we prove that

$$\begin{aligned} \max _{\begin{array}{c} v\in [n]: \\ v\ne y,y' \end{array}}|\pi (v)-\pi ^*(v)|\le a, \quad |\pi (y)+\pi (y')-\pi ^*(y_*)|\le a, \end{aligned}$$
(4.26)

where \(a:= \pi _\mathrm{min}\log ^{-1}(n)\). Assuming (4.26), we can conclude the proof of (4.24). Indeed, letting \(P_*\) denote the transition matrix of the graph \(G^*\),

$$\begin{aligned} \mathbf{P}_x^*(\mathcal A^T_{y_*}(t))&=\sum _{v\not =y,y'}P^T_*(x,v)\mathbf{P}^*_v(X_s\not =y_*,\,\forall s\in [1,t-T])\\&=\sum _{v\not =y,y'}\left( \pi ^*(v)+O(n^{-3}) \right) \mathbf{P}^*_v(X_s\not =y_*,\,\forall s\in [1,t-T]) \end{aligned}$$

On the other hand,

$$\begin{aligned} \mathbf{P}_x(\mathcal A^T_y(t)\cap \mathcal A^T_{y'}(t))&= \sum _{v\not =y,y'}P^T(x,v)\mathbf{P}_v(X_s\notin \{y,y'\},\,\forall s\in [1,t-T])\\&=\sum _{v\not =y,y'}\left( \pi (v)+O(n^{-3}) \right) \mathbf{P}_v(X_s\notin \{y,y'\},\,\forall s\in [1,t-T]) \end{aligned}$$

For all \(v\not =y,y'\),

$$\begin{aligned}&\mathbf{P}^*_v(X_s\ne y_*,\,\forall s\in [1,t-T])=\mathbf{P}_v(X_s\not \in \{y,y'\},\,\forall s\in [1,t-T])\\&\quad \le \frac{(1+o(1))}{\pi _\mathrm{min}}P^T(x,v)\mathbf{P}_v(X_s\not \in \{y,y'\},\,\forall s\in [1,t-T]), \end{aligned}$$

uniformly in \(x\in [n]\), where we used condition (i) in Lemma 4.1 to estimate \(1\le \frac{(1+o(1))}{\pi _\mathrm{min}}P^T(x,v)\). Therefore, using (4.26)

$$\begin{aligned}&\left| \mathbf{P}_x\left( \mathcal A^T_y(t)\cap \mathcal A^T_{y'}(t)\right) -\mathbf{P}_x^*\left( \mathcal A^T_{y_*}(t)\right) \right| \\&\quad \le \sum _{v\ne y,y'} |\pi (v)-\pi ^*(v)+O(n^{-3})|\,\mathbf{P}_v(X_s\not \in \{y,y'\},\,\forall s\in [1,t-T]) \\&\quad \le (a+O(n^{-3}))\frac{(1+o(1))}{\pi _\mathrm{min}} \sum _{v\ne y,y'}P^T(x,v)\mathbf{P}_v(X_s\not \in \{y,y'\},\,\forall s\in [1,t-T])\\&\quad \le \frac{2a}{\pi _\mathrm{min}}\,\mathbf{P}_x(\mathcal A^T_y(t)\cap \mathcal A^T_{y'}(t)). \end{aligned}$$

By definition of a we have \(a/\pi _\mathrm{min}\rightarrow 0\) so that

$$\begin{aligned} \mathbf{P}_x(\mathcal A^T_y(t)\cap \mathcal A^T_{y'}(t))=(1+o(1))\mathbf{P}_x^*(\mathcal A^T_{y_*}(t)). \end{aligned}$$
(4.27)

Using (4.21), (4.25) and (4.26) we conclude that

$$\begin{aligned} \mathbf{P}_x\left( \mathcal A^T_y(t)\cap \mathcal A^T_{y'}(t)\right)&=(1+o(1))\exp {\left( -(1+o(1))(\pi (y)+\pi (y'))t\right) } \\&= (1+o(1))\mathbf{P}_x\left( \mathcal A^T_y(t)\right) \mathbf{P}_x\left( \mathcal A^T_{y'}(t)\right) . \end{aligned}$$

\(\square \)

Lemma 4.9

The stationary distributions \(\pi ,\pi ^*\) satisfy (4.26).

Proof

We follow the proof of Eq. (107) in [17]. The stochastic matrix of the simple random walk on \(G^*\) is given by

$$\begin{aligned} P_*(v,w)={\left\{ \begin{array}{ll} P(v,w)&{}\text {if }v,w\not =y_*\\ \frac{1}{2}\left( P(y,w)+P(y',w) \right) &{}\text {if }v=y_*\\ P(v,y)+P(v,y')&{}\text {if }w=y_*. \end{array}\right. } \end{aligned}$$

Let \(V^* \) denote the vertices of \(G^*\). Define the vector \(\zeta (v)\), \(v\in V^*\) via

$$\begin{aligned} \zeta (v)={\left\{ \begin{array}{ll}\pi _*(v)-\pi (v) &{} v\ne y_*\\ \pi _*(y_*) - (\pi (y)+\pi (y')) &{} v=y_* \end{array}\right. } \end{aligned}$$

We are going to show that

$$\begin{aligned} \max _{v\in V^*}|\zeta (v)| = o(\pi _\mathrm{min}\log ^{-1}(n)), \end{aligned}$$
(4.28)

which implies (4.26). A computation shows that

$$\begin{aligned}&\zeta P_*(w)=\sum _{v\in V^*}\zeta (v)P_*(v,w)\nonumber \\&\quad ={\left\{ \begin{array}{ll} \zeta (w)&{}\quad \text {if }w\not \in \mathcal B_1^+(y)\cup \mathcal B^+_1(y')\\ \zeta (w)+\frac{\pi (y')-\pi (y)}{2}P(y,w)&{}\quad \text {if }w\in \mathcal B_1^+(y)\\ \zeta (w)+\frac{\pi (y)-\pi (y')}{2}P(y',w)&{}\quad \text {if }w\in \mathcal B^+_1(y'). \end{array}\right. } \end{aligned}$$

Therefore, the vector \(\phi :=\zeta (I-P_*)\) satisfies

$$\begin{aligned} |\phi (w)|\le {\left\{ \begin{array}{ll} 0&{}\quad \text {if }\,w\not \in \mathcal B_1^+(y)\cup \mathcal B^+_1(y')\\ |\pi (y)-\pi (y')|&{}\quad \text {otherwise }. \end{array}\right. } \end{aligned}$$

Hence \(\phi (v)=0\) for all but at most \(2\Delta \) vertices v, and recalling (4.20) we have

$$\begin{aligned} |\phi (w)|\le \pi _\mathrm{min}\log ^{-K}(n). \end{aligned}$$
(4.29)

Next, consider the matrix

$$\begin{aligned} M=\sum _{s=0}^{T-1}P_*^s, \end{aligned}$$

and notice that

$$\begin{aligned} \zeta (I-P_*^T)=\phi M. \end{aligned}$$

Since \(P_*\) and \(\pi ^*\) satisfy condition (i) in Lemma 4.1,

$$\begin{aligned} P_*^T=\Pi _*+E, \quad \text {with} \quad |E(u,v)|\le n^{-3},\,\,\forall u,v\in V^*, \end{aligned}$$
(4.30)

where \(\Pi _*\) denotes the matrix with all rows equal to \(\pi _*\). We rewrite the vector \(\zeta \) as

$$\begin{aligned} \zeta =\alpha \pi _*+\rho , \end{aligned}$$

where \(\alpha \in {\mathbb R} \) and \(\rho \) is orthogonal to \(\pi _*\), that is

$$\begin{aligned} \langle \rho , \pi _*\rangle =\sum _{v\in V^*}\rho (v)\pi _*(v)=0. \end{aligned}$$

Therefore,

$$\begin{aligned} \langle \phi M , \rho \rangle = \langle \rho , (I-E) \rho \rangle . \end{aligned}$$

Moreover,

$$\begin{aligned} |\langle \phi M , \rho \rangle |\le \sum _{s=0}^{T-1}|\langle \phi , P_*^s\rho \rangle |\le T\frac{\pi _\mathrm{max}^*}{\pi _\mathrm{min}^*}\Vert \phi \Vert _2\Vert \rho \Vert _2, \end{aligned}$$
(4.31)

where we use

$$\begin{aligned} \langle P_*^s\psi , P_*^s\psi \rangle&\le \frac{1}{\pi _\mathrm{min}^*}\sum _{v}\pi ^*(v)(P_*^s\psi )^2(v)\\&\le \frac{1}{\pi _\mathrm{min}^*}\sum _{u,v}\pi ^*(v)P_*^s(v,u)\psi ^2(u)\\&=\frac{1}{\pi _\mathrm{min}^*}\sum _{u}\pi ^*(u)\psi ^2(u)\le \frac{\pi _\mathrm{max}^*}{\pi _\mathrm{min}^*}\Vert \psi \Vert _2^2, \end{aligned}$$

for any vector \(\psi :V^*\mapsto {\mathbb R} \). On the other hand,

$$\begin{aligned} |\langle \rho , (I-E) \rho \rangle |\ge \Vert \rho \Vert _2^2-n^{-3}\left( \sum _v |\rho (v)|\right) ^2 \ge \Vert \rho \Vert _2^2(1-n^{-2}). \end{aligned}$$
(4.32)

Using (4.29), from (4.31) and (4.32) we conclude that

$$\begin{aligned} \Vert \rho \Vert _2\le 2T\frac{\pi _\mathrm{max}^*}{\pi _\mathrm{min}^*}\Vert \phi \Vert _2=2T\frac{\pi _\mathrm{max}^*}{\pi _\mathrm{min}^*}\times O(\pi _\mathrm{min}\log ^{-K}(n)). \end{aligned}$$

From Theorem 1.3 applied to \(G^*\) we can assume that \( \frac{\pi _\mathrm{max}^*}{\pi _\mathrm{min}^*}=O(\log ^{K/3}(n))\) if K is a large enough constant. Since \(T=\log ^3(n)\), with K sufficiently large one has

$$\begin{aligned} \Vert \rho \Vert _2\le \pi _\mathrm{min}\log ^{-K/2}(n). \end{aligned}$$

Next, since both \(\pi \) and \(\pi _*\) are probability vectors, \(\zeta \) has zero sum:

$$\begin{aligned} 0= \langle \zeta , 1\rangle =\langle \alpha \pi _*+\rho , 1\rangle =\alpha +\langle \rho , 1\rangle . \end{aligned}$$

Hence

$$\begin{aligned} |\alpha |=|\langle \rho , 1\rangle | \le \sqrt{n}\,\Vert \rho \Vert _2\le \sqrt{n}\,\pi _\mathrm{min}\log ^{-K/2}(n). \end{aligned}$$

In conclusion,

$$\begin{aligned} \zeta (v)^2&\le 2\alpha ^2\pi _*(v)^2+2\rho (v)^2 \le 2n\pi _\mathrm{min}^2\log ^{-K}(n)(\pi _\mathrm{max}^*)^2+ 2\Vert \rho \Vert _2^2 \\&\le 2n\pi _\mathrm{min}^2\log ^{-K}(n)(\pi _\mathrm{max}^*)^2 + 2\pi _\mathrm{min}^2\log ^{-K}(n) \le 4 \pi _\mathrm{min}^2\log ^{-K}(n), \end{aligned}$$

which implies (4.28). \(\square \)

4.4 The Eulerian case

We prove Theorem 1.12. The strategy is the same as for the proof of Theorem 1.9, with some significant simplifications due to the explicit knowledge of the invariant measure \(\pi (x)=d_x/m\). For the upper bound, it is then sufficient to prove that, setting \(t_*=(1+\varepsilon )\beta n\log n\),

$$\begin{aligned} \sum _{y\in V_1}\sum _{s\ge t_*}\mathbf{P}_x(\mathcal A^T_y(s))+ \sum _{y\in V_2}\sum _{s\ge t_*}\mathbf{P}_x(\mathcal A^T_y(s))=o(n\log n). \end{aligned}$$
(4.33)

Letting \(\mathcal V_d\) denote the set of vertices with degree d, reasoning as in (4.11) we have

$$\begin{aligned} \sum _{y\in V_1}\sum _{s\ge t_*}\mathbf{P}_x(\mathcal A^T_y(s))\le (1+o(1))\sum _{d=\delta }^\Delta |\mathcal V_d| \frac{m}{d(1+(1+o(1))d/m)^{t_*}} \end{aligned}$$

Since \( |\mathcal V_d|=n^{\alpha _d+o(1)}\), \(m=\bar{d} n\), for any fixed \(\varepsilon >0\) we obtain

$$\begin{aligned} \sum _{y\in V_1}\sum _{s\ge t_*}\mathbf{P}_x(\mathcal A^T_y(s))\le \frac{2m}{\delta } \,\sum _{d=\delta }^\Delta \exp {\left( -\left( \tfrac{d\beta }{\bar{d}} - \alpha _d\right) \log n\right) } = O(n), \end{aligned}$$
(4.34)

since by definition \(\frac{d\beta }{\bar{d}} - \alpha _d\ge 0\). Concerning the vertices \(y\in V_2\), one may repeat the argument leading to (4.14) without modifications, to obtain

$$\begin{aligned} \sum _{y\in V_2}\sum _{s\ge t_*}\mathbf{P}_x(\mathcal A^T_y(s))=o(n\log n). \end{aligned}$$
(4.35)

Thus, (4.33) follows from (4.34) and (4.35).
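
For completeness, the exponent in (4.34) arises as follows: since \(m=\bar{d}n\) and \(t_*=(1+\varepsilon )\beta n\log n\),

$$\begin{aligned} \Big (1+(1+o(1))\tfrac{d}{m}\Big )^{-t_*}\le \exp {\Big (-(1+o(1))\,\tfrac{d\,t_*}{m}\Big )}=n^{-(1+o(1))(1+\varepsilon )\frac{d\beta }{\bar{d}}}\le n^{-\frac{d\beta }{\bar{d}}} \end{aligned}$$

for all n large enough, while \(|\mathcal V_d|\,\frac{m}{d}\le n^{\alpha _d+o(1)}\,\frac{m}{\delta }\); multiplying these bounds and summing over the boundedly many values of d gives (4.34).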

It remains to prove the lower bound. We shall prove that for any fixed d such that \(|\mathcal V_d|=n^{\alpha _d+o(1)}\), \(\alpha _d\in (0,1]\), for any \(\varepsilon >0\),

$$\begin{aligned} \min _{x\in [n]}\,\mathbf{P}_x\left( \tau _\mathrm{cov} \ge (1-\varepsilon )\frac{\bar{d}\alpha _d}{d}\, n\log n\right) =1-o(1). \end{aligned}$$
(4.36)

We proceed as in the proof of Lemma 4.7. Here we choose W as the subset of \(\mathcal V_d\) consisting of LTL vertices in the sense of Definition 4.2 and such that for all \(x,y\in W\) one has \(\min \{d(x,y),d(y,x)\}>2\vartheta \). Let us check that this set satisfies

$$\begin{aligned} |W|\ge n^{\alpha _d +o(1)}. \end{aligned}$$
(4.37)

Indeed, the vertices that are not LTL are at most \(\Delta ^{15\vartheta }\) by Proposition 4.3. Therefore there are at least \(|\mathcal V_d|-\Delta ^{15\vartheta }=n^{\alpha _d +o(1)}\) LTL vertices in \(\mathcal V_d\). Moreover, since there are at most \(\Delta ^{2\vartheta }\) vertices at undirected distance \(2\vartheta \) from any vertex, we can take a subset W of LTL vertices of \(\mathcal V_d\) satisfying the requirement that \(\min \{d(x,y),d(y,x)\}>2\vartheta \) for all \(x,y\in W\) and such that \(|W|\ge (|\mathcal V_d|-\Delta ^{15\vartheta })\Delta ^{-2\vartheta }=n^{\alpha _d +o(1)}.\) From here on all arguments can be repeated without modifications, with the simplification that we no longer need Lemma 4.9, since a can be taken to be zero in (4.26) in the Eulerian case. The only thing to control is the validity of the bound (4.23) with the choice

$$\begin{aligned} t=(1-\varepsilon )\frac{\bar{d}\alpha _d}{d}\, n\log n. \end{aligned}$$

As in (4.23), it suffices to check that with high probability

$$\begin{aligned} \sum _{y\in W}\mathbb {P}(\mathcal A^T_y(t))-T\rightarrow \infty . \end{aligned}$$
(4.38)

From (4.21) we obtain

$$\begin{aligned} \sum _{y\in W}\mathbb {P}\left( \mathcal A^T_y(t)\right) = (1+o(1))|W|\exp {\left( -\tfrac{(1+o(1))d}{m}\,t\right) }. \end{aligned}$$
(4.39)

Using (4.37) and \(dt/m = (1-\varepsilon )\alpha _d\log n\), (4.39) is at least \(n^{\varepsilon \alpha _d/2}\) for all n large enough. Since \(T=\log ^3(n)\) this proves (4.38).