Stationary distribution and cover time of sparse directed configuration models

We consider sparse digraphs generated by the configuration model with given in-degree and out-degree sequences. We establish that with high probability the cover time is linear up to a poly-logarithmic correction. For a large class of degree sequences we determine the exponent $\gamma \ge 1$ of the logarithm and show that the cover time grows as $n\log^{\gamma}(n)$, where $n$ is the number of vertices. The results are obtained by analysing the extremal values of the stationary distribution of the digraph. In particular, we show that the stationary distribution $\pi$ is uniform up to a poly-logarithmic factor, and that for a large class of degree sequences the minimal values of $\pi$ have the form $\frac1n\log ^{1-\gamma}(n)$, while the maximal values of $\pi$ behave as $\frac1n\log ^{1-\kappa}(n)$ for some other exponent $\kappa\in[0,1]$. In passing, we prove tight bounds on the diameter of the digraphs and show that the latter coincides with the typical distance between two vertices.


INTRODUCTION
The problem of determining the cover time of a graph is a central one in combinatorics and probability [5,4,20,3,25,17,18]. In recent years, the cover time of random graphs has been extensively studied [19,15,13,16,1]. All these works consider undirected graphs, with the notable exception of the paper [16] by Cooper and Frieze, where the authors compute the cover time of directed Erdős-Rényi random graphs in the regime of strong connectivity, that is with a logarithmically diverging average degree. The main difficulty in the directed case is that, in contrast with the undirected case, the graph's stationary distribution is an unknown random variable.
In this paper we address the problem of determining the cover time of sparse random digraphs with bounded degrees. More specifically, we consider random digraphs $G$ with given in- and out-degree sequences, generated via the configuration model. For the sake of this introductory discussion let us look at the special case where all vertices have either in-degree 2 and out-degree 3 or in-degree 3 and out-degree 2, with the two types evenly represented in the vertex set $V(G)$. We refer to this as the $(2,3)(3,2)$ case. With high probability $G$ is strongly connected and we may ask how long the random walk on $G$ takes to cover all the nodes. The expectation of this quantity, maximized over the initial point of the walk, defines $T_{\rm cov}(G)$, the cover time of $G$. We will show that with high probability as the number of vertices $n$ tends to infinity one has $T_{\rm cov}(G) \asymp n\log^{\gamma}(n)$ (1.1) where $\gamma = \frac{\log 3}{\log 2} \approx 1.58$, and $a_n \asymp b_n$ stands for $C^{-1} \le a_n/b_n \le C$ for some constant $C > 0$. The constant $\gamma$ can be understood in connection with the statistics of the extremal values of the stationary distribution $\pi$ of $G$. Indeed, following the theory developed by Cooper and Frieze, if the graphs satisfy suitable requirements, then the problem of determining the cover time can be reformulated in terms of the control of the minimal values of $\pi$. In particular, we will see that the hitting time of a vertex $x \in V(G)$ effectively behaves as an exponential random variable with parameter $\pi(x)$, and that to some extent these random variables are weakly dependent. This supports the heuristic picture that represents the cover time as the expected value of the maximum of $n$ independent exponential random variables, each with parameter $\pi(x)$, $x \in V(G)$. Controlling the stationary distribution is however a rather challenging task, especially if the digraphs have bounded degrees.
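As a rough numerical illustration of this heuristic picture, the following Python sketch evaluates the expected maximum of independent exponential random variables with prescribed rates, using the identity $E[\max_i T_i] = \int_0^\infty (1 - \prod_i (1 - e^{-\lambda_i t}))\,dt$. The function name, the truncation point and the trapezoidal rule are illustrative choices of ours and are not part of the argument in the paper.

```python
import math

def expected_max_exponentials(rates, steps=20000):
    """Approximate E[max_i T_i] for independent T_i ~ Exp(rates[i]).

    Uses E[max] = int_0^inf (1 - prod_i (1 - exp(-rate_i * t))) dt,
    truncated at t_max and evaluated with the trapezoidal rule.
    """
    lam_min = min(rates)
    # Beyond t_max the integrand is at most about len(rates) * exp(-40).
    t_max = (40.0 + math.log(len(rates))) / lam_min
    dt = t_max / steps

    def integrand(t):
        prod = 1.0
        for lam in rates:
            prod *= -math.expm1(-lam * t)  # equals 1 - exp(-lam * t)
        return 1.0 - prod

    total = 0.5 * (integrand(0.0) + integrand(t_max))
    for i in range(1, steps):
        total += integrand(i * dt)
    return total * dt
```

With all rates equal to $1/n$ the value is $n(1 + \frac12 + \dots + \frac1n) \approx n\log n$, the familiar coupon-collector scale; lowering a few of the rates below $1/n$ increases the expected maximum, which is the mechanism by which the small values of $\pi$ enter the cover time.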
Recently, Bordenave, Caputo and Salez [8] analyzed the mixing time of sparse random digraphs with given degree sequences and their work provides some important information on the distribution of the values of $\pi$. In particular, in the $(2,3)(3,2)$ case, the empirical distribution of the values $\{n\pi(x),\ x \in V(G)\}$ converges as $n \to \infty$ to the probability law $\mu$ on $[0,\infty)$ of the random variable $X$ given by (1.2), where $N$ is the random variable with $N = 2$ with probability $\frac12$ and $N = 3$ with probability $\frac12$, and the $Z_k$ are independent and identically distributed mean-one random variables uniquely determined by the recursive distributional equation (1.3), where $M$ is the random variable with $M = 2$ with probability $2/5$ and $M = 3$ with probability $3/5$, independent of the $Z_k$'s, and $\overset{d}{=}$ denotes equality in distribution. This gives convergence of the distribution of the bulk values of $\pi$, that is of the values of $\pi$ on the scale $1/n$. What enters in the cover time analysis are however the extremal values, notably the minimal ones, and thus what is needed is a local convergence result towards the left tail of $\mu$, which cannot be extracted from the analysis in [8]. To obtain a heuristic guess of the size of the minimal values of $\pi$ at large but finite $n$ one may pretend that the values of $n\pi$ are $n$ i.i.d. samples from $\mu$. This would imply that $\pi_{\min}$, the minimal value of $\pi$, is such that $n\pi_{\min} \sim \varepsilon(n)$ where $\varepsilon(n)$ is a sequence for which $n\mu([0,\varepsilon(n)]) \sim 1$, if $\mu([0,x])$ denotes the mass given by $\mu$ to the interval $[0,x]$.
Recursive distributional equations of the form (1.3) are well studied, and many properties of the distribution $\mu$ can be derived. In particular, it has been shown by Liu [24] that the left tail of $\mu$ is of the form $\log \mu([0,x]) \asymp -x^{-\alpha}$, $x \to 0^+$, where $\alpha = 1/(\gamma - 1)$, with the coefficient $\gamma$ taking the value $\gamma = \frac{\log 3}{\log 2}$ in the $(2,3)(3,2)$ case. Thus, returning to our heuristic reasoning, one has that the minimal value of $\pi$ should satisfy $n\pi_{\min} \asymp \log^{1-\gamma}(n)$. (1.4)
Moreover, this argument also predicts that with high probability there should be at least $n^{\beta}$ vertices $x \in V(G)$, for some constant $\beta > 0$, such that $n\pi(x)$ is as small as $O(\log^{1-\gamma}(n))$.
A similar heuristic argument, this time based on the analysis of the right tail of $\mu$, see [22,23], predicts that $\pi_{\max}$, the maximal value of $\pi$, should satisfy $n\pi_{\max} \asymp \log^{1-\kappa}(n)$, (1.5) where $\kappa$ takes the value $\kappa = \frac{\log 2}{\log 3} \approx 0.63$ in the $(2,3)(3,2)$ case. Our main results below will confirm these heuristic predictions. The proof involves the analysis of the statistics of the in-neighbourhoods of a node. Roughly speaking, it will be seen that the smallest values of $\pi$ are achieved at vertices $x \in V(G)$ whose in-neighbourhood at distance $\log_2 \log n$ is a directed tree composed entirely of vertices with in-degree 2 and out-degree 3, while the maximal values of $\pi$ are achieved at $x \in V(G)$ whose in-neighbourhood at distance $\log_3 \log n$ is a directed tree composed entirely of vertices with in-degree 3 and out-degree 2. Once the results (1.4) and (1.5) are established, the cover time asymptotic (1.1) will follow from an appropriate implementation of the Cooper-Frieze approach.
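The exponents in (1.4) and (1.5) can be checked against this tree picture with a back-of-the-envelope computation. In a directed in-tree of depth $h$ in which every vertex has in-degree $a$ and out-degree $b$, there are $a^h$ boundary vertices, each joined to the root by a single path that the walk follows with probability $b^{-h}$, for a total weight $(a/b)^h$. The Python sketch below, whose function names are ours and which ignores all constants, verifies that at depth $h = \log_a \log n$ this weight equals $\log^{1 - \log b/\log a}(n)$.

```python
import math

def tree_weight_heuristic(n, in_deg, out_deg):
    """(in_deg / out_deg) ** h at depth h = log_{in_deg}(log n)."""
    h = math.log(math.log(n)) / math.log(in_deg)
    return (in_deg / out_deg) ** h

def log_power(n, in_deg, out_deg):
    """log(n) ** (1 - exponent) with exponent = log(out_deg) / log(in_deg)."""
    exponent = math.log(out_deg) / math.log(in_deg)
    return math.log(n) ** (1.0 - exponent)
```

With $(a,b) = (2,3)$ the exponent is $\gamma = \log 3/\log 2$, matching (1.4), and with $(a,b) = (3,2)$ it is $\kappa = \log 2/\log 3$, matching (1.5).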
We conclude this preliminary discussion by comparing our estimates (1.4) and (1.5) with related results for different random graph models. The asymptotic of extremal values of $\pi$ has been determined in [16] for the directed Erdős-Rényi random graphs with logarithmically diverging average degree. There, the authors show that $n\pi_{\min}$ and $n\pi_{\max}$ are essentially of order 1, which can be interpreted as a concentration property enforced by the divergence of the degrees. On the other hand, for uniformly random out-regular digraphs, that is with constant out-degrees but random in-degrees, the recent paper [2] shows that the stationary distribution restricted to the strongly connected component satisfies $n\pi_{\min} = n^{-\eta+o(1)}$, where $\eta$ is a computable constant, and $n\pi_{\max} = n^{o(1)}$. Indeed, in this model, in contrast with our setting, one can have in-neighborhoods made by long and thin filaments which determine a power law deviation from uniformity.
We now turn to a more systematic exposition of our results. The directed configuration model DCM($d^{\pm}$) is the distribution of the random digraph $G$ with vertex set $V(G) = [n]$ obtained by the following procedure: 1) equip each node $x$ with $d_x^+$ tails and $d_x^-$ heads; 2) pick uniformly at random one of the $m!$ bijective maps from the set of all tails into the set of all heads, call it $\omega$; 3) for all $x, y \in [n]$, add a directed edge $(x,y)$ every time a tail from $x$ is mapped into a head from $y$ through $\omega$. The resulting digraph $G$ may have self-loops and multiple edges; however, it is classical that by conditioning on the event that there are no multiple edges and no self-loops $G$ has the uniform distribution among simple digraphs with in-degree sequence $d^-$ and out-degree sequence $d^+$.
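The three steps of this procedure can be sketched in Python as follows. The function name is ours, vertices are labeled $0,\dots,n-1$ rather than $1,\dots,n$, and the uniform bijection $\omega$ is realized by shuffling the list of heads and pairing it with the list of tails in order.

```python
import random
from collections import Counter

def sample_dcm(out_deg, in_deg, rng=None):
    """Sample a digraph from the directed configuration model.

    Returns a Counter mapping each pair (x, y) to the multiplicity of the
    directed edge x -> y. Self-loops and multiple edges may occur.
    """
    rng = rng or random.Random()
    if sum(out_deg) != sum(in_deg):
        raise ValueError("in-degrees and out-degrees must have the same sum m")
    # Step 1: vertex x receives out_deg[x] tails and in_deg[x] heads.
    tails = [x for x, d in enumerate(out_deg) for _ in range(d)]
    heads = [y for y, d in enumerate(in_deg) for _ in range(d)]
    # Step 2: a uniform shuffle of the heads gives a uniform bijection.
    rng.shuffle(heads)
    # Step 3: a tail at x matched to a head at y yields the edge (x, y).
    return Counter(zip(tails, heads))
```

For instance, `sample_dcm([3, 2] * 50, [2, 3] * 50)` produces an instance of the $(2,3)(3,2)$ case on 100 vertices.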
Structural properties of digraphs obtained in this way have been studied in [12]. Here we consider the sparse case corresponding to bounded degree sequences and, in order to avoid non-irreducibility issues, we shall assume that all degrees are at least 2. Thus, from now on it will always be assumed that $\delta_\pm := \min_{x\in[n]} d_x^{\pm} \ge 2$ and $\Delta_\pm := \max_{x\in[n]} d_x^{\pm} = O(1)$. (1.7) Under the first assumption it is known that DCM($d^{\pm}$) is strongly connected with high probability. Under the second assumption, it is known that DCM($d^{\pm}$) has a uniformly (in $n$) positive probability of having no self-loops nor multiple edges. In particular, any property that holds with high probability for DCM($d^{\pm}$) will also hold with high probability for a uniformly random simple digraph with degrees given by $d^-$ and $d^+$ respectively. Here and throughout the rest of the paper we say that a property holds with high probability (w.h.p. for short) if the probability of the corresponding event converges to 1 as $n \to \infty$. The (directed) distance $d(x,y)$ from $x$ to $y$ is the minimal number of edges that need to be traversed to reach $y$ from $x$. The diameter is the maximal distance between two distinct vertices, i.e.

$\mathrm{diam}(G) = \max_{x \neq y} d(x,y)$. (1.8)
We begin by showing that the diameter $\mathrm{diam}(G)$ concentrates around the value $c \log n$ within a $O(\log\log n)$ window, where $c$ is given by $c = 1/\log\nu$ and $\nu$ is the parameter defined by (1.9) Moreover, for any $x, y \in [n]$ The proof of Theorem 1.1 is a directed version of a classical argument for undirected graphs [7]. It requires controlling the size of in- and out-neighborhoods of a node, which in turn follows ideas from [2] and [8]. The value $d_\star = \log_\nu n$ can be interpreted as follows: both the in- and the out-neighborhood of a node are tree-like with average branching given by $\nu$, so that their boundary at depth $h$ has typically size $\nu^h$, see Lemma 2.9; if the in-neighborhood of $y$ and the out-neighborhood of $x$ are exposed up to depth $h$, one finds that the value $h = \frac12 \log_\nu(n)$ is critical for the formation of an arc connecting the two neighborhoods.
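For a given realization of the digraph, the directed distance and the diameter defined in (1.8) can be computed by breadth-first search. The following Python sketch assumes, as our own convention, that the digraph is stored as a dictionary mapping each vertex to the list of its out-neighbors; it returns infinity when the digraph is not strongly connected.

```python
from collections import deque

def directed_distances(adj, source):
    """Directed distances d(source, y) for all y reachable from source."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj.get(u, ()):
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

def diameter(adj):
    """Maximum of d(x, y) over ordered pairs of distinct vertices."""
    vertices = set(adj) | {v for nbrs in adj.values() for v in nbrs}
    best = 0
    for x in vertices:
        dist = directed_distances(adj, x)
        if len(dist) < len(vertices):
            return float("inf")  # some vertex is unreachable from x
        best = max(best, max(dist.values()))
    return best
```

On the directed 3-cycle `{0: [1], 1: [2], 2: [0]}` the diameter is 2, since $d(x,y)$ and $d(y,x)$ need not coincide in a digraph.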
In particular, Theorem 1.1 shows that w.h.p. the digraph is strongly connected, so there exists a unique stationary distribution $\pi$ characterized by the equation $\pi(x) = \sum_{y=1}^{n} \pi(y)P(y,x)$, $x \in [n]$, with the normalization $\sum_{x=1}^{n} \pi(x) = 1$. Here $P$ is the transition matrix of the simple random walk on $G$, namely $P(y,x) = m(y,x)/d_y^+$, (1.13) and we write $m(y,x)$ for the multiplicity of the edge $(y,x)$ in the digraph $G$. If the sequences $d^{\pm}$ are such that $d_x^+ = d_x^-$ for all $x \in [n]$, then the stationary distribution is given by $\pi(x) = d_x^{\pm}/m$. (1.14) The digraph is called Eulerian in this case. In all other cases the stationary distribution is a nontrivial random variable. To discuss our results on the extremal values of $\pi$ it is convenient to introduce the following notation.
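The Eulerian case can be checked by hand on a small example. The Python sketch below builds the transition matrix of the simple random walk from a list of directed edges (repeated edges encode multiplicities) and verifies with exact rational arithmetic that the degree-proportional vector is stationary. The helper names and the 3-vertex multigraph are illustrative choices of ours.

```python
from fractions import Fraction

def transition_matrix(n, edges):
    """P[x][y] = m(x, y) / d_x^+ for a list of directed edges (x, y)."""
    out_deg = [0] * n
    mult = [[0] * n for _ in range(n)]
    for x, y in edges:
        out_deg[x] += 1
        mult[x][y] += 1
    return [[Fraction(mult[x][y], out_deg[x]) for y in range(n)] for x in range(n)]

def is_stationary(pi, P):
    """Check the equation pi(y) = sum_x pi(x) P(x, y) for every y."""
    n = len(pi)
    return all(sum(pi[x] * P[x][y] for x in range(n)) == pi[y] for y in range(n))

# Eulerian multigraph: in-degree = out-degree = 3, 2, 2 at vertices 0, 1, 2.
edges = [(0, 1), (0, 1), (0, 2), (1, 0), (1, 2), (2, 0), (2, 0)]
P = transition_matrix(3, edges)
pi = [Fraction(d, len(edges)) for d in (3, 2, 2)]
```

Here `is_stationary(pi, P)` holds, while the uniform vector fails the same check, which illustrates that even in the Eulerian case $\pi$ is uniform only when the digraph is regular.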
The assumption (1.7) implies that the number of distinct types is bounded by a fixed constant $C$ independent of $n$, that is $|\mathcal{C}| \le C$. We say that the type $(i,j)$ has linear size, if We call $\mathcal{L} \subset \mathcal{C}$ the set of types with linear size, and define the parameters (1.16)
Theorem 1.3. Set $\pi_{\min} = \min_{x\in[n]} \pi(x)$. There exists a constant $C > 0$ such that Moreover, there exists $\beta > 0$ such that
If the sequences $d^{\pm}$ are such that $(\delta_-, \Delta_+) \in \mathcal{L}$, then $\gamma_0 = \gamma_1 =: \gamma$, so in these cases Theorem 1.3 implies that In all other cases, the estimate (1.17) can be strengthened by replacing $\gamma_0$ with $\gamma_0'$ where (1.20) and $\mathcal{L}_0 \subset \mathcal{C}$ is defined as the set of $(k,\ell) \in \mathcal{C}$ such that We refer to Remark 3.9 below for additional details on this improvement.
Concerning the maximal values of π we establish the following estimates.
Theorem 1.5. Set $\pi_{\max} = \max_{x\in[n]} \pi(x)$. There exists a constant $C > 0$ such that Moreover, there exists $\beta > 0$ such that (1.23)
Remark 1.6. Notice that $\kappa_0 \le \kappa_1 \le 1$. If the sequences $d^{\pm}$ are such that $(\Delta_-, \delta_+) \in \mathcal{L}$, then $\kappa_0 = \kappa_1 =: \kappa$, and in these cases Theorem 1.5 implies
We turn to a description of our results concerning the cover time. Let $X_t$, $t = 0, 1, 2, \dots$, denote the simple random walk on the digraph $G$, that is the Markov chain with transition matrix $P$ defined in (1.13). Consider the hitting times The cover time $T_{\rm cov} = T_{\rm cov}(G)$ is defined by where $E_x$ denotes the expectation with respect to the law of the random walk $(X_t)$ with initial point $X_0 = x$ in a fixed realization of the digraph $G$. Let $\gamma_0, \gamma_1$ be as in Definition 1.2.
Theorem 1.7. There exists a constant $C > 0$ such that , then the estimates in Theorem 1.7 can be refined considerably, and one obtains results that are at the same level of precision as those already established in the case of random undirected graphs [1].
. Call $V_d$ the set of vertices of degree $d$, and write $\bar d = m/n$ for the average degree. Assume for some constants $\alpha_d \in [0,1]$, for each type $d$. Then, In particular, if all present types have linear size then $\alpha_d \in \{0,1\}$ for all $d$ and (1.31) holds with $\beta = \bar d/\delta$, where $\delta$ is the minimum degree. In any case it is not difficult to see that $\beta \ge 1$, since $\bar d$ is determined only by types with linear size. For some general bounds on cover times of Eulerian graphs we refer to [6].
The rest of the paper is divided into three sections. The first is a collection of preliminary structural facts about the directed configuration model. It also includes the proof of Theorem 1.1. The second section is the core of the paper. There we establish Theorem 1.3 and Theorem 1.5. The last section contains the proof of the cover time results Theorem 1.7 and Theorem 1.9.

NEIGHBORHOODS AND DIAMETER
We start by recalling some simple facts about the directed configuration model.

2.1. Sequential generation.
Each vertex $x$ has $d_x^-$ labeled heads and $d_x^+$ labeled tails, and we call $E_x^-$ and $E_x^+$ the sets of heads and tails at $x$ respectively. The uniform bijection $\omega$ between heads and tails, viewed as a matching, can be sampled by iterating the following steps until there are no unmatched heads left: 1) pick an unmatched head $f \in E^-$ according to some priority rule; 2) pick an unmatched tail $e \in E^+$ uniformly at random; 3) match $f$ with $e$, i.e. set $\omega(f) = e$, and call $ef$ the resulting edge.
This gives the desired uniform distribution over matchings $\omega : E^- \to E^+$ regardless of the priority rule chosen at step 1. The digraph $G$ is obtained by adding a directed edge $(x,y)$ whenever $f \in E_y^-$ and $e \in E_x^+$ in step 3 above.

2.2. In-neighborhoods and out-neighborhoods. We will use the notation For any $h \in \mathbb{N}$, the $h$-in-neighborhood of a vertex $y$, denoted $B_h^-(y)$, is the digraph defined as the union of all directed paths of length $\ell \le h$ in $G$ which terminate at vertex $y$. In the sequel a path is always understood as a sequence of directed edges $(e_1 f_1, \dots, e_k f_k)$ such that $v_{f_i} = v_{e_{i+1}}$ for all $i = 1, \dots, k-1$, and we use the notation $v_e$ (resp. $v_f$) for the vertex $x$ such that $e \in E_x^+$ (resp. $f \in E_x^-$). To generate the random variable $B_h^-(y)$, we use the following breadth-first procedure. Start at vertex $y$ and run the sequence of steps described above, by giving priority to those unmatched heads which have minimal distance to vertex $y$, until this minimal distance exceeds $h$, at which point the process stops. Similarly, for any $h \in \mathbb{N}$, the $h$-out-neighborhood of a vertex $x$, denoted $B_h^+(x)$, is defined as the subgraph induced by the set of directed paths of length $\ell \le h$ which start at vertex $x$. To generate the random variable $B_h^+(x)$, we use the same breadth-first procedure described above except that we invert the role of heads and tails. With slight abuse of notation we sometimes write $B_h^{\pm}(x)$ for the vertex set of $B_h^{\pm}(x)$. We also warn the reader that to simplify the notation we often avoid taking explicitly the integer part of the various parameters entering our proofs. In particular, whenever we write $B_h^{\pm}(x)$ it is always understood that $h \in \mathbb{N}$. During the generation process of the in-neighborhood, say that a collision occurs whenever a tail gets chosen, whose end-point $x$ was already exposed, in the sense that some tail in $E_x^+$ or head in $E_x^-$ had already been matched. Since fewer than $2k$ vertices are exposed when the $k$-th tail gets matched, fewer than $2\Delta k$ of the $m - k + 1$ possible choices can result in a collision. Thus, the conditional chance that the $k$-th step causes a collision, given the past, is less than $p_k = \frac{2\Delta k}{m-k+1}$.
It follows that the number $Z_k$ of collisions caused by the first $k$ arcs is stochastically dominated by the binomial random variable $\mathrm{Bin}(k, p_k)$. In particular, Notice that as long as no collision occurs, the resulting digraph is a directed tree. The same applies to out-neighborhoods simply by inverting the role of heads and tails. For any digraph $G$, define the tree excess of $G$ as $\mathrm{TX}(G) = 1 + |E| - |V|$, where $E$ is the set of directed edges and $V$ is the set of vertices of $G$. In particular, $\mathrm{TX}(B_h^{\pm}(x)) = 0$ iff $B_h^{\pm}(x)$ is a directed tree, and $\mathrm{TX}(B_h^{\pm}(x)) \le 1$ iff there is at most one collision during the generation of the neighborhood $B_h^{\pm}(x)$. Define the events for any $x \in [n]$. In particular, Proof. During the generation of $B_h^-(x)$ one creates at most $\Delta^h$ edges. It follows from (2.2) with $\ell = 2$ that the probability of the complement of G (2.6) The conclusion follows from the union bound.
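The breadth-first generation of an in-neighborhood, together with the count of collisions and the tree excess, can be sketched in Python as follows. The function names and the 0-based vertex labels are ours; a vertex is treated as exposed once it has been reached by the exploration, the root included, so that the number of collisions coincides with the tree excess of the exposed digraph.

```python
import random
from collections import deque

def explore_in_neighborhood(out_deg, in_deg, y, h, rng=None):
    """Expose the h-in-neighborhood of y by sequential generation.

    Heads are matched in breadth-first order from y, each with a uniformly
    random unmatched tail. Returns (vertices, edges, collisions).
    """
    rng = rng or random.Random()
    tails = [x for x, d in enumerate(out_deg) for _ in range(d)]  # unmatched tails
    depth = {y: 0}
    edges = []
    collisions = 0
    queue = deque([y])
    while queue:
        v = queue.popleft()
        if depth[v] == h:
            continue  # heads at distance h from y stay unmatched
        for _ in range(in_deg[v]):
            # Draw a uniform unmatched tail and remove it from the pool.
            i = rng.randrange(len(tails))
            tails[i], tails[-1] = tails[-1], tails[i]
            u = tails.pop()
            edges.append((u, v))
            if u in depth:
                collisions += 1  # the tail belongs to an exposed vertex
            else:
                depth[u] = depth[v] + 1
                queue.append(u)
    return set(depth), edges, collisions

def tree_excess(num_vertices, num_edges):
    """TX = 1 + |E| - |V|, with multiple edges counted separately."""
    return 1 + num_edges - num_vertices
```

Each matching either reveals a new vertex or produces a collision, so `tree_excess(len(vertices), len(edges))` equals the number of collisions, and the exposed digraph is a directed tree exactly when no collision has occurred.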
We will need to control the size of the boundary of our neighborhoods. To this end, we introduce the notation $\partial B_t^-(y)$ for the set of vertices $x \in [n]$ such that $d(x,y) = t$. Similarly, $\partial B_t^+(x)$ is the set of vertices $y \in [n]$ such that $d(x,y) = t$. Clearly, $|\partial B_h^{\pm}(y)| \le \Delta^h$ for any $y \in [n]$ and $h \in \mathbb{N}$. Lemma 2.2. There exists $\chi > 0$ such that for all $y \in [n]$, Proof. By symmetry we may restrict to the case of in-neighborhoods. By (2.6) it is sufficient to show that $|\partial B_h^{\pm}(y)| \ge \frac12 \delta_{\pm}^h$, for all $h \in [1, \hbar]$, if $G_y(\hbar)$ holds. If the tree excess of the $h$-in-neighborhood $B_h^-(y)$ is at most 1 then there is at most one collision in the generation of $B_h^-(y)$. This collision can be of two types: (1) there exists some $1 \le t \le h$ and a $v \in \partial B_t^-(y)$ s.t. $v$ has two out-neighbors $w, w' \in \partial B_{t-1}^-(y)$; (2) there exists some $0 \le t \le h$ and a $v \in \partial B_t^-(y)$ s.t. $v$ has an in-neighbor $w$ in $B_t^-(y)$. The first case can be further divided into two cases: a) $w = w'$, and b) $w \neq w'$; see Figure 1.
In case 1a) we note that the $(h-t)$-in-neighborhood of $v$ must be a directed tree with at least $\delta_-^{h-t}$ elements on its boundary and with no intersection with the $(h-t)$-in-neighborhoods of other $v' \in \partial B_t^-(y)$. Moreover, $B_{t-1}^-(y)$ must be a directed tree with $|\partial B_{t-1}^-(y)| \ge \delta_-^{t-1}$, and all elements of $\partial B_{t-1}^-(y)$ except one have disjoint $(h-t+1)$-in-neighborhoods with $\delta_-^{h-t+1}$ elements on their boundary. Therefore In case 1b) one has that $t \ge 2$, $B_{t-1}^-(y)$ is a directed tree with $|\partial B_{t-1}^-(y)| \ge \delta_-^{t-1}$, and for all $z \in \partial B_t^-(y)$, the $(h-t)$-in-neighborhoods of $z$ are disjoint directed trees with at least $\delta_-^{h-t}$ elements on their boundary. Since Collisions of type 2 can be further divided into two types: a) $w \in \partial B_s^-(y)$ with $s < t$ and there is no path from $v$ to $w$ of length $t-s$, or $w \in \partial B_t^-(y)$ and $w \neq v$, and b) $w \in \partial B_s^-(y)$ with $s < t$ and there is a path from $v$ to $w$ of length $t-s$, or $w = v$. Note that in contrast with collisions of type 2a), a collision of type 2b) creates a directed cycle within $B_t^-(y)$; see Figure 2 and Figure 3. We remark that in either case 2a) or case 2b), $\partial B_t^-(y)$ has at least $\delta_-^t$ elements, and the vertex $v \in \partial B_t^-(y)$ has at least $\delta_- - 1$ in-neighbors whose $(h-t-1)$-in-neighborhoods are disjoint directed trees. All other $v' \in \partial B_t^-(y)$ have $(h-t)$-in-neighborhoods that are disjoint directed trees. Therefore, in case 2): We shall need a more precise control of the size of $\partial B_h^{\pm}(y)$, and for values of $h$ that are larger than $\hbar$. Recall the definition (1.9) of the parameter $\nu$. We use the following notation in the sequel: Proof. We run the proof for the in-neighborhood only since the case of the out-neighborhood is obtained in the same way. We generate $B_h^-(y)$, $h \in [\ell_0, h_\eta]$, sequentially in a breadth-first fashion. After the depth $j$ neighborhood $B_j^-(y)$ has been sampled, we call $F_j$ the set of all heads attached to vertices in $\partial B_j^-(y)$. Set $u = \log^{-7/8}(n)$.
For any h ≥ ℓ 0 define We are going to prove Notice that, choosing suitable constants c 1 , c 2 > 0, (2.9) is a consequence of (2.11). For h > ℓ 0 we write

Consider the events
(2.14) To estimate P(A j |A j−1 ), note that A j−1 depends only on the in-neighborhood B − j−1 (y), so if σ j−1 denotes a realization of B − j−1 (y) with a slight abuse of notation we write σ j−1 ∈ A j−1 if A j−1 occurs for this given σ j−1 . Then Therefore, to prove a lower bound on P(A j |A j−1 ) it is sufficient to prove a lower bound on P(A j |σ j−1 ) that is uniform over all σ j−1 ∈ A j−1 . Suppose we have generated the neighborhood σ j−1 up to depth j − 1, for a σ j−1 ∈ A j−1 . In some arbitrary order we now generate the matchings of all heads f ∈ F j−1 . We define the random variable was not yet exposed, and evaluates to zero otherwise. In this way Therefore, To sample the variables X (j) f , at each step we pick a tail uniformly at random among all unmatched tails and evaluate the in-degree of its end point if it is not yet exposed. Since σ j−1 ∈ A j−1 , at any such step the number of exposed vertices is at most K = O(n 1−η/2 ). In particular, for any f ∈ F j−1 and any d ∈ [δ, ∆], σ j−1 ∈ A j−1 : where [·] + denotes the positive part. This shows that X (j) f stochastically dominates the random variable Y (j) and is stochastically dominated by the random variable Notice that Similarly, i , and is stochastically dominated by An application of Hoeffding's inequality shows that the probability of the events (2.20) and (2.21) is bounded by e −cu 2 κ j−1 and e −cu 2 κ j−1 respectively, for some absolute constant c > 0. Hence, from (2.17) we conclude that for some constant c > 0: Therefore, using uniformly in j ∈ [ℓ 0 , h η ] and σ j−1 ∈ A j−1 . By (2.15) the same bound applies to P(A j |A j−1 ) and going back to (2.14), for h = h η we have obtained We shall also need the following refinement of Lemma 2.3. Define the events . Let G(ℏ) be the event from Proposition 2.1.

Lemma 2.4.
For every $\eta \in (0,1)$, there exist constants $c_1, c_2 > 0$, $\chi > 0$ such that for all $y \in [n]$, Proof. By symmetry we may prove the inequality for the event $F_y^-$ only. Consider the set $\mathcal{D}_y^-$ of all possible 2-in-neighborhoods of $y$ compatible with the event $G(\hbar)$, that is the set of labeled digraphs $D$ such that Thus it is sufficient to prove that To this end, we may repeat exactly the same argument as in the proof of Lemma 2.3 with the difference that now we condition from the start on the event $B_2^-(y) = D$ for a fixed $D \in \mathcal{D}_y^-$. The key observation is that (2.13) can be strengthened to To prove (2.28) notice that if the 2-in-neighborhood of $y$ is given by $B_2^-(y) = D \in \mathcal{D}_y^-$ then the set $F_2^-(y)$ has at least 4 elements. Therefore, taking a sufficiently large constant $C$, for the event $|F_{\ell_0}^-(y)| \ge \delta^{\ell_0}/C$ to fail it is necessary to have at least 3 collisions in the generation of $B_t^-(y)$, $t \in \{3, \dots, \ell_0\}$. From the estimate (2.2) the probability of this event is bounded by $p_k^3 k^3$ with $k = \Delta^{\ell_0}$, which implies (2.28) if $\chi \in (0,1)$. Once (2.28) is established, the rest of the proof is a repetition of the argument in (2.14)-(2.22).

2.3.
Upper bound on the diameter. The upper bound in Theorem 1.1 is reformulated as follows.

Lemma 2.5.
There exist constants $C, \chi > 0$ such that if $\varepsilon_n = \frac{C \log\log(n)}{\log(n)}$, Proof. From Proposition 2.1 we may restrict to the event $G(\hbar)$. From the union bound Let us use sequential generation to sample first $B_k^+(x)$ and then $B_{k-1}^-(y)$. Call $\sigma$ a realization of these two neighborhoods. Consider the event (2.32) The event $E_{x,y}$ depends only on $\sigma$. We say that $\sigma \in U_{x,y} \cap E_{x,y}$ if $\sigma$ is such that both $E_{x,y}$ and $U_{x,y}$ occur. Thus, we write The event $E_{x,y}$ implies that all vertices on $\partial B_{k-1}^-(y)$ have all their heads unmatched and the same holds for all the tails of vertices in $\partial B_k^+(x)$. Call $F_{k-1}$ the heads attached to vertices in $\partial B_{k-1}^-(y)$ and $E_k$ the tails attached to vertices in $\partial B_k^+(x)$. The event $d(x,y) > (1+\varepsilon_n) d_\star$ implies that there are no matchings between $F_{k-1}$ and $E_k$. The probability of this event is dominated by if $n$ is large enough and $\varepsilon_n = C \log\log n/\log n$ with $C$ large enough. Therefore, uniformly in $\sigma \in U_{x,y} \cap E_{x,y}$, . Inserting this in (2.30)-(2.31) completes the proof.

2.4. Lower bound on the diameter.
We prove the following lower bound on the diameter. Note that Lemma 2.5 and Lemma 2.6 imply Theorem 1.1. Lemma 2.6. There exists $C > 0$ such that taking $\varepsilon_n = \frac{C \log\log(n)}{\log(n)}$, for any $x, y \in [n]$, We start by sampling the out-neighborhood of $x$ up to distance $\ell$. Consider the event for suitable constants $c_2, \chi > 0$, and therefore $\log^{c_2}(n)$ attempts to collide with $\partial B_\ell^+(x)$, each of which with success probability at most $\Delta K/m$, and therefore where we take the constant $C$ in the definition of $\varepsilon_n$ sufficiently large. In conclusion, , and the inequalities (2.35)-(2.38) end the proof.

STATIONARY DISTRIBUTION
We start by recalling some key facts established in [8].
3.1. Convergence to stationarity. Let $P^t(x,\cdot)$ denote the distribution after $t$ steps of the random walk started at $x$. The total variation distance between two probabilities $\mu, \nu$ on $[n]$ is defined as Let the entropy $H$ and the associated entropic time $T_{\rm ENT}$ be defined by Note that under our assumptions on $d^{\pm}$, the deterministic quantities $H$, $T_{\rm ENT}$ satisfy $H = \Theta(1)$ and $T_{\rm ENT} = \Theta(\log n)$. Theorem 1 of [8] states that where $\vartheta$ denotes the step function $\vartheta(s) = 1$ if $s < 1$ and $\vartheta(s) = 0$ if $s > 1$, and we use the notation $\xrightarrow{\ \mathbb{P}\ }$ for convergence in probability as $n \to \infty$. In words, convergence to stationarity for the random walk on the directed configuration model displays with high probability a cutoff phenomenon, uniformly in the starting point, with mixing time given by the entropic time $T_{\rm ENT}$. We remark that, by Jensen's inequality, the mixing time $T_{\rm ENT} = \frac{\log n}{H}$ is always at least as large as the diameter $d_\star = \frac{\log n}{\log \nu}$ in Theorem 1.1, with equality if and only if the sequence is out-regular, that is $d_x^+ \equiv d$. Thus, the analysis of convergence to stationarity requires investigating the graph on a length scale that may well exceed the diameter. Considering all possible paths on this length scale is not practical, and we shall rely on a powerful construction of [8] that allows one to restrict to a subset of paths with a tree structure, see Section 3.3.1 below for the details.
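The comparison between the entropic time and the diameter can be illustrated numerically. The Python sketch below relies on two formulas that are assumptions on our part, since the corresponding displays are not reproduced above: the entropy $H = \sum_x \frac{d_x^-}{m} \log d_x^+$ as in [8], and the branching parameter $\nu = \frac1m \sum_x d_x^- d_x^+$. Under these assumptions Jensen's inequality gives $H \le \log\nu$, hence $T_{\rm ENT} \ge d_\star$.

```python
import math

def entropy_and_branching(out_deg, in_deg):
    """Return (H, nu) under the ASSUMED definitions
    H  = sum_x (d_x^- / m) * log(d_x^+),
    nu = (1 / m) * sum_x d_x^- * d_x^+.
    """
    m = sum(out_deg)
    H = sum(d_in / m * math.log(d_out) for d_out, d_in in zip(out_deg, in_deg))
    nu = sum(d_in * d_out for d_out, d_in in zip(out_deg, in_deg)) / m
    return H, nu

def entropic_time_and_diameter_scale(out_deg, in_deg):
    """Return (log n / H, log n / log nu) for the given degree sequences."""
    n = len(out_deg)
    H, nu = entropy_and_branching(out_deg, in_deg)
    return math.log(n) / H, math.log(n) / math.log(nu)
```

In the $(2,3)(3,2)$ case one finds $\nu = 2.4$ and $H = \frac25 \log 3 + \frac35 \log 2 < \log \nu$, so the entropic time strictly exceeds $d_\star$, whereas for an out-regular sequence the two quantities coincide.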
3.2. The local approximation. A consequence of the arguments of [8] is that the unknown stationary distribution at a node y admits an approximation in terms of the in-neighborhood of y at a distance that is much smaller than the mixing time. More precisely, it follows from [8, Theorem 3] that for any sequence t n → ∞ (3.4) where we use the notation µ in for the in-degree distribution and for any probability µ on [n], µP t is the distribution We refer to [10, Lemma 1] for a stronger statement than (3.4) where µ in is replaced by any sufficiently widespread probability on [n]. While these facts are very useful to study the typical values of π, they give very poor information on its extremal values π min and π max , and to prove Theorem 1.3 and Theorem 1.5 we need a stronger control of the local approximation of the stationary distribution.
A key role in our analysis is played by the quantity Γ h (y) defined as follows. Consider the set The definitions (3.6) and (1.13) are such that for any y ∈ [n] and h ∈ N where µ in is defined in (3.5). If B − h (y) is a tree, then (3.7) is an equality. In any case, Γ h (y) satisfies the following rough inequalities.
Proof. From Proposition 2.1 we may assume that the event G(ℏ) holds. From Lemma 2.2 we know The bounds in (3.9) follow from the observation that any path of length h from z to y has weight at least ∆ −h + and at most δ −h + , and that there is at least one and at most two such paths if z ∈ ∂B − h (y) and G(ℏ) holds. The latter fact can be seen with the same argument used in the proof of Lemma 2.2. With reference to that proof: in case 1) there are at most two paths from z to y, see Figure 1; in case 2) there is only one path from z to y; see Figure 2 and Figure 3.
Roughly speaking, in what follows the extremal values of π will be controlled by approximating π(y) in terms of Γ h (y) for values of h of order log log n, for every node y. The next two results allow where h 0 is of order log log n.

Lemma 3.2.
There exist constants $c > 0$ and $C > 0$ such that: where $\gamma_0$ is the constant from Theorem 1.3 and $h_0 := \log_{\delta_-} \log(n) + C$.
, where h 0 is as in the statement above with C to be fixed later. Once we have the in-neighborhood , and order them as (z 1 , . . . , z R ) in some arbitrary way. We sample sequentially , and so on. We want to couple the random variables The tree W i is defined as the first h − h 0 generations of the marked random tree T i produced by the following instructions: • the root is given the mark z i ; • every vertex with mark j has d − j children, each of which is given independently the mark k ∈ [n] with probability d + k /m. Consider the generation of the i-th variable Z i . This is achieved by the breadth-first sequential procedure, where at each step a head is matched with a tail chosen uniformly at random from all unmatched tails; see Section 2. If instead we pick the tail uniformly at random from all possible tails, then we need to reject the outcome if the chosen tail belongs to the set of tails that have been already matched. Since the total number of tails matched at any step of this generation is at most K := ∆ ℏ = O(n 1/5 ), it follows that the probability of a rejection is bounded by p := K/m = O(n −4/5 ). Let us now consider the event of a collision, that is when the chosen tail belongs to a vertex that has already been exposed during the previous steps, including the generation of B − h 0 (y) and of the Z j , j ≤ i. Notice that the total number of exposed vertices is at most K and therefore the probability of a collision is bounded by p ′ = ∆K/m = O(n −4/5 ). Since the generation of Z i requires at most K matchings, we see that conditionally on the past, a Z i with no rejections and no collisions is created with probability uniformly bounded from below by 1 − q, where q = O(n −3/5 ). We say that Z i is bad if its generation produced a rejection or a collision. 
Once the Z i 's have been sampled we define a set I such that i ∈ I if and only if either Z i is bad or there is a bad Z j such that the generation of Z j produced a collision with a vertex from Z i . With this notation, W i = Z i for all i / ∈ I and The above construction shows that the cardinality of the set I is stochastically dominated by twice the binomial Bin(R, q). Therefore, On the other hand, notice that for all i / ∈ I: where M i t , t ∈ N, is defined as follows. Let T t,i denote the set of vertices forming generation t of the tree T i rooted at z i , and for x ∈ T t,i , write (3.14) for the weight of the path It is not hard to check (see e.g. [10,Proposition 4]) that for fixed n, Therefore, Hoeffding's inequality gives, for any k ∈ N: where c 1 > 0 is a suitable constant. Divide the integers {1, . . . , R} into 10 disjoint intervals I 1 , . . . , I 10 , each containing R/10 elements. If |I| < 10 then there must be one of the intervals, say I j * , such that I j * ∩ I = ∅. It follows that if |I| < 10, then Using (3.12), and (3.16)-(3.17) we conclude that, for a suitable constant c 2 > 0: where c = 1 2 c 2 (δ − /∆ + ) C . Thus the event (3.19) has probability 1 − o(n −2 ), and the desired conclusion follows by taking a union bound over y ∈ [n] and h ∈ [h 0 , ℏ].

Lemma 3.3. There exists a constant K > 0 such that for all ε > 0, with high probability:
Proof. For any h ≥ h 1 , let σ h denote a realization of the in-neighborhood B − h (y), obtained with the usual breadth-first sequential generation. From Proposition 2.1 we may assume that the tree excess of B − h (y) is at most 1, as long as h ≤ ℏ. Call E tot,h , F tot,h the set of unmatched tails and unmatched heads, respectively, after the generation of σ h . Let also E h ⊂ E tot,h denote the set of unmatched tails belonging to vertices not yet exposed, and let F h be the subset of heads attached to ∂B − h (y). By construction, all heads attached to ∂B − h (y) must be unmatched at this stage so that F h ⊂ F tot,h . Moreover, where v f denotes the vertex to which the head f belongs. To compute Γ h+1 given σ h we let ω : E tot,h → F tot,h denote a uniform random matching of E tot,h and F tot,h , and notice that a vertex z is in ∂B − h+1 (y) if and only if z is revealed by matching one of the heads f ∈ F h with one of the tails e ∈ E h . Therefore, where we use the notation d ± e for the degrees of the vertex to which the tail e belongs, and the function c is defined by Since σ h is such that TX(B − h (y)) ≤ 1, we may estimate P h (v f , y) as in (3.9), so that We now use a version of Bernstein's inequality proved by Chatterjee ([11, Proposition 1.1]) which applies to any function of a uniform random matching of the form (3.22). It follows that for any fixed σ h , for any s > 0: Since the probability of the event ω(e) = f conditioned on σ h is 1 and therefore, using (3.24), one finds 3.3. Lower bound on π min . If for some t ∈ N and a > 0 one has P t (x, y) ≥ a for all x, y ∈ [n], then and therefore π min ≥ a. We will prove the lower bound on P t (x, y) by choosing t of the form t = (1 + ε)T ENT , for some small enough ε > 0; see (3.1) for the definition of T ENT . More precisely, fix a constant η > 0, set η ′ = 3η H log δ , and define (3.32) Note that η ′ ≥ 3η and thus t ⋆ = t ⋆ (η) ≥ (1 + 2η)T ENT .
From (3.31) and Lemma 3.4 it follows that w.h.p., for all $y$, $\pi(y) \ge \frac{c}{n}\,\Gamma_{h_y}(y)$. To prove Lemma 3.4 we will restrict to a subset of nice paths from $x$ to $y$. This will allow us to obtain a concentration result for the probability of reaching $y$ from $x$ in $t_\star$ steps.

3.3.1. A concentration result for nice paths. The definition of the nice paths follows a construction introduced in [8], which we now recall. In contrast with [8], however, here we need a lower bound on $P^{t_\star}(x, y)$ and thus the argument is somewhat different.
Following [8, Section 6.2] and [9, Section 4.1], we introduce the rooted directed tree T (x), namely the subgraph of the h x -out-neighborhood of x defined by the following process: initially all tails and heads are unmatched and T (x) is identified with its root, x; throughout the process, we let ∂ + T (x) (resp. ∂ − T (x)) denote the set of unmatched tails (resp. heads) whose endpoint belongs to T (x); the height h(e) of a tail e ∈ ∂ + T (x) is defined as 1 plus the number of edges in the unique path in T (x) from x to the endpoint of e; the weight of e ∈ ∂ + T (x) is defined as where (x = x 0 , x 1 , . . . , x h(e)−1 ) denotes the path in T (x) from x to the endpoint of e; we then iterate the following steps: • a tail e ∈ ∂ + T (x) is selected with maximal weight among all e ∈ ∂ + T (x) with h(e) ≤ h x − 1 and w(e) ≥ w min := n −1+η 2 (using an arbitrary ordering of the tails to break ties); • e is matched to a uniformly chosen unmatched head f , forming the edge ef ; • if f was not in ∂ − T (x), then its endpoint and the edge ef are added to T (x). The process stops when there are no tails e ∈ ∂ + T (x) with height h(e) ≤ h x − 1 and weight w(e) ≥ w min . Note that T (x) remains a directed tree at each step. The final value of T (x) represents the desired directed tree. After the generation of the tree T (x) a total number κ of edges has been revealed, some of which may not belong to T (x). As in [9,Lemma 7], it is not difficult to see that when exploring the out-neighborhood of x in this way the random variable κ is deterministically bounded as (3.37) At this stage, let us call E * (x) the set of unmatched tails e ∈ ∂ + T (x) such that h(e) = h x . (2) x hx+1 ∈ ∂B − hy (y). To obtain a useful expression for the probability of going from x to y along a nice path, we need to generate B − hy (y), the h y -in-neighborhood of y. 
To this end, assume that κ edges in the h x -outneighborhood of x have been already sampled according to the procedure described above, and then sample B − hy (y) according to the sequential generation described in Section 2. Some of the matchings producing B − hy (y) may have already been revealed during the previous stage. In any case, this second stage creates an additional random number τ of edges, satisfying the crude bound τ ≤ ∆ hy+1 . We call F tot the set of unmatched heads, and E tot the set of unmatched tails after the sampling of these κ + τ edges. Consider the set F 0 := F hy ∩ F tot , where F hy denotes the set of all heads (matched or unmatched) attached to vertices in ∂B − hy (y). Moreover, call E 0 := E * (x)∩ E tot the subset of unmatched tails which are attached to vertices at height h x in T (x). Finally, complete the generation of the digraph by matching the m − κ − τ unmatched tails E tot to the m − κ − τ unmatched heads F tot using a uniformly random bijection ω : E tot → F tot . For any f ∈ F hy we introduce the notation where v f denotes the vertex v ∈ ∂B − hy (y) such that f ∈ E − v . With the notation introduced above, the probability to go from x to y in t ⋆ steps following a nice path can now be written as Note that, conditionally on the construction of the first κ + τ edges described above, each Bernoulli random variable 1 ω(e)=f appearing in the above sum has probability of success at least 1/m. In particular, if σ denotes a fixed realization of the κ + τ edges, then where Moreover, the probability of ω(e) = f for any fixed e ∈ E 0 , f ∈ F 0 is at most 1/(m − κ − τ ), so that where we use A x,y ≤ 1 and B x,y ≤ Γ hy (y). Consider the event where the exponent −γ 0 is chosen for convenience only and any exponent −c with c > γ 0 − 1 would be as good.
Lemma 3.6. There exists η 0 > 0 such that for all η ∈ (0, η 0 ), for any σ ∈ Y x,y , any a ∈ (0, 1): Proof. Conditioned on σ, P 0,t⋆ (x, y) is a function of the uniform random permutation ω : E tot → F tot , Since we are assuming TX(B − hy (y)) ≤ 1, we can use (3.9) to estimate w(f ) ≤ 2δ −hy = n −3η for any f ∈ F 0 . Therefore As in Lemma 3.3, and as in [8], we use Chatterjee's concentration inequality for uniform random matchings [11, Proposition 1.1] to obtain for any s > 0: Since P t⋆ (x, y) ≥ P 0,t⋆ (x, y) it is sufficient to prove for some constant c = c(η, ∆) > 0. The proof of (3.51) is based on Lemma 3.6 and the following estimates which allow us to make sure the events Y x,y in Lemma 3.6 have large probability. Proof. Let us first note that the event A 1 = {∀x ∈ V * : e∈E * (x) w(e)1 w(e)≤n 2η−1 ≥ 0.9} satisfies Indeed, this fact is a consequence of [8,9], which established that for any ε > 0, with high probability  (N, p), and therefore by Hoeffding's inequality Thus by a union bound we may assume that all x, y are such that the corresponding collision count if η is small enough.

Lemma 3.8. Fix a constant c > 0 and consider the event
Proof. By definition, f ∈F hy w(f ) = Γ hy (y). Thus, we need to show that if we replace F 0 by F hy the sum defining B x,y is still comparable to Γ hy (y). For any constant T > 0, for each z ∈ ∂B − hy−T (y), let V z denote the set of w ∈ ∂B − hy (y) such that d(w, z) = T . Notice that if the event G(ℏ) from Proposition 2.1 holds then for each z ∈ ∂B − hy−T (y) one has |V z | ≥ 1 2 δ T . Consider the generation of the κ + τ edges as above, and call a vertex z ∈ ∂B − hy−T (y) bad if all heads attached to V z are matched, or equivalently if none of these heads is in F tot . Given a z ∈ ∂B − hy−T (y), we want to estimate the probability that it is bad. To this end, we use the same construction given in Section 3.3.1 but this time we first generate the in-neighborhood B − hy (y) and then the tree T (x). Let K denote the number of collisions between T (x) and the set V z . Notice that |V z | ≤ ∆ T and that |T (x)| ≤ n 1−η 2 /2 , so that K is stochastically dominated by the binomial Bin(N, p) where N = n 1−η 2 /2 and p = ∆ T +1 /n. Therefore, Since |V z | ≥ 1 2 δ T , if z is bad then K > 1 2 δ T and thus the probability of the event that z is bad is at most O(n −δ T η 2 /4 ). The probability that there exists a bad z ∈ ∂B − hy−T (y) is then bounded by O(∆ hy n −δ T η 2 /4 ). In conclusion, if T = T (η) is a large enough constant, we can ensure that for any y ∈ [n] the probability that there exists a bad z ∈ ∂B − hy−T (y) is o(n −2 ), and therefore, by a union bound, with high probability there are no bad z ∈ ∂B − hy−T (y), for all x, y ∈ [n]. On this event, for all z we may select one vertex w ∈ V z with at least one head f ∈ F 0 attached to it. Notice that w(f ) ≥ ∆ −T −1 P hy−T (z, y). Therefore, assuming that there are no bad z ∈ ∂B − hy−T (y): From Lemma 3.3 we may finish with the estimate Γ hy−T (y) ≥ 1 2 Γ hy (y). We can now conclude the proof of (3.51). Consider the event (3.53) For any s > 0, P P 0,t⋆ (x, y) < s n Γ hy (y); A .

(3.55) where c is the constant from Lemma 3.8. From Lemma 3.2 we infer that
A ⊂ W x,y ∩ Y x,y , for all x, y, and for all n large enough. Therefore, P P 0,t⋆ (x, y) < s n Γ hy (y); A ≤ sup σ∈Wx,y∩Yx,y P P 0,t⋆ (x, y) < s n Γ hy (y) | σ . Taking s > 0 a small enough constant and using (3.42) and (3.55), we see that P 0,t⋆ (x, y) < s n Γ hy (y) implies |P 0,t⋆ (x, y) − E [P 0,t⋆ (x, y) | σ] | ≥ a E [P 0,t⋆ (x, y) | σ] , for some constant a > 0, and therefore from Lemma 3.6 sup σ∈Wx,y∩Yx,y P P 0,t⋆ (x, y) < s n Γ hy (y) | σ = o(n −2 ). where V k,ℓ denotes the set of vertices of type (k, ℓ), and define The main observation is that if (k, ℓ) / ∈ L ε , then w.h.p. there are at most a finite number of vertices of type (k, ℓ) in all in-neighborhoods B − h 0 (y), y ∈ [n], for any h 0 = O(log log n). Indeed, for a fixed y ∈ [n] the number of v ∈ V k,ℓ ∩ B − h 0 (y) is stochastically dominated by the binomial Bin ∆ h 0 , n −ε/2 , and therefore if K = K(ε) is a sufficiently large constant then the probability of having more than K such vertices is bounded by (∆ h 0 n −ε/2 ) K = o(n −1 ). Taking a union bound over y ∈ [n] shows that w.h.p. all B − h 0 (y), y ∈ [n] have at most K vertices with type (k, ℓ). Then we may repeat the argument of Lemma 3.2 with this constraint, to obtain that for all ε > 0, w.h.p. Γ hy (y) ≥ c(ε) log 1−γ ′ ε (n). Since the number of types is finite one concludes that if ε is small enough then γ ′ 0 = γ ′ ε and the desired conclusion follows. 3.4. Upper bound on π min . In this section we prove the upper bound on π min given in (1.18). We first show that we can replace π(y) in (1.18) by a more convenient quantity. Define the distances (3.60) It is standard that, for all k, s ∈ N, see e.g. [21]. In particular, defining From (3.2) we know that w.h.p. d(2T ENT ) ≤ 1 2e so that the right hand side above is at most e −k . If k = Θ(log 2 (n)) we can safely replace π(y) with λ t (y) in (1.18). Thus, it suffices to prove the following statement. Lemma 3.10. For some constants β > 0, C > 0, and for any t = t n = Θ(log 3 (n)): Proof. 
Let (δ * , ∆ * ) ∈ L denote the type realizing the maximum in the definition of γ 1 ; see (1.16). Let V * = V δ * ,∆ * denote the set of vertices of this type, and let α * ∈ (0, 1) be a constant such that |V * | ≥ α * n, for all n large enough. Let us fix a constant β 1 ∈ (0, 1 4 ). This will be related to the constant β, but we shall not look for the optimal exponent β in the statement (3.64). Consider the first N 1 := n β 1 vertices in the set V * , and call them y 1 , . . . , y N 1 . Next, generate sequentially the in-neighborhoods for some constant C 0 to be fixed later. As in the proof of Lemma 3.2 we couple the B − h 0 (y i ) with independent random trees Y i rooted at y i . For each B − h 0 (y i ) the probability of failing to equal Y i , conditionally on the previous generations, is uniformly bounded above by p := N 1 ∆ 2h 0 /m. Let A denote the event that all B − h 0 (y i ) are successfully coupled to the Y i 's and that they have no intersections. Therefore, Consider now a single random tree Y 1 . We say that Y 1 is unlucky if all labels of the vertices in the tree are of type (δ * , ∆ * ). The probability that Y 1 is unlucky is at least . We choose C 0 so large that 0 < η ≤ β 1 /4. Call S 1 the set of y ∈ {y 1 , . . . , y N 1 } such that Y i is unlucky. Since the Y i are i.i.d. the probability that |S 1 | < n β 1 /2 is bounded by the probability that Bin(N 1 , q) < n β 1 /2 , which by Hoeffding's inequality is at most . Thanks to (3.66) we may assume that σ ∈ A, i.e. B − h 0 (y i ) = Y i for all i so that the set of unlucky y i coincides with S 1 , and thanks to (3.67) we may also assume that σ is such that |S 1 | ≥N := n β 1 /2 . We call A ′ ⊂ A the set of all σ ∈ A satisfying the latter requirement. LetS denote the firstN elements in S 1 . 
We are going to show that uniformly in σ ∈ A ′ , for a sufficiently large constant C > 0, any t = Θ(log 3 (n)), Notice that (3.68) says that, conditionally on a fixed σ ∈ A ′ , with high probability which implies that there are at mostN /2 vertices y ∈S with the property that λ t (y) > C n log 1−γ 1 (n). Summarizing, the above arguments and (3.68) allow one to conclude the unconditional statement that with high probability there are at least 1 2 n β 1 /2 vertices y ∈ [n] such that λ t (y) ≤ C n log 1−γ 1 (n), which implies the desired claim (3.64), taking e.g. β = β 1 /3.
We first establish that, uniformly in σ ∈ A ′ , for any t = Θ(log 3 (n)), If y is unlucky then P h 0 (z, y) = ∆ −h 0 * for any z ∈ ∂B − h 0 (y). Hence, for any y ∈S: Since |∂B − h 0 (y)| = δ h 0 * , and since all z ∈ ∂B − h 0 (y) have the same in-degree d − z = δ * , using symmetry the proof of (3.69) is reduced to showing that for any z ∈ ∂B − h 0 (y), t = Θ(log 3 n), To compute the expected value in (3.70) we use the so called annealed process. Namely, observe that where X t is the annealed walk with initial environment σ, and initial position x, and P a,σ x denotes its law. This process can be described as follows. At time 0 the environment consists of the edges from σ alone, and X 0 = x; at every step, given the current environment and position, the walker picks a uniformly random tail e from its current position, if it is still unmatched then it picks a uniformly random unmatched head f , the edge ef is added to the environment and the position is moved to the vertex of f , while if e is already matched then the position is moved to the vertex of the head to which e was matched. Let us show that uniformly in x = z ∈ ∂B − h 0 (y), uniformly in σ ∈ A ′ : Say that a collision occurs if the walk lands on a vertex that was already visited by using a freshly matched edge. At each time step the probability of a collision is at most O(t/m), and therefore the probability of more than one collision in the first t steps is at most O(t 4 /m 2 ) = o(m −1 ). Thus we may assume that there is at most one cycle in the path of the walk up to time t. There are two cases to consider: 1) there is no cycle in the path up to time t or there is one cycle that does not pass through the vertex z; 2) there is a cycle and it passes through z. In case 1) since X t = z the walker must necessarily pick one of the heads of z at the very last step. 
Since all heads of $z$ are unmatched by construction, and since the total number of unmatched heads at that time is at least $(1 - o(1))m$, this event has probability $(1 + o(1))d^-_z/m$. In case 2), since $x \ne z$, we argue that in order to have a cycle that passes through $z$, the walk has to visit $z$ at some time before $t$, which is an event of probability $O(t/m)$, and then must hit back the previous part of the path, which is an event of probability $O(t^2/m)$. This shows that we can upper bound the probability of scenario 2) by $O(t^3/m^2) = o(m^{-1})$. This concludes the proof of (3.72). Next, observe that if $x = z$, then the previous argument gives $P^{a,\sigma}_z(X_t = z) = O(t/m)$, which is a bound on the probability that the walk hits $z$ again at some point within time $t$. In conclusion, (3.71) and (3.72) imply (3.70), which establishes (3.69).
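The annealed walk used in this argument reveals the environment only along its own trajectory. The following is a minimal sketch of that mechanism, assuming degree sequences given as lists `d_in`, `d_out` with equal sums and all out-degrees at least 1; the name `annealed_walk` and the encoding of tails and heads as pairs `(vertex, index)` are illustrative choices, not the paper's notation.

```python
import random

def annealed_walk(d_in, d_out, start, steps, rng, environment=None):
    """Run the annealed walk for `steps` steps from `start`.

    `environment` maps a matched tail (v, i) to its head (w, j); it plays
    the role of the initial environment and is extended in place.
    Returns (trajectory, environment).
    """
    env = {} if environment is None else environment
    matched_heads = set(env.values())
    free_heads = [
        (w, j)
        for w in range(len(d_in))
        for j in range(d_in[w])
        if (w, j) not in matched_heads
    ]
    position = start
    trajectory = [position]
    for _ in range(steps):
        # pick a uniformly random tail of the current vertex
        tail = (position, rng.randrange(d_out[position]))
        if tail not in env:
            # fresh tail: match it to a uniformly random unmatched head
            k = rng.randrange(len(free_heads))
            free_heads[k], free_heads[-1] = free_heads[-1], free_heads[k]
            env[tail] = free_heads.pop()
        # move to the vertex owning the head matched to this tail
        position = env[tail][0]
        trajectory.append(position)
    return trajectory, env

d_in = [2, 3, 2, 3]
d_out = [3, 2, 3, 2]
path, env = annealed_walk(d_in, d_out, start=0, steps=50, rng=random.Random(1))
```

Passing the returned `env` as the `environment` of a second call produces a second trajectory in the environment enlarged by the first one, which is the sequential sampling scheme used for the two-trajectory and K-trajectory computations.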
Let us now show that (3.73) Once we have (3.73) we can conclude (3.68) by using Chebyshev's inequality together with (3.69) and the fact that where P a,σ x,x ′ is the law of two trajectories (X s , X ′ s ), s = 0, . . . , t, that can be sampled as follows. Let X be sampled up to time t according to the previously described annealed measure P a,σ x , call σ ′ the environment obtained by adding to σ all the edges discovered during the sampling of X and then sample X ′ up to time t independently, according to P a,σ ′ x ′ . Let also P a,σ u be defined by P a,σ u = 1 n 2 x,x ′ ∈[n] P a,σ x,x ′ .
Thus, under $P^{a,\sigma}_u$ the two trajectories have independent uniformly distributed starting points $x, x'$. With this notation we write Let us show that if $z \ne z'$, $t = \Theta(\log^3(n))$: Indeed, let $A$ be the event that the first trajectory hits $z$ at time $t$ and visits $z'$ at some time before that. Then, reasoning as in (3.72), the event $A$ has probability $O(t/m^2)$. Given any realization $X$ of the first trajectory satisfying this event, the probability of $X'_t = z'$ is at most the probability of colliding with the trajectory $X$ within time $t$, which is $O(t/m)$. On the other hand, if the first trajectory hits $z$ at time $t$ and does not visit $z'$ at any time before that, then the conditional probability of $X'_t = z'$, as in (3.72), is given by $(1 + o(1))d^-_{z'}/m$. This proves (3.76) when $z \ne z'$. If $z = z'$, $t = \Theta(\log^3(n))$, let us show that Consider the event $A$ that the first trajectory $X$ has at most one collision. The complementary event $A^c$ has probability at most $O(t^4/m^2)$. If $A^c$ occurs, then the conditional probability of $X'_t = z$ is at most the probability that $X'$ collides with the first trajectory at some time $s \le t$, that is $O(t/m)$. Hence, To prove (3.77), notice that to realize $X'_t = z$ there must be a time $s = 0, \dots, t$ such that $X'$ collides with the first trajectory $X$ at time $s$, then $X'$ stays in the digraph $D_1$ defined by the first trajectory for the remaining $t - s$ units of time, and $X'$ hits $z$ at time $t$. On the event $A$ the probability of spending $h$ units of time in $D_1$ is at most $2\delta^{-h}$, and for any $h \in [0, t]$ there are at most $h + 1$ points $x$ which have a path of length $h$ from $x$ to $z$ in $D_1$. Therefore Hence, (3.77) follows from (3.78) and (3.79).

3.5. Upper bound on $\pi_{\max}$. As in Section 3.4 we start by replacing $\pi(y)$ with $\lambda_t(y) = \frac{1}{n}\sum_x P^t(x, y)$. In (3.63) we have seen that if $t = 2kT_{\mathrm{ENT}}$, then w.h.p. (3.80) Thus, using a union bound over $y \in [n]$, the upper bound in Theorem 1.5 follows from the next statement.
Lemma 3.11. There exists C > 0 such that for any t = t n = Θ(log 3 (n)), uniformly in y ∈ [n] and call σ a realization of the in-neighborhood B − h 0 (y). Clearly, From (3.9), under the event G y (ℏ) from Proposition 2.1, we have P h 0 (z, y) ≤ 2δ −h 0 Then it is sufficient to prove that for some constant C, uniformly in σ and y ∈ [n]: By Markov's inequality, for any K ∈ N and any constant C > 0: (3.83) We fix K = log n, and claim that there exists an absolute constant C 1 > 0 such that The desired estimate (3.82) follows from (3.84) and (3.83) by taking C large enough.
We compute the K-th moment E X K ; G y (ℏ) | σ by using the annealed process as in (3.74). This time we have K trajectories instead of 2: s , s ∈ [0, t]}, j = 1, . . . , K denote K annealed walks each with initial point x j , and P a,σ x 1 ,...,x K denotes the joint law of the trajectories X (j) , j = 1, . . . , K, and the environment, defined as follows. Start with the environment σ, and then run the first random walk X (1) up to time t as described after (3.71). After that run the walk X (2) up to time t with initial environment given by the union of edges from σ and the first trajectory, as described in (3.74). Proceed recursively until all trajectories up to time t have been sampled. This produces a new environment, namely the digraph given by the union of σ and all the K trajectories. At this stage there are still many unmatched heads and tails, and we complete the environment by using a uniformly random matching of the unmatched heads and tails. This defines the coupling P a,σ x 1 ,...,x K between the environment and K independent walks in that environment, which justifies the expression in (3.85). It is convenient to introduce the notation for the annealed law of the K trajectories such that independently each trajectory starts at a uniformly random point X (j) 0 = x j . Let D 0 = σ and let D ℓ , for ℓ = 1, . . . , K, denote the digraph defined by the union of σ = B − h 0 (y) with the first ℓ paths {X (j) s , 0 ≤ s ≤ t}, j = 1, . . . , ℓ. Call D ℓ (ℏ) the subgraph of D ℓ consisting of all directed paths in D ℓ ending at y with length at most ℏ. We define G ℓ y (ℏ) as the event TX(D ℓ (ℏ)) ≤ 1. Notice that if the final environment has to satisfy G y (ℏ), then necessarily for every ℓ the digraph D ℓ must satisfy G ℓ y (ℏ). Therefore, where V (D ℓ ) denotes the vertex set of D ℓ and d − x (D ℓ ) is the in-degree of x in the digraph D ℓ . Define also the (ℓ, s) cluster C s ℓ as the digraph given by the union of D ℓ−1 and the truncated path {X (ℓ) u , 0 ≤ u ≤ s}. 
We say that the ℓ-th trajectory X (ℓ) has a collision at time s ≥ 1 if the edge (X . We say that a collision occurs at time zero if X (ℓ) 0 ∈ D ℓ−1 . Notice that at least collisions must have occurred after the generation of the first ℓ trajectories. Let Q ℓ denote the total number of collisions after the generation of the first ℓ trajectories. Since |B − h 0 (y)| ≤ ∆ log n one must have W ℓ ≤ ∆ log n + Q ℓ .
(3.88) Notice that the probability of a collision at any given time by any given trajectory is bounded above by p := 2∆(Kt + ∆ h 0 − )/m = O(log 4 (n)/n) and therefore Q ℓ is stochastically dominated by the binomial Bin(Kt, p). In particular, for any k ∈ N: for some constant C 2 > 0. If A > 0 is a large enough constant, then If A ≥ 2 then (3.90) is smaller than the right hand side of (3.84) with e.g. C 1 = 1, and therefore from now on we may restrict to proving the upper bound for some constant C 1 = C 1 (A) > 0. To prove (3.91), define the events Thus, it is sufficient to show that for some constant C 1 : for all ℓ = 1, . . . , K, where it is understood that P a,σ u (B 1 | B 0 ) = P a,σ u (B 1 ) . Let us partition the event {X (ℓ) t ∈ B − h 0 (y)} by specifying the last time in which the walk X (ℓ) enters the neighborhood B − h 0 (y). Unless the walk starts in B − h 0 (y), at that time it must enter from ∂B − h 0 (y). Since the tree excess of B − h 0 (y) is at most 1, once the walker is in B − h 0 (y), we can bound the chance that it remains in B − h 0 (y) for k steps by 2δ −k + . Therefore, Since t = Θ(log 3 (n)), it is enough to show (3.95) uniformly in j ∈ (t/2, t) and 1 ≤ ℓ ≤ K.
Let H ℓ 0 denote the event that the ℓ-th walk makes its first visit to the digraph D ℓ−1 at the very last time j, when it enters ∂B − h 0 (y). Uniformly in the trajectories of the first ℓ − 1 walks, at any time there are at most Let H ℓ 2 denote the event that the ℓ-th walk makes a first visit to D ℓ−1 at some time s 1 < j, then at some time s 2 > s 1 it exits D ℓ−1 , and then at a later time s 3 ≤ j enters again the digraph D ℓ−1 . Since each time the walk is outside D ℓ−1 the probability of entering D ℓ−1 at the next step is O(Kt/m), it follows that  To estimate the sum over s ∈ (j − ℏ + h 0 , j), notice that the walk has to enter D ℓ−1 by hitting a point z ∈ D ℓ−1 at time s such that there exists a path of length h = j − s from z to ∂B − h 0 (y) within the digraph D ℓ−1 . Call L h the set of such points in D ℓ−1 . Hitting this set at any given time s coming from outside the digraph D ℓ−1 has probability at most 2∆|L h |/m, and the path followed once it has entered D ℓ−1 is necessarily in D ℓ−1 (ℏ) and therefore has weight at most 2δ −h + . Then, On the event B ℓ−1 we know that Q ℓ−1 ≤ A log n, and therefore |A h | ≤ C 3 log n for some absolute Proof. We argue as in the first part of the proof of Lemma 3.10. Namely, let (∆ * , δ * ) ∈ L denote the type realizing the minimum in the definition of κ 1 ; see (1.16). Let V * = V ∆ * ,δ * denote the set of vertices of this type, and let α * ∈ (0, 1) be a constant such that |V * | ≥ α * n, for all n large enough. Fix a constant β 1 ∈ (0, 1 4 ) and call y 1 , . . . , y N 1 the first N 1 := n β 1 vertices in the set V * . Then sample the in-neighborhoods 105) and call σ a realization of all these neighborhoods. As in the proof of Lemma 3.10, we may assume that all B − h 0 (y i ) are successfully coupled with i.i.d. random trees Y i . Next define a y i lucky if B − h 0 (y i ) has all its vertices of type (∆ * , δ * ). 
Then, if $C$ in (3.105) is large enough, we may assume that at least $n^{\beta_1/2}$ vertices $y_i$ are lucky; see (3.67). As before, we call $A'$ the set of $\sigma$ realizing these constraints.
Given a realization σ ∈ A ′ , and some ε ∈ (0, β 1 /2) we fix the first n ε lucky vertices y * ,i , i = 1, . . . , n ε . Since P(A ′ ) = 1 − o(1), letting S = {y * ,i , i = 1, . . . , n ε }, it is sufficient to prove that for some constant (3.106) To prove (3.106) we first observe that by (3.34) and Lemma 3.3 it is sufficient to prove the same estimate with nπ(y * ,i ) replaced by Γ h 1 (y * ,i ), where h 1 = K log log n for some large but fixed constant K. Therefore, by using symmetry and a union bound it suffices to show where y * = y * ,1 is the first lucky vertex. By definition of lucky vertex, The same argument of the proof of Lemma 3.2 shows that the probability that all these neighborhoods are successfully coupled to i.i.d. random directed trees is at least 1 − O(∆ 2h 1 /n). On this event we have (3.15). Then (3.16) shows that for some new constant c 2 > 0 and for ε = c 1 ∆ −C * /4. This ends the proof of (3.107).

BOUNDS ON THE COVER TIME
In this section we show how the control of the extremal values of the stationary distribution obtained in previous sections can be turned into the bounds on the cover time presented in Theorem 1.7. To this end we exploit the full strength of the strategy developed by Cooper and Frieze [15,13,14,16].

4.1. The key lemma. Given a digraph $G$, write $X_t$ for the position of the random walk at time $t$ and write $P_x$ for the law of $\{X_t, t \ge 0\}$ with initial value $X_0 = x$. In particular, $P_x(X_t = y) = P^t(x, y)$ denotes the transition probability. Fix a time $T > 0$ and define the event that the walk does not visit $y$ in the time interval $[T, t]$, for $t > T$: $A^T_y(t) = \{X_s \ne y \text{ for all } s \in [T, t]\}$. Moreover, define the generating function Thus, $R^T_y(1) \ge 1$ is the expected number of returns to $y$ within time $T$, if started at $y$. The following statement is proved in [14]; see also [16, Lemma 3].
Lemma 4.1. Assume that $G = G_n$ is a sequence of digraphs with vertex set $[n]$ and stationary distribution $\pi = \pi_n$, and let $T = T_n$ be a sequence of times such that (ii) $T^2 \pi_{\max} = o(1)$ and $T \pi_{\min} \ge n^{-2}$.
Suppose that y ∈ [n] satisfies: (iii) there exist K, ψ > 0 independent of n such that Then there exist ξ 1 , ξ 2 = O(T π max ) such that for all t ≥ T : where .

(4.4)
We want to apply the above lemma to digraphs from our configuration model. Thus, our first task is to make sure that the assumptions of Lemma 4.1 are satisfied. From now on we fix the sequence $T = T_n$ as $T = \log^3(n)$. 1 is also satisfied with high probability. Next, following [15], we define a class of vertices $y \in [n]$ which satisfy item (iii) of Lemma 4.1. We use the convenient notation $\vartheta = \log\log\log(n)$. (4.6)
Definition 4.2. We call small cycle a collection of $\ell \le 3\vartheta$ edges such that their undirected projection forms a simple undirected cycle of length $\ell$. We say that $v \in [n]$ is locally tree-like (LTL) if its in- and out-neighborhoods up to depth $\vartheta$ are both directed trees and they intersect only at $v$. We denote by $V_1$ the set of LTL vertices, and write $V_2 = [n] \setminus V_1$ for the complementary set.
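The locally tree-like condition of Definition 4.2 can be tested by two breadth-first explorations. The following is a minimal sketch, assuming the digraph is stored as a dictionary `out_adj` mapping each vertex to the list of heads of its out-edges (with repetitions for multiple edges); the names `ball` and `is_locally_tree_like` are illustrative, and the convention that only edges leaving vertices at distance less than the depth are inspected is an assumption of this sketch.

```python
from collections import deque

def ball(adj, v, depth):
    """Explore the neighborhood of v up to `depth` along the edges of `adj`.

    Returns (vertices, is_tree), where is_tree is False as soon as an
    explored edge leads to an already discovered vertex (this covers
    loops, multiple edges and cycles).
    """
    seen = {v}
    is_tree = True
    queue = deque([(v, 0)])
    while queue:
        u, dist = queue.popleft()
        if dist == depth:
            continue  # boundary vertex: its edges are not explored
        for w in adj.get(u, []):
            if w in seen:
                is_tree = False
            else:
                seen.add(w)
                queue.append((w, dist + 1))
    return seen, is_tree

def is_locally_tree_like(out_adj, v, depth):
    """Check that the in- and out-balls of v are trees meeting only at v."""
    in_adj = {}
    for u, heads in out_adj.items():
        for w in heads:
            in_adj.setdefault(w, []).append(u)
    out_ball, out_tree = ball(out_adj, v, depth)
    in_ball, in_tree = ball(in_adj, v, depth)
    return out_tree and in_tree and (out_ball & in_ball) == {v}

# Directed 6-cycle: tree-like at depth 2, but not at depth 3,
# where the in- and out-balls of vertex 0 meet at vertex 3.
cycle = {i: [(i + 1) % 6] for i in range(6)}
```

In the directed 6-cycle example, the in- and out-balls of any vertex are paths, so the only way the condition can fail is through their intersection.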
The next proposition can be proved as in [15, Section 3]. (1) The number of small cycles is at most $\Delta^{9\vartheta}$.
(3) There are no small cycles which are less than 9ϑ undirected steps away.

Proposition 4.4. With high probability, uniformly in $y \in V_1$: Moreover, there exist constants $K, \psi > 0$ such that with high probability, every $y \in V_1$ satisfies item (iii) of Lemma 4.1. In particular, (4.3) holds uniformly in $y \in V_1$.
Proof. We first prove (4.7). Fix y ∈ V 1 and consider the neighborhoods B ± ϑ (y) and B − ℏ (y). By Proposition 2.1 we may assume that B − ℏ (y) and B + ϑ (y) are both directed trees except for at most one extra edge. By the assumption y ∈ V 1 we know that B − ϑ (y), B + ϑ (y) are both directed trees with no intersection except y, so that the extra edge in B − ℏ (y) ∪ B + ϑ (y) cannot be in B − ϑ (y) ∪ B + ϑ (y). Thus, the following cases only need to be considered: . In all cases but the last, if a walk started at y returns at y at time t > 0 then it must exit ∂B + ϑ (y) and enter ∂B − ℏ (y), and from any vertex of ∂B − ℏ (y) the probability to reach y before exiting B − ℏ (y) is at most 2δ −ℏ . Therefore, in these cases the number of visits to y up to T is stochastically dominated by 1 + Bin(T, 2δ −ℏ ) and , for some a > 0. In the last case instead it is possible for the walk to jump from B + ϑ (y) to B − ℏ (y) \ B − ϑ (y). Let E k denote the event that the walk visits y exactly k times in the interval [1, T ]. Let B denote the event that the walk visits y exactly ϑ units of time after its first visit to ∂B − ϑ (y). Then P y (B) ≤ δ −ϑ . On the complementary event B c the walk must enter ∂B − ℏ (y) before visiting y, and each time it visits ∂B − ℏ (y) it has probability at most 2δ −ℏ to visit y before the next visit to ∂B − ℏ (y). Since the number of attempts is at most T one finds By the strong Markov property, To see that y ∈ V 1 satisfies item (iii) of Lemma 4.1, take z ∈ C with |z| ≤ 1 + 1/KT and write Proof. Let U s denote the set of vertices that are not visited in the time interval [0, s]. By Markov's inequality, for all t * ≥ T : Choose t * := (1 + ε) log n π min , for ε > 0 fixed. It is sufficient to prove that the last term in (4. (4.11) Using π(y) ≥ π min , (4.11) is bounded by for all fixed ε > 0 in the definition of t * . It remains to control the contribution of y ∈ V 2 to the sum in (4.9). From Proposition 4.3 we may assume that |V 2 | = O(∆ 15ϑ ). 
In particular, it is sufficient to show that with high probability uniformly in x ∈ [n] and y ∈ V 2 : To prove (4.12), fix y ∈ V 2 and notice that by Proposition 4.3 (3), we may assume that there exists u ∈ V 1 s.t. d(u, y) < 10ϑ. If t 1 = t 0 + 10ϑ, t 0 := 4/π min , then Since u ∈ V 1 , as in (4.10), for n large enough, . (4.13) Since this bound is uniform over x, the Markov property implies, for all k ∈ N, (4.14) Therefore,

4.3. Lower bound on the cover time. We prove the following stronger statement. Clearly, this implies the lower bound on $T_{\mathrm{cov}} = \max_{x \in [n]} E_x(\tau_{\mathrm{cov}})$ in Theorem 1.7. The proof of Lemma 4.6 is based on the second moment method as in [16]. If $W \subset [n]$ is a set of vertices, let $W_t$ be the set $W_t = \{y \in W : y \text{ is not visited in } [0, t]\}$ (4.16) Then Therefore, Lemma 4.6 is a consequence of the following estimate.
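For orientation, the second moment method referred to here rests on a standard inequality for nonnegative random variables, recorded below for the variable $|W_t|$. This is the textbook form, obtained from the Cauchy-Schwarz inequality; that the display used in the paper has exactly this shape is an assumption.

```latex
% Cauchy-Schwarz applied to |W_t| = |W_t| \mathbf{1}_{\{|W_t| > 0\}}:
\[
  E_x\bigl[|W_t|\bigr]^2
  \;\le\; E_x\bigl[|W_t|^2\bigr]\, P_x\bigl(|W_t| > 0\bigr),
  \qquad\text{so that}\qquad
  P_x\bigl(|W_t| > 0\bigr) \;\ge\; \frac{E_x\bigl[|W_t|\bigr]^2}{E_x\bigl[|W_t|^2\bigr]} .
\]
% If some vertex of W is unvisited in [0,t], the walk has not covered the graph:
\[
  \{\,|W_t| > 0\,\} \subset \{\,\tau_{\mathrm{cov}} > t\,\}.
\]
```

A lower bound on the first moment together with a matching upper bound on the second moment therefore shows that the walk has not covered the graph by time $t$ with probability bounded away from zero.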
We start the proof of Lemma 4.7 by exhibiting a candidate for the set $W$. (1) $W \subset V_1$, where $V_1$ is the LTL set from Definition 4.2, and $|W| \ge n^{\alpha}$ for some constant $\alpha > 0$.
Proof. From Theorem 1.3 we know that w.h.p. there exists a set S ⊂ [n] with |S| > n β such that (4.19) holds. Moreover, a minor modification of the proof of Lemma 3.10 shows that we may also assume that S ⊂ V 1 and that min{d(x, y), d(y, x)} > 2ϑ for every x, y ∈ W . Indeed, it suffices to generate the out-neighborhoods B + ϑ (y i ) for every i = 1, . . . , N 1 and the argument for (3.66) shows that these are disjoint trees with high probability. To conclude, we observe that there is a W ⊂ S such that |W | > n β/2 and such that (4.20) holds. Indeed, using π min ≥ n −1 log −K 1 (n) for some constant K 1 , for any constant K > 0 we may partition the interval [n −1 log −K 1 (n), Cn −1 log 1−γ 1 (n)] in log 2K (n) intervals of equal length and there must be at least one of them containing n β log −2K (n) ≥ n β/2 elements which, if K is sufficiently large, satisfy (4.20).
Proof of Lemma 4.7. Consider the first moment $E_x[|W_t|]$, where $W$ is the set from Proposition 4.8 and $t$ is fixed as $t = c\,n\log^{\gamma_1}(n)$. For $y \in W \subset V_1$ we use Lemma 4.1 and Proposition 4.4. As in (4.10) we have
\[
P_x\bigl(A^T_y(t)\bigr) = (1 + o(1))\,(1 + p_y)^{-(t+1)}, \tag{4.21}
\]
where $p_y = (1 + o(1))\pi(y) \le p_W := 2C\,n^{-1}\log^{1-\gamma_1}(n)$, with $C$ as in (4.19). Therefore,

Taking the constant $c$ in the definition of $t$ sufficiently small, one has $p_W\,t \le \frac{\alpha}{2}\log n$ and therefore

where we use $T = \log^3(n)$ and $|W| \ge n^{\alpha}$. In particular, since $T = \log^3(n)$,

Concerning the second moment $E_x\bigl[|W_t|^2\bigr]$, we have
\[
E_x\bigl[|W_t|^2\bigr] = \sum_{y, y' \in W} P_x\bigl(y \text{ and } y' \text{ not visited in } [0, t]\bigr) \le \sum_{y, y' \in W} P_x\bigl(A^T_y(t) \cap A^T_{y'}(t)\bigr).
\]
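Returning to the first moment, the estimate can be spelled out as follows. This is a sketch under an assumption not restated in this section, namely that $A^T_y(t)$ denotes the event that $y$ is not visited during $[T, t]$, following the convention of [16]. The error terms are indicative only and the displays need not coincide with the authors' own.

```latex
% For y in W, use (4.21), p_y <= p_W, and log(1+p) <= p:
\begin{align*}
P_x\bigl(A^T_y(t)\bigr)
  &\ge (1+o(1))\,(1+p_W)^{-(t+1)}
   \ge (1+o(1))\,e^{-p_W (t+1)} \\
  &\ge (1+o(1))\,n^{-\alpha/2},
\end{align*}
% since p_W t <= (alpha/2) log n and p_W = o(1).
% The walk visits at most T+1 vertices during [0,T], hence
\begin{align*}
E_x\bigl[|W_t|\bigr]
  &\ge \sum_{y \in W} P_x\bigl(A^T_y(t)\bigr) - (T+1) \\
  &\ge (1+o(1))\,|W|\,n^{-\alpha/2} - (T+1)
   \ge (1+o(1))\,n^{\alpha/2},
\end{align*}
% because |W| >= n^alpha and T = log^3(n) = o(n^{alpha/2}).
```

In this form the first moment diverges polynomially in $n$, which is the input the second moment method needs alongside the bound on $E_x\bigl[|W_t|^2\bigr]$.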
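Proof. We follow the proof of Eq. (107) in [16]. The stochastic matrix of the simple random walk on $G^*$ has entry at $(v, w)$ given by
\[
\begin{cases}
P(v, w) & \text{if } v, w \neq y^*,\\[2pt]
\tfrac{1}{2}\bigl(P(y, w) + P(y', w)\bigr) & \text{if } v = y^*,\\[2pt]
P(v, y) + P(v, y') & \text{if } w = y^*.
\end{cases}
\]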
4.4. The Eulerian case. We prove Theorem 1.9. The strategy is the same as for the proof of Theorem 1.7, with some significant simplifications due to the explicit knowledge of the invariant measure $\pi(x) = d_x/m$. For the upper bound, it is then sufficient to prove that, setting $t_* = (1 + \varepsilon)\beta n\log n$,

Thus, (4.33) follows from (4.34) and (4.35). It remains to prove the lower bound. We shall prove that for any fixed $d$ such that $|V_d| = n^{\alpha_d + o(1)}$, $\alpha_d \in (0, 1]$, for any $\varepsilon > 0$,

We proceed as in the proof of Lemma 4.7. Here we choose $W$ as the subset of $V_d$ consisting of LTL vertices in the sense of Definition 4.2 and such that for all $x, y \in W$ one has $\min\{d(x, y), d(y, x)\} > 2\vartheta$. Let us check that this set satisfies
\[
|W| \ge n^{\alpha_d + o(1)}. \tag{4.37}
\]
Indeed, the vertices that are not LTL are at most $\Delta^{9\vartheta}$ by Proposition 4.3. Therefore there are at least $|V_d| - \Delta^{9\vartheta} = n^{\alpha_d + o(1)}$ LTL vertices in $V_d$. Moreover, since there are at most $\Delta^{2\vartheta}$ vertices at undirected distance $2\vartheta$ from any vertex, we can take a subset $W$ of LTL vertices of $V_d$ satisfying the requirement that $\min\{d(x, y), d(y, x)\} > 2\vartheta$ for all $x, y \in W$ and such that $|W| \ge (|V_d| - \Delta^{9\vartheta})\Delta^{-2\vartheta} = n^{\alpha_d + o(1)}$.

From here on all arguments can be repeated without modifications, with the simplification that we no longer need a proof of Lemma 4.9, since $a$ can be taken to be zero in (4.26) in the Eulerian case. The only thing to control is the validity of the bound (4.23) with the choice $t = (1 - \varepsilon)\frac{\bar d\,\alpha_d}{d}\, n\log n$.
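The explicit invariant measure $\pi(x) = d_x/m$ used throughout the Eulerian case can be checked numerically on a small instance. The sketch below is an illustration only and is not taken from the paper: the helper names `eulerian_configuration_model` and `step_distribution`, the degree sequence and the seed are all hypothetical. It pairs out-stubs with in-stubs uniformly at random, allowing loops and multiple edges, and verifies that one step of the walk leaves $\pi$ unchanged.

```python
import random


def eulerian_configuration_model(degrees, seed=0):
    """Directed configuration model with in-degree = out-degree = degrees[v].

    Each out-stub (tail) is matched to a uniformly random in-stub (head).
    Returns a list of directed edges (v, w); loops and multi-edges may occur.
    """
    rng = random.Random(seed)
    tails = [v for v, d in enumerate(degrees) for _ in range(d)]
    heads = tails[:]          # same multiset of stubs, since in-degree = out-degree
    rng.shuffle(heads)
    return list(zip(tails, heads))


def step_distribution(mu, edges, degrees):
    """One step of the simple random walk: (mu P)(w) = sum over edges v->w of mu(v)/d_v."""
    out = [0.0] * len(degrees)
    for v, w in edges:
        out[w] += mu[v] / degrees[v]
    return out


# Hypothetical degree sequence: half the vertices of degree 2, half of degree 3.
degrees = [2] * 50 + [3] * 50
edges = eulerian_configuration_model(degrees, seed=1)
m = sum(degrees)                      # number of edges

pi = [d / m for d in degrees]         # candidate invariant measure d_x / m
pi_next = step_distribution(pi, edges, degrees)
max_err = max(abs(a - b) for a, b in zip(pi, pi_next))
print(max_err)                        # zero up to floating-point rounding
```

The identity holds for every realisation of the matching, because each of the $d_w$ edges entering $w$ contributes exactly $1/m$. This is why no control of the extremal values of $\pi$ is needed in this case.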