A spectral characterization for concentration of the cover time

We prove that for a sequence of finite vertex-transitive graphs of increasing sizes, the cover times are asymptotically concentrated if and only if the product of the spectral-gap and the expected cover time diverges. In fact, we prove this for general reversible Markov chains under the much weaker assumption (than transitivity) that the maximal hitting time of a state is of the same order as the average hitting time.


Introduction
A large part of the modern theory of Markov chains is dedicated to the study of the hierarchy of different quantities associated with a Markov chain. It is a common theme that certain phenomena can be characterized by a simple criterion concerning whether or not two such quantities are strongly separated (i.e., are of strictly different orders). Often, one of these quantities is the inverse of the spectral-gap. One instance is the cutoff phenomenon and the condition that the product of the mixing-time and the spectral-gap diverges, known as the product condition (a necessary condition for precutoff in total-variation [32, Proposition 18.4] and a necessary and sufficient condition for cutoff in L² [14]). The condition that the product of the spectral-gap and the maximal (expected) hitting time diverges is studied in [1] and [27, Theorem 1].
Aldous' classic criterion for concentration of the cover time [4] is another such instance. It asserts that for a sequence of reversible Markov chains on finite state spaces of diverging sizes, if t_hit^(n) ≍ t_cov^(n) then τ_cov^(n)/t_cov^(n) does not concentrate around any value. [1] [2] Conversely (even without reversibility), if t_hit^(n)/t_cov^(n) → 0 then τ_cov^(n)/t_cov^(n) → 1 in distribution for every sequence of initial states.
Our Theorem 1 refines Aldous' criterion in the transitive setup by allowing one to replace the maximal hitting time in his result by the inverse of the spectral-gap, which is positioned much lower in the aforementioned hierarchy of Markov chain parameters (see (3.10)). Throughout, let gap := λ_2 be the spectral-gap of the considered chain and t_rel := 1/λ_2 its relaxation-time, where 0 = λ_1 < λ_2 ≤ ··· ≤ λ_|V| ≤ 2 are the eigenvalues of the Laplacian I − P. When considering simple random walk (SRW) on a graph G we often add a parenthetical '(G)' to the various quantities.
Theorem 1. Let G_n be a sequence of finite connected vertex-transitive graphs of diverging sizes. Then τ_cov(G_n)/t_cov(G_n) → 1 in distribution iff gap(G_n) t_cov(G_n) → ∞.
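For orientation (our own illustration, not taken from the text), two classical examples sit on opposite sides of this dichotomy:

```latex
\mathbb{Z}_n\ \text{(cycle)}:\quad \mathrm{gap} \asymp n^{-2},\quad
  t_{\mathrm{cov}} \asymp n^{2}
  \;\Longrightarrow\; \mathrm{gap}\cdot t_{\mathrm{cov}} = O(1)
  \quad\text{(no concentration)};
\qquad
\{0,1\}^{d}\ \text{(hypercube)}:\quad \mathrm{gap} \asymp d^{-1},\quad
  t_{\mathrm{cov}} \asymp d\,2^{d}
  \;\Longrightarrow\; \mathrm{gap}\cdot t_{\mathrm{cov}} \asymp 2^{d}\to\infty
  \quad\text{(concentration)}.
```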
We note that in the setup of Theorem 1, if gap(G_n) t_cov(G_n) = O(1) then τ_cov^(n)/t_cov^(n) does not concentrate around any fixed value (by transitivity this holds for all initial states). [3] Theorem 1 holds in the more general setup of reversible transitive Markov chains, that is, reversible Markov chains on a finite state space V whose transition matrix satisfies that for every x, y ∈ V there is a bijection f : V → V such that f(x) = y and P(x, z) = P(y, f(z)) for all z ∈ V. Theorem 2 extends Theorem 1 to a much larger class of Markov chains. Denote the average hitting time of an irreducible Markov chain on a finite state space V by α := Σ_y π(y)α_y, where throughout π denotes the stationary distribution and α_y := E_π[T_y]. Theorem 2 indeed generalizes Theorem 1, as (by Fact 3.1) for a transitive chain α ≤ t_hit ≤ 2α.
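For the reader's convenience, here is a short justification (our sketch) of the sandwich α ≤ t_hit ≤ 2α for transitive chains, using the random target identity recalled in §3:

```latex
\text{By transitivity, } \alpha_y := \mathbb{E}_\pi[T_y] \text{ does not depend on } y,
\text{ so } \alpha = \textstyle\sum_y \pi(y)\,\alpha_y = \alpha_y \text{ for every } y.
\text{Then}\quad
\alpha = \textstyle\sum_y \pi(y)\,\mathbb{E}_x[T_y] \;\le\; t_{\mathrm{hit}},
\quad\text{while}\quad
\mathbb{E}_x[T_y] \;\le\; \mathbb{E}_x[T_U] + \mathbb{E}_\pi[T_y]
 = \alpha + \alpha_y = 2\alpha \quad (U \sim \pi),
\quad\text{so}\quad t_{\mathrm{hit}} \le 2\alpha .
```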
Theorem 2. Consider a sequence of irreducible reversible Markov chains with finite state spaces V^(n) and stationary distributions π^(n) satisfying α^(n) ≍ t_hit^(n). If gap^(n) t_cov^(n) → ∞ [4] then τ_cov^(n)/t_cov^(n) → 1 in distribution for every sequence of initial states. Conversely, if gap^(n) t_cov^(n) = O(1) and

min_{x∈V^(n)} α_x^(n) ≍ t_cov^(n),    (1.1)

then τ_cov^(n)/t_cov^(n) does not concentrate around any fixed value for any sequence of initial states.
The assumption α^(n) ≍ t_hit^(n) is used in the first part of the statement of Theorem 2. We believe that the condition min_{x∈V^(n)} α_x^(n) ≍ t_cov^(n) in (1.1) can be relaxed. Throughout we work with the continuous-time rate 1 version of the chain. We remark that all our results are valid also in discrete time, even if the chain is not lazy (i.e., if P(x, x) = 0 for some x). Moreover, in this case one does not need to replace gap by the
[3] In fact, it is shown in [7] that for reversible chains there is always some state x and a set A of stationary probability at least 1/2 such that P_x[T_A > t] ≥ exp(−gap·t) for all t ≥ 0, where T_A := inf{t : X_t ∈ A}.
[4] We write o(1) for terms which vanish as n → ∞. We write f_n = o(g_n) or f_n ≪ g_n if f_n/g_n = o(1). We write f_n = O(g_n) and f_n ≲ g_n (and also g_n = Ω(f_n) and g_n ≳ f_n) if there exists a constant C > 0 such that |f_n| ≤ C|g_n| for all n. We write f_n = Θ(g_n) or f_n ≍ g_n if f_n = O(g_n) and g_n = O(f_n).
absolute spectral-gap (as is often the case when translating a result from the continuous-time or discrete-time lazy setups to the discrete-time non-lazy setup). This can be verified by an application of Wald's equation (used to argue that the expected cover time and hitting times are the same in both setups), together with the fact that Aldous' result [4] (which is used in the proofs of Theorems 1 and 2) applies to both setups.
We note that if G_n and G̃_n are two sequences of finite connected graphs of uniformly bounded degree which are uniformly quasi-isometric (i.e., there exists some K > 0 such that G_n is K-quasi-isometric to G̃_n for all n) then gap(G_n) ≍ gap(G̃_n), α(G_n) ≍ α(G̃_n), t_hit(G_n) ≍ t_hit(G̃_n) and [22, Theorem 1.6] t_cov(G_n) ≍ t_cov(G̃_n). [5] In particular, if G̃_n are vertex-transitive, the sequence of SRWs on G_n satisfies the conditions of Theorem 2 (apart perhaps from min_{x∈V^(n)} α_x ≍ t_cov(G_n), which is used only in (1.1)).
The cover time of an n × n grid torus is concentrated [18], while that of the n-cycle is not. The following example shows that an n × ⌈n/(log n)²⌉ grid torus is in some sense critical.
Example 1.1. Consider an n × m discrete (grid) torus (i.e., the Cayley graph of Z_n × Z_m w.r.t. the standard choice of generators). If m = m(n) = O(n/(log n)²) then its (expected) cover time is of order n², the same as the inverse of its spectral-gap. Conversely, if n/(log n)² ≪ m ≤ n then the cover time is of order mn(log n)² ≫ n², while the spectral-gap is Θ(1/n²).
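As a quick illustrative sanity check (ours, not part of the paper) of the non-concentrated side of this dichotomy, the snippet below simulates discrete-time SRW on the n-cycle (the degenerate case m = 1) and exhibits the empirical law of τ_cov visibly spread around its mean, which for the cycle is n(n − 1)/2:

```python
import random

def cover_time_cycle(n, rng):
    """Number of steps until discrete-time SRW on the n-cycle visits every vertex."""
    pos, visited, steps = 0, {0}, 0
    while len(visited) < n:
        pos = (pos + rng.choice((-1, 1))) % n  # +/-1 step on Z_n
        visited.add(pos)
        steps += 1
    return steps

rng = random.Random(0)
n, trials = 12, 3000
samples = [cover_time_cycle(n, rng) for _ in range(trials)]
mean = sum(samples) / trials  # close to n*(n-1)/2 = 66 for n = 12
# Fraction of samples outside [0.5*mean, 1.5*mean]: bounded away from 0,
# reflecting the non-concentration of tau_cov/t_cov for the cycle.
spread = sum(1 for s in samples if not 0.5 * mean <= s <= 1.5 * mean) / trials
print(mean, spread)
```

For an n × n torus the analogous histogram tightens around its mean as n grows, in line with Theorem 1.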
Theorem 1 is a fairly immediate consequence of the following result (see §4 for the details). We note that whenever 1/gap ≤ |V|^a for some a ∈ (0, 1), the bound offered by (1.2) is of the correct order: by Matthews' bound [35] t_cov ≤ t_hit(log |V| + 1) (see [32, Theorem 11.2]), while by (3.2) t_hit ≥ α ≥ (|V| − 1)/2 (and so gap·t_hit ≳ |V|^{1−a}). Theorem 2 (apart from (1.1)) is a fairly immediate consequence of the following extension of Proposition 1.1. When α ≍ t_hit and t_hit ≫ 1/gap we get that M(t_hit/α, α·gap) ≫ 1 and so t_cov ≫ t_hit. [5] The fact that α(G_n) ≍ α(G̃_n) follows from (3.2) via a standard comparison argument [19]. The claim that t_hit(G_n) ≍ t_hit(G̃_n) can be seen from the commute-time identity (e.g. [32, Eq. (10.14)]) combined with the robustness of the effective-resistance under quasi-isometries (cf. the proof of Theorem 2.17 in [34]).

Related work
Cover times have been studied extensively for over 35 years. This is a topic with rich ties to other objects, such as the Gaussian free field [22]. There have been many works providing general bounds on the cover time and studying its evolution and its fluctuations in general [41,20,36,6,31], and in particular for the giant component of various random graphs [16], for trees [2,23], for the two-dimensional torus [18,21,15,9] and for higher-dimensional tori [8]. Feige [24,25] proved tight extremal upper and lower bounds on cover times of graphs (by SRW). For a more comprehensive review of the literature see the related work section in [22] and the references therein. For further background on hitting times and cover times see [5,32,1].

Organization of this note
In §2 we present some open problems. In §3 we present some background on hitting-times. In §4 we prove Theorems 1-2 and Propositions 1.1 and 1.2. Example 1.1 is analyzed in §5.

Open Problems
In the following seven questions let G_n = (V_n, E_n) be a sequence of finite connected vertex-transitive graphs of diverging sizes. Denote the degree of G_n by d_n and its diameter (i.e., the maximal graph distance between a pair of vertices) by Diam(G_n). Let t_rel(G_n) := 1/gap(G_n) be the relaxation-time. The following two questions are the focus of an ongoing work with Nathanaël Berestycki (we believe both have an affirmative answer).
We now discuss some relaxations of the conditions from Questions 2.1 and 2.2. Let R(a, b) be the effective resistance between a and b and R_* := max_{x,y} R(x, y) (e.g. [32, Ch. 9] and [34, Ch. 2]). Let o_n ∈ V_n. Consider the conditions: Condition (i) arises in forthcoming work of Tessera and Tointon [39] as the analogue of transience in a Varopoulos-type result for finite vertex-transitive graphs. In particular, they show (in the transitive setup) that it follows from the condition Diam(G_n)² ≪ |V_n|/log |V_n|. Below we give some conditions equivalent to (i); one of them is t_hit(G_n) ≍ |V_n|. Using the fact that t_rel(G) ≤ 2d·Diam(G)² for a vertex-transitive graph G of degree d [32, Theorem 13.26] (as well as t_hit(G_n) ≳ |V_n|), it follows that when d_n ≍ 1,
Moderate growth is a certain technical growth condition introduced by Diaconis and Saloff-Coste in their seminal work [17]. For Cayley graphs this condition is shown by Breuillard and Tointon [13] to be equivalent to the condition c|V| ≤ Diam(G)^a, in some precise quantitative sense, with these a and c being related to the parameters in the definition in [17]. This was recently extended to vertex-transitive graphs by Tessera and Tointon [40].
Using the fact that for vertex-transitive graphs of moderate growth t_rel(G_n) ≍ Diam(G_n)² [17], [6] it appears that the main ingredient for establishing the converse implications to the ones in (2.1) when d_n ≍ 1 is providing an affirmative answer to Question 2.3. Indeed, for graphs of sufficiently large growth the condition Diam(G_n)² ≪ |V_n|/log |V_n| holds for free.
Question 2.5. Do the reverse implications in (2.1) hold? What can be said without the assumption that d n ≍ 1?
Question 2.6. Assume that Diam(G_n)² ≪ |V_n|/log |V_n|. Is it the case that for all fixed δ ∈ (0, 1),
For vertex-transitive graphs condition (i) is equivalent to t_hit(G_n) ≍ |V_n|, and condition (ii) is equivalent to the condition that |{x ∈ V_n : E_{o_n}[T_x] ≤ δ t_hit(G_n)}| = O(1) for all fixed δ ∈ (0, 1). Indeed, by the commute-time identity (e.g., [32, Proposition 10.7]), for SRW on a graph G = (V, E) we have that ([32, Proposition 10.10]; this also follows from (3.5)), and so
Question 2.7. In the above setup, is it the case that t_hit(G_n) ≍ |V_n| if and only if t_cov(G_n) ≍ |V_n| log |V_n|?
[6] See also [?, §8.1], where it is noted that the argument from [17] is valid for vertex-transitive graphs of moderate growth, not just for Cayley graphs.
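For the reader's convenience, the commute-time identity invoked here reads, for SRW on a finite connected graph G = (V, E):

```latex
\mathbb{E}_a[T_b] + \mathbb{E}_b[T_a] \;=\; 2|E|\,R(a,b)
\qquad \text{for all } a, b \in V .
```

Since hitting times of vertex-transitive graphs are symmetric, this gives E_a[T_b] = |E| R(a, b); when d_n ≍ 1 this is why condition (i) (boundedness of R_*) is equivalent to t_hit(G_n) ≍ |V_n|.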
The implication t_hit(G_n) ≍ |V_n| =⇒ t_cov(G_n) ≍ |V_n| log |V_n| follows (even without transitivity) from (3.10) and the bound t_cov(G) ≥ (1 − o(1))|V| ln |V| of Feige [24]. It is not clear what is the correct analog of transience for a sequence of finite graphs. One technical definition is given in [36]. Here we discuss two different conditions. Our goal is to motivate Questions 2.3-2.5, relate them back to τ_cov and stimulate future research.
A natural informal definition of being uniformly locally transient for a sequence G_n = (V_n, E_n) of d_n-regular graphs is that the expected number of returns of the walk to the origin by the mixing-time (or by time |V_n|) is O(1) (uniformly in the choice of the origin). If we think of "mixing" as "reaching infinity" then this is a natural analog of transience in the infinite setup. It is hence natural that such a condition can be phrased in terms of effective resistance. To make this precise, one can consider the following equivalent conditions: Consider a sequence of finite connected graphs G_n := (V_n, E_n). Let d_max^(n) and d_min^(n) be the maximal and minimal (respectively) degrees of G_n. We assume that d_max^(n) ≍ d_min^(n), so that the stationary distribution π_{G_n} of SRW on G_n satisfies max_{v∈V_n} π_{G_n}(v) ≍ 1/|V_n| ≍ min_{v∈V_n} π_{G_n}(v). A natural informal definition of G_n being uniformly globally transient is that the walk either returns to the origin rapidly (i.e., in O(1) time units), or else is unlikely to return before getting mixed. To make this precise, we consider the condition min ). These conditions imply that When G_n are vertex-transitive, conditions (1)-(3) are equivalent to condition (a) and also to conditions (1), (2') and (3), where condition (2') is: |B_eff−res(o, δR_*(G_n))| = O(1) for all fixed δ ∈ (0, 1) (this is condition (ii) above). While we omit the proofs of these equivalences, we stress that some of them are not at all obvious.
We strongly believe that (in the transitive setup) these conditions imply that t_cov(G_n)/(t_hit(G_n) log |V_n|) → 1, and that the distribution of the cover time exhibits Gumbel fluctuations.
Considering an n × n discrete torus shows that even under transitivity the condition t_rel(G_n) log |V_n| ≍ t_hit(G_n) does not imply that R_*(G_n) ≍ 1. Another illustrative example is an n × n × f(n) discrete torus. It is not hard to verify that when f(n) = ⌈log n⌉ we have that t_rel(G_n) log |V_n| ≍ t_hit(G_n) and R_*(G_n) ≍ 1. However, for every δ ∈ (0, 1) it holds that |{x ∈ V_n : R(o_n, x) ≤ δR_*(G_n)}| = Ω_δ(n^{2cδ} log n) for some absolute constant c ∈ (0, 1), and so condition (ii) fails. Conversely, if log n ≪ f(n) ≤ n then t_rel(G_n) log |V_n| ≪ t_hit(G_n) and conditions (i) and (ii) hold.
Considering a Cartesian product of the n-cycle with a vertex-transitive expander of size f(n) ≫ n shows that the condition t_rel(G_n) log |V_n| ≪ t_hit(G_n) above is not a necessary condition for conditions (i) and (ii) to hold.
In light of Example 1.1, the following question naturally arises.
Question 2.8. Let G_n be a sequence of finite connected vertex-transitive graphs of diverging sizes and uniformly bounded degrees. Assume that along every subsequence τ_cov(G_n)/t_cov(G_n) does not converge to 1 in distribution. Is it the case that when viewing G_n as a metric space with the graph distance as its metric, after rescaling distances by a Diam(G_n)/f(Diam(G_n)) factor, for every f : N → R_+ satisfying 1 ≪ f(k) = o((log k)²), the pointed Gromov-Hausdorff scaling limit exists and is R?
This question is the cover time analog of a question from [12], [7] where it is shown that for a sequence of finite vertex-transitive graphs G_n of fixed degree and increasing sizes whose mixing times are proportional to their maximal hitting times, the G_n rescaled by their diameters converge in the Gromov-Hausdorff topology to the unit circle S¹.

Hitting-times preliminaries
Let (X_t)_{t≥0} be an irreducible reversible Markov chain on a finite state space V with transition matrix P and stationary distribution π. Denote the law of the continuous-time rate 1 version of the chain starting from vertex x (resp. initial distribution μ) by P_x (resp. P_μ). Denote the corresponding expectation by E_x (resp. E_μ). Let H_t := e^{−t(I−P)} be its heat kernel (so that H_t(·, ·) are the time t transition probabilities).
[7] In which the assumption on τ_cov(G_n)/t_cov(G_n) is replaced by the assumption that t_mix^(∞)(G_n) ≍ t_hit(G_n), and there we take f(k) = o(log k).
We now present some background on hitting times. Let Z_{x,y} := ∫_0^∞ (H_t(x, y) − π(y)) dt. The random target identity (e.g. [32, Lemma 10.1]) asserts that Σ_y π(y)E_x[T_y] is independent of x and hence equals α, while for all x ∈ V we have that (e.g. [32, Proposition 10.26]) α_x = E_π[T_x] = Z_{x,x}/π(x). Averaging over x yields the eigentime identity ([5, Proposition 3.13]) α = Σ_{i=2}^{|V|} 1/λ_i. Moreover, E_x[T_y] = (Z_{y,y} − Z_{x,y})/π(y) for all x, y [1, Proposition 2] (see also (3.6)). Let U ∼ π be independent of the chain. As T_x ≤ T_U + inf{t ≥ 0 : X_{T_U+t} = x}, using the random target identity to argue that E[T_U] = α, as well as the strong Markov property, yields E_y[T_x] ≤ α + α_x for all x, y. The following material can be found in [5, §3.5]. Under reversibility, for any set A, the law of its hitting time T_A := inf{t : X_t ∈ A} under the initial distribution π conditioned on A^∁ is a mixture of Exponential distributions, whose minimal parameter λ(A) is the smallest Dirichlet eigenvalue of the set A^∁. There exists a distribution μ_A, known as the quasi-stationary distribution of A^∁, under which T_A has an Exponential distribution of parameter λ(A). It follows that λ(A) ≥ gap·π(A).
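The Dirichlet eigenvalue λ(A) can be compared with the global spectral-gap as follows (a standard bound; our sketch):

```latex
\text{If } f|_{A} = 0 \text{ then }
(\mathbb{E}_\pi f)^{2}
 = \bigl(\mathbb{E}_\pi[f\,\mathbf{1}_{A^{\complement}}]\bigr)^{2}
 \le (1-\pi(A))\,\mathbb{E}_\pi[f^{2}],
\quad\text{so}\quad
\mathcal{E}(f,f) \;\ge\; \mathrm{gap}\cdot \mathrm{Var}_\pi(f)
 \;\ge\; \mathrm{gap}\cdot \pi(A)\,\mathbb{E}_\pi[f^{2}],
\quad\text{whence}\quad
\lambda(A) \;\ge\; \mathrm{gap}\cdot \pi(A).
```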
Recall that t_mix^TV := inf{t : max_a Σ_b |H_t(a, b) − π(b)| ≤ 1/2} and that

t_rel ≲ t_mix^TV ≲ t_hit ≤ t_cov ≤ t_hit(log |V| + 1),    (3.10)

where the last inequality is due to Matthews [35] (see [32, Ch. 11] for a neat presentation). It is interesting to note that for reversible chains t_mix^TV ≤ C min_x max_y E_y[T_x] for some absolute constant C. This follows from the results of Lovász and Winkler [33] concerning what they call the "forget-time". [8] See equation (3.7) in [28] for a more elementary derivation.
Proof of Theorems 1-2 and Propositions 1.1 and 1.2

We will show that for every irreducible reversible Markov chain on a finite state space V, (4.1) and (4.2) hold. We first prove (1.2) and (1.3) assuming (4.2) and (4.1), whose proofs are deferred to the end of the section.
Proof of (1.2) and (1.3). We first prove (1.2). By (3.2), α·gap ≤ |V| (this is used in the first inequality below). By Fact 3.1, α ≥ t_hit/2. By (4.2) there exists a set B such that for all distinct a, b ∈ B we have that E_a[T_b] ≥ α/2 ≥ t_hit/4. The claim now follows from Matthews' method [35] (see [32, Proposition 11.4]), which asserts that t_cov ≥ min_{a,b∈B: a≠b} E_a[T_b] · log(|B| − 1). We now prove (1.3). We first use the Paley-Zygmund inequality to argue that π(D) is bounded from below, where D := {x ∈ V : α_x ≥ α/2}. Hence by (4.1) with ε = 1/4 there exists a subset B of D, of stationary probability bounded from below, such that for all distinct a, b ∈ B we have that E_a[T_b] ≥ α_b − α/4 ≥ α/4 (using the fact that α_x ≥ α/2 for all x ∈ D). The proof is concluded as above using Matthews' method. [8] Under reversibility, it follows from their result that t_stop ≤ C₁ t_forget−time ≤ C₁ min_x max_y E_y[T_x], while Aldous [3] showed that t_mix^TV ≤ C₂ t_stop (see also Peres and Sousi [38]), for some absolute constants C₁, C₂, where t_stop is the expectation of a mean optimal stopping rule ((4.3)) starting from the worst initial state.
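The two tools used repeatedly in these proofs are worth recording explicitly (standard statements, quoted here for convenience): the Paley-Zygmund inequality, for a nonnegative random variable Z and θ ∈ (0, 1), and Matthews' lower bound, for any B ⊆ V:

```latex
\mathbb{P}\bigl[Z \ge \theta\,\mathbb{E}[Z]\bigr]
 \;\ge\; (1-\theta)^{2}\,\frac{(\mathbb{E}[Z])^{2}}{\mathbb{E}[Z^{2}]},
\qquad\qquad
t_{\mathrm{cov}} \;\ge\;
 \Bigl(\min_{a,b\in B,\; a\neq b} \mathbb{E}_a[T_b]\Bigr)\,
 \log\bigl(|B|-1\bigr).
```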
We now prove Theorems 1 and 2.
Proof of Theorems 1 and 2 without (1.1). Recall that Aldous [4] showed that in the reversible setup τ_cov^(n)/t_cov^(n) → 1 in distribution for all sequences of initial states iff t_hit^(n)/t_cov^(n) → 0.
Before proving (1.1), we first recall a notion of mixing, first introduced by Aldous [3] in the continuous-time setup, and later studied in discrete time by Lovász and Winkler [33], who developed a rich theory, and also by Peres and Sousi [38] and by Oliveira [37]:

t_stop := max_x inf{E_x[T] : T is a stopping rule with P_x[X_T ∈ ·] = π(·)},    (4.3)

where a stopping rule is a stopping time, possibly with respect to a filtration larger than the natural filtration. A stopping rule attaining the infimum in (4.3) is called mean optimal. Lovász and Winkler [33] showed that for every initial state x there exists a mean optimal stopping rule T, and [33, Theorem 2.2] that T is mean optimal iff there exists a state y such that a.s. T ≤ T_y.
Such a state is called a halting state. While they work in discrete time, a standard application of Wald's equation can be used to translate their results to the continuous-time setup (cf. [38]; alternatively, one can simply check that the arguments in [33] can be carried out directly in continuous time).
Aldous [3] showed that under reversibility 1/C ≤ t_stop/t_mix^TV ≤ C for some universal constant C. This was refined by Peres and Sousi [38] and independently by Oliveira [37] (see also [32, Ch. 24]), who in particular showed that for reversible Markov chains t_mix^TV ≍ max_{x,A : π(A)≥α} E_x[T_A] for all fixed α > 1/2 (this was extended also to α = 1/2 in [30]). For more connections between hitting times and mixing times we refer the reader to [7,26,29].
Proof of (1.1). We suppress the dependence on n. Fix some x and some mean optimal stopping rule T (such that P_x[X_T ∈ ·] = π(·)). Let y be a halting state. Then by the strong Markov property, and the fact that T ≤ T_y a.s., we have that for all t, where we have used the Paley-Zygmund inequality in the second inequality. Using the assumption min_z α_z ≍ t_cov, we conclude the proof by arguing that α_y² ≍ E_π[T_y²]. Indeed, by the aforementioned assumption α_y ≍ t_hit, whereas α_y² ≤ E_π[T_y²] ≤ 2t_hit² by (3.3).
To conclude the proof of Propositions 1.1 and 1.2 it remains to prove (4.2) and (4.1).

Analysis of Example 1.1
Let G_n be an n × m grid torus with m = m(n) ≤ n. It is well known that for SRW on the n-cycle the spectral-gap is ≍ n^{−2} (cf. [32, Example 12.10]). Hence by general results about product chains (e.g. [32, Corollary 12.13]) gap(G_n) ≍ n^{−2} (uniformly for all m(n) ≤ n).
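The product-chain computation behind this can be sketched as follows (our sketch; in continuous time the rate-1 walk on Z_n × Z_m moves at rate 1/2 in each coordinate, so its Laplacian is the half-sum of the two cycle Laplacians):

```latex
\text{Eigenvalues of } I-P \text{ on the } k\text{-cycle}:\quad
 \lambda^{(k)}_j = 1 - \cos(2\pi j / k), \qquad j = 0, \dots, k-1;
\text{eigenvalues on } \mathbb{Z}_n \times \mathbb{Z}_m:\quad
 \tfrac12\bigl(\lambda^{(n)}_j + \lambda^{(m)}_l\bigr),
\quad\text{so}\quad
\mathrm{gap}(G_n)
 = \tfrac12\min\bigl\{\lambda^{(n)}_1, \lambda^{(m)}_1\bigr\}
 = \tfrac12\bigl(1-\cos(2\pi/n)\bigr) \asymp n^{-2}
\qquad (m \le n).
```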
We first consider the case that m ∈ [n/log n, n], and prove that t_cov ≲ mn(log n)² for such m. The same bound for m ∈ [n/(log n)², n/log n] will be given at the end of the section. Let H_t := e^{−t(I−P)} be the heat kernel of the continuous-time SRW on G_n, and let H_t^{(k)} denote the heat kernel of the continuous-time SRW on the k-cycle. Then H_t^{(k)}(x, x) − 1/k ≲ t^{−1/2} for t ≤ k² (this follows by the local CLT, e.g. [11, §4.4], or from some more general considerations, e.g. [32, Theorem 17.17]), while by the Poincaré inequality a corresponding exponential-decay bound (with some absolute constants c, C₀, C₁) holds for all t ≥ 0. Because the continuous-time chain evolves independently in the two coordinates, at rate 1/2 along each coordinate (e.g. [32, p. 288, (20.35)]), we have H_t(a, b) = H_{t/2}^{(n)}(a₁, b₁)·H_{t/2}^{(m)}(a₂, b₂) for all a = (a₁, a₂) and b = (b₁, b₂). Denote the vertex set of G_n by V_n and the uniform distribution on V_n by π. It follows that for all y ∈ V_n we have that ∫_{2n²}^∞ (H_t(y, y) − π(y)) dt ≤ C₂ ∫_0^{2n²} (H_t(y, y) − π(y)) dt, and so t_hit(G_n) ≤ C₅(n² + nm log m); hence by (3.10) t_cov(G_n) ≤ C₆(n² + nm log m) log n.
For m ∈ [n/log n, n] we get that t_cov(G_n) ≲ nm(log n)².
We now prove a matching lower bound for m ∈ [n/(log n)², n]. If x, y ∈ V_n are of graph distance at least √m then by the local CLT (e.g. [11, §4.4]) Z_{y,y} − Z_{x,y} ≳ log m, where we have used the fact that for transitive chains H_t(y, y) ≥ H_t(x, y) for all x, y and all t (e.g. [5, Eq. (3.60)]). Hence by (3.6) for such x, y we have that

E_x[T_y] ≳ nm log m.    (5.3)

By considering a collection of vertices A ⊆ V_n of size Ω(n) such that any distinct a, b ∈ A are of distance at least √m from each other, we get by (5.3) and Matthews' argument that t_cov(G_n) ≳ nm(log n)² (for m ∈ [n/(log n)², n]).
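Combining the two bounds above with gap(G_n) ≍ n^{−2} (our arithmetic, anticipating the small-m case treated next), the dichotomy asserted in Example 1.1 follows from Theorem 1:

```latex
t_{\mathrm{cov}}(G_n) \asymp \max\{\,n^{2},\; nm(\log n)^{2}\,\},
\qquad
\mathrm{gap}(G_n)\, t_{\mathrm{cov}}(G_n)
 \asymp \max\Bigl\{\,1,\; \frac{m(\log n)^{2}}{n}\Bigr\}
 \;\longrightarrow\; \infty
\quad\Longleftrightarrow\quad m \gg n/(\log n)^{2}.
```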
We now treat m ∈ [1, n/(log n)²]. Using our upper bound on the spectral-gap, it remains only to show that t_cov(G_n) ≲ n² in this regime. This requires a more careful analysis than the one above.
It is not hard to show that there exists an absolute constant p ∈ (0, 1) such that for every n and every C ≥ 1, SRW on the n-cycle satisfies that all vertices are visited at least Cn/4 times by time Cn² with probability at least p. For an argument specific to the cycle cf. [10, Lemma 6.6]. This also follows from a general result from [22], which says that the blanket-time (see [22] for a definition) is proportional to the cover time. For a set A we define the induced chain on A, denoted by (Y_k^A)_{k≥0}, to be the chain (X_t)_{t≥0} viewed only at times at which it visits A. That is, Y_k^A := X_{t_k(A)}, where t_0(A) := inf{t ≥ 0 : X_t ∈ A} and, inductively, t_{k+1}(A) is the first time after t_k(A) at which the chain makes a jump landing in A. We now argue that it suffices to show that the induced chain on each strip S_i := {i} × Z_m satisfies that the probability that it is not covered in Cn steps is ≪ 1/n, provided that C is sufficiently large (note that here steps are counted w.r.t. the induced chain, i.e., these are the numbers of visits to each strip). That is, by symmetry, it suffices to show that max_{a∈S_1} P_a^{ind,S_1}[τ_cov^{ind,S_1} > ⌈Cn⌉] = o(1/n). Indeed, once this is established for some C ≥ 1, then using the above with the partition S_1, ..., S_n we get that P_v[τ_cov > 8Cn²] ≤ P_v[max_{i∈[n]} t_{⌈Cn⌉}(S_i) > 8Cn²] + n·max_{a∈S_1} P_a^{ind,S_1}[τ_cov^{ind,S_1} > ⌈Cn⌉]. The r.h.s. is at most 1 − p + o(1), where p ∈ (0, 1) is as above. Using the obvious submultiplicativity property, this implies that t_cov(G_n) ≲ n² as desired.
Denote M := max_{x,y∈S_1} E_x^{ind,S_1}[T_y]. We will show that M ≤ C₆ m log m. By Markov's inequality, P_a^{ind}[T_b > 2M] ≤ 1/2 for all a, b in the same strip. By the Markov property, for all a, b in the same strip we have that P_a^{ind}[T_b > 2M⌈log₂(n³)⌉] ≤ 2^{−⌈log₂(n³)⌉} ≤ n^{−3}.
By a union bound over the m vertices in that strip we obtain the desired tail estimate on the cover time of a single strip w.r.t. the induced chain, as 2M⌈log₂(n³)⌉ ≲ m(log m)(log n) ≲ n.
It remains to show that M = O(m log m). The induced chain is itself a transitive chain; in particular, its stationary distribution is the uniform distribution on S_1. Let x, y ∈ S_1. By Wald's equation, and the fact that the expected return time to S_1 from any a ∈ S_1 is n, we have that E_x[T_y] = n·E_x^{ind,S_1}[T_y], and so by (3.6) we have that

M ≲ m log m.    (5.6)

We now show that also for m ∈ [n/(log n)², n/log n] we have t_cov ≲ nm(log n)². Indeed, by (5.6) M ≲ m log m, and so by the above analysis we can bound t_cov from above, up to a constant factor, by the expected time until all strips are visited for at least CM log n time units. As M log n ≲ n, it is not hard to show that this time is of order n × (CM log n) ≍ nm(log n)², by using the fact that there exists C > 0 such that with probability at least p > 0 all strips are visited for at least n time units by time Cn². (To see this, consider a sequence of consecutive time intervals of length Cn², and use the Markov property to argue that during each interval, with probability at least p > 0, independently of the previous time intervals, all strips are visited for at least n time units.)