1 Introduction

The behaviour of random walks on random fractals has been the subject of intense study since the 1970s [26], and a sophisticated and widely applicable theory has now developed on the topic [13, 46, 49]. In particular, it is now well established that the asymptotic behaviour of spectral quantities such as exit times, return probabilities, and walk displacement is determined under mild conditions by geometric properties such as volume growth and resistance growth [17, 49], with very general results to this effect established in the recent work of Lee [56]. This theory has led to a fairly complete understanding of several notable motivating examples including random planar maps [25, 31,32,33], high-dimensional percolation and branching random walks [13, 16, 48], and uniform spanning trees in two dimensions [11], three dimensions [6], and high dimensions (\(d>4\)) [36]. The analysis of other important examples such as two-dimensional critical percolation remains largely open despite significant partial progress [29, 45, 46].

As suggested by this list of examples, many of the most interesting random fractals arise from critical statistical mechanics models, and for many such models the geometric and spectral properties of the associated random fractal depend heavily on the dimension in which the model is considered. Indeed, for many random fractals arising in statistical mechanics, a dichotomy emerges around an upper-critical dimension [70], denoted \(d_c\), which is equal to 4 for the uniform spanning tree and 6 for percolation: below this dimension, the behaviour of the fractal is highly dependent on the geometry of the underlying space, while above this dimension the fractal displays mean-field behaviour, meaning that its large-scale behaviour is the same as it would be in a ‘geometrically trivial’ setting such as the complete graph or the binary tree. For many models the mean-field regime is described by Alexander–Orbach behaviour [5, 14, 46], in which the relevant random fractal has quadratic volume growth, spectral dimension 4/3, and typical n-step walk displacement of order \(n^{1/3}\). Indeed, Alexander–Orbach behaviour has been proven to hold for high-dimensional oriented percolation by Barlow, Jarai, Kumagai, and Slade [13], for high-dimensional percolation by Kozma and Nachmias [48], and for the high-dimensional uniform spanning tree by the second author [36]. (An interesting example that is not expected to exhibit Alexander–Orbach behaviour in high dimensions is the minimal spanning forest, mean-field models of which have cubic volume growth and spectral dimension 3/2 [1, 62].)

At the upper-critical dimension itself (\(d=d_c\)), it is expected that mean-field behaviour almost holds, with many quantities of interest expected to exhibit a polylogarithmic correction to their mean-field scaling. It is this regime that provides the focus of this paper, in which we determine the precise order of the polylogarithmic corrections to scaling for the geometric and spectral properties of the uniform spanning tree (UST) at its upper-critical dimension \(d_c=4\). The particular polylogarithmic corrections we compute are those governing the volume of balls, the resistance across them, and the return probabilities, range, displacement and exit times of random walks on the tree. Most of our work goes into estimating the volume growth and resistance growth of the 4d UST, with the associated random walk estimates following straightforwardly from techniques developed in [13, 50] that are by now rather standard. (The relevant proofs are presented in a self-contained way in Sect. 3.3.) We believe that this is the first time that polylogarithmic corrections to Alexander–Orbach behaviour have been computed for the random walk on a random fractal at the upper-critical dimension. Moreover, we believe that our work is only the second computation of such polylogarithmic corrections to random walk behaviour at the upper-critical dimension for any model, the first being [67], which computes the exact polylogarithmic corrections for the random walk on the four-dimensional random walk trace. Partial progress on this problem for other models includes [40] (see also [41]), in which the existence of a non-trivial polylogarithmic correction to resistance growth is established for oriented branching random walk in \({{\,\mathrm{{\mathbb {Z}}}\,}}^6\times {{\,\mathrm{{\mathbb {Z}}}\,}}_+\).

1.1 The uniform spanning tree

Over the last 30 years, the uniform spanning tree has emerged as a model of central importance throughout probability theory, with close connections to many other topics including electrical networks [23, 47], loop-erased random walk [18, 52, 69], the dimer model [20, 44], the Abelian sandpile model [21, 36, 42, 43, 60] and the random cluster model [30, 34]. Aside from these connections, the UST is also interesting as an example of a model exhibiting many of the rich phenomena associated with critical statistical mechanics models, but which is much more tractable to study than essentially any other (non-Gaussian) model thanks to its close connection to random walks via Wilson’s algorithm [18, 69] and the Aldous–Broder algorithm [4, 22, 35].

We now very briefly introduce the model, referring the reader to e.g. [9, 36, 59] for further background. The uniform spanning tree of a finite connected graph is defined by choosing a spanning tree (i.e. a connected subgraph that contains every vertex and no cycles) of the graph uniformly at random. Pemantle [64] proved that there is a well-defined infinite volume limit of the uniform spanning tree of the hypercubic lattice \({{\,\mathrm{{\mathbb {Z}}}\,}}^d\) which does not depend on the boundary conditions used when taking the limit and which is connected a.s. if and only if \(d\le 4\) (see also [18]). This infinite volume limit is known as the uniform spanning tree of \({{\,\mathrm{{\mathbb {Z}}}\,}}^d\) when \(d\le 4\) and the uniform spanning forest of \({{\,\mathrm{{\mathbb {Z}}}\,}}^d\) when \(d\ge 5\). The critical dimension \(d=4\) is characterized by the UST just barely managing to be connected, with two points at Euclidean distance n typically connected by a path of Euclidean diameter much larger than n and with the length of the path in the tree connecting two neighbouring vertices having an extremely heavy \((\log n)^{-1/3}\) tail [55]. This heavy tail on the probability of an abnormally long connection, and the related fact that the length of a loop-erased random walk in four dimensions is only very weakly concentrated, is responsible for many of the technical difficulties encountered in this paper. For example, it makes it difficult to justify the important heuristic that the volume of the intrinsic n-ball in the tree comes mostly from ‘typical’ points for which the tree-geodesic to the origin has Euclidean diameter of order \(n^{1/2} (\log n)^{1/6}\).

1.2 Distributional asymptotic notation

To facilitate a clean presentation of our main results, we use distributional asymptotic notation (a.k.a. “big-O and little-o in probability” notation). Since this notation is not at all standard in probability theory, let us take a moment to explain how it is used. We hope the reader will find this diversion worthwhile after seeing how clean the statements of our main theorems are compared with similar results in the literature, and consider using this notation in their own work.

Before introducing this notation, let us first briefly introduce standard (deterministic) asymptotic notation as we use it. We write \(\asymp \), \(\succeq \), and \(\preceq \) for equalities and inequalities holding to within positive multiplicative constants, so that if f and g are non-negative then “\(f(n) \preceq g(n)\) for every \(n\ge 1\)” means that there exists a positive constant C such that \(f(n)\le Cg(n)\) for every \(n\ge 1\). (We will often drop the “for every \(n\ge 1\)” and write simply “\(f(n)\preceq g(n)\)” when doing so does not cause confusion.) We use Landau’s asymptotic notation similarly, so that \(f(n)=O(g(n))\), \(f(n)=\Omega (g(n))\), and \(f(n)=\Theta (g(n))\) mean the same thing as \(f(n) \preceq g(n)\), \(f(n) \succeq g(n)\), and \(f(n) \asymp g(n)\) respectively, while \(f(n)=o(g(n))\) means that \(f(n)/g(n)\rightarrow 0\) as \(n\rightarrow \infty \). More complicated expressions can be obtained by putting this notation inside functions, so that e.g. \(f(n)=O(e^{n-o(n^{1/2})})\) means that there exists a non-negative function h(n) with \(n^{-1/2}h(n)\rightarrow 0\) and a positive constant C such that \(f(n)\le Ce^{n-h(n)}\) for every \(n\ge 1\). Implicit constants and functions given by this notation will always be non-negative, and we denote quantities of uncertain sign using \(\pm O\), \(\pm o\), etc. (While this is not completely standard, it greatly increases the expressive power of the notation.) Be careful to note that when forming such compound expressions, \(\Theta \) should always be interpreted as the conjunction of O and \(\Omega \), so that “\(f(n)=\Theta (e^{n-o(n)})\)” means the same thing as “\(f(n)=O(e^{n-o(n)})\) and \(f(n)=\Omega (e^{n-o(n)})\)”, which means that there exist positive constants c and C and possibly distinct non-negative functions \(h^+\) and \(h^-\) with \(\lim _{n\rightarrow \infty }n^{-1}h^+(n)=\lim _{n\rightarrow \infty } n^{-1}h^-(n) =0\) such that \(f(n) \le C e^{n-h^+(n)}\) and \(f(n)\ge c e^{n-h^-(n)}\). Whenever we use asymptotic notation, we can add a qualifier such as “as \(n\rightarrow \infty \)” to mean that the inequalities in question hold only for sufficiently large n; this will typically be used to avoid worrying about expressions such as \(\log \log n\) being undefined or negative for small values of n.

We use boldface characters to apply this notation in settings where the relevant bounds are guaranteed only to hold with high probability, rather than deterministically. Given two sequences of (possibly deterministic) non-negative random variables \((X_n)\) and \((Y_n)\) defined on the same probability space, we write

$$\begin{aligned} X_n&={\textbf{O}}(Y_n)&&\text {if for each } \varepsilon>0 \text { there exists } C<\infty \text { such that } {\mathbb {P}}(X_n> C Y_n)\le \varepsilon \text { for every } n,\\ X_n&={\varvec{\Omega }}(Y_n)&&\text {if } Y_n={\textbf{O}}(X_n),\\ X_n&=\varvec{\Theta }(Y_n)&&\text {if } X_n={\textbf{O}}(Y_n) \text { and } X_n={\varvec{\Omega }}(Y_n), \text { and}\\ X_n&={\textbf{o}}(Y_n)&&\text {if } X_n/Y_n\rightarrow 0 \text { in probability.} \end{aligned}$$

In other words, \(X_n={\textbf{O}}(Y_n)\) and \(Y_n={\varvec{\Omega }}(X_n)\) both mean that \(\{X_n/Y_n\}\) is tight in \([0,\infty )\), \(X_n=\varvec{\Theta }(Y_n)\) means that \(\{X_n/Y_n\}\) is tight in \((0,\infty )\), and \(X_n={\textbf{o}}(Y_n)\) means that \(X_n/Y_n\) converges to zero in probability. As in the deterministic case, we can add a qualifier “as \(n\rightarrow \infty \)” to mean that there exists \(n_0<\infty \) such that the relevant inequalities hold between \(X_n\) and \(Y_n\) provided that \(n\ge n_0\). Let us stress again that, as in the deterministic case, the random variables denoted implicitly by our use of asymptotic notation are always taken to be non-negative. When we wish to apply this notation to quantities of uncertain sign we use \(\pm {\textbf{O}}\), \(\pm {\textbf{o}}\), etc. as appropriate.

As in the deterministic case, this notation really begins to shine when forming more complicated compound expressions. Again, we warn the reader that in such an expression the implicit random variables (e.g. those appearing in an exponent) may be different in the upper and lower bounds; indeed, this will usually be the case in our applications. To give a contrived example in which all these conventions come into force, “\(X_n = {\varvec{\Theta }} (\exp [n+{\textbf{O}}((\log n)^{{\textbf{O}}(1)})\pm {\textbf{o}}(\log \log n)])\) as \(n\rightarrow \infty \)” is equivalent to the statement that there exist \(n_0<\infty \), sequences of non-negative random variables \((A_n^-)\), \((A_n^+)\), \((B_n^-)\), \((B_n^+)\), \((C_n^-)\), and \((C_n^+)\), and sequences of real-valued random variables \((D_n^-)\) and \((D_n^+)\) such that \((A_n^-)\) is tight in \((0,\infty ]\), the sequences \((A_n^+)\), \((B_n^-)\), \((B_n^+)\), \((C_n^-)\), and \((C_n^+)\) are tight in \([0,\infty )\), \((D_n^-)\) and \((D_n^+)\) converge to zero in probability, and

$$\begin{aligned} A_n^- e^{n+B_n^- (\log n)^{C_n^-}+D_n^- \log \log n} \le X_n \le A_n^+ e^{n+B_n^+ (\log n)^{C_n^+}+D_n^+ \log \log n} \qquad \text { for every }n\ge n_0. \end{aligned}$$

Note the incredible economy we have achieved by writing this complicated condition in the simple form “\(X_n = {\varvec{\Theta }} (\exp [n+{\textbf{O}}((\log n)^{{\textbf{O}}(1)})\pm {\textbf{o}}(\log \log n)])\) as \(n\rightarrow \infty \)”!
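To see numerically what these definitions mean, the following toy Python simulation (our own illustration, with an artificial choice of \(X_n\)) demonstrates tightness in \((0,\infty )\): the empirical quantiles of \(X_n/f(n)\) stay bounded away from 0 and \(\infty \) uniformly in n.

```python
import math, random

# Toy example: X_n = E * n^2 (log n)^{-1/3} with E an Exp(1) random variable,
# so that X_n = Theta(n^2 (log n)^{-1/3}) in the distributional sense.
def sample_X(n):
    return random.expovariate(1.0) * n**2 * math.log(n) ** (-1 / 3)

for n in (10**2, 10**4, 10**6):
    ratios = sorted(
        sample_X(n) / (n**2 * math.log(n) ** (-1 / 3)) for _ in range(10_000)
    )
    # 5% and 95% empirical quantiles of X_n / f(n): these stabilise near the
    # corresponding quantiles of Exp(1) and do not drift towards 0 or infinity.
    print(n, round(ratios[500], 2), round(ratios[9500], 2))
```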

Remark 1

As with deterministic asymptotic notation, there are many useful elementary notational identities. Of these, we will repeatedly use that for any sequences of non-negative random variables \((X_n)\) and \((Y_n)\): if \(X_n={\textbf{o}}(Y_n)\) then \(X_n={\textbf{O}}(Y_n)\), and if \(X_n={\textbf{O}}(Y_n(\log n)^\delta )\) for every \(\delta >0\), then \(X_n={\textbf{O}}(Y_n(\log n)^{o(1)})\). Similarly, if \(X_n=\varvec{\Omega }(Y_n(\log n)^{-\delta })\) for every \(\delta >0\), then \(X_n=\varvec{\Omega }(Y_n(\log n)^{-o(1)})\).

1.3 Statement of results

We now state our main results. We begin with our results on the volumes of intrinsic balls, the proof of which occupies the majority of the paper.

Theorem 1.1

(Volume growth) Let \({\mathfrak {T}}\) be the uniform spanning tree of \({{\,\mathrm{{\mathbb {Z}}}\,}}^4\) and for each \(n\ge 0\) let \({\mathfrak {B}}(n)={\mathfrak {B}}(0,n)\) denote the intrinsic ball of radius n around the origin in \({\mathfrak {T}}\). The volume of \({\mathfrak {B}}(n)\) satisfies the distributional asymptotics

$$\begin{aligned} |{\mathfrak {B}}(n)| = \varvec{\Theta }\left( \frac{n^2}{(\log n)^{1/3-o(1)}}\right) \qquad \text { and } \qquad {\mathbb {E}}|{\mathfrak {B}}(n)| = \Theta \left( \frac{n^2}{(\log n)^{1/3-o(1)}}\right) \end{aligned}$$

as \(n\rightarrow \infty \). Moreover, letting \(\Lambda (r)\) denote the \(\ell ^\infty \) ball of radius r around the origin in \({{\,\mathrm{{\mathbb {Z}}}\,}}^4\) for each \(r\ge 0\), we have that

$$\begin{aligned} \lim _{n\rightarrow \infty }{\mathbb {P}}\left( {\mathfrak {B}}(n)\subseteq \Lambda \Bigl (n^{1/2}(\log n)^{1/6+\delta }\Bigr )\right) =1 \end{aligned}$$

for every \(\delta >0\).

Recall that in high dimensions the components of the uniform spanning forest have quadratic volume growth \(|{\mathfrak {B}}(n)|=\varvec{\Theta }(n^2)\) [12, 36], so that the behaviour in four dimensions differs from the high-dimensional behaviour by a polylogarithmic factor as expected.

The proofs of both the upper and lower bounds of Theorem 1.1 rely on Wilson’s algorithm [18, 69] to express properties of the tree in terms of properties of loop-erased random walks. Accordingly, they also both rely on an understanding of the behaviour of the loop-erased random walk in four dimensions developed in [51, 54, 55], with the proof of the lower bounds also relying on the control of the capacity of the loop-erased walk developed in [7, 37]. The proof of the upper bound also uses a generalisation of the method of typical times introduced in [37], a very useful technical tool that allows us to circumvent several issues that arise from the fact that the length of a four-dimensional loop-erased random walk is only very weakly concentrated. (The use of this machinery is also responsible for the presumably unnecessary subpolylogarithmic \((\log n)^{\pm o(1)}\) errors appearing throughout our results.)

We now turn to our results concerning the random walk on the four-dimensional UST. We write \({\mathbb {P}}\) and \({\mathbb {E}}\) for probabilities and expectations taken with respect to the joint law of the UST \({\mathfrak {T}}\) on \({{\,\mathrm{{\mathbb {Z}}}\,}}^4\) and the random walk \(X=(X_n)_{n\ge 0}\) on \({\mathfrak {T}}\) started at the origin, and write \({\textbf{P}}^{\mathfrak {T}}\) and \({\textbf{E}}^{\mathfrak {T}}\) for probabilities and expectations taken with respect to the conditional law of X given \({\mathfrak {T}}\). We write \(p^{{\mathfrak {T}}}_n(x,y)\) for the transition probabilities of a random walk on the uniform spanning tree \({\mathfrak {T}}\) conditional on \({\mathfrak {T}}\), write \(\tau _n\) for the time taken for the random walk to hit the complement of the intrinsic ball of radius n, and write \(d_{\mathfrak {T}}\) for the intrinsic distance on \({\mathfrak {T}}\).

Theorem 1.2

(Random walk asymptotics) Let \({\mathfrak {T}}\) be the uniform spanning tree of \({{\,\mathrm{{\mathbb {Z}}}\,}}^4\) and let \(X=(X_n)_{n\ge 0}\) be the simple random walk on \({\mathfrak {T}}\) started at the origin. The following distributional asymptotic expressions hold as \(n\rightarrow \infty \):

$$\begin{aligned} \textit{Intrinsic displacement}:&\qquad d_{{\mathfrak {T}}}(X_0,X_n),\ \max _{0\le i\le n} d_{{\mathfrak {T}}}(X_0,X_i) = {\varvec{\Theta }}\Bigl (n^{1/3}(\log n)^{1/9-o(1)}\Bigr ) \end{aligned}$$
(1)
$$\begin{aligned} \textit{Extrinsic displacement}:&\qquad \max _{0\le i\le n} \Vert X_i\Vert _{\infty } = \varvec{\Theta }\Bigl (n^{1/6}(\log n)^{2/9+o(1)}\Bigr ) \end{aligned}$$
(2)
$$\begin{aligned} \textit{Return probabilities}:&\qquad p_{2n}^{{\mathfrak {T}}}(0,0) = \varvec{\Theta }\Bigl (n^{-2/3}(\log n)^{1/9-o(1)}\Bigr ) \end{aligned}$$
(3)
$$\begin{aligned} \textit{Range}:&\qquad \#\{X_m:0\le m \le n\} = \varvec{\Theta }\Bigl (n^{2/3}(\log n)^{-1/9\pm o(1)}\Bigr ) \end{aligned}$$
(4)
$$\begin{aligned} \textit{Hitting times}:&\qquad \tau _n,\ {\textbf{E}}^{\mathfrak {T}}[\tau _n] = \varvec{\Theta }\Bigl (n^{3}(\log n)^{-1/3+o(1)}\Bigr ). \end{aligned}$$
(5)

Remark 2

It is reasonably straightforward to adapt the proofs of [36] to prove that, in four dimensions, all the quantities we consider here satisfy Alexander–Orbach asymptotics up to \((\log n)^{\pm O(1)}\) factors. Identifying the correct powers of \(\log \) is significantly more difficult and is the primary contribution of this paper.

As mentioned above, the behaviour of the random walk on the uniform spanning tree has previously been studied in dimensions \(d=2\) [11, 15], \(d=3\) [6], and \(d\ge 5\) [36], with the two cases \(d=2\) and \(d=3\) presenting unique challenges that are largely distinct from those associated to the critical dimension \(d=d_c=4\) considered here. While we are the first to study the polylogarithmic corrections to the volume of balls and the behaviour of random walks on the UST at \(d=4\), our work builds upon the substantial literature studying other aspects of the 4d UST, the highlights of which include [37, 51, 53,54,55, 66]. Our work is influenced most strongly by the recent work of Sousi and the second author [37]; we rely on both the results proven and the techniques developed in that paper in numerous ways.

Following Kumagai–Misumi [50], which collects and generalises results of [10, 13, 14], estimates of the form proven in Theorem 1.2 can all be deduced from the volume growth estimates of Theorem 1.1 together with estimates on the effective resistance between the origin and the boundary of a ball in the tree. The relevant effective resistance estimates will in turn be deduced from Theorem 1.1 together with the asymptotics of the intrinsic arm probability computed in [37]. We let \({\mathscr {R}}_{\textrm{eff}}(A\leftrightarrow B;G)\) denote the effective resistance between sets \(A,B\subseteq V[G]\) in the graph G, where we assign unit resistance to each edge \(e\in E[G]\), so that if \(\deg _{\mathfrak {T}}(0)\) denotes the degree of 0 in \({\mathfrak {T}}\) then \({\mathscr {R}}_{\textrm{eff}}(0\leftrightarrow \partial {\mathfrak {B}}(0,n);{\mathfrak {T}})^{-1}:=\deg _{\mathfrak {T}}(0)\, {\textbf{P}}^{\mathfrak {T}}\bigl (\text {hit } \partial {\mathfrak {B}}(0,n) \text { before returning to } 0\bigr )\). Background on effective resistances can be found in e.g. [49, 59].
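Since \({\mathfrak {T}}\) is a tree, this effective resistance can be computed exactly by the series and parallel laws. The following Python snippet is a minimal sketch of ours (not code from the paper) that does this for a finite tree given as a child map; it can be used, for instance, on samples generated by Wilson's algorithm (see Sect. 2).

```python
import math

def resistance_to_boundary(children, root, n):
    """Effective resistance from `root` to the set of vertices at intrinsic
    distance exactly n, in a tree whose edges all have unit resistance.
    `children` maps each vertex to the list of its children in the tree."""
    def R(v, depth):
        if depth == n:
            return 0.0  # boundary vertices are wired together at zero resistance
        # Children subtrees are in parallel; each contributes the edge (v, c)
        # in series with the subtree below c.
        conductance = 0.0
        for c in children.get(v, []):
            r_c = R(c, depth + 1)
            if math.isfinite(r_c):
                conductance += 1.0 / (1.0 + r_c)
        return 1.0 / conductance if conductance > 0 else math.inf
    return R(root, 0)

# Sanity checks: a path realises the trivial linear upper bound R_eff <= n,
# while a binary tree has bounded resistance R_eff = 1 - 2^{-n}.
path = {i: [i + 1] for i in range(10)}
print(resistance_to_boundary(path, 0, 10))    # 10.0
binary = {v: [2 * v + 1, 2 * v + 2] for v in range(2**10)}
print(resistance_to_boundary(binary, 0, 10))  # ~0.999
```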

Theorem 1.3

(Effective resistance) Let \({\mathfrak {T}}\) be the uniform spanning tree of \({{\,\mathrm{{\mathbb {Z}}}\,}}^4\) and for each \(n\ge 0\) let \(\partial {\mathfrak {B}}(n)= \partial {\mathfrak {B}}(0,n)\) denote the set of vertices at intrinsic distance exactly n from the origin in \({\mathfrak {T}}\). Then

$$\begin{aligned} {\mathscr {R}}_{\textrm{eff}}(0\leftrightarrow \partial {\mathfrak {B}}(0,n);{\mathfrak {T}}) = n (\log n)^{-{\textbf{o}}(1)} \end{aligned}$$

as \(n\rightarrow \infty \).

Note that the linear upper bound \({\mathscr {R}}_{\textrm{eff}}(0\leftrightarrow \partial {\mathfrak {B}}(0,n)) \le n\) is trivial and holds for any graph. Together with existing results in other dimensions [6, 11, 36], Theorem 1.3 shows that the UST has (approximately) linear effective resistance growth in every dimension. As will be clear from the proof, this is a consequence of the scaling relation

$$\begin{aligned} {\mathbb {P}}(\text {the past of the origin has intrinsic diameter} \ge n) \approx \frac{n}{\text {typical volume of an intrinsic }n\text {-ball}}, \end{aligned}$$
(6)

which also holds in every dimension. Here, the past of the origin is the union of the origin and the finite connected components of the UST left when the origin is deleted; estimating the probability that the past is large in various senses is the main subject of [37], which in particular establishes up-to-constants estimates on the left hand side of (6). Currently, however, there is no direct proof of this scaling relation, which in four dimensions is verified only by computing the two sides separately in [37] and the present paper. It would be very interesting to have a direct and general proof of this relation that works in all dimensions without computing either quantity.
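To illustrate the relation, substituting the volume asymptotics of Theorem 1.1 into the right-hand side of (6) gives, in four dimensions,

$$\begin{aligned} {\mathbb {P}}(\text {the past of the origin has intrinsic diameter} \ge n) \approx \frac{n}{n^2(\log n)^{-1/3\pm o(1)}} = \frac{(\log n)^{1/3\pm o(1)}}{n}, \end{aligned}$$

which indeed agrees with the up-to-constants estimates of [37] up to the subpolylogarithmic error.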

While Theorems 1.1 and 1.3 are sufficient to compute the exact logarithmic corrections to the asymptotic properties of the random walk on the UST using the methods of [12, 50] as discussed above, we will also show that a significantly stronger bound on the displacement of the random walk can be proven using the Markov-type method pioneered in the work of Lee and coauthors [27, 29, 56, 57].

Theorem 1.4

(Sharp upper bounds on the mean-squared displacement) Let \({\mathfrak {T}}\) be the uniform spanning tree of \({{\,\mathrm{{\mathbb {Z}}}\,}}^4\) and let \(X=(X_n)_{n\ge 0}\) be the simple random walk on \({\mathfrak {T}}\) started at the origin. Then

$$\begin{aligned} {\mathbb {E}}\left[ \max _{0\le i \le n}d_{{\mathfrak {T}}}(X_0,X_i)^2\right] \preceq n^{2/3}(\log n)^{2/9} \end{aligned}$$
(7)

for every \(n\ge 2\).

The specific argument used to prove this theorem is closely inspired by the work of Ganguly and Lee [29]. Briefly, the idea is to use the universal Markov-type inequality for weighted metrics on trees [27] to prove a diffusive upper bound for the random walk with respect to a modified metric supported only on vertices of the tree whose past has large intrinsic diameter, and then to deduce the desired subdiffusive estimate in the original metric. The tail bounds on the intrinsic diameter of the past of the origin proven in [37] are precisely what is needed to carry this argument through. In particular, the proof of Theorem 1.4 does not rely on Theorem 1.1 or the theory of typical times, allowing us to avoid the \((\log n)^{\pm o(1)}\) errors present in the statement of that theorem.
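For orientation, we recall the standard notion underlying this method: a metric space (X,d) has Markov type 2 with constant M if for every stationary reversible Markov chain \((Z_t)_{t\ge 0}\) on a finite state space S and every map \(f:S\rightarrow X\),

$$\begin{aligned} {\mathbb {E}}\, d\bigl (f(Z_n),f(Z_0)\bigr )^2 \le M^2\, n\, {\mathbb {E}}\, d\bigl (f(Z_1),f(Z_0)\bigr )^2 \qquad \text {for every } n\ge 1. \end{aligned}$$

The inequality of [27] referred to above provides such a bound for weighted trees with a universal constant, and applying it to \({\mathfrak {T}}\) with an appropriately chosen modified metric is what produces the diffusive estimate mentioned above.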

Remark 3

As discussed in [13, Example 2.6], although the typical displacement of the random walk can always be controlled in terms of volume growth and resistance growth, it is possible in general for the displacement not to be uniformly integrable, so that its mean grows significantly faster than its median. As such, the second moment estimate provided by Theorem 1.4 is significantly stronger than what can be deduced directly from Theorems 1.1 and 1.3 by the techniques of [13, 50].

2 Intrinsic Volume Growth

In this section we prove Theorem 1.1. The upper and lower bounds of the theorem, which use completely different techniques, are proven in Sects. 2.1 and 2.2 respectively. Both parts of the proof will utilize the connections between the uniform spanning tree and the loop-erased random walk implied by Wilson’s algorithm, and so to proceed we must provide notation for the loop-erased random walk and some related quantities.

Loop-erased random walk. For each \(-\infty \le n\le m\le \infty \), let L(n,m) be the graph with vertex set \(\{i\in {{\,\mathrm{{\mathbb {Z}}}\,}}:n\le i\le m\}\) and edge set \(\{\{i,i+1\}:n\le i\le m-1\}\). A path is then a multigraph homomorphism from L(n,m) to the hypercubic lattice \({{\,\mathrm{{\mathbb {Z}}}\,}}^4\) for some \(-\infty \le n\le m\le \infty \). We write \(w_i=w(i)\) for the vertex visited at time i. For \(n\le b\le m\), we write \(w^b\) for the restriction of w to [n,b], and call \(w^b\) the path stopped at b. In particular, given a random walk X, we will often use the notation \(X^T\) for a random walk stopped at some possibly random time T. A path is said to be transient if it visits every vertex of \({{\,\mathrm{{\mathbb {Z}}}\,}}^4\) at most finitely many times. In particular, finite paths are always transient. Given a transient path \(w:L(0,m)\rightarrow {{\,\mathrm{{\mathbb {Z}}}\,}}^4\), we recursively define the sequence of times \(\ell _n(w)\) by \(\ell _0(w)=0\), and

$$\begin{aligned} \ell _{n+1}(w) = 1 + \max \{k:w_k=w_{\ell _n}\}, \end{aligned}$$

where we terminate the sequence the first time \(\max \{k:w_k=w_{\ell _n}\}=m\) when \(m<\infty \). The loop-erasure of w is then the path induced by the sequence of neighbouring vertices

$$\begin{aligned} \textrm{LE}(w)_i = w_{\ell _i(w)}. \end{aligned}$$

We will also need the quantity

$$\begin{aligned} \rho _n(w) = \max \{m\ge 0:\ell _m(w)\le n\}, \end{aligned}$$

which for each \(n\ge 0\) counts the number of points up to time n (excluding \(w_0\)) which are not erased when computing the loop-erasure of w, so that \((\ell _n)_{n\ge 0}\) and \((\rho _n)_{n\ge 0}\) are inverses of each other in the sense that

$$\begin{aligned} \ell _n(w)\le m \quad \text {if and only if} \quad \rho _m(w) \ge n, \end{aligned}$$

for every \(n,m\ge 0\).
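To make these definitions concrete, the following Python sketch (our own illustration, not code from the paper) implements \(\ell _n\), \(\textrm{LE}\), and \(\rho _n\) exactly as defined above for finite paths, using the observation that \(\max \{k:w_k=w_{\ell _n}\}\) is the last visit time of the vertex \(w_{\ell _n}\).

```python
import math, random

def srw_path(n, d=4):
    """A simple random walk path of length n on Z^d, as a list of tuples."""
    w = [(0,) * d]
    for _ in range(n):
        i = random.randrange(d)
        x = list(w[-1])
        x[i] += random.choice((-1, 1))
        w.append(tuple(x))
    return w

def erasure_times(w):
    """The times ell_0 = 0, ell_{n+1} = 1 + max{k : w_k = w_{ell_n}},
    terminated the first time the maximum equals len(w) - 1."""
    last = {x: k for k, x in enumerate(w)}  # last visit time of each vertex
    ells = [0]
    while last[w[ells[-1]]] < len(w) - 1:
        ells.append(last[w[ells[-1]]] + 1)
    return ells

def loop_erase(w):
    """LE(w)_i = w_{ell_i(w)}."""
    return [w[l] for l in erasure_times(w)]

def rho(w, n):
    """rho_n(w) = max{m >= 0 : ell_m(w) <= n}."""
    return max(i for i, l in enumerate(erasure_times(w)) if l <= n)

# Crude numerical check of the concentration in Theorem 2.1 below, using a
# longer finite walk as a stand-in for the infinite walk X:
n = 100_000
w = srw_path(4 * n)
print(rho(w, n) / (n * math.log(n) ** (-1 / 3)))  # should be of order 1
```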

The loop-erasure of a simple random walk is known as the loop-erased random walk. The theory of loop-erased random walk was both introduced and developed extensively by Lawler [52], whose results on the four-dimensional loop-erased random walk [54, 55] play an extensive role in this paper both directly and through inputs to [37]. Given a random walk X, we will usually abbreviate \(\ell _n=\ell _n(X)\) and \(\rho _n=\rho _n(X)\). It will also be convenient to define the notation

$$\begin{aligned} \textrm{LE}_\infty (X^n) := \textrm{LE}(X)^{\rho _n} \end{aligned}$$

for \(n\ge 0\), giving the component of the infinite loop erasure \(\textrm{LE}(X)\) which is contributed by the first n steps of the random walk X. We emphasise that the brackets of \(\textrm{LE}_\infty (X^n)\) do not indicate that \(\textrm{LE}_\infty (X^n)\) is a function of just \(X^n\). The following concentration estimates of Lawler [54, 55], as stated in [37, Theorem 2.2], will be used repeatedly throughout the paper.

Theorem 2.1

([37], Theorem 2.2) Let X be a simple random walk on \({{\,\mathrm{{\mathbb {Z}}}\,}}^4\). Then

$$\begin{aligned} {{\,\mathrm{{\textbf{P}}}\,}}\left( \left| \frac{\rho _n}{n(\log n)^{-1/3}}-1\right|>\varepsilon \right)&\preceq _\varepsilon \frac{\log \log n}{(\log n)^{2/3}}\qquad \text {and hence}\\ {{\,\mathrm{{\textbf{P}}}\,}}\left( \left| \frac{\ell _n}{n(\log n)^{1/3}}-1\right| >\varepsilon \right)&\preceq _\varepsilon \frac{\log \log n}{(\log n)^{2/3}}, \end{aligned}$$

for every \(\varepsilon >0\) and \(n\ge 3\).

Wilson’s algorithm rooted at infinity [18, 69] allows us to build a sample of the UST of \({{\,\mathrm{{\mathbb {Z}}}\,}}^4\) (or any other transient graph) out of loop-erased random walks. This algorithm is very important to most analyses of the UST. We will assume that the reader is already familiar with Wilson’s algorithm, referring them to e.g. [59] for background otherwise.
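For concreteness, here is a minimal Python sketch of ours (not code from the paper) of Wilson's algorithm on a large finite box with wired boundary, a finite-volume stand-in for the algorithm rooted at infinity: the boundary plays the role of the root, and each new branch is a loop-erased random walk run until it hits the current tree.

```python
import itertools, random

def srw_step(x):
    """One step of the simple random walk on Z^4."""
    i = random.randrange(4)
    y = list(x)
    y[i] += random.choice((-1, 1))
    return tuple(y)

def loop_erase(path):
    """Chronological loop-erasure (equivalent to the ell_n definition above)."""
    out = []
    for v in path:
        if v in out:
            del out[out.index(v) + 1:]  # erase the loop just closed at v
        else:
            out.append(v)
    return out

def wilson_wired_box(R):
    """Sample the UST of the box [-R, R]^4 with wired boundary via Wilson's
    algorithm rooted at the boundary; returns a map sending each interior
    vertex to its parent, oriented towards the root."""
    parent = {}
    def in_tree(x):
        return any(abs(c) > R for c in x) or x in parent
    for v0 in itertools.product(range(-R, R + 1), repeat=4):
        if in_tree(v0):
            continue
        walk = [v0]
        while not in_tree(walk[-1]):        # walk until the current tree is hit
            walk.append(srw_step(walk[-1]))
        branch = loop_erase(walk)
        for a, b in zip(branch, branch[1:]):
            parent[a] = b                   # add the loop-erased branch
    return parent
```

Starting from a sample `parent = wilson_wired_box(R)`, one can build the adjacency structure of the tree and measure quantities such as \(|{\mathfrak {B}}(n)|\) by breadth-first search from the origin, giving a crude numerical counterpart to Theorem 1.1 (modulo finite-volume effects).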

Finally, let us introduce notation concerning the geometry of \({{\,\mathrm{{\mathbb {Z}}}\,}}^4\) and the tree \({\mathfrak {T}}\). We write \(\left\Vert x\right\Vert \) for the \(\ell ^\infty \) norm of \(x\in {{\,\mathrm{{\mathbb {Z}}}\,}}^4\) and write \(\Lambda (x,r)\) for the \(\ell ^\infty \) ball around \(x\in {{\,\mathrm{{\mathbb {Z}}}\,}}^4\) of radius r. For convenience, we will write \(\Lambda (r)\) for \(\Lambda (0,r)\). For each \(x\in {{\,\mathrm{{\mathbb {Z}}}\,}}^4\) and \(r\ge 1\), \({\mathfrak {B}}(x,r)\) will denote the intrinsic ball of radius r around x in \({\mathfrak {T}}\), with \({\mathfrak {B}}(r):={\mathfrak {B}}(0,r)\). For each pair of vertices \(x,y\in {{\,\mathrm{{\mathbb {Z}}}\,}}^4\) we write \(\Gamma (x,y)\) for the unique simple path between x and y in \({\mathfrak {T}}\), which is well-defined since the UST of \({{\,\mathrm{{\mathbb {Z}}}\,}}^4\) is a.s. connected [18, 64], and write \(\Gamma (x,\infty )\) for the future of x in \({\mathfrak {T}}\), i.e. the unique infinite simple path in \({\mathfrak {T}}\) with x as an endpoint, which is well-defined since the UST of \({{\,\mathrm{{\mathbb {Z}}}\,}}^4\) is one-ended a.s. [18, 64]. Given two vertices \(x,y\in {{\,\mathrm{{\mathbb {Z}}}\,}}^4\) we will denote by \(x\wedge y=y\wedge x\) the unique point at which the futures of x and y in \({\mathfrak {T}}\) first intersect.

The past of a vertex v in the uniform spanning tree \({\mathfrak {T}}\), denoted \({\mathfrak {P}}(v)\), is the union of the vertex and the finite components that are disconnected from infinity when the vertex is deleted from \({\mathfrak {T}}\). We write \({\mathfrak {P}}(v,n)\) for \({\mathfrak {P}}(v)\cap {\mathfrak {B}}(v,n)\) and write \(\partial {\mathfrak {B}}(v,n)\) for the set of vertices in \({\mathfrak {T}}\) at intrinsic distance exactly n from v. Further discussion of the basic topological features of the UST used here can be found in [59, Chapter 10].

2.1 Upper bounds

In this section we prove the following two propositions, which establish the upper bounds of Theorem 1.1. Throughout this section we will write \(\asymp \), \(\preceq \), and \(\succeq \) with subscripts such as \(\delta \) and p to mean that the implicit constants are allowed to depend on these parameters.

Proposition 2.2

Let \({\mathfrak {T}}\) be the uniform spanning tree of \({{\,\mathrm{{\mathbb {Z}}}\,}}^4\). Then

$$\begin{aligned} {\mathbb {E}}|{\mathfrak {B}}(n)| = O\left( \frac{n^2}{(\log n)^{1/3-o(1)}}\right) \end{aligned}$$

as \(n\rightarrow \infty \).

Proposition 2.3

Let \({\mathfrak {T}}\) be the uniform spanning tree of \({{\,\mathrm{{\mathbb {Z}}}\,}}^4\) and let \(\delta >0\). Then

$$\begin{aligned} {{\,\mathrm{{\mathbb {P}}}\,}}\left( {\mathfrak {B}}(n) \nsubseteq \Lambda \Bigl (n^{1/2}(\log n)^{1/6+\delta }\Bigr )\right) \preceq _\delta \frac{\log \log n}{(\log n)^{2/3}} \end{aligned}$$

for every \(n\ge 3\).

Both of these results will be proven using the following supporting technical proposition, which bounds in expectation the part of the volume of an intrinsic ball coming from paths of atypically large diameter.

Proposition 2.4

Let \({\mathfrak {T}}\) be the uniform spanning tree of \({{\,\mathrm{{\mathbb {Z}}}\,}}^4\), let \(\delta >0\) and let \(p\ge 1\). Then

$$\begin{aligned} {\mathbb {E}}\,\big \vert \big \{x\in {\mathfrak {B}}(n):\Gamma (0,x)\nsubseteq \Lambda \big (n^{1/2}(\log n)^{1/6+\delta }\big )\big \}\big \vert \preceq _{p,\delta } \frac{n^2}{(\log n)^{p}} \end{aligned}$$

for every \(n\ge 2\).

The expected intrinsic volume bound of Proposition 2.2 follows immediately from Proposition 2.4 together with [37, Proposition 7.3], which provides a tight upper bound on the number of points connected to the origin inside an extrinsic box of a given radius.

Proposition 2.5

([37], Proposition 7.3) Let \({\mathfrak {T}}\) be the uniform spanning tree of \({{\,\mathrm{{\mathbb {Z}}}\,}}^4\). Then

$$\begin{aligned} {\mathbb {E}}\,|\{x\in {{\,\mathrm{{\mathbb {Z}}}\,}}^4: \Gamma (0,x)\subseteq \Lambda (r)\}|\preceq \frac{r^4}{\log r} \end{aligned}$$

for every \(r\ge 2\).

Proof of Proposition 2.2

Fix \(\delta >0,\,p\ge 1\) and \(n\ge 4\). We have trivially that

$$\begin{aligned} {\mathbb {E}}|{\mathfrak {B}}(n)|&\le {\mathbb {E}}\,\big \vert \{x\in {\mathfrak {B}}(n):\Gamma (0,x)\nsubseteq \Lambda (n^{1/2}(\log n)^{1/6+\delta })\} \big \vert \\&\quad +{\mathbb {E}}\,|\{x\in {{\,\mathrm{{\mathbb {Z}}}\,}}^4: \Gamma (0,x)\subseteq \Lambda (n^{1/2}(\log n)^{1/6+\delta })\}|. \end{aligned}$$

Applying Proposition 2.4 to the first term on the right hand side and Proposition 2.5 to the second (with \(r=n^{1/2}(\log n)^{1/6+\delta }\), for which \(r^4/\log r\asymp n^2(\log n)^{-1/3+4\delta }\)) yields that

$$\begin{aligned} {\mathbb {E}}|{\mathfrak {B}}(n)|\preceq _{p,\delta } \frac{n^2}{(\log n)^{1/3-4\delta }}+\frac{n^2}{(\log n)^{p}}, \end{aligned}$$

which implies the claim since \(\delta >0\) and \(p\ge 1\) were arbitrary. \(\square \)

The deduction of Proposition 2.3 from Proposition 2.4 requires a more involved argument using further results of [37] and is given after the proof of Proposition 2.4.

To prove Proposition 2.4 we need to be able to relate balls in the extrinsic metric (i.e. the \(\ell ^\infty \) metric on \({{\,\mathrm{{\mathbb {Z}}}\,}}^4\)) to balls in the intrinsic metric. Intuitively, since paths in the UST are distributed as loop-erased random walks and since length-n loop-erased random walks in \({{\,\mathrm{{\mathbb {Z}}}\,}}^4\) are typically generated by simple random walks of length roughly \(n (\log n)^{1/3}\) [55], we expect that intrinsic paths of length n in the UST should have extrinsic diameter concentrated around \(n^{1/2} (\log n)^{1/6}\). Unfortunately, however, the concentration estimates that are available for the length of loop-erased random walks are far too weak to directly rule out the possibility that most of the volume of the intrinsic ball comes from paths of atypically large diameter. We circumvent this problem using a generalization of the typical time methodology of [37, Section 8], originally introduced to prove tail estimates on the extrinsic radius of the past of the origin: we will use typical times to control balls in the intrinsic metric by balls of an appropriate radius in the extrinsic metric.

Typical times. We now detail the generalised typical time methodology that we use. Given points \(x,y\in {{\,\mathrm{{\mathbb {Z}}}\,}}^4\) and a simple path \(\gamma \) starting at x and ending at y, let X be a random walk started at x and conditioned to hit y and to have loop erasure \(\gamma \) when it first hits y. Roughly speaking, the typical time \(T(\gamma )\) of \(\gamma \) is defined to be the typical length of the walk X under this conditional distribution; an important part of the theory is that this length is concentrated around the typical time \(T(\gamma )\) under mild conditions on the path \(\gamma \). Our proofs will apply a slight generalization of this notion, which we now introduce. Instead of stopping the walk at a single point y, we introduce disjoint sets \(A,B\subset {{\,\mathrm{{\mathbb {Z}}}\,}}^4\) and define the (A,B)-typical time \(T_{A,B}(\gamma )\) of a simple path \(\gamma \) starting at x, ending when it first hits A, and avoiding B to be

$$\begin{aligned} T_{A,B}(\gamma ):={\textbf{E}}_x\left[ \sum _{i=1}^{|\gamma |} \Big (\ell _i(X^{\tau _A})-\ell _{i-1}(X^{\tau _A})\Big )\wedge |\gamma |\ \Bigg \vert \ \tau _A<\infty , \tau _A<\tau _B, \textrm{LE}(X^{\tau _A})=\gamma \right] , \end{aligned}$$

where \({\textbf{E}}_x\) denotes expectation with respect to the law of a simple random walk X on \({{\,\mathrm{{\mathbb {Z}}}\,}}^4\) started at \(X_0=x\), and where the times \(\ell _i(X^{\tau _A})\) are from the definition of the loop-erasure of \(X^{\tau _A}\), so that

$$\begin{aligned} \tau _A= \ell _{|\gamma |}(X^{\tau _A})=\sum _{i=1}^{|\gamma |} \Big (\ell _i(X^{\tau _A})-\ell _{i-1}(X^{\tau _A})\Big ) \end{aligned}$$

when \(\textrm{LE}(X^{\tau _A})=\gamma \). We will use boldface to denote probabilities and expectations taken with respect to the law of a simple random walk throughout the paper, so that \({{\,\mathrm{{\textbf{P}}}\,}}_x\) will denote probability with respect to the law of a simple random walk started at time 0 at vertex x. We remark that for paths \(\gamma \) which hit A and avoid B we have that \(T_{A,B}(\gamma )=T_{A\cup B,\emptyset }(\gamma )\), where we define \(\tau _\emptyset = \infty \), and that the usual typical time as defined in [37] is given by \(T(\eta )=T_{\{\eta _n\},\emptyset }(\eta )\) when \(\eta \) has length n.

The following lemma extends [37, Lemma 8.2] to (A,B)-typical times. The proof is identical to the proof of that lemma and is omitted.

Lemma 2.6

There exists a constant C such that if \(x\in {{\,\mathrm{{\mathbb {Z}}}\,}}^4\), if A and B are disjoint subsets of \({{\,\mathrm{{\mathbb {Z}}}\,}}^4\), and if \(\gamma \) is a simple path of length \(n\ge 0\) from x to A which does not intersect B, then

$$\begin{aligned} {\textbf{P}}_x\left( |\tau _A-T_{A,B}(\gamma )|>\lambda n \ \Big |\ \tau _A<\infty , \tau _A<\tau _B, \textrm{LE}(X^{\tau _A})=\gamma \right) \le \frac{C}{\lambda }, \end{aligned}$$

for every \(\lambda \ge 1\).

As explained in detail in [37, Section 8], for most paths of interest the typical time \(T(\gamma )\) is significantly larger than \(|\gamma |\), so that Lemma 2.6 can indeed be thought of as a concentration estimate, justifying the use of the ‘typical time’ terminology. Indeed, when \(\gamma \) is a loop-erased random walk of length n its typical time will usually be of order \(n (\log n)^{1/3}\). For an arbitrary path \(\gamma \) of length \(n\ge 1\) the best bounds are of the form

$$\begin{aligned} n \preceq T(\gamma ) \preceq n \log (n+1); \end{aligned}$$
(8)

the lower bound is trivial while the upper bound follows by bounding the distribution of the length of the loop \(\ell _i(X^{\tau _A})-\ell _{i-1}(X^{\tau _A})\) by that of the length of an unconditioned simple random walk loop in \({{\,\mathrm{{\mathbb {Z}}}\,}}^4\) (see [37, Equation (8.6)]). The upper bound is sharp when \(\gamma \) is a straight line, while the lower bound is sharp when \(\gamma \) is a space-filling curve.

As in [37], we bound typical times by a simpler functional that is easier to work with. If \(\gamma \) has length n, we define \(A_i(\gamma )=\sum _{k=1}^n \frac{1}{k}\textrm{Esc}_k(\gamma ^i)^2\), where given a finite path \(\eta \) of length m and an integer \(k\ge 1\) the k-step escape probability \(\textrm{Esc}_k(\eta )\) is defined by \(\textrm{Esc}_k(\eta )={\textbf{P}}_{\eta _m}(X^k\cap \eta ^{m-1}=\emptyset )\). We then define

$$\begin{aligned} {\widetilde{T}}(\gamma ):=\sum _{i=0}^{n-1} A_i(\gamma ). \end{aligned}$$

It follows from the same calculations used to derive the analogous bound for the ordinary hitting time on [37, Page 69] that

$$\begin{aligned} {\widetilde{T}}(\gamma ) \succeq T_{A,B}(\gamma ) \end{aligned}$$
(9)

for every path \(\gamma \) and every pair of disjoint sets \(A,B \subseteq {{\,\mathrm{{\mathbb {Z}}}\,}}^4\). For a given \(0<\delta \le 1\), we say that a finite path \(\gamma \) of length \(n\ge 0\) is \(\delta \)-good if

$$\begin{aligned} \sum _{i=0}^{n-1} A_{i}(\gamma )\mathbb {1}\big (A_i\ge (\log n)^{1/3+\delta }\big )\le \delta n, \end{aligned}$$

and say it is \(\delta \)-bad otherwise. If \(\gamma \) is a \(\delta \)-good path of length \(n\ge 2\), then

$$\begin{aligned} {\widetilde{T}}(\gamma )\le \delta n +\sum _{i=0}^{n-1} A_i(\gamma ) \mathbb {1}\big (A_i<(\log n)^{1/3+\delta }\big )\preceq n(\log n)^{1/3+\delta }. \end{aligned}$$
(10)

We will apply [37, Lemma 8.5], which is based on the work of Lawler [55] (see also [51]), and states that the loop erasure of a random walk is highly unlikely to be bad.

Lemma 2.7

([37], Lemma 8.5) Let \(\delta >0\) and \(p\ge 0\), and let X be a simple random walk on \({{\,\mathrm{{\mathbb {Z}}}\,}}^4\). Then

$$\begin{aligned} \frac{1}{n} \sum _{k=0}^n {\textbf{P}}_0\Big (\textrm{LE}(X^k) \text { is } \delta \text {-bad}\Big )\preceq _{\delta ,p} \frac{1}{(\log n)^p}, \end{aligned}$$

for every \(n\ge 2\).

We now apply this machinery to prove Proposition 2.4. We will also use the mass-transport principle for \({{\,\mathrm{{\mathbb {Z}}}\,}}^4\), which states that if \(f:{{\,\mathrm{{\mathbb {Z}}}\,}}^4\times {{\,\mathrm{{\mathbb {Z}}}\,}}^4\rightarrow [0,\infty ]\) is a diagonally invariant function, meaning that \(f(x,y)=f(x+z,y+z)\) for every \(x,y,z\in {{\,\mathrm{{\mathbb {Z}}}\,}}^4\), then \(\sum _{x\in {{\,\mathrm{{\mathbb {Z}}}\,}}^4}f(0,x)=\sum _{x\in {{\,\mathrm{{\mathbb {Z}}}\,}}^4}f(-x,0)=\sum _{x\in {{\,\mathrm{{\mathbb {Z}}}\,}}^4}f(x,0)\); the first equality follows by applying diagonal invariance with \(z=-x\), and the second by reindexing \(x\mapsto -x\).

Proof of Proposition 2.4

To prove the proposition, we will show that if \({\mathscr {A}}\) is any set of simple paths \(\gamma \) with \(\gamma _0=0\) and with length \(|\gamma |\le n\), then

$$\begin{aligned} \sum _{v\in {{\,\mathrm{{\mathbb {Z}}}\,}}^4} {{\,\mathrm{{\mathbb {P}}}\,}}\Bigl (\Gamma (0,v)\in {\mathscr {A}} \text { and }\Gamma (0,0 \wedge v)\subseteq \Lambda (0 \wedge v,r)\Bigr ) \preceq _{\delta ,p}&\sum _{v\in {{\,\mathrm{{\mathbb {Z}}}\,}}^4} {{\,\mathrm{{\mathbb {P}}}\,}}\Bigl (\Gamma (0,v)\in {\mathscr {A}},\ \Gamma (0,0 \wedge v)\subseteq \Lambda (0 \wedge v,r),\\&\quad \text {and } \Gamma (v,0 \wedge v)\subseteq \Lambda \bigl (0 \wedge v,n^{1/2}(\log n)^{1/6+\delta }\bigr ) \Bigr ) + n^2 (\log n)^{-p} \end{aligned}$$
(11)

for every \(0<\delta \le 1\), \(p\ge 1\), and \(n,r\ge 2\). Before proving (11), let us see how it implies the proposition, for which we need some further notation. Given a finite path \(\gamma =(\gamma _0,\ldots ,\gamma _{|\gamma |})\) and a vector x, we define \(\gamma +x=(\gamma _0+x,\ldots ,\gamma _{|\gamma |}+x)\) and \(\gamma ^\leftarrow =(\gamma _{|\gamma |},\ldots ,\gamma _0)\). We extend these operations to sets of paths in the obvious way. Fix \(\delta \in (0,1]\) and \(p\ge 1\), and define the two sets of paths

$$\begin{aligned} {\mathscr {A}}_0&=\{\gamma :\gamma \text { simple},\, \gamma _0 = 0,\, |\gamma |\le n,\, \gamma \nsubseteq \Lambda (0,n^{1/2}(\log n)^{1/6+2\delta })\}, \\ {\mathscr {A}}_0^\prime&=\{\gamma :\gamma \text { simple},\, \gamma _0 = 0,\, |\gamma |\le n,\, \gamma \nsubseteq \Lambda (\gamma _{|\gamma |},n^{1/2}(\log n)^{1/6+2\delta })\}. \end{aligned}$$

For any \(x\in {{\,\mathrm{{\mathbb {Z}}}\,}}^4\), writing \({\mathscr {A}}_0(x)\) for the set of paths \({\mathscr {A}}_0+x\), we observe that for any path \(\gamma \) with \(\gamma _0=x\) and \(\gamma _{|\gamma |}=0\) we have that

$$\begin{aligned} \gamma \in {\mathscr {A}}_0(x) \iff \gamma ^\leftarrow \in {\mathscr {A}}_0^\prime . \end{aligned}$$
(12)

With this notation and observation in hand, setting \({\mathscr {A}}={\mathscr {A}}_0\) in (11) and taking \(r\uparrow \infty \), we get

$$\begin{aligned} {\mathbb {E}}\,\big \vert \{x\in {{\,\mathrm{{\mathbb {Z}}}\,}}^4:\Gamma (0,x)\in {\mathscr {A}}_0\} \big \vert&=\sum _{v\in {{\,\mathrm{{\mathbb {Z}}}\,}}^4} {{\,\mathrm{{\mathbb {P}}}\,}}\Bigl (\Gamma (0,v)\in {\mathscr {A}}_0\Bigr ) \\&\preceq _{\delta ,p} \sum _{v\in {{\,\mathrm{{\mathbb {Z}}}\,}}^4} {{\,\mathrm{{\mathbb {P}}}\,}}\Bigl (\Gamma (0,v)\in {\mathscr {A}}_0(0) \text { and }\Gamma (v,0 \wedge v)\subseteq \Lambda (0 \wedge v,n^{1/2}(\log n)^{1/6+\delta }) \Bigr ) + n^2 (\log n)^{-p} \\&= \sum _{v\in {{\,\mathrm{{\mathbb {Z}}}\,}}^4} {{\,\mathrm{{\mathbb {P}}}\,}}\Bigl (\Gamma (v,0)\in {\mathscr {A}}_0(v) \text { and }\Gamma (0,0 \wedge v)\subseteq \Lambda (0 \wedge v,n^{1/2}(\log n)^{1/6+\delta }) \Bigr ) + n^2 (\log n)^{-p} \\&= \sum _{v\in {{\,\mathrm{{\mathbb {Z}}}\,}}^4} {{\,\mathrm{{\mathbb {P}}}\,}}\Bigl (\Gamma (0,v)\in {\mathscr {A}}_0^\prime \text { and }\Gamma (0,0 \wedge v)\subseteq \Lambda (0 \wedge v,n^{1/2}(\log n)^{1/6+\delta }) \Bigr ) + n^2 (\log n)^{-p} \end{aligned}$$
(13)

for every \(0<\delta \le 1\), \(p\ge 1\), and \(n \ge 2\), where the second equality follows by an application of the mass-transport principle to exchange the roles of 0 and v, and the third equality follows by (12). Applying (11) a second time with \({\mathscr {A}}= {\mathscr {A}}_0^\prime \) then yields that

$$\begin{aligned} {\mathbb {E}}\,\big \vert \{x\in {{\,\mathrm{{\mathbb {Z}}}\,}}^4:\Gamma (0,x)\in {\mathscr {A}}_0\} \big \vert \preceq _{\delta ,p}&\sum _{v\in {{\,\mathrm{{\mathbb {Z}}}\,}}^4} {{\,\mathrm{{\mathbb {P}}}\,}}\Bigl (\Gamma (0,v)\in {\mathscr {A}}_0^\prime \\&\quad \text {and } \Gamma (0,0 \wedge v),\, \Gamma (v,0 \wedge v)\subseteq \Lambda (0 \wedge v,n^{1/2}(\log n)^{1/6+\delta }) \Bigr ) + n^2 (\log n)^{-p}, \end{aligned}$$

and hence, applying the mass-transport principle a second time, we get

$$\begin{aligned} {\mathbb {E}}\,\big \vert \{x\in {{\,\mathrm{{\mathbb {Z}}}\,}}^4:\Gamma (0,x)\in {\mathscr {A}}_0\} \big \vert&\preceq _{\delta ,p} \sum _{v\in {{\,\mathrm{{\mathbb {Z}}}\,}}^4} {{\,\mathrm{{\mathbb {P}}}\,}}\Bigl (\Gamma (0,v)\in {\mathscr {A}}_0 \text { and } \Gamma (0,0 \wedge v),\, \Gamma (v,0 \wedge v)\subseteq \Lambda (0 \wedge v,n^{1/2}(\log n)^{1/6+\delta }) \Bigr ) + n^2 (\log n)^{-p}\\&\preceq _{\delta } \sum _{v\in {{\,\mathrm{{\mathbb {Z}}}\,}}^4} {{\,\mathrm{{\mathbb {P}}}\,}}\Bigl (\Gamma (0,v)\in {\mathscr {A}}_0 \text { and } \Gamma (0,v)\subseteq \Lambda (2n^{1/2}(\log n)^{1/6+\delta }) \Bigr ) + n^2 (\log n)^{-p}. \end{aligned}$$

If n is sufficiently large that \((\log n)^\delta >2\) then the first term is zero and the claim follows.

It remains to prove (11). Fix \(0<\delta \le 1\), \(p\ge 1\) and \(n,r \ge 2\). Let \(\eta \) be the future of the origin in \({\mathfrak {T}}\) and write \({{\,\mathrm{{\mathbb {P}}}\,}}^\eta \) and \({\mathbb {E}}^\eta \) for probabilities and expectations taken with respect to the conditional law of \({\mathfrak {T}}\) given \(\eta \). Let \({\mathcal {I}}=\{i\in \{0,\ldots ,n\}:\eta [0,i]\subseteq \Lambda (\eta _i,r)\}\), and for any \(x\in {{\,\mathrm{{\mathbb {Z}}}\,}}^4\) and \(i\ge 0\) define the restriction \({\mathscr {A}}\vert _{x,\eta }^i\) to be the set of finite simple paths

$$\begin{aligned} {\mathscr {A}}\vert _{x,\eta }^i = \{\gamma :\gamma _0=\eta _i,\, \gamma [1,|\gamma |]\cap \eta =\emptyset ,\,\gamma _{|\gamma |}=x,\,\eta [0,i]\oplus \gamma [1,|\gamma |]\in {\mathscr {A}}\}, \end{aligned}$$

where for any two finite paths \(\gamma \), \(\gamma ^\prime \), we have \((\gamma _0,\ldots ,\gamma _{|\gamma |})\oplus (\gamma _0^\prime ,\ldots ,\gamma _{|\gamma ^\prime |} ^\prime )=(\gamma _0,\ldots ,\gamma _{|\gamma |},\gamma _0^\prime ,\ldots ,\gamma _{|\gamma ^\prime |}^\prime )\). In other words, \({\mathscr {A}}\vert _{x,\eta }^i\) is the set of simple paths (including paths of just a single vertex) beginning at \(\eta _i\), avoiding the other points of \(\eta \), and which when concatenated to \(\eta [0,i-1]\) yield a path in \({\mathscr {A}}\) ending at x.

For each \(v\in {{\,\mathrm{{\mathbb {Z}}}\,}}^4\) we can sample from the conditional distribution of the path in \({\mathfrak {T}}\) connecting v to \(\eta \) using Wilson’s algorithm by starting a random walk X at v and loop erasing it when it first hits \(\eta \). When sampling the path in this manner we have that the event \(\{\Gamma (0,v)\in {\mathscr {A}}\) and \(\Gamma (0,0\wedge v) \subseteq \Lambda (0\wedge v, r)\}\) occurs if and only if the union of disjoint events

$$\begin{aligned} \bigcup _{i\in {\mathcal {I}}}\{\tau _i<\tau _{i}^c,\,\textrm{LE}(X^{\tau _i})^\leftarrow \in {\mathscr {A}}\vert ^i_{v,\eta }\} \end{aligned}$$

occurs, where we write \(\tau _i\) for the hitting time of \(\eta _i\) and write \(\tau _{i}^c\) for the hitting time of \(\eta \setminus \{\eta _i\}\), so that

$$\begin{aligned} \sum _{v\in {{\,\mathrm{{\mathbb {Z}}}\,}}^4} {{\,\mathrm{{\mathbb {P}}}\,}}^\eta \Bigl (\Gamma (0,v)\in {\mathscr {A}} \text { and }\Gamma (0,0 \wedge v)\subseteq \Lambda (0 \wedge v,r)\Bigr ) = \sum _{i\in {\mathcal {I}}}\sum _{v\in {{\,\mathrm{{\mathbb {Z}}}\,}}^4} {\textbf{P}}_v\bigl (\tau _i<\tau _{i}^c,\,\textrm{LE}(X^{\tau _i})^\leftarrow \in {\mathscr {A}}\vert ^i_{v,\eta }\bigr ). \end{aligned}$$
(14)

We remark that the probabilities on the right hand side of (14) are themselves random variables, since \(\tau _i\), \(\tau _i^c\) and \({\mathscr {A}}\vert ^i_{v,\eta }\) depend on \(\eta \). (The law of the simple random walk \({\textbf{P}}_v\) does not depend on \(\eta \) and, since there is no risk that \({\textbf{P}}\) could be mistaken for an expectation over the UST, we have chosen to make this dependence implicit.)

Temporarily fixing \(i\in {\mathcal {I}}\), we analyze the inner summation on the right hand side of (14) using the union bound

$$\begin{aligned} \sum _{v\in {{\,\mathrm{{\mathbb {Z}}}\,}}^4} {\textbf{P}}_v\bigl (\tau _i<\tau _{i}^c,\,\textrm{LE}(X^{\tau _i})^\leftarrow \in {\mathscr {A}}\vert ^i_{v,\eta }\bigr )&\le \sum _{v\in {{\,\mathrm{{\mathbb {Z}}}\,}}^4} {\textbf{P}}_v\bigl (\tau _i<\tau _{i}^c,\,\textrm{LE}(X^{\tau _i})^\leftarrow \in {\mathscr {A}}\vert ^i_{v,\eta },\, \textrm{LE}(X^{\tau _i})\ \delta \text {-good}\bigr ) \\&\quad +\sum _{v\in {{\,\mathrm{{\mathbb {Z}}}\,}}^4} {\textbf{P}}_v\bigl (\tau _i<\tau _{i}^c,\,\textrm{LE}(X^{\tau _i})^\leftarrow \in {\mathscr {A}}\vert ^i_{v,\eta },\, \textrm{LE}(X^{\tau _i})\ \delta \text {-bad}\bigr ). \end{aligned}$$
(15)

If \(\textrm{LE}(X^{\tau _i})\) is \(\delta \)-good then we have by (9) and (10) that \(T_i:=T_{\eta _i,\eta \setminus \{\eta _i\}}(\textrm{LE}(X^{\tau _i})) \le C_1 n (\log n)^{1/3+\delta }\) for some universal constant \(C_1\), and hence that

$$\begin{aligned}&{\textbf{P}}_v(\tau _i<\tau _{i}^c,\, \textrm{LE}(X^{\tau _i})^\leftarrow \in {\mathscr {A}}\vert ^i_{v,\eta },\, \textrm{LE}(X^{\tau _i}) \,\, \delta \text {-good}) \nonumber \\&\quad \le {\textbf{P}}_v(\tau _i<\tau _{i}^c,\, \textrm{LE}(X^{\tau _i})^\leftarrow \in {\mathscr {A}}\vert ^i_{v,\eta }, \,T_i \le C_1 n(\log n)^{1/3+\delta }) \nonumber \\&\quad \le {\textbf{P}}_v\big (\tau _i<\tau _{i}^c,\, \textrm{LE}(X^{\tau _i})^\leftarrow \in {\mathscr {A}}\vert ^i_{v,\eta },\, |T_{i}-\tau _{i}|\ge \lambda n\big ) \nonumber \\&\quad +{\textbf{P}}_v\big (\tau _i<\tau _{i}^c,\, \textrm{LE}(X^{\tau _i})^\leftarrow \in {\mathscr {A}}\vert ^i_{v,\eta },\, \tau _i\le C_1 n(\log n)^{1/3+\delta }+\lambda n\big ) \end{aligned}$$
(16)

for every \(\lambda >0\). The first term on the right hand side of (16) is bounded above by \(C_2\lambda ^{-1} {\textbf{P}}_v(\tau _i<\tau _{i}^c,\, \textrm{LE}(X^{\tau _i})^\leftarrow \in {\mathscr {A}}\vert ^i_{v,\eta })\) for some universal constant \(C_2\) by Lemma 2.6, so that taking \(\lambda =2C_2\), substituting (16) into (15) and rearranging yields that

$$\begin{aligned} \sum _{v\in {{\,\mathrm{{\mathbb {Z}}}\,}}^4} {\textbf{P}}_v\bigl (\tau _i<\tau _{i}^c,\,\textrm{LE}(X^{\tau _i})^\leftarrow \in {\mathscr {A}}\vert ^i_{v,\eta }\bigr )&\le 2 \sum _{v\in {{\,\mathrm{{\mathbb {Z}}}\,}}^4} {\textbf{P}}_v\bigl (\tau _i<\tau _{i}^c,\,\textrm{LE}(X^{\tau _i})^\leftarrow \in {\mathscr {A}}\vert ^i_{v,\eta },\, \tau _i \le C_3 n (\log n)^{1/3+\delta }\bigr ) \\&\quad +2\sum _{v\in {{\,\mathrm{{\mathbb {Z}}}\,}}^4} {\textbf{P}}_v\bigl (\tau _i<\tau _{i}^c,\,\textrm{LE}(X^{\tau _i})^\leftarrow \in {\mathscr {A}}\vert ^i_{v,\eta },\, \textrm{LE}(X^{\tau _i})\ \delta \text {-bad}\bigr ), \end{aligned}$$
(17)

where \(C_3=C_3(\delta )\) has been chosen so that \(C_1 n (\log n)^{1/3+\delta } +2C_2 n\le C_3 n(\log n)^{1/3+\delta }\) for every \(n\ge 2\).

We next bound the second term on the right hand side of (17). Since the typical time of a length n path is always \(O(n \log n)\), it follows by the same argument used to derive (17) from (16) that there exists a constant \(C_4\) such that

$$\begin{aligned} \sum _{v\in {{\,\mathrm{{\mathbb {Z}}}\,}}^4} {\textbf{P}}_v\bigl (\tau _i<\tau _{i}^c,\,\textrm{LE}(X^{\tau _i})^\leftarrow \in {\mathscr {A}}\vert ^i_{v,\eta },\, \textrm{LE}(X^{\tau _i})\ \delta \text {-bad}\bigr ) \le 2\sum _{v\in {{\,\mathrm{{\mathbb {Z}}}\,}}^4} {\textbf{P}}_v\bigl (\tau _i<\tau _{i}^c,\,\textrm{LE}(X^{\tau _i})^\leftarrow \in {\mathscr {A}}\vert ^i_{v,\eta },\, \tau _i \le C_4 n \log n\bigr ). \end{aligned}$$

Thus, taking a union bound over the possible values of \(\tau _i\), we have that

$$\begin{aligned}&\sum _{v\in {{\,\mathrm{{\mathbb {Z}}}\,}}^4} {\textbf{P}}_v(\tau _i<\tau _{i}^c,\,\textrm{LE}(X^{\tau _i})^\leftarrow \in {\mathscr {A}}\vert ^i_{v,\eta },\, \textrm{LE}(X^{\tau _i}) \,\, \delta \text {-bad}) \\&\quad \le 2 \sum _{v\in {{\,\mathrm{{\mathbb {Z}}}\,}}^4}\sum _{k=0}^{\lceil C_4n\log n\rceil } {\textbf{P}}_v(X_k=\eta _i, \textrm{LE}(X^k) \,\, \delta \text {-bad}) \end{aligned}$$

and hence by symmetry that

$$\begin{aligned} \sum _{v\in {{\,\mathrm{{\mathbb {Z}}}\,}}^4} {\textbf{P}}_v\bigl (\tau _i<\tau _{i}^c,\,\textrm{LE}(X^{\tau _i})^\leftarrow \in {\mathscr {A}}\vert ^i_{v,\eta },\, \textrm{LE}(X^{\tau _i})\ \delta \text {-bad}\bigr )&\le 2 \sum _{v\in {{\,\mathrm{{\mathbb {Z}}}\,}}^4} \sum _{k=0}^{\lceil C_4n\log n\rceil } {\textbf{P}}_{\eta _i}\bigl (X_k=v,\ \textrm{LE}(X^k)\ \delta \text {-bad}\bigr ) \\&=2\sum _{k=0}^{\lceil C_4n\log n\rceil } {\textbf{P}}_{\eta _i}\bigl (\textrm{LE}(X^k)\ \delta \text {-bad}\bigr ) \preceq _{\delta ,p} n(\log n)^{1-p} \end{aligned}$$
(18)

for every \(n\ge 2\).

Next, we consider the first term on the right hand side of (17). We write \(B=\{\tau _i<\tau _{i}^c,\,\textrm{LE}(X^{\tau _i})^\leftarrow \in {\mathscr {A}}\vert ^i_{v,\eta },\, \tau _i \le C_3n (\log n)^{1/3+\delta }\}\) and wish to estimate \(\sum _{v\in {{\,\mathrm{{\mathbb {Z}}}\,}}^4} {\textbf{P}}_v(B)\). To do this, we split the event B according to how far the walk travels before hitting \(\eta _i\), yielding the union bound

$$\begin{aligned} {\textbf{P}}_v(B)&\le {\textbf{P}}_v\Bigl (B,\ \sup _{0\le m \le \tau _i} \left\Vert X_m-\eta _i\right\Vert \ge n^{1/2} (\log n)^{1/6+\delta }\Bigr ) \\&\quad +{\textbf{P}}_v\Bigl (B,\ \sup _{0\le m \le \tau _i} \left\Vert X_m-\eta _i\right\Vert < n^{1/2} (\log n)^{1/6+\delta }\Bigr ). \end{aligned}$$
(19)

For the first of these terms, we bound

$$\begin{aligned} {\textbf{P}}_v\Bigl (B,\ \sup _{m\le \tau _i} \left\Vert X_m-\eta _i\right\Vert \ge n^{1/2} (\log n)^{1/6+\delta }\Bigr ) \le \sum _{k=0}^{\lceil C_3n(\log n)^{1/3+\delta }\rceil }{\textbf{P}}_v\Bigl (X_k=\eta _i,\ \sup _{m\le k}\left\Vert X_m-\eta _i\right\Vert \ge n^{1/2} (\log n)^{1/6+\delta }\Bigr ). \end{aligned}$$

Summing over v and using time-reversal gives that

$$\begin{aligned}&\sum _{v\in {{\,\mathrm{{\mathbb {Z}}}\,}}^4} {\textbf{P}}_v\big (B, \sup _{m\le \tau _i} \left\Vert X_m-\eta _i\right\Vert \ge n^{1/2} (\log n)^{1/6+\delta }\big )\nonumber \\&\quad \le \sum _{k=0}^{\lceil C_3n(\log n)^{1/3+\delta }\rceil }{\textbf{P}}_{\eta _i}\Biggl (\sup _{m\le k}\left\Vert X_m-\eta _i\right\Vert \ge n^{1/2} (\log n)^{1/6+\delta }\Biggr ) \nonumber \\&\quad \preceq n (\log n)^{1/3+\delta } {\textbf{P}}_0 \Biggl (\sup _{m\le \lceil C_3n(\log n)^{1/3+\delta }\rceil }\left\Vert X_m\right\Vert \ge n^{1/2} (\log n)^{1/6+\delta }\Biggr ) \nonumber \\&\quad \preceq n (\log n)^{1/3+\delta } e^{-c_1 (\log n)^{\delta }} \preceq _{\delta ,p} n (\log n)^{-p} \end{aligned}$$
(20)

for some constant \(c_1>0\), where the first inequality in the last line follows by e.g. the maximal version of Azuma–Hoeffding [61, Section 2].
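Concretely, the bound is used here in the following form (a standard consequence of Doob's maximal inequality and Azuma–Hoeffding; constants not optimised): each coordinate of X is a martingale with increments bounded by 1, so a union bound over the four coordinates gives

$$\begin{aligned} {\textbf{P}}_0\Bigl (\sup _{m\le k}\left\Vert X_m\right\Vert \ge \lambda \Bigr ) \le 8\exp \left( -\frac{\lambda ^2}{2k}\right) , \end{aligned}$$

and taking \(k=\lceil C_3n(\log n)^{1/3+\delta }\rceil \) and \(\lambda =n^{1/2}(\log n)^{1/6+\delta }\) makes the exponent of order \((\log n)^{\delta }\), which is the source of the \(e^{-c_1(\log n)^{\delta }}\) term above.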

Substituting the estimates (18) and (20) into (17) in light of (19) yields that there exists a constant \(C_{\delta ,p}\) such that

$$\begin{aligned} \sum _{v\in {{\,\mathrm{{\mathbb {Z}}}\,}}^4} {\textbf{P}}_v\bigl (\tau _i<\tau _{i}^c,\,\textrm{LE}(X^{\tau _i})^\leftarrow \in {\mathscr {A}}\vert ^i_{v,\eta }\bigr )&\preceq \sum _{v\in {{\,\mathrm{{\mathbb {Z}}}\,}}^4} {\textbf{P}}_v\Bigl (\tau _i<\tau _{i}^c,\,\textrm{LE}(X^{\tau _i})^\leftarrow \in {\mathscr {A}}\vert ^i_{v,\eta },\, \sup _{m\le \tau _i} \left\Vert X_m-\eta _i\right\Vert \le n^{1/2} (\log n)^{1/6+\delta }\Bigr ) \\&\quad +C_{\delta ,p}\, n(\log n)^{-p}. \end{aligned}$$
(21)

Now \(\textrm{LE}(X^{\tau _i})\subseteq (X_m)_{m\le \tau _i}\), and so applying Wilson’s algorithm, we have

$$\begin{aligned}&{\textbf{P}}_v\Bigl (\tau _i<\tau _{i}^c,\,\textrm{LE}(X^{\tau _i})^\leftarrow \in {\mathscr {A}}\vert ^i_{v,\eta },\, \sup _{m\le \tau _i} \left\Vert X_m-\eta _i\right\Vert \le n^{1/2} (\log n)^{1/6+\delta }\Bigr )\\&\quad \preceq {\mathbb {P}}^\eta \Bigl (0\wedge v=\eta _i,\ \Gamma (0\wedge v,v)\in {\mathscr {A}}\vert ^i_{v,\eta },\ \text {and } \Gamma (v,0 \wedge v)\subseteq \Lambda \bigl (0\wedge v,n^{1/2}(\log n)^{1/6+\delta }\bigr )\Bigr ). \end{aligned}$$

Substituting this inequality into (21) and summing over \(i\in {\mathcal {I}}\) yields

$$\begin{aligned}{} & {} \sum _{i\in {\mathcal {I}}}\sum _{v\in {{\,\mathrm{{\mathbb {Z}}}\,}}^4} {\textbf{P}}_v(\tau _i<\tau _{i}^c,\,\textrm{LE}(X^{\tau _i})^\leftarrow \in {\mathscr {A}}\vert ^i_{v,\eta }) \preceq \\{} & {} \quad \sum _{v\in {{\,\mathrm{{\mathbb {Z}}}\,}}^4} {\mathbb {P}}^\eta (\Gamma (0,v)\in {\mathscr {A}},\, \Gamma (0,0 \wedge v)\subseteq \Lambda (r)\\{} & {} \quad \text { and }\Gamma (v,0 \wedge v)\subseteq \Lambda (0\wedge v,n^{1/2}(\log n)^{1/6+\delta }))+C_{\delta ,p} n^2(\log n)^{-p}, \end{aligned}$$

since \(|{\mathcal {I}}|\le n+1\). Substituting this inequality into (14) and taking expectations over \(\eta \) yields the claimed inequality (11). \(\square \)
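Since Wilson's algorithm and chronological loop-erasure drive essentially every computation in this section, we include a minimal computational sketch for concreteness. It samples the (rooted, oriented) uniform spanning tree of the discrete torus \(({{\,\mathrm{{\mathbb {Z}}}\,}}/L{{\,\mathrm{{\mathbb {Z}}}\,}})^4\), which here merely stands in for \({{\,\mathrm{{\mathbb {Z}}}\,}}^4\) with the root playing the role of the point at infinity; the names loop_erase and wilson_ust are ours, and nothing in the proofs depends on this sketch.

import random

def loop_erase(path):
    # chronological loop-erasure LE(path): erase cycles as they are created
    erased, seen = [], set()
    for v in path:
        if v in seen:
            while erased[-1] != v:  # pop back to the first visit of v
                seen.discard(erased.pop())
        else:
            seen.add(v)
            erased.append(v)
    return erased

def wilson_ust(vertices, neighbours, root, rng=random):
    # Wilson's algorithm: repeatedly run a random walk from an unvisited
    # vertex until it hits the current tree, then attach the loop-erasure
    # of the walk as a new branch, oriented towards the root.
    in_tree, parent = {root}, {}
    for start in vertices:
        if start in in_tree:
            continue
        walk = [start]
        while walk[-1] not in in_tree:
            walk.append(rng.choice(neighbours(walk[-1])))
        branch = loop_erase(walk)
        for u, w in zip(branch, branch[1:]):
            parent[u] = w
            in_tree.add(u)
    return parent  # parent[v] = next vertex on the tree path from v to root

# toy instance: the discrete torus (Z/LZ)^4 standing in for Z^4
L = 5

def neighbours(x):
    return [tuple((x[i] + s) % L if i == j else x[i] for i in range(4))
            for j in range(4) for s in (1, -1)]

vertices = [(a, b, c, d) for a in range(L) for b in range(L)
            for c in range(L) for d in range(L)]
tree = wilson_ust(vertices, neighbours, root=(0, 0, 0, 0))
assert len(tree) == L ** 4 - 1  # every non-root vertex has a parent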

Containment of balls. We now turn our attention to the proof of Proposition 2.3. We begin by showing that it is very unlikely for \({\mathfrak {T}}\) to include a crossing of an annulus that is shorter than it should be by a large (i.e. non-sharp) polylogarithmic factor. We write \(\partial \Lambda (r)\) for the set of vertices \(x\in {{\,\mathrm{{\mathbb {Z}}}\,}}^4\) with \(\Vert x\Vert _\infty =r\).

Lemma 2.8

Let \({\mathfrak {T}}\) be the uniform spanning tree of \({{\,\mathrm{{\mathbb {Z}}}\,}}^4\) and for each \(r,n\ge 1\) let \({\mathscr {E}}(r,n)\) be the event that there exists a path in \({\mathfrak {T}}\) from \(\partial \Lambda (r)\) to \(\partial \Lambda (4r)\) that has length at most n. Then

$$\begin{aligned} {\mathbb {P}}\left( {\mathscr {E}}\left( r,\lceil r^2(\log r)^{-3}\rceil \right) \right) = \exp \left[ -\Omega ((\log r)^2)\right] \end{aligned}$$

as \(r\rightarrow \infty \).

Proof of Lemma 2.8

Fix \(r\ge 2\), let \(n=\lceil r^2(\log r)^{-3}\rceil \), and write \({\mathscr {E}}={\mathscr {E}}(r,n)\). If \({\mathscr {E}}\) holds, there must exist a pair of points \(x\in \partial \Lambda (r)\) and \(y\in \partial \Lambda (4r)\) such that the path connecting x and y in \({\mathfrak {T}}\) is contained in the box \(\Lambda (4r)\) and has length at most n. Considering separately the case that \(x\wedge y\) belongs to \(\Lambda (2r)\) or not yields the union bound

$$\begin{aligned}&{\mathbb {P}}({\mathscr {E}})\le \sum _{y\in \partial \Lambda (4r)} \sum _{z\in \Lambda (2r)} {\mathbb {P}}\left( z \in \Gamma (y,\infty ), |\Gamma (y,z)|\le n\right) \\&\quad + \sum _{x\in \partial \Lambda (r)} \sum _{z\in \Lambda (4r)\setminus \Lambda (2r)}{\mathbb {P}}\left( z \in \Gamma (x,\infty ), |\Gamma (x,z)|\le n\right) , \end{aligned}$$

and using Wilson’s algorithm to convert this into a loop-erased random walk quantity yields that

$$\begin{aligned} {\mathbb {P}}({\mathscr {E}})&\le \sum _{y\in \partial \Lambda (4r)} \sum _{z\in \Lambda (2r)} \sum _{k=0}^n {\textbf{P}}_y\left( \textrm{LE}(X)_k=z\right) + \sum _{x\in \partial \Lambda (r)} \sum _{z\in \Lambda (4r)\setminus \Lambda (2r)}\sum _{k=0}^n{\textbf{P}}_x\left( \textrm{LE}(X)_k=z\right) \nonumber \\&= \sum _{y\in \partial \Lambda (4r)} \sum _{k=0}^n {\textbf{P}}_y\left( \textrm{LE}(X)_k\in \Lambda (2r)\right) + \sum _{x\in \partial \Lambda (r)} \sum _{k=0}^n{\textbf{P}}_x\left( \textrm{LE}(X)_k\in \Lambda (4r)\setminus \Lambda (2r) \right) \nonumber \\&\preceq r^3 n {\textbf{P}}_0\left( \max _{0\le k\le n}\Vert \textrm{LE}(X)_k\Vert _\infty \ge r \right) . \end{aligned}$$
(22)

We will bound this probability using the weak \(L^1\) method as introduced in [37, Section 6.2], which can be thought of as a simple special case of the typical time theory. Conditional on the loop-erased random walk \(\textrm{LE}(X)\), we have as in [36, Lemma 5.3] that the random variables \((\ell _{i+1}(X)-\ell _i(X))_{i\ge 0}\) are independent and satisfy

$$\begin{aligned} {\textbf{P}}_0(\ell _{i+1}(X)-\ell _i(X) = m \mid \textrm{LE}(X)) \le p_{m-1}(0,0) \preceq \frac{1}{m^2} \end{aligned}$$

for every \(m\ge 1\), and it follows from Vershynin’s weak triangle inequality for the weak \(L^1\) norm [68] as explained in [37, Section 6.2] that

$$\begin{aligned} {\textbf{P}}_0(\ell _{n}(X) \ge m \mid \textrm{LE}(X)) \preceq \frac{n \log n}{m} \end{aligned}$$

for every \(n\ge 2\) and \(m\ge 1\). As such, there exists a constant C such that

$$\begin{aligned} {\textbf{P}}_0\left( \max _{0\le k\le n}\Vert \textrm{LE}(X)_k\Vert _\infty \ge r \right)&\le 2 {\textbf{P}}_0\left( \max _{0\le k\le n}\Vert \textrm{LE}(X)_k\Vert _\infty \ge r,\, \ell _n(X) \le Cn\log n \right) \\&\le 2 {\textbf{P}}_0\left( \max _{0\le i\le Cn\log n}\Vert X_i\Vert _\infty \ge r \right) \\ {}&\preceq \exp \left[ -\Omega \left( \frac{r^2}{n \log n}\right) \right] \preceq \exp \left[ -\Omega \left( (\log r)^2\right) \right] , \end{aligned}$$

where we have used the maximal version of Azuma–Hoeffding in the last line [61, Section 2]. The claim follows by substituting this estimate into (22) and using that \(r^3n =r^{O(1)}=\exp [o((\log r)^2)]\). \(\square \)
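To spell out the final arithmetic: with \(n=\lceil r^2(\log r)^{-3}\rceil \) we have \(\log n\asymp \log r\), so that

$$\begin{aligned} \frac{r^2}{n\log n}\asymp \frac{r^2(\log r)^{3}}{r^{2}\log r}=(\log r)^{2}, \end{aligned}$$

which is where the bound \(\exp [-\Omega ((\log r)^2)]\) comes from.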

Before proceeding with the deduction of Proposition 2.3 from Proposition 2.4 and Lemma 2.8, we will first introduce some more tools from [36, 37].

We begin by defining a variant of the uniform spanning tree known as the 0-wired uniform spanning forest, which was first introduced by Járai and Redig [42] as part of their work on the Abelian sandpile model. Let \((V_n)_{n\ge 0}\) be an exhaustion of \({{\,\mathrm{{\mathbb {Z}}}\,}}^4\) by finite connected sets. For each \(n\ge 0\), let \(G_n^{*}\) be the graph obtained by identifying (a.k.a. wiring) \({{\,\mathrm{{\mathbb {Z}}}\,}}^4\setminus V_n\) into a single point denoted by \(\partial _n\). Let \(G_n^{*0}\) be the graph obtained by identifying 0 with \(\partial _n\) in \(G_n^*\). The 0-wired uniform spanning forest is then the weak limit of the uniform spanning trees on \(G_n^{*0}\) as \(n\rightarrow \infty \), which is well-defined and does not depend on the choice of exhaustion [58, §3]. Lyons, Morris and Schramm [58] proved that the component of the origin in the 0-wired forest is finite almost surely; since the entire 0-wired forest is stochastically dominated by the uniform spanning tree [59, Theorem 4.6], and the definitions ensure that every component other than that of the origin is infinite, the remaining vertices of \({{\,\mathrm{{\mathbb {Z}}}\,}}^4\) are contained in a single infinite one-ended component almost surely.

The stochastic domination property. We let \({\mathfrak {T}}_0\) be the component of 0 in the 0-wired UST. Lyons, Morris and Schramm [58, Proposition 3.1] proved that \({\mathfrak {T}}_0\) stochastically dominates \({\mathfrak {P}}(0)\), which we recall denotes the past of the origin in \({\mathfrak {T}}\). In [36], a stronger version of this stochastic domination property was derived, the relevant parts of which we restate below in our context. Since the UST of \({{\,\mathrm{{\mathbb {Z}}}\,}}^4\) is connected and one-ended, there is a unique way to orient the edges of \({\mathfrak {T}}\) so that every vertex has exactly one oriented edge emanating from it. By abuse of notation, we denote the resulting oriented tree by \({\mathfrak {T}}\) as we do in the unoriented case. The oriented 0-wired spanning forest \({\mathfrak {F}}_0\) is defined similarly, but with the edges of the finite component all oriented towards the origin. Lastly, we generalise the notion of the past: given an arbitrary oriented forest F, we define the past of a vertex \(v\in F\), denoted \(\textrm{past}_F(v)\), to be the set of vertices u such that there is a directed path in F from u to v.

Lemma 2.9

(Stochastic domination) Let \({\mathfrak {T}}\) be the oriented uniform spanning tree of \({{\,\mathrm{{\mathbb {Z}}}\,}}^4\), and let \({\mathfrak {F}}_0\) be the oriented 0-wired uniform spanning forest of \({{\,\mathrm{{\mathbb {Z}}}\,}}^4\). Let K be a finite set of vertices in \({{\,\mathrm{{\mathbb {Z}}}\,}}^4\) and let \(\Gamma (K)=\cup _{u\in K} \Gamma (u,\infty )\). Then for every increasing event \({\mathscr {A}}\subseteq \{0,1\}^{E({{\,\mathrm{{\mathbb {Z}}}\,}}^4)}\) we have that

$$\begin{aligned} \begin{aligned} {{\,\mathrm {{\mathbb {P}}}\,}}\left( \text {past}_{{\mathfrak {T}}\setminus \Gamma (K)}(0)\in {\mathscr {A}}\mid \Gamma (K)\right) \le {{\,\mathrm {{\mathbb {P}}}\,}}\big ({\mathfrak {T}}_0\in {\mathscr {A}}\big ). \end{aligned}\end{aligned}$$

We will also utilize the following result of [37]. For any subset A of \({{\,\mathrm{{\mathbb {Z}}}\,}}^4\) containing the origin, let \(\mathrm {{\text{ rad }}_{\text {ext}}}(A)\) be the maximal \(\ell ^1\) distance between the origin and a vertex of A.

Theorem 2.10

([37], Theorem 1.6) Let \({\mathfrak {T}}_0\) be the component of the origin in the 0-wired uniform spanning tree of \({{\,\mathrm{{\mathbb {Z}}}\,}}^4\). Then

$$\begin{aligned} {{\,\mathrm{{\mathbb {P}}}\,}}\Big ({{\,\mathrm{{\text {rad}}_{\textrm{ext}}}\,}}({\mathfrak {T}}_0)\ge n\Big )\asymp \frac{(\log n)^{1+o(1)}}{n^2} \end{aligned}$$

for every \(n\ge 2\).

Remark 4

For the proof of Proposition 2.3 it would suffice to have the weaker bound in which \((\log n)^{1+o(1)}\) is replaced by \((\log n)^{O(1)}\), which is significantly easier to prove. (That is, it can be proven by the high-dimensional methods of [36] without needing a careful analysis of the four-dimensional case.)

With these tools in hand we proceed to the proof of Proposition 2.3.

Proof of Proposition 2.3

Fix \(\delta \in (0,1]\), and fix an integer \(n\ge 2\). Let \(\eta \) be the future of the origin in the uniform spanning tree \({\mathfrak {T}}\) and let \(r=\lceil n^{1/2} (\log n)^{1/6+\delta }\rceil \). We write

$$\begin{aligned} \{{\mathfrak {B}}(n) \nsubseteq \Lambda (8r)\} \subseteq {\mathscr {F}}\cup {\mathscr {E}} \cup {\mathscr {A}}, \end{aligned}$$

where \({\mathscr {F}}=\{\eta [0,n]\nsubseteq \Lambda (r)\}\) is the event that the first n steps of the future are not contained in the box of radius r, \({\mathscr {E}}={\mathscr {E}}(r,\lceil r^2(\log r)^{-3}\rceil )\) is the event defined in Lemma 2.8, and \({\mathscr {A}}\) is the event \(\{{\mathfrak {B}}(n) \nsubseteq \Lambda (8r)\}\setminus ({\mathscr {F}}\cup {\mathscr {E}})\). We have already shown in Lemma 2.8 that the probability of \({\mathscr {E}}\) is much smaller than required for n sufficiently large. For the event \({\mathscr {F}}\), we use Wilson's algorithm to compute that

$$\begin{aligned} {{\,\mathrm{{\mathbb {P}}}\,}}\big ({\mathscr {F}}\big )&={{\,\mathrm{{\textbf{P}}}\,}}_0\big (\textrm{LE}(X)^n\nsubseteq \Lambda (r)\big )\\&\le {{\,\mathrm{{\textbf{P}}}\,}}_0\big (\ell _n>2n(\log n)^{1/3})+{{\,\mathrm{{\textbf{P}}}\,}}_0\left( \max _{0\le k \le 2n(\log n)^{1/3}} \Vert X_k\Vert _\infty > r\right) \\&\preceq \frac{\log \log n}{(\log n)^{2/3}} + \exp \left[ -\Omega ((\log n)^{\delta })\right] \preceq _\delta \frac{\log \log n}{(\log n)^{2/3}} \end{aligned}$$

as required, where the second inequality follows by Theorem 2.1 for the bound on \(\ell _n\), and e.g. the maximal version of Azuma–Hoeffding [61, Section 2] for the bound on the displacement of the simple random walk.

We now bound the probability of \({\mathscr {A}}\). Observe that if \({\mathscr {A}}\) holds then there exists an integer \(0\le i\le n-1\) such that \({\mathfrak {P}}(\eta _i,n)\) is not contained in \(\Lambda (8r)\). Since \({\mathscr {E}}\) does not hold, we must also have that every crossing of the annulus \(\Lambda (4r)\setminus \Lambda (r)\) has length at least \(r^2/(\log r)^3\), and it follows that there must exist a collection of at least \(r^2/(\log r)^3\) points \(y\in ({\mathfrak {B}}(n)\setminus \eta [0,n]) \cap (\Lambda (4r)\setminus \Lambda (r))\) such that \({\mathfrak {P}}(y,n)\) has extrinsic diameter at least 4r. Summing over all such points, applying Markov's inequality, and using the stochastic domination lemma (Lemma 2.9) yields that

$$\begin{aligned} {\mathbb {P}}({\mathscr {A}})&\le \frac{(\log r)^3}{r^2}\sum _{y\in \Lambda (4r)\setminus \Lambda (r)}{{\,\mathrm{{\mathbb {P}}}\,}}(y\in {\mathfrak {B}}(n)\setminus \eta [0,n] \text { and } \textrm{diam}({\mathfrak {P}}(y))\ge 4r)\\&\le \frac{(\log r)^3}{r^2}\sum _{y\in \Lambda (4r)\setminus \Lambda (r)} {{\,\mathrm{{\mathbb {P}}}\,}}\big (y\in {\mathfrak {B}}(n)) {{\,\mathrm{{\mathbb {P}}}\,}}\big ({{\,\mathrm{{\text {rad}}_{\textrm{ext}}}\,}}({\mathfrak {T}}_0) \ge 2r\big ) \\ {}&=\frac{(\log r)^3}{r^2}{\mathbb {E}}\ \big \vert \{y\in {\mathfrak {B}}(n):y\notin \Lambda (r)\}\big \vert {{\,\mathrm{{\mathbb {P}}}\,}}\big ({{\,\mathrm{{\text {rad}}_{\textrm{ext}}}\,}}({\mathfrak {T}}_0) \ge 2r\big ), \end{aligned}$$

and it follows from Proposition 2.4 and Theorem 2.10 that

$$\begin{aligned} \begin{aligned} {\mathbb {P}}({\mathscr {A}}) \preceq _{\delta ,p} \frac{(\log r)^3}{r^2} \frac{n^2}{(\log n)^p}\cdot \frac{(\log r)^{1+o(1)}}{r^2} \preceq (\log n)^{10/3-4\delta -p+o(1)}, \end{aligned} \end{aligned}$$

for every \(p\ge 1\). Taking \(p=10\), say, yields a bound that is stronger than required and completes the proof. \(\square \)
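To spell out the exponent bookkeeping in the penultimate display: since \(r=\lceil n^{1/2} (\log n)^{1/6+\delta }\rceil \) we have \(\log r\asymp \log n\) and \(r^4\asymp n^2(\log n)^{2/3+4\delta }\), so that

$$\begin{aligned} \frac{(\log r)^{3}}{r^2}\cdot \frac{n^2}{(\log n)^{p}}\cdot \frac{(\log r)^{1+o(1)}}{r^2}\asymp \frac{n^{2}(\log n)^{4+o(1)}}{n^{2}(\log n)^{2/3+4\delta +p}}=(\log n)^{10/3-4\delta -p+o(1)}. \end{aligned}$$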

2.2 Lower bounds

In this section we prove the following proposition, which implies the lower bounds of Theorem 1.1. Note that, in contrast to Proposition 2.2, we do not lose any \((\log n)^{\pm o(1)}\) factors in this bound.

Proposition 2.11

Let \({\mathfrak {T}}\) be the uniform spanning tree of \({{\,\mathrm{{\mathbb {Z}}}\,}}^4\). Then

$$\begin{aligned} |{\mathfrak {B}}(n)| = {\varvec{\Omega }}\left( \frac{n^2}{(\log n)^{1/3}}\right) \end{aligned}$$

as \(n\rightarrow \infty \).

Remark 5

The proof yields the explicit lower tail bound

$$\begin{aligned} {\mathbb {P}}\left( |{\mathfrak {B}}(n)| \le \frac{n^2}{\lambda (\log n)^{1/3}}\right) \preceq \lambda ^{-1/5} \end{aligned}$$

for every \(n\ge 3\) and \(1\le \lambda \le \log n\). Presumably this bound is far from optimal.

We will prove this proposition by estimating the mean and variance of certain random variables that lower bound \(|{\mathfrak {B}}(n)|\). We expect \(|{\mathfrak {B}}(n)|\) to be unconcentrated, so its variance should be of the same order as its second moment and applying Chebyshev directly to \(|{\mathfrak {B}}(n)|\) should not be a viable method to prove lower tail bounds. Instead we calculate the mean and variance of a certain ‘good’ portion of the uniform spanning tree within a certain radius of the spine. We choose this radius according to how deep into the lower tail of the volume we wish to control: the lower we take this radius, the deeper into the tail we bound. The meaning of ‘good’ we use is engineered precisely to make the later parts of the proof go through cleanly.

Our first task is to set up the relevant definitions. Recall that \({\textbf{P}}_z\) denotes the law of a simple random walk X on \({{\,\mathrm{{\mathbb {Z}}}\,}}^4\) started at z for each \(z\in {{\,\mathrm{{\mathbb {Z}}}\,}}^4\). [37, Theorem 7.4] states that if \({{\,\mathrm{{\textbf{P}}}\,}}_{0,\Lambda (r)}\) denotes the joint law of two independent random walks X and Y started at 0 and at a uniform point of \(\Lambda (r)\) respectively, then

$$\begin{aligned} {{\,\mathrm{{\textbf{P}}}\,}}_{0,\Lambda (r)}(X \cap Y \cap \Lambda (r) \ne \emptyset ) \asymp \frac{1}{\log r} \end{aligned}$$
(23)

for \(r\ge 2\). Fix \(\alpha >0\) and \(r\ge 2\). We say a path \(\gamma \) in \({{\,\mathrm{{\mathbb {Z}}}\,}}^4\) is \((\alpha ,r)\)-good if

$$\begin{aligned} \sum _{z\in \Lambda (\gamma _0,6r)}{\textbf{P}}_z\big (\text {hit }\gamma \cap \Lambda (\gamma _0,6r)\big )\le \alpha \frac{r^4}{\log r}, \end{aligned}$$

and say that \(\gamma \) is \((\alpha ,r)\)-bad otherwise. We note that

$$\begin{aligned} {{\,\mathrm{{\textbf{P}}}\,}}_0(X\ \text {is }(\alpha ,r)\text {-bad})= {\textbf{P}}_{0,\Lambda (6r)}\left( |\Lambda (6r)|\, {\textbf{P}}_{0,\Lambda (6r)}\left( X \cap Y \cap \Lambda (6r) \ne \emptyset \mid X \right) > \alpha \frac{r^4}{\log r}\right) \preceq \alpha ^{-1} \end{aligned}$$
(24)

by (23) and Markov’s inequality. Crucially, we also observe that being \((\alpha ,r)\)-bad is an increasing property of a path in the sense that if \(\gamma \) and \({\tilde{\gamma }}\) are two paths satisfying \(\gamma _0={\tilde{\gamma }}_0\) and \(\gamma \subseteq {\tilde{\gamma }}\), then \({\tilde{\gamma }}\) is \((\alpha ,r)\)-bad whenever \(\gamma \) is \((\alpha ,r)\)-bad. We will apply this to bound the probability that a loop-erased random walk is bad in terms of the probability that the corresponding simple random walk is bad.

Condition on the future of the origin \(\eta :=\Gamma (0,\infty )\) in the uniform spanning tree \({\mathfrak {T}}\) and for each \(x\in {{\,\mathrm{{\mathbb {Z}}}\,}}^4\) and \(r\ge 3\) consider the random set

$$\begin{aligned}{} & {} M_{\alpha }(x,r) = \left\{ y\in \Lambda (x,3r):\Gamma (y,0\wedge y)\subseteq \Lambda (x,3r),\, |\Gamma (y,0\wedge y)|\le \frac{r^2}{(\log r)^{1/3}},\right. \\{} & {} \quad \left. \text { and } \Gamma (y,0\wedge y) \text { is }(\alpha ,r)\text {-good}\right\} . \end{aligned}$$

The key step in the proof of Proposition 2.11 is to bound the conditional mean and variance of \(|M_\alpha (x,r)|\) in terms of the capacity of \(\eta \). Here we recall that the capacity (a.k.a. conductance to infinity) of a set \(A \subseteq {{\,\mathrm{{\mathbb {Z}}}\,}}^4\) is defined to be

$$\begin{aligned} \textrm{Cap}(A)= & {} \sum _{a\in A} \deg (a){\textbf{P}}_a(\text {never return to }A\text { after time zero}) \\= & {} 8\sum _{a\in A} {\textbf{P}}_a(\text {never return to }A\text { after time zero}). \end{aligned}$$

The two relevant estimates are as follows, where we write \(\textrm{Var}^\eta \) for the conditional variance given \(\eta \):

Proposition 2.12

There exist \(\alpha _0>0\) and \(r_0>0\) such that if \(\alpha \ge \alpha _0\) then

$$\begin{aligned} {\mathbb {E}}^\eta |M_\alpha (x,r)|\succeq r^{2}\textrm{Cap}(\eta \cap \Lambda (x,r)) \end{aligned}$$

for every \(x\in {{\,\mathrm{{\mathbb {Z}}}\,}}^4\) and every \(r\ge r_0\).

Proposition 2.13

For each \(\alpha >0\) we have

$$\begin{aligned} \textrm{Var}^\eta (|M_\alpha (x,r)|)\preceq \alpha \frac{r^6}{\log r}\textrm{Cap}(\eta \cap \Lambda (x,3r)) \end{aligned}$$

for every \(x\in {{\,\mathrm{{\mathbb {Z}}}\,}}^4\) and every \(r\ge 2\).

We will require the following variational formula for the capacity proved in [38, Lemma 2.3]. Recall that the Green’s function on \({{\,\mathrm{{\mathbb {Z}}}\,}}^4\) is defined by

$$\begin{aligned} G(x,y) = \frac{1}{\deg y}{\textbf{E}}_x\sum _{n\ge 0}\mathbb {1}(X_n=y)=\frac{1}{8}{\textbf{E}}_x\sum _{n\ge 0}\mathbb {1}(X_n=y), \end{aligned}$$

where X is a simple random walk on \({{\,\mathrm{{\mathbb {Z}}}\,}}^4\) and \(x,y\in {{\,\mathrm{{\mathbb {Z}}}\,}}^4\).

Lemma 2.14

The capacity of a set \(S\subset {{\,\mathrm{{\mathbb {Z}}}\,}}^4\) can be expressed as

$$\begin{aligned} \textrm{Cap}(S)^{-1} = \inf \left\{ \sum _{u,v\in S} G(u,v)\mu (u)\mu (v):\mu \text { is a probability measure on }S\right\} .\nonumber \\ \end{aligned}$$
(25)
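As a quick sanity check of (25), take \(S=\{0\}\): the only probability measure on S is the point mass at the origin, so the lemma asserts that \(\textrm{Cap}(\{0\})=G(0,0)^{-1}\). This agrees with the definition of the capacity given above, since the number of visits to the origin is geometric with success probability \({\textbf{P}}_0(\text {never return to }0\text { after time zero})\), so that

$$\begin{aligned} \textrm{Cap}(\{0\})=8\,{\textbf{P}}_0(\text {never return to }0\text { after time zero})=\frac{8}{{\textbf{E}}_0\sum _{n\ge 0}\mathbb {1}(X_n=0)}=\frac{1}{G(0,0)}. \end{aligned}$$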

Proof of Proposition 2.12

Fix \(x\in {{\,\mathrm{{\mathbb {Z}}}\,}}^4\), \(r\ge 1\) and \(\alpha >0\). We assume that \(\textrm{Cap}(\eta \cap \Lambda (x,r))>0\) or else the proposition is trivial. We let \(n=\lfloor r^2(\log r)^{-1/3}\rfloor \) and \(N=\lfloor \lambda r^2\rfloor \) where \(\lambda \in (0,1/2)\) is a parameter that will later be taken to be a small constant. Let V be a uniform random element of \(\Lambda (x,3r)\), let \(X=(X_m)_{m\ge 0}\) be a random walk started at V, and let \({\textbf{P}}\) denote the joint law of V and X. Let \(\sigma \) be the time at which \(X^{N}\) hits \(\eta \cap \Lambda (x,3r)\) and let \(\tau \) be the time \(X^{N}\) first exits \(\Lambda (x,3r)\). Each of these stopping times is defined to be infinite if the relevant event does not occur before or at time N. We let \(\mu \) be a measure which minimises the right hand side of (25) when \(S=\eta \cap \Lambda (x,r)\) and define the random variable

$$\begin{aligned} A_r=\mathbb {1}(\sigma <\tau ,|\textrm{LE}(X^\sigma )|\le n,\textrm{LE}(X^\sigma ) \text { good})\sum _{w\in \eta \cap \Lambda (x,r)} \sum _{j=0}^{N} \mu (w)\mathbb {1}(X_j=w), \end{aligned}$$

where, to save on notation, here and in what follows we abbreviate \((\alpha ,r)\)-good and \((\alpha ,r)\)-bad to good and bad respectively. The weight \(\mu \) is included in the definition of \(A_r\) since it makes the second moment of \(A_r\) easier to control; this is closely related to the theory of Martin capacity as developed in [19]. An application of Wilson’s algorithm implies that

$$\begin{aligned} {\mathbb {E}}^\eta |M_\alpha (x,r)|\ge \sum _{v\in \Lambda (x,3r)}{{\,\mathrm{{\textbf{P}}}\,}}(A_r>0 \mid V=v)=|\Lambda (x,3r)|{{\,\mathrm{{\textbf{P}}}\,}}(A_r>0), \end{aligned}$$

so that to prove the proposition we need only demonstrate that there exist \(\alpha _0,r_0>0\) such that

$$\begin{aligned} {{\,\mathrm{{\textbf{P}}}\,}}(A_r>0)\succeq r^{-2}\textrm{Cap}(\eta \cap \Lambda (x,r)) \end{aligned}$$

for every \(\alpha \ge \alpha _0,r\ge r_0\), where we emphasize that the constant implied by the \(\succeq \) on the right hand side is independent of \(\eta \) and r. We do so by proving that

$$\begin{aligned}{} & {} {\textbf{E}}{A_r}\succeq r^{-2} \end{aligned}$$
(26)
$$\begin{aligned}{} & {} {\textbf{E}}{A_r^2}\preceq r^{-2}\textrm{Cap}^{-1}(\eta \cap \Lambda (x,r)) \end{aligned}$$
(27)

for appropriately large \(\alpha ,r\) and an appropriately small constant value of \(\lambda \); once (26) and (27) are established the claim will follow since, by Cauchy–Schwarz,

$$\begin{aligned} {\mathbb {E}}^\eta |M_\alpha (x,r)|\succeq r^4{{\,\mathrm{{\textbf{P}}}\,}}(A_r>0)\succeq r^4 \frac{{\textbf{E}}[A_r]^2}{{\textbf{E}}[A_r^2]}\succeq r^2 \textrm{Cap}(\eta \cap \Lambda (x,r)) \end{aligned}$$

as claimed.

We begin by lower bounding the expectation of \(A_r\). We decompose \(A_r\) as \(A_r=E_r-D_r-C_r-B_r\), where

$$\begin{aligned} B_r&=\mathbb {1}(\sigma \ge \tau )\sum _{w\in \eta \cap \Lambda (x,r)} \sum _{j=0}^{N} \mu (w)\mathbb {1}(X_j=w),\\ C_r&=\mathbb {1}(\sigma<\tau ,|\textrm{LE}(X^\sigma )|> n)\sum _{w\in \eta \cap \Lambda (x,r)} \sum _{j=0}^{N} \mu (w)\mathbb {1}(X_j=w),\\ D_r&=\mathbb {1}(\sigma <\tau ,|\textrm{LE}(X^\sigma )|\le n,\textrm{LE}(X^\sigma ) \text { bad})\sum _{w\in \eta \cap \Lambda (x,r)} \sum _{j=0}^{N} \mu (w)\mathbb {1}(X_j=w), \qquad \text { and }\\ E_r&=\sum _{w\in \eta \cap \Lambda (x,r)} \sum _{j=0}^{N} \mu (w)\mathbb {1}(X_j=w). \end{aligned}$$

The random variable \(E_r\) is the \(\mu \)-mass of the intersections of the random walk with the relevant part of \(\eta \), i.e. \(\eta \cap \Lambda (x,r)\). From \(E_r\), we have subtracted the error term \(B_r\) pertaining to the possibility that the walk exits the ball \(\Lambda (x,3r)\) before hitting the relevant part of \(\eta \); the term \(C_r\) pertaining to the possibility that the walk hits the relevant part of \(\eta \) before exiting this ball, but has too long a loop erasure; and finally the term \(D_r\) pertaining to the possibility that the walk hits the relevant part of \(\eta \) before exiting this ball and has a suitably short loop erasure, but the loop erasure is bad, as defined above.

Lower bounding the expectation of \(E_r\): First, we lower bound the expectation of \(E_r\). We have by time-reversal that

$$\begin{aligned} {\textbf{E}}{[E_r]}&\ge \frac{1}{|\Lambda (x,3r)|} \sum _{w\in \eta \cap \Lambda (x,r)} \mu (w) \sum _{j=0}^{N} \sum _{v\in \Lambda (x,3r)}{{\,\mathrm{{\textbf{P}}}\,}}_v(X_j=w)\nonumber \\&\quad \succeq r^{-4} \sum _{w\in \eta \cap \Lambda (x,r)} \mu (w) \sum _{j=0}^{N}{{\,\mathrm{{\textbf{P}}}\,}}_w(X_j\in \Lambda (x,3r))\nonumber \\&\quad \succeq r^{-4} \sum _{j=0}^{N}{{\,\mathrm{{\textbf{P}}}\,}}_0(X_j\in \Lambda (0,2r))\succeq r^{-4} \sum _{j=0}^{N}\left( 1 -\frac{j}{4r^2}\right) \nonumber \\&\quad \ge r^{-4}N\left( 1-\frac{N}{4r^2}\right) \succeq \lambda r^{-2}(1-\lambda /4)\succeq \lambda r^{-2}, \end{aligned}$$
(28)

where the third inequality follows since \(\sum _{w\in \eta \cap \Lambda (x,r)}\mu (w)=1\) and \(\Lambda (w,2r)\subset \Lambda (x,3r)\) for \(w\in \Lambda (x,r)\), the fourth inequality follows by e.g. the central limit theorem for the simple random walk, and the penultimate inequality holds if \(r>1/\lambda \) (which is just the condition we need to avoid rounding N down to zero).
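Alternatively, the fourth inequality can be verified without appealing to the central limit theorem: the increments of the walk are orthogonal unit steps, so \({\textbf{E}}_0\Vert X_j\Vert _2^2=j\), and Markov's inequality gives

$$\begin{aligned} {{\,\mathrm{{\textbf{P}}}\,}}_0\big (X_j\notin \Lambda (0,2r)\big )\le {{\,\mathrm{{\textbf{P}}}\,}}_0\big (\Vert X_j\Vert _2^2> 4r^2\big )\le \frac{j}{4r^2}. \end{aligned}$$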

Upper bounding the expectation of \(B_r\): Next, we upper bound the expectation of \(B_r\), which pertains to the possibility that the walk exits the ball \(\Lambda (x,3r)\) before hitting the relevant part of \(\eta \). We have

$$\begin{aligned} {\textbf{E}}[B_r]&= \frac{1}{|\Lambda (x,3r)|} \sum _{v\in \Lambda (x,3r)}\sum _{w\in \eta \cap \Lambda (x,r)} \sum _{j=0}^{N} \mu (w){{\,\mathrm{{\textbf{P}}}\,}}_v(X_j=w,\sigma \ge \tau )\\&\le \frac{1}{|\Lambda (x,3r)|} \sum _{v\in \Lambda (x,3r)}\sum _{w\in \eta \cap \Lambda (x,r)} \sum _{j=0}^{N} \mu (w){{\,\mathrm{{\textbf{P}}}\,}}_v(X_j=w,\tau \le j)\\&\preceq r^{-4} \sum _{v\in \Lambda (x,3r)}\sum _{w\in \eta \cap \Lambda (x,r)} \sum _{j=0}^{N} \mu (w){{\,\mathrm{{\textbf{P}}}\,}}_w(X_j=v,\tau \le j)\\&\preceq r^{-4}N \sum _{w\in \eta \cap \Lambda (x,r)} \mu (w) {{\,\mathrm{{\textbf{P}}}\,}}_w(\tau \le N)\preceq \lambda r^{-2} {\textbf{P}}_0\left( \sup _{0\le i\le N} \left\Vert X_i\right\Vert _\infty \ge 2r\right) , \end{aligned}$$

where the second inequality follows by time reversal of X, and the final inequality holds because the distance between any \(w\in \eta \cap \Lambda (x,r)\) and \(\partial \Lambda (x,3r)\) is greater than or equal to 2r. Since \({\textbf{E}}_0[\sup _{j\le i}\left\Vert X_j\right\Vert ^2]\preceq i\) for \(i\ge 0\), it follows by Markov's inequality that

$$\begin{aligned} {\textbf{E}}[B_r]\preceq \lambda r^{-2} \frac{N}{r^2} \preceq \lambda ^2 r^{-2}. \end{aligned}$$
(29)

Upper bounding the expectation of \(D_r\). We now upper bound the expectation of \(D_r\), which pertains to the possibility that the walk hits the relevant part of \(\eta \) before exiting this ball and has a suitably short loop erasure, but the loop erasure is bad. Observe that

$$\begin{aligned} {\textbf{E}}[D_r]&\le {\textbf{E}}\left[ \mathbb {1}(\textrm{LE}(X^\sigma )\text { bad})\sum _{w\in \eta \cap \Lambda (x,r)} \sum _{j=0}^{N} \mu (w)\mathbb {1}(X_j=w)\right] \\&\le {\textbf{E}}\left[ \sum _{w\in \eta \cap \Lambda (x,r)} \sum _{j=0}^{N} \mu (w)\mathbb {1}(X\text { bad}, X_j=w)\right] \\&\preceq r^{-4}\sum _{w\in \eta \cap \Lambda (x,r)} \sum _{j=0}^{N} \mu (w)\sum _{v\in \Lambda (x,3r)} {{\,\mathrm{{\textbf{P}}}\,}}_v(X\text { bad}, X_j=w)\\&\le r^{-4}\sum _{w\in \eta \cap \Lambda (x,r)} \sum _{j=0}^{N} \mu (w)\sum _v {{\,\mathrm{{\textbf{P}}}\,}}_0(X\text { bad}, X_j=w-v)\\&= r^{-4}\sum _{j=0}^N {{\,\mathrm{{\textbf{P}}}\,}}_0(X \text { bad}) \preceq Nr^{-4}{{\,\mathrm{{\textbf{P}}}\,}}_0(X\text { bad})\preceq \alpha ^{-1}Nr^{-4}\preceq \alpha ^{-1}\lambda r^{-2}, \end{aligned}$$

where the second inequality follows as \(\textrm{LE}(X^\sigma )\subseteq X\), the fourth inequality follows by translation-invariance, and the penultimate inequality follows by (24). Combining this inequality with (29) and (28), we can see that there exist positive constants \(\alpha _0\) and \(\lambda _0\) such that if \(\alpha \ge \alpha _0\), \(\lambda =\lambda _0\), and \(r\ge 1/\lambda _0\) then

$$\begin{aligned} {\textbf{E}}[{E_r-D_r-B_r}]\succeq r^{-2}. \end{aligned}$$

Thus, to complete the proof of (26), it is sufficient to show that \({\textbf{E}}{[C_r]}=o(r^{-2})\).

Upper bounding the expectation of \(C_r\): To bound the final term \(C_r\), which pertains to the possibility that the walk hits the relevant part of \(\eta \) before exiting the ball but has too long a loop erasure, we will need some understanding of the cut times of a simple random walk. Recall that a time \(t\ge 0\) is said to be a cut time, or loop-free time, of the random walk X if X[0, t] and \(X(t,\infty )\) are disjoint. We observe that if \(0 \le s \le t\) are cut times of X then the loop-erasure of X is equal to the concatenation of the loop-erasures of the portions of X before s, between s and t, and after t; this property allows us to decorrelate different parts of the loop-erased random walk. We use the following estimate of Lawler which demonstrates that the random walk on \({{\,\mathrm{{\mathbb {Z}}}\,}}^4\) has a reasonably good supply of cut times.

Lemma 2.15

([54], Lemma 7.7.4) Let X be simple random walk on \({{\,\mathrm{{\mathbb {Z}}}\,}}^4\). Then

$$\begin{aligned} {{\,\mathrm{{\textbf{P}}}\,}}(\text {there are no cut times between times }n\text { and }m) \preceq \frac{\log \log m}{\log m} \end{aligned}$$

for every \(3\le n \le m\) such that \(|n-m| \ge m/(\log m)^6\).

Observe that if \(|\textrm{LE}(X^\sigma )|>n\) then we must have that \(\sigma >n\), and that if X has a cut time in \([\sigma -n/4,\sigma ]\), then \(|\textrm{LE}(X^j)|\ge 3n/4\) for every \(j\ge \sigma \): indeed, if \(t\in [\sigma -n/4,\sigma ]\) is a cut time then \(\textrm{LE}(X^j)\) contains \(\textrm{LE}(X^t)\) as a prefix for every \(j\ge t\), and \(|\textrm{LE}(X^t)|\ge |\textrm{LE}(X^\sigma )|-(\sigma -t)> 3n/4\). Therefore,

$$\begin{aligned} C_r \le \sum _{w\in \eta \cap \Lambda (x,r)} \sum _{j=0}^{N} \mu (w)\mathbb {1}\Big (X_j=w,|\textrm{LE}(X^\sigma )|> n,n<\sigma \le N\wedge j\Big )\le C_r^\prime +C_r^{\prime \prime },\nonumber \\ \end{aligned}$$
(30)

where

$$\begin{aligned} \begin{aligned} C_r^\prime&=\sum _{w\in \eta \cap \Lambda (x,r)} \sum _{j=n+1}^{N} \mu (w)\mathbb {1}\Big (X_j=w,X \text{ has } \text{ no } \text{ cut } \text{ time } \text{ in } [\sigma -n/4,\sigma ],n<\sigma \le N\wedge j\Big )\qquad \text{ and }\\ C_r^{\prime \prime }&= \sum _{w\in \eta \cap \Lambda (x,r)} \sum _{j=n+1}^{N} \mu (w)\mathbb {1}\Big (X_j=w,|\text {LE}(X^j)|> \frac{3}{4}n\Big ). \end{aligned} \end{aligned}$$

We show that the expectation conditioned on \(\eta \) of both \(C_r^\prime \) and \(C_r^{\prime \prime }\) is \(o(r^{-2})\); we begin with the latter. We have

$$\begin{aligned} {\textbf{E}} C_r^{\prime \prime }&\le \frac{1}{|\Lambda (x,3r)|}\sum _{w\in \eta \cap \Lambda (x,r)}\mu (w)\sum _{j=n}^N\sum _{v\in \Lambda (x,3r)} {{\,\mathrm{{\textbf{P}}}\,}}_{v}\left( X_j=w,|\textrm{LE}(X^j)|>\frac{3}{4}n\right) \\&= \frac{1}{|\Lambda (x,3r)|}\sum _{w\in \eta \cap \Lambda (x,r)}\mu (w)\sum _{j=n}^N\sum _{v\in \Lambda (x,3r)} {{\,\mathrm{{\textbf{P}}}\,}}_{0}\left( X_j=w-v,|\textrm{LE}(X^j)|>\frac{3}{4}n\right) \\&\le \frac{1}{|\Lambda (x,3r)|}\sum _{w\in \eta \cap \Lambda (x,r)}\mu (w)\sum _{j=n}^N {{\,\mathrm{{\textbf{P}}}\,}}_{0}\left( |\textrm{LE}(X^j)|>\frac{3}{4}n\right) \preceq r^{-4}\sum _{j=n}^N {{\,\mathrm{{\textbf{P}}}\,}}_0\left( |\textrm{LE}(X^j)|>\frac{3}{4}n\right) , \end{aligned}$$

where we used translation invariance in the second line. Observe for each \(n\le i\le N\) that if X has a cut time in \([i-i/(\log i)^6,i]\), then \(|\textrm{LE}(X^i)|\le |\textrm{LE}_\infty (X^i)| +i/(\log i)^6\). Therefore,

$$\begin{aligned} r^4{\textbf{E}} C_r^{\prime \prime }&\preceq \sum _{i=n}^{N} {{\,\mathrm{{\textbf{P}}}\,}}_0\big (|\textrm{LE}(X^i)|> \frac{3}{4}n\big ) \nonumber \\&\quad \le \sum _{i=n}^N {{\,\mathrm{{\textbf{P}}}\,}}_0(|\textrm{LE}_\infty (X^i)|> (3/4)n-i/(\log i)^6 )\nonumber \\&\quad +{{\,\mathrm{{\textbf{P}}}\,}}_0\big (X\text { has no cut times in }[i-i/(\log i)^6,i]\big )\nonumber \\&\quad \preceq \sum _{i=n}^N {{\,\mathrm{{\textbf{P}}}\,}}_0(\rho _i> (3/4)n-i/(\log i)^6)+c\frac{\log \log i}{\log i}\nonumber \\&\quad \preceq N\frac{\log \log n}{\log n}+ \sum _{i=n}^N \frac{\log \log i}{(\log i)^{2/3} }\preceq N\frac{\log \log n}{(\log n)^{2/3}}=o(r^2) \end{aligned}$$
(31)

as required, where the third inequality follows by Lemma 2.15 and the fourth inequality follows from Theorem 2.1 and the fact that \(\lambda <1/2\). Next, we upper bound the conditional expectation of \(C_r^\prime \). Recalling the definitions \(N=\lfloor \lambda r^2\rfloor \) for some \(\lambda \in (0,1/2)\) and \(n=\lfloor r^2(\log r)^{-1/3}\rfloor \), we can calculate that \(N\le n(\log n)^{1/3}\) for all \(r\ge 2\). Define the sequence of times \(T_k=\lceil (1+k/8)n\rceil \) for \(k\ge 0\), and observe that for r larger than some universal constant, if \(n\le \sigma \le N\) and X has no cut time in \([\sigma -n/4,\sigma ]\), then X has no cut times in at least one of the intervals belonging to the family \(\{[T_k-T_k/(\log T_k)^6,T_k]:0\le k\le 8\lceil (\log n)^{1/3}\rceil \}\). Therefore, for r larger than some universal constant, we have that

$$\begin{aligned}{} & {} C_r^\prime \le \sum _{k=0}^{8\lceil (\log n)^{1/3}\rceil } \sum _{w\in \eta \cap \Lambda (x,r)} \sum _{j=n+1}^{N} \mu (w)\nonumber \\{} & {} \qquad \qquad \qquad \quad \mathbb {1}\Big (X_j=w,X\text { has no cut time in }[T_k-T_k/(\log T_k)^6,T_k]\Big ). \end{aligned}$$
(32)

We also have by symmetry that

$$\begin{aligned}{} & {} {\textbf{P}}_x\Big (X_j=y,X\text { has no cut time in }[T_k-T_k/(\log T_k)^6,T_k]\Big ) \\{} & {} \quad = {\textbf{P}}_y\Big (X_j=x,X\text { has no cut time in }[T_k-T_k/(\log T_k)^6,T_k]\Big ) \end{aligned}$$

for each \(x,y\in {{\,\mathrm{{\mathbb {Z}}}\,}}^4\) and \(j\ge 0\), so that for r larger than some universal constant

$$\begin{aligned} {\textbf{E}} C_r^\prime\le & {} \frac{N}{|\Lambda (x,3r)|}\sum _{k=0}^{8\lceil (\log n)^{1/3}\rceil } {{\,\mathrm{{\textbf{P}}}\,}}_0\Big (X\text { has no cut time in }[T_k-T_k/(\log T_k)^6,T_k]\Big )\nonumber \\{} & {} \preceq \lambda r^{-2}\sum _{k=0}^{8\lceil (\log n)^{1/3}\rceil } \frac{\log \log T_k}{\log T_k}\preceq \lambda r^{-2}(\log n)^{1/3}\frac{\log \log n}{\log n}=o(r^{-2}), \end{aligned}$$
(33)

where the second inequality follows from Lemma 2.15. We have now shown (26), and so to complete the proof we must show (27), which upper bounds the second moment of \(A_r\).

Upper bounding the second moment of \(A_r\). It is at this stage of the proof that we benefit from defining \(A_r\) in terms of the measure \(\mu \). Indeed, we can use the Markov property to compute that

$$\begin{aligned} {\textbf{E}}{A_r^2}&\le {\textbf{E}}\left[ \left( \sum _{i\ge 0}\sum _{ w\in \eta \cap \Lambda (x,r)}\mu (w)\mathbb {1}\big (X_i=w\big )\right) ^2\right] \\ {}&\le 2{\textbf{E}}\left[ \sum _{i\ge 0}\sum _{w,z\in \eta \cap \Lambda (x,r)}\mu (w)\mu (z)\mathbb {1}\big (X_i=w\big )\sum _{j\ge i}\mathbb {1}\big (X_j=z\big )\right] \\&\asymp 2{\textbf{E}}\left[ \sum _{i\ge 0}\sum _{ w,z\in \eta \cap \Lambda (x,r)}\mu (w)\mu (z)G(w,z)\mathbb {1}\big (X_i=w\big )\right] \\&=\frac{2}{|\Lambda (x,3r)|}\sum _{w,z\in \eta \cap \Lambda (x,r)}\mu (w)\mu (z)G(w,z)\sum _{i\ge 0}\sum _{v\in \Lambda (x,3r)}{\textbf{P}}_v\big (X_i=w\big ), \end{aligned}$$

and hence by time-reversal that

$$\begin{aligned} {\textbf{E}}{A_r^2}&\preceq r^{-4} \sum _{w,z\in \eta \cap \Lambda (x,r)}\mu (w)\mu (z)G(w,z)\sum _{i\ge 0}{\textbf{P}}_w\big (X_i\in \Lambda (x,3r)\big )\\&\preceq r^{-2} \sum _{w,z\in \eta \cap \Lambda (x,r)}\mu (w)\mu (z)G(w,z)=r^{-2} \textrm{Cap}^{-1}(\eta \cap \Lambda (x,r)), \end{aligned}$$

where the final inequality follows since the random walk spends at most \(O(r^2)\) time in any ball of radius r in expectation (which follows from the Green’s function bound \(G(x,y)\preceq \left\Vert x-y\right\Vert _2^{-2}\) for \(x\ne y\)), and the final equality follows from the definition of \(\mu \). This concludes the proof of (27) and hence the proof of the proposition. \(\square \)
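To spell out the \(O(r^2)\) occupation bound used in the final step of this proof: for any \(w\in \Lambda (x,3r)\) we have \(\Lambda (x,3r)\subseteq \Lambda (w,6r)\), and since there are \(O(s^3)\) points of \({{\,\mathrm{{\mathbb {Z}}}\,}}^4\) at \(\ell ^\infty \) distance s from w,

$$\begin{aligned} \sum _{i\ge 0}{\textbf{P}}_w\big (X_i\in \Lambda (x,3r)\big )=8\sum _{v\in \Lambda (x,3r)}G(w,v)\preceq \sum _{v\in \Lambda (w,6r)}\frac{1}{(1\vee \left\Vert w-v\right\Vert _2)^{2}}\preceq \sum _{s=1}^{6r}\frac{s^{3}}{s^{2}}\asymp r^{2}. \end{aligned}$$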

We now turn to the proof of the variance estimate of Proposition 2.13. We will require the following lemma relating the capacity of a set S to the probability that a random walk, started at a uniform position in a ball containing S, hits S. The lemma will follow straightforwardly from [19, Theorem 2.2] and Lemma 2.14. We prove the result in all dimensions \(d\ge 3\) for completeness; the implicit constants may depend on d.

Lemma 2.16

Fix a dimension \(d\ge 3\), a radius \(r\ge 1\), and let \(S\subseteq \Lambda (r):=\{x\in {{\,\mathrm{{\mathbb {Z}}}\,}}^d:\left\Vert x\right\Vert _\infty \le r\}\). Let X be a simple random walk on \({{\,\mathrm{{\mathbb {Z}}}\,}}^d\). Then

$$\begin{aligned} \sum _{x\in \Lambda (r)}{{\,\mathrm{{\textbf{P}}}\,}}_x(X \ \textrm{hits}\ S)\asymp r^{2}\textrm{Cap}(S). \end{aligned}$$

Proof of Lemma 2.16

[19, Theorem 2.2] states that for any transient Markov chain \((X_n)_{n\ge 0}\) on a countable state space \(\Omega \) with initial state \(\rho \) and Green's function \(G(x,y)=\sum _{n\ge 0}{{\,\mathrm{{\textbf{P}}}\,}}_x(X_n=y)\), we have that

$$\begin{aligned} {{\,\mathrm{{\mathbb {P}}}\,}}_\rho (X\text { hits }S)\asymp \inf _{\mu } \left[ \sum _{x,y\in S}\mu (x) \frac{G(x,y)}{G(\rho ,y)}\mu (y)\right] ^{-1}, \end{aligned}$$

for any subset \(S\subseteq \Omega \), where the infimum on the right hand side is taken over probability measures on S. We would like to apply this result with X a simple random walk on the state space \({{\,\mathrm{{\mathbb {Z}}}\,}}^d\); however, we would like the walk to start at a random vertex. To achieve this, we attach a ‘ghost vertex’ to the state space from which the random walk will start. We set up the transition probabilities from the ghost vertex so that after one step, the walk's distribution on \({{\,\mathrm{{\mathbb {Z}}}\,}}^d\) is equal to the one we desire.

Define the set \({{\,\mathrm{{\mathbb {Z}}}\,}}^d_*={{\,\mathrm{{\mathbb {Z}}}\,}}^d\cup \{*\}\), where \(*\) is the additional ghost vertex, and define the Markov transition kernel p on the state space \({{\,\mathrm{{\mathbb {Z}}}\,}}^d_*\) by \(p(x,y)=\frac{1}{2d}\mathbb {1}(x\sim y)\) for \(x,y\in {{\,\mathrm{{\mathbb {Z}}}\,}}^d\) and \(p(*,z)=1/|\Lambda |\) for \(z\in \Lambda :=\Lambda (r)\). Note that a trajectory of this chain, which we will denote by X, is just a simple random walk on \({{\,\mathrm{{\mathbb {Z}}}\,}}^d\) when started in \({{\,\mathrm{{\mathbb {Z}}}\,}}^d\). We observe that

$$\begin{aligned} \frac{1}{|\Lambda |}\sum _{x\in \Lambda }{{\,\mathrm{{\textbf{P}}}\,}}_x(X \ \textrm{hits}\ S)={{\,\mathrm{{\textbf{P}}}\,}}_*(X \ \textrm{hits}\ S)\asymp \inf _{\mu } \left[ \sum _{x,y\in S}\mu (x) \frac{G(x,y)}{G(*,y)}\mu (y)\right] ^{-1}, \end{aligned}$$
(34)

for any subset \(S\subseteq \Lambda \). An integral comparison yields that

$$\begin{aligned} G(*,y)\asymp \frac{1}{|\Lambda |}\sum _{x\in \Lambda } \frac{1}{(1\vee \left\Vert x-y\right\Vert _\infty )^{d-2}}\asymp r^{2-d}, \end{aligned}$$

for \(y\in \Lambda \), and so

$$\begin{aligned} \inf _{\mu } \left[ \sum _{x,y\in S}\mu (x) \frac{G(x,y)}{G(*,y)}\mu (y)\right] ^{-1}\asymp r^{2-d}\inf _{\mu } \left[ \sum _{x,y\in S}\mu (x) G(x,y)\mu (y)\right] ^{-1}. \end{aligned}$$

Substituting this into (34) and applying Lemma 2.14, we get

$$\begin{aligned} \sum _{x\in \Lambda (r)}{{\,\mathrm{{\textbf{P}}}\,}}_x(X \ \textrm{hits}\ S)\asymp r^{2}\inf _{\mu } \left[ \sum _{x,y\in S}\mu (x) G(x,y)\mu (y)\right] ^{-1}\asymp r^{2}\textrm{Cap}(S) \end{aligned}$$

as claimed. (We do not have an exact equality on the right hand side because we are using a slightly different definition of the Green’s function than usual.) \(\square \)

Proof of Proposition 2.13

Given \(y,z\in {{\,\mathrm{{\mathbb {Z}}}\,}}^4\), let Y be a random walk started at y and let Z be an independent random walk started at z and write \({\textbf{P}}_{y,z}\) for the joint law of Y and Z. Let \(\sigma _1\) be the first time Y hits \(\eta \) and let \(\sigma _2\) be the first time Z hits \(\eta \cup \textrm{LE}(Y^{\sigma _1})\). We continue to write \(n=\lfloor r^2 (\log r)^{-1/3}\rfloor \) as in the previous proof. Abbreviating \(M=M_\alpha \), \(\Lambda =\Lambda (x,3r)\), we have by Wilson's algorithm that

$$\begin{aligned} \begin{aligned} {\mathbb {E}}^\eta [|M(x,r)|^2]&\le \sum _{{y,z}\in \Lambda }{{\,\mathrm{{\mathbb {P}}}\,}}^\eta \!\big (y,z\in M(x,r)\big )\\&\le \sum _{{y,z}\in \Lambda } {{\,\mathrm{{\textbf{P}}}\,}}_{y,z}(\sigma _1<\infty ,\sigma _2<\infty ,|\textrm{LE}(Y^{\sigma _1})|\le n,|\textrm{LE}(Z^{\sigma _2})|\le n,\\&\hspace{2cm}\textrm{LE}(Y^{\sigma _1})\subseteq \Lambda , \textrm{LE}(Z^{\sigma _2})\subseteq \Lambda ,\textrm{LE}(Y^{\sigma _1}),\textrm{LE}(Z^{\sigma _2}) \text { both good}). \end{aligned} \end{aligned}$$
(35)

Now, on the event that \(\sigma _1,\sigma _2<\infty \), let \(\sigma _3\) be the time Z first hits \(\textrm{LE}(Y^{\sigma _1})\) and let \(\sigma _4\) be the time Z first hits \(\eta \). We split according to whether \(\sigma _3\le \sigma _4\) or \(\sigma _4<\sigma _3\), beginning with the case \(\sigma _4<\sigma _3\). Observing that \(\sigma _2=\sigma _4\) on this event, we obtain

$$\begin{aligned}&\sum _{{y,z}\in \Lambda } {{\,\mathrm{{\textbf{P}}}\,}}_{y,z}(\sigma _1<\infty ,\,\sigma _2<\infty ,\,|\textrm{LE}(Y^{\sigma _1})|\le n,\,|\textrm{LE}(Z^{\sigma _2})|\le n, \nonumber \\&\quad \qquad \textrm{LE}(Y^{\sigma _1})\subseteq \Lambda ,\, \textrm{LE}(Z^{\sigma _2})\subseteq \Lambda ,\textrm{LE}(Y^{\sigma _1}),\,\textrm{LE}(Z^{\sigma _2}) \text { both good},\text { and }\sigma _4<\sigma _3) \nonumber \\&\quad \le \sum _{{y,z}\in \Lambda } {{\,\mathrm{{\textbf{P}}}\,}}_{y,z}(\sigma _1<\infty ,\,\sigma _4<\infty ,\,|\textrm{LE}(Y^{\sigma _1})|\le n,\,|\textrm{LE}(Z^{\sigma _4})|\le n, \nonumber \\&\quad \qquad \textrm{LE}(Y^{\sigma _1})\subseteq \Lambda ,\, \textrm{LE}(Z^{\sigma _4})\subseteq \Lambda , \text { and }\textrm{LE}(Y^{\sigma _1}),\textrm{LE}(Z^{\sigma _4}) \text { both good}) \nonumber \\&\quad =\sum _{{y,z}\in \Lambda } {{\,\mathrm{{\textbf{P}}}\,}}_{y}(\sigma _1<\infty ,\,|\textrm{LE}(Y^{\sigma _1})|\le n,\,\textrm{LE}(Y^{\sigma _1})\subseteq \Lambda , \text { and } \textrm{LE}(Y^{\sigma _1})\text { good}) \nonumber \\&\quad \qquad \cdot {{\,\mathrm{{\textbf{P}}}\,}}_{z}(\sigma _4<\infty ,\,|\textrm{LE}(Z^{\sigma _4})|\le n,\,\textrm{LE}(Z^{\sigma _4})\subseteq \Lambda ,\,\textrm{LE}(Z^{\sigma _4})\text { good}) \nonumber \\&\quad =\left[ \sum _{y\in \Lambda }{{\,\mathrm{{\textbf{P}}}\,}}_{y}(\sigma _1<\infty ,|\textrm{LE}(Y^{\sigma _1})|\le n ,\textrm{LE}(Y^{\sigma _1})\subseteq \Lambda ,\textrm{LE}(Y^{\sigma _1})\text { good})\right] ^2 ={\mathbb {E}}^\eta [|M(x,r)|]^2, \end{aligned}$$
(36)

where the first equality follows by independence of Y and Z conditional on \(\eta \), and the last follows by an application of Wilson’s algorithm. On the other hand, if \(\sigma _3\le \sigma _4\) then \(\sigma _2=\sigma _3\), and so we get

$$\begin{aligned}&\sum _{{y,z}\in \Lambda } {{\,\mathrm{{\textbf{P}}}\,}}_{y,z}(\sigma _1<\infty ,\sigma _2<\infty ,|\textrm{LE}(Y^{\sigma _1})|\le n,|\textrm{LE}(Z^{\sigma _2})|\le n, \nonumber \\&\quad \textrm{LE}(Y^{\sigma _1})\subseteq \Lambda , \textrm{LE}(Z^{\sigma _2})\subseteq \Lambda ,\textrm{LE}(Y^{\sigma _1}),\textrm{LE}(Z^{\sigma _2}) \text { both good},\sigma _3\le \sigma _4) \nonumber \\&\quad \le \sum _{y\in \Lambda } {\textbf{E}}_y\Bigg [\mathbb {1}(\sigma _1<\infty ,|\textrm{LE}(Y^{\sigma _1})|\le n,\textrm{LE}(Y^{\sigma _1})\subseteq \Lambda ,\textrm{LE}(Y^{\sigma _1}) \text { good}) \nonumber \\&\quad \sum _{z\in \Lambda } {{\,\mathrm{{\textbf{P}}}\,}}_{y,z}(\sigma _3<\infty \mid Y) \Bigg ] \nonumber \\&\quad \le \alpha \frac{r^4}{\log r} \sum _{y\in \Lambda }{{\,\mathrm{{\textbf{P}}}\,}}_{y}(\sigma _1<\infty ,|\textrm{LE}(Y^{\sigma _1})|\le n ,\textrm{LE}(Y^{\sigma _1})\subseteq \Lambda ,\textrm{LE}(Y^{\sigma _1})\text { good}) \nonumber \\&\quad = \alpha \frac{r^4}{\log r}{\mathbb {E}}^\eta |M(x,r)|, \end{aligned}$$
(37)

where the final inequality follows by the definition of ‘good’, and the final equality follows by an application of Wilson’s algorithm. Substituting (37) and (36) into (35) with a union bound yields

$$\begin{aligned} {\mathbb {E}}^\eta [|M(x,r)|^2]\le {\mathbb {E}}^\eta [|M(x,r)|]^2+\alpha \frac{r^4}{\log r}{\mathbb {E}}^\eta |M(x,r)| \end{aligned}$$

and hence that

$$\begin{aligned} {\text {Var}}^\eta (|M(x,r)|)\le \alpha \frac{r^4}{\log r}{\mathbb {E}}^\eta |M(x,r)|. \end{aligned}$$
(38)

Finally we upper bound \({\mathbb {E}}^\eta |M(x,r)|\). We have that

$$\begin{aligned} {\mathbb {E}}^\eta |M(x,r)|\le \sum _{y\in \Lambda } {{\,\mathrm{{\textbf{P}}}\,}}_y(X \text { hits } \eta \cap \Lambda ), \end{aligned}$$

so that applying Lemma 2.16 to the right hand side and plugging the resulting inequality into (38) concludes the proof. \(\square \)

Our next goal is to deduce Proposition 2.11 from Propositions 2.12 and 2.13. To proceed we will need the following result controlling the capacity of the first n steps of a loop-erased random walk, which follows easily from [37, Proposition 3.4].

Proposition 2.17

Let X be a random walk on \({{\,\mathrm{{\mathbb {Z}}}\,}}^4\) started at the origin. There exists a constant \(C>0\) such that we have

$$\begin{aligned} {{\,\mathrm{{\textbf{P}}}\,}}\left( \textrm{Cap}(\textrm{LE}(X)^n)\le \frac{Cn}{(\log n)^{2/3}}\right) \preceq \frac{\log \log n}{(\log n)^{2/3}}, \end{aligned}$$

for every \(n\ge 2\).

Proof

By [37, Proposition 3.4], we know that there exists a constant c such that

$$\begin{aligned} {{\,\mathrm{{\textbf{P}}}\,}}\left( \text {Cap}(\textrm{LE}_\infty (X^n))\le \frac{cn}{\log n}\right) \preceq \frac{1}{(\log n)^{2/3}}, \end{aligned}$$

for each \(n\ge 2\). Fix \(\epsilon \in (0,1/3)\). Employing a union bound and the fact that capacity is increasing, we obtain

$$\begin{aligned}&{{\,\mathrm{{\textbf{P}}}\,}}\left( \text {Cap}(\textrm{LE}(X)^n)\le \frac{C n}{(\log n)^{2/3}}\right) ={{\,\mathrm{{\textbf{P}}}\,}}\left( \text {Cap}(\textrm{LE}_{\infty }(X^{\ell _n}))\le \frac{C n}{(\log n)^{2/3}}\right) \\&\quad \le {{\,\mathrm{{\textbf{P}}}\,}}\left( \text {Cap}(\textrm{LE}_{\infty }(X^{(1-\epsilon )n(\log n)^{1/3}}))\le \frac{C n}{(\log n)^{2/3}}\right) +{{\,\mathrm{{\textbf{P}}}\,}}\left( \left| \frac{\ell _n}{n(\log n)^{1/3}}-1\right| >\epsilon \right) \\&\quad \preceq \frac{1}{(\log n)^{2/3}}+\frac{\log \log n}{(\log n)^{2/3}}\preceq \frac{\log \log n}{(\log n)^{2/3}} \end{aligned}$$

when we choose \(C<c(1-\epsilon )\). \(\square \)

We will also use the following covering lemma, whose proof we defer to the end of the section.

Lemma 2.18

Let S be a finite subset of \({{\,\mathrm{{\mathbb {Z}}}\,}}^4\), and let \(r\ge 1\). Then there exists an integer K and points \(\{x_i:1\le i\le K\}\subseteq {{\,\mathrm{{\mathbb {Z}}}\,}}^4\) such that the balls \(\Lambda (x_i,3r)\) are disjoint, \(\{x_i\}_{1\le i\le K}\subseteq S+\Lambda (r)\), and

  • \(\sum _{i=1}^K \textrm{Cap}(S\cap \Lambda (x_i,r))\ge 5^{-4}\textrm{Cap}(S)\), and

  • \(\sum _{i=1}^K \textrm{Cap}(S\cap \Lambda (x_i,3r))\le 21^4\sum _{i=1}^K \textrm{Cap}(S\cap \Lambda (x_i,r))\).

We now have everything we need to complete the proof of Proposition 2.11 given Lemma 2.18.

Proof of Proposition 2.11

Let \(\alpha _0,r_0\) be the constants yielded by Proposition 2.12, and fix \(r\ge r_0\vee 2\), \(\alpha >\alpha _0\). For the remainder of the proof we will abbreviate \(M=M_\alpha \). Let \(K\ge 1\) and suppose that \(\{x_i:1\le i \le K\}\subseteq {{\,\mathrm{{\mathbb {Z}}}\,}}^4\) is a set of points such that the family of boxes \((\Lambda (x_i,3r))_{i=1}^K\) are mutually disjoint. We first show that the random variables \(|M(x_i,r)|\) are pairwise negatively correlated conditional on \(\eta \) in the sense that

$$\begin{aligned} {\mathbb {E}}^\eta \Bigl [|M(x_i,r)| \cdot |M(x_j,r)|\Bigr ] \le {\mathbb {E}}^\eta \Bigl [ |M(x_i,r)|\Bigr ] {\mathbb {E}}^\eta \Bigl [|M(x_j,r)|\Bigr ] \end{aligned}$$

for every \(1\le i < j \le K\). Indeed, suppose that \(u\in \Lambda (x_i,3r)\) and \(v\in \Lambda (x_j,3r)\) for some \(i\ne j\). We sample the UST conditional on \(\eta =\Gamma (0,\infty )\) with Wilson’s algorithm, beginning with a random walk X started at u, followed by another walk Y started at v. Let \(\tau _1\) be the first time X hits \(\eta \), let \(\tau _2\) be the first time Y hits \(\textrm{LE}(X^{\tau _1})\cup \eta \), and let \(\tau _2^\prime \) be the first time Y hits \(\eta \). Then

$$\begin{aligned}{} & {} {{\,\mathrm{{\mathbb {P}}}\,}}^\eta \bigl (u\in M(x_i,r),\, v \in M(x_j,r)\bigr ) ={{\,\mathrm{{\mathbb {P}}}\,}}^\eta (u\in M(x_i,r))\\{} & {} \quad \cdot {{\,\mathrm{{\mathbb {P}}}\,}}^\eta \left( \textrm{LE}(Y^{\tau _2})\subseteq \Lambda (x_j,3r),\, |\textrm{LE}(Y^{\tau _2})|\le \frac{r^2}{(\log r)^{1/3}},\,\right. \\{} & {} \qquad \qquad \left. \textrm{LE}(Y^{\tau _2}) \text { is }(\alpha ,r)\text {-good}\ \Big \vert \ u\in M(x_i,r)\right) . \end{aligned}$$

We have by the definition of \(M(x_i,r)\) that if \(u\in M(x_i,r)\) then \(\textrm{LE}(X^{\tau _1})\subseteq \Lambda (x_i,3r)\), so that if \(\textrm{LE}(Y^{\tau _2})\subseteq \Lambda (x_j,3r)\) then \(\tau _2=\tau _2^\prime \). It follows that

$$\begin{aligned}&{{\,\mathrm{{\mathbb {P}}}\,}}^\eta \bigl (v\in M(x_j,r)\mid u\in M(x_i,r)\big )\\&\quad \le {{\,\mathrm{{\mathbb {P}}}\,}}^\eta \left( \textrm{LE}(Y^{\tau _2^\prime })\subseteq \Lambda (x_j,3r),\, |\textrm{LE}(Y^{\tau _2^\prime })|\le \frac{r^2}{(\log r)^{1/3}},\,\right. \\&\quad \left. \textrm{LE}(Y^{\tau _2^\prime }) \text { is }(\alpha ,r)\text {-good}\ \Big \vert \ u\in M(x_i,r)\right) \\&\quad ={{\,\mathrm{{\mathbb {P}}}\,}}^\eta \left( \textrm{LE}(Y^{\tau _2^\prime })\subseteq \Lambda (x_j,3r),\, |\textrm{LE}(Y^{\tau _2^\prime })|\le \frac{r^2}{(\log r)^{1/3}},\, \textrm{LE}(Y^{\tau _2^\prime }) \text { is }(\alpha ,r)\text {-good}\right) \\&\quad ={{\,\mathrm{{\mathbb {P}}}\,}}^\eta \big (v\in M(x_j,r)\big ) \end{aligned}$$

where the first equality follows because \(Y^{\tau _2^\prime }\) is independent from the event \(\{u\in M(x_i,r)\}\) conditional on \(\eta \) and where the last equality follows by an application of Wilson’s algorithm. The claimed negative correlation of \(|M(x_i,r)|\) and \(|M(x_j,r)|\) follows by summing over u and v. Negativity of the correlations immediately implies that

$$\begin{aligned} \text {Var}^\eta \left( \Bigl |\bigcup _{i=1}^K M(x_i,r)\Bigr |\right) \le \sum _{1\le i\le K} \text {Var}^\eta (|M(x_i,r)|), \end{aligned}$$

and we deduce by Chebyshev together with Propositions 2.12 and 2.13 that

$$\begin{aligned} {{\,\mathrm{{\mathbb {P}}}\,}}^\eta \left( \Bigl |\bigcup _{i=1}^K M(x_i,r)\Bigr |\le c_1r^{2}\sum _{i=1}^K \text {Cap}(\eta \cap \Lambda (x_i,r))\right) \preceq \frac{r^2}{\log r} \cdot \frac{\sum _{i=1}^K \textrm{Cap}(\eta \cap \Lambda (x_i,3r))}{\left( \sum _{i=1}^K \text {Cap}\big (\eta \cap \Lambda (x_i,r)\big )\right) ^2}\ \end{aligned}$$
(39)

for some constant \(c_1>0\). Note that this estimate holds for any \(K\ge 1\) and any collection of points \((x_i)_{i=1}^K\) in \({{\,\mathrm{{\mathbb {Z}}}\,}}^4\) such that the family of boxes \((\Lambda (x_i,3r))_{i=1}^K\) are mutually disjoint, where we are free to choose K and \((x_i)_{i=1}^K\) as functions of \(\eta \) if we wish. (Of course the points we choose must be conditionally independent of the rest of the UST given \(\eta \).)
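To spell out the deduction of (39): abbreviating \(M=|\bigcup _{i=1}^K M(x_i,r)|\), Proposition 2.12 gives \({\mathbb {E}}^\eta M\ge c r^2\sum _{i=1}^K \text {Cap}(\eta \cap \Lambda (x_i,r))\) for some constant \(c>0\), so that, taking \(c_1=c/2\), Chebyshev's inequality and Proposition 2.13 (with \(\alpha \) fixed) yield

$$\begin{aligned} {{\,\mathrm{{\mathbb {P}}}\,}}^\eta \left( M\le \frac{1}{2}{\mathbb {E}}^\eta M\right) \le \frac{4\text {Var}^\eta (M)}{({\mathbb {E}}^\eta M)^2}\preceq \frac{r^{6}}{\log r}\cdot \frac{\sum _{i=1}^K \textrm{Cap}(\eta \cap \Lambda (x_i,3r))}{r^{4}\left( \sum _{i=1}^K \text {Cap}(\eta \cap \Lambda (x_i,r))\right) ^{2}}, \end{aligned}$$

which rearranges to (39).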

We now want to apply this estimate to prove our lower tail estimate on \(|{\mathfrak {B}}(n)|\). Fix \(n\ge 1\), and for each \(R\ge 1\), let \({\mathscr {A}}_R\) be the event that \(\Vert \eta _i\Vert _\infty \ge 2R\) for every \(i\ge n/2\). Observe from the definitions that if \({\mathscr {A}}_R\) holds and \(r\ge 2\) is such that \(r^2(\log r)^{-1/3} \le n/2\) and \(3r\le R\) then

$$\begin{aligned} |{\mathfrak {B}}(n)| \ge \Bigl |\bigcup _{i=1}^K M(x_i,r)\Bigr | \end{aligned}$$

for any collection of points \(x_1,\ldots ,x_K\) in \(\Lambda (R)\): the definition of the set \(M(x_i,r)\) and the choice of r ensure that the path connecting each point of \(M(x_i,r)\) to \(\eta \) is contained in \(\Lambda (2R)\) and has length at most n/2, while the definition of \({\mathscr {A}}_R\) ensures that this path meets \(\eta \) within the first n/2 steps of \(\eta \). Thus, choosing these points as a function of \(\eta \) as in the covering lemma, Lemma 2.18, applied with \(S = \eta \cap \Lambda (R)\), we deduce from (39) that there exists a constant \(c_1\) such that

$$\begin{aligned} \mathbb {1}({\mathscr {A}}_R){{\,\mathrm{{\mathbb {P}}}\,}}^\eta \left( |{\mathfrak {B}}(n)|\le c_1 r^{2}\text {Cap}(\eta \cap \Lambda (R))\right) \preceq \frac{r^2}{\log r} \cdot \frac{1}{\text {Cap}(\eta \cap \Lambda (R))} \end{aligned}$$
(40)

for every \(r,R\ge 2\) such that \(r^2(\log r)^{-1/3} \le n/2\) and \(3r\le R\). As such, we have by a union bound that

$$\begin{aligned}{} & {} {{\,\mathrm{{\mathbb {P}}}\,}}\left( |{\mathfrak {B}}(n)|\le \frac{c_1 r^{2} R^2}{\lambda \log R}\right) \preceq \frac{\lambda r^2 \log R}{R^2 (\log r)}+{\mathbb {P}}({\mathscr {A}}_R^c) + {\mathbb {P}}\left( \text {Cap}(\eta \cap \Lambda (R)) \le \frac{R^2}{\lambda \log R}\right) \end{aligned}$$
(41)

for every \(r,R\ge 2\) such that \(r^2(\log r)^{-1/3} \le n/2\) and \(3r\le R\) and every \(\lambda \ge 1\).

To proceed, we will bound the second and third terms on the right hand side and then optimize over the choice of r, R, and \(\lambda \). To bound \({\mathbb {P}}({\mathscr {A}}_R^c)\), we use Wilson's algorithm to write

$$\begin{aligned} {\mathbb {P}}({\mathscr {A}}_R^c)&={\textbf{P}}_0(\textrm{LE}(X)_i \in \Lambda (2R) \text { for some }i\ge n/2) \\ {}&\le {\textbf{P}}_0\left( \ell _{\lfloor n/2 \rfloor }(X) \le \frac{1}{4}n(\log n)^{1/3} \right) +{\textbf{P}}_0\left( X_j \in \Lambda (2R) \text { for some }j\ge \frac{1}{4}n(\log n)^{1/3}\right) \\&\preceq \frac{\log \log n}{(\log n)^{2/3}}+ \frac{R^2 }{n(\log n)^{1/3}}, \end{aligned}$$

where the first term has been bounded using Theorem 2.1 and the second follows by a standard random walk computation (for example, it follows by [36, Lemma 4.4] and Markov's inequality). To bound the third term, we use the union bound

$$\begin{aligned} {\mathbb {P}}\left( \text {Cap}(\eta \cap \Lambda (R)) \le \frac{R^2}{\lambda \log R}\right) \le {\mathbb {P}}(\Vert \eta _i\Vert \ge R \text { for some }i\le k) + {\mathbb {P}}\left( \text {Cap}(\eta ^k) \le \frac{R^2}{\lambda \log R}\right) \end{aligned}$$

for every \(R,k\ge 1\) and \(\lambda \ge 1\). Using Wilson’s algorithm and a further union bound yields that

$$\begin{aligned}{} & {} {\mathbb {P}}\left( \text {Cap}(\eta \cap \Lambda (R)) \le \frac{R^2}{\lambda \log R}\right) \le {\textbf{P}}_0\left( \ell _k \ge 2 k (\log k)^{1/3} \right) \\{} & {} \qquad \qquad \qquad \qquad \qquad \qquad \qquad + {\textbf{P}}_0(\Vert X_j\Vert \ge R \text { for some }j\le 2 k (\log k)^{1/3})\\{} & {} \qquad \qquad \qquad \qquad \qquad \qquad \qquad + {\textbf{P}}_0\left( \text {Cap}(\textrm{LE}(X)^k) \le \frac{R^2}{\lambda \log R}\right) , \end{aligned}$$

and we deduce from Theorem 2.1, the maximal version of Azuma–Hoeffding [61, Section 2], and Proposition 2.17 that there exists a positive constant C such that

$$\begin{aligned} {\mathbb {P}}\left( \text {Cap}(\eta \cap \Lambda (R)) \le \frac{R^2}{\lambda \log R}\right) \preceq \frac{\log \log k}{(\log k)^{2/3}} + \exp \left[ -\Omega \left( \frac{R^2}{k(\log k)^{1/3}}\right) \right] \end{aligned}$$

for every \(R,k\ge 1\) such that \(k(\log k)^{-2/3} \le C \lambda ^{-1} R^2 (\log R)^{-1}\). If \(\lambda \le R^{1/2}\) then the maximal such k is of order \(\lambda ^{-1} R^2 (\log R)^{-1/3}\) and it follows by calculus that

$$\begin{aligned} {\mathbb {P}}\left( \text {Cap}(\eta \cap \Lambda (R)) \le \frac{R^2}{\lambda \log R}\right) \preceq \frac{\log \log R}{(\log R)^{2/3}} + \exp \left[ -\Omega (\lambda )\right] \end{aligned}$$

for every \(R\ge 3\) and \(1\le \lambda \le R^{1/2}\). Putting these estimates together yields that

$$\begin{aligned}{} & {} {{\,\mathrm{{\mathbb {P}}}\,}}\left( |{\mathfrak {B}}(n)|\le \frac{c_1 r^{2} R^2}{\lambda \log R}\right) \le \frac{\lambda r^2 \log R}{R^2\log r} + \frac{\log \log n}{(\log n)^{2/3}} \\{} & {} \qquad \qquad \qquad \qquad \qquad \quad \,\,\,\,\, + \frac{R^2}{n(\log n)^{1/3}} + \frac{\log \log R}{(\log R)^{2/3}} + \exp \left[ -\Omega (\lambda )\right] \end{aligned}$$

for every \(r,R\ge 2\) such that \(r^2(\log r)^{-1/3} \le n/2\) and \(3r\le R\) and every \(1\le \lambda \le R^{1/2}\). Letting \(\beta \ge 10\), taking \(R=\lceil \beta ^{-1} n^{1/2} (\log n)^{1/6}\rceil \), \(r= \lceil \beta ^{-2} n^{1/2} (\log n)^{1/6}\rceil \) and \(\lambda =\beta \) yields that if \(n \ge \beta ^4\) then

$$\begin{aligned}&{{\,\mathrm{{\mathbb {P}}}\,}}\left( |{\mathfrak {B}}(n)|\le \frac{c_2 n^2}{\beta ^5 (\log n)^{1/3}}\right) \preceq \beta ^{-1} + \frac{\log \log n}{(\log n)^{2/3}} + \beta ^{-2} \\&\qquad \qquad \qquad \qquad \qquad \qquad \,\, + \frac{\log \log n}{(\log n)^{2/3}} + \exp \left[ -\Omega (\beta )\right] \preceq \beta ^{-1} + \frac{\log \log n}{(\log n)^{2/3}}, \end{aligned}$$

which implies the claim. \(\square \)

It remains to prove our covering lemma for the capacity, Lemma 2.18. The proof, which exhibits and analyzes a greedy algorithm for constructing the desired set of balls, follows a standard strategy for proving covering lemmas of similar form.
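In computational terms, the construction is the greedy selection sketched below. This is in the same illustrative spirit as the earlier sketch: the names greedy_cover and cap_box are ours, and the box capacities are assumed to have been computed in advance by some other means.

def greedy_cover(cap_box, r):
    # cap_box: dict mapping candidate centres x in the lattice (2r+1)Z^4
    # (as integer 4-tuples) to a precomputed value of Cap(S ∩ Λ(x,r)).
    # Any non-negative subadditive set function may be used in place of
    # the capacity (cf. Remark 6 below).
    remaining = {x for x, c in cap_box.items() if c > 0}
    chosen = []
    while remaining:
        # greedily take a centre whose box has maximal capacity ...
        x = max(remaining, key=lambda y: cap_box[y])
        chosen.append(x)
        # ... and discard every centre within l^infinity distance 4r+2 of
        # x (the set A^2[x] in the proof below). Surviving centres lie in
        # (2r+1)Z^4, so distinct chosen centres are at distance at least
        # 6r+3 and the boxes Λ(x_i,3r) are pairwise disjoint.
        remaining = {y for y in remaining
                     if max(abs(a - b) for a, b in zip(y, x)) > 4 * r + 2}
    return chosen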

Proof of Lemma 2.18

Consider the set of centres \({\mathcal {C}}=\{x\in (2r+1){{\,\mathrm{{\mathbb {Z}}}\,}}^4:\textrm{Cap}(\Lambda (x,r)\cap S )>0\}\) and the corresponding collection of boxes \({\mathcal {B}}=\{\Lambda (x,r):x\in {\mathcal {C}}\}\); note that the boxes \(\Lambda (x,r)\), \(x\in (2r+1){{\,\mathrm{{\mathbb {Z}}}\,}}^4\), partition \({{\,\mathrm{{\mathbb {Z}}}\,}}^4\). Note that \(x\in S +\Lambda (r)\) for each \(x\in {\mathcal {C}}\), since otherwise the box \(\Lambda (x,r)\) would not contain any points of S. Given \(x\in {\mathcal {C}}\), we write \(A[x]=\{y\in {\mathcal {C}}:\left\Vert y-x\right\Vert _\infty \le 2r+1\}\) for the set of centres in \({\mathcal {C}}\) equal to or adjacent to x. We note the crude bound \(\#{A[x]}\le 3^4\). Similarly, we write \(A^2[x]=\{y\in {\mathcal {C}}:\left\Vert y-x\right\Vert _\infty \le 4r+2\}\), \(A^3[x]=\{y\in {\mathcal {C}}:\left\Vert y-x\right\Vert _\infty \le 6r+3\}\), and note that \(\#{A^2[x]}\le 5^4\), \(\#{A^3[x]}\le 7^4\).

We will construct the sequence \((x_i)_{i=1}^K\) using a greedy algorithm. By subadditivity of capacity (which is an immediate consequence of the variational principle of Lemma 2.14), we know that

$$\begin{aligned} \Pi :=\sum _{x\in {\mathcal {C}}}\textrm{Cap}( S \cap \Lambda (x,r))\ge \textrm{Cap}( S ). \end{aligned}$$
(42)

Define the list of centres \((x_i)_{i\ge 0}\subseteq {\mathcal {C}}\) as follows. Let \({\mathcal {C}}_0={\mathcal {C}}\), and for \(i\ge 0\) such that \({\mathcal {C}}_i\ne \emptyset \), let

$$\begin{aligned} x_i=\mathop {\mathrm {arg\,max}}\limits \{\textrm{Cap}(\Lambda (x,r)\cap S ):x\in {\mathcal {C}}_i\};\qquad {\mathcal {C}}_{i+1} = {\mathcal {C}}_i\setminus A^2[x_i]. \end{aligned}$$

Write \(I=\inf \{i\ge 0: {\mathcal {C}}_i=\emptyset \}\) and define \(\kappa _i=\textrm{Cap}(\Lambda (x_i,r)\cap S )\) for \(0\le i< I\). We claim that

$$\begin{aligned} \sum _{0\le i\le n}\textrm{Cap}( S \cap \Lambda (x_i,3r))\le 21^4 \sum _{0\le j\le n} \kappa _j \qquad \text {for every }n<I. \end{aligned}$$
(43)

Fix \(0\le i<I\). We note that for any \(y\in A^2[x_i]\), there exists a unique \(0\le j\le i\) such that \(y\in {\mathcal {C}}_{j}\setminus {\mathcal {C}}_{j+1}\). By definition of \(\kappa _j\) and \(x_j\), it must then hold that \(\textrm{Cap}(\Lambda (y,r)\cap S )\le \kappa _j\). By subadditivity of capacity, we can therefore write

$$\begin{aligned} \textrm{Cap}(\Lambda (x_i,3r)\cap S )\le \sum _{y\in A[x_i]}\textrm{Cap}(\Lambda (y,r)\cap S )\le \sum _{y\in A[x_i]}\sum _{j\le i}\kappa _{j}\mathbb {1}(y\in {\mathcal {C}}_j\setminus {\mathcal {C}}_{j+1}). \end{aligned}$$

Observing that \({\mathcal {C}}_j\setminus {\mathcal {C}}_{j+1}\subseteq A^2[x_j]\) for \(j<I\), we get

$$\begin{aligned} \textrm{Cap}(\Lambda (x_i,3r)\cap S )\le \sum _{j\le i}\kappa _{j}|A[x_i]\cap A^2[x_j]|. \end{aligned}$$

By switching the order of summation, we have

$$\begin{aligned} \sum _{0\le i\le n}\textrm{Cap}( \Lambda (x_i,3r)\cap S )\le \sum _{0\le j\le n} \kappa _{j}\sum _{j\le i\le n}|A[x_i]\cap A^2[x_j]|. \end{aligned}$$

Finally, \(|A[x_i]\cap A^2[x_j]|\le |A[x_i]|\le 3^4\), and if \(|A[x_i]\cap A^2[x_j]|\ne 0\), then \(x_i\in A^3[x_j]\). The \(x_i\) are all distinct, and there are at most \(7^4\) elements in \(A^3[x_j]\), and so the summations over i on the right hand side are bounded above by \(3^4\times 7^4=21^4\), thus proving the claim (43).

Next, observe that for \(i\ge 0\)

$$\begin{aligned} \sum _{x\in {\mathcal {C}}_i}\textrm{Cap} (\Lambda (x,r)\cap S ) \ge \Pi -5^4 \sum _{0\le j\le i-1}\kappa _j. \end{aligned}$$

Indeed, at each stage \(j<i\) of the algorithm we remove at most \(5^4\) centres from \({\mathcal {C}}_j\) to give \({\mathcal {C}}_{j+1}\), and each of these centres x satisfies \(\textrm{Cap}( S \cap \Lambda (x,r))\le \kappa _j\). Putting \(i=I\) in the above inequality gives

$$\begin{aligned} 5^4 \sum _{0\le j<I}\kappa _j\ge \Pi , \end{aligned}$$

and so by (42), we have \(\sum _{0\le j<I}\kappa _j\ge 5^{-4} \textrm{Cap}( S )\). Finally, we note that for \(0\le i<j<I\), by construction \(x_j\notin A^2[x_i]\), and so the balls \(\Lambda (x_i,3r)\) and \(\Lambda (x_j,3r)\) are disjoint. \(\square \)

Remark 6

Note that the proof of Lemma 2.18 does not use any properties of the capacity other than subadditivity and non-negativity, so that a similar covering lemma holds for any subadditive, non-negative set function.
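As Remark 6 notes, only subadditivity and non-negativity are used, so the greedy construction can be stated for a generic set function. The following Python sketch is purely illustrative (the interface and names are ours); cap stands for any subadditive, non-negative set function, for instance len:

```python
def greedy_cover(S, cap, r):
    """Illustrative implementation of the greedy selection in the proof of
    Lemma 2.18.  S is a finite set of points of Z^4 (as tuples), cap is any
    subadditive non-negative set function, and the boxes Lambda(x, r) with
    x in (2r+1)Z^4 partition Z^4.  Returns centres x_0, x_1, ... whose boxes
    Lambda(x_i, 3r) are pairwise disjoint and which satisfy
    sum_i cap(S n Lambda(x_i, r)) >= cap(S) / 5^4."""
    side = 2 * r + 1
    boxes = {}
    for p in S:  # group the points of S by the partition box containing them
        centre = tuple(side * round(c / side) for c in p)
        boxes.setdefault(centre, set()).add(p)
    C = {}       # the set of centres, with the capacity of each box recorded
    for x, pts in boxes.items():
        c = cap(pts)
        if c > 0:
            C[x] = c
    centres = []
    while C:
        x = max(C, key=C.get)  # argmax of cap(Lambda(x, r) n S) over C_i
        centres.append(x)
        # pass from C_i to C_{i+1}: delete A^2[x] = {y : |y - x|_inf <= 4r+2}
        C = {y: c for y, c in C.items()
             if max(abs(a - b) for a, b in zip(x, y)) > 2 * side}
    return centres

# toy usage: cap = len (cardinality) is subadditive and non-negative
print(greedy_cover({(i, 0, 0, 0) for i in range(40)}, len, r=2))
```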

3 Random Walk

We now apply our main geometric theorem, Theorem 1.1, to study the behaviour of the random walk on the 4d UST. We begin by applying our results together with those of [37] to prove our effective resistance estimate, Theorem 1.3, in Sect. 3.1. In Sect. 3.2 we review the theory of Markov-type inequalities and prove our upper bound on the mean-squared displacement, Theorem 1.4. Finally, in Sect. 3.3 we show how the remaining estimates of Theorem 1.2 can be deduced from these estimates using the methods of [13, 50].

3.1 Effective resistance

In this section we prove Theorem 1.3. The upper bound is trivial since effective resistances are always bounded above by graph distances, so we focus on the lower bound. We will employ [36, Lemma 8.3], which we reproduce here. Let \({\mathscr {C}}_{\textrm{eff}}(A\leftrightarrow B;G)={\mathscr {R}}_{\textrm{eff}}(A\leftrightarrow B;G)^{-1}\) denote the effective conductance between sets \(A,B\subseteq V[G]\).

Lemma 3.1

([36], Lemma 8.3) Let T be a tree, let v be a vertex of T, and let \(N_v(n,k)\) be the number of vertices \(u\in \partial B(v,k):=B(v,k)\setminus B(v,k-1)\) that lie on a geodesic in T from v to \(\partial B(v,n)\). Then

$$\begin{aligned} {\mathscr {C}}_{\textrm{eff}}(v\leftrightarrow \partial B(v,n);T)\le \frac{1}{k} N_v(n,k) \end{aligned}$$

for every \(1\le k\le n\).
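
To make Lemma 3.1 concrete, here is a small numerical sanity check (our own illustrative code, not from [36], with unit conductances): it computes the effective conductance by solving the discrete Dirichlet problem and compares it with the bound \(N_v(n,k)/k\) on a binary tree, where \(N_v(n,k)=2^k\).

```python
import numpy as np
from collections import defaultdict

def eff_conductance(adj, source, boundary):
    """Effective conductance C_eff(source <-> boundary) for unit edge
    conductances: solve for the harmonic function h with h(source) = 1 and
    h = 0 on `boundary`, then read off the current leaving the source."""
    fixed = {v: 0.0 for v in boundary}
    fixed[source] = 1.0
    free = [v for v in adj if v not in fixed]
    fi = {v: i for i, v in enumerate(free)}
    A = np.zeros((len(free), len(free)))
    b = np.zeros(len(free))
    for v in free:
        A[fi[v], fi[v]] = len(adj[v])
        for w in adj[v]:
            if w in fi:
                A[fi[v], fi[w]] -= 1.0
            else:
                b[fi[v]] += fixed[w]
    sol = np.linalg.solve(A, b) if free else np.array([])
    h = lambda v: fixed[v] if v in fixed else sol[fi[v]]
    return sum(1.0 - h(w) for w in adj[source])

def binary_tree(depth):
    """Complete binary tree with heap indexing: root 1, leaves at `depth`."""
    adj = defaultdict(list)
    for v in range(1, 2 ** depth):
        for c in (2 * v, 2 * v + 1):
            adj[v].append(c)
            adj[c].append(v)
    return adj

# Lemma 3.1 on the depth-6 binary tree rooted at v = 1: N_v(6, k) = 2^k.
c_eff = eff_conductance(binary_tree(6), 1, range(2 ** 6, 2 ** 7))
assert all(c_eff <= 2 ** k / k + 1e-9 for k in range(1, 7))
print(c_eff)  # 64/63: the resistance to the leaves is sum_{j=1}^{6} 2^{-j}
```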

We will also use the following theorem of [37] concerning the tail of the intrinsic radius of the past. For each \(n\ge 0\), let \(\partial \mathfrak {P}(0,n)\) be the set of vertices in \(\mathfrak {P}(0)\) at intrinsic distance exactly n from 0.

Theorem 3.2

([37], Theorem 1.1) Let \({\mathfrak {T}}\) be the uniform spanning tree of \({{\,\mathrm{{\mathbb {Z}}}\,}}^4\). Then

$$\begin{aligned} {{\,\mathrm{{\mathbb {P}}}\,}}(\partial {\mathfrak {P}}(0,n)\ne \emptyset )\asymp \frac{(\log n)^{1/3}}{n} \end{aligned}$$

for every \(n\ge 1\).

We now apply these results together with Theorem 1.1 to prove Theorem 1.3.

Proof of Theorem 1.3

Fix \(\lambda >0\) and \(\delta \in (0,1]\). For each \(0\le m\le n\), let K(n,m) be the set of vertices \(u\in \partial {\mathfrak {B}}(0,m)\) that lie on a geodesic from 0 to \(\partial {\mathfrak {B}}(0,n)\) and let \(K^\prime (n,m)\) be the set of vertices \(u\in \partial {\mathfrak {B}}(0,m)\) such that \(\partial {\mathfrak {P}}(u,n-m)\ne \emptyset \). We observe that \(K(n,m)\setminus K^\prime (n,m)\) contains at most one vertex, namely the unique vertex of \(\partial {\mathfrak {B}}(0,m)\) that lies in the future of 0, and so, by Lemma 3.1, we have

$$\begin{aligned} \begin{aligned} {\mathscr {C}}_{\text {eff}}(0\leftrightarrow \partial {\mathfrak {B}}(0,n);{\mathfrak {T}})\le \frac{1}{m}|K(n,m)|\le \frac{1}{m}+\frac{1}{m}|K^\prime (n,m)| \end{aligned} \end{aligned}$$

for each \(1\le m\le n\). Averaging this bound over \(n\le m\le 2n\) yields

$$\begin{aligned} \begin{aligned} {\mathscr {C}}_{\text {eff}}(0\leftrightarrow \partial {\mathfrak {B}}(0,3n);{\mathfrak {T}})\preceq \frac{1}{n}+ \frac{1}{n^2}\sum _{m=n}^{2n}|K^\prime (3n,m)|, \end{aligned}\end{aligned}$$

for each \(n\ge 1\). Now, for each \(n\ge 1\), the sets \((K^\prime (3n,m))_{n\le m\le 2n}\) are pairwise disjoint and their union satisfies

$$\begin{aligned} \bigcup _{n\le m\le 2n} K^\prime (3n,m)\subseteq \{u\in {{\,\mathrm {{\mathbb {Z}}}\,}}^4: u\in {\mathfrak {B}}(0,2n),\, \partial {\mathfrak {P}}(u,n)\ne \emptyset \}, \end{aligned}$$

and so

$$\begin{aligned} \begin{aligned} {\mathscr {C}}_{\text {eff}}(0\leftrightarrow \partial {\mathfrak {B}}(0,3n);{\mathfrak {T}})\preceq \frac{1}{n}+ \frac{1}{n^2}\sum _{u\in {{\,\mathrm {{\mathbb {Z}}}\,}}^4} \mathbb {1}\big (u\in {\mathfrak {B}}(0,2n),\, {\partial \mathfrak {P}}(u,n)\ne \emptyset \big ). \end{aligned} \end{aligned}$$

Multiplying both sides by the indicator function \(\mathbb {1}(|{\mathfrak {B}}(0,4n)|\le \lambda ^{1/2}n^2(\log n)^{-1/3+\delta })\) and taking expectations gives

$$\begin{aligned}&{\mathbb {E}}\left[ {\mathscr {C}}_{\text {eff}}(0\leftrightarrow \partial {\mathfrak {B}}(0,3n);{\mathfrak {T}})\mathbb {1}\left( |{\mathfrak {B}}(0,4n)|\le \frac{\lambda ^{1/2}n^2}{(\log n)^{1/3-\delta } }\right) \right] \\&\quad \preceq \frac{1}{n}+ \frac{1}{n^2}\sum _{u\in {{\,\mathrm {{\mathbb {Z}}}\,}}^4} {{\,\mathrm {{\mathbb {P}}}\,}}\left( u\in {\mathfrak {B}}(0,2n),\, \partial {\mathfrak {P}}(u,n)\ne \emptyset ,\, |{\mathfrak {B}}(0,4n)|\le \frac{\lambda ^{1/2}n^2}{(\log n)^{1/3-\delta } }\right) , \end{aligned}$$

and applying the mass-transport principle to exchange the roles of 0 and u yields that

$$\begin{aligned}&{\mathbb {E}}\left[ {\mathscr {C}}_{\text {eff}}(0\leftrightarrow \partial {\mathfrak {B}}(0,3n);{\mathfrak {T}})\mathbb {1}\left( |{\mathfrak {B}}(0,4n)|\le \frac{\lambda ^{1/2}n^2}{(\log n)^{1/3-\delta } }\right) \right] \\&\quad \preceq \frac{1}{n}+ \frac{1}{n^2}\sum _{u\in {{\,\mathrm {{\mathbb {Z}}}\,}}^4} {{\,\mathrm {{\mathbb {P}}}\,}}\left( 0\in {\mathfrak {B}}(u,2n),\, \partial {\mathfrak {P}}(0,n)\ne \emptyset ,\, |{\mathfrak {B}}(u,4n)|\le \frac{\lambda ^{1/2}n^2}{(\log n)^{1/3-\delta } }\right) \\&\quad \le \frac{1}{n}+ \frac{1}{n^2}{\mathbb {E}}\left[ |{\mathfrak {B}}(0,2n)|\mathbb {1}\left( |{\mathfrak {B}}(0,2n)|\le \frac{\lambda ^{1/2}n^2}{(\log n)^{1/3-\delta } },\, \partial {\mathfrak {P}}(0,n)\ne \emptyset \right) \right] \\&\quad \preceq \frac{1}{n}+ \frac{\lambda ^{1/2}}{(\log n)^{1/3-\delta }}{{\,\mathrm {{\mathbb {P}}}\,}}\big (\partial {\mathfrak {P}}(0,n)\ne \emptyset \big ) \preceq \lambda ^{1/2}\frac{(\log n)^\delta }{n}, \end{aligned}$$
(44)

where the final inequality follows from Theorem 3.2. Now by a union bound, we have

$$\begin{aligned} {{\,\mathrm {{\mathbb {P}}}\,}}\left( {\mathscr {C}}_{\text {eff}}(0\leftrightarrow \partial {\mathfrak {B}}(0,3n);{\mathfrak {T}})>\lambda \frac{(\log n)^\delta }{n}\right)&\le {{\,\mathrm {{\mathbb {P}}}\,}}\left( |{\mathfrak {B}}(0,4n)|> \frac{\lambda ^{1/2}n^2}{(\log n)^{1/3-\delta } } \right) \\&\quad +{{\,\mathrm {{\mathbb {P}}}\,}}\left( {\mathscr {C}}_{\text {eff}}(0\leftrightarrow \partial {\mathfrak {B}}(0,3n);{\mathfrak {T}})\mathbb {1}\left( |{\mathfrak {B}}(0,4n)|\le \frac{\lambda ^{1/2}n^2}{(\log n)^{1/3-\delta } }\right) >\lambda \frac{(\log n)^\delta }{n}\right) . \end{aligned}$$

Applying Markov’s inequality to each term on the right hand side and using (44) and Theorem 1.1 to estimate the relevant expectations yields that

$$\begin{aligned} \begin{aligned} {{\,\mathrm {{\mathbb {P}}}\,}}\left( {\mathscr {C}}_{\text {eff}}(0\leftrightarrow \partial {\mathfrak {B}}(0,3n);{\mathfrak {T}})>\lambda \frac{(\log n)^\delta }{n}\right) \preceq _\delta \lambda ^{-1}\lambda ^{1/2}+\lambda ^{-1/2}\preceq \lambda ^{-1/2}, \end{aligned} \end{aligned}$$
(45)

and the claim follows since \(\lambda ,\delta >0\) were arbitrary. \(\square \)

3.2 Upper bounds on displacement via Markov-type inequalities

In this section, we will use Markov-type inequalities [8, 27, 63] together with the results of [37] to prove Theorem 1.4, which establishes sharp upper bounds on the expectation of the squared maximal intrinsic displacement of a random walk on the 4d UST. Markov-type inequalities were first introduced by Ball [8] in the context of the Lipschitz extension problem, and have since been found to have many important applications to the study of random walk [29, 32, 56, 57, 65]. Our work is particularly influenced by that of James Lee and his coauthors [28, 29, 56, 57], who pioneered the use of Markov-type inequalities to prove sharp subdiffusive estimates for random walks on fractals. We begin by quickly reviewing the general theory, including in particular the extension of the universal Markov-type inequality for planar graphs of Ding, Lee, and Peres [28] to unimodular hyperfinite planar graphs established in [32].

Unimodular weighted graphs. A vertex-weighted graph is a pair \((G,\omega )\) consisting of a graph G and a weighting on G, that is, a function \(\omega : V[G]\rightarrow [0,\infty )\). We define the weighted graph distance between vertices x, y of a weighted graph \((G,\omega )\) by

$$\begin{aligned} d_\omega ^G(x,y)=\inf _{x=u_0\sim \cdots \sim u_n=y, n\in {{\,\mathrm{{\mathbb {N}}}\,}}}\sum _{i=1}^n \frac{1}{2}\big (\omega (u_i)+\omega (u_{i-1})\big ). \end{aligned}$$
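
The infimum defining \(d_\omega ^G\) is an ordinary shortest-path distance for the edge lengths \(\frac{1}{2}(\omega (u)+\omega (v))\), so it can be computed with Dijkstra's algorithm. The following is a minimal illustrative sketch (our own hypothetical names, assuming an adjacency-list representation with hashable, orderable vertices):

```python
import heapq

def weighted_distance(adj, omega, x, y):
    """d_omega(x, y): Dijkstra on the graph with edge lengths
    (omega(u) + omega(v)) / 2, which is exactly the infimum in the
    definition of the weighted graph distance.  Vertex weights are
    non-negative, so Dijkstra's correctness condition holds."""
    dist = {x: 0.0}
    heap = [(0.0, x)]
    while heap:
        d, u = heapq.heappop(heap)
        if u == y:
            return d
        if d > dist[u]:
            continue  # stale heap entry
        for v in adj[u]:
            nd = d + 0.5 * (omega(u) + omega(v))
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return float("inf")  # y is not reachable from x
```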

Let \({\mathcal {G}}_\bullet ^\omega \) be the space of triples \((G,\omega ,\rho )\), where \((G,\omega )\) is a locally finite vertex-weighted graph, and \(\rho \in V[G]\) is a vertex known as the root vertex. The space \({\mathcal {G}}_\bullet ^\omega \) is equipped with the Borel sigma algebra induced by the natural generalisation of the Benjamini–Schramm local topology [3, 24], in which two rooted, weighted graphs are considered close when large graph-distance balls around their roots admit a graph isomorphism that approximately preserves the weights. The details of this construction are not important to us and can be found in e.g. [24, Section 1.2]. Similarly, we also have the space \({\mathcal {G}}_{\bullet \bullet }^\omega \) of vertex-weighted graphs with an ordered pair of distinguished vertices. We say that a random variable \((G,\omega ,\rho )\) taking values in \({\mathcal {G}}_\bullet ^\omega \) is a unimodular vertex-weighted graph if it satisfies the mass-transport principle, i.e. if

$$\begin{aligned} {\mathbb {E}}\left[ \sum _{v\in V[G]}F(G,\omega ,\rho ,v)\right] ={\mathbb {E}}\left[ \sum _{v\in V[G]}F(G,\omega ,v,\rho )\right] \end{aligned}$$

for each Borel measurable function \(F:{\mathcal {G}}_{\bullet \bullet }^\omega \rightarrow [0,\infty )\). Unweighted unimodular random graphs are defined similarly; we refer the reader to [3, 24] for a more in-depth discussion of the local topology and unimodularity. These notions are relevant to our setting since if K is the component of the origin 0 in some translation-invariant random subgraph of \({{\,\mathrm{{\mathbb {Z}}}\,}}^d\) then (K, 0) always defines a unimodular random rooted graph, so that, in particular, \(({\mathfrak {T}},0)\) is a unimodular random rooted graph when \({\mathfrak {T}}\) is the UST of \({{\,\mathrm{{\mathbb {Z}}}\,}}^4\). Moreover, if the weight \(\omega :{{\,\mathrm{{\mathbb {Z}}}\,}}^4\rightarrow [0,\infty )\) is computed from \({\mathfrak {T}}\) in a translation-equivariant way then the resulting weighted random rooted graph \(({\mathfrak {T}},\omega ,0)\) is also unimodular, as can be seen by applying the usual mass-transport principle on \({{\,\mathrm{{\mathbb {Z}}}\,}}^4\) to the expectations \({\mathbb {E}} F({\mathfrak {T}},\omega ,x,y)\).

Markov-type inequalities. A metric space \({\mathcal {X}}=({\mathcal {X}},d)\) is said to have Markov-type 2 with constant \(c<\infty \) if for every finite set S, every irreducible reversible Markov chain M on S, and every function \(f:S\rightarrow {\mathcal {X}}\) the inequality

$$\begin{aligned} {\mathbb {E}}\left[ d\big (f(Y_0),f(Y_n)\big )^2\right] \le c^2n {\mathbb {E}}\left[ d\big (f(Y_0),f(Y_1)\big )^2\right] \end{aligned}$$

holds for every \(n\ge 0\), where \((Y_i)_{i\ge 0}\) is a trajectory of the Markov chain M with \(Y_0\) distributed as the stationary measure of M. Similarly, a metric space \({\mathcal {X}}=({\mathcal {X}},d)\) is said to have maximal Markov-type 2 with constant \(c<\infty \) if for every finite set S, every irreducible reversible Markov chain M on S, and every function \(f:S\rightarrow {\mathcal {X}}\), we have that

$$\begin{aligned} {\mathbb {E}}\left[ \max _{0\le i\le n}d\big (f(Y_0),f(Y_i)\big )^2\right] \le c^2n {\mathbb {E}}\left[ d\big (f(Y_0),f(Y_1)\big )^2\right] \end{aligned}$$

for each \(n\ge 0\), where, as before, \((Y_i)_{i\ge 0}\) is a trajectory of the Markov chain M with \(Y_0\) distributed as the stationary measure of M.
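
As a purely illustrative sanity check of the definition (not of the planar-graph theorem discussed below), one can estimate the Markov-type 2 ratio by Monte Carlo for a simple example; the following sketch, with our own hypothetical names, uses the random walk on a cycle started from its stationary measure.

```python
import random

def markov_type_ratio(t, n=60, samples=20000, seed=0):
    """Monte Carlo illustration of the Markov-type 2 definition: simple
    random walk on the n-cycle, started from its (uniform) stationary
    measure, with f the identity map into the cycle metric
    d(a, b) = min(|a-b|, n-|a-b|).  Returns an estimate of
    E[d(Y_0,Y_t)^2] / (t * E[d(Y_0,Y_1)^2]); here every step has
    d(Y_0,Y_1) = 1 and the walk is diffusive, so the ratio is at most 1."""
    rng = random.Random(seed)
    acc = 0.0
    for _ in range(samples):
        y0 = rng.randrange(n)          # uniform = stationary measure
        y = y0
        for _ in range(t):
            y = (y + rng.choice((-1, 1))) % n
        acc += min(abs(y - y0), n - abs(y - y0)) ** 2
    return acc / samples / t           # E[d(Y_0, Y_1)^2] = 1 here

print(markov_type_ratio(100))          # at most 1, up to Monte Carlo error
```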

It is proved in [28] that there exists a universal constant C such that every vertex-weighted planar graph has Markov-type 2 with constant C; in fact, their proof also establishes the existence of a universal constant C such that every vertex-weighted planar graph has maximal Markov-type 2 with constant C, as explained in [32, Proposition 2.4]. This fact is significantly easier for trees, where it was established by Naor, Peres, Schramm, and Sheffield [63] (see also [59, Theorem 13.14]).

We now describe the consequences of this theorem for unimodular random planar graphs. We must first define what it means for a unimodular random rooted graph to be hyperfinite. A percolation on a unimodular random rooted graph \((G,\rho )\) is a labelling \(\eta \) of the edge set of G by the elements 0, 1 such that the resultant edge-labelled graph \((G,\eta ,\rho )\) is unimodular. We think of the percolation \(\eta \) as a random subgraph of G, where each edge is labelled 1 if it is included in the subgraph and 0 otherwise, and denote the connected component of \(\rho \) in this subgraph as \(K_\eta (\rho )\). We say a percolation is finitary if \(K_\eta (\rho )\) is almost surely finite, and say a unimodular random rooted graph \((G,\rho )\) is hyperfinite if there exists an increasing sequence of finitary percolations \((\eta _n)_{n\ge 1}\) such that \(\cup _{n\ge 1} K_{\eta _n}(\rho )=V[G]\) almost surely. The component of the origin in a translation-invariant random subgraph of \({{\,\mathrm{{\mathbb {Z}}}\,}}^d\) is always hyperfinite as can be seen by taking a random hierarchical partition of \({{\,\mathrm{{\mathbb {Z}}}\,}}^d\) into dyadic boxes. The following proposition appears as [32, Corollary 2.5].

Proposition 3.3

Let \((G,\rho )\) be a hyperfinite, unimodular random rooted graph with \({\mathbb {E}}\left[ \deg (\rho )\right] <\infty \) that is almost surely planar, and suppose that \(\omega \) is a vertex-weighting of G such that \((G,\omega ,\rho )\) is a unimodular vertex-weighted graph. If Y is a random walk on G started at \(\rho \) then

$$\begin{aligned} {\mathbb {E}}\left[ \deg (\rho )\max _{0\le i\le n}d_\omega ^G\big (Y_0,Y_i\big )^2\right] \le C^2 n {\mathbb {E}}\left[ \deg (\rho )\omega (\rho )^2\right] , \end{aligned}$$

for each \(n\ge 1\), where C is a universal constant.

We now apply this proposition to prove Theorem 1.4.

Proof of Theorem 1.4

Let \(r\ge 1\) be a parameter to be optimized over shortly. Since the UST of \({{\,\mathrm{{\mathbb {Z}}}\,}}^4\) is unimodular, hyperfinite (being a translation-invariant percolation process on \({{\,\mathrm{{\mathbb {Z}}}\,}}^4\)), and planar (being a tree), we can apply Proposition 3.3 to the vertex weight

$$\begin{aligned} \omega _r(v)=\mathbb {1}(\partial {\mathfrak {P}}(v,r)\ne \emptyset ), \end{aligned}$$

which makes \(({\mathfrak {T}},\omega _r,0)\) unimodular since it is computed as a translation-equivariant function of \({\mathfrak {T}}\). This particular choice of weight is inspired by that used by Ganguly and Lee in [29]. Writing \(d_r=d_{\omega _r}^{\mathfrak {T}}\) and using the fact that \({\mathfrak {T}}\) has degrees uniformly bounded below by 1 and above by 8, we get that

$$\begin{aligned} {\mathbb {E}}\left[ \max _{0\le i\le n}d_{r}\big (Y_0,Y_i\big )^2\right] \le 8C^2 n{{\,\mathrm{{\mathbb {P}}}\,}}\big (\partial {\mathfrak {P}}(0,r)\ne \emptyset \big ) \end{aligned}$$
(46)

for each \(r,n \ge 1\). We next claim that

$$\begin{aligned} d_{\mathfrak {T}}(u,v)\le 4r + 4d_r(u,v) \qquad \text {for every }u,v\in {\mathfrak {T}}\text { and }r\ge 1. \end{aligned}$$
(47)

Indeed, let \(u,v\in {{\,\mathrm{{\mathbb {Z}}}\,}}^4\) and suppose that \(d_{\mathfrak {T}}(u,v) \ge 4r\), the claimed inequality being trivial otherwise. Let w be the vertex at which the futures of u and v meet. At least one of the inequalities \(d_{{\mathfrak {T}}}(u,w) \ge \frac{1}{2}d_{\mathfrak {T}}(u,v)\) or \(d_{{\mathfrak {T}}}(v,w) \ge \frac{1}{2}d_{\mathfrak {T}}(u,v)\) holds, and we may assume without loss of generality that \(d_{{\mathfrak {T}}}(u,w) \ge \frac{1}{2}d_{\mathfrak {T}}(u,v) \ge 2r\). Since u belongs to the past of each of the vertices in the \({\mathfrak {T}}\)-geodesic connecting u to w, every vertex in the second half of this geodesic lies at intrinsic distance at least r from u and hence has a past of intrinsic radius at least r, so that \(\omega _r\equiv 1\) on this half, \(d_r(u,w) \ge \frac{1}{2} d_{\mathfrak {T}}(u,w)\), and hence \(d_r(u,v) \ge \frac{1}{4}d_{\mathfrak {T}}(u,v)\) as required. It follows from (47) together with (46) that

$$\begin{aligned} {\mathbb {E}}\left[ \max _{0\le i\le n}d_{{\mathfrak {T}}}\big (Y_0,Y_i\big )^2\right]&\le 32r^2 + 32{\mathbb {E}}\left[ \max _{0\le i\le n}d_{r}\big (Y_0,Y_i\big )^2\right] \\&\preceq r^2 + n{{\,\mathrm{{\mathbb {P}}}\,}}\big (\partial {\mathfrak {P}}(0,r)\ne \emptyset \big ) \preceq r^2+ \frac{n(\log r)^{1/3}}{r} \end{aligned}$$

for every \(r,n\ge 1\), where we applied Theorem 3.2 in the third inequality, and taking \(r=\lceil n^{1/3}(\log n)^{1/9}\rceil \) yields that

$$\begin{aligned} {\mathbb {E}}\left[ \max _{0\le i\le n}d_{{\mathfrak {T}}}\big (Y_0,Y_i\big )^2\right] \preceq n^{2/3}(\log n)^{2/9} \end{aligned}$$

for every \(n\ge 2\) as claimed. \(\square \)
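
For the reader's convenience, the choice of r in the proof above balances the two terms \(r^2\) and \(n(\log r)^{1/3}/r\): since \(\log r\asymp \log n\) in the relevant range,

$$\begin{aligned} r^2 \asymp \frac{n(\log r)^{1/3}}{r} \iff r^3 \asymp n (\log n)^{1/3} \iff r \asymp n^{1/3}(\log n)^{1/9}, \end{aligned}$$

at which point both terms are of order \(n^{2/3}(\log n)^{2/9}\).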

Remark 7

This method also gives sharp upper bounds in dimensions \(d\ge 5\): applying [36, Theorem 1.2] in place of Theorem 3.2 yields that if \(d\ge 5\), \({\mathfrak {T}}\) is the component of the origin in the uniform spanning forest of \({{\,\mathrm{{\mathbb {Z}}}\,}}^d\), and Y is a random walk on \({\mathfrak {T}}\) started at 0, then

$$\begin{aligned} {\mathbb {E}}\left[ \max _{0\le i\le n}d_{{\mathfrak {T}}}\big (Y_0,Y_i\big )^2\right] \preceq n^{2/3} \end{aligned}$$

for every \(n\ge 0\). This is stronger than the displacement upper bounds proven in [36], which were based on the results of [13].

3.3 Proof of Theorem 1.2

In this section we use all of the previous results to compute logarithmic corrections to the asymptotic behaviour of the displacement, exit times, return probabilities and range of the simple random walk on the uniform spanning tree. We will draw heavily on the methods of [50], which generalizes and synthesizes the earlier works [10, 13, 14]. Note that we must rederive all our results from the methods of [50] rather than simply quote their results since, as stated, these results do not allow for non-matching upper and lower bounds.

Remark 8

In this proof we will often use our big-O in probability notation on random variables indexed by more than one variable (e.g. n and r). When we write an expression \(X_{n,r}={\textbf{O}}(Y_{n,r})\) of this form, it means that the entire family of associated random variables indexed by both n and r is tight.

Proof of Theorem 1.2

We recall that \({\textbf{E}}^{{\mathfrak {T}}}_x\) denotes expectation with respect to the law of a simple random walk X on \({\mathfrak {T}}\) started at \(x\in {{\,\mathrm{{\mathbb {Z}}}\,}}^4\) conditional on \({\mathfrak {T}}\), and we write \({\textbf{P}}^{{\mathfrak {T}}}_x\) for the corresponding probability measure. Where clear from context, we will write \({\mathbb {P}}\) for the joint law of the uniform spanning tree and a random walk on the tree started at the origin, and \({\mathbb {E}}\) for the corresponding expectation.

Heat-kernel upper bound: [50, Proof of Proposition 3.1(a)] implies that

$$\begin{aligned} p_{2n}^{{\mathfrak {T}}}(0,0)+p_{2n+1}^{{\mathfrak {T}}}(0,0)\preceq \frac{1}{|{\mathfrak {B}}(0,R)|}\vee \frac{R}{n} \end{aligned}$$

for every \(n,R\ge 1\). Taking \(R=n^{1/3}(\log n)^{1/9}\) and applying the volume lower bound of Theorem 1.1 therefore yields that

$$\begin{aligned} p_{2n}^{{\mathfrak {T}}}(0,0)={\textbf {O}}\left( \frac{(\log n)^{1/9}}{n^{2/3}}\right) \end{aligned}$$
(48)

for every \(n \ge 2\).
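
The choice of R here balances the two terms of the preceding display: by the volume lower bound of Theorem 1.1, \(|{\mathfrak {B}}(0,R)|^{-1}\) is \({\textbf{O}}\big (R^{-2}(\log R)^{1/3}\big )\), and

$$\begin{aligned} \frac{(\log R)^{1/3}}{R^{2}} \asymp \frac{R}{n} \iff R^{3} \asymp n (\log R)^{1/3} \iff R \asymp n^{1/3}(\log n)^{1/9}, \end{aligned}$$

at which point both terms are of order \(n^{-2/3}(\log n)^{1/9}\), in agreement with (48).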

Intrinsic displacement lower bound: We have by Cauchy–Schwarz that

$$\begin{aligned} {{\,\mathrm{{\textbf{P}}}\,}}^{{\mathfrak {T}}}_0(d_{{\mathfrak {T}}}(0,X_n)\le r) = \sum _{v\in {\mathfrak {B}}(0,r)}p^{\mathfrak {T}}_n(0,v)&\le |{\mathfrak {B}}(0,r)|^{1/2} \left( \sum _{v\in {\mathfrak {B}}(0,r)}p^{\mathfrak {T}}_n(0,v)^2\right) ^{1/2} \\&\preceq |{\mathfrak {B}}(0,r)|^{1/2}\, p_{2n}^{{\mathfrak {T}}}(0,0)^{1/2}={\textbf{O}}\left( \frac{r}{(\log r)^{1/6-o(1)}} \cdot \frac{(\log n)^{1/18}}{n^{1/3}}\right) \end{aligned}$$
(49)

for every \(n,r\ge 1\), where we have applied the volume upper bound of Theorem 1.1 and the heat-kernel upper bound (48). If we take \(r = n^{1/3} (\log n)^{1/9-\delta }\) for some \(\delta >0\), then the expression appearing inside the \({\textbf{O}}\) is o(1), and, since this holds for every \(\delta >0\) (with implicit constants depending on \(\delta \)), it follows that \(d_{{\mathfrak {T}}}(0,X_n) =\varvec{\Omega }( n^{1/3} (\log n)^{1/9-o(1)})\) for every \(n\ge 2\) as claimed.
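In detail, with \(r = n^{1/3} (\log n)^{1/9-\delta }\) we have \(\log r \asymp \log n\), so the powers of n cancel and the expression inside the \({\textbf{O}}\) in (49) is

$$\begin{aligned} \frac{r}{(\log r)^{1/6-o(1)}}\cdot \frac{(\log n)^{1/18}}{n^{1/3}} = (\log n)^{\frac{1}{9}-\delta -\frac{1}{6}+\frac{1}{18}+o(1)} = (\log n)^{-\delta +o(1)} \rightarrow 0. \end{aligned}$$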

Intrinsic displacement upper bound: The estimate

$$\begin{aligned} \begin{aligned} d_{\mathfrak {T}}(X_0,X_n) \le \max _{0\le m \le n} d_{\mathfrak {T}}(X_0,X_m) = {{\textbf {O}}}\left( n^{1/3}(\log n)^{1/9}\right) \end{aligned} \end{aligned}$$

follows immediately from Theorem 1.4.

Heat-kernel lower bound: Fix \(\delta >0\) and let \(R=n^{1/3}(\log n)^{1/9+\delta }\). Using the same Cauchy–Schwarz argument as in (49), it follows from the intrinsic displacement upper bounds of Theorem 1.4 and the volume upper bounds of Theorem 1.1 that there exists \(N_\delta \) such that

$$\begin{aligned} p_{2n}^{{\mathfrak {T}}}(0,0) \ge \frac{\big (1-{{\,\mathrm{{\textbf{P}}}\,}}^{{\mathfrak {T}}}_0(d_{{\mathfrak {T}}}(0,X_n)>R)\big )^2}{|{\mathfrak {B}}(0,R)|}=\frac{1-{\textbf{o}}(1)}{{\textbf{O}}\big (R^2(\log R)^{-1/3}\big )} = \frac{1-{\textbf{o}}(1)}{{\textbf{O}}\big (n^{2/3}(\log n)^{-1/9+2\delta }\big )} \end{aligned}$$

for every \(n\ge N_\delta \), and the claim follows since \(\delta >0\) was arbitrary.

Exit time upper bound: [50, Equation 3.7] implies that

$$\begin{aligned} {\textbf{E}}^{\mathfrak {T}}_0[\tau _R]\le {\mathscr {R}}_{\textrm{eff}}(0\leftrightarrow {{\mathfrak {B}}(0,R)^c;{\mathfrak {T}}}) |{\mathfrak {B}}(0,R)|\le R|{\mathfrak {B}}(0,R)| \end{aligned}$$

for every \(R\ge 1\), and applying Theorem 1.1 yields that

$$\begin{aligned} {\textbf{E}}^{\mathfrak {T}}_0[\tau _R]={\textbf {O}}\left( \frac{R^3}{(\log R)^{1/3-o(1)}}\right) \qquad \text { and hence that } \qquad \tau _R={\textbf {O}}\left( \frac{R^3}{(\log R)^{1/3-o(1)}}\right) \end{aligned}$$

for every \(R\ge 2\).

Exit time lower bound: Fix \(R\ge 2\), let \(\beta >0\), and set \(n= R^3/(\log R)^{1/3}\). Applying Markov's inequality together with Theorem 1.4, we have

$$\begin{aligned} {{\,\mathrm{{\mathbb {P}}}\,}}(\tau _R\le \beta n)={{\,\mathrm{{\mathbb {P}}}\,}}\left( \max _{0\le i\le \beta n} d_{{\mathfrak {T}}}(o,X_i)^2\ge R^2\right) =O\left( \frac{\beta ^{2/3}n^{2/3}(\log n)^{2/9}}{R^2}\right) =O(\beta ^{2/3}), \end{aligned}$$

and so \(\tau _R=\varvec{\Omega }(R^3/(\log R)^{1/3})\). The relation \({\textbf{E}}^{{\mathfrak {T}}}_0[\tau _R]=\varvec{\Omega }(R^3/(\log R)^{1/3})\) then follows.

Extrinsic displacement upper bound: We have already established that

$$\begin{aligned} \begin{aligned} \max _{0\le m \le n} d_{\mathfrak {T}}(X_0,X_m) = {{\textbf {O}}}\left( n^{1/3}(\log n)^{1/9}\right) , \end{aligned} \end{aligned}$$

and Theorem 1.1 tells us that

$$\begin{aligned} {\mathfrak {B}}(n)\subseteq \Lambda \big (n^{1/2}(\log n)^{1/6+{\textbf{o}}(1)}\big ) \qquad \text {as }n\rightarrow \infty . \end{aligned}$$

Combining these two facts gives us

$$\begin{aligned} \max _{0\le m \le n} \left\Vert X_m\right\Vert _\infty ={\textbf{O}}\bigl (n^{\frac{1}{6}}(\log n)^{\frac{2}{9}+ o(1)} \bigr ) \qquad \text {as }n\rightarrow \infty , \end{aligned}$$

as required.
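
For concreteness, the exponent 2/9 arises by substituting the intrinsic bound \(n^{1/3}(\log n)^{1/9}\) for the radius in the extrinsic inclusion (using that \(\log m \asymp \log n\) for \(m= n^{1/3}(\log n)^{1/9}\)):

$$\begin{aligned} \big (n^{1/3}(\log n)^{1/9}\big )^{1/2}\,(\log n)^{1/6+{\textbf{o}}(1)} = n^{1/6}(\log n)^{\frac{1}{18}+\frac{1}{6}+{\textbf{o}}(1)} = n^{1/6}(\log n)^{2/9+{\textbf{o}}(1)}. \end{aligned}$$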

Extrinsic displacement lower bound: Let \(R\ge 1\). Exploiting the tree structure of \({\mathfrak {T}}\), we note that if \(\max _{m\le n}\left\Vert X_m\right\Vert _\infty \le R\), then \(\Gamma (0,X_n)\subseteq \Lambda (R)\). Thus, arguing as in (49), we have that

$$\begin{aligned} {{\,\mathrm{{\textbf{P}}}\,}}^{{\mathfrak {T}}}\big (\max _{m\le n} \left\Vert X_m\right\Vert _\infty \le R\big )&\preceq |\{x\in {{\,\mathrm{{\mathbb {Z}}}\,}}^d:\Gamma (0,x)\subseteq \Lambda ( R)\}|^{1/2} p_{2n}^{{\mathfrak {T}}}(o,o)^{1/2}\\&={\textbf {O}}\left( \frac{R^2}{(\log R)^{1/2}}\cdot \frac{(\log n)^{1/18}}{n^{1/3}}\right) , \end{aligned}$$

where we have applied Proposition 2.5 and the heat-kernel upper bound (48) in the last line. This implies that \(\max _{m\le n} \left\Vert X_m\right\Vert _\infty = \varvec{\Omega }(n^{1/6}(\log n)^{2/9})\) as claimed.

Range upper bound: For \(n\ge 1\), let \(D_n = \max _{0\le i\le n} d_{{\mathfrak {T}}}(0,X_i)\). Applying the displacement upper bound and the volume upper bounds of Theorem 1.1, we have that

$$\begin{aligned} |\{X_m:0\le m \le n\}|\le \left| {\mathfrak {B}}(D_n)\right| =\left| {\mathfrak {B}}({\textbf{O}}(n^{1/3}(\log n)^{1/9}))\right| ={\textbf{O}}\left( \frac{n^{2/3}}{(\log n)^{1/9-o(1)}}\right) \end{aligned}$$

as \(n\rightarrow \infty \) as required.

Range lower bound: Fix \(R\ge 1\), \(\delta >0\) and write \({\mathfrak {B}}={\mathfrak {B}}(R)\). Let \(g_R(x,y)=(\deg _{\mathfrak {T}}y)^{-1} {\textbf{E}}_x^{{\mathfrak {T}}}[{\sum _{0\le i \le \tau _R} \mathbb {1}(X_i=y)}]\) and let \(p(y)=g_R(0,y)/g_R(y,y)\) be the probability that a random walk started at \(0\in {\mathfrak {T}}\) hits y before exiting \({\mathfrak {B}}\). For each \(y\in {\mathfrak {B}}^\prime :={\mathfrak {B}}(\lfloor R/(\log R)^\delta \rfloor )\), we have \({\mathscr {R}}_{\textrm{eff}}(0\leftrightarrow y;{\mathfrak {T}})\le R/(\log R)^\delta \), so that if the event \(A=\{{\mathscr {R}}_{\textrm{eff}}(0\leftrightarrow {\mathfrak {B}}^c;{\mathfrak {T}})\ge R/(\log R)^{\delta /2}\}\) holds then

$$\begin{aligned} \inf _{y\in {\mathfrak {B}}^\prime }{\mathscr {R}}_{\textrm{eff}}(y\leftrightarrow {\mathfrak {B}}^c;{\mathfrak {T}})&\ge \inf _{y\in {\mathfrak {B}}^\prime } \big [{\mathscr {R}}_{\textrm{eff}}(0\leftrightarrow {\mathfrak {B}}^c;{\mathfrak {T}})-{\mathscr {R}}_{\textrm{eff}}(0\leftrightarrow y;{\mathfrak {T}})\big ]\\&\ge \frac{R}{(\log R)^{\delta /2}}-\frac{R}{(\log R)^\delta }=\Omega \left( \frac{R}{(\log R)^{\delta /2}}\right) . \end{aligned}$$

Now for each \(y\in {\mathfrak {B}}\) we have the following inequality, which was derived for general graphs in [50, Proof of Proposition 3.2(b)]:

$$\begin{aligned} |1-p(y)|^2\le {\mathscr {R}}_{\textrm{eff}}(0\leftrightarrow y;{\mathfrak {T}}){\mathscr {R}}_{\textrm{eff}}(y\leftrightarrow {\mathfrak {B}}^c;{\mathfrak {T}})^{-1}. \end{aligned}$$

Taking the supremum over \(y\in {\mathfrak {B}}^\prime \subset {\mathfrak {B}}\) yields

$$\begin{aligned} \sup _{y\in {\mathfrak {B}}^\prime }|1-p(y)|^2\le \frac{R}{(\log R)^\delta }\cdot \sup _{y\in {\mathfrak {B}}^\prime }{\mathscr {R}}_{\textrm{eff}}(y\leftrightarrow {\mathfrak {B}}^c;{\mathfrak {T}})^{-1}=O((\log R)^{-\delta /2}) \end{aligned}$$

on the event A. For each \(R\ge 1\), consider the random variable \(U_R=|\{X_i:0\le i \le \tau _R\}\cap {\mathfrak {B}}^\prime |\). Then

$$\begin{aligned} \begin{aligned}{}&{} {{\textbf {E}}}_0^{{\mathfrak {T}}}{[U_R]}\ge {{\textbf {E}}}^{{\mathfrak {T}}}_0\Big [\sum _{x\in {\mathfrak {B}}^\prime }\mathbb {1}(X \text{ hits } x \text{ before } \text{ exiting } {\mathfrak {B}})\Big ] \\{}&{} \quad =\sum _{y\in {\mathfrak {B}}^\prime }p(y)\ge \mathbb {1}(A)(1-O((\log R)^{-\delta /4}))|{\mathfrak {B}}^\prime |. \end{aligned} \end{aligned}$$
(50)

Now

$$\begin{aligned} {{\,\mathrm{{\mathbb {P}}}\,}}\left( \frac{U_R}{|{\mathfrak {B}}^\prime |}\le \frac{1}{2}\right)&\le {\mathbb {E}}\left[ {{\,\mathrm{{\textbf{P}}}\,}}^{{\mathfrak {T}}}\left( A,\,\frac{U_R}{|{\mathfrak {B}}^\prime |}\le \frac{1}{2}\right) \right] +{{\,\mathrm{{\mathbb {P}}}\,}}(A^c)\\&={\mathbb {E}}\left[ {{\,\mathrm{{\textbf{P}}}\,}}^{{\mathfrak {T}}}\left( \mathbb {1}(A) \Big (1-\frac{U_R}{|{\mathfrak {B}}^\prime |}\Big )\ge \frac{1}{2}\right) \right] +{{\,\mathrm{{\mathbb {P}}}\,}}(A^c), \end{aligned}$$

and so applying (50) with Markov’s inequality to the conditional probability inside the expectation gives

$$\begin{aligned} \begin{aligned} {{\,\mathrm {{\mathbb {P}}}\,}}\left( \frac{U_R}{|{\mathfrak {B}}^\prime |}\le 1/2\right) \le O((\log R)^{-\delta /4}){{\,\mathrm {{\mathbb {P}}}\,}}(A)+ {{\,\mathrm {{\mathbb {P}}}\,}}(A^c)=o(1) \end{aligned}\end{aligned}$$

as \(R\rightarrow \infty \), where the fact that \({{\,\mathrm{{\mathbb {P}}}\,}}(A^c)\rightarrow 0\) as \(R\rightarrow \infty \) follows from Theorem 1.3. The claim follows since \(|{\mathfrak {B}}'|=\varvec{\Omega }(R^2(\log R)^{-1/3-2\delta })\), \(\tau _R={\textbf{O}}(R^3(\log R)^{-1/3+o(1)})\), and \(\delta >0\) was arbitrary. \(\square \)