1 Introduction

We consider the Abelian sandpile model on the nearest neighbour lattice \(\mathbb {Z}^d\); see Sect. 1.1 for definitions and background. Let \(\mathbf {P}\) denote the weak limit of the stationary distributions \(\mathbf {P}_L\) in finite boxes \([-L,L]^d \cap \mathbb {Z}^d\). Let \(\eta \) denote a sample configuration from the measure \(\mathbf {P}\). Let \(p_d(i) = \mathbf {P}[\eta (o) = i]\), \(i = 0, \dots , 2d-1\), denote the height probabilities at the origin in d dimensions. Our main result, stated in the following theorem, gives the asymptotic form of these probabilities as \(d \rightarrow \infty \).

Theorem 1.1

  (i) For \(0 \le i \le d^{1/2}\), we have

    $$\begin{aligned} p_d(i) = \sum _{j=0}^i \frac{e^{-1} \frac{1}{j!}}{2d-j} + O\Big (\frac{i}{d^2}\Big ) = \frac{1}{2d} \sum _{j=0}^i e^{-1} \frac{1}{j!} + O\Big (\frac{i}{d^2}\Big ). \end{aligned}$$
    (1.1)
  (ii) If \(d^{1/2} < i \le 2d-1\), we have

    $$\begin{aligned} p_d(i) = p_d(d^{1/2}) + O(d^{-3/2}). \end{aligned}$$

    In particular, \(p_d(i) \sim (2d)^{-1}\), if \(i,d \rightarrow \infty \).
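For a quick numerical feel for (1.1), the two leading terms can be evaluated directly. The following is only an illustrative sketch: the function names are ours, and the \(O(i/d^2)\) error term of the theorem is not modelled.

```python
from math import exp, factorial

def leading_term_exact_denominators(d, i):
    """First form in (1.1): sum over j <= i of e^{-1}/j! divided by (2d - j)."""
    return sum(exp(-1) / factorial(j) / (2 * d - j) for j in range(i + 1))

def leading_term_poisson_cdf(d, i):
    """Second form in (1.1): Poisson(1) distribution function at i, divided by 2d."""
    return sum(exp(-1) / factorial(j) for j in range(i + 1)) / (2 * d)

if __name__ == "__main__":
    d = 32
    for i in range(6):  # i <= d^{1/2} for d = 32
        a = leading_term_exact_denominators(d, i)
        b = leading_term_poisson_cdf(d, i)
        print(i, a, b, a - b)
```

The difference between the two forms stays within the stated error order, which is why either one may be used in part (i).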

The appearance of the \(\mathsf {Poisson}(1)\) distribution in the above formula is closely related to the result of Aldous [1] that the degree distribution of the origin in the uniform spanning forest in \(\mathbb {Z}^d\) tends to 1 plus a \(\mathsf {Poisson}(1)\) random variable as \(d \rightarrow \infty \). Indeed our proof of (1.1) is achieved by showing that in the uniform spanning forest of \(\mathbb {Z}^d\), the number of neighbours w of the origin o, such that the unique path from w to infinity passes through o is asymptotically the same as the degree of o minus 1, that is, \(\mathsf {Poisson}(1)\).

In [11], we compared the formula (1.1) to numerical simulations in \(d = 32\) on a finite box with \(L = 128\), and there is excellent agreement with the asymptotics already for these values.

Other graphs where information on the height distribution is available are as follows. Dhar and Majumdar [7] studied the Abelian sandpile model on the Bethe lattice, and the exact expressions for various distribution functions including the height distribution at a vertex were obtained using combinatorial methods. For the single-site height distribution they obtained (see [7, Eqn. (8.2)])

$$\begin{aligned} p_{\mathrm {Bethe},d}(i) = \frac{1}{(d^2 - 1) \, d^d} \sum _{j = 0}^i \left( {\begin{array}{c}d+1\\ j\end{array}}\right) (d-1)^{d-j+1}. \end{aligned}$$

If one lets the degree \(d \rightarrow \infty \) in this formula, one obtains the form in the right hand side of (1.1) for any fixed i (with 2d replaced by d).
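The Bethe lattice formula above is easy to evaluate exactly. The sketch below (function names are ours) assumes that the heights range over \(i = 0, \dots , d\), which is the range on which the displayed expression sums to one, and compares it with the limiting form \(\frac{1}{d} \sum _{j \le i} e^{-1}/j!\).

```python
from fractions import Fraction
from math import comb, exp, factorial

def p_bethe(d, i):
    """Exact value of the displayed Bethe lattice formula, as a rational number."""
    total = sum(comb(d + 1, j) * (d - 1) ** (d - j + 1) for j in range(i + 1))
    return Fraction(total, (d * d - 1) * d ** d)

def limit_form(d, i):
    """Right hand side of (1.1) with 2d replaced by d, without error term."""
    return sum(exp(-1) / factorial(j) for j in range(i + 1)) / d

if __name__ == "__main__":
    for d in (10, 100, 1000):
        print(d, [float(p_bethe(d, i)) / limit_form(d, i) for i in range(4)])
```

The printed ratios approach 1 as d grows, for each fixed i.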

Exact expressions for the distribution of height probabilities were derived by Papoyan and Shcherbakov [20] on the Husimi lattice of triangles with an arbitrary coordination number q. However, on d-dimensional cubic lattices with \(d\ge 2\), exact results for the height probability are only known for \(d = 2\); see [13, 14, 18, 21, 22].

1.1 Definitions and Background

Sandpiles are a lattice model of self-organized criticality, introduced by Bak, Tang and Wiesenfeld [3] and have been studied in both physics and mathematics. See the surveys [6, 9, 10, 15, 23]. Although the model can easily be defined on an arbitrary finite connected graph, in this paper we will restrict to subsets of \(\mathbb {Z}^d\).

Let \(V_L = [-L,L]^d \cap \mathbb {Z}^d\) be a box of radius L, where \(L \ge 1\). For simplicity, we suppress the d-dependence in our notation. We let \(G_L = (V_L \cup \{s \},E_L)\) denote the graph obtained from \(\mathbb {Z}^d\) by identifying all vertices in \(\mathbb {Z}^d {\setminus } V_L\) to a single vertex s, and removing loop-edges at s. We call s the sink. A sandpile \(\eta \) is a collection of indistinguishable particles on \(V_L\), specified by a map \(\eta : V_L \rightarrow \{ 0, 1, 2, \dots \}\).

We say that \(\eta \) is stable at \(x\in V_L\), if \(\eta (x) < 2d\). We say that \(\eta \) is stable, if \(\eta (x) < 2d\), for all \(x\in V_L\). If \(\eta \) is unstable (i.e. \(\eta (x) \ge 2d\) for some \(x\in V_L\)), then any such x is allowed to topple, which means that x passes one particle along each edge to its neighbours. When the vertex x topples, the particles are re-distributed as follows:

$$\begin{aligned} \begin{aligned}&\eta (x) \rightarrow \eta (x) - 2d; \\&\eta (y) \rightarrow \eta (y) + 1, \quad y \in V_L, y \sim x. \end{aligned} \end{aligned}$$

Particles arriving at s are lost, so we do not keep track of them. Toppling a vertex may generate further unstable vertices. Given a sandpile \(\xi \) on \(V_L\), we define its stabilization

$$\begin{aligned} \xi ^\circ \in \Omega _L := \{\hbox {all stable sandpiles on}\; V_L\} = \{0,1,\dots ,2d-1\}^{V_L} \end{aligned}$$

by carrying out all possible topplings, in any order, until a stable sandpile is reached. It was shown by Dhar [5] that the map \(\xi \mapsto \xi ^\circ \) is well-defined, that is, the order of topplings does not matter.
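The toppling rule and the stabilization map can be sketched in a few lines. This is a minimal illustration and not an optimized simulator: the names `box` and `stabilize` are ours, a sandpile is stored as a dictionary from sites of \(V_L\) (coordinate tuples) to heights, and particles sent outside \(V_L\) are simply dropped, which plays the role of the sink.

```python
from itertools import product

def box(L, d):
    """All sites of V_L = [-L, L]^d as coordinate tuples."""
    return list(product(range(-L, L + 1), repeat=d))

def stabilize(xi, L, d):
    """Return the stabilization of xi, a dict defined on every site of V_L."""
    eta = dict(xi)
    stack = [x for x in eta if eta[x] >= 2 * d]
    while stack:
        x = stack.pop()
        if eta[x] < 2 * d:
            continue  # stale entry: x is stable at this moment
        eta[x] -= 2 * d  # topple x once
        if eta[x] >= 2 * d:
            stack.append(x)
        for axis in range(d):
            for step in (-1, 1):
                y = x[:axis] + (x[axis] + step,) + x[axis + 1:]
                if y in eta:  # particles leaving V_L go to the sink and are lost
                    eta[y] += 1
                    if eta[y] >= 2 * d:
                        stack.append(y)
    return eta

if __name__ == "__main__":
    xi = {x: 0 for x in box(1, 2)}
    xi[(0, 0)] = 4
    print(stabilize(xi, 1, 2))
```

The order in which unstable sites are taken from the stack is arbitrary; by the result of Dhar [5] quoted above, the output does not depend on it.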

We now define the sandpile Markov chain. The state space is the set of stable sandpiles \(\Omega _L\). Fix a positive probability distribution p on \(V_L\), i.e. \(\sum _{x\in V_L} p(x) = 1 \) and \(p(x) > 0\) for all \(x\in V_L\). Given the current state \(\eta \in \Omega _L\), choose a random vertex \(X \in V_L\) according to p, add one particle at X and stabilize. The one-step transition of the Markov chain moves from \(\eta \) to \((\eta + \mathbf {1}_X)^{\circ }\). The sandpile Markov chain on \(G_L\) has only one recurrent class [5]. We denote the set of recurrent sandpiles by \(\mathcal {R}_L\). It is known [5] that the invariant distribution \(\mathbf {P}_{L}\) of the Markov chain is the uniform distribution on \(\mathcal {R}_L\).

Majumdar and Dhar [19] gave a bijection between \(\mathcal {R}_L\) and spanning trees of \(G_L\). This maps the uniform measure \(\mathbf {P}_L\) on \(\mathcal {R}_L\) to the uniform spanning tree measure \(\mathsf {UST}_L\). A variant of this bijection was introduced by Priezzhev [22] and is described in more generality in [8, 12]. The latter bijection enjoys the following property that we will exploit in this paper. Orient the spanning tree towards s, and let \(\pi _L(x)\) denote the oriented path from a vertex x to s. Let

$$\begin{aligned} W_L = \{ x \in V_L : o \in \pi _L(x) \}. \end{aligned}$$

Then, we have that

$$\begin{aligned}&\text {conditional on } \deg _{W_L}(o) = i, \text { the height } \eta (o)\text { is uniformly}\nonumber \\&\text {distributed over the values } i, i+1, \dots , 2d-1. \end{aligned}$$
(1.2)

This has the following consequence for the height probabilities. Let \(q^L(i) = \mathsf {UST}_L [ \deg _{W_L}(o) = i ]\), \(i = 0, \dots , 2d-1\). Then,

$$\begin{aligned} p^L(i) := \mathbf {P}_L [ \eta (o) = i ] = \sum _{j=0}^{i} \frac{q^L(j)}{2d - j}. \end{aligned}$$

The measures \(\mathbf {P}_L\) have a weak limit \(\mathbf {P}= \lim _{L \rightarrow \infty } \mathbf {P}_L\) [2], and hence the limits \(p(i) = \lim _{L \rightarrow \infty } p^L(i)\) exist for \(i = 0, \dots , 2d-1\). Although the \(q^L(i)\) depend on the non-local variable \(W_L\), the limits \(q(i) = \lim _{L \rightarrow \infty } q^L(i)\) also exist for \(i = 0, \dots , 2d-1\); see [12]. In fact, q(i) is given by the following natural analogue of its finite volume definition. Consider the uniform spanning forest measure \(\mathsf {USF}\) on \(\mathbb {Z}^d\), defined as the weak limit of \(\mathsf {UST}_L\); see [16, Chapter 10]. Let \(\pi (x)\) denote the unique infinite self-avoiding path in the spanning forest starting at x, and let

$$\begin{aligned} W = \{ x \in \mathbb {Z}^d : o \in \pi (x) \}. \end{aligned}$$

Then, \(q(i) = \mathsf {USF}[ \deg _{W}(o) = i ]\), \(i = 0, \dots , 2d-1\).

Therefore, we have

$$\begin{aligned} p(i) := \mathbf {P}[ \eta (o) = i ] = \sum _{j=0}^{i} \frac{q(j)}{2d - j}. \end{aligned}$$
(1.3)
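Formula (1.3) turns any distribution q on \(\{0, \dots , 2d-1\}\) into height probabilities, and the result is automatically a probability distribution, since each q(j) is spread uniformly over the \(2d-j\) values \(j, \dots , 2d-1\). A small sketch with exact rational arithmetic (the function name and the sample inputs are ours):

```python
from fractions import Fraction

def heights_from_degrees(q, d):
    """Apply (1.3): p(i) = sum over j <= i of q(j) / (2d - j), for i = 0..2d-1."""
    if len(q) != 2 * d:
        raise ValueError("q must have exactly 2d entries")
    p, running = [], Fraction(0)
    for j in range(2 * d):
        running += Fraction(q[j]) / (2 * d - j)
        p.append(running)
    return p

if __name__ == "__main__":
    d = 2
    q = [Fraction(1, 4)] * 4  # an arbitrary test distribution, not the true q
    p = heights_from_degrees(q, d)
    print(p, sum(p))
```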

1.2 Wilson’s Method

Given a finite path \(\gamma = [s_0, s_1, \ldots , s_k] \) in \(\mathbb {Z}^d\), we erase loops from \(\gamma \) chronologically, as they are created. We trace \(\gamma \) until the first time t, if any, when \(s_t \in \{s_0, s_1, \ldots , s_{t-1}\}\), i.e. there is a loop. We suppose \(s_t = s_i\), for some \(i \in \{0,1,\ldots ,t-1\}\) and remove the loop \([s_i,s_{i+1},\ldots ,s_t=s_i]\). Then, we continue tracing \(\gamma \) and follow the same procedure to remove loops until there are no more loops to remove. This gives the loop-erasure \(\pi = LE(\gamma )\) of \(\gamma \), which is a self-avoiding path [17]. If \(\gamma \) is generated from a random walk process, the loop-erasure of \(\gamma \) is called the loop-erased random walk (LERW).
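The chronological loop-erasure described above can be written as a short routine. This is a sketch under the assumption that vertices are hashable labels (integers or coordinate tuples); the name `loop_erase` is ours.

```python
def loop_erase(path):
    """Chronological loop-erasure LE(path) of a finite path, as a list of vertices."""
    erased = []    # current self-avoiding path
    position = {}  # vertex -> its index in `erased`
    for v in path:
        if v in position:
            # v closes a loop: remove everything after the earlier visit to v
            k = position[v]
            for u in erased[k + 1:]:
                del position[u]
            del erased[k + 1:]
        else:
            position[v] = len(erased)
            erased.append(v)
    return erased

if __name__ == "__main__":
    print(loop_erase([0, 1, 2, 1, 3]))  # the loop [1, 2, 1] is removed
```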

When \(d \ge 3\), the \(\mathsf {USF}\) on \(\mathbb {Z}^d\) can be sampled via Wilson’s method rooted at infinity [4, 16, Section 10], which is described as follows. Let \(s_1, s_2,\dots \) be an arbitrary enumeration of the vertices and let \(\mathcal {T}_0\) be the empty forest with no vertices. We start a simple random walk \(\gamma _n\) at \(s_n\) and \(\gamma _n\) stops when \(\mathcal {T}_{n-1}\) is hit, otherwise we let it run indefinitely. \(LE(\gamma _n)\) is attached to \(\mathcal {T}_{n-1}\), and the resulting forest is denoted by \(\mathcal {T}_n\). We continue the same procedure until all the vertices are visited. The above gives a random sequence of forests \(\mathcal {T}_1 \subset \mathcal {T}_2 \subset \dots \), where \(\mathcal {T}= \cup _{n} \mathcal {T}_n \) is a spanning forest of \(\mathbb {Z}^d\). The extension of Wilson’s theorem [24] to transient infinite graphs proved in [4] implies that \(\mathcal {T}\) is distributed as the \(\mathsf {USF}\).
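Wilson’s method rooted at infinity cannot be run verbatim on a computer, but its finite analogue, on a finite connected graph with a chosen root (for instance the sink of \(G_L\)), follows the same steps. The sketch below assumes the graph is given as a dictionary of adjacency lists; the function name and this representation are ours. Instead of storing each walk and erasing loops afterwards, it records the last exit from every visited vertex, which yields the same loop-erased path.

```python
import random

def wilson_tree(adj, root, rng):
    """Sample a spanning tree of a finite connected graph, oriented towards root.

    adj: dict mapping each vertex to a list of its neighbours.
    Returns a dict mapping each vertex to its parent (None for the root).
    """
    parent = {root: None}
    for start in adj:
        # Random walk from `start` until the current tree is hit,
        # remembering only the most recent step taken from each vertex.
        last_exit = {}
        v = start
        while v not in parent:
            last_exit[v] = rng.choice(adj[v])
            v = last_exit[v]
        # Following the last exits from `start` traces the loop-erased path.
        v = start
        while v not in parent:
            parent[v] = last_exit[v]
            v = last_exit[v]
    return parent

if __name__ == "__main__":
    cycle = {k: [(k - 1) % 6, (k + 1) % 6] for k in range(6)}
    print(wilson_tree(cycle, 0, random.Random(1)))
```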

2 Proof of the Main Theorem

Let \((S_n^x)_{n\ge 0}\) be a simple random walk started at x (independent between x’s on \(\mathbb {Z}^d\)) and let \(\pi (x)\) be the path in the USF from x to infinity. We introduce the events:

$$\begin{aligned} \begin{aligned}&E_i = \Big \{ |\{ w \sim o : \pi (w)\text { passes through } o\}| = i \Big \}, \quad i = 0, \dots , 2d-1;\\&E_i(x_1,x_2,\dots ,x_i) = \Big \{ \{ w \sim o : \pi (w) \text { passes through } o\} = \{x_1, x_2,\dots ,x_i\} \Big \}. \end{aligned} \end{aligned}$$

Then, recall that

$$\begin{aligned} q_d(i) = \mathbf {P}[\deg _W(o) = i] = \mathbf {P}[E_i] = \sum _{\begin{array}{c} x_1,\dots ,x_i \sim o \\ \text {distinct} \end{array}} \mathbf {P}[E_i(x_1,\dots ,x_i)]. \end{aligned}$$
(2.1)

2.1 Preliminary

Lemma 2.1

We have \(\mathbf {P}[S^o_n = o\text { for some } n \ge 2] = O(1/d)\) and \(\mathbf {P}[S^o_n = o\text { for some } n \ge 4] = O(1/d^2)\), as \(d \rightarrow \infty \).

Proof

Let \({\hat{D}}(k) = \frac{1}{d} \sum _{j=1}^{d} \cos (k_j)\), \( k \in [-\pi ,\pi ]^d\), be the Fourier transform in d dimensions of the one-step distribution of the random walk. Lemma A.3 in [17] states that for all non-negative integers n and all \(d\ge 1\), we have

$$\begin{aligned} \Vert {\hat{D}}^n\Vert _1 = (2\pi )^{-d}\int _{[-\pi ,\pi ]^d}|{\hat{D}}(k)^n|d^dk \le \left( \frac{\pi d}{4n}\right) ^{d/2}. \end{aligned}$$

Based on the above, we have

$$\begin{aligned} \begin{aligned} \mathbf {P}[S^o_n = o\text { for some } n \ge 4]&\le \frac{1}{(2\pi )^d} \sum _{n=4}^{\infty } \int {\hat{D}}^n(k)dk \\&\le \frac{1}{(2\pi )^d} \sum _{n=4}^{d-1} \int {\hat{D}}^n(k)dk + \sum _{n=d}^{\infty } \Big (\frac{\pi d}{4n}\Big )^{d/2}. \end{aligned} \end{aligned}$$
(2.2)

Since \(\int {\hat{D}}^4(k) dk\) and \(\int {\hat{D}}^6(k)dk\) give (up to the normalization \((2\pi )^{-d}\)) the probabilities that \(S^o\) is at o after 4 and 6 steps, respectively, counting the number of ways to return shows that they are bounded by dimension-independent multiples of \(1/d^2\) and \(1/d^3\), respectively. We have \(\int {\hat{D}}^n(k) dk = 0\) for odd n, and for even n with \(6 < n \le d-1\), we have \(\int {\hat{D}}^n(k)dk \le \int {\hat{D}}^6(k)dk\). Hence,

$$\begin{aligned} \frac{1}{(2\pi )^d} \int {\hat{D}}^n(k)dk = O\Big (\frac{1}{d^3}\Big ), \quad 6 \le n \le d-1. \end{aligned}$$

The last sum in (2.2) can be bounded as:

$$\begin{aligned} \begin{aligned} \Big (\frac{\pi d}{4}\Big )^{d/2} \sum _{n=d}^{\infty } n^{-d/2}&\le \Big (\frac{\pi d}{4}\Big )^{d/2} \int _{d-1}^\infty x^{-d/2} dx = \Big (\frac{\pi d}{4}\Big )^{d/2} \frac{(d-1)^{1-\frac{d}{2}}}{d/2-1}\\&= \Big (\frac{d-1}{d/2-1}\Big )\Big (\frac{d}{d-1}\Big )^{\frac{d}{2}} \Big (\frac{\pi }{4}\Big )^{\frac{d}{2}} \le Ce^{-cd}, \end{aligned} \end{aligned}$$

since we can take \(d > 4\) and \(\frac{\pi }{4} < 1\).

Hence, we have the required results

$$\begin{aligned} \begin{aligned} \mathbf {P}[S^o_n = o\text { for some } n \ge 4]&\le \int {\hat{D}}^4(k) dk + d \int {\hat{D}}^6(k)dk + C e^{-cd}\\&= O\Big (\frac{1}{d^2}\Big ) + d\times O\Big (\frac{1}{d^3}\Big ) = O\Big (\frac{1}{d^2}\Big ),\\ \mathbf {P}[S^o_n = o\text { for some } n \ge 2]&\le \Big (\frac{1}{2d}\Big ) + \mathbf {P}[S^o_n = o\text { for some } n \ge 4] = O\Big (\frac{1}{d}\Big ). \end{aligned} \end{aligned}$$

\(\square \)
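The counting behind the bounds on the 4-step and 6-step return probabilities can be checked by exact enumeration. The sketch below (function names are ours) counts closed n-step walks at o through the multinomial formula: a closed walk of length 2m makes \(m_j\) steps in each of the directions \(\pm e_j\) with \(m_1 + \dots + m_d = m\), which gives \((2m)! / \prod _j (m_j!)^2\) orderings.

```python
from fractions import Fraction
from math import factorial

def closed_walks(d, n):
    """Number of n-step nearest neighbour walks on Z^d from o back to o."""
    if n % 2:
        return 0
    m = n // 2
    # One coordinate contributes the series sum_k x^k / (k!)^2 (k = steps in +e_j).
    one_axis = [Fraction(1, factorial(k) ** 2) for k in range(m + 1)]
    series = [Fraction(1)] + [Fraction(0)] * m
    for _ in range(d):  # multiply the d coordinate series, truncated at x^m
        series = [sum(series[a] * one_axis[k - a] for a in range(k + 1))
                  for k in range(m + 1)]
    return int(series[m] * factorial(n))

def return_probability(d, n):
    """P[S_n^o = o] for simple random walk on Z^d, as an exact fraction."""
    return Fraction(closed_walks(d, n), (2 * d) ** n)

if __name__ == "__main__":
    for d in (4, 16, 64):
        print(d, float(d ** 2 * return_probability(d, 4)),
              float(d ** 3 * return_probability(d, 6)))
```

The rescaled values printed for n = 4 and n = 6 remain bounded as d grows, in line with the \(O(1/d^2)\) and \(O(1/d^3)\) bounds used in the proof.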

2.2 Lower Bounds

Let us fix the vertices \(x_1, \dots , x_i \sim o\). Let

$$\begin{aligned} A_0 = \Big \{ S_1^o \not \in \{ x_1, \dots , x_i \},\, S_n^o \not \in \mathcal {N} \text { for } n \ge 2 \Big \}, \end{aligned}$$

where \(\mathcal {N} = \{ y \in \mathbb {Z}^d : |y| \le 1\}\).

Lemma 2.2

We have \(\mathbf {P}[A_0] \ge 1- O(i/d).\)

Proof

$$\begin{aligned} \mathbf {P}[A_0] = \mathbf {P}[S_1^o \ne x_1,\dots ,x_i] \mathbf {P}[S_n^o \not \in \mathcal {N}\text { for } n \ge 2 | S_1^o \ne x_1,\dots ,x_i]. \end{aligned}$$

We have \(\mathbf {P}[S_1^o \ne x_1, \dots , x_i] = 1-O(i/d)\) and the probability for the remaining steps is at least \(1-O(1/d)\), shown as follows. The probabilities \(\mathbf {P}[S_2^o \ne o | S_1^o \ne x_1,\dots ,x_i]\) and \(\mathbf {P}[S_3^o \not \in \mathcal {N} | S_2^o \ne o, S_1^o \ne x_1,\dots ,x_i]\) are both equal to \(1-O(1/d)\). Consider the simple random walk starting at the position \(S_3^o\): it can hit at most three neighbours of o in two further steps, and the remaining neighbours need at least 4 steps to hit, so, by Lemma 2.1, we have

$$\begin{aligned} \begin{aligned} \sum _{\text {at most } 3\text { neighbours } x_j} \sum _{k\ge 1} P_{2k}(S_3^o,x_j)&\le O\left( \frac{1}{d}\right) , \\ \sum _{\text {the remaining neighbours } x_{j'}} \sum _{k\ge 2} P_{2k}(S_3^o,x_{j'})&\le O(d)O\left( \frac{1}{d^2}\right) = O\left( \frac{1}{d}\right) , \end{aligned} \end{aligned}$$

since \(P_{2k}(x,y) \le P_{2k}(o,o)\) for all x, y. Therefore, combining the above results, we get \(\mathbf {P}[S_n^o \not \in \mathcal {N}\text { for } n \ge 2 | S_1^o \ne x_1,\dots ,x_i] \ge 1 - O(1/d)\) as required. \(\square \)

Let us label the neighbours of o different from \(x_1, \dots , x_i\) as \(x_{i+1}, \dots , x_{2d}\), in any order. On the event \(A_0\), the first step of \(\pi (o)\) is to a neighbour of o in \(\{x_{i+1},\dots ,x_{2d}\}\) and we may assume \(x_{2d}\) to be the first step of \(\pi (o)\). Then, \(\pi (o)\) does not visit other vertices in \(\mathcal {N} \backslash \{o\}\). Define \(A_j =\{S_1^{x_j} = o\}\) for \(j = 1,2,\dots ,i\) and then \(\mathbf {P}[A_j] = 1/2d\).

Using Wilson’s algorithm, consider random walks first started at \(o, x_1, \dots , x_i\) and then started at \(x_{i+1},\dots ,x_{2d-1}\). We obtain the following:

$$\begin{aligned} \begin{aligned} \mathbf {P}[E_i(x_1,\dots ,x_i)]&\ge \mathbf {P}[A_0] \times \prod _{j=1}^i \mathbf {P}[A_j] \times \mathbf {P}[E_i(x_1,..,x_i)|A_0 \cap A_1 \cap \dots \cap A_i] \\&\ge \Big (1-O\Big (\frac{i}{d}\Big )\Big )\Big (\frac{1}{2d}\Big )^i \mathbf {P}[E_i(x_1,..,x_i)|A_0 \cap A_1 \cap \dots \cap A_i]. \end{aligned}\nonumber \\ \end{aligned}$$
(2.3)

Define \(B_k = \{S_1^{x_k} \ne o, S_n^{x_k} \not \in \{x_1,\dots ,x_i\} \text { for } n\ge 2\}\) for \(k = i+1,\dots ,2d-1\).

Lemma 2.3

\(\mathbf {P}[B_k] \ge 1 - 1/2d - O(i/d^2)\), where \(i+1 \le k \le 2d-1\).

Proof

We have \(\mathbf {P}[S_1^{x_k} \ne o] = 1 - 1/2d\). If the first step is not to o, the first step could be in one of the \(e_1,\dots ,e_i\) directions, say \(e_j\), with probability i / 2d. Then, the probability to hit \(x_j\) is \(1/2d + O(1/d^2)\). Hence, the probability that \(S^{x_k}\) hits \(\{x_1,\dots ,x_i\}\) is \(O(i/d^2)\). \(\square \)

Lemma 2.4

\(q_d(i) \ge e^{-1} \frac{1}{i!}\left( 1+O\left( \frac{i^2}{d}\right) \right) .\)

Proof

By (2.3), we have

$$\begin{aligned} \mathbf {P}[E_i(x_1,\dots ,x_i)] \ge \Big (1- O\Big (\frac{i}{d}\Big )\Big )\Big (\frac{1}{2d}\Big )^i \Big (1-\frac{1}{2d} + O\Big (\frac{i}{d^2}\Big )\Big )^{2d-1-i}. \end{aligned}$$

Then, by (2.1),

$$\begin{aligned} \begin{aligned} q_d(i)&\ge {2d \atopwithdelims ()i}\Big (1- O\Big (\frac{i}{d}\Big )\Big )\Big (\frac{1}{2d}\Big )^i \Big (1-\frac{1}{2d}+O\Big (\frac{i}{d^2}\Big )\Big )^{2d-1-i}\\&=\frac{2d(2d-1)\dots (2d-i+1)}{i!(2d)^i}\Big (1-O\Big (\frac{i}{d}\Big )\Big )\\&\quad \Big (1-\frac{1}{2d}+O\Big (\frac{i}{d^2}\Big )\Big )^{2d} \Big (1+O\Big (\frac{i}{d}\Big )\Big ), \end{aligned} \end{aligned}$$

where

$$\begin{aligned} \begin{aligned} \Big (1 - \frac{1}{2d} + O\Big (\frac{i}{d^2}\Big )\Big )^{2d}&= \exp \Big (2d\times \log \Big (1 - \frac{1}{2d} + O\Big (\frac{i}{d^2}\Big )\Big )\Big )\\&= \exp \Big (2d\Big (- \frac{1}{2d} + O\Big (\frac{i}{d^2}\Big )\Big )\Big ) \\&= \exp \Big (-1 + O\Big (\frac{i}{d}\Big )\Big ) = e^{-1}\Big (1+O\Big (\frac{i}{d}\Big )\Big ), \end{aligned} \end{aligned}$$

and

$$\begin{aligned} \begin{aligned} \frac{2d(2d-1)\dots (2d-i+1)}{(2d)^i}&= 1\Big (1-\frac{1}{2d}\Big )\Big (1-\frac{2}{2d}\Big )\dots \Big (1-\frac{i}{2d}+\frac{1}{2d}\Big )\\&= \Big (1 + O\Big (\frac{i^2}{d}\Big )\Big ). \end{aligned} \end{aligned}$$

Then, the result follows

$$\begin{aligned} q_d(i)\ge & {} e^{-1} \frac{1}{i!}\Big (1+O\Big (\frac{i}{d}\Big )\Big ) \Big (1+O\Big (\frac{i^2}{d}\Big )\Big ) \Big (1+O\Big (\frac{i}{d}\Big )\Big ) \Big (1-O\Big (\frac{i}{d}\Big )\Big ) \\= & {} e^{-1} \frac{1}{i!}\Big (1+O\Big (\frac{i^2}{d}\Big )\Big ). \end{aligned}$$

\(\square \)
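The main term in the proof of Lemma 2.4 is a binomial weight converging to a \(\mathsf {Poisson}(1)\) weight. As a hedged numerical illustration (function names are ours, and all \(O(\cdot )\) corrections are replaced by zero), one can compare \({2d \atopwithdelims ()i} (2d)^{-i} (1 - 1/2d)^{2d-1-i}\) with \(e^{-1}/i!\):

```python
from math import comb, exp, factorial

def binomial_main_term(d, i):
    """(2d choose i) (1/2d)^i (1 - 1/2d)^(2d-1-i), with error terms set to zero."""
    n = 2 * d
    return comb(n, i) * (1 / n) ** i * (1 - 1 / n) ** (n - 1 - i)

def poisson_weight(i):
    """The Poisson(1) probability e^{-1} / i!."""
    return exp(-1) / factorial(i)

if __name__ == "__main__":
    for d in (10, 100, 1000):
        print(d, [binomial_main_term(d, i) / poisson_weight(i) for i in range(5)])
```

The ratios deviate from 1 by an amount of order \(i^2/d\), matching the relative error in the statement of the lemma.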

The above lemma gives a lower bound for \(q_d\), and we now prove an upper bound.

2.3 Upper Bounds

Recall that \(\pi (o)\) denotes the unique infinite self-avoiding path in the spanning forest starting at o and let \({\bar{A}}_o = \{\pi (o)\text { visits only one neighbour of } o\}\).

Lemma 2.5

\(\mathbf {P}[\pi (o) \text { visits more than one neighbour of } o] = \mathbf {P}[{\bar{A}}_o^c] = O(1/d).\)

Proof

The first step of \(\pi (o)\) must visit a neighbour of o, which we denote by w. Then \(\mathbf {P}[{\bar{A}}_o^c]\)

$$\begin{aligned} \begin{aligned}&= \mathbf {P}[\text {The second step of } \pi (o)\text { visits } x\ne 2w, \text { the third step visits } w' \sim o, w'\ne w] \\&\quad + O\Big (\frac{1}{d^2}\Big )\\&=\Big (\frac{1}{2d}\Big )\Big (\frac{2d-1}{2d}\Big ) + O\Big (\frac{1}{d^2}\Big ) = O\Big (\frac{1}{d}\Big ).\\ \end{aligned} \end{aligned}$$

\(\square \)

Let \({\bar{A}}_\mathrm{all} = \{ \forall w \sim o{:} \text { either } \pi (w) \text { does not visit } o \text { or } \pi (w) \text { visits } o~\mathrm{at~the~first~step} \}\).

Lemma 2.6

\(\mathbf {P}[\exists w\sim o: \pi (w)\text { visits } o \text { but not at the first step}] = \mathbf {P}[{\bar{A}}_\mathrm{all}^c] = O(1/d).\)

Proof

For a given \(w \sim o\), use Wilson’s algorithm with a walk started at w. If \(S_1^w \ne o\), or if \(S_1^w = o\) but \(S^w\) subsequently returns to w (so that the resulting loop at w is erased), then \(\pi (w)\) does not visit o at the first step. Hence, we have the inequality:

$$\begin{aligned} \begin{aligned}&\mathbf {P}[\pi (w)\text { visits } o\text { but not at the first step}] \\&\le \mathbf {P}[S^w\text { visits } o \text { but not at the first step}] + \mathbf {P}[S_1^w = o, S_n^w = w \text { for some } n\ge 2]. \end{aligned}\nonumber \\ \end{aligned}$$
(2.4)

We bound the two terms as follows. For the first term, let us append a step from o to w at the beginning of the walk and analyse it as if the walk started at o. Since \(S_1^o \in \mathcal {N} \backslash \{o\}\), by symmetry, we may assume \(S_1^o = w\). Then, if \(S_2^o \ne o\), \(S^o\) will need at least 2 more steps to return to o.

For the second term in the right hand side of (2.4), we first note that we have \(\mathbf {P}[S_1^w = o, S_2^w = w] = 1/(2d)^2\). If \(S^w\) does not return to w in the first two steps, \(S^w\) will need at least 4 steps to return to w. Then, we have that the right hand side of (2.4) is

$$\begin{aligned} \begin{aligned}&\le \mathbf {P}[S^o\text { returns to } o \text { in at least } 4 \text { steps}] + \frac{1}{(2d)^2} + \mathbf {P}[S^w\text { returns to } w \text { in at least } 4 \text { steps}]\\&= 2\times \mathbf {P}[S^o\text { returns to } o \text { in at least } 4 \text { steps}] + O\Big (\frac{1}{d^2}\Big ). \end{aligned} \end{aligned}$$

Therefore, by Lemma 2.1, we have the required result

$$\begin{aligned} \begin{aligned}&\mathbf {P}[\exists w\sim o: \pi (w)\text { visits } o \text { but not at the first step}]\\&\le 2d\times \mathbf {P}[\pi (w)\text { visits } o \text { but not at the first step for a fixed } w \sim o] =O\Big (\frac{1}{d}\Big ). \end{aligned} \end{aligned}$$

\(\square \)

Due to Lemmas 2.5 and 2.6, we have

$$\begin{aligned} q_d(i)\le & {} O\Big (\frac{1}{d}\Big ) + \mathbf {P}[{\bar{A}}_o \cap {\bar{A}}_\mathrm{all} \cap E_i] \\= & {} O\Big (\frac{1}{d}\Big ) + \sum _{\begin{array}{c} x_1,\dots ,x_i \sim o \\ \text {distinct} \end{array}} \mathbf {P}[{\bar{A}}_o \cap {\bar{A}}_\mathrm{all} \cap E_i(x_1,\dots ,x_i)]. \end{aligned}$$

Here,

$$\begin{aligned} \begin{aligned}&{\bar{A}}_o \cap {\bar{A}}_\mathrm{all} \cap E_i(x_1,\dots ,x_i) \\&\subset {\bar{A}}_o \cap {\bar{A}}_\mathrm{all} \cap \{\text {the first step of } \pi (x_j) \text { is to } o, j = 1,\dots ,i\} \cap F_i(x_1,\dots ,x_i), \end{aligned}\nonumber \\ \end{aligned}$$
(2.5)

where

$$\begin{aligned} F_i(x_1,\dots ,x_i) = \{\pi (x_j)\text { does not go through } o, j = i+1, \dots ,2d\}. \end{aligned}$$

The right hand side of (2.5) is contained in the event

$$\begin{aligned} {\bar{A}}_o \cap \{\pi (o) \text { does not visit } x_1,\dots ,x_i\} \cap {\bar{A}}_\mathrm{rest} \cap \bigcap _{1\le j \le i} H_j \cap F_i(x_1, \dots , x_i), \end{aligned}$$

where

$$\begin{aligned} {\bar{A}}_\mathrm{rest} = \{\pi (x_j)\text { goes through at most one } x_{j'},\ j = i+1,\dots ,2d,\ i+1 \le j' \le 2d,\ j' \ne j \} \end{aligned}$$

and \(H_j = \{\text {the first step of } \pi (x_j) \text { is to } o\}\) for \(j = 1,\dots ,i\).

We denote \({\bar{A}}_o\cap \{\pi (o)\text { does not visit } x_1,\dots ,x_i\}\) by \({\bar{A}}_{o,x_1,\dots ,x_i}\). Then,

$$\begin{aligned} \begin{aligned}&\mathbf {P}\Big [{\bar{A}}_{o,x_1,\dots ,x_i} \cap {\bar{A}}_\mathrm{rest} \cap \bigcap _{1\le j \le i} H_j \cap F_i(x_1, \dots , x_i)\Big ] \\&\quad =\mathbf {P}[{\bar{A}}_{o,x_1,\dots ,x_i}] \prod _{j=1}^i\mathbf {P}\Big [H_j\Big |\bigcap _{1\le j' < j} H_{j'} \cap {\bar{A}}_{o,x_1,\dots ,x_i}\Big ] \\&\qquad \times \mathbf {P}\Big [F_i(x_1,\dots ,x_i)\cap {\bar{A}}_\mathrm{rest} \Big |{\bar{A}}_{o,x_1,\dots ,x_i}\cap \bigcap _{1\le j \le i}H_j\Big ]. \end{aligned} \end{aligned}$$

Therefore, we have

$$\begin{aligned} \begin{aligned} q_d(i)&\le O\Big (\frac{1}{d}\Big ) +\sum _{\begin{array}{c} x_1,\dots ,x_i\sim o \\ \mathrm {distinct} \end{array}} \Big (\prod _{j=1}^i\mathbf {P}\Big [H_j \Big |\bigcap _{1\le j' < j}H_{j'}\cap {\bar{A}}_{o,x_1,\dots ,x_i}\Big ]\Big ) \\&\quad \times \mathbf {P}\Big [F_i(x_1,\dots ,x_i)\cap {\bar{A}}_\mathrm{rest} \Big |{\bar{A}}_{o,x_1,\dots ,x_i}\cap \bigcap _{1\le j \le i}H_j\Big ]. \end{aligned} \end{aligned}$$
(2.6)

Lemma 2.7

\(\mathbf {P}[H_j | {\bar{A}}_{o,x_1,\dots ,x_i} \cap \bigcap _{1\le j' < j}H_{j'}] = 1/2d + O(1/d^2)\), where \(j = 1,\dots ,i\).

Proof

Given that \(\pi (o)\) visits only one neighbour of o which is not in \(\{x_1,\dots ,x_i\}\) and the first steps of \(\pi (x_1),\dots ,\pi (x_{j-1})\) are all to o, the probability that \(H_j\) happens is \(\mathbf {P}[S_1^{x_j} = o] = 1/2d\) with the error term of \(O(1/d^2)\) due to the loop-erasure. \(\square \)

Lemma 2.8

$$\begin{aligned}&\mathbf {P}\Big [F_i(x_1,\dots ,x_i)\cap {\bar{A}}_\mathrm{rest} \Big |{\bar{A}}_{o,x_1,\dots ,x_i}\cap \bigcap _{1\le j \le i}H_j\Big ]\nonumber \\&\quad \le \mathbf {E}\Big [\Big (1-\frac{1}{2d}+O\Big (\frac{1}{d^2}\Big )\Big )^{2d-i-1-N} \mathbf {1}_{{\bar{A}}_\mathrm{rest}}\Big ], \end{aligned}$$
(2.7)

where \(N=|\{i+1\le j\le 2d-1:\exists i+1\le j'<j\) s.t. \(\pi (x_{j'})\) goes through \(x_j\}|\).

Proof

Consider Wilson’s algorithm with random walks started at the remaining neighbours \(x_{i+1}, \dots ,x_{2d}\). Assume \(x_{2d}\) to be the neighbour of o that \(\pi (o)\) goes through. The probability that \(\pi (x_k)\) does not go through o is \(1 - 1/2d + O(1/d^2)\) for \(k\in \{i+1,\dots ,2d-1\}\).

If \(\pi (x_k)\) visits \(x_{k'}\), where \(k < k' \le 2d-1\), the probability that \(\pi (x_{k'})\) does not go through o is 1 instead of \(1-1/2d+O(1/d^2)\), since the LERW from \(x_{k'}\) stops immediately and \(\pi (x_{k'}) \subset \pi (x_k)\), which does not go through o. \(\square \)

Lemma 2.9

On the event \({\bar{A}}_\mathrm{rest}\), \(N \le B\), where \(B \sim \mathsf {Binom}(2d-i-1, p)\), \(p = 1/2d + O(1/d^2)\).

Proof

This holds since there are \((2d-i-1)\) trials, each with success probability at most \(1/2d+O(1/d^2)\). \(\square \)

Due to Lemma 2.9, we have that the right hand side of (2.7) is

$$\begin{aligned} \le \Big (1-\frac{1}{2d}+O\Big (\frac{1}{d^2}\Big )\Big )^{2d} \Big (1+O\Big (\frac{i}{d}\Big )\Big ) \mathbf {E}\Big [\frac{1}{(1-\frac{1}{2d}+O(\frac{1}{d^2}))^B}\Big ], \end{aligned}$$
(2.8)

where \(\mathbf {E}[z^B] = \sum _{j=0}^{2d-i-1} z^j {2d-i-1 \atopwithdelims ()j}p^j(1-p)^{2d-i-1-j} = (1-p+zp)^{2d-i-1}\).

Hence, (2.8) is

$$\begin{aligned} \begin{aligned}&\le e^{-1}\Big (1+O\Big (\frac{1}{d}\Big )\Big )\Big (1+O\Big (\frac{i}{d}\Big )\Big ) \Big (1-\frac{1}{2d} + O\Big (\frac{1}{d^2}\Big ) + \frac{\frac{1}{2d} + O(\frac{1}{d^2})}{1-\frac{1}{2d} + O(\frac{1}{d^2})}\Big )^{2d-i-1}\\&= e^{-1}\Big (1+O\Big (\frac{1}{d}\Big )\Big ) \Big (1+O\Big (\frac{i}{d}\Big )\Big )\Big (1+O\Big (\frac{1}{d^2}\Big )\Big )^{2d-i-1} = e^{-1}\Big (1+O\Big (\frac{i}{d}\Big )\Big ). \end{aligned} \end{aligned}$$
(2.9)
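The step from (2.8) to (2.9) rests on the binomial generating function \(\mathbf {E}[z^B] = (1-p+zp)^n\) evaluated at \(z = 1/(1-p)\), where the base becomes \(1 + p^2/(1-p) = 1 + O(1/d^2)\). The sketch below checks both facts in exact arithmetic; it takes \(p = 1/2d\) exactly, dropping the \(O(1/d^2)\) corrections, and the function names are ours.

```python
from fractions import Fraction
from math import comb

def pgf_by_summation(n, p, z):
    """E[z^B] for B ~ Binom(n, p), computed term by term."""
    return sum(z ** j * comb(n, j) * p ** j * (1 - p) ** (n - j)
               for j in range(n + 1))

def pgf_closed_form(n, p, z):
    """Closed form (1 - p + z p)^n of the same generating function."""
    return (1 - p + z * p) ** n

if __name__ == "__main__":
    d, i = 8, 2
    n, p = 2 * d - i - 1, Fraction(1, 2 * d)
    z = 1 / (1 - p)
    print(pgf_by_summation(n, p, z) == pgf_closed_form(n, p, z))
    print(float(pgf_closed_form(n, p, z)))  # close to 1 when d is large
```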

Lemma 2.10

\(q_d(i) \le O(\frac{1}{d}) + e^{-1}\frac{1}{i!}(1+O(\frac{i}{d})).\)

Proof

Due to Lemma 2.7, (2.6) and (2.9), we have

$$\begin{aligned} \begin{aligned} q_d(i)&\le O\Big (\frac{1}{d}\Big ) + {2d\atopwithdelims ()i}\Big (\frac{1}{2d}+O\Big (\frac{1}{d^2}\Big )\Big )^i e^{-1}\Big (1+O\Big (\frac{i}{d}\Big )\Big )\\&= O\Big (\frac{1}{d}\Big ) + e^{-1}\frac{2d(2d-1)\dots (2d-i+1)}{i!} \Big (\frac{1}{2d}\Big )^i\Big (1+O\Big (\frac{1}{d}\Big )\Big )^i\\&\quad \Big (1+O\Big (\frac{i}{d}\Big )\Big )\\&\le O\Big (\frac{1}{d}\Big ) + e^{-1}\frac{1}{i!}\Big (1+O\Big (\frac{i}{d}\Big )\Big ). \end{aligned} \end{aligned}$$

\(\square \)

Lemma 2.11

For \(k = 1, 2, 3\) and distinct \(w_1, \dots , w_k \sim o\), we have

$$\begin{aligned} \mathbf {P}[ \pi (w_i)\text { passes through } o \text { for } i = 1,\dots ,k ] = \left( \frac{1}{2d} \right) ^k + O (d^{-k-1}). \end{aligned}$$

This lemma can be proved using ideas used to prove Lemma 2.7.

2.4 Proof of the Asymptotic Formula

Proof of Theorem 1.1

We first prove part (i). By (1.3),

$$\begin{aligned} p_d(i) = \sum _{j=0}^{i} \frac{q_d(j)}{2d-j}. \end{aligned}$$

Due to Lemmas 2.4 and 2.10, we have

$$\begin{aligned} p_d(i) \ge \sum _{j=0}^{i} \frac{e^{-1}\frac{1}{j!}\left( 1 + O\left( \frac{j^2}{d}\right) \right) }{2d-j} =\sum _{j=0}^{i}\frac{e^{-1}\frac{1}{j!}}{2d-j} +\sum _{j=0}^{i} \frac{\frac{1}{j!} O\left( \frac{j^2}{d}\right) }{2d-j}, \end{aligned}$$
(2.10)

and

$$\begin{aligned} p_d(i)\le & {} \sum _{j=0}^{i}\frac{O(\frac{1}{d}) + e^{-1}\frac{1}{j!}\left( 1+O\left( \frac{j}{d}\right) \right) }{2d-j} =\sum _{j=0}^{i}\frac{e^{-1}\frac{1}{j!}}{2d-j}\nonumber \\&+\sum _{j=0}^{i}\frac{O\left( \frac{1}{d}\right) +\frac{1}{j!}O\left( \frac{j}{d}\right) }{2d-j}. \end{aligned}$$
(2.11)

Here, using that \(0\le j\le d^{1/2}\), we have

$$\begin{aligned} \sum _{j=0}^{i} \frac{\frac{1}{j!} O\left( \frac{j^2}{d}\right) }{2d-j} \le \frac{1}{2d-d^{1/2}}O\left( \frac{1}{d}\right) \sum _{j=0}^{i}\frac{j^2}{j!} =O\left( \frac{1}{d^2}\right) . \end{aligned}$$

Similarly,

$$\begin{aligned} \begin{aligned} \sum _{j=0}^{i} \frac{O\left( \frac{1}{d}\right) + \frac{1}{j!} O\left( \frac{j}{d}\right) }{2d-j} \le \sum _{j=0}^i O(d^{-2}) + \sum _{j=0}^i \frac{j}{j!} O(d^{-2}) = O( i/d^2 ). \end{aligned} \end{aligned}$$

Putting these error bounds together with (2.10) and (2.11), we prove statement (i) of the theorem.

Let us now use that

$$\begin{aligned} \frac{1}{2d}e^{-1}\sum _{j=0}^{i} \frac{1}{j!} \le \sum _{j=0}^{i} \frac{e^{-1} \frac{1}{j!}}{2d-j} \le \frac{1}{2d-i}e^{-1}\sum _{j=0}^{i}\frac{1}{j!}. \end{aligned}$$

When \(i \le d^{1/2}\), and \(i, d \rightarrow \infty \), we have \(\frac{1}{2d-i} \sim \frac{1}{2d}\) and \(\sum _{j=0}^{i}\frac{1}{j!} \rightarrow e\). Hence,

$$\begin{aligned} \sum _{j=0}^{i} \frac{e^{-1} \frac{1}{j!}}{2d-j} \sim \frac{1}{2d}, \quad \text {as } i, d \rightarrow \infty . \end{aligned}$$

We are left to prove statement (ii). The uniform distribution for \(d^{1/2} \le i \le 2d-1\) can be obtained from the monotonicity:

$$\begin{aligned} p_d(d^{1/2}) \le p_d(i) \le p_d(2d-1), \quad d^{1/2} \le i \le 2d-1, \end{aligned}$$

if we show that \(p_d(2d-1) = p_d(d^{1/2}) + O(d^{-3/2})\).

We write

$$\begin{aligned} p_d(2d-1)= & {} \sum _{j=0}^{2d-1} \frac{q_d(j)}{2d-j} = p_d(d^{1/2}) + \sum _{j = d^{1/2}}^{2d-1} \frac{q_d(j)}{2d-j} \le p_d(d^{1/2}) \nonumber \\&+ \sum _{j = d^{1/2}}^{2d-1} q_d(j). \end{aligned}$$
(2.12)

Introducing the random variable

$$\begin{aligned} X:= |\left\{ w \sim o : o \in \pi (w) \right\} |, \end{aligned}$$

the last expression in (2.12) equals

$$\begin{aligned} p_d(d^{1/2}) + \mathbf {P}[ X \ge d^{1/2} ] \le p_d(d^{1/2}) + \mathbf {P}[ X^3 \ge d^{3/2} ] \le p_d(d^{1/2}) + \frac{\mathbf {E}[ X^3 ]}{d^{3/2}}. \end{aligned}$$

Therefore, it remains to show that \(\mathbf {E}[ X^3 ] = O(1)\). This follows from Lemma 2.11, by summing over \(w_1, w_2, w_3\) (not necessarily distinct). The cases \(k=1,2\) of the lemma are used to sum the contributions where one or more of the \(w_i\)’s coincide. \(\square \)
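The summation over \(w_1, w_2, w_3\) can be made concrete. Writing \(X^3\) as a triple sum of indicators, a triple with k distinct entries contributes \((1/2d)^k + O(d^{-k-1})\) by Lemma 2.11. The sketch below (the function name is ours) enumerates all triples and adds up only the main terms \((1/2d)^k\), so it illustrates the leading order and not the error terms.

```python
from fractions import Fraction
from itertools import product

def third_moment_main_term(d):
    """Sum of (1/2d)^k over all triples of neighbours, k = number of distinct entries."""
    n = 2 * d
    total = Fraction(0)
    for triple in product(range(n), repeat=3):
        k = len(set(triple))
        total += Fraction(1, n) ** k
    return total

if __name__ == "__main__":
    for d in (1, 3, 10):
        print(d, float(third_moment_main_term(d)))
```

The values increase towards 5, the third moment of a \(\mathsf {Poisson}(1)\) random variable, and in particular stay bounded in d.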