On the tail of the branching random walk local time

Consider a critical branching random walk on $\mathbb{Z}^d$, $d \ge 1$, started with a single particle at the origin, and let $L(x)$ be the total number of particles that ever visit a vertex $x$. We study the tail of $L(x)$ under suitable conditions on the offspring distribution. In particular, our results hold if the offspring distribution has an exponential moment.


Introduction
In this paper we study the tail of the number of times a critical branching random walk on $\mathbb{Z}^d$ returns to the origin. The result is most interesting in the upper-critical dimension $d = 4$, where we find that the local time has a stretched-exponential tail.

Theorem 1.1 Let $d \ge 1$, let $(B_n)_{n \ge 0}$ be a branching random walk on $\mathbb{Z}^d$ whose offspring distribution $\mu$ is critical, non-trivial, and sub-exponential, started with a single particle at the origin, and let $L(0)$ be the total number of particles that visit the origin. Then
$$P_{\mu,0}(L(0) \ge n) = \begin{cases} \Theta\bigl(n^{-2/(4-d)}\bigr) & d < 4, \\ \exp\bigl[-\Theta(\sqrt{n})\bigr] & d = 4, \\ \exp\bigl[-\Theta(n)\bigr] & d > 4 \end{cases}$$
for every $n \ge 1$.
Here, we say that the offspring distribution $\mu$ is critical if it has mean 1, non-trivial if $\mu(1) < 1$, and sub-exponential if there exist positive constants $C$ and $c$ such that $\mu(n) \le C e^{-cn}$ for every $n \ge 1$. We use both "$f(n) = \Theta(g(n))$ for every $n \ge 1$" and "$f(n) \asymp g(n)$ for every $n \ge 1$" to mean that there exist positive constants $c$, $C$ depending only on the offspring distribution $\mu$ and the dimension $d$ such that $c g(n) \le f(n) \le C g(n)$ for every $n \ge 1$. Similar meaning applies to the symbols $\preceq$ and $\succeq$, so that, for example, "$f_n(x) \preceq g_n(x)$ for every $n \ge 1$ and $x \in \mathbb{Z}^d$" means that there exists a positive constant $C$ depending only on the offspring distribution $\mu$ and the dimension $d$ such that $f_n(x) \le C g_n(x)$ for every $n \ge 1$ and $x \in \mathbb{Z}^d$.
Our work is motivated in part by our hope to understand the analogous questions for the Abelian sandpile model [4,18]. In this model, surveyed in [6], the total number of times the origin topples in an avalanche at equilibrium (equivalently, the total number of waves in an avalanche) is expected to behave in a roughly analogous way to the branching random walk local time at the origin, with a closer analogy expected to hold in dimensions $d > 4$. Currently, the distribution of the total number of waves in an avalanche remains poorly understood even in the high-dimensional case, where other aspects of the model are now fairly well understood [3,5,7].
There is an extensive literature on critical branching random walk on $\mathbb{Z}^d$, with works particularly relevant to the present paper including [2, 11-14, 21-24]. In light of this extensive literature, we were surprised to find that the tail of the local time had not previously been studied. The basic methods that we use (inductive analysis of moments via diagrammatic sums) are well known to experts, but we have included a detailed exposition so that this paper could be used as an introduction to these techniques.
We also prove the following off-diagonal version of Theorem 1.1. We use the notation $\langle x \rangle = 2 \vee d(0, x)$, where $d(0, x)$ denotes the graph distance between 0 and $x$, to avoid dividing by zero.
These estimates are due in the case $d \neq 4$ to Le Gall and Lin [13,14], who also proved the lower bound in the case $d = 4$, while the $d = 4$ upper bound was proven by Zhu [23,24]. (In fact the exact asymptotics of $P_{\mu,0}(L(x) \ge 1)$ have also been established by the same authors, see [13, Theorem 7] and [21,23].)

Remark 1.3
It is well known that a critical branching random walk conditioned to survive forever visits the origin infinitely often if and only if $d \le 4$ [2]. This is closely related to the fact that the conditional distribution of $L(x)$ given $L(x) > 0$ is tight as $x \to \infty$ if and only if $d \ge 5$.

Remark 1.4
In the context of super-Brownian motion (which is a continuum analogue of critical branching random walk), Le Gall and Merle [15] studied the conditional distribution of the occupation measure Z(B 1 (x)) of the unit ball B 1 (x) for large x, given that this measure is positive. Their results are closely related to Theorem 1.2. In particular, they show that if d = 4 then the conditional distribution of the normalized occupation measure Z(B 1 (x))/ log |x| given that it is positive converges to an exponential distribution as x → ∞. It would be interesting to establish a version of their theorem in the discrete case.

Remark 1.5
It is natural to consider the distribution of L(x) for branching random walks on graphs other than Z d . It should be straightforward to adapt the proof of Theorem 1.1 to bounded degree graphs that are d-Ahlfors regular and satisfy Gaussian heat kernel estimates. See e.g. [10,20] for background on these notions. We restrict attention to the usual nearest-neighbour random walk on Z d for clarity of exposition.

Branching random walk
Let us now very briefly define the model, referring the reader to e.g. [17,21] for more details on branching processes and Galton-Watson trees. Given $d \ge 1$, an offspring distribution $\mu$ (i.e., a probability measure on $\{0, 1, \ldots\}$), and a point $x \in \mathbb{Z}^d$, we write $P_{\mu,x}$ for the law of a branching random walk $(B_n)_{n \ge 0}$ on $\mathbb{Z}^d$ with offspring distribution $\mu$ started with a single particle at $x$. More precisely, $(B_n)_{n \ge 0}$ is a Markov chain whose state space is the set of finitely supported functions $\mathbb{Z}^d \to \{0, 1, \ldots\}$, where $B_0(y) = 1(y = x)$ and where we think of $B_n(y)$ as the number of particles occupying the point $y$ at generation $n$. At each time step, each particle splits into a random number of offspring particles independently at random according to the offspring distribution $\mu$, and each offspring particle immediately performs an independent simple random walk step. We define the local time $L_n(x) = \sum_{m=0}^{n} B_m(x)$ to be the total number of particles that occupy the site $x$ up to time $n$, and similarly define the limit $L(x) = \sum_{m=0}^{\infty} B_m(x)$.

Alternatively, we may construct branching random walk by first taking a Galton-Watson tree $T$ with offspring distribution $\mu$, which encodes the genealogy of the particles of the branching random walk, letting $X : V(T) \to \mathbb{Z}^d$ be a uniform random graph homomorphism from $T$ into $\mathbb{Z}^d$ mapping the root to $x$ (i.e., a simple random walk on $\mathbb{Z}^d$ started at $x$ and indexed by $T$), and letting $B_n(y) = \#\{v \in \partial T_n : X(v) = y\}$ for every $n \ge 0$ and $y \in \mathbb{Z}^d$. We write $\partial T_r$ for the set of vertices of $T$ at distance exactly $r$ from the root. It is easily seen that if $\mu$ is critical then $E_{\mu,0}[\#\partial T_r] = 1$ for every $r \ge 0$. Moreover, if $\mu$ is critical, non-trivial, and has finite variance $\sigma^2$, then Kolmogorov's estimate states that
$$P_{\mu,0}(\partial T_r \neq \emptyset) \sim \frac{2}{\sigma^2 r} \qquad \text{as } r \to \infty. \tag{2.1}$$
This estimate was proven by Kolmogorov under a third moment assumption [9], and in full generality by Kesten, Ney, and Spitzer [8]; see [16] for a modern proof.
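The particle-level description above can be turned into a small simulation. The following is a minimal sketch, not the authors' code: it assumes the geometric offspring law $\mu(n) = 2^{-(n+1)}$ (which has mean 1 and an exponential tail), and the names `sample_geometric_offspring`, `local_times` and the cutoff `max_generations` are introduced here purely for illustration.

```python
import random
from collections import Counter


def sample_geometric_offspring(rng):
    """Sample from mu(n) = 2^{-(n+1)}, n >= 0 (mean 1, hence critical)."""
    n = 0
    while rng.random() < 0.5:
        n += 1
    return n


def local_times(d, rng, max_generations=200):
    """Run one branching random walk started from the origin of Z^d.

    Returns a Counter mapping each visited site y to the number of
    particles that occupied y up to the cutoff, i.e. L_n(y) with
    n = max_generations. The cutoff is an assumption of this sketch:
    a critical tree dies out almost surely but is occasionally very tall.
    """
    origin = (0,) * d
    particles = Counter({origin: 1})      # B_0
    total = Counter(particles)            # running sum of the B_m
    for _generation in range(max_generations):
        if not particles:
            break
        next_particles = Counter()
        for site, count in particles.items():
            for _particle in range(count):
                for _child in range(sample_geometric_offspring(rng)):
                    # Each child takes one simple random walk step.
                    axis = rng.randrange(d)
                    step = rng.choice((-1, 1))
                    child = list(site)
                    child[axis] += step
                    next_particles[tuple(child)] += 1
        particles = next_particles
        total.update(particles)
    return total
```

For example, `local_times(2, random.Random(0))[(0, 0)]` gives one sample of the (truncated) local time at the origin in two dimensions; repeating this over many seeds gives an empirical tail to compare against the theorems.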

Random walk estimates
We now briefly recall the relevant background concerning random walk on $\mathbb{Z}^d$, referring the reader to e.g. [10,20] for further background. Let $p_n(u, v)$ denote the $n$-step transition probabilities of simple random walk on $\mathbb{Z}^d$. The Gaussian heat kernel estimates state that for every $x, y \in \mathbb{Z}^d$ and $n \ge 1$, where $d(x, y)$ denotes the graph distance between $x$ and $y$. (Note that the constants in the $\Theta$ notation may differ for the lower and upper bounds.) Note that $p_n(x, y) = 0$ if $n$ has a different parity to $d(x, y)$. In particular, we have that for every $x \in \mathbb{Z}^d$ and $n \ge 1$. If $d \ge 3$, the Gaussian heat kernel estimates can be integrated over time to obtain that the Green's function $G(u, v) = \sum_{n \ge 0} p_n(u, v)$ satisfies
$$G(u, v) \asymp \langle u - v \rangle^{-d+2}$$
for every $u, v \in \mathbb{Z}^d$.
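As a sanity check on the on-diagonal decay of order $n^{-d/2}$, the one-dimensional return probability can be computed exactly. The sketch below uses the classical identity $p_{2n}(0,0) = \binom{2n}{n} 4^{-n}$ for simple random walk on $\mathbb{Z}$; the function names are illustrative and not taken from the paper.

```python
from math import comb, sqrt


def return_probability_1d(n):
    """Exact value of p_{2n}(0, 0) for simple random walk on Z."""
    return comb(2 * n, n) / 4 ** n


def scaled_return_probability(n):
    """p_{2n}(0, 0) * sqrt(n), which stays bounded above and below."""
    return return_probability_1d(n) * sqrt(n)


if __name__ == "__main__":
    for n in (1, 10, 100, 1000):
        print(n, scaled_return_probability(n))
```

The scaled values increase towards $1/\sqrt{\pi}$, which illustrates the two-sided bound $p_{2n}(0,0) \asymp n^{-1/2}$ in dimension one.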

Diagrammatic expansion of moments
In this section we discuss how the moments of the branching random walk local time may be expanded in terms of diagrammatic sums, and then prove a recursive inequality that may be used to bound these sums. This basic methodology is well known to experts, see e.g. [15, eq. 6] for an application to super-Brownian motion, and [1] for related techniques in percolation.
Recall that a rooted plane tree is a locally finite tree with a distinguished root vertex and a distinguished linear ordering of the children of each vertex; an isomorphism of trees is an isomorphism of rooted plane trees if it preserves this additional data. Note that a rooted plane tree cannot have any nontrivial automorphisms. We may consider a Galton-Watson tree T to be a rooted plane tree by picking a uniform random linear ordering of the children of every vertex.
Let $k \ge 0$. We define a $k$-labelled rooted plane tree to be a finite rooted plane tree $S$ with vertex set $V(S)$, together with a (not necessarily injective) labelling function $\ell : \{0, 1, \ldots, k\} \to V(S)$ mapping 0 to the root of $S$ such that every leaf of $S$ is labelled (i.e., is in the image of $\ell$). Note that leaves of $S$ may have multiple labels, and that internal vertices may also have labels. Given a $k$-labelled rooted plane tree $S$, we write $\partial V(S) = \ell(\{0, 1, \ldots, k\})$ and $V^{\bullet}(S) = V(S) \setminus \partial V(S)$ to denote the sets of labelled and unlabelled vertices of $S$. An isomorphism of rooted plane trees is an isomorphism of labelled rooted plane trees if it preserves the labelling.
We say that a k-labelled rooted plane tree is a (labelled) k-skeleton if every unlabelled vertex has at least two children.
In particular, up to isomorphism there is only one 0-skeleton, which has one vertex labelled 0 and no edges. Similarly, there are exactly two isomorphism classes of 1-skeletons, which have one and two vertices respectively. For each $k \ge 0$, we let $\mathcal{S}_k$ be a set of isomorphism class representatives for the set of labelled $k$-skeletons and let $\mathcal{H}_k$ be a set of isomorphism class representatives for the set of $k$-labelled rooted plane trees.
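We will use the modified Green's functions
$$\tilde G(x, y) = \sum_{m \ge 1} p_m(x, y) \qquad \text{and} \qquad \tilde G_n(x, y) = \sum_{m=1}^{n} p_m(x, y)$$
for each $x, y \in \mathbb{Z}^d$ and $n \ge 1$. For each $k \ge 0$, each $k$-skeleton $S$ with labelling function $\ell$, and each $\mathbf{x} = (x_0, \ldots, x_k) \in (\mathbb{Z}^d)^{k+1}$, we define the $S$-diagram to be the function $D(\,\cdot\,; S) : (\mathbb{Z}^d)^{k+1} \to [0, \infty]$ given by
$$D(\mathbf{x}; S) = \sum_{y \in (\mathbb{Z}^d)^{V(S)}} \prod_{i=0}^{k} 1\bigl(y_{\ell(i)} = x_i\bigr) \prod_{u \sim v} \tilde G(y_u, y_v),$$
where the second product is over all unordered pairs of adjacent vertices in $S$. In particular, if $S$ is the 0-skeleton then $D(\,\cdot\,; S) \equiv 1$, while if $S$ is the 1-skeleton with two vertices then $D(x, y; S) \equiv \tilde G(x, y)$. Similarly, for each $k, n \ge 0$ and each $k$-skeleton $S$ we define the truncated $S$-diagram to be the function $D_n(\,\cdot\,; S) : (\mathbb{Z}^d)^{k+1} \to [0, \infty]$ given by
$$D_n(\mathbf{x}; S) = \sum_{y \in (\mathbb{Z}^d)^{V(S)}} \prod_{i=0}^{k} 1\bigl(y_{\ell(i)} = x_i\bigr) \prod_{u \sim v} \tilde G_n(y_u, y_v),$$
where, as before, the second product is over all unordered pairs of adjacent vertices in $S$.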
Recall that $E_{\mu,x}$ denotes expectation with respect to the law $P_{\mu,x}$ of a branching random walk $(B_n)_{n \ge 0}$ with offspring distribution $\mu$ started with a single particle at $x$. Recall also that we write $L_n(y) = \sum_{k=0}^{n} B_k(y)$ for the total number of particles that visit $y$ up to time $n$, and write $L(y) = \sum_{k=0}^{\infty} B_k(y)$ for the total number of particles that ever visit $y$. For each $k \ge 0$, we define $b_k$ to be the expectation of the binomial coefficient $\binom{\cdot}{k}$ under the offspring distribution $\mu$, that is,
$$b_k = \sum_{n \ge k} \binom{n}{k} \mu(n).$$
We have that for every $n, k \ge 0$ and $x_0, \ldots, x_k \in \mathbb{Z}^d$.
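The binomial moments $b_k$ are easy to compute for a finitely supported offspring law. The sketch below represents $\mu$ as a dictionary from offspring counts to probabilities; the helper name `binomial_moment` and the critical binary example $\mu(0) = \mu(2) = 1/2$ are illustrative assumptions, not taken from the paper.

```python
from fractions import Fraction
from math import comb


def binomial_moment(mu, k):
    """Return b_k = sum_n C(n, k) * mu(n) for a finitely supported law mu.

    mu is a dict mapping the number of offspring n to its probability.
    """
    return sum(comb(n, k) * p for n, p in mu.items())


# Critical binary branching: zero or two children with probability 1/2 each.
critical_binary = {0: Fraction(1, 2), 2: Fraction(1, 2)}

if __name__ == "__main__":
    for k in range(4):
        print(k, binomial_moment(critical_binary, k))
```

Here $b_0 = 1$ is the total mass, $b_1 = 1$ is the mean (so the law is critical), and $b_2 = 1/2$ is half the second factorial moment.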

Proof
We first explain the appearance of the combinatorial term $b_{c(u)}$ in the proposition. Let $T$ be the genealogical tree of $B$, and let $X$ be the random embedding of $T$ into $\mathbb{Z}^d$. Let $H$ be a $k$-labelled rooted plane tree. We say that a graph homomorphism $\phi$ from $H$ into the Galton-Watson tree $T$ is an embedding if it is injective, maps the root of $H$ to the root of $T$, and respects the plane structure of $H$ and $T$ in the sense that for every vertex $v$ of $H$ with children $u_1, \ldots, u_n$, the children $\phi(u_1), \ldots, \phi(u_n)$ of $\phi(v)$ in $T$ appear in the same linear order as $u_1, \ldots, u_n$ do in $H$. However, $T$ may have additional vertices not corresponding to any vertex in $H$. It is easily seen by induction on the height of $H$ that
$$E_{\mu,0}\bigl[\#\{\text{embeddings of } H \text{ into } T\}\bigr] = \prod_{u \in V(H)} b_{c(u)} \tag{3.1}$$
for every finite rooted plane tree $H$, where $c(u)$ denotes the number of children of $u$ in $H$. (This equality holds even if $\mu$ is not critical.)

We begin with the first, non-truncated formula. Given a $k$-tuple of not necessarily distinct vertices $v_1, \ldots, v_k$ of $T$, let $H(v_1, \ldots, v_k)$ be the $k$-labelled rooted plane tree spanned by the union of the geodesics between the root of $T$ and the vertices $v_1, \ldots, v_k$, with labelling function $\ell$ defined by setting $\ell(0)$ to be the root of $T$ and $\ell(i) = v_i$ for each $1 \le i \le k$. On the other hand, by definition of the embedding $X$ we have that where $p_1(\cdot, \cdot)$ denotes the one-step transition probabilities for simple random walk on $\mathbb{Z}^d$. Taking expectations over $T$ and applying (3.1), we obtain that For each $H$ in $\mathcal{H}_k$, let $S(H) \in \mathcal{S}_k$ denote the $k$-skeleton obtained from $H$ by replacing each path whose interior vertices are unlabelled vertices of degree two by a single edge. Thus, for each $S \in \mathcal{S}_k$, the set of $H \in \mathcal{H}_k$ with $S(H) = S$ is equal to the set of $k$-labelled rooted plane trees that can be obtained from $S$ by replacing each edge with a path of arbitrary length. Since $\mu$ is critical, so that $b_1 = 1$, one may readily verify that for every $S \in \mathcal{S}_k$ and $x_0, x_1, \ldots, x_k \in \mathbb{Z}^d$. The first claim follows from this together with (3.2).
The proof in the truncated case is fairly similar, and we give only a very brief outline. For each $n \ge 0$ and $k \ge 0$, let $\mathcal{H}_{n,k} \subset \mathcal{H}_k$ denote the set of $k$-labelled rooted plane trees with height at most $n$, and let $\mathcal{H}'_{n,k}$ denote the set of $k$-labelled rooted plane trees in which each path whose interior vertices are unlabelled vertices of degree two has length at most $n$. Clearly $\mathcal{H}_{n,k} \subset \mathcal{H}'_{n,k}$. We have by similar reasoning to above that as claimed.
We next state and prove a recursive inequality that allows us to bound the diagrammatic sums arising in Lemma 3.1. For each k ≥ 0, let S k be the set of k-skeletons whose labelling function is injective. We observe that for any tuple x, the maximum max S∈S k D(x; S) is invariant to permuting the elements of x. Indeed, D(x; S) is invariant under applying the same permutation to both the entries of x and the labels of S. (If 0 is not a fixed point of the permutation, this requires one to change the root of S.) The symmetry of the random walk implies that such re-rooting also does not change D. In light of this, for each k ≥ 1 and x ∈ Z d , we define where the equality of these three expressions follow from the symmetry noted above. We could equivalently define M k (x) by maximizing D(x; S) over all S ∈ S k and all x which are a permutation of (0, . . . , 0, x). Similarly, we define the truncated version Note that the quantities 1 ∨G n (0, 0) −1 and 1 ∨G(0, 0) −1 are bounded above by p 2 (0, 0) −1 = 2d when n ≥ 2. Be warned, however, that 1 ∨G n (0, 0) −1 is infinite when n ∈ {0, 1}. Later in the paper we will be careful to avoid this case.
Proof of Lemma 3.2 We will prove (3.3), the proof of (3.4) being almost identical. It suffices to prove that for every $k \ge 2$. Indeed, the first and second terms are each clearly smaller than the third multiplied by $M_1(0)^{-1} = \tilde G(0, 0)^{-1}$ (consider the contributions to the sum in the third term from $y = 0$ and $y = x$).
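Fix $k \ge 2$ and a $k$-skeleton $S$ whose labelling function $\ell$ is injective. We consider three cases, which correspond to the three terms being maximized over in the inequality (3.5):
1. $\ell(k)$ is not a leaf.
2. $\ell(k)$ is a leaf and the parent of $\ell(k)$ is labelled.
3. $\ell(k)$ is a leaf and the parent of $\ell(k)$ is in $V^{\bullet}(S)$ (i.e., is unlabelled).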
Case 1 Let $a \ge 1$ be the number of labelled vertices that are descendants of $\ell(k)$ in $S$. Since $\ell$ is injective, $\ell(k)$ is not the root of $S$ and $a < k$.
Let $S_1$ be the $a$-skeleton formed by $\ell(k)$ and its descendants in $S$, where we consider $\ell(k)$ to be the root of $S_1$ and re-index the labels if necessary so that the labelling function has domain $\{0, \ldots, a\}$. Similarly, let $S_2 = (T_2, \ell_2)$ be the $(k - a)$-skeleton obtained from $S$ by deleting all the descendants of $\ell(k)$, and re-indexing the labels so that the labelling function $\ell_2$ has domain $\{0, 1, \ldots, k - a\}$ and satisfies $\ell_2(k - a) = \ell(k)$. (In both cases, the details of relabelling are not important.) Having done this, we observe that, by the definitions, We deduce that if $S \in \mathcal{S}_k$ is such that $\ell(k)$ is not a leaf of $S$ then which corresponds to the first term in (3.5).

Case 2 We may define a $(k - 1)$-skeleton $S'$ by deleting $\ell(k)$ from $S$. The definitions then ensure that which corresponds to the second term in (3.5).

Case 3 Let $v$ be the (unlabelled) parent of $\ell(k)$. Let $a$ be the number of labelled descendants of $v$ other than $\ell(k)$. Since $v$ is unlabelled it has at least two children, and therefore $a \ge 1$. Let $S_1$ be the $a$-skeleton consisting of $v$ and its descendants other than $\ell(k)$, where we consider $v$ to be the root of We deduce that if $S \in \mathcal{S}_k$ is such that $\ell(k)$ is a leaf and the parent of $\ell(k)$ is in $V^{\bullet}(S)$ then which corresponds to the third term in (3.5).
Since one of the three cases above holds for every S ∈ S k , the claimed inequality (3.5) follows from (3.7), (3.8), and (3.9).
We now note that bounds on M k and M k,n yield bounds on all diagrams, i.e. also with non-injective labels. Indeed, suppose that S ∈ S k for some k ≥ 1 and that the labelling function of S is not injective. for every k ≥ 0, x ∈ Z d , and n ≥ 0.

Low dimensions
In this section we prove the following proposition, which implies the case d < 4 of Theorems 1.1 and 1.2. We remark that in this low dimensional case we do not require a sub-exponential tail for the offspring distribution, and a moment condition is sufficient.
for every n ≥ 1.
Our analysis is informed by the following heuristic: In low dimensions, the easiest way for the local time $L(x)$ to be large is for the genealogical tree to be sufficiently large, without any other unusual behaviour for the tree or the associated random walks. Indeed, intuitively, if the genealogical tree survives to generation $k$, which occurs with probability $\Theta(k^{-1})$, then it typically contains roughly $k^2$ vertices, and the locations of the corresponding particles are roughly uniformly distributed on the ball of radius $k^{1/2}$. Thus, if $R$ denotes the survival time of the branching random walk, we should typically have that $L(x)$ is of order $R^{(4-d)/2}$ whenever $\langle x \rangle \le R^{1/2}$. Thus, we expect that the easiest way to have $L(x) \ge n$ is for $R$ to be at least $\max\{\langle x \rangle^2, n^{2/(4-d)}\}$, which leads to the expression given in Proposition 4.1. One may think of this heuristic argument as yielding a hyperscaling relation for branching random walk below the critical dimension, and the proof of Proposition 4.1 as a rigorous verification of this hyperscaling relation.
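The arithmetic behind this heuristic can be sketched as follows. A tree surviving $k$ generations has about $k^2$ particles spread over about $k^{d/2}$ sites, so each site carries about $k^{(4-d)/2}$ particles; solving $k^{(4-d)/2} = n$ gives $k = n^{2/(4-d)}$, and survival to that generation has probability of order $k^{-1}$. The function names below are illustrative only.

```python
from fractions import Fraction


def survival_scale_exponent(d):
    """Exponent a such that surviving about n^a generations gives L(0) of order n."""
    if d not in (1, 2, 3):
        raise ValueError("the heuristic applies only below the critical dimension d = 4")
    return Fraction(2, 4 - d)


def heuristic_tail_exponent(d):
    """Exponent of n in the heuristic tail n^{-2/(4-d)} of L(0)."""
    return -survival_scale_exponent(d)


if __name__ == "__main__":
    for d in (1, 2, 3):
        print(d, survival_scale_exponent(d), heuristic_tail_exponent(d))
```

This reproduces the survival scales $n^{2/3}$, $n$ and $n^2$ used for $d = 1, 2, 3$ in the proof of the upper bound.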
We now begin the rigorous proof of Proposition 4.1. We shall see that it is sufficient to look at the first three moments of the truncated local time L n (x). (In dimensions d = 1, 2 it suffices to consider the first and second moment, while in d = 3 dimensions using the second moment results in an unwanted logarithmic correction.) Lemma 4.3 Let μ be critical, and let d ∈ {1, 2, 3}. Then the following moment bounds hold.
(a) If $\mu$ has finite second moment then for every $x \in \mathbb{Z}^d$ and $n \ge 1$. Note that these bounds are clearly not sharp when, say, $\langle x \rangle \gg \sqrt{n}$. This will not be a problem for us as the estimates are sharp in the regimes in which we wish to apply them.
We will frequently use the easily proved fact that for every c > 0 and α ∈ R there exists a constant C = C(c, α) such that r ≥1 for every n ≥ 1.
(a) Second moment. Let S ∈ S 2 be a 2-skeleton. Fix 1 ≤ d ≤ 3. No 2-skeleton has a vertex of degree more than three. Since b 0 , b 1 , b 2 < ∞ by assumption, and there is a finite number (10) of 2-skeletons, it suffices by Lemma 3.1 to prove that for every x ∈ Z d and n ≥ 1. for every k = 0, 1, 2 and x ∈ Z d . This bound is trivially satisfied for k = 0, since in this case M 0,n (x) = 1(x = 0) ≤ 1. For k = 1 we have that for every x ∈ Z d and n ≥ 2. Applying the Gaussian heat kernel estimates eq. (2.2) we deduce that there exists a positive constant c such that (4.6) Using that #{y ∈ Z d : y = r } = O(r d−1 ) and changing variables to and hence that 3 and deduce that for every n ≥ 2 as claimed.
(b) Third moment. Since no 3-skeleton has a vertex with more than 3 offspring, and since b 0 , b 1 , b 2 , b 3 < ∞ by assumption, it suffices by Lemma 3.1 to prove that for every n ≥ 2, k = 0, 1, 2, 3, and x ∈ Z d . The fact that this bound is satisfied for k = 0, 1, 2 has already been established. For k = 3, we apply Lemma 3.2 and (4.5) to deduce that n (z, y)G n (0, y)G n (y, x).
As before, we apply the Gaussian heat kernel estimates (2.2) to bound this sum by By similar reasoning to above, we can bound By symmetry we can also bound the left hand side by min{k 1 where we once again bounded the minimum by the geometric mean and used that for every n ≥ 1 and x ∈ Z d as required.
Before applying Lemma 4.3 to prove Proposition 4.1, let us recall the Paley-Zygmund inequality and its higher-moment variants. The usual Paley-Zygmund inequality states that if $X$ is a non-negative random variable with finite second moment then
$$P\bigl(X \ge \varepsilon E[X]\bigr) \ge (1 - \varepsilon)^2 \frac{E[X]^2}{E[X^2]}$$
for every $0 \le \varepsilon \le 1$. Applying this inequality to the conditional distribution of a non-negative random variable $X$ given that $X > 0$ and doing a little algebra, we obtain that in fact
$$P\bigl(X \ge \varepsilon E[X \mid X > 0]\bigr) \ge (1 - \varepsilon)^2 \frac{E[X]^2}{E[X^2]}$$
for every $0 \le \varepsilon \le 1$. The Paley-Zygmund inequality also has the following $L^p$ version. We include a short proof since this inequality is less standard.

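Lemma 4.4 Let $X$ be a non-negative random variable. Then
$$P\bigl(X \ge \varepsilon E[X]\bigr) \ge \left[\frac{(1 - \varepsilon)^p E[X]^p}{E[X^p]}\right]^{1/(p-1)}$$
for every $p > 1$ and $0 \le \varepsilon \le 1$.

Proof Hölder's inequality implies that
$$E[X] = E\bigl[X 1(X < \varepsilon E[X])\bigr] + E\bigl[X 1(X \ge \varepsilon E[X])\bigr] \le \varepsilon E[X] + E[X^p]^{1/p} P\bigl(X \ge \varepsilon E[X]\bigr)^{(p-1)/p}.$$
Rearranging gives the desired inequality. Now suppose that $X$ is a non-negative random variable. Applying the above inequality to a random variable $Z$ distributed according to the conditional distribution of $X$ given $X > 0$ gives that
$$P\bigl(X \ge \varepsilon E[X \mid X > 0]\bigr) \ge \left[\frac{(1 - \varepsilon)^p E[X]^p}{E[X^p]}\right]^{1/(p-1)}$$
for every $p > 1$ and $0 \le \varepsilon \le 1$.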
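The $L^p$ Paley-Zygmund inequality can be checked numerically on a small example. The sketch below assumes the inequality in the standard form $P(X \ge \varepsilon E[X]) \ge [(1-\varepsilon)^p E[X]^p / E[X^p]]^{1/(p-1)}$ and evaluates both sides for a finitely supported distribution; the function name and the example distribution are illustrative choices.

```python
def paley_zygmund_sides(values, probs, p, eps):
    """Return (lhs, rhs) of the L^p Paley-Zygmund inequality.

    lhs = P(X >= eps * E[X]),
    rhs = ((1 - eps)^p * E[X]^p / E[X^p])^(1 / (p - 1)),
    for X taking values[i] >= 0 with probability probs[i].
    """
    mean = sum(v * q for v, q in zip(values, probs))
    moment_p = sum(v ** p * q for v, q in zip(values, probs))
    lhs = sum(q for v, q in zip(values, probs) if v >= eps * mean)
    rhs = (((1 - eps) * mean) ** p / moment_p) ** (1 / (p - 1))
    return lhs, rhs


if __name__ == "__main__":
    # A heavy-ish example: X is usually 0, sometimes 1, rarely 10.
    values, probs = [0, 1, 10], [0.9, 0.09, 0.01]
    for p in (2, 3):
        for eps in (0.0, 0.5, 0.9):
            print(p, eps, paley_zygmund_sides(values, probs, p, eps))
```

In this example the bound with $p = 2$ and $p = 3$ can be compared directly, which mirrors the choice between second and third moments in the low-dimensional argument.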

Proof of Proposition 4.1
for every n ≥ 1 and x ∈ Z d . Since μ has finite second moment, is critical and non-trivial, we have by the Kolmogorov estimate (2.1) that where T is the genealogical tree of the branching process. Thus, we can bound for every n, r ≥ 1. Taking r = n 2/3 when d = 1, r = n when d = 2, and r = n 2 when d = 3, we obtain that for every n ≥ 1 and x ∈ Z d as desired.
We now turn to the lower bounds. It suffices to prove that there exists a constant c such that for every x ∈ Z d and every n ≥ c x 4−d : the required bound for smaller n follows since P μ,0 (L(x) ≥ n) is a decreasing function of n. For each r ≥ 1 we have by linearity of expectation that for every x ∈ Z d and r ≥ 1. If r x 2 and has the right parity then p (0, x) r −d/2 . It follows that for every x ∈ Z d and r ≥ x 2 . Suppose that d ∈ {1, 2}. We deduce from (4.8), (4.1) and the Paley-Zygmund inequality that there exists a constant c > 0 such that if r ≥ x 2 then and the desired lower bound follows by taking r = (n/c) 2/(4−d) . Now suppose that d = 3. Applying (4.7) with p = 3 we obtain that there exists a constant c such that and we conclude as before.

High dimensions
In this section we treat the case d ≥ 5.

Proposition 5.1 Let d ≥ 5 and suppose that the offspring distribution μ is critical, non-trivial, and sub-exponential. Then
for every n ≥ 1 and x ∈ Z d .
The lower bound is simple, and most of our work will go into proving the upper bound. By a standard computation, which we reproduce below, it suffices to prove that there exists a constant $C = C(\mu, d)$ such that $E_{\mu,0}[L(x)^k] \le C^k k! \langle x \rangle^{-d+2}$ for every $k \ge 1$. Thus, applying Lemma 3.1, it suffices to prove the following two lemmas. Recall that $c(u)$ is the number of offspring of a vertex $u$ in a skeleton.

Lemma 5.2 (The skeleton partition function)
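If $\mu$ is critical and sub-exponential then there exists a constant $\kappa = \kappa(\mu)$ such that $\sum_{S \in \mathcal{S}_k} \prod_{u \in S} b_{c(u)} \preceq \kappa^k k!$ for every $k \ge 1$.
for every $k \ge 0$, $S \in \mathcal{S}_k$ and $x \in \mathbb{Z}^d$.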
We begin with Lemma 5.2.

Proof of Lemma 5.2
Since $\mu$ is sub-exponential it satisfies a bound of the form $\mu(n) \le C \lambda^n$ for some $C < \infty$ and $\lambda < 1$. Thus, we have by a standard generating function calculation [19,Eq. 1.31 Let $\mathcal{S}_{n,k} \subseteq \mathcal{S}_k$ be the set of isomorphism classes of $k$-skeletons with exactly $n$ vertices, and let $\mathcal{T}_n$ denote the set of isomorphism classes of rooted plane trees with exactly $n$ vertices. It is well known [19, Example 2.16] that $|\mathcal{T}_n|$ is given by the Catalan number
$$|\mathcal{T}_n| = \frac{1}{n} \binom{2n-2}{n-1} \le 4^n.$$
For each rooted plane tree $T \in \mathcal{T}_n$ there are at most $n^k$ isomorphism classes of $k$-skeletons with underlying rooted tree $T$, so that $|\mathcal{S}_{n,k}| \le 4^n n^k$ for every $n \ge 1$ and $k \ge 0$. On the other hand, if $S \in \mathcal{S}_k$ then every vertex of $V^{\bullet}(S)$ has degree at least three, so that and hence that $|V(S)| \le 2k$. Putting these observations together, we obtain that for every $k \ge 0$, from which the claim follows easily.

Lemma 5.3 will be proven using the recursive inequality Lemma 3.2 together with the following simple fact, which is related to the fact that the simple random walk bubble diagram converges when $d \ge 5$.
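The Catalan count of rooted plane trees used above can be verified by direct enumeration. The sketch below counts plane trees by splitting off the subtree of the root's first child, and compares with the closed form $\frac{1}{n}\binom{2n-2}{n-1}$ and the cruder bound $4^n$; the function names are illustrative.

```python
from functools import lru_cache
from math import comb


@lru_cache(maxsize=None)
def count_plane_trees(n):
    """Number of rooted plane trees with exactly n vertices.

    For n >= 2, remove the subtree hanging from the root's first child
    (j vertices); what remains is a plane tree with n - j vertices.
    """
    if n == 1:
        return 1
    return sum(count_plane_trees(j) * count_plane_trees(n - j) for j in range(1, n))


def catalan_formula(n):
    """Closed form (1/n) * C(2n - 2, n - 1) for the same count."""
    return comb(2 * n - 2, n - 1) // n


if __name__ == "__main__":
    for n in range(1, 8):
        print(n, count_plane_trees(n), catalan_formula(n), 4 ** n)
```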

Lemma 5.4 Let d ≥ 5. Then there exists a constant C = C(d) ≥ 1 such that
for every x ∈ Z d .

Proof of Lemma 5.4
The lower bound is trivial from the contribution of $y = 0$. For the upper bound, consider the set $A = \{y \in \mathbb{Z}^d : d(x, y) \ge d(0, x)/2\}$. We will control the contribution to the sum from $A$ and $A^c$ separately. If $y \in A$ then $\langle x - y \rangle \succeq \langle x \rangle$, so that where we used that $\sum_{y \in \mathbb{Z}^d} \langle y \rangle^{-2d+4}$ is finite when $d \ge 5$. On the other hand, if $y \in A^c$ then $d(0, x)/2 \le d(0, y) \le 3d(0, x)/2$ and we have that Since there are $O(r^{d-1})$ points $y$ with $\langle x - y \rangle = r$ for each $r \ge 1$, we deduce that where we used that $d \ge 4$ in the last inequality. Combining (5.3) and (5.4) completes the proof.

Proof of Lemma 5.3
Let $C_1 \ge 1$ be such that $\tilde G(x, y) \le C_1 \langle x - y \rangle^{-d+2}$ for every $x, y \in \mathbb{Z}^d$, let $C_2 \ge 1$ be the constant from Lemma 5.4, and let $\lambda = C_1^2 C_2 [1 \vee \tilde G(0, 0)^{-1}]$. We will prove by induction on $k$ that for every $k \ge 1$. The base case $k = 1$ is immediate, since there is only one 1-skeleton with injective labelling, and this element $S$ has $D(0, x; S) = \tilde G(0, x) \le C_1 \langle x \rangle^{-d+2}$. Now suppose that $k \ge 2$ and that the induction hypothesis (5.5) holds for all $1 \le r \le k - 1$. Applying Lemma 3.2 and Lemma 5.4 we obtain that for every $x \in \mathbb{Z}^d$. This completes the induction. The claim follows from (5.5) and (3.10).
Proof of Proposition 5.1 We begin with the upper bound. Lemmas 3.1, 5.2, and 5.3 imply that there exists a constant $\alpha$ such that $E_{\mu,0}[L(x)^k] \le \alpha^k k! \langle x \rangle^{-d+2}$ for every $k \ge 1$ and $x \in \mathbb{Z}^d$. We deduce that for every $x \in \mathbb{Z}^d$, and hence by Markov's inequality that for every $x \in \mathbb{Z}^d$ and $n \ge 1$ as claimed.
We finish with the lower bound. First suppose that $x = 0$. The probability $q$ that the initial particle has at least one grandchild is positive, and any grandchild has probability $1/(2d)$ of being back at the origin. By the Markov property, the probability that there are at least $n$ visits to 0 is at least $(q/2d)^n = e^{-\Theta(n)}$ for every $n \ge 1$. If $x \neq 0$, then we claim that as required, where the final inequality follows from (1.1). Indeed, for the first inequality, note that if we explore the genealogical tree $T$ in a breadth-first manner until $x$ is visited for the first time, the part of the branching process that is descended from this first visit to $x$ has conditional law $P_{\mu,x}$. This completes the proof.

The critical dimension
In this section we deal with the case of the upper critical dimension d = 4, which is the most technical. We rely on the machinery developed in the previous sections, in particular Lemmas 5.2 and 3.2. The following is the d = 4 case of Theorem 1.2.
Proposition 6.1 Let $d = 4$ and suppose that the offspring distribution $\mu$ is critical, non-trivial, and sub-exponential. Then for every $n \ge 1$ and $x \in \mathbb{Z}^d$.

Remark 6.2
Proposition 6.1 shows that in four dimensions, unlike in low dimensions, the easiest way for $L(0)$ to be large is not for the genealogical tree to be "large in a typical way". Indeed, $L(0)$ is typically logarithmic in the size of the tree, so for $L(0)$ to be of order $n$ we would need the tree to survive to generation $e^{\Theta(n)}$. This occurs with probability $e^{-\Theta(n)}$, which is much smaller than the probability that $L(0) \ge n$.
The proof of this proposition relies on the results of Zhu [23,24] (i.e., the $d = 4$ case of the hitting probability estimate (1.1)) in the case $x \neq 0$, but is self-contained in the case $x = 0$. Indeed, the proposition will follow from Zhu's results together with the following proposition.
for every x ∈ Z d and k ≥ 1.
We begin with the following lemma, which is the four-dimensional analogue of Lemma 5.4.

Lemma 6.4 Let d = 4. Then there exists a positive constant C such that
for every x ∈ Z d and k ≥ 0.
Proof of Lemma 6.4 Partition Z 4 into three sets A, B, C according to the distance to 0 and x: We will control the contribution to the sum of each of these three sets separately.
The sum on the right hand side can be bounded with a little calculus: We have the integral identity for every $s \ge 1$, and since the function $t^{-1}(k + \log t)^k$ is decreasing when $t \ge 1$ (as can be seen by computing the derivative to be $-t^{-2}(k + \log t)^{k-1} \log t$), we have that and hence that as required.

It remains to upper bound the contributions from $B$ and $C$. If $y \in B$ then $d(0, x)/2 \le d(0, y) \le 2d(0, x)$ and we have that Up to constants, there are $2^{4n}$ choices for $y$ with $2^n \le \langle y \rangle < 2^{n+1}$. For each such $y$ we have $\langle y \rangle^{-6} [k + \log \langle y \rangle]^k \preceq 2^{-6n}(k + n \log 2)^k$, so the total contribution from all such $y$'s is (up to constants) $2^{-2n}(k + n \log 2)^k$. Thus The ratio of consecutive terms in this sum is Since $e/4 < 1$, it follows that the sum on the right of (6.4) is of the same order as its first term, and we deduce that This is also of the required order, completing the proof.
Proof of Proposition 6.3 We begin with the upper bound. Let $C_1 \ge 1$ be a constant such that $\tilde G(0, x) \le C_1 \langle x \rangle^{-2}$ for every $x \in \mathbb{Z}^4$, let $C_2 \ge 1$ be the constant from Lemma 6.4, and let $\lambda = C_1^2 C_2 [1 \vee \tilde G(0, 0)^{-1}]$. We prove by induction on $k$ that for every $k \ge 1$ and $x \in \mathbb{Z}^4$. The base case $k = 1$ is trivial. For $k \ge 2$, we may apply Lemma 3.2 and the induction hypothesis to obtain that as desired, where we applied Lemma 6.4 in the second line. As in the proof of Proposition 5.1, it follows from (6.6) together with Lemmas 3.1 and 5.2 that there exists a constant $C_3$ such that for every $x \in \mathbb{Z}^d$ and $k \ge 1$ as claimed.
We now turn to the lower bound. We first prove the bound for k of the form 2 for some natural number ≥ 1. For each ≥ 0, let k = 2 and let T = T be the rooted plane tree with boundary in which the root has degree 1, the descendants of the root's child form a complete binary tree of height , and ∂ V (T ) is equal to the set of leaves of T . Let ρ be the root of T , let v 0 be the child of the root, and for each vertex v of T other than ρ, let σ (v) denote the parent of v in T . There are k! ways to label the non-root leaves of T with the labels {1, . . . , k}, and each such labelling yields a distinct k-skeleton. Let S = S be one such labelled k-skeleton. Applying Lemma 3.1, we have by symmetry that 1 D(x, 0, . . . , 0; S). (6.8) (Recall that b 2 is the second descending moment of the offspring distribution, which is positive since μ is critical and nontrivial.) Consider the set of functions φ : V • (T ) → {0, 1, . . . , k ∨ log 2 x } that are decreasing along each branch of the tree, i.e. such that 2 k∨log 2 x ≤ 2 k x . Thus, we obtain from the definitions that there exists a positive constant c 1 such that Next observe that there exists a positive constant c 2 such that so that there exists a positive constant c 3 such that D(x, 0, . . . , 0; S) ≥ c k 3 x −2 | |.
It remains to estimate | |. Let E i be the set of edges of T connecting vertices at distance i from the root to the children of these vertices, so that |E i | = 2 i for 0 ≤ i ≤ , and let E = −1 i=0 E i . Let be the set of functions ψ : E → {0, . . . , k ∨ log 2 x } such that if e ∈ E i then ψ(e) ≤ 2 i− (k ∨ log 2 x ). We clearly have that We claim that there is an injection → . Given ψ ∈ , let φ ∈ be defined recursively by φ(v 0 ) = k ∨ log 2 x and φ(v) = φ(σ (v)) − ψ({v, σ (v)}) for every v ∈ V • (T ) \ {v 0 }. The function φ is indeed an element of , since φ(v) ≥ k ∨ log 2 x − −1 i=1 2 i− (k ∨ log 2 x ) ≥ 0 for every v ∈ V • (T ). Moreover, distinct elements of clearly lead to distinct elements of under this assignment, as claimed. We deduce that where c 4 = 2 − ∞ m=1 m2 −m > 0. It follows that there exists a constant c 5 > 0 such that D(x, 0, . . . , 0; S ) ≥ c k 5 [k + log 2 x ] k−1 x −2 (6.9) for every k = 2 ≥ 2 and x ∈ Z 4 . Putting together (6.8) and (6.9), we obtain that there exists a constant c 6 > 0 such that for every x ∈ Z 4 and every k = 2 for some ≥ 1.
To get the lower bound for $k$ which is not a power of 2, we interpolate using log-convexity. By Cauchy-Schwarz, for any random variable $X \ge 0$ and any $a \ge i \ge 0$ we have $(E X^a)^2 \le E X^{a-i} \, E X^{a+i}$, so that the moments $E X^n$ are a log-convex sequence. Since we have the claimed upper bound for every $k$ and the lower bound for powers of 2, the lower bound follows for all $k$. More precisely, let $a \in [k, 2k]$ be a power of 2, and let $b = 2a - k$. Log-convexity gives
$$E_{\mu,0}\bigl[L(x)^k\bigr] \ge \frac{E_{\mu,0}\bigl[L(x)^a\bigr]^2}{E_{\mu,0}\bigl[L(x)^b\bigr]}.$$
Applying (6.10) to control the numerator and (6.7) to control the denominator yields the lower bound for arbitrary $k$.
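The interpolation step can be illustrated on a toy distribution. The sketch below picks the smallest power of two $a \ge \max(k, 2)$ (one admissible choice of $a \in [k, 2k]$), sets $b = 2a - k$, and checks the log-convexity bound $E X^k \ge (E X^a)^2 / E X^b$ with exact rational arithmetic. The function names and the example distribution are assumptions made for illustration.

```python
from fractions import Fraction


def interpolation_exponents(k):
    """Return (a, b): a is the smallest power of two with a >= max(k, 2), b = 2a - k."""
    a = 2
    while a < k:
        a *= 2
    return a, 2 * a - k


def moment(dist, j):
    """E[X^j] for a finitely supported law given as a dict value -> probability."""
    return sum(Fraction(v) ** j * p for v, p in dist.items())


def interpolated_lower_bound(dist, k):
    """The log-convexity lower bound (E X^a)^2 / E X^b for E X^k."""
    a, b = interpolation_exponents(k)
    return moment(dist, a) ** 2 / moment(dist, b)


if __name__ == "__main__":
    toy = {0: Fraction(1, 2), 1: Fraction(1, 4), 3: Fraction(1, 4)}
    for k in range(1, 8):
        print(k, float(moment(toy, k)), float(interpolated_lower_bound(toy, k)))
```

When $k$ is itself a power of two at least 2, the exponents collapse to $a = b = k$ and the bound is an equality.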
Proof of Proposition 6.1 By Zhu's Theorem, it suffices to prove that $P_{\mu,0}(L(x) \ge n \mid L(x) > 0) = \exp\bigl[-\Theta\bigl(\min\{\sqrt{n}, n / \log \langle x \rangle\}\bigr)\bigr]$ for every $x \in \mathbb{Z}^4$ and $n \ge 1$. Moreover, Zhu's Theorem and Proposition 6.3 imply that there exist positive constants $c_1$ and $C_1$ such that for every $x \in \mathbb{Z}^4$ and $k \ge 1$, and hence that there exist positive constants $c_2$ and $C_2 \ge 1$ such that (6.11) for every $x \in \mathbb{Z}^4$ and $k \ge 1$.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.