Local unimodularity
We now introduce the notion of local unimodularity. This definition plays a central role in our proofs, and we expect that it will have several further applications in the future.
We define \({\mathcal {G}}_\bullet ^\diamond \) to be the space of isomorphism classes of triples (g, a, u), where (g, u) is a rooted graph and a is a distinguished set of vertices of g (this notation is not standard). The local topology on \({\mathcal {G}}_\bullet ^\diamond \) is defined in an analogous way to that on \({\mathcal {G}}_\bullet \), so that (g, a, u) and \((g',a',u')\) are close in the local topology if there exists a large r and an isomorphism of rooted graphs \(\phi \) from the r-ball around u in g to the r-ball around \(u'\) in \(g'\) such that the intersection of \(a'\) with the r-ball around \(u'\) is equal to the image under \(\phi \) of the restriction of a to the r-ball around u. The doubly rooted space \({\mathcal {G}}_{\bullet \bullet }^\diamond \) and the local topology on this space are defined analogously. It follows by a similar argument to that of [18, Theorem 2] that \({\mathcal {G}}_\bullet ^\diamond \) and \({\mathcal {G}}_{\bullet \bullet }^\diamond \) are Polish spaces. We write \({\mathcal {T}}_\bullet ^\diamond \) and \({\mathcal {T}}_{\bullet \bullet }^\diamond \) for the closed subspaces of \({\mathcal {G}}_\bullet ^\diamond \) and \({\mathcal {G}}_{\bullet \bullet }^\diamond \) in which the underlying graph is a tree.
We say that a random variable \((G,A,\rho )\) taking values in \({\mathcal {G}}_\bullet ^\diamond \) is locally unimodular if \(\rho \in A\) almost surely and
$$\begin{aligned} {\mathbb {E}}\left[ \sum _{v\in A} F(G,A,\rho ,v)\right] = {\mathbb {E}}\left[ \sum _{v\in A} F(G,A,v,\rho )\right] \end{aligned}$$
for every measurable function \(F : {\mathcal {G}}_{\bullet \bullet }^\diamond \rightarrow [0,\infty ]\). (Note that the first condition is in fact redundant, being implied by the second.) We say that a probability measure \(\mu \) on \({\mathcal {G}}_\bullet ^\diamond \) is locally unimodular if a random variable with law \(\mu \) is locally unimodular. We write \({\mathcal {L}}({\mathcal {G}}_\bullet ^\diamond )\) for the space of locally unimodular probability measures on \({\mathcal {G}}_\bullet ^\diamond \) with the weak topology.
For example, if \((G,\rho )\) is a unimodular random rooted graph and \(\omega \) is a unimodular percolation process on G (i.e., \(\omega \) is a random subgraph of G such that \((G,\omega ,\rho )\) is unimodular in an appropriate sense) and \(K_\rho \) is the component of \(\rho \) in \(\omega \) then \((G,K_\rho ,\rho )\) is locally unimodular. We stress however that locally unimodular random rooted graphs need not arise this way, and indeed that the set A need not be connected. For example, if G is an arbitrary connected, locally finite graph, A is an arbitrary finite set of vertices of G, and \(\rho \) is chosen uniformly at random from among the vertices of A then the triple \((G,A,\rho )\) is locally unimodular. More generally, we have the intuition that \((G,A,\rho )\) is locally unimodular if and only if \(\rho \) is ‘uniformly distributed on A’. (Of course, this intuitive definition does not make formal sense when A is infinite.)
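This last intuition can be sanity-checked computationally. In the following minimal sketch (an illustration only; the finite set A and the test function F are arbitrary hypothetical choices, not objects from the text), the graph and the set are fixed, so a function on \({\mathcal {G}}_{\bullet \bullet }^\diamond \) reduces to a function of the two roots, and the mass-transport identity for a uniform root reduces to exchanging two finite sums:

```python
# A toy check of the intuition above: with the underlying graph and the
# finite set A fixed, a test function F on doubly rooted triples reduces
# to a function of the two roots alone. With rho uniform on A, both sides
# of the mass-transport identity are (1/|A|) times a sum over ordered
# pairs of points of A, so they agree for every F.
A = [0, 2, 3]  # a hypothetical finite set of vertices

def F(u, v):
    # An arbitrary asymmetric test function of the two roots.
    return (u + 1) * (3 * v + 2)

# E[sum_{v in A} F(rho, v)] with rho uniform on A:
lhs = sum(F(rho, v) for rho in A for v in A) / len(A)
# E[sum_{v in A} F(v, rho)]:
rhs = sum(F(v, rho) for rho in A for v in A) / len(A)
assert abs(lhs - rhs) < 1e-9
```

The two sides agree exactly because they range over the same ordered pairs of A, which is precisely why a uniform root on a finite set is locally unimodular.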
It follows by a similar argument to [18, Theorem 8] that \({\mathcal {L}}({\mathcal {G}}_\bullet ^\diamond )\) is a closed subset of the space of all probability measures on \({\mathcal {G}}_\bullet ^\diamond \) with respect to the weak topology. Thus, if \((G_n,A_n,\rho _n)\) is a sequence of locally unimodular \({\mathcal {G}}_\bullet ^\diamond \) random variables converging in distribution to \((G,A,\rho )\), then \((G,A,\rho )\) is also locally unimodular.
As before, it will be convenient for us to introduce the following more general notion. We say that a random variable \((G,A,\rho )\) taking values in \({\mathcal {G}}_\bullet ^\diamond \) is locally quasi-unimodular if there exists a measurable function \(W:{\mathcal {G}}_\bullet ^\diamond \rightarrow (0,\infty )\) such that \({\mathbb {E}}[W(G,A,\rho )]=1\) and
$$\begin{aligned} {\mathbb {E}}\left[ W(G,A,\rho )\sum _{v\in A} F(G,A,\rho ,v)\right] = {\mathbb {E}}\left[ W(G,A,\rho )\sum _{v\in A} F(G,A,v,\rho )\right] \end{aligned}$$
for every measurable function \(F:{\mathcal {G}}_{\bullet \bullet }^\diamond \rightarrow [0,\infty ]\); in this case we say that \((G,A,\rho )\) is locally quasi-unimodular with weight W. Equivalently, \((G,A,\rho )\) is locally quasi-unimodular if and only if there exists a locally unimodular \((G',A',\rho ')\) whose law is equivalent to that of \((G,A,\rho )\) in the sense that both measures are absolutely continuous with respect to each other; the weight W is the Radon-Nikodym derivative of the law of \((G',A',\rho ')\) with respect to the law of \((G,A,\rho )\). (We expect that the weight W has similar uniqueness properties to those discussed in Remark 2.1. We do not pursue this here.)
Our interest in these notions stems from the following proposition, which gives conditions under which local unimodularity can be pulled back or pushed forward through a unimodular tree-indexed random walk.
Proposition 3.1
(Local unimodularity via tree-indexed walks)
1. Pull-back. Let \((G,A,\rho )\) be a locally unimodular random rooted graph and let (T, o) be an independent unimodular random rooted tree. Let X be a T-indexed random walk on G with \(X(o)=\rho \). If \({\mathbb {E}}[\deg _G(\rho )] < \infty \) then \((T,X^{-1}(A),o)\) is locally quasi-unimodular with weight
$$\begin{aligned} W\bigl (T,X^{-1}(A),o\bigr ):=\frac{{\mathbb {E}}\left[ \deg _G(\rho )\mid \bigl (T,X^{-1}(A),o\bigr )\right] }{{\mathbb {E}}\left[ \deg _G(\rho )\right] }. \end{aligned}$$
2. Push-forward. Let \((G,\rho )\) be a unimodular random rooted graph and let (T, A, o) be an independent locally unimodular random rooted tree. Let X be a T-indexed random walk on G with \(X(o)=\rho \). If X is transient almost surely and \({\mathbb {E}}[\deg _G(\rho )(\# X^{-1}(\rho ))^{-1}] < \infty \) then \((G,X(A),\rho )\) is locally quasi-unimodular with weight
$$\begin{aligned} W(G,X(A),\rho ):= \frac{{\mathbb {E}}\left[ \deg _G(\rho )(\#X^{-1}(\rho ))^{-1} \mid (G,X(A),\rho )\right] }{{\mathbb {E}}\left[ \deg _G(\rho )(\# X^{-1}(\rho ))^{-1}\right] }. \end{aligned}$$
Proof of Proposition 3.1
For each \((g,x) \in {\mathcal {G}}_{\bullet }\) and \((t,u) \in {\mathcal {T}}_\bullet \) we let \({\mathbf {P}}_{u,x}^{t,g}\) and \({\mathbf {E}}_{u,x}^{t,g}\) denote probabilities and expectations taken with respect to the law of a t-indexed random walk X on g started with \(X(u)=x\), which we consider to be a random graph homomorphism from t to g. Observe that tree-indexed random walk has the following time-reversal property: If \((g,x,y) \in {\mathcal {G}}_{\bullet \bullet }\) and \((t,u,v) \in {\mathcal {T}}_{\bullet \bullet }\), then we have that
$$\begin{aligned} \deg (x) {\mathbf {P}}_{u,x}^{t,g}\bigl (X(v)=y\bigr )&= \deg (y) {\mathbf {P}}_{v,y}^{t,g}\bigl (X(u)=x\bigr ) \end{aligned}$$
(3.1)
and that
$$\begin{aligned} {\mathbf {P}}_{u,x}^{t,g}\bigl (X \in {\mathscr {A}}\mid X(v) = y\bigr )&= {\mathbf {P}}_{y,x}^{t,g}\bigl (X \in {\mathscr {A}}\mid X(u) = x\bigr ) \end{aligned}$$
(3.2)
for every event \({\mathscr {A}}\). That is, the conditional distribution of X given \(\{ X(u)=x, X(v)=y\}\) is the same under the two measures \({\mathbf {P}}_{u,x}^{t,g}\) and \({\mathbf {P}}_{v,y}^{t,g}\). Both statements follow immediately from the analogous statements for simple random walk, which are classical. Indeed, \({\mathbf {P}}_{u,x}^{t,g}\bigl (X(v)=y\bigr )\) is equal to \(p_{d(u,v)}(x,y)\), so that (3.1) follows from the standard time-reversal identity \(\deg (x) p_n(x,y) = \deg (y) p_n(y,x)\). To prove (3.2), observe that, under both measures, the conditional distribution of X given \(X(u)=x\) and \(X(v)=y\) is given by taking the restriction of X to the geodesic connecting u and v in t to be a uniformly random path of length d(u, v) from x to y in g, and then extending X to the rest of t in the natural Markovian fashion.
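The classical identity \(\deg (x) p_n(x,y) = \deg (y) p_n(y,x)\) invoked for (3.1) can also be checked numerically. The following sketch (illustrative only; the small star graph is an arbitrary choice) verifies that \(\deg (x)p_n(x,y)\) is symmetric in x and y:

```python
import numpy as np

# Numerical sanity check of the classical reversibility identity
# deg(x) p_n(x, y) = deg(y) p_n(y, x) behind (3.1), on an arbitrary small
# graph (a star: vertex 1 joined to vertices 0, 2, 3).
adj = np.array([
    [0, 1, 0, 0],
    [1, 0, 1, 1],
    [0, 1, 0, 0],
    [0, 1, 0, 0],
], dtype=float)
deg = adj.sum(axis=1)
P = adj / deg[:, None]              # simple random walk transition matrix
Pn = np.linalg.matrix_power(P, 5)   # 5-step transition probabilities

M = deg[:, None] * Pn               # M[x, y] = deg(x) * p_5(x, y)
assert np.allclose(M, M.T)          # symmetric in (x, y), as claimed
```

The identity holds because the simple random walk is reversible with respect to the degree measure, which is also the mechanism behind (3.1).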
Proof of item 1. Write \({\mathbb {E}}_G\) for expectations taken with respect to \((G,A,\rho )\) and \({\mathbb {E}}_T\) for expectations taken with respect to (T, o). Let \(F: {\mathcal {G}}_{\bullet \bullet }^\diamond \rightarrow [0,\infty ]\) be measurable, and define \(f: {\mathcal {G}}_{\bullet \bullet }^\diamond \rightarrow [0,\infty ]\) by
$$\begin{aligned} f(g,a,x,y)&= {\mathbb {E}}_T\left[ \sum _{v \in V(T)}{\mathbf {E}}_{o,x}^{T,g}\left[ F\bigl (T,X^{-1}(a),o,v\bigr ) \mathbb {1}\bigl (v \in X^{-1}(y)\bigr ) \right] \right] . \end{aligned}$$
Observe that we can equivalently write f as
$$\begin{aligned} f(g,a,x,y)&= {\mathbb {E}}_T\left[ \sum _{v \in V(T)}{\mathbf {E}}_{v,x}^{T,g}\left[ F\bigl (T,X^{-1}(a),v,o\bigr ) \mathbb {1}\bigl (o \in X^{-1}(y)\bigr ) \right] \right] \nonumber \\&=\frac{\deg (y)}{\deg (x)}{\mathbb {E}}_T\left[ \sum _{v \in V(T)}{\mathbf {E}}_{o,y}^{T,g}\left[ F\bigl (T,X^{-1}(a),v,o\bigr ) \mathbb {1}\bigl (v \in X^{-1}(x)\bigr ) \right] \right] \end{aligned}$$
(3.3)
where the first equality follows from the mass-transport principle for (T, o) and the second follows from the time-reversal identities (3.1) and (3.2). On the other hand, we have that
$$\begin{aligned}&{\mathbb {E}}\left[ \sum _{v \in X^{-1}(A)} \deg (\rho ) F\bigl (T,X^{-1}(A),o,v\bigr ) \right] \\&\quad = {\mathbb {E}}_G\left[ \sum _{y \in A} \deg (\rho ) f(G,A,\rho ,y) \right] = {\mathbb {E}}_G\left[ \sum _{y \in A} \deg (y)f(G,A,y,\rho ) \right] , \end{aligned}$$
where the first equality is by definition and the second follows from the mass-transport principle for \((G,A,\rho )\). Applying (3.3) we deduce that
$$\begin{aligned}&{\mathbb {E}}\left[ \sum _{v \in X^{-1}(A)} \deg (\rho ) F\bigl (T,X^{-1}(A),o,v\bigr ) \right] \\&\quad ={\mathbb {E}}_G \left[ \sum _{y\in A}\deg (\rho ){\mathbb {E}}_T\left[ \sum _{v \in V(T)}{\mathbf {E}}_{o,\rho }^{T,G}\left[ F\bigl (T,X^{-1}(A),v,o\bigr ) \mathbb {1}\bigl (v \in X^{-1}(y)\bigr ) \right] \right] \right] \\&\quad = {\mathbb {E}}\left[ \sum _{v \in X^{-1}(A)} \deg (\rho ) F\bigl (T,X^{-1}(A),v,o\bigr ) \right] . \end{aligned}$$
Since the measurable function \(F:{\mathcal {G}}_{\bullet \bullet }^\diamond \rightarrow [0,\infty ]\) was arbitrary, this concludes the proof.
Proof of item 2. Write \({\mathbb {E}}_G\) for expectations taken with respect to \((G,\rho )\) and \({\mathbb {E}}_T\) for expectations taken with respect to (T, A, o). Let \(F:{\mathcal {G}}_{\bullet \bullet }^\diamond \rightarrow [0,\infty ]\) be measurable, and for each \((t,a,u,v) \in {\mathcal {T}}_{\bullet \bullet }^\diamond \), define
$$\begin{aligned}&f(t,a,u,v)\nonumber \\&\quad = {\mathbb {E}}_G\left[ \sum _{y \in V(G)} \deg (\rho ) {\mathbf {E}}_{u,\rho }^{t,G} \left[ |X^{-1}(\rho )|^{-1} |X^{-1}(y)|^{-1} F(G,X(a),\rho ,y) \mathbb {1}(X(v)=y) \right] \right] \nonumber \\&\quad = {\mathbb {E}}_G\left[ \sum _{y \in V(G)} \deg (y){\mathbf {E}}_{u,y}^{t,G} \left[ |X^{-1}(\rho )|^{-1} |X^{-1}(y)|^{-1}F(G,X(a),y,\rho ) \mathbb {1}(X(v)=\rho ) \right] \right] \nonumber \\&\quad = {\mathbb {E}}_G\left[ \sum _{y \in V(G)}\deg (\rho ) {\mathbf {E}}_{v,\rho }^{t,G} \left[ |X^{-1}(\rho )|^{-1} |X^{-1}(y)|^{-1}F(G,X(a),y,\rho ) \mathbb {1}(X(u)=y) \right] \right] , \end{aligned}$$
(3.4)
where, as before, the first equality follows from the mass-transport principle for \((G,\rho )\) and the second equality follows from the time-reversal identities (3.1) and (3.2). Taking expectations over (T, A, o), we deduce that
$$\begin{aligned}&{\mathbb {E}}\left[ \deg (\rho )|X^{-1}(\rho )| \sum _{y\in X(A)} F(G,X(A),\rho ,y)\right] \\&\quad ={\mathbb {E}}\left[ \sum _{y\in V(G)} \deg (\rho ) \sum _{v\in A}|X^{-1}(\rho )||X^{-1}(y)|^{-1} F(G,X(A),\rho ,y) \mathbb {1}(X(v)=y)\right] \\&\quad = {\mathbb {E}}_T\left[ \sum _{v\in A}f(T,A,o,v)\right] = {\mathbb {E}}_T\left[ \sum _{v\in A}f(T,A,v,o)\right] , \end{aligned}$$
where the first and second equalities are by definition and the third is by the mass-transport principle for (T, A, o). Applying (3.4) we deduce that
$$\begin{aligned}&{\mathbb {E}}\left[ \deg (\rho )|X^{-1}(\rho )| \sum _{y\in X(A)} F(G,X(A),\rho ,y)\right] \\&\quad = {\mathbb {E}}_T\left[ \sum _{v\in A} {\mathbb {E}}_G \left[ \sum _{y \in V(G)} \deg (\rho ) {\mathbf {E}}_{o,\rho }^{T,G} \left[ |X^{-1}(\rho )|^{-1}|X^{-1}(y)|^{-1}F(G,X(A),y,\rho ) \mathbb {1}(X(v)=y) \right] \right] \right] \\&\quad = {\mathbb {E}}\left[ \deg (\rho )|X^{-1}(\rho )| \sum _{y\in X(A)} F(G,X(A),y,\rho )\right] . \end{aligned}$$
The claim follows since the measurable function \(F:{\mathcal {G}}_{\bullet \bullet }^\diamond \rightarrow [0,\infty ]\) was arbitrary. \(\square \)
Note that the weight that arises when pulling back is identically equal to 1 when G is a deterministic transitive graph. Moreover, pushing forward \(A=V(T)\), it follows that if \({\mathbb {E}}[\deg (\rho )\bigl (\#X^{-1}(\rho )\bigr )^{-1}]<\infty \) then \((G,X(V(T)),\rho )\) is locally quasi-unimodular with weight
$$\begin{aligned} W\bigl (G,X(V(T)),\rho \bigr ) = \frac{ {\mathbb {E}}\left[ \deg _G(\rho )\bigl (\#X^{-1}(\rho )\bigr )^{-1} \mid \bigl (G,X(V(T)),\rho \bigr )\right] }{{\mathbb {E}}\left[ \deg _G(\rho )(\#X^{-1}(\rho ))^{-1}\right] }. \end{aligned}$$
This is very closely related to Proposition 2.2. Pulling this set back along a second tree-indexed walk, we therefore deduce the following immediate corollary.
Corollary 3.2
Let G be a connected, locally finite, unimodular transitive graph and let \(\rho \) be a vertex of G. For each \(i \in \{1,2\}\), let \((T_i,o_i)\) be a unimodular random rooted tree and let \(X_i\) be a \(T_i\)-indexed random walk on G with \(X_i(o_i)=\rho \), where we take the random variables \(((T_1,o_1),X_1)\) and \(((T_2,o_2),X_2)\) to be independent. Let \(I=X_1^{-1}(X_2(V(T_2))) \subseteq V(T_1)\). If \(X_2\) is almost surely transient, then the random triple \((T_1,I,o_1)\) is locally quasi-unimodular with weight
$$\begin{aligned} W(T_1,I,o_1) = \frac{{\mathbb {E}}\left[ \bigl (\#X_2^{-1}(\rho )\bigr )^{-1} \mid \bigl (T_1,I,o_1\bigr )\right] }{{\mathbb {E}}\left[ \bigl (\#X_2^{-1}(\rho )\bigr )^{-1}\right] }. \end{aligned}$$
Note that this corollary has a straightforward extension to the case that \((G,\rho )\) is a unimodular random rooted graph or network. (Indeed, one can even consider the case that G carries two different network structures, one for each walk, in a jointly unimodular fashion.)
Ends in locally unimodular random trees via the Magic Lemma
Recall that an infinite graph G is said to be k-ended (or that G has k ends) if the maximum number of infinite connected components that can be obtained by deleting a finite set of vertices from G is equal to k. It is a well-known fact that a Benjamini-Schramm limit of finite trees (i.e., a distributional limit of finite trees each rooted at a uniform random vertex) is either finite or has at most two ends.
There are several ways to prove this (see e.g. [18, Theorem 13]), and several far-reaching generalizations of this fact can be found in [2, 3, 13].
Our next result shows that this fact also has a local version, from which we will deduce Theorems 1.1 and 1.2 in the next subsection. Given a graph G and an infinite set of vertices A in G, we say that A is k-ended if the maximum number of connected components having infinite intersection with A that can be obtained by deleting a finite set of vertices from G is equal to k. (In particular, if T is a tree, then an infinite set of vertices A in T is k-ended if and only if it accumulates to exactly k ends of T.)
Theorem 3.3
Let \(((T_n,A_n,o_n))_{n\ge 1}\) be a sequence of locally unimodular random rooted trees converging in distribution to some random variable (T, A, o) as \(n\rightarrow \infty \). If \(A_n\) is finite almost surely for every \(n \ge 1\), then A is either finite, one-ended, or two-ended almost surely.
We will deduce Theorem 3.3 as a corollary of Theorem 3.4, below. This theorem is a version of the Magic Lemma of Benjamini and Schramm [13, Lemma 2.3], see also [44, Section 5.2]. Indeed, while the usual statement of the Magic Lemma concerns sets of points in \({\mathbb {R}}^d\), its proof is powered by a more fundamental fact about trees, which is implicit in the original proof (see in particular [44, Claim 5.5]) and is essentially equivalent to Theorem 3.4. We include a full proof for clarity, and since the statement we give is slightly different. We remark that the Magic Lemma has found diverse applications to several different problems in probability [13, 27, 30, 36], and useful generalizations of the Magic Lemma to doubling metric spaces [26] and to Gromov hyperbolic spaces [36] have also been found.
Let T be a locally finite tree and let A be a finite set of vertices of T. For each pair of distinct vertices u, v in T, let \(A_{u,v}\) be the set of vertices \(a\in A \setminus \{v\}\) such that the unique simple path from u to a in T passes through v. We say that a vertex u of T is (k, r)-branching for A if \(|A|-|A_{u,v} \cup A_{u,w}| \ge k\) for every pair of vertices v, w with distance exactly r from u.
Theorem 3.4
(Magic lemma for trees) Let T be a locally finite tree and let A be a finite set of vertices of T. Then for each \(k,r \ge 1\), there are at most \(r(2|A|-k)/k\) vertices of T that are (k, r)-branching for A.
Proof
By attaching an infinite path to T if necessary, we may assume without loss of generality that T is infinite. We may then pick an orientation of T so that every vertex v of T has exactly one distinguished neighbour, which we call the parent of v and denote by \(\sigma (v)\). This leads to a decomposition \((L_n)_{n \in {\mathbb {Z}}}\) of T into layers, unique up to a shift of index, such that the parent of every vertex in \(L_n\) lies in \(L_{n-1}\) for every \(n \in {\mathbb {Z}}\). These levels are sometimes known as horocycles, see e.g. [48, Section II.12.C]. (It may be that \(L_n =\emptyset \) for every n larger than some \(n_0\), but this possibility will not cause us any problems.) We denote by \(\sigma ^r\) the r-fold iteration of \(\sigma \), so that if \(v\in L_{n}\) then \(\sigma ^r(v) \in L_{n-r}\). We call u a descendant of v, and call v an ancestor of u, if \(v=\sigma ^r(u)\) for some \(r\ge 0\). For each vertex v of T, we let \(A_v\) be the set of vertices in \(A \setminus \{v\}\) that are descendants of v.
We say that a vertex v is (k, r)-supported if \(|A_{v}|-|A_{w}| \ge k\) for every w with \(\sigma ^r(w)=v\). Observe that for every vertex u and every w with \(\sigma ^r(w)=u\), we have that \(A_w = A_{u,w} \subseteq A_u\) and that \(A_u \subseteq A \setminus A_{u,\sigma ^r(u)}\), so that \(|A_u|-|A_w| \ge |A| - |A_{u,\sigma ^r(u)} \cup A_{u,w}|\). Thus, every (k, r)-branching vertex is (k, r)-supported, and it suffices to prove that there exist at most \(r(2|A|-k)/k\) vertices that are (k, r)-supported. We may assume that \(|A|\ge k\), since otherwise there cannot be any (k, r)-supported vertices and the claim holds vacuously.
We begin with the case \(r=1\). We follow closely the proof of [44, Claim 5.5]. Let V be the vertex set of T, and let B be the set of (k, 1)-supported points. Define a function \(f: V^2 \rightarrow {\mathbb {R}}\) by
$$\begin{aligned} f(u,v) = {\left\{ \begin{array}{ll} |A_u| \wedge \frac{k}{2} &{} v = \sigma (u)\\ -|A_v|\wedge \frac{k}{2} &{} u = \sigma (v)\\ 0 &{} \text {otherwise}. \end{array}\right. } \end{aligned}$$
This function is antisymmetric in the sense that \(f(u,v)=-f(v,u)\) for every \(u,v \in V\). We observe that
$$\begin{aligned} 0 \le f(u,\sigma (u)) {=} \frac{k}{2}\wedge \sum _{v: \sigma (v)=u}\left[ \mathbb {1}(v \in A)+f(v,u)\right] {\le } \sum _{v: \sigma (v)=u}\left[ \mathbb {1}(v \in A)+f(v,u)\right] \end{aligned}$$
for every \(u \in V\), as can be verified by splitting into two cases according to whether u has a child v with \(|A_v|\ge k/2\) or not. Moreover, if u is (k, 1)-supported then
$$\begin{aligned} f(u,\sigma (u)) \le \sum _{v: \sigma (v)=u}\left[ \mathbb {1}(v \in A)+f(v,u)\right] - \frac{k}{2}, \end{aligned}$$
where the inequality may be verified by splitting into three cases according to whether u has zero, one, or more than one child v with \(|A_v| \ge k/2\).
Let S be the finite set spanned by the union of the geodesics between pairs of points in A. Observe that \(B \cup A \subseteq S\) and that if \(v\notin S\) then \(A_v \in \{A,\emptyset \}\). Note also that there is a unique vertex \(\rho \in S\) such that every vertex of S is descended from \(\rho \), and this vertex \(\rho \) satisfies \(A = A_{\sigma (\rho )}\). Let \(S' = S \cup \{\sigma (\rho )\}\). We may sum the above estimates to obtain that
$$\begin{aligned}&|A|-\frac{k}{2}|B| + \sum _{u \in S'} \left[ \sum _{v: \sigma (v)=u}f(v,u) - f(u,\sigma (u))\right] \\&\quad = \sum _{u \in S'} \left[ \sum _{v: \sigma (v)=u}\left[ \mathbb {1}(v \in A)+f(v,u)\right] - f(u,\sigma (u)) - \frac{k}{2}\mathbb {1}\bigl (u\in B)\right] \ge 0. \end{aligned}$$
On the other hand, using the antisymmetry property of f and rearranging we obtain that
$$\begin{aligned} \sum _{u \in S'} \left[ \sum _{v: \sigma (v)=u}f(v,u) - f(u,\sigma (u))\right]&= \sum _{v \notin S', \sigma (v) \in S'} f(v,\sigma (v)) \\&\quad + \sum _{u,v\in S'} f(u,v) - f(\sigma (\rho ),\sigma ^2(\rho ))\\&=-f(\sigma (\rho ),\sigma ^2(\rho )) = - \frac{k}{2}, \end{aligned}$$
so that \(\frac{k}{2} |B| \le |A|-\frac{k}{2}\) as claimed.
Now let \(r\ge 2\). We will deduce the bound in this case from the \(r=1\) bound by constructing an auxiliary tree corresponding to each residue class mod r. For each \(1 \le m \le r\), let \(R_m = \bigcup _{n\in {\mathbb {Z}}} L_{nr+m}\) and let \(T_m\) be the tree constructed from T by connecting each vertex in a level of the form \(L_{nr + m}\) to all of its descendants in \(\bigcup _{\ell =1}^r L_{nr+m+\ell }\). Thus, \(T_m\) has the same vertex set as T, and every vertex not in \(R_m\) is a leaf in \(T_m\). Observe that if a vertex \(v\in R_m\) is (k, r)-supported in T then it is (k, 1)-supported in \(T_m\). For each \(1\le m \le r\) we know that there are at most \((2|A|-k)/k\) such vertices, and the claim follows by summing over m. \(\square \)
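The bound of Theorem 3.4 is easy to check by brute force on small instances. The following sketch is an illustration only: the tree, the set A, and the parameters (k, r) are arbitrary hypothetical choices, and the enumeration relies on the elementary fact that, in a tree, the path from u to a passes through v exactly when \(d(u,a)=d(u,v)+d(v,a)\).

```python
from collections import deque

# Brute-force count of (k, r)-branching vertices on one small example tree,
# compared against the bound r(2|A| - k)/k of Theorem 3.4.
edges = [(0, 1), (0, 2), (1, 3), (1, 4), (2, 5), (2, 6),
         (3, 7), (4, 8), (5, 9), (6, 10)]
num_vertices = 11
adj = [[] for _ in range(num_vertices)]
for a, b in edges:
    adj[a].append(b)
    adj[b].append(a)

def bfs_dists(s):
    # Graph distances from s, computed by breadth-first search.
    d = {s: 0}
    queue = deque([s])
    while queue:
        x = queue.popleft()
        for y in adj[x]:
            if y not in d:
                d[y] = d[x] + 1
                queue.append(y)
    return d

D = [bfs_dists(s) for s in range(num_vertices)]
A = {3, 4, 7, 8, 9, 10}
k, r = 2, 1

def A_uv(u, v):
    # The set A_{u,v}: points of A \ {v} whose path from u passes through v.
    return {a for a in A if a != v and D[u][a] == D[u][v] + D[v][a]}

def is_branching(u):
    sphere = [v for v in range(num_vertices) if D[u][v] == r]
    return all(len(A - (A_uv(u, v) | A_uv(u, w))) >= k
               for v in sphere for w in sphere)

count = sum(is_branching(u) for u in range(num_vertices))
# Bound of Theorem 3.4: r(2|A| - k)/k = 1 * (2*6 - 2) / 2 = 5.
assert count <= r * (2 * len(A) - k) / k
```

In this particular instance the count happens to attain the bound exactly, so the inequality cannot be improved for this choice of parameters.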
Proof of Theorem 3.3
Let (T, A, o) be locally unimodular and suppose that A is almost surely finite. Let \(k,r \ge 1\) and let \(B_{k,r}\) be the set of vertices of T that are (k, r)-branching for A. Considering the function \(F:{\mathcal {T}}_{\bullet \bullet }^\diamond \rightarrow [0,\infty ]\) defined by \(F(g,a,u,v) = \mathbb {1}\bigl (v \text { is } (k,r)\text {-branching for } a\bigr )/|a|\), and applying the mass-transport principle, we obtain that
$$\begin{aligned} {\mathbb {E}}\left[ |B_{k,r} \cap A|/|A|\right] = {\mathbb {E}}\left[ \sum _{v\in A}F(T,A,o,v)\right] = {\mathbb {E}}\left[ \sum _{v\in A}F(T,A,v,o)\right] = {\mathbb {P}}(o \in B_{k,r}). \end{aligned}$$
Applying Theorem 3.4 to bound the left hand side, we obtain that
$$\begin{aligned} {\mathbb {P}}(o \in B_{k,r}) \le \frac{2r}{k} \end{aligned}$$
(3.5)
for every \(k,r\ge 1\) and every locally unimodular triple (T, A, o) such that A is almost surely finite.
Now observe that for each \(k,r\ge 1\), the set of \((t,a,u) \in {\mathcal {T}}_\bullet ^\diamond \) such that u is (k, r)-branching for a is open with respect to the local topology on \({\mathcal {T}}_\bullet ^\diamond \). It follows by the portmanteau theorem that the map \(\mu \mapsto \mu (\{(t,a,u): u \text { is } (k,r)\text {-branching for } a\})\) is weakly lower semi-continuous on the space of probability measures on \({\mathcal {T}}_\bullet ^\diamond \). We deduce that if \(((T_n,A_n,o_n))_{n\ge 1}\) and (T, A, o) are as in the statement of the theorem then
$$\begin{aligned} {\mathbb {P}}\Bigl (o\text { is }(k,r)\text {-branching for }A\Bigr ) \le \liminf _{n\rightarrow \infty } {\mathbb {P}}\Bigl (o_n\text { is }(k,r)\text {-branching for } A_n\Bigr ) \le \frac{2r}{k} \end{aligned}$$
(3.6)
for every \(r,k \ge 1\). This is a quantitative refinement of the statement of the theorem: If A is infinite with more than two ends then there exists a vertex v of T whose removal disconnects T into at least three connected components that have infinite intersection with A. If there is such a vertex within distance r of o, then o is (k, r)-branching for every \(k\ge 1\). The estimate (3.6) implies that this event has probability zero for every \(r \ge 1\), and the claim follows. \(\square \)
Completing the proof
We now deduce Theorems 1.1 and 1.2 from Theorem 3.3. We begin with the following simple lemma.
Lemma 3.5
Let G be a transitive nonamenable graph with spectral radius \(\Vert P\Vert <1\), and let \(\mu _1,\mu _2\) be offspring distributions with \(\overline{\mu _1},\overline{\mu _2} \le \Vert P\Vert ^{-1}\), where the inequality is strict for at least one of the two. Let x, y be vertices of G. Then a \(\mu _1\)-BRW started at x and an independent \(\mu _2\)-BRW started at y intersect at most finitely often almost surely.
Proof of Lemma 3.5
For \(i=1,2\), let \(T_i\) be a \(\mu _i\)-Galton-Watson tree with root \(o_i\) and let \(X_i\) be a random walk on G indexed by \(T_i\), started at x when \(i=1\) and y when \(i=2\), where the pair \((T_1,X_1)\) is independent of \((T_2,X_2)\). Let \(V_i\) be the vertex set of \(T_i\). The expected number of vertices of \(T_i\) with distance exactly n from \(o_i\) is \(\overline{\mu _i}^n\), and we can compute that
$$\begin{aligned} {\mathbb {E}}\left[ \#\{(u,v) \in V_1 \times V_2 : X_1(u)=X_2(v)\} \right]&= \sum _{z \in V(G)} \sum _{n,m \ge 0} \overline{\mu _1}^n p_n(x,z) \overline{\mu _2}^m p_m(y,z)\\&= \sum _{z \in V(G)} \sum _{n,m \ge 0} \overline{\mu _1}^n p_n(x,z) \overline{\mu _2}^m p_m(z,y)\\&= \sum _{n,m \ge 0} \overline{\mu _1}^n\overline{\mu _2}^m p_{n+m}(x,y). \end{aligned}$$
Since \(\overline{\mu _1},\overline{\mu _2} \le \Vert P\Vert ^{-1}\) and this inequality is strict for at least one of \(i=1,2\), it follows by an elementary calculation that there exists a constant C such that
$$\begin{aligned} {\mathbb {E}}\left[ \#\{(u,v) \in V_1 \times V_2 : X_1(u)=X_2(v)\} \right] \le C \sum _{n \ge 0} \Vert P\Vert ^{-n}p_n(x,y). \end{aligned}$$
The right-hand side is finite by [48, Theorem 7.8], concluding the proof. (Note that we do not need to invoke this theorem if we have both strict inequalities \(\overline{\mu _1},\overline{\mu _2} < \Vert P\Vert ^{-1}\), and in this case the claim holds for any bounded degree nonamenable graph.) \(\square \)
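The collapse of the double sum over z in the computation above uses two elementary facts valid on a transitive graph: \(p_m(y,z)=p_m(z,y)\) (since all degrees are equal) and the Chapman-Kolmogorov identity \(\sum _z p_n(x,z)p_m(z,y)=p_{n+m}(x,y)\). Both can be checked numerically; the following sketch (illustrative only; the 5-cycle is an arbitrary vertex-transitive example) does so:

```python
import numpy as np

# Check the two manipulations behind the display in the proof of Lemma 3.5,
# on a vertex-transitive graph: the 5-cycle.
n_vertices = 5
adj = np.zeros((n_vertices, n_vertices))
for i in range(n_vertices):
    adj[i, (i + 1) % n_vertices] = 1
    adj[i, (i - 1) % n_vertices] = 1
P = adj / adj.sum(axis=1, keepdims=True)

x, y, n, m = 0, 2, 3, 4
Pn = np.linalg.matrix_power(P, n)
Pm = np.linalg.matrix_power(P, m)

# p_m(y, z) = p_m(z, y): all degrees are equal, so P^m is symmetric.
assert np.allclose(Pm, Pm.T)
# Chapman-Kolmogorov collapses the sum over z to p_{n+m}(x, y).
lhs = sum(Pn[x, z] * Pm[y, z] for z in range(n_vertices))
rhs = np.linalg.matrix_power(P, n + m)[x, y]
assert abs(lhs - rhs) < 1e-12
```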
Given an offspring distribution \(\mu \) and \(p\in [0,1]\), let \(\mu ^p\) be the offspring distribution defined by
$$\begin{aligned} \mu ^p(k) = \sum _{n\ge k} \left( {\begin{array}{c}n\\ k\end{array}}\right) p^k(1-p)^{n-k}\mu (n), \end{aligned}$$
so that \(\overline{\mu ^p}=p \overline{\mu }\) and \(\mu ^p\) converges weakly to \(\mu =\mu ^1\) as \(p\uparrow 1\).
Proof of Theorem 1.2
First, observe that the claim is clearly equivalent to the corresponding claim concerning unimodular branching random walks. Moreover, it suffices to consider the case that \(x=y=\rho \), where \(\rho \) is some fixed root vertex of G. Indeed, if there exists some choice of starting vertices x and y so that the two walks intersect infinitely often with positive probability, then any choice of starting vertices must have this property, since there exist times n and m such that with positive probability the first walk has at least one particle at x at time n and the second walk has at least one particle at y at time m, and on this event we clearly have a positive conditional probability of having infinitely many intersections. We may also assume that the offspring distributions \(\mu _1,\mu _2\) have \(\overline{\mu _1},\overline{\mu _2}= \Vert P\Vert ^{-1}>1\), since otherwise the claim follows from Lemma 3.5. In particular, this implies that both \(\mu _1\) and \(\mu _2\) are non-trivial.
For each \(i\in \{1,2\}\) let \((T_i,o_i)\) be a unimodular Galton-Watson tree with offspring distribution \(\mu _i\), let \(X_i\) be a \(T_i\)-indexed random walk on G with \(X_i(o_i)=\rho \), and let \(U_i=(U_i(e))_{e\in E(T_i)}\) be a collection of i.i.d. uniform [0, 1] random variables indexed by the edge set of \(T_i\). We take \(X_i\) and \(U_i\) to be conditionally independent given \(T_i\) for each \(i=1,2\), and take the two random variables \(((T_1,o_1),X_1,U_1)\) and \(((T_2,o_2),X_2,U_2)\) to be independent of each other. We have by the results of [14, 24] that \(X_1\) and \(X_2\) are both transient almost surely. Let \(I = X_1^{-1}(X_2(V(T_2)))\). We wish to show that I is finite almost surely.
For each \(i\in \{1,2\}\) and \(p\in [0,1]\), let \(T_i^p\) be the component of \(o_i\) in the subgraph of \(T_i\) spanned by the edges e of \(T_i\) with \(U_i(e) \le p\). Let \(X_i^p\) be the restriction of \(X_i\) to \(T_i^p\). Then \((T_i^p,o_i)\) is a unimodular random tree, and \(X_i^p\) is distributed as a \(T_i^p\)-indexed random walk on G. Observe that we can alternatively sample a random variable whose law is equivalent (i.e., mutually absolutely continuous) to that of \((T_i^p,o_i)\) by taking two independent Galton-Watson trees with offspring distribution \(\mu _i^p\), attaching these trees by a single edge between their roots, and then retaining this additional edge with probability p and deleting it otherwise, independently of everything else. It follows from this observation together with Lemma 3.5 that the set \(I^p {:}{=} (X_1^p)^{-1}(X_2^p(V(T_2^p)))\) is almost surely finite when \(p<1\).
By Corollary 3.2, for each \(p\in [0,1]\) the random triple \((T_1^p,I^p,o_1)\) is locally quasi-unimodular with weight
$$\begin{aligned} W_p(T_1^p,I^p,o_1) = \frac{{\mathbb {E}}\left[ \left( \#(X_2^p)^{-1}(\rho )\right) ^{-1} \mid (T_1^p,I^p,o_1)\right] }{{\mathbb {E}}\left[ \left( \#(X_2^p)^{-1}(\rho )\right) ^{-1}\right] }. \end{aligned}$$
For each \(p\in [0,1]\) let \(W_p'\) be the random variable
$$\begin{aligned} W_p':=\frac{\left( \#(X_2^p)^{-1}(\rho )\right) ^{-1}}{{\mathbb {E}}\left[ \left( \#(X_2^p)^{-1}(\rho )\right) ^{-1}\right] }, \end{aligned}$$
so that \(W_p(T_1^p,I^p,o_1) = {\mathbb {E}}[W'_p \mid (T_1^p,I^p,o_1)]\). Since \(X_2=X_2^1\) is transient, the expectation in the denominator is bounded away from 0. Since we also trivially have that \(\left( \#(X_2^p)^{-1}(\rho )\right) ^{-1} \le 1\), it follows that the random variables \(W'_p\) are all bounded by the finite constant \(1/{\mathbb {E}}\bigl [\left( \#(X_2)^{-1}(\rho )\right) ^{-1}\bigr ]\). Moreover, we clearly have that \(W'_p \rightarrow W_1'\) almost surely as \(p \uparrow 1\). For each \(p\in [0,1]\), let \(\nu _p\) be the law of \((T_1^p,I^p,o_1)\) and let \(\nu _p'\) be the locally unimodular probability measure given by biasing \(\nu _p\) by \(W_p\). We clearly have that \(\nu _p\) converges weakly to \(\nu _1\) as \(p\uparrow 1\), and we claim that \(\nu _p'\) converges weakly to \(\nu _1'\) as \(p\uparrow 1\) also. Indeed, if \(F:{\mathcal {G}}_\bullet ^\diamond \rightarrow {\mathbb {R}}\) is a bounded continuous function then we trivially have that \(F(T_1^p,I^p,o_1)\) converges almost surely to \(F(T_1,I,o_1)\) as \(p \uparrow 1\), and it follows by bounded convergence that
$$\begin{aligned}&\lim _{p\uparrow 1}{\mathbb {E}}\left[ W_p(T_1^p,I^p,o_1)F(T_1^p,I^p,o_1)\right] = \lim _{p\uparrow 1}{\mathbb {E}}\left[ W_p'F(T_1^p,I^p,o_1)\right] \\&\quad = {\mathbb {E}}\left[ W_1'F(T_1,I,o_1)\right] = {\mathbb {E}}\left[ W_1(T_1,I,o_1)F(T_1,I,o_1)\right] . \end{aligned}$$
Since F was arbitrary, this establishes the desired weak convergence. Since the sets \(I^p\) are almost surely finite for every \(0\le p < 1\), it follows from Theorem 3.3 that \(I=I^1\) is either finite, one-ended, or two-ended almost surely.
Suppose for contradiction that I is infinite with positive probability. Since \(\mu _1\) and \(\mu _2\) are non-trivial, there exists n such that, with positive probability, \(o_i\) has exactly three descendants in level n belonging to \(X_i^{-1}(\rho )\) for both \(i=1,2\). Condition on the \(\sigma \)-algebra \({\mathcal {F}}\) generated by the first n generations of each tree and the restrictions of \(X_1\) and \(X_2\) to these generations, and suppose that this event holds. Denote the three descendants in each tree by \(o_{i,1},o_{i,2},o_{i,3}\) (the choice of enumeration is not important), let \(T_{i,j}\) be the subtree of \(T_i\) spanned by \(o_{i,j}\) and its descendants, and let \(X_{i,j}\) be the restriction of \(X_i\) to \(T_{i,j}\). Then \(T_{i,j}\) is conditionally distributed as a Galton-Watson tree with offspring distribution \(\mu _i\), and \(X_{i,j}\) is a \(T_{i,j}\)-indexed walk on G started with \(X_{i,j}(o_{i,j})=\rho \). Moreover, the random variables \(((T_{i,j},o_{i,j}),X_{i,j})\) are all conditionally independent of each other given \({\mathcal {F}}\), and our assumption implies that \(\#X_{1,j}^{-1}(X_{2,j}(V(T_{2,j}))) = \infty \) with positive conditional probability for each \(1 \le j \le 3\). It follows by independence that \(\#X_{1,j}^{-1}(X_{2,j}(V(T_{2,j}))) = \infty \) for every \(1 \le j \le 3\) with positive probability, and hence that I has at least three ends with positive probability, a contradiction. \(\square \)
Remark 3.6
The last part of the proof of Theorem 1.2 can be generalized as follows: Suppose that G is a graph, \(\mu \) is a non-trivial offspring distribution, T is a Galton-Watson tree with offspring distribution \(\mu \), and X is a T-indexed random walk in G. Let A be a set of vertices in G. Then the event \(\{X^{-1}(A)\) is infinite and has finitely many ends\(\}\) has probability zero.
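To make the objects in this remark concrete, the following Python sketch simulates a truncated Galton-Watson tree-indexed walk and computes the preimage \(X^{-1}(A)\) of a finite vertex set. All concrete parameters (the offspring law, the choice of \(G = {\mathbb {Z}}^3\) as a transient graph, the set A, and the truncation depth) are hypothetical illustrations, not taken from the text; a finite simulation can of course only exhibit the objects, not the almost-sure statement about infinite preimages.

```python
import random

def offspring(rng):
    # hypothetical offspring law mu: Binomial(2, 0.6), mean 1.2 > 1
    return sum(rng.random() < 0.6 for _ in range(2))

def step(pos, rng):
    # one step of simple random walk on Z^3 (a transient graph G)
    axis, delta = rng.randrange(3), rng.choice((-1, 1))
    return tuple(c + (delta if i == axis else 0) for i, c in enumerate(pos))

def tree_indexed_walk(max_gen, start, rng):
    """Grow a Galton-Watson tree generation by generation (truncated at
    max_gen) while running the tree-indexed walk: each child's position
    is its parent's position displaced by one independent walk step.
    Returns all (generation, position) pairs visited."""
    frontier, visited = [start], [(0, start)]
    for gen in range(1, max_gen + 1):
        nxt = []
        for pos in frontier:
            for _ in range(offspring(rng)):
                child = step(pos, rng)
                nxt.append(child)
                visited.append((gen, child))
        frontier = nxt
        if not frontier:  # the tree died out
            break
    return visited

rng = random.Random(0)
A = {(0, 0, 0), (1, 0, 0)}  # a finite set of vertices of G
visited = tree_indexed_walk(40, (0, 0, 0), rng)
preimage = [v for v in visited if v[1] in A]  # tree vertices mapped into A
print(len(visited), len(preimage))
```

The remark asserts that, without truncation, the set playing the role of `preimage` is almost surely either finite or has infinitely many ends as a subset of the tree.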
It remains to deduce Theorem 1.1 from Theorem 1.2; this is very straightforward. We also prove the following slight variation on the same result.
Theorem 3.7
Let G be a unimodular transitive graph, and let \(\mu \) be an offspring distribution with \(1<\overline{\mu } \le \Vert P\Vert ^{-1}\). Then the trace of a unimodular branching random walk on G with offspring distribution \(\mu \) is infinitely ended and has no isolated ends almost surely on the event that it survives forever.
Proof of Theorems 1.1 and 3.7
We begin by proving that the trace of a branching random walk is infinitely ended on the event that it survives forever. Let (T, o) be a Galton-Watson tree with offspring distribution \(\mu \), and let X be a T-indexed random walk in G with \(X(o)=\rho \). Let \({\mathcal {F}}_n\) be the \(\sigma \)-algebra generated by the first n generations of T and the restriction of X to these generations. Let the vertices of T in generation n be enumerated \(v_{n,1},\ldots ,v_{n,N_n}\), and let \(M_n\) be the number of vertices in generation n that have infinitely many descendants. Let \(W_n\) be the image of the first n generations of T under X and let \(A_{n,i}\) be the image under X of the descendants of \(v_{n,i}\). Theorem 1.2 implies that \(|A_{n,i} \cap A_{n,j}| < \infty \) for every \(1 \le i < j \le N_n\). Let \(K_n = W_n \cup \bigcup _{1 \le i < j \le N_n} (A_{n,i} \cap A_{n,j})\). Then \(K_n\) is finite and deleting \(K_n\) from the trace of X results in at least \(M_n\) infinite connected components. On the other hand, standard results in the theory of branching processes imply that \(M_n \rightarrow \infty \) almost surely on the event that T is infinite, concluding the proof. A similar proof establishes that the trace of a unimodular branching random walk is infinitely ended almost surely on the event that it survives forever.
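The growth of \(M_n\) can be illustrated by a small simulation. In the following Python sketch, survival of a generation-n vertex's subtree to a fixed finite horizon serves as a stand-in for "has infinitely many descendants"; the offspring law, the horizon, and the seed search are all hypothetical choices made for the illustration.

```python
import random

def offspring(rng):
    # hypothetical supercritical law: Binomial(2, 0.7), mean 1.4 > 1
    return sum(rng.random() < 0.7 for _ in range(2))

def grow(max_gen, rng):
    # nodes stored as (generation, parent_index); node 0 is the root
    nodes, frontier = [(0, -1)], [0]
    for g in range(1, max_gen + 1):
        nxt = []
        for i in frontier:
            for _ in range(offspring(rng)):
                nodes.append((g, i))
                nxt.append(len(nodes) - 1)
        frontier = nxt
    return nodes, frontier

def proxy_M(nodes, frontier, n):
    # number of generation-n vertices with a descendant at the horizon:
    # a finite-horizon stand-in for M_n
    ancestors = set()
    for i in frontier:
        while nodes[i][0] > n:
            i = nodes[i][1]  # walk up to the generation-n ancestor
        ancestors.add(i)
    return len(ancestors)

# retry seeds until the tree survives to the horizon
for seed in range(100):
    rng = random.Random(seed)
    nodes, frontier = grow(20, rng)
    if frontier:
        break

proxies = [proxy_M(nodes, frontier, n) for n in (2, 5, 10, 15)]
print(proxies)  # non-decreasing in n: each surviving generation-n line
                # contains at least one surviving generation-(n+1) line
```

On a surviving tree the proxy counts are non-decreasing in n, mirroring the fact that \(M_n \rightarrow \infty \) almost surely on the event that T is infinite.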
Applying Proposition 2.2 and [2, Proposition 6.10], we deduce that the trace of a unimodular branching random walk has continuum many ends and no isolated ends almost surely on the event that it survives forever. The fact that the same claim holds for the usual branching random walk trace follows by a further application of Theorem 1.2. This deduction will use the notion of the space of ends of a tree as a topological space; see [48, Section 21] for a definition. Let \((T_1,o)\) and \((T_2,o')\) be independent Galton-Watson trees with offspring distribution \(\mu \), and let (T, o) be the augmented Galton-Watson tree formed by attaching \((T_1,o)\) and \((T_2,o')\) by a single edge connecting o to \(o'\). Let X be a T-indexed random walk with \(X(o)=\rho \), and let \(X_1\) and \(X_2\) be the restrictions of X to \(T_1\) and \(T_2\) respectively, so that \({\text {Tr}}(X)\) has continuum many ends and no isolated ends almost surely on the event that it is infinite. Theorem 1.2 is easily seen to imply that the space of ends of \({\text {Tr}}(X)\) is equal to the disjoint union of the spaces of ends of \({\text {Tr}}(X_1)\) and \({\text {Tr}}(X_2)\), and it follows that \({\text {Tr}}(X_1)\) has continuum many ends and no isolated ends almost surely on the event that \(T_1\) is infinite, as desired. \(\square \)