Abstract
Consider the Erdős–Rényi random graph on \(n\) vertices where each edge is present independently with probability \(\lambda /n\), with \(\lambda >0\) fixed. For large \(n\), a typical random graph locally behaves like a Galton–Watson tree with Poisson offspring distribution with mean \(\lambda \). Here, we study large deviations from this typical behavior within the framework of the local weak convergence of finite graph sequences. The associated rate function is expressed in terms of an entropy functional on unimodular measures and takes finite values only at measures supported on trees. We also establish large deviations for other commonly studied random graph ensembles such as the uniform random graph with given number of edges growing linearly with the number of vertices, or the uniform random graph with given degree sequence. To prove our results, we introduce a new configuration model which allows one to sample uniform random graphs with a given neighborhood distribution, provided the latter is supported on trees. We also introduce a new class of unimodular random trees, which generalizes the usual Galton–Watson tree with given degree distribution to the case of neighborhoods of arbitrary finite depth. These generalized Galton–Watson trees turn out to be useful in the analysis of unimodular random trees and may be considered to be of interest in their own right.
1 Introduction and main results
Consider the Erdős–Rényi ensemble \(\mathcal {G}(n,p)\), where a random graph is obtained from the vertex set \([n]=\{1,\dots ,n\}\) by adding each edge independently with probability \(p\). In the sparse regime with \(p=\lambda /n\), for a fixed \(\lambda >0\), it is well known that, for large \(n\), a typical graph from \(\mathcal {G}(n,p)\) locally looks like a Galton–Watson tree with Poisson offspring distribution with mean \(\lambda \). In this work we study large deviations from this typical behavior. The problem is intimately related to the question: conditioned on having a certain neighborhood distribution, what does a typical element of \(\mathcal {G}(n,p)\) locally look like? The same questions can be asked for other commonly studied random graph ensembles such as the uniform random graphs with fixed number of edges growing linearly with the number of vertices, or with given degree sequence. We formulate the problem within the theory of local weak convergence of graph sequences that was recently introduced by Benjamini and Schramm [4] and Aldous and Steele [2]. The associated local weak topology has now become a common tool for studying sparse graphs, see Aldous and Lyons [1] and Bollobás and Riordan [8]. A surprisingly large variety of graph functionals are continuous for this topology. In Sect. 2 below, we will give more details on local weak convergence. In order to present our results, we first introduce the main terminology.
1.1 Local weak convergence
A graph \(G = (V,E)\), with \(V\) a countable set of vertices, is said to be locally finite if, for all \(v \in V\), the degree of \(v\) in \(G\) is finite. A rooted graph \((G,o)\) is a locally finite and connected graph \(G = (V,E)\) with a distinguished vertex \(o \in V\), called the root. For \(t \geqslant 0\), we denote by \((G,o)_t\) the induced rooted graph with vertex set \(\{u\in V: \,D(o,u)\leqslant t\}\), with \(D(\cdot ,\cdot )\) the natural graph distance. Two rooted graphs \((G_i,o_i) = ( V_i, E_i, o_i )\), \(i \in \{1,2\}\), are isomorphic if there exists a bijection \(\sigma : V_{1} \rightarrow V_{2}\) such that \(\sigma ( o_1) = o_2\) and \(\sigma ( G_1) = G_2\), where \(\sigma \) acts on \(E_1\) through \(\sigma ( \{ u, v \} ) = \{ \sigma ( u), \sigma (v) \}\). We will denote this equivalence relation by \((G_1,o_1) \simeq (G_2, o_2)\). An equivalence class of rooted graphs is often simply referred to as an unlabeled rooted graph. We denote by \(\mathcal {G}^*\) the set of (locally finite, connected) unlabeled rooted graphs, and by \(\mathcal {T}^*\) the set of unlabeled rooted trees. To each unlabeled rooted graph \(g\in \mathcal {G}^*\), we may associate a labeled rooted graph \((G,o)\) with vertex set \(V\subset \mathbb {Z}_+\), rooted at \(0\), in a canonical way; see e.g. [1]. For ease of notation, one sometimes identifies \(g \in \mathcal {G}^*\) with its canonical rooted graph \((G,o)\).
For \(\gamma \in \mathcal {G}^*\) and \(h \in \mathbb {N}\), we write \(\gamma _h\) for the truncation at \(h\) of the graph \(\gamma \), namely the unlabeled rooted graph obtained by removing all vertices (together with the edges incident to them) that are at distance larger than \(h\) from the root. The local topology is the smallest topology such that for any \(\gamma \in \mathcal {G}^*\) and \(h \in \mathbb {N}\), the \(\mathcal {G}^* \rightarrow \{0,1\}\) function \( f(g) = \mathbf {1}( g_h = \gamma _h ) \) is continuous. Equivalently, a sequence \(g_n\in \mathcal {G}^*\) converges locally to \(g\in \mathcal {G}^*\) iff for all \(h\in \mathbb {N}\) there exists \(n_0(h)\) such that \((g_n)_h=g_h\) whenever \(n\geqslant n_0(h)\). This topology is metrizable and the space \(\mathcal {G}^*\) is separable and complete [1]. The space of probability measures on \(\mathcal {G}^*\), denoted \(\mathcal {P}(\mathcal {G}^*)\), is equipped with the topology of weak convergence. We often write \(\rho _n\rightsquigarrow \rho \) to indicate that a sequence \(\rho _n\in \mathcal {P}(\mathcal {G}^*)\) converges weakly to \(\rho \in \mathcal {P}(\mathcal {G}^*)\).
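The truncation \(\gamma_h\) and the criterion \((g_n)_h = g_h\) can be illustrated on finite rooted trees. The following is a minimal sketch restricted to trees, where an unlabeled rooted tree is encoded (an assumption of this sketch, not a construction from the paper) as the sorted tuple of the encodings of the root's subtrees; the helper names `canonical`, `truncate` and `path` are hypothetical.

```python
def canonical(tree):
    """Canonical encoding of a rooted tree given as nested tuples of children.

    Two rooted trees are isomorphic exactly when their encodings are equal.
    """
    return tuple(sorted(canonical(child) for child in tree))


def truncate(tree, h):
    """Canonical encoding of the truncation at depth h.

    Vertices at distance larger than h from the root are removed.
    """
    if h == 0:
        return ()
    return tuple(sorted(truncate(child, h - 1) for child in tree))


def path(n):
    """Path with n edges, rooted at one of its endpoints."""
    tree = ()
    for _ in range(n):
        tree = (tree,)
    return tree


# For every fixed h, the truncations of path(n) stabilize once n >= h,
# which is the local convergence criterion (the limit is the infinite ray).
h = 3
stable = all(truncate(path(n), h) == path(h) for n in range(h, 20))
```

In this encoding, checking local convergence of a sequence of finite trees amounts to checking that `truncate(g_n, h)` is eventually constant in `n` for each `h`.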
For a finite graph \(G = (V,E)\) and \(v\in V\), one writes \(G(v)\) for the connected component of \(G\) at \(v\). The empirical neighborhood distribution \(U(G)\) of \(G\) is the law of the equivalence class of the rooted graph \((G(o),o)\) where the root \(o\) is sampled uniformly at random from \(V\), i.e. \(U(G)\in \mathcal {P}(\mathcal {G}^*)\) is defined by
\[ U(G) = \frac{1}{|V|} \sum_{v\in V} \delta_{[G,v]}, \tag{1} \]
where \([G,v]\in \mathcal {G}^*\) stands for the equivalence class of \((G(v),v)\) and \(\delta _g\) is the Dirac mass at \(g\in \mathcal {G}^*\); see Fig. 1 for an example. If \(\{G_n\}\) is a sequence of finite graphs, we shall say that \(G_n\) has local weak limit \(\rho \in \mathcal {P}(\mathcal {G}^*)\) if \(U(G_n)\) converges to \(\rho \) in \(\mathcal {P}(\mathcal {G}^*)\) as \(n\rightarrow \infty \). A measure \(\rho \in \mathcal {P}(\mathcal {G}^*)\) is called sofic if there exists a sequence of finite graphs \(\{G_n\}\) whose local weak limit is \(\rho \). In other words, the set of sofic measures is the closure of the set \(\{ U (G_n) : G_n \hbox { finite graph}\}\). An example is the Dirac mass at the infinite regular tree with degree \(d\in \mathbb {N}\), which is almost surely the local weak limit of a sequence of uniformly sampled random \(d\)-regular graphs on \(n\) vertices [31]. Another example is the law of the Galton–Watson tree with Poisson offspring distribution with mean \(\lambda >0\), which is almost surely the local weak limit of a sequence of random graphs sampled from \(\mathcal {G}(n,p)\) when \(p=\lambda /n\). Sofic measures form a closed subset of \(\mathcal {P}(\mathcal {G}^*)\).
Sofic measures share a stationarity property called unimodularity [1]. To define the latter, consider the set \(\mathcal {G}^{**}\) of unlabeled graphs with two distinguished roots, obtained as the set of equivalence classes of locally finite connected graphs with two distinguished vertices \((G,u,v)\). The notion of local topology extends naturally to \(\mathcal {G}^{**}\). A function \(f\) on \(\mathcal {G}^{**}\) can be extended to a function on connected graphs with two distinguished roots \((G,u,v)\) through the isomorphism classes. Then, a measure \(\rho \in \mathcal {P}(\mathcal {G}^*)\) is called unimodular if for any Borel measurable function \(f: \mathcal {G}^{**} \rightarrow \mathbb {R}_+\), we have
\[ \mathbb{E}_\rho \sum_{v\in V} f(G,o,v) = \mathbb{E}_\rho \sum_{v\in V} f(G,v,o), \tag{2} \]
where \((G,o)\) is the canonical rooted graph whose equivalence class \(g\in \mathcal {G}^*\) has law \(\rho \). It is not hard to check that if \(G\) is a finite graph then its neighborhood distribution \(U(G)\) is unimodular. In particular, all sofic measures are unimodular. The converse is open; see [1]. We denote by \(\mathcal {P}_{u} (\mathcal {G}^*)\) the set of unimodular probability measures. Similarly, we write \(\mathcal {P}_{u} (\mathcal {T}^*)\) for unimodular probability measures supported by trees.
1.2 Unimodular Galton–Watson trees with given neighborhood
We now introduce a family of unimodular measures that will play a key role in what follows. As we will see, this is the natural generalization of the usual Galton–Watson trees with given degree distribution to the case of neighborhoods of arbitrary depth \(h\in \mathbb {N}\). These measures will be shown to be sofic, and this fact can be used to give an alternative proof of the Bowen–Elek theorem [3, 13, 19] asserting that all \(\rho \in \mathcal {P}_{u} (\mathcal {T}^*)\) are sofic, see Corollary 1.5 below.
Fix \(h \in \mathbb {N}\), and recall that \(g_h\) denotes the truncation at depth \(h\) of \(g\in \mathcal {G}^*\). Call \(\mathcal {G}^*_h\) the set of unlabeled rooted graphs with depth \(h\), i.e. the set of \(g\in \mathcal {G}^*\) such that \(g_h=g\). Similarly, call \(\mathcal {T}^ *_h\) the set of unlabeled rooted trees \(t\in \mathcal {T}^*\) such that \(t_h=t\). Given \(\rho \in \mathcal {P}(\mathcal {G}^*)\), we write \(\rho _h\in \mathcal {P}(\mathcal {G}^*_h)\) for the \(h\)-neighborhood marginal of \(\rho \), i.e. the law of \(g_h\) when \(g\) has law \(\rho \). Notice that if \(t\in \mathcal {T}^*\) and \(h=1\), then \(t_h\) is simply the number of children of the root. In particular, \(\mathcal {T}^*_1\) can be identified with \(\mathbb {Z}_+\). When \(h=0\), it is understood that \(\mathcal {G}^*_0\) contains only the trivial graph consisting of a single isolated vertex (the root), so that \(|\mathcal {G}^*_0|=1\).
If \(G = (V,E)\) is a graph and \(\{ u, v\} \in E\) then define \(G(u,v)\) as the rooted graph \((G'(v),v)\), where \(G'=(V, E\backslash \{u,v\})\), i.e. \(G(u,v)\) is the rooted graph obtained from \(G\) by removing the edge \(\{u,v\}\) and taking the connected component at the root \(v\). Next, given a rooted graph \((G,o)\), and \(g,g'\in \mathcal {G}_{h-1}^*\), define
\[ E_h(g,g') = \left| \left\{ v \mathop{\sim}\limits^{G} o \,:\; G(o,v)_{h-1} \simeq g,\; G(v,o)_{h-1} \simeq g' \right\} \right|. \tag{3} \]
The notation \(v \mathop {\sim }\limits ^{G} u\) indicates that the vertex \(v\) is a neighbor of \(u\) in \(G\). Thus, \(E_h ( g, g')\) is the number of neighbors of the root in \((G,o)\) which have the given patterns \(G(o,v)_{h-1}\simeq g\) and \(G(v,o)_{h-1} \simeq g'\). Notice that if \(h=1\), then necessarily \(g,g'=o\) and \(E_1(o,o)={\mathrm {deg}}_G(o)\) is simply the degree of the root.
As an example, consider the rooted graph \(\alpha \) from Fig. 1. Fix \(h=2\), and call \(g_1,g_2\) the elements of \(\mathcal {G}^*_{h-1}\) consisting respectively of a rooted single edge and a rooted triangle. Then one has \(E_h(g_1,g_2)=2\) and \(E_h(g_2,g_1)=0\). Similarly, if the reference graph is \(\beta \) from Fig. 1, then \(E_h(g_1,g_2)=0\) while \(E_h(g_2,g_1)=1\).
We call a measure \(P \in \mathcal {P}( \mathcal {G}^*_h)\) admissible if \(\mathbb {E}_P {\mathrm {deg}}_G (o) < \infty \) and for all \(g, g' \in \mathcal {G}^ *_{h-1}\),
\[ e_P(g,g') = e_P(g',g), \]
where
\[ e_P(g,g') := \mathbb{E}_P E_h(g,g'). \]
Here it is understood that \((G,o)\) represents the canonical rooted graph whose equivalence class in \(\mathcal {G}_h^*\) has law \(P\). By applying the definition of unimodularity (2) to the function
\[ f(G,u,v) = \mathbf{1}\big(\{u,v\}\in E\big)\,\mathbf{1}\big(G(u,v)_{h-1}\simeq g\big)\,\mathbf{1}\big(G(v,u)_{h-1}\simeq g'\big), \]
it is not hard to check that if \(\rho \) is unimodular and \(\mathbb {E}_\rho {\mathrm {deg}}_G (o) < \infty \) then \(\rho _h\in \mathcal {P}(\mathcal {G}_h^*)\) is admissible. In particular, for any finite graph \(G\), the neighborhood distribution \(U(G)_h\) truncated at depth \(h\) is admissible. Remark that, when \(h=1\), since \(|\mathcal {G}^*_0|=1\), all \(P \in \mathcal {P}( \mathcal {T}_1^*) = \mathcal {P}( \mathbb {Z}_+)\) with finite mean are admissible.
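The admissibility of \(U(G)_h\) for a finite graph can be checked numerically when \(G\) is a tree. The sketch below is only an illustration under that restriction: it encodes each truncated branch \(G(u,v)_{h-1}\) as a sorted nested tuple (a hypothetical encoding, valid for forests only) and computes the averages \(e_P(g,g')\) of the counts \(E_h(g,g')\) over a uniformly chosen root. The names `branch` and `edge_type_means` are assumptions of this sketch.

```python
from collections import Counter


def branch(adj, parent, v, depth):
    """Encoding of G(parent, v) truncated at `depth`.

    G(parent, v) is the component of v after removing the edge {parent, v},
    rooted at v. The recursion is only correct when G has no cycles.
    """
    if depth == 0:
        return ()
    return tuple(sorted(branch(adj, v, w, depth - 1)
                        for w in adj[v] if w != parent))


def edge_type_means(adj, h):
    """Return e_P(g, g') for P = U(G)_h as a dict keyed by (g, g')."""
    n = len(adj)
    counts = Counter()
    for o in adj:                  # o plays the role of the root
        for v in adj[o]:           # v ranges over the neighbors of the root
            g = branch(adj, o, v, h - 1)        # pattern of G(o, v)_{h-1}
            g_prime = branch(adj, v, o, h - 1)  # pattern of G(v, o)_{h-1}
            counts[(g, g_prime)] += 1
    return {pair: c / n for pair, c in counts.items()}


# A small tree on 5 vertices, given by adjacency lists.
adj = {0: [1, 2], 1: [0, 3, 4], 2: [0], 3: [1], 4: [1]}
e = edge_type_means(adj, h=2)
symmetric = all(abs(e[(g, gp)] - e.get((gp, g), 0.0)) < 1e-12 for (g, gp) in e)
```

The symmetry holds because reversing an oriented edge swaps the two patterns, and the total mass of `e` is the mean degree \(2|E|/|V|\).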
We now define the measures \({\mathrm {UGW}}_h(P) \in \mathcal {P}(\mathcal {T}^*)\); see also Sect. 3 below for more details. Fix \(P\in \mathcal {P}(\mathcal {T}_h^*)\) admissible. The probability \({\mathrm {UGW}}_h(P) \in \mathcal {P}(\mathcal {T}^*)\) is the law of the equivalence class of the random rooted tree \((T,o)\) defined below.
For \(t, t' \in \mathcal {T}^*_{h-1}\) such that \(e_P(t,t') \ne 0\) define, for all \(\tau \in \mathcal {T}^*_{h}\),
\[ {\widehat{P}}_{t,t'}(\tau) = \mathbf{1}(\tau_{h-1} = t)\, \frac{P(\tau\cup t'_+)\,\big(1 + |\{ v \mathop{\sim}\limits^{\tau} o \,:\, \tau(o,v) = t' \}|\big)}{e_P(t,t')}, \]
where \(\tau \cup t'_+\) denotes the tree obtained from \(\tau \) by adding a new neighbor of the root whose rooted subtree is \(t'\); see Fig. 2 for an example. The subtree \(\tau (o,v)\) is defined before Eq. (3) with the graph \(G\) replaced by \(\tau \).
It can be checked that \({\widehat{P}}_{t,t'}\) is a probability, i.e. \({\widehat{P}}_{t,t'}\in \mathcal {P}(\mathcal {T}^*_h)\); see Sect. 3. We may now define the random rooted tree \((T,o)\). First, \((T,o)_{h}\) is sampled according to \(P\). Next, for each vertex \(v\) in the first generation of \((T,o)_{h}\), consider the subtree \(t=T(o,v)_{h-1}\) with depth \(h-1\) rooted at \(v\) obtained by removing the edge \(\{o,v\}\) and retaining the connected component up to distance \(h-1\) from \(v\). We add a layer to \(t\) by replacing \(t\) with a new tree \(\tau \) with depth \(h\) that coincides with \(t\) in the first \(h-1\) generations. The new tree \(\tau \) is sampled according to \({\widehat{P}}_{t, t'}\) where \(t\) is as above while \(t'\) denotes the subtree \(T ( v,o)_{h-1}\) rooted at \(o\) obtained from \((T,o)_{h}\) by removing the edge \(\{o,v\}\) and retaining the connected component up to distance \(h-1\) from \(o\). This operation is repeated for each \(v\) in the first generation independently. After this step, we have overall added one layer to \((T,o)_{h}\), and thus we have sampled \((T,o)_{h+1}\).
We now proceed recursively, layer by layer, to obtain a sample of the full tree \((T,o)\). Formally, this construction can be stated as follows. If \(u\) is the parent of \(v\), we say that \(v\) has type \((t,t')\), where \(t,t'\in \mathcal {T}^*_{h-1}\), if \(T(u,v)_{h-1}\simeq t\) and \(T ( v,u)_{h-1}\simeq t'\). The subtrees \(T(u,v)\), and \(T(v,u)\) are defined before Eq. (3) with \(G\) replaced by \(T\). Denote by \(1, \ldots , d\), with \(d = {\mathrm {deg}}_T(o)\) the neighbors of the root in the canonical representation of the random variable with law \(P\). Given \((T,o)_{h}\), the subtrees \(T(o,v), 1 \leqslant v \leqslant d,\) are independent random variables and, given that \(v\) has type \((t,t')\), then \(T(o,v)_{h}\) has distribution \({\widehat{P}}_{t, t'}\). Once \(T(o,v)_{h}\) is sampled, the type of a child \(v'\) of \(v\) is determined using only \(T(o,v)_{h}\) and \(T(v,o)_{h-2}\). For each child \(v'\) of \(v\) we sample the subtree \(T(v,v')_h\) independently according to \({\widehat{P}}_{t, t'}\) where \((t,t')\) is the type of \(v'\) and so on, recursively. This defines our random rooted tree \((T,o)\).
If \(h =1 \), then there is only one type possible and \({\mathrm {UGW}}_1( P)\) is the unimodular Galton–Watson tree with degree distribution \(P \in \mathcal {P}(\mathbb {Z}_+)\), where the number \(d\) of children of the root is sampled according to \(P\), and conditionally on \(d\), the subtrees of the children of the root are independent Galton–Watson trees with offspring distribution given by the size-biased law \({\widehat{P}}\):
\[ {\widehat{P}}(k) = \frac{(k+1)P(k+1)}{\sum_{\ell \geqslant 1} \ell P(\ell)}, \qquad k\in\mathbb{Z}_+. \]
If \(P={\mathrm {Poi}}(\lambda )\) is the Poisson distribution with mean \(\lambda \), then \({\widehat{P}}=P\) and \({\mathrm {UGW}}_1( P)\) is the standard Galton–Watson tree with mean degree \(\lambda \).
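The case \(h=1\) is simple enough to sketch in code. The example below assumes the standard size-biased law \({\widehat{P}}(k) = (k+1)P(k+1)/\sum_\ell \ell P(\ell)\) and a degree distribution with finite support given as a list; the function names and the nested-tuple output format are hypothetical choices made for illustration. It also checks numerically that a (truncated) Poisson law is left unchanged by size-biasing.

```python
import math
import random


def size_biased(P):
    """Size-biased law P_hat(k) = (k + 1) P(k + 1) / mean(P), P a finite list."""
    mean = sum(k * p for k, p in enumerate(P))
    return [(k + 1) * P[k + 1] / mean for k in range(len(P) - 1)]


def poisson_pmf(lam, kmax):
    """Poisson probabilities for k = 0, ..., kmax (the tail is dropped)."""
    return [math.exp(-lam) * lam ** k / math.factorial(k) for k in range(kmax + 1)]


def sample_ugw1(P, depth, rng):
    """Sample the first `depth` generations of UGW_1(P) as nested tuples."""
    P_hat = size_biased(P)

    def grow(law, remaining):
        if remaining == 0:
            return ()
        k = rng.choices(range(len(law)), weights=law)[0]
        # Every non-root vertex draws its number of children from P_hat.
        return tuple(grow(P_hat, remaining - 1) for _ in range(k))

    # Only the root draws its number of children from P itself.
    return grow(P, depth)


poi = poisson_pmf(2.0, 60)
poi_hat = size_biased(poi)
max_gap = max(abs(a - b) for a, b in zip(poi, poi_hat))
```

For \(P=\delta_3\) the sampler returns the first generations of the 3-regular tree: the root has three children and every other vertex has two.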
The following proposition summarizes the main properties of the measures \({\mathrm {UGW}}_h(P)\) for generic \(h\in \mathbb {N}\) and \(P\in \mathcal {P}(\mathcal {T}_h^*)\) admissible.
Proposition 1.1
Fix \(h\in \mathbb {N}\) and \(P\in \mathcal {P}(\mathcal {T}_h^*)\) admissible. The measure \({\mathrm {UGW}}_h(P)\) is unimodular. Moreover, the following consistency relation is satisfied: for any \(k \geqslant h\), \(({\mathrm {UGW}}_h(P))_k\in \mathcal {P}(\mathcal {T}_k^*)\) is admissible and
\[ {\mathrm{UGW}}_k\big(({\mathrm{UGW}}_h(P))_k\big) = {\mathrm{UGW}}_h(P). \]
1.3 Entropy of a measure \(\rho \in \mathcal {P}(\mathcal {G}^*)\)
It is convenient to work with uniformly distributed random graphs with a given number of edges. For any \(n,m\in \mathbb {N}\), let \(\mathcal {G}_{n,m}\) be the set of graphs on \(V=[n]\) with \(|E|=m\) edges. Fix \(d >0\), and a sequence \(m = m(n)\) such that \(m / n \rightarrow d/2\), as \(n\rightarrow \infty \). Since
\[ |\mathcal{G}_{n,m}| = \binom{n(n-1)/2}{m}, \]
an application of Stirling’s formula shows that
\[ \log |\mathcal{G}_{n,m}| = m\log n + s(d)\, n + o(n), \qquad s(d) := \frac{d}{2} - \frac{d}{2}\log d. \tag{7} \]
If \(\rho \in \mathcal {P}(\mathcal {G}^*)\), define
\[ \mathcal{G}_{n,m}(\rho,\varepsilon) = \big\{ G\in\mathcal{G}_{n,m} \,:\, U(G)\in B(\rho,\varepsilon)\big\}, \]
where \(B ( \rho , \varepsilon )\) denotes the open ball with radius \(\varepsilon \) around \(\rho \) with respect to the Lévy metric on \(\mathcal {P}(\mathcal {G}^*)\). For \(\varepsilon >0\), define
\[ \overline{\Sigma}(\rho,\varepsilon) = \limsup_{n\to\infty} \frac{\log|\mathcal{G}_{n,m}(\rho,\varepsilon)| - m\log n}{n}. \]
Since \(\varepsilon \mapsto \overline{\Sigma }(\rho ,\varepsilon )\) is non-decreasing, one defines
\[ \overline{\Sigma}(\rho) = \lim_{\varepsilon\downarrow 0}\overline{\Sigma}(\rho,\varepsilon). \]
The extended real numbers \(\underline{\Sigma }( \rho ,\varepsilon )\) and \(\underline{\Sigma }( \rho )\) are defined as above, with \(\limsup \) replaced by \(\liminf \). If \(\rho \) is such that \(\underline{\Sigma }(\rho ) = \overline{\Sigma }(\rho )\), we set \(\Sigma (\rho ) := \overline{\Sigma }(\rho ) = \underline{\Sigma }(\rho )\). The number \(\Sigma (\rho )\) can be interpreted, up to an overall constant, as a microcanonical entropy associated to the state \(\rho \). From (7), one has that \(\Sigma (\rho )\in [-\infty ,s(d)]\), whenever it is well defined.
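The Stirling asymptotics behind the normalization by \(m\log n\) can be checked numerically. The sketch below assumes \(|\mathcal{G}_{n,m}| = \binom{n(n-1)/2}{m}\) and the limit value \(s(d) = d/2 - (d/2)\log d\), which is what Stirling's formula yields when \(m/n\to d/2\); the function names are hypothetical.

```python
import math


def s(d):
    """Candidate limit of (log |G_{n,m}| - m log n) / n when m/n -> d/2."""
    return d / 2 - (d / 2) * math.log(d)


def log_num_graphs(n, m):
    """log of binom(n(n-1)/2, m), computed as a sum of m logarithms."""
    N = n * (n - 1) // 2
    return sum(math.log((N - i) / (i + 1)) for i in range(m))


def normalized_log_count(n, d):
    """(log |G_{n,m}| - m log n) / n with m = round(d n / 2)."""
    m = round(d * n / 2)
    return (log_num_graphs(n, m) - m * math.log(n)) / n


# The gap to s(d) is of order (log n) / n.
gap = abs(normalized_log_count(100_000, 2.0) - s(2.0))
```

Since every \(\mathcal{G}_{n,m}(\rho,\varepsilon)\) is a subset of \(\mathcal{G}_{n,m}\), the same normalization shows why the entropy cannot exceed \(s(d)\).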
Theorem 1.2
Fix \(d > 0\) and choose a sequence \( m = m(n)\) such that \(m / n \rightarrow d / 2\). For any \(\rho \in \mathcal {P}( \mathcal {G}^*)\), the entropy \(\Sigma (\rho ) \in \; \left[ -\infty , s(d)\right] \) is well defined, it is upper semi-continuous, and it does not depend on the choice of the sequence \(m(n)\). Moreover, \(\Sigma (\rho ) = - \infty \) if at least one of the following is satisfied:
(i) \(\rho \) is not unimodular;
(ii) \(\rho \) is not supported on rooted trees;
(iii) \(\mathbb {E}_{\rho } {\mathrm {deg}}_G (o) \ne d\).
Notice that the definition of \(\Sigma (\rho )\) depends on the parameter \(d\). For simplicity, we do not write explicitly this dependence. In view of Theorem 1.2(iii), to avoid trivialities, unless otherwise stated, \(\Sigma (\rho )\) will refer to the value at \(d = \mathbb {E}_\rho {\mathrm {deg}}_G ( o )\) (provided that the latter is finite). The next theorem computes the actual value of \(\Sigma (\rho )\) for unimodular Galton–Watson trees and gives an expression for \(\Sigma (\rho )\) for all \(\rho \in \mathcal {P}_u (\mathcal {T}^*)\). Moreover, it shows that unimodular Galton–Watson trees maximize entropy under a \(h\)-neighborhood marginal constraint.
Let us introduce some additional notation. For any \(P \in \mathcal {P}(\mathcal {T}^*_h)\), define the Shannon entropy
\[ H(P) = -\sum_{t\in\mathcal{T}^*_h} P(t)\log P(t). \]
For \(h \in \mathbb {N}\), call \(\mathcal {P}_h\) the set of all \(P \in \mathcal {P}(\mathcal {T}^*_h)\), with \(P\) admissible such that \(H(P)<\infty \) and \(\mathbb {E}_P {{\left[ {\mathrm {deg}}_T(o) \log {{\left( {\mathrm {deg}}_T (o)\right) }} \right] }} < \infty \). For \(P\in \mathcal {P}_h\), let \(\pi _P\) denote the probability on \(\mathcal {T}^*_{h-1}\times \mathcal {T}^*_{h-1}\) defined by
\[ \pi_P(s,s') = \frac{1}{d}\, e_P(s,s'), \]
where \( d = \mathbb {E}_{P} {\mathrm {deg}}_G (o)\), \(e_P (s,s')= \mathbb {E}_P E_h(s,s')\), and \(E_h(s,s')\) is defined in (3). We write \(H(\pi _P)\) for the Shannon entropy of \(\pi _P\):
\[ H(\pi_P) = -\sum_{(s,s')\in\mathcal{T}^*_{h-1}\times\mathcal{T}^*_{h-1}} \pi_P(s,s')\log\pi_P(s,s'). \]
Theorem 1.3
Fix \(h \in \mathbb {N}\). The expression
\[ J_h(P) = -s(d) + H(P) - \frac{d}{2}\,H(\pi_P) - \sum_{(s,s')\in\mathcal{T}^*_{h-1}\times\mathcal{T}^*_{h-1}} \mathbb{E}_P \log \big(E_h(s,s')!\big) \]
defines a function \(J_h:\mathcal {P}_h\mapsto [-\infty ,s(d)]\), satisfying
\[ \Sigma\big({\mathrm{UGW}}_h(P)\big) = J_h(P) \]
for all \(P\in \mathcal {P}_h\). Define \(\overline{J}_h: \mathcal {P}(\mathcal {T}^*_h)\mapsto [-\infty ,s(d)]\) by \(\overline{J}_h (P) = J_h(P)\) if \(P\in \mathcal {P}_h\), and \(\overline{J}_h(P)=-\infty \) if \(P\notin \mathcal {P}_h\). If \(\rho \in \mathcal {P}_{u}(\mathcal {T}^*)\), then for all \(h\in \mathbb {N}\),
\[ \Sigma(\rho) \leqslant \overline{J}_h(\rho_h), \]
and, if \(\rho _1\) has finite support, the inequality is strict unless \(\rho = {\mathrm {UGW}}_h(\rho _h)\). Finally, for any \(\rho \in \mathcal {P}_{u}(\mathcal {T}^*)\), \(\overline{J}_h(\rho _h)\) is non-increasing in \(h\in \mathbb {N}\), and
\[ \Sigma(\rho) = \lim_{h\to\infty} \overline{J}_h(\rho_h). \]
In Remark 5.13 below we provide an alternative expression for \(J_h(P)\) in terms of relative entropies. Specializing to the case \(h=1\), we obtain the following corollary of Theorem 1.3.
Corollary 1.4
If \(P\in \mathcal {P}( \mathbb {Z}_+)\) has mean \(d\), then
\[ \Sigma\big({\mathrm{UGW}}_1(P)\big) = J_1(P) = s(d) - H\big(P\,|\,{\mathrm{Poi}}(d)\big), \]
where \({\mathrm {Poi}}(d)\) stands for Poisson distribution with mean \(d\), and \(H(\cdot \,|\,\cdot )\) is the relative entropy.
In particular, the standard Galton–Watson tree \(\rho = {\mathrm {UGW}}_1( {\mathrm {Poi}}(d))\) maximizes the entropy \(\Sigma (\rho )\) among all measures \(\rho \) with mean degree \(d\).
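The statement of Corollary 1.4 lends itself to a quick numerical sanity check. The sketch below assumes that the entropy of \({\mathrm{UGW}}_1(P)\) has the form \(s(d) - H(P\,|\,{\mathrm{Poi}}(d))\) with \(s(d) = d/2 - (d/2)\log d\), and compares it with the expression \(-s(d) + H(P) - \mathbb{E}_P \log({\mathrm{deg}}!)\), which is taken here as an assumed \(h=1\) form of \(J_h\); both formulas and all function names are assumptions of this illustration. Distributions are finite lists indexed by degree, with positive mean.

```python
import math


def s(d):
    return d / 2 - (d / 2) * math.log(d)


def mean(P):
    return sum(k * p for k, p in enumerate(P))


def shannon(P):
    """Shannon entropy H(P) = -sum P(k) log P(k)."""
    return -sum(p * math.log(p) for p in P if p > 0)


def rel_entropy_poisson(P):
    """Relative entropy H(P | Poi(d)) where d is the mean of P."""
    d = mean(P)
    total = 0.0
    for k, p in enumerate(P):
        if p > 0:
            log_poi = -d + k * math.log(d) - math.lgamma(k + 1)
            total += p * (math.log(p) - log_poi)
    return total


def sigma_ugw1(P):
    """Assumed form of the entropy: s(d) - H(P | Poi(d))."""
    return s(mean(P)) - rel_entropy_poisson(P)


def j1(P):
    """Assumed h = 1 expression: -s(d) + H(P) - E_P log(deg!)."""
    log_fact = sum(p * math.lgamma(k + 1) for k, p in enumerate(P))
    return -s(mean(P)) + shannon(P) - log_fact


P = [0.1, 0.2, 0.3, 0.4]          # a degree law with mean 2
agreement = abs(sigma_ugw1(P) - j1(P))
```

The two expressions agree identically, and the first one makes the maximality of the Poisson law transparent, since the relative entropy is nonnegative and vanishes only at \(P={\mathrm{Poi}}(d)\).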
As a byproduct of our analysis, we will also obtain an alternative proof of the Bowen–Elek Theorem [3, 13, 19].
Corollary 1.5
If \(\rho \in \mathcal {P}_{u} (\mathcal {T}^*)\), then \(\rho \) is sofic.
We observe finally that, from its definition, the map \(\Sigma : \rho \mapsto \Sigma (\rho )\) is easily seen to be upper semi-continuous for the local weak topology (see Lemma 5.3). In Proposition 5.14 below, we will however prove that \(\Sigma \) fails to be continuous at any \(\rho = {\mathrm {UGW}}_1 (P)\) whenever \(P \in \mathcal {P}(\mathbb {Z}_+)\) has finite support and satisfies \(P(0) = P(1) = 0\), \(P(2) < 1\).
1.4 Large deviations of uniform graphs with given degrees
Given a vector \(\mathbf {d}\in \mathbb {Z}_+^n\), let \(\mathcal {G}(\mathbf {d})\) denote the set of graphs \(G = ([n],E)\) such that \(\mathbf {d}\) is the degree sequence of \(G\), i.e. if \(\mathbf {d}=(d(1),\dots ,d(n))\), then for all \(v \in [n]\), \({\mathrm {deg}}_G(v) = d(v)\). Consider a sequence \(\mathbf {d}^{(n)}\), \(n\in \mathbb {N}\), of degree vectors \((d^{(n)}(1),\dots ,d^{(n)}(n))\) such that, for some fixed \(\theta \in \mathbb {N}\), and \(P\in \mathcal {P}(\mathbb {Z}_+)\):
(C1) \(\sum _{v=1}^nd^{(n)}(v)\) is even,
(C2) \(\max _{1\leqslant v \leqslant n} d^{(n)}(v) \leqslant \theta \),
(C3) \(\frac{1}{n} \sum _{v\in [n]} \delta _{d^{(n)}(v)} \rightsquigarrow P\),
where \( \rightsquigarrow \) denotes weak convergence in \(\mathcal {P}(\mathbb {Z}_+)\). A consequence of Erdős and Gallai [21] is that if (C1)–(C3) above are satisfied, then \(\mathcal {G}(\mathbf {d}^{(n)})\) is not empty for all \(n\) large enough. We shall consider a random graph \(G_n\) sampled uniformly from \(\mathcal {G}(\mathbf {d}^{(n)})\). Models of this type are well known in the random graph literature; see e.g. Molloy and Reed [26]. In particular, it is a folklore fact that almost surely the neighborhood distribution \(U(G_n)\) defined in (1) is weakly convergent to \({\mathrm {UGW}}_1(P)\); see also Theorem 4.8 below for a more general statement. One of our main results concerns the large deviations of \(U(G_n)\). Here and below whenever we say that \(U(G_n)\) satisfies the large deviation principle (LDP) in \(\mathcal {P}(\mathcal {G}^*)\) with speed \(n\) and good rate function \(I\), we mean that the function \(I:\mathcal {P}(\mathcal {G}^*)\mapsto [0,\infty ]\) is lower semi-continuous with compact level sets, and for every Borel set \(B\subset \mathcal {P}(\mathcal {G}^*)\)
\[ -\inf_{\rho\in B^\circ} I(\rho) \leqslant \liminf_{n\to\infty}\frac{1}{n} \log \mathbb{P}\big(U(G_n)\in B\big) \leqslant \limsup_{n\to\infty}\frac{1}{n}\log\mathbb{P}\big(U(G_n)\in B\big) \leqslant -\inf_{\rho\in\overline{B}} I(\rho), \]
where \(B^\circ \) denotes the interior of \(B\) and \(\overline{B}\) denotes the closure of \(B\).
Theorem 1.6
Let \(\mathbf {d}^{(n)}\) be a sequence satisfying conditions \((C1)\)–\((C3)\) above. Let \(G_n\) be uniformly distributed on \(\mathcal {G}(\mathbf {d}^{(n)})\). Then \(U(G_n)\) satisfies the LDP in \(\mathcal {P}(\mathcal {G}^*)\) with speed \(n\) and good rate function
\[ I(\rho) = \begin{cases} J_1(P) - \Sigma(\rho) & \text{if } \rho_1 = P,\\ +\infty & \text{otherwise.} \end{cases} \]
It follows from Theorem 1.3 that, for any integer \(h \geqslant 1\) and any \(Q \in \mathcal {P}_h \) with \(Q_1 = P\),
\[ \min\big\{ I(\rho) \,:\, \rho_h = Q \big\} = J_1(P) - J_h(Q), \]
and the minimum is uniquely attained for \(\rho = {\mathrm {UGW}}_h(Q)\). This allows one to compute large deviations of neighborhood measures \(U(G_n)_h\) explicitly in terms of the function \(J_h\).
On the other hand, consider the special case of \(d\)-regular graphs, where \(\mathbf {d}^{(n)}\) is the constant vector \((d,\dots ,d)\), and \(P=\delta _d\), for some fixed \(d\in \mathbb {N}\). To have \(\Sigma (\rho )>-\infty \), \(\rho \) must be supported on trees by Theorem 1.2, and because of the constant degree constraint we find that the only \(\rho \in \mathcal {P}(\mathcal {G}^*)\) such that \(I(\rho )<\infty \) is the Dirac mass at the infinite rooted \(d\)-regular tree, which coincides with \({\mathrm {UGW}}_1(P)\), where the rate function is zero. Thus, for the \(d\)-regular random graph \(I(\rho )\) is either zero or infinite, and one should look at faster speed than \(n\) here for non trivial large deviations.
We note finally that Theorem 1.6 establishes a large deviations principle with speed \(n\). Other interesting large deviation events occur at higher speed. For example, for the proportion of vertices in a triangle in \(G_n\), the speed would be \(n \log n\).
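The non-emptiness of \(\mathcal {G}(\mathbf {d}^{(n)})\) invoked earlier in this subsection rests on the Erdős–Gallai characterization of degree sequences of simple graphs. As a side illustration, here is a minimal sketch of that classical criterion (evenness of the sum plus the inequalities \(\sum_{i\leqslant k} d_i \leqslant k(k-1) + \sum_{i>k}\min(d_i,k)\) for the non-increasingly sorted sequence); the function name is hypothetical and the quadratic-time loop is chosen for readability, not efficiency.

```python
def is_graphical(degrees):
    """Erdős–Gallai test: is `degrees` the degree sequence of a simple graph?"""
    d = sorted(degrees, reverse=True)
    if any(x < 0 for x in d) or sum(d) % 2 != 0:
        return False
    n = len(d)
    for k in range(1, n + 1):
        lhs = sum(d[:k])                                   # k largest degrees
        rhs = k * (k - 1) + sum(min(x, k) for x in d[k:])  # available edges
        if lhs > rhs:
            return False
    return True
```

For instance, `(3, 3, 3, 3)` is realized by the complete graph on four vertices, whereas `(3, 3, 1, 1)` has an even sum but fails the inequality at \(k=2\).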
1.5 Large deviations of Erdős–Rényi graphs
Next, we describe our main results for sparse Erdős–Rényi graphs such as the uniform random graph from \(\mathcal {G}_{n,m}\), with \(m\sim nd/2\) and the \(\mathcal {G}(n,p)\) where each edge is independently present with probability \(p=d/n\). It is well known that, in both cases, with probability one, \(U(G_n)\) converges weakly to the standard Galton–Watson tree with mean degree \(d\), i.e. \(\rho ={\mathrm {UGW}}_1( {\mathrm {Poi}}(d))\), which by Corollary 1.4 satisfies \(\Sigma (\rho )=s(d)\).
Theorem 1.7
Fix \(d > 0\) and a sequence \( m = m(n)\) such that \(m / n \rightarrow d / 2\), as \(n\rightarrow \infty \). Let \(G_n\) be uniformly distributed in \(\mathcal {G}_{n,m}\). Then \(U(G_n)\) satisfies the LDP in \(\mathcal {P}(\mathcal {G}^*)\) with speed \(n\) and good rate function
\[ I(\rho) = s(d) - \Sigma(\rho). \]
Theorem 1.8
Fix \(\lambda > 0\) and take \(G_n\) with law \(\mathcal {G}(n,\lambda /n)\). Then \(U(G_n)\) satisfies the LDP in \(\mathcal {P}(\mathcal {G}^*)\) with speed \(n\) and good rate function
\[ I(\rho) = \frac{\lambda}{2} - \frac{d}{2}\log \lambda - \Sigma(\rho), \]
where \(d:=\mathbb {E}_\rho {\mathrm {deg}}_G(o)\), with the convention that if \(d=0\) then \(\Sigma (\rho )=s(0)=0\).
In the special case of \(1\)-neighborhoods, Theorem 1.7, Theorem 1.8 and Corollary 1.4 allow us to prove the following results. Let \(u(G_n)\in \mathcal {P}(\mathbb {Z}_+)\) denote the empirical distribution of the degree: \(u(G_n) = \frac{1}{n}\sum _{i=1}^n \delta _{{\mathrm {deg}}_{G_n}(i)}\).
Corollary 1.9
Fix \(d > 0\), a sequence \( m = m(n)\) such that \(m / n \rightarrow d / 2\), and let \(G_n\) be uniformly distributed in \(\mathcal {G}_{n,m}\). Then \(u(G_n)\) satisfies the LDP in \(\mathcal {P}(\mathbb {Z}_+)\) with good rate function
\[ K(P) = \begin{cases} H\big(P\,|\,{\mathrm{Poi}}(d)\big) & \text{if } P \text{ has mean } d,\\ +\infty & \text{otherwise.} \end{cases} \]
Corollary 1.10
Fix \(\lambda > 0\) and take \(G_n\) with law \(\mathcal {G}(n,\lambda /n)\). Then \(u(G_n)\) satisfies the LDP in \(\mathcal {P}(\mathbb {Z}_+)\) with speed \(n\) and good rate function
\[ K(P) = \frac{\lambda - d}{2} + \frac{d}{2}\log\frac{d}{\lambda} + H\big(P\,|\,{\mathrm{Poi}}(d)\big), \]
where \(d\) denotes the mean of \(P\).
1.6 Plan and methods
The proof of the main results discussed above is organized as follows. In Sect. 2 we review some basic facts about local weak convergence in the context of multi-graphs. We also establish a compactness criterion which parallels recent results of Benjamini, Lyons and Schramm [3]. In Sect. 3 we introduce the unimodular Galton–Watson trees with given \(h\)-neighborhood distribution and prove the properties stated in Proposition 1.1. In Sect. 5 we prove our main results concerning the entropy \(\Sigma (\rho )\), cf. Theorem 1.2 and Theorem 1.3. These are crucially based on the possibility of counting asymptotically the number of graphs in \(\mathcal {G}_{n,m}\) which have a certain \(h\)-neighborhood distribution. To carry out this count, we introduce what we call a generalized configuration model.
The standard configuration model, introduced in Bollobás [6], allows one to compute asymptotically the number of graphs with a given degree sequence. Since here we want to uncover the \(h\)-neighborhood of a vertex and not only its degree, we need to generalize the usual construction. To keep track of the \(h\)-neighborhood structure, we introduce directed multigraphs with colored edges and analyze the associated configuration model; see Sect. 4. This will allow us to sample a random graph with a given sequence of \(h\)-neighborhoods, as long as these neighborhoods are rooted trees. As an application, we prove Corollary 1.5 at the end of Sect. 4. It seems to us that this new configuration model may turn out to be a natural tool in other applications as well. Finally, Sect. 6 is devoted to the proof of large deviation principles in the classical random graph ensembles. We stress that our methods allow in principle a much greater generality, since one could establish large deviation estimates for random graphs that are uniformly sampled from the class of all graphs with a given \(h\)-neighborhood distribution and not only with given degree sequences; see Remark 6.1.
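For orientation, the standard configuration model mentioned above can be sketched in a few lines. This is the classical half-edge pairing only, not the generalized colored model of Sect. 4; the function names and the rejection step are choices made for this illustration. The sketch relies on the well-known fact that, conditioned on producing a simple graph, the pairing is uniform over simple graphs with the prescribed degrees.

```python
import random


def configuration_model(degrees, rng):
    """Pair half-edges uniformly at random.

    Returns a list of (u, v) pairs that may contain loops and repeated edges.
    """
    if sum(degrees) % 2 != 0:
        raise ValueError("the sum of the degrees must be even")
    half_edges = [v for v, d in enumerate(degrees) for _ in range(d)]
    rng.shuffle(half_edges)
    return list(zip(half_edges[0::2], half_edges[1::2]))


def is_simple(edges):
    """True if the multigraph has no loop and no repeated edge."""
    seen = set()
    for u, v in edges:
        key = (min(u, v), max(u, v))
        if u == v or key in seen:
            return False
        seen.add(key)
    return True


def sample_simple_graph(degrees, rng, max_tries=10_000):
    """Rejection sampling of a simple graph with the given degree sequence."""
    for _ in range(max_tries):
        edges = configuration_model(degrees, rng)
        if is_simple(edges):
            return edges
    raise RuntimeError("no simple graph found within max_tries attempts")


edges = sample_simple_graph([2, 2, 2, 1, 1], random.Random(1))
```

Rejection is practical only when degrees are bounded, as in condition (C2), since the acceptance probability then stays bounded away from zero as the number of vertices grows.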
1.7 Related work
Large deviations in random graphs is a rapidly growing topic. For dense graphs, e.g. \(\mathcal {G}(n,p)\) with fixed \(p\in (0,1)\), a thorough treatment has been given recently by Chatterjee and Varadhan [14], in the framework of the cut topology introduced by Lovász and Szegedy [24], see also Borgs, Chayes, Lovász, Sós and Vesztergombi [10, 11]. In the sparse regime, only a few partial results are known. O’Connell [27], Biskup et al. [5] and Puhalskii [28] have proven large deviation asymptotics for the connectivity and for the size of the connected components. Large deviations for degree sequences of Erdős–Rényi graphs have been studied in Doku-Amponsah and Mörters [17] and Boucheron et al. [12, Theorem 7.1]. Closer to our approach, large deviations in the local weak topology were obtained for critical multi-type Galton–Watson trees by Dembo et al. [15]. Finally, large deviations for other models of statistical physics on Erdős–Rényi graphs have been considered in Rivoire [29] and Engel et al. [20].
As far as we know, this is the first time that large deviations of the neighborhood distribution are addressed in a systematic way. While our approach does not cover results on connectivity and the size of connected components such as [27], it does yield a simplification of some of the existing arguments concerning the large deviations for degree sequences. We point out that our Corollary 1.10 gives a corrected version of [17, Corollary 2.2]. Under a stronger sparsity assumption, large deviations of neighborhood distributions for random networks have been used in [9] to study the large deviations of the spectral measure of certain random matrices.
2 Local weak convergence
In this section, we first recall the basic notions of local weak convergence in the more general context of rooted multi-graphs; see [2, 4], and [1]. Then, we give a general tightness lemma.
2.1 Local convergence of rooted multi-graphs
Let \(V\) be a countable set. A multi-graph \(G = (V, \omega )\) is a vertex set \(V\) together with a map \(\omega \) from \(V^2\) to \(\mathbb {Z}_+\) such that for all \((u,v) \in V^2\), \(\omega ( u, u)\) is even and \( \omega (u,v) = \omega (v,u). \) For ease of notation, we sometimes set \(\omega (v) = \omega (v,v)\) for the weight of the loop at \(v\). If \(e = \{u,v\}\) is an unordered pair (\(u\ne v\)), we may also write \(\omega (e)\) in place of \(\omega (u,v)\). The edge set \(E\) of \(G\) is the set of unordered pairs \(e = \{ u,v\}\) such that \(\omega ( e) \geqslant 1\), \(\omega (e)\) being the multiplicity of the edge \(e \in E\). Similarly, \(\omega (v) /2\) is the number of loops attached to \(v\). A multi-graph with no loops and no edges of multiplicity greater than \(1\) is a graph.
The degree of \(v\) in \(G\) is defined by
\[ {\mathrm{deg}}(v) = \sum_{u\in V} \omega(v,u). \]
The multi-graph \(G\) is locally finite if for any vertex \(v\), \({\mathrm {deg}}(v)< \infty \).
We denote by \({\widehat{\mathcal {G}}}\) the set of all locally finite multi-graphs. For a multi-graph \(G\in {\widehat{\mathcal {G}}}\), to avoid possible confusion, we will often denote by \(V_G\), \(\omega _G\), \({\mathrm {deg}}_G\) the corresponding vertex set, weight and degree functions.
Recall that a path \(\pi \) from \(u\) to \(v\) of length \(k\) is a sequence \(\pi = (u_0, \dots , u_k)\) with \(u_0 = u\), \(u_k = v\) and, for \(0 \leqslant i \leqslant k-1\), \(\{ u_i, u_{i+1}\} \in E\). If such \(\pi :u\rightarrow v\) exists, the distance \(D(u,v)\) in \(G\) between \(u\) and \(v\) is defined as the minimal length of all paths from \(u\) to \(v\). If there is no path \(\pi :u\rightarrow v\), then the distance \(D(u,v)\) is set to be infinite. A multi-graph is connected if \(D(u,v)<\infty \) for any \(u\ne v \in V\).
Below, a rooted multi-graph \((G,o) = ( V, \omega , o)\) is a locally finite and connected multi-graph \((V, \omega )\) with a distinguished vertex \(o \in V\), the root. For \(t \geqslant 0\), we denote by \((G,o)_t\) the induced rooted multi-graph with vertex set \(\{u\in V: \,D(o,u)\leqslant t\}\). Two rooted multi-graphs \((G_i,o_i) = ( V_i, \omega _i, o_i )\), \(i \in \{1,2\}\), are isomorphic if there exists a bijection \(\sigma : V_{1} \rightarrow V_{2}\) such that \(\sigma ( o_1) = o_2\) and \(\sigma ( G_1) = G_2\), where \(\sigma \) acts on \(G_1\) through \(\sigma ( u, v ) = (\sigma ( u), \sigma (v) )\) and \(\sigma ( \omega ) = \omega \circ \sigma \). We will denote this equivalence relation by \((G_1,o_1) \simeq (G_2, o_2)\). The associated equivalence classes can be seen as unlabeled rooted multi-graphs. We call \({\widehat{\mathcal {G}}}^*\) the set of all such equivalence classes.
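As an illustration of the distance \(D\) and of the truncation \((G,o)_t\), the following minimal Python sketch computes graph distances from the root by breadth-first search and returns the vertex set of \((G,o)_t\). The representation of \(\omega \) as a dictionary on ordered pairs and the function names are assumptions made for this example only.

```python
from collections import deque


def distances_from(omega, root):
    """Graph distances D(root, .) in a multi-graph given by a symmetric weight map.

    `omega` maps ordered pairs (u, v) to multiplicities; absent pairs have
    multiplicity 0. Loops and multiplicities do not affect distances.
    """
    neighbours = {}
    for (u, v), weight in omega.items():
        if weight >= 1 and u != v:
            neighbours.setdefault(u, set()).add(v)
            neighbours.setdefault(v, set()).add(u)
    dist = {root: 0}
    queue = deque([root])
    while queue:
        u = queue.popleft()
        for v in neighbours.get(u, ()):
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist  # vertices unreachable from the root are absent (infinite distance)


def ball_vertices(omega, root, t):
    """Vertex set {u : D(root, u) <= t} of the truncation (G, root)_t."""
    return {u for u, d in distances_from(omega, root).items() if d <= t}


# Path 0 - 1 - 2 - 3 with a double edge {1, 2} and one loop at 0 (omega(0, 0) = 2).
omega = {
    (0, 0): 2,
    (0, 1): 1, (1, 0): 1,
    (1, 2): 2, (2, 1): 2,
    (2, 3): 1, (3, 2): 1,
}
```

Here `ball_vertices(omega, 0, 2)` is the vertex set of \((G,0)_2\); the induced multi-graph is obtained by restricting \(\omega \) to pairs of vertices in this set.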
We define the semi-distance \(d\) between two rooted multi-graphs \((G_1,o_1)\) and \((G_2,o_2)\) as
\[ d\big ((G_1,o_1),(G_2,o_2)\big ) = \frac{1}{1+T}, \]
where \(T\) is the supremum of those \(t > 0\) such that \((G_1,o_1)_t\) and \((G_2,o_2)_t\) are isomorphic. On the space \({\widehat{\mathcal {G}}}^*\), \(d\) is a distance. The associated topology will be referred to as the local topology. The space \(({\widehat{\mathcal {G}}}^*,d)\) is Polish (i.e. separable and complete) [1].
Explicit compact subsets of \({\widehat{\mathcal {G}}}^*\) can be constructed as follows. If \(g \in {\widehat{\mathcal {G}}}^*\) is the equivalence class of \((G,o)\), we define
\[ |g| = \sum _{v \in V_G} {\mathrm {deg}}_G(v), \]
i.e. twice the total number of edges in \(g\). For \(g \in {\widehat{\mathcal {G}}}^*\), \(t \in \mathbb {N}\), the truncation at distance \(t\), \(g_t\), is defined as the equivalence class of \((G,o)_t\) where the equivalence class of \((G,o)\) is \(g\).
Lemma 2.1
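Let \(t_0 \geqslant 0\) and \(\varphi : \mathbb {N}\rightarrow \mathbb {R}_+ \) be a non-negative function. Then
\[ K = \big \{ g \in {\widehat{\mathcal {G}}}^* : \, |g_t| \leqslant \varphi (t) \text { for all } t \geqslant t_0 \big \} \]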
is a compact subset of \(\,{\widehat{\mathcal {G}}}^*\) for the local topology.
Proof
For each \(t \geqslant t_0\), there are finitely many elements \(g \in {\widehat{\mathcal {G}}}^*\), say \(f_{t,1}, \dots , f_{t,n_t}\), such that \(| g | \leqslant \varphi (t)\) and every vertex of \(g\) is at distance at most \(t\) from the root. Therefore, the collection \(A_{t,1}, \dots , A_{t,n_t}\) where \(A_{t,k} = \{g \in {\widehat{\mathcal {G}}}^* : g _t = f_{t,k} \}\) is a finite covering of \(K\) of radius \(1/(1+t)\). \(\square \)
The notions of local weak convergence introduced in Sect. 1.1 are immediately extended to the present setting of multi-graphs. The definitions of \(U(G)\) in (1) and unimodularity (2) easily carry over to \(\mathcal {P}( {\widehat{\mathcal {G}}}^*)\). The next simple lemma is proved in [4].
Lemma 2.2
The set \(\mathcal {P}_{u} ({\widehat{\mathcal {G}}}^*)\) is closed in the local weak topology.
2.2 Compactness lemma for the local weak topology
Let \(G_n\) be a sequence of finite multi-graphs. We now give a condition which guarantees that the sequence \(U(G_n)\) is tight for the local weak topology. If \(G = (V, \omega )\) is a multi-graph, we define the degree of a subset \(S \subset V\) as
\[ {\mathrm {deg}}(S) = \sum _{v \in S} {\mathrm {deg}}(v). \]
The next lemma is a sufficient condition for tightness in \(\mathcal {P}_{u} ({\widehat{\mathcal {G}}}^*)\). A similar result appears in Benjamini et al. [3, Theorem 3.1]. We give an independent proof.
Lemma 2.3
Let \(\delta : [0,1] \rightarrow \mathbb {R}_+\) be a continuous increasing function such that \(\delta (0) = 0\). There exists a compact set \(\Pi = \Pi (\delta ) \subset \mathcal {P}_{u} ({\widehat{\mathcal {G}}}^*)\) such that if a finite multi-graph \(G = (V, \omega )\) satisfies
\[ {\mathrm {deg}}(S) \leqslant |V| \, \delta \Big ( \frac{|S|}{|V|} \Big ) \tag{15} \]
for all \(S \subset V\), then \(U(G) \in \Pi \).
For a sequence \(U(G_n), n \geqslant 1\), condition (15) amounts to uniform integrability of the degree sequences of the multi-graphs \((G_n), n \geqslant 1\). It may seem paradoxical that a condition on the degrees alone implies the tightness of the whole graph sequence. However, the unimodularity of \(U(G)\) yields enough uniformity for this result to hold.
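To make the condition concrete, here is a minimal Python sketch that tests it on a finite degree sequence. It assumes that condition (15) reads \({\mathrm {deg}}(S) \leqslant |V| \delta (|S|/|V|)\) for all \(S \subset V\), with \({\mathrm {deg}}(S)\) the sum of the degrees over \(S\); under this assumption, for each cardinality \(k\) the worst subset consists of the \(k\) vertices of largest degree, so it suffices to check the partial sums of the decreasingly sorted degrees. The function name is hypothetical.

```python
def satisfies_degree_condition(degrees, delta):
    """Check deg(S) <= n * delta(|S| / n) for every subset S of the n vertices.

    Assumes deg(S) is the sum of degrees over S. For fixed |S| = k the left
    side is maximised by the k largest degrees, so only n checks are needed.
    """
    n = len(degrees)
    partial_sum = 0
    for k, d in enumerate(sorted(degrees, reverse=True), start=1):
        partial_sum += d
        if partial_sum > n * delta(k / n):
            return False
    return True


# Example: a star-like degree sequence on 4 vertices.
degrees = [3, 1, 1, 1]
print(satisfies_degree_condition(degrees, lambda x: 4 * x ** 0.5))
print(satisfies_degree_condition(degrees, lambda x: x))
```

With \(\delta (x) = 4\sqrt{x}\) the sequence passes the test, while with \(\delta (x) = x\) the single vertex of degree \(3\) already violates it.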
Proof of Lemma 2.3
Since \({\widehat{\mathcal {G}}}^*\) is a Polish space, from Prohorov’s theorem, a set \(\Pi \subset \mathcal {P}({\widehat{\mathcal {G}}}^*)\) is relatively compact if and only if for any \(\varepsilon > 0\), there exists a compact \(K\subset {\widehat{\mathcal {G}}}^*\) such that for all \(\mu \in \Pi \), \(\mu ( K^c ) \leqslant \varepsilon \).
Set \(c = \delta (1)\). Without loss of generality, we may assume \(c > 1\). We consider the increasing function \([0, c] \mapsto [0,1]\)
Now, for each \(\varepsilon >0\), and integer \(t \geqslant 1\), we set
where the composition holds \(t\) times. We now define \(\Pi \) as being the closure of the set of measures \(\mu \) in \(\mathcal {P}_{u} ({\widehat{\mathcal {G}}}^*)\) such that for any \(\varepsilon > 0\), \(\mu ( K_\varepsilon ^c ) \leqslant \varepsilon \) where
By Lemma 2.1, \(K_\varepsilon \) is a compact set of \({\widehat{\mathcal {G}}}^*\). Hence, Prohorov’s theorem asserts that \(\Pi \) is a compact set of \(\mathcal {P}_{u} ({\widehat{\mathcal {G}}}^*)\).
We now check that \(\rho = U(G) \in \Pi \). This will conclude the proof of our lemma. It is sufficient to prove that \(\rho ( K_\varepsilon ) \geqslant 1 - \varepsilon \) for all \(\varepsilon >0\). Let \(t \geqslant 0\) be an integer. For \(S \subset V\), let \(B(S,t)\) denote the set of vertices at distance at most \(t\) from a vertex in \(S\). In particular, if \(v\in V\) and \(g\) is the equivalence class of \((G(v),v)\) we have
Notice also that \(| B (S, 1) | \leqslant {\mathrm {deg}}(S)\). Set \(|V| = n\). By iteration on (15), it follows that if \(S \subset V\) is such that \(|S| \leqslant h_\varepsilon ( t) n \) then
Moreover, from (15), we have
Hence, using Markov's inequality, we deduce that the set
has cardinality at most \( h_\varepsilon ( t) n \). From what precedes, the set
has cardinality at most \(2^{-t} \varepsilon n\). Note that, if \( v \notin U_t\), then \({\mathrm {deg}}( B (v, t) )\) is bounded by
This implies that the set
has cardinality at most \(2^{-t} \varepsilon n\). So finally, from the union bound, the set
has cardinality at least \((1 - \varepsilon )n\). We have thus checked that \(\rho ( K_\varepsilon ) \geqslant 1 - \varepsilon \). \(\square \)
3 Unimodular Galton–Watson trees with given neighborhood
The aim of this section is to prove Proposition 1.1. We thus fix \(h \in \mathbb {N}\) and \(P \in \mathcal {P}(\mathcal {T}^*_h)\) admissible. We start with some simple observations which ensure that \({\mathrm {UGW}}_h(P)\) is indeed well defined.
First observe that if \(\tau \in \mathcal {T}^* _{h}\), \(t' \in \mathcal {T}^* _{h-1}\) and \(S = \tau \cup t'_+\), then (recall the definition of \( \tau \cup t'_+\) and Fig. 2)
Therefore, for any \(t,t' \in \mathcal {T}^* _{h-1}\),
where \(e_P\) was defined by (4). We thus have checked that \({\widehat{P}}_{t,t'}\) defined by (5) is indeed a probability measure on \(\mathcal {T}^*_h\). Consequently, the probability measure \({\mathrm {UGW}}_h(P)\) is well defined.
3.1 Unimodularity
The next lemma is a direct argument for the unimodularity of \({\mathrm {UGW}}_h(P)\), which establishes the first part of Proposition 1.1. We remark however that this fact could be derived indirectly from Theorem 4.8 and Lemma 4.9 below, which ensure in particular that \({\mathrm {UGW}}_h(P)\) is sofic (and hence unimodular).
Lemma 3.1
Fix \(h\in \mathbb {N}\) and \(P\in \mathcal {P}(\mathcal {T}_h^*)\) admissible. The measure \({\mathrm {UGW}}_h(P)\) is unimodular.
Proof
It is sufficient to check the so-called involution invariance, i.e. that (2) holds with \(f\) restricted to functions \(f : \mathcal {G}^{**} \rightarrow \mathbb {R}_+\) such that \(f ( G, u, v) = 0\) unless \(\{ u, v\} \in E_G\); see [1]. Recall that we may extend \(f : \mathcal {G}^{**} \rightarrow \mathbb {R}_+\) to all connected graphs with two distinguished roots \((G,u,v)\) through the isomorphism class.
Let \((T,o)\) be the random rooted tree defined in the introduction whose equivalence class has law \({\mathrm {UGW}}_h (P)\). Recall that the neighbors of the root \(o\) are indexed by \(1, \dots , {\mathrm {deg}}_T(o)\) and that the vector of subtrees \((T(o,1),\ldots ,T(o,{\mathrm {deg}}_T(o)))\) is exchangeable. We write
where, in the summand, \( S = \tau \cup t'_+\). Now, (5) and (16) imply
For \(t,t' \in \mathcal {T}^* _{h-1}\), we introduce a new random tree \(H=H_{t,t'}\) defined as follows. Start with two vertices \(o\) and \(o'\) which are connected by an edge. Attach the tree \(t\) to \(o\) and the tree \(t'\) to \(o'\), so that the type of \(o\) is \((t,t')\) and the type of \(o'\) is \((t',t)\). Sample independently \(H(o',o)_{h}\) according to \({\widehat{P}}_{t,t'}\) and \(H(o,o')_{h}\) according to \({\widehat{P}}_{t',t}\). The subtrees \(H(o',o)_{h}\) and \(H(o,o')_{h}\) define the types of the children of \(o\) and \(o'\). Next, sample independently their rooted subtrees, according to their types, i.e. \(H(o,v)_h\) (resp. \(H(o',v)_h\)) is sampled according to \({\widehat{P}}_{a,b}\) if \(v\sim o\) (resp. \(v\sim o'\)) has type \((a,b)\). Repeating recursively for all children defines the random tree \(H\). From the definition of \({\mathrm {UGW}}_h (P)\), one has
where we use \(\mathbb {E}_{t,t'}\) for expectation over the random \(H=H_{t,t'}\) defined above. It follows that
Similarly,
where the second identity follows from the symmetry in \(o,o'\) in the definition of \(H\), which implies that \(\mathbb {E}_{t,t'} [ f ( H, o',o ) ]=\mathbb {E}_{t',t} [ f ( H, o,o' ) ]\).
Finally, the assumption \(e_P(t,t') = e_P(t',t)\) yields
\(\square \)
3.2 Consistency lemma
We turn to the second part of Proposition 1.1. The following lemma computes the law of the \((h+1)\)-neighborhood of a Galton–Watson tree with a given \(h\)-neighborhood.
Lemma 3.2
Fix \(h\in \mathbb {N}\), \(P \in \mathcal {P}( \mathcal {T}^*_h)\) admissible and set \(\rho = {\mathrm {UGW}}_h ( P)\). For any \(\tau \in \mathcal {T}^{*}_{h+1}\) with \({\mathrm {deg}}_\tau (o) = d\), we have
where
- \((t^i \in \mathcal {T}^*_{h}, 1 \leqslant i \leqslant d)\) are the subtrees of \(\tau \) attached to the offspring of the root, and for \(1 \leqslant i \leqslant d\), \(s^i = (t^i)_{h-1}\);
- \(\{s^a\}_{a \in \mathcal {A}}\) is the set of distinct elements of \(( s^i, 1\leqslant i \leqslant d )\), and, for each \(a \in \mathcal {A}\), \(\{t^{a,b}\}_{b \in \mathcal {B}_a}\) is the set of distinct elements of \((t^i, 1\leqslant i \leqslant d)\) such that \(( t^{a,b} )_{h-1} = s^a \);
- \(n_a\) is the number of \(s^i\)’s equal to \(s^a\) and \(k_{a,b}\) is the number of \(t^i\)’s equal to \(t^{a,b}\);
- \(s^{-a} = (t^{-a})_{h-1}\) and \(t^{-a} \in \mathcal {T}^*_{h}\) is the tree obtained from \(\tau _{h}\) by removing one offspring with subtree equal to \(s^{a}\).
Proof
Using \(\rho _h = P\), for a fixed \(\tau \in \mathcal {T}^{*}_{h+1}\), the above definitions allow us to write
Observe that \(T(o,v)_{h} = t^{a,b}\) implies that \(T(o,v)_{h-1} = s^a\). Moreover, given \((T,o)_{h} = t\), \(T(o,v)_{h-1} = s^a\) implies that \(T(v,o)_{h-1} = s^{-a}\), i.e. the type of vertex \(v\) is \((s^a, s^{-a})\). The lemma is then a consequence of the conditional independence of the subtrees attached to the offspring of the root given \((T,o)_h\). \(\square \)
Lemma 3.3
Fix integers \( k>h\geqslant 1\), \(P \in \mathcal {P}( \mathcal {T}^*_h)\) admissible and set \(\rho = {\mathrm {UGW}}_h ( P)\). Then
Proof
By recursion, it suffices to prove the statement for \(k = h+1\). For \(s \in \mathcal {T}^*_{h-1}\) such that \(e_P(s,s') >0\) for some \(s' \in \mathcal {T}^* _{h-1}\), we may define the probability measure
In words, \({\widehat{\mathbb {P}}}_s\in \mathcal {P}(\mathcal {T}^*)\) is the law of the whole subtree \(T(o,v)\) of a neighbor \(v\) of the root given that \(T(v,o)_{h-1} = s\), where \((T,o)\) has law \(\rho \). Next, we show that, for \(s, s' \in \mathcal {T}^*_{h-1}\), \(t \in \mathcal {T}^*_h\) such that \(t_{h-1} = s\) and \(e_P(s,s') >0\), one has
where \((T,o)\) is now the random variable with law \({\widehat{\mathbb {P}}}_{s'}\). Since \(P=\rho _h\) one has \(e_P(s,s') = \mathbb {E}_{\rho } \big | \big \{ v \mathop {\sim }\limits ^{T} o : T(o,v)_{h-1} = s \;, T(v,o)_{h-1} = s' \big \}\big |\), and therefore
However, with \( n = \big | \big \{ v \mathop {\sim }\limits ^{t} o : t(o,v) = s' \big \} \big |, \) we deduce from the unimodularity of \(\rho \) and (16) that
This proves (17).
Now we set \(Q=\rho _{h+1}\) and \(\rho ' = {\mathrm {UGW}}_{h+1} ( Q)\). Our aim is to prove that \(\rho '= \rho \). It is sufficient to prove that for any \(t, t' \in \mathcal {T}^*_{h}\) and \(\tau \in \mathcal {T}^*_{h+1}\) such that \(\tau _{h} = t\) and \(e_Q (t,t') >0\),
where \(s'=t'_{h-1}\). Indeed, since \(\rho '\) and \(\rho \) have the same \(h+1\) neighborhood, this would prove that they have in fact the same \(h+2\) neighborhood and, by conditional independence, we would deduce that \(\rho = \rho '\).
Let us prove (19). Set
where, as above, \(t=\tau _h\) and \(s'=t'_{h-1}\). Since \((\tau \cup t'_+)_h = t \cup s'_+\), and \(\rho '_{h+1}=\rho _{h+1}\), we have
As in Lemma 3.2, let \(t^i \in \mathcal {T}^*_{h}, 1 \leqslant i \leqslant d\) be the subtrees of \(\tau \) attached to the offspring of the root and call \(s^i\) their restriction to \(\mathcal {T}^*_{h-1}\). By construction, \(k\) elements of the \(t^i\)’s are equal to \(t'\) and \(n\) elements of the \(s^i\)’s are equal to \(s'\). Let \((s^a)_a\) be the set of distinct elements of the set \(\{s'\}\cup \{s^i, 1\leqslant i \leqslant d \}\), and, for each \(a\), let \((t^{a,b})_b\) denote the distinct elements of \(\{t'\}\cup \{t^i, 1\leqslant i \leqslant d \}\) such that \(t^{a,b}\) restricted to \(\mathcal {T}^*_{h-1}\) is \(s^a\). We denote by \(n_a\) the number of \(s^i\)’s equal to \(s^a\) and by \(k_{a,b}\) the number of \(t^i\)’s equal to \(t^{a,b}\). We set \(n'_a = n_a + \mathbf {1}( s^a = s')\) and \(k'_{a,b} = k_{a,b} + \mathbf {1}( t^{a,b} = t')\). Then, Lemma 3.2 yields
where \(s^{-a} = [(t\cup s'_+)^{-a}]_{h-1}\) and \((t\cup s'_+)^{-a} \in \mathcal {T}^*_{h}\) is the tree obtained from \(t \cup s'_+\) by removing one of the offspring with subtree equal to \(s^{a}\). Thus, we find
Since \(\rho _{h+1}=\rho '_{h+1}\), one has
By sampling the \(h\)-neighborhood \((T,o)_h\) first, and using the number \(n\) as above, one has
Next, we show that the right hand side in (19) equals the above expression. We have
where we have used unimodularity and (18). Now, by sampling first the \(h\)-neighborhood \((T,o)_h\), one finds that
where, as before, \(k=k(t')\) stands for the number of \(v \mathop {\sim }\limits ^{\tau } o\) such that \(\tau (o,v) = t'\). Using Lemma 3.2 in the form (20), and the fact that \(\sum _{t' : t'_{h-1} = s'}{\widehat{P}}_{s',s} (t')=1\), we find
Hence,
The identity (19) follows from (24) and (23). \(\square \)
Remark 3.4
From (22) one deduces the identity
for any \(t,t' \in \mathcal {T}^*_{h}\), with \(s=t_{h-1},s'=t'_{h-1}\), for any \(P\in \mathcal {P}(\mathcal {T}^*_h)\) admissible, with \(Q=[{\mathrm {UGW}}_{h}(P)]_{h+1}\).
4 Configuration model for directed graphs with colored edges
This section introduces a generalized configuration model, to be used later on to count the number of graphs with a given tree-like neighborhood distribution.
4.1 Directed multi-graphs with colors
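We are now going to define a family of directed multi-graphs with colored edges. Let \(L\) be a fixed positive integer. Each pair \((i,j)\) with \( 1 \leqslant i, j \leqslant L\) is interpreted as a color. Define the sets of colors
\[ \mathcal {C}= \{ (i,j) : 1 \leqslant i, j \leqslant L \}, \qquad \mathcal {C}_< = \{ (i,j) \in \mathcal {C}: i < j \}, \qquad \mathcal {C}_= = \{ (i,i) : 1 \leqslant i \leqslant L \}. \]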
Also, define \(\mathcal {C}_\leqslant = \mathcal {C}_<\cup \mathcal {C}_=\), \(\mathcal {C}_>=\mathcal {C}{\setminus } \mathcal {C}_\leqslant \) and \(\mathcal {C}_{\ne }= \mathcal {C}{\setminus } \mathcal {C}_=\). If \(c = (i,j) \in \mathcal {C}\), then set \({\bar{c}} = (j,i)\) for the conjugate color.
We consider the class \({\widehat{\mathcal {G}}}(\mathcal {C})\) of directed multi-graphs with \(\mathcal {C}\)-colored edges defined as follows. We say that a directed multi-graph \(G\) is an element of \({\widehat{\mathcal {G}}}(\mathcal {C})\) if \(G = (V, \omega )\) where \(V=[n]\) for some \(n\in {\mathbb {N}}\), \(\omega =\{\omega _c\}_{c\in \mathcal {C}}\) and for each \(c\in \mathcal {C}\), \(\omega _c\) is a map \(\omega _c : V^2 \rightarrow \mathbb {Z}_+\) with the following properties: if \(c\in \mathcal {C}_=\), then \(\omega _c (u,u)\) is even for all \(u\in V\), and \(\omega _c (u,v) = \omega _{ c} (v,u)\) for all \(u,v\in V\); if \(c\in \mathcal {C}_{\ne }\), then \(\omega _c (u,v) = \omega _{ {\bar{c}}} (v,u)\) for all \(u,v\in V\). The interpretation is that, for any \(c\in \mathcal {C}\), if \(u\ne v\) then \(\omega _c (u,v)\) is the number of directed edges of color \(c\) from \(u\) to \(v\); if \(u=v\) and \(c\in \mathcal {C}_=\), then \(\frac{1}{2} \omega _c (u,u)\) is the number of loops of color \(c\) at \(u\), while if \(u=v\) and \(c\in \mathcal {C}_{<}\) then \(\omega _c (u,u)=\omega _{{\bar{c}}} (u,u)\) is the number of loops of color \(c\) at \(u\); we adopt the convention that there are no loops of color \(c\in \mathcal {C}_>\) at any vertex. We call \(\mathcal {G}(\mathcal {C})\) the subset of \( {\widehat{\mathcal {G}}}(\mathcal {C})\) consisting of graphs, i.e. \(G=(V,\omega )\) such that \(\omega _c(u,v)\in \{0,1\}\) for all \(c\in \mathcal {C}\) and \(u,v\in V\) (no multiple edges) and \(\omega _c(u,u)=0\) for all \(c\in \mathcal {C}\) and \(u\in V\) (no loop). See Fig. 3 for an example of an element of \({\widehat{\mathcal {G}}}(\mathcal {C})\).
If \(G\in {\widehat{\mathcal {G}}}(\mathcal {C})\), one can define the colorblind multi-graph \({\bar{G}}=(V,{\bar{\omega }})\), by setting
\[ {\bar{\omega }}(u,v) = \sum _{c \in \mathcal {C}} \omega _c(u,v), \qquad u,v \in V. \tag{25} \]
The multi-graph \({\bar{G}}=(V,{\bar{\omega }})\) can be identified with an undirected multi-graph, in that by construction \({\bar{\omega }}(u,v)={\bar{\omega }}(v,u)\) for all \(u,v\in V\). We say that \(G\) is a simple graph if \({\bar{G}}\) has no loops and no multiple edges. Clearly, if \(L=1\) then there is only one color, so that any multi-graph \(G\in {\widehat{\mathcal {G}}}(\mathcal {C})\) coincides with its own \({\bar{G}}\).
If \(G\in {\widehat{\mathcal {G}}}(\mathcal {C})\), \(c\in \mathcal {C}\) and \(u\in V\), set
\[ D_c(u) = \sum _{v \in V} \omega _c(u,v), \tag{26} \]
and write \(D(u)=\{D_c(u),c\in \mathcal {C}\}\). Note that \(D(u)\) is an element of \(\mathcal {M}_L \), defined as the set of \(L\times L\) matrices with nonnegative integer valued entries. The vector \(\mathbf {D}= \{D(u), u\in V\}\) of such matrices will be called the degree sequence of \(G\).
4.2 Directed colored multi-graphs with given degree sequence
Fix \(n\in {\mathbb {N}}\), and let \(\mathcal {D}_n\) denote the set of all vectors \((D(1),\dots ,D(n))\) such that \(D(i)=\{D_c(i),\, c\in \mathcal {C}\}\in \mathcal {M}_L \) for all \(i\in [n]\), and such that
\[ S = \sum _{i=1}^n D(i) \]
is a symmetric matrix with even coefficients on the diagonal, i.e. \(S=\{S_c,\, c\in \mathcal {C}\}\), \(S_c=S_{{\bar{c}}}\) for all \(c\in \mathcal {C}\), and \(S_c\in 2\mathbb {Z}_+\) for all \(c\in \mathcal {C}_=\). Clearly, if \(G\in {\widehat{\mathcal {G}}}(\mathcal {C})\) then the vector \(\mathbf {D}\) defined by (26) yields an element of \(\mathcal {D}_n\) for some \(n\). Next, for a given \(\mathbf {D}\in \mathcal {D}_n\) we consider the set of all elements of \({\widehat{\mathcal {G}}}(\mathcal {C})\) which have \(\mathbf {D}\) as their degree sequence.
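The following minimal Python sketch illustrates these definitions on a small example. It assumes that \(D_c(u)\) in (26) is the sum of \(\omega _c(u,v)\) over \(v\); the dictionary layout (keys `(c, u, v)` with `c = (i, j)`) and the function names are choices made for this illustration, with vertices labelled from \(0\) for convenience.

```python
from itertools import product


def degree_sequence(omega, n, L):
    """Return D with D[u][c] = sum over v of omega_c(u, v), for u in range(n)."""
    colors = list(product(range(1, L + 1), repeat=2))
    D = [{c: 0 for c in colors} for _ in range(n)]
    for (c, u, _v), weight in omega.items():
        D[u][c] += weight
    return D


def belongs_to_D_n(D, L):
    """Check that S = sum_i D(i) is symmetric with even diagonal entries."""
    colors = list(product(range(1, L + 1), repeat=2))
    S = {c: sum(D_u[c] for D_u in D) for c in colors}
    symmetric = all(S[(i, j)] == S[(j, i)] for (i, j) in colors)
    even_diagonal = all(S[(i, i)] % 2 == 0 for i in range(1, L + 1))
    return symmetric and even_diagonal


# L = 2 colors per coordinate, three vertices 0, 1, 2:
# one edge 0 -> 1 of color (1, 2), hence an edge 1 -> 0 of color (2, 1);
# one edge of color (1, 1) between 1 and 2; one loop of color (2, 2) at 2.
omega = {
    ((1, 2), 0, 1): 1, ((2, 1), 1, 0): 1,
    ((1, 1), 1, 2): 1, ((1, 1), 2, 1): 1,
    ((2, 2), 2, 2): 2,
}
D = degree_sequence(omega, n=3, L=2)
```

In this example \(S_{(1,1)} = S_{(2,2)} = 2\) and \(S_{(1,2)} = S_{(2,1)} = 1\), so `belongs_to_D_n(D, 2)` holds.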
Definition 4.1
Fix \(n\in {\mathbb {N}}\) and \(\mathbf {D}\in \mathcal {D}_n\).
- \({\widehat{\mathcal {G}}}(\mathbf {D})\) is the set of multi-graphs \(G\in {\widehat{\mathcal {G}}}(\mathcal {C})\) with \(V=[n]\) such that the degree sequence of \(G\) defined by (26) coincides with \(\mathbf {D}\).
- \(\mathcal {G}(\mathbf {D},h)\) is the set of \(G\in {\widehat{\mathcal {G}}}(\mathbf {D})\) such that the colorblind graph \({\bar{G}}\) defined in (25) contains no cycle of length \(\ell \leqslant h\).
We also use the notation \(\mathcal {G}(\mathbf {D})\) for the set of simple graphs in \( {\widehat{\mathcal {G}}}(\mathbf {D})\). This set coincides with \(\mathcal {G}(\mathbf {D},2)\), since loops are cycles with length \(1\) and multiple edges are cycles with length \(2\). The main goal of this section is to provide asymptotic formulas for the cardinality of \(\mathcal {G}(\mathbf {D})\) and more generally of \(\mathcal {G}(\mathbf {D},h)\), for any \(h\in \mathbb {N}\). To this end we introduce a natural extension of the usual configuration model from [7], see also [23].
Fix a multi-graph \(G\in {\widehat{\mathcal {G}}}(\mathbf {D})\). For a fixed \(c\in \mathcal {C}_=\), let \(G_c\) denote the subgraph of \(G\) obtained by removing all edges but the ones with color \(c\). If \(c\in \mathcal {C}_<\) instead, then define \(G_c\) as the subgraph of \(G\) obtained by removing all edges but the ones with color \(c\) or \({\bar{c}}\). Thus, every \(G\in {\widehat{\mathcal {G}}}(\mathbf {D})\) is the result of the superposition of the multi-graphs \(G_c\), \(c\in \mathcal {C}_\leqslant \). We may then analyze each color separately.
4.2.1 Configuration model for \(c\in \mathcal {C}_=\)
When \(c\in \mathcal {C}_=\), every pair \(u,v\) satisfies \(\omega _c(u,v)=\omega _c(v,u)\), so \(G_c\) is actually a multi-graph with undirected edges, and we may use the usual construction [7, Section 2.4]. We provide the details for completeness. The degrees of \(G_c\) are fixed by the sequence \(D_c(1),\dots ,D_c(n)\). Let \(W_c=\cup _{i=1}^nW_c(i)\) be a fixed set of \(S_c=\sum _{i=1}^nD_c(i)\) points, with the subsets \(W_c(i)\) satisfying \(|W_c(i)|=D_c(i)\). Recall that \(S_c\) is even by assumption. Let \(\Sigma _c\) be the set of all perfect matchings of the complete graph over the points of \(W_c\), i.e. the set of all partitions of \(W_c\) into disjoint edges. Then,
\[ |\Sigma _c| = (S_c-1)!! = (S_c-1)(S_c-3)\cdots 1. \]
Elements of \(\Sigma _c\) are called configurations. For any configuration \(\sigma _c\in \Sigma _c\), call \(\Gamma (\sigma _c)\) the multi-graph on \([n]\) with undirected edges obtained by including an edge \(\{i,j\}\) iff \(\sigma _c\) has a pair with one element in \(W_c(i)\) and the other in \(W_c(j)\). Notice that \(\Gamma (\sigma _c)\) has the same degree sequence \(D_c(1),\dots ,D_c(n)\) as \(G_c\). Moreover, any multi-graph with that degree sequence equals \(\Gamma (\sigma _c)\) for some \(\sigma _c\in \Sigma _c\).
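The construction can be sketched in a few lines of Python: a uniformly random perfect matching of the half-edges is obtained by shuffling them and pairing consecutive entries, and \(\Gamma (\sigma _c)\) is read off from the pairs. The function name and the dictionary representation of \(\omega _c\) are assumptions of this sketch, and vertices are labelled \(0,\dots ,n-1\) rather than \(1,\dots ,n\).

```python
import random


def sample_configuration_multigraph(degrees, rng):
    """Sample Gamma(sigma_c) for a uniform perfect matching sigma_c of half-edges.

    `degrees[i]` plays the role of D_c(i). Returns a dict omega with
    omega[(i, j)] = omega[(j, i)] the number of edges {i, j}, and
    omega[(i, i)] equal to twice the number of loops at i.
    """
    if sum(degrees) % 2 != 0:
        raise ValueError("the total degree S_c must be even")
    # One entry per point of W_c(i); only the owner vertex i matters for Gamma.
    half_edges = [i for i, d in enumerate(degrees) for _ in range(d)]
    # Pairing consecutive entries of a uniform shuffle gives a uniform matching.
    rng.shuffle(half_edges)
    omega = {}
    for a, b in zip(half_edges[0::2], half_edges[1::2]):
        if a == b:
            omega[(a, a)] = omega.get((a, a), 0) + 2
        else:
            omega[(a, b)] = omega.get((a, b), 0) + 1
            omega[(b, a)] = omega.get((b, a), 0) + 1
    return omega


degrees = [3, 2, 2, 1]
omega = sample_configuration_multigraph(degrees, random.Random(0))
```

Whatever the outcome of the shuffle, summing `omega[(u, v)]` over `v` recovers `degrees[u]`, in agreement with the remark that \(\Gamma (\sigma _c)\) has the prescribed degree sequence.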
Lemma 4.2
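Fix \(c\in \mathcal {C}_=\). Let \(H\) be a multi-graph on \([n]\) with undirected edges and with degree sequence \(D_c(1),\dots ,D_c(n)\). The number of \(\sigma _c\in \Sigma _c\) such that \(\Gamma (\sigma _c)=H\) is given by
\[ n_c(H) = \frac{\prod _{i=1}^n D_c(i)!}{\prod _{i=1}^n \big (\omega _c(i,i)/2\big )!\, 2^{\omega _c(i,i)/2} \, \prod _{i<j} \omega _c(i,j)!}, \tag{28} \]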
where \(\omega _c(i,j)\) is the number of edges between nodes \(\{i,j\}\) in \(H\), while \(\omega _c(i,i)/2\) is the number of loops at node \(i\) in \(H\).
Proof
We need to count the number of matchings \(\sigma _c\in \Sigma _c\) such that for every \(i<j\) one has \(\omega _c(i,j)\) edges between \(W_c(i)\) and \(W_c(j)\), and such that for all \(i\) one has \(\frac{1}{2}\omega _c(i,i)\) edges within \(W_c(i)\). Fix \(i<j\). Once we choose the \(\omega _c(i,j)\) elements of \(W_c(i)\) and the \(\omega _c(i,j)\) elements of \(W_c(j)\) to be matched together to produce the \(\omega _c(i,j)\) edges, then there are \(\omega _c(i,j)!\) distinct matchings that produce the same graph. Similarly, once we fix the \(\omega _c(i,i)\) elements of \(W_c(i)\) to be matched together to produce the \(\frac{1}{2}\omega _c(i,i)\) loops at \(i\), then there are \((\omega _c(i,i)-1)!!\) distinct matchings that produce the same graph. On the other hand, for every node \(i\) there are
distinct ways of choosing the elements of \(W_c(i)\) to be matched with \(W_c(1),\dots ,W_c(n)\) respectively. Putting all together we arrive at the following expression for the total number of configurations producing the graph \(H\):
which can be rewritten as (28). \(\square \)
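The counting argument can be checked by brute force on small degree sequences. The Python sketch below enumerates all perfect matchings of the half-edges, groups them by the resulting multi-graph, and compares each count with the closed form \(\prod _i D_c(i)! \big / \big ( \prod _i (\omega _c(i,i)/2)!\, 2^{\omega _c(i,i)/2} \prod _{i<j}\omega _c(i,j)! \big )\), which is the expression obtained by simplifying the product described in the proof with \((2m)!/(2m-1)!! = 2^m m!\). This form is an assumption of the sketch and may be written differently in (28); all function names are hypothetical.

```python
from math import factorial


def perfect_matchings(points):
    """Yield every perfect matching of the list `points` as a list of pairs."""
    if not points:
        yield []
        return
    first, rest = points[0], points[1:]
    for k, partner in enumerate(rest):
        remaining = rest[:k] + rest[k + 1:]
        for matching in perfect_matchings(remaining):
            yield [(first, partner)] + matching


def gamma(matching, n):
    """Weight matrix of the multi-graph induced by a matching of half-edges."""
    omega = [[0] * n for _ in range(n)]
    for (i, _a), (j, _b) in matching:  # half-edges are (vertex, index) pairs
        if i == j:
            omega[i][i] += 2  # a loop contributes 2 to omega(i, i)
        else:
            omega[i][j] += 1
            omega[j][i] += 1
    return tuple(map(tuple, omega))


def count_formula(omega):
    """Closed-form number of matchings producing the multi-graph omega."""
    n = len(omega)
    numerator, denominator = 1, 1
    for i in range(n):
        numerator *= factorial(sum(omega[i]))  # D_c(i)!
        loops = omega[i][i] // 2
        denominator *= factorial(loops) * 2 ** loops
        for j in range(i + 1, n):
            denominator *= factorial(omega[i][j])
    return numerator // denominator


def brute_force_counts(degrees):
    """Map each multi-graph H to the number of matchings sigma with Gamma(sigma) = H."""
    points = [(i, a) for i, d in enumerate(degrees) for a in range(d)]
    counts = {}
    for matching in perfect_matchings(points):
        H = gamma(matching, len(degrees))
        counts[H] = counts.get(H, 0) + 1
    return counts
```

For the degree sequence \((3,2,1)\) there are \(5!! = 15\) matchings, and `brute_force_counts((3, 2, 1))` agrees with `count_formula` on each of the resulting multi-graphs.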
4.2.2 Configuration model for \(c\in \mathcal {C}_<\)
When \(c\in \mathcal {C}_<\), every pair \(u,v\) satisfies \(\omega _c(u,v)=\omega _{{\bar{c}}}(v,u)\), so for the multi-graph \(G_c\), \(D_c(i)\) represents the number of outgoing edges at node \(i\), which equals the number of incoming edges at that node. Here we use a bipartite version of the previous construction. Let \(W_c=\cup _{i=1}^nW_c(i)\) be a fixed set of \(S_c=\sum _{i=1}^nD_c(i)\) points, with the subsets \(W_c(i)\) satisfying \(|W_c(i)|=D_c(i)\). Similarly, set \({\bar{W}}_c=\cup _{i=1}^n{\bar{W}}_c(i)\), with \(|{\bar{W}}_c(i)|=D_{{\bar{c}}}(i)\). Consider the set \(\Sigma _c\) of all perfect matchings of the complete bipartite graph over the sets \((W_c,{\bar{W}}_c)\), i.e. the set of perfect matchings containing only edges connecting an element of \(W_c\) with an element of \({\bar{W}}_c\). Since \(S_c=S_{{\bar{c}}}\), one has \(|W_c|=|{\bar{W}}_c|\), and \(\Sigma _c\) can be identified with the set of permutations of \(S_c\) objects, or the set of bijective maps \(W_c\mapsto {\bar{W}}_c\), and \(|\Sigma _c|=S_c !\). A configuration is an element \(\sigma _c\in \Sigma _c\). For any configuration \(\sigma _c\), let \(\Gamma (\sigma _c)\) denote the directed multi-graph on \([n]\) obtained by including the directed edge \((i,j)\) with color \(c\) and the edge \((j,i)\) with color \({\bar{c}}\) iff \(\sigma _c\) has a pair with one element in \(W_c(i)\) and the other in \({\bar{W}}_c(j)\). Notice that \(\Gamma (\sigma _c)\) has the same degree sequence \(D_c(1),\dots ,D_c(n)\) as \(G_c\), and any multi-graph with directed colored edges and the same degree sequence equals \(\Gamma (\sigma _c)\) for some \(\sigma _c\in \Sigma _c\).
Lemma 4.3
Fix \(c\in \mathcal {C}_<\). Let \(H\) be a multi-graph on \([n]\) with directed edges with colors \((c,{\bar{c}})\) only and with degree sequence \(D_c(1),\dots ,D_c(n)\). The number of \(\sigma _c\in \Sigma _c\) such that \(\Gamma (\sigma _c)=H\) is given by
\[ n_c(H) = \frac{\prod _{i=1}^n D_c(i)! \, D_{{\bar{c}}}(i)!}{\prod _{i,j=1}^n \omega _c(i,j)!}, \tag{29} \]
where \(\omega _c(i,j)=\omega _{{\bar{c}}}(j,i)\) is the number of edges from \(i\) to \(j\) with color \(c\) in \(H\).
Proof
We have to count the number of bijective maps \(W_c\mapsto {\bar{W}}_c\) such that for every \(i,j\in [n]\) (including the case \(i=j\)), \(\omega _c(i,j)\) elements of \(W_c(i)\) are mapped to \({\bar{W}}_c(j)\). We begin by choosing, for every fixed node \(i\), the subsets of \(W_c(i)\) that are mapped into \({\bar{W}}_c(k)\), \(k=1,\dots ,n\), and the subsets of \({\bar{W}}_c(i)\) that are mapped into \(W_c(k)\), \(k=1,\dots ,n\). This can be done in
distinct ways. Once these subsets are chosen there remain, for every \(i,j\), \(\omega _c(i,j)!\) distinct bijections producing the same graph. Therefore, the total number of bijections from \(W_c\) to \({\bar{W}}_c\) which preserve the numbers \(\omega _c(i,j)=\omega _{{\bar{c}}}(j,i)\) is given by
The latter expression can be rewritten as (29). \(\square \)
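As for the symmetric colors, the count can be verified by brute force on small examples. The Python sketch below enumerates all bijections between the out-points and the in-points, groups them by the resulting directed multi-graph, and compares each count with \(\prod _i D_c(i)!\,D_{{\bar{c}}}(i)! \big / \prod _{i,j}\omega _c(i,j)!\), the expression obtained by simplifying the product described in the proof. This closed form is an assumption of the sketch and may be written differently in (29); the function names are hypothetical.

```python
from itertools import permutations
from math import factorial, prod


def brute_force_counts_bipartite(out_degrees, in_degrees):
    """Map each directed multi-graph to the number of bijections producing it.

    `out_degrees[i]` plays the role of D_c(i) and `in_degrees[i]` that of
    D_cbar(i); their sums must coincide.
    """
    n = len(out_degrees)
    sources = [i for i, d in enumerate(out_degrees) for _ in range(d)]
    targets = [j for j, d in enumerate(in_degrees) for _ in range(d)]
    if len(sources) != len(targets):
        raise ValueError("total out-degree and in-degree must be equal")
    counts = {}
    for perm in permutations(range(len(targets))):
        omega = [[0] * n for _ in range(n)]
        for a, b in enumerate(perm):
            omega[sources[a]][targets[b]] += 1  # edge of color c from i to j
        key = tuple(map(tuple, omega))
        counts[key] = counts.get(key, 0) + 1
    return counts


def count_formula_bipartite(omega):
    """Closed-form number of bijections producing the weight matrix omega_c."""
    n = len(omega)
    numerator = prod(factorial(sum(row)) for row in omega)  # D_c(i)!
    numerator *= prod(
        factorial(sum(omega[i][j] for i in range(n))) for j in range(n)
    )  # D_cbar(j)!, the column sums of omega_c
    denominator = prod(factorial(w) for row in omega for w in row)
    return numerator // denominator
```

For out-degrees \((2,1)\) and in-degrees \((1,2)\) there are \(3! = 6\) bijections, and the brute-force counts agree with `count_formula_bipartite` on each resulting multi-graph.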
4.2.3 Generalized configuration model
We now define the configuration model for a generic degree sequence \(\mathbf {D}\in \mathcal {D}_n\) by putting together the configuration models for all the colors. Let \(\Sigma \) denote the cartesian product of \(\Sigma _c,\,c\in \mathcal {C}_\leqslant \), where, as defined above, \(\Sigma _c\) are the sets of configurations associated to the degree sequence \(D_c(1),\dots ,D_c(n)\), that is \(\Sigma _c\) is the set of matchings of \(W_c\) if \(c\in \mathcal {C}_=\) and \(\Sigma _c\) is the set of bijections \(W_c\mapsto {\bar{W}}_c\) if \(c\in \mathcal {C}_<\). A configuration is an element \(\sigma =(\sigma _c)_{c\in \mathcal {C}_\leqslant }\) of \(\Sigma \). The map \(\Gamma (\cdot ):\Sigma \mapsto {\widehat{\mathcal {G}}}(\mathbf {D})\) is defined by calling \(\Gamma (\sigma )\) the multi-graph obtained by superposition of the multi-graphs \(\Gamma (\sigma _c)\) defined above. The configuration model, denoted \({\mathrm {CM}}(\mathbf {D})\), is the law of \(\Gamma (\sigma )\) when \(\sigma \in \Sigma \) is chosen uniformly at random.
Lemma 4.4
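Let \(\mathbf {D}\in \mathcal {D}_n\), \(G\) with distribution \({\mathrm {CM}}(\mathbf {D})\) and \(H \in {\widehat{\mathcal {G}}}(\mathbf {D})\). We have
\[ \mathbb {P}( G = H ) = \frac{1}{b(H)} \, \frac{\prod _{c \in \mathcal {C}} \prod _{u=1}^n D_c(u)!}{\prod _{c \in \mathcal {C}_<} S_c ! \, \prod _{c \in \mathcal {C}_=} (S_c-1) !!}, \tag{30} \]
where \(S_c = \sum _{i=1}^nD_c(i)\), and \(b(H)\) is defined by
\[ b(H) = \prod _{c \in \mathcal {C}_=} \Big ( \prod _{u} \big (\omega _c(u,u)/2\big )!\, 2^{\omega _c(u,u)/2} \prod _{u<v} \omega _c(u,v)! \Big ) \prod _{c \in \mathcal {C}_<} \prod _{u,v} \omega _c(u,v)!. \]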
In particular, for any \(h\geqslant 2\), if \(\mathcal {G}(\mathbf {D},h)\) is not empty, the law of \(G\) conditioned on \(\mathcal {G}(\mathbf {D},h)\) is the uniform distribution on \(\mathcal {G}(\mathbf {D},h)\).
Proof
The cardinality of \(\Sigma \) is given by \(\prod _{c \in \mathcal {C}_<} S_c ! \prod _{c \in \mathcal {C}_=} (S_c-1) !!\). Thus, it suffices to check that \(\Gamma ^ {-1} (H)\) has cardinality \(b(H)^{-1}\prod _{ c \in \mathcal {C}}\prod _u D_c(u) !\). This follows from Lemmas 4.2 and 4.3 by observing that \(|\Gamma ^ {-1} (H)|=\prod _{c\in \mathcal {C}_\leqslant } n_c(H_c)\), where \(H_c\) denotes the multi-graph \(H\) after all edges with color \(c'\notin \{c,{\bar{c}}\}\) are removed. This proves (30). If \(H\in \mathcal {G}(\mathbf {D},h)\), \(h\geqslant 2\), then \(\omega _c(i,i)=0\) and \(\omega _c(i,j)\in \{0,1\}\) for all \(i,j\in [n]\) and \(c\in \mathcal {C}\), so that \(b(H)=1\). This proves the last assertion. \(\square \)
4.3 Probability of having no cycles of length \(\ell \leqslant h\)
Fix \(\theta \in \mathbb {N}\) and call \(\mathcal {M}_L^{(\theta )}\) the set of \(L\times L\) matrices with nonnegative integer entries bounded by \(\theta \). Fix \(P \in \mathcal {P}(\mathcal {M}_L^{(\theta )})\), a probability on \(\mathcal {M}_L^{(\theta )}\). We consider a sequence \(\mathbf {D}^{(n)} = ( D^{(n)}(u))_{u\in [n]}\in \mathcal {D}_n\), \(n \geqslant 1\) such that
- (H1) for all \(u \in [n]\), \(D^{(n)}(u)\in \mathcal {M}_L^{(\theta )}\);
- (H2) as \(n \rightarrow \infty \), \( \frac{1}{n} \sum _{u = 1} ^ n \delta _{D^{(n)}(u)} \rightsquigarrow P. \)
The main result of this section is the following.
Theorem 4.5
Fix \(\theta \in \mathbb {N}\), \(P \in \mathcal {P}(\mathcal {M}_L^{(\theta )})\), and a sequence \(\mathbf {D}^{(n)}\) satisfying (H1)–(H2). Take \(G_n \in {\widehat{\mathcal {G}}}(\mathbf {D}^{(n)} )\) with distribution \({\mathrm {CM}}( \mathbf {D}^{(n)})\). For every \(h\in \mathbb {N}\), there exists \(\alpha _h>0\) such that
\[ \lim _{n \rightarrow \infty } \mathbb {P}\big ( G_n \in \mathcal {G}(\mathbf {D}^{(n)},h) \big ) = \alpha _h. \]
The actual value of \(\alpha _h\) could in principle be computed in terms of \(P\) (see the proof of Theorem 4.5). However, we will not need it.
Corollary 4.6
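In the setting of Theorem 4.5, writing \(S_c^{(n)}=\sum _{u\in [n]}D_c^{(n)}(u)\), for all \(h\geqslant 2\):
\[ | \mathcal {G}(\mathbf {D}^{(n)},h ) | \sim \alpha _h \, \frac{\prod _{c \in \mathcal {C}_<} S^{(n)}_c ! \, \prod _{c \in \mathcal {C}_=} (S^{(n)}_c-1) !!}{\prod _{c \in \mathcal {C}}\prod _{u \in [n]} D^{(n)}_c(u) !}, \qquad n \rightarrow \infty . \]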
Proof
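By definition of \({\mathrm {CM}}(\mathbf {D}^{(n)})\), one has
\[ \mathbb {P}\big ( G_n \in \mathcal {G}(\mathbf {D}^{(n)},h) \big ) = \frac{1}{|\Sigma |} \sum _{H \in \mathcal {G}(\mathbf {D}^{(n)},h)} |\Gamma ^{-1}(H)|. \]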
As in Lemma 4.4, for each \(H \in \mathcal {G}(\mathbf {D}^{(n)},h) \), \( |\Gamma ^ {-1} (H)| = \prod _{c \in \mathcal {C}}\prod _u D^{(n)}_c(u) !\). Hence the sum in the right hand side above equals \( | \mathcal {G}(\mathbf {D}^{(n)},h ) | \prod _{c \in \mathcal {C}}\prod _u D^{(n)}_c(u) !\). The conclusion follows from Theorem 4.5 and \(|\Sigma |=\prod _{c \in \mathcal {C}_<} S^{(n)}_c ! \prod _{c \in \mathcal {C}_=} (S^{(n)}_c-1) !!\). \(\square \)
The proof of Theorem 4.5 will follow a well known strategy; see e.g. Bollobás [7, proof of Theorem 2.16] for a similar result. Our first lemma computes the number of copies of a subgraph in a graph sampled from \({\mathrm {CM}}( {\mathbf {D}}^{(n)})\). To formulate it, we need to introduce some more notation. Let \({\widehat{\mathcal {G}}}_n\) denote the set of \(G\in {\widehat{\mathcal {G}}}(\mathcal {C})\) with vertex set \([n]\). If \(G\in {\widehat{\mathcal {G}}}_n\), and \(H\in {\widehat{\mathcal {G}}}(\mathcal {C})\) has vertex set \(V\subset [n]\), we let \(Y(H, G)\) be the number of times that \(H \subset G\). When \(G\) is not a simple graph, then \(Y(H,G)\) may be larger than \(1\). Indeed, one has
where we use the notation \(B_c^{H,G}(u,v)\) for the binomial coefficient \(\left( {\begin{array}{c}\omega _c^G(u,v)\\ \omega _c^H(u,v)\end{array}}\right) \), with the convention that if \(u=v\) and \(c\in \mathcal {C}_=\), then \(B_c^{H,G}(u,u)\) equals \(\left( {\begin{array}{c}(\omega _c^G(u,u)/2)\\ (\omega _c^H(u,u)/2)\end{array}}\right) \).
Next, for \(G\in {\widehat{\mathcal {G}}}_n\) and \(H\in {\widehat{\mathcal {G}}}_k\), \(1 \leqslant k \leqslant n\), define \(X(H,G)\) as the number of distinct subgraphs of \(G\) that are isomorphic to \(H\). If \(a(H)\) denotes the cardinality of the automorphism group of \(H\), i.e. the number of permutations of the vertex labels which leave \(H\) invariant, then
where the sum is over all injective maps \(\tau \) from \([k]\) to \([n]\), and \(\tau (H)\) represents the multi-graph obtained by embedding \(H\) in \([n]\) through \(\tau \).
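To illustrate the relation between \(X\) and \(Y\), here is a minimal Python sketch restricted to simple, undirected, uncolored graphs, so that \(Y(\tau (H),G)\in \{0,1\}\). It counts the injective maps \(\tau \) that embed \(H\) in \(G\) and divides by \(a(H)\). The function name `count_copies` and the edge-list representation are assumptions made for this example only.

```python
from itertools import permutations


def count_copies(h_edges, k, g_edges, n):
    """Number of distinct subgraphs of G (vertices range(n)) isomorphic to
    H (vertices range(k)), computed as (1/a(H)) * #{injective tau : tau(H) in G}."""
    g = {frozenset(e) for e in g_edges}
    h = {frozenset(e) for e in h_edges}
    # a(H): vertex permutations mapping the edge set of H onto itself
    a_h = sum(
        1
        for s in permutations(range(k))
        if {frozenset((s[u], s[v])) for u, v in h_edges} == h
    )
    # injective maps range(k) -> range(n) are the k-permutations of range(n)
    embeddings = sum(
        1
        for tau in permutations(range(n), k)
        if all(frozenset((tau[u], tau[v])) in g for u, v in h_edges)
    )
    return embeddings // a_h


if __name__ == "__main__":
    k4 = [(i, j) for i in range(4) for j in range(i + 1, 4)]
    triangle = [(0, 1), (1, 2), (0, 2)]
    print(count_copies(triangle, 3, k4, 4))
```

In the complete graph on four vertices, every injective map embeds the triangle, so the count is \((4)_3/a(H) = 24/6\).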
For \(H\in {\widehat{\mathcal {G}}}_k\), the \(c\)-degree at vertex \(u\) is denoted
The excess of \(H\) is defined by
Notice that \({\mathrm {exc}}(H)= |E(H)| - k\), where \(|E(H)|\) is the total number of edges of \({\bar{H}}\) (counting \(1\) for each loop) and \({\bar{H}}\) is the colorblind undirected multi-graph obtained from \(H\) by (25). Notice that if \(H\) is connected, then \({\mathrm {exc}}(H) \geqslant -1\), with \({\mathrm {exc}}(H) = -1\) if and only if \({\bar{H}}\) is a tree. If \(n\geqslant k\) are positive integers, we use the notation \((n)_k = n!/(n-k)!\) for the number of injective maps \([k]\mapsto [n]\), with \((n)_0 = 1\).
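For concreteness, the following Python sketch evaluates the excess \(|E(H)|-k\) on three small colorblind graphs (a path, a cycle, and two triangles sharing a vertex) and checks the falling factorial \((n)_k\) against a direct enumeration of injective maps. The helper names and the edge-list representation are ours.

```python
from itertools import permutations


def excess(num_vertices, edges):
    """exc = (number of edges) - (number of vertices) for an undirected multigraph
    given as a list of edges; a loop (u, u) counts as one edge."""
    return len(edges) - num_vertices


def falling_factorial(n, k):
    """(n)_k = n (n-1) ... (n-k+1), with (n)_0 = 1."""
    result = 1
    for j in range(k):
        result *= n - j
    return result


if __name__ == "__main__":
    path = [(0, 1), (1, 2), (2, 3)]                          # a tree
    cycle = [(0, 1), (1, 2), (2, 3), (3, 0)]                 # a single cycle
    two_triangles = [(0, 1), (1, 2), (2, 0), (0, 3), (3, 4), (4, 0)]
    print(excess(4, path), excess(4, cycle), excess(5, two_triangles))
    print(falling_factorial(5, 2), len(list(permutations(range(5), 2))))
```

The third graph is the union of two distinct cycles with a common vertex, and its excess is positive.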
Lemma 4.7
Let \(G_n \in {\widehat{\mathcal {G}}}(\mathbf {D}^{(n)} )\) with distribution \({\mathrm {CM}}( {\mathbf {D}}^{(n)})\), where \(\mathbf {D}^{(n)}\) satisfies assumptions (H1)–(H2). For any fixed \(k\in \mathbb {N}\), \(H\in {\widehat{\mathcal {G}}}_k\), as \(n \rightarrow \infty \):
where \(D\in \mathcal {M}^{(\theta )}_L\) has distribution \(P\) and \(s^H_c:=\sum _{i=1}^kd_c^H(i)\).
Proof
From (34), \(\mathbb {E}X ( H, G_n) = a(H)^{-1} \sum _\tau \mathbb {E}Y ( \tau (H), G_n )\). Below, we fix a map \(\tau \) and write \(H\) instead of \(\tau (H)\) for simplicity. We start by showing that
where we use the notation \(((n ))_{k} = (n-1)!!/(n-k-1)!!\). Since \({\mathrm {CM}}(\mathbf {D}^{(n)})\) is a product measure over \(c\in \mathcal {C}_\leqslant \), we may analyze one color at a time.
Consider first the case \(c\in \mathcal {C}_<\). Set
where \(H_c\) is the graph \(H\) with all edges removed except for edges of color \(c\) or \({\bar{c}}\), and the condition \(G\supset H_c\) indicates that \(\omega _c^G(u,v)\geqslant \omega _c^H(u,v)\) for all \(u,v\in [n]\). Then, as in Lemma 4.4
where we drop the superscript \((n)\) from \(D_c(i)\) and \(S_c\), and the sum runs over all \(G\in {\widehat{\mathcal {G}}}_n\) with \((c,{\bar{c}})\) colors only, with degree sequence given by \((D_c(i),D_{{\bar{c}}}(i))_{i\in [n]}\). Therefore,
On the other hand, applying (29) to the multi-graph \(G\!\backslash \! H\) defined by \((\omega _c^G(u,v)-\omega _c^H(u,v))\), one has
Thus, for \(c\in \mathcal {C}_<\) one has
Next, consider the case \(c\in \mathcal {C}_=\). Here
where \(H_c\) is the graph \(H\) with all edges removed except for edges of color \(c\). Then,
Applying (28) to the multi-graph \(G\!\backslash \! H\) and simplifying, one arrives at
Finally, taking products over \(c\in \mathcal {C}_<\) of (36) together with products over \(c\in \mathcal {C}_=\) of (37), we arrive at (35).
Summing over the injective maps \(\tau :[k]\mapsto [n]\), we deduce that
where \((M(1), \dots , M(k) )\) is uniformly sampled without replacement on \((\mathbf {D}^{(n)}(1),\dots , \mathbf {D}^{(n)}(n))\). From assumptions (H1)–(H2), for every fixed \(k\) and \(H\in {\widehat{\mathcal {G}}}_k\), as \(n\rightarrow \infty \):
where \(D\in \mathcal {M}_L^{(\theta )}\) has law \(P\). Moreover, for \(c \in \mathcal {C}_<\) and \(c \in \mathcal {C}_{=}\) respectively,
The desired conclusion now follows by using these asymptotics in (38) together with \((n)_k\sim n^k\) and
\(\square \)
Proof of Theorem 4.5
For every \(\ell \in \mathbb {N}\), call \(\mathcal {L}_\ell \) the set of all \(H\in {\widehat{\mathcal {G}}}_\ell \) such that the undirected graph \({\bar{H}}\) defined by (25) is a cycle of length \(\ell \). If \(\ell =1\), then \(\mathcal {L}_\ell \) is the union over \(c\in \mathcal {C}\) of the single loop graph at vertex \(\{1\}\) with color \(c\); if \(\ell =2\), then \(\mathcal {L}_\ell \) is the union over \(c,c'\in \mathcal {C}\) of the double edge graph at vertices \(\{1,2\}\) with \(\omega _c(1,2)=\omega _{{\bar{c}}}(2,1)=1,\omega _{c'}(1,2)=\omega _{{\bar{c}}'}(2,1)=1\); and so on. Let \(\mathcal {L}_{\leqslant h}=\cup _{\ell =1}^h \mathcal {L}_\ell \). We define the random variable
where \(G_n \in {\widehat{\mathcal {G}}}(\mathbf {D}^{(n)} )\) has distribution \({\mathrm {CM}}( {\mathbf {D}}^{(n)})\). With this notation, we need to show that under the assumptions of the theorem one has
for some \(\alpha _h>0\).
If \(H\in \mathcal {L}_{\leqslant h}\), then \({\mathrm {exc}}(H)=0\). By Lemma 4.7, for some \(\lambda _H \geqslant 0\), as \(n \rightarrow \infty \), one has
and, setting \(\lambda (h) = \sum _{H \in \mathcal {L}_{\leqslant h}}\lambda _H\), one finds
We are going to prove that \(Z\) converges weakly to a Poisson random variable with mean \(\lambda (h)\). This will prove (40) with \(\alpha _h = e^{-\lambda (h)}\). To this end, by the well known moment method, it is sufficient to prove that for any integer \(p\geqslant 1\):
where \((Z)_p=Z!/(Z-p)!\). The case \(p=1\) is (42). Below, we establish (43) for all \(p\geqslant 2\).
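The moment method relies on the fact that a Poisson random variable with mean \(\lambda \) has factorial moments \(\mathbb {E}(Z)_p=\lambda ^p\) and satisfies \(\mathbb {P}(Z=0)=e^{-\lambda }\). As a numerical illustration, the Python sketch below evaluates a truncated series for \(\mathbb {E}(Z)_p\). The function name and the truncation level `terms` are choices made for this example.

```python
from math import exp


def poisson_factorial_moment(lam, p, terms=200):
    """Truncated series for E[(Z)_p] with Z ~ Poisson(lam),
    where (z)_p = z (z-1) ... (z-p+1)."""
    total = 0.0
    pmf = exp(-lam)  # P(Z = 0)
    for z in range(terms):
        falling = 1
        for j in range(p):
            falling *= z - j  # contains the factor 0 whenever z < p
        total += falling * pmf
        pmf *= lam / (z + 1)  # P(Z = z + 1) obtained from P(Z = z)
    return total


if __name__ == "__main__":
    lam = 1.5
    for p in (1, 2, 3, 4):
        print(p, poisson_factorial_moment(lam, p), lam ** p)
```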
For any \(H\in \mathcal {L}_{\leqslant h}\), let \(\mathcal {H}_H\) denote the set of multi-graphs \(F\in {\widehat{\mathcal {G}}}(\mathcal {C})\) with vertex set \(V_F \subset [n]\) which are isomorphic to \(H\). If \(\mathcal {H}= \cup _{H\in \mathcal {L}_{\leqslant h}} \mathcal {H}_H\), then one has
where \(Y_F:=Y(F,G_n)\) is defined by (33). The proof of (43) uses two elementary topological facts:
(i) if \(F \ne F' \in \mathcal {H}\) and \(F \cap F' \ne \emptyset \), i.e. \(V_F \cap V_{F'} \ne \emptyset \), then \({\mathrm {exc}}( F \cup F') \geqslant 1\),
(ii) if \(H \in {\widehat{\mathcal {G}}}_k\) and \(H' \in {\widehat{\mathcal {G}}}_{k'}\), then \({\mathrm {exc}}(H \cup H') \geqslant {\mathrm {exc}}(H) + {\mathrm {exc}}(H')\) and \({\mathrm {exc}}(H \oplus H') = {\mathrm {exc}}(H) + {\mathrm {exc}}(H')\),
where \(H \oplus H' \in {\widehat{\mathcal {G}}}_{k+k'}\) is the multigraph obtained from the disjoint union of \(H\) and an isomorphic copy of \(H'\) with vertex set \(\{k+1, \dots , k + k'\}\). We also use two consequences of Lemma 4.7:
(iii) if \(H \in {\widehat{\mathcal {G}}}_k\) and \({\mathrm {exc}}( H) \geqslant 1\) then \(\mathbb {E}X(H,G_n) = o(1)\);
(iv) if \(H \in {\widehat{\mathcal {G}}}_k\) and \(H' \in {\widehat{\mathcal {G}}}_{k'}\), then \(\mathbb {E}X(H \oplus H',G_n) \sim \mathbb {E}X(H,G_n)\,\mathbb {E}X(H',G_n)\).
We start by showing that for all \(q\geqslant 1\), there exists \(c=c(q)>0\) such that
Write
By assumption (H1), \(Y_F \leqslant c_0\) for some \(c_0=c_0 (\theta ,h)\), and hence, for some \(c_1 = c_1 ( \theta ,h, q)\), one has the crude bound
where the sum \(\sum _{*}\) is over all choices of pairwise distinct \(F_1,\ldots , F_k\) in \(\mathcal {H}\). We now decompose \(\sum _{*}\) into the sum \(\sum _{**}\) over all choices of \(k\) pairwise disjoint sets \(F_i\) in \(\mathcal {H}\), and the sum \(\sum _{***}\) over all choices of \(k\) pairwise distinct \(F_i\) in \(\mathcal {H}\) such that there exists \(i \ne j\) with \(F_i \cap F_j \ne \emptyset \). Notice that this last summation satisfies
for some \(c_0=c_0(\theta ,h)\), where \(K\) ranges over a finite collection (with cardinality independent of \(n\)) of multi-graphs which by facts (i–ii) satisfy \({\mathrm {exc}}(K) \geqslant 1\). In particular, fact (iii) implies that \(\mathbb {E}\sum _{***} \prod _{i=1}^k Y_{F_{i}}=o(1)\) as \(n\rightarrow \infty \). On the other hand,
Fact (iv) and (41) then imply that
This ends the proof of (44).
Next, define \(\tilde{Y}_F = \mathbf {1}( Y_F = 1)\) and \(\tilde{Z} = \sum _{ F \in \mathcal {H}} \tilde{Y}_F\). Let \(E\) be the event that for all \(F \in \mathcal {H}\), \(Y_F = \tilde{Y}_F\). Note that \(\tilde{Z} = Z\) if \(E\) holds and \(\mathbf {1}_{E^ c}\leqslant \sum _K X(K,G_n)\), where \(K\) ranges over a finite collection of multi-graphs with \({\mathrm {exc}}(K) \geqslant 1\). From fact (iii), it follows that \(\mathbb {P}(E^c) = o(1)\).
Clearly, \(\tilde{Z} \leqslant Z\) and \((Z)_p \leqslant Z^p\). The Cauchy–Schwarz inequality yields
Therefore, using (44) and \(\mathbb {P}(E^c) = o(1)\), we see that it suffices to prove that \( \mathbb {E}(\tilde{Z})_p\) converges to \(\lambda (h)^ p\). Since \(\tilde{Y}_{F} \in \{0,1\}\), we write
where the sum \(\sum _*\) is over all choices of \(p\) pairwise distinct \(F_i\) in \(\mathcal {H}\). By assumption (H1), \(Y_F\) is uniformly bounded, and therefore
can be bounded by \(c_1\sum _K X( K, G_n)\) where \(K\) ranges over a finite collection of multi-graphs with \({\mathrm {exc}}(K) \geqslant 1\) and \(c_1=c_1(\theta ,h,p)\). Therefore from fact (iii) we get
The conclusion \(\mathbb {E}( \tilde{Z})_p\rightarrow \lambda (h)^p\), \(n\rightarrow \infty \), then follows from (45). \(\square \)
4.4 Unimodular Galton–Watson trees with colors
Let \({\widehat{\mathcal {G}}}^*(\mathcal {C})\) denote the set of equivalence classes of rooted directed locally finite colored multi-graphs, i.e. the set of connected multi-graphs \(G\in {\widehat{\mathcal {G}}}(\mathcal {C})\) with a distinguished vertex \(o\) (the root) where two rooted multi-graphs are identified if they only differ by a relabeling of the vertices. An element of \({\widehat{\mathcal {G}}}^*(\mathcal {C})\) is called a rooted directed colored tree if the corresponding colorblind multi-graph defined via (25) has no cycles. We now introduce a probability measure on \({\widehat{\mathcal {G}}}^*(\mathcal {C})\) supported on rooted colored directed trees. Let \(P \in \mathcal {P}( \mathcal {M}_L)\) be a probability measure on \(\mathcal {M}_L\), \(|\mathcal {C}|=L^2\), such that for all \(c\in \mathcal {C}\),
where \(D\in \mathcal {M}_L\) has distribution \(P\). For each \(c \in \mathcal {C}\) such that \(\mathbb {E}D_c > 0\), define the probability measure \({\widehat{P}}^{c} \in \mathcal {P}( \mathcal {M}_L)\) such that, for \(M \in \mathcal {M}_L\),
where \(D\) has distribution \(P\), and for any \(c\in \mathcal {C}\), \(E^c\) denotes the matrix with all entries equal to \(0\) except for the entry at \(c\), which equals \(1\). Notice that \({\widehat{P}}^c\) is indeed a probability since
If \(\mathbb {E}D_c = 0\) then we set \({\widehat{P}}^{c} (M) = \mathbf {1}( M = 0)\).
In a rooted directed colored tree \((T,o)\), for all \(v \ne o\), call \(a(v)\) the parent of \(v\) in \(T\). The type of a vertex \(v \ne o\) in \((T,o)\) is defined as the color of the edge \((a(v),v)\). The probability measure \({\mathrm {UGW}}(P)\in \mathcal {P}({\widehat{\mathcal {G}}}^*(\mathcal {C}))\) is the law of the multi-type Galton–Watson tree defined as follows. The root \(o\) produces offspring according to the distribution \(P\), i.e. the root has \(D_c\) children of type \(c\), for all \(c\in \mathcal {C}\), where \(D\in \mathcal {M}_L\) has law \(P\). Recursively, and independently, any \(v\ne o\) of type \(c\), produces offspring according to the distribution \({\widehat{P}}^ c\), i.e. \(v\) has \(D_{c'}\) children of type \(c'\), for all \(c'\in \mathcal {C}\), where \(D\in \mathcal {M}_L\) has law \({\widehat{P}}^ c\). Notice that in the case of a single color (\(L=1\) and \(\mathcal {C}= \{(1,1)\}\)), then \(P\) is a probability measure on \(\mathbb {Z}_+\) and \({\mathrm {UGW}}(P)\) coincides with the Galton–Watson tree \({\mathrm {UGW}}_1(P)\) with degree distribution \(P\), cf. (6).
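To make the recursive construction concrete in the single-color case, here is a hedged Python sketch that samples the generation sizes of the tree truncated at a given depth. It assumes that, for \(L=1\), the offspring law of a non-root vertex reduces to the usual size-biased distribution \({\widehat{P}}(k)=(k+1)P(k+1)/\mathbb {E}D\); this is the standard form and is our assumption here. All function names, and the dictionary representation of \(P\), are hypothetical.

```python
import random


def size_biased(P):
    """Assumed single-color law: hat P(k) = (k+1) P(k+1) / sum_j j P(j),
    for P given as a dict {degree: probability}."""
    mean = sum(k * p for k, p in P.items())
    if mean == 0:
        return {0: 1.0}  # matches the convention hat P = delta_0 when E D = 0
    return {k - 1: k * p / mean for k, p in P.items() if k >= 1 and p > 0}


def sample(dist, rng):
    """Draw one value from a finite distribution {value: probability}."""
    keys = list(dist)
    return rng.choices(keys, weights=[dist[k] for k in keys])[0]


def sample_ugw_generations(P, depth, rng):
    """Return [1, Z_1, ..., Z_depth], the generation sizes of the sampled tree:
    the root draws its number of children from P, every other vertex from hat P."""
    P_hat = size_biased(P)
    sizes = [1]
    if depth >= 1:
        sizes.append(sample(P, rng))
    while len(sizes) <= depth:
        sizes.append(sum(sample(P_hat, rng) for _ in range(sizes[-1])))
    return sizes


if __name__ == "__main__":
    rng = random.Random(0)
    print(sample_ugw_generations({3: 1.0}, 3, rng))           # 3-regular tree
    print(sample_ugw_generations({1: 0.5, 2: 0.3, 3: 0.2}, 4, rng))
```

When \(P\) is a point mass at \(3\), the root has three children and every other vertex has two, so the output is deterministic.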
Following the argument of Lemma 3.1, it could be proved that the measure \({\mathrm {UGW}}(P)\) is unimodular. However, in the next paragraph, Theorem 4.8 implies that \({\mathrm {UGW}}(P)\) is sofic (and hence unimodular).
4.5 Local weak convergence
It is straightforward to extend the local topology introduced in Sect. 2 to the case of rooted directed multi-graphs with colored edges \({\widehat{\mathcal {G}}}^*(\mathcal {C})\). The only difference is that the weight function \(\omega \) is now matrix-valued.
Theorem 4.8
If \(G_n \in {\widehat{\mathcal {G}}}(\mathbf {D}^{(n)} )\) has distribution \({\mathrm {CM}}( {\mathbf {D}}^{(n)})\), with \(\mathbf {D}^{(n)}\) such that assumptions (H1)–(H2) hold, then with probability one \(U(G_n) \rightsquigarrow {\mathrm {UGW}}(P)\). Moreover the same result holds if \(G_n\) is uniformly sampled on \(\mathcal {G}(\mathbf {D}^{(n)},h )\), for any fixed \(h\geqslant 2\).
In the case of a single color \(L=1\), Theorem 4.8 is folklore; see e.g. the monographs [18, 23]. The proof of Theorem 4.8 in the general case is given in the appendix.
4.6 Graphs with given tree-like neighborhood
Here we show how the configuration model can be used to count the number of graphs with a given tree-like neighborhood structure.
Fix \(n\) and a graph \(G=(V,E)\) with \(V=[n]\). Call \(\mathcal {G}_n\) the set of all such graphs. For \(h\in \mathbb {N}\), define the \(h\)-neighborhood vector
where \([G,u]_h\) stands for the equivalence class of the \(h\)-neighborhood of \(G\) at vertex \(u\). We say that \(G\) is \(h\)-tree-like if \([G,u]_h\) is a tree for all \(u\in [n]\).
We describe now a procedure which turns the given graph \(G\) into a directed colored graph \({\widetilde{G}}\) in \(\mathcal {G}(\mathcal {C})\). The color set \(\mathcal {C}\) is defined as follows. Let \(\mathcal {F}\subset \mathcal {G}^*_{h-1}\) denote the collection of all equivalence classes of the subgraphs \(G(u,v)_{h-1}\), where we recall that \(G(u,v)\) is the rooted graph obtained from \(G\) by removing the edge \(\{u,v\}\) and taking the root at \(v\). For simplicity, below we will identify \(G(u,v)_{h-1}\) with its equivalence class. If \(L=|\mathcal {F}|\) denotes the cardinality of \(\mathcal {F}\), we call \(\mathcal {C}\) the set of \(L^2\) pairs \((g,g')\), with \(g,g'\in \mathcal {F}\); see Fig. 4 for an example. To construct the directed colored graph, for every pair \(u,v\) such that \(\{u,v\}\) is an edge of \(G\), we include a directed edge \((u,v)\) with color
together with the directed edge \((v,u)\) with color \((g',g)=(G(v,u)_{h-1},G(u,v)_{h-1})\). This defines an element \({\widetilde{G}}\) of \(\mathcal {G}(\mathcal {C})\); see Fig. 5. As such, we can define its degree sequence \(\mathbf {D}=\mathbf {D}({\widetilde{G}})\) as in (26) above. Notice that if \(G\) is \(h\)-tree-like, then the above construction yields an element of \(\mathcal {G}(\mathbf {D},2h+1)\) since being \(h\)-tree-like is equivalent to having no cycles with length \(1\leqslant \ell \leqslant 2h+1\); see Fig. 6. A crucial property to be used below is that, for this particular choice of \(\mathbf {D}\), all elements of \(\mathcal {G}(\mathbf {D},2h+1)\) have the same \(h\)-neighborhoods.
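The equivalence between being \(h\)-tree-like and having no cycle of length at most \(2h+1\) can be checked on small examples. The Python sketch below assumes that the \(h\)-neighborhood of a vertex is the subgraph induced by the vertices at distance at most \(h\) from it, and tests whether each such ball is a tree. The adjacency-dictionary representation and the function names are ours.

```python
from collections import deque


def ball(adj, root, h):
    """Vertices within graph distance h of root, found by breadth-first search."""
    dist = {root: 0}
    queue = deque([root])
    while queue:
        u = queue.popleft()
        if dist[u] == h:
            continue  # do not explore beyond depth h
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return set(dist)


def is_h_tree_like(adj, h):
    """True if the subgraph induced by every h-ball is a tree (simple graphs only)."""
    for root in adj:
        vertices = ball(adj, root, h)
        # each undirected edge inside the ball is seen once from each endpoint
        num_edges = sum(1 for u in vertices for v in adj[u] if v in vertices) // 2
        # the induced ball is connected, so it is a tree iff |E| = |V| - 1
        if num_edges != len(vertices) - 1:
            return False
    return True


def cycle_graph(n):
    """Adjacency dictionary of the cycle on vertices 0, ..., n-1 (n >= 3)."""
    return {i: [(i - 1) % n, (i + 1) % n] for i in range(n)}


if __name__ == "__main__":
    for n in (5, 6):
        print(n, is_h_tree_like(cycle_graph(n), 2))
```

With \(h=2\), the cycle of length \(5=2h+1\) fails the test, while the cycle of length \(6\) passes.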
Lemma 4.9
Let \(h\in \mathbb {N}\), let \(G\in \mathcal {G}_{n}\) be a fixed \(h\)-tree-like graph and let \(\mathbf {D}=\mathbf {D}({\widetilde{G}})\) be the associated degree sequence as above. For any \(\Gamma \in \mathcal {G}(\mathbf {D},2h+1)\), the colorblind graph \({\bar{\Gamma }}\in \mathcal {G}_{n}\) defined via (25) satisfies \(\psi _h({\bar{\Gamma }})=\psi _h(G)\).
Proof
Consider first the case \(h=1\). If \(\Gamma \in \mathcal {G}(\mathbf {D},3)\), then for any node \(i\in [n]\), the \(1\)-neighborhood \(({\bar{\Gamma }},i)_1\) at \(i\) is uniquely determined by the number of edges exiting node \(i\). By (25), this number equals \(\sum _{c\in \mathcal {C}}D_c(i)\), which is independent of \(\Gamma \). Thus, all \(\Gamma \in \mathcal {G}(\mathbf {D},3)\) satisfy necessarily \(\psi _1({\bar{\Gamma }})=\psi _1(G)\).
Next, we assume that any \(\Gamma \in \mathcal {G}(\mathbf {D},2h+1)\) satisfies \(\psi _{h-1}({\bar{\Gamma }})=\psi _{h-1}(G)\), and show that \(\psi _{h}({\bar{\Gamma }})=\psi _{h}(G)\). Since \(\mathcal {G}(\mathbf {D},2h+1)\subset \mathcal {G}(\mathbf {D},2(h-1)+1)\), by induction over \(h\) this will prove the desired result.
Since there are no cycles of length \(\ell \leqslant 2h+1\) in \(G\), \(\mathcal {F}\) consists of unlabeled rooted trees of depth \(h-1\). For any \(t\in \mathcal {F}\) we write \(t_k\) for the \(k\)-neighborhood of the root in \(t\) (truncation of \(t\) at depth \(k\)). Moreover, if \(t\) is a rooted tree, we write \(t_{k,+}\) for the unlabeled rooted tree of depth \(k+1\) obtained from \(t_k\) by adding a new edge to the root and taking the other endpoint of that edge as the new root. If \(t,t'\) are finite rooted trees, we write \(t\cup t'\) for the rooted tree obtained by joining \(t,t'\) at the common root. Since there are no cycles of length \(\ell \leqslant 2h+1\) in \({\bar{\Gamma }}\), to prove \(\psi _{h}({\bar{\Gamma }})=\psi _{h}(G)\) it is sufficient to show that for any edge \((u,v)\) with color \((t,t')\) in \(\Gamma \), with \(t,t'\in \mathcal {F}\), one has \({\bar{\Gamma }}(u,v)_{h-1}=t'\) and \({\bar{\Gamma }}(v,u)_{h-1}=t\).
Let \((u,v)\) be an edge in \(\Gamma \) with color \((t,t')\). Notice that in \({\widetilde{G}}\), \(u\) must have an edge \((u,{\widetilde{v}})\) with color \((t,t')\) going out of \(u\), and \(v\) must have an edge \((v,{\widetilde{u}})\) with color \((t',t)\) going out of \(v\). Therefore, \([G,u]_{h-1}=t\cup t'_{h-2,+}\) and \([G,v]_{h-1}=t'\cup t_{h-2,+}\). By assumption, \(({\bar{\Gamma }},u)_{h-1}=[G,u]_{h-1}\) and \(({\bar{\Gamma }},v)_{h-1}=[G,v]_{h-1}\). Therefore, the rooted trees \(T:={\bar{\Gamma }}(v,u)_{h-1}\) and \(T':={\bar{\Gamma }}(u,v)_{h-1}\) must satisfy
We need to show that \(t=T\) and \(t'=T'\). From (49), one has that it is sufficient to show that \(T'_{h-2}=t'_{h-2}\) and \(T_{h-2}=t_{h-2}\). Truncating (49) at depth \(h-2\) one has
Thus, it is sufficient to show that \(T'_{h-3}=t'_{h-3}\) and \(T_{h-3}=t_{h-3}\). Iterating this reasoning, one finds that it suffices to show that \(T'_{1}=t'_{1}\) and \(T_{1}=t_{1}\). However, this is guaranteed by the fact that the degree of \(u\) in \(G\) and \({\bar{\Gamma }}\) is the same, for any \(u\in [n]\).
\(\square \)
We turn to the problem of counting the number of graphs \(G'\in \mathcal {G}_{n}\) whose \(h\)-neighborhood distribution coincides with that of a given \(h\)-tree-like graph \(G\). The following is an important corollary of Lemma 4.9.
Corollary 4.10
Fix an arbitrary \(h\)-tree-like graph \(G\in \mathcal {G}_{n}\), and define
One has
where \(\mathbf {D}=(D(1),\dots ,D(n))\) is the degree sequence associated to \(G\) via (48), and \(n(\mathbf {D})\) denotes the number of distinct vectors \((D(\pi _1),\dots ,D(\pi _n))\in \mathcal {D}_n\) as \(\pi :[n]\mapsto [n]\) ranges over permutations of the labels.
Proof
For a permutation \(\pi :[n]\mapsto [n]\), let \(\mathbf {D}^\pi =(D(\pi _1),\dots ,D(\pi _n))\). Since the cardinality of \(\mathcal {G}(\mathbf {D}^\pi ,2h+1)\) does not depend on \(\pi \), \(n(\mathbf {D})|\mathcal {G}(\mathbf {D}^\pi ,2h+1)|\) coincides with the cardinality of \(\cup _\pi \mathcal {G}(\mathbf {D}^\pi ,2h+1)\). By Lemma 4.9, any two distinct elements \(\Gamma _1,\Gamma _2\in \cup _\pi \mathcal {G}(\mathbf {D}^\pi ,2h+1)\) yield two distinct graphs \({\bar{\Gamma }}_1,{\bar{\Gamma }}_2\) such that \(U({\bar{\Gamma }}_i)_h=U(G)_h\), \(i=1,2\). This proves that \(N_h(G)\geqslant n(\mathbf {D})|\mathcal {G}(\mathbf {D},2h+1)|\). On the other hand, any two distinct elements \(G_1,G_2\in \mathcal {G}_{n}\) with \(U(G_i)_h = U(G)_h\), \(i=1,2\), yield two distinct elements \({\widetilde{G}}_1,{\widetilde{G}}_2\in \cup _\pi \mathcal {G}(\mathbf {D}^\pi ,2h+1)\) with the map \(G\mapsto {\widetilde{G}}\) defined by (48). This proves the other direction. \(\square \)
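The quantity \(n(\mathbf {D})\) counts the distinct rearrangements of the sequence \((D(1),\dots ,D(n))\), so it equals the multinomial coefficient \(n!/\prod _j m_j!\), where the \(m_j\) are the multiplicities of the distinct values. A short Python check follows, with each \(D(i)\) represented by an arbitrary hashable label; the function name is ours.

```python
from collections import Counter
from itertools import permutations
from math import factorial


def num_distinct_rearrangements(seq):
    """n! / prod_j (m_j!) where m_j are the multiplicities of the values in seq."""
    result = factorial(len(seq))
    for mult in Counter(seq).values():
        result //= factorial(mult)  # each partial quotient is an integer
    return result


if __name__ == "__main__":
    seq = ("a", "a", "b", "c")
    print(num_distinct_rearrangements(seq), len(set(permutations(seq))))
```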
Lemma 4.11
Fix \(h\in \mathbb {N}\), and \(P \in \mathcal {P}(\mathcal {T}^*_h)\) admissible, with finite support and \(\mathbb {E}_P {\mathrm {deg}}(o) = d\). Let \(m = m(n)\) be a sequence such that \(m/n \rightarrow d /2\) as \(n \rightarrow \infty \). Then, there exist a finite set \(\Delta \subset \mathcal {T}^*_h\) and a sequence of graphs \(\Gamma _n \in \mathcal {G}_{n,m}\) such that the support of \(U(\Gamma _n)_h \) is contained in \(\Delta \) for all \(n\) and \(U(\Gamma _n)_h \rightsquigarrow P\) as \(n\rightarrow \infty \).
Proof
Let \(S:=\{t_1,\dots ,t_r\}\subset \mathcal {T}_h^*\) be the finite support of \(P\). We define the vector \(\mathbf {g}^{(n) } = (g^{(n)} (1), \dots , g^{(n)}(n))\) with \(g^{(n)} (i) \in S\) by setting \(g^{(n)} (i) = t_{k}\) if \(\sum _{\ell \leqslant k} P(t_\ell ) > (i-1)/n\) and \(\sum _{\ell \leqslant k - 1} P(t_\ell ) \leqslant (i-1) /n\) with the convention that the sum over an empty set is \(-\infty \). The empirical measure of \(\mathbf {g}^{(n)}\), say \(P^{(n)}\), converges weakly to \(P\).
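The assignment \(i\mapsto g^{(n)}(i)\) is a quantile-type discretization of \(P\): vertex \(i\) receives the label \(t_k\) whose cumulative interval contains \((i-1)/n\). The Python sketch below implements this rule with exact rational arithmetic and prints the resulting label counts; each count differs from \(nP(t_k)\) by at most one, which gives the weak convergence of the empirical measure. Labels are represented by their index \(k-1\), and the function name is ours.

```python
from collections import Counter
from fractions import Fraction


def quantize(P, n):
    """Return the labels g(1), ..., g(n) as 0-based indices into the support, with
    g(i) = k iff sum_{l<k} P[l] <= (i-1)/n < sum_{l<=k} P[l].
    P is a list of Fractions summing to 1."""
    assert sum(P) == 1
    labels = []
    k = 0
    cumulative = P[0]
    for i in range(1, n + 1):
        threshold = Fraction(i - 1, n)
        # advance to the first index whose cumulative mass exceeds the threshold
        while cumulative <= threshold:
            k += 1
            cumulative += P[k]
        labels.append(k)
    return labels


if __name__ == "__main__":
    P = [Fraction(1, 2), Fraction(1, 3), Fraction(1, 6)]
    for n in (6, 7, 100):
        counts = Counter(quantize(P, n))
        print(n, [counts[k] for k in range(len(P))])
```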
Let \(\mathcal {C}\) denote the set of all pairs \(c=(t,t')\in \mathcal {T}_{h-1}^*\times \mathcal {T}_{h-1}^*\) associated to any element \(g \in S\) as in (48). In this manner, we associate to any \(g^{(n)} (i)\) an integer valued matrix \(D^{(n)} (i) \in \mathcal {M}_L\) where \({{\left| \mathcal {C} \right| }} = L^2\). We denote by \(S^{(n)}_c = \sum _{i=1} ^n D^{(n)}_c (i)\). We finally set, for \(c \in \mathcal {C}_=\), \({\widetilde{S}}^{(n)}_c = 2 \lfloor S^{(n)}_c /2 \rfloor \) and, for \(c \in \mathcal {C}_{\ne }\), \({\widetilde{S}}^{(n)}_c = S^{(n)}_c \wedge S^{(n)}_{{\bar{c}}}\). We may fix a sequence of integer-valued matrices \({\widetilde{\mathbf {D}}}^{(n)} = ( {\widetilde{D}}^{(n)} (i))_{1 \leqslant i \leqslant n}\) such that component-wise \({\widetilde{D}}^{(n)}(i) \leqslant D^{(n)}(i)\) and, for all \(c \in \mathcal {C}\), \( \sum _{i=1}^n {\widetilde{D}}^{(n)}_c (i) = {\widetilde{S}}^{(n)}_c \). The properties \(P^{(n)} \rightsquigarrow P\) and \({\mathrm {supp}}(P^{(n)}) \subset S\) imply that for all \(c \in \mathcal {C}\), \({\widetilde{S}}^{(n)}_c - S^{(n)}_c = o(n)\) and for all but \(o(n)\) vertices \({\widetilde{D}}^{(n)}(i) = D^{(n)}(i)\). Moreover,
We consider the generalized configuration model on \({\widetilde{\mathbf {D}}}^{(n)}\). Corollary 4.6 implies the existence, for all \(n\) large enough, of a directed colored graph \({\widetilde{\Gamma }}_n\) with girth at least \(2 h +1\) and whose colored degree sequence is precisely given by \({\widetilde{\mathbf {D}}}^{(n)}\). Let \({\bar{\Gamma }}_n\) be the associated color-blind graph. The proof of Lemma 4.9 actually shows that if a vertex \(v\) of \({\widetilde{\Gamma }}_n\) is such that all vertices \(u\) in \(({\widetilde{\Gamma }}_n,v)_h\) satisfy \({\widetilde{D}}^{(n)}_c (u) = D^{(n)}_c (u)\) then the equivalence class of \(({\bar{\Gamma }}_n,v)_h\) is precisely \(g^{(n)} (v)\). Now, let \(\theta \) be the maximal degree of vertices in \(t \in S\) and set \(\kappa = \sum _{\ell = 0}^{h} \theta ^\ell \). Any vertex is in the \(h\)-neighborhood of at most \(\kappa \) vertices. Since for all but \(o(n)\) vertices \({\widetilde{D}}^{(n)}(v) = D^{(n)}(v)\), we deduce that for all but \(o(n)\) vertices, the equivalence class of \(({\bar{\Gamma }}_n,v)_h\) is \(g^{(n)} (v)\). We thus have proved that \(U({\bar{\Gamma }}_n)_h \rightsquigarrow P\). Also, by construction, the support of \(U({\bar{\Gamma }}_n)_h\) is contained in the finite set \(\Delta _{h,\theta }\) of unlabeled rooted trees \(t \in \mathcal {T}^*_h\) such that all degrees of vertices in \(t\) are bounded by \(\theta \).
A last modification is needed: we have \({\bar{\Gamma }}_n\in \mathcal {G}_{n,{\widetilde{m}}}\) and we need a graph \(\Gamma _n \in \mathcal {G}_{n,m}\). However, since the number of vertices in \(({\bar{\Gamma }}_n,v)_h\) is bounded by \(\kappa \), adding or removing one edge in \({\bar{\Gamma }}_n\) will change the value of \(({\bar{\Gamma }}_n,v)_h\) for at most \(2 \kappa \) vertices. Let \(\delta (n) = |{\widetilde{m}} - m| = o(n)\). Assume first that \({\widetilde{m}} < m\), so that we need to add edges to \({\bar{\Gamma }}_n\). We may add \(\delta (n)\) new edges to \({\bar{\Gamma }}_n\) such that any vertex is incident to at most one new edge. From what precedes, we obtain a graph \(\Gamma _n \in \mathcal {G}_{n,m}\) such that \(U(\Gamma _n)_h \rightsquigarrow P\). Moreover, the support of \(U(\Gamma _n)_h\) is contained in \(\Delta _{h,\theta +1}\). If \({\widetilde{m}} > m\), we need to remove edges. We remove an arbitrary subset of them of cardinality \(\delta (n)\). We get a graph \(\Gamma _n \in \mathcal {G}_{n,m}\) such that \(U(\Gamma _n)_h \rightsquigarrow P\) and the support of \(U(\Gamma _n)_h\) is contained in \(\Delta _{h,\theta }\). \(\square \)
4.7 Proof of Corollary 1.5
We note that the set, say \(\mathcal {S}\), of sofic measures supported on trees is a closed subset of \(\mathcal {P}_{u} ( \mathcal {T}^*)\). Let \(B\) be the set of measures of the form \(\rho = {\mathrm {UGW}}_h ( P)\) with \(P \in \mathcal {P}( \mathcal {T}^*_h)\) admissible with finite support and \(h \in \mathbb {N}\). A consequence of Lemma 4.11 and Theorem 4.8 is that \(B\) is a subset of \(\mathcal {S}\).
Let us first check that for any \(h \in \mathbb {N}\) and \(P \in \mathcal {P}( \mathcal {T}^*_h)\) admissible, \(\rho = {\mathrm {UGW}}_h(P) \in \mathcal {S}\). For each \(n \in \mathbb {N}\), consider the forest \(F_n\) obtained from \((T,o)\), with law \(\rho \), by removing all edges adjacent to a vertex with degree higher than \(n\). We may define \(\rho ^{(n)}\) as the law of \((F_n(o),o)\), the connected component of the root. It is easy to check that \(\rho ^{(n)}\) is a unimodular measure. We define \(Q_n = \rho ^{(n)}_h\), the law of its \(h\)-neighborhood. By construction, \({\mathrm {UGW}}_h (Q_n) \in B\) and \(Q_n\) converges weakly to \(P\). We deduce that \({\mathrm {UGW}}_h(Q_n) \rightsquigarrow {\mathrm {UGW}}_h (P)\) and \({\mathrm {UGW}}_h(P) \in \mathcal {S}\).
Moreover, if \(\rho \in \mathcal {P}_{u} ( \mathcal {T}^*)\), then \({\mathrm {UGW}}_h(\rho _h)\rightsquigarrow \rho \), as \(h\rightarrow \infty \). From what precedes \({\mathrm {UGW}}_h(\rho _h) \in \mathcal {S}\). Therefore, \(\rho \in \mathcal {S}\) and \(\mathcal {S}= \mathcal {P}_{u} ( \mathcal {T}^*)\).
5 Graph counting and entropy
In this section we prove Theorem 1.2 and Theorem 1.3. The strategy will be as follows. We first establish the cases \(\Sigma (\rho ) = -\infty \) in Theorem 1.2. We then prove Theorem 1.3, and later complete the proof of Theorem 1.2. In what follows, we fix \(d>0\) and a sequence \(m=m(n)\) such that \(m/n\rightarrow d/2\) as \(n\rightarrow \infty \).
5.1 Measures with \(\Sigma (\rho )=-\infty \)
Since unimodular measures form a closed subset of \(\mathcal {P}(\mathcal {G}^*)\), if \(\rho \notin \mathcal {P}_u(\mathcal {G}^*)\), then for some \(\varepsilon >0\) one has \(B(\rho ,\varepsilon )\subset \mathcal {P}(\mathcal {G}^*){\setminus } \mathcal {P}_u(\mathcal {G}^*)\). Since \(U(G_n)\in \mathcal {P}_u(\mathcal {G}^*)\) for every \(G_n\in \mathcal {G}_{n,m}\), it follows that \(|\mathcal {G}_{n,m}(\rho ,\varepsilon )|=0\). Therefore \(\overline{\Sigma }(\rho )=-\infty \) for all \(\rho \notin \mathcal {P}_u(\mathcal {G}^*)\).
Next, we show that \(\overline{\Sigma }(\rho ) = -\infty \) whenever \(\mathbb {E}_{\rho } {\mathrm {deg}}(o) \ne d\). We start with the case \(\mathbb {E}_\rho {\mathrm {deg}}( o ) > d\). Let \(\rho \in \mathcal {P}_{u} ( \mathcal {G}^*)\) and assume that \(\overline{\Sigma }(\rho ) > - \infty \). Then, by an extraction argument, there must exist a sequence of graphs \(G_{n} \in \mathcal {G}_{n,m}\) such that \(U(G_{n}) \rightsquigarrow \rho \). Weak convergence then implies that \(\mathbb {E}_{U(G_{n})} [{\mathrm {deg}}(o)\wedge t] \rightarrow \mathbb {E}_{\rho } [{\mathrm {deg}}(o)\wedge t]\) for any \(t>0\), and therefore, letting \(n\rightarrow \infty \) and then \(t\rightarrow \infty \):
On the other hand, by construction,
We thus have checked that if \(\mathbb {E}_\rho {\mathrm {deg}}( o ) > d\), then \(\overline{\Sigma }(\rho )= -\infty \).
The case \(\mathbb {E}_\rho {\mathrm {deg}}( o) < d\) requires a little more care.
Lemma 5.1
If \(\mathbb {E}_\rho {\mathrm {deg}}( o) < d\), then \(\overline{\Sigma }(\rho )= -\infty \).
Proof
From (7), it is sufficient to prove that, for any sequence \(\varepsilon _n \rightarrow 0\),
where \(G_n\) is a uniform random graph in \(\mathcal {G}_{n,m}\). Define \(d' = \mathbb {E}_\rho {\mathrm {deg}}( o)\) and \(\delta = d- d' > 0\). If \(U(G_n)\in B(\rho ,\varepsilon _n)\) for all \(n\), then for any \(t>0\):
Therefore, for some sequence \(t_n \rightarrow \infty \), one has
Define \(A_n = \{ i \in [n] : {\mathrm {deg}}_{G_n}(i) > t_n \}\). Using (52) one has
On the other hand, by Markov’s inequality and (52), the cardinality of \(A_n\) satisfies \(|A_n|\leqslant \alpha _n n\), where \( \alpha _n= 2d/t_n\) for all \(n\) large enough. Thus \(U(G_n)\in B( \rho , \varepsilon _n)\) implies that there exists \(S\subset [n]\) with \(|S|\leqslant \alpha _n n\) such that \({\mathrm {deg}}_{G_n}(S):=\sum _{v \in S} {\mathrm {deg}}_{G_n}(v)\) is larger than \(\delta n/2\) for all \(n\) large enough. By the union bound one has
where \([\alpha _n n]=\{1,\dots ,\alpha _n n\}\). Next, we check that
To this end, observe that \({\mathrm {deg}}_{G_n}([\alpha _n n])\) is stochastically dominated by \(2N\), where \(N\) denotes the binomial random variable \(N={\mathrm {Bin}}{(\alpha _n n^2,2d/n)}\). Indeed, the number of potential edges incident to the set \([\alpha _n n]\) is trivially bounded by \( \alpha _n n^2\) and each potential edge can be included in \(G_n\) recursively, where at each step the probability of inclusion is bounded above by
if \(n\) is large enough, where we use \(m/n\rightarrow d/2\) and \(\alpha _n\rightarrow 0\). Therefore, by Chernoff’s bound, for any \(x> 0\),
Taking e.g. \(x=-\frac{1}{4}\log {\alpha _n} \), one obtains (54). Moreover, Stirling’s formula implies
This implies (53). \(\square \)
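The tail estimate used in the proof above is an instance of the exponential Markov (Chernoff) inequality: for \(N={\mathrm {Bin}}(M,p)\) and any \(x>0\), \(\mathbb {P}(N\geqslant a)\leqslant e^{-xa}(1-p+pe^x)^M\). The Python sketch below compares this generic bound with the exact binomial tail for illustrative parameters. The parameter values and function names are ours, and the bound is stated in its generic form, which need not coincide term by term with the displayed estimate in the proof.

```python
from math import comb, exp


def binomial_upper_tail(M, p, a):
    """Exact P(N >= a) for N ~ Bin(M, p)."""
    return sum(comb(M, j) * p**j * (1 - p) ** (M - j) for j in range(a, M + 1))


def chernoff_bound(M, p, a, x):
    """exp(-x a) E[exp(x N)] = exp(-x a) (1 - p + p e^x)^M, for x > 0."""
    return exp(-x * a) * (1 - p + p * exp(x)) ** M


if __name__ == "__main__":
    M, p, a = 200, 0.02, 20  # illustrative values: mean M p = 4, threshold 20
    print("exact tail:", binomial_upper_tail(M, p, a))
    for x in (0.5, 1.0, 2.0):
        print("x =", x, "bound:", chernoff_bound(M, p, a, x))
```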
We turn to the claim that \(\Sigma (\rho )=-\infty \) whenever \(\rho \) is not supported on trees.
Lemma 5.2
Suppose \(\rho \in \mathcal {P}_u(\mathcal {G}^*)\) is such that \(\rho (\mathcal {T}^*)<1\). Then there exists \(\varepsilon _0 > 0\) such that if \(0 < \varepsilon < \varepsilon _0\), then
In particular, \(\overline{\Sigma }( \rho , \varepsilon ) = - \infty \), for any \(0 < \varepsilon < \varepsilon _0\).
Proof
Once (55) is established, the last assertion follows from (7) and \(m/n\rightarrow d/2\). Let us prove (55). By assumption, there exist integers \(t\) and \(\ell \geqslant 3\) such that
For integer \(k \geqslant 2\), let us say that a cycle is a \((k, \ell )\)-cycle if its length is \(\ell \) and the degree of all vertices on the cycle is bounded by \(k\). Since \((G,o)_t\) is \(\rho \)-a.s. locally finite, there exists an integer \(k \geqslant 2\) such that
Consider the function \(f (G, o, v) = \mathbf {1}( {\mathrm {dist}}_G ( o, v) \leqslant t\, ; \,v \hbox { is in a}\, (k,\ell )\hbox {-cycle})\). From what precedes
Since \(\rho \) is unimodular, equation (2) applied to \(f\) implies that for some \(\eta >0\),
Thus, if \(G \in \mathcal {G}_{n,m} ( \rho , \varepsilon )\) and \(\varepsilon \) is small enough,
By definition of \(U(G)\), this implies that the number of vertices in a \((k,\ell )\)-cycle in \(G\) is at least \(\eta n\). Since degrees are bounded by \(k\) in a \((k,\ell )\)-cycle, we deduce that \(G\) contains at least \(\delta n\) mutually disjoint cycles of length \(\ell \), for some \(\delta = \delta (\ell , k )>0\). Therefore,
where \(C_{n,\ell }\) is the number of ways to place \( \lceil \delta n\rceil \) disjoint cycles of length \(\ell \) on \(n\) vertices. One has
Indeed \( ( n)_{ \ell \lceil \delta n\rceil } \) counts the number of ordered choices of the \(\ell \) vertices for each of \(\lceil \delta n\rceil \) labeled cycles (the first \(\ell \) vertices define the first cycle and so on), while division by \( \lceil \delta n\rceil !\) is used to remove cycle labels. By Stirling’s formula,
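The counting argument above can be tested by brute force for small parameters. Writing \(k\) for the number of cycles, the Python sketch below reads every ordered \(\ell k\)-tuple of distinct vertices as \(k\) consecutive cycles and collects the distinct unlabeled configurations. For \(\ell \geqslant 3\), each configuration arises from \(k!\,(2\ell )^k\) tuples (cycle labels, rotations and reflections), so \((n)_{\ell k}/k!\) is at least the number of distinct configurations. The function name and the small test values are ours.

```python
from itertools import permutations
from math import factorial


def cycle_configurations(n, ell, k):
    """Set of all configurations of k vertex-disjoint cycles of length ell on
    range(n); each configuration is a frozenset of cycles, each cycle a
    frozenset of undirected edges. Intended for ell >= 3 and small n."""
    configs = set()
    for tup in permutations(range(n), ell * k):
        cycles = []
        for c in range(k):
            block = tup[c * ell:(c + 1) * ell]  # vertices of the c-th cycle, in order
            edges = frozenset(
                frozenset((block[j], block[(j + 1) % ell])) for j in range(ell)
            )
            cycles.append(edges)
        configs.add(frozenset(cycles))
    return configs


if __name__ == "__main__":
    n, ell, k = 6, 3, 2
    distinct = len(cycle_configurations(n, ell, k))
    ordered_bound = factorial(n) // (factorial(n - ell * k) * factorial(k))
    exact = ordered_bound // (2 * ell) ** k
    print(distinct, exact, ordered_bound)
```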
On the other hand, from (7), we have
So finally,
This proves (55). \(\square \)
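The counting step in the proof above can be checked on small cases. The following Python sketch evaluates the quantity \((n)_{\ell k}/k!\) described in the text (ordered choices of the vertices of \(k\) labeled cycles, with labels removed) and compares it with a brute-force count of families of \(k\) vertex-disjoint cycles of length \(\ell \). The function names are hypothetical, and the brute-force enumeration is only an illustration for tiny \(n\). The quantity \((n)_{\ell k}/k!\) is an upper bound on the true count, since each cycle is still counted once for each of its \(2\ell \) rotations and reflections.

```python
from itertools import combinations, permutations
from math import factorial


def falling_factorial(n, r):
    """Return (n)_r = n (n-1) ... (n-r+1)."""
    out = 1
    for i in range(r):
        out *= n - i
    return out


def placement_bound(n, ell, k):
    """(n)_{ell*k} / k!: ordered vertex choices for k labeled cycles, labels removed."""
    return falling_factorial(n, ell * k) // factorial(k)


def cycles_of_length(n, ell):
    """All cycles of length ell on vertices 0..n-1, as (vertex set, edge set) pairs."""
    found = set()
    for verts in combinations(range(n), ell):
        first, rest = verts[0], verts[1:]
        for perm in permutations(rest):
            order = (first,) + perm
            edges = frozenset(
                frozenset((order[i], order[(i + 1) % ell])) for i in range(ell)
            )
            found.add((frozenset(verts), edges))
    return list(found)


def count_disjoint_families(n, ell, k):
    """Brute-force number of families of k vertex-disjoint cycles of length ell."""
    cycles = cycles_of_length(n, ell)
    total = 0
    for family in combinations(cycles, k):
        used = [v for vs, _ in family for v in vs]
        if len(used) == len(set(used)):  # no vertex shared between two cycles
            total += 1
    return total
```

For instance, with \(n=6\), \(\ell =3\), \(k=2\), the exact count is the number of ways to split six vertices into two triangles, while the bound is larger by the factor \((2\ell )^k\).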
5.2 Proof of Theorem 1.3 and Theorem 1.2
Notice that if \(P\in \mathcal {P}_h\), then \(J_h(P)\) is a well defined extended real number in \([-\infty ,\infty )\). The fact that \(J_h(P)\leqslant s(d)\) follows from Proposition 5.6 below and from the upper bound \(\overline{\Sigma }(\rho )\leqslant s(d)\), cf. (7).
As before, we fix \(d>0\) and an integer sequence \(m=m(n)\) such that \(m/n\rightarrow d/2\) as \(n\rightarrow \infty \). We start with three preliminary lemmas.
Lemma 5.3
The function \(\rho \mapsto \underline{\Sigma }(\rho )\) on \(\mathcal {P}_u(\mathcal {G}^*)\) is upper semi-continuous.
Proof
Consider a sequence \((\rho _k)\) converging to \(\rho \). We should check that \(\underline{\Sigma }(\rho ) \geqslant \limsup \underline{\Sigma }(\rho _k)\). Observe that for any \(\varepsilon >0\), for all \(k\) large enough, \(B(\rho , \varepsilon ) \supset B(\rho _k, \varepsilon /2)\). We get for \(k\) large enough,
Letting \(k\) tend to infinity and then \(\varepsilon \) to \(0\), we obtain the claim. \(\square \)
We will also need two general lemmas.
Lemma 5.4
Let \(P=\{p_x,\,x\in \mathcal {X}\}\) be a probability measure on a discrete space \(\mathcal {X}\) such that \(H(P) < \infty \). Let \((\ell _x)_{x \in \mathcal {X}}\) be a sequence with \(\ell _x \in \mathbb {Z}_+\), \(x\in \mathcal {X}\), such that \( \sum _{x} p_x \ell _x \log \ell _x < \infty \). Then \( - \sum _{x} p_x \ell _x \log p_x < \infty \).
Proof
We can assume without loss of generality that \(p_x \ne 0\) for all \(x \in \mathcal {X}\). We look for the sequence \((\ell _x)\) which maximizes the linear function \( - \sum _{x} p_x \ell _x \log p_x \) under the constraints \(\ell _x \geqslant 1\) and \(\sum _{x} p_x \ell _x \log \ell _x = c\). If the constraint \(\ell _x \geqslant 1\) is not saturated, taking the derivative with respect to \(\ell _x\), we find \(0 = - p_x \log p_x - \lambda p_x - \lambda p_x \log \ell _x\), where \(\lambda \) is the Lagrange multiplier associated to the constraint \(\sum _{x} p_x \ell _x \log \ell _x = c\). We get \(\ell _x = e^{-1} p_x^{-1/\lambda }\). Let \(\mathcal {X}_1\) be the set of \(x\) such that \(\ell _x =1\). We thus find
The conclusion follows. \(\square \)
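The stationarity condition used in the proof of Lemma 5.4 is easy to verify numerically. The sketch below (hypothetical function names, arbitrary test values for \(p_x\) and \(\lambda \)) checks that \(\ell _x = e^{-1} p_x^{-1/\lambda }\) indeed solves \(0 = - p_x \log p_x - \lambda p_x - \lambda p_x \log \ell _x\). It does not reproduce the remainder of the argument.

```python
import math


def stationary_ell(p, lam):
    """Candidate maximizer ell = e^{-1} p^{-1/lam} from the Lagrange condition."""
    return math.exp(-1.0) * p ** (-1.0 / lam)


def stationarity_residual(p, ell, lam):
    """Left-hand side of 0 = -p log p - lam p - lam p log ell."""
    return -p * math.log(p) - lam * p - lam * p * math.log(ell)


if __name__ == "__main__":
    for p, lam in [(0.3, 0.5), (0.01, 2.0), (0.9, 1.0)]:
        ell = stationary_ell(p, lam)
        print(p, lam, ell, stationarity_residual(p, ell, lam))
```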
Lemma 5.5
Let \(p,\kappa \) be integers and \(\mathcal {A}_\kappa \subset \mathcal {P}( \mathbb {Z}^p)\) the set of probability measures \(P\) on \(\mathbb {Z}^p\) such that \(\mathbb {E}\sum _{i=1}^p |X_i | \leqslant \kappa \) where \( X = (X_1, \dots , X_p)\) has law \(P\). Then the map \(P \mapsto H(P)\) is continuous on \(\mathcal {A}_\kappa \) for the weak topology.
Proof
A simple truncation argument shows that \(\mathcal {A}_\kappa \) is weakly closed. Let \(Q_n\) (resp. \(Q\)) be the law of \( \Vert X \Vert _1= \sum _{i=1}^p |X_i | \) where \(X\) has law \(P_n\) (resp. \(P\)). If \(P_n ^k\) (resp. \(P^k\)) is the conditional law of \(P_n\) (resp. \(P\)) conditioned on \(\Vert X\Vert _1= k\), we have
and similarly for \(P\). Since \(P_n ^k\) is a probability measure on a finite set of size \(c_k \leqslant ( 2 k +1)^p\), we have for any \(k\), \(Q_n(k) \rightarrow Q(k)\), \(H(P_n ^k) \rightarrow H(P^k)\) as \(n \rightarrow \infty \). Also, \(H(P^k_n) \leqslant \log (c_k) \leqslant p \log (2 k +1) \). Since \(\sum _k k Q_n(k) \leqslant \kappa \), using that \(x/\log (2x+1)\) is increasing for \(x\geqslant 1\), it follows that for \(\theta \geqslant 1\),
This proves the uniform integrability of \(k \mapsto H(P_n^k)\) for the measures \(Q_n\). Hence letting first \(n\) and then \(\theta \) tend to infinity, we get
It thus remains to prove that \(\lim _{n \rightarrow \infty } H(Q_n) = H(Q)\). The proof is similar. First, for any \(\theta \),
Then, we need to upper bound \( - \sum _{k \geqslant \theta } Q_n (k) \log Q_n(k)\), uniformly in \(n\). It can be done as follows. Observe that \(\sum _{k \geqslant \theta } \sqrt{k} Q_n(k) \leqslant \kappa / \sqrt{\theta }\). We then compute
under the linear constraints, \(x_k \geqslant 0\), \(\sum _k x_k \leqslant 1\) and \(\sum _{ k \geqslant 0} \sqrt{k} x_k = \delta \). Using Lagrange multipliers denoted by \(\lambda \) and \(\mu \), the solution of this convex optimization problem is of the form \(x_k = e^{-\mu - \lambda \sqrt{k}}\) for \(k \geqslant 0\) and \(\sum _k x_k = 1\). It is then easy to check that as \(\delta \rightarrow 0\), \(\lambda \delta \rightarrow 0\) and \(\mu \rightarrow 0\). It follows that \(L( \delta ) = \mu + \lambda \delta \rightarrow 0\). This implies that \(- \sum _{k \geqslant \theta } Q_n (k) \log Q_n(k) \leqslant L ( \kappa / \sqrt{\theta })\) goes to \(0\) as \(\theta \rightarrow \infty \), uniformly in \(n\). Letting \(n\) tend to infinity and then \(\theta \), this proves that \(\lim _{n \rightarrow \infty } H(Q_n) = H(Q)\). This concludes the proof of Lemma 5.5. \(\square \)
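The proof of Lemma 5.5 conditions on the value of \(\Vert X\Vert _1\). A decomposition consistent with that argument is the chain rule \(H(P) = H(Q) + \sum _k Q(k) H(P^k)\), where \(Q\) is the law of \(\Vert X\Vert _1\) and \(P^k\) the conditional law given \(\Vert X\Vert _1=k\). We state this identity as an assumption about the intended display. The following Python sketch, with hypothetical helper names, checks it on a small distribution on \(\mathbb {Z}^2\).

```python
import math
from collections import defaultdict


def entropy(probs):
    """Shannon entropy (natural logarithm) of a collection of probabilities."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def l1_norm(x):
    return sum(abs(xi) for xi in x)


def split_by_l1_norm(P):
    """P maps integer tuples to probabilities.

    Returns Q (law of the L1 norm) and the conditional laws P^k given the norm k.
    """
    Q = defaultdict(float)
    for x, p in P.items():
        Q[l1_norm(x)] += p
    cond = defaultdict(dict)
    for x, p in P.items():
        if p > 0:
            cond[l1_norm(x)][x] = p / Q[l1_norm(x)]
    return dict(Q), dict(cond)


def entropy_via_decomposition(P):
    """H(Q) + sum_k Q(k) H(P^k)."""
    Q, cond = split_by_l1_norm(P)
    return entropy(Q.values()) + sum(Q[k] * entropy(cond[k].values()) for k in cond)


if __name__ == "__main__":
    P = {(0, 0): 0.1, (1, 0): 0.2, (0, -1): 0.3, (1, 1): 0.15, (2, 0): 0.25}
    print(entropy(P.values()), entropy_via_decomposition(P))
```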
We now compute the entropy of \({\mathrm {UGW}}_h(P)\).
Proposition 5.6
For \(h\in \mathbb {N}\), \(P \in \mathcal {P}_h\) and \(\mathbb {E}_P {\mathrm {deg}}(o) = d\):
Proof
Lower bound, finite support Consider first the lower bound \(\underline{\Sigma }({\mathrm {UGW}}_h(P))\geqslant J_h(P)\) when \(P\) has finite support. By Lemma 4.11, we may choose a sequence \(\Gamma _n\in \mathcal {G}_{n,m}\) such that \(U(\Gamma _n)_h\rightsquigarrow P\) as \(n\rightarrow \infty \), and \(U(\Gamma _n)_h\) has support contained in \(\Delta :=\{t_1,\dots ,t_r\}\subset \mathcal {T}_h^*\) for all \(n\). Let \(N_h(\Gamma _n)\) denote the number of graphs \(G\in \mathcal {G}_{n}\) such that \(U(G)_h=U(\Gamma _n)_h\). Clearly, all such graphs have the same number \(m\) of edges. From Corollary 4.10, we know that \(N_h(\Gamma _n)= n(\mathbf {D})|\mathcal {G}(\mathbf {D},2h+1)|\), where \(\mathbf {D}\) is the neighborhood sequence associated to \(\Gamma _n\), i.e. if \(c=(t,t')\in \mathcal {T}_{h-1}^*\times \mathcal {T}_{h-1}^*\), then \(D_c(i) \) is the number of \(j\sim i\) in \(\Gamma _n\) such that \(\Gamma _n(i,j)_{h-1}=t'\) and \( \Gamma _n(j,i)_{h-1}=t\); see (48). Then,
where \(\alpha _k=\alpha _k( n) \) stands for the probability of \(t_k\) under \(U(\Gamma _n)_h\). Since \(\alpha _k\rightarrow P(t_k)\) as \(n\rightarrow \infty \), Stirling’s formula yields
On the other hand, from Corollary 4.6 we have
where \(\mathcal {C}\) denotes the set of all pairs \(c=(t,t')\in \mathcal {T}_{h-1}^*\times \mathcal {T}_{h-1}^*\) associated to \(\Gamma _n\) as in (48), \(S_c= S_{{\bar{c}}} = \sum _{u\in [n]}D_c(u)\), \({\bar{c}}=(t',t)\) if \(c=(t,t')\). Note that the size of \(\mathcal {C}\) is finite and independent of \(n\). For a given \(c=(t,t')\), using the notation (3) one has \( S_c/n \rightarrow e_P(t,t'). \) Also, writing \(2m=\sum _{c\in \mathcal {C}} S_c\), (58) can be rewritten as
To prove the desired lower bound on \(\underline{\Sigma }({\mathrm {UGW}}_h(P))\), we may restrict to graphs \(G\in \mathcal {G}_{n,m}\) with \(U(G)_h=U(\Gamma _n)_h\) to obtain
where \(G_n\) is uniformly distributed in \(\mathcal {G}(\mathbf {D},2h+1)\) with \(\mathbf {D}\) as above. From Theorem 4.8, for all \(\varepsilon >0\) one has
Using (60), we have proved that for all \(\varepsilon >0\), \(\underline{\Sigma }({\mathrm {UGW}}_h(P),\varepsilon )\geqslant J_h(P)\). Therefore,
Lower bound, general case Set \(\rho = {\mathrm {UGW}}_h(P)\). We can assume that \(J_h(P) > - \infty \). For each \(n \in \mathbb {N}\), consider the forest \(F_n\) obtained from \((T,o)\) with law \(\rho \) by removing all edges adjacent to a vertex with degree higher than \(n\). We may define \(\rho ^{(n)}\) as the law of \((F_n(o),o)\), the connected component of the root. It is not hard to check that \(\rho ^{(n)}\) is a unimodular measure. We define \(P_n = \rho ^{(n)}_h\), the law of its \(h\)-neighborhood. By construction, \(P_n\) is finitely supported and admissible, \(P_n\) converges weakly to \(P\), and \(d_n = \mathbb {E}_{P_n} {\mathrm {deg}}_G (o) \leqslant d\) converges to \(d\). We pick some fixed integer \(D > d\vee 2\) and define \(R = \delta _{t_\star }\) as the Dirac mass of the \(h\)-neighborhood of the \(D\)-regular tree. If \(n\) is large enough, there exists \(p_n \rightarrow 1\) such that \(Q_n = p_n P_n + (1-p_n) R\) has mean root degree equal to \(d\). Also, \(Q_n \in \mathcal {P}(\mathcal {T}^*_h)\) is admissible (the set of admissible measures is convex) and has finite support. We apply Lemma 5.3 and the lower bound for finitely supported measures, to obtain
By definition, \(J_h(Q_n) = -s(d) + H(Q_n) - \frac{d}{2} H(\pi _{Q_n}) - \sum _{(s,s')} \mathbb {E}_{Q_n} \log E_h (s,s') !\). We need to prove that \(\limsup J_h(Q_n) \geqslant J_h(P)\). It suffices to prove that
First, the lower semi-continuity of the entropy gives \( \liminf _{n\rightarrow \infty } H(P_n) \geqslant H(P)\). We now check that
For ease of notation, we write \(\mathcal {C}= \mathcal {T}^*_{h-1} \times \mathcal {T}^* _{h-1}\), \(c = (t,t') \in \mathcal {C}\) and \(E_h(c)(\tau )\) to make explicit the dependence in \(\tau \in \mathcal {T}^* _h\). As above, \(F_n\) is the forest obtained from \((T,o)\) with law \(\rho \), so that
where \(\varphi (\tau ) = \sum _c \log {{\left( E_h (c ) (\tau ) ! \right) }}\) satisfies:
In particular, \(\varphi ( (F_n(o),o) )\leqslant {\bar{\varphi }}(T,o):={\mathrm {deg}}_T(o) \log {\mathrm {deg}}_T(o)\). The assumption \(P \in \mathcal {P}_h\) implies that \(\mathbb {E}_P {\bar{\varphi }}(T,o)< \infty \). Therefore (63) follows from the dominated convergence theorem.
To conclude the proof of (62), it remains to check that \(\limsup H(\pi _{P_n}) \leqslant H(\pi _P)\), i.e.
For \(\theta \in \mathbb {N}\), we denote by \(\mathcal {F}_\theta \subset \mathcal {T}^*_h\) the subset of trees whose root vertex has degree bounded by \(\theta +1\) and by \(\mathcal {C}_\theta \subset \mathcal {C}\), the finite subset of pairs of trees with vertex degrees bounded by \(\theta \). The assumption \(P \in \mathcal {P}_h\) and Lemma 5.4 imply that \(- \sum _{\tau } {\mathrm {deg}}_\tau (o) P(\tau ) \log P (\tau ) < \infty \). Also, the assumption \(J_h(P) > -\infty \) implies that \(H(\pi _P) < \infty \) and \(\sum _c {{\left| e_{P} (c) \log e_{P} (c) \right| }}< \infty \). It follows that for any \(\varepsilon >0\), there exists \(\theta \) such that
By dominated convergence, for any \(c\in \mathcal {C}\), \(e_{P_n} (c) \rightarrow e_{P} (c)\). Since \(\mathcal {C}_\theta \) is finite, we find
Since \(\varepsilon >0\) is arbitrarily small, in order to complete (65), it suffices to prove that for any \(n \in \mathbb {N}\),
We write
It follows that
where we use that \(\sum _{ c } E_h ( c) (\tau )={\mathrm {deg}}_\tau (o)\), and that if \(\tau \notin \mathcal {F}_{\theta }\) and \(c \in \mathcal {C}_\theta \) then \(E_h( c)(\tau ) = 0\). Now, by construction, there exists a partition \(\cup _i \mathcal {X}^i _n \) of \(\mathcal {T}^*_h\) and \(\tau ^i_n \in \mathcal {X}^i _n\) such that if \((T,o) \in \mathcal {X}^i_n\) then \((F_n(o),o) = \tau ^i _n\). Also, \(P_n (\tau ^i_n) = P ( \mathcal {X}^i_n) \geqslant P( \tau ^i _n)\), and for all \(\tau \in \mathcal {X}^i _n \), \({\mathrm {deg}}_\tau (o) \geqslant {\mathrm {deg}}_{\tau ^i_n} (o)\), \(\mathbf {1}( \tau \notin \mathcal {F}_\theta ) \geqslant \mathbf {1}( \tau ^i _n \notin \mathcal {F}_\theta )\). It follows that
This concludes the proof of (65).
Upper bound The upper bound \(\overline{\Sigma }({\mathrm {UGW}}_h(P))\leqslant J_h(P)\) is a consequence of the general estimate of Lemma 5.7 below. \(\square \)
Lemma 5.7
Fix \(h\in \mathbb {N}\). If \(\rho \in \mathcal {P}_u(\mathcal {T}^*)\) is such that \(\rho _h\in \mathcal {P}_h\), then
Proof
Finite support For clarity, we first assume that \(P=\rho _h\) has finite support. The definition of local weak topology implies that for any \(h\in \mathbb {N}\), any \(\varepsilon >0\), there exists \(\eta >0\) such that \(B ( \rho , \eta ) \subset \{ \mu \in \mathcal {P}( \mathcal {G}^ *) : \,d_{TV} ( \mu _h, \rho _h) \leqslant \varepsilon \}\), where \(d_{TV}\) denotes the total variation distance. Define
Therefore, (66) follows if we prove
Let \(\Delta \subset \mathcal {T}_h^*\) be the support of \(P\). Define \(\mathcal {F}\subset \mathcal {T}^*_{h-1}\) as the set of unlabeled rooted trees \(t\in \mathcal {T}^*_{h-1}\) such that either \(T(o,v)_{h-1}=t\) or \( T(v,o)_{h-1}=t\) for some \(T\in \Delta \). Set \(L=|\mathcal {F}|\). Also, by adding a fictitious point \(\star \) to \(\mathcal {F}\), define \({\bar{\mathcal {F}}} = \mathcal {F}\cup \{\star \}\), and call \({\bar{\mathcal {C}}}\) the associated set of \((L+1)\times (L+1)\) colors \(c=(t,t')\), \(t,t'\in {\bar{\mathcal {F}}}\). To any graph \(G\in \mathcal {G}_{n,m}\) we may associate a degree sequence \({\bar{\mathbf {D}}} = ({\bar{D}}(1),\dots ,{\bar{D}}(n))\), where \({\bar{D}}(i)\) is a \((L+1)\times (L+1)\) matrix for each \(i\), obtained as in (48) by identifying with \(\star \) all neighborhoods that do not belong to \(\mathcal {F}\). The precise construction is defined as follows. Fix an edge \(\{u,v\}\) of \(G\): if \(G(u,v)_{h-1} = t'\) and \(G(v,u)_{h-1} =t\), with \(t,t'\in \mathcal {F}\), then we say that the oriented pair \((u,v)\) has color \(c=(t,t')\in {\bar{\mathcal {C}}}\); if either \(G(u,v)_{h-1}\) or \(G(v,u)_{h-1}\) are not in \(\mathcal {F}\), then we say that the oriented pair \((u,v)\) has color \((\star ,\star )\in {\bar{\mathcal {C}}}\). This defines a directed colored graph \({\widetilde{G}}\) with colors from the set \({\bar{\mathcal {C}}}\). We call \({\bar{\mathbf {D}}}\) the corresponding degree sequence, i.e. \({\bar{D}}_c(i)\) is the number of directed edges with color \(c\) going out of vertex \(i\). Note that by construction, if \((u,v)\) has color \(c\), then \((v,u)\) has color \({\bar{c}}\), and that there is no edge with color \((t,\star )\) or \((\star , t)\) for any \(t\in \mathcal {F}\).
In this way a graph \(G\in \mathcal {G}_{n,m}\) yields an element \({\widetilde{G}}\) of \({\widehat{\mathcal {G}}}({\bar{\mathbf {D}}})\). Let \({\bar{Q}}(G)\) denote the empirical degree law
Thus \({\bar{Q}}(G)\) is a probability measure on the set \(\mathcal {M}_{L+1}\); see Eq. (26). Also, let \({\bar{P}}\) denote the probability measure on \(\mathcal {M}_{L+1}\) induced by \(P\). Namely, \({\bar{P}}\) is the law of the random matrix \(\mathbf {D}\in \mathcal {M}_{L+1}\) defined as follows: for all \(c= (t,\star )\), or \(c=(\star , t)\) or \(c=(\star ,\star )\), set \(D_c=0\); and for \(c=(t,t')\) with \(t,t'\in \mathcal {F}\), set \(D_c=E_h(t',t)\), where \(E_h(t',t)\) is defined by (3) if the rooted graph \((G,o)\) has law \(P\). By contraction, one has \(H({\bar{P}}) \leqslant H(P)\) and
Let \(\mathcal {P}_{n,m}(P,\varepsilon )\) denote the set of probability measures \(Q\in \mathcal {P}(\mathcal {M}_{L+1})\) of the form (68), satisfying \(\sum _{i\in [n]}\sum _{c\in {\bar{\mathcal {C}}}}{\bar{D}}_c(i)=2m\), and such that \( d_{TV} (Q,{\bar{P}} )\leqslant \varepsilon \). The above discussion shows that if \(G\in A_{n,m} (P, \varepsilon )\), there must exist \(Q\in \mathcal {P}_{n,m}(P,\varepsilon )\) such that \({\bar{Q}}(G)=Q\). Therefore, one obtains
where \(n({\bar{\mathbf {D}}})\) is defined as in Corollary 4.10, and \({\bar{\mathbf {D}}}\) is the degree vector associated to \(Q\) as in (68).
Next, we claim that for each \(\varepsilon >0\),
Indeed, let \(p = |{\bar{\mathcal {C}}}|\) and fix a vector \(\ell \in \mathbb {Z}_+^{p}\). An integer partition of the vector \(\ell \) is an unordered sequence \(\{d(1), \dots , d(k)\}\), with \(d(i)\in \mathbb {Z}_+^ p\) for all \(i\), and such that \(d(1) + \cdots + d(k) = \ell \) componentwise. By [17, Lemma 4.2], if \(\sum _{i=1}^ p \ell _i = 2 m\) then the number of integer partitions of \(\ell \) is \(\exp ( o (m) )\). The number of vectors \(\ell \in \mathbb {Z}_+^{p}\) such that \(\sum _{i=1}^ p \ell _i = m\) is bounded by \((m+1)^p\). It follows that the number of unordered sequences \(\{d(1), \dots , d(n)\}\) in \(\mathbb {Z}_+^p\) such that \(\sum _{i=1}^n \sum _{c = 1} ^ p d_c(i) = 2 m\) is at most \(\exp (o(n))\), for \(m=O(n)\). Now, if \(Q\) is of the form (68) we may define \(d_c(i)={\bar{D}}_c(i)\), for every \(c\in {\bar{\mathcal {C}}}\) and \(i\in [n]\), which yields an injective map from \(\mathcal {P}_{n,m}(P,\varepsilon )\) to the unordered sequences \(\{d(1), \dots , d(n)\}\), with \(d(i)\in \mathbb {Z}_+^p\) such that \(\sum _{i=1}^n \sum _{c \in {\bar{\mathcal {C}}}} d_c(i) = 2 m\). This proves (70).
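The subexponential growth of the number of integer partitions, which drives (70), can be observed numerically in the scalar case \(p=1\). The sketch below is only an illustration under that simplification: it computes the classical partition numbers by dynamic programming (the function name is hypothetical) and lets one observe that \(\log p(m)/m\) decreases with \(m\). It does not reproduce the vector-valued statement of [17, Lemma 4.2].

```python
import math


def partition_numbers(m_max):
    """Return a list p with p[m] = number of integer partitions of m, 0 <= m <= m_max."""
    p = [1] + [0] * m_max
    # Standard coin-change recursion: allow parts of size 1, 2, ..., m_max in turn.
    for part in range(1, m_max + 1):
        for total in range(part, m_max + 1):
            p[total] += p[total - part]
    return p


if __name__ == "__main__":
    p = partition_numbers(400)
    for m in (10, 100, 200, 400):
        print(m, p[m], math.log(p[m]) / m)
```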
From (69) and (70), to prove (67), it remains to show that
where we use the notation \(\eta (\varepsilon )\) for an arbitrary function satisfying \(\eta (\varepsilon ) \rightarrow 0\) as \(\varepsilon \rightarrow 0\). Since \({\bar{\mathcal {C}}}\) is finite, reasoning as in (57) and using Lemma 5.5, it is easily seen that
Moreover, as in (58) one has
where \({\bar{S}}_c = \sum _{i\in [n]}{\bar{D}}_c(i)\). Observe that
Indeed, if \(Q_n\) is a sequence with \(Q_n\in \mathcal {P}_{n,m}(P,\varepsilon )\), then \({\bar{S}}_c/n = \mathbb {E}_{Q_n}D_{c}\), where \(D\in \mathcal {M}_{L+1}\) has law \(Q_n\). Then, for any \(k\in \mathbb {N}\), \({\bar{S}}_c/n \geqslant \mathbb {E}_{Q_n}[D_{c}\wedge k]\), and since \(Q_n\in \mathcal {P}_{n,m}(P,\varepsilon )\) and \(D_{c}\wedge k\) is a bounded function, taking first \(\varepsilon \rightarrow 0\) and then \(k\rightarrow \infty \), one has \({\bar{S}}_c/n \geqslant e_P(c) - \eta (\varepsilon )\) uniformly in \(n\). Moreover, since \(\sum _{c\in {\bar{\mathcal {C}}}}{\bar{S}}_c/n = 2m/n = d + o (1)\), one has \({\bar{S}}_c/n = d + o(1) - \sum _{c'\ne c}{\bar{S}}_{c'}/n \). Therefore, from the lower bound \({\bar{S}}_c/n \geqslant e_P(c) - \eta (\varepsilon )\) and the fact that \(\sum _c e_P(c)=d\), one finds \({\bar{S}}_c/n \leqslant e_P(c) +|{\bar{\mathcal {C}}}|\eta (\varepsilon ) + o(1)\). This ends the proof of (73). Moreover, with the same truncation argument as above one has that
for all \(c\in {\bar{\mathcal {C}}}\). This, together with (72)–(73) and the argument in (59) allows us to conclude the proof of (71). This ends the proof of (67).
General case We now come back to the case of arbitrary \(P \in \mathcal {P}_h\). For any finite set \(\Delta \subset \mathcal {T}^*_h\), we associate the sets \(\mathcal {C}=\mathcal {C}(\Delta )\) and \({\bar{\mathcal {C}}}\) as above. The above argument establishes that
where \(J^\Delta _h (P) := -s(d) + H(P) - \frac{d}{2} H ( \pi _{{\bar{P}}}) - \sum _{(t,t') \in \mathcal {C}}\mathbb {E}_ P \log E_h ( t,t') ! - \mathbb {E}_P \log E_h (\star , \star ) ! \) and \(\pi _{{\bar{P}}} \in \mathcal {P}( {\bar{\mathcal {C}}})\) is defined as follows: for \(c = (t,t') \in \mathcal {C}\), \(\pi _{{\bar{P}}} (t,t') = \pi _ P ( t,t')\) and for \(c = (\star , \star )\),
Assume first that \(J_h(P) > - \infty \). Using (64) at the second line, one has
We may then consider a sequence \((\Delta _k)\) of finite subsets in \(\mathcal {T}^*_h\) such that \(P ( T \notin \Delta _k) \rightarrow 0\), and \(\sum _{c \notin \mathcal {C}(\Delta _k)} \pi _ P ( c) \log \pi _ P ( c) \rightarrow 0\), as \(k \rightarrow \infty \). Then as \(k \rightarrow \infty \), the above expression converges to \(J_h (P)\). This proves that (66) holds when \(P \in \mathcal {P}_h\) and \(J_h (P) > -\infty \).
If \(P \in \mathcal {P}_h\) and \(J_h (P) = -\infty \) then either \(H(\pi _P) = \infty \) or \(\sum _{(t,t')} \mathbb {E}_P \log E_h (t,t') ! = \infty \). We use the upper bound
We may consider a sequence \((\Delta _k)\) of finite subsets of \(\mathcal {T}^*_h\) such that, as \(k\rightarrow \infty \), one has \(\sum _{c\in \mathcal {C}(\Delta _k)} \pi _ P ( c) \log \pi _ P (c) \rightarrow -H(\pi _P)\) and \( \sum _{c \in \mathcal {C}(\Delta _k)}\mathbb {E}_ P \log E_h (c) ! \rightarrow \sum _{c}\mathbb {E}_ P \log E_h ( c) ! \), and therefore \(J^{\Delta _k} _h (P)\rightarrow -\infty \), \(k\rightarrow \infty \). This completes the proof of (66). \(\square \)
Next, we extend Lemma 5.7 to the case \(\rho _h \notin \mathcal {P}_h\), i.e. \(H(\rho _h)=\infty \) or \(\mathbb {E}_\rho {\mathrm {deg}}_T (o) \log {\mathrm {deg}}_T(o) = \infty \). We start with the latter case.
Lemma 5.8
If \(\rho \in \mathcal {P}_u ( \mathcal {T}^*)\) is such that \(\mathbb {E}_\rho {\mathrm {deg}}_T (o) = d\) and \(\mathbb {E}_\rho {\mathrm {deg}}_T (o) \log {\mathrm {deg}}_T(o) = \infty \) then
Proof
We set \(P = \rho _1\) which can be identified with a probability measure on \(\mathbb {Z}_+\). Since \(P\) has finite first moment, \(H(P)\) is finite. The proof of Lemma 5.7 can be simplified for \(h=1\): since \(\mathcal {T}^* _{h-1}\) has a unique element (the isolated root), one has \(H(\pi _P)=0\) and it is not necessary to consider the extra state \(\star \). The bound (67) gives
Now, from Stirling’s approximation, for \(n \geqslant 1\), \(n ! \geqslant c \sqrt{n} e^{-n} n ^n\) for some constant \(c >0\). We deduce that \(\log n ! \geqslant c' - n + n \log n \) for some constant \(c'>0\). In particular, from \(\mathbb {E}_P {\mathrm {deg}}_T(o) = d < \infty \) and \(\mathbb {E}_\rho {\mathrm {deg}}_T (o) \log {\mathrm {deg}}_T(o) = \infty \), we get that \(\mathbb {E}_ P \log {\mathrm {deg}}_T (o) ! = \infty \). \(\square \)
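The Stirling lower bound used at the end of the proof can be checked numerically. In the sketch below (hypothetical function name), the constant \(\frac{1}{2}\log (2\pi )\) plays the role of \(c'\); this particular value is our choice for illustration and is not taken from the text.

```python
import math


def stirling_gap(n):
    """Return log(n!) - (n log n - n) for an integer n >= 1."""
    return math.lgamma(n + 1) - (n * math.log(n) - n)


if __name__ == "__main__":
    c_prime = 0.5 * math.log(2 * math.pi)
    for n in (1, 2, 10, 100, 1000):
        print(n, stirling_gap(n), stirling_gap(n) >= c_prime)
```

Since the gap is bounded below by a constant, \(\mathbb {E}\log D!\) is at least \(c' - \mathbb {E}D + \mathbb {E}D\log D\), which diverges whenever \(\mathbb {E}D<\infty \) and \(\mathbb {E}D\log D=\infty \).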
The following statement is the extension of Lemma 5.7 to the case \(\rho _h \notin \mathcal {P}_h\).
Proposition 5.9
If \(\rho \in \mathcal {P}_u(\mathcal {T}^*)\), then for any \(h\in \mathbb {N}\),
where \(\overline{J}_h(\rho _h)=J_h(\rho _h)\) if \(\rho _h\in \mathcal {P}_h\), and \(\overline{J}_h(\rho _h)=-\infty \) otherwise.
In view of Lemma 5.7 and Lemma 5.8, Proposition 5.9 is a consequence of the following lemma.
Lemma 5.10
Let \(\rho \in \mathcal {P}_u(\mathcal {T}^*)\) be such that \(\mathbb {E}_\rho {\mathrm {deg}}_T (o) \log {\mathrm {deg}}_T(o) < \infty \). Then for any \(h \in \mathbb {N}\), \( H(\rho _h) < \infty . \) Consequently, for any \(P \in \mathcal {P}(\mathcal {T}^*_h)\), \(P \in \mathcal {P}_h\) is equivalent to \(P\) admissible and \(\mathbb {E}_P {\mathrm {deg}}_T (o) \log {\mathrm {deg}}_T(o) < \infty \).
Proof
The second statement follows from the first applied to \(\rho = {\mathrm {UGW}}_h (P)\) with \(P\) admissible. We now prove the first statement. Since \(d:=\mathbb {E}_\rho {\mathrm {deg}}(o)\) is finite, one has \(H(\rho _1)<\infty \). To prove the lemma, we proceed by induction, and show that for any \(h\in \mathbb {N}\), if \(H(\rho _{h})<\infty \) and \(\mathbb {E}_\rho {\mathrm {deg}}_T (o) \log {\mathrm {deg}}_T(o) < \infty \) then \(H(\rho _{h+1})<\infty \). Set \(P=\rho _h\), \(Q=\rho _{h+1}\), and \(Q^*=[{\mathrm {UGW}}_h(P)]_{h+1}\). Assume that \(H(P)<\infty \). We are going to prove that \(H(Q)<\infty \). Observe that
where \(Q(\cdot |\gamma )\) stands for the conditional distribution of the \((h+1)\)-neighborhood given the \(h\)-neighborhood \(\gamma \). Also,
Now recall that \(\tau \in \mathcal {T}^*_{h+1}\) determines all the coefficients \(E_{h+1}(t,t')\), \((t,t')\in \mathcal {T}^*_{h}\times \mathcal {T}^*_{h}\), and these can be partitioned according to the pairs \((s,s')\in \mathcal {T}^*_{h-1}\times \mathcal {T}^*_{h-1}\) such that \(t_{h-1}=s\), \(t'_{h-1}=s'\). With this notation, by definition of \(Q^*\), one has, for \(\tau \in \mathcal {T}^*_{h+1}\) such that \(\tau _h=\gamma \):
where the terms \(\{E_{h+1}(t,t')\}\) in the multinomial coefficient are all such that \(t_{h-1}=s\), \(t'_{h-1}=s'\), and we write \(k_{t,s'}(\tau ):=|\{v \mathop {\sim }\limits ^{\tau }o: \, \tau (o,v)_h=t, \,\tau (v,o)_{h-1}=s'\}|\), with \(t_{h-1}=s\). Therefore,
Moreover, unimodularity yields
Thus,
In conclusion, we have obtained that
The proof will be complete once we show that \(H(P)<\infty \) and \(\mathbb {E}_\rho {\mathrm {deg}}_T (o) \log {\mathrm {deg}}_T(o) < \infty \) imply that \(\sum _{s,s'}e_P(s,s')\,H( {\widehat{P}}_{s,s'})<\infty .\)
Now, by definition, if \(\gamma = t \cup s'_+\) and \(n_{t,s'} = \big |\{ v \mathop {\sim }\limits ^{\gamma } o : \gamma (v,o) = t, \gamma (o,v) = s' \}\big |\), we have
where we have used that \(e_P (s,s')\leqslant d\, n_{t,s'}\), and that \(1\leqslant n_{t,s'} \leqslant E_h (s',s) (\gamma )\) where \(\gamma = t \cup s'_+\) and \(t_{h-1} = s\). Since \({\mathrm {deg}}_\gamma (o) = \sum _{s,s'}E_h (s,s') (\gamma )\), we find
It remains to apply Lemma 5.4 with \(\mathcal {X}= \mathcal {T}^*_h\) and \(\ell _x = {\mathrm {deg}}_x (o)\) together with the assumption \(\mathbb {E}_\rho {\mathrm {deg}}_T (o) \log {\mathrm {deg}}_T(o) < \infty \). \(\square \)
Lemma 5.11
Suppose \(\rho \in \mathcal {P}_u(\mathcal {T}^*)\). Then \(\overline{J}_k(\rho _k)\), \(k\in \mathbb {N}\), is a non-increasing sequence. Assume moreover that \(\rho _1\) has finite support. Then for fixed \(k>h\), one has \(\overline{J}_k(\rho _k) < \overline{J}_h ( \rho _h)\) if and only if \(\rho _k\ne [{\mathrm {UGW}}_h(\rho _h)]_k\). In particular, if \(\rho _1\) has finite support, then for any \(h\in \mathbb {N}\), one has \(\overline{\Sigma }(\rho )< \overline{J}_h(\rho _h)\) if and only if \(\rho \ne {\mathrm {UGW}}_h(\rho _h)\).
Proof
Fix \(h\in \mathbb {N}\). To prove that \(\overline{J}_{h+1}(\rho _{h+1})\leqslant \overline{J}_h(\rho _h)\), we may assume that \(\rho _{h+1}\in \mathcal {P}_{h+1}\). In this case one has also that \(\rho _{h}\in \mathcal {P}_{h}\). From Proposition 5.6, we know that \(\Sigma ({\mathrm {UGW}}_k(\rho _k))=J_k(\rho _k)\) for both \(k=h\) and \(k=h+1\). Therefore, using Lemma 5.7 one has
where we use \([{\mathrm {UGW}}_k(\rho _k)]_h=\rho _h\), \(k\geqslant h\). This proves that \(\overline{J}_k(\rho _k)\) is non-increasing in \(k\).
We now assume that \(\rho _1\) has finite support. Then, by unimodularity it follows that \(\rho _h\) has finite support for all \(h\in \mathbb {N}\). In particular, \(\rho _h\in \mathcal {P}_h\) and \(J_h(\rho _h)>-\infty \) for all \(h\in \mathbb {N}\). Fix \(k>h\). Suppose that \(\overline{J}_k(\rho _k) < \overline{J}_h ( \rho _h)\). One has \(\Sigma ({\mathrm {UGW}}_k(\rho _k))=J_k(\rho _k)\) by Proposition 5.6. From the consistency property of Lemma 3.3, one must then have \(\rho _k\ne [{\mathrm {UGW}}_h(\rho _h)]_k\).
Next, suppose that \(\rho _1\) has finite support and that \(\rho _k\ne [{\mathrm {UGW}}_h(\rho _h)]_k\) and let us show that \(J_k(\rho _k) < J_h ( \rho _h)\). If \(\Gamma _n\in \mathcal {G}_{n,m}\) is a sequence with \(U(\Gamma _n)_k\rightsquigarrow \rho _k\), then also \(U(\Gamma _n)_h\rightsquigarrow \rho _h\) and by (60) one has
Using Corollary 4.10, if \({\widehat{G}}_n\) denotes a random graph with uniform distribution in \(\mathcal {G}(\mathbf {D}^{(n)},2h+1)\), \(\mathbf {D}^{(n)}\) being the degree vector associated to the \(h\)-neighborhood of \(\Gamma _n\), one also has
Since \(\rho _k \ne \gamma _k:=[{\mathrm {UGW}}_h(\rho _h)]_k\), there exist \(\varepsilon >0\) and an event \(A\) of the form \(A = \{ g \in \mathcal {G}^* : g_k= t \}\) for some \(t\in \mathcal {T}^*_k\), such that \(| \rho _k(A) - \gamma _k (A) | > \varepsilon \). Therefore, \(U(\Gamma _n)_k\rightsquigarrow \rho _k\) implies that
By Proposition 7.1, \(\mathbb {E}U({\widehat{G}}_n) (A)\) converges to \(\gamma _k(A)\). It follows that
The desired conclusion \(J_k(\rho _k)-J_h(\rho _h)<0\) now follows from (99) (in the Appendix).
Finally, the assertion concerning \( \overline{\Sigma }(\rho )\) follows easily from the results above. Indeed, from Proposition 5.6 we know that \(\overline{\Sigma }(\rho )<J_h(\rho _h)\) implies that \(\rho \ne {\mathrm {UGW}}_h(\rho _h)\). For the opposite direction, observe that if \(\rho \ne {\mathrm {UGW}}_h(\rho _h)\), then \(\rho _k\ne [{\mathrm {UGW}}_h(\rho _h)]_k\) for some \(k>h\). From Lemma 5.7 one has \(\overline{\Sigma }(\rho )\leqslant J_k(\rho _k)\), and the above implies \(\overline{\Sigma }(\rho )<J_h(\rho _h)\). \(\square \)
Lemma 5.12
Suppose \(\rho \in \mathcal {P}_u(\mathcal {T}^*)\). Then
Proof
The limit \(\overline{J}_\infty (\rho )\) is well defined by the monotonicity in Lemma 5.11. The upper bound in Proposition 5.9 shows that \(\overline{\Sigma }(\rho )\leqslant \overline{J}_\infty (\rho )\). Thus, all we have to prove is
We may assume that \(\rho _k\in \mathcal {P}_k\) for all \(k\in \mathbb {N}\). Fix \(\eta >0\) and set \(\rho ^ h = {\mathrm {UGW}}_h ( \rho _h)\). By the lower bound in Proposition 5.6, for any \(h \in \mathbb {N}\), \(\varepsilon >0\) and \(n \geqslant n_0(\varepsilon ,h,\eta )\),
By diagonal extraction, there exist sequences \(h_n \rightarrow \infty \) and \(\varepsilon _n \rightarrow 0\) such that
Since \(\rho ^ {h_n} \rightsquigarrow \rho \), for any fixed \(\varepsilon >0\) and all \(n\) large enough, \( B ( \rho ^{h_n}, \varepsilon _n ) \subset B ( \rho , \varepsilon ). \) In particular, \({{\left| \mathcal {G}_{n,m} (\rho , \varepsilon ) \right| }} \geqslant {{\left| \mathcal {G}_{n,m} (\rho ^{h_n}, \varepsilon _n ) \right| }}\). It follows that \( \underline{\Sigma }(\rho ,\varepsilon ) \geqslant \overline{J}_\infty (\rho )- \eta . \) The latter holding for all \(\varepsilon >0\) and \(\eta >0\), we have checked that (81) holds. \(\square \)
All the statements in Theorem 1.3 are contained in Proposition 5.6, Proposition 5.9, Lemma 5.11 and Lemma 5.12. Moreover, Lemma 5.12 implies that \(\Sigma (\rho )\) is well defined and equals \(\overline{J}_\infty (\rho )\) for every \(\rho \in \mathcal {P}_u(\mathcal {T}^*)\), independently of the choice of the sequence \(m=m(n)\) with \(m/n\rightarrow d/2\). This completes the proof of Theorem 1.2 and Theorem 1.3.
5.3 Proof of Corollary 1.4
In the special case \(h = 1\), one has \(P\in \mathcal {P}(\mathbb {Z}_+)\), and the condition \(\sum _{n=0}^\infty nP(n)=d\) implies \(H(P)<\infty \). By Proposition 5.6 one has \(\Sigma ({\mathrm {UGW}}_1(P)) = J_1(P)\). Moreover, since \(|\mathcal {T}^*_0|=1\), there exists a unique type \((s,s')\in \mathcal {T}^*_0\times \mathcal {T}^*_0\) with \(e_P(s,s')= d\) and therefore \(H ( \pi _P)=0\), and
It follows that
This ends the proof of Corollary 1.4.
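The computation behind Corollary 1.4 can be checked numerically for a finitely supported degree law. The sketch below rests on two assumptions that are not stated in this section: that \(s(d) = \frac{d}{2} - \frac{d}{2}\log d\), and that the corollary takes the form \(\Sigma ({\mathrm {UGW}}_1(P)) = s(d) - H(P\,|\,{\mathrm {Poi}}(d))\). Under these assumptions, the expression \(J_1(P) = -s(d) + H(P) - \mathbb {E}_P \log D!\), obtained from the definition of \(J_h\) with \(H(\pi _P)=0\), should agree with \(s(d) - H(P\,|\,{\mathrm {Poi}}(d))\). All function names are hypothetical.

```python
import math


def s(d):
    # Assumed form of s(d); it is defined earlier in the paper, outside this section.
    return d / 2 - (d / 2) * math.log(d)


def mean(P):
    """Mean of a law P given as a dict {k: P(k)} on nonnegative integers."""
    return sum(k * p for k, p in P.items())


def J1(P):
    """-s(d) + H(P) - E_P log D!, i.e. J_h for h = 1 with H(pi_P) = 0."""
    d = mean(P)
    H = -sum(p * math.log(p) for p in P.values() if p > 0)
    E_log_fact = sum(p * math.lgamma(k + 1) for k, p in P.items())
    return -s(d) + H - E_log_fact


def relative_entropy_to_poisson(P, d):
    """H(P | Poi(d)) = sum_k P(k) log(P(k) / Poi(d)(k))."""
    total = 0.0
    for k, p in P.items():
        if p > 0:
            log_poisson = -d + k * math.log(d) - math.lgamma(k + 1)
            total += p * (math.log(p) - log_poisson)
    return total


if __name__ == "__main__":
    P = {1: 0.2, 2: 0.5, 4: 0.3}
    d = mean(P)
    print(J1(P), s(d) - relative_entropy_to_poisson(P, d))
```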
Remark 5.13
Fix \(h\in \mathbb {N}\), and suppose that \(\rho \in \mathcal {P}_u(\mathcal {T}^*)\) is such that \(\rho _h\in \mathcal {P}_h\). One can derive the following alternative expression for \(J_h(\rho _h)\) in terms of relative entropies:
where \(\Delta _1(\rho )= H(\rho _1\,|\,{\mathrm {Poi}}(d))\) and, for \(k\geqslant 2\):
where \(\rho _k^*:=[{\mathrm {UGW}}_{k-1}(\rho _{k-1})]_k\). To prove (82), thanks to Corollary 1.4, it suffices to prove that the increment \(J_{k-1}(\rho _{k-1}) - J_k(\rho _k)\) equals (83) for \(k\geqslant 2\). This in turn can be checked as follows.
Fix \(h\in \mathbb {N}\), and write \(Q=\rho _{h+1}\), \(Q^*=\rho ^*_{h+1}\), \(P=\rho _h\). Simple manipulations show that
where \(t\in \mathcal {T}^*_{h}\) while \((s,s')\in \mathcal {T}^*_{h-1}\times \mathcal {T}^*_{h-1}\), we use the multinomial coefficients introduced in (76), and we define the conditional probability \(q(\cdot |s,s')\) on \(\mathcal {T}^*_{h-1}\times \mathcal {T}^*_{h-1}\) by \(\pi _Q(t,t')=\pi _{P}(s,s')q(t,t'|s,s')\). Using (76) and (77), one finds
Therefore,
Next observe that if \(q^*(t,t'|s,s'):={\widehat{P}}_{s,s'}(t){\widehat{P}}_{s',s}(t')\), then \(\pi _{Q^*}(t,t')=\pi _{P}(s,s')q^*(t,t'|s,s')\), see Remark 3.4. Moreover, using
one finds
It follows that
where \((s,s')\in \mathcal {T}^*_{h-1}\times \mathcal {T}^*_{h-1}\), while \((t,t')\in \mathcal {T}^*_{h}\times \mathcal {T}^*_{h}\). From (85) we then obtain the desired conclusion \(J_h(P)-J_{h+1}(Q) = \Delta _{h+1}(\rho )\). Clearly, the monotonicity in Lemma 5.11 implies that \(\Delta _{h+1}(\rho )\geqslant 0\). This yields the seemingly nontrivial inequality \(\frac{d}{2} H(\pi _Q\,|\,\pi _{Q^*})\leqslant H(Q|Q^*)\).
5.4 Discontinuity of the entropy
The aim of this section is to prove that the map \(\rho \mapsto \Sigma (\rho )\), from \(\mathcal {P}(\mathcal {G}^*)\) to \([-\infty ,\infty )\), is discontinuous for the weak topology at \(\rho = {\mathrm {UGW}}_1( P)\) for any finitely supported \(P \in \mathcal {P}(\mathbb {Z}_+)\) with \(P(0) = P(1) = 0\) and \(P(2) < 1\).
Let \(P_1,P_2\) be two probability measures on \(\mathbb {Z}_+\) with finite positive means, say \(d_1\) and \(d_2\). For \(i = 1, 2\), we set \(p_i = d_{{\bar{i}}} / ( d_1 + d_2)\), where \({\bar{1}} = 2 \), \({\bar{2}} = 1\). We define \({\mathrm {UGW}}(P_1,P_2)\) as the law of the rooted tree \((T, o)\) obtained as follows. We first build a rooted multi-type Galton–Watson tree \(({\check{T}},o)\). The vertices can be of type \(1\) or of type \(2\). The root has type \(i\) with probability \(p_i\). All offspring of a vertex of type \(i\) are of type \({\bar{i}}\). Conditioned on being of type \(i\), the root has a number of offspring distributed according to \(P_i\). Conditioned on being of type \(i\), a vertex different from the root has a number of offspring distributed according to the size-biased law \({\widehat{P}}_i\) given by (6). The tree \((T,o)\) is finally obtained from \(({\check{T}}, o)\) by removing the types.
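The construction of \({\mathrm {UGW}}(P_1,P_2)\) can be illustrated by a small sampler. The following is only a minimal sketch: the function names (`size_biased`, `draw`, `sample_ugw2`) and the dictionary encoding of laws are illustrative choices, the tree is truncated at a finite depth, and the size-biased law is assumed to be the usual one, \({\widehat{P}}(k) = (k+1)P(k+1)/d\) with \(d\) the mean of \(P\).

```python
import random

def size_biased(P):
    """Assumed form of the size-biased law: hatP(k) = (k+1) P(k+1) / mean(P)."""
    mean = sum(k * w for k, w in P.items())
    return {k - 1: k * w / mean for k, w in P.items() if k >= 1 and w > 0}

def draw(law, rng):
    """Draw one value from a finitely supported law given as {value: weight}."""
    ks = list(law)
    return rng.choices(ks, weights=[law[k] for k in ks])[0]

def sample_ugw2(P1, P2, depth, rng):
    """Sample the two-type tree down to `depth` generations.

    Returns (types, children): types[v] is 1 or 2, children[v] lists the
    offspring of v; vertex 0 is the root. Forgetting `types` gives (T, o).
    """
    P = {1: P1, 2: P2}
    d = {i: sum(k * w for k, w in P[i].items()) for i in (1, 2)}
    hatP = {i: size_biased(P[i]) for i in (1, 2)}
    # The root has type i with probability p_i = d_{bar i} / (d_1 + d_2).
    root_type = 1 if rng.random() < d[2] / (d[1] + d[2]) else 2
    types, children = [root_type], [[]]
    frontier = [0]
    for _ in range(depth):
        nxt = []
        for v in frontier:
            # Root uses P_i, every other vertex uses the size-biased law.
            law = P[types[v]] if v == 0 else hatP[types[v]]
            for _ in range(draw(law, rng)):
                types.append(3 - types[v])  # offspring have the other type
                children.append([])
                children[v].append(len(types) - 1)
                nxt.append(len(types) - 1)
        frontier = nxt
    return types, children

types, children = sample_ugw2({2: 0.5, 3: 0.5}, {3: 1.0}, 3, random.Random(0))
```

Taking \(P_1=P_2=P\) in this sketch makes the types irrelevant for the law of the unlabeled tree, in line with the identity \({\mathrm {UGW}}(P,P) = {\mathrm {UGW}}_1 (P)\) used below.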
The distribution of \(({\check{T}}, o)\) is unimodular, which implies that \({\mathrm {UGW}}(P_1,P_2)\) is also unimodular. Unimodularity of \(({\check{T}}, o)\) can be checked directly from the definition, or by proving that this law is the local weak limit of bipartite configuration models (these are of particular interest in coding theory, see e.g. Mézard and Montanari [25]).
Now, let \(S \subset \mathbb {Z}_+\) be a finite set and \(P\) be a probability measure on \(S\). Observe that \({\mathrm {UGW}}(P,P) = {\mathrm {UGW}}_1 (P)\) and that if \(P_n\) is a sequence of probability measures on \(S\) such that \(P_n \rightsquigarrow P\) then \({\widehat{P}}_n \rightarrow {\widehat{P}}\) and
However, we have the following discontinuity result:
Proposition 5.14
Assume that \( S \subset \mathbb {Z}_+ \backslash \{ 0, 1\}\), \(P, P_n\in \mathcal {P}(S)\), and \(P_n \rightsquigarrow P\) as \(n\rightarrow \infty \). Assume further that \(P(2) < 1\) and that \(P_n \ne P\) for all \(n\) large enough. Then,
The proposition is a consequence of the following upper bound on \( \Sigma ({\mathrm {UGW}}(P_1,P_2)) \).
Lemma 5.15
Let \( S \subset \mathbb {Z}_+ \backslash \{ 0, 1\}\) be a finite set and \(P_1 \ne P_2\) be two probability measures on \(S\). We have
where \(D\) is the random variable with law \(P_1\), \(P_2\) respectively, and \(H((p_1,p_2))=-\sum _{i=1}^2p_i\log p_i\).
The idea will be to prove that if \(U(G_n) \rightsquigarrow {\mathrm {UGW}}(P_1,P_2)\), then \(G_n\) needs to be approximately bipartite. The constraint of being bipartite will be costly in terms of entropy.
Proof of Proposition 5.14
Using Lemma 5.15, with \(P_1 = P_n\) and \(P_2 = P\), we may upper bound \(\Sigma ( {\mathrm {UGW}}( P, P_n) ) \) by
Since \(P_n\) and \(P\) have support in the finite set \(S\), \(p_1(n) \rightarrow 1/2\), \(p_2(n) \rightarrow 1/2\), \(H(P_n) \rightarrow H(P)\) and \(\mathbb {E}_{P_n} \log (D!) \rightarrow \mathbb {E}_{P} \log (D!) \). So finally
Since \(P(0) = P(1) = 0 \) and \(P(2) < 1\), we have \(d > 2\). \(\square \)
Proof of Lemma 5.15
Let us start with a remark. We denote by \(d_i\) and \({\hat{d}}_i\) the means of \(P_i\) and \({\widehat{P}}_i\), respectively. Since \(0,1 \notin S\), the support of \({\widehat{P}}_i\) is included in \(\{1, \dots , \theta \}\) for some \(\theta \). It follows that \({\hat{d}}_i \geqslant 1\). Also, \({\hat{d}}_i = 1\) implies that \({\widehat{P}}_i = \delta _1\), hence \(P_i = \delta _2\). Since \(P_1 \ne P_2\), we have that either \({\widehat{P}}_1\) or \({\widehat{P}}_2\) is different from \(\delta _1\). In particular,
Let \((T,o)\) be a rooted tree with distribution \(\rho = {\mathrm {UGW}}(P_1,P_2)\) obtained from a multi-type rooted tree \(({\check{T}}, o)\) as above whose law is denoted by \({\check{\rho }}\). We will assign to all vertices of \(T\) a type in \(\{a,b\}\): type \(a\) (resp. \(b\)) is supposed to be a good approximation for type \(1\) (resp. \(2\)) in \({\check{T}}\).
Let \(\mathcal {A}_1 \cup \mathcal {A}_2\) be a partition of \(\mathcal {P}( \mathbb {Z}_+)\) such that \({\widehat{P}}_i\) is in the interior of \(\mathcal {A}_i\) (this is possible since \({\widehat{P}}_1 \ne {\widehat{P}}_2\)). Now, for \(v \in V(T)\) and integer \(h \geqslant 1\), \(\partial B(v,h)\) is the set of vertices at distance \(h\) from \(v\) in \(T\). The assumption \(0,1 \notin S\) implies that \(\partial B(v,h)\) is not empty. Hence, we define
Moreover \(\alpha > 1\) implies that \({\check{\rho }}\)-a.s.
Indeed, we consider a tree \(T'\) whose vertex set consists of the vertices at even distance (in \(T\)) from the root. \(T'\) is obtained by connecting vertices at distance \(2h\) from the root to their grandchildren (the offspring of their own offspring), at distance \(2(h+1)\). Then, by construction, all vertices have the same type in \(T'\). Moreover, conditioned on the root being of type \(i\), \(T'\) is a Galton–Watson tree where the root has offspring distribution \(Q_i\), the distribution of \(\sum _{k=1}^{N} N_k\), where \(N\) has law \( P_i\) and is independent of \((N_k)_k\), an i.i.d. sequence with law \({\widehat{P}}_{2}\) if \(i =1\) and \({\widehat{P}}_1\) if \(i=2\), and any other vertex in \(T'\) has offspring distribution \(Q'_i\), the distribution of \(\sum _{k=1}^{{\widehat{N}}} N_k\), where \({\widehat{N}} \) has law \({\widehat{P}}_i\) and is independent of \((N_k)_k\) as above. By construction, \(Q'_i\) has mean \(\alpha ^ 2 = {\hat{d}}_1 {\hat{d}}_2\) and \(T'\) has extinction probability \(0\). Then (86) is a consequence of the Seneta–Heyde Theorem [22, 30].
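The two properties of \(Q'_i\) used here (mean \({\hat{d}}_1 {\hat{d}}_2\) and no mass at \(0\)) can be checked numerically on small examples. The sketch below computes the compound law of \(\sum _{k=1}^{{\widehat{N}}} N_k\) by repeated convolution; the helper names and the example laws are illustrative, and the size-biased law is again assumed to be \({\widehat{P}}(k) = (k+1)P(k+1)/d\).

```python
def size_biased(P):
    """Assumed form of the size-biased law: hatP(k) = (k+1) P(k+1) / mean(P)."""
    m = sum(k * w for k, w in P.items())
    return {k - 1: k * w / m for k, w in P.items() if k >= 1 and w > 0}

def mean(law):
    return sum(k * w for k, w in law.items())

def convolve(A, B):
    """Law of X + Y for independent X ~ A and Y ~ B."""
    out = {}
    for a, wa in A.items():
        for b, wb in B.items():
            out[a + b] = out.get(a + b, 0.0) + wa * wb
    return out

def compound(N_law, step_law):
    """Law of N_1 + ... + N_N with N ~ N_law independent of i.i.d. N_k ~ step_law."""
    out = {}
    power = {0: 1.0}  # law of the empty sum
    for n in range(max(N_law) + 1):
        w = N_law.get(n, 0.0)
        for k, q in power.items():
            out[k] = out.get(k, 0.0) + w * q
        power = convolve(power, step_law)  # law of a sum of n + 1 terms
    return out

# Example laws on S = {2, 3} (illustrative choice, with P_1 != P_2).
hat1 = size_biased({2: 1.0})
hat2 = size_biased({2: 0.5, 3: 0.5})
Q1_prime = compound(hat1, hat2)   # offspring law of a non-root vertex of T'
alpha_sq = mean(hat1) * mean(hat2)
```

Since both size-biased laws are supported on \(\{1,\dots ,\theta \}\), the compound law puts no mass at \(0\), which is the reason \(T'\) cannot die out.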
Also, conditioned on the root being of type \(i\), all vertices \(u \in \partial B(o,2h)\) are of type \(i\). It follows that, conditioned on \(|\partial B (o, 2h)|\), the vector \(({\mathrm {deg}}_T (u) -1)_{u \in \partial B (o, 2h)}\) is i.i.d. with common law \({\widehat{P}}_i\). Hence, the strong law of large numbers implies that, \({\check{\rho }}\)-a.s.
where \(c(o)\) is the type of the root.
In the sequel, we fix \(\delta >0\) and take \(h\) large enough such that
Now, given a locally finite graph \(G = (V,E)\), we attach to each vertex \(v \in V\) the type \(\omega (v) = a\) (resp. \(\omega (v) = b\)) if \(\partial B (v, 2h)\) is not empty, \({\mathrm {deg}}(v) \in S\) and \(\mu ^ h_v \in \mathcal {A}_1\) (resp. \(\mu ^h_v \in \mathcal {A}_2\)). Otherwise, we set \(\omega (v) = \bullet \).
Let \({\bar{a}} = b\), \({\bar{b}} = a\), \(\theta = \max S\) and \(\Theta = \{0,\ldots ,\theta \}\). We also attach to the vertices of \(G\) a new type in the set \(\mathcal {R}= \{ \bullet , (a,k), (b,k) : k \in \Theta \}\) defined, for \(c \in \{a,b\}\), by \(\tau (u) = (c,k)\) if
(i) \(\omega (u) = c\);
(ii) \(\sum _{v \mathop {\sim }\limits ^{G} u} \mathbf {1}( \omega (v) = {\bar{c}} ) = k\).
Otherwise, \( \omega (u) = \bullet \) and we also set \(\tau (u) = \bullet \). In words: a vertex has \(\tau \)-type \((c,k)\) if its \(\omega \)-type is \(c\) and it has exactly \(k\) of its neighbors having \(\omega \)-type \( {\bar{c}} \). We may call this scalar \(k\) the \(ab\)-degree of the vertex.
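As an illustration of the definition of \(\tau \), the following sketch computes the \(\tau \)-types of a finite graph once the \(\omega \)-types are given. The adjacency-list encoding, the function name `tau_types` and the string `"bullet"` standing for \(\bullet \) are illustrative choices; the computation of \(\omega \) itself (which depends on \(2h\)-neighborhoods) is not reproduced.

```python
def tau_types(adj, omega):
    """Compute tau(u) from omega, for a graph given as {vertex: list of neighbors}.

    tau(u) = (c, k) if omega(u) = c in {a, b} and u has exactly k neighbors
    of omega-type bar(c); otherwise tau(u) is the placeholder "bullet".
    """
    bar = {"a": "b", "b": "a"}
    tau = {}
    for u, nbrs in adj.items():
        c = omega[u]
        if c not in bar:
            tau[u] = "bullet"
            continue
        k = sum(1 for v in nbrs if omega[v] == bar[c])  # the ab-degree of u
        tau[u] = (c, k)
    return tau

# A star centered at vertex 1, with one leaf of omega-type "bullet".
adj = {0: [1], 1: [0, 2, 3], 2: [1], 3: [1]}
omega = {0: "a", 1: "b", 2: "a", 3: "bullet"}
tau = tau_types(adj, omega)
```

Each \(ab\)-edge is counted once from its endpoint of type \(a\) and once from its endpoint of type \(b\), so the \(ab\)-degrees summed over vertices of type \(a\) and over vertices of type \(b\) agree.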
By construction, \(\mathbb {P}_{{\check{\rho }}} ( \omega (o) = c(o) ) \geqslant 1 - \delta \). Also, using the union bound and unimodularity,
We thus have proved that
It follows that, for any \(k \in S\), \(c\in \{a,b\}\),
where \(i= 1\) if \(c = a\) and \(i=2\) if \(c = b\). Equation (87) shows that we can nearly reconstruct the types and the bipartite structure from \(2h\)-neighborhoods.
Also, by construction, the maps \((G,o) \rightarrow \omega (o)\) and \((G,o) \rightarrow \tau (o)\) are continuous for the local topology. Hence, there exists \(\eta (\varepsilon )>0\) with \(\eta (\varepsilon ) \rightarrow 0\), \(\varepsilon \rightarrow 0\), such that \(\mu \in B(\rho ,\varepsilon )\) implies that
For all \(\varepsilon \leqslant \varepsilon ( \delta )\) small enough, \(\eta (\varepsilon ) \leqslant \delta \).
All ingredients are now in order. Consider a sequence \(m = m (n)\) such that \(m(n)/n \rightarrow d /2\) where \(d = 2 p_1 d_1 = 2 p_2 d_2 = 2 d_1 d_2 / (d_1 + d_2)\). Let \(G_n \in \mathcal {G}_{n,m} ( \rho , \varepsilon )\) with \(\varepsilon \leqslant \varepsilon (\delta )\). For \(c \in \{a,b,\bullet \}\) and \(r \in \mathcal {R}\), we set
From what precedes and (87), for \(c \in \{a,b\}\) and \(k \in \Theta \),
where \(i = 1\) if \(c =a\) and \(i = 2\) if \(c = b\). We notice also that \((n_a,n_b,n_{\bullet })\) is an integer partition of \(n\) of length \(3\) and \((n_{r})_{ r \in \mathcal {R}}\) is an integer partition of length \(|\mathcal {R}| = 2(\theta +1) +1\).
We now compute an upper bound for \(| \mathcal {G}_{n,m} ( \rho , \varepsilon )|\). Fix \(\mathbf {n}= ((n_c)_{c\in \{a,b\}},(N_r)_{r \in \mathcal {R}})\). We denote by \(A(\mathbf {n})\) the set of vertex-labeled graphs \(G = ([n],E,\omega ',\tau ')\) such that for any \(c \in \{a,b\}\), \(r\in \mathcal {R}\) and \(v\in [n]\),
(i) \(\omega '(v) \in \{a,b,\bullet \}\) and \(\tau '(v) \in \mathcal {R}\);
(ii) \(\tau '(v) = (c,k)\) if and only if \(\omega '(v) = c\) and \(\sum _{ u \mathop {\sim }\limits ^{G} v} \mathbf {1}( \omega ' (u) = {\bar{c}} ) = k\);
(iii) \(n_c = \sum _v \mathbf {1}( \omega ' (v) = c )\) and \(N_r = \sum _{ v = 1 }^ n \mathbf {1}( \tau '(v) = r )\).
From what precedes,
where the maximum is over all pairs of integer partitions \(((n_c)_{c\in \{a,b,\bullet \}},(N_r)_{r \in \mathcal {R}})\) satisfying (88).
We set
In words, \(m_{\circ }\) is the number of \(ab\)-edges (i.e. adjacent to a vertex of \(\omega '\)-type \(a\) and a vertex of \(\omega '\)-type \(b\)), \(m_{\bullet }\) counts all the other edges. Summing (88) over \(c \in \{a,b\}\), \(k \in \Theta \), yields
and
Since \(m = m_{\bullet } + m_{\circ } = nd /2 + o(n)\), it follows that
where \(O( \cdot ) \) depends only on \(\theta \).
We find
where: the first term counts the number of ways to partition \([n]\) into three blocks of sizes \(n_a,n_b\) and \(n_\bullet \); the second and third terms subdivide each of the blocks in terms of the \(ab\)-degrees of the vertices; the fourth term upper bounds the number of ways to realize the \(ab\)-degree sequence (reasoning as in Lemma 4.3); the last term bounds the number of ways to put the remaining \(m_{\bullet }\) edges.
We set \(p = (p_a,p_b,p_{\bullet })\) with \(p_c = n_c / n\) and for \(r = (c,k) \in \mathcal {R}\), \(P_c (k) = N_{(c,k)} / n_c\). Using Stirling’s approximation, we obtain
where \(o(\cdot )\) depends only on \(\theta \). Using our estimates in terms of \(\delta \), we get
Letting \(n \rightarrow \infty \) and then \(\delta \rightarrow 0\), the lemma follows. \(\square \)
6 Large deviation principles
6.1 Proof of Theorem 1.6
Fix a sequence \(\mathbf {d}=\mathbf {d}^{(n)}\) as in (C1)–(C3), and set \(P_n := \frac{1}{n} \sum _{i=1} ^ n \delta _{d(i)}\). The measure \(P_n\in \mathcal {P}(\mathbb {Z}_+)\) may be viewed as a measure on rooted graphs with depth \(1\), i.e. \(\mathcal {G}_1^*\), by assigning probability zero to any \(g\in \mathcal {G}_1^*{\setminus }\mathcal {T}_1^*\), and by assigning the weight \(P_n(k)\) to the unlabeled star with \(k\) neighbors (rooted at the center of the star). Define
so that \(m/n\rightarrow d/2\) as \(n\rightarrow \infty \), and define the set
Each element of \(\mathcal {G}( \mathbf {d})\) is isomorphic to exactly \(n(\mathbf {d})\) graphs in \(\mathcal {G}_{P_n}\), i.e. \(n(\mathbf {d})|\mathcal {G}( \mathbf {d})|=|\mathcal {G}_{P_n}|\), where \(n(\mathbf {d})\) denotes the number of distinct vectors \((d(\pi _1),\dots ,d(\pi _n))\) as \(\pi :[n]\rightarrow [n]\) ranges over permutations of the vertex labels. Since \(U(G)\) is invariant under isomorphisms, Theorem 1.6 is equivalent to the same statement where \(G_n\) is a random graph uniformly distributed in \(\mathcal {G}_{P_n}\) rather than in \(\mathcal {G}( \mathbf {d})\). Thus, for the rest of this proof \(G_n\) will denote a uniform graph in \(\mathcal {G}_{P_n}\).
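The quantities \(P_n\) and \(n(\mathbf {d})\) are elementary to compute: \(n(\mathbf {d})\) is the multinomial coefficient \(n!/\prod _k (n P_n(k))!\). The sketch below, with illustrative function names, computes both and compares \(n(\mathbf {d})\) with a brute-force count on a small degree sequence.

```python
from collections import Counter
from itertools import permutations
from math import factorial

def empirical_law(d):
    """P_n = (1/n) * sum_i delta_{d(i)}, returned as {degree: mass}."""
    n = len(d)
    return {k: c / n for k, c in Counter(d).items()}

def n_of_d(d):
    """Number of distinct vectors (d(pi_1), ..., d(pi_n)) over permutations pi."""
    out = factorial(len(d))
    for mult in Counter(d).values():
        out //= factorial(mult)  # exact: partial quotients are integers
    return out

d = [2, 2, 1, 1]
P_n = empirical_law(d)
count = n_of_d(d)
brute_force = len(set(permutations(d)))  # feasible only for very small n
```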
Since \(U(G_n)\) is unimodular, we may restrict to the closed subspace \(\mathcal {P}_u(\mathcal {G}^*)\). Let \(\mathcal {K}\subset \mathcal {P}_u(\mathcal {G}^*)\) denote the compact set of unimodular probability measures supported by graphs with degree bounded by \(\theta \). Unimodularity implies that \(\rho \in \mathcal {K}\) is equivalent to \(\rho \) being supported by graphs such that the degree at the root is bounded by \(\theta \). By construction, \(U(G_n)\in \mathcal {K}\) and \(P\in \mathcal {K}\). Therefore, if \(\rho \in \mathcal {P}_u(\mathcal {G}^*)\) is such that \(\rho _1=P\), then \(\rho \in \mathcal {K}\). From general principles, see e.g. [16, Ch. 4], the theorem follows if we prove that: (i) for any \(\rho \in \mathcal {K}\) with \(\rho _1 = P\), \(\delta >0\),
and (ii) for any \(\rho \in \mathcal {K}\)
To prove the lower bound (89), write
As a consequence of (67)
On the other hand, the lower bound in Proposition 5.6 proves that for fixed \(\delta >0\), one has
for all \(h\) large enough. From Theorem 1.3 one has \( \Sigma (\rho )=\lim _{h\rightarrow \infty } J_h(\rho _h)\), and \(\Sigma ({\mathrm {UGW}}_1(P))=J_1(P)\), and (89) follows.
We turn to the proof of the upper bound (90). We start with the case \(\rho _1 \ne P\). For \(\delta > 0\), consider the closure, say \(F(\delta )\), of the probability measures \(\rho \in \mathcal {K}\) such that \(d_{TV} (\rho _1, P) \leqslant \delta \). For all \(n\) large enough, \(U(G_n) \in F(\delta )\), since \(U(G_n)_1=P_n\rightsquigarrow P\). If \(\rho _1 \ne P\), then \(\rho \notin F(\delta )\) for some \(\delta >0\), and \( \mathbb {P}( U(G_n) \in B ( \rho , \varepsilon ) ) = 0\), for all \(\varepsilon \) small enough and \(n\) large enough. It follows that (90) is \(-\infty \) in this case. Suppose now that \(\rho _1 = P\). For the upper bound one may drop the constraint \(U(G_n)_1=P_n\) in the numerator of (91). Then, using the lower bound in Proposition 5.6 for the denominator and Theorem 1.3 for the numerator, one has the desired estimate.
Remark 6.1
The result of Theorem 1.6 can be extended with no difficulty to the case where \(G_n\) is uniformly distributed in the set of all graphs \(G\) with vertex set \([n]\) satisfying \(U(G)_h=P_n\), where \(P_n\) is supported on some fixed set \(\Delta =\{t_1,\dots ,t_r\}\subset \mathcal {T}^*_h\) for all \(n\), and such that \(P_n\rightsquigarrow P\) for some admissible \(P\). Theorem 1.6 is the special case \(h=1\). With the same proof, for any fixed \(h\in \mathbb {N}\), one obtains that \(U(G_n)\) satisfies the large deviation principle with speed \(n\) and good rate function \(I(\rho )=J_h(P)-\Sigma (\rho )\) if \(\rho _h=P\), and \(I(\rho )=+\infty \) otherwise.
6.2 Proof of Theorem 1.7
We start with a proof of exponential tightness. Let \(c \geqslant 1\) and let \(G_n\) be a random graph sampled uniformly on \(\mathcal {G}_{n,m}\), where \(m=m(n)\) is an arbitrary sequence satisfying
The random probability measure \(\rho _n:= U ( G_n) \) is an element of \(\mathcal {P}_{u} ( \mathcal {G}^*) \).
Lemma 6.2
The sequence of random variables \(\rho _n\) is exponentially tight in \(\mathcal {P}_{u} ( \mathcal {G}^*)\), i.e. for any \(z\geqslant 1\), there exists a compact set \(\Pi _z\subset \mathcal {P}_{u} ( \mathcal {G}^*)\) such that
Proof
For \(y \geqslant 1 \) and \(x \in (0,1)\), we define
and consider the event,
where \({\mathrm {deg}}_{G_n} (S)\) was defined in (14). We are going to prove that there exists a constant \(L> 0\) such that for any real \(y \geqslant 1\), for any integer \(n \geqslant 1\),
In view of Lemma 2.3, (92) implies the lemma.
To prove (92), we may restrict ourselves to subsets \(S \subset [n]\) of cardinality \(|S| \leqslant n \varepsilon _0\), with \(\varepsilon _0 = \delta _y ^{-1} ( 1) = e^{ - 2 y} / c \leqslant e^{-2y}\). From the union bound,
By choosing \(y\) large enough we may assume that \(\varepsilon _0>0\) is small enough. Choose \(\varepsilon \in (0,\varepsilon _0]\) and \(\delta :=\delta _y {{\left( \varepsilon \right) }}\). We note that, as in the proof of (54), \({\mathrm {deg}}_{G_n} (S )\) is stochastically dominated by \(2N\), where \(N\) has distribution \({\mathrm {Bin}}( \varepsilon n^2, 2d/n)\). It follows that
For \(x>0\),
Taking \(x=-\frac{1}{2}\log (c\varepsilon )\) one finds
On the other hand, from Stirling’s formula, there exists a constant \( C\) such that
where \(H(\varepsilon )=-\varepsilon \log \varepsilon -(1-\varepsilon )\log (1-\varepsilon )\). Since \(\varepsilon \leqslant \varepsilon _0=e^{-2y}\), these bounds imply the desired conclusion (92). \(\square \)
We turn to the proof of Theorem 1.7. Fix \(d>0\) and a sequence \(m=m(n)\) such that \(m/n\rightarrow d/2\), as \(n\rightarrow \infty \). Thanks to Lemma 6.2, from general principles, see e.g. [16, Ch. 4], it is sufficient to establish: (i) for any \(\rho \in \mathcal {P}_u(\mathcal {G}^*)\) and \(\delta >0\),
and (ii) for any \(\rho \in \mathcal {P}_u(\mathcal {G}^*)\)
However, both the lower bound (93) and the upper bound (94) follow immediately from the definition of \(\Sigma (\rho )\), Theorem 1.2 and (7). This ends the proof.
6.3 Proof of Theorem 1.8
Theorem 1.8 is a simple consequence of Theorem 1.7. We argue as in [17]. Let \(G_n\) denote the random graph with distribution \(\mathcal {G}(n,\lambda /n)\), and let \(M(n)\) be the total number of edges in \(G_n\). Then \(M(n)\) is the binomial random variable \({\mathrm {Bin}}(n(n-1)/2,\lambda /n)\). Conditioned on a given value \(M(n)=m\), \(G_n\) has uniform distribution over \(\mathcal {G}_{n,m}\). It follows that \(\mathcal {G}(n,\lambda /n)\) is a mixture of the uniform distribution on \(\mathcal {G}_{n,m}\), where \(m\) is sampled according to \({\mathrm {Bin}}(n(n-1)/2,\lambda /n)\). We use the following simple lemma, whose proof is omitted.
Lemma 6.3
The sequence \(2M(n)/n\) satisfies the LDP in \([0,\infty )\) with speed \(n\) and good rate function
We need to prove that \(\rho _n = U(G_n)\) satisfies an LDP on \(\mathcal {P}_u (\mathcal {G}^*)\) with speed \(n\) and good rate function
where \(\rho \in \mathcal {P}_u (\mathcal {G}^*)\), \(d = \mathbb {E}_\rho {\mathrm {deg}}_G (o)\) and \(\Sigma _r (\rho )\) is the entropy of \(\rho \) associated to the mean degree \(r\) (which is equal to \(-\infty \) if \(r \ne d\) by Theorem 1.2). A simple adaptation of the proof of Lemma 6.2 shows that the random variable \(\rho _n=U(G_n)\) is exponentially tight. The conclusion follows from a general result on large deviations for mixtures; see Biggins [16, Theorem 5(b)].
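The mixture representation of \(\mathcal {G}(n,\lambda /n)\) used in this proof can be written as a two-stage sampler: first draw the number of edges \(M(n)\), then draw a uniform graph with that many edges. The following is a minimal sketch with illustrative function names; it enumerates all \(n(n-1)/2\) pairs, so it is meant for small \(n\) only, and it assumes \(\lambda \leqslant n\).

```python
import random
from itertools import combinations

def sample_gnm(n, m, rng):
    """Uniform graph on {0, ..., n-1} with exactly m edges, as a set of pairs."""
    pairs = list(combinations(range(n), 2))
    return set(rng.sample(pairs, m))

def sample_gnp_as_mixture(n, lam, rng):
    """Sample G(n, lam/n) by first drawing M ~ Bin(n(n-1)/2, lam/n)."""
    p = lam / n
    n_pairs = n * (n - 1) // 2
    m = sum(rng.random() < p for _ in range(n_pairs))  # binomial draw
    return sample_gnm(n, m, rng)

rng = random.Random(1)
edges = sample_gnp_as_mixture(30, 2.0, rng)
mean_degree = 2 * len(edges) / 30  # the quantity 2M(n)/n of Lemma 6.3
```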
6.4 Proof of Corollary 1.9 and Corollary 1.10
The proof is an application of the contraction principle, cf. [16]. Concerning Corollary 1.9, by Theorem 1.7 one has that \(u(G_n)\) satisfies the LDP in \(\mathcal {P}(\mathbb {Z}_+)\) with speed \(n\) and good rate function
From Theorem 1.3 and Corollary 1.4 this expression equals \(s(d)-J_1(P)=H(P\,|\,{\mathrm {Poi}}(d))\).
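For a finitely supported \(P\), the relative entropy \(H(P\,|\,{\mathrm {Poi}}(d))\) is a finite sum and can be evaluated directly. The sketch below assumes that \(d\) is the mean of \(P\) and that \(d>0\); the function names are illustrative.

```python
from math import exp, factorial, log

def poisson_pmf(k, d):
    """Mass of the Poisson law with mean d at the integer k."""
    return exp(-d) * d ** k / factorial(k)

def rel_entropy_vs_poisson(P):
    """H(P | Poi(d)) for P given as {k: mass}, with d taken to be the mean of P."""
    d = sum(k * w for k, w in P.items())
    return sum(w * log(w / poisson_pmf(k, d)) for k, w in P.items() if w > 0)

# For P = delta_2 one has d = 2 and H = -log(Poi(2)(2)) = 2 - log 2.
value = rel_entropy_vs_poisson({2: 1.0})
```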
As for Corollary 1.10, by Theorem 1.8 \(u(G_n)\) satisfies the LDP in \(\mathcal {P}(\mathbb {Z}_+)\) with speed \(n\) and good rate function
where \(\phi (\lambda ,d)=\frac{\lambda }{2} - \frac{d}{2} \log \lambda \). Since all \(\rho \in \mathcal {P}(\mathcal {G}^*)\) with \(\rho _1=P\) have the same expected degree at the root, this equals \(\phi (\lambda ,d)-J_1(P)=\phi (\lambda ,d)-s(d)+H(P\,|\,{\mathrm {Poi}}(d))\).
Notes
We shall actually see with Lemma 5.10 below that \(P \in \mathcal {P}_h\) is equivalent to \(P\) admissible and \(\mathbb {E}_P {{\left[ {\mathrm {deg}}_T(o) \log {{\left( {\mathrm {deg}}_T (o)\right) }} \right] }} < \infty \).
References
Aldous, D., Lyons, R.: Processes on unimodular random networks. Electron. J. Probab. 12(54), 1454–1508 (2007)
Aldous, D., Steele, J.M.: The objective method: probabilistic combinatorial optimization and local weak convergence. In: Probability on discrete structures. Encyclopaedia of Mathematical Sciences, vol. 110, pp. 1–72. Springer, Berlin (2004)
Benjamini, I., Lyons, R., Schramm, O.: Unimodular random trees. arXiv:1207.1752 (2012)
Benjamini, I., Schramm, O.: Recurrence of distributional limits of finite planar graphs. Electron. J. Probab. 6(23), 13 (2001)
Biskup, M., Chayes, L., Smith, S.A.: Large-deviations/thermodynamic approach to percolation on the complete graph. Random Struct. Algorithms 31(3), 354–370 (2007)
Bollobás, B.: A probabilistic proof of an asymptotic formula for the number of labelled regular graphs. Eur. J. Combin. 1(4), 311–316 (1980)
Bollobás, B.: Random graphs. Cambridge Studies in Advanced Mathematics, vol. 73, 2nd edn. Cambridge University Press, Cambridge (2001)
Bollobás, B., Riordan, O.: Sparse graphs: metrics and random models. Random Struct. Algorithms 39(1), 1–38 (2011)
Bordenave, C., Caputo, P.: A large deviation principle for Wigner matrices without Gaussian tails. Ann. Probab. 42(6), 2454–2496 (2014)
Borgs, C., Chayes, J.T., Lovász, L., Sós, V.T., Vesztergombi, K.: Convergent sequences of dense graphs. I. Subgraph frequencies, metric properties and testing. Adv. Math. 219(6), 1801–1851 (2008)
Borgs, C., Chayes, J.T., Lovász, L., Sós, V.T., Vesztergombi, K.: Convergent sequences of dense graphs II. Multiway cuts and statistical physics. Ann. Math. (2), 176(1), 151–219 (2012)
Boucheron, S., Gamboa, F., Léonard, C.: Bins and balls: large deviations of the empirical occupancy process. Ann. Appl. Probab. 12(2), 607–636 (2002)
Bowen, L.: Periodicity and circle packings of the hyperbolic plane. Geom. Dedicata 102, 213–236 (2003)
Chatterjee, S., Varadhan, S.R.S.: The large deviation principle for the Erdős–Rényi random graph. Eur. J. Combin. 32(7), 1000–1017 (2011)
Dembo, A., Mörters, P., Sheffield, S.: Large deviations of Markov chains indexed by random trees. Ann. Inst. H. Poincaré Probab. Stat. 41(6), 971–996 (2005)
Dembo, A., Zeitouni, O.: Large deviations techniques and applications. Springer, Berlin (2010). Reprint of second edition
Doku-Amponsah, K., Mörters, P.: Large deviation principles for empirical measures of colored random graphs. Ann. Appl. Probab. 20(6), 1989–2021 (2010)
Durrett, R.: Random graph dynamics. Cambridge Series in Statistical and Probabilistic Mathematics. Cambridge University Press, Cambridge (2010)
Elek, G.: On the limit of large girth graph sequences. Combinatorica 30(5), 553–563 (2010)
Engel, A., Monasson, R., Hartmann, A.K.: On large deviation properties of Erdős–Rényi random graphs. J. Stat. Phys. 117(3–4), 387–426 (2004)
Erdős, P., Gallai, T.: Graphs with prescribed degrees of vertices (Hungarian). Mat. Lapok 11, 264–274 (1960)
Heyde, C.: Extension of a result of Seneta for the super-critical Galton–Watson process. Ann. Math. Stat. 41, 739–742 (1970)
Janson, S., Łuczak, T., Ruciński, A.: Random graphs. Wiley-Interscience Series in Discrete Mathematics and Optimization. Wiley, New York (2000)
Lovász, L., Szegedy, B.: Limits of dense graph sequences. J. Combin. Theory Ser. B 96(6), 933–957 (2006)
Mézard, M., Montanari, A.: Information, physics, and computation. Oxford Graduate Texts. Oxford University Press, Oxford (2009)
Molloy, M., Reed, B.: A critical point for random graphs with a given degree sequence. In: Proceedings of the Sixth International Seminar on Random Graphs and Probabilistic Methods in Combinatorics and Computer Science, “Random Graphs ’93” (Poznań, 1993), vol. 6, pp. 161–179 (1995)
O’Connell, N.: Some large deviation results for sparse random graphs. Probab. Theory Related Fields 110(3), 277–285 (1998)
Puhalskii, A.A.: Stochastic processes in random graphs. Ann. Probab. 33(1), 337–412 (2005)
Rivoire, O.: Properties of atypical graphs from negative complexities. J. Stat. Phys. 117(3–4), 453–476 (2004)
Seneta, E.: On recent theorems concerning the supercritical Galton–Watson process. Ann. Math. Stat. 39, 2098–2102 (1968)
Wormald, N.C.: Models of random regular graphs. In: Surveys in Combinatorics, 1999 (Canterbury). London Mathematical Society Lecture Note Series, vol. 267, pp. 239–298. Cambridge University Press, Cambridge (1999)
Acknowledgments
We thank Justin Salez for bringing reference [3] to our attention and Bálint Virág for a discussion on the discontinuity of the entropy. This work was supported by the GDRE GREFI-MEFI CNRS-INdAM. Partial support of the European Research Council through the Advanced Grant PTRELSS 228032 and ANR-11-JS02-005-01 is also acknowledged.
Appendix : Local convergence for generalized configuration model
In this section we prove Theorem 4.8.
1.1 The exploration process
The first step is to prove convergence of the average measure \(\mathbb {E}U (G_n)\), where \(G_n\) has distribution \({\mathrm {CM}}( {\mathbf {D}}^{(n)} )\).
Proposition 7.1
Let \(G_n \in {\widehat{\mathcal {G}}}(\mathbf {D}^{(n)} )\) have distribution \({\mathrm {CM}}( {\mathbf {D}}^{(n)})\), and suppose that assumptions (H1)–(H2) hold (see Sect. 4.3). Then, \(\mathbb {E}U(G_n) \rightsquigarrow {\mathrm {UGW}}(P)\).
The proof of Proposition 7.1 is based on an exploration process of the neighborhood of a vertex. We shall use the notation of Sect. 4. For ease of notation, we will often omit the dependence on \(n\) from our notation. Let \(\mathbf {D}= (D(1), \dots , D(n)) \in \mathcal {D}_n\), \(\sigma \in \Sigma \) and \(G =\Gamma (\sigma )\) the associated multigraph. To be precise, we specify the set \(W = \cup _{c \in \mathcal {C}} W_c\) to be \( W_c= \{ (c, i, j): i \in [n], 1 \leqslant j \leqslant D_c(i) \}\) and \(W (i) = \{ (c, i, j ): c \in \mathcal {C}, 1 \leqslant j \leqslant D_c (i) \}\) the set of half-edges of all colors starting from \(i\). With a slight abuse of notation, we will sometimes write for \(e = (c, i,j) \in W\), \(\sigma (e)\) in place of \(\sigma _c ( i,j)\).
Let \(\mathbb {N}^f = \cup _{k \geqslant 0} \mathbb {N}^ k\) where \(\mathbb {N}^0 = \{o\}\). We consider the total order on \(\mathbb {N}^f\): \(\mathbf i< \mathbf j\) with \(\mathbf i= (i_1, \dots , i_k), \mathbf j= (j_1, \dots , j_{\ell })\) if either \(k < \ell \) or \(k = \ell \) and \(\mathbf i< \mathbf j\) for the lexicographical order. We will define a bijective map \(\phi \) from a finite set \(S \subset \mathbb {N}^f\) to the vertex set of \(G(v)\). The value of \(\phi \) is defined iteratively and if \(\mathbf i< \mathbf j\) are in \(S\) then the value of \(\phi (\mathbf i)\) will be determined before the value of \(\phi (\mathbf j)\). Moreover \(\phi ( S \cap \mathbb {N}^k)\) will be the set of vertices at distance \(k\) from \(v\).
The exploration is on the set of half-edges \(W\) and it is defined recursively. At integer step \(t\), we partition \(W\) into three sets: a half-edge may belong to the active set \(A(t)\), to the unexplored set \(U(t)\) or to the connected set \(C(t) = W \backslash ( A(t) \cup U(t))\). At stage \(t\), a vertex with a half-edge in \(C(t) \cup A(t)\) will have a pre-image via \(\phi \) in \(\mathbb {N}^f\). We start with a given \(v \in [n]\), and fix the initial conditions \(A(0) = W (v)\), \(C(0) = \emptyset \), \(U(0) = W \backslash W(v) \), and \(\phi (o) = v\).
For integer \(t \geqslant 0\), if \(A(t) \ne \emptyset \), let \(e_{t+1} = (c_t,\phi (\mathbf i_{t}), j_t)\) be a half-edge in \(A(t)\) such that \(\mathbf i_{t}\) is minimal for the total order on \(\mathbb {N}^f\). Let \(I(t+1) = ( W (v_{t+1}) \backslash \{ \sigma (e_{t+1} ) \}) \cap U(t) \) where \(v_{t+1}\) is the vertex such that \(\sigma (e_{t+1}) \in W (v_{t+1})\). The set \(I(t+1)\) is the set of new half-edges, and our partition of \(W\) is updated as
If \(\sigma (e_{t+1}) \notin A(t)\), we also set \(\phi ( ( \mathbf i_{t}, j_t) ) = v_{t+1}\). Finally, if \(A(t) = \emptyset \), then the exploration process stops.
We notice that the elements in \(C(t) \) are the half-edges for which we know by step \(t\) their matched half-edge. It implies that \(\sigma (e_{t+1}) \in A(t) \cup U(t)\). Moreover, for any vertex \(u\), we cannot have simultaneously \(W (u) \cap U(t) \ne \emptyset \) and \(W (u) \cap A(t) \ne \emptyset \). With a slight abuse, we may thus write \(u \in U(t)\) or \(u \in A(t)\) if, respectively, \(W(u) \cap U(t) \ne \emptyset \) or \(W(u) \cap A(t) \ne \emptyset \). Now, if \(v_{t+1} \in U(t)\), then \(I(t+1) = W (v_{t+1}) \backslash \{ \sigma (e_{t+1}) \}\), otherwise \(v_{t+1} \in A(t)\) and \(I(t+1)= \emptyset \). Note also that for integer \(k\), the image by \(\phi \) of the vertices of generation \(k\) in \(S\), \(\phi ( S \cap \mathbb {N}^k)\), is the set of vertices in \(G\) at distance \(k\) from \(v\) (by recursion, this comes from the fact that \(\mathbf i_t\) is minimal for the total order on \(\mathbb {N}^f\)).
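The exploration described above can be sketched in a simplified setting. The code below is an illustration under several assumptions: there is a single color, half-edges are pairs \((i,j)\) with \(j\) indexed from \(0\), the matching \(\sigma \) is given as a dictionary, and the update rule is taken to be \(A(t+1) = (A(t)\backslash \{e_{t+1},\sigma (e_{t+1})\})\cup I(t+1)\), \(U(t+1) = U(t)\backslash (I(t+1)\cup \{\sigma (e_{t+1})\})\), \(C(t+1) = C(t)\cup \{e_{t+1},\sigma (e_{t+1})\}\). The function name `explore` and the data layout are hypothetical.

```python
def explore(D, sigma, v, max_steps):
    """Explore the half-edge matching sigma from vertex v (single-color sketch).

    D[i] is the number of half-edges of vertex i, and sigma maps each
    half-edge (i, j) to its partner. Returns (phi, A, U, C), where phi maps
    labels in N^f (tuples, with () for the root o) to vertices.
    """
    def half(i):
        return {(i, j) for j in range(D[i])}

    A = half(v)                                                     # active
    U = {(i, j) for i in range(len(D)) for j in range(D[i])} - A    # unexplored
    C = set()                                                       # connected
    phi = {(): v}
    label = {v: ()}
    for _ in range(max_steps):
        if not A:
            break  # the exploration stops when no active half-edge remains
        # Active half-edge whose vertex label is minimal: shorter labels
        # first, then lexicographic order, ties broken by the index j.
        e = min(A, key=lambda x: (len(label[x[0]]), label[x[0]], x[1]))
        f = sigma[e]
        u = f[0]
        new = (half(u) - {f}) & U          # I(t+1): new half-edges of u
        if f not in A:                     # u is discovered at this step
            child = label[e[0]] + (e[1],)
            phi[child] = u
            label[u] = child
        A = (A - {e, f}) | new
        U = U - (new | {f})
        C = C | {e, f}
    return phi, A, U, C

# Path 0 - 1 - 2 encoded as a matching of half-edges.
D = [1, 2, 1]
sigma = {(0, 0): (1, 0), (1, 0): (0, 0), (1, 1): (2, 0), (2, 0): (1, 1)}
phi, A, U, C = explore(D, sigma, 0, 10)
```

In this sketch the length of a label equals the graph distance of the corresponding vertex from \(v\), because the active half-edge with minimal label is always processed first.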
We now define \(X(0) = D(v)\) and, for integer \(t \geqslant 0\), \(X_c (t+1) = | \{ (i, j ): (c,i,j) \in I(t+1) \}|\). Hence \(X(t) \in \mathcal {M}_L\) gives the new colored half-edges attached to \(v_{t}\). For ease of notation, we also set
and
Setting \(A_c=A\cap W_c\), \(U_c=U\cap W_c\) and \(C_c=C\cap W_c\), we get
Note that \(| C_c(t) | = | C_{{\bar{c}}}(t) |\) and, if \(c \in \mathcal {C}_=\), \(|C_c (t)|\) is even.
Now, as in the statement of Proposition 7.1, consider a random multi-graph \(G_n\) with distribution \({\mathrm {CM}}(\mathbf {D}^{(n)})\). For integer \(t \geqslant 0\), we consider the filtration
The hitting time \(\tau \) is a stopping time for this filtration. Also, given \(\mathcal {F}_{t}\), if \(\{ t < \tau \}\) and \(c_t = c \in \mathcal {C}_{\ne }\), then \(\sigma (e_{t+1})\) is uniformly distributed on \( U_{{\bar{c}}} (t) \cup A_{{\bar{c}}} (t) \). It follows that for \(u \in [n]\),
Similarly, given \(\mathcal {F}_t\), if \(\{ t < \tau \}\) and \(c_t = c \in \mathcal {C}_{=}\), \(\sigma (e_{t+1})\) is uniformly distributed on \( U_{c} (t) \cup A_{c} (t) \backslash \{ e_{t+1}\} \). We find in this case,
In either case, for \(c \in \mathcal {C}\), if \(\sigma (e_{t+1}) \in U(t)\), then \(X(t+1) = D(v_{t+1}) - E^ {{\bar{c}}}\); otherwise, \(\sigma (e_{t+1}) \in A(t)\) and \(X(t+1) = 0\). We recall also that \(|U_c (t)| + | A_c (t) | = | W_c | - | C_c (t) | = | W_{{\bar{c}}} | - | C_{{\bar{c}}} (t) | \). We get, for \( M \in \mathcal {M}_L\), if \(c_t=c\) then
Observe that, from (96) and assumption (H1), we find for any \(c \in \mathcal {C}\),
The next lemma computes the limiting marginals of the exploration process.
Lemma 7.2
Under the assumption of Proposition 7.1, let \(o\) be uniformly distributed on \([n]\), independently of \(G_n\), and consider the exploration process on the rooted graph \((G_n(o),o)\). For any integer \(t \geqslant 0\), as \(n\rightarrow \infty \):
-
(i)
\(X(0)\) converges weakly to \(P\).
-
(ii)
Let \(c\in \mathcal {C}\) be such that \(\mathbb {E}D_c > 0\). Given \(\mathcal {F}_t\), if \(\{ t < \tau \}\) and \(c_t = c\), then the conditional law of \(X(t+1)\) given \(\mathcal {F}_t\) converges weakly to \({\widehat{P}}^ c\).
-
(iii)
The probability that there exist \(c\in \mathcal {C}\) and an integer \(1 \leqslant s \leqslant t\wedge \tau \) such that \(\mathbb {E}D_c =0\) and \(c_s = c\) goes to \(0\).
Proof
Since \(X(0)= D(o)\), statement (i) is simply a restatement of the assumption (H2).
For statement (ii), we first note that the set \(\{ i \in [n]: i \notin U(t)\}\) has cardinality bounded by \(1 + \theta L^2 t\). It follows by (97) that, if \(\{ t < \tau \}\) and \(c_t = c\) hold, for any \(M \in \mathcal {M}_L\),
Now, assumptions (H1)–(H2) imply that \(| W_c | / n\) converges to \(\mathbb {E}D_c\), where \(D\) has law \(P\). Similarly, assumption (H2) implies that \( \frac{1}{n} \sum _{i=1} ^n \mathbf {1}_{D(i) = M + E^{{\bar{c}}}} \) converges to \(P(M + E^ {{\bar{c}}})\). Hence, from (98), if \(\mathbb {E}D_c >0\), then \(\mathbb {P}( X (t+1) = M | \mathcal {F}_t )\) converges to \({\widehat{P}}^ {c } (M)\). This proves statement (ii).
We now turn to statement (iii). We set \(\mathcal {C}_0 = \{ c\in \mathcal {C}: D_c \equiv 0 \}\) and \(A_{\mathcal {C}_0} (t) = \cup _{c\in \mathcal {C}_0} A_{c} (t)\). We recall that \(\mathbb {E}D_{{\bar{c}}} = \mathbb {E}D_{c}\), hence \(c\in \mathcal {C}_0\) is equivalent to \({\bar{c}} \in \mathcal {C}_0\). We need to prove that for any integer \(t \geqslant 0\), \(\mathbb {P}( | A_{\mathcal {C}_0} (t \wedge \tau ) | \geqslant 1 ) \rightarrow 0\). First, statement (i) and the union bound imply that \(\mathbb {P}( | A_{\mathcal {C}_0} (0) | \geqslant 1 ) \leqslant \sum _{c \in \mathcal {C}_0} \mathbb {P}(X_c(0) \geqslant 1) \rightarrow 0\). By recursion, it is thus sufficient to prove that for any integer \(t \geqslant 0\), \(c \in \mathcal {C}_0\), \(c' \in \mathcal {C}\backslash \mathcal {C}_0\), if \(\{ t < \tau \}\) and \(c_t = c'\) hold, then
The latter follows from statement (ii) (recall that \({\bar{c}} \in \mathcal {C}_0\)). \(\square \)
We introduce a variable that counts the number of times that two elements in the active sets are matched by step \(t\):
Lemma 7.3
Under the assumption of Proposition 7.1, let \(o\) be uniformly distributed on \([n]\), independently of \(G_n\), and consider the exploration process on the rooted graph \((G_n(o),o)\). For every integer \(t \geqslant 0\), we have
If \(t \leqslant \tau \) and \(E(t) = 0\), the subgraph of \(G_n\) spanned by the vertices with all their half-edges in \(C(t )\) is a directed colored tree.
Proof
We start with the second statement. To every vertex \(u\) with a half-edge in \(C (t) \cup A(t)\), there is an element \(\mathbf i\) in \(\mathbb {N}^f\) such that \(\phi (\mathbf i) = u\). We may thus order these vertices by the order through \(\phi ^{-1}\) in \(\mathbb {N}^f\). Every such vertex is adjacent to its parent. By construction, if \( E(t) = 0\), or equivalently if \(\varepsilon _c(s) = 0\) for all \(1 \leqslant s \leqslant t\) and all \(c\in \mathcal {C}\), then every vertex with a half-edge in \(C(t) \cup A(t) \) has a unique adjacent vertex with a smaller index. It follows that there cannot be a cycle in the subgraph spanned by these vertices.
If \( E (t\wedge \tau ) \ne 0 \), there exists an integer \(1 \leqslant s \leqslant t \wedge \tau \) such that \(\sigma (e_s) \in A(s-1)\). Using (97), it follows from the union bound and the fact that \(\{ s < \tau \} \in \mathcal {F}_s\),
From (98), for each \(t \geqslant 0\), \(|A_c (t)| \leqslant \theta (t+1)\). Also, by assumptions (H1)–(H2), \(| W _c| / n\) converges to \(\mathbb {E}D_c\), where \(D\) has law \(P\). If \(\mathbb {E}D_c=0\), we may appeal to Lemma 7.2(iii). \(\square \)
All ingredients of the proof of Proposition 7.1 are now gathered.
Proof of Proposition 7.1
Let \(o\) be uniformly distributed on \([n]\), independently of \(G_n\). We set \(\overline{\rho }_n = \mathbb {E}U ( G_n)\), and \(\rho = {\mathrm {UGW}}(P)\). Define \(B =\{g \in {\widehat{\mathcal {G}}}^*: g_t = \gamma \}\) where \(\gamma \) is the equivalence class of a finite rooted directed colored tree of depth at most \(t\). It is sufficient to prove that for any integer \(t \geqslant 1\) and any such \(\gamma \), \(\overline{\rho }_n (B) \) converges to \(\rho (B)\).
With \(m = \sum _{k = 0}^{t-1} (\theta L^ 2)^ k\), \((G_n(o),o)_t\) has at most \(m\) vertices. Moreover, by Lemma 7.3, with high probability, \(E(m \wedge \tau ) =0\) and \((G_n (o),o)_t\) is a rooted directed colored tree. Applying now Lemma 7.2, we deduce that
The conclusion follows. \(\square \)
1.2 A.2 Concentration inequalities
We are going to state a concentration inequality for the configuration model. We use the notation of Sect. 4. We fix an integer \(L \geqslant 1\) and consider the set of colors \(\mathcal {C}= \{(i,j): 1 \leqslant i, j \leqslant L \}\). Let \(\mathbf {D}= (D(1),\ldots , D(n)) \in \mathcal {D}_n\) and let \(\Sigma = \Sigma ( \mathbf {D})\) be the set of configurations. We shall say that \(m \in \Sigma \) and \(m' \in \Sigma \) differ by at most one switch if there exist \(c \in \mathcal {C}_\leqslant \) such that \(m_{c'} = m'_{c'}\) for all \(c ' \ne c\), and a set \(J \subset W_c\), with \(|J| \leqslant 2\) if \(c \in \mathcal {C}_{\ne }\) or \(|J| \leqslant 4\) if \(c \in \mathcal {C}_=\), such that \(m (x) = m'(x)\) for all \(x \in W_c \backslash J\). In other words, if \(c \in \mathcal {C}_{\ne }\), \(m'_c \circ m_c^ {-1}\) is either the identity (\(|J| = 0\)) or a transposition (\(|J| = 2\)). Similarly, for \(c \in \mathcal {C}_{=}\), \(m'_c \circ m_c^ {-1}\) is either the identity (\(|J| = 0\)) or the composition of two disjoint transpositions (\(|J| = 4\)).
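The definition of a switch in the case \(c \in \mathcal {C}_=\) can be illustrated on a toy perfect matching. The following is only a minimal sketch: the representation of a matching as a set of unordered pairs and the helper names `switch`, `partner_map` and `differing_points` are assumptions made for the illustration, not notation from the paper.

```python
def switch(matching, pair_a, pair_b):
    """Replace the pairs {a, b} and {c, d} by {a, d} and {c, b}.

    A matching is represented as a frozenset of frozenset pairs
    (an assumption of this sketch).
    """
    a, b = sorted(pair_a)
    c, d = sorted(pair_b)
    new_pairs = set(matching) - {frozenset(pair_a), frozenset(pair_b)}
    new_pairs |= {frozenset((a, d)), frozenset((c, b))}
    return frozenset(new_pairs)


def partner_map(matching):
    """Turn a set of pairs into the involution x -> m(x)."""
    partner = {}
    for pair in matching:
        x, y = tuple(pair)
        partner[x], partner[y] = y, x
    return partner


def differing_points(m1, m2):
    """The set J of points whose partner differs between the two matchings."""
    p1, p2 = partner_map(m1), partner_map(m2)
    return {x for x in p1 if p1[x] != p2[x]}


# Toy example on W = {1, ..., 6}.
m = frozenset({frozenset((1, 2)), frozenset((3, 4)), frozenset((5, 6))})
m_prime = switch(m, (1, 2), (3, 4))
J = differing_points(m, m_prime)

# m' o m^{-1}, where m^{-1} = m since a matching is an involution.
p, p_prime = partner_map(m), partner_map(m_prime)
composition = {x: p_prime[p[x]] for x in p}
```

In this toy case the two matchings differ on four points, and `composition` exchanges 1 with 3 and 2 with 4 while fixing 5 and 6, i.e. it is a product of two disjoint transpositions, in agreement with the description above.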
In the special case \(L=1\), the next proposition appears in Wormald [31, Theorem 2.19].
Proposition 7.4
Let \(\mathbf {D}= (D(1),\ldots , D(n)) \in \mathcal {D}_n\), \(\Sigma = \Sigma ( \mathbf {D})\) be the set of configurations, \(S = \sum _{i=1}^ n D (i)\) and \(N = \sum _{c \in \mathcal {C}} S_c\). Let \(F: \Sigma \rightarrow \mathbb {R}\) be a function such that for some \(\kappa > 0\) and any \(m, m' \in \Sigma \) which differ by at most one switch, we have
Then, if \(\sigma \) is uniformly sampled from \(\Sigma \), for any \(t > 0\),
The proof will be given in Sect. 7.1.2 below.
Corollary 7.5
Let \(\mathbf {D}= (D(1),\ldots , D(n)) \in \mathcal {D}_n\) be such that (H1) holds (see Sect. 4.3). Let \(k \geqslant 0\), let \(\gamma \) be a rooted directed colored multigraph and \(A = \{ g :\; g_k = \gamma _k \}\). There exists a constant \(\delta = \delta (\theta ,k,L) > 0\), such that, if \(G_n \mathop {\sim }\limits ^{d} {\mathrm {CM}}(\mathbf {D})\), \(\rho _n = U(G_n)\) and \(t >0\),
Proof
By assumption we have for any \(c \in \mathcal {C}\), \(i\in [n]\), \( D_c(i) \leqslant \theta \). We may thus assume without loss of generality that \(\gamma \) has degrees bounded by \( \theta \). We set
The number of vertices in \(G_n\) which are at distance at most \(k\) from one of the two endpoints of any given edge is bounded by \(\kappa = 2\sum _{s=0}^{k-1} (\theta L^ 2) ^{s}\). If two configurations \(m,m'\) in \(\Sigma \) differ by at most one switch then \(|f(m) - f(m')| \leqslant 4 \kappa \). Indeed, a switch changes the status of at most \(4\) edges, and the addition or the removal of an edge can modify the value of \( \mathbf {1}( (G_n(i), i )_k \simeq \gamma _k ) \) for at most \(\kappa \) vertices \(i\). It remains to apply Proposition 7.4, with \(F(\sigma )=f(\sigma )/n\) and \(N=O(n)\). \(\square \)
1.2.1 Proof of Theorem 4.8
Let us start with the case \(G_n \mathop {\sim }\limits ^{d} {\mathrm {CM}}(\mathbf {D}^{(n)})\). We set \(\rho _n = U(G_n)\) and \(\rho = {\mathrm {UGW}}(P)\). Let \(k \geqslant 0\), \(\gamma \in {\widehat{\mathcal {G}}}^*\) and \(A = \{ g \in {\widehat{\mathcal {G}}}^ *: g_k = \gamma _k \}\). Corollary 7.5, Proposition 7.1 and the Borel–Cantelli lemma imply that with probability one, \(\rho _n (A) \rightarrow \rho (A)\). Since the collection of sets \(A = A(k,\gamma )\), \(k \geqslant 0\), \(\gamma \in {\widehat{\mathcal {G}}}^*\), is a basis of the topology on \({\widehat{\mathcal {G}}}^*\), this proves the first statement of Theorem 4.8.
For the second statement, we notice that if \(G_n\) is uniformly distributed on \(\mathcal {G}(\mathbf {D}^{(n)},h)\) and \(\widehat{G}_n \mathop {\sim }\limits ^{d} {\mathrm {CM}}(\mathbf {D}^{(n)})\), then Lemma 4.4 implies for any subset \(B \subset \mathcal {G}(\mathbf {D}^{(n)},h)\) that
By Corollary 4.6, \(\mathbb {P}( {\widehat{G}}_n \in \mathcal {G}(\mathbf {D}^{(n)},h))\) is lower bounded by some \(\alpha >0\), uniformly in \(n\) (depending on the sequence \(\mathbf {D}^{(n)}\) and \(h\)). Then, if \(\rho _n = U(G_n)\) and \({\widehat{\rho }}_n = U({\widehat{G}}_n)\), from what precedes, for any \(t > 0\), and \(A\) as above
It remains to apply the Borel–Cantelli lemma again, together with Proposition 7.1. \(\square \)
Remark 7.6
The proof of Theorem 4.8 actually shows that for a sequence \(\mathbf {D}^{(n)}\) satisfying (H1)–(H2) the following holds. Let \(k \geqslant 0\), let \(\gamma \) be fixed and \(A = \{ g : \;g_k = \gamma _k \}\). There exists a constant \(\delta > 0\) (depending on the sequence \(\mathbf {D}^{(n)}\), \(h\) and \(k\)), such that, if \(G_n\) is uniformly distributed on \(\mathcal {G}(\mathbf {D}^{(n)},h)\) and \(\widehat{G}_n \mathop {\sim }\limits ^{d} {\mathrm {CM}}(\mathbf {D}^{(n)})\), we have for any \(t >0\)
where \(\rho _n = U(G_n)\) and \({\widehat{\rho }}_n = U({\widehat{G}}_n)\).
1.2.2 Proof of Proposition 7.4
The proof is a consequence of Azuma–Hoeffding’s inequality.
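For reference, the standard form of the Azuma–Hoeffding inequality for a martingale with bounded increments is recalled below. The number of steps \(K\) is notation introduced only for this reminder, and the constants are those of the textbook statement, which need not coincide exactly with the normalization used in Proposition 7.4.

```latex
% (Z_k)_{0 \leqslant k \leqslant K} is a real-valued martingale.
\[
  |Z_k - Z_{k-1}| \leqslant \kappa \ \text{ for all } 1 \leqslant k \leqslant K
  \quad \Longrightarrow \quad
  \mathbb{P}\bigl( |Z_K - Z_0| \geqslant t \bigr)
  \leqslant 2 \exp \Bigl( - \frac{t^2}{2 K \kappa^2} \Bigr),
  \qquad t > 0 .
\]
```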
1.2.2.1 Case of random matchings
For clarity, we start with the case \(L=1\), i.e. \(\mathcal {C}= \{(1,1)\}\). Then \(N = S_{(1,1)}\). We order the elements of \(W = W_{(1,1)}\) in the lexicographic order. We may identify a matching of \(W\) with the set of its \(N/2\) matched pairs. We order these \(N/2\) pairs by the index of their smallest element. We then define \(\mathcal {F}_0\) as the trivial \(\sigma \)-algebra and for \(1 \leqslant k \leqslant N/2\), we define \(\mathcal {F}_k\) as the \(\sigma \)-algebra generated by the first \(k\) pairs of matched elements of \(\sigma \). We set \(Z_k = \mathbb {E}[ F ( \sigma ) | \mathcal {F}_{k} ]\), so that \(Z_0 = \mathbb {E}F(\sigma )\), \(Z_{N/2 - 1} = F(\sigma )\). By construction, \(Z_k\) is a Doob martingale.
If \(A\) is a finite set, we denote by \(\mathbf {M}(A)\) the set of perfect matchings on \(A\). With our previous notation \(\Sigma = \mathbf {M}( W)\). For \(1 \leqslant k \leqslant N/2\), an element \(\sigma \) of \(\mathbf {M}(W)\) can be uniquely decomposed into \((\sigma ^-_{k-1}, \sigma ^+_{k})\) where \(\sigma ^-_{k-1}\) is the restriction of \(\sigma \) to the \(k-1\) smallest pairs and \(\sigma ^+_{k}\) is the rest. Let \(W^{k-1}\) denote the subset of \(W\) such that \( \sigma ^-_{k-1}\) is a perfect matching on \(W^{k-1}\).
If \(v_{k}\) is the smallest element of \(W \backslash W^{k-1}\), we set \(w_k = \sigma (v_k) \in W \backslash W^{k-1}\), so that \(W^k = W^{k-1} \cup \{ v_k, w_k \}\). Now, for \(w \in W \backslash ( W^{k-1} \cup \{ v_k \} )\), let \(\mathbf {M}_w\) denote the set of matchings of \(W \backslash W^{k-1}\) such that \(m (v_k) = w\). Then for any \(w,w' \in W \backslash ( W^{k-1} \cup \{ v_k \} )\), each \(m \in \mathbf {M}_w\) corresponds to a unique \(m' \in \mathbf {M}_{w'}\) through the switch \( \{\{v_k, w \}, \{w', z \} \} \rightarrow \{\{v_k, w'\}, \{w, z\}\}\), where \(m (w') = z\). This gives a bijection between \(\mathbf {M}_w\) and \(\mathbf {M}_{w'}\), and we set \(N_k = | \mathbf {M}_w |\). By assumption, we deduce that for any \(w, w'\),
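The bijection between \(\mathbf {M}_w\) and \(\mathbf {M}_{w'}\) can be checked by brute force on a small ground set. The sketch below takes \(k=1\) (so \(W^{0} = \emptyset \)) on six points; the function names `perfect_matchings` and `switch_to` and the toy ground set are assumptions made for the illustration.

```python
def perfect_matchings(points):
    """Yield all perfect matchings of a sorted list, as lists of pairs."""
    if not points:
        yield []
        return
    first, rest = points[0], points[1:]
    for i, partner in enumerate(rest):
        remaining = rest[:i] + rest[i + 1:]
        for sub in perfect_matchings(remaining):
            yield [(first, partner)] + sub


def switch_to(matching, v, w_new):
    """Apply the switch {{v, w}, {w_new, z}} -> {{v, w_new}, {w, z}}."""
    partner = {}
    for x, y in matching:
        partner[x], partner[y] = y, x
    w, z = partner[v], partner[w_new]
    pairs = {frozenset(p) for p in matching}
    if w == w_new:  # nothing to do, v is already matched to w_new
        return frozenset(pairs)
    pairs -= {frozenset((v, w)), frozenset((w_new, z))}
    pairs |= {frozenset((v, w_new)), frozenset((w, z))}
    return frozenset(pairs)


W = list(range(6))  # toy ground set
v = W[0]            # smallest unmatched element


def matchings_with_partner(w):
    """The set M_w of matchings m with m(v) = w."""
    return [m for m in perfect_matchings(W) if (v, w) in m]


M_w = matchings_with_partner(1)
M_w_prime = {frozenset(frozenset(p) for p in m) for m in matchings_with_partner(2)}
image = {switch_to(m, v, 2) for m in M_w}
```

Here `image` coincides with `M_w_prime` and has the same cardinality as `M_w`, so the switch is indeed a bijection in this toy case. This is what allows the conditional expectations given \(\sigma (v_k) = w\) and \(\sigma (v_k) = w'\) to be compared term by term.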
Applying the above inequality to \(w_k\), we deduce that
We may then apply Azuma–Hoeffding’s inequality to the martingale \(Z_k\). We obtain that for any \(t >0\),
This proves the proposition when \(L =1\).
1.2.2.2 Case of random bijections
Now, let \(\sigma \) be a uniformly drawn bijection from the set \(W\) to \({\bar{W}}\) of common cardinality \(N\). We order the elements of the set \(W\) as \((x_1, \dots , x_N)\). We introduce the filtration \(\mathcal {F}_k\) generated by \(\sigma (x_1), \dots , \sigma (x_k)\). Now let \(F\) be a function on the set of bijections from \(W\) to \({\bar{W}}\) such that for any \(m\), \(m'\) such that \(m' \circ m^ {-1}\) is a transposition, we have \(|F(m) - F(m')| \leqslant \kappa \). We set \(Z_k = \mathbb {E}[ F(\sigma ) | \mathcal {F}_k]\).
With minor modifications, the above argument shows that, for \(1 \leqslant k \leqslant N\), \(|Z_{k-1} - Z_k| \leqslant \kappa \). Then, by Azuma–Hoeffding’s inequality, we find for \(t \geqslant 0\),
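The bound on the martingale increments can be verified exactly on a very small example. The sketch below is only an illustration under stated assumptions: \(W = {\bar{W}} = \{0,1,2,3\}\), the test function \(F\) is the number of fixed points (a choice made here, for which a transposition changes \(F\) by at most \(\kappa = 2\)), and conditional expectations are computed by enumerating all bijections.

```python
from fractions import Fraction
from itertools import permutations


def num_fixed_points(perm):
    """Toy choice of F: number of indices i with perm[i] == i."""
    return sum(1 for i, image in enumerate(perm) if i == image)


def conditional_mean(F, prefix, n):
    """E[F(sigma) | sigma(x_1), ..., sigma(x_k)] for sigma uniform,
    where prefix is the tuple of the first k revealed images."""
    values = [F(p) for p in permutations(range(n)) if p[:len(prefix)] == prefix]
    return Fraction(sum(values), len(values))


n, kappa = 4, 2

# Largest increment |Z_{k-1} - Z_k| over all outcomes and all steps k.
max_increment = max(
    abs(conditional_mean(num_fixed_points, p[:k], n)
        - conditional_mean(num_fixed_points, p[:k - 1], n))
    for p in permutations(range(n))
    for k in range(1, n + 1)
)

Z_0 = conditional_mean(num_fixed_points, (), n)
```

Exact rational arithmetic (`Fraction`) is used so that the comparison of `max_increment` with `kappa` is not affected by rounding.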
1.2.2.3 General case
It suffices to combine the two results. We first order the elements of \(\mathcal {C}_{\leqslant }\) in an arbitrary way, say \(c_1, \dots , c_{\ell }\) with \(\ell = L ( L+1) / 2\). Let \(N_0 = 0\) and, for \(1 \leqslant k \leqslant \ell \), \(N_k = \sum _{i=1}^k S_{c_i}/(1 + \mathbf {1}_{c_i \in \mathcal {C}_=})\). We have \(N_{\ell } = N /2\). We define the filtration \((\mathcal {F}_{t}), 0 \leqslant t \leqslant N_{\ell },\) as follows. \(\mathcal {F}_0\) is the trivial \(\sigma \)-algebra, and \(\mathcal {F}_{N_k}\) is the \(\sigma \)-algebra generated by the independent variables \((\sigma _{c_i}), 1 \leqslant i \leqslant k\). Finally, for \(1 \leqslant i < N_{k+1} - N_k\), \(\mathcal {F}_{N_{k} +i}\) is the \(\sigma \)-algebra generated by \(\mathcal {F}_{N_k}\) and the first \(i\) matched pairs of \(\sigma _{c_{k+1}}\).
As above, we set \(Z_k = \mathbb {E}[ F(\sigma ) | \mathcal {F}_k]\), so that \(Z_0 = \mathbb {E}F ( \sigma )\), \(Z_{N_{\ell }} = F ( \sigma )\). By construction, using the independence of \((\sigma _c)_{c\in \mathcal {C}}\), we find, for \(1 \leqslant k \leqslant N_\ell \), \(|Z_{k-1} - Z_k| \leqslant \kappa \). Hence, Azuma–Hoeffding’s inequality implies that (100) holds for \(\sigma \) uniform in \(\Sigma \), with \(N\) replaced by \(N_\ell = N/2\). This ends the proof of Proposition 7.4.
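The identity \(N_\ell = N/2\) can be spelled out as follows. This short computation assumes, as in the colored configuration model of Sect. 4, that \(S_c = S_{{\bar{c}}}\) for every \(c \in \mathcal {C}\), and it writes \(\mathcal {C}_{\leqslant } \setminus \mathcal {C}_=\) for the colors \((i,j)\) with \(i<j\).

```latex
\[
  N = \sum_{c \in \mathcal{C}} S_c
    = \sum_{c \in \mathcal{C}_=} S_c
      + 2 \sum_{c \in \mathcal{C}_{\leqslant} \setminus \mathcal{C}_=} S_c ,
  \qquad
  N_\ell = \sum_{c \in \mathcal{C}_=} \frac{S_c}{2}
      + \sum_{c \in \mathcal{C}_{\leqslant} \setminus \mathcal{C}_=} S_c
    = \frac{N}{2} .
\]
```

Each color \(c \in \mathcal {C}_=\) contributes \(S_c/2\) matched pairs, while each color \(c \in \mathcal {C}_{\leqslant } \setminus \mathcal {C}_=\) contributes the \(S_c\) values of a bijection from \(W_c\) to \(W_{{\bar{c}}}\).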
Bordenave, C., Caputo, P. Large deviations of empirical neighborhood distribution in sparse random graphs. Probab. Theory Relat. Fields 163, 149–222 (2015). https://doi.org/10.1007/s00440-014-0590-8
DOI: https://doi.org/10.1007/s00440-014-0590-8
Keywords
- Random graphs
- Random trees
- Local convergence
- Large deviations
- Configuration model
- Unimodular measure
- Entropy