1 Introduction and results

Many real-world networks are claimed to be small worlds, meaning that their graph distances are quite small. In social networks, such small distances go under the name of the ‘six-degrees-of-separation’ paradigm and have attracted attention due to the interesting experiments by Milgram [41, 59]. See also Pool and Kochen [55] as well as [17], where a related experiment is described on the basis of email messages. After this start in social sciences, the small-world nature of many other networks was first described by Strogatz and Watts [62]. A popular account of small-world aspects of networks can be found in the book by Watts [61]. See also the surveys by Newman [49] and Albert and Barabási [2] on real-world networks.

The reported small-world nature of many real-world networks has incited a deep and thorough study of typical distances in random graphs. See the highly influential paper by Newman, Strogatz and Watts [51], who pioneered this line of research. There is a deep relation between the small-world nature of networks and their other often reported common feature, the scale-free paradigm, which states that the proportion of vertices of degree k in many networks scales as an inverse power of k for k large. This scale-free nature implies that there are many vertices with very high degrees, and these hubs drastically shrink graph distances. The common picture in many random graph models is that graph distances are asymptotically logarithmic in the graph size when the degrees have finite variance [23, 30, 51], while they are doubly logarithmic or ultrasmall when the graphs have infinite-variance degrees [11, 12, 15, 18, 31, 52]. See also [29] for a discussion of many of the available results. The conclusion is that typical distances in random graphs are closely related to their degree structure, and larger degrees significantly shrink graph distances.

In many real-world networks, not only power-law degree sequences are observed, but also power laws with a so-called exponential truncation. This means that, even though the degree distribution for small values is close to a power law, for large values the tail distribution becomes exponentially small. This occurs e.g. in sexual networks [39], the Internet Movie Database [4], and in scientific collaboration networks [47, 48]. Of course, it is not easy to guess whether a distribution has a power law, or rather a power law with exponential truncation. Newman and collaborators give sensible suggestions on how to approach these issues in real-world networks [13, 50].

The value above which the exponential decay starts to set in is the truncation parameter. Naturally, when the degrees already had finite variance before truncation, they still have finite variance after truncation, so nothing much happens and graph distances ought to behave as in the finite-variance setting in [30]. Further, any power-law distribution with exponential truncation with a bounded truncation parameter has all moments, and thus distances ought to become logarithmic as in the finite-variance case in e.g. [30], corresponding to ‘strictly’ small-world networks [62]. The situation changes dramatically when dealing with infinite-variance degrees, that is, when the power-law exponent is below 3. Indeed, in this setting, it is well known that typical paths realizing the graph distance pass through the vertices of highest possible degrees (i.e., the hubs) and typical distances grow much more slowly, doubly logarithmically with the size of the graph, corresponding to ultrasmall worlds [31, 53]. However, truncating the degrees could have a dramatic effect and could increase the graph distances rather substantially. In fact, even though the asymptotic variance of the degrees remains infinite after truncation, distances might grow substantially with the truncation and the graph might fail to be an ultrasmall world.

The main aim of this paper is to quantify the effect of truncation of the degrees in random graph models, in particular, in the configuration model that we define below. For simplicity, we state our result as a theorem without bothering to explain the somewhat tedious details and conditions. Below, we give several more accurate and detailed versions of this theorem.

Theorem

Let us consider the configuration model on n vertices with empirical degree distribution that follows a power law with exponent \(\tau \in (2,3)\), satisfying some appropriate regularity assumptions and truncated at degrees \(n^{{\beta _n}}\), where \({\beta _n} (\log n)^\gamma \rightarrow \infty \) for some \(\gamma \in (0,1)\). Let \({\mathrm {d}}_G(v_r, v_b)\) denote the graph distance between two uniformly chosen vertices \(v_r, v_b\). Then

$$\begin{aligned} {\mathrm {d}}_G(v_r, v_b) - 2\frac{\log \log (n^{{\beta _n}})}{|\log (\tau -2)|} -\frac{1}{(3-\tau ) {\beta _n}} \end{aligned}$$
(1.1)

is a tight sequence of random variables.

For the precise result see Theorem 1.8 below. In that theorem, we also identify the distributional limit of the tight sequence of random variables along subsequences. This was first done in [31] using different methods for the special case \(\beta _n\equiv 1/(\tau -1)\). We show in Section 6 that the seemingly different formulations, Theorem 1.8 and the result in [31] are actually the same.

Note that as soon as \({\beta _n}=o(1/\log \log n)\), the term containing \(1/{\beta _n}\) becomes dominant, and the second order term is of order \(\log \log n\). Thus, the random graph fails to be an ultrasmall world in a strict sense as soon as the truncation happens at degrees at most \(n^{o(1/\log \log n)}\). However, when \({\beta _n} \log \log n \rightarrow \infty \), the dominant term is the first one, of order \(\log \log n\). This theorem shows that distances in truncated power-law graphs with infinite asymptotic variance interpolate between ultrasmall worlds and small worlds.
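As a rough numerical illustration of this interpolation (not part of the argument), the following Python sketch evaluates the two centering terms of (1.1). The function name `centering_terms` and the choice to pass \(\log n\) rather than n are assumptions made here to avoid overflow for very large n.

```python
import math


def centering_terms(log_n: float, beta: float, tau: float) -> tuple:
    """Return the two centering terms of (1.1) as (loglog_term, truncation_term).

    log_n is log(n); passing the logarithm avoids overflow for huge n.
    """
    if not (2.0 < tau < 3.0):
        raise ValueError("tau must lie in (2, 3)")
    if beta * log_n <= 1.0:
        raise ValueError("need log(n^beta) > 1 so that log log(n^beta) > 0")
    # log log(n^beta) = log(beta * log n)
    loglog_term = 2.0 * math.log(beta * log_n) / abs(math.log(tau - 2.0))
    truncation_term = 1.0 / ((3.0 - tau) * beta)
    return loglog_term, truncation_term


if __name__ == "__main__":
    # Same (hypothetical) graph size, two truncation exponents.
    for beta in (0.4, 0.001):
        loglog_term, truncation_term = centering_terms(1e6, beta, 2.5)
        print(beta, round(loglog_term, 2), round(truncation_term, 2))
```

For a fixed graph size, lowering the truncation exponent moves the balance from the doubly logarithmic term to the \(1/\beta _n\) term, which is the transition described above.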

This sheds light on a discussion in the physics literature [15, 19, 24, 25, 33, 51] about the validity of the formula derived using generating function methods or n-dependent branching process approximations, stating that

$$\begin{aligned} {\mathrm {d}}_G(v_r, v_b) = \frac{\log n}{\log \nu _n} + \text {tight} = \frac{1}{(3-\tau ){\beta _n}}+\text {tight} \end{aligned}$$
(1.2)

with \(\nu _n\) standing for the empirical second moment of the degrees and \({\beta _n}\) for the truncation exponent. Compare this formula to the one in (1.1) and note that this formula yields the first order term if and only if \({\beta _n}=o(1/\log \log n)\), while even in this case it fails to capture the second order term, which is of order \(\log \log n\).

While [51] questions the validity of this formula for \(\tau \in (2,3)\), [24, 25, 33] argue that a constant \({\beta _n}\equiv {\beta }\) for the truncation exponent yields bounded typical distances in this regime, in agreement with what (1.2) suggests. This contradicts the arguments in [15] where the authors show that the smallest achievable order for typical distances is \(\log \log n\).

Closest to our result is the work of Dorogovtsev et al. [19], who, using (non-rigorous) generating function methods, study typical distances in the configuration model with truncated power-law distributions of the form \(\mathbb {P}(D=k)\sim k^{-\tau } \zeta ^k\), for constant \(\zeta <1\), and derive that

$$\begin{aligned} {\mathrm {d}}_G(v_r, v_b)= & {} \frac{\log n}{(3-\tau )|\log (1-\zeta )|} + \frac{2\log |\log (1-\zeta )|}{|\log (\tau -2)|}+\text {tight},\nonumber \\= & {} \frac{1}{(3-\tau ){\beta _n}} + \frac{2 \log \log (n^{{\beta _n}})}{|\log (\tau -2)|} + \text {tight}, \end{aligned}$$
(1.3)

where we took the liberty to set \(\zeta :=\exp \{-1/n^{{\beta _n}}\}\) to obtain the second line. Observe that our (rigorous) result in (1.1) above is in perfect agreement with this. The region of validity of their approach [19, (106)] translates to the requirement that \({\beta _n}=o(1/\log \log n)\), which can be interpreted as saying that their method requires the first term to be the leading-order term in (1.3). We explain below in Sect. 1.3 why the term \(\log \log n\) is missing from (1.2) and where it comes from.
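The substitution behind the second line of (1.3) rests on \(|\log (1-\zeta )|\approx {\beta _n}\log n\) when \(\zeta =\exp \{-1/n^{{\beta _n}}\}\). The following Python sketch checks this numerically. The helper name is hypothetical, and `math.expm1` is used only for numerical stability when \(1/n^{{\beta _n}}\) is small.

```python
import math


def abs_log_one_minus_zeta(log_n: float, beta: float) -> float:
    """Return |log(1 - zeta)| for zeta = exp(-1 / n^beta), computed stably."""
    cutoff = math.exp(beta * log_n)               # n^beta
    one_minus_zeta = -math.expm1(-1.0 / cutoff)   # 1 - exp(-1 / n^beta)
    return abs(math.log(one_minus_zeta))


if __name__ == "__main__":
    log_n, beta, tau = math.log(1e6), 0.5, 2.5
    exact = abs_log_one_minus_zeta(log_n, beta)
    # First term of (1.3) before and after the substitution.
    print(log_n / ((3.0 - tau) * exact), 1.0 / ((3.0 - tau) * beta))
    # Argument of the second term before and after the substitution.
    print(math.log(exact), math.log(beta * log_n))
```

The two printed pairs agree up to an error of order \(n^{-{\beta _n}}\), which is absorbed by the tight term.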

Let us make a comparison: two models with the same maximal degree, a truncated and an un-truncated one. More precisely, take a model with power-law exponent \(\tau \in (2,3)\) and truncation at \({\beta _n}={\beta }<1/(\tau -1)\) fixed. In a model with power-law exponent \(\widetilde{\tau }\) and natural truncation (that is, \(\widetilde{\beta }=1/(\widetilde{\tau }-1)\)), the maximal degree is \(n^{1/(\widetilde{\tau }-1)}\). Setting the maximal degrees in the two models, \(n^{{\beta }}\) and \(n^{1/(\widetilde{\tau }-1)}\), respectively, to be equal yields the relation \(\widetilde{\tau }=1+1/{\beta }\). Comparing distances in the two models, we see an interesting phenomenon. When \({\beta }<1/2\), the un-truncated model has \(\widetilde{\tau }=1+1/{\beta }>3\), implying that \(\lim _{n\rightarrow \infty }\mathbb {E}[(D_n')^2 ]<\infty \), which, in turn, implies that distances jump up to logarithmic order. In the truncated model distances are of order \(\log \log n\), as described by (1.1). When \({\beta }>1/2\), the leading order of distances in the truncated model is \(\log \log n/|\log (\tau -2)|+\text {tight}\), while in the un-truncated model it is \(\log \log n/|\log (\widetilde{\tau }-2)|+\text {tight}=\log \log n/|\log (1/{\beta }-1)|+\text {tight}\). Since \({\beta }<1/(\tau -1)\) by assumption, \(1/{\beta }-1>\tau -2\) and hence the distances in the latter model are larger. This means that ‘re-parametrizing’ the truncated model by another power law (\(\widetilde{\tau }\)) that would more naturally reflect the maximal degree is not the same as truncating: even the leading term changes. This effect is extreme when \({\beta }<1/2\), in which case the truncated and the un-truncated models do not even belong to the same universality class (ultrasmall versus small world).
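The comparison can be made concrete with a small Python sketch. The function names are hypothetical, and the coefficients are the ones quoted in the paragraph above, namely \(1/|\log (\tau -2)|\) for the truncated model and \(1/|\log (1/{\beta }-1)|\) for the matched un-truncated model.

```python
import math


def matched_exponent(beta: float) -> float:
    """Exponent tilde_tau = 1 + 1/beta of the un-truncated model whose
    maximal degree n^{1/(tilde_tau - 1)} equals n^beta."""
    return 1.0 + 1.0 / beta


def loglog_coefficients(tau: float, beta: float):
    """Return (truncated, untruncated) coefficients of log log n.

    The second entry is None when tilde_tau >= 3, where the un-truncated
    model is no longer in the doubly logarithmic regime.
    """
    truncated = 1.0 / abs(math.log(tau - 2.0))
    tilde_tau = matched_exponent(beta)
    if tilde_tau >= 3.0:
        return truncated, None
    untruncated = 1.0 / abs(math.log(tilde_tau - 2.0))
    return truncated, untruncated


if __name__ == "__main__":
    print(loglog_coefficients(2.2, 0.6))  # beta > 1/2: both doubly logarithmic
    print(loglog_coefficients(2.2, 0.4))  # beta < 1/2: un-truncated model leaves the regime
```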

Let us comment on the choice of model as well. We expect that the same result is true for the giant component of the Chung-Lu or Norros-Reittu model when the power-law exponent is \(\tau \in (2,3)\). These models behave qualitatively similarly to the configuration model. In fact, the ultrasmall nature of these networks was pointed out in [52, 53]. The independence of the edges conditioned on vertex weights even makes path-counting methods easier; in fact, we expect that the same proof that we provide here could be applied to these models as well.

Attack vulnerability The removal of all vertices above a certain degree in a network is called a targeted attack or deliberate attack. An immediate corollary (see Corollary 1.17 below) of our work is that our results remain valid when instead of truncating the degrees, we remove all vertices with degree at least \(n^{\widetilde{\beta _n}}\) from a configuration model. In particular, distances after a targeted attack are described by (1.1), with \(\beta _n\) replaced by \(\widetilde{\beta }_n\). This theorem also sheds light on how distances gradually grow from ultrasmall to small world when vertices with smaller and smaller degree are gradually removed. A similar analysis has been carried out for a variant of the preferential attachment model in [20], for the special case when all degrees above order \(\log n\) (equivalently, the oldest \(\varepsilon n\) vertices for small \(\varepsilon >0\)) are removed. They show that distances in this case grow logarithmically, giving a strong base for our conjecture that the formula in (1.1) should also be valid for \(\beta _n=\Theta (1/\log n)\), which is, at least with the methods of this paper, beyond our reach.

1.1 The Model and the Main Result

In this paper we work under the setting of the configuration model \({\mathrm {CM}}_n(\varvec{d})\). In this random graph model, there are n vertices, with prescribed degrees \(d_v, v\in \{1,2,\dots , n\}:=[n]\). To each vertex \(v\in [n]\) we assign \(d_v\) half-edges and the half-edges are then paired uniformly at random to form edges. We assume that the total number of half-edges \(\ell _n:=\sum _{v\in [n]} d_v\) is even. We denote the outcome - a graph-valued random variable - by \({\mathrm {CM}}_n(\varvec{d})\).
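The uniform pairing of half-edges can be sketched in a few lines of Python. This is a minimal illustration, assuming vertices are labelled \(0,\dots ,n-1\) rather than \(1,\dots ,n\); the function name `configuration_model` is chosen here and does not refer to any library.

```python
import random


def configuration_model(degrees, seed=None):
    """Pair half-edges uniformly at random and return the list of edges.

    Self-loops and multiple edges can occur and are kept, so the result
    is a multigraph in general.
    """
    if sum(degrees) % 2 != 0:
        raise ValueError("the total number of half-edges must be even")
    rng = random.Random(seed)
    # One list entry per half-edge, labelled by the vertex it belongs to.
    half_edges = [v for v, d in enumerate(degrees) for _ in range(d)]
    # Reading a uniformly random permutation in consecutive pairs yields
    # a uniformly random perfect matching of the half-edges.
    rng.shuffle(half_edges)
    return [(half_edges[i], half_edges[i + 1])
            for i in range(0, len(half_edges), 2)]


if __name__ == "__main__":
    print(configuration_model([3, 2, 2, 1], seed=1))
```

Whatever the pairing, each vertex v appears exactly \(d_v\) times among the edge endpoints, with a self-loop counted twice.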

1.1.1 Setting and Assumptions

We study the case when the empirical degree distribution follows a possibly truncated power law, with an exponent that gives rise to empirical variance tending to infinity with n and when the truncation happens at some polynomial of n. To make this precise, we impose the following three assumptions on the empirical degree distribution function \(F_n(x):=\frac{1}{n}\sum _{v\in [n]} \mathbb {1}_{\{d_v\le x\}}\):

Assumption 1.1

(Power-law tail behavior) There exists a \({\beta _n}\in (0, 1/(\tau -1)]\) such that for all \(\varepsilon >0\), \(F_n(x)=1\) for \(x\ge n^{{\beta _n}(1+\varepsilon )}\), while for all \(x\le n^{{\beta _n}(1-\varepsilon )}\),

$$\begin{aligned} 1- F_n(x)= \frac{L_n(x)}{x^{\tau -1}},\end{aligned}$$
(1.4)

with \(\tau \in (2,3)\), and a function \(L_n(x)\) that satisfies for some constant \(C_1>0\) and \(\eta \in (0,1)\) that

$$\begin{aligned} \exp \{ -C_1 (\log x)^\eta \} \le L_n(x)\le \exp \{ C_1 (\log x)^\eta \}.\end{aligned}$$
(1.5)

Assumption 1.2

(Minimal degree at least 2) \(\min _{v\in [n]} d_v \ge 2\).

See Remark 1.12 for an extension of our results in the case when \(\min _{v\in [n]} d_v =1\). We write \(D_n\) for a random variable having distribution \(F_n\). Then \(D_n\) is the degree of a uniformly chosen vertex from [n]. We introduce \(D^\star _n:=\)(the size-biased version of \(D_n)-1\) by

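$$\begin{aligned} \mathbb {P}(D_n^\star =j):=\frac{(j+1)\mathbb {P}(D_n=j+1)}{\mathbb {E}[D_n]}. \end{aligned}$$
(1.6)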
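For a concrete degree sequence, the law of \(D_n^\star \) can be computed directly. The sketch below assumes that (1.6) is the usual size-biasing formula \(\mathbb {P}(D_n^\star =j)=(j+1)\mathbb {P}(D_n=j+1)/\mathbb {E}[D_n]\), which is the form used in the bound on \(1-F_n^\star \) that follows. The function name is hypothetical, and exact fractions are used so that the probabilities sum to exactly 1.

```python
from collections import Counter
from fractions import Fraction


def forward_degree_pmf(degrees):
    """Probability mass function of D*_n for an empirical degree sequence.

    Implements P(D*_n = j) = (j + 1) * P(D_n = j + 1) / E[D_n].
    """
    total = sum(degrees)          # equals n * E[D_n]
    counts = Counter(degrees)     # counts[d] = number of vertices of degree d
    # (j + 1) * (counts[j + 1] / n) / (total / n) = (j + 1) * counts[j + 1] / total
    return {d - 1: Fraction(d * c, total)
            for d, c in sorted(counts.items()) if d > 0}


if __name__ == "__main__":
    print(forward_degree_pmf([2, 2, 3, 5]))
```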

We write \(F^\star _n(x)\) for the distribution function of \(D^\star _n\). Note that for all \(x\le n^{{\beta _n}(1-\varepsilon )}\),

$$\begin{aligned} 1- F_n^\star (x)=\frac{1}{\mathbb {E}[D_n]}\sum _{j\ge x} (j+1)\mathbb {P}(D_n=j+1)\ge \frac{1}{\mathbb {E}[D_n]} x (1-F_n(x)), \end{aligned}$$

and similarly, under Assumption 1.1,

$$\begin{aligned} \begin{aligned} 1- F_n^\star (x)&\le \frac{1}{\mathbb {E}[D_n]}\Big ( (x+1) (1-F_n(x+1)) + \sum _{j\ge x} (1-F_n(j)) \Big ) \\&\le \frac{2}{\mathbb {E}[D_n]} \frac{L_n(x+1)}{x^{\tau -2}} + \frac{1}{\mathbb {E}[D_n]}\sum _{j\ge x} \frac{L_n(j)}{j^{\tau -1}}. \end{aligned} \end{aligned}$$

By a Karamata-type theorem (see [9, Proposition 1.5.10]) the latter sum on the rhs is at most \(2 L_n(x)/(x^{\tau -2} (\tau -2))\) for all large enough x. Further, for \(x\ge n^{{\beta _n}(1+\varepsilon )}\) it is obvious from the definition (1.6) that \(1-F_n^\star (x)=0\). Thus, it follows that for all \(\varepsilon >0\) and \(x\le n^{{\beta _n}(1-\varepsilon )}\), \(F_n^{\star }\) satisfies that

$$\begin{aligned} 1- F^\star _n(x)= \frac{L^\star _n(x)}{x^{\tau -2}},\end{aligned}$$
(1.7)

with a function \(L^\star _n(x)\) satisfying (1.5) again (possibly with a different constant \(C^\star _1\) instead of \(C_1\) in the exponent in (1.5) for \(L_n^\star \)).

To be able to state convergence results, we will need an assumption that relates the behavior of \(F_n\) and \(F_n^\star \) for different values of n to a limiting distribution function. We write \(\mathrm {d}_{\scriptscriptstyle {\mathrm {TV}}}(F, G):=\tfrac{1}{2} \sum _{x\in \mathbb {N}} |F(x+1)-F(x) - (G(x+1)-G(x)) |\) for the total variation distance between two (discrete) probability measures. The weakest form of such assumption that we can pose is captured in the following assumption:

Assumption 1.3

(Convergence to limiting distributions) We assume that there exist distribution functions F(x), \(F^\star (x)\) such that \(F_n(x)\rightarrow F(x)\) and \(F_n^\star (x)\rightarrow F^\star (x)\) in all continuity points of F(x), \(F^\star (x)\). We assume that there exists a \(\kappa >0\) such that

$$\begin{aligned} \max \{ \mathrm {d}_{\scriptscriptstyle {\mathrm {TV}}}(F_n, F), \mathrm {d}_{\scriptscriptstyle {\mathrm {TV}}}(F_n^\star , F^\star ) \} \le n^{-{\beta _n}\kappa }. \end{aligned}$$
(1.8)
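For discrete distributions the total variation distance used in (1.8) is half the \(\ell ^1\) distance between the probability mass functions. A minimal Python sketch follows, assuming the two laws are given as dictionaries mapping values to probabilities (an input format chosen here for illustration):

```python
def total_variation(p, q):
    """Total variation distance between two discrete laws.

    p and q map values to probabilities; values missing from a
    dictionary are treated as having probability 0.
    """
    support = set(p) | set(q)
    return 0.5 * sum(abs(p.get(x, 0.0) - q.get(x, 0.0)) for x in support)


if __name__ == "__main__":
    p = {1: 0.5, 2: 0.5}
    q = {1: 0.25, 2: 0.25, 3: 0.5}
    print(total_variation(p, q))
```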

Let us write \(D, D^\star \) for random variables following the limiting distributions \(F, F^\star \) in Assumption 1.3, respectively. Since total variation convergence equals weak convergence for discrete random variables, \(D_n\buildrel {d}\over {\longrightarrow }D\) and \(D_n^\star \buildrel {d}\over {\longrightarrow }D^\star \). Using (1.6), we further obtain that

$$\begin{aligned} \mathbb {P}(D^\star =j) = \frac{(j+1)\mathbb {P}(D=j+1)}{\mathbb {E}[D]}, \end{aligned}$$
(1.9)

thus \(D^\star \) is the (size-biased version\(-1\)) of D. Further note that F can be written in the form (1.4), and \(F^\star \) as in (1.7) with limiting \(L, L^\star \) that satisfy (1.5). Note that the limit variables are not truncated. Further, the bound \(n^{-{\beta _n} \kappa }\) in Assumption 1.3 is best possible since \(\mathrm {d}_{\scriptscriptstyle {\mathrm {TV}}}(F_n, F) \ge \mathbb {P}(D>n^{{\beta _n}}) \ge n^{-{\beta _n}(\tau -1)}\ell (n^{-{\beta _n}(\tau -1)})\ge n^{-{\beta _n}(\tau -1-\delta )}\). It is also reasonable: for example, it can be shown that it is satisfied in Examples 1.18-1.21 below.

Under Assumptions 1.1 and 1.2, the graph almost surely has a unique connected component of size \(n(1-o_{\mathbb {P}}(1))\), see e.g. [28, Vol II., Theorem 4.1] or [42, 43], or the recent paper [22].

We provide three examples in Sect. 1.2 below (i.i.d. degrees, exponential and hard truncation) that satisfy Assumption 1.1, see Examples 1.18, 1.20, 1.21 below, as well as collect some references to networks following such empirical degree distributions.

In this paper we study typical distances, that is, the graph distance \({\mathrm {d}}_G\) between two uniformly chosen vertices in the graph. For the sake of the proof, we denote these vertices by \(v_r\) and \(v_b\) and think of them as being red and blue, respectively.
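On a given realization of the graph, the distance \({\mathrm {d}}_G(v_r, v_b)\) can be computed by breadth-first search. The Python sketch below assumes the graph is supplied as a list of edges on vertices labelled \(0,\dots ,n-1\), an input format chosen here for illustration. Self-loops and multiple edges do not affect the result.

```python
from collections import deque


def graph_distance(edges, n, source, target):
    """Graph distance between source and target via breadth-first search.

    Returns None when the two vertices lie in different components.
    """
    adjacency = [[] for _ in range(n)]
    for u, v in edges:
        adjacency[u].append(v)
        adjacency[v].append(u)
    distance = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        if u == target:
            return distance[u]
        for w in adjacency[u]:
            if w not in distance:      # first visit gives the shortest distance
                distance[w] = distance[u] + 1
                queue.append(w)
    return None


if __name__ == "__main__":
    path = [(0, 1), (1, 2), (2, 3)]
    print(graph_distance(path, 5, 0, 3))
```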

Definition 1.4

(With high probability) We say that a sequence of events \({\mathcal {E}}_n\) happens with high probability under the measure \({\mathbb {Q}}\) (and abbreviate this as \({\mathbb {Q}}\)-whp), if \({\mathbb {Q}}({\mathcal {E}}_n) \rightarrow 1\) as \(n\rightarrow \infty \). We write simply ‘whp’ when the measure is the annealed measure of the configuration model and the two uniformly chosen vertices \(v_r, v_b\).

We emphasize that in the setting of this paper, whp statements should be read as follows: for asymptotically almost every realization of the random graph \({\mathrm {CM}}_n(\varvec{d})\) and almost all pairs of vertices \((v_r, v_b)\), the statement is true.

Definition 1.5

(\(\sim \) notation) We use the shorthand notation \(X_n \sim a_n\) if there exists a constant \(\theta \in (0,1)\) such that

$$\begin{aligned} X_n \sim a_n \quad \Longleftrightarrow \quad \mathbb {P}\left( X_n \in \left[ a_n {{\mathrm {e}}}^{-(\log a_n)^{\theta }}, a_n {{\mathrm {e}}}^{(\log a_n)^{\theta }}\right] \right) \rightarrow 1. \end{aligned}$$
(1.10)

We call vertices with degree at least \(\sim n^{(\tau -2){\beta _n}}\) hubs.

Note that \(X_n\sim n^a\) is somewhat stronger than stating that \(X_n=n^{a(1+o_{\mathbb {P}}(1))}\).

The statement of the main theorem uses some knowledge about infinite-mean branching processes, as well as their coupling to the local neighborhood of the two vertices \(v_r, v_b\). So, before stating the result, we make a brief excursion into defining these objects. In particular, Lemma 2.2 and Corollary 2.3 below, based on [7, Proposition 4.7], state that under Assumption 1.3, whp, the number of vertices and their forward degrees in an exploration of the neighborhood of \(v_r, v_b\) can be coupled to two independent branching processes (that are embedded in the graph disjointly, and have offspring distribution \(F^\star \) for the second and further generations, and with offspring distribution given by F for the first generation), as long as the total number of vertices of the explored clusters does not exceed \(n^{\varrho }\) for some small \(\varrho >0\). Let us do the exploration in a breadth-first-search manner, so that the exploration at time t contains all the vertices at distance at most t from \(\mathcal {C}^{\scriptscriptstyle {(r)}}_0:=\{v_r\}\) and \(\mathcal {C}^{\scriptscriptstyle {(b)}}_0:=\{v_b\}\). We shall denote these clusters by \(\mathcal {C}^{\scriptscriptstyle {(r)}}_t, \mathcal {C}^{\scriptscriptstyle {(b)}}_t\) (i.e., the vertices and their graph structure), and think of them as being colored red and blue, respectively. Similarly, we denote the number of vertices in the kth generation of the coupled branching processes by \((Z_{k}^{\scriptscriptstyle {(r)}}, Z_k^{\scriptscriptstyle {(b)}})_{k>0}\). The next definition, describing the double-exponential growth rates of these neighborhoods, uses this coupling:

Definition 1.6

(Double-exponential growth rates of local neighborhoods) Let \((Z_k^{\scriptscriptstyle {(r)}}, Z_k^{\scriptscriptstyle {(b)}})\) denote the number of individuals in the kth generation of the two independent copies of a Galton-Watson process, coupled to the breadth-first-search exploration process of the neighborhoods of \(v_r\) and \(v_b\) in the configuration model. In these branching processes, the size of the first generation has distribution F, and all further generations have offspring distribution \(F^\star \) from Assumption 1.3. Then, for some \(\varrho _n'<(\tau -2)\min \Big \{ {\beta _n}\kappa , (1-{\beta _n}(1+\varepsilon ))/2, (\tau -2-2\varepsilon )/(2(\tau -1))\Big \}\), let us define

$$\begin{aligned} Y_r^{\scriptscriptstyle {(n)}}:=(\tau -2)^{t(n^{\varrho _n'})} \log \left( Z^{\scriptscriptstyle {(r)}}_{t(n^{\varrho _n'})}\right) , \quad Y_b^{\scriptscriptstyle {(n)}}:=(\tau -2)^{ t(n^{\varrho _n'})} \log \left( Z^{\scriptscriptstyle {(b)}}_{ t(n^{\varrho _n'})}\right) ,\end{aligned}$$
(1.11)

where \(t(n^{\varrho _n'})=\inf \{k: \max \{ Z_k^{\scriptscriptstyle {(r)}}, Z_k^{\scriptscriptstyle {(b)}} \} \ge n^{\varrho _n'}\}\). Let us further introduce

$$\begin{aligned} Y_r:= \lim _{k\rightarrow \infty } (\tau -2)^k \log \left( Z_k^{\scriptscriptstyle {(r)}}\right) , \quad Y_b:=\lim _{k\rightarrow \infty } (\tau -2)^k \log \left( Z_k^{\scriptscriptstyle {(b)}}\right) .\end{aligned}$$
(1.12)

Note that the limit variables in (1.12) are independent of \(\varrho _n'\). Further, note that \((Y_r^{\scriptscriptstyle {(n)}}, Y_b^{\scriptscriptstyle {(n)}})\) is a subsequence of the convergent sequence \(\left( (\tau -2)^k \log (Z_k^{\scriptscriptstyle {(r)}}), (\tau -2)^k \log (Z_k^{\scriptscriptstyle {(b)}})\right) \), taken at the subsequence \(k_n:=t(n^{\varrho _n'})\). Since, for any admissible choice of \(\varrho _n'\), \(t(n^{\varrho _n'}) \rightarrow \infty \) as \(n\rightarrow \infty \), under Assumption 1.3 we shall obtain that \((Y_r^{\scriptscriptstyle {(n)}}, Y_b^{\scriptscriptstyle {(n)}})\buildrel {d}\over {\longrightarrow }(Y_r,Y_b)\) as \(n\rightarrow \infty \). When (and only when) \({\beta _n}=1/(\tau -1)\), we shall need one more assumption that concerns the limiting distribution of \(Y_r, Y_b\):

Assumption 1.7

(No pointmass of the measure of Y) We assume that the limiting random variable \(Y:=\lim _{k\rightarrow \infty } (\tau -2)^k\log (\max \{ Z_k,1\})\) of the branching process in Definition 1.6 has no point-mass on \((0,\infty )\).

The criteria on F in Assumption 1.3 required for this assumption to hold are not obvious. To our knowledge, no necessary and sufficient condition for no point mass or absolute continuity can be found in the literature. A sufficient criterion for absolute continuity of Y is given in [56, 57].

To be able to state the results concisely, let us define the \(\sigma \)-algebra generated by the induced subgraph on \(\mathcal {C}^{\scriptscriptstyle {(r)}}_{t(n^{\varrho _n'})} \cup \mathcal {C}^{\scriptscriptstyle {(b)}}_{t(n^{\varrho _n'})}\):

$$\begin{aligned} \mathcal {F}_{\varrho _n'}:=\sigma \left( \mathcal {C}^{\scriptscriptstyle {(r)}}_{t(n^{\varrho _n'})} \cup \mathcal {C}^{\scriptscriptstyle {(b)}}_{t(n^{\varrho _n'})}\right) \end{aligned}$$
(1.13)

and introduce the shorthand notation

$$\begin{aligned} \mathbb {P}_{\scriptscriptstyle {Y}}(\cdot ):= \mathbb {P}(\cdot | \mathcal {F}_{\varrho _n'}), \quad \mathbb {E}_{\scriptscriptstyle {Y}}(\cdot ):= \mathbb {E}(\cdot | \mathcal {F}_{\varrho _n'}) \end{aligned}$$
(1.14)

Further define, for \(q\in \{r,b\}\),

$$\begin{aligned} T_q({\beta _n}):=\left\lfloor \frac{\log \log (n^{\beta _n}) -\log (Y_q^{\scriptscriptstyle {(n)}})}{|\log (\tau -2)|}\right\rfloor -1, \quad b_n^{(q)}({\beta _n}):= \left\{ \frac{\log \log (n^{\beta _n}) -\log (Y_q^{\scriptscriptstyle {(n)}})}{|\log (\tau -2)|}\right\} ,\nonumber \\ \end{aligned}$$
(1.15)

where \(\lfloor x\rfloor \) denotes the largest integer that is at most x and \(\{x\}= x-\lfloor x\rfloor \) denotes the fractional part of x. Further, let \(\lceil x\rceil \) denote the smallest integer that is at least x.

1.1.2 Typical Distances

Theorem 1.8

(Distances in truncated power-law configuration models) Consider the configuration model with empirical degree distribution satisfying Assumptions 1.1-1.3 with some \({\beta _n}\in (0, 1/(\tau -1)]\) such that \({\beta _n}(\log n)^\gamma \rightarrow \infty \) for some \(\gamma \in (0,1)\). When \({\beta _n}\rightarrow 1/(\tau -1)\) then we require that Assumption 1.7 holds additionally.

$$\begin{aligned} {\mathrm {d}}_G(v_r, v_b) =T_r({\beta _n})+T_b({\beta _n})+ \left\lceil \frac{1/{\beta _n}-(\tau -2)^{b_n^{\scriptscriptstyle {(r)}}({\beta _n})}-(\tau -2)^{b_n^{\scriptscriptstyle {(b)}}({\beta _n})}}{3-\tau } \right\rceil +1.\nonumber \\ \end{aligned}$$
(1.16)
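To see how the pieces of (1.15) and (1.16) fit together, the following Python sketch evaluates the right-hand side of (1.16) for given values of \(Y_r^{\scriptscriptstyle {(n)}}\) and \(Y_b^{\scriptscriptstyle {(n)}}\). The function name and the convention of passing \(\log n\) are assumptions made here. The values of \(Y_q^{\scriptscriptstyle {(n)}}\) are treated as plain positive inputs, whereas in the theorem they are random.

```python
import math


def typical_distance(log_n, beta, tau, y_r, y_b):
    """Evaluate the right-hand side of (1.16).

    log_n is log(n); y_r and y_b play the role of Y_r^(n) and Y_b^(n)
    and must be positive.
    """
    denominator = abs(math.log(tau - 2.0))
    loglog = math.log(beta * log_n)        # log log(n^beta)
    total_t = 0
    numerator = 1.0 / beta
    for y in (y_r, y_b):
        ratio = (loglog - math.log(y)) / denominator
        t_q = math.floor(ratio) - 1        # T_q(beta_n) from (1.15)
        b_q = ratio - math.floor(ratio)    # b_n^(q)(beta_n) from (1.15)
        total_t += t_q
        numerator -= (tau - 2.0) ** b_q
    return total_t + math.ceil(numerator / (3.0 - tau)) + 1


if __name__ == "__main__":
    print(typical_distance(64 * 2 ** 0.5, 0.5, 2.5, 1.0, 1.0))
```

Because of the floor and ceiling operations, inputs that put one of the ratios exactly on an integer are sensitive to floating-point rounding. The example above is chosen away from such boundaries.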

The sequence \(Y_q^{\scriptscriptstyle {(n)}}\) in \(T_q({\beta _n}), b_n^{\scriptscriptstyle {(q)}}({\beta _n})\) converges in distribution as \(n\rightarrow \infty \). It is straightforward to show that for a sequence of random variables \(X_n\) converging in distribution, the transforms \(\lfloor \log \log (n^{\beta })+X_n\rfloor \) and \(\{\log \log n+X_n\}\) do not converge, since their distribution shifts along as \(\log \log (n^{\beta })\) moves from one integer to the next. We can rephrase the statement of Theorem 1.8 in terms of convergence in distribution by filtering out the parts that show loglog-periodicity.

Corollary 1.9

The following distributional convergence holds:

$$\begin{aligned} \begin{aligned}&{\mathrm {d}}_G(v_r, v_b)- \frac{2\log \log (n^{\beta _n})}{|\log (\tau -2)|} -\left\lceil \frac{1/{\beta _n}-(\tau -2)^{b_n^{\scriptscriptstyle {(r)}}({\beta _n})}-(\tau -2)^{b_n^{\scriptscriptstyle {(b)}}({\beta _n})}}{3-\tau } \right\rceil \\&\quad +b_n^{(r)}({\beta _n})+b_n^{(b)}({\beta _n}) \buildrel {d}\over {\longrightarrow }-1+ \frac{-\log (Y_rY_b)}{|\log (\tau -2)|}. \end{aligned} \end{aligned}$$
(1.17)

Alternatively, we obtain weak convergence along (double-exponentially growing) subsequences \((n_k)_{k\in \mathbb {N}}\) satisfying \(\log \log (n_k^{\beta _n})=k+c+o(1)\) for every \(c\in [0,1)\).

Remark 1.10

The criterion that \({\beta _n}(\log n)^\gamma \rightarrow \infty \) for some \(\gamma \in (0,1)\) is just slightly stronger than requiring that the empirical second moment of the degrees in the graph tends to infinity. Indeed, we give in (2.44) in Claim 2.6 the upper bound \(n^{(3-\tau ){\beta _n}}\) on the empirical second moment of the degrees. A similar lower bound can be proved as well. This expression tends to infinity whenever \({\beta _n}\log n\rightarrow \infty \).

Remark 1.11

The message of Theorem 1.8 is that typical distances are centered around approximately \(2\log \log (n^{{\beta _n}})/|\log (\tau -2)|+1/({\beta _n}(3-\tau ))\), when the truncation of the degrees happens at a value \(n^{{\beta _n}}\), with tight fluctuations around this value. As pointed out before, the threshold for the dominance of the two terms is at \({\beta _n}=\Theta (1/\log \log n)\). Indeed, as soon as \({\beta _n}=o(1/\log \log n)\), the term containing \(1/{\beta _n}\) in (1.17) becomes dominant, and the second order term is of order \(\log \log n\). On the other hand, when \({\beta _n} \log \log n\rightarrow \infty \), the dominant term is \(\log \log n\).

Remark 1.12

(Dropping the condition on the minimal degree) With slightly more work it is possible to drop Assumption 1.2 from the assumptions in Theorem 1.8. Under Assumption 1.1 but without Assumption 1.2, the graph has a unique giant component of linear size, \(\zeta n(1-o_{\mathbb {P}}(1))\) for some \(\zeta >0\), see Janson and Luczak [34]. In this case, the statement of Theorem 1.8 remains valid conditioned on the event that both \(v_r, v_b\) are in the giant component of the graph. This conditioning can be done similarly as described in [31]. To keep our paper short, we omit the proof here, since this is not the main focus of this paper.

1.1.3 Structure and Number of Shortest Paths

The next two theorems are by-products of the proof of Theorem 1.8. They reveal the structure of shortest paths and thus shed light on the topology of the graph in more detail. Let us denote a path connecting vertices u, v by \(\mathcal {P}_{u,v}\), and let us denote any path that realizes the graph distance \({\mathrm {d}}_G(u,v)\) by \(\mathcal {P}^\star _{u,v}\). We write \(w\in \mathcal {P}_{u,v}\) if w is a vertex \(\ne u,v\) that is on the path \(\mathcal {P}_{u,v}\). We write \({\mathrm {d}}_G(u,v|\Lambda )\) for the length of the shortest path between two vertices u, v restricted to contain vertices in a set \(\Lambda \). Finally, let us write

$$\begin{aligned} \Lambda _{\le z}:=\left\{ v\in [n]: d_v\le n^z\right\} , \end{aligned}$$
(1.18)

and for a triplet \((z, x_1,x_2)\) of numbers let us define the ‘upper’ and ‘lower’ fractional-part of the following expression:

$$\begin{aligned} f^u(z, x_1, x_2)&:=\left\lceil \frac{1/z-x_1-x_2}{3-\tau } \right\rceil - \frac{1/z-x_1-x_2}{3-\tau }, \end{aligned}$$
(1.19)
$$\begin{aligned} f^\ell (z, x_1, x_2)&:= \frac{1/z-x_1-x_2}{3-\tau } - \left\lfloor \frac{1/z-x_1-x_2}{3-\tau } \right\rfloor . \end{aligned}$$
(1.20)

Note that either \(f^u=f^\ell =0\) or \(f^u=1-f^\ell \in (0,1)\).
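The two quantities in (1.19) and (1.20) are easy to evaluate numerically. The following Python sketch (with a hypothetical function name) returns both and can be used to check the dichotomy just stated.

```python
import math


def fractional_parts(z, x1, x2, tau):
    """Return (f_upper, f_lower) from (1.19) and (1.20)."""
    value = (1.0 / z - x1 - x2) / (3.0 - tau)
    f_upper = math.ceil(value) - value     # distance up to the next integer
    f_lower = value - math.floor(value)    # distance down to the previous integer
    return f_upper, f_lower


if __name__ == "__main__":
    print(fractional_parts(0.3, 0.6, 0.7, 2.5))    # generic case
    print(fractional_parts(0.5, 0.75, 0.75, 2.5))  # integer case
```

In the generic case the two values sum to 1 up to floating-point error, while in the integer case both vanish.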

Theorem 1.13

(Structure and number of shortest paths between hubs) Under the same conditions as Theorem 1.8, let \(\widetilde{\beta _n}\le {\beta _n}\) be such that \(\widetilde{\beta _n} (\log n)^\gamma \rightarrow \infty \). Let \(x_1,x_2 > \tau -2\) and \(v_1,v_2\) be two vertices with degrees \(d_{v_j}\sim n^{x_j\widetilde{\beta _n} }\) for \(j=1,2\). Then, the distance between \(v_1,v_2\) restricted to paths \(\mathcal {P}_{v_1,v_2}\) that contain only vertices with degree at most \(n^{\widetilde{\beta _n}}\) is whp

$$\begin{aligned} \mathrm {d}_G(v_1,v_2\mid \Lambda _{\le \widetilde{\beta _n}})=\left\lceil \frac{1/(\widetilde{\beta _n})-x_1-x_2}{3-\tau } \right\rceil +1, \end{aligned}$$
(1.21)

while the number of shortest paths between \(v_1, v_2\) within \(\Lambda _{\le \widetilde{\beta _n}}\) satisfies whp

$$\begin{aligned} \#\left\{ \mathcal {P}_{v_1, v_2}^\star \mid \Lambda _{\le \widetilde{\beta _n}}\right\} \sim n^{\widetilde{\beta _n} f^u(\widetilde{\beta _n}, x_1, x_2)}, \end{aligned}$$
(1.22)

where \(f^u(\widetilde{\beta _n}, x_1, x_2)\) is defined in (1.19). Further, whp all vertices on any shortest path in \(\Lambda _{\le \widetilde{\beta _n}}\) connecting \(v_1,v_2\) have degree at least \(n^{\widetilde{\beta _n} f^\ell (\widetilde{\beta _n}, x_1, x_2)}\). I.e., for all \(\varepsilon >0\),

$$\begin{aligned} \mathbb {P}\left( \exists \mathcal {P}^\star _{v_1,v_2}\in \Lambda _{\le \widetilde{\beta _n}}, w\in \mathcal {P}^\star _{v_1,v_2}: d_w \le n^{\widetilde{\beta _n} f^\ell (\widetilde{\beta _n}, x_1, x_2)(1-\varepsilon )}\right) \rightarrow 0. \end{aligned}$$
(1.23)

Interpreting Theorem 1.13, first note that setting \(\widetilde{\beta _n}<{\beta _n}\) in (1.21) reveals the distance between the two vertices when the path must avoid vertices with degree at least \(n^{\widetilde{\beta _n}}\). Moreover, (1.21) for \(\widetilde{\beta _n}\equiv {\beta _n}\) shows that the generating function approximation known from physics, the formula described in (1.2), is valid when instead of typical distances, we consider distances between very high-degree vertices. The second statement, (1.22), shows that the number of shortest paths between hubs concentrates on a logarithmic scale, since the statement

$$\begin{aligned} \frac{\log \left( \#\{\mathcal {P}_{v_1, v_2} \in \Lambda _{\le \widetilde{\beta _n}}\}\right) }{ f^u(\widetilde{\beta _n}, x_1, x_2)\widetilde{\beta _n}\log n } \buildrel {\mathbb {P}}\over {\longrightarrow }1 \end{aligned}$$
(1.24)

is a direct consequence of (1.22). This implies that there are many shortest paths, namely \(n^{\widetilde{\beta _n} f^u(\widetilde{\beta _n}, x_1, x_2)}\) of them. Note however that as soon as any of \(\widetilde{\beta _n}, x_1, x_2\) depends on n, the upper fractional part \(f^u\) starts to oscillate in the interval [0, 1). Similarly, the statement in (1.23) shows that all these shortest paths use vertices of relatively high degree: only a factor \(f^\ell \) multiplies the exponent of the maximally allowed degree \(n^{\widetilde{\beta _n}}\). Keep in mind that \(f^u\) and \(f^\ell \) are not necessarily bounded away from 0.

Note again that as soon as any of \({\beta _n}, x_1, x_2\) depends on n, \(f^\ell ({\beta _n}, x_1, x_2)\) oscillates together with this fractional part.

Comparing (1.22) and (1.23), one notes that \(f^u+f^\ell =1\) unless both of them are 0. One can avoid integer values by slightly changing \({\beta _n}\) or \(x_1, x_2\), that is, by pushing some n-dependent terms into the relation \(d_{v_j}\sim n^{x_j {\beta _n}}\). Further, it is not hard to extend the proof of Theorem 1.13 to show that there is at least one shortest path that uses a vertex with degree \(\sim n^{\widetilde{\beta _n} f^\ell (\widetilde{\beta }_n, x_1, x_2 )(1+\varepsilon )}\) for arbitrarily small \(\varepsilon >0\). Thus, we arrive at the following observation.

Observation 1.14

Let \(v_1, v_2\) be two hubs with \(d_{v_j}\sim n^{x_j{\beta _n}}\) for \(x_j>\tau -2, j\in \{1,2\}\). Then the number of shortest paths between \(v_1, v_2\) times the lowest degree that these paths use gives approximately the maximal degree in the graph, i.e., \(\sim n^{\beta _n}\).
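The bookkeeping behind (1.21) and Observation 1.14 can be made concrete with a small numerical sketch. The function names below are illustrative. Since (1.19) is not restated here, the code assumes that \(f^u\) and \(f^\ell \) are the upper and lower fractional parts of \(y=(1/\beta -x_1-x_2)/(3-\tau )\); this is an assumption, chosen to agree with the remark that \(f^u+f^\ell =1\) unless both vanish.

```python
import math

def hub_distance(beta, x1, x2, tau):
    """Right-hand side of (1.21): ceil((1/beta - x1 - x2) / (3 - tau)) + 1."""
    y = (1.0 / beta - x1 - x2) / (3.0 - tau)
    return math.ceil(y) + 1

def fractional_parts(beta, x1, x2, tau):
    """ASSUMPTION: f^u = ceil(y) - y and f^l = y - floor(y); (1.19) is not shown here."""
    y = (1.0 / beta - x1 - x2) / (3.0 - tau)
    return math.ceil(y) - y, y - math.floor(y)

# Illustrative parameters with x_j > tau - 2 and a non-integer value of y.
tau, beta, x1, x2 = 2.5, 0.4, 0.6, 0.7
f_u, f_l = fractional_parts(beta, x1, x2, tau)

print("restricted distance:", hub_distance(beta, x1, x2, tau))
# On the scale of exponents of n: (number of paths) * (lowest degree) ~ n^{beta (f^u + f^l)}.
print("exponent of the product:", beta * (f_u + f_l), "versus beta =", beta)
```

When \(y\) is not an integer, the two exponents add up to \(\beta \), which is the content of the observation on the logarithmic scale.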

We provide a sketch proof of this observation in Section 5. Our next theorem analyses the number and structure of shortest paths between two uniformly chosen vertices \(v_r, v_b\):

Theorem 1.15

(Structure and number of shortest paths) Under the same conditions as in Theorem 1.8, there is a shortest path between \(v_r, v_b\) that has the following structure, whp:

  1. (1)

    (Degree-increasing phase) For both \(q\in \{r,b\}\), starting from \(v_q\), a path segment of length \(T_q({\beta _n})=\log \log (n^{{\beta _n}})/|\log (\tau -2)|+\Theta _{\mathbb {P}}(1)\) as in (1.15) ends with a vertex \(v_q^\star \) with degree

    $$\begin{aligned} d_{v_q^\star }\sim n^{{\beta _n} (\tau -2)^{b_n^{\scriptscriptstyle {(q)}}({\beta _n})}}, \end{aligned}$$
    (1.25)

    where \(b_n^{\scriptscriptstyle {(q)}}({\beta _n})\) is from (1.15). The vertex \(v_q^\star \) can be chosen to be the maximal degree vertex among all vertices that are reachable from \(v_q\) on a path of length \(T_q({\beta _n})\). For any \(k<i_{\star \scriptscriptstyle {(q)}}({\beta _n})\), the degree of the \((T_q({\beta _n})-k)\)th vertex on the path between \(v_q, v_q^\star \) is \(\sim n^{{\beta _n} (\tau -2)^{b_n^{\scriptscriptstyle {(q)}}({\beta _n})+k}}\), where \(i_{\star \scriptscriptstyle {(q)}}({\beta _n})\) are tight random variables given below in (2.34) and (2.36).

  2. (2)

    (Connection among high-degree vertices phase) A path of length

    $$\begin{aligned} \left\lceil \frac{1/{\beta _n}-(\tau -2)^{b_n^{\scriptscriptstyle {(r)}}({\beta _n})}-(\tau -2)^{b_n^{\scriptscriptstyle {(b)}}({\beta _n})}}{3-\tau } \right\rceil +1 \end{aligned}$$
    (1.26)

    connects \(v_r^\star , v_b^\star \) using only vertices with degree at least \(n^{{\beta _n} f_n^\ell }\), where, in agreement with Theorem 1.13, \(f^\ell _n:=f^\ell ({\beta _n}, (\tau -2)^{b_n^{\scriptscriptstyle {(r)}}({\beta _n})}, (\tau -2)^{b_n^{\scriptscriptstyle {(b)}}({\beta _n})})\). Further, Phase (2) is valid for all shortest paths, whp. That is, whp, for any shortest path \(\mathcal {P}_{v_r, v_b}^\star \), the segment between the \(T_r({\beta _n})\)th and the \(|\mathcal {P}_{v_r, v_b}^\star |-T_b({\beta _n})\)th vertex has length as in (1.26), and it only contains vertices with degree at least \(n^{{\beta _n} f^\ell _n}\).

Finally, as a consequence of Theorem 1.13, the number of shortest paths between \(v_r, v_b\) satisfies \(\mathbb {P}_{\scriptscriptstyle {Y}}\)-whp

$$\begin{aligned} \#\{ \mathcal {P}_{v_r, v_b}^\star \}\sim n^{{\beta _n} f^u_n}, \end{aligned}$$

where \(f^u_n:=f^u({\beta _n}, (\tau -2)^{b_n^{\scriptscriptstyle {(r)}}({\beta _n})}, (\tau -2)^{b_n^{\scriptscriptstyle {(b)}}({\beta _n})} )\in [0,1)\) fluctuates with n.

Theorem 1.15 sheds light on the true structure of the shortest path between two uniformly chosen vertices: both ends of the path start with a segment where, after a short initial random phase, degrees increase in an essentially deterministic fashion: each degree is asymptotically a power \(1/(\tau -2)\) of the previous degree on the path. This phase ends with a vertex \(v_q^\star \) that has degree in the interval \([n^{(\tau -2){\beta _n}}, n^{{\beta _n}}]\), for \(q\in \{r,b\}\). The precise (random) prefactors of \({\beta _n}\) for \(q\in \{r,b\}\) in the exponent of n of the degree \(d_{v_q^\star }\) determine the precise length of the next phase, which establishes a connecting path between \(v_r^\star , v_b^\star \). For this path segment, Theorem 1.13 can be applied, which means that the length of this path is at most a constant \(\lceil 2/(3-\tau )\rceil \) away from \(1/({\beta _n}(3-\tau ))\), and all vertices on this path have relatively high degree.
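The deterministic degree growth in Phase (1) can be tabulated directly from the exponents in Theorem 1.15. The sketch below is illustrative only: the function name is hypothetical, and the value of \(b_n^{\scriptscriptstyle {(q)}}\) is supplied by hand rather than computed from (1.15).

```python
def degree_exponents(beta, tau, b, steps):
    """Exponents e_k such that the vertex k hops before v_q^star has degree ~ n^{e_k}.

    Uses e_k = beta * (tau - 2)^(b + k), as in item (1) of Theorem 1.15.
    The value b in [0, 1) is an input here, not derived from (1.15).
    """
    return [beta * (tau - 2.0) ** (b + k) for k in range(steps)]

exps = degree_exponents(beta=0.5, tau=2.5, b=0.3, steps=4)
for k, e in enumerate(exps):
    print(f"{k} hops before v_q^star: degree ~ n^{e:.4f}")
# Walking towards v_q^star the exponent is divided by (tau - 2) at every hop,
# i.e. each degree is a power 1/(tau - 2) of the previous one.
```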

Observation 1.14 can be extended to two uniformly chosen vertices as well: the non-integer condition holds whp under Assumption 1.7 with \((x_1, x_2)=((\tau -2)^{b_n^{\scriptscriptstyle {(q)}}({\beta _n})})_{q\in \{r,b\}}\), so we arrive at the following observation:

Observation 1.16

Let \(v_r, v_b\) be two uniformly chosen vertices. Then whp under Assumption 1.7, the number of shortest paths between \(v_r, v_b\) times the lowest degree that these paths use in Phase (2) in Theorem 1.15 is approximately the maximal degree in the graph, i.e., it is \(\sim n^{\beta _n}\).

1.1.4 Attack Vulnerability

Let us mention that when we remove some set of vertices \(\Lambda \) from a configuration model, together with the edges attached to them (called an attack), the remaining graph is still a configuration model on \([n]\setminus \Lambda \), with a new empirical degree distribution that might become random, depending on the type of the attack. Observe that the shortest path between two vertices in the remaining graph is the same as the shortest path in the original graph restricted to stay among vertices in \([n]\setminus \Lambda \). When the attack removes all vertices above a certain degree, we call it a targeted attack or deliberate attack. This is the meaning of setting \(\widetilde{\beta }_n< \beta _n\) in Theorem 1.13 above.

An immediate corollary of the proof of Theorems 1.8 and 1.13 is that our results remain valid in the configuration model with targeted attack as well, that is, when instead of truncating the degrees, we remove all vertices with degree at least \(n^{\widetilde{\beta _n}}\) from a configuration model. Equivalently, we can consider the length of the shortest path restricted to stay among vertices with degree at most \(n^{\widetilde{\beta _n}}\).

Corollary 1.17

Let us consider the configuration model under the same assumptions as the ones in Theorem 1.8. For a sequence \(\widetilde{\beta _n}\) with \(\widetilde{\beta _n} (\log n)^\gamma \rightarrow \infty \) for some \(\gamma <1\), let us remove all vertices with degree at least \(n^{\widetilde{\beta _n}}\) and the edges attached to them from the graph on n vertices. Then, the typical distance between two vertices \(v_r, v_b\) chosen uniformly at random from the remaining set of vertices satisfies whp

$$\begin{aligned}&{\mathrm {d}}_G(v_r, v_b| \Lambda _{\le \widetilde{\beta _n}})- \frac{2\log \log (n^{\widetilde{\beta _n}})}{|\log (\tau -2)|} -\left\lceil \frac{1/(\widetilde{\beta _n})-(\tau -2)^{b_n^{\scriptscriptstyle {(r)}}(\widetilde{\beta _n})}-(\tau -2)^{b_n^{\scriptscriptstyle {(b)}}(\widetilde{\beta _n})}}{3-\tau } \right\rceil \nonumber \\&\quad +\,b_n^{(r)}(\widetilde{\beta _n})+b_n^{(b)}(\widetilde{\beta _n}) \buildrel {d}\over {\longrightarrow }-1+ \frac{ -\log (Y_rY_b)}{|\log (\tau -2)|}. \end{aligned}$$
(1.27)

Further, Theorem 1.15 also remains valid in this setting, with \({\beta _n}\) replaced by \(\widetilde{\beta _n}\) everywhere.

This corollary sheds light on the effect of a targeted attack, commonly known as the attack vulnerability of the network. In fact, Corollary 1.17 describes the way typical distances grow when we (gradually) remove the ‘core’ of the graph, meaning all vertices with degree at least \(n^{\widetilde{\beta _n}}\). For example, starting with a configuration model with i.i.d. degrees (corresponding to \({\beta _n}\equiv 1/(\tau -1)\)), one has to go as far as to choose \(\widetilde{\beta _n}=o(1/\log \log n)\) to change the order of magnitude of the length of shortest paths.

An alternative proof of this corollary could be the following. It can be shown that the number of vertices outside \(\Lambda _{\le \widetilde{\beta _n}}\) is o(n), so only o(n) many vertices are removed. Then, one can show that, even though these are the highest-degree vertices, the total number of half-edges attached to the removed vertices is still o(n). When considering the degree of a remaining vertex in the remaining graph, the half-edges that were matched to removed vertices should also be removed. This results in a thinning of the degrees. This thinning is not independent for different half-edges and vertices. However, by a stochastic domination argument one can still show that the resulting new degree distribution satisfies the conditions of Theorem 1.8, with \({\beta _n}\) replaced by \(\widetilde{\beta _n}\). This makes clear why all expressions in (1.27) depend only on \(\widetilde{\beta _n}\) and not on the original \({\beta _n}\).
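The effect of a targeted attack on a single distance can be illustrated numerically. The following Python sketch rests on stated assumptions: the helper names `configuration_model`, `targeted_attack` and `bfs_distance` are hypothetical, the multigraph is stored as adjacency lists, and a leftover half-edge is simply dropped when the degree sum is odd.

```python
import random
from collections import deque

def configuration_model(degrees, rng):
    """Pair half-edges uniformly at random; return a multigraph as adjacency lists."""
    half_edges = [v for v, d in enumerate(degrees) for _ in range(d)]
    rng.shuffle(half_edges)
    adj = {v: [] for v in range(len(degrees))}
    # Consecutive entries of the shuffled list form the pairs; an odd leftover is dropped.
    for a, b in zip(half_edges[0::2], half_edges[1::2]):
        adj[a].append(b)
        adj[b].append(a)
    return adj

def targeted_attack(adj, threshold):
    """Remove all vertices with degree at least `threshold`, together with their edges."""
    keep = {v for v, nbrs in adj.items() if len(nbrs) < threshold}
    return {v: [w for w in adj[v] if w in keep] for v in keep}

def bfs_distance(adj, source, target):
    """Graph distance by breadth-first search; None if `target` is unreachable."""
    if source not in adj or target not in adj:
        return None
    dist = {source: 0}
    queue = deque([source])
    while queue:
        v = queue.popleft()
        if v == target:
            return dist[v]
        for w in adj[v]:
            if w not in dist:
                dist[w] = dist[v] + 1
                queue.append(w)
    return None

# Hand-made example: a path 0-1-2-3 plus a hub 4 joined to every path vertex.
toy = {0: [1, 4], 1: [0, 2, 4], 2: [1, 3, 4], 3: [2, 4], 4: [0, 1, 2, 3]}
print(bfs_distance(toy, 0, 3))                      # shortcut through the hub
print(bfs_distance(targeted_attack(toy, 4), 0, 3))  # hub removed, path goes around
```

For experiments on random graphs, `configuration_model(degrees, random.Random(seed))` can replace the hand-made example, with `threshold` playing the role of \(n^{\widetilde{\beta _n}}\).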

1.2 Examples

Note that Assumption 1.1 is satisfied in the following cases that we keep in mind to study:

Example 1.18

The first example arises when the degrees are independent and identically distributed from a background power-law distribution F that satisfies (1.4) and (1.5) (for all \(x \in \mathbb {R}\)). In this case, it is not hard to see that the order of magnitude of the maximal degree in the graph is \(n^{(1+o_{\mathbb {P}}(1))/(\tau -1)}\) whp. Further, the concentration of binomial random variables (see [31] for the computations) shows that whp the empirical degree distribution satisfies Assumption 1.1, with \({\beta _n}=1/(\tau -1)\), with a possibly larger constant C in the bound on \(L_n(x)\) than the one in the background distribution F.

Pure power-law degrees were found for example in the internet backbone network [21], in metabolic reaction networks [35], in telephone call graphs [46], and most famously, in the world-wide-web [5, 10, 37].

Remark 1.19

(Fluctuations of typical distances in the i.i.d. degree case) In the special case of i.i.d. degrees as in Example 1.18, we have \({\beta _n}=1/(\tau -1)\). Under Assumption 1.7, the upper integer part in (1.16) simplifies to either 0 or 1, and typical distances in this case become

(1.28)

We emphasize that Theorem 1.8 implies that the typical distances in the graph are concentrated around \(2 \log \log n / |\log (\tau -2)|\) with bounded fluctuations, a result that already appeared in [31] for the i.i.d. degree case. The statement of Corollary 1.9 ‘filters out’ the bounded oscillations arising from fractional part issues that oscillate with n. The statement of Theorem 1.8 applied to i.i.d. degrees and [31, Theorem 1.2] are essentially the same, although they provide different descriptions of typical distances. The proof given here is much shorter than the one in [31], and it also allows us to treat the truncated degree case at the same time. We show in Section 6 that the two theorems are indeed the same.
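The claim that the upper integer part takes only the values 0 or 1 for i.i.d. degrees can be checked numerically. The sketch below assumes that the integer part in (1.16) has the same form as the ceiling in (1.26), evaluated at \({\beta _n}=1/(\tau -1)\); the function name and the grid of values for \(b_n^{\scriptscriptstyle {(r)}}, b_n^{\scriptscriptstyle {(b)}}\) are illustrative.

```python
import math

def ceiling_term(tau, b_r, b_b):
    """Ceiling from (1.26) at beta = 1/(tau - 1); ASSUMED to match the one in (1.16)."""
    beta = 1.0 / (tau - 1.0)
    y = (1.0 / beta - (tau - 2.0) ** b_r - (tau - 2.0) ** b_b) / (3.0 - tau)
    return math.ceil(y)

tau = 2.5
grid = [(i + 0.5) / 10 for i in range(10)]   # interior points of [0, 1)
values = {ceiling_term(tau, b_r, b_b) for b_r in grid for b_b in grid}
print(sorted(values))
# Since tau - 1 = (3 - tau) + 2 (tau - 2) and (tau - 2)^b lies in (tau - 2, 1],
# the argument of the ceiling lies in [-1, 1); the boundary value -1 is only
# attained when both b's are 0.
```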

Example 1.20

(Exponential truncation) The degrees are generated i.i.d. from an n-dependent truncated power law distribution \(F^{(n)}\) that can be written in the form

$$\begin{aligned} 1-F^{(n)}(x)=\frac{L^{(n)}(x)}{x^{\tau -1}} \exp \{ -c x / n^{\beta _n}\}, \end{aligned}$$
(1.29)

with \(L^{(n)}\) satisfying (1.5). In this case, the empirical distribution \(F_n(x)\) satisfies Assumption 1.1 for all sufficiently large n, since for any \(x\le n^{{\beta _n}(1-\varepsilon )}\), the exponential term is at least 1/2, say, while for any \(x\ge n^{{\beta _n}(1+\varepsilon )}\), \(1-F^{(n)}(x)=O(1/n^2)\), thus we shall not actually see vertices with such high degree in the graph for large enough n. A special case arises when \((d_i)_{i\in [n]}\) are i.i.d. with \(d_i=\min \{ X_i, G_i\}\), where \(G_i\) are i.i.d. geometric random variables with mean \(\exp \{ c/n^{\beta _n}\}\), and \(X_i\) i.i.d. as in Example 1.18.
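A degree sequence of the form \(d_i=\min \{X_i, G_i\}\) can be sampled as follows. This is a minimal sketch under assumptions that go beyond the text: \(X_i\) is taken to be a pure power law on the positive integers, and the geometric variable is parametrised so that \(\mathbb {P}(G>k)=\exp \{-ck/n^{\beta _n}\}\), which reproduces the exponential factor in (1.29). All function names are hypothetical.

```python
import math
import random

def sample_power_law(tau, rng):
    """Integer X >= 1 with P(X >= k) = k^{-(tau - 1)} (ASSUMPTION: pure power law)."""
    u = 1.0 - rng.random()                     # uniform on (0, 1]
    return math.floor(u ** (-1.0 / (tau - 1.0)))

def sample_geometric(scale, c, rng):
    """Integer G >= 1 with P(G > k) = exp(-c * k / scale) (ASSUMED parametrisation)."""
    u = 1.0 - rng.random()                     # uniform on (0, 1]
    return 1 + math.floor(-math.log(u) * scale / c)

def exp_truncated_degrees(n, tau, beta, c, rng):
    """i.i.d. degrees d_i = min(X_i, G_i) with truncation scale n^beta."""
    scale = n ** beta
    return [min(sample_power_law(tau, rng), sample_geometric(scale, c, rng))
            for _ in range(n)]

degrees = exp_truncated_degrees(n=10_000, tau=2.5, beta=0.4, c=1.0,
                                rng=random.Random(0))
print("max degree:", max(degrees), " truncation scale n^beta:", round(10_000 ** 0.4, 1))
```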

Power-law degrees with exponential truncation were proposed e.g. in [45], and are observed for instance in the movie actor network [4], air transportation networks [27], co-authorship networks [48, 58], brain functional networks [1], and ecological networks [44] such as coevolutionary networks of plant-animal interactions [36].

Example 1.21

(Hard truncation) The degrees are generated i.i.d. from an n-dependent truncated power-law distribution that can be written in the form

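$$\begin{aligned} 1-F^{(n)}(x)=\frac{L^{(n)}(x)}{x^{\tau -1}} \mathbb {1}_{\{x< n^{\beta _n}\}}, \end{aligned}$$
(1.30)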

with \(L^{(n)}\) satisfying (1.5). A special case again arises when \((d_i)_{i\in [n]}\) are i.i.d. with \(d_i=\min \{X_i, n^{\beta _n}\}\), where \(X_i\) are i.i.d. as in Example 1.18.
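The special case \(d_i=\min \{X_i, n^{\beta _n}\}\) is straightforward to simulate. In the sketch below, \(X_i\) is assumed to be a pure power law on the positive integers and the cap is rounded down to an integer; both choices, and the function names, are illustrative assumptions.

```python
import math
import random

def sample_power_law(tau, rng):
    """Integer X >= 1 with P(X >= k) = k^{-(tau - 1)} (ASSUMPTION: pure power law)."""
    u = 1.0 - rng.random()                     # uniform on (0, 1]
    return math.floor(u ** (-1.0 / (tau - 1.0)))

def hard_truncated_degrees(n, tau, beta, rng):
    """i.i.d. degrees d_i = min(X_i, floor(n^beta))."""
    cap = max(1, math.floor(n ** beta))        # integer version of the cap n^beta
    return [min(sample_power_law(tau, rng), cap) for _ in range(n)]

degrees = hard_truncated_degrees(n=10_000, tau=2.5, beta=0.5, rng=random.Random(0))
print("max degree:", max(degrees))
print("number of vertices sitting at the cap:", degrees.count(max(degrees)))
```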

Probably the most important example for hard truncation is a targeted attack, since in this case every vertex above a certain degree in the network is removed. Scale-free graphs are often called attack vulnerable, see e.g. [3, 14, 32]. A theoretical example where the authors use a network model with hard truncation can be found in [26] or [40].

Perhaps surprisingly, the online social network of Facebook does not seem to follow a truncated power-law [60], even though the total number of friends of a person was limited to 5000 at the time of the measurement. Nevertheless, we would like to emphasize that our theorem allows for many possible truncation functions, among which the hard truncation is possibly the most strict.

1.3 Discussion and Open Questions

1.3.1 Heuristic Explanation of the Formula in Theorem 1.8

In Theorem 1.8, we have determined that the distances are centered around

$$\begin{aligned} \frac{2 \log \log (n^{{\beta _n}})}{|\log (\tau -2)|} + \frac{1}{{\beta _n}(3-\tau )} \end{aligned}$$

with tight fluctuations. Here we give a heuristic explanation of this formula. For graphs with locally tree-like structure, the usual BP approximation says that the number of vertices that are reachable on a path of length k from \(v_r\), is approximately \(Z_k^{\scriptscriptstyle {(n)}}\), the size of generation k of a BP with offspring \(D_n\) in the first and \(D_n^\star \) in the consecutive generations. Generating function methods then yield that \(\mathbb {E}[Z_k^{\scriptscriptstyle {(n)}}]=\mathbb {E}[D_n] \nu _n^{k-1}\), with \(\nu _n=\mathbb {E}[D_n^\star ]\sim n^{{\beta _n}(3-\tau )}\) as in (4.2) below, and then the approximation \(Z_k^{\scriptscriptstyle {(n)}} \approx \mathbb {E}[Z_k^{\scriptscriptstyle {(n)}}]\) is often used. Unfortunately, for heavily skewed distributions like that of \(D_n^\star \), it does not hold that \(Z_k^{\scriptscriptstyle {(n)}}\approx \mathbb {E}[Z_k^{\scriptscriptstyle {(n)}}]\). This is so because \(\mathbb {E}[D_n^\star ]\) is characterised by the highest degree vertices, of degree \(n^{{\beta _n}}\), while, on the other hand, low degree vertices are typically not connected to these hubs in the graph, and thus \(Z_k^{\scriptscriptstyle {(n)}} \ll \mathbb {E}[Z_k^{\scriptscriptstyle {(n)}}]\).

It is true however that

$$\begin{aligned} Z_k^{\scriptscriptstyle {(n)}}\approx C^{1/(\tau -2)^k}, \end{aligned}$$
(1.31)

for some random constant C [16]. Thus, as long as the BP approximation is valid, we see a ‘degree-increasing phase’ within the exploration clusters of the vertices \(v_r,v_b\). From the approximation (1.31) it already follows that it takes \(\log \log (n^{{\beta _n}})/|\log (\tau -2)|\) steps, up to a tight error term, to reach a hub \(v_q^\star \) in the graph, for \(q\in \{r,b\}\). Extreme value theory tells us that any hub will have some neighbors that are also hubs, and thus the approximation that \(D_n^\star \approx \mathbb {E}[D_n^\star ]\) and consequently \(Z_k^{\scriptscriptstyle {(n)}}\approx \mathbb {E}[Z_k^{\scriptscriptstyle {(n)}}]\) suddenly becomes valid when considering the number of vertices of distance k away from \(v_q^\star \). This means that it takes an additional \(\log n/\log \nu _n=1/({\beta _n}(3-\tau ))\) steps, again up to a tight error term, to connect the two hubs \(v_r^\star , v_b^\star \) to each other. This explains the formula in Theorem 1.8.
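The two contributions in the heuristic can be evaluated for concrete parameters. The sketch below only computes the deterministic centering and ignores the tight random terms; the function name is illustrative, and \(\nu _n\) is replaced by its leading order \(n^{{\beta _n}(3-\tau )}\).

```python
import math

def heuristic_centering(n, beta, tau):
    """Centering 2 log log(n^beta) / |log(tau - 2)| + 1 / (beta (3 - tau))."""
    # Climbing phase, per endpoint: log log(n^beta) / |log(tau - 2)| steps.
    climb = math.log(beta * math.log(n)) / abs(math.log(tau - 2.0))
    # Connecting phase: log n / log nu_n with nu_n taken as n^{beta (3 - tau)}.
    log_nu = beta * (3.0 - tau) * math.log(n)
    connect = math.log(n) / log_nu
    return 2.0 * climb + connect

for beta in (0.5, 0.2, 0.05):
    print(beta, round(heuristic_centering(10 ** 9, beta, 2.5), 2))
# Smaller beta shortens the doubly logarithmic climb but lengthens the hub-to-hub part.
```

The parameters must satisfy \(\beta \log n>1\) for the doubly logarithmic term to be positive, which is in line with the requirement that \({\beta _n}\) does not tend to zero too fast.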

1.3.2 Comment About ‘Structural Cutoff’

In the physics literature, \({\beta _n}=1/2\) is often called the ‘structural cut-off’ [54]. When \({\beta _n}>1/2\), vertices with degree at least \(n^{1/2+\varepsilon }\) form a complete subgraph of the graph, while for \({\beta _n}<1/2\) this complete subgraph is not present. Further, when \({\beta _n}>1/2\), a growing number of multiple edges appears, while for \({\beta _n}<1/2\), the number of multiple edges in the graph stays bounded. Our theorems show that there is no significantly different behavior of typical distances when the truncation happens below versus above the structural cutoff \(n^{1/2}\).

1.3.3 Open Questions

We believe the criterion \({\beta _n} (\log n)^\gamma \rightarrow \infty \) for some \(\gamma \in (0,1)\) can be relaxed to \({\beta _n} \log n\rightarrow \infty \), at least when one imposes stricter bounds on the slowly varying function \(L_n(x)\) in (1.4). The criterion \({\beta _n} \log n\rightarrow \infty \) is the weakest form that is necessary for the empirical second moment to tend to infinity. Provided one can generalize our results to hold whenever \({\beta _n} \log n\rightarrow \infty \), we obtain a perfect interpolation between doubly logarithmic and logarithmic distances. When \({\beta _n} \log n=\Theta (1)\), the empirical second moment remains bounded and thus a finite mean BP approximation becomes available.

1.3.4 Notation

We write [n] for the set of integers \(\{1,2,\dots , n\}\). As usual, we write i.i.d. for independent and identically distributed, lhs and rhs for left-hand side and right-hand side. We use \(\buildrel {d}\over {\longrightarrow }, \buildrel {\mathbb {P}}\over {\longrightarrow }, \buildrel {a.s.}\over {\longrightarrow }\) for convergence in distribution, in probability and almost surely, respectively. We use the Landau symbols \(o(\cdot ), O(\cdot ), \Theta (\cdot )\) in the usual way. For sequences of random or deterministic variables \(X_n, Y_n\) we further write \(X_n = o_{\mathbb {P}}(Y_n)\) and \(X_n = O_{\mathbb {P}}(Y_n)\) if the sequence \(X_n/Y_n \buildrel {\mathbb {P}}\over {\longrightarrow }0 \) and is tight, respectively.

Constants are typically denoted by c in lower and C in upper bounds (possibly with indices to indicate which constant is coming from which bound), and their precise values might change from line to line. We introduce \((\mathcal {C}^{\scriptscriptstyle {(r)}}, \mathcal {C}^{\scriptscriptstyle {(b)}}):=(\mathcal {C}^{\scriptscriptstyle {(r)}},\mathcal {C}^{\scriptscriptstyle {(b)}})\). At some time t along the matching or the exploration, for a set of vertices \(\mathcal {A}_t\) we denote the set of unpaired half-edges at that moment attached to vertices in \(\mathcal {A}_t\) as \(\mathcal {H}(\mathcal {A}_t)\) and its size by \(H(\mathcal {A}_t)\). Thus \(H(\mathcal {A}_t)=|\mathcal {H}(\mathcal {A}_t)|\). When the time is set to be 0, \(H(\mathcal {A})=\sum _{v\in \mathcal {A}} d_v\). We write \(d_v\) for the degree of vertex v.

1.4 Overview of the Proof

To determine the distance between \(v_r, v_b\), we start growing two clusters that we call red and blue, respectively, in a breadth-first-search manner, and see how these two clusters reach the highest degree vertices. As long as the two clusters are disjoint, the growth is not necessarily simultaneous, i.e., we might stop the growth of one color earlier. To describe the growing clusters, we extensively use the fact that in the configuration model half-edges can be paired in an arbitrarily chosen order. This allows for a joint construction of the graph together with the growing of the two colored clusters. In Proposition 2.1 (Section 2) below, we show that the highest degree vertex \(v_q^\star \), \(q\in \{r,b\}\) that is reached by this path is of degree

$$\begin{aligned} d_{v_q^\star }\sim n^{{\beta _n}(\tau -2)^{b_n^{(q)}({\beta _n})}}, \end{aligned}$$
(1.32)

for \(q\in \{r,b\}\), respectively. The total length of this path from \(v_q\) is \(T_q({\beta _n})\) as in (1.15) for \(q\in \{r,b\}\). The proof of this proposition has the following ingredients: We couple the initial stages of the growth to two independent branching processes (BPs) (Section 2.1). The coupling fails when one of the colors (wlog we assume it is red) reaches size \(n^{\varrho _n'}\) for some \(\varrho _n'>0\) sufficiently small. From the half-edges attached to the BP cluster of \(v_q, q\in \{r,b\}\), we build a path through higher and higher degree vertices to a vertex with degree at least \(\sim n^{(\tau -2){\beta _n}}\). In Lemma 2.5 we give an upper bound on the degree of the maximal-degree vertex reached at any time \(t(n^{\varrho _n'})+i\) of the exploration, implying that in \(T_q({\beta _n})\) hops no vertex of degree higher than \(\sim n^{{\beta _n}(\tau -2)^{b_n^{(q)}({\beta _n})}}\) is reached from \(v_q\). Thus, this lemma serves also as a building block for the proof of Proposition 2.1, but beyond that, it also enables us to show two other important things, that form the content of Proposition 3.1 (Section 3):

  1. (1)

    An early meeting is highly unlikely, i.e., the clusters \(\mathcal {C}^{\scriptscriptstyle {(q)}}_{T_q({\beta _n})}\) are \(\mathbb {P}_{\scriptscriptstyle {Y}}\)-whp disjoint.

  2. (2)

    The quantity in (1.32) bounds also the total number of half-edges attached to the explored cluster \(\mathcal {C}^{\scriptscriptstyle {(q)}}_{T_q({\beta _n})}\).

Finally, in Sect. 4 we finish the proof of Theorem 1.8. For the lower bound, we count the number of z-length paths between the two disjoint clusters \(\mathcal {C}^{\scriptscriptstyle {(q)}}_{T_q({\beta _n})}\). Here, we use a first moment method (i.e., we show that the expected number of paths is o(1)) when z is less than the expression in (1.26). For the upper bound, we establish the existence of a path of length as in (1.26) between \(v_r^\star , v_b^\star \). We do this using a second moment method. This completes the proof of Theorem 1.8. We prove Theorems 1.13 and 1.15 in Sect. 5. In Sect. 6 we compare Theorem 1.8 in the special case \(\beta _n\equiv 1/(\tau -1)\) to the result in [31].

2 Distance from the Hubs

In this section we analyse the distance of \(v_r, v_b\) from the highest-degree vertices. The construction is similar to that of [6, Sect. 3]. However, since Assumption 1.1 on \(F_n\) is weaker than the one in [6], we need to modify the proof. Recall that we call the vertices with degree at least \(n^{{\beta _n}(\tau -2)}\) hubs and recall \(T_q({\beta _n}), b_n^{\scriptscriptstyle {(q)}}({\beta _n})\) from (1.15). More precisely, let us define

$$\begin{aligned} \mathrm {hubs}:=\{ v\in [n]: d_v\ge n^{{\beta _n}(\tau -2)}\},\end{aligned}$$
(2.1)

and for a set of vertices \(\mathcal {A}\subseteq [n]\) and a vertex \(v\in [n]\),

$$\begin{aligned} {\mathrm {d}}_G(v, \mathcal {A}):= \min _{a\in {\mathcal {A}}} {\mathrm {d}}_G(v, a). \end{aligned}$$
(2.2)

The next proposition determines the distance of the uniformly chosen vertices \(v_r, v_b\) from the hubs:

Proposition 2.1

(Distance from the hubs) Let us consider the configuration model on n vertices with empirical degree distribution that satisfies Assumption 1.1, and let \(v_r, v_b\) be two uniformly chosen vertices. Then, for \(q\in \{r,b\}\), \(\mathbb {P}_{\scriptscriptstyle {Y}}\)-whp,

$$\begin{aligned} {\mathrm {d}}_G(v_q, \mathrm {hubs})=T_q({\beta _n})=\frac{\log \log n+\log \left( {\beta _n}/Y_n^{\scriptscriptstyle {(q)}}\right) }{|\log (\tau -2)|} -1-b_n^{\scriptscriptstyle {(q)}}({\beta _n}). \end{aligned}$$

More precisely, \(\mathbb {P}_{\scriptscriptstyle {Y}}\)-whp there is a vertex \(v^\star _q\in \mathrm {hubs}\) at distance \(T_q({\beta _n})\) away from \(v_q\) with degree

$$\begin{aligned} d_{v_q^\star }\sim n^{{\beta _n}(\tau -2)^{b_n^{\scriptscriptstyle {(q)}}({\beta _n})}}, \end{aligned}$$
(2.3)

while all vertices at distance at most \(T_q({\beta _n})-1\) from \(v_q\) are not hubs.

The main goal of this section is to prove this proposition. To show the upper bound, the proof has two main steps: the initial stage of the breadth-first-search exploration (BFS) is coupled to branching process trees (Sect. 2.1), while the later stage uses a decomposition of the vertices with degrees that are polynomial in n into shells (Sect. 2.2). To show the lower bound, we provide an upper bound on the degrees reached by the BFS in any shell at the time of first reaching that particular shell, see Lemma 2.5 (Sect. 2.3). This method is novel compared to the one in [31].

2.1 Coupling the Initial Stages of BFS to Branching Processes

In this section we investigate the initial stage of the spreading cluster of \(v_r, v_b\). In the construction of the configuration model, at any time that we construct the matching of the half-edges, we are allowed to choose one of the not-yet-paired half-edges arbitrarily, and pair it to a uniformly chosen other not-yet-paired half-edge. Hence, we can do the pairing in an order that corresponds to the breadth-first-search (BFS) exploration started from \(v_r, v_b\). That is, first we pair all the outgoing half-edges from the sources \(v_r, v_b\) (distance 1), then we pair the outgoing half-edges from the neighbors of the source vertices (distance 2), and so on, in a breadth-first-search manner. Whenever we finish pairing all the half-edges attached to vertices at a given graph distance from the source vertices, we increase the distance by 1. This process of joint construction of the BFS exploration and graph building is often called the exploration process in the literature.
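The joint construction of the graph and the BFS exploration can be sketched in a few lines. The code below is a simplified illustration, not the construction used in the proofs: it grows a single cluster from the whole source set instead of two separately coloured clusters, it stores the unpaired half-edges in a plain list (so each step costs linear time), and the function name `explore` is hypothetical.

```python
import random
from collections import deque

def explore(degrees, sources, max_dist, rng):
    """Pair half-edges in BFS order from `sources`; return {vertex: distance to sources}."""
    # One list entry per unpaired half-edge, labelled by the vertex it belongs to.
    pool = [v for v, d in enumerate(degrees) for _ in range(d)]
    dist = {s: 0 for s in sources}
    queue = deque(sources)
    while queue:
        v = queue.popleft()
        if dist[v] == max_dist:
            continue                                # do not explore beyond max_dist
        while v in pool and len(pool) >= 2:
            pool.remove(v)                          # the half-edge we choose to pair next
            w = pool.pop(rng.randrange(len(pool)))  # its uniformly chosen partner
            if w not in dist:                       # a newly discovered vertex
                dist[w] = dist[v] + 1
                queue.append(w)
            # Otherwise the pairing closes a cycle, a self-loop or a multiple edge.
    return dist

rng = random.Random(0)
degrees = [3] * 200
dist = explore(degrees, sources=[0, 1], max_dist=4, rng=rng)
print("explored vertices:", len(dist), " largest distance:", max(dist.values()))
```

Because the queue is first-in-first-out, all half-edges of vertices at distance \(t\) are paired before any half-edge of a vertex at distance \(t+1\), which matches the order described above.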

Recall that \(\mathcal {C}^{\scriptscriptstyle {(r)}}_t, \mathcal {C}^{\scriptscriptstyle {(b)}}_t\) denote the subgraphs that are at distance at most t from \(v_r, v_b\), respectively. [7, Proposition 4.7] (see also [6, Lemma 2.2]) shows that the number of vertices and their forward degrees in the exploration process can be coupled to i.i.d. degrees having distribution function \(F_n^\star \) from (1.6) as long as the total number of vertices of the colored clusters is not too large. There, a different assumption is posed on the maximal degree in the graph, so we briefly adjust the proof of [7, Proposition 4.7] to our setting below in Lemma 2.2. The distribution \(F^\star _n\) arises from the fact that as long as the set of explored vertices is relatively small, a forward degree j is generated when the uniformly chosen half-edge of a pairing belongs to a vertex with degree \(j+1\). The probability of choosing a half-edge that belongs to a vertex with degree \(j+1\) is approximately equal to \((j+1)\mathbb {P}(D_n=j+1)/\mathbb {E}[D_n]\), and thus \(F^\star _n\), and hence also \(F^\star \), are the natural candidates for the distribution of the forward degrees in the exploration process.
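The size-biased law of the forward degrees can be computed exactly from an empirical degree sequence. The short sketch below uses exact rational arithmetic; the function name is illustrative.

```python
from collections import Counter
from fractions import Fraction

def forward_degree_pmf(degrees):
    """P(D* = j) = (j + 1) P(D = j + 1) / E[D] for the empirical degree sequence."""
    total = sum(degrees)                   # number of half-edges, n * E[D]
    counts = Counter(degrees)              # number of vertices of each degree
    # A uniform half-edge belongs to a vertex of degree d with probability d * count / total,
    # and such a vertex contributes forward degree d - 1.
    return {d - 1: Fraction(d * c, total) for d, c in counts.items() if d > 0}

pmf = forward_degree_pmf([1, 1, 2, 4])
for j in sorted(pmf):
    print(j, pmf[j])
```

In the example the single vertex of degree 4 carries half of all half-edges, so forward degree 3 has probability 1/2 even though only a quarter of the vertices have degree 4.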

Lemma 2.2

(Coupling error of the forward degrees) Consider the configuration model with degree sequence that satisfies Assumption 1.1. Then, in the exploration process started from two uniformly chosen vertices \(v_r, v_b\), the forward degrees \((X_k^{\scriptscriptstyle {(n)}})_{k\le s_n}\) of the first \(s_n\) newly discovered vertices can be coupled to an i.i.d. sequence \(D_{n,k}^\star \) from distribution \(D_n^\star \) with the following error bound

$$\begin{aligned} \mathbb {P}( \exists k\le s_n, D_{n,k}^\star \ne X_k^{\scriptscriptstyle {(n)}} ) \le C s_n^2 n^{{\beta _n}(1+\varepsilon )-1}+C n^{- (\tau -2-2\varepsilon )/(\tau +\varepsilon )} s_n^{2(\tau -1+\varepsilon /2)/(\tau +\varepsilon )}\qquad \quad \end{aligned}$$
(2.4)

If further Assumption 1.3 holds, then there is a coupling of \((X_k^{\scriptscriptstyle {(n)}}, D_{n,k}^\star , D_k^\star )_{k\le s_n}\) with

$$\begin{aligned} \begin{aligned} \mathbb {P}( \exists k\le s_n, D_{n,k}^\star \ne X_k^{\scriptscriptstyle {(n)}} \text { or } D_{n,k}^\star \ne D_k^\star )&\le s_n n^{-{\beta _n}\kappa }+ C s_n^2 n^{{\beta _n}(1+\varepsilon )-1}\\&\quad \ \ +C n^{- (\tau -2-2\varepsilon )/(\tau +\varepsilon )} s_n^{2(\tau -1+\varepsilon /2)/(\tau +\varepsilon )}. \end{aligned}\nonumber \\ \end{aligned}$$
(2.5)

By choosing \(s_n\) in Lemma 2.2 so that the rhs of the bound in (2.5) still tends to zero, we obtain the following corollary:

Corollary 2.3

(Whp coupling of the exploration to two BPs) In the configuration model satisfying Assumptions 1.1 and 1.3, let t be such that

$$\begin{aligned} |\mathcal {C}^{\scriptscriptstyle {(r)}}_t\cup \mathcal {C}^{\scriptscriptstyle {(b)}}_t|\le \min \{n^{{\beta _n}(\kappa -\delta )}, n^{(1-{\beta _n}(1+\varepsilon )-\delta )/2}, n^{(\tau -2-\varepsilon -\delta )/(2(\tau -1+\varepsilon /2))}\} \end{aligned}$$
(2.6)

for some \(\delta >0\). Then \((\mathcal {C}^{\scriptscriptstyle {(r)}}_t, \mathcal {C}^{\scriptscriptstyle {(b)}}_t)\) can be whp coupled to two i.i.d. BPs with generation sizes \((Z_{k}^{\scriptscriptstyle {(r)}}, Z_{k}^{\scriptscriptstyle {(b)}})_{k>0}\) with distribution \(F^\star \) for the offspring in the second and further generations, and with distribution F for the offspring in the first generation.

Proof of Lemma 2.2

We would like to couple the forward degrees \((X_k^{\scriptscriptstyle {(n)}})_{k\le s_n}\) in the exploration to an i.i.d. sample of size \(s_n\) from distribution \(D_{n}^\star \) as in (1.7) as well as to \(D^\star \), and estimate the coupling error. The idea of the coupling is to achieve size-biased sampling with and without replacement of the vertices at the same time: this is [8, Construction 4.2], which we informally recall here. Let us write \(\mathcal {L}_n\) for the list of half-edges.

We use a sequence of uniform random variables \(U_k\in [0,1]\). Then, we sample a uniform half-edge \(h_k\) from \(\mathcal {L}_n\), namely we set \(h_k\) to be the jth element of \(\mathcal {L}_n\) if \(U_k\in ((j-1)/\ell _n, j/\ell _n]\). We sample the i.i.d. \(D_{n,k}^\star \) by setting \(D_{n,k}^\star \) to be \(d_{v(h_k)}-1\), where v(h) denotes the vertex that h is incident to.

At the same time, we keep a list of already sampled vertices \(\mathcal {S}_k:=\{v_r, v_b, v(h_1),\dots , v(h_k)\}\). As long as \(v(h_{k}) \notin \mathcal {S}_{k-1}\), we can set \(B_k^{\scriptscriptstyle {(n)}}:=d_{v(h_k)}-1\); this quantity describes the number of brother half-edges of a vertex newly discovered via a pairing. Note that in this sampling procedure there is no pairing of the half-edges yet. To ensure that the exploration cluster is a tree, at each step k we still have to check whether any of these \(B_k^{\scriptscriptstyle {(n)}}=d_{v(h_k)}-1\) half-edges create cycles when being paired, i.e., whether they are paired to vertices in \(\mathcal {S}_{k-1}\). We write \(X_k^{\scriptscriptstyle {(n)}}\) for \(B_k^{\scriptscriptstyle {(n)}}\) minus the number of those half-edges that are paired to vertices in \(\mathcal {S}_{k-1}\).

Thus, the coupling to a BP tree with offspring distribution \(D_n^\star \) can fail in two ways: either \(v(h_{k}) \in \mathcal {S}_{k-1}\) and the coupling between \(B_k^{\scriptscriptstyle {(n)}}, D_{n,k}^\star \) fails (depletion-of-points effect), or \(X_k^{\scriptscriptstyle {(n)}}<B_k^{\scriptscriptstyle {(n)}}\) and some of the \(B_k^{\scriptscriptstyle {(n)}}\) half-edges create cycles (cycle-creation effect).
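The first failure mode, the depletion-of-points effect, can be seen in a direct simulation of the sampling step. The sketch below is a simplification of the construction recalled above: a repeated vertex is merely flagged with `None` instead of being redrawn, the cycle-creation effect is not simulated, and the function name is hypothetical.

```python
import random

def coupled_draws(degrees, sources, s_n, rng):
    """Return a list of pairs (D_star, B); B is None when v(h_k) was already sampled."""
    half_edges = [v for v, d in enumerate(degrees) for _ in range(d)]  # the list L_n
    seen = set(sources)                       # plays the role of S_{k-1}
    draws = []
    for _ in range(s_n):
        v = half_edges[rng.randrange(len(half_edges))]  # vertex of a uniform half-edge
        d_star = degrees[v] - 1               # i.i.d. size-biased draw, with replacement
        b = d_star if v not in seen else None # draw without replacement fails on a repeat
        seen.add(v)
        draws.append((d_star, b))
    return draws

rng = random.Random(0)
degrees = [1] * 900 + [30] * 100              # a few high-degree vertices
draws = coupled_draws(degrees, sources=[0, 1], s_n=50, rng=rng)
failures = sum(1 for _, b in draws if b is None)
print("coupling failures among the first 50 draws:", failures)
```

Repeats are driven by the high-degree vertices, which is why the maximal degree \(n^{{\beta _n}(1+\varepsilon )}\) enters the error bound for this effect.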

Introducing the \(\sigma \)-algebra \(\mathcal {G}_k\) generated by \(v_r, v_b\) and the first k draws, [8, Lemma 4.3] bounds the coupling error between \(B_k^{\scriptscriptstyle {(n)}}, D_{n,k}^\star \):

$$\begin{aligned} \mathbb {P}\big (B_k^{\scriptscriptstyle {(n)}} \ne D_{n,k}^\star \mid \mathcal {G}_{k-1}\big ) \le \frac{1}{\ell _n} \left( d_{v_r} + d_{v_b} + \sum _{s=1}^{k-1} (B_s^{\scriptscriptstyle {(n)}}+1)\right) , \end{aligned}$$
(2.7)

while [8, Lemma 4.3] estimates the probability of creating a cycle at step k:

$$\begin{aligned}&\mathbb {P}\left( X_k^{\scriptscriptstyle {(n)}} < B_{k}^{\scriptscriptstyle {(n)}} \mid \mathcal {G}_{k-1}\right) \nonumber \\&\quad \le \frac{B_k^{\scriptscriptstyle {(n)}}}{\ell _n-1-\Big (d_{v_r} + d_{v_b} + \sum _{s=1}^k \left( B_s^{\scriptscriptstyle {(n)}}+1\right) \Big )} \left( d_{v_r} + d_{v_b} + \sum _{s=1}^k (B_s^{\scriptscriptstyle {(n)}}+1)\right) .\qquad \quad \end{aligned}$$
(2.8)

Under Assumption 1.1 the maximal degree is at most \(n^{{\beta _n}(1+\varepsilon )}\) in the graph, and \(\ell _n=\mathbb {E}[D_n]n\) is of order n. Combining these observations, the rhs of (2.7) is at most \(C k n^{{\beta _n}(1+\varepsilon )-1}\). Summing this bound over \(k\le s_n\), we obtain that

$$\begin{aligned} \mathbb {P}\Big ( \exists k\le s_n, B_k^{\scriptscriptstyle {(n)}} \ne D_{n,k}^\star \Big ) \le C s_n^2 n^{{\beta _n}(1+\varepsilon )-1}. \end{aligned}$$
(2.9)

Then, on the event \(\{\forall k\le s_n: B_k^{\scriptscriptstyle {(n)}} = D_{n,k}^\star \}\), \(B_s^{(n)}\) in (2.8) can be replaced by the i.i.d. \(D_{n,s}^{\star }\). Taking expectations of the rhs of (2.8) does not work, since \(\mathbb {E}[D_n^\star ]\) is infinite. Thus we apply a truncation argument. Take \(\varepsilon >0\) so that \(s_n^{(\tau -1+\varepsilon )/(\tau -2)}=o(n)\). Then the denominator on the rhs of (2.8) is at least cn for some \(c>0\). Further, for some truncation value \(K_n\) to be chosen later that satisfies \(s_nK_n=o(n)\),

(2.10)

where we used that on the event \(\big \{ \forall k \le s_n: B_{k}^{\scriptscriptstyle {(n)}} =D_{n,k}^\star , D_{n,k}^\star \le K_n \big \}\) the denominator on the rhs of (2.8) is at least cn for some \(c>0\). Using (1.7), the first term on the rhs is at most \(s_n L_n^\star (K_n )K_n^{-(\tau -2)}\le s_n K_n^{-(\tau -2)+\varepsilon }\) for arbitrarily small \(\varepsilon >0\) and sufficiently large n, while the second term is at most \(C n^{-1} K_n^2 s_n^2\). Equating the orders of the two error terms shows that the best choice of truncation value is

$$\begin{aligned} K_n:=(n/s_n)^{1/(\tau +\varepsilon )}. \end{aligned}$$
(2.11)

Note that the initial criterion \(s_nK_n=o(n)\) is satisfied for all \(s_n=o(n)\), while \(K_n\) is polynomial in n with strictly positive exponent whenever \( s_n=o(n^{(1-\varepsilon )/(\tau -1)})\). With \(K_n\) as in (2.11), the sum of the error terms in (2.9) becomes

$$\begin{aligned} \mathbb {P}\Big ( \exists k\le s_n, X_k^{\scriptscriptstyle {(n)}} \ne B_{k}^{\scriptscriptstyle {(n)}} \mid \forall k \le s_n, B_{k}^{\scriptscriptstyle {(n)}} =D_{n,k}^\star \Big ) \le C n^{- (\tau -2-2\varepsilon )/(\tau +\varepsilon )} s_n^{2(\tau -1+\varepsilon /2)/(\tau +\varepsilon )}\nonumber \\ \end{aligned}$$
(2.12)

which tends to zero as long as \(s_n=o\left( n^{(\tau -2-\varepsilon )/2(\tau -1+\varepsilon /2)}\right) \) for some arbitrarily small \(\varepsilon >0\).
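To see how the truncation value balances the two error terms, one can evaluate them numerically. The sketch below is only illustrative: it sets all constants and the slowly varying factor to 1, and the function name `truncation_errors` is our own. With \(\varepsilon =0\) the two terms coincide exactly, which is the balancing that motivates (2.11).

```python
def truncation_errors(n, s_n, tau, eps=0.0):
    """Return (K_n, tail_term, cycle_term) with all constants set to 1.

    tail_term  ~ s_n * K_n^{-(tau-2)+eps}: some forward degree exceeds K_n.
    cycle_term ~ K_n^2 * s_n^2 / n: a cycle is created among truncated degrees.
    """
    k_n = (n / s_n) ** (1.0 / (tau + eps))  # truncation value as in (2.11)
    tail_term = s_n * k_n ** (-(tau - 2) + eps)
    cycle_term = k_n ** 2 * s_n ** 2 / n
    return k_n, tail_term, cycle_term

# Illustrative parameter values (assumptions, not taken from the text).
k_n, tail, cycle = truncation_errors(n=10**12, s_n=10, tau=2.5, eps=0.0)
print(k_n, tail, cycle)
```

For small positive `eps` the two terms differ only by small powers of `n / s_n`, and both still vanish when \(s_n\) grows slowly enough.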

Note that we have neglected coupling the forward degree of \(v_r, v_b\) to two i.i.d. copies of \(D_n\). This coupling can be done in a very similar way, by choosing two uniform numbers from [n] with and without replacement, with a coupling error of at most 1 / n, which is negligible compared to the rhs of (2.9). The probability that any of the \(d_{v_r}\) or \(d_{v_b}\) half-edges form cycles is at most of order \(n^{{\beta _n}(1+\varepsilon )}/\ell _n\), which again can be merged into the rhs of (2.9). This finishes the proof of (2.4).

Next we extend the coupling between \(\Big (X_k^{\scriptscriptstyle {(n)}}, D_{n,k}^\star \Big )\) to additionally couple \(D^\star _k\) to them, using Assumption 1.3. On the event that \(X_k^{\scriptscriptstyle {(n)}}= D_{n,k}^\star =\ell \), we use the optimal coupling that realizes the total variation distance between \(D_n^\star \) and \(D^\star \). Namely,

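$$\begin{aligned} \mathbb {P}\left( D_k^\star =\ell \mid X_k^{\scriptscriptstyle {(n)}}= D_{n,k}^\star =\ell \right) = \frac{\min \left\{ \mathbb {P}(D^\star =\ell ), \mathbb {P}\left( D_n^\star =\ell \right) \right\} }{\mathbb {P}\left( D_n^\star =\ell \right) }. \end{aligned}$$
(2.13)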

One can set the other possible values of \(D_k^\star \) as described e.g. in [38, Chapter 1]. In any case, the coupling error equals

$$\begin{aligned} \mathbb {P}\left( D^\star _k\ne D_{n,k}^\star \right) =1-\sum _{\ell \ge 1}\min \left\{ \mathbb {P}(D^\star =\ell ), \mathbb {P}\left( D_n^\star =\ell \right) \right\} = d_{\scriptscriptstyle {\mathrm {TV}}}\left( F^\star , F_n^\star \right) . \end{aligned}$$

One can realize this coupling by using another independent uniform variable \(\widetilde{U}_k\) to set the value of \(D_k^\star \) once the value of \(D_{n,k}^\star \) is determined. Thus, we obtain that

$$\begin{aligned} \mathbb {P}\left( \exists k\le s_n, D_{n,k}^\star \ne D_{k}^\star \right) \le s_n d_{\scriptscriptstyle {\mathrm {TV}}}\left( F_n^\star , F^\star \right) \le s_n n^{-\kappa }. \end{aligned}$$
(2.14)

Similarly, we can couple \(d_{v_r}, d_{v_b}\) to two i.i.d. copies of D with a coupling error at most \(2d_{\scriptscriptstyle {\mathrm {TV}}}(F_n, F)\). This finishes the proof of (2.5). \(\square \)

A theorem by Davies [16], which we describe here informally, gives the growth rate of a branching process whose offspring distribution G satisfies (1.7). Let \(\widetilde{Z}_k\) denote the size of the k-th generation of a branching process with offspring distribution given by a distribution function G that can be written in the form given in (1.7) and (1.5) (see Footnote 7) for some \(\tau \in (2,3)\) and some \(x_0>0\), for all \(x\ge x_0\). Then \((\tau -2)^{k}\log \left( \widetilde{Z}_{k}\vee 1\right) \) converges almost surely to a random variable \(\widetilde{Y}\). Further, the variable \(\widetilde{Y}\) has exponential tails: if \(J(x):=\mathbb {P}(\widetilde{Y}\le x)\), then

$$\begin{aligned} \lim _{x\rightarrow \infty } \frac{- \log (1-J(x))}{x} =1. \end{aligned}$$
(2.15)

We can apply Davies’ theorem to each subtree of the two roots of the two i.i.d. BPs from Corollary 2.3 to obtain the corresponding convergence in (1.12) in Definition 1.6. See [6, Lemma 2.4] for more details.

Recall from Corollary 2.3 that the coupling of the forward degrees in the BP and in the exploration fails when the total size of the BPs is too large, i.e., when (2.6) is not satisfied. Thus let us set (see Footnote 8)

$$\begin{aligned} \varrho _n':=(\tau -2)\min \left\{ {\beta _n}\kappa , (1-{\beta _n}(1+\varepsilon ))/2, (\tau -2-\varepsilon )/(2(\tau -1+\varepsilon /2))\right\} . \end{aligned}$$
(2.16)

Without loss of generality we assume that the cluster of \(v_r\) reaches size \(n^{\varrho _n'}\) first (otherwise we switch the indices r and b). Thus, let us define

$$\begin{aligned} t\left( n^{\varrho _n'}\right) =\inf \left\{ k: Z_k^{\scriptscriptstyle {(r)}} \ge n^{\varrho _n'}\right\} . \end{aligned}$$
(2.17)

From the definition (1.11) an elementary rearrangement yields that (conditioned on \(Y_r^{\scriptscriptstyle {(n)}}\)),

$$\begin{aligned} t(n^{\varrho _n'})= \frac{\log \left( \varrho _n'/Y_r^{\scriptscriptstyle {(n)}}\right) + \log \log n}{|\log (\tau -2)|}+ 1-a_n^{\scriptscriptstyle {(r)}}, \end{aligned}$$
(2.18)

where

$$\begin{aligned} a_n^{\scriptscriptstyle {(r)}}= \left\{ \frac{\log \left( \varrho _n'/Y_r^{\scriptscriptstyle {(n)}}\right) + \log \log n}{|\log (\tau -2)|}\right\} . \end{aligned}$$
(2.19)

Note that the term \(1-a_n^{\scriptscriptstyle {(r)}}\) in (2.18) is there to make \(t(n^{\varrho _n'})\) equal to the upper integer part of the fraction on the rhs of (2.18). Due to this effect, the last generation has slightly more vertices than \(n^{\varrho _n'}\), namely

$$\begin{aligned} Z_{t(n^{\varrho _n'})}^{\scriptscriptstyle {(r)}}= n^{ \varrho _n' (\tau -2)^{a_n^{\scriptscriptstyle {(r)}}-1} }=: m_r, \end{aligned}$$
(2.20)

We obtain this expression by rearranging (1.11) and using the value \(t(n^{\varrho _n'})\) from (2.18). The definition of \(\varrho _n'\) in (2.16) and \(a_n^{\scriptscriptstyle {(r)}} \in [0,1)\) implies that the exponent \(\varrho _n' (\tau -2)^{a_n^{\scriptscriptstyle {(r)}}-1}\) is still so small that the condition (2.6) in Corollary 2.3 is satisfied. Similarly, from (1.11) and (2.18), the blue cluster at this moment has size

$$\begin{aligned} Z_{t(n^{\varrho _n'})}^{\scriptscriptstyle {(b)}}=n^{\varrho _n' (\tau -2)^{a_n^{\scriptscriptstyle {(r)}}-1} Y_b^{\scriptscriptstyle {(n)}}/Y_r^{\scriptscriptstyle {(n)}}}=:m_b, \end{aligned}$$
(2.21)

where the assumption that red reaches size \(n^{\varrho _n'}\) first is equivalent to the assumption that \(Y_b^{\scriptscriptstyle {(n)}}/Y_r^{\scriptscriptstyle {(n)}}\le 1\). This assumption together with (2.20) as well as the double-exponential growth apparent from (1.12) ensures that the total size of the two BPs is less than \(n^{\varrho _n}\) and thus Corollary 2.3 still holds. Note that \(m_r, m_b\) are random variables that are measurable wrt \(\mathcal {F}_{\varrho _n'}\).
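The rearrangement leading to (2.18)–(2.20) can be checked numerically. The sketch below assumes that (1.11) amounts to the approximation \(\log Z_k = Y (\tau -2)^{-k}\), which is what the rearrangement in (2.18) suggests; the function name `stopping_generation` and the parameter values are illustrative assumptions.

```python
import math

def stopping_generation(n, rho, y, tau):
    """Return (t, a, exponent) assuming log Z_k = y * (tau - 2) ** (-k).

    t is the first k with Z_k >= n ** rho as in (2.17)-(2.18),
    a is the fractional part from (2.19),
    exponent satisfies Z_t = n ** exponent as in (2.20).
    """
    x = (math.log(rho / y) + math.log(math.log(n))) / abs(math.log(tau - 2))
    a = x - math.floor(x)          # fractional part, in [0, 1)
    t = math.floor(x) + 1          # equals x + 1 - a
    exponent = rho * (tau - 2) ** (a - 1)
    return t, a, exponent

# Illustrative values (assumptions): n, rho, the limit variable y, and tau.
n, rho, y, tau = 10**9, 0.1, 0.7, 2.5
t, a, exponent = stopping_generation(n, rho, y, tau)
print(t, a, exponent)
```

Since \(a\in [0,1)\), the returned exponent lies between \(\varrho _n'\) and \(\varrho _n'/(\tau -2)\), which is the overshoot described after (2.19).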

2.2 Short Path to the Hubs Through Shells

To provide an upper bound on the distance of \(v_q\) and the hubs for \(q\in \{r,b\}\), as well as to show (2.3) we build a path from \(\mathcal {C}^{\scriptscriptstyle {(q)}}_{t(n^{\varrho _n'})}\) to the hubs. Let us set \(C_2:=\max \{C_1, C^\star _1\}\), where \(C_1, C^\star _1\) are the constants in the exponent in (1.5) for \(L_n, L_n^\star \), respectively, and define the function

$$\begin{aligned} h(x):=\exp \left\{ \frac{2C_2}{(\tau -2)^\gamma } (\log x)^\gamma \right\} . \end{aligned}$$
(2.22)

We shall repeatedly use that for any possible \(L_n, L_n^\star \) satisfying (1.5) as \(x\rightarrow \infty \),

$$\begin{aligned}&\min \left\{ L_n\left( x^{1/(\tau -2)}\right) , L_n^\star \left( x^{1/(\tau -2)}\right) \right\} h(x)\rightarrow \infty , \end{aligned}$$
(2.23)
$$\begin{aligned}&\max \left\{ L_n\left( x^{1/(\tau -2)}\right) , L_n^\star \left( x^{1/(\tau -2)}\right) \right\} /h(x)\rightarrow 0. \end{aligned}$$
(2.24)

Recall \(m_q\) from (2.20), (2.21) and that they are measurable wrt \(\mathcal {F}_{\varrho _n'}\). Generally in the rest of the paper, random variables measurable wrt \(\mathcal {F}_{\varrho _n'}\) are denoted by small letters, since they can be treated as constants under the measure \(\mathbb {P}_Y\) in (1.14), and this is the measure we mostly work with. In order to build the path, for both \(q\in \{r,b\}\) we decompose the high-degree vertices in the graph into the following sets, which we call shells:

$$\begin{aligned} \Gamma _i^{\scriptscriptstyle {(q)}}:=\left\{ v: d_v>u_{i}^{\scriptscriptstyle {(q)}}\right\} , \end{aligned}$$
(2.25)

where \(u_i^{\scriptscriptstyle {(q)}}\) is defined recursively by

$$\begin{aligned} u_{i+1}^{\scriptscriptstyle {(q)}} =\left( \frac{u_{i}^{\scriptscriptstyle {(q)}}}{h(u_{i}^{\scriptscriptstyle {(q)}})}\right) ^{1/(\tau -2)}, \ \ \qquad u_0^{\scriptscriptstyle {(q)}}:= \bigg (\frac{m_q}{h(m_q)}\bigg )^{1/(\tau -2)}. \end{aligned}$$
(2.26)

Setting \(u_{-1}^{\scriptscriptstyle {(q)}}:=m_q\), we obtain by iteration

$$\begin{aligned} u_{i}^{\scriptscriptstyle {(q)}}= m_q^{(\tau -2)^{-(i+1)}} /\prod _{k=1}^{i+1} h\left( u_{i-k}^{\scriptscriptstyle {(q)}}\right) ^{(\tau -2)^{-k}}. \end{aligned}$$

Clearly, \(u_{i}^{\scriptscriptstyle {(q)}}\le m_q^{(\tau -2)^{-(i+1)}}\). Using this upper bound to estimate the arguments of the function h in the denominator, as well as (2.22), and summing the resulting geometric series, the following lower bound holds with \(K_\gamma =1-(\tau -2)^{1-\gamma }\):

$$\begin{aligned} u_{i}^{\scriptscriptstyle {(q)}}\ge m_q^{ (\tau -2)^{-(i+1)}} \exp \left\{ -\frac{2C_2 }{K_\gamma (\tau -2)^\gamma } (\tau -2)^{-(i+1)} (\log m_q)^\gamma \right\} . \end{aligned}$$
(2.27)

Since \(m_q\) tends to infinity with n (see (2.20), (2.21)) and \(\gamma <1\), the second factor is of smaller order than the first factor for all sufficiently large n. This observation together with the upper bound yields that for any fixed i,

$$\begin{aligned} u_{i}^{\scriptscriptstyle {(q)}}\sim m_q^{ (\tau -2)^{-(i+1)}}, \end{aligned}$$
(2.28)

in the sense of (1.5). Note that \((\tau -2)^{-1}>1\), thus \(u_{i}^{\scriptscriptstyle {(q)}}\) is growing and \(\Gamma _i^{\scriptscriptstyle {(q)}} \supset \Gamma _{i+1}^{\scriptscriptstyle {(q)}}\).
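The recursion (2.26) is easy to iterate in logarithmic scale, which avoids overflow since the thresholds grow doubly exponentially. The following sketch is for illustration only; the helper names `log_h` and `shell_log_thresholds` and all parameter values are our own assumptions.

```python
import math

def log_h(log_x, c2, tau, gamma):
    """log h(x) from (2.22), written in terms of log x."""
    return 2 * c2 / (tau - 2) ** gamma * log_x ** gamma

def shell_log_thresholds(log_m, c2, tau, gamma, count):
    """Return [log u_0, ..., log u_{count-1}] from the recursion (2.26)."""
    logs, prev = [], log_m  # prev starts at log u_{-1} = log m_q
    for _ in range(count):
        # u_{i+1} = (u_i / h(u_i)) ** (1 / (tau - 2)), taken in logs
        prev = (prev - log_h(prev, c2, tau, gamma)) / (tau - 2)
        logs.append(prev)
    return logs

# Illustrative values (assumptions): log m_q = 100, C_2 = 1, tau = 2.5, gamma = 0.5.
log_m, c2, tau, gamma = 100.0, 1.0, 2.5, 0.5
logs = shell_log_thresholds(log_m, c2, tau, gamma, count=5)
for i, value in enumerate(logs):
    upper = log_m * (tau - 2) ** (-(i + 1))  # log of m_q^{(tau-2)^{-(i+1)}}
    print(i, value, upper, value / upper)
```

In this example the thresholds increase with i and stay below the upper bound \(m_q^{(\tau -2)^{-(i+1)}}\), as used in the text.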

To show that the initial stage (coupling to BPs) and the paths through the shells have nonempty intersection, we will use the following claim:

Claim 2.4

Let \(X_i, \ i=1, \dots , m\) be i.i.d. random variables from distribution \(F_n^\star \) or \(F^\star \). Then

$$\begin{aligned} \mathbb {P}\bigg (\max _{i\in [m]} X_i < \Big (\frac{ m}{h(m)}\Big )^{1/(\tau -2)} \bigg ) \le \exp \Big \{-\exp \Big \{ \frac{C^\star _1}{(\tau -2)^\gamma } (\log m)^\gamma \Big \}\Big \} \rightarrow 0. \end{aligned}$$
(2.29)

Proof

We show it for \(F_n^\star \). The proof for \(F^\star \) is identical. Clearly

$$\begin{aligned} \begin{aligned} \mathbb {P}\bigg (\max _{i\in [m]} X_i < \Big (\frac{ m}{h(m)}\Big )^{1/(\tau -2)} \bigg )&= F^\star _n\left( \big (\tfrac{m}{h(m)}\big )^{1/(\tau -2)}\right) ^{m} \\&\le \exp \left\{ -m\left( 1-F_n^\star \big ((\tfrac{m}{h(m)})^{1/(\tau -2)} \big )\right) \right\} , \end{aligned} \end{aligned}$$

and the rest follows by using the function h from (2.22) as well as the form of \(F_n^\star \) from (1.7), in particular the relation in (2.23).
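As a sanity check of the mechanism behind Claim 2.4, one can evaluate the probability explicitly in the simplest special case. The sketch below assumes a pure power-law tail \(\mathbb {P}(X>x)=x^{-(\tau -2)}\) for \(x\ge 1\), i.e., the slowly varying factor is identically 1; this is an assumption made only for illustration, and the function name `max_below_threshold` and the parameter values are ours.

```python
import math

def max_below_threshold(m, c2, tau, gamma):
    """Return (exact, bound) for P(max_i X_i < (m / h(m)) ** (1 / (tau - 2))).

    Assumes the pure power-law tail P(X > x) = x ** (-(tau - 2)) for x >= 1.
    """
    h_m = math.exp(2 * c2 / (tau - 2) ** gamma * math.log(m) ** gamma)
    tail = h_m / m               # 1 - F evaluated at the threshold
    exact = (1 - tail) ** m      # all m samples stay below the threshold
    bound = math.exp(-m * tail)  # uses 1 - x <= exp(-x); equals exp(-h(m))
    return exact, bound

# Illustrative values (assumptions), chosen so that h(m) < m.
exact, bound = max_below_threshold(m=10**6, c2=0.1, tau=2.5, gamma=0.5)
print(exact, bound)
```

In this special case the bound reduces to \(\exp \{-h(m)\}\), which tends to zero since \(h(m)\rightarrow \infty \).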

Proof of Proposition 2.1, upper bound

By the coupling established in Corollary 2.3, conditioned on the size of the last generation, which we denote by \(m_q\), the degrees in the last generation of the two BPs form an i.i.d. sample (either from \(F_n^\star \) or from \(F^\star \)). Claim 2.4, applied conditionally on \(m_q\), ensures that whp there are vertices with degree at least \(u_0^{\scriptscriptstyle {(q)}}=(m_q/h(m_q))^{1/(\tau -2)}\) in the last generation of the two BPs, establishing that \(\mathbb {P}_{\scriptscriptstyle {Y}}\)-whp

$$\begin{aligned} \mathcal {C}_{t(n^{\varrho _n'})}^{\scriptscriptstyle {(q)}}\cap \Gamma _0^{\scriptscriptstyle {(q)}} \ne \varnothing . \end{aligned}$$
(2.30)

The next step is to show that \(\mathbb {P}_{\scriptscriptstyle {Y}}\)-whp for all i such that \(\Gamma _{i+1}^{\scriptscriptstyle {(q)}}\ne \varnothing \),

$$\begin{aligned} \Gamma _i^{\scriptscriptstyle {(q)}} \subset N(\Gamma _{i+1}^{\scriptscriptstyle {(q)}}), \end{aligned}$$
(2.31)

where N(S) stands for the set of vertices that are neighbors of S (Fig. 1).

Fig. 1 An illustration of the layers and the mountain climbing phase at time \(t(n^{{\varrho '_n}})+3\). Disclaimer: the degrees on the picture are only an illustration

This statement can be obtained by a modification of [6, Lemma 3.4] that we provide here for the reader’s convenience. Recall that \(H(\mathcal {A})\) denotes the number of half-edges attached to vertices in a set \(\mathcal {A}\).

The algorithm generating the configuration model allows us, when checking the connections of a vertex \(v\in \Gamma _i^{\scriptscriptstyle {(q)}}\), to start by pairing the half-edges of v one after another. Given \(H(\Gamma _{i+1}^{\scriptscriptstyle {(q)}})\), the probability that a half-edge is not connected to any of the half-edges attached to vertices in \(\Gamma _{i+1}^{\scriptscriptstyle {(q)}}\) is at most \(1-H(\Gamma _{i+1}^{\scriptscriptstyle {(q)}})/\ell _n\). Since we can pair at least \(u_{i}^{\scriptscriptstyle {(q)}}/2\) half-edges before all the half-edges of v are paired (see Footnote 9), by a union bound over all \(v\in \Gamma _{i}^{\scriptscriptstyle {(q)}}\),

$$\begin{aligned} \mathbb {P}_{\scriptscriptstyle {Y}}\big (\exists v\in \Gamma _i^{\scriptscriptstyle {(q)}}: v \not \leftrightarrow \Gamma _{i+1}^{\scriptscriptstyle {(q)}}\mid H(\Gamma _{i+1}^{\scriptscriptstyle {(q)}} ) \big )\le & {} |\Gamma _i^{\scriptscriptstyle {(q)}}| \Big (1-\frac{H(\Gamma _{i+1}^{\scriptscriptstyle {(q)}})}{\ell _n} \Big )^{u_{i}^{\scriptscriptstyle {(q)}}/2}\nonumber \\\le & {} n \exp \!\left\{ -Cu_{i}^{\scriptscriptstyle {(q)}}u_{i+1}^{\scriptscriptstyle {(q)}}(1-F_n(u_{i+1}^{\scriptscriptstyle {(q)}})) \right\} \!.\qquad \end{aligned}$$
(2.32)

By using the lower bound in (1.5) as well as the upper bound \(u_{i+1}^{\scriptscriptstyle {(q)}}\le \left( u_{i}^{\scriptscriptstyle {(q)}}\right) ^{1/(\tau -2)}\), (see also (2.23)),

$$\begin{aligned} u_{i}^{\scriptscriptstyle {(q)}}u_{i+1}^{\scriptscriptstyle {(q)}}\left( 1-F_n(u_{i+1}^{\scriptscriptstyle {(q)}})\right) = h(u_{i}^{\scriptscriptstyle {(q)}})L_n\left( u_{i+1}^{\scriptscriptstyle {(q)}}\right) \ge \exp \left\{ \widetilde{C} (\tau -2)^{-i} (\log m_q)^\gamma \right\} , \end{aligned}$$

for some \(\widetilde{C}>0 \). Since \(m_q\) (see (2.20), (2.21)) tends to infinity with n, the bound in (2.32) tends to zero as \(n\rightarrow \infty \) even when we sum over \(i\ge 1\). This establishes (2.31).

The coupling to two BPs combined with (2.30) and (2.31) establishes the existence of a path to the hubs. This provides an upper bound on the distance between \(v_r, v_b\) and the hubs. It remains to calculate the length of these paths.

We write \(i_{\star \scriptscriptstyle {(q)}}\) for the last index for which \(\Gamma _{i}^{\scriptscriptstyle {(q)}}\) is nonempty, i.e., by (1.4),

$$\begin{aligned} i_{\star \scriptscriptstyle {(q)}}:=\sup \left\{ i: u_i^{\scriptscriptstyle {(q)}} \le n^{{\beta _n}} \right\} . \end{aligned}$$
(2.33)

Some calculation using the value of \(m_q\) from (2.20), combined with (2.28) and (2.27) shows that

$$\begin{aligned} i_{\star \scriptscriptstyle {(r)}}= -1+ \frac{\log \Big ({\beta _n}/\left( \varrho _n' (\tau -2)^{a_n^{\scriptscriptstyle {(r)}}-1}\right) \Big )}{|\log (\tau -2)|}-b_n^{\scriptscriptstyle {(r)}}({\beta _n}) + o_{\mathbb {P}_{\scriptscriptstyle {Y}}}(1), \end{aligned}$$
(2.34)

where \(b_n^{\scriptscriptstyle {(r)}}({\beta _n})\) is the fractional part of the previous term on the rhs. Using the value of \(a_n ^{\scriptscriptstyle {(q)}}\) from (2.19), plus the fact that \(\{x - 1+\{y\}\}=\{x+y\}\), we get that

$$\begin{aligned} b_n^{\scriptscriptstyle {(r)}}({\beta _n}) = \left\{ \frac{\log ({\beta _n}/\varrho _n')}{|\log (\tau -2)|}+ a_n^{\scriptscriptstyle {(r)}}-1\right\} = \left\{ \frac{\log ({\beta _n}/Y_r^{\scriptscriptstyle {(n)}})+\log \log n}{|\log (\tau -2)|}\right\} , \end{aligned}$$
(2.35)

exactly as defined in (1.15). Similar calculation for \(q=b\) yields

$$\begin{aligned} i_{\star \scriptscriptstyle {(b)}} =-1+\frac{\log \Big ({\beta _n}/ \left( \varrho _n' (\tau -2)^{a_n^{\scriptscriptstyle {(r)}}-1}\right) \Big ) + \log (Y_r^{\scriptscriptstyle {(n)}}/Y_b^{\scriptscriptstyle {(n)}})}{|\log (\tau -2)|}-b_n^{\scriptscriptstyle {(b)}}({\beta _n})+o_{\mathbb {P}_{\scriptscriptstyle {Y}}}(1),\qquad \end{aligned}$$
(2.36)

where \(b_n^{\scriptscriptstyle {(b)}}({\beta _n})\) is the fractional part of the previous term on the rhs. Again, some calculation yields that \(b_n^{\scriptscriptstyle {(b)}}({\beta _n})\) is exactly as in (1.15). Note that the definition of \(\varrho _n'\) in (2.16) guarantees that the ratio \({\beta _n}/\varrho _n'\) is bounded, and hence \(i_{\star \scriptscriptstyle {(q)}}\) is a tight random variable (measurable wrt \(\mathcal {F}_{\varrho _n'}\)). From (2.28) and (2.34) and (2.36) respectively, one can calculate that

$$\begin{aligned} u_{i_{\star \scriptscriptstyle {(q)}}}^{\scriptscriptstyle {(q)}}\sim n^{{\beta _n}(\tau -2)^{b_n^{\scriptscriptstyle {(q)}}({\beta _n})}}, \end{aligned}$$
(2.37)

and the error factor as in (2.27) is \(o_{\mathbb {P}_{\scriptscriptstyle {Y}}}(1)\), since \(i_{\star \scriptscriptstyle {(q)}}\) does not tend to infinity with n. Thus, the total length of the constructed path is, for \(q\in \{r,b\}\),

$$\begin{aligned} T_q({\beta _n})=t(n^{\varrho _n'})+i_{\star \scriptscriptstyle {(q)}}=\frac{\log \log (n^{{\beta _n}})-\log Y_q^{\scriptscriptstyle {(n)}}}{|\log (\tau -2)|} -1-b_n^{\scriptscriptstyle {(q)}}({\beta _n}), \end{aligned}$$
(2.38)

establishing the upper bound on \({\mathrm {d}}_G(v_q, \mathrm {hubs})\) in Proposition 2.1. Note that \(T_q({\beta _n})\) depends on the value \(\varrho _n'\) only through the approximating variables \(Y_q^{\scriptscriptstyle {(n)}}\), and \(b_n^{\scriptscriptstyle {(q)}}({\beta _n})\) is exactly the fractional part of the first term on the rhs of (2.38). Since also \(Y_q^{\scriptscriptstyle {(n)}}\buildrel {d}\over {\longrightarrow }Y_q\) irrespective of the choice of \(\varrho _n'\), the choice of \(\varrho _n'\) is not relevant in the proof (at least not in the limit); it is rather a technical necessity. \(\square \)
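The cancellation of \(\varrho _n'\) in (2.38) can be verified numerically for the red cluster by adding (2.18) and (2.34) and comparing with the closed form. This is a sketch under stated assumptions: the \(o_{\mathbb {P}_{\scriptscriptstyle {Y}}}(1)\) term is dropped, the variable y stands for \(Y_r^{\scriptscriptstyle {(n)}}\), and the function names and parameter values are illustrative.

```python
import math

def frac(x):
    """Fractional part {x}."""
    return x - math.floor(x)

def path_length(n, beta, rho, y, tau):
    """Return (t + i_star, closed_form) for q = r, ignoring o(1) terms."""
    scale = abs(math.log(tau - 2))
    x = (math.log(rho / y) + math.log(math.log(n))) / scale
    a = frac(x)                                   # (2.19)
    t = x + 1 - a                                 # (2.18)
    z = math.log(beta / (rho * (tau - 2) ** (a - 1))) / scale
    i_star = -1 + z - frac(z)                     # (2.34), b_n = frac(z)
    w = (math.log(beta * math.log(n)) - math.log(y)) / scale
    closed = w - 1 - frac(z)                      # rhs of (2.38)
    return t + i_star, closed

# Two different values of rho give the same total length (illustrative values).
for rho in (0.1, 0.05):
    total, closed = path_length(n=10**9, beta=0.45, rho=rho, y=0.7, tau=2.5)
    print(rho, total, closed)
```

The two printed totals agree with each other and with the closed form, reflecting that the total path length does not depend on `rho`.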

2.3 Upper Bound on the Degrees in the BFS

Now we turn towards providing a matching lower bound for the distance from the hubs. Similarly as in (2.26), let us define, for \(q\in \{r,b\}\),

$$\begin{aligned} \begin{aligned} \widehat{u}_0^{\scriptscriptstyle (q)}&:= (m_q h(m_q))^{1/(\tau -2)}, \\ \widehat{u}_{i+1}^{\scriptscriptstyle {(q)}}&:= \big (\widehat{u}_i^{\scriptscriptstyle {(q)}}h(\widehat{u}_{i}^{\scriptscriptstyle {(q)}})\big )^{1/(\tau -2)},\\ \widehat{\Gamma }_i^{\scriptscriptstyle {(q)}}&:=\{v \in {\mathrm {CM}}_n(\varvec{d}): d_v \ge \widehat{u}_i^{\scriptscriptstyle {(q)}}\}, \end{aligned} \end{aligned}$$
(2.39)

Note that \(\widehat{u}_{i}^{\scriptscriptstyle {(q)}}\) grows faster than \(u_{i}^{\scriptscriptstyle {(q)}}\) since here we multiply by h instead of dividing by it.

The next lemma handles the upper bound on the maximal-degree vertex reached at any time \(t(n^{\varrho _n'})+i\), but first some definitions. We say that a sequence of vertices and half-edges \(\underline{\pi }:=(\pi _0, s_0, t_1, \pi _1, s_1, t_2, \dots , t_k, \pi _k)\) forms a path in \({\mathrm {CM}}_n(\varvec{d})\), if for all \(0< i\le k\), the half-edges \(s_i, t_i\) are incident to the vertex \(\pi _i\) and \((s_{i-1}, t_i)\) forms an edge between \(\pi _{i-1},\pi _i\).

For \(q\in \{r,b\}\), we say that a path is q-good if \(\deg (\pi _i)\le \widehat{u}_i^{\scriptscriptstyle {(q)}}\) holds for every i. Otherwise we call it q-bad. We further decompose the set of q-bad paths in terms of where they ‘turn’ q-bad:

$$\begin{aligned} \begin{aligned} Bad\mathcal {P}_k^{\scriptscriptstyle {(q)}} :=&\Big \{ (\pi _0, s_0, t_1, \pi _1, s_1 \dots , t_k, \pi _k) \text { is a path, } \\&\quad \pi _0\!\in \!\mathcal {C}^{\scriptscriptstyle {(q)}}_{ t(n^{\varrho _n'}) },\ \deg (\pi _i)\!\le \! \widehat{u}_i^{\scriptscriptstyle {(q)}} \ \forall i\le k-1,\ \deg (\pi _k)\!>\!\widehat{u}_k^{\scriptscriptstyle {(q)}} \Big \}. \end{aligned} \end{aligned}$$
(2.40)

The following lemma shows that q-bad paths \(\mathbb {P}_{\scriptscriptstyle {Y}}\)-whp do not occur:

Lemma 2.5

For some constant \(C>0\), the following bound on the probability of having any bad paths holds for color \(q\in \{r,b\}\):

$$\begin{aligned} \mathbb {P}_{\scriptscriptstyle {Y}}\left( \exists k\in [0, i_{\star \scriptscriptstyle {(q)}}]: Bad\mathcal {P}_k^{\scriptscriptstyle {(q)}} \ne \varnothing \right) \le C \exp \left\{ -C (\log m_q)^{\gamma }\right\} . \end{aligned}$$
(2.41)

Before we prove this lemma, we need the following technical claim:

Claim 2.6

Let \(\left( D_{n,i}^\star \right) _{i\le m}\) be i.i.d. from distribution \(F_n^\star \) or \(F^\star \). Then there exists a \(0<C<\infty \), so that

$$\begin{aligned} \mathbb {P}_{\scriptscriptstyle {Y}}\left( \sum _{i=1}^{m} D_{n,i}^\star \ge (m h(m))^{1/(\tau -2)} \right) \le \exp \left\{ -C (\log m)^\gamma \right\} . \end{aligned}$$
(2.42)

Further, for any \(y\in [0, n^{{\beta _n}}]\),

$$\begin{aligned} \sum _{d_{\pi }\ge y} \frac{d_{\pi } }{\ell _n} \le L_n^{\star , \text {up}}(y)y^{2-\tau } \end{aligned}$$
(2.43)

where we denote by \(L_n^{\star , \text {up}}\) the upper bound in (1.5) on \(L_n^\star \). Next, the empirical truncated second moment satisfies for all \(y_n\rightarrow \infty \) and large enough n that

$$\begin{aligned} \sum _{\pi : d_{\pi }\le y_n} \frac{d_{\pi } (d_{\pi }-1)}{\ell _n} \le \frac{2}{3-\tau }(y_n)^{3-\tau } L_n^{\star , \text {up}}(y_n). \end{aligned}$$
(2.44)

Proof

The proof of (2.43) is probably the easiest. Namely, by the definition of the empirical distribution as well as of \(F_n^\star \), the sum can be rewritten as follows:

(2.45)

and an application of (1.7) establishes (2.43). Next, (2.44) can be rewritten similarly,

(2.46)

To obtain the second line, we used the usual trick to relate the expectation to the tail of a distribution, namely,

$$\begin{aligned} \begin{aligned} \sum _{j\le y_n} (j-1) \mathbb {P}\left( D_n^\star =j-1\right)&=\sum _{j\le y_n} \sum _{s\le j-1} \mathbb {P}\left( D_n^\star =j-1\right) = \sum _{s\le y_n-1} \sum _{s< j \le y_n} \mathbb {P}\left( D_n^\star =j-1\right) \\&\le \sum _{s\le y_n-1} \mathbb {P}\left( D_n^\star \ge s\right) = \sum _{s\le y_n} \left( 1-F_n^\star (s)\right) . \end{aligned} \end{aligned}$$

The condition that \(y_n\rightarrow \infty \) as \(n\rightarrow \infty \) enables us to apply the direct half of Karamata’s theorem (see Footnote 10 and [9, p. 26]), and obtain that for all large enough n, the following bound holds on the rhs of (2.46):

$$\begin{aligned} \sum _{j\le y_n} \frac{L_n^{\star , \text {up}}(j)}{j^{\tau -2}}\le \frac{2}{3-\tau }(y_n)^{3-\tau } L_n^{\star , \text {up}}(y_n), \end{aligned}$$
(2.47)

finishing the proof of (2.44). The proof of (2.42) is the trickiest; we handle it with a truncation method. For brevity, let us write \(M_m:=(m h(m))^{1/(\tau -2)}\). First we use a union bound:

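$$\begin{aligned} \mathbb {P}\left( \sum _{i=1}^{m} D_{n,i}^\star \ge M_m \right) \le \mathbb {P}\left( \exists i\le m: D_{n,i}^\star \ge M_m \right) + \mathbb {P}\left( \sum _{i=1}^{m} D_{n,i}^\star \mathbf {1}_{\{D_{n,i}^\star < M_m\}} \ge M_m \right) . \end{aligned}$$
(2.48)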

Then, we estimate the first term on the rhs of (2.48) again by a union bound:

$$\begin{aligned} \mathbb {P}\left( \exists i\le m: D_{n,i}^\star \ge M_m \right) \le m \left( 1-F_n^{\star }(M_m)\right) =L_n^\star (M_m) / h(m). \end{aligned}$$
(2.49)

We can use Markov’s inequality on the second term on the rhs of (2.48):

(2.50)

where we have used that the expectation in the numerator equals precisely the truncated empirical second moment as in (2.44) with \(y_n:=M_m\); thus this expectation can be handled in the same way as the rhs of (2.46). After an elementary calculation, the sum of the rhs of (2.49) and (2.50) equals

$$\begin{aligned} (C+1) \frac{L_n^\star (M_m) }{h(m)}\le (C+1)\exp \left\{ -\frac{C_2}{(\tau -2)^\gamma } (\log m)^\gamma \right\} . \end{aligned}$$
(2.51)

This, together with (2.48) establishes (2.42). \(\square \)

Proof of Lemma 2.5

The statement can be proved by path-counting methods in the same way as [6, Lemma 5.2] in the Appendix of that paper. Some minor modifications to that proof are needed, for two reasons. First, Assumption 1.1 imposes an assumption on the empirical degree distribution \(F_n\), and the degrees are no longer i.i.d. This makes certain estimates about truncated empirical moments easier (i.e., (2.43) and (2.44)). On the other hand, Assumption 1.1 is weaker than the assumption on the degrees there; compare to [6, (1.1)]. As a result, the recursion of \(\widehat{u}_{i}^{\scriptscriptstyle {(q)}}\) uses the function h as a multiplier instead of the constant function \(C\log n\). To make the argument easy to follow here, we recall the main steps in the proof and highlight differences only.

First of all, formulas [6, (A.3)–(A.5)] that count the expected number of bad paths apply word-for-word. That is, the expected number of paths in \({\mathrm {CM}}_n(\varvec{d})\) through a fixed sequence of vertices \((\pi _0, \pi _1, \dots , \pi _k)\) equals

$$\begin{aligned} \prod _{i=1}^k \frac{1}{\ell _n^\star -2i+1} d_{\pi _0} \left( \prod _{i=1}^{k-1} d_{\pi _i} (d_{\pi _i}-1)\right) d_{\pi _k}, \end{aligned}$$
(2.52)

where \(\ell _n^\star \) denotes the number of unpaired half-edges at the moment of counting these paths. It is not hard to see (using the sizes \(m_q\) in (2.20), (2.21)) that \(\ell _n^\star =\ell _n (1-o_{\mathbb {P}}(1))\) when we apply this to a path emanating from \(\mathcal {C}^{\scriptscriptstyle {(q)}}_{t(n^{\varrho _n'})}\).

This formula holds for any fixed sequence of vertices. We can count the expected number of q-bad paths in \(Bad\mathcal {P}_k^{\scriptscriptstyle {(q)}}\) when we impose the same restrictions on \(\pi _i\) as in (2.40), and sum over all possible such options:

$$\begin{aligned} \mathbb {E}_{\scriptscriptstyle {Y}}\left[ \left| Bad\mathcal {P}_k^{\scriptscriptstyle {(q)}}\right| \right] \le {{\mathrm {e}}}^{Ck^2/\ell _n}\sum _{\pi _0 \in \mathcal {C}^{\scriptscriptstyle {(q)}}_{t(n^{\varrho _n'})} }\!\!\!\!d_{\pi _0} \prod _{i=1}^{k-1}\Bigg (\sum _{\pi _i: d_{\pi _i}\le \widehat{u}_i^{\scriptscriptstyle {(q)}}} \frac{d_{\pi _i} (d_{\pi _i}-1)}{\ell _n^\star }\Bigg ) \Bigg (\sum _{{\begin{array}{c} \pi _k \in [n] \\ d_{\pi _k} \ge \widehat{u}_k^{\scriptscriptstyle {(q)}} \end{array}}}\frac{d_{\pi _k}}{\ell _n^\star }\Bigg ).\nonumber \\ \end{aligned}$$
(2.53)

The next step is to estimate the different factors on the rhs of (2.53): here the two proofs diverge. In fact, we can use the bounds in Claim 2.6, which lets us skip all the arguments in [6, (A.6)–(A.9)]. Let \({\mathcal {E}}_n:=\left\{ \sum _{i=1}^{m_q} D_i^\star \le \widehat{u}_0^{\scriptscriptstyle {(q)}} \right\} \), so that \({\mathcal {E}}_n\) holds \(\mathbb {P}_{\scriptscriptstyle {Y}}\)-whp by Claim 2.6, with \(m:=m_q\). Using the estimates (2.43) and (2.44) in Claim 2.6 with \(y_n=\widehat{u}_{i}^{\scriptscriptstyle {(q)}}\), (2.53) turns into

$$\begin{aligned} \mathbb {E}_{\scriptscriptstyle {Y}}\Big [ \left| Bad\mathcal {P}_k^{\scriptscriptstyle {(q)}}\right| \mid {\mathcal {E}}_n\Big ] \le \widehat{u}_0^{\scriptscriptstyle {(q)}} \cdot \left( \widehat{u}_k^{\scriptscriptstyle {(q)}}\right) ^{2-\tau } L_n^{\star , \text {up}}\left( \widehat{u}_{k}^{\scriptscriptstyle {(q)}}\right) \cdot \prod _{i=1}^{k-1} \left( \widehat{u}_{i}^{\scriptscriptstyle {(q)}}\right) ^{3-\tau } \frac{2}{3-\tau }L_n^{\star , \text {up}} \left( \widehat{u}_{i}^{\scriptscriptstyle {(q)}}\right) .\nonumber \\ \end{aligned}$$
(2.54)

This formula replaces [6, (A.11)]. It is an elementary calculation using the defining recursion (2.39) that \(\widehat{u}_j^{\scriptscriptstyle {(q)}} \cdot \left( \widehat{u}_{j+1}^{\scriptscriptstyle {(q)}}\right) ^{3-\tau }h\left( \widehat{u}_j^{\scriptscriptstyle {(q)}}\right) =\widehat{u}_{j+1}^{\scriptscriptstyle {(q)}}\). Applying this equation to \(j=0, \dots , k-2\) sequentially, we arrive at the identity

$$\begin{aligned} \widehat{u}_0^{\scriptscriptstyle {(q)}} \cdot \left( \widehat{u}_k^{\scriptscriptstyle {(q)}}\right) ^{2-\tau } h\left( \widehat{u}_{k-1}^{\scriptscriptstyle {(q)}}\right) \cdot \prod _{i=1}^{k-1} \left( \widehat{u}_{i}^{\scriptscriptstyle {(q)}}\right) ^{3-\tau } h \left( \widehat{u}_{i-1}^{\scriptscriptstyle {(q)}}\right) =1. \end{aligned}$$
(2.55)
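The telescoping identity (2.55) can be confirmed numerically by iterating the recursion (2.39) in logarithmic scale and summing the logarithms of the factors. The sketch below is illustrative; the helper names `log_h`, `hat_log_thresholds`, `log_identity_lhs` and the parameter values are our own assumptions.

```python
import math

def log_h(log_x, c2, tau, gamma):
    """log h(x) from (2.22), written in terms of log x."""
    return 2 * c2 / (tau - 2) ** gamma * log_x ** gamma

def hat_log_thresholds(log_m, c2, tau, gamma, count):
    """Return the logs of the first `count` thresholds from the recursion (2.39)."""
    logs, prev = [], log_m
    for _ in range(count):
        # next threshold = (previous * h(previous)) ** (1 / (tau - 2)), in logs
        prev = (prev + log_h(prev, c2, tau, gamma)) / (tau - 2)
        logs.append(prev)
    return logs

def log_identity_lhs(logs, k, c2, tau, gamma):
    """Logarithm of the left-hand side of (2.55); vanishes up to rounding."""
    total = logs[0] + (2 - tau) * logs[k] + log_h(logs[k - 1], c2, tau, gamma)
    for i in range(1, k):
        total += (3 - tau) * logs[i] + log_h(logs[i - 1], c2, tau, gamma)
    return total

# Illustrative values (assumptions): log m_q = 50, C_2 = 1, tau = 2.5, gamma = 0.5.
logs = hat_log_thresholds(50.0, 1.0, 2.5, 0.5, count=6)
for k in range(1, 6):
    print(k, log_identity_lhs(logs, k, 1.0, 2.5, 0.5))
```

Each printed value is zero up to floating-point error, in line with the identity holding for every \(k\ge 1\).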

Comparing (2.54) to (2.55), we see that (see Footnote 11)

$$\begin{aligned} \mathbb {E}_{\scriptscriptstyle {Y}}\left[ |Bad\mathcal {P}_k^{\scriptscriptstyle {(q)}}| \mid {\mathcal {E}}_n \right] \le \prod _{i=0}^{k-1} \frac{2}{3-\tau } \frac{L_n^{\star , \text {up}} \left( \widehat{u}_{i+1}^{\scriptscriptstyle {(q)}}\right) }{h\left( \widehat{u}_{i}^{\scriptscriptstyle {(q)}}\right) }. \end{aligned}$$
(2.56)

Next we show that for all n large enough, the rhs of (2.56) tends to zero even when summed over all \(k\ge 1\). For this, using the recursion on \(\widehat{u}_{i}^{\scriptscriptstyle {(q)}}\) in (2.39) as well as the function h in (2.22),

$$\begin{aligned} L_n^{\star , \text {up}}\left( \widehat{u}_{i+1}^{\scriptscriptstyle {(q)}}\right)= & {} \exp \left\{ C^\star _1 \left( \log \left( \widehat{u}_{i}^{\scriptscriptstyle {(q)}}\right) ^{1/(\tau -2)} + 2 C_2 \left( \log \left( \widehat{u}_{i}^{\scriptscriptstyle {(q)}}\right) ^{1/(\tau -2)}\right) ^\gamma \right) ^\gamma \right\} \nonumber \\= & {} \exp \left\{ C^\star _1 \left( \log \left( \widehat{u}_{i}^{\scriptscriptstyle {(q)}}\right) ^{1/(\tau -2)}\right) ^\gamma \left( 1+ 2 C_2 \left( \log \left( \widehat{u}_{i}^{\scriptscriptstyle {(q)}}\right) ^{1/(\tau -2)}\right) ^{\gamma -1} \right) ^\gamma \right\} \nonumber \\\le & {} \exp \left\{ \frac{3C^\star _1}{2} \left( \log \left( \widehat{u}_{i}^{\scriptscriptstyle {(q)}}\right) ^{1/(\tau -2)}\right) ^\gamma \right\} , \end{aligned}$$
(2.57)

where in the inequality we have used that, since \(\gamma <1\) and \(\widehat{u}_{i}^{\scriptscriptstyle {(q)}}\) tends to infinity with n, for all n large enough the last factor in the rhs of the second line is at most 3/2. The denominator of the ith factor in (2.56) has the exact same form, except that there the constant multiplier in the exponent is at least \(2C_1^\star \). Thus, the rhs of (2.54) is at most

$$\begin{aligned} \mathbb {E}_{\scriptscriptstyle {Y}}\left[ |Bad\mathcal {P}_k^{\scriptscriptstyle {(q)}}| \mid {\mathcal {E}}_n \right]\le & {} \prod _{i=0}^{k-1} \frac{2}{3-\tau } \exp \left\{ -\frac{C^\star _1}{2} \left( \log \left( \left( \widehat{u}_{i}^{\scriptscriptstyle {(q)}}\right) ^{1/(\tau -2)}\right) \right) ^\gamma \right\} \nonumber \\\le & {} \frac{2^k}{(3-\tau )^k} \exp \left\{ -\frac{C^\star _1}{2}\sum _{i=0}^{k-1} (\tau -2)^{-(i+2)\gamma } \left( \log m_q\right) ^\gamma \right\} ,\qquad \quad \end{aligned}$$
(2.58)

where we used the lower bound \(\widehat{u}_{i}^{\scriptscriptstyle {(q)}}\ge m_q^{(\tau -2)^{-(i+1)}}\) that follows from the recursion (2.39) (see also (2.60) below) to obtain the second line. Since \((\tau -2)^{-\gamma }>1\), the sum in the exponent is of order \((\tau -2)^{-k\gamma }(\log m_q)^\gamma \). So, by Markov’s inequality we obtain for some constant \(C>0\) that

$$\begin{aligned} \mathbb {P}_{\scriptscriptstyle {Y}}\!\left( \exists k\ge 1, Bad\mathcal {P}_k^{\scriptscriptstyle {(q)}} \ne \varnothing \right) \le \sum _{k=1}^\infty \mathbb {E}_{\scriptscriptstyle {Y}}\left[ \left| Bad\mathcal {P}_k^{\scriptscriptstyle {(q)}}\right| \right] \le C \exp \left\{ - C \left( \log m_q\right) ^\gamma \right\} \rightarrow 0\nonumber \\ \end{aligned}$$
(2.59)

as \(n \rightarrow \infty \), since \(m_q\) is a positive power of n under \(\mathbb {P}_{\scriptscriptstyle {Y}}\) (see (2.20), (2.21)). We still have to add the case \(k=0\): note that \(Bad\mathcal {P}_0^{\scriptscriptstyle {(q)}} \ne \varnothing \) means that there exists a vertex in the last generation of the BP with degree at least \(\widehat{u}_{0}^{\scriptscriptstyle {(q)}}\). We have already estimated this probability in (2.49), and the error term obtained in (2.51) can be merged into the rhs of (2.59), establishing the statement of the lemma in (2.41). \(\square \)
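As a side remark, the order of the sum in the exponent of (2.58) can be made explicit. The following is a sketch, with the abbreviation \(a:=(\tau -2)^{-\gamma }>1\) introduced here only for this computation. The sum is geometric, so

$$\begin{aligned} \sum _{i=0}^{k-1} (\tau -2)^{-(i+2)\gamma } = \sum _{i=0}^{k-1} a^{i+2} = a^2\, \frac{a^k-1}{a-1}, \qquad \text{and hence}\qquad a^{k+1} \le \sum _{i=0}^{k-1} a^{i+2} \le \frac{a^{k+2}}{a-1}. \end{aligned}$$

The lower bound keeps only the last term \(i=k-1\). In particular the kth summand in (2.59) is at most \((2/(3-\tau ))^k \exp \{-(C_1^\star /2)\, a^{k+1} (\log m_q)^\gamma \}\), so the doubly exponential decay in k dominates the geometric growth of the prefactor.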

Proof of Proposition 2.1, lower bound

We argue that \(T_q({\beta _n})\) is also whp a lower bound to reach the hubs, that is, there is whp no path to the hubs shorter than \(T_q({\beta _n})\).

On the event \(\left\{ \forall k\in [0, i_{\star \scriptscriptstyle {(q)}}]: Bad\mathcal {P}_k^{\scriptscriptstyle {(q)}} = \varnothing , q\in \{r,b\} \right\} \), that occurs \(\mathbb {P}_{\scriptscriptstyle {Y}}\)-whp by Lemma 2.5, we can use the upper bound \(\widehat{u}_i^{\scriptscriptstyle {(q)}}\) on the degrees at time \(t(n^{\varrho _n'})+i\) for all \(i\le i_{\star \scriptscriptstyle {(q)}}\). Hence we obtain that the time it takes to reach a vertex of degree at least \(n^{(\tau -2)/(\tau -1)}\) is at least

$$\begin{aligned} \widehat{i}_{\star \scriptscriptstyle {(q)}} :=\inf \left\{ i: \widehat{u}_{i}^{\scriptscriptstyle {(q)}}\ge n^{{\beta _n} (\tau -2)}\right\} , \end{aligned}$$

which, considering the double exponential growth of \(\widehat{u}_i^{\scriptscriptstyle {(q)}}\) by powers of \(1/(\tau -2)\), is similar to the definition of \( i_{\star \scriptscriptstyle {(q)}}\) in (2.33). The lower bound follows once we show that \(\widehat{i}_{\star \scriptscriptstyle {(q)}}=i_{\star \scriptscriptstyle {(q)}} \) holds \(\mathbb {P}_{\scriptscriptstyle {Y}}\)-whp. For this it is enough to show that (2.28) holds also for \(\widehat{u}_{i}^{\scriptscriptstyle {(q)}}\). From the recursion (2.39),

$$\begin{aligned} \widehat{u}_i^{\scriptscriptstyle {(q)}} = m_q^{(\tau -2)^{-(i+1)}} \prod _{k=1}^{i+1}h\left( \widehat{u}_{i-k}^{\scriptscriptstyle {(q)}}\right) ^{(\tau -2)^{-k}}. \end{aligned}$$
(2.60)

After a somewhat lengthy calculation, using a similar argument as in the second and third line of (2.57) recursively, we obtain that the product on the rhs is at most

$$\begin{aligned} \begin{aligned}&\exp \left\{ 2 C_2\left( \log \left( m_q^{1/(\tau -2)}\right) \right) ^\gamma \frac{1}{(\tau -2)^{i+1}}\sum _{k=1}^{i} \frac{1}{(\tau -2)^{k(\gamma -1)}} \right\} \\&\quad \le \exp \left\{ \widetilde{C} \left( \log \left( m_q^{1/(\tau -2)}\right) \right) ^\gamma \frac{1}{(\tau -2)^{i+1}} \right\} , \end{aligned} \end{aligned}$$

since \(\gamma <1\). Recall that \(m_q\) tends to infinity with n. Comparing the last bound to (2.60), as well as to the definition of \(\sim \) in (1.5), we see that the product in (2.60) is of much smaller order than the main term \(m_q^{(\tau -2)^{-(i+1)}}\). Thus we obtain that (2.28) holds and also that \(\widehat{i}_{\star \scriptscriptstyle {(q)}}=i_{\star \scriptscriptstyle {(q)}} \) \(\mathbb {P}_{\scriptscriptstyle {Y}}\)-whp. This finishes the proof of the lower bound. \(\square \)

3 Early Meeting is Unlikely

For the lower bound of Theorem 1.8, we crucially use the following proposition, which shows that the two explorations are disjoint, i.e., the vertices at distance at most \(T_b({\beta _n})\) from \(v_b\) are all different from the vertices at distance at most \(T_r({\beta _n})\) from \(v_r\):

Proposition 3.1

Let us consider the configuration model on n vertices with empirical degree distribution that satisfies Assumption 1.1, and let \(v_r, v_b\) be two uniformly chosen vertices. The event

$$\begin{aligned} \mathcal {C}^{\scriptscriptstyle {(r)}}_{T_r({\beta _n})}\cap \mathcal {C}^{\scriptscriptstyle {(b)}}_{T_b({\beta _n})} = \varnothing \end{aligned}$$
(3.1)

holds \(\mathbb {P}_{\scriptscriptstyle {Y}}\)-whp. Further, the total number of half-edges attached to vertices in \(\mathcal {C}^{\scriptscriptstyle {(r)}}_{T_r({\beta _n})}, \mathcal {C}^{\scriptscriptstyle {(b)}}_{T_b({\beta _n})}\) is of the same order of magnitude as the degree of \(v_q^\star \) in Proposition 2.1, up to smaller order correction terms. That is, \(\mathbb {P}_{\scriptscriptstyle {Y}}\)-whp,

$$\begin{aligned} H\left( \mathcal {C}^{\scriptscriptstyle {(r)}}_{T_r({\beta _n})}\right) \sim n^{{\beta _n}(\tau -2)^{b_n^{\scriptscriptstyle {(r)}}({\beta _n})} } \quad \text{ and }\quad H\left( \mathcal {C}^{\scriptscriptstyle {(b)}}_{T_b({\beta _n})}\right) \sim n^{{\beta _n}(\tau -2)^{b_n^{\scriptscriptstyle {(b)}}({\beta _n})} }. \end{aligned}$$
(3.2)

Proof

Recall that we write \(\mathbb {P}_{\scriptscriptstyle {Y}}(\cdot ), \mathbb {E}_{\scriptscriptstyle {Y}}[\cdot ]\) for probabilities of events and expectations of random variables conditioned on \(\mathcal {F}_{\varrho _n'}\). Recall from Lemma 2.5 that the event \(\text {NoBad}:=\left\{ Bad \mathcal {P}^{\scriptscriptstyle {(q)}}_k =\varnothing \ \forall k\le i_{\star \scriptscriptstyle {(q)}} \text{ for } q\in \{r,b\}\right\} \) holds \(\mathbb {P}_Y\)-whp. Since for any event A, \(\mathbb {P}_Y(A)\ge \mathbb {P}_Y(A\mid \text {NoBad}) \mathbb {P}_Y(\text {NoBad})\), it is enough to show that the event in (3.1) holds with probability tending to 1 when conditioned on \(\text {NoBad}\).

To prove the proposition we first calculate the total number of free (unpaired) half-edges going out of the set \(\mathcal {C}^{\scriptscriptstyle {(q)}}_{T_q({\beta _n})-\ell }\) (which we denote by \(H\left( \mathcal {C}^{\scriptscriptstyle {(q)}}_{T_q({\beta _n})-\ell }\right) \)), for any \(\ell \in [0, i_{\star \scriptscriptstyle {(q)}}], q\in \{r,b\}\). We do this by counting the number of paths with free ends: we say that a sequence of vertices and half-edges \((\pi _0, s_0, t_1, \pi _1, s_1, t_2, \dots , t_k, \pi _k, s_k)\) forms a free-ended path of length k in \({\mathrm {CM}}_n(\varvec{d})\) if, for all \(0< i\le k\), the half-edges \(s_i, t_i\) are incident to the vertex \(\pi _i\) and \((s_{i-1}, t_i)\) forms an edge between the vertices \(\pi _{i-1},\pi _i\). Clearly, since the same vertex might be approached on several paths, the total number of free half-edges in \(\mathcal {C}^{\scriptscriptstyle {(q)}}_{T_q({\beta _n})-\ell }\) can be bounded from above by the number of free-ended paths of length \(T_q({\beta _n})-\ell \) starting from \(v_q\). By the definition of \( Bad \mathcal {P}^{\scriptscriptstyle {(q)}}_k\) in (2.40), on the event \(\text {NoBad}\) at time \(t(n^{\varrho _n'}) + i\), \(\widehat{u}_{i}^{\scriptscriptstyle {(q)}}\) defined in (2.39) is an upper bound on the degrees of color q vertices. We write \(\mathcal {N}_k(\mathcal {A}, \text {free})\) for the set of, and \(N_k(\mathcal {A}, \text {free})\) for the total number of, k-length free-ended paths starting from an unpaired half-edge that belongs to the set \(\mathcal {A}\). Then, since \(T_q({\beta _n})=t(n^{\varrho _n'})+i_{\star \scriptscriptstyle {(q)}}\) (see (2.38)), for any \(\ell \le i_{\star \scriptscriptstyle {(q)}}\),

$$\begin{aligned} H\left( \mathcal {C}^{\scriptscriptstyle {(q)}}_{T_q({\beta _n})-\ell }\right) \le N_{i_{\star \scriptscriptstyle {(q)}}-\ell }\left( \mathcal {C}^{\scriptscriptstyle {(q)}}_{t(n^{\varrho _n'})}, \text {free}\right) , \end{aligned}$$
(3.3)

and recall that \(\mathcal {C}^{\scriptscriptstyle {(r)}}_{t(n^{\varrho _n'})}\) is coupled to the branching process described in Sect. 2.1. Hence, the degrees in the last generation of the BP phase are i.i.d. having distribution \(D^\star _n\) satisfying (1.7). When counting free-ended paths through fixed vertices \((\pi _0, \dots , \pi _k)\), (2.52) should be modified so that we have to choose two half-edges also from the end vertex \(\pi _k\), thus there is an additional factor \(d_{\pi _k}-1\) that should multiply (2.52). The effect of this on (2.53) is that the factor containing \(\pi _k\) can be merged into the previous factor:

$$\begin{aligned}&\mathbb {E}_{\scriptscriptstyle {Y}}\left[ N_{i_{\star \scriptscriptstyle {(q)}}-\ell }\left( \mathcal {C}^{\scriptscriptstyle {(q)}}_{t(n^{\varrho _n'})}, \text {free}\right) \mid \text {NoBad} \right] \nonumber \\&\quad \le {{\mathrm {e}}}^{Ci_{\star \scriptscriptstyle {(q)}}^2 /\ell _n}\!\!\!\sum _{\pi _0 \in \mathcal {C}^{\scriptscriptstyle {(q)}}_{t(n^{\varrho _n'})} } \!\!\!\!d_{\pi _0} \cdot \prod _{i=1}^{i_{\star \scriptscriptstyle {(q)}}-\ell }\left( \sum _{\pi _i \in \Lambda _i} \frac{d_{\pi _i} (d_{\pi _i}-1)}{\ell _n}\right) , \end{aligned}$$
(3.4)

where we have applied the restriction that is valid under the event \(\text {NoBad}\): \(\pi _i\in \Lambda _i\), with \(\Lambda _i=\left\{ v \in [n]: D_v \le \widehat{u}_i^{\scriptscriptstyle {(q)}} \right\} \). Note that we could again use that \(\ell _n^\star =\ell _n (1-o_{\mathbb {P}}(1))\), by the same argument that was used after formula (2.52). Using (2.42) and (2.44) from Claim 2.6, we obtain that

$$\begin{aligned} \mathbb {E}_{\scriptscriptstyle {Y}}\Big [ N_{i_{\star \scriptscriptstyle {(q)}}-\ell }\left( \mathcal {C}^{\scriptscriptstyle {(q)}}_{t(n^{\varrho _n'})}, \text {free}\right) \mid \text {NoBad}\Big ] \le \widehat{u}_0^{\scriptscriptstyle {(q)}} \cdot \left( \prod _{i=1}^{i_{\star \scriptscriptstyle {(q)} }-\ell } \frac{2}{3-\tau }\left( \widehat{u}_i^{\scriptscriptstyle {(q)}}\right) ^{3-\tau } L_n^{\star , \text {up}}\left( \widehat{u}_i^{\scriptscriptstyle {(q)}}\right) \right) {{\mathrm {e}}}^{2 i_{\star \scriptscriptstyle {(q)}}^2/ \ell _n }.\nonumber \\ \end{aligned}$$
(3.5)

Note that this is similar to (2.54). Indeed, we again sequentially apply the identity \(\widehat{u}_j^{\scriptscriptstyle {(q)}} \cdot (\widehat{u}_{j+1}^{\scriptscriptstyle {(q)}})^{3-\tau }h(\widehat{u}_j^{\scriptscriptstyle {(q)}})=\widehat{u}_{j+1}^{\scriptscriptstyle {(q)}}\), and then (3.5) turns into

$$\begin{aligned} \mathbb {E}_{\scriptscriptstyle {Y}}\Big [ N_{i_{\star \scriptscriptstyle {(q)}}-\ell }\left( \mathcal {C}^{\scriptscriptstyle {(q)}}_{t(n^{\varrho _n'})}, \text {free}\right) \mid \text {NoBad} \Big ] \le \widehat{u}_{ i_{\star \scriptscriptstyle {(q)} }-\ell }^{\scriptscriptstyle {(q)}} \left( \prod _{i=0}^{ i_{\star \scriptscriptstyle {(q)} }-\ell -1 } \frac{2}{3-\tau } \frac{L_n^{\star , \text {up}} \left( \widehat{u}_{i+1}^{\scriptscriptstyle {(q)}}\right) }{h\left( \widehat{u}_{i}^{\scriptscriptstyle {(q)}}\right) } \right) {{\mathrm {e}}}^{2 i_{\star \scriptscriptstyle {(q)}}^2/ \ell _n }.\nonumber \\ \end{aligned}$$
(3.6)

Combining this with Markov’s inequality and a union bound gives

$$\begin{aligned}&\mathbb {P}_{\scriptscriptstyle {Y}}\left( \exists \ell \le i_{\star \scriptscriptstyle {(q)}}: H\left( \mathcal {C}^{\scriptscriptstyle {(q)}}_{T_q({\beta _n})-\ell }\right) \ge \widehat{u}_{ i_{\star \scriptscriptstyle {(q)} }-\ell }^{\scriptscriptstyle {(q)}}\mid \text {NoBad} \right) \nonumber \\&\qquad \le {{\mathrm {e}}}^{2 i_{\star \scriptscriptstyle {(q)}}^2/ \ell _n } \sum _{\ell =0}^{i_{\star \scriptscriptstyle {(q)}}}\left( \prod _{i=0}^{ i_{\star \scriptscriptstyle {(q)} }-\ell -1 } \frac{2}{3-\tau } \frac{L_n^{\star , \text {up}} \left( \widehat{u}_{i+1}^{\scriptscriptstyle {(q)}}\right) }{h\left( \widehat{u}_{i}^{\scriptscriptstyle {(q)}}\right) } \right) . \end{aligned}$$
(3.7)
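The step from (3.6) to (3.7) can be unpacked as follows. This is a sketch, and the shorthand \(H_\ell :=H\big (\mathcal {C}^{\scriptscriptstyle {(q)}}_{T_q({\beta _n})-\ell }\big )\), \(N_\ell :=N_{i_{\star \scriptscriptstyle {(q)}}-\ell }\big (\mathcal {C}^{\scriptscriptstyle {(q)}}_{t(n^{\varrho _n'})}, \text {free}\big )\) and \(u_\ell :=\widehat{u}^{\scriptscriptstyle {(q)}}_{i_{\star \scriptscriptstyle {(q)}}-\ell }\) is introduced here only for readability. Since \(i_{\star \scriptscriptstyle {(q)}}\) is measurable with respect to \(\mathcal {F}_{\varrho _n'}\), it acts as a constant under \(\mathbb {P}_{\scriptscriptstyle {Y}}\). A union bound over \(\ell \), the inequality \(H_\ell \le N_\ell \) from (3.3), and Markov's inequality then give

$$\begin{aligned} \mathbb {P}_{\scriptscriptstyle {Y}}\left( \exists \ell \le i_{\star \scriptscriptstyle {(q)}}: H_\ell \ge u_\ell \mid \text {NoBad}\right) \le \sum _{\ell =0}^{i_{\star \scriptscriptstyle {(q)}}} \mathbb {P}_{\scriptscriptstyle {Y}}\left( N_\ell \ge u_\ell \mid \text {NoBad}\right) \le \sum _{\ell =0}^{i_{\star \scriptscriptstyle {(q)}}} \frac{\mathbb {E}_{\scriptscriptstyle {Y}}\left[ N_\ell \mid \text {NoBad}\right] }{u_\ell }. \end{aligned}$$

By (3.6), the prefactor \(u_\ell \) in the bound on \(\mathbb {E}_{\scriptscriptstyle {Y}}[N_\ell \mid \text {NoBad}]\) cancels against the denominator, which leaves the sum on the rhs of (3.7).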

Recall again that \(i_{\star \scriptscriptstyle {(q)}}\) is a tight random variable measurable with respect to \(\mathcal {F}_{\varrho _n'}\) (see (2.34) and (2.36)), and \(\ell _n=\mathbb {E}[D_n]n\) is of order n. Thus the first factor on the rhs is \(1+o_{\mathbb {P}_{\scriptscriptstyle {Y}}}(1)\). Further, in the analysis below (2.56) we have shown that the sum in the rhs of (3.7) is at most the rhs of (2.59). Thus we obtain

$$\begin{aligned} \mathbb {P}_{\scriptscriptstyle {Y}}\left( \exists \ell \le i_{\star \scriptscriptstyle {(q)}}: H\left( \mathcal {C}^{\scriptscriptstyle {(q)}}_{T_q({\beta _n})-\ell }\right) \ge \widehat{u}_{ i_{\star \scriptscriptstyle {(q)} }-\ell }^{\scriptscriptstyle {(q)}}\mid \text {NoBad} \right) \le C\exp \left\{ -C (\log m_q)^\gamma \right\} . \end{aligned}$$
(3.8)

Now, to see that \(\mathcal {C}^{\scriptscriptstyle {(r)}}_{T_r({\beta _n})}\) and \(\mathcal {C}^{\scriptscriptstyle {(b)}}_{T_b({\beta _n})}\) are disjoint, we apply the following procedure. It is easy to see that \(H\left( \mathcal {C}^{\scriptscriptstyle {(r)}}_{T_r({\beta _n})-\ell }\right) \) is maximised at \(\ell =0\). Hence, we grow the red cluster first until time \(T_r({\beta _n})\), and then stop it. Then, we grow the blue cluster step by step, looking at the pairs of half-edges (see Footnote 12) in \(\mathcal {H}\left( \mathcal {C}^{\scriptscriptstyle {(b)}}_1\right) , \mathcal {H}\left( \mathcal {C}^{\scriptscriptstyle {(b)}}_2\right) , \dots , \mathcal {H}\left( \mathcal {C}^{\scriptscriptstyle {(b)}}_{T_b({\beta _n})-1}\right) \), and at each step we check whether any of the half-edges paired are actually paired to a red half-edge. If this happens at any time before or at \(T_b({\beta _n})-1\), then an early connection happens and the distance is at most \(T_b({\beta _n})+T_r({\beta _n})\). (Note that the distance is \(T_r({\beta _n})+i\) if we pair a blue half-edge attached to \(\mathcal {C}^{\scriptscriptstyle {(b)}}_{i-1}\) to a red half-edge.)

The probability that there is a connection before or at \(t(n^{\varrho _n'})\) is of the same order of magnitude as the probability that there is a connection at time \(t(n^{\varrho _n'})\), since the total degree in the whole BP is the same order of magnitude as the total degree in the last generation, thus it is enough to investigate the probability that \(\mathcal {H}\left( \mathcal {C}^{\scriptscriptstyle {(b)}}_{T_b({\beta _n})-\ell }\right) \) connects to \(\mathcal {H}\left( \mathcal {C}^{\scriptscriptstyle {(r)}}_{T_r({\beta _n})}\right) \) for some \(\ell \le i_{\star \scriptscriptstyle {(b)}}\). This probability is at most

$$\begin{aligned} \mathbb {P}_{\scriptscriptstyle {Y}}\left( \mathcal {C}^{\scriptscriptstyle {(b)}}_{T_b({\beta _n})-\ell } \leftrightarrow \mathcal {C}^{\scriptscriptstyle {(r)}}_{T_r({\beta _n})} \mid H\left( \mathcal {C}^{\scriptscriptstyle {(b)}}_{T_b({\beta _n})-\ell }\right) , H\left( \mathcal {C}^{\scriptscriptstyle {(r)}}_{T_r({\beta _n})}\right) \right) \le \frac{H\left( \mathcal {C}^{\scriptscriptstyle {(b)}}_{T_b({\beta _n})-\ell }\right) H\left( \mathcal {C}^{\scriptscriptstyle {(r)}}_{T_r({\beta _n})}\right) }{\ell _n(1+o(1))}.\nonumber \\ \end{aligned}$$
(3.9)
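The bound (3.9) can be read as a union bound over half-edges. The following is only a sketch, relying on the standard description of the configuration model in which each half-edge is paired with a uniformly chosen half-edge among those still unpaired; the shorthand \(H_b:=H\big (\mathcal {C}^{\scriptscriptstyle {(b)}}_{T_b({\beta _n})-\ell }\big )\) and \(H_r:=H\big (\mathcal {C}^{\scriptscriptstyle {(r)}}_{T_r({\beta _n})}\big )\) is introduced here for brevity. At the time of pairing there are \(\ell _n(1+o(1))\) unpaired half-edges, of which \(H_r\) are red, so a given blue half-edge is paired to a red one with probability at most \(H_r/(\ell _n(1+o(1)))\). Summing over the \(H_b\) blue half-edges,

$$\begin{aligned} \mathbb {P}_{\scriptscriptstyle {Y}}\left( \text {some blue half-edge is paired to a red half-edge} \mid H_b, H_r \right) \le H_b \cdot \frac{H_r}{\ell _n(1+o(1))}. \end{aligned}$$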

Let us write \(\mathcal {D}_n:=\left\{ H\left( \mathcal {C}^{\scriptscriptstyle {(b)}}_{T_b({\beta _n})-\ell }\right) \le \widehat{u}_{ i_{\star \scriptscriptstyle {(b)} }-\ell }^{\scriptscriptstyle {(b)}}, \forall \ell \in [i_{\star \scriptscriptstyle {(b)} }]\right\} \). Then by (3.7), \(\mathcal {D}_n\) happens \(\mathbb {P}_{\scriptscriptstyle {Y}}\)-whp. Using this, we sum the bound on the rhs over \(\ell \in [i_{\star \scriptscriptstyle {(b)}}]\), using (2.28), to obtain that

$$\begin{aligned} \mathbb {P}_{\scriptscriptstyle {Y}}\Big ( \mathcal {C}^{\scriptscriptstyle {(r)}}_{T_r({\beta _n})} \cap \mathcal {C}^{\scriptscriptstyle {(b)}}_{T_b({\beta _n})}\ne \varnothing \mid \mathcal {D}_n \Big ) \le \frac{\widehat{u}_{ i_{\star \scriptscriptstyle {(r)} }}^{\scriptscriptstyle {(r)}} }{\ell _n}\sum _{\ell =1}^{ i_{\star \scriptscriptstyle {(b)}} }\widehat{u}_{ i_{\star \scriptscriptstyle {(b)} }-\ell }^{\scriptscriptstyle {(b)}} \lesssim \frac{n^{{\beta _n}(\tau -2)^{b_n^{\scriptscriptstyle {(r)}}({\beta _n})}}}{\ell _n} \sum _{\ell =1}^{i_{\star \scriptscriptstyle {(b)}}}n^{{\beta _n}(\tau -2)^{b_n^{\scriptscriptstyle {(b)}}({\beta _n})+\ell }},\nonumber \\ \end{aligned}$$
(3.10)

where we recall that \(\lesssim \) means inequality up to multiplicative factors that are of order at most \(\exp \{ (\log n)^\theta \}\) for some \(\theta \in [0,1)\), as in Definition 1.5. Since \(\ell _n=\mathbb {E}[D_n] n\) is of order n, the exponent of n in the dominant term in the numerator is

$$\begin{aligned} {\beta _n}\left( (\tau -2)^{b_n^{\scriptscriptstyle {(r)}}({\beta _n})}+ (\tau -2)^{b_n^{\scriptscriptstyle {(b)}}({\beta _n})+1}\right) < 1, \end{aligned}$$

as long as \({\beta _n}<1/(\tau -1)\), since \(b_n^{\scriptscriptstyle {(b)}}({\beta _n}) , b_n^{\scriptscriptstyle {(r)}}({\beta _n}) \in [0,1)\). When \({\beta _n}=1/(\tau -1)\), the strict inequality still holds as long as \((b_n^{\scriptscriptstyle {(b)}}({\beta _n}),b_n^{\scriptscriptstyle {(r)}}({\beta _n}))\ne (0,0)\). Indeed, \(b_n^{\scriptscriptstyle {(q)}}({\beta _n})=0\) is only possible if \(Y_q^{\scriptscriptstyle {(n)}}\) takes values in a measure 0 discrete set, see (1.15), and the probability of the latter event tends to 0 under Assumption 1.7.
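The inequality for the exponent can be checked in one line. This is a sketch, using only that \(0<\tau -2<1\) and \(b_n^{\scriptscriptstyle {(r)}}({\beta _n}), b_n^{\scriptscriptstyle {(b)}}({\beta _n})\in [0,1)\). Since \(x\mapsto (\tau -2)^x\) is decreasing,

$$\begin{aligned} (\tau -2)^{b_n^{\scriptscriptstyle {(r)}}({\beta _n})} \le 1 \quad \text{and}\quad (\tau -2)^{b_n^{\scriptscriptstyle {(b)}}({\beta _n})+1}\le \tau -2, \qquad \text{so}\qquad {\beta _n}\left( (\tau -2)^{b_n^{\scriptscriptstyle {(r)}}({\beta _n})}+ (\tau -2)^{b_n^{\scriptscriptstyle {(b)}}({\beta _n})+1}\right) \le {\beta _n}(\tau -1)\le 1 \end{aligned}$$

whenever \({\beta _n}\le 1/(\tau -1)\). Equality throughout requires \({\beta _n}=1/(\tau -1)\) together with \(b_n^{\scriptscriptstyle {(r)}}({\beta _n})=b_n^{\scriptscriptstyle {(b)}}({\beta _n})=0\), which is exactly the exceptional case discussed above.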

For large enough n the multiplicative factors hidden in the \(\lesssim \) sign on the rhs of (3.10) are negligible, thus the rhs of (3.10) tends to zero with n. This finishes the proof of the proposition.

4 Distances in the Graph

Proof of Theorem 1.8

The lower bound is easier, since we can use the first moment method (i.e., Markov’s inequality) on the number of paths emanating from \(H({\mathcal {R}}_{T_r({\beta _n})})\) and \(H(\mathcal B_{T_b({\beta _n})})\) and connecting to each other to obtain a lower bound. Thus, let us start counting paths of length \(z+1\) (that is, z vertices in between) connecting \(\mathcal {H}\left( \mathcal {C}^{\scriptscriptstyle {(q)}}_{T_q({\beta _n})}\right) \), for \(q\in \{r,b\}\). Starting with (2.52), the restriction now is that \(\pi _0 \in \mathcal {C}^{\scriptscriptstyle {(r)}}_{T_r({\beta _n})}\), while \(\pi _{z+1} \in \mathcal {C}^{\scriptscriptstyle {(b)}}_{T_b({\beta _n})}\), and there are no restrictions on the in-between vertices. Thus, we obtain a similar formula as in (2.53), except that the restrictions on the vertices \(\pi _i\) are now different:

$$\begin{aligned} \begin{aligned}&\mathbb {E}_{\scriptscriptstyle {Y}}\Big [N_z\big ({\mathcal {C}}^{\scriptscriptstyle {(r)}}_{T_r({\beta _n})}, \mathcal C^{\scriptscriptstyle {(b)}}_{T_b({\beta _n})}\big )\mid \text {NoBad}\Big ] \\&\quad \le {{\mathrm {e}}}^{Cz^2 /\ell _n}\!\!\!\!\!\sum _{\pi _0 \in \mathcal {C}^{\scriptscriptstyle {(r)}}_{T_r({\beta _n})} } \!\!\!\!d_{\pi _0} \cdot \prod _{i=1}^{z }\left( \sum _{\pi _i \in [n]} \frac{d_{\pi _i} (d_{\pi _i}-1)}{\ell _n^\star }\right) \cdot \sum _{\pi _{z+1} \in \mathcal {C}^{\scriptscriptstyle {(b)}}_{T_b({\beta _n})} } \frac{d_{\pi _{z+1}}}{\ell _n^\star }. \end{aligned} \end{aligned}$$
(4.1)

We have to check that in this case \(\ell _n^\star =\ell _n(1-o_{\mathbb {P}}(1))\) is still satisfied. This follows from the proof of Proposition 3.1 and the fact that Lemma 2.5 holds and thus the total number of used half-edges can be bounded from above as the sum of \(\sim \widehat{u}_{i}^{\scriptscriptstyle {(q)}}\) over \(i\le i_{\star \scriptscriptstyle {(q)}}\) and \(q\in \{r,b\}\).

The first and last sums are handled by (3.2) in Proposition 3.1. We would like to estimate the sums within the product sign on the rhs. For this, recall that \(D_n^\star \) stands for the degree minus one of the vertex to which a uniformly chosen half-edge (out of the \(\ell _n\) half-edges) is attached. \(D_n^\star \) then follows the distribution \(F_n^\star \), see (1.6). By Claim 2.6, (2.44), the size-biased empirical moment of \(D_n^\star \) is

$$\begin{aligned} \mathbb {E}[D_n^\star ]=\sum _{\begin{array}{c} v\in [n]\\ d_v\le n^{{\beta _n}} \end{array}} \frac{d_v (d_v-1)}{\ell _n}\le n^{{\beta _n}(3-\tau )} L_n^{\star , \text {up} }(n^{{\beta _n}})\lesssim n^{{\beta _n}(3-\tau )}. \end{aligned}$$
(4.2)

Applying this inequality on the rhs of (4.1) yields that

$$\begin{aligned} \mathbb {E}_{\scriptscriptstyle {Y}}\left[ N_z\Big (\mathcal C^{\scriptscriptstyle {(r)}}_{T_r({\beta _n})}, \mathcal C^{\scriptscriptstyle {(b)}}_{T_b({\beta _n})}\Big )\mid \text {NoBad}\right] \lesssim n^{-1}n^{{\beta _n} (\tau -2)^{b_n^{\scriptscriptstyle {(r)}}({\beta _n})}} n^{{\beta _n} (\tau -2)^{b_n^{\scriptscriptstyle {(b)}}({\beta _n})}} n^{z {\beta _n} (3-\tau )}. \end{aligned}$$
(4.3)

By Markov’s inequality, the probability that there is at least one path connecting \({\mathcal {C}}^{\scriptscriptstyle {(r)}}_{T_r({\beta _n})}, {\mathcal {C}}^{\scriptscriptstyle {(b)}}_{T_b({\beta _n})}\) with \(z+1\) edges can be bounded from above by the expected number of connections, so we obtain the bound

$$\begin{aligned} \mathbb {P}_{\scriptscriptstyle {Y}}\left( \mathrm {d}_G\Big (\mathcal C^{\scriptscriptstyle {(r)}}_{T_r({\beta _n})}, \mathcal C^{\scriptscriptstyle {(b)}}_{T_b({\beta _n})}\Big ) \le z+1 \mid \text {NoBad}\right) \lesssim n^{-1+{\beta _n} \left( (\tau -2)^{b_n^{\scriptscriptstyle {(r)}}({\beta _n})} + (\tau -2)^{b_n^{\scriptscriptstyle {(b)}}({\beta _n})}+z(3-\tau )\right) } .\nonumber \\ \end{aligned}$$
(4.4)

So, the two clusters are \(\mathbb {P}_{\scriptscriptstyle {Y}}\)-whp disjoint as long as this quantity tends to zero. The smallest value of \(z\in \mathbb {N}\) for which the rhs of (4.4) does not tend to zero is

$$\begin{aligned} \begin{aligned} z^\star _n&:= \inf \left\{ z\in \mathbb {N}: (\tau -2)^{b_n^{\scriptscriptstyle {(r)}}({\beta _n})} + (\tau -2)^{b_n^{\scriptscriptstyle {(b)}}({\beta _n})}+ z(3-\tau ) > 1/{\beta _n} \right\} \\&\ =\left\lceil \frac{1/{\beta _n}-(\tau -2)^{b_n^{\scriptscriptstyle {(r)}}({\beta _n})}-(\tau -2)^{b_n^{\scriptscriptstyle {(b)}}({\beta _n})}}{3-\tau } \right\rceil . \end{aligned} \end{aligned}$$
(4.5)
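As a purely illustrative numerical check of (4.5), consider the following hypothetical parameter values, chosen here for convenience and not taken from the paper: \(\tau =5/2\) and \(b_n^{\scriptscriptstyle {(r)}}({\beta _n})=b_n^{\scriptscriptstyle {(b)}}({\beta _n})=1/2\), so that \((\tau -2)^{1/2}=2^{-1/2}\approx 0.7071\) and \(3-\tau =1/2\). Then

$$\begin{aligned} {\beta _n}=\tfrac{1}{4}:\quad z_n^\star =\left\lceil \frac{4-2\cdot 0.7071}{1/2}\right\rceil =\lceil 5.1716\rceil =6, \qquad \qquad {\beta _n}=\tfrac{2}{3}:\quad z_n^\star =\left\lceil \frac{1.5-2\cdot 0.7071}{1/2}\right\rceil =\lceil 0.1716\rceil =1. \end{aligned}$$

In the first case one can verify the infimum directly: for \(z=5\) the left-hand side of the defining inequality is \(1.4142+2.5=3.9142\le 4\), while for \(z=6\) it is \(4.4142>4\). The example also shows how a smaller \({\beta _n}\) (a stronger truncation) increases the number of intermediate vertices needed.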

Since \(z_n^\star \) counts the number of vertices needed between \(\mathcal {C}^{\scriptscriptstyle {(r)}}_{T_r({\beta _n})}\) and \(\mathcal {C}^{\scriptscriptstyle {(b)}}_{T_b({\beta _n})}\), and we would like to count the number of edges (see Footnote 13), as long as the number of edges between \(\mathcal {C}^{\scriptscriptstyle {(r)}}_{T_r({\beta _n})}\) and \(\mathcal {C}^{\scriptscriptstyle {(b)}}_{T_b({\beta _n})}\) is at most \(z_n^\star \), the bound in (4.4) tends to zero as \(n\rightarrow \infty \). This in turn means that \(\mathbb {P}_{\scriptscriptstyle {Y}}\)-whp there is no path of length \(T_r({\beta _n})+T_b({\beta _n})+z_n^\star \) connecting \(v_r, v_b\). Thus, we obtain that \(\mathbb {P}_{\scriptscriptstyle {Y}}\)-whp:

$$\begin{aligned} {\mathrm {d}}_G(v_r, v_b)\ge T_r({\beta _n})+ T_b({\beta _n})+\left\lceil \frac{1/{\beta _n} -(\tau -2)^{b_n^{\scriptscriptstyle {(r)}}({\beta _n})}-(\tau -2)^{b_n^{\scriptscriptstyle {(b)}}({\beta _n})}}{3-\tau } \right\rceil +1.\nonumber \\ \end{aligned}$$
(4.6)

This completes the proof of the lower bound on \(\mathrm {d}_G(v_r, v_b)\) in Theorem 1.8. For the upper bound on \(\mathrm {d}_G(v_r, v_b)\), we expect the existence of a path whose length equals the rhs of (4.6). To show this, we apply the second moment method. Recall that we have already constructed paths of length \(T_q({\beta _n})\) between \(v_q\) and a vertex \(v_q^\star \), where \(d_{v_q^\star }\) is as in (2.37). We calculate the expected number and variance of paths of length \(z^\star +1\) connecting \(v_r^\star \) to \(v_b^\star \), with certain restrictions. Namely, the formula for the variance turns out to be simpler if we count paths where the ith vertex on the path falls into a different (and disjoint) set for all \(i\ge 0\). The reason why the variance is easier to calculate is that two such paths can overlap only in fairly simple ways (see [6, Figure 7]).

Note that since \(b_n^{\scriptscriptstyle {(b)}}, b_n^{\scriptscriptstyle {(r)}}\in [0,1)\),

$$\begin{aligned} z^\star _n+2 \le \left\lceil \frac{1/{\beta _n}-2(\tau -2)}{3-\tau }\right\rceil +2=:M_{\beta _n}. \end{aligned}$$
(4.7)
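The inequality in (4.7) follows from a short monotonicity argument, sketched here using only \(0<\tau -2<1\) and \(b_n^{\scriptscriptstyle {(q)}}\in [0,1)\). Since \(x\mapsto (\tau -2)^x\) is decreasing, \((\tau -2)^{b_n^{\scriptscriptstyle {(q)}}}> \tau -2\) for both \(q\in \{r,b\}\), and therefore

$$\begin{aligned} \frac{1/{\beta _n}-(\tau -2)^{b_n^{\scriptscriptstyle {(r)}}({\beta _n})}-(\tau -2)^{b_n^{\scriptscriptstyle {(b)}}({\beta _n})}}{3-\tau } \le \frac{1/{\beta _n}-2(\tau -2)}{3-\tau }. \end{aligned}$$

Taking ceilings, which preserves the inequality, and adding 2 to both sides gives (4.7) by the formula for \(z_n^\star \) in (4.5).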

Now we divide the set of vertices into \(M_{\beta _n}\) many roughly equal disjoint sets. We denote the ith set by \(\Delta _i\). By roughly equal we mean that the following inequalities hold for some \(0<c_1< c_2<\infty \):

$$\begin{aligned} \nu _i^{\text {new}}:=&\sum _{v\in \Delta _i} \frac{d_v (d_v-1)}{\ell _n} \in \left[ \frac{c_1}{M_{\beta _n}}, \frac{c_2}{M_{\beta _n}}\right] \cdot \mathbb {E}[D_n^\star ] \end{aligned}$$
(4.8)
$$\begin{aligned} \kappa _i^{\text {new}}:=&\sum _{v\in \Delta _i} \frac{d_v (d_v-1)(d_v-2)}{\ell _n} \in \left[ \frac{c_1}{M_{\beta _n}}, \frac{c_2}{M_{\beta _n}}\right] \cdot \mathbb {E}[D_n^\star (D_n^\star -1)]. \end{aligned}$$
(4.9)

This can be done as long as we distribute the vertices in the intervals \([n^{{\beta _n}(1-\varepsilon )}, n^{{\beta _n}}]\) in an approximately uniform way. We require that \(v_r^\star \in \Delta _0\) and \(v_b^\star \in \Delta _{M_{\beta _n}}\) in the partitioning. We will count the paths on vertices \((v_r^\star :=v_0, v_1, \dots , v_{z}, v_{z+1}:=v_b^\star )\) that satisfy the property that the jth vertex falls into \(\Delta _j\) when \(j\le z/2\) and it falls into \(\Delta _{M_{\beta _n}+1-j}\) when \(j>z/2\) (see Footnote 14). As a result of the restriction, the proof of [6, Lemma 7.1, (7.5)] applies word by word. This proof bounds the expected number and variance of restricted paths between vertices a, b with k vertices in between. In this proof, we only need to replace \(\nu _i\) by \(\nu _i^{\text {new}}\) as in (4.8) and \(\kappa _i\) by \(\kappa _i^{\text {new}}\) as in (4.9). In our case the degree of vertex \(a:=v_r^\star \) is \(d_a:=u_{i_{\star \scriptscriptstyle {(r)}}}^{\scriptscriptstyle {(r)}}\) while the degree of vertex \(b:=v_b^\star \) is \(d_b:=u_{i_{\star \scriptscriptstyle {(b)}}}^{\scriptscriptstyle {(b)}}\).

First we need a lower bound on the expected number of paths of length \(z+1\) between \(v_r^\star , v_b^\star \). We expect it to be of a similar order of magnitude as the upper bound in (4.3). The differences between the lower and the upper bound (see (4.1)) are the following: (1) the first and last factors in (4.1) change in the lower bound, since we only count the degree of \(v_q^\star \); (2) in the middle factor in (4.1) we have to apply the restriction that \(\pi _j\in \Delta _{j}\) for \(j<z/2\) while \(\pi _j\in \Delta _{M_{\beta _n}-j}\) for \(j> z/2\), instead of summing over all vertices in [n].

Using (2.3) combined with (4.8), we obtain the upper and lower bound

$$\begin{aligned} \mathbb {E}_{\scriptscriptstyle {Y}}[N_{z_n^\star }(v_r^\star , v_b^\star )] \ {\buildrel \in \over \sim }\ \frac{n^{{\beta _n} (\tau -2)^{b_n^{\scriptscriptstyle {(b)}}({\beta _n})}} n^{{\beta _n} (\tau -2)^{b_n^{\scriptscriptstyle {(r)}}({\beta _n})}}}{\ell _n} n^{z {\beta _n} (3-\tau )} \cdot \left[ (c_1/M_{\beta _n})^{z}, (c_2/M_{\beta _n})^{z}\right] ,\nonumber \\ \end{aligned}$$
(4.10)

where \({\buildrel \in \over \sim }\) means containment in an interval, where an additional factor of order at most \(\exp \{\pm (\log n^{\beta _n})^\theta \}\) for some \(\theta <1\), as defined in Definition 1.5, might multiply the prefactors of the interval. This additional factor comes from the degree of \(v_q^\star \) not being precisely equal to \(n^{{\beta _n}(\tau -2)^{b_n^{\scriptscriptstyle {(q)}}({\beta _n})}}\), as well as from \(\mathbb {E}[D_n^\star ]\) in (4.2) not being precisely equal to \(n^{{\beta _n}(3-\tau )}\).

By the definitions of \(M_{\beta _n}\) and \(z_n^\star \) in (4.7) and (4.5) and the bound \(\beta _n\ge (\log n)^{-\gamma }\) for some \(\gamma < 1\), for \(i\in \{1,2\}\), for all \(z\le z_n^\star \),

$$\begin{aligned} (c_i/M_{\beta _n})^{z}\ge \exp \left\{ -\gamma \log \log n \cdot (\log n)^\gamma +\log c_i (\log n)^\gamma \right\} \ge \exp \{ -(\log n)^{\theta }\} \end{aligned}$$

for any \(\theta \in (\gamma ,1)\) and n sufficiently large, so the lower bound on \(\mathbb {E}_{\scriptscriptstyle {Y}}[N_{z^\star }(v_r^\star , v_b^\star )]\) in (4.10) fits Definition 1.5 when we use that \(\ell _n\) is of order n. Combining this with the upper bound in (4.3), we arrive at the desired

$$\begin{aligned} \mathbb {E}_{\scriptscriptstyle {Y}}[N_{z}(v_r^\star , v_b^\star )] \sim n^{-1} n^{{\beta _n} (\tau -2)^{b_n^{\scriptscriptstyle {(b)}}({\beta _n})}} n^{{\beta _n} (\tau -2)^{b_n^{\scriptscriptstyle {(r)}}({\beta _n})}} n^{z {\beta _n} (3-\tau )}. \end{aligned}$$
(4.11)

The smallest value of z for which this expression tends to infinity (and not to 0) as \(n\rightarrow \infty \) is precisely \(z_n^\star \) defined in (4.5).

By Chebyshev’s inequality,

$$\begin{aligned} \mathbb {P}_{\scriptscriptstyle {Y}}(N_{z_n^\star }(v_r^\star , v_b^\star ) = 0) \le \frac{\mathrm{Var}_Y[N_{z_n^\star }(v_r^\star , v_b^\star ) ]}{\mathbb {E}_{\scriptscriptstyle {Y}}[N_{z_n^\star }(v_r^\star , v_b^\star ) ]^2} \end{aligned}$$
(4.12)

and thus, to show the \(\mathbb {P}_{\scriptscriptstyle {Y}}\)-whp existence of at least one path of length \(z_n^\star +1\), it is enough to show that the variance is of smaller order than the expectation squared.
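For completeness, here is a sketch of the standard argument behind (4.12), with the shorthand \(N:=N_{z_n^\star }(v_r^\star , v_b^\star )\) introduced here and assuming \(\mathbb {E}_{\scriptscriptstyle {Y}}[N]>0\). If \(N=0\), then N deviates from its mean by exactly \(\mathbb {E}_{\scriptscriptstyle {Y}}[N]\), so

$$\begin{aligned} \mathbb {P}_{\scriptscriptstyle {Y}}(N=0)\le \mathbb {P}_{\scriptscriptstyle {Y}}\Big ( \big | N-\mathbb {E}_{\scriptscriptstyle {Y}}[N]\big | \ge \mathbb {E}_{\scriptscriptstyle {Y}}[N]\Big ) \le \frac{\mathrm{Var}_Y[N]}{\mathbb {E}_{\scriptscriptstyle {Y}}[N]^2}, \end{aligned}$$

where the last step is Chebyshev's inequality applied at the threshold \(\mathbb {E}_{\scriptscriptstyle {Y}}[N]\).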

Next we calculate the variance of \(N_{z_n^\star }(v_r^\star , v_b^\star )\). Here we rely on the proof of [6, Lemma 7.1]. Unfortunately, the appendix of that paper does not state the variance independently of the statement of [6, Lemma 7.1]. However, one can follow the derivation of the variance word by word, starting from the formula before [6, (A.19)] until [6, (A.25)]. The first occurrence where the setting of this paper deviates from that paper is [6, (A.26)]: in this equation, in the last error factor \((1+\varepsilon )^k\), \(\varepsilon \) can be set to 0. This can be done since the events in \({\mathcal {E}}'_n\) defined in [6, (A.15)] hold for \(\varepsilon =0\) under Assumption 1.1.

A crucial observation in the proof there is the inequality in [6, (A.27)] stating that \(\kappa _i/\nu _i^2 \le \kappa _1/\nu _1^2\). This allows one to replace every occurrence of \(\kappa _i/\nu _i^2\) by \(\kappa _1/\nu _1^2\) in that proof. This inequality is not valid in our case; however, with our choice of the \(\Delta _i\) it is true that

$$\begin{aligned} \frac{\kappa _i^{\text {new}}}{(\nu _i^{\text {new}})^2} \le C \frac{\kappa _1^{\text {new}}}{(\nu _1^{\text {new}})^2}, \end{aligned}$$

with \(C:=c_2/c_1^2\) from (4.8). Thus when following the proof, we are allowed to replace every occurrence of \(\kappa _i/\nu _i^2\) by \(C\kappa _1^{\text {new}}/(\nu _1^{\text {new}})^2\). Similarly, \(1/\nu _{k-1}\) can be replaced by \(\widetilde{C}/\nu _{1}^{\text {new}}\). If we do this replacement in [6, (A.28)], then the geometric sums in the formula before [6, (A.29)] yield that we should replace the two occurrences of \(\nu _{k-1}/(\nu _{k-1}-1)\) in [6, (A.29)] by

$$\begin{aligned} \frac{\nu _{1}^{\text {new}}}{\nu _{1}^{\text {new}}-\widetilde{C}} \end{aligned}$$

From here on, the arguments work word-by-word again, and the formulas up to [6, (A.32)] all remain true with these modifications.

Ultimately, the final estimate for the variance is the sum of the rhs of [6, (A.19),(A.20), (A.29) and (A.32)] with the addition of the prefactor C at places where one sees \(\kappa _1/\nu _1^2\) and modifying \(\nu _{k-1}\) in numerators to \(\nu _1\) and \(1/(\nu _{k-1}-1)\) to \(1/(\nu _{1}-\widetilde{C})\). Using that \(d_{v_q^\star }\ge u_{i_{\star \scriptscriptstyle {(q)}}}^{\scriptscriptstyle {(q)}}\), we thus obtain that (ignoring the ‘new’ superscript everywhere for brevity now):

$$\begin{aligned}&\mathrm{Var}_Y[N_{z_n^\star }(v_r^\star , v_b^\star )] \le \mathbb {E}_{\scriptscriptstyle {Y}}[N_{z_n^\star }(v_r^\star , v_b^\star )]\nonumber \\&\quad \ \ +\,\overline{\mathbb {E}_{\scriptscriptstyle {Y}}[ N_{z_n^\star }(v_r^\star , v_b^\star )]}^2 \left( \frac{\nu _1}{\nu _1-\widetilde{C}} \frac{C \kappa _1}{\nu _1^2} \Big ( \frac{1}{u_{i_{\star \scriptscriptstyle {(r)}}}^{\scriptscriptstyle {(r)}}}+ \frac{1}{u_{i_{\star \scriptscriptstyle {(b)}}}^{\scriptscriptstyle {(b)}}}\Big )\!+\! \frac{\nu _1^2}{(\nu _1-\widetilde{C})^2 }\frac{C^2 \kappa _1^2}{\nu _1^4}\frac{1}{u_{i_{\star \scriptscriptstyle {(r)}}}^{\scriptscriptstyle {(r)}}u_{i_{\star \scriptscriptstyle {(b)}}}^{\scriptscriptstyle {(b)}}}\right. \nonumber \\&\quad \ \ \left. +\,\frac{8(z_n^\star )^2}{\ell _n} + \left( 1+ \frac{C \kappa _1 \nu _1}{\nu _1^2 u_{i_{\star \scriptscriptstyle {(r)}}}^{\scriptscriptstyle {(r)}} } \right) \left( 1+ \frac{C \kappa _1 \nu _1}{\nu _1^2 u_{i_{\star \scriptscriptstyle {(b)}}}^{\scriptscriptstyle {(b)}} }\right) \frac{z^\star }{\nu _1-C}\left( 2\frac{(z_n^\star )^2 \nu _1}{\ell _n} \frac{C^2 \kappa _1^2}{\nu _1^4} \right) \right) \end{aligned}$$
(4.13)

where \(\overline{\mathbb {E}_{\scriptscriptstyle {Y}}[ N_{z_n^\star }(v_r^\star , v_b^\star )]}\) stands for the upper bound on \(\mathbb {E}_{\scriptscriptstyle {Y}}[N_{z_n^\star }(v_r^\star , v_b^\star )]\) in (4.3). Recall that \(\overline{\mathbb {E}_{\scriptscriptstyle {Y}}[ N_{z_n^\star }(v_r^\star , v_b^\star )]}\) and \(\mathbb {E}_{\scriptscriptstyle {Y}}[ N_{z_n^\star }(v_r^\star , v_b^\star )]\) are both given by (4.11) up to smaller order correction terms. So, in order to show that the rhs of (4.12) tends to zero, it is enough to analyse the factor multiplying \(\overline{\mathbb {E}_{\scriptscriptstyle {Y}}[ N_{z_n^\star }(v_r^\star , v_b^\star )]}^2\) in (4.13) and show that it tends to zero as \(n\rightarrow \infty \). We carry this out now.

By the same method as in Claim 2.6 (i.e., using Karamata’s theorem), it is not hard to show that

$$\begin{aligned} \mathbb {E}[D_n^\star (D_n^\star -1)]=\sum _{\begin{array}{c} v\in [n]\\ d_v\le n^{\beta _n} \end{array}} \frac{d_v (d_v-1)(d_v-2)}{\ell _n}\le n^{{\beta _n}(4-\tau )} C L_n^{\star , \text {up} }(n^{\beta _n})\lesssim n^{{\beta _n}(4-\tau )}.\nonumber \\ \end{aligned}$$
(4.14)

This together with (4.9) implies that \(\kappa _1\sim n^{{\beta _n}(4-\tau )}\). From (4.2) and (4.8), \(\nu _1\sim n^{{\beta _n}(3-\tau )}\), and finally from (2.37), (see also (2.3)) \(u_{i_{\star \scriptscriptstyle {(q)}}}^{\scriptscriptstyle {(q)}}\sim n^{{\beta _n}(\tau -2)^{b_n^{\scriptscriptstyle {(q)}}({\beta _n}) }}\). This implies that for \(q\in \{r,b\}\)

$$\begin{aligned} \frac{\kappa _1}{\nu _1^2}\frac{1}{u_{i_{\star \scriptscriptstyle {(q)}}}^{\scriptscriptstyle {(q)}}}\sim n^{{\beta _n}((\tau -2)-(\tau -2)^{b_n^{\scriptscriptstyle {(q)}}({\beta _n}) })} \rightarrow 0, \end{aligned}$$
(4.15)

as \(n\rightarrow \infty \), since \(b_n^{\scriptscriptstyle {(b)}}({\beta _n}), b_n^{\scriptscriptstyle {(r)}}({\beta _n}) <1\), and the error terms hidden in the \(\sim \) sign are at most \(\exp \{\pm (\log n^{\beta _n})^\theta \}\) for \(\theta <1\), and are thus of smaller order of magnitude. Thus, both the first and the second term multiplying \(\overline{\mathbb {E}_{\scriptscriptstyle {Y}}[ N_{z_n^\star }(v_r^\star , v_b^\star )]}^2\) in (4.13) tend to zero as \(n\rightarrow \infty \). Note that for (4.16) to tend to zero it is crucial that \(d_{v_q^\star }>n^{\beta _n(\tau -2)}\), i.e., that \(v_q^\star \) is a hub. When distributing the product in the second line of (4.13), using that \(\nu _1/(\nu _1-\widetilde{C})\) is a constant factor, we see that the main contribution comes from the terms

$$\begin{aligned} \frac{\kappa _1}{\nu _1 u_{i_{\star \scriptscriptstyle {(q)}}}^{\scriptscriptstyle {(q)}}} \cdot \frac{1}{\ell _n}\frac{\kappa _1^2}{\nu _1^4} \sim n^{{\beta _n}(1-(\tau -2)^{b_n^{\scriptscriptstyle {(q)}}({\beta _n}) })}\cdot n^{-1} \cdot n^{2{\beta _n}(\tau -2)} = n^{{\beta _n}(\tau -1+\tau -2-(\tau -2)^{b_n^{\scriptscriptstyle {(q)}} ({\beta _n})})-1}.\nonumber \\ \end{aligned}$$
(4.16)

Note that since \(b_n^{\scriptscriptstyle {(q)}}({\beta _n})\in [0,1)\), \((\tau -2)-(\tau -2)^{b_n^{\scriptscriptstyle {(q)}}}\le 0\), and thus the exponent is always negative when \({\beta _n}<1/(\tau -1)\). When \({\beta _n}=1/(\tau -1)\), the exponent is always nonpositive and equals 0 if and only if \(b_n^{\scriptscriptstyle {(q)}}({\beta _n})=0\). This is only possible if \(Y_q^{\scriptscriptstyle {(n)}}\) takes values in a measure 0 discrete set, see (1.15). The probability of this event tends to 0 under Assumption 1.7.
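The sign of the exponents in (4.15) and (4.16) can be checked numerically. The sketch below is a hedged illustration: the function names and the grid are assumptions made here, and the \(\exp \{\pm (\log n^{\beta _n})^\theta \}\) correction factors are ignored. It evaluates both exponents for \(b\in [0,1)\) and \({\beta _n}<1/(\tau -1)\).

```python
def exponent_415(beta: float, tau: float, b: float) -> float:
    """Exponent of n on the rhs of (4.15): beta * ((tau-2) - (tau-2)^b)."""
    return beta * ((tau - 2.0) - (tau - 2.0) ** b)


def exponent_416(beta: float, tau: float, b: float) -> float:
    """Exponent of n on the rhs of (4.16)."""
    return beta * ((tau - 1.0) + (tau - 2.0) - (tau - 2.0) ** b) - 1.0


def exponents_negative_on_grid(tau: float, steps: int = 20) -> bool:
    """Check both exponents are negative for b in [0,1), beta in (0, 1/(tau-1))."""
    beta_max = 1.0 / (tau - 1.0)
    for i in range(1, steps):          # beta strictly below 1/(tau-1)
        beta = beta_max * i / steps
        for j in range(steps):         # b = 0, 1/steps, ..., (steps-1)/steps
            b = j / steps
            if exponent_415(beta, tau, b) >= 0.0:
                return False
            if exponent_416(beta, tau, b) >= 0.0:
                return False
    return True


if __name__ == "__main__":
    for tau in (2.1, 2.5, 2.9):
        print(tau, exponents_negative_on_grid(tau))
```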

Combining the estimates in (4.15) and (4.16), we obtain that the variance of \(N_{z_n^\star }(v_r^\star , v_b^\star )\) is of smaller order than its expectation squared, hence the rhs of (4.12) tends to zero. This establishes that whp there is a path of length \(z_n^\star +1\) connecting \(v^\star _r\) to \(v_b^\star \). Thus, we obtain the existence of a path of length as in (1.16). This proves the upper bound on \(\mathrm {d_G}(v_r, v_b)\) and thus completes the proof of Theorem 1.8. \(\square \)

5 Extensions and By-products

In this section we prove Theorems 1.13 and 1.15 and sketch the proof of Observation 1.14.

Proof of Theorem 1.13

The proof of the first statement of the theorem, that is, (1.21), follows from the proof of Theorem 1.8 in Section 4. Recall that we write \(\Lambda _{\le z}=\{w\in [n]: d_w \le n^{z}\}\). Note that in this case, Claim 2.6 yields that the empirical second moment restricted to degrees at most \(n^{\widetilde{\beta _n}}\) is \(\lesssim n^{\widetilde{\beta _n}(3-\tau )}\). The proof of Theorem 1.13 is even simpler than the one in Section 4: here, one does not need to estimate the number of half-edges attached to \(v_1, v_2\), since these are given, so there is no need to condition on the sigma algebra \(\mathcal {F}_{\varrho _n'}\) or on the good event (\(\text {NoBad}\)) either. Thus, below in this proof, we can use the ‘usual’ probability measure \(\mathbb {P}\) instead of \(\mathbb {P}_Y\). Let us write \(\mathcal {P}_{v_1, v_2}(z)\) for a path of length z connecting \(v_1, v_2\). When counting paths restricted to the set \(\Lambda _{\le \widetilde{\beta _n}}\), the restriction in the sum in the middle factor in (4.1) is that \(\pi _i\in \Lambda _{\le \widetilde{\beta _n}}\); thus, (4.3) turns into

$$\begin{aligned} \mathbb {E}\left[ \#\left\{ \mathcal {P}_{v_1, v_2}(z) \in \Lambda _{\le \widetilde{\beta _n}}\right\} \right] \lesssim n^{-1} n^{x_1\widetilde{\beta _n}+x_2\widetilde{\beta _n}} n^{z \widetilde{\beta _n}(3-\tau )} \end{aligned}$$
(5.1)

and (4.4) becomes

$$\begin{aligned} \mathbb {P}\left( \exists \mathcal {P}_{v_1, v_2}(z) \in \Lambda _{\le \widetilde{\beta _n}}: |\mathcal {P}_{v_1, v_2}|\le z+1 \right) \lesssim n^{-1} n^{\widetilde{\beta _n}(x_1+x_2 + z (3-\tau ))}. \end{aligned}$$
(5.2)

From here, the proof of the upper bound in Section 4 can be repeated with \({\beta _n}\) replaced by \(\widetilde{\beta _n}\) and \((\tau -2)^{b_n^{\scriptscriptstyle {(q)}}(\beta _n)}\) replaced by \(x_j\), yielding (1.21). The proof of the lower bound in Section 4 is again valid word-by-word. It is important to note that the second moment method in (4.12) works only if the variance is of smaller order than the expectation squared, which, in turn, is equivalent to the quantities in (4.15) and (4.16) tending to zero, which is the case whenever \(x_j > \tau -2\). (Thus, for non-hub vertices the method does not work.) Next we show (1.22) and (1.24). Note that (1.24) is equivalent to

$$\begin{aligned} \mathbb {P}\left( \left\{ \# \mathcal {P}_{v_1, v_2}^\star \in \Lambda _{\le \widetilde{\beta _n}}\right\} \notin \left[ n^{\widetilde{\beta _n} f^u(\widetilde{\beta _n}, x_1, x_2) (1-\varepsilon ) }, n^{\widetilde{\beta _n} f^u(\widetilde{\beta _n}, x_1, x_2) (1+\varepsilon ) }\right] \right) \rightarrow 0. \end{aligned}$$
(5.3)

We prove this statement using Chebyshev’s inequality. For brevity, let us write

$$\begin{aligned} \zeta _n:=\left\{ \# \mathcal {P}_{v_1, v_2}^\star \in \Lambda _{\le \widetilde{\beta _n}}\right\} . \end{aligned}$$
(5.4)

First, when we write \(\underline{\mathbb {E}[\zeta _n]}, \overline{\mathbb {E}[\zeta _n]}\) for the lower and upper bounds on \(\mathbb {E}[\zeta _n]\), respectively, then these are handled in (4.11), and equal \(\sim n^{\widetilde{\beta _n} f^u(\widetilde{\beta _n}, x_1, x_2)}\) by elementary calculations using the value \(z_n^\star \) from (4.5) and \(f^u\) from (1.19).

Next, \(\mathrm{Var}[ \zeta _n]\) is handled in (4.13), where subsequently, in (4.15) and (4.16), it is established that

$$\begin{aligned} \mathrm{Var}[ \zeta _n] \le C \overline{\mathbb {E}[\zeta _n]}^2 \max _{j=1,2}\max \left\{ \frac{\kappa _1}{\nu _1^2 d_{v_j}} ,\frac{\kappa _1^3}{\nu _1^5 \ell _n d_{v_j}} \right\} . \end{aligned}$$
(5.5)

Comparing the rhs of (4.15) to the rhs of (4.16) with \({\beta _n}\) replaced by \(\widetilde{\beta _n}\), it is elementary to check that the dominating expression is the rhs of (4.15) unless \(\widetilde{\beta _n}=1/(\tau -1)\), in which case both terms are of the same order. In either case, we arrive at

$$\begin{aligned} \max _{j=1,2} \max \left\{ \frac{\kappa _1}{\nu _1^2 d_{v_j}}, \frac{\kappa _1^3}{\nu _1^5 \ell _n d_{v_j}} \right\} \sim n^{-\widetilde{\beta _n} (\min (x_1, x_2) - (\tau -2) )}. \end{aligned}$$
(5.6)

Note that this is the point where it becomes clear why the assumption \(x_1, x_2 > \tau -2\) was necessary: only in this case can we expect any concentration of the variable \(\zeta _n\). Chebyshev’s inequality yields that for any \(c(n,\varepsilon )\) that depends on n and some \(\varepsilon >0\) to be chosen later

$$\begin{aligned} \mathbb {P}\left( |\zeta _n - \mathbb {E}[\zeta _n]|\ge \mathbb {E}[\zeta _n] c(n,\varepsilon ) \right)\le & {} \frac{\mathrm{Var}( \zeta _n) }{\mathbb {E}[\zeta _n]^2 c(n,\varepsilon )^2 }\lesssim \frac{\overline{\mathbb {E}[\zeta _n]}^2 }{\underline{\mathbb {E}[\zeta _n]}^2} \frac{1}{c(n,\varepsilon )^2}n^{-\widetilde{\beta _n} (\min (x_1, x_2) - (\tau -2) )} \nonumber \\\sim & {} \frac{1}{\widetilde{\beta _n}^2}\frac{1}{c(n,\varepsilon )^2}n^{-\widetilde{\beta _n} (\min (x_1, x_2) - (\tau -2) )}, \end{aligned}$$
(5.7)

where we have used that \(\overline{\mathbb {E}[\zeta _n]}/\underline{\mathbb {E}[\zeta _n]}\) is \(1/\widetilde{\beta _n}\) times a factor that can be merged in the \(\sim \) sign. By the assumption that \({\beta _n} (\log n)^\gamma \rightarrow \infty \), for some \(\gamma <1\), setting

$$\begin{aligned} c(n,\varepsilon ):=n^{-(1-\varepsilon )\widetilde{\beta _n} (\min (x_1, x_2) - (\tau -2) )/2}\rightarrow 0, \end{aligned}$$

the rhs tends to zero for any fixed \(\varepsilon >0\). Note that we only have upper and lower bounds on the expected value, thus we obtain that

$$\begin{aligned} \mathbb {P}(\zeta _n \notin [(1-c(n,\varepsilon ))\underline{\mathbb {E}[\zeta _n]} , (1+c(n,\varepsilon ))\overline{\mathbb {E}[\zeta _n]}] )\rightarrow 0. \end{aligned}$$
(5.8)

The interval in (5.3) is certainly wider than the one excluded here. This finishes the proof of (1.24). Further note that both the lower as well as the upper ends of the interval in (5.8) are still \(\sim n^{\widetilde{\beta _n} f^u(\widetilde{\beta _n}, x_1, x_2)}\). This finishes the proof of (1.22).
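The decay of the rhs of (5.7) under the above choice of \(c(n,\varepsilon )\) can be illustrated numerically. The sketch below works on the logarithmic scale and ignores the constants hidden in the \(\sim \) sign. The function name, the parameter values and the particular sequence \(\widetilde{\beta _n}=(\log n)^{-1/2}\) are assumptions chosen only for illustration.

```python
import math


def log_chebyshev_bound(log_n: float, beta: float, x1: float, x2: float,
                        tau: float, eps: float) -> float:
    """Logarithm of (1/beta^2) * c(n,eps)^(-2) * n^(-beta * gap),
    where gap = min(x1, x2) - (tau - 2) and
    c(n,eps) = n^(-(1 - eps) * beta * gap / 2)."""
    gap = min(x1, x2) - (tau - 2.0)
    log_c_squared = -(1.0 - eps) * beta * gap * log_n
    return -2.0 * math.log(beta) - log_c_squared - beta * gap * log_n


if __name__ == "__main__":
    # Hypothetical parameters: tau = 2.5, x1 = x2 = 0.8, eps = 0.1.
    for log_n in (1e2, 1e4, 1e6, 1e8):
        beta = log_n ** -0.5  # beta * (log n)^gamma -> infinity for gamma > 1/2
        print(log_n, log_chebyshev_bound(log_n, beta, 0.8, 0.8, 2.5, 0.1))
```

The returned value simplifies to \(-2\log \widetilde{\beta _n} - \varepsilon \widetilde{\beta _n}(\min (x_1,x_2)-(\tau -2))\log n\). The second term dominates once \(\widetilde{\beta _n}\log n\) grows faster than \(\log \log n\), although for slowly decaying \(\widetilde{\beta _n}\) this only becomes visible for very large \(n\).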

We turn to the proof of (1.23), which is essentially Markov’s inequality. Indeed, when we require for some \(\delta <1\) that one of the vertices on a path must fall in \(\Lambda _{\le \delta \widetilde{\beta _n}}\) (without specifying its position on the path), we obtain that the expected number of such paths connecting \(v_1, v_2\) is at most

$$\begin{aligned} \mathbb {E}\left[ \#\left\{ \mathcal {P}_{v_1, v_2}(z): \exists i\le z, \pi _i \in \Lambda _{\le \delta \widetilde{\beta _n}}\right\} \right] \lesssim z n^{-1} n^{\widetilde{\beta _n}(x_1+x_2 + (z-1)(3-\tau ) + \delta (3-\tau ))}. \end{aligned}$$
(5.9)

Let us denote by \(z_n^\star (\delta )\) the smallest z for which this quantity does not tend to zero, which is exactly

$$\begin{aligned} z_n^\star (\delta )= \left\lceil \frac{1/\widetilde{\beta _n}-x_1-x_2}{3-\tau }+(1-\delta ) \right\rceil \end{aligned}$$

On the other hand, without any restriction, the shortest path uses

$$\begin{aligned} z_n^\star (1)=\left\lceil \frac{1/\widetilde{\beta _n}-x_1-x_2}{3-\tau } \right\rceil \end{aligned}$$

many in-between vertices, by (1.21). As long as \(\delta \) is such that \(z_n^\star (\delta )> z_n^\star (1)\), the rhs of (5.9) tends to 0 for \(z=z_n^\star (1)\). Thus, there will be no connecting paths of length \(z_n^\star (1)\) that have a vertex in \(\Lambda _{\le \delta \widetilde{\beta _n}}\). Finally, the largest \(\delta \) that we can achieve, \(\delta _{\max }:=\sup \{ \delta : z_n^\star (\delta )>z_n^\star (1)\}\), is precisely the lower fractional part of the expression within the upper-integer-part in \(z_n^\star (1)\), in other words, \(\delta _{\max }=f^\ell (\widetilde{\beta _n}, x_1, x_2)\), establishing (1.23). \(\square \)
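The relation between \(z_n^\star (\delta )\), \(z_n^\star (1)\) and \(\delta _{\max }\) can be made concrete with a small numerical sketch. The function names and the parameter values (\(\widetilde{\beta _n}=0.3\), \(x_1=0.6\), \(x_2=0.7\), \(\tau =2.5\)) are hypothetical choices for illustration, and the polynomial prefactor \(z\) in (5.9) is ignored.

```python
import math


def z_star(beta: float, x1: float, x2: float, tau: float,
           delta: float = 1.0) -> int:
    """Smallest z for which the exponent in (5.9) is non-negative."""
    return math.ceil((1.0 / beta - x1 - x2) / (3.0 - tau) + (1.0 - delta))


def delta_max(beta: float, x1: float, x2: float, tau: float) -> float:
    """Fractional part of the expression inside the ceiling in z_star(., delta=1)."""
    a = (1.0 / beta - x1 - x2) / (3.0 - tau)
    return a - math.floor(a)


def exponent_59(beta: float, x1: float, x2: float, tau: float,
                z: int, delta: float) -> float:
    """Exponent of n on the rhs of (5.9), without the prefactor z."""
    return beta * (x1 + x2 + (z - 1 + delta) * (3.0 - tau)) - 1.0


if __name__ == "__main__":
    beta, x1, x2, tau = 0.3, 0.6, 0.7, 2.5  # hypothetical values
    z1 = z_star(beta, x1, x2, tau)
    print("z*(1) =", z1, " delta_max =", delta_max(beta, x1, x2, tau))
    for delta in (0.05, 0.08):
        # Below delta_max the exponent at z = z*(1) is negative, above it is positive.
        print(delta, z_star(beta, x1, x2, tau, delta),
              exponent_59(beta, x1, x2, tau, z1, delta))
```

For these values the expression inside the ceiling is about \(4.067\), so paths of the shortest length with a vertex of degree at most \(n^{\delta \widetilde{\beta _n}}\) are excluded for \(\delta \) below roughly \(0.067\).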

Sketch proof of Observation 1.14. From the proof of Theorem 1.13 it follows that all shortest paths use degree at least \(\sim n^{\widetilde{\beta }_n f^\ell (\widetilde{\beta }_n,x_1,x_2)}\), while Chebyshev’s inequality in (5.7) shows that the number of shortest paths is \(\sim n^{\widetilde{\beta }_n f^u(\widetilde{\beta }_n, x_1,x_2)}\). Multiplying these two together yields one part of the observation. The full statement follows once we show that there is at least one shortest path that actually uses a vertex with degree \(\sim n^{\widetilde{\beta }_n f^\ell (\widetilde{\beta }_n,x_1,x_2)(1+\varepsilon )}\) for arbitrarily small \(\varepsilon >0\). This can be done using Chebyshev’s inequality again in the same way as for \(\zeta _n\) in (5.4), now counting paths with at least one vertex in \(\Lambda _{\le \widetilde{\beta }_nf^\ell (\widetilde{\beta }_n, x_1,x_2)(1+\varepsilon )}\) and the rest of the vertices in \(\Lambda _{\le \widetilde{\beta }_n}\). In this case, the variance versus expectation squared method carries through in the same way. \(\square \)

Proof of Theorem 1.15

The proof of this theorem is essentially the upper bound part (the construction of the connecting path) of the proof of Theorem 1.8. For the first statement, about the segments with increasing degrees, Proposition 2.1 shows that there is a path of length \(T_q({\beta _n})\) that connects \(v_q\) to a vertex \(v_q^\star \) with degree as in (1.25), while Proposition 3.1 ensures that the total degree in the exploration clusters of depth \(T_q({\beta _n})\) is of the same order of magnitude as the degree of \(v_q^\star \), and that the two exploration clusters are disjoint.

By Lemma 2.5, the \((T_q({\beta _n})-k)\)th vertex on this path has degree at least \(u_{i_{\star \scriptscriptstyle {(q)}}-k}^{\scriptscriptstyle {(q)}}\) and at most \(\widehat{u}_{i_{\star \scriptscriptstyle {(q)}}-k}^{\scriptscriptstyle {(q)}}\), both of them \(\sim n^{{\beta _n}(\tau -2)^{b_n^{\scriptscriptstyle {(q)}}({\beta _n})+k }}\).

Thus, when connecting the two clusters \(\mathcal {C}^{\scriptscriptstyle {(r)}}_{T_r({\beta _n})}\) and \(\mathcal {C}^{\scriptscriptstyle {(b)}}_{T_b({\beta _n})}\), Theorem 1.13 can be applied to show that there is a path that connects \(v_r^\star , v_b^\star \) of length as in (1.26), using vertices of degree at least \(n^{{\beta _n}f_n^\ell }\). By the bound on the total degrees in Proposition 3.1, we can identify all the vertices in \(\mathcal {C}^{\scriptscriptstyle {(q)}}_{T_q({\beta _n})}\), and apply Theorem 1.13 once more to see that no shorter path is possible between \(\mathcal {C}^{\scriptscriptstyle {(r)}}_{T_r({\beta _n})}\) and \(\mathcal {C}^{\scriptscriptstyle {(b)}}_{T_b({\beta _n})}\) than the one with length (1.26), and any of these shortest paths uses vertices with degrees at least \(n^{{\beta _n}f_n^\ell }\). \(\square \)

6 Comparison of Typical Distances

In this section we compare the result of Theorem 1.8 applied to the \({\beta _n}=1/(\tau -1)\) setting to [31, Theorem 1.2] in more detail. Here we argue that the two formulations, namely the one in [31, Theorem 1.2] and the one in (1.28), are indeed the same, by describing the core idea of the proof of [31, Theorem 1.2] and relating quantities (events, random variables, etc.) appearing in that proof to quantities in this paper. In this section \(\approx \) means equality up to a \((1+o_{\mathbb {P}}(1))\) factor.

The proof of [31, Theorem 1.2] goes through a minimisation problem, in which the two BFS clusters of \(v_r\) and \(v_b\) should connect for the first time while a coupling (to two branching processes) is still maintained. More precisely, suppose \(k_1\) is a random variable that is measurable w.r.t. \(\{\mathcal {C}_{s}^{\scriptscriptstyle {(q)}}\}_{s=1}^m\), for some m (in this paper we take \(m=t(n^{\varrho _n'})\)). Suppose we run the BFS started from \(v_r\) for \(k_1\) steps, and from \(v_b\) for \(k-k_1-1\) steps. There are \(Z_{k_1+1}^{\scriptscriptstyle {(r)}}, Z_{k-k_1}^{\scriptscriptstyle {(b)}}\) many half-edges attached to the vertices in the two clusters, respectively. The distance between \(v_r, v_b\) is then larger than k if these sets of half-edges do not connect to each other, and the probability of this event is approximately

$$\begin{aligned} \mathbb {P}\Big (H\left( \mathcal {C}^{\scriptscriptstyle {(r)}}_{k_1}\right) \cap H\left( \mathcal {C}^{\scriptscriptstyle {(b)}}_{k-k_1-1}\right) =\varnothing \mid Z_{k_1+1}^{\scriptscriptstyle {(r)}}, Z_{k-k_1}^{\scriptscriptstyle {(b)}}\Big ) \approx \exp \left\{ - c Z_{k_1+1}^{\scriptscriptstyle {(r)}} Z_{k-k_1}^{\scriptscriptstyle {(b)}} / \ell _n\right\} \end{aligned}$$
(6.1)

A branching process approximation similar to the one in Section 2.1 is performed to approximate the numerator in the exponent. However, this BP approximation is only valid as long as neither of the colors has more than \(n^{(1-\varepsilon ) / (\tau -1)}\) half-edges for some small \(\varepsilon >0\), i.e., as long as they have not yet reached the highest-degree vertices in the graph. This criterion is established in [31, Proposition 3.2]. The set \(\mathcal {T}_{m}^{i,n}\) in [31, (3.3)] exactly describes those values of \(\ell \) for which \(\{\ell \le T_q+1\}\), \(q\in \{r,b\}\) holds (where \(T_q=T_q(1/(\tau -1))\), defined in (1.15), or in (2.38), is the time to reach the hubs). The \(+1\) is added to \(T_q\) since the half-edges attached to vertices in the BFS cluster at time \(T_q\) can be described as the next generation, which has size \(Z_{T_q+1}^{\scriptscriptstyle {(q)}}\).

Now, from (6.1), we see that \(\{{\mathrm {d}}_G(v_r, v_b)>k\}\) happens whp if \(Z_{k_1+1}^{\scriptscriptstyle {(r)}}\cdot Z_{k-k_1}^{\scriptscriptstyle {(b)}} = o(n)\) and also both \(k_1+1 \in \mathcal {T}_{m}^{r,n}\) and \(k-k_1 \in \mathcal {T}_{m}^{b,n}\) hold. These latter conditions are described in the set of indices \(\mathcal {B}_n\) in [31, (4.57)]:

$$\begin{aligned} \mathcal {B}_n:=\left\{ k_1\in \mathbb {N}: k_1+1 \le T_r+1, k-k_1 \le T_b+1\right\} . \end{aligned}$$

Using a BP approximation similar to the one in (1.11), and the rhs of (6.1),

$$\begin{aligned}&\mathbb {P}(\mathrm {d}_G(v_r, v_b) >k) \\&\quad \approx \max _{k_1 \in \mathcal {B}_n}\exp \left\{ - C \exp \left\{ (\tau -2)^{-(k_1+1)}Y_r^{\scriptscriptstyle {(n)}}+ (\tau -2)^{-(k-k_1)} Y_b^{\scriptscriptstyle {(n)}}- \log n \right\} \right\} . \end{aligned}$$

With the event

$$\begin{aligned} {\mathcal {E}}_{n,k}(\delta ):=\left\{ \exists k_1\in \mathcal {B}_n, (\tau -2)^{-(k_1+1)}Y_r^{\scriptscriptstyle {(n)}}+ (\tau -2)^{-(k-k_1)} Y_b^{\scriptscriptstyle {(n)}}< (1-\delta )\log n\right\} \end{aligned}$$

it is obvious from (6.1) that for any \(\delta >0\), \(\mathbb {P}({\mathrm {d}}_G(v_r, v_b)>k\mid {\mathcal {E}}_{n,k}(\delta )) \rightarrow 1\), while \(\mathbb {P}({\mathrm {d}}_G(v_r, v_b)>k\mid {\mathcal {E}}_{n,k}^c(\delta )) \rightarrow 0\). Hence, we get that

$$\begin{aligned} \mathbb {P}_Y({\mathrm {d}}_G(v_r, v_b)>k) \approx \mathbb {P}\left( \min _{k_1\in \mathcal {B}_n}(\tau -2)^{-(k_1+1)}Y_r^{\scriptscriptstyle {(n)}}+ (\tau -2)^{-(k-k_1)} Y_b^{\scriptscriptstyle {(n)}}< \log n \right) .\nonumber \\ \end{aligned}$$
(6.2)

The paper [31] shows that \(\min _{k_1\in \mathcal {B}_n}\) can be replaced by \(\min _{k_1\le k}\) in (6.2) above.

Next we show that the formulation of (6.2) gives the same distances as our statement for typical distances in Theorem 1.8, that is,

(6.3)

To be able to show that the two formulations are the same, we use (6.2) to show that

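(a) \(\mathbb {P}( {\mathrm {d}}_G(v_r, v_b) > T_r + T_b )\rightarrow 1\),

(b) \(\mathbb {P}_Y( {\mathrm {d}}_G(v_r, v_b) > T_r + T_b +1 ) \approx \mathbb {P}\big ( (\tau -2)^{b_n^{\scriptscriptstyle {(r)}}} + (\tau -2)^{b_n^{\scriptscriptstyle {(b)}}} < \tau -1 \big )\),

and finally that

(c) \(\mathbb {P}( {\mathrm {d}}_G(v_r, v_b) > T_r + T_b +2 ) \rightarrow 0\).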

From (2.38) and for \({\beta _n}=1/(\tau -1)\), it is an elementary calculation to check that for any \(i\in \mathbb {Z}\), \(q\in \{r,b\}\)

$$\begin{aligned} \left( \tau -2\right) ^{-(T_q+1+i)} Y_q^{\scriptscriptstyle {(n)}} = \log n \frac{(\tau -2)^{b_n^{\scriptscriptstyle {(q)}}-i}}{\tau -1}. \end{aligned}$$
(6.4)

First, we check that (6.2) gives (a). For this we set \(k=T_r+T_b\), and let us write \(k_1:=T_r-\ell \) for some \(\ell \in \mathbb {Z}\), then \(k-k_1=T_b+\ell \). Hence, we can rewrite (6.2) using (6.4) with \(i=-\ell \) and \(i=\ell -1\), and get

$$\begin{aligned} \mathbb {P}_Y( {\mathrm {d}}_G(v_r, v_b) > T_r+T_b) \approx \mathbb {P}\left( \min _{\ell } \frac{(\tau -2)^{b_n^{\scriptscriptstyle {(r)}}+\ell } + (\tau -2)^{b_n^{\scriptscriptstyle {(b)}}+1-\ell }}{\tau -1} < 1 \right) . \end{aligned}$$

It is clear now that setting \(\ell =0\) shows that the inequality is satisfied for all \(b_n^{\scriptscriptstyle {(r)}}, b_n^{\scriptscriptstyle {(b)}}\in [0,1)\), since the expression after the \(\min _\ell \), for \(\ell =0\), is at most \((1+(\tau -2))/(\tau -1) = 1\). Moreover, note that for \(\ell =0\), \(k_1=T_r\) and \(k-k_1=T_b\), so both \(k_1+1\le T_r+1\) and \(k-k_1 \le T_b+1\) hold; hence, we have found an index \(k_1\) in \(\mathcal {B}_n\).

Next, we check that (6.2) gives (b). For this, we set \(k=T_r+T_b+1\), again write \(k_1=T_r-\ell \), so that \(k-k_1=T_b+1+\ell \). Hence, we can rewrite (6.2) using (6.4) with \(i=-\ell \) and \(i=\ell \) and get

$$\begin{aligned} \mathbb {P}_Y( {\mathrm {d}}_G (v_r, v_b)> T_r+T_b+1) \approx \mathbb {P}\left( \min _{\ell } \frac{(\tau -2)^{b_n^{\scriptscriptstyle {(r)}}+\ell } + (\tau -2)^{b_n^{\scriptscriptstyle {(b)}}-\ell }}{\tau -1} < 1 \right) . \end{aligned}$$

It is clear now that setting \(\ell =0\) yields (b). Moreover, note that for \(\ell =0\), \(k_1\in \mathcal {B}_n\) holds as well. We argue that \(\ell =0\) is indeed the minimizer of the expression after the \(\min \). Wlog we can assume that \(\ell \ge 1\); the case \(\ell \le -1\) can be treated similarly. Then, we need to show that for all \(\ell \ge 1\),

$$\begin{aligned} (\tau -2)^{b_n^{\scriptscriptstyle {(r)}}+\ell } + (\tau -2)^{b_n^{\scriptscriptstyle {(b)}}-\ell } > (\tau -2)^{b_n^{\scriptscriptstyle {(r)}}} + (\tau -2)^{b_n^{\scriptscriptstyle {(b)}}}. \end{aligned}$$
(6.5)

Rearranging this inequality yields

$$\begin{aligned} (\tau -2)^{b_n^{\scriptscriptstyle {(b)}}} \left( (\tau -2)^{-\ell }-1\right) > (\tau -2)^{b_n^{\scriptscriptstyle {(r)}}}(1-(\tau -2)^\ell ) \end{aligned}$$

Since for all \(\ell \ge 1\),

$$\begin{aligned} \begin{aligned}&(\tau -2)^{b_n^{\scriptscriptstyle {(b)}}} \left( (\tau -2)^{-\ell }-1\right) \\&\quad >(\tau -2)^{-\ell +1} -(\tau -2) \ge 1-(\tau -2)^{\ell } \ge (\tau -2)^{b_n^{\scriptscriptstyle {(r)}}} \left( 1-(\tau -2)^{\ell }\right) , \end{aligned} \end{aligned}$$

the claim is established.
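As a numerical sanity check of (6.5), the following sketch evaluates the difference of the two sides on a grid of values of \(\tau \in (2,3)\), \(b_n^{\scriptscriptstyle {(r)}}, b_n^{\scriptscriptstyle {(b)}}\in [0,1)\) and nonzero integers \(\ell \). The function names, the grid and the truncation of \(\ell \) to a finite range are assumptions made for illustration; the check does not replace the argument above.

```python
def lhs_minus_rhs_65(tau: float, b_r: float, b_b: float, ell: int) -> float:
    """Left side minus right side of (6.5)."""
    a = tau - 2.0
    lhs = a ** (b_r + ell) + a ** (b_b - ell)
    rhs = a ** b_r + a ** b_b
    return lhs - rhs


def check_65(taus, bs, max_ell: int = 5) -> bool:
    """True if (6.5) holds strictly for every grid point and 1 <= |ell| <= max_ell."""
    for tau in taus:
        for b_r in bs:
            for b_b in bs:
                for ell in range(-max_ell, max_ell + 1):
                    if ell == 0:
                        continue
                    if lhs_minus_rhs_65(tau, b_r, b_b, ell) <= 0.0:
                        return False
    return True


if __name__ == "__main__":
    taus = [2.1, 2.3, 2.5, 2.7, 2.9]
    bs = [i / 10 for i in range(10)]  # 0.0, 0.1, ..., 0.9
    print(check_65(taus, bs))
```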

Finally, we check (c). For this, we set \(k=T_r+T_b+2\), again write \(k_1=T_r-\ell \), so that \(k-k_1=T_b+2+\ell \). Hence, we can rewrite (6.2) using (6.4) with \(i=-\ell \) and \(i=\ell +1\) and get

$$\begin{aligned} \mathbb {P}_Y( {\mathrm {d}}_G(v_r, v_b) > T_r+T_b+2) \approx \mathbb {P}\left( \min _{\ell } \frac{(\tau -2)^{b_n^{\scriptscriptstyle {(r)}}+\ell } + (\tau -2)^{b_n^{\scriptscriptstyle {(b)}}-\ell -1}}{\tau -1} < 1 \right) .\nonumber \\ \end{aligned}$$
(6.6)

We need to show that no \(\ell \in \mathbb {Z}\) satisfies the inequality on the rhs of (6.6), and thus the probability tends to 0. For this, we use again that \(b_n^{\scriptscriptstyle {(r)}},b_n^{\scriptscriptstyle {(b)}}\in [0,1)\) implies that

$$\begin{aligned} (\tau -2)^{b_n^{\scriptscriptstyle {(r)}}+\ell } + (\tau -2)^{b_n^{\scriptscriptstyle {(b)}}-\ell -1} > (\tau -2)^{\ell +1} + (\tau -2)^{-\ell }, \end{aligned}$$

and it is elementary to show again that the rhs is at least \(\tau -1\) for all \(\ell \in \mathbb {Z}\) and \(\tau \in (2,3)\), thus, the inequality on the rhs of (6.6) cannot be satisfied.
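The three cases \(k=T_r+T_b+j\), \(j\in \{0,1,2\}\), can be summarised in one small computation. The sketch below minimises \((\tau -2)^{b_n^{\scriptscriptstyle {(r)}}+\ell } + (\tau -2)^{b_n^{\scriptscriptstyle {(b)}}+1-j-\ell }\) over a truncated range of integers \(\ell \) and compares the minimum with \(\tau -1\). The function names, the truncation of \(\ell \) and the omission of the constraint \(k_1\in \mathcal {B}_n\) are assumptions made here for illustration.

```python
def min_over_ell(tau: float, b_r: float, b_b: float, j: int,
                 max_ell: int = 20) -> float:
    """Minimum over |ell| <= max_ell of
    (tau-2)^(b_r+ell) + (tau-2)^(b_b+1-j-ell), the case k = T_r + T_b + j."""
    a = tau - 2.0
    return min(a ** (b_r + ell) + a ** (b_b + 1 - j - ell)
               for ell in range(-max_ell, max_ell + 1))


def distance_offset(tau: float, b_r: float, b_b: float) -> int:
    """Predicted value of d_G - T_r - T_b: 2 if the j = 1 minimum is
    below tau - 1 (distance exceeds T_r + T_b + 1), and 1 otherwise."""
    return 2 if min_over_ell(tau, b_r, b_b, 1) < tau - 1.0 else 1


if __name__ == "__main__":
    tau = 2.5
    for b_r, b_b in ((0.1, 0.1), (0.5, 0.5), (0.9, 0.9)):
        print(b_r, b_b,
              min_over_ell(tau, b_r, b_b, 0) < tau - 1.0,   # case (a)
              distance_offset(tau, b_r, b_b),               # case (b)
              min_over_ell(tau, b_r, b_b, 2) > tau - 1.0)   # case (c)
```

For fractional parts in \((0,1)\) the \(j=0\) minimum stays below \(\tau -1\) and the \(j=2\) minimum stays above it, so only the \(j=1\) case decides between the two possible values of the distance.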

These calculations show that the statement of Theorem 1.8 yields, through a non-trivial rewrite, the statement of [31, Theorem 1.2]. The final formula of [31, Theorem 1.2], i.e., the distribution of the fluctuation of the typical distance around \(2\log \log n/|\log (\tau -2)|\), is then obtained by analytically solving the minimisation problem on the rhs of (6.2) with \(k_1\in \mathcal {B}_n\) replaced by \(k_1 \le k\).