Abstract
Scale-free networks contain many small cliques and cycles. We model such networks as inhomogeneous random graphs with regularly varying infinite-variance weights. For these models, the numbers of cliques and cycles have exact integral expressions amenable to asymptotic analysis. We obtain various asymptotic descriptions of how the average numbers of cliques and cycles, of any size, grow with the network size. For the cycle asymptotics we invoke the theory of circulant matrices.
1 Introduction
We study the number of cliques and cycles in scale-free random graphs with power-law degree distributions that have an infinite second moment. Such random graphs contain many small subgraphs [1, 2, 13]. Cliques are subsets of vertices that together form a complete graph and cycles are closed paths that visit each vertex only once.
We employ the rank-1 inhomogeneous random graph or hidden-variable model [4,5,6,7,8,9], which generates power-law random graphs, to derive asymptotic expressions for the average number of cliques and cycles, of arbitrary size. This model constructs simple graphs with soft constraints on the degree sequence [4, 7]. The graph consists of n vertices with weights \((h_1,h_2,\ldots ,h_n)\). These weights are an i.i.d. sample from the power-law distribution
$$\begin{aligned} {{\mathbb {P}}}(H>h) = l(h)\, h^{1-\tau } \end{aligned}$$(1.1)
for some slowly-varying function l(h) and power-law exponent \(\tau \in (2,3)\). We denote the average value of the weights by \(\mu \). Then, every pair of vertices with weights \((h_i,h_j)\) is connected with probability
$$\begin{aligned} p(h_i,h_j) = \min \Bigl \{ \frac{h_i h_j}{n \mu },\,1\Bigr \} , \end{aligned}$$(1.2)
which is the Chung-Lu version of the rank-1 inhomogeneous random graph [7]. This connection probability ensures that the degree of a vertex with weight h will be close to h [4], and that the probability \(p(h_i,h_j)\) remains in the interval [0, 1].
An alternative way to guarantee \(p(h_i,h_j)\in [0,1]\) is to assume that the support of the weight distribution is restricted to \([0,\sqrt{n \mu }]\), so that the product \(h_ih_j\) never exceeds \(n \mu \), making the minimum operator superfluous. Banning degrees larger than \(\sqrt{n \mu }\) (also called the structural cutoff), however, violates the reality of scale-free networks in which hubs of expected degree \((n\mu )^{1/(\tau -1)}\gg \sqrt{n \mu } \) occur. We therefore choose to work with (1.2) and (1.1), putting no further restrictions on the range of the weights (and hence degrees). This creates degree correlations, also observed in real scale-free networks, and an average connectivity and clustering coefficient that depend on the vertex weight/degree [10, 11].
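As a minimal sketch of the model defined by (1.1) and (1.2), the following Python code samples one realization of the graph and counts its triangles by brute force. It assumes the pure power law \({{\mathbb {P}}}(H>h)=h^{1-\tau }\), \(h\ge 1\) (slowly-varying function taken as 1), for which \(\mu =(\tau -1)/(\tau -2)\); the function names, the seed, and the sizes are illustrative choices, not part of the paper.

```python
import itertools
import random


def sample_weights(n, tau, rng):
    """Draw n i.i.d. weights with P(H > h) = h**(1 - tau) for h >= 1."""
    # Inverse-transform sampling; 1 - rng.random() lies in (0, 1].
    return [(1.0 - rng.random()) ** (-1.0 / (tau - 1.0)) for _ in range(n)]


def connection_probability(h_i, h_j, n, mu):
    """Connection probability min{h_i h_j / (n mu), 1} as in (1.2)."""
    return min(h_i * h_j / (n * mu), 1.0)


def sample_graph(weights, mu, rng):
    """Return the edge set {(i, j): i < j} of one realization of the graph."""
    n = len(weights)
    return {
        (i, j)
        for i, j in itertools.combinations(range(n), 2)
        if rng.random() < connection_probability(weights[i], weights[j], n, mu)
    }


def count_triangles(n, edges):
    """Count 3-cliques by brute force (only sensible for small n)."""
    return sum(
        1
        for a, b, c in itertools.combinations(range(n), 3)
        if (a, b) in edges and (a, c) in edges and (b, c) in edges
    )


tau = 2.5
mu = (tau - 1.0) / (tau - 2.0)  # mean weight of the assumed pure power law
rng = random.Random(0)          # fixed seed, for reproducibility only
weights = sample_weights(100, tau, rng)
edges = sample_graph(weights, mu, rng)
print(len(edges), count_triangles(len(weights), edges))
```

Averaging the triangle count over many realizations would give a Monte Carlo estimate of the average number of 3-cliques; the brute-force count is only meant for small n.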
The goal is then to obtain sharp asymptotic estimates for the average number of k-cliques \(A_k(n)\), which can be expressed as
$$\begin{aligned} A_k(n) = {n\atopwithdelims ()k}\,{{\mathbb {P}}}(K_k), \end{aligned}$$(1.3)
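with \({{\mathbb {P}}}(K_k)\) the probability that k arbitrary vertices together form a k-clique. Similarly, the average number of k-cycles \(C_k(n)\) satisfies
$$\begin{aligned} C_k(n) = \frac{k!}{2k}{n\atopwithdelims ()k}\,{{\mathbb {P}}}(C_k), \end{aligned}$$(1.4)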
with \({{\mathbb {P}}}(C_k)\) the probability that k arbitrary vertices together form a k-cycle. The combinatorial factor \(\frac{k!}{2k}{n\atopwithdelims ()k}\) is built from the usual factor \({n\atopwithdelims ()k}\), due to choosing the set of k out of n vertices, and the factor k!, being the number of permutations of the chosen set. This has to be divided by \({k \atopwithdelims ()1}\), accounting for choosing a starting point of the cycle, and by 2, accounting for the orientation of the cycle, which is immaterial.
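The counting argument for the factor \(\frac{k!}{2k}{n\atopwithdelims ()k}\) can be checked by brute force on a small complete graph, where every candidate cycle is present. The sketch below enumerates all ordered k-tuples of distinct vertices and identifies two tuples when they trace the same set of edges; the helper names are hypothetical.

```python
import itertools
import math


def cycle_factor(n, k):
    """The factor k!/(2k) * C(n, k), written as (k-1)!/2 * C(n, k)."""
    # For k >= 3 the product is even, so integer division is exact.
    return math.factorial(k - 1) * math.comb(n, k) // 2


def count_cycles_brute_force(n, k):
    """Count distinct k-cycles in the complete graph on n vertices."""
    seen = set()
    for perm in itertools.permutations(range(n), k):
        # A cycle (k >= 3) is determined by its set of undirected edges,
        # so rotations and reflections of perm collapse to one entry.
        cycle = frozenset(
            frozenset((perm[i], perm[(i + 1) % k])) for i in range(k)
        )
        seen.add(cycle)
    return len(seen)


for n, k in [(5, 3), (6, 4), (6, 5)]:
    print(n, k, count_cycles_brute_force(n, k), cycle_factor(n, k))
```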
Imposing a cutoff. Bianconi and Marsili [1, 2] start from an exact integral for \({{\mathbb {P}}}(K_k)\), and impose the cutoff \(\sqrt{n\mu }\), so that the integral can be transformed into a form amenable to asymptotic analysis through the saddle point method. We now repeat some of their arguments, and provide an alternative derivation that does not rely on the saddle point method. With the cutoff \(\sqrt{n \mu }\) the probability \({{\mathbb {P}}}(K_k)\) can be expressed in terms of the k-fold integral
Observe that since the support of H is restricted to \([1,\sqrt{n \mu }]\), the product in (1.5) can be brought into the form
and hence (1.3) grows as
where by \(g_1(n) \sim g_2(n)\) here and throughout we will understand \(g_1(n)/g_2(n) \rightarrow 1\) as \(n \rightarrow \infty \). To evaluate the integral in (1.7) we have invoked Lemma 2.2 in Section 2, which is a simple corollary of [3, Proposition 1.5.8]. We have also used the defining property of a slowly varying function, namely \(l(ch) \sim l(h)\) as \(h \rightarrow \infty \) for any fixed \(0< c < \infty \).
In the remainder of this paper we will not impose a cutoff, so that the random graph has degrees with a truly heavy-tailed distribution. The straightforward reasoning above is then obstructed by the \(\min \)-operator in the connection probability (1.2). Indeed, the product of all connection probabilities can no longer be decomposed as in (1.6).
Main contributions. To deal with the product of all connection probabilities we introduce a specific way of conditioning on the vertex weights. For each of the k vertices that participates in the k-clique we condition on whether its weight is smaller or larger than \(\sqrt{n \mu }\). This in total yields \(k+1\) different configurations: all weights smaller than \(\sqrt{n \mu }\), all weights larger than \(\sqrt{n \mu }\), and one up to \(k-1\) weights smaller than \(\sqrt{n \mu }\). The first two configurations (referred to as ‘extreme configurations’) are relatively easy to deal with: all weights smaller than \(\sqrt{n \mu }\) corresponds to the cutoff setting, and all weights larger than \(\sqrt{n \mu }\) completely eliminates the product of connection probabilities (all equal to one). The remaining configurations (referred to as ‘middle configurations’) are harder to deal with. Based on this idea of conditioning, we establish the following results:
1. In Sect. 2 we obtain the asymptotic behavior of the extreme configurations, and show that the contributions of the middle configurations are asymptotically bounded by the contributions of the extreme configurations. In this way, we circumvent analyzing explicitly the middle configurations, and we obtain the leading-order asymptotics for the average number of cliques in Theorem 2.1.
2. We then turn in Sect. 2 to the average number of cycles. The required conditioning is not limited to the vertex weights being smaller or larger than \(\sqrt{n \mu }\), but also takes into account how these vertices are arranged on the cycle, making the analysis considerably more difficult. In fact, we first provide in Sect. 2 a lower bound in Theorem 2.3 on the cycle count for even values of k by considering one specific arrangement (both in terms of size and order) of vertices on a cycle. This lower bound is of interest as it shows that the number of even-sized cycles strictly dominates that of same-sized cliques.
3. We present in Sect. 3 a more detailed asymptotic analysis of the middle configurations, which in turn leads to sharp asymptotics for the average number of cliques (Theorems 3.2 and 3.5). For analytic tractability we restrict to the pure power-law case \(F'(h)=c h^{-\tau }\), \(h\ge 1\) (with the slowly-varying function taken as a constant).
4. For cycles, the asymptotic evaluation of the integrals involves the theory of circulant matrices (Theorem 3.6). It turns out that the relevant circulant matrix is regular for odd k, and singular for even k, leading to different asymptotics for the average number of cycles in Theorems 3.7 and 3.9. The number of even-sized cycles is shown to grow faster than that of even-sized cliques, while odd-sized cycles and odd-sized cliques have comparable growth rates.
Relation with other work. Our results complement two existing lines of work. For the degree distribution \( {\mathbb {P}}\left( H>h\right) =c h^{1-\tau }\) with a support truncated at the cutoff \(\sqrt{n \mu } \), Bianconi and Marsili obtained sharp asymptotics for both clique counts [2] and cycle counts [1]. The main extension in this paper is that we do not impose the cutoff \(\sqrt{n \mu }\), as explained above, and hence work with truly heavy-tailed weight distributions.
The other line of work was launched recently by Van der Hofstad et al. [13], who consider \({\mathbb {P}}\left( H>h\right) =c h^{1-\tau }\) with infinite support, and study the optimal composition for the most likely subgraph. They showed that for a large class of subgraphs, including cliques and cycles, the optimal composition consists exclusively of vertices with order \(\sqrt{n}\) degrees. They also showed that this optimal composition determines up to leading order the asymptotic growth of the average number of subgraphs as a function of the network size n. We sharpen the asymptotics obtained in [13] by directly analyzing the integral expressions for the average number of cycles and cliques. We restrict to cliques and cycles, and do not consider all possible subgraphs as in [13], because we utilize the specific topological structure of cliques and cycles in ways that cannot be easily generalized.
Apart from sharpening results in [1, 2, 13], we sometimes consider a more general setting with the slowly-varying function l(h) in (1.1), which allows for deviations from the pure power law [14]. The general consensus is that the exact shape of l(h) is less important than the precise value of \(\tau \). For \(A_k(n)\) and \(C_k(n)\), \(\tau \) indeed determines the leading growth rate, but l(h) enters the asymptotic expressions in various non-trivial ways.
2 Rough Asymptotics
By \(g_1(n) \asymp g_2(n)\) as \(n \rightarrow \infty \) we are going to understand that there exist constants \(C_1 > 0\) and \(C_2 < \infty \) such that
We write \(g_1(n) \lesssim g_2(n)\) if there exists a constant \(C < \infty \) such that
We also write \(g_1(n) \gtrsim g_2(n)\) if there exists a constant \(C > 0\) such that
Recall that we write \(g_1(n)\sim g_2(n)\) when \(g_1(n)/g_2(n)\rightarrow 1\) as \(n\rightarrow \infty .\)
We write the tail of the degree distribution as \(\overline{F}(h) = h^{-\tau +1} l(h)\) with \(\tau \in (2,3)\) and l(h) a slowly-varying function. Note that
2.1 Rough Asymptotics for Cliques
Theorem 2.1
(Rough asymptotics for cliques) In the rank-1 inhomogeneous random graph with weight distribution (1.1) and connection probability (1.2), the average number of cliques of size \(k \ge 3\) scales as
Comparing the rough asymptotics (2.5) with (1.7), we see that imposing a cutoff does not change the leading growth rate \(n^{\frac{k}{2}(3-\tau )} l^k(\sqrt{n})\) and it is only the constant that may change. We defer calculating the exact constant to Sect. 3. Because we already know that the configuration with all weights smaller than \(\sqrt{n \mu }\) gives the asymptotics in (2.5), the goal of the present section is to demonstrate that the contributions of all other configurations of vertex weights (such that at least one of them is larger than \(\sqrt{n \mu }\)) do not exceed \(n^{\frac{k}{2}(3-\tau )} l^k(\sqrt{n})\) asymptotically.
Before we present the proof of Theorem 2.1, observe that the probability of having an edge between vertices i and j can be described as
where \(H_1,\ldots , H_n\) are independent copies of H and \(U_{ij}\) are independent U(0, 1) random variables. The model may therefore be thought of as follows: Given a collection of random variables \(H_i\) and \(U_{ij}\), an edge between vertices i and j is present if \( {H_i H_j}/{n\mu } > U_{ij} \). This may be rewritten as
where \(V_{ij} = 1/U_{ij}\) has a Pareto(1) distribution: \({{\mathbb {P}}}(V_{ij} > x) = 1/x\) for all \(x \ge 1\). The average number of edges (cliques of size 2) is then straightforward:
as we have a product of three independent regularly varying random variables such that two of them (\(H_1\) and \(H_2\)) are much lighter than the third (V), and the result follows since H is regularly varying with a finite mean. For a general k, we see that
The proof strategy for Theorem 2.1 is to obtain large-n asymptotics for \({{\mathbb {P}}}(K_k)\) by using the conditioning explained in Sect. 1 and properties of random variables with regularly varying distributions, including the following lemma.
Lemma 2.2
(i) If \(\beta > \tau -1\), then
$$\begin{aligned} \int _{1}^x h^{\beta } dF(h) \sim \frac{\tau -1}{\beta -\tau +1} x^{\beta } \overline{F}(x) \quad \mathrm{as} \quad x \rightarrow \infty . \end{aligned}$$(2.10)
(ii) If \(\beta < \tau -1\), then
$$\begin{aligned} \int _{x}^\infty h^{\beta } dF(h) \sim \frac{\tau -1}{\tau -\beta -1} x^{\beta } \overline{F}(x) \quad \mathrm{as} \quad x \rightarrow \infty . \end{aligned}$$(2.11)
Proof
(i) Integration by parts leads to
and the integral on the right-hand side is asymptotically equivalent to
thanks to [3, Proposition 1.5.8]. The proof of (ii) follows the same lines as (i) and uses [3, Proposition 1.5.10]. \(\square \)
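Part (i) of Lemma 2.2 can be illustrated numerically in the simplest case. The sketch below assumes the pure power-law tail \(\overline{F}(h)=h^{1-\tau }\), \(h\ge 1\) (slowly-varying function equal to 1), evaluates the truncated moment with a midpoint rule after the change of variable \(h=\mathrm{e}^s\), and compares it with the right-hand side of (2.10). The function names and the parameter values are illustrative.

```python
import math


def truncated_moment(beta, tau, x, steps=20_000):
    """Midpoint-rule value of the integral of h**beta dF(h) over [1, x]."""
    # With h = exp(s), dF(h) = (tau - 1) * h**(-tau) dh turns into
    # (tau - 1) * exp((beta - tau + 1) * s) ds on [0, log x].
    width = math.log(x) / steps
    rate = beta - tau + 1.0
    total = sum(math.exp(rate * (i + 0.5) * width) for i in range(steps))
    return (tau - 1.0) * total * width


def lemma_prediction(beta, tau, x):
    """Right-hand side of (2.10) for the assumed tail x**(1 - tau)."""
    return (tau - 1.0) / (beta - tau + 1.0) * x ** beta * x ** (1.0 - tau)


beta, tau = 2.0, 2.5  # beta > tau - 1, as required in part (i)
for x in (1e2, 1e4, 1e8):
    ratio = truncated_moment(beta, tau, x) / lemma_prediction(beta, tau, x)
    print(x, ratio)
```

Under this assumption the ratio equals \(1-x^{-(\beta -\tau +1)}\) in exact arithmetic, so it approaches 1 only polynomially fast in x. For the same pure power law, part (ii) holds with equality for every \(x\ge 1\).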
Proof of Theorem 2.1
Denote \(\gamma _n = \sqrt{n \mu }\) and for \(1 \le l \le k\)
We focus on the probability
where
and in order to prove the theorem, it is sufficient to show that the probability on the left-hand side of (2.15) behaves asymptotically, in terms of the leading term, as \(\left( \overline{F}(\sqrt{n})\right) ^k\). We refer to the first and last terms in (2.15) as ‘extreme configurations’ and the summands with \(m=1,\dots ,k-1\) as ‘middle configurations’.
We have seen that the first summand (extreme configuration) behaves asymptotically as
The last summand (contribution of the second extreme configuration) is clearly equal to \( \left( \overline{F}(\gamma _n)\right) ^k \asymp \left( \overline{F}(\sqrt{n})\right) ^k \). In order to prove Theorem 2.1, we then need to show that the contributions of all middle configurations in (2.15) do not exceed that of the extreme configurations, and are hence bounded from above by \(C \left( \overline{F}(\sqrt{n})\right) ^k\) with some constant \(C < \infty \).
For \(m\ge 3\) we have
due to what we have already shown.
For the cases \(m=1\) and \(m=2\), we are going to use [3, Theorem 1.5.4], which implies that there exists a non-increasing function \(\psi (x)\) such that \(x \overline{F}(x) \sim \psi (x)\) as \(x \rightarrow \infty \). For \(m=1\), note that it is sufficient to consider the case \(k=3\) and show that
Indeed, if (2.19) holds, then for a general k we have the following estimate:
Therefore, for the case \(m=1\), it remains to prove (2.19). Write
Note now that
as \(h_1 \le \gamma _n\), and hence
where we used Lemma 2.2. It now remains to consider the case \(m=2\). Note that, again, it is sufficient to consider the case \(k=3\) and show that
Indeed, if this is the case, we can write
which is sufficient. Therefore it remains to prove (2.24). Write
where
and
Therefore, in order to bound the right-hand side of (2.26) from above, we need to bound the integrals involving \(R_1\) and \(R_3\). The integral involving \(R_1\) may be rewritten as
Take \(\delta >0\) and note that (due to [3, Theorem 1.5.6]) there exists a constant A such that
and
for large enough n and for all \(h_1 \le \gamma _n\). Using (2.31) and (2.32), we can then bound the last integral with the following expression:
It now remains to bound the integral on the right-hand side of (2.26) involving \(R_3\):
which is asymptotically equivalent to the integral involving \(R_1\). \(\square \)
2.2 Rough Asymptotics for Cycles
The following theorem illustrates that cycles with an even number of vertices are more likely than cliques with the same number of nodes. The proof is constructive as it presents a particular configuration resulting in the leading asymptotics. In Sect. 3 we present precise asymptotic analysis using a different, more involved method.
As was pointed out in the introduction, the asymptotic analysis of the number of cycles is more difficult than that of the number of cliques, as not only the weights of vertices matter but also their locations in the cycle. To simplify the analysis, we therefore restrict attention to the case of weight distributions having regularly varying densities.
Theorem 2.3
(Lower bound cycle asymptotics) In the rank-1 inhomogeneous random graph with weight density \(\rho (h) = h^{-\tau } l(h)\), average weight \(\mu \) and connection probability (1.2), the average number of cycles of even size \(k \ge 4\) satisfies
Remark 2.4
Theorem 2.3 implies that the asymptotics of \(C_k(n)\) are heavier than the asymptotics of the number of cliques \(A_k(n)\). Indeed, fix \(a > \sqrt{\mu }\). Then for \(n\mu \ge a\sqrt{n}\), i.e., \(n\ge (a/\mu )^2\),
where in the second step we have used that \(n\mu /h\in [\mu \sqrt{n}/a,\sqrt{\mu n}]\) when \(h\in [\sqrt{\mu n},a\sqrt{n}]\) so that both \(l(h)/l(\sqrt{n})\rightarrow 1\) and \(l(n \mu /h)/l(\sqrt{n})\rightarrow 1\) uniformly in \(h\in [\sqrt{\mu n},a\sqrt{n}]\) as \(n\rightarrow \infty \). As a is arbitrary, \(\log a - \tfrac{1}{2} \log \mu \) may be arbitrarily large.
Proof of Theorem 2.3
From (1.4) we see that \( C_k(n) \asymp n^k {{\mathbb {P}}}(C_k)\), so it is sufficient to show that
We again use the notation \(\gamma _n = \sqrt{n \mu }\). Fix \(k=2m\) for an integer m, with the convention that index \(k+1\) is understood as index 1. Denote \(A = \{\gamma _n<H_{2i-1} < \gamma _n^2, 1 \le i \le m\}\) and \(B = \{H_{2j} \le \gamma _n, 1 \le j \le m\}\), and write
with \(M_j = \max \{h_{2j-1},h_{2j+1}\}\) (we will also use \(m_j = \min \{h_{2j-1},h_{2j+1}\}\)). Thanks to Lemma 2.2, the previous terms may be bounded from below by
Note now that the above integral with respect to \(h_{2k-1}\) reads
and hence the integral with respect to \(h_{2k-3}\) satisfies
where we used [3, Theorem 1.5.6]. It is easy to see how this may be continued by induction to show that
which completes the proof. \(\square \)
3 Precise Asymptotics
We now consider the pure power-law case
with \(c=\tau -1\), and derive more precise asymptotic results for the average number of k-cliques \(A_k(n)\) and k-cycles \(C_k(n)\) in the large-network limit \(n\rightarrow \infty \). We choose to work with the pure power law density, instead of the regularly varying distribution function (1.1), to suppress notation in view of the elaborate calculations that will follow. We again use as short-hand notation \( \gamma _n=\sqrt{n\mu } \) with \(\mu =(\tau -1)/(\tau -2)\), so that the connection probability in (1.2) can be written as
with \(f(x)=\min \,\{1,x\}\). More generally, in Sect. 3.2 we present results that hold for the class of continuous nonnegative nondecreasing functions f considered in [12] that satisfy
Observe that for \(x\ge 0\), the function \(f(x)=\min \,\{1,x\}\) belongs to this class, and so do other standard choices like \(f(x)={x}/{(1+x)}\) and \(f(x)=1-\mathrm{e}^{-x}\).
As before, the notation \(g_1(n)\sim g_2(n)\) is used for \(g_1(n)=g_2(n)(1+o(1))\) as \(n\rightarrow \infty \).
3.1 Precise Asymptotics for Cliques
With \(k\ge 3\), the average number \(A_k(n)\) of k-cliques equals
To analyze this k-fold integral we make the choice \(f(x)=\min \{1,x\}\), \(x\ge 0\). We split the integral in (3.4) into \(k+1\) integrals over subranges where precisely m of the hidden variables \(h_i\) are \(\le \, \gamma _n\) while the \(k-m\) others are \(\ge \,\gamma _n\), \(m=0,1,\ldots ,k\). Observe that this range splitting is exactly the same as the conditioning used in (2.15). By symmetry of the integrand we have
where \(I_m\) is the contribution of the range
By the choice \(f(x)=\min \{1,x\}\), we have
For \(m=1,2,\ldots ,k-1\) we have
The main result for \(I_m\) reads as follows.
Proposition 3.1
For \(m=1,2,\ldots ,k-1\)
where
with
and, for \(m=3,\ldots ,k-1\,\) and \(0\le t_1,\ldots ,t_m\le 1\),
The detailed proof of Proposition 3.1 is deferred to Sect. 4. It uses the basic substitution \(v_i=h_i/\gamma _n\) in (3.8), causing the factor \(\gamma _n^{k(1-\tau )}\) to emerge, and the special form of f(x) (\(=\,\min \{1,x\}\)), so that symmetry and factorizations can be exploited. There is, furthermore, an explicit evaluation of the integrals over \(v_{m+1},\ldots ,v_k\) when \(0\le v_1\le v_2\le \cdots \le v_m\le 1\). A final substitution (\(t_i=v_i/v_{i+1}\) for \(i=1,\ldots ,m-1\) and \(t_m=v_m\)) then yields integrals over the unit cube \([0,1]^m\).
The form of \(I_m\) and \(J_m\) in (3.9) and (3.10) shows a convenient separation of dependencies, with \(J_m\) in (3.10) independent of n and \(\Phi _m\) in (3.11)–(3.13) independent of k. Moreover, the number of integration variables is reduced from k in (3.8) to m in (3.10). The remaining integral \(J_m\) is not readily computable in closed form.
From (3.7) and Proposition 3.1 we get the following result for the average number of k-cliques:
Theorem 3.2
(Precise asymptotics for cliques) In the rank-1 inhomogeneous random graph with weight density (3.1) and connection probability (1.2), the average number of cliques of size \(k \ge 3\) satisfies
with \(J_m\) given in (3.10).
Observing that \({n\atopwithdelims ()k}\gamma _n^{k(1-\tau )}\sim n^{k(3-\tau )/2}\mu ^{k(1-\tau )/2}/k!\), we see that Theorem 3.2 confirms and refines Theorem 2.1 for the pure power-law case (3.1). Below we give further results on the expression in square brackets in (3.14).
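The asymptotic equivalence \({n\atopwithdelims ()k}\gamma _n^{k(1-\tau )}\sim n^{k(3-\tau )/2}\mu ^{k(1-\tau )/2}/k!\) used above can be checked numerically for fixed k. The sketch below works with logarithms to avoid overflow; the function names and parameter values are illustrative.

```python
import math


def log_exact_prefactor(n, k, tau, mu):
    """log of C(n, k) * gamma_n**(k * (1 - tau)) with gamma_n = sqrt(n * mu)."""
    log_binom = math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
    return log_binom + 0.5 * k * (1.0 - tau) * math.log(n * mu)


def log_asymptotic_prefactor(n, k, tau, mu):
    """log of n**(k(3 - tau)/2) * mu**(k(1 - tau)/2) / k!."""
    return (
        0.5 * k * (3.0 - tau) * math.log(n)
        + 0.5 * k * (1.0 - tau) * math.log(mu)
        - math.lgamma(k + 1)
    )


def prefactor_ratio(n, k, tau, mu):
    """Exact prefactor divided by its claimed asymptotic form."""
    return math.exp(
        log_exact_prefactor(n, k, tau, mu) - log_asymptotic_prefactor(n, k, tau, mu)
    )


tau = 2.5
mu = (tau - 1.0) / (tau - 2.0)
for n in (10**2, 10**4, 10**6):
    print(n, prefactor_ratio(n, 5, tau, mu))
```

The ratio reduces to \(n!/((n-k)!\,n^k)\), which does not depend on \(\tau \) or \(\mu \) and tends to 1 as \(n\rightarrow \infty \) for fixed k.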
The representation (3.9)–(3.10) of \(I_m\) is also useful for getting bounds and asymptotics for \(I_m\) and \(A_k(n)\). For this there is the following result.
Proposition 3.3
For \(m=1,2,\ldots ,k-1\),
(i) \(\Phi _m(0,\ldots ,0)=0\,,~~\Phi _m(1,\ldots ,1)=1\,\),
(ii) \(\Phi _m(t_1,\ldots ,t_m)\) increases in any of the \(t_i\in [0,1]\,\),
(iii) \(\dfrac{\partial \Phi _m}{\partial t_i}\,(1,\ldots ,1)=0\,,~~i=1,\ldots ,m\,\),
(iv) \(\dfrac{\partial ^2\Phi _m}{\partial t_i\,\partial t_j}\,(1,\ldots ,1)={-}(\tau -1)\,\min \{i,j\}\,,~~i,j=1,\ldots ,m\,\).
The maximality of \(\Phi _m\) at \(t_1=\cdots =t_m=1\) translates to \(h_1=\cdots =h_m=\gamma _n\) in the original hidden variables \(h_i\), and this shows that for \(m=1,\ldots ,k-1\) the largest contribution to the integral \(I_m\) in (3.8) comes from hidden variables \(h_1,\ldots ,h_m\) that are less than but near \(\gamma _n\) while the other hidden variables \(h_{m+1},\ldots ,h_k\) exceed \(\gamma _n\).
From Proposition 3.3 we have the following consequences for the quantities \((\tau -1)^m\,m!\,J_m\) occurring in the series (3.14) for \(A_k(n)\).
Proposition 3.4
(i) \((\tau -1)^m\,m!\,J_m\) decreases in \(k=3,4,\ldots \) for \(m=1,2,\ldots ,k-1\).
(ii) \((\tau -1)^m\,m!\,J_m\le \Bigl (\dfrac{\tau -1}{m-\tau }\Bigr )^m\) for \(m=3,4,\ldots ,k-1\).
The series in (3.14) for \(A_k(n)\) has terms that are bounded by \({k\atopwithdelims ()m}(\frac{\tau -1}{m-\tau })^m\) for \(m=3,4,\ldots ,k-1\). The latter quantity reaches its maximum over \(m=3,4,\ldots ,k-1\) at m near \(m_0:=\sqrt{k(\tau -1)/\mathrm{e}}\). Using a Gaussian approximation of \({k\atopwithdelims ()m}(\frac{\tau -1}{m-\tau })^m\) near \(m_0\), see Section 4 for details, we get the following result.
Theorem 3.5
(Asymptotic order of cliques) In the rank-1 inhomogeneous random graph with weight density (3.1) and connection probability (1.2), the average number of cliques of size \(k \ge 3\) satisfies
Theorem 3.5 shows that the expression in square brackets in (3.14) grows subexponentially in k, which is relevant for large cluster sizes k.
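The location of the maximum of \({k\atopwithdelims ()m}(\frac{\tau -1}{m-\tau })^m\) over \(m=3,\ldots ,k-1\), stated above to lie near \(m_0=\sqrt{k(\tau -1)/\mathrm{e}}\), can be checked numerically. The following sketch evaluates the logarithm of the bound for each m and compares the maximizer with \(m_0\); the function names and the values of k and \(\tau \) are illustrative.

```python
import math


def log_term(k, m, tau):
    """log of C(k, m) * ((tau - 1) / (m - tau))**m, valid for 3 <= m <= k - 1."""
    log_binom = math.lgamma(k + 1) - math.lgamma(m + 1) - math.lgamma(k - m + 1)
    return log_binom + m * (math.log(tau - 1.0) - math.log(m - tau))


def argmax_term(k, tau):
    """Index m in {3, ..., k - 1} at which the bound is largest."""
    return max(range(3, k), key=lambda m: log_term(k, m, tau))


k, tau = 10_000, 2.5
m_star = argmax_term(k, tau)
m_0 = math.sqrt(k * (tau - 1.0) / math.e)
print(m_star, m_0)
```

For moderate k the maximizer can differ from \(m_0\) by a small amount, since \(m_0\) only captures the leading-order balance between the binomial coefficient and the factor \((\frac{\tau -1}{m-\tau })^m\).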
3.2 Precise Asymptotics for Cycles
Using (1.4), the average number \(C_k(n)\) of k-cycles with \(k\ge 3\) equals
The integral in (3.16) is the probability that a particular set of vertices \(\{i_1,i_2,\ldots ,i_{k-1},i_k\}\) constitutes a k-cycle \(i_1\rightarrow i_2\rightarrow \cdots \rightarrow i_{k-1}\rightarrow i_k\rightarrow i_1\). In (3.16), we now allow a general f from the class introduced in [12]. The main result for \(C_k(n)\) reads as follows.
Theorem 3.6
(Precise asymptotics for cycles) In the rank-1 inhomogeneous random graph with weight density (3.1) and general class of connection probabilities (3.2), the average number of cycles of size \(k \ge 3\) scales as
where \(A=\mathrm{log}\,\gamma _n\),
with
and \({\mathrm{C}}\) and \(\mathbf{t}\) are the circulant \(k\times k\)-matrix and k-vector
The proof of Theorem 3.6, detailed in Sect. 5, uses again the substitution \(v_i=h_i/\gamma _n\), yielding a factor \(\gamma _n^{k(1-\tau )}\) outside the integral in (3.16) with an integrand of the form
For such an integrand, it is natural to further substitute \(t_i=\mathrm{log}\,v_i\), linearizing the arguments of the g-functions with the linear algebra of circulant matrices presenting itself.
As to evaluating the remaining integral in Theorem 3.6, we note that
while, due to the properties of f in (3.3), the function F is absolutely integrable over \(\mathbf{u}\in {\mathbb {R}}^k\), see (3.18)–(3.19). Thus, for odd k, the integral in (3.17) remains finite as \(A=\mathrm{log}\,\gamma _n\rightarrow \infty \), and basic calculus yields the following result.
Theorem 3.7
(Specific asymptotics for odd cycles) In the rank-1 inhomogeneous random graph with weight density (3.1) and general class of connection probabilities (3.2), the average number of cycles of odd size \(k \ge 3\) scales as
The integral in (3.23) is finite, and can be evaluated in closed form for the standard choices of f:
Remark 3.8
Choose \(f(x)=\min \{1,x\}\) and use Stirling’s formula to approximate \( {n\atopwithdelims ()k}k!\sim n^k \mathrm{e}^{-k^2/2n} \) with validity range \(k=o(n^{2/3})\). Then (3.23) gives
The situation for even k is more delicate, due to singularity of the matrix \({\mathrm{C}}\). In Section 5, the spectral structure of \({\mathrm{C}}\) is examined. It appears that \({\mathrm{C}}\) is diagonalizable, with the k-DFT vectors as eigenvectors, and precisely one eigenvalue 0, viz. the one corresponding to the eigenvector
The integration over \(\mathbf{t}\) in (3.17) should now be split into a 1-dimensional integration in the direction of \(\mathbf{c}\) and a \((k-1)\)-dimensional integration over the orthogonal complement L of \(\mathbf{c}\). The integration over L yields a finite result as \(A\rightarrow \infty \), due to invertibility of \({\mathrm{C}}\) on L and absolute integrability of \(F(\mathbf{u})\), \(\mathbf{u}\in {\mathbb {R}}^k\). The integration in the direction of \(\mathbf{c}\) yields a factor \(2A\,\sqrt{k}\). By appropriately representing the integration over L using a delta function, the integral over L can be given in closed form, see Sect. 5. The final result is as follows.
Theorem 3.9
(Specific asymptotics for even cycles) In the rank-1 inhomogeneous random graph with weight density (3.1) and general class of connection probabilities (3.2), the average number of cycles of even size \(k \ge 4\) scales as
where
The integral in (3.30) converges, and again gives closed-form expressions:
Remark 3.10
Choose again \(f(x)=\min \{1,x\}\). Noting that \(|J(v)|\le J(0)=4/((3-\tau )(\tau -1))\), we have
Thus we get for even k a similar expression for \(C_k(n)\) as (3.27), except for an additional factor \(\log (\mu n)/\sqrt{k}\). This agrees with the observation in Remark 2.4 that for even k, \(C_k(n)\) grows faster than \(A_k(n)\).
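The value \(J(0)=4/((3-\tau )(\tau -1))\) quoted in Remark 3.10 can be reproduced numerically. The sketch below assumes that \(j(u)=\mathrm{e}^{(1-\tau )u/2}f(\mathrm{e}^u)\), which is the form obtained from \(g(x)=x^{-\tau /2}f(x)\) and the substitution \(t_i=\mathrm{log}\,v_i\), and that J(0) is the integral of j over the real line; both are assumptions made here for illustration, as are the function names and the truncation of the integration range.

```python
import math


def j(u, tau):
    """Assumed form of j(u) for f(x) = min{1, x}."""
    return math.exp(0.5 * (1.0 - tau) * u) * min(1.0, math.exp(u))


def integrate_j(tau, half_width=80.0, steps=160_000):
    """Trapezoidal rule for the integral of j over [-half_width, half_width]."""
    h = 2.0 * half_width / steps
    total = 0.5 * (j(-half_width, tau) + j(half_width, tau))
    for i in range(1, steps):
        total += j(-half_width + i * h, tau)
    return total * h


def closed_form(tau):
    """The value 4 / ((3 - tau)(tau - 1)) quoted for J(0)."""
    return 4.0 / ((3.0 - tau) * (tau - 1.0))


tau = 2.5
print(integrate_j(tau), closed_form(tau))
```

The truncation at \(\pm 80\) is adequate for \(\tau =2.5\); for \(\tau \) close to 2 or 3 one of the two exponential tails of j decays slowly and a wider range would be needed.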
4 Remaining Proofs for Cliques
To reduce notational complexity, we replace the upper integration limits \(\infty \) in (3.9) and (3.16) by \(\gamma _n^2\), at the expense of relative errors o(1) as \(n\rightarrow \infty \).
Proof of Proposition 3.1
With the substitution
we get for \(m=1,\ldots ,k-1\)
We have for \(w\le v_1,\ldots ,v_m\le 1\le v_{m+1},\ldots ,v_k\le W\)
since \(f(x)=\min \{1,x\}\). Therefore
As a consequence, the integral in (4.2) over \(v_{m+1},\ldots ,v_k\) factorizes, and we get
where
We now observe that \(F_m\) in (4.6) is a symmetric function of \(v_1,\ldots ,v_m\). Thus we shall evaluate (4.6) for increasingly ordered \(v_i\).
We have for \(w\le v_1\le \cdots \le v_m\le 1\) (so that \(1/v_m\le \cdots \le 1/v_1\le 1/w=W\)), where we assume \(m\ge 3\),
Letting \(W\rightarrow \infty \), so that \(W^{1-\tau }\rightarrow 0\) since \(2<\tau <3\), and using
we then get upon some further rewriting
For \(m=1,2\,\), we get
where it is assumed that \(w\le v_1\le 1\) and \(w\le v_1\le v_2\le 1\) in the respective cases.
To summarize, we have from (4.5) and symmetry of \(F_m\) in (4.6)
where \(F_m\) is given for \(m=1,\ldots ,k-1\) by (4.9)–(4.11). Here we have replaced the lower integration limit w of \(v_1\) by 0 at the expense of a relative error o(1) as \(n\rightarrow \infty \).
To complete the proof of Proposition 3.1, we substitute
From
we have
with \(\Phi _m\) given in (3.11)–(3.13). Furthermore, from (4.14)
Finally, rewriting the products in (4.12) using (4.14), we obtain (3.9)–(3.10). \(\square \)
Proof of Proposition 3.3
The cases \(m=1,2\) can be dealt with directly using (3.11)–(3.12). We assume now \(m=3,4,\ldots ,k-1\).
(i) We have \(\Phi _m(0,\ldots ,0)=0\) at once from (3.13), and
where we have used (4.8).
(ii) and (iii) We write for \(0\le t_1,\ldots ,t_m\le 1\)
where
Since \(\tau \in (2,3)\), we see from (4.17) that \(\Psi _m(t_1,\ldots ,t_m)\ge 1\), with equality only when \(t_1=\cdots =t_m=1\).
We consider the cases \(i=1\) and \(i=2,\ldots ,m\) separately. We have from (4.18)–(4.19)
with equality if \(t_1=\cdots =t_m=1\). Next, let \(i=2,\ldots ,m\,\). We have for \(0\le t_1,\ldots ,t_m\le 1\) from (4.18)–(4.19)
and the final member of (4.21) equals 0 by (4.8). There is equality in (4.21) for \(t_1=\cdots =t_m=1\). (iv) is shown in a similar fashion as (iii). \(\square \)
Proof of Proposition 3.4
(i) Let \(0\le t_1,\ldots ,t_m\le 1\). Since \(0\le \Phi _m(t_1,\ldots ,t_m)\le 1\), we see that \(\Phi _m^{k-m}(t_1,\ldots ,t_m)\) decreases in k, and so does \(J_m\). (ii) Let \(m=3,4,\ldots ,k-1\). Since \(0\le \Phi _m(t_1,\ldots ,t_m)\le 1\), we have
\(\square \)
Proof of Theorem 3.5
We must bound the quantity in [ ] at the right-hand side of (3.14). By Proposition 3.4 (i), we have that \(J_1\) and \(J_2\) are bounded, and so
By Proposition 3.4 (ii), we have
The ratio \(t_m/t_{m+1}\) of two consecutive terms equals
and this exceeds 1 from \(m\sim m_0:=\sqrt{k(\tau -1)/\mathrm{e}}\) onwards. Thus, the largest terms in \(\sum _{m=3}^{k-1}\,t_m\) occur for m of the order \(\sqrt{k}\). With \(m=O(\sqrt{k})\), we have
Then using Stirling’s formula, \(m!=m^m\,\mathrm{e}^{-m}\,\sqrt{2\pi m}\,(1+O(1/m))\), we find that
We aim at a Gaussian approximation of the dominant factor \([k\mathrm{e}(\tau -1)/m(m-\tau )]^m\) at the right-hand side of (4.27). We have
The leading term at the right-hand side of (4.28) vanishes at \(m=m_0=\sqrt{k(\tau -1)/\mathrm{e}}\). At \(m=m_0\), we evaluate
Thus, we find the Gaussian approximation
for m near \(m_0\) (validity range: \(|m-m_0|=o(m_0^{2/3})\)). Then, from (4.23), (4.24), (4.27) and (4.31), we get
as required. \(\square \)
5 Remaining Proofs for Cycles
We replace the upper integration limits \(\infty \) in (3.16) by \(\gamma _n^2\), as in Sect. 4.
Proof of Theorem 3.6
After the basic substitution in (4.1), we get
where
with \(g(x)=x^{-\tau /2}\,f(x)\). The substitution
then yields
where
and Theorem 3.6 follows. \(\square \)
Proof of Theorem 3.7
The formula (3.22) for \(\mathrm{det}({\mathrm{C}})\) follows from basic matrix operations with \({\mathrm{C}}\) in (3.20). Hence, \({\mathrm{C}}\) is non-singular when k is odd. Next, from (3.3) and (5.5), we have
and so j(u) has exponential decay as \(|u|\rightarrow \infty \). Therefore, \(F(\mathbf{u})\) in (3.18) is absolutely integrable over \({\mathbb {R}}^k\), and by the substitution \(\mathbf{u}={\mathrm{C}}\mathbf{t}\), with \(\mathrm{det}({\mathrm{C}})=2\), we get
with integration range \(R(A)={\mathrm{C}}([{-}A,A]^k)\). By non-singularity of \({\mathrm{C}}\), there is a \(\delta >0\) such that
and so we get
where we have used the definition of F in (3.18). Finally, by (3.19) and the substitution \(x=\mathrm{e}^u\in (0,\infty )\), we get
and this is finite because of (3.3) and \(2<\tau <3\). \(\square \)
Proof of Theorem 3.9
Let k be even. From the theory of circulant matrices, we have that \({\mathrm{C}}\) is diagonalizable,
where for \(m=1,\ldots ,k\)
are the eigenvalues and eigenvectors of \({\mathrm{C}}\). With \(k=2j\), we have
Let
be the eigenvector of \({\mathrm{C}}\) corresponding to \(\lambda _j=0\) and let L be its orthogonal complement. It follows from (5.11) that \({\mathrm{C}}\) maps L linearly and injectively onto itself.
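The eigenstructure used in this proof can be checked numerically. The sketch below assumes that \({\mathrm{C}}\) in (3.20) is the \(k\times k\) circulant matrix with ones on the diagonal and on the cyclic superdiagonal, so that \(({\mathrm{C}}\mathbf{t})_i=t_i+t_{i+1}\) with indices modulo k. This form is an assumption made here; it is chosen because it is consistent with \(\mathrm{det}({\mathrm{C}})=2\) for odd k and with an alternating vector spanning the null space for even k. The helper names are illustrative, and only the Python standard library is used.

```python
import cmath


def circulant(k):
    """Assumed form of C: ones on the diagonal and on the cyclic superdiagonal."""
    return [[1 if j == i or j == (i + 1) % k else 0 for j in range(k)]
            for i in range(k)]


def eigenpair(k, m):
    """Candidate eigenvalue 1 + w and unit-length Fourier vector, w = exp(2 pi i m / k)."""
    w = cmath.exp(2j * cmath.pi * m / k)
    lam = 1 + w
    vec = [w ** l / k ** 0.5 for l in range(k)]
    return lam, vec


def matvec(C, v):
    """Plain matrix-vector product for a list-of-lists matrix."""
    return [sum(c * x for c, x in zip(row, v)) for row in C]


def residual(k, m):
    """Largest entry of |C v - lam v| for the m-th candidate eigenpair."""
    lam, v = eigenpair(k, m)
    Cv = matvec(circulant(k), v)
    return max(abs(a - lam * b) for a, b in zip(Cv, v))


def det_from_eigenvalues(k):
    """Determinant computed as the product of the k candidate eigenvalues."""
    prod = 1
    for m in range(1, k + 1):
        prod *= eigenpair(k, m)[0]
    return prod


if __name__ == "__main__":
    for k in (5, 6, 7, 8):
        worst = max(residual(k, m) for m in range(1, k + 1))
        print(k, "max residual:", worst, "|det|:", abs(det_from_eigenvalues(k)))
    # For even k the alternating vector is mapped to zero by the assumed C.
    alternating = [(-1) ** (l + 1) for l in range(8)]
    print(matvec(circulant(8), alternating))
```

Under this assumption the eigenvalue with index \(m=j=k/2\) equals \(1+\mathrm{e}^{\pi \mathrm{i}}=0\), which is the zero eigenvalue \(\lambda _j\) used in the remainder of the proof.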
For \(\mathbf{t}\in {\mathbb {R}}^k\), we write
Then \({\mathrm{C}}\mathbf{t}={\mathrm{C}}\mathbf{w}\), and
Observe that \(A\,\sqrt{k}\,\mathbf{c}=({-}A,A,\ldots ,{-}A,A)^T\) is a corner point of \([{-}A,A]^k\). Let \(\varepsilon \in (0,1)\) and assume that \(a\in {\mathbb {R}}\), \(|a|<(1-\varepsilon )\,A\,\sqrt{k}\). Then
and so
The function \(F(\mathbf{u})\) is absolutely integrable over \(\mathbf{u}\in {\mathbb {R}}^k\) and \({\mathrm{C}}\) is boundedly invertible on L. Therefore, \(\int _{\mathbf{w}\in L}\,F({\mathrm{C}}\mathbf{w})\,d\mathbf{w}\) is finite. It follows from (5.18) that for any \(\varepsilon \in (0,1)\)
uniformly in \(a\in {\mathbb {R}}\), \(|a|<(1-\varepsilon )\,A\,\sqrt{k}\). Therefore, from (5.16), as \(A\rightarrow \infty \)
It remains to compute \(\int _{\mathbf{w}\in L}\,F({\mathrm{C}}\mathbf{w})\,d\mathbf{w}\). The mapping \({\mathrm{C}}:L\rightarrow L\) is invertible, and we have from (5.11)–(5.12)
Thus we have
We represent the condition \(\mathbf{u}\in L\), i.e., \(\mathbf{u}^T\mathbf{c}=0\) with \(\mathbf{c}\) the vector in (5.14) having unit Euclidean length, as
Hence
Hence, the integral over \(\mathbf{u}\) in (5.24) factorizes, and we get
where
is the Fourier transform of j in (3.19).
Returning to (3.17), we then get from (5.20), (5.22) and (5.26)
and this yields Theorem 3.9 since \(A=\mathrm{log}\,\gamma _n\). \(\square \)
References
Bianconi, G., Marsili, M.: Loops of any size and Hamilton cycles in random scale-free networks. J. Stat. Mech. 2005(06), P06005 (2005)
Bianconi, G., Marsili, M.: Emergence of large cliques in random scale-free networks. EPL (Europhys. Lett.) 74(4), 740 (2006)
Bingham, N.H., Goldie, C.M., Teugels, J.L.: Regular Variation. Cambridge University Press, Cambridge (1989)
Boguñá, M., Pastor-Satorras, R.: Class of correlated random networks with hidden variables. Phys. Rev. E 68, 036112 (2003)
Bollobás, B., Janson, S., Riordan, O.: The phase transition in inhomogeneous random graphs. Random Struct. Algorithms 31(1), 3–122 (2007)
Britton, T., Deijfen, M., Martin-Löf, A.: Generating simple random graphs with prescribed degree distribution. J. Stat. Phys. 124(6), 1377–1397 (2006)
Chung, F., Lu, L.: The average distances in random graphs with given expected degrees. Proc. Natl. Acad. Sci. USA 99(25), 15879–15882 (2002)
Norros, I., Reittu, H.: On a conditionally Poissonian graph process. Adv. Appl. Probab. 38(01), 59–75 (2006)
Park, J., Newman, M.E.J.: Statistical mechanics of networks. Phys. Rev. E 70, 066117 (2004)
Stegehuis, C.: Degree correlations in scale-free null models. arXiv:1709.01085 (2017)
Stegehuis, C., van der Hofstad, R., Janssen, A.J.E.M., van Leeuwaarden, J.S.H.: Clustering spectrum of scale-free networks. Phys. Rev. E 96(4), 042309 (2017)
van der Hofstad, R., Janssen, A.J.E.M., van Leeuwaarden, J.S.H., Stegehuis, C.: Local clustering in scale-free networks with hidden variables. Phys. Rev. E 95, 022307 (2017)
van der Hofstad, R., van Leeuwaarden, J.S.H., Stegehuis, C.: Optimal subgraph structures in scale-free configuration models. arXiv:1709.03466 (2017)
Voitalov, I., van der Hoorn, P., van der Hofstad, R., Krioukov, D.: Scale-free networks well done. arXiv:1811.02071 (2018)
Acknowledgements
This work is supported by NWO Gravitation Networks grant 024.002.003.
Additional information
Communicated by Eric A. Carlen.
Rights and permissions
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.
About this article
Cite this article
Janssen, A.J.E.M., van Leeuwaarden, J.S.H. & Shneer, S. Counting Cliques and Cycles in Scale-Free Inhomogeneous Random Graphs. J Stat Phys 175, 161–184 (2019). https://doi.org/10.1007/s10955-019-02248-w
DOI: https://doi.org/10.1007/s10955-019-02248-w