Abstract
We investigate the asymptotic number of induced subgraphs in power-law uniform random graphs. We show that these induced subgraphs typically appear on vertices with specific degrees, which are found by solving an optimization problem. Furthermore, we show that this optimization problem allows us to design a linear-time, randomized algorithm that distinguishes uniform random graphs from random graph models that create graphs with approximately a desired degree sequence: power-law rank-1 inhomogeneous random graphs. This algorithm uses the fact that some specific induced subgraphs appear significantly more often in uniform random graphs than in rank-1 inhomogeneous random graphs.
1 Introduction
Many networks were found to have a degree distribution that is well approximated by a power-law distribution with exponent \(\tau \in (2,3)\). These power-law real-world networks are often modeled by random graphs: randomized mathematical models that create networks. One of the most natural random graph models to consider is the uniform random graph [14, 16]. Given a degree sequence, the uniform random graph samples a graph uniformly at random from all possible graphs with exactly that degree sequence.
The most common way to analyze uniform random graphs is to instead analyze the configuration model, another random graph model that is easier to handle [2]. The configuration model creates random multigraphs with a specified degree sequence, i.e., graphs in which multiple edges and self-loops may be present. Conditionally on the event that the configuration model results in a simple graph, it is distributed as a uniform random graph. If the probability of this event is sufficiently large, results for the configuration model can be translated to the uniform random graph. In the case of power-law degrees with exponent \(\tau \in (2,3)\), however, the probability that the configuration model results in a simple graph vanishes, so that the configuration model cannot be used to analyze power-law uniform random graphs [11]. In this setting, uniform random graphs need to be analyzed directly instead. This is in general difficult: the edge statuses are dependent, and there is no simple algorithm for constructing a uniform random graph with power-law degrees.
Several other random graph models that are easy to generate, create graphs with approximately the desired degree sequence. The most prominent such models are rank-1 inhomogeneous random graphs [1, 3, 4]. In these models, every vertex is equipped with a weight, and pairs of vertices are connected independently with a probability that is a function of the vertex weights. Another such model is the erased configuration model, which erases all multiple edges and self-loops in the configuration model [3]. As these models are easy to generate and easy to analyze, they are often analyzed as a proxy for random graphs with a desired degree sequence.
In this paper, we investigate induced subgraphs in uniform random graphs. Several special cases of subgraph counts in uniform random graphs have been analyzed before, such as cycles [8, 16]. However, existing results often require a bound on the maximal degree in the graph or the assumption that all degrees are equal, which rules out power-law random graphs with \(\tau \in (2,3)\). Recently, triangles in uniform power-law random graphs have also been analyzed [5]. In this paper, we investigate the subgraph counts of all possible induced subgraphs by using a recent method based on optimization models [7, 10], which enabled the analysis of subgraph counts in erased configuration models and preferential attachment models. We combine this method with novel estimates on the connection probabilities in uniform random graphs [6] to obtain an optimization problem that finds the most likely composition of an induced subgraph of a power-law uniform random graph. This method allows us to localize and enumerate all possible induced subgraphs.
We then use this optimization problem to design a randomized algorithm that distinguishes two types of rank-1 inhomogeneous random graphs from uniform random graphs in linear time. Interestingly, this shows that approximate-degree power-law random graphs are fundamentally different in structure from power-law uniform random graphs. Indeed, there are subgraphs that appear significantly more often in uniform random graphs than in these rank-1 inhomogeneous random graphs. Furthermore, the optimization problem that we use to prove results on the number of subgraphs allows us to detect these differences in linear time, while subgraph counting in general cannot be done in linear time.
We first introduce the uniform random graph and the induced subgraph counts in Sect. 1. Then, we present our main results on subgraph counts in the large network limit in Sect. 2. After that, we discuss the implications of these results for distinguishing uniform random graphs from inhomogeneous random graphs in Sect. 2.2. We then provide the proofs of our main results in Sects. 3–6.
Notation We denote \([k]=\{1,2,\ldots ,k\}\). We say that a sequence of events \(({{\mathcal {E}}}_n)_{n\ge 1}\) happens with high probability (w.h.p.) if \(\lim _{n\rightarrow \infty }{{\mathbb {P}}}\left( {{\mathcal {E}}}_n\right) =1\) and we use \({\mathop {\longrightarrow }\limits ^{\scriptscriptstyle {{{\mathbb {P}}}}}}\) for convergence in probability. We write \(f(n)=o(g(n))\) if \(\lim _{n\rightarrow \infty }f(n)/g(n)=0\), and \(f(n)=O(g(n))\) if |f(n)|/g(n) is uniformly bounded. We write \(f(n)=\Theta (g(n))\) if \(f(n)=O(g(n) )\) as well as \(g(n)=O(f(n))\). We say that \(X_n=O_{\scriptscriptstyle {{{\mathbb {P}}}}}(g(n))\) for a sequence of random variables \((X_n)_{n\ge 1}\) if \(|X_n|/g(n)\) is a tight sequence of random variables, and \(X_n=o_{\scriptscriptstyle {{{\mathbb {P}}}}}(g(n))\) if \(X_n/g(n){\mathop {\longrightarrow }\limits ^{\scriptscriptstyle {{{\mathbb {P}}}}}}0\).
Uniform random graphs Given a positive integer n and a graphical degree sequence, i.e., a sequence of n positive integers \({\varvec{d}}=(d_1,d_2,\ldots , d_n)\) that can be realized by a simple graph, the uniform random graph (\(\mathrm{URG}^{\scriptscriptstyle {(n)}}({\varvec{d}})\)) is a simple graph, uniformly sampled from the set of all simple graphs with degree sequence \((d_i)_{i\in [n]}\). Let \(d_{\max }=\max _{i\in [n]}d_i\) and \(L_n=\sum _{i=1}^n d_i\). We denote the empirical degree distribution by
$$\begin{aligned} F_n(j)=\frac{1}{n}\sum _{i\in [n]}\mathbb {1}\left\{ d_i\le j\right\} . \end{aligned}$$(1.1)
We study the setting where the variance of \({\varvec{d}}\) diverges as n grows large, which implies that \(d_{\max }\) grows with the network size n. In particular, we assume that the degree sequence satisfies the following assumption:
Assumption 1.1
(Degree sequence)
(i)
There exist \(\tau \in (2,3)\) and constants \(K_1,K_2>0\) such that for every \(n\ge 1\) and every \(0\le j\le d_{\max }\),
$$\begin{aligned} K_1 j^{1-\tau }\le 1-F_n(j)\le K_2 j^{1-\tau }. \end{aligned}$$(1.2)
(ii)
There exist \(\tau \in (2,3)\) and a constant \(C>0\) such that, for all \(j=O(\sqrt{n})\),
$$\begin{aligned} 1-F_n(j) = Cj^{1-\tau }(1+o(1)). \end{aligned}$$(1.3)
It follows from (1.2) that
$$\begin{aligned} d_{\max }=O\big (n^{1/(\tau -1)}\big ). \end{aligned}$$(1.4)
Furthermore, Assumptions (i) and (ii) together show that
$$\begin{aligned} L_n=\mu n(1+o(1)) \end{aligned}$$(1.5)
for some \(\mu >0\).
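As an illustration, the tail bounds of Assumption 1.1(i) can be checked numerically. The following sketch uses a synthetic power-law degree sequence (sampled here for illustration; it is not taken from the paper):

```python
import random

def empirical_tail(degrees):
    """Return j -> 1 - F_n(j), the empirical tail of the degree sequence."""
    n = len(degrees)
    return lambda j: sum(1 for d in degrees if d > j) / n

# Synthetic power-law degrees with tau = 2.5: if U is uniform on (0, 1), then
# floor(U^{-1/(tau-1)}) has tail P(D > j) of order j^{1-tau}.
tau = 2.5
random.seed(1)
degrees = [int(random.random() ** (-1 / (tau - 1))) for _ in range(10**5)]

tail = empirical_tail(degrees)
# The ratios (1 - F_n(j)) / j^{1-tau} should stay between constants K_1, K_2.
ratios = [tail(j) / j ** (1 - tau) for j in (1, 2, 4, 8, 16, 32)]
print(ratios)
```

The printed ratios staying bounded away from 0 and infinity over a range of j is exactly the content of (1.2) for this sample.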
2 Main Results
We now present our main results. Let \(H=(V_H,{{{\mathcal {E}}}}_H)\) be a small, connected graph. We are interested in N(H), the induced subgraph count of H: the number of induced subgraphs of \(\mathrm{URG}^{\scriptscriptstyle {(n)}}({\varvec{d}})\) that are isomorphic to H. Let \(\mathrm{URG}^{\scriptscriptstyle {(n)}}({\varvec{d}})|_{{\varvec{v}}}\) denote the induced subgraph obtained by restricting \(\mathrm{URG}^{\scriptscriptstyle {(n)}}({\varvec{d}})\) to the vertices \({\varvec{v}}\). We can write the probability that an induced subgraph H with \(|V_H|=k\) is created on k uniformly chosen vertices \({\varvec{v}}=(v_1, \ldots , v_k)\) in \(\mathrm{URG}^{\scriptscriptstyle {(n)}}({\varvec{d}})\) as
$$\begin{aligned} {{\mathbb {P}}}\big (\mathrm{URG}^{\scriptscriptstyle {(n)}}({\varvec{d}})|_{{\varvec{v}}}={{{\mathcal {E}}}}_H\big )=\sum _{{\varvec{d}}'}{{\mathbb {P}}}\big (\mathrm{URG}^{\scriptscriptstyle {(n)}}({\varvec{d}})|_{{\varvec{v}}}={{{\mathcal {E}}}}_H\mid d_{{\varvec{v}}}={\varvec{d}}'\big )\,{{\mathbb {P}}}\left( d_{{\varvec{v}}}={\varvec{d}}'\right) , \end{aligned}$$(2.1)
where the sum is over all possible degrees \({\varvec{d}}'=(d_i')_{i\in [k]}\) on k vertices, and \(d_{{\varvec{v}}}=(d_{v_i})_{i\in [k]}\) denotes the degrees of the randomly chosen set of k vertices. Recently, it has been shown that in erased configuration models, there is a specific range of \(d_1',\ldots ,d_k'\) that gives the maximal contribution to the number of subgraphs on vertices of those degrees, a contribution large enough that all other degree ranges can be ignored [10]. In this paper, we show that (2.1) is likewise maximized by specific ranges of \(d_1',\ldots ,d_k'\) that depend on the subgraph H.
Furthermore, we show that when (2.1) is maximized by a unique range of degrees, there are only four possible ranges of degrees that maximize the term inside the sum in (2.1). These ranges are constant degrees, or degrees proportional to \(n^{(\tau -2)/(\tau -1)}\), to \(\sqrt{n}\) or to \(n^{1/(\tau -1)}\). Interestingly, these are the same ranges that contribute to the erased configuration model [10]. However, the optimal distribution of the subgraph vertices over these ranges may be different in the erased configuration model and the uniform random graph.
2.1 Optimizing the Subgraph Degrees
We now present the optimization problem that maximizes the summand in (2.1) for induced subgraphs. Let \(H=(V_H,{{{\mathcal {E}}}}_H)\) be a small, connected graph on \(k\ge 3\) vertices. Denote by \(V_1\) the set of vertices of H that have degree one inside H. Let \({{\mathcal {P}}}\) be the set of all partitions of \(V_H\setminus V_1\) into three disjoint sets \(S_1,S_2,S_3\). This partition into \(S_1,S_2\) and \(S_3\) corresponds to the optimal orders of magnitude of the degrees in (2.1): \(S_1\) is the set of vertices with degree proportional to \(n^{(\tau -2)/(\tau -1)}\), \(S_2\) the set with degrees proportional to \(n^{1/(\tau -1)}\), and \(S_3\) the set of vertices with degrees proportional to \(\sqrt{n}\). We then derive an optimization problem that finds the partition of the vertices into these three orders of magnitude that maximizes the contribution to the number of induced subgraphs. When a vertex in H has degree 1, its degree in \(\mathrm{URG}^{\scriptscriptstyle {(n)}}({\varvec{d}})\) is typically small, i.e., it does not grow with n.
Given a partition \({{\mathcal {P}}}=(S_1,S_2,S_3)\) of \(V_H\setminus V_1\), let \({{{\mathcal {E}}}}_{S_i}\) denote the set of edges in H between vertices in \(S_i\) and \(E_{S_i}=|{{{\mathcal {E}}}}_{S_i}|\) its size, \({{{\mathcal {E}}}}_{S_i,S_j}\) the set of edges between vertices in \(S_i\) and \(S_j\) and \(E_{S_i,S_j}=|{{{\mathcal {E}}}}_{S_i,S_j}|\) its size, and finally \({{{\mathcal {E}}}}_{S_i,V_1}\) the set of edges between vertices in \(V_1\) and \(S_i\) and \(E_{S_i,V_1}=|{{{\mathcal {E}}}}_{S_i,V_1}|\) its size. We now define the optimization problem that optimizes the summand in (2.1) as
$$\begin{aligned} B(H)=\max _{{{\mathcal {P}}}}\ |S_1|+\frac{|S_2|\big (2-\tau -(k-|S_1|-k_1)\big )}{\tau -1}+\frac{2E_{S_2}-2E_{S_1}+E_{S_2,S_3}-E_{S_1,S_3}+E_{S_2,V_1}-E_{S_1,V_1}}{\tau -1}, \end{aligned}$$(2.2)
where \(k_1=|V_1|\).
In Sect. 3, we show that this optimization problem is a measure of how likely a configuration of vertices in the three sets \(S_1, S_2, S_3\) is to form subgraph H. The first term in B(H) gives a positive contribution for all vertices in \(S_1\). These are vertices with relatively low degree. In the second term, note that \(k-|S_1|-k_1=|S_3|+|S_2|\), so that the term within brackets is negative, as we assume that \(\tau \in (2,3)\). Thus the second term gives a negative contribution for vertices in \(S_2\), which have high degrees. Therefore, the first two terms in the optimization problem capture that high-degree vertices are rare, and low-degree vertices abundant. The last term gives a negative contribution for all edges between vertices with relatively low degrees in the subgraph, such as edges to vertices in \(S_1\), while it gives a positive contribution for edges to high-degree vertices with one end in \(S_2\). This captures the other part of the trade-off: high-degree vertices are more likely to connect to other vertices than low degree vertices. Note that \(B(H)\ge 0\), since putting all vertices in \(S_3\) yields zero.
Let \(S_1^* ,S_2^* ,S_3^* \) be a maximizer of (2.2). Furthermore, for any \((\alpha _1,\ldots , \alpha _k)\) such that \(\alpha _i\in [0,1/(\tau -1)]\), define
$$\begin{aligned} M_n^{(\varvec{\alpha })}(\varepsilon )=\big \{ (v_1,\ldots ,v_k):d_{v_i}\in [\varepsilon n^{\alpha _i},n^{\alpha _i}/\varepsilon ]\ \text {for all }i\in [k]\big \} . \end{aligned}$$(2.3)
These are the sets of vertices \((v_1,\ldots , v_k)\) such that \(d_{v_1}\) is proportional to \(n^{\alpha _1}\), \(d_{v_2}\) proportional to \(n^{\alpha _2}\), and so on. Denote the number of subgraphs with vertices in \(M_n^{(\varvec{\alpha })}(\varepsilon )\) by \(N (H,M_n^{(\varvec{\alpha })}(\varepsilon ))\). Define the vector \(\varvec{\alpha }\) as
$$\begin{aligned} \alpha _i={\left\{ \begin{array}{ll} (\tau -2)/(\tau -1) &{} \text {if }i\in S_1^*,\\ 1/(\tau -1) &{} \text {if }i\in S_2^*,\\ 1/2 &{} \text {if }i\in S_3^*,\\ 0 &{} \text {if }i\in V_1. \end{array}\right. } \end{aligned}$$(2.4)
The next theorem shows that sets of vertices in \(M_n^{\varvec{\alpha } }(\varepsilon )\) contain a large number of subgraphs, and computes the scaling of the number of induced subgraphs:
Theorem 2.1
(General induced subgraphs) Let H be a subgraph on k vertices such that the solution to (2.2) is unique.
(i)
For any \(\varepsilon _n\) such that \(\lim _{n\rightarrow \infty }\varepsilon _n=0\),
$$\begin{aligned} \frac{N \big (H,M_n^{(\varvec{\alpha } )}\left( \varepsilon _n\right) \big ) }{N (H)}{\mathop {\longrightarrow }\limits ^{\scriptscriptstyle {{{\mathbb {P}}}}}}1, \end{aligned}$$(2.5)
with \(\varvec{\alpha }\) as defined in (2.4).
(ii)
Furthermore, for any fixed \(0<\varepsilon <1\),
$$\begin{aligned} \frac{N (H,M_n^{(\varvec{\alpha } )}(\varepsilon ))}{n^{\frac{3-\tau }{2}(k_{2+}+B (H))+k_1/2}} \le f(\varepsilon )+o_{\scriptscriptstyle {{{\mathbb {P}}}}}(1), \end{aligned}$$(2.6)
and
$$\begin{aligned} \frac{N (H,M_n^{(\varvec{\alpha } )}(\varepsilon ))}{n^{\frac{3-\tau }{2}(k_{2+}+B (H))+k_1/2}} \ge {\tilde{f}}(\varepsilon )+o_{\scriptscriptstyle {{{\mathbb {P}}}}}(1), \end{aligned}$$(2.7)
for some functions \(f(\varepsilon ),{\tilde{f}}(\varepsilon )<\infty \) not depending on n, and with \(\varvec{\alpha }\) as defined in (2.4). Here \(k_{2+}\) denotes the number of vertices in H of degree at least 2, and \(k_1\) the number of degree-one vertices in H.
Thus, Theorem 2.1(i) shows that asymptotically, almost all copies of the induced subgraph H have their vertices in \(M_n^{\varvec{\alpha } }(\varepsilon )\), and Theorem 2.1(ii) then computes the scaling in n of the number of such induced subgraphs.
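Extracting the candidate vertex sets \(M_n^{(\varvec{\alpha })}(\varepsilon )\) from a degree list is straightforward. A minimal sketch (the helper name `M_n` and the toy degree sequence are ours, for illustration only):

```python
def M_n(degrees, alphas, eps):
    """Per-vertex candidate lists for M_n^(alpha)(eps): position i of the
    output holds the vertices v with d_v in [eps * n^alpha_i, n^alpha_i / eps].
    Tuples with one distinct vertex from each list form M_n^(alpha)(eps)."""
    n = len(degrees)
    out = []
    for a in alphas:
        lo, hi = eps * n ** a, n ** a / eps
        out.append([v for v, d in enumerate(degrees) if lo <= d <= hi])
    return out

# Toy sequence with n = 1000 (not from the paper): for a triangle the optimal
# alphas are all 1/2, so each list collects the degree-~sqrt(n) vertices.
degrees = [2] * 900 + [30] * 90 + [300] * 10
print([len(c) for c in M_n(degrees, [0.5, 0.5, 0.5], eps=0.5)])
```

Here only the 90 vertices of degree 30 fall within a factor 2 of \(\sqrt{1000}\approx 31.6\), so each of the three candidate lists has length 90.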
Now we study the special class of induced subgraphs for which the unique maximum of (2.2) is attained at \(S_3^*=V_H\). By the above interpretation of \(S_1^*\), \(S_2^*\) and \(S_3^*\), these are induced subgraphs for which the maximum contribution to the subgraph count comes from vertices with degrees proportional to \(\sqrt{n}\) in \(\mathrm{URG}^{\scriptscriptstyle {(n)}}({\varvec{d}})\). Plugging \(S_3^*=V_H\) into (2.2) yields \(B(H)=0\): then \(S_1^*,S_2^*=\emptyset \), so that all terms containing \(S_1\) and \(S_2\) disappear. This also means that when the maximizer of (2.2) is unique and the maximum equals \(B(H)=0\), the maximum must be attained at \(S_3^*=V_H\). For such induced subgraphs, we can obtain the detailed asymptotic scaling including the leading constant:
Theorem 2.2
(Induced subgraphs with \(\sqrt{n}\) degrees) Let H be a connected graph on k vertices with minimal degree 2 such that the solution to (2.2) is unique, and \(B (H)=0\). Then,
with
\(\sqrt{n}\)-subgraphs Theorem 2.2 provides detailed asymptotics for induced subgraphs where the optimizer of (3.19) is given by \(S_3=V_H\). These induced subgraphs include all complete graphs and all cycles. Figure 1 shows the optimal structures of all induced subgraphs on 4 vertices. This figure indicates that on 4 vertices, Theorem 2.2 applies to the cycle and the complete graph only.
Optimal induced subgraph structures Interestingly, Theorem 2.1 implies that the number of copies of a specific induced subgraph H is dominated by the copies whose vertices have specific degrees in \(\mathrm{URG}^{\scriptscriptstyle {(n)}}({\varvec{d}})\), determined by maximizing (2.2). First restricting to these degrees, and then analyzing the subgraph count, allows us to obtain the scaling of the number of induced subgraphs in power-law uniform random graphs, where the standard analysis via the configuration model breaks down. Furthermore, this yields not only the total number of subgraphs, but also where in the graph we are most likely to find them (i.e., on which degrees).
Automorphisms of H An automorphism of a graph H is a bijection \(V_H\rightarrow V_H\) that preserves adjacency. In Theorem 2.2 we count automorphisms of H as separate copies, so that we may count multiple copies of H on one set of vertices and edges. Therefore, to count the number of induced subgraphs without automorphisms, one should divide the results of Theorem 2.2 by the number of automorphisms of H.
Uniqueness of the solution One of the assumptions of Theorem 2.2 is that the solution to (3.19) is unique. The smallest subgraph such that this is not the case is depicted in Fig. 2. Here one of the optimal solutions is \(S_3=V_H\) yielding \(B(H)=0\), as shown in Fig. 2a. The other optimal solution is shown in Fig. 2b, and contains vertices in \(S_1\) and \(S_2\) instead. On all 5-vertex subgraphs on the other hand, the optimizer turns out to be unique.
2.2 Distinguishing Uniform Random Graphs from Rank-1 Inhomogeneous Random Graphs
Uniform random graphs create random networks that are uniformly sampled from all graphs with precisely a desired degree sequence. However, in the power-law degree range with \(\tau \in (2,3)\), it is difficult to generate such graphs, as the method of generating configuration models until a simple graph is obtained no longer works [11]. Therefore, random graph models that generate networks with approximately a desired degree sequence are often used as a proxy for uniform random graphs, as many of these models are easy to generate. One such model is the rank-1 inhomogeneous random graph [1, 4]. In the inhomogeneous random graph, every vertex i is equipped with a weight \(w_i\). Here we assume that these weights are sampled from a power-law distribution with \(\tau \in (2,3)\). Several choices of the connection probability \(p(w_i,w_j)\) are then possible. Common choices are [1, 4]
where \(\mu \) denotes the average weight. By choosing the connection probabilities in this manner, the degree of vertex i is approximately \(w_i\) with high probability [9].
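The standard rank-1 connection probabilities from the literature can be compared numerically. A sketch, assuming that (2.12) denotes the generalized random graph form \(w_iw_j/(\mu n+w_iw_j)\) (the form matched by the uniform random graph in Lemma 3.1); which of the Chung–Lu and Norros–Reittu forms corresponds to (2.10) and which to (2.11) is left open here:

```python
import math

# Three standard rank-1 connection probabilities; for w_i * w_j << mu * n
# they all agree to first order, but their tail behaviour differs.
def p_chung_lu(wi, wj, mu, n):
    """Chung-Lu form: min(w_i w_j / (mu n), 1)."""
    return min(wi * wj / (mu * n), 1.0)

def p_norros_reittu(wi, wj, mu, n):
    """Norros-Reittu form: 1 - exp(-w_i w_j / (mu n))."""
    return 1.0 - math.exp(-wi * wj / (mu * n))

def p_grg(wi, wj, mu, n):
    """Generalized random graph form: w_i w_j / (mu n + w_i w_j),
    assumed here to be the one denoted (2.12)."""
    return wi * wj / (mu * n + wi * wj)

# Small weights: the three choices are nearly indistinguishable.
print(p_chung_lu(3, 4, 2, 10**6), p_norros_reittu(3, 4, 2, 10**6),
      p_grg(3, 4, 2, 10**6))
```

For weights with \(w_iw_j\gg \mu n\) the three forms separate: the Chung–Lu probability is truncated at 1, while the other two approach 1 smoothly, which is what drives the subgraph-count differences discussed below.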
Theorems 2.1 and 2.2 indicate that in terms of induced subgraphs, the uniform random graph produces the same results as the rank-1 inhomogeneous random graph with connection probabilities as in (2.12). Intuitively, this can be seen from the constant A(H) in Theorem 2.2, in which this connection probability appears in a scaled form. Indeed, our proofs are based on the fact that in the uniform random graph, the probability that two vertices are connected can be approximated by (2.12) (see Lemma 3.1). Therefore, Theorems 2.1 and 2.2 also hold for rank-1 inhomogeneous random graphs with connection probability (2.12):
Theorem 2.3
Let \((G_n)_{n\ge 1}\) be a sequence of rank-1 inhomogeneous random graphs with weight sequence satisfying Assumption 1.1, and connection probabilities as in (2.12). Then, Theorems 2.1 and 2.2 also hold for the number of copies of subgraph H in \(G_n\).
In [10, 15], similar theorems were derived for random graphs with connection probabilities (2.10) and (2.11). The number of induced subgraphs in the model with connection probability (2.10) has the same scaling in n as in the model with connection probability (2.11). However, the scaling in n of the number of copies of some induced subgraphs in the models (2.10) and (2.11) may differ from the scaling in the uniform random graph. The smallest such subgraphs are of size 6, and are plotted in Fig. 3. Figure 3 shows that these two subgraphs appear significantly more often in the uniform random graph than in these inhomogeneous random graphs.
Interestingly, this means that rank-1 inhomogeneous random graphs and uniform random graphs can be distinguished by studying small subgraph patterns of size 6. Previous results showed that random graphs with connection probability (2.12) can be distinguished from those generated with connection probabilities (2.11) and (2.10) by their maximum clique size, which differs by a factor of \(\log (n)\) [12]. However, finding the largest clique is an NP-hard problem [13], while our method only needs subgraphs of size 6 as input, which can be detected in polynomial time. Furthermore, the difference between the numbers of the induced subgraphs of Fig. 3 is not a logarithmic but a polynomial factor, making such differences easier to detect.
Specifically, we can show that in only O(n) time, it is possible to distinguish between power-law uniform random graphs and the approximate-degree random graph models of (2.10) and (2.11) with high probability:
Theorem 2.4
There exists a randomized algorithm that distinguishes power-law uniform random graphs from power-law rank-1 inhomogeneous random graphs with connection probabilities (2.10) or (2.11) in time O(n) with accuracy at least \(1-n^\gamma e ^{-cn^{\beta }}\) for some \(\gamma ,\beta ,c>0\).
We will prove this theorem in Sect. 6, where we also introduce the randomized algorithm that distinguishes between these two random graph models. This algorithm is based on the subgraph displayed in Fig. 3c and d. It first selects vertices that have degrees close to \(n^{1/(\tau -1)}\) and \(n^{(\tau -2)/(\tau -1)}\), and then randomly searches among those vertices for the induced subgraph of Fig. 3c. In a uniform random graph, this will be successful with high probability, whereas in the rank-1 inhomogeneous random graphs with connection probabilities (2.10) and (2.11) the algorithm fails with high probability.
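The selection-and-probe procedure just described can be sketched as follows. This is a hedged sketch, not the algorithm of Sect. 6: the concrete size-6 pattern of Fig. 3c is not reproduced in this text, so the pattern and the assignment of its vertices to the two degree scales are left as inputs:

```python
import random

def find_pattern(adj, degrees, tau, pattern, classes, trials=1000, eps=0.3):
    """Hypothetical helper sketching the randomized test behind Theorem 2.4.
    adj: dict vertex -> set of neighbours; pattern: edge set (pairs) on the
    vertices {0, ..., len(classes)-1}; classes[i] == "high" targets degrees
    near n^{1/(tau-1)}, "low" targets degrees near n^{(tau-2)/(tau-1)}."""
    n = len(degrees)
    k = len(classes)
    scale = {"high": n ** (1 / (tau - 1)), "low": n ** ((tau - 2) / (tau - 1))}
    # O(n): collect the vertices within a constant factor of each target scale
    bucket = {c: [v for v in range(n) if eps * s <= degrees[v] <= s / eps]
              for c, s in scale.items()}
    if any(not bucket[c] for c in set(classes)):
        return False
    pattern = {frozenset(e) for e in pattern}
    for _ in range(trials):  # a constant number of random probes
        chosen = [random.choice(bucket[c]) for c in classes]
        if len(set(chosen)) < k:
            continue  # the k pattern vertices must be distinct
        # success iff the chosen vertices induce exactly the target pattern
        if all((frozenset((i, j)) in pattern) == (chosen[j] in adj[chosen[i]])
               for i in range(k) for j in range(i + 1, k)):
            return True  # found: evidence for the uniform random graph
    return False  # not found: evidence for the rank-1 models
```

The bucketing pass is linear in n, and the number of probes is a constant, which is consistent with the O(n) running time claimed in Theorem 2.4.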
Organization of the proofs We will prove Theorems 2.1–2.4 in the following sections. First, Sect. 3 proves Theorem 2.1(ii), by calculating the probability that H appears on a specified subset of vertices, and optimizing that probability. Then, Sect. 4 proves Theorem 2.2 with a second moment method. Section 5 proves Theorem 2.1(i), and Sect. 6 introduces and analyzes the randomized algorithm that proves Theorem 2.4.
3 Proof of Theorem 2.1(ii)
We first provide an overview of the proof strategy. The main step in proving Theorem 2.1 is estimating the probability that a subgraph appears on vertices of specific degrees. We show in Sect. 3.2 that this probability scales as a power of the network size n. After that, we optimize this power of n as a function of the vertex degrees to obtain the vertex degrees that carry the most copies of subgraph H. In Lemma 3.2 we characterize these vertex degrees and show that they are given by the sets \(S_1\), \(S_2\) and \(S_3\), yielding the optimization problem (3.19).
3.1 Subgraph Probability in the Uniform Random Graph
We first investigate the probability that a given small graph H appears as an induced subgraph of \(\mathrm{URG}^{\scriptscriptstyle {(n)}}({\varvec{d}})\) on a specific set of vertices \({\varvec{v}}\). We denote the degree of a vertex i inside induced subgraph H by \(d_i^{\scriptscriptstyle {(H)}}\).
Lemma 3.1
Let H be a connected graph on k vertices, and let \({\varvec{d}}\) be a degree sequence satisfying Assumption 1.1. Furthermore, assume that \(d_{v_i}\gg 1\) or \(d_i^{(H)}=1\) for all \(i\in [k]\). Then,
$$\begin{aligned} {{\mathbb {P}}}\big (\mathrm{URG}^{\scriptscriptstyle {(n)}}({\varvec{d}})|_{{\varvec{v}}}={{{\mathcal {E}}}}_H\big )=(1+o(1))\prod _{\{i,j\}\in {{{\mathcal {E}}}}_H}\frac{d_{v_i}d_{v_j}}{\mu n+d_{v_i}d_{v_j}}\prod _{\{i,j\}\notin {{{\mathcal {E}}}}_H}\frac{\mu n}{\mu n+d_{v_i}d_{v_j}}. \end{aligned}$$(3.1)
Proof
Suppose that \(G^+\) is a subset of the edges of H, and \(G^-\) a subset of the non-edges of H. Let \({{\mathcal {G}}}^-\) denote the event that the non-edges of \(G^-\) are not present in \(\mathrm{URG}^{\scriptscriptstyle {(n)}}({\varvec{d}})|_{{\varvec{v}}}\) and let \({{\mathcal {G}}}^+\) denote the event that the edges of \(G^+\) are present in \(\mathrm{URG}^{\scriptscriptstyle {(n)}}({\varvec{d}})|_{{\varvec{v}}}\).
Let \(d_{(1)}\ge d_{(2)}\ge \cdots \ge d_{(n)}\) denote the ordered version of \({\varvec{d}}\). Then, by Assumption 1.1(i),
for some constant \(W>0\). Therefore,
for some \({\tilde{C}}>0\). Thus, \(\sum _{i=1}^{Cn^{1/(\tau -1)}}d_{(i)}=o(n)\) for \(\tau \in (2,3)\), while \(L_n=\Theta (n)\) by (1.5). Therefore, we may apply [6, Corollary 2] to obtain
where \(d_i^{(G)}\) denotes the degree of vertex i within \(G^+\). When \(d_i\gg 1\), we have \(d_i-d_i^{(G)}=d_i(1+o(1))\), as \(d_i^{(G)}\le k-1\). Thus, when (\(d_i\gg 1\) or \(d_i^{(G)}=0\)) and (\(d_j\gg 1\) or \(d_j^{(G)}=0\)), (3.4) becomes
Therefore also
We now use (3.5) and (3.6) to compute the probability that H appears as an induced subgraph on vertices \({\varvec{v}}\). Let the m edges of H be denoted by \(e_1=\{i_1,j_1\},\ldots ,e_m=\{i_m,j_m\}\), and the \({k\atopwithdelims ()2}-m\) non-edges of H by \({\bar{e}}_1=\{w_1,z_1\},\ldots ,{\bar{e}}_{k(k-1)/2-m}=\{w_{k(k-1)/2-m},z_{k(k-1)/2-m}\}\). Furthermore, define \(G_0^+=\emptyset \) and \(G_s^+=G_{s-1}^+\cup \{\{v_{i_s},v_{j_s}\}\}\). Similarly, define \(G_0^-=\emptyset \) and \(G_s^-=G_{s-1}^-\cup \{\{v_{w_s},v_{z_s}\}\}\). Then,
We then use (3.5) and (3.6). This is allowed because \(d_{v_i}\gg 1\) or \(d_i^{\scriptscriptstyle {(H)}}=1\) for all \(i\in [k]\), so that when an edge \(e_l\) incident to a vertex i with \(d_i^{\scriptscriptstyle {(H)}}=1\) is added in (3.7), we have \(d_i^{\scriptscriptstyle {(G_{l-1}^+)}}=0\). Indeed, when \(d_i^{\scriptscriptstyle {(H)}}=1\), vertex i has no other incident edges in H, and therefore degree zero in \(G_{l-1}^+\). Thus, we obtain
\(\square \)
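Numerically, the leading-order subgraph probability of Lemma 3.1 is a product of generalized-random-graph factors, one per vertex pair, consistent with the integrand of (4.2). A minimal sketch (the helper name is ours; the \(1+o(1)\) factors are dropped):

```python
import math

def induced_subgraph_prob(deg, edges, mu, n):
    """Leading-order product from Lemma 3.1: each pair {i, j} contributes
    p = d_i d_j / (mu n + d_i d_j) if it is an edge of H, and 1 - p if it
    is a non-edge."""
    edge_set = {frozenset(e) for e in edges}
    prob = 1.0
    k = len(deg)
    for i in range(k):
        for j in range(i + 1, k):
            p = deg[i] * deg[j] / (mu * n + deg[i] * deg[j])
            prob *= p if frozenset((i, j)) in edge_set else 1.0 - p
    return prob

# Three vertices of degree sqrt(mu * n) forming a triangle: each pair has
# edge probability 1/2, so the probability is about (1/2)^3 = 0.125.
print(induced_subgraph_prob([math.sqrt(2 * 10**4)] * 3,
                            [(0, 1), (1, 2), (0, 2)], mu=2, n=10**4))
```

This makes the trade-off of Sect. 3.2 concrete: for degrees of order \(\sqrt{n}\), every pair contributes a constant factor, while larger or smaller degrees push individual factors toward 1 or 0.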
3.2 Optimizing the Probability of a Subgraph
We now study the probability that H is present as an induced subgraph on vertices \((v_1, \ldots , v_k)\) of specific degrees. Assume that \(d_{{v_i}}\in [\varepsilon ,1/\varepsilon ]n^{\alpha _i}\) with \(\alpha _i\in [0,1/(\tau -1)]\) for \(i\in [k]\), so that \(d_{{v_i}}=\Theta (n^{\alpha _i})\).
Let H be an induced subgraph on k vertices labeled as \(1,\ldots ,k\). We now study the probability that \(\mathrm{URG}^{\scriptscriptstyle {(n)}}({\varvec{d}})|_{{\varvec{v}}}={{\mathcal {E}}}_H\).
Let \(X_{u,v}\) denote the indicator function of the event that edge \(\{u,v\}\) is present. When \(\alpha _i+\alpha _j< 1\), by Lemma 3.1
$$\begin{aligned} {{\mathbb {P}}}\left( X_{v_i,v_j}=1\right) =\Theta \big (n^{\alpha _i+\alpha _j-1}\big ), \end{aligned}$$
while
$$\begin{aligned} {{\mathbb {P}}}\left( X_{v_i,v_j}=0\right) =\Theta (1). \end{aligned}$$
On the other hand, for \(\alpha _i+\alpha _j>1\),
$$\begin{aligned} {{\mathbb {P}}}\left( X_{v_i,v_j}=1\right) =\Theta (1), \end{aligned}$$
while
$$\begin{aligned} {{\mathbb {P}}}\left( X_{v_i,v_j}=0\right) =\Theta \big (n^{1-\alpha _i-\alpha _j}\big ). \end{aligned}$$
Furthermore, when \(\alpha _i+\alpha _j=1\), \({{\mathbb {P}}}\left( X_{v_i,v_j}=0\right) =\Theta (1)\) and \({{\mathbb {P}}}\left( X_{v_i,v_j}=1\right) =\Theta (1)\). Combining this with Lemma 3.1 shows that we can write the probability that H occurs as an induced subgraph on \({\varvec{v}}=(v_1,\cdots ,v_k)\) as
Furthermore, by Assumption 1.1 the number of vertices with degrees in \([\varepsilon ,1/\varepsilon ](\mu n)^\alpha \) is \(\Theta (n^{(1-\tau )\alpha +1})\) for \(\alpha \le \frac{1}{\tau -1}\). Then, for \(M_n^{(\varvec{\alpha })}\) as in (2.3),
Thus,
Maximizing the exponent yields
$$\begin{aligned} \max _{\varvec{\alpha }\in [0,1/(\tau -1)]^k}\ \sum _{i\in [k]}\big (1+(1-\tau )\alpha _i\big )+\sum _{\{i,j\}\in {{{\mathcal {E}}}}_H:\,\alpha _i+\alpha _j<1}(\alpha _i+\alpha _j-1)+\sum _{\{i,j\}\notin {{{\mathcal {E}}}}_H:\,\alpha _i+\alpha _j>1}(1-\alpha _i-\alpha _j). \end{aligned}$$(3.16)
The following lemma shows that this optimization problem attains its maximum for specific values of the exponents \(\alpha _i\):
Lemma 3.2
(Maximum contribution to subgraphs) Let H be a connected graph on k vertices. If the solution to (3.16) is unique, then the optimal solution satisfies \(\alpha _i\in \{0,\tfrac{\tau -2}{\tau -1},\tfrac{1}{2},\tfrac{1}{\tau -1}\}\) for all i. If it is not unique, then there exist at least 2 optimal solutions with \(\alpha _i\in \{0,\tfrac{\tau -2}{\tau -1},\tfrac{1}{2},\tfrac{1}{\tau -1}\}\) for all i. In any optimal solution \(\alpha _i=0\) if and only if vertex i has degree one in H.
The proof of this lemma follows a similar structure as the proof of [10, Lemma 4.2], and we therefore defer it to Appendix A. We now use the optimal structure of this optimization problem to prove Theorem 2.1(ii):
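Since Lemma 3.2 restricts the optimal exponents to four candidate values, the maximizer of (3.16) can be found by brute force over \(4^k\) assignments. A sketch, where the objective is the exponent assembled in Sect. 3.2 (with the constants hidden in the \(\Theta \)-terms ignored):

```python
from itertools import product

def optimal_exponents(k, edges, tau, tol=1e-9):
    """Brute-force maximizer of the exponent behind (3.16) over the four
    candidate exponent values of Lemma 3.2:
      sum_i [1 + (1 - tau) * alpha_i]
        + sum over edges {i,j} with alpha_i + alpha_j < 1 of (alpha_i + alpha_j - 1)
        + sum over non-edges {i,j} with alpha_i + alpha_j > 1 of (1 - alpha_i - alpha_j).
    The tolerance guards the boundary case alpha_i + alpha_j = 1 against
    floating-point error."""
    candidates = [0.0, (tau - 2) / (tau - 1), 0.5, 1 / (tau - 1)]
    edge_set = {frozenset(e) for e in edges}
    best, best_alpha = float("-inf"), None
    for alpha in product(candidates, repeat=k):
        val = sum(1 + (1 - tau) * a for a in alpha)
        for i in range(k):
            for j in range(i + 1, k):
                s = alpha[i] + alpha[j]
                if frozenset((i, j)) in edge_set and s < 1 - tol:
                    val += s - 1
                elif frozenset((i, j)) not in edge_set and s > 1 + tol:
                    val += 1 - s
        if val > best + tol:
            best, best_alpha = val, alpha
    return best, best_alpha

# The triangle K_3 with tau = 2.5: the unique maximizer puts every alpha_i
# at 1/2 (all vertices in S_3), with value (3 - tau) * 3 / 2 = 0.75.
print(optimal_exponents(3, [(0, 1), (1, 2), (0, 2)], tau=2.5))
```

For the triangle the returned exponent matches the scaling \(n^{\frac{3-\tau }{2}k}\) of Theorem 2.1(ii) with \(k_{2+}=3\), \(k_1=0\) and \(B(H)=0\).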
Proof of Theorem 2.1(ii)
Let \(\varvec{\alpha }\) be the unique optimizer of (3.16). By Lemma 3.2, the maximal value of (3.16) is attained by partitioning \(V_H\setminus V_1\) into the sets \(S_1,S_2,S_3\) such that vertices in \(S_1\) have \(\alpha _i=\tfrac{\tau -2}{\tau -1}\), vertices in \(S_2\) have \(\alpha _i =\tfrac{1}{\tau -1}\), vertices in \(S_3\) have \(\alpha _i=\tfrac{1}{2}\) and vertices in \(V_1\) have \(\alpha _i =0\). Then, the edges with \(\alpha _i+\alpha _j <1\) are the edges inside \(S_1\), the edges between \(S_1\) and \(S_3\) and the edges incident to degree-one vertices. Furthermore, the non-edges with \(\alpha _i+\alpha _j>1\) are non-edges inside \(S_2\) (of which there are \(\frac{1}{2} |S_2|(|S_2|-1)-E_{S_2}\)) or non-edges between \(S_2\) and \(S_3\) (of which there are \(|S_2||S_3|-E_{S_2,S_3}\)). Recall that the number of edges inside \(S_1\) is denoted by \(E_{S_1}\), the number of edges between \(S_1\) and \(S_3\) by \(E_{S_1,S_3}\) and the number of edges between \(V_1\) and \(S_i\) by \(E_{S_i,V_1}\). Then we can rewrite (3.16) as
over all partitions \({{\mathcal {P}}}=(S_1,S_2,S_3)\) of \(V_H\setminus V_1\). Using that \(|S_3|=k-\left| S_1\right| -\left| S_2\right| -k_1\) and \({E_{S_3,V_1}=k_1-E_{S_1,V_1}-E_{S_2,V_1}}\), where \(k_1=\left| V_1\right| \) and extracting a factor \((3-\tau )/2\) shows that this is equivalent to
Since k and \(k_1\) are fixed and \(3-\tau >0\), we need to maximize
which equals (2.2).
By (3.15), the maximal value of \( N (H,M_n^{(\varvec{\alpha })}(\varepsilon ))\) then scales as
which proves Theorem 2.1(ii). \(\square \)
4 Proof of Theorem 2.2
To prove Theorem 2.2, we need more detailed asymptotics for subgraphs with \(S_3=V_H\) than those provided by Theorem 2.1. We use a second moment method to prove the convergence in probability: we investigate the expected number of copies of H in Lemma 4.2, and show that the variance is small in Lemma 4.3.
In this section, we prove Lemma 4.1 below, from which Theorem 2.2 follows. To this end, we define the special case of \(M_n^{\scriptscriptstyle {(\varvec{\alpha })}}(\varepsilon )\) of (2.3) in which \(\alpha _i=\tfrac{1}{2}\) for all \(i\in V_H=[k]\) as
and let \({\bar{W}}_n^k(\varepsilon )\) denote the complement of \(W_n^k(\varepsilon )\). We denote the number of subgraphs H with all vertices in \(W_n^k(\varepsilon )\) by \(N (H,W_n^k(\varepsilon ))\).
Lemma 4.1
(Major contribution to subgraphs) Let H be a connected graph on \(k{\ge 3}\) vertices such that (2.2) is uniquely optimized at \(S_3=[k]\), so that \(B (H)=0\). Then,
(i)
the number of subgraphs with vertices in \(W_n^k(\varepsilon )\) satisfies
$$\begin{aligned} \frac{N (H,W_n^k(\varepsilon ))}{n^{\frac{k}{2}(3-\tau )}} \rightarrow&(C(\tau -1))^k\mu ^{-\frac{k}{2}(\tau -1)} \int _{\varepsilon }^{1/\varepsilon }\!\!\cdots \int _{\varepsilon }^{1/\varepsilon }(x_1\cdots x_k)^{-\tau }\nonumber \\&\times \prod _{{\{i,j\}\in {{{\mathcal {E}}}}_H}}\frac{x_ix_j}{1+x_ix_j} \ \ \prod _{{\{u,v\}\notin {{{\mathcal {E}}}}_H}}\frac{1}{1+x_ux_v}\mathrm{d}x_1\cdots \mathrm{d}x_k . \end{aligned}$$(4.2)
(ii)
A(H) defined in (2.9) satisfies \(A (H)<\infty \).
We now prove Theorem 2.2 using this lemma.
Proof of Theorem 2.2
We first study the expected number of induced subgraphs with vertices outside \(W_n^k(\varepsilon )\) and show that their contribution to the total number of copies of H is small. First, we investigate the expected number of copies of H in the case where vertex 1 of the subgraph has degree smaller than \(\varepsilon \sqrt{\mu n}\). By Lemma 3.1, the probability that H is present on a specified subset of vertices \({\varvec{v}}=(v_1,\ldots ,v_k)\) satisfies
Furthermore, by (1.3), there exists \(C_0\) such that \({{\mathbb {P}}}\left( D=k\right) \le C_0k^{-\tau }\) for all k, where D denotes the degree of a uniformly chosen vertex. Let \(I(H,{\varvec{v}})=\mathbb {1}{\left\{ \mathrm{URG}^{\scriptscriptstyle {(n)}}({\varvec{d}})|_{{\varvec{v}}}= {{{\mathcal {E}}}}_H\right\} },\) so that \(N(H)=\sum _{{\varvec{v}}} I(H,{\varvec{v}})\).
Define
We can use similar methods as in [5, Eq. (4.4)] to show that for some \(K^*>0\),
For all non-decreasing, once differentiable functions g that are bounded on \([0,\varepsilon \sqrt{\mu n}]\), where \({\bar{G}}(x)\) denotes a function such that \(\int _0^x{\bar{G}}(y)\mathrm{d}y=g(x)\),
where we have used Assumption 1.1(ii). Taking
yields for (4.5)
Now we can bound the first term of (4.8) as
where \(h_1(\varepsilon )\) is a function of \(\varepsilon \). By Lemma 4.1(ii), \(h_1(\varepsilon )\rightarrow 0\) as \(\varepsilon \searrow 0\).
For the second term in (4.8), we obtain
where
and \(h_2(\varepsilon )\) is a function of \(\varepsilon \).
The cases where another vertex has degree smaller than \(\varepsilon \sqrt{n}\), or where one of the vertices has degree larger than \(\sqrt{n}/\varepsilon \), can be bounded similarly. This yields
for some function \(h(\varepsilon )\) not depending on n such that \(h(\varepsilon )\rightarrow 0\) when \(\varepsilon \searrow 0\) and some function \({\tilde{h}}(\varepsilon )\) not depending on n. By the Markov inequality,
Thus, for any \(\delta >0\),
Combining this with Lemma 4.1(i) gives
\(\square \)
4.1 Conditional Expectation
We will prove Lemma 4.1 using a second moment method. Thus, we will first investigate the expected number of copies of induced subgraph H in \(\mathrm{URG}^{\scriptscriptstyle {(n)}}({\varvec{d}})\), and then bound its variance. Let H be a subgraph on k vertices, labeled as [k], and m edges, denoted by \(e_1={\{i_1,j_1\},\ldots ,e_m=\{i_m,j_m\}}\).
Lemma 4.2
(Convergence of conditional expectation of \(\sqrt{n}\) subgraphs) Let H be a subgraph such that (2.2) has a unique maximizer, and the maximum is attained at 0. Then,
Proof
We denote
As
and \(d_{v_i}\ge \varepsilon \sqrt{n}\) for \(i\in [k]\), we get from Lemma 3.1
We then define the measure
By [5, Eq. (4.19)]
Then,
Because the function \(h(t_1,\ldots ,t_k)\) is bounded and continuous on \([\varepsilon ,1/\varepsilon ]^k\),
Combining this with (4.19) proves the lemma. \(\square \)
4.2 Variance of the Number of Induced Subgraphs
We now study the variance of the number of induced subgraphs. The following lemma shows that this variance is small compared to the square of its expectation:
Lemma 4.3
(Conditional variance for subgraphs) Let H be a subgraph such that (2.2) has a unique maximum attained at 0. Then,
Proof
By Lemma 4.2,
Thus, we need to prove that the variance is small compared to \(n^{(3-\tau )k}\). Denote \({\varvec{v}}=(v_1,\ldots ,v_k)\) and \({{\varvec{u}}}=(u_1,\ldots ,u_k)\) and, for ease of notation, we denote \(G=\mathrm{URG}^{\scriptscriptstyle {(n)}}({\varvec{d}})\). We write the variance as
This splits into various cases, depending on the overlap of \({\varvec{v}}\) and \({\varvec{u}}\). When \({\varvec{v}}\) and \({\varvec{u}}\) do not overlap,
by Lemma 3.1.
The other contributions are when \({\varvec{v}}\) and \({\varvec{u}}\) overlap. In this situation, we bound the probability that induced subgraph H is present on a specified set of vertices by 1. When \({\varvec{v}}\) and \({\varvec{u}}\) overlap on \(s\ge 1\) vertices, we bound the contribution to (4.26) as
by Assumption 1.1(i). This is \(o(n^{(3-\tau )k})\) for \(\tau \in (2,3)\), as required. \(\square \)
Proof of Lemma 4.1
We start by proving part (i). By Lemma 4.3 and Chebyshev’s inequality,
Combining this with Lemma 4.2 proves Lemma 4.1(i). Lemma 4.1(ii) follows from Lemma 5.1 in the next section, when \(|S_3^*|=k\). \(\square \)
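The step above is the standard second-moment argument: once \(\mathrm{Var}(N)=o({{\mathbb {E}}}[N]^2)\), Chebyshev's inequality forces concentration of N around its mean. A toy numeric check of the inequality itself, with a binomial count standing in for N (all parameters illustrative):

```python
import random

def chebyshev_check(n=2000, p=0.5, delta=0.05, trials=500, seed=1):
    """Empirically compare P(|X - EX| > delta*EX) with Var(X)/(delta*EX)^2
    for X ~ Binomial(n, p); Chebyshev guarantees freq <= bound."""
    rng = random.Random(seed)
    mean, var = n * p, n * p * (1 - p)
    bound = var / (delta * mean) ** 2
    hits = sum(
        abs(sum(rng.random() < p for _ in range(n)) - mean) > delta * mean
        for _ in range(trials)
    )
    return hits / trials, bound
```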
5 Major Contribution to General Subgraphs: Proof of Theorem 2.1(i)
In this section, we prove Theorem 2.1(i), which shows that asymptotically, all induced subgraphs H have vertices in \(M_n^{(\varvec{\alpha })}(\varepsilon )\). We will show that the expected number of copies of H with degrees outside of \(M_n^{(\varvec{\alpha })}(\varepsilon )\) is small. To do so, we will need the finiteness of two integrals defined in Lemmas 5.1 and 5.2.
We first introduce some further notation. As before, we denote the degree of a vertex i inside its subgraph H by \(d^{\scriptscriptstyle {(H)}}_{i}\). Furthermore, for any \(W\subseteq V_H\), we denote by \(d^{\scriptscriptstyle {(H)}}_{i,W}\) the number of edges from vertex i to vertices in W. Let H be a connected subgraph, such that the optimum of (2.2) is unique, and let \({{{\mathcal {P}}}}=(S_1^*,S_2^*,S_3^*)\) be the optimal partition. Define
We now provide two lemmas that show that two integrals related to the solution of the optimization problem (2.2) are finite. These integrals are the key ingredient in proving Theorem 2.1(i).
Lemma 5.1
(Induced subgraph integrals over \(S_3^*\)) Suppose that the maximum in (3.16) is uniquely attained by \({{{\mathcal {P}}}}=(S_1^*,S_2^*,S_3^*)\) with \(|S_3^*|=s>0\), and say that \(S_3^*=[s]\). Then
Lemma 5.2
(Induced subgraph integrals over \(S_1^*\cup S_2^*\)) Suppose the optimal solution to (3.16) is unique, and attained by \({{{\mathcal {P}}}}=(S_1^*,S_2^*,S_3^*)\). Say that \(S_2^*=[t_2]\) and \(S_1^*=[t_2+t_1]\setminus [t_2]\). Then, for every \(a>0\),
The proofs of Lemmas 5.1 and 5.2 are similar to the proofs of [10, Lemmas 7.2 and 7.3] and are therefore deferred to Appendix B.
Proof of Theorem 2.1(i)
Note that \(d_{\max }\le M n^{1/(\tau -1)}\) by Assumption 1.1(i). Define
with \(\alpha _i\) as in (2.4), and denote
We then show that the expected number of subgraphs of G isomorphic to H where the degree of at least one vertex i satisfies \(d_i\notin [\gamma _i^l(n),\gamma _i^u(n)]\) is small, similarly to the proof of Theorem 2.2 in Sect. 4.
We first study the expected number of copies of H where the first vertex has degree \(d_{v_1}\in [1,\gamma _1^l(n))\) and all other vertices satisfy \(d_{v_i}\in [\gamma _i^l(n),\gamma _i^u(n)]\), by integrating the probability that induced subgraph H is formed over this range of degrees. Using Lemma 3.1, and the fact that the degree distribution can be bounded as \({{\mathbb {P}}}\left( D=k\right) \le M_2k^{-\tau }\) for some \(M_2>0\) by Assumption 1.1(i), we bound the expected number of such copies of H by
for some \(K>0\), and where we recall that \(I(H, {\varvec{v}})=\mathbb {1}{\left\{ \mathrm{URG}^{\scriptscriptstyle {(n)}}({\varvec{d}})|_{{\varvec{v}}}=H\right\} }\). This integral equals zero when vertex 1 is in \(V_1\), since then \([1,\gamma _1^l(n))=\varnothing \). Suppose that vertex 1 is in \(S_2^*\). W.l.o.g. assume that \(S_2^*={[t_2]}\), \(S_1^*={[t_1+t_2]\setminus [t_2]}\) and \(S_3^*={[t_1+t_2+t_3]\setminus [t_1+t_2]}\). We bound \(x_ix_j/(L_n+x_ix_j)\) by
-
(a)
\(x_ix_j/L_n\) for \(i,j\in S_1^*\);
-
(b)
\(x_ix_j/L_n\) for i or j in \(V_1\);
-
(c)
\(x_ix_j/L_n\) for \(i\in S_1^*\), \(j\in S_3^*\) or vice versa; and
-
(d)
1 for \(i,j\in S_2^*\) and \(i\in S_2^*\), \(j\in S_3^*\) or vice versa.
Similarly, we bound \(L_n/(L_n+x_ix_j)\) by
-
(a)
1 for \(i,j\in S_1^*\);
-
(b)
1 for i or j in \(V_1\);
-
(c)
1 for \(i\in S_1^*\), \(j\in S_3^*\) or vice versa; and
-
(d)
\(L_n/(x_ix_j)\) for \(i,j\in S_2^*\) and \(i\in S_2^*\), \(j\in S_3^*\) or vice versa.
Combining these bounds with the change of variables \(y_i=x_i/n^{\alpha _i}\) turns (5.6), for some \({\tilde{K}}>0\), into the bound
where the integrals from 0 to M correspond to vertices in \(S_2^*\) and the integrals from 0 to \(\infty \) to vertices in \(S_1^*\) and \(S_3^*\). Since \(\tau \in (2,3)\), the integrals corresponding to vertices in \(V_1\) are finite. By the analysis from (3.17) to (3.20),
The integrals over \(y_i\) for \(i\in V_H\setminus V_1\) can be split into
By Lemma 5.1, the set of integrals on the second line of (5.9) is finite. Lemma 5.2 shows that the set of integrals on the first line of (5.9) tends to zero as \(\varepsilon _n\rightarrow 0\). Thus,
Therefore, (5.7), (5.8) and (5.10) yield
when vertex 1 is in \( S_2^*\). Similarly, we can show that the expected contribution from \(d_{v_1}<\gamma _1^l(n)\) satisfies the same bound when vertex 1 is in \(S_1^*\) or \(S_3^*\). The expected number of subgraphs where \(d_{v_1}>\gamma _1^u(n)\) if vertex 1 is in \(S_1^*\), \(S_3^*\) or \(V_1\) can be bounded similarly, as well as the expected contribution where multiple vertices have \(d_{v_i}\notin [\gamma _i^l(n),\gamma _i^u(n)]\).
Denote
and define \({\bar{\Gamma }}_n(\varepsilon _n)\) as its complement. Denote the number of subgraphs with vertices in \({\bar{\Gamma }}_n(\varepsilon _n)\) by \(N(H,{\bar{\Gamma }}_n(\varepsilon _n))\). Since \(d_{\max }\le Mn^{1/(\tau -1)}\), \(\Gamma _n(\varepsilon _n)={M}_n^{(\varvec{\alpha })}(\varepsilon _n)\). Therefore,
where \(N\big (H,{\bar{M}}_n^{(\varvec{\alpha })}\left( \varepsilon _n\right) \big )\) denotes the number of copies of H on vertices not in \(M_n^{(\varvec{\alpha })}\left( \varepsilon _n\right) \). By the Markov inequality and (5.11),
Combining this with Theorem 2.1(ii), for fixed \(\varepsilon >0\),
shows that
as required. This completes the proof of Theorem 2.1(i). \(\square \)
6 Proof of Theorem 2.4
Proof of Theorem 2.4
Algorithm 1 shows the algorithm that distinguishes uniform random graphs from rank-1 inhomogeneous random graphs with connection probabilities (2.11) or (2.10). It first selects only vertices of degrees proportional to \(n^{1/(\tau -1)}\) and \(n^{(\tau -2)/(\tau -1)}\), and then randomly selects such vertices and checks whether they form a copy of the induced subgraph H of Fig. 4. We will show that with high probability, Algorithm 1 finds a copy of H when the input graph is generated by a uniform random graph, and that with high probability, Algorithm 1 outputs ‘fail’ when the input graph is a rank-1 inhomogeneous random graph with connection probability (2.11) or (2.10).
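Algorithm 1 itself is not reproduced in this excerpt; the sketch below follows the verbal description above, with hypothetical degree windows (a factor-2 slack around the scales \(n^{1/(\tau -1)}\) and \(n^{(\tau -2)/(\tau -1)}\), where the paper uses polylogarithmic slack) and with the target subgraph H passed as a parameter rather than fixed to the H of Fig. 4:

```python
import random
from itertools import combinations

def is_induced_copy(G, verts, H_edges):
    """True iff the tuple `verts` spans an induced copy of H (vertices 0..k-1)."""
    return all((verts[j] in G[verts[i]]) == (frozenset((i, j)) in H_edges)
               for i, j in combinations(range(len(verts)), 2))

def distinguish(G, tau, H_edges, n_mid=4, n_hub=2, attempts=None, seed=0):
    """Sketch of the distinguishing procedure: sample vertex sets whose degrees
    sit near n^((tau-2)/(tau-1)) (positions 0..n_mid-1 of H) and n^(1/(tau-1))
    (the remaining positions), and search for an induced copy of H."""
    rng = random.Random(seed)
    n = len(G)
    hub_scale = n ** (1.0 / (tau - 1))
    mid_scale = n ** ((tau - 2.0) / (tau - 1))
    # hypothetical windows; the paper allows polylog slack around these scales
    hubs = [v for v in G if len(G[v]) >= hub_scale / 2]
    mids = [v for v in G if mid_scale / 2 <= len(G[v]) < hub_scale / 2]
    if len(mids) < n_mid or len(hubs) < n_hub:
        return "fail"
    for _ in range(attempts if attempts is not None else n):
        verts = tuple(rng.sample(mids, n_mid) + rng.sample(hubs, n_hub))
        if is_induced_copy(G, verts, H_edges):
            return "uniform"  # found H: consistent with a uniform random graph
    return "fail"
```

Here H is labeled so that positions \(0,\ldots ,n_{\mathrm{mid}}-1\) are the medium-degree vertices and the remaining positions are the hubs.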
We first focus on the performance of Algorithm 1 when the input G is a uniform random graph. Algorithm 1 detects copies of subgraph H where the vertices have degrees as illustrated in Fig. 3c: two vertices of degree proportional to \(n^{1/(\tau -1)}\) and four of degree proportional to \(n^{(\tau -2)/(\tau -1)}\). By Theorem 2.1(ii), there are at least \(cn^{(4-\frac{1}{\tau -1})(3-\tau )}\) such induced subgraphs for some \(c>0\) with high probability. Furthermore, denote
By Assumption 1.1,
so that there are at most \(c_2n^{4(3-\tau )}\log (n)^{6(\tau -1)}\) sets of vertices with degrees in \(M^{\scriptscriptstyle {(n)}}(\varvec{\alpha })\) that form no copy of induced subgraph H, for some \(c_2<\infty \). Thus, the probability that a randomly chosen set of vertices with degrees in \(M^{\scriptscriptstyle {(n)}}(\varvec{\alpha })\) forms H is at least
Algorithm 1 tries at most n such sets of vertices with degrees in \(M^{\scriptscriptstyle {(n)}}(\varvec{\alpha })\), and therefore makes \(\Theta (f(n))\) attempts to find subgraph H, where \(f(n)=\min (n,n^{4(3-\tau )}\log (n)^{6(\tau -1)})\). Thus, the probability that the algorithm does not find a copy of induced subgraph H among all attempts is bounded by
for some \(\gamma >0\), where we have used that \(1-x\le e^{-x}\).
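The final step is the elementary bound \(1-x\le e^{-x}\): if each attempt succeeds with probability at least q, then f(n) independent attempts all fail with probability at most \((1-q)^{f(n)}\le e^{-qf(n)}\). A quick numeric sanity check (all numbers illustrative):

```python
import math

def failure_bound(q, attempts):
    """Exact all-fail probability (1-q)^attempts vs. the bound e^{-q*attempts}."""
    exact = (1.0 - q) ** attempts
    bound = math.exp(-q * attempts)
    return exact, bound
```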
We now analyze the performance of Algorithm 1 on rank-1 inhomogeneous random graphs with connection probability (2.11). As these have the same degree distribution asymptotically, (6.2) also holds there. Furthermore, the probability that vertices in \(M^{\scriptscriptstyle {(n)}}(\varvec{\alpha })\) together form a copy of H is
for some \(\gamma _2>0\), where we bounded all p(i, j) and \(1-p(i,j)\) by 1, except for \(1-p(i,j)\) for the non-edge between the two vertices of degree at least \(n^{\frac{1}{\tau -1}}/\log (n)\) (the vertices in the left and right bottom corners of Fig. 4). Thus, there are at most \(O(n^{4(3-\tau )}e ^{-n^{\frac{3-\tau }{\tau -1}}}\log (n)^{6(\tau -1)})\) copies of induced subgraph H on sets of vertices in \(M^{\scriptscriptstyle {(n)}}(\varvec{\alpha })\). Therefore, the probability that a randomly chosen set of vertices with degrees in \(M^{\scriptscriptstyle {(n)}}(\varvec{\alpha })\) forms H is at most
Then, the probability that the algorithm does not find a copy of induced subgraph H among all n attempts is bounded by
Thus, with high probability the algorithm outputs ‘fail’ when the input graph is a rank-1 inhomogeneous random graph with connection probability (2.11). A similar calculation shows that the algorithm outputs ‘fail’ with high probability when the input graph G is a rank-1 inhomogeneous random graph with connection probabilities (2.10). \(\square \)
References
Boguñá, M., Pastor-Satorras, R.: Class of correlated random networks with hidden variables. Phys. Rev. E 68, 036112 (2003)
Bollobás, B.: A probabilistic proof of an asymptotic formula for the number of labelled regular graphs. Eur. J. Comb. 1(4), 311–316 (1980)
Britton, T., Deijfen, M., Martin-Löf, A.: Generating simple random graphs with prescribed degree distribution. J. Stat. Phys. 124(6), 1377–1397 (2006)
Chung, F., Lu, L.: The average distances in random graphs with given expected degrees. Proc. Natl. Acad. Sci. USA 99(25), 15879–15882 (2002)
Gao, P., van der Hofstad, R., Southwell, A., Stegehuis, C.: Counting triangles in power-law uniform random graphs. Electron. J. Comb. 27(3) (2020)
Gao, P., Ohapkin, Y.: Subgraph probability of random graphs with specified degrees and applications to chromatic number and connectivity
Garavaglia, A., Stegehuis, C.: Subgraphs in preferential attachment models. Adv. Appl. Probab. 51(3), 898–926 (2019)
Garmo, H.: The asymptotic distribution of long cycles in random regular graphs. Random Struct. Algorithms 15(1), 43–92 (1999)
van der Hofstad, R., Janssen, A.J.E.M., van Leeuwaarden, J.S.H., Stegehuis, C.: Local clustering in scale-free networks with hidden variables. Phys. Rev. E 95(2), 161–180 (2017)
van der Hofstad, R., van Leeuwaarden, J.S.H., Stegehuis, C.: Optimal subgraph structures in scale-free configuration models. Ann. Appl. Probab. 31(2), 501–537 (2021)
Janson, S.: The probability that a random multigraph is simple. Comb. Probab. Comput. 18(1–2), 205 (2009)
Janson, S., Łuczak, T., Norros, I.: Large cliques in a power-law random graph. J. Appl. Probab. 47(04), 1124–1135 (2010)
Karp, R.M.: Reducibility among combinatorial problems. In: Complexity of Computer Computations, pp. 85–103. Springer (1972)
Molloy, M., Reed, B.: A critical point for random graphs with a given degree sequence. Random Struct. Algorithms 6(2–3), 161–180 (1995)
Stegehuis, C., van der Hofstad, R., van Leeuwaarden, J.S.H.: Variational principle for scale-free network motifs. Sci. Rep. 9(1), 6762 (2019)
Wormald, N.C.: The asymptotic distribution of short cycles in random regular graphs. J. Comb. Theory Ser. B 31(2), 168–182 (1981)
Open Access
This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.
Communicated by Antti Knowles.
Appendices
Proof of Lemma 3.2
Proof
By defining \(\beta _i=\alpha _i-\tfrac{1}{2}\) and
we can rewrite (3.16) as
over all possible values of \(\beta _i\in [-\tfrac{1}{2},\tfrac{3-\tau }{2(\tau -1)}]\). We ignore the constant factor of \((1-\tau )\tfrac{k}{2} \) in (A.2), since it does not influence the optimal \(\beta \) values. Then, we have to prove that \(\beta _i\in \{-\tfrac{1}{2}, \tfrac{\tau -3}{2(\tau -1)},0,\tfrac{3-\tau }{2(\tau -1)}\}\) for all i in the optimal solution. Note that (A.2) is a piecewise linear function in \(\beta _1,\ldots ,\beta _k\). Therefore, if (A.2) has a unique maximum, then it must be attained at the boundary for \(\beta _i\) or at a border of one of the linear sections. Thus, any unique optimal value of \(\beta _i\) satisfies \(\beta _i=-\tfrac{1}{2}\), \(\beta _i=\tfrac{\tau -3}{2(\tau -1)}\) or \(\beta _i+\beta _j=0\) for some j.
The proof of the lemma then consists of three steps:
-
Step 1 Show that \(\beta _i=-\tfrac{1}{2}\) if and only if vertex i has degree 1 in H in any optimal solution.
-
Step 2 Show that any unique solution does not contain i with \(\left| \beta _i\right| \in (0,\tfrac{3-\tau }{2(\tau -1)})\).
-
Step 3 Show that any optimal solution that is not unique can be transformed into two different optimal solutions with \(\beta _i\in \{-\tfrac{1}{2}, \tfrac{\tau -3}{2(\tau -1)},0,\tfrac{3-\tau }{2(\tau -1)}\}\) for all i.
Step 1 Let i be a vertex of degree 1 in H, and j be the neighbor of i.
The contribution from vertex i to (A.2) is
This contribution is maximized when choosing \(\beta _i=-\tfrac{1}{2}\) as \(\tau \in (2,3)\). Thus, \(\beta _i=- \tfrac{1}{2}\) in the optimal solution if the degree of vertex i is one.
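Since (A.3) is not reproduced in this excerpt, the following numeric check of Step 1 assumes a plausible piecewise-linear form of the degree-one contribution, \(c(\beta _i)=(1-\tau )\beta _i+\min (\beta _i+\beta _j,0)\); only the sign of the slope matters, and for \(\tau \in (2,3)\) both linear pieces have negative slope, so a grid search finds the maximum at \(\beta _i=-\tfrac{1}{2}\) for any \(\beta _j\):

```python
def degree_one_argmax(tau, beta_j, grid=2001):
    """Grid-maximize the assumed degree-one contribution
    c(b) = (1-tau)*b + min(b + beta_j, 0)
    over b in [-1/2, (3-tau)/(2(tau-1))] (hypothetical form, see lead-in)."""
    lo, hi = -0.5, (3.0 - tau) / (2.0 * (tau - 1.0))
    best_b, best_val = None, float("-inf")
    for m in range(grid):
        b = lo + (hi - lo) * m / (grid - 1)
        val = (1.0 - tau) * b + min(b + beta_j, 0.0)
        if val > best_val:
            best_b, best_val = b, val
    return best_b
```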
Let i be a vertex in \(V_H\), and recall that \(d_i^{\scriptscriptstyle {(H)}}\) denotes the degree of i in H. Let i be such that \(d_i^{\scriptscriptstyle {(H)}}\ge 2\) in H, and suppose that \(\beta _i<\tfrac{\tau -3}{2(\tau -1)}\). Because the maximal value of \(\beta _j\) for \(j\ne i\) is \(\tfrac{3-\tau }{2(\tau -1)}\), this implies that \(\beta _i+\beta _j<0\) for all j. Thus, the contribution to the ith term of (A.2) is
for any \(\beta _j\), \(j\ne i\). Increasing \(\beta _i\) to \(\tfrac{\tau -3}{2(\tau -1)}\) then gives a higher contribution.
Thus, \(\beta _i\ge \tfrac{\tau -3}{2(\tau -1)}\) when \(d_i^{\scriptscriptstyle {(H)}}\ge 2\).
Step 2 Now we show that when the solution to (A.2) is unique, it is never optimal to have \(\left| \beta _i\right| \in (0,\tfrac{3-\tau }{2(\tau -1)})\). Let
Let \(N_{{\tilde{\beta }}^-}\) denote the number of vertices with their \({\tilde{\beta }}\) value equal to \(-{\tilde{\beta }}\), and \(N_{{\tilde{\beta }}^+}\) the number of vertices with value \({\tilde{\beta }}\), where \(N_{{\tilde{\beta }}^+}+N_{{\tilde{\beta }}^-}\ge 1\). Furthermore, let \(E_{{\tilde{\beta }}^-}^+\) denote the number of edges from vertices with value \(-{\tilde{\beta }}\) to other vertices j such that \(\beta _j<{\tilde{\beta }}\), and \(E_{{\tilde{\beta }}^+}^+\) the number of edges from vertices with value \({\tilde{\beta }}\) to other vertices j such that \(\beta _j<-{\tilde{\beta }}\). Similarly, let \(E_{{\tilde{\beta }}^-}^-\) denote the number of non-edges from vertices with value \(-{\tilde{\beta }}\) to other vertices j such that \(\beta _j>{\tilde{\beta }}\), and \(E_{{\tilde{\beta }}^+}^-\) the number of non-edges from vertices with value \({\tilde{\beta }}\) to other vertices j such that \(\beta _j>-{\tilde{\beta }}\). Then, the contribution from these vertices to (A.2) is
Because we assume \({\beta }\) to be optimal, and the optimum to be unique, the value inside the brackets cannot equal zero. Since the contribution is linear in \({\tilde{\beta }}\) and optimal, it follows that \({\tilde{\beta }}\in \{0,\tfrac{3-\tau }{2(\tau -1)}\}\).
This shows that \(\beta _i\in \{\tfrac{\tau -3}{2(\tau -1)},0,\tfrac{3-\tau }{2(\tau -1)}\}\) for all i such that \(d_i^{\scriptscriptstyle {(H)}}\ge 2\).
Step 3 Suppose that the solution to (A.2) is not unique. Suppose that \(\beta _*\) appears in one of the optimizers of (A.2). In the same notation as in (A.6), the contribution from vertices with \(\beta \)-values \(\beta _*\) and \(-\beta _*\) equals
Since this contribution is linear in \(\beta _*\), the contribution of these vertices can only be non-unique if the term within the square brackets equals zero. Thus, for the solution to (A.2) to be non-unique, there must exist \({\hat{\beta }}_1,\ldots ,{\hat{\beta }}_s>0\) for some \(s\ge 1\) such that
Setting all \({\hat{\beta }}_j=0\) and setting all \({\hat{\beta }}_j=\tfrac{3-\tau }{2(\tau -1)}\) are both optimal solutions. Thus, if the solution to (A.2) is not unique, at least 2 solutions exist with \(\beta _i\in \{\tfrac{\tau -3}{2(\tau -1)},0,\tfrac{3-\tau }{2(\tau -1)}\}\) for all \(i\in V_H\). \(\square \)
Proof of Lemmas 5.1 and 5.2
We first provide a lemma stating several properties of the variable \(\zeta _i\) of (5.1), which will appear often in the integrals we have to bound:
Lemma B.1
(Bounds on \(\zeta _i\)) Let H be a connected subgraph, such that the optimum of (2.2) is unique, and let \({{{{\mathcal {P}}}}=(S_1^*,S_2^*,S_3^*)}\) be the optimal partition. Then
-
(i)
\(\zeta _i+d^{\scriptscriptstyle {(H)}}_{i,S_2^*}-|S_2^*|\le 1\) for \(i\in S_1^*\);
-
(ii)
\(d^{\scriptscriptstyle {(H)}}_{i,S_3^*}+\zeta _i\ge 1\) for \(i\in S_2^*\);
-
(iii)
\(\zeta _i+d^{\scriptscriptstyle {(H)}}_{i,S_3^*}-|S_3^*|\le 0\) and \(d^{\scriptscriptstyle {(H)}}_{i,S_3^*}+\zeta _i\ge 2\) for \(i\in S_3^*\).
Proof
Suppose first that \(i\in S_1^*\). Now consider the partition \({{\hat{S}}}_1=S_1^*\setminus \{i\}\), \({{\hat{S}}}_2=S_2^*\), \({{\hat{S}}}_3=S_3^*\cup \{i\}\). Then, \(E_{{{\hat{S}}}_1}=E_{S_1^*}-d^{\scriptscriptstyle {(H)}}_{i,S_1^*}\), \(E_{{{\hat{S}}}_1,{{\hat{S}}}_3}=E_{S_1^*,S_3^*}+d^{\scriptscriptstyle {(H)}}_{i,S_1^*}-d^{\scriptscriptstyle {(H)}}_{i,S_3^*}\) and \(E_{{{\hat{S}}}_2,{{\hat{S}}}_3}=E_{S_2^*,S_3^*}+d^{\scriptscriptstyle {(H)}}_{i,S_2^*}\). Furthermore, \(E_{{{\hat{S}}}_1,V_1}=E_{S^*_1,V_1}-d^{\scriptscriptstyle {(H)}}_{i,V_1}\) and \(E_{{{\hat{S}}}_2,V_1}=E_{S_2^*,V_1}\). Because the partition into \(S_1^*,S_2^*\) and \(S_3^*\) achieves the unique optimum of (2.2),
which reduces to
Using that \(\tau \in (2,3)\) then yields \(d^{\scriptscriptstyle {(H)}}_{i,S_1^*}+d^{\scriptscriptstyle {(H)}}_{i,S_3^*}+d^{\scriptscriptstyle {(H)}}_{i,V_1}\le 1\).
Similar arguments give the other inequalities. For example, for \(i\in S_3^*\), considering the partition where i is moved to \(S_1^*\) gives the inequality \(d^{\scriptscriptstyle {(H)}}_{i,S_3^*}+d^{\scriptscriptstyle {(H)}}_{i,S_1^*}+d^{\scriptscriptstyle {(H)}}_{i,V_1}\ge 2\), and considering the partition where i is moved to \(S_2^*\) results in the inequality \(d^{\scriptscriptstyle {(H)}}_{i,S_1^*}+d^{\scriptscriptstyle {(H)}}_{i,V_1}\le 1\), so that \(\zeta _i\le 1\). \(\square \)
Proof of Lemma 5.1
Recall that \(S_3^*=[s]\). First of all,
We compute the contribution to (B.3) where the integral runs from 1 to \(\infty \) for vertices in some nonempty set U, and from 0 to 1 for vertices in \({\bar{U}}=S_3^*\setminus {U}\). W.l.o.g., assume that \(U={[t]}\) for some \(1\le t< s\) and that \(x_1<x_2<\cdots <x_t\). Define, for \(i\in {\bar{U}}\),
Then (5.2) can be bounded by
We can write \({\tilde{h}}(i,{\varvec{x}})\) as
By Lemma B.1, \(\zeta _i+d^{\scriptscriptstyle {(H)}}_{i,S_3^*}\ge 2\) for \(i\in S_3^*\) so that the first integral is finite. Computing these integrals yields
for some constants \(C_0,\ldots ,C_{t}\). Assume that i is connected to l vertices in U, so that there are l vertices in \(\{1,2,\ldots ,t\}\) such that \(\mathbb {1}{\left\{ \{i,t\}\in {{\mathcal {E}}}_{S_3^*}\right\} }=1\) and \(t-l\) such that \(\mathbb {1}{\left\{ \{i,t\}\notin {{\mathcal {E}}}_{S_3^*}\right\} }=1\). Then,
which is larger than 1 for \(p< \tau -\zeta _i-d^{\scriptscriptstyle {(H)}}_{i,S_3^*}+t-2\) as \(x_{p+1}>x_p\), and at most 1 for \(p\ge \tau -\zeta _i-d^{\scriptscriptstyle {(H)}}_{i,S_3^*}+t-2\). Thus, \(p^*=p^*_i=\mathrm{argmax}_{p} h_p(i,{\varvec{x}})=\lfloor \tau -\zeta _i-d^{\scriptscriptstyle {(H)}}_{i,S_3^*}+t-1\rfloor \). Therefore, there exists a \(K>0\) such that
For all \(j\in U\), let
denote the set of neighbors \(i\in {\bar{U}}\) of \(j\in U\) such that \(x_j\) appears in \(h_{p^*_i}(i,{\varvec{x}})\) with exponent \(+1\) (note that \(i<j\) for all \(i\in {\bar{U}}\), \(j\in U\)). Similarly, let
be the set of non-neighbors \(i\in {\bar{U}}\) of j such that \(x_j\) appears in \(h_{p^*_i}(i,{\varvec{x}})\) with exponent \(-1\). Furthermore, let \({W_j=\{i\in {\bar{U}}:{p^*_i}=j\}}\). Thus, the vertices in \(W_j\) appear with exponent \(\tau -\zeta _i-d^{\scriptscriptstyle {(H)}}_{i,S_3^*}+t-1-j\) in \(h_{p^*_i}(i,{\varvec{x}})\). Then, by the definition of \(\zeta _i\) in (5.1)
This yields
for some \({\tilde{K}}>0\), where \({{\hat{W}}}_j=V_H\setminus W_j\).
We will now use the uniqueness of the solution of the optimization problem in (3.16) to prove that the integral over \(x_t\) in (B.11) is finite. First of all,
so that \(|Q_t^+|=d^{\scriptscriptstyle {(H)}}_{t,W_t}\), whereas
We will now prove that the exponent of \(x_t\) in (B.11)
or
Define \({{\hat{S}}}_2=S_2^*\cup \{ t \}\), \({{\hat{S}}}_1=S_1^*\cup {W}_t\) and \({{\hat{S}}}_3=S_3^*\setminus (W_t\cup \{t\})\). This gives
Because (2.2) is uniquely optimized by \(S_1^*\), \(S_2^*\) and \(S_3^*\), we obtain
Plugging in (B.15)–(B.19) and using that \(k-k_1-|S_1^*|=|S_2^*|+|S_3^*|\) yields
Multiplying by \(\tau -1\) then gives
Using that \(|W_t|\le |S_3^*|-t\) then yields
so that (B.14) also holds.
Thus, the integral in (B.11) over \(x_t\) results in a power of \(x_{t-1}\). We can then use a similar technique to show that the power of \(x_{t-1}\) is also smaller than one, and iterate to finally show that the integral in (B.11) is finite, so that (5.2) is also finite. \(\square \)
Proof of Lemma 5.2
This lemma can be proven along the same lines as Lemma 5.1. In particular, it follows the same strategy and computations as [10, Lemma 7.3], where the factors \(x_ix_j/(x_ix_j+1)\) and \(1/(x_ix_j+1)\) are bounded in terms of \(\min (x_ix_j,1)\) and \(\min (1/(x_ix_j),1)\), as in the proof of Lemma 5.1.\(\square \)
Proof of Theorem 2.3
In the rank-1 inhomogeneous random graph with weight sequence \({\varvec{w}}\), \(\mathrm{IRG}^{\scriptscriptstyle {(n)}}({\varvec{w}})\), by definition of the connection probabilities,
This is the analogue of Lemma 3.1, now conditioned on weights instead of degrees. As the weight sequence satisfies Assumption 1.1, we can follow the computations in Sects. 3.2 and 4 with degrees replaced by weights. This proves Theorem 2.3.
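The rank-1 models compared above can be simulated directly from their weights. A sketch under the assumption that the connection probability takes the common form \(p(i,j)=w_iw_j/(L_n+w_iw_j)\) with \(L_n=\sum _kw_k\) (the exact forms (2.10)–(2.11) are not reproduced in this excerpt):

```python
import random

def rank1_irg(weights, seed=0):
    """Sample a rank-1 inhomogeneous random graph: each pair {i, j} is an edge
    independently with probability w_i*w_j / (L_n + w_i*w_j) (assumed form)."""
    rng = random.Random(seed)
    n = len(weights)
    L = sum(weights)
    G = {i: set() for i in range(n)}
    for i in range(n):
        for j in range(i + 1, n):
            p = weights[i] * weights[j] / (L + weights[i] * weights[j])
            if rng.random() < p:
                G[i].add(j)
                G[j].add(i)
    return G
```

With this choice \({{\mathbb {E}}}[\deg (i)]\approx w_i\) for moderate weights, which is why the degrees concentrate around the weights and the computations of Sects. 3.2 and 4 carry over.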
Cite this article
Stegehuis, C. Distinguishing Power-Law Uniform Random Graphs from Inhomogeneous Random Graphs Through Small Subgraphs. J Stat Phys 186, 37 (2022). https://doi.org/10.1007/s10955-022-02884-9