Solving Vertex Cover in Polynomial Time on Hyperbolic Random Graphs

The computational complexity of the VertexCover problem has been studied extensively. Most notably, it is NP-complete to find an optimal solution and typically NP-hard to find an approximation with reasonable factors. In contrast, recent experiments suggest that on many real-world networks the run time to solve VertexCover is much smaller than even the best known FPT-approaches can explain. We link these observations to two properties that are observed in many real-world networks, namely a heterogeneous degree distribution and high clustering. To formalize these properties and explain the observed behavior, we analyze how a branch-and-reduce algorithm performs on hyperbolic random graphs, which have become increasingly popular for modeling real-world networks. In fact, we are able to show that the VertexCover problem on hyperbolic random graphs can be solved in polynomial time, with high probability. The proof relies on interesting structural properties of hyperbolic random graphs. Since these predictions of the model are interesting in their own right, we conducted experiments on real-world networks showing that these properties are also observed in practice.

for r ∈ [0, R]. For r > R, f(r) = 0. The constant α ∈ (1/2, 1) is used to tune the power-law exponent β = 2α + 1 of the degree distribution of the generated network. Note that we obtain power-law exponents β ∈ (2, 3). Exponents outside of this range are atypical for hyperbolic random graphs. On the one hand, for β < 2 the average degree of the generated networks is divergent. On the other hand, for β > 3 hyperbolic random graphs degenerate: They decompose into smaller components, none having a size linear in n. The obtained graphs have logarithmic treewidth [4], meaning the VertexCover problem can be solved efficiently in that case.
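To make the role of α and R concrete, the following minimal sketch samples a threshold hyperbolic random graph by quadratic-time distance checks. It rests on assumptions that are not reproduced in this excerpt: the radial density is taken to be the standard α sinh(αr) / (cosh(αR) − 1), the disk radius is taken to be R = 2 log n + C for a constant C, and two vertices are joined when their hyperbolic distance is at most R. The names `sample_hrg` and `hyperbolic_distance` are hypothetical.

```python
import math
import random


def hyperbolic_distance(p, q):
    """Hyperbolic distance between two points given in polar coordinates (r, phi)."""
    (r1, phi1), (r2, phi2) = p, q
    delta = math.pi - abs(math.pi - abs(phi1 - phi2))  # angular distance in [0, pi]
    x = math.cosh(r1) * math.cosh(r2) - math.sinh(r1) * math.sinh(r2) * math.cos(delta)
    return math.acosh(max(1.0, x))  # clamp guards against rounding slightly below 1


def sample_hrg(n, alpha, C, seed=0):
    """Sample coordinates and edges of a threshold hyperbolic random graph.

    Assumptions (not taken from this excerpt): R = 2 log n + C and radial
    density alpha * sinh(alpha * r) / (cosh(alpha * R) - 1) on [0, R].
    """
    rng = random.Random(seed)
    R = 2.0 * math.log(n) + C
    points = []
    for _ in range(n):
        u = rng.random()
        # Inverse of the CDF F(r) = (cosh(alpha r) - 1) / (cosh(alpha R) - 1).
        r = math.acosh(1.0 + u * (math.cosh(alpha * R) - 1.0)) / alpha
        phi = rng.uniform(0.0, 2.0 * math.pi)
        points.append((r, phi))
    # Quadratic-time edge test: connect vertices at hyperbolic distance <= R.
    edges = [(i, j)
             for i in range(n)
             for j in range(i + 1, n)
             if hyperbolic_distance(points[i], points[j]) <= R]
    return R, points, edges
```

With α close to 1/2 the sampled radii spread further toward the center, which corresponds to a power-law exponent β = 2α + 1 close to 2; with α close to 1 most vertices sit near the boundary.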

The probability for a given vertex to lie in a certain area A of the disk is given by its probability measure $\mu(A) = \int_A f(r)\,\mathrm{d}r$. The hyperbolic distance between two vertices u and v increases with increasing angular distance between them. The maximum angular distance such that they are still connected by an edge is bounded by [14, Lemma 6]

\[
\theta(r(u), r(v)) = \arccos\left(\frac{\cosh(r(u))\cosh(r(v)) - \cosh(R)}{\sinh(r(u))\sinh(r(v))}\right) = 2e^{(R - r(u) - r(v))/2}\left(1 + \Theta\left(e^{R - r(u) - r(v)}\right)\right). \tag{2}
\]

A path decomposition of a graph is defined analogously to the tree decomposition, with the constraint that the tree has to be a path. Additionally, as for the treewidth, the pathwidth of a graph G, denoted by pw(G), is the minimum width over all path decompositions of G.
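The two forms of the angular threshold in Equation (2) can be compared numerically. The sketch below evaluates the exact arccos expression and the leading term of the approximation; the function names are hypothetical, and the early return for r(u) + r(v) ≤ R reflects the observation that such vertices are within distance R at every angle by the triangle inequality.

```python
import math


def theta_exact(ru, rv, R):
    """Exact maximum angular distance at which radii ru, rv (both <= R) are adjacent."""
    if ru + rv <= R:
        return math.pi  # adjacent regardless of the angle between them
    x = (math.cosh(ru) * math.cosh(rv) - math.cosh(R)) / (math.sinh(ru) * math.sinh(rv))
    return math.acos(max(-1.0, min(1.0, x)))  # clamp against rounding errors


def theta_approx(ru, rv, R):
    """Leading term 2 * exp((R - ru - rv) / 2) of Equation (2)."""
    return 2.0 * math.exp((R - ru - rv) / 2.0)
```

For radii whose sum clearly exceeds R, the relative gap between the two values shrinks at the rate suggested by the error term in Equation (2).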

Clearly the pathwidth is an upper bound on the treewidth. It is known that for any graph G

Vertex Cover on Hyperbolic Random Graphs
Reduction rules are often applied as a preprocessing step, before using a brute force search or branching in a search tree. They simplify the input by removing parts that are easy to solve. For example, an isolated vertex does not cover any edges and can thus never be part of a minimum vertex cover. Consequently, in a preprocessing step all isolated vertices can be removed, which leads to a reduced input size without impeding the search for a minimum.
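As an illustration of such preprocessing, the following minimal sketch exhaustively applies two rules to a graph given as an adjacency dictionary: removal of isolated vertices, and the dominance rule studied in this section. The formulation of dominance used here is an assumption, since the formal definition is not part of this excerpt: a vertex u is taken to dominate a neighbor v if every other neighbor of v is also a neighbor of u, in which case u can be placed in the cover. The function name `reduce_graph` is hypothetical.

```python
def reduce_graph(adj):
    """Apply isolated-vertex removal and the dominance rule until neither applies.

    adj maps each vertex to the set of its neighbors.
    Returns (cover, remaining), where cover holds the dominant vertices that
    were taken into the vertex cover and remaining is the reduced graph.
    """
    adj = {v: set(ns) for v, ns in adj.items()}  # work on a copy
    cover = set()
    changed = True
    while changed:
        changed = False
        # Isolated vertices cover no edge and can be dropped.
        for v in [v for v, ns in adj.items() if not ns]:
            del adj[v]
        for u in list(adj):
            # Assumed dominance test: u dominates a neighbor v
            # if N(v) without u is contained in N(u).
            if any((adj[v] - {u}) <= adj[u] for v in adj[u]):
                for w in adj.pop(u):
                    adj[w].discard(u)
                cover.add(u)
                changed = True
    return cover, adj
```

On a path with three vertices the middle vertex dominates both endpoints, so the whole instance is solved by the reduction alone. On a cycle with four vertices no vertex dominates a neighbor and the graph is returned unchanged.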

The dominance reduction rule was previously defined for the IndependentSet prob-

In the remainder of this section, we study the effectiveness of the dominance reduction rule on hyperbolic random graphs and conclude that VertexCover can be solved efficiently on these graphs. Our results are summarized in the following main theorem.

Recall that a hyperbolic random graph is obtained by distributing n vertices in a hyperbolic

The result is illustrated in Figure 1 right. We note that it is sufficient for a vertex v to lie in D(u) in order to be dominated by u; however, it is not necessary.

Given the radius r(u) of vertex u we can now compute the probability that u dominates another vertex, i.e., the probability that at least one vertex lies in D(u), by determining the measure µ(D(u)). To this end, we first define δ(r(u), r(v)) to be the maximum angular distance between two nodes u and v such that v lies in D(u).

Recall that θ(r(p), r(q)) denotes the maximum angular distance such that dist(p, q) ≤ R, as defined in Equation (2). Since $i_u$ and $i_v$ have radius R and hyperbolic distance R to u and v, respectively, we know that their angular coordinates are θ(r(u), R) and θ(r(v), R), respectively. Consequently, the angular distance between $i_u$ and $i_v$ is given by

Using Lemma 3 we can now compute the probability for a given vertex to lie in the dominance area of u. We note that this probability grows roughly like $\frac{2}{\pi} e^{-r(u)/2}$, which is a constant fraction of the measure of the neighborhood disk of u which grows as

Proof. The probability for a given vertex v to lie in D(u) is obtained by integrating the probability density (given by Equation (1)) over D(u). The remaining integrals can be computed easily and we obtain

The following lemma shows that, with high probability, all vertices that are not too close to the boundary of the disk dominate at least one vertex.

The probability of at least one node falling into D(u) is now given by

Consequently, for large enough n we can choose c > 4/κ such that the probability of a vertex at radius ρ being dominant is at least $1 - \Theta(n^{-2})$, allowing us to apply a union bound.
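The interplay between the measure of the dominance area and the union bound can be checked numerically. The sketch below assumes only that the other n − 1 vertices are placed independently, so that each of them misses D(u) with probability 1 − µ(D(u)); the function names and the sample value of the measure are hypothetical and are not the bounds derived in the proof.

```python
import math


def prob_dominates_some_vertex(mu, n):
    """Probability that at least one of the other n - 1 vertices lies in D(u),
    where mu is the measure of D(u) and vertices are placed independently."""
    return 1.0 - (1.0 - mu) ** (n - 1)


def union_bound_failure(mu, n):
    """Union bound on the probability that some vertex among at most n vertices,
    each with dominance-area measure at least mu, dominates no vertex."""
    return min(1.0, n * (1.0 - mu) ** (n - 1))
```

For example, a measure of 3 log(n) / n makes each individual failure probability roughly $n^{-3}$, so the union bound over all n vertices still leaves a failure probability of order $n^{-2}$.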

In the following, we use $G_r = (V_r, E_r)$ to denote the induced subgraph of G that contains all

is at most R. It follows that their angular distance $\Delta_\varphi(u, v)$ is bounded by θ(r(u), r(v)).

It is easy to see that, after removing a vertex from G and $G^S$, $G^S$ is still a supergraph of G. Consequently, $G^S_\rho$ is a supergraph of $G_\rho$. It remains to show that $G^S_\rho$ has a small maximum clique number, which is given by the maximum number of arcs that intersect at any angle. To this end, we first compute the number of arcs that intersect a given angle which we set to 0 without loss of generality. Let $A_r$ denote the area of the disk containing all vertices v with radius r(v) ≥ r whose interval $I_v$ intersects 0, as illustrated in Figure 3 right.
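The quantity bounded in the following, the maximum number of arcs that intersect at any angle, can be computed for a concrete set of arcs with a simple sweep around the circle. The sketch below is only an illustration of that quantity: it represents each arc by a hypothetical pair (center, half_width) of angles, treats arcs as closed, and is not part of the analysis.

```python
import math

TWO_PI = 2.0 * math.pi


def max_arc_overlap(arcs):
    """Maximum number of closed circular arcs that contain a common angle.

    arcs is a list of (center, half_width) pairs; an arc covers all angles
    within half_width of its center, taken modulo 2 * pi.
    """
    current = 0  # number of arcs covering angle 0
    events = []
    for center, half in arcs:
        if half >= math.pi:
            current += 1  # arc covers the whole circle
            continue
        start = (center - half) % TWO_PI
        end = (center + half) % TWO_PI
        if start > end:
            current += 1  # arc wraps around and therefore covers angle 0
        # Starts (kind 0) sort before ends (kind 1) at equal angles,
        # so arcs that merely touch are counted as intersecting.
        events.append((start, 0))
        events.append((end, 1))
    best = current
    for _, kind in sorted(events):
        current += 1 if kind == 0 else -1
        best = max(best, current)
    return best
```

Sorting dominates the running time, so the sweep takes O(m log m) time for m arcs.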

The following lemma describes the probability for a given vertex to lie in $A_r$.

Figure 3 Left: The circular arcs representing the neighborhood of a vertex. For vertex v the area containing the whole neighborhood of v, as well as the circular arc $I_v$ are drawn in the same color. Right: The area that contains the vertices whose arcs intersect angle 0. Area $A_r$ contains all such vertices with radius at least r. Vertex v lies on the boundary of $A_r$ and its interval $I_v$ extends to 0.

Lemma 8. Let G be a hyperbolic random graph and let r ≥ R/2. The probability for a given vertex to lie in $A_r$ is bounded by

Proof. We obtain the measure of $A_r$ by integrating the probability density function over $A_r$. Following the definition of $I_v$ for a vertex v, we can conclude that $A_r$ includes all vertices v with radius r(v) ≥ r whose angular distance to 0 is at most θ(r(v), r(v)). We obtain

We split the sum in the integral and deal with the two resulting integrals separately.

Simplifying the remaining error terms then yields the claim.
We can now bound the maximum clique number in $G^S_\rho$ and thus its interval width $\mathrm{iw}(G^S_\rho)$.

Theorem 9. Let G be a hyperbolic random graph and r ≥ R/2. Then there exists a constant c such that, whp., $\mathrm{iw}(G^S_r) = O(\log(n))$ if $r \ge R - \frac{1}{1-\alpha} \log\log(n^c)$, and otherwise

Proof. We start by determining the expected number of arcs that intersect at a given angle, which can be done by computing the expected number of vertices in $A_r$, using Lemma 8:

It remains to show that this bound holds with high probability at every angle. To this end, we make use of a Chernoff bound (Theorem 1), by first showing that the bound on the expected number of vertices in $A_r$ is Ω(log(n)). We start with the case where $r < R - \frac{1}{1-\alpha} \log\log(n^c)$.

Thus, for all radii smaller than $R - \frac{1}{1-\alpha} \log\log(n^c)$, the resulting upper bound is lower bounded by Ω(log(n)), which lets us apply Theorem 1. Moreover, as $\mathbb{E}[|\{v \in A_r\}|]$ decreases with increasing r, O(log(n)) is a pessimistic but valid upper bound for the case $r \ge R - \frac{1}{1-\alpha} \log\log(n^c)$. Thus, we can also apply Theorem 1 to this case, when using the pessimistic O(log(n)) bound.

By Theorem 1, we can choose c such that in both cases the bound holds with probability

First assume that $d \ge \log(n)^{1/(2-2\alpha)}$. We handle the other case later. Since d ∈ Ω(log(n)) we can choose ε large enough to apply Theorem 1 and conclude that this holds with high probability. Furthermore, since a smaller radius implies a larger degree, we know that, with high probability, all nodes v with radius at most r have

For large enough n we can choose ε such that, with high probability, $G_r$ is a supergraph of $G_{\le d}$.

To prove the claim, it remains to bound the pathwidth of $G_r$. If $r > R - \frac{1}{1-\alpha} \log\log(n^c)$, we can apply the first part of Theorem 9 to obtain $\mathrm{iw}(G^S_r) = O(\log(n))$. Otherwise, we use part two to conclude that the interval width of $G_r$ is at most

As argued in Section 2 the interval width of a graph is an upper bound on the pathwidth.
Although this is not a formal argument, it still explains to a degree why greedy works so well on networks with a heterogeneous degree distribution and high clustering. Moreover, it indicates how the greedy algorithm should be adapted to obtain even better approximation ratios: As the probability to make a mistake grows with growing radius and thus with shrinking vertex degree, the majority of mistakes are made when all vertices already have low degree. However, for hyperbolic random graphs, the subgraphs induced by vertices below a certain constant degree decompose into small components for n → ∞. It thus seems to be a good idea to run the greedy algorithm only until all remaining vertices have low degree,

among those with high degree (over α/(α − 1/2) · log n) is rounded to whole percentages.
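The adaptation of greedy suggested above can be sketched as a two-phase procedure. The sketch below is an illustration under assumptions rather than the implementation used in the experiments: the parameter k is read as the degree threshold at which the greedy phase stops, and the remaining small components are solved by exhaustive search, which stands in for whichever exact method is actually used. All function names are hypothetical.

```python
from itertools import combinations


def _min_cover_bruteforce(adj, comp):
    """Smallest vertex cover of the component comp, found by exhaustive search."""
    vertices = list(comp)
    edges = [(u, v) for u in vertices for v in adj[u] if vertices.index(u) < vertices.index(v)]
    for size in range(len(vertices) + 1):
        for subset in combinations(vertices, size):
            chosen = set(subset)
            if all(u in chosen or v in chosen for u, v in edges):
                return chosen
    return set(vertices)


def adaptive_greedy_vertex_cover(adj, k):
    """Greedy on maximum degree until all degrees are at most k, then exact search."""
    adj = {v: set(ns) for v, ns in adj.items()}  # work on a copy
    cover = set()
    # Greedy phase: repeatedly take a vertex of maximum degree.
    while adj:
        u = max(adj, key=lambda v: len(adj[v]))
        if len(adj[u]) <= k:
            break
        for w in adj.pop(u):
            adj[w].discard(u)
        cover.add(u)
    # Exact phase: solve every remaining connected component separately.
    seen = set()
    for start in adj:
        if start in seen:
            continue
        comp, stack = {start}, [start]
        while stack:
            for w in adj[stack.pop()]:
                if w not in comp:
                    comp.add(w)
                    stack.append(w)
        seen |= comp
        cover |= _min_cover_bruteforce(adj, comp)
    return cover
```

The exhaustive search is exponential in the component size, so the sketch is only practical when the greedy phase leaves small components, which is the situation the argument above describes for hyperbolic random graphs.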

The approximation ratios are rounded to three decimal digits. Treewidth −1 indicates that the remaining graph after removing all dominant vertices contained no edge.

Table 1 The resulting raw data of our experiments. The columns are: (network) the network name; (easy) whether or not the network is easy; (dom) the percentage of dominant nodes among those of degree above the threshold α/(α − 1/2) · log n; (tw) an upper bound for the treewidth of the remaining graph after deleting dominant nodes; (greedy) the approximation ratio of greedy; (2-ad) of 2-adaptive greedy; (4-ad) of 4-adaptive greedy; (comp) the size of the largest component that remains after the greedy phase of 4-adaptive greedy.