Efficiently Approximating Vertex Cover on Scale-Free Networks with Underlying Hyperbolic Geometry

Finding a minimum vertex cover in a network is a fundamental NP-complete graph problem. One way to deal with its computational hardness is to trade the qualitative performance of an algorithm (allowing non-optimal outputs) for an improved running time. For the vertex cover problem, there is a gap between theory and practice when it comes to understanding this trade-off. On the one hand, it is known that it is NP-hard to approximate a minimum vertex cover within a factor of √2. On the other hand, a simple greedy algorithm yields close to optimal approximations in practice. A promising approach towards understanding this discrepancy is to recognize the differences between theoretical worst-case instances and real-world networks. Following this direction, we narrow the gap between theory and practice by providing an algorithm that efficiently computes nearly optimal vertex cover approximations on hyperbolic random graphs; a network model that closely resembles real-world networks in terms of degree distribution, clustering, and the small-world property. More precisely, our algorithm computes a (1 + o(1))-approximation, asymptotically almost surely, and has a running time of O(m log(n)).
The proposed algorithm is an adaptation of the successful greedy approach, enhanced with a procedure that improves on parts of the graph where greedy is not optimal. This makes it possible to introduce a parameter that can be used to tune the trade-off between approximation performance and running time. Our empirical evaluation on real-world networks shows that this allows for improving over the near-optimal results of the greedy approach.


Introduction
A vertex cover of a graph is a subset of the vertices that leaves the graph edgeless upon deletion. Since the problem of finding a smallest vertex cover is NP-complete [23], there are probably no algorithms that solve it efficiently. Nevertheless, the problem is relevant due to its applications in computational biology [1], scheduling [14], and internet security [15]. Therefore, there is an ongoing effort in exploring methods that can be used in practice [2, 3], and while they often work well, they still cannot guarantee efficient running times.
A commonly used approach to overcoming this issue is the use of approximation algorithms. There, the idea is to settle for a near-optimal solution while guaranteeing an efficient running time. For the vertex cover problem, a simple greedy approach computes an approximation in quasi-linear time by iteratively adding the vertex with the largest degree to the cover and removing it from the graph. In general graphs, this algorithm, which we call standard greedy, cannot guarantee a better approximation ratio than log(n), i.e., there are graphs where it produces a vertex cover whose size exceeds that of an optimum by a factor of log(n) [21]. This can be improved to a 2-approximation using a simple linear-time algorithm. The best known polynomial-time approximation reduces the factor to 2 − Θ(log(n)^{−1/2}) [22]. However, assuming the unique games conjecture, it is NP-hard to approximate an optimal vertex cover within a factor of 2 − ε for all ε > 0 [25], and it is proven that finding a √2-approximation is NP-hard [37].
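For reference, the standard greedy strategy just described can be sketched as follows (a minimal Python illustration; the adjacency-dict representation and function name are our own choices, not the paper's):

```python
def greedy_vertex_cover(adj):
    """Standard greedy: repeatedly add a vertex of maximum remaining degree
    to the cover and remove its incident edges, until no edges remain."""
    adj = {v: set(ns) for v, ns in adj.items()}  # local mutable copy
    cover = set()
    while True:
        v = max(adj, key=lambda u: len(adj[u]))  # highest remaining degree
        if not adj[v]:                           # no edges left anywhere
            return cover
        cover.add(v)
        for u in adj[v]:
            adj[u].discard(v)                    # remove incident edges
        adj[v].clear()
```

On a star, for instance, the center is picked first and the algorithm terminates with an optimal cover of size one; the log(n) worst case arises on specially constructed bipartite instances.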
Therefore, it is rather surprising that the standard greedy algorithm not only beats the 2-approximation on autonomous systems graphs like the internet [34], it also performs well on many real-world networks, obtaining approximation ratios that are very close to 1 [12]. This leaves a gap between the theoretical worst-case bounds and what is observed in practice. One approach to explaining this discrepancy is to consider the differences between the examined instances. Theoretical bounds are often obtained by designing worst-case instances. However, real-world networks rarely resemble the worst case. More realistic statements can be obtained by making assumptions about the solution space [5, 10], or by restricting the analysis to networks with properties that are observed in the real world.
Many real networks, like social networks, communication networks, or protein-interaction networks, are considered to be scale-free [4, 33, 36]. Such graphs feature a power-law degree distribution (only few vertices have high degree, while many vertices have low degree), high clustering (two vertices are likely to be adjacent if they have a common neighbor), and a small diameter.
Previous efforts to obtain more realistic insights into the approximability of the vertex cover problem have focused on networks that feature only one of these properties, namely a power-law degree distribution [11, 19, 39]. With this approach, guarantees for the approximation factor of the standard greedy algorithm were improved to a constant, compared to log(n) on general graphs [11]. Moreover, it was shown that it is possible to compute an expected (2 − ε)-approximation for a constant ε in polynomial time on such networks [19], and this was later improved to about 1.7, depending on properties of the distribution [39]. However, it was also shown that even on graphs that have a power-law degree distribution, the vertex cover problem remains NP-hard to approximate within some constant factor [11]. This indicates that focusing on networks that only feature a power-law degree distribution is not sufficient to explain why vertex cover can be approximated so well in practice.
The goal of this paper is to narrow this gap between theory and practice by considering a random graph model that features all three of the mentioned properties of scale-free networks. The hyperbolic random graph model was introduced by Krioukov et al. [27] and it was shown that the graphs generated by the model have a power-law degree distribution and high clustering [20, 16], as well as a small diameter [32]. Consequently, they are good representations of many real-world networks [9, 18, 38]. Additionally, the model is conceptually simple, making it accessible to mathematical analysis. With it we have previously derived a theoretical explanation for why the bidirectional breadth-first search works well in practice [7]. Moreover, we have shown that the vertex cover problem can be solved exactly in polynomial time on hyperbolic random graphs, with high probability [6]. However, we note that the degree of the polynomial is unknown, and on large networks even quadratic algorithms are not efficient enough to obtain results in reasonable time.
In this paper, we link the success of the standard greedy approach to structural properties of hyperbolic random graphs, identify the parts of the graph where it does not behave optimally, and use these insights to derive a new approximation algorithm. On hyperbolic random graphs, this algorithm achieves an approximation ratio of 1 + o(1), asymptotically almost surely (i.e., with probability 1 − o(1)), and maintains an efficient running time of O(m log(n)), where n and m denote the number of vertices and edges in the graph, respectively. Since the average degree of hyperbolic random graphs is constant, with high probability [24], this implies a quasi-linear running time on such networks. Moreover, we introduce a parameter that can be used to tune the trade-off between approximation quality and running time of the algorithm, facilitating an improvement over the standard greedy approach. While our algorithm depends on the coordinates of the vertices in the hyperbolic plane, we propose an adaptation of it that is oblivious to the underlying geometry (relying only on the adjacency information of the graph) and compare its approximation performance to the standard greedy algorithm on a selection of real-world networks. On average, our algorithm reduces the error of the standard greedy approach to less than 50%.
The paper is structured as follows. We first give an overview of our notation and preliminaries in Section 2 and derive a new approximation algorithm based on prior insights about vertex cover on hyperbolic random graphs in Section 3. Afterwards, we analyze its approximation ratio in Section 4 and evaluate its performance empirically in Section 5.

Preliminaries
Let G = (V, E) be an undirected and connected graph. We denote the number of vertices and edges in G with n and m, respectively. The number of vertices in a set S ⊆ V is denoted by |S|. The neighborhood of a vertex v is defined as N(v) = {u ∈ V | {u, v} ∈ E}.

The Hyperbolic Plane. After choosing a designated origin O in the two-dimensional hyperbolic plane, together with a reference ray starting at O, a point p is uniquely identified by its radius r(p), denoting the hyperbolic distance to O, and its angle (or angular coordinate) φ(p), denoting the angular distance between the reference ray and the line through p and O. The hyperbolic distance between two points p and q is given by

dist(p, q) = acosh(cosh(r(p)) cosh(r(q)) − sinh(r(p)) sinh(r(q)) cos(∆φ(p, q))),

where cosh(x) = (e^x + e^{−x})/2, sinh(x) = (e^x − e^{−x})/2, and ∆φ(p, q) = π − |π − |φ(p) − φ(q)|| denotes the angular distance between p and q. If not stated otherwise, we assume that computations on angles are performed modulo 2π.
In the hyperbolic plane a disk of radius r has an area of 2π(cosh(r) − 1).That is, the area grows exponentially with the radius.In contrast, this growth is polynomial in Euclidean space.
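The distance formula above translates directly into code. The following is a minimal sketch (the clamping of the acosh argument is our own guard against floating-point round-off, not part of the formula):

```python
import math

def hyperbolic_distance(r1, phi1, r2, phi2):
    """Distance between points (r1, phi1) and (r2, phi2), given in polar
    coordinates around the origin of the hyperbolic plane."""
    # angular distance, computed modulo 2*pi
    dphi = math.pi - abs(math.pi - abs(phi1 - phi2) % (2 * math.pi))
    x = (math.cosh(r1) * math.cosh(r2)
         - math.sinh(r1) * math.sinh(r2) * math.cos(dphi))
    return math.acosh(max(x, 1.0))  # clamp: x can dip below 1 numerically
```

For two points on the same ray through the origin, the formula collapses to the difference of the radii, which is a convenient sanity check.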
Hyperbolic Random Graphs. Hyperbolic random graphs are obtained by distributing n points independently and uniformly at random within a disk of radius R and connecting any two of them if and only if their hyperbolic distance is at most R. See Figure 1 for an example. The disk radius R (which matches the connection threshold) is given by R = 2 log(n) + C, where the constant C ∈ R depends on the average degree of the network, as well as the power-law exponent β = 2α + 1, with α ∈ (1/2, 1), which are also assumed to be constants. The coordinates of the vertices are drawn as follows. For vertex v the angular coordinate, denoted by φ(v), is drawn uniformly at random from [0, 2π) and the radius of v, denoted by r(v), is sampled according to the probability density function f(r) = α sinh(αr)/(cosh(αR) − 1) for r ∈ [0, R]. Consequently,

f(r, φ) = α sinh(αr) / (2π (cosh(αR) − 1))   (1)

is their joint distribution function.
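The sampling process can be sketched as follows (our own illustration, not the paper's generator: the radii are drawn by inverse transform sampling of F(r) = (cosh(αr) − 1)/(cosh(αR) − 1), which follows from the stated density, and edges are found by quadratic all-pairs distance checks rather than an efficient geometric data structure):

```python
import math
import random

def sample_hrg(n, alpha=0.75, C=0.0, seed=None):
    """Sample a hyperbolic random graph: n points in a disk of radius
    R = 2 log(n) + C; edges between points at hyperbolic distance <= R."""
    rng = random.Random(seed)
    R = 2 * math.log(n) + C
    points = []
    for _ in range(n):
        phi = rng.uniform(0.0, 2 * math.pi)  # uniform angle
        # invert F(r) = (cosh(alpha*r) - 1) / (cosh(alpha*R) - 1)
        u = rng.random()
        r = math.acosh(1 + u * (math.cosh(alpha * R) - 1)) / alpha
        points.append((r, phi))

    def dist(p, q):
        dphi = math.pi - abs(math.pi - abs(p[1] - q[1]))
        x = (math.cosh(p[0]) * math.cosh(q[0])
             - math.sinh(p[0]) * math.sinh(q[0]) * math.cos(dphi))
        return math.acosh(max(x, 1.0))

    edges = [(i, j) for i in range(n) for j in range(i + 1, n)
             if dist(points[i], points[j]) <= R]
    return points, edges
```

The quadratic edge search is fine for small illustrative instances; efficient generators avoid it via angular bucketing.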
In the chosen regime for α the resulting graphs have a giant component of size Ω(n) [8], while all other components have at most polylogarithmic size [17, Corollary 13], with high probability. Throughout the paper, we refer only to the giant component when addressing hyperbolic random graphs.
We denote areas in the hyperbolic disk with calligraphic capital letters. The set of vertices in an area A is denoted by V(A). The probability for a given vertex to lie in A is given by its measure µ(A) = ∫_A f(r, φ) dφ dr. The hyperbolic distance between two vertices u and v increases with increasing angular distance between them. The maximum angular distance such that they are still connected by an edge is bounded by [28, Lemma 3.2]

θ(r(u), r(v)) = 2e^{(R − r(u) − r(v))/2} (1 ± Θ(e^{R − r(u) − r(v)})).   (2)

Hyperbolic Random Graphs with an Expected Number of Vertices. We are often interested in the probability that one or more vertices fall into a certain area of the hyperbolic disk during the sampling process of a hyperbolic random graph. Computing such a probability becomes significantly harder once the positions of some vertices are already known, since that introduces stochastic dependencies. For example, if all n vertices are sampled into an area A, the probability for a vertex to lie outside of A is 0. In order to overcome such issues, we use an approach (that has often been used on hyperbolic random graphs before, see for example [17, 26]), where the vertex positions in the hyperbolic disk are sampled using an inhomogeneous Poisson point process. For a given number of vertices n, we refer to the resulting model as hyperbolic random graphs with n vertices in expectation. After analyzing properties of this simpler model, we can translate the results back to the original model, by conditioning on the fact that the resulting distribution is equivalent to the one originally used for hyperbolic random graphs. More formally, this can be done as follows.
A hyperbolic random graph with n vertices in expectation is obtained using an inhomogeneous Poisson point process to distribute the vertices in the hyperbolic disk. In order to get n vertices in expectation, the corresponding intensity function f_P(r, φ) at a point (r, φ) in the disk is chosen as f_P(r, φ) = n · f(r, φ), where f(r, φ) is the original probability density function used to sample hyperbolic random graphs (see Equation (1)). Let P denote the set of random variables representing the points produced by this process. Then P has two properties. First, the numbers of vertices in P that are sampled into two disjoint areas are independent random variables. Second, the expected number of points in P that fall within an area A is given by E[|P ∩ A|] = ∫_A f_P(r, φ) dφ dr = n µ(A). By the choice of f_P the number of vertices sampled into the disk matches n only in expectation, i.e., E[|P|] = n. However, we can now recover the original distribution of the vertices by conditioning on the fact that |P| = n, as shown in the following lemma. Intuitively, it states that probabilistic statements on hyperbolic random graphs with n vertices in expectation can be translated to the original hyperbolic random graph model by taking a small penalty in certainty. We note that proofs of how to bound this penalty have been sketched before [17, 26]. For the sake of completeness, we give an explicit proof. In the following, we use G_P to denote a hyperbolic random graph with n vertices in expectation and point set P. Moreover, we use P to denote a property of a graph and for a given graph G we denote the event that G has property P with E(G, P).
▶ Lemma 1. Let G_P be a hyperbolic random graph with n vertices in expectation, let P be a property, and let c > 0 be a constant, such that Pr[E(G_P, P)] = O(n^{−c}). Then, for a hyperbolic random graph G′ with n vertices it holds that Pr[E(G′, P)] = O(n^{−c+1/2}).

Proof. The probability that G′ has property P can be obtained by taking the probability that a hyperbolic random graph G_P with n vertices in expectation has it, and conditioning on the fact that exactly n vertices are produced during its sampling process. That is, Pr[E(G′, P)] = Pr[E(G_P, P) | |P| = n]. This probability can now be computed using the definition of conditional probabilities, i.e.,

Pr[E(G_P, P) | |P| = n] = Pr[E(G_P, P) ∧ |P| = n] / Pr[|P| = n],

where the ∧-operator denotes that both events occur. Since |P| is Poisson distributed with mean n, Stirling's approximation yields Pr[|P| = n] = e^{−n} n^n / n! = Θ(n^{−1/2}). The quotient can, thus, be bounded by

Pr[E(G′, P)] ≤ Pr[E(G_P, P)] / Pr[|P| = n] = O(n^{−c}) / Θ(n^{−1/2}) = O(n^{−c+1/2}).

◀
Probabilities. Since we are analyzing a random graph model, our results are of a probabilistic nature. To obtain meaningful statements, we show that they hold with high probability (with probability 1 − O(n^{−1})), or asymptotically almost surely (with probability 1 − o(1)). The following Chernoff bound can be used to show that certain events occur with high probability.
▶ Theorem 2 (Chernoff Bound [31, Theorems 4.4 and 4.5]). Let X_1, …, X_n be independent random variables with X_i ∈ {0, 1} and let X be their sum. Then, for ε ∈ (0, 1],

Pr[X ≥ (1 + ε)E[X]] ≤ e^{−ε² E[X]/3} and Pr[X ≤ (1 − ε)E[X]] ≤ e^{−ε² E[X]/2}.

Usually, it suffices to show that a random variable does not exceed an upper bound. The following corollary shows that a bound on the expected value suffices to obtain concentration.

▶ Corollary 3. Let X_1, …, X_n be independent random variables with X_i ∈ {0, 1}, let X be their sum, and let f(n) be an upper bound on E[X]. Then, for ε ∈ (0, 1),

Pr[X ≥ (1 + ε)f(n)] ≤ e^{−ε² f(n)/3}.

Proof. We define random variables X′_1, …, X′_n with X′_i ≥ X_i for every outcome, in such a way that their sum X′ satisfies E[X′] = f(n). Using Theorem 2 we can derive that Pr[X ≥ (1 + ε)f(n)] ≤ Pr[X′ ≥ (1 + ε)E[X′]] ≤ e^{−ε² f(n)/3}.

◀
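As a quick empirical sanity check of the upper-tail bound in Theorem 2 (our own illustration; the function names and parameters are ours, not the paper's), one can compare the observed tail of a sum of independent indicators against the bound:

```python
import math
import random

def chernoff_upper(mu, eps):
    """Upper-tail bound of Theorem 2: Pr[X >= (1+eps)*E[X]] <= exp(-eps^2*E[X]/3)."""
    return math.exp(-eps * eps * mu / 3)

def empirical_upper_tail(n, p, eps, trials=2000, seed=7):
    """Fraction of trials in which a sum of n independent Bernoulli(p)
    indicators reaches (1 + eps) times its mean n*p."""
    rng = random.Random(seed)
    mu = n * p
    hits = sum(
        1 for _ in range(trials)
        if sum(rng.random() < p for _ in range(n)) >= (1 + eps) * mu
    )
    return hits / trials
```

For n = 200, p = 0.5, ε = 0.3 the bound evaluates to e^{−3} ≈ 0.05, and the observed tail stays well below it.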
While the Chernoff bound considers the sum of indicator random variables, we often have to deal with different functions of random variables. In this case, tight bounds on the probability that the function deviates a lot from its expected value can be obtained using the method of bounded differences. Let X_1, …, X_n be independent random variables taking values in a set S. We say that a function f: S^n → R satisfies the bounded differences condition if for all i ∈ [n] there exists a ∆_i ≥ 0 such that |f(x) − f(x′)| ≤ ∆_i for all x, x′ ∈ S^n that differ only in the i-th component.
▶ Theorem 4 (Method of Bounded Differences [13, Corollary 5.2]). Let X_1, …, X_n be independent random variables taking values in a set S and let f: S^n → R be a function that satisfies the bounded differences condition with parameters ∆_1, …, ∆_n. Then, for t ≥ 0,

Pr[f ≥ E[f] + t] ≤ exp(−t² / (2 Σ_{i∈[n]} ∆_i²)).

As before, we are usually interested in showing that a random variable does not exceed a certain upper bound with high probability. Analogously to the Chernoff bound in Corollary 3, one can show that, again, an upper bound on the expected value suffices to show concentration.

▶ Corollary 5. Let X_1, …, X_n be independent random variables taking values in a set S, let f: S^n → R be a function that satisfies the bounded differences condition with parameters ∆_1, …, ∆_n, and let g(n) ≥ E[f]. Then, for t ≥ 0,

Pr[f ≥ g(n) + t] ≤ exp(−t² / (2 Σ_{i∈[n]} ∆_i²)).

Proof. We define f′ = f + (g(n) − E[f]). As a consequence, we have f ≤ f′ for all outcomes of X_1, …, X_n and it holds that |f′(x) − f′(x′)| = |f(x) − f(x′)| for all x, x′ ∈ S^n. Consequently, f′ satisfies the bounded differences condition with the same parameters ∆_i as f. Since E[f′] = g(n), applying Theorem 4 to f′ yields the claim.

◀
A disadvantage of the method of bounded differences is that one has to consider the worst possible change in f when changing one variable, and the resulting bound becomes worse the larger this change is. A way to overcome this issue is to consider the method of typical bounded differences instead. Intuitively, it allows us to milden the effect of the change in the worst case, if it is sufficiently unlikely, and to focus on the typical cases where the change should be small, instead. Formally, we say that a function f: S^n → R satisfies the typical bounded differences condition with respect to an event A ⊆ S^n if for all i ∈ [n] there exist ∆_i ≥ ∆_i^A ≥ 0 such that |f(x) − f(x′)| ≤ ∆_i^A whenever x ∈ A, and |f(x) − f(x′)| ≤ ∆_i otherwise, for all x, x′ ∈ S^n that differ only in the i-th component.
▶ Theorem 6 (Method of Typical Bounded Differences, [40, Theorem 21]). Let X_1, …, X_n be independent random variables taking values in a set S and let A ⊆ S^n be an event. Furthermore, let f: S^n → R be a function that satisfies the typical bounded differences condition with respect to A and with parameters ∆_i ≥ ∆_i^A ≥ 0. Then, for all ε_1, …, ε_n ∈ (0, 1], there exists an event B with Pr[B] ≤ Pr[¬A] Σ_{i∈[n]} ε_i^{−1}, such that for all t ≥ 0,

Pr[f ≥ E[f] + t ∧ ¬B] ≤ exp(−t² / (2 Σ_{i∈[n]} (∆_i^A + ε_i ∆_i)²)).

Intuitively, the choice of the values for ε_i has two effects. On the one hand, choosing ε_i small allows us to compensate for a potentially large worst-case change ∆_i. On the other hand, this also increases the bound on the probability of the event B that represents the atypical case. However, in that case one can still obtain meaningful bounds if the typical event A occurs with high enough probability. Again, it is usually sufficient to show that the function f does not exceed an upper bound on its expected value with high probability. The proof of the following corollary is analogous to the one of Corollary 5.
▶ Corollary 7 ([7, Corollary 4.13]). Let X_1, …, X_n be independent random variables taking values in a set S and let A ⊆ S^n be an event. Furthermore, let f: S^n → R be a function that satisfies the typical bounded differences condition with respect to A and with parameters ∆_i ≥ ∆_i^A ≥ 0 and 1/ε_i.

Useful Inequalities.
Finally, computations can often be simplified by making use of the fact that 1 ± x can be closely approximated by e^{±x} for small x. More precisely, we use the following lemmas, which have been derived previously using the Taylor approximation [28].
Proof. First, note that there exists an ε = o(1) such that 1 − x = e^{−ε}. Therefore, it suffices to show that e^{−ε} ≥ e^{−(1+ε)x}. It is easy to see that

By Lemma 8 it holds that 1 + ε ≤ e^ε and, therefore,

An Improved Greedy Algorithm
Previous insights about solving the vertex cover problem on hyperbolic random graphs are based on the fact that the dominance reduction rule reduces the graph to a remainder of simple structure [6]. This rule states that a vertex u can be safely added to the vertex cover (and, thus, be removed from the graph) if it dominates at least one other vertex, i.e., if there exists a neighbor v ∈ N(u) such that all neighbors of v are also neighbors of u.
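The dominance check itself is straightforward to state in code. The following is our own quadratic-time illustration of the rule, not the optimized implementation used in reduction-based solvers:

```python
def dominating_vertices(adj):
    """Return all vertices u that dominate some neighbor v, i.e. for which
    every neighbor of v is also a neighbor of u (or u itself). Any such
    vertex can safely be added to the vertex cover."""
    result = set()
    for u, neighbors in adj.items():
        closed_u = set(neighbors) | {u}  # closed neighborhood of u
        if any(set(adj[v]) <= closed_u for v in neighbors):
            result.add(u)
    return result
```

For example, in a triangle with a pendant vertex attached, every triangle vertex dominates one of its neighbors, while the pendant vertex dominates nothing.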
On hyperbolic random graphs, vertices near the center of the disk dominate with high probability [6, Lemma 5]. Therefore, it is not surprising that the standard greedy algorithm, which computes a vertex cover by repeatedly taking the vertex with the largest degree, achieves good approximation rates on such networks: Since high-degree vertices are near the disk center, the algorithm essentially favors vertices that are likely to dominate and can be safely added to the vertex cover anyway.
On the other hand, after (safely) removing high-degree vertices, the remaining vertices all have similar (small) degree, meaning the standard greedy algorithm basically picks the vertices at random. Thus, in order to improve the approximation performance of the algorithm, one has to improve on the parts of the graph that contain the low-degree vertices. Based on this insight, we derive a new greedy algorithm that achieves close to optimal approximation rates efficiently. More formally, we prove the following main theorem.
▶ Theorem 11. Let G be a hyperbolic random graph on n vertices. Given the radii of the vertices, an approximate vertex cover of G can be computed in time O(m log(n)), such that the approximation ratio is (1 + o(1)) asymptotically almost surely.
Consider the following greedy algorithm that computes an approximation of a minimum vertex cover on hyperbolic random graphs. We iterate over the vertices in order of increasing radius. Each encountered vertex v is added to the cover and removed from the graph. After each step, we then identify the connected components of size at most τ log log(n) in the remainder of the graph, solve them optimally, and remove them from the graph as well. The constant τ > 0 can be used to adjust the trade-off between quality and running time: With increasing τ the parts of the graph that are solved exactly increase as well, but so does the running time.
This algorithm determines the order in which the vertices are processed based on their radii, which are not known for real-world networks. However, in hyperbolic random graphs, there is a strong correlation between the radius of a vertex and its degree [20]. Therefore, we can mimic the considered greedy strategy by processing vertices in order of decreasing degree instead. Then, the above algorithm represents an adaptation of the standard greedy algorithm: Instead of greedily adding vertices with decreasing degree until all remaining vertices are isolated, we increase the quality of the approximation by solving small components exactly.
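The degree-based variant can be sketched as follows. This is our own illustration under simplifying assumptions: components are recomputed from scratch after every removal, so this sketch does not achieve the O(m log(n)) running time of the paper's implementation, and the small-n guard inside the size threshold is ours:

```python
import math
from itertools import combinations

def connected_components(adj):
    """All connected components of the graph given as an adjacency dict."""
    seen, comps = set(), []
    for s in adj:
        if s in seen:
            continue
        comp, stack = {s}, [s]
        seen.add(s)
        while stack:
            for u in adj[stack.pop()]:
                if u not in seen:
                    seen.add(u)
                    comp.add(u)
                    stack.append(u)
        comps.append(comp)
    return comps

def optimal_cover(adj, comp):
    """Brute-force minimum vertex cover of a small component."""
    edges = [(u, v) for u in comp for v in adj[u] if v in comp and u < v]
    for k in range(len(comp) + 1):
        for cand in combinations(sorted(comp), k):
            cset = set(cand)
            if all(u in cset or v in cset for u, v in edges):
                return cset
    return set(comp)  # unreachable: comp itself always covers

def improved_greedy_cover(adj, tau=1.0):
    """Greedy by decreasing degree, but components of size at most
    roughly tau * log log n are removed and solved exactly."""
    adj = {v: set(ns) for v, ns in adj.items()}
    n = len(adj)
    limit = max(1, int(tau * math.log(max(math.log(max(n, 3)), 2))))
    cover = set()

    def solve_small():
        for comp in connected_components(adj):
            if len(comp) <= limit:
                cover.update(optimal_cover(adj, comp))
                for v in comp:
                    del adj[v]  # no edges leave a component

    solve_small()
    while any(adj[v] for v in adj):
        v = max(adj, key=lambda u: len(adj[u]))  # highest remaining degree
        cover.add(v)
        for u in adj[v]:
            adj[u].discard(v)
        del adj[v]
        solve_small()
    return cover
```

With a large τ the whole (small) input is solved exactly; with τ driving the threshold down to 1, the algorithm degenerates to standard greedy plus removal of isolated vertices.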

Approximation Performance
To analyze the performance of the above algorithm, we utilize structural properties of hyperbolic random graphs. While the power-law degree distribution and high clustering are modelled explicitly using the underlying geometry, other properties of the model, like the logarithmic diameter, emerge as a natural consequence of the first two. Our analysis is based on another emerging property: Hyperbolic random graphs decompose into small components when removing high-degree vertices.
More formally, we proceed as follows. We compute the size of the vertex cover obtained using the above algorithm, by partitioning the vertices of the graph into two sets: V_Greedy and V_Exact, denoting the vertices that were added greedily and the ones contained in small separated components that were solved exactly, respectively (see Figure 1). Clearly, we obtain a valid vertex cover for the whole graph if we take all vertices in V_Greedy together with the vertex covers computed for the components in V_Exact.

To bound the size of V_Greedy, we identify a time during the execution of the algorithm at which only few vertices were added greedily, yet the majority of the vertices were contained in small separated components (and were, therefore, part of V_Exact), and only few vertices remain to be added greedily. Since the algorithm processes the vertices by increasing radius, this point in time can be translated to a threshold radius ρ in the hyperbolic disk (see Figure 1). Therefore, we divide the hyperbolic disk into two regions: an inner disk and an outer band, containing vertices with radii below and above ρ, respectively. The threshold ρ is chosen such that a hyperbolic random graph decomposes into small components after removing the inner disk. When adding the first vertex from the outer band greedily, we can assume that the inner disk is empty (since vertices of smaller radii were chosen before or removed as part of a small component). At this point, the majority of the vertices in the outer band were contained in small components, which have been solved exactly. In our analysis, we now overestimate the size of V_Greedy by assuming that all remaining vertices are also added to the cover greedily. Therefore, we obtain a valid upper bound on |V_Greedy| by counting the total number of vertices in the inner disk and adding the number of vertices in the outer band that are contained in components that are not solved exactly, i.e., components whose size exceeds τ log log(n). In the following, we show that both
numbers are sublinear in n with high probability. Together with the fact that an optimal vertex cover on hyperbolic random graphs, asymptotically almost surely, contains Ω(n) vertices [11], this implies an approximation ratio of (1 + o(1)), asymptotically almost surely.

Figure 2: The disk is divided into the inner disk (red) and the outer band. It is additionally divided into sectors of equal width γ. Consecutive non-empty sectors form a run. Wide runs (blue) consist of many sectors. Each blue sector is a widening sector. Narrow runs (green) consist of few sectors. Small narrow runs contain only few vertices (light green), while large narrow runs contain many vertices (dark green).

The main contribution of our analysis is the identification of small components in the outer band, which is done by discretizing it into sectors, such that an edge cannot extend beyond an empty sector (see Figure 2). The foundation of this analysis is the delicate interplay between the angular width γ of these sectors and the threshold ρ that defines the outer band. Recall that ρ is used to represent the time in the execution of the algorithm at which the graph has been decomposed into small components. For our analysis we assume that all vertices seen before this point (all vertices in the inner disk; red in Figure 2) were added greedily. Therefore, if we choose ρ too large, we overestimate the actual number of greedily added vertices by too much. As a consequence, we want to choose ρ as small as possible. However, this conflicts with our intentions for the choice of γ and its impact on ρ. Recall that the maximum angular distance between two vertices such that they are adjacent increases with decreasing radii (Equation (2)). Thus, in order to avoid edges that extend beyond an angular width of γ, we need to ensure that the radii of the vertices in the outer band are sufficiently large. That is, decreasing γ requires increasing ρ. However, we want to make γ as small as possible, in order to get a finer granularity in the discretization and, with that, a more accurate analysis of the component structure in the outer band. Therefore, γ and ρ need to be chosen such that the inner disk does not become too large, while ensuring that the discretization is granular enough to accurately detect components whose size depends on τ and n. To this end, we adjust the angular width of the sectors using a function γ(n, τ), where log^{(i)}(n) denotes iteratively applying the log-function i times on n (e.g., log^{(2)}(n) = log log(n)), and set ρ based on γ(n, τ), where R = 2 log(n) + C is the radius of the hyperbolic disk.
In the following, we first show that the number of vertices in the inner disk is sublinear with high probability, before analyzing the component structure in the outer band. To this end, we make use of the discretization of the disk into sectors, by distinguishing between different kinds of runs (sequences of non-empty sectors), see Figure 2. In particular, we bound the number of wide runs (consisting of many sectors) and the number of vertices in them. Then we bound the number of vertices in large narrow runs (consisting of few sectors but containing many vertices). The remaining small narrow runs represent small components that are solved exactly.
The analysis mainly involves working with the random variables that denote the numbers of vertices in the above-mentioned areas of the disk. Throughout the paper, we usually start by computing their expected values. Afterwards, we obtain tight concentration bounds using the previously mentioned Chernoff bound or, when the considered random variables are more involved, the method of (typical) bounded differences.

The Inner Disk
The inner disk I contains all vertices whose radius is below the threshold ρ. The number of them that are added to the cover greedily is bounded by the number of all vertices in I.
▶ Lemma 12. Let G be a hyperbolic random graph on n vertices with power-law exponent

Proof. We start by computing the expected number of vertices in I and show concentration afterwards. To this end, we first compute the measure µ(I). The measure of a disk of radius r that is centered at the origin is given by e^{−α(R−r)}(1 + o(1)) [20, Lemma 3.2]. Consequently, the expected number of vertices in I is ◀

Since γ(n, τ) = ω(1), Lemma 12 shows that, with high probability, the number of vertices that are greedily added to the vertex cover in the inner disk is sublinear. Once the inner disk has been processed and removed, the graph has been decomposed into small components and the ones of size at most τ log log(n) have already been solved exactly. The remaining vertices that are now added greedily belong to large components in the outer band.

The Outer Band
To identify the vertices in the outer band that are contained in components whose size exceeds τ log log(n), we divide it into sectors of angular width γ = θ(ρ, ρ), where θ(ρ, ρ) denotes the maximum angular distance between two vertices with radii ρ to be adjacent (see Equation (2)). This division is depicted in Figure 2. The choice of γ (combined with the choice of ρ) has the effect that an edge between two vertices in the outer band cannot extend beyond an empty sector, i.e., a sector that does not contain any vertices, allowing us to use empty sectors as delimiters between components. To this end, we introduce the notion of runs, which are maximal sequences of non-empty sectors (see Figure 2). While a run can contain multiple components, the number of vertices in it denotes an upper bound on the combined sizes of the components that it contains.
To show that there are only few vertices in components whose size exceeds τ log log(n), we bound the number of vertices in runs that contain more than τ log log(n) vertices. For a given run this can happen for two reasons. First, it may contain many vertices if its angular interval is too large, i.e., it consists of too many sectors. This is unlikely, since the sectors are chosen sufficiently small, such that the probability for a given one to be empty is high. Second, while the angular width of the run is not too large, it contains too many vertices for its size. However, the vertices of the graph are distributed uniformly at random in the disk, making it unlikely that too many vertices are sampled into such a small area. To formalize this, we introduce a threshold w and distinguish between two types of runs: A wide run contains more than w sectors, while a narrow run contains at most w sectors. The threshold w is chosen such that the probabilities for a run to be wide and for a narrow run to contain more than τ log log(n) vertices are small. To this end, we set w = e^{γ(n,τ)} · log^{(3)}(n).
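The discretization into sectors and runs can be illustrated as follows (our own sketch; the sector indexing and the circular scan starting from an empty sector are assumptions of this illustration, not the paper's notation):

```python
import math

def classify_runs(angles, n_sectors, w):
    """Bucket angular coordinates into n_sectors equal sectors and return
    the maximal circular runs of consecutive non-empty sectors, split into
    wide runs (more than w sectors) and narrow runs (at most w sectors)."""
    counts = [0] * n_sectors
    for phi in angles:
        counts[int(phi / (2 * math.pi) * n_sectors) % n_sectors] += 1
    if all(counts):  # no empty sector: everything is one wide run
        return [list(range(n_sectors))], []
    start = counts.index(0)          # scan circularly from an empty sector
    runs, current = [], []
    for k in range(1, n_sectors + 1):
        i = (start + k) % n_sectors
        if counts[i] > 0:
            current.append(i)
        elif current:                # an empty sector ends the current run
            runs.append(current)
            current = []
    wide = [r for r in runs if len(r) > w]
    narrow = [r for r in runs if len(r) <= w]
    return wide, narrow
```

Since the scan both starts and ends at an empty sector, runs that wrap around the angular coordinate 0 are handled correctly.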
In the following, we first bound the number of vertices in wide runs.Afterwards, we consider narrow runs that contain more than τ log log(n) vertices.Together, this gives an upper bound on the number of vertices that are added greedily in the outer band.

Wide Runs
We refer to a sector that contributes to a wide run as a widening sector.In the following, we bound the number of vertices in all wide runs in three steps.First, we determine the expected number of all widening sectors.Second, based on the expected value, we show that the number of widening sectors is small, with high probability.Finally, we make use of the fact that the area of the disk covered by widening sectors is small, to show that the number of vertices sampled into the corresponding area is sublinear, with high probability.
Expected Number of Widening Sectors. Let n′ denote the total number of sectors and let S_1, …, S_{n′} be the corresponding sequence. For each sector S_k, we define the random variable S_k indicating whether S_k contains any vertices, i.e., S_k = 0 if S_k is empty and S_k = 1 otherwise. The sectors in the disk are then represented by a circular sequence of indicator random variables S_1, …, S_{n′}, and we are interested in the random variable W that denotes the sum of the lengths of all runs of 1s that are longer than w. In order to compute E[W], we first compute the total number of sectors, as well as the probability for a sector to be empty or non-empty.
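The quantity W can be computed mechanically from the indicator sequence. The following helper (ours, for illustration) sums the lengths of all circular runs of 1s that are strictly longer than w:

```python
def widening_sectors(bits, w):
    """Sum of the lengths of all circular runs of 1s strictly longer than w
    in the 0/1 sequence `bits`."""
    n = len(bits)
    if all(bits):                  # a single run covering the whole circle
        return n if n > w else 0
    start = bits.index(0)          # anchor the scan at a 0 so runs never wrap
    total = run = 0
    for i in range(1, n + 1):      # visits every position once, ending at `start`
        if bits[(start + i) % n]:
            run += 1
        else:
            if run > w:
                total += run
            run = 0
    return total
```

Since the final position visited is the anchoring 0, the last open run is always flushed, so runs that wrap around the seam of the sequence are handled correctly.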

▶ Lemma 13. Let G be a hyperbolic random graph on n vertices. Then, the number of sectors of width γ = θ(ρ, ρ) is n′ = 2n/γ(n, τ) · (1 ± o(1)).
Proof. Since all sectors have equal angular width γ = θ(ρ, ρ), we can use Equation (2) to compute the total number of sectors. It remains to simplify the error term. Note that γ(n, τ) = O(log^(3)(n)). Consequently, the error term is equivalent to (1 ± o(1))^{−1}. Finally, it holds that 1/(1 + x) = 1 − Θ(x) for x = ±o(1). ◀

▶ Lemma 14. Let G be a hyperbolic random graph on n vertices and let S be a sector of angular width γ = θ(ρ, ρ). For sufficiently large n, the probability that S contains at least one vertex is bounded by

Proof. To compute the probability that S contains at least one vertex, we first compute the probability for a given vertex to fall into S, which is given by the measure µ(S). Since the angular coordinates of the vertices are distributed uniformly at random and since the disk is divided into n′ sectors of equal width, the measure of a single sector S can be obtained as µ(S) = 1/n′. The total number of sectors n′ is given by Lemma 13, and we can derive µ(S) = γ(n, τ)/(2n) · (1 ± o(1)), where the second equality is obtained by applying 1/(1 + x) = 1 − Θ(x) for x = ±o(1).
Given µ(S), we first compute the lower bound on the probability that S contains at least one vertex. Note that Pr[|V(S)| ≥ 1] = 1 − (1 − µ(S))^n and that µ(S) ∈ o(1). Therefore, we can bound 1 − x ≥ e^{−x(1+o(1))} for x ∈ o(1), and obtain a corresponding bound on Pr[|V(S)| ≥ 1]. For large enough n, we have (1 + o(1)) ≤ 2, and therefore the claimed lower bound holds for sufficiently large n. Finally, 1 − x ≤ e^{−x} is valid for all x ∈ ℝ, and we obtain the claimed upper bound. ◀

We are now ready to bound the expected number of widening sectors, i.e., sectors that are part of wide runs. To this end, we aim to apply the following lemma. We note that the indicator random variables S_1, …, S_{n′} are not independent on hyperbolic random graphs. To overcome this issue, we compute the expected value of W on hyperbolic random graphs with n vertices in expectation (see Section 2) and subsequently derive a probabilistic bound on W for hyperbolic random graphs.
▶ Lemma 16.Let G be a hyperbolic random graph with n vertices in expectation and let W denote the number of widening sectors.Then, (1 ± o( 1)).
Figure 3 A circular sequence of random variables S_1, …, S_{n′} that can either be 0 (white) or 1 (blue). Dark blue runs are as large as possible without being wide. Depending on the value of S_i, the two runs of length w are merged into one run of length 2w + 1.

Concentration Bound on the Number of Widening Sectors.
Lemma 16 bounds the expected number of widening sectors and it remains to show that this bound holds with high probability.To this end, we first determine under which conditions the sum of long success runs in a circular sequence of indicator random variables can be bounded with high probability in general.Afterwards, we show that these conditions are met for our application.
▶ Lemma 17. Let S_1, …, S_{n′} denote a circular sequence of independent indicator random variables and let W denote the sum of the lengths of all success runs of length at least w. If g(n′) is an upper bound on E[W] with g(n′) = ω(w · √(n′ log(n′))), then W = O(g(n′)) with probability 1 − O((n′)^{−c}) for any constant c > 0.

Proof. In order to show that W does not exceed g(n′) by more than a constant factor with high probability, we aim to apply a method of bounded differences (Corollary 5). To this end, we consider W as a function of n′ independent random variables S_1, …, S_{n′} and determine the parameters ∆_i with which W satisfies the bounded differences condition (see Equation (3)). That is, for each i ∈ [n′] we need to bound the change in the sum of the lengths of all success runs of length at least w, obtained by changing the value of S_i from 0 to 1 or vice versa.
The largest impact on W is obtained when changing the value of S i from 0 to 1 merges two runs of size w, i.e., runs that are as large as possible but not wide, as shown in Figure 3.In this case both runs did not contribute anything to W before the change, while the merged run now contributes 2w + 1.Then, we can bound the change in W as ∆ i = 2w + 1.Note that the other case in which the value of S i is changed from 1 to 0 can be viewed as the inversion of the change in the first case.That is, instead of merging two runs, changing S i splits a single run into two.Consequently, the corresponding bound on the change of W is the same, except that W is decreasing instead of increasing.
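The merge argument above can be sanity-checked by brute force: over all 0/1 sequences of a small length, flipping any single bit never changes W by more than 2w + 1. The check below is ours; it redefines the run-sum helper so the snippet is self-contained.

```python
import itertools

def widening_sectors(bits, w):
    # Sum of the lengths of all circular runs of 1s strictly longer than w.
    n = len(bits)
    if all(bits):
        return n if n > w else 0
    start = bits.index(0)
    total = run = 0
    for i in range(1, n + 1):
        if bits[(start + i) % n]:
            run += 1
        else:
            total, run = total + (run if run > w else 0), 0
    return total

def max_flip_difference(n, w):
    # Exhaustively flip every bit of every 0/1 sequence of length n and
    # record the largest resulting change in W.
    best = 0
    for bits in itertools.product([0, 1], repeat=n):
        base = widening_sectors(list(bits), w)
        for i in range(n):
            flipped = list(bits)
            flipped[i] ^= 1
            best = max(best, abs(widening_sectors(flipped, w) - base))
    return best

print(max_flip_difference(9, 2))  # the argument predicts at most 2*w + 1 = 5
```

The maximum is attained exactly when a 0 separating two runs of length w is flipped to 1, merging them into a single contributing run of length 2w + 1, as in Figure 3.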
It follows that W satisfies the bounded differences condition with ∆_i = 2w + 1 for all i ∈ {1, …, n′}. We can now apply Corollary 5 to bound the probability that W exceeds an upper bound g(n′) on its expected value by more than a constant factor, where the second inequality is valid since w is assumed to be at least 1. Moreover, we can apply g(n′) = ω(w · √(n′ log(n′))) (a precondition of this lemma), which yields the claimed probability bound of 1 − O((n′)^{−c}). ◀
▶ Lemma 18. Let G be a hyperbolic random graph on n vertices. Then, with probability 1 − O(n^{−c}) for any constant c > 0, the number of widening sectors is W = O(h(n)).

Proof. In the following, we show that the claimed bound holds with probability 1 − O(n^{−c_1}) for any constant c_1 > 0 on hyperbolic random graphs with n vertices in expectation. By Lemma 1, the same bound then holds with probability 1 − O(n^{−c_1+1/2}) on hyperbolic random graphs. Choosing c = c_1 − 1/2 then yields the claim.
Recall that we represent the sectors using a circular sequence of independent indicator random variables S_1, …, S_{n′} and that W denotes the sum of the lengths of all success runs spanning more than w sectors, i.e., the number of all widening sectors. By Lemma 16 we obtain a valid upper bound h(n) on E[W], and it remains to show that this bound holds with sufficiently high probability. To this end, we aim to apply Lemma 17. In the following, we first show that h(n) fulfills the precondition of that lemma, before arguing that we can choose c_2 accordingly. Since τ is constant and n′ = 2n/γ(n, τ) · (1 ± o(1)) by Lemma 13, we can bound h(n), where the last bound is obtained by applying log^(3)(n)^{1/2} = ω(1). Recall that w was chosen as w = e^{γ(n,τ)} · log^(3)(n). Furthermore, we have γ(n, τ) = log(τ log^(2)(n)/(2 log^(3)(n)^2)). Thus, it holds that w = Θ(log^(2)(n)/log^(3)(n)), allowing us to further bound h(n), and it remains to show that the last factor in the root is in ω(1). Note that n′ = Ω(n/log^(3)(n)). As stated above, this shows that W = O(h(n)) holds with probability 1 − O((n′)^{−c_2}) for any constant c_2. Again, since n′ = Ω(n/log^(3)(n)), we have n′ = Ω(n^{1/2}). Therefore, we can conclude that W = O(h(n)) holds with probability 1 − O(n^{−c_2/2}). Choosing c_2 = 2c_1 then yields the claim. ◀

Number of Vertices in Wide Runs. Let W denote the area of the disk covered by all widening sectors. By Lemma 18, the total number of widening sectors is small, with high probability. As a consequence, W is small as well, and we can derive that the size of the vertex set V(W) containing all vertices in all widening sectors is sublinear with high probability.
▶ Lemma 19. Let G be a hyperbolic random graph on n vertices. Then, with high probability, the number of vertices in wide runs is sublinear.

Proof. We start by computing the expected number of vertices in W and show concentration afterwards. The probability for a given vertex to fall into W is equal to its measure µ(W).
Since the angular coordinates of the vertices are distributed uniformly at random, we have µ(W) = W/n′, where W denotes the number of widening sectors and n′ is the total number of sectors, which is given by Lemma 13. The expected number of vertices in the area W is then E[|V(W)|] = n · W/n′ = W · γ(n, τ)/2 · (1 ± o(1)), where the last equality holds since 1/(1 + x) = 1 − Θ(x) is valid for x = ±o(1). Note that the number of widening sectors W is itself a random variable. Therefore, we apply the law of total expectation and consider different outcomes of W weighted with their probabilities. Motivated by the previously determined probabilistic bound on W (Lemma 18), we consider the events W ≤ g(n) and W > g(n), where g(n) is chosen for sufficiently large c > 0 and n. With this, we can compute the expected number of vertices in W. To bound the first summand, note that Pr[W ≤ g(n)] ≤ 1; further, we apply Equation (5) from above. In order to bound the second summand, note that n is an obvious upper bound on the number of vertices in W. As a result, we obtain the claimed bound after simplifying ĝ(n). ◀

It remains to bound the number of vertices in large components contained in narrow runs.

Narrow Runs
In the following, we differentiate between small and large narrow runs, containing at most and more than τ log^(2)(n) vertices, respectively. As before, we first bound the expected number of vertices in all large narrow runs and deal with concentration afterwards.

Expected Number of Vertices in Large Narrow Runs

The straightforward way to bound the number of vertices in all large narrow runs is to consider each sector and count the number of contained vertices if the sector is part of a large narrow run. Unfortunately, whether this is the case depends on the surrounding sectors: whether they are empty (ending the run) or contain many vertices (making the run large). Dealing with these stochastic dependencies is difficult.
To relax these dependencies, we determine an upper bound on the number of vertices in large narrow runs by not only considering sectors that are part of such a run, but also ones that are in the proximity thereof. More precisely, for a sector S we define its narrow proximity P_S as S together with the w − 1 sectors to its left and the w − 1 sectors to its right. If S is part of a large narrow run, then there are more than τ log^(2)(n) vertices in its narrow proximity. Note, however, that this condition is not sufficient: even if there are as many vertices in P_S, there could be empty sectors that cut S off from the corresponding sectors, in which case S is not part of a large narrow run.
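Counting the vertices in all narrow proximities is a circular sliding-window sum. The following helper (ours, for illustration) computes |V(P_{S_i})| for every sector from per-sector vertex counts, assuming the window 2w − 1 does not exceed the number of sectors:

```python
def proximity_counts(counts, w):
    """For each sector i, return the number of vertices in its narrow
    proximity: the sector itself plus the w - 1 sectors on either side
    (circularly). Requires 2*w - 1 <= len(counts)."""
    n = len(counts)
    window = 2 * w - 1
    # Window centered at sector 0: sectors -(w-1), ..., 0, ..., w-1 (mod n).
    total = sum(counts[(-(w - 1) + j) % n] for j in range(window))
    result = []
    for i in range(n):
        result.append(total)
        # Slide the window: drop the leftmost sector, add the next on the right.
        total -= counts[(i - (w - 1)) % n]
        total += counts[(i + w) % n]
    return result
```

Each slide touches two sectors, so all n′ proximity counts are obtained in O(n′) time after the initial window sum.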
We start by bounding the expected number of vertices in the narrow proximity of a sector.
▶ Lemma 20. Let G be a hyperbolic random graph on n vertices, let S be a sector, and let P_S be its narrow proximity. Then, E[|V(P_S)|] ≤ w · γ(n, τ) · (1 ± o(1)).

Proof. The narrow proximity of S consists of S together with the w − 1 sectors to its left and the w − 1 ones to its right. In particular, P_S consists of at most 2w sectors. Since the angular coordinates of the vertices are distributed uniformly at random and since we partitioned the disk into n′ disjoint sectors of equal width, we can derive the upper bound E[|V(P_S)|] ≤ 2w · n/n′. Since n′ = 2n/γ(n, τ) · (1 ± o(1)) according to Lemma 13, we have E[|V(P_S)|] ≤ w · γ(n, τ) · (1 ± o(1))^{−1}. Since 1/(1 + x) = 1 − Θ(x) for x = ±o(1), we obtain the claimed bound. ◀

Using this upper bound, we can bound the probability that the number of vertices in the narrow proximity of a sector exceeds the threshold τ log^(2)(n) by a certain amount.
▶ Lemma 21. Let G be a hyperbolic random graph on n vertices, let S be a sector, and let P_S be its narrow proximity. For k > τ log^(2)(n) and n large enough, it holds that Pr[|V(P_S)| = k] ≤ e^{−k/18}.

Proof. First note that Pr[|V(P_S)| = k] ≤ Pr[|V(P_S)| ≥ k]. In order to show that Pr[|V(P_S)| ≥ k] is small, we aim to apply the Chernoff bound in Corollary 3, choosing ε = 1/2 and g(n) = 2/3 · k as an upper bound on E[|V(P_S)|]. To see that this is a valid choice, we use Lemma 20 and substitute γ(n, τ) = log(τ log^(2)(n)/(2 log^(3)(n)^2)), which yields E[|V(P_S)|] ≤ τ/2 · log^(2)(n) · (1 − o(1)) · (1 ± o(1)). Note that the first error term is equivalent to (1 − o(1)) and that (1 ± o(1)) ≤ 4/3 holds for n large enough. Consequently, for sufficiently large n, we have E[|V(P_S)|] ≤ 2/3 · τ log^(2)(n) ≤ g(n). Therefore, we can apply the Chernoff bound in Corollary 3 to conclude that Pr[|V(P_S)| ≥ k] ≤ e^{−k/18}.

◀
We are now ready to bound the expected value of the number N of vertices in all large narrow runs.
▶ Lemma 22. Let G be a hyperbolic random graph. Then, the expected number of vertices in all large narrow runs is sublinear.

Proof. As mentioned above, we compute an upper bound on E[N] by considering random variables N′_i instead, where N′_i = |V(S_i)| if the number of vertices in the narrow proximity of S_i exceeds the threshold t = τ log^(2)(n), and N′_i = 0 otherwise. By the above argumentation, it holds that N ≤ N′ = ∑_i N′_i. Consequently, it suffices to show that the claimed bound holds for E[N′]. To this end, we compute E[N′] = ∑_i E[N′_i], where each summand is handled using the law of total expectation. Note that we have N′_i = 0 whenever |V(P_{S_i})| ≤ t, and N′_i = |V(S_i)| otherwise; thus, the expression simplifies accordingly. In each summand we are interested in the expected number of vertices in a sector S_i, conditioned on the fact that its narrow proximity contains exactly k vertices. Since the angular coordinates of the vertices are distributed uniformly and the narrow proximity consists of 2w − 1 sectors including S_i, the expected number of vertices that end up in S_i is given by k/(2w − 1) ≤ k/w. The probability can be bounded using Lemma 21, which yields Pr[|V(P_{S_i})| = k] ≤ e^{−k/18} for k > τ log^(2)(n) = t.

Figure 4 The random variable S_i indicates whether S_i contains any vertices. Changing S_i from 0 to 1 or vice versa merges two narrow runs or splits a wide run into two narrow ones, respectively. If all vertices were placed in the blue area, moving a single vertex in or out of S_i may change the number of vertices in large narrow runs by n.
Note that the sum is of the form ∑_k k · b^k for b = e^{−1/18} < 1, which is the derivative of the geometric series multiplied by b. Consequently, we obtain an upper bound by extending the limits of summation (as t + 1 > t and n < ∞) and applying the identity ∑_{k=t+1}^{∞} k · b^k = b^{t+1} · ((t + 1) − t·b)/(1 − b)^2, which is valid for b < 1 and reduces to O(b^t · t) for constant b. Substituting b = e^{−1/18} and t = τ log^(2)(n) then yields the stated bound. Finally, since w = e^{γ(n,τ)} · log^(3)(n) by definition and n′ = O(n/γ(n, τ)) by Lemma 13, where we defined γ(n, τ) = log(τ log^(2)(n)/(2 log^(3)(n)^2)), the above term can be simplified to the claimed bound. ◀

Concentration Bound on the Number of Vertices in Large Narrow Runs. To show that the actual number of vertices in large narrow runs N is not much larger than the expected value, we consider N as a function of n independent random variables P_1, …, P_n representing the positions of the vertices in the hyperbolic disk. In order to show that N does not deviate much from its expected value with high probability, we would like to apply the method of bounded differences, which builds on the fact that N satisfies the bounded differences condition, i.e., that changing the position of a single vertex does not change N by much. Unfortunately, this change is not small in general.
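The geometric-series identity used in the proof of Lemma 22 above can be checked numerically; the snippet below (ours) compares the closed form against a direct partial sum, both for the base e^{−1/18} appearing in the proof and for a simple rational base:

```python
import math

def tail_sum_closed_form(b, t):
    # Closed form of sum_{k=t+1}^{infinity} k * b^k for 0 < b < 1,
    # obtained from the derivative of the geometric series.
    return b ** (t + 1) * ((t + 1) - t * b) / (1 - b) ** 2

def tail_sum_numeric(b, t, terms=20_000):
    # Direct partial sum; the tail beyond `terms` is negligible for b < 1.
    return sum(k * b ** k for k in range(t + 1, t + 1 + terms))

b = math.exp(-1 / 18)   # the base from the proof of Lemma 22
t = 25                  # stand-in for the threshold tau * log^(2)(n)
assert abs(tail_sum_closed_form(b, t) - tail_sum_numeric(b, t)) < 1e-6
print(tail_sum_closed_form(b, t))
```

For constant b the closed form is Θ(b^t · t), which is exactly the decay used to make the expected number of vertices in large narrow runs sublinear.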
In the worst case, there is a wide run R that contains all vertices and a sector S_i ⊆ R contains only one of them. Moving this vertex out of S_i may split the run into two narrow runs (see Figure 4). These still contain n vertices, which corresponds to the change in N. However, this would mean that R consists of only few sectors (since it can be split into two narrow runs) and that all vertices lie within the corresponding (small) area of the disk. Since the vertices of the graph are distributed uniformly, this is very unlikely. To take advantage of this, we apply the method of typical bounded differences (Corollary 7), which allows us to mitigate the effects of the change in the unlikely worst case and to focus on the typically smaller change of N instead. Formally, we represent the typical case using an event A denoting that each run of length at most 2w + 1 contains at most O(log(n)) vertices. In the following, we show that A occurs with probability 1 − O(n^{−c}) for any constant c, which shows that the atypical case is very unlikely (Lemma 23).

Proof. We show that the probability for a single run R of at most 2w + 1 sectors to contain more than O(log(n)) vertices is O(n^{−c_1}) for any constant c_1. Since there are at most n′ = O(n) runs, applying the union bound and choosing c_1 = c + 1 then yields the claim.
Recall that we divided the disk into n′ sectors of equal width. Since the angular coordinates of the vertices are distributed uniformly at random, the probability for a given vertex to lie in R (i.e., to be in V(R)) is at most (2w + 1)/n′. By Lemma 13, the total number of sectors is given as n′ = 2n/γ(n, τ) · (1 ± o(1)). Consequently, we can compute the expected number of vertices in R as E[|V(R)|] ≤ (2w + 1) · γ(n, τ)/2 · (1 ± o(1)). Substituting γ(n, τ) = O(log(log^(2)(n)/log^(3)(n)^2)), we can derive that E[|V(R)|] = O(log^(2)(n)). Consequently, g(n) = c_2 log(n) is a valid upper bound for any c_2 > 0 and large enough n. Therefore, we can apply the Chernoff bound in Corollary 3 to conclude that the probability for the number of vertices in R to exceed g(n) is small. Thus, c_2 can be chosen sufficiently large, such that this probability is O(n^{−c_1}) for any constant c_1. ◀

The method of typical bounded differences now allows us to focus on this case and to mitigate the impact of the worst-case changes, as they occur with small probability. Consequently, we can show that the number of vertices in large narrow runs is sublinear with high probability.
▶ Lemma 24. Let G be a hyperbolic random graph. Then, with high probability, the number of vertices in large narrow runs is sublinear.

Proof. Recall that the expected number of vertices in all large narrow runs is given by Lemma 22. Consequently, we can choose c > 0 large enough, such that for sufficiently large n we obtain a valid upper bound g(n) on E[N]. In order to show that N does not exceed g(n) by more than a constant factor with high probability, we apply the method of typical bounded differences (Corollary 7). To this end, we consider the typical event A, denoting that each run of at most 2w + 1 sectors contains at most O(log(n)) vertices, and it remains to determine the parameters ∆^A_i ≤ ∆_i with which N satisfies the typical bounded differences condition with respect to A (see Equation (4)) for all i ∈ {1, …, n}. As argued before, changing the position P_i of vertex i to P′_i may result in a change of n in the worst case. Therefore, ∆_i = n is a valid bound for all i ∈ [n]. To bound the ∆^A_i, we have to consider the following situation. We start with a set of positions such that all runs of 2w + 1 sectors contain at most O(log(n)) vertices, and we want to bound the change in N when changing the position P_i of a single vertex i. In this case, splitting a wide run or merging two narrow runs can only change N by O(log(n)). Consequently, we can choose ∆^A_i = O(log(n)). By Corollary 7 we can now bound the probability that N exceeds g(n) by more than a constant factor c_1. By substituting the previously determined ∆^A_i and ∆_i, as well as choosing ε_i = 1/n for all i ∈ [n], we obtain a bound whose last equality holds since γ(n, τ) = O(log^(3)(n)). By further simplifying the exponent, we can derive that the first part of the sum is exp(−ω(log(n))). It follows that N = O(g(n)) holds with high probability. ◀

The Complete Disk
In the previous subsections we determined the number of vertices that are greedily added to the vertex cover in the inner disk and the outer band, respectively. Before proving our main theorem, we first prove a slightly stronger version that shows how the parameter τ can be used to obtain a trade-off between approximation performance and running time.
▶ Theorem 25.Let G be a hyperbolic random graph on n vertices with power-law exponent β = 2α + 1 and let τ > 0 be constant.Given the radii of the vertices, an approximate vertex cover of G can be computed in time O(n log(n) + m log(n) τ ), such that the approximation factor is (1 + O(γ(n, τ ) −α )) asymptotically almost surely.
Proof. Running Time. We start by sorting the vertices of the graph in order of increasing radius, which can be done in time O(n log(n)). Afterwards, we iterate over them and perform the following steps for each encountered vertex v. We add v to the cover, remove it from the graph, and identify connected components of size at most τ log log(n) that were separated by the removal. The first two steps can be performed in time O(1) and O(deg(v)), respectively. Identifying and solving small components is more involved: removing v can split the graph into at most deg(v) components, each containing a neighbor u of v. Such a component can be identified by performing a breadth-first search (BFS) starting at u, which can be stopped as soon as it encounters more than τ log log(n) vertices; the corresponding subgraph contains at most (τ log log(n))^2 edges, so a single BFS takes time O(log log(n)^2). Whenever a component of size at most n_c = τ log log(n) is found, we compute a minimum vertex cover for it in time 1.1996^{n_c} · n_c^{O(1)}. Since n_c^{O(1)} = O((e/1.1996)^{n_c}), this running time is bounded by O(e^{n_c}) = O(log(n)^τ). Consequently, the time required to process each neighbor of v is O(log(n)^τ). Since this is potentially performed for all neighbors of v, the running time of this third step can be bounded by introducing an additional factor of deg(v).
We then obtain the total running time T(n, m, τ) of the algorithm by taking the time for the initial sorting and adding the sum of the running times of the above three steps over all vertices, which yields T(n, m, τ) = O(n log(n) + m log(n)^τ).

Approximation Ratio. As argued before, we obtain a valid vertex cover for the whole graph if we take all vertices in V_Greedy together with a vertex cover of the remaining components. ◀

Proof of Theorem 11. By Theorem 25 we can compute an approximate vertex cover in time O(n log(n) + m log(n)^τ), such that the approximation factor is 1 + O(γ(n, τ)^{−α}), asymptotically almost surely. By choosing τ = 1 we get γ(n, 1) = ω(1), which yields an approximation factor of (1 + o(1)), since α ∈ (1/2, 1). Additionally, the bound on the running time can be simplified to O(n log(n) + m log(n)). The claim then follows since we assume the graph to be connected, which implies that the number of edges is m = Ω(n). ◀

Experimental Evaluation
It remains to evaluate how well the predictions of our analysis on hyperbolic random graphs translate to real-world networks. According to the model, vertices near the center of the disk can likely be added to the vertex cover safely, while vertices near the boundary need to be treated more carefully (see Section 3). Moreover, it predicts that these boundary vertices can be found by identifying small components that are separated when removing vertices near the center. Due to the correlation between the radii of the vertices and their degrees [20], this points to a natural extension of the standard greedy approach: while iteratively adding the vertex with the largest degree to the cover, small separated components are solved exactly. To evaluate how this approach compares to the standard greedy algorithm, we measured the approximation ratios on the largest connected component of a selection of 42 real-world networks from several network datasets [29, 35]. The results of our empirical analysis are summarized in Figure 5.
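The adapted greedy just described can be sketched as follows. This is our own illustrative implementation: for the exact step it uses naive branching instead of the O(1.1996^n)-time algorithm referenced in the running-time analysis, which suffices for the tiny components involved, and all function names are ours.

```python
from collections import deque

def min_vertex_cover_bruteforce(edges):
    # Branch on an arbitrary uncovered edge; exponential in general, but we
    # only call this on components of size O(log log n).
    if not edges:
        return set()
    u, v = next(iter(edges))
    with_u = min_vertex_cover_bruteforce({e for e in edges if u not in e}) | {u}
    with_v = min_vertex_cover_bruteforce({e for e in edges if v not in e}) | {v}
    return with_u if len(with_u) <= len(with_v) else with_v

def adapted_greedy_cover(adjacency, threshold):
    """Greedy vertex cover by descending degree; whenever a separated
    component has at most `threshold` vertices, it is solved exactly."""
    adj = {v: set(nb) for v, nb in adjacency.items()}  # mutable working copy
    cover = set()

    def solve_if_small(start):
        # BFS capped at `threshold` vertices; on success, solve the
        # component exactly and delete it from the working graph.
        seen, queue = {start}, deque([start])
        while queue:
            for w in adj[queue.popleft()]:
                if w not in seen:
                    if len(seen) == threshold:
                        return          # component too large; leave it alone
                    seen.add(w)
                    queue.append(w)
        edges = {frozenset((a, b)) for a in seen for b in adj[a]}
        cover.update(min_vertex_cover_bruteforce(edges))
        for v in seen:
            del adj[v]

    while any(adj[v] for v in adj):
        v = max(adj, key=lambda u: len(adj[u]))  # highest remaining degree
        cover.add(v)
        neighbours = adj.pop(v)
        for u in neighbours:
            adj[u].discard(v)
        for u in neighbours:
            if u in adj:
                solve_if_small(u)
    return cover
```

Setting `threshold = 10 * math.ceil(math.log(math.log(n)))` corresponds to the choice τ = 10 used in the experiments; with `threshold = 0` the sketch degenerates to the standard degree-greedy heuristic.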
Our experiments confirm that the standard greedy approach already yields close to optimal approximation ratios on all networks, as observed previously [12].In fact, the "worst" approximation ratio is only 1.049 for the network dblp-cite.The average lies at just 1.009.
Clearly, our adapted greedy approach performs at least as well as the standard greedy algorithm. In fact, for τ = 1 the sizes of the components that are solved optimally on the considered networks are at most 3. For components of this size the standard greedy approach performs optimally. Therefore, the approximation performances of the standard and the adapted greedy match in this case. However, the adapted greedy algorithm allows for improving the approximation ratio by increasing the size of the components that are solved optimally. In our experiments, we chose 10⌈log log(n)⌉ as the component size threshold, which corresponds to setting τ = 10. The resulting impact can be seen in Figure 6, which shows the error of the adapted greedy compared to the one of the standard greedy algorithm. This relative error is measured as the fraction of the number of vertices by which the adapted greedy and the standard approach exceed an optimum solution. That is, a relative error of 0.5 indicates that the adapted greedy halved the number of vertices by which the solution of the standard greedy exceeded an optimum. Moreover, a relative error of 0 indicates that the adapted greedy found an optimum when the standard greedy did not. The relative error is omitted (gray in Figure 6) if the standard greedy already found an optimum, i.e., there was no error to improve on. For more than 69% of the considered networks (29 out of 42) the relative error is at most 0.5, and the average relative error is 0.39. Since the behavior of the two algorithms only differs when it comes to small separated components, this indicates that the predictions of the model that led to the improvement of the standard greedy approach do translate to real-world networks. In fact, the average approximation ratio obtained using the standard greedy algorithm is reduced from 1.009 to 1.004 when using the adapted greedy approach.

Figure 6 Relative error of the improved greedy compared to the standard approach. The parameter adjusting the component size threshold was chosen as τ = 10. Gray bars indicate that no error could be determined since the standard approach found an optimum already.
The neighborhood of a vertex v is defined as N(v) = {w ∈ V | {v, w} ∈ E}. The size of the neighborhood, called the degree of v, is denoted by deg(v) = |N(v)|. For a subset S ⊆ V, we use G[S] to denote the induced subgraph of G obtained by removing all vertices in V \ S.
The approximation ratio is given by the quotient δ = (|V_Greedy| + |C_Exact|)/|C_OPT|, where C_OPT denotes an optimal solution. Since all components in G[V_Exact] are solved optimally and since any minimum vertex cover for the whole graph induces a vertex cover on G[V′] for any vertex subset V′ ⊆ V, it holds that |C_Exact| ≤ |C_OPT|. Consequently, it suffices to show that |V_Greedy| ∈ o(|C_OPT|) in order to obtain the claimed approximation factor of 1 + o(1).

Figure 1 A hyperbolic random graph with 1942 vertices, average degree 7.7, and power-law exponent 2.6. The vertex sets V_Greedy and V_Exact are shown in red and blue, respectively. The dashed line shows a possible threshold radius ρ.
and we can apply the Chernoff bound in Corollary 3 to conclude that |V(I)| = O(n · γ(n, τ)^{−α}) holds with probability 1 − O(n^{−c}) for any c > 0.

for any c_1 > 0. Clearly, the first summand dominates the second, and we can conclude that E[|V(W)|] = O(g(n) · γ(n, τ)). Consequently, for large enough n, there exists a constant c_2 > 0 such that ĝ(n) = c_2 · g(n) · γ(n, τ) is a valid upper bound on E[|V(W)|]. This allows us to apply the Chernoff bound in Corollary 3 to bound the probability that |V(W)| exceeds ĝ(n) by more than a constant factor.

For k > τ log^(2)(n) and n large enough, it holds that Pr[|V(P_S)| = k] ≤ e^{−k/18}. Proof. First note that Pr[|V(P_S)| = k] ≤ Pr[|V(P_S)| ≥ k]. In order to show that Pr[|V(P_S)| ≥ k] is small, we aim to apply the Chernoff bound in Corollary 3, choosing ε = 1/2 and g(n) = 2/3 · k as an upper bound on E[|V(P_S)|].

▶ Lemma 23. Let G be a hyperbolic random graph. Then, each run of length at most 2w + 1 contains at most O(log(n)) vertices with probability 1 − O(n^{−c}) for any constant c.

the first summand dominates asymptotically. ◀

▶ Theorem 11. Let G be a hyperbolic random graph on n vertices. Given the radii of the vertices, an approximate vertex cover of G can be computed in time O(m log(n)), such that the approximation ratio is (1 + o(1)) asymptotically almost surely.

Figure 5 Approximation ratios obtained using the standard greedy approach (blue) and our improved version (red) on a selection of real-world networks. The parameter adjusting the component size threshold was chosen as τ = 10. For the sake of readability, the bars denoting the ratios for the dblp-cite network were cropped and the actual values written next to them.
The first two steps can be performed in time O(1) and O(deg(v)), respectively. Identifying and solving small components is more involved. Removing v can split the graph into at most deg(v) components, each containing a neighbor u of v. Such a component can be identified by performing a breadth-first search (BFS) starting at u. Each BFS can be stopped as soon as it encounters more than τ log log(n) vertices. The corresponding subgraph contains at most (τ log log(n))^2 edges. Therefore, a single BFS takes time O(log log(n)^2). Whenever a component of size at most n_c = τ log log(n) is found, we compute a minimum vertex cover for it in time 1.1996^{n_c} · n_c^{O(1)}.

The approximation ratio of the resulting cover is then given by the quotient δ = (|V_Greedy| + |C_Exact|)/|C_OPT|, where C_OPT denotes an optimal solution. Since all components in G[V_Exact] are solved optimally and since any minimum vertex cover for the whole graph induces a vertex cover on G[V′] for any vertex subset V′ ⊆ V, it holds that |C_Exact| ≤ |C_OPT|. Therefore, the approximation ratio can be bounded by δ ≤ 1 + |V_Greedy|/|C_OPT|. To bound the number of vertices in V_Greedy, we add the number of vertices I in the inner disk I, as well as the number of vertices W in the outer band that are contained in wide runs and the number of vertices N that are contained in large narrow runs. That is, δ ≤ 1 + (I + W + N)/|C_OPT|. Upper bounds on I, W, and N that hold with high probability are given by Lemmas 12, 19, and 24, respectively. Furthermore, it was previously shown that the size of a minimum vertex cover on a hyperbolic random graph is |C_OPT| = Ω(n), asymptotically almost surely [11, Theorems 4.10 and 5.8]. We obtain the claimed approximation factor.