Distribution-Sensitive Construction of the Greedy Spanner

The greedy spanner is the highest quality geometric spanner (in e.g. edge count and weight, both in theory and practice) known to be computable in polynomial time. Unfortunately, all known algorithms for computing it take Omega(n^2) time, limiting its applicability on large data sets. We observe that for many point sets, the greedy spanner has many 'short' edges that can be determined locally and usually quickly, and few or no 'long' edges that can usually be determined quickly using local information and the well-separated pair decomposition. We give experimental results showing large to massive performance increases over the state-of-the-art on nearly all tests and real-life data sets. On the theoretical side we prove a near-linear expected time bound on uniform point sets and a near-quadratic worst-case bound. Our bound for point sets drawn uniformly and independently at random in a square follows from a local characterization of t-spanners we give on such point sets: we give a geometric property that holds with high probability on such point sets. This property implies that if an edge set on these points has t-paths between pairs of points 'close' to each other, then it has t-paths between all pairs of points. This characterization gives an O(n log^2 n log^2 log n) expected time bound on our greedy spanner algorithm, making it the first subquadratic time algorithm for this problem on any interesting class of points. We also use this characterization to give an O((n + |E|) log^2 n log log n) expected time algorithm on uniformly distributed points that determines if E is a t-spanner, making it the first subquadratic time algorithm for this problem that does not make assumptions on E.


Introduction
A Euclidean graph on a set of points in the Euclidean plane is a weighted graph with geometric distances as edge weights. If a shortest route in the graph is at most t times as long as the direct geometric distance between its endpoints, we say these endpoints have a t-path; a Euclidean graph is a t-spanner if all pairs of points have t-paths. For any t > 1, we can efficiently find a t-spanner with O(n/(t−1)) edges in the Euclidean plane [21]. These 'approximations' thus have very few edges compared to the complete Euclidean graph, while approximately maintaining distances. This makes them a useful tool in many areas.
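The t-spanner definition can be checked directly by brute force. The following is a minimal sketch, not one of the algorithms of this paper: it runs one Dijkstra search per point and compares network distances with t times the Euclidean distance. The function names and the tolerance `eps` are assumptions made for illustration.

```python
import heapq
import math


def dijkstra(adj, source):
    """Return shortest-path distances from source in a weighted graph."""
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, math.inf):
            continue  # stale heap entry
        for v, w in adj[u]:
            nd = d + w
            if nd < dist.get(v, math.inf):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist


def is_t_spanner(points, edges, t, eps=1e-12):
    """Brute-force test: every pair of points must have a t-path."""
    adj = {i: [] for i in range(len(points))}
    for u, v in edges:
        w = math.dist(points[u], points[v])  # Euclidean edge weight
        adj[u].append((v, w))
        adj[v].append((u, w))
    for u in range(len(points)):
        dist = dijkstra(adj, u)
        for v in range(u + 1, len(points)):
            if dist.get(v, math.inf) > t * math.dist(points[u], points[v]) + eps:
                return False
    return True
```

On the four corners of a unit square connected as a cycle, the worst pair is a diagonal with dilation 2/√2 ≈ 1.414, so the cycle is a 1.5-spanner but not a 1.4-spanner. This check takes roughly quadratic time or worse, which is the cost the results of this paper avoid on uniform point sets.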
Bounded degree spanners are used in wireless network design [15], where for example points of high degree tend to have problems with interference. By using such a bounded degree spanner the problem of interference is minimized while the connectivity is maintained. A considerable amount of research has been done on the topic of spanners [16,21] since they were introduced in network design [22] and in geometry [11]. Spanners have been used as components in various geometric and distributed algorithms.
Many different construction methods exist for t-spanners, where t can be parameterized to an arbitrary value greater than 1, each having different advantages and disadvantages. An in-depth treatment of these spanners can be found in the book [21]. We focus on the greedy spanner, which is a very sparse graph with degree and weight bounds. On uniform point sets (t = 2), one of its closest well-known competitors, the Θ-graph, has about ten times as many edges, twenty times higher total weight and six times higher maximum degree. Figure 1 clearly shows the contrast between these two spanners. Unfortunately, all known algorithms computing the greedy spanner use Ω(n^2) time [5,7], making the spanner impractical to compute.
Fig. 1: The left rendering shows the greedy spanner on 100 points distributed uniformly in a square with t = 2.
The right rendering shows the Θ-graph on the same points with k = 6 for which it was recently proven it achieves a dilation of 2.
We observed that on real-world examples, the greedy spanner contains mostly short edges with at most a few longer edges. Whether an edge is placed depends only on the points and edges in an ellipse with its endpoints as foci and with eccentricity t, which is a small area for short potential edges, hopefully containing few points. We can therefore find these short edges using a bucketing scheme, giving a speedup on such point sets.
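A bucketing scheme of this kind can be sketched as a uniform grid that answers "which points lie within a given radius of this point" by scanning only nearby cells. This is an illustrative sketch under our own assumptions (the names `build_grid` and `neighbors_within` and the choice of cell size are hypothetical), not the implementation used in the experiments.

```python
import math
from collections import defaultdict


def build_grid(points, cell):
    """Map each grid cell (ix, iy) to the indices of the points inside it."""
    grid = defaultdict(list)
    for i, (x, y) in enumerate(points):
        grid[(math.floor(x / cell), math.floor(y / cell))].append(i)
    return grid


def neighbors_within(points, grid, cell, i, radius):
    """Indices of all points other than i at Euclidean distance <= radius from point i."""
    x, y = points[i]
    reach = math.ceil(radius / cell)  # number of cells to scan in each direction
    cx, cy = math.floor(x / cell), math.floor(y / cell)
    found = []
    for gx in range(cx - reach, cx + reach + 1):
        for gy in range(cy - reach, cy + reach + 1):
            for j in grid.get((gx, gy), ()):
                if j != i and math.dist(points[i], points[j]) <= radius:
                    found.append(j)
    return sorted(found)
```

With a cell size on the order of the query radius, each query touches a constant number of cells, so on evenly spread points the work per query is proportional to the number of nearby points rather than to n.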
For the 'long' edges, we consider the 'long' well-separated pairs from a WSPD [10]. We first compute information from the 'short' edges, attempting to find witnesses that show that certain 'long' well-separated pairs will not contain greedy spanner edges. This information is represented by path-hyperbolas. We then perform a standard algorithm [5] on the (hopefully only few) well-separated pairs for which we cannot find such a witness.
We present experimental results showing that the above algorithm works very well on many data sets, ranging from real-world data sets to generated point sets that are uniformly or normally distributed or clustered. Speedups vary from an (apparently) linear factor to a constant factor. In particular, on a uniformly distributed point set with 300,000 points, our new algorithm needs 19 minutes to compute the greedy spanner for t = 2, while the only other algorithm that can handle point sets of this size [5] (other algorithms need quadratic space, which is prohibitive) needs 17 hours on the same set.
We show that our algorithm has a near-quadratic worst-case time bound. In order to explain its behavior observed in experiments on realistic point sets, we analyze its performance on point sets distributed uniformly and independently at random in a square (or 'uniformly distributed points' for short).
Euclidean graphs are frequently analyzed on uniformly distributed points, both concerning theoretical properties and experimental evaluation of structures and algorithms. One can find examples in computational geometry [9,19], combinatorial optimization [26,29] and the analysis of ad-hoc networks [23,28].
Various spanner constructions have been analyzed on uniformly distributed point sets [1,8,13,25,27]. Some of these constructions yield a t-spanner for a fixed t, others are parameterizable with arbitrary t > 1. Relatively sharp bounds have been obtained on various qualities of these spanners. This gives insight into the behavior of these constructions in situations arguably closer to realistic point sets than worst-case situations.
The spanner constructions studied in these analyses have a 'local' characterization: for example, Gabriel graphs connect u, v if the circle having uv as its diameter contains no points other than u and v. For graphs with such a local characterization there are well-developed techniques to analyze them on uniformly distributed points [12]. In this paper, however, we look at the 'global' property t-spannerness and the greedy spanner, a graph for which the existence of an edge may depend on all other points. Previous analysis techniques do not directly apply to such properties. However, one of our main contributions is to show that with high probability, greedy spanners do admit a local characterization on uniform point sets.
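To make the notion of a 'local' characterization concrete, the Gabriel graph rule mentioned above can be written as a test that only looks at the disk with diameter uv. This is a small illustrative sketch; the function name is ours, and it scans all points for simplicity where a real implementation would only inspect points near the disk.

```python
def gabriel_edge(u, v, points):
    """True if the closed disk with diameter uv contains no point other than u and v."""
    cx, cy = (u[0] + v[0]) / 2, (u[1] + v[1]) / 2            # disk center
    r2 = ((u[0] - v[0]) ** 2 + (u[1] - v[1]) ** 2) / 4       # squared radius
    for p in points:
        if p == u or p == v:
            continue
        if (p[0] - cx) ** 2 + (p[1] - cy) ** 2 <= r2:
            return False  # p blocks the edge uv
    return True
```

The decision for uv never depends on points far from u and v, which is what makes such graphs amenable to the analysis techniques of [12]. The greedy spanner has no such rule by definition, which is why a separate argument is needed.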
We give two more examples of local analysis. For a pair of points the minimum t such that there is a t-path between them is called their dilation. In a t-spanner the dilation is bounded by t for all pairs of points. For graphs on points drawn from a Poisson point process the average dilation between pairs of points has also been studied. Many graphs with a local characterization, like the Gabriel graph, have low average dilation [3]. The property of having low average dilation can be linked to percolation [4].
We consider points distributed uniformly and independently at random in a √n × √n square. We use a √n × √n scale for our square so that if a part of this square has area A, then O(A) points lie in it in expectation. We only consider the case of the Euclidean plane; our results may generalize to higher dimensions, but we did not explore this. In this introduction, when stating bounds, we assume t is a constant.
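The scaling convention can be illustrated with a short sketch that draws n points in a √n × √n square and counts how many fall in an axis-aligned box. The helper names are hypothetical; Python's `random.Random` is used here purely for illustration and is not the generator used in the experiments.

```python
import math
import random


def uniform_points(n, rng):
    """Draw n points uniformly and independently in a sqrt(n) x sqrt(n) square."""
    side = math.sqrt(n)
    return [(rng.uniform(0, side), rng.uniform(0, side)) for _ in range(n)]


def count_in_box(points, x0, y0, x1, y1):
    """Number of points inside the axis-aligned box [x0, x1] x [y0, y1]."""
    return sum(1 for (x, y) in points if x0 <= x <= x1 and y0 <= y <= y1)


rng = random.Random(1)
pts = uniform_points(10_000, rng)
# The square has area n, so a box of area A holds A points in expectation.
in_box = count_in_box(pts, 0, 0, 10, 10)  # A = 100
```

Because the density is one point per unit area, lengths such as O(log n) in the statements below translate directly into expected point counts of O(log^2 n) in a disk of that radius.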
We prove that such point sets are, with high probability, configured in such a way that for any edge set E, if there are t-paths between points at most O(log n) away from each other, then there are t-paths between all points. In particular, we show that we can construct a 'witness' of this configuration in O(n log^2 n log log n) expected time if it exists, thus allowing our algorithms to always give the right answer.
This result easily implies that with high probability the greedy spanner has no long edges (longer than O(log n)) and furthermore that the 'proof' phase of our algorithm will find the witnesses for this if they exist. As the grid strategy works well on uniformly distributed point sets, we obtain an O(n log^2 n log^2 log n) expected time bound on our algorithm. To the best of our knowledge, this algorithm is the first subquadratic algorithm to compute the greedy spanner on any interesting class of point sets.
Another application of our result is a method to test whether a Euclidean graph G = (P, E) is a t-spanner on uniformly distributed points running in O((n + |E|) log^2 n log log n) expected time. To the best of our knowledge, this is the first subquadratic time algorithm for this problem for any interesting class of points. Various algorithms are known for specific graphs on arbitrary points, but not for arbitrary graphs on specific sets of points. Hellweg et al. [17] give a Monte Carlo algorithm for bounded degree graphs that distinguishes between being a t-spanner and being far away from a spanner. For specific graph classes the minimum t can be computed [2,14], and for general graphs this t can be approximated [20].
The rest of the paper is organized as follows. In Section 2 we introduce bridgedness and give a geometric lemma that will help us obtain our results. In Section 3 we show uniform point sets are locally-O(log n)-bridged with high probability. In Section 4 we give several fast algorithms that use this result. Finally, in Section 5 we present experimental results for our algorithm that computes the greedy spanner.

Bridging Points
In this section we will introduce the concept of λ-bridgedness for point sets. We will later use this concept in our characterization of t-spanners on uniformly distributed point sets. We will prove two geometric lemmas that will help us with the result of Section 3.
Let P be a finite set of points in R^2, let n = |P|, and let t ∈ R be the intended dilation (t > 1). Let G = (P, E) be a graph on P whose edges are weighted with the Euclidean distance between their endpoints. For two points u, v ∈ P, we denote the Euclidean distance between u and v by |uv|, and the network distance in G by δ_G(u, v) (or just δ(u, v) if G is clear from the context). We say a pair of points (u, v) has a t-path if δ(u, v) ≤ t·|uv|. If all pairs of points have a t-path, the graph is called a t-spanner.
Let a, b, p, q ∈ P be pairwise different points. We say that the pair (p, q) bridges the pair (a, b) if t·|ap| + |pq| + t·|qb| ≤ t·|ab|. Bridging points guarantee a t-path for (a, b) if (p, q) is an edge and the pairs (a, p) and (q, b) already have t-paths (as well as |ap|, |qb|, |pq| < |ab|).
We say that (p, q) is mandatory if the ellipse with foci p and q and eccentricity t, including its border, contains no points in P other than p and q. Any t-path between p and q must fully lie within this ellipse, so if (p, q) is mandatory then (p, q) ∈ E for any t-spanner E on P.
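The mandatory test can be sketched as follows. As elsewhere in this paper, a point c is taken to lie in the ellipse with foci p and q and eccentricity t when |pc| + |cq| ≤ t·|pq|. The function names are ours, and the linear scan over all points is for illustration only.

```python
import math


def in_ellipse(c, p, q, t):
    """True if c lies in the closed ellipse {x : |px| + |xq| <= t * |pq|}."""
    return math.dist(p, c) + math.dist(c, q) <= t * math.dist(p, q)


def is_mandatory(p, q, points, t):
    """True if no point other than p and q lies in the ellipse around (p, q)."""
    return not any(in_ellipse(c, p, q, t) for c in points if c != p and c != q)
```

Any path from p to q that passes through a point outside this ellipse is already longer than t·|pq|, which is why an empty ellipse forces the edge (p, q) into every t-spanner.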
Let λ ∈ R. We say that a point a ∈ P is λ-bridged if for all b ∈ P with |ab| > λ, there exists a mandatory pair of points (p, q), p, q ∈ P, bridging (a, b). We say that the point set P is λ-bridged if all points in P are λ-bridged. We say a point a ∈ P is locally-λ-bridged if it is λ-bridged using only mandatory bridging pairs of points at most λ from a. A point set P is locally-λ-bridged if all points in P are locally-λ-bridged. Lemma 2.1 shows the usefulness of this concept.
Lemma 2.1. Let P be a set of points that is λ-bridged. For any Euclidean graph G = (P, E) it holds that G is a t-spanner if and only if all pairs of points (a, b), a, b ∈ P, with |ab| ≤ λ have a t-path in G.
Proof. Follows by induction over all pairs of points (a, b) with |ab| ascending and earlier observations.
We now develop a sufficient geometric condition for bridging pairs of points.
Lemma 2.2. Suppose we are given points a, b ∈ P, rectangles R_1 and R_2 and t > 1, such that (as per Fig. 2): R_1 and R_2 lie in between a and b, have a side parallel to ab, have their centers on line segment ab, both have width w and height h, are separated by s ≥ ((t+1)/(t−1))·h and R_1 lies closer to a than R_2. Then, for any p, q ∈ P with p lying in R_1 and q lying in R_2, (p, q) bridges (a, b).
Proof. To simplify the proof, we assume without loss of generality that ab lies on the x-axis. For any u, v ∈ P, we denote the difference in x-coordinates of u and v as d_x(u, v). We have d_x(p, q) ≥ s ≥ ((t+1)/(t−1))·h, so h ≤ ((t−1)/(t+1))·d_x(p, q), which leads to the lemma using the triangle inequality as follows:
t·|ap| + |pq| + t·|qb| ≤ t·(d_x(a, p) + h/2) + (d_x(p, q) + h) + t·(d_x(q, b) + h/2) = t·(d_x(a, p) + d_x(q, b)) + d_x(p, q) + (t+1)·h ≤ t·(d_x(a, p) + d_x(p, q) + d_x(q, b)) = t·|ab|.
We now use Lemma 2.2 to prove a stronger statement that we will use to prove the full version of Theorem 3.1. Let a, p, q ∈ P be pairwise different points and let A ⊆ R^2 with a, p, q ∉ A. We say that the pair (p, q) bridges (a, A) if for every point b ∈ P with b ∈ A we have that (p, q) bridges (a, b).
Lemma 2.3. Assume we are given a ∈ P, a line ℓ through a, an angle α ≤ π/4, a constant c_max, rectangles R_1 and R_2 and t > 1, such that (as per Fig. 3): R_1 and R_2 have width w and height h, are separated by s, have a side parallel to ℓ, have their centers on ℓ, R_1 lies between a and R_2, R_2 lies at most c_max away from a, R_1 lies at least h/2 away from a and s ≥ √2·((t+1)/(t−1))·(2 sin(α)·c_max + h) + h. Define A as the area in the cone with apex a, angle 2α and bisector ℓ that is at least c_cone = c_max + h/2 away from a. Then for any p, q ∈ P with p lying in R_1 and q lying in R_2, (p, q) bridges (a, A).
Fig. 3: R_1 and R_2 are covered by R'_1 and R'_2, which satisfy the requirements of Lemma 2.2.
Proof. Let b ∈ A and b ∈ P. We will prove that the rectangles R_1 and R_2 can be covered by rectangles R'_1 and R'_2 respectively, that meet all requirements of Lemma 2.2, which therefore implies that the pair (p, q) bridges (a, b). The lemma then follows.
The rectangles R'_1 and R'_2 are chosen such that their centers lie on line segment ab, they lie in between a and b (this is where c_cone = c_max + h/2 is needed) and have at least one side parallel to ab. The rectangles are chosen to have equal width (= length of the side parallel to ab) w' and height h'. Their position, height and width are chosen as the minimal values such that R'_1 contains R_1 and R'_2 contains R_2 (while maintaining the previous properties), as depicted in Fig. 3. Let s' be the separation between R'_1 and R'_2 and let β be the angle between ℓ and ab. Using basic geometry we can derive that: The angle β ≤ α is bounded by π/4. This implies that cos(β) ≥ √2/2 and sin(β) ≤ √2/2. We obtain the following lower bound on s': Substituting the lower bound assumed for the lemma and using that sin α ≥ sin β we have: We bound h'/2 by the distance from the center of the right side of R_2 to ab plus the distance from this center to the corner of R_2: Combining the bounds on h' and s' gives This proves that all requirements of Lemma 2.2 hold. Hence (p, q) bridges (a, b).
We now have the tool needed for the main result.
Theorem 3.1. There exists c_t dependent only on t such that for every c > 0, if P is a set of points uniformly and independently distributed at random in a √n × √n square and n is large enough, then with probability at least 1 − n^(−c), P is locally-(c·c_t·log n)-bridged.

The main idea of the proof
We need to prove that every point in P is locally-(c·c_t·log n)-bridged simultaneously with high probability.
We show that every point individually is locally-(c·c_t·log n)-bridged with sufficiently high probability that a simple union bound shows that it will happen to all points simultaneously with high probability. We will use Lemma 2.3 to achieve this. For ease of presentation, we assume t is a constant. The rectangles in Lemma 2.3 can be chosen to have a roughly constant chance of containing a point, and if we can fulfill the other requirements, the resulting pair of points bridges a relatively large part of R^2. In fact, we need only π/α cones (we will end up picking α = O(1/log n)) to cover the area we wish to cover, as depicted in Fig. 4. We show the likely existence of a pair of mandatory points that bridges a single cone and use a union bound to show such pairs are likely to exist for all cones simultaneously.
We will place O(log n) pairs of rectangles in every cone as depicted in Fig. 5. If any pair of boxes ends up containing a point per box, these two points will satisfy the requirements for Lemma 2.3. We just need this pair of points to be mandatory, and therefore consider an ellipse around such a pair of boxes (defined in terms of the boxes, not the points, for easy analysis), such that if this ellipse is empty apart from these two points, these points must be mandatory. Using a careful analysis, the chance that a pair of boxes contains one point per box and the ellipse contains no more points (an event we will call a 'success') is at least some constant p (dependent only on t). We need only one success per cone and the events are nearly independent (the ellipses don't overlap), so the chance that we get at least one success is at least (roughly) 1 − (1 − p)^(O(log n)), which then shows the theorem.
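To see why O(log n) pairs of rectangles per cone are enough, the following back-of-the-envelope calculation may help. It is an idealization, not the argument of the full proof: it treats the pairs as fully independent, takes the logarithm to be natural, and writes k ln n for the number of pairs per cone, with p the constant success probability.

```latex
\[
  \Pr[\text{no success in a cone}]
  \;\le\; (1-p)^{k \ln n}
  \;\le\; e^{-p k \ln n}
  \;=\; n^{-pk},
  \qquad \text{using } 1 - p \le e^{-p}.
\]
\[
  \text{If } k \ge \frac{c+2}{p}: \qquad
  \Pr[\text{some cone of some point fails}]
  \;\le\; n \cdot n \cdot n^{-(c+2)} \;=\; n^{-c},
\]
```

The last step is a union bound over n points and, as a loose assumption, at most n cones per point. The full proof has to work harder because the success events of different pairs are only nearly independent.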

The full proof
Note that we will often introduce a constant (say, the height h of R 1 ), give it a value (say h = 1) but still refer to the name of the variable later for clarity (so h instead of 1).

Positioning the cones
Let c be given as per the theorem. Let k = 4 exp(604 t^(7/2)/(t−1)^(3/2))·(c + 14). Let c_max = 12(k log(n) + 1) . We partition the circle with radius c_cone = c_max + 1/2 around every point a into m log n cones, as depicted in Fig. 4. We want the area in every cone within the circle to fall entirely within the square. If a lies near the edge of the square, this may not always be the case, so for these cones we either remove them or rotate them slightly around a as follows.
We only aim to prove that a is c_bridge-bridged with c_bridge = √2·c_cone (and not c_cone-bridged), so we remove all cones whose area further than c_bridge from a lies outside the square in its entirety. For all other cones, if a point lies sufficiently far from a corner, it is easy to see we can just rotate the cone a bit so that the area closer than c_cone from a lies entirely within the square while the area that is further than c_bridge from a but still within the square is the same for the original and the rotated cone.
The only potential problem occurs when rotating a cone makes it end up outside the square if a lies near a corner of the square. However, this means that the area of the cone further than c_bridge away from a but still within the square contains the corner of the square, but it is easily seen that this means that at least one of the edges of the square is more than c_cone away from a, so this is never a problem. Note that rotated cones may overlap other cones, causing dependency issues that we will deal with later.

Boxes in Cones
We place k log n pairs of rectangles R_1 and R_2 in every cone as per Lemma 2.3, as depicted in Fig. 5. If at least one point ends up in R_1, at least one point ends up in R_2, and no other point ends up in E (making the pair of points mandatory), then we say that this pair of rectangles is a success. Let α = arcsin(1/(2c_max)); then the cones have angle ≤ 2α, which implies that s ≥ √2·((t+1)/(t−1))·(2 sin(α)·c_max + h) + h. The pair of points corresponding to a success would therefore fit the conditions of Lemma 2.3 and would therefore bridge the cone we are considering. The lemma requires that the angle of the cones is at most π/4, which follows from m ≥ 8, which follows from m ≥ 12(k log n + 1)·t^2/(t−1) − 1, which follows from t^2/(t−1) > 1 and k log n ≥ 1. We will show that we will have at least one success for every cone simultaneously with high probability.
We first consider the final condition that needs to be met: the first box must lie far enough away from the origin point so the ellipse around it lies entirely within the cone.The ellipse has minor axis ≤ 17 t 3/2 √ t−1 and the cone is therefore wide enough for this at sin(α The major axis of the ellipse is 2t( t−1 , so to accommodate k log n ellipses, we need t−1 , which holds (after simplification).

Probability of success
Let p be the probability of success for a rectangle. Although the rectangles and ellipses do not overlap, the probability distributions for the rectangles are not independent, for if a pair of rectangles is not a success, then we learn something about the point set: the points either avoid R_1, avoid R_2 or end up in E too often or in E \ (R_1 ∪ R_2). We can therefore not immediately bound the chance that no pair of rectangles in a cone succeeds by (1 − p)^(k log n). If we keep the dependencies in mind, we can however get a bound that is almost as strong. The chance that a point ends up in an area S may be higher than S/n, up to S/(n − |E|·k log n), and the number of points we do not yet know the exact location of may be less than n.
We bound the chance p_e that more than k log(n)(2|E| + |E|) points end up in the union of the ellipses (of a single cone): if we assume this does not happen, there are at least n − k log(n)(2|E| + |E|) ≥ n − 3k|E| log(n) points that we do not yet know the exact location of. We can bound p_e by a binomial distribution with p_C = k log(n)·|E|/n, n_C = n and k_C = k log(n)(2|E| + |E|) and a Chernoff bound: this gives us (after filling in and simplifying) We will now bound the chance p′ that a pair of rectangles is a success assuming that no more than 3k|E| log(n) points end up in the union of the ellipses, and assuming that for any number of the other pairs of rectangles, we are given that they are either a success or not (thus allowing us to use the bound by (1 − p′)^(k log n) later). For a success, we need two points to hit the rectangles (two factors A/n), no other points hit the rectangle (a factor (1 − 2|E|/(n − 3k|E| log(n)))^n), with an additional factor (n − 3k|E| log(n))^2 because there are at least that many ways of picking the first two points.
n goes to 0 as n increases, so b ≈ exp(−2|E|).Using that b < 1 we conclude that the chance p that no pair of rectangles in a cone succeeds assuming that no more than 3k|E| log(n) points end up in the union of the ellipses is at most

Conclusion
We now use a union bound to bound the chance that some cone either ends up without successes, or has too many points inside its ellipses.There are m log n cones per point and n points, so this chance is at most ≤ n We wish for the above chance to become ≤ n −c .Noting that log(n) , we will bound both exponents in the above chance by −c − log(2)/ log(n).We assume that n > 906k t 7/2 (t−1) 3/2 , which makes 1 − 3k|E| log(n) n > 1/2.We will use t < n and k < n which follow from n > 906k t 7/2 (t−1) 3/2 , as well as log(n This bound holds by our definition of k.We now turn to the other exponent.
This bound also holds by our definition of k.We note that k = O ce

Algorithms
We first introduce three tools used in the results below. Let c and c_t be as in Theorem 3.1 throughout this section. The first is that we can divide the input into a time: using polar coordinates, the union corresponds to a lower envelope. Since the hyperbolas pairwise intersect at most twice, this envelope has linear complexity and can be computed in O(n log n) time [6,24]. This gives an efficient test of t-paths from s to all other points at least as accurate as local-λ-bridgedness.

Testing t-spanners
The first application of Theorem 3.1 and our tools is a faster algorithm to test if a Euclidean graph is a t-spanner on uniformly distributed point sets. To the best of our knowledge, this leads to the first subquadratic algorithm for this problem on any interesting class of point sets that does not make assumptions on E.

Greedy Spanner
Consider the following algorithm that was introduced by Keil [18]:

E ← ∅
for every pair of distinct points (u, v) in ascending order of |uv|
    do if δ_(V,E)(u, v) > t·|uv|
        then add (u, v) to E
return E
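The greedy construction can be turned into a short runnable sketch: sort all pairs by length and add a pair as an edge only when the current graph has no t-path between its endpoints. This is the straightforward version with one Dijkstra search per pair, intended only to make the definition concrete; the function names are ours, and it is unrelated to the faster algorithms developed in this section.

```python
import heapq
import itertools
import math


def network_distance(adj, source, target):
    """Dijkstra from source; shortest-path length to target (inf if unreachable)."""
    best = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if u == target:
            return d
        if d > best[u]:
            continue  # stale heap entry
        for v, w in adj[u]:
            nd = d + w
            if nd < best.get(v, math.inf):
                best[v] = nd
                heapq.heappush(heap, (nd, v))
    return math.inf


def greedy_spanner(points, t):
    """Return the greedy spanner edges as index pairs, in order of insertion."""
    n = len(points)
    pairs = sorted(
        itertools.combinations(range(n), 2),
        key=lambda e: math.dist(points[e[0]], points[e[1]]),
    )
    adj = [[] for _ in range(n)]
    edges = []
    for u, v in pairs:
        length = math.dist(points[u], points[v])
        # Add the edge only if the graph built so far has no t-path for (u, v).
        if network_distance(adj, u, v) > t * length:
            adj[u].append((v, length))
            adj[v].append((u, length))
            edges.append((u, v))
    return edges
```

On the four corners of a unit square, t = 1.5 yields only the four sides, since each diagonal already has a path of length 2 ≤ 1.5·√2, while t = 1.2 forces both diagonals in as well.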
The graph returned by this algorithm is called the greedy spanner on V for t and it is obviously a t-spanner, but the algorithm has an O(n^3 log n) running time. We make the following observation:
Lemma 4.2. If P is λ-bridged, then the greedy spanner on P does not have edges longer than λ.
Proof. After ensuring t-paths for all (u, v) with |uv| ≤ λ, the algorithm will not add more edges, as all (u, v) with |uv| > λ have t-paths by Lemma 2.1.
We can combine Lemma 4.2 with Theorem 3.1 to quickly compute the greedy spanner on uniform point sets. We first give a preliminary algorithm which we then employ in two greedy spanner algorithms.
Theorem 4.3. For every λ > 0, there is an algorithm that, given a point set P whose points are uniformly distributed in a √n × √n square, computes in O(n log n + nλ^2 log^2 λ) expected time the edges of the greedy spanner on P for t of length at most λ.
Proof. We use the algorithm introduced in [5] (we will not explain that algorithm here), except we keep Lemma 4.2 in mind and use our local Dijkstra instead of a normal Dijkstra and only consider well-separated pairs {A_i, B_i} with min(A_i, B_i) ≤ λ.
Using the analysis in [5] and using that the greedy spanner has degree O(1), we conclude that if m is the number of considered well-separated pairs, the running time of our modified algorithm is For any l ∈ R, a point p can only be in O(1) well-separated pairs of length at most a constant factor higher or lower than l [10, Lemma 4.6.1]. We can therefore partition the well-separated pairs containing p into O(1)-sized sets of similar length. As the minimal lengths per set differ by at least a constant factor, we conclude . This last expression is O(log λ) in expectation on uniform point sets, giving an expected running time of O(n log n + nλ^2 log^2 λ).
Note that we could have adapted the algorithm from [7], but this algorithm sorts all potential edges, resulting in an expected O(n log n λ^2 log λ) running time, which is slower when filling in λ = O(log n).
There is an algorithm that, given a point set P whose points are uniformly distributed in a expected time the greedy spanner on P for t with high probability, where c_t is a constant dependent only on t.

The full distribution sensitive algorithm
The algorithm from Theorem 4.3 is the first phase of our distribution sensitive algorithm. We now present the second and third phases, which ensure that all long edges are also computed.
The second phase gathers path-hyperbolas as described at the start of this section. We then consider the well-separated pairs that did not get considered in the first phase of the algorithm and try to prove for them that they will not produce a greedy spanner edge. For the remaining pairs, we employ the algorithm of [5] in the third phase of our algorithm to find the remaining greedy spanner edges.
If for a point u ∈ A_i, the bounding box of B_i is covered by the union of path-hyperbolas computed for u (testing this takes O(log n) time), then we say u is discounted with respect to {A_i, B_i}. If all u ∈ A_i are discounted, then {A_i, B_i} will not contain a greedy spanner edge and we say {A_i, B_i} is discounted. This can be computed in O(log n · Σ_{i=1}^{m} (|A_i| + |B_i|)) = O(n log^2 log n) expected time by an earlier argument. We then perform the algorithm from [5], with small differences. We ignore pairs that have been discounted in the previous phase, and we do not perform a Dijkstra operation on points which have been discounted with respect to that pair either. By Theorem 3.1, all pairs are discounted with high probability and hence this phase takes constant time in expectation on uniform point sets.
In practice, using a lower λ than predicted by Theorem 3.1 will suffice and be faster. From experiments we observe that λ = log n / log log n / (t − 1)^(1/4) is the 'right' bound for the length of the longest edge in the greedy spanner. Using 1.1·λ, the initial phase nearly always finds all edges, with the second phase usually discounting 99.7% of the pairs and 95% of the points in undiscounted pairs, and with the second phase taking about 20% of the time of the first. Using 1.5·λ, all pairs are typically discounted.
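To get a feel for the size of this empirical threshold, it can be evaluated numerically. The sketch below assumes natural logarithms and reads the expression left to right, that is, (log n / log log n) / (t − 1)^(1/4); both choices are our assumptions, as is the function name.

```python
import math


def empirical_lambda(n, t):
    """Empirical longest-edge estimate: (ln n / ln ln n) / (t - 1)**(1/4)."""
    return math.log(n) / math.log(math.log(n)) / (t - 1) ** 0.25


n = 300_000
lam_t2 = empirical_lambda(n, 2.0)    # just under 5 length units for t = 2
lam_t11 = empirical_lambda(n, 1.1)   # larger, since (t - 1)**(1/4) shrinks
first_phase_radius = 1.1 * lam_t2    # the 1.1 * lambda setting mentioned above
```

Lengths are measured in the √n × √n scale, where the point density is one per unit area, so for t = 2 a threshold of about 5 means each point only needs to consider on the order of a hundred neighbors in the first phase.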

Experimental results
We have run our algorithm and WSPD-Greedy from [5] on point sets whose size ranged from 500 to 128,000 points. This linear-space algorithm has a running time comparable to other algorithms. All other published algorithms use quadratic space, and therefore running them on more than 10,000 points quickly becomes infeasible, so we decided not to include them in our experiments. For a detailed comparison between the major quadratic space algorithms and WSPD-Greedy we refer to [5].
Throughout this section we will refer to our algorithm as "Bucketing" in the graphs. We generated point sets according to several distributions. We have recorded space usage and running time (wall clock time). The results are averages over several runs where new point sets were generated for each run. We included graphs for the uniform point set and for a clustered point set as these represent the best and worst cases respectively for our algorithm (with respect to our set of tests). We give some numbers from our experiments in text form. To generate the clustered point set we used the same method as [5], that is, for n points, it consists of √n uniformly distributed point sets of √n uniformly distributed points.
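A clustered point set of this shape can be generated along the following lines. This is only a sketch of one plausible reading: √n cluster origins are drawn uniformly in the √n × √n square, and each cluster holds √n points drawn uniformly in a small square of side `cluster_side`. The cluster extent, the function name and the use of Python's `random.Random` are all assumptions; the exact parameters of the method of [5] are not stated here.

```python
import math
import random


def clustered_points(n, rng, cluster_side=1.0):
    """Generate isqrt(n) clusters of isqrt(n) points each.

    cluster_side is a hypothetical parameter: the side length of the small
    square in which the points of a single cluster are drawn uniformly.
    """
    k = math.isqrt(n)       # number of clusters and points per cluster
    side = math.sqrt(n)     # side length of the surrounding square
    points = []
    for _ in range(k):
        ox, oy = rng.uniform(0, side), rng.uniform(0, side)  # cluster origin
        for _ in range(k):
            points.append((ox + rng.uniform(0, cluster_side),
                           oy + rng.uniform(0, cluster_side)))
    return points


pts = clustered_points(10_000, random.Random(7))
```

Point sets like this are a hard case for a locality-based method, because points within a cluster are very close together while the edges connecting different clusters are long.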

Environment
The algorithms have been implemented in C++. The random generator used was the Mersenne Twister PRNG; we used a C++ port by J. Bedaux of the C code by the designers of the algorithm, M. Matsumoto and T. Nishimura. We have implemented all other necessary data structures and algorithms not already in the C++ standard library ourselves. The implementations do not use parallelism and run on a single thread.
Our experiments have been run on a server using an Intel Xeon E5530 CPU (2.40GHz) and 8GB (1600 MHz) RAM. It runs the Debian 7 OS and we compiled for 64 bits using G++ 4.7.2 with the -O3 option.

Dependence on instance size
We have compared running time and space usage of WSPD-Greedy and our algorithm for different values of n. We plotted the results using t = 2 on both uniform (Fig. 6) and clustered points (Fig. 7).
The space usage of our algorithm appears to be a constant factor less than that of WSPD-Greedy. Its running time on uniformly distributed points is (nearly) linear, making it a massive improvement over WSPD-Greedy. This allows us to calculate greedy spanners on such point sets in a matter of minutes where WSPD-Greedy would need hours or even days for bigger instances.
The clustered point set is a bad case for our algorithm since the greedy spanner will contain a considerable number of very long edges between clusters. Nevertheless, the algorithm still outperforms WSPD-Greedy by quite a margin. Our experiments on clustered data with smaller t values (down to t = 1.1 as plotted in Figure 8) show that the performance of the algorithms gets more similar as t decreases. On point sets drawn using a uniform or normal distribution our algorithm massively outperforms WSPD-Greedy for both small (Figures 9 and 10) and large t (Figures 6 and 11).

Real data
Aside from generated instances we also experimented on some real point sets from the TSPLIB. The performance of our algorithm on these real data sets seems to be close to that on the uniform point sets. Figure 12 shows two point sets and their greedy spanners. For the PCB instance the computation using our algorithm took on average about 2 seconds for t = 2 and 11 seconds for t = 1.1. The same computations using WSPD-Greedy took 12 and 203 seconds respectively. The bigger Germany instance took 21 and 147 seconds to compute using our algorithm while WSPD-Greedy needed 274 and 7486 seconds for t = 2 and t = 1.1. This factor-50 improvement for the low-t case reduces the computation time from hours to minutes.

Conclusion
We have introduced a distribution-sensitive algorithm for computing the greedy spanner. Experiments show large improvements in both time and space for most data sets, while results are never worse than the state-of-the-art. In many cases the performance gap becomes even larger for lower t. To explain these results, we have analyzed the algorithm on uniformly distributed point sets.
To this end, we have introduced the concept of bridgedness and have shown that point sets that are uniformly distributed in a √n × √n square are O(log n)-bridged with high probability. This implies that 't-spannerness' is a 'local' property on these point sets: a Euclidean graph is a t-spanner if and only if all pairs of 'close-by' points have t-paths. This locality shows that our algorithm is near-linear on these point sets and yields a near-linear time algorithm for testing whether an edge set is a t-spanner on these point sets. We leave open several questions that may be answered in future work. First, in our experiments, we have observed that the length of the longest edge of the greedy spanner on uniform point sets tends towards log n / log log n / (t − 1)^(1/4), leaving a gap with our upper bound; similarly, our bridgedness bound may also be improvable. Second, it would be interesting to find an algorithm that is faster than ours on uniform point sets. Third, it would be interesting to see if our results generalize to higher dimensions. Lastly, there is still no general subquadratic time algorithm for the greedy spanner. Our algorithm could be considered a divide and conquer algorithm in which the conquer step may be very slow and may be open to improvement.

Fig. 4: Covering the plane with cones.

Every rectangle has width w = 1 and height h = 1, and R_1 and R_2 are placed s = h+2 √ 2 t+1 t−1 apart. The rectangles are aligned with the bisector of the cones. Neighboring pairs of rectangles are placed s 2 apart. Let A = w · h = 1 be the area of R_1 and R_2. We surround the rectangles by an ellipse E with foci d and e and eccentricity t as follows. The centers of R_1 and R_2 lie on de, and d and e are placed at a distance X = (h + 2w + th)/(2(t − 1)) = (3 + t)/(2t − 2) from R_1 and R_2 respectively. We now note that if a ∈ R_1 and b ∈ R_2, then any point c lying in the ellipse with foci a and b and eccentricity t also lies in E, as follows: from |ac| + |cb| ≤ t|ab| ≤ t(h + 2w + s) we conclude |dc| + |ce| ≤ |da| + |ac| + |cb| + |be| ≤ 2X + h + 2w + t(h + 2w + s) = t(2X + 2w + s) = t|de|, and so c ∈ E. Properties of ellipses and algebraic simplification give us

t 2 t− 1
log n and the theorem follows.

Theorem 4.5. There is an algorithm that, given t and a point set P whose points are uniformly distributed in a √n × √n square, computes its greedy spanner in O((n + |E|)(c_t log n)^2 log^2(c_t log n)) expected time, where c_t is a constant dependent only on t. The algorithm uses O(n^2 log^2 n) time on arbitrary P.
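For reference, the object computed in this theorem can be described by the classical greedy procedure: consider all pairs of points in order of increasing distance and add an edge only when the current graph has no t-path between its endpoints. The following Python code is a minimal sketch of that textbook definition, not of the algorithm of this paper; the names `graph_distance` and `naive_greedy_spanner` are hypothetical, and the sketch examines all pairs, so it is far slower than the bounds stated above.

```python
import heapq
import math
from itertools import combinations


def graph_distance(adj, source, target, limit):
    """Dijkstra from source, pruned at `limit`.

    Returns the graph distance to target if it is at most `limit`,
    and math.inf otherwise. (Hypothetical helper for illustration.)
    """
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if u == target:
            return d
        if d > dist.get(u, math.inf):
            continue  # stale heap entry
        for v, w in adj[u]:
            nd = d + w
            # Paths longer than the limit can never certify a t-path.
            if nd <= limit and nd < dist.get(v, math.inf):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return math.inf


def naive_greedy_spanner(points, t):
    """Textbook greedy t-spanner on a list of 2D points (illustrative only)."""
    n = len(points)
    adj = [[] for _ in range(n)]
    edges = []
    # Process all pairs by increasing Euclidean distance.
    pairs = sorted(
        combinations(range(n), 2),
        key=lambda p: math.dist(points[p[0]], points[p[1]]),
    )
    for u, v in pairs:
        d = math.dist(points[u], points[v])
        # Add the edge only if no t-path exists between u and v yet.
        if graph_distance(adj, u, v, t * d) > t * d:
            adj[u].append((v, d))
            adj[v].append((u, d))
            edges.append((u, v))
    return edges
```

On the four corners of a unit square with t = 2, this sketch keeps the four sides and rejects both diagonals, since each diagonal already has a path of length 2 ≤ 2√2.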

Fig. 6: The left plot shows the running time of our algorithm (Bucketing) and WSPD-Greedy for t = 2 on variously sized uniformly distributed instances. The right plot shows the memory usage on the same data.

Fig. 12: Real point sets from the TSPLIB and their greedy spanners using t = 2. Left: A PCB instance of 3,038 points. Right: Cities in Germany, 15,112 points.
Theorem 4.1. There is an algorithm that, given a point set P whose points are uniformly distributed in a √n × √n square and a Euclidean graph E on P, checks if E is a t-spanner using O((n + |E|)(c_t log n)^2 log(c_t log n)) expected time, where c_t is a constant dependent only on t.
Proof. Applying our three tools with λ = c · c_t log n almost immediately gives us the desired result: we run a local Dijkstra for every point, maintaining the union of the path-hyperbolas. If we find any pair of points without a t-path, we return that the input is not a t-spanner. If the union of path-hyperbolas for some point s does not cover the area more than λ away from s, we perform an O(n^2 log n) test for t-spannerness; otherwise we return that the input is a t-spanner, which happens with high probability by Theorem 3.1. This algorithm therefore uses O((n + |E|)(c_t log n)^2 log(c_t log n)) expected time.
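The fallback test mentioned in the proof can be illustrated with an exhaustive check: run Dijkstra from every point and compare each graph distance with t times the Euclidean distance. The Python sketch below is one plausible way to do this and is not taken from the paper; the names `shortest_paths` and `is_t_spanner` and the tolerance `eps` are assumptions introduced for illustration. It does not implement the local Dijkstra or the path-hyperbola bookkeeping. With |E| = O(n), its n Dijkstra runs are consistent with the O(n^2 log n) cost quoted for the fallback.

```python
import heapq
import math


def shortest_paths(adj, source):
    """Standard Dijkstra; returns a list of graph distances from source."""
    dist = [math.inf] * len(adj)
    dist[source] = 0.0
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # stale heap entry
        for v, w in adj[u]:
            if d + w < dist[v]:
                dist[v] = d + w
                heapq.heappush(heap, (dist[v], v))
    return dist


def is_t_spanner(points, edges, t, eps=1e-12):
    """Exhaustive t-spanner test (illustrative; `eps` is an assumed
    relative tolerance for floating-point rounding)."""
    n = len(points)
    adj = [[] for _ in range(n)]
    for u, v in edges:
        w = math.dist(points[u], points[v])  # Euclidean edge weight
        adj[u].append((v, w))
        adj[v].append((u, w))
    for s in range(n):
        dist = shortest_paths(adj, s)
        for v in range(s + 1, n):
            # Every pair needs a path of length at most t * |sv|.
            if dist[v] > t * math.dist(points[s], points[v]) * (1 + eps):
                return False
    return True
```

For example, the four sides of a unit square form a 2-spanner of its corners but not a 1.1-spanner, because each diagonal pair is connected only by a path of length 2 while their distance is √2.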