# Uniformity of Point Samples in Metric Spaces Using Gap Ratio

• Arijit Bishnu
• Sameer Desai
• Arijit Ghosh
• Mayank Goswami
• Subhabrata Paul
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 9076)

## Abstract

Teramoto et al. [22] defined a new measure called the gap ratio that measures the uniformity of a finite point set sampled from $$\mathcal S$$, a bounded subset of $$\mathbb {R}^2$$. We attempt to generalize the definition of this measure over all metric spaces. We solve optimization related questions about selecting uniform point samples from metric spaces; the uniformity is measured using gap ratio. We give lower bounds for specific metric spaces, prove hardness and approximation hardness results. We also give a general approximation algorithm framework giving different approximation ratios for different metric spaces and give a $$\left( 1+\epsilon \right)$$-approximation algorithm for a set of points in a Euclidean space.

### Keywords

Discrepancy, Metric space, Hardness, Approximation

## 1 Introduction

Generating uniformly distributed points over a specific domain has applications in digital halftoning; see [1, 22, 24] and the references therein, numerical integration [10, 17], computer graphics [10], etc. Meshing also requires uniform distribution of points over a region of interest [5]. There are different measures of uniformity of points that we discuss below.

One such notion is the discrepancy [10, 17] of a point set. For a formalization of this notion, an interested reader is referred to [10, 17]. Let $$\left| P \right| =n$$ and $$\text{ vol }(B)$$ denote the area of $$B$$. The expected number of points that would lie inside $$B$$ if $$P$$ is distributed uniformly and independently at random is $$n \cdot \text{ vol }(B)$$. Let $$D(P,B)$$ denote the deviation of $$P$$ from uniform distribution inside a particular $$B$$, i.e. $$D(P,B) = n \cdot \text{ vol }(B) - \left| P \cap B \right|$$. Let $$\mathcal R$$ denote the set of all shapes similar to $$B$$. The quantity $$D(P, \mathcal{R}) = \sup _{R \in \mathcal{R}}\left| D(P,R) \right|$$ is the discrepancy of $$P$$ for shapes similar to $$B$$. The function $$D(n,\mathcal{R}) = \inf _{P \subset S \, \& \,\left| P \right| =n} D(P,\mathcal{R})$$ captures the notion of the least possible discrepancy of any point set sized $$n$$. To compute uniformity using the above measure, the quantity $$D(n,\mathcal{R})$$ is to be computed for all possible scales and positions of $$B$$.
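The deviation $$D(P,B)$$ for a single shape can be illustrated with a small sketch. It assumes, purely for illustration, that $$\mathcal S$$ is the unit square and that $$B$$ is an axis-parallel rectangle; the function name `rectangle_discrepancy` and the sample points are hypothetical and not part of the original text.

```python
def rectangle_discrepancy(points, rect):
    """D(P, B) = n * vol(B) - |P ∩ B| for an axis-parallel rectangle B.

    Assumption: the domain is the unit square, so vol(B) is the plain area of B.
    """
    (x0, y0), (x1, y1) = rect
    area = (x1 - x0) * (y1 - y0)
    # Count the points of P that fall inside (or on the boundary of) B.
    inside = sum(1 for x, y in points if x0 <= x <= x1 and y0 <= y <= y1)
    return len(points) * area - inside


# Hypothetical sample: a 2 x 2 lattice inside the unit square.
lattice = [(0.25, 0.25), (0.75, 0.25), (0.25, 0.75), (0.75, 0.75)]
quarter = rectangle_discrepancy(lattice, ((0.0, 0.0), (0.5, 0.5)))
centre = rectangle_discrepancy(lattice, ((0.2, 0.2), (0.8, 0.8)))
```

The quarter square holds exactly the expected single point, while the central rectangle captures all four points although only about 1.44 are expected. The discrepancy of $$P$$ is the supremum of such deviations over all admissible shapes, which is what makes the measure expensive to evaluate.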

Another notion of uniformity has been captured by the idea of maximizing the minimum distance among points inside $$\mathcal S$$. This is equivalent to packing equal radius circles inside $$\mathcal S$$ [11, 18, 19, 20]. Packing equal radius circles has remained a difficult problem [16]. This measure does not take into effect large empty areas inside $$\mathcal S$$.

One can observe that both of the above measures are hard to compute. Motivated by problems in digital halftoning, Teramoto et al. [22] defined a new measure of uniformity called the gap ratio that measures uniformity in $$\mathbb {R}^2$$. The basic notion of this uniformity measure is a ratio between the maximum and minimum gaps among points. The minimum gap is the distance between the closest pair of points of $$P$$. The maximum gap is the radius of the maximum empty circles among points in $$P$$ and is linked to the Voronoi diagram [7] of $$P$$.

Definition of Gap Ratio. Teramoto et al. [22], who introduced the problem motivated by combinatorial approaches and applications in digital halftoning [1, 3, 4, 21], were interested in the online version of the gap ratio problem. We generalise their definition as follows.

### Definition 1

Let $$(\mathcal M, \delta )$$ be a metric space and $$P$$ be a set of $$k$$ points sampled from $$\mathcal M$$. Define the minimum gap as $$r_P :=\text{ min }_{p,q \in P,\,p \not = q} \delta (p,q)/2$$. The maximum gap brings into play the interrelation between the metric space $$\mathcal M$$ and $$P(\subset \mathcal{M})$$, the set sampled from $$\mathcal M$$, and is defined as $$R_P :=\sup _{q \in \mathcal{M}} \delta (q,P)$$, where $$\delta (q,P) :=\text{ min }_{p \in P} \delta (q,p)$$. The gap ratio for the point set $$P$$ is defined as $$GR_P :=R_P/r_P$$. In the rest of the paper, we would mostly not use the subscript $$P$$.

Gap ratio need not be greater than $$1$$. See the example in [8]. In a geometric sense, the maximum gap is analogous to the covering radius of $$P$$, and the minimum gap is analogous to the packing radius of $$P$$. In a uniformly distributed point set, we expect the covering to be thin and the packing to be tight. Thus the gap ratio can be a good measure of estimating uniformity of point samples.
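For a finite metric space, Definition 1 can be evaluated directly. The following is a minimal sketch under that assumption (the supremum becomes a maximum); the name `gap_ratio` and the example on a line of integers are illustrative choices, not taken from the paper.

```python
from itertools import combinations


def gap_ratio(space, sample, dist):
    """Return (R_P, r_P, GR_P) for a finite metric space, following Definition 1."""
    sample = list(sample)
    if len(sample) < 2:
        raise ValueError("the minimum gap needs at least two sample points")
    # Minimum gap r_P: half the distance between the closest pair of sample points.
    r = min(dist(p, q) for p, q in combinations(sample, 2)) / 2
    # Maximum gap R_P: the farthest any point of the space is from the sample.
    R = max(min(dist(x, p) for p in sample) for x in space)
    return R, r, R / r


# Hypothetical example: the integers 0..6 with |x - y| as the metric.
R, r, gr = gap_ratio(range(7), [0, 3, 6], lambda x, y: abs(x - y))
```

In this example the covering radius is $$1$$ and the packing radius is $$\frac{3}{2}$$, so the gap ratio is $$\frac{2}{3}$$, a concrete instance of a gap ratio below $$1$$.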

The space $$\mathcal M$$ in Definition 1 can be either continuous or discrete. Using this generalized definition, we can pose the following combinatorial optimization question.

### Definition 2

(The gap ratio problem). Given a metric space $$\left( \mathcal {M}, \delta \right)$$, an integer $$k$$ and a parameter $$g$$, find a set $$P \subset \mathcal M$$ such that $$\vert P \vert = k$$ and $$GR_P\leqslant g$$.

Asano [2] in his work opened this area of research, where he asked discrepancy like questions in a discrete setting. Asano opined that the discrete version of this discrepancy-like problem will make it amenable to ask combinatorial optimization related questions. We initiate this line of study in this paper for different metric spaces. As we would go back and forth between different metric spaces, we summarize the results of the paper in the following table.

| Metric space | | Lower bounds | Hardness | Approximation |
|---|---|---|---|---|
| General | | None | Yes | $$2$$-approx. hard |
| Discrete | Graph (connected) | $$\frac{2}{3}$$ | Yes | Approx. factor: $$3$$; $$\frac{3}{2}$$-approx. hard |
| Discrete | Euclidean | - | - | $$\left( 1+\epsilon \right)$$-algorithm |
| Continuous | Path-connected | $$1$$ | Yes | Approx. factor: $$2$$ |
| Continuous | Unit square in $$\mathbb {R}^{2}$$ | $$\frac{2}{\sqrt{3}}-o(1)$$ | - | Approx. factor: $$\sqrt{3}+o(1)$$ |

Previous Results. Teramoto et al. [22] proved a lower bound of $$2^{{\left\lfloor k/2 \right\rfloor } / {(\left\lfloor k/2 \right\rfloor +1)}}$$ for the gap ratio in the one dimensional case where $$k$$ points are inserted in the interval $$[0,1]$$ and also proposed a linear time algorithm that achieves this bound. In two dimensions they obtained a gap ratio of $$2$$ using Voronoi insertion, where each new point is inserted at the centre of a maximum empty circle [7]. They also proposed a local search based heuristic for the problem and provided experimental results in support.

Asano [2] discretized the problem and showed a gap ratio of at most 2 where $$k$$ integral points are inserted in the interval $$[0,n]$$ where $$n$$ is also a positive integer and $$0 < k < n$$. He also showed that such a point sequence may not always exist, but a tight upper bound on the length of the sequence for given values of $$k$$ and $$n$$ can be proved.

Zhang et al. [24] focused on the discrete version of the problem and proposed an insertion strategy that achieved a gap ratio of at most $$2 \sqrt{2}$$ in a bounded two dimensional grid. They also showed that no online algorithm can achieve a gap ratio strictly less than $$2.5$$ for a $$3 \times 3$$ grid.

In Sects. 2 and 3, we deal with continuous and discrete metric spaces respectively, where we give lower bounds, hardness, and approximation results. We show a general approximation hardness result in Sect. 4.

## 2 Continuous Metric Spaces

### 2.1 Lower Bounds

Here we study the lower bounds for the gap ratio in continuous metric spaces. We first point out that there does not exist a general lower bound on gap ratio. See Example 2 in [8], where we consider two disjoint balls as our metric space. However, if the space is path connected we can fix a general lower bound.

### Lemma 3

The lower bound of gap ratio is $$1$$ when $$\mathcal {M}$$ is path connected.

For the proof of the above lemma, see Lemma 3 of [8].

Next we consider the metric space, $$\left[ 0,1 \right] ^2 \subset \mathbb {R}^2$$ as in Teramoto et al.’s problem [22]. To prove the lower bound on gap ratio, we would want to increase $$r$$ and reduce $$R$$, as much as possible. To this end we need the definition of packing and covering densities, which can be found in [15, 23].

### Lemma 4

The lower bound for gap ratio is $$\left( \frac{2}{\sqrt{3}}- o\left( 1\right) \right)$$, when $$\mathcal {M} = \left[ 0,1 \right] ^2$$.

### Proof

Let $$2r$$ be the minimum pairwise distance between the points of $$P$$. Consider a circle of radius $$r$$ around each point of $$P$$. This forms a packing of $$k$$ circles of radius $$r$$ in a square of side length $$\left( 1+2r\right)$$. Suppose the density of such a packing is $$d_1$$. Now, we can tile the plane with such squares packed with circles. Thus we have a packing of the plane of density $$d_1$$. It is known that the density of the densest packing of equal circles in a plane is $$\pi / \sqrt{12}$$ [15]. Hence $$d_1=k \pi r^2 / (1+2r)^2 \leqslant \pi / \sqrt{12}$$. Taking square roots gives $$\sqrt{k\sqrt{12}}\, r \leqslant 1+2r$$, and consequently $$r\leqslant \left( \sqrt{k\sqrt{12}}-2 \right) ^{-1}$$.

On the other hand, let $$R=\sup _{x \in \mathcal{M}} \delta (x,P)$$. Clearly, circles of radius $$R$$ around each point of $$P$$ cover $$\mathcal M$$. Suppose the density of such a covering is $$D_1$$. Now, we can tile the plane with this unit square. Thus we have a covering of the plane with density $$D_1$$. It is known that the density of the thinnest covering of the plane by equal circles is $$2\pi / \sqrt{27}$$ [15]. Hence $$D_1=k\pi R^2 / 1 \geqslant 2\pi / \sqrt{27}$$, giving us $$R\geqslant \sqrt{2} / \sqrt{k\sqrt{27}}$$. Therefore the gap ratio is $$\frac{R}{r}\geqslant \left( \sqrt{k\sqrt{12}}-2\right) \sqrt{2} / \sqrt{k\sqrt{27}} = \frac{2}{\sqrt{3}}- o\left( 1\right)$$.    $$\square$$
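The bound in the proof of Lemma 4 can be evaluated numerically to see how quickly it approaches $$\frac{2}{\sqrt{3}}$$. The sketch below simply combines the two inequalities from the proof; the function name `unit_square_lower_bound` is a hypothetical helper, and the restriction to $$k \geqslant 2$$ is an assumption made so that the upper bound on $$r$$ is positive.

```python
import math


def unit_square_lower_bound(k):
    """Lower bound on the gap ratio of k points in the unit square (proof of Lemma 4)."""
    if k < 2:
        raise ValueError("the packing bound on r is only positive for k >= 2")
    # Packing argument: r <= 1 / (sqrt(k * sqrt(12)) - 2).
    r_max = 1 / (math.sqrt(k * math.sqrt(12)) - 2)
    # Covering argument: R >= sqrt(2) / sqrt(k * sqrt(27)).
    R_min = math.sqrt(2) / math.sqrt(k * math.sqrt(27))
    return R_min / r_max


limit = 2 / math.sqrt(3)
bounds = [unit_square_lower_bound(k) for k in (100, 10_000, 10**8)]
```

The values increase with $$k$$ and stay below the limit $$\frac{2}{\sqrt{3}} \approx 1.1547$$, which matches the $$-o(1)$$ term in the lemma.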

Teramoto et al. [22] had obtained a gap ratio of 2 in the online version, whereas, the lower bound for the problem is asymptotically 1.1547.

### 2.2 Hardness

General NP-Hardness. In this section, we show that the gap ratio problem is hard for a continuous metric space. To show this hardness, we reduce from the problem of system of distant representatives in unit disks [12]. We first define the problem.

### Definition 5

($$S\left( q,l\right)$$-$$DR$$) [12]. Given a parameter $$q>0$$ and a family $$\mathcal {F}=\{F_i| i\in I, F_i\subseteq X\}$$ of subsets of $$X$$, a mapping $$f:I\rightarrow X$$ is called a System of $$q$$-Distant Representatives (an $$Sq$$-$$DR$$ for short) if (i) $$f(i)\in F_i$$ for all $$i\in I$$ and (ii) the distance between $$f(i)$$ and $$f(j)$$ is at least $$q$$, for $$i,j\in I$$ and $$i\ne j$$. When the family $$\mathcal {F}$$ is a set of unit diameter disks with centres that are at least $$l$$ distance apart, we denote the mapping by $$S\left( q,l\right)$$-$$DR$$.
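Checking a candidate system of distant representatives is straightforward, even though finding one is hard. The sketch below verifies conditions (i) and (ii) of Definition 5 for unit diameter disks in the plane; the function name, the tolerance parameter `tol`, and the two-disk example are assumptions introduced for illustration.

```python
import math
from itertools import combinations


def is_distant_representatives(centres, reps, q, radius=0.5, tol=1e-12):
    """Check conditions (i) and (ii) of Definition 5 for disks of the given radius."""
    if len(centres) != len(reps):
        return False
    # (i) every representative lies in its own disk.
    inside = all(math.dist(c, p) <= radius + tol for c, p in zip(centres, reps))
    # (ii) every pair of representatives is at least q apart.
    separated = all(math.dist(a, b) >= q - tol for a, b in combinations(reps, 2))
    return inside and separated


# Hypothetical instance: two unit diameter disks whose centres are l = 1.2 apart, q = 2.
centres = [(0.0, 0.0), (1.2, 0.0)]
pushed_apart = is_distant_representatives(centres, [(-0.5, 0.0), (1.7, 0.0)], q=2)
at_centres = is_distant_representatives(centres, centres, q=2)
```

Here $$q > l$$, so the centres themselves fail condition (ii), while moving each representative to the far side of its disk gives a valid system.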

Fiala et al. proved that $$S(1,l)$$-$$DR$$ is NP-hard [12]. For the general version $$S\left( q,l\right)$$-$$DR$$, we give a proof sketch using Fiala et al.’s technique. Note that for $$q\leqslant l$$, the centres of the disks suffice as our representatives. So assume that $$q>l$$. We restate a generalised version of their result below, see the proof of Theorem 10 in [8].

### Theorem 6

$$S\left( q,l\right)$$-$$DR$$ is NP-hard for $$q>l$$ on the Euclidean plane.

Next we show that the above holds even for a constrained version of the problem.

### Lemma 7

$$S\left( q,l\right)$$-$$DR$$-$$1$$ is NP-complete for $$q>l$$, where $$S\left( q,l\right)$$-$$DR$$-$$1$$ denotes $$S\left( q,l\right)$$-$$DR$$ with one representative point constrained to lie on the boundary of one of the disks.

### Proof

Clearly, a solution to $$S\left( q,l\right)$$-$$DR$$-$$1$$ is a solution to $$S\left( q,l\right)$$-$$DR$$. Conversely, a solution of $$S\left( q,l\right)$$-$$DR$$ can be translated until one point hits the boundary to obtain a solution to $$S\left( q,l\right)$$-$$DR$$-$$1$$.

It is easy to see that $$S\left( q,l\right)$$-$$DR$$-$$1$$ is in NP, as any claimed solution can be checked in polynomial time using a Voronoi diagram. Hence, it is NP-complete for $$q>l$$.    $$\square$$

We now use the above result to prove the hardness of the gap ratio problem.

### Theorem 8

Let $$\mathcal M$$ be a continuous metric space and $$q> 2$$. It is NP-hard to find a finite set $$P\subset \mathcal M$$ of cardinality $$k$$ such that $$GR_P\leqslant \frac{2}{q}$$.

Proof. We show that if there is a polynomial algorithm to find a finite set $$P\subset \mathcal M$$ of cardinality $$k$$ such that the gap ratio of $$P$$ is at most $$\frac{2}{q}$$ for some $$q> 2$$, then there is also a polynomial algorithm for $$S\left( q,l\right)$$-$$DR$$-$$1$$.

Consider an instance of $$S\left( q,l\right)$$-$$DR$$-$$1$$, a family $$\mathcal {F}=\left\{ F_1, F_2, \ldots , F_k \right\}$$ of $$k$$ disks of unit diameter such that their centres are at least distance $$l$$ apart, where $$q>l> 2$$ (even with this restriction the proof of Theorem 6 goes through).

We run the algorithm for the gap ratio problem $$k$$ times, each time on a separate instance. The instance for the $$i$$-th iteration consists of the disks $$\left\{ F_j \vert j\ne i \right\}$$ and a circle of unit diameter whose centre is the centre of $$F_i$$. The following claim, whose proof follows later, completes the proof.

### Claim 9

If a single iteration of the above process results “yes”, then we have a solution to the $$S\left( q,l\right)$$-$$DR$$-$$1$$ instance.

Since $$S\left( q,l\right)$$-$$DR$$-$$1$$ is NP-hard, the gap ratio problem must also be NP-hard.    $$\square$$

### Proof of Claim 9

Suppose that the gap ratio of a given point set is at most $$\frac{2}{q}$$ for the $$i$$-th instance. If two points lie within the same disk, then $$r\leqslant \frac{1}{2}$$. Thus for the gap ratio to be at most $$\frac{2}{q}$$ we need $$R\leqslant \frac{2r}{q}\leqslant 1/q<1$$. But considering the number of points that we are choosing, we must have an empty disk, which would contain a point $$x$$ such that $$R\geqslant \delta (x,P)\geqslant l- \frac{1}{2} >1$$, giving us a contradiction. Thus each disk contains exactly one point from $$P$$. Since $$l>2$$ and $$F_i$$ is a circle, $$R=1$$. Thus, we get $$r=\frac{1}{GR}\geqslant \frac{q}{2}$$, so the closest pair is at least a distance $$q$$ apart.    $$\square$$

Path Connected Spaces. Next, we show that it is NP-hard to find $$k$$ points in a path connected space such that $$GR=1$$. To prove this, we start by proving that in a path connected space it is NP-Hard to find $$k$$ points such that $$R=r=\frac{3}{2}$$ by reducing from the efficient dominating set problem. Later we extend the result for all positive real values of $$r$$.

### Theorem 10

It is NP-hard to find a set $$P$$ of $$k$$ points in a path connected space $$\mathcal M$$ such that $$R_P=r_P=\frac{3}{2}$$.

Proof. Let us consider an instance of the efficient domination problem, an undirected graph $$G \left( V,E \right)$$, and a parameter $$k$$. From this graph we form a metric space $$\left( \mathcal {M},\delta \right)$$ as follows. In $$\mathcal {M}$$, each edge of $$E$$ corresponds to a unit length path. We place at each vertex of $$V$$ an $$\epsilon$$-path, where $$0< \epsilon <\frac{1}{4}$$, which is merely an $$\epsilon$$ long curve protruding from the vertex as shown in Fig. 1a. The vertices merely become points on a path formed by consecutive edges as shown in Fig. 1b. If there are edge-crossings, we do not consider the crossing to be an intersection but rather consider it as an embedding in $$\mathbb {R}^3$$. This ensures that different paths only intersect at vertices of the graph (this makes sure that there is direct correspondence between the path lengths in the graph and the path lengths of the metric space). The distance, $$\delta$$, between two points in this space is defined by the length of the shortest curve joining the two points.

We show that finding a set $$P$$ of $$k$$ points in $$\mathcal {M}$$ such that $$R_P=r_P=\frac{3}{2}$$ is equivalent to finding an efficient dominating set of size $$k$$ in $$G$$, using a series of claims.

### Claim 11

Suppose $$D\subset V$$ is an efficient dominating set in $$G$$. Then we have a set $$P\subset \mathcal {M}$$ with $$|D|=|P|$$ such that $$R_P=r_P=\frac{3}{2}$$.

Conversely, given a set $$P'$$ of $$k$$ points in $$\mathcal {M}$$ such that $$R_{P'}=r_{P'}=\frac{3}{2}$$, we want to find an efficient dominating set in $$G$$. If $$P'\subset V$$, then we are done as $$P'$$ is an efficient dominating set in $$G$$ (refer to Claim 21). Otherwise, if $$P'\not \subset V$$, then from $$P'$$ we construct another set $$P \subset V$$ such that $$R_P=r_P=\frac{3}{2}$$. We form $$P$$ by appropriately moving points of $$P'$$ to the points corresponding to $$V$$.

### Claim 12

$$P'\subset V$$ or $$P' \cap V=\emptyset$$.

By Claim 12, if $$P'\not \subset V$$, then $$P'\cap V=\emptyset$$. Note that in this case $$P'$$ cannot have midpoints of the graph edges as between any two midpoints at distance $$3$$ from each other, there is a vertex with an $$\epsilon$$-path which is distance $$\frac{3}{2}$$ from both points. Thus the other end of this $$\epsilon$$-path must be at a distance $$\frac{3}{2} + \epsilon$$ from both points contradicting the fact that $$R_{P'}=\frac{3}{2}$$. Thus each point in $$P'$$ must have a closest vertex. We form the set $$P$$ by moving each point of $$P'$$ to its closest vertex.

### Claim 13

$$R_P=r_P=\frac{3}{2}$$.

For the proofs of Claims 11, 12 and 13 refer to the proofs of Claims 8, 16 and 17 of [8].

By Claim 13, without loss of generality, we can assume that the sampled set is a subset of $$V$$. Using ideas we present in the proof of Claim 21, it is easy to see that, if we can find a set $$P$$ of $$k$$ points in $$\mathcal {M}$$ such that $$R_P=r_P=\frac{3}{2}$$, then we can find an efficient dominating set of $$k$$ vertices in $$G$$.

Hence, it is NP-hard to find a set $$P$$ of $$k$$ points in a path connected space such that $$R_P=r_P=\frac{3}{2}$$.    $$\square$$

In the above reduction, taking the edge lengths to be $$\frac{2x}{3}$$ instead of $$1$$ and $$\frac{2x\epsilon }{3}$$-paths instead of $$\epsilon$$-paths we have that it is NP-hard to find a set of $$k$$ points in a path connected space such that $$R_P=r_P=\frac{3}{2} \times \frac{2x}{3} = x$$. Since this can be done for any positive $$x$$, the following theorem follows as a corollary to Theorem 10.

### Theorem 14

It is NP-hard to find a set of $$k$$ points in a path connected space such that gap ratio is $$1$$.

### 2.3 Approximation Algorithms

In this section we show that Gonzalez’s [14] farthest point insertion method (with a slightly tweaked initiation) for $$k$$-centre clustering gives a constant factor approximation for gap ratio. We will call it Algorithm 1. The following is an outline of the algorithm.

Let $$(\mathcal {M}, \delta )$$ be a metric space of $$n$$ points and $$k$$ be the number of points to be sampled. The first two points chosen are a pair of farthest points. Let $$S_i$$ denote the set of the first $$i$$ points chosen. Then, $$S_{i+1}=S_i\cup \left\{ q_{i+1}\right\}$$ for $$2\leqslant i< k$$, where $$q_{i+1}$$ is a point with $$\delta \left( q_{i+1},S_i\right) =\sup _{q \in \mathcal{M}} \delta (q,S_i)$$.
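The outline of Algorithm 1 can be sketched as follows for a finite metric space, where the supremum becomes a maximum. This is an illustrative sketch rather than the authors' implementation: the function name, the `nearest` bookkeeping dictionary (which requires hashable points), and the example on a line of integers are assumptions.

```python
from itertools import combinations


def farthest_point_insertion(points, k, dist):
    """Pick k points: start from a farthest pair, then repeatedly add a farthest point."""
    points = list(points)
    if not 2 <= k <= len(points):
        raise ValueError("need 2 <= k <= number of points")
    # Tweaked initiation: the first two points form a farthest pair.
    a, b = max(combinations(points, 2), key=lambda pq: dist(*pq))
    sample = [a, b]
    # nearest[x] is the distance from x to the current sample S_i.
    nearest = {x: min(dist(x, a), dist(x, b)) for x in points}
    while len(sample) < k:
        # q_{i+1} realises the maximum gap of the current sample.
        q = max(points, key=nearest.__getitem__)
        sample.append(q)
        for x in points:
            nearest[x] = min(nearest[x], dist(x, q))
    return sample


# Hypothetical example: the integers 0..8 with |x - y| as the metric.
chosen = farthest_point_insertion(range(9), 3, lambda x, y: abs(x - y))
```

On this example the sketch first picks the endpoints $$0$$ and $$8$$ and then the midpoint $$4$$. Finding the farthest pair by brute force costs $$O(n^2)$$ distance evaluations, and each later insertion costs $$O(n)$$.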

We now analyse the algorithm. Without loss of generality, let $$P = \{ p_{1}, \, \dots , \, p_{k}\}$$ be the set with optimal gap ratio, and let $$GR = \alpha$$.

### Lemma 15

In Algorithm 1, $$R_{S_{i}}\leqslant R_{S_{i-1}}$$ for each $$i\in \{2, \ldots , \, k\}$$ and the gap ratio $$GR_{S_{i}}$$ is at most $$2$$ after each iteration.

For the proof of the above lemma, see Lemma 23 of [8]. The main theorem of this section is as follows.

### Theorem 16

Farthest point insertion gives the following approximation guarantees: $$(i)$$ if $$\alpha \geqslant 1$$, then the approximation ratio is $$\frac{2}{\alpha } \leqslant 2$$, $$(ii)$$ if $$\frac{2}{3} \leqslant \alpha < 1$$, the approximation ratio is $$\frac{2}{\alpha } \leqslant 3$$, and $$(iii)$$ if $$\alpha < \frac{2}{3}$$, the approximation ratio is $$\frac{4}{2-\alpha } < 3$$.

Proof. Case $$(i)$$ and $$(ii)$$ follow directly from Lemma 15. We deal with Case $$(iii)$$. Let us define closed balls centred at $$p_{i}$$’s as follows: $$B_{i} = \{x \in \mathcal {M}:\; \delta (p_{i},x)\leqslant r_{P} \}$$ and $$B'_{i} = \{x \in \mathcal {M}:\; \delta (p_{i},x) \leqslant \alpha r_{P} \}$$. We need the following claim.

### Claim 17

For all $$i \in \{2, \, \dots , \, k\}$$, $$2r_{S_{i}} \geqslant (2-\alpha ) r_{P}$$.

### Proof

Note that the balls $$B'_{j}$$ cover the whole of $$\mathcal M$$, since $$\alpha r_{P} = R_{P}$$. The case of $$i = 2$$ follows from the fact that $$2r_{S_2}=diam\left( \mathcal M \right)$$. Assume the result is true for some $$i \geqslant 2$$. We will show it is true for $$S_{i+1}$$, if $$i \leqslant k-1$$, by contradiction. Suppose $$q_{i+1}$$ falls into a ball $$B'_{j}$$ that contains $$q_{t}$$, for some $$t \leqslant i$$. This would imply $$2 r_{S_{i+1}} \leqslant \delta (q_{t}, q_{i+1}) \leqslant 2 \alpha r_{P}$$. Note that as $$\alpha < 2/3$$, we have $$2 \alpha r_{P} < (2-\alpha ) r_{P}$$. But since $$i\leqslant k-1$$, there exists $$p_{t'}$$ such that $$B'_{t'}$$ is empty. That implies we could have selected $$p_{t'}$$ instead of $$q_{i+1}$$ to get $$2r_{S_{i+1}} = \min \{ 2r_{S_{i}}, \delta (p_{t'}, S_{i})\} \geqslant (2-\alpha ) r_{P}$$. Note that the last inequality follows from the fact that $$2r_{S_{i}} \geqslant (2-\alpha )r_{P}$$ (by induction) and $$\delta (p_{t'}, S_{i}) \geqslant (2-\alpha ) r_{P}$$.

Now that we know $$q_{i+1}$$ falls into a separate ball $$B'_{j}$$, it is easy to see that $$2r_{S_{i+1}} \geqslant \min \{ 2r_{S_{i}}, \delta (p_{j}, S_{i})\} \geqslant (2-\alpha ) r_{P}$$.    $$\square$$

From the proof of Claim 17, we have for all $$j \in \{ 1, \, \dots , \, k\}$$, $$|B'_{j}\cap S_{k}| = 1$$. Thus we have $$R_{S_{k}} \leqslant 2\alpha r_{P}$$, since $$B'_{j}$$ cover $$\mathcal {M}$$. Combining this with the fact that $$2r_{S_{k}} \geqslant (2-\alpha ) r_{P}$$ (Claim 17), we have $$GR_{S_{k}} \leqslant \frac{4\alpha }{2-\alpha }$$ and consequently $$\frac{GR_{S_{k}}}{GR_{P}} \leqslant \frac{4}{2-\alpha } < 3$$.    $$\square$$

From the results in Sect. 2.1, we have the following corollary to Theorem 16.

### Corollary 18

The approximation algorithm gives an approximation ratio of (i) $$2$$ when the metric space is continuous, compact and path connected, and (ii) $$\rho \left( k \right)$$, when the metric space is restricted to a unit square in the Euclidean plane, where $$\rho \left( k\right) =\frac{\root 4 \of {27}\sqrt{k}}{\root 4 \of {3} \sqrt{k} - \sqrt{2}}= \sqrt{3}+o(1)$$.
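One way to read the expression for $$\rho \left( k\right)$$, assuming it is obtained by dividing the bound $$GR_{S_k}\leqslant 2$$ of Lemma 15 by the lower bound from the proof of Lemma 4, is the following short derivation. It uses $$\root 4 \of {12} = \sqrt{2}\,\root 4 \of {3}$$.

$$\rho \left( k\right) = \frac{2\sqrt{k\sqrt{27}}}{\sqrt{2}\left( \sqrt{k\sqrt{12}}-2\right) } = \frac{\sqrt{2}\,\root 4 \of {27}\sqrt{k}}{\sqrt{2}\,\root 4 \of {3}\sqrt{k}-2} = \frac{\root 4 \of {27}\sqrt{k}}{\root 4 \of {3}\sqrt{k}-\sqrt{2}}$$

As $$k \rightarrow \infty$$ the constant $$\sqrt{2}$$ in the denominator becomes negligible, so $$\rho \left( k\right) \rightarrow \root 4 \of {27/3} = \root 4 \of {9} = \sqrt{3}$$.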

## 3 Discrete Metric Space

### 3.1 Graph

Lower Bounds. Here we study the lower bounds for the gap ratio problem in discrete metric spaces. Again we point out that there does not exist a general lower bound for gap ratio, in discrete spaces as well. See Example 1 in [8] for details.

Next we study the lower bound of gap ratio on a metric space $$\mathcal M$$ which is the vertex set $$V$$ of an undirected connected graph $$G=(V,E)$$. The distance between a pair of vertices is the length of the shortest path between them.

### Lemma 19

Gap ratio has a lower bound of $$\frac{2}{3}$$ when the metric space $$\mathcal M$$ is a connected undirected graph. The bound is achieved only when $$R=1$$ and $$r=\frac{3}{2}$$.

### Proof

Suppose a set of vertices $$P\subset \mathcal M$$ is sampled. Let a closest pair of vertices in $$P$$ be distance $$q$$ apart. Thus $$r=\frac{q}{2}$$. Now between these two vertices, there is a path of $$q-1$$ vertices in $$\mathcal {M}{\setminus }P$$. Among these $$q-1$$ vertices, the vertex farthest from $$P$$ is at a distance $$\left\lfloor \frac{q}{2} \right\rfloor$$ from $$P$$. Thus $$R\geqslant \left\lfloor \frac{q}{2} \right\rfloor$$ and $$GR = \frac{R}{r}\geqslant \frac{2}{q} \left\lfloor \frac{q}{2} \right\rfloor$$. Note that, when $$q=1$$, we clearly have a gap ratio greater than or equal to $$2$$. Now, we analyse this expression for even and odd values of $$q$$. If $$q$$ is even, $$GR\geqslant \frac{2}{q} \left\lfloor \frac{q}{2} \right\rfloor = \frac{2}{q} \frac{q}{2}=1$$ and if $$q$$ is odd and $$q\geqslant 3$$, $$GR\geqslant \frac{q-1}{q}$$. Since this function is monotonically increasing in $$q$$, $$GR \geqslant \frac{2}{3}$$, and equality occurs only for $$q=3$$.

Thus, the gap ratio $$GR=\frac{2}{3}$$ implies $$q=3$$, which means $$r=\frac{3}{2}$$. Therefore, $$R=GR\times r= 1$$. Hence, $$GR=\frac{2}{3}$$ only when $$R=1$$ and $$r=\frac{3}{2}$$.    $$\square$$

Hardness. In this section, we show that the problem of finding minimum gap ratio is NP-complete even for graph metric space. To this end, we need the concept of a variation of domination problem, called efficient domination problem. A subset $$D\subseteq V$$ is called an efficient dominating set of $$G=(V,E)$$ if $$|N_G[v]\cap D|=1$$ for every $$v\in V$$, where $$N_G[v]=\{v\}\cup \{x|vx\in E\}$$. An efficient dominating set is also known as independent perfect dominating set [6]. Given a graph $$G=(V,E)$$ and a positive integer $$k$$, the efficient domination problem is to find an efficient dominating set of cardinality at most $$k$$. The efficient domination problem is known to be NP-complete [9].

### Theorem 20

In graph metric space, gap ratio problem is NP-complete.

Proof. First note that, the gap ratio problem in graph metric space is in NP. To prove the hardness, we use a reduction from efficient domination problem, to the gap ratio problem. Given an instance of efficient domination problem $$G=(V, E)$$ and $$k$$, set $$\mathcal {M} = V$$ as the metric space and the shortest path distance between two vertices as the metric $$\delta$$.

### Claim 21

$$G=(V,E)$$ has an efficient dominating set of cardinality $$k$$ if and only if there exists a sampled set $$P$$ of $$k$$ points (vertices) whose gap ratio is $$2/3$$.

See Claim 8 of [8] for the proof of this claim. Thus the gap ratio problem is NP-complete for graph metric space.    $$\square$$
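The correspondence in Claim 21 can be checked on a small instance. The sketch below assumes a connected graph given as an adjacency dictionary; the helper names and the path on six vertices are hypothetical examples, not taken from the paper.

```python
from collections import deque
from itertools import combinations


def bfs_distances(adj, source):
    """Shortest path distances from source in an unweighted graph."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def is_efficient_dominating(adj, D):
    """Every closed neighbourhood N[v] must contain exactly one vertex of D."""
    D = set(D)
    return all(len(({v} | set(adj[v])) & D) == 1 for v in adj)


def graph_gap_ratio(adj, P):
    """Gap ratio of P in the shortest path metric (assumes a connected graph)."""
    d = {p: bfs_distances(adj, p) for p in P}
    r = min(d[p][q] for p, q in combinations(P, 2)) / 2
    R = max(min(d[p][v] for p in P) for v in adj)
    return R / r


# Hypothetical instance: the path 0 - 1 - 2 - 3 - 4 - 5.
path = {i: [j for j in (i - 1, i + 1) if 0 <= j <= 5] for i in range(6)}
good = (is_efficient_dominating(path, [1, 4]), graph_gap_ratio(path, [1, 4]))
bad = (is_efficient_dominating(path, [0, 3]), graph_gap_ratio(path, [0, 3]))
```

The set $$\{1,4\}$$ is an efficient dominating set and attains the gap ratio $$\frac{2}{3}$$ of Lemma 19, whereas $$\{0,3\}$$ leaves vertex $$5$$ undominated and has gap ratio $$\frac{4}{3}$$.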

Approximation Hardness. Here we use the hardness of path connected space from Sect. 2.2 to show that the gap-ratio problem is APX-hard on the graph metric.

### Theorem 22

In an unweighted graph, it is NP-hard to approximate the gap ratio better than a factor of $$\frac{3}{2}$$.

Proof. In Sect. 2.2, we reduced the problem of finding a set of $$k$$ points in a graph such that the gap ratio is $$\frac{2}{3}$$ to the problem of finding a set of $$k$$ points in a path-connected space such that the gap ratio is $$1$$. We use this hardness of gap ratio being $$1$$ on instances similar to the one created in the reduction to prove $$\frac{3}{2}$$ approximation hardness on graphs.

Our starting instance is a space formed by joining integer length curves at their ends (so that points that divide these curves into unit length curves form a connected graph with the unit length curves as edges). Also for some $$0 < \epsilon < \frac{1}{4}$$ we join curves of length $$\epsilon$$ (at one end) at points such that the integer length curves are divided into unit length curves. Let us call this path connected space $$\mathcal M$$. Note that $$\mathcal M$$ is similar to the path connected space formed in Sect. 2.2, but, the general shape of the space may vary. The reduction is illustrated in Fig. 2. The metric on this space is defined by the length of the shortest path between pairs of points. We form the graph $$G= \left( V, E \right)$$ by putting vertices at the place where the $$\epsilon$$-length curves are joined to the integer length curves. The $$\epsilon$$-length protrusions are discarded and the unit length curves between the vertices form the edge set.

### Claim 23

There exists a polynomial time algorithm to find $$P \subset \mathcal M$$ such that $$\vert P \vert = k$$ and $$R_P = r_P = \frac{2t+1}{2}$$ for some $$t \in \left\{ 1,2,\ldots \right\}$$ if and only if there exists a polynomial time algorithm to find a set of $$k$$ vertices in $$G$$ such that the gap ratio of the set is strictly less than $$1$$.

See Theorem 22 of [8] for the proof of the above claim.

This gives us that it is NP-hard to find a set with gap ratio less than $$1$$ in graphs, i.e. it is NP-hard to find an algorithm which approximates gap ratio within a factor better than $$\frac{3}{2}$$.

Note here that if we could have proven Claim 23 for $$\vert P \vert = k$$ and $$R_P = r_P = \frac{t}{2}$$ for some $$t \in \left\{ 2,3,\ldots \right\}$$, then we would not need to say strictly less than $$1$$ in the statement.    $$\square$$

Approximation Algorithm. We start by pointing out that Algorithm 1 and Theorem 16 hold in a discrete metric space as well. Thus, as a consequence of Lemma 19, we have the following corollary to Theorem 16.

### Corollary 24

The approximation algorithm gives an approximation ratio of $$3$$ when the metric space is restricted to graph metric space.

### 3.2 Euclidean Space

Next we discuss a $$\left( 1+\epsilon \right)$$- approximation algorithm when the space $$\mathcal M$$ is a set of $$n$$ points in a Euclidean space. We will call it Algorithm 2.

Suppose, $$\mathcal M$$ is a set of $$n$$ points in $$\mathbb {R}^d$$ and the metric $$\delta$$ on $$\mathcal M$$ is the Euclidean metric on $$\mathbb {R}^d$$. We propose Algorithm 2, and prove that it gives a gap ratio within $$\left( 1 + \epsilon \right)$$ factor of the minimum gap ratio, where $$\epsilon \in \left( 0, \frac{1}{2} \right)$$ and $$\epsilon _1 :=\frac{\epsilon }{\left( 3+2\epsilon \right) }$$. The algorithm is as follows.

Obtain a set $$P_1\subset \mathcal M$$ of $$k$$ points by the farthest point method. Create a grid of side-length $$\epsilon _2= \frac{\epsilon _1 R_{P_1}}{2\sqrt{d}}$$. Get a set $$S$$ by choosing $$1$$ point of $$\mathcal M$$ from each grid cell. Of all the $$O\left( \vert S \vert ^k \right)$$ subsets of $$S$$ choose the one with the lowest gap ratio.
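The four steps of Algorithm 2 can be sketched as below. This is a minimal illustration under stated assumptions, not the authors' implementation: the points are assumed to be distinct tuples in $$\mathbb {R}^d$$ with $$2 \leqslant k \leqslant n$$, the grid is anchored at the origin, the representative of a cell is simply the first point seen in it, and all function names are hypothetical.

```python
import math
from itertools import combinations


def gap_ratio(space, sample):
    """Gap ratio of a sample of distinct points under the Euclidean metric."""
    r = min(math.dist(p, q) for p, q in combinations(sample, 2)) / 2
    R = max(min(math.dist(x, p) for p in sample) for x in space)
    return R / r


def farthest_points(space, k):
    """Farthest point method started from a farthest pair."""
    a, b = max(combinations(space, 2), key=lambda pq: math.dist(*pq))
    chosen = [a, b]
    while len(chosen) < k:
        chosen.append(max(space, key=lambda x: min(math.dist(x, c) for c in chosen)))
    return chosen


def algorithm2_sketch(space, k, eps):
    space = [tuple(p) for p in space]
    d = len(space[0])
    eps1 = eps / (3 + 2 * eps)
    # Step 1: k points by the farthest point method, and their covering radius.
    p1 = farthest_points(space, k)
    R1 = max(min(math.dist(x, p) for p in p1) for x in space)
    if R1 == 0:  # the k chosen points already are the whole space
        return p1
    # Step 2: grid of side length eps1 * R1 / (2 * sqrt(d)).
    side = eps1 * R1 / (2 * math.sqrt(d))
    # Step 3: keep one point of the space per non-empty grid cell.
    cells = {}
    for x in space:
        cells.setdefault(tuple(math.floor(c / side) for c in x), x)
    S = list(cells.values())
    # Step 4: brute force over all k-subsets of S.
    return list(min(combinations(S, k), key=lambda P: gap_ratio(space, P)))


# Hypothetical instance: five collinear points in the plane, k = 2.
space = [(float(i), 0.0) for i in range(5)]
best = algorithm2_sketch(space, k=2, eps=0.4)
```

The brute force step enumerates every $$k$$-subset of $$S$$, so the sketch is only practical for small $$k$$, in line with the running time discussed in the analysis.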

For analysing the algorithm we need the following definitions and lemmas.

Define $$R_{OPT} :=\min _{P\subset \mathcal {M}, \vert P \vert =k} \max _{q \in \mathcal{M}} \delta (q,P)$$ and $$r_{OPT} :=\max _{P\subset \mathcal {M}, \vert P \vert =k} \min _{p,q \in {P}, p \ne q} \frac{\delta (p,q)}{2}$$. We try to bound the time complexity by estimating the number of grid cells needed to cover $$\mathcal M$$.

### Lemma 25

In Algorithm 2, at most $$N :=O(k \lceil \frac{1}{\epsilon _1} \rceil ^d )$$ cells cover $$\mathcal M$$.

### Proof

Consider a set (say $$P_{cov}$$) of $$k$$ points in $$\mathcal M$$, such that $$R_{OPT} = \max _{q \in \mathcal{M}} \delta \left( q,P_{cov}\right)$$. Now, we know that balls of radius $$R_{OPT}$$ around the points of $$P_{cov}$$ cover $$\mathcal M$$. Each of these balls intersect $$O( \lceil \frac{2R_{OPT}}{\epsilon _2} \rceil ^d ) = O( \lceil \frac{1}{\epsilon _1} \rceil ^d )$$ grid cells. Thus, $$N :=O(k \lceil \frac{1}{\epsilon _1} \rceil ^d )$$ cells cover $$\mathcal M$$.    $$\square$$

The above lemma shows that the brute force calculation of gap ratio over $$S$$ takes $$O\left( N^k \left( k\log k + \left( n-k \right) k \right) \right)$$ time, where $$O(k\log k)$$ is required to compute $$r$$ and $$O( k(n-k) )$$ is required to compute $$R$$ in each iteration; all other steps in Algorithm 2 are polynomial in $$n$$ and $$k$$. Note that the time is not polynomial in $$k$$.

We are now ready to prove the main theorem for this section.

### Theorem 26

In Algorithm 2 we have, $$GR_P \leqslant \left( 1+\epsilon \right) \cdot GR_{OPT}$$.

### Proof

Consider the set $$P^*$$ of $$k$$ points in $$\mathcal M$$, which gives the minimum gap ratio, $$\alpha$$, in $$\mathcal M$$. Let $$r :=r_{P^*}$$. We have $$R_{P_1}\leqslant 2 R_{OPT}$$ from [14]. For each $$p_i$$ in $$P^*$$, there exists a point $$q_i$$ in $$S$$, such that $$\delta \left( q_i,p_i\right) \leqslant \sqrt{d} \epsilon _2$$, because $$\sqrt{d} \epsilon _2$$ is the diameter of each grid cell. From the definition of $$\epsilon _2$$, we have $$\delta \left( q_i,p_i\right) \leqslant \frac{\epsilon _1 R_{P_1}}{2} \leqslant \epsilon _1 R_{OPT} \leqslant \epsilon _1 R_{P^*}=\epsilon _1 \alpha r$$. Also note that $$\alpha \leqslant 2$$, as the farthest point method itself will yield gap ratio at most $$2$$. Thus, we have $$\delta \left( q_i,p_i\right) \leqslant r$$ (as $$\epsilon _1 < \frac{1}{2}$$), i.e., $$i \ne j \implies q_i \ne q_j$$. Let $$P_2 :=\left\{ q_1, q_2, \ldots , q_k \right\}$$ be a set of such $$k$$ distinct points in $$S$$. Let us compute the gap ratio of $$P_2$$. Triangle inequality gives us $$R_{P_2} \leqslant \left( 1+\epsilon _1 \right) \alpha r$$ and $$r_{P_2} \geqslant \left( 1-\epsilon _1 \alpha \right) r$$. Then the gap ratio of $$P_2$$ is $$\leqslant \frac{\left( 1+\epsilon _1 \right) \alpha }{\left( 1-\epsilon _1 \alpha \right) } \leqslant \frac{\left( 1+\epsilon _1 \right) \alpha }{\left( 1-2\epsilon _1 \right) }=\left( 1+\epsilon \right) \alpha$$.

Also, by definition, the gap ratio of $$P$$ is at most the gap ratio of $$P_2$$. Thus the gap ratio of $$P$$ in $$S$$ is at most $$\left( 1+\epsilon \right) \alpha$$.    $$\square$$
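The final equality in the bound on the gap ratio of $$P_2$$ determines how $$\epsilon _1$$ must depend on $$\epsilon$$. As a worked step, assuming only that equality (the actual choice of $$\epsilon _1$$ in Algorithm 2 is not restated here and presumably matches), solving for $$\epsilon _1$$ gives

$$\frac{1+\epsilon _1}{1-2\epsilon _1} = 1+\epsilon \iff 1+\epsilon _1 = \left( 1+\epsilon \right) - 2\left( 1+\epsilon \right) \epsilon _1 \iff \epsilon _1 = \frac{\epsilon }{3+2\epsilon }.$$

In particular, $$\epsilon _1 < \frac{1}{2}$$ for every $$\epsilon > 0$$, which is consistent with the condition used in the proof.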

## 4 A General Approximation Hardness Result

In this section, we show that the gap ratio problem is hard to approximate within a factor of $$2$$ in a general metric space.

### Theorem 27

In a general metric space, it is NP-hard to approximate the gap ratio within a factor better than $$2$$.

### Proof

To show this hardness, we reduce from the independent dominating set problem, which asks for a dominating set that is also an independent set. This problem is known to be NP-hard [13].

Let $$G=\left( V,E\right)$$ and $$k$$ be an instance of the independent dominating set problem. We construct a weighted complete graph over $$V$$ such that all edges present in $$G$$ have weight $$1$$ and all other edges have weight $$2$$. The metric space $$\mathcal M$$ is then given by the vertex set of the complete graph, with the metric defined by the edge weights. The result follows from the following claim.

### Claim 28

$$G=(V,E)$$ has an independent dominating set of cardinality $$k$$ if and only if there exists a sampled set $$P$$ in $$\mathcal {M}$$ of $$k$$ points with gap ratio $$1$$.

See Theorem 20 of [8] for the proof of the above claim and other details of the proof.    $$\square$$
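The construction in the proof can be checked on a small instance. The sketch below is only an illustration, not part of the proof: it assumes that $$r$$ is half the smallest pairwise distance within the sample and $$R$$ is the largest distance from a vertex to its nearest sample point, and it requires $$2 \leqslant k < |V|$$. The function names and the path graph used as input are hypothetical.

```python
from itertools import combinations


def reduction_metric(n, edges):
    """Distance matrix on vertices 0..n-1: 1 across edges of G, 2 otherwise."""
    d = [[0 if u == v else 2 for v in range(n)] for u in range(n)]
    for u, v in edges:
        d[u][v] = d[v][u] = 1
    return d


def gap_ratio(sample, d):
    """Gap ratio R/r of `sample` in the finite metric given by matrix `d`."""
    r = min(d[p][q] for p, q in combinations(sample, 2)) / 2  # assumed definition
    R = max(min(d[x][p] for p in sample) for x in range(len(d)))
    return R / r


def is_independent_dominating(sample, n, edges):
    """True if no two sample vertices are adjacent and every vertex is covered."""
    adj = {frozenset(e) for e in edges}
    independent = all(frozenset(pq) not in adj for pq in combinations(sample, 2))
    dominating = all(
        x in sample or any(frozenset((x, p)) in adj for p in sample)
        for x in range(n)
    )
    return independent and dominating


# Path graph 0 - 1 - 2 - 3 with k = 2.
n, edges, k = 4, [(0, 1), (1, 2), (2, 3)], 2
d = reduction_metric(n, edges)
results = {
    sample: (gap_ratio(sample, d), is_independent_dominating(sample, n, edges))
    for sample in combinations(range(n), k)
}
for sample, (ratio, ids) in results.items():
    print(sample, ratio, ids)
```

On this instance, every sample that is an independent dominating set has gap ratio $$1$$ and every other sample has gap ratio at least $$2$$, which is the gap that the factor-$$2$$ hardness statement relies on.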

## Notes

### Acknowledgements

The authors want to thank Tetsuo Asano and Geevarghese Philip.

### References

1. Asano, T.: Computational geometric and combinatorial approaches to digital halftoning. In: CATS, p. 3 (2006)
2. Asano, T.: Online uniformity of integer points on a line. Inf. Process. Lett. 109(1), 57–60 (2008)
3. Asano, T., Katoh, N., Obokata, K., Tokuyama, T.: Combinatorial and geometric problems related to digital halftoning. In: Asano, T., Klette, R., Ronse, C. (eds.) Geometry, Morphology, and Computational Imaging. LNCS, vol. 2616, pp. 58–71. Springer, Heidelberg (2003)
4. Asano, T., Katoh, N., Obokata, K., Tokuyama, T.: Matrix rounding under the $$\text{ L }_p$$-discrepancy measure and its application to digital halftoning. SIAM J. Comput. 32(6), 1423–1435 (2003)
5. Asano, T., Teramoto, S.: On-line uniformity of points. In: Book of Abstracts for 8th Hellenic-European Conference on Computer Mathematics and its Applications, Athens, Greece, pp. 21–22 (2007)
6. Bange, D., Barkauskas, A., Host, L., Slater, P.: Generalized domination and efficient domination in graphs. Discrete Math. 159(13), 1–11 (1996)
7. Berg, M., Cheong, O., Kreveld, M., Overmars, M.: Computational Geometry: Algorithms and Applications. Springer-Verlag TELOS, Santa Clara (2008)
8. Bishnu, A., Desai, S., Ghosh, A., Goswami, M., Paul, S.: Uniformity of point samples in metric spaces using gap ratio. CoRR, abs/1411.7819v1 (2014)
9. Chain-Chin, Y., Lee, R.: The weighted perfect domination problem and its variants. Discrete Appl. Math. 66(2), 147–160 (1996)
10. Chazelle, B.: The Discrepancy Method - Randomness and Complexity. Cambridge University Press, New York (2001)
11. Collins, C.R., Stephenson, K.: A circle packing algorithm. Comput. Geom. 25(3), 233–256 (2003)
12. Fiala, J., Kratochvíl, J., Proskurowski, A.: Systems of distant representatives. Discrete Appl. Math. 145(2), 306–316 (2005)
13. Garey, M.R., Johnson, D.S.: Computers and Intractability; A Guide to the Theory of NP-completeness. W. H. Freeman & Co., New York (1990)
14. Gonzalez, T.F.: Clustering to minimize the maximum intercluster distance. Theor. Comput. Sci. 38, 293–306 (1985)
15. Kuperberg, W.: An inequality linking packing and covering densities of plane convex bodies. Geom. Dedicata. 23(1), 59–66 (1987)
16. Locatelli, M., Raber, U.: Packing equal circles in a square: a deterministic global optimization approach. Discrete Appl. Math. 122(13), 139–166 (2002)
17. Matoušek, J.: Geometric Discrepancy: An Illustrated Guide. Springer, Heidelberg (1999)
18. Nurmela, K.J., Östergård, P.R.J.: Packing up to 50 equal circles in a square. Discrete Comput. Geom. 18(1), 111–120 (1997)
19. Nurmela, K.J., Östergård, P.R.J.: More optimal packings of equal circles in a square. Discrete Comput. Geom. 22(3), 439–457 (1999)
20. Nurmela, K.J., Östergård, P.R.J., aus dem Spring, R.: Asymptotic behavior of optimal circle packings in a square. Can. Math. Bull. 42(3), 380–385 (1999)
21. Sadakane, K., Chebihi, N.T., Tokuyama, T.: Discrepancy-based digital halftoning: automatic evaluation and optimization. In: Asano, T., Klette, R., Ronse, C. (eds.) Geometry, Morphology, and Computational Imaging. LNCS, vol. 2616, pp. 301–319. Springer, Heidelberg (2003)
22. Teramoto, S., Asano, T., Katoh, N., Doerr, B.: Inserting points uniformly at every instance. IEICE Trans. 89D(8), 2348–2356 (2006)
23. Tóth, G.: New results in the theory of packing and covering. In: Gruber, P., Wills, J. (eds.) Convexity and Its Applications, pp. 318–359. Birkhäuser Basel, Basel (1983)
24. Zhang, Y., Chang, Z., Chin, F.Y.L., Ting, H.-F., Tsin, Y.H.: Uniformly inserting points on square grid. Inf. Process. Lett. 111(16), 773–779 (2011)

© Springer International Publishing Switzerland 2015

## Authors and Affiliations

• Arijit Bishnu (1)
• Sameer Desai (1)
• Arijit Ghosh (2)
• Mayank Goswami (2)
• Subhabrata Paul (1)

1. ACM Unit, Indian Statistical Institute, Kolkata, India
2. MPI for Informatics, Saarbrücken, Germany