Abstract
There has been a recent surge of interest in incorporating fairness aspects into classical clustering problems. Two recently introduced variants of the k-Center problem in this spirit are Colorful k-Center, introduced by Bandyapadhyay, Inamdar, Pai, and Varadarajan, and lottery models, such as the Fair Robust k-Center problem introduced by Harris, Pensyl, Srinivasan, and Trinh. To address fairness aspects, these models, compared to traditional k-Center, include additional covering constraints. Prior approximation results for these models require relaxing some of the normally hard constraints, like the number of centers to be opened or the involved covering constraints, and therefore only obtain constant-factor pseudo-approximations. In this paper, we introduce a new approach to deal with such covering constraints that leads to (true) approximations, including a 4-approximation for Colorful k-Center with constantly many colors (settling an open question raised by Bandyapadhyay, Inamdar, Pai, and Varadarajan) and a 4-approximation for Fair Robust k-Center, for which the existence of a (true) constant-factor approximation was also open. We complement our results by showing that if one allows an unbounded number of colors, then Colorful k-Center admits no approximation algorithm with finite approximation guarantee, assuming that \(\mathtt {P}\ne \mathtt {NP}\). Moreover, under the Exponential Time Hypothesis, the problem is inapproximable if the number of colors grows faster than logarithmic in the size of the ground set.
Introduction
Along with k-Median and k-Means, k-Center is one of the most fundamental and heavily studied clustering problems. In k-Center, we are given a finite metric space (X, d) and an integer \(k\in [|X|]:=\{1,\dots , |X|\}\), and the task is to find a set \(C\subseteq X\) with \(|C|\le k\) minimizing the maximum distance of any point in X to its closest point in C. Equivalently, the problem can be phrased as covering X with k balls of radius as small as possible, i.e., finding the smallest radius \(r\in \mathbb {R}_{\ge 0}\) together with a set \(C\subseteq X\) with \(|C|\le k\) such that \(X = B(C,r) :=\bigcup _{c\in C}B(c,r)\), where \(B(c,r):=\{u\in X: d(c,u)\le r\}\) is the ball of radius r around c.
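The two equivalent views of k-Center described above can be illustrated with a minimal Python sketch. The function names `ball`, `cover`, and `kcenter_radius`, as well as the toy instance on the real line, are illustrative assumptions and not part of the paper.

```python
from typing import Callable, Hashable, Iterable, Set

Point = Hashable
Metric = Callable[[Point, Point], float]


def ball(X: Iterable[Point], d: Metric, c: Point, r: float) -> Set[Point]:
    """B(c, r): all points of X within distance r of c."""
    return {u for u in X if d(c, u) <= r}


def cover(X: Iterable[Point], d: Metric, C: Iterable[Point], r: float) -> Set[Point]:
    """B(C, r): union of the balls of radius r around the centers in C."""
    X = list(X)
    covered: Set[Point] = set()
    for c in C:
        covered |= ball(X, d, c, r)
    return covered


def kcenter_radius(X: Iterable[Point], d: Metric, C: Iterable[Point]) -> float:
    """Maximum distance of any point of X to its closest center in C."""
    C = list(C)
    return max(min(d(c, u) for c in C) for u in X)


# Toy instance (hypothetical): five points on the real line.
X = [0, 1, 2, 10, 11]
d = lambda a, b: abs(a - b)
r = kcenter_radius(X, d, [1, 10])
```

On this instance the centers {1, 10} have objective value `r`, and the balls of radius `r` around them cover all of X, which is exactly the covering formulation.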
k-Center, like most clustering problems, is computationally hard; in fact, it is \(\mathtt {NP}\)-hard to approximate to within any constant below 2 [21]. On the positive side, various 2-approximations [15, 19] have been found, and thus its approximability is settled. Many variations of k-Center have been studied, most of which are based on generalizations along one of the following two main axes:

(i)
which sets of centers can be selected, and

(ii)
which sets of points of X need to be covered.
The most prominent variations along (i) are variations where the set of centers is required to be in some down-closed family \(\mathcal {F}\subseteq 2^X\). For example, if centers have non-negative opening costs and there is a global budget for opening centers, Knapsack Center is obtained. If \(\mathcal {F}\) is the set of independent sets of a matroid, the problem is known as Matroid Center. The best-known problem type linked to (ii) is Robust k-Center. Here, an integer \(m\in [|X|]\) is given, and one only needs to cover any m points of X with k balls of radius as small as possible. Research on k-Center variants along one or both of these axes has been very active and fruitful, see, e.g., [8, 10, 11, 20]. In particular, recent work of Chakrabarty and Negahbani [9] presents an elegant and unifying framework for designing best possible approximation algorithms for all above-mentioned variants.
All the above variants have in common that there is a single covering requirement; either all of X needs to be covered or a subset of it. Moreover, they come with different kinds of packing constraints on the centers to be opened, as in Knapsack or Matroid Center. However, the desire to address fairness in clustering, which has received significant attention recently, naturally leads to multiple covering constraints. Here, existing techniques only lead to constant-factor pseudo-approximations that violate at least one constraint, like the number of centers to be opened. In this work, we present techniques for obtaining (true) approximations for two recent fairness-inspired generalizations of k-Center along axis (ii), namely

(i)
\(\gamma \)-Colorful k-Center, as introduced by Bandyapadhyay et al. [3], and

(ii)
Fair Robust k-Center, a lottery model introduced by Harris et al. [18].
\(\gamma \)-Colorful k-Center (\(\upgamma \mathrm {C k C}\)) is a fairness-inspired k-Center model imposing covering constraints on subgroups. It is formally defined as follows.
Definition 1
(\(\upgamma \)-Colorful k-Center (\(\upgamma \mathrm {C k C}\)) [3]) Let \(\gamma ,k\in \mathbb {Z}_{\ge 1}\), (X, d) be a finite metric space, \(X_\ell \subseteq X\) and \(m_\ell \in \mathbb {Z}_{\ge 0}\) for \(\ell \in [\gamma ]\). The \(\gamma \)-Colorful k-Center problem (\(\upgamma \mathrm {C k C}\)) asks to find the smallest radius \(r\in \mathbb {R}_{\ge 0}\) together with centers \(C\subseteq X\), \(|C|\le k\), such that
\[|B(C,r)\cap X_\ell | \ge m_\ell \quad \forall \ell \in [\gamma ].\]
Such a set of centers C is called a \(\upgamma \mathrm {C k C}\) solution of radius r.^{Footnote 1}
We clarify that, unless explicitly stated otherwise, the number \(\gamma \) in the above definition is assumed to be part of the input.
The choice of name for the problem stems from interpreting each set \(X_\ell \) for \(\ell \in [\gamma ]\) as a color assigned to the elements of \(X_{\ell }\). In particular, an element can have multiple colors or no color. In words, the task is to open k balls of smallest possible radius such that, for each color \(\ell \in [\gamma ]\), at least \(m_\ell \) points of color \(\ell \) are covered. Hence, for \(\gamma =1\), we recover the Robust k-Center problem.
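To make the covering requirement concrete, here is a minimal Python sketch of a feasibility check: given candidate centers C and a radius r, it tests whether at most k centers are used and whether every color class has enough covered points. The function name `is_colorful_solution` and the toy instance are illustrative assumptions.

```python
from typing import Callable, Hashable, Iterable, List, Sequence, Set

Point = Hashable


def is_colorful_solution(
    X: Iterable[Point],
    d: Callable[[Point, Point], float],
    C: Iterable[Point],
    r: float,
    k: int,
    colors: Sequence[Set[Point]],  # colors[l] plays the role of X_l
    m: Sequence[int],              # m[l] plays the role of m_l
) -> bool:
    """Check |C| <= k and |B(C, r) ∩ X_l| >= m_l for every color l."""
    centers = set(C)
    if len(centers) > k:
        return False
    covered = {u for u in X if any(d(c, u) <= r for c in centers)}
    return all(len(covered & X_l) >= m_l for X_l, m_l in zip(colors, m))


# Toy instance (hypothetical): two colors on the real line.
X = [0, 1, 2, 10, 11, 20]
d = lambda a, b: abs(a - b)
colors: List[Set[int]] = [{0, 1, 2}, {10, 11, 20}]
m = [2, 1]
```

With a single center at 1 and radius 1, the second color is not served at all, so the check fails; with centers {1, 10} it succeeds even though the point 20 stays uncovered, which is what distinguishes the model from plain k-Center.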
We briefly contrast \(\upgamma \mathrm {C k C}\) with related fairness models. A related class of models that has received significant attention also assumes that the ground set is colored, but requires that the ratio between colors within each cluster is approximately the same as the global ratio between colors. Such variants have been considered for k-Median, k-Means, and k-Center, e.g., see [2, 4, 5, 12, 28] and references therein. \(\upgamma \mathrm {C k C}\) differentiates itself from the above notion of fairness by not requiring a per-cluster guarantee, but a global fairness guarantee. More precisely, each color can be thought of as representing a certain group of people (demographic), and a global covering requirement is given per demographic. Also notice the difference with the well-known Robust k-Center problem, where a feasible solution might completely ignore a certain subgroup, resulting in a heavily unfair treatment. \(\upgamma \mathrm {C k C}\) addresses this issue.
The presence of multiple covering constraints in \(\upgamma \mathrm {C k C}\), imposed by the colors, hinders the use of classical k-Center clustering techniques, which, as mentioned above, have mostly been developed for packing constraints on the centers to be opened. An elegant first step was done by Bandyapadhyay et al. [3]. They exploit sparsity of a well-chosen LP (in a similar spirit as in [18]) to obtain the following pseudo-approximation for \(\upgamma \mathrm {C k C}\): they efficiently compute a solution of twice the optimal radius by opening at most \(k+\gamma -1\) centers. Hence, up to \(\gamma -1\) more centers than allowed may have to be opened. Moreover, [3] shows that in the Euclidean plane, a significantly more involved extension of this technique allows for obtaining a true \((17+\varepsilon )\)-approximation for \(\gamma =O(1)\). Unfortunately, this approach is heavily problem-tailored and does not even extend to 3-dimensional Euclidean spaces. This naturally leads to the main open question raised in [3]:
Does \(\upgamma \mathrm {C k C}\) with \(\gamma =O(1)\) admit an O(1)-approximation, for any finite metric?
Here, we introduce a new approach that answers this question affirmatively.
Together with additional ingredients, our approach also applies to Fair Robust k-Center, which is a natural lottery model introduced by Harris et al. [18]. We introduce the following generalization thereof that can be handled with our techniques, which we name the Fair \(\gamma \)-Colorful k-Center problem (Fair \(\upgamma \mathrm {C k C}\)). (The Fair Robust k-Center problem, as introduced in [18], corresponds to \(\gamma =1\).)
Definition 2
(Fair \(\gamma \)-Colorful k-Center problem (Fair \(\upgamma \mathrm {C k C}\))) Given is a \(\upgamma \mathrm {C k C}\) instance on a finite metric space (X, d) together with a vector \(p\in [0,1]^X\). The goal is to find the smallest radius \(r\in \mathbb {R}_{\ge 0}\) for which there exists a distribution \(\mathcal {H}\) over feasible \(\upgamma \mathrm {C k C}\) solutions of radius r such that
\[\Pr _{C\sim \mathcal {H}}\left[ u\in B(C,r)\right] \ge p(u) \quad \forall u\in X.\]
An algorithm for this problem should return a radius r along with an efficient procedure for sampling a random feasible \(\upgamma \mathrm {C k C}\) solution of radius r.
We note that if there exists a distribution \(\mathcal {H}\) with the desired properties for some radius r, then there exists a distribution of polynomial support with the desired properties (due to sparsity of the natural LP corresponding to the distribution, described in Sect. 3). This, in particular, implies that the corresponding decision problem is in \(\mathtt {NP}\).
Fair \(\upgamma \mathrm {C k C}\) is a generalization of \(\upgamma \mathrm {C k C}\), where each element \(u\in X\) needs to be covered with a prescribed probability p(u). The Fair Robust k-Center problem, i.e., Fair \(\upgamma \mathrm {C k C}\) with \(\gamma =1\), is indeed a fairness-inspired generalization of Robust k-Center, since Robust k-Center is obtained by setting \(p(u)=0\) for all \(u\in X\). One example setting that nicely illustrates the additional fairness aspect of Fair \(\upgamma \mathrm {C k C}\) compared to \(\upgamma \mathrm {C k C}\) is when k-Center problems have to be solved repeatedly on the same metric space. The introduction of the probability requirements p allows for obtaining a distribution to draw from that needs to consider all elements of X (as prescribed by p), whereas classical Robust k-Center likely ignores a group of badly-placed elements. We refer to Harris et al. [18] for further motivation of the problem setting. They also discuss the Knapsack and Matroid Center problem under the same notion of fairness.
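The probabilistic requirement can be checked directly for a distribution with finite support. The following Python sketch assumes the distribution is given as a list of (center set, probability) pairs; the names `coverage_probabilities` and `satisfies_lottery` are hypothetical, and the sketch does not verify that each center set in the support is itself a feasible \(\upgamma \mathrm {C k C}\) solution.

```python
from typing import Callable, Dict, Hashable, Iterable, List, Tuple

Point = Hashable
Distribution = List[Tuple[Tuple[Point, ...], float]]  # (center set, probability)


def coverage_probabilities(
    X: Iterable[Point],
    d: Callable[[Point, Point], float],
    distribution: Distribution,
    r: float,
) -> Dict[Point, float]:
    """For each u in X, the probability that u lies in B(C, r) for random C."""
    X = list(X)
    prob = {u: 0.0 for u in X}
    for C, weight in distribution:
        for u in X:
            if any(d(c, u) <= r for c in C):
                prob[u] += weight
    return prob


def satisfies_lottery(X, d, distribution, r, p, tol=1e-9) -> bool:
    """Check Pr[u covered] >= p(u) for every u, up to a small tolerance."""
    X = list(X)
    prob = coverage_probabilities(X, d, distribution, r)
    return all(prob[u] + tol >= p[u] for u in X)


# Toy instance (hypothetical): two far-apart points, one center allowed.
X = [0, 10]
d = lambda a, b: abs(a - b)
lottery: Distribution = [((0,), 0.5), ((10,), 0.5)]
```

In this toy instance no single center covers both points at radius 0, yet the uniform lottery over the two singletons covers each point with probability 1/2, so requirements up to p(u) = 1/2 are satisfiable while p(0) = 0.6 together with p(10) = 0.5 is not met by this lottery.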
For Fair Robust k-Center, [18] presents a 2-pseudo-approximation that slightly violates both the number of points to be covered and the probability of covering each point. More precisely, for any constant \(\varepsilon >0\), only a \((1-\varepsilon )\)-fraction of the required number of elements are covered, and element \(u\in X\) is covered only with probability \((1-\varepsilon ) p(u)\) instead of p(u). It was left open in [18] whether a true approximation may exist for Fair Robust k-Center.
Our results
Our main contribution is a method to obtain 4-approximations for variants of k-Center with unary encoded covering constraints on the points to be covered. We illustrate our technique in the context of \(\upgamma \mathrm {C k C}\), affirmatively resolving the open question of Bandyapadhyay et al. [3] about the existence of an O(1)-approximation for constantly many colors (without restrictions on the underlying metric space).
Theorem 1
There is a 4-approximation algorithm for \(\upgamma \mathrm {C k C}\) running in time \(|X|^{O(\gamma )}\).
In a second step, we extend and generalize our technique to Fair \(\upgamma \mathrm {C k C}\), which, as mentioned, is a generalization of \(\upgamma \mathrm {C k C}\). We show that Fair \(\upgamma \mathrm {C k C}\) admits an O(1)-approximation that violates neither the covering nor the probabilistic constraints.
Theorem 2
There is a 4-approximation algorithm for Fair \(\upgamma \mathrm {C k C}\) running in time \({{\,\mathrm{poly}\,}}(L) \cdot |X|^{O(\gamma )}\), where L is the encoding length of the input.
We recall that in our definition of \(\upgamma \mathrm {C k C}\), the number of colors \(\gamma \) is part of the input. In the following, we complement our results above (which lead to efficient algorithms only for constant \(\gamma \)) by showing inapproximability of \(\upgamma \mathrm {C k C}\) when \(\gamma \) is not bounded. This holds even on the real line (1-dimensional Euclidean space).
Theorem 3
It is \(\mathtt {NP}\)-hard to decide whether \(\upgamma \mathrm {C k C}\) on the real line admits a solution of radius 0. Moreover, unless the Exponential Time Hypothesis fails, for any function \(f:\mathbb {Z}_{\ge 0}\rightarrow \mathbb {Z}_{\ge 0}\) with \(f(n) = \omega (\log n)\), no polynomial-time algorithm can distinguish whether \(\upgamma \mathrm {C k C}\) on the real line with \(\gamma \le f(|X|)\) admits a solution of radius 0.
Hence, assuming the Exponential Time Hypothesis, there is no polynomial-time approximation algorithm for \(\upgamma \mathrm {C k C}\) if the number of colors grows faster than logarithmic in the size of the ground set. Notice that, for a logarithmic number of colors, our procedures run in quasi-polynomial time.
Finally, we extend the hardness implied by Theorem 3 to bicriteria algorithms that are allowed to open more than k centers. An \((\alpha ,\beta )\) bicriteria algorithm for \(\upgamma \mathrm {C k C}\), for \(\alpha , \beta \ge 1\), is an algorithm that returns a solution that picks at most \(\alpha k\) centers and its radius is at most \(\beta r\), where r is the radius of an optimal solution with k centers. More precisely, we prove the following theorem.
Theorem 4
There exists a constant \(c > 0\) such that it is \(\mathtt {NP}\)-hard to decide whether \(\upgamma \mathrm {C k C}\) on the real line admits a solution of radius 0, even if we are allowed to violate the number of open centers by a factor of \(c \log |X|\).
Notice that, unless \(\mathtt {P}=\mathtt {NP}\), the above theorem rules out the existence of a \((c \log |X|, \beta )\) bicriteria algorithm for \(\upgamma \mathrm {C k C}\) for any value of \(\beta \).
Note: In an independent work, Jia, Sheth, and Svensson [23] also made advances on \(\upgamma \mathrm {C k C}\). We briefly highlight some main differences. In particular, they gave a 3-approximation algorithm for \(\upgamma \mathrm {C k C}\) running in time \(|X|^{O(\gamma ^2)}\). Hence, this algorithm provides a better approximation guarantee than our 4-approximation for \(\upgamma \mathrm {C k C}\), though with a slower running time. Moreover, contrary to [23], we also show that our techniques extend to Fair \(\upgamma \mathrm {C k C}\) (Theorem 2) and obtain the hardness results highlighted in Theorems 3 and 4.
Outline of main technical contributions and paper organization
We introduce two main technical ingredients. The first is a method to deal with additional covering constraints in k-Center problems. We showcase this method in the context of \(\upgamma \mathrm {C k C}\), which leads to Theorem 1. For this, we combine polyhedral sparsity-based arguments as used by Bandyapadhyay et al. [3], which by themselves only lead to pseudo-approximations, with dynamic programming to design a round-or-cut approach. Round-or-cut approaches, first used by Carr et al. [7], leverage the ellipsoid method in a clever way. In each ellipsoid iteration they either separate the current point from a well-defined polyhedron P, or round the current point to a good solution. The rounding step may happen even if the current point is not in P. Round-or-cut methods have found applications in numerous problem settings (see, e.g., [1, 9, 16, 24, 25, 26, 27]). The way we employ round-or-cut is inspired by a powerful round-or-cut approach of Chakrabarty and Negahbani [9], also developed in the context of k-Center. However, their approach is not applicable to k-Center problems as soon as multiple covering constraints exist, like in \(\upgamma \mathrm {C k C}\); see Appendix B for more details.
Our second technical contribution first employs LP duality to transform lottery-type models, like Fair \(\upgamma \mathrm {C k C}\), into an auxiliary problem that corresponds to a weighted version of k-Center with covering constraints. We then show how a certain type of approximate separation over the dual is possible, by leveraging the techniques we introduced in the context of \(\upgamma \mathrm {C k C}\), leading to a 4-approximation.
Even though Theorem 2 is a strictly stronger statement than Theorem 1, we first prove Theorem 1 in Sect. 2, because it allows us to give a significantly cleaner presentation of some of our main technical contributions. In Sect. 3, we then focus on the additional techniques needed to deal with Fair \(\upgamma \mathrm {C k C}\), by reducing it to a problem that can be tackled with the techniques introduced in Sect. 2. Finally, in Sect. 4, we discuss the hardness results stated in Theorems 3 and 4.
A 4-approximation for \(\upgamma \mathrm {C k C}\) with running time \(\varvec{|X|^{O(\gamma )}}\)
In this section, we prove Theorem 1, which implies a polynomial-time 4-approximation algorithm for \(\upgamma \mathrm {C k C}\) with constantly many colors. We assume \(\gamma \ge 2\); notice that \(\gamma =1\) corresponds to Robust k-Center, for which a (tight) polynomial-time 2-approximation is known [8, 18]. Moreover, we assume that \(\gamma < k\), since otherwise we can simply enumerate over all subsets of X of size k, which leads to an exact algorithm with running time \(|X|^{O(k)} \le |X|^{O(\gamma )}\). Thus, from now on, we have that \(2 \le \gamma \le k - 1\).
We present a procedure that for any \(r\in \mathbb {R}_{\ge 0}\) returns a solution of radius 4r if a solution of radius r exists, and runs in time \(|X|^{O(\gamma )}\). This implies Theorem 1 because the optimal radius is a distance between two points. Hence, we can run the procedure for all possible pairwise distances r between points in X (or, alternatively, do binary search on the set of pairwise distances in order to speed up the algorithm) and return the best solution found. Thus, we fix \(r\in \mathbb {R}_{\ge 0}\) in what follows. We denote by \(\mathcal {P}\) the following canonical relaxation of \(\upgamma \mathrm {C k C}\) with radius r:
\[\mathcal {P} :=\left\{ (x,y)\in [0,1]^X\times [0,1]^X :\; \sum _{u\in X_\ell } x(u)\ge m_\ell \;\;\forall \ell \in [\gamma ],\quad \sum _{v\in X} y(v)\le k,\quad x(u)\le \sum _{v\in B(u,r)} y(v)\;\;\forall u\in X \right\} .\]
Integral points \((x,y)\in \mathcal {P}\) correspond to solutions of radius r, where x and y are characteristic vectors indicating the points that are covered and the centers that are opened, respectively. We denote the integer hull of \(\mathcal {P}\) by \(\mathcal {P}_{I}:={{\,\mathrm{conv}\,}}\left( \mathcal {P}\cap (\{0,1\}^X \times \{0,1\}^X )\right) \) .
Our algorithm is based on the round-or-cut framework, first used in [7]. The main building block is a procedure that rounds a point \((x,y)\in \mathcal {P}\) to a radius 4r solution under certain conditions. It will turn out that these conditions are always satisfied if \((x,y) \in \mathcal {P}_{I}\). If they are not satisfied, then we can prove that \((x,y) \notin \mathcal {P}_{I}\) and generate in time \(|X|^{O(\gamma )}\) a hyperplane separating (x, y) from \(\mathcal {P}_{I}\). This separation step now becomes an iteration of the ellipsoid method, employed to find a point in \(\mathcal {P}_{I}\), and we continue with a new candidate point (x, y). Schematically, the whole process is described in Fig. 1.
On a high level, we realize our round-or-cut procedure as follows. First, we check whether \((x,y) \in \mathcal {P}\) and return a violated constraint if this is not the case. If \((x,y)\in \mathcal {P}\), we partition the metric space, based on a natural greedy heuristic introduced by Harris et al. [18]. This gives a set of centers \(S=\{s_1,\ldots , s_q\}\subseteq X\) with corresponding clusters \(\mathcal {D}=\{D_1, \ldots , D_q\}\subseteq 2^X\). We now exploit a technique by Bandyapadhyay et al. [3], which implies that if \(y(B(S,r)) \le k - \gamma + 1\), then one can leverage sparsity arguments in a simplified LP to obtain a radius 4r solution that picks centers only within S. (For brevity, we use the shorthand \(y(W):=\sum _{w\in W} y(w)\) for any finite set W and vector \(y\in \mathbb {R}^W\); in particular, \(y(B(S,r))=\sum _{v\in B(S,r)} y(v)\).) We then turn to the case where \(y(B(S, r)) > k - \gamma + 1\). At this point, we show that one can efficiently check whether there exists a solution of radius 2r that opens at most \(k - (k - \gamma + 2) = \gamma - 2\) centers outside of S. This is achieved by guessing the centers outside of S (of which there are at most \(\gamma -2\) many, as noted) and using dynamic programming to find the remaining centers in S. If no such radius 2r solution exists, we argue that any solution of radius r has at most \(k-\gamma +1\) centers in B(S, r), proving that \(y(B(S, r)) \le k - \gamma + 1\) is an inequality separating (x, y) from \(\mathcal {P}_{I}\).
We now give a formal treatment of each step of this algorithm, which is schematically described in Fig. 1. Given a point \((x,y)\in \mathbb {R}^X \times \mathbb {R}^X\), we first check whether \((x,y)\in \mathcal {P}\), and, if not, return a violated constraint of \(\mathcal {P}\). Such a constraint separates (x, y) from \(\mathcal {P}_{I}\) because \(\mathcal {P}_{I}\subseteq \mathcal {P}\). Hence, we may assume that \((x, y) \in \mathcal {P}\).
We now use a partitioning technique by Harris et al. [18] that, given \((x, y) \in \mathcal {P}\), allows for obtaining what we call an (x, y)-good partition \((S,\mathcal {D})\), defined as follows.
Definition 3
((x, y)-good partition) Let \((x,y) \in \mathcal {P}\). A tuple \((S,\mathcal {D})\), where the family \(\mathcal {D} = \{D_1, \ldots , D_q\}\) partitions X and \(S = \{s_1, \ldots , s_q\}\subseteq X\) with \(s_i\in D_i\) for \(i\in [q]\), is an (x, y)-good partition if:

(i)
\(d(s_i, s_j) > 4r\) for all \(i,j\in [q], i\ne j\),

(ii)
\(D_i \subseteq B(s_i, 4r)\) for all \(i \in [q]\), and

(iii)
\(y(B(s_i,r)) \ge x(u)\) for all \(i\in [q]\) and for all \(u\in D_i\).
The partitioning procedure of [18] was originally introduced for Robust k-Center and naturally extends to \(\upgamma \mathrm {C k C}\) (see [3]). For completeness, we describe it in Algorithm 1. Contrary to prior procedures, we compute an (x, y)-good partition whose centers have pairwise distances of strictly more than 4r (instead of 2r as in prior work). This large separation avoids overlap of radius 2r balls around centers in S, and allows us to use dynamic programming (DP) to build a radius 2r solution with centers in S under certain conditions. However, it is also the reason why we get a 4-approximation if the DP approach cannot be applied.
Lemma 1
([3, 18]) For \((x,y) \in \mathcal {P}\), Algorithm 1 computes an (x, y)-good partition \((S, \mathcal {D})\) in polynomial time.
For completeness, we present the proof of the above lemma.
Proof of Lemma 1
By construction, the first two properties of the definition of an (x, y)-good partition are trivially satisfied by the generated partition \((S, \mathcal {D})\). We now turn to the third property. For each point \(u \in D_i\), by the greedy criterion we have \(x(u) \le x(s_i)\). Since \((x,y) \in \mathcal {P}\), we also have \(x(s_i) \le y(B(s_i, r))\), implying the statement. \(\square \)
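Algorithm 1 itself is not reproduced in this section, so the following Python sketch is only one plausible reading of the greedy criterion used in the proof: repeatedly pick a not-yet-clustered point of largest x-value as the next center and assign to it all remaining points within distance 4r. The function name `greedy_partition` and the toy data are assumptions made for illustration.

```python
from typing import Callable, Dict, Hashable, List, Sequence, Set, Tuple

Point = Hashable


def greedy_partition(
    X: Sequence[Point],
    d: Callable[[Point, Point], float],
    x: Dict[Point, float],
    r: float,
) -> Tuple[List[Point], List[Set[Point]]]:
    """Return (S, D): centers s_i and clusters D_i partitioning X."""
    remaining: Set[Point] = set(X)
    S: List[Point] = []
    D: List[Set[Point]] = []
    # Scan points by non-increasing x-value; sorted() is stable, so ties
    # are broken by the order in which points appear in X.
    for s in sorted(X, key=lambda u: -x[u]):
        if s not in remaining:
            continue  # s was already absorbed by an earlier cluster
        cluster = {u for u in remaining if d(s, u) <= 4 * r}
        S.append(s)
        D.append(cluster)
        remaining -= cluster
    return S, D


# Toy instance (hypothetical): fractional x-values on the real line.
X = [0, 1, 2, 10, 11]
d = lambda a, b: abs(a - b)
x = {0: 0.2, 1: 0.9, 2: 0.4, 10: 0.7, 11: 0.7}
S, D = greedy_partition(X, d, x, r=0.5)
```

A later center is never within distance 4r of an earlier one (otherwise it would have been absorbed), which gives property (i); property (ii) holds by the way clusters are formed; and every point of a cluster was still unassigned when its center was chosen as a maximizer of x, which gives the inequality \(x(u) \le x(s_i)\) used for property (iii).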
The following theorem follows from the results in [3].
Theorem 5
([3]) Let \((x, y) \in \mathcal {P}\) and \((S,\mathcal {D})\) be an (x, y)-good partition. Then, if \(y(B(S, r)) \le k - \gamma + 1\), a solution of radius 4r can be found in polynomial time.
For completeness, we provide in Appendix A a proof of a slightly stronger version of Theorem 5, namely Theorem 8, which we reuse later in a more general context. Theorem 8 easily follows by the same sparsity argument used in [3].
We are left with the case \(y(B(S, r)) > k - \gamma + 1\). In this case we present a procedure that either returns a solution of radius 2r or, if it fails to do so, we show that every point \((\overline{x},\overline{y})\in \mathcal {P}_{I}\) must fulfill \(\overline{y}(B(S, r)) \le k - \gamma + 1\); hence, this is an inequality separating (x, y) from \(\mathcal {P}_I\).
To show the above, we assume that \((x,y)\in \mathcal {P}_I\) holds and provide a procedure obtaining a solution of radius 2r. (Notice that we cannot check whether \((x,y)\in \mathcal {P}_I\), and even if we knew that \((x,y)\in \mathcal {P}_I\), we still need a procedure transforming the possibly fractional point (x, y) to an actual (integral) solution.) Note that if \((x,y) \in \mathcal {P}_{I}\), then there must exist a solution \(C_1\subseteq X\) of radius r with \(|C_1\cap B(S, r)| > k - \gamma + 1\). In particular, we must have \(|C_1 \setminus B(S,r)| \le \gamma - 2\). We observe that if such a solution \(C_1\) exists, then there must be a solution \(C_2\) of radius 2r which has at most \(\gamma - 2\) centers outside of S. This is formalized in the following lemma.
Lemma 2
Let \(S\subseteq X\) with \(d(s,s')>4r\) for all \(s,s'\in S\) with \(s \ne s'\), and \(\tau \in \{0, \ldots , k-1\}\). If there is a radius r solution \(C_1\) with \(|C_1\cap B(S,r)| > \tau \), then there is a radius 2r solution \(C_2\) with \(|C_2\,{\setminus }\, S| \le k - \tau - 1\).
Proof
Assume there is a solution \(C_1\) of radius r with \(|C_1\cap B(S,r)| > \tau \). Let \(A = C_1 \cap B(S,r)\). For each \(p \in A\), let \(\phi (p) \in S\) be the unique point in S such that \(p \in B(\phi (p), r)\); \(\phi (p)\) is well defined because \(d(s, s') > 4r\) for every \(s \ne s' \in S\). Thus, \(|\phi (A)| \le |A|\), where \(\phi (A) :=\{\phi (p):\,p \in A\}\).
Let \(C_2 = \phi (A) \cup (C_1 \setminus A)\). We have \(|C_2| = |\phi (A)| + |C_1 \setminus A| \le |A| + |C_1 \setminus A| \le k\). Moreover, as \(d(p, \phi (p)) \le r\) for every \(p \in A\), we have that \(B(C_1, r) \subseteq B(C_2, 2r)\). Thus, \(C_2\) is a feasible solution of radius 2r. Finally, by construction, \(|C_2 \setminus S| = |C_1 \setminus B(S,r)| \le k - \tau - 1\). \(\square \)
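The construction in this proof is short enough to write out. The Python sketch below maps every center of \(C_1\) that lies within distance r of some point of S to that point and keeps all other centers; the function name `shift_centers_to_S` and the toy instance are illustrative assumptions.

```python
from typing import Callable, Hashable, Iterable, Set

Point = Hashable


def shift_centers_to_S(
    C1: Iterable[Point],
    S: Iterable[Point],
    d: Callable[[Point, Point], float],
    r: float,
) -> Set[Point]:
    """Build C2 = phi(A) ∪ (C1 \\ A), where A = C1 ∩ B(S, r)."""
    S = list(S)
    C2: Set[Point] = set()
    for p in C1:
        # When points of S are pairwise more than 4r apart, at most one
        # s in S satisfies d(s, p) <= r, so phi(p) is well defined.
        near = [s for s in S if d(s, p) <= r]
        C2.add(near[0] if near else p)
    return C2


# Toy instance (hypothetical) on the real line with r = 1.
X = list(range(12))
d = lambda a, b: abs(a - b)
S = [0, 10]
C1 = {1, 5, 9}
C2 = shift_centers_to_S(C1, S, d, r=1)
covered_r = {u for u in X if any(d(c, u) <= 1 for c in C1)}
covered_2r = {u for u in X if any(d(c, u) <= 2 for c in C2)}
```

Here the centers 1 and 9 are moved to 0 and 10, the center 5 stays in place, and everything covered by \(C_1\) at radius r remains covered by \(C_2\) at radius 2r, as the triangle inequality guarantees.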
So, we have now proved that if \(y(B(S,r)) > k-\gamma +1\) and \((x,y)\in \mathcal {P}_{I}\), then there is a solution \(C_2\) of radius 2r with \(|C_2\setminus S|\le \gamma - 2\). The motivation for considering solutions of radius 2r with all centers in S except for constantly many (if \(\gamma =O(1)\)) is that such solutions can be found efficiently via dynamic programming. This is possible because the centers in S are separated by distances strictly larger than 4r, which implies that radius 2r balls centered at points in S do not overlap. Hence, there are no interactions between such balls. This is formalized below.
Lemma 3
Let \(S\subseteq X\) with \(d(s,s')>4r\) for all \(s,s'\in S\) with \(s \ne s'\), and \(\beta \in \mathbb {Z}_{\ge 0}\). If a radius 2r solution \(C\subseteq X\) with \(|C\setminus S|\le \beta \) exists, then we can find such a solution in time \(|X|^{O(\beta + \gamma )}\).
Proof
Suppose there is a solution \(C \subseteq X\) of radius 2r with \(|C \setminus S| \le \beta \). The algorithm has two components. We first guess the set \(Q:=C\setminus S\). Because \(|Q| \le \beta \), there are \(|X|^{O(\beta )}\) choices. Given Q, it remains to select at most \(k-|Q|\) centers \(W\subseteq S\) to fulfill the color requirements. Note that for any \(W\subseteq S\), the number of points of color \(\ell \in [\gamma ]\) that B(W, 2r) covers on top of those already covered by B(Q, 2r) is
\[\left| (B(W,2r)\setminus B(Q,2r))\cap X_\ell \right| = \sum _{w\in W} \left| \left( B(w,2r) \setminus B(Q,2r) \right) \cap X_\ell \right| ,\]
where equality holds because centers in W are separated by distances strictly larger than 4r, and thus B(W, 2r) is the disjoint union of the sets B(w, 2r) for \(w\in W\). Hence, the task of finding a set \(W\subseteq S\) with \(|W|\le k-|Q|\) such that \(Q\cup W\) is a solution of radius 2r can be phrased as finding a feasible solution to the following binary program:
\[\sum _{s\in S} z(s)\cdot \left| \left( B(s,2r)\setminus B(Q,2r)\right) \cap X_\ell \right| \ge m_\ell - \left| B(Q,2r)\cap X_\ell \right| \;\;\forall \ell \in [\gamma ],\qquad \sum _{s\in S} z(s)\le k-|Q|,\qquad z\in \{0,1\}^S.\]
The above binary program can be easily solved through standard dynamic programming techniques in \(|X|^{O(\gamma )}\) time, because the coefficients are small. For completeness, we show in Appendix A how this can be done for a slightly more general problem (see Theorem 9), which we will reuse later on.^{Footnote 2} As the dynamic program is run for \(|X|^{O(\beta )}\) many guesses of Q, we obtain an overall running time of \(|X|^{O(\beta +\gamma )}\), as claimed. \(\square \)
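The dynamic program is deferred to Appendix A, which is not part of this section, so the following Python sketch shows one standard way to realize it for a fixed guess Q and is not claimed to match the appendix. The state is the vector of additional points covered per color, capped at the residual demand, and the value is a smallest subset of S reaching that state; the function name `dp_select_centers` and the toy instance are assumptions.

```python
from typing import Callable, Dict, Hashable, List, Optional, Sequence, Set, Tuple

Point = Hashable


def dp_select_centers(
    X: Sequence[Point],
    d: Callable[[Point, Point], float],
    S: Sequence[Point],          # pairwise distances > 4r are assumed
    Q: Sequence[Point],          # guessed centers outside S
    r: float,
    k: int,
    colors: Sequence[Set[Point]],
    m: Sequence[int],
) -> Optional[Set[Point]]:
    """Return Q ∪ W with W ⊆ S, |W| <= k - |Q|, feasible at radius 2r, or None."""
    covered_by_Q = {u for u in X if any(d(q, u) <= 2 * r for q in Q)}
    # Residual demand of each color after accounting for B(Q, 2r).
    need = tuple(max(0, m_l - len(covered_by_Q & X_l)) for X_l, m_l in zip(colors, m))
    # best[state] = a smallest list of centers from S reaching `state`.
    best: Dict[Tuple[int, ...], List[Point]] = {tuple(0 for _ in need): []}
    for s in S:
        new_points = {u for u in X if d(s, u) <= 2 * r} - covered_by_Q
        gain = [len(new_points & X_l) for X_l in colors]
        updated = dict(best)  # option: do not open s
        for state, chosen in best.items():  # option: open s
            nxt = tuple(min(n, a + g) for n, a, g in zip(need, state, gain))
            if nxt not in updated or len(chosen) + 1 < len(updated[nxt]):
                updated[nxt] = chosen + [s]
        best = updated
    W = best.get(need)
    if W is None or len(W) > k - len(Q):
        return None
    return set(Q) | set(W)


# Toy instance (hypothetical) on the real line with r = 0.5, so 2r = 1.
X = [0, 1, 2, 10, 11, 20, 21, 50]
d = lambda a, b: abs(a - b)
S, Q = [1, 10, 20], [50]
colors = [{0, 1, 2, 50}, {10, 11, 20, 21}]
```

Adding gains per center is valid only because the radius 2r balls around points of S are disjoint. The number of states is at most \(\prod _{\ell }(m_\ell +1)\le (|X|+1)^{\gamma }\), which is where the \(|X|^{O(\gamma )}\) bound for a single guess of Q comes from.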
This completes the last ingredient for an iteration of our round-or-cut approach as shown in Fig. 1. In summary, assuming \(y(B(S,r)) > k-\gamma +1\) (for otherwise Theorem 5 leads to a solution of radius 4r), we use Lemma 3 (with \(\beta =\gamma -2\)) to check whether there is a radius 2r solution \(C_2\) with \(|C_2\setminus S|\le \gamma -2\). This requires \(|X|^{O(\gamma )}\) time. If this is the case, we are done. If not, the contrapositive of Lemma 2 (with \(\tau =k - \gamma + 1\)) implies that every radius r solution \(C_1\) fulfills \(|C_1\cap B(S,r)| \le k - \gamma + 1\). Hence, every point \((\overline{x},\overline{y})\in \mathcal {P}_{I}\) satisfies \(\overline{y}(B(S,r)) \le k - \gamma +1\). However, this constraint is violated by (x, y), and so it separates (x, y) from \(\mathcal {P}_{I}\). Thus, we proved that the process described in Fig. 1 is a valid round-or-cut procedure that runs in time \(|X|^{O(\gamma )}\).
Corollary 1
There is an algorithm that, given a point \((x,y)\in \mathbb {R}^X \times \mathbb {R}^X\), either returns a \(\upgamma \mathrm {C k C}\) solution of radius 4r or an inequality separating (x, y) from \(\mathcal {P}_{I}\). The running time of the algorithm is \(|X|^{O(\gamma )}\).
We can now prove the main theorem.
Proof of Theorem 1
We run the ellipsoid method on \(\mathcal {P}_{I}\) for each of the \(O(|X|^2)\) candidate radii r. For each r, the number of ellipsoid iterations is polynomially bounded as the separating hyperplanes that are produced by the algorithm have encoding length at most O(|X|) (see Theorem 6.4.9 of [17]). To see this, note that all generated hyperplanes are either inequalities defining \(\mathcal {P}\) or inequalities of the form \(y(B(S,r))\le k-\gamma +1\). For the correct guess of r, \(\mathcal {P}_{I}\) is non-empty and the algorithm terminates by returning a radius 4r solution. Hence, if we return the best solution among those computed for all guesses of r, we have a 4-approximation, and the total running time is \({{\,\mathrm{poly}\,}}(|X|) \cdot |X|^{O(\gamma )} = |X|^{O(\gamma )}\). \(\square \)
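The outer loop over candidate radii can be sketched in Python as follows. The per-radius procedure is passed in as a callable; since the ellipsoid-based round-or-cut routine is beyond a short example, the demo plugs in a hypothetical brute-force stand-in (`brute_force_procedure`) that is exact rather than a 4-approximation. All names are illustrative assumptions.

```python
from itertools import combinations
from typing import Callable, Hashable, Optional, Sequence, Set, Tuple

Point = Hashable


def candidate_radii(X: Sequence[Point], d: Callable[[Point, Point], float]):
    """All pairwise distances (including 0), sorted increasingly."""
    return sorted({d(u, v) for u in X for v in X})


def smallest_successful_radius(
    X: Sequence[Point],
    d: Callable[[Point, Point], float],
    procedure: Callable[[float], Optional[Set[Point]]],
) -> Optional[Tuple[float, Set[Point]]]:
    """Try candidate radii in increasing order; return the first success."""
    for r in candidate_radii(X, d):
        solution = procedure(r)
        if solution is not None:
            return r, solution
    return None


def brute_force_procedure(X, d, k, colors, m):
    """Hypothetical stand-in for the round-or-cut routine (exponential time)."""
    def procedure(r: float) -> Optional[Set[Point]]:
        for size in range(1, k + 1):
            for C in combinations(X, size):
                covered = {u for u in X if any(d(c, u) <= r for c in C)}
                if all(len(covered & X_l) >= m_l for X_l, m_l in zip(colors, m)):
                    return set(C)
        return None
    return procedure


X = [0, 1, 2, 10, 11, 20]
d = lambda a, b: abs(a - b)
proc = brute_force_procedure(X, d, 2, [{0, 1, 2}, {10, 11, 20}], [2, 1])
result = smallest_successful_radius(X, d, proc)
```

With the actual procedure of Corollary 1 in place of the stand-in, the first radius r at which the procedure succeeds is at most the optimal radius, so the returned solution has radius at most four times the optimum.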
The lottery model of Harris et al. [18]
Our main tool to solve the lottery model of Harris et al. [18] is a reduction to a certain type of weighted k-center problem. A key step of this reduction is to transform the problem through the use of linear duality. In Subsect. 3.1, we first present this reduction before proving in Subsect. 3.2 our algorithmic result for the above-referred version of a weighted k-center problem.
Reduction to weighted version of k-center
Let (X, d) be a Fair \(\upgamma \mathrm {C k C}\) instance, and let \(\mathcal {F}(r)\) be the family of sets of centers satisfying the covering requirements with radius r, i.e.,
\[\mathcal {F}(r) :=\left\{ C\subseteq X :\; |C|\le k \text { and } |B(C,r)\cap X_\ell |\ge m_\ell \;\;\forall \ell \in [\gamma ]\right\} .\]
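Note that a radius r solution for Fair \(\upgamma \mathrm {C k C}\) defines a distribution over the sets in \(\mathcal {F}(r)\). Given r, such a distribution exists if and only if the following (exponential-size) linear program \(\text {PLP}(r)\) is feasible (with \(\text {DLP}(r)\) being its dual):
\[\text {PLP}(r):\quad \min \; 0 \quad \text {s.t.}\quad \sum _{C\in \mathcal {F}(r):\, u\in B(C,r)} \lambda (C)\ge p(u)\;\;\forall u\in X,\qquad \sum _{C\in \mathcal {F}(r)} \lambda (C) = 1,\qquad \lambda \in \mathbb {R}^{\mathcal {F}(r)}_{\ge 0},\]
\[\text {DLP}(r):\quad \max \; \sum _{u\in X} p(u)\alpha (u) - \mu \quad \text {s.t.}\quad \sum _{u\in B(C,r)} \alpha (u)\le \mu \;\;\forall C\in \mathcal {F}(r),\qquad \alpha \in \mathbb {R}^{X}_{\ge 0},\;\mu \in \mathbb {R}.\]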
The dual problem \(\text {DLP}(r)\) can naturally be interpreted as a packing problem with packing constraints imposed by \(\upgamma \mathrm {C k C}\)-solutions. However, we will mostly be interested in approximately separating over \(\text {DLP}(r)\). This will turn out to reduce to a weighted version of \(\upgamma \mathrm {C k C}\), as we highlight later.
Clearly, if \(\text {PLP}(r)\) is feasible, then its optimal value is 0. As mentioned in the introduction, it is also easy to see that if \(\text {PLP}(r)\) is feasible, then it has a feasible solution with polynomial support (since the number of non-trivial constraints is \(|X| + 1\)).
We will again assume that \(\gamma < k\). If \(\gamma \ge k\), then for each fixed radius r, we solve \(\text {PLP}(r)\) in time \({{\,\mathrm{poly}\,}}(L)\cdot |X|^{O(k)} \le {{\,\mathrm{poly}\,}}(L)\cdot |X|^{O(\gamma )}\), where L is the encoding length of the input. If \(\text {PLP}(r)\) is infeasible, then the radius r is too small. Otherwise, we compute a feasible extreme point solution to \(\text {PLP}(r)\), which corresponds to a distribution with support size \({{\,\mathrm{poly}\,}}(|X|)\). Hence, by applying binary search over all candidate radii, which are the \(O(|X|^2)\) pairwise distances between points in X, we can compute an optimal distribution for the smallest possible radius in \({{\,\mathrm{poly}\,}}(L) \cdot |X|^{O(\gamma )}\) time. Thus, from now on, we assume that \(1 \le \gamma < k\).
Observe that, for any \(r \ge 0\), \(\text {DLP}(r)\) always has a feasible solution (the zero vector) of value 0. Thus, by strong duality, \(\text {PLP}(r)\) is feasible if and only if the optimal value of \(\text {DLP}(r)\) is 0. Note that \(\text {DLP}(r)\) is scale-invariant, meaning that if \((\alpha , \mu )\) is feasible for \(\text {DLP}(r)\) then so is \((t\alpha , t\mu )\) for \(t\in \mathbb {R}_{\ge 0}\). This implies that \(\text {DLP}(r)\) has a solution of strictly positive objective value if and only if \(\text {DLP}(r)\) is unbounded. We thus define the following polyhedron \(\mathcal {Q}(r)\), which contains all solutions of \(\text {DLP}(r)\) of value at least 1:
\[\mathcal {Q}(r) := \left\{ (\alpha ,\mu )\in \mathbb {R}^X_{\ge 0}\times \mathbb {R} : \sum _{u\in X} p(u)\alpha (u) \ge \mu + 1 \text { and } \sum _{u\in B(C,r)} \alpha (u) \le \mu \ \ \forall C\in \mathcal {F}(r)\right\} .\]
As discussed, the following statement is a direct consequence of strong duality of linear programming.
Lemma 4
\(\mathcal {Q}(r)\) is empty if and only if PLP(r) is feasible.
The main lemma that allows us to obtain our result is the following. It guarantees the existence of an algorithm approximately solving a certain weighted k-center problem, where clients are weighted by \(\alpha \in \mathbb {Q}^X_{\ge 0}\). Before proving the lemma in Subsect. 3.2, we show that it implies Theorem 2.
Lemma 5
There is an algorithm that, given a point \((\alpha ,\mu ) \in \mathbb {Q}_{\ge 0}^{X} \times \mathbb {Q}\) satisfying \(\sum _{u\in X}p(u)\alpha (u) \ge \mu + 1\) and a radius \(r \ge 0\), either certifies that \((\alpha , \mu ) \in \mathcal {Q}(r)\), or outputs a set \(C\in \mathcal {F}(4r)\) with \(\sum _{u\in B(C,4r)} \alpha (u) > \mu \). The running time of the algorithm is \({{\,\mathrm{poly}\,}}(L) \cdot |X|^{O(\gamma )}\), where L is the encoding length of the input.
In words, Lemma 5 either certifies \((\alpha ,\mu )\in \mathcal {Q}(r)\) or returns a hyperplane separating \((\alpha ,\mu )\) from \(\mathcal {Q}(4r)\). Its proof leverages techniques introduced in Sect. 2, and we present it in Subsect. 3.2. Using Lemma 5, we can now prove Theorem 2.
Proof of Theorem 2
As noted, there are polynomially many choices for the radius r, for each of which we run the ellipsoid method to check emptiness of \(\mathcal {Q}(4r)\) as follows. Whenever there is a call to the separation oracle for a point \((\alpha ,\mu )\in \mathbb {Q}^X \times \mathbb {Q}\), we first check whether \(\alpha \ge 0\) and \(\sum _{u\in X} p(u)\alpha (u) \ge \mu +1\). If one of these constraints is violated, we return it as a separating hyperplane. Otherwise, we invoke the algorithm of Lemma 5. The algorithm either returns a constraint in the inequality description of \(\mathcal {Q}(4r)\) violated by \((\alpha ,\mu )\), which solves the separation problem, or certifies \((\alpha ,\mu )\in \mathcal {Q}(r)\). If, at any iteration of the ellipsoid method, the separation oracle is called for a point \((\alpha ,\mu )\) for which Lemma 5 certifies \((\alpha ,\mu ) \in \mathcal {Q}(r)\), then \(\mathcal {Q}(r)\ne \emptyset \) and Lemma 4 implies that \(\text {PLP}(r)\) is infeasible. Thus, there is no solution to the considered Fair \(\upgamma \mathrm {C k C}\) instance of radius r. Hence, assume from now on that the separation oracle always returns a separating hyperplane, in which case the ellipsoid method certifies that \(\mathcal {Q}(4r) = \emptyset \) as follows. Let \(\mathcal {H}\subseteq \mathcal {F}(4r)\) be the family of all sets \(C\in \mathcal {F}(4r)\) returned by Lemma 5 through calls to the separation oracle. Then, the polyhedron
\[\mathcal {Q}_{\mathcal {H}}(4r) := \left\{ (\alpha ,\mu )\in \mathbb {R}^X_{\ge 0}\times \mathbb {R} : \sum _{u\in X} p(u)\alpha (u) \ge \mu + 1 \text { and } \sum _{u\in B(C,4r)} \alpha (u) \le \mu \ \ \forall C\in \mathcal {H}\right\} ,\]
which clearly contains \(\mathcal {Q}(4r)\), is empty. As the encoding length of any constraint in the inequality description of \(\mathcal {Q}(4r)\) is polynomially bounded in the input, the ellipsoid method runs in polynomial time (see Theorem 6.4.9 of [17]). In particular, the number of calls to the separation oracle, and thus \(|\mathcal {H}|\), is polynomially bounded.
As \(\mathcal {Q}(4r) \subseteq \mathcal {Q}_{\mathcal {H}}(4r) = \emptyset \), Lemma 4 implies that PLP(4r) is feasible. More precisely, because \(\mathcal {Q}_{\mathcal {H}}(4r)=\emptyset \), the linear program obtained from DLP(4r) by replacing \(\mathcal {F}(4r)\), which parameterizes the constraints in DLP(4r), by \(\mathcal {H}\), has optimal value equal to 0. Hence, its dual, which corresponds to PLP(4r) where we replace \(\mathcal {F}(4r)\) by \(\mathcal {H}\), is feasible. As this feasible linear program has polynomial size, because \(|\mathcal {H}|\) is polynomially bounded, we can solve it efficiently to obtain a distribution with the desired properties. Moreover, the total running time is \({{\,\mathrm{poly}\,}}(L) \cdot |X|^{O(\gamma )}\), where L is the encoding length of the input. \(\square \)
Proof of Lemma 5
The desired separation algorithm requires us to find a solution for a \(\upgamma \mathrm {C k C}\) instance with an extra covering constraint; the procedure of Sect. 2 generalizes to handle this extra constraint. We follow similar steps as in Fig. 1.
Let \((\alpha ,\mu ) \in \mathbb {Q}_{\ge 0}^{X} \times \mathbb {Q}\) be a point satisfying \(\sum _{u\in X}p(u)\alpha (u) \ge \mu + 1\), let \(r \ge 0\), and, moreover, let
\[\mathcal {F}^{\alpha ,\mu }(r) := \left\{ C\in \mathcal {F}(r) : \sum _{u\in B(C,r)} \alpha (u) > \mu \right\} .\]
Hence, to prove Lemma 5, we need to find a procedure that either certifies \(\mathcal {F}^{\alpha ,\mu }(r)=\emptyset \) or returns a set \(C\in \mathcal {F}^{\alpha ,\mu }(4r)\). To avoid technical complications later on due to the strict inequality in the definition of \(\mathcal {F}^{\alpha ,\mu }(r)\), we observe, using standard techniques, that one can efficiently compute a polynomially encoded \(\varepsilon >0\) to replace the inequality \(\sum _{u\in B(C,r)}\alpha (u) > \mu \) by \(\sum _{u\in B(C,r)} \alpha (u) \ge \mu + \varepsilon \).
Lemma 6
Let \((\alpha ,\mu ) \in \mathbb {Q}_{\ge 0}^X \times \mathbb {Q}\). Then one can efficiently compute an \(\varepsilon > 0\) with encoding length O(L), where L is the encoding length of \((\alpha ,\mu )\), such that the following holds: For any \(C\in \mathcal {F}(r)\), we have \(\sum _{u\in B(C,r)}\alpha (u) >\mu \) if and only if \(\sum _{u\in B(C,r)} \alpha (u) \ge \mu +\varepsilon \).
Proof
The tuple \((\alpha ,\mu )\) consists of \(N := |X|+1\) rationals \(\frac{p_1}{q_1}, \ldots , \frac{p_N}{q_N}\), with \(p_i \in \mathbb {Z}\) and \(q_i \in \mathbb {Z}_{>0}\). Let \(\Pi = \prod _{i \in [N]} q_i\). Note that if \(\sum _{u\in B(C,r)} \alpha (u) > \mu \), then \(\sum _{u\in B(C,r)} \alpha (u) - \mu \ge \frac{1}{\Pi }\), because the left-hand side is an integer multiple of \(\frac{1}{\Pi }\). Thus, we set \(\varepsilon := \frac{1}{\Pi }\). Moreover \(\log \Pi = \sum _{i \in [N]} \log q_i\), and so the encoding length of \(\varepsilon \) is O(L). \(\square \)
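The denominator-product argument in this proof can be checked on a small example. The sketch below assumes the weights are given as exact rationals (Python `Fraction` values or integers); the function name `strictness_gap` and the sample data are hypothetical.

```python
from fractions import Fraction
from math import prod


def strictness_gap(alpha, mu):
    """Return eps = 1 / (product of all denominators of alpha and mu).

    Every value sum(alpha[u] for u in T) - mu is an integer multiple of eps,
    so it is > 0 exactly when it is >= eps.
    Inputs are assumed to be exact rationals (Fraction or int), not floats.
    """
    denominators = [Fraction(a).denominator for a in alpha.values()]
    denominators.append(Fraction(mu).denominator)
    return Fraction(1, prod(denominators))


# Hypothetical weights for three points and a threshold mu.
alpha = {"u1": Fraction(1, 2), "u2": Fraction(1, 3), "u3": Fraction(3, 4)}
mu = Fraction(5, 6)
eps = strictness_gap(alpha, mu)
print(eps)
```

Using the product of the denominators rather than their least common multiple gives a smaller \(\varepsilon \) than strictly necessary, but keeps the encoding-length bound immediate.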
Let \(\mathcal {P}^{\alpha ,\mu }\) be the following modified relaxation of \(\upgamma \mathrm {C k C}\), defined for given \((\alpha , \mu ) \in \mathbb {Q}_{\ge 0}^X \times \mathbb {Q}\), and a corresponding \(\varepsilon > 0\) as per Lemma 6, where the polytope \(\mathcal {P}\) is defined for a fixed radius r, as in Sect. 2 (see (1)):
\[\mathcal {P}^{\alpha ,\mu } := \left\{ (x,y)\in \mathcal {P} : \sum _{u\in X} \alpha (u)x(u) \ge \mu + \varepsilon \right\} .\]
Let \(\mathcal {P}_{I}^{\alpha ,\mu }:={{\,\mathrm{conv}\,}}\left( \mathcal {P}^{\alpha ,\mu }\cap (\{0,1\}^X \times \{0,1\}^X)\right) \) be the integer hull of \(\mathcal {P}^{\alpha ,\mu }\). We now state the following straightforward observation, whose proof is an immediate consequence of the definitions of the corresponding polytopes and Lemma 6.
Observation 1
Let \((\alpha , \mu )\in \mathbb {Q}_{\ge 0}^X \times \mathbb {Q}\) be such that \(\sum _{u\in X} p(u)\alpha (u) \ge \mu + 1\) and \(\mathcal {P}_{I}^{\alpha ,\mu }=\emptyset \). Then \((\alpha , \mu )\in \mathcal {Q}(r)\).
The following lemma is a slightly modified version of Theorem 5, which is also a direct consequence of Theorem 8 given in Appendix A.
Lemma 7
Let \((\alpha ,\mu ) \in \mathbb {Q}_{\ge 0}^X\times \mathbb {Q}\), let \((x, y) \in \mathcal {P}^{\alpha ,\mu }\), and let \((S, \mathcal {D})\) be an (x, y)-good partition. If \(y(B(S, r)) \le k - \gamma \), a set \(C\in \mathcal {F}^{\alpha ,\mu }(4r)\) can be found in polynomial time.
If \(y(B(S,r)) \le k - \gamma \), then Lemma 7 leads to a set \(C \in \mathcal {F}(4r)\) that satisfies \(\sum _{u \in B(C,4r)} \alpha (u) > \mu \); this gives a constraint separating \((\alpha ,\mu )\) from \(\mathcal {Q}(4r)\).
It remains to consider the case \(y(B(S,r))>k-\gamma \). As in Sect. 2, we can either find a set \(C_2\in \mathcal {F}^{\alpha , \mu }(2r)\) or certify that every \(C_1\in \mathcal {F}^{\alpha ,\mu }(r)\) satisfies \(|C_1\cap B(S,r)|\le k-\gamma \).
Lemma 8
Let \((\alpha ,\mu ) \in \mathbb {Q}_{\ge 0}^X\times \mathbb {Q}\), \(S\subseteq X\) with \(d(s,s')>4r\) for all \(s, s' \in S\) with \(s\ne s'\), and \(\tau \in \{0, \ldots , k-1\}\). If there is a set \(C_1\in \mathcal {F}^{\alpha ,\mu }(r)\) with \(|C_1\cap B(S,r)|> \tau \), then there is a set \(C_2\in \mathcal {F}^{\alpha ,\mu }(2r)\) with \(|C_2 \setminus S| \le k - \tau - 1\).
The proof of the above lemma is identical to the proof of Lemma 2, and thus is omitted.
Lemma 9
Let \((\alpha ,\mu ) \in \mathbb {Q}_{\ge 0}^X\times \mathbb {Q}\), \(S\subseteq X\) with \(d(s,s')>4r\) for all \(s,s'\in S\) with \(s \ne s'\), and \(\beta \in \mathbb {Z}_{\ge 0}\). If there exists a set \(C\in \mathcal {F}^{\alpha ,\mu }(2r)\) with \(|C\setminus S|\le \beta \), then we can find such a set in time \(|X|^{O(\beta + \gamma )}\).
Proof
As in the proof of Lemma 3, we first guess up to \(\beta \) centers \(Q \subseteq X\setminus S\). For each of those guesses, we consider the binary program (2) with objective function \(\sum _{s\in S} z(s) \cdot \alpha (B(s,2r) \setminus B(Q,2r))\) to be maximized. Again, this is a special case of the binary program presented in Theorem 9, given in Appendix A, and thus can be solved in time \(|X|^{O(\gamma )}\). For the guess \(Q=C\setminus S\), the characteristic vector \(\chi ^{C\cap S}\) is feasible for this binary program, implying that the optimal centers \(Z\subseteq S\) chosen by the binary program fulfill \(Z\cup Q \in \mathcal {F}^{\alpha ,\mu }(2r)\). \(\square \)
Corollary 2
Let \((\alpha ,\mu ) \in \mathbb {Q}_{\ge 0}^X\times \mathbb {Q}\). There is an algorithm that, given \((x,y)\in \mathbb {R}^X \times \mathbb {R}^X\), either returns a set \(C\in \mathcal {F}^{\alpha ,\mu }(4r)\) or returns a hyperplane separating (x, y) from \(\mathcal {P}_{I}^{\alpha ,\mu }\). The running time of the algorithm is \({{\,\mathrm{poly}\,}}(L) \cdot |X|^{O(\gamma )}\), where L is the encoding length of the input.
Proof
If \((x,y)\notin \mathcal {P}^{\alpha ,\mu }\), we return a violated constraint separating (x, y) from \(\mathcal {P}^{\alpha ,\mu }\supseteq \mathcal {P}_{I}^{\alpha ,\mu }\). Hence we assume \((x,y)\in \mathcal {P}^{\alpha ,\mu }\). Since \(\mathcal {P}^{\alpha ,\mu }\subseteq \mathcal {P}\), we can use Theorem 1 to get an (x, y)-good partition \((S,\mathcal {D})\). If \(y(B(S,r))\le k-\gamma \), Lemma 7 gives a set \(C\in \mathcal {F}^{\alpha ,\mu }(4r)\). So, assuming \(y(B(S,r))> k-\gamma \), we use Lemma 9 (with \(\beta =\gamma -1\)) to check whether there is \(C_2\in \mathcal {F}^{\alpha ,\mu }(2r)\) with \(|C_2\setminus S| \le \gamma -1\). If this is the case, we are done because \(\mathcal {F}^{\alpha ,\mu }(2r)\subseteq \mathcal {F}^{\alpha ,\mu }(4r)\). If not, the contrapositive of Lemma 8 (with \(\tau =k-\gamma \)) implies that every \(C_1\in \mathcal {F}^{\alpha ,\mu }(r)\) fulfills \(|C_1\cap B(S,r)|\le k-\gamma \). Hence, every point \((\overline{x},\overline{y})\in \mathcal {P}_{I}^{\alpha ,\mu }\) satisfies \(\overline{y}(B(S,r))\le k-\gamma \). However, this constraint is violated by (x, y), and it thus separates (x, y) from \(\mathcal {P}_{I}^{\alpha ,\mu }\). \(\square \)
Proof of Lemma 5
We use the ellipsoid method to check emptiness of \(\mathcal {P}_{I}^{\alpha ,\mu }\). Whenever the separation oracle gets called for a point \((x,y)\in \mathbb {R}^X \times \mathbb {R}^X\), we invoke the algorithm of Corollary 2. If the algorithm returns at any point a set \(C\in \mathcal {F}^{\alpha ,\mu }(4r)\), then C corresponds to a constraint in the inequality description of \(\mathcal {Q}(4r)\) violated by \((\alpha ,\mu )\). Otherwise, the ellipsoid method certifies that \(\mathcal {P}_{I}^{\alpha ,\mu }=\emptyset \), which implies \((\alpha ,\mu )\in \mathcal {Q}(r)\) by Observation 1. Note that the number of iterations of the ellipsoid method is polynomial as the separating hyperplanes used by the procedure above have encoding length \({{\,\mathrm{poly}\,}}(L)\), where L is the encoding length of the input (see Theorem 6.4.9 of [17]). Thus, the total running time is \({{\,\mathrm{poly}\,}}(L) \cdot |X|^{O(\gamma )}\). \(\square \)
Hardness results for Colorful k-Center
We now prove our hardness results. We start in Subsect. 4.1 by showing Theorem 3, i.e., that \(\upgamma \mathrm {C k C}\) becomes hard to approximate when the number of colors is unbounded. Then, in Subsect. 4.2, we prove Theorem 4, which shows our bicriteria inapproximability result, i.e., there is an approximation hardness even when one is allowed to exceed the number of centers to be opened by up to a factor \(c\log |X|\) for some constant c.
We note that all of our hardness results apply even to real-line metrics. These are \(\upgamma \mathrm {C k C}\) instances where the underlying metric is given by a set of real numbers \(X \subseteq \mathbb {R}\), and the distance function d is defined as \(d(x,y) = |x-y|\) for every \(x,y \in X\). The task that we prove to be hard is distinguishing whether such an instance admits a solution of radius 0 or not.
We start by discussing a reduction from the well-known Set Cover problem to \(\upgamma \mathrm {C k C}\) on the real line. More precisely, we will show that deciding whether a given Set Cover instance has a solution of size at most k is equivalent to deciding whether a certain real-line \(\upgamma \mathrm {C k C}\) instance admits a solution of radius 0. We note that the reduction is a straightforward adaptation of the reduction appearing in [22] in the context of the Partial Set Cover problem in geometric settings. For completeness, we first define the (decision version of the) Set Cover problem.
Definition 4
Let U be a finite set, let \(\mathcal {S} \subseteq 2^U\) be a family of subsets of U, and let \(k\in \mathbb {Z}_{\ge 0}\). The (decision) Set Cover problem, denoted as \({{\,\mathrm{SC}\,}}(U,\mathcal {S}, k)\), asks to decide whether there exists a subset \(\mathcal {S}' \subseteq \mathcal {S}\) such that \(|\mathcal {S}'| \le k\) and \(\bigcup _{S \in \mathcal {S}'} S = U\).
The following lemma, mimicking the ideas in [22], shows a simple yet very useful reduction from Set Cover to \(\upgamma \mathrm {C k C}\).
Lemma 10
Let \({{\,\mathrm{SC}\,}}(U,\mathcal {S},k)\) be a Set Cover instance. Then, in time polynomial in \(|U|\) and \(|\mathcal {S}|\), we can construct a real-line \(\upgamma \mathrm {C k C}\) instance with \(|X|=|\mathcal {S}|\) points and \(\gamma =|U|\) colors such that \({{\,\mathrm{SC}\,}}(U,\mathcal {S},k)\) is a “yes” instance if and only if the \(\upgamma \mathrm {C k C}\) instance admits a solution of radius 0. Moreover, any \(\upgamma \mathrm {C k C}\) solution of radius 0 can be mapped efficiently to a \({{\,\mathrm{SC}\,}}(U,\mathcal {S},k)\) solution.
This reduction is independent of the parameter k, in the sense that for different values of k, the same \(\upgamma \mathrm {C k C}\) instance is obtained with the only difference that the number k of centers one can open is different.
Proof
We construct a \(\upgamma \mathrm {C k C}\) instance as follows. Let \(\gamma = |U|\) and \(s = |\mathcal {S}|\). Let \(U = \{u_1,\ldots , u_\gamma \}\) and \(\mathcal {S} = \{S_1, \ldots , S_s\}\). We set \(X = \{1, \ldots , s\} \subseteq \mathbb {R}\). Each element \(u_\ell \in U\) corresponds to a distinct color \(X_\ell = \{i \in [s]: u_\ell \in S_i\}\). We also set the covering requirement for each color \(\ell \in [\gamma ]\) to be \(m_\ell = 1\). Note that none of \(X, \gamma , X_\ell , m_\ell \) depend on k. Clearly, the construction can be done in time polynomial in \(|U|\) and \(|\mathcal {S}|\).
We now observe that the given \({{\,\mathrm{SC}\,}}(U,\mathcal {S},k)\) is a “yes” instance if and only if the constructed \(\upgamma \mathrm {C k C}\) instance admits a solution of radius 0. Indeed, if \(C \subseteq X\) is a \(\upgamma \mathrm {C k C}\) solution of radius 0, then the set \(\mathcal {S}' = \{S_i: i \in C\}\) is a feasible solution of the Set Cover instance of size \(|\mathcal {S}'| = |C|\le k\). Conversely, if \(\mathcal {S}'\subseteq \mathcal {S}\) is a Set Cover solution of size \(|\mathcal {S}'|\le k\), then \(C = \{i\in X :S_i \in \mathcal {S}'\}\) is a \(\upgamma \mathrm {C k C}\) solution of radius 0 with \(|C| = |\mathcal {S}'|\le k\) many centers. \(\square \)
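The construction in this proof is short enough to write out. The following sketch builds the real-line instance from a Set Cover instance, checks a radius-0 solution, and maps it back to a cover. The function names and the small example instance are assumptions for illustration; the encoding (point i for set \(S_i\), one color per element, all requirements equal to 1) follows the proof.

```python
def set_cover_to_colorful(universe, sets):
    """Build the real-line instance of Lemma 10 from a Set Cover instance.

    Point i (1-based) stands for the set S_i; color X_l collects the
    points whose sets contain element u_l; every requirement m_l is 1.
    """
    points = list(range(1, len(sets) + 1))
    colors = {u: {i for i in points if u in sets[i - 1]} for u in universe}
    requirements = {u: 1 for u in universe}
    return points, colors, requirements


def is_radius_zero_solution(centers, colors, requirements, k):
    """With radius 0 a center covers only itself, so B(C, 0) = C."""
    centers = set(centers)
    return len(centers) <= k and all(
        len(centers & colors[u]) >= requirements[u] for u in colors
    )


def centers_to_cover(centers, sets):
    """Map a radius-0 solution back to the chosen sets."""
    return [sets[i - 1] for i in sorted(centers)]


# Hypothetical Set Cover instance.
universe = ["a", "b", "c", "d"]
sets = [{"a", "b"}, {"b", "c"}, {"c", "d"}, {"d"}]
points, colors, requirements = set_cover_to_colorful(universe, sets)
print(is_radius_zero_solution({1, 3}, colors, requirements, k=2))
```

The budget k enters only in the feasibility check, which mirrors the remark that the constructed instance itself does not depend on k.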
Hardness of approximation for \(\upgamma \mathrm {C k C}\).
In this section, we prove our main hardness result, Theorem 3. For that, we reduce from the well-known Vertex Cover problem on graphs of maximum degree 3 and cast it as a \(\upgamma \mathrm {C k C}\) problem. We first formally define the problem.
Definition 5
Let \(G = (V, E)\) be a graph of maximum degree 3 and let \(k \in \mathbb {Z}_{\ge 0}\). The Vertex Cover problem on such a graph, denoted as \({{\,\mathrm{VC3}\,}}(G, k)\), asks to decide whether there exists a set \(S \subseteq V\) of size at most k such that \(S \cap e \ne \emptyset \) for every \(e \in E\).
Notice that Vertex Cover is a special case of Set Cover; hence, we can employ the reduction highlighted in Lemma 10 to obtain a \(\upgamma \mathrm {C k C}\) problem. Reducing from a Vertex Cover problem of bounded degree, instead of starting from a general Set Cover problem, has the advantage that the cardinality of a minimum Vertex Cover in bounded degree graphs has, up to constant factors, the same size as the underlying ground set, which is the edge set in case of Vertex Cover. This relation is relevant in our reduction to derive a contradiction with the Exponential Time Hypothesis.
In order to prove Theorem 3, we will use the following hardness results for \({{\,\mathrm{VC3}\,}}(G, k)\).
Theorem 6

(i)
There is no algorithm for \({{\,\mathrm{VC3}\,}}(G, k)\) that runs in polynomial time, assuming that \(\mathtt {P}\ne \mathtt {NP}\).

(ii)
There is no algorithm for \({{\,\mathrm{VC3}\,}}(G,k)\) that runs in time \(2^{o(k)} {{\,\mathrm{poly}\,}}(|V(G)|)\), assuming the Exponential Time Hypothesis.
In our proof of Theorem 3, we reduce \({{\,\mathrm{VC3}\,}}\) to \(\upgamma \mathrm {C k C}\) using Lemma 10 and then derive hardness of \(\upgamma \mathrm {C k C}\) from the hardness given by Theorem 6. Whereas this approach proves the first part of Theorem 3 in a straightforward way, it faces a technical hurdle for the second part. More precisely, note that the second part can be rephrased as follows: the existence of a function \(f:\mathbb {Z}_{\ge 0} \rightarrow \mathbb {Z}_{\ge 0}\) with \(f(n) = \omega (\log n)\) together with a polynomial-time algorithm \(\mathcal {A}\) for \(\upgamma \mathrm {C k C}\) on the real line with \(\gamma \le f(|X|)\) violates the Exponential Time Hypothesis. However, by reducing a general \({{\,\mathrm{VC3}\,}}\) instance to \(\upgamma \mathrm {C k C}\) through Lemma 10, we may obtain a \(\upgamma \mathrm {C k C}\) instance that does not fulfill \(\gamma \le f(|X|)\), which is required to apply algorithm \(\mathcal {A}\), as algorithm \(\mathcal {A}\) only needs to work on instances in this regime. Indeed, the reduction of Lemma 10 would only allow us to use algorithm \(\mathcal {A}\) to obtain a polynomial-time algorithm \(\mathcal {A}'\) for Set Cover instances \({{\,\mathrm{SC}\,}}(U,\mathcal {S},k)\) with \(|U|\le f(|\mathcal {S}|)\); in particular, we would only be able to solve \({{\,\mathrm{VC3}\,}}\) instances whose underlying graph \(G = (V,E)\) satisfies \(|E| \le f(|V|)\). However, \(\mathcal {A}'\) can easily be transformed into an algorithm working for any \({{\,\mathrm{VC3}\,}}\) instance by artificially inflating the vertex set V to make sure that \(|E|\) is small compared to \(|V|\). The following lemma formalizes this quite straightforward, though slightly technical, step.
Lemma 11
Let \(f: \mathbb {Z}_{\ge 0} \rightarrow \mathbb {Z}_{\ge 0}\) be a function satisfying \(f(n) = \omega (\log n)\). Suppose that there exists an algorithm \(\mathcal {A}'\) that solves in polynomial time any \({{\,\mathrm{VC3}\,}}(G', k')\) instance with \(|E(G')|\le f(|V(G')|)\). Then there is an algorithm \(\mathcal {A}\) that solves any \({{\,\mathrm{VC3}\,}}(G,k)\) instance in time \(2^{o(|E(G)|)} {{\,\mathrm{poly}\,}}(|V(G)|)\).
Proof
Let \(\mathcal {I} = {{\,\mathrm{VC3}\,}}(G,k)\) be a Vertex Cover instance on a graph of maximum degree 3. To be able to apply \(\mathcal {A}'\) to \(\mathcal {I}\) we would need \(|E(G)|\le f(|V(G)|)\). If this is satisfied, we simply apply \(\mathcal {A}'\). Hence, assume from now on \(|E(G)| > f(|V(G)|)\). In this case we create a modified \({{\,\mathrm{VC3}\,}}\) instance \(\overline{\mathcal {I}} = {{\,\mathrm{VC3}\,}}(\overline{G},k)\) obtained by inflating \(\mathcal {I}\) through the addition of singleton vertices as discussed in the following. Because \(f(n) = \omega (\log n)\), there is a constant \(n_0\in \mathbb {Z}_{>0}\) and a nondecreasing function \(h:\mathbb {Z}_{\ge 0}\rightarrow \mathbb {Z}_{> 0}\) with

(i)
\(\lim _{n\rightarrow \infty } h(n) = \infty \), and

(ii)
\(f(n) \ge h(n) \cdot \log n \quad \forall n\in \mathbb {Z}_{\ge n_0}\).
Without loss of generality, we assume that \(|V(G)|\ge n_0\); for otherwise, the instance \(\mathcal {I}\) has constant size and can therefore be solved in constant time. We add
new singleton vertices to the \({{\,\mathrm{VC3}\,}}\) instance \(\mathcal {I}\) to obtain a new blown-up \({{\,\mathrm{VC3}\,}}\) instance \(\overline{\mathcal {I}}={{\,\mathrm{VC3}\,}}(\overline{G},k)\) that is equivalent to \(\mathcal {I}\) because the introduced singleton vertices are not incident with any edges.
Hence, the new Vertex Cover instance \(\overline{\mathcal {I}}\) fulfills
Notice that
where the above inequalities follow by the properties of the function h, including that h is nondecreasing, and (3). Hence, algorithm \(\mathcal {A}'\) is applicable to \(\overline{\mathcal {I}}\) and, because \(\overline{\mathcal {I}}\) and \(\mathcal {I}\) are equivalent instances, \(\mathcal {A}'\) solves the original instance \(\mathcal {I}\). Finally, the running time to construct and solve \(\overline{\mathcal {I}}\) through \(\mathcal {A}'\) is upper bounded by
where we used the fact that \(h(n) = \omega (1)\).
We highlight that the function h(n) does not need to be known or computed explicitly to perform the reduction. By our choice of N, the number of vertices \(|V(\overline{G})|\) in the blown-up \({{\,\mathrm{VC3}\,}}\) instance \(\overline{\mathcal {I}}\) is either \(|V(G)|\) or a power of two between \(|V(G)|\) and \(2^{|E(G)|}\). Hence, one can simply run \(\mathcal {A}'\) in parallel for each of the polynomially many options of the size of the blown-up instance and terminate as soon as the first one of these parallel computations terminates. \(\square \)
We are now ready to prove Theorem 3.
Proof of Theorem 3
The first part of the theorem is an immediate consequence of part (i) of Theorem 6 and Lemma 10.
For the second part, let \(f: \mathbb {Z}_{\ge 0} \rightarrow \mathbb {Z}_{\ge 0}\) be a function that satisfies \(f(n) = \omega (\log n)\) and assume for the sake of contradiction that there is a polynomial-time algorithm \(\mathcal {B}\) for \(\upgamma \mathrm {C k C}\) on the real line with \(\gamma \le f(|X|)\). Then, by Lemma 10, there exists a polynomial-time algorithm \(\mathcal {A}'\) for Vertex Cover instances \({{\,\mathrm{VC3}\,}}(G,k)\) satisfying \(|E(G)|\le f(|V(G)|)\). By Lemma 11, this implies the existence of an algorithm \(\mathcal {A}\) for solving (arbitrary) \({{\,\mathrm{VC3}\,}}(G,k)\) instances in time \(2^{o(|E(G)|)} {{\,\mathrm{poly}\,}}(|V(G)|)\).
To obtain a contradiction with part (ii) of Theorem 6 (assuming the Exponential Time Hypothesis), it remains to show that this implies the existence of an algorithm for \({{\,\mathrm{VC3}\,}}(G,k)\) running in time \(2^{o(k)} {{\,\mathrm{poly}\,}}(|V(G)|)\). Given a \({{\,\mathrm{VC3}\,}}(G,k)\) instance, we proceed as follows. Because G has no vertex of degree larger than 3, any vertex cover in G must have cardinality at least \(|E(G)|/3\). Hence, if \(k < |E(G)|/3\), we know that \({{\,\mathrm{VC3}\,}}(G,k)\) is a “no” instance. Otherwise, if \(k \ge |E(G)|/3\), the running time of algorithm \(\mathcal {A}\) is \(2^{o(|E(G)|)} {{\,\mathrm{poly}\,}}(|V(G)|) = 2^{o(k)} {{\,\mathrm{poly}\,}}(|V(G)|)\), thus leading to the desired contradiction under the Exponential Time Hypothesis.
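The lower bound on the size of a vertex cover used in this argument follows from a standard double-counting step, spelled out here as a short sketch. It uses only that every edge has an endpoint in the cover S and that every vertex has degree at most 3.

```latex
% Every edge is incident to some vertex of the cover S,
% and each vertex of S is incident to at most 3 edges.
\[
  |E(G)| \;\le\; \sum_{v \in S} \deg_G(v) \;\le\; 3\,|S|
  \qquad\Longrightarrow\qquad
  |S| \;\ge\; \frac{|E(G)|}{3}.
\]
% Consequently, k >= |E(G)|/3 gives |E(G)| <= 3k, so o(|E(G)|) = o(k).
```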
Hardness for bicriteria algorithms
In this section, we extend the hardness result stated in Theorem 3 to bicriteria algorithms. For this, we reduce from the optimization version of the Set Cover problem, which we refer to as the Minimum Cardinality Set Cover problem to distinguish it from the decision version used earlier. For completeness, we define it formally below.
Definition 6
(Minimum Cardinality Set Cover (\({{\,\mathrm{MCSC}\,}}\))) Let U be a finite set and \(\mathcal {S}\subseteq 2^U\) be a family of subsets of U. The Minimum Cardinality Set Cover problem \({{\,\mathrm{MCSC}\,}}(U,\mathcal {S})\) asks to compute the smallest subset \(\mathcal {S}' \subseteq \mathcal {S}\) such that \(\bigcup _{S \in \mathcal {S}'} S = U\).
\({{\,\mathrm{MCSC}\,}}\) is a well-understood \(\mathtt {NP}\)-hard problem. We are interested in its approximation hardness, which, after a long series of works, was settled by Dinur and Steurer [13]; we state their result as Theorem 7. We note that since we are not interested in optimizing the constant that appears in the main theorem of this section, any known \(\Omega (\log n)\)-hardness result for \({{\,\mathrm{MCSC}\,}}\) suffices to derive Theorem 4, proved below.
Theorem 7
([13]) For every \(\varepsilon > 0\), it is \(\mathtt {NP}\)-hard to approximate \({{\,\mathrm{MCSC}\,}}\) for instances with universe size n and \(m \le {{\,\mathrm{poly}\,}}(n)\) sets to within a factor of \((1 - \varepsilon ) \ln n\).
Combining Theorem 7 with Lemma 10 leads to the desired result.
Proof of Theorem 4
Suppose that, for some constant \(c>0\) to be determined later, there exists an algorithm \(\mathcal {A}\) for \(\upgamma \mathrm {C k C}\) on the real line that, whenever a solution of radius 0 exists, finds a solution of radius 0 by opening at most \(k\cdot c \cdot \log |X|\) many centers, where X is the set of points on which \(\upgamma \mathrm {C k C}\) is defined. We now translate this algorithm to \({{\,\mathrm{MCSC}\,}}\) using Lemma 10. To this end, consider an instance \(\mathcal {I}={{\,\mathrm{MCSC}\,}}(U,\mathcal {S})\) with \(|\mathcal {S}|\le {{\,\mathrm{poly}\,}}(|U|)\), where the polynomial \({{\,\mathrm{poly}\,}}(|U|)\) is the one from Theorem 7. Let \(k^*\) be the optimal value of \(\mathcal {I}\).
For every \(k \in \{0, \ldots , \min \{|U|, |\mathcal {S}|\}\}\), we use the reduction of Lemma 10 to get a real-line \(\upgamma \mathrm {C k C}\) instance and run \(\mathcal {A}\) on it. For \(k = k^*\), the resulting \(\upgamma \mathrm {C k C}\) instance, by Lemma 10, has a feasible solution of size at most \(k^*\), and thus, for this instance our algorithm will return a solution of size at most \(k^* \cdot c \cdot \log |\mathcal {S}|\). Because \(|\mathcal {S}| \le {{\,\mathrm{poly}\,}}(|U|)\), this means that the returned Set Cover has size at most \(k^* \cdot c' \cdot \log |U|\), for some constant \(c' > 0\) that depends on c and the hidden universal constants in the \(|\mathcal {S}| \le {{\,\mathrm{poly}\,}}(|U|)\) assumption. Thus, by considering all constructed \(\upgamma \mathrm {C k C}\) instances (which only differ by their value of k) for which a solution was returned and picking the smallest such solution, we obtain a set cover of size at most \(k^* \cdot c' \cdot \log |U|\). By setting the constant c appropriately (it is easy to see that this can always be done for sufficiently small c), this now contradicts Theorem 7. We conclude that it is \(\mathtt {NP}\)-hard to decide whether a \(\upgamma \mathrm {C k C}\) instance has a solution of radius 0, even if we allow solutions that open up to \(k \cdot c \cdot \log |X|\) centers.
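The outer loop of this argument, which tries every budget k and keeps the smallest cover returned, can be sketched as follows. The callable `oracle` is a hypothetical stand-in for algorithm \(\mathcal {A}\) composed with the reduction of Lemma 10; since no such algorithm is given, the sketch plugs in a brute-force toy oracle (with no blow-up in the number of sets) purely so that the loop can be run.

```python
from itertools import combinations


def smallest_cover_via_oracle(universe, sets, oracle):
    """Run the oracle for every budget k and keep the smallest cover returned.

    `oracle(universe, sets, k)` is a hypothetical stand-in for algorithm A
    combined with the reduction of Lemma 10: it returns a list of set
    indices covering the universe, or None.
    """
    best = None
    for k in range(min(len(universe), len(sets)) + 1):
        cover = oracle(universe, sets, k)
        if cover is not None and (best is None or len(cover) < len(best)):
            best = cover
    return best


def exact_oracle(universe, sets, k):
    """Toy oracle (brute force): any cover using at most k sets, else None."""
    for size in range(k + 1):
        for idx in combinations(range(len(sets)), size):
            if set().union(*(sets[i] for i in idx)) >= set(universe):
                return list(idx)
    return None


# Hypothetical instance.
universe = [1, 2, 3, 4, 5]
sets = [{1, 2, 3}, {3, 4}, {4, 5}, {1, 5}]
best = smallest_cover_via_oracle(universe, sets, exact_oracle)
print(best)
```

With a bicriteria oracle in place of `exact_oracle`, the cover returned for \(k = k^*\) would have size at most \(k^* \cdot c \cdot \log |\mathcal {S}|\), and taking the minimum over all k can only improve on it.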
Conclusion
In this work, we presented a technique for obtaining true constant-factor approximation algorithms for k-center problems with multiple covering constraints on the points to be covered. This leads to a polynomial-time 4-approximation algorithm for \(\gamma \)-Colorful k-Center, where \(\gamma \), the number of colors, is assumed to be constant, as well as a polynomial-time 4-approximation algorithm for the more general Fair \(\gamma \)-Colorful k-Center problem.
We note here that our results extend to the supplier setting, where there are distinct sets of facilities and clients, and one is allowed to open k facilities in order to cover clients. For such settings, we obtain a polynomial-time 5-approximation algorithm for the Fair \(\gamma \)-Colorful k-Supplier problem. The extension of our arguments to this setting is done by using a standard technique: we first find clients C that constitute a 4-approximate solution to the corresponding Center problem and then pick a facility \(f_c\in B(c,r)\) for each \(c\in C\). Using the notation introduced in the description of Algorithm 1, we note that terminating Algorithm 1 once \(\max _{u\in U} x(u) = 0\) does not affect the remaining steps in our approximation algorithms. Hence we may assume that \(x(s)>0\) for all \(s\in S\), which guarantees the existence of a facility in B(s, r). We also clarify that the “guessing a few centers” part of our algorithm performed in Lemma 9 can be applied directly to facilities with no issues arising.
On the negative side, we show that Colorful k-Center is inapproximable when the number of colors is assumed to be part of the input.
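The factor 5 in the supplier setting comes from one application of the triangle inequality. As a sketch, assume a client u is covered by some \(c\in C\) within distance 4r and that the chosen facility satisfies \(f_c\in B(c,r)\):

```latex
% u is covered by c at radius 4r, and f_c lies within distance r of c.
\[
  d(u, f_c) \;\le\; d(u, c) + d(c, f_c) \;\le\; 4r + r \;=\; 5r .
\]
```

Hence every client covered by C at radius 4r is covered by the opened facilities at radius 5r.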
There are still some open questions remaining; we highlight two of them, which we find particularly natural and interesting:

(i)
The currently known hardness of \(\gamma \)-Colorful k-Center is \(2-\varepsilon \), inherited from the standard k-Center problem, while (for constant \(\gamma \)) we give a polynomial-time 4-approximation, and, as already mentioned, in an independent work, Jia, Sheth, and Svensson [23] give a polynomial-time 3-approximation with a worse running time. It would be interesting to close this gap.

(ii)
\(\gamma \)-Colorful k-Center naturally generalizes to knapsack and matroid versions, where the set of centers that are opened must satisfy a knapsack or a matroid constraint. Currently, our technique does not easily generalize to such settings, so new ideas might be needed to handle these problems.
Notes
 1.
The version introduced in [3] requires \(X_1,\ldots , X_\gamma \) to partition X. However, this additional condition on the input does not simplify the problem. Indeed, \(\upgamma \mathrm {C k C}\) readily reduces to the model in [3] by introducing a new color \(X_{\gamma + 1} = X \setminus \bigcup _{i \in [\gamma ]}X_i\) with \(m_{\gamma + 1} = 0\) and replacing each element that has \(q > 1\) colors by q elements on the same location with each having a single color.
 2.
 3.
Note that the only reason why this is a slight abuse of terminology is because we defined (x, y)-good partitions only for points in \(\mathcal {P}\). Moreover, contrary to \(\mathcal {T}\), the description of the polytope \(\mathcal {P}\) contains specific constraints for the covering requirements of the colors. However, these constraints did not play any role in showing that Algorithm 1 returns an (x, y)-good partition (see proof of Lemma 1).
References
 1.
An, H.C., Singh, M., Svensson, O.: LP-based algorithms for capacitated facility location. SIAM J. Comput. 46(1), 272–306 (2017)
 2.
Backurs, A., Indyk, P., Onak, K., Schieber, B., Vakilian, A., Wagner, T.: Scalable fair clustering. In: Proceedings of the 36th International Conference on Machine Learning (ICML), pp. 405–413 (2019)
 3.
Bandyapadhyay, S., Inamdar, T., Pai, S., Varadarajan, K.R.: A constant approximation for Colorful \(k\)-Center. In: Proceedings of the 27th Annual European Symposium on Algorithms (ESA), pp. 12:1–12:14 (2019)
 4.
Bera, S.K., Chakrabarty, D., Flores, N., Negahbani, M.: Fair algorithms for clustering. In: Proceedings of the 33rd International Conference on Neural Information Processing Systems (NeurIPS), pp. 4955–4966 (2019)
 5.
Bercea, I.O., Groß, M., Khuller, S., Kumar, A., Rösner, C., Schmidt, D.R., Schmidt, M.: On the cost of essentially fair clusterings. In: Proceedings of the 22nd International Conference on Approximation Algorithms for Combinatorial Optimization Problems (APPROX/RANDOM), pp. 18:1–18:22 (2019)
 6.
Cai, L., Juedes, D.W.: On the existence of subexponential parameterized algorithms. J. Comput. Syst. Sci. 67(4), 789–807 (2003)
 7.
Carr, R.D., Fleischer, L.K., Leung, V.J., Phillips, C.A.: Strengthening integrality gaps for capacitated network design and covering problems. In: Proceedings of the 11th Annual ACM-SIAM Symposium on Discrete Algorithms (SODA), pp. 106–115 (2000)
 8.
Chakrabarty, D., Goyal, P., Krishnaswamy, R.: The non-uniform \(k\)-center problem. In: Proceedings of the 43rd International Colloquium on Automata, Languages, and Programming (ICALP), pp. 67:1–67:15 (2016)
 9.
Chakrabarty, D., Negahbani, M.: Generalized center problems with outliers. ACM Trans. Algorithm. 15(3), 41:1–41:14 (2019)
 10.
Charikar, M., Khuller, S., Mount, D.M., Narasimhan, G.: Algorithms for facility location problems with outliers. In: Proceedings of the 12th Annual Symposium on Discrete Algorithms (SODA), pp. 642–651 (2001)
 11.
Chen, D.Z., Li, J., Liang, H., Wang, H.: Matroid and knapsack center problems. Algorithmica 75(1), 27–52 (2016)
 12.
Chierichetti, F., Kumar, R., Lattanzi, S., Vassilvitskii, S.: Fair clustering through fairlets. In: Proceedings of the 31st International Conference on Neural Information Processing Systems (NIPS), pp. 5029–5037 (2017)
 13.
Dinur, I., Steurer, D.: Analytical approach to parallel repetition. In: Proceedings of the 46th Annual Symposium on the Theory of Computing (STOC), pp. 624–633 (2014)
 14.
Garey, M.R., Johnson, D.S., Stockmeyer, L.: Some simplified NP-complete problems. In: Proceedings of the 6th Annual ACM Symposium on Theory of Computing (STOC), pp. 47–63 (1974)
 15.
Gonzalez, T.F.: Clustering to minimize the maximum intercluster distance. Theor. Comput. Sci. 38, 293–306 (1985)
 16.
Grandoni, F., Kalaitzis, C., Zenklusen, R.: Improved approximation for tree augmentation: Saving by rewiring. In: Proceedings of the 50th ACM Symposium on Theory of Computing (STOC), pp. 632–645 (2018)
 17.
Grötschel, M., Lovász, L., Schrijver, A.: Geometric Algorithms and Combinatorial Optimization, 2nd edn. Springer, Germany (1993)
 18.
Harris, D.G., Pensyl, T., Srinivasan, A., Trinh, K.: A lottery model for center-type problems with outliers. ACM Trans. Algorithm. 15(3), 36:1–36:25 (2019)
 19.
Hochbaum, D.S., Shmoys, D.B.: A best possible heuristic for the k-center problem. Math. Op. Res. 10(2), 180–184 (1985)
 20.
Hochbaum, D.S., Shmoys, D.B.: A unified approach to approximation algorithms for bottleneck problems. J. ACM 33(3), 533–550 (1986)
 21.
Hsu, W., Nemhauser, G.L.: Easy and hard bottleneck location problems. Discret. Appl. Math. 1(3), 209–215 (1979)
 22.
Inamdar, T., Varadarajan, K.R.: On the partition set cover problem (2018). https://arxiv.org/abs/1809.06506
 23.
Jia, X., Sheth, K., Svensson, O.: Fair colorful k-center clustering. In: Proceedings of the 21st International Conference on Integer Programming and Combinatorial Optimization (IPCO), pp. 209–222 (2020)
 24.
Levi, R., Lodi, A., Sviridenko, M.: Approximation algorithms for the capacitated multi-item lot-sizing problem via flow-cover inequalities. Math. Op. Res. 33(2), 461–474 (2008)
 25.
Li, S.: Approximating capacitated \(k\)-median with \((1+\epsilon )k\) open facilities. In: Proceedings of the 27th Annual ACM-SIAM Symposium on Discrete Algorithms (SODA), pp. 786–796 (2016)
 26.
Li, S.: On uniform capacitated \(k\)-median beyond the natural LP relaxation. ACM Trans. Algorithm. 13(2), 22:1–22:18 (2017)
 27.
Nutov, Z.: On the tree augmentation problem. In: Proceedings of the 25th Annual European Symposium on Algorithms (ESA), pp. 61:1–61:14 (2017)
 28.
Rösner, C., Schmidt, M.: Privacy preserving clustering with constraints. In: Proceedings of the 45th International Colloquium on Automata, Languages, and Programming (ICALP), pp. 96:1–96:14 (2018)
Funding
Open Access funding provided by ETH Zurich.
This project received funding from the European Research Council (ERC) under the European Union’s Horizon 2020 research and innovation programme (grant agreement No 817750) and the Swiss National Science Foundation grants 200021_184622 and PZ00P2_174117. This research was conducted while the second author was at ETH Zurich. A preliminary version of this work was presented at the 21st Conference on Integer Programming and Combinatorial Optimization (IPCO 2020). An independent work of Jia, Sheth, and Svensson [23], presented at the same venue, gave a 3-approximation for Colorful k-Center with constantly many colors using different techniques.
Appendices
Technical theorems
Theorem 8
([3]) Let (X, d) be a finite metric space, and suppose that the following polytope
is not empty, where \(k \in \{1, \ldots , |X|\}\), \(t \in \mathbb {Z}_{\ge 0}\), \(a_\ell \in \mathbb {R}_{\ge 0}^X\) and \(b_{\ell } \in \mathbb {R}_{\ge 0}\) for every \(\ell \in [t]\), and \(r \ge 0\). Let \((x,y) \in \mathcal {T}\), and let \((S,\mathcal {D})\) be a partition obtained by running Algorithm 1 with input (x, y). Then, if \(y(B(S, r)) \le k - t + 1\), we can find in polynomial time a set \(C \subseteq X\) with \(|C| \le k\) satisfying \(a_\ell (B(C,4r)) \ge b_\ell \) for all \(\ell \in [t]\).
Proof
Let \(S = \{s_1, \ldots , s_q\}\) and \(\mathcal {D} = \{D_1, \ldots , D_q\}\) be the partition obtained by running Algorithm 1 with input (x, y). It is easy to see that \((S,\mathcal {D})\) satisfies all three properties of an (x, y)-good partition, and so, by slightly abusing terminology, we will call it an (x, y)-good partition (see Note 3). We now assume that \(q \ge k + 1\), since otherwise, the set of centers S is already a feasible solution, as \(B(S,4r) = X\). We claim that the simplified LP given below is feasible and has optimal value at most y(B(S, r)), where \(a_\ell (D_i) := \sum _{u \in D_i} a_\ell (u)\).
\[\begin{array}{rll} \min & \sum _{i \in [q]} z_i & \\ \text {s.t.} & \sum _{i \in [q]} a_\ell (D_i)\, z_i \ge b_\ell & \forall \ell \in [t], \\ & 0 \le z_i \le 1 & \forall i \in [q]. \end{array} \qquad (4)\]
This is indeed the case because we can construct a feasible point to the above LP with objective value at most y(B(S, r)) as follows. Let \(z_i = \min \{1, y(B(s_i,r))\}\) for all \(i\in [q]\). Because \((S, \mathcal {D})\) is an (x, y)-good partition, property 3 of Definition 3 implies that
\[\sum _{i \in [q]} a_\ell (D_i)\, z_i \ge \sum _{i \in [q]} \sum _{u \in D_i} a_\ell (u)\, x(u) \ge b_\ell \qquad \forall \ell \in [t]\]
(here we also use the fact that \(x(u) \le 1\) for all \(u \in X\), as \((x,y) \in \mathcal {T}\)), i.e., z is a feasible solution of the above LP, and its objective value is \(\sum _{i \in [q]} z_i \le y(B(S, r))\).
Suppose now that the hypothesis holds, i.e., \(y(B(S,r)) \le k - t + 1\). In particular, as \(y(B(S,r)) \ge 0\), this means that \(t \le k + 1 \le q\). Note that if \(t = k + 1\), then \(y(B(S,r)) = 0\), which, by the greediness of Algorithm 1, further implies that \(b_\ell = 0\) for every \(\ell \in [t]\). Such a case is trivial, as we can simply set \(C :=\emptyset \). Thus, from now on, we assume that \(t \le k < q\). By the above discussion, the optimal value of the above simplified LP is at most \(k - t + 1\). We consider an optimal extreme point solution \(z^*\) of LP (4). A standard sparsity argument implies that \(z^*\) has at most t fractional variables. Indeed, \(z^*\) is defined by q linearly independent and tight constraints of (4), among which at most t are not of type \(z_i\ge 0\) or \(z_i \le 1\). Hence, there are at least \(q-t\) \(z^*\)-tight constraints of (4) of type \(z_i^* =0\) or \(z_i^*=1\). This in turn implies that \(z^*\) has at most t fractional components.
Furthermore, the number of strictly positive components of \(z^*\) is at most k. To see this, note that if \(k-t+1\) components of \(z^*\) are equal to 1, all other entries must be 0 because \(z^*\) is an optimal solution to (4), which has objective value no more than \(k-t+1\). Otherwise, there are at most \(k-t\) variables that are equal to 1 and, together with at most t fractional variables, there are at most k strictly positive entries. Therefore, the set of centers \(C=\{s_i \in X \mid z^*_i>0\} \subseteq S\) has size at most k and satisfies \(a_\ell (B(C,4r)) \ge b_\ell \) for all \(\ell \in [t]\), because \(\bigcup _{c \in C} B(c,4r) \supseteq \bigcup _{i: z^*_i>0} D_i\), as \(D_i \subseteq B(s_i, 4r)\) for all \(i\in [q]\). \(\square \)
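To make the counting argument at the end of the proof concrete, here is a small worked instance. The values \(k = 5\) and \(t = 2\) are hypothetical and chosen only for illustration.

\[k = 5,\quad t = 2:\qquad \sum _{i \in [q]} z^*_i \le k - t + 1 = 4, \qquad \#\{i : 0< z^*_i < 1\} \le t = 2.\]

If four components of \(z^*\) equal 1, the bound on the objective value forces every other component to be 0, giving 4 strictly positive entries. Otherwise, at most \(k - t = 3\) components equal 1, and together with at most 2 fractional components this gives at most \(3 + 2 = 5 = k\) strictly positive entries. In both cases \(|C| \le k\).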
For completeness, we now discuss how the dynamic programming problems appearing in our approaches can be solved in the claimed running time.
Theorem 9
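Consider the following binary program:
\[\max \Big \{ \sum _{t \in [q]} w(t)\, z(t) \;:\; \sum _{t \in [q]} a_\ell (t)\, z(t) \ge m_\ell \ \ \forall \ell \in [\gamma ],\ \ \sum _{t \in [q]} z(t) \le \kappa ,\ \ z \in \{0,1\}^q \Big \},\]
where \(\gamma \in \mathbb {Z}_{\ge 1}\), \(w \in \mathbb {R}_{\ge 0}^q\), \(a_\ell \in \{0, \ldots , M\}^q\) and \(m_\ell \in \{0, \ldots , M\}\) for all \(\ell \in [\gamma ]\), where M is some positive integer, and \(\kappa \in [q]\). Then, the above program can be solved in time \(O(\gamma q^2 M^\gamma )\).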
Proof
The above binary program can be solved using standard dynamic programming techniques. More precisely, we define the following DP table. For every \(i \in \{0, \ldots , q\}\), \(M_\ell \in \{0, \ldots , m_\ell \}\) for every \(\ell \in [\gamma ]\), and \(j \in \{0, \ldots , \kappa \}\), let \(A[i, M_1, \ldots , M_\gamma , j]\) be the maximum objective value of any vector \(z\in \{0,1\}^q\) that satisfies

(i) \(\{t \in [q]: \; z(t) = 1\} \subseteq [i]\),
(ii) \(\sum _{t = 1}^q z(t) \le j\), and
(iii) \(\sum _{t = 1}^q a_\ell (t)\cdot z(t) \ge M_\ell \) for every \(\ell \in [\gamma ]\).
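Initialization is easy to define. For all nontrivial tuples \([i, M_1, \ldots , M_\gamma , j]\), by setting \(b_\ell :=\max \{M_\ell - a_\ell (i), 0\}\) for every \(\ell \in [\gamma ]\), we have
\[A[i, M_1, \ldots , M_\gamma , j] = \max \big \{ A[i-1, M_1, \ldots , M_\gamma , j],\ \ w(i) + A[i-1, b_1, \ldots , b_\gamma , j-1] \big \}.\]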
By observing the range of each parameter of the above table, we get that there are \(O(q \kappa M^\gamma )\) table entries in total. Moreover, for each entry we need to compute \(\gamma \) auxiliary quantities \(b_1, \ldots , b_\gamma \), and so we conclude that each entry can be computed in time \(O(\gamma )\). Thus, the DP can be solved in time \(O(\gamma q^2 M^\gamma )\), where we used the fact that \(\kappa \le q\). \(\square \)
We remark that the \(O(\gamma )\) update time per table entry in the above proof can be reduced to O(1) amortized update time per table entry through a more careful analysis. However, the resulting slight reduction in running time from \(O(\gamma q^2 M^{\gamma })\) to \(O(q^2 M^{\gamma })\) is irrelevant for our purposes.
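The dynamic program from the proof of Theorem 9 can be sketched in Python as follows. This is a minimal illustration rather than the authors' implementation: the function name `solve_covering_dp`, the use of \(-\infty\) to mark infeasible table entries, and the concrete initialization (for \(i = 0\), only the all-zero demand vector is satisfiable) are assumptions made here. Only two layers of the table are kept in memory at a time.

```python
from itertools import product


def solve_covering_dp(w, a, m, kappa):
    """Maximize sum(w[t] * z[t]) over z in {0,1}^q subject to
    sum(z) <= kappa and sum(a[l][t] * z[t]) >= m[l] for every l.
    Returns the optimal value, or None if the program is infeasible."""
    q, gamma = len(w), len(m)
    neg_inf = float("-inf")
    demand_ranges = [range(m_l + 1) for m_l in m]
    # Layer i = 0: no item available, so only zero demands can be met.
    prev = {
        M: [0.0 if not any(M) else neg_inf] * (kappa + 1)
        for M in product(*demand_ranges)
    }
    for i in range(1, q + 1):
        cur = {}
        for M in product(*demand_ranges):
            # Residual demands b_l = max(M_l - a_l(i), 0) if item i is taken.
            b = tuple(max(M[l] - a[l][i - 1], 0) for l in range(gamma))
            row = [prev[M][0]]  # j = 0: item i cannot be selected
            for j in range(1, kappa + 1):
                skip = prev[M][j]
                take = w[i - 1] + prev[b][j - 1]
                row.append(max(skip, take))
            cur[M] = row
        prev = cur
    best = prev[tuple(m)][kappa]
    return None if best == neg_inf else best


print(solve_covering_dp([3, 1, 2], [[1, 0, 2], [0, 2, 1]], [2, 1], 2))
```

In this small hypothetical instance, selecting the first and third items meets both covering requirements with two items, so the call evaluates to 5.0. The loop visits \(O(q \kappa \prod _\ell (m_\ell + 1))\) entries and spends \(O(\gamma )\) time on each, in line with the bound in the proof.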
A limiting example for the framework of Chakrabarty and Negahbani [9]
A natural way to extend the approach of [9] is the following procedure. Given a point \((x,y)\in \mathbb {R}^X\times \mathbb {R}^X\), we first run Algorithm 1 (with balls of radius 2 at each step) to get a partition of X, and then we use dynamic programming to decide whether it is possible to select at most k clusters of this partition so that the covering requirements for all colors are satisfied. Such a selection, if it exists, gives a 2-approximation. If there is no such selection, we want to return a hyperplane separating (x, y) from \(\mathcal {P}_{I}\), as in [9].
However, there is an instance and a point (x, y), given below, such that neither the partition will lead to a solution nor is it possible to separate (x, y) from \(\mathcal {P}_{I}\). Thus any such procedure needs to deal with this limitation.
In Fig. 2, we present an instance of \(\gamma \)-Colorful k-Center with \(\gamma =k=2\) in the one-dimensional Euclidean space; hence \(X\subseteq \mathbb {R}\). There are two colors, red and blue; the red points are represented as red circles and the blue points as blue squares. The color covering requirements are \(m_1=m_2=3\). It is easy to see that there are no integral solutions of radius 0, hence any solution with radius 1 is optimal. We consider two different optimal solutions:

\(C_1 = \{1, M + 1\}\) with corresponding clustering \(\mathcal {C}_1 = \left\{ \{1,2\}, \{M+1, M+2 \} \right\} \),

\(C_2 = \{4, M+4\}\) with corresponding clustering \(\mathcal {C}_2 = \left\{ \{3,4\}, \{M+3, M+4\} \right\} \).
We clarify that in the above, we slightly abuse notation; if there are multiple points in a location, we only pick one of them as a center, while in the corresponding clustering, all points in a covered location participate in the clustering. It is easy to verify that the above clusterings are indeed feasible solutions of radius 1, and thus, they are optimal solutions.
We now define the fractional solution \((x,y) \in \mathbb {R}^X \times \mathbb {R}^X\), where \(x=\frac{1}{2} \left( \chi ^{\mathcal {C}_1}+\chi ^{\mathcal {C}_2}\right) \) and \(y=\frac{1}{2} \left( \chi ^{C_1} + \chi ^{C_2}\right) \). Observe that we have \(x(u)=\frac{1}{2}\) for all \(u\in X\).
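The fractional point defined above can be computed at the level of locations with a few lines of Python. This is an illustrative sketch only: the value \(M = 100\) and the function name `fractional_point` are hypothetical, and multiplicities and colors of the points at each location (shown in Fig. 2) are not modelled.

```python
from fractions import Fraction


def fractional_point(M=100):
    # M = 100 is a hypothetical choice; the text does not fix a value for M.
    locations = [1, 2, 3, 4, M + 1, M + 2, M + 3, M + 4]
    centers = [{1, M + 1}, {4, M + 4}]  # C_1 and C_2
    covered = [
        {1, 2, M + 1, M + 2},  # locations covered by clustering C_1
        {3, 4, M + 3, M + 4},  # locations covered by clustering C_2
    ]
    half = Fraction(1, 2)
    # x and y are the averages of the two indicator vectors.
    x = {u: half * sum(u in cov for cov in covered) for u in locations}
    y = {u: half * sum(u in cen for cen in centers) for u in locations}
    return x, y


x, y = fractional_point()
print(x)
print(y)
```

Every location receives coverage value 1/2, and the opening values sum to \(k = 2\), as expected for an average of two integral solutions with two centers each.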
In the above example, given the defined point (x, y) as input, Algorithm 1 may return the indicated partitioning \(\{D_1, D_2, D_3, D_4\}\). We stress here that there are ties, and in order to get this partitioning we resolve them adversarially. Note that there is no specified way to resolve such ties in [9], and it seems highly unclear how to design a procedure that always breaks ties in a good way, even if there is a good way to break them. Observe now that no combination of two of these resulting clusters satisfies the covering requirement, so the partitioning does not lead to a solution. However, we cannot possibly find an appropriate separating hyperplane because \((x,y)\in \mathcal {P}_{I}\) by construction.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Anegg, G., Angelidakis, H., Kurpisz, A. et al. A technique for obtaining true approximations for k-center with covering constraints. Math. Program. (2021). https://doi.org/10.1007/s10107-021-01645-y
Keywords
 Approximation algorithms
 k-Center
 Clustering
 Polyhedral techniques
Mathematics Subject Classification
 90C27
 68W40
 68Q25
 90C05