Algorithmica, Volume 73, Issue 3, pp 483–510

A Quantization Framework for Smoothed Analysis of Euclidean Optimization Problems

Abstract

We consider the smoothed analysis of Euclidean optimization problems. Here, input points are sampled according to density functions that are bounded by a sufficiently small smoothness parameter \(\phi \). For such inputs, we provide a general and systematic approach that allows us to design linear-time approximation algorithms whose output is asymptotically optimal, both in expectation and with high probability. Applications of our framework include maximum matching, maximum TSP, and the classical problems of k-means clustering and bin packing. Apart from generalizing corresponding average-case analyses, our results extend and simplify a polynomial-time probable approximation scheme for multidimensional bin packing on \(\phi \)-smooth instances, where \(\phi \) is constant (Karger and Onak in Polynomial approximation schemes for smoothed and random instances of multidimensional packing problems, pp 1207–1216, 2007). Both techniques and applications of our rounding-based approach are orthogonal to the only other framework for smoothed analysis of Euclidean problems we are aware of (Bläser et al. in Algorithmica 66(2):397–418, 2013).

Keywords

Smoothed analysis · Euclidean optimization problems · Bin packing · Maximum matching · Maximum traveling salesman problem

1 Introduction

Smoothed analysis was introduced by Spielman and Teng [34] to give a theoretical foundation for analyzing the practical performance of algorithms. In particular, this analysis paradigm was able to explain why the simplex method is observed to run fast in practice despite its exponential worst-case running time. For a detailed overview, we refer to two surveys on smoothed analysis [30, 33].

The key concept of smoothed analysis, i.e., letting an adversary choose worst-case distributions of bounded “power” to determine input instances, is especially well-motivated in a Euclidean setting. Here, input points are typically determined by physical measurements, which are subject to an inherent inaccuracy, e.g., from locating a position on a map. For clustering problems, it is often even implicitly assumed that the points are sampled from unknown probability distributions which are sought to be recovered.

Making the mentioned assumptions explicit, we call a problem smoothed tractable if it admits a linear-time algorithm with an approximation ratio that is bounded by \(1-o(1)\) with high probability over the input distribution specified by the adversary. Such an approximation performance is called asymptotically optimal. We provide a unified approach to show that several Euclidean optimization problems are smoothed tractable, which sheds light onto the properties that render a Euclidean optimization problem likely to profit from perturbed input.

We employ the one-step model, a widely-used and very general perturbation model, which has been successfully applied to analyze a number of algorithms [10, 12, 13, 19]. In this model, an adversary chooses probability densities on the input space, according to which the input instance is drawn. To prevent the adversary from modeling a worst-case instance too closely, we bound the density functions from above by a parameter \(\phi \). Roughly speaking, for large \(\phi \), we expect the algorithm to perform almost as badly as on worst-case instances. Likewise, choosing \(\phi \) as small as possible requires the adversary to choose the uniform distribution on the input space, corresponding to an average-case analysis. Thus, the adversarial power \(\phi \) serves as an interpolation parameter between worst and average case.

Formally, given a set of feasible distributions \(\mathcal{F}\) that depends on \(\phi \), and a performance measure t, we define the smoothed performance of an algorithm under the perturbation model \(\mathcal{F}\) as
$$\begin{aligned} \max _{f_{1},\dots ,f_{n}\in \mathcal{F}}\;\mathop {\mathrm {E}}_{(X_{1},\ldots ,X_{n})\sim (f_{1},\dots ,f_{n})}[t(X_{1},\dots ,X_{n})]. \end{aligned}$$
In this work, we will be concerned with analyzing the smoothed approximation ratio, as well as obtaining bounds on the approximation ratio that hold with high probability over the random perturbations.

For given \(\phi \), we require the density functions chosen by the adversary to be bounded by \(\phi \). For real-valued input, this includes the possibility of adding uniform noise in an interval of length \({1}/{\phi }\) or Gaussian noise with standard deviation \(\sigma =\varTheta ({1}/{\phi })\). In the Euclidean case, the adversary could, e.g., specify for each point a box of volume at least \({1}/{\phi }\), in which the point is distributed uniformly.
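To make the model concrete, the following minimal sketch (an illustration only, not one of the paper's algorithms) draws a \(\phi \)-smooth instance in which the adversary places each point uniformly in an axis-aligned cube of volume \(1/\phi \); here the cubes themselves are chosen at random purely for the sake of the example.

```python
import numpy as np

def smoothed_instance(n, d, phi, rng=None):
    """Draw n points in [0,1]^d from a phi-bounded adversary of the one-step model:
    each point is uniform in an axis-aligned cube of volume 1/phi, so its density
    equals phi on that cube and 0 elsewhere (assumes phi >= 1)."""
    rng = np.random.default_rng() if rng is None else rng
    side = (1.0 / phi) ** (1.0 / d)              # cube side length
    corners = rng.random((n, d)) * (1.0 - side)  # adversarial cube corners (random here)
    return corners + side * rng.random((n, d))   # X_i uniform in its cube
```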

1.1 Related Work

Recently, Bläser, Manthey and Rao [12] established a framework for analyzing the expectation of both running times and approximation ratios for some partitioning algorithms on so-called smooth and near-additive functionals. We establish a substantially different framework for smoothed analysis on a general class of Euclidean functionals that is disjoint from the class of smooth and near-additive functionals (see Sect. 7 for further discussion). We contrast both frameworks by considering the maximization counterparts of two problems studied in [12], namely Euclidean matching and TSP. Our algorithms have the advantage of deterministic running times and asymptotically optimal approximation guarantees both in expectation and with high probability.

All other related works are problem-specific and will be described in the corresponding sections. As an exception, we highlight the result of Karger and Onak [27], who studied bin packing. To the best of our knowledge, this is the only problem that fits into our framework and has already been analyzed under perturbation. In their paper, a linear-time algorithm for bin packing was given that is asymptotically optimal on instances smoothed with any constant \(\phi \) and on instances in which each input point is drawn from an identical but arbitrary probability density function. We provide a new, conceptually simpler rounding method and analysis that replaces a key step of their algorithm and puts the reasons for its smoothed tractability into a more general context.

1.2 Our Results

We provide very fast and simple approximation algorithms on sufficiently smoothed inputs for the following problems: The maximum Euclidean matching problem \(\mathrm {MaxM}\), the maximum Euclidean Traveling Salesman problem \(\mathrm {MaxTSP}\), the k-means clustering problem \(\mathrm {KMeans}\) where k denotes the number of desired clusters and is part of the input, and the d-dimensional bin packing problem \(\mathrm {BP}_{d}\). The approximation ratio converges to one with high probability over the random inputs. Additionally, all of these algorithms can be adapted to yield asymptotically optimal expected approximation ratios as well. This generalizes corresponding average-case analysis results [18, 28].

Almost all our algorithms allow trade-offs between running time and approximation performance: by choosing a parameter p within its feasible range, we obtain algorithms of running time \(O(n^{p})\), whose approximation ratios converge to 1 as \(n\rightarrow \infty \), provided that \(\phi \) is small enough, where the restriction on \(\phi \) depends on p. The general trade-offs for our algorithms are listed in Table 2; the special case of linear-time algorithms is summarized in Table 1.

1.3 Organization of the Paper

In Sect. 3, we describe the core of the framework by introducing the notion of quantizability. Sects. 4 and 5 then provide two methods, grid quantization and balanced quantization, for verifying that a functional is quantizable. These methods are applied to maximum matching, the maximum traveling salesman problem and k-means clustering. In Sect. 6, we apply the grid quantization method to multidimensional bin packing.
Table 1  All (near) linear-time algorithms derived in our framework

Problem | Running time | Restriction on adversary power | Reference
\(\mathrm {MaxM}\) | O(n) | \(\phi =o\left( \root 4 \of {n}\right) \) or \(\phi =o\left( n^{\frac{1}{2}\frac{d}{d+2}-\varepsilon }\right) \) | Sections 4.1 and 5.2
\(\mathrm {MaxTSP}\) | O(n) | \(\phi =o\left( \root 4 \of {n}\right) \) or \(\phi =o\left( n^{\frac{1}{2}\frac{d}{d+2}-\varepsilon }\right) \) | Sections 4.2 and 5.2
\(\mathrm {KMeans}\) | O(n) | \(k\phi =o\left( n^{\frac{1}{2}\frac{1}{kd+1}\frac{d}{d+1}}\right) \) | Section 5.1
\(\mathrm {BP}_{1}\) | \(O(n\log n)\) | \(\phi =o(n^{1-\varepsilon })\) | Section 6
\(\mathrm {BP}_{d}\) | O(n) | \(\phi =o\left( \root d(d+1) \of {\log \log n/\log ^{(3)}n}\right) \) | Section 6

Table 2  Our results

Problem | Running time | Approximation ratio | Constraints
\(\mathrm {MaxM}(X)\) | \(O(n^{p})\) | \(1-O\left( \root d \of {\frac{\phi }{n^{p/4}}}\right) \) | \(1\le p<4\)
\(\mathrm {MaxM}(X)\) | \(O(n^{p})\) | \(1-O\left( \frac{\root d \of {\phi }}{n^{\frac{p}{2}\frac{1}{d+2}-\varepsilon }}\right) \) | \(1\le p\le 2\left( 1+\frac{1}{d+1}\right) \), \(\varepsilon >0\)
\(\mathrm {MaxTSP}(X)\) | \(O(n^{p})\) | \(1-O\left( \root d \of {\frac{\phi }{n^{p/4}}}\right) \) | \(1\le p\le 4\left( 1-\frac{1}{d+1}\right) \)
\(\mathrm {MaxTSP}(X)\) | \(O(n^{p})\) | \(1-O\left( \frac{\root d \of {\phi }}{n^{\frac{p}{2}\frac{1}{d+2}-\varepsilon }}\right) \) | \(1\le p\le 2\left( 1+\frac{1}{d+1}\right) \), \(\varepsilon >0\)
\(\mathrm {KMeans}(X;k)\) | \(O(n^{p})\) | \(1-O\left( \frac{(k\phi )^{2/d}}{n^{\frac{p}{(kd+1)(d+1)}}}\right) \) | \(1\le p<kd+1\), \(k=O(\log n/\log \log n)\)
\(\mathrm {KMeans}(X;k)\) | \(O(n^{p})\) | \(1-O\left( \frac{k^{\frac{2}{d}-\frac{3}{2(d+1)}}\phi ^{2/d}}{n^{\frac{p}{2k(d+1)}-\varepsilon }}\right) \) | \(1\le p<2k\), \(k=O(\log n/\log \log n)\)
\(\mathrm {BP}_{1}(X)\) | \(O(n\log n)\) | \(1-\log n/n^{\varepsilon }-O(\phi /n^{1-\varepsilon })\) | \(\varepsilon >0\)
\(\mathrm {BP}_{d}(X)\) | O(n) | \(1-O\left( \frac{\phi ^{d+1}}{\root d \of {\frac{\log \log n}{\log ^{(3)}n}}}\right) \) | 

2 Preliminaries

Given an n-tuple of density functions \(f=(f_{1},\dots ,f_{n})\) and random variables \(X=(X_{1},\ldots ,X_{n})\), we write \(X\sim f\) for drawing \(X_{i}\) according to \(f_{i}\) for \(1\le i\le n\). We call \(Y=(Y_{1},\dots ,Y_{n})\) a \(\delta \)-rounding of X if \(\left\| X_{i}-Y_{i}\right\| \le \delta \) for all \(1\le i\le n\). For a given X, let \(\mathcal{Y}_{X}^{\delta }\) be the set of \(\delta \)-roundings of X. We will frequently round members of a set C to their center of mass \(\mathrm {cm}(C):=\frac{1}{|C|}\sum _{c\in C}c\). For a collection of points C, its diameter is defined as \(\mathrm {diam}(C):=\max _{c,c'\in C} \left\| c-c'\right\| \).

We will analyze Euclidean functionals \(F:([0,1]^{d})^{*}\rightarrow \mathbb {R}\), denoting the dimension of the input space by a constant \(d\in \mathbb {N}\) independent of n. For formalizing the perturbation model, let \(\phi :\mathbb {N}\rightarrow [1,\infty )\) be an arbitrary function measuring the adversary’s power. For better readability, we usually write \(\phi \) instead of \(\phi (n)\). We define \(\mathcal{F}_{\phi }\) to be the set of feasible probability density functions \(f:[0,1]^{d}\rightarrow [0,\phi ]\). Hence \(\mathcal{F}_{\phi }^{n}:=\mathcal{F}_{\phi (n)}^{n}\) is the set from which a \(\phi \)-bounded adversary may choose the input distributions.

Note that if \(\phi =1\), the set \(\mathcal{F}_{\phi }\) only consists of the uniform distribution on \([0,1]^{d}\), which constitutes an average-case analysis. If however \(\phi =n\), the adversary may specify disjoint boxes for each point. Intuitively, to obtain a particular worst-case instance, the adversary would need to specify Dirac delta functions, which corresponds figuratively to setting \(\phi \) to infinity. Observe also that already \(\phi =\omega (1)\) suffices to let all possible locations of an individual point \(X_{i}\) converge to a single point as the number of input points increases, hence we believe that a superconstant \(\phi \) is especially interesting to analyze.

We will often exploit the following standard argument in smoothed analyses. For a \(\phi \)-bounded adversary, the probability that a specific input point \(X_{i}\) is contained in a ball \(B_{r}(c)\) with radius \(r\in \mathbb {R}_{\ge 0}\) and center \(c\in \mathbb {R}^{d}\) is bounded by \(\phi \cdot \mathrm {vol}(B_{r}(c))=\phi \cdot v(d)\cdot r^{d}\), where the constant v(d) depends only on d.

For a given Euclidean functional F, we analyze the approximation ratio \(\rho \) of approximation algorithms \(\mathrm {ALG}\). If the functional is induced by an optimization problem, we do not focus on constructing a feasible approximate solution, but rather on computing an approximation of the objective value. However, we adopt this simplification only for clarity of presentation. Each of the discussed algorithms can be tuned such that it also outputs a feasible approximate solution for the underlying optimization problem. The approximation ratio on instance X is defined as \(\rho (X)=\min \left\{ \frac{\mathrm {ALG}(X)}{F(X)},\frac{F(X)}{\mathrm {ALG}(X)}\right\} \), which allows us to handle both maximization and minimization problems at once.

For analyzing running times, we assume the word RAM model of computation and reveal real-valued input by reading in words of \(w\ge \log n\) bits in unit time per word. We call an approximation algorithm a probable \(g_{\phi }(n)\)-approximation on smoothed instances if \(\rho (X)\ge g_{\phi }(n)\) with high probability, i.e., with probability \(1-o(1)\), when X is drawn from any \(f\in \mathcal{F}_{\phi }^{n}\). The algorithms derived in our framework feature deterministic running times \(t(n)=\mathrm {poly}(n)\) and asymptotically optimal approximation ratios \(g_{\phi }(n)\), i.e., \(g_{\phi }(n)\rightarrow 1\) for \(n\rightarrow \infty \), if \(\phi \) is small enough.

2.1 Tools from Probability Theory

We will make use of the following tools from probability theory, see, e.g., [31]. The first lemma is a simple variant of the Chernoff bounds, giving high concentration results of sums of independent variables.

Lemma 2.1

(Chernoff bound) Let \(X_{1},\ldots ,X_{n}\) be independent random variables in [0, 1]. Let \(X:=\sum _{i=1}^{n}X_{i}\). Then, for any \(0<\delta <1\), it holds that
$$\begin{aligned} \Pr [X\le (1-\delta )\mathrm {E}[X]]\le \exp (-\delta ^{2}\mathrm {E}[X]/2). \end{aligned}$$

The second lemma gives tail bounds for more general functions f of independent random variables. These f are required to have a bounded difference when only a single random variable changes its outcome.

Lemma 2.2

(Azuma’s inequality) Let \(X_{1},\dots ,X_{n}\) be independent random variables with \(X_{k}\) taking values in a set \(A_{k}\) for each k. Let \(f:A_{1}\times \ldots \times A_{n}\rightarrow \mathbb {R}\) and \(c_{1},\ldots ,c_{n}\in \mathbb {R}\) be such that \(|f(x)-f(x')|\le c_{k}\) whenever the vectors x and \(x'\) differ only in the kth coordinate. Let \(\mu \) be the expected value of the random variable f(X). Then for any \(t\ge 0\), it holds that
$$\begin{aligned} \Pr [\left| f(X)-\mu \right| \ge t]\le 2\exp \left( -2t^{2}/\sum _{k=1}^{n}c_{k}^{2}\right) . \end{aligned}$$

3 Framework

Our framework builds on the notion of quantizable functionals. These are functionals that admit fast approximation schemes on perturbed instances using general rounding strategies. The idea is to round an instance of n points to a quantized instance of \(\ell (n)\ll n\) points, each equipped with a multiplicity. This quantized input has a smaller problem size, which allows us to compute an approximation faster than on the original input. However, the objective function needs to be large enough such that the loss incurred by rounding is negligible.

We aim at a trade-off between running time and approximation performance. As it will turn out, varying the number \(\ell (n)\) of quantized points on an instance of n points makes this possible. Thus, we keep the function \(\ell \) variable in our definition. On instances of size n, we will write \(\ell :=\ell (n)\) for short.

Definition 3.1

Let \(d\ge 1\) and \(\mathcal{F}\) be a family of probability distributions \([0,1]^{d}\rightarrow \mathbb {R}_{\ge 0}\) that possibly depends on n. Let \(t,R,Q:\mathbb {N}\rightarrow \mathbb {R}\). We say that a Euclidean functional \(F:([0,1]^{d})^{*}\rightarrow \mathbb {R}_{\ge 0}\) is t-time (R, Q)-quantizable with respect to \(\mathcal{F}\) if, for any function \(\ell \) satisfying \(\ell =\omega (1)\) and \(\ell =o(n)\), there is a quantization algorithm A and an approximation functional \(g:([0,1]^{d}\times \mathbb {N})^{*}\rightarrow \mathbb {R}\) with the following properties.
  1. The quantization algorithm A runs in time O(n) and maps a collection of points \(X=(X_{1},\dots ,X_{n})\in [0,1]^{dn}\) to a multiset
     $$\begin{aligned} A(X)=X'=((X'_{1},n_{1}),\dots ,(X'_{\ell },n_{\ell })), \end{aligned}$$
     the quantized input, with \(X'_{i}\in [0,1]^{d}\) for each \(1\le i\le \ell \).
  2. On all inputs \(Y=A(X)\), the approximation functional g(Y) is computable in time \(t(\ell )+O(n)\) and, for any \(f\in \mathcal{F}^{n}\), fulfills
     $$\begin{aligned} \Pr _{X\sim f}[|F(X)-g(Y)|\le nR(\ell )]=1-o(1). \end{aligned}$$
  3. For any \(f\in \mathcal{F}^{n}\), we have
     $$\begin{aligned} \Pr _{X\sim f}\left[ F(X)\ge nQ(n)\right] =1-o(1). \end{aligned}$$

The following theorem states that quantizable functionals induce natural approximation algorithms on smoothed instances. We can thus restrict our attention to finding criteria that make a functional quantizable.

Theorem 3.2

Let \(\mathcal{F}\) be a family of probability distributions and F be \(t(\ell )\)-time \((R(\ell ),Q(n))\)-quantizable with respect to \(\mathcal{F}\). Then for every \(\ell \) with \(\ell =\omega (1)\) and \(\ell =o(n)\), there is an approximation algorithm \(\mathrm {ALG}\) with the following property. For every \(f\in \mathcal{F}^{n}\), the approximation \(\mathrm {ALG}(X)\) on the instance X drawn from f is a \((1-\frac{R(\ell )}{Q(n)})\)-approximation to F(X) with high probability. The approximation can be computed in time \(O(n+t(\ell ))\).

Proof

We can compute g(A(X)) in time \(O(n+t(\ell ))\). Let E be the event that \(|g(A(X))-F(X)|\le R(\ell )n\), which happens with probability \(1-o(1)\), and assume that E occurs. Note that we allow the approximation both to over- and to underestimate the functional, which in turn can be induced by either a minimization or a maximization problem. Hence, we do the following case distinction to bound \(\rho =\min \{\frac{g(A(X))}{F(X)},\frac{F(X)}{g(A(X))}\}\). If \(g(A(X))\le F(X)\), then
$$\begin{aligned} \rho =\frac{g(A(X))}{F(X)}\ge \frac{F(X)-R(\ell )n}{F(X)}=1-\frac{R(\ell )n}{F(X)}. \end{aligned}$$
Analogously for \(g(A(X))>F(X)\), we have
$$\begin{aligned} \frac{1}{\rho }=\frac{g(A(X))}{F(X)}\le \frac{F(X)+R(\ell )n}{F(X)} =1+\frac{R(\ell )n}{F(X)}. \end{aligned}$$
Thus with \((1+x)^{-1}=1-\frac{x}{1+x}\ge 1-x\), we conclude that \(\rho \ge 1-\frac{R(\ell )n}{F(X)}\) also holds in this case. This proves that
$$\begin{aligned} \Pr _{X\sim f}\left[ \left. \rho <1-\frac{R(\ell )}{Q(n)}\right| E\right] \le \Pr _{X\sim f}\left[ \left. F(X)<Qn\right| E\right] . \end{aligned}$$
By the law of total probability, the probability that g(A(X)) is not a \((1-\frac{R(\ell )}{Q(n)})\)-approximation to F(X) is thus bounded by
$$\begin{aligned} \Pr [\lnot E]+\Pr [E]\Pr _{X\sim f}\left[ F(X)<Qn\mid E\right] \le \Pr [\lnot E]+\Pr _{X\sim f}\left[ F(X)<Qn\right] , \end{aligned}$$
which tends to zero using conditions (2) and (3). \(\square \)
For all problems considered in this article, we also design algorithms whose expected approximation ratio converges to optimality in the sense that both \(\mathrm {E}[\rho ]\rightarrow 1\) (as a reasonable performance measure for maximization problems) and \(\mathrm {E}[\rho ^{-1}]\rightarrow 1\) (as a more appropriate guarantee for minimization problems). The first guarantee is established already by the framework algorithm, since Theorem 3.2 directly implies
$$\begin{aligned} \mathrm {E}[\rho ]\ge \Pr \left[ \rho \ge 1-\frac{R(\ell )}{Q(n)}\right] \left( 1-\frac{R(\ell )}{Q(n)}\right) \ge (1-o(1))\left( 1-\frac{R(\ell )}{Q(n)}\right) . \end{aligned}$$
The second guarantee will be ensured using auxiliary algorithms. A sufficient auxiliary algorithm for F is a linear-time algorithm approximating F within a constant factor \(0<c<1\). Outputting the better solution of our framework algorithm and the c-approximation does not increase the order of the running time, but achieves an approximation ratio of \(1-\frac{R(\ell )}{Q(n)}=1-o(1)\) with probability \(1-o(1)\) due to the previous theorem, yielding \(\mathrm {E}[\rho ]\rightarrow 1\), and still provides a constant approximation ratio of c on the remaining instances sampled with probability o(1). Thus, \(\mathrm {E}[\rho ^{-1}]\le (1-o(1))(1-\frac{R(\ell )}{Q(n)})^{-1}+o(1)c^{-1}\rightarrow 1\) holds as well.
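As a small illustration of this combination for a maximization problem, one simply reports the better of the two values; framework_value and constant_approx_value are placeholders for the framework algorithm and the linear-time c-approximation, respectively.

```python
def combined_value(X, framework_value, constant_approx_value):
    """Report the better of the two estimates: with probability 1 - o(1) the framework
    value is a (1 - R(l)/Q(n))-approximation, and on the remaining instances the
    constant-factor approximation still guarantees a ratio of at least c."""
    return max(framework_value(X), constant_approx_value(X))
```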

By a slight abuse of notation, we identify a multiset of points \(X'=((X'_{1},n_{1}),\dots ,(X'_{\ell },n_{\ell }))\in ([0,1]^{d}\times \mathbb {N})^{\ell }\) with a tuple \(X'\in ([0,1]^{d})^{*}\) in the canonical way.

4 Grid Quantization

Our first method for verifying quantizability is grid quantization. Here, the basic idea is to round the input to the centers of grid cells, where the coarseness of the grid is chosen according to the desired number of distinct points. This method works well for functionals that allow for fast optimal computations on their high-multiplicity version and provide a large objective value on the chosen perturbation model.

Let \(k\in \mathbb {N}\). By subdividing the d-dimensional unit cube \([0,1]^{d}\) into k equally long segments along each axis, we obtain the \(k^{d}\) cubes \(\mathcal{Q}_{(j_{1},\dots ,j_{d})}^{k}\), for \(1\le j_{1},\ldots ,j_{d}\le k\). We enumerate them in an arbitrary fashion \(\mathcal{Q}_{1}^{k},\dots ,\mathcal{Q}_{k^{d}}^{k}\).

Theorem 4.1

(Grid quantization) Let \(d\ge 1\), let \(Q:\mathbb {N}\rightarrow \mathbb {R}\) and let \(\mathcal{F}\) be a family of probability distributions \([0,1]^{d}\rightarrow \mathbb {R}_{\ge 0}\). Let \(F:([0,1]^{d})^{*}\rightarrow \mathbb {R}_{\ge 0}\) be a Euclidean functional with the following properties.
  1. On all quantized inputs \(X'=((X'_{1},n_{1}),\dots ,(X'_{\ell },n_{\ell }))\), the value \(F(X')\) can be computed in time \(t(\ell )+O(\sum _{i=1}^{\ell }n_{i})\). The algorithm may (i) assume \(\ell =k^{d}\) for some \(k\in \mathbb {N}\) and (ii) choose an arbitrary location for the distinct input points, as long as each \(X_{i}'\) is contained in its corresponding cube \(\mathcal{Q}_{i}^{k}\).
  2. There is a constant C such that with high probability, the functional differs by at most \(C\delta n\) on all \(\delta \)-roundings of an instance X drawn from any \(f\in \mathcal{F}^{n}\). Formally, for each \(\delta >0\) we require
     $$\begin{aligned} \Pr _{X\sim f}\left[ \forall Y\in \mathcal{Y}_{X}^{\delta }:|F(X)-F(Y)|\le C\delta n\right] =1-o(1). \end{aligned}$$
  3. For each \(f\in \mathcal{F}^{n}\), it holds that
     $$\begin{aligned} \Pr _{X\sim f}\left[ F(X)\ge nQ(n)\right] =1-o(1). \end{aligned}$$
Then F is \(t(\ell )\)-time \((O(\ell ^{-\frac{1}{d}}),Q(n))\)-quantizable with respect to \(\mathcal{F}\).

Proof

Assume that \(\ell =k^{d}\) for some \(k\in \mathbb {N}\) and consider the following algorithm \(\mathrm {GridQ}\). First, \(\mathrm {GridQ}\) determines, for each cube \(\mathcal{Q}_{i}^{k}\), the number \(n_{i}\) of input points contained in the cube. It then outputs, for each cube \(\mathcal{Q}_{i}^{k}\), some point \(q_{i}\in \mathcal{Q}_{i}^{k}\) weighted by \(n_{i}\), where \(q_{i}\) can be chosen arbitrarily, e.g., as the centroid of the cube.

This algorithm can be executed in time O(n) in the word RAM model of computation. To see this, assume for clarity of presentation that k is a power of two, hence \(1/k=2^{-b}\) for some natural number b. For any real number \(x\in [0,1]\), the corresponding interval \([i/k,(i+1)/k]\ni x\) can be determined by reading in the first b bits of x, yielding i. Since \(b=\log k\le \log n\le w\), reading in one chunk of each coordinate suffices and incurs a cost of d word operations per point.

Note that for any cube \(\mathcal{Q}_{i}^{k}\), we have \(\left\| x-q_{i}\right\| \le \frac{\sqrt{d}}{k}\) for all \(x\in \mathcal{Q}_{i}^{k}\). Hence by construction, \(Y:=\mathrm {GridQ}(X)\) satisfies \(\left\| Y_{i}-X_{i}\right\| \le \frac{\sqrt{d}}{k}\) for all \(1\le i\le n\), i.e., Y is a \(\delta \)-rounding of X for \(\delta :=\frac{\sqrt{d}}{k}\). By the rounding assumption (2) of the theorem,
$$\begin{aligned} \Pr \left[ |F(X)-F(Y)|\le \frac{C\sqrt{d}n}{k}\right]\ge & {} \Pr \left[ \forall Z\in \mathcal{Y}_{X}^{\delta }:|F(X)-F(Z)|\le \frac{C\sqrt{d}n}{k}\right] \\= & {} 1-o(1). \end{aligned}$$
By assumption, F(Y) can be computed in time \(t(\ell )+O(\sum _{i=1}^{\ell }n_{i})=t(\ell )+O(n)\). The lower bound condition (3) of Definition 3.1 is also satisfied by assumption.

For general \(\ell ,\) we choose k as the largest power of 2 with \(k^{d}\le \ell \) in the construction above. The claim follows from the observation that \(\frac{C\sqrt{d}}{k}=O(\ell ^{-1/d})\). \(\square \)
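A minimal Python sketch of the quantization step \(\mathrm {GridQ}\) described in the proof above; it uses floating-point arithmetic instead of the word-RAM bit extraction, and the cell centers stand in for the points \(q_{i}\).

```python
import numpy as np

def grid_quantize(X, ell):
    """GridQ sketch: count points per grid cell and return (cell center, multiplicity) pairs.

    X is an (n, d) array of points in [0,1]^d; ell is the target number of quantized points.
    k is chosen as the largest power of two with k^d <= ell.
    """
    X = np.asarray(X, dtype=float)
    n, d = X.shape
    k = max(1, 2 ** int(np.floor(np.log2(max(ell, 2)) / d)))
    cells = np.minimum((X * k).astype(int), k - 1)   # index of the cube containing each point
    counts = {}
    for idx in map(tuple, cells):
        counts[idx] = counts.get(idx, 0) + 1
    return [((np.array(idx) + 0.5) / k, n_i) for idx, n_i in counts.items()]
```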

In the remainder of this section, we apply the framework to two Euclidean maximization problems, namely maximum matching and maximum TSP. Both problems have already been analyzed in the average-case world, see, e.g., an analysis of the Metropolis algorithm on maximum matching in [36]. We generalize the result of Dyer et al. [18], who proved the asymptotic optimality of two simple partitioning heuristics for maximum matching and maximum TSP on the uniform distribution in the unit square. However, in contrast to our approach, their partitioning methods typically fail if the points are not identically distributed.

4.1 Maximum Matching

Let \(\mathrm {MaxM}(X)\) denote the maximum weight of a matching of the points \(X\subseteq [0,1]^{d}\), where the weight of a matching M is defined as the total length of matched edges \(\sum _{\{u,v\}\in M}\left\| u-v\right\| \). For the more general problem of finding maximum weighted matchings on general graphs with non-integer weights, the fastest known algorithm due to Gabow [23] runs in time \(O(mn+n^{2}\log n)\).

We aim to apply Theorem 4.1, for which we only need to check three conditions. The rounding condition (2) is easily seen to be satisfied by a straightforward application of the triangle inequality.

Property 4.2

Let \(Y=(Y_{1},\dots ,Y_{n})\) be a \(\delta \)-rounding of \(X=(X_{1},\dots ,X_{n})\). Then
$$\begin{aligned} |\mathrm {MaxM}(X)-\mathrm {MaxM}(Y)|\le n\delta . \end{aligned}$$

Proof

Let M be an optimal matching on X that we represent as a set of pairs of indices \(\{i,j\}\) rather than pairs of vertices \(\{X_{i},X_{j}\}\). By triangle inequality, we have
$$\begin{aligned} \sum _{\{i,j\}\in M}\left\| Y_{i}-Y_{j}\right\| \le \sum _{\{i,j\}\in M}\left( \left\| X_{i}-X_{j}\right\| +2\delta \right) \le n\delta +\sum _{\{i,j\}\in M}\left\| X_{i}-X_{j}\right\| . \end{aligned}$$
Symmetrically, \(\sum _{\{i,j\}\in M}\left\| X_{i}-X_{j}\right\| \le n\delta +\sum _{\{i,j\}\in M}\left\| Y_{i}-Y_{j}\right\| \). \(\square \)

The lower bound condition (3) is provided by the following lemma.

Lemma 4.3

There is a constant \(\gamma >0\) such that for all \(f\in \mathcal{F}_{\phi }^{n}\),
$$\begin{aligned} {\displaystyle \Pr _{X\sim f}}\left[ \mathrm {MaxM}(X)<\frac{\gamma n}{\root d \of {\phi }}\right] \le \exp (-\varOmega (n)). \end{aligned}$$

Proof

Let M be an arbitrary matching of the indices \(\{1,\dots ,n\}\). Consider any edge \(\{i,j\}\in M\). Let \(z\in [0,1]^{d}\) be arbitrary, then \(\Pr _{X_{j}}[\left\| X_{i}-X_{j}\right\| \le t\mid X_{i}=z]=\Pr _{X_{j}}[X_{j}\in B_{t}(z)]\le \phi \mathrm {vol}(B_{t}(0))=O(\phi t^{d}).\) We conclude that there is a value \(t=\varOmega (1/\root d \of {\phi })\), such that
$$\begin{aligned} \Pr _{X_{i},X_{j}}\left[ \left\| X_{i}-X_{j}\right\| \le t\right] \le \max _{z\in [0,1]^{d}}\Pr _{X_{j}}\left[ \left\| X_{i}-X_{j}\right\| \le t\mid X_{i}=z\right] \le \frac{1}{2}. \end{aligned}$$
Hence, the expected objective value is at least \(nt/4=\varOmega (n/\root d \of {\phi }).\) By the Chernoff bound of Lemma 2.1, with probability at most \(\exp (-n/8)\), less than \(\frac{n}{8}\) edges in the matching have a contribution of at least t. Thus, we have \(\Pr [\mathrm {MaxM}(X)<\frac{nt}{8}]\le \exp (-n/8)\). \(\square \)

We call the task of computing a functional on quantized inputs the quantized version of the functional. In the case of \(\mathrm {MaxM}\), an algorithm for b-matchings by Anstee [1] can be exploited, satisfying condition (1).

Lemma 4.4

The quantized version of \(\mathrm {MaxM}\) can be computed in time \(O(\ell ^{4}+\ell ^{3}\log n)\), where \(n=\sum _{i=1}^{\ell }n_{i}\).

Proof

The quantized version of \(\mathrm {MaxM}\) can be considered as a b-matching problem. The input in this problem is a graph \(G=(V,E)\) with costs d(e) on the edges \(e\in E\) and integer weights \(b_{v}\) on the vertices \(v\in V\). The aim is to find an assignment of non-negative integer weights \(x_{e}\) to the edges such that \(\sum _{e\ni v}x_{e}\le b_{v}\) for each \(v\in V\) and the weighted cost \(\sum _{e\in E}x_{e}d(e)\) is maximized.

For a given instance \(X'=((X_{1},n_{1}),\dots ,(X_{\ell },n_{\ell }))\), we define a complete graph on the vertices \(\{X_{1},\dots ,X_{\ell }\}\) where each edge has cost \(d(X_{i},X_{j})=\left\| X_{i}-X_{j}\right\| \) and each vertex \(X_{i}\) has weight \(b_{X_{i}}=n_{i}\). By an algorithm from [1], this instance can be solved in time \(O(\ell ^{3}\log \sum _{i=1}^{\ell }n_{i}+\ell ^{4})\). \(\square \)
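For intuition (and only on tiny instances), the quantized objective can also be evaluated by naively expanding every quantized point into its copies and running a standard maximum-weight matching; the point of Lemma 4.4 is precisely that the b-matching algorithm of [1] avoids this expansion. A hedged sketch using networkx:

```python
import itertools
import numpy as np
import networkx as nx

def quantized_maxm_bruteforce(points, mult):
    """Evaluate the quantized MaxM objective by expanding multiplicities explicitly.
    points: list of ell points in [0,1]^d; mult: list of multiplicities n_1, ..., n_ell.
    For illustration only; the b-matching formulation avoids this blow-up."""
    copies = [i for i, m in enumerate(mult) for _ in range(m)]
    G = nx.Graph()
    G.add_nodes_from(range(len(copies)))
    for u, v in itertools.combinations(range(len(copies)), 2):
        w = np.linalg.norm(np.asarray(points[copies[u]]) - np.asarray(points[copies[v]]))
        G.add_edge(u, v, weight=float(w))
    M = nx.max_weight_matching(G)          # maximum-weight (not necessarily perfect) matching
    return sum(G[u][v]["weight"] for u, v in M)
```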

These observations immediately yield the following result.

Theorem 4.5

\(\mathrm {MaxM}\) is \(O(\ell ^{4})\)-time \((O(1/\root d \of {\ell }),\varOmega (1/\root d \of {\phi }))\)-quantizable with respect to \(\mathcal{F}_{\phi }\). Hence, for \(1\le p<4\), there is an \(O(n^{p})\)-time probable \((1-O(\root d \of {\phi /n^{p/4}}))\)-approximation to \(\mathrm {MaxM}\) for instances drawn according to some \(f\in \mathcal{F}_{\phi }^{n}\). This is asymptotically optimal on smoothed instances with \(\phi =o(n^{p/4})\).

Proof

To verify quantizability, we combine Property 4.2 with Lemmas 4.3 and 4.4 and apply Theorem 4.1. For this, note that \(\ell ^{4}+\ell ^{3}\log n=O(\ell ^{4}+n)\). Using Theorem 3.2 with \(\ell :=\lceil n^{p/4}\rceil \), we obtain the remaining part of the statement. \(\square \)

Interestingly, the restriction on \(\phi \) is independent of the dimension. Note that only \(p<3\) is reasonable, since deterministic cubic-time algorithms for exactly computing \(\mathrm {MaxM}\) exist. Furthermore, as described in Sect. 3, an algorithm achieving an asymptotically optimal approximation ratio in expectation can be designed as well. For this, we may utilize a simple greedy linear-time \(\frac{1}{2}\)-approximation for \(\mathrm {MaxM}\) [5].

4.2 Maximum Traveling Salesman Problem

The approach for maximum matching of the previous subsection can be adapted to the maximum traveling salesman problem. For \(d\ge 2\), define \(\mathrm {MaxTSP}(X)\) as the maximum weight of a Hamiltonian cycle on \(X\subseteq [0,1]^{d}\), where the weight of a Hamiltonian cycle C is defined as \(\sum _{\{u,v\}\in C}\left\| u-v\right\| \). The problem is NP-hard (proven for \(d\ge 3\) in [8], conjectured for \(d=2\)) but admits a PTAS, cf. [8, 9]. According to Fekete et al. [20], these algorithms are not practical. They stress the need for (nearly) linear-time algorithms.

By using our framework algorithm of Theorem 4.5 and patching the constructed matching to a tour, we obtain the following result.

Theorem 4.6

Let \(1\le p\le 4d/(d+1)\) and \(f\in \mathcal{F}_{\phi }^{n}\). On instances drawn from f, there is an \(O(n^{p})\)-time computable probable \((1-O(\root d \of {\phi /n^{p/4}}))\)-approximation for \(\mathrm {MaxTSP}\). This is asymptotically optimal for \(\phi =o(n^{p/4})\).

Proof

It is obvious that the longest tour through the vertices induces a matching of at least half its length. Thus, \(\mathrm {MaxTSP}(X)\le 2\mathrm {MaxM}(X)\). Our aim is hence to construct a tour of asymptotically twice the cost of the maximum matching. For this, let an optimal solution to the b-matching problem discussed in the proof of Lemma 4.4 be given as a weighted assignment between the \(\ell \) quantized points \(X'_{1},\dots ,X'_{\ell }\) with multiplicities \(n_{1},\dots ,n_{\ell }\). Start with \(X'_{1}\) and arbitrarily enumerate all endpoints \(\{v_{1},\dots ,v_{n_{1}}\}\) of edges \((X'_{1},v_{i})\) in the matching. We patch these edges to the partial tour \(X'_{1}\rightarrow v_{1}\rightarrow X'_{1}\rightarrow v_{2}\rightarrow \cdots \rightarrow X'_{1}\rightarrow v_{n_{1}}\). We proceed by deleting all used edges (and decreasing the multiplicities accordingly) and continue constructing partial tours for every other vertex \(X'_{2},\dots ,X'_{\ell }\). After this, we connect these \(\ell \) partial tours arbitrarily to form a complete tour through all vertices. Each edge in the matching occurs twice in some partial tour, i.e., as a forward and a backward edge, except in the case that it is the last edge of a partial tour. We compensate for this loss with the pessimistic estimate of \(\sqrt{d}\) per vertex. Let \(\mathrm {ALG}_{\mathrm {MaxTSP}}(X)\) denote the length of the thus constructed tour and let \(\mathrm {ALG}_{\mathrm {MaxM}}(X)\) denote the length of the matching constructed by the algorithm of Theorem 4.5, then we have
$$\begin{aligned} \frac{\mathrm {ALG}_{\mathrm {MaxTSP}}(X)}{\mathrm {MaxTSP}(X)}\ge \frac{2\mathrm {ALG}_{\mathrm {MaxM}}(X)-\sqrt{d}\ell }{\mathrm {MaxTSP}(X)}\ge & {} \frac{2\left( \mathrm {MaxM}(X)-O\left( \frac{n}{\root d \of {\ell }}\right) \right) -\sqrt{d}\ell }{\mathrm {MaxTSP}(X)}\\\ge & {} 1-\frac{O\left( \frac{n}{\root d \of {\ell }}+\ell \right) }{2\mathrm {MaxM}(X)}. \end{aligned}$$
Observe that choosing too fine a grid can decrease the approximation performance for \(\ell =\omega (n^{d/(d+1)})\) due to the patching. Hence, we require \(p\le 4d/(d+1)\) and set \(\ell :=\lceil n^{p/4}\rceil \). An application of Lemma 4.3 concludes the proof. \(\square \)
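A sketch of the patching step from the proof above; the b-matching is assumed to be given as a dictionary mapping index pairs to multiplicities, which is an illustrative data format rather than the one used by the actual algorithm.

```python
def patch_tour(ell, matching):
    """Patch a b-matching on the quantized points X'_1, ..., X'_ell into one tour.

    matching: dict {(i, j): x_ij} with i != j, giving how often edge {X'_i, X'_j} is used.
    Returns the tour as a cyclic sequence of quantized-point indices, one entry per visit,
    following the construction in the proof of Theorem 4.6.
    """
    remaining = dict(matching)
    tour = []
    for i in range(ell):
        endpoints = []
        for (a, b), x in list(remaining.items()):
            if x > 0 and i in (a, b):
                endpoints.extend([b if a == i else a] * x)  # copies matched to X'_i
                remaining[(a, b)] = 0                        # delete the used edges
        for v in endpoints:                                  # X'_i -> v_1 -> X'_i -> v_2 -> ...
            tour.extend([i, v])
    return tour          # the partial tours, concatenated arbitrarily into one cycle
```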

Since \(\mathrm {MaxM}\) is a \(\frac{1}{2}\)-approximation to \(\mathrm {MaxTSP}\), the greedy linear-time computable \(\frac{1}{2}\)-approximation to \(\mathrm {MaxM}\) is a \(\frac{1}{4}\)-approximation to \(\mathrm {MaxTSP}\) and thus provides an adapted algorithm with asymptotically optimal expected approximation ratio for \(\phi =o(n^{p/4})\).

5 Balanced Quantization

Grid quantization proves useful for problems in which algorithms solving the high-multiplicity version are available. For maximum matching, we exploited a specifically designed algorithm, but for other problems, such algorithms might be missing. As an alternative route in these cases, this section establishes a more careful quantization step yielding balanced instances, i.e., instances in which each of the distinct points occurs equally often. These instances are often easier to handle, which holds, e.g., for k-means clustering and similar problems. In general, this alternative method can be applied to problems for which the objective scales controllably when all input points are duplicated.

The balanced quantization algorithm works by partitioning a subset of the input points into packets, which we define as collections of points that all have the same cardinality.

Lemma 5.1

Let \(\ell :\mathbb {N}\rightarrow \mathbb {N}\) with \(\ell =\omega (1)\) and \(\ell =o(n)\). There is a function \(\ell ':\mathbb {N}\rightarrow \mathbb {N}\) such that for each \(n\in \mathbb {N}\) and \(X=(X_{1},\dots ,X_{n})\in [0,1]^{dn}\), we can find, in linear time, a family of packets \(C_{1},\dots ,C_{\ell '(n)}\), and a number \(w(n)\in \mathbb {N}\) with the following properties.
  1. \(\frac{\ell '(n)}{\ell (n)}\rightarrow 1\) (we obtain \(\ell \) packets asymptotically),
  2. \(|C_{i}|=w\) for \(1\le i\le \ell '(n)\) (each packet contains exactly w points),
  3. \(n-\sum _{i=1}^{\ell '(n)}|C_{i}|=O(\frac{n}{\ell ^{1/(d+1)}})\) (almost all points are covered),
  4. \(\mathrm {diam}(C_{i})=O(\frac{1}{\ell ^{1/(d+1)}})\) (each element in a packet represents it well).

Proof

We quantize X using a grid of \(t^{d}\) cubes \(B_{1},\dots ,B_{t^{d}}\), for t to be chosen later. Each cube \(B_{i}\) has side length 1 / t, volume \({1}/{t^{d}}\) and contains some number \(n_{i}\) of input points. For some w to be determined later, we create \(\lfloor \frac{n_{i}}{w}\rfloor \) packets for each cube \(B_{i}\): we successively assign to each such packet w yet uncovered input points inside \(B_{i}\). This yields a number \(\ell '\) of packets, each of which contains exactly w points which were originally situated in a cube of volume \({1}/{t^{d}}\).

We need to choose t and w such that \(\frac{\ell '(n)}{\ell (n)}\rightarrow 1\) for \(n\rightarrow \infty \). Since each of the produced packets contains w points, we have \(\ell '(n)\le \frac{n}{w}\). Furthermore, since \(\ell '(n)=\sum _{i=1}^{t^{d}}\left\lfloor \frac{n_{i}}{w}\right\rfloor \ge \sum _{i=1}^{t^{d}}\frac{n_{i}}{w}-1=\frac{n}{w}-t^{d},\) setting \(w:=\lfloor \frac{n}{\ell (n)}\rfloor \) establishes \(\ell (n)-t^{d}\le \ell '(n)\le \ell (n)\frac{n}{n-\ell (n)}\). Thus, the first criterion is fulfilled if \(t=o(\root d \of {\ell })\). Since the rounding step leaves at most w points per cube uncovered, at most \(t^{d}w\le t^{d}\frac{n}{\ell (n)}\) points are lost in total by rounding. Setting \(t:=\ell ^{1/(d+1)}\) fulfills \(t=o(\root d \of {\ell })\) and creates \(t^{d}w=O(\frac{n}{\ell ^{1/(d+1)}})\) uncovered points. Furthermore, by this choice, the diameter of each cube is bounded by \(\frac{\sqrt{d}}{t}=O(\ell ^{-1/(d+1)})\). \(\square \)

The previous lemma yields the balanced quantization algorithm \(\mathrm {BalQ}\) that, on input X, returns \(\mathrm {BalQ}(X)=((\mathrm {cm}(C_{1}),w),\dots ,(\mathrm {cm}(C_{\ell '}),w))\), i.e., each point is rounded to the center of mass of its corresponding packet obtained by Lemma 5.1. This allows us to formalize the balanced quantization method as follows.
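A minimal sketch of Lemma 5.1 together with the rounding step of \(\mathrm {BalQ}\), using the parameter choices from the proof (grid resolution \(t=\ell ^{1/(d+1)}\), packet size \(w=\lfloor n/\ell \rfloor \)); uncovered points are simply dropped.

```python
import numpy as np

def balanced_quantize(X, ell):
    """BalQ sketch: partition most points into packets of equal size w and round each
    packet to its center of mass.  Returns (centers, w), i.e. ell' quantized points,
    each of multiplicity w."""
    X = np.asarray(X, dtype=float)
    n, d = X.shape
    t = max(1, int(round(ell ** (1.0 / (d + 1)))))   # grid of t^d cubes
    w = max(1, n // ell)                             # packet size w = floor(n / ell)
    cells = np.minimum((X * t).astype(int), t - 1)
    buckets = {}
    for i, idx in enumerate(map(tuple, cells)):
        buckets.setdefault(idx, []).append(i)
    centers = []
    for members in buckets.values():                 # floor(n_i / w) packets per cube
        for j in range(len(members) // w):
            packet = X[members[j * w:(j + 1) * w]]
            centers.append(packet.mean(axis=0))      # round the packet to its center of mass
    return np.array(centers), w
```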

Theorem 5.2

(Balanced Quantization) Let \(d\ge 1\), let \(Q:\mathbb {N}\rightarrow \mathbb {R}\) and let \(\mathcal{F}\) be a family of probability distributions \([0,1]^{d}\rightarrow \mathbb {R}_{\ge 0}\). Let \(F:([0,1]^{d})^{*}\rightarrow \mathbb {R}_{\ge 0}\) be a Euclidean functional with the following properties.
  1. On all quantized inputs \(X'=((X'_{1},w),\dots ,(X'_{\ell },w))\), the value \(F(X')\) can be computed in time \(t(\ell )+O(w\ell )\).
  2. There is a constant C such that with high probability, the functional on an instance X drawn from any \(f\in \mathcal{F}^{n}\) differs by at most \(\frac{Cn}{\ell ^{1/(d+1)}}\) from the functional on \(\mathrm {BalQ}(X)\). Formally, we require
     $$\begin{aligned} \Pr _{X\sim f}\left[ |F(X)-F(\mathrm {BalQ}(X))|\le \frac{Cn}{\ell ^{1/(d+1)}}\right] =1-o(1). \end{aligned}$$
  3. For each \(f\in \mathcal{F}^{n}\), it holds that
     $$\begin{aligned} \Pr _{X\sim f}\left[ F(X)\ge nQ(n)\right] =1-o(1). \end{aligned}$$
Then F is \(t(\ell )\)-time \((O(\ell ^{-\frac{1}{d+1}}),Q(n))\)-quantizable with respect to \(\mathcal{F}\).

Proof

In Definition 3.1, use \(\mathrm {BalQ}\) as quantization algorithm A. The other conditions follow directly from the assumptions. \(\square \)

For some problems, an instance in which every distinct point occurs equally often can be reduced to its distinct points only. In the following, we exploit this property by applying the previous theorem to k-means clustering in Sect. 5.1. The method also allows for improving the results on maximum matching and maximum TSP in Sect. 5.2.

5.1 K-Means Clustering

Let \(d\ge 2\) and \(k\in \mathbb {N}\). We define \(\mathrm {KMeans}(X,k)\) to be the k-means objective on the points X, where k is the desired number of clusters, i.e.,
$$\begin{aligned} \mathrm {KMeans}(X,k)=\min _{C_{1}\mathop {\dot{\cup }}\cdots \mathop {\dot{\cup }}C_{k}=X}\;\sum _{i=1}^{k}\sum _{x\in C_{i}}\left\| x-\mu _{i}\right\| ^{2},\quad \text {where}\quad \mu _{i}=\frac{1}{|C_{i}|}\sum _{x\in C_{i}}x. \end{aligned}$$
k-means clustering is a fundamental problem for clustering a set of data points, with applications in a number of fields including machine learning, data mining and data compression. If either k or d is part of the input, it is NP-hard [16, 29]. However, a popular heuristic, the k-means algorithm, usually runs fast on real-world instances despite its worst-case exponential running time. This is substantiated by results proving a polynomial smoothed running time of the k-means algorithm under Gaussian perturbations [2, 4]. In terms of solution quality, however, such a heuristic can perform poorly.

Consequently, k-means clustering has also received considerable attention concerning the design of fast deterministic approximation schemes. There exist linear-time asymptotically optimal algorithms, e.g., PTASs with running time \(O(nkd+d\cdot \mathrm {poly}(k/\varepsilon )+2^{\tilde{O}(k/\varepsilon )})\) in [21] and \(O(ndk+2^{(k/\varepsilon )^{O(1)}}d^{2}n^{\sigma })\) for any \(\sigma >0\) in [14]. Treating the dimension as a constant as we do here, Har-Peled and Mazumdar [24] showed how to compute a \((1+\varepsilon )\)-approximation in time \(O(n+k^{k+2}\varepsilon ^{-(2d+1)k}\log ^{k+1}n\log ^{k}\frac{1}{\varepsilon })\).

Apart from smoothed analysis, the related concept of perturbation stability has been applied to k-means clustering by Awasthi et al. [6]. They restrict their attention to input instances which, when perturbed, maintain the same partitioning of the input points as an optimal clustering. Their perturbation model uses a bounded multiplicative increase of the distance of every pair of points. On instances that are stable under sufficiently large perturbations, they show how to compute the optimal k-means clustering in polynomial time.

In the following, we will frequently consider the k-means clustering objective with respect to centroids \(\mu _{i}\) other than the center of mass of the corresponding cluster. However, such a choice cannot decrease the objective. This follows from the following fact, which is proven, e.g., in [26].

Property 5.3

Let C be a multiset of points in \(\mathbb {R}^{d}\) and let \(x\in \mathbb {R}^{d}\). Then,
$$\begin{aligned} \sum _{c\in C}\left\| c-x\right\| ^{2}=|C|\cdot \left\| \mathrm {cm}(C)-x\right\| ^{2}+\sum _{c\in C}\left\| c-\mathrm {cm}(C)\right\| ^{2}. \end{aligned}$$
This expression is minimized at \(x=\mathrm {cm}(C)\).
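The identity can be checked numerically; a small sketch (the concrete points are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(0)
C = rng.random((6, 3))        # a small multiset of points in R^3
x = rng.random(3)             # an arbitrary reference point
cm = C.mean(axis=0)

lhs = ((C - x) ** 2).sum()
rhs = len(C) * ((cm - x) ** 2).sum() + ((C - cm) ** 2).sum()
print(np.isclose(lhs, rhs))   # True: the identity of Property 5.3
```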

Consider \(\mathrm {BalQ}(X)=((\mathrm {cm}(C_{1}),w),\dots ,(\mathrm {cm}(C_{\ell '}),w))\), a quantized instance obtained by applying Lemma 5.1. Let \(Y=\mathrm {BalQ}(X)=(Y_{1},\dots ,Y_{n'})\), where we order the \(Y_{i}\)’s such that \(Y_{i}\) is the rounded version of \(X_{i}\). Note that the number \(n'=w\ell '\) of points in the rounded instance is potentially slightly smaller than n, since points may be lost in the balanced quantization step.

Lemma 5.4

There is a constant \(\varDelta \) such that for all instances X and balanced quantizations \(Y=\mathrm {BalQ}(X)\), we have
$$\begin{aligned} |\mathrm {KMeans}(X,k)-\mathrm {KMeans}(Y,k)|\le \frac{\varDelta n}{\ell ^{1/(d+1)}}. \end{aligned}$$

Proof

We identify clusterings of X and Y with partitions of the indices \([n]:=\{1,\dots ,n\}\) and \([n']=\{1,\dots ,n'\}\). Let \((D_{1},\dots ,D_{k})\) be a partition of the indices \(\{1,\dots ,n\}\). From this, create a partition \((D_{1}',\dots ,D_{k}')\) of \(\{1,\dots ,n'\}\) by assigning the rounded versions of each point covered by \(C_{1},\dots ,C_{\ell '}\) to the cluster of the original point, i.e., \((D_{1}',\dots ,D_{k}')=(D_{1}\cap [n'],\dots ,D_{k}\cap [n'])\). Consider the k-means objective of \((D_{1}',\dots ,D_{k}')\) on Y using the centroids of the unrounded clusters \(\mu _{i}=\frac{1}{|D_{i}|}\sum _{j\in D_{i}}X_{j}\). By expanding \(\left\| Y_{j}-\mu _{i}\right\| ^{2}=\left\| (Y_{j}-X_{j})+(X_{j}-\mu _{i})\right\| ^{2}\), we compute
$$\begin{aligned}&\sum _{i=1}^{k}\sum _{j\in D'_{i}}\left\| Y_{j}-\mu _{i}\right\| ^{2}\\&\quad \le \sum _{i=1}^{k}\sum _{j\in D'_{i}}\left\| Y_{j}-X_{j}\right\| ^{2}+2\left\| Y_{j}-X_{j}\right\| \left\| X_{j}-\mu _{i}\right\| +\left\| X_{j}-\mu _{i}\right\| ^{2}\\&\quad \le \sum _{i=1}^{k}\sum _{j\in D'_{i}}\mathrm {diam}(C_{j})^{2}+2\mathrm {diam}(C_{j})\sqrt{d}+\left\| X_{j}-\mu _{i}\right\| ^{2}, \end{aligned}$$
where we applied the binomial formula in the second line. By Lemma 5.1, there is a constant \(\alpha >0\) such that \(\mathrm {diam}(C_{j})\le \frac{\alpha }{\ell ^{1/(d+1)}}\) for each packet \(C_{j}\), hence this expression is bounded by
$$\begin{aligned}&\frac{\alpha ^{2}n'}{\ell ^{2/(d+1)}}+\frac{2\alpha n'\sqrt{d}}{\ell ^{1/(d+1)}}+\sum _{i=1}^{k}\sum _{j\in D_{i}}\left\| X_{j}-\mu _{i}\right\| ^{2}\\&\quad \le \left( \alpha ^{2}+2\alpha \sqrt{d}\right) \frac{n}{\ell ^{1/(d+1)}}+\mathrm {KMeans}(X,k). \end{aligned}$$
Using the centroids \(\mu _{i}'=\frac{1}{|D_{i}'|}\sum _{j\in D_{i}'}Y_{j}\) cannot increase the objective by Property 5.3, hence \(\mathrm {KMeans}(Y,k)\le \sum _{i=1}^{k}\sum _{j\in D_{i}'}\left\| Y_{j}-\mu _{i}\right\| ^{2}\le \mathrm {KMeans}(X,k) +(\alpha ^{2}+2\alpha \sqrt{d})\frac{n}{\ell ^{1/(d+1)}}\).
Similarly, take any partition \((D_{1}',\dots ,D_{k}')\) of \(\{1,\dots ,n'\}\). From this, we create the partition \((D_{1}\cup ([n]\setminus [n']),D_{2},\dots ,D_{k})\) of [n] by assigning all uncovered points to the first cluster; for the remaining points, we mimic the decisions of their rounded versions. Consider the k-means objective of \((D_{1},\dots ,D_{k})\) with respect to the centroids \(\mu _{i}'=\frac{1}{|D_{i}'|}\sum _{j\in D_{i}'}Y_{j}\),
$$\begin{aligned} \sum _{i=1}^{k}\sum _{j\in D_{i}}\left\| X_{j}-\mu _{i}'\right\| ^{2}= & {} \sum _{i=n'+1}^{n}\left\| X_{i}-\mu _{1}'\right\| ^{2}+\sum _{i=1}^{k}\sum _{j\in D_{i}'}\left\| X_{j}-\mu _{i}'\right\| ^{2}\\\le & {} (n-n')d+\frac{\alpha ^{2}n'}{\ell ^{2/(d+1)}}+\frac{2\alpha n'\sqrt{d}}{\ell ^{1/(d+1)}}+\sum _{i=1}^{k}\sum _{j\in D_{i}'}\left\| Y_{j}-\mu _{i}'\right\| ^{2}. \end{aligned}$$
Property 3 of Lemma 5.1 assures that there is some \(\beta \in \mathbb {R}\) such that \(n-n'\le \beta \frac{n}{\ell ^{1/(d+1)}}\) for sufficiently large n. With \(\varDelta :=\beta d+\alpha ^{2}+2\alpha \sqrt{d}\), the objective value with respect to the centroids \(\mu _{i}'\) is bounded by \(\mathrm {KMeans}(Y,k)+\varDelta n\ell ^{-1/(d+1)}\). Again, choosing the correct centroids cannot increase the cost, thus
$$\begin{aligned} \mathrm {KMeans}(X,k)\le \sum _{i=1}^{k}\sum _{j\in D_{i}}\left\| X_{j}-\mu _{i}'\right\| ^{2}\le \mathrm {KMeans}(Y,k)+\frac{\varDelta n}{\ell ^{1/(d+1)}}. \end{aligned}$$
\(\square \)

Having established that rounding the input does not affect the objective value too much, the following lemma enables us to reduce the instance size significantly.

Lemma 5.5

Let \(X=((X_{1},w),\dots ,(X_{\ell },w))\) and \(Z=((X_{1},1),\dots ,(X_{\ell },1))\). It holds that \(\mathrm {KMeans}(X,k)=w\mathrm {KMeans}(Z,k)\).

Proof

Let \((C_{1},\dots ,C_{k})\) be a partition of \(\{1,\dots ,\ell \}\) that is optimal for Z. Then \(\mu _{i}=\frac{1}{|C_{i}|}\sum _{j\in C_{i}}X_{j}=\frac{1}{w|C_{i}|}\sum _{j\in C_{i}}wX_{j}\) and we conclude
$$\begin{aligned} \mathrm {KMeans}(X,k)\le \sum _{i=1}^{k}\sum _{j\in C_{i}}w\left\| X_{j}-\mu _{i}\right\| ^{2}=w\mathrm {KMeans}(Z,k). \end{aligned}$$
For the other direction, assume that an optimal partition \((C_{1},\dots ,C_{k})\) of the indices \(\{1,\dots ,w\ell \}\) assigns two copies \(c_{1},c_{2}\) of the same point \(X_{j}\) to different clusters \(C_{i}\) and \(C_{i'}\). Let \(\mu _{i}\) and \(\mu _{i'}\) be the corresponding centroids. Assigning both copies of \(X_{j}\) to the cluster with the closer centroid cannot increase the k-means objective with respect to the old centroids. By Property 5.3, recomputing the centroids for the new assignment cannot increase the value either. Thus, there is an optimal clustering in which all identical points are assigned to the same cluster. Hence, we can assign each \(X_{j}\) to the cluster that all its copies are assigned to and conclude
$$\begin{aligned} \mathrm {KMeans}(Z,k)\le \sum _{i=1}^{k}\sum _{j\in C_{i}}\left\| X_{j}-\mu _{i}\right\| ^{2}=\frac{1}{w}\mathrm {KMeans}(X,k), \end{aligned}$$
which proves the claim. \(\square \)
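For very small instances, Lemma 5.5 can be verified directly by brute force; the exhaustive solver below is purely illustrative (exponential in the number of points) and not part of the framework.

```python
import itertools
import numpy as np

def kmeans_exact(points, k):
    """Exact k-means objective by enumerating all label assignments (tiny inputs only)."""
    pts = np.asarray(points, dtype=float)
    best = float("inf")
    for labels in itertools.product(range(k), repeat=len(pts)):
        cost = 0.0
        for c in range(k):
            cluster = pts[np.array(labels) == c]
            if len(cluster):
                cost += ((cluster - cluster.mean(axis=0)) ** 2).sum()
        best = min(best, cost)
    return best

rng = np.random.default_rng(1)
Z = rng.random((4, 2))               # 4 distinct points in [0,1]^2
w, k = 3, 2
X = np.repeat(Z, w, axis=0)          # balanced instance: every point with multiplicity w
print(np.isclose(kmeans_exact(X, k), w * kmeans_exact(Z, k)))  # True, as in Lemma 5.5
```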

It is left to give a lower bound on the objective value. For this argument, we introduce the following notion.

Definition 5.6

Let \(X_{1},X_{2},\dots ,X_{n}\) be a set of random variables in the unit cube and \(k\in \mathbb {N}\). We define for every set \(S\subseteq \{1,\dots ,n\}\) with exactly k elements an S-clustering by declaring the \(X_{i}\) with \(i\in S\) to be the centroids and assigning each point to its nearest centroid. With \(a:[n]\rightarrow S\) being the assignment of \(X_{i}\) to its nearest centroid, we set \(\mathrm {KSetMeans}(X;k):=\min _{S\subseteq [n],|S|=k}\mathrm {KSetMeans}_{S}(X;k),\) where
$$\begin{aligned} \mathrm {KSetMeans}_{S}(X;k):=\sum _{\mu \in S}\sum _{a(i)=\mu }\left\| X_{i}-\mu \right\| ^{2}. \end{aligned}$$

The following lemma allows us to restrict our attention to centroids at fixed locations, since choosing centroids only among the input points at most doubles the objective. A stronger version of this is used in the analysis of the k-means++ algorithm [3].

Lemma 5.7

It holds that \(\mathrm {KSetMeans}(X;k)\le 2\mathrm {KMeans}(X;k)\).

Proof

Let C be an arbitrary cluster and choose \(c_{0}:={\mathrm {argmin}}_{c\in C}\left\| c-\mathrm {cm}(C)\right\| ^{2}\) as centroid for C. By Property 5.3, the contribution of C to the objective value equals
$$\begin{aligned} \sum _{c\in C}\left\| c-c_{0}\right\| ^{2}=|C|\cdot \left\| \mathrm {cm}(C)-c_{0}\right\| ^{2}+\sum _{c\in C}\left\| c-\mathrm {cm}(C)\right\| ^{2}\le 2\sum _{c\in C}\left\| c-\mathrm {cm}(C)\right\| ^{2}, \end{aligned}$$
where the inequality holds since \(c_{0}\) minimizes the distance to \(\mathrm {cm}(C)\) among the points of C, so that \(|C|\cdot \left\| \mathrm {cm}(C)-c_{0}\right\| ^{2}\le \sum _{c\in C}\left\| c-\mathrm {cm}(C)\right\| ^{2}\). Consequently, there is a choice for all centroids among the cluster points that at most doubles each cluster’s contribution. \(\square \)

For proving a lower bound, it thus suffices to consider all possible choices of the centroids among the input points. We let the adversary choose their locations and compute the contribution of each of the remaining points to the objective value.

Lemma 5.8

Let \(f\in \mathcal{F}_{\phi }^{n}\) and \(k=o(\frac{n}{\log n})\). There exists a constant \(\gamma >0\) such that
$$\begin{aligned} \Pr _{X\sim f}\left[ \mathrm {KMeans}(X,k)<\frac{\gamma n}{(k\phi )^{2/d}}\right] =\exp (-\varOmega (n)). \end{aligned}$$

Proof

By Lemma 5.7,
$$\begin{aligned} \Pr [\mathrm {KMeans}(X,k)\le t]\le & {} \Pr [\mathrm {KSetMeans}(X,k)\le 2t]\nonumber \\= & {} \Pr [\exists S\subseteq [n],|S|=k:\mathrm {KSetMeans}_{S}(X,k)\le 2t]\nonumber \\\le & {} \sum _{S\subseteq [n],|S|=k}\Pr [\mathrm {KSetMeans}_{S}(X,k)\le 2t] \end{aligned}$$
(1)
Consider an arbitrary S-clustering and its objective value. Without loss of generality, assume that \(S=\{1,\dots ,k\}\). We first fix the k centroids \(\mu _{1},\dots ,\mu _{k}\) by observing the locations of \(X_{1},\dots ,X_{k}\). We draw the remaining \(n-k\) points according to their distribution and look at the expected increase in the objective per point.
Let \(\mathrm {vol}_{d}(r)=v(d)\cdot r^{d}\) denote the volume of the d-dimensional ball of radius r. Consider point \(X_{i}\). If it contributes less than t to the objective value, it is contained in a ball of radius less than \(\sqrt{t}\) around one of the centroids. Thus,
$$\begin{aligned} \Pr \left[ X_{i}\text { incurs a cost }\le t\text { for }S\right]\le & {} \sum _{j=1}^{k}\Pr \left[ X_{i}\in B_{\sqrt{t}}(\mu _{j})\right] \\\le & {} \sum _{j=1}^{k}\phi \cdot \mathrm {vol}_{d}\left( \sqrt{t}\right) =k\phi \cdot v(d)\cdot t^{d/2}. \end{aligned}$$
By choosing \(t:=(2\cdot v(d)\cdot k\phi )^{-2/d}\), this probability is upper bounded by 1/2 and it holds that
$$\begin{aligned} \mathrm {E}[\mathrm {KSetMeans}_{S}(X,k)]\ge & {} t\sum _{i=k+1}^{n}\Pr [X_{i}\text { incurs a cost }>t\text { for }S]\\= & {} \left( \frac{1}{2\cdot v(d)\cdot k\phi }\right) ^{2/d}\frac{n-k}{2}=\varOmega \left( \frac{n}{(k\phi )^{2/d}}\right) , \end{aligned}$$
since \(k=o(n)\). We analyze the sum \(\sum _{i=k+1}^{n}I_{i}\) of indicator variables \(I_{i}\) for the event that \(X_{i}\) incurs a cost of at least t. From \(E[I_{i}]\ge \frac{1}{2}\) and an application of the Chernoff bound of Lemma 2.1, we conclude
$$\begin{aligned} p_{S}:=\Pr \left[ \mathrm {KSetMeans}_{S}(X;k)\le \frac{(n-k)t}{3}\right] \le \Pr \left[ \sum _{i=k+1}^{n}I_{i}\le \frac{n-k}{3}\right] \le e^{-\frac{n-k}{18}}. \end{aligned}$$
Using (1), this yields
$$\begin{aligned} q:=\Pr \left[ \mathrm {KMeans}(X;k)\le \frac{(n-k)t}{6}\right] \le \sum _{S\subseteq [n],|S|=k}p_{S}\,\le \, n^{k}e^{-\frac{n-k}{18}}. \end{aligned}$$
For \(k=o(n/\log n)\), this implies \(q=\exp (-\varOmega (n))\). This proves the claim, since \(\frac{(n-k)t}{6}=\varOmega \left( \frac{n}{(k\phi )^{2/d}}\right) \). \(\square \)

Note that for other Euclidean minimization functionals like minimum Euclidean matching or TSP, already the uniform distribution on the unit cube achieves an objective value of only \(O(n^{(d-1)/d})\) [35]. Hence a lower bound as given in Lemma 5.8 for these problems would not be possible, making the framework inapplicable in this case. For a more detailed discussion, we refer to Sect. 7.

To solve the smaller instance obtained by quantization, two approaches are reasonable. The first is to compute an optimal solution in time \(O(n^{kd+1})\) using [25] and results in the following theorem.

Theorem 5.9

For \(k=o(n/\log n)\), the functional \(\mathrm {KMeans}(X,k)\) is \(O(\ell ^{kd+1})\)-time \((O(\ell ^{-1/(d+1)}),\varOmega ((k\phi )^{-2/d}))\)-quantizable with respect to \(\mathcal{F}_{\phi }\). Consequently, for \(k=O(\log n/\log \log n)\) and \(1\le p\le kd+1\), there is a \(O(n^{p})\)-time computable probable \(\left( 1-O\left( \frac{(k\phi )^{2/d}}{n^{\frac{p}{(d+1)(kd+1)}}}\right) \right) \)-approximation for \(\mathrm {KMeans}(X,k)\) on smoothed instances.

Note that this is asymptotically optimal if \(\phi =o(\root c \of {n})\) with \(c=2(1+1/d)(kd+1)/p\) if \(k=O(1)\), or more generally, if \(k\phi =o(n^{\frac{pd}{2(d+1)(kd+1)}})\). Using existing linear-time approximation schemes, also an asymptotically optimal expected approximation ratio can be obtained for the same values of \(\phi \). Our framework algorithm applies even for relatively large values of k, e.g., \(k=\log n/\log \log n\), in which case known deterministic approximation schemes require superlinear time. However, for small k, incorporating such an approximation scheme into our algorithm yields a further improvement of the previous theorem.

Let \(X'=\mathrm {BalQ}(X)\). Consider computing an approximation \((1-\varepsilon )g(X')\) rather than the optimal value \(g(X')\). Let \(\ell =\omega (1)\) and \(\ell =o(n)\) and set \(\varepsilon :=k^{2/d}/\ell ^{1/(d+1)}\), then with high probability, the approximation ratio of \((1-\varepsilon )g(X')\) with respect to \(\mathrm {KMeans}(X;k)\) is at least
$$\begin{aligned} (1-\varepsilon )\left( 1-\frac{(k\phi )^{2/d}}{\ell ^{1/(d+1)}}\right) \ge \left( 1-\frac{(k\phi )^{2/d}}{\ell ^{1/(d+1)}}\right) ^{2} =1-O\left( \frac{(k\phi )^{2/d}}{\ell ^{1/(d+1)}}\right) . \end{aligned}$$
For this choice of \(\varepsilon \), the algorithm of Har-Peled and Mazumdar [24] has a running time of
$$\begin{aligned} O\left( n+k^{(k+2)-2\frac{2d+1}{d}k}\ell ^{\frac{2d+1}{d+1}k} \log ^{k+1}n\log ^{k}\frac{\ell }{k}\right) =O(n+k^{-3k}\ell ^{2k}\log ^{2k+1}n), \end{aligned}$$
assuming without loss of generality that \(k\ge 2\). To achieve a running time of \(O(n^{p})\), \(p\ge 1\), set \(\ell :=k^{\frac{3}{2}}n^{p/(2k)-\delta }\) for any \(\delta >0\), which yields the following corollary.

Corollary 5.10

Let \(k=O(\log n/\log \log n)\), let \(1\le p\le 2k\) and \(\delta >0\). There is an \(O(n^{p})\)-time computable \(\left( 1-O\left( \frac{k^{\frac{2}{d}-\frac{3}{2(d+1)}}\phi ^{2/d}}{n^{\frac{p}{2k(d+1)}-\delta }}\right) \right) \)-approximation for \(\mathrm {KMeans}(X;k)\) with respect to the perturbations \(\mathcal{F}_{\phi }\). This is asymptotically optimal if \(\phi =o(\root c \of {n})\) with any \(c>4k(1+\frac{1}{d})/p\) if \(k=O(1)\), or more generally, if \(k^{1-\frac{3d}{4(d+1)}}\phi =o(n^{\frac{pd}{4k(d+1)}-\delta })\).

5.2 Maximum Matching and Maximum TSP Revisited

The balanced quantization technique allows for an improved result for the maximum matching problem when \(d\ge 3\). As for k-means clustering, the key observation for applying balanced quantization is that identical points can be treated as one, as captured in the following lemma.

Lemma 5.11

Let \(X'=((X_{1},w),\dots ,(X_{\ell },w))\), then \(w\mathrm {MaxM}(X_{1},\dots ,X_{\ell })=\mathrm {MaxM}(X').\)

Proof

Consider an optimal matching of \(X_{1},\dots ,X_{\ell }\). By mimicking, for all copies in \(X'\), the choice made for \(X_{i}\), we obtain a solution of value \(w\mathrm {MaxM}(X_{1},\dots ,X_{\ell })\), immediately yielding \(w\mathrm {MaxM}(X_{1},\dots ,X_{\ell })\le \mathrm {MaxM}(X')\).

For the other direction, consider the optimal matchings of \(X'\). There is an optimal matching that does not connect two copies of the same point: Consider any optimal matching that does connect two copies of a point, i.e., contains an edge \((c_{1},c_{2})\) between two copies of some point \(X_{i}\). Then there is some other edge \((u,v)\) in the matching such that neither \(u\) nor \(v\) is a copy of \(X_{i}\). This is due to the fact that, besides \((c_{1},c_{2})\), at most \(w-2\) matching edges can include a copy of \(X_{i}\), while the (perfect) matching contains \(w\ell /2\ge w\) edges for \(\ell \ge 2\).

Since \(c_{1}\) and \(c_{2}\) are identically placed, \(\left\| c_{1}-c_{2}\right\| =0\). Consider replacing the edges \((c_{1},c_{2})\) and \((u,v)\) by \((c_{1},u)\) and \((c_{2},v)\). By the triangle inequality, it holds that
$$\begin{aligned} \left\| c_{1}-u\right\| +\left\| c_{2}-v\right\| =\left\| u-c_{1}\right\| +\left\| c_{1}-v\right\| \ge \left\| u-v\right\| =\left\| u-v\right\| +\left\| c_{1}-c_{2}\right\| , \end{aligned}$$
hence we obtain a matching that is not worse and does not connect these two identical points. By repeating this replacement for every edge that connects two copies of the same point, we obtain the desired optimal matching. Note that the repeated application of the replacement is possible since the counting argument above is not affected by the changes we make to the matching.

As a result, we can decompose the matching into w layers, i.e., w matchings of the points \(X_{1},\dots ,X_{\ell }\), since no edge connects two copies of the same point. These w matchings \(M_{1},\dots ,M_{w}\) are independent of each other. All of them must be optimal matchings on \(X_{1},\dots ,X_{\ell }\): otherwise, we could replace a suboptimal layer by an optimal matching on \(X_{1},\dots ,X_{\ell }\) and obtain a matching of \(X'\) of larger objective value, contradicting the optimality of the matching we started with. We conclude that \(\mathrm {MaxM}(X')\le w\mathrm {MaxM}(X_{1},\dots ,X_{\ell })\). \(\square \)
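The identity of Lemma 5.11 can be verified on toy instances by brute force; the sketch below (Python, illustrative only) enumerates all perfect matchings, which is feasible for a handful of points, and checks \(w\mathrm {MaxM}(X_{1},\dots ,X_{\ell })=\mathrm {MaxM}(X')\) for \(\ell =4\) and \(w=2\).

    import math

    def max_matching_value(points):
        # brute-force maximum-weight perfect matching value on an even point set;
        # with nonnegative weights this equals the maximum-weight matching value
        pts = list(points)
        if not pts:
            return 0.0
        first, rest = pts[0], pts[1:]
        best = 0.0
        for j, partner in enumerate(rest):
            remaining = rest[:j] + rest[j + 1:]
            best = max(best, math.dist(first, partner) + max_matching_value(remaining))
        return best

    X = [(0.1, 0.2), (0.9, 0.1), (0.5, 0.8), (0.3, 0.4)]   # ell = 4 points
    w = 2                                                   # multiplicity of each point
    X_prime = [p for p in X for _ in range(w)]              # w identical copies of each
    lhs, rhs = w * max_matching_value(X), max_matching_value(X_prime)
    assert abs(lhs - rhs) < 1e-9
    print(f"w * MaxM(X) = {lhs:.6f} = MaxM(X') = {rhs:.6f}")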

The previous lemma enables us to use the balanced quantization approach. Combined with the properties obtained in Sect. 4.1, the following theorem is immediately derived.

Theorem 5.12

\(\mathrm {MaxM}\) is \(O(\ell ^{3})\)-time \((O(\ell ^{-1/(d+1)}),\varOmega (\phi ^{-1/d}))\)-quantizable with respect to \(\mathcal{F}_{\phi }\).

Proof

Let \(\ell \) be such that \(\ell =\omega (1)\) and \(\ell =o(n)\). We compute \(\mathrm {BalQ}(X)=((\mathrm {cm}(C_{1}),w),\dots ,(\mathrm {cm}(C_{\ell '}),w))\) in time O(n). Consider the maximum matching problem on \(X_{C}=(\mathrm {cm}(C_{1}),\dots ,\mathrm {cm}(C_{\ell '}))\). Let U denote the set of uncovered points of size \(|U|\le \frac{\alpha n}{\ell ^{1/(d+1)}}\) for some \(\alpha >0\). Since adding or deleting m points can change the objective by at most \(m\sqrt{d}\), we have that
$$\begin{aligned} |\mathrm {MaxM}(X)-\mathrm {MaxM}(X\setminus U)|\le |U|\sqrt{d}\le \frac{\alpha \sqrt{d}n}{\ell ^{1/(d+1)}}. \end{aligned}$$
By Lemma 5.11 and definition of \(C_{1},\dots ,C_{\ell '}\), it also holds that for some \(\beta >0\),
$$\begin{aligned} |\mathrm {MaxM}(X\setminus U)-w\mathrm {MaxM}(X_{C})|\le \frac{\beta (n-|U|)}{\ell ^{1/(d+1)}}\le \frac{\beta n}{\ell ^{1/(d+1)}}. \end{aligned}$$
We conclude that \(|\mathrm {MaxM}(X)-w\mathrm {MaxM}(X_{C})|\le (\alpha \sqrt{d}+\beta )\frac{n}{\ell ^{1/(d+1)}}\). We can compute \(w\mathrm {MaxM}(X_{C})\) optimally using a cubic-time algorithm. Since we have the lower bound of Lemma 4.3, \(\mathrm {MaxM}(X)\) is thus \(O(\ell ^{3})\)-time \((R,Q)\)-quantizable with \(R=\frac{\alpha \sqrt{d}+\beta }{\ell ^{1/(d+1)}}=O\left( \ell ^{-1/(d+1)}\right) \) and \(Q=\varOmega (\phi ^{-1/d})\). \(\square \)
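As an illustration of the quantize-and-solve step in the proof above, the following Python sketch solves the small matching instance on the cell representatives exactly and scales the result by the common weight w. The representatives are assumed to be given (e.g., as the centers of mass \(\mathrm {cm}(C_{j})\) produced by \(\mathrm {BalQ}\)), and the networkx library is used here as a stand-in for a dedicated cubic-time matching routine.

    import math
    import networkx as nx

    def solve_quantized_max_matching(centers, w):
        # build the complete graph on the ell' representatives with Euclidean weights
        G = nx.Graph()
        for i, p in enumerate(centers):
            for j in range(i + 1, len(centers)):
                G.add_edge(i, j, weight=math.dist(p, centers[j]))
        # exact maximum-weight matching on the small instance, then scale by w
        matching = nx.max_weight_matching(G, maxcardinality=True)
        value = sum(G[u][v]["weight"] for u, v in matching)
        return w * value  # estimate of MaxM(X) up to the quantization error

    # hypothetical output of a balanced quantization step: 4 representatives,
    # each standing for w = 250 nearly identical input points
    centers = [(0.12, 0.31), (0.88, 0.27), (0.47, 0.83), (0.52, 0.09)]
    print(solve_quantized_max_matching(centers, w=250))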

To strengthen our result further, consider the following lemma that is due to Duan and Pettie [17].

Lemma 5.13

Let \(\varepsilon >0\). A \((1-\varepsilon )\)-approximate maximum weighted matching can be computed in time \(O(m\varepsilon ^{-2}\log ^{3}n)\), where m is the number of edges and n is the number of vertices in the graph.

This result implies that a \((1-\varepsilon )\)-approximation to \(\mathrm {MaxM}(X_{C})\) is computable in time \(O(\ell ^{2}\varepsilon ^{-2}\log ^{3}n)\). Setting \(\varepsilon :=\frac{1}{\ell ^{1/(d+1)}}\), the approximation ratio of \((1-\varepsilon )\mathrm {MaxM}(X_{C})\) is at least
$$\begin{aligned} (1-\varepsilon )\left( 1-O\left( \frac{\phi ^{1/d}}{\ell ^{1/(d+1)}}\right) \right) \ge \left( 1-O\left( \frac{\phi ^{1/d}}{\ell ^{1/(d+1)}}\right) \right) ^{2}=1-O\left( \frac{\phi ^{1/d}}{\ell ^{1/(d+1)}}\right) , \end{aligned}$$
with high probability, which does not change the asymptotic approximation performance.

Hence, for \(\mathrm {MaxM}\), an \(O(\ell ^{2(d+2)/(d+1)}\log ^{3}n)\)-time computable probable \((1-O(\phi ^{1/d}/\ell ^{1/(d+1)}))\)-approximation exists. Put differently, in time \(O(n^{p})\) for \(1\le p\le 2(1+\frac{1}{d+1})\), we can compute an approximation that is asymptotically optimal if \(\phi =o(n^{\frac{p}{2}(1-\frac{2}{d+2})-\varepsilon })\) for some \(\varepsilon >0\). The largest adversarial power possible with this approach is hence \(\phi =o(n^{1-1/(d+1)-\varepsilon })\). This improves upon the admissible adversarial power of Sect. 4.

Remark 5.14

The results for \(\mathrm {MaxM}\) carry over to \(\mathrm {MaxTSP}\) as in Sect. 4.2. Extending a matching of a quantized instance \(X'=((X'_{1},w),\dots ,(X'_{2\ell },w))\) to a Hamiltonian cycle in linear time allows us to compute, for \(1\le p\le 2\), a solution with \(\rho \ge 1-O(\root d \of {\phi }/n^{\frac{p}{2(d+2)}-\varepsilon })\) on perturbed instances in time \(O(n^{p})\).
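The linear-time extension of a matching to a Hamiltonian cycle mentioned in Remark 5.14 can be sketched as follows (illustrative Python only; for nonnegative edge lengths, the resulting tour is at least as long as the matching it extends).

    def matching_to_tour(matching_edges):
        # concatenate the matching edges in arbitrary order; since every point is
        # covered exactly once by a perfect matching, this visits each point once
        tour = []
        for u, v in matching_edges:
            tour.extend([u, v])
        return tour  # close the cycle by returning from tour[-1] to tour[0]

    # example: a perfect matching on four (labelled) points
    print(matching_to_tour([("a", "b"), ("c", "d")]))  # ['a', 'b', 'c', 'd']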

6 Bin Packing

In this section, we will apply the grid quantization framework developed in Sect. 4 to the multidimensional bin packing problem. Let \(X=(X_{1},\dots ,X_{n})\in [0,1]^{dn}\) be a set of n items. An item \(X=(x_{1},\dots ,x_{d})\) is treated as a d-dimensional box, where \(x_{i}\) is its side length in dimension i. We define \(\mathrm {BP}_{d}(X)\) to be the minimum number of bins of volume one, i.e., d-dimensional hypercubes \([0,1]^{d}\), that are needed to pack all elements. Items must not be rotated and must be packed such that their interiors are disjoint.

In what follows, we extend the result of Karger and Onak [27], who gave linear-time asymptotically optimal approximation algorithms for smoothed instances with \(\phi =O(1)\) and for instances with i.i.d. points. These tractability results are highly interesting due to the fact that unless \(\mathsf {P}=\mathsf {NP}\), there is not even an asymptotic polynomial-time approximation scheme (APTAS) solving the two-dimensional bin packing problem [7].

While Karger and Onak’s approach appears rather problem-specific, our solution embeds nicely into our framework. The main difference of our approach lies in a much simpler rounding routine and analysis, after which we solve the problem exactly as in their distribution-oblivious algorithm. Note that their algorithm is supplied with a desired relative error \(\varepsilon >0\) and succeeds with probability \(1-2^{-\varOmega (n)}\). Although not stated for this case, we believe that their algorithm may also apply to superconstant choices of \(\phi \), at the cost of a decreased success probability. We feel that our analysis offers more insight into why bin packing is smoothed tractable by putting it into the context of our general framework.

Consider first the one-dimensional case. Unless \(\mathsf{P}=\mathsf{NP}\), the functional \(\mathrm {BP}_{1}\) does not admit a \((\frac{2}{3}+\varepsilon )\)-approximation for any constant \(\varepsilon >0\). However, asymptotic polynomial-time approximation schemes exist [22], i.e., \((1-\varepsilon )\)-approximations on instances with a sufficiently large objective value. These approximation schemes have an interesting connection to smoothed analysis due to the following property.

Lemma 6.1

There is a constant \(\gamma >0\) such that for any \(f\in \mathcal{F}_{\phi }^{n}\), we have
$$\begin{aligned} {\displaystyle \Pr _{X\sim f}}\left[ \mathrm {BP}_{d}(X)<\gamma \frac{n}{\phi ^{d}}\right] \le \exp (-\varOmega (n)). \end{aligned}$$

Proof

Since \(f_{i}\) is bounded by \(\phi \), we have \(\Pr _{X_{i}\sim f_{i}}[X_{i}\in [0,a]\times [0,1]^{d-1}]\le a\phi \) for any \(a\in [0,1]\). By a union bound over the \(d\) coordinates, we obtain \(\Pr _{X_{i}\sim f_{i}}[X_{i}\notin [a,1]^{d}]\le ad\phi \). Setting \(a:=\frac{1}{2d\phi }\), it holds with probability at least \(\frac{1}{2}\) that \(X_{i}\in [\frac{1}{2d\phi },1]^{d}\). The total volume of all items is a lower bound on the required number of bins. Defining indicator variables \(I_{i}\) with \(I_{i}=1\) if and only if \(X_{i}\in [\frac{1}{2d\phi },1]^{d}\), the expectation is bounded by
$$\begin{aligned} \mathop {\mathrm {E}}_{X\sim f}[\mathrm {BP}_{d}(X)]\ge \mathrm {E}\left[ \sum _{i=1}^{n}\mathrm {vol}(X_{i})\right] \ge \mathrm {E}\left[ \sum _{i=1}^{n}I_{i}\right] \left( \frac{1}{2d\phi }\right) ^{d}. \end{aligned}$$
The Chernoff bound of Lemma 2.1 yields
$$\begin{aligned} \Pr \left[ \mathrm {BP}_{d}(X)\le \frac{n}{4}\left( \frac{1}{2d\phi }\right) ^{d}\right] \le \exp (-n/8). \end{aligned}$$
\(\square \)
Using this bound on the objective value, we show an example of how to transform an APTAS into a PTAS on smoothed instances. Plotkin et al. [32] have shown how to compute, in time \(O(n\log \varepsilon ^{-1}+\varepsilon ^{-6}\log ^{6}\varepsilon ^{-1})\), a solution with an objective value of \(\mathrm {ALG}(X)\le (1+\varepsilon )\mathrm {BP}_{1}(X)+O(\varepsilon ^{-1}\log \varepsilon ^{-1})\). We derive
$$\begin{aligned} \rho ^{-1}=\frac{\mathrm {ALG}(X)}{\mathrm {BP}_{1}(X)}\le (1+\varepsilon ) +O\left( \frac{\varepsilon ^{-1}\log \varepsilon ^{-1}}{\mathrm {BP}_{1}(X)}\right) \le (1+\varepsilon )+O\left( \frac{\phi \varepsilon ^{-1}\log \varepsilon ^{-1}}{n}\right) , \end{aligned}$$
where the last inequality holds with high probability over the perturbation of the input. Setting \(\varepsilon :=\log n/n^{\delta }\) with some \(\delta <{1}/{6}\) yields a running time of \(O(n\log n)\) with an approximation ratio of \(\rho \ge 1-\log n/n^{\delta }-O(\phi /n^{1-\delta })\). Consequently, there is a near-linear-time asymptotically optimal approximation algorithm on instances smoothed with \(\phi =o(n^{1-\delta })\) for any \(\delta >0\). Unfortunately, this approach does not readily generalize to \(d>1\), since already for \(d=2\), no APTAS exists unless \(\mathsf{P}=\mathsf{NP}\), as shown in [7]. Nevertheless, the problem is quantizable in our framework.
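The effect of the one-dimensional parameter choice \(\varepsilon :=\log n/n^{\delta }\) above can be tabulated with a few lines of Python (arithmetic only; the constant c stands in for the unspecified constants hidden in the \(O(\cdot )\)-terms).

    import math

    def one_dim_bp_ratio_bound(n, phi, delta, c=1.0):
        # rho >= 1 - eps - O(phi / n^(1 - delta)) with eps = log(n) / n^delta
        eps = math.log(n) / n ** delta
        return 1.0 - eps - c * phi / n ** (1.0 - delta)

    for n in (10**12, 10**15, 10**18):
        print(n, round(one_dim_bp_ratio_bound(n, phi=10.0, delta=0.16), 5))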

We say that a single item \(X=(x_{1},\dots ,x_{d})\) fits in a box \(B=(b_{1},\dots ,b_{d})\) if \(x_{i}\le b_{i}\) for all \(1\le i\le d\). In this case, we write \(X\sqsubseteq B\), adopting the notation of [27]. Regarding an item as a box as well, this relation is transitive and induces the monotonicity property that for each \(X=(X_{1},\dots ,X_{n})\) and \(Y=(Y_{1},\dots ,Y_{n})\) with \(X_{i}\sqsubseteq Y_{i}\) for all \(i\), it holds that \(\mathrm {BP}_{d}(X)\le \mathrm {BP}_{d}(Y)\).
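The fits relation \(\sqsubseteq \) and the volume-based lower bound used in the proof of Lemma 6.1 are easily transcribed (illustrative Python only).

    import math

    def fits(item, box):
        # item fits in box: every side length of the item is at most that of the box
        return all(x <= b for x, b in zip(item, box))

    def volume_lower_bound(items):
        # total item volume, rounded up: a lower bound on BP_d(items)
        return math.ceil(sum(math.prod(x) for x in items))

    X, Y = (0.3, 0.4), (0.35, 0.5)
    assert fits(X, Y)  # hence BP_2(..., X, ...) <= BP_2(..., Y, ...) by monotonicity
    print(volume_lower_bound([(0.5, 0.5)] * 9))  # ceil(9 * 0.25) = 3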

To apply the quantization framework, we require a suitable bound on the rounding errors. Unlike for \(\mathrm {MaxM}\) and \(\mathrm {MaxTSP}\), no deterministic bound of \(n\delta \) is possible for a \(\delta \)-rounding: Let the instance \(X^{(n)}\) consist of n copies of \((\frac{1}{2},\dots ,\frac{1}{2})\). Packing \(2^{d}\) items per bin results in zero waste, hence \(\mathrm {BP}_{d}(X^{(n)})=n/2^{d}\). However, for any \(\delta >0\), the \(\delta \)-rounding \(Y^{(n)}\) consisting of n copies of \((\frac{1}{2}+\frac{\delta }{\sqrt{d}},\dots ,\frac{1}{2}+\frac{\delta }{\sqrt{d}})\) has an objective value of \(\mathrm {BP}_{d}(Y^{(n)})=n=2^{d}\mathrm {BP}_{d}(X^{(n)})\). Thus, a smoothed analysis of the rounding error is necessary.

Lemma 6.2

For \(f\in \mathcal{F}_{\phi }^{n}\) and \(t>0\),
$$\begin{aligned} \Pr _{X\sim f}\left[ \exists Y\in \mathcal{Y}_{X}^{t}:|\mathrm {BP}_{d}(X)-\mathrm {BP}_{d}(Y)|>2ntd\phi \right] \le 2\exp (-2n(dt\phi )^{2}). \end{aligned}$$

Note that this probability tends to zero if \(t=\omega (\frac{1}{\phi \sqrt{n}})\). Since grid quantization rounds the points to \(\ell \) distinct points by moving each item by at most \(t=\sqrt{d}\ell ^{-1/d}\), the requirement \(\ell =o(n)\) even implies that \(t=\omega (\frac{1}{\sqrt{n}})\) for \(d\ge 2\).

To prepare the proof of Lemma 6.2, we introduce the following notion.

Definition 6.3

For \(X=(X_{1},\dots ,X_{d})\in [0,1]^{d}\) and \(t\in \mathbb {R}\), define the \(\delta \)-expansion by t, written \(\delta _{t}:[0,1]^{d}\rightarrow [0,1]^{d}\), by
$$\begin{aligned} \delta _{t}(X):={\left\{ \begin{array}{ll} (\min \{X_{1}+t,1\},\min \{X_{2}+t,1\},\dots ,\min \{X_{d}+t,1\}) &{} \quad \text {if }t\ge 0,\\ (\max \{X_{1}+t,0\},\max \{X_{2}+t,0\},\dots ,\max \{X_{d}+t,0\}) &{} \quad \text {if }t<0. \end{array}\right. } \end{aligned}$$
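Definition 6.3 translates directly into a small helper (illustrative Python only).

    def delta_expansion(x, t):
        # shift every coordinate of x in [0,1]^d by t and clamp back into [0, 1]
        if t >= 0:
            return tuple(min(xi + t, 1.0) for xi in x)
        return tuple(max(xi + t, 0.0) for xi in x)

    assert delta_expansion((0.5, 0.875), 0.25) == (0.75, 1.0)
    assert delta_expansion((0.5, 0.875), -0.75) == (0.0, 0.125)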

Given \(X=(X_{1},\dots ,X_{n})\), we abbreviate \(X_{-i}:=(X_{1},\dots ,X_{i-1},X_{i+1},\dots ,X_{n})\) and write \((X_{-i},z)\) shorthand for \((X_{1},\dots ,X_{i-1},z,X_{i+1},\dots ,X_{n})\). In what follows, we explicitly distinguish random variables \(X_{i}\) from specific realizations \(\bar{X_{i}}\).

Lemma 6.4

Let \(f\in \mathcal{F}_{\phi }^{n}\), let \(\bar{X}_{-i}\in [0,1]^{d(n-1)}\) and let \(t\ge 0\). Then,
$$\begin{aligned} \mathop {\mathrm {E}}_{X_{i}}[\mathrm {BP}_{d}(X_{-i},\delta _{t}(X_{i}))\mid X_{-i} =\bar{X}_{-i}]\le \mathop {\mathrm {E}}_{X_{i}}[\mathrm {BP}_{d}(X_{-i},X_{i})\mid X_{-i}=\bar{X}_{-i}]+dt\phi . \end{aligned}$$
Symmetrically,
$$\begin{aligned} \mathop {\mathrm {E}}_{X_{i}}[\mathrm {BP}_{d}(X_{-i},\delta _{-t}(X_{i}))\mid X_{-i}= \bar{X}_{-i}]\ge \mathop {\mathrm {E}}_{X_{i}}[\mathrm {BP}_{d}(X_{-i},X_{i})\mid X_{-i}=\bar{X}_{-i}]-dt\phi . \end{aligned}$$

Proof

Let N be the smallest number of bins required to pack the items \(\bar{X}_{-i}\). Clearly, \(N\le \mathrm {BP}_{d}(\bar{X}_{-i},X_{i})\le N+1\), since any packing of \((\bar{X}_{-i},X_{i})\) must also pack \(\bar{X}_{-i}\) and it suffices to place element \(X_{i}\) in its own bin to obtain a bin packing with only one additional bin.

For simplicity, consider first the one-dimensional case. Since \(\mathrm {BP}_{1}\) is monotone in \(X_{i}\), we conclude that there exists a threshold \(T\in [0,1]\) such that \(\mathrm {BP}_{1}(\bar{X}_{-i},\bar{X})=N\) for all \(\bar{X}\le T\) and \(\mathrm {BP}_{1}(\bar{X}_{-i},\bar{X})=N+1\) for \(\bar{X}>T\). The conditional expectation of \(\mathrm {BP}_{1}(\bar{X}_{-i},X_{i})\) is hence
$$\begin{aligned} \mathop {\mathrm {E}}_{X_{i}\sim f_{i}}[\mathrm {BP}_{1}(X_{-i},X_{i})\mid X_{-i}=\bar{X}_{-i}]=N+\Pr [X_{i}>T]. \end{aligned}$$
For any \(\bar{X}_{i}\in [0,1]\), the \(\delta \)-expansion by \(t>0\) fits into N bins if and only if \(\delta _{t}(\bar{X}_{i})\le T\) by the same observation. Since \(\delta _{t}(\bar{X}_{i})\le \bar{X}_{i}+t\), we conclude
$$\begin{aligned} \mathop {\mathrm {E}}_{X_{i}\sim f_{i}}[\mathrm {BP}_{1}(X_{-i},\delta _{t}(X_{i}))\mid X_{-i}=\bar{X}_{-i}]\le N+\Pr [X_{i}>T-t]. \end{aligned}$$
Since \(\Pr [T-t<X_{i}<T]\le \phi t\), the claim follows for \(d=1\).
For \(d>1\), we distinguish two kinds of boxes. A small box is a box that can be packed into the empty space of some placement of the other boxes into N bins. A large box necessarily increases the number of bins needed to pack all items to \(N+1\). Let \(\mathcal{S}(\bar{X}_{-i})\) be the set of small boxes, i.e., \(\mathrm {BP}_{d}(\bar{X}_{-i},\bar{X})=N\) if \(\bar{X}\in \mathcal{S}(\bar{X}_{-i})\) and \(\mathrm {BP}_{d}(\bar{X}_{-i},\bar{X})=N+1\) otherwise. The conditional expectation of \(\mathrm {BP}_{d}(\bar{X}_{-i},X_{i})\) is given by
$$\begin{aligned} \mathop {\mathrm {E}}_{X_{i}\sim f_{i}}[\mathrm {BP}_{d}(X_{-i},X_{i})\mid X_{-i}=\bar{X}_{-i}]=N+\Pr [X_{i}\notin \mathcal{S}(\bar{X}_{-i})]. \end{aligned}$$
For any \(\bar{X}_{i}\in [0,1]^{d}\), its \(\delta \)-expansion by \(t>0\) fits into N bins containing \(\bar{X}_{-i}\) if and only if \(\delta _{t}(\bar{X}_{i})\in \mathcal{S}(\bar{X}_{-i})\). Consider \(\delta _{-t}(\mathcal{S}(\bar{X}_{-i})):=\{\delta _{-t}(s)\mid s\in \mathcal{S}(\bar{X}_{-i})\}\), then this is implied by \(\bar{X}_{i}\in \delta _{-t}(\mathcal{S}(\bar{X}_{-i}))\). To analyze the quantity \(\varDelta {:=}\mathrm {vol}(\mathcal{S}(\bar{X}_{-i}))-\mathrm {vol}(\delta _{-t}(\mathcal{S}(\bar{X}_{-i})))\), note that shrinking a geometric object contained in the unit cube along a single dimension by an additive amount of t decreases its volume by at most t. Applying this for all dimensions yields \(\varDelta \le dt\). We conclude,
$$\begin{aligned} \mathop {\mathrm {E}}_{X_{i}\sim f_{i}}[\mathrm {BP}_{d}(X_{-i},\delta _{t}(X_{i}))\mid X_{-i}=\bar{X}_{-i}]\le N+\Pr [X_{i}\notin \delta _{-t}(\mathcal{S}(\bar{X}_{-i}))], \end{aligned}$$
and the first statement of the claim follows by observing that \(\Pr [X_{i}\in \mathcal{S}(\bar{X}_{-i})\setminus \delta _{-t}(\mathcal{S}(\bar{X}_{-i}))]\le \phi \varDelta \). The second statement can be proven analogously. \(\square \)

Lemma 6.4 bounds the expected increase of the objective value incurred by \(\delta \)-expanding a single item. Applying it to each item in turn, these expected increases add up when all items are \(\delta \)-expanded by t.

Lemma 6.5

Let \(f\in \mathcal{F}_{\phi }^{n}\) and \(t\ge 0\),
$$\begin{aligned} \mathop {\mathrm {E}}_{X\sim f}[\mathrm {BP}_{d}(\delta _{t}(X))]\le \mathop {\mathrm {E}}_{X\sim f}[\mathrm {BP}_{d}(X)]+ndt\phi . \end{aligned}$$
Symmetrically, \(\mathrm {E}[\mathrm {BP}_{d}(\delta _{-t}(X))]\ge \mathrm {E}[\mathrm {BP}_{d}(X)]-ndt\phi \).

Proof

Let \(1\le i\le n\). Then
$$\begin{aligned}&\mathop {\mathrm {E}}[\mathrm {BP}_{d}(X_{1},\dots ,X_{i-1},\delta _{t}(X_{i}),\delta _{t}(X_{i+1}),\dots ,\delta _{t}(X_{n}))]\\&\quad = \mathop {\mathrm {E}}_{X_{-i}}[\mathop {\mathrm {E}}_{X_{i}}[\mathrm {BP}_{d}(\bar{X}_{1},\dots ,\bar{X}_{i-1},\delta _{t}(X_{i}),\delta _{t}(\bar{X}_{i+1}),\dots ,\delta _{t}(\bar{X}_{n}))\mid X_{-i}=\bar{X}_{-i}]]\\&\quad \le \mathop {\mathrm {E}}_{X_{-i}}[\mathop {\mathrm {E}}_{X_{i}}[\mathrm {BP}_{d}(\bar{X}_{1},\dots ,\bar{X}_{i-1},X_{i},\delta _{t}(\bar{X}_{i+1}),\dots ,\delta _{t}(\bar{X}_{n}))\mid X_{-i}=\bar{X}_{-i}]+\phi dt]\\&\quad = \mathrm {E}[\mathrm {BP}_{d}(X_{1},\dots ,X_{i-1},X_{i},\delta _{t}(X_{i+1}),\dots ,\delta _{t}(X_{n}))]+td\phi . \end{aligned}$$
Applying this transformation for all \(1\le i\le n\) successively yields the result.

The symmetric statement follows analogously. \(\square \)

With the property that any \(\delta \)-expansion by t has only a small effect on the expected difference of the number of required bins, an application of Azuma’s inequality (Lemma 2.2) shows that this holds, in fact, with high probability.

Proof (of Lemma 6.2) By monotonicity of \(\mathrm {BP}_{d}\), all Y with \(\left\| Y_{i}-X_{i}\right\| \le t\) for all i satisfy \(\mathrm {BP}_{d}(\delta _{-t}(X))\le \mathrm {BP}_{d}(Y)\le \mathrm {BP}_{d}(\delta _{t}(X))\). Hence, if \(|\mathrm {BP}_{d}(X)-\mathrm {BP}_{d}(Y)|>2ntd\phi \), then
$$\begin{aligned} \mathrm {BP}_{d}(\delta _{t}(X))\ge \mathrm {BP}_{d}(Y)>\mathrm {BP}_{d}(X)+2ntd\phi \ge \mathrm {E}[\mathrm {BP}_{d}(\delta _{t}(X))]+ntd\phi , \end{aligned}$$
or
$$\begin{aligned} \mathrm {BP}_{d}(\delta _{-t}(X))\le \mathrm {BP}_{d}(Y)<\mathrm {BP}_{d}(X)-2ntd\phi \le \mathrm {E}[\mathrm {BP}_{d}(\delta _{-t}(X))]-ntd\phi . \end{aligned}$$
Since the number of bins needed to pack all items can change by at most one when a single item is modified, we can apply Azuma’s inequality (Lemma 2.2) and obtain
$$\begin{aligned} \Pr [|\mathrm {BP}_{d}(X)-\mathrm {BP}_{d}(Y)|>2ntd\phi ]\le & {} \Pr [|\mathrm {BP}_{d}(\delta _{t}(X))-\mathrm {E}[\mathrm {BP}_{d}(\delta _{t}(X))]|>ntd\phi ]\\\le & {} 2\exp (-2n(td\phi )^{2}), \end{aligned}$$
which concludes the analysis of the smoothed rounding error for bin packing. \(\square \)

Solving the high-multiplicity version of the one-dimensional case has been a key ingredient in approximation schemes for this problem since the first APTAS by [22]. The following lemma from [27] solves the multi-dimensional case.

Lemma 6.6

Let \(X'=((X'_{1},n_{1}),\dots ,(X'_{\ell },n_{\ell }))\) be a quantized input with \(X'_{i}\in [\delta ,1]^{d}\). Then \(\mathrm {BP}_{d}(X')\) can be computed in time \(O(f(\ell ,\delta )\mathrm {polylog}(n))\) where \(n:=\sum _{i=1}^{\ell }n_{i}\), \(f(\ell ,\delta )\) is independent of n and \(f(\ell ,{1}/{\root d \of {\ell }})=2^{\ell ^{O(\ell )}}\).

Observe that each coordinate of the quantized points \(X'_{i}\) obtained by the grid quantization theorem (Theorem 4.1) is at least \(\ell ^{-1/d}\), since we may assume that \(\mathrm {GridQ}\) represents each hypercube \(Q_{i}^{k}\) by its maximal element. Hence, Lemmas 6.1, 6.2 and 6.6 fulfill the requirements for the grid quantization theorem.
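For illustration, a simplified grid rounding consistent with this observation (not the exact \(\mathrm {GridQ}\) routine of Sect. 4) rounds every item up to the maximal corner of its grid cell and collects multiplicities; with \(m=\lceil \ell ^{1/d}\rceil \) cells per dimension, every rounded coordinate is at least 1/m, and rounding up can only increase \(\mathrm {BP}_{d}\) by monotonicity.

    import math
    from collections import Counter

    def grid_round_up(items, ell):
        # round every item up to the maximal corner of its grid cell; with
        # m = ceil(ell^(1/d)) cells per dimension, each item moves by at most
        # sqrt(d)/m and every rounded coordinate is at least 1/m
        d = len(items[0])
        m = math.ceil(ell ** (1.0 / d))
        def round_up(c):
            return min(math.ceil(c * m), m) / m if c > 0 else 1.0 / m
        rounded = Counter(tuple(round_up(c) for c in x) for x in items)
        return list(rounded.items())  # [(representative point, multiplicity), ...]

    items = [(0.12, 0.80), (0.14, 0.79), (0.55, 0.55), (0.98, 0.02)]
    print(grid_round_up(items, ell=16))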

Theorem 6.7

For \(d\ge 2\), \(\mathrm {BP}_{d}\) is \(2^{\ell ^{O(\ell )}}\)-time \((O(\frac{\phi }{\root d \of {\ell }}),\varOmega (\frac{1}{\phi ^{d}}))\)-quantizable with respect to \(\mathcal{F}_{\phi }\).

Hence, there is a linear-time probable \((1-O(\phi ^{d+1}/(\log \log n/\log ^{(3)}n)^{1/d}))\)-approximation. Thus, \(\mathrm {BP}_{d}\) can be computed asymptotically exactly in time O(n) if \(\phi =o((\log \log n/\log ^{(3)}n)^{1/(d(d+1))})\). Here, allowing superlinear time has no effect on the admissible adversarial power. Furthermore, since \(\mathrm {BP}_{d}\) can be trivially approximated by a factor of n and the success probability of our algorithm is of order \(1-\exp (-\varOmega (n^{1-\varepsilon }))\), asymptotically optimal expected approximation ratios can be obtained for the same values of \(\phi \).

7 Concluding Remarks

Generalizing previous rounding-based approaches, we demonstrate that the general solution technique of quantization performs well on Euclidean optimization problems in the setting of smoothed analysis. We are optimistic that our framework can also be applied to disk covering and scheduling problems.

Note that our approach is orthogonal to the framework for smooth and near-additive Euclidean functionals by Bläser et al. [12]: By definition, a smooth Euclidean functional F on n points is bounded by \(O(n^{1-1/d})\). Hence, it can never compensate for the rounding error of at least \(\varOmega (\ell ^{-1/d})\) per point that our quantization methods induce, as quantization is only reasonable for \(\ell \le n\) and consequently, the total rounding error amounts to \(\varOmega (n^{1-1/d})\). Conversely, if a functional is large enough to compensate for rounding errors induced by quantization, it cannot be smooth. Thus, for any Euclidean functional, at most one of both frameworks is applicable.

This observation is especially interesting in the context of the work of Bern and Eppstein [11]. For a general class of subadditive geometric graphs, which includes the optimal solutions of some problems tractable in the two smoothed analysis frameworks, they prove a gap theorem on the worst-case sums of the edge lengths of the graphs. Either these sums are bounded by \(O(n^{1-1/d})\) on all point sets, or there exists a point set inducing a graph of total edge length \(\varOmega (n)\). It would be interesting to explore whether such a gap behavior persists in the setting of smoothed analysis, and whether general conditions for subadditive geometric graphs to be smoothed tractable can be identified, potentially exploiting both smoothed analysis frameworks on their respective sides of the gap.

Footnotes

  1. If the framework algorithm fails with probability at most p, then an o(1/p)-approximation algorithm would also suffice to ensure expected asymptotic optimality. At this point, we require O(1)-approximations only for simplicity of presentation. In Sect. 6, we will make use of a slightly more precise analysis of the failure probability of the framework algorithm to use an n-approximation for bin packing.


Acknowledgments

The authors are grateful to Markus Bläser for kindling their interest in smoothed analysis and for stimulating discussions, and to the anonymous reviewers of this article for providing helpful remarks.

References

  1. Anstee, R.P.: A polynomial algorithm for b-matchings: an alternative approach. Inf. Proc. Lett. 24(3), 153–157 (1987)
  2. Arthur, D., Manthey, B., Röglin, H.: Smoothed analysis of the k-means method. J. ACM 58(5), 19:1–19:31 (2011)
  3. Arthur, D., Vassilvitskii, S.: k-means++: the advantages of careful seeding. In: 18th Annual ACM-SIAM Symposium on Discrete Algorithms, SODA’07, pp. 1027–1035. SIAM (2007)
  4. Arthur, D., Vassilvitskii, S.: Worst-case and smoothed analysis of the ICP algorithm, with an application to the k-means method. SIAM J. Comput. 39(2), 766–782 (2009)
  5. Avis, D.: A survey of heuristics for the weighted matching problem. Networks 13(4), 475–493 (1983)
  6. Awasthi, P., Blum, A., Sheffet, O.: Center-based clustering under perturbation stability. Inf. Proc. Lett. 112(1–2), 49–54 (2012)
  7. Bansal, N., Correa, J.R., Kenyon, C., Sviridenko, M.: Bin packing in multiple dimensions: inapproximability results and approximation schemes. Math. Oper. Res. 31, 31–49 (2006)
  8. Barvinok, A., Fekete, S.P., Johnson, D.S., Tamir, A., Woeginger, G.J., Woodroofe, R.: The geometric maximum traveling salesman problem. J. ACM 50(5), 641–664 (2003)
  9. Barvinok, A.I.: Two algorithmic results for the traveling salesman problem. Math. Oper. Res. 21(1), 65–84 (1996)
  10. Beier, R., Vöcking, B.: Typical properties of winners and losers in discrete optimization. SIAM J. Comput. 35(4), 855–881 (2006)
  11. Bern, M., Eppstein, D.: Worst-case bounds for subadditive geometric graphs. In: 9th Annual Symposium on Computational Geometry, SCG’93, pp. 183–188. ACM, New York (1993)
  12. Bläser, M., Manthey, B., Rao, B.V.R.: Smoothed analysis of partitioning algorithms for Euclidean functionals. Algorithmica 66(2), 397–418 (2013)
  13. Boros, E., Elbassioni, K., Fouz, M., Gurvich, V., Makino, K., Manthey, B.: Stochastic mean payoff games: smoothed analysis and approximation schemes. In: 38th International Colloquium on Automata, Languages and Programming, ICALP’11, pp. 147–158. Springer (2011)
  14. Chen, K.: On coresets for k-median and k-means clustering in metric and Euclidean spaces and their applications. SIAM J. Comput. 39(3), 923–947 (2009)
  15. Curticapean, R., Künnemann, M.: A quantization framework for smoothed analysis of Euclidean optimization problems. In: 21st European Symposium on Algorithms, ESA’13, pp. 349–360. Springer, Berlin (2013)
  16. Dasgupta, S.: The hardness of k-means clustering. Technical report cs2007-0890, University of California, San Diego (2007)
  17. Duan, R., Pettie, S.: Approximating maximum weight matching in near-linear time. In: 51st Annual IEEE Symposium on Foundations of Computer Science, FOCS’10, pp. 673–682. IEEE Computer Society, Washington, DC (2010)
  18. Dyer, M.E., Frieze, A.M., McDiarmid, C.J.H.: Partitioning heuristics for two geometric maximization problems. Oper. Res. Lett. 3(5), 267–270 (1984)
  19. Englert, M., Röglin, H., Vöcking, B.: Worst case and probabilistic analysis of the 2-opt algorithm for the TSP: extended abstract. In: 18th Annual ACM-SIAM Symposium on Discrete Algorithms, SODA’07, pp. 1295–1304. SIAM (2007)
  20. Fekete, S.P., Meijer, H., Rohe, A., Tietze, W.: Solving a “hard” problem to approximate an “easy” one: heuristics for maximum matchings and maximum traveling salesman problems. ACM J. Exp. Algorithmics 7, 11 (2002)
  21. Feldman, D., Monemizadeh, M., Sohler, C.: A PTAS for k-means clustering based on weak coresets. In: 23rd Annual Symposium on Computational Geometry, SCG’07, pp. 11–18. ACM (2007)
  22. Fernandez de la Vega, W., Lueker, G.: Bin packing can be solved within \(1 + \epsilon \) in linear time. Combinatorica 1(4), 349–355 (1981)
  23. Gabow, H.N.: An efficient implementation of Edmonds’ algorithm for maximum matching on graphs. J. ACM 23(2), 221–234 (1976)
  24. Har-Peled, S., Mazumdar, S.: On coresets for k-means and k-median clustering. In: 36th Annual ACM Symposium on Theory of Computing, STOC’04, pp. 291–300 (2004)
  25. Inaba, M., Katoh, N., Imai, H.: Applications of weighted Voronoi diagrams and randomization to variance-based k-clustering (extended abstract). In: 10th Annual Symposium on Computational Geometry, SCG’94, pp. 332–339 (1994)
  26. Kanungo, T., Mount, D.M., Netanyahu, N.S., Piatko, C.D., Silverman, R., Wu, A.Y.: A local search approximation algorithm for k-means clustering. Comput. Geom. Theory Appl. 28(2–3), 89–112 (2004)
  27. Karger, D., Onak, K.: Polynomial approximation schemes for smoothed and random instances of multidimensional packing problems. In: 18th Annual ACM-SIAM Symposium on Discrete Algorithms, SODA’07, pp. 1207–1216 (2007)
  28. Karp, R.M., Luby, M., Marchetti-Spaccamela, A.: A probabilistic analysis of multidimensional bin packing problems. In: 16th Annual ACM Symposium on Theory of Computing, STOC’84, pp. 289–298. ACM, New York (1984)
  29. Mahajan, M., Nimbhorkar, P., Varadarajan, K.: The planar k-means problem is NP-hard. Theor. Comput. Sci. 442, 13–21 (2012)
  30. Manthey, B., Röglin, H.: Smoothed analysis: analysis of algorithms beyond worst case. Inf. Technol. 53(6), 280–286 (2011)
  31. McDiarmid, C.: Concentration. In: Habib, M., McDiarmid, C., Ramirez-Alfonsin, J., Reed, B. (eds.) Probabilistic Methods for Algorithmic Discrete Mathematics, Volume 16 of Algorithms and Combinatorics, pp. 195–248. Springer, Berlin (1998)
  32. Plotkin, S.A., Shmoys, D.B., Tardos, É.: Fast approximation algorithms for fractional packing and covering problems. Math. Oper. Res. 20(2), 257 (1995)
  33. Spielman, D.A., Teng, S.: Smoothed analysis: an attempt to explain the behavior of algorithms in practice. Commun. ACM 52(10), 76–84 (2009)
  34. Spielman, D.A., Teng, S.-H.: Smoothed analysis of algorithms: why the simplex algorithm usually takes polynomial time. J. ACM 51(3), 385–463 (2004)
  35. Steele, J.M.: Subadditive Euclidean functionals and nonlinear growth in geometric probability. Ann. Probab. 9(3), 365–376 (1981)
  36. Weber, M., Liebling, T.M.: Euclidean matching problems and the Metropolis algorithm. Math. Methods Oper. Res. 30(3), A85–A110 (1986)

Copyright information

© Springer Science+Business Media New York 2015

Authors and Affiliations

  1. Saarbrücken Graduate School of Computer Science, Saarbrücken, Germany
  2. Department of Computer Science, Saarland University, Saarbrücken, Germany
  3. Max Planck Institute for Informatics, Saarbrücken, Germany
