# A Quantization Framework for Smoothed Analysis of Euclidean Optimization Problems

## Abstract

We consider the smoothed analysis of Euclidean optimization problems. Here, input points are sampled according to density functions that are bounded by a sufficiently small smoothness parameter \(\phi \). For such inputs, we provide a general and systematic approach that allows designing linear-time approximation algorithms whose output is asymptotically optimal, both in expectation and with high probability. Applications of our framework include maximum matching, maximum TSP, and the classical problems of k-means clustering and bin packing. Apart from generalizing corresponding average-case analyses, our results extend and simplify a polynomial-time probable approximation scheme on multidimensional bin packing on \(\phi \)-smooth instances, where \(\phi \) is constant (Karger and Onak in Polynomial approximation schemes for smoothed and random instances of multidimensional packing problems, pp 1207–1216, 2007). Both techniques and applications of our rounding-based approach are orthogonal to the only other framework for smoothed analysis of Euclidean problems we are aware of (Bläser et al. in Algorithmica 66(2):397–418, 2013).

### Keywords

Smoothed analysis · Euclidean optimization problems · Bin packing · Maximum matching · Maximum traveling salesman problem

## 1 Introduction

Smoothed analysis has been introduced by Spielman and Teng [34] to give a theoretical foundation for analyzing the practical performance of algorithms. In particular, this analysis paradigm was able to provide an explanation why the simplex method is observed to run fast in practice despite its exponential worst-case running time. For a detailed overview, we refer to two surveys on smoothed analysis [30, 33].

The key concept of smoothed analysis, i.e., letting an adversary choose worst-case distributions of bounded “power” to determine input instances, is especially well-motivated in a Euclidean setting. Here, input points are typically determined by physical measurements, which are subject to an inherent inaccuracy, e.g., from locating a position on a map. For clustering problems, it is often even implicitly assumed that the points are sampled from unknown probability distributions which are sought to be recovered.

Making the mentioned assumptions explicit, we call a problem *smoothed tractable* if it admits a linear-time algorithm with an approximation ratio that is bounded by \(1-o(1)\) with high probability over the input distribution specified by the adversary. Such an approximation performance is called *asymptotically optimal*. We provide a unified approach to show that several Euclidean optimization problems are smoothed tractable, which sheds light onto the properties that render a Euclidean optimization problem likely to profit from perturbed input.

We employ the *one-step model*, a widely-used and very general perturbation model, which has been successfully applied to analyze a number of algorithms [10, 12, 13, 19]. In this model, an adversary chooses probability densities on the input space, according to which the input instance is drawn. To prevent the adversary from modeling a worst-case instance too closely, we bound the density functions from above by a parameter \(\phi \). Roughly speaking, for large \(\phi \), we expect the algorithm to perform almost as bad as on worst-case instances. Likewise, choosing \(\phi \) as small as possible requires the adversary to choose the uniform distribution on the input space, corresponding to an average-case analysis. Thus, the adversarial power \(\phi \) serves as an interpolation parameter between worst and average case.

For a given performance measure, such as the running time or the achieved approximation ratio, we define the smoothed performance of an algorithm under the perturbation model \(\mathcal{F}\) as its worst-case expected performance, where the worst case is taken over all feasible distributions \(f\in \mathcal{F}^{n}\) and the expectation over instances \(X\sim f\).

For given \(\phi \), we require the density functions chosen by the adversary to be bounded by \(\phi \). For real-valued input, this includes the possibility to add uniform noise in an interval of length \({1}/{\phi }\) or Gaussian noise with variance \(\sigma =\varTheta ({1}/{\phi })\). In the Euclidean case, the adversary could, e.g., specify for each point a box of volume at least \({1}/{\phi }\), in which the point is distributed uniformly.
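To make the model concrete, the following sketch draws a \(\phi \)-smooth instance from per-point uniform boxes, one feasible adversary strategy mentioned above. It is illustrative code only; `sample_one_step` is our naming, not part of the framework.

```python
import random

def sample_one_step(boxes, phi, seed=0):
    """Draw one phi-smooth instance in the one-step model: the adversary
    fixes, for each point, an axis-aligned box of volume at least 1/phi,
    and the point is then drawn uniformly from its box (density <= phi)."""
    rng = random.Random(seed)
    points = []
    for lo, hi in boxes:
        vol = 1.0
        for l, h in zip(lo, hi):
            vol *= h - l
        # A uniform density on this box has value 1/vol, so we need vol >= 1/phi.
        assert vol * phi >= 1 - 1e-9, "box too small for a phi-bounded density"
        points.append(tuple(rng.uniform(l, h) for l, h in zip(lo, hi)))
    return points

# A phi = 4 adversary in d = 2: boxes of side 1/2, i.e., volume 1/4 = 1/phi.
instance = sample_one_step([((0.0, 0.0), (0.5, 0.5)),
                            ((0.5, 0.5), (1.0, 1.0))], phi=4)
```

For \(\phi =1\) the only feasible box is \([0,1]^{d}\) itself (the average case), while growing \(\phi \) lets the adversary confine each point ever more tightly.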

### 1.1 Related Work

Recently, Bläser, Manthey and Rao [12] established a framework for analyzing the expectation of both running times and approximation ratios for some partitioning algorithms on so-called *smooth* and *near-additive* functionals. We establish a substantially different framework for smoothed analysis on a general class of Euclidean functionals that is disjoint to the class of smooth and near-additive functionals (see Sect. 7 for further discussion). We contrast both frameworks by considering the maximization counterparts of two problems studied in [12], namely Euclidean matching and TSP. Our algorithms have the advantage of deterministic running times and asymptotically optimal approximation guarantees both in expectation and with high probability.

All other related works are problem-specific and will be described in the corresponding sections. As an exception, we highlight the result of Karger and Onak [27], who studied bin packing. To the best of our knowledge, this is the only problem that fits into our framework and has already been analyzed under perturbation. In their paper, a linear-time algorithm for bin packing was given that is asymptotically optimal on instances smoothed with any constant \(\phi \) and instances in which each input point is drawn from an identical, but arbitrary probability density function. We provide a new, conceptually simpler rounding method and analysis that replaces a key step of their algorithm and puts the reasons for its smoothed tractability into a more general context.

### 1.2 Our Results

We provide very fast and simple approximation algorithms on sufficiently smoothed inputs for the following problems: The maximum Euclidean matching problem \(\mathrm {MaxM}\), the maximum Euclidean Traveling Salesman problem \(\mathrm {MaxTSP}\), the *k*-means clustering problem \(\mathrm {KMeans}\) where *k* denotes the number of desired clusters and is part of the input, and the *d*-dimensional bin packing problem \(\mathrm {BP}_{d}\). The approximation ratio converges to one with high probability over the random inputs. Additionally, all of these algorithms can be adapted to yield asymptotically optimal expected approximation ratios as well. This generalizes corresponding average-case analysis results [18, 28].

Almost all our algorithms allow trade-offs between running time and approximation performance: by choosing a parameter *p* within its feasible range, we obtain algorithms of running time \(O(n^{p})\) whose approximation ratios converge to 1 as \(n\rightarrow \infty \), provided that \(\phi \) is small enough, where the restriction on \(\phi \) depends on *p*. The general trade-offs for our algorithms are listed in Table 2; the special case of linear-time algorithms is summarized in Table 1.

### 1.3 Organization of the Paper

The remainder of this paper is organized as follows. Section 2 introduces notation and probabilistic tools, and Sect. 3 presents the quantization framework. Sections 4 and 5 develop the grid and balanced quantization methods, with applications to maximum matching, maximum TSP and k-means clustering. Section 6 is devoted to bin packing, and Sect. 7 compares our framework to the one of Bläser et al. [12].

**Table 1** All (near) linear-time algorithms derived in our framework

Problem | Running time | Restriction on adversary power | Reference
---|---|---|---
\(\mathrm {MaxM}\) | \(O(n)\) | \(\phi =o\left( \root 4 \of {n}\right) \) or \(\phi =o\left( n^{\frac{1}{2}\frac{d}{d+2}-\varepsilon }\right) \) | Sections 4.1, 5.2
\(\mathrm {MaxTSP}\) | \(O(n)\) | \(\phi =o\left( \root 4 \of {n}\right) \) or \(\phi =o\left( n^{\frac{1}{2}\frac{d}{d+2}-\varepsilon }\right) \) | Sections 4.2, 5.2
\(\mathrm {KMeans}\) | \(O(n)\) | \(k\phi =o\left( n^{\frac{1}{2}\frac{1}{kd+1}\frac{d}{d+1}}\right) \) | Section 5.1
\(\mathrm {BP}_{1}\) | \(O(n\log n)\) | \(\phi =o(n^{1-\varepsilon })\) | Section 6
\(\mathrm {BP}_{d}\) | \(O(n)\) | \(\phi =o\left( \root d(d+1) \of {\log \log n/\log ^{(3)}n}\right) \) | Section 6

**Table 2** Our results

Problem | Running time | Approximation ratio | Restrictions
---|---|---|---
\(\mathrm {MaxM}(X)\) | \(O(n^{p})\) | \(1-O\left( \root d \of {\frac{\phi }{n^{p/4}}}\right) \) | \(1\le p<4\)
 | \(O(n^{p})\) | \(1-O\left( \frac{\root d \of {\phi }}{n^{\frac{p}{2}\frac{1}{d+2}-\varepsilon }}\right) \) | \(1\le p\le 2\left( 1+\frac{1}{d+1}\right) \), \(\varepsilon >0\)
\(\mathrm {MaxTSP}(X)\) | \(O(n^{p})\) | \(1-O\left( \root d \of {\frac{\phi }{n^{p/4}}}\right) \) | \(1\le p\le 4\left( 1-\frac{1}{d+1}\right) \)
 | \(O(n^{p})\) | \(1-O\left( \frac{\root d \of {\phi }}{n^{\frac{p}{2}\frac{1}{d+2}-\varepsilon }}\right) \) | \(1\le p\le 2\left( 1+\frac{1}{d+1}\right) \), \(\varepsilon >0\)
\(\mathrm {KMeans}(X;k)\) | \(O(n^{p})\) | \(1-O\left( \frac{(k\phi )^{2/d}}{n^{\frac{p}{(kd+1)(d+1)}}}\right) \) | \(1\le p<kd+1\), \(k=O(\log n/\log \log n)\)
 | \(O(n^{p})\) | \(1-O\left( \frac{k^{\frac{2}{d}-\frac{3}{2(d+1)}}\phi {}^{2/d}}{n^{\frac{p}{2k(d+1)}-\varepsilon }}\right) \) | \(1\le p<2k\), \(k=O(\log n/\log \log n)\)
\(\mathrm {BP}_{1}(X)\) | \(O(n\log n)\) | \(1-\log n/n^{\varepsilon }-O(\phi /n^{1-\varepsilon })\) | \(\varepsilon >0\)
\(\mathrm {BP}_{d}(X)\) | \(O(n)\) | \(1-O\left( \frac{\phi ^{d+1}}{\root d \of {\frac{\log \log n}{\log ^{(3)}n}}}\right) \) |

## 2 Preliminaries

Given an *n*-tuple of density functions \(f=(f_{1},\dots ,f_{n})\) and random variables \(X=(X_{1},\ldots ,X_{n})\), we write \(X\sim f\) for drawing \(X_{i}\) according to \(f_{i}\) for \(1\le i\le n\). We call \(Y=(Y_{1},\dots ,Y_{n})\) a \(\delta \)*-rounding* of *X* if \(\left\| X_{i}-Y_{i}\right\| \le \delta \) for all \(1\le i\le n\). For a given *X*, let \(\mathcal{Y}_{X}^{\delta }\) be the set of \(\delta \)-roundings of *X*. We will frequently round members of a set *C* to their center of mass \(\mathrm {cm}(C):=\frac{1}{|C|}\sum _{c\in C}c\). For a collection of points *C*, its diameter is defined as \(\mathrm {diam}(C):=\max _{c,c'\in C} \left\| c-c'\right\| \).

We will analyze Euclidean functionals \(F:([0,1]^{d})^{*}\rightarrow \mathbb {R}\), denoting the dimension of the input space by a constant \(d\in \mathbb {N}\) independent of *n*. For formalizing the perturbation model, let \(\phi :\mathbb {N}\rightarrow [1,\infty )\) be an arbitrary function measuring the adversary’s power. For better readability, we usually write \(\phi \) instead of \(\phi (n)\). We define \(\mathcal{F}_{\phi }\) to be the set of *feasible* probability density functions \(f:[0,1]^{d}\rightarrow [0,\phi ]\). Hence \(\mathcal{F}_{\phi }^{n}:=\mathcal{F}_{\phi (n)}^{n}\) is the set from which a \(\phi \)-bounded adversary may choose the input distributions.

Note that if \(\phi =1\), the set \(\mathcal{F}_{\phi }\) only consists of the uniform distribution on \([0,1]^{d}\), which constitutes an average-case analysis. If, however, \(\phi =n\), the adversary may specify disjoint boxes for each point. Intuitively, to obtain a particular worst-case instance, the adversary would need to specify Dirac delta functions, which corresponds figuratively to setting \(\phi \) to infinity. Observe also that already \(\phi =\omega (1)\) suffices to let the region of possible locations of an individual point \(X_{i}\) shrink towards a single point as the number of input points increases; hence we believe that a superconstant \(\phi \) is especially interesting to analyze.

We will often exploit the following standard argument in smoothed analyses. For a \(\phi \)-bounded adversary, the probability that a specific input point \(X_{i}\) is contained in a ball \(B_{r}(c)\) with radius \(r\in \mathbb {R}_{\ge 0}\) and center \(c\in \mathbb {R}^{d}\) is bounded by \(\phi \cdot \mathrm {vol}(B_{r}(c))=\phi \cdot v(d)\cdot r^{d}\), where the constant *v*(*d*) depends only on *d*.
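This bound can be checked empirically. The following sketch is an illustrative Monte Carlo experiment under assumed parameters \(d=2\) and \(\phi =4\) (the density is uniform at value \(\phi \) on a sub-square of volume \(1/\phi \)); it estimates the hitting probability of a ball and compares it with \(\phi \cdot v(2)\cdot r^{2}=\phi \pi r^{2}\). The function name is ours.

```python
import math, random

def ball_hit_probability(phi_box, r, center, trials=200_000, seed=1):
    """Monte Carlo estimate of Pr[X in B_r(center)] when X is uniform on a
    sub-square of volume 1/phi_box, i.e., a phi-bounded density in d = 2."""
    rng = random.Random(seed)
    side = 1.0 / math.sqrt(phi_box)          # square of volume 1/phi_box
    hits = 0
    for _ in range(trials):
        x = (rng.uniform(0, side), rng.uniform(0, side))
        if math.dist(x, center) <= r:
            hits += 1
    return hits / trials

phi, r = 4.0, 0.1
estimate = ball_hit_probability(phi, r, center=(0.25, 0.25))
bound = phi * math.pi * r * r                # phi * v(2) * r^2 with v(2) = pi
```

For this density the bound is tight (the ball lies inside the support), so the estimate comes out close to, but never substantially above, the bound.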

For a given Euclidean functional *F*, we analyze the approximation ratio \(\rho \) of approximation algorithms \(\mathrm {ALG}\). If the functional is induced by an optimization problem, we do not focus on constructing a feasible approximate solution, but rather on computing an approximation of the objective value. However, we adopt this simplification only for clarity of presentation. Each of the discussed algorithms can be tuned such that it also outputs a feasible approximate *solution* for the underlying optimization problem. The approximation ratio on instance *X* is defined as \(\rho (X)=\min \left\{ \frac{\mathrm {ALG}(X)}{F(X)},\frac{F(X)}{\mathrm {ALG}(X)}\right\} \), which allows to handle both maximization and minimization problems at once.

For analyzing running times, we assume the word RAM model of computation and reveal real-valued input by reading in words of \(w\ge \log n\) bits in unit time per word. We call an approximation algorithm a *probable*\(g_{\phi }(n)\)*-approximation* on smoothed instances if \(\rho (X)\ge g_{\phi }(n)\) with high probability, i.e., with probability \(1-o(1)\), when *X* is drawn from any \(f\in \mathcal{F}_{\phi }^{n}\). The algorithms derived in our framework feature deterministic running times \(t(n)=\mathrm {poly}(n)\) and asymptotically optimal approximation ratios \(g_{\phi }(n)\), i.e., \(g_{\phi }(n)\rightarrow 1\) for \(n\rightarrow \infty \), if \(\phi \) is small enough.

### 2.1 Tools from Probability Theory

We will make use of the following tools from probability theory, see, e.g., [31]. The first lemma is a simple variant of the Chernoff bounds, giving high concentration results of sums of independent variables.

**Lemma 2.1**

(Chernoff bound) Let \(X_{1},\dots ,X_{n}\) be independent random variables taking values in \(\{0,1\}\), let \(X=\sum _{i=1}^{n}X_{i}\) and \(\mu =\mathrm {E}[X]\). Then for any \(0<\varepsilon <1\), it holds that
$$\begin{aligned} \Pr [X\le (1-\varepsilon )\mu ]\le \exp \left( -\frac{\varepsilon ^{2}\mu }{2}\right) . \end{aligned}$$

The second lemma gives tail bounds for more general functions *f* of independent random variables. These *f* are required to have a bounded difference when only a single random variable changes its outcome.

**Lemma 2.2**

(Bounded differences inequality) Let \(X=(X_{1},\ldots ,X_{n})\) be a vector of independent random variables, where each \(X_{k}\) takes values in a set \(A_{k}\). Let \(f:A_{1}\times \ldots \times A_{n}\rightarrow \mathbb {R}\) and \(c_{1},\ldots ,c_{n}\in \mathbb {R}\) be such that \(|f(x)-f(x')|\le c_{k}\) whenever the vectors *x* and \(x'\) differ only in the *k*th coordinate. Let \(\mu \) be the expected value of the random variable *f*(*X*). Then for any \(t\ge 0\), it holds that
$$\begin{aligned} \Pr [|f(X)-\mu |\ge t]\le 2\exp \left( -\frac{2t^{2}}{\sum _{k=1}^{n}c_{k}^{2}}\right) . \end{aligned}$$

## 3 Framework

Our framework builds on the notion of *quantizable* functionals. These are functionals that admit fast approximation schemes on perturbed instances using general rounding strategies. The idea is to round an instance of *n* points to a *quantized instance* of \(\ell (n)\ll n\) points, each equipped with a multiplicity. This quantized input has a smaller problem size, which allows us to compute an approximation faster than on the original input. However, the objective function needs to be large enough such that the loss incurred by rounding is negligible.

We aim at a trade-off between running time and approximation performance. As it will turn out, varying the number \(\ell (n)\) of quantized points on an instance of *n* points makes this possible. Thus, we keep the function \(\ell \) variable in our definition. On instances of size *n*, we will write \(\ell :=\ell (n)\) for short.

**Definition 3.1**

Let \(\mathcal{F}\) be a family of probability distributions, denote the number of input points by *n*, and let \(t,R,Q:\mathbb {N}\rightarrow \mathbb {R}\). We say that a Euclidean functional \(F:([0,1]^{d})^{*}\rightarrow \mathbb {R}_{\ge 0}\) is *t-time* (*R*, *Q*)*-quantizable* with respect to \(\mathcal{F}\), if for any function \(\ell \) satisfying \(\ell =\omega (1)\) and \(\ell =o(n)\), there is a *quantization algorithm* *A* and an *approximation functional* \(g:([0,1]^{d}\times \mathbb {N})^{*}\rightarrow \mathbb {R}\) with the following properties.

- 1.The quantization algorithm *A* runs in time *O*(*n*) and maps a collection of points \(X=(X_{1},\dots ,X_{n})\in [0,1]^{dn}\) to a multiset$$\begin{aligned} A(X)=X'=((X'_{1},n_{1}),\dots ,(X'_{\ell },n_{\ell })), \end{aligned}$$the *quantized input*, with \(X'_{i}\in [0,1]^{d}\) for each \(1\le i\le \ell \).
- 2.On all inputs \(Y=A(X)\), the approximation functional *g*(*Y*) is computable in time \(t(\ell )+O(n)\) and, for any \(f\in \mathcal{F}^{n}\), fulfills$$\begin{aligned} \Pr _{X\sim f}[|F(X)-g(Y)|\le nR(\ell )]=1-o(1). \end{aligned}$$
- 3.For any \(f\in \mathcal{F}^{n}\), we have$$\begin{aligned} \Pr _{X\sim f}\left[ F(X)\ge nQ(n)\right] =1-o(1). \end{aligned}$$

The following theorem states that quantizable functionals induce natural approximation algorithms on smoothed instances. We can thus restrict our attention to finding criteria that make a functional quantizable.

**Theorem 3.2**

Let \(\mathcal{F}\) be a family of probability distributions and *F* be \(t(\ell )\)-time \((R(\ell ),Q(n))\)-quantizable with respect to \(\mathcal{F}\). Then for every \(\ell \) with \(\ell =\omega (1)\) and \(\ell =o(n)\), there is an approximation algorithm \(\mathrm {ALG}\) with the following property. For every \(f\in \mathcal{F}^{n}\), the approximation \(\mathrm {ALG}(X)\) on the instance *X* drawn from *f* is a \((1-\frac{R(\ell )}{Q(n)})\)-approximation to *F*(*X*) with high probability. The approximation can be computed in time \(O(n+t(\ell ))\).

*Proof*

The approximation algorithm \(\mathrm {ALG}\) computes *g*(*A*(*X*)) in time \(O(n+t(\ell ))\). Let *E* be the event that \(|g(A(X))-F(X)|\le R(\ell )n\), which happens with probability \(1-o(1)\), and assume that *E* occurs. Note that we allow the approximation both to over- and to underestimate the functional, which in turn can be induced by either a minimization or a maximization problem. Hence, we do the following case distinction to bound \(\rho =\min \{\frac{g(A(X))}{F(X)},\frac{F(X)}{g(A(X))}\}\). If \(g(A(X))\le F(X)\), then
$$\begin{aligned} \rho =\frac{g(A(X))}{F(X)}\ge \frac{F(X)-R(\ell )n}{F(X)}=1-\frac{R(\ell )n}{F(X)}. \end{aligned}$$
If instead \(g(A(X))>F(X)\), then
$$\begin{aligned} \rho =\frac{F(X)}{g(A(X))}\ge \frac{F(X)}{F(X)+R(\ell )n}\ge 1-\frac{R(\ell )n}{F(X)}. \end{aligned}$$
Conditioned additionally on the event \(F(X)\ge nQ(n)\), which occurs with probability \(1-o(1)\) as well, both cases yield \(\rho \ge 1-\frac{R(\ell )}{Q(n)}\). The probability that *g*(*A*(*X*)) is not a \((1-\frac{R(\ell )}{Q(n)})\)-approximation to *F*(*X*) is thus bounded by \(\Pr [\lnot E]+\Pr [F(X)<nQ(n)]=o(1)\). \(\square \)

We remark that the algorithms derived in our framework can be adapted such that also the *expected* approximation ratio converges to optimality in the sense that both \(\mathrm {E}[\rho ]\rightarrow 1\) (as a reasonable performance measure for maximization problems) and \(\mathrm {E}[\rho ^{-1}]\rightarrow 1\) (as a more appropriate guarantee for minimization problems). The first guarantee is established already by the framework algorithm, since Theorem 3.2 directly implies
$$\begin{aligned} \mathrm {E}[\rho ]\ge (1-o(1))\left( 1-\frac{R(\ell )}{Q(n)}\right) \rightarrow 1. \end{aligned}$$
For the second guarantee, assume additionally that there is a linear-time algorithm approximating *F* within a constant factor \(0<c<1\). Outputting the better solution of our framework algorithm and the *c*-approximation does not increase the order of the running time, but achieves an approximation ratio of \(1-\frac{R(\ell )}{Q(n)}=1-o(1)\) with probability \(1-o(1)\) due to the previous theorem, yielding \(\mathrm {E}[\rho ]\rightarrow 1\), and still provides a constant approximation ratio of *c* on the remaining instances sampled with probability *o*(1). Thus, \(\mathrm {E}[\rho ^{-1}]\le (1-o(1))(1-\frac{R(\ell )}{Q(n)})^{-1}+o(1)\cdot c^{-1}\rightarrow 1\) holds as well.

\(^{1}\) By a slight abuse of notation, we identify a multiset of points \(X'=((X'_{1},n_{1}),\dots ,(X'_{\ell },n_{\ell }))\in ([0,1]^{d}\times \mathbb {N})^{\ell }\) with a tuple \(X'\in ([0,1]^{d})^{*}\) in the canonical way.

## 4 Grid Quantization

Our first method for verifying quantizability is grid quantization. Here, the basic idea is to round the input to the centers of grid cells, where the coarseness of the grid is chosen according to the desired number of distinct points. This method works well for functionals that allow for fast optimal computations on their high-multiplicity version and provide a large objective value on the chosen perturbation model.

Let \(k\in \mathbb {N}\). By subdividing the *d*-dimensional unit cube \([0,1]^{d}\) into *k* equally long segments along each axis, we obtain the \(k^{d}\) cubes \(\mathcal{Q}_{(j_{1},\dots ,j_{d})}^{k}\), for \(1\le j_{1},\ldots ,j_{d}\le k\). We enumerate them in an arbitrary fashion \(\mathcal{Q}_{1}^{k},\dots ,\mathcal{Q}_{k^{d}}^{k}\).

**Theorem 4.1**

Let \(\mathcal{F}\) be a family of probability distributions, let \(t,Q:\mathbb {N}\rightarrow \mathbb {R}\), and let \(F:([0,1]^{d})^{*}\rightarrow \mathbb {R}_{\ge 0}\) be a Euclidean functional with the following properties.

- 1.On all quantized inputs \(X'=((X'_{1},n_{1}),\dots ,(X'_{\ell },n_{\ell }))\), the value \(F(X')\) can be computed in time \(t(\ell )+O(\sum _{i=1}^{\ell }n_{i})\). The algorithm may (i) assume \(\ell =k^{d}\) for some \(k\in \mathbb {N}\) and (ii) choose an arbitrary location for the distinct input points, as long as each \(X_{i}'\) is contained in its corresponding cube \(\mathcal{Q}_{i}^{k}\).
- 2.There is a constant *C* such that with high probability, the functional differs by at most \(C\delta n\) on all \(\delta \)-roundings of an instance *X* drawn from any \(f\in \mathcal{F}^{n}\). Formally, for each \(\delta >0\) we require$$\begin{aligned} \Pr _{X\sim f}\left[ \forall Y\in \mathcal{Y}_{X}^{\delta }:|F(X)-F(Y)|\le C\delta n\right] =1-o(1). \end{aligned}$$
- 3.For each \(f\in \mathcal{F}^{n}\), it holds that$$\begin{aligned} \Pr _{X\sim f}\left[ F(X)\ge nQ(n)\right] =1-o(1). \end{aligned}$$

Then *F* is \(t(\ell )\)-time \((O(\ell ^{-\frac{1}{d}}),Q(n))\)-quantizable with respect to \(\mathcal{F}\).

*Proof*

Assume that \(\ell =k^{d}\) for some \(k\in \mathbb {N}\) and consider the following algorithm \(\mathrm {GridQ}\). First, \(\mathrm {GridQ}\) determines, for each cube \(Q_{i}^{k}\), the number \(n_{i}\) of input points contained in the cube. It then outputs, for each cube \(\mathcal{Q}_{i}^{k}\), some point \(q_{i}\in \mathcal{Q}_{i}^{k}\) weighted by \(n_{i}\), where \(q_{i}\) can be chosen arbitrarily, e.g., as the centroid of the cube.

This algorithm can be executed in time *O*(*n*) in the word RAM model of computation. To see this, assume for clarity of presentation that *k* is a power of two, hence \(1/k=2^{-b}\) for some natural number *b*. For any real number \(x\in [0,1]\), the corresponding interval \([i/k,(i+1)/k]\ni x\) can be determined by reading in the first *b* bits of *x*, yielding *i*. Since \(b=\log k\le \log n\le w\), reading in one chunk of each coordinate suffices and incurs a cost of *d* word operations per point.
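The counting pass can be sketched as follows. This is illustrative code (`grid_quantize` is our naming) that uses floating-point arithmetic in place of the bit-reading argument above, but performs the same single \(O(n)\) pass.

```python
from collections import Counter

def grid_quantize(points, k):
    """GridQ sketch: count input points per grid cell in one O(n) pass.

    Subdivides [0,1]^d into k^d cubes of side 1/k and returns
    (cell_center, multiplicity) pairs for the non-empty cells.
    """
    counts = Counter()
    for p in points:
        # int(x * k) plays the role of reading the first b = log k bits of x;
        # min(..., k - 1) puts boundary points with coordinate 1.0 into the
        # last cell.
        cell = tuple(min(int(x * k), k - 1) for x in p)
        counts[cell] += 1
    # Represent each non-empty cell by its centroid, weighted by its count.
    return [(tuple((j + 0.5) / k for j in cell), m)
            for cell, m in counts.items()]

quantized = grid_quantize([(0.1, 0.1), (0.12, 0.14), (0.9, 0.2)], k=4)
```

Here the first two points share the cell with center (0.125, 0.125) and receive multiplicity 2.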

Each cube \(\mathcal{Q}_{i}^{k}\) has diameter \(\sqrt{d}/k\), so the quantized input \(Y=\mathrm {GridQ}(X)\) is a \((\sqrt{d}/k)\)-rounding of *X*, and by condition (2), \(|F(X)-F(Y)|\le \frac{C\sqrt{d}}{k}n\) holds with high probability. By condition (1), *F*(*Y*) can be computed in time \(t(\ell )+O(\sum _{i=1}^{\ell }n_{i})=t(\ell )+O(n)\). The lower bound condition (3) of Definition 3.1 is also satisfied by assumption.

For general \(\ell ,\) we choose *k* as the largest power of 2 with \(k^{d}\le \ell \) in the construction above. The claim follows from the observation that \(\frac{C\sqrt{d}}{k}=O(\ell ^{-1/d})\). \(\square \)

In the remainder of this section, we apply the framework to two Euclidean maximization problems, namely maximum matching and maximum TSP. Both problems have already been analyzed in the average-case world, see, e.g., an analysis of the Metropolis algorithm on maximum matching in [36]. We generalize the result of Dyer et al. [18], who proved the asymptotic optimality of two simple partitioning heuristics for maximum matching and maximum TSP on the uniform distribution in the unit square. However, in contrast to our approach, their partitioning methods typically fail if the points are not identically distributed.

### 4.1 Maximum Matching

Let \(\mathrm {MaxM}(X)\) denote the maximum weight of a matching of the points \(X\subseteq [0,1]^{d}\), where the weight of a matching *M* is defined as the total length of matched edges \(\sum _{\{u,v\}\in M}\left\| u-v\right\| \). For the more general problem of finding maximum weighted matchings on *general* graphs with non-integer weights, the fastest known algorithm due to Gabow [23] runs in time \(O(mn+n^{2}\log n)\).

We aim to apply Theorem 4.1, for which we only need to check three conditions. The rounding condition (2) is easily seen to be satisfied by a straightforward application of the triangle inequality.

**Property 4.2**

For all \(X\in ([0,1]^{d})^{n}\), \(\delta >0\) and \(Y\in \mathcal{Y}_{X}^{\delta }\), it holds that \(|\mathrm {MaxM}(X)-\mathrm {MaxM}(Y)|\le \delta n\).

*Proof*

Let *M* be an optimal matching on *X* that we represent as a set of pairs of indices \(\{i,j\}\) rather than pairs of vertices \(\{X_{i},X_{j}\}\). By the triangle inequality, we have
$$\begin{aligned} \mathrm {MaxM}(Y)\ge \sum _{\{i,j\}\in M}\left\| Y_{i}-Y_{j}\right\| \ge \sum _{\{i,j\}\in M}\left( \left\| X_{i}-X_{j}\right\| -2\delta \right) \ge \mathrm {MaxM}(X)-\delta n, \end{aligned}$$
since \(|M|\le n/2\). The symmetric argument applied to an optimal matching on *Y* yields \(\mathrm {MaxM}(X)\ge \mathrm {MaxM}(Y)-\delta n\). \(\square \)

The lower bound condition (3) is provided by the following lemma.

**Lemma 4.3**

For every \(f\in \mathcal{F}_{\phi }^{n}\), it holds that \(\Pr _{X\sim f}[\mathrm {MaxM}(X)=\varOmega (n/\root d \of {\phi })]\ge 1-\exp (-n/8)\).

*Proof*

Let *M* be an arbitrary perfect matching of the indices \(\{1,\dots ,n\}\) (discarding one index if *n* is odd). Consider any edge \(\{i,j\}\in M\). Let \(z\in [0,1]^{d}\) be arbitrary, then \(\Pr _{X_{j}}[\left\| X_{i}-X_{j}\right\| \le t\mid X_{i}=z]=\Pr _{X_{j}}[X_{j}\in B_{t}(z)]\le \phi \mathrm {vol}(B_{t}(0))=O(\phi t^{d}).\) We conclude that there is a value \(t=\varOmega (1/\root d \of {\phi })\) such that each edge of *M* independently has length at least *t* with probability at least 1/2. By a Chernoff bound (Lemma 2.1), with probability at least \(1-\exp (-n/8)\), at least *n*/8 of the edges of *M* have length at least *t*. Thus, we have \(\Pr [\mathrm {MaxM}(X)<\frac{nt}{8}]\le \exp (-n/8)\). \(\square \)

We call the task of computing a functional on quantized inputs the *quantized version* of the functional. In the case of \(\mathrm {MaxM}\), an algorithm for b-matchings by Anstee [1] can be exploited, satisfying condition (1).

**Lemma 4.4**

The quantized version of \(\mathrm {MaxM}\) can be computed in time \(O(\ell ^{4}+\ell ^{3}\log n)\), where \(n=\sum _{i=1}^{\ell }n_{i}\).

*Proof*

The quantized version of \(\mathrm {MaxM}\) can be considered as a *b-matching problem.* The input in this problem is a graph \(G=(V,E)\) with costs *d*(*e*) at edges \(e\in E\) and integer weights \(b_{v}\) at vertices \(v\in V\). The aim is to find an assignment of non-negative integer weights \(x_{e}\) to the edges such that \(\sum _{e=\{v,v'\}\in E}x_{e}\le b_{v}\) for each \(v\in V\) and the weighted cost \(\sum _{e\in E}x_{e}d(e)\) is maximized.

For a given instance \(X'=((X_{1},n_{1}),\dots ,(X_{\ell },n_{\ell }))\), we define a complete graph on the vertices \(\{X_{1},\dots ,X_{\ell }\}\) where each edge has cost \(d(X_{i},X_{j})=\left\| X_{i}-X_{j}\right\| \) and each vertex \(X_{i}\) has weight \(b_{X_{i}}=n_{i}\). By an algorithm from [1], this instance can be solved in time \(O(\ell ^{3}\log \sum _{i=1}^{\ell }n_{i}+\ell ^{4})\). \(\square \)
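To see why the *b*-matching view captures the quantized problem, the following sketch expands a tiny quantized instance back into point copies and computes the maximum matching weight by brute force. This is illustrative code only, not Anstee's algorithm; the function name is ours.

```python
from math import dist

def max_matching_weight(points):
    """Exhaustive maximum-weight matching; fine for the tiny demo below.

    On a complete graph with positive Euclidean weights an optimal matching
    is perfect (for an even number of points), so we recurse over the
    possible partners of the first point.
    """
    if len(points) <= 1:
        return 0.0
    first, rest = points[0], points[1:]
    best = 0.0
    for i, partner in enumerate(rest):
        remaining = rest[:i] + rest[i + 1:]
        best = max(best, dist(first, partner) + max_matching_weight(remaining))
    return best

# Quantized instance ((X'_1, n_1), ..., (X'_l, n_l)): expanding each point
# X'_i into n_i copies turns the b-matching formulation of quantized MaxM
# back into an ordinary matching problem on n points.
quantized = [((0.1, 0.1), 2), ((0.9, 0.9), 2)]
expanded = [p for p, mult in quantized for _ in range(mult)]
value = max_matching_weight(expanded)
```

The point of Anstee's algorithm is precisely to avoid this expansion, keeping the running time polynomial in \(\ell \) rather than in \(n\).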

These observations immediately yield the following result.

**Theorem 4.5**

\(\mathrm {MaxM}\) is \(O(\ell ^{4})\)-time \((O(1/\root d \of {\ell }),\varOmega (1/\root d \of {\phi }))\)-quantizable with respect to \(\mathcal{F}_{\phi }\). Hence, for \(1\le p<4\), there is an \(O(n^{p})\)-time probable \((1-O(\root d \of {\phi /n^{p/4}}))\)-approximation to \(\mathrm {MaxM}\) for instances drawn according to some \(f\in \mathcal{F}_{\phi }^{n}\). This is asymptotically optimal on smoothed instances with \(\phi =o(n^{p/4})\).

*Proof*

To verify the quantizability, Property 4.2, Lemmas 4.3 and 4.4 can be used to apply Theorem 4.1. For this, note that \(\ell ^{4}+\ell ^{3}\log n=O(\ell ^{4}+n)\). Using Theorem 3.2 with \(\ell :=\lceil n^{p/4}\rceil \), we obtain the remaining part of the statement. \(\square \)

Interestingly, the restriction on \(\phi \) is independent of the dimension. Note that only \(p<3\) is reasonable, since deterministic cubic-time algorithms for exactly computing \(\mathrm {MaxM}\) exist. Furthermore, as described in Sect. 3, an algorithm achieving an asymptotically optimal approximation ratio also in expectation can be designed as well. For this, we may utilize a simple greedy linear-time \(\frac{1}{2}\)-approximation for \(\mathrm {MaxM}\) [5].
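For concreteness, here is a sketch of such a greedy matching: the classical heaviest-edge-first rule, shown with a full edge sort rather than the linear-time variant referenced above. The function name is ours.

```python
from itertools import combinations
from math import dist

def greedy_matching(points):
    """Greedy 1/2-approximation for MaxM: repeatedly take the heaviest
    edge whose endpoints are both still unmatched."""
    edges = sorted(combinations(range(len(points)), 2),
                   key=lambda e: dist(points[e[0]], points[e[1]]),
                   reverse=True)
    matched, weight = set(), 0.0
    for i, j in edges:
        if i not in matched and j not in matched:
            matched |= {i, j}
            weight += dist(points[i], points[j])
    return weight

# Unit square corners: greedy picks both diagonals, total weight 2 * sqrt(2).
w = greedy_matching([(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)])
```

Every edge of an optimal matching shares an endpoint with some chosen edge of at least its weight, which is the standard argument for the factor 1/2.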

### 4.2 Maximum Traveling Salesman Problem

The approach for maximum matching of the previous subsection can be adapted to the maximum traveling salesman problem. For \(d\ge 2\), define \(\mathrm {MaxTSP}(X)\) as the maximum weight of a Hamiltonian cycle on \(X\subseteq [0,1]^{d}\), where the weight of a Hamiltonian cycle *C* is defined as \(\sum _{\{u,v\}\in C}\left\| u-v\right\| \). The problem is NP-hard (proven for \(d\ge 3\) in [8], conjectured for \(d=2\)) but admits a PTAS, cf. [8, 9]. According to Fekete et al. [20], these algorithms are not practical. They stress the need for (nearly) linear-time algorithms.

By using our framework algorithm of Theorem 4.5 and patching the constructed matching to a tour, we obtain the following result.

**Theorem 4.6**

Let \(1\le p\le 4d/(d+1)\) and \(f\in \mathcal{F}_{\phi }^{n}\). On instances drawn from *f*, there is an \(O(n^{p})\)-time computable probable \((1-O(\root d \of {\phi /n^{p/4}}))\)-approximation for \(\mathrm {MaxTSP}\). This is asymptotically optimal for \(\phi =o(n^{p/4})\).

*Proof*

We patch the matching constructed by the algorithm of Theorem 4.5 into a Hamiltonian cycle by successively connecting its edges into partial tours and joining these into a single tour. Every matching edge contributes its full length to the constructed tour, *except* in the case that it is the last edge of a partial tour. We compensate for this loss with the pessimistic estimate of \(\sqrt{d}\) per vertex. Let \(\mathrm {ALG}_{\mathrm {MaxTSP}}(X)\) denote the length of the thus constructed tour and let \(\mathrm {ALG}_{\mathrm {MaxM}}(X)\) denote the length of the matching constructed by the algorithm of Theorem 4.5; then we have
$$\begin{aligned} \mathrm {ALG}_{\mathrm {MaxTSP}}(X)\ge \mathrm {ALG}_{\mathrm {MaxM}}(X)-\sqrt{d}\cdot \ell . \end{aligned}$$
Since \(\ell =\lceil n^{p/4}\rceil \) and \(p\le 4d/(d+1)\), this additive loss is \(o(n/\root d \of {\phi })\) whenever \(\phi =o(n^{p/4})\) and hence negligible compared to the lower bound of Lemma 4.3, so the approximation guarantee of Theorem 4.5 carries over. \(\square \)

Since \(\mathrm {MaxM}\) is a \(\frac{1}{2}\)-approximation to \(\mathrm {MaxTSP}\), the greedy linear-time computable \(\frac{1}{2}\)-approximation to \(\mathrm {MaxM}\) is a \(\frac{1}{4}\)-approximation to \(\mathrm {MaxTSP}\) and thus provides an adapted algorithm with asymptotically optimal *expected* approximation ratio for \(\phi =o(n^{p/4})\).

## 5 Balanced Quantization

Grid quantization proves useful for problems in which algorithms solving the high-multiplicity version are available. For maximum matching, we exploited a specifically designed algorithm, but for other problems, such algorithms might be missing. As an alternative route in these cases, this section establishes a more careful quantization step yielding *balanced* instances, i.e., instances in which each of the distinct points occurs equally often. These instances are often easier to handle, which holds, e.g., for k-means clustering and similar problems. In general, this alternative method can be applied to problems for which the objective scales controllably when all input points are duplicated.

The balanced quantization algorithm works by partitioning a subset of the input points into *packets*, which we define as collections of points such that all of these collections have the same cardinality.

**Lemma 5.1**

Let \(\ell \) satisfy \(\ell =\omega (1)\) and \(\ell =o(n)\), and set \(w:=\lfloor \frac{n}{\ell (n)}\rfloor \). Given \(X=(X_{1},\dots ,X_{n})\in [0,1]^{dn}\), we can compute in time *O*(*n*) disjoint packets \(C_{1},\dots ,C_{\ell '(n)}\) of input points such that

- 1.\(\frac{\ell '(n)}{\ell (n)}\rightarrow 1\) (we obtain \(\ell \) packets asymptotically),
- 2.\(|C_{i}|=w\) for \(1\le i\le \ell '(n)\) (each packet contains exactly *w* points),
- 3.\(n-\sum _{i=1}^{\ell '(n)}|C_{i}|=O(\frac{n}{\ell ^{1/(d+1)}})\) (almost all points are covered),
- 4.\(\mathrm {diam}(C_{i})=O(\frac{1}{\ell ^{1/(d+1)}})\) (each element in a packet represents it well).

*Proof*

We quantize *X* using a grid of \(t^{d}\) cubes \(B_{1},\dots ,B_{t^{d}}\), for *t* to be chosen later. Each cube \(B_{i}\) has side length \({1}/{t}\), volume \({1}/{t^{d}}\) and contains some number \(n_{i}\) of input points. For some *w* to be determined later, we create \(\lfloor \frac{n_{i}}{w}\rfloor \) packets for each cube \(B_{i}\): we successively assign to each such packet *w* yet uncovered input points inside \(B_{i}\). This yields a number \(\ell '\) of packets, each of which contains exactly *w* points which were originally situated in a cube of volume \({1}/{t^{d}}\).

We need to choose *t* and *w* such that \(\frac{\ell '(n)}{\ell (n)}\rightarrow 1\) for \(n\rightarrow \infty \). Since each of the produced packets contains *w* points, we have \(\ell '(n)\le \frac{n}{w}\). Furthermore, since \(\ell '(n)=\sum _{i=1}^{t^{d}}\left\lfloor \frac{n_{i}}{w}\right\rfloor \ge \sum _{i=1}^{t^{d}}\left( \frac{n_{i}}{w}-1\right) =\frac{n}{w}-t^{d},\) setting \(w:=\lfloor \frac{n}{\ell (n)}\rfloor \) establishes \(\ell (n)-t^{d}\le \ell '(n)\le \ell (n)\frac{n}{n-\ell (n)}\). Thus, the first criterion is fulfilled if \(t=o(\root d \of {\ell })\). Since the rounding step leaves at most *w* points per cube uncovered, at most \(t^{d}w\le t^{d}\frac{n}{\ell (n)}\) points are lost in total by rounding. Setting \(t:=\ell ^{1/(d+1)}\) fulfills \(t=o(\root d \of {\ell })\) and creates \(t^{d}w=O(\frac{n}{\ell ^{1/(d+1)}})\) uncovered points. Furthermore, by this choice, the diameter of each cube is bounded by \(\frac{\sqrt{d}}{t}=O(\ell ^{-1/(d+1)})\). \(\square \)

The previous lemma yields the balanced quantization algorithm \(\mathrm {BalQ}\) that, on input *X*, returns \(\mathrm {BalQ}(X)=((\mathrm {cm}(C_{1}),w),\dots ,(\mathrm {cm}(C_{\ell '}),w))\), i.e., each point is rounded to the center of mass of its corresponding packet obtained by Lemma 5.1. This allows us to formalize the balanced quantization method as follows.
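A minimal sketch of \(\mathrm {BalQ}\) following the construction of Lemma 5.1 (`balanced_quantize` is our naming; the dictionary pass stands in for the \(O(n)\) grid counting):

```python
from math import floor

def balanced_quantize(points, ell, d):
    """BalQ sketch: grid of t^d cells with t = floor(ell^(1/(d+1))),
    packets of exactly w = floor(n/ell) points per cell, each packet
    replaced by its center of mass with multiplicity w."""
    n = len(points)
    w = n // ell                                   # packet size
    t = max(1, floor(ell ** (1.0 / (d + 1))))      # grid resolution
    cells = {}
    for p in points:
        cell = tuple(min(int(x * t), t - 1) for x in p)
        cells.setdefault(cell, []).append(p)
    packets = []
    for members in cells.values():
        for s in range(len(members) // w):         # floor(n_i / w) packets
            chunk = members[s * w:(s + 1) * w]
            center = tuple(sum(c) / w for c in zip(*chunk))  # cm(C_s)
            packets.append((center, w))
    return packets

packets = balanced_quantize(
    [(0.1, 0.1), (0.3, 0.1), (0.5, 0.5), (0.7, 0.5),
     (0.2, 0.8), (0.4, 0.8), (0.6, 0.9), (0.8, 0.9)], ell=4, d=2)
```

With \(n=8\) and \(\ell =4\) this yields four packets of multiplicity \(w=2\); leftover points inside a cell (fewer than *w* of them) are simply discarded, as condition (3) of the lemma permits.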

**Theorem 5.2**

Let \(\mathcal{F}\) be a family of probability distributions, let \(t,Q:\mathbb {N}\rightarrow \mathbb {R}\), and let \(F:([0,1]^{d})^{*}\rightarrow \mathbb {R}_{\ge 0}\) be a Euclidean functional with the following properties.

- 1.On all quantized inputs \(X'=((X'_{1},w),\dots ,(X'_{\ell },w))\), the value \(F(X')\) can be computed in time \(t(\ell )+O(w\ell )\).
- 2.There is a constant *C* such that with high probability, the functional on an instance *X* drawn from any \(f\in \mathcal{F}^{n}\) differs by at most \(\frac{Cn}{\ell ^{1/(d+1)}}\) from the functional on \(\mathrm {BalQ}(X)\). Formally, we require$$\begin{aligned} \Pr _{X\sim f}\left[ |F(X)-F(\mathrm {BalQ}(X))|\le \frac{Cn}{\ell ^{1/(d+1)}}\right] =1-o(1). \end{aligned}$$
- 3.For each \(f\in \mathcal{F}^{n}\), it holds that$$\begin{aligned} \Pr _{X\sim f}\left[ F(X)\ge nQ(n)\right] =1-o(1). \end{aligned}$$

Then *F* is \(t(\ell )\)-time \((O(\ell ^{-\frac{1}{d+1}}),Q(n))\)-quantizable with respect to \(\mathcal{F}\).

*Proof*

In Definition 3.1, use \(\mathrm {BalQ}\) as quantization algorithm *A*. The other conditions follow directly from the assumptions. \(\square \)

For some problems, an instance in which every distinct point occurs equally often can be reduced to its distinct points only. In the following, we exploit this property by applying the previous theorem to k-means clustering in Sect. 5.1. The method also allows for improving the results on maximum matching and maximum TSP in Sect. 5.2.

### 5.1 K-Means Clustering

In the k-means clustering problem, the goal is to partition an input instance *X* into clusters \(C_{1},\dots ,C_{k}\) minimizing the sum of squared distances of all points to the centers of mass of their respective clusters, where *k* is the desired number of clusters, i.e., \(\mathrm {KMeans}(X,k):=\min _{(C_{1},\dots ,C_{k})}\sum _{i=1}^{k}\sum _{x\in C_{i}}\left\| x-\mathrm {cm}(C_{i})\right\| ^{2}\). If *k* or *d* is part of the input, the problem is NP-hard [16, 29]. However, a popular heuristic, the k-means algorithm, usually runs fast on real-world instances despite its worst-case exponential running time. This is substantiated by results proving a polynomial smoothed running time of the k-means algorithm under Gaussian perturbations [2, 4]. In terms of solution quality, however, such a heuristic can perform poorly.

Consequently, k-means clustering has also received considerable attention concerning the design of fast deterministic approximation schemes. There exist linear-time asymptotically optimal algorithms, e.g., PTASs with running time \(O(nkd+d\cdot \mathrm {poly}(k/\varepsilon )+2^{\tilde{O}(k/\varepsilon )})\) in [21] and \(O(ndk+2^{(k/\varepsilon )^{O(1)}}d^{2}n^{\sigma })\) for any \(\sigma >0\) in [14]. Treating the dimension as a constant as we do here, Har-Peled and Mazumdar [24] showed how to compute a \((1+\varepsilon )\)-approximation in time \(O(n+k^{k+2}\varepsilon ^{-(2d+1)k}\log ^{k+1}n\log ^{k}\frac{1}{\varepsilon })\).

Apart from smoothed analysis, the perturbation concept of *perturbation stability* has been applied to k-means clusterings by Awasthi et al. [6]. They restrict their attention to input instances which, when perturbed, maintain the same partitioning of the input points as an optimal clustering. Their perturbation model uses a bounded multiplicative increase of the distance of every pair of points. On instances that are stable under sufficiently large perturbations, they show how to compute the optimal k-means clustering in polynomial time.

In the following, we will frequently consider the k-means clustering objective with respect to other centroids \(\mu _{i}\) than the center of mass of the corresponding cluster. However, such a choice cannot decrease the objective. This follows from the following fact, which is proven, e.g., in [26].

**Property 5.3**

Let *C* be a multiset of points in \(\mathbb {R}^{d}\) and let \(x\in \mathbb {R}^{d}\). Then, \(\sum _{c\in C}\left\| c-x\right\| ^{2}=\sum _{c\in C}\left\| c-\mathrm {cm}(C)\right\| ^{2}+|C|\cdot \left\| x-\mathrm {cm}(C)\right\| ^{2}.\)
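Property 5.3 is the standard bias-variance identity for squared distances and can be checked numerically; the snippet below is a sanity check only, with illustrative helper names.

```python
def sq_dist(a, b):
    # squared Euclidean distance between two points given as tuples
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))

def cm(points):
    # center of mass of a multiset of points
    n = len(points)
    return tuple(sum(c) / n for c in zip(*points))

C = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 2.0)]
x = (0.3, 0.7)
lhs = sum(sq_dist(c, x) for c in C)
mu = cm(C)
rhs = sum(sq_dist(c, mu) for c in C) + len(C) * sq_dist(x, mu)
assert abs(lhs - rhs) < 1e-12
```

In particular, the cross terms \(\sum _{c\in C}\langle c-\mathrm {cm}(C),\mathrm {cm}(C)-x\rangle \) vanish by the definition of the center of mass, which is what the identity encodes.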

Consider \(\mathrm {BalQ}(X)=((\mathrm {cm}(C_{1}),w),\dots ,(\mathrm {cm}(C_{\ell '}),w))\), a quantized instance obtained by applying Lemma 5.1. Let \(Y=\mathrm {BalQ}(X)=(Y_{1},\dots ,Y_{n'})\), where we order the \(Y_{i}\)’s such that \(Y_{i}\) is the rounded version of \(X_{i}\). Note that the number \(n'=w\ell '\) of points in the rounded instance is potentially slightly smaller than *n*, since points may be lost in the balanced quantization step.

**Lemma 5.4**

For all instances *X* and balanced quantizations \(Y=\mathrm {BalQ}(X)\), we have \(|\mathrm {KMeans}(X,k)-\mathrm {KMeans}(Y,k)|\le \varDelta n\ell ^{-1/(d+1)}\) for some constant \(\varDelta >0\) depending only on *d*.

*Proof*

We identify clusterings of *X* and *Y* with partitions of the indices \([n]:=\{1,\dots ,n\}\) and \([n']=\{1,\dots ,n'\}\). Let \((D_{1},\dots ,D_{k})\) be a partition of the indices \(\{1,\dots ,n\}\). From this, create a partition \((D_{1}',\dots ,D_{k}')\) of \(\{1,\dots ,n'\}\) by assigning the rounded versions of each point covered by \(C_{1},\dots ,C_{\ell '}\) to the cluster of the original point, i.e., \((D_{1}',\dots ,D_{k}')=(D_{1}\cap [n'],\dots ,D_{k}\cap [n'])\). Consider the k-means objective of \((D_{1}',\dots ,D_{k}')\) on *Y* using the centroids of the unrounded clusters \(\mu _{i}=\frac{1}{|D_{i}|}\sum _{j\in D_{i}}X_{j}\). By expanding \(\left\| Y_{j}-\mu _{i}\right\| ^{2}=\left\| (Y_{j}-X_{j})+(X_{j}-\mu _{i})\right\| ^{2}\), we can bound this objective by the objective of \((D_{1},\dots ,D_{k})\) on *X* plus an error term of order \(n\ell ^{-1/(d+1)}\), since by Lemma 5.1, each covered point is moved by at most the cube diameter \(\alpha \ell ^{-1/(d+1)}\) for some \(\alpha >0\). As choosing the centers of mass as centroids cannot increase the cost by Property 5.3, this bounds \(\mathrm {KMeans}(Y,k)\) in terms of \(\mathrm {KMeans}(X,k)\).

Conversely, given a partition \((D_{1}',\dots ,D_{k}')\) of \([n']\), we create a partition \((D_{1},\dots ,D_{k})\) of [*n*] by assigning all uncovered points to the first cluster; for the rest we mimic the decision of its rounded version. Consider the k-means objective of \((D_{1},\dots ,D_{k})\) with respect to the centroids \(\mu _{i}'=\frac{1}{|D_{i}'|}\sum _{j\in D_{i}'}Y_{j}\). The same expansion argument applies, where additionally each of the at most \(\beta n\ell ^{-1/(d+1)}\) uncovered points (for some \(\beta >0\), by Lemma 5.1) contributes at most *d*, the squared diameter of the unit cube, to the objective value. This is again an error term of order \(n\ell ^{-1/(d+1)}\) in total over all *n*. With \(\varDelta :=\beta d+\alpha ^{2}+2\alpha \sqrt{d}\), the objective value with respect to the centroids \(\mu _{i}'\) is bounded by \(\mathrm {KMeans}(Y,k)+\varDelta n\ell ^{-1/(d+1)}\). Again, choosing the correct centroids cannot increase the cost, thus \(\mathrm {KMeans}(X,k)\le \mathrm {KMeans}(Y,k)+\varDelta n\ell ^{-1/(d+1)}\). \(\square \)

Having established that rounding the input does not affect the objective value too much, the following lemma enables us to reduce the instance size significantly.

**Lemma 5.5**

Let \(X=((X_{1},w),\dots ,(X_{\ell },w))\) and \(Z=((X_{1},1),\dots ,(X_{\ell },1))\). It holds that \(\mathrm {KMeans}(X,k)=w\mathrm {KMeans}(Z,k)\).

*Proof*

Let \((C_{1},\dots ,C_{k})\) be an optimal clustering of *Z*. Then \(\mu _{i}=\frac{1}{|C_{i}|}\sum _{j\in C_{i}}X_{j}=\frac{1}{w|C_{i}|}\sum _{j\in C_{i}}wX_{j}\), i.e., placing all *w* copies of each point into the cluster of their original yields a clustering of *X* with the same centroids in which every cluster contributes exactly *w* times the cost of its counterpart in *Z*. The converse direction follows analogously, and we conclude \(\mathrm {KMeans}(X,k)=w\mathrm {KMeans}(Z,k)\). \(\square \)
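On tiny instances, Lemma 5.5 can be verified by exhaustive search; `kmeans_cost` below enumerates all cluster assignments and is an exponential-time illustration only, not an algorithm of the framework.

```python
from itertools import product

def kmeans_cost(points, k):
    """Brute-force optimal k-means value: try every assignment of the
    points to k clusters and evaluate the cost with respect to the
    clusters' centers of mass. Exponential time; for sanity checks only."""
    best = float("inf")
    for assign in product(range(k), repeat=len(points)):
        cost = 0.0
        for j in range(k):
            cl = [p for p, a in zip(points, assign) if a == j]
            if not cl:
                continue  # empty clusters are harmless for the optimum
            mu = [sum(c) / len(cl) for c in zip(*cl)]
            cost += sum(sum((pi - mi) ** 2 for pi, mi in zip(p, mu)) for p in cl)
        best = min(best, cost)
    return best

Z = [(0.1,), (0.2,), (0.9,)]          # distinct points
w = 2
X = [p for p in Z for _ in range(w)]  # each point with multiplicity w
assert abs(kmeans_cost(X, 2) - w * kmeans_cost(Z, 2)) < 1e-12
```

The final assertion is exactly the statement of Lemma 5.5 for this instance.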

It is left to give a lower bound on the objective value. For this argument, we introduce the following notion.

**Definition 5.6**

For a subset \(S\subseteq [n]\) of *k* elements, we define an *S**-clustering* by declaring the \(X_{i}\) with \(i\in S\) to be the centroids and assigning each point to its nearest centroid. With \(a:[n]\rightarrow S\) being the assignment of \(X_{i}\) to its nearest centroid, we set \(\mathrm {KSetMeans}(X;k):=\min _{S\subseteq [n],|S|=k}\mathrm {KSetMeans}_{S}(X;k),\) where \(\mathrm {KSetMeans}_{S}(X;k):=\sum _{i\in [n]}\left\| X_{i}-X_{a(i)}\right\| ^{2}.\)

The following lemma allows us to restrict our attention to centroids at fixed locations, since choosing centroids only among the input points at most doubles the objective. A stronger version of this is used in the analysis of the k-means++ algorithm [3].

**Lemma 5.7**

It holds that \(\mathrm {KSetMeans}(X;k)\le 2\mathrm {KMeans}(X;k)\).

*Proof*

Let *C* be an arbitrary cluster of an optimal k-means clustering and choose \(c_{0}:={\mathrm {argmin}}_{c\in C}\left\| c-\mathrm {cm}(C)\right\| ^{2}\) as centroid for *C*. By Property 5.3, the contribution of *C* to the objective value equals \(\sum _{c\in C}\left\| c-c_{0}\right\| ^{2}=\sum _{c\in C}\left\| c-\mathrm {cm}(C)\right\| ^{2}+|C|\cdot \left\| c_{0}-\mathrm {cm}(C)\right\| ^{2}\le 2\sum _{c\in C}\left\| c-\mathrm {cm}(C)\right\| ^{2},\) where the inequality holds since \(c_{0}\) minimizes the distance to \(\mathrm {cm}(C)\) among the points of *C* and hence \(|C|\cdot \left\| c_{0}-\mathrm {cm}(C)\right\| ^{2}\le \sum _{c\in C}\left\| c-\mathrm {cm}(C)\right\| ^{2}\). Summing over all clusters, and noting that reassigning each point to its nearest chosen centroid can only decrease the cost, yields the claim. \(\square \)
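On small instances, the factor-two guarantee of Lemma 5.7 can likewise be checked by exhaustive search. Both routines below are exponential-time illustrations with names of our choosing.

```python
from itertools import combinations, product

def kmeans_cost(points, k):
    # brute-force optimal k-means value (centers of mass as centroids)
    best = float("inf")
    for assign in product(range(k), repeat=len(points)):
        cost = 0.0
        for j in range(k):
            cl = [p for p, a in zip(points, assign) if a == j]
            if cl:
                mu = [sum(c) / len(cl) for c in zip(*cl)]
                cost += sum(sum((pi - mi) ** 2 for pi, mi in zip(p, mu)) for p in cl)
        best = min(best, cost)
    return best

def ksetmeans_cost(points, k):
    # centroids restricted to input points; every point joins its nearest centroid
    def d2(a, b):
        return sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    return min(
        sum(min(d2(p, points[s]) for s in S) for p in points)
        for S in combinations(range(len(points)), k)
    )

pts = [(0.1, 0.2), (0.15, 0.1), (0.8, 0.9), (0.7, 0.75), (0.5, 0.4)]
assert ksetmeans_cost(pts, 2) <= 2 * kmeans_cost(pts, 2) + 1e-12  # Lemma 5.7
assert kmeans_cost(pts, 2) <= ksetmeans_cost(pts, 2) + 1e-12      # Property 5.3
```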

For proving a lower bound, it thus suffices to consider all possible choices of the centroids among the input points. We let the adversary choose their locations and compute the contribution of each of the remaining points to the objective value.

**Lemma 5.8**

Let \(k=o(n/\log n)\). There is a constant \(c>0\) such that for each \(f\in \mathcal{F}_{\phi }^{n}\), \(\Pr _{X\sim f}\left[ \mathrm {KMeans}(X,k)\ge cn(k\phi )^{-2/d}\right] =1-o(1).\)

*Proof*

By Lemma 5.7, it suffices to bound from below an arbitrary *S*-clustering and its objective value, losing at most a factor of two. Without loss of generality, assume that \(S=\{1,\dots ,k\}\). We first fix the *k* centroids \(\mu _{1},\dots ,\mu _{k}\) by observing the locations of \(X_{1},\dots ,X_{k}\). We draw the remaining \(n-k\) points according to their distribution and look at the expected increase in the objective per point.

Let \(V_{d}(r)\) denote the volume of the *d*-dimensional ball of radius *r*. Consider point \(X_{i}\). If it contributes less than *t* to the objective value, it is contained in a ball of radius less than \(\sqrt{t}\) around one of the centroids. Since the density of \(X_{i}\) is bounded by \(\phi \), this happens with probability at most \(k\phi V_{d}(\sqrt{t})\), which is at most \(\frac{1}{2}\) for a suitable choice of \(t=\varOmega ((k\phi )^{-2/d})\). Let \(I_{i}\) be the indicator variable of the event that \(X_{i}\) contributes at least *t*. From \(E[I_{i}]\ge \frac{1}{2}\) and an application of the Chernoff bound of Lemma 2.1, we conclude that with probability \(1-2^{-\varOmega (n)}\), a constant fraction of the points contributes at least \(t=\varOmega ((k\phi )^{-2/d})\) each. Since \(\log \left( {\begin{array}{c}n\\ k\end{array}}\right) =O(k\log n)=o(n)\) for \(k=o(n/\log n)\), a union bound over all choices of *S* completes the proof. \(\square \)

Note that for other Euclidean minimization functionals like minimum Euclidean matching or TSP, already the uniform distribution on the unit cube achieves an objective value of only \(O(n^{(d-1)/d})\) [35]. Hence a lower bound as given in Lemma 5.8 for these problems would not be possible, making the framework inapplicable in this case. For a more detailed discussion, we refer to Sect. 7.

To solve the smaller instance obtained by quantization, two approaches are reasonable. The first is to compute an optimal solution in time \(O(n^{kd+1})\) using [25]; this results in the following theorem.

**Theorem 5.9**

For \(k=o(n/\log n)\), the functional \(\mathrm {KMeans}(X,k)\) is \(O(\ell ^{kd+1})\)-time \((O(\ell ^{-1/(d+1)}),\varOmega ((k\phi )^{-2/d}))\)-quantizable with respect to \(\mathcal{F}_{\phi }\). Consequently, for \(k=O(\log n/\log \log n)\) and \(1\le p\le kd+1\), there is a \(O(n^{p})\)-time computable probable \(\left( 1-O\left( \frac{(k\phi )^{2/d}}{n^{\frac{p}{(d+1)(kd+1)}}}\right) \right) \)-approximation for \(\mathrm {KMeans}(X,k)\) on smoothed instances.

Note that this is asymptotically optimal if \(\phi =o(\root c \of {n})\) with \(c=2(1+1/d)(kd+1)/p\) if \(k=O(1)\), or more generally, if \(k\phi =o(n^{\frac{pd}{2(d+1)(kd+1)}})\). Using existing linear-time approximation schemes, also an asymptotically optimal expected approximation ratio can be obtained for the same values of \(\phi \). Our framework algorithm applies even for relatively large values of *k*, e.g., \(k=\log n/\log \log n\), in which case known deterministic approximation schemes require superlinear time. However, for small *k*, incorporating such an approximation scheme into our algorithm yields a further improvement of the previous theorem.

**Corollary 5.10**

Let \(k=O(\log n/\log \log n)\), let \(1\le p\le 2k\) and \(\delta >0\). There is a \(O(n^{p})\)-time computable \(1-O\left( \frac{k^{\frac{2}{d}-\frac{3}{2(d+1)}}\phi {}^{2/d}}{n^{\frac{p}{2k(d+1)}-\delta }}\right) \)-approximation for \(\mathrm {KMeans}(X;k)\) with respect to the perturbations \(\mathcal{F}_{\phi }\). This is asymptotically optimal if \(\phi =o(\root c \of {n})\) with any \(c>4k(1+\frac{1}{d})/p\) if \(k=O(1)\), or more generally, if \(k^{1-\frac{3d}{4(d+1)}}\phi =o(n^{\frac{pd}{4k(d+1)}-\delta })\) .

### 5.2 Maximum Matching and Maximum TSP Revisited

The balanced quantization technique allows for an improvement for the maximum matching problem for \(d\ge 3\). As for k-means clustering, the key observation for applying balanced quantization is that identical points can be treated as one, as captured in the following lemma.

**Lemma 5.11**

Let \(X'=((X_{1},w),\dots ,(X_{\ell },w))\), then \(w\mathrm {MaxM}(X_{1},\dots ,X_{\ell })=\mathrm {MaxM}(X').\)

*Proof*

Consider a matching of \(X_{1},\dots ,X_{\ell }\). By mimicking the choice of \(X_{i}\) for all its copies in \(X'\), we obtain a solution of value \(w\mathrm {MaxM}(X_{1},\dots ,X_{\ell })\) immediately yielding \(w\mathrm {MaxM}(X_{1},\dots ,X_{\ell })\le \mathrm {MaxM}(X')\).

For the other direction, consider the optimal matchings of \(X'\). There is an optimal matching that does not connect two copies of the same point. To see this, consider any optimal matching that does connect two copies of the same point; then there is some edge \((c_{1},c_{2})\) connecting two copies of some point \(X_{i}\) and some other edge (*u*, *v*) in the matching where neither *u* nor *v* is a copy of \(X_{i}\). This is due to the fact that there are at most \(w-2\) other matching edges that include a copy of \(X_{i}\), but there are at least *w* edges in the (perfect) matching.

We replace the edges \((c_{1},c_{2})\) and (*u*, *v*) by \((c_{1},u)\) and \((c_{2},v)\). Since \(c_{1}\) and \(c_{2}\) are copies of the same point, it holds by the triangle inequality that \(\left\| c_{1}-u\right\| +\left\| c_{2}-v\right\| \ge \left\| u-v\right\| =\left\| u-v\right\| +\left\| c_{1}-c_{2}\right\| ,\) so the exchange does not decrease the objective value. Iterating this exchange yields an optimal matching without edges between copies of the same point.

As a result, we can decompose the matching into *w* layers, i.e., *w* matchings of the points \(X_{1},\dots ,X_{\ell }\), since no edge connects two copies of the same point. These *w* matchings \(M_{1},\dots ,M_{w}\) are independent of each other. All of them must be optimal matchings on \(X_{1},\dots ,X_{\ell }\), since otherwise we could replace a suboptimal layer by an optimal matching on \(X_{1},\dots ,X_{\ell }\) and obtain a matching of larger objective value, contradicting optimality. We conclude that \(\mathrm {MaxM}(X')\le w\mathrm {MaxM}(X_{1},\dots ,X_{\ell })\). \(\square \)
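The identity of Lemma 5.11 can be checked by brute force on tiny instances; `max_matching` below is an illustrative exponential-time routine, not an algorithm used by the framework.

```python
from math import dist  # Euclidean distance (Python 3.8+)

def max_matching(points):
    """Brute-force value of a maximum-weight perfect matching: match the
    first point with every possible partner and recurse on the rest.
    Exponential time; only for sanity checks on very small inputs."""
    pts = list(points)
    if not pts:
        return 0.0
    first, rest = pts[0], pts[1:]
    return max(
        dist(first, rest[i]) + max_matching(rest[:i] + rest[i + 1:])
        for i in range(len(rest))
    )

distinct = [(0.0, 0.0), (1.0, 0.0), (0.2, 0.9), (0.8, 0.7)]
w = 2
copies = [p for p in distinct for _ in range(w)]
assert abs(max_matching(copies) - w * max_matching(distinct)) < 1e-9
```

The assertion is the statement \(w\mathrm {MaxM}(X_{1},\dots ,X_{\ell })=\mathrm {MaxM}(X')\) for this instance.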

The previous lemma enables us to use the balanced quantization approach. Combined with the properties obtained in Sect. 4.1, the following theorem is immediately derived.

**Theorem 5.12**

\(\mathrm {MaxM}\) is \(O(\ell ^{3})\)-time \((O(\ell ^{-1/(d+1)}),\varOmega (\phi ^{-1/d}))\)-quantizable with respect to \(\mathcal{F}_{\phi }\).

*Proof*

Computing \(\mathrm {BalQ}(X)\) takes time *O*(*n*). Consider the maximum matching problem on \(X_{C}=(\mathrm {cm}(C_{1}),\dots ,\mathrm {cm}(C_{\ell '}))\). Let *U* denote the set of uncovered points of size \(|U|\le \frac{\alpha n}{\ell ^{1/(d+1)}}\) for some \(\alpha >0\). Since adding or deleting *m* points can change the objective by at most \(m\sqrt{d}\), and since every covered point is moved by at most the cube diameter \(\frac{\beta }{\ell ^{1/(d+1)}}\) for some \(\beta >0\) by Lemma 5.1, we have that \(|\mathrm {MaxM}(X)-\mathrm {MaxM}(\mathrm {BalQ}(X))|\le \frac{(\alpha \sqrt{d}+\beta )n}{\ell ^{1/(d+1)}}\). Together with Lemma 5.11 and the lower bound of Sect. 4.1, this shows that \(\mathrm {MaxM}\) is (*R*, *Q*)-quantizable where \(R=\frac{(\alpha \sqrt{d}+\beta )}{\ell ^{1/(d+1)}}=O\left( \ell ^{-1/(d+1)}\right) \) and \(Q=\varOmega (\phi ^{-1/d})\). \(\square \)

To strengthen our result further, consider the following lemma that is due to Duan and Pettie [17].

**Lemma 5.13**

Let \(\varepsilon >0\). A \((1-\varepsilon )\)-approximate maximum weighted matching can be computed in time \(O(m\varepsilon ^{-2}\log ^{3}n)\), where *m* is the number of edges and *n* is the number of vertices in the graph.

Hence, for \(\mathrm {MaxM},\) a \(O(\ell ^{2(d+2)/(d+1)}\log ^{3}n)\)-time computable probable \((1-O(\phi ^{1/d}/\ell ^{1/(d+1)}))\)-approximation exists. Put otherwise, in time \(O(n^{p})\) for \(1\le p\le 2(1+\frac{1}{d+1})\), we can compute an approximation that is asymptotically optimal if \(\phi =o(n^{\frac{p}{2}(1-\frac{2}{d+2})-\varepsilon })\) for some \(\varepsilon >0\). The largest adversary power possible with this approach is hence \(\phi =o(n^{1-1/(d+1)-\varepsilon })\). This improves upon the admissible adversary power of Sect. 4.

*Remark 5.14*

The results for \(\mathrm {MaxM}\) carry over to \(\mathrm {MaxTSP}\) as in Sect. 4.2. Extending a matching of a quantized instance \(X'=((X'_{1},w),\dots ,(X'_{2\ell },w))\) to a Hamiltonian cycle in linear time allows, for \(1\le p\le 2\), to compute a solution with \(\rho \ge 1-O(\root d \of {\phi }/n^{\frac{p}{2(d+2)}-\varepsilon })\) on perturbed instances in time \(O(n^{p})\).

## 6 Bin Packing

In this section, we will apply the grid quantization framework developed in Sect. 4 to the multidimensional bin packing problem. Let \(X=(X_{1},\dots ,X_{n})\in [0,1]^{dn}\) be a set of *n* items. An item \(X=(x_{1},\dots ,x_{d})\) is treated as a *d*-dimensional box, where \(x_{i}\) is its side length in dimension *i*. We define \(\mathrm {BP}_{d}(X)\) to be the minimum number of bins of volume one, i.e., *d*-dimensional hypercubes \([0,1]^{d}\), that are needed to pack all elements. Items must not be rotated and must be packed such that their interiors are disjoint.

In what follows, we extend the result of Karger and Onak [27], who gave linear-time asymptotically optimal approximation algorithms for smoothed instances with \(\phi =O(1)\) and for instances with i.i.d. points. These tractability results are highly interesting due to the fact that unless \(\mathsf {P}=\mathsf {NP}\), there is not even an asymptotic polynomial-time approximation scheme (APTAS) solving the two-dimensional bin packing problem [7].

While Karger and Onak’s approach appears rather problem-specific, our solution embeds nicely into our framework. The main difference of our approach lies in a much simpler rounding routine and analysis, after which we solve the problem exactly as in their distribution-oblivious algorithm. Note that their algorithm is supplied with a desired relative error \(\varepsilon >0\) and succeeds with a probability of \(1-2^{-\varOmega (n)}\). Although not stated for this case, we believe that their algorithm may also apply to superconstant choices of \(\phi \), at a cost of decreasing the success probability. We feel that our analysis offers more insights on the reasons why bin packing is smoothed tractable by putting it into the context of our general framework.

Consider first the one-dimensional case. Unless \(\mathsf{P}=\mathsf{NP}\), the functional \(\mathrm {BP}_{1}\) does not admit a \((\frac{2}{3}+\varepsilon )\)-approximation for any constant \(\varepsilon >0\). However, asymptotic polynomial approximation schemes exist [22], i.e., \((1-\varepsilon )\)-approximations on instances with a sufficiently large objective value. These approximation schemes have an interesting connection to smoothed analysis due to the following property.

**Lemma 6.1**

There is a constant \(c>0\) such that for each \(f\in \mathcal{F}_{\phi }^{n}\), \(\Pr _{X\sim f}\left[ \mathrm {BP}_{d}(X)\ge \frac{cn}{\phi ^{d}}\right] =1-2^{-\varOmega (n)}.\)

*Proof*

Since the joint density of each item is bounded by \(\phi \), the probability that some coordinate of an item is below \(\frac{1}{2d\phi }\) is at most \(\phi \cdot \frac{d}{2d\phi }=\frac{1}{2}\). Hence, each item has volume at least \((2d\phi )^{-d}\) with probability at least \(\frac{1}{2}\), independently of all other items. By the Chernoff bound of Lemma 2.1, with probability \(1-2^{-\varOmega (n)}\), at least \(\frac{n}{4}\) items have at least this volume. Since the items packed into a single bin have total volume at most one, at least \(\frac{n}{4}(2d\phi )^{-d}\) bins are required. \(\square \)

We say that a single item \(X=(x_{1},\dots ,x_{d})\) fits in a box \(B=(b_{1},\dots ,b_{d})\) if \(x_{i}\le b_{i}\) for all \(1\le i\le d\). In this case, we write \(X\sqsubseteq B\), adopting the notation of [27]. Regarding an item as a box as well, this relation is transitive and induces the monotonicity property that for each \(X=(X_{1},\dots ,X_{n})\) and \(Y=(Y_{1},\dots ,Y_{n})\) with \(X_{i}\sqsubseteq Y_{i}\), it holds that \(\mathrm {BP}_{d}(X)\le \mathrm {BP}_{d}(Y)\).

To apply the quantization framework, we require a suitable bound on the rounding errors. Unlike for \(\mathrm {MaxM}\) and \(\mathrm {MaxTSP}\), no deterministic bound of \(n\delta \) is possible for a \(\delta \)-rounding: Let the instance \(X^{(n)}\) consist of *n* copies of \((\frac{1}{2},\dots ,\frac{1}{2})\). Packing \(2^{d}\) items per bin results in zero waste, hence \(\mathrm {BP}_{d}(X^{(n)})=n/2^{d}\). However, for any \(\delta >0\), the \(\delta \)-rounding \(Y^{(n)}\) consisting of *n* copies of \((\frac{1}{2}+\frac{\delta }{\sqrt{d}},\dots ,\frac{1}{2}+\frac{\delta }{\sqrt{d}})\) has an objective value of \(\mathrm {BP}_{d}(Y^{(n)})=n=2^{d}\mathrm {BP}_{d}(X^{(n)})\). Thus, a smoothed analysis of the rounding error is necessary.
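The counterexample can be reproduced numerically. For identical axis-aligned cubes of side at least \(\frac{1}{2}\), the grid packing is optimal (more cubes would exceed the bin volume, or, for side above \(\frac{1}{2}\), would have to overlap along every axis), so the bin count is easy to compute; the helper below is illustrative only.

```python
import math

def bins_for_identical_cubes(n, side, d):
    """Number of unit bins for n identical d-dimensional cubes of the given
    side length, packed axis-aligned in a grid. This grid packing is optimal
    for side >= 1/2, which is the only regime used here."""
    per_bin = math.floor(1.0 / side) ** d
    return math.ceil(n / per_bin)

n, d, delta = 64, 2, 0.01
assert bins_for_identical_cubes(n, 0.5, d) == n // 2 ** d       # 2^d cubes per bin
assert bins_for_identical_cubes(n, 0.5 + delta, d) == n         # one cube per bin
```

An arbitrarily small expansion thus blows up the objective by a factor of \(2^{d}\), which is why the rounding error must be analyzed in the smoothed setting.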

**Lemma 6.2**

Let \(f\in \mathcal{F}_{\phi }^{n}\) and \(t>0\). With probability at least \(1-\exp (-\varOmega (n(td\phi )^{2}))\) over \(X\sim f\), every instance *Y* with \(\left\| Y_{i}-X_{i}\right\| \le t\) for all *i* satisfies \(|\mathrm {BP}_{d}(X)-\mathrm {BP}_{d}(Y)|\le 2ntd\phi \).

Note that this probability tends to zero if \(t=\omega (\frac{1}{\phi \sqrt{n}})\). Since grid quantization rounds the points to \(\ell \) distinct points by moving each item by at most \(t=\sqrt{d}\ell ^{-1/d}\), the requirement \(\ell =o(n)\) even implies that \(t=\omega (\frac{1}{\sqrt{n}})\) for \(d\ge 2\).

To prepare the proof of Lemma 6.2, we introduce the following notion.

**Definition 6.3**

For \(t\in \mathbb {R}\), we define the *δ-expansion by t,* written \(\delta _{t}:[0,1]^{d}\rightarrow [0,1]^{d}\), by \(\delta _{t}(x_{1},\dots ,x_{d}):=(\pi (x_{1}+t),\dots ,\pi (x_{d}+t))\), where \(\pi (z):=\max \{\min \{z,1\},0\}\) clamps each coordinate to [0, 1]. For negative *t*, the expansion shrinks the box.

Given \(X=(X_{1},\dots ,X_{n})\), we abbreviate \(X_{-i}:=(X_{1},\dots ,X_{i-1},X_{i+1},\dots ,X_{n})\) and write \((X_{-i},z)\) shorthand for \((X_{1},\dots ,X_{i-1},z,X_{i+1},\dots ,X_{n})\). In what follows, we explicitly distinguish random variables \(X_{i}\) from specific realizations \(\bar{X_{i}}\).
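The fits-relation \(\sqsubseteq \) and the \(\delta \)-expansion translate directly into code; this is a minimal sketch, assuming the expansion is clamped to the unit cube, with illustrative helper names.

```python
def fits(x, b):
    """x ⊑ b: item x fits into box b (componentwise comparison)."""
    return all(xi <= bi for xi, bi in zip(x, b))

def delta(t, x):
    """δ-expansion of x by t, clamped to [0,1]^d; negative t shrinks the box."""
    return tuple(min(max(xi + t, 0.0), 1.0) for xi in x)

x = (0.3, 0.95, 0.5)
t = 0.1
# monotonicity: the shrunken box fits into x, and x fits into the expanded box
assert fits(delta(-t, x), x) and fits(x, delta(t, x))
assert all(abs(a - b) < 1e-12 for a, b in zip(delta(t, x), (0.4, 1.0, 0.6)))
```

Together with the monotonicity of \(\mathrm {BP}_{d}\) under \(\sqsubseteq \), this is exactly the sandwich \(\mathrm {BP}_{d}(\delta _{-t}(X))\le \mathrm {BP}_{d}(Y)\le \mathrm {BP}_{d}(\delta _{t}(X))\) used below.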

**Lemma 6.4**

Let \(t>0\), let \(1\le i\le n\) and fix realizations \(\bar{X}_{-i}\) of the items \(X_{-i}\). Then \(E\left[ \mathrm {BP}_{d}(\bar{X}_{-i},\delta _{t}(X_{i}))\right] -E\left[ \mathrm {BP}_{d}(\bar{X}_{-i},X_{i})\right] \le td\phi .\)

*Proof*

Let *N* be the smallest number of bins required to pack the items \(\bar{X}_{-i}\). Clearly, \(N\le \mathrm {BP}_{d}(\bar{X}_{-i},X_{i})\le N+1\), since any packing of \((\bar{X}_{-i},X_{i})\) must also pack \(\bar{X}_{-i}\), and it suffices to place element \(X_{i}\) in its own bin to obtain a bin packing with only one additional bin. The same bounds hold for \(\mathrm {BP}_{d}(\bar{X}_{-i},\delta _{t}(X_{i}))\); note that \(\delta _{t}(\bar{X}_{i})\le \bar{X}_{i}+t\) componentwise.

We call a box *small* if it can be packed into the empty space of some placement of the other boxes into *N* bins; a *large* box necessarily increases the number of bins needed to pack all items to \(N+1\). Let \(\mathcal{S}(\bar{X}_{-i})\) be the set of small boxes, i.e., \(\mathrm {BP}_{d}(\bar{X}_{-i},\bar{X})=N\) if \(\bar{X}\in \mathcal{S}(\bar{X}_{-i})\) and \(\mathrm {BP}_{d}(\bar{X}_{-i},\bar{X})=N+1\) otherwise. The conditional expectation of \(\mathrm {BP}_{d}(\bar{X}_{-i},X_{i})\) is given by \(N+1-\Pr [X_{i}\in \mathcal{S}(\bar{X}_{-i})]\).

Similarly, \((\bar{X}_{-i},\delta _{t}(\bar{X}_{i}))\) can be packed into *N* bins containing \(\bar{X}_{-i}\) if and only if \(\delta _{t}(\bar{X}_{i})\in \mathcal{S}(\bar{X}_{-i})\). Consider \(\delta _{-t}(\mathcal{S}(\bar{X}_{-i})):=\{\delta _{-t}(s)\mid s\in \mathcal{S}(\bar{X}_{-i})\}\); then this is implied by \(\bar{X}_{i}\in \delta _{-t}(\mathcal{S}(\bar{X}_{-i}))\). To analyze the quantity \(\varDelta {:=}\mathrm {vol}(\mathcal{S}(\bar{X}_{-i}))-\mathrm {vol}(\delta _{-t}(\mathcal{S}(\bar{X}_{-i})))\), note that shrinking a geometric object contained in the unit cube along a single dimension by an additive amount of *t* decreases its volume by at most *t*. Applying this for all dimensions yields \(\varDelta \le dt\). Since the density of \(X_{i}\) is bounded by \(\phi \), we conclude \(E\left[ \mathrm {BP}_{d}(\bar{X}_{-i},\delta _{t}(X_{i}))\right] -E\left[ \mathrm {BP}_{d}(\bar{X}_{-i},X_{i})\right] \le \Pr [X_{i}\in \mathcal{S}(\bar{X}_{-i})]-\Pr [X_{i}\in \delta _{-t}(\mathcal{S}(\bar{X}_{-i}))]\le \phi \varDelta \le td\phi .\) \(\square \)

Lemma 6.4 gives a bound on the expected loss of “removing” a \(\delta _{t}\)-expansion for a single item. By linearity of expectation, the expected loss adds up if all items are \(\delta \)-expanded by *t*.

**Lemma 6.5**

Let \(t>0\) and \(f\in \mathcal{F}_{\phi }^{n}\). For \(X\sim f\), writing \(\delta _{t}(X):=(\delta _{t}(X_{1}),\dots ,\delta _{t}(X_{n}))\), we have \(E\left[ \mathrm {BP}_{d}(\delta _{t}(X))\right] -E\left[ \mathrm {BP}_{d}(X)\right] \le ntd\phi \), and symmetrically, \(E\left[ \mathrm {BP}_{d}(X)\right] -E\left[ \mathrm {BP}_{d}(\delta _{-t}(X))\right] \le ntd\phi \).

*Proof*

We replace the items one by one by their \(\delta _{t}\)-expansions. Conditioned on any realization of the remaining items, each replacement increases the expected number of bins by at most \(td\phi \) by Lemma 6.4; summing over all *n* items yields the first statement. The symmetric statement follows analogously. \(\square \)

With the property that any \(\delta \)-expansion by *t* has only a small effect on the expected difference of the number of required bins, an application of Azuma’s inequality (Lemma 2.2) shows that this holds, in fact, with high probability.

*Proof of Lemma 6.2*

By monotonicity of \(\mathrm {BP}_{d}\), all *Y* with \(\left\| Y_{i}-X_{i}\right\| \le t\) satisfy \(\mathrm {BP}_{d}(\delta _{-t}(X))\le \mathrm {BP}_{d}(Y)\le \mathrm {BP}_{d}(\delta _{t}(X))\). Hence, if \(|\mathrm {BP}_{d}(X)-\mathrm {BP}_{d}(Y)|>2ntd\phi \) for some such *Y*, then \(\mathrm {BP}_{d}(\delta _{t}(X))-\mathrm {BP}_{d}(X)>2ntd\phi \) or \(\mathrm {BP}_{d}(X)-\mathrm {BP}_{d}(\delta _{-t}(X))>2ntd\phi \). By Lemma 6.5, the expectation of each of these two differences is at most \(ntd\phi \), and changing a single item changes each difference by at most a constant. Thus, Azuma's inequality (Lemma 2.2) bounds the probability of either event by \(\exp (-\varOmega (n(td\phi )^{2}))\). \(\square \)

Solving the high-multiplicity version of the one-dimensional case has been a key ingredient in approximation schemes for this problem since the first APTAS by [22]. The following lemma from [27] solves the multi-dimensional case.

**Lemma 6.6**

Let \(X'=((X'_{1},n_{1}),\dots ,(X'_{\ell },n_{\ell }))\) be a quantized input with \(X'_{i}\in [\delta ,1]^{d}\). Then \(\mathrm {BP}_{d}(X')\) can be computed in time \(O(f(\ell ,\delta )\mathrm {polylog}(n))\) where \(n:=\sum _{i=1}^{\ell }n_{i}\), \(f(\ell ,\delta )\) is independent of *n* and \(f(\ell ,{1}/{\root d \of {\ell }})=2^{\ell ^{O(\ell )}}\).

Observe that each coordinate of the quantized points \(X'_{i}\) obtained by the grid quantization theorem (Theorem 4.1) is at least \(\ell ^{-1/d}\), since we may assume that \(\mathrm {GridQ}\) represents each hypercube \(Q_{i}^{k}\) by its maximal element. Hence, Lemmas 6.1, 6.2 and 6.6 fulfill the requirements for the grid quantization theorem.
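The rounding-up step just described can be sketched as follows; `grid_q_up` and the choice of grid resolution are illustrative assumptions, not the exact \(\mathrm {GridQ}\) of Theorem 4.1.

```python
import math

def grid_q_up(items, ell):
    """Sketch of grid quantization for bin packing: with g ~ ell^(1/d) cells
    per axis, round every coordinate up to the upper boundary of its cell.
    Each rounded coordinate is then at least 1/g, and by monotonicity of
    BP_d the rounded instance needs at least as many bins as the original."""
    d = len(items[0])
    g = max(1, round(ell ** (1.0 / d)))   # grid resolution per axis
    return [tuple(math.ceil(max(x, 1e-12) * g) / g for x in it) for it in items]

items = [(0.34, 0.12), (0.5, 0.5), (0.76, 0.99)]
out = grid_q_up(items, 16)               # g = 4, cell width 1/4
assert out == [(0.5, 0.25), (0.5, 0.5), (1.0, 1.0)]
# rounding up: every original item fits into its rounded box
assert all(x <= y for it, r in zip(items, out) for x, y in zip(it, r))
```

Rounding up (rather than to cube centers) is what guarantees the lower bound \(X'_{i}\in [\ell ^{-1/d},1]^{d}\) required by Lemma 6.6.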

**Theorem 6.7**

For \(d\ge 2\), \(\mathrm {BP}_{d}\) is \(2^{\ell ^{O(\ell )}}\)-time \((O(\frac{\phi }{\root d \of {\ell }}),\varOmega (\frac{1}{\phi ^{d}}))\)-quantizable with respect to \(\mathcal{F}_{\phi }\).

Hence, there is a linear-time probable \((1-O(\phi ^{d+1}/(\log \log n/\log ^{(3)}n)^{1/d}))\)-approximation. Thus, \(\mathrm {BP}_{d}\) can be computed asymptotically exactly in time *O*(*n*) if \(\phi =o((\log \log n/\log ^{(3)}n)^{1/d(d+1)})\). Here, allowing superlinear time has no effect on the admissible adversarial power. Furthermore, since \(\mathrm {BP}_{d}\) can be trivially approximated by a factor of *n* and the success probability of our algorithm is of order \(1-\exp (-\varOmega (n^{1-\varepsilon }))\), asymptotically optimal expected approximation ratios can be obtained for the same values of \(\phi \).

## 7 Concluding Remarks

Generalizing previous rounding-based approaches, we demonstrate that the general solution technique of quantization performs well on Euclidean optimization problems in the setting of smoothed analysis. We are optimistic that our framework can also be applied to disk covering and scheduling problems.

Note that our approach is orthogonal to the framework for smooth and near-additive Euclidean functionals by Bläser et al. [12]: By definition, a smooth Euclidean functional *F* on *n* points is bounded by \(O(n^{1-1/d})\). Hence, it can never compensate for the rounding error of at least \(\varOmega (\ell ^{-1/d})\) per point that our quantization methods induce, as quantization is only reasonable for \(\ell \le n\) and consequently, the total rounding error amounts to \(\varOmega (n^{1-1/d})\). Conversely, if a functional is large enough to compensate for rounding errors induced by quantization, it cannot be smooth. Thus, for any Euclidean functional, at most one of both frameworks is applicable.

This observation is especially interesting in the context of the work of Bern and Eppstein [11]. For a general class of *subadditive geometric graphs*, which includes the optimal solutions of some problems tractable in the two smoothed analysis frameworks, they prove a gap theorem on the worst-case sums of the edge lengths of the graphs. Either these sums are bounded by \(O(n^{1-1/d})\) on all point sets, or there exists a point set inducing a graph of total edge length \(\varOmega (n)\). It might be interesting to explore whether such a gap behavior persists in the setting of smoothed analysis. It might be possible to identify general conditions for subadditive geometric graphs to be smoothed tractable, potentially exploiting both smoothed analysis frameworks on their corresponding sides of the gap.

## Footnotes

- 1.
If the framework algorithm fails with probability at most *p*, then an *o*(1/*p*)-approximation algorithm would also suffice to ensure expected asymptotic optimality. At this point, we require *O*(1)-approximations only for simplicity of presentation. In Sect. 6, we will make use of a slightly more precise analysis of the failure probability of the framework algorithm to use an *n*-approximation for bin packing.

## Notes

### Acknowledgments

The authors are grateful to Markus Bläser for kindling their interest in smoothed analysis and for stimulating discussions, and to the anonymous reviewers of this article for providing helpful remarks.

### References

- 1. Anstee, R.P.: A polynomial algorithm for b-matchings: an alternative approach. Inf. Proc. Lett. **24**(3), 153–157 (1987)
- 2. Arthur, D., Manthey, B., Röglin, H.: Smoothed analysis of the k-means method. J. ACM **58**(5), 19:1–19:31 (2011)
- 3. Arthur, D., Vassilvitskii, S.: k-means++: the advantages of careful seeding. In: 18th Annual ACM-SIAM Symposium on Discrete Algorithms, SODA'07, pp. 1027–1035. SIAM (2007)
- 4. Arthur, D., Vassilvitskii, S.: Worst-case and smoothed analysis of the ICP algorithm, with an application to the k-means method. SIAM J. Comput. **39**(2), 766–782 (2009)
- 5. Avis, D.: A survey of heuristics for the weighted matching problem. Networks **13**(4), 475–493 (1983)
- 6. Awasthi, P., Blum, A., Sheffet, O.: Center-based clustering under perturbation stability. Inf. Proc. Lett. **112**(1–2), 49–54 (2012)
- 7. Bansal, N., Correa, J.R., Kenyon, C., Sviridenko, M.: Bin packing in multiple dimensions: inapproximability results and approximation schemes. Math. Oper. Res. **31**, 31–49 (2006)
- 8. Barvinok, A., Fekete, S.P., Johnson, D.S., Tamir, A., Woeginger, G.J., Woodroofe, R.: The geometric maximum traveling salesman problem. J. ACM **50**(5), 641–664 (2003)
- 9. Barvinok, A.I.: Two algorithmic results for the traveling salesman problem. Math. Oper. Res. **21**(1), 65–84 (1996)
- 10. Beier, R., Vöcking, B.: Typical properties of winners and losers in discrete optimization. SIAM J. Comput. **35**(4), 855–881 (2006)
- 11. Bern, M., Eppstein, D.: Worst-case bounds for subadditive geometric graphs. In: 9th Annual Symposium on Computational Geometry, SCG'93, pp. 183–188. ACM, New York (1993)
- 12. Bläser, M., Manthey, B., Rao, B.V.R.: Smoothed analysis of partitioning algorithms for Euclidean functionals. Algorithmica **66**(2), 397–418 (2013)
- 13. Boros, E., Elbassioni, K., Fouz, M., Gurvich, V., Makino, K., Manthey, B.: Stochastic mean payoff games: smoothed analysis and approximation schemes. In: 38th International Colloquium on Automata, Languages and Programming, ICALP'11, pp. 147–158. Springer (2011)
- 14. Chen, K.: On coresets for k-median and k-means clustering in metric and Euclidean spaces and their applications. SIAM J. Comput. **39**(3), 923–947 (2009)
- 15. Curticapean, R., Künnemann, M.: A quantization framework for smoothed analysis of Euclidean optimization problems. In: 21st European Symposium on Algorithms, ESA'13, pp. 349–360. Springer, Berlin (2013)
- 16. Dasgupta, S.: The hardness of k-means clustering. Technical report cs2007-0890, University of California, San Diego (2007)
- 17. Duan, R., Pettie, S.: Approximating maximum weight matching in near-linear time. In: 51st Annual IEEE Symposium on Foundations of Computer Science, FOCS'10, pp. 673–682. IEEE Computer Society, Washington, DC (2010)
- 18. Dyer, M.E., Frieze, A.M., McDiarmid, C.J.H.: Partitioning heuristics for two geometric maximization problems. Oper. Res. Lett. **3**(5), 267–270 (1984)
- 19. Englert, M., Röglin, H., Vöcking, B.: Worst case and probabilistic analysis of the 2-opt algorithm for the TSP: extended abstract. In: 18th Annual ACM-SIAM Symposium on Discrete Algorithms, SODA'07, pp. 1295–1304. SIAM (2007)
- 20. Fekete, S.P., Meijer, H., Rohe, A., Tietze, W.: Solving a “hard” problem to approximate an “easy” one: heuristics for maximum matchings and maximum traveling salesman problems. ACM J. Exp. Algorithmics **7**, 11 (2002)
- 21. Feldman, D., Monemizadeh, M., Sohler, C.: A PTAS for k-means clustering based on weak coresets. In: 23rd Annual Symposium on Computational Geometry, SCG'07, pp. 11–18. ACM (2007)
- 22. Fernandez de la Vega, W., Lueker, G.: Bin packing can be solved within \(1 + \epsilon \) in linear time. Combinatorica **1**(4), 349–355 (1981)
- 23. Gabow, H.N.: An efficient implementation of Edmonds' algorithm for maximum matching on graphs. J. ACM **23**(2), 221–234 (1976)
- 24. Har-Peled, S., Mazumdar, S.: On coresets for k-means and k-median clustering. In: 36th Annual ACM Symposium on Theory of Computing, STOC'04, pp. 291–300 (2004)
- 25. Inaba, M., Katoh, N., Imai, H.: Applications of weighted Voronoi diagrams and randomization to variance-based k-clustering (extended abstract). In: 10th Annual Symposium on Computational Geometry, SCG'94, pp. 332–339 (1994)
- 26. Kanungo, T., Mount, D.M., Netanyahu, N.S., Piatko, C.D., Silverman, R., Wu, A.Y.: A local search approximation algorithm for k-means clustering. Comput. Geom. Theory Appl. **28**(2–3), 89–112 (2004)
- 27. Karger, D., Onak, K.: Polynomial approximation schemes for smoothed and random instances of multidimensional packing problems. In: 18th Annual ACM-SIAM Symposium on Discrete Algorithms, SODA'07, pp. 1207–1216 (2007)
- 28. Karp, R.M., Luby, M., Marchetti-Spaccamela, A.: A probabilistic analysis of multidimensional bin packing problems. In: 16th Annual ACM Symposium on Theory of Computing, STOC'84, pp. 289–298. ACM, New York (1984)
- 29. Mahajan, M., Nimbhorkar, P., Varadarajan, K.: The planar k-means problem is NP-hard. Theor. Comput. Sci. **442**, 13–21 (2012)
- 30. Manthey, B., Röglin, H.: Smoothed analysis: analysis of algorithms beyond worst case. Inf. Technol. **53**(6), 280–286 (2011)
- 31. McDiarmid, C.: Concentration. In: Habib, M., McDiarmid, C., Ramirez-Alfonsin, J., Reed, B. (eds.) Probabilistic Methods for Algorithmic Discrete Mathematics, Volume 16 of Algorithms and Combinatorics, pp. 195–248. Springer, Berlin (1998)
- 32. Plotkin, S.A., Shmoys, D.B., Tardos, É.: Fast approximation algorithms for fractional packing and covering problems. Math. Oper. Res. **20**(2), 257 (1995)
- 33. Spielman, D.A., Teng, S.-H.: Smoothed analysis: an attempt to explain the behavior of algorithms in practice. Commun. ACM **52**(10), 76–84 (2009)
- 34. Spielman, D.A., Teng, S.-H.: Smoothed analysis of algorithms: why the simplex algorithm usually takes polynomial time. J. ACM **51**(3), 385–463 (2004)
- 35. Steele, J.M.: Subadditive Euclidean functionals and nonlinear growth in geometric probability. Ann. Probab. **9**(3), 365–376 (1981)
- 36. Weber, M., Liebling, T.M.: Euclidean matching problems and the metropolis algorithm. Math. Methods Oper. Res. **30**(3), A85–A110 (1986)