1 Introduction

Recent years have seen a large outpouring of work in analysis, geometry and theoretical computer science on metric space embeddings guaranteed to introduce only small distortion into the distances between pairs of points.

Euclidean space is not only a metric space; it is also equipped with higher-dimensional volumes. General metrics do not carry such structure. However, a general definition of the volume of a set of points in an arbitrary metric space was developed by Feige [10].

In this paper we extend the study of metric embeddings into Euclidean space in two steps: first, we show a robustness property of the general volume definition; then, using this robustness property together with existing metric embedding methods, we exhibit an embedding that guarantees small distortion not only on pairs, but also on the volumes of sets of points. The robustness property (see Theorem 2) is that the minimization over permutations in the volume definition affects it by only a constant factor. This result is of independent interest, as it provides an analysis of the greedy algorithm for a variant of the online Steiner tree problem in which the cost of buying an edge is logarithmic in its length: we show that the greedy algorithm has a constant competitive ratio against the optimum. Our main application of Theorem 2 is an algorithmic embedding (see Theorem 3) with constant average distortion for sets of any fixed size. In fact, our bound on the average distortion scales logarithmically with the size of the set. Moreover, this bound holds even for higher moments of the distortion (the \(\ell _q\)-distortion), while the embedding simultaneously maintains the best possible worst-case distortion bound. Hence our embedding generalizes both [16] and [3] (see related work below).

1.1 Volume in General Metric Spaces

Let \(d_\mathrm{{E}}\) denote Euclidean distance, and let \(\mathrm{affspan}\) denote the affine span of a point set. The \((n-1)\)-dimensional Euclidean volume of the convex hull of points \(X=\{v_1,\ldots ,v_n\} \subseteq \mathbb {R}^d\) is

$$\begin{aligned} {\phi }_\mathrm{E}(X) = \frac{1}{(n-1)!} \prod _{i=2}^n d_\mathrm{E}(v_i,\mathrm{affspan}(v_1,\ldots ,v_{i-1})). \end{aligned}$$

This definition is, of course, independent of the order of the points.
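To make the formula concrete, here is a minimal computational sketch (our illustration, not part of the original text; it assumes the points are affinely independent): the distance from \(v_i\) to \(\mathrm{affspan}(v_1,\ldots ,v_{i-1})\) is the norm of the residual left after orthogonally projecting \(v_i-v_1\) onto the translated linear span.

```python
import numpy as np
from math import factorial

def dist_to_affspan(v, pts):
    """Euclidean distance from v to the affine span of the rows of pts."""
    w = v - pts[0]                      # translate so the span passes through 0
    A = (pts[1:] - pts[0]).T            # columns span the direction space
    if A.shape[1] == 0:
        return np.linalg.norm(w)        # the affine span of one point is the point
    Q, _ = np.linalg.qr(A)              # orthonormal basis for the column space
    return np.linalg.norm(w - Q @ (Q.T @ w))

def phi_E(points):
    """(n-1)-dimensional volume of the convex hull, via the product formula."""
    X = np.asarray(points, dtype=float)
    prod = 1.0
    for i in range(1, len(X)):
        prod *= dist_to_affspan(X[i], X[:i])
    return prod / factorial(len(X) - 1)
```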

1.1.1 Feige’s Notion of Volume

Let \((X,d_X)\) be a finite metric space, \(X=\{v_1, \ldots , v_n\}\). Let \(S_n\) be the symmetric group on \(n\) symbols, and let \(\pi _\mathrm{{P}} \in S_n\) be an order in which the points of \(X\) may be adjoined to a minimum spanning tree by Prim’s algorithm. (Thus \(v_{\pi _\mathrm{{P}}(1)}\) is an arbitrary point, \(v_{\pi _\mathrm{{P}}(2)}\) is the closest point to it, etc.) Feige’s notion of the volume of \(X\) is (we have normalized by a factor of \((n-1)!\)):

$$\begin{aligned} {\phi }_\mathrm{F}(X) =\frac{1}{(n-1)!} \prod _{i=2}^n d_X(v_{\pi _\mathrm{{P}}(i)},\{v_{\pi _\mathrm{{P}}(1)},\ldots ,v_{\pi _\mathrm{{P}}(i-1)}\}). \end{aligned}$$
(1)

\(\pi _\mathrm{{P}}\) minimizes the above expression (1) (see Sect. 2).
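Since \(\pi _\mathrm{{P}}\) is a Prim order, \({\phi }_\mathrm{F}\) can be computed directly from a distance matrix. The sketch below (ours, for illustration; it reuses phi_E from the previous sketch) does exactly that, and the inline example anticipates the thin-triangle phenomenon discussed next.

```python
import numpy as np
from math import factorial

def phi_F(D):
    """Feige volume, Eq. (1): D is an n-by-n matrix of pairwise distances."""
    n = D.shape[0]
    tree, rest = [0], set(range(1, n))
    prod = 1.0
    while rest:
        v = min(rest, key=lambda u: min(D[u][t] for t in tree))  # Prim's rule
        prod *= min(D[v][t] for t in tree)                       # adjoining distance
        tree.append(v)
        rest.remove(v)
    return prod / factorial(n - 1)

# A very thin triangle: phi_E equals the area (~5e-7), while phi_F multiplies
# the two short MST edges (~0.5 each) and divides by 2!, giving ~0.125.
pts = np.array([[0.0, 0.0], [1.0, 0.0], [0.5, 1e-6]])
D = np.linalg.norm(pts[:, None] - pts[None, :], axis=-1)
print(phi_F(D), phi_E(pts))
```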

It should be noted that even if \(X\) is a subset of Euclidean space, \({\phi }_\mathrm{E}\) and \({\phi }_\mathrm{F}\) do not agree. (The latter can be arbitrarily larger than the former; consider, for instance, a very thin triangle.) The actual relationship that Feige found between these notions is nontrivial. Let \(\mathcal {L}_2(X)\) be the set of non-expansive embeddings from \(X\) into Euclidean space. Feige proved the following:

Theorem 1

(Feige) For any \(n\) point metric space \((X,d)\):

$$\begin{aligned} 1 \le \big [ \frac{ {\phi }_{F}(X)}{\sup _{f \in \mathcal {L}_2(X)} {\phi }_{E}(f(X)) }\big ]^{1/(n-1)} \le 2. \end{aligned}$$

Thus, remarkably, \({\phi }_\mathrm{{F}}(X)\) is characterized to within a factor of \(2\) (after normalizing for dimension) by the Euclidean embeddings of \(X\).

1.1.2 Our Work, Part I: Robustness of the Metric Volume

What we show first is that Feige’s definition is insensitive to the minimization over permutations implicit in Eq. (1), so that a generalized version of Theorem 1 can also be obtained.

Theorem 2

There is a constant \(C\) such that for any \(n\)-point metric space \((X,d)\), and with \(\pi _\mathrm{{P}}\) defined as above, and for every \(\pi \in S_n\):

$$\begin{aligned} 1 \le \left( \frac{\prod _{i=2}^n d_X(v_{\pi (i)},\{v_{\pi (1)},\ldots ,v_{\pi (i-1)}\})}{\prod _{i=2}^n d_X(v_{\pi _\mathrm{{P}}(i)},\{v_{\pi _\mathrm{{P}}(1)},\ldots ,v_{\pi _\mathrm{{P}}(i-1)}\})} \right) ^{1/(n-1)} \le C. \end{aligned}$$

An alternative interpretation of this result is as the analysis of an online problem: consider the following variant of the online metric Steiner tree problem [14]. Given a complete weighted graph \((V,E)\), at each time unit \(i\) the adversary outputs a vertex \(v_i \in V\), and an online algorithm can buy edges \(E_i \subseteq E\). At each time unit \(i\), the edges bought \(E_1,\dots ,E_i\) must induce a connected graph on the current set of vertices \(v_1,\dots ,v_i\). The competitive ratio of an online algorithm is the worst-case ratio between the cost of the edges it buys and the cost of the edges bought by the optimal offline algorithm. This problem has been well studied when the cost of buying an edge is proportional to its length: Imase and Waxman proved that the greedy algorithm is \(O(\log n)\)-competitive, and showed that this bound is asymptotically tight. It is natural to consider a variant where the cost of buying is a concave function of the edge length; in this case a better result may be possible. In particular, we analyze the case where this cost function is logarithmic in the edge length. Such a logarithmic cost function may capture economy-of-scale effects, where buying multiplicatively longer edges costs only additively more. In Sect. 2.1, we prove the following corollary of Theorem 2.

Corollary 1

Given a complete weighted graph with arbitrary weights which are at least \(2\), the greedy algorithm is \(O(1)\)-competitive for the Online Metric Steiner Tree with logarithmic edge costs.
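A minimal sketch of the greedy algorithm in question (our rendering; the base-2 logarithm is an arbitrary choice of units, and d is an assumed metric with all distances at least \(2\), so every edge costs at least \(1\)):

```python
import math

def greedy_online_steiner_cost(d, arrivals):
    """Connect each arriving vertex to its nearest predecessor, paying log(weight)."""
    seen, cost = [], 0.0
    for v in arrivals:
        if seen:
            u = min(seen, key=lambda w: d(w, v))   # closest previously arrived vertex
            cost += math.log2(d(u, v))             # logarithmic edge cost
        seen.append(v)
    return cost
```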

1.1.3 Our Work, Part II: Volume Preserving Embeddings

We use Theorem 2 and recent results on metric embeddings [3] to give an algorithm providing a non-contractive embedding into Euclidean space that faithfully preserves volume in the following sense: the embedding obtains simultaneously both \(O(\log k)\) average volume distortion and \(O(\log n)\) worst-case volume distortion for sets of size \(k\).

Given an \(n\)-point metric space \((X,d)\), an injective mapping \(f:X \rightarrow L_2\) is called an embedding. An embedding is \((k-1)\)-dimensional non-contractive if for any \(S \in {X\atopwithdelims ()k}\): \({\phi }_\mathrm{{E}}(f(S))\ge {\phi }_\mathrm{{F}}(S)\).

Let \(f\) be a \((k-1)\)-dimensional non-contractive embedding. For a set \(S \in {X\atopwithdelims ()k}\) define the \((k-1)\)-dimensional distortion of \(S\) under \(f\) as

$$\begin{aligned} \mathrm{dist}_f(S) = \Big [\frac{ {\phi }_\mathrm{E}(f(S))}{{\phi }_\mathrm{F}(S)}\Big ]^{1/(k-1)}. \end{aligned}$$

For \(2\le k \le n\) define the \((k-1)\)-dimensional distortion of \(f\) as

$$\begin{aligned} \mathrm{dist}^{(k-1)}(f)= \max _{S \in {X \atopwithdelims ()k}} \mathrm{dist}_f(S). \end{aligned}$$

More generally, for \(2\le k \le n\) and \(1 \le q \le \infty \), define the \((k-1)\)-dimensional \(\ell _q\)-distortion of \(f\) as

$$\begin{aligned} \mathrm{dist}_q^{(k-1)}(f) = {\mathbb {E}}_{S \sim {X\atopwithdelims ()k}}[\mathrm{dist}_f(S)^q ]^{1/q}, \end{aligned}$$

where the expectation is taken according to the uniform distribution over \({X\atopwithdelims ()k}\). Observe that the \((k-1)\)-dimensional distortion is expressed by \(\mathrm{dist}_\infty ^{(k-1)}(f)\), and the average \((k-1)\)-dimensional distortion by \(\mathrm{dist}_1^{(k-1)}(f)\).
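As an illustration (ours, not part of the original text), the \((k-1)\)-dimensional \(\ell _q\)-distortion of a given embedding can be estimated by Monte Carlo sampling for finite \(q\); the exact definition averages over all of \({X\atopwithdelims ()k}\), which is expensive to enumerate. The sketch reuses the phi_E and phi_F sketches above; d and f are a caller-supplied metric and embedding.

```python
import random
import numpy as np

def estimate_dist_q(X, d, f, k, q, samples=1000):
    """Estimate dist_q^{(k-1)}(f) by sampling k-subsets uniformly at random."""
    total = 0.0
    for _ in range(samples):
        S = random.sample(range(len(X)), k)
        D = np.array([[d(X[a], X[b]) for b in S] for a in S])  # metric restricted to S
        ratio = (phi_E([f(X[a]) for a in S]) / phi_F(D)) ** (1.0 / (k - 1))
        total += ratio ** q
    return (total / samples) ** (1.0 / q)
```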

It is worth noting that Feige’s definition of volume is related to the maximum volume obtained by non-expansive embeddings, while the definitions of average distortion and \(\ell _q\)-distortion use non-contractive embeddings. We note that these definitions are crucial in order to capture the coarse geometric notion described above and to achieve results that significantly beat the usual worst-case lower bounds (which depend on the size of the metric). It is clear that one can modify the definition to allow arbitrary embeddings by defining distortions normalized by their ratio to the largest contraction.

Our main theorem on volume preserving embeddings is:

Theorem 3

For any metric space \((X,d)\) on \(n\) points and any \(2\le k \le n\), there exists a map \(f:X\rightarrow L_2\) such that for any \(1\le q \le \infty ,\, \mathrm{dist}_q^{(k-1)}(f) \in O(\min \{\lceil q/(k-1)\rceil \cdot \log k, \log n\})\). In particular, \(\mathrm{dist}_\infty ^{(k-1)}(f) \in O(\log n)\) and \(\mathrm{dist}_1^{(k-1)}(f) \in O(\log k)\).

On top of the robustness property of the general volume definition given by Theorem 2, the proof of Theorem 3 builds on the embedding techniques developed in [3] (in the context of pairwise distortion), along with combinatorial arguments that enable the stated bounds on the average and \(\ell _q\)-volume distortions.

Our embedding preserves well those sets with typically large distances, and can be viewed within the context of coarse geometry, where we desire a “high level” geometric representation of the space. This follows from a special property formally stated in Lemma 5.

1.2 Related Work

Embeddings of metric spaces have been a central field of research in theoretical computer science in recent years, due to the fact that metric spaces are important objects in the representation of data. A fundamental theorem of Bourgain [5] states that every \(n\)-point metric space \((X,d)\) can be embedded in \(L_2\) with distortion \(O(\log n)\), where the distortion is defined as the worst-case multiplicative factor by which a pair of distances change. Our work extends this result in two aspects: (1) bounding the distortion of sets of arbitrary size, and (2) providing bounds for the \(\ell _q\)-distortion for all \(q \le \infty \).

1.2.1 Volume Preserving Embeddings

Feige [10] introduced volume preserving embeddings while developing an approximation algorithm for the bandwidth problem. He showed that Bourgain’s embedding provides an embedding into Euclidean space with \((k-1)\)-dimensional distortion of \(O(\sqrt{\log n}\cdot \sqrt{\log n+k \log k})\).

Following Feige’s work, some special cases of volume preserving embeddings were studied, where the metric space \(X\) is restricted to a certain class of metric spaces. Rao [21] studied the case where \(X\) is planar or an excluded-minor metric, showing constant \((k-1)\)-dimensional distortion. Gupta [12] showed an improved approximation of the bandwidth for trees and chordal graphs. As the Feige volume does not coincide with the standard Euclidean volume, it is also interesting to study the special case where the metric space is given in Euclidean space. This case was studied by Rao [21], Dunagan and Vempala [8], and by Lee [19]. We note that our work provides the first average distortion and \(\ell _q\)-distortion analysis in the context of this special case as well.

The first improvement on Feige’s volume distortion bounds comes from the work of Rao [21]. As observed by many researchers, Rao’s embedding gives more general results depending on a certain decomposability parameter of the space. This provides a bound on the \((k-1)\)-dimensional distortion of \(O((\log n)^{3/2})\) for all \(k\le n\). This bound was further improved to \(O(\log n)\) in the work of Krauthgamer et al. [16]. Krauthgamer et al. [15] show a matching \(\Omega (\log n)\) lower bound on the \((k-1)\)-dimensional distortion for all \(k <n^{1/3}\).

1.2.2 Average and \(\ell _q\) Distortion

The notions of average distortion and \(\ell _q\)-distortion are tightly related to the notions of partial embeddings and scaling embeddings. A \((1-\varepsilon )\) partial embedding requires distortion at most \(\alpha \) for at least a \((1-\varepsilon )\) fraction of the pairs. A scaling embedding comes with a function \(\alpha :(0,1)\rightarrow \mathbb {R}\), and demands that a \((1-\varepsilon )\) fraction of the pairs have distortion at most \(\alpha (\varepsilon )\), for all \(\varepsilon \in (0,1)\) simultaneously. These notions were introduced by Kleinberg et al. [18], largely motivated by the study of distances in computer networks.

In [1], partial embeddings into \(L_p\) with tight \(O(\log 1/\varepsilon )\) partial distortion were given. The embedding method of [3] provides a scaling embedding with \(O(\log 1/\varepsilon )\) distortion for all values of \(\varepsilon >0\) simultaneously. As a consequence of having a scaling embedding, they show that any metric space can be embedded into \(L_p\) with constant average distortion and, more generally, with \(\ell _q\)-distortion bounded by \(O(q)\), while simultaneously maintaining the best possible worst-case distortion of \(O(\log n)\).

Previous results on average distortion have applications to a variety of approximation problems, including uncapacitated quadratic assignment [3], and in addition have been used in solving graph-theoretic problems [9]. Following [1, 3, 18], related notions have been studied in various contexts [2, 6, 7, 17].

2 Robustness of the Metric Volume

Proof of Theorem 2 For a tree \(T\) on \(n\) vertices \(\{v_1,\ldots ,v_n\}\), let \(\overline{{\phi }}(T)\) be the product of the edge lengths. Because of the matroid exchange property, this product is minimized by an MST. Thus for any metric space on points \(\{v_1,\ldots ,v_n\}\) and any spanning tree \(T\), \({\phi }_{F}(v_1,\ldots ,v_n) \le \overline{{\phi }}(T)/(n-1)!\); the inequality holds with equality precisely for minimum spanning trees.

Definition 1

A forced spanning tree (FST) for a finite metric space is a spanning tree whose vertices can be ordered \(v_1,\ldots ,v_n\) so that for every \(i>1\), \(v_i\) is connected to a vertex that is closest among \(v_1,\ldots ,v_{i-1}\), and to no other among these. (We call such an ordering admissible for the tree.)

An MST is an FST with the additional property that in an admissible ordering \(v_i\) is a closest vertex to \(v_1,\ldots ,v_{i-1}\) among \(v_i,\ldots ,v_n\).

Definition 2

For a tree \(T\) let \(\Delta (T)\) denote its diameter (the largest distance between any two points in the tree). Let the diameter \(\Delta (F)\) of a forest \(F\) with components \(T_1,T_2,\ldots ,T_m\) be \(\Delta (F)=\max _{1\le i\le m}\Delta (T_i)\). For a metric space \((X,d)\) let \(\Delta _k(X)=\min \{\Delta (F)\mid F \text{ is a spanning forest of } X \text{ with } k \text{ connected components}\}\).

Lemma 2

Let \((X,d)\) be a metric space. Let \(k \ge 1\). An FST for \(X\) has at most \(k-1\) edges of length greater than \(\Delta _k(X)\).

Proof

Let \(v_1,\ldots ,v_n\) be an admissible ordering of the vertices of the FST, and assign each edge to its higher-indexed endpoint; since the ordering is admissible, this assignment is injective. The lemma is trivial for \(k=1\). For \(k \ge 2\), cover \(X\) by a spanning forest of \(k\) trees, each of diameter at most \(\Delta _k(X)\). Only the lowest-indexed vertex in each tree can be assigned an edge longer than \(\Delta _k(X)\). (Note that \(v_1\) is assigned no edge, hence the bound of \(k-1\).) \(\square \)

Corollary 3

For any \(n\)-point metric space \((X,d)\) and any FST \(T'\) for \(X\), \( \overline{{\phi }}(T')\le \prod _{k=1}^{n-1}\Delta _k(X). \)

Proof

Order the edges from \(1\) to \(n-1\) by decreasing length. By Lemma 2, the \(k\)th edge is no longer than \(\Delta _k(X)\). \(\square \)

Using Corollary 3, our proof of Theorem 2 reduces to showing that for any MST \(T\) of \(X,\, \prod _{k=1}^{n-1}\Delta _k(X)\le e^{O(n-1)}\overline{{\phi }}(T)\). Specifically we shall show that for any spanning tree \(T\),

$$\begin{aligned} \prod _{k=1}^{n-1}\Delta _k(X)\le \frac{1}{n^2} \Big (\frac{4 \pi ^2}{3} \Big )^{n-1} \overline{{\phi }}(T). \end{aligned}$$

(Observe incidentally that the FST created by the Gonzalez [11] and Hochbaum–Shmoys [13] process has \(\overline{{\phi }}\) at least \(2^{1-n} \prod _{k=1}^{n-1}\Delta _k(X)\).)

The idea is to recursively decompose \(T\) by cutting an edge; letting the two remaining trees be \(T_1\) (with some \(m\) edges) and \(T_2\) (with \(n-2-m\) edges), we shall upper bound \(\prod _1^{n-1} \Delta _k(T)\) in terms of \(\prod _1^{m} \Delta _k(T_1)\) and \(\prod _1^{n-2-m} \Delta _k(T_2)\). More on this after we show how to pick an edge to cut. Recall: \(\sum _{j \ge 1} 1/j^2 = \pi ^2/6\).

Edge selection Find a diametric path \(\gamma \) of \(T\), i.e., a simple path whose length \(|\gamma |\) equals the diameter \(\Delta (T)\). For appropriate \(\ell \ge 2\) let \(u_1,\ldots ,u_\ell \) be the weights of the edges of \(\gamma \) in the order they appear on the path. Select the \(j\)th edge on the path, for some \(1 \le j \le \ell \) for which \(u_j/|\gamma | > 1/(2(\pi ^2/6)\min \{j,\ell +1-j\}^2)\). Such an edge exists, as otherwise \(\sum _1^\ell u_j \le (6/\pi ^2) |\gamma | \sum _1^\ell j^{-2} < |\gamma |\). Without loss of generality \(j \le \ell +1-j\) (otherwise flip the indexing on \(\gamma \)); hence cutting \(u_j\) contributes overhead \(|\gamma | / u_{j} < 2(\pi ^2/6)j^2\) to the product \(\prod _1^{n-1} \Delta _k\), and yields subtrees \(T_1\) and \(T_2\), each containing at least \(j-1\) edges.
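The selection rule is easy to state in code. The sketch below (ours) scans the path for the first index meeting the threshold, which the averaging argument just given guarantees to exist.

```python
import math

def select_cut_edge(u):
    """u: weights u_1,...,u_l along a diametric path (0-indexed list internally).
    Returns the 1-indexed j with u_j/|gamma| > 1/(2*(pi^2/6)*min(j, l+1-j)^2)."""
    total, l = sum(u), len(u)
    for j in range(1, l + 1):
        if u[j - 1] / total > 1.0 / (2 * (math.pi ** 2 / 6) * min(j, l + 1 - j) ** 2):
            return j
    raise AssertionError("unreachable: the averaging argument guarantees such a j")
```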

Think of this recursive process as successively breaking the spanning tree into a finer and finer forest. Note that we haven’t yet specified which tree of the forest is cut, but we have specified which edge in that tree is cut. The order in which trees are chosen to be cut is: \(F_k(T)\) (which has \(k\) components) is defined by (a) \(F_1(T)=T\); (b) for \(1<k<n\), \(F_k(T)\) is obtained from \(F_{k-1}(T)\) by cutting an edge in the tree of greatest diameter.

Note that by definition \(\Delta _k(X)\le \Delta (F_k(T))\).

Induction Now we show that

$$\begin{aligned} \prod _1^{n-1} \Delta (F_k(T)) \le \frac{1}{n^2} \Big (\frac{4 \pi ^2}{3} \Big )^{n-1} \overline{{\phi }}(T) . \end{aligned}$$

It will be convenient to do this by an induction showing that there are constants \(c_1,c_2>0\) such that

$$\begin{aligned} \prod _1^{n-1} \Delta (F_k(T)) \le e^{c_1(n-1)-c_2\log n} \overline{{\phi }}(T), \end{aligned}$$

and finally justify the choices \(c_1=\log (4 \pi ^2/3)\) and \(c_2=2\). As to base cases, \(n=1\) is trivial, and \(n=2\) is assured for any \(c_1 \ge c_2\log 2\), a condition our final choice satisfies.

For \(n>2\) let the children of \(T\) be \(T_1\) and \(T_2\), that is to say, \(F_2(T)=\{T_1,T_2\}\). Let \(m\) and \(n-2-m\) be the numbers of edges in \(T_1\) and \(T_2\) respectively. Observe that with \(j\) as defined above, \(\min \{m,n-2-m\} \ge j-1 \ge 0\).

Examine three sequences of forests: the \(T\) sequence, \(F_1(T),\ldots ,F_{n-1}(T)\); the \(T_1\) sequence, \(F_1(T_1),\ldots ,F_{m}(T_1)\); the \(T_2\) sequence, \(F_1(T_2),\ldots ,F_{n-2-m}(T_2)\).

As indicated earlier, in each forest \(f\) in the \(T\) sequence other than \(F_1(T)\), choose a component \(t\) of greatest diameter, i.e., one for which \(\Delta (t)=\Delta (f)\). (In case of ties some consistent choice must be made within the \(T,T_1\) and \(T_2\) sequences.)

If \(t\) lies within \(T_1\), assign \(f\) to the forest in the \(T_1\) sequence that agrees with \(f\) within \(T_1\). Similarly if \(t\) lies within \(T_2\), assign \(f\) to the appropriate forest in the \(T_2\) sequence. Due to the process defining the forests \(F_k(T)\), this assignment is injective. Moreover, a forest in the \(T\) sequence, and the forest it is assigned to in the \(T_1\) or \(T_2\) sequence, share a common diameter. Hence

$$\begin{aligned} \prod _2^{n-1} \Delta (F_k(T)) = \Big (\prod _1^{m} \Delta (F_k(T_1))\Big ) \Big (\prod _1^{n-2-m} \Delta (F_k(T_2))\Big ). \end{aligned}$$

Therefore

$$\begin{aligned} \prod _1^{n-1} \Delta (F_k(T))&= \Delta (T)\cdot \prod _2^{n-1} \Delta (F_k(T)) \\&= \Delta (T)\cdot \Big (\prod _1^{m} \Delta (F_k(T_1))\Big ) \Big (\prod _1^{n-2-m} \Delta (F_k(T_2))\Big ). \end{aligned}$$

Now by induction

$$\begin{aligned} \prod _1^{n-1} \Delta (F_k(T)) \le \Delta (T)\cdot e^{c_1 m-c_2\log (m+1)} \cdot \overline{{\phi }}(T_1) \cdot e^{c_1(n-2-m)-c_2\log (n-1-m)} \cdot \overline{{\phi }}(T_2). \end{aligned}$$

As \(\overline{{\phi }}(T) = u_j \cdot \overline{{\phi }}(T_1) \overline{{\phi }}(T_2)\) we get

$$\begin{aligned}&\frac{ \prod _1^{n-1} \Delta (F_k(T)) }{\overline{{\phi }}(T)} \\&\quad \le (\Delta (T)/u_{j})\cdot \exp \big \{c_1(n-2)-c_2(\log (m+1) + \log (n-1-m))\big \} \\&\quad \le \exp \big \{\log (2(\pi ^2/6)j^2) + c_1(n-2)-c_2(\log (m+1) + \log (n-1-m))\big \}\\&\quad \le \exp \big \{\log (\pi ^2 j^2/3) + c_1(n-2)-c_2(\log j + \log (n/2))\big \}\\&\quad \le \exp \big \{c_1(n-1) -c_2 \log n -(c_2-2) \log j -(c_1- c_2 \log 2 - \log (\pi ^2 /3)) \big \} \end{aligned}$$

Choose \(c_2 \ge 2\) to take care of the third term in the exponent, and choose \(c_1 \ge \log (\pi ^2/3) + c_2 \log 2 \) to take care of the fourth term in the exponent. (In the bound displayed above, both of these choices have been made with equality.) So

$$\begin{aligned} \cdots \le \exp \left\{ c_1(n-1)-c_2 \log n \right\} . \end{aligned}$$

\(\square \)

2.1 Online Metric Steiner Tree

Here we prove Corollary 1. Recall that in the online metric Steiner tree problem we are given a complete weighted graph \(G=(V,E,w)\), with \(d_G\) the shortest path metric on \(G\) with respect to the weights; the cost of each edge is the logarithm of its weight (we shall assume all weights are at least \(2\), so the cost of every edge is at least \(1\)). Given a sequence \(v_1,\dots ,v_n\) of vertices from \(V\), we must output at every step \(1\le i\le n\) a subgraph \(C_i\) such that \(v_1,\dots ,v_i\) are connected in \(C_i\), and such that \(C_{i-1}\subseteq C_i\) for all \(i\). The cost of the subgraph \(C_i\) is the sum of the costs of the edges in \(C_i\). The greedy algorithm does the following: at every step \(i\ge 2\), add the edge \(\{v_i,v_j\}\) where \(v_j\in \{v_1,\dots ,v_{i-1}\}\) is closest to \(v_i\) among the previous vertices.

First we lower bound the cost of the optimal (offline) algorithm. For each \(i\), the contribution of connecting \(v_i\) to the minimum Steiner tree is at least \(\frac{1}{2}d_G(v_i,V\setminus \{v_i\})\). Since the weights are at least \(2\), and \(ab\ge a+b\) whenever \(a,b\ge 2\), for any path \(u_1,\dots ,u_k\) of length \(\ell \) we have that

$$\begin{aligned} \sum _{j=2}^k\log (d_G(u_{j-1},u_j))\ge \log \Big (\sum _{j=2}^kd_G(u_{j-1},u_j)\Big )\ge \log \ell . \end{aligned}$$

This implies that the cost of connecting each \(v_i\) is at least \(\log \big (\frac{1}{2}d_G(v_i,V\setminus \{v_i\})\big )\), and thus

$$\begin{aligned} \mathrm{cost}(OPT)\ge \sum _{i=2}^n \log \big (\frac{1}{2}d_G(v_i,V\setminus \{v_i\})\big )=\log \Big (\frac{1}{2^{n-1}}\prod _{i=2}^nd_G(v_i,V\setminus \{v_i\})\Big ). \end{aligned}$$

Next we upper bound the cost of the greedy algorithm, which is

$$\begin{aligned} \mathrm{cost}(ALG)\le \sum _{i=2}^n\log \big (d_G(v_i,\{v_1,\dots v_{i-1}\})\big )=\log \Big (\prod _{i=2}^nd_G(v_i,\{v_1,\dots v_{i-1}\})\Big ). \end{aligned}$$

Using Theorem 2 we have that

$$\begin{aligned} \mathrm{cost}(ALG)\le \mathrm{cost}(OPT)+O(n), \end{aligned}$$

and as \(\mathrm{cost}(OPT)\ge n-1\), the greedy algorithm is also an \(O(1)\) multiplicative approximation algorithm.

3 Volume Preserving Embeddings

In this section we prove Theorem 3. The construction is based on the embedding of [3], which gave a general framework for embedding metrics into normed spaces. It was shown in [3] that for every metric space \((X,d)\) on \(n\) points there exists a distribution over maps \(f:X\rightarrow \mathbb {R}\) with the following properties: every map in the support has expansion \(O(\log n)\), and for every pair of points \(x,y\in X\), with probability \(1/2\) the map \(f\) does not contract \(x,y\). Moreover, it was shown that the distortion is scaling: for every \(0<\varepsilon <1\), at least a \(1-\varepsilon \) fraction of the pairs of \(X\) have expansion only \(O(\log (1/\varepsilon ))\). Using this, one can construct an embedding of \(X\) into \(\mathbb {R}^{O(\log n)}\) by taking \(O(\log n)\) independent copies of \(f\) and applying concentration bounds. Having such a scaling distortion implies \(O(1)\) average distortion and an \(O(q)\) bound on the \(\ell _q\)-distortion.

Here we extend this framework to embeddings that preserve the volume of subsets of \(X\) of cardinality \(k\). First we strengthen the analysis of the line embedding of [3], so that the bound on the contraction of the embedding holds for any \(x\in X\) and any affine combination of the images of points in a subset \(S\subset X\) (with constant probability). We then define an appropriate analogous notion of scaling distortion for sets of size \(k\), and show that taking \(O(k\log n)\) independent copies of the random line embedding yields an embedding with the appropriate bounds on the worst-case, average, and, in general, the \((k-1)\)-dimensional \(\ell _q\)-distortion.

3.1 The Embedding

The following is a variation on a lemma from [3], where the bound on the contraction is strengthened to hold for subsets \(S=\{s_0,\dots ,s_{k-1}\}\), rather than just for pairs. More precisely, instead of simply lower bounding the distance between the images of two points of \(X\), we need to lower bound the distance of the image of some point \(s_i\) from any affine combination of the images of \(s_0,\dots ,s_{i-1}\) (conditioned on the values of these images). We can only prove this for a very specific ordering of the points in \(S\), so from now on we shall enforce an ordering on every subset \(S\subseteq X\) that complies with the requirements of the lemma.

Lemma 4

There exists a universal constant \(\hat{C}\) such that for every finite metric space \((X,d)\) on \(n\) points, there exists a distribution \({\mathcal D}\) over functions \(f:X \rightarrow \mathbb {R}\) such that the following holds.

  • For all \(u,v \in X\) and all \(f \in \mathrm{supp}({\mathcal D})\),

    $$\begin{aligned} | f(u) - f(v)| \le \hat{C} \cdot \log \Big ( \frac{n}{|B(u,d(u,v))|} \Big ) \cdot d(u,v). \end{aligned}$$
  • For every subset \(S\subseteq X\) of size \(k\), there exists an ordering \(S=(s_0,\dots ,s_{k-1})\), such that for any \(1\le i\le k-1\), values \(x_0,\dots ,x_{i-1}\in \mathbb {R}\) and coefficients \(\alpha _0,\dots ,\alpha _{i-1}\in \mathbb {R}\) with \(\sum _{j=0}^{i-1}\alpha _j=1\):

    $$\begin{aligned} \mathop {\Pr }\limits _{f\sim {\mathcal D}} \Big [ \Big | f(s_i) \!-\! \sum _{j=0}^{i-1}\alpha _jx_j\Big | \!\ge \! d(s_i,\{s_0,\dots ,s_{i-1}\})/\hat{C}\mid f(s_j) \!=\!x_j \forall ~0\!\le \! j\!\le \! i\!-\!1\Big ] \!\ge \! 1/2. \end{aligned}$$

Let \(D=c\cdot k\ln n\) where \(c\) is a constant to be determined later. Define the embedding \(g:X\rightarrow \mathbb {R}^D\) by

$$\begin{aligned} g=\frac{4\hat{C}}{\sqrt{D}}\bigoplus _{t=1}^Df_t, \end{aligned}$$

where each \(f_t\) is sampled independently according to Lemma 4.
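Concretely, \(g\) is a scaled concatenation of independent one-dimensional coordinates. In the sketch below (ours), sample_f is an assumed sampler for the distribution \({\mathcal D}\) of Lemma 4, and the points of \(X\) are assumed hashable.

```python
import math
import numpy as np

def build_g(X, sample_f, C_hat, c, k):
    """Concatenate D = c*k*ln(n) independent Lemma-4 maps, scaled by 4*C_hat/sqrt(D)."""
    n = len(X)
    D = max(1, round(c * k * math.log(n)))
    fs = [sample_f() for _ in range(D)]                 # independent coordinates f_t
    scale = 4.0 * C_hat / math.sqrt(D)
    return {x: scale * np.array([f(x) for f in fs]) for x in X}
```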

Next, we generalize the notion of scaling distortion to subsets of size \(k\). To this end, for each subset \(S\in {X\atopwithdelims ()k}\) with its ordering \(S=(s_0,\dots ,s_{k-1})\) (the ordering enforced by Lemma 4), define a sequence \((\varepsilon _1,\dots ,\varepsilon _{k-1})\) as follows. For each \(1\le i\le k-1\) let \(0\le j(i)<i\) be such that \(d(s_i,\{s_0,\dots ,s_{i-1}\})=d(s_i,s_{j(i)})\). Let \(\varepsilon _i\) be the value such that \(|B(s_{j(i)},d(s_i,s_{j(i)}))|=\varepsilon _in\). In other words, \(s_i\) is the \(\varepsilon _in\)-th nearest neighbor of the closest point to it in \(\{s_0,\dots ,s_{i-1}\}\).
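In code, the sequence \((\varepsilon _1,\dots ,\varepsilon _{k-1})\) can be read off a distance matrix as follows (our sketch; D is the full n-by-n distance matrix, S is the ordered index list \((s_0,\dots ,s_{k-1})\) supplied by Lemma 4, and closed balls are assumed in the counting):

```python
def epsilons(D, S):
    """Return (eps_1,...,eps_{k-1}) for the ordered subset S of a metric given by D."""
    n = len(D)
    out = []
    for i in range(1, len(S)):
        j = min(range(i), key=lambda t: D[S[i]][S[t]])       # index of s_{j(i)}
        r = D[S[i]][S[j]]                                     # d(s_i, {s_0,...,s_{i-1}})
        ball = sum(1 for u in range(n) if D[S[j]][u] <= r)    # |B(s_{j(i)}, r)|
        out.append(ball / n)                                  # eps_i
    return out
```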

Lemma 5

For any embedding \(g\) in the support of the distribution and any \(S\in {X\atopwithdelims ()k}\),

$$\begin{aligned} \frac{{\phi }_\mathrm{E}(g(S))}{{\phi }_\mathrm{F}(S)}\le \prod _{i=1}^{k-1}O(\log (1/\varepsilon _i))~. \end{aligned}$$

Proof

Fix any \(1\le i\le k-1\), and let \(0\le j(i)\le i-1\) be such that \(d(s_i,s_{j(i)})=d(s_i,\{s_0,\dots ,s_{i-1}\})\). By the definition of \(\varepsilon _i\), \(|B(s_{j(i)},d(s_i,s_{j(i)}))|=\varepsilon _in\). Using the first property of Lemma 4 we have for any \(t\in [D]\),

$$\begin{aligned} |f_t(s_i)-f_t(s_{j(i)})|&\le \hat{C}\cdot \log \Big (\frac{n}{|B(s_{j(i)},d(s_i,s_{j(i)}))|} \Big )\cdot d(s_i,s_{j(i)})\\&= \hat{C}\cdot \log (1/\varepsilon _i)\cdot d(s_i,s_{j(i)}), \end{aligned}$$

thus also

$$\begin{aligned} d_{E}(g(s_i),g(s_{j(i)}))&\le \Big (\frac{(4\hat{C})^2}{D}\sum _{t=1}^D(\hat{C}\cdot \log (1/\varepsilon _i)\cdot d(s_i,s_{j(i)}))^2\Big )^{1/2}\nonumber \\&\le 4\hat{C}^2\log (1/\varepsilon _i)\cdot d(s_i,s_{j(i)}). \end{aligned}$$
(2)

Now Theorem 2 implies that

$$\begin{aligned} \prod _{i=1}^{k-1}d(s_i,s_{j(i)})=\prod _{i=1}^{k-1}d(s_i,\{s_0,\dots ,s_{i-1}\})\le C^{k-1}(k-1)!\cdot {\phi }_\mathrm{{F}}(S), \end{aligned}$$

and we conclude the proof by

$$\begin{aligned} {\phi }_\mathrm{E}(g(S))&= \frac{1}{(k-1)!}\prod _{i=1}^{k-1}d_\mathrm{E}(g(s_i), \mathrm{affspan}(g(s_0),\dots ,g(s_{i-1})))\\&\le \frac{1}{(k-1)!}\prod _{i=1}^{k-1}d_\mathrm{E}(g(s_i),g(s_{j(i)}))\\&\le \frac{1}{(k-1)!}\prod _{i=1}^{k-1}4\hat{C}^2\log (1/\varepsilon _i)\cdot d(s_i,s_{j(i)})\\&\le \prod _{i=1}^{k-1}O\left( \log (1/\varepsilon _i)\right) \cdot {\phi }_\mathrm{F}(S). \end{aligned}$$

\(\square \)

Lemma 6

With probability at least \(1-1/n^k\), the embedding \(g\) is \((k-1)\)-dimensional non-contractive.

Proof

Fix some \(S=(s_0,\dots ,s_{k-1})\) and \(1\le i\le k-1\). Let \(\delta _i=d(s_i,\{s_0,\dots ,s_{i-1}\})\) and \(A_i=\mathrm{affspan}(g(s_0),\ldots ,g(s_{i-1}))\). We would like to give a lower bound on the distance from \(g(s_i)\) to \(A_i\) in terms of \(\delta _i\). The main difficulty is that the nearest point to \(g(s_i)\) in \(A_i\) naturally depends on the value of \(g(s_i)\), thus we cannot use the second property of Lemma 4 directly on the nearest point (we may condition only on \(g(s_j)\) for \(j<i\)). The solution is as follows: rather than showing a lower bound on the distance from \(g(s_i)\) to the closest point in \(A_i\), we will show a lower bound on the distance from \(g(s_i)\) to all the points in a suitable net of \(A_i\).

To this end, let \(0\le j\le i-1\) be such that \(\delta _i=d(s_i,s_j)\), and let \(N_i\) be a \(\delta _i\)-net of \(B(g(s_j),8\hat{C}^2\delta _i\log n)\cap A_i\). Since \(A_i\) is an \((i-1)\)-dimensional affine space, a ball of radius \(2r\) in \(A_i\) can be covered by \(2^{O(i)}\) balls of radius \(r\). Applying this covering repeatedly, we conclude that \(B(g(s_j),8\hat{C}^2\delta _i\log n)\cap A_i\) can be covered by \(2^{O(i\log \log n)}\) balls of radius \(\delta _i/2\); as each such ball contains at most one net point, it follows that \(|N_i|=2^{O(i\log \log n)}< n^k\) (for sufficiently large \(n\)).

Now, if \(b_i\in A_i\) is the closest point to \(g(s_i)\), then using (2) and the fact that \(\varepsilon _i\ge 1/n\),

$$\begin{aligned} d_\mathrm{E}(g(s_j),b_i)&\le d_\mathrm{E}(g(s_i),g(s_j))+d_\mathrm{E}(g(s_i),b_i)\\&\le 2d_\mathrm{E}(g(s_i),g(s_j))\\&\le 2(4\hat{C}^2)\log n\cdot d(s_i,s_j)\\&= 8\hat{C}^2\log n\cdot \delta _i. \end{aligned}$$

This shows that indeed \(b_i\in B(g(s_j),8\hat{C}^2\delta _i\log n)\), so there exists \(a'_i\in N_i\) with

$$\begin{aligned} d_\mathrm{E}(a'_i,b_i)\le \delta _i. \end{aligned}$$
(3)

Next we prove that there is a high probability that \(g(s_i)\) is sufficiently far from all net points. Let \(a_i\in N_i\) be an arbitrary point of the net, and let \(\alpha _0,\dots ,\alpha _{i-1}\) be such that \(\sum _{j=0}^{i-1}\alpha _j=1\) and \(a_i=\sum _{j=0}^{i-1}\alpha _jg(s_j)\). Observe that

$$\begin{aligned} d_\mathrm{E}(g(s_i),a_i)^2=\frac{(4\hat{C})^2}{D}\sum _{t=1}^D \Big (f_t(s_i)-\sum _{j=0}^{i-1}\alpha _jf_t(s_j)\Big )^2 \end{aligned}$$

For each \(t\in [D]\) let \(Z_t\) be an indicator random variable for the event \(|f_t(s_i)-\sum _{j=0}^{i-1}\alpha _jf_t(s_j)|\ge \delta _i/\hat{C}\), and let \(Z=Z(S,i,a_i)=\sum _{t=1}^DZ_t\). Observe that if it is the case that \(Z\ge D/4\), then

$$\begin{aligned} d_\mathrm{E}(g(s_i),a_i)&\ge \Big (\frac{(4\hat{C})^2}{D}\sum _{t~:~ Z_t=1}\Big (f_t(s_i)-\sum _{j=0}^{i-1}\alpha _jf_t(s_j)\Big )^2\Big )^{1/2}\nonumber \\&\ge \Big (\frac{(4\hat{C})^2}{D}\cdot \frac{D}{4}(\delta _i/\hat{C})^2\Big )^{1/2}\nonumber \\&= 2\delta _i. \end{aligned}$$
(4)

By the triangle inequality, (3), and (4) applied to \(a'_i\),

$$\begin{aligned} d_\mathrm{E}(g(s_i),b_i)\ge d_\mathrm{E}(g(s_i),a'_i)-d_\mathrm{E}(a'_i,b_i)\ge 2\delta _i-\delta _i= \delta _i~. \end{aligned}$$

We conclude that

$$\begin{aligned} {\phi }_\mathrm{E}(g(S))&= \frac{1}{(k-1)!}\prod _{i=1}^{k-1}d_\mathrm{E}(g(s_i),b_i)\\&\ge \frac{1}{(k-1)!}\prod _{i=1}^{k-1}\delta _i\\&\ge {\phi }_\mathrm{F}(S). \end{aligned}$$

It remains to show that with probability at least \(1-1/n^k\), none of the bad events \(\{Z(S,i,a_i)<D/4\}_{S,i,a_i}\) happens. For a given \(Z\), by Lemma 4 we have \(\Pr [Z_t=1\mid g(s_0),\dots ,g(s_{i-1})]\ge 1/2\) (because the different coordinates are chosen independently, so for each \(Z_t\), conditioning on \(g\) is the same as conditioning just on \(f_t\)), so that \({\mathbb {E}}[Z]\ge D/2\). The crucial observation is that in the definition of the bad events we fixed only \(g(s_0),\dots ,g(s_{i-1})\) (to determine \(A_i\) and the net), but not \(g(s_i)\). Using a standard Chernoff bound,

$$\begin{aligned} \Pr [Z<D/4]\le \Pr [Z<{\mathbb {E}}[Z]/2]\le e^{-{\mathbb {E}}[Z]/8}\le e^{-D/16}\le n^{-3k}, \end{aligned}$$

where the last inequality holds when \(c=48\), say. Applying the union bound over all \({n\atopwithdelims ()k}\) possible sets \(S\), all \(k\) possible indices \(i\), and all the points \(a_i\in N_i\) (recall that \(|N_i|<n^k\)), we get that some bad event happens with probability at most \(k\cdot n^k\cdot {n\atopwithdelims ()k}/n^{3k}\le 1/n^k\). \(\square \)

By Lemmas 5 and 6, with high probability we obtain an embedding \(g:X\rightarrow \mathbb {R}^D\) such that for every subset \(S\in {X\atopwithdelims ()k}\) with its sequence \((\varepsilon _1,\dots ,\varepsilon _{k-1})\),

$$\begin{aligned} \mathrm{dist}_g(S)\le O\Big (\Big (\prod _{i=1}^{k-1}\log (1/\varepsilon _i)\Big )^{1/(k-1)}\Big ). \end{aligned}$$
(5)

In the following sections we analyze the \(\ell _q\) volume distortion of such an embedding. For simplicity we treat the \(\ell _\infty \) and \(\ell _1\) volume distortions first, before handling the general \((k-1)\)-dimensional \(\ell _q\)-distortion.

3.1.1 Bounding the \((k-1)\)-dimensional distortion

Lemma 7

The worst-case \((k-1)\)-dimensional distortion of \(g\) is \(O(\log n)\), i.e., \(\mathrm{dist}_\infty ^{(k-1)}(g) = O(\log n)\).

Proof

For any set \(S\in {X\atopwithdelims ()k}\) and \(i\in [k-1]\), \(\varepsilon _i\ge 1/n\). So by (5)

$$\begin{aligned} \mathrm{dist}_g(S)\le O\Big (\Big (\prod _{i=1}^{k-1}\log (1/\varepsilon _i)\Big )^{1/(k-1)}\Big )\le O(\log n). \end{aligned}$$

\(\square \)

3.1.2 Bounding the average \((k-1)\)-dimensional distortion

Lemma 8

The average \((k-1)\)-dimensional distortion of \(g\) is \(O(\log k)\), i.e., \(\mathrm{dist}_1^{(k-1)}(g) = O(\log k)\).

Proof

For every set \(S\in {X\atopwithdelims ()k}\) let \(m=m(S)=\min _i\{\varepsilon _in\}\). By (5) there is a universal constant \(C'\) such that the average distortion over all possible \(S\in {X\atopwithdelims ()k}\) can be bounded as follows:

$$\begin{aligned} \frac{\mathrm{dist}_1^{(k-1)}(g)}{C'}&\le \mathbb {E}_{S\in {X\atopwithdelims ()k}}\Big [\Big (\prod _{i=1}^{k-1}\log (1/\varepsilon _i)\Big )^{1/(k-1)}\Big ]\\&\le \mathbb {E}\Big [\Big (\prod _{i=1}^{k-1}\log (n/m)\Big )^{1/(k-1)}\Big ]\\&= \mathbb {E}\left[ \log (n/m)\right] . \end{aligned}$$

In what follows we bound \(\mathbb {E}\left[ \log (n/m)\right] \). First we show that for every \(i\) and every \(t\in [n]\), we have \(\Pr [\varepsilon _i=t/n]\le 2k/n\) (recall that \(g\) is fixed, and the probability is over a uniform choice of the subset \(S\)). This is because, conditioned on any \(s_0,\dots ,s_{i-1}\), for every \(0\le j\le i-1\) the probability that \(s_i\) is the \(t\)-th nearest neighbor of \(s_j\) is at most \(1/(n-i)\); by the union bound, the probability that there exists such a \(j\) is at most \(i/(n-i)\le k/(n-k)\le 2k/n\) (assuming \(k<n/2\), as otherwise the lemma is trivial). It follows by the union bound that for all \(t\in [n]\),

$$\begin{aligned} \Pr [m=t]\le \sum _{i=1}^{k-1}\Pr [\varepsilon _i=t/n]\le 2k^2/n. \end{aligned}$$

Let \(h = \lceil \frac{n}{k^2} \rceil \), so that \(\Pr [m=t]\le 2/h\). Now,

$$\begin{aligned} \mathbb {E}\left[ \log (n/m)\right]&\le \sum _{t=1}^{h}\Pr [m=t]\cdot \log (n/t)+ \Pr [m>h]\cdot \log (n/h)\\&\le \frac{2}{h} \Big (h\log n - \sum _{t=1}^{h}\log t\Big )+\log (k^2)+2. \end{aligned}$$

Note that \(\sum _{t=1}^{h}\log t = \log (h!)\ge h\log (h/e)\), hence

$$\begin{aligned} \mathbb {E}\left[ \log (n/m)\right] \le 2(\log n-\log (n/(e k^2)))+2\log k+2 = O(\log k). \end{aligned}$$

\(\square \)

3.1.3 Bounding the \((k-1)\)-dimensional \(\ell _q\)-distortion

Here we generalize the bounds on the worst-case and average \((k-1)\)-dimensional distortion to the \(\ell _q\) norm of the \((k-1)\)-dimensional distortion for arbitrary \(1\le q\le \infty \), and thus prove Theorem 3. Taking higher norms of the distortion means that we must estimate the probability of a sequence \((\varepsilon _1,\dots ,\varepsilon _{k-1})\) for a random set \(S\) more carefully (unlike the \(q=1\) case, where using only the minimal \(\varepsilon _i\) sufficed).

Lemma 9

For any \(1\le q\le \infty \), \(\mathrm{dist}_q^{(k-1)}(g) = O(\lceil q/(k-1)\rceil \cdot \log k)\).

Proof

For \(\ell \in \{0,1,\dots ,k-1\}\), let \({\mathcal {S}}^{(\ell )}\subseteq {X\atopwithdelims ()k}\) contain all the sets \(S\) that have exactly \(\ell \) values \(\varepsilon _i\) bigger than \(1/k^6\). In what follows we bound \(|{\mathcal {S}}^{(\ell )}|\). There are at most \({k-1\atopwithdelims ()\ell }\le 2^k\) possibilities for choosing the \(\ell \) locations in the sequence \((\varepsilon _1,\dots ,\varepsilon _{k-1})\) whose values are larger than \(1/k^6\); by reordering, assume these are the first \(\ell \) elements of the sequence. How many sets correspond to such sequences? There are at most \({n\atopwithdelims ()\ell +1}\) possibilities for choosing \(s_0\) and the other \(\ell \) points which induce the first \(\ell \) values \(\varepsilon _1,\dots ,\varepsilon _\ell \). As for the other values, observe that for \(i>\ell \) we have \(\varepsilon _i<1/k^6\), which implies that \(s_i\) is one of the \(\varepsilon _in\) nearest neighbors of at least one of the other \(k-1\) points, and thus there are at most \(k\cdot \varepsilon _in\) choices for \(s_i\). Let \(K_\ell =\{6\log k,6\log k+1,\dots ,\log n\}^{k-\ell -1}\); for ease of notation assume \(K_\ell \) is indexed by integers \(\ell <i<k\) (that is, for \(x\in K_\ell \) denote by \(x_{\ell +1}\) the first element of \(x\), and by \(x_{k-1}\) the last one), and fix some \(x\in K_\ell \). Denote by \({\mathcal {S}}^{(\ell )}_x\) the collection of sets in \({\mathcal {S}}^{(\ell )}\) satisfying \(2^{-x_i}<\varepsilon _i\le 2^{-x_i+1}\) for all \(\ell <i<k\). Then

$$\begin{aligned} |{\mathcal {S}}^{(\ell )}_x|\le {n\atopwithdelims ()\ell +1}\cdot 2^k\cdot \prod _{i=\ell +1}^{k-1}k\cdot n/2^{x_i}. \end{aligned}$$
(6)

Let \(C'\) be the constant in Lemma 5, and let \(m= \lceil q/(k-1) \rceil \). First we use (5) and the monotonicity of normalized \(\ell _q\) norms (in \(q\)) to argue that

$$\begin{aligned} \frac{\mathrm{dist}_q^{(k-1)}(g)}{C'}&\le \mathbb {E}_{S\in {X\atopwithdelims ()k}}\Big [\Big (\prod _{i=1}^{k-1}\log (1/\varepsilon _i)\Big )^{q/(k-1)}\Big ]^{1/q}\nonumber \\&\le \mathbb {E}_{S\in {X\atopwithdelims ()k}}\Big [\Big (\prod _{i=1}^{k-1}\log (1/\varepsilon _i)\Big )^m\Big ]^{\frac{1}{m(k-1)}}\nonumber \\&=\Big [{n\atopwithdelims ()k}^{-1}\sum _{\ell =0}^{k-1}\sum _{S\in {\mathcal {S}}^{(\ell )}}\Big (\prod _{i=1}^{k-1}\log (1/\varepsilon _i)\Big )^{m}\Big ]^{\frac{1}{m(k-1)}}\nonumber \\&\le \Big [{n\atopwithdelims ()k}^{-1}\sum _{\ell =0}^{k-1}\sum _{x\in K_\ell }\sum _{S\in {\mathcal {S}}^{(\ell )}_x}\Big ((6\log k)^\ell \prod _{i=\ell +1}^{k-1}\log (1/\varepsilon _i)\Big )^{m}\Big ]^{\frac{1}{m(k-1)}}\nonumber \\&\le \Big [{n\atopwithdelims ()k}^{-1}\sum _{\ell =0}^{k-1}(6\log k)^{\ell m}\sum _{x\in K_\ell }\sum _{S\in {\mathcal {S}}^{(\ell )}_x}\Big (\prod _{i=\ell +1}^{k-1}x_i\Big )^{m}\Big ]^{\frac{1}{m(k-1)}}\nonumber \\&\mathop {\le }\limits ^{(6)}\Big [{n\atopwithdelims ()k}^{-1}\sum _{\ell =0}^{k-1}(6\log k)^{\ell m}\sum _{x\in K_\ell }{n\atopwithdelims ()\ell +1} \cdot 2^k\cdot \prod _{i=\ell +1}^{k-1}\big (k\cdot n/2^{x_i}\big )\prod _{i=\ell +1}^{k-1}x_i^m\Big ]^{\frac{1}{m(k-1)}}\nonumber \\&=\Big [{n\atopwithdelims ()k}^{-1} \cdot 2^k\sum _{\ell =0}^{k-1}{n\atopwithdelims ()\ell +1}(6\log k)^{\ell m} \cdot (kn)^{k-\ell -1}\sum _{x\in K_\ell }\prod _{i=\ell +1}^{k-1}x_i^m/2^{x_i}\Big ]^{\frac{1}{m(k-1)}}.\nonumber \\ \end{aligned}$$
(7)

Next, we focus on the expression \(\sum _{x\in K_\ell }\prod _{i=\ell +1}^{k-1}x_i^m/2^{x_i}\). (For \(\ell =k-1\) this expression is trivially \(1\), so assume \(\ell <k-1\).) Recall that \(K_\ell \) consists of \((k-\ell -1)\)-tuples of elements of \(\{6\log k,\dots ,\log n\}\), so that each \(x_i\ge 6\log k\), and we may bound \(1/2^{x_i}\le 1/2^{3\log k}\cdot 1/2^{x_i/2}=1/k^3\cdot 1/2^{x_i/2}\). Now, rather than a summation of products, we will take a product of summations. That is, we write

$$\begin{aligned} \sum _{x\in K_\ell }\prod _{i=\ell +1}^{k-1}x_i^m/2^{x_i}&\le \frac{1}{k^3}\sum _{x\in K_\ell }\prod _{i=\ell +1}^{k-1}x_i^m/2^{x_i/2}\nonumber \\&= \frac{1}{k^3}\prod _{i=\ell +1}^{k-1}\Big (\sum _{z=6\log k}^{\log n}z^m/2^{z/2}\Big )\nonumber \\&\le \Big (\sum _{z=6\log k}^{\log n}z^m/2^{z/2}\Big )^{k-\ell -1} \end{aligned}$$
(8)

where the equality holds because expanding the product of sums yields exactly one term for each tuple \(x\in K_\ell \), i.e., for each sequence of \(k-\ell -1\) choices of \(z\). Next we bound the summation, with the variable change \(y=z-6\log k\):

$$\begin{aligned} \sum _{z=6\log k}^{\log n}z^m/2^{z/2}&\le \sum _{z=6\log k}^{\infty }z^m/2^{z/2}\\&= \sum _{y=0}^{\infty }(y+6\log k)^m/2^{(y+6\log k)/2}\\&\le \frac{1}{k^3}\sum _{y=0}^{\infty }2^m\big (y^m+(6\log k)^m\big )/2^{y/2}\\&\le \frac{2^{m+2}(6\log k)^m}{k^3}+\frac{2^m}{k^3}\sum _{y=0}^{\infty }y^m/2^{y/2}, \end{aligned}$$

where the middle inequality uses \((y+a)^m\le 2^m(y^m+a^m)\) together with \(2^{(y+6\log k)/2}=2^{y/2}\cdot k^3\), and the last inequality uses \(\sum _{y\ge 0}2^{-y/2}\le 4\). We replace the sum by an integral and calculate

$$\begin{aligned} \sum _{y=0}^{\infty }y^m/2^{y/2}\le \sqrt{2}\int \limits _0^\infty y^m/2^{y/2}dy\le (16m)^m, \end{aligned}$$
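For completeness (this verification is ours, not in the original text), the integral evaluates exactly via the Gamma function:

$$\begin{aligned} \sqrt{2}\int \limits _0^\infty y^m/2^{y/2}\,dy=\sqrt{2}\int \limits _0^\infty y^m e^{-(\ln 2/2)y}\,dy=\frac{\sqrt{2}\,m!}{(\ln 2/2)^{m+1}}\le \sqrt{2}\cdot \frac{2}{\ln 2}\Big (\frac{2m}{\ln 2}\Big )^m\le (16m)^m, \end{aligned}$$

using \(m!\le m^m\), \(2/\ln 2<2.9\), and \(\sqrt{2}\cdot 2.9\cdot (2.9)^m\le 16^m\) for all \(m\ge 1\).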

which yields the following bound on (8)

$$\begin{aligned} \Big (\sum _{z=6\log k}^{\log n}z^m/2^{z/2}\Big )^{k-\ell -1}\le \Big (\frac{2^{m+2}(6\log k)^m}{k^3}+\frac{2^m(16m)^m}{k^3}\Big )^{k-\ell -1} \end{aligned}$$
(9)

Plugging (9) into (7) we get that

$$\begin{aligned}&\frac{\mathrm{dist}_q^{(k-1)}(g)}{C'}\\&\quad \le \Big [{n\atopwithdelims ()k}^{-1} 2^k\sum _{\ell =0}^{k-1}{n\atopwithdelims ()\ell +1}(6\log k)^{\ell m} \cdot (kn)^{k-\ell -1}\Big (\frac{2^{m+2}(6\log k)^m}{k^3}+\frac{2^m(16m)^m}{k^3}\Big )^{k-\ell -1}\Big ]^{\frac{1}{m(k-1)}}\\&\quad \le \Big [{n\atopwithdelims ()k}^{-1} 2^k\sum _{\ell =0}^{k-1}{n\atopwithdelims ()\ell +1}(n/k^2)^{k-\ell -1}\Big ((100\log k)^{m(k-1)}\cdot (100m)^{m(k-1)}\Big )\Big ]^{\frac{1}{m(k-1)}}\\&\quad \le \Big [k\cdot 2^k\Big ((100\log k)^{m(k-1)}\cdot (100m)^{m(k-1)}\Big )\Big ]^{\frac{1}{m(k-1)}}, \end{aligned}$$

where the last inequality uses that, for \(k\ge 2\) and every \(0\le \ell \le k-1\), \({n\atopwithdelims ()\ell +1} (n/k^2)^{k-\ell -1} \le {n\atopwithdelims ()k}\). Note that the above expression is at most \(O(m\log k) = O(\lceil q/(k-1)\rceil \cdot \log k)\), as required. \(\square \)