Reference Work Entry

Encyclopedia of Algorithms

pp 1-99

# Approximating Metric Spaces by Tree Metrics

1996; Bartal, Fakcharoenphol, Rao, Talwar 2004; Bartal, Fakcharoenphol, Rao, Talwar
• Jittat Fakcharoenphol, Department of Computer Engineering, Kasetsart University
• Satish Rao, Computer Science Division, University of California at Berkeley
• Kunal Talwar, Microsoft Research, Silicon Valley Campus

## Keywords and Synonyms

Embedding general metrics into tree metrics

## Problem Definition

The problem is to construct a random tree metric that probabilistically approximates a given arbitrary metric well. A solution is useful as the first step of numerous approximation algorithms, because problems are usually easier to solve on trees than on general graphs. It also finds applications in on-line and distributed computation.

It is known that tree metrics approximate general metrics badly, e.g., given a cycle $${ C_n }$$ with n nodes, any tree metric approximating this graph metric has distortion $${ \Omega(n) }$$ [17]. However, Karp [15] noticed that a random spanning tree of $${ C_n }$$ approximates the distance between any two nodes in $${ C_n }$$ well in expectation. Alon, Karp, Peleg, and West [1] then proved a bound of $${ \exp(O(\sqrt{\log n\log\log n})) }$$ on the average distortion for approximating any graph metric by one of its spanning trees.

Bartal [2] formally defined the notion of probabilistic approximation.

### Notations

A graph $${ G=(V,E) }$$ with an assignment of non-negative weights to the edges of G defines a metric space $${ (V,d_G) }$$ where for each pair $${ u,v\in V }$$, $${ d_G(u,v) }$$ is the shortest path distance between u and v in G. A metric (V, d) is a tree metric if there exists some tree $${ T=(V^{\prime},E^{\prime}) }$$ such that $${ V\subseteq V^{\prime} }$$ and for all $${ u,v\in V }$$, $${ d_T(u,v)=d(u,v) }$$. The metric (V, d) is also called a metric induced by T.

Given a metric (V, d), a distribution $${ \mathcal{D} }$$ over tree metrics over V α‑probabilistically approximates d if, for every $${ u,v\in V }$$, every tree metric $${ d_T\in\mathcal{D} }$$ satisfies $${ d_T(u,v)\geq d(u,v) }$$, and $$\mathrm{E}_{d_T\in\mathcal{D}}[d_T(u,v)]\leq\alpha\cdot d(u,v)$$. The quantity α is referred to as the distortion of the approximation.
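As a concrete instance of this definition, the following sketch checks an explicitly listed distribution against a metric and applies it to Karp's observation for the cycle: deleting a uniformly random edge of the n-cycle leaves a spanning path. The function names (`distortion`, `cycle_metric`, `path_metric_without_edge`, `karp_distortion`) and the matrix representation of metrics are illustrative assumptions, not part of the original entry.

```python
from fractions import Fraction


def distortion(d, trees):
    """Smallest alpha for which the explicit distribution `trees`
    alpha-probabilistically approximates `d`, or None if some tree
    metric in the support fails to dominate `d`.

    `d` is an n x n distance matrix; `trees` is a list of
    (probability, distance matrix) pairs (hypothetical representation).
    """
    n = len(d)
    alpha = Fraction(0)
    for u in range(n):
        for v in range(u + 1, n):
            # Every tree metric in the support must dominate d.
            if any(t[u][v] < d[u][v] for _, t in trees):
                return None
            expected = sum(p * t[u][v] for p, t in trees)
            alpha = max(alpha, Fraction(expected) / d[u][v])
    return alpha


def cycle_metric(n):
    """Shortest-path metric of the unit-weight cycle on nodes 0..n-1."""
    return [[min(abs(u - v), n - abs(u - v)) for v in range(n)]
            for u in range(n)]


def path_metric_without_edge(n, e):
    """Metric of the spanning path left after deleting edge {e, e+1 mod n}."""
    pos = [(x - e - 1) % n for x in range(n)]  # position of x along the path
    return [[abs(pos[u] - pos[v]) for v in range(n)] for u in range(n)]


def karp_distortion(n):
    """Distortion of the uniform distribution over the n spanning paths."""
    trees = [(Fraction(1, n), path_metric_without_edge(n, e))
             for e in range(n)]
    return distortion(cycle_metric(n), trees)
```

For example, `karp_distortion(8)` returns `Fraction(7, 4)`, and in general the value is 2(n-1)/n, which is below 2. By contrast, any single spanning path stretches the two endpoints of its deleted edge by a factor of n-1.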

Although the definition of probabilistic approximation uses a distribution $${ \mathcal{D} }$$ over tree metrics, one is interested in a procedure that constructs a random tree metric distributed according to $${ \mathcal{D} }$$, i.e., an algorithm that produces a random tree metric that probabilistically approximates a given metric. The problem can be formally stated as follows.

### Problem (APPROX-TREE)

Input: a metric (V, d)

Output: a tree metric $${ (V,d_T) }$$ sampled from a distribution $${ \mathcal{D} }$$ over tree metrics that α‑probabilistically approximates (V, d).

Bartal then defined a class of tree metrics, called hierarchically well‐separated trees (HST), as follows. A k-hierarchically well‐separated tree (k-HST) is a rooted weighted tree satisfying two properties: the edge weight from any node to each of its children is the same, and the edge weights along any path from the root to a leaf decrease by a factor of at least k. These properties are important to many approximation algorithms.
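The two defining properties of a k-HST can be checked mechanically. The sketch below is illustrative only; the function name `is_k_hst` and the adjacency representation (a dictionary mapping each internal node to a list of `(child, edge_weight)` pairs) are assumptions made for this example.

```python
def is_k_hst(children, root, k):
    """Check the two k-HST properties on a rooted weighted tree.

    `children` maps a node to a list of (child, edge_weight) pairs;
    leaves may be absent from the dictionary (hypothetical format).
    """
    stack = [(root, None)]  # (node, weight of the edge from its parent)
    while stack:
        node, parent_weight = stack.pop()
        kids = children.get(node, [])
        if not kids:
            continue
        weights = {w for _, w in kids}
        # Property 1: all edges to the children have the same weight.
        if len(weights) != 1:
            return False
        w = weights.pop()
        # Property 2: weights shrink by a factor of at least k per level.
        if parent_weight is not None and w * k > parent_weight:
            return False
        stack.extend((child, w) for child, _ in kids)
    return True
```

For instance, a tree whose edge weights are 4 at the top level and 2 below passes the check for k = 2 but fails it for k = 3.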

Bartal showed that any metric on n points can be probabilistically approximated by k-HSTs with $${ O(\log^2 n) }$$ distortion, an improvement over the $${ \exp(O(\sqrt{\log n\log\log n})) }$$ bound in [1]. Later Bartal [3], following the same approach as Seymour's analysis of the Feedback Arc Set problem [18], improved the distortion to $${ O(\log n\log\log n) }$$. Using a rounding procedure of Calinescu, Karloff, and Rabani [5], Fakcharoenphol, Rao, and Talwar [9] devised an algorithm that, in expectation, produces a tree with $${ O(\log n) }$$ distortion. This bound is tight up to a constant factor.

## Key Results

A tree metric is closely related to graph decomposition. The randomized rounding procedure of Calinescu, Karloff, and Rabani [5] for the 0‑extension problem decomposes a graph into pieces with bounded diameter, cutting each edge with probability proportional to its length and a ratio between the numbers of nodes at certain distances. Fakcharoenphol, Rao, and Talwar [9] used the CKR rounding procedure to decompose the graph recursively and obtained the following theorem.

### Theorem 1

Given an n-point metric (V, d), there exists a randomized algorithm, which runs in time $${ O(n^2) }$$, that samples a tree metric from the distribution $${ \mathcal{D} }$$ over tree metrics that $${ O(\log n) }$$-probabilistically approximates (V, d). The tree is also a 2-HST.
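The recursive decomposition behind this theorem can be sketched as follows. This is a minimal illustration of the construction of [9] as it is commonly described, under stated assumptions: the metric is rescaled so that the smallest distance is 1, a random order of centers and a random radius multiplier β in [1, 2) with density 1/(x ln 2) are drawn once, and each level halves the cluster radius. The function name `sample_frt_tree_metric`, the label representation, and the choice of edge length (2^(i+1) between levels i+1 and i) are choices made for this sketch. The loop below is a naive implementation and does not achieve the running time stated in the theorem.

```python
import random


def sample_frt_tree_metric(d, rng=random):
    """Sample one dominating tree metric for the n-point metric `d`.

    `d` is a symmetric n x n matrix (n >= 2) with positive off-diagonal
    entries. Returns a function tree_dist(u, v). Illustrative sketch only.
    """
    n = len(d)
    d_min = min(d[u][v] for u in range(n) for v in range(u + 1, n))
    # Rescale so that the smallest distance is exactly 1.
    s = [[d[u][v] / d_min for v in range(n)] for u in range(n)]
    diameter = max(max(row) for row in s)

    # Top level delta: smallest integer >= 1 with 2**delta >= diameter.
    delta = 1
    while 2 ** delta < diameter:
        delta += 1

    pi = list(range(n))
    rng.shuffle(pi)            # random order of candidate centers
    beta = 2 ** rng.random()   # density 1 / (x ln 2) on [1, 2)

    # label[i][v] names the level-i cluster containing v. Level delta is
    # the single cluster V; each lower level refines the one above it.
    label = [None] * (delta + 1)
    label[delta] = [()] * n
    for i in range(delta - 1, -1, -1):
        radius = beta * 2 ** (i - 1)
        label[i] = []
        for v in range(n):
            # First center in the random order at distance < radius;
            # v itself always qualifies, so next() always succeeds.
            center = next(c for c in pi if s[c][v] < radius)
            label[i].append(label[i + 1][v] + (center,))

    def tree_dist(u, v):
        if u == v:
            return 0.0
        # Highest level at which u and v lie in different clusters.
        i = max(j for j in range(delta) if label[j][u] != label[j][v])
        # Each side climbs edges of length 2, 4, ..., 2**(i + 1).
        return (2 ** (i + 3) - 4) * d_min

    return tree_dist
```

Calling `sample_frt_tree_metric(d, random.Random(0))` on a distance matrix `d` yields one sample; averaging `tree_dist(u, v)` over many independent samples estimates the expected stretch of a pair. Because edge lengths halve from one level to the next, the sampled tree has the 2-HST shape mentioned in the theorem.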

The bound in Theorem 1 is tight: Alon et al. [1] proved a lower bound of $${ \Omega(\log n) }$$ on the distortion when (V, d) is induced by a grid graph. It is also known (as folklore) that even embedding a line metric into a 2-HST requires distortion $${ \Omega(\log n) }$$.

If the tree is required to be a k-HST, one can apply the result of Bartal, Charikar, and Raz [4], which states that any 2-HST can be $${ O(k/\log k) }$$-probabilistically approximated by k-HSTs, to obtain an expected distortion of $${ O(k\log n/\log k) }$$.

Finding a distribution of tree metrics that probabilistically approximates a given metric has a dual problem, which is to find a single tree T with small average weighted stretch. More specifically, given weights $${ c_{uv} }$$ on edges, find a tree metric $${ d_T }$$ such that $${ d_T(u,v)\geq d(u,v) }$$ for all $${ u,v\in V }$$ and $$\sum_{u,v\in V} c_{uv}\cdot d_T(u,v)\leq\alpha\sum_{u,v\in V}c_{uv}\cdot d(u,v)$$.
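One direction of this correspondence follows by averaging; the short derivation below is added as an illustration and uses only the definition of α‑probabilistic approximation. If $${ \mathcal{D} }$$ α‑probabilistically approximates d, then for any non-negative weights $${ c_{uv} }$$, linearity of expectation gives

$$\mathrm{E}_{d_T\in\mathcal{D}}\Big[\sum_{u,v\in V} c_{uv}\cdot d_T(u,v)\Big]=\sum_{u,v\in V} c_{uv}\cdot\mathrm{E}_{d_T\in\mathcal{D}}[d_T(u,v)]\leq\alpha\sum_{u,v\in V}c_{uv}\cdot d(u,v)$$

so at least one tree metric in $${ \mathcal{D} }$$ has weighted stretch at most α, and it dominates d because every tree metric in $${ \mathcal{D} }$$ does. The converse direction, building a distribution from a solver for the dual problem, is what the result of Charikar et al. [6] provides.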

Charikar, Chekuri, Goel, Guha, and Plotkin [6] showed how to find a distribution of $${ O(n\log n) }$$ tree metrics that α‑probabilistically approximates a given metric, provided that one can solve the dual problem. The algorithm in Theorem 1 can be derandomized by the method of conditional expectation to find the required tree metric with $${ \alpha=O(\log n) }$$. Another algorithm based on modified region growing techniques is presented in [9], and independently by Bartal.

### Theorem 2

Given an n-point metric (V, d), there exists a polynomial-time deterministic algorithm that finds a distribution $${ \mathcal{D} }$$ over $${ O(n\log n) }$$ tree metrics that $${ O(\log n) }$$-probabilistically approximates (V, d).

Note that the tree output by the algorithm contains Steiner nodes. However, Gupta [10] showed how to find another tree metric without Steiner nodes while preserving all distances within a constant factor.

## Applications

Metric approximation by random trees has applications in on-line and distributed computation, since randomization works well against oblivious adversaries, and trees are easy to work with and maintain. Alon et al. [1] first used tree embedding to give a competitive algorithm for the k-server problem. Bartal [3] noted a few problems in his paper: metrical task system, distributed paging, distributed k-server problem, distributed queuing, and mobile user.

After the paper by Bartal in 1996, numerous applications in approximation algorithms have been found. Many approximation algorithms work for problems on tree metrics or HST metrics. By approximating general metrics with these metrics, one can turn them into algorithms for general metrics, while, usually, losing only a factor of $${ O(\log n) }$$ in the approximation factors. Sample problems are metric labeling, buy-at-bulk network design, and group Steiner trees. Recent applications include an approximation algorithm to the Unique Games [12], information network design [13], and oblivious network design [11].

The SIGACT News article [8] reviews metric approximation by tree metrics, with a more detailed discussion of developments and techniques. See also [3,9] for other applications.

## Open Problems

Given a metric induced by a graph, some applications, e.g., solving a certain class of linear systems, require not just a tree metric but a tree metric induced by a spanning tree of the graph. Elkin, Emek, Spielman, and Teng [7] gave an algorithm for finding a spanning tree with average distortion $${ O(\log^2 n\log\log n) }$$. It remains open whether this bound is tight.