1 Introduction

1.1 Metric Dimension Reduction

Using standard terminology from metric embeddings (see [38]), we say that a mapping \(f:(X,d_X)\rightarrow (Y,d_Y)\) between metric spaces is a bi-Lipschitz embedding with distortion at most \(\alpha \in [1,\infty )\) if there exists a scaling factor \(\sigma \in (0,\infty )\) such that

$$\begin{aligned} {\forall }x,y\in X, \qquad \sigma \, d_X(x,y) \le d_Y\big (f(x),f(y)\big ) \le \alpha \sigma \, d_X(x,y). \end{aligned}$$
(1)

Throughout this paper, we shall denote by \(\ell _p^d\) the linear space \(\mathbb {R}^d\) equipped with the p-norm,

$$\begin{aligned} {\forall }a=(a_1,\ldots ,a_d)\in \mathbb {R}^d, \qquad \Vert a\Vert _{\ell _p^d} = \Big ( \sum _{i=1}^d |a_i|^p\Big )^{1/p}. \end{aligned}$$
(2)

The classical Johnson–Lindenstrauss lemma [21] asserts that if \((\mathcal {H},\Vert \cdot \Vert _{\mathcal {H}})\) is a Hilbert space and \(x_1,\ldots ,x_n\in \mathcal {H}\), then for every \(\varepsilon \in (0,1)\) there exists \(d\le \tfrac{C\log n}{\varepsilon ^2}\) and \(y_1,\ldots ,y_n\in \ell _2^d\) such that

$$\begin{aligned} {\forall }i,j\in \{1,\ldots ,n\},\qquad \Vert x_i-x_j\Vert _{\mathcal {H}} \le \Vert y_i-y_j\Vert _{\ell _2^d} \le (1+\varepsilon )\cdot \Vert x_i-x_j\Vert _\mathcal {H}, \end{aligned}$$
(3)

where \(C\in (0,\infty )\) is a universal constant. In the above embedding terminology, the Johnson–Lindenstrauss lemma states that for every \(\varepsilon \in (0,1)\), \(n\in \mathbb {N}\), and \(d\ge \tfrac{C\log n}{\varepsilon ^2}\), any n-point subset of Hilbert space admits a bi-Lipschitz embedding into \(\ell _2^d\) with distortion at most \(1+\varepsilon \). To prove their result, Johnson and Lindenstrauss introduced in [21] the influential random projection method, which has since found many important applications in metric geometry and theoretical computer science and launched the field of metric dimension reduction, a field lying at the intersection of those two subjects (see the recent survey [36] of Naor).
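To illustrate the random projection method, the following sketch projects points through a Gaussian matrix scaled by \(1/\sqrt{d}\) and measures the pairwise distortion empirically. This is a standard Gaussian variant rather than the original construction of [21], and the constant \(C=8\) and all variable names are chosen for illustration only.

```python
import numpy as np

rng = np.random.default_rng(0)

n, D, eps = 64, 1000, 0.5
d = int(8 * np.log(n) / eps**2)               # d = C log(n) / eps^2 with C = 8 (illustrative)

x = rng.standard_normal((n, D))               # n points of a high-dimensional Hilbert space
G = rng.standard_normal((d, D)) / np.sqrt(d)  # Gaussian random projection, scaled by 1/sqrt(d)
y = x @ G.T                                   # low-dimensional images in l_2^d

# empirical pairwise distortion ratios ||y_i - y_j|| / ||x_i - x_j||
ratios = np.array([
    np.linalg.norm(y[i] - y[j]) / np.linalg.norm(x[i] - x[j])
    for i in range(n) for j in range(i + 1, n)
])
```

With overwhelming probability the ratios fall in \([1-\varepsilon ,1+\varepsilon ]\); rescaling the images by \(1/(1-\varepsilon )\) then yields the one-sided form (3).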

Following [36], we say that an infinite dimensional Banach space \((E,\Vert \cdot \Vert _E)\) admits bi-Lipschitz dimension reduction if there exists \(\alpha = \alpha (E)\in [1,\infty )\) such that for every \(n\in \mathbb {N}\), there exists \(k_n=k_n(E,\alpha )\in \mathbb {N}\) satisfying

$$\begin{aligned} \lim _{n\rightarrow \infty } \frac{\log k_n}{\log n} = 0 \end{aligned}$$
(4)

and such that any n-point subset \(\mathcal {S}\) of E admits a bi-Lipschitz embedding with distortion at most \(\alpha \) in a finite-dimensional linear subspace F of E with \(\textrm{dim}F\le k_n\). The only non-Hilbertian space that is known to admit bi-Lipschitz dimension reduction is the 2-convexification of the classical Tsirelson space, as proven by Johnson and Naor in [22]. Turning to negative results, Matoušek proved in [32] the impossibility of bi-Lipschitz dimension reduction in \(\ell _\infty \), whereas Brinkman and Charikar [10] (see also [30] for a shorter proof) constructed an n-point subset of \(\ell _1\) which does not admit a bi-Lipschitz embedding into any \(n^{o(1)}\)-dimensional subspace of \(\ell _1\). Their theorem was recently refined by Naor et al. [37] who showed that the same n-point subset of \(\ell _1\) does not embed into any \(n^{o(1)}\)-dimensional subspace of the trace class \(\textsf{S}_1\) (see also the striking recent work [41] of Regev and Vidick, where the impossibility of polynomial almost isometric dimension reduction in \(\textsf{S}_1\) is established). We refer to [36, Thm. 16] for a summary of the best known bounds quantifying the aforementioned qualitative statements. Despite the lapse of almost four decades since the proof of the Johnson–Lindenstrauss lemma, the following natural question remains stubbornly open.

Question 1.1

For which values of \(p\notin \{1,2,\infty \}\) does \(\ell _p\) admit bi-Lipschitz dimension reduction?

1.2 Dimensionality and Structure

An important feature of the formalism of bi-Lipschitz dimension reduction in a Banach space E is that both the distortion \(\alpha (E)\) of the embedding and the dimension \(k_n(E,\alpha )\) of the target subspace F are independent of the given n-point subset \(\mathcal {S}\) of E. Nevertheless, there are instances in which one can construct delicate embeddings whose distortion or the dimension of their targets depends on subtle geometric parameters of \(\mathcal {S}\). For instance, we mention an important theorem of Schechtman [42, Thm. 5] (which built on work of Klartag and Mendelson [26]) who constructed a linear embedding of an arbitrary subset \(\mathcal {S}\) of \(\ell _2\) into any Banach space E whose distortion depends only on the Gaussian width of \(\mathcal {S}\) and the \(\ell \)-norm of the identity operator \(\textsf{id}_E:E\rightarrow E\). In the special case that E is a Hilbert space, a substantially richer family of such embeddings was devised in [31].

Let \(\mu \) be a probability measure. For a subset \(\mathcal {S}\) of \(L_p(\mu )\), we shall denote

$$\begin{aligned} \mathcal {I}(\mathcal {S}) {\mathop {=}\limits ^{\textrm{def}}}\big \Vert \max _{x\in \mathcal {S}}|x|\big \Vert _{L_p(\mu )} \end{aligned}$$
(5)

and we will say that \(\mathcal {S}\) is K-incompressible if \(\mathcal {I}(\mathcal {S})\le K\). The main contribution of the present paper is the following dimensionality reduction theorem for incompressible subsets of \(L_p(\mu )\) which, in contrast to all the results discussed earlier, is valid for any value of \(p\in [1,\infty )\).
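When \(\mu \) is the normalized counting measure on \(\{1,\ldots ,N\}\) (the space \(L_p^N\) of (34) below), the quantity \(\mathcal {I}(\mathcal {S})\) of (5) is simply the \(L_p\)-norm of the pointwise maximum of \(|x|\) over \(x\in \mathcal {S}\) and can be computed directly. The following sketch (the function name is ours) illustrates the definition.

```python
import numpy as np

def incompressibility(points, p):
    """Compute I(S) = || max_{x in S} |x| ||_{L_p(mu)} as in (5), for the
    normalized counting measure mu on {1, ..., N} (the space L_p^N)."""
    envelope = np.max(np.abs(np.asarray(points, dtype=float)), axis=0)
    return float(np.mean(envelope ** p) ** (1 / p))

# the pointwise maximum of (1, 0) and (0, 1) is (1, 1), whose L_1 norm under
# the normalized counting measure on two atoms is (1 + 1) / 2 = 1
print(incompressibility([[1.0, 0.0], [0.0, 1.0]], p=1))  # -> 1.0
```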

Theorem 1.2

(\(\varepsilon \)-isometric dimension reduction for incompressible subsets of \(L_p(\mu )\)) Fix parameters \(p\in [1,\infty )\), \(n\in \mathbb {N}\), \(K\in (0,\infty )\) and let \(\{x_i\}_{i=1}^n\) be a K-incompressible family of vectors in \(L_p(\mu )\) for some probability measure \(\mu \). Then for every \(\varepsilon \in (0,1)\), there exist \(d\in \mathbb {N}\) with \(d\le \tfrac{32e^2(2K)^{2p}\log n}{\varepsilon ^2}\) and points \(y_1,\ldots ,y_n\in \ell _p^d\) such that

$$\begin{aligned} {\forall }i,j\in \{1,\ldots ,n\}, \quad \Vert x_i-x_j\Vert ^p_{L_p(\mu )}- \varepsilon \le \Vert y_i-y_j\Vert _{\ell _p^d}^p \le \Vert x_i-x_j\Vert ^p_{L_p(\mu )}+\varepsilon . \end{aligned}$$
(6)

Besides the appearance of the incompressibility parameter K in the bound for the dimension d of the target space, Theorem 1.2 differs from the Johnson–Lindenstrauss lemma in that the error in (6) is additive rather than multiplicative. Recall that a map \(f:(X,d_X)\rightarrow (Y,d_Y)\) between metric spaces is called an \(\varepsilon \)-isometric embedding if

$$\begin{aligned} {\forall }x,y\in X, \qquad d_X(x,y) - \varepsilon \le d_Y\big (f(x),f(y)\big ) \le d_X(x,y)+\varepsilon . \end{aligned}$$
(7)

Embeddings with additive errors occur naturally in metric geometry and, more specifically, in metric dimension reduction (see e.g. [44, Sect. 9.3]). We mention for instance a result [40, Thm. 1.5] of Plan and Vershynin who showed that any subset \(\mathcal {S}\) of the unit sphere in \(\ell _2^n\) admits a \(\delta \)-isometric embedding into the d-dimensional Hamming cube \((\{-1,1\}^d,\Vert \cdot \Vert _1)\), where d depends polynomially on \(\delta ^{-1}\) and the Gaussian width of \(\mathcal {S}\). In the above embedding terminology and in view of the elementary inequality \(|\alpha -\beta | \le |\alpha ^p-\beta ^p|^{1/p}\) which holds for every \(\alpha ,\beta >0\), Theorem 1.2 asserts that any n-point K-incompressible subset of \(L_p(\mu )\) admits an \(\varepsilon ^{1/p}\)-isometric embedding into \(\ell _p^d\) for the above choice of dimension d. For further occurrences of \(\varepsilon \)-isometric embeddings in the dimensionality reduction and compressed sensing literatures, we refer to [8, 19, 20, 31, 40, 44] and the references therein.
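The elementary inequality \(|\alpha -\beta |\le |\alpha ^p-\beta ^p|^{1/p}\) invoked above follows from the superadditivity of \(t\mapsto t^p\) on \([0,\infty )\) for \(p\ge 1\), and is easy to sanity-check numerically (the helper name is ours):

```python
def power_mean_ineq(a, b, p):
    # |a - b| <= |a^p - b^p|^(1/p) for a, b > 0 and p >= 1; the tiny
    # tolerance only guards against floating-point rounding
    return abs(a - b) <= abs(a**p - b**p) ** (1.0 / p) + 1e-12

# exhaustive check over a small grid of positive reals and exponents p >= 1
grid = [0.1 * k for k in range(1, 51)]
assert all(power_mean_ineq(a, b, p)
           for p in (1.0, 1.5, 2.0, 3.0)
           for a in grid for b in grid)
```

Applied coordinatewise, this is exactly how the additive bound (6) on p-th powers upgrades to an \(\varepsilon ^{1/p}\)-isometric embedding.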

1.3 Method of Proof

A large part of the (vast) literature on metric dimension reduction focuses on showing that a typical low-rank linear operator chosen randomly from a specific ensemble acts as an approximate isometry on a given set \(\mathcal {S}\) with high probability. For subsets \(\mathcal {S}\) of Euclidean space, this principle has been confirmed for random projections [12, 14, 21, 36], matrices with Gaussian [15, 16, 42], Rademacher [1, 5], and subgaussian [13, 17, 26, 31] entries, randomizations of matrices with the RIP [27] as well as more computationally efficient models [2, 3, 9, 24, 33] which are based on sparse matrices. Beyond its inherent interest as an \(\ell _p\)-dimension reduction theorem (albeit, for specific configurations of points), Theorem 1.2 also differs from the aforementioned works in its method of proof. The core of the argument, rather than sampling from a random matrix ensemble, relies on Maurey’s empirical method [39] (see Sect. 2.1) which is a dimension-free way to approximate points in bounded convex subsets of Banach spaces by convex combinations of extreme points with prescribed length. An application of the method to the positive cone of \(L_p\)-distance matrices (the use of which in this context is inspired by classical work of Ball [6]) equipped with the supremum norm allows us to deduce (see Proposition 2.1) the conclusion of Theorem 1.2 under the stronger assumption that

$$\begin{aligned} K\ge \max _{i\in \{1,\ldots ,n\}} \Vert x_i\Vert _{L_\infty (\mu )}. \end{aligned}$$
(8)

While Maurey’s empirical method is an a priori existential statement that is proven via the probabilistic method, recent works (see [7, 18]) have focused on derandomizing its proof for specific Banach spaces. In the setting of Theorem 1.2, we can use these tools to show (see Corollary 2.7) that there exists a greedy algorithm which receives as input the high-dimensional data \(\{x_i\}_{i=1}^n\) and produces as output the low-dimensional points \(\{y_i\}_{i=1}^n\). Finally, using a suitable change of measure [34] (see Sect. 2.3) we are able to relax the stronger assumption (8) to that of K-incompressibility and derive the conclusion of Theorem 1.2. We emphasize that, in contrast to most of the dimension reduction algorithms (randomized or not) discussed earlier, the one which gives Theorem 1.2 is not oblivious but is rather tailored to the specific configuration of points \(\{x_i\}_{i=1}^n\) as it relies on the use of Maurey’s empirical method.

1.4 \(\varepsilon \)-Isometric Dimension Reduction

Given two moduli \(\omega ,\Omega :[0,\infty )\rightarrow [0,\infty )\), we say (following [36]) that a Banach space \((E,\Vert \cdot \Vert _E)\) admits metric dimension reduction with moduli \((\omega ,\Omega )\) if for any \(n\in \mathbb {N}\) there exists \(k_n=k_n(E)\in \mathbb {N}\) with \(k_n=n^{o(1)}\) as \(n\rightarrow \infty \) such that for any \(x_1,\ldots ,x_n\in E\), there exist a subspace F of E with \(\textrm{dim}F\le k_n\) and \(y_1,\ldots ,y_n \in F\) satisfying

$$\begin{aligned} {\forall }i,j\in \{1,\ldots ,n\}, \quad \omega (\Vert x_i-x_j\Vert _E) \le \Vert y_i-y_j\Vert _E \le \Omega (\Vert x_i-x_j\Vert _E). \end{aligned}$$
(9)

In view of Theorem 1.2, we would be interested in formulating a suitable notion of dimension reduction via \(\varepsilon \)-isometric embeddings which would be fitting to the moduli appearing in (6).

Remark 1.3

Let \(a,b\in (0,\infty )\), suppose that \(\omega ,\Omega :[0,\infty )\rightarrow [0,\infty )\) are two moduli satisfying

$$\begin{aligned} \lim _{t\rightarrow \infty } \frac{\omega (t)}{t} = a \qquad \text{ and } \qquad \lim _{t\rightarrow \infty } \frac{\Omega (t)}{t}=b \end{aligned}$$
(10)

and that the Banach space \((E,\Vert \cdot \Vert _E)\) admits metric dimension reduction with moduli \((\omega ,\Omega )\). Fix \(n\in \mathbb {N}\) and \(x_1,\ldots ,x_n\in E\). Applying the assumption (9) to the points \(sx_1,\ldots ,sx_n\) where \(s\gg 1\), we deduce that there exist points \(y_1(s),\ldots ,y_n(s)\) in a \(k_n\)-dimensional subspace F(s) of E such that

$$\begin{aligned} {\forall }i,j\in \{1,\ldots ,n\}, \quad \omega (s\Vert x_i-x_j\Vert _E) \le \big \Vert y_i(s)-y_j(s)\big \Vert _E \le \Omega (s\Vert x_i-x_j\Vert _E). \end{aligned}$$
(11)

For any \(\eta \in (0,1)\), we can then choose s large enough (as a function of \(\eta \) and the \(x_i\)) such that

$$\begin{aligned} {\forall }i,j\in \{1,\ldots ,n\},\quad (1-\eta )a\Vert x_i-x_j\Vert _E \le \frac{\Vert y_i(s)-y_j(s)\Vert _E}{s} \le (1+\eta )b\Vert x_i-x_j\Vert _E.\nonumber \\ \end{aligned}$$
(12)

Therefore, we conclude that E also admits bi-Lipschitz dimension reduction (with distortion b/a).

This simple scaling argument suggests that any reasonable notion of \(\varepsilon \)-isometric dimension reduction can differ from the corresponding bi-Lipschitz theory only in small scales, thus motivating the following definition. We denote by \({\textbf {B}}_E\) the unit ball of a normed space \((E,\Vert \cdot \Vert _E)\).

Definition 1.4

(\(\varepsilon \)-isometric dimension reduction) Fix \(\varepsilon \in (0,1)\), \(r\in (0,\infty )\) and let \((E,\Vert \cdot \Vert _E)\) be an infinite-dimensional Banach space. We say that \({\textbf {B}}_E\) admits \(\varepsilon \)-isometric dimension reduction with power r if for every \(n\in \mathbb {N}\) there exists \(k_n=k_n^r(E,\varepsilon )\in \mathbb {N}\) with \(k_n=n^{o(1)}\) as \(n\rightarrow \infty \) for which the following condition holds. For every n points \(x_1,\ldots ,x_n\in {\textbf {B}}_E\) there exist a linear subspace F of E with \(\textrm{dim}F\le k_n\) and points \(y_1,\ldots ,y_n\in F\) satisfying

$$\begin{aligned} {\forall }i,j\in \{1,\ldots ,n\}, \qquad \Vert x_i-x_j\Vert _E^r - \varepsilon \le \Vert y_i-y_j\Vert _E^r \le \Vert x_i-x_j\Vert _E^r +\varepsilon . \end{aligned}$$
(13)

The fact that even high-dimensional infinite subsets of Euclidean space \(\ell _2\) may admit \(\varepsilon \)-isometric embeddings into low-dimensional subspaces follows from the additive version of the Johnson–Lindenstrauss lemma, first proven by Liaw, Mehrabian, Plan, and Vershynin [31] (see also [44, Prop. 9.3.2]). In contrast to that, combining the scaling argument of Remark 1.3 with the fact that any d-dimensional subspace of \(\ell _2\) is isometric to \(\ell _2^d\), we deduce that if \(k_n(\varepsilon )\) is the least dimension such that any n points in \(\ell _2\) embed \(\varepsilon \)-isometrically in \(\ell _2^{k_n(\varepsilon )}\), then \(k_n(\varepsilon )= n-1\). This justifies the restriction of Definition 1.4 to the unit ball \({\textbf {B}}_E\) of E.

It is clear from the definitions that if a Banach space E admits bi-Lipschitz dimension reduction with distortion \(\tfrac{1+\varepsilon }{1-\varepsilon }\), where \(\varepsilon \in (0,1)\), then \({\textbf {B}}_E\) admits \(2\varepsilon \)-isometric dimension reduction with power \(r=1\). The \(\varepsilon \)-isometric analogue of Question 1.1 deserves further investigation.

Question 1.5

For which values of \(p\ne 2\) does \({\textbf {B}}_{\ell _p}\) admit \(\varepsilon \)-isometric dimension reduction?

Even though the K-incompressibility assumption of Theorem 1.2 may a priori seem restrictive, it is satisfied for most configurations of points in \({\textbf {B}}_{\ell _p}\). Suppose that \(n,N\in \mathbb {N}\) are such that N is polynomial in n. Then, standard considerations show that with high probability, a uniformly chosen n-point subset \(\mathcal {S}\) of \(N^{1/p}\,{\textbf {B}}_{\ell _p^N}\) is \(O(\log n)^{1/p}\)-incompressible. We refer to Remark 2.4 for more information on this and related generic properties of finite subsets of rescaled p-balls.

1.5 \(\varepsilon \)-Isometric Dimension Reduction by Linear Maps

A close inspection of the proof of Theorem 1.2 (see Remark 2.6) reveals that in fact the low-dimensional points \(\{y_i\}_{i=1}^n\) can be realized as images of the initial data \(\{x_i\}_{i=1}^n\) under a carefully chosen linear operator. Nevertheless, we will show that for any \(p\ne 2\) and n large enough, there exists an n-point subset of \({{\textbf {B}}}_{\ell _p}\) whose image under any fixed linear \(\varepsilon \)-isometric embedding has rank which is linear in n. In fact, we shall prove the following more general statement which refines a theorem that Lee, Mendel and Naor proved in [29] for bi-Lipschitz embeddings.

Theorem 1.6

(Impossibility of linear dimension reduction in \({\textbf {B}}_{\ell _p}\)) Fix \(p\ne 2\) and two moduli \(\omega ,\Omega :[0,\infty )\rightarrow [0,\infty )\) with \(\omega (1)>0\). For arbitrarily large \(n\in \mathbb {N}\), there exists an n-point subset \(\mathcal {S}_{n,p}\) of \({\textbf {B}}_{\ell _p}\) such that the following holds. If \(T:\textrm{span}(\mathcal {S}_{n,p})\rightarrow \ell _p^d\) is a linear operator satisfying

$$\begin{aligned} {\forall }x,y\in \mathcal {S}_{n,p}, \qquad \omega (\Vert x-y\Vert _{\ell _p}) \le \Vert Tx-Ty\Vert _{\ell _p^d} \le \Omega (\Vert x-y\Vert _{\ell _p}), \end{aligned}$$
(14)

then \(d\ge \left( \tfrac{\omega (1)}{\Omega (1)}\right) ^\frac{2p}{|p-2|} \cdot \tfrac{n-1}{2}\).

2 Proof of Theorem 1.2

We say that a normed space \((E,\Vert \cdot \Vert _E)\) has Rademacher type p if there exists a universal constant \(T\in (0,\infty )\) such that for every \(n\in \mathbb {N}\) and every \(x_1,\ldots ,x_n\in E\),

$$\begin{aligned} \frac{1}{2^n} \sum _{\varepsilon \in \{-1,1\}^n} \Big \Vert \sum _{i=1}^n \varepsilon _i x_i\Big \Vert _E^p \le T^p \sum _{i=1}^n \Vert x_i\Vert _E^p. \end{aligned}$$
(15)

The least constant T such that (15) is satisfied is denoted by \(T_p(E)\). A standard symmetrization argument (see [28, Prop. 9.11]) shows that if \(X_1,\ldots ,X_n\) are independent E-valued random variables with \(\mathbb {E}[X_i]=0\) for every \(i\in \{1,\ldots ,n\}\), then

$$\begin{aligned} \mathbb {E}\Big \Vert \sum _{i=1}^n X_i\Big \Vert _E^p \le \big (2T_p(E)\big )^p \sum _{i=1}^n \mathbb {E} \Vert X_i\Vert _E^p. \end{aligned}$$
(16)

2.1 Maurey’s Empirical Method and Its Algorithmic Counterparts

A classical theorem of Carathéodory asserts that if \(\mathcal {T}\) is a subset of \(\mathbb {R}^m\), then any point z in the convex hull \(\textrm{conv}(\mathcal {T})\) (that is, a convex combination of finitely many elements of \(\mathcal {T}\)) can be expressed as a convex combination of at most \(m+1\) points of \(\mathcal {T}\). Maurey’s empirical method is a powerful dimension-free approximate version of Carathéodory’s theorem, first popularized in [39], that has numerous applications in geometry and theoretical computer science. Let \((E,\Vert \cdot \Vert _E)\) be a Banach space, consider a bounded subset \(\mathcal {T}\) of E and fix \(z\in \textrm{conv}(\mathcal {T})\). Since z is a convex combination of elements of \(\mathcal {T}\), there exist \(m\in \mathbb {N}\), \(\lambda _1,\ldots ,\lambda _m\in (0,\infty )\), and \(t_1,\ldots ,t_m\in \mathcal {T}\) such that

$$\begin{aligned} \sum _{k=1}^m \lambda _k = 1 \quad \text{ and } \quad z=\sum _{k=1}^m \lambda _kt_k. \end{aligned}$$
(17)

Let X be an E-valued discrete random variable with \(\mathbb {P}\{X=t_k\}=\lambda _k\) for all \(k\in \{1,\ldots ,m\}\) and consider \(X_1,\ldots ,X_d\) i.i.d. copies of X. Then, conditions (17) ensure that X is well defined and \(\mathbb {E}[X]=z\). Therefore, applying the Rademacher type condition (16) to the centered random variables \(\{X_s-z\}_{s=1}^d\) and normalizing, we get

$$\begin{aligned} \mathbb {E}\Big \Vert \frac{1}{d} \sum _{s=1}^d X_s - z\Big \Vert _E^p \le \frac{(2T_p(E))^p}{d^{p-1}} \ \mathbb {E}\Vert X-z\Vert _E^p. \end{aligned}$$
(18)

Since X takes values in \(\mathcal {T}\), if \(\mathcal {T} \subseteq R{{\textbf {B}}}_E\) then \(\Vert X-z\Vert _E\le 2R\) almost surely (as also \(z\in \textrm{conv}(\mathcal {T})\subseteq R{{\textbf {B}}}_E\)), and we deduce that there exist \(x_1,\ldots ,x_d\in \mathcal {T}\) such that

$$\begin{aligned} \Big \Vert \frac{1}{d}\sum _{s=1}^d x_s - z\Big \Vert _E \le \frac{4RT_p(E)}{d^{1-1/p}}. \end{aligned}$$
(19)
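In the Hilbertian case (where \(T_2(E)=1\)), the sampling argument behind (19) is easy to simulate. The sketch below (all names, dimensions, and constants are ours) draws d i.i.d. samples from the distribution \(\lambda \) and compares the empirical error to the \(4R/\sqrt{d}\) bound; with a fixed seed the run is reproducible, and in general the bound is only guaranteed to hold with positive probability.

```python
import numpy as np

rng = np.random.default_rng(1)

# a bounded set T of m points in R^50 and a point z in conv(T)
m, dim = 200, 50
T = rng.standard_normal((m, dim))
R = np.linalg.norm(T, axis=1).max()           # T is contained in R * B_E
lam = rng.random(m)
lam /= lam.sum()                              # convex weights lambda_k
z = lam @ T                                   # z = sum_k lambda_k t_k

# Maurey sampling: draw d indices i.i.d. with P{X = t_k} = lambda_k and
# average; in a Hilbert space T_2(E) = 1, so (19) predicts error <= 4R/sqrt(d)
errors = {}
for d in (10, 1000):
    idx = rng.choice(m, size=d, p=lam)
    errors[d] = np.linalg.norm(T[idx].mean(axis=0) - z)
```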

While the above argument is probabilistic, recent works have focused on derandomizing Maurey’s sampling lemma for smaller classes of Banach spaces, thus constructing deterministic algorithms which output the empirical approximation \(\tfrac{x_1+\ldots +x_d}{d}\) of z. The first result in this direction is due to Barman [7], who treated the case that E is an \(L_r(\mu )\)-space for \(r\in (1,\infty )\). His result was recently generalized by Ivanov [18], who devised a greedy algorithm that constructs the desired empirical mean in an arbitrary p-uniformly smooth space.
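In a Hilbert space, a classical greedy scheme of Maurey–Jones–Barron type already derandomizes the sampling step: at step k, keep the running mean with weight \(1-1/k\) and adjoin the point of \(\mathcal {T}\) that most decreases the distance to z. The sketch below is a naive stand-in for (and much cruder than) the algorithms of Barman [7] and Ivanov [18]; all names are ours.

```python
import numpy as np

def greedy_maurey(T, z, d):
    """Greedy empirical approximation of z in conv(T): at step k the mean
    becomes ((k-1)*mean + t)/k for the t in T minimizing the distance to z.
    An induction (Maurey-Jones-Barron) gives ||mean_k - z|| <= B / sqrt(k)
    with B = max_t ||t - z||."""
    T = np.asarray(T, dtype=float)
    total = np.zeros(T.shape[1])
    picks = []
    for k in range(1, d + 1):
        candidates = (total[None, :] + T) / k          # ((k-1)*mean + t) / k
        best = int(np.argmin(np.linalg.norm(candidates - z[None, :], axis=1)))
        picks.append(best)
        total += T[best]
    return total / d, picks

rng = np.random.default_rng(2)
T = rng.standard_normal((100, 20))
lam = rng.random(100)
lam /= lam.sum()
z = lam @ T                                            # a point of conv(T)
approx, _ = greedy_maurey(T, z, 200)
err = np.linalg.norm(approx - z)
B = np.linalg.norm(T - z, axis=1).max()
```

Unlike the random sampling above, this procedure is fully deterministic, and the \(B/\sqrt{d}\) guarantee holds on every run rather than merely with positive probability.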

2.2 Dimension Reduction in \(L_p(\mu )\) for Uniformly Bounded Vectors

With Maurey’s empirical method at hand, we are ready to proceed to the first part of the proof of Theorem 1.2, namely the \(\varepsilon \)-isometric dimension reduction property of \(L_p(\mu )\) under the strong assumption that the given point set consists of functions which are bounded in \(L_\infty (\mu )\).

Proposition 2.1

Fix \(p\in [1,\infty )\), \(n\in \mathbb {N}\) and let \(\{x_i\}_{i=1}^n\) be a family of vectors in \(L_p(\mu )\) for some probability measure \(\mu \). Denote by \(L=\max _{i\in \{1,\ldots ,n\}} \Vert x_i\Vert _{L_\infty (\mu )}\in [0,\infty ]\). Then for every \(\varepsilon \in (0,1)\), there exist \(d\in \mathbb {N}\) with \(d\le \tfrac{32e^2(2L)^{2p}\log n}{\varepsilon ^2}\) and \(y_1,\ldots ,y_n\in \ell _p^d\) such that

$$\begin{aligned} {\forall }i,j\in \{1,\ldots ,n\}, \qquad \Vert x_i{-}x_j\Vert ^p_{L_p(\mu )}{-} \varepsilon \le \Vert y_i{-}y_j\Vert _{\ell _p^d}^p \le \Vert x_i{-}x_j\Vert ^p_{L_p(\mu )}+\varepsilon . \end{aligned}$$
(20)

Proof

We shall identify \(\ell _\infty ^{\left( {\begin{array}{c}n\\ 2\end{array}}\right) }\) with the vector space of all symmetric \(n\times n\) real matrices with 0 on the diagonal equipped with the supremum norm. Consider the set

$$\begin{aligned} \mathcal {C}_p = \big \{ \big ( \Vert z_i-z_j\Vert _{L_p(\rho )}^p\big )_{i,j=1,\ldots ,n}: \ \rho \text{ is } \text{ a } \text{ probability } \text{ measure } \text{ and } z_1,\ldots ,z_n\in L_p(\rho )\big \} \subseteq \ell _\infty ^{\left( {\begin{array}{c}n\\ 2\end{array}}\right) }. \end{aligned}$$
(21)

It is obvious that \(\mathcal {C}_p\) is a cone in the sense that \(\mathcal {C}_p = \lambda \mathcal {C}_p\) for every \(\lambda >0\); moreover, \(\mathcal {C}_p\) is convex. To see this, consider \(A,B\in \mathcal {C}_p\), probability spaces \((\Omega _1,\rho _1), (\Omega _2,\rho _2)\), and vectors \(\{z_i\}_{i=1}^n, \{w_i\}_{i=1}^n\) in \(L_p(\rho _1)\) and \(L_p(\rho _2)\) respectively such that

$$\begin{aligned} {\forall }i,j\in \{1,\ldots ,n\}, \quad A_{ij} = \Vert z_i-z_j\Vert _{L_p(\rho _1)}^p \ \ \text{ and } \ \ B_{ij} = \Vert w_i-w_j\Vert _{L_p(\rho _2)}^p. \end{aligned}$$
(22)

Fix \(\lambda \in (0,1)\) and consider the disjoint union \(\Omega _1\sqcup \Omega _2\) of \(\Omega _1\) and \(\Omega _2\) equipped with the probability measure \(\rho (\lambda ) = \lambda \rho _1+(1-\lambda )\rho _2\). Then, by (22) the functions \(\zeta _i:\Omega _1\sqcup \Omega _2\rightarrow \mathbb {R}\) given by \(\zeta _i|_{\Omega _1} = z_i\) and \(\zeta _i|_{\Omega _2}=w_i\), where \(i\in \{1,\ldots ,n\}\), belong to \(L_p(\rho (\lambda ))\) and satisfy the conditions

$$\begin{aligned} {\forall }i,j\in \{1,\ldots ,n\}, \Vert \zeta _i-\zeta _j\Vert _{L_p(\rho (\lambda ))}^p= & {} \lambda \Vert z_i-z_j\Vert _{L_p(\rho _1)}^p + (1-\lambda ) \Vert w_i-w_j\Vert _{L_p(\rho _2)}^p\nonumber \\ {}= & {} \lambda A_{ij} + (1-\lambda ) B_{ij}, \end{aligned}$$
(23)

which ensure that \(\lambda A+(1-\lambda )B\in \mathcal {C}_p\), making \(\mathcal {C}_p\) a convex cone. Consider the embedding \(\mathcal {M}:L_p(\mu )^n\rightarrow \mathcal {C}_p\) mapping a vector \(z=(z_1,\ldots ,z_n)\) to the corresponding distance matrix, i.e.

$$\begin{aligned} {\forall }i,j\in \{1,\ldots ,n\}, \qquad \mathcal {M}(z)_{ij} = \Vert z_i-z_j\Vert _{L_p(\mu )}^p. \end{aligned}$$
(24)

By Ball’s isometric embedding theorem [6], \(x_1,\ldots ,x_n\) have isometric images in \(\ell _p^N\) with \(N=\left( {\begin{array}{c}n\\ 2\end{array}}\right) +1\). Without loss of generality we will thus assume that the given points \(x_1,\ldots ,x_n\in L_p(\mu )\) are simple functions (that is, each of them takes only finitely many values) with \(\Vert x_i\Vert _{L_\infty (\mu )} \le L\). Let \(\{S_1,\ldots ,S_m\}\) be a partition of the underlying measure space such that each function \(x_i\) is constant on each \(S_k\) and suppose that \(x_i|_{S_k} = a(i,k) \in [-L,L]\) for \(i\in \{1,\ldots ,n\}\) and \(k\in \{1,\ldots ,m\}\). Then, for every \(i,j\in \{1,\ldots ,n\}\), we have

$$\begin{aligned} \mathcal {M}(x)_{ij}= & {} \sum _{k=1}^m \int _{S_k} |x_i-x_j|^p \,\mathop {}\!\textrm{d}\mu = \sum _{k=1}^m \mu (S_k) \cdot \big |a(i,k)-a(j,k)\big |^p \nonumber \\= & {} \sum _{k=1}^m \mu (S_k) \ \mathcal {M}\big (y(k)\big )_{ij}, \end{aligned}$$
(25)

where \(y(k) {\mathop {=}\limits ^{\textrm{def}}}(a(1,k),\ldots ,a(n,k))\in L_p(\mu )^n\) is a vector whose components are constant functions. As \(\mu \) is a probability measure and \(\{S_1,\ldots ,S_m\}\) is a partition, identity (25) implies that

$$\begin{aligned} \mathcal {M}(x) \in \textrm{conv} \big \{ \mathcal {M}\big ( y(k)\big ): \ k\in \{1,\ldots ,m\}\big \} \subseteq \ell _\infty ^{\left( {\begin{array}{c}n\\ 2\end{array}}\right) }.\quad \end{aligned}$$
(26)

Observe that since \(a(i,k)\in [-L,L]\) for every \(i\in \{1,\ldots ,n\}\) and \(k\in \{1,\ldots ,m\}\), we have

$$\begin{aligned} {\forall }k\in \{1,\ldots ,m\}, \qquad \big \Vert \mathcal {M}\big (y(k)\big )\big \Vert _{\ell _\infty ^{\left( {\begin{array}{c}n\\ 2\end{array}}\right) }} {=} \max _{i,j\in \{1,\ldots ,n\}} \big |a(i,k){-}a(j,k)\big |^p \le (2L)^p.\qquad \end{aligned}$$
(27)

Moreover, \(\ell _\infty ^{\left( {\begin{array}{c}n\\ 2\end{array}}\right) }\) is e-isomorphic to \(\ell _{p_n}^{\left( {\begin{array}{c}n\\ 2\end{array}}\right) }\) where \(p_n=\log \left( {\begin{array}{c}n\\ 2\end{array}}\right) \). It is well-known (see [28, Chap. 9]) that \(T_2(\ell _p) \le \sqrt{p-1}\) for every \(p\ge 2\) and thus

$$\begin{aligned} T_2\big (\ell _\infty ^{\left( {\begin{array}{c}n\\ 2\end{array}}\right) } \big ) \le e\sqrt{p_n-1} < \sqrt{2e^2\log n}. \end{aligned}$$
(28)

Applying Maurey’s sampling lemma (Sect. 2.1) while taking into account (27) and (28), we deduce that for every \(d\ge 1\) there exist \(k_1,\ldots ,k_d\in \{1,\ldots ,m\}\) such that

$$\begin{aligned} \Big \Vert \frac{1}{d} \sum _{s=1}^d \mathcal {M}\big ( y(k_s)\big ) - \mathcal {M}(x)\Big \Vert _{\ell _\infty ^{\left( {\begin{array}{c}n\\ 2\end{array}}\right) }} \le \frac{2^{p+\frac{5}{2}}eL^p\sqrt{\log n}}{\sqrt{d}}. \end{aligned}$$
(29)

Therefore, if \(\varepsilon \in (0,1)\) is such that \(d\ge \tfrac{32e^2 (2L)^{2p}\log n}{\varepsilon ^2}\) we then have

$$\begin{aligned} {\forall }i,j\in \{1,\ldots ,n\},\qquad \Big | \frac{1}{d} \sum _{s=1}^d \big |a(i,k_s)-a(j,k_s)\big |^p - \Vert x_i-x_j\Vert _{L_p(\mu )}^p \Big | \le \varepsilon .\qquad \end{aligned}$$
(30)

Finally, consider for each \(i\in \{1,\ldots ,n\}\) a vector \(y_i=(y_i(1),\ldots ,y_i(d))\in \ell _p^d\) given by

$$\begin{aligned} {\forall }s\in \{1,\ldots ,d\}, \qquad y_i(s) = \frac{a(i,k_s)}{d^{1/p}} \end{aligned}$$
(31)

and notice that (30) can be equivalently rewritten as

$$\begin{aligned} {\forall }i,j\in \{1,\ldots ,n\}, \quad \Vert x_i-x_j\Vert ^p_{L_p(\mu )}- \varepsilon \le \Vert y_i-y_j\Vert _{\ell _p^d}^p \le \Vert x_i-x_j\Vert ^p_{L_p(\mu )}+\varepsilon , \nonumber \\ \end{aligned}$$
(32)

concluding the proof of the proposition. \(\square \)

Remark 2.2

It is worth emphasizing that the coordinates of the vectors \(y_1,\ldots ,y_n\) produced in Proposition 2.1 consist (up to rescaling) of values of the functions \(x_1,\ldots ,x_n\). Such low-dimensional embeddings via sampling are a central object of study in approximation theory, see e.g. the recent survey [25] and the references therein.
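The sampling mechanism underlying Proposition 2.1 is concrete enough to simulate. For points in \(L_p^N\) the partition \(\{S_1,\ldots ,S_m\}\) can be taken to be the individual coordinates, each of measure 1/N, so Maurey's method amounts to sampling d coordinates according to the weights \(\mu (S_k)\) and rescaling as in (31). The following sketch (our own variable names; d is simply taken large rather than optimized as in the proposition) checks the additive guarantee (20) empirically.

```python
import numpy as np

rng = np.random.default_rng(3)

# n points in L_p^N (normalized counting measure), uniformly bounded by L
n, N, p, L = 30, 500, 1.5, 1.0
x = rng.uniform(-L, L, size=(n, N))

def dist_p(u, v):
    # ||u - v||_{L_p^N}^p under the normalized counting measure
    return np.mean(np.abs(u - v) ** p)

# the partition {S_1, ..., S_m} is the set of coordinates, mu(S_k) = 1/N;
# Maurey sampling draws d coordinates i.i.d. and rescales them as in (31)
d = 4000
ks = rng.integers(0, N, size=d)
y = x[:, ks] / d ** (1 / p)                   # y_i(s) = a(i, k_s) / d^{1/p}

# largest additive error in the p-th powers, as in (20)
err = max(
    abs(np.sum(np.abs(y[i] - y[j]) ** p) - dist_p(x[i], x[j]))
    for i in range(n) for j in range(i + 1, n)
)
```

As Remark 2.2 notes, the low-dimensional coordinates are (up to rescaling) sampled values of the original functions.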

The additive version of the Johnson–Lindenstrauss lemma, first observed in [31] as a consequence of a deep matrix deviation inequality (see also [44, Chap. 9]), asserts that for every n-point subset \(\mathcal {X}=\{x_1,\ldots ,x_n\}\) of a Hilbert space \(\mathcal {H}\) and every \(\varepsilon \in (0,1)\), there exist \(d\le \tfrac{C w(\mathcal {X})^2}{\varepsilon ^2}\) and points \(y_1,\ldots ,y_n\in \ell _2^d\) such that

$$\begin{aligned} {\forall }i,j\in \{1,\ldots ,n\}, \qquad \Vert x_i-x_j\Vert _{\mathcal {H}}-\varepsilon \le \Vert y_i-y_j\Vert _{\ell _2^d} \le \Vert x_i-x_j\Vert _\mathcal {H}+\varepsilon , \end{aligned}$$
(33)

where \(w(\mathcal {X})\) is the mean width of \(\mathcal {X}\). We will now observe that the spherical symmetry of \({\textbf {B}}_{\ell _2}\) allows us to deduce a similar conclusion for points in \({\textbf {B}}_{\mathcal {H}}\) by removing the \(L_\infty \)-boundedness assumption from Proposition 2.1 when \(p=2\). We shall use the standard notation \(L_p^N\) for the space \(L_p(\mu _N)\) where \(\mu _N\) is the normalized counting measure on the finite set \(\{1,\ldots ,N\}\), that is

$$\begin{aligned} {\forall }a=(a_1,\ldots ,a_N)\in \mathbb {R}^N, \qquad \Vert a\Vert _{L_p^N} {\mathop {=}\limits ^{\textrm{def}}}\Big (\frac{1}{N}\sum _{i=1}^N |a_i|^p\Big )^{1/p}. \end{aligned}$$
(34)

Observe that for \(0<p<q\le \infty \), we have \({\textbf {B}}_{L_q^N} \subseteq {\textbf {B}}_{L_p^N}\).

Corollary 2.3

There exists a universal constant \(C\in (0,\infty )\) such that the following statement holds. Fix \(n\in \mathbb {N}\) and let \(\{x_i\}_{i=1}^n\) be a family of vectors in \({\textbf {B}}_{\mathcal {H}}\) for some Hilbert space \(\mathcal {H}\). Then for every \(\varepsilon \in (0,1)\), there exist \(d\in \mathbb {N}\) with \(d\le \tfrac{C(\log n)^3}{\varepsilon ^4}\) and points \(y_1,\ldots ,y_n\in \ell _2^d\) such that

$$\begin{aligned} {\forall }i,j\in \{1,\ldots ,n\}, \quad \Vert x_i-x_j\Vert _{\mathcal {H}}- \varepsilon \le \Vert y_i-y_j\Vert _{\ell _2^d} \le \Vert x_i-x_j\Vert _{\mathcal {H}}+\varepsilon . \end{aligned}$$
(35)

Before proceeding to the derivation of (35), we emphasize that since the given points \(\{x_i\}_{i=1}^n\) belong to \({\textbf {B}}_\mathcal {H}\), Corollary 2.3 is formally weaker than the Johnson–Lindenstrauss lemma. However, we include it here since it differs from [21] in that the low-dimensional point set \(\{y_i\}_{i=1}^n\) is not obtained as the image of \(\{x_i\}_{i=1}^n\) under a typical low-rank matrix from a specific ensemble.

Proof of Corollary 2.3

Since any n-point subset \(\{x_1,\ldots ,x_n\}\) of \(\mathcal {H}\) embeds linearly and isometrically in \(L_2^n\), we assume that \(x_1,\ldots ,x_n\in {\textbf {B}}_{L_2^{n}}\). We will need the following claim.

Claim. Suppose that \(X_1,\ldots ,X_n\) are (not necessarily independent) random vectors, each uniformly distributed on the unit sphere \(\mathbb {S}^{n-1}\) of \(L_2^{n}\). Then, for some universal constant \(S\in (0,\infty )\),

$$\begin{aligned} \mathbb {E} \big [ \max _{i\in \{1,\ldots ,n\}} \Vert X_i\Vert _{L_\infty ^{n}} \big ] \le S \sqrt{\log n}. \end{aligned}$$
(36)

Proof of the Claim

By a standard estimate of Schechtman and Zinn [43, Thm. 3], for a uniformly distributed random vector X on the unit sphere \(\mathbb {S}^{n-1}\) of \(L_2^{n}\), we have

$$\begin{aligned} {\forall }t\ge \gamma _1\sqrt{\log n}, \qquad \mathbb {P}\big \{ \Vert X\Vert _{L_\infty ^{n}} > t\big \} \le e^{-\gamma _2 t^2} \end{aligned}$$
(37)

for some absolute constants \(\gamma _1,\gamma _2\in (0,\infty )\). Let \(W{\mathop {=}\limits ^{\textrm{def}}}\max _{i\in \{1,\ldots ,n\}} \Vert X_i\Vert _{L_\infty ^{n}}\) and notice that

$$\begin{aligned} \forall \; K\in (\gamma _1,\infty ), \quad \mathbb {E}[W] = \int _0^\infty \mathbb {P}\{W{>}t\} \,\mathop {}\!\textrm{d}t {\le } K\sqrt{\log n} {+} \int _{K\sqrt{\log n}}^\infty \mathbb {P}\{W{>}t\} \,\mathop {}\!\textrm{d}t.\qquad \end{aligned}$$
(38)

By the union bound, we have

$$\begin{aligned} {\forall }t>0, \qquad \mathbb {P}\{W>t\} \le \sum _{i=1}^n \mathbb {P}\big \{\Vert X_i\Vert _{L_\infty ^{n}}>t\big \} = n\, \mathbb {P}\big \{\Vert X_1\Vert _{L_\infty ^{n}}>t\big \}. \end{aligned}$$
(39)

Combining (38) and (39), we therefore get

$$\begin{aligned} \begin{aligned} \mathbb {E}[W]&\le K\sqrt{\log n} +n \int _{K\sqrt{\log n}}^\infty \mathbb {P}\big \{\Vert X_1\Vert _{L_\infty ^{n}}>t\big \}\,\mathop {}\!\textrm{d}t {\mathop {\le }\limits ^{(37)}} K\sqrt{\log n} + n \int _{K\sqrt{\log n}}^\infty e^{-\gamma _2t^2} \,\mathop {}\!\textrm{d}t \\ &= K\sqrt{\log n} +n\sqrt{\log n} \int _{K}^\infty n^{-\gamma _2u^2} \,\mathop {}\!\textrm{d}u = K\sqrt{\log n} + \sqrt{\log n} \int _K^\infty n^{1-\gamma _2u^2} \,\mathop {}\!\textrm{d}u. \end{aligned} \end{aligned}$$
(40)

Choosing \(K>\gamma _1\) such that \(K^2\gamma _2>1\), the exponent \(1-\gamma _2 u^2\) in the last integrand is negative for every \(u\ge K\), so that \(n^{1-\gamma _2 u^2}\le 2\cdot 2^{-\gamma _2 u^2}\) for every \(n\ge 2\). Thus

$$\begin{aligned} \mathbb {E}[W]\le K\sqrt{\log n}+2\sqrt{\log n} \int _K^\infty 2^{-\gamma _2 u^2}\,\mathop {}\!\textrm{d}u \le S\sqrt{\log n} \end{aligned}$$
(41)

for a large enough constant \(S\in (0,\infty )\) and the claim follows.
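The \(\sqrt{\log n}\) growth in (36) is also easy to observe numerically. The following sketch (purely illustrative, not part of the proof) samples uniform points on the unit sphere of \(L_2^{n}\), where we take \(\Vert a\Vert _{L_2^n}=(n^{-1}\sum _i a_i^2)^{1/2}\) as in the normalized setting of [43], and checks that the ratio \(\mathbb {E}[W]/\sqrt{\log n}\) stays bounded as \(n\) grows:

```python
import numpy as np

rng = np.random.default_rng(0)

def expected_max_linf(n, trials=50):
    # Each normalized Gaussian row is uniform on the unit sphere of L_2^n,
    # where ||a||_{L_2^n} = (n^{-1} sum_i a_i^2)^{1/2}.
    vals = np.empty(trials)
    for t in range(trials):
        g = rng.standard_normal((n, n))
        x = g / np.sqrt((g ** 2).mean(axis=1, keepdims=True))
        vals[t] = np.abs(x).max()  # max_i ||X_i||_{L_infty^n}
    return vals.mean()

# The ratio E[W] / sqrt(log n) should remain bounded by a constant S.
ratios = [expected_max_linf(n) / np.sqrt(np.log(n)) for n in (64, 256, 512)]
```

In practice the ratio hovers around \(2\), consistent with a modest universal constant \(S\).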

Now let \(U \in \mathcal {O}(n)\) be a uniformly chosen random rotation of \(\mathbb {R}^{n}\) and write \(\hat{x}_i =\tfrac{x_i}{\Vert x_i\Vert _{L_2^{n}}}\) for \(i\in \{1,\ldots ,n\}\). Each \(U\hat{x}_i\) is uniformly distributed on the unit sphere \(\mathbb {S}^{n-1}\) of \(L_2^{n}\), so the Claim applies to the (dependent) vectors \(U\hat{x}_1,\ldots ,U\hat{x}_n\); since moreover \(\Vert x_i\Vert _{L_2^{n}}\le 1\) for every \(i\), we obtain the estimate

$$\begin{aligned} \mathbb {E}\big [ \max _{i\in \{1,\ldots ,n\}} \Vert Ux_i\Vert _{L_\infty ^{n}}\big ] \le \mathbb {E}\big [ \max _{i\in \{1,\ldots ,n\}} \Vert U\hat{x}_i\Vert _{L_\infty ^{n}}\big ] \le S\sqrt{\log n}. \end{aligned}$$
(42)

Therefore, by (42) and Proposition 2.1 there exist a constant \(C\in (0,\infty )\) and a rotation \(U\in \mathcal {O}(n)\) such that for every \(\varepsilon \in (0,1)\) there exist \(d\le \tfrac{C(\log n)^3}{\varepsilon ^4}\) and points \(y_1,\ldots ,y_n\in \ell _2^d\) for which

$$\begin{aligned} {\forall }i,j\in \{1,\ldots ,n\},\qquad \Vert Ux_i-Ux_j\Vert ^2_{L_2^{n}}- \varepsilon ^2 \le \Vert y_i-y_j\Vert _{\ell _2^d}^2 \le \Vert Ux_i-Ux_j\Vert ^2_{L_2^{n}}+\varepsilon ^2. \end{aligned}$$
(43)

Since \(\Vert Ua-Ub\Vert _{L_2^{n}} = \Vert a-b\Vert _{L_2^{n}}\) for every \(a,b\in L_2^n\), the conclusion follows from the elementary inequality \(|\alpha -\beta | \le \sqrt{|\alpha ^2-\beta ^2|}\), valid for all \(\alpha ,\beta \in (0,\infty )\) since \(|\alpha -\beta |^2\le |\alpha -\beta |(\alpha +\beta )=|\alpha ^2-\beta ^2|\). \(\square \)

Remark 2.4

Fix \(p\in [1,\infty )\). The isometric embedding theorem of Ball [6] asserts that any n-point subset of \(\ell _p\) admits an isometric embedding into \(\ell _p^N\) where \(N=\left( {\begin{array}{c}n\\ 2\end{array}}\right) +1\). Suppose, more generally, that \(n,N\in \mathbb {N}\) are such that N is polynomial in n. Considerations in the spirit of the proof of Corollary 2.3 (e.g. relying on [43]) then show that if \(x_1,\ldots ,x_n\) are independent uniformly random points in \({\textbf {B}}_{L_p^N}\), then the random set \(\{x_1,\ldots ,x_n\}\) is \(O((\log n)^{1/p})\)-incompressible. In other words, incompressibility is a generic property of random n-point subsets of \({\textbf {B}}_{L_p^N}\). On the other hand, a typical n-point subset of \({\textbf {B}}_{L_p^N}\) is known to be approximately a simplex due to work of Arias-de-Reyna, Ball, and Villa [4] and so, in particular, it can be bi-Lipschitzly embedded in \(O(\log n)\) dimensions.

2.3 Factorization and Proof of Theorem 1.2

Observe that Proposition 2.1 is rather non-canonical as the conclusion depends on the pairwise distances between the points \(\{x_i\}_{i=1}^n\) in \(L_p(\mu )\) whereas the bound on the dimension depends on \(L=\max _i \Vert x_i\Vert _{L_\infty (\mu )}\). In order to deduce Theorem 1.2 from this (a priori weaker) statement we shall leverage the fact that Proposition 2.1 holds for any probability measure \(\mu \) by optimizing this parameter L over all lattice-isomorphic images of \(\{x_i\}_{i=1}^n\). The optimal such change of measure which allows us to replace L by \(\Vert \max _i |x_i|\Vert _{L_p(\mu )}\) is a special case of a classical factorization theorem of Maurey (see [34] or [23, Thm. 5] for the general statement), whose short proof we include for completeness.

Proposition 2.5

Fix \(n\in \mathbb {N}\), \(p\in (0,\infty )\), and a probability space \((\Omega ,\mu )\). For all points \(x_1,\ldots ,x_n\in L_p(\mu )\), there exists a nonnegative density function \(f:\Omega \rightarrow \mathbb {R}_+\), supported on the support of \(\max _i|x_i|\), such that if \(\nu \) is the probability measure on \(\Omega \) given by \(\tfrac{\mathop {}\!\textrm{d}\nu }{\mathop {}\!\textrm{d}\mu }=f\), then

$$\begin{aligned} \max _{i\in \{1,\ldots ,n\}}\big \Vert x_i f^{-1/p}\big \Vert _{L_\infty (\nu )} \le \big \Vert \max _{i\in \{1,\ldots ,n\}} |x_i|\big \Vert _{L_p(\mu )}. \end{aligned}$$
(44)

Proof

Let \(V=\textrm{supp}(\max _i |x_i|)\subseteq \Omega \) and define the change of measure f as

$$\begin{aligned} {\forall }\omega \in \Omega , \quad f(\omega ) {\mathop {=}\limits ^{\textrm{def}}}\frac{\max _{i\in \{1,\ldots ,n\}}|x_i(\omega )|^p}{\int _\Omega \max _{i\in \{1,\ldots ,n\}}|x_i(\theta )|^p\,\mathop {}\!\textrm{d}\mu (\theta )}. \end{aligned}$$
(45)

Then, (44) is elementary to check. \(\square \)
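For completeness, the elementary verification reads as follows: for \(\mu \)-almost every \(\omega \in V\) and each \(i\in \{1,\ldots ,n\}\),

```latex
\begin{aligned}
|x_i(\omega )|\, f(\omega )^{-1/p}
&= |x_i(\omega )|\cdot
   \frac{\big ( \int _\Omega \max _{k}|x_k(\theta )|^p \,\mathrm{d}\mu (\theta )\big )^{1/p}}
        {\max _{k}|x_k(\omega )|} \\
&\le \Big ( \int _\Omega \max _{k}|x_k(\theta )|^p \,\mathrm{d}\mu (\theta )\Big )^{1/p}
 = \big \Vert \max _{k} |x_k| \big \Vert _{L_p(\mu )},
\end{aligned}
```

since \(|x_i(\omega )|\le \max _k |x_k(\omega )|\). Taking the essential supremum over \(\omega \in V\) (note that \(\nu \) and \(\mu \) have the same null subsets of \(V\), as \(f>0\) on \(V\)) gives (44).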

We are now ready to complete the proof of Theorem 1.2.

Proof of Theorem 1.2

Fix a K-incompressible family of vectors \(x_1,\ldots ,x_n\in L_p(\Omega ,\mu )\) and let \(V=\textrm{supp}( \max _i |x_i|)\subseteq \Omega \). Denote by \(f:\Omega \rightarrow \mathbb {R}_+\) the change of density from Proposition 2.5. If \(\tfrac{\mathop {}\!\textrm{d}\nu }{\mathop {}\!\textrm{d}\mu }=f\), then the linear operator \(T:L_p(V,\mu )\rightarrow L_p(\Omega ,\nu )\) given by \(Tg = f^{-1/p}g\) is (trivially) a linear isometry. Therefore, Proposition 2.1 and (44) show that there exist \(d\in \mathbb {N}\) with \(d\le \tfrac{32e^2(2K)^{2p}\log n}{\varepsilon ^2}\) and points \(y_1,\ldots ,y_n\in \ell _p^d\) such that the condition

$$\begin{aligned} \Vert x_i-x_j\Vert ^p_{L_p(\mu )}- \varepsilon= & {} \Vert Tx_i-Tx_j\Vert ^p_{L_p(\nu )}- \varepsilon \le \Vert y_i-y_j\Vert _{\ell _p^d}^p \nonumber \\\le & {} \Vert Tx_i-Tx_j\Vert ^p_{L_p(\nu )} + \varepsilon = \Vert x_i-x_j\Vert ^p_{L_p(\mu )}+\varepsilon , \end{aligned}$$
(46)

is satisfied for every \(i,j\in \{1,\ldots ,n\}\). This concludes the proof of Theorem 1.2. \(\square \)

Remark 2.6

A careful inspection of the proof of Theorem 1.2 reveals that the low-dimensional points \(\{y_i\}_{i=1}^n\) can be obtained as images of the given points \(\{x_i\}_{i=1}^n\) under a linear transformation. Indeed, starting from a K-incompressible family of points \(\{x_i\}_{i=1}^n\) in \(L_p(\Omega ,\mu )\), we use Proposition 2.5 to find a change of measure \(T:L_p(V,\mu )\rightarrow L_p(\Omega ,\nu )\) such that \(\{Tx_i\}_{i=1}^n\) satisfy the stronger assumption of Proposition 2.1. Then, for some \(d\in \mathbb {N}\) with \(d\le \tfrac{32e^2(2K)^{2p}\log n}{\varepsilon ^2}\) we find pairwise disjoint measurable subsets \(S_1,\ldots ,S_d\) of \(\Omega \), each with positive measure, such that if \(S:L_p(\Omega ,\nu )\rightarrow \ell _p^d\) is the linear map

$$\begin{aligned} {\forall }z\in L_p(\Omega ,\nu ), \quad Sz {\mathop {=}\limits ^{\textrm{def}}}\frac{1}{d^{1/p}}\Big ( \frac{1}{\mu (S_1)} \int _{S_1} z \,\mathop {}\!\textrm{d}\nu , \ldots , \frac{1}{\mu (S_d)} \int _{S_d} z\,\mathop {}\!\textrm{d}\nu \Big ) \in \ell _p^d, \end{aligned}$$
(47)

then the points \(\{y_i\}_{i=1}^n = \{(S\circ T)x_i\}_{i=1}^n\subseteq \ell _p^d\) satisfy the desired conclusion (6).
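To make (47) concrete, here is a minimal discrete sketch: we take \(\Omega =\{0,\ldots ,m-1\}\) with atomic measures \(\mu ,\nu \) given by weight vectors and blocks \(S_1,\ldots ,S_d\) given by disjoint index sets. The finite setting and all names are our own illustration, not the construction of Proposition 2.1; the point is merely that \(S\) is an explicit linear map.

```python
import numpy as np

def block_average_map(z, blocks, mu, nu, p):
    """Discrete version of the map S in (47): Omega = {0,...,m-1},
    mu[k] and nu[k] are the masses of atom k, and `blocks` lists the
    d pairwise disjoint index sets S_1,...,S_d."""
    d = len(blocks)
    averages = [(z[b] * nu[b]).sum() / mu[b].sum() for b in blocks]
    return np.array(averages) / d ** (1.0 / p)

rng = np.random.default_rng(1)
m, p = 12, 1.5
mu = np.full(m, 1.0 / m)              # uniform base measure
nu = rng.random(m); nu /= nu.sum()    # a change of measure
blocks = [np.arange(0, 4), np.arange(4, 8), np.arange(8, 12)]

# Linearity check: S(2z - 3w) = 2 Sz - 3 Sw.
z, w = rng.standard_normal(m), rng.standard_normal(m)
lhs = block_average_map(2 * z - 3 * w, blocks, mu, nu, p)
rhs = 2 * block_average_map(z, blocks, mu, nu, p) - 3 * block_average_map(w, blocks, mu, nu, p)
```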

We conclude this section by observing that the argument leading to Theorem 1.2 is constructive.

Corollary 2.7

In the setting of Theorem 1.2, there exists a greedy algorithm which receives as input the high-dimensional points \(\{x_i\}_{i=1}^n\) and produces as output the low-dimensional points \(\{y_i\}_{i=1}^n\).

Proof

As the density (45) is explicitly defined, the linear operator \(T:L_p(V,\mu )\rightarrow L_p(\Omega ,\nu )\) can also be efficiently constructed. On the other hand, in order to construct the operator S defined by (47) one needs to find the corresponding partition \(\{S_1,\ldots ,S_d\}\) and this was achieved in Proposition 2.1 via an application of Maurey’s sampling lemma to the cone \(\mathcal {C}_p \subseteq \ell _\infty ^{N}\) where \(N=\left( {\begin{array}{c}n\\ 2\end{array}}\right) \). As \(\ell _\infty ^{N}\) is e-isomorphic to the 2-uniformly smooth space \(\ell _{\log N}^{N}\), Ivanov’s result from [18] implies that the construction can be implemented by a greedy algorithm. \(\square \)

Analysis of the algorithm. The only nontrivial algorithmic task in our dimensionality reduction result is the implementation of Maurey’s approximate Carathéodory theorem. In the special case of \(\ell _p\) spaces, various constructive proofs of Maurey’s lemma are known [7, 11, 35], each of which allows for an analysis of the algorithm’s running time. Assume that the initial points \(x_1,\ldots ,x_n\) lie in \({\textbf {B}}_{\ell _p^m}\) for some finite \(m\in \mathbb {N}\). Implementing, for instance, the mirror descent algorithm of [35, Thm. 3.5] on the convex hull of \(\mathcal {M}(y(1)),\ldots ,\mathcal {M}(y(m))\) appearing in the proof of Theorem 1.2, the corresponding indices \(k_1,\ldots ,k_d\) can be produced in time \(O(m n^2 \log n /\varepsilon ^2)\). Therefore, assuming that the points \(x_1,\ldots ,x_n\) a priori lie in a \(\textrm{poly}(n)\)-dimensional space (as is reasonable by Ball’s embedding theorem), the output points \(y_1,\ldots ,y_n\) can be constructed in time \(\textrm{poly}(n,1/\varepsilon )\).
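The mirror-descent routine of [35] is deterministic; for intuition, the original probabilistic form of Maurey’s lemma is already easy to state and test. The sketch below (in \(\ell _2\), with illustrative names of our own) approximates a point \(x\) of a convex hull by an empirical average of \(d\) i.i.d. sampled vertices, with expected squared error at most \(\max _k\Vert v_k\Vert _2^2/d\); the sampled indices play the role of the indices \(k_1,\ldots ,k_d\) above.

```python
import numpy as np

rng = np.random.default_rng(2)

def maurey_sample(vertices, lam, d):
    """Probabilistic Maurey lemma: approximate x = sum_k lam[k]*vertices[k]
    by the average of d vertices sampled i.i.d. with probabilities lam."""
    idx = rng.choice(len(lam), size=d, p=lam)
    return vertices[idx].mean(axis=0), idx

m, dim = 50, 30
vertices = rng.standard_normal((m, dim))
vertices /= np.linalg.norm(vertices, axis=1, keepdims=True)  # ||v_k||_2 = 1
lam = rng.random(m); lam /= lam.sum()
x = lam @ vertices

# E||x - x_d||_2^2 <= 1/d, so the mean squared error should shrink with d.
errs = {d: np.mean([np.linalg.norm(x - maurey_sample(vertices, lam, d)[0]) ** 2
                    for _ in range(200)])
        for d in (10, 100)}
```

The derandomized algorithms of [7, 11, 35] produce such indices deterministically, which is what makes the overall construction greedy.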

3 Proof of Theorem 1.6

In this section we prove Theorem 1.6. The constructed subset of \({\textbf {B}}_{\ell _p}\) which does not embed linearly into \(\ell _p^d\) for small d is a slight modification of the one considered in [29].

Proof of Theorem 1.6

Fix \(m\in \mathbb {N}\) and denote by \(\{w_i\}_{i=1}^{2^m}\) the rows of the \(2^m\times 2^m\) Walsh matrix and by \(\{e_i\}_{i=1}^{2^m}\) the coordinate basis vectors of \(\mathbb {R}^{2^m}\). Consider the n-point set

$$\begin{aligned} \mathcal {S}_{n,p} = \{0\} \cup \{e_1,\ldots ,e_{2^m}\} \cup \big \{ \tfrac{w_1}{2^{m/p}}, \ldots , \tfrac{w_{2^m}}{2^{m/p}}\big \} \subseteq {\textbf {B}}_{\ell _p^{2^m}} \end{aligned}$$
(48)

where \(n=2^{m+1}+1\) and suppose that \(T:\ell _p^{2^m} \rightarrow \ell _p^d\) is a linear operator such that

$$\begin{aligned} {\forall }x,y\in \mathcal {S}_{n,p}, \qquad \omega \big (\Vert x-y\Vert _{\ell ^{2^m}_p}\big ) \le \Vert Tx-Ty\Vert _{\ell _p^d} \le \Omega \big (\Vert x-y\Vert _{\ell ^{2^m}_p}\big ). \end{aligned}$$
(49)

Assume first that \(1\le p<2\). If we write \(w_i = \sum _{j=1}^{2^m} w_i(j) e_j\) then by orthogonality of \(\{w_i\}_{i=1}^{2^m}\),

$$\begin{aligned} \sum _{i=1}^{2^m} \Vert Tw_i\Vert _{\ell _2^d}^2= & {} \sum _{i=1}^{2^m} \Big \Vert \sum _{j=1}^{2^m} w_i(j) Te_j\Big \Vert _{\ell _2^d}^2 = \sum _{j,k=1}^{2^m} \langle w_j, w_k\rangle \langle Te_j, Te_k\rangle \nonumber \\ {}= & {} 2^{m} \sum _{j=1}^{2^m} \Vert Te_j\Vert _{\ell _2^d}^2. \end{aligned}$$
(50)

By assumption (49) on T, we have

$$\begin{aligned} {\forall }j\in \{1,\ldots ,2^m\}, \qquad \Vert Te_j\Vert _{\ell _2^d}^2 \le \Vert Te_j\Vert _{\ell _p^d}^2 \le \Omega (1)^2 \end{aligned}$$
(51)

and

$$\begin{aligned} {\forall }j\in \{1,\ldots ,2^m\}, \qquad \Vert Tw_j\Vert _{\ell _2^d}^2 {\ge } 2^{\frac{2m}{p}} d^{-\frac{2-p}{p}}\big \Vert T\big (\tfrac{w_j}{2^{m/p}} \big )\big \Vert _{\ell _p^d}^2 {\ge } 2^{\frac{2m}{p}} d^{-\frac{2-p}{p}} \omega (1)^2.\qquad \quad \end{aligned}$$
(52)

Combining (50), (51), and (52) we deduce that

$$\begin{aligned} 2^{m(1+\frac{2}{p})} d^{-\frac{2-p}{p}} \omega (1)^2 \le 4^m\Omega (1)^2, \end{aligned}$$
(53)

which is equivalent to \(d\ge \left( \tfrac{\omega (1)}{\Omega (1)}\right) ^\frac{2p}{2-p} 2^m = \left( \tfrac{\omega (1)}{\Omega (1)}\right) ^\frac{2p}{|p-2|} \cdot \tfrac{n-1}{2}\). The case \(p>2\) is treated similarly. \(\square \)
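The identity (50) holds for an arbitrary linear map \(T\), since it only uses the orthogonality \(\sum _i w_i(j)w_i(k)=2^m\delta _{jk}\) of the Walsh matrix. A quick numerical confirmation (with a random \(T\) chosen purely for illustration, not one satisfying (49)):

```python
import numpy as np

def walsh(m):
    """Sylvester construction of the 2^m x 2^m Walsh-Hadamard matrix."""
    H = np.array([[1.0]])
    for _ in range(m):
        H = np.block([[H, H], [H, -H]])
    return H

rng = np.random.default_rng(3)
m, d = 4, 7
W = walsh(m)                          # rows w_1, ..., w_{2^m}
T = rng.standard_normal((d, 2 ** m))  # an arbitrary linear map R^{2^m} -> R^d

lhs = sum(np.linalg.norm(T @ w) ** 2 for w in W)  # sum_i ||T w_i||_2^2
rhs = 2 ** m * np.linalg.norm(T, "fro") ** 2      # 2^m * sum_j ||T e_j||_2^2
```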

Remark 3.1

The point set \(\mathcal {S}_{n,p}\) considered in the proof of Theorem 1.6 for \(p\ne 2\) is \(O(n^{1/p})\)-incompressible, yet it does not admit a linear \(\tfrac{1}{2}\)-isometric embedding into fewer than \(\Omega (n)\) dimensions. This shows that the dimension of the linear embedding exhibited in Theorem 1.2 must, up to lower-order terms, be at least of order \(K^p\). This should be compared with the \(O(K^{2p}\log n)\) upper bound of Theorem 1.2.